This is a technical follow-up to a previous post on assortative mixing in networks. In a footnote, I claimed that Newman’s (2003) assortativity coefficient equals the Pearson correlation coefficient when there are two possible node types. This post proves that claim.

## Notation

Consider an undirected network `\(N\)` in which each node has a type belonging to a finite set `\(T\)`.
The assortativity coefficient is defined as
`$$r=\frac{\sum_{t\in T}x_{tt}-\sum_{t\in T}y_t^2}{1-\sum_{t\in T}y_t^2},$$`

where `\(x_{st}\)` is the proportion of edges joining nodes of type `\(s\)` to nodes of type `\(t\)`, and where
`$$y_t=\sum_{s\in T}x_{st}$$`

is the proportion of edges incident with nodes of type `\(t\)`.
The Pearson correlation of adjacent nodes’ types is given by
`$$\DeclareMathOperator{\Cov}{Cov} \DeclareMathOperator{\Var}{Var} \rho=\frac{\Cov(t_i,t_j)}{\sqrt{\Var(t_i)\Var(t_j)}},$$`

where `\(t_i\in T\)` and `\(t_j\in T\)` are the types of nodes `\(i\)` and `\(j\)`, and where (co)variances are computed with respect to the frequency at which nodes of type `\(t_i\)` and `\(t_j\)` are adjacent in `\(N\)`.
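These definitions translate directly into code. Here's a minimal sketch in Python (the function name and the example mixing matrix are mine, for illustration):

```python
import numpy as np

def assortativity(X):
    """Newman's assortativity coefficient from a mixing matrix X,
    where X[s, t] is the proportion of edges joining type s to type t."""
    X = np.asarray(X, dtype=float)
    y = X.sum(axis=0)            # y_t: proportion of edge ends at nodes of type t
    trace = np.trace(X)          # sum of x_tt over types t
    return (trace - y @ y) / (1 - y @ y)

# Example: a mildly assortative two-type network (illustrative values).
X = np.array([[0.4, 0.1],
              [0.1, 0.4]])
print(assortativity(X))  # close to 0.6
```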

## Proof

Let `\(T=\{a,b\}\subset\mathbb{R}\)` with `\(a\not=b\)`.
I show that the correlation coefficient `\(\rho\)` and assortativity coefficient `\(r\)` can be expressed as the same function of `\(y_a\)` and `\(x_{ab}\)`, implying `\(\rho=r\)`.

Consider `\(\rho\)`.
It can be understood by presenting the mixing matrix `\(X=(x_{st})\)` in tabular form:

| `\(t_i\)` | `\(t_j\)` | `\(x_{t_it_j}\)` |
|---|---|---|
| `\(a\)` | `\(a\)` | `\(x_{aa}\)` |
| `\(a\)` | `\(b\)` | `\(x_{ab}\)` |
| `\(b\)` | `\(a\)` | `\(x_{ba}\)` |
| `\(b\)` | `\(b\)` | `\(x_{bb}\)` |

The first two columns enumerate the possible type pairs `\((t_i,t_j)\)` and the third column stores the proportion of adjacent node pairs `\((i,j)\)` with each type pair.
This third column defines the joint distribution of types across adjacent nodes.
Thus `\(\rho\)` equals the correlation of the first two columns, weighted by the third column.
(Here `\(x_{ab}=x_{ba}\)` since `\(N\)` is undirected.)
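Computing `\(\rho\)` as this weighted correlation is mechanical; a short sketch in Python (the type values and mixing proportions are illustrative):

```python
import numpy as np

# Rows of the tabular mixing matrix: (t_i, t_j, x_{t_i t_j}).
# The type values a, b and the proportions below are made up.
a, b = 0.0, 1.0
x_aa, x_ab, x_bb = 0.4, 0.1, 0.4
ti = np.array([a, a, b, b])
tj = np.array([a, b, a, b])
w = np.array([x_aa, x_ab, x_ab, x_bb])  # x_ab = x_ba since N is undirected

# Weighted moments: the third column weights the first two.
mean_i = np.sum(w * ti)
mean_j = np.sum(w * tj)
cov = np.sum(w * (ti - mean_i) * (tj - mean_j))
var_i = np.sum(w * (ti - mean_i) ** 2)
var_j = np.sum(w * (tj - mean_j) ** 2)
rho = cov / np.sqrt(var_i * var_j)
print(rho)  # close to 0.6
```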
Now `\(t_i\)` has mean
`$$\DeclareMathOperator{\E}{E} \begin{aligned} \E[t_i] &= x_{aa}a+x_{ab}a+x_{ba}b+x_{bb}b \\ &= y_aa+y_bb \end{aligned}$$`

and second moment
`$$\begin{aligned} \E[t_i^2] &= x_{aa}a^2+x_{ab}a^2+x_{ba}b^2+x_{bb}b^2 \\ &= y_aa^2+y_bb^2, \end{aligned}$$`

and similar calculations reveal `\(\E[t_j]=\E[t_i]\)` and `\(\E[t_j^2]=\E[t_i^2]\)`.
Thus `\(t_i\)` has variance
`$$\begin{aligned} \Var(t_i) &= \E[t_i^2]-\E[t_i]^2 \\ &= y_aa^2+y_bb^2-(y_aa+y_bb)^2 \\ &= y_a(1-y_a)a^2+y_b(1-y_b)b^2-2y_ay_bab \end{aligned}$$`

and similarly `\(\Var(t_j)=\Var(t_i)\)`.
We can simplify this expression for the variance by noticing that
`$$x_{aa}+x_{ab}+x_{ba}+x_{bb}=1,$$`

which implies
`$$\begin{aligned} y_b &= x_{ab}+x_{bb} \\ &= 1-x_{aa}-x_{ba} \\ &= 1-y_a \end{aligned}$$`

and therefore
`$$\begin{aligned} \Var(t_i) &= y_a(1-y_a)a^2+(1-y_a)y_ab^2-2y_a(1-y_a)ab \\ &= y_a(1-y_a)(a-b)^2. \end{aligned}$$`

We next express the covariance `\(\Cov(t_i,t_j)=\E[t_it_j]-\E[t_i]\E[t_j]\)` in terms of `\(y_a\)` and `\(x_{ab}\)`.
Now
`$$\begin{aligned} \E[t_it_j] &= x_{aa}a^2+x_{ab}ab+x_{ba}ab+x_{bb}b^2 \\ &= (y_a-x_{ab})a^2+2x_{ab}ab+(y_b-x_{ab})b^2 \\ &= y_aa^2+y_bb^2-x_{ab}(a-b)^2 \end{aligned}$$`

because `\(x_{ab}=x_{ba}\)`.
It follows that
`$$\begin{aligned} \Cov(t_i,t_j) &= y_aa^2+y_bb^2-x_{ab}(a-b)^2-(y_aa+y_bb)^2 \\ &= y_a(1-y_a)a^2+y_b(1-y_b)b^2-2y_ay_bab-x_{ab}(a-b)^2 \\ &= y_a(1-y_a)(a-b)^2-x_{ab}(a-b)^2, \end{aligned}$$`

where the last line uses the fact that `\(y_b=1-y_a\)`.
Putting everything together, we have
`$$\begin{aligned} \rho &= \frac{\Cov(t_i,t_j)}{\sqrt{\Var(t_i)\Var(t_j)}} \\ &= \frac{y_a(1-y_a)-x_{ab}}{y_a(1-y_a)}, \end{aligned}$$`

a function of `\(y_a\)` and `\(x_{ab}\)`.
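The algebra above can also be checked symbolically, for instance with sympy; a sketch under the substitution `\(y_b=1-y_a\)`:

```python
import sympy as sp

a, b, y_a, x_ab = sp.symbols('a b y_a x_ab')
y_b = 1 - y_a

E_t = y_a * a + y_b * b          # E[t_i] = E[t_j]
E_t2 = y_a * a**2 + y_b * b**2   # E[t_i^2] = E[t_j^2]
E_tt = E_t2 - x_ab * (a - b)**2  # E[t_i t_j]

var = sp.expand(E_t2 - E_t**2)
cov = sp.expand(E_tt - E_t**2)

# Var(t_i) = y_a(1-y_a)(a-b)^2 and Cov(t_i,t_j) = (y_a(1-y_a) - x_ab)(a-b)^2
assert sp.simplify(var - y_a * (1 - y_a) * (a - b)**2) == 0
assert sp.simplify(cov - (y_a * (1 - y_a) - x_ab) * (a - b)**2) == 0

# rho = Cov/Var reduces to the stated function of y_a and x_ab
assert sp.simplify(cov / var - (y_a * (1 - y_a) - x_ab) / (y_a * (1 - y_a))) == 0
```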

Now consider `\(r\)`.
Its numerator equals
`$$\begin{aligned} \sum_{t\in T}x_{tt}-\sum_{t\in T}y_t^2 &= x_{aa}+x_{bb}-y_a^2-y_b^2 \\ &= (y_a-x_{ab})+(y_b-x_{ab})-y_a^2-y_b^2 \\ &= y_a(1-y_a)+y_b(1-y_b)-2x_{ab} \\ &\overset{\star}{=} 2y_a(1-y_a)-2x_{ab} \end{aligned}$$`

and its denominator equals
`$$\begin{aligned} 1-\sum_{t\in T}y_t^2 &= 1-y_a^2-y_b^2 \\ &\overset{\star\star}{=} 1-y_a^2-(1-y_a)^2 \\ &= 2y_a(1-y_a), \end{aligned}$$`

where `\(\star\)` and `\(\star\star\)` both use the fact that `\(y_b=1-y_a\)`.
Thus
`$$r=\frac{y_a(1-y_a)-x_{ab}}{y_a(1-y_a)},$$`

the same function of `\(y_a\)` and `\(x_{ab}\)`, and so `\(\rho=r\)` as claimed.

Writing `\(\rho=r\)` in terms of `\(y_a\)` and `\(x_{ab}\)` makes it easy to check the boundary cases: if there are no within-type edges then `\(y_a=x_{ab}=1/2\)` and so `\(\rho=r=-1\)`; if there are no between-type edges then `\(x_{ab}=0\)` and so `\(\rho=r=1\)`.
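As a numerical sanity check, we can generate a random two-type network and compare the two coefficients directly. A sketch (all names and parameters here are mine):

```python
import random

def mixing_matrix(edges, types):
    """x[(s, t)]: proportion of edges joining type s to type t.
    Each undirected edge is counted once in each direction."""
    counts = {}
    for i, j in edges:
        for s, t in ((types[i], types[j]), (types[j], types[i])):
            counts[(s, t)] = counts.get((s, t), 0) + 1
    total = sum(counts.values())
    return {st: c / total for st, c in counts.items()}

def newman_r(x, type_values):
    y = {t: sum(x.get((s, t), 0) for s in type_values) for t in type_values}
    sum_xtt = sum(x.get((t, t), 0) for t in type_values)
    sum_y2 = sum(v**2 for v in y.values())
    return (sum_xtt - sum_y2) / (1 - sum_y2)

def pearson_rho(x):
    # Correlation of type pairs weighted by the mixing proportions;
    # E[t_j] = E[t_i] and Var(t_j) = Var(t_i) by symmetry of x.
    mean = sum(p * s for (s, t), p in x.items())
    var = sum(p * (s - mean)**2 for (s, t), p in x.items())
    cov = sum(p * (s - mean) * (t - mean) for (s, t), p in x.items())
    return cov / var

# Random network on 50 nodes with two real-valued types, 0 and 1.
random.seed(42)
n = 50
types = {i: random.choice([0, 1]) for i in range(n)}
edges = [(i, j) for i in range(n) for j in range(i + 1, n)
         if random.random() < 0.1]

x = mixing_matrix(edges, types)
r = newman_r(x, [0, 1])
rho = pearson_rho(x)
assert abs(r - rho) < 1e-9  # the two coefficients agree
```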

## Appendix: Constructing the mixing matrix

The proof relies on noticing that `\(x_{ab}=x_{ba}\)`, which comes from undirectedness of the network `\(N\)` and from how the mixing matrix `\(X\)` is constructed.
I often forget this construction, so here’s a simple algorithm:
Consider some type pair `\((s,t)\)`.
Look at the edges beginning at type `\(s\)` nodes and count how many end at type `\(t\)` nodes.
Call this count `\(m_{st}\)`.
Do the same for all type pairs to obtain a matrix `\(M=(m_{st})\)` of edge counts.
Divide the entries in `\(M\)` by their sum to obtain `\(X\)`.
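This algorithm is straightforward to sketch in Python (the toy edge list and node types are made up for illustration):

```python
from itertools import product

# A small undirected network as an edge list, with node types (toy data).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
types = {1: 'a', 2: 'a', 3: 'b', 4: 'b', 5: 'a'}

# Treat each undirected edge as two directed edges, so that every
# edge "begins" at both of its endpoints.
directed = edges + [(j, i) for i, j in edges]

# m_{st}: number of directed edges beginning at type s and ending at type t.
M = {}
for s, t in product('ab', repeat=2):
    M[(s, t)] = sum(1 for i, j in directed
                    if types[i] == s and types[j] == t)

# Divide by the total count to obtain the mixing matrix X.
total = sum(M.values())
X = {st: m / total for st, m in M.items()}
assert X[('a', 'b')] == X[('b', 'a')]  # symmetry from undirectedness
```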