This is a technical follow-up to a previous post on assortative mixing in networks. In a footnote, I claimed that Newman’s (2003) assortativity coefficient equals the Pearson correlation coefficient when there are two possible node types. This post proves that claim.

## Notation

Consider an undirected network $$N$$ in which each node has a type belonging to a (finite) set $$T$$. The assortativity coefficient is defined as $$r=\frac{\sum_{t\in T}x_{tt}-\sum_{t\in T}y_t^2}{1-\sum_{t\in T}y_t^2},$$ where $$x_{st}$$ is the proportion of edges joining nodes of type $$s$$ to nodes of type $$t$$, and where $$y_t=\sum_{s\in T}x_{st}$$ is the proportion of edges incident with nodes of type $$t$$. The Pearson correlation of adjacent nodes’ types is given by $$\DeclareMathOperator{\Cov}{Cov} \DeclareMathOperator{\Var}{Var} \rho=\frac{\Cov(t_i,t_j)}{\sqrt{\Var(t_i)\Var(t_j)}},$$ where $$t_i\in T$$ and $$t_j\in T$$ are the types of nodes $$i$$ and $$j$$, and where (co)variances are computed with respect to the frequency at which nodes of type $$t_i$$ and $$t_j$$ are adjacent in $$N$$.
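These definitions are straightforward to compute directly. Here is a minimal Python sketch (using NumPy; the function names are mine, not standard) that takes a mixing matrix $$X$$ and, for the Pearson case, a vector of numeric type values:

```python
import numpy as np

def assortativity(X):
    """Newman's assortativity coefficient r for a mixing matrix X.

    X[s, t] is the proportion of edges joining type-s nodes to type-t
    nodes; entries sum to one, and X is symmetric when N is undirected.
    """
    X = np.asarray(X, dtype=float)
    y = X.sum(axis=0)                      # y_t = sum_s x_st
    return (np.trace(X) - y @ y) / (1 - y @ y)

def pearson(X, types):
    """Pearson correlation of adjacent nodes' numeric types, where the
    joint distribution of type pairs across adjacent nodes is X."""
    X = np.asarray(X, dtype=float)
    t = np.asarray(types, dtype=float)
    y = X.sum(axis=0)                      # marginal distribution of types
    mean = y @ t                           # E[t_i] = E[t_j] by symmetry
    var = y @ t**2 - mean**2               # Var(t_i) = Var(t_j)
    cov = t @ X @ t - mean**2              # E[t_i t_j] - E[t_i] E[t_j]
    return cov / var
```

For example, `assortativity([[0.3, 0.2], [0.2, 0.3]])` and `pearson([[0.3, 0.2], [0.2, 0.3]], [0, 1])` both return 0.2, previewing the equality proved below.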

## Proof

Let $$T=\{a,b\}\subset\mathbb{R}$$ with $$a\not=b$$. I show that the correlation coefficient $$\rho$$ and assortativity coefficient $$r$$ can be expressed as the same function of $$y_a$$ and $$x_{ab}$$, implying $$\rho=r$$.

Consider $$\rho$$. It can be understood by presenting the mixing matrix $$X=(x_{st})$$ in tabular form:

| $$t_i$$ | $$t_j$$ | $$x_{t_it_j}$$ |
| :-: | :-: | :-: |
| $$a$$ | $$a$$ | $$x_{aa}$$ |
| $$a$$ | $$b$$ | $$x_{ab}$$ |
| $$b$$ | $$a$$ | $$x_{ba}$$ |
| $$b$$ | $$b$$ | $$x_{bb}$$ |

The first two columns enumerate the possible type pairs $$(t_i,t_j)$$ and the third column stores the proportion of adjacent node pairs $$(i,j)$$ with each type pair. This third column defines the joint distribution of types across adjacent nodes. Thus $$\rho$$ equals the correlation of the first two columns, weighted by the third column. (Here $$x_{ab}=x_{ba}$$ since $$N$$ is undirected.)

Now $$t_i$$ has mean $$\DeclareMathOperator{\E}{E} \begin{aligned} \E[t_i] &= x_{aa}a+x_{ab}a+x_{ba}b+x_{bb}b \\ &= y_aa+y_bb \end{aligned}$$ and second moment $$\begin{aligned} \E[t_i^2] &= x_{aa}a^2+x_{ab}a^2+x_{ba}b^2+x_{bb}b^2 \\ &= y_aa^2+y_bb^2, \end{aligned}$$ and similar calculations reveal $$\E[t_j]=\E[t_i]$$ and $$\E[t_j^2]=\E[t_i^2]$$. Thus $$t_i$$ has variance $$\begin{aligned} \Var(t_i) &= \E[t_i^2]-\E[t_i]^2 \\ &= y_aa^2+y_bb^2-(y_aa+y_bb)^2 \\ &= y_a(1-y_a)a^2+y_b(1-y_b)b^2-2y_ay_bab \end{aligned}$$ and similarly $$\Var(t_j)=\Var(t_i)$$.

We can simplify this expression for the variance by noticing that $$x_{aa}+x_{ab}+x_{ba}+x_{bb}=1,$$ which implies $$\begin{aligned} y_b &= x_{ab}+x_{bb} \\ &= 1-x_{aa}-x_{ba} \\ &= 1-y_a \end{aligned}$$ and therefore $$\begin{aligned} \Var(t_i) &= y_a(1-y_a)a^2+(1-y_a)y_ab^2-2y_a(1-y_a)ab \\ &= y_a(1-y_a)(a-b)^2. \end{aligned}$$

We next express the covariance $$\Cov(t_i,t_j)=\E[t_it_j]-\E[t_i]\E[t_j]$$ in terms of $$y_a$$ and $$x_{ab}$$. Now $$\begin{aligned} \E[t_it_j] &= x_{aa}a^2+x_{ab}ab+x_{ba}ab+x_{bb}b^2 \\ &= (y_a-x_{ab})a^2+2x_{ab}ab+(y_b-x_{ab})b^2 \\ &= y_aa^2+y_bb^2-x_{ab}(a-b)^2 \end{aligned}$$ because $$x_{ab}=x_{ba}$$. It follows that $$\begin{aligned} \Cov(t_i,t_j) &= y_aa^2+y_bb^2-x_{ab}(a-b)^2-(y_aa+y_bb)^2 \\ &= y_a(1-y_a)a^2+y_b(1-y_b)b^2-2y_ay_bab-x_{ab}(a-b)^2 \\ &= y_a(1-y_a)(a-b)^2-x_{ab}(a-b)^2, \end{aligned}$$ where the last line uses the fact that $$y_b=1-y_a$$.
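As a sanity check on the closed forms $$\Var(t_i)=y_a(1-y_a)(a-b)^2$$ and $$\Cov(t_i,t_j)=(y_a(1-y_a)-x_{ab})(a-b)^2$$, we can compare them against moments computed directly from a random symmetric mixing matrix. This is a quick numerical check of my own, with illustrative variable names:

```python
import random

random.seed(0)
a, b = 2.0, 5.0                              # arbitrary distinct numeric types
w = [random.random() for _ in range(3)]      # raw weights for x_aa, x_ab (= x_ba), x_bb
s = w[0] + 2 * w[1] + w[2]
x_aa, x_ab, x_bb = w[0] / s, w[1] / s, w[2] / s   # entries sum to one
y_a = x_aa + x_ab                            # = x_aa + x_ba by symmetry

# Moments computed directly from the joint distribution of type pairs.
mean = (x_aa + x_ab) * a + (x_ab + x_bb) * b
second = (x_aa + x_ab) * a**2 + (x_ab + x_bb) * b**2
var = second - mean**2
e_titj = x_aa * a**2 + 2 * x_ab * a * b + x_bb * b**2
cov = e_titj - mean**2

# They agree with the closed forms derived above.
assert abs(var - y_a * (1 - y_a) * (a - b) ** 2) < 1e-12
assert abs(cov - (y_a * (1 - y_a) - x_ab) * (a - b) ** 2) < 1e-12
```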
Putting everything together, we have $$\begin{aligned} \rho &= \frac{\Cov(t_i,t_j)}{\sqrt{\Var(t_i)\Var(t_j)}} \\ &= \frac{y_a(1-y_a)-x_{ab}}{y_a(1-y_a)}, \end{aligned}$$ where the common factor of $$(a-b)^2\not=0$$ cancels, leaving a function of $$y_a$$ and $$x_{ab}$$ alone.

Now consider $$r$$. Its numerator equals $$\begin{aligned} \sum_{t\in T}x_{tt}-\sum_{t\in T}y_t^2 &= x_{aa}+x_{bb}-y_a^2-y_b^2 \\ &= (y_a-x_{ab})+(y_b-x_{ab})-y_a^2-y_b^2 \\ &= y_a(1-y_a)+y_b(1-y_b)-2x_{ab} \\ &\overset{\star}{=} 2y_a(1-y_a)-2x_{ab} \end{aligned}$$ and its denominator equals $$\begin{aligned} 1-\sum_{t\in T}y_t^2 &= 1-y_a^2-y_b^2 \\ &\overset{\star\star}{=} 1-y_a^2-(1-y_a)^2 \\ &= 2y_a(1-y_a), \end{aligned}$$ where $$\star$$ and $$\star\star$$ both use the fact that $$y_b=1-y_a$$. Thus $$r=\frac{y_a(1-y_a)-x_{ab}}{y_a(1-y_a)},$$ the same function of $$y_a$$ and $$x_{ab}$$, and so $$\rho=r$$ as claimed.
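The agreement can also be checked numerically: computing $$r$$ from Newman's definition on a random symmetric $$2\times 2$$ mixing matrix reproduces the shared closed form. This is a check of my own, not part of the original derivation:

```python
import random

random.seed(1)
w = [random.random() for _ in range(3)]      # raw weights for x_aa, x_ab, x_bb
s = w[0] + 2 * w[1] + w[2]
x_aa, x_ab, x_bb = w[0] / s, w[1] / s, w[2] / s
y_a, y_b = x_aa + x_ab, x_ab + x_bb          # marginal proportions, y_b = 1 - y_a

# r from the definition, versus the shared closed form derived above.
r = (x_aa + x_bb - y_a**2 - y_b**2) / (1 - y_a**2 - y_b**2)
closed_form = (y_a * (1 - y_a) - x_ab) / (y_a * (1 - y_a))
assert abs(r - closed_form) < 1e-12
```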

Writing $$\rho=r$$ in terms of $$y_a$$ and $$x_{ab}$$ makes it easy to check the boundary cases (assuming both types appear, so that $$y_a(1-y_a)>0$$): if there are no within-type edges then $$y_a=x_{ab}=1/2$$ and so $$\rho=r=-1$$; if there are no between-type edges then $$x_{ab}=0$$ and so $$\rho=r=1$$.
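These boundary cases are easy to confirm from the shared formula (a small check of my own; the function name is illustrative):

```python
def shared(y_a, x_ab):
    """The shared closed form rho = r = (y_a(1-y_a) - x_ab) / (y_a(1-y_a))."""
    return (y_a * (1 - y_a) - x_ab) / (y_a * (1 - y_a))

assert shared(0.5, 0.5) == -1.0   # no within-type edges
assert shared(0.3, 0.0) == 1.0    # no between-type edges, for any 0 < y_a < 1
```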

## Appendix: Constructing the mixing matrix

The proof relies on noticing that $$x_{ab}=x_{ba}$$, which follows from the undirectedness of the network $$N$$ and from how the mixing matrix $$X$$ is constructed. I often forget this construction, so here’s a simple algorithm: Consider some type pair $$(s,t)$$. Look at the edges beginning at type $$s$$ nodes and count how many end at type $$t$$ nodes, treating each undirected edge as running in both directions. Call this count $$m_{st}$$. Do the same for all type pairs to obtain a matrix $$M=(m_{st})$$ of edge counts. Divide the entries in $$M$$ by their sum to obtain $$X$$. Counting each edge in both directions is exactly what makes $$M$$, and hence $$X$$, symmetric.
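The algorithm is short enough to sketch in Python (function and argument names are mine, not from any standard library):

```python
from collections import Counter
from itertools import product

def mixing_matrix(edges, node_type):
    """Build the mixing matrix X = (x_st) from an undirected edge list.

    edges is a list of node pairs; node_type maps each node to its type.
    Each undirected edge {i, j} is counted in both directions, once toward
    (type(i), type(j)) and once toward (type(j), type(i)), so the result
    is symmetric. Entries are normalized to sum to one.
    """
    m = Counter()
    for i, j in edges:
        m[(node_type[i], node_type[j])] += 1
        m[(node_type[j], node_type[i])] += 1
    total = sum(m.values())
    types = sorted({t for pair in m for t in pair})
    return {(s, t): m[(s, t)] / total for s, t in product(types, types)}
```

For example, on the path network with edges $$\{0,1\},\{1,2\},\{2,3\}$$ where nodes 0 and 1 have type $$a$$ and nodes 2 and 3 have type $$b$$, this yields $$x_{aa}=x_{bb}=1/3$$ and $$x_{ab}=x_{ba}=1/6$$.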