This is a technical follow-up to a previous post on assortative mixing in networks. In a footnote, I claimed that Newman’s (2003) assortativity coefficient equals the Pearson correlation coefficient when there are two possible node types. This post proves that claim.

## Notation

Consider an undirected network `\(N\)` in which each node has a type belonging to a finite set `\(T\)`.
The assortativity coefficient is defined as
`$$r=\frac{\sum_{t\in T}x_{tt}-\sum_{t\in T}y_t^2}{1-\sum_{t\in T}y_t^2},$$`

where `\(x_{st}\)` is the proportion of edges joining nodes of type `\(s\)` to nodes of type `\(t\)`, and where
`$$y_t=\sum_{s\in T}x_{st}$$`

is the proportion of edges incident with nodes of type `\(t\)`.
The Pearson correlation of adjacent nodes’ types is given by
`$$\DeclareMathOperator{\Cov}{Cov} \DeclareMathOperator{\Var}{Var} \rho=\frac{\Cov(t_i,t_j)}{\sqrt{\Var(t_i)\Var(t_j)}},$$`

where `\(t_i\in T\)` and `\(t_j\in T\)` are the types of nodes `\(i\)` and `\(j\)`, and where (co)variances are computed with respect to the frequency at which nodes of type `\(t_i\)` and `\(t_j\)` are adjacent in `\(N\)`.
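These definitions translate directly into code. Here's a minimal sketch in Python (the function name and the example mixing matrix are mine, for illustration):

```python
import numpy as np

def assortativity(X):
    """Newman's assortativity coefficient from a mixing matrix X,
    where X[s, t] is the proportion of edges joining type s to type t."""
    X = np.asarray(X, dtype=float)
    y = X.sum(axis=0)            # y_t: proportion of edge ends at nodes of type t
    trace = np.trace(X)          # sum of x_tt over types t
    return (trace - y @ y) / (1 - y @ y)

# Example: a mildly assortative two-type network (illustrative values).
X = np.array([[0.4, 0.1],
              [0.1, 0.4]])
print(assortativity(X))  # close to 0.6
```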

## Proof

Let `\(T=\{a,b\}\subset\mathbb{R}\)` with `\(a\not=b\)`.
I show that the correlation coefficient `\(\rho\)` and assortativity coefficient `\(r\)` can be expressed as the same function of `\(y_a\)` and `\(x_{ab}\)`, implying `\(\rho=r\)`.

Consider `\(\rho\)`.
It can be understood by presenting the mixing matrix `\(X=(x_{st})\)` in tabular form:

| `\(t_i\)` | `\(t_j\)` | `\(x_{t_it_j}\)` |
|---|---|---|
| `\(a\)` | `\(a\)` | `\(x_{aa}\)` |
| `\(a\)` | `\(b\)` | `\(x_{ab}\)` |
| `\(b\)` | `\(a\)` | `\(x_{ba}\)` |
| `\(b\)` | `\(b\)` | `\(x_{bb}\)` |

The first two columns enumerate the possible type pairs `\((t_i,t_j)\)` and the third column stores the proportion of adjacent node pairs `\((i,j)\)` with each type pair.
This third column defines the joint distribution of types across adjacent nodes.
Thus `\(\rho\)` equals the correlation of the first two columns, weighted by the third column.
(Here `\(x_{ab}=x_{ba}\)` since `\(N\)` is undirected.)
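Computing `\(\rho\)` as this weighted correlation is mechanical; a short sketch in Python (the type values and mixing proportions are illustrative):

```python
import numpy as np

# Rows of the tabular mixing matrix: (t_i, t_j, x_{t_i t_j}).
# The type values a, b and the proportions below are made up.
a, b = 0.0, 1.0
x_aa, x_ab, x_bb = 0.4, 0.1, 0.4
ti = np.array([a, a, b, b])
tj = np.array([a, b, a, b])
w = np.array([x_aa, x_ab, x_ab, x_bb])  # x_ab = x_ba since N is undirected

# Weighted moments: the third column weights the first two.
mean_i = np.sum(w * ti)
mean_j = np.sum(w * tj)
cov = np.sum(w * (ti - mean_i) * (tj - mean_j))
var_i = np.sum(w * (ti - mean_i) ** 2)
var_j = np.sum(w * (tj - mean_j) ** 2)
rho = cov / np.sqrt(var_i * var_j)
print(rho)  # close to 0.6
```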
Now `\(t_i\)` has mean
`$$\DeclareMathOperator{\E}{E} \begin{aligned} \E[t_i] &= x_{aa}a+x_{ab}a+x_{ba}b+x_{bb}b \\ &= y_aa+y_bb \end{aligned}$$`

and second moment
`$$\begin{aligned} \E[t_i^2] &= x_{aa}a^2+x_{ab}a^2+x_{ba}b^2+x_{bb}b^2 \\ &= y_aa^2+y_bb^2, \end{aligned}$$`

and similar calculations reveal `\(\E[t_j]=\E[t_i]\)` and `\(\E[t_j^2]=\E[t_i^2]\)`.
Thus `\(t_i\)` has variance
`$$\begin{aligned} \Var(t_i) &= \E[t_i^2]-\E[t_i]^2 \\ &= y_aa^2+y_bb^2-(y_aa+y_bb)^2 \\ &= y_a(1-y_a)a^2+y_b(1-y_b)b^2-2y_ay_bab \end{aligned}$$`

and similarly `\(\Var(t_j)=\Var(t_i)\)`.
We can simplify this expression for the variance by noticing that
`$$x_{aa}+x_{ab}+x_{ba}+x_{bb}=1,$$`

which implies
`$$\begin{aligned} y_b &= x_{ab}+x_{bb} \\ &= 1-x_{aa}-x_{ba} \\ &= 1-y_a \end{aligned}$$`

and therefore
`$$\begin{aligned} \Var(t_i) &= y_a(1-y_a)a^2+(1-y_a)y_ab^2-2y_a(1-y_a)ab \\ &= y_a(1-y_a)(a-b)^2. \end{aligned}$$`

We next express the covariance `\(\Cov(t_i,t_j)=\E[t_it_j]-\E[t_i]\E[t_j]\)` in terms of `\(y_a\)` and `\(x_{ab}\)`.
Now
`$$\begin{aligned} \E[t_it_j] &= x_{aa}a^2+x_{ab}ab+x_{ba}ab+x_{bb}b^2 \\ &= (y_a-x_{ab})a^2+2x_{ab}ab+(y_b-x_{ab})b^2 \\ &= y_aa^2+y_bb^2-x_{ab}(a-b)^2 \end{aligned}$$`

because `\(x_{ab}=x_{ba}\)`.
It follows that
`$$\begin{aligned} \Cov(t_i,t_j) &= y_aa^2+y_bb^2-x_{ab}(a-b)^2-(y_aa+y_bb)^2 \\ &= y_a(1-y_a)a^2+y_b(1-y_b)b^2-2y_ay_bab-x_{ab}(a-b)^2 \\ &= y_a(1-y_a)(a-b)^2-x_{ab}(a-b)^2, \end{aligned}$$`

where the last line uses the fact that `\(y_b=1-y_a\)`.
Putting everything together, we have
`$$\begin{aligned} \rho &= \frac{\Cov(t_i,t_j)}{\sqrt{\Var(t_i)\Var(t_j)}} \\ &= \frac{y_a(1-y_a)-x_{ab}}{y_a(1-y_a)}, \end{aligned}$$`

a function of `\(y_a\)` and `\(x_{ab}\)`.
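The algebra above can also be checked symbolically, for instance with sympy; a sketch under the substitution `\(y_b=1-y_a\)`:

```python
import sympy as sp

a, b, y_a, x_ab = sp.symbols('a b y_a x_ab')
y_b = 1 - y_a

E_t = y_a * a + y_b * b          # E[t_i] = E[t_j]
E_t2 = y_a * a**2 + y_b * b**2   # E[t_i^2] = E[t_j^2]
E_tt = E_t2 - x_ab * (a - b)**2  # E[t_i t_j]

var = sp.expand(E_t2 - E_t**2)
cov = sp.expand(E_tt - E_t**2)

# Var(t_i) = y_a(1-y_a)(a-b)^2 and Cov(t_i,t_j) = (y_a(1-y_a) - x_ab)(a-b)^2
assert sp.simplify(var - y_a * (1 - y_a) * (a - b)**2) == 0
assert sp.simplify(cov - (y_a * (1 - y_a) - x_ab) * (a - b)**2) == 0

# rho = Cov/Var reduces to the stated function of y_a and x_ab
assert sp.simplify(cov / var - (y_a * (1 - y_a) - x_ab) / (y_a * (1 - y_a))) == 0
```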

Now consider `\(r\)`.
Its numerator equals
`$$\begin{aligned} \sum_{t\in T}x_{tt}-\sum_{t\in T}y_t^2 &= x_{aa}+x_{bb}-y_a^2-y_b^2 \\ &= (y_a-x_{ab})+(y_b-x_{ab})-y_a^2-y_b^2 \\ &= y_a(1-y_a)+y_b(1-y_b)-2x_{ab} \\ &\overset{\star}{=} 2y_a(1-y_a)-2x_{ab} \end{aligned}$$`

and its denominator equals
`$$\begin{aligned} 1-\sum_{t\in T}y_t^2 &= 1-y_a^2-y_b^2 \\ &\overset{\star\star}{=} 1-y_a^2-(1-y_a)^2 \\ &= 2y_a(1-y_a), \end{aligned}$$`

where `\(\star\)` and `\(\star\star\)` both use the fact that `\(y_b=1-y_a\)`.
Thus
`$$r=\frac{y_a(1-y_a)-x_{ab}}{y_a(1-y_a)},$$`

the same function of `\(y_a\)` and `\(x_{ab}\)`, and so `\(\rho=r\)` as claimed.

Writing `\(\rho=r\)` in terms of `\(y_a\)` and `\(x_{ab}\)` makes it easy to check the boundary cases: if there are no within-type edges then `\(y_a=x_{ab}=1/2\)` and so `\(\rho=r=-1\)`; if there are no between-type edges then `\(x_{ab}=0\)` and so `\(\rho=r=1\)`.
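As a numerical sanity check, we can generate a random two-type network and compare the two coefficients directly. A sketch (all names and parameters here are mine):

```python
import random

def mixing_matrix(edges, types):
    """x[(s, t)]: proportion of edges joining type s to type t.
    Each undirected edge is counted once in each direction."""
    counts = {}
    for i, j in edges:
        for s, t in ((types[i], types[j]), (types[j], types[i])):
            counts[(s, t)] = counts.get((s, t), 0) + 1
    total = sum(counts.values())
    return {st: c / total for st, c in counts.items()}

def newman_r(x, type_values):
    y = {t: sum(x.get((s, t), 0) for s in type_values) for t in type_values}
    sum_xtt = sum(x.get((t, t), 0) for t in type_values)
    sum_y2 = sum(v**2 for v in y.values())
    return (sum_xtt - sum_y2) / (1 - sum_y2)

def pearson_rho(x):
    # Correlation of type pairs weighted by the mixing proportions;
    # E[t_j] = E[t_i] and Var(t_j) = Var(t_i) by symmetry of x.
    mean = sum(p * s for (s, t), p in x.items())
    var = sum(p * (s - mean)**2 for (s, t), p in x.items())
    cov = sum(p * (s - mean) * (t - mean) for (s, t), p in x.items())
    return cov / var

# Random network on 50 nodes with two real-valued types, 0 and 1.
random.seed(42)
n = 50
types = {i: random.choice([0, 1]) for i in range(n)}
edges = [(i, j) for i in range(n) for j in range(i + 1, n)
         if random.random() < 0.1]

x = mixing_matrix(edges, types)
r = newman_r(x, [0, 1])
rho = pearson_rho(x)
assert abs(r - rho) < 1e-9  # the two coefficients agree
```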

## Appendix: Constructing the mixing matrix

The proof relies on noticing that `\(x_{ab}=x_{ba}\)`, which comes from undirectedness of the network `\(N\)` and from how the mixing matrix `\(X\)` is constructed.
I often forget this construction, so here’s a simple algorithm:
Consider some type pair `\((s,t)\)`.
Look at the edges beginning at type `\(s\)` nodes and count how many end at type `\(t\)` nodes.
Call this count `\(m_{st}\)`.
Do the same for all type pairs to obtain a matrix `\(M=(m_{st})\)` of edge counts.
Divide the entries in `\(M\)` by their sum to obtain `\(X\)`.
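This algorithm is straightforward to sketch in Python (the toy edge list and node types are made up for illustration):

```python
from itertools import product

# A small undirected network as an edge list, with node types (toy data).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
types = {1: 'a', 2: 'a', 3: 'b', 4: 'b', 5: 'a'}

# Treat each undirected edge as two directed edges, so that every
# edge "begins" at both of its endpoints.
directed = edges + [(j, i) for i, j in edges]

# m_{st}: number of directed edges beginning at type s and ending at type t.
M = {}
for s, t in product('ab', repeat=2):
    M[(s, t)] = sum(1 for i, j in directed
                    if types[i] == s and types[j] == t)

# Divide by the total count to obtain the mixing matrix X.
total = sum(M.values())
X = {st: m / total for st, m in M.items()}
assert X[('a', 'b')] == X[('b', 'a')]  # symmetry from undirectedness
```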