People tend to be less popular than their friends. This paradox, first observed by Feld (1991), is due to popular people appearing on many friend lists.
For example, consider the social network among members of a karate club studied by Zachary (1977):
The network contains nodes with mean degree where takes expected values across nodes and is the degree of node . If denotes the set of 's neighbors, then the mean degree among those neighbors equals The friendship paradox states that in any network. In Zachary’s network we have , about twice the mean degree.
We can approximate using the following procedure:
- Choose a stub (i.e., the endpoint of an edge) uniformly at random.
- Record the degree of the chosen stub.
Repeating these steps many times yields a degree distribution that over-samples from high-degree nodes. The mean of this distribution answers the following question: “How many friends does a typical friend have?” The probability of choosing node in the first step equals the proportion of stubs that adds to the network. Using the probabilities to compute the expected value of the degrees yields an approximation of . Notice that if then ; in that case, everyone has the same degree as their friends, and so there is no friendship paradox. The difference between the mean degree and the typical friend’s degree grows as the variance in degrees grows.
The approximation is closest to when there is no assortative mixing with respect to degree. Then the are uncorrelated with the . But this isn’t true in Zachary’s network:
Indeed, in Zachary’s network we have , which is smaller than the true value . To see why, notice that where holds because appears in neighborhoods . But by the definition of covariance, from which it follows that Thus under-estimates in Zachary’s network because is negative.
The value of depends only on the mean and variance of degrees, and not the correlation of degrees across adjacent nodes. Thus is invariant to degree-preserving randomizations (DPRs). But can vary under DPRs because they can change the correlation of adjacent nodes’ degrees. For example, consider the three networks shown below:
The networks , , and have the same degree distributions. As a result, they have the same mean degrees and approximations of . But the true values of differ because the correlations differ:
Network | ||||
---|---|---|---|---|
1.43 | 1.6 | 1.43 | 1.00 | |
1.43 | 1.6 | 1.57 | 0.20 | |
1.43 | 1.6 | 1.71 | -0.91 |
The network is perfectly assortative with respect to degree, so over-estimates . Whereas is dis-assortative with respect to degree, so under-estimates . The network is relatively unsorted, so is close to .