Let $[n] \equiv \{1, 2, \ldots, n\}$ be a set of individuals. Suppose I have data $\{(y_{ij}, x_{ij}) : i, j \in [n] \text{ with } i < j\}$ on pairs in $[n]$ generated by the process $y_{ij} = x_{ij}\beta + \varepsilon_{ij}$, where $x_{ij}$ is a row vector of pair $\{i, j\}$'s characteristics, $\beta$ is a vector of coefficients to be estimated, and $\varepsilon_{ij}$ is a random error term with zero mean and zero correlation with the $x_{ij}$. For example, $[n]$ could be the nodes in a network, $x_{ij}$ the dimensions along which nodes $i$ and $j$ interact, and $y_{ij}$ the outcome of such interaction.

We can rewrite the data-generating process (DGP) in matrix form as $y = X\beta + \varepsilon$, where $y$ is the vector of outcomes, $X$ is the design matrix, and $\varepsilon$ is the vector of errors. Here $X$ has $N \equiv n(n-1)/2$ rows, each corresponding to an (unordered) pair of individuals in $[n]$. Since the $x_{ij}$ and $\varepsilon_{ij}$ are uncorrelated, the ordinary least squares estimator $\hat\beta = (X^T X)^{-1} X^T y$ of $\beta$ is unbiased. However, $\hat\beta$ may not be efficient because the errors $\varepsilon_{ij}$ may be correlated. For example, if $\varepsilon_{ij} = u_i + u_j + v_{ij}$ with $u_i$, $u_j$, and $v_{ij}$ independent, then $\mathrm{Cov}(\varepsilon_{ij}, \varepsilon_{jk}) = \mathrm{Var}(u_j)$ for distinct $i$, $j$, and $k$. Intuitively, the pairs $\{i, j\}$ and $\{j, k\}$ are linked through individual $j$, and so any errors specific to that individual affect the errors for both pairs. Consequently, the homoskedastic estimator $\widehat{\mathrm{Var}}_{\text{Hom.}}(\hat\beta) = \hat\sigma^2 (X^T X)^{-1}$, with $\hat\sigma^2 = \frac{1}{N} \sum_{i<j} \hat\varepsilon_{ij}^2$ and $\hat\varepsilon_{ij} = y_{ij} - x_{ij}\hat\beta$, will typically underestimate the variance of $\hat\beta$ by failing to account for linked pairs having dependent errors.
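As a minimal sketch of the quantities in this paragraph, the NumPy snippet below computes $\hat\beta$ and the homoskedastic variance estimate, then checks the covariance claim by Monte Carlo. The function name `ols_homoskedastic`, the seed, and the number of draws are illustrative choices, and the error components are assumed to be standard normal, so that $\mathrm{Var}(u_j) = 1$.

```python
import numpy as np

def ols_homoskedastic(X, y):
    """Return the OLS coefficients and the homoskedastic variance estimate."""
    N = X.shape[0]
    bread = np.linalg.inv(X.T @ X)      # B = (X'X)^{-1}
    beta_hat = bread @ X.T @ y          # OLS estimator
    resid = y - X @ beta_hat            # residuals eps_hat
    sigma2_hat = resid @ resid / N      # (1/N) * sum of squared residuals
    return beta_hat, sigma2_hat * bread

# Monte Carlo check that Cov(eps_ij, eps_jk) = Var(u_j) when
# eps_ij = u_i + u_j + v_ij with independent components.
# Assumption: every component is standard normal, so Var(u_j) = 1.
rng = np.random.default_rng(0)
draws = 200_000
u_i, u_j, u_k, v_ij, v_jk = rng.standard_normal((5, draws))
eps_ij = u_i + u_j + v_ij
eps_jk = u_j + u_k + v_jk
cov_linked = np.cov(eps_ij, eps_jk)[0, 1]  # sample covariance of linked errors
```

The sample covariance `cov_linked` should sit close to one, whereas the same calculation for two pairs with no shared individual should sit close to zero.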

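So, how can we account for such dependence? Consider the “sandwich” form $\mathrm{Var}(\hat\beta) = BMB$ of the (co)variance matrix for $\hat\beta$, where $B = (X^T X)^{-1}$ is the “bread” matrix and $M = X^T V X$ is the “meat” matrix with $V = \mathrm{Var}(\varepsilon)$ the error (co)variance matrix. We need to estimate $M$ because we don’t observe the $\varepsilon_{ij}$. Indexing pairs by $p$, the homoskedastic estimator defined above uses $\hat{M}_{\text{Hom.}} = \hat\sigma^2 X^T X = \hat\sigma^2 \sum_{p=1}^N x_p^T x_p$, which assumes all errors have equal variance. In contrast, White (1980) suggests using $\hat{M}_{\text{White}} = X^T \mathrm{diag}(\hat\varepsilon_p^2) X = \sum_{p=1}^N \hat\varepsilon_p^2 x_p^T x_p$, which allows for unequal error variances (heteroskedasticity). But neither $\hat{M}_{\text{Hom.}}$ nor $\hat{M}_{\text{White}}$ allows for dyadic dependence among the errors. To that end, Aronow et al. (2017) suggest augmenting White’s estimator via

$$\hat{M}_{\text{Aronow}} = \hat{M}_{\text{White}} + \sum_{p=1}^N \sum_{q \in \mathcal{D}(p)} \hat\varepsilon_p \hat\varepsilon_q x_p^T x_q,$$

where $\mathcal{D}(p)$ is the set of pairs $q \neq p$ linked to $p$ by a shared individual. We can express $\hat{M}_{\text{Aronow}}$ in matrix form as $\hat{M}_{\text{Aronow}} = X^T (D \odot \hat\varepsilon \hat\varepsilon^T) X$, where $D = (d_{pq})$ is the dyadic dependence matrix with

$$d_{pq} = \begin{cases} 1 & \text{if pairs } p \text{ and } q \text{ are linked} \\ 0 & \text{otherwise,} \end{cases}$$

and where $\odot$ denotes element-wise multiplication. Each pair counts as linked to itself, so $d_{pp} = 1$ and the diagonal terms of the matrix form reproduce $\hat{M}_{\text{White}}$. Aronow et al. show that, under mild conditions, $B \hat{M}_{\text{Aronow}} B$ is a consistent estimator for $\mathrm{Var}(\hat\beta)$ when the data exhibit dyadic dependence.¹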
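The three meat matrices can be sketched in NumPy as below. This is an illustrative implementation rather than Aronow et al.'s own code: the helper names (`dyad_pairs`, `dependence_matrix`, `meat_estimates`, `variance_estimates`) are hypothetical, and the dependence matrix is built densely, which assumes $N$ is small enough for an $N \times N$ array to fit in memory.

```python
import itertools
import numpy as np

def dyad_pairs(n):
    """List the N = n(n-1)/2 unordered pairs (i, j) with i < j."""
    return list(itertools.combinations(range(n), 2))

def dependence_matrix(pairs):
    """D[p, q] = 1 if pairs p and q share an individual (so D[p, p] = 1)."""
    pairs = np.asarray(pairs)
    # Compare every member of pair p with every member of pair q.
    shared = pairs[:, None, :, None] == pairs[None, :, None, :]
    return shared.any(axis=(2, 3)).astype(float)

def meat_estimates(X, resid, D):
    """Return the homoskedastic, White, and dyadic-robust meat matrices."""
    N = X.shape[0]
    meat_hom = (resid @ resid / N) * (X.T @ X)
    meat_white = X.T @ (resid[:, None] ** 2 * X)
    meat_aronow = X.T @ (D * np.outer(resid, resid)) @ X
    return meat_hom, meat_white, meat_aronow

def variance_estimates(X, y, pairs):
    """Fit OLS and return beta_hat with the three sandwich estimates B M B."""
    bread = np.linalg.inv(X.T @ X)
    beta_hat = bread @ X.T @ y
    resid = y - X @ beta_hat
    meats = meat_estimates(X, resid, dependence_matrix(pairs))
    names = ("hom", "white", "aronow")
    return beta_hat, {k: bread @ m @ bread for k, m in zip(names, meats)}
```

Here the rows of `X` and entries of `y` are assumed to follow the same order as `pairs`. For large networks a sparse or individual-by-individual accumulation would be a more memory-friendly design than the dense `D` used here.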

To see Aronow et al.'s estimator in action, suppose the DGP is given by the system

$$\begin{aligned} y_{ij} &= \beta x_{ij} + \varepsilon_{ij} \\ x_{ij} &= z_i + z_j \\ \varepsilon_{ij} &= u_i + u_j + v_{ij}, \end{aligned}$$

where $z_i$, $z_j$, $u_i$, $u_j$, and $v_{ij}$ are iid standard normal, and $\beta = 1$ is the (scalar) coefficient to be estimated. Both the $x_{ij}$ and the $\varepsilon_{ij}$ exhibit dyadic dependence, so we expect the homoskedastic and White estimators to underestimate the true variance of $\hat\beta$. Indeed, the box plots below show that Aronow et al.'s estimator is less biased than the homoskedastic and White estimators, and gets more accurate as the number of individuals $n$ grows.
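A simulation along these lines might look like the following sketch. The number of replications, the seed, the value of $n$, and the function names are assumptions made for illustration, not the settings behind the original box plots. The "true" variance is approximated by the sample variance of $\hat\beta$ across replications.

```python
import itertools
import numpy as np

def simulate_once(n, pairs, D, rng, beta=1.0):
    """Draw one data set from the DGP; return beta_hat and three variance estimates."""
    i, j = pairs[:, 0], pairs[:, 1]
    z = rng.standard_normal(n)
    u = rng.standard_normal(n)
    v = rng.standard_normal(len(pairs))
    x = z[i] + z[j]                     # x_ij = z_i + z_j
    eps = u[i] + u[j] + v               # eps_ij = u_i + u_j + v_ij
    y = beta * x + eps
    bread = 1.0 / (x @ x)               # scalar (X'X)^{-1}
    beta_hat = bread * (x @ y)
    resid = y - beta_hat * x
    xe = x * resid
    meat_hom = np.mean(resid ** 2) * (x @ x)
    meat_white = xe @ xe
    meat_aronow = xe @ D @ xe           # equals x' (D * ee') x for scalar x
    return beta_hat, bread ** 2 * np.array([meat_hom, meat_white, meat_aronow])

def monte_carlo(n, reps=200, seed=0):
    """Compare mean variance estimates with the simulated variance of beta_hat."""
    rng = np.random.default_rng(seed)
    pairs = np.array(list(itertools.combinations(range(n), 2)))
    shared = pairs[:, None, :, None] == pairs[None, :, None, :]
    D = shared.any(axis=(2, 3)).astype(float)
    draws = [simulate_once(n, pairs, D, rng) for _ in range(reps)]
    beta_hats = np.array([b for b, _ in draws])
    estimates = np.array([v for _, v in draws]).mean(axis=0)
    return {
        "true": beta_hats.var(ddof=1),
        "hom": estimates[0],
        "white": estimates[1],
        "aronow": estimates[2],
    }

summary = monte_carlo(n=20)
```

Calling `monte_carlo` for a range of `n` values and comparing each estimate with `summary["true"]` gives a numerical counterpart to the comparison described above.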

Aronow et al.'s estimator can also be applied to generalized linear models. For example, suppose

$$y_{ij} = \begin{cases} 1 & \text{if nodes } i \text{ and } j \text{ are adjacent} \\ 0 & \text{otherwise} \end{cases}$$

is an indicator for the event in which nodes $i$ and $j$ are adjacent in a network. We can model the link formation process as $\Pr(y_{ij} = 1) = \Lambda^{-1}(x_{ij}\beta + \varepsilon_{ij})$, where $\Lambda(x) \equiv \log(x / (1 - x))$ is the logit link function. The logistic regression estimate $\hat\beta$ of $\beta$ reveals how the observable characteristics $x_{ij}$ of nodes $i$ and $j$ determine their probability of being adjacent. We can estimate the variance of $\hat\beta$ consistently by letting $\hat{P}_{ij} = \Lambda^{-1}(x_{ij}\hat\beta)$ be the predicted probability for pair $\{i, j\}$, replacing the bread matrix $B = (X^T X)^{-1}$ with $\hat{B} = (X^T \mathrm{diag}(\hat{P}_{ij}(1 - \hat{P}_{ij})) X)^{-1}$, and computing $\hat{B} \hat{M}_{\text{Aronow}} \hat{B}$. My co-authors and I use this approach in “Research Funding and Collaboration”: we estimate how grant proposal outcomes determine the probability with which pairs of researchers co-author, and we compare $\hat\sigma^2 \hat{B}$ and $\hat{B} \hat{M}_{\text{Aronow}} \hat{B}$ to show that our inferences are robust to dyadic dependence.
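The logistic case can be sketched as follows, under several assumptions that the text does not pin down: the model is fitted by plain Newton-Raphson, the residuals entering $\hat{M}_{\text{Aronow}}$ are taken to be the response residuals $y_{ij} - \hat{P}_{ij}$, and the simulated network (its size, coefficients, and seed) is purely illustrative. The function names are hypothetical, and the sketch returns $\hat{B}$ itself as the model-based comparison, leaving out the $\hat\sigma^2$ scaling.

```python
import itertools
import numpy as np

def fit_logit(X, y, max_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson; return the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1 - p))[:, None])
        step = np.linalg.solve(hessian, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def dyadic_logit_variance(X, y, pairs):
    """Return beta_hat, the bread B_hat, and the dyadic-robust B_hat M B_hat."""
    beta_hat = fit_logit(X, y)
    p_hat = 1.0 / (1.0 + np.exp(-X @ beta_hat))
    bread = np.linalg.inv(X.T @ (X * (p_hat * (1 - p_hat))[:, None]))
    resid = y - p_hat                   # assumption: response residuals
    pairs = np.asarray(pairs)
    shared = pairs[:, None, :, None] == pairs[None, :, None, :]
    D = shared.any(axis=(2, 3)).astype(float)
    meat = X.T @ (D * np.outer(resid, resid)) @ X
    return beta_hat, bread, bread @ meat @ bread

# Illustrative data: links depend on x_ij = z_i + z_j and node-level shocks.
rng = np.random.default_rng(1)
n = 30
pairs = list(itertools.combinations(range(n), 2))
i, j = np.array(pairs).T
z, u = rng.standard_normal((2, n))
x = z[i] + z[j]
X = np.column_stack([np.ones(len(pairs)), x])
latent = 0.5 * x + 0.5 * (u[i] + u[j])
y = (rng.random(len(pairs)) < 1.0 / (1.0 + np.exp(-latent))).astype(float)
beta_hat, var_model, var_dyadic = dyadic_logit_variance(X, y, pairs)
```

Comparing the diagonals of `var_model` and `var_dyadic` indicates how much the standard errors change once dyadic dependence is allowed for.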


  1. Fafchamps and Gubert (2007) describe a variance estimator similar to Aronow et al.'s but do not establish its consistency.