Let \([n]\equiv\{1,2,\ldots,n\}\) be a set of individuals. Suppose I have data \(\{(y_{ij},x_{ij}):i,j\in[n]\ \text{with}\ i<j\}\) on pairs in \([n]\) generated by the process $$\renewcommand{\epsilon}{\varepsilon} y_{ij}=x_{ij}\beta+\epsilon_{ij},$$ where \(x_{ij}\) is a row vector of pair \(\{i,j\}\)'s characteristics, \(\beta\) is a vector of coefficients to be estimated, and \(\epsilon_{ij}\) is a random error term with zero mean conditional on the \(x_{ij}\). For example, \([n]\) could be the nodes in a network, \(x_{ij}\) the dimensions along which nodes \(i\) and \(j\) interact, and \(y_{ij}\) the outcome of such interaction.
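For concreteness, here is a minimal Python sketch (my own illustration, not part of the setup above) of the dyadic index set: one observation per unordered pair \(\{i,j\}\) with \(i<j\), with individuals labelled \(0,\ldots,n-1\).

```python
# Enumerate the unordered pairs {i, j} with i < j for n = 5 individuals.
import itertools

n = 5
pairs = list(itertools.combinations(range(n), 2))
print(pairs)       # [(0, 1), (0, 2), ..., (3, 4)]
print(len(pairs))  # n(n - 1) / 2 = 10 observations
```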

We can rewrite the data-generating process (DGP) in matrix form as $$y=X\beta+\epsilon,$$ where \(y\) is the vector of outcomes, \(X\) is the design matrix, and \(\epsilon\) is the vector of errors. Here \(X\) has $$N\equiv\frac{n(n-1)}{2}$$ rows, each corresponding to an (unordered) pair of individuals in \([n]\). Since the errors have zero mean conditional on the \(x_{ij}\), the ordinary least squares (OLS) estimator $$\hat\beta=(X^T\!X)^{-1}X^T\!y$$ of \(\beta\) is unbiased. However, \(\hat\beta\) may not be efficient because the errors \(\epsilon_{ij}\) may be correlated. For example, if $$\epsilon_{ij}=u_i+u_j+v_{ij}$$ with \(u_i\), \(u_j\), and \(v_{ij}\) independent, then $$\DeclareMathOperator{\Cov}{Cov} \DeclareMathOperator{\Var}{Var} \Cov(\epsilon_{ij},\epsilon_{jk})=\Var(u_j).$$ Intuitively, the pairs \(\{i,j\}\) and \(\{j,k\}\) are linked through individual \(j\), and so any errors specific to that individual affect the errors for both pairs. Consequently, the homoskedastic estimator $$\widehat{\Var}_{\text{Hom.}}(\hat\beta)=\hat\sigma^2(X^T\!X)^{-1}$$ with $$\hat\sigma^2=\frac{1}{N}\sum_{ij}\hat\epsilon_{ij}^2$$ and $$\hat\epsilon_{ij}=y_{ij}-x_{ij}\hat\beta$$ will typically underestimate the variance of \(\hat\beta\) by failing to account for linked pairs having dependent errors.
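Here is a small numpy sketch of these formulas (a toy illustration with one regressor, \(\beta=1\), and iid errors, all my own assumptions):

```python
# OLS and the homoskedastic variance estimator on simulated dyadic data.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 30
pairs = list(itertools.combinations(range(n), 2))
N = len(pairs)                                 # N = n(n - 1) / 2
X = rng.standard_normal((N, 1))                # one characteristic per pair
y = X @ np.array([1.0]) + rng.standard_normal(N)  # beta = 1, iid errors

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'y
resid = y - X @ beta_hat                       # eps_hat_ij = y_ij - x_ij beta_hat
sigma2_hat = (resid ** 2).sum() / N            # (1/N) * sum of squared residuals
var_hom = sigma2_hat * np.linalg.inv(X.T @ X)  # homoskedastic variance estimate
```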

So, how can we account for such dependence? Consider the “sandwich” form $$\Var(\hat\beta)=BMB$$ of the (co)variance matrix for \(\hat\beta\), where \(B=(X^T\!X)^{-1}\) is the “bread” matrix and \(M=X^T\!VX\) is the “meat” matrix with \(V=\Var(\epsilon)\) the error (co)variance matrix. We need to estimate \(M\) because we don’t observe the \(\epsilon_{ij}\). Indexing pairs by \(p\), the homoskedastic estimator defined above uses $$\begin{align} \hat{M}_{\text{Hom.}} &= \hat\sigma^2X^T\!X \\ &= \hat\sigma^2\sum_{p=1}^Nx_p^T\!x_p, \end{align}$$ which assumes all errors have equal variance. In contrast, White (1980) suggests using $$\begin{align} \hat{M}_{\text{White}} &= X^T\!\mathrm{diag}\left(\hat\epsilon_p^2\right)X \\ &= \sum_{p=1}^N\hat\epsilon_p^2x_p^T\!x_p, \end{align}$$ which allows for unequal error variances (heteroskedasticity). But neither \(\hat{M}_{\text{Hom.}}\) nor \(\hat{M}_{\text{White}}\) allows for dyadic dependence among the errors. To that end, Aronow et al. (2017) suggest augmenting White’s estimator via $$\begin{align} \hat{M}_{\text{Aronow}} &= \hat{M}_{\text{White}}+\sum_{p=1}^N\sum_{q\in\mathcal{D}(p)}\hat\epsilon_p\hat\epsilon_qx_p^T\!x_q, \end{align}$$ where \(\mathcal{D}(p)\) is the set of pairs \(q\not=p\) linked to \(p\) by a shared individual. We can express \(\hat{M}_{\text{Aronow}}\) in matrix form as $$\hat{M}_{\text{Aronow}}=X^T\!\left(D\odot\hat\epsilon\hat\epsilon^T\!\right)X,$$ where \(D=(d_{pq})\) is the dyadic dependence matrix with $$d_{pq}=\begin{cases} 1 & \text{if pairs}\ p\ \text{and}\ q\ \text{are linked}\\ 0 & \text{otherwise}, \end{cases}$$ and where \(\odot\) denotes element-wise multiplication. (Every pair trivially shares its individuals with itself, so \(d_{pp}=1\) and the diagonal terms of \(D\odot\hat\epsilon\hat\epsilon^T\) reproduce \(\hat{M}_{\text{White}}\).) Aronow et al. show that, under mild conditions, \(B\hat{M}_{\text{Aronow}}B\) is a consistent estimator for \(\Var(\hat\beta)\) when the data exhibit dyadic dependence.1
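These formulas also translate directly into numpy. The sketch below (again a toy example under my own simulated data) builds \(D\) from shared individuals and computes the three meat matrices:

```python
# Homoskedastic, White, and Aronow et al. meat matrices on toy dyadic data.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 10
pairs = list(itertools.combinations(range(n), 2))
N = len(pairs)
X = rng.standard_normal((N, 1))
y = X[:, 0] + rng.standard_normal(N)           # beta = 1 for illustration

B = np.linalg.inv(X.T @ X)                     # bread matrix
resid = y - X @ (B @ X.T @ y)                  # OLS residuals

M_hom = (resid ** 2).mean() * (X.T @ X)        # homoskedastic meat
M_white = X.T * resid ** 2 @ X                 # White (1980) meat

# d_pq = 1 when pairs p and q share an individual (including p = q).
sets = [set(p) for p in pairs]
D = np.array([[float(bool(a & b)) for b in sets] for a in sets])

M_aronow = X.T @ (D * np.outer(resid, resid)) @ X  # Aronow et al. meat
var_aronow = B @ M_aronow @ B                      # dyadic-robust variance estimate
```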

To see Aronow et al.’s estimator in action, suppose the DGP is given by the system $$\begin{align} y_{ij} &= \beta x_{ij}+\epsilon_{ij} \\ x_{ij} &= z_i+z_j \\ \epsilon_{ij} &= u_i+u_j+v_{ij}, \end{align}$$ where \(z_i\), \(z_j\), \(u_i\), \(u_j\), and \(v_{ij}\) are iid standard normal, and \(\beta=1\) is the (scalar) coefficient to be estimated. Both the \(x_{ij}\) and the \(\epsilon_{ij}\) exhibit dyadic dependence, so we expect the homoskedastic and White estimators to underestimate the true variance of \(\hat\beta\). Indeed, the box plots below show that Aronow et al.’s estimator is less biased than the homoskedastic and White estimators, and becomes more accurate as the number of individuals \(n\) grows.
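Here is a numpy sketch of such a simulation. It prints the average of each variance estimate next to the sampling variance of \(\hat\beta\) across replications rather than drawing box plots; the sample size and replication count are my own illustrative choices, not necessarily those behind the plots.

```python
# Compare the three variance estimators under the dyadic DGP above.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 20
idx = np.array(list(itertools.combinations(range(n), 2)))
sets = [set(p) for p in idx]
D = np.array([[float(bool(a & b)) for b in sets] for a in sets])  # linked pairs

def draw():
    z, u = rng.standard_normal(n), rng.standard_normal(n)
    x = z[idx[:, 0]] + z[idx[:, 1]]                   # x_ij = z_i + z_j
    eps = u[idx[:, 0]] + u[idx[:, 1]] + rng.standard_normal(len(idx))
    return x.reshape(-1, 1), x + eps                  # y_ij with beta = 1

estimates = []
for _ in range(200):
    X, y = draw()
    B = np.linalg.inv(X.T @ X)
    beta_hat = (B @ X.T @ y)[0]
    e = y - X[:, 0] * beta_hat
    hom = ((e ** 2).mean() * B)[0, 0]
    white = (B @ (X.T * e ** 2 @ X) @ B)[0, 0]
    aronow = (B @ (X.T @ (D * np.outer(e, e)) @ X) @ B)[0, 0]
    estimates.append((beta_hat, hom, white, aronow))

est = np.array(estimates)
print("sampling variance of beta_hat:", est[:, 0].var())
print("mean Hom. / White / Aronow estimates:", est[:, 1:].mean(axis=0))
```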

Aronow et al.’s estimator can also be applied to generalized linear models. For example, suppose $$y_{ij}=\begin{cases} 1 & \text{if nodes}\ i\ \text{and}\ j\ \text{are adjacent} \\ 0 & \text{otherwise} \end{cases}$$ is an indicator for the event in which nodes \(i\) and \(j\) are adjacent in a network. We can model the link formation process as $$\Pr(y_{ij}=1)=\Lambda^{-1}(x_{ij}\beta+\epsilon_{ij}),$$ where \(\Lambda(x)\equiv\log(x/(1-x))\) is the logit link function. The logistic regression estimate \(\hat\beta\) of \(\beta\) reveals how the observable characteristics \(x_{ij}\) of nodes \(i\) and \(j\) determine their probability of being adjacent. We can estimate the variance of \(\hat\beta\) consistently by letting \(\hat{P}_{ij}=\Lambda^{-1}(x_{ij}\hat\beta)\) be the predicted probability for pair \(\{i,j\}\), replacing the bread matrix \(B=(X^T\!X)^{-1}\) with $$\hat{B}=\left(X^T\mathrm{diag}\left(\hat{P}_{ij}\left(1-\hat{P}_{ij}\right)\right)X\right)^{-1},$$ and computing \(\hat{B}\hat{M}_{\text{Aronow}}\hat{B}\), where the residuals entering \(\hat{M}_{\text{Aronow}}\) are now \(\hat\epsilon_{ij}=y_{ij}-\hat{P}_{ij}\). My co-authors and I use this approach in “Research Funding and Collaboration”: we estimate how grant proposal outcomes determine the probability with which pairs of researchers co-author, and we compare \(\hat\sigma^2\hat{B}\) and \(\hat{B}\hat{M}_{\text{Aronow}}\hat{B}\) to show that our inferences are robust to dyadic dependence.
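Here is a sketch of that recipe, with a hand-rolled Newton-Raphson logit fit on a hypothetical link-formation DGP (the intercept, coefficient, and node effects are all my own assumptions):

```python
# Dyadic-robust sandwich variance for a logistic regression on network links.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 30
idx = np.array(list(itertools.combinations(range(n), 2)))
z, u = rng.standard_normal(n), rng.standard_normal(n)
X = np.column_stack([np.ones(len(idx)), z[idx[:, 0]] + z[idx[:, 1]]])
logits = X @ np.array([-1.0, 1.0]) + u[idx[:, 0]] + u[idx[:, 1]]
y = (rng.random(len(idx)) < 1 / (1 + np.exp(-logits))).astype(float)

beta_hat = np.zeros(X.shape[1])                # Newton-Raphson for the logit MLE
for _ in range(25):
    P = 1 / (1 + np.exp(-X @ beta_hat))
    W = P * (1 - P)
    beta_hat += np.linalg.solve(X.T * W @ X, X.T @ (y - P))

P = 1 / (1 + np.exp(-X @ beta_hat))
resid = y - P                                  # residuals y_ij - P_hat_ij
B_hat = np.linalg.inv(X.T * (P * (1 - P)) @ X) # bread from the formula above

sets = [set(p) for p in idx]
D = np.array([[float(bool(a & b)) for b in sets] for a in sets])
M_aronow = X.T @ (D * np.outer(resid, resid)) @ X
var_aronow = B_hat @ M_aronow @ B_hat          # dyadic-robust variance of beta_hat
```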


  1. Fafchamps and Gubert (2007) describe a variance estimator similar to Aronow et al.’s but do not establish its consistency. ↩︎