Ben Davies Blog Research Vita

Let \(X\), \(Y\) and \(Z\) be random variables. Suppose that both \(X\) and \(Y\) are positively correlated with \(Z\). Are \(X\) and \(Y\) positively correlated?

The answer to this question is “not necessarily.” To see why, let \(\rho\in[-1,1]\) be a constant, and define the random variables $$X=\rho Z+W \tag{1}$$ and $$Y=\rho Z-W \tag{2}$$ with \(Z\sim N(0,1)\) and \(W\sim N(0,1-\rho^2)\).1 Then \(W\), \(X\), \(Y\) and \(Z\) have zero means, while \(X\), \(Y\) and \(Z\) have unit variances. It follows that $$\begin{align} \newcommand{\E}{\mathrm{E}} \newcommand{\Corr}{\mathrm{Corr}} \newcommand{\Cov}{\mathrm{Cov}} \newcommand{\Var}{\mathrm{Var}} \Corr(X,Y) &= \frac{\Cov(X,Y)}{\sqrt{\Var(X)}\sqrt{\Var(Y)}} \\ &= \Cov(X,Y) \\ &= \E[XY]-\E[X]\E[Y] \\ &= \E[XY], \end{align}$$ and similarly \(\Corr(X,Z)=\E[XZ]\) and \(\Corr(Y,Z)=\E[YZ]\). Now $$\begin{align} \E[XZ] &= \E[(\rho Z+W)Z] \\ &= \rho\E[Z^2]+\E[WZ] \\ &= \rho\Var(Z)+\rho\E[Z]^2+\Cov(W,Z)+\E[W]\E[Y] \\ &= \rho \end{align}$$ because \(W\) and \(Z\) are independent. A similar argument yields \(\E[YZ]=\rho\). Finally, substituting \((1)\) into \((2)\) so as to eliminate \(W\) gives $$Y=2\rho Z-X,$$ from which we obtain $$\begin{align} \Corr(X,Y) &= \E[XY] \\ &= \E[X(2\rho Z-X)] \\ &= 2\rho\E[XZ]-\E[X^2] \\ &= 2\rho\E[XZ]-\Var(X)+\E[X]^2 \\ &= 2\rho^2-1. \end{align}$$ Thus, if \(\rho\in(0,1/\sqrt{2})\) then \(X\) and \(Y\) share a negative correlation even though both are correlated positively with \(Z\). Intuitively, if \(\rho\) is sufficiently small then the negative correlation between the error terms \(W\) and \(-W\) dominates the positive correlations between \(X\) and \(Z\), and \(Y\) and \(Z\).


  1. Here \(N(\mu,\sigma^2)\) denotes the normal distribution with mean \(\mu\) and variance \(\sigma^2\). ↩︎