Paying for the truth

In a previous post, I showed that if the truth doesn’t matter then I’m better off being an ideologue with ideological friends. I discussed the trade-off between (i) experiencing reality and (ii) experiencing what my friends experience. Truth-seeking made sense only when the benefit of (i) exceeded the cost of forgoing (ii). This post discusses another cost of truth-seeking: having to pay—financially, cognitively, or emotionally—for information.

One way to model that cost is as follows.¹ Suppose the truth is determined by a random variable $\theta\in\{0,1\}$. I learn about $\theta$ by observing a signal $s(x)\in\{0,1\}$ with precision $$\Pr(s(x)=\theta)=\frac{1+x}{2}.$$ The parameter $x\in[0,1]$ determines the signal’s quality. If $x=1$ then the signal is fully informative; if $x=0$ then it is uninformative.

My prior estimate $\theta_0=\Pr(\theta=1)\in[0.5,1]$ is based on no information; it reflects my ideology. I use the realization of $s(x)$ and my prior $\theta_0$ to form a posterior estimate $$\hat\theta(s(x))=\Pr\left(\theta=1\,\vert\,s(x)\right)$$ via Bayes’ rule. I care about the mean squared error $$\newcommand{\E}{\mathrm{E}} \newcommand{\MSE}{\mathrm{MSE}} \MSE(x)=\E\left[\left(\theta-\hat\theta(s(x))\right)^2\right]$$ of my posterior estimate, where $\E$ is the expectation operator taken with respect to the joint distribution of $\theta$ and $s(x)$ given my prior $\theta_0$. But I also care about the cost $cx$ I endure from observing a signal of quality $x$. This cost reflects the resources I use to seek the information and process it (e.g., money, time, and mental energy). I choose the quality $x^*$ that minimizes $$f(x)=\MSE(x)+cx.$$

The chart below plots my objective $f(x)$ against $x$ when I have prior $\theta_0\in\{0.5,0.7,0.9\}$ and face marginal cost $c\in\{0,0.1,0.2,0.3\}$. Since $f$ is concave in $x$, its minimum over $[0,1]$ is attained at an endpoint: $x=0$ or $x=1$. My choice between these endpoints depends on the value of $c$. If $c$ is small then information is cheap and I “buy” as much as I can. If $c$ is large then information is expensive and I don’t buy any. But there’s no middle ground: I seek all the truth or none of it.
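The all-or-nothing result can be checked numerically. The sketch below is a minimal illustration rather than the code behind the chart: it computes the posterior by Bayes’ rule, sums the squared error over the four $(\theta, s)$ outcomes, and searches a grid of qualities for the minimizer. The function names (`posterior`, `mse`, `objective`, `best_quality`) and the grid size are assumptions made for the example.

```python
def posterior(theta0, x, s):
    """Pr(theta = 1 | signal = s) by Bayes' rule, with prior theta0 = Pr(theta = 1)."""
    q = (1 + x) / 2                      # Pr(s = theta), the signal's precision
    like1 = q if s == 1 else 1 - q       # Pr(s | theta = 1)
    like0 = 1 - q if s == 1 else q       # Pr(s | theta = 0)
    num = theta0 * like1
    den = num + (1 - theta0) * like0     # Pr(s)
    # If the signal value has zero probability, the estimate is never used;
    # return the prior as a harmless placeholder.
    return num / den if den > 0 else theta0


def mse(theta0, x):
    """Expected squared error of the posterior estimate over the joint law of (theta, s)."""
    q = (1 + x) / 2
    total = 0.0
    for theta, p_theta in ((1, theta0), (0, 1 - theta0)):
        for s in (0, 1):
            p_s = q if s == theta else 1 - q          # Pr(s | theta)
            est = posterior(theta0, x, s)
            total += p_theta * p_s * (theta - est) ** 2
    return total


def objective(theta0, x, c):
    """f(x) = MSE(x) + c * x."""
    return mse(theta0, x) + c * x


def best_quality(theta0, c, n=1001):
    """Grid search for the quality in [0, 1] that minimizes f (hypothetical helper)."""
    grid = [i / (n - 1) for i in range(n)]
    return min(grid, key=lambda x: objective(theta0, x, c))
```

With a prior of $\theta_0=0.7$, `best_quality(0.7, 0.1)` lands on $x=1$ and `best_quality(0.7, 0.3)` lands on $x=0$; the grid search never settles on an interior quality in these cases.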

Let $c^*$ be the threshold value of $c$ at which I stop paying for information: the “choke price” of truth. How does $c^*$ depend on my prior $\theta_0$? Intuitively, increasing $\theta_0$ has two competing effects:

1. it increases the error in my posterior estimate when $\theta=0$;
2. it increases my confidence that $\theta=1$.

The first effect makes me want more information, increasing $c^*$. The second effect makes me think I need less information, decreasing $c^*$. The chart below shows that the second effect dominates. The more ideological I am about the value of $\theta$, the cheaper the truth must be for me to seek it. If I’m a pure ideologue (i.e., $\theta_0=1$) then I won’t seek the truth even if it’s free.
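A short calculation makes this concrete. It is a sketch derived from the model as stated above, not something read off the chart. Applying Bayes’ rule for each signal value and averaging the posterior variances gives $$\mathrm{MSE}(x)=\frac{\theta_0(1-\theta_0)\,(1-x^2)}{1-(2\theta_0-1)^2x^2},$$ so $\mathrm{MSE}(0)=\theta_0(1-\theta_0)$ and $\mathrm{MSE}(1)=0$. Because only the endpoints matter, I buy the signal when $f(1)=c$ is below $f(0)=\theta_0(1-\theta_0)$, which gives $$c^*=\theta_0(1-\theta_0).$$ This is decreasing in $\theta_0$ on $[0.5,1]$, falling from $0.25$ at $\theta_0=0.5$ to zero at $\theta_0=1$. For example, $\theta_0=0.7$ gives $c^*=0.21$ and $\theta_0=0.9$ gives $c^*=0.09$.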

The first effect might dominate if I care more about errors when $\theta=0$ than when $\theta=1$. For example, if $\theta=1$ means it will be sunny and $\theta=0$ means it will rain, then I’d rather carry an umbrella I don’t use than be caught in the rain wearing flip-flops. I can capture that asymmetry by replacing the MSE component of my objective with a weighted version $$\newcommand{\WMSE}{\mathrm{WMSE}} \WMSE(x)=\E\left[W(\theta)\cdot\left(\theta-\hat\theta(s(x))\right)^2\right],$$ where the weighting function $$W(\theta)=\begin{cases} 1 & \text{if}\ \theta=1 \\ w & \text{if}\ \theta=0 \end{cases}$$ has $w\ge1$. Increasing $w$ nudges my optimal posterior estimate towards zero because I want to avoid being “confidently wrong” when $\theta=0$. Since $\WMSE(x)$ is concave in $x$, I still optimally pay for all the truth or none of it. But now the choke price $c^*$ at which I stop paying for the truth depends on my prior $\theta_0$ and the error weight $w$.
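To see the nudge towards zero, assume the estimate is chosen to minimize the weighted error conditional on the signal (the text suggests this but does not spell it out). Write $p=\Pr(\theta=1\,\vert\,s(x))$ for the Bayesian posterior. Reporting an estimate $t$ then costs $$p\,(1-t)^2+w\,(1-p)\,t^2$$ in expectation, and setting the derivative with respect to $t$ to zero gives $$t^*=\frac{p}{p+w(1-p)}.$$ With $w=1$ this is just $p$. With $w>1$ it lies strictly below $p$ whenever $0<p<1$: for instance, $p=0.7$ and $w=3$ give $t^*=0.7/1.6\approx0.44$.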

The chart below shows that $c^*$ is non-monotonic in $\theta_0$ when $w$ is large. This is due to the two competing effects described above. The first effect dominates when $w$ is large and my prior is low: it’s really bad to be wrong and I’m not confident I’ll be right. The second effect dominates when $w$ is large and my prior is high: I’m so confident I’ll be right that I don’t care what happens if I’m wrong.
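The non-monotonicity can be reproduced with a few lines of code. The sketch below rests on an assumption the post does not state explicitly: that, given each signal, the estimate is the value minimizing the weighted conditional loss, namely $\Pr(\theta=1,s)/\left(\Pr(\theta=1,s)+w\Pr(\theta=0,s)\right)$. The names `wmse` and `choke_price` are hypothetical, and the choke price is computed as the gap between the two endpoint losses, relying on the concavity noted earlier.

```python
def wmse(theta0, x, w):
    """Weighted MSE of the loss-minimizing estimate (an assumed estimator, see lead-in)."""
    q = (1 + x) / 2                                        # Pr(s = theta)
    total = 0.0
    for s in (0, 1):
        joint1 = theta0 * (q if s == 1 else 1 - q)         # Pr(theta = 1, s)
        joint0 = (1 - theta0) * (1 - q if s == 1 else q)   # Pr(theta = 0, s)
        denom = joint1 + w * joint0
        if denom == 0:
            continue                                       # signal value never occurs
        est = joint1 / denom   # minimizes joint1*(1-t)**2 + w*joint0*t**2 over t
        total += joint1 * (1 - est) ** 2 + w * joint0 * est ** 2
    return total


def choke_price(theta0, w):
    """Cost at which buying x = 1 and buying x = 0 are equally good."""
    # With a concave objective only the endpoints matter, so the threshold
    # solves WMSE(1) + c = WMSE(0).
    return wmse(theta0, 1.0, w) - 0.0 if False else wmse(theta0, 0.0, w) - wmse(theta0, 1.0, w)


if __name__ == "__main__":
    for w in (1, 9):
        prices = [round(choke_price(t, w), 4) for t in (0.5, 0.75, 0.95)]
        print(f"w={w}: choke prices at theta0=0.5, 0.75, 0.95 -> {prices}")
```

Under this assumption the choke price has the closed form $c^*=w\theta_0(1-\theta_0)/\left(\theta_0+w(1-\theta_0)\right)$, which reduces to $\theta_0(1-\theta_0)$ when $w=1$ and peaks at $\theta_0=\sqrt{w}/(1+\sqrt{w})$. With $w=9$ the peak sits at $\theta_0=0.75$, so $c^*$ first rises and then falls as the prior moves from $0.5$ to $1$.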

This example raises a philosophical question: what does it mean for the estimate to be “wrong”? For example, suppose I thought there was a 30% chance of rain. If it rained, was I wrong? What if I thought there was a 5% chance? A 95% chance? Where should I draw the line? On those questions, I recommend Michael Lewis’ discussion with Nate Silver about 17 minutes into this podcast episode.

¹ See here for my discussion of the case when $\theta$ and $s$ are normally distributed.