In a previous post, I showed that if the truth doesn’t matter then I’m better off being an ideologue with ideological friends. I discussed the trade-off between (i) experiencing reality and (ii) experiencing what my friends experience. Truth-seeking made sense only when the benefit of (i) exceeded the cost of forgoing (ii). This post discusses another cost of truth-seeking: having to pay—financially, cognitively, or emotionally—for information.

One way to model that cost is as follows.1 Suppose the truth is determined by a random variable $\theta \in \{0,1\}$. I learn about $\theta$ by observing a signal $s(x) \in \{0,1\}$ with precision $\Pr(s(x) = \theta) = \frac{1+x}{2}$. The parameter $x \in [0,1]$ determines the signal's quality. If $x = 1$ then the signal is fully informative; if $x = 0$ then it is uninformative.
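To make the signal concrete, here is a minimal simulation sketch of my own (the post contains no code; `draw_signal` is an illustrative name, and the setup assumes NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_signal(theta, x, size):
    """Draw signals s(x) that equal theta with probability (1 + x) / 2."""
    correct = rng.random(size) < (1 + x) / 2
    return np.where(correct, theta, 1 - theta)

# Empirical check: the share of correct signals should be close to (1 + x) / 2.
for x in (0.0, 0.5, 1.0):
    s = draw_signal(theta=1, x=x, size=100_000)
    print(f"x = {x:.1f}: share correct = {(s == 1).mean():.3f}")
```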

My prior estimate $\theta_0 \in [0.5,1]$ of $\theta$ is based on no information; it reflects my ideology. I use the realization of $s(x)$ and my prior $\theta_0$ to form a posterior estimate $\hat\theta(s(x)) = \Pr(\theta = 1 \mid s(x))$ via Bayes' rule. I care about the mean squared error $\mathrm{MSE}(x) = \mathrm{E}[(\theta - \hat\theta(s(x)))^2]$ of my posterior estimate, where $\mathrm{E}$ is the expectation operator taken with respect to the joint distribution of $\theta$ and $s(x)$ given my prior $\theta_0$. But I also care about the cost $cx$ I endure from observing a signal of quality $x$. This cost reflects the resources I use to seek the information and process it (e.g., money, time, and mental energy). I choose the quality $x$ that minimizes $f(x) = \mathrm{MSE}(x) + cx$.

The chart below plots my objective $f(x)$ against $x$ when I have prior $\theta_0 \in \{0.5, 0.7, 0.9\}$ and face marginal cost $c \in \{0, 0.1, 0.2, 0.3\}$. Since $f$ is concave in $x$, it has (constrained) local minima at $x = 0$ and $x = 1$. My choice between these minima depends on the value of $c$. If it's small then information is cheap and I "buy" as much as I can. If it's large then information is expensive and I don't buy any. But there's no middle ground: I seek all the truth or none of it.
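Because $\theta$ and $s(x)$ each take only two values, the objective can be evaluated exactly by summing over the four $(\theta, s)$ pairs. Here is a sketch of my own (illustrative names, not from the post) that does so and shows the endpoint behaviour:

```python
import numpy as np

def posterior(theta0, x, s):
    """Pr(theta = 1 | s) for a signal with precision (1 + x) / 2, via Bayes' rule."""
    like1 = (1 + x) / 2 if s == 1 else (1 - x) / 2   # Pr(s | theta = 1)
    like0 = (1 - x) / 2 if s == 1 else (1 + x) / 2   # Pr(s | theta = 0)
    return theta0 * like1 / (theta0 * like1 + (1 - theta0) * like0)

def mse(theta0, x):
    """E[(theta - theta_hat)^2], averaging over theta and s(x)."""
    total = 0.0
    for theta in (0, 1):
        p_theta = theta0 if theta == 1 else 1 - theta0
        for s in (0, 1):
            p_s_given_theta = (1 + x) / 2 if s == theta else (1 - x) / 2
            total += p_theta * p_s_given_theta * (theta - posterior(theta0, x, s)) ** 2
    return total

def objective(theta0, x, c):
    return mse(theta0, x) + c * x

# The objective is concave in x, so its minimum sits at x = 0 or x = 1.
theta0, c = 0.7, 0.1
print([round(objective(theta0, x, c), 3) for x in np.linspace(0, 1, 11)])
print("f(0) =", round(objective(theta0, 0, c), 3), " f(1) =", round(objective(theta0, 1, c), 3))
```

For these parameters the printed values trace out a concave curve from $f(0) = 0.21$ down to $f(1) = 0.1$, so I buy the fully informative signal.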

Let $c^*$ be the threshold value of $c$ at which I stop paying for information: the "choke price" of truth. How does $c^*$ depend on my prior $\theta_0$? Intuitively, increasing $\theta_0$ has two competing effects:

  1. it increases the error in my posterior estimate when $\theta = 0$;
  2. it increases my confidence that $\theta = 1$.

The first effect makes me want more information, increasing $c^*$. The second effect makes me think I need less information, decreasing $c^*$. The chart below shows that the second effect dominates. The more ideological I am about the value of $\theta$, the cheaper the truth must be for me to seek it. If I'm a pure ideologue (i.e., $\theta_0 = 1$) then I won't seek the truth even if it's free.
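Since the optimum sits at an endpoint, the choke price is the cost at which buying everything ($\mathrm{MSE}(1) = 0$, total cost $c$) ties with buying nothing ($\mathrm{MSE}(0)$ equals the prior variance). Working that out by hand (my own algebra, not stated in the post) gives $c^* = \theta_0(1 - \theta_0)$, which a short sketch can tabulate:

```python
def choke_price(theta0):
    """Threshold c* at which f(1) = c ties with f(0) = MSE(0).

    Assumption (my own algebra): with a fully informative signal MSE(1) = 0,
    and with no signal MSE(0) is the prior variance theta0 * (1 - theta0),
    so c* equals that prior variance.
    """
    return theta0 * (1 - theta0)

for theta0 in (0.5, 0.7, 0.9, 1.0):
    print(f"theta0 = {theta0:.2f}: c* = {choke_price(theta0):.3f}")
```

The table falls from 0.25 at $\theta_0 = 0.5$ to 0 at $\theta_0 = 1$, matching the pattern described above.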

One reason the first effect might dominate is that I care more about errors when $\theta = 0$ than when $\theta = 1$. For example, if $\theta$ indicates whether it will be sunny then I'd rather bring an umbrella I don't use than be caught wearing flip-flops in the rain. I can capture that asymmetry by replacing the MSE component of my objective with a weighted version $\mathrm{WMSE}(x) = \mathrm{E}[W(\theta)(\theta - \hat\theta(s(x)))^2]$, where the weighting function
$$
W(\theta) = \begin{cases} 1 & \text{if } \theta = 1 \\ w & \text{if } \theta = 0 \end{cases}
$$
has $w \ge 1$. Increasing $w$ nudges my optimal posterior estimate towards zero because I want to avoid being "confidently wrong" when $\theta = 0$. Since $\mathrm{WMSE}(x)$ is concave in $x$, I still optimally pay for all the truth or none of it. But now the choke price $c^*$ at which I stop paying for the truth depends on my prior $\theta_0$ and the error weight $w$.
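Under the weighted loss, the estimate that minimizes expected posterior loss is no longer the posterior probability itself; taking the first-order condition (my own algebra) gives $\hat\theta = p / (p + w(1 - p))$ with $p = \Pr(\theta = 1 \mid s)$, which is what pulls the estimate towards zero as $w$ grows. A self-contained sketch, again with illustrative names of my own:

```python
def posterior(theta0, x, s):
    """Pr(theta = 1 | s) for a signal with precision (1 + x) / 2."""
    like1 = (1 + x) / 2 if s == 1 else (1 - x) / 2
    like0 = (1 - x) / 2 if s == 1 else (1 + x) / 2
    return theta0 * like1 / (theta0 * like1 + (1 - theta0) * like0)

def weighted_estimate(p, w):
    """Minimizer of p * (1 - t)**2 + w * (1 - p) * t**2; larger w pulls it toward 0."""
    return p / (p + w * (1 - p))

def wmse(theta0, x, w):
    """E[W(theta) * (theta - theta_hat)^2] with W(1) = 1 and W(0) = w."""
    total = 0.0
    for theta in (0, 1):
        p_theta = theta0 if theta == 1 else 1 - theta0
        weight = 1 if theta == 1 else w
        for s in (0, 1):
            p_s = (1 + x) / 2 if s == theta else (1 - x) / 2
            t_hat = weighted_estimate(posterior(theta0, x, s), w)
            total += p_theta * p_s * weight * (theta - t_hat) ** 2
    return total

# Larger w pulls the no-signal estimate toward zero and raises the no-signal loss.
for w in (1, 5, 10):
    print(f"w = {w:2d}: estimate with no signal = {weighted_estimate(0.7, w):.3f}, "
          f"WMSE(0) = {wmse(0.7, 0.0, w):.3f}")
```

With $w = 1$ this collapses back to the unweighted case (the no-signal estimate is just the prior 0.7 and WMSE(0) is the prior variance 0.21).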

The chart below shows that $c^*$ is non-monotonic in $\theta_0$ when $w$ is large. This is due to the two competing effects described above. The first effect dominates when $w$ is large and my prior is low. In that case, it's really bad to be wrong and I'm not confident I'll be right. The second effect dominates when $w$ is large and my prior is high. In that case, I'm so confident I'll be right that I don't care what happens if I'm wrong.
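As before, concavity puts the optimum at an endpoint, so the choke price is the cost at which the fully informative signal ($\mathrm{WMSE}(1) = 0$, total cost $c$) ties with buying nothing, i.e. $c^* = \mathrm{WMSE}(0)$. Evaluating that by hand (again my own algebra, not from the post) gives $c^* = \frac{w\theta_0(1-\theta_0)}{\theta_0 + w(1-\theta_0)}$, which is enough to see the non-monotonicity numerically:

```python
def choke_price_weighted(theta0, w):
    """c* = WMSE(0): my own closed form for the no-signal weighted loss,
    using the estimate theta0 / (theta0 + w * (1 - theta0))."""
    return w * theta0 * (1 - theta0) / (theta0 + w * (1 - theta0))

for w in (1, 10, 50):
    row = "  ".join(f"{t0:.2f} -> {choke_price_weighted(t0, w):.3f}"
                    for t0 in (0.5, 0.6, 0.7, 0.8, 0.9, 1.0))
    print(f"w = {w:2d}: {row}")
```

With $w = 1$ the choke price falls steadily in $\theta_0$; with $w = 10$ or $w = 50$ it first rises and then falls, the hump in the chart.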

This example raises a philosophical question: what does it mean for the estimate to be “wrong?” For example, suppose I thought there was a 30% chance of rain. If it rained, was I wrong? What if I thought there was a 5% chance? A 95% chance? Where should I draw the line? On those questions, I recommend Michael Lewis’ discussion with Nate Silver about 17 minutes into this podcast episode.


  1. See here for my discussion of the case when $\theta$ and $s$ are normally distributed.