This post describes a continuous-time model of Bayesian learning about a binary state. It complements the discrete-time models discussed in previous posts. I present the model, discuss its learning dynamics, and derive these dynamics analytically.

The model has been used to study decision times (Fudenberg et al., 2018), experimentation (Bolton and Harris, 1999; Moscarini and Smith, 2001), information acquisition (Morris and Strack, 2019), and persuasion (Liao, 2021). It also underlies the drift-diffusion model of reaction times used by psychologists; see Ratcliff (1978) for an early example, and Hébert and Woodford (2023) or Smith (2000) for related discussions.

## Model

Suppose I want to learn about a state `\(\mu\)` that may be high (equal to `\(H\)`) or low (equal to `\(L<H\)`).
I observe a continuous sample path `\((X_t)_{t\ge0}\)` with random, instantaneous increments
`$$\DeclareMathOperator{\E}{E} \newcommand{\der}{\mathrm{d}} \newcommand{\R}{\mathbb{R}} \der X_t=\mu\der t+\sigma \der W_t,$$`

where `\(\sigma>0\)` amplifies the noise generated by the standard Wiener process `\((W_t)_{t\ge0}\)`.
These increments provide noisy signals of the state `\(\mu\)`.
I use these signals, my prior belief `\(p_0=\Pr(\mu=H)\)`, and Bayes’ rule to form a posterior belief
`$$p_t\equiv \Pr\left(\mu=H\mid (X_s)_{s<t}\right)$$`

about `\(\mu\)` given the sample path observed up to time `\(t\)`.
As shown below, this posterior belief has increments
`$$\der p_t=p_t(1-p_t)\frac{(H-L)}{\sigma}\der Z_t,$$`

where `\((Z_t)_{t\ge0}\)` is a Wiener process with respect to my information at time `\(t\)`.
Its increments
`$$\der Z_t=\frac{1}{\sigma}\left(\der X_t-\hat\mu_t\der t\right)$$`

exceed zero precisely when the corresponding increments `\(\der X_t\)` in the sample path exceed the increments `\(\hat\mu_t\der t\)` implied by my posterior estimates
`$$\begin{align} \hat\mu_t &\equiv \E\left[\mu\mid (X_s)_{s<t}\right] \\ &= p_tH+(1-p_t)L. \end{align}$$`

## Learning dynamics

My belief increments `\(\der p_t\)` get smaller as `\(p_t\)` approaches zero or one.
The ratio `\((H-L)/\sigma\)` controls how quickly this happens.
Intuitively, if `\((H-L)\)` is large then the high and low states are easy to tell apart from the trends in `\((X_t)_{t\ge0}\)` they imply.
But if `\(\sigma\)` is large then these trends are blurred by the random fluctuations `\(\sigma\der W_t\)`.
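
As a rough numerical illustration of this point, the sketch below evaluates the coefficient `p(1-p)(H-L)/sigma` that multiplies `\(\der Z_t\)` in the belief increment. The function name `belief_volatility` and the parameter values are my own choices for illustration and do not come from any library.

```python
import numpy as np

def belief_volatility(p, H=1.0, L=0.0, sigma=1.0):
    """Coefficient on dZ_t in the belief increment: p(1-p)(H-L)/sigma."""
    p = np.asarray(p, dtype=float)
    return p * (1.0 - p) * (H - L) / sigma

# Illustrative beliefs, ranging from nearly certain "low" to nearly certain "high"
beliefs = np.array([0.01, 0.1, 0.5, 0.9, 0.99])
for sigma in (1.0, 2.0):
    print(f"sigma={sigma}:", belief_volatility(beliefs, sigma=sigma))
```

The coefficient peaks at `p = 0.5` and vanishes as `p` approaches zero or one, and doubling `sigma` halves it at every belief.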

I illustrate these dynamics in the chart below.
It shows the sample paths `\((X_t)_{t\ge0}\)` and corresponding beliefs `\((p_t)_{t\ge0}\)` when `\((H,L,\mu,p_0)=(1,0,H,0.5)\)` and `\(\sigma\in\{1,2\}\)`.
I use the same realization of the underlying Wiener process `\((W_t)_{t\ge0}\)` for each value of `\(\sigma\)`.
Increasing this value slows my convergence to the correct belief `\(p_t=1\)` because it makes the signals `\(\der X_t\)` less informative about `\(\mu=H\)`.
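
A simulation along these lines might look like the sketch below. It is not the code behind the chart: the horizon `T`, the grid size `n`, the seed, the starting value `X_0 = 0`, and the function name `simulate_beliefs` are all assumptions of mine. Beliefs are updated by applying Bayes' rule to each discretized increment, using the log of the ratio of the two normal densities for `dX` under the high and low states.

```python
import numpy as np

def simulate_beliefs(sigma, dW, dt, H=1.0, L=0.0, mu=1.0, p0=0.5):
    """Return the sample path X and beliefs p on the grid implied by dW."""
    dX = mu * dt + sigma * dW  # discretized increments of the sample path
    # Log-likelihood ratio log(f_H(dX) / f_L(dX)) for normal increments
    dllr = (H - L) / sigma**2 * (dX - 0.5 * (H + L) * dt)
    log_odds = np.log(p0 / (1 - p0)) + np.concatenate(([0.0], np.cumsum(dllr)))
    X = np.concatenate(([0.0], np.cumsum(dX)))  # assumes X_0 = 0
    p = 1.0 / (1.0 + np.exp(-log_odds))         # Bayes' rule in log-odds form
    return X, p

rng = np.random.default_rng(0)  # arbitrary seed
T, n = 10.0, 10_000             # assumed horizon and number of time steps
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)  # one Wiener realization, shared across sigma

paths = {sigma: simulate_beliefs(sigma, dW, dt) for sigma in (1.0, 2.0)}
for sigma, (X, p) in paths.items():
    print(f"sigma={sigma}: X_T={X[-1]:.3f}, p_T={p[-1]:.3f}")
```

Passing the same `dW` array to both calls mirrors the use of a single Wiener realization for each value of `sigma`, so any difference between the two belief paths comes from the noise scale alone.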

## Deriving the belief increments

The increments `\(\der W_t\)` of the Wiener process `\((W_t)_{t\ge0}\)` are iid normally distributed with mean zero and variance `\(\der t\)`:
`$$\der W_t\sim N(0,\der t).$$`

Thus, given `\(\mu\)`, the increments `\(\der X_t\)` of the sample path `\((X_t)_{t\ge0}\)` are iid normal with mean `\(\mu\der t\)` and variance `\(\sigma^2\der t\)`:
`$$\der X_t\mid\mu\sim N(\mu\der t,\sigma^2\der t).$$`

So these increments have conditional PDF
`$$\begin{align} f_\mu(\der X_t) &= \frac{1}{\sigma\sqrt{2\pi\der t}}\exp\left(-\frac{(\der X_t-\mu\der t)^2}{2\sigma^2\der t}\right) \\ &= \frac{1}{\sigma\sqrt{2\pi\der t}}\exp\left(-\frac{(\der X_t)^2}{2\sigma^2\der t}\right)\exp\left(\frac{\mu\der X_t}{\sigma^2}-\frac{\mu^2\der t}{2\sigma^2}\right). \end{align}$$`

But the rules of Itô calculus imply `\((\der X_t)^2=\sigma^2\der t\)` and
`$$\begin{align} \exp\left(\frac{\der X_t\mu}{\sigma^2}-\frac{\mu^2\der t}{2\sigma^2}\right) &= \sum_{k\ge0}\frac{1}{k!}\left(\frac{\mu\der X_t}{\sigma^2}-\frac{\mu^2\der t}{2\sigma^2}\right)^k \\ &= 1+\frac{\mu\der X_t}{\sigma^2} \end{align}$$`

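because these rules treat terms of order smaller than `\(\der t\)`, such as `\(\der X_t\der t\)` and `\((\der t)^2\)`, as equal to zero.
In particular, the `\(k=2\)` term contributes `\(\mu^2\der t/(2\sigma^2)\)`, which cancels the `\(-\mu^2\der t/(2\sigma^2)\)` in the `\(k=1\)` term.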
Thus
`$$f_\mu(\der X_t)=\frac{1}{\sigma^3\sqrt{2\pi\der t}}\exp\left(-\frac{1}{2}\right)\left(\mu\der X_t+\sigma^2\right)$$`

for each `\(\mu\in\{H,L\}\)`.
Applying Bayes’ rule then gives
`$$\begin{align} p_{t+\der t} &= \frac{p_tf_H(\der X_t)}{p_tf_H(\der X_t)+(1-p_t)f_L(\der X_t)} \\ &= \frac{p_t\left(H\der X_t+\sigma^2\right)}{\hat\mu_t\der X_t+\sigma^2}, \end{align}$$`

where `\(\hat\mu_t=\E[\mu\mid (X_s)_{s<t}]\)` is my posterior estimate of `\(\mu\)`.
So the belief process `\((p_t)_{t\ge0}\)` has increments
`$$\begin{align} \der p_t &\equiv p_{t+\der t}-p_t \\ &= \frac{p_t(1-p_t)(H-L)\der X_t}{\hat\mu_t\der X_t+\sigma^2}. \end{align}$$`
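
To spell out the algebra behind the second line, subtract `\(p_t\)` from the Bayes’ rule expression and use the definition `\(\hat\mu_t=p_tH+(1-p_t)L\)`, which implies `\(H-\hat\mu_t=(1-p_t)(H-L)\)`:
`$$\begin{align} p_{t+\der t}-p_t &= \frac{p_t\left(H\der X_t+\sigma^2\right)-p_t\left(\hat\mu_t\der X_t+\sigma^2\right)}{\hat\mu_t\der X_t+\sigma^2} \\ &= \frac{p_t\left(H-\hat\mu_t\right)\der X_t}{\hat\mu_t\der X_t+\sigma^2} \\ &= \frac{p_t(1-p_t)(H-L)\der X_t}{\hat\mu_t\der X_t+\sigma^2}. \end{align}$$`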

Finally, taking a Maclaurin series expansion and applying the rules of Itô calculus gives
`$$\begin{align} \frac{\der X_t}{\hat\mu_t\der X_t+\sigma^2} &= \der X_t\sum_{k\ge0}\frac{(-1)^k\hat\mu_t^k}{(\sigma^2)^{k+1}}(\der X_t)^k \\ &= \der X_t\left(\frac{1}{\sigma^2}-\frac{\hat\mu_t}{\sigma^4}\der X_t\right) \\ &= \frac{1}{\sigma^2}\left(\der X_t-\hat\mu_t\der t\right), \end{align}$$`

from which we obtain the expressions for `\(\der p_t\)` and `\(\der Z_t\)` provided above.
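
One way to sanity-check the derivation numerically is to compare two belief paths built from the same simulated increments: one from applying Bayes' rule with the normal densities step by step, and one from an Euler discretization of the expression for `\(\der p_t\)`. The sketch below does this under assumed parameter values, step size, and seed; it illustrates the idea and is not a proof.

```python
import numpy as np

H, L, sigma, mu, p0 = 1.0, 0.0, 1.0, 1.0, 0.5  # illustrative values
T, n = 2.0, 20_000                             # assumed horizon and grid size
dt = T / n
rng = np.random.default_rng(1)                 # arbitrary seed
dX = mu * dt + sigma * rng.normal(0.0, np.sqrt(dt), size=n)

# (a) Bayes' rule applied to each increment, accumulated in log-odds form
dllr = (H - L) / sigma**2 * (dX - 0.5 * (H + L) * dt)
log_odds = np.log(p0 / (1 - p0)) + np.cumsum(dllr)
p_bayes = 1.0 / (1.0 + np.exp(-log_odds))

# (b) Euler scheme for dp = p(1-p)(H-L)/sigma dZ, with dZ = (dX - mu_hat dt)/sigma
p_sde = np.empty(n)
p = p0
for i in range(n):
    mu_hat = p * H + (1 - p) * L            # posterior estimate of mu
    dZ = (dX[i] - mu_hat * dt) / sigma
    p += p * (1 - p) * (H - L) / sigma * dZ
    p_sde[i] = p

max_gap = np.max(np.abs(p_bayes - p_sde))
print(f"largest gap between the two belief paths: {max_gap:.5f}")
```

The gap should shrink as `dt` falls. In any single step the two updates differ by a term proportional to `(dX)^2 - sigma^2 dt`, which averages out along the path; this is the discrete-time counterpart of the Itô rule used in the derivation.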