Suppose I’m running a race. I have a fixed amount of energy to “spend” on running fast. But the energy cost of running fast varies during the race (e.g., it’s high on hills and low on flats). How should I pace myself to minimize my race time?
This post discusses my optimal pacing problem. I describe it mathematically, derive its solution in simple and general settings, and analyze these solutions’ properties. I assume energy costs are deterministic, whereas my next post allows them to be random.
The optimal pacing problem
My race consists of \(N\)
“laps” \(n\in\{1,2,\ldots,N\}\)
with equal lengths.
I start with \(k_1=k>0\)
units of energy and finish the race with none.
Running lap \(n\)
at speed \(s_n\)
costs \(c_ns_n\)
units of energy, where \(c_n>0\)
varies with \(n\)
.
My goal is to find the speed sequence \((s_n)_{n=1}^N\)
that minimizes my total time^{1}
$$\DeclareMathOperator{\E}{E} \DeclareMathOperator{\Var}{Var} \newcommand{\der}{\mathrm{d}} \newcommand{\parfrac}[2]{\frac{\partial #1}{\partial #2}} T\equiv\sum_{n=1}^N\frac{1}{s_n}$$
subject to the dynamic energy constraint
$$k_{n+1}=k_nc_ns_n,$$
boundary conditions \(k_1=k\)
and \(k_{N+1}=0\)
, and nonnegativity constraint \(s_n\ge0\)
.
Solving the twolap case
We can build intuition by solving the case with \(N=2\)
.
Then the dynamic constraint and boundary conditions imply
$$T=\frac{c_1}{kk_2}+\frac{c_2}{k_2},$$
where \(k_2\)
is the energy I choose to leave for the second lap.
It satisfies the firstorder condition \(\partial T/\partial k_2=0\)
, which we can write as
$$\frac{c_1}{(kk_2)^2}=\frac{c_2}{k_2^2}.$$
The lefthand side is the marginal cost (in units of total time) of using less energy in the first lap.
The righthand side is the marginal benefit of using more energy in the second lap.
The firstorder condition balances this marginal cost and benefit.
It determines how I should smooth my energy consumption across laps.
Rearranging the firstorder condition for \(k_2\)
gives
$$k_2=\frac{\sqrt{c_2}}{\sqrt{c_1}+\sqrt{c_2}}k.$$
So I should spend my energy proportionally to the square roots of the costs I face.
For example, if \(c_1=4c_2\)
then I should spend a third of my energy on the first lap and two thirds on the second.
This leads me to run twice as fast on the second lap and makes my total time equal \(9c_2/k\)
.
In contrast, if I spent energy proportionally to costs then I would spend a fifth on the first lap and four fifths on the second.
I would run at a constant speed and my total time would equal \(10c_2/k\)
.
That strategy would be optimal if the costs were constant at their mean \(5c_2/2\)
.
But they aren’t constant: they vary by a factor of four.
Squareroot scaling takes advantage of this variation.
It makes me run slow when it’s expensive to run fast.
Solving the general case
The results and intuitions from the case with \(N=2\)
generalize to cases with \(N>2\)
.
But those cases require more powerful solution methods.
I explain two: using the Hamiltonian and using the Bellman equation.
The first is faster, but the second is more intuitive and extends naturally to a setting with random costs.
Using the Hamiltonian
The Hamiltonian for my optimal pacing problem is
$$H\equiv\frac{1}{s_n}\lambda_{n+1}c_ns_n,$$
where \(\lambda_{n+1}\)
is a costate that satisfies
$$\lambda_{n+1}\lambda_n=\parfrac{H}{k_n}$$
for each \(n\)
.
But \(\partial H/\partial k_n=0\)
and so \(\lambda_{n+1}=\lambda\)
is constant.
Substituting it into the firstorder condition \(\partial H/\partial s_n=0\)
gives
$$s_n=\frac{1}{\sqrt{\lambda c_n}}.$$
Now the dynamic constraint and boundary conditions imply
$$\sum_{n=1}^Nc_ns_n=k,$$
from which it follows that
$$\sqrt\lambda=\frac{1}{k}\sum_{n=1}^N\sqrt{c_n}$$
and therefore
$$s_n=\frac{k}{\sqrt{c_n}\sum_{m=1}^N\sqrt{c_m}}$$
for each \(n\)
.
Then my total time equals
$$T=\frac{1}{k}\left(\sum_{n=1}^N\sqrt{c_n}\right)^2.$$
For example, letting \(N=2\)
and \(c_1=4c_2\)
yields the optimal time \(T=9c_2/k\)
described above.
Using the Bellman equation
The dynamic constraint implies
$$s_n=\frac{k_nk_{n+1}}{c_n}$$
for each \(n.\)
Consequently, the cost sequence \((c_n)_{n=1}^N\)
and “remaining energy” sequence \((k_{n+1})_{n=0}^N\)
uniquely determine the speed sequence \((s_n)_{n=1}^N\)
.
So if
$$V_n\equiv\sum_{m=n}^N\frac{1}{s_n}$$
denotes the time spent running laps \(n\)
through \(N\)
when I pace myself optimally, then \(V_n\)
must satisfy the Bellman equation
$$V_n=\min_{k_{n+1}}\left\{\frac{c_n}{k_nk_{n+1}}+V_{n+1}\right\}.$$
This equation echoes my objective in the twolap case.
Intuitively, my optimal speeds in the \(N\)
lap case solve a sequence of twolap problems, where the second “lap” is the remainder of my race.
We can solve the Bellman equation using the method of undetermined coefficients.
Suppose \(V_{n+1}=a_{n+1}/k_{n+1}\)
for some \(a_{n+1}\ge0\)
.
Then, under optimal pacing, we have
$$\begin{align} 0 &= \parfrac{}{k_{n+1}}\left(\frac{c_n}{k_nk_{n+1}}+V_{n+1}\right) \\ &= \frac{c_n}{(k_nk_{n+1})^2}\frac{a_{n+1}}{k_{n+1}} \end{align}$$
and therefore
$$k_{n+1}=\frac{\sqrt{a_{n+1}}}{\sqrt{a_{n+1}}+\sqrt{c_n}}k_n.$$
Substituting this recurrence into the Bellman equation gives \(V_n=a_n/k_n\)
, where
$$a_n\equiv\left(\sqrt{a_{n+1}}+\sqrt{c_n}\right)^2$$
and \(a_{N+1}=0\)
.
Solving recursively gives
$$\sqrt{a_n}=\sum_{m=n}^N\sqrt{c_m}$$
for each \(n\)
, from which it follows that
$$k_{n+1}=\frac{\sum_{m=n+1}^N\sqrt{c_m}}{\sum_{m=1}^N\sqrt{c_m}}k$$
and
$$s_n=\frac{k}{\sqrt{c_n}\sum_{m=1}^N\sqrt{c_m}}.$$
So we get the same optimal speed sequence and total time as obtained using the Hamiltonian.
We also see the squareroot scaling from the twolap case generalize to the \(N\)
lap case.
For example, if the costs I face in the first half of the race are four times the costs I face in the second, then I should run half as fast in the first half than I run in the second.
Solution properties
As explained above, each speed term \(s_n\)
scales with the inverse squareroot of the corresponding cost term \(c_n\)
.
This scaling takes advantage of the variation in costs faced during my race.
But scaling all of the cost terms has a linear effect: doubling each \(c_n\)
halves each \(s_n\)
and so doubles my total time \(T\)
.
Likewise, doubling my initial energy \(k\)
doubles each \(s_n\)
and so halves \(T\)
.
These linearities come from the linearity of the dynamic constraint \(k_{n+1}=k_nc_ns_n\)
.
Rearranging the cost sequence \((c_n)_{n=1}^N\)
leads to the same rearrangement of the optimal speed sequence \((s_n)_{n=1}^N\)
.
This is because the sequences satisfy
$$\sqrt{c_n}s_n=\frac{k}{\sum_{m=1}^N\sqrt{c_m}},$$
the righthand side of which doesn’t change if I rearrange the \(c_n\)
.
Nor does my minimized time \(T\)
change.
Intuitively, swapping the laps on which I run slow and fast doesn’t change my average pace.
Whereas variation in costs improves my average pace.
To see how, let
$$\E[c_n]\equiv\frac{1}{N}\sum_{n=1}^Nc_n$$
be the empirical mean cost of energy during my race and let
$$\overline{T}\equiv\frac{\E[c_n]N}{k}$$
be my optimal time when \(c_n=\E[c_n]\)
for each \(n\)
.
Then
$$\begin{align} \overline{T}T &= \frac{N}{k}\left(\E[c_n]\frac{1}{N^2}\left(\sum_{n=1}^N\sqrt{c_n}\right)^2\right) \\ &= \frac{N}{k}\left(\E[\sqrt{c_n}^2]\E[\sqrt{c_n}]^2\right) \end{align}$$
and therefore
$$T=\overline{T}\frac{N}{k}\Var(\sqrt{c_n}),$$
where \(\Var(\sqrt{c_n})=\E[\sqrt{c_n}^2]\E[\sqrt{c_n}]^2\)
is the empirical variance of the \(\sqrt{c_n}\)
.
So applying a meanpreserving spread to the distribution of \(\sqrt{c_n}\)
values lowers my optimal time \(T\)
.
But this is not the same as increasing the variance in \(c_n\)
.
For example, consider the cost sequences \((c_n)_{n=1}^{100}\)
and \((c_n')_{n=1}^{100}\)
defined by
$$c_n=\begin{cases} 145 & \text{if}\ n\le 50 \\ 55 & \text{otherwise} \end{cases}$$
and
$$c_n'=\begin{cases} 200 & \text{if}\ n\le 20 \\ 75 & \text{otherwise}. \end{cases}$$
Then \(\E[c_n]=\E[c_n']=100\)
, while \(\Var(c_n)=2025\)
and \(\Var(c_n')=2500\)
.
So the \(c_n\)
have lower variance than the \(c_n'\)
.
But \(\Var(\sqrt{c_n})\approx5.3\)
is larger than \(\Var(\sqrt{c_n'})\approx4.8\)
, which means my optimal time is smaller under \((c_n)_{n=1}^{100}\)
than under \((c_n')_{n=1}^{100}\)
.
Intuitively, I prefer cost sequences with a mix of highs and lows to sequences with a few sharp highs and lots of mild lows.

Replacing
\(T\)
with\(T/N\)
, and letting\(x=n/N\)
and\(N\to\infty\)
, yields a special (linear) case of the problem discussed in my post on negative splits. ↩︎