Martingales: Optional Stopping & Doob's Inequalities
A martingale formalizes the concept of a "fair game" — a stochastic process whose expected future value, given the present, equals its current value. Martingale theory provides the sharpest tools for controlling random processes: optional stopping determines when you can stop a process without changing its expected value, and Doob's inequalities bound the maximum of the process over time.
Concepts
A fair coin-flip game: you start with $100, win $1 on heads, lose $1 on tails. After any number of flips, your expected future wealth equals your current wealth — the game has no memory advantage, and no clever strategy changes your expectation. This is a martingale. The optional stopping theorem makes this rigorous: no betting rule, however elaborate, can turn a fair game into a profitable one by choosing when to stop.
Filtrations and Adapted Processes
A filtration is an increasing sequence of sigma-algebras $\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \cdots$ representing the information available at time $n$: $\mathcal{F}_m \subseteq \mathcal{F}_n$ for $m \le n$. A process $(X_n)$ is adapted to $(\mathcal{F}_n)$ if $X_n$ is $\mathcal{F}_n$-measurable for each $n$.
The natural filtration of $(X_n)$ is $\mathcal{F}_n = \sigma(X_0, \ldots, X_n)$ — the information generated by the process itself.
Martingales: Definition and Examples
An adapted integrable process $(X_n)$ is a martingale with respect to $(\mathcal{F}_n)$ if:

$$\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] = X_n \quad \text{for all } n \ge 0.$$
The filtration in this definition is not a technicality — it is the precise specification of what "knowing the present" means. The same process can be a martingale with respect to one filtration and not another: $B_t$ is a martingale with respect to the natural filtration of Brownian motion, but enlarging the filtration to include the future destroys this property. Stating which filtration makes a process fair is not optional; it is the entire content of the definition.
A supermartingale satisfies $\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] \le X_n$ (expected to decrease); a submartingale satisfies $\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] \ge X_n$ (expected to increase).
Canonical examples:
| Process | Type |
|---|---|
| Random walk $S_n = \sum_{i=1}^n \xi_i$ with i.i.d. $\xi_i$ and $\mathbb{E}[\xi_i] = 0$ | Martingale |
| Brownian motion $B_t$ | Martingale |
| $B_t^2 - t$ | Martingale |
| $\exp(\theta B_t - \theta^2 t/2)$ (exponential martingale) | Martingale |
| $\lvert M_n \rvert$ for a martingale $M_n$ | Submartingale |
| $\varphi(M_n)$ for convex $\varphi$ (Jensen) | Submartingale |
Martingale difference sequences: $X_n = \sum_{i=1}^n d_i$ where $d_i$ satisfies $\mathbb{E}[d_i \mid \mathcal{F}_{i-1}] = 0$. The variance of $X_n$ satisfies $\mathrm{Var}(X_n) = \sum_{i=1}^n \mathbb{E}[d_i^2]$ (orthogonality of increments).
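The orthogonality of increments is easy to check numerically. A minimal sketch (all sizes, stakes, and the seed are illustrative assumptions): the stake $s_i$ is predictable, so $d_i = s_i \varepsilon_i$ is a martingale difference even though the increments are highly dependent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Martingale differences d_i = s_i * eps_i: the stake s_i is predictable
# (depends only on the past), eps_i is a fresh fair +/-1 coin, so
# E[d_i | F_{i-1}] = s_i * E[eps_i] = 0.
n_paths, n_steps = 100_000, 50
X = np.zeros(n_paths)
sum_Edi2 = 0.0
for i in range(n_steps):
    s = np.where(X > 0, 0.5, 1.5)        # predictable stake
    d = s * rng.choice([-1.0, 1.0], size=n_paths)
    sum_Edi2 += np.mean(d**2)            # estimate of E[d_i^2]
    X += d

# Orthogonality of increments: Var(X_n) = sum_i E[d_i^2]
print(np.var(X), sum_Edi2)
```

The two printed numbers agree up to Monte Carlo error, even though the $d_i$ are not independent.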
Optional Stopping Theorem
A stopping time is a random variable $\tau$ with $\{\tau \le n\} \in \mathcal{F}_n$ for all $n$ — the decision to stop can only depend on current and past information, not the future.
Optional Stopping Theorem (OST): let $(X_n)$ be a martingale and $\tau$ a stopping time. If any of the following conditions hold:
- $\tau$ is bounded: $\tau \le N$ a.s. for some fixed $N$
- the stopped process $(X_{n \wedge \tau})$ is uniformly integrable
- $\mathbb{E}[\tau] < \infty$ and the increments are bounded: $|X_{n+1} - X_n| \le C$ a.s.

then $\mathbb{E}[X_\tau] = \mathbb{E}[X_0]$.
Consequence: in any fair game, no stopping strategy can create an expected profit. The Gambler's Ruin is a direct application: a gambler starting at $k$ betting on a fair coin, stopping at $0$ or $N$, has $\mathbb{E}[X_\tau] = k$, so the ruin probability from $k$ is $1 - k/N$ (solving $0 \cdot \mathbb{P}(\text{ruin}) + N \cdot (1 - \mathbb{P}(\text{ruin})) = k$).
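The fair-coin ruin probability $1 - k/N$ can be sanity-checked by simulation — a minimal Monte Carlo sketch (the function name, path count, and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def ruin_probability(k, N, n_paths=50_000):
    """Monte Carlo ruin probability: fair +/-1 walk started at k,
    absorbed at 0 (ruin) or N (success)."""
    x = np.full(n_paths, k, dtype=np.int64)
    active = np.ones(n_paths, dtype=bool)
    while active.any():
        x[active] += rng.choice([-1, 1], size=active.sum())
        active &= (x > 0) & (x < N)
    return np.mean(x == 0)

# OST prediction: E[X_tau] = k, hence P(ruin) = 1 - k/N
k, N = 3, 10
p_hat = ruin_probability(k, N)
print(p_hat, 1 - k / N)   # both ≈ 0.7
```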
Doob's Inequalities
Doob's maximal inequality: for a non-negative submartingale $(X_n)$:

$$\mathbb{P}\Big(\max_{k \le n} X_k \ge \lambda\Big) \le \frac{\mathbb{E}[X_n]}{\lambda}.$$

For a martingale $(M_n)$: $\mathbb{P}(\max_{k \le n} |M_k| \ge \lambda) \le \mathbb{E}[|M_n|]/\lambda$ (using Jensen, $|M_n|$ is a submartingale).
Doob's $L^p$ inequality: for $p > 1$:

$$\mathbb{E}\Big[\Big(\max_{k \le n} |M_k|\Big)^p\Big] \le \Big(\frac{p}{p-1}\Big)^p \mathbb{E}\big[|M_n|^p\big].$$

The constant $(p/(p-1))^p$ is sharp (Doob, 1953). For $p = 2$: $\mathbb{E}[\max_{k \le n} M_k^2] \le 4\,\mathbb{E}[M_n^2]$.
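The $p = 2$ case is easy to probe empirically on a simple random walk — a hedged sketch (path counts and horizon are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simple random walk martingale: M_k = sum of k fair +/-1 steps.
n_paths, n = 100_000, 100
M = np.cumsum(rng.choice([-1.0, 1.0], size=(n_paths, n)), axis=1)

lhs = np.mean(np.max(np.abs(M), axis=1) ** 2)   # E[(max_k |M_k|)^2]
rhs = 4 * np.mean(M[:, -1] ** 2)                # 4 E[M_n^2], here ≈ 4n = 400

print(lhs, rhs)   # Doob L^2 inequality: lhs <= rhs
```

The gap between the two sides shows the inequality is far from tight for this walk; sharpness requires a carefully chosen extremal martingale.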
Martingale Convergence Theorems
Upcrossings: an upcrossing of $[a, b]$ by $(X_n)$ is a passage from below $a$ to above $b$. Let $U_n[a, b]$ be the number of upcrossings by time $n$.
Doob's upcrossing inequality: for a supermartingale $(X_n)$,

$$\mathbb{E}\big[U_n[a, b]\big] \le \frac{\mathbb{E}[(X_n - a)^-]}{b - a}.$$
Martingale convergence theorem: if $(X_n)$ is a supermartingale with $\sup_n \mathbb{E}[|X_n|] < \infty$, then $X_\infty = \lim_{n \to \infty} X_n$ exists a.s. The limit is finite a.s.
$L^2$ convergence: if $\sup_n \mathbb{E}[X_n^2] < \infty$, then $X_n \to X_\infty$ both a.s. and in $L^2$.
Azuma-Hoeffding inequality: let $(X_n)$ be a martingale with bounded differences $|X_i - X_{i-1}| \le c_i$ a.s. Then:

$$\mathbb{P}\big(|X_n - X_0| \ge t\big) \le 2 \exp\Big(-\frac{t^2}{2 \sum_{i=1}^n c_i^2}\Big).$$
This is the fundamental concentration inequality for dependent variables and is the basis for proving generalization bounds in learning theory.
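A numerical illustration of the bound on a ±1 random walk, where every $c_i = 1$ (the threshold $t$ and sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

# Martingale with differences bounded by c_i = 1: fair +/-1 random walk.
n_paths, n = 200_000, 100
X_n = rng.choice([-1.0, 1.0], size=(n_paths, n)).sum(axis=1)

t = 25.0
empirical = np.mean(np.abs(X_n) >= t)
azuma = 2 * np.exp(-t**2 / (2 * n))   # 2 exp(-t^2 / (2 sum_i c_i^2))

print(empirical, azuma)   # the empirical tail sits below the bound
```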
Worked Examples
Example 1: Azuma-Hoeffding for Learned Models
Let $f(Z_1, \ldots, Z_n)$ be the empirical risk of a learned classifier. Define the Doob martingale $X_i = \mathbb{E}[f \mid Z_1, \ldots, Z_i]$. Then $(X_i)$ is a martingale (by the tower property), and if replacing one training point changes $f$ by at most $c_i$ (bounded sensitivity), then $|X_i - X_{i-1}| \le c_i$.
Azuma-Hoeffding gives:

$$\mathbb{P}\big(|f - \mathbb{E}[f]| \ge t\big) \le 2 \exp\Big(-\frac{t^2}{2 \sum_{i=1}^n c_i^2}\Big).$$

This is, up to the constant in the exponent, the McDiarmid inequality for functions of independent variables (McDiarmid's sharper form has $2\exp(-2t^2/\sum_i c_i^2)$). It implies that the empirical risk concentrates around its mean at rate $O(1/\sqrt{n})$, matching VC theory but with sharper constants.
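To see this concentration concretely, here is a hedged sketch for the simplest bounded-difference function — the empirical mean of $n$ variables in $[0, 1]$, so each $c_i = 1/n$ (all sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# f = empirical mean of n variables in [0, 1]; changing one point moves f
# by at most c_i = 1/n, so McDiarmid gives P(|f - E f| >= t) <= 2 exp(-2 n t^2).
n, n_trials = 200, 100_000
Z = rng.uniform(0.0, 1.0, size=(n_trials, n))
f = Z.mean(axis=1)                      # E[f] = 0.5

t = 0.05
empirical = np.mean(np.abs(f - 0.5) >= t)
mcdiarmid = 2 * np.exp(-2 * n * t**2)

print(empirical, mcdiarmid)   # empirical tail is below the bound
```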
Example 2: Gambler's Ruin with Biased Coin
A gambler bets repeatedly, with win probability $p \ne 1/2$; write $q = 1 - p$. Define $M_n = (q/p)^{X_n}$ where $X_n$ is the current fortune. Then $(M_n)$ is a martingale:

$$\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] = p\,(q/p)^{X_n + 1} + q\,(q/p)^{X_n - 1} = (q/p)^{X_n}(q + p) = M_n.$$

Apply OST with $\tau = \inf\{n : X_n \in \{0, N\}\}$: $\mathbb{E}[M_\tau] = M_0 = (q/p)^k$. Since $M_\tau = 1$ at ruin or $M_\tau = (q/p)^N$ at success:

$$\mathbb{P}(\text{ruin}) \cdot 1 + \big(1 - \mathbb{P}(\text{ruin})\big)(q/p)^N = (q/p)^k.$$

Solving: $\mathbb{P}(\text{ruin}) = \dfrac{(q/p)^k - (q/p)^N}{1 - (q/p)^N}$. For $p > 1/2$ (favorable game), $q/p < 1$, so as $N \to \infty$, $\mathbb{P}(\text{ruin}) \to (q/p)^k > 0$: a favorable game still has positive ruin probability from finite capital.
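The closed form checks out by simulation — a sketch with illustrative parameters ($k$, $N$, $p$, path count, and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)

def biased_ruin(k, N, p, n_paths=50_000):
    """Monte Carlo ruin probability for a walk with win probability p."""
    x = np.full(n_paths, k, dtype=np.int64)
    active = np.ones(n_paths, dtype=bool)
    while active.any():
        wins = rng.random(active.sum()) < p
        x[active] += np.where(wins, 1, -1)
        active &= (x > 0) & (x < N)
    return np.mean(x == 0)

k, N, p = 3, 10, 0.55
r = (1 - p) / p                          # r = q/p
formula = (r**k - r**N) / (1 - r**N)     # from E[(q/p)^{X_tau}] = (q/p)^k
p_hat = biased_ruin(k, N, p)
print(p_hat, formula)
```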
Example 3: Martingales in SGD Analysis
Define $F(w_t)$, the loss at step $t$ of SGD. The stochastic gradient $g_t$ has $\mathbb{E}[g_t \mid \mathcal{F}_{t-1}] = \nabla F(w_t)$, and the noise $\xi_t = g_t - \nabla F(w_t)$ satisfies $\mathbb{E}[\xi_t \mid \mathcal{F}_{t-1}] = 0$.
The accumulated noise term $M_T = \sum_{t=1}^T \eta_t \xi_t$ is a martingale. Azuma-Hoeffding controls its fluctuations when the gradient noise is bounded. The convergence rate of SGD — the $O(1/\sqrt{T})$ rate for convex problems — follows from balancing the deterministic progress term against the martingale noise term.
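A minimal sketch of this decomposition on a toy quadratic (the objective, step size, noise scale, and the scalar functional $\sum_t \eta \langle \xi_t, w_t \rangle$ of the noise are all assumptions for illustration): the accumulated noise term has mean zero while the loss still decreases.

```python
import numpy as np

rng = np.random.default_rng(6)

# SGD on F(w) = 0.5 ||w||^2, so grad F(w) = w.  Stochastic gradient
# g_t = w_t + xi_t with E[xi_t | past] = 0.  Since w_t is determined by
# the past, M_T = sum_t eta * <xi_t, w_t> is a martingale.
n_paths, T, eta, d = 20_000, 200, 0.05, 5
w = np.ones((n_paths, d))
M = np.zeros(n_paths)
for t in range(T):
    xi = rng.normal(0.0, 0.5, size=(n_paths, d))
    M += eta * np.sum(xi * w, axis=1)   # martingale increment
    w -= eta * (w + xi)                 # SGD step with noisy gradient

mean_M = np.mean(M)                     # ≈ 0: noise term has zero mean
final_loss = np.mean(0.5 * np.sum(w**2, axis=1))
print(mean_M, final_loss)               # loss far below the initial 2.5
```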
Connections
Where Your Intuition Breaks
Optional stopping says you cannot profit in a fair game by choosing a clever stopping time — but the theorem requires the stopping time to be bounded, or the stopped process to be uniformly integrable. The gambler's doubling strategy (double the bet after each loss, stop after the first win) appears to always profit: the stopping time is almost surely finite. The flaw is not in the math but in the conditions: the maximum loss before stopping is unbounded, the process is not uniformly integrable, and the OST does not apply. More subtly, simple random walk on $\mathbb{Z}$ with one absorbing barrier has an almost-surely-finite stopping time with infinite expectation — technically satisfying the letter of some OST conditions while violating the spirit. The conditions of optional stopping are not formalities; they are exactly the hypotheses that prevent infinite-horizon exploitation.
The optional stopping theorem formulates "no free lunch" precisely. Any stopping rule — doubling bets, stopping after a fixed number of wins, timing the market — cannot change the expected value of a fair game. This is not just an informal principle but a theorem with precise conditions. The conditions matter: if you allow unbounded stopping times and the process can grow, OST may fail (the St. Petersburg paradox involves an unbounded martingale). The theorem works exactly when you cannot "gain information from the future."
Martingale differences are the right generalization of independence. Many concentration inequalities (Chernoff, Hoeffding) assume independence. Azuma-Hoeffding replaces independence with the martingale difference structure , which allows arbitrarily complex dependence between and past variables. This is why martingale methods appear throughout machine learning theory: training examples are not always independent (active learning, online learning, Markov chain-sampled minibatches), but the error increment can still form a martingale difference sequence.
Doob's $L^p$ inequality has a sharp constant $(p/(p-1))^p$ that blows up at $p = 1$. The constant $p/(p-1) \to \infty$ as $p \to 1$, so no $L^1$ analog exists. There is no uniform bound on $\mathbb{E}[\max_{k \le n} |M_k|]$ in terms of $\mathbb{E}[|M_n|]$ — the maximum can be much larger than the terminal value. This failure at $p = 1$ is not a gap in the theorem; it reflects a genuine phenomenon: there exist martingales with $\sup_n \mathbb{E}[|M_n|] < \infty$ but $\mathbb{E}[\sup_n |M_n|] = \infty$. Processes with heavy tails require different tools (e.g., Burkholder-Davis-Gundy inequalities using the quadratic variation).