Itô calculus extends differentiation and integration to stochastic processes driven by Brownian motion. The key departure from classical calculus is that Brownian paths have nonzero quadratic variation, $(dB_t)^2 = dt$, which generates an extra second-derivative term in the chain rule and makes stochastic differential equations a distinct object from their deterministic counterparts.
Concepts
Classical calculus was built for smooth curves — functions you can zoom in on until they look like straight lines. Brownian motion breaks this: zoom in on any Brownian path at any scale and it looks just as jagged. The total length of a Brownian path over any interval is infinite, so ordinary Riemann integration fails entirely. Itô calculus rebuilds integration from scratch for these paths, and the price of doing so is a correction term in the chain rule that has no classical analogue — not a small perturbation, but a term of the same order as the drift.
The Itô Integral
The classical Riemann-Stieltjes integral $\int_0^T f(t)\,dB_t$ fails because Brownian motion has infinite total variation. The Itô integral is instead defined as an $L^2$ limit of left-endpoint Riemann sums over adapted integrands:
$$\int_0^T H_t\,dB_t = \lim_{\|\Pi\|\to 0} \sum_k H_{t_k}\,\big(B_{t_{k+1}} - B_{t_k}\big),$$
where $H$ is adapted to $(\mathcal{F}_t)$ and $\mathbb{E}\big[\int_0^T H_t^2\,dt\big] < \infty$. The left endpoint (rather than the midpoint or right endpoint) is essential: it gives the integral its martingale property.
Evaluating at the left endpoint is the choice that makes $\int_0^t H_s\,dB_s$ a martingale: the integrand is fixed before the increment it multiplies is revealed, which is the mathematical expression of "no foresight." Evaluating at the midpoint (Stratonovich) yields a different integral that obeys the classical chain rule but loses the martingale property. The two are not interchangeable notations for the same object; they define different integrals. The left endpoint preserves the probabilistic structure needed to analyze stochastic systems, while the Stratonovich integral is preferred in physics, where the classical chain rule must hold for smooth approximations of physical noise.
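Itô isometry: $\mathbb{E}\Big[\big(\int_0^T H_t\,dB_t\big)^2\Big] = \mathbb{E}\Big[\int_0^T H_t^2\,dt\Big]$.
The Itô integral $M_t = \int_0^t H_s\,dB_s$ is a martingale (not just a local martingale) when $\mathbb{E}\big[\int_0^T H_s^2\,ds\big] < \infty$.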
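The zero-mean (martingale) property and the isometry can be checked numerically. The sketch below is only an illustration under assumed choices that are not in the text: the integrand is $H_t = B_t$, $T = 1$, and the function name `ito_integral_of_brownian` is hypothetical. For this integrand the isometry predicts $\mathbb{E}[I_T^2] = \int_0^T t\,dt = T^2/2$.

```python
import numpy as np

def ito_integral_of_brownian(n_paths=20_000, n_steps=200, T=1.0, seed=0):
    """Left-endpoint sums approximating I_T = int_0^T B_t dB_t, one value per path."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    B = np.cumsum(dB, axis=1)
    # Left-endpoint values B_{t_k}: shift the path right by one step, with B_0 = 0.
    B_left = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])
    return np.sum(B_left * dB, axis=1)

I = ito_integral_of_brownian()
print("sample mean of I_T   :", I.mean())           # martingale property predicts 0
print("sample mean of I_T^2 :", np.mean(I**2))      # isometry predicts T^2 / 2
print("isometry prediction  :", 0.5)
```

Both sample averages carry Monte Carlo error, and the second also carries a small discretization bias of order $1/\text{n\_steps}$, so agreement is approximate rather than exact.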
Contrast with the Stratonovich integral: $\int_0^T H_t \circ dB_t = \lim \sum_k H_{(t_k + t_{k+1})/2}\,\big(B_{t_{k+1}} - B_{t_k}\big)$ uses midpoints. Stratonovich satisfies the classical chain rule but loses the martingale property. Itô is used in finance (self-financing portfolios); Stratonovich appears in physics (Wong-Zakai theorem for physical noise).
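The difference between the two evaluation points shows up on a single simulated path. The following sketch assumes the integrand $H_t = B_t$ and $T = 1$ (my choices for illustration; the function name is hypothetical). For this integrand the left-endpoint sum should approach $(B_T^2 - T)/2$ and the midpoint sum should approach $B_T^2/2$, so the two differ by about $T/2$.

```python
import numpy as np

def left_and_midpoint_sums(n=100_000, T=1.0, seed=0):
    """Return (Ito sum, Stratonovich sum, B_T) for the integrand H = B on one path."""
    rng = np.random.default_rng(seed)
    # Simulate on a grid twice as fine as the partition, so that B is known at
    # every partition point t_k (even indices) and every midpoint (odd indices).
    dB_fine = rng.normal(0.0, np.sqrt(T / (2 * n)), size=2 * n)
    B = np.concatenate([[0.0], np.cumsum(dB_fine)])
    B_left, B_mid, B_right = B[0:-1:2], B[1::2], B[2::2]
    increments = B_right - B_left
    ito = np.sum(B_left * increments)          # left-endpoint evaluation
    stratonovich = np.sum(B_mid * increments)  # midpoint evaluation
    return ito, stratonovich, B[-1]

ito, strat, B_T = left_and_midpoint_sums()
print("Ito sum          :", ito, " vs (B_T^2 - T)/2 =", 0.5 * (B_T**2 - 1.0))
print("Stratonovich sum :", strat, " vs  B_T^2 / 2    =", 0.5 * B_T**2)
```

The Stratonovich value matches what the classical rule $\int B\,dB = B^2/2$ would give; the Itô value is shifted by the quadratic variation term.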
Itô's Lemma (Stochastic Chain Rule)
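Let $X_t$ be an Itô process: $dX_t = \mu_t\,dt + \sigma_t\,dB_t$. For $f \in C^2$:
$$df(X_t) = f'(X_t)\,dX_t + \tfrac{1}{2} f''(X_t)\,\sigma_t^2\,dt = \big(f'(X_t)\,\mu_t + \tfrac{1}{2} f''(X_t)\,\sigma_t^2\big)\,dt + f'(X_t)\,\sigma_t\,dB_t.$$
The Itô correction term $\tfrac{1}{2} f''(X_t)\,\sigma_t^2\,dt$ arises from the quadratic variation $d[X,X]_t = \sigma_t^2\,dt$. In the formal Itô multiplication table: $(dt)^2 = 0$, $dt \cdot dB_t = 0$, $(dB_t)^2 = dt$.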
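The multiplication table can be read as a statement about sums over a fine partition. A minimal sketch, assuming $T = 1$ and a uniform grid (the function name and dictionary keys are hypothetical), sums each product over one simulated path:

```python
import numpy as np

def multiplication_table_sums(n=100_000, T=1.0, seed=0):
    """Sum (dB)^2, dt*dB and (dt)^2 over a uniform partition of [0, T]."""
    rng = np.random.default_rng(seed)
    dt = T / n
    dB = rng.normal(0.0, np.sqrt(dt), size=n)
    return {
        "sum (dB)^2": np.sum(dB**2),   # approaches T as n grows
        "sum dt*dB": np.sum(dt * dB),  # equals dt * B_T, approaches 0
        "sum (dt)^2": n * dt**2,       # equals T * dt, approaches 0
    }

for name, value in multiplication_table_sums().items():
    print(f"{name:12s} {value:.6f}")
```

Only the first sum survives in the limit, which is why $(dB_t)^2$ is kept as $dt$ while the other two products are dropped.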
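Multidimensional Itô lemma: for $X_t \in \mathbb{R}^d$ with $dX_t = \mu_t\,dt + \sigma_t\,dB_t$ ($B_t \in \mathbb{R}^m$, $\sigma_t \in \mathbb{R}^{d \times m}$):
$$df(t, X_t) = \frac{\partial f}{\partial t}\,dt + \nabla f \cdot dX_t + \frac{1}{2}\,\mathrm{tr}\big(\sigma_t \sigma_t^{\mathsf T}\,\nabla^2 f\big)\,dt.$$
The Hessian term $\tfrac{1}{2}\,\mathrm{tr}\big(\Sigma_t \nabla^2 f\big)$ involves the instantaneous covariance matrix $\Sigma_t = \sigma_t \sigma_t^{\mathsf T}$.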
Stochastic Differential Equations
A stochastic differential equation (SDE) in Itô form is:
$$dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dB_t, \qquad X_0 = x_0.$$
The drift $\mu$ and diffusion coefficient $\sigma$ can depend on the current state and time. The SDE is shorthand for the integral equation $X_t = x_0 + \int_0^t \mu(X_s, s)\,ds + \int_0^t \sigma(X_s, s)\,dB_s$.
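Replacing the two integrals in the integral equation by their left-endpoint approximations over a small step gives the Euler-Maruyama scheme. The sketch below is one minimal way to write it for a scalar SDE; the function name `euler_maruyama` and its signature are hypothetical, and the GBM coefficients in the usage lines are arbitrary illustrative values.

```python
import numpy as np

def euler_maruyama(mu, sigma, x0, T, n_steps, n_paths=1, seed=0):
    """Simulate dX = mu(X, t) dt + sigma(X, t) dB in the Ito sense.

    mu and sigma are callables taking (array of states, time).
    Returns an array of shape (n_paths, n_steps + 1).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    X = np.empty((n_paths, n_steps + 1))
    X[:, 0] = x0
    for k in range(n_steps):
        t = k * dt
        dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        # Coefficients are evaluated at the left endpoint of each step.
        X[:, k + 1] = X[:, k] + mu(X[:, k], t) * dt + sigma(X[:, k], t) * dB
    return X

# Usage: geometric Brownian motion with assumed parameters mu = 0.1, sigma = 0.2.
paths = euler_maruyama(lambda x, t: 0.1 * x, lambda x, t: 0.2 * x,
                       x0=1.0, T=1.0, n_steps=200, n_paths=20_000)
print("sample mean of X_T:", paths[:, -1].mean(), " vs exp(0.1) =", np.exp(0.1))
```

Evaluating the coefficients at the start of each step mirrors the left-endpoint definition of the Itô integral, which is why this scheme targets the Itô interpretation.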
Existence and uniqueness (strong solutions): if μ and σ satisfy a Lipschitz condition in x and a linear growth bound, then a unique strong solution exists.
Fokker-Planck equation: the probability density p(x,t) of the solution satisfies:
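$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}\big[\mu(x,t)\,p\big] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\big[\sigma(x,t)^2\,p\big].$$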
This is the forward equation — it evolves the density forward in time given the SDE coefficients.
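As a sanity check of the forward equation, consider the simplest case, chosen here only for illustration: $\mu = 0$, $\sigma = 1$, $X_0 = 0$, so that $X_t = B_t$ and the density is the Gaussian heat kernel. Differentiating directly shows that it satisfies $\partial_t p = \tfrac{1}{2}\partial_{xx} p$:

```latex
% Assumed special case: \mu = 0, \sigma = 1, X_0 = 0, so X_t = B_t \sim N(0, t).
\begin{align*}
p(x,t) &= \frac{1}{\sqrt{2\pi t}}\, e^{-x^2/(2t)}, \\
\partial_t p &= \Big(-\frac{1}{2t} + \frac{x^2}{2t^2}\Big)\, p, \\
\partial_x p &= -\frac{x}{t}\, p, \qquad
\partial_{xx} p = \Big(-\frac{1}{t} + \frac{x^2}{t^2}\Big)\, p, \\
\Rightarrow\quad \partial_t p &= \tfrac{1}{2}\,\partial_{xx} p .
\end{align*}
```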
Important SDEs:
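| SDE | Name | Application |
|---|---|---|
| $dX_t = \mu X_t\,dt + \sigma X_t\,dB_t$ | Geometric Brownian Motion | Stock prices (Black-Scholes) |
| $dX_t = \theta(\mu - X_t)\,dt + \sigma\,dB_t$ | Ornstein-Uhlenbeck | Mean-reverting process |
| $dX_t = -\nabla U(X_t)\,dt + \sqrt{2T}\,dB_t$ | Langevin SDE | MCMC sampling |
| $dX_t = f(X_t)\,dt + \sigma\,dB_t$ | General Itô diffusion | Neural SDEs, physics simulation |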
The Feynman-Kac Formula
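For the SDE $dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dB_t$, a terminal payoff $g$, and a discount rate $r$, the function
$$u(x,t) = \mathbb{E}\Big[\, g(X_T)\, e^{-\int_t^T r(X_s)\,ds} \,\Big|\, X_t = x \Big]$$
satisfies the backward Kolmogorov PDE with discounting (Feynman-Kac):
$$\frac{\partial u}{\partial t} + \mu(x)\frac{\partial u}{\partial x} + \frac{1}{2}\sigma(x)^2 \frac{\partial^2 u}{\partial x^2} - r(x)\,u = 0, \qquad u(x,T) = g(x).$$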
This converts the PDE into a stochastic expectation, which is the basis of Monte Carlo PDE solvers and of the Black-Scholes option pricing formula (GBM with $r$ the risk-free rate and $g(S_T) = \max(S_T - K, 0)$).
Worked Example
Example 1: Geometric Brownian Motion and Log-Normal Prices
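Apply Itô's lemma to $f(X_t) = \log X_t$ where $dX_t = \mu X_t\,dt + \sigma X_t\,dB_t$. With $f'(x) = 1/x$ and $f''(x) = -1/x^2$:
$$d(\log X_t) = \frac{1}{X_t}\,dX_t - \frac{1}{2}\frac{1}{X_t^2}\,\sigma^2 X_t^2\,dt = \big(\mu - \tfrac{1}{2}\sigma^2\big)\,dt + \sigma\,dB_t.$$
Integrating: $\log X_t = \log X_0 + (\mu - \sigma^2/2)\,t + \sigma B_t$, so $X_t = X_0 \exp\big((\mu - \sigma^2/2)\,t + \sigma B_t\big)$.
$X_t$ is log-normal: $\mathbb{E}[X_t] = X_0 e^{\mu t}$, $\mathrm{Var}[X_t] = X_0^2 e^{2\mu t}\big(e^{\sigma^2 t} - 1\big)$.
The Itô correction $-\sigma^2/2$ in the drift of $\log X_t$ explains why $\mathbb{E}[\log X_t] = \log X_0 + (\mu - \sigma^2/2)\,t < \log \mathbb{E}[X_t] = \log X_0 + \mu t$, in agreement with Jensen's inequality for the concave log function. A naive "apply log to the SDE" without Itô's lemma would miss this term.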
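The closed-form solution makes these moments easy to check by sampling $B_t \sim \mathcal{N}(0, t)$ directly. The sketch below uses assumed parameter values ($X_0 = 1$, $\mu = 0.1$, $\sigma = 0.3$, $t = 1$) and a hypothetical helper name; it is an illustration, not part of the derivation.

```python
import numpy as np

def gbm_terminal(x0, mu, sigma, t, n_paths=200_000, seed=0):
    """Sample X_t = x0 * exp((mu - sigma^2/2) t + sigma B_t) with B_t ~ N(0, t)."""
    rng = np.random.default_rng(seed)
    B_t = rng.normal(0.0, np.sqrt(t), size=n_paths)
    return x0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * B_t)

x0, mu, sigma, t = 1.0, 0.1, 0.3, 1.0
X = gbm_terminal(x0, mu, sigma, t)
print("E[X_t]     sample:", X.mean(), " formula:", x0 * np.exp(mu * t))
print("Var[X_t]   sample:", X.var(),
      " formula:", x0**2 * np.exp(2 * mu * t) * (np.exp(sigma**2 * t) - 1))
print("E[log X_t] sample:", np.log(X).mean(),
      " formula:", np.log(x0) + (mu - 0.5 * sigma**2) * t)
```

The gap between $\log$ of the first line and the third line is the $\sigma^2 t / 2$ Jensen gap discussed above.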
Example 2: Ornstein-Uhlenbeck Process
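$dX_t = \theta(\mu - X_t)\,dt + \sigma\,dB_t$, $X_0 = x_0$. This SDE has a linear drift and additive noise. The exact solution:
$$X_t = \mu + (x_0 - \mu)\,e^{-\theta t} + \sigma \int_0^t e^{-\theta(t-s)}\,dB_s.$$
Stationary distribution: $X_\infty \sim \mathcal{N}\big(\mu, \sigma^2/(2\theta)\big)$ (mean-reverting Gaussian). Stationary autocovariance: $\mathrm{Cov}(X_t, X_{t+h}) = \frac{\sigma^2}{2\theta}\,e^{-\theta h}$ for $h \ge 0$ (exponential decay). The OU process is the continuous-time analog of an AR(1) model.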
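The AR(1) analogy can be made concrete. Applying the exact solution over one step of length $\Delta$ gives $X_{t+\Delta} = \mu + (X_t - \mu)e^{-\theta\Delta} + \varepsilon$ with $\varepsilon \sim \mathcal{N}\big(0, \sigma^2(1 - e^{-2\theta\Delta})/(2\theta)\big)$, which follows from the Itô isometry. The sketch below simulates this recursion with assumed parameter values and a hypothetical function name, then compares against the stationary variance and autocovariance.

```python
import numpy as np

def simulate_ou(theta, mu, sigma, x0, dt, n_steps, n_paths, seed=0):
    """Exact-in-distribution OU simulation via its AR(1) transition."""
    rng = np.random.default_rng(seed)
    decay = np.exp(-theta * dt)                        # AR(1) coefficient
    noise_std = sigma * np.sqrt((1.0 - decay**2) / (2.0 * theta))
    X = np.empty((n_paths, n_steps + 1))
    X[:, 0] = x0
    for k in range(n_steps):
        X[:, k + 1] = mu + (X[:, k] - mu) * decay + noise_std * rng.normal(size=n_paths)
    return X

theta, mu, sigma = 2.0, 1.0, 0.5                       # assumed illustrative values
ou = simulate_ou(theta, mu, sigma, x0=3.0, dt=0.05, n_steps=200, n_paths=20_000)
lag_steps = 10                                         # h = 10 * 0.05 = 0.5
cov_h = np.cov(ou[:, -1 - lag_steps], ou[:, -1])[0, 1]
print("mean at T=10 :", ou[:, -1].mean(), " vs mu =", mu)
print("variance     :", ou[:, -1].var(), " vs sigma^2/(2 theta) =", sigma**2 / (2 * theta))
print("cov at h=0.5 :", cov_h, " vs", sigma**2 / (2 * theta) * np.exp(-theta * 0.5))
```

Because the transition is exact, the step size affects only the time resolution of the path, not its distribution at the grid points.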
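ML connection: a common continuous-time model of stochastic gradient descent with weight decay is $d\theta_t = -\lambda\theta_t\,dt - \eta\nabla L(\theta_t)\,dt + \sqrt{2\eta T}\,dB_t$, which is an OU process when $L$ is quadratic, for example near a minimum. Here $\theta_t$ denotes the model parameters, not the mean-reversion rate used above. Under mild conditions and for small $\lambda$, the stationary distribution approximates the Gibbs distribution $\propto e^{-L(\theta)/T}$.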
Example 3: Black-Scholes via Feynman-Kac
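European call option: payoff $g(S_T) = (S_T - K)^+$ at expiry $T$, stock $dS_t = r S_t\,dt + \sigma S_t\,dB_t$ under the risk-neutral measure.
By Feynman-Kac, the option price $C(S,t) = e^{-r(T-t)}\,\mathbb{E}\big[(S_T - K)^+ \mid S_t = S\big]$ satisfies:
$$\frac{\partial C}{\partial t} + r S \frac{\partial C}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} - r C = 0.$$
The closed-form solution: $C = S\,\Phi(d_1) - K e^{-r(T-t)}\,\Phi(d_2)$ where $d_{1,2} = \dfrac{\log(S/K) + (r \pm \sigma^2/2)(T-t)}{\sigma\sqrt{T-t}}$.
The $\pm\sigma^2/2$ in $d_1$ vs $d_2$ is the Itô correction term appearing again. Without Itô's lemma, Black-Scholes cannot be derived correctly.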
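The two sides of Feynman-Kac can be compared numerically: the closed-form price on one side, a discounted Monte Carlo average of the payoff under risk-neutral GBM on the other. The sketch below uses assumed inputs ($S = K = 100$, $r = 0.05$, $\sigma = 0.2$, one year to expiry) and hypothetical function names; `tau` stands for $T - t$.

```python
import math
import numpy as np

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes call price; tau is time to expiry T - t."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)  # same as using (r - sigma^2/2) in the numerator
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def mc_call(S, K, r, sigma, tau, n_paths=400_000, seed=0):
    """Discounted expected payoff, sampling S_T from the exact GBM solution."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal(n_paths)
    S_T = S * np.exp((r - 0.5 * sigma**2) * tau + sigma * math.sqrt(tau) * Z)
    return math.exp(-r * tau) * np.mean(np.maximum(S_T - K, 0.0))

closed_form = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
monte_carlo = mc_call(100.0, 100.0, 0.05, 0.2, 1.0)
print("closed form :", closed_form)
print("Monte Carlo :", monte_carlo)
```

The Monte Carlo estimate fluctuates with the seed and sample size, so the two numbers should agree only to within sampling error.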
Connections
Where Your Intuition Breaks
The Itô correction term $\tfrac{1}{2} f''(X_t)\,\sigma_t^2\,dt$ looks like a small perturbation, but for $f(x) = x^2$ and $X_t = B_t$ it gives $d(B_t^2) = 2B_t\,dB_t + dt$. Integrating both sides yields $B_T^2 = 2\int_0^T B_t\,dB_t + T$: the correction contributes $T$, exactly the quadratic variation, not a negligible error. Without it, you would compute $\mathbb{E}[B_T^2] = 0$ because the Itô integral has zero mean, but $B_T \sim \mathcal{N}(0,T)$ so $\mathbb{E}[B_T^2] = T$. The correction term is not a refinement of classical calculus; it is a genuinely new term that cannot be recovered by any limiting argument from ordinary integration.
💡Intuition
$(dB_t)^2 = dt$ is the fundamental identity of stochastic calculus. In classical calculus, second-order terms $(dx)^2 \to 0$ as $dx \to 0$. But Brownian motion's quadratic variation is $[B,B]_t = t$: finite, deterministic, growing. This makes $(dB_t)^2 = dt$ a finite correction, not a negligible term. The surprising results in this section (the Itô's lemma correction, the Black-Scholes $\sigma^2/2$ terms, the Fokker-Planck diffusion term) all trace back to this single identity. Internalizing $(dB_t)^2 = dt$ is the key to reading SDEs fluently.
💡Intuition
The Fokker-Planck equation connects microscopic SDEs to macroscopic densities. A single SDE trajectory is a random path; the Fokker-Planck equation evolves the probability density of the ensemble. The drift term $-\partial_x(\mu p)$ is convection; the diffusion term $\tfrac{1}{2}\partial_{xx}(\sigma^2 p)$ is spreading. For Langevin dynamics $dX = -\nabla U\,dt + \sqrt{2T}\,dB$, the stationary Fokker-Planck solution is $p^* \propto e^{-U(x)/T}$, the Boltzmann distribution. This is why Langevin dynamics can sample from a posterior: with $U$ set to the negative log-posterior and $T = 1$, the SDE's stationary distribution is exactly the target.
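The stationary claim can be tested on a non-Gaussian example. The sketch below assumes the potential $U(x) = x^4/4$ and $T = 1$ (both my choices for illustration), runs many independent Euler-Maruyama chains of the Langevin SDE, and compares the sample second moment with the same moment of $e^{-U/T}$ computed by numerical quadrature. Function names are hypothetical.

```python
import numpy as np

def langevin_samples(grad_U, temperature, dt=0.01, n_steps=1000,
                     n_chains=10_000, seed=0):
    """Euler-Maruyama on dX = -grad U(X) dt + sqrt(2 T) dB; returns final states."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_chains)
    noise_scale = np.sqrt(2.0 * temperature * dt)
    for _ in range(n_steps):
        x = x - grad_U(x) * dt + noise_scale * rng.normal(size=n_chains)
    return x

def boltzmann_second_moment(U, temperature, half_width=6.0, n_grid=20_001):
    """E[x^2] under p* proportional to exp(-U/T), by quadrature on a uniform grid."""
    x = np.linspace(-half_width, half_width, n_grid)
    w = np.exp(-U(x) / temperature)
    return np.sum(x**2 * w) / np.sum(w)

samples = langevin_samples(grad_U=lambda x: x**3, temperature=1.0)
target = boltzmann_second_moment(U=lambda x: x**4 / 4.0, temperature=1.0)
print("E[x^2] from Langevin chains :", np.mean(samples**2))
print("E[x^2] under exp(-U/T)      :", target)
```

The discretized chain has a small step-size bias, so the match is approximate; shrinking `dt` (with more steps) reduces it.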
⚠️Warning
The Itô and Stratonovich interpretations call for different numerical discretizations. The Euler-Maruyama scheme $X_{t+\Delta t} \approx X_t + \mu(X_t)\,\Delta t + \sigma(X_t)\,\Delta B_t$ implements Itô. Higher-order schemes such as Milstein, and Runge-Kutta-style methods for Stratonovich SDEs, require additional correction terms. For an SDE of the form $dX = f(X) \circ dB$ (Stratonovich), converting to Itô form adds a drift correction: $dX = \tfrac{1}{2} f'(X) f(X)\,dt + f(X)\,dB$. Numerical SDE libraries (torchsde, diffeqpy) let you specify Itô vs Stratonovich; using the wrong convention silently produces wrong results.
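The size of the error from choosing the wrong convention can be seen on a linear example. The sketch below assumes $f(X) = \sigma X$ with $\sigma = 0.5$ (an illustrative choice, with a hypothetical function name). Read as Stratonovich, $dX = \sigma X \circ dB$ has solution $X_0 e^{\sigma B_T}$; read as Itô, the same symbols give $X_0 e^{\sigma B_T - \sigma^2 T/2}$. Euler-Maruyama reaches the Stratonovich answer only after the drift correction $\tfrac{1}{2} f'(X) f(X) = \tfrac{1}{2}\sigma^2 X$ is added.

```python
import numpy as np

def compare_interpretations(sigma=0.5, x0=1.0, T=1.0, n=100_000, seed=0):
    """Euler-Maruyama with and without the Stratonovich-to-Ito drift correction."""
    rng = np.random.default_rng(seed)
    dt = T / n
    dB = rng.normal(0.0, np.sqrt(dt), size=n)
    B_T = dB.sum()
    # For a linear SDE the Euler-Maruyama recursion X_{k+1} = X_k * (1 + ...)
    # collapses to a product over steps.
    em_no_correction = x0 * np.prod(1.0 + sigma * dB)
    em_with_correction = x0 * np.prod(1.0 + 0.5 * sigma**2 * dt + sigma * dB)
    return {
        "em_no_correction": em_no_correction,
        "em_with_correction": em_with_correction,
        "exact_ito": x0 * np.exp(sigma * B_T - 0.5 * sigma**2 * T),
        "exact_stratonovich": x0 * np.exp(sigma * B_T),
    }

for name, value in compare_interpretations().items():
    print(f"{name:20s} {value:.5f}")
```

On the same noise path the two readings differ by the factor $e^{\sigma^2 T/2}$, which is not small for moderate $\sigma$.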