
Continuous-Time Markov Chains & Poisson Processes

Continuous-time Markov chains model systems that jump between states at random times — from radioactive decay to queuing systems to neural spike trains. The Poisson process is the canonical example, and its properties underlie survival analysis, event modeling, and the connection between discrete probabilistic models and differential equations.

Concepts

A hospital emergency room, a packet router, a radioactive nucleus — all are systems that jump between states at random times, with no memory of how long they have been waiting. The exponential distribution's memorylessness is not a modeling convenience but a mathematical necessity: it is the unique continuous distribution where $P(\text{wait} > s+t \mid \text{wait} > s) = P(\text{wait} > t)$. Continuous-time Markov chains are the precise framework for such systems, with the generator matrix $Q$ encoding the rate of every possible transition.

Continuous-Time Markov Chains: Generator and Kolmogorov Equations

A continuous-time Markov chain (CTMC) on state space $\mathcal{S}$ satisfies the Markov property in continuous time: for all $s < t$, $P(X_t = j \mid X_u,\, u \leq s) = P(X_t = j \mid X_s)$.

The chain is characterized by the generator matrix (or $Q$-matrix): $Q_{ij} \geq 0$ for $i \neq j$ (transition rate from $i$ to $j$), and $Q_{ii} = -\sum_{j \neq i} Q_{ij}$ (so each row sums to zero). The holding time in state $i$ is exponential with rate $q_i = -Q_{ii} = \sum_{j \neq i} Q_{ij}$.

The transition semigroup $P(t) = e^{tQ}$ (matrix exponential) satisfies $P(s+t) = P(s)P(t)$ (Chapman-Kolmogorov). The probability vector $\mu(t) = \mu(0) e^{tQ}$ evolves as:

$$\frac{d\mu}{dt} = \mu Q \quad \text{(Kolmogorov forward equation)}.$$

Equivalently, $\frac{d}{dt}P(t) = P(t)Q = QP(t)$.

The matrix exponential $P(t) = e^{tQ}$ is forced by two requirements working together: the Chapman-Kolmogorov equation $P(s+t) = P(s)P(t)$ (from the Markov property) and differentiability at $t = 0$ (from the exponential holding-time distribution). Any solution satisfying both is a matrix exponential; the generator $Q = P'(0)$ is the unique finite object encoding instantaneous transition rates. Without $Q$, you would need to specify $P(t)$ for every $t$ separately — with $Q$, one matrix determines all finite-time probabilities.
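For a two-state chain the semigroup has a closed form, which makes the Chapman-Kolmogorov property easy to check numerically. A minimal pure-Python sketch (the rates `a`, `b` and all names are illustrative, not from the notes):

```python
import math

def two_state_P(a, b, t):
    """P(t) = e^{tQ} for Q = [[-a, a], [b, -b]], via the closed form
    P(t) = Pi + e^{-(a+b)t} (I - Pi), where Pi has the stationary rows."""
    s = a + b
    e = math.exp(-s * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b = 2.0, 1.0
P1, P2, P3 = two_state_P(a, b, 1.0), two_state_P(a, b, 2.0), two_state_P(a, b, 3.0)
CK = matmul(P1, P2)  # Chapman-Kolmogorov: P(1)P(2) should equal P(3)
gap = max(abs(CK[i][j] - P3[i][j]) for i in range(2) for j in range(2))
print(gap)  # ~0, up to floating-point error
```

Each row of every $P(t)$ sums to one, and as $t \to \infty$ both rows converge to the stationary distribution $(b/(a+b),\, a/(a+b))$.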

Stationary distribution: $\pi Q = 0$, i.e., $\sum_i \pi_i Q_{ij} = 0$ for all $j$. Detailed balance: $\pi_i Q_{ij} = \pi_j Q_{ji}$ (it holds for reversible chains and implies stationarity, but stationarity does not require it).

Relationship to discrete-time chain: the embedded chain has transitions $P_{ij} = Q_{ij}/q_i$ for $i \neq j$. The CTMC visits states in the order of the embedded chain but spends exponential holding times at each.
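The embedded-chain view gives a direct simulation recipe: draw an exponential holding time at the current state's rate, then jump according to the embedded transition probabilities. A sketch for an assumed two-state chain, checking the time-average occupancy against the solution of $\pi Q = 0$:

```python
import random

random.seed(0)

# Assumed toy generator: rate a for 0 -> 1, rate b for 1 -> 0.
a, b = 2.0, 1.0
rates = [a, b]  # holding-time rate q_i in state i

state = 0
time_in = [0.0, 0.0]
for _ in range(200_000):
    hold = random.expovariate(rates[state])  # Exp(q_i) holding time
    time_in[state] += hold
    state = 1 - state  # embedded chain: the only possible jump

pi_hat = time_in[0] / sum(time_in)
pi_exact = b / (a + b)  # solves pi Q = 0 for this generator
print(round(pi_hat, 3), round(pi_exact, 3))
```

The time-average fraction of time spent in a state estimates its stationary probability, which is the holding-time-weighted version of the embedded chain's occupancy.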

The Poisson Process

The Poisson process $N(t)$ with rate $\lambda$ is the canonical CTMC with states $\mathcal{S} = \{0, 1, 2, \ldots\}$ and generator $Q_{i,i+1} = \lambda$, $Q_{ii} = -\lambda$ (only upward jumps).

Definition via inter-arrivals: inter-arrival times $T_1, T_2, \ldots$ are iid $\text{Exp}(\lambda)$; $N(t) = \max\{n : T_1 + \cdots + T_n \leq t\}$.

Marginal distribution: $N(t) \sim \text{Poisson}(\lambda t)$, i.e., $P(N(t) = k) = e^{-\lambda t}(\lambda t)^k / k!$.

Independent increments: for $s < t$, $N(t) - N(s) \sim \text{Poisson}(\lambda(t-s))$ and is independent of $\mathcal{F}_s$.

Superposition: the merge of $k$ independent Poisson processes with rates $\lambda_1, \ldots, \lambda_k$ is Poisson with rate $\sum_i \lambda_i$.

Thinning: independently keep each arrival with probability $p$; the result is Poisson with rate $\lambda p$.

Conditioning: given $N(t) = n$, the $n$ arrival times are distributed as the order statistics of $n$ iid $\text{Uniform}[0,t]$ random variables.
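Several of these properties can be checked by simulating the process from its inter-arrival definition. A sketch (the rate, horizon, and thinning probability are arbitrary illustrative choices):

```python
import random

random.seed(1)
lam, t_end, p = 3.0, 1.0, 0.4  # assumed rate, horizon, thinning probability

def poisson_arrivals(lam, t_end):
    """Arrival times in [0, t_end] built from iid Exp(lam) gaps."""
    times, t = [], random.expovariate(lam)
    while t <= t_end:
        times.append(t)
        t += random.expovariate(lam)
    return times

counts, thinned = [], []
for _ in range(20_000):
    arr = poisson_arrivals(lam, t_end)
    counts.append(len(arr))
    thinned.append(sum(1 for _ in arr if random.random() < p))

print(sum(counts) / len(counts))    # ~ lam * t_end = 3.0
print(sum(thinned) / len(thinned))  # ~ lam * p * t_end = 1.2 (thinning)
```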

Birth-Death Chains

A birth-death chain has states $\{0, 1, 2, \ldots\}$ with transitions only between adjacent states: birth rate $\lambda_i$ from state $i$ to $i+1$, death rate $\mu_i$ from $i$ to $i-1$, with $\mu_0 = 0$.

The detailed balance equations reduce to a product formula for the stationary distribution:

$$\pi_i = \pi_0 \prod_{k=0}^{i-1} \frac{\lambda_k}{\mu_{k+1}}.$$

M/M/1 queue: arrivals at rate $\lambda$, service at rate $\mu$, $\rho = \lambda/\mu$. Stationary distribution: $\pi_i = (1-\rho)\rho^i$ for $\rho < 1$. Mean queue length: $E[N] = \rho/(1-\rho)$; mean time in system (via Little's law): $E[W] = 1/(\mu - \lambda)$.
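Specializing the birth-death product formula to constant rates $\lambda_i = \lambda$, $\mu_i = \mu$ recovers the geometric stationary distribution, which can be checked numerically by truncating the infinite sum (rates taken from the M/M/1 discussion):

```python
lam, mu = 90.0, 100.0
rho = lam / mu

# Product formula with constant rates: pi_i = pi_0 * rho^i, pi_0 = 1 - rho.
N = 2000  # truncation point, far into the geometric tail
pi = [(1 - rho) * rho**i for i in range(N)]
mean_len = sum(i * p for i, p in enumerate(pi))

print(round(sum(pi), 6))   # ~1.0: the distribution normalizes
print(round(mean_len, 3))  # ~ rho / (1 - rho) = 9.0
```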

Renewal Theory

A renewal process generalizes Poisson by allowing non-exponential inter-arrival times $T_i \sim F$ with mean $\mu_T = E[T_i]$.

Elementary renewal theorem: $m(t) = E[N(t)]$ satisfies $m(t)/t \to 1/\mu_T$ as $t \to \infty$.

Inspection paradox: the renewal interval containing a fixed time tt has length distribution stochastically greater than FF. This is why buses always seem late — you are more likely to arrive during a long gap.
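The size bias is easy to see in simulation: probe a renewal stream at a fixed time and record the length of the interval you land in. A sketch with $\text{Uniform}(0, 2)$ inter-arrivals (mean 1, $E[T^2]/E[T] = 4/3$); the probe time and sample count are arbitrary:

```python
import random

random.seed(2)

def interval_at(t_probe, draw):
    """Length of the renewal interval containing t_probe."""
    t = 0.0
    while True:
        gap = draw()
        if t + gap > t_probe:
            return gap
        t += gap

draw = lambda: random.uniform(0.0, 2.0)  # non-exponential gaps, mean 1
samples = [interval_at(50.0, draw) for _ in range(20_000)]
size_biased = sum(samples) / len(samples)

print(round(size_biased, 3))  # ~ E[T^2]/E[T] = 4/3, not the mean gap of 1
```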

Worked Examples

Example 1: Waiting Times for Rare Events

In a network, packet errors occur as a Poisson process with rate $\lambda = 0.01$ errors/second.

Mean time to first error: $E[T_1] = 1/\lambda = 100$ seconds.

Probability of an error-free period of length $t = 60$ s: $P(N(60) = 0) = e^{-0.6} \approx 0.55$.

Expected errors in 1000 seconds: $E[N(1000)] = 10$, with $\text{Var}[N(1000)] = 10$ (Poisson: mean = variance).

Overdispersion check: if observed variance $\gg$ mean, the Poisson assumption fails (e.g., bursty errors form a clustered process). A variance-to-mean ratio of $1$ is the Poisson diagnostic.
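The variance-to-mean diagnostic can be demonstrated by simulating the error process itself (the replication count is an arbitrary choice):

```python
import random

random.seed(3)
lam, T = 0.01, 1000.0  # rate and window length from the example

def count_events(lam, T):
    """N(T): number of Exp(lam) inter-arrivals fitting in [0, T]."""
    n, t = 0, random.expovariate(lam)
    while t <= T:
        n += 1
        t += random.expovariate(lam)
    return n

counts = [count_events(lam, T) for _ in range(5_000)]
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print(round(mean, 2), round(var / mean, 2))  # mean ~10, VMR ~1 for Poisson
```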

Example 2: M/M/1 Queue Near Saturation

A web server handles requests: $\lambda = 90$/sec, $\mu = 100$/sec, $\rho = 0.9$.

Mean queue length: $E[N] = 0.9/0.1 = 9$ requests. Mean response time: $E[W] = 1/(100-90) = 0.1$ seconds.

At $\rho = 0.99$: $E[N] = 99$, $E[W] = 1$ second — 10× worse for a 10% load increase. Near $\rho = 1$, the system enters congestion collapse. Real CDN capacity planning provisions $\rho < 0.7$ to maintain sub-100ms latency.
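The blow-up near saturation follows directly from $E[N] = \rho/(1-\rho)$ and $E[W] = 1/(\mu-\lambda)$. A short table (the 70/sec row is an extra illustrative load, not from the example):

```python
mu = 100.0  # service rate, requests/sec
for lam in (70.0, 90.0, 99.0):  # offered loads; 90 and 99 match the example
    rho = lam / mu
    EN = rho / (1 - rho)   # mean number in system
    EW = 1.0 / (mu - lam)  # mean response time via Little's law
    print(f"rho={rho:.2f}  E[N]={EN:6.2f}  E[W]={EW * 1000:6.1f} ms")
```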

Example 3: Gillespie Algorithm for Chemical Kinetics

A system has two species $A$ and $B$ with reactions $A \xrightarrow{k_1} B$ and $B \xrightarrow{k_2} A$. At state $(n_A, n_B)$, the total rate is $r = k_1 n_A + k_2 n_B$.

The Gillespie algorithm: time to next event $\Delta t \sim \text{Exp}(r)$; event type $A \to B$ with probability $k_1 n_A / r$. The stationary distribution is binomial: $\pi_{n_A} \propto \binom{N}{n_A}(k_2/k_1)^{n_A}$, i.e., each of the $N$ molecules is independently in state $A$ with probability $k_2/(k_1+k_2)$. This is exact stochastic kinetics with no mean-field approximation.
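A direct implementation of the two-reaction Gillespie loop, tracking the time-weighted average of $n_A$ (the rate constants and molecule count are arbitrary illustrative choices):

```python
import random

random.seed(4)
k1, k2, N = 1.0, 2.0, 50  # assumed rate constants and total molecule count

n_A = N
t_weighted, t_total = 0.0, 0.0
for _ in range(200_000):
    r = k1 * n_A + k2 * (N - n_A)  # total event rate at state (n_A, N - n_A)
    dt = random.expovariate(r)     # Exp(r) time to the next reaction
    t_weighted += n_A * dt         # accumulate the time average of n_A
    t_total += dt
    if random.random() < k1 * n_A / r:
        n_A -= 1                   # A -> B fired
    else:
        n_A += 1                   # B -> A fired

avg_nA = t_weighted / t_total
print(round(avg_nA, 1))  # ~ N * k2 / (k1 + k2) = 33.3
```

The time average converges to the binomial stationary mean $N k_2/(k_1+k_2)$, with no ODE or mean-field step anywhere in the loop.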

Connections

Where Your Intuition Breaks

The Poisson process is memoryless in the sense that the waiting time for the next arrival is independent of how long you have been waiting. But this does not mean the process is memoryless about the total count: $N(t)$ grows steadily with time, and the process is not stationary. More subtly, the inspection paradox undermines the intuition that "a random time lands in an interval of average length": if you sample a uniformly random time in $[0, T]$, you are more likely to land inside a long inter-arrival interval than a short one. The average length of the interval containing your random time is $E[T_1^2]/E[T_1]$ — strictly greater than $E[T_1]$ unless all intervals are equal. The memorylessness of the exponential keeps the residual wait $\text{Exp}(\lambda)$ no matter when you arrive, but it does not eliminate the paradox: for the Poisson process the interval straddling your arrival averages $2/\lambda$, twice a typical inter-arrival time, and the effect persists for any renewal process.

💡Intuition

The generator $Q$ is the continuous-time analog of $P - I$. The Kolmogorov forward equation $\dot\mu = \mu Q$ is the probability analog of a first-order linear ODE. When $Q$ has eigenvalues $-\gamma_k$ with $\gamma_k > 0$ (besides the eigenvalue $0$ carrying the stationary distribution), the transient terms decay as $e^{-\gamma_k t}$, and the system relaxes to equilibrium at the rate of the smallest such $\gamma_1$ — the spectral gap of the generator, analogous to the discrete case. This makes eigendecomposition of $Q$ the primary tool for analyzing relaxation times.
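For the two-state generator $Q = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}$ the eigenvalues are $0$ and $-(a+b)$, so the spectral gap is $a+b$ and convergence is a single exponential. A sketch with assumed rates, confirming the decay rate:

```python
import math

a, b = 2.0, 1.0
gamma_1 = a + b    # spectral gap: the nonzero eigenvalue of Q is -(a+b)
pi0 = b / (a + b)  # stationary probability of state 0

def mu0(t):
    """P(X_t = 0) starting from state 0 (closed form for the 2-state chain)."""
    return pi0 + (1.0 - pi0) * math.exp(-gamma_1 * t)

# Distance to equilibrium shrinks by e^{-gamma_1 * dt} over any step dt.
d_half = abs(mu0(0.5) - pi0)
d_one = abs(mu0(1.0) - pi0)
print(round(d_one / d_half, 4))  # = e^{-gamma_1 * 0.5} ~ 0.2231
```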

💡Intuition

Poisson processes are maximally unstructured arrival streams. The Poisson process is the unique renewal process with both independent and stationary increments simultaneously. It is the maximum entropy point process for a given mean rate — knowing only the mean intensity fully specifies the distribution. This is why Poisson is the default model for counts of independent events: it uses the least information beyond the mean. Departures from Poisson (overdispersion, clustering, periodic structure) are meaningful signals about structure in the data-generating process.

⚠️Warning

Little's law is exact but requires stationarity. $L = \lambda W$ (average number in system = arrival rate × average time in system) holds for any stable queueing system under very general assumptions — not just M/M/1. But it requires the system to be in steady state. During transients (e.g., a server starting up or experiencing a traffic spike), $L = \lambda W$ can fail dramatically. In production observability, computing latency from throughput and queue depth via Little's law is valid only if the system is stationary; measurement during traffic spikes gives misleading estimates.
