Generating Functions, Combinatorial Enumeration & the Probabilistic Method
Generating functions encode combinatorial sequences as power series, making enumeration problems algebraic; the probabilistic method proves existence of combinatorial objects by showing a random construction succeeds with positive probability. Together they form two of the most powerful nonconstructive tools in discrete mathematics.
Concepts
Erdős–Rényi random graph $G(n,p)$ with $n=20$ nodes. Each edge included independently with probability $p$. Phase transition at $p_c = 1/(n-1) \approx 0.053$.
Combinatorics asks two questions: "how many?" and "does one exist?" The answers use completely different techniques. Generating functions convert counting sequences into algebraic equations that can be solved symbolically; the probabilistic method bypasses counting entirely by showing a random construction succeeds with positive probability, guaranteeing existence without constructing the object.
Ordinary and Exponential Generating Functions
The ordinary generating function (OGF) of a sequence is the formal power series:
Addition of sequences corresponds to addition of OGFs; convolution of sequences corresponds to multiplication: . This algebraic structure makes GFs ideal for counting objects built from independent parts.
The convolution rule is why power series are the natural representation for counting: most combinatorial structures are built by splitting into a left component of size and a right component of size . Multiplying OGFs algebraically encodes this splitting — the coefficient of in the product is the sum over all splits, which is exactly the count by inclusion over partition points. The EGF uses to handle labeling: labeled structures of size split in ways (choosing which labels go left), and the EGF product absorbs this binomial coefficient automatically via the product rule for EGFs.
For labeled structures where order matters, the exponential generating function (EGF) is:
The EGF of permutations is . The EGF of derangements (permutations with no fixed points) is , since the derangement count is . The EGF of labeled rooted trees is the solution to (by Lagrange inversion, , recovering Cayley's formula).
Fibonacci and Catalan Numbers via Generating Functions
Fibonacci: The recurrence with , gives . Partial fractions with roots , yield Binet's formula .
Catalan numbers: counts binary trees with internal nodes, triangulations of an -gon, and Dyck paths of length . The recurrence says "split at the root." Taking OGF:
Solving: . Expanding via the generalized binomial theorem and using the saddle point method gives .
The Probabilistic Method
The probabilistic method (Erdős, 1947) shows existence of a combinatorial object by constructing a probability space over all candidate objects and proving . Since the probability is positive, at least one such object exists — without constructing it.
Basic form: To show a graph with vertices and edges has chromatic number , exhibit a probability distribution over colorings and show all have a monochromatic edge with positive probability.
Alteration method: First apply a random construction, then fix the small number of violations. Example: take a random subset of vertices by including each independently with probability ; vertices violating independence are removed. Optimizing gives the independence number lower bound for -regular graphs.
Lovász Local Lemma (LLL): Let be "bad" events, each with probability at most , and each depending on (sharing probability space structure with) at most other events. If:
then — all bad events can be avoided simultaneously. Here is Euler's number.
Ramsey Numbers and Graph Coloring Bounds
The Ramsey number is the minimum such that every 2-coloring of the edges of contains a monochromatic -clique. Ramsey (1930) showed is finite; the probabilistic method gives the lower bound:
Proof: Color each edge of red or blue independently with probability . For a fixed set of vertices, the probability all edges are the same color is . By union bound over sets:
when . For such , there exists a 2-coloring with no monochromatic -clique, so .
For -SAT: a CNF formula where each clause has exactly literals and each variable appears in at most clauses is always satisfiable. By LLL with (probability a random assignment falsifies one clause) and dependencies: for .
Worked Example
Catalan numbers via OGF. Start from . Rearranging: . Quadratic formula gives . The minus sign is chosen so (use L'Hôpital: as , ). Expanding via the generalized binomial theorem:
Extracting the coefficient of in yields . For : . The saddle-point asymptotics follow from the dominant singularity at : .
Ramsey lower bound. For : the bound gives . In fact ; the first moment calculation only gives , which is weak — the probabilistic bound is loose for small but becomes asymptotically tight in the logarithmic sense. For : , while the true value is unknown (between 798 and 23556). The lower bound is exponential in ; the upper bound from the Erdős-Szekeres theorem is — the gap between and remains a major open problem.
Applying LLL to 3-SAT. A 3-CNF formula has each clause with literals. Each clause is falsified by exactly one of assignments; a random assignment falsifies it with probability . Each clause shares variables with at most other clauses where clauses per variable: . Check: . The symmetric LLL fails here; the asymmetric LLL with variable-specific bounds is needed, confirming that 3-SAT satisfiability is more delicate than 4-SAT.
Connections
Where Your Intuition Breaks
The probabilistic lower bound gives , and the recursion upper bound gives . These bounds are separated by an exponential, and for the true value of is unknown. The gap has resisted decades of effort. The key misconception is that the probabilistic argument can be "made constructive" by analyzing its proof — it cannot. The proof shows a random coloring avoids a monochromatic clique with positive probability, but provides no way to identify which coloring is good. Derandomizing this argument (finding an explicit good coloring) is an open problem that would require fundamentally new combinatorial insights, not just tighter analysis of the probabilistic proof. The probabilistic method proves good objects are common without letting you find one.
The probabilistic method is one of the most conceptually striking proof techniques in mathematics: you prove an object exists without finding it. Erdős used it to show Ramsey numbers grow at least exponentially, a bound that had no constructive proof for decades and still has no explicit construction matching it. The power comes from the union bound and linearity of expectation — tools so elementary that the conclusions seem disproportionately strong. The method works because most random objects have the desired property; the "bad" cases are rare and avoidable.
Generating functions are the bridge between discrete combinatorics and complex analysis. The EGF of a combinatorial class determines its analytic properties: singularities control asymptotics (the radius of convergence of the EGF determines exponential growth rates), and the saddle-point method extracts precise asymptotic coefficients from the behavior near the dominant singularity. The EGF of permutations is (pole at 1, giving growth); the EGF of set partitions (Bell numbers) is (essential singularity, giving super-exponential growth). This analytic structure, studied in analytic combinatorics, predicts asymptotic behavior without computing terms directly.
The probabilistic method proves existence but gives no algorithm. This gap between existence and constructibility is significant: for Ramsey colorings, we know good colorings exist for but cannot find them efficiently. The Moser–Tardos algorithmic LLL closes this gap: when the LLL condition holds, a simple randomized algorithm (resample any violated bad event) terminates in polynomial expected time and produces an outcome with no bad events. This constructive version has applications in combinatorial optimization, coding theory, and constraint satisfaction that the existential version alone cannot provide.
Enjoying these notes?
Get new lessons delivered to your inbox. No spam.