
Linear Programming: Simplex Method, Geometry & LP Duality

Linear programming is the backbone of practical optimization: it maximizes or minimizes a linear objective over a convex polytope defined by linear constraints. The simplex method exploits the geometry of this polytope — searching only its vertices — while LP duality reveals a symmetric companion problem whose optimal value always matches.

Concepts

The feasible region of an LP is a convex polytope. The optimal solution always lies at a vertex. The objective function sweeps as a hyperplane — the last vertex it touches is optimal.

[Figure: 2-D feasible region with gradient direction $\nabla c$ and optimal vertex $(2, 2)$.]
Optimal: $(2, 2)$ with objective value 4. Blue dashed = constraints. Green solid = optimal iso-line ($c \cdot x = 4$).

A factory must schedule production of two products, each using some raw material and machine time, to maximize profit subject to resource limits. Each constraint is a line in the plane; the feasible region is a polygon — a convex polytope. Because profit is linear, the maximum always occurs at a corner. Linear programming formalizes this insight for any number of variables and constraints, and the simplex method exploits the corner structure to find the optimum efficiently.

Standard Form

Every LP can be converted to standard form:

$$\min_{x} \; c^\top x \quad \text{s.t.} \quad Ax = b, \; x \geq 0$$

where $c, x \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, and $A \in \mathbb{R}^{m \times n}$. Inequality constraints $a_i^\top x \leq b_i$ are converted by adding a slack variable $s_i \geq 0$; free variables $x_j$ are split as $x_j = x_j^+ - x_j^-$ with both parts non-negative.

The standard form is not just notational convenience: it exposes the basis structure that the simplex method exploits. Each basis $B$ selects $m$ linearly independent columns of $A$ and corresponds to exactly one vertex (extreme point) of the feasible polytope. Every LP algorithm is, at some level, a strategy for searching or interpolating between these vertices.
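The inequality-to-equality conversion can be sketched in code. Below is a minimal helper, assuming NumPy; `to_standard_form` is a name invented here for illustration, and free-variable splitting is omitted (all variables are assumed non-negative):

```python
import numpy as np

def to_standard_form(c, A_ub, b_ub):
    """Convert  min c^T x   s.t.  A_ub x <= b_ub, x >= 0
    into       min c'^T x'  s.t.  A x' = b, x' >= 0
    by appending one slack variable per inequality."""
    m, n = A_ub.shape
    A = np.hstack([A_ub, np.eye(m)])           # slack columns form an identity block
    c_full = np.concatenate([c, np.zeros(m)])  # slacks carry zero cost
    return c_full, A, b_ub.copy()

# Factory LP from the worked example below: min -5x1 - 4x2
c, A_eq, b = to_standard_form(np.array([-5.0, -4.0]),
                              np.array([[2.0, 1.0], [1.0, 3.0]]),
                              np.array([8.0, 9.0]))
print(A_eq)
# [[2. 1. 1. 0.]
#  [1. 3. 0. 1.]]
```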

Geometric Interpretation

The feasible set $\mathcal{P} = \{x \geq 0 : Ax = b\}$ is a convex polytope. Because the objective is linear, the optimal value is attained at a vertex (extreme point) of this polytope: a point that cannot be written as a strict convex combination of two other feasible points.

A basic feasible solution (BFS) corresponds to a vertex algebraically: choose $m$ linearly independent columns of $A$ to form a basis matrix $B \in \mathbb{R}^{m \times m}$; set non-basic variables to zero; the basic variables are $x_B = B^{-1}b \geq 0$.

Every vertex of the polytope is a BFS and vice versa. Since there are finitely many bases, the simplex method searches this finite set rather than a continuous domain.
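Because vertices and basic feasible solutions coincide, a tiny LP can be solved by brute force over all $\binom{n}{m}$ bases. A sketch assuming NumPy (exponential in problem size, so illustrative only):

```python
import numpy as np
from itertools import combinations

def enumerate_bfs(A, b, tol=1e-9):
    """Return all basic feasible solutions of {x >= 0 : Ax = b}."""
    m, n = A.shape
    bfs = []
    for cols in combinations(range(n), m):
        B = A[:, cols]
        if abs(np.linalg.det(B)) < tol:
            continue                       # columns not linearly independent
        x = np.zeros(n)
        x[list(cols)] = np.linalg.solve(B, b)
        if (x >= -tol).all():              # basic AND feasible => a vertex
            bfs.append(x)
    return bfs

# Factory LP in standard form, variables (x1, x2, s1, s2)
A = np.array([[2.0, 1.0, 1.0, 0.0], [1.0, 3.0, 0.0, 1.0]])
b = np.array([8.0, 9.0])
c = np.array([-5.0, -4.0, 0.0, 0.0])
best = min(enumerate_bfs(A, b), key=lambda x: c @ x)
print(best[:2], -(c @ best))   # vertex (3, 2), profit 23
```

Of the six bases, four are feasible; they are exactly the four vertices tabulated in the worked example.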

Simplex Method

Starting from an initial BFS, simplex iterates:

  1. Optimality check. Compute reduced costs $\bar{c}_j = c_j - c_B^\top B^{-1} A_j$ for each non-basic variable $j$. If all $\bar{c}_j \geq 0$, the current BFS is optimal.
  2. Entering variable. Choose the non-basic variable $j^*$ with the most negative reduced cost (Bland's rule instead uses the smallest index to prevent cycling).
  3. Leaving variable. Increase $x_{j^*}$ along an edge of the polytope; the first basic variable to hit zero leaves the basis. The minimum ratio test gives the leaving index: $\arg\min_{i:\,(B^{-1}A_{j^*})_i > 0} \frac{(B^{-1}b)_i}{(B^{-1}A_{j^*})_i}$.
  4. Pivot. Update $B$, $B^{-1}$, and the BFS.

Each pivot moves to an adjacent vertex with no worse objective value. Under a non-degeneracy assumption, simplex terminates in finitely many steps. The revised simplex method maintains $B^{-1}$ via rank-1 updates (LU factorization updates in practice), keeping each iteration at $O(mn)$ work.
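The four steps can be collected into a small working loop. This is a deliberately naive sketch assuming NumPy, not the revised method: it re-factorizes the basis at every pivot instead of updating $B^{-1}$, uses Dantzig's most-negative entering rule, and assumes a feasible starting basis is given (here, the slack basis of the worked example below):

```python
import numpy as np

def simplex(c, A, b, basis):
    """Naive simplex for min c^T x s.t. Ax = b, x >= 0,
    starting from a feasible basis (list of m column indices)."""
    m, n = A.shape
    basis = list(basis)
    while True:
        B = A[:, basis]
        x_B = np.linalg.solve(B, b)            # values of basic variables
        y = np.linalg.solve(B.T, c[basis])     # simplex multipliers
        reduced = c - A.T @ y                  # reduced costs, all columns
        nonbasic = [j for j in range(n) if j not in basis]
        j_star = min(nonbasic, key=lambda j: reduced[j])
        if reduced[j_star] >= -1e-9:           # step 1: optimality reached
            x = np.zeros(n)
            x[basis] = x_B
            return x, y
        d = np.linalg.solve(B, A[:, j_star])   # step 3: edge direction
        ratios = [(x_B[i] / d[i], i) for i in range(m) if d[i] > 1e-9]
        if not ratios:
            raise ValueError("LP is unbounded")
        _, i_star = min(ratios)                # minimum ratio test
        basis[i_star] = j_star                 # step 4: pivot

# Factory example: the slack basis {2, 3} is a feasible start.
c = np.array([-5.0, -4.0, 0.0, 0.0])
A = np.array([[2.0, 1.0, 1.0, 0.0], [1.0, 3.0, 0.0, 1.0]])
b = np.array([8.0, 9.0])
x, y = simplex(c, A, b, [2, 3])
print(x[:2], c @ x)   # optimal (3, 2) with objective -23
```

On this instance the loop reproduces exactly the two pivots traced in the worked example ($x_1$ enters, $s_1$ leaves; $x_2$ enters, $s_2$ leaves).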

LP Duality

Every primal LP has a dual LP. For the standard primal:

$$\min \; c^\top x \quad \text{s.t.} \quad Ax = b, \; x \geq 0$$

the dual is:

$$\max \; b^\top y \quad \text{s.t.} \quad A^\top y \leq c$$

Weak duality. For any primal-feasible xx and dual-feasible yy:

$$b^\top y = (Ax)^\top y = x^\top A^\top y \leq x^\top c = c^\top x$$

so every dual objective value is a lower bound on every primal objective value.

Strong duality (fundamental theorem of LP). If both problems are feasible, their optimal values coincide:

$$\min \; c^\top x = c^\top x^* = b^\top y^* = \max \; b^\top y$$

Strong duality follows from the simplex certificate at optimality: when all reduced costs are non-negative, the multipliers $y^* = (B^{-1})^\top c_B$ are dual feasible ($A^\top y^* \leq c$, with equality on the basic columns) and achieve $b^\top y^* = c_B^\top B^{-1} b = c^\top x^*$.

Complementary slackness. A primal-dual pair (x,y)(x^*, y^*) is optimal if and only if:

$$(c - A^\top y^*)_j \, x_j^* = 0 \quad \forall j$$

together with primal and dual feasibility. For standard form, the dual slack $(c - A^\top y^*)_j$ equals the reduced cost $\bar{c}_j$, so the condition reads $\bar{c}_j \, x_j^* = 0$ for every $j$.

Complementary slackness is the basis of primal-dual algorithms and the economic interpretation of shadow prices: a non-zero primal variable forces its reduced cost to zero.

The dual variables $y^*$ are shadow prices: component $y_i^*$ is the marginal value of relaxing the $i$-th constraint by one unit. This interpretation is central to sensitivity analysis in resource allocation.
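These duality relations can be checked numerically. The sketch below, assuming SciPy is available, solves the factory LP from the worked example and its dual as two separate LPs, then verifies strong duality and complementary slackness:

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([8.0, 9.0])
profit = np.array([5.0, 4.0])

# Primal: max profit^T x  s.t.  Ax <= b, x >= 0   (negate for linprog's min)
primal = linprog(-profit, A_ub=A, b_ub=b, bounds=(0, None), method="highs")

# Dual:   max b^T y  s.t.  A^T y <= profit, y >= 0
dual = linprog(-b, A_ub=A.T, b_ub=profit, bounds=(0, None), method="highs")

x, y = primal.x, dual.x
print(-primal.fun, -dual.fun)   # strong duality: both optimal values are 23

# Complementary slackness: dual slack * primal variable = 0, and vice versa
assert np.allclose((profit - A.T @ y) * x, 0, atol=1e-8)
assert np.allclose((b - A @ x) * y, 0, atol=1e-8)
```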

Worked Example

2-Variable Resource Allocation

A factory makes two products. Each unit of product 1 uses 2 hours of labor and 1 kg of material; product 2 uses 1 hour and 3 kg. Available: 8 labor-hours, 9 kg material. Profit: 5 per unit of product 1, 4 per unit of product 2.

Formulation (convert max to min):

$$\min \; -5x_1 - 4x_2 \quad \text{s.t.} \quad 2x_1 + x_2 \leq 8, \; x_1 + 3x_2 \leq 9, \; x_1, x_2 \geq 0$$

Add slacks s1,s2s_1, s_2:

$$2x_1 + x_2 + s_1 = 8, \quad x_1 + 3x_2 + s_2 = 9$$

Vertex enumeration. The four candidate vertices come from setting pairs of variables to zero:

| Vertex | $(x_1, x_2)$ | Feasible? | Profit |
| --- | --- | --- | --- |
| O | (0, 0) | Yes | 0 |
| A | (4, 0) | Yes | 20 |
| B | (3, 2) | Yes | 23 |
| C | (0, 3) | Yes | 12 |
Point B is the intersection of both active constraints: solving $2x_1 + x_2 = 8$ and $x_1 + 3x_2 = 9$ gives $x_1 = 3$, $x_2 = 2$, profit $= 23$.
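A quick numerical check of vertex B, assuming NumPy:

```python
import numpy as np

# Vertex B: both resource constraints active
x = np.linalg.solve(np.array([[2.0, 1.0], [1.0, 3.0]]),
                    np.array([8.0, 9.0]))
print(x, 5 * x[0] + 4 * x[1])   # [3. 2.] and profit 23.0
```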

Simplex pivots. Starting BFS: $x_1 = x_2 = 0$, $s_1 = 8$, $s_2 = 9$. Basis $= \{s_1, s_2\}$.

  • Reduced costs: $\bar{c}_1 = -5$, $\bar{c}_2 = -4$. Most negative: $x_1$ enters.
  • Ratios: $8/2 = 4$, $9/1 = 9$. Min ratio: $s_1$ leaves. New BFS: $x_1 = 4$.
  • Reduced costs recomputed: $\bar{c}_2 = -4 + 5 \cdot (1/2) = -3/2 < 0$. $x_2$ enters.
  • Ratios in the updated tableau: $x_2$ can increase to 2. $s_2$ leaves. BFS: $x_1 = 3$, $x_2 = 2$.
  • All reduced costs now non-negative. Optimal: profit = 23.

Dual problem and strong duality. The dual is:

$$\max \; 8y_1 + 9y_2 \quad \text{s.t.} \quad 2y_1 + y_2 \leq 5, \; y_1 + 3y_2 \leq 4, \; y_1, y_2 \geq 0$$

At optimality, $y^* = (B^\top)^{-1} c_B$. With $B$ corresponding to $\{x_1, x_2\}$ and $c_B = (5, 4)^\top$ taken from the original profit objective:

$$y_1^* = \frac{11}{5}, \quad y_2^* = \frac{3}{5}, \quad b^\top y^* = 8 \cdot \frac{11}{5} + 9 \cdot \frac{3}{5} = \frac{88 + 27}{5} = 23\ \checkmark$$

Strong duality holds: primal and dual optimal values both equal 23.
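The multiplier formula $y^* = (B^\top)^{-1} c_B$ can be checked directly, assuming NumPy (profit-form coefficients are used so the signs come out positive):

```python
import numpy as np

B = np.array([[2.0, 1.0], [1.0, 3.0]])   # columns of A for basis {x1, x2}
c_B = np.array([5.0, 4.0])               # profit coefficients of basic variables
y = np.linalg.solve(B.T, c_B)            # y* = (B^T)^{-1} c_B
b = np.array([8.0, 9.0])
print(y, b @ y)   # [2.2 0.6] and dual objective 23.0
```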

Connections

Where Your Intuition Breaks

LP duality appears perfectly symmetric: the primal minimizes $c^\top x$, the dual maximizes $b^\top y$, and strong duality says their optima coincide. This symmetry might suggest that solving one automatically solves the other. In practice, the two formulations have very different algorithmic behavior: primal simplex pivots columns, dual simplex pivots rows, and each can be dramatically faster or slower depending on whether the problem is primal- or dual-degenerate. Degenerate problems, where multiple bases correspond to the same vertex, can cause cycling without careful pivot rules. The theoretical symmetry of strong duality does not translate to equivalent computational behavior; choosing primal vs. dual simplex requires understanding the problem's structure.

Complexity

| Algorithm | Worst-case | Typical practice |
| --- | --- | --- |
| Simplex (various pivot rules) | Exponential (Klee-Minty) | Very fast, polynomial on average |
| Ellipsoid method (Khachiyan 1979) | $O(n^4 L)$, the first polynomial bound | Impractical |
| Interior-point (Karmarkar 1984) | $O(\sqrt{n} L)$ iterations, polynomial | Fast in practice for large LPs |
💡Intuition

Simplex is exponential in worst-case theory but fast in practice: the pivot path between two vertices of a realistic polytope is typically short. Smoothed analysis (Spielman-Teng) makes this precise: simplex runs in expected polynomial time under small random perturbations of the input, explaining the gap between the Klee-Minty worst case and empirical behavior.

Degeneracy and Cycling

A BFS is degenerate if any basic variable equals zero. Degenerate pivots move to the same vertex without changing the objective, and a sequence of degenerate pivots can cycle forever. Bland's rule (always pick the smallest-index entering/leaving variable) guarantees termination by preventing cycles.

⚠️Warning

In practice, degeneracy is the rule rather than the exception — many real LPs have highly degenerate BFS because constraints are nearly redundant or data is structured. Commercial solvers use perturbation or lexicographic pivoting to handle degeneracy robustly.

Shadow Prices and Sensitivity

The dual optimal $y^*$ quantifies the value of relaxing each constraint. In the factory example, $y_1^* = 11/5$ means one additional labor-hour is worth $2.2$ profit units. This sensitivity analysis is why LP duality is central to economics and operations: the dual solves the pricing problem that is dual to the allocation problem.
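The shadow-price interpretation can be verified by perturbing the resource vector and re-solving, assuming NumPy. This check is valid only while the perturbation is small enough that the basis $\{x_1, x_2\}$ stays optimal:

```python
import numpy as np

def optimal_profit(labor, material):
    """Profit at the basis {x1, x2}, i.e. with both constraints active.
    Only valid while the perturbed b keeps this basis optimal."""
    x = np.linalg.solve(np.array([[2.0, 1.0], [1.0, 3.0]]),
                        np.array([labor, material]))
    return 5.0 * x[0] + 4.0 * x[1]

base = optimal_profit(8, 9)          # 23.0
extra_hour = optimal_profit(9, 9)    # one more labor-hour available
print(extra_hour - base)             # ~2.2 = y1* = 11/5
```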
