Normal Form Games, Nash Equilibria & Mixed Strategies

Strategic interaction arises whenever one agent's payoff depends on the actions of others — and that describes nearly every interesting decision in economics, computer science, and AI. Normal form games give this situation a precise mathematical structure, and Nash equilibrium tells us what rational play looks like when all players reason simultaneously about each other. Understanding these foundations unlocks a surprisingly large territory: from auction design to adversarial training to multi-agent reinforcement learning.

Concepts

Classic 2×2 games. Green border = Nash equilibrium. Gold border = social optimum. Payoffs: (row, column).

	Column player
	Cooperate	Defect
Row player	Cooperate	3, 3	0, 5
Row player	Defect	5, 0	1, 1 Nash ★

Unique Nash: (D, D) with payoffs (1,1). Social optimum: (C, C) with payoffs (3,3). Price of anarchy = 3.

Two firms setting prices, two countries choosing tariffs, a player and an opponent in a card game — in each case, what you should do depends on what the other party does, and what they do depends on what you do. Normal form games give this circular dependency a precise mathematical structure. The payoff matrix encodes all possible outcomes; Nash equilibrium identifies the strategy profiles where no player, acting alone, can do better by switching.

Normal Form: Players, Actions, Utilities

A normal form game (also called a strategic form game) is a triple $(N, A, u)$ where:

$N = \{1, \ldots, n\}$ is the finite set of players.
$A = A_1 \times A_2 \times \cdots \times A_n$ is the joint action space; each $A_i$ is player $i$ 's finite set of pure actions.
$u = (u_1, \ldots, u_n)$ where $u_i : A \to \mathbb{R}$ is player $i$ 's utility function.

A strategy profile $a = (a_1, \ldots, a_n) \in A$ specifies one action per player. Write $a_{-i}$ for the profile of all players except $i$ .

Dominant Strategies and IESDS

Action $a_i$ strictly dominates $a_i'$ for player $i$ if

$u_i(a_i, a_{-i}) > u_i(a_i', a_{-i}) \quad \forall a_{-i} \in A_{-i}.$

Rational players never play strictly dominated actions. Iterated Elimination of Strictly Dominated Strategies (IESDS) applies this repeatedly: after each round of elimination, newly dominated actions become eliminable. The surviving set is order-independent (Bernheim 1984). Weak dominance ( $\geq$ rather than $>$ ) is trickier — IESDS with weak dominance is order-dependent.

Nash Equilibrium

A strategy profile $a^* \in A$ is a Nash equilibrium if no player can profitably deviate unilaterally:

$u_i(a_i^*, a_{-i}^*) \geq u_i(a_i, a_{-i}^*) \quad \forall a_i \in A_i,\ \forall i \in N.$

Equivalently, each player's action is a best response to the others'. Nash equilibrium is a fixed point of the best-response correspondence: $a^* \in BR(a^*)$ where $BR_i(a_{-i}) = \arg\max_{a_i} u_i(a_i, a_{-i})$ .

Existence (Nash 1950). Every finite normal form game has at least one Nash equilibrium (possibly in mixed strategies). The proof uses Kakutani's fixed-point theorem: the best-response correspondence is upper hemicontinuous with convex values on the compact convex set of mixed strategy profiles, so it has a fixed point.

The fixed-point argument is the only proof that works here because Nash equilibrium is defined by a circular condition: each player's strategy is a best response to the others', which themselves depend on the first player's strategy. There is no constructive proof because finding Nash equilibria is PPAD-complete — existence is guaranteed, but no algorithm can find one efficiently in general. The theorem tells us a solution exists without telling us how to find it.

Mixed Strategies and Expected Utility

A mixed strategy for player $i$ is a probability distribution $\sigma_i \in \Delta(A_i)$ , the simplex over $A_i$ . The support of $\sigma_i$ is the set of actions assigned positive probability. Under mixed strategies, players maximize expected utility:

$u_i(\sigma) = \sum_{a \in A} \left(\prod_{j=1}^n \sigma_j(a_j)\right) u_i(a).$

Indifference condition. In any mixed Nash equilibrium, a player must be indifferent among all actions in the support of their mixed strategy. This is the key computational handle: if player $i$ mixes over $\{a_i, a_i'\}$ , then $u_i(a_i, \sigma_{-i}) = u_i(a_i', \sigma_{-i})$ , which pins down the opponent's mixing probabilities.

Correlated Equilibrium

A correlated equilibrium generalizes Nash by allowing a mediator to send private, correlated recommendations to players. Formally, it is a joint distribution $\mu \in \Delta(A)$ such that for every player $i$ and every deviation $a_i' \neq a_i$ :

$\sum_{a_{-i}} \mu(a_i, a_{-i})\, u_i(a_i, a_{-i}) \geq \sum_{a_{-i}} \mu(a_i, a_{-i})\, u_i(a_i', a_{-i}).$

The set of correlated equilibria is a convex polytope (defined by linear constraints), making it easier to compute than Nash equilibria. Every Nash equilibrium induces a correlated equilibrium (take $\mu = \prod_i \sigma_i^*$ ), but the converse is false. Correlated equilibria can achieve higher social welfare than any Nash equilibrium.

Worked Example

Prisoner's Dilemma

Two suspects independently choose to Cooperate (C) or Defect (D). Payoffs (row, column):

	C	D
C	(3, 3)	(0, 5)
D	(5, 0)	(1, 1)

Dominant strategy. For the row player: if the column player plays C, D gives 5 > 3; if column plays D, D gives 1 > 0. Defect strictly dominates Cooperate — for both players by symmetry.

Nash equilibrium. IESDS eliminates C for both players. The unique Nash equilibrium is (D, D) with payoffs (1, 1).

Social optimum. The socially efficient outcome is (C, C) with total payoff 6 vs. 2 at Nash. The price of anarchy — ratio of social optimum to Nash payoff — is 6/2 = 3. Individual rationality destroys collective welfare.

Battle of the Sexes

Two players want to coordinate but have different preferences. Payoffs:

	Opera (O)	Football (F)
O	(2, 1)	(0, 0)
F	(0, 0)	(1, 2)

Pure Nash equilibria. (O, O): neither player wants to deviate (row loses 2, column loses 1). (F, F): symmetric. Both are Nash.

Mixed Nash. Let row play O with probability $p$ , column play O with probability $q$ .

Row's indifference: $u_{\text{row}}(O) = u_{\text{row}}(F)$

$2q + 0(1-q) = 0 \cdot q + 1(1-q) \implies 2q = 1 - q \implies q = \tfrac{1}{3}.$

Column's indifference: $u_{\text{col}}(O) = u_{\text{col}}(F)$

$1 \cdot p + 0(1-p) = 0 \cdot p + 2(1-p) \implies p = 2(1-p) \implies p = \tfrac{2}{3}.$

The mixed Nash is $(\sigma_1, \sigma_2) = (\tfrac{2}{3}\text{ O}, \tfrac{1}{3}\text{ O})$ with expected payoffs $(2/3, 2/3)$ — worse for both players than either pure Nash. This is typical: mixing is not coordination.

Connections

Where Your Intuition Breaks

The mixed Nash equilibrium in Battle of the Sexes — $(\frac{2}{3}\text{ Opera}, \frac{1}{3}\text{ Opera})$ — gives both players expected payoff $2/3$ , which is worse than either pure Nash ( $2$ or $1$ ). The counterintuitive lesson: mixing is not a coordination device. A player mixes not to confuse the opponent, but because the opponent's mixing has made the player genuinely indifferent — and the opponent's mixing probabilities are set by the first player's payoffs, not by any desire to coordinate. Mixed equilibria exist to make the other player indifferent, which means the mixing player's own payoff is determined entirely by the opponent's strategy. The mixer gets no benefit from mixing itself.

💡Intuition

Nash equilibrium is a fixed point of best-response correspondences. Nash (1950) proved existence for all finite games by applying Kakutani's theorem — the same fixed-point argument that proves Brouwer's theorem in analysis. The result is striking: no matter how complex the game, strategic behavior always has at least one self-consistent resting point.

Property	Nash Equilibrium	Correlated Equilibrium	Social Optimum
Existence	Always (mixed)	Always (linear program)	Always
Uniqueness	Not guaranteed	Not guaranteed	Not guaranteed
Computability	PPAD-complete	Polynomial (LP)	NP-hard (welfare max)
Social welfare	Can be poor	Better than Nash possible	Maximal

⚠️Warning

Finding a Nash equilibrium is PPAD-complete in general (Daskalakis, Goldberg, Papadimitriou, 2006) — there is no known polynomial-time algorithm, and the problem is believed to be computationally intractable even though a solution is guaranteed to exist. This asymmetry (existence without efficient computation) makes Nash equilibrium a curious object: useful as a solution concept but hard to operationalize in large games.

💡Intuition

The price of anarchy (Koutsoupias & Papadimitriou, 1999) measures how much social welfare is lost when rational players choose Nash equilibria rather than the social optimum. In network routing games the price of anarchy is at most 4/3 (Roughgarden & Tardos, 2002) — a tight bound that has influenced the design of real routing protocols. When the price of anarchy is low, market competition is nearly efficient; when it is high, regulation or mechanism design is needed.

Enjoying these notes?

Get new lessons delivered to your inbox. No spam.

Operations Research

Bridge: Dynamic Programming, Bellman Equations & Neural Architecture Search

Extensive Form Games, Backward Induction & Subgame Perfection