Neural-Path/Notes
40 min

Normal Form Games, Nash Equilibria & Mixed Strategies

Strategic interaction arises whenever one agent's payoff depends on the actions of others — and that describes nearly every interesting decision in economics, computer science, and AI. Normal form games give this situation a precise mathematical structure, and Nash equilibrium tells us what rational play looks like when all players reason simultaneously about each other. Understanding these foundations unlocks a surprisingly large territory: from auction design to adversarial training to multi-agent reinforcement learning.

Concepts

Classic 2×2 games. Green border = Nash equilibrium. Gold border = social optimum. Payoffs: (row, column).

Column player
CooperateDefect
Row playerCooperate3, 30, 5
Defect5, 01, 1
Nash ★
Unique Nash: (D, D) with payoffs (1,1). Social optimum: (C, C) with payoffs (3,3). Price of anarchy = 3.

Two firms setting prices, two countries choosing tariffs, a player and an opponent in a card game — in each case, what you should do depends on what the other party does, and what they do depends on what you do. Normal form games give this circular dependency a precise mathematical structure. The payoff matrix encodes all possible outcomes; Nash equilibrium identifies the strategy profiles where no player, acting alone, can do better by switching.

Normal Form: Players, Actions, Utilities

A normal form game (also called a strategic form game) is a triple (N,A,u)(N, A, u) where:

  • N={1,,n}N = \{1, \ldots, n\} is the finite set of players.
  • A=A1×A2××AnA = A_1 \times A_2 \times \cdots \times A_n is the joint action space; each AiA_i is player ii's finite set of pure actions.
  • u=(u1,,un)u = (u_1, \ldots, u_n) where ui:ARu_i : A \to \mathbb{R} is player ii's utility function.

A strategy profile a=(a1,,an)Aa = (a_1, \ldots, a_n) \in A specifies one action per player. Write aia_{-i} for the profile of all players except ii.

Dominant Strategies and IESDS

Action aia_i strictly dominates aia_i' for player ii if

ui(ai,ai)>ui(ai,ai)aiAi.u_i(a_i, a_{-i}) > u_i(a_i', a_{-i}) \quad \forall a_{-i} \in A_{-i}.

Rational players never play strictly dominated actions. Iterated Elimination of Strictly Dominated Strategies (IESDS) applies this repeatedly: after each round of elimination, newly dominated actions become eliminable. The surviving set is order-independent (Bernheim 1984). Weak dominance (\geq rather than >>) is trickier — IESDS with weak dominance is order-dependent.

Nash Equilibrium

A strategy profile aAa^* \in A is a Nash equilibrium if no player can profitably deviate unilaterally:

ui(ai,ai)ui(ai,ai)aiAi, iN.u_i(a_i^*, a_{-i}^*) \geq u_i(a_i, a_{-i}^*) \quad \forall a_i \in A_i,\ \forall i \in N.

Equivalently, each player's action is a best response to the others'. Nash equilibrium is a fixed point of the best-response correspondence: aBR(a)a^* \in BR(a^*) where BRi(ai)=argmaxaiui(ai,ai)BR_i(a_{-i}) = \arg\max_{a_i} u_i(a_i, a_{-i}).

Existence (Nash 1950). Every finite normal form game has at least one Nash equilibrium (possibly in mixed strategies). The proof uses Kakutani's fixed-point theorem: the best-response correspondence is upper hemicontinuous with convex values on the compact convex set of mixed strategy profiles, so it has a fixed point.

The fixed-point argument is the only proof that works here because Nash equilibrium is defined by a circular condition: each player's strategy is a best response to the others', which themselves depend on the first player's strategy. There is no constructive proof because finding Nash equilibria is PPAD-complete — existence is guaranteed, but no algorithm can find one efficiently in general. The theorem tells us a solution exists without telling us how to find it.

Mixed Strategies and Expected Utility

A mixed strategy for player ii is a probability distribution σiΔ(Ai)\sigma_i \in \Delta(A_i), the simplex over AiA_i. The support of σi\sigma_i is the set of actions assigned positive probability. Under mixed strategies, players maximize expected utility:

ui(σ)=aA(j=1nσj(aj))ui(a).u_i(\sigma) = \sum_{a \in A} \left(\prod_{j=1}^n \sigma_j(a_j)\right) u_i(a).

Indifference condition. In any mixed Nash equilibrium, a player must be indifferent among all actions in the support of their mixed strategy. This is the key computational handle: if player ii mixes over {ai,ai}\{a_i, a_i'\}, then ui(ai,σi)=ui(ai,σi)u_i(a_i, \sigma_{-i}) = u_i(a_i', \sigma_{-i}), which pins down the opponent's mixing probabilities.

Correlated Equilibrium

A correlated equilibrium generalizes Nash by allowing a mediator to send private, correlated recommendations to players. Formally, it is a joint distribution μΔ(A)\mu \in \Delta(A) such that for every player ii and every deviation aiaia_i' \neq a_i:

aiμ(ai,ai)ui(ai,ai)aiμ(ai,ai)ui(ai,ai).\sum_{a_{-i}} \mu(a_i, a_{-i})\, u_i(a_i, a_{-i}) \geq \sum_{a_{-i}} \mu(a_i, a_{-i})\, u_i(a_i', a_{-i}).

The set of correlated equilibria is a convex polytope (defined by linear constraints), making it easier to compute than Nash equilibria. Every Nash equilibrium induces a correlated equilibrium (take μ=iσi\mu = \prod_i \sigma_i^*), but the converse is false. Correlated equilibria can achieve higher social welfare than any Nash equilibrium.

Worked Example

Prisoner's Dilemma

Two suspects independently choose to Cooperate (C) or Defect (D). Payoffs (row, column):

CD
C(3, 3)(0, 5)
D(5, 0)(1, 1)

Dominant strategy. For the row player: if the column player plays C, D gives 5 > 3; if column plays D, D gives 1 > 0. Defect strictly dominates Cooperate — for both players by symmetry.

Nash equilibrium. IESDS eliminates C for both players. The unique Nash equilibrium is (D, D) with payoffs (1, 1).

Social optimum. The socially efficient outcome is (C, C) with total payoff 6 vs. 2 at Nash. The price of anarchy — ratio of social optimum to Nash payoff — is 6/2 = 3. Individual rationality destroys collective welfare.

Battle of the Sexes

Two players want to coordinate but have different preferences. Payoffs:

Opera (O)Football (F)
O(2, 1)(0, 0)
F(0, 0)(1, 2)

Pure Nash equilibria. (O, O): neither player wants to deviate (row loses 2, column loses 1). (F, F): symmetric. Both are Nash.

Mixed Nash. Let row play O with probability pp, column play O with probability qq.

Row's indifference: urow(O)=urow(F)u_{\text{row}}(O) = u_{\text{row}}(F)

2q+0(1q)=0q+1(1q)    2q=1q    q=13.2q + 0(1-q) = 0 \cdot q + 1(1-q) \implies 2q = 1 - q \implies q = \tfrac{1}{3}.

Column's indifference: ucol(O)=ucol(F)u_{\text{col}}(O) = u_{\text{col}}(F)

1p+0(1p)=0p+2(1p)    p=2(1p)    p=23.1 \cdot p + 0(1-p) = 0 \cdot p + 2(1-p) \implies p = 2(1-p) \implies p = \tfrac{2}{3}.

The mixed Nash is (σ1,σ2)=(23 O,13 O)(\sigma_1, \sigma_2) = (\tfrac{2}{3}\text{ O}, \tfrac{1}{3}\text{ O}) with expected payoffs (2/3,2/3)(2/3, 2/3) — worse for both players than either pure Nash. This is typical: mixing is not coordination.

Connections

Where Your Intuition Breaks

The mixed Nash equilibrium in Battle of the Sexes — (23 Opera,13 Opera)(\frac{2}{3}\text{ Opera}, \frac{1}{3}\text{ Opera}) — gives both players expected payoff 2/32/3, which is worse than either pure Nash (22 or 11). The counterintuitive lesson: mixing is not a coordination device. A player mixes not to confuse the opponent, but because the opponent's mixing has made the player genuinely indifferent — and the opponent's mixing probabilities are set by the first player's payoffs, not by any desire to coordinate. Mixed equilibria exist to make the other player indifferent, which means the mixing player's own payoff is determined entirely by the opponent's strategy. The mixer gets no benefit from mixing itself.

💡Intuition

Nash equilibrium is a fixed point of best-response correspondences. Nash (1950) proved existence for all finite games by applying Kakutani's theorem — the same fixed-point argument that proves Brouwer's theorem in analysis. The result is striking: no matter how complex the game, strategic behavior always has at least one self-consistent resting point.

PropertyNash EquilibriumCorrelated EquilibriumSocial Optimum
ExistenceAlways (mixed)Always (linear program)Always
UniquenessNot guaranteedNot guaranteedNot guaranteed
ComputabilityPPAD-completePolynomial (LP)NP-hard (welfare max)
Social welfareCan be poorBetter than Nash possibleMaximal
⚠️Warning

Finding a Nash equilibrium is PPAD-complete in general (Daskalakis, Goldberg, Papadimitriou, 2006) — there is no known polynomial-time algorithm, and the problem is believed to be computationally intractable even though a solution is guaranteed to exist. This asymmetry (existence without efficient computation) makes Nash equilibrium a curious object: useful as a solution concept but hard to operationalize in large games.

💡Intuition

The price of anarchy (Koutsoupias & Papadimitriou, 1999) measures how much social welfare is lost when rational players choose Nash equilibria rather than the social optimum. In network routing games the price of anarchy is at most 4/3 (Roughgarden & Tardos, 2002) — a tight bound that has influenced the design of real routing protocols. When the price of anarchy is low, market competition is nearly efficient; when it is high, regulation or mechanism design is needed.

Enjoying these notes?

Get new lessons delivered to your inbox. No spam.