Normal Form Games, Nash Equilibria & Mixed Strategies
Strategic interaction arises whenever one agent's payoff depends on the actions of others — and that describes nearly every interesting decision in economics, computer science, and AI. Normal form games give this situation a precise mathematical structure, and Nash equilibrium tells us what rational play looks like when all players reason simultaneously about each other. Understanding these foundations unlocks a surprisingly large territory: from auction design to adversarial training to multi-agent reinforcement learning.
Concepts
Classic 2×2 games. Green border = Nash equilibrium. Gold border = social optimum. Payoffs: (row, column).
| Column player | |||
| Cooperate | Defect | ||
| Row player | Cooperate | 3, 3 | 0, 5 |
| Defect | 5, 0 | 1, 1 Nash ★ |
Two firms setting prices, two countries choosing tariffs, a player and an opponent in a card game — in each case, what you should do depends on what the other party does, and what they do depends on what you do. Normal form games give this circular dependency a precise mathematical structure. The payoff matrix encodes all possible outcomes; Nash equilibrium identifies the strategy profiles where no player, acting alone, can do better by switching.
Normal Form: Players, Actions, Utilities
A normal form game (also called a strategic form game) is a triple where:
- is the finite set of players.
- is the joint action space; each is player 's finite set of pure actions.
- where is player 's utility function.
A strategy profile specifies one action per player. Write for the profile of all players except .
Dominant Strategies and IESDS
Action strictly dominates for player if
Rational players never play strictly dominated actions. Iterated Elimination of Strictly Dominated Strategies (IESDS) applies this repeatedly: after each round of elimination, newly dominated actions become eliminable. The surviving set is order-independent (Bernheim 1984). Weak dominance ( rather than ) is trickier — IESDS with weak dominance is order-dependent.
Nash Equilibrium
A strategy profile is a Nash equilibrium if no player can profitably deviate unilaterally:
Equivalently, each player's action is a best response to the others'. Nash equilibrium is a fixed point of the best-response correspondence: where .
Existence (Nash 1950). Every finite normal form game has at least one Nash equilibrium (possibly in mixed strategies). The proof uses Kakutani's fixed-point theorem: the best-response correspondence is upper hemicontinuous with convex values on the compact convex set of mixed strategy profiles, so it has a fixed point.
The fixed-point argument is the only proof that works here because Nash equilibrium is defined by a circular condition: each player's strategy is a best response to the others', which themselves depend on the first player's strategy. There is no constructive proof because finding Nash equilibria is PPAD-complete — existence is guaranteed, but no algorithm can find one efficiently in general. The theorem tells us a solution exists without telling us how to find it.
Mixed Strategies and Expected Utility
A mixed strategy for player is a probability distribution , the simplex over . The support of is the set of actions assigned positive probability. Under mixed strategies, players maximize expected utility:
Indifference condition. In any mixed Nash equilibrium, a player must be indifferent among all actions in the support of their mixed strategy. This is the key computational handle: if player mixes over , then , which pins down the opponent's mixing probabilities.
Correlated Equilibrium
A correlated equilibrium generalizes Nash by allowing a mediator to send private, correlated recommendations to players. Formally, it is a joint distribution such that for every player and every deviation :
The set of correlated equilibria is a convex polytope (defined by linear constraints), making it easier to compute than Nash equilibria. Every Nash equilibrium induces a correlated equilibrium (take ), but the converse is false. Correlated equilibria can achieve higher social welfare than any Nash equilibrium.
Worked Example
Prisoner's Dilemma
Two suspects independently choose to Cooperate (C) or Defect (D). Payoffs (row, column):
| C | D | |
|---|---|---|
| C | (3, 3) | (0, 5) |
| D | (5, 0) | (1, 1) |
Dominant strategy. For the row player: if the column player plays C, D gives 5 > 3; if column plays D, D gives 1 > 0. Defect strictly dominates Cooperate — for both players by symmetry.
Nash equilibrium. IESDS eliminates C for both players. The unique Nash equilibrium is (D, D) with payoffs (1, 1).
Social optimum. The socially efficient outcome is (C, C) with total payoff 6 vs. 2 at Nash. The price of anarchy — ratio of social optimum to Nash payoff — is 6/2 = 3. Individual rationality destroys collective welfare.
Battle of the Sexes
Two players want to coordinate but have different preferences. Payoffs:
| Opera (O) | Football (F) | |
|---|---|---|
| O | (2, 1) | (0, 0) |
| F | (0, 0) | (1, 2) |
Pure Nash equilibria. (O, O): neither player wants to deviate (row loses 2, column loses 1). (F, F): symmetric. Both are Nash.
Mixed Nash. Let row play O with probability , column play O with probability .
Row's indifference:
Column's indifference:
The mixed Nash is with expected payoffs — worse for both players than either pure Nash. This is typical: mixing is not coordination.
Connections
Where Your Intuition Breaks
The mixed Nash equilibrium in Battle of the Sexes — — gives both players expected payoff , which is worse than either pure Nash ( or ). The counterintuitive lesson: mixing is not a coordination device. A player mixes not to confuse the opponent, but because the opponent's mixing has made the player genuinely indifferent — and the opponent's mixing probabilities are set by the first player's payoffs, not by any desire to coordinate. Mixed equilibria exist to make the other player indifferent, which means the mixing player's own payoff is determined entirely by the opponent's strategy. The mixer gets no benefit from mixing itself.
Nash equilibrium is a fixed point of best-response correspondences. Nash (1950) proved existence for all finite games by applying Kakutani's theorem — the same fixed-point argument that proves Brouwer's theorem in analysis. The result is striking: no matter how complex the game, strategic behavior always has at least one self-consistent resting point.
| Property | Nash Equilibrium | Correlated Equilibrium | Social Optimum |
|---|---|---|---|
| Existence | Always (mixed) | Always (linear program) | Always |
| Uniqueness | Not guaranteed | Not guaranteed | Not guaranteed |
| Computability | PPAD-complete | Polynomial (LP) | NP-hard (welfare max) |
| Social welfare | Can be poor | Better than Nash possible | Maximal |
Finding a Nash equilibrium is PPAD-complete in general (Daskalakis, Goldberg, Papadimitriou, 2006) — there is no known polynomial-time algorithm, and the problem is believed to be computationally intractable even though a solution is guaranteed to exist. This asymmetry (existence without efficient computation) makes Nash equilibrium a curious object: useful as a solution concept but hard to operationalize in large games.
The price of anarchy (Koutsoupias & Papadimitriou, 1999) measures how much social welfare is lost when rational players choose Nash equilibria rather than the social optimum. In network routing games the price of anarchy is at most 4/3 (Roughgarden & Tardos, 2002) — a tight bound that has influenced the design of real routing protocols. When the price of anarchy is low, market competition is nearly efficient; when it is high, regulation or mechanism design is needed.
Enjoying these notes?
Get new lessons delivered to your inbox. No spam.