The Spectral Theorem & Symmetric Matrices
The Spectral Theorem is one of the deepest and most useful results in linear algebra: every real symmetric matrix admits an orthonormal eigenbasis. This transforms symmetric matrices from abstract linear maps into concrete geometric objects — ellipsoids, quadratic forms, and projection operators — with a complete and stable numerical theory. PCA, kernel methods, Gaussian processes, spectral graph theory, and the analysis of optimization landscapes all rest on this foundation.
Concepts
[Interactive figure — Quadratic Form: Level Set xᵀAx = 1. An adjustable symmetric matrix A is shown with its eigenvalues λ₁, λ₂, trace, determinant, and condition number κ(A). For A = [2, 0; 0, 2] the eigenvalues are equal, 1/√λᵢ = 0.707, and the level set is a circle — the identity up to scale.]

Dashed axes are the eigenvectors of A, with length 1/√λ — the semi-axes of the ellipse. Large eigenvalue = short axis (tight constraint). Small eigenvalue = long axis (loose constraint).

The Spectral Theorem says A = QΛQᵀ — the ellipse axes always align with eigenvectors, regardless of off-diagonal entries.
Every symmetric matrix is, at its core, a collection of stretches along orthogonal axes. The Spectral Theorem makes this geometric picture precise: for any real symmetric matrix, there always exists a rotation that transforms it into a pure diagonal scaling — an ellipse in 2D, an ellipsoid in higher dimensions. This is the foundational result behind PCA, kernel methods, and all second-order analysis of loss surfaces.
Statement of the Spectral Theorem
Theorem (Real Spectral Theorem). Let A ∈ ℝⁿˣⁿ be symmetric (A = Aᵀ). Then:
- All eigenvalues of A are real.
- There exists an orthogonal matrix Q (QᵀQ = I) and a real diagonal matrix Λ such that A = QΛQᵀ.
- The columns q₁, …, qₙ of Q form an orthonormal eigenbasis of ℝⁿ, with Aqᵢ = λᵢqᵢ.

The orthogonality of Q is not optional — it is a consequence of symmetry. For a non-symmetric matrix, eigenvectors need not be orthogonal and the eigenbasis can be numerically ill-conditioned. Symmetry forces the eigenvectors into perpendicular directions, the only basis that is both numerically stable and geometrically interpretable.
Proof sketch (induction on n). The key lemma is that for symmetric A, every eigenspace of A is A-invariant and its orthogonal complement is also A-invariant. One proceeds as follows:
- Base case n = 1: trivial.
- Inductive step: By the Fundamental Theorem of Algebra, A has at least one complex eigenvalue. By the reality of eigenvalues (proved in Lesson 3), it has a real eigenvalue λ₁ with a real unit eigenvector q₁.
- Let W = q₁⊥ (the (n − 1)-dimensional orthogonal complement). For any w ∈ W: q₁ᵀ(Aw) = (Aq₁)ᵀw = λ₁q₁ᵀw = 0. So A maps W into W — W is A-invariant.
- The restriction A|_W is symmetric. By induction, it has an orthonormal eigenbasis q₂, …, qₙ for W.
- Together, q₁, q₂, …, qₙ is an orthonormal eigenbasis for ℝⁿ.
Outer-Product (Spectral) Decomposition
Since A = QΛQᵀ and Q is orthogonal:

A = Σᵢ₌₁ⁿ λᵢ qᵢqᵢᵀ

Each term qᵢqᵢᵀ is a rank-1 orthogonal projection onto the i-th eigendirection. The entire matrix is a weighted sum of projectors, with eigenvalues as weights.
This decomposition is practically useful for:
- Low-rank approximation: truncate to the top k terms: Aₖ = Σᵢ₌₁ᵏ λᵢ qᵢqᵢᵀ. By the Eckart–Young theorem, Aₖ is the best rank-k approximation to A in both the spectral and Frobenius norms.
- Matrix functions: f(A) = Σᵢ f(λᵢ) qᵢqᵢᵀ for any function f defined on the eigenvalues (e.g., f(λ) = e^λ gives the matrix exponential, f(λ) = √λ gives the matrix square root).
- Pseudoinverse: A⁺ = Σ_{λᵢ ≠ 0} (1/λᵢ) qᵢqᵢᵀ (invert only the nonzero eigenvalues).
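These identities are easy to check numerically. A minimal NumPy sketch — the matrix A below is an arbitrary symmetric example, not one from the lesson:

```python
import numpy as np

# Hypothetical symmetric example; any A = A.T works.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
lam, Q = np.linalg.eigh(A)

# Outer-product form: A = sum_i lam_i * q_i q_i^T.
A_rebuilt = sum(l * np.outer(q, q) for l, q in zip(lam, Q.T))

# Rank-1 truncation: keep the term with the largest |eigenvalue|.
i = int(np.argmax(np.abs(lam)))
A1 = lam[i] * np.outer(Q[:, i], Q[:, i])

# Matrix square root via f(A) = sum_i f(lam_i) q_i q_i^T (A is PSD here).
sqrtA = (Q * np.sqrt(lam)) @ Q.T
```

`Q * np.sqrt(lam)` scales each eigenvector column by √λᵢ, so the final product is exactly Σᵢ √λᵢ qᵢqᵢᵀ.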
Quadratic Forms
The quadratic form associated with a symmetric matrix A is the scalar function

q_A(x) = xᵀAx = Σᵢ,ⱼ Aᵢⱼ xᵢxⱼ

Using the spectral decomposition A = QΛQᵀ and substituting y = Qᵀx (a rotation):

xᵀAx = yᵀΛy = Σᵢ λᵢ yᵢ²
In the eigenbasis, the quadratic form is purely diagonal — no cross terms. This is the algebraic statement of the Principal Axis Theorem: there always exists a rotation of coordinates that eliminates cross terms in a quadratic form.
Level sets {x : xᵀAx = c} for positive definite A and c > 0 are ellipsoids. In the eigenbasis, the ellipsoid has a semi-axis of length √(c/λᵢ) along the i-th eigenvector direction. Large eigenvalue = short axis (tight constraint). Small eigenvalue = long axis (loose constraint).
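A quick numerical sketch of this change of variables — the positive definite matrix here is an arbitrary example:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])          # hypothetical positive definite matrix
lam, Q = np.linalg.eigh(A)

# In the eigenbasis y = Q^T x, the quadratic form has no cross terms.
rng = np.random.default_rng(0)
x = rng.standard_normal(2)
y = Q.T @ x
assert np.isclose(x @ A @ x, np.sum(lam * y**2))

# Semi-axes of the level set x^T A x = 1: one per eigendirection.
semi_axes = 1.0 / np.sqrt(lam)
```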
Classification of quadratic forms:
| Condition | Name | Level sets |
|---|---|---|
| All λᵢ > 0 | Positive definite | Ellipsoids (bounded, closed) |
| All λᵢ ≥ 0 | Positive semidefinite | Degenerate ellipsoids (unbounded in null directions) |
| All λᵢ ≤ 0 | Negative semidefinite | — |
| All λᵢ < 0 | Negative definite | — |
| Mixed signs | Indefinite | Hyperboloids (unbounded) |
Schur Decomposition
The Spectral Theorem requires A to be symmetric. For general square matrices, the best one can do is:

Theorem (Schur Decomposition). For any A ∈ ℂⁿˣⁿ, there exists a unitary matrix U (U*U = I) such that

A = UTU*

where T is upper triangular with the eigenvalues of A on the diagonal.

If A is real and has only real eigenvalues, then U can be chosen real orthogonal (the real Schur form). The Schur decomposition is the foundation of the QR algorithm for eigenvalue computation.
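SciPy exposes the real Schur form directly. A small sketch with an arbitrary non-symmetric matrix whose eigenvalues happen to be real (0 and 3):

```python
import numpy as np
from scipy.linalg import schur

# Hypothetical non-symmetric example with real eigenvalues 0 and 3.
A = np.array([[1.0, 4.0],
              [0.5, 2.0]])

T, U = schur(A)                        # real Schur form: A = U T U^T
assert np.allclose(U @ T @ U.T, A)     # U is orthogonal
assert np.allclose(U.T @ U, np.eye(2))
assert np.allclose(np.tril(T, -1), 0)  # T is upper triangular here
assert np.allclose(np.sort(np.diag(T)), [0.0, 3.0])
```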
Normal Matrices
A matrix A (real or complex) is normal if AA* = A*A. Normal matrices are precisely those that are unitarily diagonalizable: A = UΛU* with U unitary and Λ diagonal (possibly complex).
Examples of normal matrices:
| Type | Condition | Eigenvalues |
|---|---|---|
| Symmetric (real) | A = Aᵀ | Real |
| Hermitian (complex) | A = A* | Real |
| Skew-symmetric | Aᵀ = −A | Purely imaginary |
| Orthogonal | QᵀQ = I | Modulus 1 |
| Unitary | U*U = I | On the unit circle in ℂ |
Non-normal matrices (e.g., upper triangular with distinct diagonal entries) may not have orthogonal eigenvectors — their eigenbasis can be poorly conditioned, leading to transient growth of ‖Aᵏ‖ even when every eigenvalue satisfies |λᵢ| < 1.
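The transient-growth phenomenon is easy to demonstrate — a sketch with a made-up non-normal matrix of spectral radius 0.9:

```python
import numpy as np

# Non-normal: spectral radius 0.9 < 1, but a large off-diagonal entry.
W = np.array([[0.9, 5.0],
              [0.0, 0.9]])
assert np.max(np.abs(np.linalg.eigvals(W))) < 1.0

# ||W^k||_2 rises well above its starting value before geometric decay wins.
norms = [np.linalg.norm(np.linalg.matrix_power(W, k), 2) for k in range(1, 80)]
assert max(norms) > norms[0]     # transient growth
assert norms[-1] < 1.0           # eventual decay
```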
The Frobenius Norm and Eigenvalues
For symmetric A with eigenvalues λ₁, …, λₙ:

‖A‖²_F = Σᵢ,ⱼ Aᵢⱼ² = tr(AᵀA) = Σᵢ λᵢ²

More generally, tr(Aᵏ) = Σᵢ λᵢᵏ for any positive integer k. The trace of matrix powers thus provides a sequence of invariants (Newton's power sums) that determine the eigenvalues via the characteristic polynomial.
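Both identities can be verified in a few lines — the symmetric matrix here is an arbitrary example:

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])       # hypothetical symmetric matrix
lam = np.linalg.eigvalsh(A)

# ||A||_F^2 = tr(A^T A) = sum_i lam_i^2
assert np.isclose(np.linalg.norm(A, 'fro')**2, np.sum(lam**2))

# tr(A^k) = sum_i lam_i^k  (Newton's power sums)
for k in range(1, 6):
    assert np.isclose(np.trace(np.linalg.matrix_power(A, k)), np.sum(lam**k))
```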
Simultaneous Diagonalizability
Theorem. Two symmetric matrices A and B are simultaneously orthogonally diagonalizable (i.e., there exists an orthogonal Q such that both QᵀAQ and QᵀBQ are diagonal) if and only if AB = BA.
This is important in statistics: a sample covariance matrix and a population covariance matrix are simultaneously diagonalizable if and only if they commute — a condition that arises in hypothesis testing (the Fisher discriminant, Hotelling's T²).
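A numerical sanity check: any polynomial in A commutes with A, and the commuting pair then shares an orthogonal eigenbasis (a sketch; the eigenvalues here are simple, so the shared basis is forced):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
B = A @ A - 3.0 * A                  # polynomial in A, so AB = BA
assert np.allclose(A @ B, B @ A)

# The orthogonal eigenbasis of A diagonalizes B as well.
_, Q = np.linalg.eigh(A)
D = Q.T @ B @ Q
assert np.allclose(D, np.diag(np.diag(D)))
```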
Worked Example
Example 1: Spectral Decomposition and Low-Rank Approximation
Let A be the symmetric matrix analyzed in Lesson 3, whose eigenvalues λ₁ > λ₂ > 0 and orthonormal eigenvectors q₁, q₂ were computed there.

Outer-product form:

A = λ₁ q₁q₁ᵀ + λ₂ q₂q₂ᵀ

Verify: summing the two weighted projectors reproduces A. ✓

Rank-1 approximation (keep only the top term): A₁ = λ₁ q₁q₁ᵀ. This captures λ₁²/(λ₁² + λ₂²) of the spectral energy (of the Frobenius norm squared ‖A‖²_F = λ₁² + λ₂², the rank-1 term contributes λ₁²).

Matrix square root: √A = √λ₁ q₁q₁ᵀ + √λ₂ q₂q₂ᵀ.
Example 2: Diagonalizing a Quadratic Form
Consider a quadratic form with a cross term, q(x₁, x₂) = ax₁² + 2bx₁x₂ + cx₂² with b ≠ 0. Write it in matrix form:

q(x) = xᵀAx,  A = [a, b; b, c]

Characteristic polynomial: λ² − (a + c)λ + (ac − b²) = 0.

Eigenvalues: λ₁ ≥ λ₂, both positive when a > 0 and ac > b². Semi-axes of the level set q(x) = 1: 1/√λ₁ and 1/√λ₂.

Change of coordinates: in the eigenbasis y = Qᵀx:

q = λ₁y₁² + λ₂y₂²

No cross term. The level set is an ellipse with semi-axes 1/√λ₁ (along q₁, the short axis) and 1/√λ₂ (along q₂, the long axis).
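For a concrete (hypothetical) instance, take q(x₁, x₂) = 5x₁² + 4x₁x₂ + 2x₂²:

```python
import numpy as np

# q(x) = x^T A x with the cross term 4 x1 x2 split symmetrically.
A = np.array([[5.0, 2.0],
              [2.0, 2.0]])
lam, Q = np.linalg.eigh(A)           # ascending order
assert np.allclose(lam, [1.0, 6.0])  # char. poly: l^2 - 7l + 6 = (l-1)(l-6)

# Level set x^T A x = 1: the large eigenvalue (6) gives the short
# semi-axis 1/sqrt(6), the small eigenvalue (1) the long semi-axis 1.
semi_axes = 1.0 / np.sqrt(lam)
```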
Example 3: Kernel Matrix Decomposition
A symmetric positive semidefinite matrix K ∈ ℝⁿˣⁿ with Kᵢⱼ = k(xᵢ, xⱼ) (a kernel matrix / Gram matrix) has spectral decomposition K = Σᵢ λᵢ qᵢqᵢᵀ with all λᵢ ≥ 0.

The Nyström approximation uses the top k eigenpairs to approximate the full kernel:

K ≈ Σᵢ₌₁ᵏ λᵢ qᵢqᵢᵀ

reducing storage from O(n²) to O(nk) and matrix–vector products from O(n²) to O(nk).
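A sketch of the rank-k truncation on a small RBF kernel matrix (the kernel, bandwidth, and data are made up for illustration). Note the Frobenius error of the truncation equals the energy in the discarded eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))                 # hypothetical inputs
sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
K = np.exp(-0.5 * sq)                             # RBF Gram matrix: symmetric PSD

lam, Q = np.linalg.eigh(K)                        # ascending eigenvalues
k = 20
Qk, lamk = Q[:, -k:], lam[-k:]                    # top-k eigenpairs
Kk = (Qk * lamk) @ Qk.T                           # rank-k approximation

# Truncation error = sqrt(sum of squared discarded eigenvalues).
err = np.linalg.norm(K - Kk, 'fro')
assert np.isclose(err, np.sqrt(np.sum(lam[:-k]**2)))
```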
Mercer's Theorem guarantees that any continuous, symmetric, positive semi-definite kernel on a compact space admits a spectral expansion — the infinite-dimensional analogue of the finite spectral theorem.
Connections
Where Your Intuition Breaks
It is tempting to assume that any matrix can be diagonalized by an orthogonal transformation. In fact, only symmetric (and more generally, normal) matrices have orthogonal eigenbases. For a general non-symmetric matrix, the Jordan normal form is the best available decomposition — and Jordan blocks produce polynomial growth in matrix powers even when the spectral radius is below 1. A recurrent weight matrix W with ρ(W) < 1 is stable in theory, but if it has Jordan blocks, the matrix powers can grow like kᵐ⁻¹ρ(W)ᵏ (m the block size) before eventually decaying — this is precisely the transient growth that makes training certain RNN architectures numerically treacherous. The Spectral Theorem's guarantee of orthogonal eigenvectors is not universal; it is a special property of symmetric matrices.
Why Symmetry Is Not Just a Convenience
Symmetric matrices appear naturally wherever an inner product appears:
- Covariance matrices E[(x − μ)(x − μ)ᵀ] — by construction symmetric PSD
- Gram matrices Gᵢⱼ = ⟨xᵢ, xⱼ⟩ — by symmetry of the inner product
- Graph Laplacians L = D − A, where A is a symmetric adjacency matrix — symmetric PSD
- Hessians ∇²f — symmetric by Clairaut's theorem (for twice continuously differentiable f)
- Fisher information matrices — symmetric PSD
In each case, the Spectral Theorem guarantees real eigenvalues, orthogonal eigenvectors, and a stable eigendecomposition — properties that do not hold for general matrices.
Quadratic forms are the geometry of second-order information. The Hessian H = ∇²f at a critical point is symmetric. Its quadratic form dᵀHd measures the curvature of f in direction d. Eigenvectors of H are the directions of pure curvature; eigenvalues measure the curvature magnitude. This is why the eigenspectrum of the Hessian controls the convergence of gradient descent — and why ill-conditioned Hessians (large κ = λ_max/λ_min) make optimization slow.
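The dependence on conditioning can be seen on a toy quadratic f(x) = ½xᵀHx — a sketch with a made-up diagonal Hessian of condition number κ = 100:

```python
import numpy as np

H = np.diag([1.0, 100.0])            # eigenvalues 1 and 100, kappa = 100
eta = 2.0 / (1.0 + 100.0)            # classical optimal step: 2/(l_min + l_max)

x = np.array([1.0, 1.0])
for _ in range(500):
    x = x - eta * (H @ x)            # grad f(x) = H x

# Per-step contraction factor is (kappa - 1)/(kappa + 1) ~ 0.98: slow.
assert np.linalg.norm(x) < 1e-3
```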
Outer-product decomposition = interpretable features. In PCA, the covariance matrix decomposes as Σᵢ λᵢ qᵢqᵢᵀ. Each term is a rank-1 "feature" — the i-th principal component captures the fraction λᵢ/Σⱼλⱼ of total variance. Plotting the qᵢ (e.g., eigenfaces in face recognition, eigenmodes in physics) reveals the dominant patterns in the data. The outer-product form makes this decomposition transparent.
Numerical computation: use eigh, not eig. For symmetric/Hermitian matrices, use numpy.linalg.eigh (or scipy.linalg.eigh) rather than the general eig. The eigh routine exploits symmetry (roughly halving the work), guarantees real output, returns eigenvalues sorted, and uses more numerically stable algorithms (divide-and-conquer or MRRR). Using eig on a symmetric matrix wastes computation and may return slightly complex eigenvalues due to floating-point artifacts.
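A sketch comparing the two routines on a random symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((50, 50))
A = (M + M.T) / 2.0                   # exactly symmetric

lam, Q = np.linalg.eigh(A)            # real dtype, ascending, orthonormal Q
assert lam.dtype == np.float64
assert np.all(np.diff(lam) >= 0)
assert np.allclose(Q.T @ Q, np.eye(50))

w, _ = np.linalg.eig(A)               # general routine: none of those guarantees
assert np.allclose(np.sort(w.real), lam)
```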
Stability and Perturbation: Weyl's Theorem
Theorem (Weyl). Let A and B be symmetric with eigenvalues λ₁ ≥ … ≥ λₙ and μ₁ ≥ … ≥ μₙ respectively. Then:

|λᵢ − μᵢ| ≤ ‖A − B‖₂ for all i.

Eigenvalues of symmetric matrices are Lipschitz stable under perturbation — a perturbation of spectral norm ε causes at most a change of ε in each eigenvalue. This is in stark contrast to non-symmetric matrices, where eigenvalues can be extremely sensitive to perturbations.
This stability is why numerical eigenvalue algorithms for symmetric matrices (like the symmetric QR algorithm) achieve backward stability — the computed eigenvalues are the exact eigenvalues of a nearby matrix.
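Weyl's bound is easy to verify empirically — a random symmetric matrix plus a small symmetric perturbation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
M = rng.standard_normal((n, n)); A = (M + M.T) / 2.0
P = rng.standard_normal((n, n)); E = 1e-3 * (P + P.T) / 2.0

lam_A = np.linalg.eigvalsh(A)         # ascending in both cases, so the
lam_B = np.linalg.eigvalsh(A + E)     # pairing matches Weyl's ordering

# Each eigenvalue moves by at most the spectral norm of the perturbation.
assert np.max(np.abs(lam_A - lam_B)) <= np.linalg.norm(E, 2) + 1e-12
```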
Comparing Spectral Decomposition to Other Decompositions
| Decomposition | Form | Requires | Gives |
|---|---|---|---|
| Spectral (A symmetric) | A = QΛQᵀ | A = Aᵀ | Orthogonal eigenbasis, real Λ |
| Eigendecomposition (general) | A = VΛV⁻¹ | A diagonalizable | Eigenbasis (may not be orthogonal) |
| Schur | A = UTU* | Any square A | Orthonormal basis, triangular T |
| SVD | A = UΣVᵀ | Any A (even non-square) | Two orthonormal bases, non-neg Σ |
| QR | A = QR | Any A | Orthogonal Q, upper triangular R |
For symmetric PSD matrices, the spectral decomposition and the SVD coincide: singular values equal eigenvalues, left and right singular vectors are both the eigenvectors.
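A quick check of this coincidence on a random PSD example:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 3))
A = M @ M.T                             # symmetric PSD (rank <= 3)

lam = np.linalg.eigvalsh(A)[::-1]       # eigenvalues, descending
s = np.linalg.svd(A, compute_uv=False)  # singular values, descending
assert np.allclose(lam, s)
```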