Graph Fundamentals: Connectivity, Trees, Planarity & Colorings

A graph is a set of vertices and edges — connectivity, trees, planarity, and colorability are the foundational properties that determine what computations are feasible on graph-structured data. Understanding these properties rigorously is the prerequisite for spectral methods, random graphs, and graph neural networks.

Concepts

A graph $G = (V, E)$ with $|V| = 6$ vertices and $|E| = 7$ edges. Colors show connected components. Toggle views.

Hover a vertex to see its degree. 1 connected component.

A graph is the mathematical model for any system of objects and relationships — social networks, road maps, molecular bonds, dependency graphs in software. The same three questions recur in all of them: which objects are reachable from a given starting point, how can the graph be partitioned into coherent subsets, and what is the minimum-cost structure that preserves connectivity?

Graphs and Their Representations

A graph $G = (V, E)$ consists of a vertex set $V$ with $|V| = n$ and an edge set $E \subseteq V \times V$ with $|E| = m$ . In an undirected graph, edges are unordered pairs $\{u, v\}$ ; in a directed graph (digraph), edges are ordered pairs $(u, v)$ .

The degree of vertex $v$ is $\deg(v) = |\{u : \{u,v\} \in E\}|$ . The handshaking lemma states:

$\sum_{v \in V} \deg(v) = 2m$

since each edge contributes to two degrees. The degree sequence lists degrees in non-increasing order. For directed graphs, in-degree $\deg^-(v)$ and out-degree $\deg^+(v)$ are tracked separately.

The adjacency matrix $A \in \{0,1\}^{n \times n}$ has $A_{ij} = 1$ if $\{i,j\} \in E$ . For undirected graphs $A$ is symmetric. The degree matrix $D$ is diagonal with $D_{ii} = \deg(i)$ . Alternatively, the adjacency list representation stores $N(v) = \{u : \{u,v\} \in E\}$ for each vertex, using $O(n + m)$ space versus $O(n^2)$ for the matrix.

The adjacency list is more space-efficient, but the matrix representation is chosen because it exposes algebraic structure: $A^k_{ij}$ counts walks of length $k$ from $i$ to $j$ , the eigenvalues of $A$ reveal expansion and clustering, and matrix-vector products implement one step of message passing over the entire graph simultaneously. This is why the Laplacian $L = D - A$ — not an edge list — sits at the core of spectral clustering and graph neural networks.

Paths, Connectivity, and Components

A path from $u$ to $v$ is a sequence of distinct vertices $u = v_0, v_1, \ldots, v_k = v$ with $\{v_{i-1}, v_i\} \in E$ . A cycle is a closed path ( $v_0 = v_k$ , $k \geq 3$ ). A graph is connected if a path exists between every pair of vertices; otherwise the vertex set partitions into connected components.

A bridge is an edge whose removal disconnects the graph. An articulation point is a vertex whose removal disconnects the graph. Both can be found in $O(n + m)$ time via a single DFS using Tarjan's low-link values: define $\text{low}[v] = \min(\text{disc}[v], \min_u \text{low}[u])$ over DFS tree children and back-edge neighbors.

For directed graphs, strongly connected components (SCCs) are maximal subsets where every vertex reaches every other. Tarjan's SCC algorithm finds all SCCs in $O(n + m)$ using DFS with a stack and low-link tracking. The SCCs form a DAG (the condensation) whose topological sort gives a processing order for dynamic programming on graphs.

Trees and Spanning Trees

A tree is a connected acyclic graph. Any two of the following three conditions imply the third: (1) $G$ is connected, (2) $G$ is acyclic, (3) $|E| = |V| - 1$ . Trees have exactly $n - 1$ edges and a unique path between any two vertices.

A spanning tree of $G$ is a subgraph that is a tree containing all vertices. Spanning trees can be found by BFS or DFS in $O(n + m)$ ; minimum spanning trees use Kruskal's or Prim's algorithms in $O(m \log n)$ .

Cayley's formula counts labeled trees: the number of distinct labeled trees on $n$ vertices is $n^{n-2}$ . This follows from the Prüfer sequence bijection — every labeled tree corresponds uniquely to a sequence in $\{1, \ldots, n\}^{n-2}$ , and there are $n^{n-2}$ such sequences.

Graph Coloring

A proper $k$ -coloring assigns one of $k$ colors to each vertex so that no adjacent vertices share a color. The chromatic number $\chi(G)$ is the minimum $k$ for which a proper coloring exists.

Greedy coloring processes vertices in some order and assigns the smallest available color. This gives $\chi(G) \leq \Delta(G) + 1$ where $\Delta(G)$ is the maximum degree. Brooks' theorem strengthens this: for any connected graph that is neither a complete graph $K_n$ nor an odd cycle, $\chi(G) \leq \Delta(G)$ .

The four-color theorem (Appel and Haken, 1976) states every planar graph satisfies $\chi(G) \leq 4$ . Determining $\chi(G)$ is NP-complete for $k \geq 3$ ; even approximating it within a factor of $n^{1-\varepsilon}$ is NP-hard.

Matchings and Hall's Theorem

A matching is a set of edges with no shared vertices. A perfect matching covers every vertex. In bipartite graphs $G = (X \cup Y, E)$ , Hall's theorem characterizes when a perfect matching from $X$ into $Y$ exists: for every subset $S \subseteq X$ , the neighborhood $N(S) \subseteq Y$ satisfies $|N(S)| \geq |S|$ .

König's theorem states that in bipartite graphs, the size of the maximum matching equals the size of the minimum vertex cover (a set of vertices covering all edges). This is a rare case where a combinatorial min-max identity holds and can be computed in polynomial time.

Maximum matching is found via augmenting paths: a path from an unmatched vertex to another unmatched vertex alternating between unmatched and matched edges. Augmenting along such a path increases the matching size by 1. The Hopcroft-Karp algorithm finds a maximum bipartite matching in $O(m\sqrt{n})$ by finding all shortest augmenting paths simultaneously.

Worked Example

Connected components and spanning tree via BFS. Take the graph $V = \{1,2,3,4,5,6\}$ with edges $\{1,2\}, \{2,3\}, \{3,1\}, \{4,5\}$ , leaving vertex 6 isolated. BFS from vertex 1 visits $\{1,2,3\}$ using the queue: start with 1, enqueue neighbors 2 and 3, visit them — the BFS spanning tree has edges $\{1,2\}, \{1,3\}$ . Restarting BFS from vertex 4 discovers component $\{4,5\}$ with spanning edge $\{4,5\}$ . Vertex 6 forms a singleton component. Total: 3 connected components, 2 spanning tree edges (matching $|E_{\text{tree}}| = n_{\text{component}} - 1$ within each component).

Verifying Hall's condition and finding a maximum matching. Consider the bipartite graph with $X = \{a, b, c\}$ , $Y = \{1, 2, 3\}$ , and edges $a$ -1, $a$ -2, $b$ -2, $b$ -3, $c$ -3. Check Hall's condition: $N(\{a\}) = \{1,2\}$ , $|N| = 2 \geq 1$ ; $N(\{b,c\}) = \{2,3\}$ , $|N| = 2 \geq 2$ ; $N(\{a,b,c\}) = \{1,2,3\}$ , $|N| = 3 \geq 3$ . Hall's condition holds, so a perfect matching exists. Start with matching $M = \{a\text{-}1\}$ . Vertex $b$ is unmatched; augmenting path $b$ -2: extend to $M = \{a\text{-}1, b\text{-}2\}$ . Vertex $c$ is unmatched; try $c$ -3: augmenting path $c$ -3, giving $M = \{a\text{-}1, b\text{-}2, c\text{-}3\}$ — a perfect matching.

Chromatic number of the Petersen graph. The Petersen graph has 10 vertices, 15 edges, and is 3-regular ( $\Delta = 3$ ). Brooks' theorem gives $\chi \leq 3$ . Greedy coloring: begin with an outer 5-cycle $1$ -2-3-4-5-1. Color 1 with color A, 2 with B, 3 with A, 4 with B, 5 with C (forced since 5 is adjacent to both 1=A and 4=B). The inner pentagram vertices connect across the cycle; working through the adjacencies shows all three colors are needed and no 2-coloring is possible (the graph is not bipartite — it contains odd cycles). Thus $\chi(\text{Petersen}) = 3$ , matching Brooks' bound exactly.

Connections

Where Your Intuition Breaks

The four-color theorem guarantees planarity implies 4-colorability, and König's theorem guarantees bipartite matching is polynomial — but these two cases might give the wrong impression that existence results in combinatorics come with efficient algorithms. For general graphs, computing the chromatic number requires exponential time; 3-colorability of planar graphs is NP-complete. The tractability gap runs through graph theory: Hall's condition characterizes which bipartite graphs have perfect matchings in polynomial time, but determining whether a general graph has a Hamiltonian cycle — a superficially similar connectivity question — is NP-complete. The structure of the problem (bipartite vs. general, matching vs. Hamiltonicity) determines tractability in ways that are not predictable from the problem's surface appearance.

💡Intuition

The spectrum of the adjacency matrix $A$ encodes global graph properties. The largest eigenvalue satisfies $\lambda_1 \leq \Delta(G)$ , and for regular graphs $\lambda_1 = \Delta$ exactly. More importantly, the second smallest eigenvalue of the Laplacian $L = D - A$ satisfies $\lambda_2 > 0$ if and only if $G$ is connected — this algebraic connectivity (Fiedler value) measures how robustly connected the graph is. Spectral graph theory, developed in the next lesson, makes this the foundation for clustering and network analysis.

💡Intuition

Graph coloring is the prototypical NP-complete problem. Deciding whether $\chi(G) \leq 3$ (3-colorability) is NP-complete, making it one of Karp's original 21 NP-complete problems. This hardness propagates across combinatorics and theoretical computer science: register allocation in compilers, frequency assignment in wireless networks, and exam scheduling all reduce to graph coloring. Understanding that the problem is easy for special graph classes (bipartite graphs are 2-colorable, chordal graphs are greedy-colorable in polynomial time) while hard in general is a recurring theme in combinatorial optimization.

⚠️Warning

There is a fundamental gap between graph-theoretic guarantees and computational tractability. The four-color theorem guarantees every planar graph is 4-colorable, yet no efficient algorithm computes the chromatic number of a general graph — the problem remains NP-complete even for planar graphs (3-colorability of planar graphs is NP-complete). Similarly, Hall's theorem guarantees a perfect matching exists in bipartite graphs satisfying the Hall condition, but computing a maximum matching in general (non-bipartite) graphs requires Edmonds' blossom algorithm, which handles odd cycles through a significantly more complex augmentation strategy. The gap between existence proofs and efficient algorithms is a central concern throughout combinatorics.

Enjoying these notes?

Get new lessons delivered to your inbox. No spam.

Functional Analysis

Bridge: Kernel Methods, Attention as Operators & Neural Operators (FNO)

Spectral Graph Theory: Laplacians, Eigenvalues & the Cheeger Inequality