Neural-Path/Notes
40 min

Graph Fundamentals: Connectivity, Trees, Planarity & Colorings

A graph is a set of vertices and edges — connectivity, trees, planarity, and colorability are the foundational properties that determine what computations are feasible on graph-structured data. Understanding these properties rigorously is the prerequisite for spectral methods, random graphs, and graph neural networks.

Concepts

A graph $G = (V, E)$ with $|V| = 6$ vertices and $|E| = 7$ edges. Colors show connected components. Toggle views.

123456Handshaking: Σdeg = 2|E| = 14. Components: 1.
Hover a vertex to see its degree. 1 connected component.

A graph is the mathematical model for any system of objects and relationships — social networks, road maps, molecular bonds, dependency graphs in software. The same three questions recur in all of them: which objects are reachable from a given starting point, how can the graph be partitioned into coherent subsets, and what is the minimum-cost structure that preserves connectivity?

Graphs and Their Representations

A graph G=(V,E)G = (V, E) consists of a vertex set VV with V=n|V| = n and an edge set EV×VE \subseteq V \times V with E=m|E| = m. In an undirected graph, edges are unordered pairs {u,v}\{u, v\}; in a directed graph (digraph), edges are ordered pairs (u,v)(u, v).

The degree of vertex vv is deg(v)={u:{u,v}E}\deg(v) = |\{u : \{u,v\} \in E\}|. The handshaking lemma states:

vVdeg(v)=2m\sum_{v \in V} \deg(v) = 2m

since each edge contributes to two degrees. The degree sequence lists degrees in non-increasing order. For directed graphs, in-degree deg(v)\deg^-(v) and out-degree deg+(v)\deg^+(v) are tracked separately.

The adjacency matrix A{0,1}n×nA \in \{0,1\}^{n \times n} has Aij=1A_{ij} = 1 if {i,j}E\{i,j\} \in E. For undirected graphs AA is symmetric. The degree matrix DD is diagonal with Dii=deg(i)D_{ii} = \deg(i). Alternatively, the adjacency list representation stores N(v)={u:{u,v}E}N(v) = \{u : \{u,v\} \in E\} for each vertex, using O(n+m)O(n + m) space versus O(n2)O(n^2) for the matrix.

The adjacency list is more space-efficient, but the matrix representation is chosen because it exposes algebraic structure: AijkA^k_{ij} counts walks of length kk from ii to jj, the eigenvalues of AA reveal expansion and clustering, and matrix-vector products implement one step of message passing over the entire graph simultaneously. This is why the Laplacian L=DAL = D - A — not an edge list — sits at the core of spectral clustering and graph neural networks.

Paths, Connectivity, and Components

A path from uu to vv is a sequence of distinct vertices u=v0,v1,,vk=vu = v_0, v_1, \ldots, v_k = v with {vi1,vi}E\{v_{i-1}, v_i\} \in E. A cycle is a closed path (v0=vkv_0 = v_k, k3k \geq 3). A graph is connected if a path exists between every pair of vertices; otherwise the vertex set partitions into connected components.

A bridge is an edge whose removal disconnects the graph. An articulation point is a vertex whose removal disconnects the graph. Both can be found in O(n+m)O(n + m) time via a single DFS using Tarjan's low-link values: define low[v]=min(disc[v],minulow[u])\text{low}[v] = \min(\text{disc}[v], \min_u \text{low}[u]) over DFS tree children and back-edge neighbors.

For directed graphs, strongly connected components (SCCs) are maximal subsets where every vertex reaches every other. Tarjan's SCC algorithm finds all SCCs in O(n+m)O(n + m) using DFS with a stack and low-link tracking. The SCCs form a DAG (the condensation) whose topological sort gives a processing order for dynamic programming on graphs.

Trees and Spanning Trees

A tree is a connected acyclic graph. Any two of the following three conditions imply the third: (1) GG is connected, (2) GG is acyclic, (3) E=V1|E| = |V| - 1. Trees have exactly n1n - 1 edges and a unique path between any two vertices.

A spanning tree of GG is a subgraph that is a tree containing all vertices. Spanning trees can be found by BFS or DFS in O(n+m)O(n + m); minimum spanning trees use Kruskal's or Prim's algorithms in O(mlogn)O(m \log n).

Cayley's formula counts labeled trees: the number of distinct labeled trees on nn vertices is nn2n^{n-2}. This follows from the Prüfer sequence bijection — every labeled tree corresponds uniquely to a sequence in {1,,n}n2\{1, \ldots, n\}^{n-2}, and there are nn2n^{n-2} such sequences.

Graph Coloring

A proper kk-coloring assigns one of kk colors to each vertex so that no adjacent vertices share a color. The chromatic number χ(G)\chi(G) is the minimum kk for which a proper coloring exists.

Greedy coloring processes vertices in some order and assigns the smallest available color. This gives χ(G)Δ(G)+1\chi(G) \leq \Delta(G) + 1 where Δ(G)\Delta(G) is the maximum degree. Brooks' theorem strengthens this: for any connected graph that is neither a complete graph KnK_n nor an odd cycle, χ(G)Δ(G)\chi(G) \leq \Delta(G).

The four-color theorem (Appel and Haken, 1976) states every planar graph satisfies χ(G)4\chi(G) \leq 4. Determining χ(G)\chi(G) is NP-complete for k3k \geq 3; even approximating it within a factor of n1εn^{1-\varepsilon} is NP-hard.

Matchings and Hall's Theorem

A matching is a set of edges with no shared vertices. A perfect matching covers every vertex. In bipartite graphs G=(XY,E)G = (X \cup Y, E), Hall's theorem characterizes when a perfect matching from XX into YY exists: for every subset SXS \subseteq X, the neighborhood N(S)YN(S) \subseteq Y satisfies N(S)S|N(S)| \geq |S|.

König's theorem states that in bipartite graphs, the size of the maximum matching equals the size of the minimum vertex cover (a set of vertices covering all edges). This is a rare case where a combinatorial min-max identity holds and can be computed in polynomial time.

Maximum matching is found via augmenting paths: a path from an unmatched vertex to another unmatched vertex alternating between unmatched and matched edges. Augmenting along such a path increases the matching size by 1. The Hopcroft-Karp algorithm finds a maximum bipartite matching in O(mn)O(m\sqrt{n}) by finding all shortest augmenting paths simultaneously.

Worked Example

Connected components and spanning tree via BFS. Take the graph V={1,2,3,4,5,6}V = \{1,2,3,4,5,6\} with edges {1,2},{2,3},{3,1},{4,5}\{1,2\}, \{2,3\}, \{3,1\}, \{4,5\}, leaving vertex 6 isolated. BFS from vertex 1 visits {1,2,3}\{1,2,3\} using the queue: start with 1, enqueue neighbors 2 and 3, visit them — the BFS spanning tree has edges {1,2},{1,3}\{1,2\}, \{1,3\}. Restarting BFS from vertex 4 discovers component {4,5}\{4,5\} with spanning edge {4,5}\{4,5\}. Vertex 6 forms a singleton component. Total: 3 connected components, 2 spanning tree edges (matching Etree=ncomponent1|E_{\text{tree}}| = n_{\text{component}} - 1 within each component).

Verifying Hall's condition and finding a maximum matching. Consider the bipartite graph with X={a,b,c}X = \{a, b, c\}, Y={1,2,3}Y = \{1, 2, 3\}, and edges aa-1, aa-2, bb-2, bb-3, cc-3. Check Hall's condition: N({a})={1,2}N(\{a\}) = \{1,2\}, N=21|N| = 2 \geq 1; N({b,c})={2,3}N(\{b,c\}) = \{2,3\}, N=22|N| = 2 \geq 2; N({a,b,c})={1,2,3}N(\{a,b,c\}) = \{1,2,3\}, N=33|N| = 3 \geq 3. Hall's condition holds, so a perfect matching exists. Start with matching M={a-1}M = \{a\text{-}1\}. Vertex bb is unmatched; augmenting path bb-2: extend to M={a-1,b-2}M = \{a\text{-}1, b\text{-}2\}. Vertex cc is unmatched; try cc-3: augmenting path cc-3, giving M={a-1,b-2,c-3}M = \{a\text{-}1, b\text{-}2, c\text{-}3\} — a perfect matching.

Chromatic number of the Petersen graph. The Petersen graph has 10 vertices, 15 edges, and is 3-regular (Δ=3\Delta = 3). Brooks' theorem gives χ3\chi \leq 3. Greedy coloring: begin with an outer 5-cycle 11-2-3-4-5-1. Color 1 with color A, 2 with B, 3 with A, 4 with B, 5 with C (forced since 5 is adjacent to both 1=A and 4=B). The inner pentagram vertices connect across the cycle; working through the adjacencies shows all three colors are needed and no 2-coloring is possible (the graph is not bipartite — it contains odd cycles). Thus χ(Petersen)=3\chi(\text{Petersen}) = 3, matching Brooks' bound exactly.

Connections

Where Your Intuition Breaks

The four-color theorem guarantees planarity implies 4-colorability, and König's theorem guarantees bipartite matching is polynomial — but these two cases might give the wrong impression that existence results in combinatorics come with efficient algorithms. For general graphs, computing the chromatic number requires exponential time; 3-colorability of planar graphs is NP-complete. The tractability gap runs through graph theory: Hall's condition characterizes which bipartite graphs have perfect matchings in polynomial time, but determining whether a general graph has a Hamiltonian cycle — a superficially similar connectivity question — is NP-complete. The structure of the problem (bipartite vs. general, matching vs. Hamiltonicity) determines tractability in ways that are not predictable from the problem's surface appearance.

💡Intuition

The spectrum of the adjacency matrix AA encodes global graph properties. The largest eigenvalue satisfies λ1Δ(G)\lambda_1 \leq \Delta(G), and for regular graphs λ1=Δ\lambda_1 = \Delta exactly. More importantly, the second smallest eigenvalue of the Laplacian L=DAL = D - A satisfies λ2>0\lambda_2 > 0 if and only if GG is connected — this algebraic connectivity (Fiedler value) measures how robustly connected the graph is. Spectral graph theory, developed in the next lesson, makes this the foundation for clustering and network analysis.

💡Intuition

Graph coloring is the prototypical NP-complete problem. Deciding whether χ(G)3\chi(G) \leq 3 (3-colorability) is NP-complete, making it one of Karp's original 21 NP-complete problems. This hardness propagates across combinatorics and theoretical computer science: register allocation in compilers, frequency assignment in wireless networks, and exam scheduling all reduce to graph coloring. Understanding that the problem is easy for special graph classes (bipartite graphs are 2-colorable, chordal graphs are greedy-colorable in polynomial time) while hard in general is a recurring theme in combinatorial optimization.

⚠️Warning

There is a fundamental gap between graph-theoretic guarantees and computational tractability. The four-color theorem guarantees every planar graph is 4-colorable, yet no efficient algorithm computes the chromatic number of a general graph — the problem remains NP-complete even for planar graphs (3-colorability of planar graphs is NP-complete). Similarly, Hall's theorem guarantees a perfect matching exists in bipartite graphs satisfying the Hall condition, but computing a maximum matching in general (non-bipartite) graphs requires Edmonds' blossom algorithm, which handles odd cycles through a significantly more complex augmentation strategy. The gap between existence proofs and efficient algorithms is a central concern throughout combinatorics.

Enjoying these notes?

Get new lessons delivered to your inbox. No spam.