Imagine you're an engineer tasked with connecting a set of cities with fiber-optic cables. Each potential cable route has a cost based on distance and terrain. Your goal: connect all cities at the minimum total cost, with no redundant connections.
This is the Minimum Spanning Tree (MST) problem—one of the most elegant and practically important problems in graph theory. The solution forms a tree (no cycles) that spans all vertices (everyone is connected) with minimum total edge weight.
Two classic algorithms solve this problem:
Prim's Algorithm: Grows the MST from a single vertex, always adding the cheapest edge that extends the tree.
Kruskal's Algorithm: Considers edges globally in weight order, adding each edge that doesn't create a cycle.
Both are greedy, both are optimal, and both reveal deep insights about graph structure.
By the end of this page, you will understand:
- what a spanning tree is and why minimum weight matters,
- the cut property that guarantees greedy MST algorithms work,
- how Prim's and Kruskal's algorithms exploit this property,
- implementation details, including the Union-Find data structure, and
- when to prefer one algorithm over the other.
Before diving into algorithms, let's establish precise definitions:
Spanning Tree:
A spanning tree of a connected, undirected graph G = (V, E) is a subgraph T = (V, E') such that:
- T includes every vertex of G (it spans V), and
- T is connected and contains no cycles (it is a tree).
A spanning tree with V vertices has exactly V - 1 edges—the minimum needed to connect V vertices without cycles.
Minimum Spanning Tree (MST):
A minimum spanning tree is a spanning tree whose sum of edge weights is minimized. A connected graph generally has many spanning trees, and at least one of them has minimum total weight. The MST need not be unique: if several spanning trees tie for the minimum weight, each one is an MST. (When all edge weights are distinct, the MST is unique.)
Don't confuse MST with shortest paths! An MST minimizes total network cost, not the distance between any specific pair of vertices. The shortest path between two vertices in an MST may be longer than the shortest path in the original graph.
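A quick numeric illustration of this caution (a minimal sketch; the triangle graph here is made up for the example):

```typescript
// Triangle graph: A-B (2), B-C (2), A-C (3).
// The MST keeps A-B and B-C (total 4, cheaper than the cost-5 spanning
// trees that use A-C) and drops the direct edge A-C.
const mstPathAtoC = 2 + 2;    // A -> B -> C: the only A-to-C route in the MST
const graphShortestAtoC = 3;  // the direct edge A-C in the original graph

console.log(mstPathAtoC > graphShortestAtoC); // true: the MST path is longer
```

The MST minimizes the total cost of the network, not the cost of any particular route through it.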
Both Prim's and Kruskal's algorithms are greedy—they make locally optimal choices. Why do these local choices lead to a globally optimal MST? The answer lies in the cut property.
Definition: Cut
A cut of a graph is a partition of vertices into two non-empty sets S and V-S. The cut edges are edges with one endpoint in S and one in V-S.
The Cut Property (MST Optimality Guarantee):
For any cut of a graph, the minimum-weight edge crossing the cut is part of some MST.
This is profound: no matter how you partition the vertices, the lightest edge connecting the two parts is "safe" to include in the MST.
Proof Intuition:
Suppose, for contradiction, that the minimum-weight cut edge e = (u, v) appears in no MST. Take any MST T. Since T spans all vertices, it contains a path from u to v, and that path must cross the cut at some edge e'. Adding e to T creates a cycle through e'; removing e' yields another spanning tree. Since w(e) ≤ w(e'), the new tree weighs no more than T, so it is also an MST, and it contains e. Contradiction.
```
Graph:

      A ----(5)---- B
      |\           /|
   (1)| \(3)  (4)/ |(2)
      |  \      /  |
      C ----(6)---- D

Edges: A-B 5, A-C 1, A-D 3, B-C 4, B-D 2, C-D 6
(the diagonals A-D and B-C cross in the middle)

Consider the cut: {A, C} | {B, D}

Cut edges and weights:
- A-B: 5
- A-D: 3   ← MINIMUM cut edge
- B-C: 4
- C-D: 6

The Cut Property guarantees that edge A-D (weight 3)
is in some MST of this graph.

Indeed, the MST is: A-C (1), A-D (3), B-D (2) = total 6 ✓

Both Prim and Kruskal will include the minimum cut edge.
```

The cut property is why greedy MST algorithms work. Each algorithm defines cuts implicitly, selects the minimum edge crossing each cut, and the cut property guarantees these choices lead to an MST. Understanding this principle helps you see why the algorithms are correct—not just that they work.
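The guarantee can also be sanity-checked by brute force on this four-vertex example (a minimal sketch; vertices 0 to 3 stand for A to D):

```typescript
// Brute-force check of the cut property: enumerate every 3-edge spanning
// tree, find the minimum total weight, and confirm that some MST contains
// the minimum cut edge A-D (encoded as [0, 3, 3]).
type E = [number, number, number]; // [u, v, weight]
const es: E[] = [[0, 1, 5], [0, 2, 1], [0, 3, 3], [1, 2, 4], [1, 3, 2], [2, 3, 6]];

function spans(subset: E[]): boolean {
  // Three edges on four vertices form a spanning tree iff they connect everything
  const parent = [0, 1, 2, 3];
  const find = (x: number): number =>
    parent[x] === x ? x : (parent[x] = find(parent[x]));
  let components = 4;
  for (const [u, v] of subset) {
    const ru = find(u), rv = find(v);
    if (ru !== rv) { parent[ru] = rv; components--; }
  }
  return components === 1;
}

let best = Infinity;
let bestTrees: E[][] = [];
for (let i = 0; i < es.length; i++)
  for (let j = i + 1; j < es.length; j++)
    for (let k = j + 1; k < es.length; k++) {
      const t = [es[i], es[j], es[k]];
      if (!spans(t)) continue;
      const w = t.reduce((s, [, , wt]) => s + wt, 0);
      if (w < best) { best = w; bestTrees = [t]; }
      else if (w === best) bestTrees.push(t);
    }

const hasAD = bestTrees.some(t => t.some(([u, v]) => u === 0 && v === 3));
console.log(best);  // 6
console.log(hasAD); // true: the minimum cut edge A-D is in an MST
```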
Prim's algorithm grows the MST from a single starting vertex, at each step adding the minimum-weight edge that connects a tree vertex to a non-tree vertex.
Algorithm Outline:
1. Pick an arbitrary start vertex and add it to the tree.
2. Among all edges with exactly one endpoint in the tree, pick the one with minimum weight.
3. Add that edge and its new endpoint to the tree.
4. Repeat until all V vertices are in the tree (V - 1 edges added).
Connection to Dijkstra:
Prim's algorithm is structurally similar to Dijkstra: both grow a set of finished vertices outward from a start vertex and use a priority queue to choose the next vertex cheaply. The difference is the key. Dijkstra orders vertices by total distance from the source; Prim orders them by the weight of the single cheapest edge connecting them to the tree.
The cut at each step is: {vertices in MST} | {vertices not in MST}. The minimum edge crossing this cut is added, satisfying the cut property.
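The kinship with Dijkstra boils down to the priority each algorithm assigns to a frontier vertex. A minimal sketch with hypothetical helper names (not part of either algorithm's standard listing):

```typescript
// Hypothetical one-line "key" helpers to highlight the difference.
function dijkstraKey(distToU: number, edgeWeight: number): number {
  return distToU + edgeWeight; // cumulative distance from the source
}

function primKey(_distToU: number, edgeWeight: number): number {
  return edgeWeight; // only the single connecting edge matters
}

// Relaxing edge (u, v) with weight 3 when u is 7 away from the start:
console.log(dijkstraKey(7, 3)); // 10: candidate path length through u
console.log(primKey(7, 3));     // 3:  candidate edge into the tree
```

Everything else in the two algorithms (priority queue, visited set, relaxation loop) matches almost line for line.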
```typescript
interface WeightedEdge {
  to: number;
  weight: number;
}

interface WeightedGraph {
  neighbors(vertex: number): WeightedEdge[];
  numVertices: number;
}

interface MSTResult {
  edges: { from: number; to: number; weight: number }[];
  totalWeight: number;
}

// Minimal binary min-heap (added so the listing is self-contained):
// push(priority, value); pop() returns the value with smallest priority.
class MinHeap<T> {
  private heap: [number, T][] = [];

  push(priority: number, value: T): void {
    this.heap.push([priority, value]);
    let i = this.heap.length - 1;
    while (i > 0) {
      const p = (i - 1) >> 1;
      if (this.heap[p][0] <= this.heap[i][0]) break;
      [this.heap[p], this.heap[i]] = [this.heap[i], this.heap[p]];
      i = p;
    }
  }

  pop(): T | undefined {
    if (this.heap.length === 0) return undefined;
    const top = this.heap[0][1];
    const last = this.heap.pop()!;
    if (this.heap.length > 0) {
      this.heap[0] = last;
      let i = 0;
      while (true) {
        const l = 2 * i + 1, r = 2 * i + 2;
        let s = i;
        if (l < this.heap.length && this.heap[l][0] < this.heap[s][0]) s = l;
        if (r < this.heap.length && this.heap[r][0] < this.heap[s][0]) s = r;
        if (s === i) break;
        [this.heap[s], this.heap[i]] = [this.heap[i], this.heap[s]];
        i = s;
      }
    }
    return top;
  }

  isEmpty(): boolean {
    return this.heap.length === 0;
  }
}

function primMST(graph: WeightedGraph): MSTResult {
  const n = graph.numVertices;
  const inMST = new Array(n).fill(false);
  const minEdgeTo = new Array(n).fill(Infinity); // Minimum edge weight to reach vertex
  const parent = new Array(n).fill(-1);          // Parent in MST (for edge reconstruction)

  // Priority queue: [weight, vertex]
  const pq = new MinHeap<number>();

  // Start from vertex 0
  minEdgeTo[0] = 0;
  pq.push(0, 0);

  const mstEdges: { from: number; to: number; weight: number }[] = [];
  let totalWeight = 0;

  while (!pq.isEmpty()) {
    const current = pq.pop()!;
    if (inMST[current]) continue; // Already processed
    inMST[current] = true;

    // Add edge to MST (except for starting vertex)
    if (parent[current] !== -1) {
      mstEdges.push({ from: parent[current], to: current, weight: minEdgeTo[current] });
      totalWeight += minEdgeTo[current];
    }

    // Update minimum edge weights to neighbors
    for (const { to, weight } of graph.neighbors(current)) {
      if (!inMST[to] && weight < minEdgeTo[to]) {
        minEdgeTo[to] = weight;
        parent[to] = current;
        pq.push(weight, to);
      }
    }
  }

  return { edges: mstEdges, totalWeight };
}

// Example usage
const graph: WeightedGraph = {
  numVertices: 4,
  neighbors: (v: number) => {
    const adj: WeightedEdge[][] = [
      [{ to: 1, weight: 10 }, { to: 2, weight: 6 }, { to: 3, weight: 5 }], // 0
      [{ to: 0, weight: 10 }, { to: 3, weight: 15 }],                      // 1
      [{ to: 0, weight: 6 }, { to: 3, weight: 4 }],                        // 2
      [{ to: 0, weight: 5 }, { to: 1, weight: 15 }, { to: 2, weight: 4 }], // 3
    ];
    return adj[v];
  }
};

const mst = primMST(graph);
console.log(`MST total weight: ${mst.totalWeight}`);
console.log(`MST edges: ${JSON.stringify(mst.edges)}`);
```

Step-by-Step Execution:
Graph:
0 ---(10)--- 1
|\ |
(6) \(5) (15)
| \ |
2 ---(4)--- 3
| Step | In MST | PQ Top | Action | MST Edges |
|---|---|---|---|---|
| Init | {} | (0, v0) | Start with v0 | |
| 1 | {0} | (5, v3) | Add 0→3 | (0,3,5) |
| 2 | {0,3} | (4, v2) | Add 3→2 | (0,3,5), (3,2,4) |
| 3 | {0,2,3} | (10, v1) | Add 0→1 | (0,3,5), (3,2,4), (0,1,10) |
| Done | {0,1,2,3} | empty | — | Total: 19 |
MST: edges (0,3), (3,2), (0,1) with total weight 5 + 4 + 10 = 19.
Kruskal's algorithm takes a different approach: instead of growing from a single vertex, it considers all edges globally in order of weight, adding each edge that doesn't create a cycle.
Algorithm Outline:
1. Sort all edges by weight, ascending.
2. For each edge in order, add it to the MST if its endpoints lie in different components (i.e., adding it creates no cycle).
3. Stop once V - 1 edges have been added.
The Union-Find Data Structure:
Kruskal's efficiency depends on quickly answering: "Are u and v in the same component?" and "Merge u's and v's components." The Union-Find (Disjoint Set Union) data structure handles both in nearly O(1) amortized time.
Connection to Cut Property:
When we consider edge (u, v) and u and v are in different components, that edge crosses the cut {u's component} | {rest}. Because edges are processed in increasing weight order, every lighter edge has already been considered, and any lighter edge crossing this cut would have been added (it could not have formed a cycle) and merged the two sides. So (u, v) is the minimum-weight edge crossing this cut, and the cut property guarantees it belongs in some MST.
```typescript
interface Edge {
  from: number;
  to: number;
  weight: number;
}

class UnionFind {
  private parent: number[];
  private rank: number[];

  constructor(n: number) {
    this.parent = Array.from({ length: n }, (_, i) => i);
    this.rank = new Array(n).fill(0);
  }

  find(x: number): number {
    // Path compression: point directly to root
    if (this.parent[x] !== x) {
      this.parent[x] = this.find(this.parent[x]);
    }
    return this.parent[x];
  }

  union(x: number, y: number): boolean {
    const rootX = this.find(x);
    const rootY = this.find(y);
    if (rootX === rootY) return false; // Already connected

    // Union by rank: attach smaller tree under larger
    if (this.rank[rootX] < this.rank[rootY]) {
      this.parent[rootX] = rootY;
    } else if (this.rank[rootX] > this.rank[rootY]) {
      this.parent[rootY] = rootX;
    } else {
      this.parent[rootY] = rootX;
      this.rank[rootX]++;
    }
    return true; // Successfully merged
  }

  connected(x: number, y: number): boolean {
    return this.find(x) === this.find(y);
  }
}

function kruskalMST(numVertices: number, edges: Edge[]): MSTResult {
  // Sort edges by weight
  const sortedEdges = [...edges].sort((a, b) => a.weight - b.weight);

  const uf = new UnionFind(numVertices);
  const mstEdges: Edge[] = [];
  let totalWeight = 0;

  for (const edge of sortedEdges) {
    // If adding this edge doesn't create a cycle
    if (uf.union(edge.from, edge.to)) {
      mstEdges.push(edge);
      totalWeight += edge.weight;

      // MST complete when we have V-1 edges
      if (mstEdges.length === numVertices - 1) break;
    }
  }

  return { edges: mstEdges, totalWeight };
}

// Example usage
const edges: Edge[] = [
  { from: 0, to: 1, weight: 10 },
  { from: 0, to: 2, weight: 6 },
  { from: 0, to: 3, weight: 5 },
  { from: 1, to: 3, weight: 15 },
  { from: 2, to: 3, weight: 4 },
];

const mst = kruskalMST(4, edges);
console.log(`MST total weight: ${mst.totalWeight}`); // 19
```

Step-by-Step Execution:
Sorted edges: (2,3,4), (0,3,5), (0,2,6), (0,1,10), (1,3,15)
| Step | Edge | Weight | Action | Components | MST Edges Count |
|---|---|---|---|---|---|
| 1 | (2,3) | 4 | Add (different components) | {0},{1},{2,3} | 1 |
| 2 | (0,3) | 5 | Add (0 not with 2,3) | {0,2,3},{1} | 2 |
| 3 | (0,2) | 6 | Skip (0,2 same component) | {0,2,3},{1} | 2 |
| 4 | (0,1) | 10 | Add (1 not with others) | {0,1,2,3} | 3 |
| Done | — | — | V-1 = 3 edges | All connected | 3 |
MST: edges (2,3), (0,3), (0,1) with total weight 4 + 5 + 10 = 19.
(Same result as Prim's)
The Union-Find data structure is crucial for Kruskal's efficiency. Let's understand it deeply.
Core Operations:
- find(x): return the representative (root) of the set containing x. Two elements are in the same set exactly when their finds return the same root.
- union(x, y): merge the sets containing x and y into one.
Naive Implementation:
Each element points to its parent. Find follows parent pointers to the root. Union makes one root point to another. This can degrade to O(n) per operation for pathological cases (long chains).
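The pathological case is easy to reproduce. A minimal sketch (the helper name and chain-building shortcut are illustrative): always pointing the first root at the second, union(0,1), union(1,2), ... builds a chain, so a later find must walk every parent pointer.

```typescript
// Build the worst-case chain 0 -> 1 -> ... -> n-1 that naive union can
// produce, then count the parent-pointer hops find(0) needs (no compression).
function naiveChainHops(n: number): number {
  const parent = Array.from({ length: n }, (_, i) => i);
  for (let i = 0; i + 1 < n; i++) parent[i] = i + 1; // each node points at the next

  let hops = 0;
  let x = 0;
  while (parent[x] !== x) { // walk to the root, counting steps
    x = parent[x];
    hops++;
  }
  return hops;
}

console.log(naiveChainHops(1000)); // 999: linear cost for a single find
```

Path compression and union by rank exist precisely to prevent such chains from forming or surviving.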
Optimizations:
Path Compression (in find): Make every node on the path point directly to the root. Future finds are faster.
Union by Rank (in union): Always attach the shorter tree under the taller tree. Keeps trees balanced.
With both optimizations, the amortized time per operation is O(α(n)), where α is the inverse Ackermann function—essentially O(1) for all practical purposes.
```
Initial state (5 elements, each its own set):

  0    1    2    3    4
  ↓    ↓    ↓    ↓    ↓
 [0]  [1]  [2]  [3]  [4]    (parent of i is i)

After union(0, 1):

  0    2    3    4
 /     ↓    ↓    ↓
1     [2]  [3]  [4]

Parent: [0, 0, 2, 3, 4]

After union(2, 3):

  0    2    4
 /    /     ↓
1    3     [4]

Parent: [0, 0, 2, 2, 4]

After union(0, 2):
(Union by rank: 0's tree has rank 1, 2's tree has rank 1)

    0      4
   /|      ↓
  1 2     [4]
    |
    3

Parent: [0, 0, 0, 2, 4]

Now find(3):
- 3's parent is 2, 2's parent is 0, 0 is root
- With path compression: set parent[3] = 0

After path compression on find(3):

    0      4
   /|\     ↓
  1 2 3   [4]

Parent: [0, 0, 0, 0, 4]

Future find(3) is O(1)!
```

The inverse Ackermann function α(n) grows incredibly slowly—α(n) ≤ 4 for any practical value of n (even n = 10^80, more than the number of atoms in the observable universe). For all intents and purposes, Union-Find operations are O(1) amortized. This makes Kruskal extremely efficient.
Both algorithms find an MST, but they have different performance characteristics:
| Aspect | Prim | Kruskal |
|---|---|---|
| Approach | Grow tree from one vertex | Process edges globally |
| Time Complexity | O((V + E) log V) with binary heap | O(E log E) = O(E log V) |
| Space Complexity | O(V) | O(V) for Union-Find |
| Best for | Dense graphs | Sparse graphs |
| Graph Representation | Adjacency list preferred | Edge list preferred |
| Early Termination | Can stop when target reached | Naturally stops at V-1 edges |
| Parallelization | Harder (sequential growth) | Easier (edge processing) |
Deep Dive: Complexity Analysis
Prim's Algorithm: Each vertex is extracted from the priority queue once (O(V) extractions), and each edge can cause at most one insertion (O(E) insertions). With a binary heap each operation costs O(log V), giving O((V + E) log V) overall; for a connected graph this is O(E log V). A Fibonacci heap with decrease-key improves the bound to O(E + V log V).
Kruskal's Algorithm: Sorting the edges dominates at O(E log E); the E subsequent Union-Find operations add only O(E α(V)). Since E ≤ V², we have log E ≤ 2 log V, so O(E log E) = O(E log V). If the edges arrive pre-sorted, Kruskal runs in near-linear O(E α(V)) time.
Practical Guidance:
For most practical applications, either algorithm works well. Kruskal is often preferred for its simplicity and natural fit with edge-list representations. Prim is preferred when you're working with adjacency lists or need to start from a specific vertex.
MST algorithms have applications far beyond their original networking context:
Application 1: Clustering
MST-based clustering: Build the MST, then remove the k-1 heaviest edges to get k clusters. The intuition: heavy edges connect "distant" groups, so removing them separates natural clusters.
Application 2: Approximation Algorithms
The Traveling Salesman Problem (TSP) is NP-hard, but for metric instances (weights satisfying the triangle inequality) an MST provides a 2-approximation:
1. Build an MST of the cities.
2. Walk the MST in preorder (DFS order), visiting each city when first reached.
3. Return to the start. The full walk traverses every MST edge twice, and shortcutting repeated vertices can only help under the triangle inequality, so the tour costs at most 2 × MST weight ≤ 2 × the optimal tour.
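A sketch of the preorder-walk step, assuming MST edges are already available (e.g., from a Kruskal run); the TreeEdge shape and function name are illustrative:

```typescript
interface TreeEdge { from: number; to: number }

// Produce a TSP tour from MST edges: DFS preorder, then return to start.
function preorderTour(numVertices: number, mstEdges: TreeEdge[], start = 0): number[] {
  // Build an undirected adjacency list for the tree
  const adj: number[][] = Array.from({ length: numVertices }, () => []);
  for (const { from, to } of mstEdges) {
    adj[from].push(to);
    adj[to].push(from);
  }

  const tour: number[] = [];
  const visited = new Array(numVertices).fill(false);
  const dfs = (v: number): void => {
    visited[v] = true;
    tour.push(v); // record each vertex on first visit (preorder)
    for (const next of adj[v]) if (!visited[next]) dfs(next);
  };
  dfs(start);
  tour.push(start); // close the tour by returning to the starting city
  return tour;
}

// MST of the earlier 4-vertex example: edges (0,3), (3,2), (0,1)
console.log(preorderTour(4, [{ from: 0, to: 3 }, { from: 3, to: 2 }, { from: 0, to: 1 }]));
// [0, 3, 2, 1, 0]
```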
Application 3: Network Reliability
Finding the "most reliable" spanning tree (maximize minimum edge weight, or minimize maximum edge weight) uses MST variants.
```typescript
function mstClustering(
  numVertices: number,
  edges: Edge[],
  numClusters: number
): number[][] {
  // Build MST
  const mst = kruskalMST(numVertices, edges);

  if (numClusters <= 1 || numClusters > numVertices) {
    throw new Error("Invalid number of clusters");
  }

  // Sort MST edges by weight descending
  const sortedMSTEdges = [...mst.edges].sort((a, b) => b.weight - a.weight);

  // Remove the k-1 heaviest edges to create k clusters
  const edgesToKeep = sortedMSTEdges.slice(numClusters - 1);

  // Build clusters using Union-Find on the remaining edges
  const uf = new UnionFind(numVertices);
  for (const edge of edgesToKeep) {
    uf.union(edge.from, edge.to);
  }

  // Group vertices by their cluster root
  const clusters = new Map<number, number[]>();
  for (let v = 0; v < numVertices; v++) {
    const root = uf.find(v);
    if (!clusters.has(root)) {
      clusters.set(root, []);
    }
    clusters.get(root)!.push(v);
  }

  return Array.from(clusters.values());
}

// Example: Cluster 6 points into 2 clusters
const points = [
  // Cluster 1: close together
  { id: 0, x: 0, y: 0 },
  { id: 1, x: 1, y: 0 },
  { id: 2, x: 0, y: 1 },
  // Cluster 2: far from cluster 1
  { id: 3, x: 10, y: 10 },
  { id: 4, x: 11, y: 10 },
  { id: 5, x: 10, y: 11 },
];

// Build complete graph with Euclidean distances
const clusterEdges: Edge[] = [];
for (let i = 0; i < points.length; i++) {
  for (let j = i + 1; j < points.length; j++) {
    const dist = Math.sqrt(
      Math.pow(points[i].x - points[j].x, 2) +
      Math.pow(points[i].y - points[j].y, 2)
    );
    clusterEdges.push({ from: i, to: j, weight: dist });
  }
}

const clusters = mstClustering(6, clusterEdges, 2);
console.log(clusters); // [[0,1,2], [3,4,5]] or similar
```

Many problems can be solved using MST modifications: Maximum Spanning Tree (negate weights, run MST), Minimum Bottleneck Spanning Tree (minimize the maximum edge), Steiner Tree (MST connecting a subset of vertices), and Second-Best MST (useful for sensitivity analysis).
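The Maximum Spanning Tree modification mentioned above is worth a concrete look. A minimal self-contained sketch (the WEdge interface and function name are illustrative, and a compact path-compression-only union-find is inlined rather than reusing the page's class): processing edges in descending weight order is equivalent to running Kruskal on negated weights.

```typescript
interface WEdge { from: number; to: number; weight: number }

// Total weight of a maximum spanning tree, via the negation trick:
// sorting descending by weight == sorting ascending by negated weight.
function maxSpanningTreeWeight(numVertices: number, edges: WEdge[]): number {
  const parent = Array.from({ length: numVertices }, (_, i) => i);
  const find = (x: number): number =>
    parent[x] === x ? x : (parent[x] = find(parent[x]));

  const sorted = [...edges].sort((a, b) => b.weight - a.weight);

  let total = 0;
  for (const e of sorted) {
    const ra = find(e.from);
    const rb = find(e.to);
    if (ra !== rb) {
      parent[ra] = rb; // merge components; edge joins the maximum spanning tree
      total += e.weight;
    }
  }
  return total;
}

const sample: WEdge[] = [
  { from: 0, to: 1, weight: 1 },
  { from: 1, to: 2, weight: 5 },
  { from: 0, to: 2, weight: 3 },
];
console.log(maxSpanningTreeWeight(3, sample)); // 8 (keeps weights 5 and 3)
```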
Minimum spanning trees represent one of the most elegant intersections of theory and practice in algorithm design. The cut property provides the theoretical foundation; Prim's and Kruskal's algorithms provide efficient implementations.
What's Next:
With traversal, shortest paths, and minimum spanning trees covered, the final page previews what's ahead in Chapters 23-24—where you'll master these algorithms through implementation, advanced variations, and challenging applications.
You now understand the two classic MST algorithms. Both Prim and Kruskal exploit the cut property to make locally optimal choices that yield globally optimal spanning trees. The Union-Find data structure is a powerful tool you'll encounter repeatedly in graph algorithms.