Imagine you're an engineer tasked with connecting a set of cities with fiber-optic cables. Each potential cable route has a cost based on distance and terrain. Your goal: connect all cities at the minimum total cost, with no redundant connections.
This is the Minimum Spanning Tree (MST) problem—one of the most elegant and practically important problems in graph theory. The solution forms a tree (no cycles) that spans all vertices (everyone is connected) with minimum total edge weight.
Two classic algorithms solve this problem:
Prim's Algorithm: Grows the MST from a single vertex, always adding the cheapest edge that extends the tree.
Kruskal's Algorithm: Considers edges globally in weight order, adding each edge that doesn't create a cycle.
Both are greedy, both are optimal, and both reveal deep insights about graph structure.
By the end of this page, you will understand:
- what a spanning tree is and why minimum weight matters,
- the cut property that guarantees greedy MST algorithms work,
- how Prim's and Kruskal's algorithms exploit this property,
- implementation details, including the Union-Find data structure, and
- when to prefer one algorithm over the other.
Before diving into algorithms, let's establish precise definitions:
Spanning Tree:
A spanning tree of a connected, undirected graph G = (V, E) is a subgraph T = (V, E') such that:
- T includes every vertex of G (it spans V), and
- T is connected and contains no cycles (it is a tree).
A spanning tree with V vertices has exactly V - 1 edges—the minimum needed to connect V vertices without cycles.
Minimum Spanning Tree (MST):
A minimum spanning tree is a spanning tree whose sum of edge weights is minimized. A connected graph generally has many spanning trees, and at least one of them has minimum total weight. The MST need not be unique: if several spanning trees tie for the minimum weight, each one is an MST. (When all edge weights are distinct, the MST is unique.)
Don't confuse MST with shortest paths! An MST minimizes total network cost, not the distance between any specific pair of vertices. The shortest path between two vertices in an MST may be longer than the shortest path in the original graph.
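A quick numeric illustration of this caution (a minimal sketch; the triangle graph here is made up for the example):

```typescript
// Triangle graph: A-B (2), B-C (2), A-C (3).
// The MST keeps A-B and B-C (total 4, cheaper than the cost-5 spanning
// trees that use A-C) and drops the direct edge A-C.
const mstPathAtoC = 2 + 2;    // A -> B -> C: the only A-to-C route in the MST
const graphShortestAtoC = 3;  // the direct edge A-C in the original graph

console.log(mstPathAtoC > graphShortestAtoC); // true: the MST path is longer
```

The MST minimizes the total cost of the network, not the cost of any particular route through it.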
Both Prim's and Kruskal's algorithms are greedy—they make locally optimal choices. Why do these local choices lead to a globally optimal MST? The answer lies in the cut property.
Definition: Cut
A cut of a graph is a partition of vertices into two non-empty sets S and V-S. The cut edges are edges with one endpoint in S and one in V-S.
The Cut Property (MST Optimality Guarantee):
For any cut of a graph, the minimum-weight edge crossing the cut is part of some MST.
This is profound: no matter how you partition the vertices, the lightest edge connecting the two parts is "safe" to include in the MST.
Proof Intuition:
Suppose, for contradiction, that the minimum-weight cut edge e = (u, v) appears in no MST. Take any MST T. Since T spans all vertices, it contains a path from u to v, and that path must cross the cut at some edge e'. Adding e to T creates a cycle through e'; removing e' yields another spanning tree. Since w(e) ≤ w(e'), the new tree weighs no more than T, so it is also an MST, and it contains e. Contradiction.
```
Graph:

      A ----(5)---- B
      |\           /|
   (1)| \(3)  (4)/ |(2)
      |  \      /  |
      C ----(6)---- D

Edges: A-B 5, A-C 1, A-D 3, B-C 4, B-D 2, C-D 6
(the diagonals A-D and B-C cross in the middle)

Consider the cut: {A, C} | {B, D}

Cut edges and weights:
- A-B: 5
- A-D: 3   ← MINIMUM cut edge
- B-C: 4
- C-D: 6

The Cut Property guarantees that edge A-D (weight 3)
is in some MST of this graph.

Indeed, the MST is: A-C (1), A-D (3), B-D (2) = total 6 ✓

Both Prim and Kruskal will include the minimum cut edge.
```

The cut property is why greedy MST algorithms work. Each algorithm defines cuts implicitly, selects the minimum edge crossing each cut, and the cut property guarantees these choices lead to an MST. Understanding this principle helps you see why the algorithms are correct—not just that they work.
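The guarantee can also be sanity-checked by brute force on this four-vertex example (a minimal sketch; vertices 0 to 3 stand for A to D):

```typescript
// Brute-force check of the cut property: enumerate every 3-edge spanning
// tree, find the minimum total weight, and confirm that some MST contains
// the minimum cut edge A-D (encoded as [0, 3, 3]).
type E = [number, number, number]; // [u, v, weight]
const es: E[] = [[0, 1, 5], [0, 2, 1], [0, 3, 3], [1, 2, 4], [1, 3, 2], [2, 3, 6]];

function spans(subset: E[]): boolean {
  // Three edges on four vertices form a spanning tree iff they connect everything
  const parent = [0, 1, 2, 3];
  const find = (x: number): number =>
    parent[x] === x ? x : (parent[x] = find(parent[x]));
  let components = 4;
  for (const [u, v] of subset) {
    const ru = find(u), rv = find(v);
    if (ru !== rv) { parent[ru] = rv; components--; }
  }
  return components === 1;
}

let best = Infinity;
let bestTrees: E[][] = [];
for (let i = 0; i < es.length; i++)
  for (let j = i + 1; j < es.length; j++)
    for (let k = j + 1; k < es.length; k++) {
      const t = [es[i], es[j], es[k]];
      if (!spans(t)) continue;
      const w = t.reduce((s, [, , wt]) => s + wt, 0);
      if (w < best) { best = w; bestTrees = [t]; }
      else if (w === best) bestTrees.push(t);
    }

const hasAD = bestTrees.some(t => t.some(([u, v]) => u === 0 && v === 3));
console.log(best);  // 6
console.log(hasAD); // true: the minimum cut edge A-D is in an MST
```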
Prim's algorithm grows the MST from a single starting vertex, at each step adding the minimum-weight edge that connects a tree vertex to a non-tree vertex.
Algorithm Outline:
1. Pick an arbitrary start vertex and add it to the tree.
2. Among all edges with exactly one endpoint in the tree, pick the one with minimum weight.
3. Add that edge and its new endpoint to the tree.
4. Repeat until all V vertices are in the tree (V - 1 edges added).
Connection to Dijkstra:
Prim's algorithm is structurally similar to Dijkstra: both grow a set of finished vertices outward from a start vertex and use a priority queue to choose the next vertex cheaply. The difference is the key. Dijkstra orders vertices by total distance from the source; Prim orders them by the weight of the single cheapest edge connecting them to the tree.
The cut at each step is: {vertices in MST} | {vertices not in MST}. The minimum edge crossing this cut is added, satisfying the cut property.
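The kinship with Dijkstra boils down to the priority each algorithm assigns to a frontier vertex. A minimal sketch with hypothetical helper names (not part of either algorithm's standard listing):

```typescript
// Hypothetical one-line "key" helpers to highlight the difference.
function dijkstraKey(distToU: number, edgeWeight: number): number {
  return distToU + edgeWeight; // cumulative distance from the source
}

function primKey(_distToU: number, edgeWeight: number): number {
  return edgeWeight; // only the single connecting edge matters
}

// Relaxing edge (u, v) with weight 3 when u is 7 away from the start:
console.log(dijkstraKey(7, 3)); // 10: candidate path length through u
console.log(primKey(7, 3));     // 3:  candidate edge into the tree
```

Everything else in the two algorithms (priority queue, visited set, relaxation loop) matches almost line for line.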
```typescript
interface WeightedEdge {
  to: number;
  weight: number;
}

interface WeightedGraph {
  neighbors(vertex: number): WeightedEdge[];
  numVertices: number;
}

interface MSTResult {
  edges: { from: number; to: number; weight: number }[];
  totalWeight: number;
}

// Minimal binary min-heap (added so the listing is self-contained):
// push(priority, value); pop() returns the value with smallest priority.
class MinHeap<T> {
  private heap: [number, T][] = [];

  push(priority: number, value: T): void {
    this.heap.push([priority, value]);
    let i = this.heap.length - 1;
    while (i > 0) {
      const p = (i - 1) >> 1;
      if (this.heap[p][0] <= this.heap[i][0]) break;
      [this.heap[p], this.heap[i]] = [this.heap[i], this.heap[p]];
      i = p;
    }
  }

  pop(): T | undefined {
    if (this.heap.length === 0) return undefined;
    const top = this.heap[0][1];
    const last = this.heap.pop()!;
    if (this.heap.length > 0) {
      this.heap[0] = last;
      let i = 0;
      while (true) {
        const l = 2 * i + 1, r = 2 * i + 2;
        let s = i;
        if (l < this.heap.length && this.heap[l][0] < this.heap[s][0]) s = l;
        if (r < this.heap.length && this.heap[r][0] < this.heap[s][0]) s = r;
        if (s === i) break;
        [this.heap[s], this.heap[i]] = [this.heap[i], this.heap[s]];
        i = s;
      }
    }
    return top;
  }

  isEmpty(): boolean {
    return this.heap.length === 0;
  }
}

function primMST(graph: WeightedGraph): MSTResult {
  const n = graph.numVertices;
  const inMST = new Array(n).fill(false);
  const minEdgeTo = new Array(n).fill(Infinity); // Minimum edge weight to reach vertex
  const parent = new Array(n).fill(-1);          // Parent in MST (for edge reconstruction)

  // Priority queue: [weight, vertex]
  const pq = new MinHeap<number>();

  // Start from vertex 0
  minEdgeTo[0] = 0;
  pq.push(0, 0);

  const mstEdges: { from: number; to: number; weight: number }[] = [];
  let totalWeight = 0;

  while (!pq.isEmpty()) {
    const current = pq.pop()!;
    if (inMST[current]) continue; // Already processed
    inMST[current] = true;

    // Add edge to MST (except for starting vertex)
    if (parent[current] !== -1) {
      mstEdges.push({ from: parent[current], to: current, weight: minEdgeTo[current] });
      totalWeight += minEdgeTo[current];
    }

    // Update minimum edge weights to neighbors
    for (const { to, weight } of graph.neighbors(current)) {
      if (!inMST[to] && weight < minEdgeTo[to]) {
        minEdgeTo[to] = weight;
        parent[to] = current;
        pq.push(weight, to);
      }
    }
  }

  return { edges: mstEdges, totalWeight };
}

// Example usage
const graph: WeightedGraph = {
  numVertices: 4,
  neighbors: (v: number) => {
    const adj: WeightedEdge[][] = [
      [{ to: 1, weight: 10 }, { to: 2, weight: 6 }, { to: 3, weight: 5 }], // 0
      [{ to: 0, weight: 10 }, { to: 3, weight: 15 }],                      // 1
      [{ to: 0, weight: 6 }, { to: 3, weight: 4 }],                        // 2
      [{ to: 0, weight: 5 }, { to: 1, weight: 15 }, { to: 2, weight: 4 }], // 3
    ];
    return adj[v];
  }
};

const mst = primMST(graph);
console.log(`MST total weight: ${mst.totalWeight}`);
console.log(`MST edges: ${JSON.stringify(mst.edges)}`);
```

Step-by-Step Execution:
Graph:
0 ---(10)--- 1
|\ |
(6) \(5) (15)
| \ |
2 ---(4)--- 3
| Step | In MST | PQ Top | Action | MST Edges |
|---|---|---|---|---|
| Init | {} | (0, v0) | Start with v0 | |
| 1 | {0} | (5, v3) | Add 0→3 | (0,3,5) |
| 2 | {0,3} | (4, v2) | Add 3→2 | (0,3,5), (3,2,4) |
| 3 | {0,2,3} | (10, v1) | Add 0→1 | (0,3,5), (3,2,4), (0,1,10) |
| Done | {0,1,2,3} | empty | — | Total: 19 |
MST: edges (0,3), (3,2), (0,1) with total weight 5 + 4 + 10 = 19.
Kruskal's algorithm takes a different approach: instead of growing from a single vertex, it considers all edges globally in order of weight, adding each edge that doesn't create a cycle.
Algorithm Outline:
1. Sort all edges by weight, ascending.
2. For each edge in order, add it to the MST if its endpoints lie in different components (i.e., adding it creates no cycle).
3. Stop once V - 1 edges have been added.
The Union-Find Data Structure:
Kruskal's efficiency depends on quickly answering: "Are u and v in the same component?" and "Merge u's and v's components." The Union-Find (Disjoint Set Union) data structure handles both in nearly O(1) amortized time.
Connection to Cut Property:
When we consider edge (u, v) and u and v are in different components, that edge crosses the cut {u's component} | {rest}. Because edges are processed in increasing weight order, every lighter edge has already been considered, and any lighter edge crossing this cut would have been added (it could not have formed a cycle) and merged the two sides. So (u, v) is the minimum-weight edge crossing this cut, and the cut property guarantees it belongs in some MST.
```typescript
interface Edge {
  from: number;
  to: number;
  weight: number;
}

class UnionFind {
  private parent: number[];
  private rank: number[];

  constructor(n: number) {
    this.parent = Array.from({ length: n }, (_, i) => i);
    this.rank = new Array(n).fill(0);
  }

  find(x: number): number {
    // Path compression: point directly to root
    if (this.parent[x] !== x) {
      this.parent[x] = this.find(this.parent[x]);
    }
    return this.parent[x];
  }

  union(x: number, y: number): boolean {
    const rootX = this.find(x);
    const rootY = this.find(y);
    if (rootX === rootY) return false; // Already connected

    // Union by rank: attach smaller tree under larger
    if (this.rank[rootX] < this.rank[rootY]) {
      this.parent[rootX] = rootY;
    } else if (this.rank[rootX] > this.rank[rootY]) {
      this.parent[rootY] = rootX;
    } else {
      this.parent[rootY] = rootX;
      this.rank[rootX]++;
    }
    return true; // Successfully merged
  }

  connected(x: number, y: number): boolean {
    return this.find(x) === this.find(y);
  }
}

function kruskalMST(numVertices: number, edges: Edge[]): MSTResult {
  // Sort edges by weight
  const sortedEdges = [...edges].sort((a, b) => a.weight - b.weight);

  const uf = new UnionFind(numVertices);
  const mstEdges: Edge[] = [];
  let totalWeight = 0;

  for (const edge of sortedEdges) {
    // If adding this edge doesn't create a cycle
    if (uf.union(edge.from, edge.to)) {
      mstEdges.push(edge);
      totalWeight += edge.weight;

      // MST complete when we have V-1 edges
      if (mstEdges.length === numVertices - 1) break;
    }
  }

  return { edges: mstEdges, totalWeight };
}

// Example usage
const edges: Edge[] = [
  { from: 0, to: 1, weight: 10 },
  { from: 0, to: 2, weight: 6 },
  { from: 0, to: 3, weight: 5 },
  { from: 1, to: 3, weight: 15 },
  { from: 2, to: 3, weight: 4 },
];

const mst = kruskalMST(4, edges);
console.log(`MST total weight: ${mst.totalWeight}`); // 19
```

Step-by-Step Execution:
Sorted edges: (2,3,4), (0,3,5), (0,2,6), (0,1,10), (1,3,15)
| Step | Edge | Weight | Action | Components | MST Edges Count |
|---|---|---|---|---|---|
| 1 | (2,3) | 4 | Add (different components) | {0},{1},{2,3} | 1 |
| 2 | (0,3) | 5 | Add (0 not with 2,3) | {0,2,3},{1} | 2 |
| 3 | (0,2) | 6 | Skip (0,2 same component) | {0,2,3},{1} | 2 |
| 4 | (0,1) | 10 | Add (1 not with others) | {0,1,2,3} | 3 |
| Done | — | — | V-1 = 3 edges | All connected | 3 |
MST: edges (2,3), (0,3), (0,1) with total weight 4 + 5 + 10 = 19.
(Same result as Prim's)
The Union-Find data structure is crucial for Kruskal's efficiency. Let's understand it deeply.
Core Operations:
- find(x): return the representative (root) of the set containing x. Two elements are in the same set exactly when their finds return the same root.
- union(x, y): merge the sets containing x and y into one.
Naive Implementation:
Each element points to its parent. Find follows parent pointers to the root. Union makes one root point to another. This can degrade to O(n) per operation for pathological cases (long chains).
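The pathological case is easy to reproduce. A minimal sketch (the helper name and chain-building shortcut are illustrative): always pointing the first root at the second, union(0,1), union(1,2), ... builds a chain, so a later find must walk every parent pointer.

```typescript
// Build the worst-case chain 0 -> 1 -> ... -> n-1 that naive union can
// produce, then count the parent-pointer hops find(0) needs (no compression).
function naiveChainHops(n: number): number {
  const parent = Array.from({ length: n }, (_, i) => i);
  for (let i = 0; i + 1 < n; i++) parent[i] = i + 1; // each node points at the next

  let hops = 0;
  let x = 0;
  while (parent[x] !== x) { // walk to the root, counting steps
    x = parent[x];
    hops++;
  }
  return hops;
}

console.log(naiveChainHops(1000)); // 999: linear cost for a single find
```

Path compression and union by rank exist precisely to prevent such chains from forming or surviving.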
Optimizations:
Path Compression (in find): Make every node on the path point directly to the root. Future finds are faster.
Union by Rank (in union): Always attach the shorter tree under the taller tree. Keeps trees balanced.
With both optimizations, the amortized time per operation is O(α(n)), where α is the inverse Ackermann function—essentially O(1) for all practical purposes.
```
Initial state (5 elements, each its own set):

  0    1    2    3    4
  ↓    ↓    ↓    ↓    ↓
 [0]  [1]  [2]  [3]  [4]    (parent of i is i)

After union(0, 1):

  0    2    3    4
 /     ↓    ↓    ↓
1     [2]  [3]  [4]

Parent: [0, 0, 2, 3, 4]

After union(2, 3):

  0    2    4
 /    /     ↓
1    3     [4]

Parent: [0, 0, 2, 2, 4]

After union(0, 2):
(Union by rank: 0's tree has rank 1, 2's tree has rank 1)

    0      4
   /|      ↓
  1 2     [4]
    |
    3

Parent: [0, 0, 0, 2, 4]

Now find(3):
- 3's parent is 2, 2's parent is 0, 0 is root
- With path compression: set parent[3] = 0

After path compression on find(3):

    0      4
   /|\     ↓
  1 2 3   [4]

Parent: [0, 0, 0, 0, 4]

Future find(3) is O(1)!
```

The inverse Ackermann function α(n) grows incredibly slowly—α(n) ≤ 4 for any practical value of n (even n = 10^80, more than the number of atoms in the observable universe). For all intents and purposes, Union-Find operations are O(1) amortized. This makes Kruskal extremely efficient.
Both algorithms find an MST, but they have different performance characteristics:
| Aspect | Prim | Kruskal |
|---|---|---|
| Approach | Grow tree from one vertex | Process edges globally |
| Time Complexity | O((V + E) log V) with binary heap | O(E log E) = O(E log V) |
| Space Complexity | O(V) | O(V) for Union-Find |
| Best for | Dense graphs | Sparse graphs |
| Graph Representation | Adjacency list preferred | Edge list preferred |
| Early Termination | Can stop when target reached | Naturally stops at V-1 edges |
| Parallelization | Harder (sequential growth) | Easier (edge processing) |
Deep Dive: Complexity Analysis
Prim's Algorithm: Each vertex is extracted from the priority queue once (O(V) extractions), and each edge can cause at most one insertion (O(E) insertions). With a binary heap each operation costs O(log V), giving O((V + E) log V) overall; for a connected graph this is O(E log V). A Fibonacci heap with decrease-key improves the bound to O(E + V log V).
Kruskal's Algorithm: Sorting the edges dominates at O(E log E); the E subsequent Union-Find operations add only O(E α(V)). Since E ≤ V², we have log E ≤ 2 log V, so O(E log E) = O(E log V). If the edges arrive pre-sorted, Kruskal runs in near-linear O(E α(V)) time.
Practical Guidance:
For most practical applications, either algorithm works well. Kruskal is often preferred for its simplicity and natural fit with edge-list representations. Prim is preferred when you're working with adjacency lists or need to start from a specific vertex.
MST algorithms have applications far beyond their original networking context:
Application 1: Clustering
MST-based clustering: Build the MST, then remove the k-1 heaviest edges to get k clusters. The intuition: heavy edges connect "distant" groups, so removing them separates natural clusters.
Application 2: Approximation Algorithms
The Traveling Salesman Problem (TSP) is NP-hard, but for metric instances (weights satisfying the triangle inequality) an MST provides a 2-approximation:
1. Build an MST of the cities.
2. Walk the MST in preorder (DFS order), visiting each city when first reached.
3. Return to the start. The full walk traverses every MST edge twice, and shortcutting repeated vertices can only help under the triangle inequality, so the tour costs at most 2 × MST weight ≤ 2 × the optimal tour.
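A sketch of the preorder-walk step, assuming MST edges are already available (e.g., from a Kruskal run); the TreeEdge shape and function name are illustrative:

```typescript
interface TreeEdge { from: number; to: number }

// Produce a TSP tour from MST edges: DFS preorder, then return to start.
function preorderTour(numVertices: number, mstEdges: TreeEdge[], start = 0): number[] {
  // Build an undirected adjacency list for the tree
  const adj: number[][] = Array.from({ length: numVertices }, () => []);
  for (const { from, to } of mstEdges) {
    adj[from].push(to);
    adj[to].push(from);
  }

  const tour: number[] = [];
  const visited = new Array(numVertices).fill(false);
  const dfs = (v: number): void => {
    visited[v] = true;
    tour.push(v); // record each vertex on first visit (preorder)
    for (const next of adj[v]) if (!visited[next]) dfs(next);
  };
  dfs(start);
  tour.push(start); // close the tour by returning to the starting city
  return tour;
}

// MST of the earlier 4-vertex example: edges (0,3), (3,2), (0,1)
console.log(preorderTour(4, [{ from: 0, to: 3 }, { from: 3, to: 2 }, { from: 0, to: 1 }]));
// [0, 3, 2, 1, 0]
```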
Application 3: Network Reliability
Finding the "most reliable" spanning tree (maximize minimum edge weight, or minimize maximum edge weight) uses MST variants.
```typescript
function mstClustering(
  numVertices: number,
  edges: Edge[],
  numClusters: number
): number[][] {
  // Build MST
  const mst = kruskalMST(numVertices, edges);

  if (numClusters <= 1 || numClusters > numVertices) {
    throw new Error("Invalid number of clusters");
  }

  // Sort MST edges by weight descending
  const sortedMSTEdges = [...mst.edges].sort((a, b) => b.weight - a.weight);

  // Remove the k-1 heaviest edges to create k clusters
  const edgesToKeep = sortedMSTEdges.slice(numClusters - 1);

  // Build clusters using Union-Find on the remaining edges
  const uf = new UnionFind(numVertices);
  for (const edge of edgesToKeep) {
    uf.union(edge.from, edge.to);
  }

  // Group vertices by their cluster root
  const clusters = new Map<number, number[]>();
  for (let v = 0; v < numVertices; v++) {
    const root = uf.find(v);
    if (!clusters.has(root)) {
      clusters.set(root, []);
    }
    clusters.get(root)!.push(v);
  }

  return Array.from(clusters.values());
}

// Example: Cluster 6 points into 2 clusters
const points = [
  // Cluster 1: close together
  { id: 0, x: 0, y: 0 },
  { id: 1, x: 1, y: 0 },
  { id: 2, x: 0, y: 1 },
  // Cluster 2: far from cluster 1
  { id: 3, x: 10, y: 10 },
  { id: 4, x: 11, y: 10 },
  { id: 5, x: 10, y: 11 },
];

// Build complete graph with Euclidean distances
const clusterEdges: Edge[] = [];
for (let i = 0; i < points.length; i++) {
  for (let j = i + 1; j < points.length; j++) {
    const dist = Math.sqrt(
      Math.pow(points[i].x - points[j].x, 2) +
      Math.pow(points[i].y - points[j].y, 2)
    );
    clusterEdges.push({ from: i, to: j, weight: dist });
  }
}

const clusters = mstClustering(6, clusterEdges, 2);
console.log(clusters); // [[0,1,2], [3,4,5]] or similar
```

Many problems can be solved using MST modifications: Maximum Spanning Tree (negate weights, run MST), Minimum Bottleneck Spanning Tree (minimize the maximum edge), Steiner Tree (MST connecting a subset of vertices), and Second-Best MST (useful for sensitivity analysis).
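The Maximum Spanning Tree modification mentioned above is worth a concrete look. A minimal self-contained sketch (the WEdge interface and function name are illustrative, and a compact path-compression-only union-find is inlined rather than reusing the page's class): processing edges in descending weight order is equivalent to running Kruskal on negated weights.

```typescript
interface WEdge { from: number; to: number; weight: number }

// Total weight of a maximum spanning tree, via the negation trick:
// sorting descending by weight == sorting ascending by negated weight.
function maxSpanningTreeWeight(numVertices: number, edges: WEdge[]): number {
  const parent = Array.from({ length: numVertices }, (_, i) => i);
  const find = (x: number): number =>
    parent[x] === x ? x : (parent[x] = find(parent[x]));

  const sorted = [...edges].sort((a, b) => b.weight - a.weight);

  let total = 0;
  for (const e of sorted) {
    const ra = find(e.from);
    const rb = find(e.to);
    if (ra !== rb) {
      parent[ra] = rb; // merge components; edge joins the maximum spanning tree
      total += e.weight;
    }
  }
  return total;
}

const sample: WEdge[] = [
  { from: 0, to: 1, weight: 1 },
  { from: 1, to: 2, weight: 5 },
  { from: 0, to: 2, weight: 3 },
];
console.log(maxSpanningTreeWeight(3, sample)); // 8 (keeps weights 5 and 3)
```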
Minimum spanning trees represent one of the most elegant intersections of theory and practice in algorithm design. The cut property provides the theoretical foundation; Prim's and Kruskal's algorithms provide efficient implementations.
What's Next:
With traversal, shortest paths, and minimum spanning trees covered, the final page previews what's ahead in Chapters 23-24—where you'll master these algorithms through implementation, advanced variations, and challenging applications.
You now understand the two classic MST algorithms. Both Prim and Kruskal exploit the cut property to make locally optimal choices that yield globally optimal spanning trees. The Union-Find data structure is a powerful tool you'll encounter repeatedly in graph algorithms.