In the previous page, we established that Kruskal's algorithm processes edges in order of increasing weight, adding each edge if it doesn't create a cycle. But this greedy approach hinges on a critical preprocessing step: sorting all edges by weight.
This sorting step isn't merely a detail—it's the dominant factor in Kruskal's algorithm's time complexity. Understanding why we sort, how we sort, and what alternatives exist is essential for mastering this algorithm and applying it effectively in practice.
Consider this: if your graph has 1 million edges, sorting them requires approximately 20 million comparison operations (E log E). The subsequent processing—checking for cycles and merging components—takes only about 2-3 million operations with proper Union-Find. The sorting step does 80-90% of the work.
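The arithmetic behind these estimates can be checked directly (a back-of-the-envelope sketch; we take α(V) ≈ 4, a safe upper bound for any practical input size):

```python
import math

E = 1_000_000   # edges
alpha = 4       # inverse Ackermann, effectively <= 4 for any real input

sort_ops = E * math.log2(E)  # comparisons for an O(E log E) sort
uf_ops = E * alpha           # Union-Find work: O(E * alpha(V))

print(f"sorting:    ~{sort_ops / 1e6:.0f} million operations")
print(f"union-find: ~{uf_ops / 1e6:.0f} million operations")
print(f"sorting share: {100 * sort_ops / (sort_ops + uf_ops):.0f}%")
```

With log₂(10⁶) ≈ 20, sorting lands at roughly 20 million comparisons against 4 million Union-Find operations, matching the 80-90% figure above.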
By the end of this page, you will understand why edge sorting is essential for Kruskal's greedy approach, analyze how different sorting algorithms affect overall performance, explore practical considerations for different graph representations, and discover alternative approaches for special cases.
The correctness of Kruskal's algorithm fundamentally depends on processing edges from lightest to heaviest. Let's understand why this ordering is non-negotiable.
The Greedy Choice Property
Kruskal's algorithm is a greedy algorithm—it makes locally optimal choices (selecting the cheapest available safe edge) that lead to a globally optimal solution (the minimum spanning tree). For this greedy approach to work correctly, we need a specific processing order.
Theorem: In Kruskal's algorithm, processing edges in non-decreasing weight order ensures that:
- every edge the algorithm accepts is a safe edge (it belongs to some minimum spanning tree), and
- the final set of accepted edges forms a minimum spanning tree of the graph (or a minimum spanning forest if the graph is disconnected).
Proof Sketch:
When we consider edge e = (u, v), one of two things is true:
Case 1: u and v are in different components. Because edges are processed in non-decreasing order, e is a minimum-weight edge crossing the cut that separates u's component from the rest of the graph. By the cut property, e is safe to add: some MST contains it.
Case 2: u and v are in the same component. Adding e would close a cycle, and every other edge on that cycle was accepted earlier, so each has weight at most w(e). By the cycle property, a maximum-weight edge on a cycle can be discarded without losing any MST, so rejecting e is safe.
Consider this graph:

```text
        A───1───B
        │\     /│
        │ \   / │
      2 │   X   │ 3        diagonals: AD = 1, BC = 1
        │ /   \ │
        │/     \│
        C───1───D

Edges:  AB: 1   AC: 2   AD: 1   BC: 1   BD: 3   CD: 1

True MST: AD(1) + AB(1) + BC(1)  or  AD(1) + AB(1) + CD(1)
Minimum weight: 3

═══════════════════════════════════════════════════════════════════
CORRECT: Processing in sorted order
═══════════════════════════════════════════════════════════════════
Sorted: AD(1), AB(1), BC(1), CD(1), AC(2), BD(3)

1. AD(1): A≠D, add.    Components: {A,D}, {B}, {C}
2. AB(1): A≠B, add.    Components: {A,B,D}, {C}
3. BC(1): B≠C, add.    Components: {A,B,C,D}
4. CD(1): C, D in same component, REJECT
...
MST weight: 1+1+1 = 3 ✓ (optimal!)

═══════════════════════════════════════════════════════════════════
INCORRECT: Processing in arbitrary order (unsorted)
═══════════════════════════════════════════════════════════════════
Arbitrary order: AC(2), BD(3), AD(1), AB(1), BC(1), CD(1)

1. AC(2): A≠C, add.    Components: {A,C}, {B}, {D}
2. BD(3): B≠D, add.    Components: {A,C}, {B,D}
3. AD(1): A≠D, add.    Components: {A,B,C,D}
4. AB(1): A, B in same component, REJECT
...
Spanning tree weight: 2+3+1 = 6 ✗ (NOT optimal!)
```

The greedy approach only works with sorted edges! "Take the cheapest safe edge" produces optimal results only when we globally order edges by weight first. Processing edges in any other order—such as the order they're discovered during traversal or the order they appear in an input file—will likely produce a suboptimal spanning tree.
The choice of sorting algorithm directly impacts Kruskal's overall runtime. Let's analyze the options systematically.
Comparison-Based Sorting: The Standard Approach
For arbitrary edge weights (floating-point, very large integers, etc.), comparison-based sorting is the default. The lower bound for comparison-based sorting is Ω(E log E), and we have several algorithms achieving this:
Merge Sort: guaranteed O(E log E), stable, but requires O(E) auxiliary space.
Quick Sort: O(E log E) on average with excellent constants and O(log E) stack space, but O(E²) in the worst case and not stable.
Heap Sort: guaranteed O(E log E) with O(1) extra space; not stable, and typically slower in practice than quicksort.
Introsort (Hybrid): begins as quicksort and switches to heapsort when recursion depth grows too large (with insertion sort on small ranges), combining quicksort's average speed with heapsort's worst-case guarantee; this is what C++'s std::sort uses.
| Algorithm | Time Complexity | Space | Stable | Best For |
|---|---|---|---|---|
| Merge Sort | O(E log E) | O(E) | Yes | When stability matters, guaranteed performance |
| Quick Sort | O(E log E) avg | O(log E) | No | Average-case performance-critical applications |
| Heap Sort | O(E log E) | O(1) | No | Memory-constrained environments |
| Introsort | O(E log E) | O(log E) | No | Library implementations, general use |
| Tim Sort | O(E log E) | O(E) | Yes | Nearly-sorted data, Python/Java default |
| Radix Sort | O(E × k) | O(E) | Yes | Integer weights with bounded range |
| Counting Sort | O(E + W) | O(W) | Yes | Small weight range, dense distribution |
Non-Comparison Sorting: When Weights Have Special Structure
If edge weights have special properties, we can beat O(E log E):
Counting Sort (when weight range is small):
If all weights are integers in range [0, W], counting sort achieves O(E + W) time. When W = O(E), this is O(E), making the overall Kruskal's algorithm dominated by the Union-Find operations rather than sorting!
Radix Sort (when weights are integers with bounded digits):
For k-digit integers (or integers up to E^k for some constant k), radix sort achieves O(E × k) = O(E) time. Again, this shifts the complexity bottleneck.
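As a sketch of this idea (the function name `radix_sort_edges` and the base choice are illustrative, not from a library), a least-significant-digit radix sort over non-negative integer weights:

```python
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (u, v, non-negative integer weight)

def radix_sort_edges(edges: List[Edge], base: int = 256) -> List[Edge]:
    """LSD radix sort by weight: O(E * k) for k base-`base` digits."""
    if not edges:
        return []
    max_w = max(e[2] for e in edges)
    out = list(edges)
    shift = 1
    while shift <= max_w:
        # One stable counting-sort pass on the current digit
        buckets = [[] for _ in range(base)]
        for e in out:
            buckets[(e[2] // shift) % base].append(e)
        out = [e for bucket in buckets for e in bucket]
        shift *= base
    return out
```

Each pass is stable, so after the final pass edges are in non-decreasing weight order; with weights bounded by E^k, the number of passes is the constant k.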
Bucket Sort (when weights are uniformly distributed):
If weights are uniformly distributed in a known range, bucket sort can achieve O(E) expected time.
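A minimal bucket-sort sketch under that uniformity assumption (weights in [0, 1); `bucket_sort_edges` is an illustrative helper, not a library routine):

```python
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight in [0, 1))

def bucket_sort_edges(edges: List[Edge]) -> List[Edge]:
    """Expected O(E) when weights are i.i.d. uniform in [0, 1)."""
    n = len(edges)
    if n == 0:
        return []
    buckets: List[List[Edge]] = [[] for _ in range(n)]
    for e in edges:
        # min() guards against a weight of exactly 1.0
        buckets[min(int(e[2] * n), n - 1)].append(e)
    out: List[Edge] = []
    for bucket in buckets:
        # Each bucket holds O(1) edges in expectation
        out.extend(sorted(bucket, key=lambda e: e[2]))
    return out
```

If the uniformity assumption fails (e.g., all weights cluster in one bucket), this degrades to the cost of sorting one large bucket, so it is only a win when the distribution is known.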
```python
import heapq
import random
import time
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight)

# ═══════════════════════════════════════════════════════════════════
# APPROACH 1: Standard Library Sort (Recommended for most cases)
# ═══════════════════════════════════════════════════════════════════
def sort_edges_standard(edges: List[Edge]) -> List[Edge]:
    """
    Uses Python's Timsort via sorted().
    Time:  O(E log E)
    Space: O(E) for the new list
    """
    return sorted(edges, key=lambda e: e[2])

# ═══════════════════════════════════════════════════════════════════
# APPROACH 2: In-Place Sort (When memory is constrained)
# ═══════════════════════════════════════════════════════════════════
def sort_edges_inplace(edges: List[Edge]) -> None:
    """
    Sorts edges in-place using list.sort().
    Time:  O(E log E)
    Space: O(1) additional (modifies the original list)
    """
    edges.sort(key=lambda e: e[2])

# ═══════════════════════════════════════════════════════════════════
# APPROACH 3: Counting Sort (When weights are small integers)
# ═══════════════════════════════════════════════════════════════════
def sort_edges_counting(edges: List[Edge], max_weight: int) -> List[Edge]:
    """
    Counting sort for integer weights in [0, max_weight].
    Time:  O(E + max_weight)
    Space: O(E + max_weight)
    Use when max_weight is O(E) or smaller for O(E) sorting!
    """
    # Count occurrences of each weight
    counts = [0] * (max_weight + 1)
    for _, _, w in edges:
        counts[int(w)] += 1

    # Convert to cumulative counts (final positions)
    for i in range(1, len(counts)):
        counts[i] += counts[i - 1]

    # Build sorted output
    output = [None] * len(edges)
    for edge in reversed(edges):  # Reverse for stability
        w = int(edge[2])
        counts[w] -= 1
        output[counts[w]] = edge
    return output

# ═══════════════════════════════════════════════════════════════════
# APPROACH 4: Partial Sort via Heap (When you might terminate early)
# ═══════════════════════════════════════════════════════════════════
def edges_by_weight_lazy(edges: List[Edge]):
    """
    Yields edges in sorted order using a min-heap.

    Building heap:   O(E)
    Each pop:        O(log E)
    Getting k edges: O(E + k log E), which beats O(E log E) when k << E.

    If the MST completes before all edges are seen (common in dense
    graphs), this avoids sorting edges that are never examined.
    """
    heap = [(w, u, v) for u, v, w in edges]
    heapq.heapify(heap)  # O(E)
    while heap:
        w, u, v = heapq.heappop(heap)
        yield (u, v, w)

# ═══════════════════════════════════════════════════════════════════
# DEMO: Comparing approaches
# ═══════════════════════════════════════════════════════════════════
def benchmark_sorting():
    """Compare sorting approaches on random edges."""
    E = 100_000
    edges = [(random.randint(0, 1000), random.randint(0, 1000),
              random.randint(1, 1000)) for _ in range(E)]

    # Approach 1: Standard sort
    start = time.time()
    sorted1 = sort_edges_standard(edges.copy())
    print(f"Standard sort:         {time.time() - start:.4f}s")

    # Approach 2: Counting sort (weights are 1-1000)
    start = time.time()
    sorted2 = sort_edges_counting(edges.copy(), 1000)
    print(f"Counting sort:         {time.time() - start:.4f}s")

    # Approach 3: Lazy heap, extracting only the 200 lightest edges
    start = time.time()
    lazy_gen = edges_by_weight_lazy(edges.copy())
    sorted3 = [next(lazy_gen) for _ in range(200)]
    print(f"Lazy heap (200 edges): {time.time() - start:.4f}s")

    # All approaches must agree on the minimum weight
    assert sorted1[0][2] == sorted2[0][2] == sorted3[0][2]
    print("All approaches produce the correct minimum!")
```

Kruskal's algorithm has time complexity O(E log E), which is often stated as "dominated by sorting." Let's rigorously analyze this claim and understand when it's accurate.
Component-by-Component Analysis:
Step 1: Build Edge List. O(E) from an edge list or adjacency list; O(V²) from an adjacency matrix.
Step 2: Sort Edges. O(E log E) with a comparison-based sort.
Step 3: Initialize Union-Find. O(V) to create V singleton components.
Step 4: Process Sorted Edges. O(E × α(V)): up to 2E Find operations and at most V-1 Union operations.
Total: O(E log E + E × α(V)) = O(E log E)
```text
═══════════════════════════════════════════════════════════════════
KRUSKAL'S ALGORITHM COMPLEXITY BREAKDOWN
═══════════════════════════════════════════════════════════════════

For a graph with V vertices and E edges:

┌────────────────────────────┬────────────────────────────────────┐
│ STEP                       │ TIME COMPLEXITY                    │
├────────────────────────────┼────────────────────────────────────┤
│ Extract edge list          │ O(E), or O(V²) from adj. matrix    │
│ Sort edges                 │ O(E log E)          ← DOMINATES    │
│ Initialize Union-Find      │ O(V)                               │
│ Process E edges            │ O(E × α(V)) ≈ O(E)                 │
│   - E × Find operations    │ O(E × α(V))                        │
│   - ≤ V-1 Union operations │ O(V × α(V))                        │
└────────────────────────────┴────────────────────────────────────┘

Total: O(E log E)

═══════════════════════════════════════════════════════════════════
WHY SORTING DOMINATES
═══════════════════════════════════════════════════════════════════

For a graph with V = 10,000 vertices and E = 100,000 edges:

  Sorting:    E log E  ≈ 100,000 × 17 = 1,700,000 comparisons
  Union-Find: E × α(V) ≈ 100,000 × 4  =   400,000 operations

  Sorting does ~80% of the work!

For a dense graph with V = 1,000 and E ≈ V² = 1,000,000:

  Sorting:    E log E  ≈ 1,000,000 × 20 = 20,000,000 comparisons
  Union-Find: E × α(V) ≈ 1,000,000 × 3  =  3,000,000 operations

  Sorting does ~87% of the work!

═══════════════════════════════════════════════════════════════════
WHEN SORTING DOESN'T DOMINATE
═══════════════════════════════════════════════════════════════════

1. Pre-sorted edges: sorting is O(E) with adaptive algorithms (Timsort)
2. Integer weights:  radix/counting sort gives O(E)
3. Lazy evaluation:  heap gives O(E + k log E) for the k edges needed

In these cases, Union-Find becomes the bottleneck: O(E × α(V))
```

Why O(E log E) = O(E log V):
You'll sometimes see Kruskal's complexity stated as O(E log V) instead of O(E log E). These are equivalent because:
For a simple graph: E ≤ V(V-1)/2 = O(V²)
Therefore: log E ≤ log V² = 2 log V = O(log V)
And since a connected graph needs at least V-1 edges, log E ≥ log(V-1) = Ω(log V)
So log E = Θ(log V), making O(E log E) = O(E log V).
In practice the two bounds are interchangeable for simple connected graphs; O(E log V) is the form most textbooks use, since V is usually treated as the primary size parameter.
When optimizing Kruskal's algorithm for specific use cases, focus on the sorting step first. Using a faster sorting algorithm, exploiting structure in edge weights, or avoiding full sorting (via lazy extraction) can provide significant speedups.
The way your graph is represented affects how efficiently you can extract edges for Kruskal's algorithm. Let's examine the common representations.
Edge List Representation:
This is the ideal representation for Kruskal's algorithm. Edges are already collected—just sort and process.
```python
edges = [(0, 1, 4), (0, 2, 3), (1, 2, 1), ...]
```
Adjacency List Representation:
Common for general graph algorithms. We need to extract edges into a sortable list.
```python
adj = {
    0: [(1, 4), (2, 3)],
    1: [(0, 4), (2, 1)],
    ...
}
```
Adjacency Matrix Representation:
Used for dense graphs. Least efficient for Kruskal's extraction.
```text
matrix[i][j] = weight of edge (i, j), or INF if no edge
```
```python
import math
from typing import Dict, List, Tuple

Edge = Tuple[int, int, float]

# ═══════════════════════════════════════════════════════════════════
# FROM EDGE LIST
# ═══════════════════════════════════════════════════════════════════
def extract_from_edge_list(edge_list: List[Edge]) -> List[Edge]:
    """
    Trivial case: edges already in the format we need.
    Time: O(1) to reference, O(E) to copy
    """
    return edge_list.copy()  # Or use directly if modification is safe

# ═══════════════════════════════════════════════════════════════════
# FROM ADJACENCY LIST (Undirected Graph)
# ═══════════════════════════════════════════════════════════════════
def extract_from_adjacency_list_undirected(
        adj: Dict[int, List[Tuple[int, float]]]) -> List[Edge]:
    """
    Extract edges from an adjacency list.

    In an undirected graph, each edge (u, v) appears twice:
    as v in adj[u] and as u in adj[v], both with weight w.
    We take each edge once by enforcing u < v.

    Time: O(E)
    """
    edges = []
    for u in adj:
        for v, weight in adj[u]:
            if u < v:  # Only take each edge once
                edges.append((u, v, weight))
    return edges

def extract_from_adjacency_list_directed(
        adj: Dict[int, List[Tuple[int, float]]]) -> List[Edge]:
    """For directed graphs, every adjacency entry is a unique edge. O(E)."""
    edges = []
    for u in adj:
        for v, weight in adj[u]:
            edges.append((u, v, weight))
    return edges

# ═══════════════════════════════════════════════════════════════════
# FROM ADJACENCY MATRIX
# ═══════════════════════════════════════════════════════════════════
def extract_from_adjacency_matrix(
        matrix: List[List[float]], V: int,
        undirected: bool = True) -> List[Edge]:
    """
    Extract edges from a V×V adjacency matrix.
    matrix[i][j] = weight of edge i→j, or math.inf / None if no edge.

    Time: O(V²) — must examine the entire matrix.
    Inefficient for sparse graphs; acceptable when E = Θ(V²).
    """
    edges = []
    if undirected:
        # Only examine the upper triangle to avoid duplicates
        for i in range(V):
            for j in range(i + 1, V):
                weight = matrix[i][j]
                if weight is not None and weight != math.inf:
                    edges.append((i, j, weight))
    else:
        # Examine all entries for a directed graph
        for i in range(V):
            for j in range(V):
                if i != j:  # Exclude self-loops
                    weight = matrix[i][j]
                    if weight is not None and weight != math.inf:
                        edges.append((i, j, weight))
    return edges

# ═══════════════════════════════════════════════════════════════════
# DEMONSTRATION
# ═══════════════════════════════════════════════════════════════════
def demo():
    # Example graph:
    #   0 --4-- 1
    #   |\     /|
    #   3 2   1 5
    #   | \   / |
    #   2 --6-- 3

    edge_list = [
        (0, 1, 4), (0, 2, 3), (0, 3, 2),
        (1, 2, 1), (1, 3, 5), (2, 3, 6)
    ]

    adj_list = {
        0: [(1, 4), (2, 3), (3, 2)],
        1: [(0, 4), (2, 1), (3, 5)],
        2: [(0, 3), (1, 1), (3, 6)],
        3: [(0, 2), (1, 5), (2, 6)]
    }

    # Complete graph, so no math.inf entries; 0 on the diagonal
    adj_matrix = [
        [0, 4, 3, 2],
        [4, 0, 1, 5],
        [3, 1, 0, 6],
        [2, 5, 6, 0]
    ]

    from_list = extract_from_edge_list(edge_list)
    from_adj = extract_from_adjacency_list_undirected(adj_list)
    from_matrix = extract_from_adjacency_matrix(adj_matrix, 4)

    print(f"From edge list:  {len(from_list)} edges")
    print(f"From adj list:   {len(from_adj)} edges")
    print(f"From adj matrix: {len(from_matrix)} edges")

    # All should have 6 edges
    assert len(from_list) == len(from_adj) == len(from_matrix) == 6
```

If you know you'll be running Kruskal's algorithm, consider maintaining an edge list alongside your primary graph representation. The O(E) space overhead is usually worthwhile for O(1) extraction versus O(V²) matrix scanning.
An important optimization for Kruskal's algorithm is lazy sorting—not sorting the entire edge list upfront, but instead extracting edges in sorted order as needed using a min-heap.
The Key Insight:
Kruskal's algorithm typically adds V-1 edges to the MST. In a sparse graph where E ≈ V, we process almost all edges anyway. But in a denser graph where E >> V, we might only need the smallest 10-20% of edges before completing the MST. Full sorting does unnecessary work!
Heap-Based Approach: heapify all E edges in O(E), then pop the current minimum in O(log E) per edge, stopping as soon as V-1 edges have been accepted.
When Lazy Sorting Wins: if the MST completes after examining only k edges, the heap approach costs O(E + k log E) versus O(E log E) for a full sort, so it wins asymptotically whenever k << E. In practice, though, the break-even point depends on constants: lazy extraction pays per-pop heap overhead, while full sorting benefits from cache-efficient, highly tuned implementations.
```python
import heapq
import random
import time
from typing import List, Tuple

Edge = Tuple[int, int, float]

class UnionFind:
    """Standard Union-Find with path compression and union by rank."""

    def __init__(self, n: int):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x: int) -> int:
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # Path compression
        return self.parent[x]

    def union(self, x: int, y: int) -> bool:
        """Returns True if x and y were in different components."""
        px, py = self.find(x), self.find(y)
        if px == py:
            return False
        # Union by rank
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True

def kruskal_lazy(vertices: int, edges: List[Edge]) -> Tuple[List[Edge], float]:
    """
    Kruskal's algorithm with lazy edge extraction via min-heap.

    Time Complexity:
      - Heapify: O(E)
      - At most E heap pops: O(E log E) worst case
      - But typically far fewer pops are needed: O(E + k log E)
        for the k edges actually examined.

    For dense graphs (E = Θ(V²)), this can beat full sorting.
    """
    # Build min-heap keyed on weight: (weight, u, v)
    heap = [(w, u, v) for u, v, w in edges]
    heapq.heapify(heap)  # O(E)

    uf = UnionFind(vertices)
    mst_edges = []
    mst_weight = 0.0
    edges_needed = vertices - 1
    edges_examined = 0

    while heap and len(mst_edges) < edges_needed:
        w, u, v = heapq.heappop(heap)  # O(log E)
        edges_examined += 1
        if uf.union(u, v):
            mst_edges.append((u, v, w))
            mst_weight += w

    # Statistics
    if len(mst_edges) == edges_needed:
        print(f"MST found! Examined {edges_examined}/{len(edges)} edges "
              f"({100 * edges_examined / len(edges):.1f}%)")
    else:
        print("Graph is disconnected - no spanning tree exists")
    return mst_edges, mst_weight

def kruskal_standard(vertices: int, edges: List[Edge]) -> Tuple[List[Edge], float]:
    """Standard Kruskal's with full sorting, for comparison."""
    sorted_edges = sorted(edges, key=lambda e: e[2])  # O(E log E)
    uf = UnionFind(vertices)
    mst_edges = []
    mst_weight = 0.0
    for u, v, w in sorted_edges:
        if uf.union(u, v):
            mst_edges.append((u, v, w))
            mst_weight += w
            if len(mst_edges) == vertices - 1:
                break
    return mst_edges, mst_weight

# ═══════════════════════════════════════════════════════════════════
# BENCHMARK: When does lazy sorting help?
# ═══════════════════════════════════════════════════════════════════
def benchmark_lazy_vs_standard():
    """Compare the lazy heap approach vs full sorting."""
    # Dense graph: 25% edge density gives ~V(V-1)/8 ≈ 124,875 edges
    V = 1000
    edges = []
    for i in range(V):
        for j in range(i + 1, V):
            if random.random() < 0.25:
                edges.append((i, j, random.random() * 100))

    print(f"Graph: {V} vertices, {len(edges)} edges")
    print(f"Expected edges in MST: {V - 1}\n")

    # Standard approach (full sort)
    start = time.time()
    mst1, weight1 = kruskal_standard(V, edges)
    time_standard = time.time() - start
    print(f"Standard (full sort): {time_standard:.4f}s")

    # Lazy approach (heap)
    start = time.time()
    mst2, weight2 = kruskal_lazy(V, edges)
    time_lazy = time.time() - start
    print(f"Lazy (heap):          {time_lazy:.4f}s\n")

    print(f"Speedup from lazy: {time_standard / time_lazy:.2f}x")
    print(f"MST weights match: {abs(weight1 - weight2) < 0.001}")

if __name__ == "__main__":
    benchmark_lazy_vs_standard()
```

Use heap-based lazy extraction when: (1) the graph is dense (E >> V), so only a fraction of edges will be examined; (2) memory is tight and you can't afford to store a sorted copy; (3) edges arrive in a stream and you need to process them incrementally. Use full sorting when the graph is sparse, sorting is highly optimized in your environment, or you need all edges in sorted order for other purposes.
In many practical applications, edges aren't arbitrary—they have structure that can be exploited to speed up or skip the sorting step entirely.
Scenario 1: Edges Already Sorted
If edges come from a source that naturally orders them by weight (e.g., a sorted database table, output from another algorithm), we can skip sorting entirely!
Adaptive sorting algorithms like Timsort detect nearly-sorted input and run in O(E) time instead of O(E log E). Python's sorted() and .sort() use Timsort.
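A cheap defensive pattern is to test sortedness in O(E) before sorting (a sketch; `is_sorted_by_weight` and `ensure_sorted` are our own illustrative helpers, not library functions). With Timsort the explicit check buys little, since Timsort already runs in O(E) on sorted input, but it documents the assumption and works with any sort:

```python
from typing import List, Tuple

Edge = Tuple[int, int, float]

def is_sorted_by_weight(edges: List[Edge]) -> bool:
    """O(E) scan: True if weights are already non-decreasing."""
    return all(edges[i][2] <= edges[i + 1][2] for i in range(len(edges) - 1))

def ensure_sorted(edges: List[Edge]) -> List[Edge]:
    # Skip the sort entirely when the input is already ordered
    if is_sorted_by_weight(edges):
        return edges
    return sorted(edges, key=lambda e: e[2])
```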
Scenario 2: Integer Weights with Small Range
If edge weights are integers in range [0, W], use counting sort for O(E + W) time.
Example applications: road networks with distances in whole meters, image-segmentation graphs where weights are 8-bit intensity differences (W = 255), and network links with latencies in whole milliseconds.
Scenario 3: Weights from a Known Distribution
If weights are uniformly distributed in [0, 1], bucket sort can achieve O(E) expected time.
Scenario 4: Geometric Graphs
In Euclidean MST problems (the MST of points in the plane), spatial data structures can help generate edges in approximately sorted order, or prune the edge set significantly before sorting. In particular, the Euclidean MST is always a subgraph of the Delaunay triangulation, which has only O(n) edges, so building the triangulation first reduces the candidate edges from O(n²) to O(n).
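One way to sketch the pruning idea (a heuristic illustration, not an exact Euclidean-MST algorithm: keeping only each point's k nearest neighbors can drop true MST edges when k is small, whereas Delaunay-based methods are exact):

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def knn_candidate_edges(points: List[Point],
                        k: int) -> List[Tuple[int, int, float]]:
    """
    Candidate edges: each point connected to its k nearest neighbors,
    returned sorted by distance. O(n^2 log n) here for simplicity;
    a k-d tree would find neighbors faster.
    """
    edges = set()
    for i, (xi, yi) in enumerate(points):
        dists = sorted(
            (math.hypot(xj - xi, yj - yi), j)
            for j, (xj, yj) in enumerate(points) if j != i
        )
        for d, j in dists[:k]:
            # Normalize to (min, max) so both directions dedupe
            edges.add((min(i, j), max(i, j), d))
    return sorted(edges, key=lambda e: e[2])
```

With k = n-1 the candidate set is all pairs (exact but no savings); small k shrinks the edge set from O(n²) to O(nk) at the cost of possibly missing MST edges.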
```python
from collections import defaultdict
from typing import List, Tuple

Edge = Tuple[int, int, float]

class UnionFind:
    """Standard Union-Find with path compression and union by rank."""

    def __init__(self, n: int):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x: int) -> int:
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x: int, y: int) -> bool:
        px, py = self.find(x), self.find(y)
        if px == py:
            return False
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True

# ═══════════════════════════════════════════════════════════════════
# CASE 1: Integer weights with small range - use Counting Sort
# ═══════════════════════════════════════════════════════════════════
def kruskal_integer_weights(
        vertices: int, edges: List[Edge],
        max_weight: int) -> Tuple[List[Edge], int]:
    """
    Optimized Kruskal's for integer weights in [0, max_weight].

    Instead of an O(E log E) comparison sort, uses O(E + W)
    counting-style bucketing. When W = O(E), sorting becomes O(E)
    and the algorithm is dominated by Union-Find: O(E × α(V)) ≈ O(E).

    Args:
        vertices:   Number of vertices
        edges:      List of (u, v, weight) with integer weights
        max_weight: Maximum possible weight
    """
    # Counting sort: bucket edges by weight
    buckets = defaultdict(list)
    for u, v, w in edges:
        buckets[int(w)].append((u, v, w))

    # Process edges in sorted order by iterating through buckets
    uf = UnionFind(vertices)
    mst_edges = []
    mst_weight = 0
    for weight in range(max_weight + 1):
        for u, v, w in buckets.get(weight, ()):
            if uf.union(u, v):
                mst_edges.append((u, v, w))
                mst_weight += w
                if len(mst_edges) == vertices - 1:
                    return mst_edges, mst_weight
    return mst_edges, mst_weight

# ═══════════════════════════════════════════════════════════════════
# CASE 2: Pre-sorted edges - skip sorting entirely
# ═══════════════════════════════════════════════════════════════════
def kruskal_presorted(
        vertices: int,
        sorted_edges: List[Edge]) -> Tuple[List[Edge], float]:
    """
    Kruskal's when edges are already sorted by weight.
    Time: O(E × α(V)) ≈ O(E). The sorting bottleneck is eliminated!
    """
    uf = UnionFind(vertices)
    mst_edges = []
    mst_weight = 0.0
    for u, v, w in sorted_edges:  # Already in sorted order
        if uf.union(u, v):
            mst_edges.append((u, v, w))
            mst_weight += w
            if len(mst_edges) == vertices - 1:
                break
    return mst_edges, mst_weight

# ═══════════════════════════════════════════════════════════════════
# CASE 3: Grid graph - unit weights
# ═══════════════════════════════════════════════════════════════════
def kruskal_grid_unit_weights(rows: int, cols: int) -> int:
    """
    MST weight of a grid graph where all edges have weight 1.

    Since all weights are equal, no sorting is needed: any spanning
    tree is minimal. An r×c grid has r×c vertices, so the MST has
    r×c - 1 edges and total weight r×c - 1.
    """
    return rows * cols - 1

def kruskal_grid_random_weights(
        rows: int, cols: int,
        horizontal_weights: List[List[float]],
        vertical_weights: List[List[float]]) -> Tuple[List[Edge], float]:
    """
    MST of a grid graph with arbitrary edge weights.

    Grid structure:
      - Horizontal edges: (i,j) -- (i,j+1) for j in 0..cols-2
      - Vertical edges:   (i,j) -- (i+1,j) for i in 0..rows-2
    horizontal_weights[i][j] = weight of edge (i,j)--(i,j+1)
    vertical_weights[i][j]   = weight of edge (i,j)--(i+1,j)
    """
    def vertex_id(r: int, c: int) -> int:
        return r * cols + c

    edges = []
    # Collect horizontal edges
    for i in range(rows):
        for j in range(cols - 1):
            edges.append((vertex_id(i, j), vertex_id(i, j + 1),
                          horizontal_weights[i][j]))
    # Collect vertical edges
    for i in range(rows - 1):
        for j in range(cols):
            edges.append((vertex_id(i, j), vertex_id(i + 1, j),
                          vertical_weights[i][j]))

    # Standard Kruskal's
    uf = UnionFind(rows * cols)
    mst_edges = []
    mst_weight = 0.0
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if uf.union(u, v):
            mst_edges.append((u, v, w))
            mst_weight += w
            if len(mst_edges) == rows * cols - 1:
                break
    return mst_edges, mst_weight
```

Before running Kruskal's algorithm, always ask: do I know anything about the edge weights? Are they pre-sorted? Are they integers in a small range? Are they all equal? Do they come from a geometric source? Exploiting structure can reduce complexity from O(E log E) to O(E) or better.
Based on our analysis, here are practical guidelines for implementing the sorting phase effectively.
Guideline 1: Use Standard Library Sorting
Unless you have a specific reason to optimize, use your language's built-in sort. It's heavily optimized, handles edge cases, and is well-tested.
```python
# Python
edges.sort(key=lambda e: e[2])                    # In-place
sorted_edges = sorted(edges, key=lambda e: e[2])  # New list
```

```javascript
// JavaScript
edges.sort((a, b) => a[2] - b[2]);
```

```cpp
// C++
std::sort(edges.begin(), edges.end(),
          [](const Edge& a, const Edge& b) { return a.weight < b.weight; });
```
Guideline 2: Early Termination
Always check if you've found V-1 edges and terminate early. In dense graphs, this saves significant work.
| Graph Type | Edge Weights | Recommended Strategy |
|---|---|---|
| Sparse (E ≈ V) | Any | Standard library sort |
| Medium (E ≈ V log V) | Any | Standard library sort |
| Dense (E ≈ V²) | Float/arbitrary | Lazy heap extraction |
| Dense (E ≈ V²) | Small integers | Counting sort + iteration |
| Any | Pre-sorted | No sorting needed |
| Any | All equal | No sorting needed (any spanning tree) |
| Geometric (Euclidean) | Distances | Spatial data structures + filtering |
We've thoroughly analyzed the sorting phase of Kruskal's algorithm—the step that typically determines overall performance. Let's consolidate the key insights:
- Correctness requires it: processing edges in non-decreasing weight order is what makes each greedy choice safe.
- Sorting dominates: O(E log E) sorting outweighs the O(E × α(V)) Union-Find work, so it is the first target for optimization.
- Structure beats generality: pre-sorted input, small integer ranges, uniform distributions, or all-equal weights can cut sorting to O(E) or eliminate it.
- Lazy extraction helps dense graphs: heapify in O(E) and pop only the edges you actually need.
What's Next:
With sorted edges in hand, we need a way to determine whether adding an edge would create a cycle. The next page explores the Union-Find data structure—the elegant solution that makes Kruskal's cycle detection nearly constant time per operation.
You now understand the critical role of edge sorting in Kruskal's algorithm. From standard comparison-based sorts to exploiting special structure in edge weights, you have the knowledge to implement and optimize this phase effectively.