Loading content...
In the previous page, we identified a critical weakness in naive Union-Find: tree depth can grow linearly with the number of elements, making Find operations painfully slow. A chain of n elements requires n-1 pointer traversals.
But here's a profound insight: every time we traverse a path to find a root, we're gathering valuable information that we can exploit to speed up future queries.
When we find that elements 5, 4, 3, 2, 1, and 0 all belong to the same set with root 0, why should subsequent queries on element 5 traverse the entire path again? We already know the answer—element 5's root is 0.
Path compression captures this insight. During Find, we restructure the tree to make future Finds faster. It's a form of lazy optimization: we do a little extra work now to save a lot of work later.
This page covers: (1) The intuition behind path compression, (2) Full path compression implementation, (3) Path splitting and path halving variants, (4) Correctness proof—why compression doesn't break Union-Find, (5) Complexity improvements, and (6) Practical trade-offs between variants.
Consider a chain structure where finding element 0 requires traversing through elements 1, 2, 3, 4, and 5 to reach the root:
[5] ← root
|
[4]
|
[3]
|
[2]
|
[1]
|
[0]
When we call Find(0), we traverse: 0 → 1 → 2 → 3 → 4 → 5. At the end, we know that the root is 5. All elements along this path belong to the same set.
The key insight: Since all these elements share the same root, why not update their parent pointers to point directly to the root?
After path compression during Find(0):
[5] ← root
┌──┬──┼──┬──┬──┐
[0][1][2][3][4]
Now every element points directly to the root! The next Find on any of these elements will take O(1) time.
Path compression makes Union-Find a self-adjusting data structure. Each query improves the structure for future queries. Trees naturally flatten over time, making subsequent operations faster. This is reminiscent of splay trees and other self-adjusting structures.
Why does this work?
Path compression exploits a fundamental property of Union-Find: the tree structure is not semantically meaningful. All that matters is which elements share the same root. The specific parent-child relationships don't affect the logical partition—they only affect query performance.
Therefore, we're free to restructure trees however we want, as long as:
Path compression satisfies all three requirements. It only flattens paths within a single tree—it never crosses set boundaries or changes which elements are connected.
The most aggressive form of path compression makes every node on the find path point directly to the root. This can be implemented either recursively or iteratively.
Recursive Implementation:
The recursive version is elegant and naturally embodies path compression through the return value:
12345678910111213141516171819202122232425262728293031323334353637383940414243
class UnionFindPathCompression { private parent: number[]; constructor(n: number) { this.parent = Array.from({ length: n }, (_, i) => i); } /** * Find with full path compression (recursive). * * Every node on the path from x to root gets its parent * updated to point directly to the root. * * Time Complexity: O(depth) for first call * Space Complexity: O(depth) due to recursion stack */ find(x: number): number { // Base case: x is the root if (this.parent[x] === x) { return x; } // Recursive case: // 1. Find the root of the tree // 2. Set x's parent directly to the root // 3. Return the root const root = this.find(this.parent[x]); this.parent[x] = root; // Path compression! return root; } // Union remains unchanged union(x: number, y: number): boolean { const rootX = this.find(x); const rootY = this.find(y); if (rootX === rootY) return false; this.parent[rootY] = rootX; return true; }}Iterative Implementation (Two-Pass):
For environments where deep recursion is problematic, we can implement path compression iteratively with two passes:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556
class UnionFindIterative { private parent: number[]; constructor(n: number) { this.parent = Array.from({ length: n }, (_, i) => i); } /** * Find with full path compression (iterative two-pass). * * Pass 1: Follow path to find the root * Pass 2: Update all nodes on the path to point to root * * Time Complexity: O(depth) * Space Complexity: O(depth) for storing the path */ find(x: number): number { // Pass 1: Collect all nodes on the path to root const path: number[] = []; let current = x; while (this.parent[current] !== current) { path.push(current); current = this.parent[current]; } const root = current; // Pass 2: Update all collected nodes to point to root for (const node of path) { this.parent[node] = root; } return root; } /** * Alternative: Single-pass with deferred updates. * Uses O(1) extra space by traversing twice without storing. */ findSpaceOptimized(x: number): number { // First pass: find root let root = x; while (this.parent[root] !== root) { root = this.parent[root]; } // Second pass: update parents along the path while (this.parent[x] !== root) { const next = this.parent[x]; this.parent[x] = root; x = next; } return root; }}12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758
Initial tree (a chain): [5] ← root | [4] | [3] | [2] | [1] | [0] parent: [1, 2, 3, 4, 5, 5] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Call: find(0) with path compression Step 1 (Recursion): find(0) parent[0] = 1 ≠ 0, so recurse on find(1) Step 2 (Recursion): find(1) parent[1] = 2 ≠ 1, so recurse on find(2) Step 3 (Recursion): find(2) parent[2] = 3 ≠ 2, so recurse on find(3) Step 4 (Recursion): find(3) parent[3] = 4 ≠ 3, so recurse on find(4) Step 5 (Recursion): find(4) parent[4] = 5 ≠ 4, so recurse on find(5) Step 6 (Base case): find(5) parent[5] = 5 → Root found! Return 5 Unwinding and compressing: Step 5 returns: root = 5, set parent[4] = 5Step 4 returns: root = 5, set parent[3] = 5Step 3 returns: root = 5, set parent[2] = 5Step 2 returns: root = 5, set parent[1] = 5Step 1 returns: root = 5, set parent[0] = 5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ After path compression: [5] ← root ┌──┬──┼──┬──┐ [0][1][2][3][4] parent: [5, 5, 5, 5, 5, 5] Future find(0), find(1), find(2), find(3), find(4):All take O(1)! Just one parent lookup to reach root.One Find operation compressed a chain of depth 5 into a star of depth 1. The first Find took 5 steps, but now ALL subsequent Finds take just 1 step. This is the magic of path compression—it amortizes the cost of traversal over future operations.
Full path compression is maximally aggressive—every node ends up pointing to the root. But there are lighter-weight alternatives that achieve nearly the same asymptotic performance with simpler implementations and no recursion.
Path Splitting:
Make every node on the path point to its grandparent (except the root and its child):
12345678910111213141516171819202122232425262728293031323334353637383940
/** * Path Splitting: During find, make every node point to its grandparent. * * This is a one-pass algorithm that roughly halves the path length * each time find is called. * * Advantages: * - No recursion (bounded stack) * - Simple single-pass loop * - Same asymptotic performance as full compression */class UnionFindPathSplitting { private parent: number[]; constructor(n: number) { this.parent = Array.from({ length: n }, (_, i) => i); } find(x: number): number { while (this.parent[x] !== x) { // Save current parent before modifying const next = this.parent[x]; // Point current node to its grandparent this.parent[x] = this.parent[this.parent[x]]; // Move to the original parent (now sibling) x = next; } return x; } union(x: number, y: number): boolean { const rootX = this.find(x); const rootY = this.find(y); if (rootX === rootY) return false; this.parent[rootY] = rootX; return true; }}Path Halving:
Make every other node on the path point to its grandparent:
12345678910111213141516171819202122232425262728293031323334
/** * Path Halving: During find, make every OTHER node point to its grandparent. * * Even simpler than path splitting - only half as many updates. * Same asymptotic complexity as full compression. * * This is often the preferred implementation due to its simplicity. */class UnionFindPathHalving { private parent: number[]; constructor(n: number) { this.parent = Array.from({ length: n }, (_, i) => i); } find(x: number): number { while (this.parent[x] !== x) { // Point to grandparent this.parent[x] = this.parent[this.parent[x]]; // Skip to grandparent (not just parent) x = this.parent[x]; } return x; } union(x: number, y: number): boolean { const rootX = this.find(x); const rootY = this.find(y); if (rootX === rootY) return false; this.parent[rootY] = rootX; return true; }}12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364
Initial chain: [5] → [4] → [3] → [2] → [1] → [0] (all arrows point toward root) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ After find(0) with FULL PATH COMPRESSION: [5] ← root ┌──┬──┼──┬──┐ [0][1][2][3][4] All nodes point directly to root.Maximum flattening in one operation. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ After find(0) with PATH SPLITTING: [5] / \ [4] [3] \ [2] \ [1] \ [0] Wait, that's not right. Let me trace this carefully: Starting: parent = [1, 2, 3, 4, 5, 5]x = 0: parent[0] = parent[parent[0]] = parent[1] = 2 x = 1 (original parent[0])x = 1: parent[1] = parent[parent[1]] = parent[2] = 3 x = 2 (original parent[1])x = 2: parent[2] = parent[parent[2]] = parent[3] = 4 x = 3x = 3: parent[3] = parent[parent[3]] = parent[4] = 5 x = 4x = 4: parent[4] = parent[parent[4]] = parent[5] = 5 x = 5x = 5: parent[5] = 5 → done! Resulting: parent = [2, 3, 4, 5, 5, 5] [5] / | [4][3] | [2] | [1] | [0] Path roughly halved at each step. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ After find(0) with PATH HALVING: Similar effect but with half as many updates.Every other node updated.| Variant | Updates per Find | Recursion Needed? | Passes | Complexity |
|---|---|---|---|---|
| Full Compression (Recursive) | All nodes on path | Yes | 1 (recursive) | O(α(n)) amortized |
| Full Compression (Iterative) | All nodes on path | No | 2 | O(α(n)) amortized |
| Path Splitting | All nodes on path | No | 1 | O(α(n)) amortized |
| Path Halving | Half the nodes | No | 1 | O(α(n)) amortized |
In practice, path halving is often preferred: it's the simplest to implement (just 3 lines in the loop!), uses no extra space, needs no recursion, and achieves the same asymptotic complexity. Unless you have specific requirements, path halving is an excellent default choice.
Path compression modifies the tree structure during Find operations. This might seem dangerous—are we breaking the Union-Find invariants? Let's prove that compression is safe.
Claim: Path compression maintains all Union-Find correctness properties:
Proof:
Formal Argument:
Let T be the tree containing element x with root r. During Find(x) with path compression:
Before: For any node v on path, Find(v) follows chain v → ... → r = r After: For any node v on path, Find(v) = parent[v] = r
The result is the same! The answer to "what set is v in?" hasn't changed—only the path to compute it has shortened.
Invariants after compression:
The tree structure is just an implementation detail—what matters semantically is which elements share a root. Path compression changes the tree but preserves the root relationships. It's a structure-preserving transformation that optimizes performance without affecting correctness.
Path compression alone already provides significant improvements over the naive implementation. Let's analyze its complexity.
Single Operation Analysis:
Amortized Analysis (Informal):
With path compression alone (no union by rank), we can show that a sequence of m operations on n elements takes O(m log n) time. Here's the intuition:
1234567891011121314151617181920212223242526
Consider what happens as we perform operations: 1. Every Find operation shortens paths for future operations2. The "work" done in traversing a path is "paid for" by compression3. A node that gets compressed to depth 1 never needs to be traversed again (for that portion of the path) Accounting argument:━━━━━━━━━━━━━━━━━━━ We can charge each pointer traversal to either:- The Find operation itself (constant charge)- The node being moved closer to root (compression credit) Each node can only be "compressed" O(log n) times before it reaches the root, because:- Compression at least halves the distance to root- Starting distance is at most n- log(n) halvings reach distance 1 Therefore:- n nodes × O(log n) compressions each = O(n log n) total compression work- m Find operations × O(1) base cost = O(m)- Total: O(n log n + m) = O(m log n) for m ≥ n This is already much better than O(mn) for naive implementation!With Union by Rank:
When we combine path compression with union by rank (covered in the next page), something magical happens. The amortized complexity drops to O(α(n)) per operation, where α is the inverse Ackermann function.
The inverse Ackermann function grows extraordinarily slowly:
| n | α(n) |
|---|---|
| 1 | 0 |
| 2 | 1 |
| 4 | 2 |
| 16 | 3 |
| 65,536 | 4 |
| 2^65536 | 5 |
For any practical value of n (up to the number of atoms in the universe), α(n) ≤ 4. This means Union-Find with both optimizations has essentially O(1) amortized time per operation.
Path compression alone: O(log n) amortized per operation. Path compression + union by rank: O(α(n)) ≈ O(1) amortized per operation. Either way, we've transformed a potentially O(n) operation into something essentially constant. This is one of the most remarkable complexity results in data structure theory.
Let's consolidate what we've learned into production-quality implementations with best practices.
Complete Implementation with Path Compression:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293
/** * Union-Find with Path Compression * * This implementation uses path halving for simplicity and efficiency. * Path halving achieves the same asymptotic complexity as full compression * with simpler code and no recursion. * * Time Complexity: O(log n) amortized per operation (with compression only) * Space Complexity: O(n) */class UnionFind { private parent: number[]; private count: number; private size: number[]; constructor(n: number) { if (n <= 0) { throw new Error("Size must be positive"); } this.parent = Array.from({ length: n }, (_, i) => i); this.size = new Array(n).fill(1); this.count = n; } /** * Find root of x with path halving compression. * Each node on the path points to its grandparent after this call. */ find(x: number): number { this.validateIndex(x); while (this.parent[x] !== x) { // Path halving: point to grandparent this.parent[x] = this.parent[this.parent[x]]; x = this.parent[x]; } return x; } /** * Unite the sets containing x and y. * @returns true if union occurred, false if already connected */ union(x: number, y: number): boolean { const rootX = this.find(x); const rootY = this.find(y); if (rootX === rootY) { return false; } // Attach smaller tree under larger (simple heuristic) // Full union by rank will be covered next page if (this.size[rootX] < this.size[rootY]) { this.parent[rootX] = rootY; this.size[rootY] += this.size[rootX]; } else { this.parent[rootY] = rootX; this.size[rootX] += this.size[rootY]; } this.count--; return true; } /** * Check if x and y are in the same set. */ connected(x: number, y: number): boolean { return this.find(x) === this.find(y); } /** * Get the number of disjoint sets. */ getCount(): number { return this.count; } /** * Get the size of the set containing x. */ getSize(x: number): number { return this.size[this.find(x)]; } private validateIndex(x: number): void { if (x < 0 || x >= this.parent.length) { throw new RangeError(`Index ${x} out of bounds [0, ${this.parent.length})`); } }}In path halving, the order of operations matters. Always read parent[parent[x]] BEFORE updating parent[x], or you'll get incorrect results. The update parent[x] = parent[parent[x]] followed by x = parent[x] is correct because we jump to the (new) parent after the update.
Path compression is a beautiful optimization that transforms Union-Find from a potentially slow O(n) per operation to a much faster O(log n) amortized per operation. Let's consolidate the key insights:
What's next:
Path compression optimizes the Find operation. But we still have an issue: the naive Union always attaches one root to another arbitrarily. This can create unbalanced trees where one large tree dominates.
The next page introduces Union by Rank — an optimization for the Union operation that ensures trees stay balanced. When combined with path compression, this yields the remarkable O(α(n)) amortized complexity.
You now deeply understand path compression—the optimization that makes Find fast. You know three variants, can prove correctness, and understand the complexity improvements. Next, we'll learn Union by Rank: the optimization that makes Union smart about which tree goes under which.