Loading learning content...
Path compression optimizes the Find operation by flattening trees during traversal. But there's another source of inefficiency in naive Union-Find: the Union operation itself makes arbitrary choices about which tree to attach under which.
Consider uniting a tree with 1,000 elements under a tree with just 1 element. All 1,000 elements now have their path to root increased by 1. That's 1,000 elements affected by a suboptimal choice. What if we had done it the other way—attached the tiny tree under the large one? Only 1 element would have its path extended.
Union by Rank captures this insight: always attach the smaller or shallower tree under the larger or deeper one. This simple heuristic keeps trees balanced and ensures that tree heights grow logarithmically, not linearly.
This page covers: (1) Why union ordering matters, (2) Union by rank vs union by size, (3) Complete implementation with rank tracking, (4) Proof that rank-based union limits tree height to O(log n), (5) How rank and path compression work together, and (6) When to use which variant.
In naive Union-Find, when we merge two trees, we arbitrarily choose which root becomes the child. Let's see why this matters:
Scenario 1: Bad ordering
Suppose we have:
If we execute parent[A] = B (tree A goes under B):
Scenario 2: Good ordering
If we execute parent[B] = A (tree B goes under A):
The difference is 1000x!
1234567891011121314151617181920212223242526272829303132333435363738394041
Initial state: Tree A (1000 elements): Tree B (1 element): [A] [B] / | \ ... ... ... (1000 elements) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Option 1: Attach A under B (BAD) [B] | [A] / | \ ... ... ... Find for any element in original tree A:Before: depth dAfter: depth d + 1 1000 elements × 1 extra hop = 1000 extra work ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Option 2: Attach B under A (GOOD) [A] / | \ \ ... ... ... [B] Find for element B:Before: depth 0 (B was root)After: depth 1 Only 1 element × 1 extra hop = 1 extra work ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Conclusion: Always attach smaller tree under larger tree!When merging two structures, attach the smaller one to the larger one. This principle appears throughout computer science: weighted quick-union, merge operations in many data structures, and even in distributed systems for cluster merging.
There are two common approaches to balancing unions:
Union by Size (Weighted Union):
Union by Rank:
Both approaches achieve O(log n) tree height, but rank is more commonly used because it works better with path compression.
| Aspect | Union by Size | Union by Rank |
|---|---|---|
| What we track | Number of nodes in tree | Upper bound on tree height |
| Space overhead | O(n) for size array | O(n) for rank array |
| Height guarantee | O(log n) | O(log n) |
| With path compression | Works, but sizes may become inaccurate | Works perfectly—rank is an upper bound |
| Additional features | Can query set sizes | Simpler reasoning about height |
| Tie-breaking | Arbitrary when equal size | Increment winner's rank when equal |
Why Rank Works Better with Path Compression:
Path compression changes tree structure but doesn't change set sizes. So after compression, the "size" stored at a root is still accurate.
However, the meaning differs between size and rank:
After path compression, actual tree heights decrease, but we don't update ranks. This is actually fine! Rank remains a valid upper bound, and the mathematical analysis still holds. If we tried to maintain exact heights after compression, we'd need expensive tree traversals.
The key insight: Rank is a proxy for "how deep this tree could be," not "how deep it actually is." This makes it robust under path compression.
Use union by size when you need to query set sizes frequently—it gives you O(1) size queries at any time. Use union by rank when you only care about connectivity and want the simplest implementation that works optimally with path compression. In practice, both work well.
Let's implement union by rank with clear documentation:
Key Rules:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374
/** * Union-Find with Union by Rank * * Rank is an upper bound on tree height. Trees are kept balanced * by always attaching smaller-rank trees under larger-rank trees. * * Without path compression: O(log n) per operation (guaranteed) * With path compression: O(α(n)) amortized per operation */class UnionFindByRank { private parent: number[]; private rank: number[]; // rank[i] is meaningful only when i is a root private count: number; constructor(n: number) { this.parent = Array.from({ length: n }, (_, i) => i); this.rank = new Array(n).fill(0); // All start at rank 0 this.count = n; } /** * Find with path compression (using path halving). */ find(x: number): number { while (this.parent[x] !== x) { this.parent[x] = this.parent[this.parent[x]]; // Path halving x = this.parent[x]; } return x; } /** * Union by rank. * * Rules: * - If rank[rootX] < rank[rootY]: attach rootX under rootY * - If rank[rootX] > rank[rootY]: attach rootY under rootX * - If ranks are equal: attach rootY under rootX, increment rank[rootX] */ union(x: number, y: number): boolean { const rootX = this.find(x); const rootY = this.find(y); // Already in the same set if (rootX === rootY) { return false; } // Union by rank if (this.rank[rootX] < this.rank[rootY]) { // rootX has smaller rank: attach under rootY this.parent[rootX] = rootY; } else if (this.rank[rootX] > this.rank[rootY]) { // rootY has smaller rank: attach under rootX this.parent[rootY] = rootX; } else { // Equal ranks: arbitrarily choose rootX as new root // Increment its rank since the tree is now taller this.parent[rootY] = rootX; this.rank[rootX]++; } this.count--; return true; } connected(x: number, y: number): boolean { return this.find(x) === this.find(y); } getCount(): number { return this.count; }}Union by Size Implementation:
For completeness, here's the equivalent implementation using size instead of rank:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869
/** * Union-Find with Union by Size (Weighted Union) * * Size tracks the number of elements in each tree. * Smaller trees are attached under larger trees. */class UnionFindBySize { private parent: number[]; private size: number[]; // size[i] is meaningful only when i is a root private count: number; constructor(n: number) { this.parent = Array.from({ length: n }, (_, i) => i); this.size = new Array(n).fill(1); // All start with size 1 this.count = n; } find(x: number): number { while (this.parent[x] !== x) { this.parent[x] = this.parent[this.parent[x]]; x = this.parent[x]; } return x; } /** * Union by size. * * Always attach the tree with fewer elements under * the tree with more elements. */ union(x: number, y: number): boolean { const rootX = this.find(x); const rootY = this.find(y); if (rootX === rootY) { return false; } // Union by size: smaller tree under larger tree if (this.size[rootX] < this.size[rootY]) { this.parent[rootX] = rootY; this.size[rootY] += this.size[rootX]; } else { // When equal, choose rootX (arbitrary but consistent) this.parent[rootY] = rootX; this.size[rootX] += this.size[rootY]; } this.count--; return true; } connected(x: number, y: number): boolean { return this.find(x) === this.find(y); } getCount(): number { return this.count; } /** * Get the size of the set containing x. * This is O(α(n)) due to the find operation. */ getSetSize(x: number): number { return this.size[this.find(x)]; }}Let's trace through a series of unions to see how rank-based balancing keeps trees shallow:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125
Initial state: 8 elements (0-7) parent: [0, 1, 2, 3, 4, 5, 6, 7]rank: [0, 0, 0, 0, 0, 0, 0, 0] [0] [1] [2] [3] [4] [5] [6] [7] R R R R R R R R (R = rank 0, all are roots) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 1: Union(0, 1) find(0) = 0, find(1) = 1 rank[0] = 0, rank[1] = 0 → Equal ranks! Action: parent[1] = 0, rank[0]++ parent: [0, 0, 2, 3, 4, 5, 6, 7]rank: [1, 0, 0, 0, 0, 0, 0, 0] [0] [2] [3] [4] [5] [6] [7] | [1] 0 is rank 1, others are rank 0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 2: Union(2, 3) rank[2] = 0, rank[3] = 0 → Equal ranks! Action: parent[3] = 2, rank[2]++ parent: [0, 0, 2, 2, 4, 5, 6, 7]rank: [1, 0, 1, 0, 0, 0, 0, 0] [0] [2] [4] [5] [6] [7] | | [1] [3] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 3: Union(4, 5) Equal ranks → parent[5] = 4, rank[4]++ parent: [0, 0, 2, 2, 4, 4, 6, 7]rank: [1, 0, 1, 0, 1, 0, 0, 0] [0] [2] [4] [6] [7] | | | [1] [3] [5] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 4: Union(6, 7) Equal ranks → parent[7] = 6, rank[6]++ parent: [0, 0, 2, 2, 4, 4, 6, 6]rank: [1, 0, 1, 0, 1, 0, 1, 0] [0] [2] [4] [6] | | | | [1] [3] [5] [7] Four pairs, all rank 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 5: Union(0, 2) find(0) = 0, find(2) = 2 rank[0] = 1, rank[2] = 1 → Equal ranks! Action: parent[2] = 0, rank[0]++ parent: [0, 0, 0, 2, 4, 4, 6, 6]rank: [2, 0, 1, 0, 1, 0, 1, 0] [0] [4] [6] / \ | | [1] [2] [5] [7] | [3] 0 is now rank 2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 6: Union(4, 6) rank[4] = 1, rank[6] = 1 → Equal ranks! Action: parent[6] = 4, rank[4]++ parent: [0, 0, 0, 2, 4, 4, 4, 6]rank: [2, 0, 1, 0, 2, 0, 1, 0] [0] [4] / \ / \ [1] [2] [5] [6] | | [3] [7] Both 0 and 4 are rank 2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 7: Union(0, 4) rank[0] = 2, rank[4] = 2 → Equal ranks! Action: parent[4] = 0, rank[0]++ parent: [0, 0, 0, 2, 0, 4, 4, 6]rank: [3, 0, 1, 0, 2, 0, 1, 0] [0] ← rank 3 / | \ \ [1] [2] [4] | / \ [3][5][6] | [7] Maximum depth: 3 (path 7 → 6 → 4 → 0)With 8 elements: log₂(8) = 3 ✓ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Key observation:With naive union, we could have depth up to 7.With union by rank, maximum depth is 3 = log₂(8). This is the power of balanced unions!Notice how rank only increases when merging trees of equal rank. This is because attaching a smaller-rank tree under a larger-rank tree doesn't increase the maximum depth. Only when two trees of equal depth merge does the combined tree become one level deeper.
Let's prove that union by rank guarantees tree height (and hence Find complexity) is O(log n). This is a fundamental result in Union-Find analysis.
Theorem: In Union-Find with union by rank, the height of any tree with n nodes is at most log₂(n).
Proof:
We'll prove a stronger statement: A tree with rank r has at least 2^r nodes.
Lemma: If a tree has rank r, it contains at least 2^r nodes.
Proof of Lemma (by induction on r):
Deriving the height bound:
From the lemma:
Since rank is an upper bound on tree height, tree height ≤ log₂(n).
Implications:
1234567891011121314151617181920212223242526272829303132
Consider merging trees with ranks r₁ and r₂ where r₁ < r₂: [root_2] rank r₂ [root_2] rank r₂ (unchanged!) ... ... + → [root_1] rank r₁ [root_1] rank r₁ ... ... (attached under root_2) Height of merged tree = max(r₂, r₁ + 1) Since r₁ < r₂: r₁ + 1 ≤ r₂ So height = max(r₂, r₁ + 1) = r₂ The winning tree's rank (r₂) is already sufficient!No need to increment. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Now consider merging trees with equal ranks r₁ = r₂ = r: [root_1] rank r [root_1] rank r+1 (incremented!) ... ... + → [root_2] rank r [root_2] rank r ... ... (attached under root_1) Height of merged tree = max(r, r + 1) = r + 1 We must increment the winner's rank to maintain the invariant!The logarithmic bound comes from the exponential growth property: a tree can only reach rank r if it resulted from merging two rank-(r-1) trees, each of which needed 2^(r-1) nodes. This forces rank (and height) to grow at most logarithmically.
The real power of Union-Find emerges when we combine both optimizations. Let's examine how they interact.
The Synergy:
Do ranks become inaccurate after path compression?
Yes, but it doesn't matter! Here's why:
12345678910111213141516171819202122232425262728293031
Before path compression: [0] rank 3 / \ [1] [2] / \ [3] [4] | [5] Actual heights: 0→0, 1→1, 2→1, 3→2, 4→2, 5→3Root's rank: 3 (accurate upper bound) After find(5) with full path compression: [0] rank 3 (unchanged!) / | \ \ \ [1][2][4][5] | [3] Actual heights: 0→0, 1→1, 2→1, 3→2, 4→1, 5→1Root's rank: 3 (still valid as upper bound, though conservative) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Key insight: Rank is an UPPER BOUND, not exact height.After compression, rank may overestimate, but that's OK:- It doesn't affect correctness- Union decisions remain valid (we still attach smaller under larger)- The amortized analysis accounts for thisComplete Optimized Implementation:
Here's the production-ready implementation with both optimizations:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879
/** * Union-Find with Path Compression + Union by Rank * * This is the optimal Union-Find implementation achieving * O(α(n)) amortized time per operation, where α is the * inverse Ackermann function (effectively constant). * * Time Complexity: O(α(n)) ≈ O(1) amortized per operation * Space Complexity: O(n) */class UnionFind { private parent: number[]; private rank: number[]; private count: number; constructor(n: number) { if (n <= 0) throw new Error("Size must be positive"); this.parent = Array.from({ length: n }, (_, i) => i); this.rank = new Array(n).fill(0); this.count = n; } /** * Find with path compression (path halving variant). * O(α(n)) amortized. */ find(x: number): number { while (this.parent[x] !== x) { // Path halving: point to grandparent this.parent[x] = this.parent[this.parent[x]]; x = this.parent[x]; } return x; } /** * Union with union by rank. * O(α(n)) amortized. * * @returns true if union occurred, false if already connected */ union(x: number, y: number): boolean { const rootX = this.find(x); const rootY = this.find(y); if (rootX === rootY) return false; // Union by rank: attach lower rank under higher rank if (this.rank[rootX] < this.rank[rootY]) { this.parent[rootX] = rootY; } else if (this.rank[rootX] > this.rank[rootY]) { this.parent[rootY] = rootX; } else { // Equal ranks: choose rootX, increment its rank this.parent[rootY] = rootX; this.rank[rootX]++; } this.count--; return true; } /** * Check if x and y are connected. * O(α(n)) amortized. */ connected(x: number, y: number): boolean { return this.find(x) === this.find(y); } /** * Get number of disjoint sets. * O(1). */ getCount(): number { return this.count; }}This implementation—path compression with union by rank—is what you should use in production. It's simple (under 50 lines), efficient (essentially O(1) per operation), and handles all edge cases correctly. Memorize this pattern!
Different problems require different variations of Union-Find. Let's explore common adaptations:
| Variation | When to Use | Trade-off |
|---|---|---|
| Basic (rank + compression) | Most problems | Best general-purpose choice |
| With size tracking | Need set sizes | Extra O(n) space, O(1) size queries |
| With element tracking | Need to list set members | O(n) space per set, expensive queries |
| Rollback/persistent | Need to undo unions | No path compression, O(log n) per op |
| Map-based | Sparse/dynamic elements | Hash overhead, but flexible keys |
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576
/** * Union-Find tracking both rank (for balanced unions) and * size (for size queries). */class UnionFindWithSize { private parent: number[]; private rank: number[]; private size: number[]; private count: number; constructor(n: number) { this.parent = Array.from({ length: n }, (_, i) => i); this.rank = new Array(n).fill(0); this.size = new Array(n).fill(1); this.count = n; } find(x: number): number { while (this.parent[x] !== x) { this.parent[x] = this.parent[this.parent[x]]; x = this.parent[x]; } return x; } union(x: number, y: number): boolean { const rootX = this.find(x); const rootY = this.find(y); if (rootX === rootY) return false; // Still use rank for balance decisions if (this.rank[rootX] < this.rank[rootY]) { this.parent[rootX] = rootY; this.size[rootY] += this.size[rootX]; } else if (this.rank[rootX] > this.rank[rootY]) { this.parent[rootY] = rootX; this.size[rootX] += this.size[rootY]; } else { this.parent[rootY] = rootX; this.size[rootX] += this.size[rootY]; this.rank[rootX]++; } this.count--; return true; } /** * Get the size of the set containing x. */ getSize(x: number): number { return this.size[this.find(x)]; } /** * Get the largest set size. */ getLargestSetSize(): number { let maxSize = 0; for (let i = 0; i < this.parent.length; i++) { if (this.parent[i] === i) { // It's a root maxSize = Math.max(maxSize, this.size[i]); } } return maxSize; } connected(x: number, y: number): boolean { return this.find(x) === this.find(y); } getCount(): number { return this.count; }}If you need to undo unions (rollback), you cannot use path compression—it makes changes that can't be easily reversed. In that case, use union by rank only, accepting O(log n) per operation. This is common in offline algorithms that process queries and then need to "go back in time."
Union by rank completes our optimization toolkit for Union-Find. Let's consolidate what we've learned:
What's next:
We've covered both major optimizations: path compression and union by rank. The final page brings everything together with a deep analysis of the O(α(n)) amortized complexity—explaining why the inverse Ackermann function appears and what it means for practical performance.
You now understand union by rank—the optimization that keeps Union-Find trees balanced from the start. Combined with path compression, this achieves essentially O(1) per operation. Next, we'll explore the remarkable amortized analysis that proves this result and understand why Union-Find is considered one of the most elegant data structures ever designed.