Loading content...
Union-Find derives its name from its two fundamental operations: Union and Find. Every capability of this data structure—every connectivity query, every set merge, every component count—builds upon these two primitives.
In this page, we'll dissect these operations with surgical precision. We'll trace through their execution step by step, understand their implementation, analyze their complexity, and identify the weaknesses that motivate the optimizations covered in subsequent pages.
By the end, you'll not only understand how these operations work but why they work—and why the naive implementations aren't good enough for serious applications.
This page covers: (1) The Find operation—how we trace element ancestry to identify set membership, (2) The Union operation—how we merge two trees into one, (3) Step-by-step execution traces, (4) Worst-case analysis revealing why naive implementations can be slow, and (5) The foundation for understanding optimizations.
The Find operation answers a simple question: "Which set does element x belong to?"
To answer this, we return the root (representative) of the tree containing x. Since every element in the same set shares the same root, we can determine set membership by comparing roots.
How Find works:
The root is the canonical identifier for the set. Two elements are in the same set if and only if Find returns the same root for both.
12345678910111213141516171819202122232425262728293031323334
class UnionFind { private parent: number[]; constructor(n: number) { // Each element starts as its own root this.parent = Array.from({ length: n }, (_, i) => i); } /** * Find the root (representative) of element x's set. * * Time Complexity: O(d) where d is the depth of x in its tree * In the worst case (unbalanced tree), this is O(n) * * @param x - The element to find the root of * @returns The root (representative) of x's set */ find(x: number): number { // Traverse up the tree until we reach the root // Root is identified by being its own parent: parent[r] = r while (this.parent[x] !== x) { x = this.parent[x]; // Move to parent } return x; // x is now the root } /** * Check if two elements are in the same set. * Simply compares their roots. */ connected(x: number, y: number): boolean { return this.find(x) === this.find(y); }}Execution Trace Example:
Let's trace Find on a concrete tree structure to see exactly what happens.
123456789101112131415161718192021222324252627282930313233343536373839404142
Tree Structure: [0] ← root / | \ [1][2][3] | [4] | [5] parent array: [0, 0, 0, 0, 2, 4] ↑ ↑ ↑ ↑ ↑ ↑ idx: 0 1 2 3 4 5 parent[0] = 0 (root: points to itself)parent[1] = 0 (1's parent is 0)parent[2] = 0 (2's parent is 0)parent[3] = 0 (3's parent is 0)parent[4] = 2 (4's parent is 2)parent[5] = 4 (5's parent is 4) Find(5) execution trace:─────────────────────────Step 1: x = 5 Is parent[5] === 5? → parent[5] = 4 → NO Move to parent: x = parent[5] = 4 Step 2: x = 4 Is parent[4] === 4? → parent[4] = 2 → NO Move to parent: x = parent[4] = 2 Step 3: x = 2 Is parent[2] === 2? → parent[2] = 0 → NO Move to parent: x = parent[2] = 0 Step 4: x = 0 Is parent[0] === 0? → parent[0] = 0 → YES! Return 0 Result: Find(5) = 0 The path traversed: 5 → 4 → 2 → 0Path length: 3 hopsThe time complexity of Find is O(depth of x), where depth is the number of edges from x to its root. In a balanced tree, this might be O(log n). But in an unbalanced tree (a chain), depth can be n-1, making Find O(n). This worst case motivates the path compression optimization we'll study later.
The Union operation merges two sets into one. Given elements x and y, we want to combine their sets so that all elements from both sets are now in a single connected set.
How Union works:
The key insight: we don't need to update every element's parent. We only update one parent pointer—the root of one tree now points to the root of the other. This single pointer change merges entire sets!
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253
class UnionFind { private parent: number[]; private setCount: number; // Track number of disjoint sets constructor(n: number) { this.parent = Array.from({ length: n }, (_, i) => i); this.setCount = n; // Initially n separate sets } find(x: number): number { while (this.parent[x] !== x) { x = this.parent[x]; } return x; } /** * Merge the sets containing x and y. * * Time Complexity: Dominated by two Find calls * - Best case: O(1) if roots are direct * - Worst case: O(n) in unbalanced trees * * @param x - First element * @param y - Second element * @returns true if a merge occurred, false if already connected */ union(x: number, y: number): boolean { const rootX = this.find(x); const rootY = this.find(y); // Already in the same set - nothing to do if (rootX === rootY) { return false; // No merge occurred } // Merge: make rootY point to rootX // (This is an arbitrary choice - we'll optimize this later) this.parent[rootY] = rootX; // One less disjoint set now this.setCount--; return true; // Merge occurred } /** * Get the current number of disjoint sets. */ getSetCount(): number { return this.setCount; }}Execution Trace Example:
Let's trace Union(3, 7) through two separate trees to see exactly how they merge.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455
Before Union(3, 7): Tree A: Tree B: [0] [5] / | \ / \[1][2][3] [6] [7] | [4] parent: [0, 0, 0, 0, 2, 5, 5, 5] ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑idx: 0 1 2 3 4 5 6 7 Set A = {0, 1, 2, 3, 4} with root 0Set B = {5, 6, 7} with root 5 Union(3, 7) execution trace:────────────────────────────── Step 1: Find root of 3 Find(3): 3 → parent[3] = 0 → parent[0] = 0 ✓ rootX = 0 Step 2: Find root of 7 Find(7): 7 → parent[7] = 5 → parent[5] = 5 ✓ rootY = 5 Step 3: Compare roots rootX (0) ≠ rootY (5) Different sets - proceed with merge Step 4: Merge parent[rootY] = rootX parent[5] = 0 After Union(3, 7): [0] ← root of combined set / | \ \ [1][2][3][5] | / \ [4] [6] [7] parent: [0, 0, 0, 0, 2, 0, 5, 5] ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑idx: 0 1 2 3 4 5 6 7 ↑ Changed from 5 to 0! Combined set = {0, 1, 2, 3, 4, 5, 6, 7} Verification:- Find(3) = 0 ✓- Find(7) → 5 → 0 ✓- connected(3, 7) = (0 === 0) = true ✓Notice how we merged 8 elements by changing just ONE parent pointer. This is the magic of Union-Find. We don't need to update every element—just link the roots. The tree structure automatically propagates the connection to all descendants.
Robust implementations handle edge cases gracefully. Let's examine the important edge cases and how Union-Find handles them.
Case 1: Union of an element with itself — Union(x, x)
What happens if we call Union(x, x)?
12345678910111213141516171819
// Union(x, x) - element with itselffunction demonstrateSelfUnion() { const uf = new UnionFind(5); console.log("Before Union(2, 2):"); console.log("setCount:", uf.getSetCount()); // 5 const merged = uf.union(2, 2); console.log("After Union(2, 2):"); console.log("merged:", merged); // false console.log("setCount:", uf.getSetCount()); // 5 (unchanged)} // Explanation:// Find(2) = 2 (it's its own root initially)// Find(2) = 2 (same thing)// rootX === rootY, so no merge occurs// This is correct! Element is already "connected" to itselfCase 2: Union of elements already in the same set
1234567891011121314151617181920212223
// Union(x, y) when x and y are already connectedfunction demonstrateRedundantUnion() { const uf = new UnionFind(5); uf.union(0, 1); // {0, 1} uf.union(1, 2); // {0, 1, 2} console.log("Before redundant Union(0, 2):"); console.log("connected(0, 2):", uf.connected(0, 2)); // true console.log("setCount:", uf.getSetCount()); // 3 const merged = uf.union(0, 2); // Already connected! console.log("After redundant Union(0, 2):"); console.log("merged:", merged); // false console.log("setCount:", uf.getSetCount()); // 3 (unchanged)} // Explanation:// Find(0) = root of {0, 1, 2}// Find(2) = same root// rootX === rootY, so no merge occurs// This idempotency is essential for correctnessCase 3: Multiple sequential unions creating a chain
This case reveals a weakness in the naive implementation:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263
Sequence: Union(0,1), Union(1,2), Union(2,3), Union(3,4) After Union(0,1): parent[1] = 0 [0] | [1] After Union(1,2): Find(1)=0, Find(2)=2, parent[2] = 0 [0] / \ [1][2] After Union(2,3): Find(2)=0, Find(3)=3, parent[3] = 0 [0] / | \ [1][2][3] After Union(3,4): Find(3)=0, Find(4)=4, parent[4] = 0 [0] / | | \ [1][2][3][4] This creates a shallow tree! Find is O(1) for all elements. ───────────────────────────────────────────────────────── BUT consider this sequence instead:Union(1,0), Union(2,1), Union(3,2), Union(4,3)(Note: we're passing arguments in reverse order) After Union(1,0): Find(1)=1, Find(0)=0, parent[0] = 1 [1] | [0] After Union(2,1): Find(2)=2, Find(1)=1, parent[1] = 2 [2] | [1] | [0] After Union(3,2): Find(3)=3, Find(2)=2, parent[2] = 3 [3] | [2] | [1] | [0] After Union(4,3): Find(4)=4, Find(3)=3, parent[3] = 4 [4] | [3] | [2] | [1] | [0] This creates a CHAIN! Find(0) = O(n) = 4 hops!The second sequence creates a linked-list structure where Find(0) requires traversing through 4 elements. With n elements in a chain, Find becomes O(n). This worst case motivates the Union by Rank optimization, which ensures we always attach smaller trees under larger ones.
Let's put together a complete, well-documented naive implementation. This version lacks optimizations but demonstrates all the core concepts clearly.
Why study the naive version?
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114
/** * Naive Union-Find (Disjoint Set Union) Implementation * * This implementation is functionally correct but lacks the * optimizations (path compression, union by rank) needed for * optimal performance at scale. * * Time Complexity: * - Constructor: O(n) * - find: O(n) worst case (chain), O(1) best case (root is direct parent) * - union: O(n) worst case (dominated by two find calls) * - connected: O(n) worst case (dominated by two find calls) * * Space Complexity: O(n) */class NaiveUnionFind { private parent: number[]; private count: number; // Number of disjoint sets private size: number[]; // Size of each tree (rooted at that index) /** * Initialize Union-Find with n elements (0 to n-1). * Each element starts in its own singleton set. */ constructor(n: number) { // Every element is its own parent initially (each is a root) this.parent = Array.from({ length: n }, (_, i) => i); // n elements means n disjoint sets initially this.count = n; // Each tree starts with size 1 this.size = new Array(n).fill(1); } /** * Find the root (representative) of the set containing x. * * Elements in the same set share the same root. * Root is identified by parent[x] === x. */ find(x: number): number { // Validation if (x < 0 || x >= this.parent.length) { throw new Error(`Element ${x} out of bounds`); } // Traverse parent chain until we reach the root while (this.parent[x] !== x) { x = this.parent[x]; } return x; } /** * Merge the sets containing x and y. * * @returns true if a merge occurred, false if already connected */ union(x: number, y: number): boolean { const rootX = this.find(x); const rootY = this.find(y); // Already in the same set if (rootX === rootY) { return false; } // Merge: attach rootY's tree under rootX // (Arbitrary choice - we'll optimize this in union by rank) this.parent[rootY] = rootX; // Update size of the merged tree this.size[rootX] += this.size[rootY]; // One fewer disjoint set this.count--; return true; } /** * Check if x and y are in the same set. */ connected(x: number, y: number): boolean { return this.find(x) === this.find(y); } /** * Get the number of disjoint sets. * Decreases by 1 with each successful union. */ getCount(): number { return this.count; } /** * Get the size of the set containing x. * Returns the total number of elements in x's set. */ getSize(x: number): number { const root = this.find(x); return this.size[root]; } /** * Debug: Print the internal state. */ debug(): void { console.log("parent:", this.parent); console.log("size: ", this.size); console.log("count: ", this.count); }}Usage Examples:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455
function demonstrateNaiveUnionFind() { const uf = new NaiveUnionFind(10); // Elements 0-9 console.log("Initial state:"); console.log("Count:", uf.getCount()); // 10 (each element is its own set) // Build some connections uf.union(0, 1); uf.union(2, 3); uf.union(4, 5); uf.union(6, 7); uf.union(8, 9); console.log("\nAfter 5 unions:"); console.log("Count:", uf.getCount()); // 5 (five pairs) // Query connectivity console.log("connected(0, 1):", uf.connected(0, 1)); // true console.log("connected(0, 2):", uf.connected(0, 2)); // false // Merge two pairs uf.union(1, 3); // Merges {0,1} with {2,3} console.log("\nAfter merging pairs:"); console.log("connected(0, 2):", uf.connected(0, 2)); // true console.log("getSize(0):", uf.getSize(0)); // 4 console.log("Count:", uf.getCount()); // 4 // Create one giant component uf.union(0, 4); uf.union(4, 6); uf.union(6, 8); console.log("\nAfter connecting all:"); console.log("Count:", uf.getCount()); // 1 (all connected) console.log("getSize(5):", uf.getSize(5)); // 10 (all elements)} // Output:// Initial state:// Count: 10//// After 5 unions:// Count: 5// connected(0, 1): true// connected(0, 2): false//// After merging pairs:// connected(0, 2): true// getSize(0): 4// Count: 4//// After connecting all:// Count: 1// getSize(5): 10Let's rigorously analyze the time complexity of our naive implementation. Understanding these bounds is essential for recognizing when optimizations matter.
Find Operation:
The Find operation traverses from an element to its root. The number of steps equals the depth of the element in its tree.
| Case | Tree Structure | Depth | Time |
|---|---|---|---|
| Best | Element is root | 0 | O(1) |
| Average (balanced) | Balanced tree | O(log n) | O(log n) |
| Worst | Chain/linked list | n-1 | O(n) |
Union Operation:
Union performs two Find operations plus one constant-time pointer update. The complexity is dominated by the Finds:
Sequence Analysis:
For a sequence of m operations on n elements:
1234567891011121314151617181920212223242526272829
Consider n elements and this sequence of operations: 1. Union(1, 0) → Creates chain: 0 → 12. Union(2, 1) → Find(2)=O(1), Find(1)=O(1) → 0 → 1 → 23. Union(3, 2) → Find(3)=O(1), Find(2)=O(2) → 0 → 1 → 2 → 34. Union(4, 3) → Find(4)=O(1), Find(3)=O(3) → 0 → 1 → 2 → 3 → 4...n. Union(n-1, n-2) → Find(n-1)=O(1), Find(n-2)=O(n-2) Total time for n-1 unions:= O(1 + 1) + O(1 + 2) + O(1 + 3) + ... + O(1 + (n-2))= O(n) + O(1 + 2 + 3 + ... + (n-2))= O(n) + O(n²/2)= O(n²) Then, m Find operations on element 0:Each Find(0) traverses the entire chain: O(n) eachm × O(n) = O(mn) Total for n unions + m finds: O(n² + mn) = O(n² + mn) Compare to optimal Union-Find (with optimizations):Same sequence: O(n + m·α(n)) ≈ O(n + m) where α(n) ≤ 4 For n = 1,000,000 and m = 1,000,000:Naive: O(10¹² + 10¹²) = O(2 × 10¹²) operationsOptimal: O(10⁶ + 10⁶) ≈ O(2 × 10⁶) operations That's about ONE MILLION times slower!The O(n) per operation in the worst case makes naive Union-Find impractical for large datasets. With 1 million elements and 1 million operations, you could be waiting days instead of seconds. This isn't theoretical—it happens in practice when union order creates chains.
Let's trace a comprehensive example showing how trees evolve through a sequence of operations. This visual understanding is crucial for grasping both the structure and the problems with naive Union-Find.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128
Initial State: 8 elements, each in its own set━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ [0] [1] [2] [3] [4] [5] [6] [7] parent: [0, 1, 2, 3, 4, 5, 6, 7]Sets: {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}}Count: 8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 1: Union(0, 1)Find(0) = 0, Find(1) = 1Different roots → parent[1] = 0 [0] | [1] [0] [2] [3] [4] [5] [6] [7] |[1] parent: [0, 0, 2, 3, 4, 5, 6, 7]Sets: {{0,1}, {2}, {3}, {4}, {5}, {6}, {7}}Count: 7 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 2: Union(2, 3)Find(2) = 2, Find(3) = 3Different roots → parent[3] = 2 [0] [2] [4] [5] [6] [7] | |[1] [3] parent: [0, 0, 2, 2, 4, 5, 6, 7]Sets: {{0,1}, {2,3}, {4}, {5}, {6}, {7}}Count: 6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 3: Union(4, 5) [0] [2] [4] [6] [7] | | |[1] [3] [5] parent: [0, 0, 2, 2, 4, 4, 6, 7]Count: 5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 4: Union(6, 7) [0] [2] [4] [6] | | | |[1] [3] [5] [7] parent: [0, 0, 2, 2, 4, 4, 6, 6]Count: 4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 5: Union(1, 3)Find(1) = 1 → 0Find(3) = 3 → 2Different roots → parent[2] = 0 [0] / | [1][2] | [3] [0] [4] [6] / | | | [1][2] [5] [7] | [3] parent: [0, 0, 0, 2, 4, 4, 6, 6]Sets: {{0,1,2,3}, {4,5}, {6,7}}Count: 3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 6: Union(5, 7)Find(5) = 5 → 4Find(7) = 7 → 6Different roots → parent[6] = 4 [0] [4] / | / | [1][2] [5][6] | | [3] [7] parent: [0, 0, 0, 2, 4, 4, 4, 6]Sets: {{0,1,2,3}, {4,5,6,7}}Count: 2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Operation 7: Union(0, 4)Find(0) = 0Find(4) = 4Different roots → parent[4] = 0 [0] / | \ \ [1] [2] [4] | / \ [3][5][6] | [7] parent: [0, 0, 0, 2, 0, 4, 4, 6]Sets: {{0,1,2,3,4,5,6,7}}Count: 1 All elements now connected! Query: connected(3, 7)?Find(3): 3 → 2 → 0 (2 hops)Find(7): 7 → 6 → 4 → 0 (3 hops)Both return 0 → YES, connected!Notice how the trees developed. Some branches are deeper than others (path from 7 to root is 3 hops). In this example, the trees stayed reasonably balanced by luck. But without explicit balancing, chains easily form. The optimizations we'll study ensure trees stay shallow regardless of union order.
We've thoroughly explored the core operations of Union-Find. Let's consolidate what we've learned:
The Need for Optimization:
We've identified two key problems with naive Union-Find:
The next two pages address these issues with two powerful optimizations:
Together, these optimizations achieve nearly constant time per operation—one of the most remarkable results in data structure design.
You now thoroughly understand the Union and Find operations—how they work, their execution traces, edge cases, and complexity analysis. You've seen why the naive implementation struggles with degenerate cases. Next, we'll learn Path Compression: a simple yet powerful optimization that dramatically reduces tree depth.