You've inherited a BST that's become terribly unbalanced—perhaps from a sequence of unfortunate insertions, or from deletions that left the tree lopsided. Operations that should be O(log n) are now O(n). You need to restore balance.
But you don't want to rebuild the tree from scratch if you can avoid it. Can we rebalance an existing BST efficiently?
The answer reveals a beautiful connection between the patterns we've learned: inorder traversal and sorted-array-to-BST construction combine to solve this problem elegantly.
By the end of this page, you will understand when and why BSTs become unbalanced, master the flatten-and-rebuild approach to rebalancing, analyze the trade-offs between periodic rebalancing and self-balancing trees, and recognize how this pattern integrates with practical tree maintenance strategies.
A BST becomes unbalanced when the heights of its left and right subtrees diverge significantly. This typically happens when values are inserted in sorted (or nearly sorted) order, or when repeated deletions remove nodes predominantly from one side of the tree.
Measuring Imbalance:
We can quantify imbalance using the balance factor at each node:
balance_factor(node) = height(node.left) - height(node.right)
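The formula above can be computed with a small helper. This sketch assumes a minimal `TreeNode` class and a `height` that counts nodes (0 for an empty subtree), matching the definitions used later on this page:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def height(node):
    """Height in nodes: 0 for an empty subtree."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    """Positive means left-heavy, negative means right-heavy."""
    return height(node.left) - height(node.right)

# A right-skewed chain: 1 -> 2 -> 3
skewed = TreeNode(1, None, TreeNode(2, None, TreeNode(3)))
print(balance_factor(skewed))  # -2: heavily right-leaning
```

A node with balance factor outside {-1, 0, 1} is considered unbalanced at that point.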
The worst case is a completely skewed tree where one subtree is empty at every level.
The same five elements, inserted in different orders, can produce anything from a height-5 chain to a height-3 balanced tree: searching the chain is O(n), while searching the balanced tree is O(log n).
For 1 million elements: O(1,000,000) vs O(20). That's a 50,000x difference!
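To make the contrast concrete, the sketch below inserts the same values in sorted versus middle-first order and compares the resulting heights. The `insert` helper is illustrative (a plain BST insert with no rebalancing), not code from this page:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def insert(root, val):
    """Plain BST insert with no rebalancing."""
    if root is None:
        return TreeNode(val)
    if val < root.val:
        root.left = insert(root.left, val)
    else:
        root.right = insert(root.right, val)
    return root

def height(n):
    return 0 if n is None else 1 + max(height(n.left), height(n.right))

skewed = None
for v in [1, 2, 3, 4, 5]:       # sorted order -> degenerate chain
    skewed = insert(skewed, v)

balanced = None
for v in [3, 1, 4, 2, 5]:       # middle-first order -> bushy tree
    balanced = insert(balanced, v)

print(height(skewed), height(balanced))  # 5 3
```

Same five keys; only the insertion order differs.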
The most straightforward approach to rebalancing combines two operations we already know:
This is essentially "serialize then deserialize with optimal structure."
Why This Works: inorder traversal of any BST yields its values in sorted order, and rebuilding from the middle of a sorted array guarantees that the left and right halves differ in size by at most one at every level, producing minimal height.
The Algorithm:
1. Perform inorder traversal to collect values → sorted array
2. Apply sorted_array_to_bst on the array → balanced BST
3. Return the new balanced root
This is a beautiful example of pattern composition: the inorder traversal pattern (Module 8) combined with the sorted-array-to-BST pattern (Page 4 of this module) solves the rebalancing problem. Master individual patterns, then combine them for complex problems.
```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def balance_bst(root):
    """
    Convert any BST to a height-balanced BST.

    Strategy: Flatten via inorder, rebuild via divide-and-conquer.

    Time Complexity: O(n) - O(n) for inorder + O(n) for rebuild
    Space Complexity: O(n) - for storing the sorted array

    Args:
        root: Root of the potentially unbalanced BST

    Returns:
        Root of a height-balanced BST with the same elements
    """
    # Step 1: Flatten to sorted array via inorder traversal
    def inorder(node, result):
        if node is None:
            return
        inorder(node.left, result)
        result.append(node.val)
        inorder(node.right, result)

    sorted_values = []
    inorder(root, sorted_values)

    # Step 2: Rebuild as balanced BST
    def build(left, right):
        if left > right:
            return None
        mid = (left + right) // 2
        node = TreeNode(sorted_values[mid])
        node.left = build(left, mid - 1)
        node.right = build(mid + 1, right)
        return node

    if not sorted_values:
        return None
    return build(0, len(sorted_values) - 1)


def balance_bst_reuse_nodes(root):
    """
    Balance BST while reusing existing node objects.

    More memory-efficient: doesn't create new nodes, just rearranges
    pointers. Useful when nodes have additional data or when memory
    is constrained.

    Time: O(n)
    Space: O(n) for node array (but no new node allocations)
    """
    # Step 1: Collect all nodes via inorder traversal
    def inorder(node, nodes):
        if node is None:
            return
        inorder(node.left, nodes)
        nodes.append(node)
        inorder(node.right, nodes)

    nodes = []
    inorder(root, nodes)

    # Step 2: Rebuild by rearranging pointers
    def build(left, right):
        if left > right:
            return None
        mid = (left + right) // 2
        node = nodes[mid]
        # Clear and reset pointers
        node.left = build(left, mid - 1)
        node.right = build(mid + 1, right)
        return node

    if not nodes:
        return None
    return build(0, len(nodes) - 1)


def balance_bst_iterative(root):
    """
    Iterative version using explicit stacks.

    Avoids recursion depth issues for very deep trees.
    """
    # Iterative inorder traversal
    sorted_values = []
    stack = []
    current = root
    while stack or current:
        while current:
            stack.append(current)
            current = current.left
        current = stack.pop()
        sorted_values.append(current.val)
        current = current.right

    if not sorted_values:
        return None

    # Iterative tree building using a queue of index ranges
    from collections import deque
    n = len(sorted_values)
    mid = (0 + n - 1) // 2
    new_root = TreeNode(sorted_values[mid])

    # Queue entries: (parent, is_left_child, left_idx, right_idx)
    queue = deque()
    queue.append((new_root, True, 0, mid - 1))       # Left child range
    queue.append((new_root, False, mid + 1, n - 1))  # Right child range

    while queue:
        parent, is_left, left, right = queue.popleft()
        if left > right:
            continue
        mid = (left + right) // 2
        child = TreeNode(sorted_values[mid])
        if is_left:
            parent.left = child
        else:
            parent.right = child
        queue.append((child, True, left, mid - 1))
        queue.append((child, False, mid + 1, right))

    return new_root
```

| Phase | Time | Space | Notes |
|---|---|---|---|
| Inorder Traversal | O(n) | O(h) recursion + O(n) result | Visit all n nodes once |
| Array Storage | O(n) | Store all n values | |
| Divide-and-Conquer Build | O(n) | O(log n) recursion | Each element placed once |
| Total | O(n) | O(n) | Dominated by array storage |
Time Complexity: O(n)
Space Complexity: O(n)
The node-reuse variant saves memory by not allocating new nodes, but still requires O(n) for the node array. It's useful when nodes contain large payloads beyond just the key.
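As a sanity check on the flatten-and-rebuild approach: rebalancing a 15-node chain should shrink its height from 15 to 4. The sketch below inlines a compact version of the routine so it runs standalone:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def balance_bst(root):
    """Flatten via inorder, rebuild via divide-and-conquer."""
    vals = []
    def inorder(n):
        if n:
            inorder(n.left)
            vals.append(n.val)
            inorder(n.right)
    inorder(root)

    def build(lo, hi):
        if lo > hi:
            return None
        mid = (lo + hi) // 2
        node = TreeNode(vals[mid])
        node.left = build(lo, mid - 1)
        node.right = build(mid + 1, hi)
        return node
    return build(0, len(vals) - 1)

def height(n):
    return 0 if n is None else 1 + max(height(n.left), height(n.right))

# Worst case: a right-skewed chain 1 -> 2 -> ... -> 15
root = None
for v in range(15, 0, -1):
    root = TreeNode(v, None, root)

balanced = balance_bst(root)
print(height(root), height(balanced))  # 15 4
```

Fifteen nodes fit exactly in a perfect tree of height 4, the minimum possible.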
Can we rebalance in O(n) time with O(1) extra space? Not with this approach. Some advanced techniques (like Day-Stout-Warren algorithm) can rebalance in-place, but they're more complex. For most purposes, O(n) space is acceptable for a periodic rebalancing operation.
For scenarios where O(n) extra space is prohibitive, the Day-Stout-Warren (DSW) algorithm achieves O(n) time with O(1) extra space.
High-Level Strategy: first convert the tree into a "vine" (a completely right-skewed chain) using right rotations, then fold the vine into a balanced tree with a series of left-rotation compression passes.
This operates entirely through pointer manipulation—no arrays needed.
```python
def balance_bst_dsw(root):
    """
    Day-Stout-Warren algorithm for in-place BST balancing.

    Two phases:
        1. Tree -> Vine (right-skewed backbone)
        2. Vine -> Balanced Tree (via rotations)

    Time Complexity: O(n)
    Space Complexity: O(1) extra (only pointer manipulations)
    """
    # Create a dummy root to simplify edge cases
    dummy = TreeNode(0)
    dummy.right = root

    # Phase 1: Convert tree to vine (right-skewed tree)
    def tree_to_vine(root):
        """
        Convert tree to vine using right rotations.
        Keep rotating left children to the right until
        no left children remain.
        """
        count = 0
        current = root.right
        while current:
            if current.left:
                # Right rotation
                old_current = current
                current = current.left
                old_current.left = current.right
                current.right = old_current
                root.right = current
            else:
                # Move down to next node
                count += 1
                root = current
                current = current.right
        return count

    # Phase 2: Convert vine to balanced tree
    def vine_to_tree(root, n):
        """
        Convert vine to balanced tree using left rotations.
        Perform compression passes to fold the vine.
        """
        # For a balanced tree with n nodes, the number of "extra"
        # nodes beyond a perfect tree is:
        #     n - (2^floor(log2(n+1)) - 1)
        import math
        m = 2 ** int(math.log2(n + 1)) - 1

        # First compression: handle the extra nodes
        compress(root, n - m)

        # Subsequent compressions: halve the vine length each time
        while m > 1:
            m //= 2
            compress(root, m)

    def compress(root, count):
        """Perform 'count' left rotations starting from root."""
        scanner = root
        for _ in range(count):
            child = scanner.right
            scanner.right = child.right
            scanner = scanner.right
            child.right = scanner.left
            scanner.left = child

    # Execute the algorithm
    n = tree_to_vine(dummy)
    vine_to_tree(dummy, n)
    return dummy.right


def right_rotate(node):
    """
    Perform right rotation around node.

            node               left_child
           /    \               /      \
    left_child   R    →       LL       node
       /  \                           /    \
      LL   LR                        LR     R
    """
    left_child = node.left
    node.left = left_child.right
    left_child.right = node
    return left_child


def left_rotate(node):
    """
    Perform left rotation around node.

        node                   right_child
       /    \                   /       \
      L   right_child  →      node       RR
           /    \            /    \
          RL     RR         L      RL
    """
    right_child = node.right
    node.right = right_child.left
    right_child.left = node
    return right_child
```

While DSW achieves O(1) space, it's significantly more complex to implement and understand. For most practical scenarios, the flatten-and-rebuild approach with O(n) space is preferred for clarity and maintainability. DSW is a specialized technique for memory-constrained environments.
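Rotations are the primitive DSW is built on, and the property that makes the whole algorithm correct is that a rotation never changes the inorder sequence. A minimal check of that invariant:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def right_rotate(node):
    """Lift node.left into node's position."""
    left_child = node.left
    node.left = left_child.right
    left_child.right = node
    return left_child

def inorder(node):
    return inorder(node.left) + [node.val] + inorder(node.right) if node else []

#     2              1
#    / \      →       \
#   1   3              2
#                       \
#                        3
root = TreeNode(2, TreeNode(1), TreeNode(3))
before = inorder(root)
root = right_rotate(root)
print(before == inorder(root))  # True: rotation preserved sorted order
```

Because inorder order is preserved at every step, both DSW phases leave the BST property intact.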
Rebalancing has a cost—O(n) time makes it unsuitable after every operation. When should you trigger a rebalance?
| Strategy | Best For | Drawback |
|---|---|---|
| Post-bulk rebalance | Data loading, migrations | Costly if bulk ops are rare |
| Periodic | Predictable workloads | May rebalance unnecessarily |
| Threshold-based | Variable workloads | Requires monitoring logic |
| Self-balancing trees | Real-time balance requirements | More complex data structure |
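A threshold-based trigger can be sketched as follows: rebalance only when the actual height exceeds the optimal height by some factor. The factor of 2 and the helper names here are illustrative choices, not prescribed by the text:

```python
import math

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def height(n):
    return 0 if n is None else 1 + max(height(n.left), height(n.right))

def count(n):
    return 0 if n is None else 1 + count(n.left) + count(n.right)

def needs_rebalance(root, factor=2.0):
    """True when height exceeds factor * ceil(log2(n + 1))."""
    n = count(root)
    if n <= 2:
        return False
    return height(root) > factor * math.ceil(math.log2(n + 1))

# A 7-node chain has height 7; the optimal height is 3
chain = None
for v in range(7, 0, -1):
    chain = TreeNode(v, None, chain)
print(needs_rebalance(chain))  # True: 7 > 2 * 3
```

In a real implementation you would maintain the node count incrementally rather than recounting, so the check itself stays cheap.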
In production, use self-balancing trees (like std::set in C++, TreeSet in Java) for data structures that need consistent O(log n) performance. Manual rebalancing is useful for simpler BST implementations or one-time restructuring operations.
Why not just use self-balancing trees that maintain balance automatically?
Self-balancing BSTs (AVL, Red-Black) perform rotations after every insert/delete to maintain O(log n) height. Let's compare:
| Aspect | Plain BST + Manual Rebalance | Self-Balancing (AVL/RB) |
|---|---|---|
| Insert complexity | O(h), potentially O(n) | O(log n) guaranteed |
| Search complexity | O(h), degrades over time | O(log n) always |
| Rebalance cost | O(n) periodic | O(1) to O(log n) per operation |
| Implementation complexity | Simple BST + one rebalance function | Complex rotation logic |
| Worst-case between rebalances | O(n) | O(log n) |
| Memory overhead | 3 pointers per node | 3-4 pointers + balance info |
When Plain BST + Rebalancing Makes Sense: bulk-load-then-query workloads, educational or prototype implementations, and one-time restructuring after data migrations.
When Self-Balancing Trees Are Better: workloads with continuous interleaved inserts and deletes, real-time systems that cannot tolerate O(n) operations between rebalances, and production code where library implementations already exist.
Sometimes you don't need to rebalance the entire tree—just a subtree that's become problematic. Partial rebalancing applies the flatten-and-rebuild strategy to a specific subtree.
```python
def rebalance_subtree(root, target_value):
    """
    Rebalance only the subtree rooted at the node containing target_value.

    Useful when:
    - Only a specific region of the tree is unbalanced
    - Full rebalance is too expensive
    - You know which subtree is problematic

    Returns the new root (may change if target was the root).
    """
    # Find the target node and its parent
    parent = None
    current = root
    is_left_child = False
    while current and current.val != target_value:
        parent = current
        if target_value < current.val:
            current = current.left
            is_left_child = True
        else:
            current = current.right
            is_left_child = False

    if current is None:
        # Target not found
        return root

    # Rebalance the subtree rooted at current
    balanced_subtree = balance_bst(current)

    # Attach the rebalanced subtree back
    if parent is None:
        # Target was the root
        return balanced_subtree
    elif is_left_child:
        parent.left = balanced_subtree
    else:
        parent.right = balanced_subtree
    return root


def find_unbalanced_subtree(root):
    """
    Find the first (shallowest) unbalanced subtree root.
    Returns (node, parent, is_left_child) for the unbalanced subtree.
    """
    def height(node):
        if node is None:
            return 0
        return 1 + max(height(node.left), height(node.right))

    def find(node, parent, is_left):
        if node is None:
            return None
        left_height = height(node.left)
        right_height = height(node.right)
        if abs(left_height - right_height) > 1:
            return (node, parent, is_left)
        # Check left subtree
        result = find(node.left, node, True)
        if result:
            return result
        # Check right subtree
        return find(node.right, node, False)

    return find(root, None, False)


def rebalance_worst_subtree(root):
    """
    Find and rebalance the shallowest unbalanced subtree.
    """
    result = find_unbalanced_subtree(root)
    if result is None:
        # Tree is already balanced
        return root

    unbalanced_node, parent, is_left = result
    balanced_subtree = balance_bst(unbalanced_node)

    if parent is None:
        return balanced_subtree
    elif is_left:
        parent.left = balanced_subtree
    else:
        parent.right = balanced_subtree
    return root
```

Advantages of Partial Rebalancing: the cost is O(k) for a subtree of k nodes rather than O(n) for the whole tree, and nodes outside the target subtree are left untouched.
Caveats: a balanced subtree does not make the overall tree balanced, since ancestors above it may still be skewed; and find_unbalanced_subtree recomputes heights at every node, which can cost up to O(n^2) on a badly skewed tree (cache subtree heights if you call it repeatedly).
BST rebalancing appears in several practical scenarios:
Example: MongoDB Index Rebuilding
MongoDB uses B-trees for indexing. The reIndex() command drops all of a collection's indexes and rebuilds them from the collection's data.
This is exactly the flatten-and-rebuild pattern applied at scale.
Rebalancing existing BSTs restores optimal O(log n) height from degraded structures. The key insights: inorder traversal flattens any BST into a sorted array in O(n); divide-and-conquer rebuilding restores minimal height in O(n); DSW trades implementation complexity for O(1) extra space; and the rebalancing trigger (post-bulk, periodic, threshold-based, or a self-balancing tree) should match the workload.

Module Summary:

This module covered the essential BST patterns that form a problem-solving toolkit: validation, bounded searches, range operations, optimal construction from sorted arrays, and rebalancing.
These patterns appear repeatedly in interviews and production systems. Mastering them gives you the vocabulary and tools to tackle tree problems with confidence.
Congratulations! You've completed the Common BST Patterns module. You now have a comprehensive toolkit for BST problem-solving: validation, bounded searches, range operations, optimal construction, and rebalancing. These patterns form the foundation for advanced tree topics like self-balancing trees and specialized tree structures.