Imagine you have a BST containing 100 million user records ordered by timestamp. You need to process them in chronological order, but you can't load all 100 million records into memory simultaneously.
The naive approach—collect entire inorder traversal into a list, then iterate—would require O(n) extra space. For massive datasets, this is simply not feasible.
This page introduces BST iterators: mechanisms that allow you to traverse a BST in sorted order while maintaining only O(h) space, processing one element at a time. This is the power of lazy evaluation applied to tree traversal.
By the end of this page, you will understand how to implement BST iterators for memory-efficient sorted traversal. You'll master both forward and reverse iterators, learn the connection to controlled/resumable recursion, and see real-world applications in databases, stream processing, and merge algorithms.
Our earlier inorder traversal implementations had a hidden cost: they materialize the entire sorted sequence before processing it.
```python
def get_sorted_elements(root):
    result = []                     # ← This grows to O(n) size

    def inorder(node):
        if node is None:
            return
        inorder(node.left)
        result.append(node.val)     # ← Stores every element
        inorder(node.right)

    inorder(root)
    return result                   # ← Returns entire list
```
| Approach | Time | Space | Memory for 1M nodes |
|---|---|---|---|
| Materialize to list | O(n) | O(n) | ~400 MB (assuming 400 bytes/entry) |
| Process during recursion | O(n) | O(h) | ~1.6 KB (balanced, h ≈ 20) |
| Iterator-based | O(n) total | O(h) | ~1.6 KB (balanced, h ≈ 20) |
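The table's middle row, "process during recursion," has no code of its own in this section, so here is a minimal illustrative sketch. It assumes the same node objects with `val`, `left`, and `right` fields used throughout this page; the `for_each_sorted` and `visit` names are not from the original text.

```python
# Illustrative sketch (assumed helper): process each value via a callback
# during the traversal instead of collecting everything into a list.
def for_each_sorted(root, visit):
    """Call visit(val) on every value in ascending order using O(h) extra space."""
    if root is None:
        return
    for_each_sorted(root.left, visit)
    visit(root.val)                  # Handle one element, retain nothing
    for_each_sorted(root.right, visit)

# Usage sketch: for_each_sorted(root, print) prints the values in sorted order.
```

The limitation, and the reason iterators exist, is that the callback style keeps control inside the traversal; an iterator hands control back to the caller between elements.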
When does this matter?
It's easy to write code that collects results into a list 'for convenience.' In interviews, this is often acceptable. In production systems processing large datasets, it can mean the difference between a working system and one that crashes with OutOfMemoryError.
A BST Iterator is an object that:
- Returns the elements of the BST one at a time, in ascending (sorted) order
- Keeps only O(h) state between calls instead of materializing the full sequence
- Spreads the traversal work across every next() call over the entire traversal

This is sometimes called a controlled recursion pattern. Instead of letting the call stack drive traversal (which happens all at once), we manually control when to advance.
The interface:
BSTIterator(root) → Construct iterator, initialize to first element
next() → Return next smallest element
hasNext() → Return true if more elements remain
With this interface, you can iterate in a standard loop:
```python
iterator = BSTIterator(root)
while iterator.hasNext():
    val = iterator.next()
    process(val)  # Process one element at a time
```
BST Iterator is a classic interview problem (LeetCode 173). Understanding it deeply prepares you not just for that specific problem, but for a whole class of iterator and lazy evaluation patterns.
The key insight is that inorder traversal can be "paused" and "resumed" if we maintain the right state. In recursive traversal, the call stack implicitly tracks which nodes we've seen and which we haven't. For an iterator, we make this explicit using a stack.
The strategy:

1. In the constructor, push the root and all of its left descendants onto a stack. The top of the stack is then the smallest element.
2. In next(), pop the top of the stack and return its value; if the popped node has a right child, push that child and all of its left descendants.
3. In hasNext(), simply check whether the stack is non-empty.
```python
class BSTIterator:
    """
    Iterator for BST that returns elements in ascending order.

    This implements 'controlled recursion' - we manually manage
    what the call stack would normally track in recursive
    inorder traversal.

    Space Complexity: O(h) - stack never exceeds tree height
    Time Complexity:
    - __init__: O(h) to push leftmost path
    - next(): Amortized O(1) - each node pushed/popped exactly once
    - hasNext(): O(1)
    """

    def __init__(self, root):
        """Initialize iterator, find first element."""
        self.stack = []
        # Push all left children - simulate leftward descent
        self._push_left_chain(root)

    def _push_left_chain(self, node):
        """
        Push a node and all its left descendants onto the stack.
        This is the 'go left as far as possible' step of inorder traversal.
        """
        while node:
            self.stack.append(node)
            node = node.left

    def next(self) -> int:
        """
        Return the next smallest element.

        The top of stack is always the next element to return.
        After returning it, we need to process its right subtree.
        """
        # The next smallest is at the top of the stack
        node = self.stack.pop()
        result = node.val

        # If this node has a right child, we need to process its
        # right subtree next - push the left chain of the right child
        if node.right:
            self._push_left_chain(node.right)

        return result

    def hasNext(self) -> bool:
        """Return true if there are more elements."""
        return len(self.stack) > 0


# Example usage:
# BST:      7
#          / \
#         3   15
#             / \
#            9   20
#
# iterator = BSTIterator(root)
# iterator.next()     # returns 3
# iterator.next()     # returns 7
# iterator.hasNext()  # returns True
# iterator.next()     # returns 9
# iterator.hasNext()  # returns True
# iterator.next()     # returns 15
# iterator.hasNext()  # returns True
# iterator.next()     # returns 20
# iterator.hasNext()  # returns False
```

The next() operation seems like it might be O(h) in the worst case—we might push up to h nodes. So why do we say it's amortized O(1)?
The amortized analysis:
Look at the total work over the entire iteration:
- Each node is pushed onto the stack exactly once (either in __init__ or during some later call to next())
- Each node is popped off the stack exactly once (when it is returned by next())

For n nodes: total pushes = n, total pops = n
Over n calls to next(): total work = 2n operations
Amortized cost per next() = 2n / n = 2 = O(1)
A single call to next() might do O(h) work (pushing a long left chain). But because each node is processed exactly once total, averaging over all calls gives O(1) per call. This is similar to how ArrayList's add() is amortized O(1) despite occasional O(n) resizing.
| next() Call | Node Returned | Nodes Pushed | Stack Size After |
|---|---|---|---|
| 1 | 3 | 0 (no right child) | 1 (just root 7) |
| 2 | 7 | 2 (15, 9) | 2 |
| 3 | 9 | 0 (no right child) | 1 |
| 4 | 15 | 1 (20) | 1 |
| 5 | 20 | 0 (no right child) | 0 |
For the tree:
```
      7
     / \
    3   15
        / \
       9   20
```
Notice that some calls push multiple nodes while others push none. But every node is pushed exactly once total. This is the essence of amortized analysis.
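To see the 2n bound concretely, you can instrument the iterator and count stack operations. The subclass below is an illustrative sketch, not part of the original text; it assumes the BSTIterator defined above, and the `pushes`/`pops` counter names are hypothetical.

```python
# Illustrative sketch: count every push and pop during a full iteration.
class InstrumentedBSTIterator(BSTIterator):
    """Hypothetical subclass that tallies stack operations."""

    def __init__(self, root):
        self.pushes = 0   # total nodes pushed onto the stack
        self.pops = 0     # total nodes popped (one per next() call)
        super().__init__(root)

    def _push_left_chain(self, node):
        while node:
            self.pushes += 1
            self.stack.append(node)
            node = node.left

    def next(self) -> int:
        self.pops += 1
        return super().next()

# Draining the iterator (while it.hasNext(): it.next()) over a tree with
# n nodes ends with it.pushes == n and it.pops == n: 2n stack operations
# in total, i.e. amortized O(1) work per next() call.
```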
Sometimes we need to iterate in descending order (largest to smallest). This is the mirror image of our forward iterator.
Reverse inorder traversal: Right → Node → Left
This visits nodes from largest to smallest, which is the reverse of sorted order.
The changes:
- In the constructor, push the right chain (a node plus all its right descendants) instead of the left chain, so the stack starts at the maximum
- After popping a node in next(), process its left subtree (which holds the smaller values) by pushing the right chain of the left child
```python
class ReverseBSTIterator:
    """
    Iterator for BST that returns elements in DESCENDING order.

    Uses reverse inorder traversal: Right → Node → Left
    This gives us largest to smallest.

    Space: O(h), Time: amortized O(1) per next()
    """

    def __init__(self, root):
        self.stack = []
        # Push all RIGHT children - simulate rightward descent
        self._push_right_chain(root)

    def _push_right_chain(self, node):
        """Push a node and all its right descendants."""
        while node:
            self.stack.append(node)
            node = node.right

    def next(self) -> int:
        """Return the next LARGEST element."""
        node = self.stack.pop()
        result = node.val

        # Process LEFT subtree next (contains smaller values)
        if node.left:
            self._push_right_chain(node.left)

        return result

    def hasNext(self) -> bool:
        return len(self.stack) > 0


# Example: For BST with values [3, 7, 9, 15, 20]
# Reverse iterator returns: 20, 15, 9, 7, 3


class BidirectionalBSTIterator:
    """
    Iterator that can go both forward (next) and backward (prev).

    This is more complex - we need to track both directions.
    Useful for implementing cursors in database-like systems.
    """

    def __init__(self, root):
        self.root = root
        self.current = None
        self._find_minimum()

    def _find_minimum(self):
        """Position at the minimum element."""
        node = self.root
        if node:
            while node.left:
                node = node.left
        self.current = node

    def _find_successor(self, node):
        """Find inorder successor of given node."""
        if node.right:
            # Successor is leftmost in right subtree
            successor = node.right
            while successor.left:
                successor = successor.left
            return successor

        # Need to go up - find ancestor where we came from left
        # This requires parent pointers or re-traversal
        # Simplified version assumes we can re-search
        return self._find_next_larger(node.val)

    def _find_next_larger(self, val):
        """Find smallest node with value > val."""
        result = None
        node = self.root
        while node:
            if node.val > val:
                result = node
                node = node.left
            else:
                node = node.right
        return result

    def next(self):
        """Move to and return the next larger element."""
        if self.current is None:
            return None
        result = self.current.val
        self.current = self._find_successor(self.current)
        return result

    # prev() would mirror next(): find the inorder predecessor by searching
    # for the largest value less than current.val. It is omitted here for brevity.
```

Having both forward and reverse iterators enables the two-pointer technique on BSTs. For example, finding a pair with a given sum can use a forward iterator starting at minimum and a reverse iterator starting at maximum, moving them toward each other.
Many modern languages support generators—functions that can pause execution and yield values one at a time. Generators provide a more elegant way to implement BST iteration without manually managing a stack.
How generators work:
- A generator function uses yield to produce values one at a time
- Execution pauses at each yield and resumes where it left off
- Each call to next() resumes execution until the next yield
```python
def inorder_generator(node):
    """
    Generator function for inorder traversal.

    This looks almost identical to recursive inorder,
    but uses 'yield' instead of appending to a list.
    The Python runtime handles pausing/resuming automatically.

    Space: O(h) - generator maintains its own call stack
    Time: O(n) total, amortized O(1) per yield
    """
    if node is None:
        return

    # Recursively yield from left subtree
    yield from inorder_generator(node.left)

    # Yield current node's value
    yield node.val

    # Recursively yield from right subtree
    yield from inorder_generator(node.right)


# Usage - looks like iterating over a collection!
def process_bst_sorted(root):
    for value in inorder_generator(root):
        print(f"Processing: {value}")
        # Process one element at a time
        # Memory usage stays O(h) regardless of tree size


# Can also convert to list if needed
sorted_list = list(inorder_generator(root))


# Generator with state for more complex iteration
def ranged_inorder_generator(node, low, high):
    """
    Generator that yields only values in [low, high].
    Combines iteration with range pruning.
    """
    if node is None:
        return

    # Only yield from left if there might be values in range
    if node.val > low:
        yield from ranged_inorder_generator(node.left, low, high)

    # Yield current if in range
    if low <= node.val <= high:
        yield node.val

    # Only yield from right if there might be values in range
    if node.val < high:
        yield from ranged_inorder_generator(node.right, low, high)


# Bidirectional generator pattern
def make_bst_iterator(root, reverse=False):
    """
    Factory function for forward or reverse iteration.
    """
    def forward(node):
        if node is None:
            return
        yield from forward(node.left)
        yield node.val
        yield from forward(node.right)

    def backward(node):
        if node is None:
            return
        yield from backward(node.right)
        yield node.val
        yield from backward(node.left)

    return backward(root) if reverse else forward(root)
```

BST iterators aren't just academic exercises—they power real systems:
- Database indexes: when you run `SELECT * FROM users ORDER BY created_at`, the database uses a B-tree iterator to stream results. It doesn't load all matching rows into memory—it iterates through the index.
- Merge algorithms: combining several sorted sources needs only one iterator per source plus a small heap, never the full contents of any source.
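As a small illustration of the first point, the sketch below emulates a LIMIT-style query on top of the inorder_generator defined earlier. It is not from the original text; the helper name `first_k_sorted` is an assumption.

```python
from itertools import islice

def first_k_sorted(root, k):
    """Hypothetical helper: return the k smallest values in the BST.

    Because inorder_generator is lazy, only the part of the tree needed
    to produce the first k values is ever visited - roughly what a
    database does for an "ORDER BY ... LIMIT k" scan over an index.
    """
    return list(islice(inorder_generator(root), k))

# Usage sketch: first_k_sorted(root, 10) touches only the leftmost portion
# of the tree instead of traversing all n nodes.
```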
Merging multiple BSTs into one globally sorted stream follows the same idea: one iterator per tree plus a min-heap.

```python
import heapq

def merge_bst_sorted(*roots):
    """
    Merge multiple BSTs into a single sorted sequence.

    Uses iterators from each BST + a min-heap to produce
    elements in global sorted order.

    This is memory-efficient - we only load h elements per tree
    into memory, where h is the height.

    Time: O(N log k) where N is total elements, k is number of trees
    Space: O(k * h) for iterators + O(k) for heap
    """
    # Create iterator for each tree
    iterators = [inorder_generator(root) for root in roots if root]

    # Initialize heap with first element from each iterator
    # Heap entries: (value, iterator_index, iterator)
    heap = []
    for i, it in enumerate(iterators):
        try:
            val = next(it)
            heapq.heappush(heap, (val, i, it))
        except StopIteration:
            pass  # Empty tree

    # Merge by repeatedly extracting minimum
    while heap:
        val, idx, it = heapq.heappop(heap)
        yield val

        # Advance this iterator and re-insert if not exhausted
        try:
            next_val = next(it)
            heapq.heappush(heap, (next_val, idx, it))
        except StopIteration:
            pass  # This tree exhausted


# Usage:
# for val in merge_bst_sorted(tree1, tree2, tree3):
#     print(val)  # Prints all values from all trees in sorted order
```

When working with large ordered datasets, always ask: 'Can I process this as a stream?' If yes, use iterators. Only materialize to a list when you genuinely need random access to all elements.
Let's apply our iterator knowledge to an elegant solution for finding two elements in a BST that sum to a target.
The insight:
With a sorted array, two-sum uses two pointers: one at start (smallest), one at end (largest). We move them toward each other based on whether the sum is too small or too large.
With a BST and two iterators (forward and reverse), we can do the same thing without materializing the sorted sequence!
```python
class BSTIterator:
    """Forward iterator (ascending order)."""

    def __init__(self, root):
        self.stack = []
        self._push_left(root)

    def _push_left(self, node):
        while node:
            self.stack.append(node)
            node = node.left

    def peek(self):
        return self.stack[-1].val if self.stack else None

    def next(self):
        node = self.stack.pop()
        self._push_left(node.right)
        return node.val

    def hasNext(self):
        return len(self.stack) > 0


class ReverseBSTIterator:
    """Reverse iterator (descending order)."""

    def __init__(self, root):
        self.stack = []
        self._push_right(root)

    def _push_right(self, node):
        while node:
            self.stack.append(node)
            node = node.right

    def peek(self):
        return self.stack[-1].val if self.stack else None

    def next(self):
        node = self.stack.pop()
        self._push_right(node.left)
        return node.val

    def hasNext(self):
        return len(self.stack) > 0


def find_two_sum_bst(root, target):
    """
    Find two elements in BST that sum to target.

    Uses two iterators: forward (from min) and reverse (from max).
    This is the BST analog of two-pointer on sorted array.

    Time: O(n) - each element visited at most once per iterator
    Space: O(h) - two iterators, each using O(h) space
    """
    if root is None:
        return None

    left = BSTIterator(root)          # Ascending iterator
    right = ReverseBSTIterator(root)  # Descending iterator

    # Get initial values (min and max)
    left_val = left.next()
    right_val = right.next()

    while left_val < right_val:  # Still have elements between them
        current_sum = left_val + right_val

        if current_sum == target:
            return (left_val, right_val)
        elif current_sum < target:
            # Need a larger sum, advance left (get larger value)
            left_val = left.next()
        else:  # current_sum > target
            # Need a smaller sum, advance right (get smaller value)
            right_val = right.next()

    return None  # No pair found


# Example:
# BST with values [2, 3, 4, 6, 7, 9]
# find_two_sum_bst(root, 10) → (3, 7) or (4, 6)
# find_two_sum_bst(root, 15) → (6, 9)
# find_two_sum_bst(root, 20) → None (no such pair)
```

Why this is elegant:

- No sorted array is ever materialized: the two iterators together use only O(h) extra space.
- Total time is O(n), since each iterator visits each node at most once.
- It is the direct BST analog of the two-pointer technique on a sorted array.
We've completed our exploration of how BSTs provide sorted iteration without explicit sorting. Let's consolidate the key insights:

- Inorder traversal visits BST values in ascending order, so sorted output comes from structure, not from a sorting step.
- A stack-based iterator (or a generator) lets you consume that order lazily: O(h) space and amortized O(1) work per element.
- Reverse iteration is the mirror image, and pairing forward and reverse iterators enables two-pointer patterns such as two-sum on a BST.
- The same pattern powers real systems: database index scans, stream processing, and k-way merges of sorted sources.
Module Complete:
You've now mastered the profound connection between BST structure and sorted sequences.
This knowledge transforms BSTs from simple search structures into powerful ordered data stores—dynamic sorted collections that support efficient insertion, deletion, searching, ranking, and iteration.
You've achieved deep understanding of BST sorted output properties and applications. You can now implement memory-efficient BST iteration, apply two-pointer techniques to tree structures, and recognize when these patterns apply to real-world problems. This knowledge is foundational for working with ordered data at scale.