How do database systems execute complex queries involving multiple tables, filters, joins, and aggregations in a clean, modular, and efficient manner? The answer lies in one of the most elegant and enduring abstractions in database architecture: the Iterator Model, also known as the Volcano Model after the influential Volcano query processing system developed by Goetz Graefe in the early 1990s.
The iterator model defines a simple yet powerful interface that every physical operator implements. This uniformity enables operators to be composed arbitrarily—stacked, nested, and connected—to execute any query expressible in relational algebra. The model has proven so successful that it remains the foundation of query execution in virtually every major relational database system today.
By the end of this page, you will master the iterator model's interface, understand how tuples flow through operator trees via pull-based execution, trace the lifecycle of a query from initialization through completion, and appreciate both the elegance and the performance trade-offs of this foundational abstraction.
The iterator model is built on a deceptively simple idea: every operator in the query plan behaves like an iterator over tuples. Just as an iterator in programming provides a sequence of elements one at a time, each query operator provides a sequence of tuples one at a time.
This abstraction creates a universal "language" that all operators speak. A table scan iterator produces tuples from disk. A filter iterator consumes tuples and produces only those satisfying a predicate. A join iterator consumes tuples from two child iterators and produces combined tuples. The uniformity means any operator can be connected to any other, making operators freely composable.
Core Principles:
- Uniform interface: every operator, from a table scan to a hash join, exposes the same small set of methods.
- Tuple-at-a-time processing: each call to Next() produces at most one tuple.
- Demand-driven execution: no work happens until a consumer asks for a tuple.
- Arbitrary composition: because all operators speak the same interface, any operator's output can feed any other.
The model is named after the Volcano query processing system, developed by Goetz Graefe at the University of Colorado and described in his influential 1994 paper. Graefe's work demonstrated that a clean iterator interface could unify diverse operators (scans, joins, aggregations, sorts) into a single, composable framework. This work became the template for nearly all subsequent database systems.
The iterator model's power derives from its simplicity. Every operator implements exactly three methods. No more, no less. This minimal interface is sufficient to express all relational operations.
The Three Methods:
| Method | Purpose | When Called | Return Value |
|---|---|---|---|
| Open() | Initialize operator state | Once, before any Next() calls | None (prepares for iteration) |
| Next() | Return the next tuple | Repeatedly until exhausted | Next tuple, or NULL/EOF indicator |
| Close() | Clean up and release resources | Once, after all Next() calls | None (frees resources) |
```
// Abstract iterator interface that all operators implement
interface Iterator {
    // Initialize the operator for execution
    // - Allocate necessary memory and resources
    // - Recursively call Open() on child operators
    // - Prepare internal state for tuple production
    Open() → void

    // Return the next tuple in the result stream
    // - May recursively call Next() on children
    // - Performs operator-specific processing
    // - Returns NULL (or special EOF marker) when exhausted
    Next() → Tuple | NULL

    // Clean up after execution is complete
    // - Release allocated resources
    // - Recursively call Close() on child operators
    // - Ensure no memory leaks or resource orphans
    Close() → void
}
```
Lifecycle Semantics:
The three methods define a strict lifecycle that every operator follows:
1. Open() is called exactly once, before any calls to Next().
2. Next() is called repeatedly until it signals exhaustion by returning NULL (or an EOF marker).
3. Close() is called exactly once, after which no further calls are valid.
This lifecycle ensures proper resource management. Resources allocated in Open() have a guaranteed cleanup point in Close(). Intermediate state maintained during Next() calls is scoped precisely to the query's execution lifetime.
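As a sketch of this contract, the driver loop at the root of the executor can be written in a few lines of Python. The names here (Operator, run) are illustrative, not taken from any particular system:

```python
# A minimal sketch of the lifecycle contract in Python; the names
# (Operator, run) are illustrative, not taken from any real system.

class Operator:
    """Every physical operator implements these three methods."""

    def open(self):   # allocate resources, recursively open children
        raise NotImplementedError

    def next(self):   # return the next tuple, or None when exhausted
        raise NotImplementedError

    def close(self):  # release resources, recursively close children
        raise NotImplementedError

def run(root):
    """The executor's driver loop: Open once, Next until None, Close once."""
    root.open()
    try:
        while (row := root.next()) is not None:
            yield row
    finally:
        root.close()  # guaranteed cleanup, even if the consumer stops early
```

The try/finally ensures Close() runs even if the consumer abandons the result stream early, which is exactly the guaranteed cleanup point described above.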
Different database systems use slightly different terminology—PostgreSQL calls these Init/Exec/End, some systems call them Reset/Fetch/Cleanup—but the three-phase structure remains universal. Some systems add convenience methods (like Restart for rescan), but these are extensions of the core model.
The iterator model implements pull-based (or demand-driven) execution. The consumer—ultimately the client application—pulls tuples from the query result. Each pull propagates down through the operator tree, triggering production of exactly the tuples needed.
How Pull-Based Execution Works:
Imagine a query with a Filter operator on top of a Table Scan. The execution flow works as follows:
1. The client calls Next() on the Filter, the root of this small tree.
2. The Filter, needing a tuple to test, calls Next() on the Table Scan.
3. The Table Scan reads the next tuple from the table and returns it to the Filter.
4. The Filter evaluates its predicate: if the tuple passes, it returns the tuple to the client; if not, it loops back to step 2 and pulls again.
The key insight is that control flows from parent to child, while data flows from child to parent.
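To make this concrete, here is a minimal runnable sketch in Python. The class names (ListScan, Filter) and the open/next/close convention are ours, chosen to mirror the pseudocode interface above; a real system would scan pages through a buffer manager rather than a Python list. The trace that follows shows the same flow call by call.

```python
# Hypothetical minimal operators demonstrating pull-based execution.

class ListScan:
    """Leaf operator: an in-memory list stands in for a table on disk."""
    def __init__(self, rows):
        self.rows = rows
    def open(self):
        self.pos = 0
    def next(self):
        if self.pos >= len(self.rows):
            return None                  # exhausted
        row = self.rows[self.pos]
        self.pos += 1
        return row
    def close(self):
        self.rows = None

class Filter:
    """Pulls from its child until a row satisfies the predicate."""
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate
    def open(self):
        self.child.open()                # control flows parent -> child
    def next(self):
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row               # data flows child -> parent
        return None
    def close(self):
        self.child.close()

employees = [
    {"id": 1, "name": "Alice", "salary": 45000},
    {"id": 2, "name": "Bob",   "salary": 65000},
    {"id": 3, "name": "Carol", "salary": 55000},
]
plan = Filter(ListScan(employees), lambda r: r["salary"] > 50000)
plan.open()
while (row := plan.next()) is not None:  # only the client's pulls drive work
    print(row)                           # prints Bob's row, then Carol's
plan.close()
```

Note that nothing executes until the client's while loop pulls: delete the loop and the table is never read.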
```
Query: SELECT * FROM employees WHERE salary > 50000

Operator Tree:

    [Filter: salary > 50000]
              │
    [Table Scan: employees]

Execution Trace (each row is a method call):

Step  Caller      Method Called                Result
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1.    Executor    Filter.Open()                (initializes)
      Filter      TableScan.Open()             (opens table)

2.    Executor    Filter.Next()                (wants tuple)
      Filter      TableScan.Next()             → {id:1, name:"Alice", salary:45000}
      Filter      evaluates: 45000 > 50000     → FALSE
      Filter      TableScan.Next()             → {id:2, name:"Bob", salary:65000}
      Filter      evaluates: 65000 > 50000     → TRUE
      Filter      returns                      → {id:2, name:"Bob", salary:65000}

3.    Executor    Filter.Next()                (wants another)
      Filter      TableScan.Next()             → {id:3, name:"Carol", salary:55000}
      Filter      evaluates: 55000 > 50000     → TRUE
      Filter      returns                      → {id:3, name:"Carol", salary:55000}

4.    Executor    Filter.Next()                (wants another)
      Filter      TableScan.Next()             → NULL (exhausted)
      Filter      returns                      → NULL

5.    Executor    Filter.Close()               (cleanup)
      Filter      TableScan.Close()            (closes table)
```
Benefits of Pull-Based Execution:
- Lazy evaluation: tuples are produced only on demand, so operators like Limit can stop the entire pipeline early.
- Bounded intermediate state: for non-blocking operators, only the tuple currently in flight exists at any moment, with no materialized intermediate results.
- Simple control flow: the call stack itself encodes where each operator is in its iteration.
The alternative to pull-based is push-based execution, where producers push tuples to consumers. Push-based models offer performance advantages for analytical workloads (better cache utilization) but sacrifice some elegance and can be harder to implement. We'll contrast these approaches in the Materialization page.
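For a taste of the contrast, here is a tiny push-based sketch (hypothetical names; real push-based engines compile pipelines rather than chaining Python callbacks). Each producer calls its consumer directly, so control and data flow in the same direction:

```python
# Minimal push-based sketch for contrast: producers drive execution by
# calling consume() on their parent, instead of parents calling next().

def scan_push(rows, consume):
    for row in rows:
        consume(row)                     # producer pushes each tuple upward

def make_filter_push(predicate, consume):
    def filter_consume(row):
        if predicate(row):
            consume(row)                 # pass qualifying tuples to the parent
    return filter_consume

results = []
scan_push(
    [{"name": "Alice", "salary": 45000}, {"name": "Bob", "salary": 65000}],
    make_filter_push(lambda r: r["salary"] > 50000, results.append),
)
print(results)                           # [{'name': 'Bob', 'salary': 65000}]
```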
To truly understand the iterator model, let's examine how common database operators implement the Open/Next/Close interface. Each operator has distinct behavior, but all conform to the same interface contract.
Sequential Scan (TableScan):
```
class SequentialScan implements Iterator {
    tableId: TableId
    currentPage: Page
    currentSlot: int

    Open():
        // Open the table for scanning
        currentPage = bufferManager.getFirstPage(tableId)
        currentSlot = 0

    Next() → Tuple | NULL:
        while true:
            if currentPage == NULL:
                return NULL                 // Exhausted all pages
            if currentSlot < currentPage.tupleCount:
                tuple = currentPage.getTuple(currentSlot)
                currentSlot++
                return tuple
            else:
                // Move to next page
                bufferManager.unpin(currentPage)
                currentPage = bufferManager.getNextPage(tableId, currentPage)
                currentSlot = 0

    Close():
        if currentPage != NULL:
            bufferManager.unpin(currentPage)
            currentPage = NULL
}
```
Filter (Selection):
```
class Filter implements Iterator {
    child: Iterator
    predicate: Expression

    Open():
        child.Open()                        // Recursively open child

    Next() → Tuple | NULL:
        while true:
            tuple = child.Next()
            if tuple == NULL:
                return NULL                 // Child exhausted
            if predicate.evaluate(tuple) == TRUE:
                return tuple                // Predicate satisfied
            // Predicate failed; continue to next tuple

    Close():
        child.Close()                       // Recursively close child
}
```
Nested Loop Join:
```
class NestedLoopJoin implements Iterator {
    outerChild: Iterator
    innerChild: Iterator                    // Must support Rescan()
    joinPredicate: Expression
    currentOuterTuple: Tuple

    Open():
        outerChild.Open()
        innerChild.Open()
        currentOuterTuple = outerChild.Next()   // Get first outer

    Next() → Tuple | NULL:
        while currentOuterTuple != NULL:
            while true:
                innerTuple = innerChild.Next()
                if innerTuple == NULL:
                    // Inner exhausted; advance outer, reset inner
                    currentOuterTuple = outerChild.Next()
                    innerChild.Rescan()     // Reset inner to beginning
                    break                   // Continue outer loop
                // Check join condition
                if joinPredicate.evaluate(currentOuterTuple, innerTuple):
                    return combine(currentOuterTuple, innerTuple)
        return NULL                         // Both exhausted

    Close():
        outerChild.Close()
        innerChild.Close()
}
```
Some operators (particularly join inners) need to be scanned multiple times. The Rescan() method (or Restart()) resets the iterator to its beginning without full Close/Open overhead. Not all operators support efficient rescan—sorts and hash builds may need to preserve their results for rescanning.
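One common way to support Rescan() is to materialize the child's output on its first pass and replay the buffer on subsequent passes. The sketch below is illustrative; the Materialize name and the assumption that rescan is only requested after the first pass completes are ours:

```python
# Illustrative sketch: a wrapper that makes any child rescannable by
# materializing its output on the first pass. The Materialize name is ours.

class Materialize:
    def __init__(self, child):
        self.child = child

    def open(self):
        self.child.open()
        self.buffer = []       # tuples seen so far
        self.filled = False    # becomes True once the child is exhausted
        self.pos = 0           # replay cursor used after the first pass

    def next(self):
        if not self.filled:
            row = self.child.next()
            if row is None:
                self.filled = True      # first pass complete
                return None
            self.buffer.append(row)     # remember the tuple for future rescans
            return row
        if self.pos < len(self.buffer):
            row = self.buffer[self.pos]
            self.pos += 1
            return row
        return None

    def rescan(self):
        # A nested loop join only rescans after exhausting the inner side,
        # so we assume the buffer is complete; just reset the replay cursor.
        self.pos = 0

    def close(self):
        self.buffer = None
        self.child.close()
```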
Not all operators can produce output tuples immediately upon receiving input. Some operators—called blocking operators or pipeline breakers—must consume their entire input before producing any output. The iterator model accommodates these operators, though they require special consideration.
Common Blocking Operators:
| Operator | Blocking? | Reason |
|---|---|---|
| Sequential Scan | No | Produces tuples as it reads |
| Filter (Selection) | No | Evaluates and passes through immediately |
| Projection | No | Transforms and passes immediately |
| Limit | No | Counts and passes or stops |
| Sort | Yes | Must see all input to determine order |
| Hash Aggregate | Yes | Must see all input to compute groups |
| Hash Join (Build) | Yes | Must build complete hash table first |
| Distinct (Hash) | Yes | Must build complete set of seen values |
How Blocking Operators Work in the Iterator Model:
For blocking operators, the Open() phase does the heavy lifting. The operator consumes its entire input during Open() and buffers the results. Subsequent Next() calls simply iterate over the buffered results.
```
class Sort implements Iterator {
    child: Iterator
    sortKey: Column[]
    sortedTuples: List<Tuple>
    currentIndex: int

    Open():
        child.Open()
        // BLOCKING: Consume entire input
        sortedTuples = []
        while true:
            tuple = child.Next()
            if tuple == NULL: break
            sortedTuples.add(tuple)
        // Sort the collected tuples
        sortedTuples.sortBy(sortKey)
        currentIndex = 0

    Next() → Tuple | NULL:
        // Non-blocking: iterate over pre-sorted list
        if currentIndex < sortedTuples.length:
            tuple = sortedTuples[currentIndex]
            currentIndex++
            return tuple
        return NULL

    Close():
        sortedTuples = NULL                 // Free memory
        child.Close()
}
```
Pipeline Breaking Implications:
Blocking operators create pipeline breaks—points where tuples cannot flow continuously through the tree. Understanding pipeline breaks is crucial for:
- Memory planning: a pipeline breaker buffers its entire input, so its footprint scales with input size.
- Latency: no results reach the client until every blocking operator between the leaves and the root has finished consuming its input.
- Optimization: planners try to place blocking operators where their input is smallest, for example after selective filters.
Blocking operators can consume enormous memory when processing large inputs. Production systems implement spilling strategies where buffered data overflows to temporary disk storage when memory limits are exceeded. This maintains correctness at the cost of increased I/O.
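As a rough illustration of spilling, the sketch below uses only the Python standard library: tuples accumulate in memory until a (simplified) tuple-count budget is exceeded, at which point a sorted run is written to a temporary file; the runs are then merge-sorted back into a single stream. Real systems track bytes rather than tuple counts and use purpose-built run formats:

```python
# Simplified sketch of a spilling (external) sort: sorted in-memory runs
# overflow to temporary files, then heapq.merge streams the merged result.
# The tuple-count budget stands in for real byte-level memory accounting.
import heapq
import pickle
import tempfile

def external_sort(tuples, key, max_in_memory=10_000):
    runs, batch = [], []

    def spill():
        batch.sort(key=key)
        run = tempfile.TemporaryFile()   # deleted automatically on close
        for t in batch:
            pickle.dump(t, run)
        run.seek(0)
        runs.append(run)
        batch.clear()

    for t in tuples:
        batch.append(t)
        if len(batch) >= max_in_memory:
            spill()                      # budget exceeded: write a sorted run
    if batch:
        spill()                          # final partial run

    def read_run(run):
        while True:
            try:
                yield pickle.load(run)
            except EOFError:
                return

    # k-way merge of the sorted runs, yielding tuples in key order
    yield from heapq.merge(*(read_run(r) for r in runs), key=key)
```

For example, iterating over external_sort(rows, key=lambda r: r["salary"]) yields rows in ascending salary order while never holding more than max_in_memory tuples (plus merge buffers) in memory at once.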
Let's trace through a complete query execution to see how all the pieces fit together. Consider a query that joins two tables, filters the result, sorts it, and returns the top 10 rows.
Example Query:
```sql
SELECT e.name, d.dept_name, e.salary
FROM employees e
JOIN departments d ON e.dept_id = d.id
WHERE e.salary > 50000
ORDER BY e.salary DESC
LIMIT 10;
```
```
Physical Plan Structure:

    [Limit: 10]
         │
    [Sort: salary DESC]                ← BLOCKING
         │
    [Filter: salary > 50000]
         │
    [Hash Join: e.dept_id = d.id]      ← BUILD PHASE IS BLOCKING
       ├── [Scan: employees]   (probe side)
       └── [Scan: departments] (build side)

═══════════════════════════════════════════════════════════════
PHASE 1: OPEN (Initialization)
═══════════════════════════════════════════════════════════════

Limit.Open():
└── Sort.Open():                       // Sort is BLOCKING
    └── Filter.Open():
        └── HashJoin.Open():
            ├── EmpScan.Open()         // Opens employees table
            └── DeptScan.Open()        // Opens departments table
            // BUILD PHASE: Read ALL departments
            // Build hash table: {id → dept_tuple}
            // Example: {1 → ("Engineering"), 2 → ("Sales")}

═══════════════════════════════════════════════════════════════
PHASE 2: NEXT (Tuple Production) - Inside Sort.Open()!
═══════════════════════════════════════════════════════════════

// Sort consumes ALL its input during Open():

Sort pulls from Filter:
  Filter pulls from HashJoin:
    HashJoin pulls from EmpScan:
      → Returns: {id:1, name:"Alice", dept_id:2, salary:45000}
    HashJoin probes hash table (dept_id=2):
      → Match found, returns: {name:"Alice", dept_name:"Sales", salary:45000}
  Filter evaluates: 45000 > 50000 → FALSE (discarded)

// Loop continues...

    HashJoin pulls from EmpScan:
      → Returns: {id:2, name:"Bob", dept_id:1, salary:65000}
    HashJoin probes hash table (dept_id=1):
      → Returns: {name:"Bob", dept_name:"Engineering", salary:65000}
  Filter evaluates: 65000 > 50000 → TRUE
Sort buffers: {name:"Bob", dept_name:"Engineering", salary:65000}

// ... continues until EmpScan exhausted
// Sort now has all matching tuples, sorts by salary DESC

═══════════════════════════════════════════════════════════════
PHASE 3: NEXT (Result Delivery)
═══════════════════════════════════════════════════════════════

Client calls Limit.Next():
  Limit calls Sort.Next():
    → Returns: {name:"Carol", ..., salary:95000}   // Highest salary
  Limit returns it (count=1)

Client calls Limit.Next() again:
  Limit calls Sort.Next():
    → Returns: {name:"Dave", ..., salary:85000}    // Second highest
  Limit returns it (count=2)

... repeats 8 more times until count=10 ...

Client calls Limit.Next() (11th time):
  Limit: count >= limit → returns NULL immediately
  // Sort may have more tuples, but Limit stops pulling!

═══════════════════════════════════════════════════════════════
PHASE 4: CLOSE (Cleanup)
═══════════════════════════════════════════════════════════════

Limit.Close():
└── Sort.Close():                      // Free sorted tuple buffer
    └── Filter.Close():
        └── HashJoin.Close():          // Free hash table
            ├── EmpScan.Close()
            └── DeptScan.Close()
```
Notice how Limit can stop early without Sort ever knowing. This lazy evaluation means if a query has LIMIT 1 on a sorted result, we'd ideally only need the minimum/maximum value. However, since Sort is blocking, it must still consume all input. Smarter optimizers can sometimes transform such queries to avoid full sorts.
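The Limit operator that enforces this early termination needs nothing more than a counter. A minimal sketch, using the same illustrative open/next/close convention as the earlier Python examples:

```python
# Illustrative Limit operator: after `limit` tuples it stops pulling from
# its child entirely, even if the child could produce more.

class Limit:
    def __init__(self, child, limit):
        self.child, self.limit = child, limit

    def open(self):
        self.child.open()
        self.count = 0

    def next(self):
        if self.count >= self.limit:
            return None                  # stop: the child is never asked again
        row = self.child.next()
        if row is None:
            return None
        self.count += 1
        return row

    def close(self):
        self.child.close()
```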
The iterator model has reigned supreme for over 30 years. Its longevity stems from deep architectural elegance that offers numerous practical advantages.
Why the Iterator Model Works So Well:
- Simplicity: three methods are the entire contract an implementer must understand.
- Composability: any plan expressible in relational algebra can be assembled from the same interchangeable parts.
- Extensibility: adding a new operator means implementing three methods, with no changes to the rest of the engine.
- Lazy evaluation: computation happens only on demand, so operators like Limit can terminate execution early.
- Bounded memory for pipelined plans: non-blocking operators hold only the tuple currently in flight.
PostgreSQL, MySQL, Oracle, SQL Server, SQLite, and countless other databases all use variations of the iterator model. When an abstraction works this well across such diverse systems for decades, it represents a genuine discovery about the nature of query processing—not just a convenient implementation choice.
Despite its elegance, the iterator model has inherent performance costs that become significant at scale—particularly in analytical workloads processing millions of tuples.
Performance Limitations:
| Source of Overhead | Typical Cost | Impact at 1M Tuples |
|---|---|---|
| Function call (Next()) | ~10-30 cycles | 10-30 million cycles |
| Virtual dispatch | ~5-15 cycles | 5-15 million cycles |
| Branch misprediction (per operator) | ~10-20 cycles | 10-20 million cycles |
| Data cache misses | ~100-300 cycles | Highly variable |
| Total per operator | ~125-365 cycles | Compounds across tree depth |
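To put these figures in perspective with some back-of-the-envelope arithmetic (an estimate, not a benchmark): at 3 GHz, 300 cycles of overhead per tuple per operator is roughly 0.1 microseconds, so a five-operator plan over one million tuples spends on the order of half a second purely on interpretation overhead, before any useful work is counted.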
Modern Mitigations:
Database researchers and engineers have developed several strategies to address iterator model overhead while preserving its architectural benefits:
- Vectorized execution: Next() returns a batch of tuples (often 1,000 or more) instead of one, amortizing function call and dispatch costs.
- Query compilation: the plan is compiled (often just-in-time) into specialized machine code, eliminating virtual dispatch and interpretation.
- Operator fusion: adjacent non-blocking operators are merged into tight loops that keep tuples in CPU registers.
These techniques—especially vectorization—are increasingly common in modern analytical databases while preserving the iterator model's logical structure.
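Vectorization in particular is easy to sketch: the interface keeps its shape, but the tuple-at-a-time Next() becomes a batch-at-a-time next_batch(). The sketch below is illustrative; the batch size and names are arbitrary, and it assumes an exhausted child keeps returning None:

```python
# Illustrative vectorized filter: next_batch() returns lists of tuples, so
# one call's overhead is amortized across the whole batch. Assumes the
# child keeps returning None once it is exhausted.

class VectorizedFilter:
    def __init__(self, child, predicate, batch_size=1024):
        self.child = child
        self.predicate = predicate
        self.batch_size = batch_size

    def open(self):
        self.child.open()

    def next_batch(self):
        out = []
        # Keep pulling child batches until ours is reasonably full; out may
        # slightly exceed batch_size, which is fine for a sketch.
        while len(out) < self.batch_size:
            batch = self.child.next_batch()
            if batch is None:
                return out or None       # flush the remainder, then signal EOF
            out.extend(t for t in batch if self.predicate(t))
        return out

    def close(self):
        self.child.close()
```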
For OLTP workloads (small, simple queries), iterator model overhead is negligible—the query only processes dozens or hundreds of tuples. For OLAP workloads (scanning millions of rows), overhead becomes critical. Modern databases often use different execution strategies for different workloads.
The iterator model is the conceptual heart of query execution in relational databases. Let's consolidate our understanding:
- Every physical operator implements the same three-method interface: Open(), Next(), Close().
- Execution is pull-based: control flows down the operator tree while data flows up, one tuple per Next() call.
- Blocking operators such as Sort and the build side of Hash Join consume their entire input during Open(), creating pipeline breaks.
- Per-tuple overhead is negligible for OLTP queries but significant for large analytical scans, motivating vectorization and compilation.
What's Next:
Now that we understand the pull-based iterator model, we'll explore Pipeline Execution—how non-blocking operators enable efficient tuple flow without intermediate buffering, and how to identify and optimize pipeline segments in complex query plans.
You've now mastered the iterator model—the elegant abstraction that enables query execution in virtually every relational database. You understand the three-method interface, pull-based control flow, the handling of blocking operators, and the trade-offs involved. Next, we'll dive deeper into pipelining and how databases maximize efficiency within the iterator framework.