Imagine a factory assembly line where each worker performs their operation as items pass by, never stopping, never buffering excess inventory. Each worker receives an item, transforms it, and immediately passes it to the next station. This continuous flow maximizes throughput while minimizing work-in-progress inventory.
Database query execution follows a remarkably similar principle called pipeline execution. In pipelining, tuples flow continuously through a chain of operators without being fully buffered at each stage. As soon as an operator produces a tuple, it can immediately be consumed by the next operator. This approach dramatically reduces memory requirements and enables the first results to appear before all input has been processed.
By the end of this page, you will understand how pipelining works in query execution, identify pipeline segments and pipeline breakers in execution plans, analyze the memory and latency benefits of pipelining, and recognize optimization opportunities for maximizing pipeline efficiency.
Pipeline execution is a mode of query processing where tuples pass directly from one operator to the next without intermediate storage. The term "pipelining" comes from the analogy to hardware processor pipelines, where multiple instructions are processed simultaneously at different stages.
The Essence of Pipelining:
In a pipelined execution:

- Each operator processes one tuple at a time rather than a complete intermediate result.
- A tuple produced by an operator is consumed immediately by its parent.
- No operator buffers more than its current tuple, so memory stays constant regardless of input size.
This contrasts sharply with batch execution where entire intermediate results would be written to disk before the next operator processes them.
Pipelining is sometimes called 'on-the-fly' processing because tuples are processed as they flow by, not stored for later batch processing. This is fundamental to why databases can return the first rows of a query result almost immediately, even when the query ultimately returns millions of rows.
Pipelining in the iterator model emerges naturally from the pull-based execution strategy. When a parent operator calls Next() on its child, and that child immediately returns a tuple (without buffering), pipelining occurs. Let's trace through how this works in practice.
Pipelining Sequence:
```
Query: SELECT name, salary * 1.1 AS bonus
       FROM employees
       WHERE salary > 50000

Operator Tree (all pipelinable):

    [Project: name, salary * 1.1]
               │
    [Filter: salary > 50000]
               │
    [TableScan: employees]

═══════════════════════════════════════════════════════════════
PIPELINE EXECUTION: Tuple #1 flows through entire tree
═══════════════════════════════════════════════════════════════

 Time   Operation
 ─────  ────────────────────────────────────────────────────────
 T1     Client calls Project.Next()
 T2     Project calls Filter.Next()
 T3     Filter calls TableScan.Next()
 T4     TableScan reads: {id:1, name:"Alice", salary:45000}
 T5     TableScan returns tuple → Filter
 T6     Filter evaluates: 45000 > 50000? FALSE
 T7     Filter calls TableScan.Next() again
 T8     TableScan reads: {id:2, name:"Bob", salary:65000}
 T9     TableScan returns tuple → Filter
 T10    Filter evaluates: 65000 > 50000? TRUE
 T11    Filter returns tuple → Project
 T12    Project computes: {"Bob", 65000 * 1.1 = 71500}
 T13    Project returns tuple → Client

 TOTAL: T1 to T13 for first tuple (NO BUFFERING occurred)

═══════════════════════════════════════════════════════════════
PIPELINE EXECUTION: Tuple #2 flows through entire tree
═══════════════════════════════════════════════════════════════

 Time   Operation
 ─────  ────────────────────────────────────────────────────────
 T14    Client calls Project.Next() again
 T15    Project calls Filter.Next()
 T16    Filter calls TableScan.Next()
 T17    TableScan reads: {id:3, name:"Carol", salary:55000}
 T18    TableScan returns tuple → Filter
 T19    Filter evaluates: 55000 > 50000? TRUE
 T20    Filter returns tuple → Project
 T21    Project computes: {"Carol", 55000 * 1.1 = 60500}
 T22    Project returns tuple → Client

 MEMORY USAGE: Same tuple slot reused - no growth!
```

Key Observations:

- Every tuple traverses the entire operator chain before the next one is read; no operator buffers its output.
- The first result reaches the client at T13, long before the scan has finished.
- Tuples that fail the filter (Alice) never reach Project, so upper operators do no wasted work.
Most databases use fixed 'tuple slots' for inter-operator communication. Each operator has an output slot where it places its current tuple. The parent reads from this slot. When the next tuple is produced, it overwrites the slot. This eliminates allocation/deallocation overhead and ensures predictable memory usage.
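The pull-based flow above maps naturally onto generators. The following minimal Python sketch (illustrative only; the operator and field names are invented for this example) implements the scan → filter → project chain from the trace, with each `yield` playing the role of returning a tuple from Next():

```python
def table_scan(rows):
    """Leaf operator: emits one tuple per request, like Next()."""
    for row in rows:
        yield row

def filter_op(child, predicate):
    """Streams through tuples that pass the predicate; buffers nothing."""
    for row in child:
        if predicate(row):
            yield row

def project_op(child, fn):
    """Transforms each tuple the moment it arrives."""
    for row in child:
        yield fn(row)

employees = [
    {"id": 1, "name": "Alice", "salary": 45000},
    {"id": 2, "name": "Bob",   "salary": 65000},
    {"id": 3, "name": "Carol", "salary": 55000},
]

# Operator tree for:
# SELECT name, salary * 1.1 AS bonus FROM employees WHERE salary > 50000
plan = project_op(
    filter_op(table_scan(employees), lambda r: r["salary"] > 50000),
    lambda r: {"name": r["name"], "bonus": r["salary"] * 1.1},
)

for result in plan:   # each iteration pulls ONE tuple through the whole tree
    print(result)     # Bob's row is emitted before Carol's is even scanned
```

Because generators are lazy, asking for the first result advances the scan only as far as Bob's row, mirroring steps T1 through T13 in the trace above.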
A pipeline segment (or simply "pipeline") is a maximal sequence of operators that can execute in a pipelined fashion. Within a segment, tuples flow continuously without buffering. A query execution plan typically consists of one or more pipeline segments connected by pipeline breakers.
Identifying Pipeline Segments:
Pipeline segments begin at data sources (table scans, index scans) or at the output of blocking operators, and extend as far as possible until reaching another blocking operator or the query result.
```
Query: SELECT dept_name, AVG(salary) AS avg_sal
       FROM employees e
       JOIN departments d ON e.dept_id = d.id
       WHERE salary > 30000
       GROUP BY dept_name
       ORDER BY avg_sal DESC

Physical Plan with Pipeline Segments:

┌──────────────────────────────────────────────────────────┐
│ PIPELINE SEGMENT 3: Final output pipeline                │
│                                                          │
│   [Result]                                               │
│      │                                                   │
│   [Sort: avg_sal DESC]   ←── BLOCKING (consumes all)     │
└──────────────────────────────────────────────────────────┘
       │
       │ (materialized intermediate result)
       │
┌──────────────────────────────────────────────────────────┐
│ PIPELINE SEGMENT 2: Aggregation pipeline                 │
│                                                          │
│   [Hash Aggregate: GROUP BY dept_name, AVG(salary)]      │
│      │                   ↑ BLOCKING (consumes all)       │
│      │                   │                               │
│   [Filter: salary > 30000]                               │
│      │                                                   │
│   [Hash Join: e.dept_id = d.id]                          │
│      │                   ↑ (build side is blocking)      │
│      │                   │                               │
└──────│───────────────────│───────────────────────────────┘
       │                   │
       │                   │ (hash table build - blocking)
       │                   │
       │   ┌───────────────│───────────────────────────────┐
       │   │ PIPELINE SEGMENT 1b: Build side               │
       │   │                                               │
       │   │   [TableScan: departments]                    │
       │   └───────────────────────────────────────────────┘
       │
┌──────│───────────────────────────────────────────────────┐
│ PIPELINE SEGMENT 1a: Probe side (pipelines into join)    │
│                                                          │
│   [TableScan: employees]                                 │
└──────────────────────────────────────────────────────────┘

Summary:
• 4 distinct pipeline segments (1a, 1b, 2, 3)
• 3 pipeline breakers (Hash build, Hash Aggregate, Sort)
• The employees scan pipelines through the join into the aggregation
• The departments scan blocks to build the hash table
```

Pipeline Segment Properties:
| Property | Description | Implication |
|---|---|---|
| Continuous Flow | Tuples pass through without buffering | Memory within a segment is O(1), independent of input size |
| Interleaved Execution | Operators in segment run concurrently at tuple level | CPU cache stays warm with active data |
| Bounded by Breakers | Segments start/end at blocking operators | Can't extend pipelining past Sort, Hash Aggregate, etc. |
| Parallel Within Segment | All segment operators work on same tuple | Natural unit for parallel execution |
When reading execution plans, look for Sort, Hash Join (build side), Hash Aggregate, and Distinct operators—these are pipeline breakers. All operators between breakers form a single pipeline segment. Understanding segments helps predict memory usage and query latency.
Pipeline breakers are operators that must consume all (or a significant portion) of their input before producing any output. They prevent pipelining because downstream operators cannot receive tuples until the blocking operation completes.
Why Some Operators Must Block:
Certain operations are fundamentally incompatible with tuple-at-a-time processing because their output depends on the entire input:
| Operator | Why It Blocks | Memory Impact |
|---|---|---|
| Sort | Must see all tuples to determine global order | O(n) - stores all input tuples |
| Hash Aggregate | Must see all group members for final aggregates | O(g) - stores per-group state, g = groups |
| Hash Join (Build) | Must complete hash table before probing | O(b) - stores build relation, b = build size |
| Distinct (Hash) | Must track all seen values | O(d) - stores distinct values, d = distinct count |
| Window Functions | May need entire partition for computation | O(p) - stores partition, p = partition size |
| Materialize | Explicitly stores input for reuse | O(n) - stores all tuples |
```
Sort Operator Execution:

Open() Phase - BLOCKING:
═══════════════════════════════════════════════════════════════
  Pull ALL tuples from child:
    child.Next() → {salary: 65000} → append to buffer
    child.Next() → {salary: 45000} → append to buffer
    child.Next() → {salary: 55000} → append to buffer
    child.Next() → {salary: 75000} → append to buffer
    child.Next() → {salary: 35000} → append to buffer
    child.Next() → NULL (exhausted)

  Sort the buffer (e.g., by salary ascending):
    buffer = [{35000}, {45000}, {55000}, {65000}, {75000}]

  Set currentIndex = 0

Next() Phase - NON-BLOCKING (iterating sorted buffer):
═══════════════════════════════════════════════════════════════
  Next() call 1: return buffer[0] = {salary: 35000}, index++
  Next() call 2: return buffer[1] = {salary: 45000}, index++
  Next() call 3: return buffer[2] = {salary: 55000}, index++
  Next() call 4: return buffer[3] = {salary: 65000}, index++
  Next() call 5: return buffer[4] = {salary: 75000}, index++
  Next() call 6: index >= length → return NULL

Timeline Perspective:
═══════════════════════════════════════════════════════════════
  Time ─────────────────────────────────────────────────────►
  [ Open() consumes all input ][ Next() delivers results ]
  [         BLOCKING          ][      NON-BLOCKING       ]

  Parent sees no output during Open() - appears to "stall"
```

When a pipeline breaker's input exceeds available memory, the operator must 'spill' data to disk. This involves writing intermediate results to temporary files and reading them back—a significant performance penalty. work_mem (PostgreSQL) or similar settings control when spilling occurs.
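The blocking Open() / streaming Next() split above can be sketched in Python as follows (a minimal illustration in the iterator style; the class and method names are invented for this example, and a real sort would spill to disk when the buffer outgrows memory):

```python
class Sort:
    """Blocking operator: open() drains its child; next() streams the buffer."""

    def __init__(self, child, key):
        self.child, self.key = child, key
        self.buffer, self.pos = [], 0

    def open(self):
        # BLOCKING phase: pull every tuple from the child, then sort.
        self.buffer = sorted(self.child, key=self.key)
        self.pos = 0

    def next(self):
        # NON-BLOCKING phase: hand out one sorted tuple per call.
        if self.pos >= len(self.buffer):
            return None                   # input exhausted
        row = self.buffer[self.pos]
        self.pos += 1
        return row

rows = iter({"salary": s} for s in (65000, 45000, 55000, 75000, 35000))
op = Sort(rows, key=lambda r: r["salary"])
op.open()                                 # parent "stalls" here until all input is consumed
while (r := op.next()) is not None:
    print(r["salary"])                    # 35000, 45000, 55000, 65000, 75000
```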
Pipelining provides substantial benefits across multiple dimensions of query execution. Understanding these benefits explains why databases strive to maximize pipelining in their execution plans.
Memory Efficiency:
The most immediate benefit is dramatically reduced memory consumption. Consider a query processing 10 million rows through 5 operators:
| Approach | Memory at Each Stage | Total Memory |
|---|---|---|
| Full Materialization | 10M rows × 5 stages | ~50M rows worth of memory |
| Pipelined Execution | 1 tuple × 5 slots | ~5 tuple slots (constant) |
| Savings Factor | — | ~10,000,000x less memory |
Latency Impact:
Pipelining fundamentally changes query latency characteristics. Instead of having to wait for complete processing, clients receive results incrementally:
```
Time ─────────────────────────────────────────────────────────────►

MATERIALIZED (Batch) Execution:
─────────────────────────────────────────────────────────────────
[ Process Stage 1 (all rows) ][ Stage 2 ][ Stage 3 ][ Results ]
                                                        ↑
                   First result appears HERE (after ALL processing)

PIPELINED Execution:
─────────────────────────────────────────────────────────────────
[S1+S2+S3]→R1 [S1+S2+S3]→R2 [S1+S2+S3]→R3 ... continues ...
          ↑
First result appears HERE (almost immediately)

Legend: S1/S2/S3 = Stages, R1/R2/R3 = Results delivered
```

Pipelining is why database clients can start showing query results immediately, even for huge result sets. Without pipelining, users would see a blank screen until the entire query finished. This responsiveness dramatically improves the user experience for interactive SQL tools.
Query optimizers employ numerous strategies to maximize pipelining and minimize the impact of pipeline breakers. These optimizations can dramatically improve execution efficiency.
Strategy 1: Operator Ordering
Place pipeline breakers as late as possible in the execution plan, allowing maximum filtering before blocking occurs:
```
Query: SELECT * FROM large_table WHERE col > 100 ORDER BY col

SUBOPTIMAL: Sort early, then filter
═══════════════════════════════════════════════════════════════
  [Filter: col > 100]     ← 1K rows pass
         │
  [Sort: col]             ← BLOCKS ON 10M ROWS ❌
         │
  [Scan: large_table]     ← 10M rows

  Memory: Sorts ALL 10M rows, filters after

OPTIMAL: Filter early, then sort
═══════════════════════════════════════════════════════════════
  [Sort: col]             ← BLOCKS ON 1K ROWS ✓
         │
  [Filter: col > 100]     ← 1K rows pass
         │
  [Scan: large_table]     ← 10M rows

  Memory: Only sorts the 1K filtered rows (10,000x smaller!)
```
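A small Python sketch (using the generator-style operators from earlier; the data and numbers are illustrative) shows the same effect: whether the sort buffers the whole input depends entirely on where the filter sits:

```python
import random

def scan(n):
    """Table scan over n rows, yielding one tuple at a time (pipelined)."""
    for _ in range(n):
        yield {"col": random.randrange(1000)}

# SUBOPTIMAL: sorted() drains the entire scan first, buffering all
# n rows before a single one is filtered.
# out = [r for r in sorted(scan(1_000_000), key=lambda r: r["col"])
#        if r["col"] > 995]

# OPTIMAL: the filter is itself a generator, so sorted() only ever
# buffers the small fraction of rows that survive the predicate.
out = sorted((r for r in scan(1_000_000) if r["col"] > 995),
             key=lambda r: r["col"])
```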
Strategy 2: Avoiding Unnecessary Blocking

Some operations can be implemented with either blocking or non-blocking algorithms. Optimizers should prefer non-blocking implementations when possible; a Top-N sketch follows the table:
| Operation | Blocking Approach | Pipelinable Alternative | When to Use Alternative |
|---|---|---|---|
| LIMIT n | Sort then take top n | Use Top-N Heap (only buffer n tuples) | Always for reasonable n |
| DISTINCT | Hash all values, then output | Sort-based streaming distinct | When input is already sorted |
| GROUP BY | Hash aggregation | Sorted aggregation (streaming) | When input is already sorted by group key |
| Merge Join | Sort both inputs first | Use merge join directly | When inputs are already sorted |
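The Top-N alternative from the first row can be sketched in Python (illustrative only; the helper `top_n` is invented for this example and assumes descending order, as in ORDER BY ... DESC LIMIT n). A bounded min-heap keeps only the n best rows seen so far, so memory stays O(n) no matter how large the input:

```python
import heapq

def top_n(child, n, key):
    """Pipelinable ORDER BY ... DESC LIMIT n: buffers at most n tuples."""
    heap = []                            # min-heap of the n largest rows so far
    for i, row in enumerate(child):
        item = (key(row), i, row)        # i breaks key ties so rows never compare
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item[0] > heap[0][0]:       # bigger than the smallest kept row?
            heapq.heapreplace(heap, item)
    return [row for _, _, row in sorted(heap, reverse=True)]

rows = ({"salary": s} for s in (45000, 75000, 35000, 65000, 55000))
print(top_n(rows, 2, key=lambda r: r["salary"]))
# [{'salary': 75000}, {'salary': 65000}] -- the full input was never sorted
```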
Strategy 3: Exploiting Existing Order
If data is already sorted (from an index scan or previous sort), subsequent operations can leverage this to avoid blocking:
```
Query: SELECT dept_id, SUM(salary)
       FROM employees
       GROUP BY dept_id
       ORDER BY dept_id

Without order exploitation:
═══════════════════════════════════════════════════════════════
  [Sort: dept_id]          ← BLOCKING
         │
  [Hash Aggregate]         ← BLOCKING
         │
  [Scan: employees]

  TWO blocking operators, neither pipelines

With order exploitation (using index):
═══════════════════════════════════════════════════════════════
  [Stream Aggregate]       ← PIPELINED! (groups arrive in order)
         │
  [Index Scan: idx_emp_dept_id]   ← Produces sorted output

  ONE pipelined operator!
  - Groups arrive in order, aggregate can stream
  - Already sorted for ORDER BY, no extra sort needed
```

Query optimizers track 'interesting orderings'—sort orders that would benefit downstream operators. An index scan producing sorted output might be chosen over a table scan specifically because it enables pipelining for later GROUP BY or ORDER BY operations, even if the scan itself is slightly more expensive.
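A minimal Python sketch of streaming aggregation (illustrative only; `itertools.groupby` stands in for a Stream Aggregate operator, and it works only because rows arrive already sorted by the group key):

```python
from itertools import groupby

def index_scan():
    """Stand-in for an index scan: rows arrive already sorted by dept_id."""
    yield from [{"dept_id": 1, "salary": 50000},
                {"dept_id": 1, "salary": 60000},
                {"dept_id": 2, "salary": 40000}]

def stream_aggregate(child, key_fn, agg_fn):
    """Pipelined GROUP BY: a group is finalized as soon as its key changes,
    so only the current group's running state is ever held in memory."""
    for key, group in groupby(child, key=key_fn):
        yield key, agg_fn(group)

for dept_id, total in stream_aggregate(index_scan(),
                                       lambda r: r["dept_id"],
                                       lambda g: sum(r["salary"] for r in g)):
    print(dept_id, total)   # (1, 110000) is emitted long before the scan ends
```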
Join operations present interesting pipelining considerations because they involve two inputs. Different join algorithms have different pipelining characteristics.
Nested Loop Join:
The most pipelining-friendly join. Both inputs can be pipelined (though the inner input may be rescanned):
```
Nested Loop Join: employees ⋈ departments

  [Nested Loop Join]
    ├── [Scan: employees]    (outer - pipelined)
    └── [Index Scan: depts]  (inner - rescanned per outer tuple)

Execution Flow:
─────────────────────────────────────────────────────────────────
  Outer.Next() → emp1
    Inner seeks to matching dept → dept_A
    Emit (emp1, dept_A) → PIPELINED to parent immediately

  Outer.Next() → emp2
    Inner seeks to matching dept → dept_B
    Emit (emp2, dept_B) → PIPELINED to parent immediately

  ... continues ...

  ✓ Outer input: fully pipelined
  ✓ Join results: fully pipelined
  △ Inner input: rescanned, but with index seeks = efficient
```
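A generator-based sketch (illustrative Python; it uses a naive rescan of the inner side where a real plan would use index seeks) shows why every matched pair can flow upward immediately:

```python
def nested_loop_join(outer, inner_fn, pred):
    """Fully pipelined join: each matching pair is emitted the moment it is found."""
    for o in outer:
        for i in inner_fn(o):      # inner side re-scanned per outer tuple
            if pred(o, i):
                yield {**o, **i}   # merge the two tuples

employees = [{"name": "Alice", "dept_id": 1}, {"name": "Bob", "dept_id": 2}]
departments = [{"dept_id": 1, "dept": "Eng"}, {"dept_id": 2, "dept": "Sales"}]

joined = nested_loop_join(iter(employees),
                          lambda o: iter(departments),   # naive rescan
                          lambda o, i: o["dept_id"] == i["dept_id"])
for row in joined:
    print(row)   # each joined row flows upward before the next outer tuple is read
```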
Hash Join:

Hash joins have asymmetric pipelining—the probe side pipelines, but the build side blocks:
```
Hash Join: large_table ⋈ small_table

  [Hash Join]
    ├── [Scan: large_table]  (probe side - PIPELINED)
    └── [Scan: small_table]  (build side - BLOCKING)

═══════════════════════════════════════════════════════════════
PHASE 1: Build (BLOCKING)
═══════════════════════════════════════════════════════════════
  Build side fully consumed:
    small_table.Next() → hash table insert
    small_table.Next() → hash table insert
    ... until exhausted ...
  Hash table complete

═══════════════════════════════════════════════════════════════
PHASE 2: Probe (PIPELINED)
═══════════════════════════════════════════════════════════════
  large_table.Next() → probe hash table
    Match found → Emit join result (PIPELINED!)
  large_table.Next() → probe hash table
    Match found → Emit join result (PIPELINED!)
  ... continues ...

Key insight: Put SMALLER table on BUILD side
  - Minimizes blocking memory
  - Larger table streams through (pipelined)
```
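The two phases can be sketched in Python as follows (illustrative only; a real build phase would partition and spill when its input exceeds memory):

```python
def hash_join(build, probe, build_key, probe_key):
    """Phase 1 (BLOCKING): drain the build side into a hash table.
       Phase 2 (PIPELINED): stream probe tuples, emitting matches at once."""
    table = {}
    for row in build:                    # blocks until build side is exhausted
        table.setdefault(build_key(row), []).append(row)
    for row in probe:                    # pipelined: one probe tuple at a time
        for match in table.get(probe_key(row), ()):
            yield {**match, **row}

departments = iter([{"dept_id": 1, "dept": "Eng"}])   # small: build side
employees = iter([{"name": "Alice", "dept_id": 1},    # large: probe side
                  {"name": "Bob", "dept_id": 1}])

for row in hash_join(departments, employees,
                     lambda r: r["dept_id"], lambda r: r["dept_id"]):
    print(row)   # results stream out as employees are scanned
```

Note how the argument order encodes the build-side choice: passing the smaller relation as `build` keeps the blocking hash table small while the larger relation streams through.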
Merge Join:

Merge joins can be fully pipelined IF inputs are already sorted. Otherwise, the sort phases block:
| Join Type | Left Input | Right Input | Output |
|---|---|---|---|
| Nested Loop | Pipelined | Rescanned (may block if materialize needed) | Pipelined |
| Hash Join | Pipelined (probe) | Blocking (build) | Pipelined |
| Merge Join (sorted inputs) | Pipelined | Pipelined | Pipelined |
| Merge Join (unsorted) | Blocking (sort) | Blocking (sort) | Pipelined |
For hash joins, always put the smaller relation on the build side. This minimizes blocking memory and maximizes pipelining of the larger relation. Good optimizers estimate cardinalities and automatically choose the optimal build side.
Understanding how to identify and analyze pipeline behavior in execution plans is crucial for query performance tuning. Different database systems provide various tools for this analysis.
PostgreSQL Example:
```sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT e.name, d.dept_name
FROM employees e JOIN departments d ON e.dept_id = d.id
WHERE e.salary > 50000
ORDER BY e.salary DESC
LIMIT 10;
```

Key Metrics to Watch:

- `actual time=X..Y` on each node: X is the time to the first row (large for blocking operators, which pay their cost up front), Y the time to the last.
- `Sort Method`: `quicksort` means the sort fit in work_mem; `external merge` means it spilled to disk.
- `Batches` on hash nodes: 1 means the hash table fit in memory; more than 1 means it spilled.
- `Buffers: shared hit/read`: how much data each part of the plan actually touched.
When you see 'external merge sort', 'Batches > 1', or 'Disk' in execution plans, the operator exceeded memory limits and spilled to temporary files. This is orders of magnitude slower than in-memory processing. Consider increasing work_mem or restructuring the query to reduce blocking operator input size.
Pipeline execution is fundamental to efficient query processing. Let's consolidate the key concepts:

- Pipelining passes tuples directly from operator to operator with no intermediate storage, so memory stays constant and first results arrive early.
- A plan decomposes into pipeline segments bounded by pipeline breakers: Sort, Hash Aggregate, Distinct, and the build side of a Hash Join.
- Breakers must consume all (or most of) their input before emitting output, and they spill to disk when that input exceeds available memory.
- Optimizers maximize pipelining by filtering before blocking operators, preferring streaming algorithms (Top-N heaps, stream aggregation), exploiting existing sort order, and building hash tables on the smaller input.
What's Next:
We've seen that some operators must block and buffer their input. The next page explores Materialization—when and why databases explicitly store intermediate results, the trade-offs involved, and strategies for minimizing materialization overhead.
You now understand pipeline execution—the art of continuous tuple flow through operator chains. You can identify pipeline segments and breakers in execution plans, appreciate the memory and latency benefits of pipelining, and recognize optimization strategies for maximizing pipeline efficiency. Next, we'll explore materialization and when buffering intermediate results is necessary or beneficial.