Imagine a factory assembly line where each worker performs their operation as items pass by, never stopping, never buffering excess inventory. Each worker receives an item, transforms it, and immediately passes it to the next station. This continuous flow maximizes throughput while minimizing work-in-progress inventory.
Database query execution follows a remarkably similar principle called pipeline execution. In pipelining, tuples flow continuously through a chain of operators without being fully buffered at each stage. As soon as an operator produces a tuple, it can immediately be consumed by the next operator. This approach dramatically reduces memory requirements and enables the first results to appear before all input has been processed.
By the end of this page, you will understand how pipelining works in query execution, identify pipeline segments and pipeline breakers in execution plans, analyze the memory and latency benefits of pipelining, and recognize optimization opportunities for maximizing pipeline efficiency.
Pipeline execution is a mode of query processing where tuples pass directly from one operator to the next without intermediate storage. The term "pipelining" comes from the analogy to hardware processor pipelines, where multiple instructions are processed simultaneously at different stages.
The Essence of Pipelining:
In a pipelined execution:

- Each operator processes one tuple at a time rather than a complete intermediate result.
- A tuple produced by an operator is consumed immediately by its parent.
- No operator buffers more than its current tuple, so memory stays constant regardless of input size.
This contrasts sharply with batch execution where entire intermediate results would be written to disk before the next operator processes them.
Pipelining is sometimes called 'on-the-fly' processing because tuples are processed as they flow by, not stored for later batch processing. This is fundamental to why databases can return the first rows of a query result almost immediately, even when the query ultimately returns millions of rows.
Pipelining in the iterator model emerges naturally from the pull-based execution strategy. When a parent operator calls Next() on its child, and that child immediately returns a tuple (without buffering), pipelining occurs. Let's trace through how this works in practice.
Pipelining Sequence:
```
Query: SELECT name, salary * 1.1 AS bonus
       FROM employees
       WHERE salary > 50000

Operator Tree (all pipelinable):

    [Project: name, salary * 1.1]
               │
    [Filter: salary > 50000]
               │
    [TableScan: employees]

═══════════════════════════════════════════════════════════════
PIPELINE EXECUTION: Tuple #1 flows through entire tree
═══════════════════════════════════════════════════════════════

 Time   Operation
 ─────  ────────────────────────────────────────────────────────
 T1     Client calls Project.Next()
 T2     Project calls Filter.Next()
 T3     Filter calls TableScan.Next()
 T4     TableScan reads: {id:1, name:"Alice", salary:45000}
 T5     TableScan returns tuple → Filter
 T6     Filter evaluates: 45000 > 50000? FALSE
 T7     Filter calls TableScan.Next() again
 T8     TableScan reads: {id:2, name:"Bob", salary:65000}
 T9     TableScan returns tuple → Filter
 T10    Filter evaluates: 65000 > 50000? TRUE
 T11    Filter returns tuple → Project
 T12    Project computes: {"Bob", 65000 * 1.1 = 71500}
 T13    Project returns tuple → Client

 TOTAL: T1 to T13 for first tuple (NO BUFFERING occurred)

═══════════════════════════════════════════════════════════════
PIPELINE EXECUTION: Tuple #2 flows through entire tree
═══════════════════════════════════════════════════════════════

 Time   Operation
 ─────  ────────────────────────────────────────────────────────
 T14    Client calls Project.Next() again
 T15    Project calls Filter.Next()
 T16    Filter calls TableScan.Next()
 T17    TableScan reads: {id:3, name:"Carol", salary:55000}
 T18    TableScan returns tuple → Filter
 T19    Filter evaluates: 55000 > 50000? TRUE
 T20    Filter returns tuple → Project
 T21    Project computes: {"Carol", 55000 * 1.1 = 60500}
 T22    Project returns tuple → Client

 MEMORY USAGE: Same tuple slot reused - no growth!
```

Key Observations:

- Every tuple traverses the entire operator chain before the next one is read; no operator buffers its output.
- The first result reaches the client at T13, long before the scan has finished.
- Tuples that fail the filter (Alice) never reach Project, so upper operators do no wasted work.
Most databases use fixed 'tuple slots' for inter-operator communication. Each operator has an output slot where it places its current tuple. The parent reads from this slot. When the next tuple is produced, it overwrites the slot. This eliminates allocation/deallocation overhead and ensures predictable memory usage.
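The pull-based flow above maps naturally onto generators. The following minimal Python sketch (illustrative only; the operator and field names are invented for this example) implements the scan → filter → project chain from the trace, with each `yield` playing the role of returning a tuple from Next():

```python
def table_scan(rows):
    """Leaf operator: emits one tuple per request, like Next()."""
    for row in rows:
        yield row

def filter_op(child, predicate):
    """Streams through tuples that pass the predicate; buffers nothing."""
    for row in child:
        if predicate(row):
            yield row

def project_op(child, fn):
    """Transforms each tuple the moment it arrives."""
    for row in child:
        yield fn(row)

employees = [
    {"id": 1, "name": "Alice", "salary": 45000},
    {"id": 2, "name": "Bob",   "salary": 65000},
    {"id": 3, "name": "Carol", "salary": 55000},
]

# Operator tree for:
# SELECT name, salary * 1.1 AS bonus FROM employees WHERE salary > 50000
plan = project_op(
    filter_op(table_scan(employees), lambda r: r["salary"] > 50000),
    lambda r: {"name": r["name"], "bonus": r["salary"] * 1.1},
)

for result in plan:   # each iteration pulls ONE tuple through the whole tree
    print(result)     # Bob's row is emitted before Carol's is even scanned
```

Because generators are lazy, asking for the first result advances the scan only as far as Bob's row, mirroring steps T1 through T13 in the trace above.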
A pipeline segment (or simply "pipeline") is a maximal sequence of operators that can execute in a pipelined fashion. Within a segment, tuples flow continuously without buffering. A query execution plan typically consists of one or more pipeline segments connected by pipeline breakers.
Identifying Pipeline Segments:
Pipeline segments begin at data sources (table scans, index scans) or at the output of blocking operators, and extend as far as possible until reaching another blocking operator or the query result.
```
Query: SELECT dept_name, AVG(salary) AS avg_sal
       FROM employees e
       JOIN departments d ON e.dept_id = d.id
       WHERE salary > 30000
       GROUP BY dept_name
       ORDER BY avg_sal DESC

Physical Plan with Pipeline Segments:

┌──────────────────────────────────────────────────────────┐
│ PIPELINE SEGMENT 3: Final output pipeline                │
│                                                          │
│   [Result]                                               │
│      │                                                   │
│   [Sort: avg_sal DESC]   ←── BLOCKING (consumes all)     │
└──────────────────────────────────────────────────────────┘
       │
       │ (materialized intermediate result)
       │
┌──────────────────────────────────────────────────────────┐
│ PIPELINE SEGMENT 2: Aggregation pipeline                 │
│                                                          │
│   [Hash Aggregate: GROUP BY dept_name, AVG(salary)]      │
│      │                   ↑ BLOCKING (consumes all)       │
│      │                   │                               │
│   [Filter: salary > 30000]                               │
│      │                                                   │
│   [Hash Join: e.dept_id = d.id]                          │
│      │                   ↑ (build side is blocking)      │
│      │                   │                               │
└──────│───────────────────│───────────────────────────────┘
       │                   │
       │                   │ (hash table build - blocking)
       │                   │
       │   ┌───────────────│───────────────────────────────┐
       │   │ PIPELINE SEGMENT 1b: Build side               │
       │   │                                               │
       │   │   [TableScan: departments]                    │
       │   └───────────────────────────────────────────────┘
       │
┌──────│───────────────────────────────────────────────────┐
│ PIPELINE SEGMENT 1a: Probe side (pipelines into join)    │
│                                                          │
│   [TableScan: employees]                                 │
└──────────────────────────────────────────────────────────┘

Summary:
• 4 distinct pipeline segments (1a, 1b, 2, 3)
• 3 pipeline breakers (Hash build, Hash Aggregate, Sort)
• The employees scan pipelines through the join into the aggregation
• The departments scan blocks to build the hash table
```

Pipeline Segment Properties:
| Property | Description | Implication |
|---|---|---|
| Continuous Flow | Tuples pass through without buffering | Memory within a segment is O(1), independent of input size |
| Interleaved Execution | Operators in segment run concurrently at tuple level | CPU cache stays warm with active data |
| Bounded by Breakers | Segments start/end at blocking operators | Can't extend pipelining past Sort, Hash Aggregate, etc. |
| Parallel Within Segment | All segment operators work on same tuple | Natural unit for parallel execution |
When reading execution plans, look for Sort, Hash Join (build side), Hash Aggregate, and Distinct operators—these are pipeline breakers. All operators between breakers form a single pipeline segment. Understanding segments helps predict memory usage and query latency.
Pipeline breakers are operators that must consume all (or a significant portion) of their input before producing any output. They prevent pipelining because downstream operators cannot receive tuples until the blocking operation completes.
Why Some Operators Must Block:
Certain operations are fundamentally incompatible with tuple-at-a-time processing because their output depends on the entire input:
| Operator | Why It Blocks | Memory Impact |
|---|---|---|
| Sort | Must see all tuples to determine global order | O(n) - stores all input tuples |
| Hash Aggregate | Must see all group members for final aggregates | O(g) - stores per-group state, g = groups |
| Hash Join (Build) | Must complete hash table before probing | O(b) - stores build relation, b = build size |
| Distinct (Hash) | Must track all seen values | O(d) - stores distinct values, d = distinct count |
| Window Functions | May need entire partition for computation | O(p) - stores partition, p = partition size |
| Materialize | Explicitly stores input for reuse | O(n) - stores all tuples |
```
Sort Operator Execution:

Open() Phase - BLOCKING:
═══════════════════════════════════════════════════════════════
  Pull ALL tuples from child:
    child.Next() → {salary: 65000} → append to buffer
    child.Next() → {salary: 45000} → append to buffer
    child.Next() → {salary: 55000} → append to buffer
    child.Next() → {salary: 75000} → append to buffer
    child.Next() → {salary: 35000} → append to buffer
    child.Next() → NULL (exhausted)

  Sort the buffer (e.g., by salary ascending):
    buffer = [{35000}, {45000}, {55000}, {65000}, {75000}]

  Set currentIndex = 0

Next() Phase - NON-BLOCKING (iterating sorted buffer):
═══════════════════════════════════════════════════════════════
  Next() call 1: return buffer[0] = {salary: 35000}, index++
  Next() call 2: return buffer[1] = {salary: 45000}, index++
  Next() call 3: return buffer[2] = {salary: 55000}, index++
  Next() call 4: return buffer[3] = {salary: 65000}, index++
  Next() call 5: return buffer[4] = {salary: 75000}, index++
  Next() call 6: index >= length → return NULL

Timeline Perspective:
═══════════════════════════════════════════════════════════════
  Time ─────────────────────────────────────────────────────►
  [ Open() consumes all input ][ Next() delivers results ]
  [         BLOCKING          ][      NON-BLOCKING       ]

  Parent sees no output during Open() - appears to "stall"
```

When a pipeline breaker's input exceeds available memory, the operator must 'spill' data to disk. This involves writing intermediate results to temporary files and reading them back—a significant performance penalty. work_mem (PostgreSQL) or similar settings control when spilling occurs.
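The blocking Open() / streaming Next() split above can be sketched in Python as follows (a minimal illustration in the iterator style; the class and method names are invented for this example, and a real sort would spill to disk when the buffer outgrows memory):

```python
class Sort:
    """Blocking operator: open() drains its child; next() streams the buffer."""

    def __init__(self, child, key):
        self.child, self.key = child, key
        self.buffer, self.pos = [], 0

    def open(self):
        # BLOCKING phase: pull every tuple from the child, then sort.
        self.buffer = sorted(self.child, key=self.key)
        self.pos = 0

    def next(self):
        # NON-BLOCKING phase: hand out one sorted tuple per call.
        if self.pos >= len(self.buffer):
            return None                   # input exhausted
        row = self.buffer[self.pos]
        self.pos += 1
        return row

rows = iter({"salary": s} for s in (65000, 45000, 55000, 75000, 35000))
op = Sort(rows, key=lambda r: r["salary"])
op.open()                                 # parent "stalls" here until all input is consumed
while (r := op.next()) is not None:
    print(r["salary"])                    # 35000, 45000, 55000, 65000, 75000
```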
Pipelining provides substantial benefits across multiple dimensions of query execution. Understanding these benefits explains why databases strive to maximize pipelining in their execution plans.
Memory Efficiency:
The most immediate benefit is dramatically reduced memory consumption. Consider a query processing 10 million rows through 5 operators:
| Approach | Memory at Each Stage | Total Memory |
|---|---|---|
| Full Materialization | 10M rows × 5 stages | ~50M rows worth of memory |
| Pipelined Execution | 1 tuple × 5 slots | ~5 tuple slots (constant) |
| Savings Factor | — | ~10,000,000x less memory |
Latency Impact:
Pipelining fundamentally changes query latency characteristics. Instead of having to wait for complete processing, clients receive results incrementally:
```
Time ─────────────────────────────────────────────────────────────►

MATERIALIZED (Batch) Execution:
─────────────────────────────────────────────────────────────────
[ Process Stage 1 (all rows) ][ Stage 2 ][ Stage 3 ][ Results ]
                                                        ↑
                   First result appears HERE (after ALL processing)

PIPELINED Execution:
─────────────────────────────────────────────────────────────────
[S1+S2+S3]→R1 [S1+S2+S3]→R2 [S1+S2+S3]→R3 ... continues ...
          ↑
First result appears HERE (almost immediately)

Legend: S1/S2/S3 = Stages, R1/R2/R3 = Results delivered
```

Pipelining is why database clients can start showing query results immediately, even for huge result sets. Without pipelining, users would see a blank screen until the entire query finished. This responsiveness dramatically improves the user experience for interactive SQL tools.
Query optimizers employ numerous strategies to maximize pipelining and minimize the impact of pipeline breakers. These optimizations can dramatically improve execution efficiency.
Strategy 1: Operator Ordering
Place pipeline breakers as late as possible in the execution plan, allowing maximum filtering before blocking occurs:
```
Query: SELECT * FROM large_table WHERE col > 100 ORDER BY col

SUBOPTIMAL: Sort early, then filter
═══════════════════════════════════════════════════════════════
  [Filter: col > 100]     ← 1K rows pass
         │
  [Sort: col]             ← BLOCKS ON 10M ROWS ❌
         │
  [Scan: large_table]     ← 10M rows

  Memory: Sorts ALL 10M rows, filters after

OPTIMAL: Filter early, then sort
═══════════════════════════════════════════════════════════════
  [Sort: col]             ← BLOCKS ON 1K ROWS ✓
         │
  [Filter: col > 100]     ← 1K rows pass
         │
  [Scan: large_table]     ← 10M rows

  Memory: Only sorts the 1K filtered rows (10,000x smaller!)
```
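A small Python sketch (using the generator-style operators from earlier; the data and numbers are illustrative) shows the same effect: whether the sort buffers the whole input depends entirely on where the filter sits:

```python
import random

def scan(n):
    """Table scan over n rows, yielding one tuple at a time (pipelined)."""
    for _ in range(n):
        yield {"col": random.randrange(1000)}

# SUBOPTIMAL: sorted() drains the entire scan first, buffering all
# n rows before a single one is filtered.
# out = [r for r in sorted(scan(1_000_000), key=lambda r: r["col"])
#        if r["col"] > 995]

# OPTIMAL: the filter is itself a generator, so sorted() only ever
# buffers the small fraction of rows that survive the predicate.
out = sorted((r for r in scan(1_000_000) if r["col"] > 995),
             key=lambda r: r["col"])
```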
Strategy 2: Avoiding Unnecessary Blocking

Some operations can be implemented with either blocking or non-blocking algorithms. Optimizers should prefer non-blocking implementations when possible; a Top-N sketch follows the table:
| Operation | Blocking Approach | Pipelinable Alternative | When to Use Alternative |
|---|---|---|---|
| LIMIT n | Sort then take top n | Use Top-N Heap (only buffer n tuples) | Always for reasonable n |
| DISTINCT | Hash all values, then output | Sort-based streaming distinct | When input is already sorted |
| GROUP BY | Hash aggregation | Sorted aggregation (streaming) | When input is already sorted by group key |
| Merge Join | Sort both inputs first | Use merge join directly | When inputs are already sorted |
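The Top-N alternative from the first row can be sketched in Python (illustrative only; the helper `top_n` is invented for this example and assumes descending order, as in ORDER BY ... DESC LIMIT n). A bounded min-heap keeps only the n best rows seen so far, so memory stays O(n) no matter how large the input:

```python
import heapq

def top_n(child, n, key):
    """Pipelinable ORDER BY ... DESC LIMIT n: buffers at most n tuples."""
    heap = []                            # min-heap of the n largest rows so far
    for i, row in enumerate(child):
        item = (key(row), i, row)        # i breaks key ties so rows never compare
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item[0] > heap[0][0]:       # bigger than the smallest kept row?
            heapq.heapreplace(heap, item)
    return [row for _, _, row in sorted(heap, reverse=True)]

rows = ({"salary": s} for s in (45000, 75000, 35000, 65000, 55000))
print(top_n(rows, 2, key=lambda r: r["salary"]))
# [{'salary': 75000}, {'salary': 65000}] -- the full input was never sorted
```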
Strategy 3: Exploiting Existing Order
If data is already sorted (from an index scan or previous sort), subsequent operations can leverage this to avoid blocking:
```
Query: SELECT dept_id, SUM(salary)
       FROM employees
       GROUP BY dept_id
       ORDER BY dept_id

Without order exploitation:
═══════════════════════════════════════════════════════════════
  [Sort: dept_id]          ← BLOCKING
         │
  [Hash Aggregate]         ← BLOCKING
         │
  [Scan: employees]

  TWO blocking operators, neither pipelines

With order exploitation (using index):
═══════════════════════════════════════════════════════════════
  [Stream Aggregate]       ← PIPELINED! (groups arrive in order)
         │
  [Index Scan: idx_emp_dept_id]   ← Produces sorted output

  ONE pipelined operator!
  - Groups arrive in order, aggregate can stream
  - Already sorted for ORDER BY, no extra sort needed
```

Query optimizers track 'interesting orderings'—sort orders that would benefit downstream operators. An index scan producing sorted output might be chosen over a table scan specifically because it enables pipelining for later GROUP BY or ORDER BY operations, even if the scan itself is slightly more expensive.
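A minimal Python sketch of streaming aggregation (illustrative only; `itertools.groupby` stands in for a Stream Aggregate operator, and it works only because rows arrive already sorted by the group key):

```python
from itertools import groupby

def index_scan():
    """Stand-in for an index scan: rows arrive already sorted by dept_id."""
    yield from [{"dept_id": 1, "salary": 50000},
                {"dept_id": 1, "salary": 60000},
                {"dept_id": 2, "salary": 40000}]

def stream_aggregate(child, key_fn, agg_fn):
    """Pipelined GROUP BY: a group is finalized as soon as its key changes,
    so only the current group's running state is ever held in memory."""
    for key, group in groupby(child, key=key_fn):
        yield key, agg_fn(group)

for dept_id, total in stream_aggregate(index_scan(),
                                       lambda r: r["dept_id"],
                                       lambda g: sum(r["salary"] for r in g)):
    print(dept_id, total)   # (1, 110000) is emitted long before the scan ends
```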
Join operations present interesting pipelining considerations because they involve two inputs. Different join algorithms have different pipelining characteristics.
Nested Loop Join:
The most pipelining-friendly join. Both inputs can be pipelined (though the inner input may be rescanned):
```
Nested Loop Join: employees ⋈ departments

  [Nested Loop Join]
    ├── [Scan: employees]    (outer - pipelined)
    └── [Index Scan: depts]  (inner - rescanned per outer tuple)

Execution Flow:
─────────────────────────────────────────────────────────────────
  Outer.Next() → emp1
    Inner seeks to matching dept → dept_A
    Emit (emp1, dept_A) → PIPELINED to parent immediately

  Outer.Next() → emp2
    Inner seeks to matching dept → dept_B
    Emit (emp2, dept_B) → PIPELINED to parent immediately

  ... continues ...

  ✓ Outer input: fully pipelined
  ✓ Join results: fully pipelined
  △ Inner input: rescanned, but with index seeks = efficient
```
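A generator-based sketch (illustrative Python; it uses a naive rescan of the inner side where a real plan would use index seeks) shows why every matched pair can flow upward immediately:

```python
def nested_loop_join(outer, inner_fn, pred):
    """Fully pipelined join: each matching pair is emitted the moment it is found."""
    for o in outer:
        for i in inner_fn(o):      # inner side re-scanned per outer tuple
            if pred(o, i):
                yield {**o, **i}   # merge the two tuples

employees = [{"name": "Alice", "dept_id": 1}, {"name": "Bob", "dept_id": 2}]
departments = [{"dept_id": 1, "dept": "Eng"}, {"dept_id": 2, "dept": "Sales"}]

joined = nested_loop_join(iter(employees),
                          lambda o: iter(departments),   # naive rescan
                          lambda o, i: o["dept_id"] == i["dept_id"])
for row in joined:
    print(row)   # each joined row flows upward before the next outer tuple is read
```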
Hash Join:

Hash joins have asymmetric pipelining—the probe side pipelines, but the build side blocks:
```
Hash Join: large_table ⋈ small_table

  [Hash Join]
    ├── [Scan: large_table]  (probe side - PIPELINED)
    └── [Scan: small_table]  (build side - BLOCKING)

═══════════════════════════════════════════════════════════════
PHASE 1: Build (BLOCKING)
═══════════════════════════════════════════════════════════════
  Build side fully consumed:
    small_table.Next() → hash table insert
    small_table.Next() → hash table insert
    ... until exhausted ...
  Hash table complete

═══════════════════════════════════════════════════════════════
PHASE 2: Probe (PIPELINED)
═══════════════════════════════════════════════════════════════
  large_table.Next() → probe hash table
    Match found → Emit join result (PIPELINED!)
  large_table.Next() → probe hash table
    Match found → Emit join result (PIPELINED!)
  ... continues ...

Key insight: Put SMALLER table on BUILD side
  - Minimizes blocking memory
  - Larger table streams through (pipelined)
```
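The two phases can be sketched in Python as follows (illustrative only; a real build phase would partition and spill when its input exceeds memory):

```python
def hash_join(build, probe, build_key, probe_key):
    """Phase 1 (BLOCKING): drain the build side into a hash table.
       Phase 2 (PIPELINED): stream probe tuples, emitting matches at once."""
    table = {}
    for row in build:                    # blocks until build side is exhausted
        table.setdefault(build_key(row), []).append(row)
    for row in probe:                    # pipelined: one probe tuple at a time
        for match in table.get(probe_key(row), ()):
            yield {**match, **row}

departments = iter([{"dept_id": 1, "dept": "Eng"}])   # small: build side
employees = iter([{"name": "Alice", "dept_id": 1},    # large: probe side
                  {"name": "Bob", "dept_id": 1}])

for row in hash_join(departments, employees,
                     lambda r: r["dept_id"], lambda r: r["dept_id"]):
    print(row)   # results stream out as employees are scanned
```

Note how the argument order encodes the build-side choice: passing the smaller relation as `build` keeps the blocking hash table small while the larger relation streams through.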
Merge Join:

Merge joins can be fully pipelined IF inputs are already sorted. Otherwise, the sort phases block:
| Join Type | Left Input | Right Input | Output |
|---|---|---|---|
| Nested Loop | Pipelined | Rescanned (may block if materialize needed) | Pipelined |
| Hash Join | Pipelined (probe) | Blocking (build) | Pipelined |
| Merge Join (sorted inputs) | Pipelined | Pipelined | Pipelined |
| Merge Join (unsorted) | Blocking (sort) | Blocking (sort) | Pipelined |
For hash joins, always put the smaller relation on the build side. This minimizes blocking memory and maximizes pipelining of the larger relation. Good optimizers estimate cardinalities and automatically choose the optimal build side.
Understanding how to identify and analyze pipeline behavior in execution plans is crucial for query performance tuning. Different database systems provide various tools for this analysis.
PostgreSQL Example:
```sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT e.name, d.dept_name
FROM employees e JOIN departments d ON e.dept_id = d.id
WHERE e.salary > 50000
ORDER BY e.salary DESC
LIMIT 10;
```

Key Metrics to Watch:

- `actual time=X..Y` on each node: X is the time to the first row (large for blocking operators, which pay their cost up front), Y the time to the last.
- `Sort Method`: `quicksort` means the sort fit in work_mem; `external merge` means it spilled to disk.
- `Batches` on hash nodes: 1 means the hash table fit in memory; more than 1 means it spilled.
- `Buffers: shared hit/read`: how much data each part of the plan actually touched.
When you see 'external merge sort', 'Batches > 1', or 'Disk' in execution plans, the operator exceeded memory limits and spilled to temporary files. This is orders of magnitude slower than in-memory processing. Consider increasing work_mem or restructuring the query to reduce blocking operator input size.
Pipeline execution is fundamental to efficient query processing. Let's consolidate the key concepts:

- Pipelining passes tuples directly from operator to operator with no intermediate storage, so memory stays constant and first results arrive early.
- A plan decomposes into pipeline segments bounded by pipeline breakers: Sort, Hash Aggregate, Distinct, and the build side of a Hash Join.
- Breakers must consume all (or most of) their input before emitting output, and they spill to disk when that input exceeds available memory.
- Optimizers maximize pipelining by filtering before blocking operators, preferring streaming algorithms (Top-N heaps, stream aggregation), exploiting existing sort order, and building hash tables on the smaller input.
What's Next:
We've seen that some operators must block and buffer their input. The next page explores Materialization—when and why databases explicitly store intermediate results, the trade-offs involved, and strategies for minimizing materialization overhead.
You now understand pipeline execution—the art of continuous tuple flow through operator chains. You can identify pipeline segments and breakers in execution plans, appreciate the memory and latency benefits of pipelining, and recognize optimization strategies for maximizing pipeline efficiency. Next, we'll explore materialization and when buffering intermediate results is necessary or beneficial.