In the world of relational databases, the join operation stands as perhaps the most computationally significant operator in query processing. While the declarative SQL statement SELECT * FROM R JOIN S ON R.a = S.b appears deceptively simple, its physical implementation determines whether a query completes in milliseconds or grinds for hours.
At the heart of every database system's query execution engine lies a family of physical join operators—algorithms that transform the abstract concept of 'combining related tuples' into concrete machine instructions. Among these operators, the nested loop join represents both the simplest conceptually and the most versatile in practice.
This page begins our deep exploration of nested loop join algorithms, starting with the most fundamental variant: the simple nested loop join (also called the naive nested loop join or tuple-at-a-time nested loop). Understanding this algorithm thoroughly is essential—not because you'll always use it directly, but because it establishes the conceptual baseline against which all other join algorithms are measured.
By the end of this page, you will understand the mechanics of the simple nested loop join algorithm, its memory requirements, how it handles different join predicates, its I/O behavior, and why—despite its simplicity—it remains relevant in modern database systems under specific circumstances.
Before diving into the algorithm, let's crystallize what we're solving. A join combines tuples from two relations based on a join predicate—a condition that determines which pairs of tuples belong together in the result.
Formal Definition:
Given relations R and S, and a join predicate θ, the join R ⋈θ S produces all pairs (r, s) where r ∈ R, s ∈ S, and θ(r, s) evaluates to true.
The challenge is computational: if R has |R| tuples and S has |S| tuples, there are |R| × |S| possible pairs to consider. For large relations, this cross product explodes combinatorially. A database with 10 million customers and 100 million orders presents 10^15 potential pairs—far too many to evaluate naively.
| Relation R | Relation S | Pair Combinations | At 1µs per pair |
|---|---|---|---|
| 1,000 rows | 1,000 rows | 1 million | 1 second |
| 10,000 rows | 10,000 rows | 100 million | ~2 minutes |
| 100,000 rows | 100,000 rows | 10 billion | ~3 hours |
| 1 million rows | 1 million rows | 1 trillion | ~12 days |
| 10 million rows | 100 million rows | 10^15 | ~32 years |
This table illustrates why join algorithms matter so profoundly. The simple nested loop join approaches this problem in the most direct way possible—by examining every pair. Its value lies not in avoiding this cost, but in doing so with minimal overhead and maximum flexibility.
The nested loop join's worst-case behavior is the full Cartesian product. Without selective join predicates or early termination conditions, performance degrades quadratically. This is why understanding when to use (and avoid) this algorithm is as important as understanding how it works.
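The arithmetic behind the table above is easy to reproduce. Here is a small sketch (the helper name `pair_time_seconds` is ours, not from any library) that converts pair counts into wall-clock estimates at 1µs per pair:

```python
# Wall-clock estimates for evaluating |R| x |S| candidate pairs,
# mirroring the cross-product table above.
def pair_time_seconds(rows_r, rows_s, us_per_pair=1.0):
    """Total comparison time in seconds at the given cost per pair."""
    return rows_r * rows_s * us_per_pair / 1_000_000

print(pair_time_seconds(1_000, 1_000))  # 1.0 (one second)
years = pair_time_seconds(10_000_000, 100_000_000) / 86_400 / 365
print(round(years, 1))                  # ~31.7 years for the 10^15-pair case
```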
The simple nested loop join is exactly what its name suggests: two nested loops, one iterating over each tuple in the outer relation and one iterating over each tuple in the inner relation. For each pair, the join predicate is evaluated, and matching pairs are added to the result.
The Algorithm Structure:
```
// Simple Nested Loop Join Algorithm
// R: Outer relation (driving table)
// S: Inner relation (probed table)
// θ: Join predicate (e.g., R.a = S.b)

function NestedLoopJoin(R, S, θ):
    result ← ∅                        // Initialize empty result set
    for each tuple r in R:            // OUTER LOOP
        for each tuple s in S:        // INNER LOOP
            if θ(r, s):               // Evaluate join predicate
                result ← result ∪ {r ⊕ s}   // Concatenate and emit
    return result
```

Detailed Execution Flow:
1. Initialization: Allocate minimal memory—just enough for one tuple from R and one tuple from S, plus any temporary storage for predicate evaluation.
2. Outer Loop Start: Fetch the first tuple from relation R into memory. This tuple becomes the current outer tuple.
3. Inner Loop Iteration: For this outer tuple, scan the entire relation S from beginning to end. Each tuple fetched from S is the current inner tuple.
4. Predicate Evaluation: For each (outer, inner) pair, evaluate θ(r, s). This may involve comparing attribute values, applying functions, or checking complex boolean expressions.
5. Result Emission: If the predicate evaluates to true, concatenate the outer and inner tuples and add to the result (or stream to the next operator in a pipelined execution).
6. Inner Loop Completion: After examining all tuples in S, the inner loop terminates.
7. Outer Loop Continuation: Advance to the next tuple in R and repeat the inner loop.
8. Termination: When all tuples in R have been processed, the join is complete.
This algorithm follows the classic 'tuple-at-a-time' processing model. Each iteration deals with exactly one tuple from each relation, making the algorithm simple to implement, debug, and reason about. Modern systems often batch this for efficiency, but the conceptual model remains the same.
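The pseudocode above translates almost directly into a runnable sketch. This minimal Python version represents tuples as plain Python tuples and accepts any boolean predicate; it is illustrative, not a production implementation:

```python
# Tuple-at-a-time simple nested loop join.
def nested_loop_join(R, S, theta):
    result = []
    for r in R:              # OUTER LOOP
        for s in S:          # INNER LOOP
            if theta(r, s):            # evaluate join predicate
                result.append(r + s)   # concatenate matching pair
    return result

R = [(1, "alice"), (2, "bob")]
S = [(1, "order-a"), (1, "order-b"), (3, "order-c")]
print(nested_loop_join(R, S, lambda r, s: r[0] == s[0]))
# [(1, 'alice', 1, 'order-a'), (1, 'alice', 1, 'order-b')]
```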
One of the most attractive properties of the simple nested loop join is its minimal memory footprint. Unlike hash joins or sort-merge joins that may require memory proportional to relation sizes, the simple nested loop needs only constant space.
Minimum Memory Configuration:

At minimum, the algorithm needs a buffer for the current tuple of R, a buffer for the current tuple of S, scratch space for predicate evaluation, and an output buffer.

Total Memory Bound:

M_min = sizeof(tuple_R) + sizeof(tuple_S) + sizeof(predicate_temp) + sizeof(output_buffer)
For typical OLTP tuples (hundreds of bytes each), this might be just 1-2 kilobytes—negligible compared to hash join's potential gigabyte requirements.
Practical Buffer Allocation:
While the algorithm can run with minimal memory, practical implementations allocate at least one disk page (typically 4KB-16KB) for each relation. This amortizes disk seek overhead and aligns with the storage engine's I/O unit:
```
// Practical buffer allocation for nested loop join
// P: Page size (e.g., 8KB)
// M: Available memory

outer_buffer  ← allocate(P)   // One page for outer relation
inner_buffer  ← allocate(P)   // One page for inner relation
output_buffer ← allocate(P)   // One page for output assembly

// Remaining memory can cache inner relation pages
cache_pages ← (M - 3P) / P
inner_cache ← allocate(cache_pages × P)   // Optional: cache inner pages
```

The simple nested loop join's minimal memory requirement makes it invaluable in embedded databases, memory-constrained systems, and situations where concurrent queries compete for buffer pool space. When joining very small relations, the overhead of building hash tables may exceed the simple nested loop's straightforward approach.
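To make the buffer arithmetic concrete, here is the same allocation scheme with illustrative figures (an 8 KiB page and 1 MiB of working memory; both numbers are assumptions, not from the text above):

```python
# Buffer budgeting sketch for the allocation scheme above.
P = 8 * 1024          # page size in bytes (assumed 8 KiB)
M = 1024 * 1024       # memory available to this join (assumed 1 MiB)

fixed_buffers = 3 * P              # outer page + inner page + output page
cache_pages = (M - 3 * P) // P     # leftover pages can cache the inner relation
print(cache_pages)  # 125
```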
The I/O cost of the simple nested loop join is where its simplicity becomes a liability for large relations. Understanding this cost precisely helps us recognize when the algorithm is appropriate and when alternatives are necessary.
Defining the Variables:

- b_R: number of disk blocks (pages) occupied by relation R
- n_R: number of tuples in relation R
- b_S: number of disk blocks occupied by relation S
- n_S: number of tuples in relation S
Tuple-at-a-Time I/O Cost:
In the worst case (no caching, each tuple access requires disk I/O):
Cost_simple = b_R + (n_R × b_S)
This formula reveals the brutal truth: for each tuple in R (not block—tuple!), we scan the entire relation S. If R has 1 million tuples and S occupies 1,000 blocks, we perform 1 billion block reads plus the initial R scan.
| R Size | S Size | R Tuples | Block I/Os | At 10ms/block |
|---|---|---|---|---|
| 10 blocks | 10 blocks | 1,000 | 10,010 | ~100 seconds |
| 100 blocks | 100 blocks | 10,000 | 1,000,100 | ~2.8 hours |
| 1,000 blocks | 1,000 blocks | 100,000 | 100,001,000 | ~12 days |
| 10,000 blocks | 1,000 blocks | 1 million | 1,000,010,000 | ~4 months |
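The cost formula is simple enough to check directly. This sketch reproduces the first and last rows of the table above (the function name `cost_simple` mirrors the formula's name in the text):

```python
# Tuple-at-a-time I/O cost: Cost_simple = b_R + (n_R * b_S)
def cost_simple(b_R, n_R, b_S):
    return b_R + n_R * b_S

print(cost_simple(10, 1_000, 10))             # 10,010 (first table row)
print(cost_simple(10_000, 1_000_000, 1_000))  # 1,000,010,000 (last table row)
```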
Page-Level Optimization:
Practical implementations improve on the tuple-at-a-time model by processing at the page level:
Cost_page = b_R + (b_R × b_S)
Here, for each block of R (not tuple), we scan S once. This is dramatically better—the 1 million tuple case with 10,000 R blocks and 1,000 S blocks becomes 10,000 + (10,000 × 1,000) = 10,010,000 I/Os—still large but 100x better than tuple-at-a-time.
With cost = b_R + (b_R × b_S), the choice of outer relation matters surprisingly little. If R has 100 blocks and S has 10,000: placing R as outer costs 100 + (100 × 10,000) = 1,000,100 I/Os. Swapping gives 10,000 + (10,000 × 100) = 1,010,000 I/Os—essentially the same. For block nested loop variants with larger outer buffers, however, choosing the smaller relation as outer becomes crucial.
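A quick sketch confirms that swapping outer and inner barely changes the page-level cost:

```python
# Page-level I/O cost: Cost_page = b_R + (b_R * b_S)
def cost_page(b_R, b_S):
    return b_R + b_R * b_S

print(cost_page(100, 10_000))  # 1,000,100 (R as outer)
print(cost_page(10_000, 100))  # 1,010,000 (S as outer -- nearly identical)
```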
Understanding the I/O Pattern:
The simple nested loop creates a distinctive I/O pattern: the outer relation is read once, sequentially, from start to finish, while the inner relation is re-scanned in full, start to finish, once per outer tuple (or per outer block in the page-level variant).

This pattern explains why the simple nested loop performs terribly when S is too large to cache—S is re-read from disk for each outer block, no matter how the scans are scheduled.
A strength of the nested loop join is its flexibility with join predicates. Unlike hash joins (which require equality predicates on hashable attributes) or sort-merge joins (which require sortable equality predicates), the nested loop can handle any predicate expressible as a boolean function.
Supported Predicate Types:
- R.a = S.b (standard equi-join)
- R.a < S.b, R.a >= S.b, R.a <> S.b (inequality joins)
- R.a BETWEEN S.low AND S.high (range predicates)
- R.name LIKE S.pattern (pattern matching)
- SUBSTRING(R.a, 1, 3) = S.prefix (expressions over attributes)
- R.a = S.b AND R.c > S.d OR R.e LIKE '%test%' (compound boolean predicates)
- 1 = 1 (always true: degenerates to the full cross product)

Predicate Evaluation Cost:
The predicate is evaluated once per tuple pair. For complex predicates, this cost can dominate:
Total_CPU_cost = n_R × n_S × cost_per_predicate_eval
For simple equality comparisons, cost_per_predicate_eval might be a few CPU cycles. For predicates involving string operations, function calls, or subqueries, costs can vary by orders of magnitude.
```sql
-- Example predicates handled by nested loop join

-- Simple equality (most common)
SELECT * FROM Orders o JOIN Customers c ON o.customer_id = c.id;

-- Inequality join (range comparison)
SELECT * FROM Products p JOIN PriceHistory h
  ON p.id = h.product_id AND p.current_price > h.historical_price;

-- Spatial/geometric predicates
SELECT * FROM Buildings b JOIN Zones z
  ON ST_Contains(z.boundary, b.location);

-- Complex compound predicate
SELECT * FROM Employees e JOIN Departments d
  ON (e.dept_id = d.id OR e.secondary_dept = d.id)
  AND e.hire_date BETWEEN d.active_from AND d.active_until
  AND d.name LIKE e.preferred_dept_pattern;
```

For non-equi joins and complex predicates, the nested loop is often the only viable physical operator. Hash and sort-merge optimizations depend on specific predicate structures. The nested loop's willingness to evaluate any boolean function makes it the universal fallback.
The simple nested loop naturally supports all SQL join types through minor modifications to the control flow and output logic.
Inner Join (Default):
Emit only when predicate matches. Non-matching tuples from either relation are excluded.
```
// Inner Join: emit matched pairs only
for each r in R:
    for each s in S:
        if θ(r, s):
            emit(r ⊕ s)
```

Left Outer Join:
For each outer tuple, if no inner tuple matches, emit the outer tuple padded with NULLs.
```
// Left Outer Join: preserve unmatched outer tuples
for each r in R:
    matched ← false
    for each s in S:
        if θ(r, s):
            emit(r ⊕ s)
            matched ← true
    if not matched:
        emit(r ⊕ NULL_S)   // Pad with NULL values for S attributes
```

Right Outer Join:
Similar to left outer, but preserving unmatched inner tuples. Typically implemented by swapping R and S, then using left outer logic.
```
// Right Outer Join: preserve unmatched inner tuples
// Track which S tuples have been matched
matched_S ← new BitSet(|S|)

for each r in R:
    for i, s in enumerate(S):
        if θ(r, s):
            emit(r ⊕ s)
            matched_S.set(i)

// Emit unmatched S tuples with NULL padding
for i, s in enumerate(S):
    if not matched_S.get(i):
        emit(NULL_R ⊕ s)
```

Full Outer Join:
Combines left and right outer semantics—preserve unmatched from both sides.
```
// Full Outer Join: preserve unmatched from both relations
matched_S ← new BitSet(|S|)

for each r in R:
    r_matched ← false
    for i, s in enumerate(S):
        if θ(r, s):
            emit(r ⊕ s)
            r_matched ← true
            matched_S.set(i)
    if not r_matched:
        emit(r ⊕ NULL_S)

// Emit unmatched S tuples
for i, s in enumerate(S):
    if not matched_S.get(i):
        emit(NULL_R ⊕ s)
```

For semi-joins (EXISTS) and anti-joins (NOT EXISTS), the inner loop can terminate early upon first match (or non-match). This optimization can dramatically reduce execution time when the first matching tuple appears early in S: `if θ(r, s): emit(r); break`
Understanding how the nested loop join interacts with query pipelines is crucial for query optimization and performance prediction.
Pipelining Characteristics:
The nested loop join is a pipelining operator on the outer relation and a blocking operator on the inner relation. This distinction has profound implications: results can stream to the parent operator as soon as the first outer tuple finds a match, and the outer input never needs to be materialized. The inner relation, by contrast, must be fully available and rescannable before the join can make progress.
Implications for Query Execution:
Left-deep query plans: Nested loops naturally fit left-deep trees where results from one join feed into the next join's outer relation, maintaining pipelining.
Materialization points: If the inner relation is a complex subquery or another join, the executor may need to materialize it to enable repeated scanning.
Memory pressure: While the outer can pipeline, caching inner pages improves performance dramatically. The query optimizer balances this against other operators competing for buffer pool space.
Early termination: LIMIT clauses can terminate the outer loop early, saving significant work. SELECT * FROM R JOIN S ... LIMIT 10 may finish after processing just a few outer tuples.
Correlated subqueries in the WHERE clause are often executed using nested loop semantics—the outer query acts as R, and the subquery is re-evaluated (like scanning S) for each outer row. Understanding nested loops helps you recognize and optimize these patterns.
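The early-termination behavior described above can be sketched with a Python generator: writing the join as a lazy stream lets a LIMIT-style consumer (here `itertools.islice`, standing in for a LIMIT operator) stop the outer loop after only a few tuples:

```python
from itertools import islice

# Streaming (pipelined) nested loop join: rows are produced on demand.
def nested_loop_join_stream(R, S, theta):
    for r in R:
        for s in S:
            if theta(r, s):
                yield r + s   # emit without materializing the full result

R = [(i,) for i in range(1_000_000)]   # large outer relation
S = [(0,), (1,)]
first_three = list(islice(
    nested_loop_join_stream(R, S, lambda r, s: r[0] % 2 == s[0]), 3))
print(first_three)  # [(0, 0), (1, 1), (2, 0)]
```

Only the first few outer tuples are ever touched; the remaining ~1 million are never scanned.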
Building a production-quality nested loop join requires attention to details beyond the basic algorithm. Here are the key engineering considerations:
Iterator Interface:
Modern database engines implement operators using the iterator model (also called the Volcano model). Each operator exposes three methods:
```
class NestedLoopJoin implements Iterator<Tuple>:
    outer: Iterator<Tuple>
    inner: Iterator<Tuple>
    inner_source: Relation            // For rewinding
    predicate: (Tuple, Tuple) → bool
    current_outer: Tuple | EOF

    function open():
        outer.open()
        inner.open()
        current_outer ← outer.next()

    function next() → Tuple | EOF:
        while current_outer ≠ EOF:
            inner_tuple ← inner.next()
            if inner_tuple = EOF:
                // Inner exhausted, advance outer
                current_outer ← outer.next()
                inner.rewind()        // Critical: restart inner scan
                continue
            if predicate(current_outer, inner_tuple):
                return concatenate(current_outer, inner_tuple)
        return EOF

    function close():
        outer.close()
        inner.close()
```

Critical Implementation Details:
- Inner rescan: the inner relation must support rewind() or equivalent. For table scans, this means seeking back to the first page. For complex operators, this may require materialization.
- NULL semantics: under SQL's three-valued logic, NULL = NULL evaluates to unknown, not true, affecting outer join correctness.

Contemporary high-performance databases (like DuckDB, ClickHouse) use vectorized execution, processing batches of tuples rather than one at a time. The nested loop concept adapts—outer blocks of tuples are joined against inner blocks, enabling SIMD predicate evaluation and better cache utilization.
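The iterator interface above can be sketched in Python, with the inner input rewound by re-creating an iterator over a materialized list (a simplification of a real engine's page-level rewind):

```python
# open/next/close iterator-model sketch of the nested loop join.
class NestedLoopJoin:
    def __init__(self, outer_rows, inner_rows, predicate):
        self.outer_rows = outer_rows
        self.inner_rows = inner_rows   # materialized so it can be rescanned
        self.predicate = predicate

    def open(self):
        self.outer_it = iter(self.outer_rows)
        self.current_outer = next(self.outer_it, None)
        self.inner_it = iter(self.inner_rows)

    def next(self):
        while self.current_outer is not None:
            inner = next(self.inner_it, None)
            if inner is None:                          # inner exhausted
                self.current_outer = next(self.outer_it, None)
                self.inner_it = iter(self.inner_rows)  # rewind inner scan
                continue
            if self.predicate(self.current_outer, inner):
                return self.current_outer + inner
        return None  # EOF

    def close(self):
        pass  # a real engine would release buffers and cursors here

op = NestedLoopJoin([(1,), (2,)], [(2, "x"), (1, "y")],
                    lambda r, s: r[0] == s[0])
op.open()
rows = []
while (t := op.next()) is not None:
    rows.append(t)
op.close()
print(rows)  # [(1, 1, 'y'), (2, 2, 'x')]
```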
We've explored the simple nested loop join—the most fundamental physical join operator. Let's consolidate the essential points:

- Two nested loops evaluate the join predicate over every (outer, inner) tuple pair.
- Memory requirements are constant: one tuple (or page) buffer per relation plus an output buffer.
- I/O cost is b_R + (n_R × b_S) tuple-at-a-time, improving to b_R + (b_R × b_S) with page-level processing.
- Any boolean predicate is supported, making it the universal fallback for non-equi joins.
- All SQL join types (inner, left/right/full outer, semi, anti) follow from small control-flow changes.
- The operator pipelines on the outer relation but must repeatedly rescan the inner relation.
What's Next:
The simple nested loop's Achilles heel is its repeated scanning of the inner relation. The next page introduces the block nested loop join, which dramatically reduces I/O by processing multiple outer tuples together, scanning the inner relation once per block of outer tuples rather than once per tuple.
You now understand the simple nested loop join algorithm—its mechanics, memory requirements, I/O costs, predicate handling, and execution semantics. This foundational knowledge prepares you for the optimized variants we'll explore next: block nested loop and index nested loop joins.