The probe phase is where hash join's investment pays off. Having constructed a hash table in the build phase, we now scan the probe relation and perform constant-time lookups for each tuple. This is the moment when hash join demonstrates its power: what would be O(n × m) comparisons in a naive nested loop becomes O(n + m) operations.
The probe phase largely determines join performance. Efficient probe operations can process millions of tuples per second, while subtle inefficiencies—cache misses, poor memory access patterns, excessive hash collisions—can degrade performance by orders of magnitude. Understanding probe phase mechanics is essential for diagnosing join performance issues and appreciating optimizer decisions.
By the end of this page, you will understand how probe phase lookups work, why probe efficiency depends on build phase quality, how bloom filters accelerate negative lookups, the mechanics of output generation, and the special considerations for outer joins. This knowledge enables you to trace join execution paths and identify performance bottlenecks.
The probe phase follows a straightforward pattern: for each tuple in the probe relation, look up matching tuples in the hash table and output combined results. The elegance lies in the O(1) lookup cost per probe tuple.
Core probe algorithm:
```cpp
// Basic probe phase implementation
class HashJoinProbeOperator {
    HashTable* hash_table;           // Built in previous phase
    Operator* probe_child;           // Source of probe tuples
    vector<int> join_key_columns;    // Which columns form the join key

    // Current state for iterator model
    Tuple current_probe_tuple;
    vector<Tuple*>::iterator match_iterator;
    vector<Tuple*> current_matches;
    bool probe_exhausted;

    void initialize(HashTable* ht, Operator* probe_source) {
        hash_table = ht;
        probe_child = probe_source;
        probe_exhausted = false;
        // Load first probe tuple
        advance_probe();
    }

    bool next(Tuple* output) {
        while (!probe_exhausted) {
            // Check if we have more matches for current probe tuple
            if (match_iterator != current_matches.end()) {
                // Combine probe tuple with matching build tuple
                combine_tuples(current_probe_tuple, **match_iterator, output);
                ++match_iterator;
                return true;
            }
            // No more matches; advance to next probe tuple
            if (!advance_probe()) {
                probe_exhausted = true;
                return false;
            }
        }
        return false;
    }

private:
    bool advance_probe() {
        if (!probe_child->next(&current_probe_tuple)) {
            return false;  // Probe relation exhausted
        }
        // Look up matches in hash table
        JoinKey key = extract_key(current_probe_tuple, join_key_columns);
        if (key.has_null()) {
            // NULL keys never match; skip to next probe tuple
            current_matches.clear();
        } else {
            uint64_t hash_value = compute_hash(key);
            current_matches = hash_table->find_all(hash_value, key);
        }
        match_iterator = current_matches.begin();
        return true;
    }
};
```

The code above demonstrates the Volcano-style iterator model where each operator's next() call produces one output tuple. This enables pipelining—results flow to parent operators as they're produced. However, a single probe tuple may produce multiple output tuples (when multiple build tuples share the same key), requiring stateful iteration within the probe operator.
The probe lookup operation is deceptively simple in concept but nuanced in implementation. Each lookup involves hashing, bucket location, and key comparison—each step with performance implications.
Step 1: Hash Computation
The probe tuple's join key must be hashed using the identical hash function as the build phase. Using a different function—or even a different seed—would cause lookups to check wrong buckets, producing incorrect (missing) results.
Step 2: Bucket Location
The hash value maps to a bucket: bucket_index = hash_value mod num_buckets. This is a single arithmetic operation, but accessing the bucket itself may cause a cache miss if the bucket array is large and access patterns are random.
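As a small illustration (assuming a power-of-two bucket count, a common implementation choice not stated in the text above), the modulo can be replaced by a bit mask so the hot path avoids an integer division; the helper name is hypothetical:

```cpp
#include <cstddef>
#include <cstdint>

// Maps a hash value to a bucket index.
// If num_buckets is a power of two, hash % num_buckets reduces to a mask.
inline size_t bucket_index(uint64_t hash_value, size_t num_buckets) {
    return hash_value & (num_buckets - 1);  // valid only when num_buckets == 2^k
}
```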
Step 3: Key Comparison
Within the bucket, we must compare the probe key against each stored key. This is necessary because:
- Multiple distinct keys can land in the same bucket (the bucket index is only hash_value mod num_buckets).
- Even identical hash values can come from different keys (hash collisions), so matching hashes alone do not prove matching keys.
Optimization: Store hash values alongside tuples
Comparing full keys (especially strings or composite keys) is expensive. A common optimization: store the full hash value with each tuple in the hash table. During probe, first compare hash values (cheap integer comparison), and only compare full keys if hash values match. This filters most non-matches cheaply.
```cpp
// Optimized probe lookup with stored hash values
vector<Tuple*> HashTable::find_all(uint64_t probe_hash, const JoinKey& probe_key) {
    vector<Tuple*> matches;

    // Locate bucket
    size_t bucket_idx = probe_hash % num_buckets;
    HashEntry* entry = buckets[bucket_idx];

    // Traverse chain
    while (entry != nullptr) {
        // First: cheap hash comparison
        if (entry->cached_hash == probe_hash) {
            // Hash matches - might be real match, might be hash collision
            // Second: full key comparison (more expensive)
            if (keys_equal(entry->key, probe_key)) {
                matches.push_back(&entry->tuple);
            }
        }
        entry = entry->next;
    }
    return matches;
}

// Comparison function for different key types
bool keys_equal(const JoinKey& a, const JoinKey& b) {
    if (a.type != b.type) return false;

    switch (a.type) {
        case KeyType::INTEGER:
            return a.int_value == b.int_value;

        case KeyType::STRING:
            // String comparison is expensive -
            // check length first as quick filter
            if (a.str_value.length() != b.str_value.length()) return false;
            return a.str_value == b.str_value;

        case KeyType::COMPOSITE:
            for (size_t i = 0; i < a.num_columns; i++) {
                if (!columns_equal(a.columns[i], b.columns[i])) return false;
            }
            return true;

        default:
            return false;
    }
}
```

| Factor | Impact | Mitigation |
|---|---|---|
| Hash computation | ~10-50 CPU cycles | Hardware-accelerated functions |
| Bucket access cache miss | ~100-300 cycles (L3/RAM) | Prefetching, smaller bucket array |
| Chain traversal | ~10-50 cycles per entry + cache miss per pointer dereference | Keep chains short, use contiguous storage |
| Key comparison | ~5-500 cycles depending on key type | Store hash values, compare hash first |
| Match output construction | ~50-200 cycles per output | Avoid deep copies where possible |
One of the most powerful probe phase optimizations is the Bloom filter—a probabilistic data structure that can quickly answer "Is this key definitely NOT in the hash table?" with zero false negatives.
The problem it solves:
In many joins, a significant fraction of probe tuples have no matching build tuples. For these non-matching probes, we still pay the full cost of hashing, bucket access, and (empty) chain traversal. If we could identify non-matches earlier, we could skip the expensive hash table lookup entirely.
How Bloom filters work:
A Bloom filter is a bit array of m bits, initially all zeros. During the build phase, for each inserted key:
- Compute k hash values for the key (in the sketch below, they are derived from the primary hash via double hashing).
- Set the k corresponding bit positions to 1.
During the probe phase, for each probe key:
- Compute the same k hash values.
- If any of the k bits is 0, the key is definitely not in the hash table, so the lookup can be skipped entirely.
- If all k bits are 1, the key might be present, so proceed with the real hash table lookup.
The key insight: false positives are possible, but false negatives are not. A Bloom filter never claims a key is absent when it's present.
```cpp
// Bloom filter integration with hash join
class BloomFilter {
    vector<bool> bits;
    size_t num_bits;
    int num_hash_functions;

public:
    BloomFilter(size_t expected_elements, double fp_rate = 0.01) {
        // Optimal sizing formulas
        // m = -n * ln(p) / (ln(2)^2)
        num_bits = static_cast<size_t>(
            -static_cast<double>(expected_elements) * log(fp_rate) / (log(2) * log(2))
        );
        // k = (m/n) * ln(2)
        num_hash_functions = static_cast<int>(
            (static_cast<double>(num_bits) / expected_elements) * log(2)
        );
        bits.resize(num_bits, false);
    }

    void insert(uint64_t primary_hash) {
        // Use double hashing to generate k hash values
        uint64_t h1 = primary_hash;
        uint64_t h2 = hash_secondary(primary_hash);
        for (int i = 0; i < num_hash_functions; i++) {
            size_t bit_pos = (h1 + i * h2) % num_bits;
            bits[bit_pos] = true;
        }
    }

    bool might_contain(uint64_t primary_hash) {
        uint64_t h1 = primary_hash;
        uint64_t h2 = hash_secondary(primary_hash);
        for (int i = 0; i < num_hash_functions; i++) {
            size_t bit_pos = (h1 + i * h2) % num_bits;
            if (!bits[bit_pos]) {
                return false;  // Definitely not in set
            }
        }
        return true;  // Might be in set
    }
};

// Enhanced probe phase with Bloom filter
bool advance_probe_with_bloom() {
    while (probe_child->next(&current_probe_tuple)) {
        JoinKey key = extract_key(current_probe_tuple, join_key_columns);
        if (key.has_null()) continue;  // Skip NULLs

        uint64_t hash_value = compute_hash(key);

        // Quick Bloom filter check before expensive hash table lookup
        if (!bloom_filter->might_contain(hash_value)) {
            // Definitely no match - skip hash table lookup entirely
            probe_tuples_bloom_filtered++;
            continue;
        }

        // Might have match - proceed with real lookup
        current_matches = hash_table->find_all(hash_value, key);
        if (!current_matches.empty()) {
            match_iterator = current_matches.begin();
            return true;
        }
        // False positive case: Bloom said maybe, but no actual match
    }
    return false;
}
```

Bloom filters trade a small amount of memory for a large reduction in lookup cost. A 1% false positive rate requires ~10 bits per element. For 1 million build tuples, that's ~1.25 MB—negligible compared to the main hash table but providing significant speedup when many probe tuples don't match. Analytical databases with highly selective joins often see 80-90% of probe tuples eliminated by Bloom filters.
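Plugging the numbers from the paragraph above into the sizing formulas gives a quick sanity check (a worked example for $n = 10^6$ build keys and $p = 0.01$):

$$
m = \frac{-n \ln p}{(\ln 2)^2} = \frac{10^6 \cdot \ln 100}{0.4805} \approx 9.6 \times 10^6 \text{ bits} \approx 1.2\ \text{MB},
\qquad
k = \frac{m}{n}\ln 2 \approx 9.6 \times 0.693 \approx 7
$$

which matches the 1% row of the table below.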
| False Positive Rate | Bits per Element | Hash Functions | Use Case |
|---|---|---|---|
| 10% | 4.8 | 3 | Quick filter, memory-constrained |
| 1% | 9.6 | 7 | Standard applications |
| 0.1% | 14.4 | 10 | High-selectivity joins |
| 0.01% | 19.2 | 13 | Very large ratio of non-matches |
When a probe tuple matches one or more build tuples, the join produces output tuples. This output generation step is often overlooked but can significantly impact performance, especially for wide tables or joins with high match rates.
Output tuple composition:
The output schema typically includes:
- The selected columns from the probe-side tuple.
- The selected columns from the matched build-side tuple.
The physical representation depends on the execution model:
Materialized output: Copy all column values into a new tuple buffer. Safe but expensive for wide tuples.
Reference-based output: Store pointers to source tuples plus the matched row. Cheaper but requires source tuples to remain valid.
Column-position output: In columnar execution, output column pointers plus selection vectors. Most efficient for analytical workloads.
```cpp
// Different output generation strategies

// Strategy 1: Materialized (copy all data)
void combine_tuples_materialized(
        const Tuple& probe, const Tuple& build,
        Tuple* output, const OutputSchema& schema) {
    output->allocate(schema.total_bytes());
    size_t offset = 0;

    // Copy selected probe columns
    for (int col : schema.probe_columns) {
        size_t col_size = probe.column_size(col);
        memcpy(output->data() + offset, probe.column_data(col), col_size);
        offset += col_size;
    }

    // Copy selected build columns
    for (int col : schema.build_columns) {
        size_t col_size = build.column_size(col);
        memcpy(output->data() + offset, build.column_data(col), col_size);
        offset += col_size;
    }
}

// Strategy 2: Reference-based (just store pointers)
struct JoinOutputRef {
    const Tuple* probe_tuple;
    const Tuple* build_tuple;
    // Columns accessed on-demand through pointers
};

// Strategy 3: Vectorized columnar output
struct VectorizedJoinOutput {
    // For each output column, points to source column + selection vector
    struct ColumnBinding {
        Column* source_column;
        SelectionVector* selection;   // Which rows matched
        bool from_probe;              // vs from build
        int source_column_index;
    };
    vector<ColumnBinding> output_columns;
    size_t num_output_rows;
};
```

Modern analytical databases practice 'late materialization'—deferring the assembly of complete output tuples until absolutely necessary. If a downstream operator only needs certain columns, or if additional filters might eliminate tuples, avoiding early materialization saves substantial memory bandwidth. This is especially powerful when join output feeds directly into aggregation that only needs a subset of columns.
Handling multiple matches:
When a probe tuple matches multiple build tuples, the join produces multiple outputs. This is common in many-to-many relationships and requires careful handling:
Probe tuple P1 matches build tuples B1, B2, B3
→ Outputs: (P1, B1), (P1, B2), (P1, B3)
The iterator model handles this by maintaining match state across next() calls. The current probe tuple is retained until all matches are emitted, then we advance to the next probe tuple.
For batch/vectorized execution, the output may be larger than the input batch, requiring output buffer management and potentially splitting output across multiple downstream calls.
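A minimal sketch of that buffer management, with all type and function names hypothetical: matches for each probe row are drained into fixed-size output batches, and a cursor records where to resume when a batch fills up mid-row.

```cpp
#include <cstddef>
#include <vector>

struct BuildRow;                                   // a matched build-side row
struct OutputBatch {
    std::vector<size_t>          probe_row;        // index into the probe batch
    std::vector<const BuildRow*> build_row;        // matched build-side row
    size_t capacity = 1024;
    bool full() const { return probe_row.size() >= capacity; }
};

// Resume point so one probe batch can fill several output batches.
struct ProbeCursor {
    size_t probe_idx = 0;   // next probe row to (re)visit
    size_t match_idx = 0;   // next match of that row to emit
};

// Returns true if output remains and the caller should supply a fresh batch.
bool emit_matches(const std::vector<std::vector<const BuildRow*>>& matches_per_row,
                  ProbeCursor& cursor, OutputBatch& out) {
    for (; cursor.probe_idx < matches_per_row.size(); ++cursor.probe_idx) {
        const auto& matches = matches_per_row[cursor.probe_idx];
        for (; cursor.match_idx < matches.size(); ++cursor.match_idx) {
            if (out.full()) return true;           // pause: output batch is full
            out.probe_row.push_back(cursor.probe_idx);
            out.build_row.push_back(matches[cursor.match_idx]);
        }
        cursor.match_idx = 0;                      // move to the next probe row
    }
    return false;                                  // probe batch fully emitted
}
```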
Outer joins require additional bookkeeping during the probe phase to ensure non-matching tuples are properly included in the output. The specific requirements depend on which outer join variant is being executed.
```cpp
// LEFT OUTER JOIN probe phase
bool probe_left_outer_next(Tuple* output) {
    while (!probe_exhausted) {
        if (match_iterator != current_matches.end()) {
            // Normal match - output joined tuple
            combine_tuples(current_probe_tuple, **match_iterator, output);
            ++match_iterator;
            return true;
        }

        // Check if we have any matches for current probe tuple
        if (current_matches.empty() && !current_probe_processed) {
            // No matches - output probe with NULLs for build columns
            combine_tuple_with_nulls(current_probe_tuple, output);
            current_probe_processed = true;
            return true;
        }

        // Advance to next probe tuple
        if (!advance_probe()) {
            probe_exhausted = true;
            return false;
        }
        current_probe_processed = false;
    }
    return false;
}

// RIGHT OUTER JOIN - requires tracking matched build tuples
struct HashEntryWithMatch {
    HashEntry base;
    bool matched;   // Set to true when this tuple matches a probe tuple
};

bool probe_right_outer_next(Tuple* output) {
    // First phase: normal probing with match tracking
    while (!probe_exhausted) {
        if (match_iterator != current_matches.end()) {
            // Mark build tuple as matched
            (*match_iterator)->matched = true;
            combine_tuples(current_probe_tuple, **match_iterator, output);
            ++match_iterator;
            return true;
        }
        if (!advance_probe()) {
            probe_exhausted = true;
            // Start second phase: emit unmatched build tuples
            unmatched_iterator = hash_table->begin();
        }
    }

    // Second phase: emit unmatched build tuples with NULL probe columns
    while (unmatched_iterator != hash_table->end()) {
        if (!unmatched_iterator->matched) {
            combine_nulls_with_tuple(unmatched_iterator->tuple, output);
            ++unmatched_iterator;
            return true;
        }
        ++unmatched_iterator;
    }
    return false;
}

// FULL OUTER JOIN combines both approaches
// - Track matched build tuples (like RIGHT OUTER)
// - Output non-matching probe tuples (like LEFT OUTER)
// - After probe: emit unmatched build tuples
```

RIGHT OUTER and FULL OUTER joins require storing a matched flag for each build tuple. For large build relations, this adds significant memory overhead. Some systems use bitmap representations (1 bit per tuple) to minimize this cost. Others avoid hash join entirely for outer joins when the preserved side is too large.
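A minimal sketch of the bitmap approach mentioned above (names are illustrative, not from any particular engine): match state is kept in one bit per build tuple, indexed by the tuple's insertion position, which is assumed stable for the life of the join.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One bit of "matched" state per build tuple.
class MatchBitmap {
    std::vector<uint64_t> words;
public:
    explicit MatchBitmap(size_t num_build_tuples)
        : words((num_build_tuples + 63) / 64, 0) {}

    void set(size_t tuple_index) {
        words[tuple_index / 64] |= (uint64_t{1} << (tuple_index % 64));
    }
    bool test(size_t tuple_index) const {
        return (words[tuple_index / 64] >> (tuple_index % 64)) & 1;
    }
};
```

At 1 bit per tuple, tracking 100 million build rows costs roughly 12.5 MB, versus about 100 MB for a one-byte flag per tuple.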
Semi-joins and anti-joins are specialized join variants that can be executed more efficiently than full joins because they have simpler output requirements.
Semi-join (EXISTS semantics): Return probe tuples that have at least one match. Anti-join (NOT EXISTS semantics): Return probe tuples that have no matches.
Both variants output only probe-side columns—no joining of build columns is needed. This enables significant optimizations, shown in the sketches below:
- Probing can stop at the first match; there is no need to collect or iterate over all matching build tuples.
- The build side only has to answer existence questions, so a key-only hash table (storing keys but not full tuples) suffices.
- For anti-joins, a negative Bloom filter answer is enough to emit the probe tuple without touching the hash table at all.
```cpp
// Optimized semi-join probe
bool probe_semi_join_next(Tuple* output) {
    while (probe_child->next(&current_probe_tuple)) {
        JoinKey key = extract_key(current_probe_tuple, join_key_columns);
        if (key.has_null()) continue;

        uint64_t hash_value = compute_hash(key);

        // Just check existence - don't need actual tuples
        if (hash_table->contains(hash_value, key)) {
            // Output probe tuple directly (no joining)
            *output = current_probe_tuple;
            return true;
        }
    }
    return false;
}

// Optimized anti-join probe
bool probe_anti_join_next(Tuple* output) {
    while (probe_child->next(&current_probe_tuple)) {
        JoinKey key = extract_key(current_probe_tuple, join_key_columns);

        // NULL-key tuples: never match, so always included in anti-join output
        if (key.has_null()) {
            *output = current_probe_tuple;
            return true;
        }

        uint64_t hash_value = compute_hash(key);

        // Use Bloom filter for quick negative check
        if (!bloom_filter->might_contain(hash_value)) {
            // Definitely no match - include in anti-join output
            *output = current_probe_tuple;
            return true;
        }

        // Bloom says maybe - check hash table
        if (!hash_table->contains(hash_value, key)) {
            *output = current_probe_tuple;
            return true;
        }
        // Match found - exclude from anti-join output
    }
    return false;
}

// For semi-join, we can use a key-only hash table
class KeyOnlyHashTable {
    // Stores only keys, not full tuples
    // ~50-90% memory reduction for wide tables
    vector<KeySet> buckets;
    size_t num_buckets;

    void insert(uint64_t hash, const JoinKey& key) {
        buckets[hash % num_buckets].insert(key);
    }

    bool contains(uint64_t hash, const JoinKey& key) {
        return buckets[hash % num_buckets].contains(key);
    }
};
```

Query optimizers automatically convert appropriate patterns to semi-join or anti-join. An EXISTS subquery becomes a semi-join; NOT EXISTS becomes an anti-join. Understanding these conversions helps you write queries that the optimizer can execute efficiently.
Probe phase performance depends on several factors, each contributing to the overall throughput of tuples processed per second.
| Factor | Good Performance | Poor Performance | Diagnostic |
|---|---|---|---|
| Selectivity | Many non-matches (Bloom helps) | Most tuples match | Compare probe count vs output count |
| Match cardinality | 1:1 or low (1:few) | 1:many (fanout) | Check output rows / probe rows ratio |
| Hash table cache fit | Fits in L3 cache | Exceeds cache significantly | Monitor cache miss rates |
| Chain length | Average 1-2 entries | Long chains (>10) | Bucket size distribution |
| Key comparison cost | Integers, short strings | Wide composite, long strings | Profile key comparison time |
Probe phase cost model:
```
Probe_cost = |S| × (hash_cost + lookup_cost + match_cost × avg_matches)
           + |S| × tuple_read_cost
           + output_rows × output_cost

Where:
- |S|          = probe relation cardinality
- lookup_cost  = bucket_access + avg_chain_length × comparison_cost
- avg_matches  = |output| / |S|  (expected matches per probe)
```
Example calculation:
Join producing 5M output rows from 10M probe tuples (50% match rate, 1:1 on average):
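One way to put rough numbers on this, using illustrative per-operation costs drawn from the ranges in the per-lookup cost table above (assumed values, not measurements):

```
|S| = 10,000,000 probe tuples, output = 5,000,000 rows  →  avg_matches = 0.5

Assumed costs: hash_cost ≈ 30 cycles, bucket_access ≈ 100 cycles,
avg_chain_length ≈ 1.5, comparison_cost ≈ 20 cycles,
match_cost ≈ 100 cycles, tuple_read_cost ≈ 50 cycles, output_cost ≈ 100 cycles

lookup_cost ≈ 100 + 1.5 × 20          ≈ 130 cycles
per-probe   ≈ 30 + 130 + 0.5 × 100    ≈ 210 cycles
Probe_cost  ≈ 10M × 210 + 10M × 50 + 5M × 100
            ≈ 2.1 + 0.5 + 0.5 billion ≈ 3.1 billion cycles
            ≈ roughly one second on a single ~3 GHz core
```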
If the hash table exceeds the cache, lookup cost could be 3-5× higher due to cache misses.
Advanced implementations use software prefetching to hide memory latency. By computing hashes for upcoming tuples and issuing prefetch instructions for their buckets, the CPU can overlap memory access with processing. This technique can improve probe throughput by 2-3× for hash tables larger than cache.
```cpp
// Prefetching-optimized probe phase (simplified)
void probe_with_prefetch(Tuple* probe_batch, size_t batch_size) {
    constexpr size_t PREFETCH_DISTANCE = 4;  // Look ahead 4 tuples

    // Compute hashes for the whole batch up front
    vector<uint64_t> hashes(batch_size);
    for (size_t i = 0; i < batch_size; i++) {
        hashes[i] = compute_hash(extract_key(probe_batch[i]));
    }

    // Issue prefetches for the first PREFETCH_DISTANCE buckets
    for (size_t i = 0; i < min(PREFETCH_DISTANCE, batch_size); i++) {
        _mm_prefetch(hash_table->bucket_ptr(hashes[i]), _MM_HINT_T0);
    }

    // Process tuples with a rolling prefetch window
    for (size_t i = 0; i < batch_size; i++) {
        // Prefetch the bucket for tuple i + PREFETCH_DISTANCE
        if (i + PREFETCH_DISTANCE < batch_size) {
            _mm_prefetch(
                hash_table->bucket_ptr(hashes[i + PREFETCH_DISTANCE]),
                _MM_HINT_T0
            );
        }
        // Process tuple i (its bucket should be in cache by now)
        process_probe_tuple(probe_batch[i], hashes[i]);
    }
}
```

Understanding probe phase behavior is essential for diagnosing join performance issues. Production database systems expose metrics that reveal execution characteristics:
```
-- PostgreSQL EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) output for hash join
Hash Join  (cost=3152.00..14829.00 rows=50000 width=200) (actual time=45.2..512.3 rows=47832 loops=1)
  Hash Cond: (orders.customer_id = customers.customer_id)
  -- Probe phase metrics visible in parent Hash Join node:
  --   actual rows=47832 indicates output cardinality
  --   actual time includes probe phase
  ->  Seq Scan on orders  (cost=0.00..8334.00 rows=500000 width=150) (actual time=0.02..95.1 rows=500000 loops=1)
      -- This is the probe relation: 500K rows
      -- 500K probes produced 47.8K matches (~9.5% match rate)
  ->  Hash  (cost=1769.00..1769.00 rows=100000 width=50) (actual time=44.9..44.9 rows=100000 loops=1)
        Buckets: 131072  Batches: 1  Memory Usage: 6231kB
        -- Build phase: 100K rows, single batch (fit in memory)
        -- 131072 buckets for 100K rows ≈ 0.76 tuples per bucket
        ->  Seq Scan on customers  (cost=0.00..1769.00 rows=100000 width=50) (actual time=0.01..15.2 rows=100000 loops=1)

-- Key insights from these metrics:
-- - Build relation (customers): 100K rows, fit in a ~6 MB hash table
-- - Probe relation (orders): 500K rows
-- - Output: 47.8K rows (many orders have no matching customer, or were filtered earlier)
-- - Build time ~45ms, total time ~512ms, so the probe took ~467ms
-- - Probe throughput: ~1.07 million rows/second
```

When the estimated row count differs significantly from the actual row count, it indicates cardinality estimation errors. For hash joins, this can mean: (1) the wrong relation was chosen as the build side, (2) the hash table was sized poorly, (3) downstream operators receive unexpected volumes. Monitoring actual vs. estimated rows throughout the plan helps identify the source of estimation errors.
The probe phase is where hash join's efficiency materializes—where the upfront investment in building a hash table translates to fast, constant-time lookups. Let's consolidate the essential concepts:
- Each probe tuple is hashed with the same function used during the build, its bucket is located, and candidate entries are filtered by cached hash values before full key comparison.
- Bloom filters cheaply eliminate probe tuples that cannot match, which pays off most in selective joins.
- Output generation can materialize tuples, reference source tuples, or defer assembly entirely (late materialization).
- Outer joins add NULL-extension and matched-flag bookkeeping; semi-joins and anti-joins need only existence checks.
- Probe throughput is dominated by memory behavior: hash table cache fit, chain length, and key comparison cost.
What's Next:
So far we've assumed the build relation fits comfortably in memory. Reality is often less accommodating—very large relations may exceed available memory by orders of magnitude. The next page explores partition handling: techniques for handling hash joins when data exceeds memory, including the crucial concepts of partitioning and multi-pass processing.
You now understand the probe phase of hash join algorithms: lookup mechanics, Bloom filter optimization, output generation, outer join handling, semi-join/anti-join optimizations, and performance analysis. This knowledge is essential for understanding query execution plans and diagnosing join performance issues.