Dynamic programming is elegant and optimal within its search space—but it breaks down for very large queries. A 20-table join has over 10^17 possible orderings; even the efficient DP approach evaluates 3^20 ≈ 3.5 billion subset pairs. At microseconds per evaluation, that's hours of optimization time for a single query.
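These counts are easy to check directly (a quick sketch; the exact number of orderings depends on which tree shapes the optimizer considers):

```python
import math

n = 20
dp_pairs = 3 ** n              # subset pairs the efficient DP must evaluate
left_deep = math.factorial(n)  # left-deep orderings alone, ignoring bushy trees

print(f"{dp_pairs:,}")         # 3,486,784,401 (~3.5 billion)
print(f"{left_deep:.2e}")      # ~2.43e18, already well past 10^17
```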
Real-world analytical queries often join 15, 20, or even 50+ tables. Data warehouse star schemas commonly involve dozens of dimension tables. Federated queries may pull from many sources. For these queries, we need algorithms that find good-enough plans quickly, even at the cost of theoretical optimality.
This is where heuristic approaches shine. They trade guaranteed optimality for predictable, bounded optimization time—finding plans that are typically within a small factor of optimal, in time measured in milliseconds rather than hours.
By the end of this page, you will understand greedy algorithms for join ordering, randomized approaches like genetic algorithms and simulated annealing, and rule-based heuristics. You'll learn when each approach is appropriate and how production systems combine them for robust optimization.
The simplest heuristic approach is the greedy algorithm: at each step, choose the locally optimal action without considering future consequences. For join ordering, this means repeatedly selecting the cheapest next join until all tables are connected.
```
function GreedyJoinOrdering(tables: Set<Table>, joinGraph: Graph):
    // Start with either the smallest table or a strategic seed
    currentPlan = selectInitialTable(tables)
    remaining = tables - {currentPlan.table}

    while remaining is not empty:
        bestNextJoin = null
        bestCost = infinity

        // Find the cheapest table to join next
        for each table T in remaining:
            if canJoin(currentPlan, T, joinGraph):
                cost = estimateJoinCost(currentPlan, T)
                if cost < bestCost:
                    bestCost = cost
                    bestNextJoin = T

        // If no direct join possible, must use cross product
        if bestNextJoin is null:
            bestNextJoin = selectCheapestCrossProduct(currentPlan, remaining)

        // Extend the plan
        currentPlan = join(currentPlan, bestNextJoin)
        remaining = remaining - {bestNextJoin}

    return currentPlan
```

Greedy algorithm characteristics:
Greedy algorithms can fail dramatically when the best overall plan requires accepting a more expensive early join to enable much cheaper later joins. Without lookahead, greedy optimization is 'short-sighted.'
Several refinements to basic greedy ordering address its limitations while maintaining efficiency:
| Variant | Time Complexity | Plan Quality | Best Use Case |
|---|---|---|---|
| Basic Greedy | O(n²) | Moderate | Very large queries, time-critical |
| Selectivity-First | O(n²) | Good | Queries with selective predicates |
| K-Lookahead (k=2) | O(n⁴) | Better | Medium queries, moderate time budget |
| Multi-Start (m starts) | O(m × n²) | Better | When starting table unclear |
| Greedy + Local Search | O(n² + iterations) | Best of greedy | When greedy alone produces poor plans |
The IK-KBZ Algorithm:
A particularly effective greedy variant is the IK-KBZ algorithm (Ibaraki-Kameda / Krishnamurthy-Boral-Zaniolo), which works specifically for acyclic join graphs (common in practice).
The key insight: for acyclic join graphs, there exists a polynomial-time algorithm to find optimal left-deep join orderings. IK-KBZ computes a 'rank' for each table based on its selectivity and position in the join graph, then orders tables by rank.
rank(T) = (selectivity(join with parent) × cardinality(T after local predicates) − 1)
          / cost(joining T)
Tables with lower rank should be joined earlier—they reduce data volume most effectively. This simple ranking often achieves optimal or near-optimal orderings for star and snowflake schemas.
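The rank-and-sort idea can be sketched in a few lines. This is a simplification, not the full IK-KBZ algorithm: the rank here is reduced to selectivity × cardinality (the expansion factor of each dimension join), and the table names, cardinalities, and selectivities are invented for illustration.

```python
# Hypothetical star-schema dimensions: (cardinality after local predicates,
# selectivity of the join with the fact table). All values are illustrative.
dims = {
    "dim_date":    (365,    0.01),
    "dim_store":   (100,    0.05),
    "dim_product": (10_000, 0.001),
}

# Simplified rank: how much each join expands (or shrinks) the running
# intermediate result. Lower rank = joined earlier.
rank = {name: card * sel for name, (card, sel) in dims.items()}
order = sorted(rank, key=rank.get)

print(order)  # dimensions in ascending order of expansion factor
```

A join with rank below 1 shrinks the intermediate result, which is why low-rank tables pay off early.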
Most real-world schemas produce acyclic join graphs—star schemas, normalized OLTP schemas, and time-series data all naturally have tree-structured relationships. The IK-KBZ algorithm's polynomial-time optimality for these cases makes it extremely valuable in practice.
Genetic algorithms apply evolutionary principles to optimization: maintain a population of candidate solutions, breed new candidates by combining successful ones, introduce random mutations, and select the fittest across generations. PostgreSQL's GEQO (Genetic Query Optimizer) is the most prominent example in database systems.
```
function GeneticJoinOptimization(tables: Set<Table>, config: GAConfig):
    // Initialize population with random orderings
    population = []
    for i = 1 to config.populationSize:
        chromosome = randomPermutation(tables)  // Encoding: table order
        population.add(chromosome)

    // Evaluate initial fitness (lower cost = higher fitness)
    for each chromosome in population:
        chromosome.fitness = 1 / evaluatePlanCost(chromosome)

    // Evolve over generations
    for generation = 1 to config.maxGenerations:
        newPopulation = []

        // Elitism: keep best individuals
        elite = selectTopK(population, config.eliteCount)
        newPopulation.addAll(elite)

        // Breed new individuals
        while |newPopulation| < config.populationSize:
            parent1 = tournamentSelect(population)
            parent2 = tournamentSelect(population)

            // Crossover: combine parents
            child = orderedCrossover(parent1, parent2)

            // Mutation: random swaps with low probability
            if random() < config.mutationRate:
                child = mutate(child)

            child.fitness = 1 / evaluatePlanCost(child)
            newPopulation.add(child)

        population = newPopulation

        // Early termination if converged
        if isConverged(population):
            break

    return getBest(population)
```

Key genetic algorithm components:
Encoding — Join orderings are represented as permutations (chromosomes). Each position indicates join sequence.
Fitness function — Inverse of estimated query cost. Lower-cost plans are 'fitter' and more likely to reproduce.
Selection — Tournament selection picks random pairs and keeps the fitter. This balances exploration (giving weak candidates a chance) with exploitation (favoring strong candidates).
Crossover — Ordered crossover (OX) combines segments from each parent while preserving relative ordering—essential for permutation representations where simple point crossover creates invalid orderings.
Mutation — Random swaps or reversals introduce variation, preventing premature convergence to local optima.
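A minimal sketch of one common OX variant makes the "preserve relative ordering" point concrete. The cut points are fixed here for determinism; real implementations pick them randomly.

```python
def ordered_crossover(p1, p2, i, j):
    """Copy p1[i:j] into the child, then fill the remaining slots with the
    missing elements in the order they appear in p2. The result is always a
    valid permutation -- unlike naive point crossover on permutations."""
    segment = set(p1[i:j])
    child = [None] * len(p1)
    child[i:j] = p1[i:j]
    fill = iter(g for g in p2 if g not in segment)
    for k in range(len(child)):
        if child[k] is None:
            child[k] = next(fill)
    return child

parents = (["A", "B", "C", "D", "E", "F"], ["F", "E", "D", "C", "B", "A"])
child = ordered_crossover(*parents, 2, 4)
print(child)  # keeps C, D in place; remaining tables follow parent 2's order
```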
PostgreSQL activates GEQO for queries joining more tables than geqo_threshold (default 12). GEQO uses edge recombination crossover—a sophisticated operator that preserves adjacency information, crucial for join ordering since adjacent tables often share join predicates. Typical parameters: population 300-500, generations 50-100.
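These knobs are exposed as ordinary PostgreSQL settings; the values below are illustrative, not recommendations:

```sql
-- Raise the table count at which PostgreSQL switches from exhaustive
-- search to GEQO (default 12):
SET geqo_threshold = 15;

-- Trade optimization time for plan quality; geqo_effort (1-10, default 5)
-- scales the derived pool size and generation count:
SET geqo_effort = 7;

-- Or pin population size and generations explicitly (0 = derive from effort):
SET geqo_pool_size = 400;
SET geqo_generations = 80;
```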
| Aspect | Characteristic | Implication |
|---|---|---|
| Time predictability | Fixed generations × population | Bounded optimization time |
| Solution quality | Stochastic; varies across runs | May need multiple runs or larger populations |
| Configuration sensitivity | Performance varies with parameters | Tuning required per workload |
| Parallelization | Population evaluation is parallelizable | Good fit for multi-core systems |
| No optimality guarantee | Best-found, not proven optimal | Acceptable for large queries |
Simulated annealing borrows from metallurgical annealing, where slow cooling allows atoms to find low-energy configurations. For optimization, this translates to occasionally accepting worse solutions to escape local optima, with the probability of acceptance decreasing over time (the 'temperature' lowers).
```
function SimulatedAnnealingJoin(tables: Set<Table>, config: SAConfig):
    // Start with a random or greedy initial solution
    current = greedyJoinOrder(tables)
    currentCost = evaluatePlanCost(current)
    bestSolution = current
    bestCost = currentCost
    temperature = config.initialTemperature

    for iteration = 1 to config.maxIterations:
        // Generate neighbor by random modification
        neighbor = generateNeighbor(current)
        neighborCost = evaluatePlanCost(neighbor)

        // Calculate acceptance probability
        delta = neighborCost - currentCost
        if delta < 0:
            // Better solution: always accept
            current = neighbor
            currentCost = neighborCost
        else:
            // Worse solution: accept with probability e^(-delta/T)
            acceptanceProbability = exp(-delta / temperature)
            if random() < acceptanceProbability:
                current = neighbor
                currentCost = neighborCost

        // Update best found
        if currentCost < bestCost:
            bestSolution = current
            bestCost = currentCost

        // Cool down
        temperature = temperature * config.coolingRate

        // Optional: restart if stuck
        if temperature < config.minTemperature:
            temperature = config.initialTemperature * 0.5

    return bestSolution
```

Neighbor generation strategies:
The 'neighbor' of a join ordering is a slightly modified version of it. Common transformations include swapping two tables, reversing a contiguous segment of the ordering (2-opt), and moving a single table to a different position.
The neighborhood structure significantly affects search effectiveness. Larger neighborhoods (like 3-opt) explore more broadly but cost more to evaluate.
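The usual move set can be sketched directly, assuming an ordering is just a list of table names:

```python
import random

def swap_move(order):
    """Exchange two tables (the smallest neighborhood)."""
    i, j = random.sample(range(len(order)), 2)
    out = order[:]
    out[i], out[j] = out[j], out[i]
    return out

def reverse_move(order):
    """Reverse a contiguous segment (the 2-opt move)."""
    i, j = sorted(random.sample(range(len(order) + 1), 2))
    return order[:i] + order[i:j][::-1] + order[j:]

def insert_move(order):
    """Remove one table and reinsert it at another position."""
    out = order[:]
    t = out.pop(random.randrange(len(out)))
    out.insert(random.randrange(len(out) + 1), t)
    return out

order = ["A", "B", "C", "D", "E"]
for move in (swap_move, reverse_move, insert_move):
    assert sorted(move(order)) == sorted(order)  # still a valid permutation
```

Every move preserves the permutation property, so a plan can be costed immediately after each step without validity checks.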
Setting the initial temperature requires balancing exploration vs exploitation. A good heuristic: set initial temperature so that ~80% of random neighbors are accepted initially, then cool to where <1% are accepted at termination. The cooling rate (typically 0.95-0.99) controls this transition.
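The 80% rule translates directly into a starting temperature: if a typical uphill move worsens cost by Δ, choose T₀ so that e^(−Δ/T₀) ≈ 0.8. A sketch, where the typical Δ is an assumed value (in practice it would be measured by sampling random neighbors):

```python
import math

def acceptance_probability(delta, temperature):
    """Probability of accepting a move that changes cost by delta."""
    return 1.0 if delta < 0 else math.exp(-delta / temperature)

typical_uphill_delta = 100.0  # assumed median cost increase of a worse neighbor

# Solve e^(-delta/T0) = 0.8 for T0, and e^(-delta/Tf) = 0.01 for Tf:
t_initial = -typical_uphill_delta / math.log(0.8)   # ~448.1
t_final = -typical_uphill_delta / math.log(0.01)    # ~21.7

# With geometric cooling T_k = T0 * rate^k, the number of steps to get
# from t_initial down to t_final:
rate = 0.97
steps = math.ceil(math.log(t_final / t_initial) / math.log(rate))

print(round(t_initial, 1), round(t_final, 1), steps)
```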
Before cost-based optimization became universal, and even today as a complement to it, rule-based heuristics guide join ordering. These rules encode expert knowledge about what generally produces efficient plans.
Every rule has exceptions. 'Small tables first' fails if small tables expand via M:N joins. 'Selective predicates first' fails if those predicates prevent index usage. Rule-based optimization alone is rarely sufficient for complex queries—but rules provide excellent starting points and sanity checks for cost-based optimization.
Oracle's rule-based optimizer (RBO):
Before Oracle 7, Oracle used a purely rule-based optimizer that ordered operations according to a fixed hierarchy of access paths, ranked from single-row lookups by ROWID at the top down to full table scans at the bottom.
This strict ordering was predictable but inflexible—it couldn't adapt to data statistics. Oracle deprecated RBO in favor of cost-based optimization (CBO), but traces remain as 'hints' that let users guide optimizer behavior.
Modern role of heuristics:
Today, heuristics typically serve as starting points for cost-based search, as pruning rules that shrink the search space, and as sanity checks on the plans cost-based optimization produces.
Production query optimizers rarely use a single algorithm. Instead, they combine approaches based on query characteristics, available time, and historical performance:
| Strategy | Description | Example System |
|---|---|---|
| DP → Heuristic | Exhaustive DP for first k tables, heuristic for rest | Most commercial DBs |
| Heuristic → Local Search | Greedy initialization, then iterative improvement | Research prototypes |
| Multi-Algorithm Selection | Switch algorithms based on query size/complexity | PostgreSQL (DP → GEQO) |
| Parallel Portfolio | Run multiple algorithms concurrently, take first completion | Distributed systems |
| Adaptive Optimization | Start with heuristic, refine during execution if time allows | Oracle adaptive |
Staged optimization:
A common pattern is staged optimization with increasingly sophisticated algorithms:
```
function StagedOptimization(query):
    n = countTables(query)
    if n <= 4:
        return exhaustiveEnumeration(query)    // Try everything
    else if n <= 10:
        return dynamicProgramming(query)       // DP with pruning
    else if n <= 20:
        return greedyWithLocalSearch(query)    // Fast + refinement
    else:
        return geneticAlgorithm(query)         // Exploratory search
```
The thresholds (4, 10, 20) are illustrative; each system tunes them against its own cost model, hardware, and optimization-time budget.
There's typically a Pareto frontier between optimization time and plan quality. DP takes longest but finds optimal plans (within its space). Greedy takes shortest but may find poor plans. The art of hybrid optimization is choosing the right algorithm for each query's position on this frontier.
A new frontier in join ordering uses machine learning to guide optimization. Rather than hand-crafted heuristics or exhaustive enumeration, learned optimizers train models on historical query performance to predict good orderings.
Learned query optimization is an active research area, with promising results from systems like Neo, Bao, and learned cardinality estimators. However, production adoption is limited by: uncertainty about model reliability, need for training data, and difficulty debugging ML-based decisions. Expect gradual integration rather than wholesale replacement of traditional optimization.
Reinforcement learning for join ordering:
The most ambitious learned optimization approaches frame join ordering as a sequential decision problem: the partially built plan is the state, choosing the next table to join is the action, and the negative estimated (or measured) plan cost is the reward.
Training uses query execution feedback to improve the policy. The trained model can then order joins in milliseconds—far faster than traditional search—while leveraging patterns learned from thousands of historical queries.
Challenges include the cold-start need for training data, sensitivity to workload and schema drift that can invalidate a learned model, and the difficulty of explaining or bounding a model's worst-case decisions.
Given the range of heuristic approaches, how should practitioners choose? Here's guidance based on common scenarios:
| Scenario | Recommended Approach | Rationale |
|---|---|---|
| OLTP queries (≤7 tables) | Exhaustive DP | Optimization overhead negligible; optimal plans matter |
| Medium analytics (8-15 tables) | DP with aggressive pruning | Still tractable with cost bounds |
| Large analytics (15-25 tables) | GEQO or multi-start greedy | DP too slow; randomized exploration needed |
| Very large queries (25+ tables) | IK-KBZ or greedy | Even GEQO may be slow; fast heuristics essential |
| Star schema joins | Dimension-first heuristic | Schema structure provides strong guidance |
| Unknown/varied workload | Hybrid with adaptive thresholds | Flexibility to handle different query patterns |
Heuristics fail gracefully in terms of optimization time but can fail catastrophically in plan quality. Always monitor actual query performance and be prepared to refine approaches. A 'good' heuristic on average may produce terrible plans for specific important queries.
We've explored the spectrum of heuristic approaches that make large-query optimization tractable: greedy ordering, randomized search, rule-based guidance, and the hybrid strategies that combine them.
What's next:
We've examined both exhaustive (DP) and heuristic approaches to join ordering. The final page of this module focuses on left-deep plans—the most common search space restriction in practice. We'll explore why left-deep plans are favored, how they enable pipelining, and their implications for query execution.
You now understand the landscape of heuristic approaches for join ordering. From greedy algorithms to genetic evolution to learned optimization, each approach trades optimality for tractability in different ways. The key is selecting the right approach—or combination—for your specific workload and query characteristics.