At this point, you understand cost estimation, plan comparison, dynamic programming, and tree shapes. But there's a meta-problem looming: how does the optimizer search through all these possibilities efficiently?
An optimizer faces a classic trade-off between planning time and plan quality:
Spend too long optimizing and you waste time that could be spent executing. Stop too soon and you might pick a terrible plan. The search strategy determines how the optimizer balances these concerns.
This page surveys the spectrum of search strategies—from exhaustive enumeration that guarantees optimality to randomized approaches that scale to massive queries—and the practical techniques real systems use to find excellent plans within their time budgets.
By the end of this page, you will understand: (1) Exhaustive search and its limitations, (2) Heuristic and greedy strategies, (3) Randomized algorithms (GEQO, simulated annealing), (4) Branch-and-bound optimization, (5) Adaptive and iterative improvement approaches, (6) Time budgeting and optimization levels, and (7) How production systems blend these strategies.
The simplest conceptual strategy is exhaustive search: enumerate every possible plan, compute the cost of each, and pick the cheapest.
Algorithm:
best_plan = null
best_cost = infinity

for each possible plan P in search_space(query):
    cost = estimate_cost(P)
    if cost < best_cost:
        best_plan = P
        best_cost = cost

return best_plan
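As a concrete illustration, here is a minimal runnable sketch of exhaustive enumeration over left-deep join orders in Python. It assumes a toy cost function (total intermediate rows) and made-up cardinalities and selectivities; nothing here reflects a real optimizer's cost model.

```python
import itertools

# Toy catalog: base-table cardinalities and per-edge join selectivities
# (illustrative numbers, not from any real system).
card = {"A": 1_000, "B": 50, "C": 20_000}
sel = {frozenset("AB"): 0.01, frozenset("BC"): 0.001, frozenset("AC"): 0.05}

def join_size(tables):
    """Estimated rows of joining a set of tables (independence assumption)."""
    size = 1.0
    for t in tables:
        size *= card[t]
    for edge, s in sel.items():
        if edge <= set(tables):
            size *= s
    return size

def plan_cost(order):
    """Cost of a left-deep plan = total rows produced by each join in the chain."""
    return sum(join_size(order[:k]) for k in range(2, len(order) + 1))

# Exhaustive search: every left-deep order is a permutation of the tables.
best = min(itertools.permutations(card), key=plan_cost)
print(best, plan_cost(best))   # ('A', 'B', 'C') 1000.0
```

With three tables the six permutations are trivial to enumerate; the table below shows how quickly that stops being true.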
Guarantees:
Exhaustive search always finds the cheapest plan in the enumerated space, provided the cost estimates themselves are accurate.
When exhaustive works:
For small queries (≤5-6 tables), exhaustive search is practical: the plan space is small enough to enumerate in milliseconds.
When it breaks down:
| Tables | Left-Deep Plans | Time @ 1μs/plan | Status |
|---|---|---|---|
| 4 | ~100 | 0.1ms | ✅ Fast |
| 6 | ~5,000 | 5ms | ✅ Acceptable |
| 8 | ~200,000 | 200ms | ⚠️ Noticeable |
| 10 | ~18 million | 18s | ❌ Too slow |
| 12 | ~2 billion | 33 minutes | ❌ Impractical |
| 15 | ~1.3 trillion | 15 days | ❌ Absurd |
Dynamic programming improves this:
As we learned, DP exploits optimal substructure to avoid recomputation: instead of costing every complete join order (factorial growth or worse), it computes the best plan for each subset of tables once and reuses it, bringing the work down to roughly O(3ⁿ) over subsets.
But even O(3ⁿ) is exponential. For 20+ tables, even DP becomes impractical. We need smarter strategies.
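To get a feel for the growth, the snippet below computes raw counts under simple assumptions (n! left-deep orders, 1 µs per candidate). Exact numbers depend on which plan shapes and join methods are enumerated, so they won't match the rounded table above exactly.

```python
import math

for n in (4, 6, 8, 10, 12, 15, 20):
    orders = math.factorial(n)      # left-deep join orders
    seconds = orders * 1e-6         # at 1 microsecond per candidate
    print(f"{n:>2} tables: {orders:>22,} orders ≈ {seconds:,.1f} s; 3^n = {3**n:,}")
```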
Exhaustive search with DP guarantees optimality only within the explored plan space. If restricted to left-deep trees, the 'optimal' plan may still be suboptimal compared to an unexplored bushy alternative. Optimality is relative to search space definition.
Greedy algorithms make locally optimal choices at each step, hoping to achieve a globally good result. They're fast but offer no optimality guarantee.
Greedy Join Ordering:
remaining = {all tables}
result_so_far = null

while |remaining| > 1:
    # Find the pair with the lowest join cost
    best_join = null
    best_cost = infinity
    for each pair (R, S) of entries in remaining:
        if can_join(R, S):               # They share a join predicate
            cost = estimate_join_cost(R, S)
            if cost < best_cost:
                best_join = (R, S)
                best_cost = cost

    # Perform the greedy choice
    result_so_far = join(best_join.R, best_join.S)
    remove best_join.R and best_join.S from remaining
    add result_so_far to remaining

return result_so_far                     # the single plan left in remaining
Time complexity: the pairwise greedy above runs n−1 rounds and scans O(n²) candidate pairs per round, so roughly O(n³) for n tables—polynomial, not exponential!
Greedy pitfalls:
Greedy can make catastrophically bad choices:
Example where greedy fails (illustrative cardinalities and selectivities):
Tables: A (1,000,000 rows), B (10 rows), C (500,000 rows), D (100 rows)
Join graph: A—B—C—D (chain)
Greedy cost metric: estimated output rows of each candidate join

Estimated two-way join results:
    A⋈B → ~100 rows (looks very cheap)
    B⋈C → ~1,000 rows
    C⋈D → ~20,000 rows

Greedy trace:
    Step 1: considers the joinable pairs A⋈B, B⋈C, C⋈D; picks A⋈B (~100 rows)
    Step 2: picks (A⋈B)⋈C (~10,000 rows) over C⋈D (~20,000 rows)
    Step 3: joins D, giving the left-deep plan ((A⋈B)⋈C)⋈D
    Intermediate rows produced before the final join: 100 + 10,000 = 10,100

A better plan delays A: ((B⋈C)⋈D)⋈A produces only 1,000 + 40 = 1,040 intermediate rows, because the C—D predicate shrinks the 1,000-row B⋈C result long before the million-row table A is touched.
Greedy can't see that delaying A produces smaller intermediates; the locally cheapest first join locks it into a grouping that joins badly later. (A runnable check of these numbers follows.)
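The numbers above can be checked mechanically. The sketch below, using the same illustrative cardinalities and selectivities under the standard independence assumption, runs the greedy pairing loop and prints the intermediate sizes of the greedy plan versus the plan that delays A.

```python
import itertools

card = {"A": 1_000_000, "B": 10, "C": 500_000, "D": 100}
sel = {frozenset("AB"): 1e-5, frozenset("BC"): 2e-4, frozenset("CD"): 4e-4}

def size(tables):
    """Estimated rows of joining a set of base tables (independence assumption)."""
    n = 1.0
    for t in tables:
        n *= card[t]
    for edge, s in sel.items():
        if edge <= tables:
            n *= s
    return n

def greedy(tables):
    """Repeatedly join the pair (sharing a predicate) whose estimated result is smallest."""
    remaining = [frozenset(t) for t in tables]
    produced = []
    while len(remaining) > 1:
        r, s = min(((r, s) for r, s in itertools.combinations(remaining, 2)
                    if any(e & r and e & s for e in sel)),
                   key=lambda pair: size(pair[0] | pair[1]))
        remaining = [x for x in remaining if x not in (r, s)] + [r | s]
        produced.append(size(r | s))
    return produced

print(greedy("ABCD"))
# [100.0, 10000.0, 400.0]  -> greedy builds ((A⋈B)⋈C)⋈D
print([size(frozenset("BC")), size(frozenset("BCD"))])
# [1000.0, 40.0]           -> delayed-A plan; its final join also yields 400 rows
```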
Common heuristics:
- Smallest table (or smallest estimated result) first
- Most selective join first
- Avoid Cartesian products unless the inputs are tiny
- Push selections and projections below joins before ordering them
Heuristics work well on average but can fail badly for unusual queries. 'Smallest first' assumes small tables have selective joins—not always true. 'Most selective first' depends on cardinality estimates which may be wrong. Use heuristics as guidance, not guarantees.
When the search space is too large for exhaustive exploration and greedy is too myopic, randomized algorithms offer a middle ground. They explore broadly but not exhaustively, trading optimality guarantees for scalability.
Random Sampling:
best_plan = null
best_cost = infinity

for i = 1 to BUDGET:
    plan = generate_random_valid_plan(query)
    cost = estimate_cost(plan)
    if cost < best_cost:
        best_plan = plan
        best_cost = cost

return best_plan
With enough samples, you're likely to find a good plan. But pure random sampling is wasteful—most random plans are terrible.
Iterative Improvement:
current = generate_initial_plan()        # Maybe greedy start
best = current

for i = 1 to BUDGET:
    neighbor = apply_random_transformation(current)
    if cost(neighbor) < cost(current):
        current = neighbor
        if cost(current) < cost(best):
            best = current
    else:
        pass                             # Stay with current (reject worse neighbor)

return best
This hill-climbs toward local optima. Better than pure random, but gets stuck in local minima.
Simulated Annealing:
To escape local minima, occasionally accept worse moves:
function SimulatedAnnealing(query):
    current = initial_plan(query)
    best = current
    temperature = INITIAL_TEMP                 # e.g., 1000

    while temperature > MIN_TEMP:
        for i = 1 to ITERATIONS_PER_TEMP:
            neighbor = random_neighbor(current)
            delta = cost(neighbor) - cost(current)

            if delta < 0:
                # Neighbor is better: always accept
                current = neighbor
            else:
                # Neighbor is worse: accept with probability e^(-delta/temp)
                probability = exp(-delta / temperature)
                if random() < probability:
                    current = neighbor

            if cost(current) < cost(best):
                best = current

        temperature = temperature * COOLING_RATE   # e.g., 0.95

    return best

How annealing works:
- At high temperature, many worse (uphill) moves are accepted, so the search explores broadly and can escape local minima.
- As the temperature falls, worse moves are accepted less and less often, and the search behaves increasingly like hill climbing.
- The cooling schedule controls how quickly exploration gives way to exploitation.
With enough time, simulated annealing can find excellent solutions even in massive search spaces.
Annealing performance depends heavily on parameters: initial temperature, cooling rate, iterations per temperature level. These are typically tuned empirically for the database workload. Too fast cooling → stuck in local minima. Too slow → wastes optimization time.
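To see why temperature matters, the following snippet evaluates the acceptance probability e^(−delta/temperature) from the pseudocode above for a move that is 50 cost units worse (numbers are illustrative).

```python
import math

delta = 50.0   # how much WORSE the neighbor is than the current plan
for temperature in (1000.0, 100.0, 10.0, 1.0):
    p = math.exp(-delta / temperature)
    print(f"T={temperature:>6.1f}: accept worse move with probability {p:.4f}")
```

Early in the run almost any move is accepted; by the end the search is effectively pure hill climbing.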
Genetic algorithms model optimization as evolution: maintain a population of candidate plans, select the fittest, breed new plans through crossover and mutation, and iterate.
PostgreSQL uses GEQO (Genetic Query Optimizer) for queries exceeding its DP threshold (default: 12 tables).
The genetic algorithm structure:
function GeneticQueryOptimizer(query, tables):
    # Represent plans as permutations (join order)
    # Each "chromosome" is a permutation of table indices

    # Initialize population with random permutations
    population = []
    for i = 1 to POPULATION_SIZE:                # e.g., 50-500
        chromosome = random_permutation(tables)
        fitness = evaluate_fitness(chromosome)   # 1/cost
        population.append((chromosome, fitness))

    for generation = 1 to MAX_GENERATIONS:
        # Selection: pick parents (tournament or roulette wheel)
        parents = select_parents(population)

        # Crossover: combine parent chromosomes
        offspring = []
        for (parent1, parent2) in pairs(parents):
            child1, child2 = crossover(parent1, parent2)
            offspring.extend([child1, child2])

        # Mutation: randomly alter some offspring
        for child in offspring:
            if random() < MUTATION_RATE:   # e.g., 0.05
                mutate(child)              # Swap two positions

        # Evaluate fitness of offspring
        for child in offspring:
            child.fitness = evaluate_fitness(child)

        # Survivor selection: keep best of population + offspring
        population = select_survivors(population + offspring)

    # Return best plan found
    return decode_chromosome(best_in(population))

Key genetic operations:
Crossover (recombination):
Combine join orders from two parents. Plain one-point crossover would break the permutation, so order-preserving operators are used instead—order crossover (OX), partially matched crossover (PMX), or edge recombination (the operator PostgreSQL's GEQO uses). A sketch of OX follows.
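Here is a small runnable sketch of the OX variant mentioned above; PostgreSQL's own operator is edge recombination, so this simpler operator only illustrates how crossover keeps a join order valid.

```python
import random

def order_crossover(parent1, parent2, rng=random):
    """Order crossover (OX): copy a random slice from parent1, then fill the
    remaining slots with the missing tables in the order they appear in parent2."""
    n = len(parent1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = parent1[i:j + 1]
    fill = iter(t for t in parent2 if t not in parent1[i:j + 1])
    return [c if c is not None else next(fill) for c in child]

# Two parent join orders over six tables; the child is always a valid permutation.
print(order_crossover(list("ABCDEF"), list("FBDACE"), random.Random(42)))
```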
Mutation:
Randomly alter a chromosome: for example, swap the positions of two tables (exchange mutation) or reverse a short subsequence of the join order (inversion).
Selection pressure:
Balance between exploiting good solutions and maintaining diversity: too much selection pressure converges prematurely on mediocre plans, too little wastes generations. In PostgreSQL, GEQO behavior is controlled by these parameters:
| Parameter | Default | Description |
|---|---|---|
geqo | on | Enable/disable GEQO |
geqo_threshold | 12 | Minimum tables to trigger GEQO |
geqo_effort | 5 | Trade-off: time vs. plan quality (1-10) |
geqo_pool_size | 0 | Population size (0 = auto-scale with tables) |
geqo_generations | 0 | Number of generations (0 = auto) |
geqo_selection_bias | 2.0 | Selection pressure (1.5-2.0 typical) |
GEQO finds good (not guaranteed optimal) plans in reasonable time for large queries. Plan quality is probabilistic—running GEQO twice on the same query may give different results. For critical queries, consider increasing geqo_effort or using plan forcing after finding a good plan.
Branch and bound is a systematic way to explore the search space while pruning provably suboptimal branches. It's exhaustive in principle but often skips most of the space in practice.
The key insight:
If we have a current best plan with cost B, and we're exploring a partial plan P with cost already ≥ B, we can prune all plans extending P—they can only be more expensive.
Algorithm structure:
function BranchAndBound(query):
    # Initialize with a greedy solution as upper bound
    best_plan = greedy_plan(query)
    best_cost = cost(best_plan)

    # Priority queue ordered by lower_bound (best-first search)
    frontier = PriorityQueue()
    initial = PartialPlan(tables=query.tables, joined=empty)
    frontier.push(initial, priority=lower_bound(initial))

    while not frontier.empty():
        partial = frontier.pop()

        # Pruning check
        if lower_bound(partial) >= best_cost:
            continue    # Can't beat current best

        if partial.is_complete():
            if cost(partial) < best_cost:
                best_plan = partial
                best_cost = cost(partial)
        else:
            # Branch: generate children (ways to extend)
            for next_table in partial.remaining_tables():
                child = extend(partial, next_table)
                child_lb = lower_bound(child)
                if child_lb < best_cost:
                    frontier.push(child, priority=child_lb)
                # Else: prune this branch

    return best_plan

Lower bound function:
The effectiveness of branch and bound depends on tight lower bounds. A lower bound for a partial plan must never overestimate the true cost of the best plan that extends it (otherwise the optimum could be pruned), should be cheap to compute, and is most useful when it is as tight as possible.
Common lower bounds for query optimization:
Lower_Bound(partial_plan) =
Cost(partial_plan) # Cost so far
+ ∑ MinAccessCost(remaining_table) # Cheapest way to access each
+ ∑ MinJoinCost(required_joins) # Cheapest join for each remaining
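As a concrete (hypothetical) instance of the formula above, the per-table minimums below are made-up numbers standing in for "cheapest scan" and "cheapest join touching this table".

```python
# Hypothetical optimistic floors for the tables not yet joined into the partial plan.
min_access_cost = {"C": 50.0, "D": 5.0}    # cheapest possible scan of each remaining table
min_join_cost = {"C": 120.0, "D": 30.0}    # cheapest possible join involving each remaining table

def lower_bound(cost_so_far, remaining):
    """Never overestimates: any actual completion costs at least this much."""
    return (cost_so_far
            + sum(min_access_cost[t] for t in remaining)
            + sum(min_join_cost[t] for t in remaining))

print(lower_bound(200.0, ["C", "D"]))   # 405.0 -> prune this branch if best_cost <= 405
```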
Search order matters:
Exploring promising branches first—best-first order by lower bound, seeded with a strong greedy plan as the initial upper bound, as in the pseudocode above—leads to finding good complete plans early, which in turn enables more pruning.
SQL Server and Oracle use sophisticated branch-and-bound with multiple levels of optimization. Quick searches find initial bounds; deeper searches refine. The system dynamically adjusts search depth based on query complexity and available time budget.
Real-world optimizers don't use a single strategy—they adapt their approach based on query complexity and time constraints.
Tiered optimization (optimization levels):
Many systems define optimization levels:
Level 0 (Trivial):
- Single-table queries, no joins
- Just pick access method
- Microseconds
Level 1 (Simple):
- Few tables, simple predicates
- Greedy or exhaustive for small space
- Milliseconds
Level 2 (Moderate):
- Multiple tables, complex predicates
- Full DP for join ordering
- Tens of milliseconds
Level 3 (Complex):
- Many tables, subqueries, CTEs
- Extensive search with pruning
- Hundreds of milliseconds
Level 4 (Full):
- Exhaustive exploration of all alternatives
- May include bushy trees, parallelism
- Seconds (with timeout)
SQL Server, for example, organizes cost-based search into staged phases and stops as soon as a phase produces a good-enough plan:

| Phase | Description | Stop If... |
|---|---|---|
| Phase 0 (Transaction Processing) | Quick, simple plans for OLTP | Plan cost < threshold |
| Phase 1 (Quick Plan) | Heuristics + limited exploration | Good plan found quickly |
| Phase 2 (Full Optimization) | Complete DP enumeration | Time budget exhausted or optimal found |
| Phase 3 (Parallel) | Consider parallel alternatives | Always if Phase 2 runs |
Adaptive search:
Within each level, the optimizer adapts as it goes: if the best plan found so far is already cheap relative to the time spent, it stops early; if the query looks expensive or estimates are uncertain, it keeps exploring more alternatives.
Time budget formula (example):
Optimization_Budget = min(
    MAX_OPTIMIZATION_TIME,
    OPTIMIZATION_FRACTION × Estimated_Query_Time
)

where OPTIMIZATION_FRACTION might be 0.05 (5%)
For a query estimated to take 60 seconds, allow up to 3 seconds of optimization. For a 10ms query, cap optimization at 0.5ms.
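A direct transcription of that budget rule, with illustrative constants (a 3-second absolute cap and a 5% fraction—not the defaults of any particular system):

```python
def optimization_budget_ms(estimated_query_ms,
                           max_optimization_ms=3_000.0,
                           optimization_fraction=0.05):
    """Spend at most a fixed fraction of the query's estimated runtime on
    optimization, capped by an absolute maximum."""
    return min(max_optimization_ms, optimization_fraction * estimated_query_ms)

print(optimization_budget_ms(60_000))   # 60 s query  -> 3000.0 ms of optimization
print(optimization_budget_ms(10))       # 10 ms query -> 0.5 ms
```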
Any-time algorithms:
Some optimizers use any-time algorithms that hold a complete, valid plan from the very first iteration, keep refining it as long as time remains, and can be interrupted at any moment while still returning the best plan found so far (see the sketch below).
This guarantees a result within any time budget while improving quality with more time.
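A minimal sketch of the any-time pattern, assuming the caller supplies an initial plan, a randomized improvement step, and a cost function (all names here are illustrative):

```python
import random
import time

def anytime_optimize(initial_plan, improve, cost, budget_seconds):
    """Always holds a usable plan; keeps trying to improve it until time runs out."""
    best = initial_plan
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:
        candidate = improve(best)
        if cost(candidate) < cost(best):
            best = candidate
    return best

# Toy usage: "plans" are numbers, cost is their distance from zero.
print(anytime_optimize(100.0, lambda p: p - random.random(), abs, 0.01))
# prints a value close to 0 -- the best found before the 10 ms deadline
```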
Aggressive timeout-based optimization can cause plan instability—similar queries getting different plans based on when optimization was interrupted. Production systems often include stability mechanisms: plan baselines, consistent search order, and minimum exploration thresholds.
The Cascades/Volcano optimization framework uses a fundamentally different search paradigm: transformation rules applied within a memoization structure.
Core concepts:
- Memo: a table of groups, where each group holds logically equivalent expressions, so alternatives are stored and costed only once.
- Transformation rules: logical-to-logical rewrites (commutativity, associativity, predicate pushdown) that populate groups with alternatives.
- Implementation rules: logical-to-physical rewrites that map operators onto algorithms (hash join, merge join, index scan).
- Required physical properties: requirements such as sort order, passed top-down so each group is optimized for the context that needs it.
Search via rule application:
function CascadesOptimize(goal, required_properties):
    # Look up or create group for this expression
    group = memo.find_or_create(goal)

    # Check if already optimized for these properties
    if group.has_optimal_plan(required_properties):
        return group.get_optimal_plan(required_properties)

    # Apply transformation rules to generate alternatives
    for rule in applicable_rules(goal):
        new_expressions = rule.apply(goal)
        for expr in new_expressions:
            memo.add_to_group(group, expr)

    # Apply implementation rules (logical → physical)
    for impl_rule in implementation_rules(goal):
        physical_ops = impl_rule.apply(goal)
        for op in physical_ops:
            # Recursively optimize inputs with required properties
            input_plans = []
            for input, input_props in op.inputs_with_requirements():
                input_plan = CascadesOptimize(input, input_props)
                input_plans.append(input_plan)

            plan = create_plan(op, input_plans)
            cost = compute_cost(plan)
            group.update_optimal(required_properties, plan, cost)

    return group.get_optimal_plan(required_properties)

Example transformation rules:
Logical Transformation Rules:
JoinCommutativity: A ⋈ B → B ⋈ A
JoinAssociativity: (A ⋈ B) ⋈ C → A ⋈ (B ⋈ C)
SelectionPushdown: σ(R ⋈ S) → σ(R) ⋈ S (if σ applies to R only)
ProjectionPushdown: π(R ⋈ S) → π(π'(R) ⋈ π''(S))
Implementation Rules:
JoinToHashJoin: R ⋈ S → HashJoin(R, S)
JoinToMergeJoin: R ⋈ S → MergeJoin(R, S) [requires sorted inputs]
JoinToNLJ: R ⋈ S → NestedLoopJoin(R, S)
TableToScan: R → TableScan(R)
TableToIdxScan: R → IndexScan(R, idx) [if idx exists]
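A tiny illustration of how such rules can be expressed and applied: expressions are nested tuples and each rule yields equivalent alternatives. This is only a sketch of the idea, not the actual memo/group machinery of a Cascades optimizer.

```python
# Logical expressions as nested tuples: ("join", left, right). Each rule yields
# zero or more logically equivalent alternatives (illustrative sketch only).
def join_commutativity(expr):
    if isinstance(expr, tuple) and expr[0] == "join":
        _, left, right = expr
        yield ("join", right, left)

def join_associativity(expr):
    if (isinstance(expr, tuple) and expr[0] == "join"
            and isinstance(expr[1], tuple) and expr[1][0] == "join"):
        _, (_, a, b), c = expr
        yield ("join", a, ("join", b, c))

expr = ("join", ("join", "A", "B"), "C")          # (A ⋈ B) ⋈ C
for rule in (join_commutativity, join_associativity):
    for alternative in rule(expr):
        print(alternative)
# ('join', 'C', ('join', 'A', 'B'))   <- commutativity
# ('join', 'A', ('join', 'B', 'C'))   <- associativity
```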
Advantages of Cascades:
- Extensible: new operators or rewrites are added as rules without touching the search engine.
- Memoization: equivalent sub-expressions are grouped, so alternatives are derived and costed once.
- Goal-driven: top-down search optimizes only the (group, required-properties) combinations that are actually needed.
- Uniformity: logical exploration and physical implementation are handled within one framework.
SQL Server, Greenplum, Apache Calcite, CockroachDB, and many modern databases use Cascades-style optimizers. The framework's modularity makes it popular for extensible systems, though it requires more complex implementation than traditional System R-style DP.
When tuning or troubleshooting query optimization, understanding configuration options helps:
PostgreSQL optimization controls:
| Setting | Purpose | Impact on Search |
|---|---|---|
join_collapse_limit | Max tables for exhaustive reordering | Higher = more thorough, slower |
from_collapse_limit | Limits FROM clause flattening | Affects what optimizer 'sees' |
geqo_threshold | When to switch to genetic | Lower = genetic kicks in earlier |
enable_hashjoin, etc. | Enable/disable join types | Constrains search space |
random_page_cost | Cost model parameter | Influences plan preference |
Debugging search behavior:
When a query has a poor plan, investigate:
- EXPLAIN ANALYZE: compare actual vs. estimated row counts at each node
- Statistics freshness: re-run ANALYZE on the tables involved

Example debugging session:
-- Check current plan
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT ...
-- Force a different join order (PostgreSQL)
SET join_collapse_limit = 1; -- Disable reordering
SELECT ... FROM A, B, C ... -- Uses FROM clause order
-- Force specific join type
SET enable_hashjoin = off;
EXPLAIN SELECT ...
-- Increase optimization effort
SET from_collapse_limit = 20;
SET join_collapse_limit = 20;
EXPLAIN SELECT ...
Before adjusting search parameters, verify statistics are current (ANALYZE), cost model parameters match your hardware, and the query itself is correct. Many 'optimization' problems are actually stale statistics or misconfigured cost weights.
We've surveyed the spectrum of search strategies that power cost-based optimizers: exhaustive and DP-based enumeration when the plan space is small, greedy and heuristic shortcuts when it is not, randomized search (hill climbing, simulated annealing, genetic algorithms) for very large queries, branch-and-bound pruning, and tiered, time-budgeted approaches that blend them all.
Module complete:
With this page, we've completed our exploration of Cost-Based Query Optimization. You now understand how costs are estimated, how plans are compared, how dynamic programming orders joins, how plan tree shapes constrain the search space, and how search strategies find good plans within a time budget.
This knowledge empowers you to understand optimizer behavior, diagnose poor plans, and tune systems for optimal performance. Query optimization remains an active research area, with machine learning, adaptive execution, and learned cost models continually advancing the field.
Congratulations! You've mastered the core concepts of cost-based query optimization. From cost estimation through dynamic programming to search strategies, you now understand how database systems transform your SQL into efficient execution plans. This knowledge is fundamental for performance tuning, system design, and deep database work.