Imagine you're planning a road trip visiting 10 cities. How many different orderings could you visit them in? The answer is 10! = 3,628,800 different routes. Now imagine you could also choose between 3 different roads between each pair of cities, and at each city you could stay at 4 different hotels. The combinations explode astronomically.
This is exactly the challenge facing a query optimizer. For a query joining 10 tables, with multiple access methods per table, multiple join algorithms, and multiple join orderings, the space of possible execution plans can exceed trillions. Evaluating every single plan is impossible—yet the optimizer must find a good plan in milliseconds.
Plan comparison is the systematic process of enumerating candidate plans, computing their costs, and selecting the optimal (or near-optimal) one. This page explores how optimizers navigate this exponential search space efficiently.
By the end of this page, you will understand: (1) The structure of the plan search space, (2) How optimizers enumerate candidate plans, (3) Techniques for pruning unproductive search branches, (4) The concept of equivalence classes and physical properties, and (5) Trade-offs between optimization time and plan quality.
Before comparing plans, we must understand what we're comparing. The plan search space consists of all valid execution plans for a given query. This space has multiple dimensions:
Dimension 1: Join Ordering
For n tables, there are (2n-2)!/(n-1)! different ordered join trees. This grows super-exponentially:
| Tables | Join Trees |
|---|---|
| 3 | 12 |
| 5 | 1,680 |
| 7 | 665,280 |
| 10 | 17,643,225,600 |
Even just considering join order, the space becomes astronomical for modest query sizes.
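As a sanity check on the table above, the count formula is easy to evaluate directly. A minimal sketch:

```python
from math import factorial

def join_tree_count(n: int) -> int:
    """Number of distinct ordered binary join trees for n tables:
    (2n-2)! / (n-1)!"""
    return factorial(2 * n - 2) // factorial(n - 1)

for n in (3, 5, 7, 10):
    print(n, join_tree_count(n))
```

Running this reproduces the table: 12 trees for 3 tables, over 17 billion for 10.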
Dimension 2: Join Algorithms
For each join in the plan, the optimizer chooses among physical operators: nested loop join, index nested loop join, block nested loop join, hash join, and sort-merge join.
With 5 join algorithms and 9 joins (a 10-table query), that's 5⁹ ≈ 2 million combinations per join order.
Dimension 3: Access Methods
For each table, the optimizer chooses how to access it: for example, sequential scan, index scan, index-only scan, or bitmap scan.
With 4 choices per table and 10 tables, that's 4¹⁰ ≈ 1 million combinations per join-order/algorithm choice.
Multiplying these dimensions: for a 10-table query, the total plan space multiplies out to roughly 3 × 10²² plans. If each cost estimate took 1 microsecond, exhaustive enumeration would take about a billion years. Optimizers MUST use smarter strategies.
The optimization problem:
Given: a query, statistics about the data, and a cost model,
Find: the valid execution plan with the minimum estimated cost, or at least a plan within some small factor of optimal.
Optimizers use several strategies to enumerate candidate plans efficiently:
Strategy 1: Bottom-Up Enumeration
Start with single-table access plans, then progressively build larger plans by combining smaller ones:
Level 1: All single-table plans
Plan({A}), Plan({B}), Plan({C}), ...
Level 2: All two-table plans from Level 1
Plan({A,B}) = Plan({A}) ⋈ Plan({B})
Plan({A,C}) = Plan({A}) ⋈ Plan({C})
Plan({B,C}) = Plan({B}) ⋈ Plan({C})
...
Level 3: All three-table plans
Plan({A,B,C}) = Plan({A,B}) ⋈ Plan({C})
= Plan({A,C}) ⋈ Plan({B})
= Plan({A}) ⋈ Plan({B,C})
...
This is the foundation of dynamic programming optimization, which we'll explore in depth later.
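The levels above can be turned into a short runnable sketch. The row counts and the cost model here (a join step pays the sum of its input sizes, and output estimates are crude) are invented purely for illustration:

```python
from itertools import combinations

# Toy row counts and a deliberately crude cost model (all numbers assumed)
CARD = {"A": 1000, "B": 200, "C": 50}

def bottom_up(relations):
    """Bottom-up enumeration: best plan per subset of tables.
    best[subset] = (cost, est_rows, plan_string)."""
    best = {frozenset([r]): (0, CARD[r], r) for r in relations}
    for level in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(relations, level)):
            candidates = []
            for k in range(1, level):            # every split into two pieces
                for left in map(frozenset, combinations(sorted(subset), k)):
                    right = subset - left
                    lc, lrows, lplan = best[left]
                    rc, rrows, rplan = best[right]
                    cost = lc + rc + lrows + rrows        # join-step cost
                    rows = max(lrows * rrows // 100, 1)   # crude estimate
                    candidates.append((cost, rows, f"({lplan} ⋈ {rplan})"))
            best[subset] = min(candidates)
    return best[frozenset(relations)]

cost, rows, plan = bottom_up(["A", "B", "C"])
print(cost, plan)
```

Even with this toy model the search finds the intuitive answer: join the two small tables B and C first, then bring in the large table A.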
Strategy 2: Top-Down Enumeration
Start with the full query and recursively decompose:
Optimize(A ⋈ B ⋈ C):
Try: Optimize(A ⋈ B) ⋈ Optimize(C)
Try: Optimize(A ⋈ C) ⋈ Optimize(B)
Try: Optimize(A) ⋈ Optimize(B ⋈ C)
...
Return best option
This enables memoization—caching results of subproblems to avoid recomputation.
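A sketch of the same toy problem solved top-down, using Python's `lru_cache` as the memo (the size model is invented for illustration):

```python
from functools import lru_cache
from itertools import combinations

CARD = {"A": 1000, "B": 200, "C": 50}   # toy row counts (assumed)

def size(rels):
    """Crude output-size estimate for a joined subset (illustration only)."""
    rows = 1
    for r in rels:
        rows *= CARD[r]
    return max(rows // (100 ** (len(rels) - 1)), 1)

@lru_cache(maxsize=None)
def optimize(rels):
    """Top-down: decompose, recurse, memoize. Each subset is solved once,
    no matter how many parent decompositions ask for it."""
    if len(rels) == 1:
        (r,) = rels
        return (0, r)
    best = None
    for k in range(1, len(rels)):
        for left in map(frozenset, combinations(sorted(rels), k)):
            right = rels - left
            lcost, lplan = optimize(left)
            rcost, rplan = optimize(right)
            cost = lcost + rcost + size(left) + size(right)  # join-step cost
            if best is None or cost < best[0]:
                best = (cost, f"({lplan} ⋈ {rplan})")
    return best

print(optimize(frozenset("ABC")))
```

The cache statistics confirm the memoization at work: subsets like {B,C} are requested by multiple decompositions but optimized only once.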
Strategy 3: Transformation-Based Enumeration
Start with an initial plan and apply transformations to generate alternatives:
Initial: (A ⋈ B) ⋈ C with nested loop joins
Transformation 1: Apply join commutativity
→ (B ⋈ A) ⋈ C
Transformation 2: Apply join associativity
→ A ⋈ (B ⋈ C)
Transformation 3: Change join algorithm
→ (A ⊳⊲ B) ⋈ C [hash join for A-B]
Systems like Volcano/Cascades use transformation-based enumeration with rule-driven generation.
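A toy sketch of rule-driven generation: starting from one expression, repeatedly apply commutativity and associativity until no new forms appear. Plan representation and rules here are simplified stand-ins, not any real system's API:

```python
# Plans as nested tuples: ("join", left, right) or a bare table name
def transforms(expr):
    """Yield expressions one rewrite away, via two classic rules."""
    if isinstance(expr, str):
        return
    _, a, b = expr
    yield ("join", b, a)                      # commutativity: A⋈B → B⋈A
    if isinstance(a, tuple):                  # associativity: (A⋈B)⋈C → A⋈(B⋈C)
        _, x, y = a
        yield ("join", x, ("join", y, b))
    for a2 in transforms(a):                  # also rewrite inside subtrees
        yield ("join", a2, b)
    for b2 in transforms(b):
        yield ("join", a, b2)

def closure(expr):
    """All logically equivalent forms reachable from `expr`."""
    seen, frontier = {expr}, [expr]
    while frontier:
        e = frontier.pop()
        for e2 in transforms(e):
            if e2 not in seen:
                seen.add(e2)
                frontier.append(e2)
    return seen

forms = closure(("join", ("join", "A", "B"), "C"))
print(len(forms))
```

The closure contains exactly the 12 ordered join trees predicted for 3 tables, showing that the two rules together generate the full equivalence class.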
| Strategy | Advantages | Disadvantages | Used By |
|---|---|---|---|
| Bottom-Up (System R style) | Complete exploration, optimal within its plan class | May waste time on poor subplans | PostgreSQL, MySQL |
| Top-Down (Volcano/Cascades) | Natural memoization, flexible rules | Higher overhead, complex implementation | SQL Server, Greenplum |
| Transformation-Based | Modular, easy to extend | May miss plans, ordering-dependent | Apache Calcite, Spark SQL |
| Randomized/Genetic | Good for very large queries | No optimality guarantee | PostgreSQL (GEQO) |
Since exhaustive search is infeasible, optimizers must prune the search space—eliminating branches without fully exploring them. Effective pruning is the key to fast optimization.
Pruning Technique 1: Cost-Based Pruning (Branch and Bound)
If we've found a complete plan with cost 1000, we can prune any partial plan whose cost already exceeds 1000:
Best_known_cost = 1000
Partial plan (A ⋈ B) has cost 800
→ Continue exploring; might find plan with total cost < 1000
Partial plan (A ⋈ C) has cost 1100
→ PRUNE! No extension can have cost < 1000
This is highly effective when we find good plans early. Order of exploration matters—try promising branches first.
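A runnable sketch of branch and bound over left-deep join orders. The cost model is invented, but it has the property that matters: a partial plan's cost only grows as it is extended, which is exactly what makes the pruning rule safe:

```python
import heapq

CARD = {"A": 1000, "B": 200, "C": 50}   # toy row counts (assumed)

def cost(order):
    """Cumulative cost of a left-deep join order; only ever grows as the
    order is extended, which is what makes pruning safe."""
    total, rows = 0, 0
    for t in order:
        if rows:
            total += rows + CARD[t]            # join-step cost (invented model)
            rows = max(rows * CARD[t] // 100, 1)
        else:
            rows = CARD[t]
    return total

def branch_and_bound(tables):
    best_cost, best_plan = float("inf"), None
    frontier = [(0, (t,)) for t in tables]     # (cost, partial join order)
    heapq.heapify(frontier)                    # explore cheapest partials first
    while frontier:
        c, order = heapq.heappop(frontier)
        if c >= best_cost:
            continue                           # PRUNE: cannot beat best complete plan
        if len(order) == len(tables):
            best_cost, best_plan = c, order
            continue
        for t in tables:
            if t not in order:
                nxt = order + (t,)
                if cost(nxt) < best_cost:      # prune before pushing, too
                    heapq.heappush(frontier, (cost(nxt), nxt))
    return best_cost, best_plan

print(branch_and_bound(sorted(CARD)))
```

Best-first exploration means a cheap complete plan is found early, after which most of the remaining frontier is discarded without being expanded.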
Pruning Technique 2: Interesting Orders
Interesting orders are sort orderings that benefit downstream operations. A plan producing sorted output might be more expensive initially but saves sorting later:
SELECT * FROM Orders o JOIN Customers c ON o.customer_id = c.id
ORDER BY c.name;
If a plan reads Customers in name order (say, via an index on name) and the join preserves that order, the final ORDER BY c.name costs nothing, even though the ordered access path may be more expensive than a plain scan.
Without tracking this, we'd prune the "expensive" sorted plan, missing the overall optimal.
Pruning Technique 3: Physical Property Propagation
Beyond sorting, plans have physical properties: sort order, partitioning or distribution across nodes, data placement, and grouping.
Plans with compatible properties may enable more efficient parent operators. Optimizers track property requirements and prune plans that can't satisfy them.
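A small sketch of property handling: a plan carries a sort-order property, and a parent that requires an order either gets it for free or pays for a Sort enforcer. All names and costs are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    desc: str
    cost: float
    sort_order: tuple = ()   # one physical property; real optimizers also
                             # track partitioning, node placement, etc.

def satisfy(plan, required_order, sort_cost=100.0):
    """Return a plan delivering `required_order`, inserting a Sort enforcer
    if the child cannot provide the property itself (costs are invented)."""
    if plan.sort_order[:len(required_order)] == required_order:
        return plan                      # property already satisfied for free
    return Plan(f"Sort({plan.desc})", plan.cost + sort_cost, required_order)

cheap_unsorted = Plan("SeqScan(Customers)", cost=50.0)
pricier_sorted = Plan("IdxScan(Customers, name_idx)", cost=120.0,
                      sort_order=("name",))

a = satisfy(cheap_unsorted, ("name",))   # Sort(SeqScan(...)), cost 150.0
b = satisfy(pricier_sorted, ("name",))   # unchanged, cost 120.0
print(min((a, b), key=lambda p: p.cost).desc)
```

Once the enforcer cost is charged, the "expensive" index scan beats the cheap scan plus sort, which is exactly why plans with useful properties must survive pruning.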
Aggressive pruning risks missing the optimal plan. Conservative pruning wastes optimization time. Good optimizers balance aggressiveness with safety, using heuristics tuned for typical workloads while ensuring correctness.
A powerful optimization concept is grouping logically equivalent expressions into equivalence classes. All members of a class produce the same result but may have different physical implementations and costs.
Example equivalence class:
For the expression (A ⋈ B) ⋈ C, the equivalence class contains:
Logically equivalent forms:
(A ⋈ B) ⋈ C
(B ⋈ A) ⋈ C
A ⋈ (B ⋈ C)
A ⋈ (C ⋈ B)
(A ⋈ C) ⋈ B
(C ⋈ A) ⋈ B
...
Each form can use different physical operators, creating many physical plans per logical form.
The memoization table:
Transformation-based optimizers (Volcano/Cascades) maintain a data structure called the memo that groups expressions by equivalence:
Memo:
Group 1: {A} → Plans: SeqScan(A), IdxScan(A, idx1)
Group 2: {B} → Plans: SeqScan(B)
Group 3: {C} → Plans: SeqScan(C), IdxScan(C, idx2)
Group 4: {A ⋈ B} → Plans: Group1 NLJ Group2, Group1 HJ Group2, ...
Group 5: {B ⋈ C} → Plans: Group2 NLJ Group3, Group2 SMJ Group3, ...
Group 6: {A ⋈ B ⋈ C} → Plans: Group4 ⋈ Group3, Group1 ⋈ Group5, ...
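The memo above can be sketched as a small data structure. Group contents and costs are illustrative (the join cost of 1012.5 is borrowed from the worked example later on this page):

```python
from collections import defaultdict

class Memo:
    """Minimal memo: logical groups keyed by the set of relations covered,
    each holding alternative physical plans (name, cost)."""
    def __init__(self):
        self.groups = defaultdict(list)   # frozenset -> [(plan, cost), ...]

    def add(self, rels, plan, cost):
        self.groups[frozenset(rels)].append((plan, cost))

    def best(self, rels):
        return min(self.groups[frozenset(rels)], key=lambda pc: pc[1])

memo = Memo()
memo.add({"A"}, "SeqScan(A)", 2000)
memo.add({"A"}, "IdxScan(A, idx1)", 900)
memo.add({"B"}, "SeqScan(B)", 60.5)
# A join group builds on its input groups' best plans, not concrete subtrees
plan_a, cost_a = memo.best({"A"})
plan_b, cost_b = memo.best({"B"})
memo.add({"A", "B"}, f"HashJoin({plan_a}, {plan_b})", cost_a + cost_b + 1012.5)
print(memo.best({"A", "B"}))
```

Because the join group references its input groups rather than specific subplans, improving a group's best plan automatically benefits every plan built on top of it.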
Benefits of equivalence classes: each logical subexpression is optimized only once; alternative plans share optimized subgroups instead of duplicating them; and the memo encodes an enormous number of plans far more compactly than listing each one explicitly.
The Cascades optimizer framework (used by SQL Server, Greenplum, and others) formalizes memo-based optimization. Columbia extended it with further optimizations. These systems treat optimization as graph search over equivalence classes with memoization.
In practice, "best plan" isn't always one-dimensional. Optimizers must sometimes balance multiple objectives:
Objective 1: Total Work (Total Cost)
Minimize total resource consumption—the traditional optimization goal.
Objective 2: Time to First Row (Startup Cost)
For interactive queries or cursors, users want the first result quickly, even if the total query takes longer.
Plan A: Startup=1000, Total=5000 (sort-then-stream)
Plan B: Startup=10, Total=8000 (stream immediately)
For LIMIT 10, Plan B might be better despite higher total cost.
Objective 3: Memory Usage
In memory-constrained environments, a plan using less memory might be preferred even if slower:
Plan A: Cost=1000, Memory=500MB (fits in buffer pool)
Plan B: Cost=800, Memory=2GB (forces spilling to disk)
The actual execution cost of Plan B might exceed Plan A due to spill overhead.
Objective 4: Resource Contention
In multi-tenant systems, some plans are "friendlier" to other queries:
Plan A: Uses all parallel workers, finishes in 10s
Plan B: Uses 2 workers, finishes in 40s
Plan B might be preferred if other critical queries need resources.
| Property | When It Matters | Trade-off Example |
|---|---|---|
| Startup cost | Interactive queries, LIMIT clauses | Sort vs. nested loop for first rows |
| Peak memory | Concurrent queries, small buffer pools | Hash join vs. sort-merge join |
| Parallelism | Shared systems, cloud resources | Parallel scan vs. sequential scan |
| Output ordering | ORDER BY, merge joins downstream | Sorted access vs. unsorted + sort |
| Blocking behavior | Streaming pipelines, timeouts | Hash aggregation vs. sorted aggregation |
With multiple objectives, there may be no single "best" plan. A set of Pareto-optimal plans exists where no plan beats another in ALL objectives. Advanced optimizers maintain this frontier and make context-dependent choices.
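Computing the Pareto frontier is straightforward. The sketch below uses made-up plans with three objectives (startup cost, total cost, peak memory), lower being better on each:

```python
def dominates(p, q):
    """p dominates q if p is no worse on every objective and better on one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def pareto_frontier(plans):
    """Keep only plans not dominated by any other plan."""
    return [p for p in plans if not any(dominates(q, p) for q in plans if q != p)]

# (startup_cost, total_cost, peak_memory_mb) — toy numbers
plans = {
    "sort_then_stream": (1000, 5000, 200),
    "stream_now":       (10,   8000, 50),
    "hash_heavy":       (50,   4000, 2000),
    "strictly_worse":   (1200, 9000, 2100),
}
front = pareto_frontier(list(plans.values()))
print([name for name, objs in plans.items() if objs in front])
```

Only the plan that loses on every objective is eliminated; the other three each win on at least one dimension, so the final choice among them depends on context (LIMIT clause, memory pressure, and so on).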
When the optimizer compares two plans, it follows a systematic process:
Step 1: Compute individual operator costs
Plan: HashJoin(SeqScan(A), IdxScan(B))
SeqScan(A):
IO_cost = pages(A) = 1000
CPU_cost = rows(A) × tuple_cost = 100,000 × 0.01 = 1000
Total = 1000 + 1000 = 2000
IdxScan(B, predicate):
IO_cost = index_height + selectivity × pages(B)
= 3 + 0.01 × 5000 = 53
CPU_cost = estimated_rows × (tuple_cost + predicate_cost)
= 500 × (0.01 + 0.005) = 7.5
Total = 53 + 7.5 = 60.5
HashJoin:
Build_cost = rows(smaller_input) × hash_cost = 500 × 0.02 = 10
Probe_cost = rows(larger_input) × probe_cost = 100,000 × 0.01 = 1000
Output_cost = matched_rows × output_cost = 500 × 0.005 = 2.5
Total = 10 + 1000 + 2.5 = 1012.5
Step 2: Sum the plan tree
Total_Plan_Cost = 2000 + 60.5 + 1012.5 = 3073
Step 3: Compare with alternatives
Alternative: NestedLoop(IdxScan(A), IdxScan(B))
IdxScan(A, predicate):
Total = 73 (similar to B)
IdxScan(B) × rows from A:
Per lookup = 4 pages (index + data)
Total lookups = 7,300 (rows from A)
Total = 7,300 × 4 = 29,200
Total_Plan_Cost = 73 + 29,200 + output_cost ≈ 29,300
Winner: HashJoin plan (cost 3,073 vs 29,300)
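The three-step comparison above is easy to encode. These cost functions mirror the formulas in the worked example, with the same (illustrative) constants:

```python
def seqscan_cost(pages, rows, tuple_cost=0.01):
    return pages + rows * tuple_cost

def idxscan_cost(index_height, selectivity, pages, est_rows,
                 tuple_cost=0.01, predicate_cost=0.005):
    io = index_height + selectivity * pages
    cpu = est_rows * (tuple_cost + predicate_cost)
    return io + cpu

def hashjoin_cost(build_rows, probe_rows, matched_rows,
                  hash_cost=0.02, probe_cost=0.01, output_cost=0.005):
    return (build_rows * hash_cost + probe_rows * probe_cost
            + matched_rows * output_cost)

scan_a = seqscan_cost(pages=1000, rows=100_000)              # 2000
scan_b = idxscan_cost(3, 0.01, 5000, 500)                    # 60.5
join = hashjoin_cost(build_rows=500, probe_rows=100_000,
                     matched_rows=500)                        # 1012.5
hash_total = scan_a + scan_b + join

nl_total = 73 + 7_300 * 4    # index nested loop alternative, per the text
print(hash_total, "HashJoin wins" if hash_total < nl_total else "NL wins")
```

Summing the tree reproduces the plan cost of 3073 and confirms the hash join plan beats the nested loop alternative by roughly a factor of ten.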
More detailed cost models (accounting for cache effects, NUMA, CPU pipeline stalls) produce more accurate comparisons but slower optimization. Most production optimizers use models that balance accuracy with speed, with perhaps 5-10 cost components per operator.
What happens when two plans have similar costs? This situation is more common—and more problematic—than it might seem.
The instability problem:
If Plan A costs 1000 and Plan B costs 1010, the optimizer chooses A. But cost estimates routinely carry errors far larger than that 1% gap, so the "winner" is essentially arbitrary, and a minor statistics refresh or small data change can flip the choice.
This leads to plan instability—the same query getting different plans across minor changes, causing unpredictable performance.
Strategies for handling close plans include: treating plans within a small cost tolerance as ties and breaking ties toward the more robust plan; plan hysteresis, keeping the current plan unless a new candidate is significantly cheaper; and pinning known-good plans via baselines so a switch requires explicit approval.
The "good enough" philosophy:
Because cost estimates are approximate, optimizers increasingly adopt satisficing rather than optimizing: stop searching once a plan looks good enough, prefer plans whose cost degrades gracefully if the estimates are wrong, and avoid switching plans for marginal predicted gains.
This reduces optimization time while producing plans that are rarely significantly worse than optimal.
Plan regression—when a plan change makes performance worse—is a major operational concern. It often happens after statistics updates, database upgrades, or configuration changes. Production systems need monitoring to detect regressions and mechanisms to revert to known-good plans.
Optimization itself takes time. For OLTP queries that should execute in milliseconds, spending seconds on optimization is absurd. For complex analytics running for hours, thorough optimization is worthwhile.
The optimization cost balance:
Total_Time = Optimization_Time + Execution_Time
We want to minimize Total_Time, not just Execution_Time.
Example trade-off:
Scenario 1 (OLTP):
Quick optimize (1ms) → Plan cost 100 → Total: 101ms
Full optimize (50ms) → Plan cost 95 → Total: 145ms
→ Quick optimization wins!
Scenario 2 (Analytics):
Quick optimize (1ms) → Plan cost 600,000 → Total: 600,001ms
Full optimize (500ms) → Plan cost 60,000 → Total: 60,500ms
→ Full optimization wins by 10×!
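The arithmetic behind the two scenarios, as a trivial sketch:

```python
def total_time(opt_ms, exec_ms):
    """Optimization time is part of the bill the user pays."""
    return opt_ms + exec_ms

# Scenario 1 (OLTP): a cheap query punishes heavyweight optimization
oltp = {"quick": total_time(1, 100), "full": total_time(50, 95)}
# Scenario 2 (analytics): an expensive query repays it many times over
olap = {"quick": total_time(1, 600_000), "full": total_time(500, 60_000)}

print(min(oltp, key=oltp.get), min(olap, key=olap.get))
```

The same two strategies produce opposite winners purely because of the query's expected execution time, which is why optimization budgets should scale with query cost.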
| Strategy | How It Works | Use Case |
|---|---|---|
| Fixed timeout | Stop optimization after N milliseconds | Simple, predictable overhead |
| Proportional budget | Spend up to X% of expected query time optimizing | Adaptive to query complexity |
| Plan count limit | Evaluate at most N plans, then pick best | Controls worst-case optimization time |
| Tiered optimization | Quick checks first; deep analysis only if needed | Best of both worlds |
| Estimated complexity | Scale budget with query size (tables, predicates) | Natural adaptation to query size |
PostgreSQL's GEQO example:
For queries joining many tables, PostgreSQL switches from exhaustive search to Genetic Query Optimization (GEQO), a randomized algorithm: it encodes each join order as an individual, evolves a population through selection and recombination over several generations, and returns the best plan found rather than a provably optimal one. By default GEQO takes over once a query joins geqo_threshold (12) or more tables.
This pragmatic approach recognizes that perfect is the enemy of good—and fast.
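The spirit of randomized optimization can be shown with a heavily simplified sketch: sample random join orders under a toy cost model and keep the cheapest. Real GEQO evolves a population with selection and recombination rather than sampling independently; the tables and the cost model here are invented:

```python
import random

# Toy tables with deterministic, made-up row counts
CARD = {f"T{i}": (i * 37) % 500 + 10 for i in range(12)}

def plan_cost(order):
    """Crude left-deep cost model (invented): each join step pays the
    current intermediate size plus the next table's size."""
    total, rows = 0, 0
    for t in order:
        if rows:
            total += rows + CARD[t]
            rows = max(rows * CARD[t] // 1000, 1)
        else:
            rows = CARD[t]
    return total

def randomized_search(tables, samples=2000, seed=42):
    """Sample random join orders, keep the best one found."""
    rng = random.Random(seed)
    best = min((tuple(rng.sample(tables, len(tables)))
                for _ in range(samples)),
               key=plan_cost)
    return best, plan_cost(best)

order, best_cost = randomized_search(sorted(CARD))
print(best_cost)
```

A few thousand samples usually land on a plan far better than an arbitrary ordering, without ever enumerating the roughly 10^10 join trees a 12-table query admits.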
For repeated queries with different parameters, prepared statements amortize optimization cost across many executions. Optimize once thoroughly, cache the plan, reuse indefinitely (with caveats around parameter sensitivity).
We've explored how optimizers navigate the vast space of execution plans. Let's consolidate the key insights:
- The plan space grows super-exponentially with query size across join order, join algorithm, and access method dimensions.
- Optimizers enumerate candidates bottom-up, top-down, or by transformation, each with distinct trade-offs.
- Cost-based pruning and interesting orders keep the search tractable without discarding useful plans.
- Equivalence classes and the memo avoid redundant optimization of shared subexpressions.
- "Best" is often multi-objective: startup cost, memory, parallelism, and ordering all matter.
- Optimization time itself must be budgeted against expected execution time.
What's next:
With plan comparison in hand, we're ready to examine the crown jewel of cost-based optimization: Dynamic Programming. The next page explores how this algorithmic technique enables optimal join ordering in polynomial time—turning an apparently intractable problem into a tractable one.
You now understand how optimizers systematically explore and compare execution plans. The combination of smart enumeration, aggressive pruning, and equivalence-based sharing makes optimization feasible despite the astronomical search space.