Imagine you're planning a road trip visiting 10 cities. How many different orderings could you visit them in? The answer is 10! = 3,628,800 different routes. Now imagine you could also choose between 3 different roads between each pair of cities, and at each city you could stay at 4 different hotels. The combinations explode astronomically.
This is exactly the challenge facing a query optimizer. For a query joining 10 tables, with multiple access methods per table, multiple join algorithms, and multiple join orderings, the space of possible execution plans can exceed trillions. Evaluating every single plan is impossible—yet the optimizer must find a good plan in milliseconds.
Plan comparison is the systematic process of enumerating candidate plans, computing their costs, and selecting the optimal (or near-optimal) one. This page explores how optimizers navigate this exponential search space efficiently.
By the end of this page, you will understand: (1) The structure of the plan search space, (2) How optimizers enumerate candidate plans, (3) Techniques for pruning unproductive search branches, (4) The concept of equivalence classes and physical properties, and (5) Trade-offs between optimization time and plan quality.
Before comparing plans, we must understand what we're comparing. The plan search space consists of all valid execution plans for a given query. This space has multiple dimensions:
Dimension 1: Join Ordering
For n tables, there are (2n-2)!/(n-1)! different ordered join trees. This grows super-exponentially:
| Tables | Join Trees |
|---|---|
| 3 | 12 |
| 5 | 1,680 |
| 7 | 665,280 |
| 10 | 17,643,225,600 |
Even just considering join order, the space becomes astronomical for modest query sizes.
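As a sanity check on the table above, the count formula is easy to evaluate directly. A minimal sketch:

```python
from math import factorial

def join_tree_count(n: int) -> int:
    """Number of distinct ordered binary join trees for n tables:
    (2n-2)! / (n-1)!"""
    return factorial(2 * n - 2) // factorial(n - 1)

for n in (3, 5, 7, 10):
    print(n, join_tree_count(n))
```

Running this reproduces the table: 12 trees for 3 tables, over 17 billion for 10.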
Dimension 2: Join Algorithms
For each join in the plan, the optimizer chooses among physical operators: nested loop join, index nested loop join, block nested loop join, hash join, and sort-merge join.
With 5 join algorithms and 9 joins (a 10-table query), that's 5⁹ ≈ 2 million combinations per join order.
Dimension 3: Access Methods
For each table, the optimizer chooses how to access it: for example, sequential scan, index scan, index-only scan, or bitmap scan.
With 4 choices per table and 10 tables, that's 4¹⁰ ≈ 1 million combinations per join-order/algorithm choice.
Multiplying these dimensions: for a 10-table query, the total plan space multiplies out to roughly 3 × 10²² plans. If each cost estimate took 1 microsecond, exhaustive enumeration would take about a billion years. Optimizers MUST use smarter strategies.
The optimization problem:
Given: a query, statistics about the data, and a cost model,
Find: the valid execution plan with the minimum estimated cost, or at least a plan within some small factor of optimal.
Optimizers use several strategies to enumerate candidate plans efficiently:
Strategy 1: Bottom-Up Enumeration
Start with single-table access plans, then progressively build larger plans by combining smaller ones:
Level 1: All single-table plans
Plan({A}), Plan({B}), Plan({C}), ...
Level 2: All two-table plans from Level 1
Plan({A,B}) = Plan({A}) ⋈ Plan({B})
Plan({A,C}) = Plan({A}) ⋈ Plan({C})
Plan({B,C}) = Plan({B}) ⋈ Plan({C})
...
Level 3: All three-table plans
Plan({A,B,C}) = Plan({A,B}) ⋈ Plan({C})
= Plan({A,C}) ⋈ Plan({B})
= Plan({A}) ⋈ Plan({B,C})
...
This is the foundation of dynamic programming optimization, which we'll explore in depth later.
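The levels above can be turned into a short runnable sketch. The row counts and the cost model here (a join step pays the sum of its input sizes, and output estimates are crude) are invented purely for illustration:

```python
from itertools import combinations

# Toy row counts and a deliberately crude cost model (all numbers assumed)
CARD = {"A": 1000, "B": 200, "C": 50}

def bottom_up(relations):
    """Bottom-up enumeration: best plan per subset of tables.
    best[subset] = (cost, est_rows, plan_string)."""
    best = {frozenset([r]): (0, CARD[r], r) for r in relations}
    for level in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(relations, level)):
            candidates = []
            for k in range(1, level):            # every split into two pieces
                for left in map(frozenset, combinations(sorted(subset), k)):
                    right = subset - left
                    lc, lrows, lplan = best[left]
                    rc, rrows, rplan = best[right]
                    cost = lc + rc + lrows + rrows        # join-step cost
                    rows = max(lrows * rrows // 100, 1)   # crude estimate
                    candidates.append((cost, rows, f"({lplan} ⋈ {rplan})"))
            best[subset] = min(candidates)
    return best[frozenset(relations)]

cost, rows, plan = bottom_up(["A", "B", "C"])
print(cost, plan)
```

Even with this toy model the search finds the intuitive answer: join the two small tables B and C first, then bring in the large table A.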
Strategy 2: Top-Down Enumeration
Start with the full query and recursively decompose:
Optimize(A ⋈ B ⋈ C):
Try: Optimize(A ⋈ B) ⋈ Optimize(C)
Try: Optimize(A ⋈ C) ⋈ Optimize(B)
Try: Optimize(A) ⋈ Optimize(B ⋈ C)
...
Return best option
This enables memoization—caching results of subproblems to avoid recomputation.
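A sketch of the same toy problem solved top-down, using Python's `lru_cache` as the memo (the size model is invented for illustration):

```python
from functools import lru_cache
from itertools import combinations

CARD = {"A": 1000, "B": 200, "C": 50}   # toy row counts (assumed)

def size(rels):
    """Crude output-size estimate for a joined subset (illustration only)."""
    rows = 1
    for r in rels:
        rows *= CARD[r]
    return max(rows // (100 ** (len(rels) - 1)), 1)

@lru_cache(maxsize=None)
def optimize(rels):
    """Top-down: decompose, recurse, memoize. Each subset is solved once,
    no matter how many parent decompositions ask for it."""
    if len(rels) == 1:
        (r,) = rels
        return (0, r)
    best = None
    for k in range(1, len(rels)):
        for left in map(frozenset, combinations(sorted(rels), k)):
            right = rels - left
            lcost, lplan = optimize(left)
            rcost, rplan = optimize(right)
            cost = lcost + rcost + size(left) + size(right)  # join-step cost
            if best is None or cost < best[0]:
                best = (cost, f"({lplan} ⋈ {rplan})")
    return best

print(optimize(frozenset("ABC")))
```

The cache statistics confirm the memoization at work: subsets like {B,C} are requested by multiple decompositions but optimized only once.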
Strategy 3: Transformation-Based Enumeration
Start with an initial plan and apply transformations to generate alternatives:
Initial: (A ⋈ B) ⋈ C with nested loop joins
Transformation 1: Apply join commutativity
→ (B ⋈ A) ⋈ C
Transformation 2: Apply join associativity
→ A ⋈ (B ⋈ C)
Transformation 3: Change join algorithm
→ (A ⊳⊲ B) ⋈ C [hash join for A-B]
Systems like Volcano/Cascades use transformation-based enumeration with rule-driven generation.
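A toy sketch of rule-driven generation: starting from one expression, repeatedly apply commutativity and associativity until no new forms appear. Plan representation and rules here are simplified stand-ins, not any real system's API:

```python
# Plans as nested tuples: ("join", left, right) or a bare table name
def transforms(expr):
    """Yield expressions one rewrite away, via two classic rules."""
    if isinstance(expr, str):
        return
    _, a, b = expr
    yield ("join", b, a)                      # commutativity: A⋈B → B⋈A
    if isinstance(a, tuple):                  # associativity: (A⋈B)⋈C → A⋈(B⋈C)
        _, x, y = a
        yield ("join", x, ("join", y, b))
    for a2 in transforms(a):                  # also rewrite inside subtrees
        yield ("join", a2, b)
    for b2 in transforms(b):
        yield ("join", a, b2)

def closure(expr):
    """All logically equivalent forms reachable from `expr`."""
    seen, frontier = {expr}, [expr]
    while frontier:
        e = frontier.pop()
        for e2 in transforms(e):
            if e2 not in seen:
                seen.add(e2)
                frontier.append(e2)
    return seen

forms = closure(("join", ("join", "A", "B"), "C"))
print(len(forms))
```

The closure contains exactly the 12 ordered join trees predicted for 3 tables, showing that the two rules together generate the full equivalence class.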
| Strategy | Advantages | Disadvantages | Used By |
|---|---|---|---|
| Bottom-Up (System R style) | Complete exploration, optimal within its plan class | May waste time on poor subplans | PostgreSQL, MySQL |
| Top-Down (Volcano/Cascades) | Natural memoization, flexible rules | Higher overhead, complex implementation | SQL Server, Greenplum |
| Transformation-Based | Modular, easy to extend | May miss plans, ordering-dependent | Apache Calcite, Spark SQL |
| Randomized/Genetic | Good for very large queries | No optimality guarantee | PostgreSQL (GEQO) |
Since exhaustive search is infeasible, optimizers must prune the search space—eliminating branches without fully exploring them. Effective pruning is the key to fast optimization.
Pruning Technique 1: Cost-Based Pruning (Branch and Bound)
If we've found a complete plan with cost 1000, we can prune any partial plan whose cost already exceeds 1000:
Best_known_cost = 1000
Partial plan (A ⋈ B) has cost 800
→ Continue exploring; might find plan with total cost < 1000
Partial plan (A ⋈ C) has cost 1100
→ PRUNE! No extension can have cost < 1000
This is highly effective when we find good plans early. Order of exploration matters—try promising branches first.
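A runnable sketch of branch and bound over left-deep join orders. The cost model is invented, but it has the property that matters: a partial plan's cost only grows as it is extended, which is exactly what makes the pruning rule safe:

```python
import heapq

CARD = {"A": 1000, "B": 200, "C": 50}   # toy row counts (assumed)

def cost(order):
    """Cumulative cost of a left-deep join order; only ever grows as the
    order is extended, which is what makes pruning safe."""
    total, rows = 0, 0
    for t in order:
        if rows:
            total += rows + CARD[t]            # join-step cost (invented model)
            rows = max(rows * CARD[t] // 100, 1)
        else:
            rows = CARD[t]
    return total

def branch_and_bound(tables):
    best_cost, best_plan = float("inf"), None
    frontier = [(0, (t,)) for t in tables]     # (cost, partial join order)
    heapq.heapify(frontier)                    # explore cheapest partials first
    while frontier:
        c, order = heapq.heappop(frontier)
        if c >= best_cost:
            continue                           # PRUNE: cannot beat best complete plan
        if len(order) == len(tables):
            best_cost, best_plan = c, order
            continue
        for t in tables:
            if t not in order:
                nxt = order + (t,)
                if cost(nxt) < best_cost:      # prune before pushing, too
                    heapq.heappush(frontier, (cost(nxt), nxt))
    return best_cost, best_plan

print(branch_and_bound(sorted(CARD)))
```

Best-first exploration means a cheap complete plan is found early, after which most of the remaining frontier is discarded without being expanded.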
Pruning Technique 2: Interesting Orders
Interesting orders are sort orderings that benefit downstream operations. A plan producing sorted output might be more expensive initially but saves sorting later:
SELECT * FROM Orders o JOIN Customers c ON o.customer_id = c.id
ORDER BY c.name;
If a plan reads Customers in name order (say, via an index on name) and the join preserves that order, the final ORDER BY c.name costs nothing, even though the ordered access path may be more expensive than a plain scan.
Without tracking this, we'd prune the "expensive" sorted plan, missing the overall optimal.
Pruning Technique 3: Physical Property Propagation
Beyond sorting, plans have physical properties: sort order, partitioning or distribution across nodes, data placement, and grouping.
Plans with compatible properties may enable more efficient parent operators. Optimizers track property requirements and prune plans that can't satisfy them.
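A small sketch of property handling: a plan carries a sort-order property, and a parent that requires an order either gets it for free or pays for a Sort enforcer. All names and costs are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    desc: str
    cost: float
    sort_order: tuple = ()   # one physical property; real optimizers also
                             # track partitioning, node placement, etc.

def satisfy(plan, required_order, sort_cost=100.0):
    """Return a plan delivering `required_order`, inserting a Sort enforcer
    if the child cannot provide the property itself (costs are invented)."""
    if plan.sort_order[:len(required_order)] == required_order:
        return plan                      # property already satisfied for free
    return Plan(f"Sort({plan.desc})", plan.cost + sort_cost, required_order)

cheap_unsorted = Plan("SeqScan(Customers)", cost=50.0)
pricier_sorted = Plan("IdxScan(Customers, name_idx)", cost=120.0,
                      sort_order=("name",))

a = satisfy(cheap_unsorted, ("name",))   # Sort(SeqScan(...)), cost 150.0
b = satisfy(pricier_sorted, ("name",))   # unchanged, cost 120.0
print(min((a, b), key=lambda p: p.cost).desc)
```

Once the enforcer cost is charged, the "expensive" index scan beats the cheap scan plus sort, which is exactly why plans with useful properties must survive pruning.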
Aggressive pruning risks missing the optimal plan. Conservative pruning wastes optimization time. Good optimizers balance aggressiveness with safety, using heuristics tuned for typical workloads while ensuring correctness.
A powerful optimization concept is grouping logically equivalent expressions into equivalence classes. All members of a class produce the same result but may have different physical implementations and costs.
Example equivalence class:
For the expression (A ⋈ B) ⋈ C, the equivalence class contains:
Logically equivalent forms:
(A ⋈ B) ⋈ C
(B ⋈ A) ⋈ C
A ⋈ (B ⋈ C)
A ⋈ (C ⋈ B)
(A ⋈ C) ⋈ B
(C ⋈ A) ⋈ B
...
Each form can use different physical operators, creating many physical plans per logical form.
The memoization table:
Transformation-based optimizers (Volcano/Cascades) maintain a data structure called the memo that groups expressions by equivalence:
Memo:
Group 1: {A} → Plans: SeqScan(A), IdxScan(A, idx1)
Group 2: {B} → Plans: SeqScan(B)
Group 3: {C} → Plans: SeqScan(C), IdxScan(C, idx2)
Group 4: {A ⋈ B} → Plans: Group1 NLJ Group2, Group1 HJ Group2, ...
Group 5: {B ⋈ C} → Plans: Group2 NLJ Group3, Group2 SMJ Group3, ...
Group 6: {A ⋈ B ⋈ C} → Plans: Group4 ⋈ Group3, Group1 ⋈ Group5, ...
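The memo above can be sketched as a small data structure. Group contents and costs are illustrative (the join cost of 1012.5 is borrowed from the worked example later on this page):

```python
from collections import defaultdict

class Memo:
    """Minimal memo: logical groups keyed by the set of relations covered,
    each holding alternative physical plans (name, cost)."""
    def __init__(self):
        self.groups = defaultdict(list)   # frozenset -> [(plan, cost), ...]

    def add(self, rels, plan, cost):
        self.groups[frozenset(rels)].append((plan, cost))

    def best(self, rels):
        return min(self.groups[frozenset(rels)], key=lambda pc: pc[1])

memo = Memo()
memo.add({"A"}, "SeqScan(A)", 2000)
memo.add({"A"}, "IdxScan(A, idx1)", 900)
memo.add({"B"}, "SeqScan(B)", 60.5)
# A join group builds on its input groups' best plans, not concrete subtrees
plan_a, cost_a = memo.best({"A"})
plan_b, cost_b = memo.best({"B"})
memo.add({"A", "B"}, f"HashJoin({plan_a}, {plan_b})", cost_a + cost_b + 1012.5)
print(memo.best({"A", "B"}))
```

Because the join group references its input groups rather than specific subplans, improving a group's best plan automatically benefits every plan built on top of it.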
Benefits of equivalence classes: each logical subexpression is optimized only once; alternative plans share optimized subgroups instead of duplicating them; and the memo encodes an enormous number of plans far more compactly than listing each one explicitly.
The Cascades optimizer framework (used by SQL Server, Greenplum, and others) formalizes memo-based optimization. Columbia extended it with further optimizations. These systems treat optimization as graph search over equivalence classes with memoization.
In practice, "best plan" isn't always one-dimensional. Optimizers must sometimes balance multiple objectives:
Objective 1: Total Work (Total Cost)
Minimize total resource consumption—the traditional optimization goal.
Objective 2: Time to First Row (Startup Cost)
For interactive queries or cursors, users want the first result quickly, even if the total query takes longer.
Plan A: Startup=1000, Total=5000 (sort-then-stream)
Plan B: Startup=10, Total=8000 (stream immediately)
For LIMIT 10, Plan B might be better despite higher total cost.
Objective 3: Memory Usage
In memory-constrained environments, a plan using less memory might be preferred even if slower:
Plan A: Cost=1000, Memory=500MB (fits in buffer pool)
Plan B: Cost=800, Memory=2GB (forces spilling to disk)
The actual execution cost of Plan B might exceed Plan A due to spill overhead.
Objective 4: Resource Contention
In multi-tenant systems, some plans are "friendlier" to other queries:
Plan A: Uses all parallel workers, finishes in 10s
Plan B: Uses 2 workers, finishes in 40s
Plan B might be preferred if other critical queries need resources.
| Property | When It Matters | Trade-off Example |
|---|---|---|
| Startup cost | Interactive queries, LIMIT clauses | Sort vs. nested loop for first rows |
| Peak memory | Concurrent queries, small buffer pools | Hash join vs. sort-merge join |
| Parallelism | Shared systems, cloud resources | Parallel scan vs. sequential scan |
| Output ordering | ORDER BY, merge joins downstream | Sorted access vs. unsorted + sort |
| Blocking behavior | Streaming pipelines, timeouts | Hash aggregation vs. sorted aggregation |
With multiple objectives, there may be no single "best" plan. A set of Pareto-optimal plans exists where no plan beats another in ALL objectives. Advanced optimizers maintain this frontier and make context-dependent choices.
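Computing the Pareto frontier is straightforward. The sketch below uses made-up plans with three objectives (startup cost, total cost, peak memory), lower being better on each:

```python
def dominates(p, q):
    """p dominates q if p is no worse on every objective and better on one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def pareto_frontier(plans):
    """Keep only plans not dominated by any other plan."""
    return [p for p in plans if not any(dominates(q, p) for q in plans if q != p)]

# (startup_cost, total_cost, peak_memory_mb) — toy numbers
plans = {
    "sort_then_stream": (1000, 5000, 200),
    "stream_now":       (10,   8000, 50),
    "hash_heavy":       (50,   4000, 2000),
    "strictly_worse":   (1200, 9000, 2100),
}
front = pareto_frontier(list(plans.values()))
print([name for name, objs in plans.items() if objs in front])
```

Only the plan that loses on every objective is eliminated; the other three each win on at least one dimension, so the final choice among them depends on context (LIMIT clause, memory pressure, and so on).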
When the optimizer compares two plans, it follows a systematic process:
Step 1: Compute individual operator costs
Plan: HashJoin(SeqScan(A), IdxScan(B))
SeqScan(A):
IO_cost = pages(A) = 1000
CPU_cost = rows(A) × tuple_cost = 100,000 × 0.01 = 1000
Total = 1000 + 1000 = 2000
IdxScan(B, predicate):
IO_cost = index_height + selectivity × pages(B)
= 3 + 0.01 × 5000 = 53
CPU_cost = estimated_rows × (tuple_cost + predicate_cost)
= 500 × (0.01 + 0.005) = 7.5
Total = 53 + 7.5 = 60.5
HashJoin:
Build_cost = rows(smaller_input) × hash_cost = 500 × 0.02 = 10
Probe_cost = rows(larger_input) × probe_cost = 100,000 × 0.01 = 1000
Output_cost = matched_rows × output_cost = 500 × 0.005 = 2.5
Total = 10 + 1000 + 2.5 = 1012.5
Step 2: Sum the plan tree
Total_Plan_Cost = 2000 + 60.5 + 1012.5 = 3073
Step 3: Compare with alternatives
Alternative: NestedLoop(IdxScan(A), IdxScan(B))
IdxScan(A, predicate):
Total = 73 (similar to B)
IdxScan(B) × rows from A:
Per lookup = 4 pages (index + data)
Total lookups = 7,300 (rows from A)
Total = 7,300 × 4 = 29,200
Total_Plan_Cost = 73 + 29,200 + output_cost ≈ 29,300
Winner: HashJoin plan (cost 3,073 vs 29,300)
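The three-step comparison above is easy to encode. These cost functions mirror the formulas in the worked example, with the same (illustrative) constants:

```python
def seqscan_cost(pages, rows, tuple_cost=0.01):
    return pages + rows * tuple_cost

def idxscan_cost(index_height, selectivity, pages, est_rows,
                 tuple_cost=0.01, predicate_cost=0.005):
    io = index_height + selectivity * pages
    cpu = est_rows * (tuple_cost + predicate_cost)
    return io + cpu

def hashjoin_cost(build_rows, probe_rows, matched_rows,
                  hash_cost=0.02, probe_cost=0.01, output_cost=0.005):
    return (build_rows * hash_cost + probe_rows * probe_cost
            + matched_rows * output_cost)

scan_a = seqscan_cost(pages=1000, rows=100_000)              # 2000
scan_b = idxscan_cost(3, 0.01, 5000, 500)                    # 60.5
join = hashjoin_cost(build_rows=500, probe_rows=100_000,
                     matched_rows=500)                        # 1012.5
hash_total = scan_a + scan_b + join

nl_total = 73 + 7_300 * 4    # index nested loop alternative, per the text
print(hash_total, "HashJoin wins" if hash_total < nl_total else "NL wins")
```

Summing the tree reproduces the plan cost of 3073 and confirms the hash join plan beats the nested loop alternative by roughly a factor of ten.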
More detailed cost models (accounting for cache effects, NUMA, CPU pipeline stalls) produce more accurate comparisons but slower optimization. Most production optimizers use models that balance accuracy with speed, with perhaps 5-10 cost components per operator.
What happens when two plans have similar costs? This situation is more common—and more problematic—than it might seem.
The instability problem:
If Plan A costs 1000 and Plan B costs 1010, the optimizer chooses A. But cost estimates routinely carry errors far larger than that 1% gap, so the "winner" is essentially arbitrary, and a minor statistics refresh or small data change can flip the choice.
This leads to plan instability—the same query getting different plans across minor changes, causing unpredictable performance.
Strategies for handling close plans include: treating plans within a small cost tolerance as ties and breaking ties toward the more robust plan; plan hysteresis, keeping the current plan unless a new candidate is significantly cheaper; and pinning known-good plans via baselines so a switch requires explicit approval.
The "good enough" philosophy:
Because cost estimates are approximate, optimizers increasingly adopt satisficing rather than optimizing: stop searching once a plan looks good enough, prefer plans whose cost degrades gracefully if the estimates are wrong, and avoid switching plans for marginal predicted gains.
This reduces optimization time while producing plans that are rarely significantly worse than optimal.
Plan regression—when a plan change makes performance worse—is a major operational concern. It often happens after statistics updates, database upgrades, or configuration changes. Production systems need monitoring to detect regressions and mechanisms to revert to known-good plans.
Optimization itself takes time. For OLTP queries that should execute in milliseconds, spending seconds on optimization is absurd. For complex analytics running for hours, thorough optimization is worthwhile.
The optimization cost balance:
Total_Time = Optimization_Time + Execution_Time
We want to minimize Total_Time, not just Execution_Time.
Example trade-off:
Scenario 1 (OLTP):
Quick optimize (1ms) → Plan cost 100 → Total: 101ms
Full optimize (50ms) → Plan cost 95 → Total: 145ms
→ Quick optimization wins!
Scenario 2 (Analytics):
Quick optimize (1ms) → Plan cost 600,000 → Total: 600,001ms
Full optimize (500ms) → Plan cost 60,000 → Total: 60,500ms
→ Full optimization wins by 10×!
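The arithmetic behind the two scenarios, as a trivial sketch:

```python
def total_time(opt_ms, exec_ms):
    """Optimization time is part of the bill the user pays."""
    return opt_ms + exec_ms

# Scenario 1 (OLTP): a cheap query punishes heavyweight optimization
oltp = {"quick": total_time(1, 100), "full": total_time(50, 95)}
# Scenario 2 (analytics): an expensive query repays it many times over
olap = {"quick": total_time(1, 600_000), "full": total_time(500, 60_000)}

print(min(oltp, key=oltp.get), min(olap, key=olap.get))
```

The same two strategies produce opposite winners purely because of the query's expected execution time, which is why optimization budgets should scale with query cost.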
| Strategy | How It Works | Use Case |
|---|---|---|
| Fixed timeout | Stop optimization after N milliseconds | Simple, predictable overhead |
| Proportional budget | Spend up to X% of expected query time optimizing | Adaptive to query complexity |
| Plan count limit | Evaluate at most N plans, then pick best | Controls worst-case optimization time |
| Tiered optimization | Quick checks first; deep analysis only if needed | Best of both worlds |
| Estimated complexity | Scale budget with query size (tables, predicates) | Natural adaptation to query size |
PostgreSQL's GEQO example:
For queries joining many tables, PostgreSQL switches from exhaustive search to Genetic Query Optimization (GEQO), a randomized algorithm: it encodes each join order as an individual, evolves a population through selection and recombination over several generations, and returns the best plan found rather than a provably optimal one. By default GEQO takes over once a query joins geqo_threshold (12) or more tables.
This pragmatic approach recognizes that perfect is the enemy of good—and fast.
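The spirit of randomized optimization can be shown with a heavily simplified sketch: sample random join orders under a toy cost model and keep the cheapest. Real GEQO evolves a population with selection and recombination rather than sampling independently; the tables and the cost model here are invented:

```python
import random

# Toy tables with deterministic, made-up row counts
CARD = {f"T{i}": (i * 37) % 500 + 10 for i in range(12)}

def plan_cost(order):
    """Crude left-deep cost model (invented): each join step pays the
    current intermediate size plus the next table's size."""
    total, rows = 0, 0
    for t in order:
        if rows:
            total += rows + CARD[t]
            rows = max(rows * CARD[t] // 1000, 1)
        else:
            rows = CARD[t]
    return total

def randomized_search(tables, samples=2000, seed=42):
    """Sample random join orders, keep the best one found."""
    rng = random.Random(seed)
    best = min((tuple(rng.sample(tables, len(tables)))
                for _ in range(samples)),
               key=plan_cost)
    return best, plan_cost(best)

order, best_cost = randomized_search(sorted(CARD))
print(best_cost)
```

A few thousand samples usually land on a plan far better than an arbitrary ordering, without ever enumerating the roughly 10^10 join trees a 12-table query admits.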
For repeated queries with different parameters, prepared statements amortize optimization cost across many executions. Optimize once thoroughly, cache the plan, reuse indefinitely (with caveats around parameter sensitivity).
We've explored how optimizers navigate the vast space of execution plans. Let's consolidate the key insights:
- The plan space grows super-exponentially with query size across join order, join algorithm, and access method dimensions.
- Optimizers enumerate candidates bottom-up, top-down, or by transformation, each with distinct trade-offs.
- Cost-based pruning and interesting orders keep the search tractable without discarding useful plans.
- Equivalence classes and the memo avoid redundant optimization of shared subexpressions.
- "Best" is often multi-objective: startup cost, memory, parallelism, and ordering all matter.
- Optimization time itself must be budgeted against expected execution time.
What's next:
With plan comparison in hand, we're ready to examine the crown jewel of cost-based optimization: Dynamic Programming. The next page explores how this algorithmic technique enables optimal join ordering in polynomial time—turning an apparently intractable problem into a tractable one.
You now understand how optimizers systematically explore and compare execution plans. The combination of smart enumeration, aggressive pruning, and equivalence-based sharing makes optimization feasible despite the astronomical search space.