Database Management SystemsQuery Optimization

Optimization Overview

LevelAdvanced

Duration60 mins

TopicQuery Optimization

5 / 5

Time Constraints

The Optimization Paradox

Here's a paradox at the heart of query optimization: the more time you spend optimizing, the less time you have for executing—but the better your plan, the faster execution will be.

For a query that will run for hours, spending 30 seconds finding an optimal plan is an excellent investment. But for a query that should complete in 10 milliseconds, spending 30 seconds optimizing is absurd—you've delayed the result by 3000×.

The optimizer must navigate this trade-off dynamically, adapting its effort to the expected query complexity. Too little optimization → poor plan quality. Too much optimization → wasted time. Getting this balance right is a crucial but often overlooked aspect of query processing.

What You Will Learn

By the end of this page, you will understand the optimization time vs. plan quality trade-off, strategies for bounding optimization time (budgets, timeouts, thresholds), adaptive optimization approaches that scale effort with complexity, multi-phase optimization that progressively refines plans, and practical implications for query design and system configuration.

The Time-Quality Trade-off

Optimization time and plan quality are fundamentally linked. More time allows more thorough search, better cost estimation, and consideration of more alternatives. But optimization has diminishing returns:

1.1 The Law of Diminishing Returns

First 10ms of optimization: Prune obviously bad plans (Cartesian products, missing predicates)
Next 100ms: Find the optimal join order via DP for moderate queries
Next 1s: Consider additional access paths, parallel alternatives
Beyond 1s: Marginal improvements, exploring rare edge cases

For most practical queries, 95% of the benefit comes from the first 100ms of optimization. Spending more time produces incrementally smaller improvements.

1.2 Characterizing Total Query Cost

Total response time = Optimization time + Execution time

$$T_{total} = T_{opt} + T_{exec}(plan)$$

Where T_exec depends on the plan chosen. The optimizer seeks to minimize T_total, not just T_exec.

Optimal Optimization Time: In principle, we want to optimize until: $$\frac{dT_{exec}}{dT_{opt}} = -1$$

That is, until each additional optimization second saves less than one execution second. In practice, this is impossible to measure in advance.

Optimization Time vs. Query Duration
Query Type	Expected Execution	Ideal Opt Time	Strategy
Point lookup	1-5 ms	<1 ms	Cached plan, no re-optimize
Simple OLTP	10-100 ms	1-5 ms	Quick heuristics, limited DP
Moderate analytic	1-10 sec	10-100 ms	Full DP, parallel options
Complex report	1-30 min	1-5 sec	Exhaustive search, all options
Batch/ETL	1+ hour	1-30 sec	Maximum optimization, hints review

The 1% Rule

A practical guideline: optimization time should be <1% of expected execution time. For a 10-second query, spend <100ms optimizing. For a 1-hour query, up to 36 seconds is acceptable. This keeps optimization overhead reasonable while allowing thorough search for complex queries.

Strategies for Bounding Optimization Time

Optimizers use various mechanisms to prevent optimization from taking too long:

2.1 Static Thresholds

The simplest approach: switch strategies based on query characteristics.

if (num_tables <= 5):
    use full_dp_optimization()
elif (num_tables <= 12):
    use left_deep_dp_optimization()
else:
    use genetic_optimization()  # or heuristic

PostgreSQL Example:

geqo_threshold = 12: Switch to genetic optimizer above 12 tables
from_collapse_limit = 8: Stop collapsing subqueries above 8 tables

2.2 Time Budgets

Allocate a maximum optimization time:

budget = estimate_execution_time(query) * 0.01  # 1% of expected
start_time = now()

while (has_more_plans() and elapsed() < budget):
    evaluate_next_plan()

return best_plan_so_far()

Challenge: Estimating expected execution time before optimization completes—a chicken-and-egg problem. Systems often use query complexity heuristics (number of tables, presence of aggregations) instead.

2.3 Plan Count Limits

Limit the number of plans evaluated:

max_plans = 10000
plans_evaluated = 0

while (has_more_plans() and plans_evaluated < max_plans):
    evaluate_next_plan()
    plans_evaluated++

Advantage: Predictable bound on optimization time. Disadvantage: Doesn't adapt to plan evaluation speed or query importance.

2.4 Pruning Escalation

Start with light pruning, escalate if optimization is taking too long:

pruning_level = 1  # 1=light, 2=medium, 3=aggressive

while (has_more_plans()):
    if (elapsed() > phase_budget[pruning_level]):
        pruning_level++
        apply_stricter_pruning(pruning_level)
        if (pruning_level > 3):
            break
    evaluate_next_plan()

This adaptively increases aggressiveness as time pressure mounts.

2.5 Timeout with Best-Effort

Hard timeout after maximum allowed time:

timeout = 30 seconds  # Configurable

try:
    with_timeout(timeout):
        plan = full_optimization(query)
except TimeoutError:
    plan = best_plan_so_far or heuristic_plan(query)

return plan

This guarantees bounded optimization time but may return suboptimal plans.

Configuration Matters

These thresholds are typically configurable. Incorrect settings cause problems: too aggressive → poor plans for complex queries; too permissive → slow optimization for simple queries. Default settings are tuned for typical workloads but may need adjustment for specific applications.

Multi-Phase Optimization

Rather than a single optimization pass, many systems use multi-phase approaches that progressively improve plan quality:

3.1 Quick Phase → Thorough Phase

Phase 1: Quick Optimization (Low Cost)

Apply heuristic transformations (push selections, eliminate redundant operations)
Use simplified cost model
Consider only common access paths
Find a "good enough" plan quickly

Phase 2: Thorough Optimization (If Time Permits)

Full DP enumeration
Detailed cost model with accurate statistics
Consider all access paths and algorithms
Refine or replace Phase 1 plan

Decision Point: Proceed to Phase 2 only if:

Phase 1 plan has high estimated cost (optimization is worth it)
Time budget allows
Query is not trivial (e.g., single table, simple filter)

3.2 The SQL Server Approach

SQL Server uses three optimization phases:

Phase	Description	When Triggered
0	Simple plans	Single table queries, trivial cases
1	Transaction	Simple joins, OLTP patterns
2	Full optimization	Complex queries, many tables

Each phase has increasing sophistication and cost. The optimizer escalates only when the previous phase's best plan exceeds a cost threshold.

multi_phase_optimizer.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
def multi_phase_optimize(query):
    """
    Multi-phase optimization with escalation based on plan quality.
    """
    # Phase 0: Trivial plan check
    if is_trivial_query(query):
        return trivial_plan(query)
    
    # Phase 1: Quick heuristic optimization
    phase1_plan = heuristic_optimize(query)
    phase1_cost = estimate_cost(phase1_plan)
    
    # Escalation thresholds
    PHASE2_THRESHOLD = 10000  # Arbitrary cost units
    TIME_BUDGET_PHASE2 = 100  # milliseconds
    
    if phase1_cost < PHASE2_THRESHOLD:
        # Good enough, don't waste more time
        log(f"Phase 1 sufficient: cost={phase1_cost}")
        return phase1_plan
    
    # Phase 2: Full optimization
    log(f"Escalating to Phase 2: Phase 1 cost={phase1_cost}")
    start_time = time.now()
    
    try:
        phase2_plan = dp_optimize_with_timeout(
            query, 
            timeout=TIME_BUDGET_PHASE2
        )
        phase2_cost = estimate_cost(phase2_plan)
        
        if phase2_cost < phase1_cost:
            log(f"Phase 2 improved: {phase1_cost} -> {phase2_cost}")
            return phase2_plan
        else:
            log(f"Phase 2 didn't improve, using Phase 1")
            return phase1_plan
            
    except TimeoutError:
        log(f"Phase 2 timeout, using Phase 1")
        return phase1_plan
 
def heuristic_optimize(query):
    """Quick heuristic-based plan generation."""
    plan = parse_to_logical_plan(query)
    
    # Apply standard heuristics
    plan = push_selections_down(plan)
    plan = push_projections_down(plan)
    plan = order_joins_by_selectivity(plan)
    plan = assign_default_algorithms(plan)
    
    return plan
 
def dp_optimize_with_timeout(query, timeout):
    """Full DP optimization with timeout."""
    deadline = time.now() + timeout
    best_plan = None
    
    for subset in enumerate_connected_subsets(query.tables):
        if time.now() > deadline:
            raise TimeoutError("DP optimization timeout")
        
        best_plan = dp_optimize_subset(subset, best_plan)
    
    return best_plan

Phase Escalation Metadata

Query execution tools often show which optimization phase was used. In SQL Server, SET STATISTICS PROFILE ON reveals the 'Statement Optimization Level.' Seeing 'FULL' for a simple query suggests optimization overhead; seeing 'TRIVIAL' for a complex query suggests potentially suboptimal plans.

Adaptive Optimization Strategies

The most sophisticated systems adapt their optimization approach based on query characteristics and system state:

4.1 Query Complexity Metrics

Systems compute complexity scores to guide optimization effort:

Join count: More joins = more complex
Predicate complexity: OR conditions, functions, subqueries
Aggregation presence: GROUP BY, DISTINCT, window functions
Data volume: Estimated rows touched
Index availability: More indexes = more options = more search

$$complexity = \alpha \cdot joins + \beta \cdot predicates + \gamma \cdot aggregates + ...$$

Optimization time budget scales with complexity score.

4.2 System Load Adaptation

When the system is under heavy load, optimize less aggressively:

load_factor = current_cpu_usage / max_cpu_capacity

if load_factor > 0.8:
    optimization_budget *= 0.5  # Cut optimization time in half
    use_cached_plans_more_aggressively()

Rationale: During high load, faster optimization (even with worse plans) reduces queueing delays and helps overall throughput.

4.3 Query Priority / Importance

Allocate more optimization resources to important queries:

Priority Indicators:

User-specified hints (HIGH_PRIORITY, OPTIMIZE_FOR_AD_HOC)
Query frequency (optimize more for frequently repeated queries)
Resource governor classifications
Service tier (premium users get more optimization)

4.4 Feedback-Driven Adaptation

Learn from past queries to adjust optimization:

Successful Patterns:

Queries that completed quickly after optimization
Plans that matched cardinality estimates

Failed Patterns:

Queries where actual time >> estimated time
Plans with large cardinality estimation errors

Adaptive systems might:

Spend more time on query patterns that previously had poor plans
Reuse plans that worked well on similar queries
Flag queries that frequently have estimation problems

4.5 Incremental Improvement

For repeated queries or prepared statements:

if (same_query_executed_before):
    previous_plan = get_cached_plan(query)
    previous_actual_cost = get_actual_cost(previous_plan)
    
    if (previous_actual_cost was_acceptable):
        return previous_plan  # Don't re-optimize
    else:
        re_optimize with (higher_budget, updated_statistics)

This amortizes optimization cost over multiple executions while adapting to change.

Adaptive Query Processing

Modern systems go beyond adaptive optimization to adaptive execution. Oracle's Adaptive Query Processing and SQL Server's Adaptive Joins adjust plans during execution based on actual cardinalities. This compensates for estimation errors without paying re-optimization cost.

Plan Caching and Reuse

The best way to reduce optimization time is to avoid optimization altogether by reusing previously computed plans:

5.1 Plan Cache Architecture

Query String → Hash → Cache Lookup
                        ↓
              Hit: Return cached plan
                        ↓ (Miss)
              Optimize → Store in cache → Return plan

Cache Key Components:

Query text (normalized: whitespace, case)
Parameter values (for some systems) or parameter types
Session settings (ANSI nulls, date format, etc.)
Schema version (invalidate on DDL changes)

5.2 Prepared Statements

Prepared statements explicitly separate parsing/optimization from execution:

-- Prepare once (optimization happens here)
PREPARE order_lookup AS 
    SELECT * FROM orders WHERE customer_id = $1;

-- Execute many times (no re-optimization)
EXECUTE order_lookup(12345);
EXECUTE order_lookup(67890);
EXECUTE order_lookup(11111);

Benefits:

Optimization cost amortized over many executions
Reduced SQL injection risk
Reduced parsing overhead
Stable plans across executions

5.3 Cache Invalidation

Cached plans must be invalidated when they become stale:

Invalidation Triggers:

Schema changes (DDL on involved tables)
Statistics updates (ANALYZE)
Index creation/deletion
Configuration changes affecting optimization
Explicit cache flush commands

Lazy Recompilation: Some systems recompile lazily:

Mark plan as "needs recompile"
Recompile on next execution (not immediately)
Avoids thundering herd of simultaneous recompilations

5.4 Cache Management

Memory Pressure:

Plans consume memory from a shared cache
Eviction policies (LRU, cost-weighted) manage cache size
Rarely-used plans evicted first

Plan Cache Bloat:

Auto-parameterization helps: WHERE id = 5 → WHERE id = @p1
Non-parameterized literals create unique cache entries
Ad-hoc workloads can flood the cache with single-use plans

-- SQL Server: Check plan cache efficiency
SELECT 
    objtype,
    COUNT(*) as plan_count,
    SUM(size_in_bytes) / 1024 / 1024 as size_mb
FROM sys.dm_exec_cached_plans
GROUP BY objtype;

The Parameter Sniffing Problem Revisited

Plan caching creates the parameter sniffing problem: the first parameter value determines the plan for all subsequent values. Solutions include: RECOMPILE hint (re-optimize every time), OPTIMIZE FOR hint (optimize for specific or 'average' value), or plan guides that force specific plans.

Timeout and Fallback Handling

When optimization exceeds time limits, the system must have fallback strategies:

6.1 Graceful Degradation

Attempt full optimization
    ↓ (timeout)
Use best plan found so far
    ↓ (no complete plan)
Fall back to heuristic plan
    ↓ (heuristics fail)
Fall back to left-to-right join order
    ↓ (catastrophic failure)
Report error and abort

Each fallback level produces a valid (if potentially suboptimal) plan.

6.2 Partial Results Usage

Modern optimizers track partial results during enumeration:

Best complete plan so far: Even if search is incomplete
Best partial plans per subset: Can be assembled into complete plan
Cost bounds discovered: Enable early termination with confidence

6.3 Parallel Optimization

Some systems parallelize optimization itself:

Multiple threads explore different regions of search space
First thread to find a "good enough" plan terminates others
Especially useful for very large queries

6.4 User-Facing Implications

Timeout During Optimization:

Query may still execute (with suboptimal plan)
User sees slower-than-expected performance, not error
May return results but log a warning

Complete Optimization Failure:

Return error to user
Suggest query simplification
Recommend explicit hints or query decomposition

Optimization Timeout Behaviors by System
System	Default Behavior	Configuration
PostgreSQL	No hard timeout, switch to GEQO at threshold	geqo_threshold, join_collapse_limit
MySQL	Use heuristic after optimizer threshold	optimizer_search_depth, optimizer_prune_level
SQL Server	Timeout with best-so-far plan	Cost threshold for parallelism, resource governor
Oracle	Adaptive optimization phases	optimizer_mode, optimizer hints

Invisible Degradation

Optimization timeout is often silent—the query executes without errors but slower than it could be. Monitoring optimization time and plan quality is essential to detect when fallbacks are being triggered too frequently, indicating workload complexity exceeding system capabilities.

Practical Implications for Developers

Understanding time constraints has practical implications for application development:

7.1 Query Design for Optimal Optimization

Keep Joins Manageable:

Try to keep queries under 10-12 table joins
Break complex queries into CTEs or subqueries (may parallelize optimization)
Use views or materialized views for common join patterns

Simplify Predicates:

Avoid overly complex WHERE clauses
Minimize OR conditions that multiply search space
Use helper columns instead of expression-heavy predicates

Use Explicit Hints When Needed:

-- When you know optimizer time is wasted
SELECT /*+ NO_REORDER */ * FROM ...

-- When you know the optimal join
SELECT /*+ LEADING(a b) */ * FROM a JOIN b ON ...

7.2 Prepared Statements Strategy

Use Prepared Statements For:

Frequently executed queries (OLTP patterns)
Queries with varying parameters but stable structure
Security-sensitive queries (SQL injection prevention)

Consider Ad-hoc For:

One-time analytical queries
Queries with extreme parameter variability requiring different plans
Exploratory queries during development

7.3 Monitoring Optimization Performance

Key Metrics to Track:

Compilation/optimization time per query
Cache hit ratio for plan cache
Number of plans in cache
Frequency of optimization timeout/fallback

-- PostgreSQL: Check planning time
EXPLAIN (ANALYZE, COSTS) SELECT ...;
-- Look for "Planning Time: X ms"

-- SQL Server: Check compilation time
SET STATISTICS TIME ON;
-- Look for "SQL Server parse and compile time"

7.4 Configuration Tuning

For Complex OLAP Workloads:

Increase optimization time budgets
Raise table join thresholds
Consider query hints for repeatably complex queries

For High-Volume OLTP:

Aggressive plan caching
Lower optimization thresholds
Focus on prepared statement reuse

The Root Cause Isn't Always Optimization

When queries are slow, developers often blame optimization time. In reality, optimization rarely exceeds a few hundred milliseconds even for complex queries. Long optimization times are usually symptoms of missing statistics, overly complex queries, or configuration problems—not fundamental optimizer limitations.

The Future of Time-Constrained Optimization

Research and industry are pushing toward better solutions for the optimization time problem:

8.1 Machine Learning for Quick Planning

ML models can shortcut traditional optimization:

Plan Prediction: Train models to predict good plans directly from query features
Cost Prediction: Replace expensive cost computation with learned models
Steering: Use ML to guide search toward promising regions

Example (Neo, Bao systems):

Neural network trained on historical query/plan pairs
Given new query, predicts plan in milliseconds
Falls back to traditional optimizer for novel patterns

8.2 Query Compilation

Instead of interpreting plans, compile them to native code:

Preparation takes longer (compilation overhead)
Execution is much faster (native code vs. interpretation)
Reuse amortizes cost (compile once, run many times)

Systems like HyPer, LegoBase, and Peloton demonstrate order-of-magnitude speedups for complex queries.

8.3 Continuous / Background Optimization

Decouple optimization from query submission:

Background re-optimization: Continuously improve cached plans
Workload-aware optimization: Optimize based on predicted future queries
What-if analysis: Pre-compute plans for likely parameter values

8.4 Distributed and Cloud Considerations

Cloud databases face unique time constraint challenges:

Serverless: Cold start optimization overhead matters more
Multi-tenancy: Fair optimization resource allocation
Elastic scaling: Optimization must consider variable resources
Separation of compute/storage: Different cost model trade-offs

8.5 The Enduring Trade-off

Despite advances, the fundamental trade-off remains:

More optimization time → Better plans → Faster execution
Less optimization time → Faster start → Possibly slower execution

No approach completely eliminates this trade-off; each technique shifts the curve rather than eliminating it. Understanding this trade-off helps database professionals make informed decisions about query design, system configuration, and performance expectations.

Module Complete

With this page, we've completed the Optimization Overview module. You now understand the fundamental goal of query optimization, the search space of possible plans, enumeration and selection strategies, and the time constraints that bound the optimizer's work. This foundation prepares you for the detailed coverage of specific optimization techniques in the following modules.

Summary: Module 1 - Optimization Overview

This module has provided a comprehensive overview of query optimization—the sophisticated process that transforms declarative SQL into efficient execution plans. Let's consolidate the key insights from all five pages:

Module Key Takeaways

•The Optimization Goal: Find the minimum-cost execution plan while guaranteeing semantic correctness—a deceptively simple statement hiding immense complexity.
•The Search Space: Exponentially vast (n! to 4^n for n tables), requiring systematic exploration strategies rather than exhaustive search.
•Plan Enumeration: Dynamic programming and heuristics navigate the search space, with interesting orders and pruning making enumeration tractable.
•Cost-Based Selection: Cardinality estimation and cost models enable plan comparison, though estimation errors remain a persistent challenge.
•Time Constraints: Optimizers balance thoroughness against time pressure, using multi-phase strategies, caching, and adaptive techniques.

What's Next in Chapter 35:

The remaining modules dive deeper into specific optimization techniques:

Module 2: Equivalence Rules — The mathematical transformations that generate equivalent plans
Module 3: Heuristic Optimization — Rule-based optimization techniques
Module 4: Cost-Based Optimization — Deep dive into cost models and selection
Module 5: Join Ordering — The most critical optimization challenge in detail
Module 6: Statistics and Histograms — The data that powers estimation

With the foundation established in this module, you're prepared to understand the detailed techniques that make modern query optimization so effective.

Page Complete

Congratulations on completing Module 1: Optimization Overview! You now have a solid foundation in query optimization concepts. This understanding will serve you well both in optimizing real-world queries and in appreciating the sophisticated engineering behind every database query execution.

5 / 5

Loading learning content...

Database Management SystemsQuery Optimization

Optimization Overview

LevelAdvanced

Duration60 mins

TopicQuery Optimization

5 / 5

Time Constraints

The Optimization Paradox

Here's a paradox at the heart of query optimization: the more time you spend optimizing, the less time you have for executing—but the better your plan, the faster execution will be.

What You Will Learn

The Time-Quality Trade-off

1.1 The Law of Diminishing Returns

First 10ms of optimization: Prune obviously bad plans (Cartesian products, missing predicates)
Next 100ms: Find the optimal join order via DP for moderate queries
Next 1s: Consider additional access paths, parallel alternatives
Beyond 1s: Marginal improvements, exploring rare edge cases

For most practical queries, 95% of the benefit comes from the first 100ms of optimization. Spending more time produces incrementally smaller improvements.

1.2 Characterizing Total Query Cost

Total response time = Optimization time + Execution time

$$T_{total} = T_{opt} + T_{exec}(plan)$$

Where T_exec depends on the plan chosen. The optimizer seeks to minimize T_total, not just T_exec.

Optimal Optimization Time: In principle, we want to optimize until: $$\frac{dT_{exec}}{dT_{opt}} = -1$$

That is, until each additional optimization second saves less than one execution second. In practice, this is impossible to measure in advance.

Optimization Time vs. Query Duration
Query Type	Expected Execution	Ideal Opt Time	Strategy
Point lookup	1-5 ms	<1 ms	Cached plan, no re-optimize
Simple OLTP	10-100 ms	1-5 ms	Quick heuristics, limited DP
Moderate analytic	1-10 sec	10-100 ms	Full DP, parallel options
Complex report	1-30 min	1-5 sec	Exhaustive search, all options
Batch/ETL	1+ hour	1-30 sec	Maximum optimization, hints review

The 1% Rule

Strategies for Bounding Optimization Time

Optimizers use various mechanisms to prevent optimization from taking too long:

2.1 Static Thresholds

The simplest approach: switch strategies based on query characteristics.

if (num_tables <= 5):
    use full_dp_optimization()
elif (num_tables <= 12):
    use left_deep_dp_optimization()
else:
    use genetic_optimization()  # or heuristic

PostgreSQL Example:

geqo_threshold = 12: Switch to genetic optimizer above 12 tables
from_collapse_limit = 8: Stop collapsing subqueries above 8 tables

2.2 Time Budgets

Allocate a maximum optimization time:

budget = estimate_execution_time(query) * 0.01  # 1% of expected
start_time = now()

while (has_more_plans() and elapsed() < budget):
    evaluate_next_plan()

return best_plan_so_far()

2.3 Plan Count Limits

Limit the number of plans evaluated:

max_plans = 10000
plans_evaluated = 0

while (has_more_plans() and plans_evaluated < max_plans):
    evaluate_next_plan()
    plans_evaluated++

Advantage: Predictable bound on optimization time. Disadvantage: Doesn't adapt to plan evaluation speed or query importance.

2.4 Pruning Escalation

Start with light pruning, escalate if optimization is taking too long:

pruning_level = 1  # 1=light, 2=medium, 3=aggressive

while (has_more_plans()):
    if (elapsed() > phase_budget[pruning_level]):
        pruning_level++
        apply_stricter_pruning(pruning_level)
        if (pruning_level > 3):
            break
    evaluate_next_plan()

This adaptively increases aggressiveness as time pressure mounts.

2.5 Timeout with Best-Effort

Hard timeout after maximum allowed time:

timeout = 30 seconds  # Configurable

try:
    with_timeout(timeout):
        plan = full_optimization(query)
except TimeoutError:
    plan = best_plan_so_far or heuristic_plan(query)

return plan

This guarantees bounded optimization time but may return suboptimal plans.

Configuration Matters

Multi-Phase Optimization

Rather than a single optimization pass, many systems use multi-phase approaches that progressively improve plan quality:

3.1 Quick Phase → Thorough Phase

Phase 1: Quick Optimization (Low Cost)

Apply heuristic transformations (push selections, eliminate redundant operations)
Use simplified cost model
Consider only common access paths
Find a "good enough" plan quickly

Phase 2: Thorough Optimization (If Time Permits)

Full DP enumeration
Detailed cost model with accurate statistics
Consider all access paths and algorithms
Refine or replace Phase 1 plan

Decision Point: Proceed to Phase 2 only if:

Phase 1 plan has high estimated cost (optimization is worth it)
Time budget allows
Query is not trivial (e.g., single table, simple filter)

3.2 The SQL Server Approach

SQL Server uses three optimization phases:

Phase	Description	When Triggered
0	Simple plans	Single table queries, trivial cases
1	Transaction	Simple joins, OLTP patterns
2	Full optimization	Complex queries, many tables

Each phase has increasing sophistication and cost. The optimizer escalates only when the previous phase's best plan exceeds a cost threshold.

multi_phase_optimizer.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
def multi_phase_optimize(query):
    """
    Multi-phase optimization with escalation based on plan quality.
    """
    # Phase 0: Trivial plan check
    if is_trivial_query(query):
        return trivial_plan(query)
    
    # Phase 1: Quick heuristic optimization
    phase1_plan = heuristic_optimize(query)
    phase1_cost = estimate_cost(phase1_plan)
    
    # Escalation thresholds
    PHASE2_THRESHOLD = 10000  # Arbitrary cost units
    TIME_BUDGET_PHASE2 = 100  # milliseconds
    
    if phase1_cost < PHASE2_THRESHOLD:
        # Good enough, don't waste more time
        log(f"Phase 1 sufficient: cost={phase1_cost}")
        return phase1_plan
    
    # Phase 2: Full optimization
    log(f"Escalating to Phase 2: Phase 1 cost={phase1_cost}")
    start_time = time.now()
    
    try:
        phase2_plan = dp_optimize_with_timeout(
            query, 
            timeout=TIME_BUDGET_PHASE2
        )
        phase2_cost = estimate_cost(phase2_plan)
        
        if phase2_cost < phase1_cost:
            log(f"Phase 2 improved: {phase1_cost} -> {phase2_cost}")
            return phase2_plan
        else:
            log(f"Phase 2 didn't improve, using Phase 1")
            return phase1_plan
            
    except TimeoutError:
        log(f"Phase 2 timeout, using Phase 1")
        return phase1_plan
 
def heuristic_optimize(query):
    """Quick heuristic-based plan generation."""
    plan = parse_to_logical_plan(query)
    
    # Apply standard heuristics
    plan = push_selections_down(plan)
    plan = push_projections_down(plan)
    plan = order_joins_by_selectivity(plan)
    plan = assign_default_algorithms(plan)
    
    return plan
 
def dp_optimize_with_timeout(query, timeout):
    """Full DP optimization with timeout."""
    deadline = time.now() + timeout
    best_plan = None
    
    for subset in enumerate_connected_subsets(query.tables):
        if time.now() > deadline:
            raise TimeoutError("DP optimization timeout")
        
        best_plan = dp_optimize_subset(subset, best_plan)
    
    return best_plan

Phase Escalation Metadata

Adaptive Optimization Strategies

The most sophisticated systems adapt their optimization approach based on query characteristics and system state:

4.1 Query Complexity Metrics

Systems compute complexity scores to guide optimization effort:

Join count: More joins = more complex
Predicate complexity: OR conditions, functions, subqueries
Aggregation presence: GROUP BY, DISTINCT, window functions
Data volume: Estimated rows touched
Index availability: More indexes = more options = more search

$$complexity = \alpha \cdot joins + \beta \cdot predicates + \gamma \cdot aggregates + ...$$

Optimization time budget scales with complexity score.

4.2 System Load Adaptation

When the system is under heavy load, optimize less aggressively:

load_factor = current_cpu_usage / max_cpu_capacity

if load_factor > 0.8:
    optimization_budget *= 0.5  # Cut optimization time in half
    use_cached_plans_more_aggressively()

Rationale: During high load, faster optimization (even with worse plans) reduces queueing delays and helps overall throughput.

4.3 Query Priority / Importance

Allocate more optimization resources to important queries:

Priority Indicators:

User-specified hints (HIGH_PRIORITY, OPTIMIZE_FOR_AD_HOC)
Query frequency (optimize more for frequently repeated queries)
Resource governor classifications
Service tier (premium users get more optimization)

4.4 Feedback-Driven Adaptation

Learn from past queries to adjust optimization:

Successful Patterns:

Queries that completed quickly after optimization
Plans that matched cardinality estimates

Failed Patterns:

Queries where actual time >> estimated time
Plans with large cardinality estimation errors

Adaptive systems might:

Spend more time on query patterns that previously had poor plans
Reuse plans that worked well on similar queries
Flag queries that frequently have estimation problems

4.5 Incremental Improvement

For repeated queries or prepared statements:

if (same_query_executed_before):
    previous_plan = get_cached_plan(query)
    previous_actual_cost = get_actual_cost(previous_plan)
    
    if (previous_actual_cost was_acceptable):
        return previous_plan  # Don't re-optimize
    else:
        re_optimize with (higher_budget, updated_statistics)

This amortizes optimization cost over multiple executions while adapting to change.

Adaptive Query Processing

Plan Caching and Reuse

The best way to reduce optimization time is to avoid optimization altogether by reusing previously computed plans:

5.1 Plan Cache Architecture

Query String → Hash → Cache Lookup
                        ↓
              Hit: Return cached plan
                        ↓ (Miss)
              Optimize → Store in cache → Return plan

Cache Key Components:

Query text (normalized: whitespace, case)
Parameter values (for some systems) or parameter types
Session settings (ANSI nulls, date format, etc.)
Schema version (invalidate on DDL changes)

5.2 Prepared Statements

Prepared statements explicitly separate parsing/optimization from execution:

-- Prepare once (optimization happens here)
PREPARE order_lookup AS 
    SELECT * FROM orders WHERE customer_id = $1;

-- Execute many times (no re-optimization)
EXECUTE order_lookup(12345);
EXECUTE order_lookup(67890);
EXECUTE order_lookup(11111);

Benefits:

Optimization cost amortized over many executions
Reduced SQL injection risk
Reduced parsing overhead
Stable plans across executions

5.3 Cache Invalidation

Cached plans must be invalidated when they become stale:

Invalidation Triggers:

Schema changes (DDL on involved tables)
Statistics updates (ANALYZE)
Index creation/deletion
Configuration changes affecting optimization
Explicit cache flush commands

Lazy Recompilation: Some systems recompile lazily:

Mark plan as "needs recompile"
Recompile on next execution (not immediately)
Avoids thundering herd of simultaneous recompilations

5.4 Cache Management

Memory Pressure:

Plans consume memory from a shared cache
Eviction policies (LRU, cost-weighted) manage cache size
Rarely-used plans evicted first

Plan Cache Bloat:

Auto-parameterization helps: WHERE id = 5 → WHERE id = @p1
Non-parameterized literals create unique cache entries
Ad-hoc workloads can flood the cache with single-use plans

-- SQL Server: Check plan cache efficiency
SELECT 
    objtype,
    COUNT(*) as plan_count,
    SUM(size_in_bytes) / 1024 / 1024 as size_mb
FROM sys.dm_exec_cached_plans
GROUP BY objtype;

The Parameter Sniffing Problem Revisited

Timeout and Fallback Handling

When optimization exceeds time limits, the system must have fallback strategies:

6.1 Graceful Degradation

Attempt full optimization
    ↓ (timeout)
Use best plan found so far
    ↓ (no complete plan)
Fall back to heuristic plan
    ↓ (heuristics fail)
Fall back to left-to-right join order
    ↓ (catastrophic failure)
Report error and abort

Each fallback level produces a valid (if potentially suboptimal) plan.

6.2 Partial Results Usage

Modern optimizers track partial results during enumeration:

Best complete plan so far: Even if search is incomplete
Best partial plans per subset: Can be assembled into complete plan
Cost bounds discovered: Enable early termination with confidence

6.3 Parallel Optimization

Some systems parallelize optimization itself:

Multiple threads explore different regions of search space
First thread to find a "good enough" plan terminates others
Especially useful for very large queries

6.4 User-Facing Implications

Timeout During Optimization:

Query may still execute (with suboptimal plan)
User sees slower-than-expected performance, not error
May return results but log a warning

Complete Optimization Failure:

Return error to user
Suggest query simplification
Recommend explicit hints or query decomposition

Optimization Timeout Behaviors by System
System	Default Behavior	Configuration
PostgreSQL	No hard timeout, switch to GEQO at threshold	geqo_threshold, join_collapse_limit
MySQL	Use heuristic after optimizer threshold	optimizer_search_depth, optimizer_prune_level
SQL Server	Timeout with best-so-far plan	Cost threshold for parallelism, resource governor
Oracle	Adaptive optimization phases	optimizer_mode, optimizer hints

Invisible Degradation

Practical Implications for Developers

Understanding time constraints has practical implications for application development:

7.1 Query Design for Optimal Optimization

Keep Joins Manageable:

Try to keep queries under 10-12 table joins
Break complex queries into CTEs or subqueries (may parallelize optimization)
Use views or materialized views for common join patterns

Simplify Predicates:

Avoid overly complex WHERE clauses
Minimize OR conditions that multiply search space
Use helper columns instead of expression-heavy predicates

Use Explicit Hints When Needed:

-- When you know optimizer time is wasted
SELECT /*+ NO_REORDER */ * FROM ...

-- When you know the optimal join
SELECT /*+ LEADING(a b) */ * FROM a JOIN b ON ...

7.2 Prepared Statements Strategy

Use Prepared Statements For:

Frequently executed queries (OLTP patterns)
Queries with varying parameters but stable structure
Security-sensitive queries (SQL injection prevention)

Consider Ad-hoc For:

One-time analytical queries
Queries with extreme parameter variability requiring different plans
Exploratory queries during development

7.3 Monitoring Optimization Performance

Key Metrics to Track:

Compilation/optimization time per query
Cache hit ratio for plan cache
Number of plans in cache
Frequency of optimization timeout/fallback

-- PostgreSQL: Check planning time
EXPLAIN (ANALYZE, COSTS) SELECT ...;
-- Look for "Planning Time: X ms"

-- SQL Server: Check compilation time
SET STATISTICS TIME ON;
-- Look for "SQL Server parse and compile time"

7.4 Configuration Tuning

For Complex OLAP Workloads:

Increase optimization time budgets
Raise table join thresholds
Consider query hints for repeatably complex queries

For High-Volume OLTP:

Aggressive plan caching
Lower optimization thresholds
Focus on prepared statement reuse

The Root Cause Isn't Always Optimization

The Future of Time-Constrained Optimization

Research and industry are pushing toward better solutions for the optimization time problem:

8.1 Machine Learning for Quick Planning

ML models can shortcut traditional optimization:

Plan Prediction: Train models to predict good plans directly from query features
Cost Prediction: Replace expensive cost computation with learned models
Steering: Use ML to guide search toward promising regions

Example (Neo, Bao systems):

Neural network trained on historical query/plan pairs
Given new query, predicts plan in milliseconds
Falls back to traditional optimizer for novel patterns

8.2 Query Compilation

Instead of interpreting plans, compile them to native code:

Preparation takes longer (compilation overhead)
Execution is much faster (native code vs. interpretation)
Reuse amortizes cost (compile once, run many times)

Systems like HyPer, LegoBase, and Peloton demonstrate order-of-magnitude speedups for complex queries.

8.3 Continuous / Background Optimization

Decouple optimization from query submission:

Background re-optimization: Continuously improve cached plans
Workload-aware optimization: Optimize based on predicted future queries
What-if analysis: Pre-compute plans for likely parameter values

8.4 Distributed and Cloud Considerations

Cloud databases face unique time constraint challenges:

Serverless: Cold start optimization overhead matters more
Multi-tenancy: Fair optimization resource allocation
Elastic scaling: Optimization must consider variable resources
Separation of compute/storage: Different cost model trade-offs

8.5 The Enduring Trade-off

Despite advances, the fundamental trade-off remains:

More optimization time → Better plans → Faster execution
Less optimization time → Faster start → Possibly slower execution

Module Complete

Summary: Module 1 - Optimization Overview

Module Key Takeaways

•The Optimization Goal: Find the minimum-cost execution plan while guaranteeing semantic correctness—a deceptively simple statement hiding immense complexity.
•The Search Space: Exponentially vast (n! to 4^n for n tables), requiring systematic exploration strategies rather than exhaustive search.
•Plan Enumeration: Dynamic programming and heuristics navigate the search space, with interesting orders and pruning making enumeration tractable.
•Cost-Based Selection: Cardinality estimation and cost models enable plan comparison, though estimation errors remain a persistent challenge.
•Time Constraints: Optimizers balance thoroughness against time pressure, using multi-phase strategies, caching, and adaptive techniques.

What's Next in Chapter 35:

The remaining modules dive deeper into specific optimization techniques:

Module 2: Equivalence Rules — The mathematical transformations that generate equivalent plans
Module 3: Heuristic Optimization — Rule-based optimization techniques
Module 4: Cost-Based Optimization — Deep dive into cost models and selection
Module 5: Join Ordering — The most critical optimization challenge in detail
Module 6: Statistics and Histograms — The data that powers estimation

With the foundation established in this module, you're prepared to understand the detailed techniques that make modern query optimization so effective.

Page Complete

5 / 5