Here's a paradox at the heart of query optimization: the more time you spend optimizing, the less time you have for executing—but the better your plan, the faster execution will be.
For a query that will run for hours, spending 30 seconds finding an optimal plan is an excellent investment. But for a query that should complete in 10 milliseconds, spending 30 seconds optimizing is absurd—you've delayed the result by 3000×.
The optimizer must navigate this trade-off dynamically, adapting its effort to the expected query complexity. Too little optimization → poor plan quality. Too much optimization → wasted time. Getting this balance right is a crucial but often overlooked aspect of query processing.
By the end of this page, you will understand the optimization time vs. plan quality trade-off, strategies for bounding optimization time (budgets, timeouts, thresholds), adaptive optimization approaches that scale effort with complexity, multi-phase optimization that progressively refines plans, and practical implications for query design and system configuration.
Optimization time and plan quality are fundamentally linked. More time allows more thorough search, better cost estimation, and consideration of more alternatives. But optimization has diminishing returns:
For most practical queries, 95% of the benefit comes from the first 100ms of optimization. Spending more time produces incrementally smaller improvements.
Total response time = Optimization time + Execution time
$$T_{total} = T_{opt} + T_{exec}(plan)$$
Where T_exec depends on the plan chosen. The optimizer seeks to minimize T_total, not just T_exec.
Optimal Optimization Time: In principle, we want to optimize until: $$\frac{dT_{exec}}{dT_{opt}} = -1$$
That is, until each additional optimization second saves less than one execution second. In practice, this is impossible to measure in advance.
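This trade-off can be made concrete with a toy model. The diminishing-returns curve below is invented for illustration (it is not a real cost model), but it captures the shape described above: most of the execution-time benefit arrives early, and past some point each extra millisecond of optimization saves less than a millisecond of execution.

```python
# Toy model of the optimization/execution trade-off.
# t_exec() is a made-up diminishing-returns curve, not a real cost model.

def t_exec(t_opt_ms: float, base_ms: float = 10_000.0) -> float:
    """Execution time shrinks as optimization time grows, with diminishing returns."""
    return base_ms * (0.05 + 0.95 / (1.0 + t_opt_ms / 5.0))

def t_total(t_opt_ms: float) -> float:
    """Total response time = optimization time + execution time."""
    return t_opt_ms + t_exec(t_opt_ms)

# Scan candidate budgets and pick the one minimizing total response time.
budgets = [0, 1, 10, 100, 1_000, 10_000]
best = min(budgets, key=t_total)
for b in budgets:
    print(f"t_opt={b:>6} ms  t_total={t_total(b):>10.1f} ms")
print("best budget:", best, "ms")  # the sweet spot is neither 0 nor unlimited
```

Running this shows total time falling sharply at first, bottoming out, then rising again as optimization overhead dominates, which is exactly the balance point the derivative condition describes.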
| Query Type | Expected Execution | Ideal Opt Time | Strategy |
|---|---|---|---|
| Point lookup | 1-5 ms | <1 ms | Cached plan, no re-optimize |
| Simple OLTP | 10-100 ms | 1-5 ms | Quick heuristics, limited DP |
| Moderate analytic | 1-10 sec | 10-100 ms | Full DP, parallel options |
| Complex report | 1-30 min | 1-5 sec | Exhaustive search, all options |
| Batch/ETL | 1+ hour | 1-30 sec | Maximum optimization, hints review |
A practical guideline: optimization time should be <1% of expected execution time. For a 10-second query, spend <100ms optimizing. For a 1-hour query, up to 36 seconds is acceptable. This keeps optimization overhead reasonable while allowing thorough search for complex queries.
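The 1% guideline is trivial to encode; this sketch just restates the rule above as a function (the 1% fraction is the guideline from the text, not a hard system constant):

```python
def optimization_budget_ms(expected_exec_ms: float, fraction: float = 0.01) -> float:
    """Optimization budget as a fraction (default 1%) of expected execution time."""
    return expected_exec_ms * fraction

print(optimization_budget_ms(10_000))      # 10-second query -> 100 ms budget
print(optimization_budget_ms(3_600_000))   # 1-hour query -> 36,000 ms (36 s) budget
```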
Optimizers use various mechanisms to prevent optimization from taking too long:
The simplest approach: switch strategies based on query characteristics.
if num_tables <= 5:
    full_dp_optimization()
elif num_tables <= 12:
    left_deep_dp_optimization()
else:
    genetic_optimization()  # or heuristic
PostgreSQL Example:
- geqo_threshold = 12: switch to the genetic optimizer above 12 tables
- from_collapse_limit = 8: stop collapsing subqueries into the join list above 8 items

Allocate a maximum optimization time:
budget = estimate_execution_time(query) * 0.01 # 1% of expected
start_time = now()
while (has_more_plans() and elapsed() < budget):
evaluate_next_plan()
return best_plan_so_far()
Challenge: Estimating expected execution time before optimization completes—a chicken-and-egg problem. Systems often use query complexity heuristics (number of tables, presence of aggregations) instead.
Limit the number of plans evaluated:
max_plans = 10000
plans_evaluated = 0
while (has_more_plans() and plans_evaluated < max_plans):
evaluate_next_plan()
    plans_evaluated += 1
Advantage: Predictable bound on optimization time. Disadvantage: Doesn't adapt to plan evaluation speed or query importance.
Start with light pruning, escalate if optimization is taking too long:
pruning_level = 1 # 1=light, 2=medium, 3=aggressive
while (has_more_plans()):
if (elapsed() > phase_budget[pruning_level]):
pruning_level++
apply_stricter_pruning(pruning_level)
if (pruning_level > 3):
break
evaluate_next_plan()
This adaptively increases aggressiveness as time pressure mounts.
Hard timeout after maximum allowed time:
timeout = 30  # seconds, configurable
try:
with_timeout(timeout):
plan = full_optimization(query)
except TimeoutError:
plan = best_plan_so_far or heuristic_plan(query)
return plan
This guarantees bounded optimization time but may return suboptimal plans.
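The `with_timeout` construct above is pseudocode. One hedged way to get the same effect in Python is to run optimization in a worker thread and bound the wait on its result; `slow_optimize` and `heuristic_plan` below are stand-ins, not real optimizer functions:

```python
import concurrent.futures
import time

def slow_optimize(query: str) -> str:
    """Stand-in for full optimization that may take too long."""
    time.sleep(0.5)
    return f"optimal-plan({query})"

def heuristic_plan(query: str) -> str:
    """Stand-in for a cheap fallback plan."""
    return f"heuristic-plan({query})"

def optimize_with_timeout(query: str, timeout_s: float) -> str:
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_optimize, query)
        try:
            return future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            future.cancel()  # best effort; a running worker may still finish
            return heuristic_plan(query)

print(optimize_with_timeout("SELECT ...", timeout_s=0.05))  # falls back to the heuristic plan
```

Note that a timed-out worker thread cannot be forcibly killed in Python; a real optimizer would instead check its deadline cooperatively inside the search loop, as the budget pseudocode earlier does.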
These thresholds are typically configurable. Incorrect settings cause problems: too aggressive → poor plans for complex queries; too permissive → slow optimization for simple queries. Default settings are tuned for typical workloads but may need adjustment for specific applications.
Rather than a single optimization pass, many systems use multi-phase approaches that progressively improve plan quality:
Phase 1: Quick Optimization (Low Cost)
Phase 2: Thorough Optimization (If Time Permits)
Decision Point: Proceed to Phase 2 only if the Phase 1 plan's estimated cost exceeds a configured threshold; otherwise the quick plan is good enough to execute immediately.
SQL Server uses three optimization phases:
| Phase | Description | When Triggered |
|---|---|---|
| 0 | Simple plans | Single table queries, trivial cases |
| 1 | Transaction | Simple joins, OLTP patterns |
| 2 | Full optimization | Complex queries, many tables |
Each phase has increasing sophistication and cost. The optimizer escalates only when the previous phase's best plan exceeds a cost threshold.
def multi_phase_optimize(query):
    """
    Multi-phase optimization with escalation based on plan quality.
    """
    # Phase 0: Trivial plan check
    if is_trivial_query(query):
        return trivial_plan(query)

    # Phase 1: Quick heuristic optimization
    phase1_plan = heuristic_optimize(query)
    phase1_cost = estimate_cost(phase1_plan)

    # Escalation thresholds
    PHASE2_THRESHOLD = 10000   # Arbitrary cost units
    TIME_BUDGET_PHASE2 = 100   # milliseconds

    if phase1_cost < PHASE2_THRESHOLD:
        # Good enough, don't waste more time
        log(f"Phase 1 sufficient: cost={phase1_cost}")
        return phase1_plan

    # Phase 2: Full optimization
    log(f"Escalating to Phase 2: Phase 1 cost={phase1_cost}")
    start_time = time.now()
    try:
        phase2_plan = dp_optimize_with_timeout(
            query, timeout=TIME_BUDGET_PHASE2
        )
        phase2_cost = estimate_cost(phase2_plan)
        if phase2_cost < phase1_cost:
            log(f"Phase 2 improved: {phase1_cost} -> {phase2_cost}")
            return phase2_plan
        else:
            log(f"Phase 2 didn't improve, using Phase 1")
            return phase1_plan
    except TimeoutError:
        log(f"Phase 2 timeout, using Phase 1")
        return phase1_plan


def heuristic_optimize(query):
    """Quick heuristic-based plan generation."""
    plan = parse_to_logical_plan(query)
    # Apply standard heuristics
    plan = push_selections_down(plan)
    plan = push_projections_down(plan)
    plan = order_joins_by_selectivity(plan)
    plan = assign_default_algorithms(plan)
    return plan


def dp_optimize_with_timeout(query, timeout):
    """Full DP optimization with timeout."""
    deadline = time.now() + timeout
    best_plan = None
    for subset in enumerate_connected_subsets(query.tables):
        if time.now() > deadline:
            raise TimeoutError("DP optimization timeout")
        best_plan = dp_optimize_subset(subset, best_plan)
    return best_plan

Query execution tools often show which optimization phase was used. In SQL Server, SET STATISTICS PROFILE ON reveals the 'Statement Optimization Level.' Seeing 'FULL' for a simple query suggests optimization overhead; seeing 'TRIVIAL' for a complex query suggests potentially suboptimal plans.
The most sophisticated systems adapt their optimization approach based on query characteristics and system state:
Systems compute complexity scores to guide optimization effort:
$$complexity = \alpha \cdot joins + \beta \cdot predicates + \gamma \cdot aggregates + ...$$
Optimization time budget scales with complexity score.
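A minimal sketch of such a score, mapping the α, β, γ weights from the formula to code. The weight values and the budget scaling constants here are invented for illustration, not taken from any real system:

```python
def complexity_score(joins: int, predicates: int, aggregates: int,
                     alpha: float = 5.0, beta: float = 1.0,
                     gamma: float = 3.0) -> float:
    """complexity = alpha*joins + beta*predicates + gamma*aggregates (weights invented)."""
    return alpha * joins + beta * predicates + gamma * aggregates

def budget_ms(score: float, per_point_ms: float = 0.5,
              floor_ms: float = 1.0, ceiling_ms: float = 5_000.0) -> float:
    """Scale the optimization budget linearly with complexity, clamped to a range."""
    return max(floor_ms, min(score * per_point_ms, ceiling_ms))

simple = complexity_score(joins=1, predicates=2, aggregates=0)      # 7.0
report = complexity_score(joins=12, predicates=30, aggregates=4)    # 102.0
print(budget_ms(simple), budget_ms(report))  # small budget vs. larger budget
```

The clamp matters: without a floor, trivial queries would get a near-zero budget and skip even cheap heuristics; without a ceiling, a pathological query could consume unbounded optimization time.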
When the system is under heavy load, optimize less aggressively:
load_factor = current_cpu_usage / max_cpu_capacity
if load_factor > 0.8:
optimization_budget *= 0.5 # Cut optimization time in half
use_cached_plans_more_aggressively()
Rationale: During high load, faster optimization (even with worse plans) reduces queueing delays and helps overall throughput.
Allocate more optimization resources to important queries:
Priority Indicators:
Learn from past queries to adjust optimization:
Successful Patterns:
Failed Patterns:
Adaptive systems might:
For repeated queries or prepared statements:
if same_query_executed_before(query):
    previous_plan = get_cached_plan(query)
    previous_actual_cost = get_actual_cost(previous_plan)
    if was_acceptable(previous_actual_cost):
        return previous_plan  # Don't re-optimize
    else:
        return re_optimize(query, budget=higher_budget, stats=updated_statistics)
This amortizes optimization cost over multiple executions while adapting to change.
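The reuse-or-re-optimize decision can be sketched as a comparison of a cached plan's actual runtime against the optimizer's estimate. The 2x slowdown threshold below is an arbitrary illustrative choice:

```python
def should_reoptimize(estimated_ms: float, actual_ms: float,
                      slowdown_factor: float = 2.0) -> bool:
    """Re-optimize when the cached plan ran much slower than the optimizer predicted."""
    return actual_ms > estimated_ms * slowdown_factor

# Cached plan was estimated at 50 ms:
print(should_reoptimize(50, 60))    # close to the estimate -> keep the cached plan
print(should_reoptimize(50, 400))   # 8x slower than predicted -> re-optimize
```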
Modern systems go beyond adaptive optimization to adaptive execution. Oracle's Adaptive Query Processing and SQL Server's Adaptive Joins adjust plans during execution based on actual cardinalities. This compensates for estimation errors without paying re-optimization cost.
The best way to reduce optimization time is to avoid optimization altogether by reusing previously computed plans:
Query String → Hash → Cache Lookup
↓
Hit: Return cached plan
↓ (Miss)
Optimize → Store in cache → Return plan
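The lookup flow above can be sketched with a dictionary keyed by a hash of the query text plus session settings that affect planning. The `search_path` component and the helper names here are hypothetical; real systems include more state in the key:

```python
import hashlib

class PlanCache:
    """Toy plan cache: key = hash(planner-relevant settings + query text)."""

    def __init__(self):
        self._cache = {}

    def _key(self, query: str, search_path: str) -> str:
        return hashlib.sha256(f"{search_path}|{query}".encode()).hexdigest()

    def get_or_optimize(self, query: str, search_path: str, optimize):
        key = self._key(query, search_path)
        if key in self._cache:            # hit: skip optimization entirely
            return self._cache[key]
        plan = optimize(query)            # miss: pay optimization cost once
        self._cache[key] = plan
        return plan

    def invalidate_all(self):
        """Called when schema or statistics change and cached plans go stale."""
        self._cache.clear()

cache = PlanCache()
calls = []
def fake_optimizer(q):
    calls.append(q)
    return f"plan({q})"

cache.get_or_optimize("SELECT 1", "public", fake_optimizer)
cache.get_or_optimize("SELECT 1", "public", fake_optimizer)  # cache hit
print(len(calls))  # the optimizer ran only once
```

Including session settings in the key is essential: the same SQL text can require a different plan when, for example, the schema search path resolves table names differently.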
Cache Key Components:
Prepared statements explicitly separate parsing/optimization from execution:
-- Prepare once (optimization happens here)
PREPARE order_lookup AS
SELECT * FROM orders WHERE customer_id = $1;
-- Execute many times (no re-optimization)
EXECUTE order_lookup(12345);
EXECUTE order_lookup(67890);
EXECUTE order_lookup(11111);
Benefits:
Cached plans must be invalidated when they become stale:
Invalidation Triggers:
Lazy Recompilation: Some systems recompile lazily:
Memory Pressure:
Plan Cache Bloat:
Parameterization rewrites literal values into placeholders so that similar queries share one cached plan: WHERE id = 5 → WHERE id = @p1

-- SQL Server: Check plan cache efficiency
SELECT
objtype,
COUNT(*) as plan_count,
SUM(size_in_bytes) / 1024 / 1024 as size_mb
FROM sys.dm_exec_cached_plans
GROUP BY objtype;
Plan caching creates the parameter sniffing problem: the first parameter value determines the plan for all subsequent values. Solutions include: RECOMPILE hint (re-optimize every time), OPTIMIZE FOR hint (optimize for specific or 'average' value), or plan guides that force specific plans.
When optimization exceeds time limits, the system must have fallback strategies:
Attempt full optimization
↓ (timeout)
Use best plan found so far
↓ (no complete plan)
Fall back to heuristic plan
↓ (heuristics fail)
Fall back to left-to-right join order
↓ (catastrophic failure)
Report error and abort
Each fallback level produces a valid (if potentially suboptimal) plan.
Modern optimizers track partial results during enumeration:
Some systems parallelize optimization itself:
Timeout During Optimization:
Complete Optimization Failure:
| System | Default Behavior | Configuration |
|---|---|---|
| PostgreSQL | No hard timeout, switch to GEQO at threshold | geqo_threshold, join_collapse_limit |
| MySQL | Use heuristic after optimizer threshold | optimizer_search_depth, optimizer_prune_level |
| SQL Server | Timeout with best-so-far plan | Cost threshold for parallelism, resource governor |
| Oracle | Adaptive optimization phases | optimizer_mode, optimizer hints |
Optimization timeout is often silent—the query executes without errors but slower than it could be. Monitoring optimization time and plan quality is essential to detect when fallbacks are being triggered too frequently, indicating workload complexity exceeding system capabilities.
Understanding time constraints has practical implications for application development:
Keep Joins Manageable:
Simplify Predicates:
Use Explicit Hints When Needed:
-- When you know optimizer time is wasted
SELECT /*+ NO_REORDER */ * FROM ...
-- When you know the optimal join
SELECT /*+ LEADING(a b) */ * FROM a JOIN b ON ...
Use Prepared Statements For:
Consider Ad-hoc For:
Key Metrics to Track:
-- PostgreSQL: Check planning time
EXPLAIN (ANALYZE, COSTS) SELECT ...;
-- Look for "Planning Time: X ms"
-- SQL Server: Check compilation time
SET STATISTICS TIME ON;
-- Look for "SQL Server parse and compile time"
For Complex OLAP Workloads:
For High-Volume OLTP:
When queries are slow, developers often blame optimization time. In reality, optimization rarely exceeds a few hundred milliseconds even for complex queries. Long optimization times are usually symptoms of missing statistics, overly complex queries, or configuration problems—not fundamental optimizer limitations.
Research and industry are pushing toward better solutions for the optimization time problem:
ML models can shortcut traditional optimization:
Example (Neo, Bao systems):
Instead of interpreting plans, compile them to native code:
Systems like HyPer, LegoBase, and Peloton demonstrate order-of-magnitude speedups for complex queries.
Decouple optimization from query submission:
Cloud databases face unique time constraint challenges:
Despite advances, the fundamental trade-off remains:
More optimization time → Better plans → Faster execution
Less optimization time → Faster start → Possibly slower execution
No technique completely eliminates this trade-off; each one shifts the curve rather than removing it. Understanding this trade-off helps database professionals make informed decisions about query design, system configuration, and performance expectations.
With this page, we've completed the Optimization Overview module. You now understand the fundamental goal of query optimization, the search space of possible plans, enumeration and selection strategies, and the time constraints that bound the optimizer's work. This foundation prepares you for the detailed coverage of specific optimization techniques in the following modules.
This module has provided a comprehensive overview of query optimization—the sophisticated process that transforms declarative SQL into efficient execution plans. Let's consolidate the key insights from all five pages:
What's Next in Chapter 35:
The remaining modules dive deeper into specific optimization techniques:
With the foundation established in this module, you're prepared to understand the detailed techniques that make modern query optimization so effective.
Congratulations on completing Module 1: Optimization Overview! You now have a solid foundation in query optimization concepts. This understanding will serve you well both in optimizing real-world queries and in appreciating the sophisticated engineering behind every database query execution.