The logical query plan tells us what to compute. But there are countless ways to compute it. A simple three-table join can be executed in dozens of different ways—different join orders, different join algorithms, different access methods for each table. The difference between the best and worst plan can be orders of magnitude: seconds versus hours.
Query optimization is the process of finding an efficient execution strategy from the space of equivalent alternatives. It's one of the most intellectually challenging and practically important components of database systems—the reason modern databases can efficiently process queries against terabyte-scale datasets.
This is where database systems truly shine. A declarative query becomes an optimized execution plan, transforming user intent into machine efficiency.
By the end of this page, you will understand the query optimizer's goals and constraints, heuristic and rule-based optimization techniques, cost-based optimization principles, how statistics and cardinality estimation drive plan selection, and the exponential search space that makes optimization challenging.
To appreciate query optimization, consider how dramatically execution strategies can differ in performance.
Example: Simple Join Performance Variance
Consider joining two tables:
- customers — 100,000 rows
- orders — 10,000,000 rows (100 orders per customer)

```sql
SELECT c.name, COUNT(*)
FROM customers c JOIN orders o ON c.id = o.customer_id
GROUP BY c.name;
```
| Strategy | Approach | Operations | Estimated Time |
|---|---|---|---|
| Nested Loop (bad order) | For each order, scan customers | 10M × 100K = 1 trillion comparisons | Hours |
| Nested Loop (good order) | For each customer, index lookup orders | 100K × log(10M) = 2.4M operations | Seconds |
| Hash Join | Build hash on customers, probe with orders | 100K + 10M = 10.1M operations | Seconds |
| Merge Join (presorted) | Merge sorted streams | 100K + 10M = 10.1M operations | Seconds |
The performance gap can be 10,000× or more. A few seconds versus many hours. The optimizer's job is finding those efficient strategies automatically.
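To see which strategy a real optimizer picks, you can simply ask it. The sketch below shows the kind of plan PostgreSQL might produce for this query; the plan shape, costs, and row counts are illustrative only and depend entirely on your data and statistics:

```sql
EXPLAIN
SELECT c.name, COUNT(*)
FROM customers c JOIN orders o ON c.id = o.customer_id
GROUP BY c.name;

-- Illustrative output: the optimizer chose a hash join, building the
-- hash table on the smaller side (customers) and probing with orders.
--
-- HashAggregate  (cost=... rows=100000 ...)
--   Group Key: c.name
--   ->  Hash Join  (cost=... rows=10000000 ...)
--         Hash Cond: (o.customer_id = c.id)
--         ->  Seq Scan on orders o  (...)
--         ->  Hash  (...)
--               ->  Seq Scan on customers c  (...)
```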
The Optimization Challenge:
Optimizers don't find the theoretically optimal plan—that's often computationally infeasible. They find a 'good enough' plan quickly. A plan that's 10% slower but found in 1ms is better than the optimal plan found after 10 seconds of optimization.
Query optimization typically proceeds in stages, from fast heuristic transformations to more expensive cost-based searches.
Multi-Stage Optimization:

1. Rewrite phase: apply heuristic transformations that are almost always beneficial, such as predicate pushdown and expression simplification.
2. Plan enumeration: generate candidate physical plans, including join orders, join algorithms, and access methods.
3. Cost-based selection: estimate each candidate's cost from statistics and keep the cheapest plan.
Two Schools of Optimization:
| Approach | Description | Examples |
|---|---|---|
| Rule-Based (RBO) | Apply fixed rules based on query structure. No statistics needed. Simpler but less adaptive. | Oracle (legacy), older systems |
| Cost-Based (CBO) | Estimate costs using statistics, choose minimum-cost plan. More accurate but complex. | PostgreSQL, modern MySQL, SQL Server |
Modern systems are predominantly cost-based, but still apply heuristic rules for transformations that are universally beneficial.
Before cost-based optimization, the optimizer applies heuristic transformations—changes that are almost always beneficial regardless of data distribution.
Key Heuristic Rules:
- Selection pushdown: apply filter predicates as close to the base tables as possible, before joins (illustrated below).
- Predicate simplification: rewrite x > 5 AND x > 3 to x > 5, or 1 = 1 to TRUE.
- Constant folding: evaluate constant expressions once at planning time, so 5 + 3 becomes 8.
```sql
-- Original query
SELECT o.order_id, c.name
FROM orders o, customers c
WHERE o.customer_id = c.id
  AND c.region = 'WEST'
  AND o.year = 2024
  AND o.total > 1000;

-- After selection pushdown (conceptual):
-- Predicates pushed to their respective tables
--   Scan customers with: region = 'WEST'
--   Scan orders with:    year = 2024 AND total > 1000
--   Then join on:        customer_id = id

-- This reduces:
--   - Rows read from customers (only WEST)
--   - Rows read from orders (only 2024, >$1000)
--   - Join input sizes (fewer rows to join)

-- Before pushdown: join ALL rows, then filter
-- After pushdown: filter FIRST, then join (much smaller)

-- Example of constant folding and simplification
SELECT *
FROM products
WHERE price * 1.1 > 100        -- Rewritten to: price > 90.909...
  AND category = 'A'
  AND (active = true OR 1 = 0) -- Simplified to: active = true
  AND stock + 0 > 5;           -- Simplified to: stock > 5
```

Selection pushdown is the single most impactful heuristic. Filtering early can reduce data volumes by orders of magnitude. A predicate that eliminates 99% of rows should be applied as early as possible, not after expensive joins.
Cost-based optimization (CBO) is the heart of modern query optimization. The optimizer generates candidate plans, estimates each plan's cost, and selects the lowest-cost alternative.
The Cost Model:
Plan cost is typically measured in abstract units combining:
| Cost Component | What It Measures | Typical Weight |
|---|---|---|
| I/O Cost | Disk page reads/writes | Heavy (disk is slow) |
| CPU Cost | Tuple processing, comparisons | Moderate |
| Memory Cost | Buffer usage, sorting, hashing | Considered via algorithms |
| Network Cost | Data transfer (distributed DBs) | Heavy for distributed |
Most optimizers focus primarily on I/O cost, as disk access dominates runtime for data-intensive queries.
Cost Estimation Formulas (Simplified):
Sequential Scan Cost:
cost = (pages_in_table × seq_page_cost) + (rows × cpu_tuple_cost)
Index Scan Cost:
cost = (index_pages × random_page_cost) +
(matching_rows × index_tuple_cost) +
(heap_fetches × random_page_cost)
Nested Loop Join Cost:
cost = outer_cost + (outer_rows × inner_cost)
Hash Join Cost:
cost = outer_cost + inner_cost + (hash_build_cost) + (probe_cost)
Merge Join Cost:
cost = outer_cost + inner_cost + sort_cost_if_needed + merge_cost
These formulas require knowing row counts, page counts, and various tuning parameters.
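To make the sequential-scan formula concrete, here is a worked example using PostgreSQL's default cost parameters (seq_page_cost = 1.0, random_page_cost = 4.0, cpu_tuple_cost = 0.01; all are real, documented settings). The table sizes and index-scan figures are hypothetical, chosen only to illustrate the arithmetic:

```sql
-- Inspect the planner's cost parameters (PostgreSQL defaults shown)
SHOW seq_page_cost;     -- 1.0   (cost of one sequential page read)
SHOW random_page_cost;  -- 4.0   (cost of one random page read)
SHOW cpu_tuple_cost;    -- 0.01  (cost of processing one row)

-- Worked example: a table with 1,000 pages and 100,000 rows.
-- Sequential scan cost:
--   (1,000 pages × 1.0) + (100,000 rows × 0.01) = 1,000 + 1,000 = 2,000
--
-- An index scan fetching 50 scattered rows might instead cost roughly:
--   ~3 index pages × 4.0 + 50 heap fetches × 4.0 ≈ 212
-- which is why highly selective predicates favor the index,
-- while low-selectivity predicates favor the sequential scan.
```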
```sql
-- PostgreSQL cost estimates in EXPLAIN output
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345;

/*
QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using orders_customer_id_idx on orders
   (cost=0.43..23.45 rows=12 width=84)
         ↑     ↑       ↑       ↑
         |     |       |       └─ Estimated row width (bytes)
         |     |       └─ Estimated rows returned
         |     └─ Total cost (to get all rows)
         └─ Startup cost (before first row)
   Index Cond: (customer_id = 12345)
*/

-- More complex query with cost breakdown
EXPLAIN (COSTS ON, FORMAT TEXT)
SELECT c.name, SUM(o.total)
FROM customers c JOIN orders o ON c.id = o.customer_id
WHERE c.region = 'WEST'
GROUP BY c.name;

/*
QUERY PLAN
---------------------------------------------------------------------------------------
 HashAggregate  (cost=4523.45..4528.45 rows=500 width=40)
   Group Key: c.name
   ->  Hash Join  (cost=125.00..4498.45 rows=5000 width=40)
         Hash Cond: (o.customer_id = c.id)
         ->  Seq Scan on orders o  (cost=0.00..3500.00 rows=100000 width=12)
         ->  Hash  (cost=100.00..100.00 rows=2000 width=36)
               ->  Seq Scan on customers c  (cost=0.00..100.00 rows=2000 width=36)
                     Filter: (region = 'WEST'::text)
*/
```

Cost-based optimization requires statistics about the data. Without knowing how many rows match a predicate, cost estimates are guesswork.
Key Statistics Maintained:
| Statistic | Description | Used For |
|---|---|---|
| Row Count (n) | Total rows in table | Base cardinality |
| Distinct Values (NDV) | Unique values per column | Selectivity estimation |
| NULL Fraction | Percentage of NULL values | IS NULL predicates |
| Min/Max Values | Column value range | Range predicates |
| Histograms | Value distribution buckets | Non-uniform distributions |
| Correlation | Physical order vs. logical order | Index scan efficiency |
| Average Width | Average bytes per column | Memory estimation |
| Page Count | Disk pages used | I/O cost estimation |
Cardinality Estimation:
The optimizer must estimate how many rows each operation produces. This is cardinality estimation, and it's notoriously difficult.
Selectivity Formulas:
Equality predicate (column = value):
selectivity = 1 / distinct_values (if uniform distribution)
Range predicate (column > value):
selectivity = (max - value) / (max - min)
LIKE prefix (column LIKE 'abc%'):
selectivity = 1 / (distinct_values^0.3) (heuristic)
AND combination:
selectivity = sel1 × sel2 (assumes independence)
OR combination:
selectivity = sel1 + sel2 - (sel1 × sel2)
Join result:
cardinality = (rows1 × rows2) / max(distinct1, distinct2)
The independence assumption (AND) is often wrong—real data has correlations. This is a major source of estimation errors.
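A quick worked example shows how these formulas compose, and how the independence assumption goes wrong. The row counts and distinct-value counts below are hypothetical, chosen for illustration:

```sql
-- Hypothetical statistics: orders has 10,000,000 rows;
-- region has 4 distinct values, year has 10.

-- Equality: selectivity(region = 'WEST') = 1/4  = 0.25
-- Equality: selectivity(year = 2024)     = 1/10 = 0.10

-- AND combination (independence assumed):
--   selectivity = 0.25 × 0.10 = 0.025
--   estimate    = 10,000,000 × 0.025 = 250,000 rows

-- But if WEST orders are concentrated in recent years (correlated
-- columns), the true count could be several times higher. An
-- underestimate like that can flip the plan from a hash join to a
-- nested loop that performs terribly at the actual row count.
```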
```sql
-- View table statistics
SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname = 'orders';

-- View column statistics
SELECT attname, n_distinct, null_frac, avg_width, correlation,
       most_common_vals, most_common_freqs, histogram_bounds
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'status';

-- Update statistics (analyze table)
ANALYZE orders;

-- Analyze specific columns only
ANALYZE orders (customer_id, order_date, status);

-- Create extended statistics for correlated columns
CREATE STATISTICS orders_customer_date (dependencies)
ON customer_id, order_date FROM orders;

-- Make auto-analyze run more often on this table (lower scale factor)
ALTER TABLE orders SET (autovacuum_analyze_scale_factor = 0.01);
```

When statistics become stale (after many inserts/updates/deletes), the optimizer makes poor decisions based on outdated information. Regular ANALYZE operations are essential for production systems. Many databases auto-analyze, but may not do so frequently enough for rapidly changing tables.
The optimizer faces a combinatorial explosion: the number of possible execution plans grows exponentially with query complexity.
Join Order Explosion:
For a query joining n tables, the number of possible join orders is:
| Tables (n) | Join Orders | Approximate Count |
|---|---|---|
| 3 | 3! = 6 | 6 |
| 5 | 5! = 120 | 120 |
| 10 | 10! = 3.6M | 3,628,800 |
| 15 | 15! | 1.3 trillion |
| 20 | 20! | 2.4 × 10^18 |
For each join order, we must also choose join algorithms, access methods, and parallelism levels. The complete search space is astronomical.
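PostgreSQL happens to expose a factorial() function, so the growth shown in the table above can be reproduced with a one-line query:

```sql
-- Join-order count (left-deep permutations) for 3 to 20 tables
SELECT n, factorial(n) AS join_orders
FROM generate_series(3, 20) AS n;
```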
Search Strategies:
Optimizers use various techniques to explore this space efficiently:
| Strategy | Description | Pros/Cons |
|---|---|---|
| Exhaustive Search | Examine all plans | Optimal but infeasible for large queries |
| Dynamic Programming | Build optimal subplans, combine them | Good for left-deep trees, manageable space |
| Greedy Heuristics | Make locally optimal choices | Fast but may miss global optimum |
| Randomized Search | Random exploration with local optimization | Good for very large queries |
| Genetic Algorithms | Evolutionary approach to plan search | Used in PostgreSQL for many-table queries |
```
// Selinger-style dynamic programming optimizer (System R)
// Finds optimal left-deep join trees

function find_best_plan(tables):
    // Phase 1: Find best single-table access path for each table
    for each table T in tables:
        for each access method A (seq scan, index scan, etc.):
            cost = estimate_cost(A, T)
            if cost < best_cost[{T}]:
                best_plan[{T}] = create_plan(A, T)
                best_cost[{T}] = cost

    // Phase 2: Build up join plans incrementally
    for size = 2 to |tables|:
        for each subset S of tables with |S| = size:
            for each way to partition S into (R, {T}) where T is a single table:
                if join_predicate_exists(R, T):
                    for each join method J (nested loop, hash, merge):
                        left_plan = best_plan[R]
                        cost = best_cost[R] + join_cost(J, left_plan, T)
                        if cost < best_cost[S]:
                            best_plan[S] = create_join_plan(J, left_plan, T)
                            best_cost[S] = cost

    return best_plan[tables]

// Complexity: O(3^n) subset/partition combinations in the general case -
// still exponential but manageable. For n = 10: ~59,000 combinations to
// consider (much less than 10! = 3.6M join orders)
```

PostgreSQL uses dynamic programming for queries with up to 12 tables (configurable via geqo_threshold). Beyond that, it switches to GEQO (Genetic Query Optimizer), which uses genetic algorithms to search the plan space. This trades optimality for speed on very complex queries.
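These thresholds are exposed as ordinary settings in PostgreSQL, so you can inspect or adjust them per session:

```sql
-- Inspect the GEQO settings (PostgreSQL)
SHOW geqo;            -- on: genetic optimizer enabled
SHOW geqo_threshold;  -- default 12: use GEQO at this many FROM items or more

-- Raise the threshold so exhaustive DP is used for larger queries
-- (plans may improve, but planning time grows exponentially)
SET geqo_threshold = 14;
RESET geqo_threshold;
```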
Logical operators (join, select, project) must be mapped to physical operators—specific algorithms that implement the logical operation.
Access Method Selection:
| Logical Operation | Physical Options | Selection Criteria |
|---|---|---|
| Table Scan | Sequential Scan, Index Scan, Index-Only Scan, Bitmap Scan | Selectivity, index availability, clustering |
| Join | Nested Loop, Hash Join, Merge Join | Table sizes, join type, available memory, indexes |
| Aggregation | Plain Agg, Sorted Agg, Hash Agg | Distinct values, available memory, sort order |
| Sort | Quicksort, External Merge Sort, Index Scan | Data size, memory, existing sort order |
| Distinct | Sort + Unique, Hash Unique | Distinct count, memory, input ordering |
Join Algorithm Selection Heuristics:
| Join Type | Best When... |
|---|---|
| Nested Loop | Inner side is small or indexed; outer is highly selective |
| Hash Join | No useful indexes; equi-join; one side fits in memory |
| Merge Join | Inputs already sorted; equi-join; sort can be reused |
Index Selection:
When multiple indexes exist, the optimizer evaluates which (if any) to use:
```sql
-- Table 'orders' has indexes on:
--   - order_id (primary key)
--   - customer_id
--   - order_date
--   - (customer_id, order_date) composite

SELECT * FROM orders WHERE customer_id = 100;
-- Uses: customer_id index (direct match)

SELECT * FROM orders WHERE order_date = '2024-01-15';
-- Uses: order_date index (direct match)

SELECT * FROM orders WHERE customer_id = 100 AND order_date = '2024-01-15';
-- Uses: composite index (matches both columns)

SELECT * FROM orders WHERE order_date > '2024-01-01';
-- May use: sequential scan (if range is large)
-- Or: order_date index (if range is small)
```
```sql
-- Compare physical operator choices with forced methods

-- Default (optimizer chooses)
EXPLAIN SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;

-- Force nested loop
SET enable_hashjoin = off;
SET enable_mergejoin = off;
EXPLAIN SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;

-- Force hash join
SET enable_hashjoin = on;
SET enable_mergejoin = off;
SET enable_nestloop = off;
EXPLAIN SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;

-- Reset to defaults
RESET enable_hashjoin;
RESET enable_mergejoin;
RESET enable_nestloop;

-- Force a specific scan method
SET enable_seqscan = off;  -- forces index usage if possible
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345;
RESET enable_seqscan;
```

Optimization is expensive. Databases cache execution plans to avoid re-optimizing identical or similar queries.
Plan Cache Strategies:

Approaches vary by system: PostgreSQL caches plans per prepared statement within a session, while systems such as SQL Server and Oracle maintain a shared, cross-session plan cache keyed by normalized query text.
The Generic Plan Problem:
Prepared statements face a dilemma:
```sql
-- Prepared statement
PREPARE find_orders(text) AS
  SELECT * FROM orders WHERE status = $1;

-- With status = 'pending'   (rare,   0.1% of rows): Index scan is best
-- With status = 'completed' (common, 80%  of rows): Seq scan is best

-- A generic plan must choose ONE approach.
-- PostgreSQL uses custom plans for the first 5 executions, then evaluates
-- whether a generic plan is acceptable.
```
```sql
-- View prepared statement plan (after several executions)
PREPARE get_orders(int) AS
  SELECT * FROM orders WHERE customer_id = $1;

-- Execute multiple times to build plan statistics
EXECUTE get_orders(100);
EXECUTE get_orders(200);
EXECUTE get_orders(300);

-- View cached plan info
SELECT name, statement, generic_plans, custom_plans
FROM pg_prepared_statements;

-- Force generic plan behavior
SET plan_cache_mode = 'force_generic_plan';

-- Force custom plan behavior
SET plan_cache_mode = 'force_custom_plan';

-- Reset to auto
SET plan_cache_mode = 'auto';
```

Parameter sniffing occurs when the first execution's parameter values produce a plan that's suboptimal for subsequent values. If the first call uses status = 'pending' (rare), the cached plan uses an index scan—but then status = 'completed' (common) also gets an index scan, causing thousands of random I/Os instead of a fast sequential scan.
We've explored query optimization comprehensively. Let's consolidate the key concepts:

- Equivalent plans can differ in runtime by orders of magnitude; the optimizer's job is to find a good plan quickly, not the perfect plan slowly.
- Heuristic rewrites (selection pushdown, predicate simplification, constant folding) are applied first because they are almost always beneficial.
- Cost-based optimization estimates I/O, CPU, and (in distributed systems) network cost for candidate plans and picks the cheapest.
- Statistics and cardinality estimation drive everything; stale statistics and the independence assumption are major sources of bad plans.
- The plan space grows exponentially with join count, so optimizers rely on dynamic programming, greedy heuristics, or genetic search.
- Plan caching avoids repeated optimization but introduces the generic-plan / parameter-sniffing trade-off.
What's Next:
With an optimized execution plan in hand, we're ready for the final phase: execution. The next page examines how the query executor actually processes data—the iterator model, pipelining, materialization, and the machinery that transforms plans into results.
You now understand how query optimizers transform logical plans into efficient physical plans through heuristic and cost-based techniques. This foundation prepares you for understanding query execution—where plans become results.