A skilled chess player doesn't just see the current board position—they see multiple possible futures. Before moving a piece, they mentally simulate several sequences of moves, evaluating outcomes, comparing strategies. The move they choose comes from this comparative analysis.
Query optimization works similarly. For any non-trivial query, the database optimizer generates multiple candidate execution plans. It evaluates their estimated costs and selects what it believes is optimal. But as we've learned, estimates can be wrong. Sometimes the optimizer's choice isn't actually the best.
This is where plan comparison becomes essential. By understanding how to compare plans—both alternatives the optimizer considered and alternatives you create through query restructuring—you gain the ability to validate the optimizer's decisions and to find faster plans when it chooses poorly.
By the end of this page, you will master techniques for generating alternative plans, comparing them systematically, understanding optimizer decision factors, using hints and configuration to influence choices, and validating improvements through rigorous measurement.
To compare plans, you first need multiple plans to compare. There are several strategies for generating alternatives:
Strategy 1: Disable Specific Operators
Most databases have configuration parameters to disable specific join algorithms or access methods. This forces the optimizer to choose alternatives:
PostgreSQL Example:
SET enable_seqscan = off; -- Force use of indexes
SET enable_hashjoin = off; -- Prevent hash joins
SET enable_nestloop = off; -- Prevent nested loops
SET enable_mergejoin = off; -- Prevent merge joins
SET enable_indexscan = off; -- Prevent index scans
SET enable_bitmapscan = off; -- Prevent bitmap scans
By disabling the chosen operator and re-running EXPLAIN, you see what the optimizer would do if that option weren't available.
-- Original plan
EXPLAIN ANALYZE
SELECT c.name, COUNT(o.id)
FROM customers c
JOIN orders o ON c.id = o.customer_id
GROUP BY c.name;

-- Result: Hash Join selected (cost=2845.00)
-- Execution Time: 125.432 ms

-- Alternative 1: Force Merge Join
SET enable_hashjoin = off;
EXPLAIN ANALYZE
SELECT c.name, COUNT(o.id)
FROM customers c
JOIN orders o ON c.id = o.customer_id
GROUP BY c.name;

-- Result: Merge Join with Sort (cost=3567.00)
-- Execution Time: 178.234 ms
-- Conclusion: Hash Join was better

-- Alternative 2: Force Nested Loop
SET enable_hashjoin = off;
SET enable_mergejoin = off;
EXPLAIN ANALYZE
SELECT c.name, COUNT(o.id)
FROM customers c
JOIN orders o ON c.id = o.customer_id
GROUP BY c.name;

-- Result: Nested Loop (cost=45678.00)
-- Execution Time: 12456.789 ms (~100x slower than the hash join!)
-- Conclusion: Nested Loop catastrophic for this query

-- Reset to defaults
RESET enable_hashjoin;
RESET enable_mergejoin;

Strategy 2: Query Reformulation
Same logical result, different SQL structure (a concrete example of the first row follows the table):
| Original | Alternative 1 | Alternative 2 |
|---|---|---|
| IN (subquery) | EXISTS (subquery) | JOIN |
| LEFT JOIN + IS NULL | NOT EXISTS | EXCEPT |
| Correlated subquery | Derived table | CTE |
| DISTINCT | GROUP BY | Window + filter |
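As a concrete example of the first row, here is a "customers with at least one order" query written three equivalent ways, using the customers and orders tables from the earlier examples. All three return the same customers, but the optimizer may plan them differently (recent PostgreSQL versions often plan the first two as the same semi-join):

-- Customers who have placed at least one order, written three ways
-- 1. IN (subquery)
SELECT c.name
FROM customers c
WHERE c.id IN (SELECT o.customer_id FROM orders o);

-- 2. EXISTS (subquery)
SELECT c.name
FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);

-- 3. JOIN (DISTINCT needed to avoid duplicates when a customer has many orders)
SELECT DISTINCT c.name
FROM customers c
JOIN orders o ON o.customer_id = c.id;

Run EXPLAIN ANALYZE on each variant and record the results so they can be compared side by side.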
Strategy 3: Different Join Orders
For multi-table joins, the order matters. Some databases accept explicit join order hints:
-- PostgreSQL: join_collapse_limit = 1 preserves written order
SET join_collapse_limit = 1;
SELECT * FROM a JOIN b ON... JOIN c ON... -- Joins in this order
-- MySQL: STRAIGHT_JOIN forces left-to-right order
SELECT * FROM a STRAIGHT_JOIN b ON... STRAIGHT_JOIN c ON...
-- Oracle: LEADING hint
SELECT /*+ LEADING(a b c) */ * FROM a, b, c WHERE...
Disabling planner operators like this is for comparison testing only—never do it in production. Session-level settings affect only your connection, but be sure to RESET them after testing. To influence plan choice in production, use optimizer hints in the query itself.
When comparing plans, use a systematic framework to ensure meaningful comparison:
The 5-Dimension Comparison Model:
| Dimension | What to Compare | How to Measure |
|---|---|---|
| Estimated Cost | Optimizer's prediction | EXPLAIN cost values |
| Actual Time | Real execution duration | EXPLAIN ANALYZE timing |
| Row Estimates vs. Actual | Cardinality accuracy | EXPLAIN ANALYZE rows |
| Resource Usage | I/O, memory consumption | EXPLAIN (BUFFERS), OS monitoring |
| Scalability | How performance changes with data size | Test with varying data volumes (see the sketch below) |
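The scalability dimension is the easiest to overlook. A minimal sketch of how to test it, assuming the sales table from the case study later on this page (the column list, default-generated id, and ~8,000 products are assumptions), is to load progressively larger synthetic batches and re-run the comparison:

-- Grow the data, then re-test the competing plans
INSERT INTO sales (product_id, sale_date, amount)
SELECT floor(random() * 8000)::int + 1,
       date '2024-01-01' + floor(random() * 180)::int,
       round((random() * 500)::numeric, 2)
FROM generate_series(1, 1000000);

ANALYZE sales;
-- Re-run EXPLAIN ANALYZE after each batch; note whether the chosen plan
-- (and its timing) changes as the table grows.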
The Comparison Worksheet:
For each query variant, record:
| Plan Version | Est. Cost | Actual Time | Rows Est | Rows Actual | Buffers |
|--------------------|-----------|-------------|----------|-------------|---------|
| Original (Hash) | 2845.00 | 125.4 ms | 50000 | 51234 | hit=456 |
| Alt 1 (Merge) | 3567.00 | 178.2 ms | 50000 | 51234 | hit=612 |
| Alt 2 (NL) | 45678.00 | 12456.8 ms | 50000 | 51234 | hit=1M |
| Rewritten (EXISTS) | 2341.00 | 98.7 ms | 50000 | 51234 | hit=234 |
Interpreting the Worksheet:
Cost Correlation: Does lower cost correlate with faster actual time?
Cardinality Accuracy: Are row estimates close to actual?
Resource Trade-offs: What's the buffer/memory difference?
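If you prefer the worksheet to live in the database rather than a spreadsheet, a minimal sketch is a scratch table; the table name, columns, and the transcribed values below are illustrative, taken from the worksheet above:

-- Hypothetical scratch table for worksheet rows
CREATE TABLE plan_comparison (
    plan_version text,
    est_cost     numeric,
    actual_ms    numeric,
    rows_est     bigint,
    rows_actual  bigint,
    buffers_hit  bigint
);

-- One row per variant, values transcribed from EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO plan_comparison VALUES
    ('Original (Hash)',    2845.00, 125.4, 50000, 51234, 456),
    ('Alt 1 (Merge)',      3567.00, 178.2, 50000, 51234, 612),
    ('Rewritten (EXISTS)', 2341.00,  98.7, 50000, 51234, 234);

-- Rank variants by actual time and check cardinality accuracy
SELECT plan_version, est_cost, actual_ms,
       round(rows_actual::numeric / NULLIF(rows_est, 0), 2) AS rows_ratio
FROM plan_comparison
ORDER BY actual_ms;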
-- Create comparison baseline
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT p.name, COUNT(s.id) as sales_count
FROM products p
JOIN sales s ON p.id = s.product_id
WHERE s.sale_date >= '2024-01-01'
GROUP BY p.name
ORDER BY sales_count DESC
LIMIT 10;

-- Store results:
-- {"Plan": {"Total Cost": 4521.34, "Actual Total Time": 234.567, ...}}

-- Test Alternative: CTE instead of direct join
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
WITH recent_sales AS (
    SELECT product_id, COUNT(*) as cnt
    FROM sales
    WHERE sale_date >= '2024-01-01'
    GROUP BY product_id
)
SELECT p.name, rs.cnt
FROM products p
JOIN recent_sales rs ON p.id = rs.product_id
ORDER BY rs.cnt DESC
LIMIT 10;

-- Store and compare JSON results programmatically

-- Test Alternative: Force Index-based approach
SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT p.name, COUNT(s.id) as sales_count
FROM products p
JOIN sales s ON p.id = s.product_id
WHERE s.sale_date >= '2024-01-01'
GROUP BY p.name
ORDER BY sales_count DESC
LIMIT 10;

RESET enable_seqscan;

Query timing varies due to caching, concurrent load, and system activity. Run each comparison at least 3-5 times and use median timing. For critical decisions, test under varying cache conditions (cold vs. warm) to understand realistic behavior.
When the optimizer makes a choice you don't expect, understanding why is essential before trying to override it.
Common Reasons for Unexpected Choices:
| Observation | Likely Reason | Investigation |
|---|---|---|
| Seq Scan chosen despite index | Filter is not selective (matches >10-20% of rows) | Check rows estimate vs table size |
| Nested Loop with large tables | Wrong row estimate on outer side | Compare estimated vs actual rows |
| Index not used for ORDER BY | Low correlation or many columns needed | Check pg_stats.correlation for column |
| Hash Join for small tables | Building a hash on the small table is cheaper than sorting for a merge join | Cost threshold calculation favored hash |
| Subquery not flattened | Aggregate, LIMIT, or side effects prevent | Check if subquery has blocking features |
| Different plan each time | Changing statistics or prepared statement re-planning | Check if data distribution changed recently |
-- Why was Seq Scan chosen instead of Index Scan?

EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'completed';

-- Seq Scan on orders (cost=0.00..15234.00 rows=85000 width=120)
--   Filter: (status = 'completed')
--   Rows Removed by Filter: 15000

-- 85000/100000 = 85% of rows match. Index would be slower!
-- Random I/O for 85K lookups > Sequential read of whole table

-- Check statistics driving this decision:
SELECT tablename, attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'status';

-- Output:
-- most_common_vals: {completed,pending,cancelled}
-- most_common_freqs: {0.85, 0.10, 0.05}
-- Optimizer correctly estimated 85% selectivity

-- For the rare value, index IS used:
EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'cancelled';

-- Index Scan using idx_status on orders (cost=0.29..1523.45 rows=5000)
-- Only 5% of rows, worth using index

The Optimizer's Perspective:
The optimizer is trying to minimize estimated cost. When its choice seems wrong, the estimates feeding that cost are usually the real culprit—compare estimated and actual rows and costs before reaching for overrides.
Using Debug Output:
Some databases provide detailed optimizer reasoning:
-- PostgreSQL: no dedicated optimizer trace; use EXPLAIN (VERBOSE, COSTS), plus planner settings such as geqo_threshold
EXPLAIN (VERBOSE, COSTS) SELECT ...;
-- MySQL: Extended EXPLAIN + WARNINGS
EXPLAIN SELECT ...;
SHOW WARNINGS; -- Shows optimizer's reasoning
-- Oracle: 10053 trace event for detailed cost analysis
ALTER SESSION SET EVENTS '10053 trace name context forever';
Modern query optimizers are sophisticated. They consider factors you might miss: correlation, cache behavior, I/O patterns. Override them only when you have clear evidence (EXPLAIN ANALYZE) that they're wrong, and understand why.
When you've determined the optimizer is making a suboptimal choice, you have several options to influence plan selection:
Option 1: Update Statistics
The most common fix. If cardinality estimates are wrong, update statistics:
-- PostgreSQL
ANALYZE table_name;
ANALYZE table_name(column_name); -- Specific column
-- Increase statistics granularity for complex distributions
ALTER TABLE orders ALTER COLUMN region SET STATISTICS 1000;
ANALYZE orders;
-- MySQL
ANALYZE TABLE table_name;
-- Oracle
EXEC DBMS_STATS.GATHER_TABLE_STATS('schema', 'table_name');
Option 2: Add or Modify Indexes
Give the optimizer better access paths:
-- Create missing index
CREATE INDEX idx_composite ON orders(customer_id, status);
-- Create covering index to enable Index Only Scan
CREATE INDEX idx_covering ON orders(customer_id, status) INCLUDE (order_date, total);
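To confirm that the covering index actually enables an Index Only Scan, re-run EXPLAIN against a query that touches only the indexed and included columns (the literal values below are illustrative):

EXPLAIN (ANALYZE, BUFFERS)
SELECT order_date, total
FROM orders
WHERE customer_id = 42 AND status = 'completed';

-- Look for "Index Only Scan using idx_covering" and a low "Heap Fetches" count;
-- if Heap Fetches stays high, VACUUM the table so the visibility map gets set.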
Option 3: Use Optimizer Hints
-- PostgreSQL doesn't have traditional hints,
-- but the pg_hint_plan extension provides them

-- Install extension (if available)
CREATE EXTENSION pg_hint_plan;

-- Use hints in query
SELECT /*+ SeqScan(orders) */ * FROM orders WHERE id = 1;
SELECT /*+ IndexScan(orders orders_pkey) */ * FROM orders WHERE id = 1;
SELECT /*+ HashJoin(c o) */ * FROM customers c JOIN orders o ON c.id = o.customer_id;
SELECT /*+ NestLoop(c o) */ * FROM customers c JOIN orders o ON c.id = o.customer_id;
SELECT /*+ Leading((c o)) */ * FROM customers c JOIN orders o ON c.id = o.customer_id;

-- Without extension: session-level workarounds
SET enable_seqscan = off;      -- Discourage seq scans
SET random_page_cost = 1.1;    -- Make index scans more attractive (SSD adjustment)

Hints override optimizer intelligence. As data grows or distribution changes, a once-correct hint may become harmful. Prefer addressing root causes (statistics, indexes, query restructuring) over hints. If you must use hints, document why and review periodically.
One of the most insidious performance problems is plan regression—when a query that previously ran fast suddenly becomes slow due to a change in execution plan.
Common Causes of Plan Regression:
| Trigger | Why Plan Changes |
|---|---|
| Statistics update | New statistics change cost estimates |
| Data growth | Selectivity thresholds crossed |
| Database upgrade | New optimizer features or cost models |
| Index creation/deletion | New options available or removed |
| Parameter change | Configuration affects cost calculation |
| Histogram bucket boundary | Values now fall in different buckets |
-- Capture plan baseline (store plan hash or full plan)
CREATE TABLE plan_baselines (
    query_id VARCHAR(64),
    captured_at TIMESTAMP DEFAULT now(),
    plan_hash VARCHAR(64),
    plan_json JSONB,
    execution_time_ms NUMERIC
);

-- Capture the current plan for an important query.
-- The plan JSON comes from EXPLAIN (ANALYZE, FORMAT JSON) run by your application
-- or a monitoring script; pg_stat_statements records timings, not plans.
INSERT INTO plan_baselines (query_id, plan_hash, plan_json, execution_time_ms)
SELECT 'critical_report_query',
       md5(plan::text),
       plan,
       (plan -> 0 ->> 'Execution Time')::numeric
FROM (
    SELECT ...::jsonb AS plan -- Capture mechanism varies
) t;

-- Periodic comparison to detect drift
-- If plan_hash changes, investigate

-- pg_stat_statements for historical query performance
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT query, calls, mean_time, stddev_time, rows
FROM pg_stat_statements
WHERE query LIKE '%orders%customer%'
ORDER BY mean_time DESC;
-- (on PostgreSQL 13+ the columns are mean_exec_time and stddev_exec_time)

-- Watch for mean_time spikes on known queries

Tools like SQL Server Query Store, Oracle SQL Plan Baselines, and third-party APM solutions can automatically detect and sometimes revert plan regressions. Invest in automated monitoring for production-critical workloads.
Let's walk through a complete plan comparison scenario to see these techniques in action.
Scenario: A weekly sales report query has become slow after a data migration. We need to diagnose and fix it.
Step 1: Capture Current Plan
-- STEP 1: Current problematic plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT p.category,
       DATE_TRUNC('week', s.sale_date) as week,
       SUM(s.amount) as total_sales
FROM products p
JOIN sales s ON p.id = s.product_id
WHERE s.sale_date >= '2024-01-01'
GROUP BY p.category, DATE_TRUNC('week', s.sale_date)
ORDER BY week, total_sales DESC;

-- OUTPUT:
-- Sort (cost=58234.56..58245.67 rows=4444 width=64)
--      (actual time=15234.567..15234.890 rows=520 loops=1)
--   Sort Key: (date_trunc('week', s.sale_date)), (sum(s.amount)) DESC
--   Sort Method: quicksort  Memory: 58kB
--   -> HashAggregate (cost=57890.12..57934.56 rows=4444 width=64)
--                    (actual time=15232.123..15233.456 rows=520 loops=1)
--        Group Key: p.category, (date_trunc('week', s.sale_date))
--        -> Hash Join (cost=325.00..54567.89 rows=444444 width=20)
--                     (actual time=2.345..14890.234 rows=500000 loops=1)
--             Hash Cond: (s.product_id = p.id)
--             -> Seq Scan on sales s (cost=0.00..43210.00 rows=1000000 width=16)
--                                    (actual time=0.015..8234.567 rows=1000000 loops=1)
--                  Filter: (sale_date >= '2024-01-01')
--                  Rows Removed by Filter: 0
--             -> Hash (cost=225.00..225.00 rows=8000 width=12)
--                     (actual time=2.234..2.234 rows=8000 loops=1)
--                  -> Seq Scan on products p (rows=8000 loops=1)
-- Buffers: shared hit=15234 read=28456   <- Lots of disk reads!
-- Execution Time: 15238.456 ms           <- 15 SECONDS!

-- ANALYSIS:
-- 1. Seq Scan on sales reads the entire table (no date index exists)
-- 2. Large buffer read count indicates cold cache / missing index
-- 3. No index on sale_date or composite index available
-- STEP 2: Check what indexes exist
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'sales';

-- Result: No index on sale_date!

-- STEP 3: Generate alternative with index
CREATE INDEX idx_sales_date ON sales(sale_date);
ANALYZE sales;

-- STEP 4: Compare new plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT p.category,
       DATE_TRUNC('week', s.sale_date) as week,
       SUM(s.amount) as total_sales
FROM products p
JOIN sales s ON p.id = s.product_id
WHERE s.sale_date >= '2024-01-01'
GROUP BY p.category, DATE_TRUNC('week', s.sale_date)
ORDER BY week, total_sales DESC;

-- NEW OUTPUT:
-- Sort (cost=8234.56..8245.67 rows=4444 width=64)
--      (actual time=234.567..234.890 rows=520 loops=1)   <- FROM 15s to 235ms!
--   -> HashAggregate (cost=7890.12..7934.56 rows=4444 width=64)
--        -> Hash Join (cost=325.00..5567.89 rows=444444 width=20)
--             -> Index Scan using idx_sales_date on sales s   <- NOW USING INDEX
--                  (actual time=0.056..156.234 rows=500000)
--                  Index Cond: (sale_date >= '2024-01-01')
--             -> Hash (cost=225.00..225.00 rows=8000 width=12)
--                  -> Seq Scan on products p
-- Buffers: shared hit=12456 read=234   <- Dramatically fewer disk reads
-- Execution Time: 238.456 ms           <- 64x FASTER!

-- COMPARISON SUMMARY:
-- | Metric          | Before    | After     | Improvement |
-- |-----------------|-----------|-----------|-------------|
-- | Execution Time  | 15238 ms  | 238 ms    | 64x         |
-- | Buffer Reads    | 28456     | 234       | 122x        |
-- | Access Method   | Seq Scan  | Idx Scan  | Selective   |
-- | Est. Cost       | 58234     | 8234      | 7x          |

By systematically comparing plans, we identified that the missing index on sale_date forced a full table scan. Adding the index and comparing plans confirmed the improvement. The fix reduced execution time from 15 seconds to under 250 milliseconds—a 64x improvement.
Manual plan comparison is powerful for deep analysis, but automation helps scale this to many queries and continuous monitoring.
Visualization and Analysis Tools:
| Database | Built-in Tools | Third-Party Options |
|---|---|---|
| PostgreSQL | pgAdmin, auto_explain extension | pgMustard, explain.depesz.com, pganalyze |
| MySQL | MySQL Workbench Visual Explain | Percona Toolkit, VividCortex |
| SQL Server | SSMS graphical plans, Query Store | SentryOne, Redgate SQL Monitor |
| Oracle | SQL Developer, AWR/ASH reports | Oracle EM, Toad, SQL Optimizer |
Automated Plan Collection:
-- PostgreSQL: auto_explain logs slow query plans
-- (requires auto_explain in shared_preload_libraries, or LOAD 'auto_explain' per session)
ALTER SYSTEM SET auto_explain.log_min_duration = '1000'; -- 1 second
ALTER SYSTEM SET auto_explain.log_analyze = on;
ALTER SYSTEM SET auto_explain.log_buffers = on;
SELECT pg_reload_conf();
-- Now any query taking >1s gets its plan logged automatically
CI/CD Integration:
| Use Case | Implementation |
|---|---|
| Pre-merge checks | EXPLAIN critical queries in PR, compare to baseline |
| Post-deploy verification | Run standard queries, alert if plan changes |
| Performance budgets | Fail build if estimated cost exceeds threshold |
| Regression testing | Track actual execution times across releases |
# Example: Automated plan comparison in CI/CD
import psycopg2
import json
import hashlib

def capture_plan(conn, query):
    """Capture execution plan as JSON"""
    # Note: queries with $1-style parameters must be PREPAREd (EXPLAIN EXECUTE)
    # or have literal values substituted before EXPLAIN will accept them.
    with conn.cursor() as cur:
        cur.execute(f"EXPLAIN (FORMAT JSON) {query}")
        return cur.fetchone()[0]

def normalize_plan(plan):
    """Remove volatile fields (costs, row/width estimates) so only structure is hashed"""
    if isinstance(plan, dict):
        return {k: normalize_plan(v) for k, v in plan.items()
                if k not in ("Startup Cost", "Total Cost", "Plan Rows", "Plan Width")}
    if isinstance(plan, list):
        return [normalize_plan(item) for item in plan]
    return plan

def plan_hash(plan):
    """Generate hash of plan structure for comparison"""
    normalized = normalize_plan(plan)
    return hashlib.md5(json.dumps(normalized, sort_keys=True).encode()).hexdigest()

def compare_plans(baseline_hash, current_hash):
    """Compare plan hashes, alert if different"""
    if baseline_hash != current_hash:
        print("⚠️ PLAN CHANGE DETECTED!")
        print(f"Baseline: {baseline_hash}")
        print(f"Current:  {current_hash}")
        return False
    return True

# Usage in CI/CD pipeline
# conn: an open psycopg2 connection supplied by the pipeline environment
critical_queries = [
    "SELECT * FROM orders WHERE customer_id = $1",
    "SELECT COUNT(*) FROM sales WHERE sale_date >= $1",
    # ... more critical queries
]

for query in critical_queries:
    current_plan = capture_plan(conn, query)
    current_hash = plan_hash(current_plan)
    baseline_hash = load_baseline(query)  # From version control

    if not compare_plans(baseline_hash, current_hash):
        # Fail pipeline or alert, depending on policy
        raise Exception(f"Plan regression for: {query[:50]}...")

Don't try to automate everything immediately. Start with manual comparison for your 5-10 most critical queries. Once you understand the patterns, build automation. The goal is preventing regressions, not achieving perfection.
We've completed our deep dive into EXPLAIN plans. This final page covered the essential skill of comparing execution plans—the capstone of effective query optimization.
Module Conclusion: Mastering EXPLAIN Plans
Over these five pages, you've developed comprehensive expertise in execution plan analysis:
| Page | Core Skill Developed |
|---|---|
| EXPLAIN Statement | Invoking EXPLAIN across databases, understanding estimated vs. actual |
| Execution Plan Reading | Tracing data flow, understanding tree structure, systematic analysis |
| Plan Operators | Vocabulary of access methods, join algorithms, processing operators |
| Cost Estimates | What costs mean, how they're calculated, when to trust them |
| Plan Comparison | Generating alternatives, comparing rigorously, influencing choices |
With these skills, you can diagnose any slow query, understand optimizer behavior, and make informed optimization decisions. Execution plans are no longer mysterious output—they're a diagnostic language you speak fluently.
Continuing Your Journey:
The next modules in this chapter explore other aspects of SQL performance.
Congratulations! You now have a comprehensive understanding of EXPLAIN plans—from basic invocation to advanced plan comparison. This knowledge is foundational for serious SQL performance engineering. Practice regularly, and execution plan analysis will become an intuitive skill that accelerates all your database work.