The EXPLAIN command is your window into the optimizer's decision-making process. It reveals exactly what the database intends to do (or did, with execution statistics) when executing your query. This is the single most important diagnostic tool for query performance.
Yet many developers treat EXPLAIN output as inscrutable noise—an intimidating wall of text to be skimmed and ignored. This is a critical mistake. Every performance investigation should start with EXPLAIN. Every slow query has a story to tell, written in its execution plan.
This page will teach you to read that story fluently, extract actionable insights, and identify the specific bottlenecks causing performance problems.
By the end of this page, you will: master EXPLAIN syntax across PostgreSQL, MySQL, and SQL Server; understand how to interpret cost, row, and timing information; identify red flags that indicate optimization opportunities; and develop a systematic methodology for analyzing any execution plan.
Every major SQL database provides an EXPLAIN command, but the syntax and output format vary significantly. Understanding the commonalities and differences is essential for working across database platforms.
| Database | Basic Syntax | With Execution Stats | Important Options |
|---|---|---|---|
| PostgreSQL | EXPLAIN query | EXPLAIN ANALYZE query | BUFFERS, FORMAT (TEXT/JSON/YAML), VERBOSE, SETTINGS |
| MySQL | EXPLAIN query | EXPLAIN ANALYZE query (8.0.18+) | FORMAT (TRADITIONAL/JSON/TREE), FOR CONNECTION |
| SQL Server | SET SHOWPLAN_TEXT ON / SET SHOWPLAN_ALL ON | SET STATISTICS PROFILE ON | Include Actual Execution Plan, Live Query Statistics |
| Oracle | EXPLAIN PLAN FOR query; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY) | Run query + DBMS_XPLAN.DISPLAY_CURSOR | ALL, ADVANCED, +PEEKED_BINDS |
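To make the less-familiar rows of the table concrete, here is a minimal sketch of capturing an estimated plan in SQL Server and Oracle; the `orders` query is a placeholder.

```sql
-- SQL Server: estimated plan as text (the query is parsed, not executed)
SET SHOWPLAN_TEXT ON;
GO
SELECT * FROM orders WHERE status = 'pending';
GO
SET SHOWPLAN_TEXT OFF;
GO

-- Oracle: write the plan to PLAN_TABLE, then display it
EXPLAIN PLAN FOR
  SELECT * FROM orders WHERE status = 'pending';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```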
The Critical Distinction: Estimated vs. Actual
There are two fundamentally different types of EXPLAIN output:
Estimated (EXPLAIN without ANALYZE): shows the plan the optimizer would choose, with its cost and row estimates. The query is never executed, so it is safe to run anywhere, but the numbers are only as reliable as the statistics behind them.
Actual (EXPLAIN ANALYZE or equivalent): executes the query and annotates each plan node with real row counts, timings, and loop counts. This is the ground truth, but it carries the full cost, and any side effects, of running the query.
EXPLAIN ANALYZE runs the full query, including data modifications. Never run EXPLAIN ANALYZE on INSERT, UPDATE, or DELETE in production without wrapping in a transaction you'll roll back: BEGIN; EXPLAIN ANALYZE DELETE FROM users; ROLLBACK;
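Spelling out the safe pattern from the warning above (the WHERE clause is a hypothetical example):

```sql
-- Wrap EXPLAIN ANALYZE of a write in a transaction you roll back,
-- so the data modification never commits
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM users WHERE last_login < now() - interval '5 years';
ROLLBACK;
```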
When to Use Each Mode:
| Situation | Use |
|---|---|
| Quick check of index usage | EXPLAIN (estimated) |
| Understanding why optimizer chose a plan | EXPLAIN (estimated) |
| Diagnosing actual performance problems | EXPLAIN ANALYZE (actual) |
| Comparing estimated vs actual row counts | EXPLAIN ANALYZE (actual) |
| Production investigation of SELECT queries | EXPLAIN ANALYZE (actual, carefully) |
| Production investigation of write queries | EXPLAIN (estimated only), test on replica with ANALYZE |
The analysis workflow: capture the plan (with ANALYZE and BUFFERS where it is safe to execute), find the nodes where the time actually goes, compare estimated versus actual row counts, check for red flags such as unexpected sequential scans or disk spills, apply a fix, and re-explain to confirm the plan changed.
PostgreSQL provides the most detailed and flexible EXPLAIN output of any open-source database. Mastering PostgreSQL's EXPLAIN is a transferable skill—the concepts apply to all databases even when syntax differs.
```sql
-- Basic estimated plan
EXPLAIN SELECT * FROM orders WHERE status = 'pending';

-- With execution statistics (RUNS THE QUERY)
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';

-- With buffer usage statistics (I/O details)
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE status = 'pending';

-- JSON format for programmatic parsing
EXPLAIN (FORMAT JSON, ANALYZE, BUFFERS) SELECT * FROM orders WHERE status = 'pending';

-- Show planning time and configuration
EXPLAIN (ANALYZE, BUFFERS, SETTINGS, TIMING) SELECT * FROM orders WHERE status = 'pending';

-- Verbose output with schema-qualified names
EXPLAIN (VERBOSE) SELECT * FROM orders WHERE status = 'pending';
```

Component-by-Component Breakdown:
Let's dissect a real EXPLAIN ANALYZE output:
```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'USA' AND o.total > 1000;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Hash Join  (cost=85.50..294.35 rows=127 width=48) (actual time=0.892..2.341 rows=89 loops=1)
   Hash Cond: (o.customer_id = c.id)
   Buffers: shared hit=47
   ->  Seq Scan on orders o  (cost=0.00..189.00 rows=1523 width=16) (actual time=0.012..0.987 rows=1456 loops=1)
         Filter: (total > 1000)
         Rows Removed by Filter: 8544
         Buffers: shared hit=30
   ->  Hash  (cost=73.75..73.75 rows=940 width=40) (actual time=0.834..0.835 rows=892 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 65kB
         Buffers: shared hit=17
         ->  Seq Scan on customers c  (cost=0.00..73.75 rows=940 width=40) (actual time=0.007..0.512 rows=892 loops=1)
               Filter: (country = 'USA'::text)
               Rows Removed by Filter: 9108
               Buffers: shared hit=17
 Planning Time: 0.215 ms
 Execution Time: 2.432 ms
(16 rows)
```

When loops > 1, multiply the actual time and rows by loops to get totals. A node showing 'actual time=0.01..0.02 rows=1 loops=10000' actually takes 200ms total (0.02 × 10000) and produces 10,000 rows. Missing this is a common analysis mistake.
The BUFFERS Option—Essential for I/O Analysis:
The BUFFERS output reveals I/O patterns that timing alone doesn't capture:
| Metric | Meaning |
|---|---|
| shared hit | Pages read from buffer cache (fast) |
| shared read | Pages read from disk into cache (slow) |
| shared dirtied | Pages modified in cache |
| shared written | Pages this query's backend wrote to disk (usually while evicting dirty buffers) |
| temp read/written | Temporary file I/O (sorts/hashes spilled to disk) |
A query showing high temp read/written is running out of memory and spilling to disk—a major red flag.
A query with all shared hit (no shared read) is fully cached—great for performance but may mask I/O problems that appear under cache pressure.
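A minimal sketch of checking for a disk spill, using a hypothetical large ORDER BY; the work_mem value is illustrative, not a recommendation:

```sql
-- Look for "Sort Method: external merge  Disk: ..." and temp read/written
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders ORDER BY total DESC;

-- If the sort spills, try a larger sort/hash memory budget for this session
-- and re-explain; an in-memory sort reports "Sort Method: quicksort"
SET work_mem = '256MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders ORDER BY total DESC;
```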
MySQL's EXPLAIN output differs structurally from PostgreSQL's, using a table-based format that shows one row per table in the query. Understanding MySQL's unique terminology is essential for MySQL optimization.
```sql
-- Traditional tabular format
EXPLAIN SELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'USA';

-- JSON format (more detail)
EXPLAIN FORMAT=JSON SELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'USA';

-- Tree format (8.0.16+) - shows actual plan tree
EXPLAIN FORMAT=TREE SELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'USA';

-- With execution statistics (8.0.18+) - RUNS THE QUERY
EXPLAIN ANALYZE SELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'USA';
```

| Column | Description | What to Look For |
|---|---|---|
| id | Query block identifier; same id = same SELECT | Subqueries and UNIONs have different ids |
| select_type | Type of SELECT clause | SIMPLE is good; DEPENDENT SUBQUERY is often slow |
| table | Table being accessed | Order indicates join sequence |
| type | Access type (crucial!) | ALL (full scan) is worst; eq_ref, const are best |
| possible_keys | Indexes that could be used | Empty means no usable indexes |
| key | Index actually chosen | NULL means no index used |
| key_len | Bytes of index used | Shows how much of composite index is used |
| ref | Columns/constants compared to index | Reveals join conditions |
| rows | Estimated rows to examine | Multiply across joins for total work |
| filtered | Percentage passing table condition | Low values indicate heavy filtering |
| Extra | Additional optimizer information | Many important flags here (see below) |
The 'type' Column—Access Type Hierarchy:
MySQL's type column is the most important indicator of scan efficiency. From best to worst: system, const, eq_ref, ref, ref_or_null, index_merge, range, index, ALL.
General rule: If you see type: ALL on a large table, you have a problem. Either add an index or accept that the query will be slow.
Seeing both 'Using filesort' and 'Using temporary' typically indicates a query that will perform poorly at scale. The database is creating a temporary table AND sorting it—two expensive operations that bypass indexes entirely.
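As an illustration, here is a query shape that commonly produces both flags: grouping by one column while ordering by an aggregate. The orders table and its columns follow the examples used earlier on this page.

```sql
-- Extra frequently shows "Using temporary; Using filesort" for this shape:
-- MySQL materializes the grouped rows in a temporary table, then sorts them,
-- because no single index can satisfy both the GROUP BY and the ORDER BY
EXPLAIN
SELECT customer_id, SUM(total) AS lifetime_total
FROM orders
GROUP BY customer_id
ORDER BY lifetime_total DESC;
```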
After reviewing hundreds of slow query investigations, certain patterns emerge repeatedly. Learning to spot these red flags accelerates diagnosis from hours to minutes.
The Cardinality Error Cascade:
The most insidious performance problems stem from cascading cardinality errors. Consider:
```
Hash Join (estimated 100 rows, actual 500,000 rows)
├── Index Scan on a (estimated 50 rows, actual 50 rows)      ← accurate
└── Index Scan on b (estimated 2 rows, actual 10,000 rows)   ← 5000x error!
```
The optimizer estimated the join would produce 100 rows (50 × 2). It actually produced 500,000 (50 × 10,000). Because it expected 100 rows, it may have chosen an algorithm (nested loop) that's catastrophic for 500,000 rows.
The fix: Update statistics on table b. If correlations are involved, consider extended/multi-column statistics.
When actual rows exceeds estimated rows by more than 10x, treat it as a critical problem. The optimizer's entire strategy is based on wrong assumptions. Run ANALYZE to update statistics and re-explain.
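A minimal sketch of that fix in PostgreSQL syntax, keeping the misestimated table name b from the example above; the correlated column names are hypothetical:

```sql
-- Refresh planner statistics for the misestimated table
ANALYZE b;

-- If two columns are correlated (e.g., city and postal_code), give the
-- planner multi-column statistics so it stops multiplying their
-- selectivities as if they were independent (PostgreSQL 10+)
CREATE STATISTICS b_city_postal_stats (dependencies)
  ON city, postal_code FROM b;
ANALYZE b;

-- Then re-run EXPLAIN ANALYZE and compare estimated vs actual rows again
```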
For multi-table queries, join order is often the difference between milliseconds and minutes. The optimizer tries to find the optimal order, but it can fail—especially with complex queries or stale statistics.
How to Identify Join Order Problems:
Look at intermediate result sizes: After each join, check actual row counts. If they explode unexpectedly, earlier tables may be in wrong order.
Check which table drives the join: The first table (outermost in the plan) should be the one that filters down to the fewest rows earliest.
Look for 'loops' multipliers: High loop counts on an operator mean it's being called repeatedly—it's on the inner side of a nested loop driven by a large outer side.
Example of Poor Join Order:
```
-- Query: Find orders for VIP customers in last 30 days
-- Tables: customers (10M rows), orders (100M rows), order_items (500M rows)
-- VIP customers: 1,000. Orders in last 30 days: 10M.

-- POOR PLAN (driving from orders):
Nested Loop
├── Seq Scan on orders (filter: date > now()-30)
│     actual rows: 10,000,000            ← driving 10M iterations!
└── Index Scan on customers (filter: vip = true)
      actual rows: 1, loops: 10,000,000  ← runs 10M times

-- OPTIMAL PLAN (driving from customers):
Nested Loop
├── Index Scan on customers (filter: vip = true)
│     actual rows: 1,000                 ← driving only 1K iterations
└── Index Scan on orders (filter: customer_id = c.id AND date > now()-30)
      actual rows: 50, loops: 1,000      ← runs 1K times, returns ~50K total
```

Forcing Better Join Orders:
When the optimizer chooses wrong, you have options:
| Approach | Description | When to Use |
|---|---|---|
| Update statistics | ANALYZE/VACUUM ANALYZE | First thing to try |
| Add appropriate indexes | Create indexes the optimizer can use | When missing index causes seq scan |
| Create multi-column statistics | For correlated columns | When correlations fool the optimizer |
| Use query hints | STRAIGHT_JOIN (MySQL), join hints (SQL Server) | Last resort, high maintenance |
| Rewrite query | Subqueries, CTEs, temp tables | When hints aren't available |
A word on hints: Hints lock you into a specific plan that may become suboptimal as data changes. Use sparingly and document why.
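For example, a minimal sketch of MySQL's STRAIGHT_JOIN, which joins tables in the order they are written; the column names follow the VIP-customers example above and are illustrative:

```sql
-- Force MySQL to drive the join from customers (listed first)
-- instead of letting the optimizer choose the order
SELECT STRAIGHT_JOIN c.name, o.total
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE c.vip = TRUE
  AND o.order_date > NOW() - INTERVAL 30 DAY;
```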
PostgreSQL's join_collapse_limit (default 8) controls how many tables the optimizer fully evaluates for join order. For queries joining more tables, it uses heuristics. You can raise this for complex queries, but optimization time increases factorially. The geqo_threshold controls when the genetic query optimizer takes over for very complex joins.
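A sketch of inspecting and raising those limits for a single session; the values are illustrative, not recommendations:

```sql
-- Check the current limits
SHOW join_collapse_limit;
SHOW geqo_threshold;

-- Let the planner exhaustively consider join orders for up to 12 tables
-- in this session only; planning time grows factorially with this value
SET join_collapse_limit = 12;

-- Raise the point at which the genetic query optimizer takes over
SET geqo_threshold = 14;
```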
Analyzing execution plans is a skill that improves with practice. Having a systematic methodology ensures you don't miss critical insights and makes your analysis reproducible.
Documentation Template for Plan Analysis:
When documenting your analysis (which you should do for significant queries), include:
Query: [SQL text]
Context: [Why this query matters, when it runs]
Problem: [e.g., "Times out during peak hours"]
Execution Summary:
- Total time: X ms
- Planning time: Y ms
- Peak memory: Z MB
Key Findings:
1. [e.g., "Seq scan on 'orders' table reading 10M rows"]
2. [e.g., "Cardinality estimate at join node off by 100x"]
Root Cause:
[e.g., "Missing index on orders.customer_id combined with stale statistics"]
Recommendation:
[e.g., "CREATE INDEX idx_orders_customer_id ON orders(customer_id); ANALYZE orders;"]
This systematic documentation helps with knowledge sharing and prevents recurrence.
For complex plans, visual tools dramatically improve understanding. PostgreSQL's EXPLAIN (FORMAT JSON) output can be pasted into tools like pev2.toast.dev, explain.depesz.com, or explain.dalibo.com for interactive visualization. MySQL Workbench and SQL Server Management Studio have built-in graphical plan viewers.
Beyond basic plan reading, several advanced techniques help diagnose subtle performance issues.
Comparing Plans for the Same Query:
When a query performs differently at different times, capture both plans and diff them:
This technique reveals plan instability, which often stems from stale statistics, parameter-sensitive plans (bind peeking), data volumes crossing planner thresholds, or configuration differences between sessions.
```sql
-- Capture plans with auto_explain for production diagnosis
-- In postgresql.conf or per-session:
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '1000';  -- Log plans taking >1s
SET auto_explain.log_analyze = true;
SET auto_explain.log_buffers = true;
SET auto_explain.log_format = 'json';

-- Or manually capture plans with timestamps
CREATE TABLE plan_snapshots (
    captured_at       TIMESTAMPTZ DEFAULT now(),
    query_text        TEXT,
    plan_json         JSONB,
    execution_time_ms NUMERIC
);

-- Insert plan snapshots for comparison. EXPLAIN cannot be used as a
-- subquery, so capture the JSON plan first (from the auto_explain log
-- or a client running EXPLAIN (FORMAT JSON, ANALYZE)) and insert it:
INSERT INTO plan_snapshots (query_text, plan_json, execution_time_ms)
VALUES ('SELECT ...',             -- the query being tracked
        '<captured plan JSON>',   -- paste the captured JSON plan here
        NULL);                    -- fill in the measured execution time
```

Testing Alternative Plans:
Sometimes you want to test what would happen with a different plan. You can influence plan choice through:
PostgreSQL settings (session-level):
```sql
SET enable_seqscan = off;    -- Discourage seq scans
SET enable_hashjoin = off;   -- Discourage hash joins
SET enable_nestloop = off;   -- Discourage nested loops
SET work_mem = '256MB';      -- Increase sort/hash memory
SET random_page_cost = 1.1;  -- Tune for SSD
```
MySQL hints:
```sql
SELECT /*+ NO_INDEX(orders) */ * FROM orders WHERE status = 'pending';
SELECT /*+ BNL(customers, orders) */ * FROM customers JOIN orders ...;
```
SQL Server hints:
```sql
SELECT * FROM orders WITH (INDEX(idx_status)) WHERE status = 'pending';
-- OPTION clauses are appended to the end of a statement:
OPTION (HASH JOIN, FORCE ORDER)
```
Use these for testing, not as permanent solutions. If the optimizer needs hints to find a good plan, the underlying issue (statistics, indexes, query structure) should be fixed.
Settings like 'enable_seqscan = off' affect ALL queries in the session, not just your test query. Reset them immediately after testing with RESET enable_seqscan;, or use SET LOCAL within a transaction so the change reverts automatically at commit or rollback.
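A minimal sketch of the transaction-scoped approach:

```sql
BEGIN;
-- Applies only inside this transaction
SET LOCAL enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE status = 'pending';
ROLLBACK;  -- or COMMIT; either way, enable_seqscan reverts automatically
```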
EXPLAIN analysis is the foundational skill for database performance work. The essentials from this page: know whether you are reading estimated or actual numbers; use ANALYZE (and BUFFERS) when you need ground truth, wrapping writes in a transaction you roll back; compare estimated versus actual row counts and treat large gaps as a statistics problem; watch for red flags such as full scans on large tables, disk spills, filesort/temporary operations, and exploding loop counts; and confirm every fix by re-explaining.
What's Next:
With the ability to read and interpret execution plans, you're ready to focus on specific optimization techniques. The next page covers Index Optimization—how to design, analyze, and maintain indexes for maximum query performance.
You now have the practical skills to read and interpret EXPLAIN output from any major database. This diagnostic ability is the foundation for all query optimization work. Practice by explaining your own queries—even fast ones—to build intuition for what good plans look like.