The EXPLAIN command is your window into the optimizer's decision-making process. It reveals exactly what the database intends to do (or did, with execution statistics) when executing your query. This is the single most important diagnostic tool for query performance.
Yet many developers treat EXPLAIN output as inscrutable noise—an intimidating wall of text to be skimmed and ignored. This is a critical mistake. Every performance investigation should start with EXPLAIN. Every slow query has a story to tell, written in its execution plan.
This page will teach you to read that story fluently, extract actionable insights, and identify the specific bottlenecks causing performance problems.
By the end of this page, you will: master EXPLAIN syntax across PostgreSQL, MySQL, and SQL Server; understand how to interpret cost, row, and timing information; identify red flags that indicate optimization opportunities; and develop a systematic methodology for analyzing any execution plan.
Every major SQL database provides an EXPLAIN command, but the syntax and output format vary significantly. Understanding the commonalities and differences is essential for working across database platforms.
| Database | Basic Syntax | With Execution Stats | Important Options |
|---|---|---|---|
| PostgreSQL | EXPLAIN query | EXPLAIN ANALYZE query | BUFFERS, FORMAT (TEXT/JSON/YAML), VERBOSE, SETTINGS |
| MySQL | EXPLAIN query | EXPLAIN ANALYZE query (8.0.18+) | FORMAT (TRADITIONAL/JSON/TREE), FOR CONNECTION |
| SQL Server | SET SHOWPLAN_TEXT ON / SET SHOWPLAN_ALL ON | SET STATISTICS PROFILE ON | Include Actual Execution Plan, Live Query Statistics |
| Oracle | EXPLAIN PLAN FOR query; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY) | Run query + DBMS_XPLAN.DISPLAY_CURSOR | ALL, ADVANCED, +PEEKED_BINDS |
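To make the less-familiar rows of the table concrete, here is a minimal sketch of capturing an estimated plan in SQL Server and Oracle; the `orders` query is a placeholder.

```sql
-- SQL Server: estimated plan as text (the query is parsed, not executed)
SET SHOWPLAN_TEXT ON;
GO
SELECT * FROM orders WHERE status = 'pending';
GO
SET SHOWPLAN_TEXT OFF;
GO

-- Oracle: write the plan to PLAN_TABLE, then display it
EXPLAIN PLAN FOR
  SELECT * FROM orders WHERE status = 'pending';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```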
The Critical Distinction: Estimated vs. Actual
There are two fundamentally different types of EXPLAIN output:
Estimated (EXPLAIN without ANALYZE): shows the plan the optimizer would choose, with its cost and row estimates. The query is never executed, so it is safe to run anywhere, but the numbers are only as reliable as the statistics behind them.
Actual (EXPLAIN ANALYZE or equivalent): executes the query and annotates each plan node with real row counts, timings, and loop counts. This is the ground truth, but it carries the full cost, and any side effects, of running the query.
EXPLAIN ANALYZE runs the full query, including data modifications. Never run EXPLAIN ANALYZE on INSERT, UPDATE, or DELETE in production without wrapping in a transaction you'll roll back: BEGIN; EXPLAIN ANALYZE DELETE FROM users; ROLLBACK;
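Spelling out the safe pattern from the warning above (the WHERE clause is a hypothetical example):

```sql
-- Wrap EXPLAIN ANALYZE of a write in a transaction you roll back,
-- so the data modification never commits
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM users WHERE last_login < now() - interval '5 years';
ROLLBACK;
```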
When to Use Each Mode:
| Situation | Use |
|---|---|
| Quick check of index usage | EXPLAIN (estimated) |
| Understanding why optimizer chose a plan | EXPLAIN (estimated) |
| Diagnosing actual performance problems | EXPLAIN ANALYZE (actual) |
| Comparing estimated vs actual row counts | EXPLAIN ANALYZE (actual) |
| Production investigation of SELECT queries | EXPLAIN ANALYZE (actual, carefully) |
| Production investigation of write queries | EXPLAIN (estimated only), test on replica with ANALYZE |
The analysis workflow: capture the plan (with ANALYZE and BUFFERS where it is safe to execute), find the nodes where the time actually goes, compare estimated versus actual row counts, check for red flags such as unexpected sequential scans or disk spills, apply a fix, and re-explain to confirm the plan changed.
PostgreSQL provides the most detailed and flexible EXPLAIN output of any open-source database. Mastering PostgreSQL's EXPLAIN is a transferable skill—the concepts apply to all databases even when syntax differs.
```sql
-- Basic estimated plan
EXPLAIN SELECT * FROM orders WHERE status = 'pending';

-- With execution statistics (RUNS THE QUERY)
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';

-- With buffer usage statistics (I/O details)
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE status = 'pending';

-- JSON format for programmatic parsing
EXPLAIN (FORMAT JSON, ANALYZE, BUFFERS) SELECT * FROM orders WHERE status = 'pending';

-- Show planning time and configuration
EXPLAIN (ANALYZE, BUFFERS, SETTINGS, TIMING) SELECT * FROM orders WHERE status = 'pending';

-- Verbose output with schema-qualified names
EXPLAIN (VERBOSE) SELECT * FROM orders WHERE status = 'pending';
```

Component-by-Component Breakdown:
Let's dissect a real EXPLAIN ANALYZE output:
```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'USA' AND o.total > 1000;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Hash Join  (cost=85.50..294.35 rows=127 width=48) (actual time=0.892..2.341 rows=89 loops=1)
   Hash Cond: (o.customer_id = c.id)
   Buffers: shared hit=47
   ->  Seq Scan on orders o  (cost=0.00..189.00 rows=1523 width=16) (actual time=0.012..0.987 rows=1456 loops=1)
         Filter: (total > 1000)
         Rows Removed by Filter: 8544
         Buffers: shared hit=30
   ->  Hash  (cost=73.75..73.75 rows=940 width=40) (actual time=0.834..0.835 rows=892 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 65kB
         Buffers: shared hit=17
         ->  Seq Scan on customers c  (cost=0.00..73.75 rows=940 width=40) (actual time=0.007..0.512 rows=892 loops=1)
               Filter: (country = 'USA'::text)
               Rows Removed by Filter: 9108
               Buffers: shared hit=17
 Planning Time: 0.215 ms
 Execution Time: 2.432 ms
(16 rows)
```

When loops > 1, multiply the actual time and rows by loops to get totals. A node showing 'actual time=0.01..0.02 rows=1 loops=10000' actually takes 200ms total (0.02 × 10000) and produces 10,000 rows. Missing this is a common analysis mistake.
The BUFFERS Option—Essential for I/O Analysis:
The BUFFERS output reveals I/O patterns that timing alone doesn't capture:
| Metric | Meaning |
|---|---|
| shared hit | Pages read from buffer cache (fast) |
| shared read | Pages read from disk into cache (slow) |
| shared dirtied | Pages modified in cache |
| shared written | Pages this query's backend wrote to disk (usually while evicting dirty buffers) |
| temp read/written | Temporary file I/O (sorts/hashes spilled to disk) |
A query showing high temp read/written is running out of memory and spilling to disk—a major red flag.
A query with all shared hit (no shared read) is fully cached—great for performance but may mask I/O problems that appear under cache pressure.
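A minimal sketch of checking for a disk spill, using a hypothetical large ORDER BY; the work_mem value is illustrative, not a recommendation:

```sql
-- Look for "Sort Method: external merge  Disk: ..." and temp read/written
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders ORDER BY total DESC;

-- If the sort spills, try a larger sort/hash memory budget for this session
-- and re-explain; an in-memory sort reports "Sort Method: quicksort"
SET work_mem = '256MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders ORDER BY total DESC;
```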
MySQL's EXPLAIN output differs structurally from PostgreSQL's, using a table-based format that shows one row per table in the query. Understanding MySQL's unique terminology is essential for MySQL optimization.
```sql
-- Traditional tabular format
EXPLAIN SELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'USA';

-- JSON format (more detail)
EXPLAIN FORMAT=JSON SELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'USA';

-- Tree format (8.0.16+) - shows actual plan tree
EXPLAIN FORMAT=TREE SELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'USA';

-- With execution statistics (8.0.18+) - RUNS THE QUERY
EXPLAIN ANALYZE SELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'USA';
```

| Column | Description | What to Look For |
|---|---|---|
| id | Query block identifier; same id = same SELECT | Subqueries and UNIONs have different ids |
| select_type | Type of SELECT clause | SIMPLE is good; DEPENDENT SUBQUERY is often slow |
| table | Table being accessed | Order indicates join sequence |
| type | Access type (crucial!) | ALL (full scan) is worst; eq_ref, const are best |
| possible_keys | Indexes that could be used | Empty means no usable indexes |
| key | Index actually chosen | NULL means no index used |
| key_len | Bytes of index used | Shows how much of composite index is used |
| ref | Columns/constants compared to index | Reveals join conditions |
| rows | Estimated rows to examine | Multiply across joins for total work |
| filtered | Percentage passing table condition | Low values indicate heavy filtering |
| Extra | Additional optimizer information | Many important flags here (see below) |
The 'type' Column—Access Type Hierarchy:
MySQL's type column is the most important indicator of scan efficiency. From best to worst: system, const, eq_ref, ref, ref_or_null, index_merge, range, index, ALL.
General rule: If you see type: ALL on a large table, you have a problem. Either add an index or accept that the query will be slow.
Seeing both 'Using filesort' and 'Using temporary' typically indicates a query that will perform poorly at scale. The database is creating a temporary table AND sorting it—two expensive operations that bypass indexes entirely.
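As an illustration, here is a query shape that commonly produces both flags: grouping by one column while ordering by an aggregate. The orders table and its columns follow the examples used earlier on this page.

```sql
-- Extra frequently shows "Using temporary; Using filesort" for this shape:
-- MySQL materializes the grouped rows in a temporary table, then sorts them,
-- because no single index can satisfy both the GROUP BY and the ORDER BY
EXPLAIN
SELECT customer_id, SUM(total) AS lifetime_total
FROM orders
GROUP BY customer_id
ORDER BY lifetime_total DESC;
```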
After reviewing hundreds of slow query investigations, certain patterns emerge repeatedly. Learning to spot these red flags accelerates diagnosis from hours to minutes.
The Cardinality Error Cascade:
The most insidious performance problems stem from cascading cardinality errors. Consider:
```
Hash Join (estimated 100 rows, actual 500,000 rows)
├── Index Scan on a (estimated 50 rows, actual 50 rows)      ← accurate
└── Index Scan on b (estimated 2 rows, actual 10,000 rows)   ← 5000x error!
```
The optimizer estimated the join would produce 100 rows (50 × 2). It actually produced 500,000 (50 × 10,000). Because it expected 100 rows, it may have chosen an algorithm (nested loop) that's catastrophic for 500,000 rows.
The fix: Update statistics on table b. If correlations are involved, consider extended/multi-column statistics.
When actual rows exceeds estimated rows by more than 10x, treat it as a critical problem. The optimizer's entire strategy is based on wrong assumptions. Run ANALYZE to update statistics and re-explain.
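A minimal sketch of that fix in PostgreSQL syntax, keeping the misestimated table name b from the example above; the correlated column names are hypothetical:

```sql
-- Refresh planner statistics for the misestimated table
ANALYZE b;

-- If two columns are correlated (e.g., city and postal_code), give the
-- planner multi-column statistics so it stops multiplying their
-- selectivities as if they were independent (PostgreSQL 10+)
CREATE STATISTICS b_city_postal_stats (dependencies)
  ON city, postal_code FROM b;
ANALYZE b;

-- Then re-run EXPLAIN ANALYZE and compare estimated vs actual rows again
```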
For multi-table queries, join order is often the difference between milliseconds and minutes. The optimizer tries to find the optimal order, but it can fail—especially with complex queries or stale statistics.
How to Identify Join Order Problems:
Look at intermediate result sizes: After each join, check actual row counts. If they explode unexpectedly, earlier tables may be in wrong order.
Check which table drives the join: The first table (outermost in the plan) should be the one that filters down to the fewest rows earliest.
Look for 'loops' multipliers: High loop counts on an operator mean it's being called repeatedly—it's on the inner side of a nested loop driven by a large outer side.
Example of Poor Join Order:
```
-- Query: Find orders for VIP customers in last 30 days
-- Tables: customers (10M rows), orders (100M rows), order_items (500M rows)
-- VIP customers: 1,000. Orders in last 30 days: 10M.

-- POOR PLAN (driving from orders):
Nested Loop
├── Seq Scan on orders (filter: date > now()-30)
│     actual rows: 10,000,000            ← driving 10M iterations!
└── Index Scan on customers (filter: vip = true)
      actual rows: 1, loops: 10,000,000  ← runs 10M times

-- OPTIMAL PLAN (driving from customers):
Nested Loop
├── Index Scan on customers (filter: vip = true)
│     actual rows: 1,000                 ← driving only 1K iterations
└── Index Scan on orders (filter: customer_id = c.id AND date > now()-30)
      actual rows: 50, loops: 1,000      ← runs 1K times, returns ~50K total
```

Forcing Better Join Orders:
When the optimizer chooses wrong, you have options:
| Approach | Description | When to Use |
|---|---|---|
| Update statistics | ANALYZE/VACUUM ANALYZE | First thing to try |
| Add appropriate indexes | Create indexes the optimizer can use | When missing index causes seq scan |
| Create multi-column statistics | For correlated columns | When correlations fool the optimizer |
| Use query hints | STRAIGHT_JOIN (MySQL), join hints (SQL Server) | Last resort, high maintenance |
| Rewrite query | Subqueries, CTEs, temp tables | When hints aren't available |
A word on hints: Hints lock you into a specific plan that may become suboptimal as data changes. Use sparingly and document why.
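For example, a minimal sketch of MySQL's STRAIGHT_JOIN, which joins tables in the order they are written; the column names follow the VIP-customers example above and are illustrative:

```sql
-- Force MySQL to drive the join from customers (listed first)
-- instead of letting the optimizer choose the order
SELECT STRAIGHT_JOIN c.name, o.total
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE c.vip = TRUE
  AND o.order_date > NOW() - INTERVAL 30 DAY;
```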
PostgreSQL's join_collapse_limit (default 8) controls how many tables the optimizer fully evaluates for join order. For queries joining more tables, it uses heuristics. You can raise this for complex queries, but optimization time increases factorially. The geqo_threshold controls when the genetic query optimizer takes over for very complex joins.
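A sketch of inspecting and raising those limits for a single session; the values are illustrative, not recommendations:

```sql
-- Check the current limits
SHOW join_collapse_limit;
SHOW geqo_threshold;

-- Let the planner exhaustively consider join orders for up to 12 tables
-- in this session only; planning time grows factorially with this value
SET join_collapse_limit = 12;

-- Raise the point at which the genetic query optimizer takes over
SET geqo_threshold = 14;
```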
Analyzing execution plans is a skill that improves with practice. Having a systematic methodology ensures you don't miss critical insights and makes your analysis reproducible.
Documentation Template for Plan Analysis:
When documenting your analysis (which you should do for significant queries), include:
Query: [SQL text]
Context: [Why this query matters, when it runs]
Problem: [e.g., "Times out during peak hours"]
Execution Summary:
- Total time: X ms
- Planning time: Y ms
- Peak memory: Z MB
Key Findings:
1. [e.g., "Seq scan on 'orders' table reading 10M rows"]
2. [e.g., "Cardinality estimate at join node off by 100x"]
Root Cause:
[e.g., "Missing index on orders.customer_id combined with stale statistics"]
Recommendation:
[e.g., "CREATE INDEX idx_orders_customer_id ON orders(customer_id); ANALYZE orders;"]
This systematic documentation helps with knowledge sharing and prevents recurrence.
For complex plans, visual tools dramatically improve understanding. PostgreSQL's EXPLAIN (FORMAT JSON) output can be pasted into tools like pev2.toast.dev, explain.depesz.com, or explain.dalibo.com for interactive visualization. MySQL Workbench and SQL Server Management Studio have built-in graphical plan viewers.
Beyond basic plan reading, several advanced techniques help diagnose subtle performance issues.
Comparing Plans for the Same Query:
When a query performs differently at different times, capture both plans and diff them:
This technique reveals plan instability, which often stems from stale statistics, parameter-sensitive plans (bind peeking), data volumes crossing planner thresholds, or configuration differences between sessions.
```sql
-- Capture plans with auto_explain for production diagnosis
-- In postgresql.conf or per-session:
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '1000';  -- Log plans taking >1s
SET auto_explain.log_analyze = true;
SET auto_explain.log_buffers = true;
SET auto_explain.log_format = 'json';

-- Or manually capture plans with timestamps
CREATE TABLE plan_snapshots (
    captured_at       TIMESTAMPTZ DEFAULT now(),
    query_text        TEXT,
    plan_json         JSONB,
    execution_time_ms NUMERIC
);

-- Insert plan snapshots for comparison. EXPLAIN cannot be used as a
-- subquery, so capture the JSON plan first (from the auto_explain log
-- or a client running EXPLAIN (FORMAT JSON, ANALYZE)) and insert it:
INSERT INTO plan_snapshots (query_text, plan_json, execution_time_ms)
VALUES ('SELECT ...',             -- the query being tracked
        '<captured plan JSON>',   -- paste the captured JSON plan here
        NULL);                    -- fill in the measured execution time
```

Testing Alternative Plans:
Sometimes you want to test what would happen with a different plan. You can influence plan choice through:
PostgreSQL settings (session-level):
```sql
SET enable_seqscan = off;    -- Discourage seq scans
SET enable_hashjoin = off;   -- Discourage hash joins
SET enable_nestloop = off;   -- Discourage nested loops
SET work_mem = '256MB';      -- Increase sort/hash memory
SET random_page_cost = 1.1;  -- Tune for SSD
```
MySQL hints:
```sql
SELECT /*+ NO_INDEX(orders) */ * FROM orders WHERE status = 'pending';
SELECT /*+ BNL(customers, orders) */ * FROM customers JOIN orders ...;
```
SQL Server hints:
```sql
SELECT * FROM orders WITH (INDEX(idx_status)) WHERE status = 'pending';
-- OPTION clauses are appended to the end of a statement:
OPTION (HASH JOIN, FORCE ORDER)
```
Use these for testing, not as permanent solutions. If the optimizer needs hints to find a good plan, the underlying issue (statistics, indexes, query structure) should be fixed.
Settings like 'enable_seqscan = off' affect ALL queries in the session, not just your test query. Reset them immediately after testing with RESET enable_seqscan;, or use SET LOCAL within a transaction so the change reverts automatically at commit or rollback.
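A minimal sketch of the transaction-scoped approach:

```sql
BEGIN;
-- Applies only inside this transaction
SET LOCAL enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE status = 'pending';
ROLLBACK;  -- or COMMIT; either way, enable_seqscan reverts automatically
```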
EXPLAIN analysis is the foundational skill for database performance work. The essentials from this page: know whether you are reading estimated or actual numbers; use ANALYZE (and BUFFERS) when you need ground truth, wrapping writes in a transaction you roll back; compare estimated versus actual row counts and treat large gaps as a statistics problem; watch for red flags such as full scans on large tables, disk spills, filesort/temporary operations, and exploding loop counts; and confirm every fix by re-explaining.
What's Next:
With the ability to read and interpret execution plans, you're ready to focus on specific optimization techniques. The next page covers Index Optimization—how to design, analyze, and maintain indexes for maximum query performance.
You now have the practical skills to read and interpret EXPLAIN output from any major database. This diagnostic ability is the foundation for all query optimization work. Practice by explaining your own queries—even fast ones—to build intuition for what good plans look like.