An execution plan (also called a query plan or explain plan) is the detailed blueprint that tells the database engine exactly how to execute your query. It's the output of the optimization phase—a tree of operations that, when executed from leaves to root, produces the query results.
Execution plans are indispensable for performance tuning. They reveal:
Learning to read execution plans transforms you from someone who hopes queries are fast into someone who knows why they are—or aren't—and can fix them. This page teaches you to decode execution plans across major databases and use them for systematic performance analysis.
By the end of this page, you'll understand execution plan structure, recognize common operators, know how to generate plans in major databases, and confidently interpret plan output to diagnose performance bottlenecks. You'll develop the skill to look at a plan and immediately spot potential problems.
An execution plan is a tree structure where:
Execution flows from leaves to root. Each node receives data from its children, performs its operation, and passes results to its parent.
Key Plan Properties:
Every plan node typically includes:
Reading the Plan:
In the diagram above:
Plan Execution Models:
Databases execute plans using different models:
Volcano/Iterator Model: Each operator is an iterator that produces one row at a time when next() is called. Simple and flexible, but it incurs function-call overhead for every row.
Push-Based Model: Data is pushed from producers to consumers. Better for pipelining and parallel execution.
Vectorized/Batch Model: Operators process batches of rows (vectors) instead of single rows. Reduces interpreter overhead; enables SIMD optimizations. Used by modern columnar engines.
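The two pull-based styles can be sketched with Python generators. This is a minimal illustration with made-up operator names (scan, filter_op, and so on), not code from any real engine; the point is that the Volcano model moves one row per call while the vectorized model moves a whole batch per call.

```python
def scan(rows):
    # Volcano/iterator leaf: produce rows one at a time
    for row in rows:
        yield row

def filter_op(child, predicate):
    # Pulls one row from its child per iteration (one call per row)
    for row in child:
        if predicate(row):
            yield row

def scan_batches(rows, batch_size=1024):
    # Vectorized leaf: produce whole batches of rows
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]

def filter_batches(child, predicate):
    # One call per batch; the per-row work happens in a tight inner loop
    for batch in child:
        kept = [row for row in batch if predicate(row)]
        if kept:
            yield kept

rows = list(range(10))
volcano = list(filter_op(scan(rows), lambda r: r % 2 == 0))
vectorized = [r for b in filter_batches(scan_batches(rows, 4), lambda r: r % 2 == 0)
              for r in b]
print(volcano)     # [0, 2, 4, 6, 8]
print(vectorized)  # [0, 2, 4, 6, 8]
```

Both pipelines produce identical results; the batch version simply amortizes the per-call overhead across many rows, which is what enables the SIMD-friendly inner loops of columnar engines.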
While all databases produce execution plans, the format and terminology differ significantly. PostgreSQL shows nested indentation; Oracle uses a table format; SQL Server shows graphical plans in SSMS. The concepts are universal, but you'll need to learn the specific vocabulary of your database.
Every major database provides commands to view execution plans. There are two types of plan output:
1. Estimated Plan: Shows what the optimizer expects to happen based on statistics. Generated without executing the query.
2. Actual Plan: Shows what actually happened during execution, including real row counts and timing. Requires running the query.
| Database | Estimated Plan | Actual Plan | Notes |
|---|---|---|---|
| PostgreSQL | EXPLAIN query; | EXPLAIN ANALYZE query; | Add BUFFERS, FORMAT JSON for more detail |
| MySQL | EXPLAIN query; | EXPLAIN ANALYZE query; (8.0+) | Use EXPLAIN FORMAT=TREE for hierarchical view |
| Oracle | EXPLAIN PLAN FOR query; | Use DBMS_XPLAN with GATHER_PLAN_STATISTICS | Plan stored in PLAN_TABLE |
| SQL Server | SET SHOWPLAN_TEXT ON; | SET STATISTICS PROFILE ON; | SSMS provides graphical plans |
| SQLite | EXPLAIN QUERY PLAN query; | No actual plan support | Simplified output |
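The SQLite row of the table is easy to try from Python's standard library. This is a quick demonstration with made-up table and index names; EXPLAIN QUERY PLAN returns rows of (id, parent, notused, detail), where the detail string names the chosen access path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.execute("CREATE INDEX idx_salary ON employees (salary)")

# The detail column (index 3) describes the access path SQLite chose
details = [row[3] for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM employees WHERE salary > 50000")]
for d in details:
    print(d)  # e.g. SEARCH employees USING INDEX idx_salary (salary>?)
```

The exact wording of the detail string varies across SQLite versions, but a range predicate on an indexed column should show a SEARCH using idx_salary rather than a full SCAN.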
PostgreSQL EXPLAIN Examples:
```sql
-- Basic estimated plan
EXPLAIN
SELECT e.name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE e.salary > 50000;

-- Actual plan with execution statistics
EXPLAIN ANALYZE
SELECT e.name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE e.salary > 50000;

-- Comprehensive output with buffers and timing
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
SELECT e.name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE e.salary > 50000;

-- JSON format for programmatic parsing
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT e.name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE e.salary > 50000;
```

Sample EXPLAIN Output (PostgreSQL):
```
QUERY PLAN
---------------------------------------------------------------------------
 Hash Join  (cost=14.50..58.75 rows=125 width=64) (actual time=0.524..1.235 rows=142 loops=1)
   Hash Cond: (e.department_id = d.id)
   ->  Seq Scan on employees e  (cost=0.00..42.50 rows=125 width=48) (actual time=0.015..0.687 rows=142 loops=1)
         Filter: (salary > 50000)
         Rows Removed by Filter: 858
   ->  Hash  (cost=11.00..11.00 rows=200 width=20) (actual time=0.285..0.286 rows=200 loops=1)
         Buckets: 256  Batches: 1  Memory Usage: 15kB
         ->  Seq Scan on departments d  (cost=0.00..11.00 rows=200 width=20) (actual time=0.008..0.145 rows=200 loops=1)
 Planning Time: 0.285 ms
 Execution Time: 1.352 ms
```

Unlike plain EXPLAIN, EXPLAIN ANALYZE executes the query to gather real statistics. For INSERT, UPDATE, or DELETE statements, wrap it in a transaction and ROLLBACK if you don't want side effects: BEGIN; EXPLAIN ANALYZE DELETE FROM ...; ROLLBACK;
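The JSON format exists precisely for programmatic consumption. Here's a minimal sketch of walking such a plan tree with the standard library; the plan below is an abbreviated, hand-written stand-in for real output, but the key names ("Plan", "Node Type", "Total Cost", "Plans") are the ones PostgreSQL emits.

```python
import json

plan_json = """
[{"Plan": {"Node Type": "Hash Join", "Total Cost": 58.75, "Plans": [
    {"Node Type": "Seq Scan", "Relation Name": "employees", "Total Cost": 42.50},
    {"Node Type": "Hash", "Total Cost": 11.00, "Plans": [
        {"Node Type": "Seq Scan", "Relation Name": "departments", "Total Cost": 11.00}]}]}}]
"""

def walk(node, depth=0):
    # Yield every node in the plan tree together with its nesting depth
    yield depth, node["Node Type"], node["Total Cost"]
    for child in node.get("Plans", []):
        yield from walk(child, depth + 1)

nodes = list(walk(json.loads(plan_json)[0]["Plan"]))
for depth, node_type, cost in nodes:
    print("  " * depth + f"{node_type} (total cost {cost})")
```

The same traversal works on output captured from psql or a driver, which makes it easy to build automated checks (for example, alerting when a Seq Scan appears over a large table).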
Execution plans consist of operators that perform specific operations. Understanding these operators is essential for reading plans effectively.
| Operator | Description | When It Appears |
|---|---|---|
| Sort | Sort rows by specified columns | ORDER BY, or to prepare for merge join/group |
| Aggregate | Compute aggregates (SUM, COUNT, AVG) | Aggregate functions in SELECT |
| Group By / HashAggregate | Group rows and compute per-group aggregates | GROUP BY clauses |
| Limit | Return only first N rows | LIMIT/TOP clauses |
| Unique | Remove duplicate rows | DISTINCT or UNION (not UNION ALL) |
| Filter | Apply predicate to filter rows | WHERE/HAVING conditions |
| Materialize | Store intermediate results in memory/disk | When results needed multiple times |
| Append | Concatenate results from multiple sources | UNION ALL, partitioned tables |
| Subquery Scan | Execute subquery and return results | Subqueries, CTEs |
| Window Aggregate | Compute window functions | OVER() clauses in SELECT |
Operator Characteristics:
Operators have different characteristics that affect performance:
Blocking vs. Non-Blocking:
Memory Usage:
Parallelism:
Blocking operators like Sort and HashAggregate must process all input before producing output. If you see these operators on very large intermediate results, they may be performance bottlenecks consuming excessive memory or spilling to disk.
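The blocking distinction is easy to see with generators. In this tiny sketch (operator names are illustrative, not tied to any engine), the filter yields its first row immediately, while the sort cannot emit anything until it has drained its entire input:

```python
def source():
    yield from [3, 1, 2]

def filter_op(child):
    # Non-blocking: each input row flows straight through
    for row in child:
        yield row

def sort_op(child):
    # Blocking: sorted() must consume the whole child generator
    # before the first output row can be produced
    yield from sorted(child)

first_streamed = next(filter_op(source()))  # available immediately: 3
first_sorted = next(sort_op(source()))      # only after reading all input: 1
print(first_streamed, first_sorted)
```

This is why a Sort over a huge intermediate result delays the entire pipeline above it and may spill to disk, while a Filter adds almost no latency.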
Execution plan costs are estimates in arbitrary units that represent expected resource consumption. Understanding cost numbers helps you compare plans and identify expensive operations.
PostgreSQL Cost Format:
PostgreSQL shows costs as (cost=startup..total rows=N width=W)
```
Hash Join  (cost=14.50..58.75 rows=125 width=64)
            ^^^^^^^^  ^^^^^ ^^^       ^^
            |         |     |         |
            |         |     |         Average row size: 64 bytes
            |         |     Estimated 125 rows returned
            |         Total cost: 58.75 units
            Startup cost: 14.50 units (build hash table first)

->  Seq Scan on employees e  (cost=0.00..42.50 rows=125 width=48)
                              ^^^^  ^^^^^ ^^^
                              |     |     |
                              |     |     125 rows expected
                              |     Total cost: 42.50
                              Startup cost: 0 (pipelined operation)
```

Cost Calculation Factors:
PostgreSQL's planner uses these base costs (configurable):
| Parameter | Default | Meaning |
|---|---|---|
| seq_page_cost | 1.0 | Cost to read one page sequentially |
| random_page_cost | 4.0 | Cost for random I/O (seek) |
| cpu_tuple_cost | 0.01 | Cost to process one tuple |
| cpu_index_tuple_cost | 0.005 | Cost to process one index entry |
| cpu_operator_cost | 0.0025 | Cost to apply an operator |
| parallel_tuple_cost | 0.1 | Cost to transfer tuple to parallel worker |
Example Cost Breakdown:
For a sequential scan of a 1000-page table with 100,000 rows, applying a filter:
I/O Cost: 1000 pages × 1.0 = 1000
Tuple Cost: 100,000 rows × 0.01 = 1000
Filter Cost: 100,000 rows × 0.0025 = 250
Total Estimated Cost: 2250
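The same arithmetic, spelled out with the default planner parameters from the table above:

```python
# Default PostgreSQL planner cost parameters (see table above)
seq_page_cost = 1.0
cpu_tuple_cost = 0.01
cpu_operator_cost = 0.0025

pages = 1_000     # table size in pages
rows = 100_000    # row count

io_cost = pages * seq_page_cost          # 1000.0
tuple_cost = rows * cpu_tuple_cost       # 1000.0
filter_cost = rows * cpu_operator_cost   # 250.0
total_cost = io_cost + tuple_cost + filter_cost

print(total_cost)  # 2250.0
```

Changing random_page_cost (for example, lowering it toward 1.0 on SSD storage) shifts these totals and can tip the planner between index scans and sequential scans.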
Cost numbers are not milliseconds or any real time unit—they're arbitrary units for comparing plans. A cost of 1000 doesn't mean 1000ms. Use costs to compare alternative plans for the same query, not to predict actual execution time.
Estimated vs. Actual: The Critical Comparison
When using EXPLAIN ANALYZE, compare estimated and actual rows:
```
-- GOOD: estimates match reality
Index Scan on orders  (cost=0.42..8.44 rows=1 width=45) (actual time=0.028..0.029 rows=1 loops=1)
                                                                                 ^^^^^^
                                                                 Close to the estimated 1!

-- BAD: major underestimate (optimizer likely chose the wrong plan)
Seq Scan on log_entries  (cost=0.00..15.00 rows=10 width=64) (actual time=45.000..2500.000 rows=500000 loops=1)
                                                                                           ^^^^^^^^^^^
                                                                   50,000x more than estimated!

-- A mismatch this large usually means one of:
-- 1. Statistics are stale (run ANALYZE)
-- 2. Correlated columns not captured in the statistics
-- 3. A complex expression produced unexpected selectivity
```

If actual rows differ from estimated by more than 10x, the optimizer likely chose a suboptimal plan. Investigate missing indexes, stale statistics, or query reformulation. Even if the query is 'fast enough,' large mismatches suggest future problems at scale.
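The 10x rule of thumb is simple enough to automate. This hedged helper (an illustrative function, not part of any real tool) returns the mismatch ratio when the estimate and actual row counts diverge too far, in either direction:

```python
def estimate_mismatch(estimated_rows, actual_rows, factor=10):
    """Return the estimate/actual ratio if it exceeds `factor`, else None."""
    if estimated_rows <= 0 or actual_rows <= 0:
        return None  # zero-row estimates need separate investigation
    ratio = max(estimated_rows, actual_rows) / min(estimated_rows, actual_rows)
    return ratio if ratio > factor else None

print(estimate_mismatch(1, 1))          # None: estimate matches reality
print(estimate_mismatch(10, 500_000))   # 50000.0: investigate statistics
```

Applied to every node of an EXPLAIN ANALYZE tree, a check like this surfaces the exact operator where the optimizer's model diverged from reality.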
Execution plans reveal many common performance problems. Here's what to look for:
Case Study: Finding the Problem
```sql
-- Slow query: why is this taking 30 seconds?
EXPLAIN ANALYZE
SELECT c.name, SUM(o.total)
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.order_date > '2024-01-01'
GROUP BY c.name;
```

Plan output:

```
GroupAggregate  (cost=85421.50..86421.50 rows=10000 width=40) (actual time=28542.123..29876.456 rows=8542 loops=1)
  Group Key: c.name
  ->  Sort  (cost=85421.50..85671.50 rows=100000 width=40) (actual time=28501.234..28765.789 rows=95234 loops=1)
        ⚠️ Sort Method: external merge  Disk: 12345kB ⚠️
        Sort Key: c.name
        ->  Hash Join  (cost=1025.00..75421.50 rows=100000 width=40) (actual time=15.234..27543.567 rows=95234 loops=1)
              Hash Cond: (o.customer_id = c.id)
              ->  Seq Scan on orders o  (cost=0.00..65000.00 rows=100000 width=20) (actual time=0.050..26543.234 rows=95234 loops=1)
                    ⚠️ Filter: (order_date > '2024-01-01') ⚠️
                    ⚠️ Rows Removed by Filter: 4500000 ⚠️
              ->  Hash  (cost=650.00..650.00 rows=30000 width=24) (actual time=12.345..12.346 rows=30000 loops=1)
                    ->  Seq Scan on customers c ...

-- Analysis:
-- 1. Full scan of orders (4.5M rows!) to find 95K matching rows
--    PROBLEM: missing index on order_date
-- 2. Sort spilling to disk (12MB)
--    PROBLEM: work_mem too low, or use an index to get sorted output
-- 3. The join itself looks OK (a hash join is appropriate here)
```

The 'Rows Removed by Filter' statistic is gold for optimization. If millions of rows are removed, the filter is being applied too late. Create an index to push the filter earlier, or check why the optimizer isn't using an existing index.
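Checks like the one in the analysis can be scripted. This illustrative snippet (an assumed helper, not a real tool) scans EXPLAIN ANALYZE text for filters that discard far more rows than they keep:

```python
import re

# Abbreviated excerpt of the plan text above
plan_text = """
->  Seq Scan on orders o  (actual time=0.050..26543.234 rows=95234 loops=1)
      Filter: (order_date > '2024-01-01')
      Rows Removed by Filter: 4500000
"""

# Flag any node that threw away more than a million rows post-scan
flagged = [int(n) for n in re.findall(r"Rows Removed by Filter: (\d+)", plan_text)
           if int(n) > 1_000_000]
for removed in flagged:
    print(f"filter discarded {removed:,} rows late -> consider indexing the filtered column")
```

Run against your slowest queries' plans, a scan like this turns a manual reading exercise into a repeatable triage step.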
While execution plan concepts are universal, each database has unique plan features and terminology. Here's a comparison of major systems:
PostgreSQL Plan Features:
```sql
-- Comprehensive PostgreSQL plan
EXPLAIN (ANALYZE, BUFFERS, TIMING, VERBOSE)
SELECT * FROM orders WHERE total > 1000;

-- Check parallel plan
EXPLAIN SELECT COUNT(*) FROM large_table;
-- Look for: Gather, Parallel Seq Scan, Workers Planned/Launched
```

Here's a systematic approach to analyzing execution plans for performance tuning:
When optimizing, make one change (add an index, rewrite a predicate, update statistics) and re-examine the plan. Multiple simultaneous changes make it impossible to understand what helped and what didn't.
Execution plans are your window into the optimizer's decisions and the actual work performed by your queries. Let's consolidate the key concepts:
What's Next:
With the execution plan generated, the database engine proceeds to result retrieval—actually executing the plan and delivering data to the client. The next page explores how data flows through the execution engine, how buffering and network transfer work, and how to optimize result delivery for different use cases.
You now understand how to read and interpret SQL execution plans. This skill is essential for performance troubleshooting—you can see exactly what the database is doing and identify optimization opportunities. Next, we'll explore how the execution engine processes the plan and delivers results.