Imagine you're planning a cross-country trip. You could drive, fly, take a train, or some combination. Each option has different costs: money, time, convenience, environmental impact. To make a rational choice, you need to estimate these costs before committing to a route.
Database query optimizers face a similar challenge. For a single query, there might be dozens or hundreds of possible execution strategies. Different join orders, different access methods, different algorithms. The optimizer must choose one strategy before executing the query—it can't try them all and pick the fastest.
To make this choice, optimizers rely on cost estimates—predictions of how expensive each strategy will be. These estimates appear in EXPLAIN output as abstract numbers that confuse many developers. But understanding cost estimation transforms you from someone who reads execution plans to someone who reasons about why plans are chosen and how to influence optimizer decisions.
By the end of this page, you will understand what cost numbers represent, how databases calculate them, the inputs that influence cost (statistics, configuration parameters), why costs can be wrong, and how to use cost information effectively for query optimization.
Let's start by demystifying what cost numbers in execution plans actually represent.
Cost Is NOT Time
The most important thing to understand: cost values are not milliseconds, seconds, or any real time unit. They are abstract, relative measures calibrated to make comparisons possible.
The Fundamental Unit: Sequential Page Reads
Most database cost models are calibrated relative to a single sequential page read from disk. This is assigned a cost of 1.0. All other operations are measured relative to this baseline:
| Operation | Typical Cost Multiplier |
|---|---|
| Sequential page read | 1.0 (baseline) |
| Random page read | 4.0 (random I/O is slower) |
| CPU processing per row | 0.01 |
| CPU processing per operator | 0.0025 |
| Parallel worker communication | variable |
Why Abstract Units?
Real I/O times vary enormously: HDDs vs. SSDs, cached vs. cold data, sequential vs. random access. Rather than predict actual time (which is impossible), cost models predict relative resource consumption. A plan with cost 1000 should be roughly 10x more expensive than a plan with cost 100.
```sql
-- EXPLAIN output with cost breakdown
EXPLAIN (COSTS ON) SELECT * FROM employees WHERE salary > 50000;

-- QUERY PLAN
-- -----------------------------------------------------------------
-- Seq Scan on employees  (cost=0.00..289.00 rows=3330 width=64)
--   Filter: (salary > 50000)

-- Cost breakdown: cost=startup..total
--   startup=0.00 : Cost before first row is returned
--   total=289.00 : Cost to return all rows

-- How 289.00 is calculated (approximately):
--   - Table has 145 pages (blocks)
--   - seq_page_cost = 1.0 (per page read)
--   - cpu_tuple_cost = 0.01 (per row processed)
--   - cpu_operator_cost = 0.0025 (per row for filter)
--
-- Total ≈ (145 pages × 1.0) + (10000 rows × 0.01) + (10000 rows × 0.0025)
-- Total ≈ 145 + 100 + 25 = 270 (approximately)
-- (Exact formula varies by version and statistics)
```

Never say "this query costs 289 milliseconds." Instead, say "this query has an estimated cost of 289 units" or compare "Plan A costs 3x more than Plan B." Cost is meaningful only for relative comparison, not absolute performance prediction.
Every operation in an execution plan has two cost components shown as (cost=startup..total):
Startup Cost: Work that must be completed before the first row can be returned. This represents blocking operations or initialization overhead.
Total Cost: Work to complete the entire operation and return all rows.
The distinction matters enormously for understanding query behavior:
| Operator | Startup Cost | Why It Matters |
|---|---|---|
| Seq Scan | Near zero | Can return first row immediately as it reads |
| Index Scan | Low | Small overhead to navigate index before first row |
| Sort | High (≈ total) | Must read ALL input before outputting ANY row |
| Hash (for join) | High | Must build entire hash table before probing |
| Hash Join | Moderate | Build phase completes before matches flow |
| Merge Join | Depends | Low if sorted input, high if Sort added |
| Aggregate | Near total | Must process all input to compute aggregates |
```sql
-- Low startup cost: Streaming operation
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
-- Seq Scan on orders  (cost=0.00..1250.00 rows=500 width=80)
--                           ^^^^
-- Startup=0: First row available immediately as scan begins

-- High startup cost: Sorting blocks output
EXPLAIN SELECT * FROM orders ORDER BY created_at DESC;
-- Sort  (cost=9876.89..10126.89 rows=100000 width=80)
--             ^^^^^^^
-- Startup=9876.89, Total=10126.89
-- Must accumulate ~98% of work before ANY output

-- High startup cost: Hash build phase
EXPLAIN SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;
-- Hash Join  (cost=458.00..2589.00 rows=100000 width=160)
--   Hash Cond: (o.customer_id = c.id)
--   ->  Seq Scan on orders o  (cost=0.00..1250.00 ...)
--   ->  Hash  (cost=333.00..333.00 rows=10000 width=80)
--                   ^^^^^^  ^^^^^^
--       Startup = Total for Hash (must build completely)
--       ->  Seq Scan on customers c  (cost=0.00..283.00 ...)
```

Why Startup Cost Matters:
Time to First Row: For interactive queries where users see results progressively, high startup cost means they wait with blank screens.
LIMIT Optimization: If only a few rows are needed, low startup + streaming can be faster than 'cheaper' plans requiring full processing:
SELECT * FROM huge_table ORDER BY indexed_column LIMIT 10
Pipeline Breaking: High startup operators 'break' the data pipeline—nothing flows until they complete. Multiple such operators in sequence multiply delays.
Cursor/Stream Behavior: Applications using cursors to fetch rows incrementally are penalized by high startup costs but benefit from low-startup streaming operations.
When using LIMIT, prefer execution plans with low startup costs even if total cost is higher. A plan that can produce the first N rows without processing all data wins. The optimizer usually handles this, but understanding why helps when it doesn't.
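The optimizer's LIMIT costing can be sketched as linear interpolation between a plan's startup and total cost. The toy model below (function name and plan numbers are illustrative, loosely echoing the Sort example above, not planner code) shows why a high-startup plan loses badly at LIMIT 10 but wins for a full result:

```python
# Sketch: estimated cost of producing the first n rows of a plan,
# interpolating between startup and total cost. Illustrative only.

def cost_to_fetch(startup, total, plan_rows, n):
    """Startup work happens regardless; the rest scales with the row fraction."""
    n = min(n, plan_rows)
    return startup + (total - startup) * (n / plan_rows)

# Plan A: index scan already in ORDER BY order — streams immediately,
# but random heap I/O makes reading everything expensive
plan_a = dict(startup=0.29, total=20000.0, plan_rows=100_000)
# Plan B: seq scan + explicit sort — high startup, cheaper total
plan_b = dict(startup=9876.89, total=10126.89, plan_rows=100_000)

for n in (10, 100_000):
    a = cost_to_fetch(**plan_a, n=n)
    b = cost_to_fetch(**plan_b, n=n)
    print(f"LIMIT {n}: plan A ~ {a:.2f}, plan B ~ {b:.2f}")
# Plan A wins decisively at LIMIT 10; plan B wins for the full result set.
```

This is why the optimizer can legitimately pick a plan with a higher total cost when a LIMIT is present.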
The cost model requires several inputs to calculate estimates. Understanding these inputs explains why costs are sometimes wildly wrong.
Table and Index Statistics:
The optimizer maintains statistics about data distribution:
| Statistic | Description | Impact on Cost |
|---|---|---|
| Row count (reltuples) | Estimated number of rows in table | Fundamental input to all estimates |
| Page count (relpages) | Number of disk pages table occupies | Drives I/O cost calculation |
| Distinct values (n_distinct) | Unique values per column | Affects GROUP BY and join estimates |
| Most common values (MCV) | Frequent values and their frequencies | Improves selective filter estimates |
| Histogram | Distribution of values in buckets | Range query selectivity |
| Correlation | Physical ordering vs. logical ordering | Index scan random vs. sequential I/O |
```sql
-- View table-level statistics
SELECT relname AS table_name,
       reltuples AS estimated_rows,
       relpages AS pages_on_disk,
       pg_size_pretty(relpages * 8192) AS estimated_size
FROM pg_class
WHERE relname = 'orders';

-- View column-level statistics
SELECT attname AS column_name,
       n_distinct,
       most_common_vals,
       most_common_freqs,
       histogram_bounds,
       correlation
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'status';

-- Sample output:
-- column_name       | status
-- n_distinct        | 5      (5 unique values)
-- most_common_vals  | {pending,completed,shipped,cancelled,refunded}
-- most_common_freqs | {0.35,0.30,0.20,0.10,0.05}
-- correlation       | 0.15   (weak physical-logical ordering)

-- Force statistics update
ANALYZE orders;

-- Increase statistics granularity for a column
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
ANALYZE orders;
```

Configuration Parameters Affecting Cost:
Databases have tunable parameters that adjust cost calculations:
PostgreSQL Cost Parameters:
| Parameter | Default | Effect |
|---|---|---|
| seq_page_cost | 1.0 | Cost per sequential page read |
| random_page_cost | 4.0 | Cost per random page read (index access) |
| cpu_tuple_cost | 0.01 | Cost per row processed |
| cpu_index_tuple_cost | 0.005 | Cost per index entry processed |
| cpu_operator_cost | 0.0025 | Cost per operator evaluation |
| effective_cache_size | 4GB typical | Assumed available cache (affects index preference) |
| parallel_tuple_cost | 0.1 | Cost of passing tuples to parallel workers |
Tuning Strategy:
These parameters should reflect your hardware reality:
- On SSDs, lower random_page_cost (maybe 1.1-2.0), since random reads are nearly as fast as sequential ones
- Set effective_cache_size to reflect the memory actually available for caching (shared buffers plus OS cache)

Adjusting cost parameters without understanding the effects can make optimization worse across your workload. Tune only after profiling and understanding your specific hardware characteristics. Changes affect ALL query planning.
Understanding the mechanics of cost calculation helps explain optimizer behavior and diagnose when estimates go wrong.
Sequential Scan Cost:
Cost = (disk pages × seq_page_cost) + (rows × cpu_tuple_cost)
+ (rows × filter evaluation cost)
Example: a table with 1,000 pages and 100,000 rows costs (1,000 × 1.0) + (100,000 × 0.01) + (100,000 × 0.0025) = 1,000 + 1,000 + 250 = 2,250 units.
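The same arithmetic as a quick sanity check in code (seq_page_cost, cpu_tuple_cost, and cpu_operator_cost are real PostgreSQL defaults; the helper function itself is illustrative, not planner code):

```python
# Sketch of the sequential scan formula, using PostgreSQL's default
# cost constants. Illustrative only.

SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025

def seq_scan_cost(pages, rows, filter_clauses=1):
    io = pages * SEQ_PAGE_COST                        # read every page once
    cpu = rows * CPU_TUPLE_COST                       # process every row
    qual = rows * CPU_OPERATOR_COST * filter_clauses  # evaluate the filter per row
    return io + cpu + qual

# Table with 1,000 pages and 100,000 rows, one filter clause
print(f"{seq_scan_cost(1000, 100_000):.1f}")  # prints 2250.0
```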
Index Scan Cost:
Startup = index height traversal
Per-row = (index_tuple_cost + random_page_cost * correlation_factor)
+ cpu_tuple_cost for heap fetch
The correlation factor is crucial: highly correlated data (physically ordered like the index) means sequential I/O; low correlation means random I/O.
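A simplified sketch of that blending, loosely modeled on how PostgreSQL's cost_index interpolates heap I/O cost using the squared correlation (the helper is illustrative, not the planner's actual formula):

```python
# Sketch: blend index-scan heap I/O cost between the best case
# (correlation ±1.0 → sequential reads) and the worst case
# (correlation 0.0 → fully random reads). Illustrative only.

SEQ_PAGE_COST = 1.0
RANDOM_PAGE_COST = 4.0

def index_heap_io_cost(pages_fetched, correlation):
    min_io = pages_fetched * SEQ_PAGE_COST     # perfectly correlated
    max_io = pages_fetched * RANDOM_PAGE_COST  # uncorrelated
    c2 = correlation * correlation
    return max_io + c2 * (min_io - max_io)

print(index_heap_io_cost(1000, 1.0))  # 1000.0 — treated as sequential
print(index_heap_io_cost(1000, 0.0))  # 4000.0 — treated as random
print(index_heap_io_cost(1000, 0.5))  # 3250.0 — in between
```

Note how quickly the cost approaches the random-I/O ceiling: at correlation 0.5, the estimate is already much closer to fully random than to sequential.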
```sql
-- Comparing Seq Scan vs Index Scan costs
-- Table: 100,000 rows, 1000 pages

-- Scenario: WHERE id = 12345 (returns 1 row)
-- Index Scan is chosen:
-- Index Scan using pk_table on mytable  (cost=0.29..8.31 rows=1 width=80)
--   Index Cond: (id = 12345)
-- Cost breakdown:
--   startup: ~0.29 (index tree traversal)
--   per row: ~8.02 (one random page + CPU)
-- Very cheap because the filter is highly selective (one row)

-- Seq Scan would be:
--   (cost=0.00..1500.00 rows=1 width=80)
-- Complete table scan to find 1 row = wasteful

-- Scenario: WHERE status = 'active' (returns 60,000 rows)
-- Seq Scan is chosen:
-- Seq Scan on mytable  (cost=0.00..1750.00 rows=60000 width=80)
--   Filter: (status = 'active')
-- Why not index? 60% of rows means:
--   - Index scan: 60,000 random page fetches
--   - Seq scan: 1000 sequential page reads
-- Sequentially reading 1000 pages beats 60,000 random fetches

-- Viewing optimizer's decision factors
SET enable_seqscan = off;  -- Force index consideration
EXPLAIN SELECT * FROM mytable WHERE status = 'active';
-- Bitmap Heap Scan  (cost=685.00..2250.00 rows=60000 width=80)
-- Higher cost! Optimizer was right to prefer Seq Scan.
```

| Join Type | Cost Formula (Simplified) | When Optimal |
|---|---|---|
| Nested Loop | outer_rows × (inner_startup + inner_per_row) | Small outer, indexed inner |
| Hash Join | build_cost + probe_cost | Medium tables, equi-join |
| Merge Join | sort_cost (if needed) + merge_cost | Pre-sorted inputs |
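Toy versions of those simplified formulas make the tradeoff concrete (the functions, per-row hashing cost, and all numbers here are illustrative assumptions, not planner code):

```python
# Sketch of the simplified join cost formulas from the table above.
# Illustrative only: real planners add memory limits, batching,
# rescan caching, and many other refinements.

CPU_OPERATOR_COST = 0.0025

def nested_loop_cost(outer_rows, inner_startup, inner_per_row_cost):
    # The inner side is evaluated once per outer row
    return outer_rows * (inner_startup + inner_per_row_cost)

def hash_join_cost(build_scan_cost, build_rows, probe_scan_cost, probe_rows):
    build = build_scan_cost + build_rows * CPU_OPERATOR_COST  # hash each build row
    probe = probe_scan_cost + probe_rows * CPU_OPERATOR_COST  # hash/probe each row
    return build + probe

# Small outer (10 rows) with an indexed inner (~8.3 units per lookup):
# the nested loop is extremely cheap
small = nested_loop_cost(10, inner_startup=0.29, inner_per_row_cost=8.02)
# Two medium tables (10k and 100k rows): hashing both sides beats
# doing 100,000 index lookups
medium = hash_join_cost(283.0, 10_000, 1250.0, 100_000)
print(f"nested loop, 10 outer rows: {small:.1f}")    # 83.1
print(f"hash join, 10k x 100k rows: {medium:.1f}")   # 1808.0
```

Run the nested loop with 100,000 outer rows instead of 10 and its cost explodes past 800,000 units, which is exactly why the optimizer switches strategies as estimated row counts grow.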
The Selectivity Factor:
Selectivity is the estimated fraction of rows passing a filter. It's crucial for cost calculation:
- WHERE id = 12345 on a unique column → selectivity = 1/total_rows
- WHERE status = 'active' → selectivity from the MCV list or histogram
- WHERE flag = true (boolean) → often estimated as 50% if no statistics exist

When Selectivity Estimation Fails:
| Situation | Problem | Symptom |
|---|---|---|
| Skewed data | Histogram doesn't capture unusual distribution | Wrong row estimates |
| Correlated columns | (a = 1 AND b = 2) estimated as independent | Underestimate |
| Functions on columns | WHERE UPPER(name) = 'SMITH' | Can't use column stats |
| Parameterized queries | Value unknown at plan time | Generic estimate used |
| Cross-database comparisons | Foreign tables lack local statistics | Wild guessing |
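A toy equality-selectivity estimator shows how MCV-based estimation works, and why values missing from the MCV list (the "skewed data" row above) fall back to a generic guess (all names and numbers here are illustrative):

```python
# Toy equality-selectivity estimator mirroring the MCV-based rules
# described above. Illustrative only, not PostgreSQL's actual code.

def eq_selectivity(value, mcv_vals, mcv_freqs, n_distinct):
    if value in mcv_vals:
        # Value is tracked directly: use its observed frequency
        return mcv_freqs[mcv_vals.index(value)]
    # Otherwise spread the leftover frequency evenly across the
    # remaining distinct values — rare values get overestimated here
    remaining_freq = 1.0 - sum(mcv_freqs)
    remaining_distinct = n_distinct - len(mcv_vals)
    return remaining_freq / remaining_distinct

mcv_vals = ["US", "DE", "FR", "UK"]
mcv_freqs = [0.40, 0.20, 0.15, 0.10]

# Tracked value: accurate estimate
print(eq_selectivity("US", mcv_vals, mcv_freqs, n_distinct=200))

# Untracked rare value: gets the generic "average leftover" estimate,
# even if it actually matches almost no rows
sel = eq_selectivity("Luxembourg", mcv_vals, mcv_freqs, n_distinct=200)
print(f"selectivity={sel:.6f}, estimated rows out of 1M={1_000_000 * sel:.0f}")
```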
A 10x error in selectivity creates a 10x error in row estimates, which cascades through joins (10x × 10x = 100x error) and affects all subsequent cost calculations. This is why cardinality errors are the primary cause of bad plans.
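The compounding effect is plain arithmetic (all numbers illustrative):

```python
# Sketch of how row-estimate errors compound through a join:
# a 10x misestimate on each input becomes a ~100x error in the
# join output estimate, which everything planned on top inherits.

def join_estimate(rows_a, rows_b, join_selectivity):
    return rows_a * rows_b * join_selectivity

sel = 1e-5
accurate = join_estimate(1_000, 2_000, sel)     # inputs estimated correctly
inflated = join_estimate(10_000, 20_000, sel)   # both inputs 10x too high

print(f"{accurate:.0f} vs {inflated:.0f} rows: {inflated / accurate:.0f}x error")
# prints "20 vs 2000 rows: 100x error"
```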
Cost estimates are predictions, and predictions can be wrong. Understanding failure modes helps you diagnose and fix optimizer mistakes.
Category 1: Stale Statistics
The most common problem. Statistics describe data as it was when they were last gathered. If data changes significantly afterward (bulk loads, mass deletes, rapid growth), estimates drift from reality until the next ANALYZE.
Category 2: Correlated Columns

The optimizer assumes columns are independent. A filter like (region = 'US' AND language = 'English') is estimated as independent (0.3 × 0.4 = 0.12) when the real combined selectivity is 0.28, because the columns are strongly correlated.

Category 3: Functions on Columns

A predicate like WHERE YEAR(created_at) = 2024 can't use the created_at histogram, so the optimizer guesses.
```sql
-- Using EXPLAIN ANALYZE to find estimate errors
EXPLAIN ANALYZE
SELECT * FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE c.country = 'Luxembourg';

-- Hash Join  (cost=125.00..2589.00 rows=15000 width=160)
--            (actual time=0.5..45.2 rows=23 loops=1)
--                               ^^^^^^^^      ^^^^
--                               ESTIMATED     ACTUAL
--
-- 15000 vs 23 = 652x overestimate!!
--
-- Why? Optimizer estimated Luxembourg has many customers
-- but actually it's a tiny country with few customers.

-- Check the statistics
SELECT most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'customers' AND attname = 'country';

-- Might show: Luxembourg not in most_common_vals
-- Optimizer used default selectivity = 1/n_distinct
-- If n_distinct=200 countries, estimated 0.5% → wrong

-- Solution: Update statistics with more detail
ALTER TABLE customers ALTER COLUMN country SET STATISTICS 1000;
ANALYZE customers;

-- Or create extended statistics for correlated columns
CREATE STATISTICS stats_country_lang (dependencies)
ON country, language FROM customers;
ANALYZE customers;
```

When EXPLAIN ANALYZE shows actual rows differing from estimated by more than 10x, the optimizer is working with significantly wrong information. This is your cue to investigate statistics freshness, missing histograms, or correlated column assumptions.
While cost isn't a time measurement, it's extremely valuable for comparing alternative approaches to the same query.
Effective Comparison Techniques:
1. Compare Alternative Query Formulations:
Write the same logic multiple ways and compare costs:
```sql
-- Approach 1: Subquery
EXPLAIN SELECT * FROM orders
WHERE customer_id IN (SELECT id FROM customers WHERE region = 'EU');
-- Cost: 2458.00

-- Approach 2: JOIN
EXPLAIN SELECT o.* FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE c.region = 'EU';
-- Cost: 1872.00  ← Lower cost, likely faster

-- Approach 3: EXISTS
EXPLAIN SELECT * FROM orders o
WHERE EXISTS (
  SELECT 1 FROM customers c
  WHERE c.id = o.customer_id AND c.region = 'EU'
);
-- Cost: 1756.00  ← Even lower!

-- Verify with ANALYZE
EXPLAIN ANALYZE SELECT * FROM orders o ...  (each version)
-- Actual execution confirms EXISTS is fastest for this data
```

2. Compare With and Without Indexes:
-- Before index
EXPLAIN SELECT * FROM orders WHERE customer_id = 123;
-- Seq Scan (cost=0.00..15000.00)
CREATE INDEX idx_customer ON orders(customer_id);
-- After index
EXPLAIN SELECT * FROM orders WHERE customer_id = 123;
-- Index Scan (cost=0.29..8.31)
-- Cost improvement: ~1800x
3. Test Configuration Changes:
-- Current: prefer hash joins
EXPLAIN SELECT ... -- Hash Join cost=2500
SET enable_hashjoin = off;
EXPLAIN SELECT ... -- Merge Join cost=3200
-- Confirms hash join is better for this query
RESET enable_hashjoin;
4. Evaluate Materialized View Benefits:
-- Complex aggregation query
EXPLAIN SELECT region, SUM(amount) FROM orders
GROUP BY region;
-- Cost: 45000 (full table scan + aggregation)
-- With materialized view
EXPLAIN SELECT * FROM mv_regional_sales;
-- Cost: 5.00 (just read pre-computed results)
Cost comparison identifies likely winners, but EXPLAIN ANALYZE confirms actual performance. Sometimes lower-cost plans are slower due to caching effects, parameter variations, or cardinality errors. Use cost for screening, actual timing for final decisions.
For expert-level optimization, consider these advanced aspects of cost estimation:
Cost and Parallelism:
Parallel query execution changes cost calculations:
- Work divides across workers, shrinking the per-worker scan and CPU costs
- Coordination overhead is added through parallel_tuple_cost and parallel_setup_cost

EXPLAIN SELECT COUNT(*) FROM huge_table;
-- Without parallel: cost=500000
-- With parallel (4 workers): cost=125000 + coordination
-- Wall time: ~4x faster despite similar 'logical' cost
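A rough sketch of that accounting (parallel_setup_cost and parallel_tuple_cost are real PostgreSQL parameters with these defaults; the divide-by-workers formula is a simplification of the planner's):

```python
# Sketch of parallel cost accounting: the serial cost divides across
# workers, but setup and per-tuple transfer overheads are added back.
# Illustrative only — the real planner also models partial leader
# participation and per-node parallel safety.

PARALLEL_SETUP_COST = 1000.0   # PostgreSQL default
PARALLEL_TUPLE_COST = 0.1      # PostgreSQL default

def parallel_cost(serial_cost, rows_sent_to_leader, workers):
    divided = serial_cost / workers
    overhead = PARALLEL_SETUP_COST + rows_sent_to_leader * PARALLEL_TUPLE_COST
    return divided + overhead

# COUNT(*) sends only one partial-aggregate row per worker to the leader,
# so transfer overhead is tiny and parallelism pays off
print(parallel_cost(500_000.0, rows_sent_to_leader=4, workers=4))

# A query returning every row pays transfer cost for each tuple,
# which eats into the benefit
print(parallel_cost(500_000.0, rows_sent_to_leader=1_000_000, workers=4))
```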
Cost and Caching:
The cost model essentially assumes cold cache (data read from disk). Reality often differs: hot tables and indexes sit in shared buffers or the OS page cache, making "random" reads far cheaper than modeled. The effective_cache_size parameter nudges the planner toward index scans when caching is likely, but estimated I/O costs still overstate the work for well-cached data.
Cost Doesn't Include:
| Excluded Factor | Impact |
|---|---|
| Lock wait time | Query may wait for locks (not predictable) |
| Network transfer | Returning large results to client |
| Concurrent workload | Resource contention with other queries |
| Connection overhead | Connecting, authentication |
| Trigger execution | Cascading effects of DML |
| Situation | Trust Level | Reason |
|---|---|---|
| Comparing similar plans | High | Errors affect both similarly |
| Fresh statistics | High | Inputs are accurate |
| Simple queries | High | Few places for errors to cascade |
| Complex multi-join queries | Medium | Errors compound through joins |
| After bulk data changes | Low | Statistics likely stale |
| Parameterized queries | Low | Generic plans may not fit specific parameters |
| Histograms missing | Low | Filter selectivity is guesswork |
| Cross-database joins | Very Low | Foreign statistics unavailable |
```sql
-- Check if autovacuum is updating statistics
SELECT relname,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze,
       n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'orders';

-- If last_analyze is old + high n_dead_tup → stale stats

-- Check histogram coverage
SELECT attname,
       null_frac,
       n_distinct,
       most_common_vals IS NOT NULL AS has_mcv,
       histogram_bounds IS NOT NULL AS has_histogram
FROM pg_stats
WHERE tablename = 'orders';

-- Columns with has_histogram = false have poor selectivity estimates

-- Monitor prepared statement behavior
SELECT * FROM pg_prepared_statements;
-- Check if generic_plans are being used (suboptimal for skewed params)
```

Even with perfect statistics, cost is a model, a simplification of reality. Hardware, caching, concurrency, and data distribution all create real-world variations. Use cost as a strong guide, but validate important decisions with actual measurements.
We've developed a comprehensive understanding of cost estimates in execution plans. Here are the essential insights:
- Cost is an abstract, relative unit calibrated to sequential page reads, not a time measurement.
- Startup vs. total cost explains streaming vs. blocking operators and LIMIT behavior.
- Statistics (row counts, MCVs, histograms, correlation) are the inputs; stale or missing statistics produce bad estimates.
- random_page_cost should reflect your hardware reality (SSD vs. HDD).
- Use cost to screen alternative plans; confirm important decisions with EXPLAIN ANALYZE.

What's Next:
With a deep understanding of cost estimates, the final page of this module covers plan comparison—systematic techniques for comparing execution plans, understanding why the optimizer makes specific choices, and influencing plan selection when needed.
You now understand the mechanics of cost estimation—what costs mean, how they're calculated, why they go wrong, and how to use them effectively. This knowledge transforms EXPLAIN output from mysterious numbers into actionable performance insights.