Database Management SystemsCovering Indexes

Covering Indexes: Optimizing Query Performance Through Index Design

LevelAdvanced

Duration60 mins

TopicCovering Indexes

2 / 5

Index-Only Scans: The Execution Mechanism

From Concept to Execution

In the previous page, we established that covering indexes allow databases to answer queries without accessing the underlying table. But how exactly does the database execute this optimization? How does the query optimizer decide to use an index-only scan? And what hidden requirements must be satisfied for index-only scans to actually work?

This page takes you inside the database engine to understand index-only scans—the execution mechanism that transforms covering indexes from a theoretical concept into measurable performance gains. We'll examine the complete lifecycle: detection, planning, execution, and verification.

What You Will Learn

By the end of this page, you will understand how the query optimizer identifies index-only scan opportunities, the role of visibility maps in MVCC databases, platform-specific behaviors across PostgreSQL, SQL Server, MySQL, and Oracle, and how to verify and diagnose index-only scan execution using EXPLAIN plans.

Anatomy of an Index-Only Scan

An index-only scan is a query execution strategy where the database engine retrieves all required data directly from an index, completely bypassing the heap (table) pages. This is fundamentally different from a regular index scan, which uses the index to locate rows and then fetches those rows from the table.

The Execution Flow Comparison

Converting Mermaid diagram...

Key Characteristics of Index-Only Scans

Characteristic	Regular Index Scan	Index-Only Scan
Data Source	Index + Heap	Index only
I/O Pattern	Sequential (index) + Random (heap)	Sequential (index) only
Row Visibility Check	Via heap page	Via visibility map (MVCC) or lock
Memory Pressure	High (heap pages in buffer)	Low (only index pages)
Scale Behavior	Degrades with result size	Constant relative to result size
Prerequisite	Suitable index exists	Covering index + visibility requirements

The I/O Elimination

The critical insight: index-only scans don't just reduce I/O—they eliminate an entire category of I/O (random heap access). This transforms query performance from being dominated by the most expensive operation to using only the most efficient operation (sequential index access).

How the Optimizer Chooses Index-Only Scans

The query optimizer evaluates multiple execution plans and selects the plan with the lowest estimated cost. For index-only scans to be chosen, several conditions must be met.

The Optimizer's Decision Tree

Converting Mermaid diagram...

Detailed Decision Criteria

1. Column Coverage Check

The optimizer first identifies all columns referenced in the query:

SELECT list columns (output)
WHERE clause columns (filtering)
JOIN condition columns
ORDER BY columns (sorting)
GROUP BY columns (grouping)
HAVING clause columns

If any required column is missing from the index, index-only scan is immediately ruled out.

2. Index Selectivity Analysis

Even with a covering index, the optimizer compares costs:

Index-Only Scan Cost = (Index Pages to Read) × (Page Read Cost)
Regular Index Scan Cost = (Index Pages) × (Page Read Cost) + 
                          (Expected Rows) × (Random I/O Cost)
Table Scan Cost = (Table Pages) × (Sequential Read Cost)

The optimizer chooses the plan with the lowest total cost. For highly selective queries (few matching rows), even a non-covering index might be preferred if the table lookup cost is minimal.

optimizer_analysis.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-- Analyze how PostgreSQL evaluates index-only scan opportunity
-- Step 1: Create a covering index
CREATE INDEX idx_orders_covering ON orders 
(customer_id, order_date) 
INCLUDE (total_amount, status);
 
-- Step 2: Examine the execution plan with costs
EXPLAIN (ANALYZE, COSTS, BUFFERS) 
SELECT customer_id, order_date, total_amount, status
FROM orders
WHERE customer_id = 12345
AND order_date >= '2024-01-01';
 
-- Example output showing index-only scan:
-- Index Only Scan using idx_orders_covering on orders
--   Index Cond: ((customer_id = 12345) AND (order_date >= '2024-01-01'))
--   Heap Fetches: 0  <-- KEY INDICATOR: Zero heap access!
--   Buffers: shared hit=4
 
-- Heap Fetches > 0 indicates visibility map misses
-- requiring fallback to heap for row visibility verification

The Visibility Map Challenge (MVCC)

In databases using Multi-Version Concurrency Control (MVCC)—including PostgreSQL, MySQL InnoDB, and Oracle—there's a subtle but critical challenge for index-only scans: row visibility determination.

The MVCC Visibility Problem

MVCC databases may have multiple versions of the same row, each visible to different transactions. When reading data, the database must determine which version (if any) is visible to the current transaction.

Normally, visibility is determined by examining the row's heap tuple header, which contains:

Transaction ID that created this version (xmin)
Transaction ID that deleted/updated this version (xmax)
Hint bits indicating visibility status

But here's the problem: index entries don't contain visibility information. The index only stores the key values and a pointer (TID/RID) to the heap tuple.

The Paradox

If we must access the heap to check visibility, we lose the entire benefit of index-only scans! The database would need to read heap pages anyway, negating the I/O savings. This was a fundamental limitation that prevented true index-only scans in early MVCC implementations.

The Visibility Map Solution (PostgreSQL)

PostgreSQL solved this with the visibility map—a compact auxiliary structure that tracks which heap pages contain only tuples visible to all transactions.

Visibility Map Structure:

Converting Mermaid diagram...

How Index-Only Scan Uses the Visibility Map:

Index Scan Phase: Read index entries, extracting both data and TIDs (tuple IDs)
Visibility Map Check: For each TID, check if its page is marked "all-visible" in the visibility map
If All-Visible: Return the data directly from the index—no heap access needed
If Not All-Visible: Fall back to heap access for that specific tuple (counted as "Heap Fetches")

Critical Implication: Index-only scans work best on tables that have been vacuumed recently. VACUUM updates the visibility map, marking pages as "all-visible" after dead tuples are removed.

visibility_map_analysis.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
-- Analyze visibility map status and its impact on index-only scans
 
-- Check visibility map coverage using pg_visibility extension
CREATE EXTENSION IF NOT EXISTS pg_visibility;
 
-- View visibility map statistics for a table
SELECT * FROM pg_visibility_map_summary('orders');
-- Returns: all_visible (pages marked visible), all_frozen (pages frozen)
 
-- Detailed per-page visibility
SELECT * FROM pg_visibility('orders') LIMIT 10;
-- Returns: blkno, all_visible, all_frozen
 
-- Check EXPLAIN output for heap fetches
EXPLAIN (ANALYZE, BUFFERS) 
SELECT customer_id, order_date, total_amount
FROM orders
WHERE customer_id BETWEEN 1 AND 1000;
 
-- "Heap Fetches: 1523" means 1523 tuples required heap access
-- due to visibility map indicating non-visible pages
 
-- Solution: VACUUM to update visibility map
VACUUM ANALYZE orders;
 
-- Re-run EXPLAIN - should see reduced or zero heap fetches
EXPLAIN (ANALYZE, BUFFERS) 
SELECT customer_id, order_date, total_amount
FROM orders
WHERE customer_id BETWEEN 1 AND 1000;
-- "Heap Fetches: 0" - true index-only scan achieved!

Monitoring Visibility Map Health

A high 'Heap Fetches' count in an index-only scan indicates poor visibility map coverage. This typically means the table has high write activity without sufficient vacuuming. For write-heavy tables, consider more aggressive autovacuum settings to maintain index-only scan efficiency.

Platform-Specific Index-Only Scan Behaviors

Each major database platform implements index-only scans differently, with unique terminology, capabilities, and limitations.

Index-Only Scan Implementation Comparison
Aspect	PostgreSQL	SQL Server	MySQL InnoDB	Oracle
Terminology	Index Only Scan	Index Seek/Scan (no Key Lookup)	Using index	INDEX FAST FULL SCAN / INDEX RANGE SCAN
INCLUDE Clause	Yes (v11+)	Yes	No (columns in key)	No (must use key columns)
Visibility Handling	Visibility Map	Lock-based (less overhead)	Implicit (clustered key)	Consistent read via undo
EXPLAIN Indicator	Index Only Scan, Heap Fetches	No Key Lookup in plan	Extra: Using index	Rowsource shows INDEX access
Prerequisites	VACUUM for visibility map	None special	All columns in index key	Index must have all columns

PostgreSQL Index-Only Scan Details

Introduced in PostgreSQL 9.2
INCLUDE clause added in PostgreSQL 11
Heavily dependent on visibility map maintenance
Supports partial index-only scans (some tuples from heap)

Best Practices:

Ensure autovacuum is properly tuned
Monitor pg_stat_user_indexes.idx_scan vs idx_tup_fetch
Check EXPLAIN for "Heap Fetches" counter
Consider VACUUM FREEZE for read-heavy, rarely-modified tables

Diagnosing Index-Only Scans in Execution Plans

Reading and interpreting execution plans is essential for verifying that your covering indexes are actually enabling index-only scans. Let's examine what to look for in each platform's EXPLAIN output.

explain_analysis.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- PostgreSQL: Confirming Index-Only Scan
 
-- Query with covering index
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT customer_id, email, signup_date
FROM customers
WHERE customer_id BETWEEN 1000 AND 2000;
 
-- POSITIVE OUTPUT (Index-Only Scan):
--  Index Only Scan using idx_customers_covering on customers
--    Index Cond: ((customer_id >= 1000) AND (customer_id <= 2000))
--    Heap Fetches: 0
--    Buffers: shared hit=12
--  → "Index Only Scan" = correct plan type
--  → "Heap Fetches: 0" = no table access needed
--  → Only index buffers accessed
 
-- NEGATIVE OUTPUT (Fallback to heap):
--  Index Only Scan using idx_customers_covering on customers
--    Index Cond: ((customer_id >= 1000) AND (customer_id <= 2000))
--    Heap Fetches: 487
--  → Heap Fetches > 0 means visibility map check failed for some pages
--  → Solution: VACUUM ANALYZE customers;
 
-- REALLY NEGATIVE OUTPUT (Not index-only at all):
--  Index Scan using idx_customers_partial on customers
--    Index Cond: ...
--  → "Index Scan" (not "Index Only Scan") = table access required

Execution Plan Verification Checklist

•PostgreSQL: Confirm "Index Only Scan" (not "Index Scan") and "Heap Fetches: 0"
•SQL Server: Verify no "Key Lookup" or "RID Lookup" operators connected to the index seek
•MySQL: Check for "Using index" in the Extra column of EXPLAIN output
•Oracle: Look for INDEX access methods without TABLE ACCESS FULL or TABLE ACCESS BY INDEX ROWID
•All Platforms: Compare logical reads before/after adding covering columns—should decrease dramatically

Index-Only Scan Limitations and Edge Cases

While index-only scans are powerful, they have limitations and edge cases that can prevent their use or reduce their effectiveness.

Common Index-Only Scan Blockers

•SELECT * Queries — The moment you use SELECT *, you request all table columns. Unless your index includes every column (rarely practical), index-only scan is impossible.
•Expression Evaluation — Functions or expressions on covered columns may force table access. SELECT UPPER(name) might work, but complex expressions might not.
•NULL Handling Quirks — Some indexes exclude NULL values. IS NULL conditions may not use the index, or the index may not cover rows with NULL values.
•TOAST Columns (PostgreSQL) — Large values (text, bytea) stored in TOAST tables require separate access even if the column is in the index.
•Visibility Map Staleness — High-write tables without adequate VACUUM will have poor visibility coverage, forcing heap access.
•Index Type Limitations — Some index types (GIN, GiST, BRIN) don't support index-only scans for all operators.
•Partial Indexes — Index-only scans require the partial index predicate to match the query's WHERE clause.

common_blockers.sql
PostgreSQL Examples
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- Example 1: SELECT * breaks index-only scan
-- Even with a "covering" index, SELECT * fails
CREATE INDEX idx_users_covering ON users (user_id) 
INCLUDE (email, name, created_at);
 
-- This WILL use index-only scan:
EXPLAIN SELECT user_id, email, name, created_at FROM users WHERE user_id = 100;
-- "Index Only Scan"
 
-- This will NOT (assumes table has more columns):
EXPLAIN SELECT * FROM users WHERE user_id = 100;
-- "Index Scan" (not "Index Only Scan")
 
 
-- Example 2: TOAST column limitation
-- Large text columns stored in TOAST tables
CREATE TABLE documents (
    doc_id INT PRIMARY KEY,
    title VARCHAR(255),
    content TEXT  -- Likely TOAST-ed if large
);
 
CREATE INDEX idx_docs_covering ON documents (doc_id) INCLUDE (title, content);
 
-- Index-only scan will still need TOAST access for large content values
-- The index stores a pointer to TOAST, not the actual content
 
 
-- Example 3: Expression preventing index-only scan
CREATE INDEX idx_users_email ON users (email) INCLUDE (name);
 
-- Works (simple column reference):
EXPLAIN SELECT email, name FROM users WHERE email = 'test@example.com';
 
-- May not work (function on column):
EXPLAIN SELECT email, name, LENGTH(email) FROM users WHERE email LIKE 'test%';
-- LENGTH(email) might require heap access depending on PostgreSQL version

The SELECT * Anti-Pattern

Avoiding SELECT * is one of the most impactful query optimization habits. By explicitly listing only needed columns, you enable index coverage, reduce network transfer, and make your code more maintainable. Modern ORMs often generate SELECT *, making this a common performance pitfall.

Summary: Mastering Index-Only Scans

We've taken a deep dive into the execution mechanism behind covering indexes. Here are the key insights:

Key Takeaways

•Index-only scans are the goal — They represent the ideal execution path where data is retrieved entirely from the index with zero table access.
•The optimizer requires complete coverage — All columns in SELECT, WHERE, ORDER BY, GROUP BY, and JOIN must be present in the index.
•MVCC adds complexity — Databases like PostgreSQL require visibility maps to verify row visibility without heap access.
•VACUUM maintains index-only scan efficiency — Regular vacuuming updates visibility maps, reducing or eliminating 'Heap Fetches'.
•Platform differences matter — PostgreSQL uses INCLUDE, SQL Server has INCLUDE, MySQL requires all columns as index keys, Oracle uses composite indexes.
•EXPLAIN is essential — Always verify index-only scan usage through execution plan analysis; look for platform-specific indicators.
•Edge cases can block optimization — SELECT *, TOAST columns, expressions, and visibility issues can all prevent index-only scans.

What's Next

Now that you understand how index-only scans work, we'll explore the performance benefits in depth—quantifying the I/O savings, memory efficiency improvements, and real-world benchmarks that demonstrate the transformative impact of well-designed covering indexes.

Execution Mastery Achieved

You now understand the complete lifecycle of index-only scans: how optimizers detect coverage, how visibility is verified in MVCC databases, platform-specific behaviors, and how to diagnose index-only scan execution. This knowledge enables you to design, verify, and troubleshoot covering indexes in production systems.

2 / 5

Loading learning content...

Database Management SystemsCovering Indexes

Covering Indexes: Optimizing Query Performance Through Index Design

LevelAdvanced

Duration60 mins

TopicCovering Indexes

2 / 5

Index-Only Scans: The Execution Mechanism

From Concept to Execution

What You Will Learn

Anatomy of an Index-Only Scan

The Execution Flow Comparison

Converting Mermaid diagram...

Key Characteristics of Index-Only Scans

Characteristic	Regular Index Scan	Index-Only Scan
Data Source	Index + Heap	Index only
I/O Pattern	Sequential (index) + Random (heap)	Sequential (index) only
Row Visibility Check	Via heap page	Via visibility map (MVCC) or lock
Memory Pressure	High (heap pages in buffer)	Low (only index pages)
Scale Behavior	Degrades with result size	Constant relative to result size
Prerequisite	Suitable index exists	Covering index + visibility requirements

The I/O Elimination

How the Optimizer Chooses Index-Only Scans

The query optimizer evaluates multiple execution plans and selects the plan with the lowest estimated cost. For index-only scans to be chosen, several conditions must be met.

The Optimizer's Decision Tree

Converting Mermaid diagram...

Detailed Decision Criteria

1. Column Coverage Check

The optimizer first identifies all columns referenced in the query:

SELECT list columns (output)
WHERE clause columns (filtering)
JOIN condition columns
ORDER BY columns (sorting)
GROUP BY columns (grouping)
HAVING clause columns

If any required column is missing from the index, index-only scan is immediately ruled out.

2. Index Selectivity Analysis

Even with a covering index, the optimizer compares costs:

Index-Only Scan Cost = (Index Pages to Read) × (Page Read Cost)
Regular Index Scan Cost = (Index Pages) × (Page Read Cost) + 
                          (Expected Rows) × (Random I/O Cost)
Table Scan Cost = (Table Pages) × (Sequential Read Cost)

The optimizer chooses the plan with the lowest total cost. For highly selective queries (few matching rows), even a non-covering index might be preferred if the table lookup cost is minimal.

optimizer_analysis.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-- Analyze how PostgreSQL evaluates index-only scan opportunity
-- Step 1: Create a covering index
CREATE INDEX idx_orders_covering ON orders 
(customer_id, order_date) 
INCLUDE (total_amount, status);
 
-- Step 2: Examine the execution plan with costs
EXPLAIN (ANALYZE, COSTS, BUFFERS) 
SELECT customer_id, order_date, total_amount, status
FROM orders
WHERE customer_id = 12345
AND order_date >= '2024-01-01';
 
-- Example output showing index-only scan:
-- Index Only Scan using idx_orders_covering on orders
--   Index Cond: ((customer_id = 12345) AND (order_date >= '2024-01-01'))
--   Heap Fetches: 0  <-- KEY INDICATOR: Zero heap access!
--   Buffers: shared hit=4
 
-- Heap Fetches > 0 indicates visibility map misses
-- requiring fallback to heap for row visibility verification

The Visibility Map Challenge (MVCC)

The MVCC Visibility Problem

Normally, visibility is determined by examining the row's heap tuple header, which contains:

Transaction ID that created this version (xmin)
Transaction ID that deleted/updated this version (xmax)
Hint bits indicating visibility status

But here's the problem: index entries don't contain visibility information. The index only stores the key values and a pointer (TID/RID) to the heap tuple.

The Paradox

The Visibility Map Solution (PostgreSQL)

PostgreSQL solved this with the visibility map—a compact auxiliary structure that tracks which heap pages contain only tuples visible to all transactions.

Visibility Map Structure:

Converting Mermaid diagram...

How Index-Only Scan Uses the Visibility Map:

Index Scan Phase: Read index entries, extracting both data and TIDs (tuple IDs)
Visibility Map Check: For each TID, check if its page is marked "all-visible" in the visibility map
If All-Visible: Return the data directly from the index—no heap access needed
If Not All-Visible: Fall back to heap access for that specific tuple (counted as "Heap Fetches")

Critical Implication: Index-only scans work best on tables that have been vacuumed recently. VACUUM updates the visibility map, marking pages as "all-visible" after dead tuples are removed.

visibility_map_analysis.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
-- Analyze visibility map status and its impact on index-only scans
 
-- Check visibility map coverage using pg_visibility extension
CREATE EXTENSION IF NOT EXISTS pg_visibility;
 
-- View visibility map statistics for a table
SELECT * FROM pg_visibility_map_summary('orders');
-- Returns: all_visible (pages marked visible), all_frozen (pages frozen)
 
-- Detailed per-page visibility
SELECT * FROM pg_visibility('orders') LIMIT 10;
-- Returns: blkno, all_visible, all_frozen
 
-- Check EXPLAIN output for heap fetches
EXPLAIN (ANALYZE, BUFFERS) 
SELECT customer_id, order_date, total_amount
FROM orders
WHERE customer_id BETWEEN 1 AND 1000;
 
-- "Heap Fetches: 1523" means 1523 tuples required heap access
-- due to visibility map indicating non-visible pages
 
-- Solution: VACUUM to update visibility map
VACUUM ANALYZE orders;
 
-- Re-run EXPLAIN - should see reduced or zero heap fetches
EXPLAIN (ANALYZE, BUFFERS) 
SELECT customer_id, order_date, total_amount
FROM orders
WHERE customer_id BETWEEN 1 AND 1000;
-- "Heap Fetches: 0" - true index-only scan achieved!

Monitoring Visibility Map Health

Platform-Specific Index-Only Scan Behaviors

Each major database platform implements index-only scans differently, with unique terminology, capabilities, and limitations.

Index-Only Scan Implementation Comparison
Aspect	PostgreSQL	SQL Server	MySQL InnoDB	Oracle
Terminology	Index Only Scan	Index Seek/Scan (no Key Lookup)	Using index	INDEX FAST FULL SCAN / INDEX RANGE SCAN
INCLUDE Clause	Yes (v11+)	Yes	No (columns in key)	No (must use key columns)
Visibility Handling	Visibility Map	Lock-based (less overhead)	Implicit (clustered key)	Consistent read via undo
EXPLAIN Indicator	Index Only Scan, Heap Fetches	No Key Lookup in plan	Extra: Using index	Rowsource shows INDEX access
Prerequisites	VACUUM for visibility map	None special	All columns in index key	Index must have all columns

PostgreSQL Index-Only Scan Details

Introduced in PostgreSQL 9.2
INCLUDE clause added in PostgreSQL 11
Heavily dependent on visibility map maintenance
Supports partial index-only scans (some tuples from heap)

Best Practices:

Ensure autovacuum is properly tuned
Monitor pg_stat_user_indexes.idx_scan vs idx_tup_fetch
Check EXPLAIN for "Heap Fetches" counter
Consider VACUUM FREEZE for read-heavy, rarely-modified tables

Diagnosing Index-Only Scans in Execution Plans

explain_analysis.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- PostgreSQL: Confirming Index-Only Scan
 
-- Query with covering index
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT customer_id, email, signup_date
FROM customers
WHERE customer_id BETWEEN 1000 AND 2000;
 
-- POSITIVE OUTPUT (Index-Only Scan):
--  Index Only Scan using idx_customers_covering on customers
--    Index Cond: ((customer_id >= 1000) AND (customer_id <= 2000))
--    Heap Fetches: 0
--    Buffers: shared hit=12
--  → "Index Only Scan" = correct plan type
--  → "Heap Fetches: 0" = no table access needed
--  → Only index buffers accessed
 
-- NEGATIVE OUTPUT (Fallback to heap):
--  Index Only Scan using idx_customers_covering on customers
--    Index Cond: ((customer_id >= 1000) AND (customer_id <= 2000))
--    Heap Fetches: 487
--  → Heap Fetches > 0 means visibility map check failed for some pages
--  → Solution: VACUUM ANALYZE customers;
 
-- REALLY NEGATIVE OUTPUT (Not index-only at all):
--  Index Scan using idx_customers_partial on customers
--    Index Cond: ...
--  → "Index Scan" (not "Index Only Scan") = table access required

Execution Plan Verification Checklist

•PostgreSQL: Confirm "Index Only Scan" (not "Index Scan") and "Heap Fetches: 0"
•SQL Server: Verify no "Key Lookup" or "RID Lookup" operators connected to the index seek
•MySQL: Check for "Using index" in the Extra column of EXPLAIN output
•Oracle: Look for INDEX access methods without TABLE ACCESS FULL or TABLE ACCESS BY INDEX ROWID
•All Platforms: Compare logical reads before/after adding covering columns—should decrease dramatically

Index-Only Scan Limitations and Edge Cases

While index-only scans are powerful, they have limitations and edge cases that can prevent their use or reduce their effectiveness.

Common Index-Only Scan Blockers

•SELECT * Queries — The moment you use SELECT *, you request all table columns. Unless your index includes every column (rarely practical), index-only scan is impossible.
•Expression Evaluation — Functions or expressions on covered columns may force table access. SELECT UPPER(name) might work, but complex expressions might not.
•NULL Handling Quirks — Some indexes exclude NULL values. IS NULL conditions may not use the index, or the index may not cover rows with NULL values.
•TOAST Columns (PostgreSQL) — Large values (text, bytea) stored in TOAST tables require separate access even if the column is in the index.
•Visibility Map Staleness — High-write tables without adequate VACUUM will have poor visibility coverage, forcing heap access.
•Index Type Limitations — Some index types (GIN, GiST, BRIN) don't support index-only scans for all operators.
•Partial Indexes — Index-only scans require the partial index predicate to match the query's WHERE clause.

common_blockers.sql
PostgreSQL Examples
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- Example 1: SELECT * breaks index-only scan
-- Even with a "covering" index, SELECT * fails
CREATE INDEX idx_users_covering ON users (user_id) 
INCLUDE (email, name, created_at);
 
-- This WILL use index-only scan:
EXPLAIN SELECT user_id, email, name, created_at FROM users WHERE user_id = 100;
-- "Index Only Scan"
 
-- This will NOT (assumes table has more columns):
EXPLAIN SELECT * FROM users WHERE user_id = 100;
-- "Index Scan" (not "Index Only Scan")
 
 
-- Example 2: TOAST column limitation
-- Large text columns stored in TOAST tables
CREATE TABLE documents (
    doc_id INT PRIMARY KEY,
    title VARCHAR(255),
    content TEXT  -- Likely TOAST-ed if large
);
 
CREATE INDEX idx_docs_covering ON documents (doc_id) INCLUDE (title, content);
 
-- Index-only scan will still need TOAST access for large content values
-- The index stores a pointer to TOAST, not the actual content
 
 
-- Example 3: Expression preventing index-only scan
CREATE INDEX idx_users_email ON users (email) INCLUDE (name);
 
-- Works (simple column reference):
EXPLAIN SELECT email, name FROM users WHERE email = 'test@example.com';
 
-- May not work (function on column):
EXPLAIN SELECT email, name, LENGTH(email) FROM users WHERE email LIKE 'test%';
-- LENGTH(email) might require heap access depending on PostgreSQL version

The SELECT * Anti-Pattern

Summary: Mastering Index-Only Scans

We've taken a deep dive into the execution mechanism behind covering indexes. Here are the key insights:

Key Takeaways

•Index-only scans are the goal — They represent the ideal execution path where data is retrieved entirely from the index with zero table access.
•The optimizer requires complete coverage — All columns in SELECT, WHERE, ORDER BY, GROUP BY, and JOIN must be present in the index.
•MVCC adds complexity — Databases like PostgreSQL require visibility maps to verify row visibility without heap access.
•VACUUM maintains index-only scan efficiency — Regular vacuuming updates visibility maps, reducing or eliminating 'Heap Fetches'.
•Platform differences matter — PostgreSQL uses INCLUDE, SQL Server has INCLUDE, MySQL requires all columns as index keys, Oracle uses composite indexes.
•EXPLAIN is essential — Always verify index-only scan usage through execution plan analysis; look for platform-specific indicators.
•Edge cases can block optimization — SELECT *, TOAST columns, expressions, and visibility issues can all prevent index-only scans.

What's Next

Execution Mastery Achieved

2 / 5