While linear scans provide a reliable baseline for selection operations, they become prohibitively expensive for large tables when only a small fraction of rows match the predicate. Index-based selection transforms the access pattern from sequential-read-everything to surgical-retrieval-of-specific-rows.
An index is an auxiliary data structure that maps search key values to the locations of tuples containing those values. When a selection predicate involves an indexed column, the database can use the index to directly locate matching tuples without examining every row in the table. This transforms an O(N) operation into an O(log N) or even O(1) lookup.
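The complexity difference can be illustrated with a toy sketch (illustrative Python, not how a real engine stores data): the "index" is a sorted list of (key, heap position) pairs searched with binary search, while the linear scan must touch every row.

```python
import bisect

# Heap: rows of (emp_id, name) in arbitrary insertion order.
heap = [(4, "dana"), (1, "ada"), (3, "cy"), (2, "bo")]

# Linear scan: O(N) -- examine every tuple.
def linear_select(heap, key):
    return [row for row in heap if row[0] == key]

# Index: sorted (key, heap_position) pairs, searched in O(log N).
index = sorted((row[0], pos) for pos, row in enumerate(heap))
keys = [k for k, _ in index]

def index_select(heap, index, keys, key):
    i = bisect.bisect_left(keys, key)   # binary search to first match
    out = []
    while i < len(index) and index[i][0] == key:
        out.append(heap[index[i][1]])   # follow the "TID" into the heap
        i += 1
    return out

print(index_select(heap, index, keys, 3))  # [(3, 'cy')]
```

Both calls return the same rows; only the number of entries examined differs, which is the entire point of the index.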
Understanding index selection is crucial because it represents the primary mechanism by which databases achieve sub-millisecond response times on tables containing billions of rows. The decision of whether to use an index, which index to use, and how to use it forms the core of query optimization.
By the end of this page, you will understand the mechanics of index-based tuple retrieval, analyze the cost model for index scans across different index types, comprehend the critical distinction between index-only scans and index-heap scans, and master the decision framework for when index selection outperforms linear scanning.
An index scan uses an index structure to locate tuples matching a selection predicate. Unlike linear scans that process every page, index scans follow a targeted path through the index to retrieve only relevant data. The fundamental mechanism varies by index type, but B+tree indexes dominate in relational databases.
B+Tree Index Structure Recap:
Before diving into scan mechanics, let us recall the B+tree structure:
The B+tree maintains sorted order of keys, enabling both point lookups (equality predicates) and range scans (inequality predicates).
```
-- B+Tree Index Scan Algorithm
PROCEDURE BTIndexScan(Index I, Predicate P):
    -- Extract search key from predicate
    search_key ← ExtractKeyFromPredicate(P)
    result ← EmptyResultSet()

    -- Phase 1: Root-to-Leaf Traversal (Point Lookup)
    current_node ← I.root
    WHILE NOT IsLeaf(current_node):
        -- Binary search within node to find child pointer
        child_pointer ← SearchNode(current_node, search_key)
        current_node ← FetchPage(child_pointer)

    -- Phase 2: Leaf Node Search
    -- For equality: Find matching entries
    -- For range: Find start position
    leaf ← current_node
    position ← BinarySearchLeaf(leaf, search_key)

    -- Phase 3: Collect Matching Entries
    WHILE CurrentKeyMatchesPredicate(leaf, position, P):
        tid ← GetTupleId(leaf, position)

        -- Phase 4: Heap Access (unless index-only scan)
        tuple ← FetchTupleFromHeap(tid)
        IF EvaluateFullPredicate(tuple, P):  -- Recheck if needed
            result.Add(tuple)

        -- Advance to next entry
        position ← position + 1
        IF position >= leaf.num_entries:
            leaf ← leaf.next_leaf  -- Follow sibling pointer
            position ← 0
            IF leaf = NULL:
                BREAK

    RETURN result

-- Example: SELECT * FROM employees WHERE emp_id = 12345
-- 1. Root-to-leaf traversal: ~3-4 page reads (log complexity)
-- 2. Leaf scan: 1 page read (single match)
-- 3. Heap access: 1 page read (fetch actual tuple)
-- Total: ~5 page reads vs. thousands for table scan
```

The Two-Phase Access Pattern:
Most index scans involve two distinct phases:
Index Traversal Phase: Navigate the index structure to locate entries matching the predicate. This phase reads only index pages, which are typically smaller and more cache-friendly than data pages.
Heap Access Phase: For each matching index entry, fetch the actual tuple from the heap (table) using the stored tuple identifier (TID). This phase incurs random I/O as matching tuples are typically scattered across data pages.
This two-phase pattern has profound cost implications—the heap access phase often dominates total cost because random I/O is expensive.
Each leaf-level index entry contains: (1) the indexed column value(s), and (2) a tuple identifier (TID) pointing to the heap location. In PostgreSQL, a TID is (block_number, tuple_offset). In Oracle, a ROWID encodes similar information. The TID enables direct access to the tuple without additional searching.
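A minimal sketch of the two-phase pattern (hypothetical page layout and names): phase 1 collects matching (page, offset) TIDs from a sorted index, and phase 2 follows each TID into the heap. Counting distinct pages touched shows where the heap-access cost comes from.

```python
# Toy heap: pages of (order_date, total) tuples, 2 tuples per page.
heap_pages = [
    [("2024-01-01", 500), ("2024-03-05", 120)],   # page 0
    [("2024-01-02", 700), ("2024-02-11", 300)],   # page 1
    [("2024-01-03", 250), ("2024-04-09", 900)],   # page 2
]

# Index on the date column: sorted (key, TID) entries,
# where a TID is (page_number, tuple_offset).
index = sorted(
    (row[0], (p, off))
    for p, page in enumerate(heap_pages)
    for off, row in enumerate(page)
)

def index_scan(lo, hi):
    """Phase 1: find matching index entries; phase 2: fetch from heap."""
    tids = [tid for key, tid in index if lo <= key <= hi]
    pages_read = set()
    result = []
    for page_no, off in tids:       # one (possibly random) heap read per TID
        pages_read.add(page_no)
        result.append(heap_pages[page_no][off])
    return result, len(pages_read)

rows, page_reads = index_scan("2024-01-01", "2024-01-03")
# Three matches scattered over three different pages -> three heap page reads,
# even though the index traversal itself was cheap.
```

Three matching tuples required three separate heap page reads here, which is exactly the scattered-data cost the next sections quantify.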
Database systems implement several variations of index scans, each optimized for specific access patterns. Understanding these variations is essential for query plan interpretation and optimization.
1. Index-Only Scan (Covering Index Scan):
When the index contains all columns needed by the query (both for filtering and result projection), the heap access phase can be eliminated entirely. This is the most efficient form of index access.
```sql
-- Index-Only Scan Example

-- Create covering index including all needed columns
CREATE INDEX idx_emp_salary_dept ON employees(department_id, salary);

-- Query that can be satisfied entirely from the index
SELECT department_id, salary
FROM employees
WHERE department_id = 10;

-- PostgreSQL EXPLAIN output:
-- Index Only Scan using idx_emp_salary_dept on employees
--   Index Cond: (department_id = 10)
--   Heap Fetches: 0  -- No heap access needed!

-- Why it works:
-- - Predicate column (department_id) is in the index
-- - All SELECT columns (department_id, salary) are in the index
-- - No heap access required → dramatically faster

-- Visibility Note:
-- PostgreSQL must still check the visibility map to ensure tuples
-- are visible to the current transaction. If a page isn't all-visible,
-- heap access is required even for index-only scans.
```

2. Index Scan (with Heap Access):
The standard index scan retrieves TIDs from the index, then fetches the corresponding tuples from the heap. This is necessary whenever the query references columns that are not stored in the index.
```sql
-- Standard Index Scan with Heap Access

-- Index only contains employee_id
CREATE INDEX idx_emp_id ON employees(employee_id);

-- Query needs columns not in the index
SELECT employee_id, name, salary, hire_date
FROM employees
WHERE employee_id = 12345;

-- PostgreSQL EXPLAIN output:
-- Index Scan using idx_emp_id on employees
--   Index Cond: (employee_id = 12345)
--   (cost=0.29..8.31 rows=1 width=48)

-- The executor:
-- 1. Traverses idx_emp_id to find entry for employee_id = 12345
-- 2. Retrieves the TID (page_number, tuple_offset)
-- 3. Fetches the heap page containing the actual tuple
-- 4. Returns the full row from the heap page
```

3. Bitmap Index Scan:
For predicates matching multiple tuples scattered across many pages, bitmap index scans provide an efficient alternative. Instead of fetching tuples one-by-one (causing random I/O), the bitmap scan first collects all matching TIDs into a bitmap, sorts them by page number, and then reads each qualifying heap page once in physical order.
This converts random I/O into sequential I/O, dramatically improving performance for medium-selectivity queries.
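The page-sorting step at the heart of a bitmap scan can be sketched as follows (illustrative Python; real bitmaps are compressed page-level bit structures, not dictionaries):

```python
from collections import defaultdict

# TIDs returned by an index scan, in key order -- pages are scattered.
tids_in_key_order = [(7, 2), (1, 3), (4, 1), (1, 7), (4, 5), (4, 9)]

def build_bitmap(tids):
    """Group tuple offsets by page, then order pages physically."""
    bitmap = defaultdict(list)
    for page, offset in tids:
        bitmap[page].append(offset)
    # Visiting pages in ascending order turns scattered matches
    # into a single sequential pass over the heap.
    return sorted(bitmap.items())

bitmap = build_bitmap(tids_in_key_order)
print(bitmap)  # [(1, [3, 7]), (4, [1, 5, 9]), (7, [2])]
```

Each heap page is now read exactly once, in ascending physical order, regardless of how the matching keys were interleaved in the index.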
```sql
-- Bitmap Index Scan

CREATE INDEX idx_salary ON employees(salary);

-- Query returning many scattered tuples
SELECT * FROM employees WHERE salary BETWEEN 50000 AND 60000;

-- PostgreSQL EXPLAIN output:
-- Bitmap Heap Scan on employees
--   Recheck Cond: (salary >= 50000 AND salary <= 60000)
--   -> Bitmap Index Scan on idx_salary
--        Index Cond: (salary >= 50000 AND salary <= 60000)

-- Execution phases:
-- Phase 1: Bitmap Index Scan
--   - Traverse index for range [50000, 60000]
--   - Build bitmap: page 1 -> [tuple 3, tuple 7]
--                   page 4 -> [tuple 1, tuple 5, tuple 9]
--                   page 7 -> [tuple 2]
--                   ... (sorted by page number)

-- Phase 2: Bitmap Heap Scan
--   - Read heap pages in physical order (1, 4, 7, ...)
--   - Sequential I/O pattern even for scattered matches
--   - Recheck condition (may be lossy for large bitmaps)

-- Bitmap becomes "lossy" when too large:
--   - Stores only page numbers, not tuple offsets
--   - All tuples on matching pages must be rechecked
```

| Scan Type | Index Access | Heap Access | I/O Pattern | Best For |
|---|---|---|---|---|
| Index-Only Scan | Sequential (leaf→leaf) | None required | Sequential | Covering indexes, high selectivity |
| Index Scan | Sequential (leaf→leaf) | Random per-tuple | Random | High selectivity (few matches) |
| Bitmap Index Scan | Sequential (build bitmap) | Sequential (sorted pages) | Sequential | Medium selectivity, scattered data |
The query optimizer automatically chooses among these scan types based on estimated costs. Index-only scans are preferred when possible. For point queries (= operator), standard index scans suffice. For range queries returning many rows, bitmap scans convert random heap access into sequential access, often providing 10-100× speedup over standard index scans.
Accurate cost estimation for index scans is critical for the optimizer to make correct access path decisions. The cost model must account for both index access costs and heap access costs.
Cost Components:
Let us define the parameters used throughout this section:

- B — number of pages in the table (heap)
- T — number of tuples in the table
- H — height of the B+tree index (pages read during root-to-leaf traversal)
- Leaf_Pages — number of leaf pages in the index
- M — number of tuples matching the predicate
- s — selectivity of the predicate (s = M / T)
- P_distinct — number of distinct heap pages containing matching tuples
Index Access Cost:
| Operation | Cost Formula | Explanation |
|---|---|---|
| Root-to-leaf traversal | H page reads | Fixed cost to reach first leaf node |
| Leaf scan (equality) | ⌈M / entries_per_page⌉ | Pages containing matching entries |
| Leaf scan (range) | L = s × Leaf_Pages | Proportion of leaf pages in range |
| Total index I/O | H + L | Traversal + leaf scanning |
Heap Access Cost (Critical Factor):
The heap access cost depends on how matching tuples are distributed across data pages. This is where index scans can become expensive. Consider two extremes:
Best Case (Clustered Data): All matching tuples reside in adjacent pages. P_distinct ≈ M × tuple_size / page_size. Heap access is nearly sequential.
Worst Case (Scattered Data): Each matching tuple resides on a different page. P_distinct ≈ M. Every tuple fetch requires a separate random I/O.
The correlation between index order and physical tuple order determines where on this spectrum actual performance falls.
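One way this estimate can be sketched (illustrative Python, not any specific engine's formula): the scattered-case expectation uses the standard Cardenas approximation B(1 − (1 − 1/B)^M) for distinct pages touched by M uniformly random tuples, and the correlation statistic interpolates between the clustered and scattered extremes.

```python
def distinct_pages(B, M):
    """Cardenas approximation: expected distinct pages touched when
    M matching tuples are spread uniformly at random over B pages."""
    return B * (1 - (1 - 1 / B) ** M)

def p_distinct(B, M, tuples_per_page, correlation):
    """Interpolate between the clustered and scattered extremes using
    the correlation statistic (a deliberate simplification)."""
    clustered = M / tuples_per_page             # adjacent pages
    scattered = min(M, distinct_pages(B, M))    # random placement
    c = abs(correlation)
    return c * clustered + (1 - c) * scattered

# 10,000 matches in a 100,000-page table, 100 tuples per page:
print(p_distinct(100_000, 10_000, 100, 0.98))  # a few hundred pages: mostly clustered
print(p_distinct(100_000, 10_000, 100, 0.05))  # ~9,000 pages: mostly random I/O
```

The same selectivity thus yields page-read estimates that differ by more than an order of magnitude depending on correlation, which is why the optimizer tracks it at all.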
```sql
-- Analyzing Index-Heap Correlation

-- PostgreSQL: View correlation statistics
SELECT schemaname, tablename,
       attname AS column_name,
       correlation
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'order_date';

-- Correlation values:
-- +1.0 : Perfect positive correlation (index order = heap order)
--        Excellent for range scans - sequential heap access
-- -1.0 : Perfect negative correlation (opposite orders)
--        Still good - predictable access pattern
--  0.0 : No correlation (random distribution)
--        Worst case - every tuple might be on a different page

-- Example correlations and their implications:

-- order_date with correlation = 0.98
-- → Orders stored roughly chronologically in heap
-- → Range scan on date range will access contiguous pages
-- → Cost closer to sequential scan for heap access

-- customer_id with correlation = 0.12
-- → Customer IDs scattered randomly in heap
-- → Each customer lookup requires random page access
-- → High heap access cost for index scans

-- The optimizer uses correlation to estimate P_distinct:
-- High correlation: P_distinct ≈ M × tuple_size / page_size
-- Low correlation:  P_distinct ≈ min(M, B) where B = total pages
```

Complete Cost Formula:
For a standard index scan:
Total Cost = Index_IO + Heap_IO
Where:

- Index_IO = H + L (root-to-leaf traversal plus matching leaf pages)
- Heap_IO = P_distinct × random_page_cost (one random read per distinct heap page)

For a bitmap index scan, heap pages are visited once each in physical order, so Heap_IO ≈ P_distinct sequential reads plus the cost of building the bitmap.

For an index-only scan, the heap phase is eliminated entirely, so Heap_IO ≈ 0 (apart from visibility checks).
Break-Even Point:
The optimizer compares index scan cost against sequential scan cost (B pages). The break-even selectivity is approximately:
s_break ≈ B / (H + L + P_distinct × random_io_factor)
When selectivity exceeds this threshold, sequential scans become cheaper despite reading the entire table.
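Plugging hypothetical numbers into the cost comparison makes the break-even behavior concrete (the random_page_cost = 4.0 weight mirrors PostgreSQL's default; all table and index figures here are made up):

```python
def seq_scan_cost(B, seq_page_cost=1.0):
    """Sequential scan: read every page of the table once."""
    return B * seq_page_cost

def index_scan_cost(H, L, p_distinct, random_page_cost=4.0):
    """Index traversal + leaf pages + one random read per distinct heap page."""
    return H + L + p_distinct * random_page_cost

# 100,000-page table; index height 3.
B = 100_000

# Selective predicate: 1,000 scattered matching rows.
idx = index_scan_cost(H=3, L=10, p_distinct=1_000)
print(idx < seq_scan_cost(B))  # True: 4,013 vs 100,000 -- index wins

# Unselective predicate: 50,000 scattered matching rows.
idx_many = index_scan_cost(H=3, L=500, p_distinct=50_000)
print(idx_many < seq_scan_cost(B))  # False: 200,503 vs 100,000 -- seq scan wins
```

The second case is the break-even rule in action: the random-I/O penalty on 50,000 heap pages costs twice as much as simply reading all 100,000 pages in order.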
Never underestimate random I/O cost. On spinning disks, random access is 100× slower than sequential. Even on NVMe SSDs, it's 3-5× slower. A query returning 10% of a table might seem ideal for an index, but if data is uncorrelated, the random heap access could be slower than scanning everything sequentially.
When multiple indexes exist on a table, the optimizer must choose which index (if any) to use for a given predicate. This section explores the strategies and factors influencing this decision.
Strategy 1: Predicate Matching
The most fundamental requirement is that an index can satisfy the predicate structure. This determines index applicability:
| Predicate Type | B+Tree Support | Hash Index Support | Notes |
|---|---|---|---|
| Equality (col = value) | ✓ Full support | ✓ Full support | Both index types optimal |
| Range (col > value) | ✓ Full support | ✗ Not supported | B+tree only (sorted structure) |
| IN list (col IN (...)) | ✓ Multiple lookups | ✓ Multiple lookups | Multiple index probes |
| LIKE 'prefix%' | ✓ Range scan | ✗ Not supported | Prefix translates to range |
| LIKE '%suffix' | ✗ Not efficient | ✗ Not supported | Pattern at start prevents index use |
| Function(col) = value | ✗ Unless expression index | ✗ Unless expression index | Index on raw column doesn't help |
| col IS NULL | ✓ (most databases) | Varies | NULL handling is vendor-specific |
Strategy 2: Cost-Based Selection
When multiple indexes can satisfy a predicate, the optimizer estimates the cost of each option and selects the cheapest, weighing each index's selectivity for the predicate, how many of the query's predicates and output columns it covers, and its correlation with the heap.
```sql
-- Index Selection Example

-- Table: orders (10 million rows)
-- Indexes:
--   idx_customer (customer_id)              -- Low selectivity per customer
--   idx_status (status)                     -- Very low selectivity (5 status values)
--   idx_date (order_date)                   -- High temporal correlation
--   idx_cust_date (customer_id, order_date) -- Composite

-- Query with multiple index options:
SELECT order_id, total, status
FROM orders
WHERE customer_id = 12345
  AND order_date > '2024-01-01';

-- Optimizer considerations:

-- Option 1: Use idx_customer
-- + Good selectivity on customer_id (~500 orders per customer)
-- - Must recheck order_date predicate on all 500 rows
-- - Each tuple fetch is random I/O

-- Option 2: Use idx_date
-- - Poor selectivity on date range (maybe 30% of table)
-- - Would scan millions of index entries
-- - Not a good choice

-- Option 3: Use idx_cust_date (composite)
-- + Excellent selectivity: customer_id AND date range
-- + Both predicates satisfied by index traversal
-- + Minimal heap access for matching rows
-- → This is likely the chosen index

-- View optimizer's choice:
EXPLAIN (ANALYZE)
SELECT order_id, total, status
FROM orders
WHERE customer_id = 12345
  AND order_date > '2024-01-01';
```

Strategy 3: Composite Index Considerations
Composite (multi-column) indexes require special consideration. The index can only be used efficiently if the query includes predicates on a prefix of the index columns: an index on (a, b, c) supports predicates on (a), on (a, b), or on (a, b, c), but not efficiently on (b) or (c) alone.
The optimizer must evaluate whether available indexes match the query's predicate structure.
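The prefix rule follows directly from sort order, which a small sketch makes concrete (illustrative Python; tuples model composite keys, and the data is made up): a leading-column predicate maps to one contiguous run of sorted entries, while a predicate on the second column alone does not.

```python
import bisect

# Composite index on (customer_id, order_date): entries sorted as tuples.
entries = sorted([
    (100, "2024-01-05"), (100, "2024-02-10"), (100, "2024-03-01"),
    (200, "2024-01-02"), (200, "2024-02-20"),
    (300, "2024-01-15"),
])

def prefix_range_scan(entries, customer_id, date_lo):
    """Usable access path: the predicate covers the leading column,
    so all matches form one contiguous run of the sorted entries."""
    start = bisect.bisect_left(entries, (customer_id, date_lo))
    out = []
    for e in entries[start:]:
        if e[0] != customer_id:   # left the customer's run -> stop
            break
        out.append(e)
    return out

# WHERE customer_id = 100 AND order_date >= '2024-02-01':
print(prefix_range_scan(entries, 100, "2024-02-01"))
# [(100, '2024-02-10'), (100, '2024-03-01')]

# WHERE order_date >= '2024-02-01' alone has no contiguous run:
# matches are interleaved across customer_ids, so without a skip-scan
# the whole index (or the heap) must be examined.
```

Binary search plus an early-exit loop is possible only because the predicate pins down a prefix of the sort key; drop the leading column and the sorted order no longer bounds the search.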
When the optimizer makes suboptimal index choices (often due to stale statistics), most databases provide hints or directives to force specific index use. PostgreSQL uses enable_seqscan/enable_indexscan settings, MySQL and Oracle use hint syntax like USE INDEX or INDEX hints. Use these sparingly and only after verifying the optimizer's statistics are accurate.
Modern database systems implement sophisticated optimizations to improve index scan performance beyond the basic algorithm. These optimizations address the fundamental challenges of random I/O and CPU overhead.
1. Index Prefetching:
During range scans, the database can issue asynchronous read requests for upcoming leaf pages, overlapping I/O with processing:
```sql
-- Index Prefetching Strategy

-- PostgreSQL: effective_io_concurrency controls prefetch depth
SET effective_io_concurrency = 200;  -- Higher for SSDs

-- How prefetching works during an index range scan:
--
-- Without prefetching:
-- [Read leaf 1] → [Process] → [Read leaf 2] → [Process] → ...
--     10ms          5ms          10ms           5ms
-- Total: 30ms for 2 pages
--
-- With prefetching:
-- [Read leaf 1] [Read leaf 2]
--      ↓              ↓
--  [Process]      [Process]
-- Total: 15ms (I/O overlapped with processing)

-- Oracle: db_file_multiblock_read_count for index leaf prefetch
-- MySQL/InnoDB: innodb_read_ahead_threshold controls prefetch trigger
```

2. Heap Access Batching:
Rather than fetching heap pages one at a time, the optimizer can batch TIDs and sort them by page number before accessing the heap, so that each qualifying page is read once in physical order; this is the core idea behind bitmap scans.
3. Index Condition Pushdown (ICP):
In MySQL/InnoDB and similar systems, portions of the WHERE clause can be evaluated at the index level, avoiding heap access for non-matching rows:
```sql
-- Index Condition Pushdown (MySQL Example)

-- Composite index: (customer_id, status, order_date)
CREATE INDEX idx_cust_status_date ON orders(customer_id, status, order_date);

-- Query with multiple conditions:
SELECT * FROM orders
WHERE customer_id = 100
  AND status = 'shipped'
  AND order_date > '2024-01-01';

-- Without ICP (traditional approach):
-- 1. Use index to find rows with customer_id = 100
-- 2. Fetch each row from heap
-- 3. Apply status and order_date filters on fetched rows
-- → Many unnecessary heap accesses

-- With ICP enabled (default in MySQL 5.6+):
-- 1. Use index to find rows with customer_id = 100
-- 2. For each index entry, check if status = 'shipped'
--    AND order_date > '2024-01-01' (values are in the index!)
-- 3. Only fetch heap rows that pass all index conditions
-- → Dramatic reduction in heap accesses

-- EXPLAIN shows "Using index condition":
EXPLAIN SELECT * FROM orders
WHERE customer_id = 100
  AND status = 'shipped'
  AND order_date > '2024-01-01';
-- Extra: Using index condition
```

4. Skip Scan Optimization:
Some databases can use an index even when the leading column is not in the predicate, by "skipping" through distinct values of the leading column:
```sql
-- Skip Scan Optimization (Oracle Example)

-- Index on (gender, last_name) with only 2 distinct values for gender
CREATE INDEX idx_gender_name ON employees(gender, last_name);

-- Query NOT filtering on leading column:
SELECT * FROM employees WHERE last_name = 'Smith';

-- Traditional approach: Full table scan (index not usable)

-- Skip Scan approach (Oracle, MySQL 8.0.13+):
-- 1. For each distinct value of 'gender' (M, F):
--    a. Range scan index for (gender = 'M', last_name = 'Smith')
--    b. Range scan index for (gender = 'F', last_name = 'Smith')
-- 2. Combine results

-- Effective when:
-- - Leading column has few distinct values
-- - Remaining columns have good selectivity
-- - Alternative is full table scan

-- Skip scan is essentially performing:
-- WHERE (gender = 'M' AND last_name = 'Smith')
--    OR (gender = 'F' AND last_name = 'Smith')
```

These optimizations have specific applicability conditions. Prefetching requires configurable I/O concurrency. ICP requires columns present in the index. Skip scan is only efficient for low-cardinality leading columns. The optimizer evaluates whether each optimization applies and estimates its benefit.
The fundamental question in selection implementation is: should we use an index or scan the entire table? The optimizer makes this decision by estimating and comparing costs. Understanding this framework enables diagnosing and correcting suboptimal query plans.
The Decision Equation:
Choose index scan when:
Cost(Index_Scan) < Cost(Sequential_Scan)
Expanding this:
H + L + (P_distinct × random_io_penalty) < B × sequential_io_cost
Where:

- H — index height (root-to-leaf page reads)
- L — leaf pages scanned
- P_distinct — distinct heap pages containing matching tuples
- B — total pages in the table
- random_io_penalty, sequential_io_cost — per-page cost weights (PostgreSQL exposes these as random_page_cost and seq_page_cost)
| Factor | Favors Index Scan | Favors Sequential Scan |
|---|---|---|
| Selectivity | Low (< 5-10% of rows) | High (> 20% of rows) |
| Table size | Large tables (millions of rows) | Small tables (< 1000 rows) |
| Correlation/Clustering | High correlation | Low correlation/scattered data |
| Query frequency | Frequent queries | Rare/one-time queries |
| Index availability | Matching index exists | No suitable index |
| I/O pattern | Random I/O acceptable | Sequential I/O preferred |
| Buffer pool state | Index/data cached | Cold cache |
```sql
-- Analyzing the Index vs Scan Decision

-- PostgreSQL: Compare costs explicitly
-- Force sequential scan:
SET enable_indexscan = off;
SET enable_bitmapscan = off;
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345;
-- Shows: Seq Scan cost = X

SET enable_indexscan = on;
SET enable_bitmapscan = on;
SET enable_seqscan = off;
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345;
-- Shows: Index Scan cost = Y

-- Compare X vs Y - the optimizer normally picks the lower cost

-- Key statistics influencing the decision:
SELECT relname,
       relpages AS heap_pages,   -- B in our formula
       reltuples AS heap_rows    -- T
FROM pg_class WHERE relname = 'orders';

-- Index statistics:
SELECT indexrelid::regclass AS index_name,
       relpages AS index_pages,
       reltuples AS index_entries
FROM pg_class c
JOIN pg_index i ON c.oid = i.indexrelid
WHERE indrelid = 'orders'::regclass;

-- Column statistics affecting selectivity:
SELECT attname,
       n_distinct,   -- Cardinality estimation
       correlation   -- Physical ordering correlation
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'customer_id';
```

Common Decision Pitfalls:
When you observe a suboptimal access path, first ANALYZE/UPDATE STATS the table. Then examine pg_stats/histogram data for the predicate column. Compare estimated vs actual row counts in EXPLAIN ANALYZE. If estimates are accurate but plan is still wrong, consider adjusting cost parameters (random_page_cost, seq_page_cost) to match your actual hardware characteristics.
Let us examine several real-world scenarios that demonstrate index selection principles in action. These examples illustrate common patterns and anti-patterns.
Example 1: The Deceptive Low-Cardinality Column
```sql
-- Low Cardinality Column: Status Field

-- Table: orders (100 million rows)
-- Column: status (5 distinct values: pending, processing, shipped, delivered, cancelled)

CREATE INDEX idx_status ON orders(status);

-- Query for shipped orders:
SELECT * FROM orders WHERE status = 'shipped';

-- Analysis:
-- status = 'shipped' matches ~20% of rows (20 million rows)
-- Index approach: 20 million random heap accesses
-- Sequential approach: Read all pages sequentially

-- Result: SEQUENTIAL SCAN is likely faster!
-- The index is nearly useless due to low selectivity

-- Better approaches for status queries:
-- 1. Use a partial index for rare statuses:
CREATE INDEX idx_pending ON orders(order_id) WHERE status = 'pending';
-- Only indexes the small subset of pending orders

-- 2. Consider partitioning by status:
CREATE TABLE orders (
    order_id BIGINT,
    status VARCHAR(20),
    ...
) PARTITION BY LIST (status);

-- 3. For status counting, use materialized views or triggers
```

Example 2: Range Query Performance Variation
```sql
-- Range Query: Correlation Impact

-- Table: sensor_readings (1 billion rows)
-- Columns: sensor_id, timestamp, value
-- Index: idx_timestamp (timestamp)

-- Scenario A: Time-Series Data (High Correlation)
-- Data inserted chronologically, timestamp ≈ heap order
-- correlation ≈ 0.95

SELECT * FROM sensor_readings
WHERE timestamp BETWEEN '2024-01-01' AND '2024-01-02';

-- Result: INDEX SCAN efficient!
-- - Index identifies matching timestamp range
-- - Heap access is nearly sequential (correlated)
-- - Reads ~100 contiguous heap pages

-- Scenario B: Reshuffled Data (Low Correlation)
-- After bulk load from multiple sources, timestamps scattered
-- correlation ≈ 0.05

SELECT * FROM sensor_readings
WHERE timestamp BETWEEN '2024-01-01' AND '2024-01-02';

-- Result: SEQUENTIAL SCAN may be chosen!
-- - Same index identifies same matching range
-- - But heap access hits pages scattered across the table
-- - 1 million random page reads vs reading everything sequentially

-- Solution for Scenario B:
-- CLUSTER sensor_readings USING idx_timestamp;
-- (Rewrites table in index order, restoring correlation)
```

Example 3: Index-Only Scan Optimization
```sql
-- Covering Index: Eliminating Heap Access

-- Table: products (10 million rows)
-- Frequent query pattern:
SELECT product_id, category, price
FROM products
WHERE category = 'Electronics'
ORDER BY price DESC
LIMIT 100;

-- Original index:
CREATE INDEX idx_category ON products(category);

-- Execution with original index:
-- 1. Index scan: Find entries with category='Electronics'
-- 2. Heap access: Fetch (product_id, price) for each
-- 3. Sort by price
-- 4. Return top 100
-- Cost: Millions of random heap reads + sort

-- Optimized covering index:
CREATE INDEX idx_cat_price_prod ON products(category, price DESC, product_id);

-- Execution with covering index:
-- 1. Index-only scan: Navigate to category='Electronics'
-- 2. Already sorted by price DESC in the index
-- 3. Return first 100 entries directly from the index
-- 4. NO HEAP ACCESS
-- Cost: ~10 index pages (dramatic improvement!)

-- PostgreSQL EXPLAIN showing improvement:
-- Before: Index Scan + Sort (cost=50000)
-- After:  Index Only Scan (cost=50) - 1000x faster!
```

These examples demonstrate key patterns: (1) index column cardinality must support selectivity, (2) physical data organization determines actual I/O behavior, and (3) covering indexes can eliminate entire cost components. Recognizing these patterns enables proactive index design rather than reactive optimization.
We have explored index-based selection in comprehensive depth, from fundamental mechanics through optimization techniques to practical decision-making frameworks. Let us consolidate the essential knowledge:
What's Next:
Having mastered single-predicate selection strategies, we now advance to Conjunctive Selection—implementing selections with multiple AND-ed predicates. This introduces the challenge of combining multiple index accesses or choosing a single dominant index.
You now possess expert-level understanding of index-based selection—the surgical access method that enables sub-millisecond queries on billion-row tables. This knowledge is essential for query optimization, index design, and understanding database performance characteristics.