While linear scans provide a reliable baseline for selection operations, they become prohibitively expensive for large tables when only a small fraction of rows match the predicate. Index-based selection transforms the access pattern from sequential-read-everything to surgical-retrieval-of-specific-rows.
An index is an auxiliary data structure that maps search key values to the locations of tuples containing those values. When a selection predicate involves an indexed column, the database can use the index to directly locate matching tuples without examining every row in the table. This transforms an O(N) operation into an O(log N) or even O(1) lookup.
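The complexity difference can be illustrated with a toy sketch (illustrative Python, not how a real engine stores data): the "index" is a sorted list of (key, heap position) pairs searched with binary search, while the linear scan must touch every row.

```python
import bisect

# Heap: rows of (emp_id, name) in arbitrary insertion order.
heap = [(4, "dana"), (1, "ada"), (3, "cy"), (2, "bo")]

# Linear scan: O(N) -- examine every tuple.
def linear_select(heap, key):
    return [row for row in heap if row[0] == key]

# Index: sorted (key, heap_position) pairs, searched in O(log N).
index = sorted((row[0], pos) for pos, row in enumerate(heap))
keys = [k for k, _ in index]

def index_select(heap, index, keys, key):
    i = bisect.bisect_left(keys, key)   # binary search to first match
    out = []
    while i < len(index) and index[i][0] == key:
        out.append(heap[index[i][1]])   # follow the "TID" into the heap
        i += 1
    return out

print(index_select(heap, index, keys, 3))  # [(3, 'cy')]
```

Both calls return the same rows; only the number of entries examined differs, which is the entire point of the index.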
Understanding index selection is crucial because it represents the primary mechanism by which databases achieve sub-millisecond response times on tables containing billions of rows. The decision of whether to use an index, which index to use, and how to use it forms the core of query optimization.
By the end of this page, you will understand the mechanics of index-based tuple retrieval, analyze the cost model for index scans across different index types, comprehend the critical distinction between index-only scans and index-heap scans, and master the decision framework for when index selection outperforms linear scanning.
An index scan uses an index structure to locate tuples matching a selection predicate. Unlike linear scans that process every page, index scans follow a targeted path through the index to retrieve only relevant data. The fundamental mechanism varies by index type, but B+tree indexes dominate in relational databases.
B+Tree Index Structure Recap:
Before diving into scan mechanics, let us recall the B+tree structure:
The B+tree maintains sorted order of keys, enabling both point lookups (equality predicates) and range scans (inequality predicates).
```
-- B+Tree Index Scan Algorithm
PROCEDURE BTIndexScan(Index I, Predicate P):
    -- Extract search key from predicate
    search_key ← ExtractKeyFromPredicate(P)
    result ← EmptyResultSet()

    -- Phase 1: Root-to-Leaf Traversal (Point Lookup)
    current_node ← I.root
    WHILE NOT IsLeaf(current_node):
        -- Binary search within node to find child pointer
        child_pointer ← SearchNode(current_node, search_key)
        current_node ← FetchPage(child_pointer)

    -- Phase 2: Leaf Node Search
    -- For equality: Find matching entries
    -- For range: Find start position
    leaf ← current_node
    position ← BinarySearchLeaf(leaf, search_key)

    -- Phase 3: Collect Matching Entries
    WHILE CurrentKeyMatchesPredicate(leaf, position, P):
        tid ← GetTupleId(leaf, position)

        -- Phase 4: Heap Access (unless index-only scan)
        tuple ← FetchTupleFromHeap(tid)
        IF EvaluateFullPredicate(tuple, P):  -- Recheck if needed
            result.Add(tuple)

        -- Advance to next entry
        position ← position + 1
        IF position >= leaf.num_entries:
            leaf ← leaf.next_leaf  -- Follow sibling pointer
            position ← 0
            IF leaf = NULL:
                BREAK

    RETURN result

-- Example: SELECT * FROM employees WHERE emp_id = 12345
-- 1. Root-to-leaf traversal: ~3-4 page reads (log complexity)
-- 2. Leaf scan: 1 page read (single match)
-- 3. Heap access: 1 page read (fetch actual tuple)
-- Total: ~5 page reads vs. thousands for table scan
```

The Two-Phase Access Pattern:
Most index scans involve two distinct phases:
Index Traversal Phase: Navigate the index structure to locate entries matching the predicate. This phase reads only index pages, which are typically smaller and more cache-friendly than data pages.
Heap Access Phase: For each matching index entry, fetch the actual tuple from the heap (table) using the stored tuple identifier (TID). This phase incurs random I/O as matching tuples are typically scattered across data pages.
This two-phase pattern has profound cost implications—the heap access phase often dominates total cost because random I/O is expensive.
Each leaf-level index entry contains: (1) the indexed column value(s), and (2) a tuple identifier (TID) pointing to the heap location. In PostgreSQL, a TID is (block_number, tuple_offset). In Oracle, a ROWID encodes similar information. The TID enables direct access to the tuple without additional searching.
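A minimal sketch of the two-phase pattern (hypothetical page layout and names): phase 1 collects matching (page, offset) TIDs from a sorted index, and phase 2 follows each TID into the heap. Counting distinct pages touched shows where the heap-access cost comes from.

```python
# Toy heap: pages of (order_date, total) tuples, 2 tuples per page.
heap_pages = [
    [("2024-01-01", 500), ("2024-03-05", 120)],   # page 0
    [("2024-01-02", 700), ("2024-02-11", 300)],   # page 1
    [("2024-01-03", 250), ("2024-04-09", 900)],   # page 2
]

# Index on the date column: sorted (key, TID) entries,
# where a TID is (page_number, tuple_offset).
index = sorted(
    (row[0], (p, off))
    for p, page in enumerate(heap_pages)
    for off, row in enumerate(page)
)

def index_scan(lo, hi):
    """Phase 1: find matching index entries; phase 2: fetch from heap."""
    tids = [tid for key, tid in index if lo <= key <= hi]
    pages_read = set()
    result = []
    for page_no, off in tids:       # one (possibly random) heap read per TID
        pages_read.add(page_no)
        result.append(heap_pages[page_no][off])
    return result, len(pages_read)

rows, page_reads = index_scan("2024-01-01", "2024-01-03")
# Three matches scattered over three different pages -> three heap page reads,
# even though the index traversal itself was cheap.
```

Three matching tuples required three separate heap page reads here, which is exactly the scattered-data cost the next sections quantify.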
Database systems implement several variations of index scans, each optimized for specific access patterns. Understanding these variations is essential for query plan interpretation and optimization.
1. Index-Only Scan (Covering Index Scan):
When the index contains all columns needed by the query (both for filtering and result projection), the heap access phase can be eliminated entirely. This is the most efficient form of index access.
```sql
-- Index-Only Scan Example

-- Create covering index including all needed columns
CREATE INDEX idx_emp_salary_dept ON employees(department_id, salary);

-- Query that can be satisfied entirely from the index
SELECT department_id, salary
FROM employees
WHERE department_id = 10;

-- PostgreSQL EXPLAIN output:
-- Index Only Scan using idx_emp_salary_dept on employees
--   Index Cond: (department_id = 10)
--   Heap Fetches: 0  -- No heap access needed!

-- Why it works:
-- - Predicate column (department_id) is in the index
-- - All SELECT columns (department_id, salary) are in the index
-- - No heap access required → dramatically faster

-- Visibility Note:
-- PostgreSQL must still check the visibility map to ensure tuples
-- are visible to the current transaction. If a page isn't all-visible,
-- heap access is required even for index-only scans.
```

2. Index Scan (with Heap Access):
The standard index scan retrieves TIDs from the index, then fetches the corresponding tuples from the heap. This is necessary whenever the query references columns that are not stored in the index.
```sql
-- Standard Index Scan with Heap Access

-- Index only contains employee_id
CREATE INDEX idx_emp_id ON employees(employee_id);

-- Query needs columns not in the index
SELECT employee_id, name, salary, hire_date
FROM employees
WHERE employee_id = 12345;

-- PostgreSQL EXPLAIN output:
-- Index Scan using idx_emp_id on employees
--   Index Cond: (employee_id = 12345)
--   (cost=0.29..8.31 rows=1 width=48)

-- The executor:
-- 1. Traverses idx_emp_id to find entry for employee_id = 12345
-- 2. Retrieves the TID (page_number, tuple_offset)
-- 3. Fetches the heap page containing the actual tuple
-- 4. Returns the full row from the heap page
```

3. Bitmap Index Scan:
For predicates matching multiple tuples scattered across many pages, bitmap index scans provide an efficient alternative. Instead of fetching tuples one-by-one (causing random I/O), the bitmap scan first collects all matching TIDs into a bitmap, sorts them by page number, and then reads each qualifying heap page once in physical order.
This converts random I/O into sequential I/O, dramatically improving performance for medium-selectivity queries.
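The page-sorting step at the heart of a bitmap scan can be sketched as follows (illustrative Python; real bitmaps are compressed page-level bit structures, not dictionaries):

```python
from collections import defaultdict

# TIDs returned by an index scan, in key order -- pages are scattered.
tids_in_key_order = [(7, 2), (1, 3), (4, 1), (1, 7), (4, 5), (4, 9)]

def build_bitmap(tids):
    """Group tuple offsets by page, then order pages physically."""
    bitmap = defaultdict(list)
    for page, offset in tids:
        bitmap[page].append(offset)
    # Visiting pages in ascending order turns scattered matches
    # into a single sequential pass over the heap.
    return sorted(bitmap.items())

bitmap = build_bitmap(tids_in_key_order)
print(bitmap)  # [(1, [3, 7]), (4, [1, 5, 9]), (7, [2])]
```

Each heap page is now read exactly once, in ascending physical order, regardless of how the matching keys were interleaved in the index.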
```sql
-- Bitmap Index Scan

CREATE INDEX idx_salary ON employees(salary);

-- Query returning many scattered tuples
SELECT * FROM employees WHERE salary BETWEEN 50000 AND 60000;

-- PostgreSQL EXPLAIN output:
-- Bitmap Heap Scan on employees
--   Recheck Cond: (salary >= 50000 AND salary <= 60000)
--   -> Bitmap Index Scan on idx_salary
--        Index Cond: (salary >= 50000 AND salary <= 60000)

-- Execution phases:
-- Phase 1: Bitmap Index Scan
--   - Traverse index for range [50000, 60000]
--   - Build bitmap: page 1 -> [tuple 3, tuple 7]
--                   page 4 -> [tuple 1, tuple 5, tuple 9]
--                   page 7 -> [tuple 2]
--                   ... (sorted by page number)

-- Phase 2: Bitmap Heap Scan
--   - Read heap pages in physical order (1, 4, 7, ...)
--   - Sequential I/O pattern even for scattered matches
--   - Recheck condition (may be lossy for large bitmaps)

-- Bitmap becomes "lossy" when too large:
--   - Stores only page numbers, not tuple offsets
--   - All tuples on matching pages must be rechecked
```

| Scan Type | Index Access | Heap Access | I/O Pattern | Best For |
|---|---|---|---|---|
| Index-Only Scan | Sequential (leaf→leaf) | None required | Sequential | Covering indexes, high selectivity |
| Index Scan | Sequential (leaf→leaf) | Random per-tuple | Random | High selectivity (few matches) |
| Bitmap Index Scan | Sequential (build bitmap) | Sequential (sorted pages) | Sequential | Medium selectivity, scattered data |
The query optimizer automatically chooses among these scan types based on estimated costs. Index-only scans are preferred when possible. For point queries (= operator), standard index scans suffice. For range queries returning many rows, bitmap scans convert random heap access into sequential access, often providing 10-100× speedup over standard index scans.
Accurate cost estimation for index scans is critical for the optimizer to make correct access path decisions. The cost model must account for both index access costs and heap access costs.
Cost Components:
Let us define the parameters used throughout this section:

- B — number of pages in the table (heap)
- T — number of tuples in the table
- H — height of the B+tree index (pages read during root-to-leaf traversal)
- Leaf_Pages — number of leaf pages in the index
- M — number of tuples matching the predicate
- s — selectivity of the predicate (s = M / T)
- P_distinct — number of distinct heap pages containing matching tuples
Index Access Cost:
| Operation | Cost Formula | Explanation |
|---|---|---|
| Root-to-leaf traversal | H page reads | Fixed cost to reach first leaf node |
| Leaf scan (equality) | ⌈M / entries_per_page⌉ | Pages containing matching entries |
| Leaf scan (range) | L = s × Leaf_Pages | Proportion of leaf pages in range |
| Total index I/O | H + L | Traversal + leaf scanning |
Heap Access Cost (Critical Factor):
The heap access cost depends on how matching tuples are distributed across data pages. This is where index scans can become expensive. Consider two extremes:
Best Case (Clustered Data): All matching tuples reside in adjacent pages. P_distinct ≈ M × tuple_size / page_size. Heap access is nearly sequential.
Worst Case (Scattered Data): Each matching tuple resides on a different page. P_distinct ≈ M. Every tuple fetch requires a separate random I/O.
The correlation between index order and physical tuple order determines where on this spectrum actual performance falls.
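One way this estimate can be sketched (illustrative Python, not any specific engine's formula): the scattered-case expectation uses the standard Cardenas approximation B(1 − (1 − 1/B)^M) for distinct pages touched by M uniformly random tuples, and the correlation statistic interpolates between the clustered and scattered extremes.

```python
def distinct_pages(B, M):
    """Cardenas approximation: expected distinct pages touched when
    M matching tuples are spread uniformly at random over B pages."""
    return B * (1 - (1 - 1 / B) ** M)

def p_distinct(B, M, tuples_per_page, correlation):
    """Interpolate between the clustered and scattered extremes using
    the correlation statistic (a deliberate simplification)."""
    clustered = M / tuples_per_page             # adjacent pages
    scattered = min(M, distinct_pages(B, M))    # random placement
    c = abs(correlation)
    return c * clustered + (1 - c) * scattered

# 10,000 matches in a 100,000-page table, 100 tuples per page:
print(p_distinct(100_000, 10_000, 100, 0.98))  # a few hundred pages: mostly clustered
print(p_distinct(100_000, 10_000, 100, 0.05))  # ~9,000 pages: mostly random I/O
```

The same selectivity thus yields page-read estimates that differ by more than an order of magnitude depending on correlation, which is why the optimizer tracks it at all.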
```sql
-- Analyzing Index-Heap Correlation

-- PostgreSQL: View correlation statistics
SELECT schemaname, tablename,
       attname AS column_name,
       correlation
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'order_date';

-- Correlation values:
-- +1.0 : Perfect positive correlation (index order = heap order)
--        Excellent for range scans - sequential heap access
-- -1.0 : Perfect negative correlation (opposite orders)
--        Still good - predictable access pattern
--  0.0 : No correlation (random distribution)
--        Worst case - every tuple might be on a different page

-- Example correlations and their implications:

-- order_date with correlation = 0.98
-- → Orders stored roughly chronologically in heap
-- → Range scan on date range will access contiguous pages
-- → Cost closer to sequential scan for heap access

-- customer_id with correlation = 0.12
-- → Customer IDs scattered randomly in heap
-- → Each customer lookup requires random page access
-- → High heap access cost for index scans

-- The optimizer uses correlation to estimate P_distinct:
-- High correlation: P_distinct ≈ M × tuple_size / page_size
-- Low correlation:  P_distinct ≈ min(M, B) where B = total pages
```

Complete Cost Formula:
For a standard index scan:
Total Cost = Index_IO + Heap_IO
Where:

- Index_IO = H + L (root-to-leaf traversal plus matching leaf pages)
- Heap_IO = P_distinct × random_page_cost (one random read per distinct heap page)

For a bitmap index scan, heap pages are visited once each in physical order, so Heap_IO ≈ P_distinct sequential reads plus the cost of building the bitmap.

For an index-only scan, the heap phase is eliminated entirely, so Heap_IO ≈ 0 (apart from visibility checks).
Break-Even Point:
The optimizer compares index scan cost against sequential scan cost (B pages). The break-even selectivity is approximately:
s_break ≈ B / (H + L + P_distinct × random_io_factor)
When selectivity exceeds this threshold, sequential scans become cheaper despite reading the entire table.
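Plugging hypothetical numbers into the cost comparison makes the break-even behavior concrete (the random_page_cost = 4.0 weight mirrors PostgreSQL's default; all table and index figures here are made up):

```python
def seq_scan_cost(B, seq_page_cost=1.0):
    """Sequential scan: read every page of the table once."""
    return B * seq_page_cost

def index_scan_cost(H, L, p_distinct, random_page_cost=4.0):
    """Index traversal + leaf pages + one random read per distinct heap page."""
    return H + L + p_distinct * random_page_cost

# 100,000-page table; index height 3.
B = 100_000

# Selective predicate: 1,000 scattered matching rows.
idx = index_scan_cost(H=3, L=10, p_distinct=1_000)
print(idx < seq_scan_cost(B))  # True: 4,013 vs 100,000 -- index wins

# Unselective predicate: 50,000 scattered matching rows.
idx_many = index_scan_cost(H=3, L=500, p_distinct=50_000)
print(idx_many < seq_scan_cost(B))  # False: 200,503 vs 100,000 -- seq scan wins
```

The second case is the break-even rule in action: the random-I/O penalty on 50,000 heap pages costs twice as much as simply reading all 100,000 pages in order.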
Never underestimate random I/O cost. On spinning disks, random access is 100× slower than sequential. Even on NVMe SSDs, it's 3-5× slower. A query returning 10% of a table might seem ideal for an index, but if data is uncorrelated, the random heap access could be slower than scanning everything sequentially.
When multiple indexes exist on a table, the optimizer must choose which index (if any) to use for a given predicate. This section explores the strategies and factors influencing this decision.
Strategy 1: Predicate Matching
The most fundamental requirement is that an index can satisfy the predicate structure. This determines index applicability:
| Predicate Type | B+Tree Support | Hash Index Support | Notes |
|---|---|---|---|
| Equality (col = value) | ✓ Full support | ✓ Full support | Both index types optimal |
| Range (col > value) | ✓ Full support | ✗ Not supported | B+tree only (sorted structure) |
| IN list (col IN (...)) | ✓ Multiple lookups | ✓ Multiple lookups | Multiple index probes |
| LIKE 'prefix%' | ✓ Range scan | ✗ Not supported | Prefix translates to range |
| LIKE '%suffix' | ✗ Not efficient | ✗ Not supported | Pattern at start prevents index use |
| Function(col) = value | ✗ Unless expression index | ✗ Unless expression index | Index on raw column doesn't help |
| col IS NULL | ✓ (most databases) | Varies | NULL handling is vendor-specific |
Strategy 2: Cost-Based Selection
When multiple indexes can satisfy a predicate, the optimizer estimates the cost of each option and selects the cheapest, weighing each index's selectivity for the predicate, how many of the query's predicates and output columns it covers, and its correlation with the heap.
```sql
-- Index Selection Example

-- Table: orders (10 million rows)
-- Indexes:
--   idx_customer (customer_id)              -- Low selectivity per customer
--   idx_status (status)                     -- Very low selectivity (5 status values)
--   idx_date (order_date)                   -- High temporal correlation
--   idx_cust_date (customer_id, order_date) -- Composite

-- Query with multiple index options:
SELECT order_id, total, status
FROM orders
WHERE customer_id = 12345
  AND order_date > '2024-01-01';

-- Optimizer considerations:

-- Option 1: Use idx_customer
-- + Good selectivity on customer_id (~500 orders per customer)
-- - Must recheck order_date predicate on all 500 rows
-- - Each tuple fetch is random I/O

-- Option 2: Use idx_date
-- - Poor selectivity on date range (maybe 30% of table)
-- - Would scan millions of index entries
-- - Not a good choice

-- Option 3: Use idx_cust_date (composite)
-- + Excellent selectivity: customer_id AND date range
-- + Both predicates satisfied by index traversal
-- + Minimal heap access for matching rows
-- → This is likely the chosen index

-- View optimizer's choice:
EXPLAIN (ANALYZE)
SELECT order_id, total, status
FROM orders
WHERE customer_id = 12345
  AND order_date > '2024-01-01';
```

Strategy 3: Composite Index Considerations
Composite (multi-column) indexes require special consideration. The index can only be used efficiently if the query includes predicates on a prefix of the index columns: an index on (a, b, c) supports predicates on (a), on (a, b), or on (a, b, c), but not efficiently on (b) or (c) alone.
The optimizer must evaluate whether available indexes match the query's predicate structure.
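The prefix rule follows directly from sort order, which a small sketch makes concrete (illustrative Python; tuples model composite keys, and the data is made up): a leading-column predicate maps to one contiguous run of sorted entries, while a predicate on the second column alone does not.

```python
import bisect

# Composite index on (customer_id, order_date): entries sorted as tuples.
entries = sorted([
    (100, "2024-01-05"), (100, "2024-02-10"), (100, "2024-03-01"),
    (200, "2024-01-02"), (200, "2024-02-20"),
    (300, "2024-01-15"),
])

def prefix_range_scan(entries, customer_id, date_lo):
    """Usable access path: the predicate covers the leading column,
    so all matches form one contiguous run of the sorted entries."""
    start = bisect.bisect_left(entries, (customer_id, date_lo))
    out = []
    for e in entries[start:]:
        if e[0] != customer_id:   # left the customer's run -> stop
            break
        out.append(e)
    return out

# WHERE customer_id = 100 AND order_date >= '2024-02-01':
print(prefix_range_scan(entries, 100, "2024-02-01"))
# [(100, '2024-02-10'), (100, '2024-03-01')]

# WHERE order_date >= '2024-02-01' alone has no contiguous run:
# matches are interleaved across customer_ids, so without a skip-scan
# the whole index (or the heap) must be examined.
```

Binary search plus an early-exit loop is possible only because the predicate pins down a prefix of the sort key; drop the leading column and the sorted order no longer bounds the search.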
When the optimizer makes suboptimal index choices (often due to stale statistics), most databases provide hints or directives to force specific index use. PostgreSQL uses enable_seqscan/enable_indexscan settings, MySQL and Oracle use hint syntax like USE INDEX or INDEX hints. Use these sparingly and only after verifying the optimizer's statistics are accurate.
Modern database systems implement sophisticated optimizations to improve index scan performance beyond the basic algorithm. These optimizations address the fundamental challenges of random I/O and CPU overhead.
1. Index Prefetching:
During range scans, the database can issue asynchronous read requests for upcoming leaf pages, overlapping I/O with processing:
```sql
-- Index Prefetching Strategy

-- PostgreSQL: effective_io_concurrency controls prefetch depth
SET effective_io_concurrency = 200;  -- Higher for SSDs

-- How prefetching works during an index range scan:
--
-- Without prefetching:
-- [Read leaf 1] → [Process] → [Read leaf 2] → [Process] → ...
--     10ms          5ms          10ms           5ms
-- Total: 30ms for 2 pages
--
-- With prefetching:
-- [Read leaf 1] [Read leaf 2]
--      ↓              ↓
--  [Process]      [Process]
-- Total: 15ms (I/O overlapped with processing)

-- Oracle: db_file_multiblock_read_count for index leaf prefetch
-- MySQL/InnoDB: innodb_read_ahead_threshold controls prefetch trigger
```

2. Heap Access Batching:
Rather than fetching heap pages one at a time, the optimizer can batch TIDs and sort them by page number before accessing the heap, so that each qualifying page is read once in physical order; this is the core idea behind bitmap scans.
3. Index Condition Pushdown (ICP):
In MySQL/InnoDB and similar systems, portions of the WHERE clause can be evaluated at the index level, avoiding heap access for non-matching rows:
```sql
-- Index Condition Pushdown (MySQL Example)

-- Composite index: (customer_id, status, order_date)
CREATE INDEX idx_cust_status_date ON orders(customer_id, status, order_date);

-- Query with multiple conditions:
SELECT * FROM orders
WHERE customer_id = 100
  AND status = 'shipped'
  AND order_date > '2024-01-01';

-- Without ICP (traditional approach):
-- 1. Use index to find rows with customer_id = 100
-- 2. Fetch each row from heap
-- 3. Apply status and order_date filters on fetched rows
-- → Many unnecessary heap accesses

-- With ICP enabled (default in MySQL 5.6+):
-- 1. Use index to find rows with customer_id = 100
-- 2. For each index entry, check if status = 'shipped'
--    AND order_date > '2024-01-01' (values are in the index!)
-- 3. Only fetch heap rows that pass all index conditions
-- → Dramatic reduction in heap accesses

-- EXPLAIN shows "Using index condition":
EXPLAIN SELECT * FROM orders
WHERE customer_id = 100
  AND status = 'shipped'
  AND order_date > '2024-01-01';
-- Extra: Using index condition
```

4. Skip Scan Optimization:
Some databases can use an index even when the leading column is not in the predicate, by "skipping" through distinct values of the leading column:
```sql
-- Skip Scan Optimization (Oracle Example)

-- Index on (gender, last_name) with only 2 distinct values for gender
CREATE INDEX idx_gender_name ON employees(gender, last_name);

-- Query NOT filtering on leading column:
SELECT * FROM employees WHERE last_name = 'Smith';

-- Traditional approach: Full table scan (index not usable)

-- Skip Scan approach (Oracle, MySQL 8.0.13+):
-- 1. For each distinct value of 'gender' (M, F):
--    a. Range scan index for (gender = 'M', last_name = 'Smith')
--    b. Range scan index for (gender = 'F', last_name = 'Smith')
-- 2. Combine results

-- Effective when:
-- - Leading column has few distinct values
-- - Remaining columns have good selectivity
-- - Alternative is full table scan

-- Skip scan is essentially performing:
-- WHERE (gender = 'M' AND last_name = 'Smith')
--    OR (gender = 'F' AND last_name = 'Smith')
```

These optimizations have specific applicability conditions. Prefetching requires configurable I/O concurrency. ICP requires columns present in the index. Skip scan is only efficient for low-cardinality leading columns. The optimizer evaluates whether each optimization applies and estimates its benefit.
The fundamental question in selection implementation is: should we use an index or scan the entire table? The optimizer makes this decision by estimating and comparing costs. Understanding this framework enables diagnosing and correcting suboptimal query plans.
The Decision Equation:
Choose index scan when:
Cost(Index_Scan) < Cost(Sequential_Scan)
Expanding this:
H + L + (P_distinct × random_io_penalty) < B × sequential_io_cost
Where:

- H — index height (root-to-leaf page reads)
- L — leaf pages scanned
- P_distinct — distinct heap pages containing matching tuples
- B — total pages in the table
- random_io_penalty, sequential_io_cost — per-page cost weights (PostgreSQL exposes these as random_page_cost and seq_page_cost)
| Factor | Favors Index Scan | Favors Sequential Scan |
|---|---|---|
| Selectivity | Low (< 5-10% of rows) | High (> 20% of rows) |
| Table size | Large tables (millions of rows) | Small tables (< 1000 rows) |
| Correlation/Clustering | High correlation | Low correlation/scattered data |
| Query frequency | Frequent queries | Rare/one-time queries |
| Index availability | Matching index exists | No suitable index |
| I/O pattern | Random I/O acceptable | Sequential I/O preferred |
| Buffer pool state | Index/data cached | Cold cache |
```sql
-- Analyzing the Index vs Scan Decision

-- PostgreSQL: Compare costs explicitly
-- Force sequential scan:
SET enable_indexscan = off;
SET enable_bitmapscan = off;
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345;
-- Shows: Seq Scan cost = X

SET enable_indexscan = on;
SET enable_bitmapscan = on;
SET enable_seqscan = off;
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345;
-- Shows: Index Scan cost = Y

-- Compare X vs Y - the optimizer normally picks the lower cost

-- Key statistics influencing the decision:
SELECT relname,
       relpages AS heap_pages,   -- B in our formula
       reltuples AS heap_rows    -- T
FROM pg_class WHERE relname = 'orders';

-- Index statistics:
SELECT indexrelid::regclass AS index_name,
       relpages AS index_pages,
       reltuples AS index_entries
FROM pg_class c
JOIN pg_index i ON c.oid = i.indexrelid
WHERE indrelid = 'orders'::regclass;

-- Column statistics affecting selectivity:
SELECT attname,
       n_distinct,   -- Cardinality estimation
       correlation   -- Physical ordering correlation
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'customer_id';
```

Common Decision Pitfalls:
When you observe a suboptimal access path, first ANALYZE/UPDATE STATS the table. Then examine pg_stats/histogram data for the predicate column. Compare estimated vs actual row counts in EXPLAIN ANALYZE. If estimates are accurate but plan is still wrong, consider adjusting cost parameters (random_page_cost, seq_page_cost) to match your actual hardware characteristics.
Let us examine several real-world scenarios that demonstrate index selection principles in action. These examples illustrate common patterns and anti-patterns.
Example 1: The Deceptive Low-Cardinality Column
```sql
-- Low Cardinality Column: Status Field

-- Table: orders (100 million rows)
-- Column: status (5 distinct values: pending, processing, shipped, delivered, cancelled)

CREATE INDEX idx_status ON orders(status);

-- Query for shipped orders:
SELECT * FROM orders WHERE status = 'shipped';

-- Analysis:
-- status = 'shipped' matches ~20% of rows (20 million rows)
-- Index approach: 20 million random heap accesses
-- Sequential approach: Read all pages sequentially

-- Result: SEQUENTIAL SCAN is likely faster!
-- The index is nearly useless due to low selectivity

-- Better approaches for status queries:
-- 1. Use a partial index for rare statuses:
CREATE INDEX idx_pending ON orders(order_id) WHERE status = 'pending';
-- Only indexes the small subset of pending orders

-- 2. Consider partitioning by status:
CREATE TABLE orders (
    order_id BIGINT,
    status VARCHAR(20),
    ...
) PARTITION BY LIST (status);

-- 3. For status counting, use materialized views or triggers
```

Example 2: Range Query Performance Variation
```sql
-- Range Query: Correlation Impact

-- Table: sensor_readings (1 billion rows)
-- Columns: sensor_id, timestamp, value
-- Index: idx_timestamp (timestamp)

-- Scenario A: Time-Series Data (High Correlation)
-- Data inserted chronologically, timestamp ≈ heap order
-- correlation ≈ 0.95

SELECT * FROM sensor_readings
WHERE timestamp BETWEEN '2024-01-01' AND '2024-01-02';

-- Result: INDEX SCAN efficient!
-- - Index identifies matching timestamp range
-- - Heap access is nearly sequential (correlated)
-- - Reads ~100 contiguous heap pages

-- Scenario B: Reshuffled Data (Low Correlation)
-- After bulk load from multiple sources, timestamps scattered
-- correlation ≈ 0.05

SELECT * FROM sensor_readings
WHERE timestamp BETWEEN '2024-01-01' AND '2024-01-02';

-- Result: SEQUENTIAL SCAN may be chosen!
-- - Same index identifies same matching range
-- - But heap access hits pages scattered across the table
-- - 1 million random page reads vs reading everything sequentially

-- Solution for Scenario B:
-- CLUSTER sensor_readings USING idx_timestamp;
-- (Rewrites table in index order, restoring correlation)
```

Example 3: Index-Only Scan Optimization
```sql
-- Covering Index: Eliminating Heap Access

-- Table: products (10 million rows)
-- Frequent query pattern:
SELECT product_id, category, price
FROM products
WHERE category = 'Electronics'
ORDER BY price DESC
LIMIT 100;

-- Original index:
CREATE INDEX idx_category ON products(category);

-- Execution with original index:
-- 1. Index scan: Find entries with category='Electronics'
-- 2. Heap access: Fetch (product_id, price) for each
-- 3. Sort by price
-- 4. Return top 100
-- Cost: Millions of random heap reads + sort

-- Optimized covering index:
CREATE INDEX idx_cat_price_prod ON products(category, price DESC, product_id);

-- Execution with covering index:
-- 1. Index-only scan: Navigate to category='Electronics'
-- 2. Already sorted by price DESC in the index
-- 3. Return first 100 entries directly from the index
-- 4. NO HEAP ACCESS
-- Cost: ~10 index pages (dramatic improvement!)

-- PostgreSQL EXPLAIN showing improvement:
-- Before: Index Scan + Sort (cost=50000)
-- After:  Index Only Scan (cost=50) - 1000x faster!
```

These examples demonstrate key patterns: (1) index column cardinality must support selectivity, (2) physical data organization determines actual I/O behavior, and (3) covering indexes can eliminate entire cost components. Recognizing these patterns enables proactive index design rather than reactive optimization.
We have explored index-based selection in comprehensive depth, from fundamental mechanics through optimization techniques to practical decision-making frameworks. Let us consolidate the essential knowledge:
What's Next:
Having mastered single-predicate selection strategies, we now advance to Conjunctive Selection—implementing selections with multiple AND-ed predicates. This introduces the challenge of combining multiple index accesses or choosing a single dominant index.
You now possess expert-level understanding of index-based selection—the surgical access method that enables sub-millisecond queries on billion-row tables. This knowledge is essential for query optimization, index design, and understanding database performance characteristics.