Imagine a library with 10 million books but no card catalog, no Dewey Decimal System, no organization whatsoever. Finding a specific book would require examining every single shelf—potentially taking hours or days. This is precisely what happens when a database query runs without an appropriate index: the database must scan every row in the table, one by one.
Indexes are the card catalog of your database. They are auxiliary data structures that map search keys to physical row locations, enabling queries to jump directly to the data they need. The difference between a query with and without an appropriate index can be the difference between 1 millisecond and 10 seconds—a factor of 10,000.
But indexes are not free. They consume storage space, slow down write operations, and add complexity to database maintenance. Understanding when and how to create indexes is one of the most impactful skills in database performance optimization.
By the end of this page, you will understand: the internal structure of B-tree and other index types; how to design effective composite indexes; the power of covering indexes for eliminating table lookups; expression and partial indexes for specialized use cases; and the critical tradeoffs between read and write performance.
The B-tree (balanced tree) is the dominant index structure in relational databases. PostgreSQL, MySQL, SQL Server, Oracle, and SQLite all use B-trees as their primary index type. Understanding B-tree structure is essential for effective index design.
B-Tree Structure:
A B-tree is a self-balancing tree in which each node stores multiple keys in sorted order, with child pointers between them. Nodes are sized to fit a disk page, so a single node can have hundreds of children.
The key property of B-trees is that all leaf nodes are at the same depth, ensuring consistent O(log n) lookup time regardless of which key you're searching for. For a table with 1 billion rows, a B-tree with a branching factor of 100 requires only 5 levels—meaning any lookup touches at most 5 pages.
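The height claim is easy to check with integer arithmetic: a tree of height h with branching factor b can address up to b^h keys, so the height is the smallest h where b^h reaches the key count. A small sketch (the function name is illustrative):

```python
def btree_height(n_keys: int, branching: int) -> int:
    """Smallest height h such that branching**h >= n_keys."""
    height, reach = 1, branching
    while reach < n_keys:
        reach *= branching
        height += 1
    return height

# 1 billion keys with branching factor 100 -> 5 levels
print(btree_height(1_000_000_000, 100))    # 5
# Growing the table 100x adds only one more level
print(btree_height(100_000_000_000, 100))  # 6
```

This is why B-tree lookups barely slow down as tables grow: each hundredfold increase in rows costs one extra level.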
```
B-Tree Index Structure (simplified)

                    [Root: 50, 100]
                   /       |        \
           [20, 35]    [65, 85]    [120, 150]
           /  |  \     /  |  \      /  |  \
       [Leaf][Leaf][Leaf] [Leaf][Leaf][Leaf] [Leaf][Leaf][Leaf]

Finding key "75":
1. Start at root [50, 100]
2. 75 > 50 and 75 < 100 → go to middle child
3. At [65, 85]: 75 > 65 and 75 < 85 → go to middle child
4. Scan leaf node for key 75 → found in 3 node accesses!

B-tree properties:
- Height is O(log n) where n = number of keys
- All leaves at same depth (self-balancing)
- Nodes are sized to fit disk pages (typically 8KB)
- Branching factor of 100-1000 typical (depends on key size)
- 1 billion rows ≈ 4-5 levels = 4-5 disk reads max
```

Why B-trees dominate database indexing:
| Property | Benefit for Databases |
|---|---|
| O(log n) lookups | Consistent fast search regardless of table size |
| Sorted leaf nodes | Efficient range queries and ORDER BY |
| High branching factor | Shallow trees = fewer disk reads |
| Node size = page size | One I/O per tree level |
| Self-balancing | No degradation over time |
B-tree operations and their costs:
- Point lookup (`WHERE id = 123`): O(log n) — traverse tree to find key
- Range scan (`WHERE price BETWEEN 10 AND 50`): O(log n + k) — find start, then scan k matching leaves

Database performance is dominated by disk I/O, not CPU. A B-tree node is sized to match a disk page (typically 8KB). Each level of the tree requires one disk read. With caching, upper levels stay in memory—only the final 1-2 levels typically require disk access. This is why index lookups are so much faster than table scans: 2-3 random reads versus potentially millions of sequential reads.
While B-trees handle most indexing needs, databases offer specialized index types optimized for specific data types and access patterns.
| Index Type | Best For | Limitations | Database Support |
|---|---|---|---|
| B-Tree | Equality, range queries, sorting | Poor for low-cardinality | All databases (default) |
| Hash | Equality lookups only | No range queries, no sorting | PostgreSQL, MySQL Memory engine |
| GiST (Generalized Search Tree) | Geometric, full-text, custom types | Complex, larger size | PostgreSQL |
| GIN (Generalized Inverted) | Arrays, full-text, JSONB | Slow updates | PostgreSQL |
| BRIN (Block Range Index) | Naturally sorted large tables | Low selectivity on random data | PostgreSQL 9.5+ |
| Full-Text | Text search with ranking | Not for exact matches | PostgreSQL, MySQL, SQL Server |
| Spatial (R-Tree) | Geographic/geometric data | Specialized use case | PostgreSQL (PostGIS), MySQL |
| Bitmap | Data warehousing, low-cardinality | Read-heavy only | Oracle, PostgreSQL (runtime) |
```sql
-- B-Tree (default) - equality and range queries
CREATE INDEX idx_orders_date ON orders(created_at);
-- Supports: =, <, <=, >, >=, BETWEEN, IN, IS NULL

-- Hash - only equality, but very fast
CREATE INDEX idx_sessions_token ON sessions USING hash(token);
-- Only supports: =
-- Note: Pre-PostgreSQL 10, hash indexes weren't crash-safe

-- GIN - for arrays and JSONB
CREATE INDEX idx_products_tags ON products USING gin(tags);
-- Supports: @>, <@, && for arrays
-- Example query: SELECT * FROM products WHERE tags @> ARRAY['sale'];

-- GIN for JSONB
CREATE INDEX idx_events_data ON events USING gin(metadata);
-- Example: SELECT * FROM events WHERE metadata @> '{"type": "click"}';

-- GiST - for geometric and range types
CREATE INDEX idx_stores_location ON stores USING gist(location);
-- Supports: <<, >>, ~=, && for geometric operations
-- With PostGIS: ST_Contains, ST_DWithin, ST_Intersects

-- BRIN - for huge, naturally sorted tables (e.g., time-series)
CREATE INDEX idx_logs_timestamp ON logs USING brin(created_at);
-- Tiny index (1000x smaller than B-tree) for sorted data
-- Perfect for append-only tables with timestamp columns

-- Full-text search
CREATE INDEX idx_articles_search ON articles
  USING gin(to_tsvector('english', title || ' ' || body));
-- Example: SELECT * FROM articles
--   WHERE to_tsvector('english', title || ' ' || body)
--   @@ to_tsquery('database & optimization');
```

BRIN indexes are a hidden gem for time-series and log data. They work by storing min/max values for blocks of rows. If your data is physically sorted (e.g., timestamp on an append-only table), BRIN provides 100-1000x space savings over B-tree while maintaining reasonable query performance. But for randomly ordered data, BRIN is nearly useless.
Composite indexes (multi-column indexes) are powerful but frequently misunderstood. The order of columns in a composite index matters enormously, and getting it wrong can render the index useless.
The Leftmost Prefix Rule:
A composite index on columns (A, B, C) can efficiently satisfy queries that filter on:

- (A) alone
- (A, B)
- (A, B, C)

But it cannot efficiently satisfy queries that filter on:

- (B) or (C) alone
- (B, C)
This is because B-tree indexes store data sorted by the first column, then by the second within each first-column value, and so on.
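The prefix rule is easy to observe with SQLite's query planner through Python's sqlite3 module (table and index names are illustrative; PostgreSQL's planner behaves the same way for these cases):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE addresses (country TEXT, city TEXT, district TEXT, zip TEXT)")
con.execute("CREATE INDEX idx_location ON addresses(country, city, district)")

def access_path(query: str) -> str:
    # The last column of EXPLAIN QUERY PLAN describes how the table is accessed
    rows = con.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return " ".join(row[-1] for row in rows)

# Leftmost prefix (country): the composite index is usable
print(access_path("SELECT * FROM addresses WHERE country = 'US'"))
# e.g. SEARCH addresses USING INDEX idx_location (country=?)

# Skips the leftmost column: falls back to a full table scan
print(access_path("SELECT * FROM addresses WHERE district = 'Downtown'"))
# e.g. SCAN addresses
```

The extra `zip` column keeps `SELECT *` from being answerable by the index alone, so the plans above isolate the prefix behavior.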
```sql
-- Index on (country, city, district)
CREATE INDEX idx_address_location ON addresses(country, city, district);

-- ✅ Uses index fully (all columns matched, leftmost to rightmost)
SELECT * FROM addresses
WHERE country = 'US' AND city = 'Seattle' AND district = 'Downtown';

-- ✅ Uses index (leftmost prefix: country, city)
SELECT * FROM addresses WHERE country = 'US' AND city = 'Seattle';

-- ✅ Uses index (leftmost prefix: country only)
SELECT * FROM addresses WHERE country = 'US';

-- ❌ Cannot use index (skips "country" - leftmost column)
SELECT * FROM addresses WHERE city = 'Seattle';

-- ❌ Cannot use index (skips "country" and "city")
SELECT * FROM addresses WHERE district = 'Downtown';

-- ⚠️ Partially uses index (country only, city comparison is range)
SELECT * FROM addresses WHERE country = 'US' AND city LIKE 'Sea%';
-- Index used for country = 'US', then scans for city LIKE 'Sea%'

-- ORDER BY also benefits from composite indexes
-- ✅ Uses index for both filtering AND sorting
SELECT * FROM addresses WHERE country = 'US' ORDER BY city, district;

-- ❌ Requires sorting (order doesn't match index)
SELECT * FROM addresses WHERE country = 'US' ORDER BY district, city;
```

Column Order Guidelines:
Designing effective composite indexes requires understanding your query patterns. Here are the key principles:
Columns compared with `=` conditions should come before columns used in range conditions.

The ESR Rule (Equality, Sort, Range):

For optimal index design, order columns as:

1. Equality — columns filtered with `=` (or `IN` with few values)
2. Sort — columns used in `ORDER BY`
3. Range — columns filtered with `>`, `<`, `BETWEEN`, or `LIKE 'prefix%'`
```sql
-- ❌ BAD: Range column before sort/equality
-- Query: Find recent pending orders sorted by date
CREATE INDEX idx_orders_bad ON orders(created_at, status, customer_id);

-- Query:
SELECT * FROM orders
WHERE status = 'pending'
  AND created_at > '2024-01-01'
ORDER BY created_at DESC
LIMIT 100;

-- Problem: Index starts with range column
-- Database must scan all created_at > date entries
-- Then filter by status (not selective in index)
-- Result: Slower than it could be
```
```sql
-- ✅ GOOD: ESR order (Equality, Sort, Range)
-- Query: Find recent pending orders sorted by date
CREATE INDEX idx_orders_good ON orders(status, created_at DESC);

-- Query:
SELECT * FROM orders
WHERE status = 'pending'
  AND created_at > '2024-01-01'
ORDER BY created_at DESC
LIMIT 100;

-- Flow:
-- 1. Jump to status = 'pending' (equality)
-- 2. Scan in created_at DESC order (already sorted!)
-- 3. Stop at created_at = '2024-01-01' (range)
-- 4. Return first 100 rows (already in order!)
-- Result: Optimal - no sorting needed
```

Don't create a new index for every query variation. If you have queries that filter on (A), (A,B), (A,C), and (A,B,C), you don't need 4 indexes. A single index on (A, B, C) serves the first and last query. Consider adding (A, C) only if queries on (A, C) are frequent and the full index is too large.
A covering index contains all columns needed to answer a query, eliminating the need to access the main table (heap). This is one of the most powerful optimization techniques available.
How normal index scans work:

1. Traverse the index to find entries matching the predicate
2. For each matching entry, follow its pointer into the table (heap) to fetch the full row
Step 2 is expensive—it's a random I/O for each matching row. If the index returns 10,000 rows, that's potentially 10,000 random disk reads.
How covering indexes work:

1. Traverse the index to find matching entries
2. Every column the query needs is already stored in the index, so results are returned directly—no heap access at all
This is called an Index-Only Scan (PostgreSQL) or Covering Index Scan (SQL Server).
```sql
-- Query we want to optimize
SELECT customer_id, SUM(amount), COUNT(*)
FROM orders
WHERE status = 'completed'
  AND created_at >= '2024-01-01'
GROUP BY customer_id;

-- Basic index (requires heap access to get 'amount')
CREATE INDEX idx_orders_status_date ON orders(status, created_at);
-- Execution: Index Scan + 1 heap read per row

-- Covering index includes 'customer_id' and 'amount'
CREATE INDEX idx_orders_covering ON orders(status, created_at)
  INCLUDE (customer_id, amount);
-- PostgreSQL 11+ syntax: INCLUDE for non-key columns

-- Execution: Index Only Scan - no heap access!
-- Much faster for queries that match this pattern

-- Alternative covering index (all columns in key)
CREATE INDEX idx_orders_covering_alt
  ON orders(status, created_at, customer_id, amount);
-- Works but larger index; INCLUDE is usually better
```

Key columns vs INCLUDE columns:
| Aspect | Key Columns | INCLUDE Columns |
|---|---|---|
| Used for searching | Yes | No |
| Affects sort order | Yes | No |
| Stored in internal nodes | Yes | No (leaf only) |
| Size impact | High (duplicated in tree) | Lower (leaf only) |
When to use covering indexes:

- Hot queries that read only a few columns from wide tables
- Read-heavy workloads where the extra index size and write overhead are justified
- Queries where heap lookups dominate execution time (many matching rows, few needed columns)
PostgreSQL can only perform Index-Only Scans if the visibility map indicates all rows in a page are visible to all transactions. On heavily updated tables, run VACUUM regularly to update the visibility map. If you see 'Heap Fetches: 50000' in EXPLAIN output for what should be an index-only scan, the visibility map needs updating.
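The same effect is visible in SQLite's query plan, which labels index-only access as a covering index scan. A minimal sketch via Python's sqlite3 (SQLite has no INCLUDE clause, so all columns go into the index key; table and index names are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY, status TEXT, created_at TEXT,
    customer_id INTEGER, amount REAL, notes TEXT)""")

def access_path(query: str) -> str:
    # The last column of EXPLAIN QUERY PLAN describes the access method
    rows = con.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return " ".join(row[-1] for row in rows)

query = "SELECT customer_id, amount FROM orders WHERE status = 'completed'"

# Basic index: the index locates rows, but fetching customer_id and
# amount still requires lookups into the table itself
con.execute("CREATE INDEX idx_basic ON orders(status)")
print(access_path(query))
# e.g. SEARCH orders USING INDEX idx_basic (status=?)

# Covering index: every column the query touches is in the index
con.execute("CREATE INDEX idx_covering ON orders(status, customer_id, amount)")
print(access_path(query))
# e.g. SEARCH orders USING COVERING INDEX idx_covering (status=?)
```

SQLite's planner prefers the covering index for this query because it avoids the per-row table lookup, mirroring PostgreSQL's Index Only Scan.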
Beyond standard column indexes, databases support two powerful variants: expression indexes (indexing computed values) and partial indexes (indexing subsets of rows).
Expression indexes index the result of a function or expression, enabling index usage for queries that apply transformations to columns.
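The second variant, the partial index, carries a WHERE clause so that only matching rows are indexed—useful when queries target a small, well-defined slice of a large table. A runnable sketch using SQLite via Python's sqlite3 (names are illustrative; PostgreSQL's partial-index syntax is the same):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY, status TEXT, created_at TEXT, amount REAL)""")

# Only 'pending' rows are indexed -- the index stays tiny even if
# millions of orders sit in terminal states like 'completed'
con.execute("""CREATE INDEX idx_pending ON orders(created_at)
               WHERE status = 'pending'""")

def access_path(query: str) -> str:
    rows = con.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return " ".join(row[-1] for row in rows)

# The query's predicate implies the index predicate -> index is usable
print(access_path(
    "SELECT * FROM orders WHERE status = 'pending' AND created_at > '2024-01-01'"))
# e.g. SEARCH orders USING INDEX idx_pending (created_at>?)

# A status outside the index predicate cannot use it
print(access_path(
    "SELECT * FROM orders WHERE status = 'shipped' AND created_at > '2024-01-01'"))
# e.g. SCAN orders
```

The planner may use a partial index only when it can prove the query's WHERE clause implies the index's WHERE clause, so match the predicates exactly.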
```sql
-- Problem: Case-insensitive search can't use regular index
-- ❌ This query can't use the index on email
SELECT * FROM users WHERE LOWER(email) = 'john@example.com';

-- Solution: Expression index on the function result
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- ✅ Now this query uses the expression index!
SELECT * FROM users WHERE LOWER(email) = 'john@example.com';

-- More expression index examples:

-- Index on extracted date part (for grouping by date)
CREATE INDEX idx_orders_date ON orders(DATE(created_at));
-- Supports: WHERE DATE(created_at) = '2024-06-15'

-- Index on JSONB field extraction
CREATE INDEX idx_events_type ON events((metadata->>'type'));
-- Supports: WHERE metadata->>'type' = 'click'

-- Index on computed age from birth date
CREATE INDEX idx_users_birth_year ON users(EXTRACT(YEAR FROM birth_date));
-- Supports: WHERE EXTRACT(YEAR FROM birth_date) = 1990

-- Index on string concatenation
CREATE INDEX idx_users_fullname ON users((first_name || ' ' || last_name));
-- Supports: WHERE first_name || ' ' || last_name = 'John Doe'

-- IMPORTANT: Query must match expression EXACTLY
-- ✅ Uses index: WHERE LOWER(email) = 'john@example.com'
-- ❌ Won't use: WHERE UPPER(email) = 'JOHN@EXAMPLE.COM'
-- ❌ Won't use: WHERE email ILIKE 'john@example.com'
```

Indexes accelerate reads but penalize writes. Every INSERT, UPDATE (to indexed columns), and DELETE must also update all relevant indexes. Understanding this trade-off is critical for write-heavy systems.
| Operation | Without Index | With Index | Notes |
|---|---|---|---|
| SELECT (point lookup) | O(n) full scan | O(log n) | Dramatic improvement ✓ |
| SELECT (range scan) | O(n) full scan | O(log n + k) | k = matching rows ✓ |
| INSERT | O(1) append | O(log n) per index | Each index adds overhead ✗ |
| UPDATE (indexed column) | O(n) to find row | O(log n) find + O(log n) reindex | Two index operations ✗ |
| UPDATE (non-indexed) | O(n) to find row | O(log n) find only | Index helps find, no reindex cost ✓ |
| DELETE | O(n) to find | O(log n) find + O(log n) remove | Index helps but adds removal cost |
Common indexing mistakes:
- Indexing low-cardinality columns: an index on `status` with 3 possible values is rarely useful. The query planner will often prefer a sequential scan when >10-15% of rows match.
```sql
-- Find unused indexes (PostgreSQL)
SELECT schemaname,
       relname AS table_name,
       indexrelname AS index_name,
       idx_scan AS times_used,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0                       -- Never used
  AND indexrelname NOT LIKE '%_pkey'     -- Exclude primary keys
ORDER BY pg_relation_size(indexrelid) DESC;

-- Find duplicate/redundant indexes
SELECT pg_size_pretty(sum(pg_relation_size(idx))::bigint) AS size,
       (array_agg(idx))[1] AS index1,
       (array_agg(idx))[2] AS index2
FROM (
    SELECT indexrelid::regclass AS idx, indrelid, indkey::text
    FROM pg_index
) sub
GROUP BY indrelid, indkey
HAVING count(*) > 1;

-- Check index bloat (simplified): index size relative to its table
SELECT schemaname,
       relname,
       round(100 * pg_relation_size(indexrelid) / pg_relation_size(relid))::text
         || '%' AS index_ratio
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;

-- Index usage statistics
SELECT relname AS table_name,
       indexrelname AS index_name,
       idx_scan AS scans,
       idx_tup_read AS tuples_read,
       idx_tup_fetch AS tuples_fetched
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;
```

For read-heavy workloads (90% reads, 10% writes), aggressive indexing is beneficial. For write-heavy workloads (10% reads, 90% writes), minimize indexes to essential use cases only. For balanced workloads, profile carefully and index only the frequently-used query patterns.
What's next:
While query optimization and indexing handle individual query performance, connection pooling addresses the challenge of managing concurrent database access at scale. The next page dives into connection pooling—why it's essential, how pools work, and how to size them correctly.
You now have a comprehensive understanding of database indexing. You can design effective composite indexes, create covering indexes for maximum read performance, use expression and partial indexes for specialized needs, and balance the read-write tradeoffs inherent in indexing strategies.