Loading learning content...
Sometimes, a single index—composite or otherwise—cannot efficiently satisfy a query. Consider a query with complex predicates across columns that don't share a common index prefix, or conditions that combine AND with OR in ways that defeat any single index's coverage.
In these situations, modern databases employ a technique called index intersection (also known as index merge)—combining results from multiple indexes to produce the final result set more efficiently than any single index could alone.
This page explores how index intersection works, when it's beneficial, its limitations, and how it interacts with composite index design decisions.
By the end of this page, you will understand the mechanics of index intersection, the different types (AND vs OR intersection), when the optimizer chooses intersection over single-index scans, and how to design your index strategy to leverage or avoid intersection appropriately.
Index intersection is a query execution strategy where the database uses multiple indexes on the same table and combines their results to satisfy a single query. Instead of relying on one index to find all matching rows, the optimizer uses two or more indexes, each filtering on different predicates, then merges the results.
The Two Main Types:
Index Intersection (AND) — Combines results from multiple indexes using set intersection. A row must appear in ALL index result sets to be included. Used for queries with multiple AND conditions.
Index Union (OR) — Combines results from multiple indexes using set union. A row appearing in ANY index result set is included. Used for queries with OR conditions.
12345678910111213141516171819202122
-- Table: products (with separate single-column indexes)CREATE INDEX idx_category ON products (category_id);CREATE INDEX idx_price ON products (price); -- Query that benefits from index intersection:SELECT * FROM productsWHERE category_id = 50 AND price < 100.00; -- Without index intersection:-- Option A: Use idx_category, then filter price → scans many rows-- Option B: Use idx_price, then filter category → scans many rows-- Option C: Full table scan → worst case -- With index intersection:-- Step 1: idx_category finds row IDs where category_id = 50-- Step 2: idx_price finds row IDs where price < 100-- Step 3: Intersect these row ID sets-- Step 4: Fetch only the matching rows from the table -- The intersection is efficient when:-- - Each index is selective (few rows each)-- - The intersection is significantly smaller than either setThe mechanics of index intersection depend on how indexes store row identifiers and how efficiently these can be combined.
Step-by-Step Process (AND Intersection):
The key efficiency comes from Step 3—if both indexes are selective, the intersection is small, and Step 4 fetches very few rows.
Intersection Algorithms:
Databases use various algorithms to compute set intersection:
| Algorithm | Time Complexity | Best When |
|---|---|---|
| Sort-Merge | O(n log n + m log m) | RID sets are unsorted or similar size |
| Hash Intersection | O(n + m) | Memory available, one set much smaller |
| Bitmap AND | O(n + m) | Bitmap indexes or RID bitmaps used |
| Zigzag Merge | O(min(n, m)) | RIDs already sorted, sets overlap significantly |
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152
// Sort-Merge Intersectionfunction sortMergeIntersection(rids1, rids2): sort(rids1) // O(n log n) sort(rids2) // O(m log m) result = [] i = 0, j = 0 while i < len(rids1) and j < len(rids2): if rids1[i] == rids2[j]: result.append(rids1[i]) i++, j++ else if rids1[i] < rids2[j]: i++ else: j++ return result // Hash Intersectionfunction hashIntersection(rids1, rids2): // Build hash set from smaller set smaller = (len(rids1) < len(rids2)) ? rids1 : rids2 larger = (len(rids1) < len(rids2)) ? rids2 : rids1 hashSet = new HashSet() for rid in smaller: hashSet.add(rid) // O(n) for smaller set result = [] for rid in larger: if hashSet.contains(rid): // O(1) lookup result.append(rid) return result // Zigzag Merge (already sorted, seek-based)function zigzagMerge(index1, index2, pred1, pred2): cursor1 = index1.seek(pred1) // Position at first match cursor2 = index2.seek(pred2) result = [] while cursor1.valid() and cursor2.valid(): rid1 = cursor1.rid() rid2 = cursor2.rid() if rid1 == rid2: result.append(rid1) cursor1.next() cursor2.next() else if rid1 < rid2: cursor1.seek(rid2) // Skip ahead to rid2 else: cursor2.seek(rid1) // Skip ahead to rid1 return resultWhen indexes store RIDs in sorted order (common for clustering keys or RID-based organization), zigzag merge becomes highly efficient. The algorithm can skip large portions of each index, making intersection much faster than visiting every entry.
Index union handles OR predicates, which are traditionally problematic for index usage. Without index union, OR conditions often force table scans because no single index can satisfy both branches.
Index Union Process:
Terminology Across Databases:
| Database | Feature Name |
|---|---|
| MySQL | Index Merge (type: union, intersection, sort_union) |
| PostgreSQL | BitmapOr, BitmapAnd |
| SQL Server | Index Intersection, Index Union |
| Oracle | INDEX COMBINE, BITMAP CONVERSION |
123456789101112131415161718192021222324252627282930
-- Separate indexesCREATE INDEX idx_status ON orders (status);CREATE INDEX idx_priority ON orders (priority); -- OR query that benefits from index union:SELECT * FROM ordersWHERE status = 'pending' OR priority = 'high'; -- MySQL EXPLAIN output showing index merge:-- Extra: Using union(idx_status,idx_priority); Using where -- PostgreSQL equivalent using bitmap operations:-- Bitmap Heap Scan on orders-- Recheck Cond: ((status = 'pending'::text) OR (priority = 'high'::text))-- -> BitmapOr-- -> Bitmap Index Scan on idx_status-- Index Cond: (status = 'pending'::text)-- -> Bitmap Index Scan on idx_priority-- Index Cond: (priority = 'high'::text) -- Complex OR with AND combinations:SELECT * FROM ordersWHERE (customer_id = 100 AND status = 'pending') OR (priority = 'critical'); -- This may use:-- - Composite index on (customer_id, status) for first branch-- - Single index on priority for second branch-- - Union of resultsIndex union for OR queries is generally less efficient than AND intersection because: (1) union produces more rows, not fewer; (2) deduplication adds overhead; (3) if any OR branch is non-selective, the union includes many rows. For high-performance OR queries, consider UNION ALL with separate queries.
12345678910111213141516
-- Original OR querySELECT * FROM ordersWHERE status = 'pending' OR priority = 'high'; -- Manual UNION ALL rewrite (sometimes more efficient):SELECT * FROM orders WHERE status = 'pending'UNIONSELECT * FROM orders WHERE priority = 'high'; -- If results don't overlap (mutually exclusive conditions):SELECT * FROM orders WHERE status = 'pending'UNION ALLSELECT * FROM orders WHERE priority = 'high' AND status != 'pending'; -- The rewrite can use optimal indexes for each branch independently-- and avoids the index merge overhead in some optimizer implementationsThe query optimizer uses cost-based analysis to decide whether index intersection is beneficial. Intersection is not always the best choice—it adds overhead that must be offset by reduced row access.
Cost Factors the Optimizer Considers:
| Scenario | Likely Decision | Reason |
|---|---|---|
| Both indexes highly selective | ✅ Use intersection | Intersection is small, few random I/Os |
| One index highly selective | ❓ Maybe single index | More selective index + filter may be cheaper |
| Both indexes non-selective | ❌ Full table scan | Intersection still large, random I/O expensive |
| Composite index exists | ✅ Use composite | One index covers both predicates efficiently |
| Covering index available | ✅ Use covering | No table access needed at all |
123456789101112131415161718192021222324252627282930313233
-- Analyze intersection decision with EXPLAIN -- MySQL: Check for "Using intersect" or "Using union"EXPLAIN SELECT * FROM productsWHERE category_id = 50 AND price < 100; -- PostgreSQL: Look for BitmapAnd or BitmapOr nodesEXPLAIN (ANALYZE, COSTS)SELECT * FROM productsWHERE category_id = 50 AND price < 100; -- Example output showing intersection choice:/*Bitmap Heap Scan on products (cost=24.50..120.75 rows=50 ...) Recheck Cond: ((category_id = 50) AND (price < 100)) -> BitmapAnd -> Bitmap Index Scan on idx_category (cost=0.00..5.20 rows=200 ...) Index Cond: (category_id = 50) -> Bitmap Index Scan on idx_price (cost=0.00..18.80 rows=500 ...) Index Cond: (price < 100)*/ -- Cost breakdown:-- idx_category: 200 matching rows-- idx_price: 500 matching rows-- After intersection: ~50 rows (estimate)-- Without intersection (using just idx_category): 200 rows to fetch -- Force optimizer to use or avoid intersection (PostgreSQL):SET enable_bitmapscan = OFF; -- Disable bitmap/intersectionEXPLAIN SELECT * FROM products WHERE category_id = 50 AND price < 100;SET enable_bitmapscan = ON; -- Re-enableIndex intersection is typically beneficial when selectivity(Index1) × selectivity(Index2) < selectivity(either index alone). If multiplying the selectivities produces a significantly smaller result set, intersection pays off.
A natural question arises: if index intersection can combine multiple single-column indexes, why bother with composite indexes at all?
The answer lies in efficiency trade-offs:
123456789101112131415161718192021222324252627282930313233343536373839404142
-- Scenario: orders table with queries on customer_id and order_date -- Option A: Two single-column indexesCREATE INDEX idx_customer ON orders (customer_id);CREATE INDEX idx_date ON orders (order_date);-- Uses intersection for: WHERE customer_id = ? AND order_date = ? -- Option B: One composite indexCREATE INDEX idx_customer_date ON orders (customer_id, order_date);-- Directly navigates to matching entries -- Performance comparison for: -- SELECT * FROM orders WHERE customer_id = 100 AND order_date = '2023-01-15' -- Option A (Intersection):-- 1. Scan idx_customer: ~100 RIDs (1ms)-- 2. Scan idx_date: ~1000 RIDs (3ms)-- 3. Intersect: ~5 RIDs (0.1ms)-- 4. Random I/O: 5 row accesses (0.5ms)-- Total: ~4.6ms -- Option B (Composite):-- 1. Seek idx_customer_date: ~5 entries (0.3ms)-- 2. Sequential I/O or direct pointer (0.2ms)-- Total: ~0.5ms -- Composite index is ~9x faster for this query pattern! -- When intersection wins:-- Diverse query patterns where composite prefix doesn't help -- Query 1: WHERE customer_id = ? -- Uses idx_customer ✅-- Query 2: WHERE order_date = ? -- Uses idx_date ✅ -- Query 3: WHERE customer_id = ? AND order_date = ? -- Uses intersection -- With composite (customer_id, order_date):-- Query 1: ✅ Uses composite prefix-- Query 2: ❌ Cannot use (wrong prefix)-- Query 3: ✅ Uses composite fully -- Separate indexes provide more flexibility for Query 2For frequently-executed queries with known patterns, composite indexes are almost always superior. Reserve index intersection for flexibility with ad-hoc queries or when storage constraints prevent creating all needed composite indexes.
Bitmap indexes are particularly suited for intersection operations because bitmaps can be combined with simple bitwise AND/OR operations—extremely fast on modern CPUs.
How Bitmap Intersection Works:
Each index value maps to a bitmap where bit position N is 1 if row N contains that value:
value 'A': [1,0,1,0,0,1,0,1,0,0] // Rows 0,2,5,7
value 'B': [0,1,1,0,1,0,0,1,0,0] // Rows 1,2,4,7
AND operation:
[0,0,1,0,0,0,0,1,0,0] // Rows 2,7 (intersection)
OR operation:
[1,1,1,0,1,1,0,1,0,0] // Rows 0,1,2,4,5,7 (union)
| Aspect | Bitmap Intersection | B-tree Intersection |
|---|---|---|
| Operation | Bitwise AND/OR (hardware optimized) | Sort-merge or hash join |
| Speed | Extremely fast | Moderate |
| Memory | Constant (bitmap size) | Proportional to RID count |
| Cardinality requirement | Low cardinality columns | Any cardinality |
| Available in | Oracle, PostgreSQL (runtime bitmaps) | All major databases |
| Update cost | High (full bitmap rebuild) | Incremental updates |
123456789101112131415161718192021222324252627282930313233
-- PostgreSQL uses runtime bitmaps for B-tree index intersection: EXPLAIN (ANALYZE) SELECT * FROM productsWHERE category = 'Electronics' AND brand = 'Acme' AND price < 500; /*Bitmap Heap Scan on products Recheck Cond: ((category = 'Electronics') AND (brand = 'Acme') AND (price < 500)) -> BitmapAnd -> Bitmap Index Scan on idx_category Index Cond: (category = 'Electronics') -> Bitmap Index Scan on idx_brand Index Cond: (brand = 'Acme') -> Bitmap Index Scan on idx_price Index Cond: (price < 500)*/ -- Each B-tree index scan creates a runtime bitmap-- Bitmaps are ANDed together-- Result bitmap used for heap access -- Oracle with actual bitmap indexes:CREATE BITMAP INDEX bmp_category ON products (category);CREATE BITMAP INDEX bmp_brand ON products (brand);CREATE BITMAP INDEX bmp_region ON products (region); -- Complex AND/OR query leverages bitmap operations natively:SELECT * FROM productsWHERE (category = 'Electronics' OR category = 'Appliances') AND brand = 'Acme' AND region IN ('North', 'South'); -- Oracle execution: BITMAP AND + BITMAP OR operationsPostgreSQL doesn't have dedicated bitmap indexes, but creates runtime bitmaps during query execution from regular B-tree indexes. This provides bitmap intersection benefits without the update overhead of maintaining persistent bitmap structures.
When should you rely on intersection versus creating composite indexes? Here are practical guidelines:
12345678910111213141516171819202122232425262728
-- Decision checklist for intersection vs composite: -- 1. Identify the query pattern-- Query: WHERE A = ? AND B = ? -- 2. Check query frequencySHOW STATUS LIKE 'Questions'; -- MySQL-- If this query runs > 100/sec, favor composite index -- 3. Analyze column selectivitySELECT COUNT(DISTINCT A) / COUNT(*) AS selectivity_A, COUNT(DISTINCT B) / COUNT(*) AS selectivity_BFROM table;-- If both selective (< 5%), intersection may work well -- 4. Check if OR predicates are involved-- Query: WHERE A = ? OR B = ?-- Composite index CAN'T help; need separate indexes + union -- 5. Consider covering index opportunity-- Query: SELECT C, D FROM t WHERE A = ? AND B = ?CREATE INDEX idx_covering ON t (A, B, C, D); -- Best solution -- 6. Test both approachesEXPLAIN ANALYZE SELECT ...; -- With current indexes-- Create composite, test againCREATE INDEX idx_composite ON t (A, B);EXPLAIN ANALYZE SELECT ...; -- Compare execution timesIndex intersection provides flexibility in query processing by combining multiple indexes. Here are the key concepts:
What's Next:
The final page of this module covers design guidelines—practical principles and best practices for designing multi-column indexes in real-world systems, synthesizing everything you've learned about composite indexes, column ordering, prefix matching, and intersection.
You now understand how index intersection works, when the optimizer chooses it, and how it compares to composite indexes. This knowledge helps you design index strategies that balance flexibility with performance for diverse query workloads.