Database Management SystemsAccess Methods

Access Methods: How Databases Retrieve Data

LevelIntermediate

Duration90 mins

TopicAccess Methods

4 / 5

Bitmap Scan: The Best of Both Worlds

Bridging Index Selectivity and Sequential Efficiency

We've seen that table scans offer efficient sequential I/O but examine all rows, while index scans provide selectivity but suffer from random I/O overhead. What if we could combine the benefits of both approaches?

Bitmap scans do exactly this. They use indexes to identify which rows match query conditions, but instead of immediately fetching each row, they build an in-memory bitmap indicating which pages (or rows) contain matches. Then, they scan those pages in physical order, converting random I/O into sequential I/O.

Even more powerfully, bitmaps can be combined with logical operations (AND, OR), enabling efficient execution of queries that would otherwise require multiple separate index scans.

What You Will Learn

By the end of this page, you will understand how bitmap scans work, when they outperform both table scans and index scans, how bitmaps enable combining multiple indexes for complex queries, and the internal mechanics of bitmap creation and execution.

How Bitmap Scans Work

A bitmap scan operates in three distinct phases:

Phase 1: Bitmap Index Scan

First, the database scans the index(es) to identify matching index entries. Instead of immediately fetching rows, it builds an in-memory bitmap where each bit corresponds to a table page (or row, depending on implementation):

Bit = 1: This page contains at least one matching row
Bit = 0: This page contains no matching rows

Phase 2: Bitmap Combination (Optional)

If multiple conditions exist, multiple bitmaps can be created and combined:

BitmapAnd: Result bit is 1 only if BOTH source bits are 1 (for AND predicates)
BitmapOr: Result bit is 1 if EITHER source bit is 1 (for OR predicates)

Phase 3: Bitmap Heap Scan

Finally, the database reads table pages in physical order, but only pages where the bitmap bit is set. All rows on each fetched page are checked against the original predicate (recheck condition), and matching rows are returned.

Bitmap Scan Algorithm (Pseudocode)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
function BitmapScan(table, indexes, predicates):
    // Phase 1: Build bitmap from index scans
    combined_bitmap = None
    
    for each (index, predicate) in (indexes, predicates):
        bitmap = new Bitmap(table.page_count)  // One bit per page
        
        // Scan index and set bits for matching pages
        for each entry in index.scan(predicate):
            page_id = entry.rid.page_id
            bitmap.set(page_id, 1)
        
        // Phase 2: Combine with previous bitmap
        if combined_bitmap is None:
            combined_bitmap = bitmap
        else if predicate_is_AND:
            combined_bitmap = combined_bitmap AND bitmap
        else:  // OR
            combined_bitmap = combined_bitmap OR bitmap
    
    // Phase 3: Scan table pages in order, only where bitmap is set
    results = []
    
    for page_id in 0 to table.page_count:
        if combined_bitmap.get(page_id) == 1:
            page = buffer_pool.fetch(page_id)
            
            for each row in page.rows:
                // Recheck: page-level bitmap may include non-matching rows
                if full_predicate_matches(row, predicates):
                    results.append(row)
    
    return results

Why Recheck Is Necessary

The bitmap typically tracks pages, not individual rows. When a page contains both matching and non-matching rows, we must 'recheck' each row after fetching the page. The recheck is visible in PostgreSQL EXPLAIN output as 'Recheck Cond'. Some implementations use row-level bitmaps to avoid recheck, at the cost of more memory.

When Bitmap Scans Outperform Alternatives

Bitmap scans occupy a sweet spot between index scans and table scans. They're optimal in several scenarios:

Scenario 1: Medium Selectivity Queries

When selectivity is too high for pure index scan (random I/O overhead too great) but too low for table scan (too many wasted page reads), bitmap scans shine:

Access Method Comparison by Selectivity
Selectivity	Matching Rows	Best Access Method	Reason
< 1%	Few rows	Index Scan	Low random I/O count
1% - 20%	Moderate rows	Bitmap Scan	Sequential page access with selectivity
20%	Many rows	Table Scan	Nearly all pages needed anyway

Scenario 2: Multiple Conditions on Different Columns

When a query has multiple conditions and separate indexes exist for each, bitmap scans can combine them:

Combining Multiple Indexes with Bitmap
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- Indexes: idx_products_category (category_id)
--          idx_products_price (price)
 
-- Query with two conditions on different indexed columns
SELECT * FROM products 
WHERE category_id = 5 AND price < 100;
 
-- Without bitmap: Must choose ONE index, then filter
-- Option A: Use category index, filter by price → may read many prices > 100
-- Option B: Use price index, filter by category → may read many wrong categories
 
-- With bitmap:
EXPLAIN SELECT * FROM products WHERE category_id = 5 AND price < 100;
 
/*
Bitmap Heap Scan on products
  Recheck Cond: ((category_id = 5) AND (price < 100))
  ->  BitmapAnd
        ->  Bitmap Index Scan on idx_products_category
              Index Cond: (category_id = 5)
        ->  Bitmap Index Scan on idx_products_price
              Index Cond: (price < 100)
*/
 
-- Both indexes are used! Only pages satisfying BOTH conditions are fetched.

Scenario 3: IN Lists and OR Conditions

Bitmap scans efficiently handle multiple values by combining separate index lookups:

Bitmap Scan for IN Lists
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
-- Query with IN list
SELECT * FROM orders WHERE status IN ('pending', 'processing', 'review');
 
/*
Bitmap Heap Scan on orders
  Recheck Cond: (status = ANY ('{pending,processing,review}'))
  ->  Bitmap Index Scan on idx_orders_status
        Index Cond: (status = ANY ('{pending,processing,review}'))
*/
 
-- Each status value produces a bitmap, combined with OR
 
-- Similar for explicit OR:
SELECT * FROM orders WHERE customer_id = 123 OR customer_id = 456;
 
/*
Bitmap Heap Scan on orders
  Recheck Cond: ((customer_id = 123) OR (customer_id = 456))
  ->  BitmapOr
        ->  Bitmap Index Scan on idx_orders_customer
              Index Cond: (customer_id = 123)
        ->  Bitmap Index Scan on idx_orders_customer
              Index Cond: (customer_id = 456)
*/

Bitmap Scans Restore Sequential Access

The key insight is that bitmap scans decouple 'finding matching entries' from 'fetching matching rows'. By collecting all matching page IDs first, then fetching in order, we convert random I/O into more sequential I/O, which is 10-50× faster on HDDs and still beneficial on SSDs.

Bitmap Scan Cost Analysis

Understanding the cost model for bitmap scans helps predict when they'll be chosen by the optimizer.

Component Costs:

Bitmap Index Scan Cost: Scanning index leaf pages to build the bitmap
```
C_bitmap_build = IndexPages × t_seq
```
Bitmap Combination Cost: CPU time to AND/OR bitmaps (very fast, typically negligible)
```
C_combine ≈ TablePages / 8  (number of bytes in bitmap)
```
Bitmap Heap Scan Cost: Reading table pages where bits are set
```
C_heap = PagesWithMatches × t_seq
```

Total Bitmap Scan Cost:

C_bitmap = C_bitmap_build + C_combine + C_heap

Bitmap Scan vs Index Scan Cost Comparison (1M row table, 10% selectivity)
Component	Index Scan	Bitmap Scan
Index traversal	3 pages	3 pages (to start)
Index leaf scan	1,000 pages	1,000 pages
Bitmap operation	N/A	~1KB bitmap, negligible
Row fetches (random)	100,000 × random	N/A
Page fetches (sequential)	N/A	~8,000 pages × sequential
Total I/O cost (HDD)	100,000 × 10ms = 1000 sec	9,000 × 0.5ms = 4.5 sec

The Crossover Point

The bitmap scan becomes preferable to index scan when the random I/O savings outweigh the extra pages read due to page-level (rather than row-level) granularity.

Factors favoring bitmap scan:

Higher selectivity (more matching rows)
Poor clustering factor (rows scattered across pages)
Slower random I/O relative to sequential (larger benefit on HDDs)
Multiple conditions that can share page reads

Factors favoring index scan:

Very low selectivity (few matching rows)
Excellent clustering (most rows on few pages anyway)
Fast random I/O (NVMe SSDs minimize the gap)

Memory Limits on Bitmaps

Bitmap scans require memory to hold the bitmap. For very large tables, the bitmap itself can become large (1 bit per page × millions of pages = megabytes). If memory is limited, the bitmap may become 'lossy' (tracking page ranges instead of individual pages), requiring more recheck work.

Bitmap Logical Operations: And, Or, and Beyond

The power of bitmap scans comes from the ability to combine multiple bitmaps using logical operations. This enables efficient processing of complex predicates.

BitmapAnd (Intersection)

For AND predicates, the resulting bitmap has 1 only where ALL source bitmaps have 1:

Result[page] = 1  iff  Bitmap1[page] = 1 AND Bitmap2[page] = 1 AND ...

This operation is extremely efficient: bitwise AND on entire bytes, processing 8 pages at a time.

BitmapAnd Example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
-- Query: Find electronics under $50 from verified sellers
SELECT * FROM products
WHERE category = 'electronics'
  AND price < 50
  AND seller_verified = true;
 
-- Assume indexes on all three columns
 
/*
Execution Plan:
Bitmap Heap Scan on products
  Recheck Cond: (...)
  ->  BitmapAnd
        ->  Bitmap Index Scan on idx_category
              Index Cond: (category = 'electronics')
              -- Bitmap 1: [0,1,1,0,1,1,0,0,1,0,...] (pages with electronics)
        ->  Bitmap Index Scan on idx_price
              Index Cond: (price < 50)
              -- Bitmap 2: [1,1,0,1,0,1,0,1,1,0,...] (pages with low prices)
        ->  Bitmap Index Scan on idx_seller_verified
              Index Cond: (seller_verified = true)
              -- Bitmap 3: [1,1,1,0,1,0,0,0,1,1,...] (pages with verified sellers)
 
Result Bitmap (AND of all three): [0,1,0,0,0,0,0,0,1,0,...]
-- Only pages 1 and 8 need to be fetched!
*/

BitmapOr (Union)

For OR predicates, the resulting bitmap has 1 where ANY source bitmap has 1:

Result[page] = 1  iff  Bitmap1[page] = 1 OR Bitmap2[page] = 1 OR ...

BitmapOr Example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- Query: Find products that are either on sale OR highly rated
SELECT * FROM products
WHERE on_sale = true OR rating >= 4.5;
 
/*
Execution Plan:
Bitmap Heap Scan on products
  Recheck Cond: ((on_sale = true) OR (rating >= 4.5))
  ->  BitmapOr
        ->  Bitmap Index Scan on idx_on_sale
              Index Cond: (on_sale = true)
              -- Bitmap 1: [0,1,0,0,1,0,0,0,1,0,...] (pages with sale items)
        ->  Bitmap Index Scan on idx_rating  
              Index Cond: (rating >= 4.5)
              -- Bitmap 2: [1,0,0,1,1,0,0,1,0,0,...] (pages with high ratings)
 
Result Bitmap (OR of both): [1,1,0,1,1,0,0,1,1,0,...]
-- Pages 0,1,3,4,7,8 need to be fetched
*/

Complex Predicate Transformation

The optimizer transforms complex boolean expressions into bitmap operation trees:

Complex Predicate to Bitmap Tree
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
-- Complex query
SELECT * FROM orders
WHERE (status = 'pending' AND priority = 'high')
   OR (status = 'processing' AND created_at > '2024-01-01');
 
/*
Bitmap Operation Tree:
 
            BitmapOr
           /        \
     BitmapAnd    BitmapAnd
      /    \       /     \
   status priority  status  created_at
   ='pending' ='high' ='processing' >'2024-01-01'
*/

Bitmaps Enable Index Intersection

Without bitmap scans, complex predicates often can't fully utilize multiple indexes. Bitmap operations allow the database to leverage all available indexes, combining their selectivity. This is why sometimes adding a 'redundant' index on a single column can dramatically speed up queries with complex WHERE clauses.

Lossy Bitmaps and Memory Management

For very large tables, maintaining an exact bitmap (one bit per page) can require substantial memory. PostgreSQL addresses this with lossy bitmaps that trade precision for memory efficiency.

Exact vs Lossy Bitmaps:

Exact bitmap: One bit per page. Precise identification of matching pages.
Lossy bitmap: One bit per chunk of pages (e.g., 32 or 64 pages per bit). Indicates 'some page in this chunk matches'.

With lossy bitmaps, the recheck phase must filter out more non-matching rows because the granularity is coarser.

PostgreSQL Work Memory and Bitmap Behavior
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
-- Check current work_mem setting
SHOW work_mem;  -- Default: 4MB
 
-- For a table with 1 million pages, exact bitmap needs:
-- 1,000,000 bits = 125 KB (easily fits in 4MB)
 
-- For a table with 100 million pages:
-- 100,000,000 bits = 12.5 MB (exceeds default work_mem)
-- PostgreSQL would create a lossy bitmap
 
-- Observe bitmap behavior in EXPLAIN
EXPLAIN (ANALYZE, BUFFERS) 
SELECT * FROM huge_table WHERE indexed_column < 1000;
 
/*
Bitmap Heap Scan on huge_table
  Recheck Cond: (indexed_column < 1000)
  Rows Removed by Index Recheck: 15234    -- <-- High number indicates lossy bitmap
  Heap Blocks: exact=12345 lossy=67890    -- <-- Shows exact vs lossy blocks
  ->  Bitmap Index Scan on idx_huge
        Index Cond: (indexed_column < 1000)
*/
 
-- Increase work_mem to enable exact bitmaps (for this session)
SET work_mem = '128MB';
 
-- Re-run query - should show fewer lossy blocks

Lossy Bitmap Tradeoffs:

Aspect	Exact Bitmap	Lossy Bitmap
Memory	Higher	Lower
Precision	Exact pages	Page ranges
Recheck overhead	Minimal	Higher (more false positives)
I/O efficiency	Optimal	Slightly worse
Suitable for	Normal tables	Very large tables

When Lossy Bitmaps Cause Problems:

If you see high 'Rows Removed by Index Recheck' counts, the lossy bitmap is fetching many pages that don't actually contain matches. Consider:

Increasing work_mem to allow exact bitmaps
Adding a more selective index
Restructuring the query to be more selective

Work Memory Trade-offs

Increasing work_mem helps bitmap scans but also affects other operations (sorts, hash joins). Each query can use work_mem for each operation, so very high values can lead to memory pressure under concurrent load. Tune carefully based on your workload.

Bitmap Scans Across Database Systems

Different database systems implement bitmap-style operations with varying terminology and capabilities.

Bitmap Scan Implementation by Database
Database	Feature Name	Index Combination	Notes
PostgreSQL	Bitmap Heap Scan / Bitmap Index Scan	BitmapAnd, BitmapOr	Full-featured; supports lossy bitmaps for memory management
Oracle	BITMAP CONVERSION, INDEX COMBINE	BITMAP AND, BITMAP OR	Automatic bitmap conversion from B-tree indexes; native bitmap indexes also available
SQL Server	No direct equivalent	Index Intersection (similar concept)	Uses RID-based intersection rather than page-level bitmaps
MySQL	Index Merge	Intersection, Union	Limited capability; often creates temporary table instead of true bitmap

Cross-Database Bitmap-Style Operations
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
=== PostgreSQL ===
-- Bitmap scan clearly shown in EXPLAIN
EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND amount > 100;
/*
Bitmap Heap Scan on orders
  ->  BitmapAnd
        ->  Bitmap Index Scan on idx_status
        ->  Bitmap Index Scan on idx_amount
*/
 
=== Oracle ===
-- Oracle can convert B-tree indexes to bitmaps internally
EXPLAIN PLAN FOR
SELECT * FROM orders WHERE status = 'pending' AND amount > 100;
 
-- Look for operations like:
-- BITMAP CONVERSION FROM ROWIDS
-- BITMAP AND
-- TABLE ACCESS BY INDEX ROWID BATCHED
 
=== SQL Server ===
-- SQL Server uses "Index Intersection" for similar effect
SET SHOWPLAN_TEXT ON;
SELECT * FROM orders WHERE status = 'pending' AND amount > 100;
 
-- May show:
-- Index Seek (idx_status)
-- Index Seek (idx_amount)  
-- Hash Match (Inner Join on RID)  -- RID intersection
 
=== MySQL ===
-- MySQL "Index Merge" is the closest feature
EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND amount > 100;
 
-- May show type: index_merge
-- Extra: Using intersect(idx_status,idx_amount); Using where

PostgreSQL's Bitmap Implementation Is Among the Most Sophisticated

PostgreSQL's bitmap scan is particularly powerful due to its ability to gracefully degrade to lossy mode, combine arbitrary numbers of indexes, and handle complex boolean expressions efficiently. Other databases offer similar capabilities but often with more restrictions.

Practical Examples and Query Analysis

Let's examine real-world scenarios where bitmap scans provide significant performance benefits.

E-Commerce Product Filtering
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
-- Scenario: User applies multiple filters on product catalog
-- Table: products (10 million rows)
-- Indexes: category, brand, price_range, rating, in_stock
 
-- User's filter: Electronics, Sony or Samsung, Under $500, 4+ stars
SELECT product_id, name, price, rating
FROM products
WHERE category = 'electronics'
  AND brand IN ('sony', 'samsung')
  AND price < 500
  AND rating >= 4.0
  AND in_stock = true;
 
-- Without bitmap scans: Pick one index, filter the rest
-- With bitmap scans:
 
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT product_id, name, price, rating
FROM products
WHERE category = 'electronics'
  AND brand IN ('sony', 'samsung')
  AND price < 500
  AND rating >= 4.0
  AND in_stock = true;
 
/*
Bitmap Heap Scan on products  (actual time=12.345..89.123 rows=2,847 loops=1)
  Recheck Cond: ((category = 'electronics') AND (brand = ANY (...)) AND (price < 500) AND (rating >= 4.0) AND in_stock)
  Heap Blocks: exact=2134
  Buffers: shared hit=2567
  ->  BitmapAnd  (actual time=11.234..11.234 rows=0 loops=1)
        ->  Bitmap Index Scan on idx_category  (actual time=1.234..1.234 rows=0 loops=1)
              Index Cond: (category = 'electronics')
              -- 800,000 products are electronics (8% selectivity)
        ->  BitmapOr  (actual time=2.345..2.345 rows=0 loops=1)
              ->  Bitmap Index Scan on idx_brand
                    Index Cond: (brand = 'sony')
              ->  Bitmap Index Scan on idx_brand  
                    Index Cond: (brand = 'samsung')
              -- 150,000 products are Sony or Samsung
        ->  Bitmap Index Scan on idx_price  (actual time=1.567..1.567 rows=0 loops=1)
              Index Cond: (price < 500)
              -- 3,000,000 products under $500
        ->  Bitmap Index Scan on idx_rating  (actual time=1.234..1.234 rows=0 loops=1)
              Index Cond: (rating >= 4.0)
              -- 2,500,000 products have 4+ rating
        ->  Bitmap Index Scan on idx_stock  (actual time=0.567..0.567 rows=0 loops=1)
              Index Cond: (in_stock = true)
              -- 6,000,000 products in stock
 
Planning Time: 0.456 ms
Execution Time: 89.567 ms
*/
 
-- Combined selectivity: only 2,847 matching rows across 2,134 pages
-- Without bitmaps: would need to pick ONE index, then filter millions of rows

Time-Based Log Analysis
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
-- Scenario: Find error logs in specific time window from multiple sources
-- Table: application_logs (500 million rows, heavily partitioned)
 
SELECT log_id, timestamp, message, source
FROM application_logs
WHERE timestamp BETWEEN '2024-01-15 00:00' AND '2024-01-15 23:59'
  AND log_level = 'ERROR'
  AND source IN ('auth-service', 'payment-service', 'order-service');
 
/*
Bitmap Heap Scan on application_logs  (actual time=45.123..234.567 rows=8,923 loops=1)
  Recheck Cond: ((timestamp >= ...) AND (timestamp <= ...) AND (log_level = 'ERROR') AND (source = ANY (...)))
  Heap Blocks: exact=7234
  Buffers: shared hit=8234 read=567
  ->  BitmapAnd  (actual time=42.123..42.123 rows=0 loops=1)
        ->  Bitmap Index Scan on idx_logs_timestamp  
              Index Cond: ((timestamp >= ...) AND (timestamp <= ...))
              -- ~2M entries for that day
        ->  Bitmap Index Scan on idx_logs_level  
              Index Cond: (log_level = 'ERROR')
              -- ~1% of all entries are errors
        ->  BitmapOr  
              ->  Bitmap Index Scan on idx_logs_source (source='auth-service')
              ->  Bitmap Index Scan on idx_logs_source (source='payment-service')
              ->  Bitmap Index Scan on idx_logs_source (source='order-service')
 
Execution Time: 234.567 ms
*/
 
-- Without bitmap: Would need table partitioned by time, or very slow scan

Summary: Bitmap Scan Mastery

You now have comprehensive knowledge of bitmap scans—the hybrid access method that combines index selectivity with sequential access efficiency. Here are the essential takeaways:

Key Takeaways

•Bitmap scans are two-phase operations — First build a bitmap from index entries, then fetch matching pages sequentially. This converts random I/O into sequential I/O.
•Bitmaps enable index combination — BitmapAnd and BitmapOr operations allow efficient processing of complex predicates using multiple indexes.
•Optimal for medium selectivity — Bitmap scans shine at 1-20% selectivity, where index scans have too much random I/O and table scans read too many unwanted rows.
•Page-level granularity requires recheck — Standard bitmaps track pages, not rows, so matching rows must be verified after page fetch.
•Lossy bitmaps handle large tables — When memory is limited, bitmaps become coarser-grained, trading precision for memory efficiency.
•PostgreSQL excels at bitmap operations — PostgreSQL's implementation is particularly sophisticated, but similar concepts exist in other databases.

What's Next:

The final page of this module covers Access Method Selection—how query optimizers decide which access method to use for each query. We'll explore the decision process, cost-based comparisons, and how to influence access method choices through query hints and statistics.

Page Complete

You now understand bitmap scans at a deep level—their mechanics, optimal use cases, combination operations, and memory considerations. This knowledge enables you to recognize when bitmap scans are the right choice and design indexes and queries that leverage their power.

4 / 5

Loading learning content...

Database Management SystemsAccess Methods

Access Methods: How Databases Retrieve Data

LevelIntermediate

Duration90 mins

TopicAccess Methods

4 / 5

Bitmap Scan: The Best of Both Worlds

Bridging Index Selectivity and Sequential Efficiency

Even more powerfully, bitmaps can be combined with logical operations (AND, OR), enabling efficient execution of queries that would otherwise require multiple separate index scans.

What You Will Learn

How Bitmap Scans Work

A bitmap scan operates in three distinct phases:

Phase 1: Bitmap Index Scan

Bit = 1: This page contains at least one matching row
Bit = 0: This page contains no matching rows

Phase 2: Bitmap Combination (Optional)

If multiple conditions exist, multiple bitmaps can be created and combined:

BitmapAnd: Result bit is 1 only if BOTH source bits are 1 (for AND predicates)
BitmapOr: Result bit is 1 if EITHER source bit is 1 (for OR predicates)

Phase 3: Bitmap Heap Scan

Bitmap Scan Algorithm (Pseudocode)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
function BitmapScan(table, indexes, predicates):
    // Phase 1: Build bitmap from index scans
    combined_bitmap = None
    
    for each (index, predicate) in (indexes, predicates):
        bitmap = new Bitmap(table.page_count)  // One bit per page
        
        // Scan index and set bits for matching pages
        for each entry in index.scan(predicate):
            page_id = entry.rid.page_id
            bitmap.set(page_id, 1)
        
        // Phase 2: Combine with previous bitmap
        if combined_bitmap is None:
            combined_bitmap = bitmap
        else if predicate_is_AND:
            combined_bitmap = combined_bitmap AND bitmap
        else:  // OR
            combined_bitmap = combined_bitmap OR bitmap
    
    // Phase 3: Scan table pages in order, only where bitmap is set
    results = []
    
    for page_id in 0 to table.page_count:
        if combined_bitmap.get(page_id) == 1:
            page = buffer_pool.fetch(page_id)
            
            for each row in page.rows:
                // Recheck: page-level bitmap may include non-matching rows
                if full_predicate_matches(row, predicates):
                    results.append(row)
    
    return results

Why Recheck Is Necessary

When Bitmap Scans Outperform Alternatives

Bitmap scans occupy a sweet spot between index scans and table scans. They're optimal in several scenarios:

Scenario 1: Medium Selectivity Queries

When selectivity is too high for pure index scan (random I/O overhead too great) but too low for table scan (too many wasted page reads), bitmap scans shine:

Access Method Comparison by Selectivity
Selectivity	Matching Rows	Best Access Method	Reason
< 1%	Few rows	Index Scan	Low random I/O count
1% - 20%	Moderate rows	Bitmap Scan	Sequential page access with selectivity
20%	Many rows	Table Scan	Nearly all pages needed anyway

Scenario 2: Multiple Conditions on Different Columns

When a query has multiple conditions and separate indexes exist for each, bitmap scans can combine them:

Combining Multiple Indexes with Bitmap
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- Indexes: idx_products_category (category_id)
--          idx_products_price (price)
 
-- Query with two conditions on different indexed columns
SELECT * FROM products 
WHERE category_id = 5 AND price < 100;
 
-- Without bitmap: Must choose ONE index, then filter
-- Option A: Use category index, filter by price → may read many prices > 100
-- Option B: Use price index, filter by category → may read many wrong categories
 
-- With bitmap:
EXPLAIN SELECT * FROM products WHERE category_id = 5 AND price < 100;
 
/*
Bitmap Heap Scan on products
  Recheck Cond: ((category_id = 5) AND (price < 100))
  ->  BitmapAnd
        ->  Bitmap Index Scan on idx_products_category
              Index Cond: (category_id = 5)
        ->  Bitmap Index Scan on idx_products_price
              Index Cond: (price < 100)
*/
 
-- Both indexes are used! Only pages satisfying BOTH conditions are fetched.

Scenario 3: IN Lists and OR Conditions

Bitmap scans efficiently handle multiple values by combining separate index lookups:

Bitmap Scan for IN Lists
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
-- Query with IN list
SELECT * FROM orders WHERE status IN ('pending', 'processing', 'review');
 
/*
Bitmap Heap Scan on orders
  Recheck Cond: (status = ANY ('{pending,processing,review}'))
  ->  Bitmap Index Scan on idx_orders_status
        Index Cond: (status = ANY ('{pending,processing,review}'))
*/
 
-- Each status value produces a bitmap, combined with OR
 
-- Similar for explicit OR:
SELECT * FROM orders WHERE customer_id = 123 OR customer_id = 456;
 
/*
Bitmap Heap Scan on orders
  Recheck Cond: ((customer_id = 123) OR (customer_id = 456))
  ->  BitmapOr
        ->  Bitmap Index Scan on idx_orders_customer
              Index Cond: (customer_id = 123)
        ->  Bitmap Index Scan on idx_orders_customer
              Index Cond: (customer_id = 456)
*/

Bitmap Scans Restore Sequential Access

Bitmap Scan Cost Analysis

Understanding the cost model for bitmap scans helps predict when they'll be chosen by the optimizer.

Component Costs:

Bitmap Index Scan Cost: Scanning index leaf pages to build the bitmap
```
C_bitmap_build = IndexPages × t_seq
```
Bitmap Combination Cost: CPU time to AND/OR bitmaps (very fast, typically negligible)
```
C_combine ≈ TablePages / 8  (number of bytes in bitmap)
```
Bitmap Heap Scan Cost: Reading table pages where bits are set
```
C_heap = PagesWithMatches × t_seq
```

Total Bitmap Scan Cost:

C_bitmap = C_bitmap_build + C_combine + C_heap

Bitmap Scan vs Index Scan Cost Comparison (1M row table, 10% selectivity)
Component	Index Scan	Bitmap Scan
Index traversal	3 pages	3 pages (to start)
Index leaf scan	1,000 pages	1,000 pages
Bitmap operation	N/A	~1KB bitmap, negligible
Row fetches (random)	100,000 × random	N/A
Page fetches (sequential)	N/A	~8,000 pages × sequential
Total I/O cost (HDD)	100,000 × 10ms = 1000 sec	9,000 × 0.5ms = 4.5 sec

The Crossover Point

The bitmap scan becomes preferable to index scan when the random I/O savings outweigh the extra pages read due to page-level (rather than row-level) granularity.

Factors favoring bitmap scan:

Higher selectivity (more matching rows)
Poor clustering factor (rows scattered across pages)
Slower random I/O relative to sequential (larger benefit on HDDs)
Multiple conditions that can share page reads

Factors favoring index scan:

Very low selectivity (few matching rows)
Excellent clustering (most rows on few pages anyway)
Fast random I/O (NVMe SSDs minimize the gap)

Memory Limits on Bitmaps

Bitmap Logical Operations: And, Or, and Beyond

The power of bitmap scans comes from the ability to combine multiple bitmaps using logical operations. This enables efficient processing of complex predicates.

BitmapAnd (Intersection)

For AND predicates, the resulting bitmap has 1 only where ALL source bitmaps have 1:

Result[page] = 1  iff  Bitmap1[page] = 1 AND Bitmap2[page] = 1 AND ...

This operation is extremely efficient: bitwise AND on entire bytes, processing 8 pages at a time.

BitmapAnd Example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
-- Query: Find electronics under $50 from verified sellers
SELECT * FROM products
WHERE category = 'electronics'
  AND price < 50
  AND seller_verified = true;
 
-- Assume indexes on all three columns
 
/*
Execution Plan:
Bitmap Heap Scan on products
  Recheck Cond: (...)
  ->  BitmapAnd
        ->  Bitmap Index Scan on idx_category
              Index Cond: (category = 'electronics')
              -- Bitmap 1: [0,1,1,0,1,1,0,0,1,0,...] (pages with electronics)
        ->  Bitmap Index Scan on idx_price
              Index Cond: (price < 50)
              -- Bitmap 2: [1,1,0,1,0,1,0,1,1,0,...] (pages with low prices)
        ->  Bitmap Index Scan on idx_seller_verified
              Index Cond: (seller_verified = true)
              -- Bitmap 3: [1,1,1,0,1,0,0,0,1,1,...] (pages with verified sellers)
 
Result Bitmap (AND of all three): [0,1,0,0,0,0,0,0,1,0,...]
-- Only pages 1 and 8 need to be fetched!
*/

BitmapOr (Union)

For OR predicates, the resulting bitmap has 1 where ANY source bitmap has 1:

Result[page] = 1  iff  Bitmap1[page] = 1 OR Bitmap2[page] = 1 OR ...

BitmapOr Example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- Query: Find products that are either on sale OR highly rated
SELECT * FROM products
WHERE on_sale = true OR rating >= 4.5;
 
/*
Execution Plan:
Bitmap Heap Scan on products
  Recheck Cond: ((on_sale = true) OR (rating >= 4.5))
  ->  BitmapOr
        ->  Bitmap Index Scan on idx_on_sale
              Index Cond: (on_sale = true)
              -- Bitmap 1: [0,1,0,0,1,0,0,0,1,0,...] (pages with sale items)
        ->  Bitmap Index Scan on idx_rating  
              Index Cond: (rating >= 4.5)
              -- Bitmap 2: [1,0,0,1,1,0,0,1,0,0,...] (pages with high ratings)
 
Result Bitmap (OR of both): [1,1,0,1,1,0,0,1,1,0,...]
-- Pages 0,1,3,4,7,8 need to be fetched
*/

Complex Predicate Transformation

The optimizer transforms complex boolean expressions into bitmap operation trees:

Complex Predicate to Bitmap Tree
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
-- Complex query
SELECT * FROM orders
WHERE (status = 'pending' AND priority = 'high')
   OR (status = 'processing' AND created_at > '2024-01-01');
 
/*
Bitmap Operation Tree:
 
            BitmapOr
           /        \
     BitmapAnd    BitmapAnd
      /    \       /     \
   status priority  status  created_at
   ='pending' ='high' ='processing' >'2024-01-01'
*/

Bitmaps Enable Index Intersection

Lossy Bitmaps and Memory Management

For very large tables, maintaining an exact bitmap (one bit per page) can require substantial memory. PostgreSQL addresses this with lossy bitmaps that trade precision for memory efficiency.

Exact vs Lossy Bitmaps:

Exact bitmap: One bit per page. Precise identification of matching pages.
Lossy bitmap: One bit per chunk of pages (e.g., 32 or 64 pages per bit). Indicates 'some page in this chunk matches'.

With lossy bitmaps, the recheck phase must filter out more non-matching rows because the granularity is coarser.

PostgreSQL Work Memory and Bitmap Behavior
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
-- Check current work_mem setting
SHOW work_mem;  -- Default: 4MB
 
-- For a table with 1 million pages, exact bitmap needs:
-- 1,000,000 bits = 125 KB (easily fits in 4MB)
 
-- For a table with 100 million pages:
-- 100,000,000 bits = 12.5 MB (exceeds default work_mem)
-- PostgreSQL would create a lossy bitmap
 
-- Observe bitmap behavior in EXPLAIN
EXPLAIN (ANALYZE, BUFFERS) 
SELECT * FROM huge_table WHERE indexed_column < 1000;
 
/*
Bitmap Heap Scan on huge_table
  Recheck Cond: (indexed_column < 1000)
  Rows Removed by Index Recheck: 15234    -- <-- High number indicates lossy bitmap
  Heap Blocks: exact=12345 lossy=67890    -- <-- Shows exact vs lossy blocks
  ->  Bitmap Index Scan on idx_huge
        Index Cond: (indexed_column < 1000)
*/
 
-- Increase work_mem to enable exact bitmaps (for this session)
SET work_mem = '128MB';
 
-- Re-run query - should show fewer lossy blocks

Lossy Bitmap Tradeoffs:

Aspect	Exact Bitmap	Lossy Bitmap
Memory	Higher	Lower
Precision	Exact pages	Page ranges
Recheck overhead	Minimal	Higher (more false positives)
I/O efficiency	Optimal	Slightly worse
Suitable for	Normal tables	Very large tables

When Lossy Bitmaps Cause Problems:

If you see high 'Rows Removed by Index Recheck' counts, the lossy bitmap is fetching many pages that don't actually contain matches. Consider:

Increasing work_mem to allow exact bitmaps
Adding a more selective index
Restructuring the query to be more selective

Work Memory Trade-offs

Bitmap Scans Across Database Systems

Different database systems implement bitmap-style operations with varying terminology and capabilities.

Bitmap Scan Implementation by Database
Database	Feature Name	Index Combination	Notes
PostgreSQL	Bitmap Heap Scan / Bitmap Index Scan	BitmapAnd, BitmapOr	Full-featured; supports lossy bitmaps for memory management
Oracle	BITMAP CONVERSION, INDEX COMBINE	BITMAP AND, BITMAP OR	Automatic bitmap conversion from B-tree indexes; native bitmap indexes also available
SQL Server	No direct equivalent	Index Intersection (similar concept)	Uses RID-based intersection rather than page-level bitmaps
MySQL	Index Merge	Intersection, Union	Limited capability; often creates temporary table instead of true bitmap

Cross-Database Bitmap-Style Operations
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
=== PostgreSQL ===
-- Bitmap scan clearly shown in EXPLAIN
EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND amount > 100;
/*
Bitmap Heap Scan on orders
  ->  BitmapAnd
        ->  Bitmap Index Scan on idx_status
        ->  Bitmap Index Scan on idx_amount
*/
 
=== Oracle ===
-- Oracle can convert B-tree indexes to bitmaps internally
EXPLAIN PLAN FOR
SELECT * FROM orders WHERE status = 'pending' AND amount > 100;
 
-- Look for operations like:
-- BITMAP CONVERSION FROM ROWIDS
-- BITMAP AND
-- TABLE ACCESS BY INDEX ROWID BATCHED
 
=== SQL Server ===
-- SQL Server uses "Index Intersection" for similar effect
SET SHOWPLAN_TEXT ON;
SELECT * FROM orders WHERE status = 'pending' AND amount > 100;
 
-- May show:
-- Index Seek (idx_status)
-- Index Seek (idx_amount)  
-- Hash Match (Inner Join on RID)  -- RID intersection
 
=== MySQL ===
-- MySQL "Index Merge" is the closest feature
EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND amount > 100;
 
-- May show type: index_merge
-- Extra: Using intersect(idx_status,idx_amount); Using where

PostgreSQL's Bitmap Implementation Is Among the Most Sophisticated

Practical Examples and Query Analysis

Let's examine real-world scenarios where bitmap scans provide significant performance benefits.

E-Commerce Product Filtering
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
-- Scenario: User applies multiple filters on product catalog
-- Table: products (10 million rows)
-- Indexes: category, brand, price_range, rating, in_stock
 
-- User's filter: Electronics, Sony or Samsung, Under $500, 4+ stars
SELECT product_id, name, price, rating
FROM products
WHERE category = 'electronics'
  AND brand IN ('sony', 'samsung')
  AND price < 500
  AND rating >= 4.0
  AND in_stock = true;
 
-- Without bitmap scans: Pick one index, filter the rest
-- With bitmap scans:
 
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT product_id, name, price, rating
FROM products
WHERE category = 'electronics'
  AND brand IN ('sony', 'samsung')
  AND price < 500
  AND rating >= 4.0
  AND in_stock = true;
 
/*
Bitmap Heap Scan on products  (actual time=12.345..89.123 rows=2,847 loops=1)
  Recheck Cond: ((category = 'electronics') AND (brand = ANY (...)) AND (price < 500) AND (rating >= 4.0) AND in_stock)
  Heap Blocks: exact=2134
  Buffers: shared hit=2567
  ->  BitmapAnd  (actual time=11.234..11.234 rows=0 loops=1)
        ->  Bitmap Index Scan on idx_category  (actual time=1.234..1.234 rows=0 loops=1)
              Index Cond: (category = 'electronics')
              -- 800,000 products are electronics (8% selectivity)
        ->  BitmapOr  (actual time=2.345..2.345 rows=0 loops=1)
              ->  Bitmap Index Scan on idx_brand
                    Index Cond: (brand = 'sony')
              ->  Bitmap Index Scan on idx_brand  
                    Index Cond: (brand = 'samsung')
              -- 150,000 products are Sony or Samsung
        ->  Bitmap Index Scan on idx_price  (actual time=1.567..1.567 rows=0 loops=1)
              Index Cond: (price < 500)
              -- 3,000,000 products under $500
        ->  Bitmap Index Scan on idx_rating  (actual time=1.234..1.234 rows=0 loops=1)
              Index Cond: (rating >= 4.0)
              -- 2,500,000 products have 4+ rating
        ->  Bitmap Index Scan on idx_stock  (actual time=0.567..0.567 rows=0 loops=1)
              Index Cond: (in_stock = true)
              -- 6,000,000 products in stock
 
Planning Time: 0.456 ms
Execution Time: 89.567 ms
*/
 
-- Combined selectivity: only 2,847 matching rows across 2,134 pages
-- Without bitmaps: would need to pick ONE index, then filter millions of rows

Time-Based Log Analysis
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
-- Scenario: Find error logs in specific time window from multiple sources
-- Table: application_logs (500 million rows, heavily partitioned)
 
SELECT log_id, timestamp, message, source
FROM application_logs
WHERE timestamp BETWEEN '2024-01-15 00:00' AND '2024-01-15 23:59'
  AND log_level = 'ERROR'
  AND source IN ('auth-service', 'payment-service', 'order-service');
 
/*
Bitmap Heap Scan on application_logs  (actual time=45.123..234.567 rows=8,923 loops=1)
  Recheck Cond: ((timestamp >= ...) AND (timestamp <= ...) AND (log_level = 'ERROR') AND (source = ANY (...)))
  Heap Blocks: exact=7234
  Buffers: shared hit=8234 read=567
  ->  BitmapAnd  (actual time=42.123..42.123 rows=0 loops=1)
        ->  Bitmap Index Scan on idx_logs_timestamp  
              Index Cond: ((timestamp >= ...) AND (timestamp <= ...))
              -- ~2M entries for that day
        ->  Bitmap Index Scan on idx_logs_level  
              Index Cond: (log_level = 'ERROR')
              -- ~1% of all entries are errors
        ->  BitmapOr  
              ->  Bitmap Index Scan on idx_logs_source (source='auth-service')
              ->  Bitmap Index Scan on idx_logs_source (source='payment-service')
              ->  Bitmap Index Scan on idx_logs_source (source='order-service')
 
Execution Time: 234.567 ms
*/
 
-- Without bitmap: Would need table partitioned by time, or very slow scan

Summary: Bitmap Scan Mastery

You now have comprehensive knowledge of bitmap scans—the hybrid access method that combines index selectivity with sequential access efficiency. Here are the essential takeaways:

Key Takeaways

•Bitmap scans are two-phase operations — First build a bitmap from index entries, then fetch matching pages sequentially. This converts random I/O into sequential I/O.
•Bitmaps enable index combination — BitmapAnd and BitmapOr operations allow efficient processing of complex predicates using multiple indexes.
•Optimal for medium selectivity — Bitmap scans shine at 1-20% selectivity, where index scans have too much random I/O and table scans read too many unwanted rows.
•Page-level granularity requires recheck — Standard bitmaps track pages, not rows, so matching rows must be verified after page fetch.
•Lossy bitmaps handle large tables — When memory is limited, bitmaps become coarser-grained, trading precision for memory efficiency.
•PostgreSQL excels at bitmap operations — PostgreSQL's implementation is particularly sophisticated, but similar concepts exist in other databases.

What's Next:

Page Complete

4 / 5