Indexes are the cornerstone of efficient database access. A well-designed index can transform a query that scans millions of rows into one that retrieves results in milliseconds. Yet as elegant as the basic indexing concept is, single-level indexes hit fundamental scalability limits that become painfully apparent as databases grow.
Understanding these limitations isn't merely academic—it's essential knowledge for any engineer who designs, maintains, or troubleshoots database systems. The limitations we'll explore in this page directly motivate the multi-level index structures that power virtually every modern database system, from PostgreSQL's B-trees to MySQL's InnoDB indexes to MongoDB's internal indexing mechanisms.
By the end of this page, you will understand: (1) how single-level indexes work and their structural characteristics, (2) the mathematical relationship between data size and index size, (3) how index size affects I/O costs during search operations, (4) why binary search becomes insufficient at scale, and (5) the precise conditions that necessitate multi-level indexing.
Before we can understand the limitations of single-level indexes, we must precisely define what they are and how they operate. A single-level index is an auxiliary data structure that contains index entries, where each entry consists of a search key value and a pointer (or set of pointers) to the corresponding data records.
The anatomy of a single-level index:
Consider a database file containing customer records. Each record occupies 200 bytes and contains fields like customer_id, name, email, and address. The file contains 10 million records, occupying approximately 2 GB of storage. We want to create an index on customer_id to accelerate lookups.
```sql
-- Conceptual representation of a single-level index structure
-- The actual implementation is internal to the DBMS

-- Data file structure (simplified)
-- Record format: [customer_id (8 bytes) | name (50 bytes) | email (50 bytes) | address (92 bytes)]
-- Total record size: 200 bytes
-- Total records: 10,000,000
-- Total data file size: ~2 GB

-- Index file structure
-- Index entry format: [customer_id (8 bytes) | record_pointer (6 bytes)]
-- Total index entry size: 14 bytes
-- Total index entries: 10,000,000
-- Total index file size: ~140 MB

-- Index organization (conceptually sorted by customer_id)
-- Entry 1: [1001, ptr_to_record_at_offset_0]
-- Entry 2: [1002, ptr_to_record_at_offset_200]
-- Entry 3: [1003, ptr_to_record_at_offset_400]
-- ...
-- Entry 10,000,000: [11001001, ptr_to_record_at_offset_1,999,999,800]
```

Key characteristics of single-level indexes:
Ordered Structure: Index entries are maintained in sorted order by search key. This ordering enables binary search, which is central to the index's efficiency.
Separation of Concerns: The index file is separate from the data file. This separation allows the index to be much smaller than the data, since index entries contain only the key and pointer—not the full record.
Single Hierarchy Level: All index entries exist at the same level. There is no hierarchical organization—the entire index is a flat, sorted sequence of entries.
Dense or Sparse: The index may be dense (one entry per data record) or sparse (one entry per block or group of records, which is only practical when the data file is sorted on the search key).
| Property | Data File | Index File |
|---|---|---|
| Record/Entry Size | 200 bytes (full record) | 14 bytes (key + pointer) |
| Total Entries | 10,000,000 records | 10,000,000 entries (dense index) |
| Total Size | ~2 GB | ~140 MB |
| Organization | May be unordered (heap) | Always sorted by search key |
| Purpose | Store complete data | Accelerate key-based lookups |
The index is 14x smaller than the data file (140 MB vs. 2 GB). This size reduction is fundamental to indexing—searching a smaller, ordered structure is inherently faster than scanning a larger, potentially unordered data file. But as we'll see, 'smaller' is relative, and 140 MB is still substantial.
To understand single-level index limitations, we must first understand how databases measure efficiency. The dominant cost in database operations—by orders of magnitude—is disk I/O. CPU operations are so fast compared to disk access that they are typically negligible in cost models.
The fundamental unit of disk I/O is the block (or page).
Databases read and write data in fixed-size blocks, typically 4 KB to 16 KB. Even if you need a single 14-byte index entry, the system reads an entire 4 KB block containing that entry. This block-based access is fundamental to understanding database performance.
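To make the block granularity concrete, here is a minimal Python sketch using the running example's sizes (the 4 KB block and 14-byte entry are assumptions carried over from above): it shows how many entries share one block, and how much data must be transferred just to obtain a single entry.

```python
BLOCK_SIZE = 4096      # bytes per disk block (4 KB, as in the example above)
ENTRY_SIZE = 14        # bytes per index entry: 8-byte key + 6-byte pointer (assumed)

# Entries packed into one block (the blocking factor).
entries_per_block = BLOCK_SIZE // ENTRY_SIZE   # 292

# Fetching any single entry still transfers a whole block.
bytes_read = BLOCK_SIZE
bytes_needed = ENTRY_SIZE
print(f"{entries_per_block} entries/block; "
      f"{bytes_read} bytes read to obtain {bytes_needed} bytes "
      f"({bytes_read / bytes_needed:.0f}x read amplification)")
```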
| Storage Level | Access Time | Relative Speed |
|---|---|---|
| L1 Cache | ~1 nanosecond | 1x (baseline) |
| L2 Cache | ~4 nanoseconds | 4x slower |
| L3 Cache | ~12 nanoseconds | 12x slower |
| Main Memory (RAM) | ~100 nanoseconds | 100x slower |
| SSD (random read) | ~100 microseconds | 100,000x slower |
| HDD (random read) | ~10 milliseconds | 10,000,000x slower |
The mathematical framework:
Let's establish the notation we'll use throughout this analysis:

- r = number of records in the data file
- K = size of the search key, in bytes
- P = size of a block pointer, in bytes
- i = size of one index entry, i = K + P, in bytes
- B = disk block size, in bytes
- bfr_i = index blocking factor, the number of index entries per block: ⌊B / i⌋
- n = number of blocks occupied by the index: ⌈r / bfr_i⌉
Let's apply this to our example:
```
Single-Level Index Size Calculation
===================================

Given:
  - r = 10,000,000 records
  - K = 8 bytes (customer_id)
  - P = 6 bytes (block pointer)
  - B = 4,096 bytes (4 KB blocks)

Calculations:
  - Index entry size (i) = K + P = 8 + 6 = 14 bytes
  - Blocking factor (bfr_i) = ⌊4096 / 14⌋ = ⌊292.57⌋ = 292 entries/block
  - Number of index blocks (n) = ⌈10,000,000 / 292⌉ = ⌈34,246.58⌉ = 34,247 blocks

Index Properties:
  - Total index size = 34,247 × 4,096 bytes = 140,275,712 bytes ≈ 134 MB
  - Index spans 34,247 disk blocks

Search Cost (Binary Search):
  - log₂(34,247) = 15.06
  - Block accesses required = ⌈15.06⌉ = 16 block reads

At 10 ms per HDD block read: 16 × 10 ms = 160 ms per search
At 0.1 ms per SSD block read: 16 × 0.1 ms = 1.6 ms per search
```

Even with binary search on a sorted index, finding a single record requires 16 block reads. For a system serving 1,000 queries per second, this means 16,000 block reads per second just for index lookups—before even reading the actual data records. On HDD, this is physically impossible. On SSD, it consumes significant I/O bandwidth.
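The same arithmetic can be reproduced with a short Python sketch; the record count, key, pointer, and block sizes are the assumed values from the example above, not measurements from any particular DBMS.

```python
import math

r = 10_000_000          # records (assumed, from the running example)
K, P = 8, 6             # key and block-pointer sizes in bytes (assumed)
B = 4096                # block size in bytes

i = K + P                               # index entry size: 14 bytes
bfr_i = B // i                          # blocking factor: 292 entries/block
n = math.ceil(r / bfr_i)                # index blocks: 34,247
index_bytes = n * B                     # ~140 MB on disk

search_block_reads = math.ceil(math.log2(n))   # 16 reads for binary search

print(f"index blocks = {n:,}, index size = {index_bytes / 1e6:.0f} MB")
print(f"binary search cost = {search_block_reads} block reads")
print(f"HDD @ 10 ms/read: {search_block_reads * 10} ms per lookup")
print(f"SSD @ 0.1 ms/read: {search_block_reads * 0.1:.1f} ms per lookup")
```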
Binary search is an elegant algorithm with O(log₂ n) complexity, where n is the number of elements. For in-memory data structures, this is exceptionally efficient. But when applied to disk-resident single-level indexes, binary search reveals critical limitations.
The problem isn't the algorithm—it's the access pattern.
Binary search accesses elements in a specific, non-sequential pattern. Consider searching for a value in an index spanning 34,247 blocks:
```
Binary Search Access Pattern for 34,247 Blocks
==============================================

Search for customer_id = 5,500,000

Iteration 1:  Access block 17,123 (middle) → Compare, go left
Iteration 2:  Access block 8,561  → Compare, go right
Iteration 3:  Access block 12,842 → Compare, go left
Iteration 4:  Access block 10,701 → Compare, go right
Iteration 5:  Access block 11,772 → Compare, go left
Iteration 6:  Access block 11,236 → Compare, go right
Iteration 7:  Access block 11,504 → Compare, go left
Iteration 8:  Access block 11,370 → Compare, go right
Iteration 9:  Access block 11,437 → Compare, go left
Iteration 10: Access block 11,403 → Compare, go right
Iteration 11: Access block 11,420 → Compare, go left
Iteration 12: Access block 11,411 → Compare, go right
Iteration 13: Access block 11,415 → Compare, go left
Iteration 14: Access block 11,413 → Compare, go right
Iteration 15: Access block 11,414 → Compare, FOUND!
Iteration 16: (May need one more for exact entry within block)

Each iteration requires a RANDOM disk access.
Blocks accessed: 17123, 8561, 12842, 10701, 11772, 11236, 11504, ...

These blocks are scattered across the disk:
- Block 17123 → Disk position A
- Block 8561  → Disk position B (far from A)
- Block 12842 → Disk position C (far from B)
...

For HDD: Each access incurs seek time + rotational latency
For SSD: Each access is a random read (slower than sequential)
```

Key limitations exposed:
1. Random Access Pattern
Binary search jumps to arbitrary positions in the index—first to the middle, then to a quarter or three-quarter point, and so on. On disk, these jumps translate into random I/O operations. Random I/O is dramatically slower than sequential I/O: on an HDD every jump pays seek time plus rotational latency, and even on an SSD scattered single-block reads cannot be coalesced or prefetched the way a sequential scan can.
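A small Python sketch makes the scattered pattern visible. It only simulates which block numbers a block-level binary search would probe; the mapping from keys to block positions is a simplifying assumption for illustration, not how a real DBMS addresses its index file.

```python
def binary_search_block_visits(num_blocks: int, target_block: int) -> list[int]:
    """Return the sequence of block numbers probed while binary searching
    for the block that should contain the target key."""
    visits = []
    lo, hi = 0, num_blocks - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        visits.append(mid)        # each probe is one random block read
        if mid == target_block:
            break
        elif mid < target_block:
            lo = mid + 1
        else:
            hi = mid - 1
    return visits

# 34,247 index blocks; look for a key living in block 11,414 (hypothetical target).
path = binary_search_block_visits(34_247, 11_414)
print(len(path), "block reads:", path)
# The probed block numbers jump back and forth across the file rather than
# advancing sequentially, so each read is a random I/O.
```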
2. No Caching Benefit Between Searches
Each binary search starts from the middle of the index. The first few probed blocks (the middle, the quarter points, the eighth points) may stay cached across many searches, but the blocks touched by later probes differ with every search key, so there is limited opportunity for cache reuse.
3. Logarithmic Growth Still Means Significant I/O
Although log₂ n grows slowly, it still requires substantial I/O at scale:
| Records | Index Blocks (4KB) | Binary Search Accesses | Time (HDD @ 10ms) | Time (SSD @ 0.1ms) |
|---|---|---|---|---|
| 1,000 | 4 | 2 | 20ms | 0.2ms |
| 10,000 | 35 | 6 | 60ms | 0.6ms |
| 100,000 | 343 | 9 | 90ms | 0.9ms |
| 1,000,000 | 3,425 | 12 | 120ms | 1.2ms |
| 10,000,000 | 34,247 | 16 | 160ms | 1.6ms |
| 100,000,000 | 342,466 | 19 | 190ms | 1.9ms |
| 1,000,000,000 | 3,424,658 | 22 | 220ms | 2.2ms |
At 1 billion records, each index lookup requires 22 random disk accesses. On HDD, this means each lookup takes over 200ms—completely unacceptable for interactive applications. Even on SSD, 2.2ms per lookup means you can only serve ~450 lookups per second with a single disk. This is the scalability wall of single-level indexes.
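The rows of the table above follow mechanically from the same formulas; a short sketch, under the same assumed 14-byte entries and 4 KB blocks, regenerates them.

```python
import math

ENTRY_SIZE, BLOCK_SIZE = 14, 4096              # assumed, as in the running example
ENTRIES_PER_BLOCK = BLOCK_SIZE // ENTRY_SIZE   # 292

for records in (10**3, 10**4, 10**5, 10**6, 10**7, 10**8, 10**9):
    blocks = math.ceil(records / ENTRIES_PER_BLOCK)     # index blocks (dense index)
    reads = math.ceil(math.log2(blocks))                # binary search block accesses
    print(f"{records:>13,} records | {blocks:>9,} blocks | "
          f"{reads:>2} reads | HDD {reads * 10} ms | SSD {reads * 0.1:.1f} ms")
```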
A common response to single-level index performance issues is: "Just load the index into memory." This solution works—when it's possible. But as data scales, memory constraints become the critical limitation.
The economics of memory vs. storage:
Memory is expensive and limited. Storage is cheap and plentiful. The gap between what we can afford to store and what we can afford to keep in memory creates the fundamental challenge of database systems.
| Records | Index Size (Dense) | Typical Server RAM | Index Fits in Memory? |
|---|---|---|---|
| 1 million | 14 MB | 16 GB | ✓ Yes (0.09% of RAM) |
| 10 million | 140 MB | 32 GB | ✓ Yes (0.44% of RAM) |
| 100 million | 1.4 GB | 64 GB | ✓ Yes (2.2% of RAM) |
| 1 billion | 14 GB | 128 GB | ⚠ Marginal (11% of RAM) |
| 10 billion | 140 GB | 256 GB | ✗ No (55% of RAM) |
| 100 billion | 1.4 TB | 512 GB | ✗ Definitely not |
The multi-index reality:
The analysis above considers a single index. Real databases typically have multiple indexes: a primary key index, indexes on foreign key columns, and secondary indexes on the columns that frequent WHERE, JOIN, and ORDER BY clauses touch.
A table might have 5-10 indexes. If each index on 1 billion records requires 14 GB, that's 70-140 GB just for indexes—on a single table. Production databases may have hundreds of tables.
```sql
-- Real-world memory pressure analysis
-- Example: E-commerce database with 100 million orders

-- Table: orders (100 million rows)
-- Record size: 250 bytes
-- Data file size: ~25 GB

-- Indexes on orders table:
-- 1. Primary index on order_id (8 bytes): 1.4 GB
-- 2. Index on customer_id (8 bytes): 1.4 GB
-- 3. Index on order_date (8 bytes): 1.4 GB
-- 4. Index on status (4 bytes): 1.2 GB
-- 5. Composite index on (customer_id, order_date): 2.4 GB
-- 6. Index on total_amount (8 bytes): 1.4 GB

-- Total index memory for orders table alone: ~9.2 GB

-- Additional tables with similar scale:
-- - order_items (500M rows): ~30 GB indexes
-- - products (10M rows): ~2 GB indexes
-- - customers (50M rows): ~7 GB indexes
-- - inventory (100M rows): ~10 GB indexes

-- Total index memory requirement: ~60+ GB
-- Available RAM: 128 GB
-- RAM needed for query execution, connections, OS: ~40 GB
-- RAM available for caching (indexes AND hot data pages): ~88 GB

-- Result: indexes alone consume most of the available cache, leaving
-- little room for data pages; not every index stays resident, so some
-- index lookups WILL hit disk, causing I/O bottlenecks
```

When indexes partially fit in memory, performance becomes unpredictable. Frequently accessed index portions stay cached, but less frequent accesses cause disk I/O. This creates latency spikes that appear random but are actually systematic. Understanding index size and memory pressure is critical for production database tuning.
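The per-index sizes above follow from (key size + pointer size) × row count. Here is a minimal Python sketch of that estimate, assuming the same 6-byte pointers as earlier; the figures in the block above run slightly higher, which would be consistent with some per-entry overhead beyond the raw key and pointer.

```python
POINTER_SIZE = 6  # bytes per record pointer (assumed, as in the running example)

# (index name, key size in bytes) for the 100-million-row orders table
order_indexes = [
    ("order_id",                  8),
    ("customer_id",               8),
    ("order_date",                8),
    ("status",                    4),
    ("customer_id + order_date", 16),
    ("total_amount",              8),
]

rows = 100_000_000
total_gb = 0.0
for name, key_size in order_indexes:
    size_gb = rows * (key_size + POINTER_SIZE) / 1e9   # dense index: one entry per row
    total_gb += size_gb
    print(f"{name:<26} ~{size_gb:.1f} GB")
print(f"orders table indexes, raw key+pointer bytes: ~{total_gb:.1f} GB")
```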
Single-level indexes incur significant costs not just for reads, but for modifications. Every INSERT, UPDATE (on indexed columns), or DELETE requires index maintenance. The overhead grows with index size.
The hidden cost of sorted order:
A single-level index maintains entries in sorted order. This ordering is essential for binary search, but it creates maintenance challenges:

- Every insert must place the new entry at its correct sorted position, which means shifting entries, chaining an overflow block, or splitting a block.
- Every delete leaves a gap that must be compacted or tracked for reuse.
- Every update of an indexed column is effectively a delete followed by an insert.
- Overflow chains and fragmentation accumulate over time, degrading search performance until the index is reorganized.
Quantifying the update cost:
Consider inserting a new record into our 10-million-record customer database. The index has 34,247 blocks. If the insertion point is uniformly random:
```
Single-Level Index Insert Cost Analysis
=======================================

Scenario: Insert customer_id = 5,500,000 (middle of range)

Step 1: Find insertion point
  - Binary search: 16 block reads

Step 2: Make room for new entry
  Option A: Shift entries within blocks (if space available)
    - Best case: 1 block write (space in target block)
    - Typical case: Several block reads/writes to redistribute
  Option B: Overflow chain (common approach)
    - Add entry to overflow block linked from primary block
    - Degrades search performance over time
    - Requires periodic reorganization
  Option C: Block split
    - Split full block into two blocks
    - Update pointers to blocks
    - 2-3 block writes
    - May cascade if parent tracking structures exist

Step 3: Write modified blocks
  - 1-10 block writes depending on strategy

Total I/O per insert: 17-30 block operations
At 100 inserts/second:   1,700-3,000 IOPS just for index maintenance
At 1,000 inserts/second: 17,000-30,000 IOPS

For perspective:
  - HDD capacity: ~200 random IOPS
  - SSD capacity: ~50,000-100,000 IOPS

HDD cannot sustain even 100 inserts/second efficiently.
SSD works, but index maintenance dominates I/O capacity.
```

A single logical insert (one new record) can cause 10-30 physical I/O operations for index maintenance. This 'write amplification' scales with the number of indexes on a table. Five indexes means 50-150 I/O operations per insert. This is why proper index design is critical—unnecessary indexes have real costs.
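To see how quickly maintenance I/O compounds, here is a rough Python sketch. The 17-30 block operations per index per insert are the estimate from the analysis above, and the per-device IOPS figures are the same ballpark numbers already quoted; none of these are measured values.

```python
# Rough write-amplification estimate for single-level index maintenance.
IO_PER_INDEX_PER_INSERT = (17, 30)   # low/high estimate from the analysis above
HDD_IOPS, SSD_IOPS = 200, 75_000     # ballpark random-I/O capacities (assumed)

def maintenance_iops(inserts_per_second: int, num_indexes: int) -> tuple[float, float]:
    """Return (low, high) IOPS consumed purely by index maintenance."""
    low, high = IO_PER_INDEX_PER_INSERT
    return (inserts_per_second * num_indexes * low,
            inserts_per_second * num_indexes * high)

for rate in (100, 1_000):
    for idx_count in (1, 5):
        low, high = maintenance_iops(rate, idx_count)
        print(f"{rate:>5} inserts/s x {idx_count} indexes: "
              f"{low:,.0f}-{high:,.0f} IOPS "
              f"({'exceeds HDD' if low > HDD_IOPS else 'within HDD'}, "
              f"{'exceeds SSD' if high > SSD_IOPS else 'within SSD'})")
```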
Let's formalize the limitations we've discussed into a precise mathematical framework. This analysis reveals exactly why single-level indexes fail at scale and motivates the multi-level solution.
The fundamental relationship:
For a single-level dense index with r records, index entry size i, and block size B:
Number of index blocks: n = ⌈r / ⌊B / i⌋⌉ ≈ ⌈r × i / B⌉
The number of block accesses for binary search:
Search cost = ⌈log₂(n)⌉ ≈ ⌈log₂(r × i / B)⌉
As r grows, the search cost grows logarithmically—slowly, but unboundedly.
```
Mathematical Proof: Single-Level Index Scalability Limit
=========================================================

Given:
  - Target maximum search cost: c blocks (e.g., c = 4 for acceptable latency)
  - Block size: B bytes
  - Index entry size: i bytes
  - Blocking factor: f = ⌊B / i⌋ entries per block

Constraint: log₂(n) ≤ c, where n = number of index blocks

Maximum index blocks:        n_max = 2^c
Maximum records supportable: r_max = n_max × f = 2^c × ⌊B / i⌋

Example calculations (B = 4096 bytes, i = 14 bytes, f = 292):

c = 4 (4 block reads max):
  n_max = 16 blocks
  r_max = 16 × 292 = 4,672 records

c = 8 (8 block reads max):
  n_max = 256 blocks
  r_max = 256 × 292 = 74,752 records

c = 12 (12 block reads max):
  n_max = 4,096 blocks
  r_max = 4,096 × 292 = 1,196,032 records (~1.2 million)

c = 16 (16 block reads max):
  n_max = 65,536 blocks
  r_max = 65,536 × 292 = 19,136,512 records (~19 million)

c = 20 (20 block reads max):
  n_max = 1,048,576 blocks
  r_max = 1,048,576 × 292 = 306,184,192 records (~306 million)

CRITICAL INSIGHT:
To support 10 million records with ≤4 block reads using a single-level index:
IMPOSSIBLE with conventional parameters. The mathematical bound is 4,672
records for 4 block reads. To support 10 million records, a single-level
index REQUIRES ≥16 block reads.

Multi-level indexing fundamentally changes this relationship.
```

The logarithmic gap:
The key insight is that single-level indexes apply binary search at the block level, not the entry level. Each block contains B/i entries, but we pay one I/O per block visited during search.
What if we could search more efficiently?
Imagine if instead of binary searching over n = 34,247 blocks, we could:

1. Read one small top-level block whose entries tell us which second-level block covers our key (1 block read),
2. Read that second-level block to learn which of the 34,247 first-level index blocks holds the key (1 block read),
3. Read that single first-level index block and locate the entry (1 block read).
Total: 3 block reads instead of 16.
This is exactly the principle of multi-level indexing—and why it transforms database performance.
A multi-level index with blocking factor f = 292 requires only ⌈log_f(r)⌉ block accesses instead of ⌈log₂(r × i / B)⌉. For 10 million records: single-level requires 16 accesses, multi-level requires 3. For 1 billion records: single-level requires 22 accesses, multi-level requires 4. This is the power of hierarchical indexing.
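That claim can be checked numerically; a brief sketch comparing the two cost formulas under the same assumed 292-entry blocking factor:

```python
import math

f = 292   # entries per 4 KB index block (blocking factor, assumed as above)

for r in (10_000_000, 1_000_000_000):
    single_level_blocks = math.ceil(r / f)                      # flat, sorted index
    single_level_reads = math.ceil(math.log2(single_level_blocks))
    multi_level_reads = math.ceil(math.log(r, f))               # one read per level
    print(f"{r:>13,} records: single-level {single_level_reads} reads, "
          f"multi-level {multi_level_reads} reads")
```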
We've rigorously examined the limitations of single-level indexes. These limitations aren't bugs—they're inherent to the flat, sorted structure. Let's consolidate what we've learned:

- Binary search on a single-level index operates at the block level, so even 10 million records cost about 16 random block reads per lookup.
- Index size grows linearly with data size; at billions of records the index alone strains or exceeds available memory, and real tables carry several indexes at once.
- Every insert, update, or delete on an indexed column pays additional I/O to keep the index sorted, amplifying writes by an order of magnitude or more.
- The number of records searchable within c block reads is mathematically bounded by 2^c × ⌊B / i⌋, a ceiling no amount of tuning can lift.
What's next:
The limitations we've identified motivate a fundamental redesign of index structure. The next page introduces the multi-level index concept—a hierarchical organization that reduces block accesses from log₂(blocks) to log_f(records), where f is the blocking factor (entries per block). This seemingly small change transforms database scalability, enabling efficient access to billions of records with just 3-5 block reads.
You now understand the fundamental limitations of single-level indexes: block-level binary search, memory constraints, update overhead, and mathematical scalability limits. These constraints motivate the multi-level index structures we'll explore next, which overcome these limitations through hierarchical organization.