The previous page revealed a fundamental limitation: single-level indexes require log₂(blocks) random disk accesses per lookup—a cost that becomes prohibitive as databases scale to millions or billions of records. Binary search, despite its elegant O(log n) complexity, fails us at the block level because each comparison requires reading an entire disk block.
The solution lies in a deceptively simple insight: we can build an index on top of an index. This hierarchical approach transforms the search problem from binary (base-2) to a much higher base determined by the number of entries per block. The result is a dramatic reduction in I/O operations—often from 15-20 block reads down to 3-4, even for datasets spanning billions of records.
By the end of this page, you will understand: (1) the core concept of multi-level indexing, (2) how building indexes on indexes achieves efficiency, (3) the mathematical relationship between index levels and block accesses, (4) how fanout determines index depth, and (5) the intuition behind why hierarchical organization fundamentally changes scalability.
The breakthrough of multi-level indexing comes from a single observation: a single-level index is itself a sorted file. And what do we do with sorted files to accelerate access? We build indexes on them.
Consider our running example: a 10-million-record customer database with a single-level index spanning 34,247 blocks. This index is sorted by customer_id, with each block containing 292 index entries (at 14 bytes per entry in 4 KB blocks).
Now imagine we build a second index—not on the original data, but on the first index. This second index contains one entry for each block of the first index: the key value of the first entry in that block, plus a pointer to the block.
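This construction reduces to a few lines of arithmetic. The following is a minimal Python sketch (the `build_levels` helper is illustrative, not from any database API) that derives each level's block count from the running example's 4 KB blocks and 14-byte entries:

```python
import math

BLOCK_SIZE = 4096   # bytes per block (from the running example)
ENTRY_SIZE = 14     # 8-byte key + 6-byte pointer
FANOUT = BLOCK_SIZE // ENTRY_SIZE  # 292 entries per block

def build_levels(num_entries):
    """Return the block count of each index level, bottom-up,
    stopping when a single root block remains."""
    levels = []
    blocks = math.ceil(num_entries / FANOUT)
    levels.append(blocks)
    while blocks > 1:
        # Each upper level holds one entry per block of the level below
        blocks = math.ceil(blocks / FANOUT)
        levels.append(blocks)
    return levels

# Dense first level: one entry per record
print(build_levels(10_000_000))  # [34247, 118, 1]
```

Level 1 needs 34,247 blocks, Level 2 needs 118, and Level 3 collapses to the single root block, matching the construction above.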
```
Multi-Level Index Construction
==============================

Level 0 (Data File):
- 10,000,000 customer records
- Record size: 200 bytes
- Total: ~2 GB across multiple data blocks

Level 1 (Primary Index / First-Level Index):
- One entry per data record (dense index)
- Entry size: 14 bytes (8-byte key + 6-byte pointer)
- Entries per block: 292
- Total blocks: 34,247
- Total size: ~134 MB

Level 2 (Secondary Index on Level 1):
- One entry per BLOCK of Level 1 (sparse index on Level 1)
- Entry size: 14 bytes (8-byte key + 6-byte pointer)
- Total entries needed: 34,247 (one per Level 1 block)
- Entries per block: 292
- Total blocks: ⌈34,247 / 292⌉ = 118 blocks
- Total size: ~472 KB

Level 3 (Tertiary Index on Level 2):
- One entry per BLOCK of Level 2 (sparse index on Level 2)
- Entry size: 14 bytes
- Total entries needed: 118
- Entries per block: 292
- Total blocks: ⌈118 / 292⌉ = 1 block
- Total size: 4 KB

TOTAL INDEX STRUCTURE:
  Level 3: 1 block (root)
  Level 2: 118 blocks
  Level 1: 34,247 blocks
  Level 0: Data (millions of blocks)
```

The crucial observation:
The second-level index has only 118 blocks (compared to 34,247 for the first level). We can build a third index on the second, which has only 1 block. This single root block contains entries pointing to all 118 second-level blocks.
Search process transformation:
With single-level index: Binary search over 34,247 blocks = 16 block reads.
With multi-level index:

1. Read the Level 3 root block and locate the correct Level 2 block: 1 I/O.
2. Read that Level 2 block and locate the correct Level 1 block: 1 I/O.
3. Read that Level 1 block and find the entry for the target key: 1 I/O.
4. Follow the entry's pointer to the data block: 1 I/O.

Total: 4 block reads (3 index + 1 data) instead of 16 index reads plus the data access.
We reduced 16 random disk accesses to 4 by adding just 119 additional blocks (118 + 1) of metadata. The overhead is minimal (119 blocks = 476 KB), but the performance improvement is 4x. This is the fundamental value proposition of multi-level indexing.
The efficiency of multi-level indexes stems from a mathematical relationship between the fanout (entries per block) and the number of levels required. Understanding this relationship is essential for database tuning and capacity planning.
Key definitions:
Fanout (f): The number of index entries that fit in one block. Also called branching factor.
Number of levels (L): The depth of the index hierarchy.
Records addressable at L levels: f^L records (approximately)
The exponential growth of addressable records:
| Levels (L) | Addressable Records (f^L) | Index Block Accesses | Total Index Size (dense, at capacity) |
|---|---|---|---|
| 1 | 292 | 1 | 4 KB |
| 2 | 85,264 | 2 | ~1.1 MB |
| 3 | 24,897,088 (~25M) | 3 | ~334 MB |
| 4 | 7,269,949,696 (~7.3B) | 4 | ~95 GB |
| 5 | 2,122,825,311,232 (~2.1T) | 5 | ~27 TB |
The logarithmic relationship:
To find the number of levels required to index r records:
L = ⌈log_f(r)⌉
For our 10-million-record database with f = 292:
```
L = ⌈log₂₉₂(10,000,000)⌉
  = ⌈ln(10,000,000) / ln(292)⌉
  = ⌈16.12 / 5.68⌉
  = ⌈2.84⌉
  = 3 levels
```
Three levels of index can address 24.9 million records. Our 10-million-record database fits comfortably.
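The calculation generalizes to any record count. Here is a minimal Python sketch (the `levels_required` helper is hypothetical, just restating the formula above):

```python
import math

def levels_required(records, fanout=292):
    """Index levels needed so that fanout**levels >= records."""
    return math.ceil(math.log(records) / math.log(fanout))

print(levels_required(1_000))          # 2
print(levels_required(10_000_000))     # 3
print(levels_required(1_000_000_000))  # 4
```

Even a billion records need only one more level than ten million, which is the essence of logarithm-base-fanout scaling.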
```sql
-- Calculating required index levels for various data sizes
-- Assuming: Block size = 4096 bytes, Entry size = 14 bytes
-- Fanout (f) = floor(4096 / 14) = 292
-- Formula: Levels = CEILING(LOG(records) / LOG(fanout))

WITH fanout_params AS (
    SELECT 4096 AS block_size,
           14 AS entry_size,
           FLOOR(4096.0 / 14) AS fanout  -- 292
),
data_scales AS (
    SELECT 1000 AS records UNION ALL
    SELECT 10000 UNION ALL
    SELECT 100000 UNION ALL
    SELECT 1000000 UNION ALL        -- 1 million
    SELECT 10000000 UNION ALL       -- 10 million
    SELECT 100000000 UNION ALL      -- 100 million
    SELECT 1000000000 UNION ALL     -- 1 billion
    SELECT 10000000000 UNION ALL    -- 10 billion
    SELECT 100000000000             -- 100 billion
)
SELECT
    d.records,
    f.fanout,
    CEILING(LOG(d.records) / LOG(f.fanout)) AS levels_required,
    -- Dense single-level index: one entry per record, 292 entries per block
    CEILING(d.records * 1.0 / f.fanout) AS single_level_blocks,
    -- Binary search cost over those blocks
    CEILING(LOG(CEILING(d.records * 1.0 / f.fanout)) / LOG(2)) AS binary_search_blocks
FROM data_scales d
CROSS JOIN fanout_params f;

-- Results:
-- Records          Fanout  Levels  Single-Level Blocks  Binary Search Blocks
-- ---------------  ------  ------  -------------------  --------------------
-- 1,000               292       2                    4                     2
-- 10,000              292       2                   35                     6
-- 100,000             292       3                  343                     9
-- 1,000,000           292       3                3,425                    12
-- 10,000,000          292       3               34,247                    16
-- 100,000,000         292       4              342,466                    19
-- 1,000,000,000       292       4            3,424,658                    22
-- 10,000,000,000      292       5           34,246,576                    26
-- 100,000,000,000     292       5          342,465,754                    29
```

With a fanout of 292, each level of the multi-level index effectively 'skips' 8-9 levels of binary search (since log₂(292) ≈ 8.2). A 4-level multi-level index has search efficiency comparable to a 32-level binary search, but we never need 32 binary search levels because the math caps us at ~29 even for the largest databases.
Multi-level indexes have a specific density characteristic at each level that is fundamental to their efficiency. Understanding this pattern reveals why the structure works.
The density pattern:
Bottom level (Level 1): Can be either dense or sparse, depending on whether data is sorted.
Upper levels (Level 2 and above): Always sparse—one entry per block of the level below.
This pattern ensures that each upper level has a predictable size: it has exactly ⌈(blocks at level below) / fanout⌉ blocks.
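That recurrence can be expressed directly in code. The following Python sketch (the `index_blocks` helper is illustrative) totals the blocks at every level for both a dense and a sparse bottom level, using the running example's parameters:

```python
import math

FANOUT = 292        # index entries per 4 KB block
RECORDS = 10_000_000
RECORD_SIZE = 200   # bytes
BLOCK_SIZE = 4096   # bytes

def index_blocks(bottom_entries, fanout=FANOUT):
    """Total blocks across all index levels, where every level above
    the bottom is sparse (one entry per block of the level below)."""
    total = 0
    blocks = math.ceil(bottom_entries / fanout)
    while True:
        total += blocks
        if blocks == 1:          # reached the single root block
            return total
        blocks = math.ceil(blocks / fanout)

# Dense bottom level: one entry per record
dense = index_blocks(RECORDS)                            # 34,247 + 118 + 1

# Sparse bottom level: one entry per data block
data_blocks = math.ceil(RECORDS * RECORD_SIZE / BLOCK_SIZE)
sparse = index_blocks(data_blocks)                       # 1,673 + 6 + 1

print(dense, sparse)  # 34366 1680
```

Both variants reach a single root in three levels; only the bottom level's density, and hence the total size, differs.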
```
Sparse vs Dense Multi-Level Index Comparison
============================================

Dataset: 10,000,000 records
Record size: 200 bytes
Block size: 4,096 bytes
Data blocks: ⌈10,000,000 × 200 / 4,096⌉ = 488,282 blocks

CASE 1: Dense Index (data not sorted by index key)
--------------------------------------------------
Level 1 (Dense - one entry per record):
  Entries: 10,000,000
  Blocks: ⌈10,000,000 / 292⌉ = 34,247 blocks

Level 2 (Sparse - one entry per Level 1 block):
  Entries: 34,247
  Blocks: ⌈34,247 / 292⌉ = 118 blocks

Level 3 (Sparse - one entry per Level 2 block):
  Entries: 118
  Blocks: ⌈118 / 292⌉ = 1 block (root)

Total index blocks: 34,247 + 118 + 1 = 34,366 blocks
Total index size: ~134 MB
Search cost: 3 index blocks + 1 data block = 4 I/Os

CASE 2: Sparse Index (data IS sorted by index key)
--------------------------------------------------
Level 1 (Sparse - one entry per DATA block):
  Entries: 488,282 (one per data block)
  Blocks: ⌈488,282 / 292⌉ = 1,673 blocks

Level 2 (Sparse - one entry per Level 1 block):
  Entries: 1,673
  Blocks: ⌈1,673 / 292⌉ = 6 blocks

Level 3 (Sparse - one entry per Level 2 block):
  Entries: 6
  Blocks: ⌈6 / 292⌉ = 1 block (root)

Total index blocks: 1,673 + 6 + 1 = 1,680 blocks
Total index size: ~6.5 MB
Search cost: 3 index blocks + 1 data block = 4 I/Os

COMPARISON:
  Dense index:  34,366 blocks (~134 MB)
  Sparse index:  1,680 blocks (~6.5 MB)
  Ratio: 20:1 size difference!
```

The sparse index is 20x smaller but has the SAME search cost. This is why clustering/primary-key indexes use sparse organization. A table can have only ONE clustered index (because data can only be physically sorted one way), which makes the choice of clustered index critical: it alone gets the efficiency of sparse indexing. All other indexes must be dense, paying the 20x size overhead. Choose your clustering key wisely.
Understanding how search works in a multi-level index solidifies the conceptual model and prepares you for the tree structures (B-trees, B+-trees) that implement this concept. The algorithm is elegant in its simplicity.
The search algorithm:
```python
def multi_level_search(target_key, root_block):
    """
    Search for a key in a multi-level index.

    At each level:
      1. Read the current block into memory
      2. Find the largest key K where K <= target_key
      3. Follow the pointer associated with K to the next level
      4. Repeat until we reach the leaf level
      5. At the leaf level, follow the pointer to the data record

    Time Complexity: O(L) block reads, where L = number of levels
    Space Complexity: O(1) additional memory (reuse one buffer per block)
    """
    current_block = root_block

    # Traverse from root to leaf level
    while current_block.level > 0:  # 0 = leaf level
        # Read block from disk (1 I/O)
        block_data = disk_read(current_block.block_id)

        # Binary search within the block (in-memory, fast):
        # find the largest entry where entry.key <= target_key
        entry = binary_search_le(block_data.entries, target_key)

        if entry is None:
            # Target key is smaller than every key in this block;
            # fall back to the first entry's pointer
            entry = block_data.entries[0]

        # Follow pointer to the next level
        current_block = entry.pointer

    # At leaf level - find the exact entry
    block_data = disk_read(current_block.block_id)  # 1 I/O
    entry = binary_search_exact(block_data.entries, target_key)

    if entry is None:
        return None  # Key not found

    # Follow pointer to the data record
    data_block = disk_read(entry.data_pointer.block_id)  # 1 I/O
    return extract_record(data_block, entry.data_pointer.offset)


def binary_search_le(entries, target_key):
    """
    Binary search within a single block (in memory).
    Find the entry with the largest key <= target_key.

    Time Complexity: O(log f) where f = fanout.
    This is FAST because it runs entirely in RAM.
    """
    low, high = 0, len(entries) - 1
    result = None
    while low <= high:
        mid = (low + high) // 2
        if entries[mid].key <= target_key:
            result = entries[mid]
            low = mid + 1  # look for a larger key still <= target
        else:
            high = mid - 1
    return result
```

Key aspects of the algorithm:
1. One block read per level
At each level, we read exactly one block. The block tells us which block to read at the next level. This is the source of the O(L) I/O complexity.
2. In-memory binary search is free (comparatively)
Within each block, we use binary search to find the right entry. With 292 entries per block, this takes ~8 comparisons—microseconds of CPU time. Compared to the milliseconds (HDD) or hundreds of microseconds (SSD) for the block read, this is negligible.
3. No wasted reads
Unlike single-level binary search (which might read blocks we don't need in later iterations), multi-level search reads only blocks on the direct path from root to leaf. Every I/O contributes to finding the answer.
4. Predictable performance
Every search follows the same pattern and accesses the same number of blocks (L levels + 1 data block). There are no worst cases or lucky cases—performance is consistent.
Single-level: O(log₂ blocks) I/Os. Multi-level: O(log_f records) I/Os. Since f ≈ 300 and log₂(300) ≈ 8, multi-level indexing effectively divides the number of I/Os by 8. For very large datasets, this means 3-4 reads instead of 20-25.
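The two cost formulas can be compared numerically. A short Python sketch (function names are illustrative, not from any database API):

```python
import math

def single_level_ios(records, fanout=292):
    """Block reads for binary search over a dense single-level index."""
    blocks = math.ceil(records / fanout)
    return math.ceil(math.log2(blocks))

def multi_level_ios(records, fanout=292):
    """Block reads to descend a multi-level index (one per level)."""
    return math.ceil(math.log(records) / math.log(fanout))

for r in (10_000_000, 1_000_000_000):
    print(r, single_level_ios(r), multi_level_ios(r))
# 10 million:  16 reads vs 3
# 1 billion:   22 reads vs 4
```

The gap widens with scale because one multi-level descent step replaces roughly log₂(292) ≈ 8 binary-search steps.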
A visual representation helps cement understanding of how multi-level indexes organize data. Let's trace through a concrete example with simplified numbers for clarity.
```
Multi-Level Index Structure (Simplified Example)
================================================

Scenario: Index 1,000 records with fanout f = 10
(Real systems have f ≈ 100-500, but 10 is easier to visualize)

                  LEVEL 3 (Root)
                 ┌─────────────┐
                 │ 1  100  200 │  <- 10 entries, pointing to Level 2 blocks
                 │ 300 400 500 │
                 │ 600 700 800 │  Each entry: (key, pointer to L2 block)
                 │ 900         │
                 └─────────────┘
                        │
     ┌──────────────────┼──────────────────┐
     ▼                  ▼                  ▼
  LEVEL 2            LEVEL 2            LEVEL 2
┌──────────┐      ┌──────────┐      ┌──────────┐
│ 1  10 20 │      │ 300 310  │      │ 900 910  │
│ 30 40 50 │      │ 320 330  │      │ 920 930  │
│ 60 70 80 │      │ 340 350  │      │ 940 950  │
│ 90       │      │ ...      │      │ 960 ...  │
└──────────┘      └──────────┘      └──────────┘
     │                  │                  │
     ▼                  ▼                  ▼
  LEVEL 1            LEVEL 1            LEVEL 1
(10 blocks per branch; 100 blocks in total)
Each block has 10 entries pointing to data records
     │                  │                  │
     ▼                  ▼                  ▼
                    DATA FILE
   (1,000 records distributed across data blocks)

SEARCH EXAMPLE: Find record with key = 347
------------------------------------------
Step 1: Read Level 3 (Root)
  - Scan entries: 1, 100, 200, 300, 400, ...
  - 300 <= 347 < 400, so follow pointer for key 300
  - → Go to Level 2 block for range [300, 400)

Step 2: Read Level 2 block (300 range)
  - Scan entries: 300, 310, 320, 330, 340, 350, ...
  - 340 <= 347 < 350, so follow pointer for key 340
  - → Go to Level 1 block for range [340, 350)

Step 3: Read Level 1 block (340 range)
  - Scan entries: 340, 341, 342, 343, 344, 345, 346, 347, 348, 349
  - Found exact match: 347
  - Follow data pointer to the actual record

Step 4: Read data block
  - Retrieve the actual record with key 347

Total I/O: 4 block reads (3 index + 1 data)

Compare to single-level binary search on 100 index blocks:
log₂(100) ≈ 6.6, so 7 block reads for the index alone,
PLUS 1 data read = 8 total.
```
Multi-level search saves 50% of I/Os even in this small example, and the savings grow with larger datasets. Notice how the multi-level index naturally forms a tree: one root, multiple branches at Level 2, even more at Level 1, and finally the leaves (data pointers). This isn't a coincidence; B-trees and B+-trees are the formalized, optimized versions of this exact structure.
Multi-level indexing isn't just a theoretical improvement—it has profound practical implications for database system design, capacity planning, and query performance. Understanding these implications is essential for database administrators and engineers.
```sql
-- Practical cache utilization for a multi-level index
-- 10 million records, fanout = 292

-- Index structure:
--   Level 3: 1 block (always cached)
--   Level 2: 118 blocks (easily cached)
--   Level 1: 34,247 blocks (selectively cached based on access patterns)

-- Memory allocation strategy:

-- Scenario 1: Minimal cache (< 1 MB allocated to this index)
--   Cached: Level 3 only (4 KB)
--   Search cost: 3 disk reads (L3 cached; L2 + L1 + data from disk)

-- Scenario 2: Moderate cache (1-5 MB allocated)
--   Cached: Level 3 + Level 2 (476 KB)
--   Search cost: 2 disk reads (L3 + L2 cached; L1 + data from disk)

-- Scenario 3: Generous cache (50+ MB allocated for hot data)
--   Cached: Level 3 + Level 2 + ~10% of Level 1 (hot ranges)
--   Search cost: ~1.3 disk reads on average (assuming a 70% hot-data hit rate)

-- Scenario 4: Full index cache (~140 MB allocated)
--   Cached: all index levels
--   Search cost: 1 disk read (data only)

-- KEY INSIGHT:
-- With just 476 KB of cache (Level 2 + Level 3), we cut disk I/O in half.
-- Multi-level indexes have excellent cache efficiency due to their
-- hierarchical access patterns.

SELECT 'Level 3 (Root)' AS level,
       1 AS blocks,
       1 * 4096.0 / 1024 AS size_kb,
       'Always cached - every query touches this' AS cache_strategy
UNION ALL
SELECT 'Level 2',
       118,
       118 * 4096.0 / 1024,
       'Should be cached - only 472 KB'
UNION ALL
SELECT 'Level 1',
       34247,
       34247 * 4096.0 / 1024,  -- ~137,000 KB (~134 MB)
       'Selective caching based on access patterns';
```

| Characteristic | Single-Level Index | Multi-Level Index | Improvement |
|---|---|---|---|
| Search I/O (cold cache) | 16 block reads | 4 block reads | 4x reduction |
| Search I/O (warm cache) | ~10 block reads (50% cached) | 2 block reads (upper levels cached) | 5x reduction |
| Cache efficiency | Random—binary search scatters access | Hierarchical—upper levels always hot | Much better |
| Memory required for good performance | ~134 MB (full index) | ~0.5 MB (upper levels) | 268x reduction |
| Scalability to 1B records | 22 block reads | 4 block reads | 5.5x reduction |
When tuning database cache sizes, prioritize memory for upper index levels. A small cache that holds all Level 2+ blocks outperforms a larger cache randomly holding Level 1 blocks. Most modern databases do this automatically, but understanding the principle helps in troubleshooting and capacity planning.
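The cache scenarios above reduce to simple arithmetic. Here is a hypothetical Python sketch of the trade-off, using the running example's level sizes (all names are illustrative and assume one data-block read per lookup):

```python
# Blocks per index level in the 10-million-record example
LEVEL_BLOCKS = {3: 1, 2: 118, 1: 34_247}
BLOCK_KB = 4

def reads_with_cache(cached_levels):
    """Disk reads per lookup: each uncached index level costs 1 I/O,
    plus 1 I/O for the data block itself."""
    index_reads = sum(1 for lvl in LEVEL_BLOCKS if lvl not in cached_levels)
    return index_reads + 1  # +1 for the data block

for cached in ([], [3], [3, 2], [3, 2, 1]):
    kb = sum(LEVEL_BLOCKS[lvl] for lvl in cached) * BLOCK_KB
    print(f"cache {kb:>7} KB -> {reads_with_cache(cached)} disk reads")
# cache       0 KB -> 4 disk reads
# cache       4 KB -> 3 disk reads
# cache     476 KB -> 2 disk reads
# cache  137464 KB -> 1 disk read
```

The first 476 KB of cache halves the I/O cost; the remaining ~134 MB buys only one more read. This steep diminishing return is why upper levels deserve cache priority.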
The multi-level index concept represents a fundamental breakthrough in database access efficiency. By building indexes on indexes, we transform the scaling behavior from logarithm-base-2 to logarithm-base-fanout—a change that reduces I/O operations by factors of 4-6x at scale.
What's next:
The multi-level concept naturally leads to tree-based data structures. The next page explores how this hierarchical organization is formalized into tree structures—specifically the balanced search trees that power every modern database index. We'll see how B-trees maintain balance, handle updates, and optimize for the block-based access patterns of disk storage.
You now understand the multi-level index concept: hierarchical organization, fanout-based efficiency, sparse upper levels, and the resulting search algorithm. This conceptual foundation prepares you for understanding B-trees and B+-trees—the practical implementations of multi-level indexing used in every major database system.