Imagine you need to create an index on a table with 100 million rows. Using repeated single-key insertions, each requiring O(log N) I/Os, would take an astronomical number of disk operations. Add the overhead of repeated splits, and index creation becomes a multi-hour ordeal.
Bulk loading is the solution. Rather than inserting keys one at a time, bulk loading constructs the entire B+-tree structure in a single optimized pass. The result is the same balanced tree, but built in a fraction of the time with far fewer I/Os. Understanding bulk loading is essential for database administrators managing large tables and for engineers building custom storage systems.
By the end of this page, you will master B+-tree bulk loading including: the performance problems with naive insertion, the sorted input requirement, bottom-up construction algorithms, fill factor considerations, I/O optimization techniques, and practical applications in database systems.
To appreciate bulk loading, we must first understand why repeated insertions are inefficient for large datasets.
Naive Approach: Insert Keys One by One
For N keys, each insertion performs a root-to-leaf traversal costing O(log N) page reads, plus at least one leaf write and the occasional split, giving O(N log N) total I/Os.
Concrete Example: 100 Million Keys
Assuming tree height = 5 and 10ms per random I/O: 100 million insertions × 5 page reads each = 500 million read I/Os, plus roughly 150 million page writes. At 10ms per random I/O, that is over two months of pure disk time.
This is clearly impractical.
| Records | Tree Height | Read I/Os | Write I/Os | Time @ 10ms/IO |
|---|---|---|---|---|
| 10,000 | 3 | 30,000 | ~15,000 | 7.5 minutes |
| 1,000,000 | 4 | 4,000,000 | ~1,500,000 | 15 hours |
| 100,000,000 | 5 | 500,000,000 | ~150,000,000 | ~75 days |
| 10,000,000,000 | 6 | 60,000,000,000 | ~15,000,000,000 | ~24 years |
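The rough arithmetic behind this table can be reproduced in a few lines. This is a sketch under the table's own assumptions (one page read per tree level per insert, roughly 1.5 page writes per insert, 10 ms per random I/O); real costs vary with caching and split frequency.

```python
def naive_insert_cost(n_records, height, io_ms=10.0):
    """Estimate the I/O bill for building an index by repeated insertion.

    Assumptions (illustrative, matching the table above):
    - each insert reads one page per tree level (height reads)
    - leaf updates and splits average ~1.5 page writes per insert
    - every I/O is a random access costing io_ms milliseconds
    """
    reads = n_records * height
    writes = int(n_records * 1.5)
    seconds = (reads + writes) * io_ms / 1000.0
    return reads, writes, seconds

for n, h in [(10_000, 3), (1_000_000, 4), (100_000_000, 5)]:
    reads, writes, secs = naive_insert_cost(n, h)
    print(f"{n:>11,} records: {reads:,} reads, {writes:,} writes, {secs / 3600:.1f} hours")
```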
Additional Problems:
Random I/O Pattern: Insertions touch different nodes randomly, defeating read-ahead and caching
Excessive Splits: If keys are inserted in order, the rightmost leaf splits repeatedly causing 50% node utilization
Cache Thrashing: The tree grows during construction, leading to cache misses as new nodes are accessed
Logging Overhead: Each modification is individually logged for crash recovery
Inserting keys in sorted order (e.g., auto-incrementing IDs) is particularly bad. Every insert goes to the rightmost leaf, which splits when full, resulting in leaves that are only 50% utilized. The tree is twice as large as necessary, and every split touches the rightmost path from root to leaf.
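The 50% utilization figure is easy to verify with a toy simulation. The sketch below uses a hypothetical leaf model (not any particular engine's): it appends sorted keys and applies the classic half-split whenever the rightmost leaf overflows.

```python
def sorted_insert_utilization(n_keys, leaf_capacity):
    """Simulate leaf occupancy when keys arrive in strictly ascending order.

    Every insert targets the rightmost leaf; on overflow, a classic split
    moves the upper half of the keys into a new rightmost leaf. Every leaf
    left behind is therefore frozen at half capacity.
    """
    leaves = [[]]
    for key in range(n_keys):
        leaf = leaves[-1]
        if len(leaf) == leaf_capacity:
            mid = leaf_capacity // 2
            leaves.append(leaf[mid:])  # upper half becomes the new rightmost leaf
            del leaf[mid:]
            leaf = leaves[-1]
        leaf.append(key)
    used = sum(len(l) for l in leaves)
    return used / (len(leaves) * leaf_capacity)

print(f"{sorted_insert_utilization(100_000, 100):.1%}")  # ≈ 50% average occupancy
```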
Bulk loading constructs a B+-tree from a sorted sequence of keys using a fundamentally different approach:
Key Insight: If the input is sorted, we know exactly where every key will end up relative to others. We can build the tree bottom-up rather than top-down, filling leaves sequentially and constructing internal nodes afterward.
High-Level Process:
1. Sort the input keys (externally if they do not fit in memory).
2. Fill leaf pages sequentially from the sorted stream, linking sibling pointers as you go.
3. Build internal nodes level by level above the leaves until a single root remains.
This approach is sequential rather than random—pages are written in order, maximizing disk throughput.
External sorting of 100 million records requires O(N log N) work, but with sequential access patterns. This is vastly faster than the 500 million random I/Os of naive insertion. For very large datasets, sorting might take hours, but it saves days of random I/O during tree construction.
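As a sketch of the idea (an in-memory stand-in, not a real disk-based implementation): split the input into runs that would each fit in memory, sort each run, then k-way merge the runs. The merge pass reads each run strictly sequentially.

```python
import heapq

def external_sort(records, run_size):
    """Simplified external merge sort.

    In a real system each sorted run would be written to disk and the
    final heapq.merge pass would stream the runs back sequentially;
    here the runs simply live in memory to keep the sketch short.
    """
    runs = [sorted(records[i:i + run_size])
            for i in range(0, len(records), run_size)]
    return list(heapq.merge(*runs))

print(external_sort([42, 7, 19, 3, 88, 51, 23, 11], run_size=3))
# → [3, 7, 11, 19, 23, 42, 51, 88]
```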
The bottom-up bulk loading algorithm constructs the tree level by level, starting from leaves:
```
FUNCTION BulkLoadBPlusTree(sortedKeys[], fillFactor)
    // fillFactor: 0.0 to 1.0 (typically 0.7 to 1.0)
    // Determines how full each page should be
    keysPerLeaf ← FLOOR(maxLeafKeys × fillFactor)

    // ================================================
    // PHASE 1: BUILD LEAF LEVEL
    // ================================================
    leafPages ← []
    currentLeaf ← AllocateNewLeafPage()

    FOR EACH (key, recordPointer) IN sortedKeys DO
        IF currentLeaf.keyCount = keysPerLeaf THEN
            // Current leaf is full, finalize and start new one
            leafPages.append(currentLeaf)
            newLeaf ← AllocateNewLeafPage()
            currentLeaf.nextSibling ← newLeaf
            newLeaf.prevSibling ← currentLeaf
            currentLeaf ← newLeaf
        END IF

        // Add key to current leaf
        currentLeaf.keys[currentLeaf.keyCount] ← key
        currentLeaf.recordPointers[currentLeaf.keyCount] ← recordPointer
        currentLeaf.keyCount ← currentLeaf.keyCount + 1
    END FOR

    // Don't forget the last leaf
    IF currentLeaf.keyCount > 0 THEN
        leafPages.append(currentLeaf)
    END IF

    // ================================================
    // PHASE 2: BUILD INTERNAL LEVELS BOTTOM-UP
    // ================================================
    currentLevel ← leafPages

    WHILE LENGTH(currentLevel) > 1 DO
        currentLevel ← BuildParentLevel(currentLevel, fillFactor)
    END WHILE

    // currentLevel[0] is the root
    root ← currentLevel[0]
    RETURN root
END FUNCTION

FUNCTION BuildParentLevel(childPages[], fillFactor)
    // Build internal nodes for a level of child pages
    // Each internal node points to multiple children
    childrenPerNode ← FLOOR(maxInternalChildren × fillFactor)

    parentPages ← []
    currentParent ← AllocateNewInternalPage()

    FOR i ← 0 TO LENGTH(childPages) - 1 DO
        IF currentParent.childCount = childrenPerNode THEN
            // Current parent is full
            parentPages.append(currentParent)
            currentParent ← AllocateNewInternalPage()
        END IF

        // Add child pointer
        currentParent.children[currentParent.childCount] ← childPages[i]
        currentParent.childCount ← currentParent.childCount + 1

        // Add separator key (except for first child in node)
        IF currentParent.childCount > 1 THEN
            // Separator is the first key in the child (for leaves)
            // or pulled from child for internal nodes
            currentParent.keys[currentParent.keyCount] ← GetSeparatorKey(childPages[i])
            currentParent.keyCount ← currentParent.keyCount + 1
        END IF
    END FOR

    // Don't forget the last parent
    IF currentParent.childCount > 0 THEN
        parentPages.append(currentParent)
    END IF

    RETURN parentPages
END FUNCTION

FUNCTION GetSeparatorKey(childPage)
    IF childPage.isLeaf THEN
        RETURN childPage.keys[0]  // First key in leaf
    ELSE
        // For internal nodes, we need the smallest key in subtree
        RETURN GetSeparatorKey(childPage.children[0])
    END IF
END FUNCTION
```

During bulk loading, separator keys in internal nodes are derived from the first key of each child page (for leaves) or recursively (for internal children). This is subtly different from insertion-based construction, where separators come from splits. The result is functionally equivalent.
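For readers who prefer running code, here is a compact in-memory rendition of the same algorithm (a Python stand-in for the pseudocode; names like `Node` and `bulk_load` are illustrative, not any engine's API). With 10 keys, 3 keys per leaf, and 4 children per internal node, it reproduces the worked example traced below.

```python
from dataclasses import dataclass, field
from math import floor

@dataclass
class Node:
    is_leaf: bool
    keys: list = field(default_factory=list)
    children: list = field(default_factory=list)  # internal: child nodes; leaf: record pointers
    next: "Node | None" = None                    # leaf sibling link

def leftmost_key(node):
    # Smallest key in the subtree: the separator source during bulk load.
    while not node.is_leaf:
        node = node.children[0]
    return node.keys[0]

def bulk_load(sorted_pairs, max_leaf_keys, max_children, fill_factor=1.0):
    per_leaf = max(1, floor(max_leaf_keys * fill_factor))
    # Phase 1: fill leaves sequentially, chaining sibling pointers.
    leaves, leaf = [], Node(is_leaf=True)
    for key, ptr in sorted_pairs:
        if len(leaf.keys) == per_leaf:
            new_leaf = Node(is_leaf=True)
            leaf.next = new_leaf
            leaves.append(leaf)
            leaf = new_leaf
        leaf.keys.append(key)
        leaf.children.append(ptr)
    if leaf.keys:
        leaves.append(leaf)
    # Phase 2: build internal levels bottom-up until one root remains.
    per_node = max(2, floor(max_children * fill_factor))
    level = leaves
    while len(level) > 1:
        parents, parent = [], Node(is_leaf=False)
        for child in level:
            if len(parent.children) == per_node:
                parents.append(parent)
                parent = Node(is_leaf=False)
            parent.children.append(child)
            if len(parent.children) > 1:  # separator for every child but the first
                parent.keys.append(leftmost_key(child))
        parents.append(parent)
        level = parents
    return level[0]

root = bulk_load([(k, f"rec{k}") for k in range(5, 51, 5)], max_leaf_keys=3, max_children=4)
print(root.keys)                        # separators: [20, 35, 50]
print([c.keys for c in root.children])  # [[5, 10, 15], [20, 25, 30], [35, 40, 45], [50]]
```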
Let's trace bulk loading with a concrete example.
Setup:
Keys to load (already sorted): 5, 10, 15, 20, 25, 30, 35, 40, 45, 50
Leaf capacity at the chosen fill factor: 3 keys
Internal node capacity: 4 children
Phase 1: Build Leaves
With 3 keys per leaf:
Leaf 1: [5, 10, 15] → FULL, start new leaf
Leaf 2: [20, 25, 30] → FULL, start new leaf
Leaf 3: [35, 40, 45] → FULL, start new leaf
Leaf 4: [50] → Last key, partial leaf
Link leaves: L1 ↔ L2 ↔ L3 ↔ L4
Phase 2: Build Internal Level 1
With 4 children per internal node, we can fit all 4 leaves:
Internal Node 1:
Children: [L1, L2, L3, L4]
Separators: First key of each child after first = [20, 35, 50]
Since we have only one internal node, it becomes the root.
| Phase | Pages Written | Access Pattern |
|---|---|---|
| Build Leaf 1 | 1 | Sequential |
| Build Leaf 2 | 1 | Sequential |
| Build Leaf 3 | 1 | Sequential |
| Build Leaf 4 | 1 | Sequential |
| Build Root | 1 | Sequential |
| Total | 5 pages | 100% sequential |
Bulk loading wrote exactly 5 pages with no random I/O. Naive insertion of these 10 keys would require 10 tree traversals (3 I/Os each = 30 reads) plus 10 writes minimum, and likely several splits adding more I/O. Bulk loading is ~10× more efficient for this small example; the advantage grows dramatically with scale.
The fill factor determines how full each page is during bulk loading. This parameter has significant implications:
High Fill Factor (90-100%): Maximizes key density, so fewer pages, better cache utilization, and possibly a shorter tree. The trade-off is that the first inserts into any full leaf trigger immediate splits.
Low Fill Factor (50-70%): Leaves headroom so subsequent inserts land in existing pages without splitting, at the cost of a larger index and more I/O per scan.
The Right Choice Depends on Workload:
| Workload Type | Recommended Fill | Rationale |
|---|---|---|
| Read-only / Analytics | 95-100% | No future inserts, maximize density |
| Moderate inserts | 70-85% | Room for growth without immediate splits |
| Heavy inserts | 50-70% | Minimize splits at cost of space |
| Sequential key inserts | 100% then rebuild | Splits are inevitable; periodic rebuild is better |
| Unknown / Mixed | 75-80% | Balanced default choice |
Impact on Tree Size:
For a dataset requiring 1000 leaf pages at 100% fill:
| Fill Factor | Leaf Pages | Height Impact | Space Overhead |
|---|---|---|---|
| 100% | 1000 | Optimal | 0% |
| 90% | 1112 | Same | 11% |
| 75% | 1334 | Same usually | 33% |
| 50% | 2000 | May add level | 100% |
The tree height is determined by: h = ⌈log_fanout(pages)⌉. A 33% space increase rarely adds a level, but 100% might.
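The height arithmetic is easy to check. The sketch below uses an illustrative internal fanout of 44, chosen so the boundary in the table is visible; real fanouts depend on page size and key size.

```python
from math import ceil, log

def internal_levels(leaf_pages, fanout):
    # h = ceil(log_fanout(pages)): internal levels needed above the leaves
    return ceil(log(leaf_pages) / log(fanout))

# 1000 leaf pages at 100% fill, inflated by each fill factor
for fill in (1.00, 0.90, 0.75, 0.50):
    pages = ceil(1000 / fill)
    print(f"fill {fill:.0%}: {pages} leaf pages, {internal_levels(pages, 44)} internal levels")
```

At fanout 44, the 100%, 90%, and 75% configurations all need two internal levels, while dropping to 50% fill (2000 pages) tips the tree into a third.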
Some implementations use different fill factors at different tree levels. Internal nodes are typically filled to 100% (they rarely need to absorb insertions), while leaves are filled to 70-80% (where insertions actually occur). This balances space efficiency with insert performance.
Let's rigorously analyze the I/O and time complexity of bulk loading:
Notation: N = number of records, B = records per leaf page, P = N/B leaf pages, F = fanout of internal nodes.
Phase 1: Sorting (if needed) — an external merge sort over the N records, with sequential I/O throughout.
Phase 2: Building Leaves — one sequential pass writes P = N/B leaf pages: O(N/B) I/Os.
Phase 3: Building Internal Levels — total internal nodes: P/F + P/F² + ... ≈ P/(F−1). I/O cost: O(P/F) for all internal levels combined.
Total I/O Cost: O(sort) + O(N/B)
This is essentially linear in the data size, compared to O(N log N) for naive insertion.
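Leaving the sort aside, the page-write count is straightforward to estimate. This is a sketch with assumed round numbers (B = 100 records per page, F = 100 fanout); the figures come out somewhat below the table's bulk-loading column, which plausibly also accounts for the sorting passes.

```python
def bulk_load_page_writes(n_records, records_per_page=100, fanout=100):
    """Pages written during tree construction (excluding the sort).

    Leaf pages plus the geometric series of internal levels:
    P + P/F + P/F^2 + ... ≈ P + P/(F - 1).
    """
    leaf_pages = -(-n_records // records_per_page)  # ceiling division
    internal_pages = leaf_pages // (fanout - 1)
    return leaf_pages + internal_pages

for n in (10_000, 1_000_000, 100_000_000):
    print(f"{n:>11,} records → ~{bulk_load_page_writes(n):,} page writes")
```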
| Records (N) | Naive Insertion I/Os | Bulk Loading I/Os | Speedup |
|---|---|---|---|
| 10,000 | 45,000 | 200 | 225× |
| 1,000,000 | 5,500,000 | 12,000 | 458× |
| 100,000,000 | 650,000,000 | 1,100,000 | 590× |
| 10,000,000,000 | 80,000,000,000 | 105,000,000 | 762× |
Beyond fewer I/O operations, bulk loading uses sequential access. Modern storage achieves 100-500 MB/s sequential throughput vs 5-10 MB/s effective random throughput. This compounds the improvement: bulk loading might be 1000× faster in wall-clock time, not just I/O count.
Production database systems employ several advanced techniques to further optimize bulk loading:
Offline bulk loading blocks concurrent access during construction—simpler and faster. Online bulk loading allows reads during construction using techniques like shadow paging or incremental materialization. Most systems offer both options, with offline being significantly faster.
Bulk Loading from Existing Index:
When rebuilding an index (e.g., to reclaim space or change fill factor), the existing index serves as the sorted input source:
This is much faster than DROP INDEX + CREATE INDEX from table scan because the leaves are already sorted.
Bulk loading is used in many real-world scenarios:
Scenario: Loading a new database from CSV files or data warehouse exports.
Approach:
1. Create the target table and temporarily disable write-ahead logging.
2. Load the raw data with a bulk COPY rather than row-by-row INSERTs.
3. Build or rebuild indexes after the data is in place, so they are bulk-loaded from sorted input.
4. Re-enable logging once the load completes.
SQL Example:
-- PostgreSQL
CREATE TABLE orders_new (LIKE orders INCLUDING ALL);
ALTER TABLE orders_new SET UNLOGGED; -- Disable WAL for speed
COPY orders_new FROM '/data/orders.csv' WITH (FORMAT csv);
CLUSTER orders_new USING orders_new_pkey; -- Rebuild clustered (index name as generated by LIKE ... INCLUDING ALL)
ALTER TABLE orders_new SET LOGGED;
Key Point: Disabling logging during bulk load and re-enabling after is a common pattern for massive speedups.
Most databases provide ways to monitor bulk load progress: PostgreSQL's pg_stat_progress_create_index, SQL Server's sys.dm_exec_requests with percent_complete, and MySQL's performance_schema. Watching these helps estimate completion time for long-running operations.
| Method | I/O Complexity | Concurrency | Use Case |
|---|---|---|---|
| Naive insertion | O(N log N) | Full (each insert independent) | Small datasets, online inserts |
| Bottom-up bulk load | O(N) | Blocked during build | Large initial loads |
| Concurrent bulk load | O(N) + overhead | Reads allowed | Production with availability requirements |
| Log-structured merge | O(N) amortized | Full | Write-heavy workloads (LSM-trees) |
Module Complete:
Congratulations! You have now mastered all fundamental B+-tree operations: search, insertion, deletion, and bulk loading.
These operations form the foundation of index behavior in every major database system. Your understanding of their mechanics will inform query optimization, capacity planning, and performance troubleshooting throughout your career.
You now possess deep knowledge of B+-tree operations, from search through bulk loading. This understanding is essential for database administration, query tuning, and storage engine development. The next module explores B+-tree performance analysis and optimization techniques.