If you open any production database system—Oracle, PostgreSQL, MySQL, SQL Server, or even modern systems like CockroachDB and TiDB—you'll find that the B+-tree is the default, and often the only, tree-based index structure offered. This isn't a historical accident or vendor laziness. The B+-tree represents decades of refinement in understanding what database workloads actually need.
The B+-tree evolved from the B-tree with a singular focus: optimizing for the realities of disk-based storage and the query patterns databases actually encounter. While B-trees were a theoretical breakthrough, B+-trees are an engineering triumph—the structure that powers virtually every relational database on Earth.
By the end of this page, you will understand the formal definition of B+-trees, including their mathematical properties, structural invariants, and the fundamental design decisions that make them the dominant index structure in database systems. You'll see exactly how and why B+-trees differ from B-trees at a definitional level.
The B-tree, invented by Rudolf Bayer and Edward McCreight at Boeing Research Labs in 1970, revolutionized how we store and retrieve data on disk. Its balanced structure and high fanout made it dramatically more efficient than binary search trees for disk-based operations.
However, as database systems matured and workloads grew more complex, engineers identified several limitations of the original B-tree design:
Problem 1: Range Query Inefficiency
In a standard B-tree, data records (or pointers to them) can appear at any level of the tree. To perform a range query (e.g., "find all employees with salaries between $50,000 and $75,000"), you might need to traverse back and forth through the tree structure, visiting internal nodes repeatedly.
Problem 2: Variable Node Sizes
When data records are stored in internal nodes, those nodes become larger—sometimes much larger if records are variable-length. This reduces fanout and increases tree height, exactly the opposite of what we want for disk efficiency.
Problem 3: Sequential Access Patterns
Databases frequently need to scan through ordered data sequentially. B-trees make this unnecessarily complex because the "next" record might be in a sibling subtree requiring navigation back up through parent nodes.
The B+-tree's key insight was to separate navigation from storage. Internal nodes exist solely to guide searches—they hold keys but no data. All actual data resides in leaf nodes, which are linked together for efficient sequential access. This separation enables optimizations impossible in standard B-trees.
A B+-tree of order m is a balanced tree structure that satisfies the following properties. Understanding these properties precisely is essential for reasoning about B+-tree behavior, implementing the algorithms correctly, and analyzing their performance.
Definition: A B+-tree of order m is a rooted tree satisfying the following conditions:

1. Every node has at most m children.
2. Every internal node other than the root has at least ⌈m/2⌉ children; an internal node with k children contains exactly k-1 keys.
3. The root has at least two children, unless it is itself a leaf.
4. Every leaf holds between ⌈(m-1)/2⌉ and m-1 keys (except when the root is the only node).
5. All leaves appear at the same depth, and each leaf is linked to its successor to support sequential scans.
6. All data entries (records or pointers to records) are stored in the leaves; internal nodes hold only keys and child pointers.
Different textbooks define 'order' differently! Some define order as the maximum number of children (our definition), others as the maximum number of keys, and still others as the minimum degree. Always verify which convention is being used. In this course, order m means a node has at most m children and m-1 keys.
The formal constraints of a B+-tree lead to precise mathematical bounds that are crucial for performance analysis. These bounds determine the tree's height, which directly controls I/O costs.
| Node Type | Minimum Keys | Maximum Keys | Minimum Children | Maximum Children |
|---|---|---|---|---|
| Root (single node tree) | 1 | m-1 | — | — |
| Root (multi-node tree) | 1 | m-1 | 2 | m |
| Internal Node | ⌈m/2⌉ - 1 | m-1 | ⌈m/2⌉ | m |
| Leaf Node | ⌈(m-1)/2⌉ | m-1 | — | — |
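To make the table concrete, here is a small sketch in Python that computes these occupancy bounds for a given order m; the function name and return structure are illustrative, not taken from any particular database system.

```python
import math

def bplus_tree_bounds(m: int) -> dict:
    """Occupancy bounds for a B+-tree of order m (m = max children per node)."""
    half = math.ceil(m / 2)
    return {
        "internal": {
            "min_keys": half - 1, "max_keys": m - 1,
            "min_children": half, "max_children": m,
        },
        "leaf": {
            "min_keys": math.ceil((m - 1) / 2), "max_keys": m - 1,
        },
        "root_multi_node": {
            "min_keys": 1, "max_keys": m - 1,
            "min_children": 2, "max_children": m,
        },
    }

# Example: order 200 => internal nodes hold 99-199 keys and 100-200 children.
print(bplus_tree_bounds(200)["internal"])
```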
Height Bounds:
For a B+-tree of order m containing n key-value pairs:
Maximum Height: The tree achieves maximum height when every node has the minimum number of entries:
h_max ≤ ⌈log_{⌈m/2⌉}((n+1)/2)⌉ + 1
Minimum Height: The tree achieves minimum height when every node is completely full:
h_min = ⌈log_m(n+1)⌉
Example Calculation:
Consider a B+-tree of order 200 storing 1 million records:
- Minimum height: h_min = ⌈log_200(1,000,001)⌉ = ⌈2.61⌉ = 3 levels
- Maximum height: h_max ≤ ⌈log_100((1,000,001)/2)⌉ + 1 = ⌈2.85⌉ + 1 = 4 levels
This means any record in a million-record database can be found with at most 4 disk reads. This logarithmic growth is what makes B+-trees practical for databases of any size.
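The bounds are easy to verify numerically. The sketch below (Python; the function name is illustrative) evaluates both formulas for order 200 and n = 1,000,000:

```python
import math

def bplus_height_bounds(m: int, n: int) -> tuple[int, int]:
    """Return (h_min, h_max) for a B+-tree of order m holding n entries."""
    h_min = math.ceil(math.log(n + 1, m))                            # every node full
    h_max = math.ceil(math.log((n + 1) / 2, math.ceil(m / 2))) + 1   # every node minimally full
    return h_min, h_max

print(bplus_height_bounds(200, 1_000_000))  # (3, 4): at most 4 levels, i.e. 4 page reads
```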
Each level of tree height adds one disk I/O to every search. A tree of height 4 means 4 disk reads; height 10 means 10 disk reads. Since a random disk I/O typically takes 5-10ms (tens of thousands of times slower than a memory access), keeping height small is paramount. The B+-tree's high fanout ensures height stays small even for billions of records.
The defining characteristic of B+-trees—the feature that most distinguishes them from B-trees—is the rigid separation between navigation and data storage. This isn't just an implementation detail; it's a fundamental architectural decision with far-reaching implications.
Why This Separation Matters:
1. Consistent Search Path Length
In a B+-tree, every search—whether successful or not—traverses exactly the same number of levels: from root to leaf. This predictability simplifies cost estimation for query optimization.
2. Maximized Internal Node Fanout
Since internal nodes don't store data, they can hold more keys and pointers. If data records average 200 bytes but keys are only 8 bytes (assuming an 8KB page for illustration):
- A node that stored full records could hold roughly 8192 / 200 ≈ 40 entries.
- A node that stores only 8-byte keys plus child pointers (roughly 16 bytes per entry) holds around 500 entries.
This 10× improvement in fanout can reduce tree height by an entire level.
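A rough back-of-the-envelope calculation shows where the improvement comes from; in this Python sketch the 8KB page size and the entry sizes are assumptions chosen for illustration.

```python
PAGE_SIZE = 8 * 1024      # assumed page size in bytes
RECORD_SIZE = 200         # average data record size
KEY_SIZE, POINTER_SIZE = 8, 8

# Fanout if a node had to store full records (data mixed into internal nodes)
fanout_with_records = PAGE_SIZE // RECORD_SIZE             # ~40
# Fanout if a node stores only separator keys and child pointers (B+-tree internal node)
fanout_keys_only = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)  # ~512

print(fanout_with_records, fanout_keys_only)  # 40 512 -> more than a 10x fanout increase
```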
3. Cache-Friendly Navigation
Internal nodes are small enough to remain cached in memory. A database might keep all internal nodes resident while leaf nodes are demand-paged from disk. This means most searches require only one disk read—for the leaf.
4. Simplified Concurrency
Data modifications only affect leaf nodes. This simplifies locking strategies: internal nodes can use lighter-weight latches since they're modified less frequently.
A subtle but crucial aspect of B+-tree definition is how keys function in internal nodes versus leaf nodes. This distinction often causes confusion but is essential for correct understanding.
In Leaf Nodes: Keys represent the actual search key values of the stored data. If you're indexing employee IDs, the leaf contains the actual employee IDs alongside their associated data (or pointers to it).
In Internal Nodes: Keys are separators or routing keys. They need only be sufficient to correctly route searches—they don't necessarily correspond to any actual data record. The separator key between two subtrees can be:
- a copy of a key that actually exists in the leaves (the usual result of a leaf split),
- a key whose record has since been deleted from the leaves, or
- a synthesized value (for example, a shortened or truncated key) that merely partitions the two subtrees correctly.
When a leaf node splits, the middle key is typically copied up to the parent (not moved). This is different from B-tree splits where the middle key moves up. This copy semantics means the same key value can appear in both an internal node (as a separator) and a leaf node (as actual data). This is normal and correct in B+-trees.
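To illustrate the copy-up rule, here is a minimal Python sketch of a leaf split; the function name and the sorted-list representation are illustrative assumptions.

```python
def split_leaf(keys, values):
    """Split a full, sorted leaf; return (left, right, separator).

    The separator is *copied* from the first key of the new right leaf,
    so it still exists at the leaf level as real data -- unlike a B-tree
    split, where the middle key moves up and leaves the node entirely.
    """
    mid = len(keys) // 2
    left = (keys[:mid], values[:mid])
    right = (keys[mid:], values[mid:])
    separator = keys[mid]          # copied up to the parent, kept in the right leaf
    return left, right, separator

left, right, sep = split_leaf([10, 20, 30, 40], ["a", "b", "c", "d"])
print(sep, right[0][0])  # 30 30 -- the same key appears as a separator and as leaf data
```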
Routing Semantics:
Given an internal node with keys [k₁, k₂, k₃] and child pointers [p₀, p₁, p₂, p₃], a search for key K follows:
if K < k₁: follow p₀
else if K < k₂: follow p₁
else if K < k₃: follow p₂
else: follow p₃
Note that the search keys in internal nodes define exclusive upper bounds for all but the rightmost pointer. Different implementations may use ≤ instead of <, but the logic must be consistent throughout the tree.
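The routing rule above can be expressed compactly in code. This Python sketch uses the standard bisect module and assumes the strict-< convention shown in the pseudocode; the function name is hypothetical.

```python
import bisect

def choose_child(keys, children, search_key):
    """Pick the child pointer to follow in an internal node.

    keys     -- sorted separator keys [k1, ..., kn]
    children -- child pointers [p0, ..., pn] (always one more than keys)
    Keys act as exclusive upper bounds: child p_i covers keys in [k_i, k_{i+1}).
    """
    i = bisect.bisect_right(keys, search_key)  # count of separators <= search_key
    return children[i]

# keys [k1, k2, k3] = [10, 20, 30], children [p0 .. p3]
print(choose_child([10, 20, 30], ["p0", "p1", "p2", "p3"], 5))    # p0  (K < k1)
print(choose_child([10, 20, 30], ["p0", "p1", "p2", "p3"], 10))   # p1  (k1 <= K < k2)
print(choose_child([10, 20, 30], ["p0", "p1", "p2", "p3"], 35))   # p3  (K >= k3)
```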
| Aspect | Internal Node Keys | Leaf Node Keys |
|---|---|---|
| Purpose | Route searches to correct subtree | Identify stored data records |
| Correspondence | May or may not match actual data | Always matches actual data key |
| Modification | Can be optimized (prefix compression) | Must preserve exact key value |
| On Deletion | May remain even if data deleted | Removed with data record |
| Duplication | Same key can appear multiple times in internal nodes | Unique within leaf (for unique indexes) |
A defining property of B+-trees is their minimum occupancy guarantee. Unlike simpler structures where nodes can become arbitrarily empty, B+-trees ensure every node (except possibly the root) is at least half full. This guarantee has profound implications for both space efficiency and performance predictability.
Why 50% Matters:
Space Efficiency: With a guaranteed 50% minimum fill, the B+-tree uses at most twice the space of a perfectly packed structure. In practice, random insertions lead to approximately 69% (ln 2) average fill (Yao, 1978), meaning space overhead is only about 45%.
Height Guarantees: Minimum occupancy directly affects maximum height. If nodes could become nearly empty, the tree could grow much taller, destroying the logarithmic performance guarantee.
Performance Predictability: With bounded occupancy, the number of disk reads for any operation is predictable. This enables the query optimizer to make accurate cost estimates.
Example:
For a B+-tree of order 100:
- Internal nodes have between 50 and 100 children (⌈100/2⌉ = 50), i.e. between 49 and 99 keys.
- Leaf nodes hold between 50 and 99 keys (⌈99/2⌉ = 50).
- Even in the worst case, every page in the index is at least half used.
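To see what these fill factors mean in practice, here is a quick estimate of leaf pages needed at different fills; in this Python sketch the leaf capacity of 100 entries and the entry count are illustrative assumptions.

```python
import math

N_ENTRIES = 1_000_000
LEAF_CAPACITY = 100  # assumed maximum entries per leaf page

for fill in (1.00, 0.69, 0.50):  # perfectly packed, typical (Yao 1978), worst-case minimum
    pages = math.ceil(N_ENTRIES / (LEAF_CAPACITY * fill))
    print(f"fill {fill:.0%}: {pages:,} leaf pages")

# fill 100%: 10,000 leaf pages
# fill 69%:  14,493 leaf pages  (~45% overhead vs. packed)
# fill 50%:  20,000 leaf pages  (2x the packed size, the worst case)
```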
While 50% minimum fill wastes some space, attempting higher guarantees would require more frequent rebalancing. The 50% threshold is the sweet spot: it limits wasted space while keeping maintenance operations infrequent. Some systems guarantee a two-thirds minimum fill (the B*-tree variant) for slightly better space utilization, at the cost of redistributing entries among siblings before splitting.
The textbook definition of B+-trees describes an idealized structure. Real database implementations introduce variations to handle practical concerns. Understanding these variations helps you recognize B+-trees in the wild.
| Aspect | Textbook Definition | Practical Implementation |
|---|---|---|
| Order | Fixed parameter m | Derived from page size and key/pointer sizes |
| Node size | Abstract 'slots' | Fixed-size pages (4KB, 8KB, 16KB typical) |
| Fill factor | ≥50% always enforced | Configurable, may allow lower during bulk operations |
| Leaf linking | Doubly-linked list | May include parent pointers for crash recovery |
| Key format | Atomic values | May use prefix compression, suffix truncation |
| Concurrency | Not addressed | Elaborate locking/latching protocols |
| Persistence | Assumed | Write-ahead logging, checkpointing |
Variable-Width Keys:
The formal definition assumes fixed-size keys, but real databases support variable-length strings. This complicates the notion of 'order' since the number of keys per node varies. Implementations typically define order based on a target page size rather than a fixed key count.
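For example, a rough estimate of the effective "order" for a given page layout might look like the following Python sketch; the 16KB page, 8-byte pointers, key sizes, and the flat 100-byte header are illustrative assumptions (real systems also account for per-entry slot overhead).

```python
def effective_order(page_size: int, key_size: int, pointer_size: int,
                    header_size: int = 100) -> int:
    """Approximate maximum number of children of an internal node."""
    usable = page_size - header_size
    return usable // (key_size + pointer_size) + 1   # one more pointer than keys

print(effective_order(16 * 1024, 8, 8))    # ~1018 children with fixed 8-byte keys
print(effective_order(16 * 1024, 64, 8))   # ~227 children with 64-byte string keys
```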
Overflow Pages:
When values are very large (e.g., TEXT or BLOB columns), they may not fit in the leaf page. Implementations use overflow pages linked from the main leaf entry. The key remains in the leaf; only the value spills over.
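One way to picture this is the following hypothetical Python sketch, which is not any specific engine's page format: the leaf keeps the key and either an inline value or a reference to an overflow page.

```python
from dataclasses import dataclass
from typing import Optional

INLINE_LIMIT = 256  # assumed threshold for storing a value directly in the leaf

@dataclass
class LeafEntry:
    key: bytes
    inline_value: Optional[bytes]      # small values live in the leaf page
    overflow_page_id: Optional[int]    # large values spill to overflow pages

def make_leaf_entry(key: bytes, value: bytes, allocate_overflow) -> LeafEntry:
    if len(value) <= INLINE_LIMIT:
        return LeafEntry(key, value, None)
    # The key stays in the leaf; only the value is written to overflow storage.
    return LeafEntry(key, None, allocate_overflow(value))
```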
Different Leaf Structures:
Some implementations store only keys in leaves with separate pointers to data pages (secondary indexes). Others store the entire row in the leaf (clustered indexes). Both are valid B+-trees; only the "value" part of key-value pairs differs.
PostgreSQL, MySQL/InnoDB, Oracle, and SQL Server all implement B+-trees, but with different page sizes, compression schemes, and concurrency protocols. The core B+-tree properties remain the same—all data in leaves, linked leaves, balanced height—but implementation details vary significantly.
Let's consolidate the essential elements of the B+-tree definition:
- All data entries live in leaf nodes; internal nodes hold only separator keys and child pointers.
- Leaves are linked together, enabling efficient range scans and sequential access.
- Every node except the root is at least half full, bounding both space overhead and tree height.
- All leaves sit at the same depth, so every search visits the same number of nodes: one per level.
- High internal-node fanout keeps height logarithmic in the number of records, typically 3-4 levels even for very large tables.
What's Next:
Now that we've established the formal definition of B+-trees, the next page explores the most distinctive feature in depth: all data resides in leaf nodes. We'll examine why this design decision is so powerful and how it enables optimizations impossible in standard B-trees.
You now understand the formal definition of B+-trees, including their structural properties, mathematical bounds, and the key design decisions that distinguish them from B-trees. Next, we'll explore the implications of storing all data exclusively in leaf nodes.