Index Terminology - Learning Module

Loading content...

0/241

Height: The Vertical Dimension of Index Performance

Every Level Costs Milliseconds

When a query accesses a B+-tree index, it begins at the root and descends level by level until reaching a leaf node. Each level traversed typically requires one disk I/O operation—and with disk access times measured in milliseconds, every additional level has measurable impact on query latency.

Tree height is the fundamental metric that determines this cost. It represents the number of edges (or equivalently, the number of pages accessed minus one) on the path from root to any leaf.

What makes height particularly elegant is its predictability: for a given fanout and record count, height is precisely calculable. This means you can estimate I/O costs for indexes that don't yet exist, compare theoretical performance across different key designs, and understand exactly why your production B+-trees remain remarkably shallow even as tables grow to billions of rows.

What You Will Learn

By the end of this page, you will understand how tree height is defined and calculated, why production B+-trees rarely exceed 4 levels, how height directly translates to I/O costs, and the relationship between height, fanout, and data volume. You'll be able to predict index performance from first principles.

Defining Tree Height: Precision Matters

Tree height has multiple valid definitions in different contexts. For B+-tree indexes in database systems, the definition must be precise because it directly maps to I/O operations.

Standard Definition:

Height (h) = The number of levels in the tree, counting from root (level 0 or 1) to leaves.

Alternative Counting Conventions:

Height Counting Conventions
Convention	Root Level	Height of Single-Node Tree	Used By
0-indexed levels	Level 0	Height 0	Some academic texts
1-indexed levels	Level 1	Height 1	PostgreSQL, most DBs
Edge counting	N/A	Height 0	Graph theory
Path length	N/A	Path 0	I/O cost analysis

Practical Convention

In database contexts, we typically count height as the number of pages accessed from root to leaf (inclusive), or equivalently, the number of levels in the tree where root is level 1. A tree with root + 1 internal level + leaves has height 3, meaning 3 pages are accessed per lookup.

Key Terminology Distinction:

Height: Top-down measure from root to deepest leaf
Depth: Bottom-up measure from a specific node to the root
Level: The 'horizontal slice' of nodes at the same depth

For a balanced tree (which all B+-trees are by definition), every leaf has the same depth, and the maximum depth equals the tree height.

B+-Tree Height Property:

B+-trees are perfectly balanced — all leaf nodes are at exactly the same level. This is guaranteed by the insertion and deletion algorithms that split and merge nodes rather than growing unevenly. This property is essential for predictable, consistent I/O costs.

The Height Formula: From Records to Levels

The relationship between tree height, fanout, and record count is captured by a simple logarithmic formula that every database professional should know.

The Fundamental Height Formula:

Height (h) = ⌈log_F(N)⌉

Where:

F = Fanout (average entries per internal node)
N = Number of leaf entries (data pointers or records)
⌈ ⌉ = Ceiling function (round up to next integer)

Why Logarithmic?

Each level of the tree multiplies capacity by the fanout factor:

Level 1 (root): 1 node, up to F children
Level 2: F nodes, up to F² children
Level 3: F² nodes, up to F³ children
...
Level h: F^(h-1) nodes containing N leaf pointers

Solving F^(h-1) = N for h gives us h = log_F(N) + 1, which simplifies to the ceiling of log_F(N) when accounting for partial levels.

Base Conversion for Calculation

Since most calculators don't have arbitrary base logarithm, use the change of base formula:

log_F(N) = ln(N) / ln(F) = log₁₀(N) / log₁₀(F)

For example, with F=500 and N=100,000,000: Height = ⌈log₅₀₀(100,000,000)⌉ = ⌈log(100,000,000) / log(500)⌉ = ⌈8 / 2.7⌉ = ⌈2.96⌉ = 3

Height Calculations for Common Scenarios
Records (N)	Fanout = 100	Fanout = 500	Fanout = 1000
1,000	2	2	2
10,000	3	2	2
100,000	3	3	2
1,000,000	4	3	3
10,000,000	4	3	3
100,000,000	5	3	3
1,000,000,000	5	4	4
10,000,000,000	6	4	4

Critical Observation:

Notice how slowly height grows relative to data volume. With a fanout of 500:

1 thousand records → 2 levels
1 billion records → 4 levels

The difference between indexing a thousand records and a billion records is just 2 additional I/O operations per lookup. This logarithmic scaling is why B+-trees remain practical at any scale achievable with current storage technology.

Height and I/O Cost: The Direct Relationship

For disk-based databases, the primary cost of index operations is disk I/O. Tree height directly determines this cost because each level of the tree requires reading one page from storage.

I/O Cost Model:

I/O Operations per Point Lookup = Tree Height (h)

For range queries, additional leaf-to-leaf traversal adds more I/O, but the initial descent cost remains h.

Latency Implications:

Height Impact on Query Latency by Storage Type
Storage Type	Random Read Latency	Height 3	Height 4	Height 5
HDD (7200 RPM)	~10 ms	30 ms	40 ms	50 ms
HDD (15000 RPM)	~5 ms	15 ms	20 ms	25 ms
SATA SSD	~0.1 ms	0.3 ms	0.4 ms	0.5 ms
NVMe SSD	~0.02 ms	0.06 ms	0.08 ms	0.1 ms
Intel Optane	~0.01 ms	0.03 ms	0.04 ms	0.05 ms

The Buffer Pool Changes Everything

These latency figures assume every page requires disk access. In practice, the buffer pool caches frequently-accessed pages in RAM. Root and upper internal nodes are almost always cached (read by virtually every query), meaning actual I/O cost is often just 1-2 reads (the leaf page, possibly its parent). Effective height is often 1-2 for hot indexes, not the full tree height.

Buffer Pool Caching Effect:

Consider a B+-tree with height 4:

Level	Node Count	Cumulative Size	Cache Status (typical)
Root	1	8 KB	Always cached
Level 2	~500	~4 MB	Usually cached
Level 3	~250,000	~2 GB	Partially cached
Leaves	~125 million	~1 TB	Rarely cached

With a 16 GB buffer pool, you'd expect levels 1-2 to be permanently resident, level 3 to have a reasonable hit rate, and leaves to be fetched on-demand. Effective I/O cost: 1-2 reads, not 4.

This is why buffer pool sizing is critical — ensuring upper tree levels remain cached transforms height from a direct I/O multiplier to a cached metadata lookup.

Why B+-Trees Stay Short: The Power of High Fanout

One of the most remarkable properties of B+-trees is how shallow they remain even for enormous data volumes. A tree indexing 10 billion records is typically only 4-5 levels deep. Understanding why requires appreciating the exponential growth pattern of tree capacity.

Capacity Growth Per Level (Fanout = 500)

•Level 1 (Root): 1 node → Can point to 500 children
•Level 2: 500 nodes → Can point to 250,000 children
•Level 3: 250,000 nodes → Can point to 125,000,000 children
•Level 4 (Leaves): 125 million leaf pages → 125+ billion records with 1000 records/leaf

The Mathematics of Shallowness:

With fanout F, the maximum number of leaf entries in a tree of height h is approximately:

Max Leaves ≈ F^h

For F = 500:
- h = 2: 250,000 leaves
- h = 3: 125 million leaves  
- h = 4: 62.5 billion leaves

The capacity multiplies with each added level, but height only increments by one. This means going from 1 million to 1 billion records doesn't require 1000x more I/O—it requires just 1-2 additional page reads.

Practical Upper Bounds:

For current storage capacities and typical configurations:

Height 3: Adequate for up to ~125 million indexed values
Height 4: Adequate for up to ~62 billion indexed values
Height 5: Adequate for essentially unlimited practical data

No production RDBMS workload has needed height > 5 for a single B+-tree index.

Contrast with Binary Trees

A binary search tree (fanout 2) would require height 40 to hold 1 trillion records. A B+-tree with fanout 500 achieves this in height 4. This 10x reduction in tree levels directly translates to 10x reduction in worst-case I/O operations—the fundamental reason databases use B+-trees instead of binary trees.

Height in Different Scenarios: Practical Examples

Let's examine realistic tree heights across different database workloads and configurations to build intuition for what to expect in production environments.

Real-World B+-Tree Height Examples
Scenario	Table Rows	Index Key	Est. Fanout	Expected Height
Small web app users table	100,000	INT id	600	2
E-commerce products	10 million	INT id	600	3
Social media posts	500 million	BIGINT id	450	3
Financial transactions	5 billion	BIGINT id	450	4
IoT sensor events	100 billion	TIMESTAMP + UUID	150	5
Email archive (VARCHAR)	1 billion	VARCHAR(255) email	70	5

Key Observations:

Integer primary keys stay shallow — Even 5 billion rows only requires height 4 with INT/BIGINT keys
Wide keys increase height — The email VARCHAR index reaches height 5 despite fewer rows than the sensor events, due to lower fanout
Composite keys compound the problem — The IoT example with TIMESTAMP + UUID creates wide composite keys, dropping fanout to ~150

Height Calculation Example (E-commerce Products):

Assumptions:
- 8 KB pages
- 50 byte header
- 4 byte INT key  
- 6 byte pointer

Fanout = (8192 - 50) / (4 + 6) = 814

Height for 10 million records:
= ⌈log₈₁₄(10,000,000)⌉
= ⌈7 / 2.9⌉ 
= ⌈2.4⌉
= 3 levels

Clustered vs Non-Clustered Height

Clustered indexes (where leaves contain actual row data) may have different effective fanout than non-clustered indexes (where leaves contain row pointers). Clustered index leaves hold fewer entries if rows are wide, but internal node fanout remains comparable. Non-clustered indexes have consistent leaf fanout since they only store key + pointer.

Measuring Height in Production Databases

While theoretical height calculations are useful for planning, confirming actual index height in production databases helps validate expectations and diagnose performance issues.

Measuring Index Height Across Database Systems
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
-- PostgreSQL: Use pageinspect extension to get B-tree metadata
CREATE EXTENSION IF NOT EXISTS pageinspect;
 
-- Get tree height (called 'level' in PostgreSQL metadata)
-- Height in PostgreSQL terms: 0 = root only, 1 = root + leaves, etc.
SELECT 
    'idx_users_email' as index_name,
    (bt_metap('idx_users_email')).level + 1 as tree_height,
    (bt_metap('idx_users_email')).root as root_page,
    pg_relation_size('idx_users_email') as index_size_bytes
;
 
-- Get page-level statistics
SELECT * FROM bt_page_stats('idx_users_email', 1);
 
-- Count pages at each level (requires scanning metadata pages)
-- This query shows distribution across tree levels
SELECT 
    btpo as tree_level,
    count(*) as pages_at_level,
    sum(live_items) as items_at_level
FROM generate_series(1, 
    (SELECT (bt_metap('idx_users_email')).root)) as pages(n),
    LATERAL bt_page_stats('idx_users_email', pages.n)
GROUP BY btpo
ORDER BY btpo;

Height Discrepancy Investigation

If measured height exceeds calculated height:

Low fill factor — Pages aren't fully utilized, reducing effective fanout
Fragmentation — Page splits created sparse pages
Wide keys — Variable-length keys using more space than estimated
Index bloat — Dead tuples (PostgreSQL) consuming space

Solution: REINDEX or ALTER INDEX REBUILD to restore optimal structure.

Height Changes and Index Maintenance

Tree height is remarkably stable during normal operations, but understanding when and how it changes is important for capacity planning and maintenance scheduling.

When Height Increases

•Root node split — The only operation that increases tree height. When the root becomes full and must split, a new root is created above both halves, adding one level.
•Massive data growth — Height increases are rare; a table must approximately multiply by fanout factor to trigger a height increase. Going from height 3 to height 4 with fanout 500 requires growing from ~125 million to ~62 billion entries.
•Index rebuild with larger keys — If key size increases (e.g., column length altered), fanout decreases and height may increase on rebuild.

When Height Decreases

•Never during normal operations — Standard B+-tree delete algorithms don't reduce height even when the tree shrinks dramatically. Empty levels persist.
•Index rebuild — REINDEX or rebuilding recalculates optimal structure, potentially reducing height if data volume decreased significantly.
•DROP and recreate — Starting fresh with current data volume will produce optimal height.

The Height Ratchet Problem

B+-tree height acts as a ratchet — it goes up but doesn't come down naturally. A table that grew to 1 billion rows (height 4) and was later truncated to 1 million rows still has height 4 until rebuilt. This is rarely a practical problem (2 extra cached reads), but can indicate the need for index maintenance on volatile tables.

Height Monitoring Best Practices:

Baseline height on creation — Record initial height for comparison
Monitor after major loads — ETL jobs or bulk imports may trigger height increase
Include in maintenance reports — Height changes indicate structural evolution
Investigate unexpected increases — May indicate fragmentation or bloat
Plan rebuilds proactively — Don't wait for performance degradation

Summary: Mastering Tree Height

Tree height is the ultimate measure of index efficiency — it directly determines how many pages must be accessed to reach any record. Let's consolidate the critical insights:

Key Takeaways

•Height = I/O operations — Each level requires one page read (unless cached), making height the primary determinant of index lookup cost.
•Height = ⌈log_F(N)⌉ — The logarithmic relationship between fanout (F), record count (N), and height means enormous data growth causes minimal height increase.
•Production trees are shallow — With typical fanouts of 500+, even billion-row tables have indexes just 3-4 levels deep.
•Caching reduces effective height — Root and upper levels are almost always cached, reducing actual I/O to 1-2 operations for hot indexes.
•Height rarely changes — Once established, height only increases on root splits (rare) and doesn't decrease without rebuild.
•Measure, don't assume — Use database-specific tools to verify actual height matches expected height; deviations indicate maintenance needs.

What's Next:

With height understood, we'll explore leaf nodes — the terminal level of the B+-tree where actual data resides (or is pointed to). You'll learn how leaf nodes are structured, why they're linked as a doubly-linked list for range scans, and how leaf organization differs between clustered and non-clustered indexes.

Page Complete

You now understand tree height — the vertical dimension that determines index access cost. This knowledge enables you to calculate expected I/O for any index configuration, understand why B+-trees scale so well, and identify when index maintenance may be beneficial. Next, we examine leaf nodes where data actually lives.