Loading learning content...
In the realm of database indexing, few concepts are as fundamental yet underappreciated as fanout. Also known as the branching factor, fanout is the single most influential parameter determining how quickly a tree-based index can locate any record in a database of millions—or billions—of rows.
Consider this: a B+-tree with a fanout of 100 can search through 100 million records by examining just 4 nodes. The same data in a binary tree (fanout of 2) would require approximately 27 node accesses. This isn't a minor optimization—it's a fundamental architectural difference that separates enterprise-grade database performance from academic toy implementations.
Understanding fanout transforms how you think about index design, capacity planning, and performance troubleshooting. It's the vocabulary principal engineers use when discussing index optimization at scale.
By the end of this page, you will understand what fanout means precisely, how it's calculated for different index structures, why it has such profound impact on performance, and how database systems optimize for maximum fanout. You'll be able to estimate index heights, predict I/O costs, and make informed decisions about index configuration.
Fanout (also called the branching factor or order) refers to the maximum number of child pointers that a single node in a tree-based index can hold. In database terminology, it represents how many branches emerge from each internal node of the index structure.
Formally, for a B+-tree:
Fanout (F) = Maximum number of child pointers per internal node
This directly corresponds to the number of keys a node can store. An internal node with n keys has n + 1 child pointers (since each key separates two ranges of values). However, in common usage, 'fanout' often refers simply to the number of entries or pointers per node.
Why 'fanout' as a term?
The term visualizes how the tree 'fans out' from root to leaves. Starting from a single root node, each level multiplies the number of nodes by a factor equal to the fanout. A root with fanout 100 leads to up to 100 nodes at level 1, 10,000 at level 2, and 1,000,000 at level 3.
| Tree Type | Typical Fanout | Node Content | Use Case |
|---|---|---|---|
| Binary Search Tree | 2 | 1 key, 2 child pointers | In-memory, educational |
| AVL / Red-Black Tree | 2 | 1 key, 2 pointers + metadata | In-memory balanced trees |
| B-Tree | 50-500 | Multiple keys + child pointers | Disk-based indexes |
| B+-Tree | 100-1000+ | Keys in internal, data in leaves | Database indexes (dominant) |
| B*-Tree | Similar to B+-tree | Higher minimum fill factor | Improved space utilization |
Academic literature sometimes uses 'order' to describe B-trees, but definitions vary (some define order as minimum keys, others as maximum). 'Fanout' is less ambiguous and directly corresponds to the number of child pointers, which is what matters for performance calculations.
Fanout isn't an arbitrary choice—it's determined by the physical relationship between page size, key size, and pointer size. Understanding this calculation is essential for predicting index performance and making configuration decisions.
The Fundamental Equation:
For a B+-tree internal node:
Fanout (F) = ⌊(Page Size - Header Size) / (Key Size + Pointer Size)⌋
Let's break down each component:
Consider a database with 8KB pages, 50-byte headers, 8-byte integer keys, and 6-byte pointers:
Fanout = ⌊(8192 - 50) / (8 + 6)⌋ = ⌊8142 / 14⌋ = 581 children per node
With a fanout of 581, a 3-level B+-tree can address 581³ ≈ 196 million leaf pages!
Real-World Fanout Values:
Database vendors optimize heavily for fanout. Here are typical values:
| Database System | Default Page Size | Typical Fanout (INT keys) |
|---|---|---|
| PostgreSQL | 8 KB | 400-500 |
| MySQL (InnoDB) | 16 KB | 800-1200 |
| Oracle | 8 KB | 400-600 |
| SQL Server | 8 KB | 400-500 |
| SQLite | 4 KB | 200-300 |
Key insight: Larger page sizes directly increase fanout. MySQL's 16KB default pages give it roughly double the fanout of PostgreSQL's 8KB pages, meaning InnoDB B+-trees are typically one level shallower for equivalent data sets.
Fanout's importance stems from a fundamental property of tree structures: the number of nodes grows exponentially with depth, while search cost grows linearly with depth.
This relationship is captured by the tree height formula:
Height (h) = ⌈log_F(N)⌉
Where:
Each increase in fanout logarithmically reduces the tree height, which directly translates to fewer disk I/O operations per query.
| Fanout | Height (levels) | Nodes Accessed per Lookup | Relative I/O Cost |
|---|---|---|---|
| 2 (Binary) | 30 | 30 | 15x |
| 10 | 10 | 10 | 5x |
| 50 | 5 | 5 | 2.5x |
| 100 | 4 | 4 | 2x |
| 500 | 3 | 3 | 1.5x |
| 1000 | 2 | 2 | 1x (baseline) |
A binary search tree (fanout=2) indexing 1 billion records requires 30 levels. Even with all internal nodes cached, leaf access requires 30 disk seeks. At 5ms per seek, that's 150ms per query. A B+-tree with fanout 500 reaches any record in 3 disk reads (~15ms). This is why binary trees are never used for disk-based indexes.
The I/O Connection:
In database systems, each level of the index typically requires one disk I/O operation (unless cached in the buffer pool). Therefore:
I/O operations per lookup ≈ Tree Height = log_F(N)
Since disk I/O is 5-6 orders of magnitude slower than memory access, minimizing tree height is the primary optimization goal for disk-based indexes. Fanout is the lever that achieves this.
Concrete Example:
For a table with 100 million rows:
That one-level difference means 25% fewer disk reads per query, which at high query volumes translates to significant performance and cost savings.
One of the most important yet often overlooked relationships in index design is the inverse relationship between key size and fanout. As keys grow larger, fewer fit in each node, reducing fanout and increasing tree height.
This has profound implications for index key selection and composite index design.
| Key Type | Key Size | Fanout | Height for 100M rows |
|---|---|---|---|
| TINYINT | 1 byte | ~1,163 | 3 |
| INT | 4 bytes | ~814 | 3 |
| BIGINT | 8 bytes | ~581 | 4 |
| UUID (binary) | 16 bytes | ~372 | 4 |
| VARCHAR(50) avg | ~30 bytes | ~226 | 4 |
| VARCHAR(255) avg | ~100 bytes | ~77 | 5 |
| Composite (3 cols) | ~50 bytes | ~145 | 4-5 |
This is why experienced DBAs prefer INTEGER or BIGINT primary keys over UUIDs or long VARCHAR columns for high-volume tables. The 2-4x difference in key size translates directly to lower fanout and deeper trees, increasing I/O costs for every index operation.
Composite Key Fanout:
When creating composite indexes (indexes on multiple columns), the combined key size further reduces fanout:
Composite Key Size = Key₁ Size + Key₂ Size + ... + Keyₙ Size
Fanout = ⌊(Page Size - Header) / (Composite Key Size + Pointer Size)⌋
Example: An index on (last_name VARCHAR(50), first_name VARCHAR(50), birth_date DATE) might have an average composite key size of 30 + 25 + 4 = 59 bytes, yielding fanout of approximately 130 with 8KB pages.
Best Practices for Maximizing Fanout:
Modern database systems implement sophisticated optimizations to maximize effective fanout. Understanding these techniques reveals why production databases achieve better performance than theoretical calculations suggest.
Most databases provide tools to inspect actual index fanout:
• PostgreSQL: SELECT * FROM bt_metap('index_name'); (pageinspect extension)
• MySQL: SHOW INDEX FROM table_name; shows cardinality that helps infer structure
• SQL Server: sys.dm_db_index_physical_stats provides page counts per level
These tools reveal the gap between theoretical maximum fanout and achieved fanout due to fill factor, fragmentation, and key distribution.
123456789101112131415161718192021
-- Enable the pageinspect extension (requires superuser)CREATE EXTENSION IF NOT EXISTS pageinspect; -- Get B-tree metadata including tree height and root pageSELECT * FROM bt_metap('idx_users_email'); -- Sample output interpretation:-- magic: 340322 (B-tree identifier)-- version: 4 (B-tree format version)-- root: 290 (Page number of root node)-- level: 2 (Height: 0=root-only, 2=3 levels)-- fastroot: 290 (Fast-path root for single-page trees)-- fastlevel: 2 -- Examine a specific internal node to see actual fanoutSELECT * FROM bt_page_stats('idx_users_email', 290); -- For each page, bt_page_items shows individual entries-- Count entries to determine actual fanoutSELECT COUNT(*) as entries_in_node FROM bt_page_items('idx_users_email', 290);Maximum fanout isn't always optimal fanout. Database systems use fill factor to intentionally leave space in index nodes, trading fanout for other benefits.
Fill Factor specifies what percentage of each index page should be filled during initial creation or rebuild. A 90% fill factor means 10% of each page is left empty.
Effective Fanout Calculation with Fill Factor:
Effective Fanout = Maximum Fanout × Fill Factor
For an index with maximum fanout of 500 and fill factor of 70%:
This lower effective fanout increases height for large tables but prevents costly page splits during write operations.
When to Adjust Fill Factor:
| Workload Type | Recommended Fill Factor | Rationale |
|---|---|---|
| Read-only / Analytical | 100% | Maximum fanout, no updates occurring |
| Read-heavy OLTP | 90% | Slight buffer for occasional inserts |
| Balanced OLTP | 80% | Trade-off between reads and writes |
| Write-heavy / Logging | 70% | Room for frequent inserts |
| Sequential inserts (timestamps, IDs) | 90-100% | New data goes to rightmost leaf anyway |
| Random inserts (UUIDs) | 50-70% | Inserts distributed across all pages |
Random UUIDs as primary keys create a perfect storm for fanout degradation:
If UUIDs are required, consider UUIDv7 (time-ordered) or application-side techniques to maintain sequential insert patterns.
While the term 'fanout' originates from B-tree family structures, the concept extends to other index types with different interpretations and implications.
| Index Type | Fanout Concept | Typical Values | Performance Impact |
|---|---|---|---|
| B+-Tree | Child pointers per internal node | 100-1000+ | Determines tree height, I/O per lookup |
| B-Tree (classic) | Child pointers + data per node | 50-500 | Lower than B+-tree (data in internal nodes) |
| Hash Index | Bucket count | Variable | Affects collision rate, not depth |
| R-Tree (spatial) | MBR entries per node | 50-200 | Search coverage, overlap handling |
| GiST | Configurable per opclass | Varies | Depends on data type and operator |
| Bitmap Index | Bits per bitmap segment | Often page-aligned | Compression effectiveness |
B+-Tree vs B-Tree Fanout:
B+-trees achieve higher fanout than classic B-trees because:
This is why B+-trees dominate database indexing despite B-trees being conceptually simpler.
Hash Index 'Fanout':
Hash indexes don't have traditional fanout (there's no tree), but bucket count serves a similar role:
Log-Structured Merge Trees (used by RocksDB, LevelDB, Cassandra) have a different relationship with fanout. The 'fanout' or 'size ratio' between levels (typically 10x) determines write amplification and space amplification tradeoffs. A higher size ratio means fewer levels but more write amplification during compaction.
Fanout is arguably the single most important metric for understanding B+-tree index performance. Let's consolidate the critical insights:
What's Next:
With fanout understood, we'll explore height — the tree depth that fanout determines. You'll learn how to calculate expected heights for production indexes, understand the significance of the 'root-to-leaf path,' and see why most production B+-trees are just 3-4 levels deep regardless of table size.
You now understand fanout — the branching factor that fundamentally determines index performance. This knowledge enables you to reason about index efficiency, make informed key design decisions, and predict I/O behavior for tree-based indexes. Next, we examine tree height and its direct relationship to query performance.