Node merging is the symmetric counterpart to node splitting. Where splitting handles overflow during insertion, merging handles underflow during deletion. Together, these operations form the complete balancing mechanism that keeps B-trees efficient regardless of the operation mix.
While we introduced merging conceptually in the deletion algorithm, this page explores the operation in exhaustive detail—examining the mechanics, cascading effects, edge cases, and optimizations that production implementations must handle.
By completing this page, you will understand: (1) The complete merging algorithm with all edge cases, (2) How merging propagates through the tree (cascading merges), (3) Differences between leaf and internal node merges, (4) When to merge vs. when to redistribute, (5) Atomicity requirements and crash recovery for merges, and (6) Advanced optimizations like deferred merging and merge batching.
Merging combines two sibling nodes and their parent's separator into a single node. This operation is triggered when deletion would cause underflow and no sibling has keys to spare.
Pre-conditions for merging: the two nodes are adjacent siblings under the same parent, and each holds the minimum of t-1 keys, so neither can lend a key through redistribution.
Post-conditions after merging: a single node holds every key from both siblings plus the separator; the parent has one fewer key and one fewer child pointer; the right node is freed.
Why 2t-1 keys result: (t-1) keys from the left node + 1 separator from the parent + (t-1) keys from the right node = 2t-1 keys, exactly a full node.
This is exactly the inverse of splitting, where a 2t-1-key node becomes two (t-1)-key nodes plus one promoted separator.
Merging produces a full node (2t-1 keys). If the next operation is an insertion targeting this node, it will immediately split. This 'oscillation' between merge and split is rare in practice but demonstrates the mathematical symmetry of B-tree operations.
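The key-count arithmetic behind this symmetry is easy to check directly. A minimal sketch (helper names are illustrative, not part of any implementation above):

```typescript
// Two minimally filled siblings (t - 1 keys each) plus the separator pulled
// from the parent yield exactly one full node of 2t - 1 keys.
function mergedKeyCount(t: number): number {
  return (t - 1) + (t - 1) + 1; // left + right + separator
}

function maxKeys(t: number): number {
  return 2 * t - 1; // B-tree node capacity
}

// Holds for every valid minimum degree t >= 2: the merged node is exactly
// full, never overflowing.
for (const t of [2, 3, 64, 1024]) {
  console.log(t, mergedKeyCount(t) === maxKeys(t)); // always true
}
```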
Here is the complete, production-quality merge algorithm with all necessary operations:
MERGE-NODES(parent, separatorIndex):
Input: parent - the parent of the nodes being merged
separatorIndex - index of the separator key in parent.keys[]
(also index of left child in parent.children[])
// Step 1: Identify the nodes to merge
leftNode ← parent.children[separatorIndex]
rightNode ← parent.children[separatorIndex + 1]
// Step 2: Pull separator from parent into left node
// (in a B+-tree, skip this step for leaf merges: the separator there is only a copy of a leaf key)
leftNode.keys[leftNode.numKeys] ← parent.keys[separatorIndex]
leftNode.numKeys ← leftNode.numKeys + 1
// Step 3: Copy all keys and values from right node to left node
FOR j ← 0 TO rightNode.numKeys - 1:
leftNode.keys[leftNode.numKeys + j] ← rightNode.keys[j]
IF leftNode.isLeaf:
leftNode.values[leftNode.numKeys + j] ← rightNode.values[j]
// Step 4: For internal nodes, copy child pointers
IF NOT leftNode.isLeaf:
FOR j ← 0 TO rightNode.numKeys: // Note: one more child than keys
leftNode.children[leftNode.numKeys + j] ← rightNode.children[j]
// Step 5: Update left node's key count
leftNode.numKeys ← leftNode.numKeys + rightNode.numKeys
// Step 6: Remove separator and right child pointer from parent
FOR j ← separatorIndex TO parent.numKeys - 2:
parent.keys[j] ← parent.keys[j + 1]
FOR j ← separatorIndex + 1 TO parent.numKeys - 1:
parent.children[j] ← parent.children[j + 1]
parent.numKeys ← parent.numKeys - 1
// Step 7: For B+-tree leaves, update sibling pointers
IF leftNode.isLeaf:
leftNode.nextLeaf ← rightNode.nextLeaf
IF rightNode.nextLeaf ≠ NULL:
rightNode.nextLeaf.prevLeaf ← leftNode
// Step 8: Free the right node and persist changes
FREE-NODE(rightNode)
DISK-WRITE(leftNode)
DISK-WRITE(parent)
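The pseudocode above can be condensed into an in-memory sketch for a plain B-tree, using growable arrays instead of fixed-size pages (disk writes and free-list handling omitted; the `Node` shape is an assumption for illustration):

```typescript
// Minimal in-memory MERGE-NODES for a plain B-tree.
interface Node {
  keys: number[];
  children: Node[]; // empty for leaves
  isLeaf: boolean;
}

function mergeNodes(parent: Node, separatorIndex: number): void {
  const left = parent.children[separatorIndex];
  const right = parent.children[separatorIndex + 1];

  // Pull the separator down from the parent into the left node.
  left.keys.push(parent.keys[separatorIndex]);

  // Absorb the right node's keys (and children, if internal).
  left.keys.push(...right.keys);
  if (!left.isLeaf) {
    left.children.push(...right.children);
  }

  // Remove the separator and the right child pointer from the parent.
  parent.keys.splice(separatorIndex, 1);
  parent.children.splice(separatorIndex + 1, 1);
}

// Example with t = 2: two minimal leaves merge into one full leaf, and the
// one-key root is emptied (the caller would then collapse it).
const leftLeaf: Node = { keys: [10], children: [], isLeaf: true };
const rightLeaf: Node = { keys: [30], children: [], isLeaf: true };
const root: Node = { keys: [20], children: [leftLeaf, rightLeaf], isLeaf: false };
mergeNodes(root, 0);
console.log(leftLeaf.keys);     // [10, 20, 30]
console.log(root.keys.length);  // 0
```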
A merge requires writing 2 nodes (left and parent) and potentially updating a third (next leaf for doubly-linked lists). The right node is freed but may require a write to the free list. In the worst case of cascading merges, we write O(h) nodes where h is tree height.
When a merge consumes a separator from the parent, the parent's key count decreases. If the parent was already at minimum occupancy (t-1 keys), it now underflows (t-2 keys), potentially triggering another merge at the parent level.
Example of Cascading Merges (t=3, min keys=2):
Initial state:
            [50]                    ← Root with 1 key
           /    \
    [20, 30]    [70, 80]            ← Internal nodes at minimum
    /  |  \      /  |  \
[...][...][...] [...][...][...]
                          ↑
          Deleting from here will cascade
Step 1: A delete causes leaf underflow → merge leaves. The merge consumes a separator from [70, 80], which becomes [70]. But [70] now has only 1 key (below the minimum of 2)!
Step 2: Internal node underflow → try to borrow from [20, 30]. But [20, 30] has 2 keys, exactly the minimum, so it cannot lend.
Step 3: Merge internal nodes [20, 30] and [70]
        []                          ← Root now empty!
        |
  [20, 30, 50, 70]                  ← Merged internal node
   /   |   |   |   \
[...][...][...][...][...]
Step 4: Root collapse
  [20, 30, 50, 70]                  ← This is now the root
   /   |   |   |   \
[...][...][...][...][...]
Tree height reduced from 3 to 2!
Cascade Probability Analysis:
Cascading merges require a chain of nodes, one per level, all at minimum occupancy simultaneously. With random deletions each additional level multiplies in another low-probability event, so the chance of a full cascade falls quickly with height. For h=4 (a typical tree height), the full-cascade probability is roughly 6%. Cascades are uncommon but must be handled correctly.
Proactive Rebalancing Prevents Cascades:
Just as proactive splitting prevents cascading splits during insertion, proactive rebalancing during deletion ensures we never need to cascade:
Before descending to child[i]:
IF child[i].numKeys = t-1: // At minimum
IF sibling has spare keys:
Borrow (redistribute)
ELSE:
Merge with sibling now
This guarantees the child we descend into has ≥t keys, so even after deleting one, it still has ≥t-1 (valid).
The maximum cascade depth equals tree height: O(log_t n). This worst case occurs when deleting causes a merge at every level up to the root. While the probability is low, the algorithm must handle it correctly. Proactive rebalancing transforms cascades into a sequence of merge-then-descend operations.
Like splitting, merging differs between leaf and internal nodes, particularly for B+-trees.
Leaf Node Merges:
Internal Node Merges:
```typescript
// B+-Tree Leaf Merge: Separator is NOT included (it was a copy)
function mergeLeafNodes(
  parent: InternalNode,
  separatorIndex: number
): void {
  const left = parent.children[separatorIndex] as LeafNode;
  const right = parent.children[separatorIndex + 1] as LeafNode;

  // Transfer right's keys and values to left
  // Note: separator is dropped, not included
  for (let j = 0; j < right.keys.length; j++) {
    left.keys.push(right.keys[j]);
    left.values.push(right.values[j]);
  }

  // Update sibling pointers
  left.nextLeaf = right.nextLeaf;
  if (right.nextLeaf) {
    right.nextLeaf.prevLeaf = left;
  }

  // Remove separator and right child from parent
  removeFromParent(parent, separatorIndex);
}

// B+-Tree Internal Node Merge: Separator IS included (it was a promotion)
function mergeInternalNodes(
  parent: InternalNode,
  separatorIndex: number
): void {
  const left = parent.children[separatorIndex] as InternalNode;
  const right = parent.children[separatorIndex + 1] as InternalNode;

  // Include the separator key
  left.keys.push(parent.keys[separatorIndex]);

  // Transfer right's keys and children to left
  for (let j = 0; j < right.keys.length; j++) {
    left.keys.push(right.keys[j]);
  }
  for (let j = 0; j < right.children.length; j++) {
    left.children.push(right.children[j]);
  }

  // Remove separator and right child from parent
  removeFromParent(parent, separatorIndex);
}
```

| Aspect | Leaf Merge | Internal Merge |
|---|---|---|
| Separator handling | Discarded (was a copy) | Included (was a promotion) |
| Data transfer | Keys and values | Keys and child pointers |
| Resulting left node | Has rightNode.numKeys more keys | Has 1 + rightNode.numKeys more keys |
| Sibling pointers | Must update linked list | Not applicable (usually) |
| Reason for difference | Separator duplicated leaf key | Separator was unique to parent |
If you include the parent's separator when merging leaves, you'll duplicate a key (since the separator is already a copy of a leaf key). This breaks range scans and wastes space. Always discard the separator for B+-tree leaf merges.
Before merging, we should first attempt redistribution (borrowing). Redistribution keeps the node count stable and often results in better balance.
Decision Algorithm:
FIX-UNDERFLOW(parent, childIndex):
child ← parent.children[childIndex]
IF child.numKeys >= t:
RETURN // No underflow
leftSibling ← childIndex > 0 ? parent.children[childIndex - 1] : NULL
rightSibling ← childIndex < parent.numKeys ? parent.children[childIndex + 1] : NULL
// Preference 1: Borrow from left sibling
IF leftSibling ≠ NULL AND leftSibling.numKeys > t - 1:
REDISTRIBUTE-FROM-LEFT(parent, childIndex)
RETURN
// Preference 2: Borrow from right sibling
IF rightSibling ≠ NULL AND rightSibling.numKeys > t - 1:
REDISTRIBUTE-FROM-RIGHT(parent, childIndex)
RETURN
// Preference 3: Merge with left sibling (if exists)
IF leftSibling ≠ NULL:
MERGE-NODES(parent, childIndex - 1) // Merge into left
RETURN
// Preference 4: Merge with right sibling
MERGE-NODES(parent, childIndex) // Merge right into child
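The preference order in FIX-UNDERFLOW reduces to a small decision function. A minimal sketch using the post-deletion invariant (underflow when fewer than t-1 keys); the function and type names are illustrative:

```typescript
// Which repair action FIX-UNDERFLOW would choose, given only key counts.
// null means the sibling does not exist (child is leftmost/rightmost).
type Fix = 'none' | 'borrow-left' | 'borrow-right' | 'merge-left' | 'merge-right';

function chooseFix(
  childKeys: number,
  leftSiblingKeys: number | null,
  rightSiblingKeys: number | null,
  t: number
): Fix {
  if (childKeys >= t - 1) return 'none'; // no underflow
  // Preference 1 and 2: borrow from a sibling with spare keys (> t - 1).
  if (leftSiblingKeys !== null && leftSiblingKeys > t - 1) return 'borrow-left';
  if (rightSiblingKeys !== null && rightSiblingKeys > t - 1) return 'borrow-right';
  // Preference 3 and 4: merge, preferring the left sibling when it exists.
  return leftSiblingKeys !== null ? 'merge-left' : 'merge-right';
}

// t = 3, so minimum occupancy is 2 keys.
console.log(chooseFix(1, 3, 2, 3));    // 'borrow-left'  -- left has spares
console.log(chooseFix(1, 2, 2, 3));    // 'merge-left'   -- both at minimum
console.log(chooseFix(1, null, 2, 3)); // 'merge-right'  -- no left sibling
```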
Why Prefer Redistribution?
| Aspect | Redistribution | Merge |
|---|---|---|
| Node count | Unchanged | Decreased by 1 |
| Parent modification | Yes (separator change) | Yes (separator and child removal) |
| Sibling modification | Yes (lose/gain keys) | Yes (absorbed into other node) |
| Risk of cascade | None | Parent may underflow |
| Fill factor | Improved (more even) | Increases (node fuller) |
When Merge is Inevitable:
Merge is necessary when both siblings are at minimum occupancy. After merge, the resulting node is full (2t-1 keys). If we then need to delete from it, we can safely do so (resulting in 2t-2 ≥ t-1 for t ≥ 2).
Some implementations always merge in one direction (e.g., right into left) for consistency. Others prefer merging smaller into larger or based on I/O locality. The choice doesn't affect correctness but can impact performance—cached nodes or sequentially adjacent pages may be preferred.
Redistribution moves keys between siblings through the parent, avoiding merge. Let's examine both directions in detail.
Redistribute from Left Sibling:
REDISTRIBUTE-FROM-LEFT(parent, childIndex):
child ← parent.children[childIndex]
leftSibling ← parent.children[childIndex - 1]
// Make room in child: shift everything right
FOR i ← child.numKeys - 1 DOWNTO 0:
child.keys[i + 1] ← child.keys[i]
IF child.isLeaf:
child.values[i + 1] ← child.values[i]
IF NOT child.isLeaf:
FOR i ← child.numKeys DOWNTO 0:
child.children[i + 1] ← child.children[i]
// Separator from parent moves down to child's first position
child.keys[0] ← parent.keys[childIndex - 1]
child.numKeys ← child.numKeys + 1
// Left sibling's last key moves up to parent
parent.keys[childIndex - 1] ← leftSibling.keys[leftSibling.numKeys - 1]
// For internal nodes, also transfer child pointer
IF NOT child.isLeaf:
child.children[0] ← leftSibling.children[leftSibling.numKeys]
ELSE:
child.values[0] ← leftSibling.values[leftSibling.numKeys - 1]
leftSibling.numKeys ← leftSibling.numKeys - 1
DISK-WRITE(child)
DISK-WRITE(leftSibling)
DISK-WRITE(parent)
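For leaf keys alone, the left-borrow rotation above reduces to two moves: the separator rotates down into the child, and the left sibling's largest key rotates up into the parent. A minimal sketch for a plain B-tree (arrays of keys only; names are illustrative):

```typescript
// Borrow one key from the left sibling through the parent.
function redistributeFromLeft(
  parentKeys: number[],
  leftSibling: number[],
  child: number[],
  childIndex: number
): void {
  // Separator moves down to the front of the underfull child.
  child.unshift(parentKeys[childIndex - 1]);
  // Left sibling's last key moves up to replace the separator.
  parentKeys[childIndex - 1] = leftSibling.pop()!;
}

const parentKeys = [20];
const leftSib = [5, 10, 15];
const child = [25];
redistributeFromLeft(parentKeys, leftSib, child, 1);
console.log(parentKeys); // [15]
console.log(leftSib);    // [5, 10]
console.log(child);      // [20, 25]
```

Note that all keys below the new separator (15) are still in the left subtree, and all keys at or above it are in the right subtree, so the search invariant is preserved.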
Redistribute from Right Sibling:
REDISTRIBUTE-FROM-RIGHT(parent, childIndex):
child ← parent.children[childIndex]
rightSibling ← parent.children[childIndex + 1]
// Separator from parent moves down to child's last position
child.keys[child.numKeys] ← parent.keys[childIndex]
child.numKeys ← child.numKeys + 1
// Right sibling's first key moves up to parent
parent.keys[childIndex] ← rightSibling.keys[0]
// For internal nodes, also transfer child pointer
IF NOT child.isLeaf:
child.children[child.numKeys] ← rightSibling.children[0]
ELSE:
child.values[child.numKeys - 1] ← rightSibling.values[0]
// Shift right sibling's contents left
FOR i ← 0 TO rightSibling.numKeys - 2:
rightSibling.keys[i] ← rightSibling.keys[i + 1]
IF rightSibling.isLeaf:
rightSibling.values[i] ← rightSibling.values[i + 1]
IF NOT rightSibling.isLeaf:
FOR i ← 0 TO rightSibling.numKeys - 1:
rightSibling.children[i] ← rightSibling.children[i + 1]
rightSibling.numKeys ← rightSibling.numKeys - 1
DISK-WRITE(child)
DISK-WRITE(rightSibling)
DISK-WRITE(parent)
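The right-borrow case is symmetric: the separator rotates down to the end of the child, and the right sibling's smallest key rotates up. A minimal plain-B-tree sketch over key arrays (names are illustrative):

```typescript
// Borrow one key from the right sibling through the parent.
function redistributeFromRight(
  parentKeys: number[],
  child: number[],
  rightSibling: number[],
  childIndex: number
): void {
  // Separator moves down to the end of the underfull child.
  child.push(parentKeys[childIndex]);
  // Right sibling's first key moves up to replace the separator.
  parentKeys[childIndex] = rightSibling.shift()!;
}

const pKeys = [20];
const underfull = [10];
const rightSib = [25, 30, 35];
redistributeFromRight(pKeys, underfull, rightSib, 0);
console.log(pKeys);     // [25]
console.log(underfull); // [10, 20]
console.log(rightSib);  // [30, 35]
```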
For B+-tree leaves, the separator in the parent must be updated to reflect the new boundary. Unlike internal node redistribution where the separator 'rotates through,' leaf redistribution replaces the separator with the new first key of the right sibling.
Like splitting, merging modifies multiple nodes atomically. Crashes mid-operation can corrupt the tree. Production systems use similar techniques to ensure merge atomicity.
Possible Crash States:
| Crash Point | State | Recovery Action |
|---|---|---|
| After writing left node | Left has merged content, but parent and right unchanged | Redo the merge; right node data is duplicated but consistent |
| After writing left and parent | Parent updated, right node orphaned | Complete merge by freeing right node |
| After writing all but before freeing right | All data consistent, right node still allocated | Add right to free list during recovery |
WAL-Based Merge:
MERGE-WITH-WAL(parent, separatorIndex):
leftNode ← parent.children[separatorIndex]
rightNode ← parent.children[separatorIndex + 1]
// Step 1: Log the merge intent with full state
logRecord ← CreateLogRecord(
type: MERGE,
parentPageId: parent.pageId,
leftPageId: leftNode.pageId,
rightPageId: rightNode.pageId,
separatorIndex: separatorIndex,
leftKeys: leftNode.keys,
rightKeys: rightNode.keys,
separator: parent.keys[separatorIndex]
)
WRITE-TO-WAL(logRecord)
FLUSH-WAL() // Ensure log is durable
// Step 2: Perform the actual merge
MERGE-NODES(parent, separatorIndex)
// Step 3: Log completion
WRITE-TO-WAL(CreateCommitRecord(logRecord.id))
Recovery Algorithm:
RECOVER-MERGE(log):
FOR each uncommitted MERGE record in log:
parent ← ReadPage(log.parentPageId)
left ← ReadPage(log.leftPageId)
right ← ReadPage(log.rightPageId)
// Determine current state
IF left contains merged content AND parent still has separator:
// Merge was interrupted after left write
// Redo: update parent and free right
UpdateParent(parent, log.separatorIndex)
FreeNode(right)
ELSE IF parent doesn't have separator:
// Parent was updated but right may not be freed
IF right still allocated:
FreeNode(right)
ELSE:
// Merge didn't start or completed; log can be discarded
CONTINUE
WriteCommitRecord(log.id)
Alternative: Undo Logging
Instead of logging the merge intent, log the information needed to undo it: the pre-merge contents of the left node, the right node, and the parent's separator. During recovery, uncommitted undo records trigger rollback to the pre-merge state.
The freed right node must be properly added to the storage allocator's free list. If the free list update is lost, the page becomes a 'leak'—allocated but unreachable. Include free list updates in the write-ahead log for full crash safety.
Production systems employ various optimizations to reduce merge overhead and improve overall performance.
1. Deferred Merging:
Instead of merging immediately upon underflow, allow nodes to remain underfull:
DEFERRED-MERGE-POLICY(node):
// Allow underflow down to 25% capacity before forcing merge
minBeforeMerge ← MAX(1, t / 2 - 1)
IF node.numKeys >= minBeforeMerge:
RETURN // Don't merge yet
// Only merge when severely underutilized
MERGE-WITH-SIBLING(node)
Benefits: fewer structural modifications and less I/O under delete-heavy workloads, and the merge/split oscillation described earlier is avoided under mixed workloads.
Trade-offs: lower average node occupancy (wasted space) and a potentially taller tree, which adds a small cost to every search.
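The deferred-merge policy above reduces to a single occupancy check. A minimal sketch, where the ~25%-of-capacity threshold is illustrative rather than prescriptive:

```typescript
// Defer merging until a node drops below roughly a quarter of its
// capacity (2t - 1 keys), instead of merging at the strict minimum t - 1.
function shouldMerge(numKeys: number, t: number): boolean {
  const capacity = 2 * t - 1;
  const minBeforeMerge = Math.max(1, Math.floor(capacity / 4));
  return numKeys < minBeforeMerge;
}

// With t = 64 (capacity 127), merging is deferred until fewer than 31 keys
// remain, rather than triggering at the strict minimum of t - 1 = 63.
console.log(shouldMerge(40, 64)); // false -- underfull but tolerated
console.log(shouldMerge(20, 64)); // true  -- severely underutilized
```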
2. Merge Batching:
Buffer multiple underflowing nodes and merge them together:
BATCHED-MERGE():
underflowNodes ← CollectUnderflowingNodes()
FOR each pair of adjacent siblings in underflowNodes:
MERGE-NODES(parent, siblingIndex)
FLUSH-ALL-DIRTY-PAGES() // Single I/O flush for batch
3. Preferential Merge Direction:
Merge in the direction that minimizes I/O:
CHOOSE-MERGE-DIRECTION(parent, childIndex):
left ← parent.children[childIndex]
leftSibling ← childIndex > 0 ? parent.children[childIndex - 1] : NULL
rightSibling ← childIndex < parent.numKeys ? parent.children[childIndex + 1] : NULL
// Prefer cached sibling
IF leftSibling ≠ NULL AND IsInBufferPool(leftSibling):
RETURN MERGE_LEFT
IF rightSibling ≠ NULL AND IsInBufferPool(rightSibling):
RETURN MERGE_RIGHT
// Prefer sequential I/O (adjacent page)
IF leftSibling ≠ NULL AND IsAdjacentPage(left, leftSibling):
RETURN MERGE_LEFT
IF rightSibling ≠ NULL AND IsAdjacentPage(left, rightSibling):
RETURN MERGE_RIGHT
// Default: arbitrary choice (left preferred)
RETURN leftSibling ≠ NULL ? MERGE_LEFT : MERGE_RIGHT
4. Balanced Redistribution:
Instead of moving one key at a time, redistribute to equalize fill factors:
BALANCED-REDISTRIBUTE(left, right, separator):
allKeys ← left.keys + [separator] + right.keys
totalKeys ← len(allKeys)
// Divide keys evenly
leftCount ← totalKeys / 2
rightCount ← totalKeys - leftCount - 1 // -1 for new separator
left.keys ← allKeys[0:leftCount]
newSeparator ← allKeys[leftCount]
right.keys ← allKeys[leftCount+1:]
RETURN newSeparator
This maximizes both nodes' capacity, reducing future underflow risk.
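The pseudocode above can be exercised directly over key arrays. A minimal sketch for leaf keys (the function name and return shape are illustrative):

```typescript
// Pool both nodes' keys with the separator, then split the pool evenly
// around a new middle separator.
function balancedRedistribute(
  left: number[],
  right: number[],
  separator: number
): { left: number[]; right: number[]; separator: number } {
  const all = [...left, separator, ...right];
  const leftCount = Math.floor(all.length / 2);
  return {
    left: all.slice(0, leftCount),
    separator: all[leftCount],
    right: all.slice(leftCount + 1), // the rest, after the new separator
  };
}

// A nearly empty left node and a fuller right node end up evenly balanced.
const r = balancedRedistribute([10], [30, 40, 50, 60, 70], 20);
console.log(r.left);      // [10, 20, 30]
console.log(r.separator); // 40
console.log(r.right);     // [50, 60, 70]
```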
5. Lazy Merge (Tombstones):
As discussed in deletion, don't merge at all—just mark slots as deleted:
LAZY-APPROACH():
// Never merge during normal operation
// Periodic background compaction reclaims space
IF deletedPercentage > THRESHOLD:
FULL-TREE-COMPACTION()
Track merge operations in production: frequency, latency, and patterns. High merge rates may indicate: (1) minimum degree t is too large for workload, (2) access patterns cause clustering of deletions, or (3) an opportunity for lazy deletion with batch compaction.
Merge bugs can be subtle, often manifesting as missing data or tree inconsistencies discovered long after the problematic merge.
Common Merge Bug Symptoms:
| Symptom | Likely Cause |
|---|---|
| Data disappearing | Separator not included in internal merge |
| Duplicate data | Separator incorrectly included in leaf merge |
| Orphaned pages | Right node not freed or parent not updated |
| Tree height increasing unexpectedly | Merge not triggering when it should |
| Infinite loops in range scans | Leaf sibling pointers not updated |
| Crash loops during recovery | Merge log record not idempotent |
```typescript
interface MergeVerificationResult {
  valid: boolean;
  errors: string[];
}

function verifyMerge(
  parentBefore: BTreeNode,
  parentAfter: BTreeNode,
  mergedNode: BTreeNode,
  separatorIndex: number,
  wasLeafMerge: boolean,
  t: number
): MergeVerificationResult {
  const errors: string[] = [];

  // Verify parent lost exactly one key
  if (parentAfter.keys.length !== parentBefore.keys.length - 1) {
    errors.push(
      `Parent should have ${parentBefore.keys.length - 1} keys, has ${parentAfter.keys.length}`
    );
  }

  // Verify parent lost exactly one child
  if (parentAfter.children!.length !== parentBefore.children!.length - 1) {
    errors.push(
      `Parent should have ${parentBefore.children!.length - 1} children, has ${parentAfter.children!.length}`
    );
  }

  // Verify merged node has correct key count
  const leftKeys = parentBefore.children![separatorIndex].keys.length;
  const rightKeys = parentBefore.children![separatorIndex + 1].keys.length;
  const separatorContribution = wasLeafMerge ? 0 : 1;
  const expectedKeys = leftKeys + rightKeys + separatorContribution;
  if (mergedNode.keys.length !== expectedKeys) {
    errors.push(
      `Merged node should have ${expectedKeys} keys, has ${mergedNode.keys.length}`
    );
  }

  // Verify merged node is at most full
  if (mergedNode.keys.length > 2 * t - 1) {
    errors.push(
      `Merged node overflow: ${mergedNode.keys.length} keys exceeds max ${2 * t - 1}`
    );
  }

  // Verify ordering in merged node
  for (let i = 1; i < mergedNode.keys.length; i++) {
    if (mergedNode.keys[i] <= mergedNode.keys[i - 1]) {
      errors.push(`Keys not sorted at position ${i}`);
    }
  }

  // For leaves, verify sibling pointer updated
  if (wasLeafMerge && mergedNode.nextLeaf) {
    if (mergedNode.nextLeaf.prevLeaf !== mergedNode) {
      errors.push('Next leaf does not point back to merged node');
    }
  }

  return { valid: errors.length === 0, errors };
}
```

Thoroughly test: (1) Merge causing root collapse, (2) Cascading merges to root, (3) Merge with minimum t (t=2), (4) Merge after concurrent deletions, (5) Recovery from crash at each merge step, (6) Merge of nodes containing duplicate keys (for non-unique indexes).
Node merging completes the picture of B-tree balancing, working with splitting to maintain optimal tree structure through all operations.
Congratulations! You have mastered all core B-tree operations: search, insertion with splitting, and deletion with merging. You now understand how B-trees maintain balance and efficiency through millions of operations, and you're equipped to implement, debug, and optimize B-tree indexes in production database systems.