Loading content...
Of all B+-tree operations, deletion is the most intricate. While insertion adds nodes and grows the tree upward, deletion removes keys and may shrink the tree—potentially all the way back to a single root. The complexity arises from maintaining the minimum occupancy invariant: every non-root node must be at least half full.
When deletion causes a node to fall below minimum occupancy, the tree must respond with either redistribution (borrowing from a sibling) or merging (combining with a sibling). These operations can cascade upward through the tree, potentially decreasing tree height. Understanding deletion deeply is essential for predicting update performance, analyzing concurrency behavior, and implementing custom index structures.
By the end of this page, you will master B+-tree deletion including: locating and removing keys, detecting underflow conditions, borrowing keys from siblings (redistribution), combining underflowing nodes (merging), propagating changes to parent nodes, and handling cascading merges that may shrink tree height.
B+-tree deletion follows this high-level process:
Step 1: Find the Key Search for the key to delete. It must be in a leaf node (since all data is stored in leaves).
Step 2: Delete from Leaf Remove the key and its record pointer from the leaf.
Step 3: Check for Underflow If the leaf still meets minimum occupancy requirements, we're done. Otherwise, we must fix the underflow.
Step 4: Handle Underflow Two strategies, in order of preference:
Step 5: Update Parent After redistribution or merging, update the separator key(s) in the parent. Merging removes a separator, which may cause parent underflow.
Step 6: Propagate Upward If parent underflows, apply the same redistribution/merge logic. Continue to root if needed.
For a B+-tree of order n: Leaf nodes must have at least ⌈(n-1)/2⌉ keys (except the root, which may have 1). Internal nodes must have at least ⌈n/2⌉ children (except the root, which may have 2). Deletion must maintain these invariants at all levels.
The simplest case occurs when removing a key leaves the leaf with at least the minimum number of keys. No restructuring is needed.
Process:
If the deleted key was the first key in the leaf and it appears as a separator in an ancestor, the separator may or may not need updating. In many implementations, stale separators are tolerated because they still route correctly (any key that would have matched the old separator will match the new first key).
123456789101112131415161718192021222324252627282930
FUNCTION DeleteFromLeaf(leaf, targetKey) // Find the key's position position ← -1 FOR i ← 0 TO leaf.keyCount - 1 DO IF leaf.keys[i] = targetKey THEN position ← i BREAK END IF END FOR IF position = -1 THEN RETURN KEY_NOT_FOUND END IF // Shift remaining keys left to fill the gap FOR i ← position TO leaf.keyCount - 2 DO leaf.keys[i] ← leaf.keys[i + 1] leaf.recordPointers[i] ← leaf.recordPointers[i + 1] END FOR leaf.keyCount ← leaf.keyCount - 1 MarkDirty(leaf) // Check if underflow occurred IF leaf.keyCount >= minLeafKeys THEN RETURN SUCCESS ELSE RETURN UNDERFLOW // Caller must handle END IFEND FUNCTIONExample: Delete Key = 22 from Leaf [20, 22, 28, 35]
Leaf has minimum occupancy of 2 keys (for order 5):
With 3 keys remaining (≥ minimum of 2), no underflow occurs. Done!
| Operation | I/O Cost | Explanation |
|---|---|---|
| Find target leaf | O(height) | Tree descent via search |
| Delete from leaf | 0 (in-memory) | Modify cached page |
| Write dirty page | 1 | Persist modified leaf |
| Total | O(height) + 1 | Typically 4-6 I/Os |
When deletion causes a leaf to fall below minimum occupancy, we have an underflow. The B+-tree responds with one of two strategies:
Strategy 1: Redistribution (Preferred) Borrow one or more keys from an adjacent sibling that has more than the minimum. This is preferred because it doesn't change the tree structure—only key positions change.
Strategy 2: Merging (Fallback) If both adjacent siblings are at minimum occupancy (can't donate), merge the underflowing node with one sibling. This removes a node from the tree and may cascade upward.
| Left Sibling Keys | Right Sibling Keys | Underflowing Node | Action |
|---|---|---|---|
minimum | any | underflow | Borrow from left sibling |
| = minimum | minimum | underflow | Borrow from right sibling |
minimum | minimum | underflow | Borrow from either (impl. choice) |
| = minimum | = minimum | underflow | Merge with either sibling |
| doesn't exist | minimum | underflow | Borrow from right sibling |
| doesn't exist | = minimum | underflow | Merge with right sibling |
Finding Siblings:
To find siblings, we need the parent node and the position of the underflowing node within the parent:
This is why the deletion algorithm tracks the path from root to leaf—we need parent information for underflow handling.
When both siblings can donate, implementations typically prefer the left sibling for consistency. Some implementations alternate or use heuristics to balance node sizes. The choice doesn't affect correctness—only the specific tree configuration after the operation.
Redistribution transfers keys between sibling nodes to resolve underflow without changing the tree structure. The process differs depending on whether we borrow from the left or right sibling.
Borrowing from Left Sibling (Leaf Nodes):
Borrowing from Right Sibling (Leaf Nodes):
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051
FUNCTION RedistributeFromLeftSibling(parent, leftSibling, node, separatorIndex) // Borrow the largest key from left sibling borrowedKey ← leftSibling.keys[leftSibling.keyCount - 1] borrowedPointer ← leftSibling.recordPointers[leftSibling.keyCount - 1] // Remove from left sibling leftSibling.keyCount ← leftSibling.keyCount - 1 // Insert at beginning of underflowing node // Shift existing keys right FOR i ← node.keyCount DOWNTO 1 DO node.keys[i] ← node.keys[i - 1] node.recordPointers[i] ← node.recordPointers[i - 1] END FOR node.keys[0] ← borrowedKey node.recordPointers[0] ← borrowedPointer node.keyCount ← node.keyCount + 1 // Update parent separator: new boundary is the new first key of node parent.keys[separatorIndex] ← node.keys[0] MarkDirty(leftSibling) MarkDirty(node) MarkDirty(parent)END FUNCTION FUNCTION RedistributeFromRightSibling(parent, node, rightSibling, separatorIndex) // Borrow the smallest key from right sibling borrowedKey ← rightSibling.keys[0] borrowedPointer ← rightSibling.recordPointers[0] // Remove from right sibling (shift remaining keys left) FOR i ← 0 TO rightSibling.keyCount - 2 DO rightSibling.keys[i] ← rightSibling.keys[i + 1] rightSibling.recordPointers[i] ← rightSibling.recordPointers[i + 1] END FOR rightSibling.keyCount ← rightSibling.keyCount - 1 // Insert at end of underflowing node node.keys[node.keyCount] ← borrowedKey node.recordPointers[node.keyCount] ← borrowedPointer node.keyCount ← node.keyCount + 1 // Update parent separator: new boundary is the new first key of right sibling parent.keys[separatorIndex] ← rightSibling.keys[0] MarkDirty(rightSibling) MarkDirty(node) MarkDirty(parent)END FUNCTIONWhen redistributing between internal nodes, the separator key in the parent is rotated through the nodes, not just updated. The borrowed key goes through the parent separator, and the old separator moves to the receiving node. This maintains the correct routing structure.
Let's trace a complete redistribution scenario.
Setup:
Before Deletion:
[25]
/
[12,18,20] [25,30,35]
Step 1: Delete 20 from left leaf
[25]
/
[12,18] [25,30,35]
Left leaf has 2 keys, minimum is 1. No underflow! Done.
Now let's try a case that causes underflow:
Delete key = 18:
[25]
/
[12] [25,30,35]
Left leaf has 1 key, which equals the minimum. No underflow yet.
Delete key = 12:
[25]
/
[] [25,30,35] ← UNDERFLOW! Left leaf is empty
Step 2: Check siblings
Step 3: Borrow from right sibling
After Redistribution:
[30] ← separator updated from 25 to 30
/
[25] [30,35] ← 25 moved to left, right sibling shrinks
Both leaves now have at least 1 key. Tree is balanced!
| Step | Left Leaf | Right Leaf | Parent Separator | Action |
|---|---|---|---|---|
| Initial | [12, 18, 20] | [25, 30, 35] | 25 | — |
| Delete 20 | [12, 18] | [25, 30, 35] | 25 | No underflow |
| Delete 18 | [12] | [25, 30, 35] | 25 | No underflow |
| Delete 12 | [] | [25, 30, 35] | 25 | UNDERFLOW |
| Redistribute | [25] | [30, 35] | 30 | Borrow 25, update separator |
Redistribution modifies exactly three nodes: the underflowing node (receives key), the donating sibling (loses key), and the parent (separator updated). This is efficient—no nodes are created or destroyed, no sibling pointers change.
When all siblings are at minimum occupancy, neither can donate a key. The only option is to merge the underflowing node with one of its siblings.
Merging Process (Leaf Nodes):
12345678910111213141516171819202122232425262728293031323334353637383940414243
FUNCTION MergeLeaves(parent, leftLeaf, rightLeaf, separatorIndex) // Merge: move all keys from rightLeaf into leftLeaf, then delete rightLeaf // This removes the separator key from parent // Step 1: Copy all keys from right to left FOR i ← 0 TO rightLeaf.keyCount - 1 DO leftLeaf.keys[leftLeaf.keyCount + i] ← rightLeaf.keys[i] leftLeaf.recordPointers[leftLeaf.keyCount + i] ← rightLeaf.recordPointers[i] END FOR leftLeaf.keyCount ← leftLeaf.keyCount + rightLeaf.keyCount // Step 2: Update sibling pointers leftLeaf.nextSibling ← rightLeaf.nextSibling IF rightLeaf.nextSibling ≠ NULL THEN rightLeaf.nextSibling.prevSibling ← leftLeaf END IF // Step 3: Remove separator from parent and update child pointers // Remove key at separatorIndex FOR i ← separatorIndex TO parent.keyCount - 2 DO parent.keys[i] ← parent.keys[i + 1] END FOR // Remove child pointer at separatorIndex + 1 (the right leaf) FOR i ← separatorIndex + 1 TO parent.childCount - 2 DO parent.children[i] ← parent.children[i + 1] END FOR parent.keyCount ← parent.keyCount - 1 // Step 4: Deallocate the right leaf DeallocatePage(rightLeaf) MarkDirty(leftLeaf) MarkDirty(parent) // Step 5: Check for parent underflow IF parent.keyCount < minInternalKeys AND parent ≠ root THEN RETURN PARENT_UNDERFLOW ELSE RETURN SUCCESS END IFEND FUNCTIONEvery merge removes one separator key from the parent. If the parent was at minimum occupancy, this causes parent underflow, requiring either redistribution or merging at the parent level. This can cascade all the way to the root.
Internal Node Merging:
Merging internal nodes is similar but includes an additional step: the separator key from the parent is pulled down into the merged node.
Merge Process for Internal Nodes:
1. Pull down the separator key from parent
2. Append all keys and children from right node to left node
3. The pulled-down separator becomes a key in the merged node
4. Remove right node and its parent entry
This is the reverse of internal node splitting, where a key was pushed up to create the separator.
Let's trace a merge that cascades to the parent.
Setup:
Initial Structure:
[30]
/
[15] [45]
/ \ /
[10] [20] [35] [50]
Step 1: Delete 10
[30]
/
[15] [45]
/ \ /
[] [20] [35] [50] ← Leftmost leaf is EMPTY (underflow)
Step 2: Check siblings for redistribution
Step 3: Merge leaves
[30]
/
[] [45] ← Internal node now has 0 keys (underflow!)
/ /
[20] [35] [50] ← Merged leaf contains [20]
Wait—the left internal node now has only 1 child and 0 keys. That's an underflow!
Step 4: Handle internal node underflow
Step 5: Merge internal nodes
[30, 45] ← New root is the merged internal node
/ |
[20] [35] [50]
The root now has only one child, which means we should shrink:
Step 6: Shrink tree (root has single child)
[30, 45]
/ |
[20] [35] [50] ← This internal node becomes the new root
Tree height decreased from 3 to 2!
| Step | Action | Tree Height |
|---|---|---|
| Initial | Balanced tree | 3 |
| Delete 10 | Leaf underflow | 3 |
| Merge leaves | Internal underflow | 3 |
| Merge internals | Root has 1 child | 3 |
| Shrink root | Tree rebalanced | 2 |
Just as height only increases when the root splits, height only decreases when the root's children merge into a single child, making the root redundant. The single child then becomes the new root. This maintains the invariant that all leaves are at the same depth.
Now let's see the complete deletion algorithm:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105
FUNCTION BPlusTreeDelete(tree, targetKey) // ================================================ // PHASE 1: FIND KEY AND DELETE FROM LEAF // ================================================ // Track path for potential restructuring path ← [] // Stack of (node, childIndex) pairs currentNode ← tree.root WHILE NOT currentNode.isLeaf DO childIndex ← FindChildIndex(currentNode, targetKey) path.push((currentNode, childIndex)) currentNode ← currentNode.children[childIndex] END WHILE leaf ← currentNode // Delete from leaf result ← DeleteFromLeaf(leaf, targetKey) IF result = KEY_NOT_FOUND THEN RETURN KEY_NOT_FOUND END IF IF result = SUCCESS THEN RETURN SUCCESS // No underflow, we're done END IF // ================================================ // PHASE 2: HANDLE UNDERFLOW // ================================================ underflowNode ← leaf WHILE path is not empty DO (parent, childIndex) ← path.pop() // Try redistribution first redistributed ← TryRedistribute(parent, underflowNode, childIndex) IF redistributed THEN RETURN SUCCESS END IF // Must merge (mergedNode, parentUnderflow) ← MergeSiblings( parent, underflowNode, childIndex ) IF NOT parentUnderflow THEN RETURN SUCCESS END IF underflowNode ← parent // Continue with parent END WHILE // ================================================ // PHASE 3: HANDLE ROOT UNDERFLOW // ================================================ // If we get here, underflowNode is the root IF tree.root.keyCount = 0 AND tree.root.childCount = 1 THEN // Root has single child - shrink tree oldRoot ← tree.root tree.root ← tree.root.children[0] tree.height ← tree.height - 1 DeallocatePage(oldRoot) END IF RETURN SUCCESSEND FUNCTION FUNCTION TryRedistribute(parent, underflowNode, childIndex) // Try to borrow from left sibling IF childIndex > 0 THEN leftSibling ← parent.children[childIndex - 1] IF leftSibling.keyCount > minKeys THEN IF underflowNode.isLeaf THEN RedistributeFromLeftSibling(parent, leftSibling, underflowNode, childIndex - 1) ELSE RedistributeInternalFromLeft(parent, leftSibling, underflowNode, childIndex - 1) END IF RETURN TRUE END IF END IF // Try to borrow from right sibling IF childIndex < parent.childCount - 1 THEN rightSibling ← parent.children[childIndex + 1] IF rightSibling.keyCount > minKeys THEN IF underflowNode.isLeaf THEN RedistributeFromRightSibling(parent, underflowNode, rightSibling, childIndex) ELSE RedistributeInternalFromRight(parent, underflowNode, rightSibling, childIndex) END IF RETURN TRUE END IF END IF RETURN FALSE // Redistribution not possibleEND FUNCTIONMany production systems use 'lazy deletion': simply mark entries as deleted without restructuring. Actual space reclamation happens during background maintenance or page splits. This trades storage efficiency for simpler, faster deletes and reduced concurrency conflicts.
| Operation | Worst Case | Typical Case | Notes |
|---|---|---|---|
| Find target leaf | O(log N) | O(log N) | Standard tree descent |
| Delete from leaf | O(n) | O(n) | Shift within node |
| Redistribution | O(n) | Rare | Modify 3 nodes |
| Single merge | O(n) | Rare | Combine 2 nodes |
| Cascading merges | O(h × n) | Very rare | At most h merges |
I/O Complexity:
All cases are O(log N) I/Os. Deletion, like insertion, is logarithmic.
Why Merges Are Rare:
With random deletions and nodes at 50-100% occupancy:
Most deletions complete without restructuring.
What's Next:
We've covered the fundamental operations on B+-trees: search, range query, insert, and delete. The final page of this module explores bulk loading—a specialized technique for efficiently constructing B+-trees from large datasets, bypassing the normal insert path for dramatic speedups.
You now understand the complete mechanics of B+-tree deletion including redistribution and merging. This knowledge is essential for understanding delete-heavy workload performance, index maintenance costs, and why some systems prefer lazy deletion strategies.