If directory doubling expands the addressing capacity, bucket splitting is the operation that actually uses that capacity. When a bucket overflows, splitting divides its records into two buckets, each handling a more specific subset of hash values.
Bucket splitting is the workhorse of extendible hashing—it happens far more frequently than directory doubling and is the mechanism that keeps load balanced across the data structure. Understanding splitting is essential for predicting performance, debugging issues, and implementing extendible hashing correctly.
By the end of this page, you will understand the complete bucket splitting algorithm, how records are redistributed based on hash bits, directory pointer updates, handling of edge cases like pathological splits, and performance characteristics of split operations.
Bucket splitting is triggered when an insertion would exceed bucket capacity:
The Trigger:
An insertion targets a bucket B that has no room for additional records.
Two scenarios:
| Scenario | Condition | Required Action |
|---|---|---|
| Local split | B.local_depth < global_depth | Split B immediately |
| Split with doubling | B.local_depth = global_depth | Double directory first, then split B |
The vast majority of splits are local splits (no directory doubling), and as the table matures, the fraction of splits that require doubling shrinks rapidly.
Pre-split decision flow:
```
function handleOverflow(bucket, key, value):
    // Check if we can split locally
    if bucket.local_depth < global_depth:
        splitBucket(bucket)
        insert(key, value)  // Retry after split
    else if bucket.local_depth == global_depth:
        // Need more addressing bits
        doubleDirectory()
        splitBucket(bucket)
        insert(key, value)  // Retry after split

// Key insight:
// After split, the new key might go to either the original bucket
// or the sibling bucket. The retry recalculates the correct target.
```

With global_depth d and assuming uniform hash distribution, the probability of a random split requiring directory doubling is approximately 1/(2^(d-1)). At d=16, only about 0.003% of splits trigger doubling. Local splits dominate overwhelmingly.
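As a quick check on that figure, here is a standalone sketch (not part of the hashing code) that simply evaluates the formula from the text at a few depths:

```cpp
#include <cstdio>
#include <cmath>

// Evaluate P(split needs doubling) ~ 1/2^(d-1) from the text above.
int main() {
    const int depths[] = {4, 8, 16};
    for (int d : depths) {
        double p = 1.0 / std::pow(2.0, d - 1);
        std::printf("global_depth=%2d  P(doubling) ~ %.6f%%\n", d, p * 100.0);
    }
    return 0;
}
```

At d = 16 this prints roughly 0.003%, matching the claim above.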
The bucket splitting algorithm consists of four phases:
Complete algorithm:
```
function splitBucket(bucket):
    old_local_depth = bucket.local_depth
    new_local_depth = old_local_depth + 1

    // Phase 1: Create sibling bucket
    sibling = allocateNewBucket()
    sibling.local_depth = new_local_depth
    bucket.local_depth = new_local_depth

    // Phase 2: Redistribute records
    discriminating_bit_position = new_local_depth
    for each record in bucket:
        hash = computeHash(record.key)
        bit = getBit(hash, discriminating_bit_position)  // 0 or 1
        if bit == 1:
            sibling.add(record)
            bucket.remove(record)
        // else: record stays in bucket

    // Phase 3: Update directory pointers
    // Find all entries pointing to bucket
    // Half will now point to sibling
    updateDirectoryPointers(bucket, sibling, old_local_depth)

    // Phase 4: Verification (debug mode)
    assert bucket.local_depth == new_local_depth
    assert sibling.local_depth == new_local_depth
    verifyRecordDistribution(bucket, sibling)

function updateDirectoryPointers(bucket, sibling, old_local_depth):
    num_pointers = 2 ^ (global_depth - old_local_depth)

    // Find starting index for this bucket's pointers
    start_index = bucket.hash_prefix << (global_depth - old_local_depth)

    // Half of pointers stay with bucket, half go to sibling
    half = num_pointers / 2
    for i = start_index + half to start_index + num_pointers - 1:
        directory[i] = sibling
```

Key observation: new_local_depth determines which bucket each record goes to.

The record redistribution phase is the core of splitting. It uses a single hash bit to partition records.
The discriminating bit:
For a bucket with local_depth d' splitting to local_depth d' + 1, the discriminating bit is bit number d' + 1 (counting from the most significant bit, 1-indexed).
Visual example:
```
Scenario:
- Bucket B with local_depth = 2, hash prefix = "01"
- Bucket overflows, splitting to local_depth = 3
- Discriminating bit: position 3

Records in bucket B (showing first 4 hash bits):
Record A: hash = 0b0100... → bit 3 = 0 → stays in B
Record B: hash = 0b0111... → bit 3 = 1 → moves to sibling
Record C: hash = 0b0101... → bit 3 = 0 → stays in B
Record D: hash = 0b0110... → bit 3 = 1 → moves to sibling
Record E: hash = 0b0100... → bit 3 = 0 → stays in B
Record F: hash = 0b0111... → bit 3 = 1 → moves to sibling

After split:
Bucket B (prefix "010"): Records A, C, E
Sibling S (prefix "011"): Records B, D, F

Verification:
- All records in B have hashes starting with 010
- All records in S have hashes starting with 011
- Both buckets have local_depth = 3
```

Bit extraction implementation:
```cpp
// Extract the nth bit from a 32-bit hash (1-indexed from MSB)
inline bool getBitMSB(uint32_t hash, int n) {
    // n=1: most significant bit
    // n=32: least significant bit
    int shift = 32 - n;
    return (hash >> shift) & 1;
}

// Example usage during split:
void redistributeRecords(Bucket& bucket, Bucket& sibling, int new_local_depth) {
    std::vector<Record> to_move;

    for (auto& record : bucket.records) {
        uint32_t hash = computeHash(record.key);
        bool bit = getBitMSB(hash, new_local_depth);
        if (bit == 1) {
            to_move.push_back(record);
        }
    }

    // Move records to sibling
    for (auto& record : to_move) {
        bucket.remove(record);
        sibling.add(record);
    }
}
```

With a good hash function, records split roughly 50-50 between bucket and sibling. Significant deviation (e.g., 90-10) suggests hash function weaknesses or pathological data patterns.
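Building on that observation, a post-split balance check can flag suspicious distributions. This is a sketch; the 10/90 thresholds are illustrative assumptions, not prescribed values:

```cpp
#include <cstdio>
#include <cstddef>

// Sketch: warn when a split is badly skewed. 'kept' and 'moved' are the
// record counts left in the original bucket and moved to the sibling.
void checkSplitBalance(std::size_t kept, std::size_t moved) {
    std::size_t total = kept + moved;
    if (total == 0) return;
    double moved_fraction = static_cast<double>(moved) / total;
    // With a good hash function we expect roughly 0.5; flag extremes.
    if (moved_fraction < 0.1 || moved_fraction > 0.9) {
        std::fprintf(stderr,
                     "warning: skewed split (%zu kept, %zu moved); "
                     "check hash quality or key patterns\n",
                     kept, moved);
    }
}
```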
After records are redistributed, the directory must be updated so lookups find the correct bucket.
The pointer update pattern:
Before the split, the bucket had 2^(d - d') directory entries, where d is the global depth and d' is the old local depth. After the split, those entries are partitioned by the discriminating bit of their index: entries where bit d' + 1 is 0 keep pointing to the original bucket, while entries where that bit is 1 point to the sibling.
Algorithm for pointer update (a worked example appears in the comments):
```
function updatePointers(bucket, sibling, old_local_depth):
    // Compute the range of directory indices pointing to bucket
    prefix_bits = bucket.hash_prefix  // First old_local_depth bits
    suffix_bits_count = global_depth - old_local_depth

    start_index = prefix_bits << suffix_bits_count
    end_index = start_index + (1 << suffix_bits_count) - 1

    // Partition point: indices where discriminating bit = 1
    num_entries = end_index - start_index + 1
    split_point = start_index + (num_entries / 2)

    // Update second half to point to sibling
    for i = split_point to end_index:
        directory[i] = sibling

    // First half remains pointing to bucket (no change)

// Example walkthrough:
// - old_local_depth = 2, global_depth = 4
// - bucket prefix = "01" (binary)
// - start_index = 01 << 2 = 0100 = 4
// - end_index = 4 + 4 - 1 = 7
// - Range: indices 4, 5, 6, 7 (binary: 0100, 0101, 0110, 0111)
// - split_point = 4 + 2 = 6
// - Indices 4,5 stay with bucket
// - Indices 6,7 point to sibling
```

Directory updates must happen atomically or be ordered correctly with record redistribution. If a lookup occurs between record movement and pointer update, it might find an empty bucket (records moved but pointer not updated) or miss a record (pointer updated before record moved).
A pathological split occurs when, after splitting, all records end up in the same bucket—the split doesn't distribute load at all.
Why this happens:
If all keys in a bucket happen to have the same bit value at position new_local_depth, one bucket gets everything and the other remains empty. The next insertion to the full bucket would trigger another split, potentially with the same result.
Example scenario:
```
Pathological case:

Initial bucket (local_depth = 2, prefix "01"):
Record 1: hash = 0b0100_1111...
Record 2: hash = 0b0100_0011...
Record 3: hash = 0b0100_1010...
Record 4: hash = 0b0100_0101...  // All have bit 3 = 0!
Record 5: hash = 0b0100_1100...

Split attempt (new_local_depth = 3):
Discriminating bit = 3
All records have bit 3 = 0

After split:
Bucket (prefix "010"): All 5 records
Sibling (prefix "011"): 0 records  ← Empty!

New insertion with hash 0b0100_0000... still goes to full bucket.
Another split attempt...

Worst case: If all keys hash to the same 32-bit value (collision),
no split can ever redistribute. You'll hit maximum depth and overflow.
```

Handling pathological splits:
Pathological splits are rare with cryptographic-quality hash functions. They typically indicate one of three things: (1) a weak hash function, (2) an intentional key attack (hash flooding), or (3) extremely unlucky key patterns. Production systems use overflow chains as a fallback but log warnings when triggered, as in the sketch below.
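A minimal sketch of that fallback policy follows. Every type and helper here (Bucket, Record, splitAndRelocate, addOverflowPage, the depth cap) is an illustrative assumption, not an interface from this page:

```cpp
#include <cstdio>
#include <cstddef>

// Illustrative interfaces, assumed to exist in the surrounding system.
struct Record { /* key, value */ };
struct Bucket {
    int local_depth;
    bool isFull() const;
    std::size_t size() const;
    void add(const Record&);
};
Bucket* splitAndRelocate(Bucket*, const Record&);  // split, return target bucket
void addOverflowPage(Bucket*);                     // chain an overflow page

const int MAX_LOCAL_DEPTH = 32;  // capped here by 32-bit hashes

// Keep splitting while splits make progress; fall back to an overflow
// chain once the depth cap is hit or a split separates no records.
void insertWithFallback(Bucket* bucket, const Record& rec) {
    while (bucket->isFull()) {
        if (bucket->local_depth >= MAX_LOCAL_DEPTH) {
            addOverflowPage(bucket);
            std::fprintf(stderr, "warning: overflow chain used (depth cap)\n");
            break;
        }
        std::size_t before = bucket->size();
        bucket = splitAndRelocate(bucket, rec);
        if (bucket->size() == before) {
            // Pathological split: nothing moved; retrying would loop forever.
            addOverflowPage(bucket);
            std::fprintf(stderr, "warning: pathological split detected\n");
            break;
        }
    }
    bucket->add(rec);
}
```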
Understanding the costs of splitting is essential for performance tuning.
Time complexity components:
| Operation | Complexity | I/O Operations | Notes |
|---|---|---|---|
| Allocate sibling | O(1) | 1 write (new page) | Allocate from free list |
| Read original bucket | O(1) | 1 read | Already in buffer for insert |
| Compute destinations | O(B) | 0 | B = records in bucket |
| Write sibling records | O(B/2) | Included in allocation | Average half move |
| Rewrite original | O(B/2) | 1 write | Remaining records |
| Update directory | O(2^(d-d')) | 0-1 (if disk) | Pointer updates |
| Total | O(B + 2^(d-d')) | 2-3 I/Os | Independent of table size |
I/O analysis:
The critical observation is that splitting requires only constant I/O operations (typically 2-3 page accesses), regardless of table size. This is what makes extendible hashing practical for disk-based databases.
Comparison with static hashing full rehash: rehashing a static table reads and rewrites every page, an O(N) I/O operation that blocks access while it runs, whereas a split touches the same 2-3 pages regardless of how large the table grows.
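A back-of-the-envelope model makes the contrast concrete; the page count and per-operation I/O figures below are assumptions for illustration:

```cpp
#include <cstdio>

// Rough I/O model: a static-hash rehash rewrites every data page, while
// an extendible-hash split touches a constant handful of pages.
int main() {
    const long long pages = 1000000;         // assumed table size: 1M pages
    const long long rehash_ios = 2 * pages;  // read + write each page
    const long long split_ios = 3;           // read bucket, write bucket + sibling
    std::printf("full rehash: ~%lld I/Os; one split: ~%lld I/Os\n",
                rehash_ios, split_ios);
    return 0;
}
```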
Some systems batch nearby splits or preemptively split buckets approaching capacity. This smooths latency spikes and can improve I/O patterns by writing multiple related pages together.
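A sketch of such a preemptive trigger, assuming a simple bucket interface; the 0.9 fill factor is a hypothetical tuning knob, not a recommended constant:

```cpp
#include <cstddef>

// Illustrative bucket layout, assumed from the surrounding system.
struct Bucket {
    std::size_t record_count;
    std::size_t capacity;
};

// Split early once a bucket crosses a fill-factor threshold, so the
// split runs on our schedule rather than inside a latency-sensitive insert.
bool shouldSplitPreemptively(const Bucket& b, double fill_threshold = 0.9) {
    return static_cast<double>(b.record_count) >=
           fill_threshold * static_cast<double>(b.capacity);
}
```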
Bucket splitting must coordinate with concurrent lookups and other insertions.
Concurrency scenarios:
- A lookup reads a bucket while a split is redistributing its records.
- Two insertions overflow the same bucket and both attempt to split it.
- A split that requires directory doubling races with directory readers.
Bucket-level locking approach:
```
// Fine-grained locking for bucket split
function splitWithLocking(bucket):
    // Lock bucket exclusively for split
    bucket.writeLock()
    try:
        // Re-check if split still needed (might have split due to another thread)
        if bucket.hasSpace():
            return  // Another split freed space

        // Allocate sibling (no lock needed - fresh page)
        sibling = allocateNewBucket()
        sibling.writeLock()  // Lock sibling before publication

        // Redistribute records (both buckets locked)
        redistribute(bucket, sibling)

        // Update directory with latching
        directoryLatch.acquire()
        updateDirectoryPointers(bucket, sibling)
        directoryLatch.release()

        // Now safe to release bucket locks
        sibling.writeUnlock()
        bucket.writeUnlock()
    finally:
        if bucket.isLocked():
            bucket.writeUnlock()

// Lookup with optimistic concurrency
function lookup(key):
    retry:
        bucket = findBucket(key)
        bucket.readLock()

        // Verify bucket is correct (might have split)
        expected_bucket = findBucket(key)  // Recalculate
        if expected_bucket != bucket:
            bucket.readUnlock()
            goto retry

        result = bucket.find(key)
        bucket.readUnlock()
        return result
```

To prevent deadlocks, always acquire locks in a consistent order: directory latch → bucket locks in page ID order. Some systems use lock-free techniques with hazard pointers or epoch-based reclamation to eliminate traditional locking overhead.
Alternative: lock-free splits. Advanced implementations use compare-and-swap (CAS) operations: the sibling bucket is fully built off to the side, the affected directory entries are then swung to it with CAS, and a reader that loses the race recomputes its target and retries (as in the optimistic lookup above).
This approach eliminates blocking but requires careful memory ordering.
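A minimal sketch of the pointer-swing step using std::atomic, with allocation and memory reclamation omitted (epoch or hazard-pointer reclamation is assumed, as noted above); the directory size is a placeholder:

```cpp
#include <atomic>
#include <cstddef>

struct Bucket;  // sibling is fully populated before publication

constexpr std::size_t DIR_SIZE = 1024;     // assumed directory size
std::atomic<Bucket*> directory[DIR_SIZE];  // readers load, writers CAS

// Swing one directory entry from the old bucket to the fully built
// sibling. If another thread changed the entry first, report failure
// so the caller can re-examine the directory and retry.
bool swingEntry(std::size_t index, Bucket* old_bucket, Bucket* sibling) {
    Bucket* expected = old_bucket;
    return directory[index].compare_exchange_strong(
        expected, sibling,
        std::memory_order_release,   // publish sibling's contents
        std::memory_order_relaxed);  // on failure, no ordering needed
}
```

The release ordering on success is what makes the sibling's records visible to any reader that subsequently loads the entry with acquire semantics.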
Bucket splitting must be crash-safe. If the system fails mid-split, recovery must restore consistency.
Crash points and their implications:
| Crash Point | State on Disk | Recovery Action |
|---|---|---|
| Before sibling allocated | Original bucket full | Retry split on next insert |
| After sibling allocated, before records move | Empty sibling exists | Deallocate sibling, retry split |
| Mid record redistribution | Some records in both buckets | Undo partial move, retry split |
| After redistribution, before directory update | Records split, directory points to original only | Complete directory update |
| After directory update | Complete split | No recovery needed |
WAL-based recovery:
Most database systems use Write-Ahead Logging (WAL) to ensure split atomicity:
```
// WAL records for split operation
function splitWithLogging(bucket):
    // Log intention to split
    WAL.write(SplitStartRecord {
        bucket_id: bucket.id,
        old_local_depth: bucket.local_depth
    })

    sibling = allocateBucket()
    WAL.write(AllocateRecord { page_id: sibling.id })

    // Log each record movement
    for each record to move:
        WAL.write(MoveRecord {
            record_key: record.key,
            from_bucket: bucket.id,
            to_bucket: sibling.id
        })

    // Log directory updates
    WAL.write(DirectoryUpdateRecord {
        entries: [affected_indices],
        new_target: sibling.id
    })

    // Log local depth changes
    WAL.write(DepthUpdateRecord { bucket: bucket.id, new_depth: bucket.local_depth + 1 })
    WAL.write(DepthUpdateRecord { bucket: sibling.id, new_depth: bucket.local_depth + 1 })

    // Log completion
    WAL.write(SplitCompleteRecord { bucket_id: bucket.id })

    // Force WAL to disk before committing changes
    WAL.flush()
    commitChangesToDisk()

// On crash recovery:
// - Find incomplete SplitStartRecord without matching SplitCompleteRecord
// - Redo or undo based on logged operations
```

Modern databases often use physiological logging—page-level physical REDO with logical UNDO. This allows efficient recovery while supporting operations like split that span multiple pages. The entire split can be treated as a single logical operation for undo purposes.
Bucket splitting is the core growth mechanism of extendible hashing. To consolidate the key concepts:
- An overflow triggers a local split when local_depth < global_depth; otherwise the directory doubles first.
- Records are redistributed by a single discriminating bit at position new_local_depth.
- The directory entries of the split bucket are partitioned: half keep pointing to it, half move to the sibling.
- Pathological splits (all records landing on one side) are handled with repeated splits and overflow-chain fallbacks.
- A split costs O(B) work and 2-3 I/Os, independent of table size.
- Splits coordinate with concurrent operations via locking or CAS, and rely on WAL for crash safety.
Module Complete:
You have now worked through the complete architecture of extendible hashing: the directory and its hash-prefix addressing, global and local depth, directory doubling, and bucket splitting.
These concepts work together to create an index structure that provides O(1) lookup while gracefully handling arbitrary growth without the catastrophic reorganization of static hashing.
You have achieved a comprehensive understanding of extendible hashing. Dynamic hash indexing of this kind, together with its close relative linear hashing (the basis of PostgreSQL's hash indexes), is foundational to database systems and many key-value stores. The principles you've learned (directory indirection, depth management, and incremental splitting) apply broadly to adaptive data structures throughout computer science.