Directory doubling is the most structurally significant operation in extendible hashing. It occurs when the current directory structure cannot accommodate a bucket split—specifically, when a bucket with local_depth = global_depth overflows.
What makes directory doubling remarkable is that no data moves. Unlike full rehashing in static hashing (which reads and rewrites every record), directory doubling only copies pointers. The actual bucket pages, containing potentially millions of records, remain exactly where they are.
By the end of this page, you will understand when directory doubling is triggered, the exact mechanics of the doubling operation, its time and space complexity, concurrency considerations, and optimization techniques used in production systems.
Directory doubling is triggered by a precise condition:
The Trigger:
An insertion attempts to add a record to a bucket that:
- Is full (at capacity)
- Has local_depth = global_depth (already at maximum specificity)
When both conditions are true, the bucket cannot split within the current directory structure. Splitting would require halving the pointer count, but a bucket with local_depth = global_depth has exactly 1 pointer—you cannot have half a pointer.
Decision flowchart:
```
function insert(key, value):
    bucket = findBucket(key)

    if bucket.hasSpace():
        bucket.add(key, value)
        return SUCCESS

    // Bucket is full - need to split
    if bucket.local_depth < global_depth:
        // CASE A: Local split only (no directory change)
        splitBucket(bucket)
        return insert(key, value)  // Retry
    else:
        // CASE B: bucket.local_depth == global_depth
        // Directory doubling required before split
        doubleDirectory()
        splitBucket(bucket)
        return insert(key, value)  // Retry

// Key insight:
// - CASE A handles most overflow situations (increasingly common as table grows)
// - CASE B is rare but necessary for continued growth
```

Why can't we just use overflow chains?
Some systems do allow overflow chains as a fallback, but chains defeat the purpose of extendible hashing: once a lookup may have to walk a chain of overflow pages, the bounded-cost guarantee of one directory access plus one bucket access is lost.
Directory doubling preserves the O(1) guarantee by ensuring every bucket can always split when needed.
With global_depth d, roughly 1/2^(d-1) of buckets have local_depth = d (the maximum). As d grows, the probability that a random overflow triggers doubling approaches zero: a mature table with d = 20 has less than a 1-in-500,000 chance per overflow (1/2^19 = 1/524,288).
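As a quick back-of-the-envelope check, the heuristic above can be tabulated directly. This snippet is illustrative only; it just evaluates the 1/2^(d-1) figure stated in the text, it is not tied to any particular implementation:

```cpp
#include <cstdio>
#include <cstdint>

// Illustrative only: tabulate the heuristic 1/2^(d-1) fraction of buckets
// at maximum local depth, i.e. the rough chance that a random bucket
// overflow forces a directory doubling at global_depth d.
int main() {
    const uint32_t depths[] = {4, 8, 12, 16, 20};
    for (uint32_t d : depths) {
        unsigned long long one_in = 1ull << (d - 1);
        std::printf("d = %2u: roughly 1 in %llu overflows (%.2e)\n",
                    (unsigned)d, one_in, 1.0 / (double)one_in);
    }
    return 0;
}
```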
The directory doubling algorithm is elegantly simple:
Algorithm: Double Directory
```
function doubleDirectory():
    old_size = 2^global_depth
    new_size = 2 * old_size

    // Step 1: Allocate new directory array
    new_directory = allocateArray(new_size, type=BucketPointer)

    // Step 2: Copy and duplicate all pointers
    for i = 0 to old_size - 1:
        // Original entry at position i
        new_directory[i] = directory[i]
        // Duplicate at position i + old_size
        new_directory[i + old_size] = directory[i]

    // Step 3: Atomically swap directories
    old_directory = directory
    directory = new_directory

    // Step 4: Increment global depth
    global_depth = global_depth + 1

    // Step 5: Deallocate old directory (deferred if concurrent readers)
    scheduleDeallocation(old_directory)

// After doubling:
// - directory size: 2^(global_depth)
// - Every bucket now has 2× pointers pointing to it
// - No bucket has changed (local depths unchanged)
// - No data has moved
```
Key observations:
- The directory grows from 2^d to 2^(d+1) entries and global_depth increases by 1.
- Every existing bucket now has twice as many directory pointers referring to it.
- No bucket's local_depth changes, and no record moves.
- Immediately after doubling, the upper half of the directory is an exact copy of the lower half; only the subsequent bucket split differentiates the two copies.
Understanding the costs of directory doubling is essential for performance analysis.
Time complexity:
| Operation | Complexity | Explanation |
|---|---|---|
| Memory allocation | O(2^d) | Allocate 2^(d+1) pointer slots |
| Pointer copying | O(2^d) | Copy 2^d pointers twice each |
| Directory swap | O(1) | Single pointer assignment |
| Bucket operations | 0 | No bucket I/O during doubling |
| Total | O(2^d) | Linear in current directory size |
Space complexity:
- During the doubling itself, the old and new directories coexist, so peak usage is 2^d + 2^(d+1) = 3 × 2^d pointers.
- Once the old directory is deallocated, steady-state usage is 2^(d+1) pointers (8 bytes each with typical pointer sizes).
Amortized analysis:
While each doubling costs O(2^d), doublings are exponentially rare:
```
Amortized Cost Analysis for N insertions:

Assumptions:
  - Start with d = 1 (2 buckets)
  - Each bucket holds B records
  - Grow to N records total

Number of doublings:
  - Need approximately N/B buckets
  - Maximum global depth: d_max ≈ log₂(N/B)
  - Total doublings: d_max - 1 = O(log(N/B))

Total doubling work:
  Σ (2^i for i = 1 to log₂(N/B))
  = 2 + 4 + 8 + ... + (N/B)
  = 2 × (N/B) - 2
  = O(N/B)

Amortized cost per insertion:
  = O(N/B) / N
  = O(1/B)
  ≈ O(1) for fixed bucket size B

Conclusion:
  Despite O(2^d) per doubling, the amortized cost per insertion is O(1)
  because doublings become exponentially less frequent.
```

With 8-byte pointers and d = 20, copying 1 million pointers takes ~8ms assuming an effective copy throughput of roughly 1 GB/s; for d = 24, it's ~130ms. These are rare operations—the 24th doubling happens only once in the table's lifetime, after handling billions of insertions.
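The geometric-sum argument can also be checked numerically. The sketch below is illustrative only: it assumes a hypothetical bucket capacity B = 64, grows a table to N = 100 million records, counts the pointer slots written by every doubling, and divides by N:

```cpp
#include <cstdio>
#include <cstdint>

// Illustrative check of the amortized argument: count pointer slots written
// by every doubling while growing to N records, assuming buckets of capacity
// B and that the directory doubles whenever the needed bucket count would
// exceed 2^global_depth.
int main() {
    const uint64_t N = 100'000'000;  // records inserted
    const uint64_t B = 64;           // hypothetical bucket capacity

    uint64_t global_depth = 1;
    uint64_t copies = 0;             // total pointer slots written by doublings

    uint64_t buckets_needed = N / B; // roughly one new bucket every B insertions
    while ((1ull << global_depth) < buckets_needed) {
        copies += 2ull << global_depth;  // copy 2^d pointers into 2^(d+1) slots
        global_depth++;
    }

    std::printf("reached depth %llu, %llu pointer copies total\n",
                (unsigned long long)global_depth, (unsigned long long)copies);
    std::printf("amortized copies per insertion: %.4f\n",
                (double)copies / (double)N);
    return 0;
}
```

With these assumptions the program reports well under one pointer copy per insertion, matching the O(1/B) result above.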
The pointer duplication during doubling follows a precise pattern that maintains correctness.
The pattern:
- The entry at index i (in the old directory) is copied to indices i and i + 2^d (in the new directory)
- In binary: index 0bXXX is duplicated to 0b0XXX and 0b1XXX

This works because the most significant bit of the new index is the "new" bit added by increasing global depth.
Detailed example:
```
Before doubling (d = 3, 8 entries):
  Index 0 (000) → Bucket A
  Index 1 (001) → Bucket A   // A has local_depth 2, shared
  Index 2 (010) → Bucket B
  Index 3 (011) → Bucket C
  Index 4 (100) → Bucket D
  Index 5 (101) → Bucket D   // D has local_depth 2, shared
  Index 6 (110) → Bucket E
  Index 7 (111) → Bucket E   // E has local_depth 2, shared

After doubling (d = 4, 16 entries):
  Index 0  (0000) → Bucket A   // From old index 0
  Index 1  (0001) → Bucket A   // From old index 1
  Index 2  (0010) → Bucket B   // From old index 2
  Index 3  (0011) → Bucket C   // From old index 3
  Index 4  (0100) → Bucket D   // From old index 4
  Index 5  (0101) → Bucket D   // From old index 5
  Index 6  (0110) → Bucket E   // From old index 6
  Index 7  (0111) → Bucket E   // From old index 7
  ────────────────────────────────────────────────
  Index 8  (1000) → Bucket A   // Copy of old index 0
  Index 9  (1001) → Bucket A   // Copy of old index 1
  Index 10 (1010) → Bucket B   // Copy of old index 2
  Index 11 (1011) → Bucket C   // Copy of old index 3
  Index 12 (1100) → Bucket D   // Copy of old index 4
  Index 13 (1101) → Bucket D   // Copy of old index 5
  Index 14 (1110) → Bucket E   // Copy of old index 6
  Index 15 (1111) → Bucket E   // Copy of old index 7

Pointer counts after doubling:
  Bucket A: was 2, now 4 (indices 0,1,8,9)
  Bucket B: was 1, now 2 (indices 2,10)
  Bucket C: was 1, now 2 (indices 3,11)
  Bucket D: was 2, now 4 (indices 4,5,12,13)
  Bucket E: was 2, now 4 (indices 6,7,14,15)
```

Why this pattern is correct:
The pattern assumes the directory is indexed by the least significant global_depth bits of the hash, so the bit gained by doubling becomes the most significant bit of the index.

Consider a key whose hash ends in ...1011:

- Before doubling (d = 3): the low 3 bits are 011 → directory index 3 → Bucket C
- After doubling (d = 4): the low 4 bits are 1011 → directory index 11 → still Bucket C, because new index 11 (1011) is a copy of old index 3 (011)

Every key resolves to the same bucket before and after doubling: a key that used old index i now lands on either i (new bit = 0) or i + 2^d (new bit = 1), and both slots hold exactly the pointer that old index i held. Lookups therefore remain correct without touching a single record; it is the subsequent bucket split that differentiates the two copies.
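To convince yourself mechanically, here is a minimal sketch (in-memory, with integer bucket IDs standing in for bucket pages, and low-bit indexing as assumed above) that checks every hash value resolves to the same bucket through the old and new directories:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Minimal sketch: verify that doubling by copying entry i to i and
// i + old_size preserves every lookup, assuming the directory is
// indexed by the low global_depth bits of the hash.
int main() {
    uint32_t d = 3;
    std::vector<int> old_dir = {0, 0, 1, 2, 3, 3, 4, 4};  // buckets A..E as 0..4

    // Double: copy each entry to positions i and i + old_size.
    std::vector<int> new_dir(old_dir.size() * 2);
    for (size_t i = 0; i < old_dir.size(); i++) {
        new_dir[i] = old_dir[i];
        new_dir[i + old_dir.size()] = old_dir[i];
    }

    // Every hash value resolves to the same bucket before and after.
    for (uint32_t h = 0; h < (1u << 16); h++) {
        uint32_t old_index = h & ((1u << d) - 1);
        uint32_t new_index = h & ((1u << (d + 1)) - 1);
        assert(old_dir[old_index] == new_dir[new_index]);
    }
    return 0;
}
```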
Directory doubling is never performed in isolation—it always immediately precedes a bucket split. The sequence is:
1. An insertion finds its target bucket B full, with B.local_depth = global_depth.
2. The directory is doubled; global_depth increases by 1 and B now has two pointers.
3. Bucket B is split, and the new sibling takes one of the two pointers.
4. The insertion is retried and succeeds.
After this sequence:
| Property | Before | After Doubling | After Split |
|---|---|---|---|
| global_depth | d | d + 1 | d + 1 |
| directory_size | 2^d | 2^(d+1) | 2^(d+1) |
| B.local_depth | d | d | d + 1 |
| B.pointers | 1 | 2 | 1 |
| Sibling.local_depth | — | — | d + 1 |
| Sibling.pointers | — | — | 1 |
Step-by-step walkthrough:
```
Initial state:
  global_depth = 2
  Bucket B at index 11, local_depth = 2, pointers = 1
  B is full and receives new insertion

Step 1: Double directory
  global_depth becomes 3
  Directory grows from 4 to 8 entries
  Entry 11 (0b11) copied to entries 011 and 111
  B now has 2 pointers (indices 3 and 7)
  B.local_depth still = 2

Step 2: Split bucket B
  B.local_depth becomes 3
  Create sibling S with local_depth = 3
  Redistribute B's records:
    - Records with bit 3 = 0 stay in B
    - Records with bit 3 = 1 move to S
  Update directory:
    - Index 011 (3) points to B
    - Index 111 (7) points to S
  B now has 1 pointer, S has 1 pointer

Step 3: Retry insertion
  Recalculate target bucket using new global_depth = 3
  Insert into B or S depending on key's bit 3
  Insertion succeeds (buckets now have space)
```

In practice, the doubling and split are often treated as a single atomic operation for crash recovery purposes. The write-ahead log records the combined operation, ensuring either both complete or neither does.
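To make Step 2 concrete, here is a hedged sketch of the split on simple in-memory structures. The `Bucket` layout, the `shared_ptr` directory, and the 0-indexed bit numbering are illustrative assumptions, not a specific engine's types:

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

struct Bucket {
    uint32_t local_depth = 0;
    std::vector<std::pair<uint64_t, uint64_t>> records;  // (hash, value) pairs
};

// Illustrative sketch of Step 2 only. Assumes the directory is indexed by
// the low global_depth bits of the hash and that the caller has already
// ensured b->local_depth < global_depth (doubling first if necessary).
void splitBucket(std::vector<std::shared_ptr<Bucket>>& directory,
                 std::shared_ptr<Bucket> b) {
    uint32_t bit = b->local_depth;            // newly significant bit (0-indexed)
    b->local_depth++;
    auto sibling = std::make_shared<Bucket>();
    sibling->local_depth = b->local_depth;

    // Redistribute: records whose hash has the new bit set move to the sibling.
    std::vector<std::pair<uint64_t, uint64_t>> keep;
    for (const auto& rec : b->records) {
        if ((rec.first >> bit) & 1) sibling->records.push_back(rec);
        else                        keep.push_back(rec);
    }
    b->records = std::move(keep);

    // Repoint the directory entries for b whose index has the new bit set.
    for (size_t i = 0; i < directory.size(); i++) {
        if (directory[i] == b && ((i >> bit) & 1)) directory[i] = sibling;
    }
}
```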
Directory doubling presents unique concurrency challenges because it affects the global structure used by all operations.
The challenge:
During doubling, the relationship between hash values and bucket locations temporarily changes. Readers using the old global_depth will compute different indices than readers using the new depth. Without synchronization, a lookup could pair the new global_depth with the old (smaller) directory and index past its end, observe a partially copied new directory, or dereference an old directory that has already been freed.
Synchronization strategies:
```
// RCU-style directory doubling
class HashIndex {
    atomic<DirectoryState*> current_state

    struct DirectoryState {
        uint32_t global_depth
        BucketPointer* directory
        uint64_t version
    }
}

function doubleDirectoryRCU():
    old_state = current_state.load()

    // Build complete new state
    new_directory = allocateAndCopyDoubled(old_state.directory,
                                           old_state.global_depth)
    new_state = new DirectoryState {
        global_depth: old_state.global_depth + 1,
        directory: new_directory,
        version: old_state.version + 1
    }

    // Atomic swap - no locks needed for readers
    while !current_state.compare_and_swap(old_state, new_state):
        // Another writer beat us - retry from scratch
        deallocate(new_directory)
        old_state = current_state.load()
        new_directory = allocateAndCopyDoubled(old_state.directory,
                                               old_state.global_depth)
        new_state.directory = new_directory
        new_state.global_depth = old_state.global_depth + 1
        new_state.version = old_state.version + 1

    // Schedule old state for deferred cleanup
    rcu_free(old_state)

function lookup(key):
    state = current_state.load()  // Consistent snapshot
    index = extractBits(hash(key), state.global_depth)
    return state.directory[index]
    // No validation needed - state is immutable once published
```

Since lookups vastly outnumber directory modifications, RCU's zero-cost reads are ideal. Writers pay the cost of copying the directory, but this happens rarely. PostgreSQL's hash index uses a variant of this approach.
Production systems employ several techniques to optimize directory doubling.
Pre-allocation:
Instead of allocating exactly 2^d entries, allocate extra capacity:
```cpp
// Pre-allocate directory with growth margin
class OptimizedDirectory {
private:
    BucketPointer* storage;    // Actual memory
    size_t allocated_size;     // May exceed 2^global_depth
    uint32_t global_depth;

public:
    OptimizedDirectory() {
        global_depth = 1;
        // Allocate for depth 4 even though starting at 1
        allocated_size = 16;   // 2^4
        storage = new BucketPointer[allocated_size];
        // Only first 2 entries used initially
    }

    void doubleDirectory() {
        size_t needed = 2 * (1 << global_depth);

        if (needed <= allocated_size) {
            // Just duplicate pointers in-place
            size_t old_size = 1 << global_depth;
            for (size_t i = 0; i < old_size; i++) {
                storage[i + old_size] = storage[i];
            }
            global_depth++;
            return;  // No allocation!
        }

        // Need to reallocate
        size_t new_allocated = std::max(needed, allocated_size * 2);
        // ... proceed with standard reallocation
    }
};
```

Multi-level directories:
For very large directories, use a tree of directory pages instead of a flat array:
```
Multi-level Directory Structure (2-level example):

Level 0 (Root): Array of pointers to Level 1 pages
  - Fixed size throughout table lifetime
  - Each entry covers 2^(d - k) buckets (k = bits for L0 indexing)

Level 1 (Leaves): Pages containing bucket pointers
  - Grows by adding new L1 pages
  - Each page covers a range of bucket addresses

Example with k = 10:
  L0: 1024 entries, each pointing to an L1 page
  L1: Each page has up to 2^14 = 16K entries
  Maximum addressable: 2^24 = 16M buckets

Doubling behavior:
  - If d < 10:  Only L0 doubles (like regular doubling)
  - If d >= 10: L1 pages split, L0 unchanged

Advantage:
  - L0 fits in cache (8KB)
  - L1 pages can be demand-paged
  - Doubling never copies entire directory
```

Pre-allocation trades memory for lower doubling latency. Multi-level directories trade an extra memory access per lookup for bounded doubling cost. The right choice depends on workload characteristics—read-heavy vs write-heavy, memory constraints, latency requirements.
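For concreteness, a lookup against a two-level layout like the one above might split the bucket index as sketched below. The exact bit split (top bits for L0), the 16K-entry leaf pages, and the types are assumptions for illustration, not a production page format:

```cpp
#include <cstdint>

// Sketch of a two-level directory lookup, following the example above:
// the top K bits of the bucket index select an L1 page through the L0
// root; the remaining bits select an entry within that page.
constexpr uint32_t K = 10;                          // bits resolved by L0

struct L1Page   { uint64_t bucket_ptr[1u << 14]; }; // 16K entries per page
struct Directory {
    uint32_t global_depth;                          // assumed >= K here
    L1Page*  root[1u << K];                         // L0: fixed size, cache-resident
};

uint64_t lookupBucket(const Directory& dir, uint64_t hash) {
    uint64_t index   = hash & ((1ull << dir.global_depth) - 1);
    uint64_t l1_bits = dir.global_depth - K;        // bits resolved inside a page
    uint64_t l0_slot = index >> l1_bits;            // which L1 page
    uint64_t l1_slot = index & ((1ull << l1_bits) - 1);
    return dir.root[l0_slot]->bucket_ptr[l1_slot];
}
```

The cost is one extra pointer dereference per lookup, in exchange for a doubling step that never has to copy the full pointer array at once.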
Directory doubling is the mechanism that allows extendible hashing to scale without bounds (within system limits). Let's consolidate the key concepts:
- Trigger: a full bucket whose local_depth equals global_depth.
- Mechanics: allocate a directory twice the size, copy every pointer twice, swap, and increment global_depth; no records move.
- Cost: O(2^d) per doubling, but O(1) amortized per insertion because doublings become exponentially rarer.
- Concurrency: an atomic swap of an immutable directory snapshot (RCU-style) keeps lookups lock-free.
- Optimizations: pre-allocation avoids reallocating on every doubling; multi-level directories bound the copying cost.
What's next:
With directory doubling understood, we'll explore bucket splitting—the complementary operation that redistributes records when a bucket overflows. While doubling expands capacity, splitting actually uses that capacity by creating new buckets.
You now understand directory doubling—the global expansion operation that enables unbounded growth. The key insight is that doubling is metadata-only; no records move, making it orders of magnitude faster than static hashing's full rehash.