Collision — Chaining - Learning Module

Loading content...

0/276

Chaining with Linked Lists

The Elegant Solution

We've established that hash collisions are inevitable—a mathematical certainty arising from mapping a large key space to a smaller index space. The question now becomes: How do we elegantly handle this reality?

The most intuitive and widely-used solution is chaining (also called separate chaining or closed addressing). The core insight is beautifully simple:

Instead of storing a single entry at each index, store a collection of entries.

When multiple keys hash to index 5, we don't panic or probe elsewhere. We simply add all of them to a list at index 5. Each array slot becomes a "bucket" that can hold multiple entries, and we use a linked list to chain them together.

What You Will Learn

By the end of this page, you will understand how chaining transforms hash tables from single-entry slots to multi-entry buckets, implement a fully functional hash table with chaining, trace through insert/lookup/delete operations step-by-step, and understand why linked lists are the classic choice for chains.

The Chaining Concept: From Slots to Buckets

In a hash table without collision handling, each array index holds at most one key-value pair:

[ Entry | Entry | null | Entry | null | null | Entry | null ]
    0      1       2       3       4       5      6       7

With chaining, each array index holds a linked list of entries:

[ List → | List → | null | List → | null | null | List → | null ]
    0        1        2       3        4       5      6        7
    ↓        ↓               ↓                        ↓
  (k1,v1)  (k3,v3)        (k4,v4)                  (k7,v7)
    ↓                        ↓  
  (k2,v2)                 (k5,v5)
                             ↓
                          (k6,v6)

The Mental Model

Think of each bucket as a coat hook by the door. In a naive system, each hook holds one coat—if someone's coat is already there, you have a problem. With chaining, each hook has a coat rack that can hang multiple coats. You find the right hook using the hash, then search along the rack for your specific coat.

Why Linked Lists?

The classic implementation uses singly linked lists for several practical reasons:

Dynamic size: No need to pre-allocate space for collisions
O(1) insertion: Add to head of list in constant time
Memory efficient: Only allocate nodes as needed
Simple implementation: Well-understood data structure
No wasted space: Unlike arrays, no empty slots in the chain

While other data structures (arrays, balanced trees) can also serve as chains, linked lists remain the standard choice due to their simplicity and the fact that chains are typically short with a good hash function.

Converting Mermaid diagram...

Designing the Data Structures

Before implementing operations, let's establish the data structures we need. A hash table with chaining requires three components:

Node: Represents a single key-value entry in a chain
Bucket: The linked list at each index (can be the head node)
HashTable: The array of buckets plus size tracking

data-structures
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
/**
 * Node in a chain - represents one key-value entry
 */
class ChainNode<K, V> {
    key: K;
    value: V;
    next: ChainNode<K, V> | null;
    
    constructor(key: K, value: V) {
        this.key = key;
        this.value = value;
        this.next = null;
    }
}
 
/**
 * Hash Table using separate chaining for collision resolution
 */
class ChainingHashTable<K, V> {
    // Array of bucket heads (each bucket is a linked list)
    private buckets: (ChainNode<K, V> | null)[];
    
    // Number of entries in the table (not number of buckets)
    private size: number;
    
    // Load factor threshold for resizing
    private readonly loadFactorThreshold: number;
    
    constructor(initialCapacity: number = 16, loadFactor: number = 0.75) {
        this.buckets = new Array(initialCapacity).fill(null);
        this.size = 0;
        this.loadFactorThreshold = loadFactor;
    }
    
    // Current load factor = entries / buckets
    private get loadFactor(): number {
        return this.size / this.buckets.length;
    }
    
    // Number of buckets (array length)
    get capacity(): number {
        return this.buckets.length;
    }
    
    // Number of entries stored
    get count(): number {
        return this.size;
    }
}

Load Factor Preview

The load factor (entries/buckets) is crucial for performance. We'll explore it deeply in Page 4, but for now, know that it measures how 'full' the table is. Higher load factors mean longer chains and slower operations.

Implementing the Hash Function

The hash function maps keys to bucket indices. While we covered hash function properties in an earlier module, let's implement a practical hash function for our chaining hash table.

The process involves two steps:

Compute a hash code: Transform the key into an integer
Map to bucket index: Use modulo to fit within array bounds

hash-function
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
class ChainingHashTable<K, V> {
    // ... previous code ...
    
    /**
     * Computes the hash code for a key.
     * Uses a polynomial rolling hash for strings.
     */
    private hashCode(key: K): number {
        if (key === null || key === undefined) {
            return 0;
        }
        
        const keyStr = String(key);
        let hash = 0;
        
        // Polynomial rolling hash with prime base 31
        for (let i = 0; i < keyStr.length; i++) {
            // Using 31 as multiplier (common choice - prime, 2^5 - 1)
            hash = (hash * 31 + keyStr.charCodeAt(i)) | 0;
        }
        
        return hash;
    }
    
    /**
     * Maps a hash code to a valid bucket index.
     * Handles negative hash codes properly.
     */
    private getBucketIndex(key: K): number {
        const hashCode = this.hashCode(key);
        
        // Ensure positive index using bitwise AND with max positive int
        // Or use Math.abs, but be aware of edge cases with MIN_VALUE
        const positiveHash = hashCode & 0x7FFFFFFF;
        
        return positiveHash % this.buckets.length;
    }
}
 
// Example usage:
// "apple" → hashCode: 92668751 → index: 92668751 % 16 = 15
// "banana" → hashCode: -1396355347 → index: (& 0x7F...) % 16 = 5

Negative Hash Codes

Many hash functions produce negative values (due to integer overflow). Always ensure the bucket index is positive before using modulo. The bitwise AND with 0x7FFFFFFF clears the sign bit, guaranteeing a positive value.

The Insert Operation

Insertion is the most fundamental operation. With chaining, the algorithm is:

Compute the bucket index from the key's hash
Search the chain at that bucket for an existing entry with the same key
If found: Update the value (standard map behavior)
If not found: Add a new node to the chain
Update the size counter if a new entry was added

insert-operation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
class ChainingHashTable<K, V> {
    // ... previous code ...
    
    /**
     * Inserts or updates a key-value pair.
     * Returns the previous value if key existed, undefined otherwise.
     */
    put(key: K, value: V): V | undefined {
        // Step 1: Find the bucket
        const index = this.getBucketIndex(key);
        
        // Step 2: Search the chain for existing key
        let current = this.buckets[index];
        while (current !== null) {
            if (this.keysEqual(current.key, key)) {
                // Key exists - update value
                const oldValue = current.value;
                current.value = value;
                return oldValue; // Return old value, don't increment size
            }
            current = current.next;
        }
        
        // Step 3: Key not found - insert new node at HEAD of chain
        // (Head insertion is O(1) and preferred)
        const newNode = new ChainNode(key, value);
        newNode.next = this.buckets[index];
        this.buckets[index] = newNode;
        
        // Step 4: Update size
        this.size++;
        
        // Step 5: Check if resize is needed (covered in later module)
        if (this.loadFactor > this.loadFactorThreshold) {
            this.resize();
        }
        
        return undefined; // No previous value existed
    }
    
    /**
     * Compares two keys for equality.
     * Handles null/undefined and uses strict equality.
     */
    private keysEqual(key1: K, key2: K): boolean {
        if (key1 === key2) return true;
        if (key1 === null || key2 === null) return false;
        
        // For objects, you might want deep equality
        // For primitives and strings, === works fine
        return key1 === key2;
    }
}

Head Insertion vs Tail Insertion

We insert new nodes at the HEAD of the chain, not the tail. This is O(1) because we have direct access to the head. Tail insertion would require O(n) traversal to find the end. Unless you need ordering (rare for hash tables), head insertion is preferred.

Step-by-Step Trace:

Let's trace inserting three entries that all hash to index 3:

put("apple", 100) → Index 3, chain is empty → Create node → [apple:100]
put("cherry", 200) → Index 3, chain has apple → Search, not found → Insert at head → [cherry:200] → [apple:100]
put("fig", 300) → Index 3, chain has cherry→apple → Search, not found → Insert at head → [fig:300] → [cherry:200] → [apple:100]
put("apple", 150) → Index 3, chain has fig→cherry→apple → Search, found apple → Update → [fig:300] → [cherry:200] → [apple:150]

The Lookup Operation

Lookup (also called get or find) retrieves the value associated with a key. The algorithm follows the same pattern as the insert search phase:

Compute the bucket index from the key's hash
Traverse the chain at that bucket
Compare each node's key to the search key
Return the value if found, or indicate not found

lookup-operation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
class ChainingHashTable<K, V> {
    // ... previous code ...
    
    /**
     * Retrieves the value associated with a key.
     * Returns undefined if the key doesn't exist.
     */
    get(key: K): V | undefined {
        const index = this.getBucketIndex(key);
        
        // Traverse the chain
        let current = this.buckets[index];
        while (current !== null) {
            if (this.keysEqual(current.key, key)) {
                return current.value;
            }
            current = current.next;
        }
        
        // Key not found
        return undefined;
    }
    
    /**
     * Checks if a key exists in the table.
     */
    has(key: K): boolean {
        const index = this.getBucketIndex(key);
        
        let current = this.buckets[index];
        while (current !== null) {
            if (this.keysEqual(current.key, key)) {
                return true;
            }
            current = current.next;
        }
        
        return false;
    }
    
    /**
     * Returns the value for a key, or a default value if not found.
     * Useful for avoiding undefined checks.
     */
    getOrDefault(key: K, defaultValue: V): V {
        const value = this.get(key);
        return value !== undefined ? value : defaultValue;
    }
}

Time Complexity Analysis:

Best case: O(1) — The key is the first node in the chain
Average case: O(1 + α) — Where α is the load factor (average chain length)
Worst case: O(n) — All keys hash to the same bucket (degenerate case)

With a good hash function and reasonable load factor (< 0.75), chains are very short (typically 0-2 elements), making lookups effectively O(1).

The Critical Importance of Key Comparison

Notice that we ALWAYS compare the actual keys, not just rely on the hash. Two different keys might hash to the same index, so we must verify identity by comparing keys directly. This is why hash tables require both a hash function AND an equality function.

The Delete Operation

Deletion removes an entry from the hash table. With chaining, this is standard linked list deletion:

Compute the bucket index
Find the node to delete (keeping track of the previous node)
Update the previous node's next pointer to skip the deleted node
Handle the special case when deleting the head node

delete-operation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
class ChainingHashTable<K, V> {
    // ... previous code ...
    
    /**
     * Removes an entry from the table.
     * Returns the removed value, or undefined if key didn't exist.
     */
    delete(key: K): V | undefined {
        const index = this.getBucketIndex(key);
        
        let current = this.buckets[index];
        let previous: ChainNode<K, V> | null = null;
        
        // Search for the key
        while (current !== null) {
            if (this.keysEqual(current.key, key)) {
                // Found the node to delete
                const removedValue = current.value;
                
                if (previous === null) {
                    // Deleting the head node
                    this.buckets[index] = current.next;
                } else {
                    // Deleting a middle or tail node
                    previous.next = current.next;
                }
                
                this.size--;
                return removedValue;
            }
            
            previous = current;
            current = current.next;
        }
        
        // Key not found
        return undefined;
    }
    
    /**
     * Removes all entries from the table.
     */
    clear(): void {
        // Simply reset all buckets to null
        this.buckets = new Array(this.buckets.length).fill(null);
        this.size = 0;
    }
}

Chaining Makes Deletion Simple

Unlike open addressing (covered in the next module), chaining doesn't require 'tombstone' markers for deleted entries. Once we remove a node from the chain, that slot is simply gone. This simplicity is one of chaining's key advantages.

Visual Trace of Deletion:

Before deletion of "cherry" at index 3:

[fig:300] → [cherry:200] → [apple:100]

After deletion:

[fig:300] → [apple:100]

The chain is simply re-linked around the removed node.

Complete Implementation

Let's bring together all the pieces into a complete, production-ready hash table implementation with chaining. This includes additional utility methods that make the structure more usable:

complete-chaining-hash-table
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
class ChainNode<K, V> {
    constructor(
        public key: K,
        public value: V,
        public next: ChainNode<K, V> | null = null
    ) {}
}
 
class ChainingHashTable<K, V> {
    private buckets: (ChainNode<K, V> | null)[];
    private size: number = 0;
    private readonly loadFactorThreshold: number;
 
    constructor(initialCapacity: number = 16, loadFactor: number = 0.75) {
        this.buckets = new Array(initialCapacity).fill(null);
        this.loadFactorThreshold = loadFactor;
    }
 
    get capacity(): number { return this.buckets.length; }
    get count(): number { return this.size; }
    get loadFactor(): number { return this.size / this.buckets.length; }
 
    private hashCode(key: K): number {
        const keyStr = String(key);
        let hash = 0;
        for (let i = 0; i < keyStr.length; i++) {
            hash = (hash * 31 + keyStr.charCodeAt(i)) | 0;
        }
        return hash;
    }
 
    private getBucketIndex(key: K): number {
        return (this.hashCode(key) & 0x7FFFFFFF) % this.buckets.length;
    }
 
    put(key: K, value: V): V | undefined {
        const index = this.getBucketIndex(key);
        let current = this.buckets[index];
        
        while (current !== null) {
            if (current.key === key) {
                const oldValue = current.value;
                current.value = value;
                return oldValue;
            }
            current = current.next;
        }
        
        const newNode = new ChainNode(key, value, this.buckets[index]);
        this.buckets[index] = newNode;
        this.size++;
        
        if (this.loadFactor > this.loadFactorThreshold) {
            this.resize(this.buckets.length * 2);
        }
        return undefined;
    }
 
    get(key: K): V | undefined {
        const index = this.getBucketIndex(key);
        let current = this.buckets[index];
        
        while (current !== null) {
            if (current.key === key) return current.value;
            current = current.next;
        }
        return undefined;
    }
 
    has(key: K): boolean {
        return this.get(key) !== undefined;
    }
 
    delete(key: K): V | undefined {
        const index = this.getBucketIndex(key);
        let current = this.buckets[index];
        let previous: ChainNode<K, V> | null = null;
        
        while (current !== null) {
            if (current.key === key) {
                const removedValue = current.value;
                if (previous === null) {
                    this.buckets[index] = current.next;
                } else {
                    previous.next = current.next;
                }
                this.size--;
                return removedValue;
            }
            previous = current;
            current = current.next;
        }
        return undefined;
    }
 
    private resize(newCapacity: number): void {
        const oldBuckets = this.buckets;
        this.buckets = new Array(newCapacity).fill(null);
        this.size = 0;
        
        for (const head of oldBuckets) {
            let current = head;
            while (current !== null) {
                this.put(current.key, current.value);
                current = current.next;
            }
        }
    }
 
    *keys(): Generator<K> {
        for (const head of this.buckets) {
            let current = head;
            while (current !== null) {
                yield current.key;
                current = current.next;
            }
        }
    }
 
    *values(): Generator<V> {
        for (const head of this.buckets) {
            let current = head;
            while (current !== null) {
                yield current.value;
                current = current.next;
            }
        }
    }
 
    *entries(): Generator<[K, V]> {
        for (const head of this.buckets) {
            let current = head;
            while (current !== null) {
                yield [current.key, current.value];
                current = current.next;
            }
        }
    }
}

Production-Ready Features

This implementation includes automatic resizing, iterators for keys/values/entries, load factor tracking, and proper handling of edge cases. It's suitable for real-world use (though language built-ins are typically more optimized).

Summary: Chaining with Linked Lists

Key Takeaways

•Chaining converts slots to buckets — Each array index holds a linked list of entries, not just one entry.
•Collisions are handled gracefully — Multiple keys at the same index coexist in the chain.
•Operations traverse the chain — Insert, lookup, and delete all search through the bucket's linked list.
•Head insertion is O(1) — New nodes go at the front of the chain for constant-time insertion.
•Key comparison is essential — We must verify the actual key, not just trust the hash index.
•Deletion is simple — No tombstones needed; just re-link the chain around the removed node.

What's Next:

Now that we understand how chaining works mechanically, we'll examine its tradeoffs in the next page. What are the advantages and disadvantages of chaining compared to alternatives? When is chaining the right choice, and when might you prefer open addressing?

Page Complete

You now understand the complete mechanics of chaining—from data structure design through all core operations. You can implement a hash table with chaining from scratch and trace through its operations step by step.