Insertion and lookup receive most of the attention in hash table discussions, but deletion harbors subtle complexities that can fundamentally shape implementation choices and long-term performance. While deletion shares the same O(1) average / O(n) worst-case complexity as insertion and lookup, the mechanisms differ significantly between collision resolution strategies.
With chaining, deletion is straightforward—remove a node from a linked list. With open addressing, deletion threatens to break the very probe sequences that make lookups work, requiring tombstones, which bring their own degradation patterns.
This page provides an exhaustive examination of deletion across both collision strategies, analyzing the trade-offs, performance implications, and engineering decisions that make deletion a hidden source of complexity in hash table implementations.
By the end of this page, you will understand:

- Why deletion is fundamentally different between chaining and open addressing
- The tombstone mechanism and why it's necessary
- How tombstone accumulation degrades performance
- Strategies for managing tombstone overhead
- When deletion patterns should influence hash table choice
With separate chaining, deletion is conceptually straightforward: find the key in its bucket's linked list, remove the node. No special considerations are needed—removing a node from a linked list doesn't affect any other element's accessibility.
The Deletion Algorithm (Chaining):

1. Compute the hash of the key.
2. Map the hash to a bucket index (hash mod capacity).
3. Access the bucket's chain head.
4. Traverse the chain, comparing keys until the target is found.
5. Unlink the node (updating the head or the predecessor's next pointer) and decrement the size.

Every step is O(1) except step 4, which is O(k) where k is the bucket's chain length. Under normal conditions with good hash distribution, k averages to the load factor α, giving O(1) average-case deletion.
```python
from typing import Optional, List, Tuple, TypeVar

K = TypeVar('K')
V = TypeVar('V')


class Node:
    """Linked list node for chaining."""
    def __init__(self, key: K, value: V):
        self.key = key
        self.value = value
        self.next: Optional['Node'] = None


class ChainingHashTable:
    """
    Hash table with separate chaining demonstrating straightforward deletion.
    """
    def __init__(self, capacity: int = 16):
        self.capacity = capacity
        self.buckets: List[Optional[Node]] = [None] * capacity
        self.size = 0
        # Deletion tracking
        self.deletion_comparisons = 0

    def _hash(self, key: K) -> int:
        return hash(key)

    def _bucket_index(self, key: K) -> int:
        return self._hash(key) % self.capacity

    def insert(self, key: K, value: V):
        """Insert a key-value pair."""
        idx = self._bucket_index(key)
        # Check if key exists
        current = self.buckets[idx]
        while current:
            if current.key == key:
                current.value = value  # Update
                return
            current = current.next
        # Insert at head of chain
        new_node = Node(key, value)
        new_node.next = self.buckets[idx]
        self.buckets[idx] = new_node
        self.size += 1

    def delete(self, key: K) -> Tuple[bool, int]:
        """
        Delete a key from the hash table.

        Returns:
        - success: True if key was found and deleted
        - comparisons: Number of key comparisons performed

        Time Complexity:
        - Average case: O(1) when load factor is bounded
        - Worst case: O(n) when all elements in one bucket
        """
        idx = self._bucket_index(key)
        comparisons = 0
        current = self.buckets[idx]
        previous = None
        while current:
            comparisons += 1
            self.deletion_comparisons += 1
            if current.key == key:
                # Found the key - remove the node
                if previous is None:
                    # Removing head of chain
                    self.buckets[idx] = current.next
                else:
                    # Removing middle/end of chain
                    previous.next = current.next
                self.size -= 1
                return (True, comparisons)
            previous = current
            current = current.next
        # Key not found
        return (False, comparisons)

    def get_chain_lengths(self) -> List[int]:
        """Get the length of each bucket's chain."""
        lengths = []
        for bucket in self.buckets:
            length = 0
            current = bucket
            while current:
                length += 1
                current = current.next
            lengths.append(length)
        return lengths


def demonstrate_chaining_deletion():
    """Show chaining deletion behavior."""
    print("Chaining Deletion Demonstration")
    print("=" * 60)
    import random
    # Create table with known capacity
    table = ChainingHashTable(capacity=100)
    # Insert elements
    keys = [f"key_{i}" for i in range(500)]
    for key in keys:
        table.insert(key, f"value_for_{key}")
    print(f"Inserted {table.size} elements into {table.capacity} buckets")
    print(f"Load factor: {table.size / table.capacity:.2f}")
    chain_lengths = table.get_chain_lengths()
    print(f"Avg chain length: {sum(chain_lengths)/len(chain_lengths):.2f}")
    print(f"Max chain length: {max(chain_lengths)}")
    print()
    # Delete random subset
    to_delete = random.sample(keys, 100)
    table.deletion_comparisons = 0
    successful_deletions = 0
    for key in to_delete:
        success, comps = table.delete(key)
        if success:
            successful_deletions += 1
    print(f"Deletion results:")
    print(f"  Attempted: {len(to_delete)}")
    print(f"  Successful: {successful_deletions}")
    print(f"  Total comparisons: {table.deletion_comparisons}")
    print(f"  Avg comparisons: {table.deletion_comparisons / len(to_delete):.2f}")
    print()
    # Try deleting already-deleted keys (should fail fast)
    table.deletion_comparisons = 0
    for key in to_delete[:20]:
        success, comps = table.delete(key)
        # These will fail since we already deleted them
    print(f"Re-deleting (already deleted keys):")
    print(f"  Total comparisons: {table.deletion_comparisons}")
    print(f"  Note: Empty buckets have 0 comparisons, non-empty must scan to confirm absence")


demonstrate_chaining_deletion()
```

With chaining, deleting an element has no side effects on other elements. Each key's probe 'path' is simply the linked list in its bucket—removing one node doesn't affect how other nodes are found. This isolation is chaining's major advantage for workloads with frequent deletions.
With open addressing, deletion introduces a fundamental challenge that doesn't exist with chaining: removing an element can break probe sequences for other elements.
Consider this scenario with linear probing: keys A, B, and C all hash to slot 5. A takes slot 5, B is displaced to slot 6, and C is displaced to slot 7.

Now the probe chain for C is: 5 → 6 → 7. If we naively delete B (slot 6) by marking it empty, a lookup for C probes slot 5 (occupied by A, not C), reaches slot 6, finds it empty, and stops, incorrectly concluding that C is absent.

The deletion broke C's lookup. The empty slot created a false termination point in C's probe sequence. This is the core problem that tombstones solve.
```python
class BrokenOpenAddressing:
    """
    Demonstrates how naive deletion breaks lookups in open addressing.
    THIS IS INTENTIONALLY BROKEN to show the problem.
    """
    EMPTY = None

    def __init__(self, capacity=10):
        self.capacity = capacity
        self.slots = [self.EMPTY] * capacity

    def _hash(self, key) -> int:
        # For demonstration, use a simple hash that causes predictable collisions
        if isinstance(key, str) and key.startswith("collision_"):
            return 5  # All "collision_*" keys hash to slot 5
        return hash(key) % self.capacity

    def insert(self, key, value):
        """Insert using linear probing."""
        start_idx = self._hash(key) % self.capacity
        for offset in range(self.capacity):
            idx = (start_idx + offset) % self.capacity
            if self.slots[idx] is self.EMPTY:
                self.slots[idx] = (key, value)
                return idx  # Return where we inserted
            if self.slots[idx][0] == key:
                self.slots[idx] = (key, value)
                return idx
        raise RuntimeError("Table full")

    def naive_delete(self, key) -> bool:
        """
        BROKEN: Naive deletion that just empties the slot.
        This will break lookups for keys that probed past this slot.
        """
        start_idx = self._hash(key) % self.capacity
        for offset in range(self.capacity):
            idx = (start_idx + offset) % self.capacity
            if self.slots[idx] is self.EMPTY:
                return False  # Not found
            if self.slots[idx][0] == key:
                self.slots[idx] = self.EMPTY  # PROBLEM: Creates hole
                return True
        return False

    def lookup(self, key):
        """Lookup that stops at empty slots."""
        start_idx = self._hash(key) % self.capacity
        for offset in range(self.capacity):
            idx = (start_idx + offset) % self.capacity
            if self.slots[idx] is self.EMPTY:
                return (False, None)  # Stops at empty
            if self.slots[idx][0] == key:
                return (True, self.slots[idx][1])
        return (False, None)

    def __str__(self):
        result = []
        for i, slot in enumerate(self.slots):
            if slot is self.EMPTY:
                result.append(f"[{i}]: empty")
            else:
                result.append(f"[{i}]: {slot[0]}")
        return "\n".join(result)


def demonstrate_broken_deletion():
    """Show how naive deletion breaks lookups."""
    print("Demonstrating Why Naive Deletion Breaks Open Addressing")
    print("=" * 60)
    table = BrokenOpenAddressing(capacity=10)
    # Insert three keys that all hash to slot 5
    keys = ["collision_A", "collision_B", "collision_C"]
    print("Step 1: Insert three keys that all hash to slot 5")
    print("-" * 40)
    for key in keys:
        slot = table.insert(key, f"value_{key}")
        print(f"  Inserted '{key}' at slot {slot}")
    print()
    print("Table state after insertions:")
    print(table)
    print()
    # Verify all keys can be found
    print("Step 2: Verify all keys can be found")
    print("-" * 40)
    for key in keys:
        found, value = table.lookup(key)
        print(f"  Lookup '{key}': found={found}")
    print()
    # Now naively delete the middle key
    print("Step 3: Naively delete 'collision_B' (middle of chain)")
    print("-" * 40)
    table.naive_delete("collision_B")
    print("  Deleted!")
    print()
    print("Table state after deletion:")
    print(table)
    print()
    # Try to find collision_C - IT WILL FAIL
    print("Step 4: Try to lookup 'collision_C'")
    print("-" * 40)
    found, value = table.lookup("collision_C")
    print(f"  Lookup 'collision_C': found={found}")
    print()
    print("  PROBLEM: collision_C IS in the table (slot 7), but lookup")
    print("  stopped at slot 6 (now empty) and concluded it's absent!")
    print()
    print("  This is why we need TOMBSTONES instead of true deletion.")


demonstrate_broken_deletion()
```

The fundamental issue is that open addressing uses the absence of data (empty slots) as a termination signal for probe sequences. Deleting an element by creating an empty slot introduces false termination points that break lookups for elements that were displaced past the deleted slot. This makes naive deletion impossible—we must use a different mechanism.
The standard solution to open addressing deletion is the tombstone (also called a deleted marker or lazy deletion). Instead of truly emptying a slot, we mark it with a special value that says:
"This slot was occupied but is now deleted. Keep probing past me during lookups—I'm not a terminus. But you can reuse me during insertions—I'm available."
Tombstones maintain the integrity of probe sequences while allowing slot reuse. The three slot states are:

- EMPTY: never occupied; probing can safely stop here.
- OCCUPIED: holds a live key-value pair.
- DELETED: a tombstone; lookups must probe past it, but insertions may reuse it.
```python
from typing import Optional, Tuple
from enum import Enum
from dataclasses import dataclass


class SlotState(Enum):
    EMPTY = "EMPTY"        # Never used
    OCCUPIED = "OCCUPIED"  # Contains data
    DELETED = "DELETED"    # Tombstone


@dataclass
class Slot:
    state: SlotState
    key: Optional[str] = None
    value: Optional[str] = None


class TombstoneHashTable:
    """
    Open addressing hash table with proper tombstone-based deletion.
    """
    def __init__(self, capacity: int = 16):
        self.capacity = capacity
        self.slots = [Slot(SlotState.EMPTY) for _ in range(capacity)]
        self.size = 0             # Active elements
        self.tombstone_count = 0  # Deleted markers

    def _hash(self, key) -> int:
        return hash(key)

    def _probe_sequence(self, key):
        """Generate probe indices using linear probing."""
        start = self._hash(key) % self.capacity
        for offset in range(self.capacity):
            yield (start + offset) % self.capacity

    def insert(self, key, value) -> Tuple[int, int]:
        """Insert with tombstone reuse. Returns (slot_index, probes)."""
        probes = 0
        tombstone_slot = None  # Remember first tombstone for reuse
        for idx in self._probe_sequence(key):
            probes += 1
            slot = self.slots[idx]
            if slot.state == SlotState.EMPTY:
                # Truly empty - use this slot (or earlier tombstone)
                target_idx = tombstone_slot if tombstone_slot is not None else idx
                if tombstone_slot is not None:
                    self.tombstone_count -= 1  # Reusing tombstone
                self.slots[target_idx] = Slot(SlotState.OCCUPIED, key, value)
                self.size += 1
                return (target_idx, probes)
            if slot.state == SlotState.DELETED:
                # Remember first tombstone for potential reuse
                if tombstone_slot is None:
                    tombstone_slot = idx
                continue  # Keep probing - key might exist later
            if slot.key == key:
                # Key exists - update value
                self.slots[idx] = Slot(SlotState.OCCUPIED, key, value)
                return (idx, probes)
        if tombstone_slot is not None:
            # No EMPTY slot found, but a tombstone is still reusable
            self.tombstone_count -= 1
            self.slots[tombstone_slot] = Slot(SlotState.OCCUPIED, key, value)
            self.size += 1
            return (tombstone_slot, probes)
        raise RuntimeError("Table is full")

    def delete(self, key) -> Tuple[bool, int]:
        """
        Delete using tombstone.

        Returns:
        - success: True if key was found and deleted
        - probes: Number of slots examined

        Time Complexity:
        - Average: O(1) with good hash distribution
        - Worst: O(n) when heavily clustered
        """
        probes = 0
        for idx in self._probe_sequence(key):
            probes += 1
            slot = self.slots[idx]
            if slot.state == SlotState.EMPTY:
                # Never occupied - key definitely absent
                return (False, probes)
            if slot.state == SlotState.DELETED:
                # Was occupied - keep probing
                continue
            if slot.key == key:
                # Found - replace with tombstone
                self.slots[idx] = Slot(SlotState.DELETED)
                self.size -= 1
                self.tombstone_count += 1
                return (True, probes)
        return (False, probes)

    def lookup(self, key) -> Tuple[bool, Optional[str], int]:
        """Lookup that properly handles tombstones. Returns (found, value, probes)."""
        probes = 0
        for idx in self._probe_sequence(key):
            probes += 1
            slot = self.slots[idx]
            if slot.state == SlotState.EMPTY:
                # Never occupied - key absent
                return (False, None, probes)
            if slot.state == SlotState.DELETED:
                # Was occupied - keep probing
                continue
            if slot.key == key:
                return (True, slot.value, probes)
        return (False, None, probes)

    def effective_load_factor(self) -> float:
        """Load factor including tombstones (affects probing cost)."""
        return (self.size + self.tombstone_count) / self.capacity

    def true_load_factor(self) -> float:
        """Load factor counting only live elements."""
        return self.size / self.capacity

    def __str__(self):
        lines = []
        for i, slot in enumerate(self.slots):
            if slot.state == SlotState.EMPTY:
                lines.append(f"[{i}]: EMPTY")
            elif slot.state == SlotState.DELETED:
                lines.append(f"[{i}]: DELETED (tombstone)")
            else:
                lines.append(f"[{i}]: {slot.key}")
        return "\n".join(lines)


def demonstrate_tombstones():
    """Show correct tombstone behavior."""
    print("Tombstone-Based Deletion Demonstration")
    print("=" * 60)
    table = TombstoneHashTable(capacity=10)
    # Use a hash function that creates collisions for demo
    original_hash = table._hash

    def predictable_hash(key):
        if isinstance(key, str) and key.startswith("collision_"):
            return 5  # Force all collision_* keys to hash to slot 5
        return original_hash(key)

    table._hash = predictable_hash
    # Insert three colliding keys
    keys = ["collision_A", "collision_B", "collision_C"]
    print("Step 1: Insert three keys that all hash to slot 5")
    print("-" * 40)
    for key in keys:
        slot, probes = table.insert(key, f"value_{key}")
        print(f"  Inserted '{key}' at slot {slot} ({probes} probes)")
    print()
    print("Table state after insertions:")
    print(table)
    print()
    # Delete the middle key
    print("Step 2: Delete 'collision_B' using tombstone")
    print("-" * 40)
    success, probes = table.delete("collision_B")
    print(f"  Deleted: {success} ({probes} probes)")
    print(f"  Tombstones now: {table.tombstone_count}")
    print()
    print("Table state after deletion:")
    print(table)
    print()
    # Verify collision_C is still findable
    print("Step 3: Verify 'collision_C' is still accessible")
    print("-" * 40)
    found, value, probes = table.lookup("collision_C")
    print(f"  Lookup 'collision_C': found={found}, probes={probes}")
    print("  SUCCESS: Tombstone allowed probing to continue past deleted slot!")
    print()
    # Insert reuses tombstone
    print("Step 4: Insert 'collision_D' - should reuse tombstone")
    print("-" * 40)
    slot, probes = table.insert("collision_D", "value_D")
    print(f"  Inserted 'collision_D' at slot {slot} ({probes} probes)")
    print(f"  Tombstones now: {table.tombstone_count}")
    print()
    print("Table state after insertion:")
    print(table)


demonstrate_tombstones()
```

EMPTY means 'never occupied'—safe to stop probing. DELETED means 'was occupied'—must keep probing. This distinction is what makes tombstones work. Insertions can treat DELETED as available (since we can overwrite), but lookups must treat it as 'occupied by a ghost' and continue searching.
Tombstones solve the correctness problem but introduce their own performance challenge: tombstone accumulation. In workloads with many deletions, tombstones build up, increasing probe lengths even when the live element count stays low.
The Problem:

Consider a table with capacity 1000: insert 750 elements, then delete 500 of them. The table now holds only 250 live elements, but 500 slots carry tombstones.

Even though only 250 elements exist, probing must traverse tombstones as if they were occupied. The 'effective' load—what matters for probe length—includes both live elements and tombstones.
This creates a paradox: a table can appear lightly loaded while performing like a nearly-full table.
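As a quick arithmetic check, here is a minimal sketch of that paradox using the effective-load definition above and the numbers from the scenario (the same figures the demo below reproduces empirically):

```python
# Effective load counts tombstones as occupied slots.
capacity, live, tombstones = 1000, 250, 500
true_load = live / capacity                      # 0.25 -> looks lightly loaded
effective_load = (live + tombstones) / capacity  # 0.75 -> probes as if nearly full
print(f"true load: {true_load:.0%}, effective load: {effective_load:.0%}")
```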
```python
import random


def demonstrate_tombstone_impact():
    """
    Show how tombstone accumulation degrades performance
    even when live element count is low.
    """
    class MeasuredHashTable:
        EMPTY = "E"
        DELETED = "D"

        def __init__(self, capacity):
            self.capacity = capacity
            self.slots = [self.EMPTY] * capacity
            self.size = 0
            self.tombstones = 0
            self.total_probes = 0
            self.operations = 0

        def insert(self, key, value):
            self.operations += 1
            probes = 0
            start = hash(key) % self.capacity
            first_tombstone = None
            for offset in range(self.capacity):
                idx = (start + offset) % self.capacity
                probes += 1
                slot = self.slots[idx]
                if slot == self.EMPTY:
                    target = first_tombstone if first_tombstone is not None else idx
                    if first_tombstone is not None:
                        self.tombstones -= 1
                    self.slots[target] = (key, value)
                    self.size += 1
                    self.total_probes += probes
                    return probes
                if slot == self.DELETED:
                    if first_tombstone is None:
                        first_tombstone = idx
                    continue
                if slot[0] == key:
                    self.slots[idx] = (key, value)
                    self.total_probes += probes
                    return probes
            raise RuntimeError("Full")

        def delete(self, key):
            self.operations += 1
            probes = 0
            start = hash(key) % self.capacity
            for offset in range(self.capacity):
                idx = (start + offset) % self.capacity
                probes += 1
                slot = self.slots[idx]
                if slot == self.EMPTY:
                    self.total_probes += probes
                    return False
                if slot == self.DELETED:
                    continue
                if slot[0] == key:
                    self.slots[idx] = self.DELETED
                    self.size -= 1
                    self.tombstones += 1
                    self.total_probes += probes
                    return True
            self.total_probes += probes
            return False

        def lookup(self, key):
            self.operations += 1
            probes = 0
            start = hash(key) % self.capacity
            for offset in range(self.capacity):
                idx = (start + offset) % self.capacity
                probes += 1
                slot = self.slots[idx]
                if slot == self.EMPTY:
                    self.total_probes += probes
                    return (False, probes)
                if slot == self.DELETED:
                    continue
                if slot[0] == key:
                    self.total_probes += probes
                    return (True, probes)
            self.total_probes += probes
            return (False, probes)

        def avg_probes(self):
            return self.total_probes / self.operations if self.operations > 0 else 0

    print("Tombstone Accumulation Impact Analysis")
    print("=" * 70)
    print()
    capacity = 1000
    # Scenario 1: Fresh table with 250 elements (25% load)
    print("Scenario 1: Fresh table with 250 elements")
    print("-" * 50)
    table1 = MeasuredHashTable(capacity)
    keys = [f"key_{i}" for i in range(250)]
    for key in keys:
        table1.insert(key, "value")
    table1.total_probes = 0
    table1.operations = 0
    for key in keys:
        table1.lookup(key)
    print(f"  Live elements: {table1.size}")
    print(f"  Tombstones: {table1.tombstones}")
    print(f"  True load: {table1.size / capacity:.0%}")
    print(f"  Effective load: {(table1.size + table1.tombstones) / capacity:.0%}")
    print(f"  Avg probes per lookup: {table1.avg_probes():.2f}")
    print()
    # Scenario 2: 750 inserts, 500 deletes = 250 live + 500 tombstones
    print("Scenario 2: Same 250 elements, but 500 tombstones from deletions")
    print("-" * 50)
    table2 = MeasuredHashTable(capacity)
    all_keys = [f"key_{i}" for i in range(750)]
    for key in all_keys:
        table2.insert(key, "value")
    # Delete 500 of them
    to_delete = random.sample(all_keys, 500)
    for key in to_delete:
        table2.delete(key)
    remaining_keys = [k for k in all_keys if k not in to_delete]
    table2.total_probes = 0
    table2.operations = 0
    for key in remaining_keys:
        table2.lookup(key)
    print(f"  Live elements: {table2.size}")
    print(f"  Tombstones: {table2.tombstones}")
    print(f"  True load: {table2.size / capacity:.0%}")
    print(f"  Effective load: {(table2.size + table2.tombstones) / capacity:.0%}")
    print(f"  Avg probes per lookup: {table2.avg_probes():.2f}")
    print()
    print("Comparison:")
    print(f"  Both have ~250 live elements")
    print(f"  Tombstone-laden table requires "
          f"{table2.avg_probes() / table1.avg_probes():.1f}x more probes!")
    print()
    print("Conclusion: Tombstones count toward probing cost even though")
    print("they hold no data. Periodic cleanup is essential.")


demonstrate_tombstone_impact()
```

| True Load | Tombstones | Effective Load | Expected Probe Increase |
|---|---|---|---|
| 25% | 0% | 25% | 1.0x (baseline) |
| 25% | 25% | 50% | ~1.5x |
| 25% | 50% | 75% | ~3.0x |
| 25% | 65% | 90% | ~6.0x |
| 10% | 80% | 90% | ~6.0x (extremely degraded) |
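These multipliers are rough. For an analytic sanity check, the classic uniform-hashing approximations for linear probing (due to Knuth) estimate expected probe counts from the effective load. A minimal sketch, assuming those approximations hold for your hash function; the exact ratios depend on whether lookups succeed, so this shows the trend rather than reproducing the table exactly:

```python
# Knuth's uniform-hashing approximations for linear probing.
def successful(alpha: float) -> float:
    # Expected probes for a successful search: (1/2) * (1 + 1/(1 - alpha))
    return 0.5 * (1 + 1 / (1 - alpha))

def unsuccessful(alpha: float) -> float:
    # Expected probes for an unsuccessful search: (1/2) * (1 + 1/(1 - alpha)**2)
    return 0.5 * (1 + 1 / (1 - alpha) ** 2)

baseline = successful(0.25)  # the table's 25% effective-load baseline
for eff in (0.25, 0.50, 0.75, 0.90):
    print(f"effective load {eff:.0%}: "
          f"successful ~{successful(eff):.2f} probes "
          f"({successful(eff) / baseline:.1f}x baseline), "
          f"unsuccessful ~{unsuccessful(eff):.2f} probes")
```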
For workloads with high deletion rates—caches with TTL eviction, session stores, temporary data—tombstone accumulation is a real threat. Either implement periodic compaction, track the tombstone ratio and trigger rebuilds when it crosses a threshold, or consider chaining instead.
To prevent tombstone-induced degradation, implementations employ various management strategies. The choice depends on workload characteristics and performance requirements.
```python
class ManagedHashTable:
    """
    Hash table with automatic tombstone management through rebuilding.
    """
    EMPTY = "EMPTY"
    DELETED = "DELETED"

    def __init__(self, capacity=16, tombstone_threshold=0.25):
        self.capacity = capacity
        self.slots = [self.EMPTY] * capacity
        self.size = 0
        self.tombstone_count = 0
        self.tombstone_threshold = tombstone_threshold  # Max tombstone ratio
        self.rebuild_count = 0

    def _needs_rebuild(self) -> bool:
        """Determine if tombstone cleanup is needed."""
        if self.capacity == 0:
            return False
        tombstone_ratio = self.tombstone_count / self.capacity
        return tombstone_ratio > self.tombstone_threshold

    def _rebuild(self):
        """
        Rebuild table without tombstones.
        Time: O(n) where n is current size.
        """
        self.rebuild_count += 1
        old_slots = self.slots
        # Potentially resize based on current size
        # Here we keep same capacity, just remove tombstones
        self.slots = [self.EMPTY] * self.capacity
        self.size = 0
        self.tombstone_count = 0
        for slot in old_slots:
            if slot not in (self.EMPTY, self.DELETED):
                key, value = slot
                self._insert_no_rebuild(key, value)

    def _insert_no_rebuild(self, key, value):
        """Insert without triggering rebuild (used during rebuild)."""
        start = hash(key) % self.capacity
        for offset in range(self.capacity):
            idx = (start + offset) % self.capacity
            slot = self.slots[idx]
            if slot in (self.EMPTY, self.DELETED):
                if slot == self.DELETED:
                    self.tombstone_count -= 1  # Reusing a tombstone slot
                self.slots[idx] = (key, value)
                self.size += 1
                return
            if slot[0] == key:
                self.slots[idx] = (key, value)
                return

    def insert(self, key, value):
        """Insert with automatic tombstone management."""
        # Check load factor for resize (separate from tombstone cleanup)
        if self.size / self.capacity > 0.7:
            self._resize()
        self._insert_no_rebuild(key, value)

    def delete(self, key) -> bool:
        """Delete with automatic tombstone cleanup."""
        start = hash(key) % self.capacity
        for offset in range(self.capacity):
            idx = (start + offset) % self.capacity
            slot = self.slots[idx]
            if slot == self.EMPTY:
                return False
            if slot == self.DELETED:
                continue
            if slot[0] == key:
                self.slots[idx] = self.DELETED
                self.size -= 1
                self.tombstone_count += 1
                # Check if cleanup needed
                if self._needs_rebuild():
                    self._rebuild()
                return True
        return False

    def _resize(self):
        """Double capacity (implicitly removes tombstones)."""
        old_slots = self.slots
        self.capacity *= 2
        self.slots = [self.EMPTY] * self.capacity
        self.size = 0
        self.tombstone_count = 0
        for slot in old_slots:
            if slot not in (self.EMPTY, self.DELETED):
                self._insert_no_rebuild(slot[0], slot[1])

    def stats(self) -> dict:
        return {
            "size": self.size,
            "capacity": self.capacity,
            "tombstones": self.tombstone_count,
            "rebuilds": self.rebuild_count,
            "true_load": f"{self.size / self.capacity:.1%}",
            "effective_load": f"{(self.size + self.tombstone_count) / self.capacity:.1%}",
        }


def demonstrate_managed_table():
    """Show automatic tombstone management."""
    import random
    print("Automatic Tombstone Management Demonstration")
    print("=" * 60)
    table = ManagedHashTable(capacity=100, tombstone_threshold=0.25)
    # Simulate heavy deletion workload
    print("Simulating workload: insert 50, delete 40, repeat 10 times")
    print("-" * 50)
    all_keys = []
    for round_num in range(10):
        # Insert 50
        new_keys = [f"key_{round_num}_{i}" for i in range(50)]
        for key in new_keys:
            table.insert(key, "value")
        all_keys.extend(new_keys)
        # Delete 40 random existing keys
        to_delete = random.sample(all_keys, min(40, len(all_keys)))
        for key in to_delete:
            table.delete(key)
        all_keys = [k for k in all_keys if k not in to_delete]
        stats = table.stats()
        print(f"Round {round_num + 1}: {stats}")
    print()
    print(f"Final state: {table.stats()}")
    print()
    print(f"Table was rebuilt {table.rebuild_count} times to manage tombstones")
    print(f"Without management, tombstones would have accumulated to ~400")


demonstrate_managed_table()
```

Let's consolidate the complexity analysis for deletion across both collision strategies:
| Aspect | Chaining | Open Addressing (Tombstones) |
|---|---|---|
| Average Case | O(1 + α) | O(1/(1-α)) |
| Worst Case | O(n) | O(n) |
| Space Overhead | Pointer per node | Tombstone slots |
| Effect on Other Keys | None | None (with tombstones) |
| Cleanup Needed | No | Yes (tombstone accumulation) |
| Implementation Complexity | Simple | Moderate (3 slot states) |
| Cache Performance | Poor (pointer chasing) | Good (contiguous) |
| High Deletion Rate | Handles well | Degrades without cleanup |
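To see why the average-case row matters so much for deletion-heavy workloads, here is a minimal sketch evaluating the two formulas from the table at several load factors (assuming that for open addressing α includes tombstones, since they extend probe sequences):

```python
# Average-case cost formulas from the table (alpha = load factor).
def chaining_avg(alpha: float) -> float:
    return 1 + alpha          # O(1 + alpha): degrades linearly

def open_addressing_avg(alpha: float) -> float:
    return 1 / (1 - alpha)    # O(1/(1 - alpha)): blows up near full

for alpha in (0.25, 0.50, 0.75, 0.90):
    print(f"alpha={alpha:.2f}: chaining ~{chaining_avg(alpha):.2f}, "
          f"open addressing ~{open_addressing_avg(alpha):.2f}")
```

The divergence as α grows is exactly why tombstone accumulation, which inflates α for open addressing but not for chaining, can dominate real-world deletion performance.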
Deletion is where hash table implementations diverge most significantly. The essential insights:

- With chaining, deletion is a local operation: unlinking a node never affects how other keys are found.
- With open addressing, a naively emptied slot falsely terminates other keys' probe sequences, so deletion must leave a tombstone.
- Tombstones preserve correctness but count toward the effective load, so deletion-heavy workloads degrade unless tombstones are cleaned up.
- Practical implementations track the tombstone ratio and rebuild (or resize) when it crosses a threshold.
- Heavy, sustained deletion patterns alone can justify choosing chaining over open addressing.
What's Next:
We've examined insert, lookup, and delete individually. The final piece of the puzzle is understanding when worst-case behavior actually matters in practice. The next page explores real-world scenarios where O(n) worst case transitions from theoretical concern to production crisis—and how to prevent it.
You now have a thorough understanding of hash table deletion. You can explain why open addressing requires tombstones, articulate the tombstone accumulation problem, describe management strategies, and make informed decisions about collision resolution based on deletion patterns.