If insertion is how hash tables receive data, lookup is how they earn their reputation. The ability to retrieve any element in constant time, no matter how large the collection grows, is the hash table's defining superpower. This O(1) average lookup is why hash tables underpin caches, symbol tables, database indexes, and countless other performance-critical systems.
Yet, like insertion, this O(1) is a probabilistic average, not a universal guarantee. Understanding when lookup truly achieves constant time, when it degrades to linear, and what factors govern this behavior is essential knowledge for any engineer building systems at scale.
This page provides a comprehensive examination of the lookup operation, from its mathematical foundations to its real-world performance characteristics. By the end, you'll understand exactly what makes hash table lookup special—and exactly what can break it.
By the end of this page, you will understand: the step-by-step mechanics of hash table lookup in both chaining and open addressing; the mathematical basis for O(1) average-case performance; how worst-case O(n) lookup manifests; the difference between successful and unsuccessful lookups; and how hash table lookup performance compares to alternative data structures.
The lookup operation answers a fundamental question: "Given a key, does this hash table contain it, and if so, what is its associated value?" The process mirrors insertion but terminates differently—instead of storing, we retrieve or report absence.
The Fundamental Lookup Algorithm:
1. Compute the hash of the key (O(1) for fixed-size keys).
2. Map the hash to a bucket index, typically hash % capacity (O(1)).
3. Access that bucket via array indexing (O(1)).
4. Search within the bucket for the key (O(k), where k is the number of elements in the bucket).
The critical observation: steps 1-3 are O(1), but step 4 depends entirely on the bucket's contents. An empty bucket or immediate match gives instant results. A long chain forces linear traversal.
```python
from typing import Optional, Tuple, TypeVar

K = TypeVar('K')
V = TypeVar('V')


class HashTableLookup:
    """
    Illustrates hash table lookup with detailed step-by-step analysis.
    Each step's complexity contribution is annotated.
    """

    def __init__(self, capacity: int = 16):
        self.capacity = capacity
        self.buckets = [[] for _ in range(capacity)]
        self.size = 0
        # Metrics for analysis
        self.comparisons = 0
        self.lookups_performed = 0

    def _hash(self, key: K) -> int:
        """
        Step 1: Compute hash value.
        Time: O(1) for primitive types, O(k) for k-length strings.
        Treated as O(1) for bounded-size keys.
        """
        return hash(key)

    def _get_bucket_index(self, hash_value: int) -> int:
        """
        Step 2: Reduce hash to bucket index.
        Time: O(1) - single modulo operation.
        """
        return hash_value % self.capacity

    def lookup(self, key: K) -> Tuple[bool, Optional[V], int]:
        """
        Complete lookup operation with detailed analysis.

        Returns:
            - found: Whether the key exists
            - value: The associated value (None if not found)
            - comparisons: Number of key comparisons performed
        """
        self.lookups_performed += 1
        comparisons_this_lookup = 0

        # Step 1: Compute hash - O(1)
        hash_value = self._hash(key)

        # Step 2: Compute bucket index - O(1)
        bucket_index = self._get_bucket_index(hash_value)

        # Step 3: Access bucket - O(1) array indexing
        bucket = self.buckets[bucket_index]

        # Step 4: Search within bucket - O(k) where k is bucket length
        # THIS IS WHERE COMPLEXITY VARIANCE OCCURS
        for existing_key, value in bucket:
            comparisons_this_lookup += 1
            self.comparisons += 1
            if existing_key == key:
                # FOUND - successful lookup
                return (True, value, comparisons_this_lookup)

        # NOT FOUND - unsuccessful lookup
        # We traversed entire bucket without match
        return (False, None, comparisons_this_lookup)

    def get(self, key: K, default: V = None) -> Optional[V]:
        """Pythonic interface: return value or default."""
        found, value, _ = self.lookup(key)
        return value if found else default

    def __contains__(self, key: K) -> bool:
        """Enable 'key in table' syntax."""
        found, _, _ = self.lookup(key)
        return found

    def insert(self, key: K, value: V):
        """Helper method to populate the table."""
        bucket_index = hash(key) % self.capacity
        bucket = self.buckets[bucket_index]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.size += 1

    def get_lookup_statistics(self) -> dict:
        """Return lookup performance statistics."""
        return {
            "total_lookups": self.lookups_performed,
            "total_comparisons": self.comparisons,
            "avg_comparisons": (self.comparisons / self.lookups_performed
                                if self.lookups_performed > 0 else 0),
            "elements": self.size,
            "buckets": self.capacity,
            "load_factor": self.size / self.capacity,
        }


# Demonstration
def demonstrate_lookup_mechanics():
    """Show lookup behavior under different conditions."""
    print("=== Hash Table Lookup Mechanics Demo ===\n")
    table = HashTableLookup(capacity=10)

    # Insert some elements
    test_data = [
        ("apple", "red"),
        ("banana", "yellow"),
        ("cherry", "red"),
        ("date", "brown"),
        ("elderberry", "purple"),
    ]
    for key, value in test_data:
        table.insert(key, value)

    # Perform lookups
    print("Performing lookups:")

    # Successful lookups
    for key in ["apple", "cherry", "elderberry"]:
        found, value, comparisons = table.lookup(key)
        print(f"  lookup('{key}'): found={found}, "
              f"value='{value}', comparisons={comparisons}")

    # Unsuccessful lookups
    for key in ["grape", "mango", "kiwi"]:
        found, value, comparisons = table.lookup(key)
        print(f"  lookup('{key}'): found={found}, comparisons={comparisons}")

    print()
    print("Statistics:", table.get_lookup_statistics())


demonstrate_lookup_mechanics()
```

The entire complexity story of hash table lookup hinges on step 4—searching within the bucket. Everything else is guaranteed O(1). When we say lookup is O(1) on average, we're really saying: 'on average, buckets contain O(1) elements.' When we say worst-case is O(n), we mean: 'in the worst case, one bucket contains everything.'
Lookup complexity analysis distinguishes between two fundamentally different scenarios: searching for a key that exists versus one that doesn't. This isn't merely academic—the distinction has real performance implications.
Successful Lookup (Key Exists):
When seeking a key that's present, we traverse the collision chain until we find a match. On average, the key is neither first nor last—it's somewhere in the middle of its chain.
Expected comparisons (successful) ≈ 1 + α/2
Where α = n/m is the load factor. We check about half the chain before finding our element.
Unsuccessful Lookup (Key Absent):
When seeking a key that doesn't exist, we must examine every element in the bucket before concluding it's absent. There's no early termination—absence can only be proven by exhaustive search.
Expected comparisons (unsuccessful) ≈ α
We traverse the entire average chain length. For high load factors, unsuccessful lookups are measurably slower than successful ones.
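These two approximations follow from a short expected-value argument; here is a sketch under SUHA, assuming chaining with new keys inserted at the head of each chain:

```latex
% Successful lookup: the i-th inserted key x shares its bucket, in expectation,
% with (n - i)/m keys inserted after it; with head insertion these precede x in
% the chain and are each examined before x is found. Averaging over all n keys:
\[
  \mathbb{E}[\text{comparisons} \mid \text{hit}]
    \;=\; 1 + \frac{1}{n}\sum_{i=1}^{n}\frac{n-i}{m}
    \;=\; 1 + \frac{n-1}{2m}
    \;\approx\; 1 + \frac{\alpha}{2}.
\]
% Unsuccessful lookup: the probed bucket is uniformly random, and its expected
% chain length is n/m, all of which must be examined before reporting absence:
\[
  \mathbb{E}[\text{comparisons} \mid \text{miss}] \;=\; \frac{n}{m} \;=\; \alpha.
\]
```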
| Load Factor (α) | Successful Lookup | Unsuccessful Lookup | Ratio (Fail/Success) |
|---|---|---|---|
| 0.25 | 1.125 comparisons | 0.25 comparisons | 0.22x (failures faster) |
| 0.50 | 1.25 comparisons | 0.50 comparisons | 0.40x (failures faster) |
| 0.75 | 1.375 comparisons | 0.75 comparisons | 0.55x (failures faster) |
| 1.00 | 1.50 comparisons | 1.00 comparison | 0.67x (approaching parity) |
| 2.00 | 2.00 comparisons | 2.00 comparisons | 1.00x (equal cost) |
| 5.00 | 3.50 comparisons | 5.00 comparisons | 1.43x (failures slower) |
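As a quick sanity check on the table, the theoretical chaining values can be regenerated in a few lines (a minimal sketch using the formulas above, not a benchmark):

```python
# Recompute the theoretical chaining costs shown in the table:
# successful ≈ 1 + α/2, unsuccessful ≈ α.
for alpha in [0.25, 0.50, 0.75, 1.00, 2.00, 5.00]:
    successful = 1 + alpha / 2
    unsuccessful = alpha
    print(f"α={alpha:4.2f}  successful={successful:.3f}  "
          f"unsuccessful={unsuccessful:.2f}  ratio={unsuccessful / successful:.2f}")
```

The ratio column reaches parity exactly where α = 1 + α/2, that is at α = 2, matching the table.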
The Surprising Inversion:
At low load factors, unsuccessful lookups are actually faster than successful ones on average. This counterintuitive result occurs because most buckets are empty when α is small: an unsuccessful lookup that hashes to an empty bucket finishes after zero comparisons, while a successful lookup always pays at least one comparison to confirm the match.
As load factor increases, this advantage diminishes, and above α = 1, unsuccessful lookups become slower because they traverse longer chains before concluding absence.
```python
import random
import time
from typing import Set


def compare_lookup_types():
    """
    Empirically compare successful vs unsuccessful lookup performance.
    Demonstrates the theoretical distinction in practice.
    """
    class SimpleHashTable:
        def __init__(self, capacity):
            self.capacity = capacity
            self.buckets = [[] for _ in range(capacity)]
            self.comparisons = 0

        def insert(self, key, value):
            idx = hash(key) % self.capacity
            for i, (k, _) in enumerate(self.buckets[idx]):
                if k == key:
                    self.buckets[idx][i] = (key, value)
                    return
            self.buckets[idx].append((key, value))

        def lookup(self, key):
            idx = hash(key) % self.capacity
            bucket = self.buckets[idx]
            comps = 0
            for k, v in bucket:
                comps += 1
                if k == key:
                    self.comparisons += comps
                    return (True, v, comps)
            self.comparisons += comps
            return (False, None, comps)

    # Test parameters
    n_elements = 10000
    n_lookups = 5000

    for load_factor in [0.25, 0.50, 0.75, 1.00, 2.00]:
        capacity = int(n_elements / load_factor)
        table = SimpleHashTable(capacity)

        # Insert known keys
        keys_in_table: Set[int] = set()
        for _ in range(n_elements):
            key = random.randint(0, 10**9)
            keys_in_table.add(key)
            table.insert(key, f"value_{key}")

        # Generate lookup targets: half exist, half don't
        existing_keys = random.sample(list(keys_in_table),
                                      min(n_lookups // 2, len(keys_in_table)))
        missing_keys = []
        while len(missing_keys) < n_lookups // 2:
            candidate = random.randint(0, 10**9)
            if candidate not in keys_in_table:
                missing_keys.append(candidate)

        # Measure successful lookups
        table.comparisons = 0
        success_comparisons = []
        for key in existing_keys:
            _, _, comps = table.lookup(key)
            success_comparisons.append(comps)
        avg_success = (sum(success_comparisons) / len(success_comparisons)
                       if success_comparisons else 0)

        # Measure unsuccessful lookups
        table.comparisons = 0
        fail_comparisons = []
        for key in missing_keys:
            _, _, comps = table.lookup(key)
            fail_comparisons.append(comps)
        avg_fail = (sum(fail_comparisons) / len(fail_comparisons)
                    if fail_comparisons else 0)

        print(f"Load Factor {load_factor:.2f}:")
        print(f"  Successful lookup:   {avg_success:.2f} comparisons "
              f"(expected: {1 + load_factor/2:.2f})")
        print(f"  Unsuccessful lookup: {avg_fail:.2f} comparisons "
              f"(expected: {load_factor:.2f})")
        print(f"  Ratio (fail/success): "
              f"{avg_fail/avg_success if avg_success > 0 else 0:.2f}")
        print()


compare_lookup_types()
```

If your workload involves many 'check if key exists' operations where you expect many negatives (e.g., filtering, deduplication), consider Bloom filters as a preprocessing step. Bloom filters can answer 'definitely not present' in O(1) with zero false negatives, allowing you to skip hash table lookups entirely for non-existent keys.
The O(1) average-case claim for hash table lookup is grounded in probability theory and relies on specific assumptions about hash function behavior. Let's examine the mathematics rigorously.
Setup: a table of m buckets holding n keys, with load factor α = n/m, collisions resolved by chaining, and a hash function assumed to satisfy simple uniform hashing (SUHA).
SUHA Statement: Each key is equally likely to hash to any of the m buckets, independently of other keys.
Chain Length Distribution:
Under SUHA, the number of keys in each bucket follows a Poisson distribution with parameter λ = α for large n and m:
P(bucket has k elements) ≈ (e^(-α) × α^k) / k!
This tells us: a fraction e^(-α) of buckets is empty; the expected chain length is exactly α; and the probability of a chain of length k decays factorially in k, so very long chains are vanishingly rare under a good hash function.
The expected lookup time is dominated by the expected chain length, which is O(α). When α is kept constant (through resizing), this is O(1).
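To make this concrete, here is a worked example at α = 0.75, the load factor used in the simulation below:

```latex
\[
  P(0) = e^{-0.75} \approx 0.472,\qquad
  P(1) = 0.75\,e^{-0.75} \approx 0.354,\qquad
  P(2) = \tfrac{0.75^{2}}{2!}\,e^{-0.75} \approx 0.133,
\]
\[
  P(3) \approx 0.033,\qquad
  P(k \ge 4) = 1 - \sum_{k=0}^{3} P(k) \approx 0.007.
\]
```

So roughly 47% of buckets are empty, about 83% hold at most one element, and chains of length four or more occur in well under 1% of buckets.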
```python
import math
from collections import Counter
import random


def analyze_chain_distribution():
    """
    Demonstrate that chain lengths follow Poisson distribution.
    This validates the mathematical foundation of O(1) analysis.
    """
    def poisson_pmf(k: int, lambda_: float) -> float:
        """Poisson probability mass function."""
        return (math.exp(-lambda_) * (lambda_ ** k)) / math.factorial(k)

    # Parameters
    m = 10000  # Number of buckets
    n = 7500   # Number of elements (load factor = 0.75)
    alpha = n / m

    # Simulate hash table insertion
    buckets = [0] * m  # Count elements per bucket
    for _ in range(n):
        # Simulate uniform hash by using random bucket selection
        bucket_idx = random.randint(0, m - 1)
        buckets[bucket_idx] += 1

    # Count bucket sizes
    size_distribution = Counter(buckets)
    max_size = max(size_distribution.keys())

    print("Hash Table Chain Length Distribution")
    print(f"Buckets: {m}, Elements: {n}, Load Factor: {alpha:.2f}")
    print("=" * 60)
    print(f"{'Chain Length':<15} {'Observed':<15} {'Poisson Expected':<15} {'Deviation'}")
    print("-" * 60)

    for k in range(min(max_size + 1, 10)):
        observed = size_distribution.get(k, 0)
        expected = m * poisson_pmf(k, alpha)
        deviation = (observed - expected) / expected if expected > 0 else 0
        print(f"{k:<15} {observed:<15} {expected:<15.1f} {deviation:+.2%}")

    print()

    # Calculate expected lookup cost
    total_comparisons_for_all = sum(k * count for k, count in size_distribution.items())
    avg_chain_length = total_comparisons_for_all / m if m > 0 else 0

    print(f"Average chain length: {avg_chain_length:.3f} (expected: {alpha:.3f})")
    print(f"Max chain length: {max_size}")
    print(f"Empty buckets: {size_distribution.get(0, 0)} "
          f"({size_distribution.get(0, 0)/m:.1%})")
    print(f"Expected empty: {m * math.exp(-alpha):.0f} ({math.exp(-alpha):.1%})")

    # Prove O(1) by showing average is constant regardless of n
    print()
    print("Verification: Average lookup cost remains constant as n grows")
    print("-" * 60)
    for n_test in [1000, 5000, 10000, 50000, 100000]:
        m_test = int(n_test / 0.75)  # Maintain load factor 0.75
        buckets_test = [0] * m_test
        for _ in range(n_test):
            buckets_test[random.randint(0, m_test - 1)] += 1
        total_comps = sum(k * count for k, count in Counter(buckets_test).items())
        avg_comps = total_comps / m_test
        # For successful lookup: 1 + α/2
        expected_successful = 1 + 0.75 / 2
        print(f"n = {n_test:>6}: avg chain = {avg_comps:.3f}, "
              f"expected success = {expected_successful:.3f}")


analyze_chain_distribution()
```

The key insight is that when we keep α bounded (e.g., α ≤ 0.75), the expected chain length is bounded by a constant (0.75). Thus, expected lookup time is bounded by a constant—giving us O(1). Without load factor management, α grows linearly with n, and lookup becomes O(n/m), which is O(n) if m is fixed.
The O(n) worst case for lookup mirrors insertion: all n elements occupy a single bucket, transforming the hash table into a linked list. But the consequences for lookup are arguably more severe than for insertion.
Why Lookup Worst-Case Is More Damaging:
Lookups are typically more frequent — Most applications read far more than they write. A 100:1 read-to-write ratio is common. Slow lookups compound.
Users wait for lookups — Insertions often happen asynchronously or in batches. Lookups often block user-facing operations. A slow lookup means visible latency.
Unsuccessful lookups are common — Many applications check for existence (caching, deduplication). Worst-case unsuccessful lookup traverses the entire chain every time.
Repeated hits on hot keys — A malicious or pathological access pattern can repeatedly query the degenerate bucket, causing sustained degradation rather than occasional spikes.
```python
import time
import random


class DegenerateHashTable:
    """
    Demonstrates worst-case lookup behavior when all elements collide.
    """

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.buckets = [[] for _ in range(capacity)]
        self.size = 0

    def insert_with_hash(self, key, value, hash_value: int):
        """Insert with explicit hash value (allows forcing collisions)."""
        bucket_idx = hash_value % self.capacity
        bucket = self.buckets[bucket_idx]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.size += 1

    def lookup_timed(self, key, hash_value: int) -> tuple:
        """Lookup with timing and comparison counting."""
        bucket_idx = hash_value % self.capacity
        bucket = self.buckets[bucket_idx]
        start = time.perf_counter_ns()
        comparisons = 0
        for k, v in bucket:
            comparisons += 1
            if k == key:
                end = time.perf_counter_ns()
                return (True, v, comparisons, end - start)
        end = time.perf_counter_ns()
        return (False, None, comparisons, end - start)


def demonstrate_lookup_degradation():
    """Show dramatic performance difference under collision attack."""
    n_elements = 10000
    n_lookups = 1000

    print("=" * 70)
    print("DEMONSTRATION: Lookup Performance Normal vs Degenerate")
    print("=" * 70)
    print()

    # SCENARIO 1: Normal distribution (good hash function)
    print("SCENARIO 1: Normal Hash Distribution")
    print("-" * 50)
    normal_table = DegenerateHashTable(capacity=1000)

    # Insert with uniformly distributed hashes
    keys = [f"key_{i}" for i in range(n_elements)]
    for key in keys:
        normal_table.insert_with_hash(key, "value", hash(key))

    # Measure lookup times
    sample_keys = random.sample(keys, n_lookups)
    normal_times = []
    normal_comps = []
    for key in sample_keys:
        _, _, comps, ns = normal_table.lookup_timed(key, hash(key))
        normal_times.append(ns)
        normal_comps.append(comps)

    print(f"  Elements: {normal_table.size}")
    print(f"  Avg comparisons: {sum(normal_comps)/len(normal_comps):.2f}")
    print(f"  Max comparisons: {max(normal_comps)}")
    print(f"  Avg lookup time: {sum(normal_times)/len(normal_times):.0f} ns")
    print(f"  Max lookup time: {max(normal_times):.0f} ns")
    print()

    # SCENARIO 2: All collisions (adversarial/pathological)
    print("SCENARIO 2: Complete Collision (All Same Bucket)")
    print("-" * 50)
    degenerate_table = DegenerateHashTable(capacity=1000)

    # Insert ALL with the same hash value -> same bucket
    collision_hash = 42
    for key in keys:
        degenerate_table.insert_with_hash(key, "value", collision_hash)

    # Measure lookup times
    degenerate_times = []
    degenerate_comps = []
    for key in sample_keys:
        _, _, comps, ns = degenerate_table.lookup_timed(key, collision_hash)
        degenerate_times.append(ns)
        degenerate_comps.append(comps)

    print(f"  Elements: {degenerate_table.size}")
    print(f"  Avg comparisons: {sum(degenerate_comps)/len(degenerate_comps):.2f}")
    print(f"  Max comparisons: {max(degenerate_comps)}")
    print(f"  Avg lookup time: {sum(degenerate_times)/len(degenerate_times):.0f} ns")
    print(f"  Max lookup time: {max(degenerate_times):.0f} ns")
    print()

    # COMPARISON
    print("COMPARISON:")
    print("-" * 50)
    avg_normal = sum(normal_comps) / len(normal_comps)
    avg_degen = sum(degenerate_comps) / len(degenerate_comps)
    time_normal = sum(normal_times) / len(normal_times)
    time_degen = sum(degenerate_times) / len(degenerate_times)
    print(f"  Comparison increase: {avg_degen / avg_normal:.0f}x")
    print(f"  Time increase: {time_degen / time_normal:.0f}x")
    print()
    print("  This is the difference between O(1) and O(n)!")


demonstrate_lookup_degradation()
```

In 2003, Scott Crosby and Dan Wallach demonstrated algorithmic complexity attacks against Perl's hash tables. By crafting HTTP request parameters that all collided, they could make a web server spend minutes processing a single request that should take milliseconds. Modern languages now use randomized hashing or alternative data structures specifically to prevent these attacks.
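As a small illustration of that defense (a sketch assuming a CPython 3.3+ interpreter with its default string-hash randomization), the same string hashes to different values in separate interpreter processes, so an attacker cannot precompute colliding keys offline:

```python
import subprocess
import sys

# Spawn two fresh interpreters and hash the same string in each.
# With default hash randomization (PYTHONHASHSEED not pinned), the two
# printed values normally differ from run to run.
cmd = [sys.executable, "-c", "print(hash('collide-me'))"]
for run in range(2):
    out = subprocess.run(cmd, capture_output=True, text=True).stdout.strip()
    print(f"process {run + 1}: hash('collide-me') = {out}")
```

Note that pinning PYTHONHASHSEED to a fixed value (as test suites sometimes do for reproducibility) disables this protection, which is worth remembering for anything internet-facing.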
Open addressing stores all elements directly in the array. Lookup follows the same probe sequence used during insertion, searching until finding the target, an empty slot (indicating absence), or completing a full cycle.
Critical Subtlety: Tombstones Complicate Lookup
With open addressing, deletion creates a problem. If we simply empty a slot that was part of a collision chain, we break lookups for elements that probed past it. The solution is tombstones—markers indicating 'deleted, but keep probing.'
This means unsuccessful lookups cannot stop at the first 'empty' slot—they must stop at a slot that was never occupied (not just deleted). As tombstones accumulate, lookup performance degrades.
Probe Count Analysis (Linear Probing):
For load factor α, expected probes are approximately ½(1 + 1/(1 − α)) for a successful lookup and ½(1 + 1/(1 − α)²) for an unsuccessful one (Knuth's classic estimates for linear probing).
As α → 1, these explode. Linear probing is extremely sensitive to high load factors.
| Load Factor | Successful Lookup | Unsuccessful Lookup | Practical Assessment |
|---|---|---|---|
| 50% | 1.5 probes | 2.5 probes | Excellent—minimal degradation |
| 70% | 2.2 probes | 6.1 probes | Good—still acceptable |
| 80% | 3.0 probes | 13.0 probes | Degrading—consider resize |
| 90% | 5.5 probes | 50.5 probes | Poor—resize urgently needed |
| 95% | 10.5 probes | 200.5 probes | Critical—effectively broken |
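A quick way to regenerate the estimates in this table from the formulas above (a minimal sketch, not a benchmark):

```python
# Regenerate the linear-probing estimates from the table:
# successful ≈ ½(1 + 1/(1-α)), unsuccessful ≈ ½(1 + 1/(1-α)²).
for alpha in [0.50, 0.70, 0.80, 0.90, 0.95]:
    successful = 0.5 * (1 + 1 / (1 - alpha))
    unsuccessful = 0.5 * (1 + 1 / (1 - alpha) ** 2)
    print(f"α={alpha:.2f}  successful≈{successful:5.1f} probes  "
          f"unsuccessful≈{unsuccessful:6.1f} probes")
```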
```python
class OpenAddressingLookup:
    """
    Demonstrates lookup with open addressing and tombstone handling.
    """

    EMPTY = "EMPTY"
    DELETED = "DELETED"

    def __init__(self, capacity=16):
        self.capacity = capacity
        self.slots = [self.EMPTY] * capacity
        self.size = 0
        self.deleted_count = 0

    def _hash(self, key) -> int:
        return hash(key)

    def _probe_sequence(self, key):
        """Generate probe sequence using linear probing."""
        start = self._hash(key) % self.capacity
        for offset in range(self.capacity):
            yield (start + offset) % self.capacity

    def insert(self, key, value) -> int:
        """Insert and return number of probes."""
        probes = 0
        for idx in self._probe_sequence(key):
            probes += 1
            slot = self.slots[idx]
            if slot is self.EMPTY or slot is self.DELETED:
                # Found empty/deleted slot - insert
                if slot is self.DELETED:
                    self.deleted_count -= 1
                self.slots[idx] = (key, value)
                self.size += 1
                return probes
            if slot[0] == key:
                # Key exists - update
                self.slots[idx] = (key, value)
                return probes
        raise RuntimeError("Table is full")

    def lookup(self, key) -> tuple:
        """
        Lookup with probe counting.
        Returns: (found, value, probes)

        Key insight: We CANNOT stop at DELETED slots - we must continue
        probing until we find the key or hit a truly EMPTY slot.
        """
        probes = 0
        for idx in self._probe_sequence(key):
            probes += 1
            slot = self.slots[idx]
            if slot is self.EMPTY:
                # Never occupied = key definitely absent
                return (False, None, probes)
            if slot is self.DELETED:
                # Was occupied then deleted - keep probing
                continue
            if slot[0] == key:
                # Found it!
                return (True, slot[1], probes)
        # Full cycle without finding key
        return (False, None, probes)

    def delete(self, key) -> bool:
        """Delete a key, leaving tombstone."""
        for idx in self._probe_sequence(key):
            slot = self.slots[idx]
            if slot is self.EMPTY:
                return False  # Key not found
            if slot is not self.DELETED and slot[0] == key:
                self.slots[idx] = self.DELETED
                self.size -= 1
                self.deleted_count += 1
                return True
        return False

    def load_factor(self) -> float:
        """Effective load including tombstones."""
        return (self.size + self.deleted_count) / self.capacity


def demonstrate_tombstone_impact():
    """Show how deletions affect lookup performance."""
    print("Demonstrating Tombstone Impact on Lookup")
    print("=" * 60)
    capacity = 100
    for scenario, deletions in [("No deletions", 0),
                                ("25% deleted", 25),
                                ("50% deleted", 50)]:
        table = OpenAddressingLookup(capacity=capacity)

        # Insert 70 elements
        for i in range(70):
            table.insert(f"key_{i}", f"value_{i}")

        # Delete specified number
        for i in range(deletions):
            table.delete(f"key_{i}")

        # Measure lookup performance for remaining elements
        probes_list = []
        for i in range(deletions, 70):
            _, _, probes = table.lookup(f"key_{i}")
            probes_list.append(probes)

        # Also measure unsuccessful lookups
        fail_probes = []
        for i in range(100, 200):
            _, _, probes = table.lookup(f"key_{i}")
            fail_probes.append(probes)

        print(f"\n{scenario}:")
        print(f"  Active elements: {table.size}")
        print(f"  Tombstones: {table.deleted_count}")
        print(f"  Effective load: {table.load_factor():.0%}")
        print(f"  Avg probes (success): {sum(probes_list)/len(probes_list):.1f}")
        print(f"  Avg probes (failure): {sum(fail_probes)/len(fail_probes):.1f}")


demonstrate_tombstone_impact()
```

In workloads with many insertions and deletions, tombstones accumulate. Even if the 'live' load factor is low, the probing cost reflects all slots that were ever occupied. Periodic 'compaction' (rebuilding the table without tombstones) is necessary to maintain performance. This is why some implementations prefer chaining—deletion is simply removing a list node.
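One way to perform that compaction, sketched against the OpenAddressingLookup class above (illustrative only; a production rebuild would usually also grow or shrink the capacity at the same time):

```python
def compact(table: OpenAddressingLookup) -> OpenAddressingLookup:
    """Rebuild the table without tombstones by re-inserting live entries."""
    fresh = OpenAddressingLookup(capacity=table.capacity)
    for slot in table.slots:
        # Skip never-occupied slots and tombstones; keep only live (key, value) pairs.
        if slot is not table.EMPTY and slot is not table.DELETED:
            key, value = slot
            fresh.insert(key, value)
    return fresh
```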
Hash tables' O(1) average lookup is their primary advantage. Let's compare against alternative data structures to understand when this matters.
The Competition: an unsorted list or array answers lookups by linear scan in O(n); a sorted array supports binary search in O(log n); a balanced binary search tree (BST) also gives O(log n) while preserving key order; a hash table gives O(1) on average.
The key question: when does O(1) vs O(log n) actually matter?
| Collection Size | O(n) Unsorted | O(log n) BST | O(1) Hash Table |
|---|---|---|---|
| 100 | ~50 comparisons | ~7 comparisons | ~1-2 comparisons |
| 10,000 | ~5,000 comparisons | ~14 comparisons | ~1-2 comparisons |
| 1,000,000 | ~500,000 comparisons | ~20 comparisons | ~1-2 comparisons |
| 100,000,000 | ~50,000,000 comparisons | ~27 comparisons | ~1-2 comparisons |
The Profound Implication:
For 100 million elements, hash tables perform approximately the same number of operations as for 100 elements. BSTs scale logarithmically—still excellent, but growing. Unsorted lists become completely impractical.
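To see this growth pattern on real hardware, here is a rough, machine-dependent sketch comparing membership tests over one million integer keys; absolute numbers will vary, and the containers used (a list scanned linearly, the same sorted list searched with bisect, and a dict) are stand-ins for the three columns above:

```python
import bisect
import random
import timeit

n = 1_000_000
sorted_keys = list(range(n))          # sorted list: reused for scan and binary search
key_map = dict.fromkeys(sorted_keys)  # hash table (dict) holding the same keys
probe = random.randrange(n)           # a key that exists

def linear_scan():
    return probe in sorted_keys                    # O(n): walks the list

def binary_search():
    i = bisect.bisect_left(sorted_keys, probe)     # O(log n)
    return i < n and sorted_keys[i] == probe

def hash_lookup():
    return probe in key_map                        # O(1) on average

for name, fn, reps in [("linear scan", linear_scan, 20),
                       ("binary search", binary_search, 200_000),
                       ("hash lookup", hash_lookup, 200_000)]:
    per_call = timeit.timeit(fn, number=reps) / reps
    print(f"{name:>13}: {per_call * 1e9:10.0f} ns per lookup")
```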
However, hash tables trade order for speed: they cannot iterate keys in sorted order, answer range queries (e.g., 'all keys between 10 and 20'), or find the minimum, maximum, predecessor, or successor of a key without scanning the entire table.
BSTs preserve order at the cost of O(log n) operations—a worthy trade for many applications.
We've thoroughly examined the lookup operation—the crown jewel of hash table performance. Let's consolidate the essential insights:
- Lookup is hash, index, then search: the first three steps are always O(1); the search within the bucket (or along the probe sequence) determines everything else.
- Under SUHA and a bounded load factor, expected cost is O(1): roughly 1 + α/2 comparisons for a successful chaining lookup and α for an unsuccessful one.
- A degenerate hash function or an adversarial key set collapses the table into a single chain, and lookup becomes O(n).
- With open addressing, lookups must probe past tombstones, so deletion-heavy workloads degrade performance until the table is rebuilt.
- Hash tables beat O(log n) structures on raw lookup speed but give up ordering, range queries, and min/max access.
What's Next:
Insert and lookup are the glamorous operations, but a complete understanding requires examining delete—where subtle complexities arise, especially with open addressing. The next page explores deletion mechanics, tombstone strategies, and their performance implications.
You now possess a comprehensive understanding of hash table lookup. You can explain the mathematical foundations of O(1) average case, articulate precisely how and why O(n) worst case emerges, compare successful versus unsuccessful lookups, and evaluate when hash tables are the right choice versus ordered alternatives. Next, we'll complete the picture with delete operations.