A hash function tells us where to store and find data—but what exactly exists at that location? The answer is a bucket: a logical container that holds all entries whose keys hash to the same value.
Buckets are where the theoretical elegance of hashing meets the physical realities of disk-based storage. Their design determines whether a hash index achieves its O(1) promise or degrades into linked-list-like O(n) performance. Understanding buckets is understanding the difference between a hash index that scales to billions of records and one that collapses under load.
By the end of this page, you will understand bucket structure at the physical level, know how to size buckets for optimal performance, master overflow handling strategies, and be able to diagnose bucket-related performance issues in production hash indexes.
A bucket is the storage unit associated with each possible hash value. When a key hashes to value h, the corresponding entry is stored in bucket h. The bucket is both a logical concept (all entries with the same hash value) and a physical structure (the disk page(s) holding those entries).
The mapping relationship:
Key → Hash Function → Hash Value → Bucket Number → Physical Page(s)
↓
+-------------------+
| Bucket Contents |
| (key₁, RID₁) |
| (key₂, RID₂) |
| (key₃, RID₃) |
| ... |
+-------------------+
Key observations:
One hash value, one bucket — Each hash value corresponds to exactly one bucket. The hash function deterministically maps keys to bucket numbers.
Multiple keys per bucket — Due to collisions (inevitable by the pigeonhole principle), a bucket typically contains entries for multiple different keys.
Bucket ≠ Page (usually) — A bucket might occupy part of a page, exactly one page, or span multiple pages. The relationship depends on load factor and design choices.
Bucket is searched linearly — Once we locate a bucket, finding the exact key requires scanning through its entries. For small buckets, this is effectively O(1).
Hash index lookup isn't literally a single operation: it's one I/O plus a linear scan of a small bucket. With an average bucket of around 10 entries, and the key comparisons happening against a page already in memory, the 'O(1)' label is accurate in practice; the constant factor is small enough to ignore at scale.
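To make the mapping and the collision point concrete, here is a toy sketch; the hash function, key values, and bucket count are purely illustrative:

```python
# Toy illustration: hash a handful of keys into a small number of buckets.
# Because several keys can land in the same bucket, each entry stores the full
# key next to its RID, and a lookup scans the bucket's entries linearly.
M = 8  # deliberately tiny bucket count so collisions are easy to see

def bucket_of(key: str, m: int = M) -> int:
    h = 0
    for ch in key:            # simple polynomial string hash (illustrative)
        h = h * 31 + ord(ch)
    return h % m              # hash value -> bucket number

buckets = {b: [] for b in range(M)}
for i, key in enumerate(["alice", "bob", "carol", "dave", "erin", "frank"]):
    buckets[bucket_of(key)].append((key, f"RID_{i}"))   # store (key, RID) pair

for b in range(M):
    if buckets[b]:
        print(f"bucket {b}: {buckets[b]}")
```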
At the physical level, a bucket is implemented as one or more disk pages. The page structure for hash index buckets closely resembles that of B+-tree leaf pages, with some notable differences.
Standard bucket page layout:
+---------------------------------------------------------------+
| Page Header (24-32 bytes) |
| +----------------------------------------------------------+ |
| | Page ID | LSN (Log Sequence Number) | |
| | Page Type | Checksum | |
| | Entry Count | Free Space Offset | |
| | Bucket Number | Overflow Page Pointer | |
| +----------------------------------------------------------+ |
+---------------------------------------------------------------+
| Entry Area (grows downward) |
| +----------------------------------------------------------+ |
| | Entry 1: | Key Value (variable/fixed) | RID (8 bytes) | |
| | Entry 2: | Key Value (variable/fixed) | RID (8 bytes) | |
| | Entry 3: | Key Value (variable/fixed) | RID (8 bytes) | |
| | ... | |
| +----------------------------------------------------------+ |
+---------------------------------------------------------------+
| Free Space |
+---------------------------------------------------------------+
| Slot Directory (grows upward, for variable-length entries) |
| +-----------+-----------+-----------+-----------+ |
| | Offset 1 | Offset 2 | Offset 3 | ... | |
| +-----------+-----------+-----------+-----------+ |
+---------------------------------------------------------------+
Entry format within bucket:
Each entry in the bucket contains:
Key Value — The search key that was hashed. Stored to handle collisions: we must verify the actual key matches, not just the hash.
Record Identifier (RID) — Pointer to the actual data record. Typically 8 bytes: 4 bytes for page number, 4 bytes for slot number within page.
Fixed-length vs. variable-length keys:
| Key Type | Storage Approach | Entry Size | Slot Directory |
|---|---|---|---|
| Fixed-length (INT, BIGINT) | Direct storage | Key size + 8 bytes | Not needed (calculate position) |
| Variable-length (VARCHAR) | Offset-based with slot directory | Varies per entry | Required (backwards from page end) |
| Composite key | Serialize all components | Sum of component sizes | Depends on components |
Since multiple keys can hash to the same bucket, we must store the actual key to distinguish between entries. Without storing keys, we couldn't tell if we found the right record or a collision. This is the space overhead of hash indexing—each entry needs both key and RID.
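As a sketch of the fixed-length case, the snippet below packs a single bucket entry: an 8-byte integer key plus an 8-byte RID split into page and slot numbers. The field layout follows the description above, but the exact format string is an assumption for illustration:

```python
import struct

# Fixed-length bucket entry: 8-byte signed key + 8-byte RID (page_no, slot_no).
# "<" = little-endian, no padding; q = int64, I = uint32 (illustrative layout).
ENTRY_FORMAT = "<qII"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)   # 16 bytes per entry

def pack_entry(key: int, page_no: int, slot_no: int) -> bytes:
    return struct.pack(ENTRY_FORMAT, key, page_no, slot_no)

def unpack_entry(raw: bytes) -> tuple:
    return struct.unpack(ENTRY_FORMAT, raw)

entry = pack_entry(key=980117, page_no=1847, slot_no=12)
print(ENTRY_SIZE, unpack_entry(entry))       # 16 (980117, 1847, 12)
```

With fixed-size entries like this, the i-th entry of a bucket page sits at a computable offset, which is why the slot directory is unnecessary.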
The number of buckets (M) is a critical configuration decision that balances space efficiency against performance. This decision is governed by the load factor.
Load factor (λ):
λ = n / (M × B)
Where:
n = number of entries (records to index)
M = number of buckets
B = maximum entries per bucket page (the bucket's primary-page capacity)
The load factor is the fraction of primary bucket capacity in use, assuming a uniform hash distribution. Equivalently, λ × B is the average number of entries per bucket; values above 1.0 mean the average bucket has already spilled into overflow pages.
Impact of load factor:
| Load Factor | Avg Bucket Fill | Space Efficiency | Lookup Performance | Typical Use Case |
|---|---|---|---|---|
| λ = 0.3 | 30% of one page | Poor (70% of bucket space empty) | Excellent | Memory-rich systems |
| λ = 0.5 | 50% of one page | Moderate | Very good | General purpose |
| λ = 0.7 | 70% of one page | Good | Good | Default recommendation |
| λ = 1.0 | One full page | Very good | Acceptable | Space-constrained |
| λ = 2.0 | ~2 pages (overflow begins) | Excellent | Degraded | Special cases only |
| λ = 5.0+ | ~5+ pages (long chains) | Excellent | Poor (long overflow chains) | Anti-pattern |
Calculating optimal bucket count:
Given a target load factor and the maximum number of entries per bucket page:
M = ceil(n / (target_λ × max_entries_per_bucket))
Example calculation:
For a hash index on 10 million records, assuming 12-byte keys, 8-byte RIDs, and 8 KB pages (≈408 entries per page):
M = 10,000,000 / (0.7 × 408)
  = 10,000,000 / 285.6
  ≈ 35,015 buckets
// Round up to the next prime for division hashing
M = 35,023 (prime)
With 35,023 buckets, the actual load factor is ≈0.70, the average bucket holds ≈286 entries (comfortably within one page), and the bucket directory needs only 35,023 × 8 bytes ≈ 280 KB.
Traditional hash indexes have a fixed number of buckets. If data grows beyond the planned capacity, performance degrades. This is why dynamic hashing schemes (extendible hashing, linear hashing) were developed—we'll cover these in later pages. For static hashing, plan for anticipated growth.
The sizing calculation in Python:

```python
import math


def calculate_optimal_buckets(
    expected_records: int,
    key_size: int = 8,               # Average key size in bytes
    rid_size: int = 8,               # RID size (typically 8 bytes)
    page_size: int = 8192,           # Page size in bytes
    page_header_size: int = 32,      # Page header overhead
    target_load_factor: float = 0.7,
    use_prime: bool = True,
) -> dict:
    """
    Calculate optimal bucket count for a hash index.
    Returns configuration details and expected performance metrics.
    """
    # Calculate entry size and page capacity
    entry_size = key_size + rid_size
    usable_page_space = page_size - page_header_size
    entries_per_page = usable_page_space // entry_size

    # Calculate bucket count
    raw_bucket_count = expected_records / (target_load_factor * entries_per_page)
    bucket_count = math.ceil(raw_bucket_count)

    # Find next prime if requested (for division hashing)
    if use_prime:
        bucket_count = next_prime(bucket_count)

    # Calculate expected metrics
    avg_entries_per_bucket = expected_records / bucket_count
    avg_pages_per_bucket = avg_entries_per_bucket / entries_per_page
    # Load factor as a fraction of primary-page capacity (matches target_load_factor)
    actual_load_factor = avg_entries_per_bucket / entries_per_page

    # Estimate overflow probability (using a Poisson approximation)
    overflow_probability = poisson_overflow_prob(
        avg_entries_per_bucket, entries_per_page
    )

    return {
        "bucket_count": bucket_count,
        "entries_per_page": entries_per_page,
        "actual_load_factor": actual_load_factor,
        "avg_entries_per_bucket": avg_entries_per_bucket,
        "avg_pages_per_bucket": avg_pages_per_bucket,
        "overflow_probability": overflow_probability,
        "total_pages_estimate": bucket_count * max(1, math.ceil(avg_pages_per_bucket)),
        "directory_size_bytes": bucket_count * 8,   # 8 bytes per bucket pointer
    }


def next_prime(n: int) -> int:
    """Find the smallest prime >= n."""
    def is_prime(x: int) -> bool:
        if x < 2:
            return False
        for i in range(2, int(x**0.5) + 1):
            if x % i == 0:
                return False
        return True

    while not is_prime(n):
        n += 1
    return n


def poisson_overflow_prob(mean: float, threshold: int) -> float:
    """
    Probability that a bucket receives more than `threshold` entries,
    modelling bucket occupancy as Poisson(mean). Computed in log space
    to stay numerically stable for large means.
    """
    log_pmf = -mean                   # log P(X = 0)
    prob_at_or_below = math.exp(log_pmf)
    for k in range(1, threshold + 1):
        log_pmf += math.log(mean) - math.log(k)
        prob_at_or_below += math.exp(log_pmf)
    return max(0.0, 1.0 - prob_at_or_below)


# Example usage
config = calculate_optimal_buckets(
    expected_records=10_000_000,
    key_size=12,
    target_load_factor=0.7,
)

print("Hash Index Configuration")
print("=" * 40)
for key, value in config.items():
    if isinstance(value, float):
        print(f"{key}: {value:.4f}")
    else:
        print(f"{key}: {value:,}")
```

When a bucket fills beyond its single-page capacity, we have overflow. How we handle overflow significantly impacts hash index performance. There are two primary strategies: chaining and open addressing.
Chaining (Overflow Pages):
The most common approach in database systems. When the primary bucket page is full:
+----------------+     +----------------+     +----------------+
|   Bucket 47    | --> |   Overflow 1   | --> |   Overflow 2   |
|   Primary      |     |   Page 2891    |     |   Page 2904    |
|   Page 1847    |     |                |     |                |
|  (408 entries) |     |  (408 entries) |     |  (193 entries) |
+----------------+     +----------------+     +----------------+
Open Addressing:
An alternative primarily used in in-memory hash tables but occasionally in databases:
If the bucket at h(key) is full, probe for an alternative bucket (all probes taken modulo M):
- h(key) + 1, h(key) + 2, ... (linear probing)
- h(key) + 1², h(key) + 2², ... (quadratic probing)
- h(key) + i × h'(key) (double hashing)
Why databases prefer chaining:
| Aspect | Chaining | Open Addressing |
|---|---|---|
| Max load factor | Unlimited (graceful degradation) | Must be < 1.0 (typically < 0.7) |
| Space utilization | Good (only allocate as needed) | Moderate (pre-allocate all buckets) |
| Cache/disk behavior | Poor (chains scatter) | Good (probes are nearby) |
| Deletion | Simple (remove from chain) | Complex (tombstones required) |
| Worst-case lookup | O(n) for longest chain | O(n) for full table probe |
| Primary use | Disk-based databases | In-memory hash tables |
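For contrast, here is a minimal in-memory linear-probing table in the open-addressing style. It is an illustrative sketch, not any particular database's implementation; note how deletion needs a tombstone so that later probe sequences don't stop early.

```python
class LinearProbingTable:
    """Tiny open-addressing hash table with linear probing (illustrative)."""
    _EMPTY, _TOMBSTONE = object(), object()

    def __init__(self, num_slots: int):
        self.slots = [self._EMPTY] * num_slots

    def _probe(self, key):
        start = hash(key) % len(self.slots)
        for i in range(len(self.slots)):
            yield (start + i) % len(self.slots)    # h(key), h(key)+1, h(key)+2, ...

    def insert(self, key, rid) -> bool:
        for slot in self._probe(key):
            if self.slots[slot] in (self._EMPTY, self._TOMBSTONE):
                self.slots[slot] = (key, rid)
                return True
        return False                                # table full: load factor must stay < 1

    def lookup(self, key):
        for slot in self._probe(key):
            entry = self.slots[slot]
            if entry is self._EMPTY:                # never-used slot: key cannot be present
                return None
            if entry is not self._TOMBSTONE and entry[0] == key:
                return entry[1]
        return None

    def delete(self, key) -> bool:
        for slot in self._probe(key):
            entry = self.slots[slot]
            if entry is self._EMPTY:
                return False
            if entry is not self._TOMBSTONE and entry[0] == key:
                self.slots[slot] = self._TOMBSTONE  # keep the probe chain intact
                return True
        return False

table = LinearProbingTable(num_slots=8)
table.insert("order_1001", "RID_A")
table.insert("order_1002", "RID_B")
print(table.lookup("order_1001"), table.lookup("missing"))
```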
In production, monitor average and maximum overflow chain lengths. If chains exceed 2-3 pages on average, the hash index needs reorganization or more buckets. Many databases provide system views or functions to inspect hash index health.
The bucket directory (also called the header array or bucket array) maps bucket numbers to physical page addresses. It's the lookup table that translates hash values into disk locations.
Directory structure:
Bucket Directory (array of page pointers)
+-------+-------+-------+-------+-------+-------+-------+
| 0 | 1 | 2 | 3 | 4 | 5 | ... |
+-------+-------+-------+-------+-------+-------+-------+
| | | | | |
v v v v v v
Page Page Page Page Page Page
2001 2147 2089 2201 2056 2178
Each directory entry contains the page ID of the bucket's primary page, typically an 8-byte pointer. Some implementations also keep a small amount of per-bucket metadata alongside it.
Directory sizing:
With M buckets and 8 bytes per directory entry, the directory occupies 8 × M bytes: roughly 280 KB for the 35,023-bucket example above, and only 8 MB even for a million buckets.
This directory is typically small relative to the data it indexes, stored in the index's metadata (header) pages on disk, and cached in memory, often pinned in the buffer pool, while the index is in use.
Benefits of keeping directory in memory:
Eliminates one I/O per lookup — Without caching, every lookup requires: read directory page, then read bucket page (2 I/Os). With cached directory: just read bucket page (1 I/O).
Enables batch operations — Multiple lookups can proceed without competing for directory page.
Feasible for most workloads — A 1-million-bucket index needs only 8MB of memory for its directory.
Directory organization patterns:
| Approach | Structure | Trade-off |
|---|---|---|
| Single array | One contiguous array of pointers | Simple, fast; grows awkwardly |
| Paged directory | Array spans multiple pages | Flexible growth; extra indirection |
| Multi-level | Directory of directory pages | Scales to huge bucket counts |
| Segmented | Directory split by hash prefix | Enables concurrent access |
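A sketch of the paged (two-level) variant from the table: the directory itself is stored on fixed-size pages, so a bucket number is first translated into a directory page and a slot within it before the bucket's primary page can be read. The page and pointer sizes here are assumptions for illustration:

```python
# Two-level (paged) directory addressing: bucket number -> (directory page, slot).
DIR_PAGE_SIZE = 8192                                  # assumed 8 KB directory pages
POINTER_SIZE = 8                                      # assumed 8-byte page pointers
SLOTS_PER_DIR_PAGE = DIR_PAGE_SIZE // POINTER_SIZE    # 1024 bucket pointers per page

def locate_bucket_pointer(bucket_num: int) -> tuple:
    """Return (directory_page_index, slot_index) holding this bucket's pointer."""
    return bucket_num // SLOTS_PER_DIR_PAGE, bucket_num % SLOTS_PER_DIR_PAGE

# Bucket 35,000 of the 35,023-bucket example: directory page 34, slot 184.
print(locate_bucket_pointer(35_000))
```

The extra level costs one more (easily cached) page of indirection, but lets the directory grow page by page instead of requiring one large contiguous array.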
The bucket directory is fundamental: every hash index operation starts here. Changes to the directory (rehashing, bucket splitting) must be carefully synchronized to prevent concurrent access issues. This is why hash index restructuring is often more disruptive than B+-tree splits.
Understanding the step-by-step mechanics of bucket operations illuminates the O(1) performance characteristics and reveals where performance can degrade.
LOOKUP (Search) Operation:
LOOKUP(key):
1. bucket_num = hash(key) mod M
2. page_id = directory[bucket_num]
3. Read page_id into buffer (I/O #1 if not cached)
4. FOR each entry in page:
IF entry.key == key:
RETURN entry.RID
5. IF page.overflow_pointer != NULL:
page_id = page.overflow_pointer
GOTO step 3 // Follow overflow chain (I/O #2, #3, ...)
6. RETURN NOT_FOUND
Cost analysis: with the directory cached and no overflow, a lookup costs exactly one page I/O. Each overflow page adds one more I/O: a successful lookup reads, on average, about half of the bucket's chain beyond the primary page, while an unsuccessful lookup must read the entire chain before concluding the key is absent.
INSERT Operation:
INSERT(key, RID):
1. bucket_num = hash(key) mod M
2. page_id = directory[bucket_num]
3. Read page_id into buffer
4. IF page has space:
Insert (key, RID) into page
Write page to disk
RETURN SUCCESS
5. ELSE:
// Page full, need overflow
IF page.overflow_pointer != NULL:
page_id = page.overflow_pointer
GOTO step 3
ELSE:
new_page = allocate_overflow_page()
page.overflow_pointer = new_page.id
Write original page
Insert (key, RID) into new_page
Write new_page
RETURN SUCCESS
DELETE Operation:
DELETE(key):
1. bucket_num = hash(key) mod M
2. page_id = directory[bucket_num]
3. Read page_id into buffer
4. FOR each entry in page:
IF entry.key == key:
Remove entry (mark slot as deleted or compact)
Write page
// Optional: reclaim overflow pages if now empty
RETURN SUCCESS
5. IF page.overflow_pointer != NULL:
page_id = page.overflow_pointer
prev_page = current_page // Track for unlinking
GOTO step 3
6. RETURN NOT_FOUND
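The optional reclamation step in the DELETE pseudocode can be sketched as follows, using plain dictionaries as stand-ins for pages (the structure and names are hypothetical): when a deletion empties an overflow page, the page is unlinked from the chain so future lookups skip it.

```python
def delete_and_reclaim(directory: dict, pages: dict, bucket_num: int, key) -> bool:
    """Delete `key` from the bucket chain; unlink overflow pages that become empty.
    `pages` maps page_id -> {"entries": [(key, rid), ...], "overflow": next_id_or_None}.
    """
    page_id = directory[bucket_num]
    prev_id = None
    while page_id is not None:
        page = pages[page_id]
        for i, (k, _rid) in enumerate(page["entries"]):
            if k == key:
                del page["entries"][i]
                # Reclaim: unlink this page if it is now an empty overflow page
                # (the primary page, where prev_id is None, always stays allocated).
                if not page["entries"] and prev_id is not None:
                    pages[prev_id]["overflow"] = page["overflow"]
                    del pages[page_id]            # hand the page back to the free list
                return True
        prev_id, page_id = page_id, page["overflow"]
    return False
```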
A compact in-memory simulation ties these operations together, tracking a simulated I/O count per page access:

```python
class BucketPage:
    """Represents a single bucket page with overflow capability."""

    def __init__(self, page_id: int, capacity: int = 100):
        self.page_id = page_id
        self.entries = []            # List of (key, rid) tuples
        self.capacity = capacity
        self.overflow_page_id = None
        self.is_dirty = False

    def has_space(self) -> bool:
        return len(self.entries) < self.capacity

    def insert(self, key, rid) -> bool:
        if self.has_space():
            self.entries.append((key, rid))
            self.is_dirty = True
            return True
        return False

    def search(self, key):
        for k, rid in self.entries:
            if k == key:
                return rid
        return None

    def delete(self, key) -> bool:
        for i, (k, _rid) in enumerate(self.entries):
            if k == key:
                del self.entries[i]
                self.is_dirty = True
                return True
        return False


class HashIndexBucketManager:
    """Manages bucket operations for a hash index."""

    def __init__(self, num_buckets: int, page_capacity: int = 100):
        self.num_buckets = num_buckets
        self.page_capacity = page_capacity
        self.directory = {}          # bucket_num -> primary page_id
        self.pages = {}              # page_id -> BucketPage
        self.next_page_id = 0
        self.io_count = 0            # Track (simulated) I/O operations

        # Initialize empty buckets
        for i in range(num_buckets):
            page = self._allocate_page()
            self.directory[i] = page.page_id

    def _allocate_page(self) -> BucketPage:
        page_id = self.next_page_id
        self.next_page_id += 1
        page = BucketPage(page_id, self.page_capacity)
        self.pages[page_id] = page
        return page

    def _read_page(self, page_id: int) -> BucketPage:
        self.io_count += 1           # Simulate one page I/O
        return self.pages[page_id]

    def _hash(self, key) -> int:
        if isinstance(key, str):
            h = 0
            for c in key:
                h = h * 31 + ord(c)
            return h % self.num_buckets
        return key % self.num_buckets

    def lookup(self, key):
        """Search for key; returns RID or None."""
        bucket_num = self._hash(key)
        page_id = self.directory[bucket_num]
        while page_id is not None:
            page = self._read_page(page_id)
            result = page.search(key)
            if result is not None:
                return result
            page_id = page.overflow_page_id   # Follow the overflow chain
        return None

    def insert(self, key, rid) -> bool:
        """Insert a key-RID pair into the hash index."""
        bucket_num = self._hash(key)
        page_id = self.directory[bucket_num]
        while True:
            page = self._read_page(page_id)
            if page.insert(key, rid):
                return True
            if page.overflow_page_id is not None:
                page_id = page.overflow_page_id
            else:
                # Chain exhausted: allocate and link a new overflow page
                new_page = self._allocate_page()
                page.overflow_page_id = new_page.page_id
                page.is_dirty = True
                new_page.insert(key, rid)
                return True

    def delete(self, key) -> bool:
        """Delete key from the hash index."""
        bucket_num = self._hash(key)
        page_id = self.directory[bucket_num]
        while page_id is not None:
            page = self._read_page(page_id)
            if page.delete(key):
                return True
            page_id = page.overflow_page_id
        return False

    def get_stats(self) -> dict:
        """Return statistics about the hash index."""
        total_entries = 0
        overflow_pages = 0
        max_chain = 0
        for bucket_num in range(self.num_buckets):
            chain_length = 0
            page_id = self.directory[bucket_num]
            while page_id is not None:
                page = self.pages[page_id]
                total_entries += len(page.entries)
                chain_length += 1
                page_id = page.overflow_page_id
            overflow_pages += max(0, chain_length - 1)
            max_chain = max(max_chain, chain_length)
        return {
            "total_entries": total_entries,
            "num_buckets": self.num_buckets,
            # Load factor as a fraction of primary-page capacity (>1.0 means overflow)
            "load_factor": total_entries / (self.num_buckets * self.page_capacity),
            "overflow_pages": overflow_pages,
            "max_chain_length": max_chain,
            "io_operations": self.io_count,
        }


# Demonstration
index = HashIndexBucketManager(num_buckets=10, page_capacity=5)

# Insert records
for i in range(100):
    index.insert(f"key_{i}", f"RID_{i}")

# Lookup
print(f"Lookup 'key_42': {index.lookup('key_42')}")
print(f"Lookup 'key_999': {index.lookup('key_999')}")

# Stats
stats = index.get_stats()
print("\nHash Index Statistics:")
for k, v in stats.items():
    print(f"  {k}: {v}")
```

Bucket design directly determines hash index performance. Let's analyze the performance implications systematically.
I/O cost model:
| Operation | I/O Cost (Best) | I/O Cost (Average) | I/O Cost (Worst) |
|---|---|---|---|
| Equality lookup | 1 (directory cached) | 1 + E[overflow]/2 | 1 + max_chain |
| Insert | 1 read + 1 write | 1-2 reads + 1-2 writes | chain_length × 2 + 2 |
| Delete | 1 read + 1 write | Similar to lookup + 1 write | chain_length + 1 |
| Range scan | N (all buckets) | N (all buckets) | N (all buckets) |
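To see how these costs play out, the short simulation below scatters keys uniformly at random over buckets and reports how many pages the resulting chains occupy; the numbers and the uniform-random assumption are illustrative only.

```python
import random
from collections import Counter

def simulate_chain_lengths(num_keys: int, num_buckets: int,
                           entries_per_page: int) -> dict:
    """Monte-Carlo sketch of bucket chain lengths under uniform hashing."""
    counts = Counter(random.randrange(num_buckets) for _ in range(num_keys))
    # Pages per bucket: the primary page plus any overflow pages it needs.
    chains = [max(1, -(-counts.get(b, 0) // entries_per_page))   # ceil division
              for b in range(num_buckets)]
    return {
        "avg_pages_per_bucket": sum(chains) / num_buckets,
        "max_pages_in_a_bucket": max(chains),
        "buckets_with_overflow": sum(1 for c in chains if c > 1),
    }

# Roughly the earlier sizing example scaled down: λ ≈ 0.7 with 408-entry pages.
print(simulate_chain_lengths(num_keys=1_000_000, num_buckets=3_503,
                             entries_per_page=408))
```

At λ ≈ 0.7 essentially every bucket fits in its primary page, which is why that value is the usual default; rerunning with far fewer buckets shows overflow chains, and therefore extra I/Os, appearing quickly.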
Factors affecting performance:
1. Load Factor Impact
As load factor (λ) increases, buckets fill their primary pages and spill into overflow chains: average lookups need more I/Os, inserts must walk longer chains before finding free space, and the gap between typical and worst-case latency widens. Space efficiency improves, but past λ ≈ 1.0 the near-O(1) behavior erodes steadily.
2. Hash Quality Impact
Poor hash distribution creates 'hot' buckets: a few buckets receive far more than their share of keys and grow long overflow chains while others sit nearly empty. Aggregate statistics can still look healthy even though lookups on the hot buckets dominate observed latency.
3. Workload Pattern Impact
Insert-heavy workloads grow overflow chains faster than planned; skewed access concentrates reads on a few buckets (which at least cache well); heavy deletes leave half-empty overflow pages until the index is reorganized.
4. Physical Layout Impact
Primary bucket pages are usually allocated contiguously, but overflow pages land wherever free space happens to be, so traversing a chain tends to mean random I/O. Keeping chains short preserves the one-seek-per-lookup behavior that makes hash indexes attractive.
Performance degradation isn't linear. As buckets overflow, inserts slow down (must traverse chains). Slower inserts mean longer transactions. Longer transactions hold locks longer. This creates contention, further slowing the system. A hash index past its capacity can trigger cascading performance collapse.
Monitoring bucket health:
-- PostgreSQL: hash index page statistics (pgstattuple extension)
SELECT * FROM pgstathashindex('idx_hash_example');
-- PostgreSQL: on-disk size of the index
SELECT pg_size_pretty(pg_relation_size('idx_hash_example')) AS index_size;
-- Oracle: Hash cluster monitoring
SELECT * FROM DBA_CLUSTERS WHERE CLUSTER_TYPE = 'HASH';
Key metrics to monitor:
- Average and maximum overflow chain length (pages per bucket)
- Actual load factor versus the target chosen at creation time
- Ratio of overflow pages to primary bucket pages
- Index size growth relative to row-count growth
- Free space and dead entries inside bucket pages (bloat after heavy deletes)
Buckets are the physical manifestation of hash index theory. Their design choices translate directly into performance characteristics.
| Decision | Recommendation |
|---|---|
| Number of buckets | n / (target_λ × entries_per_page), use prime for division hashing |
| Target load factor | 0.7 for general use; 0.5 for performance-critical; 0.8+ for space-constrained |
| Page size | Match database page size (typically 8KB or 16KB) |
| Directory caching | Always cache in memory if possible (usually small) |
| Overflow strategy | Chaining (standard); open addressing only for memory-only scenarios |
| Reorganization threshold | When avg chain length > 2 or max chain > 5 |
What's next:
With buckets understood, the next page focuses on point queries: the use case where hash indexes truly shine. We'll examine how hash indexes answer equality queries with unmatched efficiency, explore query optimizer decisions, and understand the practical scenarios where point query performance justifies the hash index trade-offs.
You now understand buckets at both logical and physical levels. You can calculate appropriate bucket counts, understand overflow mechanics, and diagnose bucket-related performance issues. This knowledge is essential for tuning hash indexes and understanding their behavior under various workloads.