As computing evolved from 32-bit to 64-bit architectures, a fundamental challenge emerged in virtual memory management. A 64-bit address space theoretically spans 18 exabytes (2⁶⁴ bytes)—more than a billion times larger than typical physical memory. Traditional page tables, which allocate entries proportional to the virtual address space, become grotesquely inefficient when the address space dwarfs actual memory usage.
Hashed page tables represent a paradigm shift in address translation design. Rather than organizing page table entries by virtual page number in a linear or hierarchical structure, hashed page tables use a hash function to map virtual page numbers directly to table entries. This approach decouples the page table size from the virtual address space size, making it proportional instead to the number of actually mapped pages.
This architectural innovation is critical for modern 64-bit systems, sparse address spaces, and operating systems that must handle processes with wildly varying memory footprints efficiently.
By the end of this page, you will understand the fundamental motivation for hashed page tables, their internal structure and organization, how hash functions are designed for address translation, the complete lookup mechanism, and the performance trade-offs that make hashed tables essential for 64-bit system design.
Before understanding why hashed page tables exist, we must appreciate the limitations of traditional page table designs—linear page tables and multi-level hierarchical page tables.
The Linear Page Table Problem:
A linear (single-level) page table maintains one entry per virtual page in the address space. For a 64-bit architecture with 4KB pages:
Virtual address bits: 64
Page offset bits: 12 (for 4KB pages)
Virtual page number bits: 52
Number of page entries: 2⁵² = 4,503,599,627,370,496 (4.5 quadrillion)
Entry size: 8 bytes (typical for 64-bit systems)
Total page table size: 2⁵² × 8 = 32 PB (petabytes)
This is obviously impossible. No system can allocate 32 petabytes just for a single process's page table. Even for 32-bit systems, a linear page table requires 4MB per process—significant overhead that motivated multi-level designs.
64-bit address spaces are 2³² = 4 billion times larger than 32-bit spaces. Solutions that work for 32-bit systems often become untenable for 64-bit architectures. This exponential growth drives the need for fundamentally different approaches.
The Multi-Level Page Table Approach:
Hierarchical page tables address the space explosion by only allocating table levels for actually-used portions of the address space. A typical x86-64 four-level page table structure:
Level 4 (PML4): 512 entries × 8 bytes = 4KB
Level 3 (PDPT): Up to 512 tables × 4KB each
Level 2 (PD): Up to 512² tables × 4KB each
Level 1 (PT): Up to 512³ tables × 4KB each
For sparse address spaces (most pages unmapped), multi-level tables are efficient—only populated regions require table entries. However, multi-level tables have significant drawbacks:
Multiple Memory Accesses: Each address translation requires traversing multiple levels (up to 4 for x86-64), costing up to 4 dependent memory accesses on a TLB miss.
Pointer Chasing: Each level requires following a pointer to the next, causing poor cache behavior due to scattered memory locations.
Fixed Depth: The hierarchy depth is fixed by architecture, potentially over-provisioning for small processes or under-serving unusual access patterns.
Complex Management: Creating, destroying, and maintaining multi-level structures adds OS complexity.
| Approach | Space Complexity | Time Complexity | Scalability | Sparsity Handling |
|---|---|---|---|---|
| Linear Table | O(address space) | O(1) | Poor for 64-bit | None |
| Multi-Level | O(mapped pages) | O(levels) | Good | Excellent |
| Hashed Table | O(mapped pages) | O(1 + chain length) | Excellent | Excellent |
The Insight Behind Hashed Tables:
The key observation driving hashed page tables is that processes typically use only a tiny fraction of their virtual address space. A process with 2GB of actual memory usage in a 64-bit address space is using:
2GB / 2⁶⁴ bytes = 2³¹ / 2⁶⁴ = 2⁻³³ ≈ 0.00000001%
Why maintain any structure proportional to the address space when utilization is infinitesimally small? Hashed page tables organize entries by actual mappings, not potential addresses.
This insight leads to a structure where the table size tracks the number of mapped pages, lookups take expected constant time, and unmapped regions of the address space cost nothing.
A hashed page table is fundamentally a hash table specialized for address translation. It consists of an array of buckets, where each bucket can contain one or more page table entries. The virtual page number is hashed to select a bucket, and the bucket contains the translation to the physical frame.
Core Components:
1. Hash Table Array (Bucket Array)
The primary structure is an array of buckets. Each bucket is either empty (no mapped page hashes to it) or the head of a chain of one or more page table entries.
The number of buckets is chosen based on expected memory usage and acceptable collision rates—not virtual address space size.
2. Page Table Entry (Hash Table Entry)
Each entry contains the full virtual page number (the lookup key), the physical frame number, control bits, and a chain pointer for collision handling.
3. Hash Function
A function that maps virtual page numbers to bucket indices:
bucket_index = hash(virtual_page_number) mod table_size
Entry Structure in Detail:
A typical hashed page table entry for a 64-bit system:
┌─────────────────────────────────────────────────────────────────┐
│ Hashed Page Table Entry (24 bytes) │
├─────────────────────────────────────────────────────────────────┤
│ Virtual Page Number (52 bits) │ Pad (12 bits)│
│ 0 63 │
├─────────────────────────────────────────────────────────────────┤
│ Physical Frame Number (40 bits) │ Control Bits (24 bits) │
│ 0 63 │
├─────────────────────────────────────────────────────────────────┤
│ Chain Pointer (64 bits - pointer to next entry) │
│ 0 63 │
└─────────────────────────────────────────────────────────────────┘
Control Bits:
- Valid (V): 1 bit - Entry contains valid translation
- Read (R): 1 bit - Page is readable
- Write (W): 1 bit - Page is writable
- Execute (X): 1 bit - Page is executable
- User (U): 1 bit - Accessible in user mode
- Dirty (D): 1 bit - Page has been modified
- Accessed (A): 1 bit - Page has been accessed
- Cache Disable: 1 bit - Bypass CPU caches
- Write-Through: 1 bit - Write-through caching policy
- Reserved: 15 bits - Future use
Why Store the Complete VPN?
Unlike array-indexed page tables, where the array index implicitly encodes the VPN, hashed tables must explicitly store the VPN because multiple VPNs can hash to the same bucket; the stored VPN is the key that disambiguates chained entries during lookup.
```c
// Hashed page table data structures

#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>  // for size_t

// Control bits for page table entries
typedef struct {
    uint32_t valid : 1;          // Entry is valid
    uint32_t read : 1;           // Page is readable
    uint32_t write : 1;          // Page is writable
    uint32_t execute : 1;        // Page is executable
    uint32_t user : 1;           // User-mode accessible
    uint32_t dirty : 1;          // Page modified
    uint32_t accessed : 1;       // Page accessed
    uint32_t cache_disable : 1;  // Disable caching
    uint32_t write_through : 1;  // Write-through policy
    uint32_t reserved : 23;      // Reserved for future use
} PageControlBits;

// Single entry in the hashed page table
typedef struct HashedPageTableEntry {
    uint64_t virtual_page_number;       // Full VPN (52 bits used)
    uint64_t physical_frame_number;     // PFN (40 bits used)
    PageControlBits control;            // Protection and status bits
    struct HashedPageTableEntry* next;  // Chain pointer for collisions
} HashedPageTableEntry;

// The hashed page table structure
typedef struct {
    HashedPageTableEntry** buckets;  // Array of bucket pointers
    size_t num_buckets;              // Number of buckets
    size_t entry_count;              // Total entries in table
    uint64_t hash_seed;              // Seed for hash function
} HashedPageTable;

// Page table operations
HashedPageTable* hpt_create(size_t num_buckets, uint64_t seed);
void hpt_destroy(HashedPageTable* table);
bool hpt_insert(HashedPageTable* table, uint64_t vpn, uint64_t pfn,
                PageControlBits control);
bool hpt_lookup(HashedPageTable* table, uint64_t vpn, uint64_t* pfn_out,
                PageControlBits* control_out);
bool hpt_remove(HashedPageTable* table, uint64_t vpn);
void hpt_update_control(HashedPageTable* table, uint64_t vpn,
                        PageControlBits new_control);
```

For a process with 10,000 mapped pages: a linear 64-bit page table needs 32 PB (impossible); a 4-level table needs ~160KB minimum; a hashed table with 16,384 buckets and 10,000 entries needs roughly 368KB (128KB of bucket pointers plus ~240KB of 24-byte entries). The hashed table trades slightly more memory for O(1) expected lookup without multi-level traversal.
The hash function is the heart of a hashed page table. Its quality directly determines lookup performance—a poor hash function leads to clustering, long chains, and degraded O(n) performance. Designing hash functions for virtual page numbers presents unique challenges.
Requirements for Address Translation Hash Functions:
1. Uniform Distribution
The function must distribute VPNs uniformly across buckets. Real-world VPN patterns are NOT uniform: code segments sit at low addresses, the stack near the top of the address space, and heap growth produces long runs of sequential VPNs.
A good hash function must transform these clustered patterns into uniform bucket distribution.
2. Fast Computation
Address translation is on the critical path of EVERY memory access. The hash function is computed millions of times per second. Complex cryptographic hashes (SHA, MD5) are too slow—we need hardware-friendly operations.
3. Low Collision Rate
Collisions require chain traversal. Each collision adds a memory access. Target: average chain length < 2 entries.
4. Deterministic and Reproducible
The same VPN must always hash to the same bucket within a process lifecycle. No randomness per-lookup (though the table can be initialized with random seeds).
Common Hash Function Approaches:
Simple Modulo (Poor Choice):
hash(vpn) = vpn mod table_size
Problem: only the VPN's low-order bits influence the bucket, so sequential VPNs fill sequential buckets and strided or aligned patterns (e.g., VPNs that are multiples of the table size) collide heavily.
Multiplicative Hashing (Better):
hash(vpn) = ((vpn × A) mod 2ʷ) >> (w - log₂(table_size))
Where A is a carefully chosen odd constant (e.g., 2654435769 for 32-bit).
This spreads sequential numbers across the table by exploiting the mathematical properties of multiplication.
XOR-Folding (Good for VPNs):
hash(vpn) = (vpn ^ (vpn >> 16) ^ (vpn >> 32)) mod table_size
Combines high and low bits, effective when address patterns are predictable.
MurmurHash/xxHash (Excellent): Modern non-cryptographic hash functions designed for speed and distribution. These are often used in production systems.
```c
// Hash function implementations for page tables

#include <stdint.h>

// Simple multiplicative hash - fast but moderate distribution
static inline uint64_t hash_multiplicative(uint64_t vpn, uint64_t table_size) {
    // Golden ratio constant for 64-bit
    const uint64_t GOLDEN_RATIO = 0x9E3779B97F4A7C15ULL;
    return ((vpn * GOLDEN_RATIO) >> 32) % table_size;
}

// XOR-folding hash - good for clustered VPNs
static inline uint64_t hash_xor_fold(uint64_t vpn, uint64_t table_size) {
    uint64_t h = vpn;
    h ^= (h >> 23);
    h *= 0x2127599BF4325C37ULL;
    h ^= (h >> 47);
    return h % table_size;
}

// Fibonacci hash - excellent distribution, very fast
static inline uint64_t hash_fibonacci(uint64_t vpn, uint64_t num_buckets_log2) {
    const uint64_t FIBONACCI = 11400714819323198485ULL;  // 2^64 / φ
    return (vpn * FIBONACCI) >> (64 - num_buckets_log2);
}

// MurmurHash3 finalizer - production quality
static inline uint64_t hash_murmur3_finalizer(uint64_t vpn, uint64_t table_size) {
    uint64_t h = vpn;
    h ^= h >> 33;
    h *= 0xFF51AFD7ED558CCDULL;
    h ^= h >> 33;
    h *= 0xC4CEB9FE1A85EC53ULL;
    h ^= h >> 33;
    return h % table_size;
}

// Recommended: Combined approach for page tables
static inline uint64_t hash_vpn(uint64_t vpn, uint64_t seed, uint64_t table_size) {
    // Mix with seed for per-table randomization (security)
    uint64_t h = vpn ^ seed;
    // Murmur3 finalizer for excellent distribution
    h ^= h >> 33;
    h *= 0xFF51AFD7ED558CCDULL;
    h ^= h >> 33;
    h *= 0xC4CEB9FE1A85EC53ULL;
    h ^= h >> 33;
    // Use Fibonacci for final bucket selection (power-of-2 table size)
    // This avoids an expensive modulo operation
    return (h * 11400714819323198485ULL) >> (64 - __builtin_ctzll(table_size));
}
```

Table Size Selection:
The number of buckets in the hash table is a critical design decision:
Too Few Buckets: the load factor climbs, collision chains lengthen, and every lookup pays extra memory accesses.
Too Many Buckets: memory is wasted on empty bucket slots, and the sparse bucket array exhibits poor cache locality.
Optimal Sizing:
A common heuristic is to size the table such that the load factor (entries / buckets) stays between 0.5 and 0.75:
load_factor = num_entries / num_buckets
Recommended: 0.5 ≤ load_factor ≤ 0.75
For 10,000 mapped pages:
Minimum buckets: 10,000 / 0.75 ≈ 13,333
Maximum buckets: 10,000 / 0.5 = 20,000
Practical choice: 16,384 (power of 2 for fast modulo)
Power-of-2 Table Sizes:
Using power-of-2 bucket counts allows replacing expensive modulo operations with fast bit masking:
bucket = hash(vpn) & (table_size - 1) // Only works when table_size is power of 2
This is significantly faster than hash(vpn) % table_size and is used universally in performance-critical implementations.
Without randomization, attackers could craft input patterns that cause pathological hash collisions (hash flooding attacks). Modern systems initialize hash tables with random seeds, making collision attacks computationally infeasible. The seed changes per-process or per-table instantiation, preventing adversarial exploitation.
The lookup process in a hashed page table involves computing the hash, locating the correct bucket, and traversing the collision chain to find the matching entry. Understanding this process in detail reveals the performance characteristics and design trade-offs.
Complete Lookup Algorithm:
PHYSICAL_ADDRESS translate(VIRTUAL_ADDRESS va):
1. Extract virtual page number: vpn = va >> PAGE_OFFSET_BITS
2. Extract page offset: offset = va & PAGE_OFFSET_MASK
3. Compute bucket index: bucket = hash(vpn) mod table_size
4. Get bucket head: entry = table.buckets[bucket]
5. Traverse chain:
while entry != NULL:
if entry.vpn == vpn AND entry.valid:
CHECK PERMISSIONS(entry.control)
UPDATE accessed bit
return (entry.pfn << PAGE_OFFSET_BITS) | offset
entry = entry.next
6. Entry not found: TRIGGER PAGE FAULT
Step-by-Step Analysis:
| Step | Operation | Time Cost | Notes |
|---|---|---|---|
| 1-2 | VPN/Offset Extraction | 1-2 cycles | Simple bit operations, pipelined |
| 3 | Hash Computation | 3-10 cycles | Depends on hash function complexity |
| 4 | Bucket Access | ~100 cycles (cache miss) | Memory load; cached if recently used |
| 5 | Chain Traversal | ~100 cycles per entry | Memory load per chain link |
| 6 | Permission Check | 1-2 cycles | Bit comparisons |
Performance Characteristics:
Best Case (No Collision): the hash lands on a single-entry bucket; one bucket load plus one entry load, roughly two dependent memory accesses after the hash.
Average Case: with a load factor below 0.75, expected chain length stays under 2, so a lookup takes about 2-3 memory accesses.
Worst Case (Collision Chain): many VPNs hash to the same bucket and the lookup degenerates into an O(n) chain walk, a symptom of a poor hash function or an overloaded table.
Comparison with Multi-Level Tables: a hashed lookup averages about two dependent memory accesses regardless of address-space size, whereas an x86-64 four-level walk always pays four; the hashed table wins on average but has a worse, unbounded tail.
```c
// Complete lookup implementation for hashed page table

#include "hashed_page_table.h"
#include <stdio.h>

// Translation result codes
typedef enum {
    TRANSLATE_SUCCESS,
    TRANSLATE_PAGE_FAULT,        // No mapping exists
    TRANSLATE_PROTECTION_FAULT,  // Permission denied
    TRANSLATE_INVALID_ADDRESS    // Address out of range
} TranslateResult;

// Full translation structure
typedef struct {
    TranslateResult result;
    uint64_t physical_address;
    PageControlBits permissions;
    int chain_depth;  // For performance monitoring
} TranslationResult;

// Main translation function
TranslationResult translate_address(HashedPageTable* table,
                                    uint64_t virtual_address,
                                    bool is_write,
                                    bool is_execute,
                                    bool is_user_mode) {
    TranslationResult result = {0};

    // Step 1 & 2: Extract VPN and offset
    const uint64_t PAGE_OFFSET_BITS = 12;  // 4KB pages
    const uint64_t PAGE_OFFSET_MASK = (1ULL << PAGE_OFFSET_BITS) - 1;
    uint64_t vpn = virtual_address >> PAGE_OFFSET_BITS;
    uint64_t offset = virtual_address & PAGE_OFFSET_MASK;

    // Step 3: Compute hash and bucket index
    uint64_t bucket_idx = hash_vpn(vpn, table->hash_seed, table->num_buckets);

    // Step 4: Access bucket
    HashedPageTableEntry* entry = table->buckets[bucket_idx];
    result.chain_depth = 0;

    // Step 5: Traverse chain
    while (entry != NULL) {
        result.chain_depth++;
        if (entry->virtual_page_number == vpn) {
            // Found matching entry; check if valid
            if (!entry->control.valid) {
                result.result = TRANSLATE_PAGE_FAULT;
                return result;
            }
            // Permission checks
            if (is_user_mode && !entry->control.user) {
                result.result = TRANSLATE_PROTECTION_FAULT;
                return result;
            }
            if (is_write && !entry->control.write) {
                result.result = TRANSLATE_PROTECTION_FAULT;
                return result;
            }
            if (is_execute && !entry->control.execute) {
                result.result = TRANSLATE_PROTECTION_FAULT;
                return result;
            }
            // Update accessed bit (atomically in a real implementation)
            entry->control.accessed = 1;
            // Update dirty bit on write
            if (is_write) {
                entry->control.dirty = 1;
            }
            // Compute physical address
            result.physical_address =
                (entry->physical_frame_number << PAGE_OFFSET_BITS) | offset;
            result.permissions = entry->control;
            result.result = TRANSLATE_SUCCESS;
            return result;
        }
        entry = entry->next;
    }

    // Step 6: Entry not found - page fault
    result.result = TRANSLATE_PAGE_FAULT;
    return result;
}
```

In practice, the vast majority of translations hit the TLB (Translation Lookaside Buffer) and never touch the page table. The hashed table is only accessed on TLB misses. A good TLB hit rate (>99%) means page table design affects only ~1% of translations—but that 1% can still be millions of lookups per second, making hash table efficiency crucial.
Standard hashed page tables store one entry per page mapping. Clustered page tables extend this concept by storing multiple contiguous page mappings in each hash table entry. This optimization exploits spatial locality—processes typically map contiguous virtual pages to related physical frames.
The Clustering Insight:
When a process allocates memory, it typically receives contiguous virtual addresses: a 1MB heap allocation or memory-mapped file, for instance, spans 256 consecutive 4KB pages.
Instead of hashing each page individually (creating many entries), we can hash the base page and store multiple translations in one entry.
Clustered Entry Structure:
┌─────────────────────────────────────────────────────────────────┐
│ Clustered Page Table Entry (variable size) │
├─────────────────────────────────────────────────────────────────┤
│ Base VPN (52 bits) │
├─────────────────────────────────────────────────────────────────┤
│ Cluster Size (4 bits: 1-16 pages) │ Control Bits │
├─────────────────────────────────────────────────────────────────┤
│ PFN[0]: Frame for VPN+0 │
│ PFN[1]: Frame for VPN+1 │
│ PFN[2]: Frame for VPN+2 │
│ PFN[3]: Frame for VPN+3 │
│ ... (up to 16 PFNs based on cluster size) │
├─────────────────────────────────────────────────────────────────┤
│ Chain Pointer │
└─────────────────────────────────────────────────────────────────┘
Lookup Modification for Clustered Tables:
LOOKUP(vpn):
1. Compute cluster base: base_vpn = vpn & ~(CLUSTER_SIZE - 1)
2. Compute offset within cluster: cluster_offset = vpn & (CLUSTER_SIZE - 1)
3. Hash using base_vpn: bucket = hash(base_vpn)
4. Search for entry with matching base_vpn
5. Extract PFN[cluster_offset] from entry
Benefits of Clustering: one hash computation and one chain traversal can resolve up to 16 pages; fewer, larger entries lower the effective load factor; and neighboring translations share cache lines.
Drawbacks of Clustering: partially filled clusters waste PFN slots, and control bits shared across the cluster coarsen per-page protection.
When Clustering Works Well: sequential heap and stack growth, memory-mapped files, and other large contiguous allocations.
When Clustering is Suboptimal: fragmented address spaces dominated by isolated single-page mappings, where most cluster slots stay empty.
Real-World Usage:
IA-64 (Itanium) architecture used clustered hashed page tables natively. The PowerPC architecture also supports hashed page tables with clustering. These implementations demonstrated that clustering provides significant benefits for typical workloads while gracefully handling suboptimal cases.
```c
// Clustered hashed page table implementation

#define CLUSTER_BITS 4
#define CLUSTER_SIZE (1 << CLUSTER_BITS)  // 16 pages per cluster
#define CLUSTER_MASK (CLUSTER_SIZE - 1)

typedef struct ClusteredEntry {
    uint64_t base_vpn;            // Base VPN of cluster
    uint64_t pfn[CLUSTER_SIZE];   // Physical frames for each page
    uint16_t valid_bitmap;        // Which of the 16 slots are valid
    PageControlBits control;      // Shared control for cluster
    struct ClusteredEntry* next;  // Chain pointer
} ClusteredEntry;

// Table of clustered entries (mirrors HashedPageTable)
typedef struct {
    ClusteredEntry** buckets;
    size_t num_buckets;
    uint64_t hash_seed;
} ClusteredPageTable;

// Lookup in clustered page table; the caller ORs in the page offset
TranslationResult clustered_lookup(ClusteredPageTable* table, uint64_t vpn) {
    TranslationResult result = {0};

    // Compute cluster base and offset
    uint64_t base_vpn = vpn & ~((uint64_t)CLUSTER_MASK);
    uint64_t cluster_offset = vpn & CLUSTER_MASK;

    // Hash on base VPN only
    uint64_t bucket = hash_vpn(base_vpn, table->hash_seed, table->num_buckets);
    ClusteredEntry* entry = table->buckets[bucket];

    while (entry != NULL) {
        if (entry->base_vpn == base_vpn) {
            // Check if this specific slot in the cluster is valid
            if (!(entry->valid_bitmap & (1u << cluster_offset))) {
                result.result = TRANSLATE_PAGE_FAULT;
                return result;
            }
            // Get PFN from cluster and form the frame's base physical address
            uint64_t pfn = entry->pfn[cluster_offset];
            result.physical_address = pfn << 12;
            result.result = TRANSLATE_SUCCESS;
            return result;
        }
        entry = entry->next;
    }

    result.result = TRANSLATE_PAGE_FAULT;
    return result;
}
```

Achieving optimal performance from hashed page tables requires careful attention to several interacting factors. Understanding these considerations is essential for system designers and kernel developers.
Load Factor Management:
The load factor (entries / buckets) is the primary determinant of lookup performance:
Expected chain length ≈ load_factor (for uniform hashing), so a lookup touches roughly 1 + load_factor entries:

| Load Factor | Expected Lookups | Recommendation |
|---|---|---|
| 0.25 | 1.25 | Oversized; wastes memory |
| 0.50 | 1.50 | Good balance |
| 0.75 | 1.75 | Acceptable |
| 1.00 | 2.00 | Consider resizing |
| 2.00 | 3.00 | Performance degrading |
Dynamic Resizing:
As processes map more pages, the hash table may need to grow. Common resizing strategies are doubling the bucket array and rehashing every entry at once, or incremental rehashing that migrates a few buckets on each operation.
Resizing is Expensive: every entry must be rehashed and relinked, and translations must be synchronized against the move, so practical implementations size the table generously up front.
Cache Behavior Analysis:
Memory access patterns critically affect hash table performance:
Bucket Array Caching: the bucket array is compact and contiguous, so frequently used buckets tend to stay resident in the L1/L2 caches.
Entry Caching: chain entries are individually allocated and scattered through memory, so each link followed risks a fresh cache miss.
Workload-Dependent Behavior:
| Access Type | Bucket | Entry | Total Cycles (approx) |
|---|---|---|---|
| L1 Hit (hot) | 4 cycles | 4 cycles | ~20 cycles |
| L2 Hit (warm) | 12 cycles | 12 cycles | ~40 cycles |
| L3 Hit (recent) | 40 cycles | 40 cycles | ~100 cycles |
| Memory (cold) | 100 cycles | 100+ cycles | ~250+ cycles |
Hardware Considerations:
TLB Miss Penalty: The hash table is only accessed on TLB misses, and on modern CPUs a miss already costs tens to hundreds of cycles, so the table walk dominates miss latency.
Memory Prefetching: Modern CPUs have hardware prefetchers that can hide memory latency for sequential scans, but pointer chasing defeats them, since the next chain address is unknown until the current load completes.
Simultaneous Multi-Threading (SMT/Hyperthreading): while one hardware thread stalls on a page-table walk, the sibling thread can keep the core busy, partially hiding the miss cost.
In practice, hashed page tables often outperform multi-level tables for 64-bit systems despite the collision overhead. The key is that average-case O(1+ε) with ε small beats guaranteed O(4) for common workloads. However, absolute worst-case latency matters for real-time systems—multi-level tables may be preferred when bounded latency is critical.
Hashed page tables represent a fundamental rethinking of address translation structure. By using hash functions to map virtual page numbers to table entries, they achieve translation lookup that scales with actual memory usage rather than potential address space size—a critical advantage for 64-bit architectures.
What's Next:
Hashed page tables must handle the case when multiple virtual page numbers hash to the same bucket—collisions. The next page explores collision handling strategies in depth, examining chaining, open addressing, and the trade-offs each approach presents for address translation workloads.
You now understand hashed page tables comprehensively—from the motivation behind their design through the internal structure, hash function requirements, lookup mechanisms, and performance considerations. This foundation prepares you to understand collision handling, inverted page tables, and their role in modern operating system design.