Two-level paging elegantly solves the page table size problem for 32-bit systems. With a 10-10-12 address split, we achieve page-sized tables at both levels and dramatic memory savings for sparse address spaces.
But computing has evolved. Modern systems use 64-bit addresses, providing a theoretical address space of 2⁶⁴ bytes—16 exabytes. Even though current implementations typically use only 48 or 57 bits of this space, the scale is staggering: a 48-bit space (256TB) is 65,536 times larger than the entire 32-bit address space.
Two-level paging cannot handle these address spaces efficiently. The outer table alone would require billions of entries. We need deeper hierarchies—more levels in our page table tree to keep each level manageable.
This page explores three-level paging as the conceptual bridge between two-level and modern four/five-level systems. You'll understand when three levels become necessary, how the address is decomposed across three indices, the translation process through three tables, and why modern systems extended to four or more levels.
To understand why we need three levels (and beyond), let's analyze what happens when we apply two-level paging to larger address spaces.
The scaling problem:
In a two-level system with 4KB pages, we split the virtual address into:
For the inner table to be page-sized (4KB = 1024 entries × 4 bytes), it uses 10 bits of index. This leaves all remaining bits for the outer index.
| Address Bits | Outer Index | Inner Index | Offset | Outer Table Size | Entries |
|---|---|---|---|---|---|
| 32-bit | 10 bits | 10 bits | 12 bits | 4KB | 1,024 |
| 36-bit | 14 bits | 10 bits | 12 bits | 64KB | 16,384 |
| 40-bit | 18 bits | 10 bits | 12 bits | 1MB | 262,144 |
| 48-bit | 26 bits | 10 bits | 12 bits | 256MB | 67M |
| 57-bit | 35 bits | 10 bits | 12 bits | 128GB | 34B |
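The outer-table sizes in this table follow directly from the bit split; a short Python script (in the same style as the document's later analysis code) reproduces them:

```python
# Outer-table growth under two-level paging with a fixed 10-bit inner index.
# Assumes 4KB pages (12 offset bits) and 4-byte entries, as in the table above.

ENTRY_SIZE = 4    # bytes per outer-table entry
INNER_BITS = 10   # inner index fixed so inner tables stay page-sized
OFFSET_BITS = 12  # 4KB pages

def outer_table(address_bits):
    """Return (outer_index_bits, entries, size_in_bytes) for a two-level split."""
    outer_bits = address_bits - INNER_BITS - OFFSET_BITS
    entries = 2 ** outer_bits
    return outer_bits, entries, entries * ENTRY_SIZE

for bits in (32, 36, 40, 48, 57):
    outer_bits, entries, size = outer_table(bits)
    print(f"{bits}-bit: outer index {outer_bits} bits, "
          f"{entries:,} entries, {size / 1024:,.0f} KB")
# At 48 bits the outer table alone reaches 256MB -- already larger
# than an entire flat page table for a 32-bit system.
```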
The outer table explosion:
As the address space grows, the outer table grows exponentially. For a 48-bit address space, the outer table alone is 256MB per process.
This is worse than a flat page table for 32-bit systems! The outer table must be fully allocated even for sparse address spaces because it's a single contiguous array.
The fundamental problem:
Two-level paging only saves memory when the inner tables can be sparse. But if the outer table is huge, we've just moved the problem rather than solving it. We need the outer table itself to be sparse.
For multi-level paging to save memory, EVERY level except the innermost must be sparse. If any level requires a huge contiguous array, the memory savings are lost. This is why we need enough levels to keep each level's table page-sized.
The solution: Add another level
By adding a third level, we can split the outer table into smaller chunks:
Two-Level (48-bit):
Level 1 (26 bits) → Level 2 (10 bits) → Offset (12 bits)
Problem: Level 1 is 256MB
Three-Level (48-bit):
Level 1 (16 bits) → Level 2 (10 bits) → Level 3 (10 bits) → Offset (12 bits)
Improvement: Level 1 is now 256KB
Four-Level (48-bit):
L1 (9 bits) → L2 (9 bits) → L3 (9 bits) → L4 (9 bits) → Offset (12 bits)
Each level: 512 entries × 8 bytes = 4KB ✓
This is exactly what modern 64-bit systems do: use four levels (or five for 57-bit addresses) to keep each table page-sized.
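The required level count falls out of simple arithmetic. A small sketch (a hypothetical helper, assuming 4KB pages and 8-byte entries as on x86-64):

```python
import math

PAGE_SIZE = 4096
ENTRY_SIZE = 8
OFFSET_BITS = 12
# A page-sized table holds 4096 / 8 = 512 entries -> 9 index bits per level.
BITS_PER_LEVEL = int(math.log2(PAGE_SIZE // ENTRY_SIZE))  # 9

def levels_needed(va_bits):
    """Minimum number of levels so every table is at most one page."""
    return math.ceil((va_bits - OFFSET_BITS) / BITS_PER_LEVEL)

for va_bits in (32, 39, 48, 57):
    print(f"{va_bits}-bit VA -> {levels_needed(va_bits)} levels")
# 48-bit -> 4 levels (x86-64), 57-bit -> 5 levels (LA57).
# Note: with 8-byte entries even 32 bits needs 3 levels -- exactly x86 PAE.
```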
A three-level page table introduces an additional layer of indirection between the top-level directory and the page tables that directly map physical frames.
Hierarchical structure:
┌─────────────────────┐
│ Level 1 Directory │ ← CR3 register points here
│ (Root Table) │ 1 table always allocated
└─────────┬───────────┘
│ L1 index selects entry
▼
┌─────────────────────┐
│ Level 2 Directory │ ← Multiple possible tables
│ (Middle Tables) │ Only allocated if L1 entry is valid
└─────────┬───────────┘
│ L2 index selects entry
▼
┌─────────────────────┐
│ Level 3 Tables │ ← Multiple possible tables
│ (Page Tables) │ Only allocated if L2 entry is valid
└─────────┬───────────┘
│ L3 index selects entry
▼
┌─────────────────────┐
│ Physical Frame │ ← Frame number from L3 entry
│ + Offset │ Offset from virtual address
└─────────────────────┘
```c
// Three-Level Page Table Structure for Larger Address Spaces
// Example: 40-bit virtual address with 4KB pages

#include <stdint.h>

#define PAGE_SIZE 4096
#define PAGE_SHIFT 12
#define ENTRIES_PER_TABLE 1024  // 2^10 entries per table

// Address field sizes
#define L1_BITS 8       // Level 1 index bits
#define L2_BITS 10      // Level 2 index bits
#define L3_BITS 10      // Level 3 index bits
#define OFFSET_BITS 12  // Page offset bits
// Total: 8 + 10 + 10 + 12 = 40 bits

// Address field masks and shifts
#define OFFSET_MASK 0xFFF  // Bits 0-11
#define L3_MASK 0x3FF      // 10 bits
#define L2_MASK 0x3FF      // 10 bits
#define L1_MASK 0xFF       // 8 bits

#define L3_SHIFT 12
#define L2_SHIFT 22
#define L1_SHIFT 32

// Entry types (all 64 bits for alignment)
typedef uint64_t L1Entry;  // Points to L2 table
typedef uint64_t L2Entry;  // Points to L3 table
typedef uint64_t L3Entry;  // Contains frame number

// Entry flags (same as x86-64)
#define ENTRY_PRESENT  (1ULL << 0)
#define ENTRY_WRITE    (1ULL << 1)
#define ENTRY_USER     (1ULL << 2)
#define ENTRY_ACCESSED (1ULL << 5)
#define ENTRY_DIRTY    (1ULL << 6)
#define ENTRY_ADDR_MASK 0x000FFFFFFFFFF000ULL

// Table types
typedef L1Entry Level1Table[256];   // 2^8 = 256 entries
typedef L2Entry Level2Table[1024];  // 2^10 = 1024 entries
typedef L3Entry Level3Table[1024];  // 2^10 = 1024 entries

// Level 1: 256 entries × 8 bytes = 2KB (not a full page, suboptimal)
// Level 2: 1024 entries × 8 bytes = 8KB (2 pages, also suboptimal)
// Level 3: 1024 entries × 8 bytes = 8KB (2 pages)

// NOTE: Real systems adjust bit splits to make each level page-sized.
// x86-64 uses 9-9-9-9-12 for exactly 512 entries × 8 bytes = 4KB per table.
```
Naming conventions:
Different architectures and texts use various names for the levels:
| Level | Intel x86-64 | Generic Name | Role |
|---|---|---|---|
| 1 | PML4 (Page Map Level 4) | Page Global Directory | Root table, pointed to by CR3 |
| 2 | PDPT (Page Directory Pointer Table) | Page Upper Directory | Second-level directory |
| 3 | PD (Page Directory) | Page Middle Directory | Third-level directory |
| 4 | PT (Page Table) | Page Table | Contains frame numbers |
Note: Intel's naming is for four-level paging. For pure three-level, levels 1-3 would be PDPT, PD, and PT.
The design principle:
Each level serves as a directory for the level below it. Only the bottommost level (L3 in three-level, L4 in four-level) contains actual frame mappings. All upper levels contain pointers to tables at the next level down, with validity bits indicating whether those tables exist.
With 8-byte entries (64-bit pointers), a 4KB page holds exactly 512 entries. log₂(512) = 9 bits per level. Four levels × 9 bits = 36 bits for page table indexing, plus 12 bits for offset = 48-bit addresses. This elegant design makes every table exactly one page.
In three-level paging, the virtual address is decomposed into four fields:
┌────────────────────────────────────────────────────────────────────────────────┐
│ Virtual Address (e.g., 40 bits) │
├──────────────┬──────────────┬──────────────┬───────────────────────────────────┤
│ Bits 39-32 │ Bits 31-22 │ Bits 21-12 │ Bits 11-0 │
│ (8 bits) │ (10 bits) │ (10 bits) │ (12 bits) │
├──────────────┼──────────────┼──────────────┼───────────────────────────────────┤
│ L1 │ L2 │ L3 │ Offset │
│ (Root) │ (Middle) │ (Bottom) │ (Page offset) │
└──────────────┴──────────────┴──────────────┴───────────────────────────────────┘
Each field serves a specific purpose in navigating the three-level hierarchy:
```c
// Three-Level Address Decomposition and Extraction

#include <stdint.h>
#include <stdio.h>

// Example configuration for 40-bit virtual addresses
#define VA_BITS 40
#define OFFSET_BITS 12
#define L3_BITS 10
#define L2_BITS 10
#define L1_BITS (VA_BITS - OFFSET_BITS - L3_BITS - L2_BITS)  // 8 bits

// Extraction macros
#define GET_OFFSET(va) ((va) & ((1UL << OFFSET_BITS) - 1))
#define GET_L3(va) (((va) >> OFFSET_BITS) & ((1UL << L3_BITS) - 1))
#define GET_L2(va) (((va) >> (OFFSET_BITS + L3_BITS)) & ((1UL << L2_BITS) - 1))
#define GET_L1(va) (((va) >> (OFFSET_BITS + L3_BITS + L2_BITS)) & ((1UL << L1_BITS) - 1))

// Demonstration function
void decompose_address(uint64_t virtual_address) {
    uint64_t l1_index = GET_L1(virtual_address);
    uint64_t l2_index = GET_L2(virtual_address);
    uint64_t l3_index = GET_L3(virtual_address);
    uint64_t offset   = GET_OFFSET(virtual_address);

    printf("Virtual Address: 0x%010lX\n", virtual_address);
    printf("Decomposition (40-bit address, 8-10-10-12 split):\n");
    printf("  L1 Index (Root):   %3lu (0x%02lX)\n", l1_index, l1_index);
    printf("  L2 Index (Middle): %4lu (0x%03lX)\n", l2_index, l2_index);
    printf("  L3 Index (Bottom): %4lu (0x%03lX)\n", l3_index, l3_index);
    printf("  Page Offset:       %4lu (0x%03lX)\n", offset, offset);
}

/*
 * Example: Virtual address 0x02BCDEF112
 * Binary:  0000 0010 1011 1100 1101 1110 1111 0001 0001 0010
 *          └───────┘└──────────┘└──────────┘└────────────┘
 *           L1 (8b)   L2 (10b)    L3 (10b)   Offset (12b)
 *
 * L1 Index: 0x02  = 2
 * L2 Index: 0x2F3 = 755
 * L3 Index: 0x1EF = 495
 * Offset:   0x112 = 274
 */
```
The number of bits per level determines both table sizes and maximum coverage. More bits at a level means larger tables but fewer tables needed; fewer bits means smaller tables but deeper nesting. Modern systems optimize for page-sized tables at every level.
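The worked example's field values (L1=2, L2=755, L3=495, offset=274) can be cross-checked in a few lines of Python; under the 8-10-10-12 split those four fields pack to virtual address 0x02BCDEF112:

```python
# Pack/unpack a 40-bit virtual address under the 8-10-10-12 split.
OFFSET_BITS, L3_BITS, L2_BITS, L1_BITS = 12, 10, 10, 8

def pack(l1, l2, l3, offset):
    """Assemble a 40-bit virtual address from its four fields."""
    return ((l1 << (OFFSET_BITS + L3_BITS + L2_BITS))
            | (l2 << (OFFSET_BITS + L3_BITS))
            | (l3 << OFFSET_BITS)
            | offset)

def unpack(va):
    """Split a 40-bit virtual address back into (l1, l2, l3, offset)."""
    return (va >> 32) & 0xFF, (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF

va = pack(2, 755, 495, 274)
print(hex(va))     # 0x2bcdef112
print(unpack(va))  # (2, 755, 495, 274)
```

Packing and unpacking are exact inverses, so any translation example can be verified in both directions.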
The translation process in three-level paging extends the two-level approach with an additional table lookup. Each level must be traversed sequentially, as each entry provides the address of the next table.
```c
// Complete Three-Level Address Translation

#include <stdint.h>

#define ENTRY_PRESENT   (1ULL << 0)
#define ENTRY_ADDR_MASK 0x0000FFFFFFFFF000ULL  // Bits 12-51 hold the table/frame address

// Fault reasons reported to the page fault handler (illustrative)
enum {
    FAULT_L2_NOT_PRESENT,    // L1 entry invalid: no L2 table
    FAULT_L3_NOT_PRESENT,    // L2 entry invalid: no L3 table
    FAULT_PAGE_NOT_PRESENT,  // L3 entry invalid: page not mapped
};

typedef uint64_t* PageTable;

// Hardware register holding the L1 (root) table's physical address
extern uint64_t cr3_register;

// Convert a physical address to a kernel virtual address
extern void* phys_to_virt(uint64_t phys_addr);

// Page fault handler (does not return on fault)
extern void page_fault(uint64_t vaddr, int reason);

uint64_t translate_three_level(uint64_t virtual_address) {
    // STEP 1: Decompose virtual address
    uint64_t l1_index = GET_L1(virtual_address);
    uint64_t l2_index = GET_L2(virtual_address);
    uint64_t l3_index = GET_L3(virtual_address);
    uint64_t offset   = GET_OFFSET(virtual_address);

    // STEP 2: Access L1 table (root) — first memory access
    PageTable l1_table = phys_to_virt(cr3_register);
    uint64_t l1_entry = l1_table[l1_index];

    // STEP 3: Validate L1 entry
    if (!(l1_entry & ENTRY_PRESENT)) {
        page_fault(virtual_address, FAULT_L2_NOT_PRESENT);
        return 0;  // Not reached
    }

    // STEP 4: Access L2 table — second memory access
    uint64_t l2_phys = l1_entry & ENTRY_ADDR_MASK;
    PageTable l2_table = phys_to_virt(l2_phys);
    uint64_t l2_entry = l2_table[l2_index];

    // STEP 5: Validate L2 entry
    if (!(l2_entry & ENTRY_PRESENT)) {
        page_fault(virtual_address, FAULT_L3_NOT_PRESENT);
        return 0;  // Not reached
    }

    // STEP 6: Access L3 table (page table) — third memory access
    uint64_t l3_phys = l2_entry & ENTRY_ADDR_MASK;
    PageTable l3_table = phys_to_virt(l3_phys);
    uint64_t l3_entry = l3_table[l3_index];

    // STEP 7: Validate L3 entry
    if (!(l3_entry & ENTRY_PRESENT)) {
        page_fault(virtual_address, FAULT_PAGE_NOT_PRESENT);
        return 0;  // Not reached
    }

    // STEP 8 & 9: Extract frame and form physical address
    uint64_t frame_address = l3_entry & ENTRY_ADDR_MASK;
    return frame_address | offset;
}

/*
 * Memory access count: 3 (L1, L2, L3) + 1 (actual data) = 4 total
 *
 * Without a TLB: every memory access requires 4 physical reads —
 * 4× slower than direct physical access.
 *
 * With a TLB: translations are cached, so most accesses need only
 * 1 read. The TLB is essential for multi-level paging viability.
 */
```
Visualizing the translation:
Virtual Address: 0x02BCDEF112 (40-bit example)
├───┤├─────┤├─────┤├───────┤
L1:2 L2:755 L3:495 Off:274
CR3: 0x00001000
│
▼
┌─────────────────────────────────────────┐
│ L1 Table @ 0x00001000 │
│ Entry[0]: 0x0 (not present) │
│ Entry[1]: 0x0 (not present) │
│ Entry[2]: 0x00072003 ─────────────────────┐
│ ... │ │
└─────────────────────────────────────────┘ │
│
┌──────────────────────────────┘
▼
┌─────────────────────────────────────────┐
│ L2 Table @ 0x00072000 │
│ Entry[755]: 0x00089003 ───────────────────┐
│ ... │ │
└─────────────────────────────────────────┘ │
│
┌──────────────────────────────┘
▼
┌─────────────────────────────────────────┐
│ L3 Table @ 0x00089000 │
│ Entry[495]: 0x00456003 ───────────────────┐
│ (Frame: 0x00456, Flags: Present, RW) │ │
│ ... │ │
└─────────────────────────────────────────┘ │
│
┌──────────────────────────────┘
▼
┌─────────────────────────────────────────┐
│ Physical Frame @ 0x00456000 │
│ Physical Address = 0x00456000 + 274 │
│ = 0x00456112 │
└─────────────────────────────────────────┘
Three-level paging requires three memory accesses before accessing the actual data—a 4× overhead compared to direct physical access. This makes the TLB (Translation Lookaside Buffer) absolutely critical. Without TLB caching, multi-level paging would be impractically slow.
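The TLB's impact can be quantified with the usual effective-access-time formula. A sketch with illustrative numbers (the hit ratios and 100ns memory latency are assumptions for demonstration, not measurements):

```python
def effective_access_time(hit_ratio, mem_ns=100, levels=3):
    """EAT: a TLB hit needs only the data access; a miss walks all levels first."""
    hit_cost = mem_ns                  # translation cached -> one data access
    miss_cost = (levels + 1) * mem_ns  # walk every level, then fetch the data
    return hit_ratio * hit_cost + (1 - hit_ratio) * miss_cost

for hit in (0.0, 0.90, 0.99):
    print(f"TLB hit ratio {hit:.0%}: {effective_access_time(hit):.0f} ns")
# 0% -> 400 ns (the 4x slowdown); 99% -> 103 ns (only 3% overhead)
```

Even a modest miss penalty of three extra reads per access collapses to a few percent once hit rates reach the high nineties, which is why TLBs make deep hierarchies viable.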
Three-level paging extends the memory savings of two-level paging to larger address spaces. Let's analyze the overhead for different usage patterns.
```python
# Three-Level Page Table Memory Analysis

# Configuration: 48-bit address space
# If using three levels: might use 16-10-10-12 split
#   L1: 16 bits = 65,536 entries × 8 bytes = 512KB (problematic!)
#   L2: 10 bits =  1,024 entries × 8 bytes = 8KB
#   L3: 10 bits =  1,024 entries × 8 bytes = 8KB

# Better: use more levels to keep tables page-sized
# x86-64 four-level: 9-9-9-9-12 split
#   Each table: 512 entries × 8 bytes = 4KB ✓

def analyze_three_level(address_bits, l1_bits, l2_bits, l3_bits):
    """Analyze memory overhead for three-level paging."""
    offset_bits = 12  # 4KB pages
    entry_size = 8    # 64-bit entries

    l1_entries = 2 ** l1_bits
    l2_entries = 2 ** l2_bits
    l3_entries = 2 ** l3_bits

    l1_table_size = l1_entries * entry_size
    l2_table_size = l2_entries * entry_size
    l3_table_size = l3_entries * entry_size

    # Coverage per table
    l3_coverage = l3_entries * (2 ** offset_bits)  # Bytes per L3 table
    l2_coverage = l2_entries * l3_coverage         # Bytes per L2 table
    l1_coverage = l1_entries * l2_coverage         # Total address space

    print(f"Three-Level Analysis: {address_bits}-bit addresses")
    print(f"Split: {l1_bits}-{l2_bits}-{l3_bits}-{offset_bits}")
    print()
    print(f"L1 (Root) Table:")
    print(f"  Entries: {l1_entries:,}")
    print(f"  Size: {l1_table_size:,} bytes ({l1_table_size/1024:.1f} KB)")
    print(f"  Each entry covers: {l2_coverage / (1024**2):.1f} MB")
    print()
    print(f"L2 (Middle) Tables:")
    print(f"  Entries per table: {l2_entries:,}")
    print(f"  Size per table: {l2_table_size:,} bytes ({l2_table_size/1024:.1f} KB)")
    print(f"  Each entry covers: {l3_coverage / (1024**2):.1f} MB")
    print(f"  Max L2 tables: {l1_entries:,}")
    print()
    print(f"L3 (Bottom) Tables:")
    print(f"  Entries per table: {l3_entries:,}")
    print(f"  Size per table: {l3_table_size:,} bytes ({l3_table_size/1024:.1f} KB)")
    print(f"  Each entry covers: {2**offset_bits / 1024:.1f} KB (one page)")
    print(f"  Max L3 tables: {l1_entries * l2_entries:,}")

    return l1_table_size, l2_table_size, l3_table_size

# Example: 40-bit address with 8-10-10-12 split
print("=" * 60)
analyze_three_level(40, 8, 10, 10)
print("=" * 60)

# Overhead for a sparse 20MB process (8-10-10-12 split):
#   Code:  8MB at 0x00400000  → L2 entries 1-2
#   Heap:  8MB at 0x01000000  → L2 entries 4-5
#   Stack: 4MB at 0xFF000000  → L2 entry 1020
#
# L1 table:  2KB (always allocated)
# L2 tables: 1 (all three regions fall under L1 entry 0 — the first 4GB)
# L3 tables: 5 (one per used 4MB region), 8KB each
#
# Total: 2KB + 1×8KB + 5×8KB = 50KB
#
# Flat table for 40-bit: 2^28 entries × 8 bytes = 2GB per process!
```
| Scenario | Used Memory | L1 Tables | L2 Tables | L3 Tables | Total Overhead |
|---|---|---|---|---|---|
| Minimal process (1MB) | 1MB | 1 (4KB)* | 1 (4KB) | 1 (4KB) | 12KB |
| Small process (20MB) | 20MB | 1 (4KB)* | 3 (12KB) | 5 (20KB) | 36KB |
| Medium process (100MB) | 100MB | 1 (4KB)* | 4 (16KB) | 25 (100KB) | 120KB |
| Large process (1GB) | 1GB | 1 (4KB)* | ~8 (32KB) | 256 (1MB) | ~1MB |
| Huge process (10GB) | 10GB | 1 (4KB)* | ~40 (160KB) | 2560 (10MB) | ~10MB |
*Illustrative counts assuming 4KB page-sized tables; exact table counts depend on the bit split, entry size, and how scattered the mappings are (x86-64 actually uses four levels).
Key observations:
Overhead scales with usage, not address space — A process using 20MB needs ~36KB of page tables, regardless of whether the address space is 40-bit or 48-bit.
Small processes benefit most — A 1MB process has only 12KB overhead—a 1.2% overhead. This is vastly better than a flat table.
Large processes have reasonable overhead — Even a 10GB process needs only 10MB of page tables—0.1% overhead.
Three levels may not be enough — With a 48-bit address space, keeping L1 page-sized requires four levels, not three.
Multi-level paging achieves O(used memory) overhead instead of O(address space) overhead. This fundamental shift makes huge virtual address spaces practical, enabling modern computing's memory abstractions.
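The O(used memory) claim can be checked numerically. A rough model (assuming a 9-9-9-9-12 split where each bottom-level table maps 2MB, each table costs 4KB, and regions are densely packed; real sparse layouts need a few extra directories):

```python
import math

PAGE = 4096
PT_COVERAGE = 512 * PAGE          # 2MB mapped per bottom-level table
PD_COVERAGE = 512 * PT_COVERAGE   # 1GB mapped per page directory

def table_overhead(used_bytes):
    """Approximate page-table bytes for a densely packed region of used_bytes."""
    pts = math.ceil(used_bytes / PT_COVERAGE)  # bottom-level tables
    pds = math.ceil(used_bytes / PD_COVERAGE)  # directories above them
    return (1 + 1 + pds + pts) * PAGE          # + root + one upper directory

for mb in (1, 20, 100, 1024):
    used = mb * 1024 * 1024
    oh = table_overhead(used)
    print(f"{mb:>5} MB used -> {oh/1024:>6.0f} KB of tables ({oh/used:.3%})")
# Overhead tracks usage, converging to ~0.2% for large processes --
# and is completely independent of the 256TB address-space size.
```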
Multi-level page tables enable a powerful optimization: large pages (also called huge pages or superpages). Instead of traversing all levels to map a 4KB page, an intermediate entry can directly map a larger region, skipping lower levels.
How large pages work:
In a three-level system with appropriate bit splits, an L2 entry can directly map a 2MB region (the same as what one L3 table would cover). Similarly, an L1 entry could map a 1GB region. When a large page is used, the walk stops at that entry, and the remaining index bits simply become part of the page offset.
```c
// Large Page Support in Three-Level Paging
//
// NOTE: This example assumes a 9-9-9-12 split (e.g. AArch64 with 39-bit
// virtual addresses), so l2_index and l3_index are 9-bit fields. With
// that split an L2 entry maps 2MB and an L1 entry maps 1GB.

#include <stdbool.h>
#include <stdint.h>

// Standard page sizes
#define PAGE_4KB (1UL << 12)  // 4KB - Standard page
#define PAGE_2MB (1UL << 21)  // 2MB - Large page (L3 skipped)
#define PAGE_1GB (1UL << 30)  // 1GB - Huge page (L2 & L3 skipped)

// Large page flag (PS bit in x86-64)
#define ENTRY_LARGE_PAGE (1UL << 7)

// Detecting large pages
bool is_large_page_l2(uint64_t l2_entry) {
    return (l2_entry & ENTRY_LARGE_PAGE) != 0;
}

bool is_large_page_l1(uint64_t l1_entry) {
    return (l1_entry & ENTRY_LARGE_PAGE) != 0;
}

// Translation with large page support
uint64_t translate_with_large_pages(uint64_t virtual_address) {
    uint64_t l1_index   = GET_L1(virtual_address);
    uint64_t l2_index   = GET_L2(virtual_address);
    uint64_t l3_index   = GET_L3(virtual_address);
    uint64_t offset_4kb = GET_OFFSET(virtual_address);

    // Access L1 table
    PageTable l1_table = phys_to_virt(cr3_register);
    uint64_t l1_entry = l1_table[l1_index];
    if (!(l1_entry & ENTRY_PRESENT)) {
        page_fault(virtual_address, FAULT_L1);
        return 0;
    }

    // Check for 1GB large page at L1
    if (is_large_page_l1(l1_entry)) {
        // L1 entry directly maps a 1GB page
        // Offset = L2 index bits + L3 index bits + page offset
        uint64_t offset_1gb = (l2_index << 21) | (l3_index << 12) | offset_4kb;
        uint64_t frame_1gb = l1_entry & 0xFFFFFFC0000000UL;  // 1GB aligned
        return frame_1gb | offset_1gb;
    }

    // Access L2 table
    uint64_t l2_phys = l1_entry & ENTRY_ADDR_MASK;
    PageTable l2_table = phys_to_virt(l2_phys);
    uint64_t l2_entry = l2_table[l2_index];
    if (!(l2_entry & ENTRY_PRESENT)) {
        page_fault(virtual_address, FAULT_L2);
        return 0;
    }

    // Check for 2MB large page at L2
    if (is_large_page_l2(l2_entry)) {
        // L2 entry directly maps a 2MB page
        // Offset = L3 index bits + page offset
        uint64_t offset_2mb = (l3_index << 12) | offset_4kb;
        uint64_t frame_2mb = l2_entry & 0xFFFFFFFFE00000UL;  // 2MB aligned
        return frame_2mb | offset_2mb;
    }

    // Standard 4KB page - continue to L3
    uint64_t l3_phys = l2_entry & ENTRY_ADDR_MASK;
    PageTable l3_table = phys_to_virt(l3_phys);
    uint64_t l3_entry = l3_table[l3_index];
    if (!(l3_entry & ENTRY_PRESENT)) {
        page_fault(virtual_address, FAULT_L3);
        return 0;
    }

    uint64_t frame_4kb = l3_entry & ENTRY_ADDR_MASK;
    return frame_4kb | offset_4kb;
}
```
Modern Linux supports Transparent Huge Pages (THP), which automatically uses 2MB pages when possible without application changes. The kernel's khugepaged daemon coalesces small pages into huge pages in the background, and can split them back under memory pressure.
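The large-page offset arithmetic can be sanity-checked on its own. A small sketch assuming the 9-9-9-12 split (2MB large pages at L2), with an illustrative frame base:

```python
OFFSET_BITS, L3_BITS = 12, 9

def large_page_2mb(frame_base, l3_index, offset):
    """Physical address within a 2MB large page: the 9 L3 index bits
    simply merge with the 12-bit page offset into a 21-bit offset."""
    assert frame_base % (1 << 21) == 0, "2MB pages must be 2MB-aligned"
    assert l3_index < (1 << L3_BITS) and offset < (1 << OFFSET_BITS)
    return frame_base | (l3_index << OFFSET_BITS) | offset

pa = large_page_2mb(0x40000000, l3_index=3, offset=0x123)
print(hex(pa))  # 0x40003123
```

Because the frame is 2MB-aligned, the OR is equivalent to addition: the hardware never needs the L3 table at all for this mapping.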
While three-level paging works for address spaces up to around 40-42 bits, modern 64-bit systems typically support 48-bit addresses (256TB), requiring four levels to keep all tables page-sized.
Why x86-64 uses four levels:
With 48-bit addresses, 4KB pages, and 8-byte entries:
Address bits to index: 48 - 12 = 36 bits
If 3 levels: 36 / 3 = 12 bits per level → 4096 entries × 8 bytes = 32KB per table
If 4 levels: 36 / 4 = 9 bits per level → 512 entries × 8 bytes = 4KB per table ✓
Four levels of 9 bits each give exactly page-sized tables:
| Level | Name | Index Bits | Entries | Entry Size | Table Size | Coverage |
|---|---|---|---|---|---|---|
| 1 | PML4 | 39-47 (9 bits) | 512 | 8 bytes | 4KB | 512GB per entry |
| 2 | PDPT | 30-38 (9 bits) | 512 | 8 bytes | 4KB | 1GB per entry |
| 3 | PD | 21-29 (9 bits) | 512 | 8 bytes | 4KB | 2MB per entry |
| 4 | PT | 12-20 (9 bits) | 512 | 8 bytes | 4KB | 4KB per entry |
| — | Offset | 0-11 (12 bits) | 4096 | 1 byte | — | Byte within page |
The extension to five levels:
Intel's newer processors support 5-level paging for 57-bit addresses (128PB), adding a PML5 level:
5-Level: PML5 → PML4 → PDPT → PD → PT → Frame + Offset
9 bits 9 bits 9 bits 9 bits 9 bits 12 bits = 57 bits
This provides headroom for future memory expansion while maintaining the elegant 4KB-per-table design.
Architectural comparison:
| Architecture | Levels | Address Bits | Page Size | Entry Size |
|---|---|---|---|---|
| x86 (32-bit) | 2 | 32 | 4KB | 4 bytes |
| x86 PAE | 3 | 32 (36 physical) | 4KB | 8 bytes |
| x86-64 | 4 | 48 | 4KB | 8 bytes |
| x86-64 (LA57) | 5 | 57 | 4KB | 8 bytes |
| ARM AArch32 | 2 | 32 | 4KB | 4 bytes |
| ARM AArch64 | 3-4 | 39-48 | 4KB | 8 bytes |
Each additional level adds one memory access to translation. With 5 levels, a TLB miss requires 5 memory accesses before reaching data. More levels would make cold TLB misses prohibitively expensive. The 4-5 level design balances table size, address space, and translation latency.
Three-level paging extends the hierarchical approach to accommodate larger address spaces while maintaining the memory efficiency benefits of multi-level structures.
What's next:
We've now seen two-level and three-level paging as specific instances of hierarchical page tables. In the next page, we'll generalize these concepts to understand hierarchical page tables as a universal pattern—examining the principles that govern any multi-level structure, regardless of the specific number of levels.
You now understand three-level paging: why additional levels are needed for larger address spaces, how the address is decomposed across three indices, the complete translation algorithm, the role of large pages, and how three-level concepts extend to modern four and five-level systems.