Loading learning content...
Virtual memory provides each process with the illusion of a vast, contiguous address space. But this abstraction comes at a cost: the operating system must maintain page tables that map virtual addresses to physical frames. These page tables consume memory—sometimes substantial amounts.
Consider a 64-bit system with 4KB pages. A naive page table would need 2^52 entries (the number of possible virtual pages). At 8 bytes per entry, that's 36 petabytes per process—clearly impossible. Understanding how page table size is calculated, and the techniques used to reduce it, is essential for both interviews and real system design.
Page table size calculations test your understanding of:
By the end of this page, you will be able to calculate page table sizes for any configuration: single-level, two-level, three-level, and inverted page tables. You'll understand why page size choices matter, how multi-level paging reduces overhead, and the trade-offs involved in each approach.
Before calculating page table sizes, we must understand the structure of virtual and physical addresses.
Virtual Address Decomposition:
A virtual address is divided into two parts:
| Virtual Page Number (VPN) | Page Offset |
p bits d bits
Page Offset (d bits): Identifies the byte within a page
Virtual Page Number (p bits): Identifies which page
Physical Address Decomposition:
| Physical Frame Number (PFN) | Page Offset |
f bits d bits
| System | Virtual Address | Page Size | VPN Bits | Offset Bits | Virtual Pages |
|---|---|---|---|---|---|
| 32-bit, 4KB | 32 bits | 4 KB | 20 bits | 12 bits | 2^20 = 1M |
| 32-bit, 4MB | 32 bits | 4 MB | 10 bits | 22 bits | 2^10 = 1K |
| 64-bit, 4KB | 48 bits* | 4 KB | 36 bits | 12 bits | 2^36 = 64G |
| 64-bit, 2MB | 48 bits* | 2 MB | 27 bits | 21 bits | 2^27 = 128M |
| 64-bit, 1GB | 48 bits* | 1 GB | 18 bits | 30 bits | 2^18 = 256K |
Modern 64-bit systems (x86-64, ARM64) typically use only 48 bits of the 64-bit virtual address, as 2^48 bytes (256 TB) far exceeds current needs. The upper 16 bits must match bit 47 (canonical addresses). Future systems may expand to 57 bits (128 PB).
The Fundamental Page Table Calculation:
A single-level page table has one entry for every possible virtual page:
Page Table Size = Number of Virtual Pages × Size of Page Table Entry
= 2^(Address Bits - Offset Bits) × Entry Size
= 2^VPN × Entry Size
Page Table Entry (PTE) Contents:
Each PTE contains:
Typical PTE sizes:
Single-level (flat) page tables are the simplest to calculate but have significant overhead for large address spaces.
Example 1: 32-bit System with 4KB Pages
Given:
Calculation:
Offset bits = log₂(4KB) = log₂(4096) = 12 bits
VPN bits = 32 - 12 = 20 bits
Number of PTEs = 2^20 = 1,048,576 entries
Page Table Size = 2^20 × 4 bytes = 4,194,304 bytes = 4 MB
Insight: Each process needs a 4 MB page table, even if the process only uses a small amount of memory. For 100 processes, that's 400 MB just for page tables!
Example 2: Effect of Page Size
Let's see how page size affects page table size:
Given:
| Page Size | Offset Bits | VPN Bits | #PTEs | Page Table Size |
|---|---|---|---|---|
| 1 KB | 10 | 22 | 4M | 16 MB |
| 4 KB | 12 | 20 | 1M | 4 MB |
| 16 KB | 14 | 18 | 256K | 1 MB |
| 64 KB | 16 | 16 | 64K | 256 KB |
Key Insight: Larger page sizes dramatically reduce page table size, but waste memory due to internal fragmentation. A 64KB page holding 1KB of data wastes 63KB!
Larger pages → Smaller page tables, better TLB coverage, but more internal fragmentation. Smaller pages → Larger page tables, more TLB misses, but less wasted memory. Modern systems use multiple page sizes (4KB, 2MB, 1GB) to get benefits of both.
Example 3: 64-bit System Reality Check
Given:
Naïve Calculation:
Offset bits = 12
VPN bits = 64 - 12 = 52 bits
Number of PTEs = 2^52 = 4,503,599,627,370,496 entries
Page Table Size = 2^52 × 8 bytes = 2^55 bytes = 36 PB (petabytes)
This is clearly impossible! This is why:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152
def single_level_page_table_size( virtual_address_bits: int, page_size_bytes: int, pte_size_bytes: int) -> dict: """ Calculate single-level page table size. Returns dictionary with all relevant metrics. """ import math # Calculate offset bits from page size offset_bits = int(math.log2(page_size_bytes)) # VPN uses remaining bits vpn_bits = virtual_address_bits - offset_bits # Number of page table entries num_ptes = 2 ** vpn_bits # Total page table size pt_size_bytes = num_ptes * pte_size_bytes # Express in human-readable units def format_size(size_bytes): for unit in ['B', 'KB', 'MB', 'GB', 'TB', 'PB']: if size_bytes < 1024: return f"{size_bytes:.2f} {unit}" size_bytes /= 1024 return f"{size_bytes:.2f} EB" return { 'virtual_address_bits': virtual_address_bits, 'page_size': format_size(page_size_bytes), 'offset_bits': offset_bits, 'vpn_bits': vpn_bits, 'num_ptes': num_ptes, 'pte_size': pte_size_bytes, 'page_table_size': format_size(pt_size_bytes), 'page_table_size_bytes': pt_size_bytes } # Example usageresult = single_level_page_table_size( virtual_address_bits=32, page_size_bytes=4096, # 4KB pte_size_bytes=4) for key, value in result.items(): print(f"{key}: {value}")Two-level (hierarchical) paging solves the space problem by only allocating page table entries for virtual addresses actually used.
Two-Level Address Decomposition:
| Outer Index | Inner Index | Page Offset |
p1 bits p2 bits d bits
Design Principle: Each inner page table should fit exactly in one page. This determines how to split the VPN bits.
Calculation Strategy:
Example: 32-bit, 4KB Pages, 4-byte PTEs
Step 1: Basic Parameters
Page size = 4 KB = 4096 bytes
Offset bits = log₂(4096) = 12 bits
PTE size = 4 bytes
Step 2: PTEs per Page
PTEs per page = Page Size / PTE Size
= 4096 / 4 = 1024 PTEs per inner table
Step 3: Index Bit Allocation
Inner index bits = log₂(1024) = 10 bits
Total VPN bits = 32 - 12 = 20 bits
Outer index bits = 20 - 10 = 10 bits
Step 4: Structure Sizes
Outer table entries = 2^10 = 1024
Outer table size = 1024 × 4 bytes = 4 KB (fits in one page!)
Each inner table size = 1024 × 4 bytes = 4 KB
Maximum inner tables = 1024
Maximum total size = 4 KB + (1024 × 4 KB) = 4 MB + 4 KB
| Component | Entries | Entry Size | Total Size |
|---|---|---|---|
| Outer Table (Page Directory) | 2^10 = 1024 | 4 bytes | 4 KB |
| Each Inner Table | 2^10 = 1024 | 4 bytes | 4 KB |
| Maximum Inner Tables | 1024 | ||
| Maximum Total | 4,100 KB |
The Space Savings:
The key insight is that unused inner tables don't exist. If a process only uses the first 8 MB of its address space:
Pages needed = 8 MB / 4 KB = 2048 pages = 2 inner tables
Actual size = 4 KB (outer) + 2 × 4 KB (inner) = 12 KB
Compare to single-level: 4 MB regardless of actual usage!
Minimum Page Table Size:
Minimum = Outer table + 1 inner table = 4 KB + 4 KB = 8 KB
This handles up to 4 MB of mapped virtual address space (1024 pages × 4 KB/page).
Two-level paging saves space only when the virtual address space is sparsely used. If a process maps its entire 4 GB address space, two-level paging actually uses MORE memory (outer table overhead). However, typical processes use a tiny fraction of their address space.
Example: Calculating Space for Specific Memory Usage
Problem: A 32-bit system uses 4KB pages and 4-byte PTEs with two-level paging. How much page table space is needed if a process uses:
Solution:
Code segment (2 MB at 0x00400000):
Data segment (1 MB at 0x10000000):
Stack (64 KB at 0xFFFx0000):
Total page table size:
= Outer table + 3 inner tables
= 4 KB + 3 × 4 KB
= 16 KB
For 64-bit systems, even two-level paging isn't enough. Modern systems use three or four levels of paging.
x86-64 Four-Level Paging (48-bit addresses):
| PML4 | PDPT | PD | PT | Offset |
9 bits 9 bits 9 bits 9 bits 12 bits
Why 9 bits per level?
With 8-byte PTEs and 4KB pages:
Entries per table = 4096 bytes / 8 bytes = 512 = 2^9
So each level uses 9 bits of the virtual address.
Example: x86-64 Page Table Size Calculation
Given:
Maximum Structure Sizes:
| Level | Name | Entries | Size | Max Count |
|---|---|---|---|---|
| 1 | PML4 | 512 | 4 KB | 1 |
| 2 | PDPT | 512 | 4 KB | 512 |
| 3 | PD | 512 | 4 KB | 262,144 |
| 4 | PT | 512 | 4 KB | 134,217,728 |
Maximum Total:
= 4KB + (512 × 4KB) + (262,144 × 4KB) + (134,217,728 × 4KB)
= 4KB + 2MB + 1GB + 512GB
≈ 513 GB
But a minimal process uses:
= 1 PML4 + 1 PDPT + 1 PD + 1 PT
= 4 × 4 KB = 16 KB
This maps up to 512 pages = 2 MB of virtual address space.
| Memory Mapped | PT Tables | PD Tables | PDPT | PML4 | Total Size |
|---|---|---|---|---|---|
| < 2 MB | 1 | 1 | 1 | 1 | 16 KB |
| < 1 GB | 512 | 1 | 1 | 1 | 2 MB + 12 KB |
| < 512 GB | 262,144 | 512 | 1 | 1 | 1 GB + 2 MB + 8 KB |
| Full 256 TB | ~134M | 262,144 | 512 | 1 | ~513 GB |
x86-64 supports 2MB huge pages (uses PD entries directly, skips PT level) and 1GB gigantic pages (uses PDPT entries directly, skips PD and PT). This dramatically reduces page table overhead for large memory mappings.
Huge Page Calculation Example:
Problem: A database server maps 32 GB of shared memory. Calculate page table overhead using (a) 4KB pages, (b) 2MB pages, (c) 1GB pages.
(a) 4KB Pages:
Number of pages = 32 GB / 4 KB = 8,388,608 pages
PT tables needed = 8,388,608 / 512 = 16,384
PD tables needed = 16,384 / 512 = 32
PDPT entries used = 32 (fits in 1 table)
Total = 4 KB (PML4) + 4 KB (PDPT) + 32 × 4 KB (PD) + 16,384 × 4 KB (PT)
= 4 KB + 4 KB + 128 KB + 64 MB
= 64.13 MB
(b) 2MB Huge Pages:
Number of huge pages = 32 GB / 2 MB = 16,384 huge pages
PD entries needed = 16,384 (no PT level!)
PD tables needed = 16,384 / 512 = 32
Total = 4 KB (PML4) + 4 KB (PDPT) + 32 × 4 KB (PD)
= 4 KB + 4 KB + 128 KB
= 136 KB
(c) 1GB Gigantic Pages:
Number of gigantic pages = 32 (no PD or PT levels!)
Total = 4 KB (PML4) + 4 KB (PDPT)
= 8 KB
Summary: Using 1GB pages reduces overhead from 64 MB to 8 KB—a 8000× improvement!
Inverted page tables take a fundamentally different approach: instead of one entry per virtual page, there's one entry per physical frame.
Inverted Page Table Structure:
Table Size = Number of Physical Frames × Entry Size
= (Physical Memory Size / Page Size) × Entry Size
Each entry contains:
Advantage: Table size depends on physical memory, not virtual address space. A 64-bit system with 16 GB RAM needs only:
Frames = 16 GB / 4 KB = 4,194,304 frames
Entry size ≈ 16 bytes (PID + VPN + control + chain)
Table size = 4,194,304 × 16 = 64 MB
This is shared across ALL processes, unlike hierarchical tables.
Example: Inverted vs Hierarchical Comparison
System Configuration:
Hierarchical (per-process, assuming 100 MB average usage):
Per process (100 MB mapped):
Pages = 100 MB / 4 KB = 25,600 pages
PT tables = 25,600 / 512 = 50
PD tables = 1, PDPT = 1, PML4 = 1
Per-process size = (50 + 1 + 1 + 1) × 4 KB = 212 KB
Total for 100 processes = 100 × 212 KB = 21.2 MB
Inverted (system-wide):
Frames = 64 GB / 4 KB = 16,777,216 frames
Entry size = 16 bytes
Total = 16,777,216 × 16 = 256 MB
In this case, hierarchical paging wins! But if processes map larger address spaces or we have more physical memory, inverted could be better.
The downside of inverted page tables is lookup time. A hierarchical table allows O(1) lookup (follow pointers). Inverted tables require searching the table, typically using hashing. Hash collisions can extend lookup time. This makes TLB efficiency even more critical.
| Aspect | Hierarchical | Inverted |
|---|---|---|
| Table size scales with | Virtual address space × processes | Physical memory only |
| Per-process overhead | Yes (each process has tables) | No (shared system-wide) |
| Lookup complexity | O(levels) = O(4) | O(1) average, O(n) worst case |
| Sharing pages | Easy (point to same frame) | Complex (multiple entries) |
| Used in | x86, x86-64, ARM | IBM PowerPC, HP PA-RISC |
Understanding PTE format is essential for accurate size calculations. The PTE must be large enough to hold the physical frame number plus all control bits.
Minimum PTE Size Calculation:
Physical Frame Number bits = log₂(Physical Memory / Page Size)
Control bits = Valid + Dirty + Accessed + Protection + ...
Minimum PTE size = ceil((PFN bits + Control bits) / 8) bytes
Example: Determining PTE Size
Given:
Calculation:
PFN bits = log₂(2^36 / 2^12) = log₂(2^24) = 24 bits
Total bits needed = 24 + 12 = 36 bits = 4.5 bytes
Actual PTE size = 8 bytes (rounded to power of 2)
The extra bits (28 bits) can be used for additional flags like:
x86-64 PTE Format (64 bits):
63 62 52 51 12 11 9 8 7 6 5 4 3 2 1 0
+---+------+---------------+-----+-+-+-+-+-+-+-+-+-+
|XD | Avl | PFN | Avl |G|S|D|A|C|W|U|W|P|
+---+------+---------------+-----+-+-+-+-+-+-+-+-+-+
PFN Field: 40 bits (bits 12-51) → supports 52-bit physical addresses → 4 PB of physical memory maximum.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849
def calculate_pte_size( physical_memory_bytes: int, page_size_bytes: int, control_bits: int, round_to_power_of_2: bool = True) -> dict: """ Calculate minimum and practical PTE size. """ import math # Calculate physical frame number bits num_frames = physical_memory_bytes // page_size_bytes pfn_bits = int(math.ceil(math.log2(num_frames))) # Total bits needed total_bits = pfn_bits + control_bits min_bytes = math.ceil(total_bits / 8) # Practical size (power of 2) if round_to_power_of_2: practical_bytes = 1 while practical_bytes < min_bytes: practical_bytes *= 2 else: practical_bytes = min_bytes # Available extra bits extra_bits = (practical_bytes * 8) - total_bits return { 'physical_frames': num_frames, 'pfn_bits': pfn_bits, 'control_bits': control_bits, 'total_bits_needed': total_bits, 'minimum_bytes': min_bytes, 'practical_bytes': practical_bytes, 'extra_bits_available': extra_bits } # Example: 64 GB RAM, 4 KB pages, 12 control bitsresult = calculate_pte_size( physical_memory_bytes=64 * 1024**3, page_size_bytes=4096, control_bits=12) for k, v in result.items(): print(f"{k}: {v}")Page table size has a direct impact on address translation cost. Each level of paging requires a memory access to read the page table entry.
Single-Level Translation:
Total memory accesses = 1 (page table) + 1 (data) = 2
Four-Level Translation (x86-64):
Total memory accesses = 4 (page tables) + 1 (data) = 5
Without TLB caching, a 4-level page table makes every memory access 5× slower!
Memory Access Time Calculation:
If base memory access time = 100 ns:
EAT = 0.95 × (TLB_time + 100ns) + 0.05 × (TLB_time + 500ns)
= 0.95 × 110 + 0.05 × 510 (assuming 10ns TLB)
= 104.5 + 25.5
= 130 ns
Modern CPUs cache not just the TLB entry but also intermediate page table entries (PDEs, PDPEs, PMl4Es). When translating addresses in the same region, higher-level entries are often already cached, reducing the effective walk cost.
Walk Cost Analysis Example:
Given:
Calculation:
TLB Hit (98%):
Time = TLB lookup + memory access = 5 + 100 = 105 ns
TLB Miss with cached upper levels (1%):
Levels to walk = 2 (PT and data only, PD/PDPT/PML4 cached)
Time = 5 + (2 × 100) + 100 = 305 ns
TLB Miss without cached upper levels (1%):
Levels to walk = 4 (full walk)
Time = 5 + (4 × 100) + 100 = 505 ns
Effective Access Time:
EAT = 0.98 × 105 + 0.01 × 305 + 0.01 × 505
= 102.9 + 3.05 + 5.05
= 111 ns
This is only 11% overhead compared to the theoretical 400% overhead of a full walk every time!
Test your understanding with these comprehensive practice problems.
A system has a 24-bit virtual address space and 16-bit physical address space. Pages are 512 bytes. Each PTE contains the frame number plus 4 control bits. (a) How many virtual pages exist? (b) What is the minimum PTE size? (c) What is the page table size?
(a) Virtual Pages:
Page size = 512 bytes = 2^9 bytes
Offset bits = 9
VPN bits = 24 - 9 = 15 bits
Virtual pages = 2^15 = 32,768 pages
(b) Minimum PTE Size:
Physical frames = 2^16 / 2^9 = 2^7 = 128 frames
PFN bits = 7 bits
Control bits = 4 bits
Total = 7 + 4 = 11 bits → 2 bytes minimum
(c) Page Table Size:
Page Table = 32,768 entries × 2 bytes = 65,536 bytes = 64 KB
Design a two-level page table for a system with 32-bit addresses and 8KB pages. PTEs are 4 bytes. Each inner page table should fit in exactly one page. How should the virtual address be divided?
Step 1: Calculate offset bits:
Page size = 8 KB = 8192 bytes = 2^13 bytes
Offset bits = 13
Step 2: PTEs per inner table:
PTEs per page = 8192 / 4 = 2048 = 2^11
Inner index bits = 11
Step 3: Outer index:
VPN bits = 32 - 13 = 19 bits
Outer index bits = 19 - 11 = 8 bits
Address Division:
| Outer (8) | Inner (11) | Offset (13) |
8 bits 11 bits 13 bits
Verification:
A 64-bit system (48-bit virtual, 40-bit physical) uses 8-byte PTEs with 4-level paging. Compare the page table hierarchy for (a) 4KB pages and (b) 16KB pages. How many entries per table? How many levels could potentially be eliminated?
(a) 4KB Pages:
Offset bits = 12
VPN bits = 48 - 12 = 36 bits
Entries per table = 4096 / 8 = 512 = 2^9
Levels needed = 36 / 9 = 4 levels exactly
(b) 16KB Pages:
Offset bits = 14
VPN bits = 48 - 14 = 34 bits
Entries per table = 16384 / 8 = 2048 = 2^11
Levels needed = ceil(34 / 11) = ceil(3.09) = 4 levels
Bit distribution: 12 + 11 + 11 + 14 = 48
(or 1 + 11 + 11 + 11 + 14 with tiny top level)
Key insight: 16KB pages reduce table depth slightly but the 4-level structure remains. However, each table covers more address space, reducing total table count for sparse mappings.
Essential Formulas:
Offset bits = log₂(Page Size)
VPN bits = Virtual Address Bits - Offset bits
Single-level table size = 2^VPN × PTE size
PTEs per page = Page Size / PTE Size
PFN bits = log₂(Physical Memory / Page Size)
What's Next:
We've covered page table structure calculations. Next, we'll examine TLB hit ratio calculations, which determine the effective performance of virtual memory systems.
You now have comprehensive knowledge of page table size calculations for single-level, multi-level, and inverted page tables. You understand how page size, address space size, and PTE format affect page table overhead.