One of the most consequential decisions in paging system design is choosing the page size. This single parameter ripples through every aspect of the memory system: from how much space is wasted in partially-filled pages to how large page tables grow, from how efficiently disk I/O operates to how effectively the TLB caches translations.
Modern systems typically use page sizes ranging from 4KB to 1GB, with 4KB being the most common default. But why 4KB? Why not 1KB or 16KB? Why do some systems support multiple page sizes simultaneously? Understanding the tradeoffs behind page size selection is essential for anyone designing or tuning memory systems.
By the end of this page, you will understand the fundamental tradeoffs in page size selection, analyze how page size affects internal fragmentation, page table size, and I/O efficiency, and appreciate why modern systems support multiple page sizes for different use cases.
Page size selection is not a simple optimization problem with a clear optimal answer. Instead, it involves balancing multiple competing concerns between small pages (e.g., 512 bytes, 1KB, 2KB) and large pages (e.g., 2MB, 1GB):
| Factor | Small Pages | Large Pages |
|---|---|---|
| Internal Fragmentation | Lower (avg waste ≈ half of a small page) | Higher (avg waste ≈ half of a large page) |
| Page Table Size | Larger (more entries) | Smaller (fewer entries) |
| TLB Coverage | Less (each entry covers less) | More (each entry covers more) |
| TLB Miss Rate | Higher | Lower |
| Disk I/O Efficiency | Lower (small transfers) | Higher (large transfers) |
| Memory Granularity | Finer | Coarser |
| Working Set Accuracy | More precise | Less precise |
There is no 'perfect' page size. The optimal choice depends on workload characteristics, hardware capabilities, and design priorities. This is why modern architectures support multiple page sizes, allowing the OS to choose the best match for each situation.
Understanding page size requires grasping the mathematical relationships between page size, address space size, and page table structures.
Address Decomposition:
For a system with:
- An m-bit logical address space (2^m addressable bytes)
- A page size of 2^n bytes

The logical address divides as: the low n bits form the page offset, and the high m − n bits form the page number.
Example Calculations:
| Address Size | Page Size | Offset Bits | Page Number Bits | Max Pages |
|---|---|---|---|---|
| 32-bit | 4KB (2^12) | 12 | 20 | ~1 million |
| 32-bit | 64KB (2^16) | 16 | 16 | ~65,000 |
| 64-bit | 4KB (2^12) | 12 | 52 | ~4 quadrillion |
| 64-bit | 2MB (2^21) | 21 | 43 | ~8 trillion |
Page Table Size Calculation:
The page table must have one entry for each possible page. If each page table entry (PTE) is e bytes:
Page Table Size = (Address Space Size / Page Size) × PTE Size
= (2^m / 2^n) × e
= 2^(m-n) × e
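The formula above is easy to check mechanically. This short sketch (helper name is ours, not from the text) computes the linear page table size for the configurations discussed below:

```python
def linear_page_table_size(address_bits, page_size, pte_size):
    """Bytes needed for a flat (linear) page table: 2^(m-n) * e."""
    offset_bits = page_size.bit_length() - 1       # n, assumes power-of-two page size
    page_number_bits = address_bits - offset_bits  # m - n
    num_entries = 1 << page_number_bits            # 2^(m-n)
    return num_entries * pte_size

# 32-bit address space, 4KB pages, 4-byte PTEs
print(linear_page_table_size(32, 4096, 4))  # 4194304 bytes = 4 MB
# 64-bit address space, 4KB pages, 8-byte PTEs
print(linear_page_table_size(64, 4096, 8))  # 36028797018963968 bytes = 32 PB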
Concrete Example:
For a 32-bit address space with 4KB pages and 4-byte PTEs: 2^(32−12) = 2^20 entries × 4 bytes = 4 MB per process.
For a 64-bit address space with 4KB pages and 8-byte PTEs: 2^(64−12) = 2^52 entries × 8 bytes = 32 petabytes per process.
This last calculation reveals why 64-bit systems cannot use simple linear page tables and must use hierarchical or other compact representations.
A naive linear page table for a 64-bit address space would require petabytes of memory—far more than exists in any system. This absurdity drove the development of multi-level page tables, inverted tables, and other techniques we'll explore in later modules.
Internal fragmentation occurs because processes rarely require exact multiples of the page size. The last page of every memory allocation is typically only partially filled, wasting the remaining space within that page.
Statistical Analysis:
For a randomly-sized allocation, the final page is on average half full, so the expected waste is half a page (Page Size / 2) per allocation.
For a system with many processes, the total wasted space averages:
Total Internal Fragmentation ≈ (Number of Processes) × (Page Size / 2)
Process size: 10,300 bytes. Compare 1KB, 4KB, and 64KB pages.

With 1KB (1024-byte) pages: 11 pages needed (11,264 bytes allocated), 964 bytes wasted (9.4%)

With 4KB pages: 3 pages needed (12,288 bytes allocated), 1,988 bytes wasted (19.3%)

With 64KB pages: 1 page needed (65,536 bytes allocated), 55,236 bytes wasted (536% of the process size!)

For small allocations, large page sizes waste enormous amounts of memory!
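The arithmetic behind this example generalizes to any process size and page size; a minimal sketch:

```python
import math

def internal_fragmentation(process_bytes, page_size):
    """Pages needed, and bytes wasted in the last partially filled page."""
    pages = math.ceil(process_bytes / page_size)
    wasted = pages * page_size - process_bytes
    return pages, wasted

for page_size in (1024, 4096, 65536):
    pages, wasted = internal_fragmentation(10_300, page_size)
    print(f"{page_size:>6}-byte pages: {pages:>2} pages, {wasted:>6} bytes wasted "
          f"({100 * wasted / 10_300:.1f}%)")
```

Running it reproduces the three cases above: 11 pages/964 bytes, 3 pages/1,988 bytes, and 1 page/55,236 bytes.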
Implications for System Design:
Small Programs Suffer Most: A program using 100KB with 2MB pages wastes 1.9MB per segment. Small utilities, startup processes, and shell scripts become memory hogs.
Sparsely-Used Address Spaces: Languages/runtimes that allocate large virtual address spaces (for flexibility) but use memory sparsely face severe fragmentation with large pages.
The Mixed Workload Problem: Systems with both large (databases, VMs) and small (utilities, scripts) processes face a dilemma: small pages hurt big processes, large pages hurt small processes.
Solution: Multiple Page Sizes: Modern systems support multiple page sizes, using large pages for memory-intensive applications and small pages for general use.
On average, each memory region (code segment, data segment, stack, heap) wastes half a page. A process with 4 segments wastes approximately 2 full pages on average. With 4KB pages, that's 8KB per process—acceptable. With 2MB huge pages, that's 4MB per process—sometimes acceptable, sometimes not.
Page tables consume memory too—and this overhead scales inversely with page size. Smaller pages mean more pages, which means more page table entries.
Page Table Memory as a Fraction of Address Space:
For a linear page table with entry size E:
Overhead Ratio = Page Table Size / Address Space Size
= (Address Space / Page Size) × E / Address Space
= E / Page Size
Example: with 4KB pages and 8-byte PTEs, the overhead ratio is 8 / 4,096 ≈ 0.2% of the mapped address space.
| Page Size | PTE Size | 32-bit Page Table | 64-bit Page Table* |
|---|---|---|---|
| 512 B | 4 bytes | 32 MB | Impractical |
| 1 KB | 4 bytes | 16 MB | Impractical |
| 4 KB | 4 bytes | 4 MB | Multilevel: ~40 KB (idle process) |
| 4 KB | 8 bytes | 8 MB | Multilevel: ~80 KB (idle process) |
| 64 KB | 4 bytes | 256 KB | Multilevel: varies |
| 2 MB | 8 bytes | 16 KB | Tiny |
| 1 GB | 8 bytes | 32 bytes | Negligible |
*64-bit systems use hierarchical page tables, so 'page table size' depends on address space utilization, not total possible addresses.
The Per-Process Cost:
Every process needs its own page table. On a 32-bit system with 1000 processes, 4KB pages, and 4-byte PTEs, linear page tables alone would consume 1000 × 4 MB = 4 GB of physical memory—as much as the entire address space of one process.
Why Hierarchical Page Tables Help:
With linear page tables, you need entries for the entire possible address space. With hierarchical tables, you only need entries for regions actually in use. A process using 100MB of a 4GB address space might have one top-level page directory (4 KB) plus second-level tables covering only the mapped 100MB—roughly 100 KB in total on classic 32-bit x86—rather than a full 4 MB linear table.
This is why real systems use multi-level page tables—especially critical for 64-bit systems where linear tables are completely impractical.
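As a rough sketch of the savings, assuming the classic two-level 32-bit x86 scheme (4KB pages, 4-byte PTEs, a 1024-entry directory, each second-level table mapping 4 MB):

```python
import math

PAGE, PTE, TABLE_ENTRIES = 4096, 4, 1024
SECOND_LEVEL_SPAN = TABLE_ENTRIES * PAGE  # each second-level table maps 4 MB

def two_level_table_bytes(used_bytes):
    """Page table memory for a process that maps `used_bytes` contiguously."""
    directory = TABLE_ENTRIES * PTE       # always one 4 KB page directory
    tables = math.ceil(used_bytes / SECOND_LEVEL_SPAN)
    return directory + tables * TABLE_ENTRIES * PTE

linear = (1 << 32) // PAGE * PTE          # 4 MB, regardless of how much is used
hierarchical = two_level_table_bytes(100 * 1024 * 1024)
print(linear, hierarchical)               # 4194304 vs 106496 (~104 KB)
```

For a 100MB process, the hierarchical table is roughly 40× smaller than the linear one.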
Page tables are themselves stored in memory—creating a meta-level concern where memory management structures consume the very resource they manage. Efficient page table design is crucial: at any moment, multiple page tables (one per process) are competing for the same physical frames as application data.
The Translation Lookaside Buffer (TLB) is a small, fast cache within the CPU that stores recently-used page table entries. Because accessing the page table in main memory is slow (100+ CPU cycles), the TLB is essential for good performance. The TLB's effectiveness is critically influenced by page size.
TLB Coverage:
The amount of memory addressable through the TLB without causing a miss:
TLB Coverage = Number of TLB entries × Page Size
Example: a TLB with 128 entries and 4KB pages covers 128 × 4 KB = 512 KB of memory—tiny compared to modern working sets.
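A tiny sketch of the coverage calculation, using the 128-entry TLB assumed in the table below:

```python
def tlb_coverage(entries, page_size):
    """Memory reachable without a TLB miss: entries x page size."""
    return entries * page_size

KB, MB, GB = 1024, 1024**2, 1024**3
for name, size in (("4 KB", 4 * KB), ("2 MB", 2 * MB), ("1 GB", GB)):
    print(f"{name} pages, 128 entries -> {tlb_coverage(128, size) // MB} MB coverage")
```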
The Working Set Problem:
If a program's working set (actively used memory) exceeds TLB coverage, it will experience frequent TLB misses. Each miss requires a page table walk in main memory—potentially several dependent memory accesses for a multi-level table—costing tens to hundreds of cycles.
For memory-intensive workloads like databases or scientific computing, the working set can be gigabytes. With 4KB pages, a 128-entry TLB covers only 512 KB, so such workloads miss in the TLB almost constantly.
With 2MB huge pages, the same 128 entries cover 256 MB—a 512× improvement in coverage with no change to the TLB hardware.
| Page Size | TLB Coverage | Suitable For |
|---|---|---|
| 4 KB (base) | 512 KB | Small processes, general use |
| 2 MB (large) | 256 MB | Databases, VMs, large applications |
| 1 GB (huge) | 128 GB | In-memory databases, HPC |
Modern systems provide 'huge pages' (2MB on x86, 1GB for truly large allocations) specifically to improve TLB coverage. Applications like Oracle Database, SAP HANA, and large-scale virtualization routinely use huge pages to achieve acceptable performance with multi-gigabyte working sets.
Page size significantly affects disk I/O performance when pages must be loaded from or written to secondary storage (swap space, memory-mapped files).
Disk I/O Characteristics:
Traditional hard drives (HDDs) have:
- Seek time of roughly 10ms to move the head to the right track
- Rotational latency of roughly 4ms waiting for the sector to spin under the head
- Sequential transfer rates of roughly 150 MB/s

SSDs have:
- No mechanical positioning, so access latencies in the tens of microseconds
- Much higher transfer rates
But even on SSDs, each I/O operation has fixed overhead (command processing, queue handling). Larger transfers amortize this overhead better.
Efficiency Analysis:
For HDDs, the time to read a page:
Time = Seek Time + Rotational Latency + Transfer Time
≈ 10ms + 4ms + (Page Size / Transfer Rate)
≈ 14ms + (Page Size / 150 MB/s)
For 4KB page: 14ms + 0.026ms ≈ 14.03ms
For 64KB page: 14ms + 0.43ms ≈ 14.43ms
For 1MB page: 14ms + 6.7ms ≈ 20.7ms
The key insight: seek time and rotational latency dominate. Reading 16× more data (64KB vs 4KB) takes essentially the same time! This is why larger pages are more efficient for disk I/O—you get more data per 'penalty' of positioning.
Effective Bandwidth:
Effective Bandwidth = Page Size / Total Time
For 4KB page: 4KB / 14ms = 285 KB/s
For 64KB page: 64KB / 14.4ms = 4.4 MB/s (15× better!)
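The timing model above is simple enough to sketch directly (constants are the HDD figures from the text):

```python
SEEK_MS, ROTATION_MS = 10.0, 4.0   # HDD positioning costs
TRANSFER_BPS = 150 * 1024 * 1024   # ~150 MB/s sustained transfer rate

def read_time_ms(page_bytes):
    """Total time to read one page: positioning cost plus transfer time."""
    return SEEK_MS + ROTATION_MS + page_bytes / TRANSFER_BPS * 1000

def effective_bandwidth(page_bytes):
    """Bytes per second actually achieved, positioning cost included."""
    return page_bytes / (read_time_ms(page_bytes) / 1000)

for size in (4 * 1024, 64 * 1024, 1024 * 1024):
    print(f"{size // 1024:>5} KB page: {read_time_ms(size):6.2f} ms, "
          f"{effective_bandwidth(size) / 1024 / 1024:.2f} MB/s effective")
```

Because the 14ms positioning cost dominates, the 64KB read takes barely longer than the 4KB read yet delivers roughly 15× the effective bandwidth.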
When paging was invented in the 1960s-70s, disk speeds were much slower and page I/O was the primary performance concern. This heavily influenced page size choices. Modern SSDs have reduced but not eliminated the I/O efficiency argument for larger pages.
Different architectures and operating systems have made different page size choices based on their design priorities and historical constraints.
Standard Page Sizes by Architecture:
| Architecture | Base Page | Large Pages | Huge Pages | Notes |
|---|---|---|---|---|
| x86 (32-bit) | 4 KB | 4 MB | — | 4MB pages via PSE extension |
| x86-64 | 4 KB | 2 MB | 1 GB | Most common modern architecture |
| ARM (32-bit) | 4 KB | 64 KB, 1 MB | 16 MB | Flexible TLB design |
| ARMv8 (64-bit) | 4 KB | 16 KB, 64 KB | 2 MB, 1 GB | Configurable page sizes |
| POWER (IBM) | 4 KB | 64 KB | 16 MB, 16 GB | Enterprise server focus |
| SPARC | 8 KB | 64 KB, 512 KB | 4 MB | Larger base than x86 |
| RISC-V | 4 KB | 2 MB | 1 GB | Follows x86-64 pattern |
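On POSIX systems you can query the base page size your system actually uses at runtime; a minimal sketch:

```python
import os

# Ask the OS for its base page size via POSIX sysconf.
# Typically 4096 on x86-64 Linux, 16384 on Apple Silicon macOS.
page_size = os.sysconf("SC_PAGE_SIZE")
offset_bits = page_size.bit_length() - 1  # n, the number of offset bits
print(page_size, offset_bits)
```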
Why 4KB Became the Standard:
Historical Hardware Constraints: Early page table implementations had limited space. 4KB balanced internal fragmentation against page table size for 32-bit systems.
Disk Sector Alignment: Traditional hard drives used 512-byte sectors. 4KB = 8 sectors provided good alignment and minimal internal fragmentation for file-backed pages.
Memory Technology: DRAM chips were organized in ways that made 4KB a natural access unit.
Software Compatibility: Once 4KB became widespread, enormous amounts of software implicitly assumed it, creating inertia.
Power-of-Two Convenience: 4KB = 2^12 bytes makes address decomposition trivial in hardware (just look at bit positions).
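The power-of-two convenience is worth making concrete: splitting an address into page number and offset reduces to a shift and a mask, with no division. A sketch for 4KB pages (the address value is arbitrary):

```python
PAGE_SIZE = 4096              # 2^12
OFFSET_BITS = 12
OFFSET_MASK = PAGE_SIZE - 1   # 0xFFF, the low 12 bits

addr = 0x00403A2F
page_number = addr >> OFFSET_BITS  # high bits select the page
offset = addr & OFFSET_MASK        # low 12 bits locate the byte within it
print(hex(page_number), hex(offset))  # 0x403 0xa2f
```

Hardware does exactly this: it routes bit ranges, which is why non-power-of-two page sizes are never used.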
Modern Trends:
Apple Silicon (M1, M2, etc.) uses 16KB pages instead of 4KB, improving TLB coverage by 4×, and Android devices are increasingly adding 16KB page support. Expect 16KB to become more common as memory sizes grow and TLB pressure becomes more critical.
Given the tradeoffs we've discussed, it's clear that no single page size is optimal for all situations. Modern systems therefore support multiple page sizes, allowing the operating system and applications to choose the best fit for each particular use case.
How Multiple Page Sizes Work:
MMU Support: The hardware page table format includes a 'page size' indicator for each entry, or uses separate page table structures for different sizes.
TLB Partitioning: Some TLBs have separate sections for different page sizes; others handle all sizes in a unified structure with varying coverage.
OS Allocation: The kernel maintains separate free lists for different page sizes and allocates appropriately.
Application Requests: Applications can request large pages via special APIs (e.g., mmap with MAP_HUGETLB on Linux, VirtualAlloc with MEM_LARGE_PAGES on Windows).
Transparent Huge Page (THP) Challenges:
Linux's Transparent Huge Pages feature attempts to automatically use huge pages when beneficial. However, THP has known issues: latency spikes while the kernel compacts memory to assemble contiguous 2MB regions, memory bloat when sparsely-used regions are promoted to huge pages, and unpredictable performance as pages are split and collapsed in the background.
Many high-performance applications disable THP and manually allocate huge pages at startup to avoid these issues.
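On Linux, the current THP mode is exposed under `/sys/kernel/mm/transparent_hugepage/enabled`. A small sketch for checking it (returns None where the file doesn't exist, e.g. on non-Linux systems):

```python
def thp_setting(path="/sys/kernel/mm/transparent_hugepage/enabled"):
    """Return the kernel's THP mode string, or None if unavailable."""
    try:
        with open(path) as f:
            # The active mode is bracketed, e.g. "always [madvise] never".
            return f.read().strip()
    except OSError:
        return None

print(thp_setting())
```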
Transparent Huge Pages (THP) can cause significant latency variations in production systems. Major databases (Redis, MongoDB, Oracle) recommend disabling THP and using explicitly-allocated huge pages instead. Always benchmark your workload with different configurations.
Page size is one of the most consequential parameters in memory system design. Let's consolidate the key insights:
- Internal fragmentation averages half a page per memory region, favoring smaller pages
- Linear page table size is 2^(m−n) × PTE size, favoring larger pages (and motivating hierarchical tables)
- TLB coverage equals entries × page size, strongly favoring larger pages for large working sets
- Disk I/O is dominated by positioning cost, so larger transfers deliver far better effective bandwidth
- No single size wins, which is why modern systems support multiple page sizes simultaneously
What's Next:
Now that we understand pages, frames, and page size trade-offs, we'll explore one of paging's most important benefits: non-contiguous allocation. We'll see how paging fundamentally changes memory allocation by allowing a process's pages to be placed in any available frames, anywhere in physical memory.
You now understand why page size isn't arbitrary—it's a carefully considered design parameter with system-wide implications. When you encounter a system using 16KB pages or huge page allocations, you'll understand the reasoning. This knowledge is essential for performance tuning and system architecture decisions.