If lazy loading is the art of strategic procrastination, prepaging is the art of strategic anticipation. Rather than waiting for a page fault to occur and then reacting, prepaging attempts to predict which pages will be needed and load them in advance, eliminating faults entirely.
The appeal is obvious: every page fault avoided is a potential disk I/O prevented, latency reduced, and user experience improved. If we can accurately predict future page accesses and load those pages before they're needed, we transform the unpredictable, latency-laden experience of demand paging into smooth, uninterrupted execution.
But prepaging is a gamble. Every page loaded speculatively that turns out to be unnecessary represents wasted memory, wasted I/O bandwidth, and potentially wasted time that could have been spent loading pages that are needed. The challenge is prediction accuracy: how well can we anticipate future memory accesses?
By the end of this page, you will understand prepaging strategies in depth: the different forms of prepaging (initial prepaging, working set prepaging, read-ahead), the prediction mechanisms they employ, their costs and benefits, and how to evaluate when prepaging is worthwhile versus when pure demand paging is preferable.
Prepaging (also called prefetching or anticipatory paging) is the strategy of loading pages into memory before they are explicitly referenced by the CPU. The goal is to have pages already present when the CPU needs them, converting potential page faults into cache hits.
The Core Trade-off:
Page fault cost saved (~100 μs - 10 ms) vs. wasted resources for unused pages (memory, I/O bandwidth, CPU)
The Math:
Let's formalize when prepaging is beneficial. Let p be the probability that a preloaded page is actually used, s the cost of the page fault saved when the prediction is right, and w the cost of the resources wasted when it is wrong:
Expected benefit = p × s - (1-p) × w
Prepaging is worthwhile when:
p × s > (1-p) × w
Rearranging:
p > w / (s + w)
For example, if s = 100 (page fault cost in arbitrary units) and w = 10 (wasted resources):
p > 10 / (100 + 10) = 9.1%
If we're right more than ~9% of the time, prepaging is beneficial. This might seem like a low bar, but prediction accuracy varies dramatically across scenarios.
| Fault Cost | Waste Cost | Min Accuracy | Scenario |
|---|---|---|---|
| 10 ms (HDD) | 100 μs | 1% | Almost always beneficial |
| 100 μs (SSD) | 100 μs | 50% | Need good predictions |
| 10 μs (RAM/NVMe) | 100 μs | 91% | Rarely worthwhile |
| 10 ms (HDD) | 500 μs | 5% | Aggressive prefetch OK |
| 100 μs (SSD) | 500 μs | 83% | Conservative prefetch only |
On slow storage (HDD), page faults are so expensive that even poor prediction accuracy justifies prepaging. On fast storage (NVMe), faults are cheap enough that inaccurate prepaging wastes more resources than it saves. Modern systems adapt their prepaging aggressiveness based on detected storage characteristics.
Prepaging comes in several forms, each with different prediction strategies and use cases.
1. Initial Prepaging (Startup Prepaging)
Load pages at process startup before execution begins:
This combats the "cold start" problem of pure demand paging.
2. Working Set Prepaging (Resume Prepaging)
When a swapped-out process is swapped back in, preload all (or most) of its previous working set rather than letting it fault each page back in:
Process suspended → working set tracked
Process resumed → preload all tracked pages
Result: process resumes quickly without fault storm
3. Spatial Read-Ahead
When loading page N, also load pages N+1, N+2, ..., N+K.
4. Clustered/Fault-Around
While servicing a fault for page N, map additional pages near N that are already in the page cache:
Fault on page N:
1. Load page N from disk (normal fault handling)
2. Check if pages N-2, N-1, N+1, N+2 are in page cache
3. If so, map them into the process too (no I/O needed)
4. Result: fewer future minor faults
5. Application-Directed Prefetch
Applications explicitly request page loading via system calls:
madvise(addr, len, MADV_WILLNEED); // Load these pages soon
posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED); // Preload file region
This is the most accurate form—the application knows its own access patterns.
6. Pattern-Based Prefetch
The OS tracks access patterns (strides, correlations between faults) and predicts future accesses. The table below compares all six strategies:
| Strategy | When Active | Prediction Basis | Accuracy |
|---|---|---|---|
| Initial prepaging | Process start | Static analysis + history | Medium |
| Working set prepaging | Process resume | Previous run data | High |
| Read-ahead | Any file fault | Spatial locality | High for sequential |
| Fault-around | Any fault | Pages already cached | Very high (no risk) |
| Application hints | On request | Application knowledge | Very high |
| Pattern-based | Detected patterns | Machine learning | Varies |
Fault-around stands out because it has virtually no cost. If pages are already in the page cache, mapping them into the process requires only page table updates—no I/O. The only cost is the memory for the page table entries and minimal CPU time. This makes it almost universally beneficial.
Read-ahead is the most common form of prepaging in modern systems. It exploits the observation that file and code access is often sequential—if you accessed page N, you'll probably access page N+1 soon.
How Read-Ahead Works:
Without read-ahead (pure demand):
┌───┬───┬───┬───┬───┬───┬───┬───┐
│ A │ . │ . │ . │ . │ . │ . │ . │ Page A accessed (fault)
├───┼───┼───┼───┼───┼───┼───┼───┤
│ A │ B │ . │ . │ . │ . │ . │ . │ Page B accessed (fault)
├───┼───┼───┼───┼───┼───┼───┼───┤
│ A │ B │ C │ . │ . │ . │ . │ . │ Page C accessed (fault)
└───┴───┴───┴───┴───┴───┴───┴───┘ ... 8 faults for 8 pages
With read-ahead (window = 4):
┌───┬───┬───┬───┬───┬───┬───┬───┐
│ A │ B │ C │ D │ . │ . │ . │ . │ Page A accessed (fault + read-ahead B,C,D)
├───┼───┼───┼───┼───┼───┼───┼───┤
│ A │ B │ C │ D │ . │ . │ . │ . │ Page B accessed (hit!)
├───┼───┼───┼───┼───┼───┼───┼───┤
│ A │ B │ C │ D │ . │ . │ . │ . │ Page C accessed (hit!)
├───┼───┼───┼───┼───┼───┼───┼───┤
│ A │ B │ C │ D │ E │ F │ G │ H │ Page D accessed (hit! + async read-ahead E,F,G,H)
└───┴───┴───┴───┴───┴───┴───┴───┘ ... 2 faults for 8 pages
Adaptive Read-Ahead:
Modern systems don't use fixed read-ahead windows. Instead, they adapt based on observed behavior:
Linux Read-Ahead Algorithm (simplified):
Initial fault:
    read_ahead_size = 4 pages (16 KB)

Subsequent sequential faults:
    if (pages hit in read-ahead window) {
        // Sequential access confirmed
        read_ahead_size *= 2;
        if (read_ahead_size > max_read_ahead) {
            read_ahead_size = max_read_ahead; // Typically 128 KB - 2 MB
        }
    }

Random faults (not in window):
    read_ahead_size = 0; // Disable read-ahead for this file

Read-ahead trigger point:
    When access reaches 50% into the current window,
    initiate async read-ahead for the next window
The read-ahead window grows as sequential access is confirmed, up to a maximum (often 2 MB). If random access is detected, read-ahead is disabled or reduced.
/* Controlling read-ahead behavior */
#include <fcntl.h>
#include <sys/mman.h>

void process_file_sequential(int fd, size_t file_size) {
    /* Hint: we'll read sequentially */
    posix_fadvise(fd, 0, file_size, POSIX_FADV_SEQUENTIAL);
    /* Kernel uses aggressive read-ahead */

    char *map = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);

    /* Equivalent for mmap: */
    madvise(map, file_size, MADV_SEQUENTIAL);

    /* Process entire file... */
    for (size_t i = 0; i < file_size; i += PAGE_SIZE) {
        process_page(map + i);
    }

    munmap(map, file_size);
}

void process_file_random(int fd, size_t file_size) {
    /* Hint: random access - disable read-ahead */
    posix_fadvise(fd, 0, file_size, POSIX_FADV_RANDOM);
    /* Kernel disables read-ahead */

    char *map = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);
    madvise(map, file_size, MADV_RANDOM);

    /* Random access pattern */
    for (int i = 0; i < 10000; i++) {
        size_t offset = random() % file_size;
        process_byte(map[offset]);
    }

    munmap(map, file_size);
}

Modern read-ahead is asynchronous. When the trigger point is reached, the kernel schedules I/O for future pages but returns immediately. The I/O happens in the background while the CPU processes currently-available pages. By the time the CPU needs the next pages, they're already loaded.
When a process that was previously swapped out is resumed, pure demand paging would require it to fault in every page again. Working set prepaging avoids this by remembering which pages the process was using and preloading them.
The Problem:
Process A running with 1000 pages in memory
→ System under memory pressure
→ Process A swapped out (pages written to swap)
→ Process A remains dormant for 1 hour
→ Process A wakes up
With pure demand paging:
• First access → fault → load page 1
• Second access → fault → load page 2
• ... (1000 faults to restore working set)
• Process is sluggish for seconds
With working set prepaging:
• System remembers: A used pages [1, 5, 12, 17, 23, ...]
• On resume: preload all tracked pages asynchronously
• Process resumes with most pages already present
• Only genuinely new accesses cause faults
| Strategy | Mechanism | Accuracy | Overhead |
|---|---|---|---|
| Full working set | Track all present pages | High | Large memory for tracking |
| Accessed pages only | Track pages with A bit set | High | Periodic scanning needed |
| Hot pages only | Track frequently-accessed pages | Medium | Lower overhead |
| File-backed only | Only prepage file-backed pages | Medium | Ignores anonymous pages |
| Application-provided | App saves own state | Very High | Requires app modification |
Implementation Approaches:
1. Windows Approach (Working Set Manager):
Windows maintains detailed working set information for each process. When a process is trimmed (pages evicted) and later resumes, Windows can use the saved information to prioritize reloading.
2. Linux Approach (Implicit):
Linux doesn't explicitly track working sets for prepaging, but achieves similar effects through the page cache (file pages survive process suspension), clustered swap read-ahead (swapping in nearby pages together), and fault-around.
3. Application-Level (Android)
Android apps can save and restore state including memory usage hints, allowing faster app resume after being killed for memory.
The terms 'cold start' and 'warm start' distinguish between scenarios. A cold start has no prior information—the process must demand-page everything. A warm start can use cached/preload data to accelerate startup. Working set prepaging converts cold starts into warm starts.
Initial prepaging attempts to load pages at process startup before execution begins, eliminating the initial fault storm that pure demand paging creates.
Sources of Startup Information:
Static Analysis: Examine the executable's entry point, imports, and linked libraries to estimate which code pages run first.
Historical Data: Record which pages faulted during previous runs of the same program and replay that list at the next launch.
Application Manifests: Let applications declare up front which files and regions they need at startup.
Windows Prefetcher:
Windows Prefetcher Architecture:
1. Learning Phase (first few runs):
┌────────────────────────────────────────┐
│ Application launches │
│ Prefetcher monitors page faults │
│ Records: file, offset, timestamp │
│ Stores in .pf file after 10 seconds │
└────────────────────────────────────────┘
2. Prefetch Phase (subsequent runs):
┌────────────────────────────────────────┐
│ Application launch requested │
│ Prefetcher loads .pf file │
│ Initiates async reads for all pages │
│ Application starts with pages loading │
│ Many faults become cache hits │
└────────────────────────────────────────┘
Typical improvement:
Without Prefetcher: 5 second startup (thousands of faults)
With Prefetcher: 1-2 second startup (pages already loading)
SuperFetch (Windows Vista+):
SuperFetch extends Prefetcher with predictive loading: it monitors application usage over time and preloads frequently used applications into otherwise-idle memory before they are launched.
Case study (illustrative figures): Chrome browser, ~200 MB of code and data, ~50 MB typical startup working set; 4.5 second cold start vs. 1.2 second warm prefetched start.
Cold start (no prefetcher):
• 12,500 pages × 0.3 ms average fault time = 3.75 seconds of I/O wait
• Plus CPU initialization: ~0.75 seconds
• Total: ~4.5 seconds to usable state
With prefetcher:
• Prefetch file loaded, async I/O started immediately
• Application starts while I/O in progress
• Most faults find pages already in cache
• I/O overlapped with CPU initialization
• Total: ~1.2 seconds to usable state
By one back-of-the-envelope estimate, the 3.3 second improvement represents on the order of $100M in user time saved per year across Windows users, assuming one browser launch per user per day.
Prefetcher data doesn't help on the very first run or after boot. The first run of each application is always cold. Some systems (Windows ReadyBoost) use flash drives to cache prefetch data persistently across reboots, reducing cold boot impact.
Prepaging is not free. Every page loaded speculatively consumes resources that might be wasted if the page is never used. Understanding these costs is essential for deciding when prepaging is appropriate.
1. Memory Consumption:
Each preloaded page occupies a physical frame that could otherwise hold demand-loaded pages or page cache.
2. I/O Bandwidth:
Disk/SSD bandwidth used for speculation isn't available for other I/O, including demand faults that block execution.
3. CPU Overhead:
Prepaging requires computation for prediction, bookkeeping, and page table updates.
| Cost Type | Per-Page Impact | System Impact | When Problematic |
|---|---|---|---|
| Memory | 4 KB per page | MB to GB wasted | Memory pressure situations |
| Read I/O | ~100 μs SSD | Bandwidth competition | I/O-bound workloads |
| CPU | ~1 μs per page | Minimal | Rarely problematic |
| Eviction pressure | Forces other evictions | Cascading evictions | Limited RAM |
| Power | Disk/SSD active | Battery drain | Mobile/laptop |
| Write (swap) | Write unused pages | SSD wear | Overcommitted memory |
Wasted Prepaging Example:
Aggressive startup prefetch:
Preloaded: 5000 pages (20 MB)
Actually used: 2000 pages (8 MB)
Wasted: 3000 pages (12 MB)
Memory waste: 12 MB
I/O waste: 3000 pages × 100 μs/page = 300 ms of I/O time
Eviction impact: Might have evicted 3000 pages from other processes
When Waste Becomes Critical:
Low-Memory Systems: Preloading 10 MB of potentially unused pages on a 512 MB embedded system is unacceptable.
Overcommitted Servers: VMs sharing memory via overcommit can't afford speculative loading.
Battery-Powered Devices: Unnecessary I/O directly translates to reduced battery life.
Short-Lived Processes: Process that runs for 50 ms doesn't benefit from 100 ms of prefetch I/O.
Every resource spent on incorrect speculation is a resource unavailable for actual work. Aggressive prepaging with low accuracy can make overall system performance WORSE by consuming memory and I/O bandwidth needed for demand faults of actually-needed pages.
The effectiveness of prepaging depends entirely on prediction accuracy. Let's examine what makes predictions accurate and when they fail.
Factors Enabling Accurate Prediction:
Sequential Access Patterns: If access to page N predicts access to page N+1, read-ahead is highly accurate.
Repetitive Workloads: Applications used the same way repeatedly have predictable startup patterns.
Stable Working Sets: Programs that access the same pages throughout their lifetime are predictable.
Application Cooperation: Applications that explicitly request prefetch (madvise) provide perfect predictions.
Factors Causing Prediction Failure:
Random Access Patterns: Hash-table lookups and pointer-chasing workloads give past accesses little predictive power.
Phase Changes: A program that shifts from one phase to another (say, parsing to computation) abruptly invalidates learned patterns.
Input-Dependent Behavior: Different inputs exercise different code paths and data, so history from one run can mislead the next.
First Runs: With no history available, there is nothing to predict from.
Measuring Prediction Accuracy:
Prepaging Metrics:
Pages Preloaded and Used
Precision = ────────────────────────────────
Total Pages Preloaded
Pages Preloaded and Used
Recall = ────────────────────────────────────
Total Pages Actually Needed
Example:
Preloaded 100 pages
60 pages were actually used
Total pages the app needed: 80
Precision = 60/100 = 60% (40 wasted preloads)
Recall = 60/80 = 75% (25% still had to fault)
Optimal Strategy Depends on Costs:
The best prepaging systems adapt. They observe outcomes (was the prepaged page used?), adjust aggressiveness (reduce window if precision is low), and learn patterns (increase window for sequential access). Static one-size-fits-all prepaging fails for diverse workloads.
Having examined both pure demand paging and prepaging in depth, we can now synthesize guidelines for when each approach is preferable.
Decision Framework:
| Factor | Favor Prepaging | Favor Demand |
|---|---|---|
| Access pattern | Sequential, predictable | Random, unpredictable |
| Storage speed | Slow (HDD) | Fast (NVMe, RAM) |
| Memory availability | Abundant | Constrained |
| Process lifetime | Long-running | Short-lived |
| Startup time importance | Critical (user-facing) | Unimportant (background) |
| Working set stability | Stable | Rapidly changing |
| Historical data | Available, accurate | Unavailable, outdated |
| System load | Lightly loaded | Heavily loaded |
The Hybrid Reality:
Real systems don't choose one or the other—they combine strategies:
Typical Linux Behavior:
┌─────────────────────────────────────────────────────────────┐
│ Component │ Strategy │
├─────────────────────┼───────────────────────────────────────┤
│ Code pages │ Demand + moderate read-ahead │
│ Sequential file │ Aggressive read-ahead │
│ Random file │ Pure demand (MADV_RANDOM) │
│ Anonymous pages │ Pure demand │
│ Swap-in │ Demand + swap cache │
│ mmap WILLNEED │ Immediate prepaging │
└─────────────────────┴───────────────────────────────────────┘
The system adapts per-file, per-region, and even dynamically based on observed behavior.
For most applications, the OS default behavior is good enough. Kernel developers have tuned these heuristics over decades. Intervene with madvise/fadvise only when profiling shows demand paging or prepaging behavior is suboptimal for your specific workload.
As an application developer, you can influence prepaging behavior to improve your application's performance. Here are practical guidelines.
1. Use Advisory Hints:
/* When you know access will be sequential */
madvise(addr, size, MADV_SEQUENTIAL);
posix_fadvise(fd, 0, size, POSIX_FADV_SEQUENTIAL);
/* When you know access will be random */
madvise(addr, size, MADV_RANDOM);
posix_fadvise(fd, 0, size, POSIX_FADV_RANDOM);
/* When you're about to need data */
madvise(addr, size, MADV_WILLNEED);
posix_fadvise(fd, offset, size, POSIX_FADV_WILLNEED);
/* When you're done with data */
madvise(addr, size, MADV_DONTNEED);
posix_fadvise(fd, offset, size, POSIX_FADV_DONTNEED);
2. Organize Data for Locality:
Design data structures so related data is on the same or adjacent pages:
3. Profile Page Faults:
Measure where faults actually occur with perf stat -e page-faults or similar before and after adding hints.

/* Patterns for effective prepaging */

/* Pattern 1: Pre-touch array before time-critical section */
void pretouch_buffer(char *buf, size_t len) {
    volatile char dummy;
    for (size_t i = 0; i < len; i += 4096) {
        dummy = buf[i]; /* Force page fault now, not later */
    }
}

/* Pattern 2: Prefetch in background while doing other work */
void process_with_prefetch(char *data, size_t len) {
    size_t chunk = 1 << 20; /* 1 MB chunks */
    for (size_t offset = 0; offset < len; offset += chunk) {
        /* Prefetch next chunk while processing current */
        if (offset + chunk < len) {
            madvise(data + offset + chunk, chunk, MADV_WILLNEED);
        }
        /* Process current chunk */
        process_chunk(data + offset, chunk);
    }
}

/* Pattern 3: Lock critical real-time sections */
void audio_callback(void *buffer, size_t frames) {
    /* Called from real-time thread - CANNOT page fault */
    /* Buffer and all code must be pre-locked */
    process_audio(buffer, frames);
}

void init_audio_system(void) {
    /* Lock our code and data before going real-time */
    mlockall(MCL_CURRENT | MCL_FUTURE);
}

The most effective prepaging optimizations come from profiling real usage. Record page faults during typical scenarios, identify hot spots, and target hints at those regions. Generic optimization without profiling often wastes effort or makes things worse.
We've explored prepaging—the proactive counterpart to demand paging's reactive approach: predict future accesses, preload the pages they will touch, and convert faults into hits whenever the predictions hold.
Module Complete:
You've now completed the Demand Paging module, from the lazy loading philosophy of pure demand paging through the practical prepaging optimizations covered here.
Together, these concepts form the foundation of how modern operating systems efficiently manage virtual memory, creating the illusion of abundant RAM while using physical resources judiciously.
You now have a comprehensive understanding of demand paging—from the lazy loading philosophy through practical prepaging optimizations. This knowledge is essential for systems programming, performance tuning, and understanding how modern operating systems efficiently virtualize memory.