If lazy loading is the art of strategic procrastination, prepaging is the art of strategic anticipation. Rather than waiting for a page fault to occur and then reacting, prepaging attempts to predict which pages will be needed and load them in advance, eliminating faults entirely.
The appeal is obvious: every page fault avoided is a potential disk I/O prevented, latency reduced, and user experience improved. If we can accurately predict future page accesses and load those pages before they're needed, we transform the unpredictable, latency-laden experience of demand paging into smooth, uninterrupted execution.
But prepaging is a gamble. Every page loaded speculatively that turns out to be unnecessary represents wasted memory, wasted I/O bandwidth, and potentially wasted time that could have been spent loading pages that are needed. The challenge is prediction accuracy: how well can we anticipate future memory accesses?
By the end of this page, you will understand prepaging strategies in depth: the different forms of prepaging (initial prepaging, working set prepaging, read-ahead), the prediction mechanisms they employ, their costs and benefits, and how to evaluate when prepaging is worthwhile versus when pure demand paging is preferable.
Prepaging (also called prefetching or anticipatory paging) is the strategy of loading pages into memory before they are explicitly referenced by the CPU. The goal is to have pages already present when the CPU needs them, converting potential page faults into cache hits.
The Core Trade-off:
Page fault cost saved (~100 μs - 10 ms) vs. wasted resources for unused pages (memory, I/O bandwidth, CPU)
The Math:
Let's formalize when prepaging is beneficial. Let p be the probability that a preloaded page is actually used, s the cost of the page fault saved when the prediction is right, and w the cost of the resources wasted when it is wrong:
Expected benefit = p × s - (1-p) × w
Prepaging is worthwhile when:
p × s > (1-p) × w
Rearranging:
p > w / (s + w)
For example, if s = 100 (page fault cost in arbitrary units) and w = 10 (wasted resources):
p > 10 / (100 + 10) = 9.1%
If we're right more than ~9% of the time, prepaging is beneficial. This might seem like a low bar, but prediction accuracy varies dramatically across scenarios.
| Fault Cost | Waste Cost | Min Accuracy | Scenario |
|---|---|---|---|
| 10 ms (HDD) | 100 μs | 1% | Almost always beneficial |
| 100 μs (SSD) | 100 μs | 50% | Need good predictions |
| 10 μs (RAM/NVMe) | 100 μs | 91% | Rarely worthwhile |
| 10 ms (HDD) | 500 μs | 5% | Aggressive prefetch OK |
| 100 μs (SSD) | 500 μs | 83% | Conservative prefetch only |
On slow storage (HDD), page faults are so expensive that even poor prediction accuracy justifies prepaging. On fast storage (NVMe), faults are cheap enough that inaccurate prepaging wastes more resources than it saves. Modern systems adapt their prepaging aggressiveness based on detected storage characteristics.
Prepaging comes in several forms, each with different prediction strategies and use cases.
1. Initial Prepaging (Startup Prepaging)
Load pages at process startup before execution begins:
This combats the "cold start" problem of pure demand paging.
2. Working Set Prepaging (Resume Prepaging)
When a swapped-out process is swapped back in, preload all (or most) of its previous working set rather than letting it fault each page back in:
Process suspended → working set tracked
Process resumed → preload all tracked pages
Result: process resumes quickly without fault storm
3. Spatial Read-Ahead
When loading page N, also load pages N+1, N+2, ..., N+K.
4. Clustered/Fault-Around
While servicing a fault for page N, map additional pages near N that are already in the page cache:
Fault on page N:
1. Load page N from disk (normal fault handling)
2. Check if pages N-2, N-1, N+1, N+2 are in page cache
3. If so, map them into the process too (no I/O needed)
4. Result: fewer future minor faults
5. Application-Directed Prefetch
Applications explicitly request page loading via system calls:
madvise(addr, len, MADV_WILLNEED); // Load these pages soon
posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED); // Preload file region
This is the most accurate form—the application knows its own access patterns.
6. Pattern-Based Prefetch
The OS tracks access patterns (strides, correlations between faults) and predicts future accesses. The table below compares all six strategies:
| Strategy | When Active | Prediction Basis | Accuracy |
|---|---|---|---|
| Initial prepaging | Process start | Static analysis + history | Medium |
| Working set prepaging | Process resume | Previous run data | High |
| Read-ahead | Any file fault | Spatial locality | High for sequential |
| Fault-around | Any fault | Pages already cached | Very high (no risk) |
| Application hints | On request | Application knowledge | Very high |
| Pattern-based | Detected patterns | Machine learning | Varies |
Fault-around stands out because it has virtually no cost. If pages are already in the page cache, mapping them into the process requires only page table updates—no I/O. The only cost is the memory for the page table entries and minimal CPU time. This makes it almost universally beneficial.
Read-ahead is the most common form of prepaging in modern systems. It exploits the observation that file and code access is often sequential—if you accessed page N, you'll probably access page N+1 soon.
How Read-Ahead Works:
Without read-ahead (pure demand):
┌───┬───┬───┬───┬───┬───┬───┬───┐
│ A │ . │ . │ . │ . │ . │ . │ . │ Page A accessed (fault)
├───┼───┼───┼───┼───┼───┼───┼───┤
│ A │ B │ . │ . │ . │ . │ . │ . │ Page B accessed (fault)
├───┼───┼───┼───┼───┼───┼───┼───┤
│ A │ B │ C │ . │ . │ . │ . │ . │ Page C accessed (fault)
└───┴───┴───┴───┴───┴───┴───┴───┘ ... 8 faults for 8 pages
With read-ahead (window = 4):
┌───┬───┬───┬───┬───┬───┬───┬───┐
│ A │ B │ C │ D │ . │ . │ . │ . │ Page A accessed (fault + read-ahead B,C,D)
├───┼───┼───┼───┼───┼───┼───┼───┤
│ A │ B │ C │ D │ . │ . │ . │ . │ Page B accessed (hit!)
├───┼───┼───┼───┼───┼───┼───┼───┤
│ A │ B │ C │ D │ . │ . │ . │ . │ Page C accessed (hit!)
├───┼───┼───┼───┼───┼───┼───┼───┤
│ A │ B │ C │ D │ E │ F │ G │ H │ Page D accessed (hit! + async read-ahead E,F,G,H)
└───┴───┴───┴───┴───┴───┴───┴───┘ ... 2 faults for 8 pages
Adaptive Read-Ahead:
Modern systems don't use fixed read-ahead windows. Instead, they adapt based on observed behavior:
Linux Read-Ahead Algorithm (simplified):
Initial fault:
    read_ahead_size = 4 pages (16 KB)

Subsequent sequential faults:
    if (pages hit in read-ahead window) {
        // Sequential access confirmed
        read_ahead_size *= 2;
        if (read_ahead_size > max_read_ahead) {
            read_ahead_size = max_read_ahead; // Typically 128 KB - 2 MB
        }
    }

Random faults (not in window):
    read_ahead_size = 0; // Disable read-ahead for this file

Read-ahead trigger point:
    When access reaches 50% into the current window,
    initiate async read-ahead for the next window
The read-ahead window grows as sequential access is confirmed, up to a maximum (often 2 MB). If random access is detected, read-ahead is disabled or reduced.
/* Controlling read-ahead behavior */
#include <fcntl.h>
#include <sys/mman.h>

void process_file_sequential(int fd, size_t file_size) {
    /* Hint: we'll read sequentially */
    posix_fadvise(fd, 0, file_size, POSIX_FADV_SEQUENTIAL);
    /* Kernel uses aggressive read-ahead */

    char *map = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);

    /* Equivalent for mmap: */
    madvise(map, file_size, MADV_SEQUENTIAL);

    /* Process entire file... */
    for (size_t i = 0; i < file_size; i += PAGE_SIZE) {
        process_page(map + i);
    }

    munmap(map, file_size);
}

void process_file_random(int fd, size_t file_size) {
    /* Hint: random access - disable read-ahead */
    posix_fadvise(fd, 0, file_size, POSIX_FADV_RANDOM);
    /* Kernel disables read-ahead */

    char *map = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);
    madvise(map, file_size, MADV_RANDOM);

    /* Random access pattern */
    for (int i = 0; i < 10000; i++) {
        size_t offset = random() % file_size;
        process_byte(map[offset]);
    }

    munmap(map, file_size);
}

Modern read-ahead is asynchronous. When the trigger point is reached, the kernel schedules I/O for future pages but returns immediately. The I/O happens in the background while the CPU processes currently-available pages. By the time the CPU needs the next pages, they're already loaded.
When a process that was previously swapped out is resumed, pure demand paging would require it to fault in every page again. Working set prepaging avoids this by remembering which pages the process was using and preloading them.
The Problem:
Process A running with 1000 pages in memory
→ System under memory pressure
→ Process A swapped out (pages written to swap)
→ Process A remains dormant for 1 hour
→ Process A wakes up
With pure demand paging:
• First access → fault → load page 1
• Second access → fault → load page 2
• ... (1000 faults to restore working set)
• Process is sluggish for seconds
With working set prepaging:
• System remembers: A used pages [1, 5, 12, 17, 23, ...]
• On resume: preload all tracked pages asynchronously
• Process resumes with most pages already present
• Only genuinely new accesses cause faults
| Strategy | Mechanism | Accuracy | Overhead |
|---|---|---|---|
| Full working set | Track all present pages | High | Large memory for tracking |
| Accessed pages only | Track pages with A bit set | High | Periodic scanning needed |
| Hot pages only | Track frequently-accessed pages | Medium | Lower overhead |
| File-backed only | Only prepage file-backed pages | Medium | Ignores anonymous pages |
| Application-provided | App saves own state | Very High | Requires app modification |
Implementation Approaches:
1. Windows Approach (Working Set Manager):
Windows maintains detailed working set information for each process. When a process is trimmed (pages evicted) and later resumes, Windows can use the saved information to prioritize reloading.
2. Linux Approach (Implicit):
Linux doesn't explicitly track working sets for prepaging, but achieves similar effects through the page cache (file pages survive process suspension), clustered swap read-ahead (swapping in nearby pages together), and fault-around.
3. Application-Level (Android)
Android apps can save and restore state including memory usage hints, allowing faster app resume after being killed for memory.
The terms 'cold start' and 'warm start' distinguish between scenarios. A cold start has no prior information—the process must demand-page everything. A warm start can use cached/preload data to accelerate startup. Working set prepaging converts cold starts into warm starts.
Initial prepaging attempts to load pages at process startup before execution begins, eliminating the initial fault storm that pure demand paging creates.
Sources of Startup Information:
Static Analysis: Examine the executable's entry point, imports, and linked libraries to estimate which code pages run first.
Historical Data: Record which pages faulted during previous runs of the same program and replay that list at the next launch.
Application Manifests: Let applications declare up front which files and regions they need at startup.
Windows Prefetcher:
Windows Prefetcher Architecture:
1. Learning Phase (first few runs):
┌────────────────────────────────────────┐
│ Application launches │
│ Prefetcher monitors page faults │
│ Records: file, offset, timestamp │
│ Stores in .pf file after 10 seconds │
└────────────────────────────────────────┘
2. Prefetch Phase (subsequent runs):
┌────────────────────────────────────────┐
│ Application launch requested │
│ Prefetcher loads .pf file │
│ Initiates async reads for all pages │
│ Application starts with pages loading │
│ Many faults become cache hits │
└────────────────────────────────────────┘
Typical improvement:
Without Prefetcher: 5 second startup (thousands of faults)
With Prefetcher: 1-2 second startup (pages already loading)
SuperFetch (Windows Vista+):
SuperFetch extends Prefetcher with predictive loading: it monitors application usage over time and preloads frequently used applications into otherwise-idle memory before they are launched.
Case study (illustrative figures): Chrome browser, ~200 MB of code and data, ~50 MB typical startup working set; 4.5 second cold start vs. 1.2 second warm prefetched start.
Cold start (no prefetcher):
• 12,500 pages × 0.3 ms average fault time = 3.75 seconds of I/O wait
• Plus CPU initialization: ~0.75 seconds
• Total: ~4.5 seconds to usable state
With prefetcher:
• Prefetch file loaded, async I/O started immediately
• Application starts while I/O in progress
• Most faults find pages already in cache
• I/O overlapped with CPU initialization
• Total: ~1.2 seconds to usable state
By one back-of-the-envelope estimate, the 3.3 second improvement represents on the order of $100M in user time saved per year across Windows users, assuming one browser launch per user per day.
Prefetcher data doesn't help on the very first run or after boot. The first run of each application is always cold. Some systems (Windows ReadyBoost) use flash drives to cache prefetch data persistently across reboots, reducing cold boot impact.
Prepaging is not free. Every page loaded speculatively consumes resources that might be wasted if the page is never used. Understanding these costs is essential for deciding when prepaging is appropriate.
1. Memory Consumption:
Each preloaded page occupies a physical frame that could otherwise hold demand-loaded pages or page cache.
2. I/O Bandwidth:
Disk/SSD bandwidth used for speculation isn't available for other I/O, including demand faults that block execution.
3. CPU Overhead:
Prepaging requires computation for prediction, bookkeeping, and page table updates.
| Cost Type | Per-Page Impact | System Impact | When Problematic |
|---|---|---|---|
| Memory | 4 KB per page | MB to GB wasted | Memory pressure situations |
| Read I/O | ~100 μs SSD | Bandwidth competition | I/O-bound workloads |
| CPU | ~1 μs per page | Minimal | Rarely problematic |
| Eviction pressure | Forces other evictions | Cascading evictions | Limited RAM |
| Power | Disk/SSD active | Battery drain | Mobile/laptop |
| Write (swap) | Write unused pages | SSD wear | Overcommitted memory |
Wasted Prepaging Example:
Aggressive startup prefetch:
Preloaded: 5000 pages (20 MB)
Actually used: 2000 pages (8 MB)
Wasted: 3000 pages (12 MB)
Memory waste: 12 MB
I/O waste: 3000 pages × 100 μs/page = 300 ms of I/O time
Eviction impact: Might have evicted 3000 pages from other processes
When Waste Becomes Critical:
Low-Memory Systems: Preloading 10 MB of potentially unused pages on a 512 MB embedded system is unacceptable.
Overcommitted Servers: VMs sharing memory via overcommit can't afford speculative loading.
Battery-Powered Devices: Unnecessary I/O directly translates to reduced battery life.
Short-Lived Processes: Process that runs for 50 ms doesn't benefit from 100 ms of prefetch I/O.
Every resource spent on incorrect speculation is a resource unavailable for actual work. Aggressive prepaging with low accuracy can make overall system performance WORSE by consuming memory and I/O bandwidth needed for demand faults of actually-needed pages.
The effectiveness of prepaging depends entirely on prediction accuracy. Let's examine what makes predictions accurate and when they fail.
Factors Enabling Accurate Prediction:
Sequential Access Patterns: If access to page N predicts access to page N+1, read-ahead is highly accurate.
Repetitive Workloads: Applications used the same way repeatedly have predictable startup patterns.
Stable Working Sets: Programs that access the same pages throughout their lifetime are predictable.
Application Cooperation: Applications that explicitly request prefetch (madvise) provide perfect predictions.
Factors Causing Prediction Failure:
Random Access Patterns: Hash-table lookups and pointer-chasing workloads give past accesses little predictive power.
Phase Changes: A program that shifts from one phase to another (say, parsing to computation) abruptly invalidates learned patterns.
Input-Dependent Behavior: Different inputs exercise different code paths and data, so history from one run can mislead the next.
First Runs: With no history available, there is nothing to predict from.
Measuring Prediction Accuracy:
Prepaging Metrics:
Pages Preloaded and Used
Precision = ────────────────────────────────
Total Pages Preloaded
Pages Preloaded and Used
Recall = ────────────────────────────────────
Total Pages Actually Needed
Example:
Preloaded 100 pages
60 pages were actually used
Total pages the app needed: 80
Precision = 60/100 = 60% (40 wasted preloads)
Recall = 60/80 = 75% (25% still had to fault)
Optimal Strategy Depends on Costs:
The best prepaging systems adapt. They observe outcomes (was the prepaged page used?), adjust aggressiveness (reduce window if precision is low), and learn patterns (increase window for sequential access). Static one-size-fits-all prepaging fails for diverse workloads.
Having examined both pure demand paging and prepaging in depth, we can now synthesize guidelines for when each approach is preferable.
Decision Framework:
| Factor | Favor Prepaging | Favor Demand |
|---|---|---|
| Access pattern | Sequential, predictable | Random, unpredictable |
| Storage speed | Slow (HDD) | Fast (NVMe, RAM) |
| Memory availability | Abundant | Constrained |
| Process lifetime | Long-running | Short-lived |
| Startup time importance | Critical (user-facing) | Unimportant (background) |
| Working set stability | Stable | Rapidly changing |
| Historical data | Available, accurate | Unavailable, outdated |
| System load | Lightly loaded | Heavily loaded |
The Hybrid Reality:
Real systems don't choose one or the other—they combine strategies:
Typical Linux Behavior:
┌─────────────────────────────────────────────────────────────┐
│ Component │ Strategy │
├─────────────────────┼───────────────────────────────────────┤
│ Code pages │ Demand + moderate read-ahead │
│ Sequential file │ Aggressive read-ahead │
│ Random file │ Pure demand (MADV_RANDOM) │
│ Anonymous pages │ Pure demand │
│ Swap-in │ Demand + swap cache │
│ mmap WILLNEED │ Immediate prepaging │
└─────────────────────┴───────────────────────────────────────┘
The system adapts per-file, per-region, and even dynamically based on observed behavior.
For most applications, the OS default behavior is good enough. Kernel developers have tuned these heuristics over decades. Intervene with madvise/fadvise only when profiling shows demand paging or prepaging behavior is suboptimal for your specific workload.
As an application developer, you can influence prepaging behavior to improve your application's performance. Here are practical guidelines.
1. Use Advisory Hints:
/* When you know access will be sequential */
madvise(addr, size, MADV_SEQUENTIAL);
posix_fadvise(fd, 0, size, POSIX_FADV_SEQUENTIAL);
/* When you know access will be random */
madvise(addr, size, MADV_RANDOM);
posix_fadvise(fd, 0, size, POSIX_FADV_RANDOM);
/* When you're about to need data */
madvise(addr, size, MADV_WILLNEED);
posix_fadvise(fd, offset, size, POSIX_FADV_WILLNEED);
/* When you're done with data */
madvise(addr, size, MADV_DONTNEED);
posix_fadvise(fd, offset, size, POSIX_FADV_DONTNEED);
2. Organize Data for Locality:
Design data structures so related data is on the same or adjacent pages:
3. Profile Page Faults:
Measure where faults actually occur with perf stat -e page-faults or similar before and after adding hints.

/* Patterns for effective prepaging */

/* Pattern 1: Pre-touch array before time-critical section */
void pretouch_buffer(char *buf, size_t len) {
    volatile char dummy;
    for (size_t i = 0; i < len; i += 4096) {
        dummy = buf[i]; /* Force page fault now, not later */
    }
}

/* Pattern 2: Prefetch in background while doing other work */
void process_with_prefetch(char *data, size_t len) {
    size_t chunk = 1 << 20; /* 1 MB chunks */
    for (size_t offset = 0; offset < len; offset += chunk) {
        /* Prefetch next chunk while processing current */
        if (offset + chunk < len) {
            madvise(data + offset + chunk, chunk, MADV_WILLNEED);
        }
        /* Process current chunk */
        process_chunk(data + offset, chunk);
    }
}

/* Pattern 3: Lock critical real-time sections */
void audio_callback(void *buffer, size_t frames) {
    /* Called from real-time thread - CANNOT page fault */
    /* Buffer and all code must be pre-locked */
    process_audio(buffer, frames);
}

void init_audio_system(void) {
    /* Lock our code and data before going real-time */
    mlockall(MCL_CURRENT | MCL_FUTURE);
}

The most effective prepaging optimizations come from profiling real usage. Record page faults during typical scenarios, identify hot spots, and target hints at those regions. Generic optimization without profiling often wastes effort or makes things worse.
We've explored prepaging—the proactive counterpart to demand paging's reactive approach: predict future accesses, preload the pages they will touch, and convert faults into hits whenever the predictions hold.
Module Complete:
You've now completed the Demand Paging module, from the lazy loading philosophy of pure demand paging through the practical prepaging optimizations covered here.
Together, these concepts form the foundation of how modern operating systems efficiently manage virtual memory, creating the illusion of abundant RAM while using physical resources judiciously.
You now have a comprehensive understanding of demand paging—from the lazy loading philosophy through practical prepaging optimizations. This knowledge is essential for systems programming, performance tuning, and understanding how modern operating systems efficiently virtualize memory.