When you launch a modern web browser, it springs to life in under a second despite being hundreds of megabytes in size. When you open a 10 GB video file in an editor, it appears instantly rather than making you wait for the entire file to load. These experiences—so fundamental to modern computing that we take them for granted—are enabled by demand paging (and historically, demand segmentation).
Demand paging is the mechanism that makes virtual memory practical. Rather than loading an entire program and all its data into memory before execution can begin, the operating system loads pages on demand—only when they're actually accessed. This lazy loading strategy is the key that unlocks virtual memory's power to exceed physical memory limits while providing responsive program startup.
This page provides a comprehensive examination of demand paging and its segmentation counterpart, covering the mechanisms, variants, design decisions, and practical implications that every systems engineer should understand.
By the end of this page, you will deeply understand how demand paging works, the difference between pure demand paging and prepaging, how demand segmentation differs from demand paging, the role of valid-invalid bits, and the complete life cycle of a demand-paged memory access.
Demand loading is a resource management strategy based on a simple principle: don't load something until you actually need it. This principle applies broadly in computing (lazy evaluation, just-in-time compilation), but in memory management, it takes a specific form.
The alternative: Eager loading
Eager (or anticipatory) loading means loading all resources before they're used. For memory, this would mean:

- Reading the entire executable and every shared library from disk at launch
- Allocating physical frames for every page of the address space up front
- Completing all of this work before the first instruction executes
This approach is simple but wasteful—most programs never touch the majority of their address space.
| Aspect | Eager Loading | Demand Loading |
|---|---|---|
| Startup Time | Slow (load everything) | Fast (load almost nothing) |
| Memory Usage | High (everything resident) | Low (only working set resident) |
| First Access | Fast (already loaded) | Slow (triggers page fault) |
| Unused Code | Wastes memory | Never loaded |
| Implementation | Simple | Requires page fault handling |
| Predictability | Consistent (no faults) | Variable (faults on new access) |
The demand paging strategy:
Demand paging makes the opposite choice:

- A process starts with few or no pages in physical memory
- A page is loaded only when it is first accessed, with the page fault as the trigger
- Pages stay resident until memory pressure forces them out
This strategy trades consistent performance for better average-case performance and dramatically lower memory usage.
The demand loading principle appears throughout computing: lazy initialization (create objects when first used), JavaScript module loading (import modules on first reference), database query materialization (fetch rows on iteration). Memory demand paging is just one instance of this powerful pattern.
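As a tiny illustration of the same principle outside the kernel, here is lazy initialization in C. This is a generic sketch, not tied to any particular codebase; the `Table` type and `build_table()` function are invented for the example.

```c
#include <stdlib.h>

/* Hypothetical expensive resource, built only when first needed. */
typedef struct { int *data; } Table;

static Table *table;                     /* NULL until first use */

static Table *build_table(void) {        /* the "expensive load" */
    Table *t = malloc(sizeof *t);
    t->data = calloc(1024, sizeof *t->data);
    return t;
}

Table *get_table(void) {
    if (table == NULL)
        table = build_table();           /* pay the cost on first access only */
    return table;                        /* later calls are cheap */
}
```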
Let's trace through exactly what happens when a process accesses a page that isn't currently in physical memory. This is the core mechanism of demand paging.
Prerequisites for demand paging:

- An MMU that checks a present (valid) bit on every translation and raises a page fault when the bit is clear
- A page fault exception mechanism that saves enough state to restart the faulting instruction
- A backing store (executable file, swap space) from which page contents can be read
- Restartable instructions, so the faulting access can be transparently re-executed
```
Demand Paging Step-by-Step:
═══════════════════════════

STEP 1: Memory Reference
─────────────────────────
Process executes:  mov eax, [0x7fff0040]
        ↓
CPU needs to read from virtual address 0x7fff0040

STEP 2: TLB Lookup
──────────────────
CPU checks TLB for VPN = 0x7fff0
  → TLB MISS (not cached)

STEP 3: Page Table Walk
───────────────────────
MMU walks page table hierarchy:
  CR3 → PML4[...] → PDPT[...] → PD[...] → PT[...]
        ↓
  Page Table Entry for VPN 0x7fff0:
  ┌──────────────────────────────────────────┐
  │ Present = 0 │ Swap Location = Block 42   │
  └──────────────────────────────────────────┘
        ↓
  Present bit is 0!  ← This is a PAGE FAULT

STEP 4: Page Fault Exception
────────────────────────────
CPU pushes state to kernel stack:
  - Faulting address (CR2 = 0x7fff0040)
  - Error code (read access, not present)
  - Return address (instruction to restart)
        ↓
Jumps to page fault handler (interrupt vector 14 on x86)

STEP 5: Kernel Page Fault Handler
─────────────────────────────────
a) Read CR2 to get faulting address: 0x7fff0040

b) Look up address in process's memory map:
   → Found: valid area, file-backed/anonymous, permissions OK
   → If NOT found: send SIGSEGV (segmentation fault)

c) Find a free physical frame:
   → If none free, run page replacement algorithm
   → Evict a victim page (possibly to swap)

d) Determine page source:
   → If anonymous and never written: zero-fill frame
   → If file-backed: read from file at offset
   → If in swap: read from swap block 42

e) Read page content (blocking I/O):
   → Initiate disk read
   → Block current process
   → Schedule another process
   → When I/O completes, wake this process

STEP 6: Update Page Table
─────────────────────────
Set page table entry:
  ┌──────────────────────────────────────────┐
  │ Present = 1 │ Frame = 0x1234 │ Access... │
  └──────────────────────────────────────────┘
Update TLB (or let hardware refill on next access)

STEP 7: Return from Exception
─────────────────────────────
IRET instruction returns to saved address:
  → Re-executes: mov eax, [0x7fff0040]
  → This time: TLB/PT has valid mapping
  → Access completes normally!
```

Crucially, the process is unaware that a page fault occurred. From its perspective, the memory access just took a bit longer than usual. This transparency is essential—existing programs work without modification under demand paging.
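The handler logic in Step 5 maps naturally to code. Below is a minimal, hypothetical sketch in C-style pseudocode; the helper functions (`find_vma`, `alloc_frame`, `read_from_file`, and so on) are invented for illustration and do not correspond to any particular kernel's API.

```c
/* Hypothetical page fault handler sketch. All helper names are invented. */
void page_fault_handler(uintptr_t fault_addr, int error_code) {
    struct vm_area *vma = find_vma(current_process, fault_addr);
    if (vma == NULL || !permissions_ok(vma, error_code)) {
        send_signal(current_process, SIGSEGV);  /* genuine invalid access */
        return;
    }

    frame_t frame = alloc_frame();              /* may run page replacement */

    if (vma_is_anonymous(vma) && !page_ever_written(vma, fault_addr)) {
        zero_fill(frame);                       /* fresh stack/heap page */
    } else if (vma_is_file_backed(vma)) {
        read_from_file(frame, vma->file, file_offset(vma, fault_addr));
    } else {
        read_from_swap(frame, swap_slot(current_process, fault_addr));
    }                                           /* reads block the process */

    map_page(current_process, fault_addr, frame, vma->prot); /* Present = 1 */
    /* Returning from the exception re-executes the faulting instruction. */
}
```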
The valid-invalid bit (also called the present bit) is a single bit in each page table entry that indicates whether the page is currently in physical memory. It's the fundamental trigger for demand paging.
Terminology clarification:
Different systems use different terms:

- Valid-invalid bit (common in operating systems textbooks)
- Present bit (x86 architecture manuals)
- Resident bit (some older systems and documentation)
These all refer to the same concept: does the page table entry point to a physical frame that currently holds this page's data?
```
x86-64 Page Table Entry (PTE) Structure:
════════════════════════════════════════

Bit    Name           Meaning
────────────────────────────────────────────────────────────────
0      P (Present)    1 = in physical memory, 0 = not present
1      R/W            1 = writable, 0 = read-only
2      U/S            1 = user accessible, 0 = supervisor only
3      PWT            Page-Level Write-Through
4      PCD            Page-Level Cache Disable
5      A (Accessed)   Hardware sets to 1 when page is read/written
6      D (Dirty)      Hardware sets to 1 when page is written
7      PS             Page Size (for huge pages in PD entry)
8      G              Global (don't flush from TLB on CR3 change)
9-11   AVL            Available for OS use
12-51  PFN            Physical Frame Number (40 bits, supporting
                      52-bit physical addresses)
59-62  PK             Protection Key (if PKE enabled)
63     NX             No-Execute bit (if enabled)

When P = 0, other bits can be repurposed:
─────────────────────────────────────────
The OS can store auxiliary information:
  - Swap location (which swap slot)
  - File offset (for file-backed pages)
  - Copy-on-write indicator
  - Page state flags
```

What happens when P = 0:
When the CPU's MMU encounters a page table entry with P = 0 during address translation:

- Translation is aborted; no memory access takes place
- The CPU raises a page fault exception, recording the faulting address and an error code
- Control transfers to the OS's page fault handler, which decides what happens next
Why P = 0 is not always an error:
A page with P = 0 might be:

- Never loaded yet (pure demand paging, or a zero-fill region such as BSS or stack)
- Swapped out to the backing store after earlier use
- Part of a file mapping whose contents haven't been read in yet
- Genuinely invalid: outside every mapped region of the address space
The OS must examine its own data structures to determine which case applies.
A page marked 'invalid' (P = 0) is not necessarily an error. It means the page isn't in RAM right now. The OS must check if the address is within a valid memory region—if so, it's a demand paging opportunity; if not, it's a genuine memory access violation (segfault).
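As an illustration, here is how an OS might encode and test these bits in C. The bit positions follow the x86-64 format above; the swap-slot encoding for non-present entries is a hypothetical scheme, since each OS chooses its own.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pte_t;

#define PTE_PRESENT  (1ULL << 0)            /* P bit: page is in RAM */
#define PTE_WRITABLE (1ULL << 1)            /* R/W bit */
#define PTE_USER     (1ULL << 2)            /* U/S bit */
#define PTE_PFN_MASK 0x000FFFFFFFFFF000ULL /* bits 51:12 hold frame number */

static bool pte_present(pte_t pte) { return pte & PTE_PRESENT; }

static uint64_t pte_frame(pte_t pte) { return (pte & PTE_PFN_MASK) >> 12; }

/* Hypothetical: when P = 0, stash a swap slot number in the upper bits.
   Real kernels use their own, more elaborate encodings. */
static pte_t pte_make_swapped(uint64_t swap_slot) { return swap_slot << 1; }
static uint64_t pte_swap_slot(pte_t pte) { return pte >> 1; }
```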
Pure demand paging is the extreme form of lazy loading: no pages are loaded until they generate a fault. When a process starts under pure demand paging, it has zero pages in memory—even the first instruction of the program must be faulted in.
How pure demand paging works at startup:
```
Pure Demand Paging Process Lifecycle:
══════════════════════════════════════

1. PROCESS CREATION (exec syscall):
   ─────────────────────────────────
   • Parse executable header (ELF, PE, Mach-O)
   • Create virtual memory regions:
     - Text segment: virtual 0x400000-0x4ff000 → file offset 0x1000
     - Data segment: virtual 0x500000-0x510000 → file offset 0x100000
     - BSS:          virtual 0x510000-0x520000 → zero-fill on demand
     - Stack:        virtual 0x7fff0000-0x80000000 → zero-fill on demand
   • All pages marked NOT PRESENT in page table
   • NO ACTUAL LOADING OCCURS

2. SET INSTRUCTION POINTER:
   ─────────────────────────
   Set initial PC to program entry point (e.g., 0x400100)
   Transfer control to user mode
        ↓
3. FIRST INSTRUCTION FAULT:
   ─────────────────────────
   CPU tries to fetch instruction at 0x400100
   Page table lookup: 0x400xxx → NOT PRESENT!
   PAGE FAULT #1!
   Handler loads page containing 0x400100 from executable file
   Returns to 0x400100, now instruction executes

4. SUBSEQUENT FAULTS:
   ───────────────────
   mov rax, [0x500000]   ; Access data segment
     → PAGE FAULT #2 (load data page from file)
   call printf           ; Call into libc
     → PAGE FAULT #3 (load libc text page)
   push rbp              ; Use stack
     → PAGE FAULT #4 (zero-fill stack page)
   ... and so on until working set stabilizes

5. STEADY STATE:
   ──────────────
   After initial burst of faults, most accessed pages are resident
   New faults only for:
     • First access to previously-unused code paths
     • Growing stack/heap
     • After page eviction under memory pressure
```

Advantages of pure demand paging:

- Minimal startup cost: exec() does no loading, so the process begins executing almost immediately
- Minimal memory footprint: only pages that are actually touched ever occupy RAM
- No wasted I/O: code paths and data that are never used are never read from disk
Disadvantages of pure demand paging:

- A storm of page faults at startup, each with handler overhead and possible disk I/O
- Unpredictable latency: the first access to any page is far slower than later accesses
- Many small disk reads instead of a few large sequential ones
When pure demand paging makes sense:

- Memory-constrained systems where every resident page counts
- Short-lived processes that touch only a small fraction of their address space
- Fast backing stores (such as SSDs) where the per-fault I/O penalty is modest
Use tools like 'perf record -e page-faults' on Linux to observe the initial burst of page faults when a program starts under pure demand paging. You'll see hundreds or thousands of faults in the first few milliseconds, then the rate drops dramatically as the working set becomes resident.
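You can also observe this from inside a program. The sketch below uses only standard POSIX APIs: it maps a large anonymous region (which is demand-paged) and touches one byte per page, so each first touch is a fault, visible in the counters reported by getrusage().

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

int main(void) {
    const size_t len = 64UL * 1024 * 1024;          /* 64 MB, mapped lazily */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return 1;

    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);

    for (size_t i = 0; i < len; i += 4096)          /* touch one byte/page */
        p[i] = 1;                                   /* each first touch faults */

    getrusage(RUSAGE_SELF, &after);
    printf("minor faults: %ld, major faults: %ld\n",
           after.ru_minflt - before.ru_minflt,
           after.ru_majflt - before.ru_majflt);
    return 0;
}
```

Anonymous zero-fill pages need no disk I/O, so these show up as minor faults; the point is that none of the 64 MB occupied RAM until it was touched.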
Pure demand paging, while theoretically elegant, is rarely used in practice. Real systems employ various optimizations to reduce the initial fault storm while retaining demand paging's benefits.
Prepaging (Anticipatory Paging):
Prepaging loads some pages before they're accessed, anticipating that they'll be needed soon. This trades some memory usage for reduced fault rate.
| Strategy | Description | Use Case |
|---|---|---|
| Startup prepaging | Load entry point region at exec() | Reduce initial fault storm |
| Clustered paging | On fault, load adjacent pages too | Exploit spatial locality |
| Working set prepaging | Load saved working set from previous run | Reduce warm-up time |
| Read-ahead | Prefetch sequential pages for sequential access | File I/O optimization |
| Swap prefetch | Load pages from swap proactively | Anticipate reuse |
Clustered paging (fault clustering):
When a page fault occurs, the OS loads not just the faulted page but also several adjacent pages. The rationale: if a program is accessing page N, it likely will access pages N+1, N+2, etc. (spatial locality).
Linux's readahead mechanism is a form of clustered paging for file-backed pages:
Default cluster size: 128-256 KB (32-64 pages for 4K pages)
Adaptive: Increases cluster size if sequential access detected
Benefit: Single I/O operation loads many pages
Risk: Wastes memory if prediction wrong
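In kernel terms, fault clustering might look like the following hypothetical pseudocode sketch. The helper names (`page_present`, `alloc_frame`, `read_from_backing_store`, and so on) are invented; real readahead logic, such as Linux's, is considerably more adaptive.

```c
/* Hypothetical fault-clustering sketch: on a fault for page vpn,
   also pull in up to CLUSTER adjacent pages that are not yet present. */
#define CLUSTER 8

void handle_fault_with_clustering(struct process *proc, uint64_t vpn) {
    for (uint64_t v = vpn; v < vpn + CLUSTER; v++) {
        if (page_present(proc, v) || !in_valid_region(proc, v))
            break;                     /* stop at mapped pages or region end */
        frame_t f = alloc_frame();
        read_from_backing_store(f, proc, v);  /* ideally one batched I/O */
        map_page_vpn(proc, v, f);
    }
}
```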
Working set prepaging:
Some systems track each process's working set and save it when the process is swapped out or hibernated. On resume, the entire working set is preloaded, eliminating the warm-up period.
Windows SuperFetch/Prefetch and macOS's similar features use this approach: they record which pages an application touched during previous launches and preload them at the next launch, before the process would otherwise fault on them.
```bash
# View current readahead settings
$ cat /sys/block/sda/queue/read_ahead_kb
128

# Set readahead to 256 KB (64 pages)
$ echo 256 > /sys/block/sda/queue/read_ahead_kb

# Per-file readahead hints with posix_fadvise():
#   POSIX_FADV_SEQUENTIAL - Increase readahead
#   POSIX_FADV_RANDOM     - Disable readahead

# In code:
posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
```

Prepaging is speculative—it guesses which pages will be needed. If the guess is right, it saves time. If wrong, it wastes memory and I/O bandwidth. Systems must balance aggressive prepaging (better hit rate, more waste) against conservative prepaging (less waste, more faults).
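Applications can also request prepaging explicitly. For memory-mapped files, madvise() plays the role that posix_fadvise() plays for file descriptors. A minimal example using standard POSIX calls (the file name is illustrative):

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_RDONLY);      /* illustrative file name */
    if (fd < 0) return 1;
    struct stat st;
    fstat(fd, &st);

    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) return 1;

    /* Ask the kernel to prepage the whole mapping. This is a hint,
       not a guarantee; MADV_SEQUENTIAL/MADV_RANDOM tune readahead instead. */
    madvise(p, st.st_size, MADV_WILLNEED);

    /* ... accesses through p[] are now less likely to take major faults ... */
    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```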
While demand paging is the dominant technique today, an alternative approach—demand segmentation—played an important historical role and still informs understanding of virtual memory.
What is demand segmentation?
Demand segmentation applies the lazy loading principle to segments rather than pages. Instead of dividing memory into fixed-size pages, segmentation divides it into variable-size logical units (code segment, data segment, stack segment, etc.).
Under demand segmentation:

- Each segment table entry carries a valid bit, just as page table entries do
- A reference to a non-resident segment triggers a segment fault
- The OS loads the entire segment from the backing store, which first requires finding a contiguous region of memory large enough to hold it
| Aspect | Demand Paging | Demand Segmentation |
|---|---|---|
| Unit of transfer | Fixed-size page (4 KB typical) | Variable-size segment (KB to MB) |
| Granularity | Fine (single page) | Coarse (entire segment) |
| Internal fragmentation | Up to page_size-1 per allocation | None within segment |
| External fragmentation | None | Significant (variable sizes) |
| Loading time | Fast (small unit) | Slow (large segment) |
| Memory utilization | Precise | Coarse (all or nothing per segment) |
| Sharing | Page-level | Segment-level |
| Modern usage | Universal | Rare (mostly obsolete) |
Why demand paging won:
Demand paging has largely replaced demand segmentation for several reasons:

- Fixed-size pages make allocation trivial: any free frame fits any page, so there is no external fragmentation
- Transfer units are small and uniform, so a fault costs one predictable I/O rather than loading a potentially huge segment
- Replacement is simpler: evicting one page frees a known amount of memory
- Fine granularity means RAM holds only what is used, not whole segments that are mostly untouched
Where segmentation still matters:
While pure demand segmentation is obsolete, segmentation concepts persist:

- x86 retains segment registers; modern systems use FS/GS for thread-local storage
- Executables are still organized into logical segments (text, data, BSS, stack)
- Protection violations are still reported as 'segmentation faults', a naming legacy
Multics (1969) famously combined segmentation and paging. Each segment was divided into pages, providing both the logical structure of segments and the physical efficiency of pages. This 'segmented paging' influenced later designs, though modern systems mostly use 'paged segmentation' where paging is primary.
Not all page faults are equal. Understanding the different types helps in performance analysis and system tuning.
Classification by severity:
| Type | Also Called | Description | Typical Latency |
|---|---|---|---|
| Minor fault | Soft fault | Page is in RAM but not mapped to this process | ~1 μs |
| Major fault | Hard fault | Page must be loaded from disk | ~100 μs - 10 ms |
| Invalid fault | Segfault | Genuinely invalid access (unmapped, protection) | N/A (signal/exception) |
Minor (soft) faults:
A minor fault occurs when the page exists in physical memory (perhaps in the page cache or another process's mapping) but isn't yet mapped in the faulting process's page table. The handler just needs to update the page table—no disk I/O required.
Common causes:

- First write to a copy-on-write page (e.g., after fork())
- First access to a shared library page already resident via another process
- First access to a file page already present in the page cache
- Stack or heap growth satisfied by an already-zeroed frame
Major (hard) faults:
A major fault requires disk I/O to bring the page into memory. This is the expensive case—potentially 10,000× slower than a minor fault.
Common causes:

- Accessing a page that was swapped out under memory pressure
- First access to a memory-mapped file region not yet in the page cache
- Demand-loading code or data from the executable on a cold start
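The latency gap in the table above can be observed directly. This sketch uses standard POSIX APIs (the file name is illustrative): it times the first touch of each page of a freshly mapped file. On a cold page cache these touches are major faults; rerunning immediately, with the file cached, makes them cheap minor faults, and the average drops accordingly.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>

int main(void) {
    int fd = open("big.bin", O_RDONLY);          /* illustrative file name */
    if (fd < 0) return 1;
    struct stat st;
    fstat(fd, &st);

    volatile char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if ((void *)p == MAP_FAILED) return 1;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    long sum = 0;
    for (off_t i = 0; i < st.st_size; i += 4096)
        sum += p[i];                             /* first touch of each page */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("avg per-page touch: %.0f ns (sum=%ld)\n",
           ns / (st.st_size / 4096.0), sum);
    return 0;
}
```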
Monitoring page faults:
```bash
# Per-process page faults (min_flt = minor, maj_flt = major)
$ ps -o pid,min_flt,maj_flt,cmd $(pgrep -f myapp)
  PID MINFL MAJFL CMD
12345  5243   122 myapp --arg

# System-wide page fault rate
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system--
 r  b   swpd    free   buff  cache   si   so   bi   bo   in   cs
 1  0  10240  500000  50000 200000    0    0   52   10  200  150
                                      ↑    ↑
                                      │    └─ Pages swapped out
                                      │       (may cause future major faults)
                                      └─ Pages swapped in (major faults resolved)

# Detailed page fault statistics
$ /usr/bin/time -v ls
...
  Minor (reclaiming a frame) page faults: 93
  Major (requiring I/O) page faults: 0
...

# Using perf to trace page faults
$ perf stat -e page-faults,minor-faults,major-faults ./myapp
  1,234 page-faults
  1,200 minor-faults
     34 major-faults
```

For performance-sensitive applications, monitor the major fault rate closely. A high major fault rate indicates insufficient RAM for the working set, excessive swap usage, or poor locality. Minor faults are usually acceptable—they're just bookkeeping.
For demand paging to work, the hardware must support restartable instructions: after a page fault is resolved, the faulting instruction must execute as if nothing happened. This seemingly simple requirement has subtle implications.
The restart requirement:
When a page fault occurs mid-instruction, the CPU must:

- Save a precise architectural state, including the address of the faulting instruction itself
- Ensure no partial side effects of the instruction remain visible (or that they are safely resumable)
- Re-execute (or resume) the instruction after the handler returns, as if the fault never happened
Problematic instruction scenarios:
```
Challenging Cases for Instruction Restart:
═══════════════════════════════════════════

1. BLOCK MOVE INSTRUCTIONS (e.g., REP MOVSB on x86)
   ─────────────────────────────────────────────────
   rep movsb          ; Copy ECX bytes from [ESI] to [EDI]

   Problem:  What if fault occurs after copying 50 of 100 bytes?
   Solution: CPU tracks partial progress in registers; ECX/ESI/EDI
             updated per byte, so restart continues from where it
             left off.

2. INSTRUCTIONS THAT MODIFY OPERANDS (auto-increment)
   ──────────────────────────────────────────────────
   mov eax, [ebx]++   ; (hypothetical: load and increment pointer)

   Problem:  If ebx is incremented before the load faults, restart
             would load from wrong address.
   Solution: Modern ISAs avoid such designs, or precisely define when
             side-effects occur relative to potential faults.

3. INSTRUCTIONS WITH MULTIPLE MEMORY ACCESSES
   ──────────────────────────────────────────
   push [address]     ; Read from memory, then write to stack

   Problem:  If read succeeds but stack write faults, must restart
             the read too.
   Solution: CPU checks for potential faults before committing, or
             can restart the entire instruction cleanly.

4. COMPARE-AND-SWAP / ATOMIC OPERATIONS
   ─────────────────────────────────────
   lock cmpxchg [mem], ebx

   Problem:  Atomicity must be preserved across restart.
   Solution: If fault occurs, no memory modification has happened;
             entire atomic operation restarts.
```

Historical challenges:
Some historical architectures (like certain models of the Motorola 68000) had instructions that couldn't be restarted cleanly. This made demand paging impossible or required special handling.
The Motorola 68010 added the ability to save and restore instruction execution state mid-flight, fixing this issue. Modern architectures are designed with restartability in mind.
The x86 approach:
x86 instructions are designed to be restartable. The processor:

- Validates translations and permissions before committing any memory write
- Delivers the page fault before the instruction produces architecturally visible side effects (string instructions being the resumable exception)
- Saves the address of the faulting instruction itself, not the next one, so IRET re-executes it
This design ensures that after a page fault is handled, re-executing the same instruction produces the correct result.
Modern out-of-order CPUs speculate past memory operations. When a page fault occurs, all speculative state must be discarded and the CPU resumed from a precise architectural state. This rollback capability, needed for exceptions anyway, also enables demand paging.
Demand paging is the mechanism that transforms virtual memory from an addressing abstraction into a powerful resource management system. Let's consolidate what we've learned:

- Pages are loaded lazily: a cleared present bit turns the first access into a page fault, and the handler loads the page and restarts the instruction
- The faulting process never notices; this transparency is what lets unmodified programs run under demand paging
- Pure demand paging starts from zero resident pages; real systems temper the initial fault storm with prepaging techniques such as clustering and readahead
- Minor faults are cheap bookkeeping; major faults cost disk I/O and are the metric to watch
- The whole scheme rests on hardware support: a present bit, a precise page fault exception, and restartable instructions
What's next:
We've seen how pages are loaded on demand. The next page examines the benefits of virtual memory holistically—not just the ability to exceed physical memory, but also memory protection, simplified programming, efficient use of RAM, and enabling advanced features like copy-on-write and memory-mapped files.
You now have a comprehensive understanding of demand paging and demand segmentation—the lazy loading mechanisms that make virtual memory practical. These mechanisms are the foundation for all the advanced virtual memory features we'll explore in subsequent pages.