Virtual memory creates one of computing's most powerful illusions: the appearance that every process has access to a vast, private address space—often far exceeding the physical RAM installed in the machine. A 64-bit system theoretically offers 18 exabytes of addressable memory per process, yet typical machines contain merely 8, 16, or 32 gigabytes of physical RAM.
This disparity is not a bug—it's a feature. But every powerful abstraction comes with a hidden cost. When processes collectively demand more physical memory than the system possesses, the operating system faces a fundamental crisis: memory over-allocation.
This page explores the mechanics, causes, and consequences of over-allocation—the core problem that makes page replacement not just useful, but absolutely essential to modern computing.
By the end of this page, you will understand: (1) What over-allocation means in the context of virtual memory, (2) Why it occurs naturally in multiprogrammed systems, (3) The mathematical relationship between virtual and physical memory, (4) How over-commitment enables higher utilization but creates page replacement necessity, and (5) The fundamental tradeoffs involved in memory over-allocation policies.
Over-allocation occurs when the total virtual memory allocated across all processes exceeds the available physical memory (RAM). This situation is not exceptional—it's the normal operating state of virtually every modern computing system.
Formal Definition:
Let's define over-allocation mathematically. If we have:
- n processes: P₁, P₂, ..., Pₙ
- Vᵢ = the virtual memory allocated to process Pᵢ
- M = the physical memory installed in the machine

Over-allocation occurs when:
∑(i=1 to n) Vᵢ > M
In practice, this inequality is almost always satisfied. Consider a typical system:
Total virtual memory allocated across all processes: 21 GB, against 16 GB of physical RAM.
This 31% over-allocation is modest by real-world standards. Systems routinely operate with 2x, 5x, or even 10x over-allocation ratios.
Over-allocation refers to virtual address space allocation, not actual memory usage. A process may have a 4 GB virtual address space but actively use only 200 MB at any given moment. This gap between allocation and active use is what makes over-allocation viable—and is central to why demand paging works.
The Over-allocation Ratio:
Operating systems track the over-allocation ratio (also called the over-commit ratio):
Over-allocation Ratio = Total Virtual Memory Allocated / Physical Memory
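As a rough illustration, the short Linux-only C sketch below estimates this ratio by comparing Committed_AS with MemTotal from /proc/meminfo. Treat it as an approximation: Committed_AS tracks committed virtual memory rather than every reserved address range.

#include <stdio.h>

int main(void) {
    char line[256];
    long mem_total_kb = 0, committed_kb = 0;      /* /proc/meminfo reports values in kB */
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL) return 1;
    while (fgets(line, sizeof line, f)) {
        sscanf(line, "MemTotal: %ld kB", &mem_total_kb);
        sscanf(line, "Committed_AS: %ld kB", &committed_kb);
    }
    fclose(f);
    if (mem_total_kb > 0)
        printf("over-allocation ratio ~ %.2f\n", (double)committed_kb / mem_total_kb);
    return 0;
}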
Different systems handle over-allocation differently:
| Policy | Linux | Windows | macOS |
|---|---|---|---|
| Default | Heuristic over-commit | Commit limit | Compressed memory |
| Behavior | Allows allocation, may OOM-kill | Fails allocation if exceeded | Aggressive compression |
| Typical Ratio | 0.5-2.0 | 1.0-1.5 | 1.0-2.0 |
Linux's Over-commit Modes:
Linux provides explicit control via /proc/sys/vm/overcommit_memory:
- 0 (heuristic): Kernel estimates reasonable over-commit
- 1 (always): Never refuse any allocation (dangerous but useful for specific workloads)
- 2 (never): Strict accounting; commit limit = swap + (RAM × overcommit_ratio)

Over-allocation isn't an accident or poor system design; it's an inevitable consequence of how modern operating systems maximize resource utilization. Understanding the root causes reveals why over-allocation is not just tolerated but actively beneficial.
Root Cause 1: Multiprogramming
Modern operating systems run many processes simultaneously to maximize CPU utilization. When one process waits for I/O, another can use the CPU. This multiprogramming model fundamentally requires memory for multiple processes:
If each process required dedicated physical memory equal to its virtual address space, systems would support far fewer concurrent processes, devastating throughput and responsiveness.
Root Cause 2: Demand Paging Philosophy
Demand paging intentionally loads pages only when accessed, not when allocated. This "pay only for what you use" model is fundamentally over-committed:
malloc(1GB) → the virtual memory manager reserves 1 GB of address space; no physical frames are allocated until the pages are actually touched.

With 100 processes each allocating 1 GB "just in case," the system has 100 GB of virtual allocations but might use only 5 GB physically, a 20x over-commit that works perfectly because of demand paging.
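To see this in action, here is a small Linux-specific sketch (it relies on /proc/self/statm and sysconf, and the exact numbers will vary by system) that prints the process's virtual and resident sizes before and after touching a 1 GB allocation:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void print_mem(const char *label) {
    long vsize_pages = 0, rss_pages = 0;
    long page_kb = sysconf(_SC_PAGESIZE) / 1024;
    FILE *f = fopen("/proc/self/statm", "r");     /* first two fields: total and resident pages */
    if (f && fscanf(f, "%ld %ld", &vsize_pages, &rss_pages) == 2)
        printf("%-15s virtual: %6ld MB   resident: %6ld MB\n",
               label, vsize_pages * page_kb / 1024, rss_pages * page_kb / 1024);
    if (f) fclose(f);
}

int main(void) {
    size_t one_gb = (size_t)1 << 30;
    print_mem("start");
    char *buf = malloc(one_gb);         /* reserves address space only */
    if (buf == NULL) return 1;
    print_mem("after malloc");          /* resident size barely changes */
    memset(buf, 1, one_gb);             /* touching every page forces frame allocation */
    print_mem("after touching");        /* resident size is now roughly 1 GB larger */
    free(buf);
    return 0;
}

The virtual size jumps by about 1 GB the moment malloc returns, while the resident size only catches up after every page has been written: demand paging made visible.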
Root Cause 3: The Working Set Principle
Processes exhibit locality of reference—they access a small, slowly-changing subset of their memory at any time. This working set is typically much smaller than total allocation:
Typical Working Set Size / Virtual Allocation Ratio: 10% to 30%
If a process allocates 1 GB but its working set is 100 MB, the system only needs to keep 100 MB resident. The other 900 MB can remain on disk until needed—enabling massive over-allocation.
Over-allocation works because of the 90/10 rule: programs spend roughly 90% of their time in about 10% of their code, repeatedly re-accessing recently used pages (temporal locality) and accessing data near recently accessed data (spatial locality). Without locality, over-allocation would cause constant page thrashing. With locality, it enables efficient resource sharing.
While over-allocation is beneficial, there exists a critical threshold beyond which it becomes pathological. Understanding this threshold is essential for system configuration and performance tuning.
The Breaking Point:
Over-allocation works as long as:
∑(Working Sets of all active processes) ≤ Available Physical Memory
When this inequality is violated, the system enters a dangerous state: thrashing. Processes continually page fault because their working sets cannot fit in memory. The CPU spends most of its time handling page faults instead of executing application code.
Visualizing the Threshold:
| Zone | Over-allocation Level | Physical Memory State | System Behavior |
|---|---|---|---|
| Green (Optimal) | 1x - 3x | Working sets fit comfortably | Minimal page faults, high throughput |
| Yellow (Caution) | 3x - 5x | Working sets barely fit | Occasional page faults, slight degradation |
| Orange (Warning) | 5x - 10x | Working sets overlap in memory | Frequent page faults, noticeable slowdown |
| Red (Critical) | > 10x | Working sets cannot fit | Thrashing, system near unusable |
Factors Affecting the Threshold:
The exact threshold varies based on:
- Workload characteristics: how much locality processes exhibit and how large their working sets are
- Swap space performance: SSD-backed swap tolerates far more paging than HDD-backed swap
- Page replacement algorithm quality: better victim selection keeps working sets resident longer
- Memory pressure responsiveness: how quickly the system detects pressure and begins reclaiming pages
Calculating Safe Over-allocation:
A conservative rule of thumb for how much activity the system can safely sustain:

Maximum active processes ≈ (RAM + Swap) / Average Working Set Size

Example: with 16 GB of RAM, 8 GB of swap, and an average working set of 500 MB per process, (16 + 8) GB / 0.5 GB ≈ 48 processes. If each of those processes allocates 2 GB of virtual address space, that is 96 GB of allocations against 16 GB of RAM, a 6x over-allocation that remains workable.
When over-allocation exceeds sustainable limits and the system cannot find memory to satisfy a page fault, Linux invokes the Out-Of-Memory (OOM) killer, which terminates processes to reclaim memory. This is a last resort—the system would rather kill a process than deadlock entirely. Understanding over-allocation helps avoid triggering this drastic measure.
To fully understand over-allocation, we must examine how physical memory is organized and distributed among competing processes.
Frame Structure:
Physical memory is divided into fixed-size units called frames, typically matching the page size (4 KB on most systems). The operating system maintains several data structures to track frame status:
struct frame_entry {
unsigned long frame_number; // Physical frame number
unsigned int reference_count; // Number of mappings to this frame
unsigned int flags; // State flags (dirty, locked, etc.)
struct page *page_descriptor; // Pointer to page metadata
struct list_head lru_list; // Position in LRU list
unsigned long last_access_time; // For replacement algorithms
};
Frame States:
At any moment, each frame is in one of several states: free (on the free frame list), allocated (mapped by one or more processes), locked or pinned (ineligible for eviction, such as kernel pages or pages undergoing I/O), and dirty (modified since it was loaded and requiring write-back before reuse).
The Free Frame List:
The operating system maintains a free frame list—a collection of currently unallocated frames. This list is the system's immediately available memory supply:
Free List Operations:
- get_free_frame() : O(1) removal from list head
- return_frame(frame) : O(1) insertion at list tail
- count_free_frames() : Maintained as running total
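The user-space C sketch below mirrors these operations with a singly linked list. It is only an illustration of the O(1) behavior listed above, not the kernel's actual free-list implementation:

#include <stdio.h>
#include <stdlib.h>

struct free_frame { unsigned long frame_number; struct free_frame *next; };

static struct free_frame *free_head = NULL;
static struct free_frame *free_tail = NULL;
static unsigned long free_count = 0;            /* maintained as a running total */

static void return_frame(unsigned long frame_number) {    /* O(1) insertion at list tail */
    struct free_frame *node = malloc(sizeof *node);
    if (node == NULL) abort();
    node->frame_number = frame_number;
    node->next = NULL;
    if (free_tail) free_tail->next = node; else free_head = node;
    free_tail = node;
    free_count++;
}

static long get_free_frame(void) {                         /* O(1) removal from list head */
    if (free_head == NULL) return -1;                      /* empty: page replacement needed */
    struct free_frame *node = free_head;
    long frame = (long)node->frame_number;
    free_head = node->next;
    if (free_head == NULL) free_tail = NULL;
    free(node);
    free_count--;
    return frame;
}

int main(void) {
    for (unsigned long f = 0; f < 4; f++) return_frame(f); /* seed with four free frames */
    printf("free frames: %lu, handing out frame %ld\n", free_count, get_free_frame());
    return 0;
}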
Critical Thresholds:
Systems define thresholds that trigger different behaviors:
High Watermark (pages_high):
- Free frames above this level → System is comfortable
- No proactive reclamation needed
Low Watermark (pages_low):
- Free frames drop below this → Start background reclamation
- kswapd daemon begins finding pages to evict
Minimum Watermark (pages_min):
- Free frames drop below this → Synchronous reclamation
- Allocating process must wait while system finds free frames
- Direct reclaim in allocation path
Critical Level:
- Free frames near zero → Emergency measures
- OOM killer may be invoked
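The simplified decision function below illustrates how these thresholds drive behavior. The watermark values in the test driver are invented for the example, and real kernels evaluate watermarks per zone with far more nuance:

#include <stdio.h>

enum reclaim_action { DO_NOTHING, WAKE_KSWAPD, DIRECT_RECLAIM, INVOKE_OOM };

/* kswapd, once woken, keeps reclaiming until free frames climb back above
   pages_high; that stop condition is omitted from this sketch. */
static enum reclaim_action check_watermarks(unsigned long free_frames,
                                            unsigned long pages_low,
                                            unsigned long pages_min) {
    if (free_frames > pages_low) return DO_NOTHING;        /* system is comfortable */
    if (free_frames > pages_min) return WAKE_KSWAPD;       /* background reclamation */
    if (free_frames > 0)         return DIRECT_RECLAIM;    /* synchronous reclaim in the allocation path */
    return INVOKE_OOM;                                      /* emergency measures */
}

int main(void) {
    const char *names[] = { "do nothing", "wake kswapd", "direct reclaim", "invoke OOM killer" };
    unsigned long samples[] = { 5000, 900, 300, 0 };        /* free-frame counts to test */
    for (int i = 0; i < 4; i++)
        printf("free frames = %4lu -> %s\n", samples[i],
               names[check_watermarks(samples[i], 1000, 500)]);
    return 0;
}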
Memory Zones:
Physical memory is divided into zones based on addressing capabilities:
| Zone | Address Range | Usage |
|---|---|---|
| ZONE_DMA | 0-16 MB | Legacy DMA operations |
| ZONE_DMA32 | 0-4 GB | 32-bit DMA operations |
| ZONE_NORMAL | 4 GB+ | General purpose |
| ZONE_HIGHMEM | Beyond direct map | Additional memory (32-bit only) |
Each zone has independent free lists and watermarks. Over-allocation affects zones differently—DMA zones are precious and rarely over-committed.
The fundamental problem arises when a page fault occurs but the free frame list is empty. This moment is where over-allocation transforms from abstract concept to immediate operational crisis.
The Scenario:
A process references a page that is not in memory. The page fault handler locates the page on disk and asks for a free frame to load it into, but the free frame list is empty: every frame is already in use by some process or by the kernel.
The Dilemma:
The operating system faces an unpleasant choice with only two options:
Block Indefinitely: wait for a frame to become free. This offers no guarantee of progress; if other processes are also waiting on memory, no frame may ever be released.
Create a Free Frame: select a resident page, write it back to disk if it has been modified, and reuse its frame for the faulting page.
The second option is the only viable solution, making page replacement an essential operating system mechanism—not an optimization, but a fundamental requirement.
Page replacement is the mechanism that makes over-allocation possible. Without it, virtual memory would require physical backing for every allocated page—eliminating most benefits of virtual memory. Page replacement is the bridge between the promise of vast virtual address spaces and the reality of limited physical RAM.
The Complete Page Fault Handling with Replacement:
Page Fault Handler:
1. Save process state
2. Determine faulting virtual address
3. Validate access (segmentation fault if invalid)
4. Locate page on disk (page table, file mapping, or swap)
5. Find a free frame:
a. If free frame available → use it
b. If no free frame → invoke page replacement:
i. Select victim page
ii. If victim is dirty → write to disk
iii. Update victim's page table entry (valid=0)
iv. Frame is now free
6. Read desired page from disk into frame
7. Update page table entry (frame number, valid=1)
8. Restart faulting instruction
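The self-contained C sketch below turns these steps into a toy simulation. The FIFO-style victim choice and the stubbed disk routines (read_from_disk, write_to_disk) are illustrative simplifications for this page, not a real kernel interface:

#include <stdio.h>
#include <stdbool.h>

#define NUM_PAGES  8    /* virtual pages of a toy process */
#define NUM_FRAMES 3    /* physical frames available to it */

typedef struct { bool valid; bool dirty; int frame; } pte_t;

static pte_t page_table[NUM_PAGES];
static int   frame_owner[NUM_FRAMES];   /* which page occupies each frame; -1 = free */
static int   next_victim = 0;           /* round-robin (FIFO-like) victim pointer */

static void write_to_disk(int page)  { printf("  write dirty page %d to swap\n", page); }
static void read_from_disk(int page) { printf("  read page %d from backing store\n", page); }

static int get_frame(void) {
    for (int f = 0; f < NUM_FRAMES; f++)                  /* step 5a: free frame available? */
        if (frame_owner[f] == -1) return f;
    int f = next_victim;                                   /* step 5b-i: select a victim */
    next_victim = (next_victim + 1) % NUM_FRAMES;
    int victim = frame_owner[f];
    if (page_table[victim].dirty) write_to_disk(victim);   /* step 5b-ii: write back if dirty */
    page_table[victim].valid = false;                      /* step 5b-iii: invalidate victim's PTE */
    return f;                                              /* step 5b-iv: frame is now free */
}

static void access_page(int page, bool is_write) {
    if (!page_table[page].valid) {                         /* page fault */
        printf("page fault on page %d\n", page);
        int f = get_frame();
        read_from_disk(page);                              /* step 6: bring in the desired page */
        frame_owner[f] = page;
        page_table[page] = (pte_t){ .valid = true, .dirty = is_write, .frame = f }; /* step 7 */
    } else if (is_write) {
        page_table[page].dirty = true;
    }
}

int main(void) {
    for (int f = 0; f < NUM_FRAMES; f++) frame_owner[f] = -1;
    int trace[] = { 0, 1, 2, 0, 3, 1, 4 };                 /* reference string: more pages than frames */
    for (unsigned i = 0; i < sizeof trace / sizeof trace[0]; i++)
        access_page(trace[i], i % 2 == 0);                 /* alternate writes and reads */
    return 0;
}

With seven references to five distinct pages and only three frames, the run exercises both cases: a dirty victim that must be written back and a clean victim that can simply be discarded.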
Cost Analysis:
Page replacement adds significant overhead to page fault handling:
| Step | Time (SSD) | Time (HDD) |
|---|---|---|
| Page fault detection | ~1 μs | ~1 μs |
| Select victim | ~1-10 μs | ~1-10 μs |
| Write dirty page | ~50 μs | ~10 ms |
| Read new page | ~50 μs | ~10 ms |
| Total (clean victim) | ~100 μs | ~10 ms |
| Total (dirty victim) | ~150 μs | ~20 ms |
This cost is why page replacement algorithm choice matters enormously—selecting a dirty page when a clean one is available doubles the I/O cost.
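The short calculation below, using the rough SSD figures from the table, shows how the average fault-service cost grows with the fraction of dirty victims; the inputs are the table's order-of-magnitude estimates, not measurements:

#include <stdio.h>

int main(void) {
    const double clean_us = 100.0, dirty_us = 150.0;   /* approximate SSD costs from the table */
    for (double dirty_fraction = 0.0; dirty_fraction <= 1.0; dirty_fraction += 0.25) {
        double avg_us = (1.0 - dirty_fraction) * clean_us + dirty_fraction * dirty_us;
        printf("dirty victims: %3.0f%%  ->  average fault cost: %6.1f us\n",
               dirty_fraction * 100.0, avg_us);
    }
    return 0;
}

A replacement algorithm that prefers clean victims when one is available therefore lowers the average cost directly.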
Operating systems implement different policies regarding how much over-allocation to permit. These policies represent fundamental tradeoffs between resource utilization and system stability.
Policy 1: Never Over-commit (Conservative)
Allocation Rule: Virtual allocations ≤ Physical + Swap
Characteristics:
- Allocation requests fail immediately once the commit limit is reached, even if physical memory is still free
- No risk of OOM kills; behavior is highly predictable
- Physical memory is often under-utilized
Use Cases: Mission-critical systems, databases, real-time applications
Policy 2: Heuristic Over-commit (Balanced)
Allocation Rule: Allow reasonable over-commit based on historical patterns
Characteristics:
- Obviously excessive allocations are refused; reasonable ones succeed even when they exceed physical memory
- Allocation failures are rare and memory utilization is high
- A small residual risk of OOM kills remains under sudden memory pressure
Use Cases: General-purpose desktops, development servers
Policy 3: Always Over-commit (Aggressive)
Allocation Rule: Allow any allocation; deal with consequences later
Characteristics:
- Every allocation succeeds, regardless of available memory
- Utilization approaches 100%, but failures surface late, as OOM kills rather than failed allocations
- Requires careful monitoring and workloads whose actual memory usage is well understood
Use Cases: Scientific computing with known workloads, batch processing
| Aspect | Never | Heuristic | Always |
|---|---|---|---|
| Memory Utilization | 60-80% | 80-95% | 90-100% |
| Allocation Failures | Common | Rare | Never (until OOM) |
| Predictability | High | Medium | Low |
| OOM-kill Risk | None | Low | Moderate |
| Configuration | Simple | Moderate | Complex monitoring needed |
# Linux Over-commit Control

# View current setting
cat /proc/sys/vm/overcommit_memory
# 0 = heuristic (default)
# 1 = always over-commit
# 2 = never over-commit

# View overcommit ratio (used when mode = 2)
cat /proc/sys/vm/overcommit_ratio
# Default: 50 (meaning commit limit = swap + 50% of RAM)

# Example: Set strict accounting
echo 2 > /proc/sys/vm/overcommit_memory
echo 80 > /proc/sys/vm/overcommit_ratio
# Now: commit limit = swap + 80% of RAM

# Check current commit status
cat /proc/meminfo | grep -i commit
# CommitLimit: 16384000 kB (maximum allowed)
# Committed_AS: 12000000 kB (currently committed)

# Persistent configuration via sysctl.conf
# vm.overcommit_memory = 2
# vm.overcommit_ratio = 80

Understanding over-allocation has profound implications for how we design, deploy, and operate software systems.
Implication 1: Memory Allocation ≠ Memory Usage
Developers must understand that allocating memory doesn't guarantee immediate physical availability:
#include <stdlib.h>
#include <string.h>

int main(void) {
    // This allocates virtual address space, not physical memory
    char *buffer = malloc((size_t)1024 * 1024 * 1024); // 1 GB
    if (buffer == NULL) return 1;
    // No physical memory used yet; frames are allocated on first access:
    buffer[0] = 'x'; // First page fault → first frame allocated
    // Touch all pages to force physical allocation:
    memset(buffer, 0, (size_t)1024 * 1024 * 1024); // Now ~1 GB physically resident
    free(buffer);
    return 0;
}
Implication 2: Resource Planning Requires Working Set Analysis
Capacity planning must consider working sets, not allocations:
❌ Wrong: "We have 10 processes each allocating 2 GB; need 20 GB RAM"
✓ Right: "We have 10 processes with 500 MB working sets; 8 GB RAM may suffice"
Implication 3: Containers and cgroups
Modern container systems (Docker, Kubernetes) allow memory limits:
# Kubernetes resource specification
resources:
requests:
memory: "512Mi" # Working set expectation
limits:
memory: "1Gi" # Maximum allowed
These limits interact with over-allocation: the scheduler places pods onto nodes based on their requests (expected working sets), while the sum of limits on a node may exceed the node's physical memory, which is over-commitment at the node level.
Over-committing container memory can lead to node-level OOM events that kill pods unexpectedly.
- Monitor Committed_AS vs CommitLimit on Linux; high ratios indicate risk
- Use oom_score_adj to protect critical processes

Over-allocation is the reality that virtual memory creates: the promise of more memory than physically exists. Page replacement is the mechanism that fulfills this promise: when physical memory runs out, we steal a frame from one process to give to another. Understanding over-allocation is understanding why page replacement exists and why choosing the right victim matters enormously for system performance.
We've established the foundational problem that necessitates page replacement. Let's consolidate our understanding:
- Over-allocation, where total virtual allocations exceed physical RAM, is the normal operating state of modern systems, not an anomaly
- Multiprogramming, demand paging, and the working set principle make it both inevitable and beneficial
- It remains healthy only while the working sets of active processes fit in physical memory; beyond that threshold the system thrashes
- When a page fault arrives and no free frame exists, the only viable response is to evict an existing page, which is exactly what page replacement does
What's Next:
Now that we understand why page replacement is necessary, we'll explore what it actually involves. The next page examines the page replacement concept in detail—the fundamental operation of choosing a victim, evicting it, and using its frame for a new page.
You now understand over-allocation—the core reality that makes page replacement essential. Virtual memory's promise of abundant address space only works because we can move pages between memory and disk. Without page replacement, every virtual page would need permanent physical residence, eliminating most benefits of virtual memory.