In the previous page, we defined thrashing and observed its devastating effects. Now we examine the root cause: insufficient frame allocation. Understanding this cause is essential because it reveals that thrashing is not random or unpredictable—it's the direct, deterministic result of a fundamental resource shortage.
The core insight is straightforward but profound: thrashing occurs when processes receive fewer frames than they need to hold their working sets in memory. This page explores every dimension of this problem.
By the end of this page, you will understand precisely why insufficient frames cause thrashing, how the relationship between frame allocation and page faults creates instability, the role of the operating system scheduler in exacerbating the problem, and why some allocation strategies are more vulnerable than others.
Every operating system faces a fundamental resource allocation challenge: how should limited physical memory frames be distributed among competing processes? This is the frame allocation problem, and its solution directly determines whether the system thrashes.
The Mathematical Reality:

Let's formalize the constraint. Given:

- m = total physical frames available for allocation
- f(i) = frames allocated to process i
- WSS(i) = working set size of process i

The fundamental constraint is:
∑ f(i) ≤ m (Cannot allocate more frames than exist)
For stability, we need:
f(i) ≥ WSS(i) for all i (Each process needs enough frames)
Combining these:
∑ WSS(i) ≤ m (Sum of working sets must fit in memory)
When this condition is violated—when processes collectively need more frames than exist—thrashing is inevitable.
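The feasibility condition can be expressed directly as a quick check. A minimal sketch; the frame counts are illustrative, not from any real system:

```python
def allocation_is_stable(working_set_sizes, total_frames):
    """Stable only if the combined working sets fit: sum(WSS) <= m."""
    return sum(working_set_sizes) <= total_frames

# Three processes needing 300 + 400 + 200 = 900 frames on a 1024-frame machine:
print(allocation_is_stable([300, 400, 200], 1024))  # fits: stable
print(allocation_is_stable([300, 400, 500], 1024))  # 1200 > 1024: thrashing risk
```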
Modern operating systems allow memory overcommitment—processes can request more virtual memory than physical memory exists. This is normally fine because processes rarely use all allocated memory simultaneously. But when memory pressure increases and processes actually need their allocations, the bill comes due. There's no way to satisfy everyone—thrashing begins.
Before discussing how many frames processes should receive, we must establish how many they must receive. Every process has a minimum frame requirement: the smallest number of frames needed for the process to execute at all.
This minimum is determined by the architecture's instruction set and addressing modes:
| Addressing Mode | Pages Potentially Accessed | Minimum Frames Required |
|---|---|---|
| Direct addressing | Instruction page + operand page | 2 |
| Indirect addressing | Instruction + pointer + data pages | 3 |
| Indexed addressing | Instruction + base + index + data | 4 |
| Memory-to-memory | Instruction + source + destination | 3 |
| Complex (VAX MOVC3) | Instruction + control + src + dst segments | 6+ |
Why the Minimum Matters:
Consider an instruction like MOV [addr1], [addr2] that copies data from one memory location to another. Executing it touches at least three pages: the page holding the instruction itself, the page containing the source operand, and the page containing the destination operand.

Now imagine the system allocates only 2 frames to this process. To bring in the third required page, the OS must evict either the instruction page or one of the operand pages, and whichever it evicts is still needed to finish this very instruction.

Neither option allows the instruction to complete!
If a process receives fewer than its minimum frames, it cannot complete even a single instruction. Every attempt to make progress causes a page fault, which evicts a page needed for the current instruction. The process enters an infinite fault loop—a degenerate form of thrashing where literally no progress is possible. The OS must guarantee the minimum or refuse to run the process.
Minimum vs. Adequate:
Meeting the minimum prevents infinite loops but doesn't prevent thrashing. The minimum allows a single instruction to complete, but if the working set far exceeds the minimum, the process will still thrash—just not infinitely.
| Frames Allocated | Behavior |
|---|---|
| < Minimum | Cannot complete any instruction; infinite fault loop |
| = Minimum | Can execute but severe thrashing; every instruction causes faults |
| > Minimum, < Working Set | Thrashing; frequent page faults degrade performance |
| ≥ Working Set | Normal operation; occasional faults at phase transitions |
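The table above can be captured as a tiny classifier. This is a sketch with hypothetical inputs, not a real kernel interface:

```python
def paging_regime(frames, minimum, wss):
    """Map a frame allocation to the behavior regimes in the table."""
    if frames < minimum:
        return "infinite fault loop"   # cannot complete any instruction
    if frames < wss:
        return "thrashing"             # runs, but faults dominate progress
    return "normal"                    # occasional faults at phase transitions

print(paging_regime(2, 3, 50))    # below the architectural minimum
print(paging_regime(20, 3, 50))   # above minimum, below working set
print(paging_regime(64, 3, 50))   # working set fits
```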
Thrashing fundamentally stems from a demand-supply imbalance. Processes demand frames (through their working sets), and the system supplies them (through physical memory). When demand exceeds supply, something must give.
Visualizing the Imbalance:
DEMAND SIDE                          SUPPLY SIDE
┌─────────────────────┐              ┌─────────────────────┐
│ Process 1: Web      │              │ Total Physical      │
│ Server (WSS=800)    │              │ Memory: 2048 frames │
├─────────────────────┤              ├─────────────────────┤
│ Process 2: DB       │              │ Kernel Reserve:     │
│ Server (WSS=600)    │              │ 256 frames          │
├─────────────────────┤              ├─────────────────────┤
│ Process 3: App      │              │ Buffer Cache:       │
│ Server (WSS=500)    │              │ 256 frames          │
├─────────────────────┤              ├─────────────────────┤
│ Process 4: Redis    │              │ Available:          │
│ Cache (WSS=400)     │              │ 1536 frames         │
└─────────────────────┘              └─────────────────────┘

Total Demand: 800+600+500+400 = 2300 frames

DEMAND (2300) > SUPPLY (1536) → THRASHING INEVITABLE
In this scenario, there's no allocation strategy that prevents thrashing. The collective working set exceeds available memory by 764 frames (50% oversubscribed).
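The arithmetic behind this scenario is worth making explicit. A small audit using the numbers from the figure above:

```python
# Supply side: physical memory minus fixed reservations.
total_physical = 2048
kernel_reserve = 256
buffer_cache = 256
supply = total_physical - kernel_reserve - buffer_cache  # frames usable by processes

# Demand side: sum of the four servers' working set sizes.
demand = 800 + 600 + 500 + 400

shortfall = demand - supply
print(supply, demand, shortfall)                   # 1536 2300 764
print(f"{shortfall / supply:.0%} oversubscribed")  # 50% oversubscribed
```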
When demand exceeds supply, frame allocation becomes a zero-sum game. Every frame given to one process is a frame taken from another. The OS can reshuffle frames endlessly, but the total shortage remains. This is why thrashing cannot be solved by cleverer allocation—only by reducing demand (fewer processes, smaller working sets) or increasing supply (more RAM).
Sources of Demand Increase:
Understanding where demand comes from helps prevent thrashing:
| Demand Source | Description | Example |
|---|---|---|
| New processes | Scheduler admits more work | Fork bomb, service startup |
| Process growth | Existing process expands working set | Loading large dataset |
| Phase transition | Process enters memory-intensive phase | Initialization → main loop |
| Shared page invalidation | Pages become private copies | Copy-on-write after fork |
| Memory leaks | Process accumulates unreleased memory | Application bug |
Sources of Supply Decrease:
| Supply Reduction | Description | Example |
|---|---|---|
| Kernel expansion | OS needs more memory | Module loading, buffer growth |
| Memory fragmentation | Usable but not contiguous | Huge page allocation failure |
| Hot plugging | Physical memory removed | NUMA node offline |
| Reservation increase | More memory pre-allocated | Huge page reservation |
Different frame allocation policies handle memory pressure differently, but all eventually fail when demand exceeds supply. Understanding how each policy fails illuminates the fundamental nature of the problem.
The Priority Allocation Trap:
Priority-based allocation gives more frames to high-priority processes. This seems sensible: important work should proceed smoothly. But it creates a perverse effect. The frames must come from somewhere, so low-priority processes are starved into thrashing, and, unable to make progress, they cannot complete and release the resources the high-priority processes may be waiting on.
The key insight is that allocation policies only determine who suffers—they cannot solve the underlying shortage. Equal allocation spreads suffering equally. Proportional allocation spreads it proportionally. Priority allocation concentrates suffering on low-priority processes. But in all cases, total demand exceeds total supply, so someone thrashes.
Comparative Failure Analysis:
| Policy | Frame Distribution | Who Thrashes | System Throughput |
|---|---|---|---|
| Equal | Uniform | Large processes | Highest if all similar size |
| Proportional | By VM size | All (proportionally) | Often best overall |
| Priority | By importance | Low-priority processes | High-priority work proceeds |
| First-Come | Arrival order | Latecomers | Early processes succeed |
| Working Set | By actual usage | Last to arrive | Best utilization, complex |
Each policy makes different tradeoffs, but none escapes the fundamental constraint.
One of the most tragic aspects of thrashing is how the operating system's own scheduling logic accelerates the collapse. The scheduler, following reasonable heuristics, makes decisions that worsen the very problem it should prevent.
The Scheduler's Logic:
The scheduler's error is treating a memory problem as a CPU problem. Low CPU utilization can mean either (a) not enough processes, or (b) all processes blocked on I/O. The standard scheduler assumes (a) and acts accordingly. But when (b) is the cause—and specifically when I/O is page fault service—adding more processes is exactly wrong.
Timeline of Scheduler-Induced Collapse:
| Time | Event | CPU% | Page Faults/s | Processes |
|---|---|---|---|---|
| t₀ | Normal operation | 80% | 50 | 10 |
| t₁ | Memory pressure begins | 75% | 100 | 10 |
| t₂ | Scheduler observes low CPU | 70% | 200 | 10 |
| t₃ | Scheduler admits 2 new procs | 60% | 500 | 12 |
| t₄ | Thrashing intensifies | 40% | 1,500 | 12 |
| t₅ | Scheduler admits 3 more procs | 20% | 5,000 | 15 |
| t₆ | System nearly unresponsive | 5% | 20,000 | 15 |
| t₇ | Scheduler still trying | 2% | 50,000 | 18 |
| t₈ | Complete collapse | <1% | 100,000+ | 18 |
Notice how CPU utilization and page fault rate move in opposite directions. The scheduler's attempts to raise CPU utilization only accelerate the collapse.
The Missing Feedback:
The core problem is that the scheduler lacks memory-aware feedback. It optimizes for CPU utilization without considering memory constraints.
Modern schedulers incorporate memory pressure signals, but the fundamental tension remains.
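A memory-aware admission check might look like the following sketch. The function name, parameters, and thresholds are illustrative assumptions, not any real scheduler's API:

```python
def should_admit(cpu_util, faults_per_sec, free_frames, new_wss,
                 fault_limit=500):
    """Admit new work only when low CPU really means idleness rather than
    paging I/O, and only when the newcomer's working set actually fits."""
    if faults_per_sec > fault_limit:
        return False              # low CPU is a memory problem; admitting worsens it
    return free_frames >= new_wss

# Healthy system: plenty of free frames, low fault rate.
print(should_admit(cpu_util=0.80, faults_per_sec=50, free_frames=400, new_wss=300))    # True
# Timeline's t4: a CPU-only scheduler would admit; the fault rate says refuse.
print(should_admit(cpu_util=0.40, faults_per_sec=1500, free_frames=400, new_wss=300))  # False
```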
We can formalize the relationship between frame shortage and thrashing intensity using quantitative analysis.
Define the shortage ratio S as:
S = (∑ WSS(i) - m) / m
Where:

- ∑ WSS(i) = total working set demand across all processes
- m = total frames available for allocation

Interpreting the ratio:

- If S ≤ 0: no shortage, system stable
- If S > 0: shortage exists, thrashing risk
- If S > 0.5: severe shortage, heavy thrashing
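The ratio and its interpretation, as a sketch:

```python
def shortage_ratio(working_set_sizes, total_frames):
    """S = (sum of WSS - m) / m."""
    return (sum(working_set_sizes) - total_frames) / total_frames

def interpret(s):
    if s <= 0:
        return "stable"
    if s <= 0.5:
        return "thrashing risk"
    return "severe thrashing"

# The four-server scenario from earlier: 2300 frames demanded, 1536 supplied.
s = shortage_ratio([800, 600, 500, 400], 1536)
print(round(s, 2), interpret(s))  # 0.5 thrashing risk
```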
Page Fault Rate as a Function of Shortage:
Empirical studies show that page fault rate increases non-linearly with shortage ratio:
Page Fault Rate
    ▲
1.0 ┤                          ╱──── Saturation
    │                        ╱
0.8 ┤                      ╱
    │                    ╱
0.6 ┤                  ╱
    │                ╱   Thrashing zone
0.4 ┤              ╱
    │            ╱
0.2 ┤          ╱
    │        ╱
0.1 ┤───────╱
    │ Stable
    └─────┴─────┴─────┴─────┴─────┴──► Shortage Ratio
    0    0.2   0.4   0.6   0.8   1.0
The curve shows:
| Shortage Ratio | Expected Fault Rate | Effective Throughput | User Experience |
|---|---|---|---|
| S ≤ 0 (adequate) | < 1% | ~100% | Normal, responsive |
| 0 < S ≤ 0.1 | 1-5% | 80-95% | Slightly sluggish |
| 0.1 < S ≤ 0.3 | 5-20% | 50-80% | Noticeably slow |
| 0.3 < S ≤ 0.5 | 20-50% | 20-50% | Very slow, frustrating |
| 0.5 < S ≤ 0.8 | 50-80% | 5-20% | Nearly unusable |
| S > 0.8 | > 80% | < 5% | System appears frozen |
The Non-Linear Cliff:
The transition from "slightly sluggish" to "completely unusable" can occur with a small change in shortage ratio. A system at S=0.2 might feel "a bit slow." At S=0.4, it's "very slow." At S=0.6, it's "frozen."
This non-linearity is what makes thrashing dangerous: the window between "noticeably slow" and "frozen" is narrow, so by the time symptoms are obvious to users, the system may already be past the point where gradual corrective action helps.
The page replacement scope—global or local—significantly affects how frame shortage manifests and how thrashing propagates through the system.
Under global replacement, a single memory-intensive process can cause system-wide thrashing. Imagine a runaway process that rapidly touches a huge working set. With global replacement, it continually evicts pages from other processes. Those processes then page fault, trying to reload evicted pages. But the aggressive process evicts them again. The entire system grinds to a halt because of one misbehaving process.
Scenario Comparison:
Consider a system with 1000 frames and 4 processes, one of which begins rapidly expanding its working set:

With Global Replacement: the aggressive process steals frames from the other three, which then fault trying to reload their evicted pages, only to lose them again. All four processes thrash and system-wide throughput collapses.

With Local Replacement: the aggressive process thrashes within its own fixed partition, while the other three keep their frames and continue at normal speed. The damage is contained.
Tradeoff Summary:
| Aspect | Global Replacement | Local Replacement |
|---|---|---|
| Memory utilization | Higher | Lower |
| Fairness | None guaranteed | Guaranteed |
| Isolation | None | Complete |
| Thrashing propagation | System-wide | Per-process |
| Optimal for | Cooperative workloads | Adversarial workloads |
| Implementation | Simpler | More complex |
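The propagation difference can be demonstrated with a toy LRU simulation. This is a sketch under invented assumptions: one well-behaved process cycling over 4 pages, one greedy process that only ever touches brand-new pages:

```python
from collections import OrderedDict

def simulate(trace, total_frames=8, quotas=None):
    """Count page faults per process for a reference trace.
    quotas=None -> one global LRU pool shared by all processes;
    otherwise each process gets its own LRU-managed partition."""
    faults = {}
    pools = {None: OrderedDict()} if quotas is None else {p: OrderedDict() for p in quotas}
    for proc, page in trace:
        pool = pools[None] if quotas is None else pools[proc]
        key = (proc, page) if quotas is None else page
        cap = total_frames if quotas is None else quotas[proc]
        if key in pool:
            pool.move_to_end(key)            # hit: refresh recency
        else:
            faults[proc] = faults.get(proc, 0) + 1
            if len(pool) >= cap:
                pool.popitem(last=False)     # evict least-recently-used victim
            pool[key] = True
    return faults

# Interleave: one reference by "normal" (cycling 4 pages), then three
# never-before-seen pages by "greedy".
trace = []
for i in range(100):
    trace.append(("normal", i % 4))
    trace += [("greedy", 1000 + 3 * i + j) for j in range(3)]

print(simulate(trace))                                     # global: greedy evicts normal's pages
print(simulate(trace, quotas={"normal": 4, "greedy": 4}))  # local: normal keeps its 4 pages
```

Under global replacement the well-behaved process faults on every reference because the greedy process keeps flushing its pages; under local replacement it faults only on its 4 cold misses.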
Now that we understand why insufficient frames cause thrashing, we can preview the solutions. All solutions address the fundamental equation: ∑ WSS(i) > m.
Fundamentally, there are only three levers: increase supply, reduce demand, or manage the allocation dynamically. Common mechanisms include:
| Approach | Mechanism | Tradeoff | When Appropriate |
|---|---|---|---|
| Add RAM | Install more memory | Cost | Long-term solution |
| Swap out processes | Suspend low-priority processes | Reduced concurrency | Temporary overload |
| Working set tracking | Monitor actual page usage | Overhead | Dynamic workloads |
| PFF control | Adjust allocation based on fault rate | Complexity | Proactive prevention |
| Load shedding | Reject new work when saturated | Lost requests | Server systems |
The most robust solution combines detection with action. First, detect when thrashing is occurring or imminent (high page fault rate, memory pressure). Then, take corrective action (reduce multiprogramming, swap processes). Prevention is better than cure—tracking working sets allows proactive management before thrashing begins.
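One step of that detect-then-act loop might be sketched as follows. The threshold and the suspend-the-victim policy are hypothetical placeholders, not a real OS interface:

```python
def thrashing_control_step(fault_rate, processes, high=1000):
    """Detect then act: if the system-wide fault rate exceeds the threshold,
    pick the lowest-priority process as a victim to suspend (swap out),
    freeing its frames. processes: list of (name, priority) pairs.
    Returns the victim's name, or None if no action is needed."""
    if fault_rate <= high:
        return None                                   # no thrashing detected
    return min(processes, key=lambda p: p[1])[0]      # suspend lowest priority

procs = [("web", 10), ("db", 9), ("batch-report", 1)]
print(thrashing_control_step(50, procs))    # None: system healthy
print(thrashing_control_step(5000, procs))  # batch-report: swap out to relieve pressure
```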
What's Next:
With the cause clearly understood—insufficient frames relative to total working set demand—the next page examines the symptom that most directly indicates thrashing: high page fault rate. Understanding this symptom enables detection and informed response.
You now understand the root cause of thrashing: frame shortage relative to working set demand. You've seen how allocation policies fail, how the scheduler worsens the problem, and why both global and local replacement have limitations. Next, we explore the high page fault rate that signals thrashing.