When a system enters thrashing—that catastrophic state where processes spend more time waiting for pages than executing instructions—the fundamental cause is invariably a mismatch between memory demands and available resources. The working set approach represents the most theoretically sound and practically effective solution to this problem, transforming thrashing prevention from reactive firefighting into proactive resource management.
Peter Denning's introduction of the working set model in 1968 wasn't merely an optimization technique—it was a paradigm shift in how operating systems conceptualize memory allocation. By grounding allocation decisions in observed locality behavior rather than static policies, the working set approach provides a principled framework for ensuring that every process receives exactly the memory it needs to maintain acceptable performance.
By the end of this page, you will understand how the working set approach fundamentally prevents thrashing, the mathematical framework that defines working set size, implementation strategies for tracking working sets, and the admission control policies that keep systems operating in the stable region. You'll gain insight into why this approach remains the gold standard for thrashing prevention in modern operating systems.
The working set approach is built upon a profound insight: processes exhibit locality of reference. At any given moment, a process doesn't access memory uniformly across its entire address space. Instead, it focuses on a relatively small subset of pages—the working set—and this subset changes gradually over time as the process moves through different phases of execution.
This observation leads to a powerful principle: if we guarantee that each process's working set is resident in memory, we can virtually eliminate thrashing. The working set approach converts this principle into a practical allocation policy.
The working set approach inverts the traditional perspective on memory management. Instead of asking 'How do we fit all processes into available memory?', it asks 'How many processes can we run with acceptable performance given available memory?' This shift from fairness to performance is the key to eliminating thrashing.
The working set is defined with mathematical precision, which enables rigorous analysis and implementation. Let's establish the formal definitions that underpin the working set approach.
Working Set Definition:
For a process P at virtual time t, the working set WS(t, Δ) is defined as:
WS(t, Δ) = { pages referenced in the interval (t - Δ, t] }
Where:
- t is the process's current virtual time, measured in memory references
- Δ (the window size) is the number of most recent references considered
Working Set Size (WSS):
The working set size at time t is simply the cardinality of the working set:
|WS(t, Δ)| = number of distinct pages in WS(t, Δ)
This value fluctuates over time as the process moves through different phases of computation.
| Reference # | Page Referenced | Last 10 References | Working Set | WS Size |
|---|---|---|---|---|
| 1-10 | 1,2,3,1,4,1,2,1,3,5 | {1,2,3,4,5} | {1,2,3,4,5} | 5 |
| 11 | 1 | {2,3,1,4,1,2,1,3,5,1} | {1,2,3,4,5} | 5 |
| 12 | 1 | {3,1,4,1,2,1,3,5,1,1} | {1,2,3,4,5} | 5 |
| 13 | 6 | {1,4,1,2,1,3,5,1,1,6} | {1,2,3,4,5,6} | 6 |
| 14 | 6 | {4,1,2,1,3,5,1,1,6,6} | {1,2,3,4,5,6} | 6 |
| 15 | 6 | {1,2,1,3,5,1,1,6,6,6} | {1,2,3,5,6} | 5 (4 aged out) |
| 16-20 | 6,6,6,6,6 | mostly 6s | {1,6}, then {6} | shrinking toward 1 (phase shift) |
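To make the table's mechanics concrete, here is a minimal, self-contained C sketch (illustrative code, not from the original text) that replays the same reference string with Δ = 10 and reports the working set size after each reference:

```c
#include <stdio.h>
#include <stdbool.h>

#define DELTA    10   // Window size: the last 10 references
#define MAX_PAGE 16   // Highest page number we expect

// Count distinct pages referenced in the window (t - DELTA, t]
int working_set_size(const int* refs, int t) {
    bool seen[MAX_PAGE + 1] = { false };
    int wss = 0;
    int start = (t - DELTA >= 0) ? t - DELTA : 0;  // Clamp at the start of the trace
    for (int i = start; i < t; i++) {
        if (!seen[refs[i]]) {
            seen[refs[i]] = true;
            wss++;
        }
    }
    return wss;
}

int main(void) {
    // The reference string from the table above
    int refs[] = { 1,2,3,1,4,1,2,1,3,5, 1, 1, 6, 6, 6,6,6,6,6,6 };
    int n = sizeof(refs) / sizeof(refs[0]);

    for (int t = 1; t <= n; t++) {
        printf("after ref %2d (page %d): WSS = %d\n",
               t, refs[t - 1], working_set_size(refs, t));
    }
    return 0;
}
```

Running this reproduces the table: WSS peaks at 6 when page 6 first joins the window, then shrinks as the old locality ages out.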
The Critical Role of Δ (Window Size):
The window size Δ is the most critical parameter in working set management. It must be chosen carefully:
Δ too small: Working set underestimates true locality. Pages that are genuinely needed but referenced slightly less frequently fall outside the window, causing excessive page faults.
Δ too large: Working set overestimates true locality. Pages from previous phases that are no longer needed remain in the working set, wasting memory and reducing multiprogramming level.
Optimal Δ: Captures approximately one complete locality phase. This typically corresponds to a few thousand to tens of thousands of references, depending on the application.
The Total Working Set Sum:
For a system with n processes, the total demand D is:
D = Σ |WS_i(t, Δ)| (sum over all processes i = 1 to n)
The working set approach maintains the invariant:
D ≤ m (where m is the number of available frames)
If D exceeds m, processes must be suspended until the inequality is restored.
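For example, with m = 100 frames and three processes whose working set sizes are 40, 35, and 30, total demand is D = 105 > 100, so one process must be suspended; suspending the 30-frame process drops D to 75, restoring the invariant with headroom to spare.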
Working set calculations use virtual time (page reference count) rather than wall-clock time. This is crucial because a process that's blocked waiting for I/O shouldn't have its working set shrink—it will need the same pages when it resumes. Virtual time only advances when the process actually executes and makes memory references.
Transforming the working set concept into a practical memory management policy requires addressing several implementation challenges. The pure working set model, while theoretically elegant, requires modifications for real-world deployment.
The Working Set Page Replacement Policy:
At each page reference:
1. Advance the process's virtual time and stamp the referenced page with it.
2. Remove from the resident set any page whose last reference now falls outside the window (t - Δ, t].
3. If the reference faults, bring the page in, reusing a frame freed in step 2 if one exists, otherwise requesting a new frame from the system.
This policy has a beautiful property: it automatically adjusts allocation to match demand. A process entering a phase with larger locality automatically receives more frames; one transitioning to tighter locality releases frames for others.
```c
// Conceptual Working Set Page Replacement Implementation
// This demonstrates the core logic, not production code

typedef struct {
    int page_number;
    unsigned long last_reference_time;  // Virtual time of last access
    bool valid;
    bool in_working_set;
} PageTableEntry;

typedef struct {
    PageTableEntry* page_table;
    int page_table_size;
    unsigned long virtual_time;   // Process's page reference counter
    unsigned long delta;          // Working set window size
    int current_wss;              // Current working set size
    int allocated_frames;         // Number of frames currently held
} ProcessMemoryState;

// Called on every memory reference
void on_memory_reference(ProcessMemoryState* pms, int page_num) {
    pms->virtual_time++;  // Advance virtual time

    PageTableEntry* pte = &pms->page_table[page_num];
    pte->last_reference_time = pms->virtual_time;

    // Recalculate working set membership for all pages
    update_working_set(pms);

    if (!pte->valid) {
        // Page fault - need to bring page into memory
        handle_page_fault(pms, page_num);
    }
}

// Update which pages are in the working set
void update_working_set(ProcessMemoryState* pms) {
    int wss = 0;
    unsigned long threshold = pms->virtual_time - pms->delta;

    for (int i = 0; i < pms->page_table_size; i++) {
        PageTableEntry* pte = &pms->page_table[i];
        if (pte->valid) {
            if (pte->last_reference_time > threshold) {
                // Page is in working set
                pte->in_working_set = true;
                wss++;
            } else {
                // Page has aged out of working set
                pte->in_working_set = false;
                // This page can be reclaimed!
                release_frame_to_system(pte);
            }
        }
    }
    pms->current_wss = wss;
}

// Handle a page fault under working set policy
void handle_page_fault(ProcessMemoryState* pms, int page_num) {
    // First, try to reclaim pages outside our working set
    PageTableEntry* victim = find_non_working_set_page(pms);

    if (victim != NULL) {
        // Good - we can replace without affecting working set
        replace_page(victim, page_num);
    } else {
        // All our pages are in working set - need more frames
        if (system_has_free_frames()) {
            allocate_new_frame(pms, page_num);
        } else {
            // System is under memory pressure
            // This is where thrashing prevention kicks in
            signal_memory_pressure(pms);
        }
    }
}
```

The pure working set algorithm is expensive: tracking exact reference times for every page requires updating metadata on every memory access. Real systems use approximations based on reference bits and timer interrupts, trading some accuracy for practical performance. We'll explore these approximations shortly.
The working set approach's most powerful thrashing prevention mechanism is admission control. Rather than allowing processes to compete for inadequate resources, the system controls how many processes can be active simultaneously.
The Admission Control Principle:
A new process may enter the Ready queue only if the sum of all working set sizes (including the new process's estimated WSS) does not exceed available memory.
```
If:   Σ WSS_current + WSS_new ≤ Available Frames
Then: Admit the new process
Else: Suspend the new process (or an existing process with lower priority)
```
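As a rough sketch, the admission test itself is a few lines of C. The struct and function names here are hypothetical illustrations, not taken from any real kernel:

```c
#include <stdbool.h>

typedef struct {
    unsigned long wss;   // Current (or estimated) working set size, in frames
} Process;

// Hypothetical admission test: admit only if total demand stays within memory.
// total_frames is m; procs[] holds the currently active processes.
bool can_admit(const Process* procs, int nprocs,
               unsigned long new_wss, unsigned long total_frames) {
    unsigned long demand = new_wss;   // D starts with the newcomer's estimated WSS
    for (int i = 0; i < nprocs; i++) {
        demand += procs[i].wss;       // Add each active process's WSS
    }
    return demand <= total_frames;    // Maintain the invariant D <= m
}
```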
Estimating Working Set Size for New Processes:
A challenge with admission control is estimating the WSS for processes that haven't started executing yet. Strategies include:
- Historical data: reuse the WSS observed during previous runs of the same program
- Conservative defaults: start from a modest system-wide estimate, then refine it once the process begins generating references
- Gradual admission: admit the process with a small allocation and let its measured working set correct the estimate as it grows
Dynamic Adjustment:
Admission control isn't a one-time decision. The system continuously monitors:
- Each process's measured working set size as it moves between phases
- Total demand D = Σ WSS relative to available frames
- Page fault rates, as an early warning that some working set is not fully resident
If conditions change (e.g., a process enters a memory-intensive phase), the system may need to suspend a process even after admission.
Think of admission control like a restaurant with limited seating. It's better to serve fewer customers well than to pack everyone in and serve everyone poorly. A customer waiting for a table (suspended process) will eventually get good service; a customer at an overcrowded table (thrashing process) gets a miserable experience indefinitely.
The pure working set algorithm requires tracking the exact reference time of every page access—a prohibitively expensive operation since memory references occur millions of times per second. Real implementations use approximations that capture the essence of working set behavior with acceptable overhead.
Timer-Based Working Set Approximation:
The most common approximation uses hardware reference bits and periodic timer interrupts:
1. The hardware sets a page's reference bit automatically whenever the page is accessed.
2. On each timer interrupt, the OS scans resident pages: a page with the bit set is marked recently used and its bit is cleared; a page with the bit clear has an age counter incremented.
3. Pages whose age exceeds a threshold are considered outside the working set and become eviction candidates.
This trades temporal precision for efficiency—we know a page was accessed "sometime in the last interval" rather than exactly when.
```c
// Timer-Based Working Set Approximation
// Called on each timer interrupt (e.g., every 100ms)

#define AGE_THRESHOLD 3   // Evict after 3 intervals without access
#define NUM_PAGES 4096

typedef struct {
    bool reference_bit;   // Set by hardware on access
    bool modified_bit;    // Set by hardware on write
    int age_counter;      // Intervals since last access
    bool is_resident;     // Currently in physical memory
    int frame_number;     // Physical frame if resident
} PageInfo;

PageInfo page_table[NUM_PAGES];

// Called on each timer interrupt
void timer_interrupt_handler(void) {
    int pages_in_ws = 0;
    int pages_evicted = 0;

    for (int i = 0; i < NUM_PAGES; i++) {
        if (!page_table[i].is_resident) continue;

        if (page_table[i].reference_bit) {
            // Page was accessed in this interval
            page_table[i].age_counter = 0;    // Reset age
            page_table[i].reference_bit = 0;  // Clear for next interval
            pages_in_ws++;
        } else {
            // Page was NOT accessed in this interval
            page_table[i].age_counter++;
            if (page_table[i].age_counter > AGE_THRESHOLD) {
                // Page has aged out of working set - evict it
                evict_page(&page_table[i]);
                pages_evicted++;
            } else {
                pages_in_ws++;  // Still in working set for now
            }
        }
    }

    // Update process working set size for scheduling decisions
    current_process->current_wss = pages_in_ws;

    // Report metrics for system monitoring
    log_memory_metrics(pages_in_ws, pages_evicted);
}

// More sophisticated: use reference bit history
typedef struct {
    uint8_t reference_history;  // 8-bit shift register of reference bits
    int frame_number;
    bool is_resident;
} EnhancedPageInfo;

void enhanced_timer_handler(EnhancedPageInfo* table, int size) {
    for (int i = 0; i < size; i++) {
        if (!table[i].is_resident) continue;

        // Shift history right, insert current reference bit at MSB
        bool current_ref = get_and_clear_reference_bit(i);
        table[i].reference_history >>= 1;
        if (current_ref) {
            table[i].reference_history |= 0x80;  // Set MSB
        }

        // Page is in working set if any recent access occurred
        // reference_history == 0 means no access in 8 intervals
        if (table[i].reference_history == 0) {
            consider_for_eviction(&table[i]);
        }
    }
}
```

The Reference Bit History Technique:
A refinement of the basic approach maintains a history of reference bits rather than just the current state: on each timer interrupt, the page's current reference bit is shifted into the most significant bit of an 8-bit history byte, so the byte records the last eight intervals.
For example:
```
11111111 (255) = accessed in all recent intervals
10000000 (128) = accessed only in the most recent interval
00000001 (1)   = accessed 7 intervals ago, nothing since
00000000 (0)   = no access in the last 8 intervals
```

This provides a more nuanced view of access patterns and enables better replacement decisions.
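One useful property of this representation: interpreted as an unsigned integer, the history byte doubles as a recency rank, since more recent references set higher bits. The sketch below (illustrative, with a hypothetical function name) uses this to pick an eviction candidate, which is essentially the classic "aging" approximation to LRU:

```c
#include <stdint.h>

// The page with the numerically smallest history was used least recently,
// so it is the best eviction candidate under the aging heuristic.
int pick_eviction_candidate(const uint8_t* history, int npages) {
    int victim = -1;
    uint8_t lowest = 0xFF;
    for (int i = 0; i < npages; i++) {
        if (history[i] <= lowest) {   // <= ensures a victim is always chosen
            lowest = history[i];
            victim = i;
        }
    }
    return victim;
}
```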
Timer interval affects accuracy: shorter intervals provide finer granularity but increase overhead. Typical systems use 50-200ms intervals. The age threshold must be calibrated to match Δ: if Δ corresponds to 1 second of execution, and timer fires every 100ms, threshold should be approximately 10 intervals.
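Expressed as code, the calibration above is a single division. A sketch, with a hypothetical function name:

```c
// Derive the age threshold from the desired window and the timer period.
// If delta covers 1000ms of execution and the timer fires every 100ms,
// a page should survive about 1000 / 100 = 10 intervals without a reference.
int compute_age_threshold(int delta_ms, int timer_interval_ms) {
    int threshold = delta_ms / timer_interval_ms;
    return (threshold > 0) ? threshold : 1;   // At least one interval
}
```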
When total working set demand exceeds available memory, the working set approach requires suspending processes rather than allowing them all to thrash. The choice of which processes to suspend—and when to resume them—is critical to system performance.
Suspension Criteria:
When memory pressure is detected (Σ WSS > Available Frames), the system must select victims for suspension. Considerations include:
| Policy | Selection Criterion | Advantages | Disadvantages |
|---|---|---|---|
| Lowest Priority First | Suspend process with lowest scheduling priority | Protects important processes, intuitive policy | Low-priority processes may starve completely |
| Largest WSS First | Suspend process consuming most memory | Frees maximum memory with one suspension | Penalizes memory-intensive workloads unfairly |
| Smallest WSS First | Suspend process consuming least memory | Minimal impact on suspended process | May need to suspend many processes |
| Most Recent Arrival | Suspend most recently started process | Older processes likely closer to completion | Unfair to new arrivals during busy periods |
| Longest Wait Time | Suspend process waiting longest for I/O | Already blocked, minimal disruption | Complex to track, may not free enough memory |
| Composite Score | Weighted combination of factors | Balanced, configurable decisions | Complex implementation, tuning required |
Resumption Policies:
Once a suspended process has been written to swap space, it must eventually resume. Resumption decisions require careful consideration:
Memory Availability Check: Only resume if sufficient frames exist to accommodate the process's expected WSS
Gradual Warming: When resuming, don't load the entire working set immediately. Let the process fault in pages as needed—demand paging will naturally reconstruct the working set
Prepaging Option: Alternatively, proactively load pages that were in the working set at suspension time. This reduces initial page fault bursts but risks loading pages no longer needed
Priority Consideration: Resume higher-priority suspended processes before lower-priority ones
Aging Prevention: Implement maximum suspension time to prevent indefinite starvation
Hysteresis: Don't resume immediately when memory becomes available—wait for a margin to prevent immediate re-suspension
```c
// Working Set-Based Suspension Management

typedef struct {
    pid_t pid;
    int priority;
    unsigned long wss;
    unsigned long time_in_system;
    unsigned long last_io_time;
    bool is_suspended;
    unsigned long suspension_time;
} ProcessInfo;

// Calculate suspension score (higher = more likely to suspend)
int calculate_suspension_score(ProcessInfo* proc) {
    int score = 0;

    // Lower priority increases suspension likelihood
    score += (MAX_PRIORITY - proc->priority) * PRIORITY_WEIGHT;

    // Larger WSS increases suspension likelihood
    score += (proc->wss / WSS_UNIT) * WSS_WEIGHT;

    // Recent arrivals more likely to be suspended
    score += (MAX_TIME - min(proc->time_in_system, MAX_TIME)) / TIME_UNIT * TIME_WEIGHT;

    // Longer I/O wait makes suspension less disruptive
    if (proc->last_io_time > 0) {
        score += (proc->last_io_time / IO_UNIT) * IO_WEIGHT;
    }

    return score;
}

// Select process to suspend when memory pressure detected
ProcessInfo* select_suspension_victim(ProcessInfo* processes, int count,
                                      unsigned long required_frames) {
    ProcessInfo* best_victim = NULL;
    int best_score = -1;

    for (int i = 0; i < count; i++) {
        ProcessInfo* proc = &processes[i];

        if (proc->is_suspended) continue;           // Already suspended
        if (proc->pid == KERNEL_PID) continue;      // Never suspend kernel
        if (proc->wss < required_frames) continue;  // Won't help enough

        int score = calculate_suspension_score(proc);
        if (score > best_score) {
            best_score = score;
            best_victim = proc;
        }
    }

    return best_victim;
}

// Check if suspended processes can be resumed
void evaluate_resumption(ProcessInfo* processes, int count,
                         unsigned long available_frames) {
    // Sort suspended processes by priority (highest first)
    ProcessInfo* suspended[MAX_PROCESSES];
    int suspended_count = collect_suspended(processes, count, suspended);
    sort_by_priority_descending(suspended, suspended_count);

    unsigned long remaining_frames = available_frames;

    for (int i = 0; i < suspended_count; i++) {
        ProcessInfo* proc = suspended[i];

        // Apply hysteresis: need 20% more than WSS to resume
        unsigned long required = proc->wss * 120 / 100;

        if (remaining_frames >= required) {
            resume_process(proc);
            remaining_frames -= proc->wss;  // Assume it will use its WSS
        }

        // Check starvation: resume if suspended too long
        if (get_current_time() - proc->suspension_time > MAX_SUSPENSION_TIME) {
            // Force resume even if memory tight
            // This may cause another suspension later
            force_resume_with_swap_reservation(proc);
        }
    }
}
```

Suspension is not free. Writing a process's memory to swap and later reading it back incurs significant I/O overhead. Frequent suspension/resumption cycles can themselves become a performance bottleneck. The goal is to stabilize the system with minimal suspension activity.
While the pure working set model is rarely implemented exactly as Denning described, its principles pervade modern operating system design. Let's examine how contemporary systems adapt working set concepts.
Linux:
Linux doesn't track working sets per-process but implements analogous mechanisms:
- Active/inactive page lists that approximate which pages are in recent use
- The kswapd reclaim thread, tuned by the swappiness parameter, which controls how aggressively pages are evicted
- The OOM killer, a reactive last resort that terminates a process when memory demand cannot otherwise be met
Windows:
Windows explicitly uses working set terminology:
- Each process has a working set data structure with configurable minimum and maximum sizes
- The balance set manager periodically trims working sets when memory runs low
- The modified page writer flushes trimmed dirty pages to disk so their frames can be reused
macOS:
Apple systems employ sophisticated memory management:
- Compressed memory: rarely used pages are compressed in RAM before any swapping occurs
- Memory pressure levels that signal applications to release caches voluntarily
- Automatic termination of idle applications when pressure becomes critical
| Concept | Linux Implementation | Windows Implementation | macOS Implementation |
|---|---|---|---|
| Working Set Tracking | Active/inactive lists | Per-process WS structs | Memory object tracking |
| Admission Control | OOM killer (reactive) | Job objects, WS limits | Automatic termination |
| Suspension/Swapping | Swappiness parameter | Modified page writer | Compressed memory first |
| WSS Estimation | Page age heuristics | Balance set manager | Memory pressure levels |
| Thrashing Detection | No explicit detection | Monitor page fault rate | Memory pressure signals |
Though implementation details vary, the core working set insight—that processes should receive memory proportional to their locality requirements, and that it's better to run fewer processes well than many processes poorly—remains the foundation of modern memory management. Denning's 1968 model continues to guide designs over 50 years later.
The working set approach represents the gold standard for thrashing prevention, grounded in theoretical rigor yet adaptable to practical constraints. Let's consolidate our understanding:
- The working set WS(t, Δ) captures a process's current locality: the distinct pages referenced in its last Δ references
- The window Δ must roughly match one locality phase; too small causes needless faults, too large wastes memory
- The stability invariant D = Σ WSS ≤ m is what actually prevents thrashing
- Admission control and suspension/resumption enforce that invariant by adjusting the multiprogramming level
- Practical systems approximate exact reference tracking with reference bits, timer interrupts, and aging
Looking Ahead:
The working set approach provides the theoretical foundation for thrashing prevention, but implementing pure working set policies requires approximation. The next page explores Page Fault Frequency (PFF), an alternative approach that achieves similar goals by directly monitoring and controlling page fault rates rather than tracking page references.
You now understand the working set approach as a comprehensive framework for thrashing prevention. The key insight is that memory allocation should be driven by observed locality behavior, and that controlling the multiprogramming level is preferable to allowing all processes to receive inadequate resources.