Thrashing Solutions - Learning Module

Loading content...

0/227

Page Fault Frequency

Direct Measurement for Direct Action

While the working set approach provides a theoretically elegant solution to thrashing, it requires tracking complex temporal information about page references. Page Fault Frequency (PFF) offers an alternative philosophy: instead of predicting which pages a process needs, why not simply measure how well it's performing and adjust accordingly?

The PFF approach embodies a simple yet powerful principle: page fault rate directly indicates whether a process has adequate memory. A process with too few frames experiences high fault rates; one with too many experiences low rates (and wastes frames other processes could use). By establishing acceptable fault rate bounds and adjusting allocation to keep processes within those bounds, PFF achieves thrashing prevention through feedback control rather than prediction.

What You Will Learn

By the end of this page, you will understand how PFF monitors page fault behavior to dynamically adjust frame allocation, the upper and lower threshold mechanism that triggers allocation changes, implementation strategies for PFF-based memory management, and the tradeoffs between PFF and working set approaches. You'll gain insight into why PFF is often preferred for its simplicity and directness.

The Page Fault Frequency Philosophy

The PFF approach is grounded in a straightforward observation: page fault rate is a reliable indicator of memory adequacy. Consider what happens as we vary a process's frame allocation:

Very few frames: Every page access is potentially a fault. Fault rate approaches 100%.
Moderately few frames: Frequent faults as the process cannot hold its working set. High fault rate.
Adequate frames: Working set fits comfortably. Fault rate drops to a low, stable level.
Excess frames: Additional frames provide no benefit. Fault rate cannot drop below zero.

This relationship suggests a control system: monitor fault rate, and adjust allocation to maintain the rate within an acceptable range.

Core PFF Principles

•Output-Based Control — PFF adjusts based on observed behavior (fault rate) rather than predicted needs (working set). This eliminates the need to track reference history.
•Threshold-Driven Decisions — Upper and lower thresholds define the acceptable fault rate band. Violations trigger allocation adjustments.
•Self-Correcting System — When allocation is too low, high fault rate triggers increase; when too high, low fault rate permits decrease. The system naturally converges.
•Simplicity of Implementation — Counting page faults is easy; tracking exact reference times is hard. PFF requires minimal bookkeeping.
•Direct Thrashing Prevention — High fault rates across multiple processes directly indicate thrashing. PFF can detect and respond immediately.

The Thermostat Analogy

PFF operates like a thermostat for memory allocation. The thermostat doesn't predict heat loss through walls—it measures temperature and adjusts heating accordingly. Similarly, PFF doesn't predict memory needs—it measures fault rate and adjusts allocation. Both achieve their goal through feedback rather than modeling.

The Threshold Mechanism: Defining Acceptable Performance

The heart of PFF is its dual-threshold system. Two boundaries define the acceptable fault rate range, and crossing either boundary triggers corrective action.

Upper Threshold (T_high):

When a process's page fault rate exceeds T_high, the process is suffering from memory starvation. Action required:

Allocate additional frame(s) to the process
If no frames available, consider suspending another process
If still insufficient after allocation, the process may need to be suspended

Lower Threshold (T_low):

When a process's page fault rate falls below T_low, the process has more frames than it needs. Action permissible:

Reclaim frame(s) from the process
Add reclaimed frames to the global free pool
Other processes can use these frames

The Hysteresis Zone:

Processes with fault rates between T_low and T_high are in the acceptable zone. No action needed—allocation is appropriate for current behavior.

PFF Threshold Actions
Fault Rate Zone	Condition	Interpretation	Action
Above T_high	PFR > T_high	Memory starvation	Increase allocation
Acceptable Zone	T_low ≤ PFR ≤ T_high	Adequate memory	No change
Below T_low	PFR < T_low	Excess memory	Decrease allocation
Zero Faults	PFR = 0 (extended)	Major over-allocation	Significant decrease

Setting Threshold Values:

Threshold selection is critical to PFF effectiveness:

T_high Considerations:

Too low: Frequent allocation requests, system instability
Too high: Processes suffer significant degradation before help arrives
Typical range: 5-20 faults per 1000 page references (or equivalent time-based metric)

T_low Considerations:

Too low: Frames hoarded by processes that don't need them
Too high: Constant frame reclamation, potential oscillation
Typical range: 1-5 faults per 1000 page references

The Gap Between Thresholds:

Too narrow: System oscillates between adding and removing frames
Too wide: Slow response to changing conditions, wasted resources
Rule of thumb: T_high should be at least 3-5x T_low

In practice, threshold values are tuned based on:

Workload characteristics (I/O-bound vs. compute-bound)
Memory latency (slower memory tolerates lower fault rates)
System stability requirements (mission-critical vs. best-effort)

Measuring Fault Rate

Fault rate can be measured as faults per unit time (faults/second) or per page reference (faults/reference). Time-based metrics are easier to compute but affected by process blocking; reference-based metrics require counting references, which has overhead. Modern systems typically use time-based metrics with corrections for process wait time.

PFF Algorithm Implementation

Implementing PFF requires infrastructure for monitoring fault rates, making allocation decisions, and coordinating with the page replacement system. Here's a comprehensive implementation approach.

pff_implementation.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
// Page Fault Frequency (PFF) Implementation
// Comprehensive implementation with threshold management
 
#include <stdint.h>
#include <stdbool.h>
 
// Configuration parameters
#define PFF_HIGH_THRESHOLD    15    // Faults per 1000 references (trigger increase)
#define PFF_LOW_THRESHOLD      3    // Faults per 1000 references (trigger decrease)
#define MEASUREMENT_WINDOW  1000    // References per measurement period
#define MIN_FRAMES_PER_PROC    5    // Minimum allocation (architecture dependent)
#define FRAMES_ADJUST_COUNT    2    // Frames to add/remove per adjustment
 
typedef enum {
    PFF_STARVED,      // Above high threshold
    PFF_ADEQUATE,     // Between thresholds
    PFF_EXCESS        // Below low threshold
} PFFStatus;
 
typedef struct {
    pid_t pid;
    
    // Fault tracking
    uint32_t page_faults;           // Faults in current window
    uint32_t page_references;       // References in current window
    uint32_t window_start_time;     // For time-based measurement
    
    // Allocation state
    uint32_t allocated_frames;
    uint32_t min_frames;
    uint32_t max_frames;
    
    // PFF status
    PFFStatus current_status;
    uint32_t consecutive_high;      // Consecutive high-threshold violations
    uint32_t consecutive_low;       // Consecutive low-threshold violations
} ProcessPFFState;
 
// Global state
ProcessPFFState process_states[MAX_PROCESSES];
uint32_t free_frame_count;
uint32_t total_frames;
 
// Calculate current page fault rate (per 1000 references)
uint32_t calculate_pff_rate(ProcessPFFState* state) {
    if (state->page_references == 0) {
        return 0;  // No data yet
    }
    
    // PFF rate in faults per 1000 references
    return (state->page_faults * 1000) / state->page_references;
}
 
// Called on each page fault
void pff_record_fault(ProcessPFFState* state) {
    state->page_faults++;
    state->page_references++;  // Fault is also a reference
    
    // Check if measurement window complete
    if (state->page_references >= MEASUREMENT_WINDOW) {
        pff_evaluate_and_adjust(state);
        pff_reset_window(state);
    }
}
 
// Called on each successful page access (no fault)
void pff_record_reference(ProcessPFFState* state) {
    state->page_references++;
    
    // Check if measurement window complete
    if (state->page_references >= MEASUREMENT_WINDOW) {
        pff_evaluate_and_adjust(state);
        pff_reset_window(state);
    }
}
 
// Core PFF decision logic
void pff_evaluate_and_adjust(ProcessPFFState* state) {
    uint32_t pff_rate = calculate_pff_rate(state);
    
    if (pff_rate > PFF_HIGH_THRESHOLD) {
        // Process is starved for memory
        state->current_status = PFF_STARVED;
        state->consecutive_high++;
        state->consecutive_low = 0;
        
        // Attempt to allocate more frames
        handle_high_fault_rate(state);
        
    } else if (pff_rate < PFF_LOW_THRESHOLD) {
        // Process has excess memory
        state->current_status = PFF_EXCESS;
        state->consecutive_low++;
        state->consecutive_high = 0;
        
        // Consider reclaiming frames
        handle_low_fault_rate(state);
        
    } else {
        // Process is in acceptable range
        state->current_status = PFF_ADEQUATE;
        state->consecutive_high = 0;
        state->consecutive_low = 0;
    }
    
    log_pff_metrics(state, pff_rate);
}
 
// Handle high fault rate - need more frames
void handle_high_fault_rate(ProcessPFFState* state) {
    uint32_t frames_needed = FRAMES_ADJUST_COUNT;
    
    // Escalate if persistently high
    if (state->consecutive_high > 3) {
        frames_needed *= 2;  // More aggressive allocation
    }
    
    // Check if frames available from free pool
    if (free_frame_count >= frames_needed) {
        allocate_frames_to_process(state, frames_needed);
        return;
    }
    
    // Try to reclaim from processes with excess frames
    uint32_t reclaimed = reclaim_excess_frames(frames_needed);
    if (reclaimed >= frames_needed) {
        allocate_frames_to_process(state, frames_needed);
        return;
    }
    
    // If still insufficient, consider suspension of low-priority process
    if (state->consecutive_high > 5 && is_system_under_memory_pressure()) {
        // This process is persistently starved
        // Either suspend a lower-priority process or this one
        handle_persistent_memory_pressure(state);
    }
}
 
// Handle low fault rate - reclaim excess frames
void handle_low_fault_rate(ProcessPFFState* state) {
    // Don't reclaim below minimum
    if (state->allocated_frames <= state->min_frames) {
        return;
    }
    
    // Only reclaim if consistently low (avoid oscillation)
    if (state->consecutive_low < 2) {
        return;  // Wait for confirmation
    }
    
    uint32_t frames_to_reclaim = min(FRAMES_ADJUST_COUNT, 
                                     state->allocated_frames - state->min_frames);
    
    // Select victim frames (LRU or similar)
    reclaim_frames_from_process(state, frames_to_reclaim);
    free_frame_count += frames_to_reclaim;
    state->allocated_frames -= frames_to_reclaim;
}
 
// Reset measurement window for next period
void pff_reset_window(ProcessPFFState* state) {
    state->page_faults = 0;
    state->page_references = 0;
    state->window_start_time = get_current_time();
}
 
// Try to reclaim frames from processes with low fault rates
uint32_t reclaim_excess_frames(uint32_t needed) {
    uint32_t reclaimed = 0;
    
    for (int i = 0; i < MAX_PROCESSES && reclaimed < needed; i++) {
        ProcessPFFState* state = &process_states[i];
        
        if (state->current_status == PFF_EXCESS && 
            state->allocated_frames > state->min_frames) {
            
            uint32_t available = state->allocated_frames - state->min_frames;
            uint32_t to_take = min(available, needed - reclaimed);
            
            reclaim_frames_from_process(state, to_take);
            state->allocated_frames -= to_take;
            reclaimed += to_take;
        }
    }
    
    return reclaimed;
}

Window Size Matters

The measurement window size (MEASUREMENT_WINDOW) balances responsiveness against stability. Smaller windows react quickly but may oscillate on temporary spikes. Larger windows provide smoother behavior but slower adaptation. Adaptive window sizing—expanding during stable periods and shrinking during transitions—offers the best of both worlds.

PFF vs. Working Set: A Comparative Analysis

Both PFF and working set approaches aim to prevent thrashing, but they differ fundamentally in philosophy and implementation. Understanding these differences helps system designers choose the appropriate approach.

PFF vs. Working Set Comparison
Aspect	Working Set Approach	Page Fault Frequency
Primary Metric	Pages referenced in window Δ	Fault rate per time/reference unit
Control Type	Predictive (based on reference history)	Reactive (based on observed performance)
Implementation Complexity	High (track reference times)	Low (count faults)
Memory Overhead	Significant (per-page timestamps)	Minimal (per-process counters)
Response Time	Immediate (pages age out)	Delayed (wait for measurement window)
Oscillation Risk	Low (stable window)	Moderate (threshold crossings)
Accuracy	Theoretically optimal	Practically sufficient
Adaptability	Excellent (tracks phase changes)	Good (responds to behavior changes)

When to Prefer Working Set

•Applications with predictable locality phases
•Systems where proactive allocation matters
•Environments with minimal page fault cost
•When detailed memory usage analysis is needed
•Research or educational contexts

When to Prefer PFF

•Production systems prioritizing simplicity
•Memory-constrained environments
•Workloads with unpredictable behavior
•When implementation overhead matters
•Systems with high page fault costs

A Deeper Analysis:

Predictive vs. Reactive:

The working set approach is fundamentally predictive—it uses past reference behavior to predict future needs. This works well for processes with stable locality but can lag during phase transitions.

PFF is reactive—it responds to actual performance degradation. This means processes may experience temporary degradation before correction, but the approach never allocates based on incorrect predictions.

Convergence Behavior:

Both approaches converge to similar steady-state allocations. The difference lies in transient behavior:

Working set: May temporarily over-allocate during phase transitions as old pages age out
PFF: May temporarily under-allocate until fault rate rises and triggers allocation

Hybrid Approaches:

Many production systems combine elements of both: use reference bits for page replacement decisions (working set influenced) while using fault rate for allocation decisions (PFF). This captures the strengths of each approach.

The Practical Reality

In practice, most modern systems don't implement pure versions of either approach. Instead, they use pragmatic combinations: Linux's active/inactive lists approximate working sets; Windows' working set manager uses fault rate monitoring. The theoretical distinction matters less than the practical outcome—preventing thrashing while maintaining efficiency.

Advanced PFF Techniques

Basic PFF provides a solid foundation, but advanced techniques enhance responsiveness, reduce oscillation, and improve overall system behavior.

1. Adaptive Thresholds:

Fixed thresholds may not suit all workloads or system states. Adaptive threshold adjustment responds to conditions:

adaptive_pff.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
// Adaptive PFF Thresholds
// Thresholds adjust based on system state and workload
 
typedef struct {
    // Base thresholds (starting point)
    float base_high_threshold;
    float base_low_threshold;
    
    // Current adjusted thresholds
    float current_high_threshold;
    float current_low_threshold;
    
    // System state factors
    float memory_pressure_factor;   // 1.0 = normal, >1 = pressure
    float process_priority_factor;  // Higher priority = tighter thresholds
} AdaptivePFFConfig;
 
// Adjust thresholds based on system state
void adapt_thresholds(AdaptivePFFConfig* config, SystemState* state) {
    // Under memory pressure, be more aggressive about reclamation
    if (state->free_frames_percent < 10) {
        config->memory_pressure_factor = 1.5;  // Raise low threshold
    } else if (state->free_frames_percent > 30) {
        config->memory_pressure_factor = 0.8;  // Lower low threshold
    } else {
        config->memory_pressure_factor = 1.0;
    }
    
    // Adjust for overall system fault rate
    float system_fault_rate = calculate_system_fault_rate();
    if (system_fault_rate > SYSTEM_FAULT_RATE_HIGH) {
        // System-wide issue - tighten all thresholds
        config->current_high_threshold = config->base_high_threshold * 0.8;
        config->current_low_threshold = config->base_low_threshold * 1.2;
    } else {
        config->current_high_threshold = config->base_high_threshold;
        config->current_low_threshold = config->base_low_threshold * 
                                        config->memory_pressure_factor;
    }
}
 
// Per-process threshold adjustment based on priority
void adjust_process_thresholds(ProcessPFFState* state, int priority) {
    float priority_multiplier = 1.0;
    
    switch (priority) {
        case PRIORITY_REALTIME:
            priority_multiplier = 0.5;  // Very tight thresholds - respond fast
            break;
        case PRIORITY_HIGH:
            priority_multiplier = 0.75;
            break;
        case PRIORITY_NORMAL:
            priority_multiplier = 1.0;
            break;
        case PRIORITY_LOW:
            priority_multiplier = 1.5;  // Looser thresholds - tolerate more faults
            break;
        case PRIORITY_IDLE:
            priority_multiplier = 2.0;
            break;
    }
    
    state->high_threshold = base_high_threshold * priority_multiplier;
    state->low_threshold = base_low_threshold * priority_multiplier;
}

2. Weighted Moving Average Fault Rate:

Raw fault rate measurements are noisy. Smoothing via weighted moving average reduces oscillation:

smoothed_rate = α × current_rate + (1-α) × previous_smoothed_rate

Where α (typically 0.1-0.3) controls responsiveness vs. stability.

3. Inter-Fault Time Analysis:

Rather than counting faults per window, measure the time between consecutive faults:

Long inter-fault times → Process has adequate memory
Short inter-fault times → Process is struggling
Decreasing inter-fault times → Workload intensifying

This provides instant feedback rather than waiting for window completion.

4. Predictive Fault Rate:

Combine PFF with limited prediction:

Track fault rate trends (increasing, stable, decreasing)
Anticipate threshold violations before they occur
Pre-allocate frames when trend indicates impending starvation

5. Phase Detection:

Identify distinct program phases with different memory demands:

Sudden fault rate spikes may indicate phase transition
Historical phase data can inform allocation decisions
Phase-aware allocation eliminates transition lag

Avoiding Over-Engineering

Each advanced technique adds complexity. The basic PFF algorithm handles most scenarios well. Add sophistication incrementally, measuring improvement at each step. A simple, well-tuned PFF often outperforms a complex but poorly-tuned advanced variant.

System-Wide PFF Coordination

Individual process PFF tracking must be coordinated at the system level to detect global thrashing and make coherent allocation decisions. This requires centralized monitoring and arbitration.

Global Thrashing Detection:

Thrashing manifests differently at system level than process level:

Single Process Thrashing: One process has high fault rate while others are normal. Solution: Give frames to that process (or suspend it).
System-Wide Thrashing: Many processes simultaneously have high fault rates. Solution: Reduce multiprogramming level (suspend processes).
Cascading Thrashing: One process's frame acquisition raises others' fault rates. Solution: Careful coordination of allocation decisions.

system_pff_coordinator.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
// System-Wide PFF Coordination
// Coordinates PFF decisions across all processes
 
typedef struct {
    uint32_t total_processes;
    uint32_t processes_above_high_threshold;
    uint32_t processes_below_low_threshold;
    float average_fault_rate;
    float system_fault_rate;  // Total faults / second
    uint32_t free_frames;
    uint32_t total_frames;
} SystemPFFState;
 
typedef enum {
    SYSTEM_HEALTHY,           // Most processes have adequate memory
    SYSTEM_PRESSURE,          // Some processes struggling
    SYSTEM_THRASHING,         // Widespread memory starvation
    SYSTEM_OVERPROVISIONED    // Significant unused capacity
} SystemMemoryState;
 
// Evaluate system-wide memory state
SystemMemoryState evaluate_system_state(SystemPFFState* state) {
    float starved_ratio = (float)state->processes_above_high_threshold / 
                          state->total_processes;
    float excess_ratio = (float)state->processes_below_low_threshold /
                         state->total_processes;
    
    // Check for system-wide thrashing
    if (starved_ratio > 0.5 && state->free_frames < CRITICAL_FREE_THRESHOLD) {
        return SYSTEM_THRASHING;
    }
    
    // Check for significant pressure
    if (starved_ratio > 0.2) {
        return SYSTEM_PRESSURE;
    }
    
    // Check for over-provisioning
    if (excess_ratio > 0.5 && state->free_frames > EXCESS_FREE_THRESHOLD) {
        return SYSTEM_OVERPROVISIONED;
    }
    
    return SYSTEM_HEALTHY;
}
 
// Handle different system states
void handle_system_state(SystemMemoryState state) {
    switch (state) {
        case SYSTEM_THRASHING:
            // Emergency action: suspend lowest priority processes
            reduce_multiprogramming_level();
            adjust_all_thresholds_emergency();
            break;
            
        case SYSTEM_PRESSURE:
            // Moderate action: reclaim from over-provisioned, suspend if needed
            aggressive_frame_reclamation();
            if (still_under_pressure()) {
                suspend_lowest_priority_process();
            }
            break;
            
        case SYSTEM_HEALTHY:
            // Normal operation: individual PFF decisions sufficient
            break;
            
        case SYSTEM_OVERPROVISIONED:
            // Consider resuming suspended processes
            consider_process_resumption();
            break;
    }
}
 
// Reduce multiprogramming to stop thrashing
void reduce_multiprogramming_level(void) {
    ProcessPFFState* candidates[MAX_PROCESSES];
    int candidate_count = 0;
    
    // Find processes to suspend
    for (int i = 0; i < process_count; i++) {
        ProcessPFFState* proc = &process_states[i];
        if (proc->current_status == PFF_STARVED && 
            proc->priority < PRIORITY_HIGH) {
            candidates[candidate_count++] = proc;
        }
    }
    
    // Sort by priority (lowest first) then by WSS (largest first)
    sort_suspension_candidates(candidates, candidate_count);
    
    // Suspend until system stabilizes
    int suspended = 0;
    while (is_system_thrashing() && suspended < candidate_count) {
        suspend_process(candidates[suspended]);
        suspended++;
        
        // Wait briefly for system to stabilize
        schedule_delayed_reevaluation(100);  // 100ms
    }
    
    log_suspension_event(suspended);
}
 
// Coordinate allocation requests
bool coordinate_frame_request(ProcessPFFState* requestor, int frames_needed) {
    // Check global state first
    SystemPFFState* system = get_system_state();
    
    if (system->free_frames >= frames_needed) {
        // Easy case: frames available
        return allocate_from_free_pool(requestor, frames_needed);
    }
    
    // Need to reclaim from other processes
    if (system_state != SYSTEM_THRASHING) {
        int reclaimed = reclaim_from_excess_processes(frames_needed);
        if (reclaimed >= frames_needed) {
            return allocate_reclaimed_frames(requestor, frames_needed);
        }
    }
    
    // Cannot satisfy request without hurting others' working sets
    if (requestor->priority >= PRIORITY_HIGH) {
        // High priority: force reclamation even from adequate processes
        return force_allocation(requestor, frames_needed);
    }
    
    // Low priority: request denied, may need suspension
    return false;
}

Coordination Overhead

System-wide coordination requires synchronization, which has overhead. Balance the frequency of global state evaluation against the cost. Periodic evaluation (e.g., every 100-500ms) usually suffices; only escalate to continuous monitoring when system pressure is detected.

PFF in Production Operating Systems

Production operating systems use PFF concepts in various forms. Understanding these implementations provides insight into practical application of the theory.

Windows Working Set Manager:

Windows doesn't use pure PFF but incorporates fault rate as a key metric:

Working Set Trimming: Triggered when system memory is low and process fault rates indicate excess allocation
Working Set Manager: Monitors fault rates to determine when to trim and when to allow growth
Standby List: Trimmed pages go to a standby list; if faulted back quickly (high fault rate), working set grows

Linux Memory Management:

Linux approaches this differently:

No per-process fault rate tracking: Linux focuses on global memory pressure
kswapd: Background thread activated by memory pressure, not individual fault rates
Memory Pressure Notifications: Applications can receive notifications and self-adjust
cgroups: Memory limits can be set, with various enforcement policies

FreeBSD:

FreeBSD's VM system uses fault rate implicitly:

Inactive/Active Queues: Pages move between queues based on access (fault-driven)
Page Daemon (pagedaemon): Activated by free page deficit, reclaims based on access patterns
Rate-Limited Scanning: Avoids excessive CPU use by limiting scan rate

PFF-Like Mechanisms in Production Systems
System	Mechanism	Fault Rate Role	Notable Features
Windows	Balance Set Manager	Direct—drives WS adjustment	Per-process limits, soft/hard faults
Linux	kswapd + reclaim	Indirect—triggers reclaim pressure	Memory cgroups, OOM killer
macOS	Compressed Memory	Indirect—compression precedes swap	Application termination, App Nap
FreeBSD	pagedaemon	Indirect—inactive queue eviction	Rate-limited scanning
Solaris	Scanner thread	Direct—configurable scan rate	Lotsfree/minfree thresholds

Real-World Validation

The enduring use of PFF concepts across major operating systems validates the approach. While implementations vary, the core insight—that fault rate indicates allocation adequacy and should drive adjustment decisions—proves consistently valuable. Modern systems may not call it 'PFF' but the principle permeates memory management.

Summary: Page Fault Frequency

Page Fault Frequency provides a practical, direct approach to thrashing prevention through continuous monitoring and feedback-driven allocation adjustment. Let's consolidate our understanding:

Core Principles of PFF

•Fault Rate as Adequacy Indicator — High fault rate means insufficient frames; low rate means excess frames. Monitor and adjust accordingly.
•Dual Threshold System — Upper threshold (T_high) triggers allocation increase; lower threshold (T_low) permits reclamation. Gap prevents oscillation.
•Reactive Control — PFF responds to observed behavior rather than predicting needs. This simplifies implementation and avoids misprediction errors.
•Implementation Simplicity — Counting faults is far easier than tracking reference history. PFF adds minimal overhead to page fault handling.
•System-Wide Coordination — Individual PFF decisions must be coordinated at system level to detect global thrashing and manage multiprogramming.
•Threshold Tuning — Threshold values critically affect behavior. Too tight causes oscillation; too loose delays correction. Tune based on workload.
•Production Prevalence — PFF concepts appear throughout modern OS memory management, even when not explicitly named.

Looking Ahead:

Working set and PFF approaches address thrashing through intelligent memory allocation. But sometimes the fundamental problem is too many processes for available memory. The next page explores Load Control—mechanisms for controlling which processes are active to maintain system stability, including admission control, medium-term scheduling, and process prioritization.

Section Complete

You now understand Page Fault Frequency as an alternative approach to thrashing prevention. The key insight is that observing actual performance (fault rate) and using feedback control (threshold-triggered adjustment) provides a simpler, more direct path to the same goal as working set tracking—ensuring each process has adequate memory for acceptable performance.