Consider a highway during rush hour. Individual drivers can optimize their routes and speeds, but if too many cars enter the highway, congestion becomes inevitable no matter how well each driver performs. The solution isn't better individual driving—it's controlling how many cars enter.
Load control applies this same principle to memory management. No matter how sophisticated the page replacement algorithm or how clever the working set approximation, if the total memory demand of active processes exceeds available physical memory, thrashing will occur. Load control addresses this fundamental constraint by managing the multiprogramming level—deciding which processes are allowed to compete for memory at any given time.
By the end of this page, you will understand the concept of multiprogramming level and its relationship to thrashing, admission control mechanisms that prevent overcommitment, medium-term scheduling strategies for balancing load, the criteria for selecting processes to deactivate, and how modern systems implement load control in various forms.
The multiprogramming level (also called the degree of multiprogramming) is the number of processes actively competing for system resources. This seemingly simple metric has profound implications for system performance.
The Fundamental Tension:
Too few active processes leaves the CPU idle during I/O waits; too many triggers thrashing. The challenge is that the optimal level is dynamic—it depends on the current processes' memory requirements, which change over time.
| Multiprogramming Level | CPU Utilization | Memory Pressure | Page Fault Rate | Throughput |
|---|---|---|---|---|
| Very Low (1-2) | Low (idle during I/O) | None | Minimal | Poor |
| Low (3-5) | Moderate | Low | Low | Moderate |
| Optimal | High (80-95%) | Moderate | Acceptable | Maximum |
| High | Declining | High | Elevated | Declining |
| Excessive | Very Low (paging overhead) | Severe | Extreme | Near Zero |
The Thrashing Threshold:
There exists a critical multiprogramming level beyond which performance degrades catastrophically. This threshold is not fixed: it depends on the working set sizes of the active processes and the physical memory available, both of which change as the workload evolves.
The Key Insight:
The goal of load control is to keep the multiprogramming level at or below the threshold that triggers thrashing. It's better to run N processes efficiently than N+K processes in a thrashing state.
Performance degradation when crossing the thrashing threshold is not gradual—it's a cliff. Adding one too many processes can transform a system running at 90% CPU utilization into one running at 10%. Load control is about staying safely away from this cliff.
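The cliff can be illustrated with a toy model (all constants hypothetical): each process waits on I/O a fixed fraction of its time, so without memory pressure, CPU utilization for n processes is 1 − p^n; once the combined working sets exceed physical memory, an ad hoc paging penalty takes over.

```c
// Toy model of the thrashing cliff. WSS_FRAMES, IO_WAIT, and the penalty
// shape are illustrative assumptions, not measurements from a real system.
#include <stdint.h>

#define TOTAL_FRAMES 1000
#define WSS_FRAMES   150     // working set per process (assumed)
#define IO_WAIT      0.6     // fraction of time each process waits on I/O

static double pow_n(double p, int n) {
    double r = 1.0;
    for (int i = 0; i < n; i++) r *= p;
    return r;
}

// CPU utilization ignoring memory: 1 - p^n
double utilization_no_paging(int n) {
    return 1.0 - pow_n(IO_WAIT, n);
}

// Utilization with a paging penalty once demand exceeds physical memory
double utilization_with_paging(int n) {
    double base = utilization_no_paging(n);
    long demand = (long)n * WSS_FRAMES;
    if (demand <= TOTAL_FRAMES)
        return base;
    // Overcommitment drives fault-service overhead up steeply (ad hoc shape)
    double over = (double)demand / TOTAL_FRAMES;
    return base / (1.0 + 50.0 * (over - 1.0) * (over - 1.0) * over);
}
```

With these constants, utilization climbs above 95% at six processes (demand 900 of 1000 frames), stays above 85% at seven, and collapses below 30% at eight: a single process past the threshold does most of the damage.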
Admission control is the proactive component of load control. Rather than waiting for thrashing to occur and then reacting, admission control prevents overcommitment before it happens by gating entry to the set of active processes.
Admission Control Decision:
When a new process requests activation (becoming runnable), the admission controller evaluates:
Can this process be accommodated without causing total working set demand to exceed available memory?
If yes, the process is admitted. If no, the process waits.
```c
// Admission Control Implementation
// Prevents overcommitment by gating process activation

#include <stdbool.h>

typedef struct {
    pid_t pid;
    uint32_t estimated_wss;     // Estimated working set size
    uint32_t priority;
    bool is_active;             // In active memory competition
    uint32_t time_waiting;      // Time spent waiting for admission
} ProcessAdmissionState;

typedef struct {
    uint32_t total_frames;
    uint32_t reserved_frames;   // Reserved for OS, buffers, etc.
    uint32_t available_frames;  // For user processes
    uint32_t committed_frames;  // Sum of active processes' WSS estimates
    uint32_t headroom_percent;  // Safety margin (e.g., 10%)
} SystemAdmissionState;

// Calculate available capacity for new processes
uint32_t calculate_available_capacity(SystemAdmissionState* state) {
    // Available = Total - Reserved - Committed - Headroom
    uint32_t headroom = (state->total_frames * state->headroom_percent) / 100;
    uint32_t usable = state->total_frames - state->reserved_frames - headroom;
    if (usable <= state->committed_frames) {
        return 0;  // No capacity available
    }
    return usable - state->committed_frames;
}

// Main admission decision
typedef enum {
    ADMIT_GRANTED,          // Process can be admitted
    ADMIT_DEFERRED,         // Wait until capacity available
    ADMIT_DENIED_PRIORITY,  // Lower priority must wait for higher
    ADMIT_DENIED_SIZE       // WSS too large for system
} AdmissionResult;

AdmissionResult evaluate_admission(ProcessAdmissionState* candidate) {
    SystemAdmissionState* system = get_system_state();
    uint32_t available = calculate_available_capacity(system);

    // Check if process could ever fit
    if (candidate->estimated_wss > system->available_frames) {
        return ADMIT_DENIED_SIZE;  // Process can never fit
    }

    // Check if capacity available now
    if (available >= candidate->estimated_wss) {
        return ADMIT_GRANTED;
    }

    // Check if preemption possible for high-priority process
    if (candidate->priority >= PRIORITY_HIGH) {
        uint32_t reclaimable = estimate_reclaimable_from_low_priority();
        if (available + reclaimable >= candidate->estimated_wss) {
            // Can make room by deactivating lower priority
            return evaluate_preemption(candidate);
        }
    }

    return ADMIT_DEFERRED;
}

// Admit a process after approval
void admit_process(ProcessAdmissionState* proc) {
    SystemAdmissionState* system = get_system_state();
    proc->is_active = true;
    system->committed_frames += proc->estimated_wss;

    // Move process to ready queue
    move_to_ready_queue(proc->pid);
    log_admission(proc);
}

// Handle waiting processes when capacity becomes available
void check_deferred_admissions(void) {
    // Sort waiting processes by priority, then by wait time
    ProcessAdmissionState* waiting = get_waiting_processes();
    sort_by_priority_then_wait_time(waiting);

    for (int i = 0; i < waiting_count; i++) {
        AdmissionResult result = evaluate_admission(&waiting[i]);
        if (result == ADMIT_GRANTED) {
            admit_process(&waiting[i]);
        } else {
            // No more capacity for remaining processes
            break;
        }
    }
}
```

Estimating Working Set Size for New Processes:
Admission control requires an estimate of a process's working set size before it runs. Common techniques include reusing measurements from the process's previous executions, accepting hints declared by the application or administrator, and starting from a conservative default that is refined once the process begins executing.
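The refine-as-it-runs approach can be sketched with an exponentially weighted moving average; the names, default, and weight below are illustrative assumptions:

```c
// Sketch: refining a working-set-size estimate as a process runs.
// Starts from a conservative default, replaces it with the first observed
// sample, then blends later samples in with an EWMA. Constants hypothetical.
#include <stdint.h>

#define DEFAULT_WSS_FRAMES 256   // conservative initial guess
#define EWMA_WEIGHT_NEW    25    // percent weight given to the newest sample

typedef struct {
    uint32_t wss_estimate;   // current estimate, in frames
    int      has_history;    // have we observed this process yet?
} WssEstimator;

void wss_init(WssEstimator* e) {
    e->wss_estimate = DEFAULT_WSS_FRAMES;
    e->has_history = 0;
}

// Called each sampling interval with the frames the process actually touched
void wss_observe(WssEstimator* e, uint32_t observed_frames) {
    if (!e->has_history) {
        // First real sample replaces the default outright
        e->wss_estimate = observed_frames;
        e->has_history = 1;
        return;
    }
    // estimate = (1 - w) * estimate + w * observed, in integer percent
    e->wss_estimate = (e->wss_estimate * (100 - EWMA_WEIGHT_NEW)
                       + observed_frames * EWMA_WEIGHT_NEW) / 100;
}
```

The EWMA smooths transient spikes while still tracking genuine phase changes in the process's locality.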
Priority-Based Admission:
Not all processes are equal. Priority-based admission ensures that high-priority processes obtain memory first: as the evaluate_admission logic above shows, a high-priority arrival can trigger preemption of lower-priority active processes to make room.
Admission control doesn't prevent process creation—processes can exist but remain inactive (swapped out, not in ready queue). The control is over which created processes are allowed to actively compete for execution time and memory. This allows the system to maintain a pool of processes without overcommitting resources.
While admission control gates new processes, medium-term scheduling (MTS) manages the multiprogramming level among already-admitted processes. MTS makes decisions about swapping—which processes should be temporarily moved out of active memory competition.
The Three Levels of Scheduling:
| Level | Also Called | Time Scale | Decision |
|---|---|---|---|
| Long-term | Job scheduler | Minutes-hours | Which jobs to admit |
| Medium-term | Swapper | Seconds-minutes | Which processes to swap |
| Short-term | CPU scheduler | Milliseconds | Which ready process to run |
Medium-term scheduling bridges long-term job admission and short-term CPU allocation, adjusting the active process set in response to system conditions.
Swap-Out Decisions:
MTS decreases the multiprogramming level by selecting processes to swap out (deactivate). Criteria include process priority, whether the process is currently blocked, the size of its resident memory footprint, how long it has been idle, and whether it is interactive.
Swap-In Decisions:
MTS increases the multiprogramming level by selecting processes to swap in (reactivate). Considerations include priority, how long the process has waited (starvation prevention), whether an event it was waiting for has completed, and whether it is interactive.
```c
// Medium-Term Scheduler (Swapper)
// Manages the set of processes actively in memory competition

typedef enum {
    SWAP_STATE_ACTIVE,        // In memory, competing for CPU
    SWAP_STATE_SWAPPING_OUT,  // Being written to swap
    SWAP_STATE_SWAPPED,       // Entirely in swap, not competing
    SWAP_STATE_SWAPPING_IN    // Being read from swap
} SwapState;

typedef struct {
    pid_t pid;
    SwapState swap_state;
    uint32_t priority;
    uint32_t memory_resident;  // Frames currently in memory
    uint32_t wss_estimate;
    uint64_t last_run_time;
    uint64_t swap_out_time;    // When swapped out (for starvation prevention)
    bool is_blocked;           // Waiting for event
    bool is_interactive;       // Needs quick response
} MediumTermProcState;

// Call periodically or when memory pressure detected
void medium_term_schedule(void) {
    SystemMemoryState mem_state = get_memory_state();

    if (mem_state.free_frames < SWAP_OUT_THRESHOLD) {
        // Memory pressure - need to swap out
        select_and_swap_out();
    }

    if (mem_state.free_frames > SWAP_IN_THRESHOLD) {
        // Capacity available - can swap in
        select_and_swap_in();
    }

    // Check for starvation of swapped processes
    check_starvation();
}

// Select process to swap out
MediumTermProcState* select_swap_out_victim(void) {
    MediumTermProcState* best_victim = NULL;
    int best_score = -1;

    for (int i = 0; i < process_count; i++) {
        MediumTermProcState* proc = &processes[i];

        if (proc->swap_state != SWAP_STATE_ACTIVE) continue;
        if (proc->is_interactive && system_load < HIGH) continue;

        int score = calculate_swapout_score(proc);
        if (score > best_score) {
            best_score = score;
            best_victim = proc;
        }
    }
    return best_victim;
}

// Score for swap-out (higher = more likely to swap out)
int calculate_swapout_score(MediumTermProcState* proc) {
    int score = 0;

    // Lower priority increases swap-out likelihood
    score += (MAX_PRIORITY - proc->priority) * 100;

    // Blocked processes are good candidates
    if (proc->is_blocked) {
        score += 300;
    }

    // Larger memory footprint means more benefit from swapping
    score += proc->memory_resident / 10;

    // Longer idle time increases likelihood
    uint64_t idle_time = get_current_time() - proc->last_run_time;
    score += min(idle_time / 1000, 200);  // Cap contribution

    // Interactive processes protected
    if (proc->is_interactive) {
        score -= 500;
    }
    return score;
}

// Select process to swap in
MediumTermProcState* select_swap_in_candidate(uint32_t available_frames) {
    MediumTermProcState* best_candidate = NULL;
    int best_score = -1;

    for (int i = 0; i < process_count; i++) {
        MediumTermProcState* proc = &processes[i];

        if (proc->swap_state != SWAP_STATE_SWAPPED) continue;
        if (proc->wss_estimate > available_frames) continue;  // Won't fit

        int score = calculate_swapin_score(proc);
        if (score > best_score) {
            best_score = score;
            best_candidate = proc;
        }
    }
    return best_candidate;
}

// Score for swap-in (higher = more likely to swap in)
int calculate_swapin_score(MediumTermProcState* proc) {
    int score = 0;

    // Higher priority increases swap-in likelihood
    score += proc->priority * 100;

    // Longer wait time increases priority (starvation prevention)
    uint64_t wait_time = get_current_time() - proc->swap_out_time;
    score += min(wait_time / 500, 400);  // Strong contribution

    // Event completion makes swap-in urgent
    if (event_completed_for_process(proc->pid)) {
        score += 500;
    }

    // Interactive processes need quick return
    if (proc->is_interactive) {
        score += 300;
    }
    return score;
}

// Prevent indefinite swapping
void check_starvation(void) {
    for (int i = 0; i < process_count; i++) {
        MediumTermProcState* proc = &processes[i];
        if (proc->swap_state == SWAP_STATE_SWAPPED) {
            uint64_t wait_time = get_current_time() - proc->swap_out_time;
            if (wait_time > MAX_SWAP_WAIT_TIME) {
                // Force swap-in to prevent starvation
                // May require swapping out another process
                force_swap_in(proc);
            }
        }
    }
}
```

Medium-term scheduling must be careful not to thrash the swap device itself. Swapping processes in and out too frequently causes high I/O load and gains nothing. Hysteresis in thresholds (different triggers for swap-in vs. swap-out) and minimum residence times help prevent swap thrashing.
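A minimal sketch of such hysteresis, with hypothetical watermarks and residence time:

```c
// Hysteresis for the swapper: swap-out triggers at a low free-frame
// watermark, swap-in only resumes above a distinctly higher one, and a
// freshly admitted process must stay resident a minimum time before it is
// eligible for eviction again. All thresholds are illustrative.
#include <stdint.h>
#include <stdbool.h>

#define SWAP_OUT_WATERMARK  100   // free frames: below this, shed load
#define SWAP_IN_WATERMARK   300   // free frames: above this, re-admit
#define MIN_RESIDENCE_TICKS 500   // minimum time resident before eviction

typedef struct {
    uint64_t last_swap_in_tick;   // when the process was last brought in
} SwapHistory;

// The gap between the two watermarks is a dead band: in it, neither
// swap-out nor swap-in fires, so the system cannot oscillate at one line.
bool should_swap_out(uint32_t free_frames) {
    return free_frames < SWAP_OUT_WATERMARK;
}

bool may_swap_in(uint32_t free_frames) {
    return free_frames > SWAP_IN_WATERMARK;
}

// A freshly admitted process is protected from immediate re-eviction
bool eligible_for_swap_out(const SwapHistory* h, uint64_t now) {
    return now - h->last_swap_in_tick >= MIN_RESIDENCE_TICKS;
}
```

Between 100 and 300 free frames neither predicate fires, so a process swapped in at one threshold is not immediately swapped back out at the other.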
When load control determines that the multiprogramming level must decrease, the critical question becomes: which processes should be deactivated? This decision has significant fairness and performance implications.
Deactivation Methods:
| Criterion | Rationale | Advantage | Disadvantage |
|---|---|---|---|
| Lowest Priority | Protect important work | Clear policy, predictable | Low-priority starvation |
| Largest Memory User | Maximum memory freed | Efficient resource recovery | Penalizes legitimate big processes |
| Most Recent Arrival | Seniority protection | Fairness to older processes | Hurts new processes unfairly |
| Longest Blocked | Already not running | Minimal immediate impact | May not free enough memory |
| Poorest Locality | Most likely thrashing | Removes problem process | Requires tracking, may be temporary |
| Least CPU Used | Least productive recently | Protect active processes | Punishes I/O-bound unfairly |
Composite Scoring:
Production systems typically use weighted combinations of criteria:
```
deactivation_score = w1 × priority_factor +
                     w2 × memory_factor +
                     w3 × blocked_duration +
                     w4 × locality_quality +
                     w5 × arrival_time
```
Weights are tuned to match system goals, such as protecting interactive responsiveness versus maximizing batch throughput.
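A sketch of how such a composite score might be computed; the factor definitions, types, and weight scales are assumptions for illustration, not a production policy:

```c
// Composite deactivation score: higher score = better eviction candidate.
// Field names and weighting scheme are hypothetical.
#include <stdint.h>

typedef struct {
    uint32_t priority;          // higher value = more important
    uint32_t resident_frames;   // memory factor
    uint32_t blocked_ticks;     // time spent blocked on an event
    uint32_t fault_rate;        // recent faults/sec, proxy for poor locality
    uint64_t arrival_tick;      // when the process was admitted
} DeactivationInputs;

typedef struct { long w1, w2, w3, w4, w5; } Weights;

long deactivation_score(const DeactivationInputs* p, const Weights* w,
                        uint32_t max_priority, uint64_t now) {
    long score = 0;
    score += w->w1 * (long)(max_priority - p->priority); // low priority first
    score += w->w2 * (long)p->resident_frames;           // frees more memory
    score += w->w3 * (long)p->blocked_ticks;             // already not running
    score += w->w4 * (long)p->fault_rate;                // poor locality
    // Seniority protection: the older the process (larger age), the lower
    // its score, so recent arrivals are deactivated first
    score -= w->w5 * (long)(now - p->arrival_tick);
    return score;
}
```

With this shape, tuning reduces to choosing w1 through w5; setting a weight to zero disables that criterion entirely.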
Protecting System-Critical Processes:
Some processes should never be deactivated: kernel threads, the init process, and processes holding resources that other processes depend on.
Care must be taken when deactivating processes based on priority. If a high-priority process is waiting for a low-priority process (e.g., for a lock or message), deactivating the low-priority process can delay the high-priority process indefinitely. The scheduler must consider dependencies, not just isolated priorities.
Load control must be triggered at the right time—early enough to prevent thrashing but not so aggressively as to unnecessarily limit concurrency. Several metrics help detect impending or actual overload.
Primary Indicators:
Useful signals include free memory, the system-wide page fault rate, the page-out and scan rates, CPU utilization, I/O wait, the number of runnable processes, and the swap device's queue depth.
```c
// Overload Detection System
// Multi-metric detection for robust overload identification

typedef struct {
    uint32_t free_frames;
    uint32_t total_frames;
    float system_page_fault_rate;  // Faults per second, system-wide
    float page_out_rate;           // Pages written to swap per second
    float scan_rate;               // Pages scanned per second
    float cpu_utilization;
    float io_wait_percent;
    uint32_t runnable_processes;
    uint32_t swap_io_queue_depth;
} SystemLoadMetrics;

typedef enum {
    LOAD_LIGHT,       // Can consider admitting more
    LOAD_MODERATE,    // System operating well
    LOAD_HEAVY,       // Near capacity, be careful
    LOAD_OVERLOADED,  // Thrashing imminent or occurring
    LOAD_CRITICAL     // Severe thrashing, emergency action
} LoadLevel;

// Evaluate current load level
LoadLevel evaluate_load(SystemLoadMetrics* metrics) {
    int pressure_indicators = 0;
    int critical_indicators = 0;

    // Check free memory
    float free_percent = (float)metrics->free_frames / metrics->total_frames * 100;
    if (free_percent < 5) {
        critical_indicators++;
    } else if (free_percent < 10) {
        pressure_indicators++;
    }

    // Check fault rate
    if (metrics->system_page_fault_rate > CRITICAL_FAULT_RATE) {
        critical_indicators++;
    } else if (metrics->system_page_fault_rate > HIGH_FAULT_RATE) {
        pressure_indicators++;
    }

    // Check for CPU utilization paradox (thrashing signature)
    if (metrics->cpu_utilization < 50 &&
        metrics->runnable_processes > 5 &&
        metrics->io_wait_percent > 30) {
        // Classic thrashing: many processes want CPU but CPU is idle (waiting for I/O)
        critical_indicators++;
    }

    // Check page-out rate
    if (metrics->page_out_rate > CRITICAL_PAGEOUT_RATE) {
        pressure_indicators++;
    }

    // Check scan rate (desperation scanning)
    if (metrics->scan_rate > DESPERATE_SCAN_RATE) {
        pressure_indicators++;
    }

    // Composite decision
    if (critical_indicators >= 2) {
        return LOAD_CRITICAL;
    } else if (critical_indicators >= 1 || pressure_indicators >= 3) {
        return LOAD_OVERLOADED;
    } else if (pressure_indicators >= 2) {
        return LOAD_HEAVY;
    } else if (pressure_indicators >= 1) {
        return LOAD_MODERATE;
    } else {
        return LOAD_LIGHT;
    }
}

// Take action based on load level
void respond_to_load_level(LoadLevel level) {
    switch (level) {
        case LOAD_CRITICAL:
            // Immediate action: suspend multiple processes
            emergency_load_reduction();
            disable_new_admissions();
            break;
        case LOAD_OVERLOADED:
            // Strong action: suspend one process, restrict admissions
            suspend_lowest_priority_process();
            restrict_admissions();
            break;
        case LOAD_HEAVY:
            // Moderate action: reclaim frames aggressively, careful admissions
            aggressive_frame_reclamation();
            careful_admissions();
            break;
        case LOAD_MODERATE:
            // Normal operation
            normal_admissions();
            break;
        case LOAD_LIGHT:
            // Consider bringing in swapped processes
            consider_resumptions();
            permissive_admissions();
            break;
    }
}

// Trend detection for proactive intervention
typedef struct {
    float fault_rate_history[10];
    int history_index;
    float trend_slope;
} TrendDetector;

void update_trend(TrendDetector* td, float current_rate) {
    td->fault_rate_history[td->history_index] = current_rate;
    td->history_index = (td->history_index + 1) % 10;

    // Simple linear regression for trend
    td->trend_slope = calculate_slope(td->fault_rate_history, 10);

    // Proactive intervention if trend is strongly upward
    if (td->trend_slope > WORRYING_TREND_THRESHOLD) {
        // Fault rate rising fast - intervene before overload
        preemptive_load_reduction();
    }
}
```

Monitoring trends is often more valuable than monitoring absolute values. A system at 90% memory utilization with stable metrics is fine; one at 70% with rapidly rising fault rates is about to crash. Derivative (rate of change) signals enable proactive intervention before crisis.
When overload is detected, the system must shed load—reduce the number of active processes until the remaining processes can execute effectively. Several strategies exist, ranging from gentle to aggressive.
Progressive Load Shedding:
The most effective approach escalates through increasingly aggressive levels: restrict new admissions, reclaim pages aggressively, suspend a single process, suspend multiple processes, suspend all non-critical processes, and finally terminate a process.
The OOM Killer: Last Resort:
When all other measures fail, the system must terminate processes to survive. Linux's OOM (out-of-memory) killer is the classic example.
OOM Score Calculation (Linux):
oom_score = memory_percentage × adjustment_factors
Where:
memory_percentage = process RSS / total RAM

OOM Killer Selection:
The OOM killer is brutal but necessary—it's better to lose one process than to have all processes unusable indefinitely.
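The formula above can be sketched as follows; the 0-1000 scaling and clamping mirror the spirit of Linux's oom_score_adj interface, but this is a simplification, not the kernel's exact heuristic:

```c
// Simplified OOM score: memory share scaled to 0..1000, shifted by a
// per-process adjustment in the style of oom_score_adj (-1000 .. +1000).
#include <stdint.h>

long oom_score(uint64_t rss_pages, uint64_t total_pages, int score_adj) {
    // memory_percentage expressed on a 0..1000 scale
    long points = (long)((rss_pages * 1000) / total_pages);
    points += score_adj;          // adjustment factor
    if (points < 0) points = 0;   // clamp: never negative
    return points;
}
```

An adjustment of -1000 drives the score to zero regardless of memory use, which is why such processes are effectively unkillable in the selection loop below.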
```c
// Progressive Load Shedding Implementation

typedef enum {
    SHED_RESTRICT,      // Stop new admissions
    SHED_RECLAIM,       // Aggressive page reclamation
    SHED_SUSPEND_ONE,   // Suspend single process
    SHED_SUSPEND_BULK,  // Suspend multiple processes
    SHED_EMERGENCY,     // Suspend all non-critical
    SHED_TERMINATE      // OOM kill
} SheddingLevel;

SheddingLevel current_shedding_level = SHED_RESTRICT;

// Execute load shedding at current level
void execute_shedding(SheddingLevel level) {
    switch (level) {
        case SHED_RESTRICT:
            set_admission_policy(ADMISSION_RESTRICTED);
            break;
        case SHED_RECLAIM:
            set_admission_policy(ADMISSION_BLOCKED);
            trigger_aggressive_reclaim();
            break;
        case SHED_SUSPEND_ONE:
            if (suspend_best_candidate()) {
                break;
            }
            // Fall through if no candidate found
        case SHED_SUSPEND_BULK:
            suspend_multiple_processes(calculate_needed_suspensions());
            break;
        case SHED_EMERGENCY:
            suspend_all_non_critical();
            break;
        case SHED_TERMINATE:
            invoke_oom_killer();
            break;
    }
}

// Escalate if current level insufficient
void maybe_escalate(void) {
    if (!check_memory_stable() &&
        time_since_shedding() > STABILIZATION_WAIT) {
        if (current_shedding_level < SHED_TERMINATE) {
            current_shedding_level++;
            execute_shedding(current_shedding_level);
        }
    }
}

// De-escalate when pressure relieved
void maybe_deescalate(void) {
    if (check_memory_comfortable() &&
        time_since_stable() > DEESCALATION_DELAY) {
        if (current_shedding_level > SHED_RESTRICT) {
            current_shedding_level--;
        }
        if (current_shedding_level == SHED_RESTRICT &&
            check_memory_abundant()) {
            set_admission_policy(ADMISSION_NORMAL);
            resume_suspended_processes();
        }
    }
}

// OOM killer implementation
void invoke_oom_killer(void) {
    int best_score = -1;
    Process* victim = NULL;

    for (int i = 0; i < process_count; i++) {
        Process* p = &all_processes[i];

        // Never kill kernel threads
        if (p->is_kernel) continue;
        // Never kill init (pid 1)
        if (p->pid == 1) continue;
        // Skip processes with oom_score_adj = -1000 (unkillable)
        if (p->oom_score_adj == -1000) continue;

        int score = calculate_oom_score(p);
        if (score > best_score) {
            best_score = score;
            victim = p;
        }
    }

    if (victim != NULL) {
        log_oom_kill(victim);
        send_signal(victim->pid, SIGKILL);
    }
}
```

Triggering the OOM killer represents a failure of earlier load control mechanisms. A well-tuned system should rarely or never reach this point. If OOM kills are frequent, the solution is better admission control, more memory, or fewer processes—not just better OOM victim selection.
Modern systems implement load control through various mechanisms, often without explicit 'swapper' or 'medium-term scheduler' components.
The table below summarizes how Linux, Windows, and macOS realize these mechanisms:
| Feature | Linux | Windows | macOS |
|---|---|---|---|
| Primary mechanism | kswapd + direct reclaim | Working set manager | Compressed memory |
| Process suspension | Rare (per-page focus) | Explicit (swap out) | App termination |
| Admission control | cgroups (optional) | Job objects | Automatic management |
| Application cooperation | None required | None required | Memory pressure notifications |
| Last resort | OOM killer | Close applications prompt | Terminate low-priority apps |
Modern systems have evolved beyond explicit medium-term scheduling to more nuanced approaches. Compression reduces swap I/O, app termination with restoration provides user transparency, and memory limits provide workload isolation. The core principle remains: control the number of active competitors for memory to prevent thrashing.
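As one concrete modern signal: Linux exposes pressure stall information (PSI) in /proc/pressure/memory, whose "some avg10" field reports the share of the last 10 seconds in which at least one task stalled on memory. The file path and line format are real; the parser below works on a sample line so the example is self-contained, and the 10% threshold is purely illustrative.

```c
// Parse the "some" line of /proc/pressure/memory, e.g.
//   some avg10=0.00 avg60=0.00 avg300=0.00 total=0
#include <stdio.h>

// Returns avg10 (percent of the last 10s with some task stalled on memory),
// or -1.0 on parse failure.
double parse_psi_avg10(const char* line) {
    double avg10;
    if (sscanf(line, "some avg10=%lf", &avg10) == 1)
        return avg10;
    return -1.0;
}

// Hypothetical policy hook: treat sustained >10% stall time as overload
int memory_pressure_high(const char* line) {
    return parse_psi_avg10(line) > 10.0;
}
```

A load controller can poll this value (or use PSI's trigger interface) instead of inferring pressure indirectly from fault-rate counters.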
Load control represents the highest-level approach to thrashing prevention—controlling not just how much memory each process gets, but how many processes actively compete. Admission control gates entry to the active set, medium-term scheduling adjusts that set among admitted processes, overload detection decides when to intervene, and progressive load shedding, ending at the OOM killer, restores stability when prevention fails.
Looking Ahead:
Load control adjusts which processes compete for memory. But sometimes even with intelligent load control, specific processes need to be temporarily removed from competition entirely. The next page explores Swapping Processes—the mechanics and policies of moving entire processes between memory and secondary storage.
You now understand load control as a fundamental thrashing prevention strategy. The key insight is that controlling the number of active processes—not just their individual allocations—is essential for system stability. Better to run fewer processes efficiently than many processes in a thrashing state.