We've explored how to determine a single process's memory needs—through working set tracking or page fault frequency. But an operating system doesn't manage memory for one process in isolation. It must coordinate allocation across all processes while pursuing multiple, sometimes conflicting goals: throughput, fairness, responsiveness, and protection of high-priority work.
This page explores dynamic allocation—the mechanisms and policies that translate per-process memory needs into system-wide allocation decisions.
By the end of this page, you will understand: (1) The global allocation problem and its constraints, (2) Allocation policies (equal, proportional, priority-based), (3) How to balance local needs with global resources, (4) Frame allocation algorithms in practice, (5) Handling over-commitment through load control, and (6) Modern approaches in production systems.
Consider a system with M physical frames and N running processes, where process i has working set size WSS_i. The global allocation problem is:
Given: M frames, N processes with varying memory needs
Goal: Allocate frames to each process to optimize system performance
Constraint: Sum of allocations ≤ M (can't allocate more than exists)
Three Fundamental Cases:
| Case | Condition | Situation | Solution |
|---|---|---|---|
| Under-committed | Σ WSS_i < M | More frames than needed | Give each process its WSS; pool remainder |
| Balanced | Σ WSS_i ≈ M | Memory matches demand | Each process gets approximately its WSS |
| Over-committed | Σ WSS_i > M | More demand than supply | Someone must be shortchanged or suspended |
The balanced case is the operating point we seek. The under-committed case is easy (everyone's happy). The over-committed case is the challenge.
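The three cases can be classified directly from the working set sizes. Below is a minimal sketch; the tolerance used to decide when demand is "approximately" equal to supply is an assumed tuning knob:

```python
def commitment_state(wss_list, total_frames, tolerance=0.05):
    """Classify the system as under-committed, balanced, or over-committed.

    tolerance is the fraction of total_frames within which total demand
    counts as 'balanced' -- an illustrative knob, not a standard value.
    """
    demand = sum(wss_list)
    if abs(demand - total_frames) <= tolerance * total_frames:
        return "balanced"
    return "under-committed" if demand < total_frames else "over-committed"

# 100 frames; working sets of 20 + 30 + 10 = 60 leave plenty of slack
print(commitment_state([20, 30, 10], 100))  # under-committed
```

The same call with working sets of 80 and 70 would report over-commitment, the case the rest of this page is concerned with.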
Without coordination, over-commitment leads to a tragedy of the commons: each process grabs frames for its own benefit, but collectively they exhaust resources, and everyone suffers. This is exactly the thrashing phenomenon. Dynamic allocation exists to prevent this tragedy through centralized coordination.
Constraints on Allocation:
Any allocation scheme must respect several constraints:
1. Minimum Frames: Every process needs a minimum number of frames to execute at all. This minimum is architecturally determined:
Example: If an instruction can reference itself, a source operand, a destination operand, and two levels of indirect addressing, the minimum is 5 frames.
2. Maximum Frames: A process shouldn't get unlimited frames: beyond its working set, additional frames yield diminishing returns while starving other processes.
3. Total Constraint: Allocations must sum to at most M. This seems obvious but has profound implications—allocation is a zero-sum game. Giving more to one process means taking from others.
4. Dynamic Nature: Allocations must change over time as processes start, exit, and move between program phases with different working sets.
Several policies determine how to divide frames among processes. Each has strengths and weaknesses.
Equal Allocation:
Formula: Each process gets M/N frames
Example: 100 frames, 5 processes → 20 frames each
| Pro | Con |
|---|---|
| Simple to implement | Ignores process differences |
| Clearly fair in one sense | Small processes waste frames |
| No measurement overhead | Large processes may thrash |
| Easy to understand | Doesn't adapt to behavior |
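A sketch of equal allocation, with leftover frames kept in a free pool rather than handed out unevenly (the pooling choice is an assumption):

```python
def equal_allocation(total_frames, n_processes):
    """Give each process floor(M/N) frames; the remainder stays in a free pool."""
    per_process = total_frames // n_processes
    pool = total_frames - per_process * n_processes
    return per_process, pool

per_proc, pool = equal_allocation(100, 5)
print(per_proc, pool)  # 20 frames per process, 0 left in the pool
```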
Proportional Allocation:
Formula: Process i gets (S_i / Σ S_j) × M frames, where S_i is process i's size
Example: Two processes with sizes 10MB and 40MB, 100 frames available: the first gets (10/50) × 100 = 20 frames, the second gets (40/50) × 100 = 80 frames.
| Pro | Con |
|---|---|
| Considers process size | Size ≠ working set |
| Fair in proportion | Sparse address spaces over-allocated |
| No runtime measurement | Doesn't adapt to phases |
A critical distinction: process size (virtual address space) differs from need (working set). A 1GB address space process might actively use only 50MB. Proportional allocation based on address space size can severely over-allocate to sparse processes.
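The proportional formula, made runnable for the 10MB/40MB example above. Largest-remainder rounding is an assumed tie-breaking rule to keep the allocations summing to M:

```python
def proportional_allocation(sizes, total_frames):
    """Allocate floor((S_i / sum(S)) * M) frames per process, then hand the
    leftover frames to the processes with the largest fractional remainders."""
    total_size = sum(sizes)
    raw = [s / total_size * total_frames for s in sizes]
    alloc = [int(r) for r in raw]
    leftover = total_frames - sum(alloc)
    # Largest-remainder rounding (an assumed policy, not the only option)
    order = sorted(range(len(sizes)), key=lambda i: raw[i] - alloc[i], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc

# Sizes 10MB and 40MB, 100 frames: shares are 10/50 and 40/50
print(proportional_allocation([10, 40], 100))  # [20, 80]
```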
Priority-Based Allocation:
Priority can influence allocation in multiple ways: for example, by weighting each process's share of frames, or by steering victim selection during replacement so that low-priority processes lose frames first.
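One simple form is share weighting: substitute a priority weight for size in the proportional formula. The weights below are illustrative:

```python
def priority_weighted_allocation(weights, total_frames):
    """Process i gets (w_i / sum(w)) * M frames (truncated; remainder pooled)."""
    total_weight = sum(weights)
    return [int(w / total_weight * total_frames) for w in weights]

# Three processes with priority weights 1, 2, 2 sharing 100 frames
print(priority_weighted_allocation([1, 2, 2], 100))  # [20, 40, 40]
```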
Need-Based (Dynamic) Allocation:
This is where working set and PFF concepts are applied: each process's allocation target tracks its measured working set size or fault rate rather than a static share.
This is the most sophisticated approach but requires accurate measurement of working set or fault rate.
A crucial design decision is whether page replacement operates globally (across all processes) or locally (within each process).
Local Replacement: When process P needs a frame, it evicts one of its own pages. Its allocation stays constant.
Global Replacement: When process P needs a frame, it can take a frame from any process (including itself). Allocations shift based on demand.
The Hybrid Approach: Scoped Global Replacement
Most modern systems use a hybrid: replacement is global, but each process's allocation is bounded between soft and hard minimums and maximums. This gives the benefits of both: global flexibility in responding to demand, within per-process guarantees that prevent starvation.
Implementation Sketch:
```c
// Scoped global replacement policy

typedef struct {
    int soft_min;  // Preferred minimum allocation
    int hard_min;  // Absolute minimum (never violate)
    int soft_max;  // Preferred maximum
    int hard_max;  // Absolute maximum (cap)
    int current;   // Currently allocated frames
} ProcessMemoryLimits;

// Called when a process needs a frame
FrameAllocation find_frame_for_process(Process* requester) {
    // Strategy 1: Take from own excess (if above soft max)
    if (requester->limits.current > requester->limits.soft_max) {
        return evict_from_process(requester, LRU);
    }

    // Strategy 2: Take from global free pool
    if (global_free_frames > 0) {
        return allocate_free_frame();
    }

    // Strategy 3: Take from other process with excess
    for (Process* p : all_processes_by_priority_ascending) {
        if (p != requester &&
            p->limits.current > p->limits.soft_min &&
            p->priority < requester->priority) {
            return evict_from_process(p, LRU);
        }
    }

    // Strategy 4: Take from self (standard local replacement)
    if (requester->limits.current > requester->limits.hard_min) {
        return evict_from_process(requester, LRU);
    }

    // Strategy 5: Emergency - steal from anyone above hard_min
    for (Process* p : all_processes_by_priority_ascending) {
        if (p->limits.current > p->limits.hard_min) {
            return evict_from_process(p, LRU);
        }
    }

    // Strategy 6: No frames available - must suspend a process
    return initiate_swapping();
}
```

Global replacement can cause priority inversion: a low-priority process grabs memory from a high-priority one. This is why priority-aware victim selection is essential—always prefer to steal from lower-priority processes.
When total memory demand exceeds supply, no amount of clever allocation can satisfy everyone. The system must reduce the number of competing processes—this is load control.
The Load Control Decision:
When Σ WSS_i > M (working sets exceed frames):
| Criterion | Why It Matters | Tradeoff |
|---|---|---|
| Lowest Priority | Protect important processes | Low-priority work may never complete |
| Largest Working Set | Frees most memory per suspension | May suspend process close to completion |
| Longest Wait Time | Round-robin fairness | May suspend most productive process |
| Most Recently Swapped In | Avoid thrashing same process | May repeatedly suspend same 'unlucky' process |
| Blocked Process | Not actively running anyway | Will need frames when unblocked |
| Highest Recent Allocation Growth | Penalize memory-hungry process | May be legitimate growing need |
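The criteria in the table can be combined into a single victim-selection rule. The sketch below prefers the lowest-priority process and breaks ties by largest working set; this particular combination is an illustrative choice, not the only reasonable one:

```python
def select_swap_victim(processes):
    """Pick a suspension victim: lowest priority first, then largest
    working set as a tie-breaker (frees the most memory).

    Each process is a dict with 'name', 'priority' (higher = more
    important), and 'wss' (frames freed if suspended).
    """
    return min(processes, key=lambda p: (p["priority"], -p["wss"]))

procs = [
    {"name": "db",    "priority": 9, "wss": 500},
    {"name": "batch", "priority": 2, "wss": 300},
    {"name": "spec",  "priority": 2, "wss": 800},
]
print(select_swap_victim(procs)["name"])  # spec: lowest priority, largest WSS
```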
Degree of Multiprogramming Control:
The degree of multiprogramming (number of processes in memory) must be controlled to prevent thrashing:
Feedback Control Loop:
Measure: system-wide page fault rate, CPU utilization, swap activity
- If thrashing detected (high faults + low CPU): decrease multiprogramming (swap out processes)
- If under-utilized (low faults + low CPU due to blocking): increase multiprogramming (swap in processes)
- If healthy (low faults + high CPU): maintain the current multiprogramming level
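The feedback loop reduces to a three-way decision. A minimal sketch, where the threshold values are assumed tuning knobs:

```python
def multiprogramming_action(fault_rate, cpu_util,
                            high_faults=50.0, low_faults=5.0, low_cpu=0.4):
    """Decide how to adjust the degree of multiprogramming.

    fault_rate is system-wide faults/sec; cpu_util is in [0, 1].
    The thresholds are illustrative, not standard values.
    """
    if fault_rate > high_faults and cpu_util < low_cpu:
        return "decrease"  # thrashing: swap out processes
    if fault_rate < low_faults and cpu_util < low_cpu:
        return "increase"  # under-utilized: swap in processes
    return "maintain"

print(multiprogramming_action(fault_rate=120.0, cpu_util=0.2))  # decrease
```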
Swapping Mechanics:
Swapping a process out involves suspending it, writing its resident pages (at minimum, the dirty ones) to backing store, and freeing its frames for other processes.
Swapping back in reverses this: frames are allocated, pages are reloaded either eagerly (pre-faulting the prior working set) or lazily via demand paging, and the process is made runnable again.
Swapping is expensive—writing/reading many pages to/from disk. It should be a last resort. Good allocation policies prevent the need for swapping by keeping the system near the balanced operating point. Frequent swapping indicates fundamental memory shortage.
Let's examine concrete algorithms for dynamic allocation that combine the concepts we've discussed.
Algorithm 1: Combined WSS + PFF
This algorithm tracks both working set size and page fault frequency:
```
for each process:
    wss = estimate_working_set_size(process)
    pff = measure_page_fault_frequency(process)

    // Allocation target based on WSS
    target_allocation = wss * (1 + headroom_factor)

    // Adjust based on PFF
    if pff > upper_threshold:
        target_allocation *= 1.2  // Need more than WSS suggests
    else if pff < lower_threshold:
        target_allocation *= 0.9  // Can get by with less

    // Constrain to limits
    target_allocation = clamp(target_allocation, min_frames, max_frames)
    process.target_frames = target_allocation

// Global reconciliation
if sum(targets) <= available_frames:
    // Under-committed: everyone gets their target
    allocate_to_targets()
else:
    // Over-committed: scale down proportionally, prioritize
    scale_allocations_to_fit()
```
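The per-process target computation above, made runnable. The headroom factor and PFF thresholds are assumed values for illustration:

```python
def compute_target(wss, pff, min_frames=5, max_frames=1000,
                   headroom=0.1, upper=30.0, lower=5.0):
    """Per-process allocation target combining WSS and PFF, as in Algorithm 1."""
    target = wss * (1 + headroom)
    if pff > upper:
        target *= 1.2  # faulting heavily: need more than WSS suggests
    elif pff < lower:
        target *= 0.9  # faulting rarely: can get by with less
    return int(max(min_frames, min(max_frames, target)))

# WSS of 100 frames, fault rate above the upper threshold
print(compute_target(wss=100, pff=40.0))  # 132 frames (100 * 1.1 * 1.2)
```

Note how the minimum-frames constraint dominates for tiny processes: a WSS of 2 still yields the architectural floor of 5 frames.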
Algorithm 2: Pure PFF with Global Balancing
This simpler algorithm uses only PFF:
```
periodically:
    for each process p:
        if p.pff > upper_threshold and free_frames > 0:
            give_frame_to(p)
        else if p.pff < lower_threshold and p.frames > min_frames:
            take_frame_from(p)

    // Check for systemic over-commitment
    if global_pff > emergency_threshold:
        victim = select_swap_victim()
        swap_out(victim)

    // Check for under-utilization
    if global_pff < low_threshold and swapped_processes exist:
        candidate = select_swap_in_candidate()
        swap_in(candidate)
```
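A runnable sketch of the per-process rebalancing step above. The dict-based process records and threshold values are illustrative:

```python
def pff_rebalance(processes, free_frames, upper=30.0, lower=5.0, min_frames=5):
    """One pass of pure-PFF balancing: feed faulting processes from the free
    pool, reclaim frames from quiet ones. Returns the updated free pool.

    Each process is a dict with 'pff' (faults/sec) and 'frames'.
    """
    for p in processes:
        if p["pff"] > upper and free_frames > 0:
            p["frames"] += 1
            free_frames -= 1
        elif p["pff"] < lower and p["frames"] > min_frames:
            p["frames"] -= 1
            free_frames += 1
    return free_frames

workload = [{"pff": 50.0, "frames": 10}, {"pff": 1.0, "frames": 20}]
free = pff_rebalance(workload, free_frames=1)
print([p["frames"] for p in workload], free)  # [11, 19] 1
```

One pass moves at most one frame per process; repeated passes let allocations converge gradually, which matches the stability argument made for the adaptive algorithm below.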
```c
// Adaptive allocation algorithm implementation

typedef struct {
    Process* proc;
    int current_frames;
    int target_frames;
    double priority_weight;
    double pff;
    int wss_estimate;
} ProcessAllocationEntry;

void compute_target_allocations(ProcessAllocationEntry* entries, int n) {
    int total_available = total_physical_frames - kernel_frames;

    // Phase 1: Compute unconstrained targets
    int total_demand = 0;
    for (int i = 0; i < n; i++) {
        ProcessAllocationEntry* e = &entries[i];

        // Base target is working set estimate with headroom
        e->target_frames = e->wss_estimate * 1.1;

        // Adjust based on fault rate
        if (e->pff > UPPER_PFF_THRESHOLD) {
            e->target_frames *= 1.0 + min(0.5,
                (e->pff - UPPER_PFF_THRESHOLD) / UPPER_PFF_THRESHOLD);
        } else if (e->pff < LOWER_PFF_THRESHOLD) {
            e->target_frames *= 0.9;
        }

        // Apply limits
        e->target_frames = max(e->target_frames, MIN_FRAMES);
        e->target_frames = min(e->target_frames, e->proc->max_frames);

        total_demand += e->target_frames;
    }

    // Phase 2: Scale if over-committed
    if (total_demand > total_available) {
        double scale = (double)total_available / total_demand;

        // Priority-weighted scaling: high priority gets less reduction
        double total_weighted_demand = 0;
        for (int i = 0; i < n; i++) {
            total_weighted_demand +=
                entries[i].target_frames * entries[i].priority_weight;
        }

        for (int i = 0; i < n; i++) {
            ProcessAllocationEntry* e = &entries[i];
            // Higher weight = less scaling (closer to 1.0)
            double adjusted_scale = scale + (1.0 - scale) *
                (e->priority_weight - 1.0) / MAX_PRIORITY_WEIGHT;
            e->target_frames *= adjusted_scale;
            e->target_frames = max(e->target_frames, MIN_FRAMES);
        }
    }
}

void apply_allocations(ProcessAllocationEntry* entries, int n) {
    // Apply targets gradually to avoid thrashing
    for (int i = 0; i < n; i++) {
        ProcessAllocationEntry* e = &entries[i];
        int delta = e->target_frames - e->current_frames;

        // Don't change more than 20% per interval for stability
        int max_change = max(1, e->current_frames / 5);
        delta = clamp(delta, -max_change, max_change);

        if (delta > 0) {
            add_frames_to_process(e->proc, delta);
        } else if (delta < 0) {
            remove_frames_from_process(e->proc, -delta);
        }
    }
}
```

Note the 20% limit on per-interval change. Rapid allocation changes can cause instability—a process suddenly losing 50% of its frames will spike in fault rate, triggering allocation increase, causing oscillation. Gradual adjustment provides stability.
Modern operating systems don't strictly implement either working set or PFF. Instead, they use pragmatic approaches that achieve similar goals with simpler mechanisms.
Linux: Memory Pressure-Based Reclamation
Linux doesn't track per-process working sets or fault rates explicitly. Instead, it reclaims pages system-wide in response to memory pressure, using the mechanisms below.
This is essentially global replacement with recency-based victim selection, plus load control via OOM.
| Concept | Mechanism | Purpose |
|---|---|---|
| kswapd | Kernel daemon that reclaims pages in background | Keep free memory above watermarks |
| Direct Reclaim | Process reclaims synchronously when allocating | Handle sudden memory needs |
| LRU Lists | Active and inactive lists track page age | Approximate LRU for victim selection |
| Memory Cgroups | Hierarchical memory limits per container/group | Isolation and resource control |
| OOM Killer | Terminates process when memory exhausted | Last-resort load control |
| Writeback | Flush dirty pages to disk in background | Prepare pages for eviction |
Windows: Working Set Manager
Windows is closer to the classic working set model: each process has an explicit working set with configurable minimum and maximum sizes, and a working set manager trims processes' working sets when memory runs low.
Key Differences from Classic Model:
In containerized environments (Docker, Kubernetes), memory cgroups provide hierarchical allocation. Each container has memory limits, and the OS enforces them. This moves allocation decisions up a level—the container runtime configures limits; the OS enforces them. This model dominates cloud computing.
FreeBSD: Soft/Hard Limit Model
FreeBSD uses resource limits with soft and hard bounds:
| Limit Type | Behavior |
|---|---|
| Soft limit | Process receives signals/warnings when exceeded |
| Hard limit | Allocation fails if would exceed |
| Per-process | Individual process constraints |
| Per-user | Total across all user's processes |
| System-wide | Global constraints |
This provides flexibility for administrators to tune behavior.
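The soft/hard distinction in the table can be sketched as a simple admission check. The names and return values below are illustrative, not the FreeBSD API:

```python
def check_allocation(current, request, soft_limit, hard_limit):
    """Classify an allocation request against soft and hard limits."""
    new_total = current + request
    if new_total > hard_limit:
        return "fail"  # hard limit: the allocation is refused
    if new_total > soft_limit:
        return "warn"  # soft limit: allowed, but the process is signalled
    return "ok"

print(check_allocation(current=90, request=20, soft_limit=100, hard_limit=150))  # warn
```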
VMware ESXi: Hypervisor Memory Management
Virtualization adds another layer: the hypervisor divides machine memory among guest VMs using reservations (guaranteed minimums), limits (caps), and shares (proportional weights), with techniques such as ballooning to reclaim memory from guests under pressure.
This is essentially priority-based proportional allocation with guaranteed minimums.
Let's examine how allocation plays out in realistic scenarios.
Scenario 1: Database Server with Web Frontend
Workload:
Allocation Strategy:
Behavior:
Result: Stable performance; database protected; web absorbs variation
Scenario 2: Build Server with Multiple Jobs

Workload:
- Build jobs: variable memory needs (500MB to 8GB each)
- Concurrent jobs: up to 16 parallel builds
- Total RAM: 64GB

Challenge: build jobs have phases:
1. Parse/compile: small WSS
2. Link: large WSS (loads all objects)
3. Test: variable WSS

Jobs in the link phase need much more memory temporarily.

Allocation Strategy:

Priority Tiers:

| Tier | Jobs | Memory Share | Preemption |
|---|---|---|---|
| High | CI/critical path | 40% (25.6GB) | Protected |
| Medium | Developer builds | 35% (22.4GB) | Preemptable |
| Low | Speculative builds | 25% (16GB) | First to suspend |

Dynamic Adjustment:
1. Jobs start with a 1GB allocation
2. PFF is monitored; allocation grows as needed
3. When a job enters the link phase (detected by syscall patterns): pre-allocate additional frames, and delay other job starts if memory is tight
4. After the link phase, excess is reclaimed

Over-Commitment Handling:
1. First: delay starting new low-priority jobs
2. Second: suspend idle speculative builds
3. Third: OOM-kill if truly exhausted (rare)

Result:
- CI jobs reliably complete
- Developer builds usually fast
- Speculative builds best-effort
- System rarely thrashes

Scenario 3: Mixed Interactive and Batch
Workload:
Challenge:
Allocation Strategy:
- Interactive tier: high priority, guaranteed response
- Batch tier: lower priority, best effort
- Pressure response: under memory pressure, batch jobs give up frames (and are suspended) before interactive processes are touched
Key Insight: Interactive responsiveness depends on having working set in memory. Even 100ms of page fault latency is noticeable. Batch jobs can tolerate higher fault rates—they just take longer.
For interactive workloads, preventing page faults is more important than maximizing throughput. A user waiting 200ms for a page fault notices; a batch job taking 60 instead of 58 minutes doesn't matter. Allocation policies should encode this asymmetry.
We've explored dynamic allocation in comprehensive depth—from fundamental policies to practical algorithms and real-world case studies.
What's Next:
In the final page of this module, we'll explore Thrashing Prevention—how operating systems detect, respond to, and prevent the catastrophic condition where page faulting dominates useful work. This brings together everything we've learned about working sets, PFF, and dynamic allocation.
You now understand dynamic frame allocation—how operating systems coordinate memory across all processes to maximize performance while preventing thrashing. This understanding connects individual process needs (working set, PFF) to system-wide behavior.