After understanding the mechanics, trade-offs, and performance implications of global and local replacement, the practical question remains: which strategy should you choose for your specific situation?
The answer is rarely 'always global' or 'always local.' Modern systems often employ hybrid strategies, using global replacement as a baseline while enforcing local boundaries where isolation is required. This page provides a decision framework for navigating these choices effectively.
By the end of this page, you will have a systematic framework for choosing replacement strategies based on workload characteristics, isolation requirements, and operational constraints. You will understand when to use pure global, pure local, or hybrid approaches, and how to configure modern systems to implement your chosen strategy.
Choosing a replacement strategy requires evaluating your situation across multiple dimensions. The following framework structures this evaluation systematically.
The framework is organized around four key questions about your workload and environment; the answers point toward general guidance for pure global, pure local, or hybrid replacement, as sketched below.
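As a rough illustration only, the sketch below scores a workload along dimensions drawn from the scenarios later on this page (trust boundaries, latency commitments, throughput focus, overcommit). The field names and decision rules are assumptions for illustration, not the framework's literal questions.

```python
# Illustrative sketch only -- the dimensions and rules below are assumptions,
# not the literal "four key questions" of the framework.
from dataclasses import dataclass

@dataclass
class WorkloadProfile:
    multi_tenant: bool          # hard trust boundaries between owners?
    latency_sla: bool           # committed latency/SLA targets?
    throughput_focused: bool    # aggregate throughput valued over per-process predictability?
    overcommit_needed: bool     # must run more processes than fit in RAM?

def recommend_strategy(p: WorkloadProfile) -> str:
    """Map workload characteristics to a replacement-scope recommendation."""
    if p.multi_tenant and p.latency_sla:
        return "local"      # isolation and predictability both required
    if p.multi_tenant or p.latency_sla:
        return "hybrid"     # enforce boundaries where needed, share the rest
    if p.throughput_focused or p.overcommit_needed:
        return "global"     # let memory flow to the most active processes
    return "hybrid"         # safe default for mixed requirements

print(recommend_strategy(WorkloadProfile(True, True, False, False)))   # local
print(recommend_strategy(WorkloadProfile(False, False, True, True)))   # global
```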
Global replacement is appropriate when efficiency and throughput are paramount, and the downsides of process interference are acceptable or manageable.
Ideal scenarios for global replacement:
| Scenario | Why Global Works | Considerations |
|---|---|---|
| Single-Tenant Server | All processes owned by same user; interference is self-interference | May still want priority protection for critical services |
| Batch Processing Cluster | Throughput matters most; latency predictability less important | Jobs complete faster when memory flows to active jobs |
| Development/Test Environments | Isolation less critical; resource efficiency valued | Interference may even help identify performance issues |
| Homogeneous Workloads | All processes have similar needs; interference is symmetric | Equal impact under pressure is inherently 'fair' |
| Memory Overcommit Required | Need to run more processes than fit simultaneously | Monitor for thrashing signs; may need to reduce load |
| Best-Effort Services | No strict SLAs; focus on aggregate performance | Occasional latency spikes acceptable |
Even when choosing global replacement, add protective mechanisms: use OOM killer scoring to protect critical processes, set memory.low limits for minimum guarantees, and monitor page fault rates for early warning of interference. Pure, unguarded global replacement is rarely appropriate in production.
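As a minimal sketch of those guardrails on a cgroup v2 host (assuming root privileges and that the memory controller is enabled for the subtree; the cgroup path and process are placeholders), the protections might be wired up like this:

```python
import os

CGROUP = "/sys/fs/cgroup/critical-service"   # placeholder cgroup path

def protect_critical_service(pid: int, low_bytes: int) -> None:
    """Guard one critical process under otherwise-global replacement (cgroup v2, needs root)."""
    os.makedirs(CGROUP, exist_ok=True)
    # Best-effort reclaim protection: pages below this amount are reclaimed last.
    with open(os.path.join(CGROUP, "memory.low"), "w") as f:
        f.write(str(low_bytes))
    # Move the process into the protected cgroup.
    with open(os.path.join(CGROUP, "cgroup.procs"), "w") as f:
        f.write(str(pid))
    # Make the OOM killer strongly prefer other victims.
    with open(f"/proc/{pid}/oom_score_adj", "w") as f:
        f.write("-900")

def major_fault_count() -> int:
    """System-wide major page faults; a rising rate is an early warning of interference."""
    with open("/proc/vmstat") as f:
        for line in f:
            if line.startswith("pgmajfault"):
                return int(line.split()[1])
    return 0
```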
Local replacement is appropriate when isolation, predictability, and fairness outweigh efficiency concerns, or when trust boundaries require hard separation.
Ideal scenarios for local replacement:
| Scenario | Why Local Works | Considerations |
|---|---|---|
| Multi-Tenant Cloud | Customer isolation is mandatory; cannot let one tenant affect another | Balance between isolation and efficient resource use |
| Real-Time Systems | Predictable timing required; interference would violate guarantees | May need static allocation with no dynamic adjustment |
| SLA-Bound Services | Latency guarantees require predictable memory behavior | Size allocations to meet committed SLAs |
| Security-Critical Workloads | Timing side-channels via memory sharing are a threat model | Consider additional isolation mechanisms beyond memory |
| Metered Billing | Customers pay for allocated resources; must be enforceable | Allocation = what they pay for, no more, no less |
| Regulated Environments | Compliance requires demonstrable resource isolation | Document and audit isolation mechanisms |
Local replacement transfers the complexity from 'managing interference' to 'sizing allocations correctly.' Undersized allocations cause process thrashing. Oversized allocations waste resources. You must understand your workloads' memory requirements—a challenge that global replacement sidesteps but local replacement demands.
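As a rough illustration of sizing, one approach is to start from a profiled peak working set and add headroom for variability; the 20% figure below is an assumption, not a rule.

```python
def size_allocation(peak_working_set_gb: float,
                    headroom_fraction: float = 0.2) -> float:
    """Size a local allocation from measured peak usage plus headroom.

    Undersizing makes the process thrash inside its own partition;
    oversizing strands memory that other processes could have used.
    """
    return peak_working_set_gb * (1.0 + headroom_fraction)

# Example: a service whose profiled peak working set is 3.2 GB
print(f"memory.max ~ {size_allocation(3.2):.1f} GB")   # ~3.8 GB with 20% headroom
```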
Most production systems use hybrid strategies that combine elements of both global and local replacement. These hybrids capture much of global's efficiency while providing local's isolation where needed.
Common hybrid patterns:
```yaml
# Kubernetes Pod with Hybrid Memory Strategy
#
# The pod gets a total memory limit (local boundary from other pods)
# Containers within the pod share that limit (global-like within pod)
# This is the "Global Within, Local Between" pattern

apiVersion: v1
kind: Pod
metadata:
  name: microservice-pod
spec:
  containers:
    # Container 1: Web server - gets guaranteed minimum
    - name: web-server
      image: nginx:latest
      resources:
        requests:
          memory: "512Mi"   # Guaranteed minimum (protected)
        limits:
          memory: "1Gi"     # Hard cap (local boundary)

    # Container 2: Cache - can burst but has lower priority
    - name: redis-cache
      image: redis:latest
      resources:
        requests:
          memory: "256Mi"   # Lower guarantee
        limits:
          memory: "768Mi"   # Can use more if available

    # Container 3: Log shipper - best effort
    - name: log-shipper
      image: fluent-bit:latest
      resources:
        requests:
          memory: "64Mi"    # Minimal guarantee
        limits:
          memory: "256Mi"   # Hard cap to prevent runaway

# Behavior:
# - Pod has hard boundary from other pods (local)
# - Within pod, memory.requests provide soft protection
# - memory.limits provide hard caps
# - Kubernetes uses global reclaim within the pod's allocation
# - If a container exceeds its limit, it gets OOM killed

---
# Linux cgroup v2 hybrid configuration
# This achieves a similar effect on bare Linux

# Create cgroup hierarchy
# /sys/fs/cgroup/
# └── production/             # Top-level boundary
#     ├── database/           # Protected service
#     │   ├── memory.min: 4G  # Guaranteed minimum
#     │   └── memory.max: 8G  # Hard limit
#     ├── webserver/          # Front-end tier
#     │   ├── memory.min: 2G
#     │   └── memory.max: 4G
#     └── batch/              # Best effort
#         ├── memory.min: 0   # No guarantee
#         └── memory.max: 4G  # Still capped

# Effect:
# - database always gets at least 4G (local minimum)
# - webserver always gets at least 2G (local minimum)
# - batch gets what's left (global from remainder)
# - All capped at their max (local upper bound)
# - Within each cgroup, global replacement among its processes
```

Hybrid strategies are the norm in production because they let you capture global replacement's efficiency (high utilization, adaptive flow) while enforcing local replacement's isolation at boundaries where it matters (tenants, SLA tiers, priority levels). The art is in choosing where to draw boundaries and how permeable to make them.
Translating strategy decisions into actual system configuration requires understanding the specific mechanisms available in your operating system or container platform.
Linux configuration options:
| Mechanism | Effect | Use When |
|---|---|---|
| memory.max | Hard limit; processes killed if exceeded | Need strict cap on resource usage |
| memory.high | Soft limit; throttling and reclaim above this | Want to limit bursts but allow flexibility |
| memory.min | Guaranteed reservation; protected from reclaim | Critical service needs minimum guarantee |
| memory.low | Best-effort protection; reclaimed last | Want preference but not hard guarantee |
| oom_score_adj | OOM killer priority tuning | Protect specific processes from termination |
| mlock() | Lock specific pages in memory | Ultra-critical pages must never be evicted |
```bash
#!/bin/bash
# Production Memory Configuration Examples

# ============================================
# Scenario 1: Multi-Tenant Container Host
# Goal: Strong isolation between tenants
# ============================================

create_tenant_cgroup() {
    local tenant=$1
    local max_gb=${2%G}   # size given as "4G", "8G", ...

    # Create isolated cgroup for tenant
    mkdir -p /sys/fs/cgroup/$tenant

    # Hard limit - tenant cannot exceed this
    echo "${max_gb}G" > /sys/fs/cgroup/$tenant/memory.max

    # No minimum guarantee - they get what they pay for
    echo 0 > /sys/fs/cgroup/$tenant/memory.min

    # High watermark - throttle before hitting max (90% of the cap, in bytes)
    echo $(( max_gb * 1024 * 1024 * 1024 * 90 / 100 )) > /sys/fs/cgroup/$tenant/memory.high
}

create_tenant_cgroup "tenant-a" "4G"
create_tenant_cgroup "tenant-b" "8G"
create_tenant_cgroup "tenant-c" "2G"

# ============================================
# Scenario 2: Critical Service Protection
# Goal: Database always gets memory it needs
# ============================================

configure_critical_service() {
    local cgroup_path=$1
    local min_memory=$2
    local max_memory=$3

    mkdir -p /sys/fs/cgroup/$cgroup_path

    # Guaranteed minimum - NEVER reclaim below this
    echo $min_memory > /sys/fs/cgroup/$cgroup_path/memory.min

    # Hard limit to prevent runaway
    echo $max_memory > /sys/fs/cgroup/$cgroup_path/memory.max

    # Move the database's main process into the cgroup
    local db_pid=$(pgrep -o postgres)
    echo $db_pid > /sys/fs/cgroup/$cgroup_path/cgroup.procs

    # Additional protection: resistant to OOM killer
    echo -500 > /proc/$db_pid/oom_score_adj
}

configure_critical_service "production/database" "4G" "8G"

# ============================================
# Scenario 3: Tiered Service Priority
# Goal: Gold > Silver > Bronze in memory access
# ============================================

setup_tiered_services() {
    # Gold tier: guaranteed memory, high protection
    mkdir -p /sys/fs/cgroup/gold
    echo "4G" > /sys/fs/cgroup/gold/memory.min
    echo "10G" > /sys/fs/cgroup/gold/memory.max

    # Silver tier: some protection, flexible limit
    mkdir -p /sys/fs/cgroup/silver
    echo "1G" > /sys/fs/cgroup/silver/memory.low   # soft protection
    echo "6G" > /sys/fs/cgroup/silver/memory.max

    # Bronze tier: no protection, capped
    mkdir -p /sys/fs/cgroup/bronze
    echo "0" > /sys/fs/cgroup/bronze/memory.min
    echo "4G" > /sys/fs/cgroup/bronze/memory.max
}

setup_tiered_services

# Result:
# - Gold tier always gets 4G, can burst to 10G
# - Silver tier prefers to keep 1G, can burst to 6G
# - Bronze tier shares remaining memory, capped at 4G
# - Under pressure: Bronze reclaimed first, then Silver, Gold last

# ============================================
# Scenario 4: Kubernetes Memory QoS Classes
# ============================================

# Kubernetes automatically creates QoS classes:
#
# Guaranteed: requests == limits for all containers
#   -> Highest priority, least likely to be evicted
#
# Burstable: requests < limits for some containers
#   -> Medium priority, can be evicted if exceeds requests
#
# BestEffort: no requests or limits specified
#   -> Lowest priority, first to be evicted

# Example Guaranteed pod:
cat <<EOF
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: db
      resources:
        requests:
          memory: "4Gi"   # request == limit
        limits:
          memory: "4Gi"   # Guaranteed class
EOF
```

Always start with monitoring before adding limits: understand actual memory usage patterns before setting allocations, and set limits slightly above observed needs to allow for variability. Use memory.high before memory.max where possible, since throttling is gentler than OOM killing. Review configurations regularly as workloads evolve.
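To act on that advice, a minimal sketch (assuming cgroup v2 paths and a service already running inside the named cgroup, which is a placeholder) can sample memory.current to learn actual usage and watch memory.events for signs that memory.high throttling has started.

```python
import time

def read_cgroup_value(cgroup: str, name: str) -> str:
    """Read a single cgroup v2 memory interface file."""
    with open(f"/sys/fs/cgroup/{cgroup}/{name}") as f:
        return f.read().strip()

def sample_usage(cgroup: str, samples: int = 5, interval_s: float = 2.0) -> int:
    """Observe peak memory.current over a short window before deciding on limits."""
    peak = 0
    for _ in range(samples):
        peak = max(peak, int(read_cgroup_value(cgroup, "memory.current")))
        time.sleep(interval_s)
    return peak

def high_event_count(cgroup: str) -> int:
    """How many times memory.high throttling kicked in (from memory.events)."""
    for line in read_cgroup_value(cgroup, "memory.events").splitlines():
        key, value = line.split()
        if key == "high":
            return int(value)
    return 0

# Example (hypothetical cgroup name):
# peak = sample_usage("production/webserver")
# print(f"Peak: {peak / 2**30:.2f} GiB, high events: {high_event_count('production/webserver')}")
```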
Even with good intentions, memory management configurations often go wrong. Learning from common mistakes helps avoid them in your own systems.
Frequent configuration mistakes:
"""Common Memory Configuration Pitfalls - Examples and Fixes""" def pitfall_1_overcommitted_minimums(): """ PITFALL: Sum of memory.min exceeds available """ physical_memory_gb = 16 kernel_reservation_gb = 2 # For kernel, buffers, etc. available_gb = physical_memory_gb - kernel_reservation_gb # = 14GB # BAD: Guarantees exceed available bad_config = { "database": {"memory_min": "8G"}, "webserver": {"memory_min": "6G"}, "cache": {"memory_min": "4G"}, } # Total min: 18GB > 14GB available => SOMEONE WILL STARVE # GOOD: Guarantees fit within available good_config = { "database": {"memory_min": "6G"}, "webserver": {"memory_min": "4G"}, "cache": {"memory_min": "2G"}, } # Total min: 12GB < 14GB available => All guarantees honored print("PITFALL 1: Overcommitted Minimums") print(f" Available memory: {available_gb}GB") print(f" Bad config total min: 18GB (EXCEEDS AVAILABLE)") print(f" Good config total min: 12GB (fits)") def pitfall_2_equal_allocation_unequal_needs(): """ PITFALL: One-size-fits-all allocations """ print("\nPITFALL 2: Equal Allocation for Unequal Workloads") # Actual memory needs (measured via profiling) actual_needs = { "api-gateway": 0.5, # 500MB "auth-service": 0.2, # 200MB "data-processor": 4.0, # 4GB "report-generator": 2.0, # 2GB } # BAD: Everyone gets 1GB bad_allocation = 1.0 # GB each total_bad = len(actual_needs) * bad_allocation # 4GB total print(f" Bad approach: Give everyone {bad_allocation}GB") for service, need in actual_needs.items(): status = "WASTED" if need < bad_allocation else "STARVED" diff = abs(need - bad_allocation) print(f" {service}: needs {need}GB, gets {bad_allocation}GB -> {status} by {diff}GB") # GOOD: Proportional to actual needs with buffer total_need = sum(actual_needs.values()) available = 8.0 # GB available for these services print(f"\n Good approach: Proportional allocation from {available}GB") for service, need in actual_needs.items(): allocation = (need / total_need) * available status = "ADEQUATE" if allocation >= need else "INSUFFICIENT" print(f" {service}: needs {need}GB, gets {allocation:.1f}GB -> {status}") def pitfall_3_forgetting_overhead(): """ PITFALL: Not accounting for system overhead """ print("\nPITFALL 3: Forgetting OS/Kernel Overhead") total_ram = 32 # GB # BAD: Allocate everything to applications bad_app_allocation = total_ram # 32GB to apps print(f" Bad: Allocate {bad_app_allocation}GB to applications") print(f" Kernel needs: ~2GB (page tables, slab, etc.)") print(f" Buffer cache: ~1-4GB (for I/O performance)") print(f" Result: Apps fight kernel; system unstable") # GOOD: Reserve for system needs kernel_reserve = 2 buffer_reserve = 4 # For healthy I/O caching app_allocation = total_ram - kernel_reserve - buffer_reserve print(f"\n Good: Reserve {kernel_reserve}GB kernel + {buffer_reserve}GB buffers") print(f" Available for apps: {app_allocation}GB") print(f" Result: Healthy system with room for caching") # Run examplespitfall_1_overcommitted_minimums()pitfall_2_equal_allocation_unequal_needs()pitfall_3_forgetting_overhead()Use a pre-deployment checklist: (1) Sum all memory.min values and verify they fit in available RAM with 10-20% margin, (2) Profile workloads to base limits on actual usage, (3) Account for kernel, buffer cache, and shared libraries, (4) Set up monitoring and alerting before deployment.
Use this comprehensive checklist when making replacement policy decisions for a new system or re-evaluating an existing configuration.
Pre-decision information gathering:
Memory configurations should be reviewed quarterly or whenever significant workload changes occur. Workloads grow over time—limits that work today may cause problems next year. Build regular review into your operational practices.
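Such reviews can be partly automated; the sketch below (hypothetical cgroup names, cgroup v2 assumed) flags limits that look too tight (pressure events recorded) or too generous (low utilization at review time).

```python
def review_cgroup(cgroup: str) -> None:
    """Flag limits that look stale: pressure events suggest too tight, low usage suggests too loose."""
    base = f"/sys/fs/cgroup/{cgroup}"
    with open(f"{base}/memory.current") as f:
        current = int(f.read())   # point-in-time sample; trend data gives a fuller picture
    with open(f"{base}/memory.max") as f:
        raw = f.read().strip()
    if raw == "max":
        print(f"{cgroup}: no limit configured")
        return
    limit = int(raw)
    with open(f"{base}/memory.events") as f:
        events = dict(line.split() for line in f.read().splitlines())

    utilization = current / limit
    if int(events.get("high", 0)) or int(events.get("max", 0)):
        print(f"{cgroup}: pressure events recorded; limit may be too tight")
    elif utilization < 0.3:
        print(f"{cgroup}: only {utilization:.0%} of limit in use; limit may be too generous")
    else:
        print(f"{cgroup}: limit looks reasonable ({utilization:.0%} utilized)")

# Example (hypothetical cgroup names):
# for cg in ("production/database", "production/webserver", "production/batch"):
#     review_cgroup(cg)
```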
We have synthesized the module's concepts into a practical decision framework for choosing replacement strategies.
Module Complete: Global vs Local Replacement
You have mastered the fundamental distinction between global and local replacement, understanding their mechanisms, trade-offs, interference patterns, performance implications, and practical configuration. This knowledge enables you to design and operate memory management policies that balance efficiency with isolation according to your specific requirements.
Congratulations! You have completed the Global vs Local Replacement module. You now have a comprehensive understanding of replacement scope decisions—from fundamental concepts to practical configuration. In the next module, we will explore Thrashing—what happens when systems run out of manageable memory and how to detect, prevent, and recover from this condition.