Consider a historian writing a book. To write effectively, they need reference materials at hand—but how many days of reading counts as 'recent enough' to be relevant? Yesterday's reading is certainly useful; last year's might be outdated or forgotten. The same dilemma faces operating systems: how far back should we look when determining what a process 'currently' needs?
This is the question the working set window parameter Δ (delta) answers. It's the single most important tunable aspect of the working set model, determining the boundary between 'recent' and 'ancient' memory accesses. Get Δ right, and the working set model efficiently prevents thrashing. Get Δ wrong, and you either waste memory on unused pages or suffer needless page faults.
By the end of this page, you will deeply understand the working set window parameter: how to interpret it, how its value affects system behavior, techniques for choosing appropriate values, trade-offs between static and adaptive window sizes, and why there's no universally 'correct' Δ. You'll develop the intuition to reason about window selection for different workload types.
The window parameter Δ defines the temporal scope of the working set. Recall the definition:
W(t, Δ) = { p : page p was accessed during interval [t - Δ, t] }
Δ determines how far back in time (or, more precisely, how many memory references back) we consider when building the working set. This seemingly simple parameter has profound effects on memory management behavior.
Δ as a Measure of 'Recency':
Think of Δ as the definition of 'recently used.' When we say 'keep recently used pages resident,' Δ quantifies 'recently':
Physical Interpretation:
In practice, Δ is measured in one of two ways:
The reference-based measure is more theoretically pure (independent of CPU speed), while the time-based measure is more practical (easier to implement with timer interrupts).
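As a minimal sketch of the reference-based definition above (the `working_set` helper and the sample trace are illustrative, not from the text):

```python
def working_set(trace, t, delta):
    """Pages referenced during the interval [t - delta, t] (reference-based window)."""
    start = max(0, t - delta)
    return set(trace[start:t + 1])  # inclusive of the reference at time t

trace = [1, 2, 3, 1, 2, 3, 7, 7, 7, 1]
print(working_set(trace, t=5, delta=3))  # references 2..5 -> {1, 2, 3}
print(working_set(trace, t=8, delta=2))  # references 6..8 -> {7}
```

A time-based implementation would replace the reference index `t` with timestamps, trading exactness for cheaper bookkeeping.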
| Δ Value | Interpretation | Practical Meaning |
|---|---|---|
| Δ = 1 | Minimal window | Only the single most recent reference is 'in' the working set |
| Δ = 100 | Very small window | ~100 instructions of history (submicrosecond) |
| Δ = 10,000 | Small window | ~10K references (~tens of microseconds) |
| Δ = 100,000 | Medium window | ~100K references (~hundreds of microseconds) |
| Δ = 1,000,000 | Large window | ~1M references (~milliseconds) |
| Δ = 10,000,000 | Very large window | ~10M references (~tens of milliseconds) |
| Δ → ∞ | Infinite window | Every page ever accessed is in working set |
The 'ideal' Δ depends on workload characteristics. A tight computational loop might reuse pages every 1,000 references, needing only a small Δ. A database query might touch many pages once over millions of references, needing a much larger Δ. The working set model is most effective when Δ is tuned to match the workload's temporal locality characteristics.
The window size Δ creates a fundamental trade-off between memory consumption and page fault rate. Understanding this trade-off is essential for effective working set tuning.
The Fundamental Trade-off:
This trade-off is not linear—the relationship between Δ, WSS, and page fault rate depends on the workload's locality characteristics.
The Optimal Window:
Conceptually, the optimal Δ is the smallest window that captures all pages that will be reused before aging out. In other words:
Optimal Δ = smallest window such that pages are rarely excluded before their next reuse
For a program with perfect loop structure (each page accessed exactly every T references), the optimal Δ is exactly T. For real programs with irregular access patterns, there's no single optimal value—rather, a range of acceptable values that balance memory and faults reasonably.
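The perfect-loop case can be checked directly. This small sketch (the `fault_count` helper is illustrative) counts faults under a pure working-set policy for a loop over T = 10 pages; with Δ = T only the cold-start faults remain, while Δ = T − 1 evicts every page just before its reuse:

```python
def fault_count(trace, delta):
    """Faults under a working-set policy: a reference faults when its page
    was not touched in the previous delta references."""
    faults = 0
    for t, page in enumerate(trace):
        if page not in trace[max(0, t - delta):t]:
            faults += 1
    return faults

T = 10  # loop period: 10 pages accessed round-robin
trace = [i % T for i in range(1000)]
print(fault_count(trace, T))      # 10: cold-start faults only
print(fault_count(trace, T - 1))  # 1000: every single reference faults
```

The cliff between Δ = T and Δ = T − 1 is why a slightly-too-small window can be catastrophic for loop-structured code.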
```python
# Comprehensive analysis of window size effects
import random


class WindowSizeAnalyzer:
    """
    Analyze how window size affects working set behavior,
    page fault rate, and memory consumption.
    """

    def __init__(self, num_pages, trace):
        self.num_pages = num_pages
        self.trace = trace  # List of page accesses
        self.trace_length = len(trace)

    def compute_wss_for_delta(self, delta):
        """Compute average WSS for given delta over entire trace."""
        wss_values = []
        for t in range(delta, self.trace_length):
            window = self.trace[t - delta:t + 1]
            wss_values.append(len(set(window)))
        return sum(wss_values) / len(wss_values) if wss_values else 0

    def simulate_page_faults(self, delta):
        """
        Simulate page faults under working set policy with given delta.
        A page fault occurs when a page is accessed that is not in the
        current working set (i.e., not accessed in last delta refs).
        """
        faults = 0
        for t in range(self.trace_length):
            page = self.trace[t]
            # Check if page is in working set at time t (exclude current access)
            window_start = max(0, t - delta)
            working_set = set(self.trace[window_start:t])
            if page not in working_set:
                faults += 1
        fault_rate = faults / self.trace_length
        return faults, fault_rate

    def analyze_window_range(self, delta_range):
        """
        Analyze behavior across a range of delta values.
        Returns data for plotting trade-off curves.
        """
        results = []
        for delta in delta_range:
            avg_wss = self.compute_wss_for_delta(delta)
            faults, fault_rate = self.simulate_page_faults(delta)
            results.append({
                'delta': delta,
                'avg_wss': avg_wss,
                'faults': faults,
                'fault_rate': fault_rate,
                'efficiency': (1 - fault_rate) / avg_wss if avg_wss > 0 else 0
            })
        return results

    def find_knee_point(self, results):
        """
        Find the "knee" in the trade-off curve: the point where increasing
        delta provides diminishing returns in fault rate reduction.
        """
        best_efficiency = 0
        knee_delta = results[0]['delta']
        for r in results:
            if r['efficiency'] > best_efficiency:
                best_efficiency = r['efficiency']
                knee_delta = r['delta']
        return knee_delta, best_efficiency


def demonstrate_window_effects():
    """Demonstrate how window size affects working set behavior."""
    print("Window Size Effects Analysis")
    print("=" * 70)

    # Generate trace with three phases
    trace = []
    # Phase 1: Tight loop over 10 pages
    for _ in range(500):
        trace.extend(i % 10 for i in range(10))  # 5000 refs
    # Phase 2: Scattered access over 20 pages
    for _ in range(2000):
        trace.append(random.randint(0, 19))  # 2000 refs
    # Phase 3: Back to tight loop
    for _ in range(500):
        trace.extend(i % 10 for i in range(10))  # 5000 refs

    analyzer = WindowSizeAnalyzer(num_pages=20, trace=trace)

    # Analyze range of delta values
    delta_range = [5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000]
    results = analyzer.analyze_window_range(delta_range)

    print("{:>8} {:>10} {:>12} {:>12}".format(
        "Delta", "Avg WSS", "Fault Rate", "Efficiency"))
    print("-" * 45)
    for r in results:
        print("{:>8} {:>10.2f} {:>11.4f} {:>12.4f}".format(
            r['delta'], r['avg_wss'], r['fault_rate'], r['efficiency']))

    # Find knee point
    knee_delta, knee_eff = analyzer.find_knee_point(results)
    print(f"Knee point: Δ = {knee_delta} (efficiency = {knee_eff:.4f})")

    print("Interpretation:")
    print("- Very small Δ (5-10): High fault rate, small WSS")
    print("- Medium Δ (50-200): Good balance of faults and memory")
    print("- Large Δ (1000+): Minimal faults but large WSS")
    print("- The 'knee' represents optimal efficiency point")


def analyze_phase_transitions():
    """Analyze how window size affects adaptation to phase changes."""
    print("\n" + "=" * 70)
    print("Window Size and Phase Transition Adaptation")
    print("=" * 70)

    # Create trace with abrupt phase change
    # Phase 1 (t=0-999): pages 0-4
    # Phase 2 (t=1000-1999): pages 10-14
    trace = []
    for _ in range(200):
        trace.extend([0, 1, 2, 3, 4])        # 1000 refs to pages 0-4
    for _ in range(200):
        trace.extend([10, 11, 12, 13, 14])   # 1000 refs to pages 10-14

    # Measure working sets at various points for different deltas
    print("Working set contents at different times:")
    print("(Phase transition at t=1000)")
    for delta in [100, 500, 1000, 1500]:
        print(f"Δ = {delta}:")
        for t in [800, 999, 1100, 1500]:
            if t < len(trace):
                window_start = max(0, t - delta)
                ws = set(trace[window_start:t + 1])
                old_pages = {p for p in ws if p < 10}
                new_pages = {p for p in ws if p >= 10}
                print(f"  t={t}: old={sorted(old_pages)}, new={sorted(new_pages)}")

    print("Observations:")
    print("- Small Δ adapts quickly but may lose useful old pages prematurely")
    print("- Large Δ retains old pages longer, slowing adaptation")
    print("- During transitions, WSS temporarily includes BOTH phases' pages")


# Run demonstrations
if __name__ == "__main__":
    demonstrate_window_effects()
    analyze_phase_transitions()
```

Choosing an appropriate window size is crucial for effective working set management. Several strategies exist, ranging from simple static values to sophisticated adaptive approaches.
Static Window Selection:
The simplest approach uses a fixed Δ for all processes:
Typical Static Values:
Historically, time-based windows of 10-100 milliseconds (or reference-based equivalents of 10,000-100,000 references) have been common. These values work reasonably for desktop/server workloads but may be suboptimal for specific applications.
| Workload Type | Suggested Δ Range | Rationale |
|---|---|---|
| Tight computational loops | Small (10-50ms) | Rapid reuse, small active set |
| Database OLTP | Medium (50-100ms) | Transaction-sized working sets |
| Database OLAP | Large (100-500ms) | Large scan-based queries |
| Interactive applications | Medium (50-100ms) | Balance responsiveness and memory |
| Batch processing | Large (100-500ms) | Long-running phases |
| Mixed workloads | Adaptive | Different processes need different Δ |
A practical heuristic: choose Δ such that the working set captures approximately the 'knee' of the miss rate curve—the point where adding more frames yields diminishing returns in fault reduction. For many workloads, this corresponds to where 80% of accesses are served from the working set (80% hit rate within the window). This leverages the common observation that locality follows something like an 80-20 rule.
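The 80%-hit-rate heuristic can be operationalized as a search over candidate window sizes. This is a hedged sketch (the helper names and the loop trace are illustrative): it returns the smallest Δ whose within-window hit rate meets the target.

```python
def hit_rate(trace, delta):
    """Fraction of references whose page appeared in the previous delta refs."""
    hits = sum(1 for t, p in enumerate(trace)
               if p in trace[max(0, t - delta):t])
    return hits / len(trace)

def smallest_delta_for_hit_rate(trace, candidates, target=0.80):
    """Smallest candidate delta whose within-window hit rate reaches target."""
    for delta in sorted(candidates):
        if hit_rate(trace, delta) >= target:
            return delta
    return None  # no candidate meets the target

trace = [i % 10 for i in range(1000)]  # loop over 10 pages
print(smallest_delta_for_hit_rate(trace, [1, 2, 5, 9, 10, 20]))  # -> 10
```

For this loop trace the hit rate jumps from essentially 0% at Δ = 9 to 99% at Δ = 10, so the heuristic lands exactly on the loop period; real workloads produce a smoother curve with a less dramatic knee.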
Adaptive Window Selection:
More sophisticated systems adjust Δ dynamically based on observed behavior:
Approach 1: Fault-Rate Guided
Approach 2: WSS-Change Guided
Approach 3: Per-Process Δ
Approach 4: Phase-Aware Δ
```python
# Adaptive window size selection strategies

class AdaptiveWindowManager:
    """
    Manages window size adaptively based on observed behavior.
    Implements multiple strategies for dynamic Δ adjustment.
    """

    def __init__(self, initial_delta, min_delta=10, max_delta=10000):
        self.delta = initial_delta
        self.min_delta = min_delta
        self.max_delta = max_delta

        # Tracking state
        self.recent_faults = []
        self.recent_wss = []
        self.adjustment_interval = 100  # Adjust every N references
        self.refs_since_adjust = 0

        # Tuning parameters
        self.target_fault_rate = 0.01      # Target 1% fault rate
        self.fault_rate_tolerance = 0.005  # +/- 0.5%

    def record_reference(self, was_fault, current_wss):
        """Record a memory reference and potentially adjust delta."""
        self.recent_faults.append(1 if was_fault else 0)
        self.recent_wss.append(current_wss)
        self.refs_since_adjust += 1

        # Keep bounded history
        max_history = self.adjustment_interval * 10
        if len(self.recent_faults) > max_history:
            self.recent_faults = self.recent_faults[-max_history:]
            self.recent_wss = self.recent_wss[-max_history:]

        # Periodic adjustment
        if self.refs_since_adjust >= self.adjustment_interval:
            self._adjust_delta()
            self.refs_since_adjust = 0

    def _adjust_delta(self):
        """
        Adjust delta based on recent behavior.
        Uses fault rate as primary signal for adjustment.
        """
        if len(self.recent_faults) < self.adjustment_interval:
            return

        # Compute recent fault rate
        recent = self.recent_faults[-self.adjustment_interval:]
        fault_rate = sum(recent) / len(recent)

        # Adjustment logic
        old_delta = self.delta
        if fault_rate > self.target_fault_rate + self.fault_rate_tolerance:
            # Fault rate too high: increase delta to capture more pages
            increase_factor = 1 + (fault_rate - self.target_fault_rate) * 10
            self.delta = min(int(self.delta * increase_factor), self.max_delta)
        elif fault_rate < self.target_fault_rate - self.fault_rate_tolerance:
            # Fault rate very low: might be wasting memory, decrease delta.
            # Be more conservative about decreasing (prefer extra memory to thrashing).
            decrease_factor = 0.95  # Only 5% decrease at a time
            self.delta = max(int(self.delta * decrease_factor), self.min_delta)

        if self.delta != old_delta:
            print(f"Adjusted Δ: {old_delta} → {self.delta} "
                  f"(fault_rate={fault_rate:.4f})")

    def get_current_delta(self):
        return self.delta


class PerProcessWindowManager:
    """
    Manage separate window sizes for each process.
    Recognizes that different processes have different locality
    characteristics and need different Δ values.
    """

    def __init__(self, default_delta=1000):
        self.default_delta = default_delta
        self.process_managers = {}  # pid -> AdaptiveWindowManager

    def get_delta_for_process(self, pid):
        """Get current delta for a specific process."""
        if pid not in self.process_managers:
            self.process_managers[pid] = AdaptiveWindowManager(
                initial_delta=self.default_delta
            )
        return self.process_managers[pid].get_current_delta()

    def record_reference(self, pid, was_fault, current_wss):
        """Record reference for a specific process."""
        if pid not in self.process_managers:
            self.process_managers[pid] = AdaptiveWindowManager(
                initial_delta=self.default_delta
            )
        self.process_managers[pid].record_reference(was_fault, current_wss)

    def get_statistics(self):
        """Get per-process delta statistics."""
        stats = {}
        for pid, manager in self.process_managers.items():
            stats[pid] = {
                'delta': manager.get_current_delta(),
                'fault_history_len': len(manager.recent_faults)
            }
        return stats


class PhaseAwareWindowManager:
    """
    Manage window size with awareness of execution phases.
    Detects phase transitions and adjusts behavior accordingly.
    """

    def __init__(self, initial_delta=1000):
        self.delta = initial_delta
        self.base_delta = initial_delta

        # Phase detection state
        self.recent_pages = []
        self.phase_window = 100  # Look at last N accesses for phase detection
        self.previous_phase_signature = set()
        self.in_transition = False
        self.transition_start = 0

    def record_page_access(self, page, time):
        """Record page access and detect phase transitions."""
        self.recent_pages.append(page)
        if len(self.recent_pages) > self.phase_window * 2:
            self.recent_pages = self.recent_pages[-self.phase_window * 2:]

        if len(self.recent_pages) >= self.phase_window:
            self._detect_phase_transition(time)

    def _detect_phase_transition(self, current_time):
        """Detect if we're transitioning between phases."""
        current_signature = set(self.recent_pages[-self.phase_window:])

        if self.previous_phase_signature:
            # Jaccard similarity between current and previous phase
            intersection = len(current_signature & self.previous_phase_signature)
            union = len(current_signature | self.previous_phase_signature)
            similarity = intersection / union if union > 0 else 0

            if similarity < 0.3:  # Less than 30% overlap = phase transition
                if not self.in_transition:
                    print(f"Phase transition detected at time {current_time}")
                    self.in_transition = True
                    self.transition_start = current_time
                    # Temporarily increase delta during transition
                    self.delta = int(self.base_delta * 1.5)
            elif similarity > 0.8 and self.in_transition:
                # Phase has stabilized
                print(f"Phase stabilized at time {current_time}")
                self.in_transition = False
                self.delta = self.base_delta

        self.previous_phase_signature = current_signature

    def get_current_delta(self):
        return self.delta


# Demonstration of adaptive approaches
def demonstrate_adaptive_window():
    print("Adaptive Window Size Demonstration")
    print("=" * 70)

    manager = AdaptiveWindowManager(initial_delta=100)

    # Simulate workload with changing characteristics
    # Phase 1: Good locality, low faults
    print("Phase 1: Good locality (simulated low fault rate)")
    for i in range(500):
        was_fault = (i % 50 == 0)  # ~2% fault rate
        manager.record_reference(was_fault, current_wss=10)
    print(f"Delta after Phase 1: {manager.get_current_delta()}")

    # Phase 2: Poor locality, high faults
    print("Phase 2: Poor locality (simulated high fault rate)")
    for i in range(500):
        was_fault = (i % 10 == 0)  # ~10% fault rate
        manager.record_reference(was_fault, current_wss=20)
    print(f"Delta after Phase 2: {manager.get_current_delta()}")

    # Phase 3: Return to good locality
    print("Phase 3: Return to good locality")
    for i in range(500):
        was_fault = (i % 100 == 0)  # ~1% fault rate
        manager.record_reference(was_fault, current_wss=10)
    print(f"Delta after Phase 3: {manager.get_current_delta()}")

    print("Key insight: Delta adapts to maintain target fault rate")
    print("- High faults → increase delta → larger working set → fewer faults")
    print("- Low faults → decrease delta → smaller working set → save memory")


if __name__ == "__main__":
    demonstrate_adaptive_window()
```

The window size has effects beyond individual processes—it impacts overall system behavior, multiprogramming levels, and the stability of memory management.
Multiprogramming Level Connection:
The sum of all working set sizes determines how many processes can run concurrently:
Multiprogramming Level = Total Frames / Average(WSS)
Since WSS increases with Δ, the choice of window size directly affects multiprogramming:
The optimal balance maximizes CPU utilization while keeping fault rates acceptable.
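Plugging illustrative numbers into the formula above makes the effect concrete (the frame counts here are hypothetical, not from the text):

```python
# Multiprogramming Level = Total Frames / Average(WSS)
total_frames = 1000  # illustrative machine size

for avg_wss in (25, 50, 100, 200):  # larger delta -> larger average WSS
    level = total_frames // avg_wss
    print(f"avg WSS = {avg_wss:>3} frames -> about {level} resident processes")
```

Doubling the average WSS (e.g., by enlarging Δ) halves the number of processes whose working sets fit in memory simultaneously.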
System Stability Concerns:
Window size affects system stability in several ways:
Thrashing Threshold: The system thrashes when Σ WSS > Total Memory. A larger Δ makes each WSS larger, lowering the threshold at which thrashing begins. However, each process with its working set resident has lower fault rate, so the trade-off isn't simple.
Oscillation Risk: With adaptive Δ, there's risk of oscillation:
Good adaptive algorithms include damping to prevent oscillation.
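One way to add such damping, sketched here as a hypothetical controller (the class name, constants, and step sizes are illustrative): smooth the fault-rate signal exponentially and only adjust Δ when the smoothed value leaves a dead band around the target.

```python
class DampedDeltaController:
    """Sketch of a damped adaptive-delta controller: exponential smoothing
    filters out transient fault spikes, and a dead band plus asymmetric
    step sizes keep delta from oscillating around the target."""

    def __init__(self, delta, target=0.01, band=0.005, alpha=0.1):
        self.delta = delta
        self.target = target      # target fault rate
        self.band = band          # dead band: no change while inside it
        self.alpha = alpha        # smoothing factor
        self.smoothed = target    # start at target -> no initial kick

    def update(self, observed_fault_rate):
        # Exponential smoothing damps out single-interval spikes.
        self.smoothed += self.alpha * (observed_fault_rate - self.smoothed)
        if self.smoothed > self.target + self.band:
            self.delta = int(self.delta * 1.25)           # grow quickly when faulting
        elif self.smoothed < self.target - self.band:
            self.delta = max(1, int(self.delta * 0.95))   # shrink slowly
        return self.delta
```

A single noisy interval barely moves the smoothed rate, so Δ changes only in response to a sustained trend rather than flip-flopping each adjustment period.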
Latency Sensitivity: Larger Δ means pages stay resident longer. For latency-sensitive applications, this can be beneficial (fewer fault stalls). For batch processing, smaller Δ may be acceptable if overall throughput is the goal.
There's often a 'cliff' in system behavior as memory pressure increases. With Δ too large, the system suddenly tips from stable operation to thrashing when one more process is admitted. With Δ too small, performance degrades gradually but never catastrophically. Choosing slightly smaller Δ than optimal provides a safety margin against the cliff.
```python
# Modeling system-wide effects of window size
import random


class SystemSimulator:
    """
    Simulate system-wide effects of window size choices.
    Models multiple processes competing for memory with
    different window size policies.
    """

    def __init__(self, total_frames, num_processes):
        self.total_frames = total_frames
        self.num_processes = num_processes

        # Per-process state
        self.process_wss = {}      # pid -> current wss
        self.process_faults = {}   # pid -> fault count
        self.active_processes = set()
        self.suspended_processes = set()

    def simulate_with_delta(self, delta, num_refs, locality_strength=0.8):
        """
        Simulate system behavior with given delta.

        Args:
            delta: Working set window (same for all processes)
            num_refs: Number of references to simulate
            locality_strength: 0-1, higher = more locality (easier workload)

        Returns dict with system-wide metrics.
        """
        # Initialize processes
        pages_per_process = 100
        for pid in range(self.num_processes):
            self.process_wss[pid] = 0
            self.process_faults[pid] = 0
            self.active_processes.add(pid)

        # Simulate memory accesses
        total_faults = 0
        total_active_time = 0
        history = {pid: [] for pid in range(self.num_processes)}

        for ref in range(num_refs):
            # Round-robin process selection (simplified)
            pid = ref % self.num_processes
            if pid in self.suspended_processes:
                continue
            total_active_time += 1

            # Generate page access with given locality
            if history[pid] and random.random() < locality_strength:
                # Access a page from recent history
                recent = history[pid][-max(1, delta // 10):]
                page = random.choice(recent)
            else:
                # Random page access
                page = random.randint(0, pages_per_process - 1)

            history[pid].append(page)
            if len(history[pid]) > delta * 2:
                history[pid] = history[pid][-delta * 2:]

            # Compute working set
            ws = set(history[pid][-delta:])
            self.process_wss[pid] = len(ws)

            # Check if access is a fault (not in working set before this access)
            prior_ws = set(history[pid][-delta:-1]) if len(history[pid]) > 1 else set()
            if page not in prior_ws:
                total_faults += 1
                self.process_faults[pid] += 1

            # Check for thrashing (sum of WSS > memory)
            total_wss = sum(self.process_wss.values())
            if total_wss > self.total_frames and len(self.active_processes) > 1:
                # Suspend process with largest WSS
                victim = max(self.active_processes,
                             key=lambda p: self.process_wss[p])
                self.active_processes.remove(victim)
                self.suspended_processes.add(victim)

        # Compute metrics
        fault_rate = total_faults / total_active_time if total_active_time > 0 else 0
        avg_wss = sum(self.process_wss.values()) / len(self.process_wss)

        return {
            'delta': delta,
            'fault_rate': fault_rate,
            'avg_wss': avg_wss,
            'total_wss': sum(self.process_wss.values()),
            'active_processes': len(self.active_processes),
            'suspended_processes': len(self.suspended_processes),
            'memory_utilization': min(sum(self.process_wss.values())
                                      / self.total_frames, 1.0),
            'throughput_estimate': (1 - fault_rate) * len(self.active_processes)
        }


def analyze_system_wide_effects():
    """Analyze how window size affects system-wide behavior."""
    print("System-Wide Effects of Window Size")
    print("=" * 70)
    print("Configuration:")
    print("  Total memory frames: 500")
    print("  Number of processes: 20")
    print("  Pages per process: 100")
    print("  Workload locality: 80%")
    print()

    # Test different delta values
    delta_values = [10, 25, 50, 100, 200, 500, 1000]

    print("{:>8} {:>10} {:>10} {:>10} {:>12} {:>12}".format(
        "Delta", "Fault%", "Avg WSS", "Total WSS", "Active", "Throughput"))
    print("-" * 65)

    for delta in delta_values:
        sim = SystemSimulator(total_frames=500, num_processes=20)
        result = sim.simulate_with_delta(delta, num_refs=10000,
                                         locality_strength=0.8)
        print("{:>8} {:>9.2%} {:>10.1f} {:>10.0f} {:>12} {:>12.1f}".format(
            result['delta'], result['fault_rate'], result['avg_wss'],
            result['total_wss'], result['active_processes'],
            result['throughput_estimate']
        ))

    print("Observations:")
    print("- Larger delta → larger WSS → fewer processes can be active")
    print("- Smaller delta → smaller WSS → more processes active but higher faults")
    print("- Throughput peaks at intermediate delta (balance of processes vs faults)")
    print("- Total WSS exceeding memory → process suspension (thrashing prevention)")


if __name__ == "__main__":
    analyze_system_wide_effects()
```

Implementing working set tracking in real operating systems requires addressing several practical challenges related to the window parameter.
Timer-Based vs Reference-Based Windows:
The theoretical definition uses reference counts, but most implementations use time-based windows:
| Aspect | Reference-Based | Time-Based |
|---|---|---|
| Tracking granularity | Per memory reference | Per timer tick |
| Overhead | High (trap every reference) | Low (periodic check) |
| Precision | Exact | Approximate |
| Fairness | Naturally fair (references are work) | May favor CPU-bound processes |
| Implementation | Requires hardware support | Uses timer interrupts |
Practical Choice: Almost all real implementations use time-based windows, accepting the approximation for dramatically lower overhead.
Choosing Timer Resolution:
For a time-based implementation, the timer resolution bounds precision: the window boundary is only known to within one tick, so a coarser timer means a blurrier working set.
Reference Bit Limitations:
Hardware reference bits are binary (accessed/not-accessed since last clear). This limits the precision of any single observation.
Multiple Reference Bits:
Some systems simulate multiple reference bits using software techniques:
```c
/*
 * Practical Working Set Window Implementation
 *
 * This demonstrates time-based working set tracking using
 * the reference bit aging technique common in real OS kernels.
 */

#include <stdint.h>
#include <stdbool.h>

#define MAX_PAGES 4096
#define WINDOW_INTERVALS 8  // Track over 8 timer intervals

typedef struct {
    uint8_t age;            // Bit vector: bit i = accessed in interval (now - i)
    bool reference_bit;     // Hardware reference bit (set by MMU on access)
    bool modified_bit;      // Hardware modified bit
    uint64_t frame_number;  // Physical frame if resident
    bool present;           // Currently in memory
} PageTableEntry;

typedef struct {
    PageTableEntry pages[MAX_PAGES];
    int num_pages;
    int wss_estimate;       // Current working set size estimate

    // Statistics
    uint64_t total_faults;
    uint64_t total_accesses;
} ProcessMemoryState;

/*
 * Update working set tracking on timer interrupt.
 *
 * Called at regular intervals (e.g., 10ms) by timer handler.
 * This is O(num_pages) but amortized over many instructions.
 */
void timer_tick_update(ProcessMemoryState *state) {
    int working_set_size = 0;

    for (int i = 0; i < state->num_pages; i++) {
        PageTableEntry *pte = &state->pages[i];

        // Shift age bits right (age all pages)
        pte->age = pte->age >> 1;

        // Add reference bit as new MSB
        if (pte->reference_bit) {
            pte->age |= (1 << (WINDOW_INTERVALS - 1));
        }

        // Clear reference bit for next interval
        pte->reference_bit = false;

        // Page is in working set if ANY bit is set
        if (pte->age != 0) {
            working_set_size++;
        }
    }

    state->wss_estimate = working_set_size;
}

/*
 * Check if page is in working set.
 * O(1) - just check if age bits are non-zero.
 */
bool is_in_working_set(ProcessMemoryState *state, int page_num) {
    if (page_num < 0 || page_num >= state->num_pages) {
        return false;
    }
    return state->pages[page_num].age != 0;
}

/*
 * Get relative age of a page (for replacement decisions).
 * Lower value = more recently used.
 */
int get_page_age(ProcessMemoryState *state, int page_num) {
    if (page_num < 0 || page_num >= state->num_pages) {
        return WINDOW_INTERVALS + 1;  // "Very old"
    }
    uint8_t age = state->pages[page_num].age;

    // Count leading zeros to find most recent access
    if (age == 0) return WINDOW_INTERVALS + 1;

    int recency = 0;
    while ((age & (1 << (WINDOW_INTERVALS - 1))) == 0) {
        age <<= 1;
        recency++;
    }
    return recency;
}

/*
 * Select victim page for replacement.
 * Returns page NOT in working set with oldest age.
 */
int select_victim_page(ProcessMemoryState *state) {
    int best_victim = -1;
    int best_age = -1;

    for (int i = 0; i < state->num_pages; i++) {
        PageTableEntry *pte = &state->pages[i];

        // Only consider present pages
        if (!pte->present) continue;

        // Prefer pages NOT in working set
        if (pte->age == 0) {
            // Not in working set - immediate candidate.
            // Among these, prefer unmodified (clean) pages.
            if (!pte->modified_bit) {
                return i;  // Best case: not in WS and clean
            }
            if (best_victim < 0) {
                best_victim = i;
            }
        }
    }

    // If all pages are in working set, fall back to LRU approximation
    if (best_victim < 0) {
        for (int i = 0; i < state->num_pages; i++) {
            PageTableEntry *pte = &state->pages[i];
            if (!pte->present) continue;

            int age = get_page_age(state, i);
            if (age > best_age) {
                best_age = age;
                best_victim = i;
            }
        }
    }

    return best_victim;
}

/*
 * Adjust effective window size dynamically.
 *
 * This modifies how we interpret the age bits to
 * effectively change the window without re-tracking.
 */
int compute_wss_with_effective_window(ProcessMemoryState *state,
                                      int effective_intervals) {
    if (effective_intervals > WINDOW_INTERVALS) {
        effective_intervals = WINDOW_INTERVALS;
    }

    // Create mask covering the most recent N intervals
    uint8_t mask = (1 << effective_intervals) - 1;
    mask <<= (WINDOW_INTERVALS - effective_intervals);

    int wss = 0;
    for (int i = 0; i < state->num_pages; i++) {
        if ((state->pages[i].age & mask) != 0) {
            wss++;
        }
    }
    return wss;
}

/*
 * Example: Log working set dynamics for analysis.
 */
void log_wss_dynamics(ProcessMemoryState *state, uint64_t timestamp) {
    // Could write to trace buffer for offline analysis
    // Format: timestamp, wss, fault_rate, ...
    double recent_fault_rate =
        (double)state->total_faults / (state->total_accesses + 1);
    (void)recent_fault_rate;
    (void)timestamp;
    // In a real implementation, would log to kernel trace buffer
    // or update /proc filesystem statistics
}
```

Linux uses a variant of this approach with its 'Active' and 'Inactive' lists. Pages on the active list were recently accessed; pages on the inactive list are candidates for eviction. The lists approximate working set membership without explicit window tracking. Windows uses a similar working set manager that tracks per-process resident sets with adaptive trimming policies.
The working set window parameter Δ is the critical control variable that determines how the working set model behaves in practice. Its selection involves fundamental trade-offs between memory consumption and page fault rate.
You now deeply understand the working set window parameter—how to interpret it, how to choose appropriate values, and how it affects both individual process and system-wide behavior. In the next page, we'll explore techniques for tracking working set size over time.
Next Up: WSS Tracking
With the window defined and understood, we need mechanisms to track the working set size of each process over time. We'll explore practical algorithms for WSS estimation, the overhead-accuracy trade-offs, and how modern OS kernels implement working set tracking to enable dynamic memory allocation.