In 1968, Peter Denning introduced a concept that would fundamentally transform how operating systems manage memory: the working set model. Before this breakthrough, memory management was largely reactive—systems would respond to page faults as they occurred, often leading to catastrophic performance degradation known as thrashing.
The working set model provided something revolutionary: a predictive framework for understanding which pages a process truly needs at any given moment. Rather than treating all pages equally or making arbitrary eviction decisions, the working set approach recognizes a fundamental truth about program behavior—locality of reference.
This page explores the working set model in exhaustive detail, from its theoretical foundations to its mathematical formalization, establishing the groundwork for understanding modern memory management policies.
By the end of this page, you will understand: (1) The historical context and motivation for the working set model, (2) The principle of locality and its three manifestations, (3) The formal mathematical definition of a working set, (4) How working sets capture program behavior, and (5) Why the working set model became foundational to memory management theory.
To appreciate the significance of the working set model, we must first understand the severe challenges that plagued early multiprogramming systems.
The Early Virtual Memory Dilemma:
When virtual memory systems were first developed in the 1960s, designers faced a fundamental question: How many frames should each process receive? The candidate answers, summarized in the table below, seemed deceptively simple.
Each of these approaches harbored critical flaws that would manifest catastrophically under load.
| Strategy | Approach | Critical Flaw | Failure Mode |
|---|---|---|---|
| Equal Allocation | Divide frames equally among processes | Ignores actual memory needs | Small processes waste frames; large processes starve |
| Free Competition | Global replacement, no per-process limits | No isolation between processes | One process can steal all frames from others |
| Size-Proportional | Allocate based on virtual address space size | Confuses address space with actual usage | Sparse address spaces get excessive allocation |
| Fixed Minimum | Ensure each process has minimum frames | Static allocation ignores phase behavior | Cannot adapt to changing memory demands |
The Thrashing Catastrophe:
The most devastating failure mode was thrashing, a condition in which the system spends more time handling page faults than executing useful work. Consider this progression:
1. A process holds too few frames for the pages it is actively using, so it faults frequently.
2. Each fault forces a page-out and a page-in on the slow paging device, and the process stalls while it waits.
3. With many processes waiting on the paging device, CPU utilization drops.
4. The scheduler, seeing an idle CPU, raises the degree of multiprogramming, and the newly admitted processes take frames away from those already struggling.
5. Fault rates climb further, and throughput collapses.
The terrifying aspect of thrashing is its positive feedback loop—once it starts, it accelerates. More page faults lead to more context switches, which leads to more diverse page references, which leads to more page faults.
Traditional wisdom suggested that increasing the degree of multiprogramming (more processes) would increase CPU utilization. But beyond a certain point, the opposite occurs—utilization plummets. Before the working set model, system administrators had no principled way to determine this tipping point.
What Was Missing:
The fundamental gap in early approaches was predictive understanding of process memory behavior. Systems needed to answer: How many frames does a given process actually need right now? Which of its pages matter at this moment? And how many processes can safely run at once before the system tips into thrashing?
Peter Denning's working set model provided elegant answers to all these questions by formalizing the observation that programs exhibit locality of reference.
The working set model rests on a profound empirical observation about program behavior: programs do not reference their entire address space uniformly. Instead, at any given time, they focus their memory accesses on a small subset of their pages. This phenomenon is called the principle of locality, and it manifests in three distinct forms.
Why Locality Exists:
Locality is not accidental; it emerges from fundamental properties of how programs are written and laid out in memory:
Program Structure: Instructions execute largely in sequence, and loops run the same small group of instructions over and over, so code references cluster on a few pages at a time.
Algorithm Behavior: Most algorithms work on one data structure, or one region of it, for an extended stretch before moving on, as when traversing an array or processing records one at a time.
Memory Layout: Related data is placed together; stack frames, array elements, and the fields of a structure occupy adjacent addresses, so touching one item tends to touch its neighbors on the same page.
Empirical studies consistently show that programs spend approximately 90% of their execution time accessing only 10% of their code and data. This extreme concentration makes intelligent memory management not just possible, but highly effective. If programs truly used all their pages uniformly, virtual memory would be far less practical.
Locality and Program Phases:
Programs don't maintain constant locality—they move through phases, each with its own working set of pages:
| Phase | Example | Pages Referenced |
|---|---|---|
| Initialization | Program startup, library loading | Loader, initialization routines, global data |
| Input Processing | Reading configuration, parsing input | I/O libraries, parser code, input buffers |
| Computation | Core algorithm execution | Algorithm code, working data structures |
| Output Generation | Producing results | Output routines, formatting code, output buffers |
| Cleanup | Resource deallocation, exit | Cleanup routines, finalization code |
During each phase, the program's memory requirements may differ dramatically. The working set model captures this phase behavior by defining the working set relative to recent history, automatically adapting as the program transitions between phases.
With the principle of locality established, we can now formalize the working set concept. Denning's definition is elegant and precise:
Definition: The working set of a process at virtual time t with window size Δ, denoted W(t, Δ), is the set of pages referenced during the time interval from t - Δ to t.
Let's unpack each component of this definition carefully.
Mathematical Formalization:
Let r₁, r₂, r₃, ... be the sequence of page references made by a process, where rᵢ is the page referenced at virtual time i.
Then the working set is defined as:
W(t, Δ) = { rᵢ | t - Δ < i ≤ t }
And the working set size is:
|W(t, Δ)| = number of distinct pages in W(t, Δ)
Consider a page reference string: 1, 2, 3, 2, 1, 4, 5, 4, 3, 2
At t = 10 with Δ = 5, we look at the references at positions 6 through 10, which are: 4, 5, 4, 3, 2.
W(10, 5) = {4, 5, 3, 2} = {2, 3, 4, 5} |W(10, 5)| = 4 pages
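To make the definition concrete, here is a minimal Python sketch (the helper name `working_set` is illustrative, not from Denning's paper) that computes W(t, Δ) directly from the set-builder definition above; run on this reference string it reproduces W(10, 5) = {2, 3, 4, 5}.

```python
def working_set(refs, t, delta):
    """Return W(t, delta): the distinct pages referenced in the
    window t - delta < i <= t, where refs[i - 1] is the page
    referenced at virtual time i (1-indexed)."""
    return {refs[i - 1] for i in range(max(1, t - delta + 1), t + 1)}

refs = [1, 2, 3, 2, 1, 4, 5, 4, 3, 2]
ws = working_set(refs, t=10, delta=5)
print(sorted(ws), len(ws))   # [2, 3, 4, 5] 4
```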
Properties of Working Sets:
Working sets exhibit several important mathematical properties:
1. Monotonicity with Δ: If Δ₁ ≤ Δ₂, then W(t, Δ₁) ⊆ W(t, Δ₂)
A larger window always includes at least the pages in a smaller window, so the working set (and its size) never shrinks as Δ grows; widening the window can only add pages.
2. Bounds on Working Set Size: 1 ≤ |W(t, Δ)| ≤ min(Δ, N)
Where N is the total number of pages in the process's address space. The working set always contains at least one page (the most recently referenced) and at most Δ pages (if every reference in the window is to a different page) or N pages (if the entire address space is covered).
3. Inclusion Property: If page P ∈ W(t, Δ), then either P was referenced at time t itself, or P was already a member of W(t-1, Δ). A page can enter the working set only by being referenced, and it leaves only after going unreferenced for Δ time units.
4. Temporal Coherence: For most programs, W(t, Δ) ≈ W(t+1, Δ) — the working set changes slowly over time. Dramatic changes occur only at phase transitions.
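These properties can be checked mechanically on any trace. Here is a small sketch, reusing the same illustrative `working_set` helper (restated so the snippet stands alone), that asserts monotonicity and the size bounds over a sample reference string:

```python
def working_set(refs, t, delta):
    # W(t, delta) = { refs[i - 1] : t - delta < i <= t }
    return {refs[i - 1] for i in range(max(1, t - delta + 1), t + 1)}

refs = [1, 2, 3, 2, 1, 4, 1, 2, 5, 1, 2, 3, 4, 5]
N = len(set(refs))                     # distinct pages that appear in the trace

for t in range(1, len(refs) + 1):
    for d1 in range(1, 9):
        # Bounds: 1 <= |W(t, d1)| <= min(d1, N)
        size = len(working_set(refs, t, d1))
        assert 1 <= size <= min(d1, N)
        for d2 in range(d1, 9):
            # Monotonicity: a wider window never loses pages.
            assert working_set(refs, t, d1) <= working_set(refs, t, d2)

print("monotonicity and size bounds hold on this trace")
```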
To build intuition for how working sets behave, let's trace through a detailed example. Consider a program with the following page reference string:
Reference String: 1, 2, 3, 2, 1, 4, 1, 2, 5, 1, 2, 3, 4, 5
We'll compute the working set at each time step with Δ = 4 (a window of 4 references).
| Time (t) | Reference | Window References | W(t, 4) | |W(t, 4)| |
|---|---|---|---|---|
| 1 | 1 | 1 | {1} | 1 |
| 2 | 2 | 1, 2 | {1, 2} | 2 |
| 3 | 3 | 1, 2, 3 | {1, 2, 3} | 3 |
| 4 | 2 | 1, 2, 3, 2 | {1, 2, 3} | 3 |
| 5 | 1 | 2, 3, 2, 1 | {1, 2, 3} | 3 |
| 6 | 4 | 3, 2, 1, 4 | {1, 2, 3, 4} | 4 |
| 7 | 1 | 2, 1, 4, 1 | {1, 2, 4} | 3 |
| 8 | 2 | 1, 4, 1, 2 | {1, 2, 4} | 3 |
| 9 | 5 | 4, 1, 2, 5 | {1, 2, 4, 5} | 4 |
| 10 | 1 | 1, 2, 5, 1 | {1, 2, 5} | 3 |
| 11 | 2 | 2, 5, 1, 2 | {1, 2, 5} | 3 |
| 12 | 3 | 5, 1, 2, 3 | {1, 2, 3, 5} | 4 |
| 13 | 4 | 1, 2, 3, 4 | {1, 2, 3, 4} | 4 |
| 14 | 5 | 2, 3, 4, 5 | {2, 3, 4, 5} | 4 |
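The table above can be regenerated mechanically; this short sketch (same illustrative helper as before) prints one row per reference:

```python
def working_set(refs, t, delta):
    return {refs[i - 1] for i in range(max(1, t - delta + 1), t + 1)}

refs = [1, 2, 3, 2, 1, 4, 1, 2, 5, 1, 2, 3, 4, 5]
DELTA = 4

print(f"{'t':>3} {'ref':>4}  {'W(t, 4)':<15} |W|")
for t, page in enumerate(refs, start=1):
    ws = working_set(refs, t, DELTA)
    print(f"{t:>3} {page:>4}  {str(sorted(ws)):<15} {len(ws)}")
```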
Observations from the Trace:
1. Working Set Stability: Notice how the working set remains relatively stable. From t=4 to t=5, despite different references, the working set stays {1, 2, 3}. This stability is what makes working sets useful—they change slowly, tracking genuine phase transitions rather than individual references.
2. Page Entry and Exit: Page 3 exits the working set at t=7 (its last reference, at t=3, falls outside the window) but re-enters at t=12 when referenced again. The working set naturally captures which pages are 'currently important.'
3. Size Fluctuation: The working set size fluctuates between 3 and 4 pages. If we allocate exactly |W(t, Δ)| frames to this process at each moment, we'd need between 3 and 4 frames at any time, fewer than the 5 distinct pages the program touches over its whole run.
4. Effect of Window Size: With a larger Δ (say, 6 instead of 4), working sets would be larger and change more slowly. With a smaller Δ (say, 2), they'd be smaller but potentially miss important pages, causing page faults.
Think of the working set as a 'sliding window' over the process's recent memory behavior. Pages enter when referenced and stay until they haven't been accessed for Δ references. It's like a 'recently used' list with automatic expiration.
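One way to realize this 'recently used list with automatic expiration' is to remember only each page's last reference time: a page belongs to the working set while fewer than Δ references have elapsed since it was last touched. A minimal sketch follows; the class name and interface are illustrative assumptions, not a standard API.

```python
class WorkingSetTracker:
    """Incrementally tracks W(t, delta) by remembering, for each page,
    the virtual time of its most recent reference."""

    def __init__(self, delta):
        self.delta = delta
        self.t = 0            # current virtual time (number of references so far)
        self.last_ref = {}    # page -> time of its most recent reference

    def reference(self, page):
        self.t += 1
        self.last_ref[page] = self.t

    def working_set(self):
        # Page p is in W(t, delta) iff its last reference i satisfies t - i < delta.
        return {p for p, i in self.last_ref.items() if self.t - i < self.delta}

tracker = WorkingSetTracker(delta=4)
for page in [1, 2, 3, 2, 1, 4, 1, 2, 5, 1, 2, 3, 4, 5]:
    tracker.reference(page)

print(sorted(tracker.working_set()))   # [2, 3, 4, 5], matching the table at t = 14
```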
The working set model leads to a fundamental principle for memory management, often called the Working Set Principle or Denning's Principle:
A process should be allocated frames at least equal to its working set size. If the sum of all processes' working set sizes exceeds total available frames, then some processes must be suspended (not all can run efficiently simultaneously).
This principle has profound implications for operating system design.
The Working Set Balance Equation:
Let WSS_i denote the working set size of process i at a given moment, and let M denote total available frames. The system is in balance when:
∑ WSS_i ≤ M (for all currently running processes)
When this inequality is satisfied, each process can hold its entire working set in memory, page faults are minimal (only when transitioning between phases), and CPU utilization is high.
When the inequality is violated:
∑ WSS_i > M (system is overcommitted)
Some processes cannot hold their working sets, leading to sustained page faulting, falling CPU utilization, and ultimately thrashing.
The OS must respond by suspending one or more processes (moving them entirely out of memory) until balance is restored.
When WSS sum exceeds M, the OS faces a critical decision: which process(es) to suspend? This decision involves balancing multiple factors—process priority, how long each has run, how close each is to completion, and the size of each working set (suspending one large-WSS process may be more effective than suspending multiple small-WSS processes).
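Here is a sketch of the load-control decision the balance equation implies; the process names and the 'suspend the largest working set first' heuristic are illustrative assumptions, not a policy prescribed by the model.

```python
def overcommitted(wss, total_frames):
    """True when the combined working-set sizes of the running processes
    exceed the frames available (sum of WSS_i > M)."""
    return sum(wss.values()) > total_frames

def choose_suspensions(wss, total_frames):
    """Illustrative greedy heuristic: suspend the largest working sets first
    until the processes that remain fit in total_frames."""
    suspended, remaining = [], dict(wss)
    while remaining and sum(remaining.values()) > total_frames:
        victim = max(remaining, key=remaining.get)   # process with the largest WSS
        suspended.append(victim)
        del remaining[victim]
    return suspended

wss = {"editor": 20, "compiler": 120, "browser": 90}   # WSS_i in frames (made up)
M = 180                                                # total frames available
if overcommitted(wss, M):
    print("suspend:", choose_suspensions(wss, M))      # suspend: ['compiler']
```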
The window size Δ is the critical parameter of the working set model. Its selection represents a fundamental tradeoff between responsiveness and stability.
Effects of Different Δ Values:
If Δ is too small, the window fails to cover the process's current locality: pages still in active use drop out of the working set, the allocation is understated, and the process faults on pages it is about to need again. If Δ is too large, the window spans past localities: pages from a finished phase are still counted as needed, the allocation is overstated, and frames that could serve other processes sit idle.
The Ideal Δ:
The ideal window size is one that is just large enough to span the process's current locality (its current phase) while excluding pages left over from earlier phases.
Practical Considerations:
In practice, Δ is typically measured in memory references made by the process, or in CPU time the process has consumed, rather than in wall-clock time, and it is approximated (for example, by sampling hardware reference bits at periodic timer interrupts) rather than maintained exactly on every reference. Concrete values differ widely between systems and workloads, which is part of why tuning Δ is hard.
There is no universally 'correct' Δ. The optimal value depends on program behavior, workload mix, and system goals. This inherent difficulty in selecting Δ motivated alternative approaches like Page Fault Frequency (PFF), which we'll explore in a later page—PFF achieves similar goals without requiring explicit Δ selection.
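The tradeoff can be made concrete by measuring, for several Δ values on the same trace, the average working-set size and how often a reference touches a page that was not already in the previous working set (a rough proxy for page faults under working-set allocation). A sketch using this page's earlier example trace:

```python
def working_set(refs, t, delta):
    return {refs[i - 1] for i in range(max(1, t - delta + 1), t + 1)}

refs = [1, 2, 3, 2, 1, 4, 1, 2, 5, 1, 2, 3, 4, 5]

for delta in (2, 4, 6, 8):
    sizes = [len(working_set(refs, t, delta)) for t in range(1, len(refs) + 1)]
    # Count references to pages not already in the previous working set:
    # a rough proxy for faults if the allocation tracked W(t, delta) exactly.
    misses = sum(1 for t, page in enumerate(refs, start=1)
                 if page not in working_set(refs, t - 1, delta))
    avg = sum(sizes) / len(sizes)
    print(f"delta={delta}: average |W| = {avg:.2f}, misses = {misses}")
```

Larger Δ values reduce misses on this trace but raise the average allocation, which is exactly the responsiveness-versus-stability tradeoff described above.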
Students often confuse the working set model with the LRU (Least Recently Used) page replacement algorithm. While both leverage the principle of locality, they serve fundamentally different purposes and operate at different levels.
Key Distinctions:
| Aspect | Working Set Model | LRU Algorithm |
|---|---|---|
| Purpose | Determines how many frames a process needs | Determines which page to evict when a frame is needed |
| Level | Memory allocation policy | Page replacement policy |
| Scope | Per-process (allocation decision) | Global or local (eviction decision) |
| Output | A set of pages (the working set) | A single page (the victim) |
| Parameter | Window size Δ (time-based) | None inherent (though approximations have parameters) |
| Trigger | Continuous monitoring | Activated on page fault |
| Goal | Prevent thrashing through proper allocation | Minimize page faults through smart replacement |
Complementary, Not Competing:
Working set and LRU work at different levels of the memory management hierarchy: the working set model answers the allocation question (how many frames should this process hold, and how many processes can run at once?), while LRU answers the replacement question (which resident page should give up its frame right now?).
A well-designed system uses both: working-set-style allocation to size each process's frame pool and to decide when to suspend processes, and an LRU-style replacement policy to choose victims within that pool when a fault occurs.
The Relationship:
Interestingly, if you use working set-based allocation and need to evict a page, the page that will leave the working set next (under LRU ordering) is exactly the least recently used page. This means that evicting the LRU page is the same as evicting the page closest to aging out of the working set on its own, the page the model already regards as least essential.
The working set model can be seen as a way to dynamically adjust allocation so that LRU replacement rarely needs to evict pages that will be used again soon.
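A small sketch of that relationship: among the pages currently in the working set, the page with the oldest last reference is simultaneously the LRU victim and the first page that will age out of the window as it slides forward (helper names here are illustrative).

```python
def last_references(refs, t):
    """Map each page referenced by virtual time t to its most recent reference time."""
    last = {}
    for i, page in enumerate(refs[:t], start=1):
        last[page] = i
    return last

refs = [1, 2, 3, 2, 1, 4, 1, 2, 5, 1, 2, 3, 4, 5]
t, delta = 9, 4

last = last_references(refs, t)
in_ws = {p: i for p, i in last.items() if t - i < delta}   # pages in W(t, delta)

# The page with the oldest last reference is both the LRU victim among resident
# pages and the first page that will age out of the window as it slides forward.
victim = min(in_ws, key=in_ws.get)
print(sorted(in_ws), "next to leave / LRU:", victim)   # [1, 2, 4, 5] ... 4
```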
Both working set and LRU are manifestations of the same insight: recent history predicts future behavior. Working set uses this to allocate memory proactively; LRU uses it to replace pages reactively. The best systems combine proactive allocation with reactive replacement.
We have established the theoretical foundation of the working set model, one of the most influential concepts in operating systems memory management. Let's consolidate the key insights:
1. Programs exhibit locality of reference: at any moment they concentrate their accesses on a small fraction of their pages, and that focus shifts as they move through phases.
2. The working set W(t, Δ) is the set of distinct pages referenced in the window (t - Δ, t]; it is a measurable, automatically adapting estimate of what a process needs right now.
3. The working set principle: allocate each process at least |W(t, Δ)| frames, and keep ∑ WSS_i ≤ M by suspending processes when the system is overcommitted. This is what prevents thrashing.
4. The window size Δ is the model's critical parameter: too small and the working set misses active pages; too large and it over-allocates memory.
5. The working set model is an allocation policy, while LRU is a replacement policy; both exploit the same insight that recent history predicts near-future behavior.
What's Next:
In the next page, we'll explore the working set window in greater depth—examining how the window parameter affects system behavior, how to track working sets in practice, and the implementation challenges that led to approximate approaches.
You now understand the conceptual and mathematical foundations of the working set model. This understanding is essential for grasping how modern operating systems prevent thrashing and allocate memory dynamically. The working set concept remains central to memory management theory and practice, influencing designs from Linux's page reclamation to database buffer management.