Memory is finite. This simple truth underlies one of the most critical decisions in cache design: when the cache is full and a new item must be stored, what gets removed?
This decision—cache eviction—appears deceptively simple. The naive answer is "remove something old or unused." But the engineering reality is far more nuanced. The choice of eviction policy directly impacts cache hit rate, which in turn affects system latency, origin load, and infrastructure costs. Raising the hit rate from 90% to 95%, for example, cuts misses from 10% to 5% of requests, halving the traffic that reaches your database.
Eviction policies have been studied extensively in computer science for decades, from CPU cache design to virtual memory systems to distributed caches. The fundamental challenge is predicting the future: which items will be accessed soon, and which won't? Since we can't actually predict the future, eviction policies use heuristics based on past access patterns.
Different policies make different assumptions about access patterns, and no single policy dominates all workloads. Understanding the trade-offs between policies—and when each excels—is essential for designing high-performance caches.
This page examines eviction policies comprehensively: from classical algorithms through modern approximations, from theoretical optimality through practical implementation constraints.
By the end of this page, you will be able to:
- Understand the theoretical basis for eviction policies and the optimal algorithm
- Compare LRU, LFU, and hybrid policies and their trade-offs
- Implement efficient approximations used in production systems
- Analyze which policies suit which workloads
- Recognize the impact of eviction policy on overall cache performance
Before examining practical policies, we must understand the theoretical optimum: Bélády's MIN algorithm (1966).
The Algorithm
Bélády's algorithm is simple: when eviction is needed, remove the item that will be used furthest in the future (or never used again). This minimizes cache misses because it keeps items that will be needed soon.
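Expressed as code over a trace that is fully known in advance, the policy is short. This is an illustrative sketch only; a real cache could never run it, precisely because it reads the future:

```python
def belady_simulate(accesses: list[str], capacity: int) -> int:
    """Simulate Bélády's MIN over a fully known access trace; return hit count."""
    cache: set[str] = set()
    hits = 0
    for i, key in enumerate(accesses):
        if key in cache:
            hits += 1
            continue
        if len(cache) >= capacity:
            # Evict the cached key whose next use is furthest away (or never).
            def next_use(k: str) -> int:
                future = accesses[i + 1:]
                return future.index(k) if k in future else len(accesses) + 1
            cache.remove(max(cache, key=next_use))
        cache.add(key)
    return hits


# Example: with capacity 2, MIN keeps "a" resident across the whole trace.
print(belady_simulate(["a", "b", "c", "a", "b", "a"], capacity=2))  # 2 hits
```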
Why It's Optimal
The intuition is straightforward: if we evict something that won't be needed for a long time, we delay the miss penalty. If we evict something needed immediately, we cause an immediate miss.
Bélády proved this is optimal: no other policy can achieve fewer cache misses for a given sequence of accesses.
Why We Can't Use It
The catch is obvious: Bélády's algorithm requires knowing the future. In real systems, we don't know which keys will be accessed next. This makes Bélády's algorithm useful only as a benchmark—we can compare actual policies against it to see how close to optimal they perform.
The Benchmark Gap
| Policy | Hit Rate (illustrative) | Gap from Optimal | When It Excels |
|---|---|---|---|
| Bélády's MIN | 97% | — | Theoretical only |
| LRU | 89% | ~8% | Temporal locality workloads |
| LFU | 91% | ~6% | Frequency-skewed workloads |
| ARC (Adaptive) | 94% | ~3% | Mixed workloads |
| Random | 75% | ~22% | Never (baseline only) |
Since Bélády's algorithm is impractical, all real eviction policies are heuristics that approximate it. The goal is to get as close to optimal as possible while remaining computationally efficient. The best policies achieve hit rates within a few percentage points of optimal for typical workloads.
LRU is the most widely used eviction policy. Its premise is simple: items accessed recently are likely to be accessed again soon (temporal locality). When eviction is needed, remove the item that was accessed longest ago.
Why LRU Works Well
Many real-world access patterns exhibit temporal locality:
- A user who just loaded a page is likely to request it (or closely related data) again within seconds
- Session data touched by the current request will almost certainly be touched by the next one
- Content being read right now tends to keep being read for a while
For these patterns, LRU approximates Bélády well because 'recently used' often correlates with 'soon to be used'.
LRU Implementation
The classic LRU implementation uses a doubly-linked list + hash map:
```python
from collections import OrderedDict
from typing import Optional, TypeVar, Generic

K = TypeVar('K')
V = TypeVar('V')


class LRUCache(Generic[K, V]):
    """
    LRU Cache implementation using OrderedDict.

    Operations:
    - get(key): O(1)
    - put(key, value): O(1)
    - eviction: O(1)
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache: OrderedDict[K, V] = OrderedDict()

    def get(self, key: K) -> Optional[V]:
        """
        Get value for key, marking it as most recently used.
        Returns None if key not present.
        """
        if key not in self.cache:
            return None
        # Move to end (most recently used)
        self.cache.move_to_end(key)
        return self.cache[key]

    def put(self, key: K, value: V) -> Optional[K]:
        """
        Insert or update key-value pair.
        Returns evicted key if eviction occurred, None otherwise.
        """
        evicted_key = None

        if key in self.cache:
            # Update existing: move to end
            self.cache.move_to_end(key)
        else:
            # New key: check capacity
            if len(self.cache) >= self.capacity:
                # Evict least recently used (first item)
                evicted_key, _ = self.cache.popitem(last=False)

        self.cache[key] = value
        return evicted_key

    def delete(self, key: K) -> bool:
        """Remove key from cache. Returns True if key existed."""
        if key in self.cache:
            del self.cache[key]
            return True
        return False


# Example usage
cache = LRUCache[str, str](capacity=3)
cache.put("a", "1")  # Cache: [a]
cache.put("b", "2")  # Cache: [a, b]
cache.put("c", "3")  # Cache: [a, b, c]
cache.get("a")       # Cache: [b, c, a] (a moved to end)
cache.put("d", "4")  # Cache: [c, a, d] (b evicted)
```

LRU Data Structure
The combination of doubly-linked list and hash map provides:
- O(1) lookup: the hash map points directly at each list node
- O(1) recency update: on access, a node is unlinked and moved to the head
- O(1) eviction: the least recently used item is always at the tail
A single sequential scan of data larger than the cache (e.g., a backup job reading all records) evicts the entire hot working set. After the scan, the cache is cold and hit rate drops to near zero until the working set is rebuilt. This is LRU's most significant weakness.
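This failure mode is easy to reproduce. The demo below assumes the LRUCache class from the listing above is in scope; key names and sizes are made up for illustration:

```python
# Assumes the LRUCache class defined earlier on this page is in scope.
cache = LRUCache[str, str](capacity=100)

# Build a hot working set that fits comfortably in the cache.
hot_keys = [f"hot:{i}" for i in range(100)]
for key in hot_keys:
    cache.put(key, "value")

assert all(cache.get(k) is not None for k in hot_keys)  # 100% hits

# A single pass over data larger than the cache (e.g., a backup scan)...
for i in range(1000):
    cache.put(f"scan:{i}", "value")

# ...leaves none of the hot keys behind: every hot access is now a miss.
hits = sum(1 for k in hot_keys if cache.get(k) is not None)
print(f"Hot-key hits after scan: {hits}/100")  # 0/100
```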
LFU takes a different approach: items accessed more often are more valuable (frequency locality). When eviction is needed, remove the item with the lowest access count.
Why LFU Makes Sense
Some access patterns are frequency-dominated:
- A small set of hot items (popular products, viral content, reference or configuration data) receives the bulk of requests
- Popularity follows a Zipf-like distribution: the head is accessed orders of magnitude more often than the long tail
- Which items are hot changes slowly compared to how often they are accessed
For these workloads, an item's past frequency is a strong predictor of future access probability.
Naive LFU Implementation
```python
from collections import defaultdict
from typing import Optional, TypeVar, Generic, Dict

K = TypeVar('K')
V = TypeVar('V')


class LFUCache(Generic[K, V]):
    """
    LFU Cache implementation with O(1) operations.

    Uses three data structures:
    1. key -> (value, frequency) mapping
    2. frequency -> set of keys with that frequency
    3. Track minimum frequency for O(1) eviction
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.min_freq = 0

        # Key -> (value, frequency)
        self.key_value_freq: Dict[K, tuple[V, int]] = {}

        # Frequency -> OrderedSet of keys (for LRU tie-breaking)
        # Using dict as ordered set (Python 3.7+)
        self.freq_keys: Dict[int, Dict[K, None]] = defaultdict(dict)

    def _update_frequency(self, key: K) -> None:
        """Increment frequency of a key."""
        value, freq = self.key_value_freq[key]

        # Remove from current frequency bucket
        del self.freq_keys[freq][key]

        # Update min_freq if current bucket is now empty
        if not self.freq_keys[freq]:
            del self.freq_keys[freq]
            if self.min_freq == freq:
                self.min_freq = freq + 1

        # Add to new frequency bucket
        new_freq = freq + 1
        self.freq_keys[new_freq][key] = None  # Using dict as ordered set
        self.key_value_freq[key] = (value, new_freq)

    def get(self, key: K) -> Optional[V]:
        """Get value and increment frequency."""
        if key not in self.key_value_freq:
            return None
        self._update_frequency(key)
        return self.key_value_freq[key][0]

    def put(self, key: K, value: V) -> Optional[K]:
        """Insert or update, with LFU eviction if needed."""
        if self.capacity <= 0:
            return None

        evicted_key = None

        if key in self.key_value_freq:
            # Update existing key
            _, freq = self.key_value_freq[key]
            self.key_value_freq[key] = (value, freq)
            self._update_frequency(key)
        else:
            # New key
            if len(self.key_value_freq) >= self.capacity:
                # Evict LFU (use LRU for tie-breaking)
                lfu_keys = self.freq_keys[self.min_freq]
                evicted_key = next(iter(lfu_keys))  # First inserted at this freq
                del lfu_keys[evicted_key]
                if not lfu_keys:
                    del self.freq_keys[self.min_freq]
                del self.key_value_freq[evicted_key]

            # Insert new key with frequency 1
            self.key_value_freq[key] = (value, 1)
            self.freq_keys[1][key] = None
            self.min_freq = 1

        return evicted_key
```

LFU Trade-offs
LFU's biggest weakness: items that were once popular but are no longer relevant keep high counts forever. A news article from last year with 1 million historical accesses will never be evicted, even though it will never be accessed again. This is called 'cache pollution' or 'history pollution.' Practical LFU variants counter it by aging the counters (for example, periodically halving them) so stale popularity decays over time.
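A contrived demonstration of the problem, assuming the LFUCache class from the listing above is in scope:

```python
# Assumes the LFUCache class defined earlier on this page is in scope.
cache = LFUCache[str, str](capacity=2)

# An item that was very popular in the past...
cache.put("old-article", "content")
for _ in range(100_000):
    cache.get("old-article")      # its frequency is now enormous

# ...crowds out everything that is actually being accessed today.
cache.put("today:a", "content")   # frequency 1
cache.put("today:b", "content")   # evicts today:a, never old-article
print(cache.get("today:a"))       # None: today's key lost the tie to history
```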
LRU and LFU each have significant weaknesses. Modern eviction policies combine their strengths or adapt based on observed access patterns.
LRU-K and 2Q
LRU-K (1993) bases eviction on the time of the K-th most recent access rather than the most recent one; items with fewer than K recorded accesses are preferred victims. In effect, a key must be accessed K times before it is treated as part of the hot set, which provides scan resistance while retaining LRU's recency benefits.
2Q ('Two Queue', 1994) simplifies LRU-K with two queues:
- A1in: a small FIFO queue where every new key starts
- Am: the main LRU queue, which a key enters only when it is accessed again

(The full algorithm also keeps a ghost queue, A1out, of keys recently evicted from A1in.) Single-access items (scans) stay in A1in and are evicted quickly. Frequently accessed items graduate to Am, as the sketch below illustrates.
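A minimal sketch of the simplified 2Q idea. The A1out ghost queue is omitted, and the queue sizes (a 25% probation split) are arbitrary choices for illustration:

```python
from collections import OrderedDict
from typing import Optional


class TwoQueueCache:
    """Simplified 2Q sketch: a FIFO probation queue (A1in) plus an LRU main
    queue (Am). The full algorithm also tracks a ghost queue (A1out) of keys
    recently evicted from A1in; that is omitted here for brevity."""

    def __init__(self, capacity: int, a1_fraction: float = 0.25):
        self.a1_capacity = max(1, int(capacity * a1_fraction))
        self.am_capacity = capacity - self.a1_capacity
        self.a1in: OrderedDict[str, str] = OrderedDict()  # FIFO, seen once
        self.am: OrderedDict[str, str] = OrderedDict()    # LRU, seen again

    def get(self, key: str) -> Optional[str]:
        if key in self.am:
            self.am.move_to_end(key)            # LRU update within Am
            return self.am[key]
        if key in self.a1in:
            value = self.a1in.pop(key)          # second access: graduate to Am
            if len(self.am) >= self.am_capacity:
                self.am.popitem(last=False)     # evict LRU entry from Am
            self.am[key] = value
            return value
        return None

    def put(self, key: str, value: str) -> None:
        if key in self.am:
            self.am[key] = value                # update in place
            return
        if key in self.a1in:
            self.a1in[key] = value
            return
        if len(self.a1in) >= self.a1_capacity:
            self.a1in.popitem(last=False)       # FIFO eviction of scan traffic
        self.a1in[key] = value                  # new keys start in A1in
```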
ARC: Adaptive Replacement Cache
IBM's ARC (2003) is a landmark algorithm that adapts between LRU and LFU behavior automatically:
- T1: real cache entries seen once recently (recency side)
- T2: real cache entries seen at least twice (frequency side)
- B1 and B2: 'ghost' lists holding only the keys recently evicted from T1 and T2
- An adaptive target p that shifts space between T1 and T2 based on whether misses hit the B1 or B2 ghost list
```python
from collections import OrderedDict


class ARCCache:
    """
    Simplified conceptual ARC implementation.
    Real ARC is more complex but this captures the key ideas.
    """

    def __init__(self, capacity: int):
        self.c = capacity  # Total cache capacity
        self.p = 0         # Target size for T1 (adaptive parameter)

        # Real cache entries
        self.T1 = OrderedDict()  # Recent, single access
        self.T2 = OrderedDict()  # Frequent, multiple access

        # Ghost entries (just keys, no values)
        self.B1 = OrderedDict()  # Recently evicted from T1
        self.B2 = OrderedDict()  # Recently evicted from T2

    def _adapt_on_b1_hit(self, key):
        """B1 hit: items similar to this key should have stayed in T1."""
        # Increase p to give T1 more space
        delta = max(1, len(self.B2) // len(self.B1)) if self.B1 else 1
        self.p = min(self.p + delta, self.c)

    def _adapt_on_b2_hit(self, key):
        """B2 hit: items similar to this key should have stayed in T2."""
        # Decrease p to give T2 more space
        delta = max(1, len(self.B1) // len(self.B2)) if self.B2 else 1
        self.p = max(self.p - delta, 0)

    def get(self, key):
        if key in self.T1:
            # Promote from T1 to T2 (now a frequent item)
            value = self.T1.pop(key)
            self.T2[key] = value
            return value

        if key in self.T2:
            # Already frequent, move to end (LRU within T2)
            self.T2.move_to_end(key)
            return self.T2[key]

        # Cache miss
        if key in self.B1:
            self._adapt_on_b1_hit(key)
        elif key in self.B2:
            self._adapt_on_b2_hit(key)

        return None

    # put() implementation would handle eviction between T1 and T2
    # based on the adaptive parameter p
```

TinyLFU and W-TinyLFU
Caffeine (a widely used high-performance Java caching library) uses W-TinyLFU, combining:
- A small LRU 'window' cache that admits every new item
- A TinyLFU frequency filter that decides which items may enter the main cache
- A larger main cache (segmented LRU) holding the items that pass the filter
The TinyLFU filter uses a Count-Min Sketch (probabilistic frequency counter) with minimal memory overhead. Items must pass the frequency filter to graduate from window to main cache.
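The sketch below shows the Count-Min Sketch idea in isolation; the hash construction and sizes are illustrative choices, not Caffeine's actual implementation. TinyLFU additionally halves all counters periodically so old popularity fades:

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter: d hash rows of w counters each.
    Estimates never undercount; hash collisions can only inflate them."""

    def __init__(self, width: int = 1024, depth: int = 4):
        self.width = width
        self.depth = depth
        self.counters = [[0] * width for _ in range(depth)]

    def _index(self, key: str, row: int) -> int:
        # Derive an independent hash per row by salting with the row number.
        digest = hashlib.blake2b(key.encode(),
                                 salt=row.to_bytes(4, "little")).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def increment(self, key: str) -> None:
        for row in range(self.depth):
            self.counters[row][self._index(key, row)] += 1

    def estimate(self, key: str) -> int:
        # The minimum across rows is the least-inflated estimate.
        return min(self.counters[row][self._index(key, row)]
                   for row in range(self.depth))


# TinyLFU-style admission test (conceptually): admit a new candidate only if
# its estimated frequency beats that of the entry it would evict.
sketch = CountMinSketch()
for _ in range(5):
    sketch.increment("popular")
sketch.increment("one-hit-wonder")
print(sketch.estimate("popular"), sketch.estimate("one-hit-wonder"))  # ~5, ~1
```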
| Policy | Scan Resistant | Adapts to Pattern | Memory Overhead | Complexity |
|---|---|---|---|---|
| LRU | No | No | Low | Simple |
| LFU | Yes | No | Medium | Medium |
| 2Q | Yes | No | Low | Simple |
| ARC | Yes | Yes | ~2x key metadata (ghost lists) | Complex |
| W-TinyLFU | Yes | Somewhat | Low | Complex |
For most applications: start with LRU. It's well-understood, has minimal overhead, and works well for typical workloads. If you observe scan-related hit rate drops, or a stable popularity distribution where frequency matters, consider W-TinyLFU (in Java, use Ben Manes' Caffeine library). ARC has patents that may affect commercial use.
Pure LRU's linked-list implementation has overhead: extra memory for pointers and lock contention on every access. Production systems often use approximated LRU that trades accuracy for efficiency.
Redis: Sampling-Based LRU
Redis approximates LRU by sampling:
- Each entry stores a last-access timestamp instead of sitting in a global ordering
- When eviction is needed, Redis samples a handful of keys (maxmemory-samples, default 5) and evicts the one idle the longest
- Newer versions keep a small pool of the best eviction candidates across rounds to improve accuracy
This avoids maintaining a global ordering while approximating LRU behavior. Testing shows that sampling 10 keys gets very close to true LRU.
```python
import random
from time import time
from typing import Any, Dict, Optional, Tuple


class SamplingLRUCache:
    """
    Approximated LRU using random sampling.
    Similar to Redis's approach.

    Advantages:
    - O(1) lookup and update (no list maintenance)
    - Lower memory overhead (just a timestamp per entry)
    - Better concurrency (no shared list to update)
    """

    def __init__(self, capacity: int, sample_size: int = 5):
        self.capacity = capacity
        self.sample_size = sample_size
        # Key -> (value, last_access_timestamp)
        self.cache: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Optional[Any]:
        """Get value, updating access time."""
        if key not in self.cache:
            return None
        value, _ = self.cache[key]
        self.cache[key] = (value, time())
        return value

    def put(self, key: str, value: Any) -> Optional[str]:
        """Insert with sampling-based eviction."""
        evicted = None

        if key not in self.cache and len(self.cache) >= self.capacity:
            # Need to evict: sample random keys and evict oldest
            sample_keys = random.sample(list(self.cache.keys()),
                                        min(self.sample_size, len(self.cache)))
            # Find oldest in sample
            oldest_key = min(sample_keys, key=lambda k: self.cache[k][1])
            del self.cache[oldest_key]
            evicted = oldest_key

        self.cache[key] = (value, time())
        return evicted


# Comparison of exact vs sampled LRU hit rates
def simulate_hit_rates():
    """Compare exact LRU vs sampling approximation."""

    # Zipf-distributed access pattern (weights recomputed per call for simplicity)
    def zipf_key(n_keys=1000, s=1.5):
        # Simplified Zipf
        weights = [1.0 / (i ** s) for i in range(1, n_keys + 1)]
        total = sum(weights)
        weights = [w / total for w in weights]
        return f"key:{random.choices(range(n_keys), weights)[0]}"

    cache_size = 100
    num_accesses = 100000

    for sample_size in [1, 3, 5, 10, 100]:
        if sample_size == 100:
            cache = LRUCache(cache_size)  # Exact LRU for comparison
            name = "Exact LRU"
        else:
            cache = SamplingLRUCache(cache_size, sample_size)
            name = f"Sample-{sample_size}"

        hits = 0
        for _ in range(num_accesses):
            key = zipf_key()
            if cache.get(key) is not None:
                hits += 1
            else:
                cache.put(key, "value")

        hit_rate = hits / num_accesses * 100
        print(f"{name:12s}: {hit_rate:.1f}% hit rate")
```

Clock Algorithm (Second-Chance)
Another popular approximation is the Clock algorithm, used in operating systems for page replacement:
- Entries sit in a circular buffer, each with a reference bit that is set on access
- When eviction is needed, a 'clock hand' sweeps the ring, clearing set bits as it passes
- The first entry found with its bit already clear is evicted
This approximates LRU without maintaining ordering—it gives items a 'second chance' before eviction.
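A minimal Clock sketch (fixed-size ring, reference bits, one sweeping hand); real page-replacement code is considerably more involved:

```python
class ClockCache:
    """Minimal second-chance (Clock) sketch: entries sit in a fixed ring,
    each with a reference bit. The hand clears bits as it sweeps and evicts
    the first entry whose bit is already clear."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.slots = [None] * capacity        # (key, value) or None
        self.ref_bits = [False] * capacity
        self.index = {}                       # key -> slot position
        self.hand = 0

    def get(self, key):
        pos = self.index.get(key)
        if pos is None:
            return None
        self.ref_bits[pos] = True             # mark as recently used
        return self.slots[pos][1]

    def put(self, key, value):
        if key in self.index:
            pos = self.index[key]
            self.slots[pos] = (key, value)
            self.ref_bits[pos] = True
            return
        # Advance the hand until we find a victim with a clear bit.
        while self.slots[self.hand] is not None and self.ref_bits[self.hand]:
            self.ref_bits[self.hand] = False  # give it a second chance
            self.hand = (self.hand + 1) % self.capacity
        victim = self.slots[self.hand]
        if victim is not None:
            del self.index[victim[0]]
        self.slots[self.hand] = (key, value)
        self.ref_bits[self.hand] = True
        self.index[key] = self.hand
        self.hand = (self.hand + 1) % self.capacity
```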
Approximated LRU also benefits from a cache-friendly memory layout. A linked list's nodes scatter across memory, causing CPU cache misses. Array-based approximations (like Clock) or hash-table-only approaches (like sampling) can be much faster due to CPU cache locality—ironic for a discussion about application-level caching!
Beyond capacity-based eviction, caches also remove items based on time: entries expire after a configured TTL (Time To Live). TTL eviction and capacity eviction are orthogonal but interact.
Why TTL Matters
Cached data is a copy, and the origin keeps changing underneath it. A TTL bounds how stale a cached entry can get, and it guarantees that entries for data nobody requests anymore are eventually reclaimed, even if capacity-based eviction never reaches them.
TTL Implementation Approaches
| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Lazy (on-access) | Check expiration on get(); delete if expired | No background work | Memory not freed until access |
| Periodic scan | Background thread scans and deletes expired | Reclaims memory | CPU overhead; scan latency |
| Bucket-based | Group expires into time buckets; process buckets | Efficient batch deletion | Implementation complexity |
| Timer wheel | Scheduled timers per item or bucket | Precise expiration timing | Timer management overhead |
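For the bucket-based row in the table above, a minimal sketch: expiry times are rounded into coarse buckets so the cleanup sweep deletes whole buckets whose window has passed instead of examining every key. The bucket width and data layout here are illustrative assumptions:

```python
import time
from collections import defaultdict


class BucketedExpiry:
    """Group keys by expiry bucket so cleanup is a batch delete per bucket."""

    def __init__(self, bucket_seconds: int = 10):
        self.bucket_seconds = bucket_seconds
        self.store = {}                  # key -> (value, expiry_ts)
        self.buckets = defaultdict(set)  # bucket_id -> keys expiring in that window

    def _bucket(self, expiry_ts: float) -> int:
        return int(expiry_ts // self.bucket_seconds)

    def put(self, key, value, ttl: float) -> None:
        expiry = time.time() + ttl
        self.store[key] = (value, expiry)
        self.buckets[self._bucket(expiry)].add(key)

    def sweep(self) -> int:
        """Delete every bucket whose time window has fully elapsed."""
        now_bucket = self._bucket(time.time())
        removed = 0
        for bucket_id in [b for b in self.buckets if b < now_bucket]:
            for key in self.buckets.pop(bucket_id):
                entry = self.store.get(key)
                # A key re-put with a longer TTL stays until its newer bucket expires.
                if entry is not None and entry[1] <= time.time():
                    del self.store[key]
                    removed += 1
        return removed
```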
Redis TTL Implementation
Redis uses a hybrid approach:
- Lazy expiration: an expired key is deleted the moment a command touches it
- Active expiration: a background cycle periodically samples keys that have TTLs, deletes the expired ones, and repeats the sampling if a large fraction of the sample was expired
This balances memory reclamation with CPU usage.
```python
import random
from time import time
from typing import Any, Dict, Optional, Tuple


class TTLCache:
    """
    Cache with TTL support using lazy + periodic expiration.
    """

    def __init__(self, capacity: int, default_ttl: float = 300.0):
        self.capacity = capacity
        self.default_ttl = default_ttl

        # Key -> (value, expiration_timestamp)
        self.cache: Dict[str, Tuple[Any, float]] = {}

        self.last_cleanup = time()
        self.cleanup_interval = 60.0  # Periodic cleanup every 60s

    def _is_expired(self, key: str) -> bool:
        """Check if a key is expired."""
        if key not in self.cache:
            return True
        _, expiry = self.cache[key]
        return time() > expiry

    def _lazy_expire(self, key: str) -> bool:
        """Delete key if expired. Returns True if expired."""
        if self._is_expired(key):
            if key in self.cache:
                del self.cache[key]
            return True
        return False

    def _maybe_periodic_cleanup(self) -> None:
        """Run periodic cleanup if enough time has passed."""
        current_time = time()
        if current_time - self.last_cleanup < self.cleanup_interval:
            return

        self.last_cleanup = current_time

        # Sample-based cleanup like Redis
        sample_size = min(20, len(self.cache))
        if sample_size == 0:
            return

        for _ in range(5):  # Max 5 iterations
            if not self.cache:
                break
            sample_keys = random.sample(list(self.cache.keys()),
                                        min(sample_size, len(self.cache)))
            batch_expired = sum(1 for key in sample_keys if self._lazy_expire(key))

            # Stop if less than 25% of the sample was expired
            if batch_expired / len(sample_keys) < 0.25:
                break

    def get(self, key: str) -> Optional[Any]:
        """Get with lazy expiration."""
        self._maybe_periodic_cleanup()

        if self._lazy_expire(key):
            return None

        return self.cache[key][0]

    def put(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        """Insert with optional custom TTL."""
        self._maybe_periodic_cleanup()

        if ttl is None:
            ttl = self.default_ttl

        expiry = time() + ttl
        self.cache[key] = (value, expiry)
        # Note: capacity eviction would also happen here
```

Setting TTL too short increases cache misses and origin load. Setting TTL too long serves stale data. For most applications, TTL should align with how frequently the underlying data changes and how tolerant users are of staleness. Common values range from seconds (rate limiting) to hours (user profiles) to days (static content).
Eviction policy choice significantly impacts cache effectiveness. Let's consolidate the decision framework:
| Workload Characteristic | Recommended Policy | Rationale |
|---|---|---|
| General purpose / Unknown | LRU or approximated LRU | Good baseline, simple to operate |
| Sequential scans present | 2Q or W-TinyLFU | Scan resistant without complexity |
| Stable popularity (Zipf) | LFU or W-TinyLFU | Captures frequency-based value |
| Mixed/changing patterns | ARC or W-TinyLFU | Self-adapting |
| Extreme scale (millions QPS) | Approximated LRU | Lower overhead, better concurrency |
You now understand eviction policies from theoretical optimality through production implementation. In the next page, we'll explore Cache Coherence—how to maintain consistency between cached data and the source of truth when both can change.