Imagine a library with infinite space for books. You could keep every book ever borrowed, organize them however you wish, and never worry about running out of room. But in reality, libraries have finite shelf space, and the same constraint applies to computer memory.
Caching is one of the most powerful techniques in computer science — the practice of keeping frequently or recently accessed data close at hand to avoid expensive recomputation or retrieval. But caches have limited capacity, and when that capacity is exhausted, difficult decisions must be made. Which data do we evict to make room for new entries?
This decision — the eviction policy — determines the cache's effectiveness. Choose wisely, and your cache dramatically accelerates system performance. Choose poorly, and your cache becomes worse than useless, consuming memory while providing no benefit.
By the end of this page, you will understand what cache eviction policies are, why they matter for system performance, and why the Least Recently Used (LRU) policy has become the de facto standard in production systems. You'll develop intuition for temporal locality and understand the theoretical foundations that make LRU so effective.
Before we discuss eviction policies, let's establish what caching is and why it's so critical to modern computing.
The fundamental asymmetry of modern computers:
Computers have multiple layers of storage, each with drastically different speed and capacity characteristics:
| Storage Level | Access Time | Typical Size | Cost per GB |
|---|---|---|---|
| CPU Registers | < 1 ns | Bytes | Astronomical |
| L1 Cache | ~1-2 ns | 32-64 KB | Extremely high |
| L2 Cache | ~3-10 ns | 256 KB - 1 MB | Very high |
| L3 Cache | ~10-20 ns | 2-64 MB | High |
| RAM (DRAM) | ~50-100 ns | 4-128 GB | ~$5-10 |
| SSD | ~10-100 μs | 128 GB - 8 TB | ~$0.10-0.20 |
| HDD | ~5-10 ms | 500 GB - 20 TB | ~$0.02-0.05 |
| Network (Local) | ~1-10 ms | Unlimited | Variable |
| Network (Remote) | ~50-500 ms | Unlimited | Variable |
Notice the six orders of magnitude difference between L1 cache access (~1 nanosecond) and network access (~1 millisecond). This asymmetry creates the fundamental motivation for caching: keep hot data close, push cold data far away.
The cache abstraction:
A cache is simply a smaller, faster storage layer that sits between a slow storage layer and the component requesting data. When data is requested, the cache is checked first: if the item is present (a cache hit), it is returned immediately; if not (a cache miss), the data is fetched from the slow storage and typically placed in the cache for future requests.
The effectiveness of a cache is measured by its hit rate — the percentage of requests served from the cache rather than the slow storage.
Average Access Time = (Hit Rate × Cache Access Time) + (Miss Rate × Slow Storage Access Time)
With a 95% hit rate, if cache access takes 1ms and database access takes 100ms: Average = (0.95 × 1) + (0.05 × 100) = 0.95 + 5 = 5.95ms
Compare to no caching: 100ms average. That's a 16x improvement!
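If you want to replay that arithmetic with your own numbers, here is a tiny sketch of the formula above; the function name and the 95% / 1 ms / 100 ms figures are just this page's example values, nothing standard.

```typescript
// Average access time for a cache sitting in front of slow storage.
// Times are in milliseconds; hitRate is a fraction between 0 and 1.
function averageAccessTime(hitRate: number, cacheMs: number, slowMs: number): number {
  return hitRate * cacheMs + (1 - hitRate) * slowMs;
}

const withCache = averageAccessTime(0.95, 1, 100); // 0.95 + 5 = 5.95 ms
const withoutCache = 100;                          // every request goes to the database
console.log(`average: ${withCache} ms, speedup: ${(withoutCache / withCache).toFixed(1)}x`); // ≈ 16.8x
```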
Here's the fundamental dilemma: caches have limited capacity. When a cache is full and a new item must be added, an existing item must be removed. Which item do we evict?
This is not a trivial decision. The choice of eviction policy can make or break your system's performance. Let's understand why with a concrete example.
Scenario: Web Application Cache
Imagine a web application with a cache that can hold 1000 user session objects, far fewer than the number of sessions active during peak traffic.
If your eviction policy is poor, you might evict sessions that are about to be accessed again. Each eviction of an active session costs 50ms in database access. With thousands of requests per second, bad eviction decisions cascade into seconds of cumulative delay.
The Oracle's Perfect Eviction:
If you could see the future, the optimal eviction policy would be simple: evict the item that will be used furthest in the future (or never used again). This is called the Optimal Replacement Policy or Bélády's Algorithm.
Unfortunately, predicting the future is impossible. We must use heuristics based on past behavior to approximate optimal eviction. This is where different eviction policies diverge.
László Bélády proved in 1966 that evicting the item used furthest in the future minimizes cache misses. This theoretical result is called OPT or MIN. While impossible to implement in practice (we can't know the future), it serves as a benchmark for evaluating other policies. An LRU cache typically achieves within 10-20% of OPT for real workloads.
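Although no live cache can run Bélády's policy, it is easy to simulate offline once the full request sequence has been recorded. The sketch below is one possible way to compute the OPT hit count for a trace; the function name and the example trace are invented for illustration.

```typescript
// Offline simulation of Belady's OPT: on a miss with a full cache, evict the
// resident key whose next use lies furthest in the future (or never occurs).
function optHits(trace: string[], capacity: number): number {
  const cache = new Set<string>();
  let hits = 0;

  for (let i = 0; i < trace.length; i++) {
    const key = trace[i];
    if (cache.has(key)) { hits++; continue; }            // hit: nothing to evict
    if (cache.size < capacity) { cache.add(key); continue; }

    // Miss with a full cache: find the resident key used furthest in the future.
    let victim = "";
    let farthest = -1;
    for (const resident of cache) {
      let nextUse = trace.indexOf(resident, i + 1);      // -1 means never used again
      if (nextUse === -1) nextUse = Infinity;
      if (nextUse > farthest) { farthest = nextUse; victim = resident; }
    }
    cache.delete(victim);
    cache.add(key);
  }
  return hits;
}

// Example: 4 distinct keys, capacity 3 → 3 hits, 5 misses under OPT.
console.log(optHits(["A", "B", "C", "D", "A", "B", "C", "A"], 3));
```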
Multiple eviction strategies have been developed, each with different trade-offs. Understanding these alternatives helps appreciate why LRU became dominant.
1. Random Replacement
The simplest possible policy: when eviction is needed, pick a random item to remove.
Pros: O(1) time, trivial implementation, no bookkeeping overhead
Cons: Completely ignores access patterns — might evict the most frequently used item
2. First-In, First-Out (FIFO)
Evict the item that was added to the cache first, regardless of how often it's accessed.
Pros: O(1) time with a queue, simple mental model
Cons: A frequently-accessed item added early will be evicted before a never-touched item added later
3. Least Frequently Used (LFU)
Evict the item that has been accessed the fewest times since entering the cache.
Pros: Keeps truly popular items around
Cons: Historical bias — an item popular last week but cold now stays forever; new items are vulnerable to eviction before they can prove themselves
4. Least Recently Used (LRU)
Evict the item that was accessed longest ago.
Pros: Adapts to changing access patterns; recently accessed items stay
Cons: A full-cache scan of all items (accessed once each) pollutes the entire cache
| Policy | What It Evicts | Best For | Weakness |
|---|---|---|---|
| Random | Any random item | When access patterns are truly random | Ignores all history |
| FIFO | Oldest insertion | Streaming data with TTL semantics | Ignores access frequency |
| LFU | Least access count | Stable popularity distributions | Slow to adapt to changes |
| LRU | Least recent access | Temporal locality workloads | Scan vulnerability |
In practice, LRU performs remarkably close to optimal for most real-world workloads. This is because real access patterns exhibit temporal locality — items accessed recently are likely to be accessed again soon. LRU capitalizes on this property with its recency-based eviction.
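As a small, hedged illustration of that point, the sketch below replays one locality-heavy access trace against both FIFO and LRU; the trace and the list-based simulator are invented for this example and cost O(n) per access, so they are for building intuition, not for production use.

```typescript
// Count cache hits for FIFO and LRU on the same access trace.
function simulate(trace: string[], capacity: number, updateOnHit: boolean): number {
  const order: string[] = [];   // front = next to evict, back = most recently inserted/used
  let hits = 0;
  for (const key of trace) {
    const idx = order.indexOf(key);
    if (idx !== -1) {
      hits++;
      if (updateOnHit) {        // LRU refreshes recency on every access; FIFO does not
        order.splice(idx, 1);
        order.push(key);
      }
    } else {
      if (order.length === capacity) order.shift();  // evict from the front
      order.push(key);
    }
  }
  return hits;
}

// A trace with temporal locality: "A" keeps coming back.
const trace = ["A", "B", "A", "C", "A", "D", "A", "E", "A"];
console.log("FIFO hits:", simulate(trace, 3, false)); // 3 hits
console.log("LRU  hits:", simulate(trace, 3, true));  // 4 hits
```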
Temporal locality is the principle that if an item was accessed recently, it's likely to be accessed again in the near future. This isn't an arbitrary assumption — it reflects fundamental patterns in how humans and systems interact with data.
Why temporal locality exists:

- Users work in sessions: the profile, document, or product page you just opened is likely to be opened again within minutes
- Programs loop: the same variables, objects, and code paths are touched repeatedly within a short window
- Workloads have a working set: today's orders, this week's trending content, or the currently active users dominate traffic while older data goes cold
The recency heuristic:
LRU makes a simple bet: the past is the best predictor of the near future. If you accessed something 5 seconds ago but haven't accessed something else in 5 hours, the first item is more likely to be needed again soon.
This heuristic isn't always correct — sometimes old data becomes relevant again. But it's correct often enough that LRU significantly outperforms random or FIFO policies across nearly all real-world workloads.
Mathematical justification:
Assume access patterns follow an exponential decay in recency. The probability of accessing item X in the next time interval is proportional to e^(-λ × time_since_last_access), where λ is a decay constant. Under this model, LRU is provably optimal — it always keeps the items with the highest probability of near-future access.
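One compact way to write that argument, assuming a single decay constant λ shared by all items and writing t_x for the time since item x was last accessed: because the modeled probability is strictly decreasing in t_x, the capacity-C set that maximizes the expected number of near-term hits is exactly the C most recently used items, which is precisely what LRU keeps.

```latex
P(\text{access } x \text{ next}) \;\propto\; e^{-\lambda t_x}
\qquad\Longrightarrow\qquad
\arg\max_{S,\;|S| = C} \sum_{x \in S} e^{-\lambda t_x}
\;=\; \text{the } C \text{ items with the smallest } t_x
```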
Your brain uses LRU-like eviction! Consider your working memory: names of people you just met are available; names from grade school are harder to recall. Your phone's recent apps list is literally an LRU cache, keeping recently used apps ready while older ones get swapped out.
LRU isn't just a theoretical concept — it's ubiquitous in production software. Understanding where LRU appears helps appreciate its importance.
CPU Caches:
Modern CPUs use LRU or pseudo-LRU policies to manage L1, L2, and L3 caches. When you access a memory address, the CPU keeps it in cache assuming you'll access it again. This hardware-level LRU is invisible to programmers but critical for performance.
Operating System Page Cache:
When your OS loads files into memory, it uses LRU-based policies to decide which file pages to keep in RAM. This is why opening a file the second time is faster — the pages stayed in the OS cache.
Database Caches:
MySQL, PostgreSQL, and other databases keep recently accessed data pages (and, in some systems, query plans or results) in memory buffer pools managed by LRU-style policies. Repeated queries that hit these in-memory caches avoid expensive disk reads.
Application Caches:
Redis and Memcached — the most popular caching systems — both support LRU eviction. Web applications worldwide depend on these caches for session storage, API response caching, and more.
| System | LRU Usage | Scale |
|---|---|---|
| Linux Page Cache | Evicts cold file pages | Manages TB of memory per server |
| Redis | Optional LRU eviction mode | Billions of keys across clusters |
| Memcached | Default eviction policy | Petabytes of cached data globally |
| Google Chrome | Tab discarding on memory pressure | Billions of browser instances |
| CDN Edge Caches | Content eviction at edge nodes | Exabytes of cached content |
| JIT Compilers | Hot code caching | Millions of functions across apps |
Pure LRU can be expensive to implement at scale. Many production systems use approximations like Pseudo-LRU (uses less metadata), Clock/Second-Chance (approximates LRU with a circular buffer), or Segmented LRU (separates hot and probationary items). These trade perfect LRU behavior for implementation efficiency while retaining most benefits.
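To make one of those approximations concrete, here is a minimal sketch of the Clock (Second-Chance) idea over a fixed set of slots; the class and method names are invented for this example, it shows only the eviction sweep (not the key-to-frame lookup a real cache layers on top), and it is not how any particular OS or CPU implements it.

```typescript
// Minimal Clock / Second-Chance eviction over a fixed array of frames.
// Each frame has a reference bit that is set on access and cleared as the
// "clock hand" sweeps past; a frame whose bit is already 0 gets evicted.
interface Frame<K> { key: K; referenced: boolean; }

class ClockEvictor<K> {
  private frames: Frame<K>[] = [];
  private hand = 0;

  constructor(private capacity: number) {}

  /** Mark a key as accessed (sets its reference bit). Linear scan is fine for a sketch. */
  touch(key: K): void {
    const frame = this.frames.find(f => f.key === key);
    if (frame) frame.referenced = true;
  }

  /** Insert a key, evicting via the clock sweep if all frames are occupied. */
  insert(key: K): void {
    if (this.frames.length < this.capacity) {
      this.frames.push({ key, referenced: true });
      return;
    }
    // Sweep: give referenced frames a second chance, evict the first unreferenced one.
    while (this.frames[this.hand].referenced) {
      this.frames[this.hand].referenced = false;
      this.hand = (this.hand + 1) % this.capacity;
    }
    this.frames[this.hand] = { key, referenced: true };
    this.hand = (this.hand + 1) % this.capacity;
  }
}
```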
Let's precisely define what an LRU cache does. This formal specification will guide our implementation in later pages.
The LRU Cache ADT (Abstract Data Type):
An LRU Cache with capacity C supports two operations:
`get(key)`:

- If key exists in the cache, return its value and mark it as most recently used
- If key doesn't exist, return a sentinel value (null, -1, etc.)

`put(key, value)`:

- If key already exists, update its value and mark it as most recently used
- If key doesn't exist:
  - If the cache is at capacity, evict the least recently used item first
  - Insert the new key-value pair and mark it as most recently used
```typescript
/**
 * LRU Cache Interface
 *
 * Defines the contract for a Least Recently Used cache.
 * Any access (get or put) makes an item the most recently used.
 * Eviction always removes the least recently used item.
 */
interface LRUCache<K, V> {
  /**
   * Capacity of the cache - maximum number of key-value pairs
   */
  readonly capacity: number;

  /**
   * Current number of items in the cache
   */
  readonly size: number;

  /**
   * Retrieves the value for a key, or null if not present.
   * If present, marks the key as most recently used.
   *
   * @param key - The key to look up
   * @returns The value if present, null otherwise
   */
  get(key: K): V | null;

  /**
   * Inserts or updates a key-value pair.
   * If at capacity and key is new, evicts the least recently used item.
   * Marks the key as most recently used.
   *
   * @param key - The key to insert or update
   * @param value - The value to associate with the key
   */
  put(key: K, value: V): void;

  /**
   * Optional: Removes a specific key from the cache
   * @param key - The key to remove
   * @returns true if key was present, false otherwise
   */
  delete?(key: K): boolean;

  /**
   * Optional: Clears all entries from the cache
   */
  clear?(): void;
}
```

The key insight: access = recency update
Notice that both get and put update recency. This is crucial and often overlooked:
- `get("A")` returns A's value and makes A the most recently used
- `put("B", value)` inserts/updates B and makes B the most recently used

This means simply reading a value prevents it from being evicted soon. The cache automatically adapts to access patterns without any explicit hints from the caller.
At any moment, all items in an LRU cache have a total ordering from most to least recently used. You can imagine them arranged in a line — every access moves that item to the front of the line, and eviction always removes from the back.
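Here is a short hypothetical trace that makes that ordering visible. The `cache` variable below stands in for any implementation of the `LRUCache` interface above (we build one in the coming pages), so this is a behavioral sketch rather than runnable code; the comments track the order from most to least recently used after each call.

```typescript
// Assume some implementation of the LRUCache interface with capacity 2.
declare const cache: LRUCache<string, number>;

cache.put("A", 1);   // order (MRU → LRU): A
cache.put("B", 2);   // order: B, A
cache.get("A");      // returns 1; order: A, B  (reading A refreshed its recency)
cache.put("C", 3);   // cache is full and B is least recently used, so B is evicted; order: C, A
cache.get("B");      // returns null (B was evicted)
cache.get("A");      // returns 1; order: A, C
```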
Before designing our optimal solution, let's consider naive approaches and understand why they're insufficient. This analysis will reveal the specific requirements our final design must satisfy.
Naive Approach 1: HashMap + Timestamps
Store each item with a timestamp of its last access. On eviction, scan all items to find the one with the oldest timestamp.
```
get(key):
    if key in map:
        map[key].timestamp = current_time()
        return map[key].value
    return null

put(key, value):
    if key in map:
        map[key].value = value
        map[key].timestamp = current_time()
    else:
        if map.size >= capacity:
            // Find and remove item with oldest timestamp - O(n) scan!
            oldest = find_min_timestamp(map)
            map.remove(oldest)
        map[key] = {value, timestamp: current_time()}
```
Analysis:
- `get`: O(1) ✓
- `put` (no eviction): O(1) ✓
- `put` (with eviction): O(n) ✗ — must scan all items to find minimum timestamp

With a cache of 100,000 items, every eviction requires comparing 100,000 timestamps. Unacceptable.
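For reference, the pseudocode above might translate into TypeScript roughly as follows; the class and field names are invented for this sketch, and the O(n) eviction scan is exactly the cost called out in the analysis.

```typescript
// Naive LRU: a Map plus a last-access timestamp per entry.
// get and put are O(1) until eviction, which must scan every entry.
class TimestampLRU<K, V> {
  private entries = new Map<K, { value: V; lastAccess: number }>();
  private tick = 0; // logical clock; avoids ties that Date.now() could produce

  constructor(private capacity: number) {}

  get(key: K): V | null {
    const entry = this.entries.get(key);
    if (!entry) return null;
    entry.lastAccess = ++this.tick;
    return entry.value;
  }

  put(key: K, value: V): void {
    const existing = this.entries.get(key);
    if (existing) {
      existing.value = value;
      existing.lastAccess = ++this.tick;
      return;
    }
    if (this.entries.size >= this.capacity) {
      // O(n) scan for the stalest entry: the bottleneck this page highlights.
      let oldestKey: K | undefined;
      let oldestTick = Infinity;
      for (const [k, e] of this.entries) {
        if (e.lastAccess < oldestTick) { oldestTick = e.lastAccess; oldestKey = k; }
      }
      if (oldestKey !== undefined) this.entries.delete(oldestKey);
    }
    this.entries.set(key, { value, lastAccess: ++this.tick });
  }
}
```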
Naive Approach 2: HashMap + Min-Heap
Use a min-heap ordered by timestamp to find the oldest item in O(log n).
Analysis:
- `get`: O(1) for map lookup, but updating the heap position requires O(n) search + O(log n) heapify → O(n) overall ✗
- `put`: Same problem — updating an existing key's position in the heap is expensive

The issue is that heaps support efficient insert and extract-min, but not efficient arbitrary-element update or delete.
Naive Approach 3: HashMap + Sorted Array
Keep items sorted by access time.
Analysis:

- `get`: the map finds the entry quickly, but moving the accessed item to its new position in the sorted array shifts O(n) elements ✗
- `put`: inserting a new entry (or repositioning an updated one) in sorted order likewise shifts O(n) elements ✗

Again, O(n) per operation is unacceptable.
| Approach | get() Time | put() Time | Space | Problem |
|---|---|---|---|---|
| HashMap + Timestamps | O(1) | O(n) eviction | O(n) | Slow eviction |
| HashMap + Min-Heap | O(n) | O(n) | O(n) | Slow recency update |
| HashMap + Sorted Array | O(n) | O(n) | O(n) | Slow moves |
| Optimal (next page) | O(1) | O(1) | O(n) | None! |
We need a data structure that supports three operations efficiently:

- Look up an item by key in O(1)
- Move any accessed item to the "most recently used" position in O(1)
- Find and remove the least recently used item in O(1)
No single standard data structure supports all three. The solution requires combining two data structures synergistically.
We've covered substantial ground in understanding the LRU eviction policy. Let's consolidate the key insights:

- Storage layers differ in speed by orders of magnitude, so keeping hot data close pays off enormously
- A cache's hit rate, and therefore its value, is determined largely by its eviction policy
- Bélády's optimal policy (evict the item used furthest in the future) is unattainable in practice but serves as the benchmark
- LRU approximates it well because real workloads exhibit temporal locality: recently used items tend to be used again soon
- In the LRU cache ADT, both get and put count as an access and refresh an item's recency
- Naive implementations built on timestamps, heaps, or sorted arrays all pay O(n) somewhere; the goal is O(1) for every operation
What's next:
Now that we understand what LRU does and why it's valuable, the next page examines the specific requirements that make O(1) operations challenging. We'll analyze exactly what operations our data structure must support and establish the complexity requirements that any production-grade LRU cache must meet.
You now understand the theoretical foundations of LRU caching: why caches need eviction policies, why LRU outperforms alternatives for most workloads, and what operations the LRU cache ADT must support. Next, we'll establish the O(1) requirement and understand why achieving it requires a clever design.