If there's one pattern that delivers outsized impact with relatively modest complexity, it's caching. A well-designed caching layer can reduce database load by 90% or more, decrease response times from hundreds of milliseconds to single digits, and defer expensive database scaling indefinitely.
The fundamental insight is simple: most data is read far more often than it's written, and the same data is requested repeatedly. If the same product page is viewed 10,000 times per hour but its data changes once per day, why fetch it from the database 10,000 times? Cache it once, serve it instantly, and refresh only when necessary.
Yet caching introduces its own challenges: cache invalidation (famously one of the "two hard things in computer science"), cache coherence, thundering herds, and cache penetration attacks. This page explores caching as a complete discipline—from fundamental concepts through production-grade implementation.
By the end of this page, you will understand how to design and implement effective caching layers at multiple levels of a system. You'll learn cache invalidation strategies, how to handle the challenges of distributed caching, and when caching is appropriate versus when it adds complexity without benefit.
Modern systems employ caching at multiple layers, each with different characteristics, trade-offs, and use cases. Understanding this hierarchy is essential for designing effective caching strategies.
Layer 1: Browser Cache The closest cache to the user. HTTP headers (Cache-Control, ETag, Last-Modified) instruct browsers to cache static assets (images, CSS, JavaScript) and even API responses. Cache hits never reach your servers.
Layer 2: CDN Cache (Edge) Content Delivery Networks cache content at points of presence (PoPs) geographically close to users. Reduces latency and offloads origin servers. Critical for static content and increasingly for dynamic content.
Layer 3: Application Cache (In-Memory) Local caches within application servers (e.g., Guava cache, local in-memory maps). Fastest access, but not shared across instances. Good for small, frequently accessed data that doesn't change often.
Layer 4: Distributed Cache Shared cache layer across the application tier (Redis, Memcached). Network hop required, but provides consistency across all instances and larger capacity than local memory.
Layer 5: Database Query Cache Some databases cache query results internally. Useful for transparent caching but limited control over invalidation.
| Layer | Location | Latency | Capacity | Shared | Best For |
|---|---|---|---|---|---|
| Browser Cache | Client device | 0ms (instant) | Limited (50-500MB) | No | Static assets, personalized data |
| CDN Cache | Edge PoPs | 1-20ms | Very large (TB+) | Yes (per-region) | Static content, cacheable APIs |
| Application Cache | App server memory | < 1ms | Limited (GB) | No (per instance) | Hot path data, computed values |
| Distributed Cache | Cache cluster | 1-5ms (network) | Large (100s GB) | Yes (global) | Session data, database query results |
| Database Cache | Database server | 1-10ms | Based on DB config | Yes | Transparent query caching |
The multi-layer strategy:
Effective systems employ caches at multiple layers, with each layer filtering requests:
If each layer achieves a 90% hit rate, each layer passes only 10% of its traffic downstream: 100% of requests reach the outermost cache, 10% reach the next layer, 1% the layer after that, and just 0.1% reach the database.
This is the power of cache layering—multiplicative reduction at each tier.
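The multiplicative effect can be made concrete with a small helper. This is illustrative arithmetic, not code from the text: each layer passes through the fraction `(1 - hitRate)`, and the fractions multiply.

```typescript
// Fraction of traffic that survives a stack of cache layers.
// Each layer passes (1 - hitRate) of its incoming traffic downstream.
function residualFraction(hitRates: number[]): number {
  return hitRates.reduce((fraction, hitRate) => fraction * (1 - hitRate), 1);
}

// Three layers at 90% each: 0.1 × 0.1 × 0.1 = 0.001,
// i.e. only 0.1% of requests reach the database.
const reachesDatabase = residualFraction([0.9, 0.9, 0.9]);
```

Note how adding a mediocre fourth layer (say, 50% hit rate) still halves database traffic—layers compound.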
When designing caching, work from outside in. Every request that can be served by the browser or CDN never reaches your infrastructure. Maximize these layers before optimizing internal caches. A proper CDN configuration often reduces origin traffic by 70-90% for content-heavy applications.
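Working from the outside in starts with HTTP headers. As an illustrative sketch (the header values are common defaults, not prescribed by the text), a simple policy table maps content classes to `Cache-Control` directives:

```typescript
// Hypothetical Cache-Control policies by content class.
// Values are common conventions, shown for illustration.
const cachePolicies: Record<"static-asset" | "api-cacheable" | "personalized", string> = {
  // Fingerprinted assets can be cached "forever" at browser and CDN
  "static-asset": "public, max-age=31536000, immutable",
  // Short shared TTL, with stale-while-revalidate for smooth refresh
  "api-cacheable": "public, max-age=60, stale-while-revalidate=300",
  // Personalized responses must never be stored by shared caches
  "personalized": "private, no-store",
};
```

Every response tagged `static-asset` here is served by browsers and CDN PoPs without touching the origin until the asset's URL changes.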
Redis has become the de facto standard for distributed caching. Its rich data structures, atomic operations, and excellent performance make it suitable for a wide range of caching patterns.
Why Redis for caching?
Performance: Redis operates entirely in memory. Single-threaded event loop eliminates lock contention. Typical latency is < 1ms for simple operations.
Data structures: Beyond simple key-value, Redis supports lists, sets, sorted sets, hashes, bitmaps, and more. These enable patterns impossible with simple caches.
Atomic operations: INCR, LPUSH, ZADD, and other atomic commands enable race-condition-free counter updates, rate limiting, and leaderboards.
Persistence options: Optional disk persistence (RDB snapshots, AOF logs) protects against data loss during restarts—useful when cache warmup is expensive.
Cluster mode: Redis Cluster provides automatic sharding and failover for horizontal scaling.
```typescript
import Redis from 'ioredis';

const redis = new Redis({
  host: 'redis-cluster.example.com',
  port: 6379,
  retryDelayOnFailover: 100,
  maxRetriesPerRequest: 3,
});

// Pattern 1: Simple cache-aside with TTL
async function getCachedUser(userId: string): Promise<User | null> {
  const cacheKey = `user:${userId}`;

  // Try cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Cache miss - fetch from database
  const user = await database.getUser(userId);
  if (user) {
    // Cache for 1 hour
    await redis.setex(cacheKey, 3600, JSON.stringify(user));
  }
  return user;
}

// Pattern 2: Hash for object storage (more memory efficient)
async function getCachedProduct(productId: string): Promise<Product | null> {
  const cacheKey = `product:${productId}`;
  const cached = await redis.hgetall(cacheKey);

  if (Object.keys(cached).length > 0) {
    return {
      id: cached.id,
      name: cached.name,
      price: parseFloat(cached.price),
      stock: parseInt(cached.stock, 10),
    };
  }

  const product = await database.getProduct(productId);
  if (product) {
    await redis.hmset(cacheKey, {
      id: product.id,
      name: product.name,
      price: product.price.toString(),
      stock: product.stock.toString(),
    });
    await redis.expire(cacheKey, 3600);
  }
  return product;
}

// Pattern 3: Sorted sets for leaderboards/ranking
async function getTopScores(limit: number = 10): Promise<LeaderboardEntry[]> {
  // ZREVRANGE returns highest scores first
  const results = await redis.zrevrange('leaderboard', 0, limit - 1, 'WITHSCORES');

  const entries: LeaderboardEntry[] = [];
  for (let i = 0; i < results.length; i += 2) {
    entries.push({
      userId: results[i],
      score: parseFloat(results[i + 1]),
    });
  }
  return entries;
}

// Pattern 4: Rate limiting with sliding window
async function isRateLimited(userId: string, limit: number, windowSecs: number): Promise<boolean> {
  const key = `ratelimit:${userId}`;
  const now = Date.now();
  const windowStart = now - (windowSecs * 1000);

  // Remove old entries
  await redis.zremrangebyscore(key, 0, windowStart);

  // Count recent entries
  const count = await redis.zcard(key);
  if (count >= limit) {
    return true; // Rate limited
  }

  // Add current request
  await redis.zadd(key, now, `${now}-${Math.random()}`);
  await redis.expire(key, windowSecs);
  return false;
}
```

Memcached is simpler and can be slightly faster for pure key-value workloads. Redis offers richer data structures, persistence, and Lua scripting. For most applications, Redis's flexibility outweighs Memcached's marginal performance advantage. Use Memcached only if you have specific needs (legacy compatibility, extreme simplicity) that Redis doesn't address.
The cache-aside pattern (also called "lazy loading") is the most common caching strategy. The application manages both cache and database, loading data into cache on demand.
The flow:
1. The application receives a read request and checks the cache first.
2. Cache hit: return the cached value immediately.
3. Cache miss: query the database, store the result in the cache with a TTL, and return it.

Advantages: only data that is actually requested gets cached; the cache can fail without taking reads down (requests fall through to the database); and the logic is simple enough to implement in a few lines.

Disadvantages: every miss pays three costs (cache check, database read, cache write); data can be stale until the TTL expires or the key is invalidated; and a cold or freshly flushed cache produces a burst of misses.
Critical implementation considerations:
1. TTL selection: Every cached item needs a TTL (time-to-live). Too short: frequent cache misses, database hammered. Too long: stale data served to users.
Rules of thumb: rarely changing reference data (product catalogs, configuration) can tolerate TTLs of hours; frequently updated data (inventory, prices) should use seconds to a few minutes; and anything a user expects to see immediately after a write should pair a short TTL with explicit invalidation.
2. Serialization format: JSON is readable but verbose. MessagePack or Protocol Buffers are more compact but require schema changes. For most applications, JSON is fine—Redis is memory-bound before serialization becomes a bottleneck.
3. Cache key design: Keys should be descriptive and namespaced (entity type, then identifier), versioned so schema changes don't deserialize into stale shapes, and deterministic—the same logical lookup must always produce the same key.
Good: user:123:v2, product:abc:inventory
Bad: 123, data, cache_key_1
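A small key builder can enforce these conventions mechanically. This is a hypothetical helper, not from the text; the sanitization rule is an assumption:

```typescript
// Hypothetical cache key builder: namespaced, colon-delimited, versioned.
// Sanitizes the identifier so whitespace or delimiters can't corrupt the key space.
function buildCacheKey(namespace: string, id: string, version = "v1"): string {
  const safeId = id.replace(/[^a-zA-Z0-9_-]/g, "_");
  return `${namespace}:${safeId}:${version}`;
}

// buildCacheKey("user", "123", "v2") produces "user:123:v2"
```

Centralizing key construction in one function also makes bulk invalidation by prefix or version bump far easier later.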
4. Null caching: If results for a query are legitimately empty, cache the empty result. Otherwise, repeated queries for non-existent data always hit the database. Use a sentinel value or short TTL for negative caching.
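Negative caching can be sketched with a sentinel value. This is a minimal illustration using an in-memory `Map` in place of Redis; the sentinel string and TTL values are assumptions:

```typescript
// Sentinel marking "we looked this up and found nothing"
const NEGATIVE = "__NULL__";

// In-memory stand-in for Redis, for illustration only
const cache = new Map<string, { value: string; expiresAt: number }>();

async function getWithNegativeCaching(
  key: string,
  loader: () => Promise<string | null>
): Promise<string | null> {
  const entry = cache.get(key);
  if (entry && entry.expiresAt > Date.now()) {
    // A cached sentinel answers "not found" without touching the database
    return entry.value === NEGATIVE ? null : entry.value;
  }

  const value = await loader();
  if (value === null) {
    // Cache the miss briefly so repeated lookups don't hammer the database
    cache.set(key, { value: NEGATIVE, expiresAt: Date.now() + 30_000 });
  } else {
    cache.set(key, { value, expiresAt: Date.now() + 3_600_000 });
  }
  return value;
}
```

The short 30-second TTL on the sentinel bounds how long a newly created record appears missing.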
Cache-aside only addresses reads. On writes, you must decide: invalidate the cache (delete the key) or update the cache (set new value). Generally, invalidation is safer—it avoids race conditions where a stale read happens between database write and cache update. Update only if you can guarantee atomicity.
Phil Karlton's famous quote—"There are only two hard things in Computer Science: cache invalidation and naming things"—resonates because cache invalidation is genuinely difficult. When data changes, cached copies become stale. The question is: how and when do we detect and correct this?
Strategy 1: Time-Based Expiration (TTL)
The simplest approach: every cached item expires after a fixed duration.
Pros: trivially simple to implement; self-healing, since staleness is bounded by the TTL; no invalidation code paths to build or maintain.

Cons: data can be stale for up to the full TTL; shortening the TTL shifts load back onto the database; and keys written together tend to expire together, creating periodic load spikes.
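One common refinement (an assumption here, not prescribed by the text) is adding random jitter to TTLs so that keys written in the same burst don't all expire in the same instant:

```typescript
// Spread expirations by adding up to jitterFraction of random extra lifetime.
// Keys cached together then expire spread across a window instead of at once.
function ttlWithJitter(baseSeconds: number, jitterFraction = 0.1): number {
  const jitter = baseSeconds * jitterFraction * Math.random();
  return Math.round(baseSeconds + jitter);
}

// e.g. ttlWithJitter(3600) returns something between 3600 and 3960 seconds
```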
```typescript
// Pattern 1: Write-Through (synchronous cache update)
async function updateUserWithWriteThrough(userId: string, updates: Partial<User>) {
  const cacheKey = `user:${userId}`;

  // Start transaction
  const trx = await database.transaction();
  try {
    // Update database
    const user = await trx.update('users', userId, updates);

    // Update cache
    await redis.setex(cacheKey, 3600, JSON.stringify(user));

    // Commit transaction
    await trx.commit();
    return user;
  } catch (error) {
    await trx.rollback();
    // On failure, invalidate cache to prevent stale data
    await redis.del(cacheKey);
    throw error;
  }
}

// Pattern 2: Event-Based Invalidation
class UserService {
  private eventBus: EventBus;

  async updateUser(userId: string, updates: Partial<User>) {
    const user = await database.update('users', userId, updates);

    // Emit event for cache invalidation
    await this.eventBus.publish('user.updated', {
      userId,
      timestamp: Date.now(),
    });
    return user;
  }
}

// Separate cache invalidation handler
class CacheInvalidator {
  constructor(private redis: Redis, private eventBus: EventBus) {
    this.eventBus.subscribe('user.updated', this.handleUserUpdate.bind(this));
  }

  private async handleUserUpdate(event: { userId: string }) {
    await this.redis.del(`user:${event.userId}`);
    // Also invalidate related caches
    await this.redis.del(`user:${event.userId}:preferences`);
    await this.redis.del(`user:${event.userId}:activity`);
  }
}

// Pattern 3: Version-Based Cache Keys
class VersionedCache {
  private versionKey = 'cache:version:users';

  async getVersion(): Promise<number> {
    const version = await redis.get(this.versionKey);
    return version ? parseInt(version, 10) : 1;
  }

  async incrementVersion(): Promise<number> {
    return redis.incr(this.versionKey);
  }

  async getUser(userId: string): Promise<User | null> {
    const version = await this.getVersion();
    const cacheKey = `user:${userId}:v${version}`;

    const cached = await redis.get(cacheKey);
    if (cached) return JSON.parse(cached);

    const user = await database.getUser(userId);
    if (user) {
      await redis.setex(cacheKey, 3600, JSON.stringify(user));
    }
    return user;
  }

  async invalidateAllUsers(): Promise<void> {
    // Simply increment version; old keys expire naturally
    await this.incrementVersion();
  }
}
```

Production systems often combine strategies. Use TTL as a safety net (data never stale for more than X minutes), event-based invalidation for real-time updates (changes reflected immediately), and version keys for bulk invalidation (invalidate all users after a schema migration). Each strategy covers different failure modes.
One of the most dangerous caching failure modes is the cache stampede (also called thundering herd). It occurs when a popular cache key expires and multiple requests simultaneously attempt to regenerate it, overwhelming the database.
The scenario: a popular key—say, the homepage feed—expires. Within milliseconds, hundreds of concurrent requests all miss the cache, and every one of them issues the same expensive database query. The database, sized for cached traffic, buckles under the sudden load, which slows the queries, which keeps the cache empty, which lets still more requests pile on.
The irony: The cache was protecting the database. When it expires, the protection disappears precisely when traffic is highest.
```typescript
// Pattern 1: Distributed Lock (Mutex)
async function getCachedWithLock<T>(
  key: string,
  ttlSeconds: number,
  loader: () => Promise<T>
): Promise<T | null> {
  // Try cache first
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const lockKey = `lock:${key}`;
  const lockTTL = 10; // Lock expires after 10 seconds

  // Try to acquire lock
  const acquired = await redis.set(lockKey, '1', 'EX', lockTTL, 'NX');

  if (acquired) {
    // We have the lock - load data
    try {
      const data = await loader();
      await redis.setex(key, ttlSeconds, JSON.stringify(data));
      return data;
    } finally {
      await redis.del(lockKey);
    }
  } else {
    // Another process is loading - wait and retry
    await sleep(100);
    const retried = await redis.get(key);
    if (retried) return JSON.parse(retried);
    // Still no data - retry with backoff
    return getCachedWithLock(key, ttlSeconds, loader);
  }
}

// Pattern 2: Probabilistic Early Expiration
async function getCachedWithProbabilisticRefresh<T>(
  key: string,
  ttlSeconds: number,
  earlyRefreshWindow: number, // seconds before expiration to consider refresh
  loader: () => Promise<T>
): Promise<T | null> {
  const cached = await redis.get(key);

  if (cached) {
    const { data, createdAt } = JSON.parse(cached);
    const age = (Date.now() - createdAt) / 1000;
    const timeToExpiry = ttlSeconds - age;

    if (timeToExpiry < earlyRefreshWindow) {
      // In early refresh window - probabilistically refresh
      // Probability increases as we approach expiration
      const refreshProbability = 1 - (timeToExpiry / earlyRefreshWindow);
      if (Math.random() < refreshProbability) {
        // Refresh in background (don't await)
        refreshCache(key, ttlSeconds, loader).catch(console.error);
      }
    }
    return data;
  }

  // Cache miss - load synchronously
  const data = await loader();
  await redis.setex(key, ttlSeconds, JSON.stringify({
    data,
    createdAt: Date.now(),
  }));
  return data;
}

// Pattern 3: Stale-While-Revalidate
async function getCachedSWR<T>(
  key: string,
  ttlSeconds: number,
  staleWhileRevalidate: number,
  loader: () => Promise<T>
): Promise<T | null> {
  const cached = await redis.get(key);

  if (cached) {
    const { data, createdAt } = JSON.parse(cached);
    const age = (Date.now() - createdAt) / 1000;

    if (age > ttlSeconds) {
      if (age < ttlSeconds + staleWhileRevalidate) {
        // Stale but within revalidate window:
        // return stale immediately, refresh in background
        refreshCache(key, ttlSeconds, loader).catch(console.error);
        return data;
      }
      // Too stale - let it fall through to reload
    } else {
      return data; // Fresh data
    }
  }

  // No cache or too stale - load synchronously
  const data = await loader();
  await saveToCache(key, data, ttlSeconds);
  return data;
}
```

While locking prevents stampedes, it can create contention under very high concurrency. If 10,000 requests wait for one lock, you've traded database overload for lock waiting. For extremely hot keys, consider background refresh or request coalescing instead of locks. The right pattern depends on your traffic patterns.
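Request coalescing, mentioned above as an alternative to locks, can be done entirely in-process: concurrent callers for the same key share one in-flight loader promise, so the database sees a single query per instance. A minimal sketch:

```typescript
// In-process request coalescing: all concurrent callers for a key
// await the same loader promise; only one load actually runs.
const inFlight = new Map<string, Promise<unknown>>();

function coalesce<T>(key: string, loader: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>;

  // Remove the entry once settled so later calls trigger a fresh load
  const promise = loader().finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}
```

Unlike a distributed lock, this only deduplicates within one process; across N application instances the database still sees up to N concurrent loads, which is usually a tolerable ceiling.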
When data exists in multiple places—database, distributed cache, local caches across multiple servers—maintaining consistency becomes challenging. Cache coherence refers to ensuring all cache copies reflect the same state.
The fundamental tension:
Stronger consistency requires more coordination, which reduces performance. Weaker consistency improves performance but allows stale reads. There's no universal right answer—the choice depends on your domain.
Consistency levels in caching:
Strong consistency: Cache always reflects current database state. Requires synchronous invalidation on every write. Practically impossible with local/CDN caches.
Eventual consistency: Cache will converge to database state within a bounded time (TTL). Most common approach. Allows brief windows of stale data.
Read-your-writes consistency: A user sees their own writes immediately, even if others see stale data. Implemented by routing reads to primary after writes.
Causal consistency: Related operations are seen in order. If A causes B, no observer sees B before A. Complex to implement in distributed caches.
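Read-your-writes consistency is often implemented with a short "sticky" window. The sketch below is an assumption about one common implementation, not a prescribed design—after a user writes, that user's reads are routed to the primary for a few seconds while replicas and caches catch up:

```typescript
// Track each user's last write; route their reads to the primary
// for a short window afterward so they always see their own writes.
const lastWriteAt = new Map<string, number>();
const STICKY_WINDOW_MS = 5_000;

function recordWrite(userId: string): void {
  lastWriteAt.set(userId, Date.now());
}

function readTarget(userId: string): "primary" | "cache" {
  const last = lastWriteAt.get(userId);
  return last !== undefined && Date.now() - last < STICKY_WINDOW_MS
    ? "primary"
    : "cache";
}
```

Other users still read from cache and may briefly see stale data, which is exactly the trade this consistency level accepts.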
| Use Case | Required Consistency | Approach | Staleness Tolerance |
|---|---|---|---|
| User profile display | Eventual | TTL + event invalidation | Minutes acceptable |
| Account balance | Strong | Cache-aside with short TTL or no cache | Zero tolerance |
| Shopping cart | Read-your-writes | Write-through + user affinity | Own changes immediate |
| Product inventory | Eventual with bounds | Event invalidation + 30s TTL | Brief oversell acceptable |
| News feed | Eventual | Long TTL, background refresh | Hours acceptable |
| Session authentication | Strong | In-memory or very short TTL | Zero tolerance |
The dual-write problem:
A common pitfall occurs when updating cache and database separately. Consider two concurrent requests writing the same balance:
1. Request B writes 100 to the database.
2. Request A writes 150 to the database (A's write lands last, so the database holds 150).
3. Request A updates the cache to 150.
4. Request B's delayed cache update overwrites it with 100.
Result: Database has 150, cache has 100. Data is inconsistent.
Solutions:
Invalidate, don't update — Delete cache key on write. Next read fetches fresh data. Race conditions cause extra cache misses, not inconsistency.
Single writer — All writes to a key go through one service. No concurrent conflicts.
Conditional updates — Use CAS (compare-and-set) operations. Update cache only if version matches.
Transaction log — Write to database, then publish to cache via change data capture. Order is guaranteed by log ordering.
The safest cache update strategy is deletion. On any write, delete the cache key. The next read will populate fresh data. This trades a cache miss for guaranteed consistency. Most systems can tolerate occasional cache misses far better than inconsistent data.
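The delete-on-write discipline is short enough to sketch directly. Here an in-memory `Map` stands in for Redis, and the database write is a placeholder assumption:

```typescript
// In-memory stand-in for the distributed cache, for illustration
const kv = new Map<string, string>();

// Write the database first, then delete (not update) the cache key.
// The next read misses and repopulates from the fresh database state.
async function updateAndInvalidate(
  cacheKey: string,
  databaseWrite: () => Promise<void>
): Promise<void> {
  await databaseWrite(); // 1. Durable write happens first
  kv.delete(cacheKey);   // 2. Then drop the cached copy
}
```

Ordering matters: deleting first and writing second leaves a window where a concurrent read repopulates the cache with the old value.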
Cache sizing is both art and science. Too small: low hit rate, frequent eviction, limited benefit. Too large: wasted resources, cold data consuming memory.
Key metrics for sizing:
Working set size: How much data is actively accessed? If you have 100GB of data but only 1GB is accessed in any hour, the working set is ~1GB.
Hit rate target: What hit rate do you need? 90%? 99%? Higher hit rates require more memory to store more data.
Object size: Average size of cached items. Determines how many items fit in a given memory budget.
Access pattern: Uniform access (all items equally likely) or skewed (some items much hotter than others)? Skewed patterns allow smaller caches—hot items stay in cache.
Sizing calculation example:
Scenario: 1 million registered users, each cached profile averaging 2KB (so caching everything would take roughly 2GB). Only about 150,000 users are active in any given hour—that is the working set.

Calculation: 150,000 active users × 2KB per profile = 300MB.
With 10% buffer for metadata and fragmentation: 330MB
This is much smaller than caching all 1M users (2GB) because only active users need caching.
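The sizing arithmetic generalizes into a small helper. Parameter names are assumptions for illustration:

```typescript
// Cache size estimate: working set × average object size, plus an
// overhead buffer for metadata and memory fragmentation.
function cacheSizeMB(
  activeItems: number,
  avgObjectKB: number,
  overheadFraction = 0.1
): number {
  const rawMB = (activeItems * avgObjectKB) / 1000;
  return Math.round(rawMB * (1 + overheadFraction));
}

// 150,000 active users × 2KB = 300MB, +10% buffer = 330MB
// versus 1,000,000 users × 2KB = 2,000MB for the full dataset
```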
Monitoring cache efficiency:
Track these metrics continuously: hit rate (hits divided by total lookups), eviction rate (items evicted before expiry signal an undersized cache), memory usage against the configured limit, and cache operation latency (p99 spikes often reveal network issues or oversized values).
Most real-world access patterns follow a Zipf-like power law: roughly 20% of items receive 80% of traffic. This means a cache sized for 20% of data can achieve around 80% hit rate. Understand your access pattern before sizing. Uniform random access is the worst case; real traffic is usually much more skewed—and therefore more cache-friendly.
Caches introduce security considerations that are easy to overlook. Since caches are designed for performance, security features may be minimal. Understanding and mitigating cache-related attacks is essential.
Attack Vector 1: Cache Poisoning
An attacker injects malicious content into the cache, which is then served to legitimate users.
Example: If cache keys include user-controlled data (like a URL parameter) without proper validation, an attacker can pollute caches with malicious content.
Mitigation: normalize and validate every input that participates in a cache key; include in the key only the headers and parameters the response actually varies on (and configure CDN Vary behavior to match); and never cache responses built from unvalidated user input.
Attack Vector 2: Cache Timing Attacks
An attacker measures response times to determine if data is cached, revealing information about other users' access patterns.
Mitigation: partition caches by user or tenant so one user's lookups cannot be probed through another's entries, and avoid caching data whose mere presence in the cache is sensitive. Normalizing response times is possible but rarely practical; isolation is the stronger defense.
Security best practices:
Network security: Redis should not be exposed to the internet. Use VPCs, firewalls, and Redis AUTH.
Encryption in transit: Use TLS for connections to cache, especially in cloud environments.
Access control: Use separate cache instances or key prefixes for different trust levels.
Audit logging: Log cache operations for security forensics.
Sensitive data handling: Encrypt sensitive data before caching, or don't cache it at all. Authentication tokens, payment information, and personal data require special care.
Cache key hygiene: Never use user-controlled values directly in cache keys. Hash or sanitize inputs.
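Hashing is the simplest robust sanitizer for user-controlled key components. A minimal sketch using Node's built-in crypto module (the function name and namespace scheme are assumptions):

```typescript
import { createHash } from "node:crypto";

// Hash user-controlled input before it becomes part of a cache key,
// so attacker-chosen strings cannot inject delimiters, overflow key
// length limits, or collide with other namespaces.
function safeCacheKey(namespace: string, userInput: string): string {
  const digest = createHash("sha256").update(userInput).digest("hex");
  return `${namespace}:${digest}`;
}
```

The cost is that keys are no longer human-readable in debugging tools; some teams log the input-to-digest mapping separately to keep keys traceable.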
Redis has historically been designed for trusted environments and has minimal security by default. Always enable AUTH, bind to internal interfaces only, use TLS, and disable dangerous commands (FLUSHDB, CONFIG, KEYS) in production. Never expose Redis to the internet without these protections.
Caching is one of the most powerful tools in the scaling engineer's toolkit. Let's consolidate the key learnings:
- Cache in layers—browser, CDN, application, distributed—and work from the outside in.
- Cache-aside with well-chosen TTLs and disciplined key design is the right default pattern.
- Combine invalidation strategies: TTL as a safety net, events for real-time updates, versioned keys for bulk invalidation.
- Protect hot keys from stampedes with locks, probabilistic early refresh, or stale-while-revalidate.
- On writes, prefer deleting cache keys to updating them.
- Size for the working set, not the full dataset; real traffic is skewed and therefore cache-friendly.
- Treat the cache as part of the security perimeter: lock down Redis, sanitize keys, and handle sensitive data with care.
What's next:
With caching understood, we turn to queue-based decoupling—another essential scaling pattern. Queues decouple producers from consumers, enabling asynchronous processing, traffic shaping, and resilience to transient failures. The next page explores message queue architectures, processing patterns, and when to introduce queues into your system.
You now have a comprehensive understanding of caching as a scaling strategy—from cache hierarchies through invalidation strategies, stampede prevention, and security considerations. Caching is often the highest-leverage optimization available, and this knowledge enables you to apply it effectively.