Where should your cache live? This deceptively simple question has profound implications for your system's performance, consistency, and operational complexity.
Local caches (in-process, same machine) offer blazing speed—nanosecond access times with zero network overhead. But they're isolated: each application instance has its own cache with potentially different data.
Distributed caches (Redis, Memcached, shared across instances) provide a single source of truth that all instances share. But every access crosses the network, adding latency and failure modes.
Neither approach is universally superior. Principal engineers understand that cache topology is an architectural decision whose trade-offs must align with system requirements. Getting it wrong leads either to excessive latency (over-reliance on the distributed cache) or to consistency nightmares (poorly managed local caches).
By the end of this page, you will understand the characteristics, trade-offs, and appropriate use cases for local and distributed caches. You'll learn cache coherence strategies, multi-tier architectures, and a decision framework for choosing cache topology in your systems.
Cache topology refers to where cache storage is located relative to your application instances. The three primary topologies are:
1. Local Cache (In-Process)
2. Distributed Cache (Shared)
3. Multi-Tier Cache (Layered)
| Aspect | Local Cache | Distributed Cache |
|---|---|---|
| Access Latency | < 1μs (sub-microsecond) | 1-10ms (network RTT) |
| Total Capacity | Bounded by instance RAM | Scales independently |
| Consistency Across Instances | No built-in coherence | Single source of truth |
| Failure Impact | Lost on instance restart | Survives instance restarts |
| Operational Complexity | Zero (embedded) | Additional infrastructure |
| Cost | Uses application memory | Separate compute/memory cost |
A local cache lookup takes roughly 100 nanoseconds. A Redis lookup takes roughly 1 millisecond including the network round trip: a 10,000x difference. For hot paths that need sub-millisecond responses, this gap matters enormously. For cold paths or large datasets, the shared capacity of a distributed cache matters more.
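The compounding effect of per-lookup latency can be sketched with back-of-envelope arithmetic. The figures below are the rough numbers from the text (about 100ns local, about 1ms remote), not measurements from any specific system:

```typescript
// Illustrative constants taken from the rough figures in the text
const LOCAL_LOOKUP_MS = 0.0001; // ~100 nanoseconds
const REMOTE_LOOKUP_MS = 1;     // ~1 millisecond network round trip

// Total cache-lookup overhead for a single request
function cacheOverheadMs(lookupsPerRequest: number, perLookupMs: number): number {
  return lookupsPerRequest * perLookupMs;
}

// A request that performs 50 cache reads:
console.log(cacheOverheadMs(50, LOCAL_LOOKUP_MS));  // ≈ 0.005 ms (negligible)
console.log(cacheOverheadMs(50, REMOTE_LOOKUP_MS)); // 50 ms (dominates a 100 ms budget)
```

This is why chatty request paths (many lookups per request) push systems toward a local tier, while a handful of lookups per request can comfortably go to Redis.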
Local caches are embedded within your application process. They offer unmatched performance but require careful management to avoid problems.
A reference implementation illustrates the essential features: TTL expiry, LRU eviction, optional memory bounds, and hit/miss metrics.
```typescript
/**
 * Production-ready local cache with essential features.
 */
interface LocalCacheConfig<K, V> {
  maxSize: number;                       // Maximum items
  maxMemoryBytes?: number;               // Maximum memory (if tracking size)
  ttlMs: number;                         // Time-to-live for entries
  refreshMs?: number;                    // Background refresh interval
  sizeEstimator?: (value: V) => number;  // For memory tracking
}

class LocalCache<K, V> {
  private cache: Map<K, CacheEntry<V>> = new Map();
  private config: LocalCacheConfig<K, V>;
  private currentMemory = 0;

  // Metrics
  private metrics = { hits: 0, misses: 0, evictions: 0, refreshes: 0 };

  constructor(config: LocalCacheConfig<K, V>) {
    this.config = config;
    this.startCleanupTask();
  }

  get(key: K): V | undefined {
    const entry = this.cache.get(key);
    if (!entry) {
      this.metrics.misses++;
      return undefined;
    }
    // Check TTL
    if (Date.now() > entry.expiresAt) {
      this.cache.delete(key);
      this.currentMemory -= entry.memorySize;
      this.metrics.misses++;
      return undefined;
    }
    // Update access time for LRU
    entry.lastAccess = Date.now();
    this.metrics.hits++;
    return entry.value;
  }

  set(key: K, value: V): void {
    const memorySize = this.config.sizeEstimator?.(value) ?? 1;
    // Remove existing entry if present
    if (this.cache.has(key)) {
      const existing = this.cache.get(key)!;
      this.currentMemory -= existing.memorySize;
      this.cache.delete(key);
    }
    // Evict if necessary
    this.evictIfNeeded(memorySize);
    // Insert new entry
    const entry: CacheEntry<V> = {
      value,
      createdAt: Date.now(),
      lastAccess: Date.now(),
      expiresAt: Date.now() + this.config.ttlMs,
      memorySize,
    };
    this.cache.set(key, entry);
    this.currentMemory += memorySize;
  }

  /**
   * Get with loader - returns cached value or loads and caches.
   */
  async getOrLoad(key: K, loader: () => Promise<V>): Promise<V> {
    const cached = this.get(key);
    if (cached !== undefined) {
      return cached;
    }
    // Load and cache
    const value = await loader();
    this.set(key, value);
    return value;
  }

  private evictIfNeeded(incomingSize: number): void {
    // Evict by count
    while (this.cache.size >= this.config.maxSize) {
      this.evictOldest();
    }
    // Evict by memory (if configured)
    if (this.config.maxMemoryBytes) {
      while (
        this.currentMemory + incomingSize > this.config.maxMemoryBytes &&
        this.cache.size > 0
      ) {
        this.evictOldest();
      }
    }
  }

  private evictOldest(): void {
    let oldestKey: K | undefined;
    let oldestAccess = Infinity;
    for (const [key, entry] of this.cache) {
      if (entry.lastAccess < oldestAccess) {
        oldestAccess = entry.lastAccess;
        oldestKey = key;
      }
    }
    if (oldestKey !== undefined) {
      const entry = this.cache.get(oldestKey)!;
      this.cache.delete(oldestKey);
      this.currentMemory -= entry.memorySize;
      this.metrics.evictions++;
    }
  }

  private startCleanupTask(): void {
    // Periodic cleanup of expired entries
    setInterval(() => {
      const now = Date.now();
      for (const [key, entry] of this.cache) {
        if (now > entry.expiresAt) {
          this.cache.delete(key);
          this.currentMemory -= entry.memorySize;
        }
      }
    }, Math.min(this.config.ttlMs / 2, 60000));
  }

  invalidate(key: K): boolean {
    const entry = this.cache.get(key);
    if (entry) {
      this.currentMemory -= entry.memorySize;
      this.cache.delete(key);
      return true;
    }
    return false;
  }

  clear(): void {
    this.cache.clear();
    this.currentMemory = 0;
  }

  getMetrics(): LocalCacheMetrics {
    const total = this.metrics.hits + this.metrics.misses;
    return {
      hitRate: total > 0 ? this.metrics.hits / total : 0,
      size: this.cache.size,
      memoryUsed: this.currentMemory,
      ...this.metrics,
    };
  }
}

interface CacheEntry<V> {
  value: V;
  createdAt: number;
  lastAccess: number;
  expiresAt: number;
  memorySize: number;
}

interface LocalCacheMetrics {
  hitRate: number;
  size: number;
  memoryUsed: number;
  hits: number;
  misses: number;
  evictions: number;
  refreshes: number;
}

// Usage example (UserProfile is an application-defined type)
const userCache = new LocalCache<string, UserProfile>({
  maxSize: 10000,
  maxMemoryBytes: 100 * 1024 * 1024, // 100MB
  ttlMs: 5 * 60 * 1000,              // 5 minutes
  sizeEstimator: (user) => JSON.stringify(user).length * 2, // Rough estimate
});
```

Local caches consume application heap memory. Large caches can cause GC pressure, especially in Java/JVM environments. Monitor application memory and GC metrics when using local caches. For very large local caches on the JVM, consider off-heap stores (e.g., Ehcache's off-heap tier or Chronicle Map) rather than heap-based libraries like Caffeine.
Distributed caches run as separate services, providing shared storage accessible by all application instances. They're the backbone of scalable caching architectures.
Popular Distributed Cache Systems:
| System | Strengths | Best For | Considerations |
|---|---|---|---|
| Redis | Rich data types, Pub/Sub, Persistence | General purpose, Sessions, Queues | Single-threaded, Memory-bound |
| Memcached | Simplicity, Multi-threaded | Pure key-value caching | No persistence, No data types |
| Hazelcast | Distributed computing, Near-cache | Java apps, Compute + Cache | JVM-only traditionally |
| Apache Ignite | SQL queries, Compute grid | Data grid, Analytics cache | Complexity, Resource intensive |
| Couchbase | JSON documents, Mobile sync | Document caching, Mobile | Operational overhead |
```typescript
import Redis from 'ioredis';

/**
 * Production distributed cache client with resilience patterns.
 */
interface DistributedCacheConfig {
  redisUrl: string;
  defaultTtlSeconds: number;
  connectionTimeout: number;
  maxRetries: number;
  retryDelayMs: number;
  prefix: string;
}

class DistributedCache<V> {
  private redis: Redis;
  private config: DistributedCacheConfig;
  private connected = false;

  // Circuit breaker state
  private failures = 0;
  private circuitOpen = false;
  private lastFailure = 0;

  constructor(config: DistributedCacheConfig) {
    this.config = config;
    this.redis = new Redis(config.redisUrl, {
      connectTimeout: config.connectionTimeout,
      retryStrategy: (times) => {
        if (times > config.maxRetries) {
          return null; // Stop retrying
        }
        return Math.min(times * config.retryDelayMs, 5000);
      },
      lazyConnect: true,
    });
    this.setupConnectionHandlers();
  }

  private setupConnectionHandlers(): void {
    this.redis.on('connect', () => {
      this.connected = true;
      this.failures = 0;
      this.circuitOpen = false;
      console.log('Distributed cache connected');
    });
    this.redis.on('error', (err) => {
      console.error('Distributed cache error:', err.message);
      this.recordFailure();
    });
    this.redis.on('close', () => {
      this.connected = false;
      console.warn('Distributed cache connection closed');
    });
  }

  private recordFailure(): void {
    this.failures++;
    this.lastFailure = Date.now();
    // Open circuit after 5 consecutive failures
    if (this.failures >= 5) {
      this.circuitOpen = true;
      console.warn('Cache circuit breaker opened');
      // Auto-reset after 30 seconds
      setTimeout(() => {
        this.circuitOpen = false;
        this.failures = 0;
        console.log('Cache circuit breaker reset');
      }, 30000);
    }
  }

  private fullKey(key: string): string {
    return `${this.config.prefix}:${key}`;
  }

  async get(key: string): Promise<V | null> {
    if (this.circuitOpen) {
      return null; // Fail fast
    }
    try {
      const data = await this.redis.get(this.fullKey(key));
      if (!data) return null;
      return JSON.parse(data) as V;
    } catch (error) {
      this.recordFailure();
      throw error;
    }
  }

  async set(key: string, value: V, ttlSeconds?: number): Promise<boolean> {
    if (this.circuitOpen) {
      return false; // Fail fast
    }
    try {
      const serialized = JSON.stringify(value);
      const ttl = ttlSeconds ?? this.config.defaultTtlSeconds;
      await this.redis.setex(this.fullKey(key), ttl, serialized);
      return true;
    } catch (error) {
      this.recordFailure();
      throw error;
    }
  }

  /**
   * Delete a single key (used by the tiered cache later in this page).
   */
  async invalidate(key: string): Promise<boolean> {
    if (this.circuitOpen) return false;
    try {
      return (await this.redis.del(this.fullKey(key))) > 0;
    } catch (error) {
      this.recordFailure();
      throw error;
    }
  }

  /**
   * Get with fallback loader and cache-aside pattern.
   */
  async getOrLoad(
    key: string,
    loader: () => Promise<V>,
    ttlSeconds?: number
  ): Promise<V> {
    // Try cache first
    try {
      const cached = await this.get(key);
      if (cached !== null) {
        return cached;
      }
    } catch (error) {
      // Cache unavailable - continue to loader
      console.warn('Cache read failed, loading from source');
    }
    // Load from source
    const value = await loader();
    // Try to cache (fire and forget)
    this.set(key, value, ttlSeconds).catch((err) => {
      console.warn('Failed to cache loaded value:', err.message);
    });
    return value;
  }

  /**
   * Delete with pattern matching (Redis SCAN + DEL).
   */
  async invalidatePattern(pattern: string): Promise<number> {
    if (this.circuitOpen) return 0;
    let deleted = 0;
    const fullPattern = this.fullKey(pattern);
    // Use SCAN for non-blocking pattern search
    const stream = this.redis.scanStream({ match: fullPattern, count: 100 });
    return new Promise((resolve, reject) => {
      stream.on('data', (keys: string[]) => {
        if (keys.length === 0) return;
        // Pause the scan while the batch delete is in flight, so that
        // 'end' cannot fire before all deletions have completed
        stream.pause();
        const pipeline = this.redis.pipeline();
        keys.forEach((key) => pipeline.del(key));
        pipeline
          .exec()
          .then(() => {
            deleted += keys.length;
            stream.resume();
          })
          .catch(reject);
      });
      stream.on('end', () => resolve(deleted));
      stream.on('error', reject);
    });
  }

  /**
   * Bulk get for multiple keys.
   */
  async mget(keys: string[]): Promise<Map<string, V>> {
    if (this.circuitOpen || keys.length === 0) {
      return new Map();
    }
    const fullKeys = keys.map((k) => this.fullKey(k));
    const results = await this.redis.mget(...fullKeys);
    const map = new Map<string, V>();
    for (let i = 0; i < keys.length; i++) {
      if (results[i]) {
        map.set(keys[i], JSON.parse(results[i]!) as V);
      }
    }
    return map;
  }

  getHealthStatus(): CacheHealthStatus {
    return {
      connected: this.connected,
      circuitOpen: this.circuitOpen,
      recentFailures: this.failures,
      lastFailureAt:
        this.lastFailure > 0 ? new Date(this.lastFailure).toISOString() : null,
    };
  }

  async disconnect(): Promise<void> {
    await this.redis.quit();
  }
}

interface CacheHealthStatus {
  connected: boolean;
  circuitOpen: boolean;
  recentFailures: number;
  lastFailureAt: string | null;
}
```

Distributed caches require serialization (object → bytes) on write and deserialization (bytes → object) on read. This adds latency and CPU cost. For performance-critical paths, consider binary formats like MessagePack or Protocol Buffers instead of JSON.
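Serialization has a correctness cost, too, not just a performance one. A generic illustration (not tied to any particular cache client): JSON round-tripping, as the `DistributedCache` sketch does, silently degrades rich JavaScript types.

```typescript
// JSON round-trips silently degrade rich types: Dates become ISO strings,
// and Maps/Sets become empty objects, so what you read back from a
// distributed cache may not be structurally identical to what you wrote.
const original = {
  id: 42,
  createdAt: new Date(0),    // Date -> ISO string on stringify
  roles: new Set(['admin']), // Set -> {} on stringify
};

const roundTripped = JSON.parse(JSON.stringify(original));

console.log(typeof roundTripped.createdAt); // "string"
console.log(roundTripped.roles);            // {} (the Set's contents are lost)
```

Defensive options include storing only plain data-transfer objects in the cache, or rehydrating types explicitly after deserialization.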
Cache coherence is the problem of keeping cached data consistent across multiple caches. This is primarily a challenge with local caches, where each instance maintains independent state.
The Coherence Problem:
Imagine three application instances, each with its own local cache, and all three caching the same user's profile. Instance A then processes an update: it writes the new profile to the database and invalidates its own cached copy. Instances B and C never see that invalidation, so they keep serving the old profile. This inconsistency can persist until the cached data expires (minutes or hours, depending on TTL).
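The scenario can be reproduced in a few lines. This is a deliberately minimal simulation with hypothetical names: each "instance" keeps an independent `Map` cache over a shared store, and writes invalidate only the writer's own cache.

```typescript
// Shared backing store (stands in for the database)
const database = new Map<string, string>([['user:1:email', 'old@example.com']]);

class AppInstance {
  private local = new Map<string, string>();

  // Read-through: populate the local cache on miss
  read(key: string): string | undefined {
    if (!this.local.has(key)) {
      const value = database.get(key);
      if (value !== undefined) this.local.set(key, value);
    }
    return this.local.get(key);
  }

  // Writes update the database but invalidate only THIS instance's cache
  write(key: string, value: string): void {
    database.set(key, value);
    this.local.delete(key);
  }
}

const instanceA = new AppInstance();
const instanceB = new AppInstance();

instanceA.read('user:1:email'); // both instances warm their caches
instanceB.read('user:1:email');

instanceA.write('user:1:email', 'new@example.com');

console.log(instanceA.read('user:1:email')); // "new@example.com" (fresh)
console.log(instanceB.read('user:1:email')); // "old@example.com" (stale until TTL)
```

Instance B keeps returning the stale value because nothing tells it the key changed; that is exactly the gap the coherence strategies below address.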
Coherence Strategies:
TTL (Time-to-Live) coherence accepts eventual consistency by ensuring cached data expires within a bounded time. No active invalidation required.
When to Use: When temporary staleness is acceptable (product catalogs, configuration, non-critical content).
```typescript
/**
 * TTL-based coherence relies on natural expiration.
 * Staleness window = TTL duration.
 */
class TTLCoherentCache<V> {
  private cache: Map<string, { value: V; expiresAt: number }> = new Map();

  constructor(
    private readonly ttlMs: number,
    private readonly maxStalenessMs: number // Maximum acceptable staleness
  ) {
    // TTL should not exceed max staleness
    if (ttlMs > maxStalenessMs) {
      console.warn(`TTL (${ttlMs}ms) exceeds max staleness (${maxStalenessMs}ms)`);
    }
  }

  get(key: string): V | undefined {
    const entry = this.cache.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.cache.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.cache.set(key, {
      value,
      expiresAt: Date.now() + this.ttlMs,
    });
  }
}

// Coherence guarantee: Data is never staler than TTL
// Trade-off: Must balance staleness tolerance vs hit rate

// Shorter TTL = fresher data, lower hit rate
const realTimeCache = new TTLCoherentCache<User>(
  30 * 1000, // 30 second TTL
  60 * 1000  // 1 minute max staleness acceptable
);

// Longer TTL = higher hit rate, potentially staler data
const catalogCache = new TTLCoherentCache<Product>(
  5 * 60 * 1000, // 5 minute TTL
  10 * 60 * 1000 // 10 minutes max staleness acceptable
);
```

Multi-tier caching combines local and distributed caches to achieve both performance and consistency. This is the standard architecture for production systems at scale.
The L1/L2 Pattern:
Read path: Check L1 → if miss, check L2 → if miss, load from source → populate L2 → populate L1
```typescript
/**
 * Multi-tier (L1/L2) cache implementation.
 * L1: Local in-memory cache (fast, small)
 * L2: Distributed Redis cache (larger, shared)
 */
interface TieredCacheConfig {
  l1MaxItems: number;
  l1TtlMs: number;
  l2TtlSeconds: number;
  namespace: string;
}

class TieredCache<V> {
  private l1: LocalCache<string, V>;
  private l2: DistributedCache<V>;
  private config: TieredCacheConfig;

  // Metrics by tier
  private metrics = { l1Hits: 0, l2Hits: 0, misses: 0 };

  constructor(config: TieredCacheConfig, redisUrl: string) {
    this.config = config;
    this.l1 = new LocalCache({
      maxSize: config.l1MaxItems,
      ttlMs: config.l1TtlMs,
    });
    this.l2 = new DistributedCache({
      redisUrl,
      defaultTtlSeconds: config.l2TtlSeconds,
      prefix: config.namespace,
      connectionTimeout: 5000,
      maxRetries: 3,
      retryDelayMs: 100,
    });
  }

  /**
   * Multi-tier get with L1 → L2 → source fallback.
   */
  async get(key: string): Promise<V | undefined> {
    // Try L1 first (local, ultra-fast)
    let value = this.l1.get(key);
    if (value !== undefined) {
      this.metrics.l1Hits++;
      return value;
    }
    // Try L2 (distributed)
    try {
      value = (await this.l2.get(key)) ?? undefined;
      if (value !== undefined) {
        this.metrics.l2Hits++;
        // Promote to L1 for future accesses
        this.l1.set(key, value);
        return value;
      }
    } catch (error) {
      // L2 unavailable, treat as miss
      console.warn('L2 cache unavailable:', error);
    }
    this.metrics.misses++;
    return undefined;
  }

  /**
   * Get with loader - populates both tiers on miss.
   */
  async getOrLoad(key: string, loader: () => Promise<V>): Promise<V> {
    // Try caches first
    const cached = await this.get(key);
    if (cached !== undefined) {
      return cached;
    }
    // Load from source
    const value = await loader();
    // Populate both tiers
    await this.set(key, value);
    return value;
  }

  /**
   * Set in both tiers.
   */
  async set(key: string, value: V): Promise<void> {
    // Write to L1 (synchronous)
    this.l1.set(key, value);
    // Write to L2 (async, non-blocking)
    this.l2.set(key, value, this.config.l2TtlSeconds).catch((err) => {
      console.warn('L2 cache write failed:', err);
    });
  }

  /**
   * Invalidate from both tiers.
   */
  async invalidate(key: string): Promise<void> {
    // Invalidate L1 (local only)
    this.l1.invalidate(key);
    // Invalidate L2 (all instances will see this)
    try {
      await this.l2.invalidate(key);
    } catch (error) {
      console.warn('L2 cache invalidation failed:', error);
    }
    // Note: Other instances still have stale L1 entries.
    // Combine with Pub/Sub for cross-instance L1 invalidation.
  }

  /**
   * Get tiered cache metrics.
   */
  getMetrics(): TieredCacheMetrics {
    const total =
      this.metrics.l1Hits + this.metrics.l2Hits + this.metrics.misses;
    return {
      l1HitRate: total > 0 ? this.metrics.l1Hits / total : 0,
      l2HitRate: total > 0 ? this.metrics.l2Hits / total : 0,
      overallHitRate:
        total > 0 ? (this.metrics.l1Hits + this.metrics.l2Hits) / total : 0,
      l1Hits: this.metrics.l1Hits,
      l2Hits: this.metrics.l2Hits,
      misses: this.metrics.misses,
      l1Stats: this.l1.getMetrics(),
      l2Status: this.l2.getHealthStatus(),
    };
  }
}

interface TieredCacheMetrics {
  l1HitRate: number;
  l2HitRate: number;
  overallHitRate: number;
  l1Hits: number;
  l2Hits: number;
  misses: number;
  l1Stats: LocalCacheMetrics;
  l2Status: CacheHealthStatus;
}

// Production configuration example
const userCache = new TieredCache<User>(
  {
    l1MaxItems: 1000,     // Small L1 for hottest users
    l1TtlMs: 30 * 1000,   // 30 second L1 TTL
    l2TtlSeconds: 5 * 60, // 5 minute L2 TTL
    namespace: 'user-service:users',
  },
  process.env.REDIS_URL!
);
```

L1 TTL should be significantly shorter than L2 TTL. This ensures L1 refreshes from L2 regularly, maintaining coherence across instances. A common pattern: L1 = 30 seconds, L2 = 5-15 minutes.
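The cross-instance L1 invalidation that the `TieredCache` comment alludes to is usually built on Redis Pub/Sub. The sketch below shows the mechanism with a Node `EventEmitter` standing in for the Pub/Sub channel so it runs self-contained; in production you would use `redis.publish` on write and a dedicated subscriber connection per instance. All class and channel names here are illustrative.

```typescript
import { EventEmitter } from 'events';

// Stand-in for a Redis Pub/Sub channel shared by all instances
const invalidationBus = new EventEmitter();
const CHANNEL = 'cache-invalidation';

class PubSubCoherentL1<V> {
  private cache = new Map<string, V>();

  constructor(private readonly instanceId: string) {
    // Every instance drops its local copy when any peer invalidates a key
    invalidationBus.on(CHANNEL, (msg: { key: string; from: string }) => {
      if (msg.from !== this.instanceId) this.cache.delete(msg.key);
    });
  }

  get(key: string): V | undefined {
    return this.cache.get(key);
  }

  set(key: string, value: V): void {
    this.cache.set(key, value);
  }

  // Invalidate locally, then broadcast so peers invalidate too
  invalidate(key: string): void {
    this.cache.delete(key);
    invalidationBus.emit(CHANNEL, { key, from: this.instanceId });
  }
}

const a = new PubSubCoherentL1<string>('instance-a');
const b = new PubSubCoherentL1<string>('instance-b');
a.set('user:1', 'v1');
b.set('user:1', 'v1');

a.invalidate('user:1'); // broadcast reaches b without waiting for TTL

console.log(a.get('user:1')); // undefined
console.log(b.get('user:1')); // undefined (stale copy removed immediately)
```

Note that real Pub/Sub delivery is asynchronous and best-effort (a disconnected subscriber misses messages), so a short L1 TTL remains the safety net even with broadcast invalidation.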
Choosing between local, distributed, or multi-tier caching requires analyzing your specific requirements. Here's a decision framework:
| Requirement | Local Only | Distributed Only | Multi-Tier |
|---|---|---|---|
| Sub-millisecond latency critical | ✅ Best choice | ❌ Network latency | ✅ L1 handles hot path |
| Data must be consistent across instances | ❌ Coherence issues | ✅ Single source | ⚠️ Requires coherence strategy |
| Large dataset (GB+) | ❌ Memory limited | ✅ Scales separately | ✅ L2 holds large dataset |
| Survive instance restarts | ❌ Lost on restart | ✅ Persists externally | ✅ L2 persists |
| Zero external dependencies | ✅ Self-contained | ❌ Redis/Memcached needed | ❌ Requires distributed tier |
| Simple operational model | ✅ Embedded | ⚠️ Additional service | ⚠️ Most complex |
```typescript
/**
 * Cache topology decision helper.
 */
interface CacheRequirements {
  latencyBudgetMs: number;           // Maximum acceptable latency
  consistencyRequired: boolean;      // Must be consistent across instances
  datasetSizeGB: number;             // Total cacheable data size
  instanceCount: number;             // Number of application instances
  operationalComplexity: 'minimal' | 'moderate' | 'acceptable';
  stalenessToleranceSeconds: number; // How stale is okay
}

function recommendTopology(req: CacheRequirements): TopologyRecommendation {
  // Single instance = local cache is sufficient
  if (req.instanceCount === 1) {
    return {
      topology: 'local',
      rationale: 'Single instance - no coherence concerns',
      l1Config: {
        maxMB: Math.min(req.datasetSizeGB * 1000, 500),
        ttlMs: 300000,
      },
    };
  }

  // Ultra-low latency requirement
  if (req.latencyBudgetMs < 1) {
    if (req.consistencyRequired) {
      return {
        topology: 'multi-tier',
        rationale: 'Need speed of L1 with consistency from L2',
        l1Config: {
          maxMB: 100,
          ttlMs: Math.min(req.stalenessToleranceSeconds * 1000, 30000),
        },
        l2Config: { maxGB: req.datasetSizeGB, ttlSeconds: 300 },
        coherenceStrategy: 'pubsub',
      };
    }
    return {
      topology: 'local',
      rationale: 'Latency critical, consistency secondary',
      l1Config: { maxMB: 200, ttlMs: req.stalenessToleranceSeconds * 1000 },
      coherenceStrategy: 'ttl-only',
    };
  }

  // Strong consistency required
  if (req.consistencyRequired && req.stalenessToleranceSeconds < 5) {
    return {
      topology: 'distributed',
      rationale: 'Consistency critical - single source of truth',
      l2Config: { maxGB: req.datasetSizeGB, ttlSeconds: 300 },
    };
  }

  // Large dataset
  if (req.datasetSizeGB > 1) {
    return {
      topology: 'multi-tier',
      rationale: 'Dataset too large for local; L1 for hot subset',
      l1Config: { maxMB: 500, ttlMs: 60000 },
      l2Config: { maxGB: req.datasetSizeGB, ttlSeconds: 600 },
      coherenceStrategy:
        req.stalenessToleranceSeconds < 60 ? 'pubsub' : 'ttl-only',
    };
  }

  // Default: multi-tier for balance
  return {
    topology: 'multi-tier',
    rationale: 'Balanced approach for typical workload',
    l1Config: { maxMB: 100, ttlMs: 30000 },
    l2Config: { maxGB: 1, ttlSeconds: 300 },
    coherenceStrategy: 'pubsub',
  };
}

interface TopologyRecommendation {
  topology: 'local' | 'distributed' | 'multi-tier';
  rationale: string;
  l1Config?: { maxMB: number; ttlMs: number };
  l2Config?: { maxGB: number; ttlSeconds: number };
  coherenceStrategy?: 'ttl-only' | 'pubsub' | 'version-based';
}

// Example usage
const recommendation = recommendTopology({
  latencyBudgetMs: 5,
  consistencyRequired: true,
  datasetSizeGB: 2,
  instanceCount: 10,
  operationalComplexity: 'acceptable',
  stalenessToleranceSeconds: 30,
});

console.log(recommendation);
// {
//   topology: 'multi-tier',
//   rationale: 'Dataset too large for local; L1 for hot subset',
//   l1Config: { maxMB: 500, ttlMs: 60000 },
//   l2Config: { maxGB: 2, ttlSeconds: 600 },
//   coherenceStrategy: 'pubsub'
// }
```

For greenfield projects, start with distributed-only (Redis). Add a local L1 tier only when profiling shows network latency is a bottleneck. Multi-tier complexity is only justified when you have clear evidence of need.
Cache topology is an architectural decision with significant implications. To consolidate the key principles from this page:
1. Local caches deliver sub-microsecond reads but provide no cross-instance coherence; bound their size and lean on TTLs.
2. Distributed caches provide a shared source of truth at the cost of network latency, serialization overhead, and extra infrastructure.
3. Multi-tier (L1/L2) architectures combine both; keep L1 TTLs much shorter than L2 TTLs, and add Pub/Sub invalidation when staleness tolerance is tight.
4. Choose a topology from measured requirements (latency budget, consistency needs, dataset size, instance count) rather than by default.
What's Next:
With cache topology understood, we'll explore cache warming strategies in the final page of this module. You'll learn how to pre-populate caches to avoid cold-start penalties, implement background warming, and handle cache warming during deployments and scaling events.
You now understand the trade-offs between local and distributed caches, cache coherence challenges, and multi-tier architectures. These patterns enable you to design caching systems that achieve both performance and consistency goals appropriate to your requirements.