Browser and CDN caching operate largely outside your application code—you configure headers and let the infrastructure handle the rest. Application-level caching is different: it's where you, the developer, explicitly decide what to cache, where to store it, when to invalidate it, and how to handle cache misses.
This layer offers the greatest control and flexibility. You can cache computed results, database query outputs, external API responses, session data, and complex business objects. But with control comes responsibility—poorly designed application caches lead to stale data, memory exhaustion, inconsistencies between nodes, and debugging nightmares.
Mastering application-level caching means understanding the patterns, the tradeoffs, and the failure modes that separate robust production caches from fragile prototypes.
This page covers in-process caching with hash maps and LRU caches, memoization patterns for function results, caching library selection and configuration, cache key design strategies, handling cache population and invalidation, and multi-instance cache consistency challenges. By the end, you'll be able to implement robust application-level caching that accelerates your services without introducing subtle bugs.
The simplest and fastest application cache is an in-process cache—data stored in your application's memory, accessible with no network calls, serialization, or context switching. Access time is measured in nanoseconds rather than milliseconds.
When In-Process Caching Excels:
Basic In-Process Cache Implementation:
```typescript
// Simple in-memory cache with TTL support
interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

class InMemoryCache<T> {
  private cache: Map<string, CacheEntry<T>> = new Map();
  private defaultTtlMs: number;

  constructor(defaultTtlSeconds: number = 300) {
    this.defaultTtlMs = defaultTtlSeconds * 1000;
  }

  set(key: string, value: T, ttlSeconds?: number): void {
    const ttl = ttlSeconds ? ttlSeconds * 1000 : this.defaultTtlMs;
    this.cache.set(key, {
      value,
      expiresAt: Date.now() + ttl,
    });
  }

  get(key: string): T | undefined {
    const entry = this.cache.get(key);
    if (!entry) {
      return undefined;
    }
    if (Date.now() > entry.expiresAt) {
      this.cache.delete(key);
      return undefined;
    }
    return entry.value;
  }

  delete(key: string): boolean {
    return this.cache.delete(key);
  }

  clear(): void {
    this.cache.clear();
  }

  // Periodic cleanup of expired entries
  startCleanup(intervalMs: number = 60000): NodeJS.Timeout {
    return setInterval(() => {
      const now = Date.now();
      for (const [key, entry] of this.cache) {
        if (now > entry.expiresAt) {
          this.cache.delete(key);
        }
      }
    }, intervalMs);
  }
}

// Usage
const userCache = new InMemoryCache<User>(300); // 5-minute default TTL

async function getUser(userId: string): Promise<User | null> {
  // Check cache first
  const cached = userCache.get(`user:${userId}`);
  if (cached) {
    return cached;
  }

  // Cache miss - fetch from database
  const user = await db.users.findUnique({ where: { id: userId } });
  if (user) {
    userCache.set(`user:${userId}`, user);
  }
  return user;
}
```

The Memory Pressure Problem:
The critical limitation of in-process caches is memory consumption. Unlike external caches, in-process caches compete with your application for heap space. Unbounded caches lead to:
Solution: Bounded Caches with Eviction
Production-grade in-process caches enforce size limits and evict entries when full. The choice of eviction policy significantly impacts cache effectiveness.
| Policy | Description | Best For | Drawback |
|---|---|---|---|
| LRU (Least Recently Used) | Evict the entry not accessed longest | General purpose caching | Doesn't consider access frequency |
| LFU (Least Frequently Used) | Evict the entry accessed fewest times | Data with stable hot set | New items vulnerable; stale frequency counts |
| FIFO (First In First Out) | Evict the oldest entry | When age correlates with value | Ignores access patterns entirely |
| TTL (Time To Live) | Evict after fixed time | Data with known validity period | May evict still-valuable items |
| W-TinyLFU (Caffeine) | Combines recency and frequency | Near-optimal hit rates | More complex implementation |
| Random | Evict a random entry | Simple, unpredictable workloads | May evict hot items |
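To make the table's tradeoffs concrete, here is a minimal sketch (the `FIFOCache` class is illustrative, not from any library) showing FIFO's drawback: a hot key is evicted simply because it was inserted first.

```typescript
// Minimal FIFO cache: evicts by insertion order, ignoring access patterns.
class FIFOCache<K, V> {
  private map = new Map<K, V>();
  constructor(private capacity: number) {}

  get(key: K): V | undefined {
    return this.map.get(key); // Access does NOT refresh position
  }

  set(key: K, value: V): void {
    if (!this.map.has(key) && this.map.size >= this.capacity) {
      // Evict the oldest insertion (first key in Map iteration order)
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
    this.map.set(key, value);
  }

  has(key: K): boolean {
    return this.map.has(key);
  }
}

const fifo = new FIFOCache<string, number>(2);
fifo.set('a', 1);
fifo.set('b', 2);
fifo.get('a');    // 'a' is hot, but FIFO doesn't care
fifo.set('c', 3); // Evicts 'a' anyway, because it was inserted first

const fifoKeptA = fifo.has('a'); // false: the hot key was lost
const fifoKeptB = fifo.has('b'); // true
```

Under LRU, the same sequence would evict 'b' instead, because the `get` refreshed 'a'.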
Don't implement your own LRU cache in production unless you have specific requirements. Use battle-tested libraries: node-cache, lru-cache, or quick-lru for Node.js; Caffeine for Java; functools.lru_cache for Python. These handle edge cases and offer better performance than naive implementations.
LRU (Least Recently Used) is the most common eviction policy. Understanding its implementation illuminates both its power and its limitations.
The LRU Invariant:
An LRU cache maintains entries in access-recency order. The most recently accessed item is at the "front," and the least recently accessed is at the "back." When the cache is full, the item at the back is evicted.
Implementation Requirements:
- get(key) — O(1) lookup AND move to front
- set(key, value) — O(1) insert at front AND evict from back if full

The Classic Implementation: HashMap + Doubly Linked List
```typescript
// Classic LRU Cache: O(1) get and set operations
class LRUNode<K, V> {
  key: K;
  value: V;
  prev: LRUNode<K, V> | null = null;
  next: LRUNode<K, V> | null = null;

  constructor(key: K, value: V) {
    this.key = key;
    this.value = value;
  }
}

class LRUCache<K, V> {
  private capacity: number;
  private cache: Map<K, LRUNode<K, V>> = new Map();
  // Sentinel nodes simplify edge cases
  private head: LRUNode<K, V>; // Most recent
  private tail: LRUNode<K, V>; // Least recent

  constructor(capacity: number) {
    this.capacity = capacity;
    // Initialize sentinels
    this.head = new LRUNode<K, V>(null as any, null as any);
    this.tail = new LRUNode<K, V>(null as any, null as any);
    this.head.next = this.tail;
    this.tail.prev = this.head;
  }

  get(key: K): V | undefined {
    const node = this.cache.get(key);
    if (!node) {
      return undefined;
    }
    // Move to front (most recently used)
    this.moveToHead(node);
    return node.value;
  }

  set(key: K, value: V): void {
    const existing = this.cache.get(key);
    if (existing) {
      // Update existing entry
      existing.value = value;
      this.moveToHead(existing);
      return;
    }

    // New entry
    const node = new LRUNode(key, value);
    this.cache.set(key, node);
    this.addToHead(node);

    // Evict if over capacity
    if (this.cache.size > this.capacity) {
      const evicted = this.removeTail();
      if (evicted) {
        this.cache.delete(evicted.key);
        this.onEvict?.(evicted.key, evicted.value);
      }
    }
  }

  delete(key: K): boolean {
    const node = this.cache.get(key);
    if (!node) return false;
    this.removeNode(node);
    this.cache.delete(key);
    return true;
  }

  // Optional callback when items are evicted
  onEvict?: (key: K, value: V) => void;

  // Helper methods for doubly-linked list operations
  private addToHead(node: LRUNode<K, V>): void {
    node.prev = this.head;
    node.next = this.head.next;
    this.head.next!.prev = node;
    this.head.next = node;
  }

  private removeNode(node: LRUNode<K, V>): void {
    node.prev!.next = node.next;
    node.next!.prev = node.prev;
  }

  private moveToHead(node: LRUNode<K, V>): void {
    this.removeNode(node);
    this.addToHead(node);
  }

  private removeTail(): LRUNode<K, V> | null {
    const node = this.tail.prev;
    if (node === this.head) return null;
    this.removeNode(node!);
    return node;
  }

  get size(): number {
    return this.cache.size;
  }

  // Iterate from most to least recent
  *entries(): IterableIterator<[K, V]> {
    let node = this.head.next;
    while (node !== this.tail) {
      yield [node!.key, node!.value];
      node = node!.next;
    }
  }
}

// Usage with eviction callback for metrics
const cache = new LRUCache<string, User>(1000);
cache.onEvict = (key, value) => {
  metrics.increment('cache.evictions');
  console.log(`Evicted user ${key} from cache`);
};
```

Modern JavaScript Alternative: Map with Access Order
In modern JavaScript, Map maintains insertion order. By deleting and re-inserting on access, you get LRU behavior without a linked list:
```typescript
class SimpleLRU<K, V> {
  private cache: Map<K, V> = new Map();
  private maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get(key: K): V | undefined {
    const value = this.cache.get(key);
    if (value !== undefined) {
      // Move to end (most recent) by re-inserting
      this.cache.delete(key);
      this.cache.set(key, value);
    }
    return value;
  }

  set(key: K, value: V): void {
    // Delete first to move to end if exists
    this.cache.delete(key);
    this.cache.set(key, value);
    // Evict oldest (first) if over capacity
    if (this.cache.size > this.maxSize) {
      const oldest = this.cache.keys().next().value;
      this.cache.delete(oldest);
    }
  }
}
```
This is cleaner but the delete-then-insert pattern has overhead. For high-throughput scenarios, the explicit linked-list approach performs better.
Always instrument your caches: hit/miss counts, eviction counts, current size, and average entry age. Without metrics, you're guessing whether your cache is effective. A cache with 10% hit rate is wasting memory; a cache that never evicts might be undersized.
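As a sketch of the kind of instrumentation meant here (the `InstrumentedCache` name is illustrative, not from any library), a thin wrapper can count hits and misses around any Map-like cache:

```typescript
// Illustrative instrumentation wrapper: counts hits/misses around a plain Map.
class InstrumentedCache<K, V> {
  private cache = new Map<K, V>();
  hits = 0;
  misses = 0;

  get(key: K): V | undefined {
    if (this.cache.has(key)) {
      this.hits++;
      return this.cache.get(key);
    }
    this.misses++;
    return undefined;
  }

  set(key: K, value: V): void {
    this.cache.set(key, value);
  }

  get hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

const stats = new InstrumentedCache<string, string>();
stats.get('a');          // miss
stats.set('a', 'value');
stats.get('a');          // hit
stats.get('a');          // hit
// hitRate is now 2/3; export counters like these to your metrics system
```

In production you would emit these counters to your metrics backend rather than read them inline.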
Memoization is a specific application of caching: storing the results of function calls based on their arguments. When the function is called again with the same arguments, the cached result is returned instead of recomputing.
When to Memoize:
```typescript
// Basic memoization decorator
function memoize<T extends (...args: any[]) => any>(
  fn: T,
  options: {
    maxSize?: number;
    ttlMs?: number;
    keyFn?: (...args: Parameters<T>) => string;
  } = {}
): T {
  const { maxSize = 100, ttlMs, keyFn = (...args) => JSON.stringify(args) } = options;
  const cache = new Map<string, { value: ReturnType<T>; expiresAt?: number }>();
  const keyOrder: string[] = [];

  return ((...args: Parameters<T>): ReturnType<T> => {
    const key = keyFn(...args);
    const cached = cache.get(key);

    if (cached) {
      if (!ttlMs || Date.now() < cached.expiresAt!) {
        return cached.value;
      }
      // Expired - remove and continue
      cache.delete(key);
      keyOrder.splice(keyOrder.indexOf(key), 1);
    }

    const result = fn(...args);
    cache.set(key, {
      value: result,
      expiresAt: ttlMs ? Date.now() + ttlMs : undefined,
    });
    keyOrder.push(key);

    // FIFO eviction (hits do not refresh order; true LRU would move keys on access)
    while (keyOrder.length > maxSize) {
      const oldestKey = keyOrder.shift()!;
      cache.delete(oldestKey);
    }

    return result;
  }) as T;
}

// Usage: Memoizing expensive computation
const expensiveCalculation = memoize(
  (input: number): number => {
    // Simulate expensive work
    let result = 0;
    for (let i = 0; i < 10000000; i++) {
      result += Math.sin(input + i);
    }
    return result;
  },
  { maxSize: 50, ttlMs: 60000 }
);

// Usage: Memoizing API calls with custom key
interface ProductQuery {
  category: string;
  minPrice?: number;
  maxPrice?: number;
  sortBy?: string;
}

const fetchProducts = memoize(
  async (query: ProductQuery): Promise<Product[]> => {
    const response = await fetch(`/api/products?${new URLSearchParams(query as any)}`);
    return response.json();
  },
  {
    maxSize: 100,
    ttlMs: 300000, // 5 minutes
    // Custom key: normalize query to avoid cache misses for equivalent queries
    keyFn: (query) => {
      const normalized = {
        category: query.category,
        minPrice: query.minPrice ?? 0,
        maxPrice: query.maxPrice ?? Infinity,
        sortBy: query.sortBy ?? 'relevance',
      };
      return JSON.stringify(normalized);
    },
  }
);
```

Handling Async Functions:
Memoizing async functions requires care to avoid duplicate in-flight requests:
```typescript
// Memoize that handles concurrent async calls correctly
function memoizeAsync<T extends (...args: any[]) => Promise<any>>(
  fn: T,
  options: {
    maxSize?: number;
    ttlMs?: number;
    keyFn?: (...args: Parameters<T>) => string;
  } = {}
): T {
  const { maxSize = 100, ttlMs, keyFn = (...args) => JSON.stringify(args) } = options;

  // Store both resolved values AND in-flight promises
  const cache = new Map<string, {
    promise?: Promise<Awaited<ReturnType<T>>>;
    value?: Awaited<ReturnType<T>>;
    expiresAt?: number;
    state: 'pending' | 'resolved' | 'rejected';
  }>();

  return (async (...args: Parameters<T>): Promise<Awaited<ReturnType<T>>> => {
    const key = keyFn(...args);
    const cached = cache.get(key);

    if (cached) {
      // Check expiration
      if (ttlMs && cached.expiresAt && Date.now() > cached.expiresAt) {
        cache.delete(key);
      } else if (cached.state === 'resolved') {
        return cached.value!;
      } else if (cached.state === 'pending') {
        // Wait for existing in-flight request
        return cached.promise!;
      }
      // If rejected, allow retry
    }

    // Create new request
    const promise = fn(...args);
    cache.set(key, {
      promise,
      state: 'pending',
    });

    try {
      const value = await promise;
      cache.set(key, {
        value,
        state: 'resolved',
        expiresAt: ttlMs ? Date.now() + ttlMs : undefined,
      });

      // Evict oldest insertion if over capacity (FIFO, not true LRU)
      if (cache.size > maxSize) {
        const oldest = cache.keys().next().value;
        cache.delete(oldest);
      }

      return value;
    } catch (error) {
      cache.set(key, { state: 'rejected' });
      throw error;
    }
  }) as T;
}

// Now concurrent calls share the same request
const getUser = memoizeAsync(
  async (userId: string) => {
    console.log(`Fetching user ${userId}`); // Only logs once per userId
    const response = await fetch(`/api/users/${userId}`);
    return response.json();
  },
  { ttlMs: 60000 }
);

// These three concurrent calls result in ONE fetch:
await Promise.all([
  getUser('123'),
  getUser('123'),
  getUser('123'),
]);
```

Avoid memoizing functions with side effects (logging, analytics, mutations). The function won't execute on cache hits, causing missed side effects.
Also watch for memory leaks with unbounded memoization—always set maxSize limits in production.
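A quick illustration of the side-effect pitfall, using a hypothetical one-line memoizer (not the `memoize` helper above): the wrapped function body runs only on misses, so anything it logs or increments silently stops happening on hits.

```typescript
// Minimal memoizer for demonstration; single string argument only.
function memo1<R>(fn: (arg: string) => R): (arg: string) => R {
  const cache = new Map<string, R>();
  return (arg: string) => {
    if (!cache.has(arg)) {
      cache.set(arg, fn(arg)); // side effects inside fn run ONLY here
    }
    return cache.get(arg)!;
  };
}

let auditLogWrites = 0;
const lookup = memo1((id: string) => {
  auditLogWrites++; // side effect: pretend this is an audit-log entry
  return `record-for-${id}`;
});

lookup('42');
lookup('42');
lookup('42');
// Three calls, but the audit log was written only once
```

If the audit entry is required per call, it must live outside the memoized function.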
Production applications should use established caching libraries rather than custom implementations. These libraries handle edge cases, offer better performance, and provide important features out of the box.
Node.js Caching Libraries:
| Library | Key Features | Performance | Best For |
|---|---|---|---|
| node-cache | TTL, stats, callbacks, clone support | Good | Simple use cases, drop-in solution |
| lru-cache | LRU eviction, TTL, dispose callbacks | Excellent | Memory-sensitive applications |
| quick-lru | Minimal footprint, fast | Excellent | Simple LRU with minimal overhead |
| keyv | Multi-backend (memory, Redis, SQLite) | Good | When you might need external storage later |
| cacheable-request | HTTP caching layer | Good | Caching HTTP client requests |
| memoizee | Function memoization with options | Good | Extensive memoization features |
```typescript
import { LRUCache } from 'lru-cache';

// Configure LRU cache with comprehensive options
const userCache = new LRUCache<string, User>({
  // Maximum number of items
  max: 500,

  // Maximum size in bytes (optional, requires sizeCalculation)
  maxSize: 50 * 1024 * 1024, // 50 MB
  sizeCalculation: (value, key) => {
    return JSON.stringify(value).length + key.length;
  },

  // TTL in milliseconds
  ttl: 1000 * 60 * 5, // 5 minutes

  // Update age on get? (affects TTL behavior)
  updateAgeOnGet: false,
  updateAgeOnHas: false,

  // Called when items are evicted
  dispose: (value, key, reason) => {
    console.log(`Cache evicted: ${key}, reason: ${reason}`);
    // 'set' - replaced by new value
    // 'delete' - explicitly deleted
    // 'evict' - removed due to size/count limit
    // 'expire' - TTL expired
  },

  // If true, dispose() is not called when an entry is overwritten by set()
  noDisposeOnSet: false,

  // Allow stale items to be returned while fetching fresh
  allowStale: true,

  // Async fetch function for cache-aside pattern
  fetchMethod: async (key, staleValue, { options, signal }) => {
    const response = await fetch(`/api/users/${key}`, { signal });
    if (!response.ok) {
      // Return stale if fetch fails
      if (staleValue) return staleValue;
      throw new Error(`User ${key} not found`);
    }
    return response.json();
  },

  // Proactively delete expired entries instead of waiting for access
  ttlAutopurge: true,
});

// Basic operations
userCache.set('user:123', user);
const cached = userCache.get('user:123');
userCache.delete('user:123');

// Using fetch method (stale-while-revalidate built in)
const user = await userCache.fetch('user:123');

// Size stats (track hit/miss counts yourself; lru-cache does not record them)
console.log({
  size: userCache.size,
  calculatedSize: userCache.calculatedSize,
});
```

Java Caching Libraries:
For Java applications, Caffeine is the gold standard for in-process caching, offering near-optimal hit rates through its W-TinyLFU eviction policy:
```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import java.util.concurrent.TimeUnit;

Cache<String, User> cache = Caffeine.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(5, TimeUnit.MINUTES)
    .expireAfterAccess(2, TimeUnit.MINUTES)
    .recordStats()
    .removalListener((key, value, cause) ->
        System.out.println("Evicted: " + key + ", cause: " + cause))
    .build();

// Synchronous loading cache
LoadingCache<String, User> loadingCache = Caffeine.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(5, TimeUnit.MINUTES)
    .build(key -> fetchUserFromDatabase(key));

User user = loadingCache.get("user:123"); // Loads if missing
```
When caching variable-size objects, prefer size-based limits over count-based limits. 100 small objects and 100 large objects use vastly different memory. Libraries like lru-cache support sizeCalculation functions that account for actual memory usage.
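A rough sketch of the idea (the estimator below is an approximation: serialized length is a proxy, not exact heap usage). A `sizeCalculation`-style function lets the limit track bytes instead of entry counts:

```typescript
// Approximate the memory cost of a cache entry via its serialized length.
// This is a proxy for heap usage, not an exact measurement.
function approxEntrySize(key: string, value: unknown): number {
  return key.length + JSON.stringify(value).length;
}

// A count-based limit treats these two entries as equal cost;
// a size-based limit sees the ~100x difference.
const small = approxEntrySize('user:1', { name: 'Ada' });
const large = approxEntrySize('user:2', { bio: 'x'.repeat(2000) });
```

Passing a function like this as `sizeCalculation` makes `maxSize` a byte budget rather than an item count.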
Cache key design is deceptively important. Poor key design leads to cache misses for equivalent requests, key collisions serving wrong data, or unbounded key cardinality exploding memory usage.
Cache Key Principles:
```typescript
// BAD: Missing version - stale data after format changes
const badKey1 = `user:${userId}`;

// GOOD: Include version for schema changes
const goodKey1 = `v1:user:${userId}`;

// BAD: Object as key - different orderings produce different strings
const badKey2 = JSON.stringify({ category, sort, page });
// { category: 'A', sort: 'price' } !== { sort: 'price', category: 'A' }

// GOOD: Normalize key components
function normalizeKey(params: Record<string, any>): string {
  const sorted = Object.keys(params)
    .sort()
    .map(k => `${k}=${params[k]}`)
    .join('&');
  return sorted;
}

// BAD: Unbounded cardinality - caches for every unique search
const badKey3 = `search:${userQuery}`; // "red shoes" vs "Red shoes" vs "RED SHOES"

// GOOD: Normalize and limit cardinality
function searchCacheKey(query: string, page: number): string {
  const normalized = query.toLowerCase().trim().replace(/\s+/g, ' ');
  // Consider: truncate very long queries, or hash them
  if (normalized.length > 100) {
    return `search:${hash(normalized)}:page:${page}`;
  }
  return `search:${normalized}:page:${page}`;
}

// Key namespacing strategy for multi-tenant applications
interface CacheKeyBuilder {
  tenant: string;
  entity: string;
  id: string;
  version?: number;
}

function buildKey({ tenant, entity, id, version = 1 }: CacheKeyBuilder): string {
  return `v${version}:${tenant}:${entity}:${id}`;
}

// Examples:
buildKey({ tenant: 'acme', entity: 'user', id: '12345' });
// => "v1:acme:user:12345"

buildKey({ tenant: 'acme', entity: 'product', id: 'sku-789', version: 2 });
// => "v2:acme:product:sku-789"
```

| Anti-Pattern | Problem | Solution |
|---|---|---|
| Using raw user input | Cardinality explosion, cache pollution | Normalize, validate, truncate inputs |
| Including timestamps | Every request has unique key (0% hit rate) | Remove time-varying components |
| Excluding tenant/user context | Data leakage between users | Include isolation context in key |
| Object reference as key | Memory leak, always misses | Serialize to string |
| No version prefix | Stale data after schema changes | Include schema/version prefix |
| Random/UUID components | Each request is unique | Use deterministic identifiers only |
Include user/tenant identity in cache keys for any user-specific data. A bug that omits user context from the cache key can serve User A's data to User B—a serious security vulnerability. Prefer explicit namespace prefixes: 'user:123:profile' not just 'profile'.
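One defensive pattern, sketched here with hypothetical names: make the user context a required parameter of the key builder and reject empty values, so omitting it becomes a loud error rather than a silent data leak.

```typescript
// Key builder that refuses to produce a key without isolation context.
function userScopedKey(userId: string, resource: string): string {
  if (!userId) {
    throw new Error('cache key requires a user id - refusing to build a shared key');
  }
  return `user:${userId}:${resource}`;
}

const keyA = userScopedKey('123', 'profile'); // "user:123:profile"
const keyB = userScopedKey('456', 'profile'); // "user:456:profile"
// Different users can never collide on the same key
```

Routing all cache reads and writes through a builder like this makes the isolation rule enforceable in one place.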
How and when you populate your cache significantly impacts performance, consistency, and system complexity. The main patterns are cache-aside, read-through, write-through, write-behind, and cache warming.
```typescript
// Pattern 1: Cache-Aside (most common)
async function getUserCacheAside(userId: string): Promise<User> {
  const cacheKey = `user:${userId}`;

  // Step 1: Check cache
  const cached = await cache.get(cacheKey);
  if (cached) {
    return cached;
  }

  // Step 2: Fetch from source on miss
  const user = await database.users.findById(userId);

  // Step 3: Populate cache
  if (user) {
    await cache.set(cacheKey, user, { ttl: 300 });
  }

  return user;
}

// Pattern 2: Read-Through (via library support)
const userCache = new LRUCache<string, User>({
  max: 1000,
  ttl: 300000,
  // Library handles fetching on miss
  fetchMethod: async (userId) => {
    return database.users.findById(userId);
  },
});

async function getUserReadThrough(userId: string): Promise<User> {
  // Single call - cache handles miss logic
  return userCache.fetch(userId);
}

// Pattern 3: Cache Warming (Proactive Population)
async function warmUserCache(userIds: string[]): Promise<void> {
  console.log(`Warming cache for ${userIds.length} users`);

  // Batch fetch from database
  const users = await database.users.findMany({
    where: { id: { in: userIds } },
  });

  // Populate cache in parallel
  await Promise.all(
    users.map(user => cache.set(`user:${user.id}`, user, { ttl: 300 }))
  );

  console.log(`Cache warmed: ${users.length} users`);
}

// Warm on service startup for critical data
async function initializeService() {
  // Get IDs of recently active users
  const recentUserIds = await database.users.findMany({
    where: { lastActiveAt: { gte: new Date(Date.now() - 86400000) } },
    select: { id: true },
  });

  await warmUserCache(recentUserIds.map(u => u.id));
}

// Pattern 4: Preventing Thundering Herd with Coalescing
const inFlightRequests = new Map<string, Promise<any>>();

async function getUserWithCoalescing(userId: string): Promise<User> {
  const cacheKey = `user:${userId}`;

  const cached = await cache.get(cacheKey);
  if (cached) return cached;

  // Check if request already in flight
  const inFlight = inFlightRequests.get(cacheKey);
  if (inFlight) {
    return inFlight; // Wait for existing request
  }

  // New request - track it
  const request = (async () => {
    try {
      const user = await database.users.findById(userId);
      if (user) {
        await cache.set(cacheKey, user, { ttl: 300 });
      }
      return user;
    } finally {
      inFlightRequests.delete(cacheKey);
    }
  })();

  inFlightRequests.set(cacheKey, request);
  return request;
}
```

When a hot cache entry expires, all concurrent requests may simultaneously try to refresh it, overwhelming the data source. Request coalescing ensures only one request goes through while others wait. Libraries like lru-cache with fetchMethod handle this automatically.
Application-level cache invalidation is where "there are only two hard things in computer science" becomes painfully real. The fundamental challenge: keeping cached data consistent with the source of truth.
Invalidation Strategies:
| Strategy | Mechanism | Consistency | Complexity | Best For |
|---|---|---|---|---|
| TTL-based | Entries expire after fixed time | Eventually consistent | Low | Read-heavy data that can be stale |
| Event-based | Invalidate on writes/updates | Strong (if no bugs) | Medium | Critical data requiring freshness |
| Version-based | New version = new cache key | Strong | Low | Immutable/versioned content |
| Tag-based | Group invalidation by tag | Strong (if correct tags) | Medium | Related data that updates together |
| Hybrid | TTL + event-based fallback | Best effort strong | High | Critical data with TTL safety net |
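The version-based row deserves a concrete sketch (illustrative code, not from the text): instead of deleting entries, bump a version counter that is part of every key, so old entries simply stop being referenced and age out via TTL or eviction.

```typescript
// Version-based invalidation: bumping the version makes old keys unreachable.
const versions = new Map<string, number>();  // entity -> current version
const store = new Map<string, string>();     // versioned key -> cached value

function versionedKey(entity: string, id: string): string {
  const v = versions.get(entity) ?? 1;
  return `v${v}:${entity}:${id}`;
}

function invalidateEntity(entity: string): void {
  versions.set(entity, (versions.get(entity) ?? 1) + 1);
  // Old-version entries remain in the store until TTL/eviction removes them
}

store.set(versionedKey('user', '123'), 'profile-v1');
const before = store.get(versionedKey('user', '123')); // hit on v1 key

invalidateEntity('user');
const after = store.get(versionedKey('user', '123'));  // miss: key is now v2
```

This trades extra memory (orphaned old-version entries) for invalidation that is a single counter write with no key enumeration.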
```typescript
// Event-based invalidation with domain events
class UserService {
  async updateUser(userId: string, updates: Partial<User>): Promise<User> {
    // Update database
    const user = await database.users.update({
      where: { id: userId },
      data: updates,
    });

    // Invalidate cache entries
    await this.invalidateUserCache(userId);

    // Publish event for other services
    await eventBus.publish('user.updated', { userId, changes: updates });

    return user;
  }

  private async invalidateUserCache(userId: string): Promise<void> {
    // Invalidate direct user cache
    await cache.delete(`user:${userId}`);

    // Invalidate derived/related caches
    await cache.delete(`user:${userId}:profile`);
    await cache.delete(`user:${userId}:permissions`);

    // Invalidate aggregate caches that include this user
    // This is where it gets complex...
    const userTeams = await database.teamMemberships.findMany({
      where: { userId },
      select: { teamId: true },
    });

    for (const { teamId } of userTeams) {
      await cache.delete(`team:${teamId}:members`);
    }
  }
}

// Tag-based invalidation for related data
class TaggedCache {
  private cache: Map<string, any> = new Map();
  private tagIndex: Map<string, Set<string>> = new Map();

  set(key: string, value: any, tags: string[]): void {
    this.cache.set(key, value);

    // Index by tags
    for (const tag of tags) {
      if (!this.tagIndex.has(tag)) {
        this.tagIndex.set(tag, new Set());
      }
      this.tagIndex.get(tag)!.add(key);
    }
  }

  invalidateByTag(tag: string): number {
    const keys = this.tagIndex.get(tag);
    if (!keys) return 0;

    let count = 0;
    for (const key of keys) {
      if (this.cache.delete(key)) {
        count++;
      }
    }
    this.tagIndex.delete(tag);
    return count;
  }
}

// Usage
const cache = new TaggedCache();

cache.set('product:123', product, ['product:123', 'category:electronics', 'featured']);
cache.set('product:456', product2, ['product:456', 'category:electronics']);

// When electronics category changes
cache.invalidateByTag('category:electronics'); // Clears both products
```

The Cascade Problem:
One of the hardest invalidation challenges is cascading invalidation. User data appears in:
Updating a user requires invalidating all these caches. Missing any leaves stale data.
Strategies for Cascade Invalidation:
A subtle bug: Read-Modify-Write races. Thread A reads stale from cache, Thread B updates DB and invalidates cache, Thread A writes stale to cache. Now cache has stale data. Solutions: write-through (update cache atomically with DB), or add cache entry version checks.
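The version-check mitigation can be sketched as follows (illustrative names, not a library API): attach a monotonically increasing version to each entry and reject writes carrying an older version than what the cache already holds.

```typescript
// Versioned cache entry: a write is applied only if its version is newer.
interface Versioned<T> { value: T; version: number; }

class VersionCheckedCache<T> {
  private entries = new Map<string, Versioned<T>>();

  // Returns true if the write was applied, false if it was stale.
  setIfNewer(key: string, value: T, version: number): boolean {
    const current = this.entries.get(key);
    if (current && current.version >= version) {
      return false; // stale write from a slow reader - drop it
    }
    this.entries.set(key, { value, version });
    return true;
  }

  get(key: string): T | undefined {
    return this.entries.get(key)?.value;
  }
}

const vc = new VersionCheckedCache<string>();
vc.setIfNewer('user:1', 'name=new', 2);                 // writer B: fresh DB state
const applied = vc.setIfNewer('user:1', 'name=old', 1); // reader A: stale, rejected
```

The version typically comes from the database row itself (an `updated_at` timestamp or row version column), so the cache can never move backwards.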
In-process caches are local to each application instance. When running multiple instances behind a load balancer, each instance has its own cache, leading to potential inconsistencies.
The Problem Illustrated:
┌─────────────────┐
│ Load Balancer │
└────────┬────────┘
│
┌─────────────┼─────────────┐
▼ ▼ ▼
Instance A Instance B Instance C
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Cache A │ │ Cache B │ │ Cache C │
│user:123 │ │user:123 │ │user:123 │
│ (v1) │ │ (v2) │ │ (stale) │
└─────────┘ └─────────┘ └─────────┘
│ │ │
└─────────────┴─────────────┘
│
▼
┌──────────┐
│ Database │
│ (v2) │
└──────────┘
Each instance may have different versions cached!
| Strategy | How It Works | Tradeoffs |
|---|---|---|
| Accept inconsistency | Use TTLs; instances eventually converge | Simple but unpredictable user experience |
| Shared external cache | Replace in-process with Redis/Memcached | Network overhead but consistent |
| Two-tier caching | In-process (fast) + shared (consistent) | Complex but optimal performance |
| Cache invalidation broadcast | Publish invalidation events to all instances | Requires messaging infrastructure |
| Sticky sessions | Route user to same instance | Defeats load balancing benefits |
```typescript
// Cache invalidation broadcast using Redis Pub/Sub
import Redis from 'ioredis';
import { LRUCache } from 'lru-cache';
import crypto from 'node:crypto';

class DistributedCache<T> {
  private localCache: LRUCache<string, T>;
  private redis: Redis;
  private subscriber: Redis;
  private instanceId: string;
  private channel = 'cache:invalidation';

  constructor(redisUrl: string, maxSize: number = 1000) {
    this.instanceId = crypto.randomUUID();
    this.localCache = new LRUCache({ max: maxSize, ttl: 300000 });
    this.redis = new Redis(redisUrl);
    this.subscriber = new Redis(redisUrl);
    this.setupSubscription();
  }

  private async setupSubscription(): Promise<void> {
    await this.subscriber.subscribe(this.channel);

    this.subscriber.on('message', (channel, message) => {
      if (channel !== this.channel) return;

      const { senderId, pattern, keys } = JSON.parse(message);

      // Ignore our own messages
      if (senderId === this.instanceId) return;

      // Process invalidation
      if (pattern) {
        this.invalidatePattern(pattern);
      } else if (keys) {
        for (const key of keys) {
          this.localCache.delete(key);
        }
      }
    });
  }

  async get(key: string): Promise<T | undefined> {
    // Check local cache first
    const local = this.localCache.get(key);
    if (local !== undefined) {
      return local;
    }
    // Fall through to Redis or database
    return undefined;
  }

  async set(key: string, value: T): Promise<void> {
    this.localCache.set(key, value);
  }

  async invalidate(keys: string[]): Promise<void> {
    // Invalidate locally
    for (const key of keys) {
      this.localCache.delete(key);
    }

    // Broadcast to other instances
    await this.redis.publish(this.channel, JSON.stringify({
      senderId: this.instanceId,
      keys,
    }));
  }

  async invalidateByPattern(pattern: string): Promise<void> {
    // Invalidate matching keys locally
    this.invalidatePattern(pattern);

    // Broadcast pattern invalidation
    await this.redis.publish(this.channel, JSON.stringify({
      senderId: this.instanceId,
      pattern,
    }));
  }

  private invalidatePattern(pattern: string): void {
    const regex = new RegExp(pattern.replace(/\*/g, '.*'));
    for (const key of this.localCache.keys()) {
      if (regex.test(key)) {
        this.localCache.delete(key);
      }
    }
  }
}

// Usage
const cache = new DistributedCache<User>('redis://localhost:6379');

// When user is updated on any instance
async function updateUser(userId: string, data: Partial<User>) {
  await database.users.update(userId, data);
  await cache.invalidate([`user:${userId}`, `user:${userId}:profile`]);
  // All instances now have invalidated this key
}
```

The optimal pattern for high-performance systems: fast in-process cache (nanosecond access) backed by shared external cache (sub-millisecond access). Check local first, then Redis, then database. This gives you speed AND consistency with appropriate TTLs and invalidation broadcast.
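That lookup order can be compressed into a small sketch, with plain Maps standing in for the Redis tier and the database so the flow is testable without infrastructure:

```typescript
// Two-tier read path: local (fastest) -> shared (consistent) -> source of truth.
const localTier = new Map<string, string>();   // in-process cache
const sharedTier = new Map<string, string>();  // stand-in for Redis
const sourceOfTruth = new Map<string, string>([['user:1', 'Ada']]); // stand-in for the DB

function twoTierGet(key: string): string | undefined {
  const local = localTier.get(key);
  if (local !== undefined) return local;  // tier 1 hit: nanoseconds

  const shared = sharedTier.get(key);
  if (shared !== undefined) {
    localTier.set(key, shared);           // promote to local tier
    return shared;
  }

  const value = sourceOfTruth.get(key);   // tier 3: the database
  if (value !== undefined) {
    sharedTier.set(key, value);           // populate both tiers on the way back
    localTier.set(key, value);
  }
  return value;
}

const first = twoTierGet('user:1');  // falls through to the source
const second = twoTierGet('user:1'); // now served from the local tier
```

In a real deployment the invalidation broadcast shown above would clear the local tier on every instance, while the shared tier holds the single consistent copy.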
Application-level caching provides the most control over what gets cached and when. In-process caches offer nanosecond access times, memoization elegantly handles repeated computations, and careful cache key design prevents subtle bugs. However, this control comes with responsibility for memory management, invalidation correctness, and multi-instance consistency.
What's Next:
Moving closer to the data layer, the next page explores Database Query Caching—where we cache at the query level, leveraging database-specific optimizations, query result caching, and the tradeoffs between caching in the application versus the database layer.
You now understand application-level caching—from in-process LRU caches to memoization patterns, cache key design, population strategies, invalidation challenges, and multi-instance consistency. You can implement robust caching that accelerates your services without introducing subtle data consistency bugs.