Browser and CDN caching operate largely outside your application code—you configure headers and let the infrastructure handle the rest. Application-level caching is different: it's where you, the developer, explicitly decide what to cache, where to store it, when to invalidate it, and how to handle cache misses.
This layer offers the greatest control and flexibility. You can cache computed results, database query outputs, external API responses, session data, and complex business objects. But with control comes responsibility—poorly designed application caches lead to stale data, memory exhaustion, inconsistencies between nodes, and debugging nightmares.
Mastering application-level caching means understanding the patterns, the tradeoffs, and the failure modes that separate robust production caches from fragile prototypes.
This page covers in-process caching with hash maps and LRU caches, memoization patterns for function results, caching library selection and configuration, cache key design strategies, handling cache population and invalidation, and multi-instance cache consistency challenges. By the end, you'll be able to implement robust application-level caching that accelerates your services without introducing subtle bugs.
The simplest and fastest application cache is an in-process cache—data stored in your application's memory, accessible with no network calls, serialization, or context switching. Access time is measured in nanoseconds rather than milliseconds.
When In-Process Caching Excels:
Basic In-Process Cache Implementation:
```typescript
// Simple in-memory cache with TTL support
interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

class InMemoryCache<T> {
  private cache: Map<string, CacheEntry<T>> = new Map();
  private defaultTtlMs: number;

  constructor(defaultTtlSeconds: number = 300) {
    this.defaultTtlMs = defaultTtlSeconds * 1000;
  }

  set(key: string, value: T, ttlSeconds?: number): void {
    const ttl = ttlSeconds ? ttlSeconds * 1000 : this.defaultTtlMs;
    this.cache.set(key, {
      value,
      expiresAt: Date.now() + ttl,
    });
  }

  get(key: string): T | undefined {
    const entry = this.cache.get(key);
    if (!entry) {
      return undefined;
    }
    if (Date.now() > entry.expiresAt) {
      this.cache.delete(key);
      return undefined;
    }
    return entry.value;
  }

  delete(key: string): boolean {
    return this.cache.delete(key);
  }

  clear(): void {
    this.cache.clear();
  }

  // Periodic cleanup of expired entries
  startCleanup(intervalMs: number = 60000): NodeJS.Timeout {
    return setInterval(() => {
      const now = Date.now();
      for (const [key, entry] of this.cache) {
        if (now > entry.expiresAt) {
          this.cache.delete(key);
        }
      }
    }, intervalMs);
  }
}

// Usage
const userCache = new InMemoryCache<User>(300); // 5-minute default TTL

async function getUser(userId: string): Promise<User | null> {
  // Check cache first
  const cached = userCache.get(`user:${userId}`);
  if (cached) {
    return cached;
  }

  // Cache miss - fetch from database
  const user = await db.users.findUnique({ where: { id: userId } });
  if (user) {
    userCache.set(`user:${userId}`, user);
  }
  return user;
}
```

The Memory Pressure Problem:
The critical limitation of in-process caches is memory consumption. Unlike external caches, in-process caches compete with your application for heap space. Unbounded caches lead to:
Solution: Bounded Caches with Eviction
Production-grade in-process caches enforce size limits and evict entries when full. The choice of eviction policy significantly impacts cache effectiveness.
| Policy | Description | Best For | Drawback |
|---|---|---|---|
| LRU (Least Recently Used) | Evict the entry not accessed longest | General purpose caching | Doesn't consider access frequency |
| LFU (Least Frequently Used) | Evict the entry accessed fewest times | Data with stable hot set | New items vulnerable; stale frequency counts |
| FIFO (First In First Out) | Evict the oldest entry | When age correlates with value | Ignores access patterns entirely |
| TTL (Time To Live) | Evict after fixed time | Data with known validity period | May evict still-valuable items |
| W-TinyLFU (Caffeine) | Combines recency and frequency | Near-optimal hit rates | More complex implementation |
| Random | Evict a random entry | Simple, unpredictable workloads | May evict hot items |
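To make the table's tradeoffs concrete, here is a minimal sketch (the `FIFOCache` class is illustrative, not from any library) showing FIFO's drawback: a hot key is evicted simply because it was inserted first.

```typescript
// Minimal FIFO cache: evicts by insertion order, ignoring access patterns.
class FIFOCache<K, V> {
  private map = new Map<K, V>();
  constructor(private capacity: number) {}

  get(key: K): V | undefined {
    return this.map.get(key); // Access does NOT refresh position
  }

  set(key: K, value: V): void {
    if (!this.map.has(key) && this.map.size >= this.capacity) {
      // Evict the oldest insertion (first key in Map iteration order)
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
    this.map.set(key, value);
  }

  has(key: K): boolean {
    return this.map.has(key);
  }
}

const fifo = new FIFOCache<string, number>(2);
fifo.set('a', 1);
fifo.set('b', 2);
fifo.get('a');    // 'a' is hot, but FIFO doesn't care
fifo.set('c', 3); // Evicts 'a' anyway, because it was inserted first

const fifoKeptA = fifo.has('a'); // false: the hot key was lost
const fifoKeptB = fifo.has('b'); // true
```

Under LRU, the same sequence would evict 'b' instead, because the `get` refreshed 'a'.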
Don't implement your own LRU cache in production unless you have specific requirements. Use battle-tested libraries: node-cache, lru-cache, or quick-lru for Node.js; Caffeine for Java; functools.lru_cache for Python. These handle edge cases and offer better performance than naive implementations.
LRU (Least Recently Used) is the most common eviction policy. Understanding its implementation illuminates both its power and its limitations.
The LRU Invariant:
An LRU cache maintains entries in access-recency order. The most recently accessed item is at the "front," and the least recently accessed is at the "back." When the cache is full, the item at the back is evicted.
Implementation Requirements:
- get(key) — O(1) lookup AND move to front
- set(key, value) — O(1) insert at front AND evict from back if full

The Classic Implementation: HashMap + Doubly Linked List
```typescript
// Classic LRU Cache: O(1) get and set operations
class LRUNode<K, V> {
  key: K;
  value: V;
  prev: LRUNode<K, V> | null = null;
  next: LRUNode<K, V> | null = null;

  constructor(key: K, value: V) {
    this.key = key;
    this.value = value;
  }
}

class LRUCache<K, V> {
  private capacity: number;
  private cache: Map<K, LRUNode<K, V>> = new Map();
  // Sentinel nodes simplify edge cases
  private head: LRUNode<K, V>; // Most recent
  private tail: LRUNode<K, V>; // Least recent

  constructor(capacity: number) {
    this.capacity = capacity;
    // Initialize sentinels
    this.head = new LRUNode<K, V>(null as any, null as any);
    this.tail = new LRUNode<K, V>(null as any, null as any);
    this.head.next = this.tail;
    this.tail.prev = this.head;
  }

  get(key: K): V | undefined {
    const node = this.cache.get(key);
    if (!node) {
      return undefined;
    }
    // Move to front (most recently used)
    this.moveToHead(node);
    return node.value;
  }

  set(key: K, value: V): void {
    const existing = this.cache.get(key);
    if (existing) {
      // Update existing entry
      existing.value = value;
      this.moveToHead(existing);
      return;
    }

    // New entry
    const node = new LRUNode(key, value);
    this.cache.set(key, node);
    this.addToHead(node);

    // Evict if over capacity
    if (this.cache.size > this.capacity) {
      const evicted = this.removeTail();
      if (evicted) {
        this.cache.delete(evicted.key);
        this.onEvict?.(evicted.key, evicted.value);
      }
    }
  }

  delete(key: K): boolean {
    const node = this.cache.get(key);
    if (!node) return false;
    this.removeNode(node);
    this.cache.delete(key);
    return true;
  }

  // Optional callback when items are evicted
  onEvict?: (key: K, value: V) => void;

  // Helper methods for doubly-linked list operations
  private addToHead(node: LRUNode<K, V>): void {
    node.prev = this.head;
    node.next = this.head.next;
    this.head.next!.prev = node;
    this.head.next = node;
  }

  private removeNode(node: LRUNode<K, V>): void {
    node.prev!.next = node.next;
    node.next!.prev = node.prev;
  }

  private moveToHead(node: LRUNode<K, V>): void {
    this.removeNode(node);
    this.addToHead(node);
  }

  private removeTail(): LRUNode<K, V> | null {
    const node = this.tail.prev;
    if (node === this.head) return null;
    this.removeNode(node!);
    return node;
  }

  get size(): number {
    return this.cache.size;
  }

  // Iterate from most to least recent
  *entries(): IterableIterator<[K, V]> {
    let node = this.head.next;
    while (node !== this.tail) {
      yield [node!.key, node!.value];
      node = node!.next;
    }
  }
}

// Usage with eviction callback for metrics
const cache = new LRUCache<string, User>(1000);
cache.onEvict = (key, value) => {
  metrics.increment('cache.evictions');
  console.log(`Evicted user ${key} from cache`);
};
```

Modern JavaScript Alternative: Map with Access Order
In modern JavaScript, Map maintains insertion order. By deleting and re-inserting on access, you get LRU behavior without a linked list:
```typescript
class SimpleLRU<K, V> {
  private cache: Map<K, V> = new Map();
  private maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get(key: K): V | undefined {
    const value = this.cache.get(key);
    if (value !== undefined) {
      // Move to end (most recent) by re-inserting
      this.cache.delete(key);
      this.cache.set(key, value);
    }
    return value;
  }

  set(key: K, value: V): void {
    // Delete first to move to end if exists
    this.cache.delete(key);
    this.cache.set(key, value);
    // Evict oldest (first) if over capacity
    if (this.cache.size > this.maxSize) {
      const oldest = this.cache.keys().next().value;
      this.cache.delete(oldest);
    }
  }
}
```
This is cleaner but the delete-then-insert pattern has overhead. For high-throughput scenarios, the explicit linked-list approach performs better.
Always instrument your caches: hit/miss counts, eviction counts, current size, and average entry age. Without metrics, you're guessing whether your cache is effective. A cache with 10% hit rate is wasting memory; a cache that never evicts might be undersized.
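As a sketch of the kind of instrumentation meant here (the `InstrumentedCache` name is illustrative, not from any library), a thin wrapper can count hits and misses around any Map-like cache:

```typescript
// Illustrative instrumentation wrapper: counts hits/misses around a plain Map.
class InstrumentedCache<K, V> {
  private cache = new Map<K, V>();
  hits = 0;
  misses = 0;

  get(key: K): V | undefined {
    if (this.cache.has(key)) {
      this.hits++;
      return this.cache.get(key);
    }
    this.misses++;
    return undefined;
  }

  set(key: K, value: V): void {
    this.cache.set(key, value);
  }

  get hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

const stats = new InstrumentedCache<string, string>();
stats.get('a');          // miss
stats.set('a', 'value');
stats.get('a');          // hit
stats.get('a');          // hit
// hitRate is now 2/3; export counters like these to your metrics system
```

In production you would emit these counters to your metrics backend rather than read them inline.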
Memoization is a specific application of caching: storing the results of function calls based on their arguments. When the function is called again with the same arguments, the cached result is returned instead of recomputing.
When to Memoize:
```typescript
// Basic memoization decorator
function memoize<T extends (...args: any[]) => any>(
  fn: T,
  options: {
    maxSize?: number;
    ttlMs?: number;
    keyFn?: (...args: Parameters<T>) => string;
  } = {}
): T {
  const { maxSize = 100, ttlMs, keyFn = (...args) => JSON.stringify(args) } = options;
  const cache = new Map<string, { value: ReturnType<T>; expiresAt?: number }>();
  const keyOrder: string[] = [];

  return ((...args: Parameters<T>): ReturnType<T> => {
    const key = keyFn(...args);
    const cached = cache.get(key);

    if (cached) {
      if (!ttlMs || Date.now() < cached.expiresAt!) {
        return cached.value;
      }
      // Expired - remove and continue
      cache.delete(key);
      keyOrder.splice(keyOrder.indexOf(key), 1);
    }

    const result = fn(...args);
    cache.set(key, {
      value: result,
      expiresAt: ttlMs ? Date.now() + ttlMs : undefined,
    });
    keyOrder.push(key);

    // FIFO eviction (hits do not refresh order; true LRU would move keys on access)
    while (keyOrder.length > maxSize) {
      const oldestKey = keyOrder.shift()!;
      cache.delete(oldestKey);
    }

    return result;
  }) as T;
}

// Usage: Memoizing expensive computation
const expensiveCalculation = memoize(
  (input: number): number => {
    // Simulate expensive work
    let result = 0;
    for (let i = 0; i < 10000000; i++) {
      result += Math.sin(input + i);
    }
    return result;
  },
  { maxSize: 50, ttlMs: 60000 }
);

// Usage: Memoizing API calls with custom key
interface ProductQuery {
  category: string;
  minPrice?: number;
  maxPrice?: number;
  sortBy?: string;
}

const fetchProducts = memoize(
  async (query: ProductQuery): Promise<Product[]> => {
    const response = await fetch(`/api/products?${new URLSearchParams(query as any)}`);
    return response.json();
  },
  {
    maxSize: 100,
    ttlMs: 300000, // 5 minutes
    // Custom key: normalize query to avoid cache misses for equivalent queries
    keyFn: (query) => {
      const normalized = {
        category: query.category,
        minPrice: query.minPrice ?? 0,
        maxPrice: query.maxPrice ?? Infinity,
        sortBy: query.sortBy ?? 'relevance',
      };
      return JSON.stringify(normalized);
    },
  }
);
```

Handling Async Functions:
Memoizing async functions requires care to avoid duplicate in-flight requests:
```typescript
// Memoize that handles concurrent async calls correctly
function memoizeAsync<T extends (...args: any[]) => Promise<any>>(
  fn: T,
  options: {
    maxSize?: number;
    ttlMs?: number;
    keyFn?: (...args: Parameters<T>) => string;
  } = {}
): T {
  const { maxSize = 100, ttlMs, keyFn = (...args) => JSON.stringify(args) } = options;

  // Store both resolved values AND in-flight promises
  const cache = new Map<string, {
    promise?: Promise<Awaited<ReturnType<T>>>;
    value?: Awaited<ReturnType<T>>;
    expiresAt?: number;
    state: 'pending' | 'resolved' | 'rejected';
  }>();

  return (async (...args: Parameters<T>): Promise<Awaited<ReturnType<T>>> => {
    const key = keyFn(...args);
    const cached = cache.get(key);

    if (cached) {
      // Check expiration
      if (ttlMs && cached.expiresAt && Date.now() > cached.expiresAt) {
        cache.delete(key);
      } else if (cached.state === 'resolved') {
        return cached.value!;
      } else if (cached.state === 'pending') {
        // Wait for existing in-flight request
        return cached.promise!;
      }
      // If rejected, allow retry
    }

    // Create new request
    const promise = fn(...args);
    cache.set(key, {
      promise,
      state: 'pending',
    });

    try {
      const value = await promise;
      cache.set(key, {
        value,
        state: 'resolved',
        expiresAt: ttlMs ? Date.now() + ttlMs : undefined,
      });

      // Evict oldest insertion if over capacity (FIFO, not true LRU)
      if (cache.size > maxSize) {
        const oldest = cache.keys().next().value;
        cache.delete(oldest);
      }

      return value;
    } catch (error) {
      cache.set(key, { state: 'rejected' });
      throw error;
    }
  }) as T;
}

// Now concurrent calls share the same request
const getUser = memoizeAsync(
  async (userId: string) => {
    console.log(`Fetching user ${userId}`); // Only logs once per userId
    const response = await fetch(`/api/users/${userId}`);
    return response.json();
  },
  { ttlMs: 60000 }
);

// These three concurrent calls result in ONE fetch:
await Promise.all([
  getUser('123'),
  getUser('123'),
  getUser('123'),
]);
```

Avoid memoizing functions with side effects (logging, analytics, mutations). The function won't execute on cache hits, causing missed side effects.
Also watch for memory leaks with unbounded memoization—always set maxSize limits in production.
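A quick illustration of the side-effect pitfall, using a hypothetical one-line memoizer (not the `memoize` helper above): the wrapped function body runs only on misses, so anything it logs or increments silently stops happening on hits.

```typescript
// Minimal memoizer for demonstration; single string argument only.
function memo1<R>(fn: (arg: string) => R): (arg: string) => R {
  const cache = new Map<string, R>();
  return (arg: string) => {
    if (!cache.has(arg)) {
      cache.set(arg, fn(arg)); // side effects inside fn run ONLY here
    }
    return cache.get(arg)!;
  };
}

let auditLogWrites = 0;
const lookup = memo1((id: string) => {
  auditLogWrites++; // side effect: pretend this is an audit-log entry
  return `record-for-${id}`;
});

lookup('42');
lookup('42');
lookup('42');
// Three calls, but the audit log was written only once
```

If the audit entry is required per call, it must live outside the memoized function.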
Production applications should use established caching libraries rather than custom implementations. These libraries handle edge cases, offer better performance, and provide important features out of the box.
Node.js Caching Libraries:
| Library | Key Features | Performance | Best For |
|---|---|---|---|
| node-cache | TTL, stats, callbacks, clone support | Good | Simple use cases, drop-in solution |
| lru-cache | LRU eviction, TTL, dispose callbacks | Excellent | Memory-sensitive applications |
| quick-lru | Minimal footprint, fast | Excellent | Simple LRU with minimal overhead |
| keyv | Multi-backend (memory, Redis, SQLite) | Good | When you might need external storage later |
| cacheable-request | HTTP caching layer | Good | Caching HTTP client requests |
| memoizee | Function memoization with options | Good | Extensive memoization features |
```typescript
import { LRUCache } from 'lru-cache';

// Configure LRU cache with comprehensive options
const userCache = new LRUCache<string, User>({
  // Maximum number of items
  max: 500,

  // Maximum size in bytes (optional, requires sizeCalculation)
  maxSize: 50 * 1024 * 1024, // 50 MB
  sizeCalculation: (value, key) => {
    return JSON.stringify(value).length + key.length;
  },

  // TTL in milliseconds
  ttl: 1000 * 60 * 5, // 5 minutes

  // Update age on get? (affects TTL behavior)
  updateAgeOnGet: false,
  updateAgeOnHas: false,

  // Called when items are evicted
  dispose: (value, key, reason) => {
    console.log(`Cache evicted: ${key}, reason: ${reason}`);
    // 'set' - replaced by new value
    // 'delete' - explicitly deleted
    // 'evict' - removed due to size/count limit
    // 'expire' - TTL expired
  },

  // If true, dispose() is not called when an entry is overwritten by set()
  noDisposeOnSet: false,

  // Allow stale items to be returned while fetching fresh
  allowStale: true,

  // Async fetch function for cache-aside pattern
  fetchMethod: async (key, staleValue, { options, signal }) => {
    const response = await fetch(`/api/users/${key}`, { signal });
    if (!response.ok) {
      // Return stale if fetch fails
      if (staleValue) return staleValue;
      throw new Error(`User ${key} not found`);
    }
    return response.json();
  },

  // Proactively delete expired entries instead of waiting for access
  ttlAutopurge: true,
});

// Basic operations
userCache.set('user:123', user);
const cached = userCache.get('user:123');
userCache.delete('user:123');

// Using fetch method (stale-while-revalidate built in)
const user = await userCache.fetch('user:123');

// Size stats (track hit/miss counts yourself; lru-cache does not record them)
console.log({
  size: userCache.size,
  calculatedSize: userCache.calculatedSize,
});
```

Java Caching Libraries:
For Java applications, Caffeine is the gold standard for in-process caching, offering near-optimal hit rates through its W-TinyLFU eviction policy:
```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import java.util.concurrent.TimeUnit;

Cache<String, User> cache = Caffeine.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(5, TimeUnit.MINUTES)
    .expireAfterAccess(2, TimeUnit.MINUTES)
    .recordStats()
    .removalListener((key, value, cause) ->
        System.out.println("Evicted: " + key + ", cause: " + cause))
    .build();

// Synchronous loading cache
LoadingCache<String, User> loadingCache = Caffeine.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(5, TimeUnit.MINUTES)
    .build(key -> fetchUserFromDatabase(key));

User user = loadingCache.get("user:123"); // Loads if missing
```
When caching variable-size objects, prefer size-based limits over count-based limits. 100 small objects and 100 large objects use vastly different memory. Libraries like lru-cache support sizeCalculation functions that account for actual memory usage.
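A rough sketch of the idea (the estimator below is an approximation: serialized length is a proxy, not exact heap usage). A `sizeCalculation`-style function lets the limit track bytes instead of entry counts:

```typescript
// Approximate the memory cost of a cache entry via its serialized length.
// This is a proxy for heap usage, not an exact measurement.
function approxEntrySize(key: string, value: unknown): number {
  return key.length + JSON.stringify(value).length;
}

// A count-based limit treats these two entries as equal cost;
// a size-based limit sees the ~100x difference.
const small = approxEntrySize('user:1', { name: 'Ada' });
const large = approxEntrySize('user:2', { bio: 'x'.repeat(2000) });
```

Passing a function like this as `sizeCalculation` makes `maxSize` a byte budget rather than an item count.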
Cache key design is deceptively important. Poor key design leads to cache misses for equivalent requests, key collisions serving wrong data, or unbounded key cardinality exploding memory usage.
Cache Key Principles:
```typescript
// BAD: Missing version - stale data after format changes
const badKey1 = `user:${userId}`;

// GOOD: Include version for schema changes
const goodKey1 = `v1:user:${userId}`;

// BAD: Object as key - different orderings produce different strings
const badKey2 = JSON.stringify({ category, sort, page });
// { category: 'A', sort: 'price' } !== { sort: 'price', category: 'A' }

// GOOD: Normalize key components
function normalizeKey(params: Record<string, any>): string {
  const sorted = Object.keys(params)
    .sort()
    .map(k => `${k}=${params[k]}`)
    .join('&');
  return sorted;
}

// BAD: Unbounded cardinality - caches for every unique search
const badKey3 = `search:${userQuery}`; // "red shoes" vs "Red shoes" vs "RED SHOES"

// GOOD: Normalize and limit cardinality
function searchCacheKey(query: string, page: number): string {
  const normalized = query.toLowerCase().trim().replace(/\s+/g, ' ');
  // Consider: truncate very long queries, or hash them
  if (normalized.length > 100) {
    return `search:${hash(normalized)}:page:${page}`;
  }
  return `search:${normalized}:page:${page}`;
}

// Key namespacing strategy for multi-tenant applications
interface CacheKeyBuilder {
  tenant: string;
  entity: string;
  id: string;
  version?: number;
}

function buildKey({ tenant, entity, id, version = 1 }: CacheKeyBuilder): string {
  return `v${version}:${tenant}:${entity}:${id}`;
}

// Examples:
buildKey({ tenant: 'acme', entity: 'user', id: '12345' });
// => "v1:acme:user:12345"

buildKey({ tenant: 'acme', entity: 'product', id: 'sku-789', version: 2 });
// => "v2:acme:product:sku-789"
```

| Anti-Pattern | Problem | Solution |
|---|---|---|
| Using raw user input | Cardinality explosion, cache pollution | Normalize, validate, truncate inputs |
| Including timestamps | Every request has unique key (0% hit rate) | Remove time-varying components |
| Excluding tenant/user context | Data leakage between users | Include isolation context in key |
| Object reference as key | Memory leak, always misses | Serialize to string |
| No version prefix | Stale data after schema changes | Include schema/version prefix |
| Random/UUID components | Each request is unique | Use deterministic identifiers only |
Include user/tenant identity in cache keys for any user-specific data. A bug that omits user context from the cache key can serve User A's data to User B—a serious security vulnerability. Prefer explicit namespace prefixes: 'user:123:profile' not just 'profile'.
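One defensive pattern, sketched here with hypothetical names: make the user context a required parameter of the key builder and reject empty values, so omitting it becomes a loud error rather than a silent data leak.

```typescript
// Key builder that refuses to produce a key without isolation context.
function userScopedKey(userId: string, resource: string): string {
  if (!userId) {
    throw new Error('cache key requires a user id - refusing to build a shared key');
  }
  return `user:${userId}:${resource}`;
}

const keyA = userScopedKey('123', 'profile'); // "user:123:profile"
const keyB = userScopedKey('456', 'profile'); // "user:456:profile"
// Different users can never collide on the same key
```

Routing all cache reads and writes through a builder like this makes the isolation rule enforceable in one place.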
How and when you populate your cache significantly impacts performance, consistency, and system complexity. The main patterns are cache-aside, read-through, write-through, write-behind, and cache warming.
```typescript
// Pattern 1: Cache-Aside (most common)
async function getUserCacheAside(userId: string): Promise<User> {
  const cacheKey = `user:${userId}`;

  // Step 1: Check cache
  const cached = await cache.get(cacheKey);
  if (cached) {
    return cached;
  }

  // Step 2: Fetch from source on miss
  const user = await database.users.findById(userId);

  // Step 3: Populate cache
  if (user) {
    await cache.set(cacheKey, user, { ttl: 300 });
  }

  return user;
}

// Pattern 2: Read-Through (via library support)
const userCache = new LRUCache<string, User>({
  max: 1000,
  ttl: 300000,
  // Library handles fetching on miss
  fetchMethod: async (userId) => {
    return database.users.findById(userId);
  },
});

async function getUserReadThrough(userId: string): Promise<User> {
  // Single call - cache handles miss logic
  return userCache.fetch(userId);
}

// Pattern 3: Cache Warming (Proactive Population)
async function warmUserCache(userIds: string[]): Promise<void> {
  console.log(`Warming cache for ${userIds.length} users`);

  // Batch fetch from database
  const users = await database.users.findMany({
    where: { id: { in: userIds } },
  });

  // Populate cache in parallel
  await Promise.all(
    users.map(user => cache.set(`user:${user.id}`, user, { ttl: 300 }))
  );

  console.log(`Cache warmed: ${users.length} users`);
}

// Warm on service startup for critical data
async function initializeService() {
  // Get IDs of recently active users
  const recentUserIds = await database.users.findMany({
    where: { lastActiveAt: { gte: new Date(Date.now() - 86400000) } },
    select: { id: true },
  });

  await warmUserCache(recentUserIds.map(u => u.id));
}

// Pattern 4: Preventing Thundering Herd with Coalescing
const inFlightRequests = new Map<string, Promise<any>>();

async function getUserWithCoalescing(userId: string): Promise<User> {
  const cacheKey = `user:${userId}`;

  const cached = await cache.get(cacheKey);
  if (cached) return cached;

  // Check if request already in flight
  const inFlight = inFlightRequests.get(cacheKey);
  if (inFlight) {
    return inFlight; // Wait for existing request
  }

  // New request - track it
  const request = (async () => {
    try {
      const user = await database.users.findById(userId);
      if (user) {
        await cache.set(cacheKey, user, { ttl: 300 });
      }
      return user;
    } finally {
      inFlightRequests.delete(cacheKey);
    }
  })();

  inFlightRequests.set(cacheKey, request);
  return request;
}
```

When a hot cache entry expires, all concurrent requests may simultaneously try to refresh it, overwhelming the data source. Request coalescing ensures only one request goes through while others wait. Libraries like lru-cache with fetchMethod handle this automatically.
Application-level cache invalidation is where "there are only two hard things in computer science" becomes painfully real. The fundamental challenge: keeping cached data consistent with the source of truth.
Invalidation Strategies:
| Strategy | Mechanism | Consistency | Complexity | Best For |
|---|---|---|---|---|
| TTL-based | Entries expire after fixed time | Eventually consistent | Low | Read-heavy data that can be stale |
| Event-based | Invalidate on writes/updates | Strong (if no bugs) | Medium | Critical data requiring freshness |
| Version-based | New version = new cache key | Strong | Low | Immutable/versioned content |
| Tag-based | Group invalidation by tag | Strong (if correct tags) | Medium | Related data that updates together |
| Hybrid | TTL + event-based fallback | Best effort strong | High | Critical data with TTL safety net |
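The version-based row deserves a concrete sketch (illustrative code, not from the text): instead of deleting entries, bump a version counter that is part of every key, so old entries simply stop being referenced and age out via TTL or eviction.

```typescript
// Version-based invalidation: bumping the version makes old keys unreachable.
const versions = new Map<string, number>();  // entity -> current version
const store = new Map<string, string>();     // versioned key -> cached value

function versionedKey(entity: string, id: string): string {
  const v = versions.get(entity) ?? 1;
  return `v${v}:${entity}:${id}`;
}

function invalidateEntity(entity: string): void {
  versions.set(entity, (versions.get(entity) ?? 1) + 1);
  // Old-version entries remain in the store until TTL/eviction removes them
}

store.set(versionedKey('user', '123'), 'profile-v1');
const before = store.get(versionedKey('user', '123')); // hit on v1 key

invalidateEntity('user');
const after = store.get(versionedKey('user', '123'));  // miss: key is now v2
```

This trades extra memory (orphaned old-version entries) for invalidation that is a single counter write with no key enumeration.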
```typescript
// Event-based invalidation with domain events
class UserService {
  async updateUser(userId: string, updates: Partial<User>): Promise<User> {
    // Update database
    const user = await database.users.update({
      where: { id: userId },
      data: updates,
    });

    // Invalidate cache entries
    await this.invalidateUserCache(userId);

    // Publish event for other services
    await eventBus.publish('user.updated', { userId, changes: updates });

    return user;
  }

  private async invalidateUserCache(userId: string): Promise<void> {
    // Invalidate direct user cache
    await cache.delete(`user:${userId}`);

    // Invalidate derived/related caches
    await cache.delete(`user:${userId}:profile`);
    await cache.delete(`user:${userId}:permissions`);

    // Invalidate aggregate caches that include this user
    // This is where it gets complex...
    const userTeams = await database.teamMemberships.findMany({
      where: { userId },
      select: { teamId: true },
    });

    for (const { teamId } of userTeams) {
      await cache.delete(`team:${teamId}:members`);
    }
  }
}

// Tag-based invalidation for related data
class TaggedCache {
  private cache: Map<string, any> = new Map();
  private tagIndex: Map<string, Set<string>> = new Map();

  set(key: string, value: any, tags: string[]): void {
    this.cache.set(key, value);

    // Index by tags
    for (const tag of tags) {
      if (!this.tagIndex.has(tag)) {
        this.tagIndex.set(tag, new Set());
      }
      this.tagIndex.get(tag)!.add(key);
    }
  }

  invalidateByTag(tag: string): number {
    const keys = this.tagIndex.get(tag);
    if (!keys) return 0;

    let count = 0;
    for (const key of keys) {
      if (this.cache.delete(key)) {
        count++;
      }
    }
    this.tagIndex.delete(tag);
    return count;
  }
}

// Usage
const cache = new TaggedCache();

cache.set('product:123', product, ['product:123', 'category:electronics', 'featured']);
cache.set('product:456', product2, ['product:456', 'category:electronics']);

// When electronics category changes
cache.invalidateByTag('category:electronics'); // Clears both products
```

The Cascade Problem:
One of the hardest invalidation challenges is cascading invalidation. User data appears in:
Updating a user requires invalidating all these caches. Missing any leaves stale data.
Strategies for Cascade Invalidation:
A subtle bug: Read-Modify-Write races. Thread A reads stale from cache, Thread B updates DB and invalidates cache, Thread A writes stale to cache. Now cache has stale data. Solutions: write-through (update cache atomically with DB), or add cache entry version checks.
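The version-check mitigation can be sketched as follows (illustrative names, not a library API): attach a monotonically increasing version to each entry and reject writes carrying an older version than what the cache already holds.

```typescript
// Versioned cache entry: a write is applied only if its version is newer.
interface Versioned<T> { value: T; version: number; }

class VersionCheckedCache<T> {
  private entries = new Map<string, Versioned<T>>();

  // Returns true if the write was applied, false if it was stale.
  setIfNewer(key: string, value: T, version: number): boolean {
    const current = this.entries.get(key);
    if (current && current.version >= version) {
      return false; // stale write from a slow reader - drop it
    }
    this.entries.set(key, { value, version });
    return true;
  }

  get(key: string): T | undefined {
    return this.entries.get(key)?.value;
  }
}

const vc = new VersionCheckedCache<string>();
vc.setIfNewer('user:1', 'name=new', 2);                 // writer B: fresh DB state
const applied = vc.setIfNewer('user:1', 'name=old', 1); // reader A: stale, rejected
```

The version typically comes from the database row itself (an `updated_at` timestamp or row version column), so the cache can never move backwards.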
In-process caches are local to each application instance. When running multiple instances behind a load balancer, each instance has its own cache, leading to potential inconsistencies.
The Problem Illustrated:
┌─────────────────┐
│ Load Balancer │
└────────┬────────┘
│
┌─────────────┼─────────────┐
▼ ▼ ▼
Instance A Instance B Instance C
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Cache A │ │ Cache B │ │ Cache C │
│user:123 │ │user:123 │ │user:123 │
│ (v1) │ │ (v2) │ │ (stale) │
└─────────┘ └─────────┘ └─────────┘
│ │ │
└─────────────┴─────────────┘
│
▼
┌──────────┐
│ Database │
│ (v2) │
└──────────┘
Each instance may have different versions cached!
| Strategy | How It Works | Tradeoffs |
|---|---|---|
| Accept inconsistency | Use TTLs; instances eventually converge | Simple but unpredictable user experience |
| Shared external cache | Replace in-process with Redis/Memcached | Network overhead but consistent |
| Two-tier caching | In-process (fast) + shared (consistent) | Complex but optimal performance |
| Cache invalidation broadcast | Publish invalidation events to all instances | Requires messaging infrastructure |
| Sticky sessions | Route user to same instance | Defeats load balancing benefits |
```typescript
// Cache invalidation broadcast using Redis Pub/Sub
import Redis from 'ioredis';
import { LRUCache } from 'lru-cache';
import crypto from 'node:crypto';

class DistributedCache<T> {
  private localCache: LRUCache<string, T>;
  private redis: Redis;
  private subscriber: Redis;
  private instanceId: string;
  private channel = 'cache:invalidation';

  constructor(redisUrl: string, maxSize: number = 1000) {
    this.instanceId = crypto.randomUUID();
    this.localCache = new LRUCache({ max: maxSize, ttl: 300000 });
    this.redis = new Redis(redisUrl);
    this.subscriber = new Redis(redisUrl);
    this.setupSubscription();
  }

  private async setupSubscription(): Promise<void> {
    await this.subscriber.subscribe(this.channel);

    this.subscriber.on('message', (channel, message) => {
      if (channel !== this.channel) return;

      const { senderId, pattern, keys } = JSON.parse(message);

      // Ignore our own messages
      if (senderId === this.instanceId) return;

      // Process invalidation
      if (pattern) {
        this.invalidatePattern(pattern);
      } else if (keys) {
        for (const key of keys) {
          this.localCache.delete(key);
        }
      }
    });
  }

  async get(key: string): Promise<T | undefined> {
    // Check local cache first
    const local = this.localCache.get(key);
    if (local !== undefined) {
      return local;
    }
    // Fall through to Redis or database
    return undefined;
  }

  async set(key: string, value: T): Promise<void> {
    this.localCache.set(key, value);
  }

  async invalidate(keys: string[]): Promise<void> {
    // Invalidate locally
    for (const key of keys) {
      this.localCache.delete(key);
    }

    // Broadcast to other instances
    await this.redis.publish(this.channel, JSON.stringify({
      senderId: this.instanceId,
      keys,
    }));
  }

  async invalidateByPattern(pattern: string): Promise<void> {
    // Invalidate matching keys locally
    this.invalidatePattern(pattern);

    // Broadcast pattern invalidation
    await this.redis.publish(this.channel, JSON.stringify({
      senderId: this.instanceId,
      pattern,
    }));
  }

  private invalidatePattern(pattern: string): void {
    const regex = new RegExp(pattern.replace(/\*/g, '.*'));
    for (const key of this.localCache.keys()) {
      if (regex.test(key)) {
        this.localCache.delete(key);
      }
    }
  }
}

// Usage
const cache = new DistributedCache<User>('redis://localhost:6379');

// When user is updated on any instance
async function updateUser(userId: string, data: Partial<User>) {
  await database.users.update(userId, data);
  await cache.invalidate([`user:${userId}`, `user:${userId}:profile`]);
  // All instances now have invalidated this key
}
```

The optimal pattern for high-performance systems: fast in-process cache (nanosecond access) backed by shared external cache (sub-millisecond access). Check local first, then Redis, then database. This gives you speed AND consistency with appropriate TTLs and invalidation broadcast.
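That lookup order can be compressed into a small sketch, with plain Maps standing in for the Redis tier and the database so the flow is testable without infrastructure:

```typescript
// Two-tier read path: local (fastest) -> shared (consistent) -> source of truth.
const localTier = new Map<string, string>();   // in-process cache
const sharedTier = new Map<string, string>();  // stand-in for Redis
const sourceOfTruth = new Map<string, string>([['user:1', 'Ada']]); // stand-in for the DB

function twoTierGet(key: string): string | undefined {
  const local = localTier.get(key);
  if (local !== undefined) return local;  // tier 1 hit: nanoseconds

  const shared = sharedTier.get(key);
  if (shared !== undefined) {
    localTier.set(key, shared);           // promote to local tier
    return shared;
  }

  const value = sourceOfTruth.get(key);   // tier 3: the database
  if (value !== undefined) {
    sharedTier.set(key, value);           // populate both tiers on the way back
    localTier.set(key, value);
  }
  return value;
}

const first = twoTierGet('user:1');  // falls through to the source
const second = twoTierGet('user:1'); // now served from the local tier
```

In a real deployment the invalidation broadcast shown above would clear the local tier on every instance, while the shared tier holds the single consistent copy.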
Application-level caching provides the most control over what gets cached and when. In-process caches offer nanosecond access times, memoization elegantly handles repeated computations, and careful cache key design prevents subtle bugs. However, this control comes with responsibility for memory management, invalidation correctness, and multi-instance consistency.
What's Next:
Moving closer to the data layer, the next page explores Database Query Caching—where we cache at the query level, leveraging database-specific optimizations, query result caching, and the tradeoffs between caching in the application versus the database layer.
You now understand application-level caching—from in-process LRU caches to memoization patterns, cache key design, population strategies, invalidation challenges, and multi-instance consistency. You can implement robust caching that accelerates your services without introducing subtle data consistency bugs.