In write-around caching, cache misses are not just expected—they're fundamental to how the system works. Unlike write-through caching, where writes keep the cache populated, write-around treats read misses as the primary mechanism for cache population. Every piece of data in the cache arrived there through a read miss.
But cache misses have costs: increased latency for the requesting client, additional load on the database, and potential for cascading failures under high concurrency. Understanding read miss behavior in depth is essential for building resilient write-around systems.
By the end of this page, you will understand the complete anatomy of a cache miss—the latency breakdown, database load implications, failure cascades, and the full arsenal of mitigation strategies. You'll be equipped to design systems that handle cache misses gracefully even under extreme load.
A cache miss in write-around caching triggers a multi-step process. Understanding each step—and its latency contribution—is crucial for performance analysis and optimization.
The Cache Miss Timeline:
┌──────────────────────────────────────────────────────────────────────────────────┐
│                             Total Cache Miss Latency                              │
├────────────┬──────────────┬─────────────────┬────────────────┬───────────────────┤
│   Cache    │   Network    │    Database     │    Network     │       Cache       │
│   Lookup   │    to DB     │      Query      │    from DB     │    Population     │
│   (miss)   │              │                 │                │                   │
├────────────┼──────────────┼─────────────────┼────────────────┼───────────────────┤
│  0.1-1ms   │   0.5-5ms    │     1-100ms     │    0.5-5ms     │      0.1-2ms      │
└────────────┴──────────────┴─────────────────┴────────────────┴───────────────────┘
Total: 2-115ms (typically 10-30ms)
| Step | Typical Latency | Variability | Optimization Opportunities |
|---|---|---|---|
| Cache lookup (miss) | 0.1-1ms | Low | Fast cache client, connection pooling |
| Network to database | 0.5-5ms | Medium | Co-location, connection pooling |
| Database query execution | 1-100ms | High | Query optimization, indexing |
| Data serialization | 0.1-5ms | Medium | Efficient serialization (protobuf) |
| Network from database | 0.5-5ms | Medium | Compact payloads, compression |
| Cache population | 0.1-2ms | Low | Async population, pipelining |
| Response to client | 0.1-5ms | Medium | Efficient serialization |
```typescript
interface MissLatencyBreakdown {
  cacheLookupMs: number;
  networkToDbMs: number;
  dbQueryMs: number;
  dbResultSizeBytes: number;
  cachePopulationMs: number;
  totalMs: number;
}

class InstrumentedCacheMissHandler<T> {
  async handleCacheMiss(key: string): Promise<{ data: T | null; breakdown: MissLatencyBreakdown }> {
    const breakdown: MissLatencyBreakdown = {
      cacheLookupMs: 0,
      networkToDbMs: 0,
      dbQueryMs: 0,
      dbResultSizeBytes: 0,
      cachePopulationMs: 0,
      totalMs: 0,
    };
    const totalStart = performance.now();

    // Step 1: Cache lookup (resulting in miss)
    const cacheStart = performance.now();
    const cached = await this.cache.get(key);
    breakdown.cacheLookupMs = performance.now() - cacheStart;
    if (cached !== null) {
      throw new Error("Expected cache miss, got hit");
    }

    // Step 2: Database query
    const dbStart = performance.now();
    const dbResult = await this.database.getWithMetadata(key);
    breakdown.dbQueryMs = performance.now() - dbStart;

    if (dbResult !== null) {
      breakdown.dbResultSizeBytes = JSON.stringify(dbResult.data).length;

      // Step 3: Cache population
      const cachePopStart = performance.now();
      await this.cache.set(key, dbResult.data, this.ttl);
      breakdown.cachePopulationMs = performance.now() - cachePopStart;
    }

    breakdown.totalMs = performance.now() - totalStart;
    breakdown.networkToDbMs =
      breakdown.totalMs -
      breakdown.cacheLookupMs -
      breakdown.dbQueryMs -
      breakdown.cachePopulationMs;

    // Log for analysis
    this.metrics.recordMiss(breakdown);

    return {
      data: dbResult?.data ?? null,
      breakdown,
    };
  }

  // Analysis: Where is time being spent?
  analyzeMissProfile(): MissAnalysis {
    const avgBreakdown = this.metrics.getAverageBreakdown();

    return {
      bottleneck: this.identifyBottleneck(avgBreakdown),
      dbQueryPercentage: (avgBreakdown.dbQueryMs / avgBreakdown.totalMs) * 100,
      networkPercentage: (avgBreakdown.networkToDbMs / avgBreakdown.totalMs) * 100,
      cacheOverheadPercentage:
        ((avgBreakdown.cacheLookupMs + avgBreakdown.cachePopulationMs) / avgBreakdown.totalMs) * 100,
      recommendations: this.generateRecommendations(avgBreakdown),
    };
  }

  private identifyBottleneck(breakdown: MissLatencyBreakdown): string {
    const components = [
      { name: 'database_query', value: breakdown.dbQueryMs },
      { name: 'network', value: breakdown.networkToDbMs },
      { name: 'cache_overhead', value: breakdown.cacheLookupMs + breakdown.cachePopulationMs },
    ];
    return components.sort((a, b) => b.value - a.value)[0].name;
  }
}
```

In most systems, the database query accounts for 50-80% of cache miss latency. This is why database optimization (indexes, query tuning, connection pooling) has the highest impact on miss performance. Cache layer optimization provides diminishing returns if the database is slow.
Every cache miss translates to a database query. In write-around caching, understanding the relationship between miss rate and database load is critical for capacity planning and avoiding cascading failures.
```typescript
interface LoadModel {
  totalReadQPS: number;       // Total read queries per second
  cacheHitRate: number;       // Percentage of reads served from cache
  missRate: number;           // Percentage of reads causing DB queries
  dbQueryQPS: number;         // Resulting database queries per second
  avgQueryLatencyMs: number;  // Average DB query time
  dbCPUUtilization: number;   // Estimated CPU usage
}

function modelDatabaseLoad(
  totalReads: number,
  hitRate: number,
  avgDbLatencyMs: number,
  dbMaxQPS: number
): LoadModel {
  const missRate = 1 - hitRate;
  const dbQueryQPS = totalReads * missRate;
  const dbCPUUtilization = dbQueryQPS / dbMaxQPS;

  return {
    totalReadQPS: totalReads,
    cacheHitRate: hitRate,
    missRate: missRate,
    dbQueryQPS: dbQueryQPS,
    avgQueryLatencyMs: avgDbLatencyMs,
    dbCPUUtilization: dbCPUUtilization,
  };
}

// Example scenarios
const scenarios = {
  // Healthy: High cache hit rate
  healthy: modelDatabaseLoad(
    10000, // 10K reads/sec
    0.95,  // 95% cache hit rate
    5,     // 5ms average query
    1000   // DB can handle 1000 QPS
  ),
  // Result: 500 DB queries/sec (50% DB utilization)

  // Warming: Cold cache after restart
  warmingPhase: modelDatabaseLoad(
    10000, // 10K reads/sec
    0.50,  // 50% hit rate (cache warming)
    5,     // 5ms average query
    1000   // DB can handle 1000 QPS
  ),
  // Result: 5000 DB queries/sec (500% DB - OVERLOADED!)

  // Degraded: Cache partially failed
  degraded: modelDatabaseLoad(
    10000, // 10K reads/sec
    0.80,  // 80% hit rate
    15,    // 15ms (DB slowing under load)
    1000   // DB can handle 1000 QPS
  ),
  // Result: 2000 DB queries/sec (200% DB - DANGER ZONE)
};

// The takeaway: Hit rate directly determines DB load
// Small changes in hit rate cause large changes in DB load
```

The Amplification Effect:
Consider a system serving 10,000 reads/second:
| Cache Hit Rate | DB Queries/sec | DB Load Change |
|---|---|---|
| 99% | 100 | Baseline |
| 95% | 500 | 5x increase! |
| 90% | 1,000 | 10x increase! |
| 80% | 2,000 | 20x increase! |
| 50% | 5,000 | 50x increase! |
A drop of just four percentage points in hit rate (99% → 95%) causes a 5x increase in database load, because database load tracks the miss rate (1 − hit rate), not the hit rate itself. This leverage is why cache health monitoring and protection mechanisms are critical.
When database load increases, query latency increases. Slower queries mean requests hold connections longer, reducing connection pool availability. This causes more timeouts, more retries, more load—a death spiral. Cache misses can trigger cascading failures if not managed carefully.
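To make that feedback loop concrete, here is a rough, illustrative model—not measurements from any real system. It assumes queueing-style latency growth as the database approaches saturation and that every timed-out query is retried exactly once; both the latency curve and the retry policy are simplifying assumptions.

```typescript
// Illustrative sketch of the miss-driven retry spiral (assumed numbers, not a benchmark).
// Model: query latency grows as the database nears saturation, and every request that
// exceeds the client timeout is retried, adding to the next round's offered load.
function simulateRetrySpiral(
  missQPS: number,       // cache misses per second hitting the database
  dbMaxQPS: number,      // database capacity
  baseLatencyMs: number, // query latency at low load
  timeoutMs: number      // client timeout that triggers a retry
): void {
  let offeredQPS = missQPS;
  for (let round = 1; round <= 5; round++) {
    // Clamp utilization so the toy latency formula stays finite
    const utilization = Math.min(offeredQPS / dbMaxQPS, 0.99);
    const latencyMs = baseLatencyMs / (1 - utilization);
    const timeoutRate = latencyMs > timeoutMs ? (latencyMs - timeoutMs) / latencyMs : 0;
    console.log(
      `round ${round}: offered=${offeredQPS.toFixed(0)} qps, ` +
      `latency≈${latencyMs.toFixed(0)}ms, timeouts≈${(timeoutRate * 100).toFixed(0)}%`
    );
    // Timed-out requests come back as retries on top of the steady miss traffic
    offeredQPS = missQPS + offeredQPS * timeoutRate;
  }
}

simulateRetrySpiral(500, 1000, 10, 60); // stable: latency stays under the timeout
simulateRetrySpiral(900, 1000, 10, 60); // spiral: retries push offered load higher each round
```

The point of the sketch is the shape of the curve, not the numbers: once latency crosses the client timeout, retries add load, which adds latency, which adds more retries.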
Cache misses, while normal in isolation, can trigger cascading failures when they occur at scale. Understanding these failure scenarios helps you design protective mechanisms.
```typescript
// Scenario: What happens when a popular cache key expires?

interface SimulationResult {
  dbQueriesTriggered: number;
  totalLatencyMs: number;
  failureOccurred: boolean;
  failureType?: string;
}

async function simulateCacheStampede(
  concurrentRequests: number,
  dbMaxQPS: number,
  dbLatencyMs: number
): Promise<SimulationResult> {
  // All requests arrive at the same moment for an expired key
  let dbQueriesTriggered = 0;
  let failureOccurred = false;
  let failureType: string | undefined;

  // Without protection: All requests hit the database
  dbQueriesTriggered = concurrentRequests;

  // Database response based on load
  const actualLoadRatio = dbQueriesTriggered / dbMaxQPS;

  if (actualLoadRatio > 3) {
    failureOccurred = true;
    failureType = 'database_connection_exhausted';
  } else if (actualLoadRatio > 1.5) {
    failureOccurred = true;
    failureType = 'database_timeout_cascade';
  }

  // Latency increases non-linearly with load
  const latencyMultiplier = Math.pow(actualLoadRatio, 1.5);
  const actualLatency = dbLatencyMs * latencyMultiplier;

  return {
    dbQueriesTriggered,
    totalLatencyMs: actualLatency,
    failureOccurred,
    failureType,
  };
}

// Example: 1000 concurrent requests, DB can handle 100 QPS
const naiveResult = await simulateCacheStampede(1000, 100, 10);
// Result: 1000 DB queries, 10x overload, failure!

// With request coalescing:
async function simulateWithCoalescing(
  concurrentRequests: number,
  dbMaxQPS: number,
  dbLatencyMs: number
): Promise<SimulationResult> {
  // Only ONE request hits the database
  // Others wait for the result
  const dbQueriesTriggered = 1;

  return {
    dbQueriesTriggered,
    totalLatencyMs: dbLatencyMs, // Single query latency
    failureOccurred: false,
  };
}

const coalescedResult = await simulateWithCoalescing(1000, 100, 10);
// Result: 1 DB query, no overload, success!
```

| Scenario | Trigger | Without Protection | With Protection |
|---|---|---|---|
| Cache Stampede | Hot key expiry | 1000s of DB queries | 1 query (coalescing) |
| Cold Start | Cache restart | 100% miss rate | Gradual traffic shift |
| Mass Eviction | Memory pressure | All evicted keys query DB | Rate limiting, backpressure |
| TTL Sync | Batch of same TTL | Periodic spikes | TTL jitter |
Your system will eventually experience every failure scenario. A popular key WILL expire exactly when traffic is highest. Your cache WILL restart unexpectedly. Design protective mechanisms before they're needed, not after the first outage.
A comprehensive miss mitigation strategy combines multiple techniques to handle cache misses gracefully under all conditions.
```typescript
class ProtectedWriteAroundCache<T> {
  private coalescing = new Map<string, Promise<T | null>>();
  private circuitBreaker: CircuitBreaker;
  private rateLimiter: RateLimiter;

  constructor(
    private cache: CacheStore<T>,
    private database: Database<T>,
    private config: CacheConfig
  ) {
    this.circuitBreaker = new CircuitBreaker({
      failureThreshold: 5,
      resetTimeoutMs: 30000,
    });
    this.rateLimiter = new RateLimiter({
      maxRequestsPerSecond: config.maxDbQPS,
    });
  }

  async read(key: string): Promise<T | null> {
    // Layer 1: Check cache (with stale-while-revalidate)
    const cached = await this.cache.getWithMetadata(key);
    if (cached !== null) {
      const { data, ttlRemaining, setTime } = cached;

      // If near expiry, trigger background refresh
      if (ttlRemaining < this.config.refreshThreshold) {
        this.backgroundRefresh(key);
      }
      return data;
    }

    // Layer 2: Request coalescing
    if (this.coalescing.has(key)) {
      return this.coalescing.get(key)!;
    }

    // Layer 3: Circuit breaker check
    if (!this.circuitBreaker.canRequest()) {
      return this.handleCircuitOpen(key);
    }

    // Layer 4: Rate limiting
    if (!this.rateLimiter.tryAcquire()) {
      return this.handleRateLimited(key);
    }

    // Execute with protection
    const promise = this.fetchWithProtection(key);
    this.coalescing.set(key, promise);

    try {
      return await promise;
    } finally {
      this.coalescing.delete(key);
    }
  }

  private async fetchWithProtection(key: string): Promise<T | null> {
    const timeout = new Promise<never>((_, reject) =>
      setTimeout(() => reject(new Error('Timeout')), this.config.dbTimeoutMs)
    );

    try {
      const result = await Promise.race([
        this.database.get(key),
        timeout,
      ]);

      this.circuitBreaker.recordSuccess();

      if (result !== null) {
        await this.cache.set(key, result, this.jitteredTTL());
      }
      return result;
    } catch (error) {
      this.circuitBreaker.recordFailure();
      throw error;
    }
  }

  private async backgroundRefresh(key: string): Promise<void> {
    // Don't await - fire and forget
    (async () => {
      try {
        const fresh = await this.database.get(key);
        if (fresh !== null) {
          await this.cache.set(key, fresh, this.jitteredTTL());
        }
      } catch {
        // Background refresh failure is not critical
      }
    })();
  }

  private handleCircuitOpen(key: string): T | null {
    // Options:
    // 1. Return stale data if available
    // 2. Return fallback/default value
    // 3. Return null and let caller handle
    return this.getFallbackValue(key);
  }

  private handleRateLimited(key: string): Promise<T | null> {
    // Wait and retry, or return fallback
    return new Promise(resolve => {
      setTimeout(async () => {
        resolve(await this.read(key));
      }, 100);
    });
  }

  private jitteredTTL(): number {
    const jitter = this.config.baseTTL * 0.1;
    return this.config.baseTTL + (Math.random() - 0.5) * 2 * jitter;
  }
}
```

No single mitigation strategy is sufficient. The most resilient systems layer multiple protections: TTL jitter prevents synchronized expirations, coalescing handles stampedes, circuit breakers stop cascading failures, and stale-while-revalidate ensures availability during database issues.
Stale-While-Revalidate (SWR) is a cache strategy that serves stale (expired) data immediately while asynchronously fetching fresh data in the background. This pattern eliminates perceived cache misses entirely from the user's perspective.
How SWR Works:
┌───────────────────┬───────────────────┬─────────────────────┐
│   Fresh Period    │   Stale Period    │       Expired       │
│  (serve as-is)    │ (serve + refresh) │    (cache miss)     │
├───────────────────┼───────────────────┼─────────────────────┤
│    TTL: 300s      │       +60s        │                     │
└───────────────────┴───────────────────┴─────────────────────┘
```typescript
interface SWRConfig {
  freshTTL: number;  // Time (seconds) data is considered fresh
  staleTTL: number;  // Time (seconds) stale data can still be served
}

class StaleWhileRevalidateCache<T> {
  private refreshInFlight = new Map<string, boolean>();

  constructor(
    private cache: CacheStore<T>,
    private database: Database<T>,
    private config: SWRConfig
  ) {}

  async read(key: string): Promise<T | null> {
    const entry = await this.cache.getWithTimestamps(key);

    if (entry !== null) {
      const { data, setTime } = entry;
      const age = Date.now() - setTime;

      // Case 1: Fresh - serve directly
      if (age < this.config.freshTTL * 1000) {
        return data;
      }

      // Case 2: Stale but serveable - serve and refresh
      if (age < (this.config.freshTTL + this.config.staleTTL) * 1000) {
        // Serve stale data immediately
        this.triggerBackgroundRefresh(key);
        return data; // User sees no latency!
      }

      // Case 3: Too stale - treat as miss (fall through)
    }

    // True cache miss - must wait for database
    return this.fetchAndCache(key);
  }

  private triggerBackgroundRefresh(key: string): void {
    // Prevent multiple concurrent refreshes for same key
    if (this.refreshInFlight.get(key)) {
      return;
    }
    this.refreshInFlight.set(key, true);

    // Fire and forget
    (async () => {
      try {
        const fresh = await this.database.get(key);
        if (fresh !== null) {
          await this.cache.setWithTimestamp(key, fresh, {
            setTime: Date.now(),
            expiresAt: Date.now() + (this.config.freshTTL + this.config.staleTTL) * 1000,
          });
        }
      } finally {
        this.refreshInFlight.delete(key);
      }
    })();
  }

  private async fetchAndCache(key: string): Promise<T | null> {
    const data = await this.database.get(key);
    if (data !== null) {
      await this.cache.setWithTimestamp(key, data, {
        setTime: Date.now(),
        expiresAt: Date.now() + (this.config.freshTTL + this.config.staleTTL) * 1000,
      });
    }
    return data;
  }
}

// Usage example
const swrCache = new StaleWhileRevalidateCache(cache, db, {
  freshTTL: 300, // 5 minutes fresh
  staleTTL: 60,  // 1 minute stale grace period
});

// User always sees low latency
// During stale period: 0.5ms (cache read)
// True miss: 15ms (database fetch)
// The stale period absorbs the refresh latency
```

| Data State | User Latency | Background Action | Data Freshness |
|---|---|---|---|
| Fresh (< freshTTL) | ~1ms (cache) | None | Current |
| Stale (< freshTTL + staleTTL) | ~1ms (cache) | Async DB fetch | Slightly outdated |
| Expired (> freshTTL + staleTTL) | ~20ms (DB) | None (waited) | Fresh on response |
The stale-while-revalidate pattern is standardized in HTTP's Cache-Control header. CDNs like Cloudflare and Fastly support 'stale-while-revalidate' directives, allowing edge caches to serve stale content while fetching updates from origin servers.
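For example, a response header like `Cache-Control: max-age=300, stale-while-revalidate=60` tells downstream caches to treat the response as fresh for 300 seconds and to keep serving it for up to 60 additional seconds while revalidating it in the background; the specific values here are illustrative and mirror the freshTTL/staleTTL settings above.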
The circuit breaker pattern protects downstream services (your database) when they're struggling. Instead of continuing to hammer a failing database with requests, the circuit breaker 'opens' to stop the flood and give the database time to recover.
Circuit Breaker States:
┌─────────────────────┐
│       CLOSED        │  Normal operation
│   (Requests pass)   │  Monitor for failures
└──────────┬──────────┘
           │ Failure threshold exceeded
           ▼
┌─────────────────────┐
│        OPEN         │  Stop all requests
│ (Fail immediately)  │  Wait for cooldown
└──────────┬──────────┘
           │ Cooldown period expires
           ▼
┌─────────────────────┐
│      HALF-OPEN      │  Test with limited requests
│ (Allow some tests)  │  Determine if recovered
└──────────┬──────────┘
           │
    ┌──────┴──────────┐
    │ Test succeeds   │ Test fails
    ▼                 ▼
┌────────┐        ┌────────┐
│ CLOSED │        │  OPEN  │
└────────┘        └────────┘
```typescript
enum CircuitState {
  CLOSED = 'CLOSED',       // Normal, requests pass through
  OPEN = 'OPEN',           // Failing, requests rejected
  HALF_OPEN = 'HALF_OPEN', // Testing if recovered
}

interface CircuitBreakerConfig {
  failureThreshold: number; // Failures before opening
  successThreshold: number; // Successes to close from half-open
  resetTimeoutMs: number;   // Time before trying again
}

class CircuitBreaker {
  private state: CircuitState = CircuitState.CLOSED;
  private failures = 0;
  private successes = 0;
  private lastFailureTime = 0;

  constructor(private config: CircuitBreakerConfig) {}

  canRequest(): boolean {
    switch (this.state) {
      case CircuitState.CLOSED:
        return true;

      case CircuitState.OPEN:
        // Check if cooldown has elapsed
        if (Date.now() - this.lastFailureTime > this.config.resetTimeoutMs) {
          this.transitionTo(CircuitState.HALF_OPEN);
          return true;
        }
        return false;

      case CircuitState.HALF_OPEN:
        // Allow limited test requests
        return true;
    }
  }

  recordSuccess(): void {
    if (this.state === CircuitState.HALF_OPEN) {
      this.successes++;
      if (this.successes >= this.config.successThreshold) {
        this.transitionTo(CircuitState.CLOSED);
      }
    }
    this.failures = 0; // Reset failure count on success
  }

  recordFailure(): void {
    this.failures++;
    this.lastFailureTime = Date.now();

    if (this.state === CircuitState.HALF_OPEN) {
      this.transitionTo(CircuitState.OPEN);
    } else if (this.failures >= this.config.failureThreshold) {
      this.transitionTo(CircuitState.OPEN);
    }
  }

  private transitionTo(newState: CircuitState): void {
    console.log(`Circuit breaker: ${this.state} -> ${newState}`);
    this.state = newState;

    if (newState === CircuitState.CLOSED) {
      this.failures = 0;
      this.successes = 0;
    } else if (newState === CircuitState.HALF_OPEN) {
      this.successes = 0;
    }
  }

  getState(): CircuitState {
    return this.state;
  }
}

// Usage in cache read
class CacheWithCircuitBreaker<T> {
  private circuitBreaker = new CircuitBreaker({
    failureThreshold: 5,
    successThreshold: 2,
    resetTimeoutMs: 30000,
  });

  async read(key: string): Promise<T | null> {
    const cached = await this.cache.get(key);
    if (cached !== null) return cached;

    // Check circuit breaker before DB call
    if (!this.circuitBreaker.canRequest()) {
      // Return fallback instead of hitting DB
      return this.fallbackValue(key);
    }

    try {
      const data = await this.database.get(key);
      this.circuitBreaker.recordSuccess();

      if (data !== null) {
        await this.cache.set(key, data);
      }
      return data;
    } catch (error) {
      this.circuitBreaker.recordFailure();
      throw error;
    }
  }
}
```

When the circuit opens, cache misses return fallback values or errors—not real data. Users experience degraded functionality. This is intentional: it's better to serve fallbacks than to cascade failures across your entire system. Design your fallback behavior carefully.
Effective monitoring of cache miss behavior is essential for understanding system health and preventing issues before they become outages.
```typescript
interface CacheMissMetrics {
  // Core metrics
  totalReads: Counter;
  cacheHits: Counter;
  cacheMisses: Counter;

  // Latency histograms
  missLatency: Histogram;
  dbQueryLatency: Histogram;

  // Protection metrics
  coalescedRequests: Counter;
  circuitBreakerOpens: Counter;
  staleServes: Counter;
  rateLimitedRequests: Counter;

  // Health indicators
  dbConnectionPoolUsage: Gauge;
  cacheMemoryUsage: Gauge;
  evictionRate: Gauge;
}

class MetricsCollector {
  private metrics: CacheMissMetrics;

  recordCacheMiss(latencyMs: number, dbLatencyMs: number): void {
    this.metrics.cacheMisses.inc();
    this.metrics.missLatency.observe(latencyMs);
    this.metrics.dbQueryLatency.observe(dbLatencyMs);
  }

  recordCoalescedRequest(): void {
    this.metrics.coalescedRequests.inc();
  }

  // Alert thresholds
  checkAlerts(): Alert[] {
    const alerts: Alert[] = [];

    const hitRate = this.calculateHitRate();
    if (hitRate < 0.80) {
      alerts.push({
        severity: 'warning',
        message: `Cache hit rate dropped to ${(hitRate * 100).toFixed(1)}%`,
      });
    }

    const p99Latency = this.metrics.missLatency.getPercentile(0.99);
    if (p99Latency > 100) {
      alerts.push({
        severity: 'critical',
        message: `Cache miss p99 latency: ${p99Latency}ms exceeds threshold`,
      });
    }

    const evictionRate = this.metrics.evictionRate.get();
    if (evictionRate > 1000) {
      alerts.push({
        severity: 'warning',
        message: `High cache eviction rate: ${evictionRate}/sec`,
      });
    }

    return alerts;
  }

  // Dashboard data
  getDashboardData(): DashboardSnapshot {
    return {
      hitRate: this.calculateHitRate(),
      missRate: 1 - this.calculateHitRate(),
      avgMissLatencyMs: this.metrics.missLatency.getMean(),
      p99MissLatencyMs: this.metrics.missLatency.getPercentile(0.99),
      dbQPSFromMisses: this.metrics.cacheMisses.getRate(),
      coalescingEfficiency: this.getCoalescingEfficiency(),
      circuitBreakerState: this.getCircuitState(),
      cacheMemoryUsagePercent: this.metrics.cacheMemoryUsage.get(),
    };
  }
}
```

| Metric | Warning Threshold | Critical Threshold | Action |
|---|---|---|---|
| Hit Rate | < 85% | < 70% | Investigate cache size, TTL, access patterns |
| Miss p99 Latency | 50ms | 100ms | Check database performance, network |
| DB QPS from Misses | 50% DB capacity | 80% DB capacity | Increase cache, reduce traffic |
| Circuit Breaker Opens | Any occurrence | Multiple/hour | Investigate database health |
| Eviction Rate | 100/sec | 1000/sec | Increase cache size or reduce TTL |
Sometimes the absolute values look fine, but a sudden change indicates problems. Alert on rate of change: if hit rate drops 10% in 5 minutes, something is wrong even if the absolute hit rate is still acceptable.
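A minimal sketch of such a rate-of-change check is shown below; it assumes hit-rate samples are recorded periodically, and the five-minute window and ten-point drop threshold are illustrative choices, not recommendations from this page.

```typescript
// Illustrative rate-of-change alert: compare the current hit rate with the oldest
// sample inside a lookback window and alert on the delta, even if the absolute
// value still looks acceptable.
class HitRateTrendAlert {
  private samples: { timestamp: number; hitRate: number }[] = [];

  constructor(
    private windowMs = 5 * 60 * 1000, // lookback window: 5 minutes (assumed)
    private maxDrop = 0.10            // alert if hit rate falls 10 points (assumed)
  ) {}

  record(hitRate: number, now = Date.now()): string | null {
    this.samples.push({ timestamp: now, hitRate });
    // Keep only samples inside the lookback window
    this.samples = this.samples.filter(s => now - s.timestamp <= this.windowMs);

    const oldest = this.samples[0];
    const drop = oldest.hitRate - hitRate;
    if (drop >= this.maxDrop) {
      return `Hit rate fell ${(drop * 100).toFixed(1)} points in ` +
             `${((now - oldest.timestamp) / 60000).toFixed(1)} min ` +
             `(${(oldest.hitRate * 100).toFixed(1)}% -> ${(hitRate * 100).toFixed(1)}%)`;
    }
    return null;
  }
}

// Example: a 96% -> 85% slide within the window triggers an alert even though 85%
// may still be above the absolute warning threshold.
const trend = new HitRateTrendAlert();
trend.record(0.96);
trend.record(0.85); // returns an alert message
```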
Cache misses in write-around caching are a design feature, not a bug. Understanding their behavior—latency profile, database impact, failure scenarios, and mitigation strategies—is essential for building resilient systems.
What's Next:
Now that you deeply understand how cache misses behave and how to handle them, the final page explores when to use write-around caching—the use cases, workload patterns, and system characteristics that make write-around the optimal choice.
You now have a complete understanding of cache miss behavior in write-around caching—the latency anatomy, database load implications, cascading failure risks, and the full toolkit of mitigation strategies. You can design systems that handle misses gracefully under normal and extreme conditions.