CDN Caching Mechanics

Level: Intermediate

Duration: 75 mins

Topic: CDN Caching Mechanics

Stale-While-Revalidate: Instant Responses, Fresh Content

The Latency Killer That Changed Web Performance

In traditional caching, there's an awkward moment when cached content expires. The user's request arrives, the cache realizes its content is stale, and the user must wait while the cache fetches fresh content from the origin. For that unlucky user, the cache provides zero benefit—they experience the full origin latency as if the cache didn't exist.

The stale-while-revalidate (SWR) pattern elegantly solves this problem. Instead of making the user wait, the cache immediately serves the stale content while simultaneously refreshing in the background. The current user gets an instant response (with slightly dated content), and subsequent users get fresh content.

This seemingly simple shift in strategy has profound implications. SWR effectively eliminates latency spikes at cache boundaries, provides natural protection against origin failures, and enables aggressive caching without sacrificing content freshness. It's one of the most impactful optimizations in modern CDN architecture.

What You Will Learn

By the end of this page, you will understand the exact mechanics of stale-while-revalidate, how to configure it across different CDN providers, the subtle edge cases that can cause problems, and how to combine SWR with other caching strategies for optimal performance.

The Problem Stale-While-Revalidate Solves

To appreciate stale-while-revalidate, we must first understand the fundamental problem it addresses: cache expiration latency spikes.

The Traditional Cache Expiration Problem:

Consider a product page cached for 5 minutes. During those 5 minutes, every user gets instant sub-50ms responses. But the moment the cache expires:

User requests the page
Cache checks: "Is this fresh?" → No, TTL expired
Cache BLOCKS the user's request
Cache sends request to origin (300ms network latency)
Origin processes request (200ms application latency)
Origin returns response to cache (300ms network latency)
Cache stores the fresh response
Cache finally returns the response to the user

Result: That user waited 800ms instead of 50ms. And if this is a popular page, potentially hundreds of users could be blocked simultaneously during this refresh window.

LATENCY OVER TIME (Traditional Caching with max-age=300):
 
Latency (ms)
│
800 │                    ×                    ×
    │                   /|                   /|
600 │                  / |                  / |
    │                 /  |                 /  |
400 │                /   |                /   |
    │               /    |               /    |
200 │              /     |              /     |
    │             /      |             /      |
 50 │────────────·       ·────────────·       ·────
    │                                              
    └───────────────────────────────────────────────→ Time
    0        5min      5:01min    10min    10:01min
 
PATTERN:
- Perfect latency (50ms) during cache-fresh period
- Latency spike (800ms) exactly at cache expiry
- First request after expiry pays the full origin cost
- Subsequent requests are fast again
- Cycle repeats every 5 minutes
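
The spike in the chart follows directly from the step timings listed earlier. As a rough sketch, the latency a single user sees under traditional caching could be modeled as below. The function name is hypothetical, and the constants are simply this section's example numbers, not measurements.

```javascript
// Illustrative model of traditional (blocking) cache expiry,
// using the example timings from this section.
const CACHE_HIT_MS = 50;            // assumed cache-hit latency
const NETWORK_TO_ORIGIN_MS = 300;   // cache -> origin
const ORIGIN_PROCESSING_MS = 200;   // application time at the origin
const NETWORK_FROM_ORIGIN_MS = 300; // origin -> cache

// Returns the latency (ms) for a request that arrives when the cached
// copy is `ageSeconds` old and the TTL is `maxAgeSeconds`.
function traditionalLatencyMs(ageSeconds, maxAgeSeconds) {
  const isFresh = ageSeconds < maxAgeSeconds;
  if (isFresh) {
    return CACHE_HIT_MS;
  }
  // Stale: the request blocks on the full origin round trip.
  return NETWORK_TO_ORIGIN_MS + ORIGIN_PROCESSING_MS + NETWORK_FROM_ORIGIN_MS;
}
```

With max-age=300, a request at age 120 is served in 50ms, while the first request at age 300 or later pays 300 + 200 + 300 = 800ms.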

The Thundering Herd Problem:

The situation worsens under high traffic. When cache expires:

100 users request the page in the same millisecond
Without protection, all 100 requests go to origin ("thundering herd")
Origin may collapse under sudden load
Even with request coalescing, that first request blocks all 100 users
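
Request coalescing can be sketched as a small wrapper that lets concurrent callers share one in-flight origin fetch. This is a simplified, single-process illustration: `coalesce` and `fetchFromOrigin` are hypothetical names, and real CDNs implement the equivalent inside the cache layer.

```javascript
// Wraps an async fetch function so that concurrent calls for the same key
// share a single in-flight promise (hypothetical helper, not a CDN API).
function coalesce(fetchFromOrigin) {
  const inFlight = new Map(); // key -> Promise of the ongoing fetch

  return function coalescedFetch(key) {
    if (!inFlight.has(key)) {
      const pending = Promise.resolve()
        .then(() => fetchFromOrigin(key))
        // Clear the slot whether the fetch succeeded or failed,
        // so the next caller can trigger a new fetch.
        .finally(() => inFlight.delete(key));
      inFlight.set(key, pending);
    }
    return inFlight.get(key);
  };
}
```

The origin sees one request instead of 100, but every caller still waits for that one fetch to finish. That remaining wait is the part SWR removes.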

User Experience Impact:

Latency spikes are particularly harmful because:

Users perceive inconsistency (sometimes fast, sometimes slow)
The spikes often hit during peak traffic (when cache expires under load)
Mobile users on slow connections experience even longer delays
Conversion rates and engagement suffer from unpredictable performance

The Cache Cliff

Without SWR, caches create a 'cliff' at expiration where performance falls off dramatically. For content with regular update patterns (e.g., blog posts updated at 9 AM), this cliff can affect many users simultaneously. SWR smooths out these cliffs into gentle slopes.

How Stale-While-Revalidate Works

Stale-while-revalidate introduces a grace period after the cache's TTL expires during which stale content can still be served while a background refresh occurs.

CACHE-CONTROL: max-age=300, stale-while-revalidate=3600
 
TIMELINE:
═══════════════════════════════════════════════════════════════════════
 
T = 0 seconds
├─ Response cached
├─ "Fresh" period begins
└─ Any request → Instant cache hit (fresh content)
 
─────────────────────────────────────────────────────────────────────
 
T = 0 to T = 300 seconds (FRESH PERIOD)
├─ All requests served instantly from cache
├─ No origin requests needed
└─ No revalidation
 
─────────────────────────────────────────────────────────────────────
 
T = 300 seconds (TTL EXPIRES)
├─ Content now "stale"
├─ SWR window begins
└─ Content can still be served, but with background refresh
 
─────────────────────────────────────────────────────────────────────
 
T = 300 to T = 3900 seconds (STALE-WHILE-REVALIDATE WINDOW)
│
├─ Request arrives at T = 350s:
│   ├─ Cache immediately returns stale content (instant response!)
│   ├─ Cache triggers background request to origin
│   ├─ Origin returns fresh content
│   └─ Cache updates stored content for future requests
│
├─ Request arrives at T = 355s:
│   └─ Cache returns fresh content (updated by previous background refresh)
│
└─ If no requests during this window, content stays stale
 
─────────────────────────────────────────────────────────────────────
 
T = 3900 seconds (SWR WINDOW EXPIRES: 300 + 3600)
├─ Content too stale to serve
├─ Next request MUST wait for fresh content
└─ Synchronous revalidation required
 
═══════════════════════════════════════════════════════════════════════

Key Mechanics of SWR:

Immediate Response: When a request hits stale content within the SWR window, the cache responds immediately with the stale content. No waiting.
Background Refresh: Simultaneously (or immediately after responding), the cache initiates a fresh request to the origin. This happens asynchronously.
Cache Update: When the origin responds, the cache updates its stored content. Subsequent requests receive this fresh content.
Window Expiration: The SWR window is finite. After max-age + stale-while-revalidate seconds, the cache must perform synchronous revalidation.
Single Flight: Smart implementations ensure only one background revalidation per stale object, even if multiple requests arrive during the stale period.
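
The five mechanics above can be sketched as a small in-memory cache. This is a minimal illustration under stated assumptions, not how any particular CDN implements SWR: the class name `SwrCache`, its options, and the `servedFrom` labels are all hypothetical, and the clock is injectable only to make the behavior easy to test.

```javascript
// Minimal in-memory SWR cache sketch. Names and structure are illustrative only.
class SwrCache {
  // maxAge and swr are in seconds; fetchOrigin(key) returns a Promise of the body;
  // now() returns the current time in seconds.
  constructor({ maxAge, swr, fetchOrigin, now = () => Date.now() / 1000 }) {
    this.maxAge = maxAge;
    this.swr = swr;
    this.fetchOrigin = fetchOrigin;
    this.now = now;
    this.entries = new Map();  // key -> { value, storedAt }
    this.inFlight = new Map(); // key -> Promise of an ongoing refresh
  }

  // Single flight: at most one origin fetch per key at a time.
  refresh(key) {
    if (!this.inFlight.has(key)) {
      const pending = Promise.resolve()
        .then(() => this.fetchOrigin(key))
        .then((value) => {
          // Cache update: later requests see the refreshed value.
          this.entries.set(key, { value, storedAt: this.now() });
          return value;
        })
        .finally(() => this.inFlight.delete(key));
      this.inFlight.set(key, pending);
    }
    return this.inFlight.get(key);
  }

  async get(key) {
    const entry = this.entries.get(key);
    if (entry) {
      const age = this.now() - entry.storedAt;
      if (age < this.maxAge) {
        return { value: entry.value, servedFrom: 'fresh-cache' };
      }
      if (age < this.maxAge + this.swr) {
        // Immediate response plus background refresh.
        // If the refresh fails, the stale copy simply stays in place.
        this.refresh(key).catch(() => {});
        return { value: entry.value, servedFrom: 'stale-cache' };
      }
    }
    // No entry, or the SWR window has passed: block on the origin.
    return { value: await this.refresh(key), servedFrom: 'origin' };
  }
}
```

The stale branch returns before the refresh settles, which is exactly the "one request behind" behavior: the caller that triggers the refresh gets the old value, and the next caller gets the new one.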

The 'One Request Behind' Trade-off

SWR has a fundamental trade-off: the user who triggers the background refresh receives stale content. They're 'one request behind' the fresh content. For most applications, this trade-off is excellent—near-instant latency for a small freshness delay. But for content where being even slightly behind is unacceptable, SWR may not be appropriate.

Configuring Stale-While-Revalidate

Stale-while-revalidate is configured via the Cache-Control header. The directive takes a single value: the number of seconds after TTL expiration during which stale content may be served.

# BASIC SWR CONFIGURATION
HTTP/1.1 200 OK
Cache-Control: max-age=60, stale-while-revalidate=300
# Fresh for 1 minute, then stale+refresh for 5 minutes
 
# AGGRESSIVE SWR (Availability-focused)
HTTP/1.1 200 OK
Cache-Control: max-age=300, stale-while-revalidate=86400
# Fresh for 5 minutes, then stale+refresh for 24 hours
# Almost never blocks on origin
 
# CONSERVATIVE SWR (Freshness-focused)
HTTP/1.1 200 OK
Cache-Control: max-age=3600, stale-while-revalidate=60
# Fresh for 1 hour, only 1 minute of SWR grace period
# Quickly forces synchronous refresh if content is old
 
# COMBINED WITH OTHER DIRECTIVES
HTTP/1.1 200 OK
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=3600, stale-if-error=86400
# Browser: fresh 1 min (SWR applies only in browsers that implement it)
# CDN: fresh 5 min, SWR for 1 hour, error fallback for 24 hours
 
# SPLIT BROWSER/CDN FRESHNESS WITH SWR
# s-maxage overrides max-age for shared caches (CDNs); browsers ignore it
HTTP/1.1 200 OK
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=3600
 
# Browser: uses max-age=60; honors SWR only if the browser implements it
# CDN: uses s-maxage=300; SWR for 1 hour where the CDN supports it
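
To check what a given header means for each cache, it can help to parse it and compute the windows explicitly. The sketch below is a simplified parser written for this page: `parseCacheControl` and `cacheWindows` are hypothetical helpers, and the comma split does not handle quoted values that themselves contain commas.

```javascript
// Parses a Cache-Control value into a map of lowercase directive names.
// Valueless directives (e.g. "public") map to true; integer values become numbers.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (!rawName) continue;
    const name = rawName.trim().toLowerCase();
    if (rawValue === undefined) {
      directives[name] = true;
    } else {
      const value = rawValue.trim().replace(/^"|"$/g, '');
      directives[name] = /^\d+$/.test(value) ? Number(value) : value;
    }
  }
  return directives;
}

// Returns, in seconds of response age, when each window ends.
// Shared caches (CDNs) prefer s-maxage; private caches (browsers) use max-age.
function cacheWindows(headerValue, { shared }) {
  const d = parseCacheControl(headerValue);
  const num = (v) => (typeof v === 'number' ? v : 0);
  const fresh = shared && typeof d['s-maxage'] === 'number'
    ? d['s-maxage']
    : num(d['max-age']);
  return {
    freshUntil: fresh,
    swrUntil: fresh + num(d['stale-while-revalidate']),
    staleIfErrorUntil: fresh + num(d['stale-if-error']),
  };
}
```

For the combined header above, a CDN (`shared: true`) gets a fresh window of 300 seconds and an SWR window ending at 3900 seconds, while a browser (`shared: false`) gets 60 and 3660.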

SWR Support by Platform
Platform	SWR Support	Notes
Cloudflare	✓ Full support	Native SWR, configurable in Cache Rules
AWS CloudFront	✓ Full support	Enabled via origin response headers
Fastly	✓ Full support	Configurable in VCL
Akamai	✓ Full support	Property Manager configuration
Chrome/Edge	✓ Supported	Service Worker and HTTP cache
Firefox	✓ Supported	Since Firefox 68
Safari	⚠ Partial	Limited support, check version
nginx	✓ Configurable	Via proxy_cache_use_stale
Varnish	✓ Configurable	Via VCL beresp.grace

Start with Conservative Values

If you're new to SWR, start with a short SWR window (e.g., stale-while-revalidate=60) and monitor behavior. Once you're confident in your cache invalidation strategy, gradually increase the window. Very long SWR windows require robust purge mechanisms to update critical content.

SWR vs Stale-If-Error

Stale-while-revalidate has a closely related companion: stale-if-error. While SWR handles normal expiration, stale-if-error handles origin failures. Understanding both—and how they work together—is essential for a robust caching strategy.

stale-while-revalidate

•Activated when: Cache entry is stale (TTL expired)
•Origin status: Origin is healthy and reachable
•Behavior: Serve stale, refresh in background
•Goal: Eliminate latency spikes at expiration
•User sees: Instant response (slightly stale)

stale-if-error

•Activated when: Origin returns error or is unreachable
•Origin status: Origin is down or returning 5xx
•Behavior: Serve stale instead of propagating error
•Goal: Maintain availability during outages
•User sees: Content instead of error page

CONFIGURATION: max-age=300, stale-while-revalidate=3600, stale-if-error=86400
 
═══════════════════════════════════════════════════════════════════════
SCENARIO 1: Normal operation (origin healthy)
═══════════════════════════════════════════════════════════════════════
 
T=0: Content cached
T=300: TTL expires, content stale
T=350: Request arrives
       ├─ SWR applies: Serve stale instantly
       ├─ Background refresh succeeds
       └─ Content updated
T=355: Next request gets fresh content ✓
 
═══════════════════════════════════════════════════════════════════════
SCENARIO 2: Origin failure during SWR window
═══════════════════════════════════════════════════════════════════════
 
T=0: Content cached
T=300: TTL expires, content stale
T=350: Request arrives, origin is DOWN
       ├─ SWR applies: Serve stale instantly
       ├─ Background refresh fails (origin 503)
       ├─ stale-if-error applies: Keep serving stale
       └─ Retry origin on next request
T=500: Origin recovers
       └─ Next request refreshes successfully ✓
 
═══════════════════════════════════════════════════════════════════════
SCENARIO 3: Origin failure AFTER SWR window expires
═══════════════════════════════════════════════════════════════════════
 
T=0: Content cached
T=300: TTL expires
T=3900: SWR window expires (300 + 3600)
        Content is now "too stale for SWR"
T=4000: Request arrives, origin is DOWN
        ├─ SWR does NOT apply (window expired)
        ├─ Would need synchronous refresh, but origin down
        ├─ stale-if-error DOES apply (86400 seconds)
        └─ Serve stale content ✓
        
T=86700: stale-if-error window expires (300 + 86400)
         Origin still down
         └─ Must return an error (e.g., 504 Gateway Timeout) ✗
 
═══════════════════════════════════════════════════════════════════════
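
The three scenarios reduce to a short decision procedure. The sketch below is one way to express it, assuming the cache knows the object's age and whether the origin is currently reachable. The function name and the returned labels are hypothetical, and real caches usually discover an origin failure only by attempting the fetch.

```javascript
// Decides how a cache could respond, given the object's age (seconds since it
// was fetched), the configured windows, and whether the origin is reachable.
function decideStaleHandling({ age, maxAge, swr, staleIfError, originHealthy }) {
  if (age < maxAge) {
    return 'serve-fresh';
  }
  if (age < maxAge + swr) {
    // Inside the SWR window the user never waits, even if the
    // background refresh later fails.
    return 'serve-stale-and-revalidate-in-background';
  }
  if (originHealthy) {
    return 'revalidate-synchronously';
  }
  if (age < maxAge + staleIfError) {
    return 'serve-stale-on-error';
  }
  return 'return-gateway-error';
}
```

Note the ordering: stale-if-error only matters once a request would otherwise have to wait on, or fail against, the origin.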

Always Pair SWR with Stale-If-Error

For production systems, always configure both stale-while-revalidate and stale-if-error. SWR handles performance; stale-if-error handles availability. Together, they provide a robust safety net that keeps your site responsive even during origin issues.

Edge Cases and Gotchas

While stale-while-revalidate is powerful, several edge cases and implementation details can cause unexpected behavior. Understanding these prevents production surprises.

Common SWR Gotchas

•Request coalescing varies by CDN — Some CDNs deduplicate background refreshes, others don't. Without deduplication, multiple stale requests can each trigger a separate origin request.
•Browser SWR support is inconsistent — While modern browsers support SWR, behavior varies. Service Workers provide more predictable SWR behavior than native HTTP cache.
•No guarantee of background refresh completion — If the cache shuts down or the connection drops during background refresh, the content may remain stale until the next request.
•Age header accumulation — Content that starts with max-age=300 but is stored in cache for 200 seconds only has 100 seconds of freshness remaining. SWR window starts from that reduced freshness.
•POST/PUT/DELETE don't use SWR — SWR only applies to cacheable responses (typically GET/HEAD). Mutations still go directly to origin.

SCENARIO: Multi-tier CDN with Age accumulation
 
Origin Response:
  Cache-Control: max-age=300, stale-while-revalidate=3600
  
Origin → Shield (cached for 100s):
  Age: 100
  Effective freshness remaining: 200s
  
Shield → Edge (cached for 50s):
  Age: 150
  Effective freshness remaining: 150s
  
Edge → Browser (cached for 30s):
  Age: 180
  Effective freshness remaining: 120s
 
RESULT:
User's browser sees content that's 180 seconds old with only 120 seconds
of freshness remaining. SWR window starts after those 120 seconds.
 
EFFECTIVE TIMELINE (from user's perspective):
T=0: User receives response with Age: 180
T=120: Content becomes stale (not T=300!)
T=120 to T=3720: SWR window (stale-while-revalidate applies)
 
MITIGATION:
- Account for Age accumulation when setting TTLs
- Consider longer max-age to ensure meaningful freshness at edges
- Keep TTLs short at the origin shield so objects reach the edges with a low Age
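
The arithmetic in this scenario is small enough to capture in a helper. The sketch below (with a hypothetical function name) computes, from the Age a cache receives, how much freshness is left and when the SWR window ends, both measured from the moment the response is received.

```javascript
// Given the directives and the Age header value on arrival, returns how many
// seconds of freshness remain and when the SWR window ends, both relative to
// the moment the response was received.
function freshnessFromAge({ maxAge, swr, age }) {
  const freshRemaining = Math.max(0, maxAge - age);
  const swrWindowEnd = Math.max(0, maxAge + swr - age);
  return { freshRemaining, swrWindowEnd };
}
```

For `maxAge: 300, swr: 3600`, an Age of 100, 150, and 180 leaves 200, 150, and 120 seconds of freshness respectively, and at Age 180 the SWR window ends 3720 seconds after receipt.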

Implementation Variations

•Cloudflare: SWR triggers on first stale request. Respects directive fully. Background refresh is asynchronous.
•CloudFront: Respects SWR from origin. No CDN-level override; must use origin headers.
•Fastly: Full SWR support via VCL. Can configure custom stale-serving behavior.
•nginx: Uses proxy_cache_use_stale updating together with proxy_cache_background_update on for similar behavior. Syntax differs from the HTTP standard.
•Varnish: Uses beresp.grace for stale serving. Conceptually similar but differently configured.

Test SWR Behavior

CDN SWR implementations vary. Before relying on SWR in production, test your specific CDN's behavior: Does it actually serve stale and refresh in the background? Does it deduplicate background requests? What happens if the origin is slow vs. down? Document the specific behavior you observe.

SWR Best Practices

Applying stale-while-revalidate effectively requires understanding both the technical mechanics and the operational implications.

SWR Configuration Best Practices

•Match SWR to acceptable staleness — If content can be 1 hour out of date without issue, set stale-while-revalidate=3600. If 10 minutes is the maximum, use stale-while-revalidate=600.
•Use longer SWR for infrequently accessed content — Long-tail content may not be requested frequently enough to trigger regular refreshes. Longer SWR ensures it's served even when old.
•Combine with short max-age for predictable refreshes — max-age=60, stale-while-revalidate=3600 refreshes every minute during traffic, but allows stale for quiet periods.
•Set stale-if-error at least as long as SWR — During origin outages, you want the same stale-serving capability. Usually stale-if-error should be longer.
•Monitor SWR triggering — Track how often responses are served during the SWR window vs. fresh. High SWR hit rates might indicate max-age is too short.
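
These rules can be encoded in a small header builder so they are applied consistently. This is a sketch under assumptions: `buildCacheControl` and its option names are hypothetical, and it simply sets the SWR window to the acceptable staleness and raises stale-if-error to at least that value.

```javascript
// Builds a Cache-Control value from freshness and staleness budgets (seconds).
// stale-if-error is never shorter than the SWR window.
function buildCacheControl({ browserFresh, cdnFresh, acceptableStaleness, errorFallback }) {
  const inputs = { browserFresh, cdnFresh, acceptableStaleness, errorFallback };
  for (const [name, value] of Object.entries(inputs)) {
    if (!Number.isInteger(value) || value < 0) {
      throw new RangeError(`${name} must be a non-negative integer number of seconds`);
    }
  }
  const staleIfError = Math.max(errorFallback, acceptableStaleness);
  return [
    'public',
    `max-age=${browserFresh}`,
    `s-maxage=${cdnFresh}`,
    `stale-while-revalidate=${acceptableStaleness}`,
    `stale-if-error=${staleIfError}`,
  ].join(', ');
}
```

For example, budgets of 60 seconds in the browser, 300 at the CDN, one hour of acceptable staleness, and a 10-minute error fallback produce `stale-if-error=3600`, because the fallback is raised to match the SWR window.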

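// Content update workflow with SWR
// Note: `database` and `cdnClient` are placeholders for your own data layer
// and CDN purge client; substitute your provider's purge API.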
 
// Step 1: Configure aggressive SWR for performance
// In origin response:
// Cache-Control: public, max-age=60, s-maxage=300, 
//                stale-while-revalidate=86400, stale-if-error=86400
 
// Step 2: When content changes, explicitly purge
async function updateProduct(productId, data) {
  // Update in database
  await database.products.update(productId, data);
  
  // Purge CDN cache for this product
  await cdnClient.purge([
    `/products/${productId}`,
    `/products/${productId}.json`,
    `/api/products/${productId}`,
  ]);
  
  // With SWR + purge strategy:
  // - Normal traffic: Users get instant responses (stale during refresh)
  // - After update: Purge immediately clears stale content
  // - First request after purge: Cache miss, fetches fresh from origin
  // - Subsequent requests: Cache hit with fresh content
  
  return { success: true };
}
 
// Step 3: For critical updates, warm the cache
async function updateAndWarmProduct(productId, data) {
  await database.products.update(productId, data);
  await cdnClient.purge([`/products/${productId}`]);
  
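  // Immediately request the page to warm the cache.
  // This reduces the chance that a user pays origin latency, but it only
  // warms the edge location that happens to serve this request.
  const response = await fetch(`https://cdn.example.com/products/${productId}`, {
    headers: { 'X-Cache-Warm': 'true' }
  });
  if (!response.ok) {
    console.warn(`Cache warm request returned status ${response.status}`);
  }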
  
  return { success: true };
}

SWR Enables Aggressive Caching

With proper SWR configuration, you can safely keep content at the CDN for hours or days (a short max-age plus a long stale-while-revalidate window) without leaving users stuck on badly outdated content. The SWR pattern ensures refreshes happen during normal traffic, and purge handles explicit updates. This keeps origin fetches off the user's critical path while maintaining freshness.

Measuring SWR Effectiveness

Implementing SWR is only valuable if you can measure its impact. Proper observability ensures SWR is working as expected and allows you to tune configurations.

Key SWR Metrics
Metric	What It Measures	Target	Alert Threshold
SWR Hit Rate	% of responses served from stale cache during revalidation	< 10% of total cache hits	If > 30%, max-age may be too short
Background Refresh Time	Time to complete background origin fetch	< 1 second	If > 3s, origin performance issue
Stale Serving Duration	How old the content was when served stale	< 1.5x max-age	If > 2x, investigate refresh failures
Origin Error Rate	% of background refreshes that fail	< 0.1%	If > 1%, origin stability issue
Cache Hit Ratio	Overall % of requests served from cache	> 95% for static content	If < 90%, check TTL/SWR configuration
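
Two of these metrics can be derived directly from edge logs. The sketch below assumes each log record carries a `status` field with the values `FRESH`, `STALE`, or `MISS`; that record shape and the function name are assumptions for illustration, and the thresholds are the ones from the table above.

```javascript
// Summarizes cache log records and flags the alert thresholds from the table.
// Each record is assumed to look like { status: 'FRESH' | 'STALE' | 'MISS' }.
function summarizeCacheLog(records) {
  const counts = { FRESH: 0, STALE: 0, MISS: 0 };
  for (const { status } of records) {
    if (Object.prototype.hasOwnProperty.call(counts, status)) {
      counts[status] += 1;
    }
  }
  const hits = counts.FRESH + counts.STALE;
  const total = hits + counts.MISS;

  // SWR hit rate: share of cache hits that were served stale.
  const swrHitRate = hits === 0 ? 0 : counts.STALE / hits;
  // Cache hit ratio: share of all requests served from cache.
  const cacheHitRatio = total === 0 ? 0 : hits / total;

  const alerts = [];
  if (swrHitRate > 0.30) {
    alerts.push('SWR hit rate above 30%: max-age may be too short');
  }
  if (total > 0 && cacheHitRatio < 0.90) {
    alerts.push('Cache hit ratio below 90%: check TTL/SWR configuration');
  }
  return { swrHitRate, cacheHitRatio, alerts };
}
```

Records with any other status are ignored, so the ratios reflect only requests the cache classified.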

# Response served from fresh cache
HTTP/1.1 200 OK
Age: 45
X-Cache: HIT
X-Cache-Status: FRESH
CF-Cache-Status: HIT    # Cloudflare
 
# Response served from stale cache (SWR in progress)
HTTP/1.1 200 OK
Age: 350                 # Exceeds max-age of 300
X-Cache: HIT
X-Cache-Status: STALE
X-SWR: REVALIDATING     # Custom header from edge logic
Warning: 110 - "Response is Stale"   # Obsoleted by RFC 9111; some caches still send it
 
# Response after background refresh completed
HTTP/1.1 200 OK
Age: 2                   # Freshly refreshed
X-Cache: HIT
X-Cache-Status: FRESH
X-SWR: REFRESHED        # Custom header indicating refresh happened
 
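# CLOUDFLARE-SPECIFIC CACHE STATUS VALUES:
# HIT - Served from cache (fresh)
# STALE - Served expired content from cache because the origin could not be reached
# UPDATING - Served expired content from cache while the origin refresh is in progress
# MISS - Not in cache, fetched from origin
# EXPIRED - Found in cache but expired, so it was fetched from origin
# REVALIDATED - Served from cache after a conditional request confirmed it was still valid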
 
# CUSTOM HEADERS FOR DEBUGGING (add via edge logic):
X-Cache-Age: 350         # How old the cached content is
X-Max-Age: 300           # Configured max-age
X-SWR-Window: 3600       # Configured stale-while-revalidate
X-SWR-Time-Remaining: 3550  # Seconds left in SWR window (300 + 3600 - 350)

The Age Header Is Your Friend

The Age header reveals how old cached content is. If you see Age values exceeding max-age (or s-maxage, for CDN responses), stale content is being served, which is what SWR looks like from the outside. Monitor these values to understand how often you're in SWR mode and whether your TTLs are appropriate for your traffic patterns.

Summary: Mastering Stale-While-Revalidate

Stale-while-revalidate is one of the most powerful patterns in CDN caching. It eliminates latency spikes, improves availability, and enables aggressive caching strategies.

Key Takeaways

•SWR solves the cache expiration latency spike — Users never wait for origin refresh during normal operation.
•SWR serves stale instantly, refreshes in background — The first request triggers async refresh; the user gets immediate response.
•SWR and stale-if-error complement each other — SWR for performance, stale-if-error for availability. Use both.
•Configure SWR based on acceptable staleness — Match the window to how long slightly-out-of-date content is acceptable.
•Combine SWR with purge for controlled updates — Long SWR windows are safe when you have reliable purge mechanisms.
•Monitor SWR behavior — Track Age headers, SWR hit rates, and background refresh times to ensure proper operation.

What's Next:

We've mastered the mechanics of getting content into cache and keeping it fresh. But what about measuring and improving overall cache performance? The final page of this module covers Cache Hit Ratio Optimization—the strategies and techniques for maximizing the percentage of requests served from cache.

Page Complete

You now deeply understand stale-while-revalidate—how it works, when to use it, how to configure it, and how to monitor its effectiveness. This pattern should be part of virtually every CDN caching strategy where sub-second latency matters.

4 / 5

Loading learning content...

System Design (HLD)CDN Caching Mechanics

CDN Caching Mechanics

LevelIntermediate

Duration75 mins

TopicCDN Caching Mechanics

4 / 5

Stale-While-Revalidate: Instant Responses, Fresh Content

The Latency Killer That Changed Web Performance

What You Will Learn

The Problem Stale-While-Revalidate Solves

To appreciate stale-while-revalidate, we must first understand the fundamental problem it addresses: cache expiration latency spikes.

The Traditional Cache Expiration Problem:

Consider a product page cached for 5 minutes. During those 5 minutes, every user gets instant sub-50ms responses. But the moment the cache expires:

User requests the page
Cache checks: "Is this fresh?" → No, TTL expired
Cache BLOCKS the user's request
Cache sends request to origin (300ms network latency)
Origin processes request (200ms application latency)
Origin returns response to cache (300ms network latency)
Cache stores the fresh response
Cache finally returns the response to the user

Result: That user waited 800ms instead of 50ms. And if this is a popular page, potentially hundreds of users could be blocked simultaneously during this refresh window.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
LATENCY OVER TIME (Traditional Caching with max-age=300):
 
Latency (ms)
│
800 │                    ×                    ×
    │                   /|                   /|
600 │                  / |                  / |
    │                 /  |                 /  |
400 │                /   |                /   |
    │               /    |               /    |
200 │              /     |              /     |
    │             /      |             /      |
 50 │────────────·       ·────────────·       ·────
    │                                              
    └───────────────────────────────────────────────→ Time
    0        5min      5:01min    10min    10:01min
 
PATTERN:
- Perfect latency (50ms) during cache-fresh period
- Latency spike (800ms) exactly at cache expiry
- First request after expiry pays the full origin cost
- Subsequent requests are fast again
- Cycle repeats every 5 minutes

The Thundering Herd Problem:

The situation worsens under high traffic. When cache expires:

100 users request the page in the same millisecond
Without protection, all 100 requests go to origin ("thundering herd")
Origin may collapse under sudden load
Even with request coalescing, that first request blocks all 100 users

User Experience Impact:

Latency spikes are particularly harmful because:

Users perceive inconsistency (sometimes fast, sometimes slow)
The spikes often hit during peak traffic (when cache expires under load)
Mobile users on slow connections experience even longer delays
Conversion rates and engagement suffer from unpredictable performance

The Cache Cliff

How Stale-While-Revalidate Works

Stale-while-revalidate introduces a grace period after the cache's TTL expires during which stale content can still be served while a background refresh occurs.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
CACHE-CONTROL: max-age=300, stale-while-revalidate=3600
 
TIMELINE:
═══════════════════════════════════════════════════════════════════════
 
T = 0 seconds
├─ Response cached
├─ "Fresh" period begins
└─ Any request → Instant cache hit (fresh content)
 
─────────────────────────────────────────────────────────────────────
 
T = 0 to T = 300 seconds (FRESH PERIOD)
├─ All requests served instantly from cache
├─ No origin requests needed
└─ No revalidation
 
─────────────────────────────────────────────────────────────────────
 
T = 300 seconds (TTL EXPIRES)
├─ Content now "stale"
├─ SWR window begins
└─ Content can still be served, but with background refresh
 
─────────────────────────────────────────────────────────────────────
 
T = 300 to T = 3900 seconds (STALE-WHILE-REVALIDATE WINDOW)
│
├─ Request arrives at T = 350s:
│   ├─ Cache immediately returns stale content (instant response!)
│   ├─ Cache triggers background request to origin
│   ├─ Origin returns fresh content
│   └─ Cache updates stored content for future requests
│
├─ Request arrives at T = 355s:
│   └─ Cache returns fresh content (updated by previous background refresh)
│
└─ If no requests during this window, content stays stale
 
─────────────────────────────────────────────────────────────────────
 
T = 3900 seconds (SWR WINDOW EXPIRES: 300 + 3600)
├─ Content too stale to serve
├─ Next request MUST wait for fresh content
└─ Synchronous revalidation required
 
═══════════════════════════════════════════════════════════════════════

Key Mechanics of SWR:

Immediate Response: When a request hits stale content within the SWR window, the cache responds immediately with the stale content. No waiting.
Background Refresh: Simultaneously (or immediately after responding), the cache initiates a fresh request to the origin. This happens asynchronously.
Cache Update: When the origin responds, the cache updates its stored content. Subsequent requests receive this fresh content.
Window Expiration: The SWR window is finite. After max-age + stale-while-revalidate seconds, the cache must perform synchronous revalidation.
Single Flight: Smart implementations ensure only one background revalidation per stale object, even if multiple requests arrive during the stale period.

The 'One Request Behind' Trade-off

Configuring Stale-While-Revalidate

Stale-while-revalidate is configured via the Cache-Control header. The directive takes a single value: the number of seconds after TTL expiration during which stale content may be served.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
# BASIC SWR CONFIGURATION
HTTP/1.1 200 OK
Cache-Control: max-age=60, stale-while-revalidate=300
# Fresh for 1 minute, then stale+refresh for 5 minutes
 
# AGGRESSIVE SWR (Availability-focused)
HTTP/1.1 200 OK
Cache-Control: max-age=300, stale-while-revalidate=86400
# Fresh for 5 minutes, then stale+refresh for 24 hours
# Almost never blocks on origin
 
# CONSERVATIVE SWR (Freshness-focused)
HTTP/1.1 200 OK
Cache-Control: max-age=3600, stale-while-revalidate=60
# Fresh for 1 hour, only 1 minute of SWR grace period
# Quickly forces synchronous refresh if content is old
 
# COMBINED WITH OTHER DIRECTIVES
HTTP/1.1 200 OK
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=3600, stale-if-error=86400
# Browser: fresh 1 min (no SWR in most browsers)
# CDN: fresh 5 min, SWR for 1 hour, error fallback for 24 hours
 
# SPLIT BROWSER/CDN WITH SWR
# Some CDNs support CDN-specific SWR while browsers don't
HTTP/1.1 200 OK
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=3600
 
# Browser: Uses max-age=60, may or may not support SWR
# CDN: Uses s-maxage=300, typically supports SWR

SWR Support by Platform
Platform	SWR Support	Notes
Cloudflare	✓ Full support	Native SWR, configurable in Cache Rules
AWS CloudFront	✓ Full support	Enabled via origin response headers
Fastly	✓ Full support	Configurable in VCL
Akamai	✓ Full support	Property Manager configuration
Chrome/Edge	✓ Supported	Service Worker and HTTP cache
Firefox	✓ Supported	Since Firefox 68
Safari	⚠ Partial	Limited support, check version
nginx	✓ Configurable	Via proxy_cache_use_stale
Varnish	✓ Configurable	Via VCL beresp.grace

Start with Conservative Values

SWR vs Stale-If-Error

stale-while-revalidate

•Activated when: Cache entry is stale (TTL expired)
•Origin status: Origin is healthy and reachable
•Behavior: Serve stale, refresh in background
•Goal: Eliminate latency spikes at expiration
•User sees: Instant response (slightly stale)

stale-if-error

•Activated when: Origin returns error or is unreachable
•Origin status: Origin is down or returning 5xx
•Behavior: Serve stale instead of propagating error
•Goal: Maintain availability during outages
•User sees: Content instead of error page

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
CONFIGURATION: max-age=300, stale-while-revalidate=3600, stale-if-error=86400
 
═══════════════════════════════════════════════════════════════════════
SCENARIO 1: Normal operation (origin healthy)
═══════════════════════════════════════════════════════════════════════
 
T=0: Content cached
T=300: TTL expires, content stale
T=350: Request arrives
       ├─ SWR applies: Serve stale instantly
       ├─ Background refresh succeeds
       └─ Content updated
T=355: Next request gets fresh content ✓
 
═══════════════════════════════════════════════════════════════════════
SCENARIO 2: Origin failure during SWR window
═══════════════════════════════════════════════════════════════════════
 
T=0: Content cached
T=300: TTL expires, content stale
T=350: Request arrives, origin is DOWN
       ├─ SWR applies: Serve stale instantly
       ├─ Background refresh fails (origin 503)
       ├─ stale-if-error applies: Keep serving stale
       └─ Retry origin on next request
T=500: Origin recovers
       └─ Next request refreshes successfully ✓
 
═══════════════════════════════════════════════════════════════════════
SCENARIO 3: Origin failure AFTER SWR window expires
═══════════════════════════════════════════════════════════════════════
 
T=0: Content cached
T=300: TTL expires
T=3900: SWR window expires (300 + 3600)
        Content is now "too stale for SWR"
T=4000: Request arrives, origin is DOWN
        ├─ SWR does NOT apply (window expired)
        ├─ Would need synchronous refresh, but origin down
        ├─ stale-if-error DOES apply (86400 seconds)
        └─ Serve stale content ✓
        
T=86700: stale-if-error window expires (300 + 86400)
         Origin still down
         └─ Must return an error (the origin's 5xx, or a generated 504 Gateway Timeout) ✗
 
═══════════════════════════════════════════════════════════════════════
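
The three scenarios above reduce to a small decision rule. The following is a minimal sketch of that rule, not any CDN's actual implementation; the function name, the action labels, and the `originHealthy` flag are illustrative assumptions, and real caches discover origin health only by attempting the fetch.

```javascript
// Decide how a cache answers a request for an object of a given age.
// All durations are in seconds. `originHealthy` is a simplification:
// a real cache only learns this by trying the origin fetch.
function decideCacheAction({ age, maxAge, swr, sie, originHealthy }) {
  // Fresh: within max-age, serve directly.
  if (age < maxAge) return "SERVE_FRESH";

  const staleFor = age - maxAge; // how long the object has been stale

  // Inside the SWR window: serve stale now, refresh in the background.
  // If that refresh fails, the stale copy simply stays in place.
  if (staleFor < swr) return "SERVE_STALE_AND_REVALIDATE";

  // Past the SWR window: a synchronous fetch is required.
  if (originHealthy) return "FETCH_SYNCHRONOUSLY";

  // Origin failed: stale-if-error may still allow the stale copy.
  if (staleFor < sie) return "SERVE_STALE_ON_ERROR";

  return "RETURN_ERROR";
}

const config = { maxAge: 300, swr: 3600, sie: 86400 };
console.log(decideCacheAction({ ...config, age: 350, originHealthy: true }));    // SERVE_STALE_AND_REVALIDATE
console.log(decideCacheAction({ ...config, age: 4000, originHealthy: false }));  // SERVE_STALE_ON_ERROR
console.log(decideCacheAction({ ...config, age: 90000, originHealthy: false })); // RETURN_ERROR
```

Both windows are measured from the moment the content becomes stale, which is why the stale-if-error window closes at 300 + 86400 = 86700 seconds of age.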

Always Pair SWR with Stale-If-Error

Edge Cases and Gotchas

While stale-while-revalidate is powerful, several edge cases and implementation details can cause unexpected behavior. Understanding these prevents production surprises.

Common SWR Gotchas

•Request coalescing varies by CDN — Some CDNs deduplicate background refreshes, others don't. Without deduplication, multiple stale requests can each trigger a separate origin request.
•Browser SWR support is inconsistent — While modern browsers support SWR, behavior varies. Service Workers provide more predictable SWR behavior than native HTTP cache.
•No guarantee of background refresh completion — If the cache shuts down or the connection drops during background refresh, the content may remain stale until the next request.
•Age header accumulation — Content that starts with max-age=300 but is stored in cache for 200 seconds only has 100 seconds of freshness remaining. SWR window starts from that reduced freshness.
•POST/PUT/DELETE don't use SWR — SWR only applies to cacheable responses (typically GET/HEAD). Mutations still go directly to origin.
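
The request coalescing gotcha can be illustrated with a small in-process sketch. It assumes a hypothetical edge function that you control; `inFlight`, `refreshOnce`, and the `fetchFromOrigin` callback are illustrative names, not part of any CDN API. Concurrent stale requests for the same key share one background refresh instead of each hitting the origin.

```javascript
// Cache key -> promise of the refresh currently in flight (hypothetical in-memory store).
const inFlight = new Map();

// Start a background refresh for `key` unless one is already running.
// `fetchFromOrigin` is a caller-supplied async function (an assumption of this sketch).
function refreshOnce(key, fetchFromOrigin) {
  if (inFlight.has(key)) {
    return inFlight.get(key); // join the existing refresh
  }
  const refresh = Promise.resolve()
    .then(() => fetchFromOrigin(key))
    .finally(() => inFlight.delete(key)); // allow a new refresh once this one settles
  inFlight.set(key, refresh);
  return refresh;
}
```

Callers that fire and forget the refresh should still attach a `.catch(...)` so a failed origin fetch does not surface as an unhandled rejection; the stale copy stays in place in that case.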

SCENARIO: Multi-tier CDN with Age accumulation
 
Origin Response:
  Cache-Control: max-age=300, stale-while-revalidate=3600
  
Origin → Shield (cached for 100s):
  Age: 100
  Effective freshness remaining: 200s
  
Shield → Edge (cached for 50s):
  Age: 150
  Effective freshness remaining: 150s
  
Edge → Browser (cached for 30s):
  Age: 180
  Effective freshness remaining: 120s
 
RESULT:
User's browser sees content that's 180 seconds old with only 120 seconds
of freshness remaining. SWR window starts after those 120 seconds.
 
EFFECTIVE TIMELINE (from user's perspective):
T=0: User receives response with Age: 180
T=120: Content becomes stale (not T=300!)
T=120 to T=3720: SWR window (stale-while-revalidate applies)
 
MITIGATION:
- Account for Age accumulation when setting TTLs
- Consider longer max-age to ensure meaningful freshness at edges
- Keep TTLs short at the origin shield so objects reach the edges with a low Age
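
The Age arithmetic in this scenario can be reproduced with a few lines. This is a sketch under the scenario's simplifying assumption that each tier just adds the time it held the object; the function and field names are illustrative.

```javascript
// For each cache tier, accumulate Age and compute the freshness left under max-age.
// `hops` lists how many seconds each tier held the object before passing it on.
function freshnessAtEachHop(maxAge, hops) {
  let age = 0;
  return hops.map(({ name, heldFor }) => {
    age += heldFor;
    return { name, age, freshnessRemaining: Math.max(0, maxAge - age) };
  });
}

const hops = [
  { name: "shield", heldFor: 100 },
  { name: "edge", heldFor: 50 },
  { name: "browser", heldFor: 30 },
];

console.log(freshnessAtEachHop(300, hops));
// shield:  age 100, freshnessRemaining 200
// edge:    age 150, freshnessRemaining 150
// browser: age 180, freshnessRemaining 120
```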

Implementation Variations

•Cloudflare: SWR triggers on first stale request. Respects directive fully. Background refresh is asynchronous.
•CloudFront: Respects SWR from origin. No CDN-level override; must use origin headers.
•Fastly: Full SWR support via VCL. Can configure custom stale-serving behavior.
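•nginx: Uses proxy_cache_use_stale updating, combined with proxy_cache_background_update on, for similar behavior. Since 1.11.10 it also honors stale-while-revalidate and stale-if-error in the upstream Cache-Control header.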
•Varnish: Uses beresp.grace for stale serving. Conceptually similar but differently configured.

Test SWR Behavior

SWR Best Practices

Applying stale-while-revalidate effectively requires understanding both the technical mechanics and the operational implications.

SWR Configuration Best Practices

•Match SWR to acceptable staleness — If content can be 1 hour out of date without issue, set stale-while-revalidate=3600. If 10 minutes is the maximum, use stale-while-revalidate=600.
•Use longer SWR for infrequently accessed content — Long-tail content may not be requested frequently enough to trigger regular refreshes. Longer SWR ensures it's served even when old.
•Combine with short max-age for predictable refreshes — max-age=60, stale-while-revalidate=3600 refreshes every minute during traffic, but allows stale for quiet periods.
•Set stale-if-error at least as long as SWR — During origin outages, you want the same stale-serving capability. Usually stale-if-error should be longer.
•Monitor SWR triggering — Track how often responses are served during the SWR window vs. fresh. High SWR hit rates might indicate max-age is too short.
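
The first four practices can be folded into a small helper that builds the header value. This is a sketch, assuming you express the policy as three numbers in seconds; `buildCacheControl` and its parameter names are illustrative, not a library API.

```javascript
// Build a Cache-Control value from a staleness policy (all values in seconds).
// - maxAge: how long the response counts as fresh
// - acceptableStaleness: becomes the stale-while-revalidate window
// - errorTolerance: desired stale-if-error window; never allowed below the SWR window
function buildCacheControl({ maxAge, acceptableStaleness, errorTolerance }) {
  for (const value of [maxAge, acceptableStaleness]) {
    if (!Number.isInteger(value) || value < 0) {
      throw new RangeError("durations must be non-negative integers");
    }
  }
  const swr = acceptableStaleness;
  const sie = Math.max(errorTolerance ?? swr, swr);
  return `public, max-age=${maxAge}, stale-while-revalidate=${swr}, stale-if-error=${sie}`;
}

console.log(buildCacheControl({ maxAge: 60, acceptableStaleness: 3600, errorTolerance: 86400 }));
// public, max-age=60, stale-while-revalidate=3600, stale-if-error=86400
```

Clamping stale-if-error to at least the SWR window encodes the rule that outage protection should never be shorter than the normal stale-serving window.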

// Content update workflow with SWR
// Note: `database` and `cdnClient` stand in for your application's data layer
// and your CDN provider's purge client.

// Step 1: Configure aggressive SWR for performance
// In origin response:
// Cache-Control: public, max-age=60, s-maxage=300,
//                stale-while-revalidate=86400, stale-if-error=86400

// All cached URLs that represent a single product
function productPaths(productId) {
  return [
    `/products/${productId}`,
    `/products/${productId}.json`,
    `/api/products/${productId}`,
  ];
}

// Step 2: When content changes, explicitly purge
async function updateProduct(productId, data) {
  // Update the database first so the next origin fetch sees the new data
  await database.products.update(productId, data);

  // Purge CDN cache for this product
  await cdnClient.purge(productPaths(productId));

  // With SWR + purge strategy:
  // - Normal traffic: Users get instant responses (stale during refresh)
  // - After update: Purge immediately clears stale content
  // - First request after purge: Cache miss, fetches fresh from origin
  // - Subsequent requests: Cache hit with fresh content

  return { success: true };
}

// Step 3: For critical updates, warm the cache
async function updateAndWarmProduct(productId, data) {
  await updateProduct(productId, data);

  // Immediately request the page so the first real user after the purge
  // does not pay the origin latency. This warms only the edge location
  // that happens to serve this request.
  const response = await fetch(`https://cdn.example.com/products/${productId}`, {
    headers: { 'X-Cache-Warm': 'true' }
  });

  // A failed warm-up is not fatal: the next user request fills the cache
  return { success: true, warmed: response.ok };
}

SWR Enables Aggressive Caching

Measuring SWR Effectiveness

Implementing SWR is only valuable if you can measure its impact. Proper observability ensures SWR is working as expected and allows you to tune configurations.

Key SWR Metrics
Metric	What It Measures	Target	Alert Threshold
SWR Hit Rate	% of responses served from stale cache during revalidation	< 10% of total cache hits	If > 30%, max-age may be too short
Background Refresh Time	Time to complete background origin fetch	< 1 second	If > 3s, origin performance issue
Stale Serving Duration	How old the content was when served stale	< 1.5x max-age	If > 2x, investigate refresh failures
Origin Error Rate	% of background refreshes that fail	< 0.1%	If > 1%, origin stability issue
Cache Hit Ratio	Overall % of requests served from cache	> 95% for static	If < 90%, check TTL/SWR configuration
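
Two of these metrics can be derived directly from edge logs. The sketch below assumes each log record carries a `status` of `FRESH`, `STALE`, or `MISS`; that record shape and the function name are assumptions for illustration, so adapt them to whatever your CDN actually exports. The thresholds are the alert values from the table.

```javascript
// Summarize cache log records into SWR hit rate and cache hit ratio,
// and flag the alert thresholds from the metrics table.
function summarizeCacheLog(records) {
  const counts = { FRESH: 0, STALE: 0, MISS: 0 };
  for (const record of records) {
    if (Object.prototype.hasOwnProperty.call(counts, record.status)) {
      counts[record.status] += 1;
    }
  }

  const hits = counts.FRESH + counts.STALE;
  const total = hits + counts.MISS;
  const swrHitRate = hits === 0 ? 0 : counts.STALE / hits; // share of cache hits served stale
  const cacheHitRatio = total === 0 ? 0 : hits / total;    // share of all requests served from cache

  const alerts = [];
  if (swrHitRate > 0.3) alerts.push("SWR hit rate above 30%: max-age may be too short");
  if (cacheHitRatio < 0.9) alerts.push("Cache hit ratio below 90%: check TTL/SWR configuration");

  return { swrHitRate, cacheHitRatio, alerts };
}
```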

# Response served from fresh cache
HTTP/1.1 200 OK
Age: 45
X-Cache: HIT
X-Cache-Status: FRESH
CF-Cache-Status: HIT    # Cloudflare
 
# Response served from stale cache (SWR in progress)
HTTP/1.1 200 OK
Age: 350                 # Exceeds max-age of 300
X-Cache: HIT
X-Cache-Status: STALE
X-SWR: REVALIDATING     # Custom header from edge logic
Warning: 110 - "Response is Stale"   # Obsoleted by RFC 9111; many caches no longer send it
 
# Response after background refresh completed
HTTP/1.1 200 OK
Age: 2                   # Freshly refreshed
X-Cache: HIT
X-Cache-Status: FRESH
X-SWR: REFRESHED        # Custom header indicating refresh happened
 
# CLOUDFLARE-SPECIFIC CACHE STATUS VALUES (CF-Cache-Status):
# HIT - Served from cache (fresh)
# UPDATING - Served stale from cache while the origin copy is being refreshed
# STALE - Served stale from cache because the origin could not be reached
# MISS - Not in cache, fetched from origin
# EXPIRED - Found in cache but expired; served from origin
# REVALIDATED - Stale in cache, revalidated with a conditional request (304)
 
# CUSTOM HEADERS FOR DEBUGGING (add via edge logic):
X-Cache-Age: 350         # How old the cached content is
X-Max-Age: 300           # Configured max-age
X-SWR-Window: 3600       # Configured stale-while-revalidate
X-SWR-Time-Remaining: 3550  # Seconds left in SWR window (300 + 3600 - 350)
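
The debugging values above can be derived from the standard `Cache-Control` and `Age` headers alone. The following is a sketch with a deliberately simplified parser (it ignores quoted-string arguments); the function names and the lowercase-keyed `headers` object are assumptions for illustration.

```javascript
// Parse a Cache-Control value into { directive: number | true }.
// Simplified: does not handle quoted-string arguments.
function parseCacheControl(value) {
  const directives = {};
  for (const part of value.split(",")) {
    const [rawName, rawArg] = part.trim().split("=");
    if (!rawName) continue;
    directives[rawName.toLowerCase()] = rawArg === undefined ? true : Number(rawArg);
  }
  return directives;
}

// Classify a response as fresh or stale and compute the time left in the SWR window.
// `headers` is a plain object with lowercase header names.
function describeStaleness(headers) {
  const asSeconds = (v) => (typeof v === "number" && !Number.isNaN(v) ? v : 0);
  const cc = parseCacheControl(headers["cache-control"] ?? "");
  const age = asSeconds(Number(headers["age"] ?? 0));
  const maxAge = asSeconds(cc["max-age"]);
  const swr = asSeconds(cc["stale-while-revalidate"]);

  let state = "STALE_BEYOND_SWR";
  if (age < maxAge) state = "FRESH";
  else if (age - maxAge < swr) state = "STALE_WITHIN_SWR";

  return { state, age, swrTimeRemaining: Math.max(0, maxAge + swr - age) };
}

console.log(describeStaleness({
  "cache-control": "max-age=300, stale-while-revalidate=3600",
  age: "350",
}));
// { state: 'STALE_WITHIN_SWR', age: 350, swrTimeRemaining: 3550 }
```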

The Age Header Is Your Friend

Summary: Mastering Stale-While-Revalidate

Stale-while-revalidate is one of the most powerful patterns in CDN caching. It eliminates latency spikes, improves availability, and enables aggressive caching strategies.

Key Takeaways

•SWR solves the cache expiration latency spike — Users never wait for origin refresh during normal operation.
•SWR serves stale instantly, refreshes in background — The first request triggers async refresh; the user gets immediate response.
•SWR and stale-if-error complement each other — SWR for performance, stale-if-error for availability. Use both.
•Configure SWR based on acceptable staleness — Match the window to how long slightly-out-of-date content is acceptable.
•Combine SWR with purge for controlled updates — Long SWR windows are safe when you have reliable purge mechanisms.
•Monitor SWR behavior — Track Age headers, SWR hit rates, and background refresh times to ensure proper operation.

What's Next:
