While browser caching eliminates network requests for returning visitors, it does nothing for first-time visitors or cache misses. When users must fetch content from your servers, physical distance becomes the enemy. Light travels at approximately 300,000 kilometers per second, the absolute speed limit of the universe. A request from Tokyo to a server in Virginia must travel roughly 11,000 kilometers each way, adding more than 70 milliseconds of round-trip time for light alone; signals in optical fiber move at only about two-thirds of that speed, and routing delays, processing time, and network congestion add more on top.
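The physics here is easy to sanity-check. The sketch below (illustrative figures, not measurements) computes the round-trip floor for the Tokyo-to-Virginia example:

```javascript
// Back-of-the-envelope propagation delay for a Tokyo -> Virginia request.
// Distances and speeds are rough illustrative figures, not measurements.
const C_VACUUM_KM_S = 300000;                 // speed of light in a vacuum
const C_FIBER_KM_S = C_VACUUM_KM_S * 2 / 3;   // light in fiber travels ~2/3 c

function roundTripMs(distanceKm, speedKmS) {
  return (2 * distanceKm / speedKmS) * 1000;
}

const tokyoToVirginiaKm = 11000;
console.log(roundTripMs(tokyoToVirginiaKm, C_VACUUM_KM_S).toFixed(0)); // ~73 ms: the hard physical floor
console.log(roundTripMs(tokyoToVirginiaKm, C_FIBER_KM_S).toFixed(0));  // ~110 ms in fiber, before routing and processing
```

No server-side optimization can get under this floor; only moving the content closer can.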
Of course, we cannot change physics. But what we can do is change geography—by placing copies of content closer to users. This is the fundamental insight behind Content Delivery Networks (CDNs), and it remains one of the most impactful performance optimizations available to any web-scale system.
This page covers CDN architecture and how edge caching works, cache key design and the Vary header, TTL strategies and freshness management, cache invalidation patterns, origin shield configuration, and production-grade CDN configuration. By the end, you'll be able to design and implement CDN caching strategies that deliver sub-100ms response times globally.
A Content Delivery Network consists of a globally distributed network of proxy servers strategically positioned to serve content from locations geographically close to end users. Understanding CDN architecture is essential for effective caching strategy design.
Core CDN Components:
- Edge PoPs (Points of Presence): Cache servers deployed in many cities worldwide, terminating user connections close to where requests originate
- Origin shield: An optional regional cache tier that sits between the edges and your origin (covered in detail later on this page)
- Origin server: Your own infrastructure, the source of truth the CDN pulls from on cache misses
The Request Flow Through a CDN:
```
User (Tokyo) → DNS Resolution → Edge PoP (Tokyo) → Cache Hit?
                                                       |
                                      ┌────────────────┴────────────────┐
                                 YES  │                                 │  NO
                                      ↓                                 ↓
                              Serve from Cache             Origin Shield (Singapore)?
                                 (< 20ms)                               |
                                                         ┌──────────────┴──────────────┐
                                              Cache Hit  │                             │  Cache Miss
                                                         ↓                             ↓
                                                Serve from Shield           Fetch from Origin (US)
                                                    (+ 50ms)               + Cache at Shield + Edge
                                                                                  (+ 150ms)
```
Latency Impact by Cache Hit Location:
| Cache Location | Typical Latency | Notes |
|---|---|---|
| Browser cache | < 5ms | No network at all |
| Edge PoP (same city) | 5-30ms | Single network hop |
| Edge PoP (same region) | 20-50ms | Few network hops |
| Origin shield | 50-100ms | Regional backbone |
| Origin server | 100-400ms+ | Cross-continental |
The number of edge locations directly impacts user experience. Cloudflare operates 300+ PoPs, AWS CloudFront 450+, and Akamai 4,000+. More edge locations mean more users are "close" to a cache, reducing latency. When evaluating CDNs, consider their PoP density in your key markets.
The cache key determines what constitutes a unique cacheable object. By default, CDNs use the full request URL as the cache key. However, real-world applications often need to serve different content for the same URL based on request characteristics—compression capabilities, language preferences, device type, or authentication state.
The Default Cache Key:
```
https://example.com/api/products?category=electronics&sort=price
└──────────────────────────────────────────────────────────────┘
                     Default Cache Key
```
This works well for truly static content but breaks down when:
- The same URL must serve compressed and uncompressed variants (Accept-Encoding)
- Content is localized per language (Accept-Language)
- Responses differ by device type (mobile, tablet, desktop)
- Authenticated and anonymous users see different content at the same URL
The Vary Header:
The HTTP Vary header instructs caches to treat requests with different values for the specified headers as different cache entries. It effectively adds request header values to the cache key.
```
HTTP/1.1 200 OK
Content-Type: text/html
Cache-Control: public, max-age=3600
Vary: Accept-Encoding, Accept-Language
Content-Encoding: gzip

<compressed content>
```
With `Vary: Accept-Encoding, Accept-Language`, the effective cache key becomes:

```
https://example.com/page.html + Accept-Encoding=gzip + Accept-Language=en-US
```

The same URL with `Accept-Language: fr-FR` would be a separate cache entry.
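A sketch of the mechanics: a shared cache can be modeled as deriving a secondary key from the request headers named in Vary. The function below is purely illustrative, not any CDN's actual key format:

```javascript
// Illustrative model of how a shared cache folds Vary'd headers into
// the cache key. Real caches store this as a secondary key per URL.
function effectiveCacheKey(url, varyHeader, requestHeaders) {
  const varied = varyHeader
    .split(',')
    .map(h => h.trim().toLowerCase())
    .sort() // header order in Vary must not produce distinct keys
    .map(h => `${h}=${requestHeaders[h] ?? ''}`);
  return [url, ...varied].join(' + ');
}

const key = effectiveCacheKey(
  'https://example.com/page.html',
  'Accept-Encoding, Accept-Language',
  { 'accept-encoding': 'gzip', 'accept-language': 'en-US' }
);
console.log(key);
// https://example.com/page.html + accept-encoding=gzip + accept-language=en-US
```

Note how a missing header still contributes an (empty) component: a request without Accept-Language is its own variant.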
| Vary Header | Use Case | Cache Impact | Considerations |
|---|---|---|---|
| Accept-Encoding | Serve compressed content to capable browsers | Low (2-3 variants) | Standard practice; always include |
| Accept-Language | Serve localized content | Medium (number of languages) | Consider URL-based i18n instead |
| Cookie | Any cookie variation | Very High (can explode) | Avoid if possible; breaks caching |
| Authorization | Authenticated content | Extremely High | Usually means no-cache instead |
| User-Agent | Device-specific content | Catastrophic (thousands of variants) | Never use; use device detection logic |
| Accept | Content negotiation (HTML vs JSON) | Low (few content types) | Consider separate URLs instead |
```javascript
// Cloudflare Worker - Custom cache key
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event));
});

async function handleRequest(event) {
  const request = event.request;
  const url = new URL(request.url);

  // Create custom cache key
  const cacheKeyComponents = [
    url.origin + url.pathname,  // Base URL without query params
    getSortedQueryString(url),  // Normalized query string
    getDeviceType(request),     // Device classification
    getGeoRegion(request),      // Geographic region
  ];
  const cacheKey = cacheKeyComponents.join('::');

  // Check cache with custom key
  const cache = caches.default;
  const cacheRequest = new Request(cacheKey, { cf: request.cf });
  let response = await cache.match(cacheRequest);

  if (!response) {
    // Cache miss - fetch from origin
    response = await fetch(request);

    // Only cache successful responses
    if (response.ok) {
      // Clone response for caching
      let responseToCache = response.clone();

      // Add Cache-Control if origin didn't
      if (!response.headers.get('Cache-Control')) {
        const headers = new Headers(response.headers);
        headers.set('Cache-Control', 'public, max-age=3600');
        responseToCache = new Response(responseToCache.body, {
          status: responseToCache.status,
          headers,
        });
      }

      event.waitUntil(cache.put(cacheRequest, responseToCache));
    }
  }

  return response;
}

function getSortedQueryString(url) {
  const params = [...url.searchParams.entries()];
  // Ignore tracking parameters
  const filtered = params.filter(([key]) =>
    !['utm_source', 'utm_medium', 'ref', 'fbclid'].includes(key)
  );
  // Sort for consistent cache keys
  filtered.sort((a, b) => a[0].localeCompare(b[0]));
  return filtered.map(([k, v]) => `${k}=${v}`).join('&');
}

function getDeviceType(request) {
  const ua = request.headers.get('User-Agent') || '';
  if (/Mobile|Android|iPhone/i.test(ua)) return 'mobile';
  if (/Tablet|iPad/i.test(ua)) return 'tablet';
  return 'desktop';
}

function getGeoRegion(request) {
  // Cloudflare provides geo data
  const country = request.cf?.country || 'XX';
  const regionMap = {
    US: 'americas', CA: 'americas', MX: 'americas', BR: 'americas',
    GB: 'europe', DE: 'europe', FR: 'europe', IT: 'europe',
    JP: 'apac', CN: 'apac', IN: 'apac', AU: 'apac',
  };
  return regionMap[country] || 'other';
}
```

Including Cookie in the Vary header effectively makes each user's session a unique cache entry, destroying shared caching benefits. If you need to personalize content for authenticated users, prefer private caching (browser only) or edge-side includes (ESI), where the base page is cached and personalized fragments are injected.
Time-To-Live (TTL) determines how long content remains fresh in the CDN cache before requiring revalidation or refresh from the origin. TTL strategy is a fundamental tension between freshness (showing current content) and efficiency (avoiding origin requests).
The TTL Spectrum:
```
Very Short TTL (seconds)       Medium TTL (minutes-hours)       Very Long TTL (days-years)
│                              │                                │
│ Fresh but expensive          │ Balanced approach              │ Efficient but stale
│ High origin load             │ Reasonable freshness           │ Requires cache busting
│ Good for dynamic content     │ Good for semi-static content   │ Good for immutable assets
▼                              ▼                                ▼
```
| Content Type | Recommended TTL | Rationale | s-maxage vs max-age |
|---|---|---|---|
| Versioned static assets (hash) | 1 year (31536000s) | URL changes with content; truly immutable | Same for both |
| Unversioned images/media | 1 day - 1 week | Changes infrequently; moderate staleness OK | CDN longer than browser |
| API responses (public data) | 1-5 minutes | Freshness matters; balance with efficiency | CDN same or longer |
| HTML pages | 0 (must-revalidate) | Always get fresh asset references | n/a—use no-cache |
| User-specific data | 0 or private only | Cannot serve cached to different users | Browser only, no CDN |
| Real-time data (stock prices) | 0 + no-cache | Staleness unacceptable | n/a—always revalidate |
Differentiating Browser and CDN TTLs:
You can set different cache durations for browsers (max-age) and CDNs (s-maxage). This enables powerful patterns:
```
# CDN caches for 1 hour, browsers cache for 5 minutes
Cache-Control: public, max-age=300, s-maxage=3600
```
Why differentiate? You can purge the CDN instantly when content changes, but you cannot purge users' browsers. A long s-maxage keeps CDN hit rates high (since a purge can always correct it), while a short max-age bounds how stale any individual browser's copy can get.
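To make the split concrete, here is a minimal Cache-Control parser sketch showing which TTL each cache tier would apply; real caches handle many more directives than this:

```javascript
// Which TTL applies depends on who is caching: shared caches (CDNs)
// prefer s-maxage, browsers use max-age. Simplified sketch.
function applicableTtl(cacheControl, isSharedCache) {
  const directives = {};
  for (const part of cacheControl.split(',')) {
    const [name, value] = part.trim().split('=');
    directives[name.toLowerCase()] = value !== undefined ? Number(value) : true;
  }
  if (isSharedCache && directives['s-maxage'] !== undefined) {
    return directives['s-maxage']; // shared caches honor s-maxage first
  }
  return directives['max-age'] ?? 0;
}

const header = 'public, max-age=300, s-maxage=3600';
console.log(applicableTtl(header, true));  // CDN tier: 3600
console.log(applicableTtl(header, false)); // browser: 300
```

With no s-maxage present, both tiers fall back to max-age, which is why the directive is purely additive.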
Stale Content Directives:
Modern HTTP caching supports serving stale content in specific situations:
```
# Stale-While-Revalidate: Serve stale immediately, fetch fresh in background
Cache-Control: public, max-age=60, stale-while-revalidate=300

# Timeline:
# 0-60 seconds:   Fresh - serve from cache
# 60-360 seconds: Stale but acceptable - serve stale, fetch fresh in background
# 360+ seconds:   Must revalidate before serving

# Stale-If-Error: Serve stale if origin is unreachable
Cache-Control: public, max-age=300, stale-if-error=86400

# Timeline:
# 0-300 seconds:  Fresh - serve from cache
# 300+ seconds:   Stale - if origin returns 5xx or times out, serve stale
# After 86700 seconds (300 + 86400): Don't serve stale even on errors

# Combined pattern for resilient API responses
Cache-Control: public, max-age=60, stale-while-revalidate=300, stale-if-error=86400

# This provides:
# - 60 seconds of fresh cache
# - Immediate response with background refresh for 5 more minutes
# - Graceful degradation to stale content during outages for up to 24 hours
```

Some CDNs (Fastly, Varnish-based) call stale-while-revalidate behavior "grace mode." It significantly improves perceived performance: users never wait for origin responses if any cached version is available. The background refresh ensures eventual consistency.
When content changes before TTL expiration, you need mechanisms to remove or update cached content. CDN cache invalidation is notoriously complex—Phil Karlton famously noted that "There are only two hard things in Computer Science: cache invalidation and naming things."
Invalidation Strategies:
```javascript
// Cloudflare API - Purge specific URLs
async function purgeUrls(urls) {
  const response = await fetch(
    `https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/purge_cache`,
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${API_TOKEN}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ files: urls }),
    }
  );
  return response.json();
}

// Purge by prefix (wildcard)
async function purgePrefix(prefix) {
  // Not all CDNs support this; check your provider
  const response = await fetch(
    `https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/purge_cache`,
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${API_TOKEN}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ prefixes: [prefix] }),
    }
  );
  return response.json();
}

// Purge by cache tag (surrogate key)
// Requires adding Surrogate-Key header to responses
async function purgeByTag(tags) {
  // Fastly API example
  const requests = tags.map(tag =>
    fetch(`https://api.fastly.com/service/${SERVICE_ID}/purge/${tag}`, {
      method: 'POST',
      headers: { 'Fastly-Key': API_KEY },
    })
  );
  return Promise.all(requests);
}

// Usage in deployment pipeline
async function deployAndInvalidate() {
  // 1. Deploy new version of assets
  await deployAssets();

  // 2. Purge old cached content
  await purgeByTag(['static-assets', 'v1.2.3']);

  // 3. Warm the cache for critical paths
  await warmCache([
    'https://example.com/',
    'https://example.com/products',
    'https://example.com/api/featured',
  ]);

  console.log('Deployment complete, caches invalidated and warmed');
}
```

Cache Tags / Surrogate Keys:
The most powerful invalidation pattern uses cache tags (Fastly calls them "surrogate keys"). Instead of tracking individual URLs, you tag responses with logical identifiers, then purge by tag.
```
GET /products/12345

HTTP/1.1 200 OK
Cache-Control: public, max-age=86400
Surrogate-Key: product-12345 category-electronics products-all

{"id": 12345, "name": "Laptop", ...}
```
When product 12345 is updated, purge product-12345. When category structure changes, purge category-electronics. When all products need refresh, purge products-all.
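Conceptually, tag-based purging is a reverse index from tag to cache entries. A minimal in-memory sketch of the idea (not how any CDN stores this internally):

```javascript
// In-memory sketch of tag-based invalidation: a reverse index maps each
// surrogate key (tag) to the cache entries that carry it.
class TaggedCache {
  constructor() {
    this.entries = new Map();   // cacheKey -> { body, tags }
    this.tagIndex = new Map();  // tag -> Set of cacheKeys
  }

  put(cacheKey, body, tags) {
    this.entries.set(cacheKey, { body, tags });
    for (const tag of tags) {
      if (!this.tagIndex.has(tag)) this.tagIndex.set(tag, new Set());
      this.tagIndex.get(tag).add(cacheKey);
    }
  }

  purgeTag(tag) {
    // One purge call invalidates every entry carrying the tag
    for (const cacheKey of this.tagIndex.get(tag) ?? []) {
      this.entries.delete(cacheKey);
    }
    this.tagIndex.delete(tag);
  }

  get(cacheKey) {
    return this.entries.get(cacheKey)?.body;
  }
}

const cache = new TaggedCache();
cache.put('/products/12345', '{"id":12345}', ['product-12345', 'products-all']);
cache.put('/products', '[catalog listing]', ['products-all']);
cache.purgeTag('product-12345'); // product page gone, listing untouched
console.log(cache.get('/products/12345')); // undefined
console.log(cache.get('/products'));       // [catalog listing]
```

The origin's only job is emitting accurate tags on each response; the CDN maintains the index.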
Benefits of Tag-based Invalidation:
| Approach | Invalidate when product updated | Implementation Complexity |
|---|---|---|
| URL-based purge | Must know all URLs referencing the product | High—track all URL variations |
| Wildcard purge | Broad purge, over-invalidates | Low but inefficient |
| Cache tags | Purge single tag | Medium—add tags to responses |
Purging popular content from all edge locations simultaneously can cause a thundering herd—all users suddenly hitting origin at once. Mitigate with: origin shield (single point for refetch), staggered purge, request coalescing at edge, or soft purge with stale-while-revalidate.
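Request coalescing, one of the mitigations above, fits in a few lines: concurrent misses for the same key share a single in-flight origin fetch. A simplified illustration, not production code:

```javascript
// Request coalescing sketch: concurrent misses for the same key join one
// in-flight origin fetch instead of stampeding the origin.
const inFlight = new Map();

async function coalescedFetch(key, fetchFromOrigin) {
  if (inFlight.has(key)) return inFlight.get(key); // join the existing fetch
  const promise = fetchFromOrigin(key).finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}

// Demo: 100 concurrent requests for a just-purged hot page, one origin hit
let originHits = 0;
const origin = async key => { originHits++; return `body-for-${key}`; };

Promise.all(
  Array.from({ length: 100 }, () => coalescedFetch('/hot-page', origin))
).then(results => {
  console.log(originHits);  // 1
  console.log(results[0]);  // body-for-/hot-page
});
```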
An origin shield is an intermediate caching layer between edge PoPs and your origin server. It acts as a regional cache that serves multiple edge locations, collapsing many edge-to-origin requests into fewer shield-to-origin requests.
Without Origin Shield:
```
Edge (Tokyo)  ──┐
Edge (Seoul)  ──┼─── 3 separate requests ──→ Origin (US)
Edge (Sydney) ──┘

On cache miss, each edge fetches independently
```
With Origin Shield:
```
Edge (Tokyo)  ──┐                          ┌──→ Origin (US)
Edge (Seoul)  ──┼─→ Shield (Singapore) ────┤
Edge (Sydney) ──┘   (regional cache)       └──→ (only if shield misses)

On cache miss, only the shield fetches from origin
```
| Strategy | Place Shield Near | Use Case |
|---|---|---|
| Near Origin | Same region as origin server | Maximum origin protection; simpler architecture |
| Near Users | Region with most traffic | Faster shield hits for majority of users |
| Multiple Shields | One per major region | Global traffic distribution; regional resilience |
| Hierarchical | Multiple tiers of caching | Extreme scale; complex but optimal |
```yaml
# AWS CloudFront Distribution with Origin Shield
AWSTemplateFormatVersion: '2010-09-09'
Description: CloudFront distribution with Origin Shield configured

Resources:
  CDNDistribution:
    Type: AWS::CloudFront::Distribution
    Properties:
      DistributionConfig:
        Origins:
          - Id: MyOrigin
            DomainName: origin.example.com
            CustomOriginConfig:
              HTTPPort: 80
              HTTPSPort: 443
              OriginProtocolPolicy: https-only
              OriginSSLProtocols:
                - TLSv1.2
            # Origin Shield Configuration
            OriginShield:
              Enabled: true
              # Choose region closest to your origin
              OriginShieldRegion: us-east-1
            # Origin timeout and retry settings
            ConnectionAttempts: 3
            ConnectionTimeout: 10
        DefaultCacheBehavior:
          TargetOriginId: MyOrigin
          ViewerProtocolPolicy: redirect-to-https
          # Cache settings
          CachePolicyId: !Ref CachePolicy
          # Enable compression
          Compress: true
        PriceClass: PriceClass_All
        Enabled: true

  CachePolicy:
    Type: AWS::CloudFront::CachePolicy
    Properties:
      CachePolicyConfig:
        Name: OptimizedCaching
        DefaultTTL: 86400    # 1 day
        MaxTTL: 31536000     # 1 year
        MinTTL: 0
        ParametersInCacheKeyAndForwardedToOrigin:
          CookiesConfig:
            CookieBehavior: none
          HeadersConfig:
            HeaderBehavior: whitelist
            Headers:
              # Accept-Encoding is handled by the Enable flags below and
              # must not be listed here
              - Accept-Language
          QueryStringsConfig:
            QueryStringBehavior: all
          EnableAcceptEncodingBrotli: true
          EnableAcceptEncodingGzip: true
```

Origin shields typically implement request coalescing: when multiple concurrent requests arrive for the same uncached content, only one request goes to origin while others wait. This is transparent to your application but dramatically reduces origin load during cache warming or after purges.
While CDNs are traditionally associated with static content, modern CDNs provide significant benefits for dynamic API responses and personalized content. Understanding how to leverage CDNs for dynamic content expands their utility considerably.
Micro-Caching for Dynamic Content:
Even content that changes frequently can benefit from very short TTLs (1-10 seconds). This "micro-caching" pattern absorbs traffic spikes and collapses origin load while keeping staleness tightly bounded.
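The arithmetic behind micro-caching is worth making explicit. Assuming ideal caching, the origin sees at most one request per TTL window per cache key, no matter the incoming traffic:

```javascript
// Origin load under micro-caching: at most one miss per TTL window per
// distinct cache key (idealized; ignores multi-PoP fan-out).
function originLoad(incomingRps, ttlSeconds, distinctKeys = 1) {
  const originRps = Math.min(incomingRps, distinctKeys / ttlSeconds);
  const reduction = 1 - originRps / incomingRps;
  return { originRps, reductionPct: (reduction * 100).toFixed(1) };
}

// 1,000 req/s against one hot endpoint, cached for just 1 second:
console.log(originLoad(1000, 1)); // { originRps: 1, reductionPct: '99.9' }
```

Even a 1-second TTL turns a 1,000 req/s spike into roughly one origin request per second.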
```
# API endpoint: Product catalog (changes hourly, allows staleness)
Cache-Control: public, max-age=60, stale-while-revalidate=300
# CDN serves fresh for 1 minute, then stale while refreshing for 5 more minutes

# API endpoint: Stock prices (changes constantly, minimal staleness)
Cache-Control: public, max-age=1, stale-while-revalidate=5
# Only 1-second freshness, but still absorbs traffic spikes

# API endpoint: Trending content (social proof, some staleness OK)
Cache-Control: public, max-age=30, stale-while-revalidate=60
# Updates every 30 seconds; users see fresh-ish content

# API endpoint: User-specific but not sensitive (recommendations)
Cache-Control: private, max-age=300
# Browser caches 5 minutes; CDN doesn't cache (private)

# Approach: Geographic variations of "public" content
# Cache different versions by region
Vary: Accept-Encoding
# Plus CDN-level configuration to vary by geographic region
```

Edge Compute for Dynamic Content:
Modern CDNs offer edge compute capabilities (Cloudflare Workers, AWS Lambda@Edge, Fastly Compute@Edge) that enable dynamic content generation at the edge:
| Use Case | Edge Compute Pattern | Benefit |
|---|---|---|
| A/B testing | Assign user to variant at edge | No origin latency for variant assignment |
| Personalization | Inject user context into cached content | Combine caching with customization |
| Bot detection | Analyze request patterns at edge | Protect origin from bad traffic |
| API aggregation | Combine multiple cached responses | Reduce client round trips |
| Authentication validation | Validate JWT at edge | Block unauthorized before origin |
```javascript
// Edge worker: Personalized content with cached base
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event));
});

async function handleRequest(event) {
  const request = event.request;
  const url = new URL(request.url);

  // Example: Homepage personalization
  if (url.pathname === '/') {
    return personalizeHomepage(event);
  }

  // Pass through to origin
  return fetch(request);
}

async function personalizeHomepage(event) {
  const request = event.request;

  // Fetch cached base HTML (generic version)
  const cache = caches.default;
  const baseUrl = new URL(request.url);
  baseUrl.searchParams.set('base', 'true');

  let baseHtml = await cache.match(baseUrl);
  if (!baseHtml) {
    baseHtml = await fetch(baseUrl.toString());
    event.waitUntil(cache.put(baseUrl, baseHtml.clone()));
  }

  // Get personalization data (from cookie or API)
  const userData = await getUserData(request);

  // Transform cached base with personalization
  const personalizedHtml = await personalizeHtml(baseHtml, userData);

  return new Response(personalizedHtml.body, {
    headers: {
      'Content-Type': 'text/html',
      'Cache-Control': 'private, max-age=0', // Don't cache personalized version
    },
  });
}

async function getUserData(request) {
  const cookie = request.headers.get('Cookie');
  const userId = extractUserIdFromCookie(cookie);

  if (!userId) {
    return { isAnonymous: true, segments: [] };
  }

  // Fetch user segments (cached at edge)
  const segmentsUrl = `https://api.example.com/users/${userId}/segments`;
  const segmentsResponse = await fetch(segmentsUrl, {
    cf: { cacheTtl: 300 }, // Cache user segments for 5 minutes
  });
  return segmentsResponse.json();
}

async function personalizeHtml(response, userData) {
  const html = await response.text();

  // Use HTMLRewriter for streaming transformation
  return new HTMLRewriter()
    .on('[data-personalize="greeting"]', {
      element(element) {
        element.setInnerContent(
          userData.isAnonymous
            ? 'Welcome!'
            : `Welcome back, ${userData.name}!`
        );
      },
    })
    .on('[data-segment]', {
      element(element) {
        const targetSegment = element.getAttribute('data-segment');
        if (!userData.segments.includes(targetSegment)) {
          element.remove();
        }
      },
    })
    .transform(new Response(html));
}
```

Edge-Side Includes (ESI) is a markup language for assembling pages from cached fragments. The CDN caches the base page and dynamic fragments separately, then assembles them at the edge. This pattern works well for pages with mostly static content and small personalized sections (user name, cart count).
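To illustrate the ESI assembly model, here is a toy stand-in for the CDN's ESI processor. The `assembleEsi` helper and fragment paths are illustrative; the tag syntax follows ESI 1.0:

```javascript
// Minimal sketch of edge-side assembly: the base page is cached with
// <esi:include> placeholders, and the edge splices in separately cached
// (or per-user) fragments. Toy implementation, not a real ESI processor.
function assembleEsi(template, fragments) {
  return template.replace(
    /<esi:include src="([^"]+)"\s*\/>/g,
    (_, src) => fragments[src] ?? ''
  );
}

const cachedBase =
  '<header>Shop <esi:include src="/fragments/cart-badge"/></header>';
const page = assembleEsi(cachedBase, {
  '/fragments/cart-badge': '<span>3 items</span>', // fetched per user, short TTL
});
console.log(page); // <header>Shop <span>3 items</span></header>
```

The base page stays highly cacheable; only the small fragments bypass or shorten the cache.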
Effective CDN caching requires continuous monitoring to identify opportunities, detect issues, and optimize performance. Key metrics reveal how well your caching strategy is working.
Critical CDN Metrics:
| Metric | What It Measures | Target Range | Action if Out of Range |
|---|---|---|---|
| Cache Hit Ratio | % of requests served from cache | > 90% for static, > 50% for all | Review TTLs, Vary headers, cache key config |
| Origin Request Rate | Requests reaching origin per second | Stable, not tracking traffic spikes | Investigate cache misses, enable shield |
| Edge Response Time | Time from edge receive to edge send | < 50ms for cache hits | Check edge compute, response size |
| Origin Response Time | Time for origin to respond | < 500ms p95 | Optimize origin, check network path |
| Error Rate | % of 4xx/5xx responses | < 1% | Debug errors, check origin health |
| Bandwidth Saved | Data served from cache vs origin | > 80% | Review asset caching, compression |
| TTL Distribution | Age of cached content when served | Varied by content type | Adjust TTLs if all very young/old |
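The first and last-but-one rows of the table can be computed directly from access logs. A sketch with illustrative field names (real CDN log schemas differ by vendor):

```javascript
// Compute cache hit ratio and bandwidth saved from simplified access-log
// entries. Field names (cacheStatus, bytes) are illustrative.
function cacheStats(logEntries) {
  let hits = 0, hitBytes = 0, totalBytes = 0;
  for (const { cacheStatus, bytes } of logEntries) {
    totalBytes += bytes;
    if (cacheStatus === 'HIT') { hits++; hitBytes += bytes; }
  }
  return {
    hitRatio: hits / logEntries.length,           // requests served from cache
    bandwidthSavedPct: (hitBytes / totalBytes) * 100, // bytes not fetched from origin
  };
}

const stats = cacheStats([
  { cacheStatus: 'HIT', bytes: 500 },
  { cacheStatus: 'HIT', bytes: 300 },
  { cacheStatus: 'MISS', bytes: 200 },
  { cacheStatus: 'HIT', bytes: 1000 },
]);
console.log(stats.hitRatio);          // 0.75
console.log(stats.bandwidthSavedPct); // 90
```

Note the two metrics can diverge: a cache that hits mostly on small objects has a high request hit ratio but saves little bandwidth.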
Debugging Cache Behavior:
CDNs typically add response headers that reveal caching behavior:
```
# Cloudflare response headers
CF-Cache-Status: HIT      # Served from Cloudflare cache
CF-Cache-Status: MISS     # Fetched from origin
CF-Cache-Status: BYPASS   # Caching not applicable
CF-Cache-Status: DYNAMIC  # Not cached (dynamic content)
CF-Cache-Status: EXPIRED  # Was cached but TTL expired
Age: 3600                 # Content has been cached for 1 hour

# AWS CloudFront
X-Cache: Hit from cloudfront
X-Cache: Miss from cloudfront
X-Amz-Cf-Pop: IAD53-C1    # Edge location that served request

# Fastly
X-Served-By: cache-sin18032-SIN  # Singapore PoP
X-Cache: HIT, HIT                # Hit at edge AND shield
X-Cache-Hits: 42                 # Number of hits for this object
Age: 7200                        # Cached for 2 hours
```
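When monitoring across vendors, it helps to normalize these headers into one status value. A sketch based on the header formats above; the mapping rules are simplifying assumptions, so verify against your CDN's documentation:

```javascript
// Normalize hit/miss signals across CDN vendors' debug headers so one
// monitoring pipeline can consume them. Expects lowercase header keys.
function normalizedCacheStatus(headers) {
  const h = name => headers[name.toLowerCase()];

  const cf = h('CF-Cache-Status'); // Cloudflare
  if (cf) return cf === 'HIT' ? 'HIT' : cf === 'MISS' ? 'MISS' : 'OTHER';

  const xCache = h('X-Cache');     // CloudFront / Fastly
  if (xCache) return /\bhit\b/i.test(xCache) ? 'HIT' : 'MISS';

  return 'UNKNOWN';
}

console.log(normalizedCacheStatus({ 'cf-cache-status': 'HIT' }));          // HIT
console.log(normalizedCacheStatus({ 'x-cache': 'Miss from cloudfront' })); // MISS
console.log(normalizedCacheStatus({ 'x-cache': 'HIT, HIT' }));             // HIT
```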
```javascript
// CDN Analytics Query - Example using CloudWatch for CloudFront
const AWS = require('aws-sdk');
const cloudwatch = new AWS.CloudWatch();

const sum = values => values.reduce((a, b) => a + b, 0);
const average = values => (values.length ? sum(values) / values.length : 0);

async function getCDNMetrics(distributionId, hours = 24) {
  const endTime = new Date();
  const startTime = new Date(endTime - hours * 60 * 60 * 1000);

  const metrics = [
    { name: 'Requests', stat: 'Sum' },
    { name: 'BytesDownloaded', stat: 'Sum' },
    { name: 'CacheHitRate', stat: 'Average' },
    { name: 'OriginLatency', stat: 'Average' },
    { name: '4xxErrorRate', stat: 'Average' },
    { name: '5xxErrorRate', stat: 'Average' },
  ];

  const params = {
    StartTime: startTime,
    EndTime: endTime,
    MetricDataQueries: metrics.map((m, i) => ({
      Id: `m${i}`,
      MetricStat: {
        Metric: {
          Namespace: 'AWS/CloudFront',
          MetricName: m.name,
          Dimensions: [
            { Name: 'DistributionId', Value: distributionId },
            { Name: 'Region', Value: 'Global' },
          ],
        },
        Period: 3600, // 1 hour granularity
        Stat: m.stat,
      },
    })),
  };

  const data = await cloudwatch.getMetricData(params).promise();

  // Calculate summary
  return {
    totalRequests: sum(data.MetricDataResults[0].Values),
    bytesServed: sum(data.MetricDataResults[1].Values),
    avgCacheHitRate: average(data.MetricDataResults[2].Values),
    avgOriginLatency: average(data.MetricDataResults[3].Values),
    errorRate:
      average(data.MetricDataResults[4].Values) +
      average(data.MetricDataResults[5].Values),
  };
}

// Set up alerting for cache issues
async function checkCacheHealth(distributionId) {
  const metrics = await getCDNMetrics(distributionId, 1);
  const alerts = [];

  if (metrics.avgCacheHitRate < 0.7) {
    alerts.push({
      severity: 'warning',
      message: `Cache hit rate low: ${(metrics.avgCacheHitRate * 100).toFixed(1)}%`,
      action: 'Review TTL configurations and Vary headers',
    });
  }

  if (metrics.avgOriginLatency > 500) {
    alerts.push({
      severity: 'critical',
      message: `Origin latency high: ${metrics.avgOriginLatency.toFixed(0)}ms`,
      action: 'Check origin server health and network path',
    });
  }

  if (metrics.errorRate > 0.01) {
    alerts.push({
      severity: 'critical',
      message: `Error rate elevated: ${(metrics.errorRate * 100).toFixed(2)}%`,
      action: 'Investigate origin errors in logs',
    });
  }

  return alerts;
}
```

CDN metrics show what's happening at the edge, but real user monitoring (RUM) shows what users actually experience. Combine CDN metrics with client-side performance data (Core Web Vitals, Navigation Timing API) for a complete picture. A 99% cache hit rate doesn't help if the cached content is the wrong content.
CDN caching is essential for delivering fast, reliable experiences to global users. By placing content closer to users physically, CDNs eliminate the latency that no amount of server optimization can overcome. Effective CDN usage requires understanding cache keys, TTL strategies, invalidation patterns, and monitoring.
What's Next:
With browser and CDN caching covered, we move closer to the application. The next page explores Application-Level Caching—where your code explicitly manages cached data in memory, using techniques like memoization, in-process caches, and integration with caching libraries. This layer offers the most control but requires careful design to maintain consistency.
You now understand CDN caching comprehensively—from architecture and cache key design to TTL strategies, invalidation patterns, origin shield, and edge compute. You can design CDN caching strategies that deliver sub-100ms global response times while maintaining content freshness and operational control.