Every CDN conversation eventually leads to a single question: "What's your cache hit ratio?"
The cache hit ratio (CHR), also called cache hit rate, is the percentage of requests that are served from cache rather than requiring a trip to the origin server. It's the fundamental measure of CDN effectiveness—the higher your CHR, the more value you're extracting from your CDN investment.
A high cache hit ratio means faster responses for users, less load on your origin servers, and lower bandwidth costs.
This page provides a systematic approach to measuring, analyzing, and optimizing cache hit ratios—transforming your CDN from a passive passthrough into a powerful performance multiplier.
By the end of this page, you will understand how to accurately measure cache hit ratio, identify common causes of cache misses, implement strategies to improve CHR across different content types, and set realistic targets based on your application's characteristics.
Cache hit ratio is conceptually simple but operationally nuanced. The basic formula is:
Cache Hit Ratio = (Cache Hits) / (Cache Hits + Cache Misses) × 100%
However, the definition of "hit" and "miss" varies by context, and a single CHR number can hide important details.
| Metric | Definition | What It Measures | Typical Target |
|---|---|---|---|
| Request CHR | % of requests served from cache | Efficiency at reducing origin request volume | 85-99% |
| Bandwidth CHR | % of bytes served from cache | Efficiency at reducing origin bandwidth | 90-99% |
| Edge CHR | Hits at edge servers only | Edge cache effectiveness | 70-95% |
| Origin Shield CHR | Hits at shield layer | Shield protection effectiveness | 85-99% |
| Total CHR | Combined across all layers | Overall CDN effectiveness | 95-99.9% |
```
SCENARIO: CDN Traffic for 1 hour

Total Requests: 1,000,000
Cache Hits:       920,000
Cache Misses:      80,000

REQUEST CHR = 920,000 / 1,000,000 = 92%

───────────────────────────────────────────────────────────────

BANDWIDTH-WEIGHTED SCENARIO:

Cache Hits:   920,000 requests × avg  50KB = 46GB
Cache Misses:  80,000 requests × avg 500KB = 40GB (larger objects)

BANDWIDTH CHR = 46GB / 86GB = 53%

INSIGHT: Low bandwidth CHR despite high request CHR indicates
large objects (videos, downloads) aren't being cached

───────────────────────────────────────────────────────────────

MULTI-LAYER SCENARIO:

Edge Requests: 1,000,000
├─ Edge HIT: 700,000 (70% edge CHR)
└─ Edge MISS → Shield: 300,000
   ├─ Shield HIT: 260,000 (87% shield CHR)
   └─ Shield MISS → Origin: 40,000

TOTAL CHR = (700,000 + 260,000) / 1,000,000 = 96%
ORIGIN OFFLOAD = 1 - (40,000 / 1,000,000) = 96%

INSIGHT: Multi-layer caching dramatically increases effective CHR
```

Choose your CHR metric based on your goals. If reducing origin request volume is the priority, track request CHR. If bandwidth costs dominate, track bandwidth CHR. Most teams should monitor both, as they can diverge significantly.
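The arithmetic in the scenario above is easy to script. A minimal sketch (the function names are illustrative, not from any CDN SDK):

```javascript
// Request CHR: fraction of requests served from cache.
function requestChr(hits, misses) {
  return (hits / (hits + misses)) * 100;
}

// Bandwidth CHR: fraction of bytes served from cache.
function bandwidthChr(hitBytes, missBytes) {
  return (hitBytes / (hitBytes + missBytes)) * 100;
}

// Scenario above: 920k hits averaging 50KB, 80k misses averaging 500KB.
const reqChr = requestChr(920_000, 80_000);              // 92%
const bwChr = bandwidthChr(920_000 * 50, 80_000 * 500);  // ≈53.5%
console.log(reqChr.toFixed(1), bwChr.toFixed(1));
```

The gap between the two numbers is the signal: here the large uncached objects drag bandwidth CHR nearly 40 points below request CHR.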
Before optimizing, you must understand why cache misses occur. Each miss represents an opportunity—either expected (first request for new content) or avoidable (misconfiguration, poor cache key design).
```
CACHE MISS BREAKDOWN ANALYSIS

Total Requests: 1,000,000
Cache Misses:      80,000 (8% miss rate)

MISS CATEGORY BREAKDOWN:
═══════════════════════════════════════════════════════════

Cold Misses (First Request): 12,000 (15%)
├─ New content published: 3,000
├─ Long-tail pages accessed: 5,000
├─ New users triggering unique URLs: 4,000
└─ STATUS: Expected, minimize with cache warming

Capacity Misses (Eviction): 5,000 (6%)
├─ Low-traffic pages evicted: 4,000
├─ Large objects evicted: 1,000
└─ STATUS: Consider cache sizing review

Expiration Misses (TTL): 18,000 (23%)
├─ Short TTL content: 12,000
├─ Moderately-trafficked pages: 6,000
└─ STATUS: Review TTL strategy, implement SWR

Fragmentation Misses (Key Variance): 25,000 (31%) ⚠️
├─ UTM parameter variations: 15,000
├─ Tracking pixel parameters: 7,000
├─ Social media click IDs: 3,000
└─ STATUS: CRITICAL - Fix cache key configuration

Bypass Misses (Intentional): 15,000 (19%)
├─ Authenticated requests: 10,000
├─ POST/PUT/DELETE methods: 3,000
├─ Cache-Control: no-cache: 2,000
└─ STATUS: Expected for personalized content

Configuration Misses (Errors): 5,000 (6%)
├─ Missing Cache-Control header: 3,000
├─ Private content marked public: 1,000
├─ Uncacheable status codes: 1,000
└─ STATUS: Fix origin header configuration

═══════════════════════════════════════════════════════════

OPTIMIZATION PRIORITY:
1. Fix fragmentation (31% of misses are avoidable)
2. Review expiration/TTL strategy (23% could be reduced)
3. Fix configuration issues (6% quick wins)
```

In this analysis, 31% of cache misses were fragmentation—the same content cached multiple times due to query parameter variations. This is extremely common and often the single biggest CHR improvement opportunity. Audit your cache key configuration before any other optimization.
Cache key optimization is typically the highest-impact CHR improvement. A well-configured cache key eliminates fragmentation while preserving necessary content variations.
Three normalizations cover most fragmentation:

- Strip `utm_*`, `fbclid`, `gclid`, `mc_*`, and other tracking parameters from the cache key while preserving them for analytics.
- Sort query parameters so `/page?a=1&b=2` and `/page?b=2&a=1` produce the same cache key.
- Normalize case so `/Page.HTML` and `/page.html` map to the same cache key (usually safe, but verify your origin treats paths case-insensitively).
```javascript
// Cloudflare Worker: comprehensive cache key optimization
addEventListener('fetch', event => {
  // Pass the whole event through so handleRequest can call event.waitUntil()
  event.respondWith(handleRequest(event));
});

async function handleRequest(event) {
  const request = event.request;

  // Create a normalized cache key URL
  const cacheKeyUrl = new URL(request.url);

  // 1. Strip analytics/tracking parameters
  const paramsToStrip = [
    'utm_source', 'utm_medium', 'utm_campaign', 'utm_term', 'utm_content',
    'fbclid', 'gclid', 'gclsrc', 'dclid',
    'mc_eid', 'mc_cid', '_ga', '_gl',
    'ref', 'source',
    'hsCtaTracking', 'hsmi', 'hsa_*',
  ];

  paramsToStrip.forEach(param => {
    if (param.endsWith('*')) {
      // Wildcard matching
      const prefix = param.slice(0, -1);
      for (const key of [...cacheKeyUrl.searchParams.keys()]) {
        if (key.startsWith(prefix)) {
          cacheKeyUrl.searchParams.delete(key);
        }
      }
    } else {
      cacheKeyUrl.searchParams.delete(param);
    }
  });

  // 2. Sort remaining query parameters
  const sortedParams = new URLSearchParams(
    [...cacheKeyUrl.searchParams.entries()].sort((a, b) => a[0].localeCompare(b[0]))
  );
  cacheKeyUrl.search = sortedParams.toString();

  // 3. Normalize path (lowercase, remove trailing slash)
  cacheKeyUrl.pathname = cacheKeyUrl.pathname.toLowerCase().replace(/\/$/, '') || '/';

  // 4. Add device class (not the full User-Agent)
  const deviceClass = getDeviceClass(request.headers.get('User-Agent'));
  if (deviceClass !== 'desktop') {
    cacheKeyUrl.searchParams.set('_device', deviceClass);
  }

  // 5. Add normalized Accept-Language (if content varies by language)
  const language = normalizeLanguage(request.headers.get('Accept-Language'));
  if (language !== 'en') {
    cacheKeyUrl.searchParams.set('_lang', language);
  }

  // Create the cache key request
  const cacheKey = new Request(cacheKeyUrl.toString(), request);

  // Check cache
  const cache = caches.default;
  let response = await cache.match(cacheKey);

  if (response) {
    // Add headers indicating a cache hit
    response = new Response(response.body, response);
    response.headers.set('X-Cache', 'HIT');
    response.headers.set('X-Cache-Key', cacheKeyUrl.pathname + cacheKeyUrl.search);
    return response;
  }

  // Cache miss - fetch from origin (using the original request for analytics)
  response = await fetch(request);

  // Clone and cache without blocking the response
  const responseToCache = response.clone();
  event.waitUntil(cache.put(cacheKey, responseToCache));

  // Return the response with a cache miss header
  const modifiedResponse = new Response(response.body, response);
  modifiedResponse.headers.set('X-Cache', 'MISS');
  modifiedResponse.headers.set('X-Cache-Key', cacheKeyUrl.pathname + cacheKeyUrl.search);
  return modifiedResponse;
}

function getDeviceClass(userAgent) {
  if (!userAgent) return 'desktop';
  const ua = userAgent.toLowerCase();
  if (/mobile|android|iphone|ipod|blackberry|iemobile/i.test(ua)) return 'mobile';
  if (/tablet|ipad|playbook|silk/i.test(ua)) return 'tablet';
  return 'desktop';
}

function normalizeLanguage(acceptLanguage) {
  if (!acceptLanguage) return 'en';
  const primary = acceptLanguage.split(',')[0].split('-')[0].toLowerCase();
  const supported = ['en', 'es', 'fr', 'de', 'ja', 'zh'];
  return supported.includes(primary) ? primary : 'en';
}
```

Before implementing cache key changes, measure your current fragmentation. Add a header that echoes the cache key and analyze how many unique cache keys serve identical content. After optimization, measure again to quantify the improvement.
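You can also estimate fragmentation offline from access logs before touching production. A sketch, assuming you have raw request URLs available; the parameter list here is a shortened, illustrative subset of the one in the worker:

```javascript
// Sketch: estimate cache-key fragmentation from a sample of logged URLs.
// The parameter list is an illustrative subset, not exhaustive.
const TRACKING_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign', 'fbclid', 'gclid'];

function normalizeUrl(rawUrl) {
  const url = new URL(rawUrl);
  TRACKING_PARAMS.forEach(p => url.searchParams.delete(p));
  url.searchParams.sort();                      // order-independent keys
  url.pathname = url.pathname.toLowerCase();    // case-insensitive paths
  return url.pathname + url.search;
}

// Fragmentation ratio: raw unique URLs per unique normalized cache key.
// Anything well above 1.0 means the same content is cached many times.
function fragmentationRatio(loggedUrls) {
  const raw = new Set(loggedUrls.map(u => {
    const x = new URL(u);
    return x.pathname + x.search;
  }));
  const normalized = new Set(loggedUrls.map(normalizeUrl));
  return raw.size / normalized.size;
}

const sample = [
  'https://example.com/page?a=1&utm_source=mail',
  'https://example.com/page?utm_source=ads&a=1',
  'https://example.com/Page?a=1',
];
console.log(fragmentationRatio(sample)); // 3 raw variants, 1 cache key
```

Run this over an hour of logs grouped by path and the worst-fragmented paths become your cache key fix list.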
TTL configuration directly impacts cache hit ratio. Longer TTLs mean content stays in cache longer, increasing the likelihood of hits. But the relationship isn't simply "longer = better"—optimal TTLs balance hit ratio with content freshness requirements.
```
TTL IMPACT ANALYSIS

Scenario: Product page receiving 60 requests/minute
Average origin response time: 500ms
Content changes: ~3 times per day

═══════════════════════════════════════════════════════════════

TTL = 60 seconds (1 minute)
────────────────────────────────────────────────────────────
- Every 60 seconds: 1 miss, 59 hits
- CHR = 59/60 = 98.3%
- Origin requests = 1,440/day
- Freshness: At most 1 minute stale
- Risk: Low (content usually fresh)

TTL = 300 seconds (5 minutes)
────────────────────────────────────────────────────────────
- Every 300 seconds: 1 miss, 299 hits
- CHR = 299/300 = 99.7%
- Origin requests = 288/day
- Freshness: At most 5 minutes stale
- Risk: Moderate (may need purge for critical updates)

TTL = 3600 seconds (1 hour) with SWR
────────────────────────────────────────────────────────────
- Every 3600 seconds: ~1 miss (SWR keeps content fresh)
- CHR = 99.97%
- Origin requests = 24/day
- Freshness: Usually fresh (SWR refreshes under traffic)
- Risk: Requires reliable purge for price changes

TTL = 86400 seconds (1 day) with Purge
────────────────────────────────────────────────────────────
- Origin requests = 3 (on purge) + 1 per cold start
- CHR = 99.99% under normal traffic
- Freshness: Always fresh (purge on change)
- Risk: Depends entirely on purge reliability

═══════════════════════════════════════════════════════════════

OPTIMAL STRATEGY for this scenario:

Cache-Control: public, max-age=300, s-maxage=3600,
               stale-while-revalidate=86400, stale-if-error=86400

- Edge caches 1 hour
- Browser caches 5 minutes
- Stale serving for 24 hours (protection)
- Purge on content change
```

With stale-while-revalidate, you can use short max-age values (for staleness control) with very long SWR windows (for hit ratio). This combination ensures content stays relatively fresh while maximizing cache efficiency.
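The per-TTL figures above follow from a simple model: one miss per TTL window under steady traffic (the 59/60 arithmetic corresponds to one request per second), ignoring SWR and eviction. A sketch:

```javascript
// Sketch: expected CHR for a steadily requested object.
// Model: exactly one miss per TTL window; ignores SWR and eviction.
function estimatedChr(ttlSeconds, requestsPerSecond) {
  const requestsPerWindow = ttlSeconds * requestsPerSecond;
  return ((requestsPerWindow - 1) / requestsPerWindow) * 100;
}

// One origin refresh per TTL window across the day.
function dailyOriginRequests(ttlSeconds) {
  return Math.ceil(86_400 / ttlSeconds);
}

console.log(estimatedChr(60, 1).toFixed(1));   // 98.3 (matches 59/60 above)
console.log(estimatedChr(3600, 1).toFixed(2)); // 99.97
console.log(dailyOriginRequests(60));          // 1440
console.log(dailyOriginRequests(3600));        // 24
```

The model makes the diminishing returns visible: going from a 1-minute to a 1-hour TTL moves CHR by under two points, but cuts daily origin requests by 60×.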
Single-layer edge caching has inherent hit ratio limitations. When content is requested from multiple edge locations, each edge has its own cache, leading to redundant origin requests. Origin shield and multi-layer caching collapse these redundant requests.
```
SCENARIO: Content requested from 50 edge locations
TTL = 300 seconds

═══════════════════════════════════════════════════════════════
SINGLE-LAYER CACHING (Edge Only)
═══════════════════════════════════════════════════════════════

Content published → Content requested at all 50 edges

Edge 1  → Cache miss → Origin (500ms)
Edge 2  → Cache miss → Origin (500ms)
Edge 3  → Cache miss → Origin (500ms)
...
Edge 50 → Cache miss → Origin (500ms)

Result: 50 origin requests for the same content
Total origin load: 50 × origin_cost

After 5 minutes, the cycle repeats.
Daily origin requests: 50 locations × 288 cycles = 14,400

═══════════════════════════════════════════════════════════════
MULTI-LAYER CACHING (Edge + Origin Shield)
═══════════════════════════════════════════════════════════════

Content published → Content requested at all 50 edges

Edge 1  → Cache miss → Shield → Cache miss → Origin (500ms)
Edge 2  → Cache miss → Shield → Cache HIT (10ms)
Edge 3  → Cache miss → Shield → Cache HIT (10ms)
...
Edge 50 → Cache miss → Shield → Cache HIT (10ms)

Result: 1 origin request (49 served by the shield)
Total origin load: 1 × origin_cost

After 5 minutes:
- First edge to expire → Shield HIT (if shield still fresh)
- Shield TTL can be longer than edge TTL

Daily origin requests: 288 cycles (vs 14,400)
IMPROVEMENT: 50× reduction in origin load
```

| Benefit | Without Shield | With Shield | Improvement |
|---|---|---|---|
| Origin requests (first access) | N edges × 1 | 1 | N× reduction |
| Origin requests (TTL refresh) | N edges × refreshes | 1 × refreshes | N× reduction |
| Origin bandwidth | Full response × N | Full response × 1 | N× reduction |
| Effective CHR | 70-85% | 95-99% | +10-30% |
| Latency for shield hits | Origin RTT (100-500ms) | Shield RTT (20-50ms) | 5-10× faster |
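The shield arithmetic above reduces to a quick back-of-envelope calculator. A sketch, assuming (as the scenario does) that the shield TTL is at least the edge TTL so each cycle costs exactly one origin fetch:

```javascript
// Sketch of the shield arithmetic: N edges, each refreshing once per TTL window.
function originRequestsPerDay({ edges, ttlSeconds, withShield }) {
  const cyclesPerDay = 86_400 / ttlSeconds; // TTL refresh cycles per day
  // Without a shield, every edge independently fetches from origin each cycle.
  // With a shield, the edges' misses collapse to one origin fetch per cycle.
  return withShield ? cyclesPerDay : edges * cyclesPerDay;
}

const base = { edges: 50, ttlSeconds: 300 };
console.log(originRequestsPerDay({ ...base, withShield: false })); // 14400
console.log(originRequestsPerDay({ ...base, withShield: true }));  // 288
```

The reduction factor is simply the edge count, which is why shields matter more as your CDN footprint grows.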
Place your origin shield close to your origin server to minimize shield-to-origin latency. If your origin is in US-East, your shield should be in US-East. The shield adds a hop, so minimizing shield-to-origin distance is critical.
Cache warming pre-populates caches before user requests arrive, eliminating cold misses for important content. It's particularly valuable for new content launches, post-purge recovery, and edge location expansion.
```javascript
// Cache warming service
class CacheWarmer {
  constructor(cdnEdges, concurrency = 10) {
    this.cdnEdges = cdnEdges; // List of edge endpoints
    this.concurrency = concurrency;
  }

  // Warm a single URL across all edges
  async warmUrl(url, options = {}) {
    const { priority = 'normal', edges = this.cdnEdges } = options;

    const warmRequests = edges.map(edge => ({ url, edge, priority }));
    return this.executeBatch(warmRequests);
  }

  // Warm multiple URLs efficiently
  async warmUrls(urls, options = {}) {
    const { priority = 'normal', edges = this.cdnEdges, shieldOnly = false } = options;

    // If shield-only, warm just the shield layer
    const targetEdges = shieldOnly ? [this.getShieldEndpoint()] : edges;

    const allRequests = urls.flatMap(url =>
      targetEdges.map(edge => ({ url, edge, priority }))
    );
    return this.executeBatch(allRequests);
  }

  // Run warm requests with bounded concurrency
  async executeBatch(requests) {
    const results = [];
    const executing = [];

    for (const req of requests) {
      const promise = this.warmSingle(req).then(result => {
        results.push(result);
        executing.splice(executing.indexOf(promise), 1);
      });
      executing.push(promise);

      if (executing.length >= this.concurrency) {
        await Promise.race(executing);
      }
    }

    await Promise.all(executing);
    return results;
  }

  async warmSingle({ url, edge }) {
    const start = Date.now();
    try {
      const response = await fetch(url, {
        headers: {
          'X-Cache-Warm': 'true',
          'X-Edge-Target': edge, // Route to a specific edge if supported
        },
        // Don't follow redirects - we want to cache the redirect itself
        redirect: 'manual',
      });

      return {
        url,
        edge,
        status: response.status,
        cached: response.headers.get('X-Cache') === 'HIT',
        duration: Date.now() - start,
        success: true,
      };
    } catch (error) {
      return {
        url,
        edge,
        error: error.message,
        duration: Date.now() - start,
        success: false,
      };
    }
  }

  getShieldEndpoint() {
    // Return the shield endpoint for shield-only warming
    return this.cdnEdges.find(e => e.isShield) || this.cdnEdges[0];
  }
}

// Usage example: warm after content publish
async function onContentPublish(content) {
  const warmer = new CacheWarmer(CDN_EDGES);

  const urlsToWarm = [
    content.url,
    content.apiUrl,
    ...content.imageUrls,
  ];

  // Warm the shield first (protects origin)
  await warmer.warmUrls(urlsToWarm, { shieldOnly: true });

  // Then warm critical edges
  await warmer.warmUrls(urlsToWarm, {
    edges: CRITICAL_EDGES, // US, EU data centers
  });

  console.log('Content warmed successfully');
}
```

Cache warming has costs: origin load during warming, cache storage for warmed content, and potential eviction of actually-requested content. Warm strategically—focus on high-traffic, high-value content that justifies the origin cost.
Cache hit ratio optimization isn't a one-time effort—it requires continuous monitoring, analysis, and refinement. Building the right observability ensures you catch regressions early and identify new optimization opportunities.
| Metric | Granularity | Alert Threshold | Action on Alert |
|---|---|---|---|
| Overall CHR | 5 minutes | < 90% | Investigate miss sources |
| CHR by content type | 5 minutes | Varies by type | Review specific type's config |
| CHR by edge location | 15 minutes | 10% variance | Check edge-specific issues |
| Cache fragmentation ratio | 1 hour | 5% | Audit cache key parameters |
| TTL expiration rate | 5 minutes | Spike > 2x baseline | Review TTL strategy |
| Origin request rate | 1 minute | 2x expected | Possible cache bypass |
| Cache warmth ratio | 15 minutes | < 80% | Increase TTLs or add warming |
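The first two alert rules in the table can be prototyped in a few lines of log processing. This sketch assumes log entries with illustrative `contentType` and `cacheStatus` fields and a single global threshold:

```javascript
// Sketch: evaluate a window of CDN log entries against a CHR alert threshold.
// Field names (contentType, cacheStatus) are illustrative, not a CDN log schema.
function chrByContentType(entries) {
  const stats = new Map();
  for (const { contentType, cacheStatus } of entries) {
    const s = stats.get(contentType) || { hits: 0, total: 0 };
    s.total += 1;
    if (cacheStatus === 'HIT') s.hits += 1;
    stats.set(contentType, s);
  }
  return stats;
}

// Return the content types whose CHR falls below the threshold.
function chrAlerts(entries, thresholdPct = 90) {
  const alerts = [];
  for (const [type, { hits, total }] of chrByContentType(entries)) {
    const chr = (hits / total) * 100;
    if (chr < thresholdPct) alerts.push({ type, chr });
  }
  return alerts;
}

const sampleWindow = [
  { contentType: 'image', cacheStatus: 'HIT' },
  { contentType: 'image', cacheStatus: 'HIT' },
  { contentType: 'html', cacheStatus: 'MISS' },
  { contentType: 'html', cacheStatus: 'HIT' },
];
console.log(chrAlerts(sampleWindow)); // html at 50% trips the 90% threshold
```

In production you would run this per 5-minute window with per-type thresholds from the table, but the shape of the pipeline is the same.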
```sql
-- Overall CHR by hour
SELECT
  DATE_TRUNC('hour', timestamp) AS hour,
  COUNT(*) AS total_requests,
  SUM(CASE WHEN cache_status = 'HIT' THEN 1 ELSE 0 END) AS hits,
  SUM(CASE WHEN cache_status = 'HIT' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS chr
FROM cdn_logs
WHERE timestamp > NOW() - INTERVAL '24 hours'
GROUP BY 1
ORDER BY 1;

-- CHR by content type
SELECT
  content_type,
  COUNT(*) AS requests,
  SUM(CASE WHEN cache_status = 'HIT' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS chr
FROM cdn_logs
WHERE timestamp > NOW() - INTERVAL '1 hour'
GROUP BY 1
ORDER BY requests DESC;

-- Cache fragmentation analysis (same path, different cache keys)
SELECT
  uri_path,
  COUNT(DISTINCT cache_key) AS unique_cache_keys,
  COUNT(*) AS total_requests,
  COUNT(DISTINCT cache_key) * 100.0 / COUNT(*) AS fragmentation_ratio
FROM cdn_logs
WHERE timestamp > NOW() - INTERVAL '1 hour'
  AND cache_status = 'MISS'
GROUP BY 1
HAVING COUNT(DISTINCT cache_key) > 5
ORDER BY unique_cache_keys DESC
LIMIT 20;

-- Top cache miss URLs
SELECT
  uri_path,
  COUNT(*) AS misses,
  AVG(origin_response_time_ms) AS avg_origin_time
FROM cdn_logs
WHERE timestamp > NOW() - INTERVAL '1 hour'
  AND cache_status = 'MISS'
GROUP BY 1
ORDER BY misses DESC
LIMIT 50;

-- Query parameter impact on CHR
SELECT
  REGEXP_REPLACE(uri_path || query_string,
                 '(utm_[^&=]+=[^&]*|fbclid=[^&]*)', '', 'g') AS normalized_url,
  COUNT(*) AS requests,
  COUNT(DISTINCT (uri_path || query_string)) AS unique_urls,
  SUM(CASE WHEN cache_status = 'HIT' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS chr
FROM cdn_logs
WHERE timestamp > NOW() - INTERVAL '1 hour'
GROUP BY 1
HAVING COUNT(DISTINCT (uri_path || query_string)) > 10
ORDER BY unique_urls DESC;
```

Establish a CHR target based on your content mix (e.g., 95% for mostly static sites, 85% for dynamic sites) and track it as a key SLA. Review weekly, investigate any drops immediately, and continuously push toward higher efficiency.
Cache hit ratio is the ultimate measure of CDN effectiveness. Optimizing CHR is a multi-faceted effort spanning cache key design, TTL configuration, architecture choices, and ongoing monitoring.
Module Complete:
You've now mastered CDN caching mechanics—from cache keys to TTL configuration, Cache-Control headers, stale-while-revalidate, and cache hit ratio optimization. These concepts form the operational foundation for effective content delivery. Apply them systematically, measure relentlessly, and your CDN will become a powerful force multiplier for application performance.
You now have comprehensive knowledge of CDN caching mechanics—the internal workings that determine whether your CDN delivers millisecond responses or merely forwards traffic to your origin. Use this knowledge to achieve 95%+ cache hit ratios and transform your application's performance characteristics.