Every CDN conversation eventually leads to a single question: "What's your cache hit ratio?"
The cache hit ratio (CHR), also called cache hit rate, is the percentage of requests that are served from cache rather than requiring a trip to the origin server. It's the fundamental measure of CDN effectiveness—the higher your CHR, the more value you're extracting from your CDN investment.
A high cache hit ratio means faster responses for users, less load on your origin servers, and lower bandwidth costs.
This page provides a systematic approach to measuring, analyzing, and optimizing cache hit ratios—transforming your CDN from a passive passthrough into a powerful performance multiplier.
By the end of this page, you will understand how to accurately measure cache hit ratio, identify common causes of cache misses, implement strategies to improve CHR across different content types, and set realistic targets based on your application's characteristics.
Cache hit ratio is conceptually simple but operationally nuanced. The basic formula is:
Cache Hit Ratio = (Cache Hits) / (Cache Hits + Cache Misses) × 100%
However, the definition of "hit" and "miss" varies by context, and a single CHR number can hide important details.
| Metric | Definition | What It Measures | Typical Target |
|---|---|---|---|
| Request CHR | % of requests served from cache | Efficiency at reducing origin request volume | 85-99% |
| Bandwidth CHR | % of bytes served from cache | Efficiency at reducing origin bandwidth | 90-99% |
| Edge CHR | Hits at edge servers only | Edge cache effectiveness | 70-95% |
| Origin Shield CHR | Hits at shield layer | Shield protection effectiveness | 85-99% |
| Total CHR | Combined across all layers | Overall CDN effectiveness | 95-99.9% |
```
SCENARIO: CDN Traffic for 1 hour

Total Requests: 1,000,000
Cache Hits:       920,000
Cache Misses:      80,000

REQUEST CHR = 920,000 / 1,000,000 = 92%

───────────────────────────────────────────────────────────────

BANDWIDTH-WEIGHTED SCENARIO:

Cache Hits:   920,000 requests × avg  50KB = 46GB
Cache Misses:  80,000 requests × avg 500KB = 40GB (larger objects)

BANDWIDTH CHR = 46GB / 86GB = 53%

INSIGHT: Low bandwidth CHR despite high request CHR indicates
large objects (videos, downloads) aren't being cached

───────────────────────────────────────────────────────────────

MULTI-LAYER SCENARIO:

Edge Requests: 1,000,000
├─ Edge HIT: 700,000 (70% edge CHR)
└─ Edge MISS → Shield: 300,000
   ├─ Shield HIT: 260,000 (87% shield CHR)
   └─ Shield MISS → Origin: 40,000

TOTAL CHR = (700,000 + 260,000) / 1,000,000 = 96%
ORIGIN OFFLOAD = 1 - (40,000 / 1,000,000) = 96%

INSIGHT: Multi-layer caching dramatically increases effective CHR
```

Choose your CHR metric based on your goals. If reducing origin request volume is the priority, track request CHR. If bandwidth costs dominate, track bandwidth CHR. Most teams should monitor both, as they can diverge significantly.
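The arithmetic in the scenario above is easy to script. A minimal sketch (the function names are illustrative, not from any CDN SDK):

```javascript
// Request CHR: fraction of requests served from cache.
function requestChr(hits, misses) {
  return (hits / (hits + misses)) * 100;
}

// Bandwidth CHR: fraction of bytes served from cache.
function bandwidthChr(hitBytes, missBytes) {
  return (hitBytes / (hitBytes + missBytes)) * 100;
}

// Scenario above: 920k hits averaging 50KB, 80k misses averaging 500KB.
const reqChr = requestChr(920_000, 80_000);              // 92%
const bwChr = bandwidthChr(920_000 * 50, 80_000 * 500);  // ≈53.5%
console.log(reqChr.toFixed(1), bwChr.toFixed(1));
```

The gap between the two numbers is the signal: here the large uncached objects drag bandwidth CHR nearly 40 points below request CHR.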
Before optimizing, you must understand why cache misses occur. Each miss represents an opportunity—either expected (first request for new content) or avoidable (misconfiguration, poor cache key design).
```
CACHE MISS BREAKDOWN ANALYSIS

Total Requests: 1,000,000
Cache Misses:      80,000 (8% miss rate)

MISS CATEGORY BREAKDOWN:
═══════════════════════════════════════════════════════════

Cold Misses (First Request): 12,000 (15%)
├─ New content published: 3,000
├─ Long-tail pages accessed: 5,000
├─ New users triggering unique URLs: 4,000
└─ STATUS: Expected, minimize with cache warming

Capacity Misses (Eviction): 5,000 (6%)
├─ Low-traffic pages evicted: 4,000
├─ Large objects evicted: 1,000
└─ STATUS: Consider cache sizing review

Expiration Misses (TTL): 18,000 (23%)
├─ Short TTL content: 12,000
├─ Moderately-trafficked pages: 6,000
└─ STATUS: Review TTL strategy, implement SWR

Fragmentation Misses (Key Variance): 25,000 (31%) ⚠️
├─ UTM parameter variations: 15,000
├─ Tracking pixel parameters: 7,000
├─ Social media click IDs: 3,000
└─ STATUS: CRITICAL - Fix cache key configuration

Bypass Misses (Intentional): 15,000 (19%)
├─ Authenticated requests: 10,000
├─ POST/PUT/DELETE methods: 3,000
├─ Cache-Control: no-cache: 2,000
└─ STATUS: Expected for personalized content

Configuration Misses (Errors): 5,000 (6%)
├─ Missing Cache-Control header: 3,000
├─ Private content marked public: 1,000
├─ Uncacheable status codes: 1,000
└─ STATUS: Fix origin header configuration

═══════════════════════════════════════════════════════════

OPTIMIZATION PRIORITY:
1. Fix fragmentation (31% of misses are avoidable)
2. Review expiration/TTL strategy (23% could be reduced)
3. Fix configuration issues (6% quick wins)
```

In this analysis, 31% of cache misses were fragmentation—the same content cached multiple times due to query parameter variations. This is extremely common and often the single biggest CHR improvement opportunity. Audit your cache key configuration before any other optimization.
Cache key optimization is typically the highest-impact CHR improvement. A well-configured cache key eliminates fragmentation while preserving necessary content variations.
Three normalizations cover most fragmentation:

- Strip `utm_*`, `fbclid`, `gclid`, `mc_*`, and other tracking parameters from the cache key while preserving them for analytics.
- Sort query parameters so `/page?a=1&b=2` and `/page?b=2&a=1` produce the same cache key.
- Normalize case so `/Page.HTML` and `/page.html` map to the same cache key (usually safe, but verify your origin treats paths case-insensitively).
```javascript
// Cloudflare Worker: comprehensive cache key optimization
addEventListener('fetch', event => {
  // Pass the whole event through so handleRequest can call event.waitUntil()
  event.respondWith(handleRequest(event));
});

async function handleRequest(event) {
  const request = event.request;

  // Create a normalized cache key URL
  const cacheKeyUrl = new URL(request.url);

  // 1. Strip analytics/tracking parameters
  const paramsToStrip = [
    'utm_source', 'utm_medium', 'utm_campaign', 'utm_term', 'utm_content',
    'fbclid', 'gclid', 'gclsrc', 'dclid',
    'mc_eid', 'mc_cid', '_ga', '_gl',
    'ref', 'source',
    'hsCtaTracking', 'hsmi', 'hsa_*',
  ];

  paramsToStrip.forEach(param => {
    if (param.endsWith('*')) {
      // Wildcard matching
      const prefix = param.slice(0, -1);
      for (const key of [...cacheKeyUrl.searchParams.keys()]) {
        if (key.startsWith(prefix)) {
          cacheKeyUrl.searchParams.delete(key);
        }
      }
    } else {
      cacheKeyUrl.searchParams.delete(param);
    }
  });

  // 2. Sort remaining query parameters
  const sortedParams = new URLSearchParams(
    [...cacheKeyUrl.searchParams.entries()].sort((a, b) => a[0].localeCompare(b[0]))
  );
  cacheKeyUrl.search = sortedParams.toString();

  // 3. Normalize path (lowercase, remove trailing slash)
  cacheKeyUrl.pathname = cacheKeyUrl.pathname.toLowerCase().replace(/\/$/, '') || '/';

  // 4. Add device class (not the full User-Agent)
  const deviceClass = getDeviceClass(request.headers.get('User-Agent'));
  if (deviceClass !== 'desktop') {
    cacheKeyUrl.searchParams.set('_device', deviceClass);
  }

  // 5. Add normalized Accept-Language (if content varies by language)
  const language = normalizeLanguage(request.headers.get('Accept-Language'));
  if (language !== 'en') {
    cacheKeyUrl.searchParams.set('_lang', language);
  }

  // Create the cache key request
  const cacheKey = new Request(cacheKeyUrl.toString(), request);

  // Check cache
  const cache = caches.default;
  let response = await cache.match(cacheKey);

  if (response) {
    // Add headers indicating a cache hit
    response = new Response(response.body, response);
    response.headers.set('X-Cache', 'HIT');
    response.headers.set('X-Cache-Key', cacheKeyUrl.pathname + cacheKeyUrl.search);
    return response;
  }

  // Cache miss - fetch from origin (using the original request for analytics)
  response = await fetch(request);

  // Clone and cache without blocking the response
  const responseToCache = response.clone();
  event.waitUntil(cache.put(cacheKey, responseToCache));

  // Return the response with a cache miss header
  const modifiedResponse = new Response(response.body, response);
  modifiedResponse.headers.set('X-Cache', 'MISS');
  modifiedResponse.headers.set('X-Cache-Key', cacheKeyUrl.pathname + cacheKeyUrl.search);
  return modifiedResponse;
}

function getDeviceClass(userAgent) {
  if (!userAgent) return 'desktop';
  const ua = userAgent.toLowerCase();
  if (/mobile|android|iphone|ipod|blackberry|iemobile/i.test(ua)) return 'mobile';
  if (/tablet|ipad|playbook|silk/i.test(ua)) return 'tablet';
  return 'desktop';
}

function normalizeLanguage(acceptLanguage) {
  if (!acceptLanguage) return 'en';
  const primary = acceptLanguage.split(',')[0].split('-')[0].toLowerCase();
  const supported = ['en', 'es', 'fr', 'de', 'ja', 'zh'];
  return supported.includes(primary) ? primary : 'en';
}
```

Before implementing cache key changes, measure your current fragmentation. Add a header that echoes the cache key and analyze how many unique cache keys serve identical content. After optimization, measure again to quantify the improvement.
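You can also estimate fragmentation offline from access logs before touching production. A sketch, assuming you have raw request URLs available; the parameter list here is a shortened, illustrative subset of the one in the worker:

```javascript
// Sketch: estimate cache-key fragmentation from a sample of logged URLs.
// The parameter list is an illustrative subset, not exhaustive.
const TRACKING_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign', 'fbclid', 'gclid'];

function normalizeUrl(rawUrl) {
  const url = new URL(rawUrl);
  TRACKING_PARAMS.forEach(p => url.searchParams.delete(p));
  url.searchParams.sort();                      // order-independent keys
  url.pathname = url.pathname.toLowerCase();    // case-insensitive paths
  return url.pathname + url.search;
}

// Fragmentation ratio: raw unique URLs per unique normalized cache key.
// Anything well above 1.0 means the same content is cached many times.
function fragmentationRatio(loggedUrls) {
  const raw = new Set(loggedUrls.map(u => {
    const x = new URL(u);
    return x.pathname + x.search;
  }));
  const normalized = new Set(loggedUrls.map(normalizeUrl));
  return raw.size / normalized.size;
}

const sample = [
  'https://example.com/page?a=1&utm_source=mail',
  'https://example.com/page?utm_source=ads&a=1',
  'https://example.com/Page?a=1',
];
console.log(fragmentationRatio(sample)); // 3 raw variants, 1 cache key
```

Run this over an hour of logs grouped by path and the worst-fragmented paths become your cache key fix list.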
TTL configuration directly impacts cache hit ratio. Longer TTLs mean content stays in cache longer, increasing the likelihood of hits. But the relationship isn't simply "longer = better"—optimal TTLs balance hit ratio with content freshness requirements.
```
TTL IMPACT ANALYSIS

Scenario: Product page receiving 60 requests/minute
Average origin response time: 500ms
Content changes: ~3 times per day

═══════════════════════════════════════════════════════════════

TTL = 60 seconds (1 minute)
────────────────────────────────────────────────────────────
- Every 60 seconds: 1 miss, 59 hits
- CHR = 59/60 = 98.3%
- Origin requests = 1,440/day
- Freshness: At most 1 minute stale
- Risk: Low (content usually fresh)

TTL = 300 seconds (5 minutes)
────────────────────────────────────────────────────────────
- Every 300 seconds: 1 miss, 299 hits
- CHR = 299/300 = 99.7%
- Origin requests = 288/day
- Freshness: At most 5 minutes stale
- Risk: Moderate (may need purge for critical updates)

TTL = 3600 seconds (1 hour) with SWR
────────────────────────────────────────────────────────────
- Every 3600 seconds: ~1 miss (SWR keeps content fresh)
- CHR = 99.97%
- Origin requests = 24/day
- Freshness: Usually fresh (SWR refreshes under traffic)
- Risk: Requires reliable purge for price changes

TTL = 86400 seconds (1 day) with Purge
────────────────────────────────────────────────────────────
- Origin requests = 3 (on purge) + 1 per cold start
- CHR = 99.99% under normal traffic
- Freshness: Always fresh (purge on change)
- Risk: Depends entirely on purge reliability

═══════════════════════════════════════════════════════════════

OPTIMAL STRATEGY for this scenario:

Cache-Control: public, max-age=300, s-maxage=3600,
               stale-while-revalidate=86400, stale-if-error=86400

- Edge caches 1 hour
- Browser caches 5 minutes
- Stale serving for 24 hours (protection)
- Purge on content change
```

With stale-while-revalidate, you can use short max-age values (for staleness control) with very long SWR windows (for hit ratio). This combination ensures content stays relatively fresh while maximizing cache efficiency.
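The per-TTL figures above follow from a simple model: one miss per TTL window under steady traffic (the 59/60 arithmetic corresponds to one request per second), ignoring SWR and eviction. A sketch:

```javascript
// Sketch: expected CHR for a steadily requested object.
// Model: exactly one miss per TTL window; ignores SWR and eviction.
function estimatedChr(ttlSeconds, requestsPerSecond) {
  const requestsPerWindow = ttlSeconds * requestsPerSecond;
  return ((requestsPerWindow - 1) / requestsPerWindow) * 100;
}

// One origin refresh per TTL window across the day.
function dailyOriginRequests(ttlSeconds) {
  return Math.ceil(86_400 / ttlSeconds);
}

console.log(estimatedChr(60, 1).toFixed(1));   // 98.3 (matches 59/60 above)
console.log(estimatedChr(3600, 1).toFixed(2)); // 99.97
console.log(dailyOriginRequests(60));          // 1440
console.log(dailyOriginRequests(3600));        // 24
```

The model makes the diminishing returns visible: going from a 1-minute to a 1-hour TTL moves CHR by under two points, but cuts daily origin requests by 60×.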
Single-layer edge caching has inherent hit ratio limitations. When content is requested from multiple edge locations, each edge has its own cache, leading to redundant origin requests. Origin shield and multi-layer caching collapse these redundant requests.
```
SCENARIO: Content requested from 50 edge locations
TTL = 300 seconds

═══════════════════════════════════════════════════════════════
SINGLE-LAYER CACHING (Edge Only)
═══════════════════════════════════════════════════════════════

Content published → Content requested at all 50 edges

Edge 1  → Cache miss → Origin (500ms)
Edge 2  → Cache miss → Origin (500ms)
Edge 3  → Cache miss → Origin (500ms)
...
Edge 50 → Cache miss → Origin (500ms)

Result: 50 origin requests for the same content
Total origin load: 50 × origin_cost

After 5 minutes, the cycle repeats.
Daily origin requests: 50 locations × 288 cycles = 14,400

═══════════════════════════════════════════════════════════════
MULTI-LAYER CACHING (Edge + Origin Shield)
═══════════════════════════════════════════════════════════════

Content published → Content requested at all 50 edges

Edge 1  → Cache miss → Shield → Cache miss → Origin (500ms)
Edge 2  → Cache miss → Shield → Cache HIT (10ms)
Edge 3  → Cache miss → Shield → Cache HIT (10ms)
...
Edge 50 → Cache miss → Shield → Cache HIT (10ms)

Result: 1 origin request (49 served by the shield)
Total origin load: 1 × origin_cost

After 5 minutes:
- First edge to expire → Shield HIT (if shield still fresh)
- Shield TTL can be longer than edge TTL

Daily origin requests: 288 cycles (vs 14,400)
IMPROVEMENT: 50× reduction in origin load
```

| Benefit | Without Shield | With Shield | Improvement |
|---|---|---|---|
| Origin requests (first access) | N edges × 1 | 1 | N× reduction |
| Origin requests (TTL refresh) | N edges × refreshes | 1 × refreshes | N× reduction |
| Origin bandwidth | Full response × N | Full response × 1 | N× reduction |
| Effective CHR | 70-85% | 95-99% | +10-30% |
| Latency for shield hits | Origin RTT (100-500ms) | Shield RTT (20-50ms) | 5-10× faster |
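The shield arithmetic above reduces to a quick back-of-envelope calculator. A sketch, assuming (as the scenario does) that the shield TTL is at least the edge TTL so each cycle costs exactly one origin fetch:

```javascript
// Sketch of the shield arithmetic: N edges, each refreshing once per TTL window.
function originRequestsPerDay({ edges, ttlSeconds, withShield }) {
  const cyclesPerDay = 86_400 / ttlSeconds; // TTL refresh cycles per day
  // Without a shield, every edge independently fetches from origin each cycle.
  // With a shield, the edges' misses collapse to one origin fetch per cycle.
  return withShield ? cyclesPerDay : edges * cyclesPerDay;
}

const base = { edges: 50, ttlSeconds: 300 };
console.log(originRequestsPerDay({ ...base, withShield: false })); // 14400
console.log(originRequestsPerDay({ ...base, withShield: true }));  // 288
```

The reduction factor is simply the edge count, which is why shields matter more as your CDN footprint grows.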
Place your origin shield close to your origin server to minimize shield-to-origin latency. If your origin is in US-East, your shield should be in US-East. The shield adds a hop, so minimizing shield-to-origin distance is critical.
Cache warming pre-populates caches before user requests arrive, eliminating cold misses for important content. It's particularly valuable for new content launches, post-purge recovery, and edge location expansion.
```javascript
// Cache warming service
class CacheWarmer {
  constructor(cdnEdges, concurrency = 10) {
    this.cdnEdges = cdnEdges; // List of edge endpoints
    this.concurrency = concurrency;
  }

  // Warm a single URL across all edges
  async warmUrl(url, options = {}) {
    const { priority = 'normal', edges = this.cdnEdges } = options;

    const warmRequests = edges.map(edge => ({ url, edge, priority }));
    return this.executeBatch(warmRequests);
  }

  // Warm multiple URLs efficiently
  async warmUrls(urls, options = {}) {
    const { priority = 'normal', edges = this.cdnEdges, shieldOnly = false } = options;

    // If shield-only, warm just the shield layer
    const targetEdges = shieldOnly ? [this.getShieldEndpoint()] : edges;

    const allRequests = urls.flatMap(url =>
      targetEdges.map(edge => ({ url, edge, priority }))
    );
    return this.executeBatch(allRequests);
  }

  // Run warm requests with bounded concurrency
  async executeBatch(requests) {
    const results = [];
    const executing = [];

    for (const req of requests) {
      const promise = this.warmSingle(req).then(result => {
        results.push(result);
        executing.splice(executing.indexOf(promise), 1);
      });
      executing.push(promise);

      if (executing.length >= this.concurrency) {
        await Promise.race(executing);
      }
    }

    await Promise.all(executing);
    return results;
  }

  async warmSingle({ url, edge }) {
    const start = Date.now();
    try {
      const response = await fetch(url, {
        headers: {
          'X-Cache-Warm': 'true',
          'X-Edge-Target': edge, // Route to a specific edge if supported
        },
        // Don't follow redirects - we want to cache the redirect itself
        redirect: 'manual',
      });

      return {
        url,
        edge,
        status: response.status,
        cached: response.headers.get('X-Cache') === 'HIT',
        duration: Date.now() - start,
        success: true,
      };
    } catch (error) {
      return {
        url,
        edge,
        error: error.message,
        duration: Date.now() - start,
        success: false,
      };
    }
  }

  getShieldEndpoint() {
    // Return the shield endpoint for shield-only warming
    return this.cdnEdges.find(e => e.isShield) || this.cdnEdges[0];
  }
}

// Usage example: warm after content publish
async function onContentPublish(content) {
  const warmer = new CacheWarmer(CDN_EDGES);

  const urlsToWarm = [
    content.url,
    content.apiUrl,
    ...content.imageUrls,
  ];

  // Warm the shield first (protects origin)
  await warmer.warmUrls(urlsToWarm, { shieldOnly: true });

  // Then warm critical edges
  await warmer.warmUrls(urlsToWarm, {
    edges: CRITICAL_EDGES, // US, EU data centers
  });

  console.log('Content warmed successfully');
}
```

Cache warming has costs: origin load during warming, cache storage for warmed content, and potential eviction of actually-requested content. Warm strategically—focus on high-traffic, high-value content that justifies the origin cost.
Cache hit ratio optimization isn't a one-time effort—it requires continuous monitoring, analysis, and refinement. Building the right observability ensures you catch regressions early and identify new optimization opportunities.
| Metric | Granularity | Alert Threshold | Action on Alert |
|---|---|---|---|
| Overall CHR | 5 minutes | < 90% | Investigate miss sources |
| CHR by content type | 5 minutes | Varies by type | Review specific type's config |
| CHR by edge location | 15 minutes | 10% variance | Check edge-specific issues |
| Cache fragmentation ratio | 1 hour | 5% | Audit cache key parameters |
| TTL expiration rate | 5 minutes | Spike > 2x baseline | Review TTL strategy |
| Origin request rate | 1 minute | 2x expected | Possible cache bypass |
| Cache warmth ratio | 15 minutes | < 80% | Increase TTLs or add warming |
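The first two alert rules in the table can be prototyped in a few lines of log processing. This sketch assumes log entries with illustrative `contentType` and `cacheStatus` fields and a single global threshold:

```javascript
// Sketch: evaluate a window of CDN log entries against a CHR alert threshold.
// Field names (contentType, cacheStatus) are illustrative, not a CDN log schema.
function chrByContentType(entries) {
  const stats = new Map();
  for (const { contentType, cacheStatus } of entries) {
    const s = stats.get(contentType) || { hits: 0, total: 0 };
    s.total += 1;
    if (cacheStatus === 'HIT') s.hits += 1;
    stats.set(contentType, s);
  }
  return stats;
}

// Return the content types whose CHR falls below the threshold.
function chrAlerts(entries, thresholdPct = 90) {
  const alerts = [];
  for (const [type, { hits, total }] of chrByContentType(entries)) {
    const chr = (hits / total) * 100;
    if (chr < thresholdPct) alerts.push({ type, chr });
  }
  return alerts;
}

const sampleWindow = [
  { contentType: 'image', cacheStatus: 'HIT' },
  { contentType: 'image', cacheStatus: 'HIT' },
  { contentType: 'html', cacheStatus: 'MISS' },
  { contentType: 'html', cacheStatus: 'HIT' },
];
console.log(chrAlerts(sampleWindow)); // html at 50% trips the 90% threshold
```

In production you would run this per 5-minute window with per-type thresholds from the table, but the shape of the pipeline is the same.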
```sql
-- Overall CHR by hour
SELECT
  DATE_TRUNC('hour', timestamp) AS hour,
  COUNT(*) AS total_requests,
  SUM(CASE WHEN cache_status = 'HIT' THEN 1 ELSE 0 END) AS hits,
  SUM(CASE WHEN cache_status = 'HIT' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS chr
FROM cdn_logs
WHERE timestamp > NOW() - INTERVAL '24 hours'
GROUP BY 1
ORDER BY 1;

-- CHR by content type
SELECT
  content_type,
  COUNT(*) AS requests,
  SUM(CASE WHEN cache_status = 'HIT' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS chr
FROM cdn_logs
WHERE timestamp > NOW() - INTERVAL '1 hour'
GROUP BY 1
ORDER BY requests DESC;

-- Cache fragmentation analysis (same path, different cache keys)
SELECT
  uri_path,
  COUNT(DISTINCT cache_key) AS unique_cache_keys,
  COUNT(*) AS total_requests,
  COUNT(DISTINCT cache_key) * 100.0 / COUNT(*) AS fragmentation_ratio
FROM cdn_logs
WHERE timestamp > NOW() - INTERVAL '1 hour'
  AND cache_status = 'MISS'
GROUP BY 1
HAVING COUNT(DISTINCT cache_key) > 5
ORDER BY unique_cache_keys DESC
LIMIT 20;

-- Top cache miss URLs
SELECT
  uri_path,
  COUNT(*) AS misses,
  AVG(origin_response_time_ms) AS avg_origin_time
FROM cdn_logs
WHERE timestamp > NOW() - INTERVAL '1 hour'
  AND cache_status = 'MISS'
GROUP BY 1
ORDER BY misses DESC
LIMIT 50;

-- Query parameter impact on CHR
SELECT
  REGEXP_REPLACE(uri_path || query_string,
                 '(utm_[^&=]+=[^&]*|fbclid=[^&]*)', '', 'g') AS normalized_url,
  COUNT(*) AS requests,
  COUNT(DISTINCT (uri_path || query_string)) AS unique_urls,
  SUM(CASE WHEN cache_status = 'HIT' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS chr
FROM cdn_logs
WHERE timestamp > NOW() - INTERVAL '1 hour'
GROUP BY 1
HAVING COUNT(DISTINCT (uri_path || query_string)) > 10
ORDER BY unique_urls DESC;
```

Establish a CHR target based on your content mix (e.g., 95% for mostly static sites, 85% for dynamic sites) and track it as a key SLA. Review weekly, investigate any drops immediately, and continuously push toward higher efficiency.
Cache hit ratio is the ultimate measure of CDN effectiveness. Optimizing CHR is a multi-faceted effort spanning cache key design, TTL configuration, architecture choices, and ongoing monitoring.
Module Complete:
You've now mastered CDN caching mechanics—from cache keys to TTL configuration, Cache-Control headers, stale-while-revalidate, and cache hit ratio optimization. These concepts form the operational foundation for effective content delivery. Apply them systematically, measure relentlessly, and your CDN will become a powerful force multiplier for application performance.
You now have comprehensive knowledge of CDN caching mechanics—the internal workings that determine whether your CDN delivers millisecond responses or merely forwards traffic to your origin. Use this knowledge to achieve 95%+ cache hit ratios and transform your application's performance characteristics.