Designing YouTube: A Video Platform at Planetary Scale

Level: Advanced

Duration: 180 mins

Topic: YouTube Video Platform

CDN Integration

Global Delivery: Bringing Video to the Edge

A video platform can have the most efficient transcoding pipeline and the most sophisticated ABR algorithm, but if content is served from a single data center, users worldwide will experience unacceptable latency. The speed of light imposes a fundamental limit: New York and Tokyo are roughly 10,800 km apart, and light in optical fiber travels at about two-thirds of its vacuum speed, so a round trip takes at least about 100 ms before real-world routing adds significant overhead.
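As a rough check on that lower bound, the sketch below computes the propagation-only round-trip time. The 10,850 km great-circle distance and the two-thirds velocity factor for fiber are approximations assumed for illustration, and `minRoundTripMs` is a hypothetical helper, not part of any library.

```typescript
// Lower bound on round-trip time from signal propagation alone
// (no routing detours, queuing, or server processing).
const SPEED_OF_LIGHT_KM_PER_S = 299_792.458;

function minRoundTripMs(distanceKm: number, velocityFactor: number = 2 / 3): number {
  const speedKmPerS = SPEED_OF_LIGHT_KM_PER_S * velocityFactor;
  return ((2 * distanceKm) / speedKmPerS) * 1000;
}

// Assumed great-circle distance between New York and Tokyo.
const nyToTokyoKm = 10_850;

const fiberRttMs = minRoundTripMs(nyToTokyoKm);      // light in fiber (~2/3 c)
const vacuumRttMs = minRoundTripMs(nyToTokyoKm, 1);  // theoretical floor in vacuum

console.log(`fiber: ${fiberRttMs.toFixed(1)} ms, vacuum: ${vacuumRttMs.toFixed(1)} ms`);
```

Real cable routes are longer than the great-circle path, so measured round trips on this route are noticeably higher than either figure.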

Content Delivery Networks (CDNs) solve this by distributing content to edge servers worldwide, ensuring users always fetch video from a nearby location. At YouTube's scale, this means:

~1-2 exabytes of daily egress — more than any other service on the internet
Thousands of edge locations — from major metropolitan areas to developing regions
Cache hit ratios > 95% — most requests never hit origin storage
Sub-50ms latency — for cached content in well-served regions
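To put these figures in perspective, the following sketch converts daily egress into an average bit rate and estimates how much of it passes beyond the edge at a given cache hit ratio. The 1.5 EB/day midpoint and the helper names are assumptions for illustration; peak traffic would be well above the daily average.

```typescript
const BYTES_PER_EXABYTE = 1e18;
const SECONDS_PER_DAY = 86_400;

// Average delivery rate, in terabits per second, for a given daily egress volume.
function averageTbps(exabytesPerDay: number): number {
  const bitsPerSecond = (exabytesPerDay * BYTES_PER_EXABYTE * 8) / SECONDS_PER_DAY;
  return bitsPerSecond / 1e12;
}

// Traffic that misses the edge cache and must be fetched from a parent tier.
// Treats the hit ratio as byte-weighted, which is a simplification.
function trafficPastEdgeTbps(edgeTbps: number, cacheHitRatio: number): number {
  return edgeTbps * (1 - cacheHitRatio);
}

const edgeTbps = averageTbps(1.5);                      // midpoint of the 1-2 EB/day estimate
const missTbps = trafficPastEdgeTbps(edgeTbps, 0.95);   // 95% hit ratio

console.log(`edge: ${edgeTbps.toFixed(1)} Tbps, past edge: ${missTbps.toFixed(2)} Tbps`);
```

Even at a 95% hit ratio, the remaining 5% is several terabits per second, which is why the tiers behind the edge matter as much as the edge itself.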

What You Will Learn

By the end of this page, you will understand CDN architecture, caching strategies for video content, multi-CDN deployment patterns, cache invalidation, and cost optimization strategies. You'll be able to design a global video delivery infrastructure that balances performance, reliability, and cost.

CDN Architecture Overview

A CDN is a globally distributed network of servers that cache and serve content from locations close to end users. Understanding the architecture helps you choose caching strategies and debug performance issues.

CDN Architecture
┌─────────────────────────────────────────────────────────────────────────────────┐
│                              CDN ARCHITECTURE                                    │
└─────────────────────────────────────────────────────────────────────────────────┘
 
                               ORIGIN INFRASTRUCTURE
                        ┌─────────────────────────────────────┐
                        │            ORIGIN                   │
                        │  • Primary video storage (S3/GCS)   │
                        │  • Origin shield/mid-tier cache     │
                        │  • Manifest generation              │
                        │  • DRM license servers              │
                        └─────────────────────────────────────┘
                                          │
                    ┌─────────────────────┼─────────────────────┐
                    │                     │                     │
                    ▼                     ▼                     ▼
           ┌───────────────┐     ┌───────────────┐     ┌───────────────┐
           │  REGIONAL     │     │  REGIONAL     │     │  REGIONAL     │
           │  MID-TIER     │     │  MID-TIER     │     │  MID-TIER     │
           │  (US-WEST)    │     │  (EU)         │     │  (ASIA)       │
           │               │     │               │     │               │
           │  Cache: 50TB  │     │  Cache: 50TB  │     │  Cache: 50TB  │
           │  Aggregation  │     │  Aggregation  │     │  Aggregation  │
           └───────┬───────┘     └───────┬───────┘     └───────┬───────┘
                   │                     │                     │
    ┌──────────────┼──────────────┐     ...                   ...
    │              │              │
    ▼              ▼              ▼
┌─────────┐  ┌─────────┐  ┌─────────┐
│  EDGE   │  │  EDGE   │  │  EDGE   │     Thousands of edge locations
│  PoP    │  │  PoP    │  │  PoP    │     worldwide
│ (LA)    │  │ (SF)    │  │(Seattle)│
│         │  │         │  │         │     Each PoP:
│Cache:5TB│  │Cache:5TB│  │Cache:5TB│     • Multiple servers
│         │  │         │  │         │     • Local cache (SSD/RAM)
└────┬────┘  └────┬────┘  └────┬────┘     • Direct user serving
     │            │            │
     └────────────┼────────────┘
                  │ DNS-based traffic steering
                  │ Users directed to nearest PoP
                  ▼
          ┌───────────────┐
          │    USERS      │
          │  Worldwide    │
          └───────────────┘
 
 
                        REQUEST FLOW
                        ════════════
                        
User Request ──▶ Edge PoP ──┐
                    │       │
              Cache HIT?    │
                    │       │
         ┌──────YES─┴──NO───┴───┐
         │                      │
         ▼                      ▼
  Serve from            Request from Parent
  Edge Cache            (Mid-tier or Origin)
                              │
                        Cache at Edge
                              │
                              ▼
                        Serve to User

CDN Hierarchy Layers

•Origin — Your primary storage (S3, GCS, or custom). The source of truth for all content. Only accessed when content isn't cached anywhere in the CDN.
•Origin Shield / Mid-Tier — Regional caching layer between edge and origin. Aggregates cache misses from multiple edge PoPs to reduce origin load. One origin request serves multiple edges.
•Edge PoPs — Servers deployed in hundreds to thousands of locations worldwide. Direct user-facing layer. Optimized for low latency and high throughput.
•Last Mile — ISP integration (like Netflix Open Connect). Physical servers inside ISP networks for ultimate proximity. Not common for general platforms.
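The layered lookup described above can be sketched as a chain of caches, each falling back to its parent on a miss. This is a minimal in-memory model under stated assumptions: `CacheTier`, `Fetcher`, and the sample key are hypothetical names, and there is no eviction, TTL handling, or concurrency.

```typescript
// A parent is anything that can return content for a key (another tier or the origin).
type Fetcher = (key: string) => string | undefined;

class CacheTier {
  readonly name: string;
  hits = 0;
  misses = 0;
  private store = new Map<string, string>();
  private parent: Fetcher;

  constructor(name: string, parent: Fetcher) {
    this.name = name;
    this.parent = parent;
  }

  get(key: string): string | undefined {
    const cached = this.store.get(key);
    if (cached !== undefined) {
      this.hits++;
      return cached;
    }
    // Miss: ask the parent tier, then keep a copy for later requests.
    this.misses++;
    const value = this.parent(key);
    if (value !== undefined) this.store.set(key, value);
    return value;
  }
}

// Origin: source of truth, counted so the effect of the tiers is visible.
let originReads = 0;
const originData = new Map<string, string>([["/v/abc123/seg-0001.m4s", "segment-bytes"]]);
const originFetch: Fetcher = (key) => {
  originReads++;
  return originData.get(key);
};

const midTier = new CacheTier("us-west-mid", originFetch);
const edgeLA = new CacheTier("edge-la", (key) => midTier.get(key));
const edgeSF = new CacheTier("edge-sf", (key) => midTier.get(key));

const segmentKey = "/v/abc123/seg-0001.m4s";
edgeLA.get(segmentKey); // misses edge and mid-tier, read from origin
edgeSF.get(segmentKey); // misses edge, served by mid-tier
edgeLA.get(segmentKey); // served by edge

console.log({ originReads, midHits: midTier.hits, edgeLAHits: edgeLA.hits });
```

Three user requests across two edges result in a single origin read, which is the aggregation effect the mid-tier exists to provide.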

YouTube's Custom CDN

YouTube operates its own CDN rather than using commercial providers. With a share of global internet traffic that has been estimated at around 15%, operating a custom CDN provides better control, lower costs at scale, and the ability to optimize for video-specific patterns. Most other platforms use commercial CDNs (Cloudflare, Fastly, Akamai, CloudFront).

Video-Specific Caching Strategies

Video content has unique caching characteristics that differ significantly from web pages or API responses. Understanding these patterns enables effective cache utilization.

Video Content Caching Characteristics
Content Type	Size	Popularity Pattern	TTL Strategy	Cache Priority
Manifests (.m3u8/.mpd)	1-10 KB	Every playback session	Short (about one segment duration, typically 1-6s) for live; long for VOD	High (small, frequent)
Init segments	1-50 KB	Once per playback	Long (days/weeks)	Medium
Media segments	100KB-5MB	Power law (viral vs. long-tail)	Long (days/weeks)	Varies by popularity
Thumbnails	5-50 KB	Browse/search pages	Long (weeks)	Medium
Captions/subtitles	10-100 KB	Subset of viewers	Long (weeks)	Low

caching-strategy.ts
// ================================================================
// CACHE KEY DESIGN
// ================================================================
 
interface CacheKeyComponents {
  // Video identification
  videoId: string;
  
  // Quality variant
  renditionId: string;     // e.g., "720p-h264-2500k"
  
  // Segment identification
  segmentIndex: number;    // e.g., 0, 1, 2, ...
  
  // Content type
  contentType: 'manifest' | 'init' | 'segment' | 'thumbnail' | 'caption';
  
  // Versioning (for cache invalidation)
  version?: string;        // Increment on content update
}
 
function buildCacheKey(components: CacheKeyComponents): string {
  // Example: /v/abc123/v2/720p-h264-2500k/seg-0042.m4s
  // Short, deterministic keys keep the cache index compact and avoid
  // storing the same object under several keys
  const parts = ['v', components.videoId];
  
  // The version goes in the path, before the file name, so a new version
  // yields a new URL while the file extension stays at the end
  if (components.version) {
    parts.push(`v${components.version}`);
  }
  
  parts.push(components.renditionId);
  
  switch (components.contentType) {
    case 'segment':
      parts.push(`seg-${components.segmentIndex.toString().padStart(4, '0')}.m4s`);
      break;
    case 'init':
      parts.push('init.mp4');
      break;
    case 'manifest':
      parts.push('manifest.m3u8');
      break;
    case 'thumbnail':
      parts.push('thumb.jpg');    // File name assumed for illustration
      break;
    case 'caption':
      parts.push('captions.vtt'); // File name assumed for illustration
      break;
  }
  
  return '/' + parts.join('/');
}
 
// ================================================================
// CACHE-CONTROL HEADERS
// ================================================================
 
interface CacheHeaders {
  'Cache-Control': string;
  'CDN-Cache-Control'?: string;  // CDN-specific override
  'Surrogate-Control'?: string;  // Fastly-style override
  'ETag'?: string;
  'Last-Modified'?: string;
  'Vary'?: string;
}
 
// Content types reuse the union declared on CacheKeyComponents
type ContentType = CacheKeyComponents['contentType'];

// `segmentETag` is supplied by the caller (for example, a hash computed at
// encode time); it is only used for media segments
function getCacheHeaders(
  contentType: ContentType,
  isLive: boolean,
  segmentETag?: string
): CacheHeaders {
  switch (contentType) {
    case 'manifest':
      if (isLive) {
        return {
          // Live manifests change frequently
          'Cache-Control': 'public, max-age=2, s-maxage=1',
          'CDN-Cache-Control': 'max-age=1',
        };
      } else {
        return {
          // VOD manifests are stable
          'Cache-Control': 'public, max-age=86400, s-maxage=604800',
          'CDN-Cache-Control': 'max-age=604800', // 1 week at edge
        };
      }
      
    case 'init':
      return {
        // Init segments never change once published
        'Cache-Control': 'public, max-age=31536000, immutable',
        'CDN-Cache-Control': 'max-age=31536000',
      };
      
    case 'segment':
      return {
        // Media segments are immutable once encoded
        'Cache-Control': 'public, max-age=31536000, immutable',
        'CDN-Cache-Control': 'max-age=31536000',
        // Include ETag for conditional requests when one is available
        ...(segmentETag ? { 'ETag': segmentETag } : {}),
      };
      
    case 'thumbnail':
      return {
        'Cache-Control': 'public, max-age=604800', // 1 week
        'CDN-Cache-Control': 'max-age=2592000',    // 30 days at edge
        // Vary lists request header names; WebP support is signalled via Accept
        'Vary': 'Accept',
      };
      
    default:
      return {
        'Cache-Control': 'public, max-age=3600',
      };
  }
}
 
// ================================================================
// POPULARITY-BASED CACHING
// ================================================================
 
interface AccessStats {
  count: number;
  firstAccess: number; // epoch milliseconds
  lastAccess: number;  // epoch milliseconds
}

interface HotSegment {
  videoId: string;
  segmentIndex: number;
  accessRate: number;  // requests per second
}

class PopularityTracker {
  // Track access patterns for cache prioritization
  private accessCounts: Map<string, AccessStats> = new Map();
  // Known skip targets per video (e.g., first segment after the intro);
  // how these are learned is outside the scope of this sketch
  private skipPoints: Map<string, number[]> = new Map();
  
  recordAccess(videoId: string, segmentIndex: number): void {
    const now = Date.now();
    const key = `${videoId}:${segmentIndex}`;
    const stats = this.accessCounts.get(key) ?? { count: 0, firstAccess: now, lastAccess: now };
    
    stats.count++;
    stats.lastAccess = now;
    this.accessCounts.set(key, stats);
  }
  
  // Identify hot segments that should be aggressively cached
  getHotSegments(): HotSegment[] {
    const now = Date.now();
    const oneHour = 3600 * 1000;
    
    return Array.from(this.accessCounts.entries())
      // Accessed within the last hour and more than 100 times in total
      .filter(([, stats]) => now - stats.lastAccess < oneHour && stats.count > 100)
      .map(([key, stats]) => {
        // Split on the last ':' so video IDs containing ':' still parse
        const sep = key.lastIndexOf(':');
        // Clamp the window to 1s to avoid dividing by zero
        const elapsedSeconds = Math.max((now - stats.firstAccess) / 1000, 1);
        return {
          videoId: key.slice(0, sep),
          segmentIndex: parseInt(key.slice(sep + 1), 10),
          accessRate: stats.count / elapsedSeconds, // per second
        };
      })
      .sort((a, b) => b.accessRate - a.accessRate);
  }
  
  // Predict which segments to pre-warm based on viewing patterns
  predictNextSegments(videoId: string, currentSegment: number): number[] {
    // Most viewers watch sequentially
    const sequential = [currentSegment + 1, currentSegment + 2, currentSegment + 3];
    
    // Some videos have common skip points (e.g., post-intro)
    const skipPoints = this.skipPoints.get(videoId) ?? [];
    
    return [...new Set([...sequential, ...skipPoints])];
  }
}

The Long-Tail Problem

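Video libraries follow a power-law distribution: a small fraction of videos accounts for most views (figures such as 1% of videos drawing 90% of views are often quoted). Hot content fits easily in cache, while long-tail requests are spread across so many rarely watched videos that most of them miss. Mitigations include storing fewer or lower-quality renditions for long-tail content, or transcoding rarely accessed videos on demand.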
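To make the effect of skew concrete, the sketch below computes what share of requests a cache can serve if it holds only the most popular items of a catalog whose popularity follows a Zipf distribution. The Zipf model, the exponent `s`, and the catalog size are assumptions for illustration; real platforms measure their own distributions.

```typescript
// Share of requests served by caching the `cached` most popular items out of
// `catalog`, when the item at rank r is requested with weight 1 / r^s.
function zipfCoverage(catalog: number, cached: number, s: number = 1): number {
  let topWeight = 0;
  let totalWeight = 0;
  for (let rank = 1; rank <= catalog; rank++) {
    const weight = 1 / Math.pow(rank, s);
    totalWeight += weight;
    if (rank <= cached) topWeight += weight;
  }
  return topWeight / totalWeight;
}

const catalogSize = 10_000; // assumed catalog size
for (const fraction of [0.01, 0.1, 0.5]) {
  const cachedItems = Math.round(catalogSize * fraction);
  const share = zipfCoverage(catalogSize, cachedItems);
  console.log(`top ${fraction * 100}% cached: ${(share * 100).toFixed(1)}% of requests`);
}
```

With `s = 1`, caching the top 1% of a 10,000-video catalog covers a little over half of requests; a steeper exponent concentrates traffic further, and the remaining requests are the long tail that drives misses.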

Multi-CDN Architecture

Enterprise video platforms typically use multiple CDN providers simultaneously. This multi-CDN approach provides redundancy, geographic optimization, and cost leverage.

Multi-CDN Benefits

•Redundancy — Single CDN outage doesn't affect service
•Performance optimization — Route to best-performing CDN per region
•Cost leverage — Negotiate better rates with competition
•Capacity — Aggregate capacity for traffic spikes
•Geographic coverage — Different CDNs excel in different regions

Multi-CDN Challenges

•Complexity — More integration, more monitoring, more contracts
•Cache fragmentation — Same content cached by multiple CDNs
•Inconsistent features — Different CDN capabilities
•Cost tracking — Multi-vendor billing complexity
•Debugging difficulty — Issues span multiple providers

multi-cdn-routing.ts
// ================================================================
// MULTI-CDN TRAFFIC STEERING
// ================================================================
 
interface CDNProvider {
  name: string;
  baseUrl: string;
  regions: string[];
  costPerGB: number;
  capabilities: CDNCapabilities;
  healthScore: number; // 0-100, from real-time monitoring
}
 
interface CDNCapabilities {
  supportsHLS: boolean;
  supportsDASH: boolean;
  supportsHTTP2: boolean;
  supportsHTTP3: boolean;
  supportsTokenAuth: boolean;
  maxBitrate: number;
}
 
class CDNRouter {
  private providers: CDNProvider[] = [
    {
      name: 'cloudflare',
      baseUrl: 'https://video.cf-cdn.example.com',
      regions: ['us', 'eu', 'asia-pacific'],
      costPerGB: 0.02,
      capabilities: { supportsHLS: true, supportsDASH: true, supportsHTTP3: true, ... },
      healthScore: 98,
    },
    {
      name: 'fastly',
      baseUrl: 'https://video.fastly-cdn.example.com',
      regions: ['us', 'eu', 'latam'],
      costPerGB: 0.025,
      capabilities: { supportsHLS: true, supportsDASH: true, supportsHTTP3: true, ... },
      healthScore: 99,
    },
    {
      name: 'akamai',
      baseUrl: 'https://video.akamai-cdn.example.com',
      regions: ['us', 'eu', 'asia', 'india', 'africa'],
      costPerGB: 0.03,
      capabilities: { supportsHLS: true, supportsDASH: true, supportsHTTP2: true, ... },
      healthScore: 97,
    },
  ];
  
  // Select CDN for a user request
  selectCDN(request: VideoRequest): CDNProvider {
    // 1. Filter by region coverage
    const regionCDNs = this.providers.filter(
      cdn => cdn.regions.includes(request.userRegion)
    );
    
    if (regionCDNs.length === 0) {
      // Fallback to any available CDN
      return this.selectByHealth(this.providers);
    }
    
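    // 2. Filter by required capabilities
    const capableCDNs = regionCDNs.filter(cdn => 
      this.meetsRequirements(cdn, request)
    );
    
    if (capableCDNs.length === 0) {
      // No CDN in the region meets the requirements; fall back to the healthiest regional CDN
      return this.selectByHealth(regionCDNs);
    }
    
    // 3. Apply routing strategy
    return this.applyRoutingStrategy(capableCDNs, request);
  }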
  
  private applyRoutingStrategy(
    cdns: CDNProvider[], 
    request: VideoRequest
  ): CDNProvider {
    switch (this.routingMode) {
      case 'performance':
        // Route to historically best-performing CDN for this region
        return this.selectByPerformance(cdns, request.userRegion);
        
      case 'cost':
        // Route to cheapest CDN with acceptable performance
        return this.selectByCost(cdns);
        
      case 'weighted':
        // Weighted distribution based on contracts/performance
        return this.selectByWeight(cdns);
        
      case 'failover':
        // Primary with fallback to secondary
        return this.selectWithFailover(cdns);
        
      default:
        return cdns[0];
    }
  }
  
  // Real-time performance-based selection
  private selectByPerformance(cdns: CDNProvider[], region: string): CDNProvider {
    const metrics = this.performanceMetrics.get(region);
    
    // Score each CDN based on recent metrics
    const scored = cdns.map(cdn => ({
      cdn,
      score: this.calculatePerformanceScore(cdn, metrics),
    }));
    
    // Add some randomization to gather fresh data
    const exploration = Math.random() < 0.05; // 5% exploration
    if (exploration) {
      return scored[Math.floor(Math.random() * scored.length)].cdn;
    }
    
    return scored.sort((a, b) => b.score - a.score)[0].cdn;
  }
  
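  private calculatePerformanceScore(
    cdn: CDNProvider, 
    metrics: RegionMetrics | undefined
  ): number {
    const cdnMetrics = metrics?.byCDN.get(cdn.name);
    if (!cdnMetrics) return 50; // Neutral default when no data exists for this CDN/region
    
    // Normalize every component to 0-100 before weighting
    const clamp = (value: number) => Math.max(0, Math.min(100, value));
    const latencyScore = clamp(100 - cdnMetrics.p95Latency / 10);         // Lower is better; assumes ms, 1000ms or more scores 0
    const errorScore = clamp((1 - cdnMetrics.errorRate) * 100);           // errorRate as a fraction (0-1)
    const throughputScore = clamp((cdnMetrics.avgThroughput / 50) * 100); // Mbps; 50 Mbps or more scores 100
    
    return (
      latencyScore * 0.3 +
      errorScore * 0.4 +
      throughputScore * 0.2 +
      cdn.healthScore * 0.1
    );
  }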
}
 
// ================================================================
// CDN FAILOVER HANDLING
// ================================================================
 
class CDNFailoverHandler {
  async fetchWithFailover(url: string, cdns: CDNProvider[]): Promise<Response> {
    const orderedCDNs = this.orderByPreference(cdns);
    
    for (let i = 0; i < orderedCDNs.length; i++) {
      const cdn = orderedCDNs[i];
      const cdnUrl = this.rewriteUrl(url, cdn);
      
      try {
        const response = await fetch(cdnUrl, {
          signal: AbortSignal.timeout(5000), // 5s timeout
        });
        
        if (response.ok) {
          this.recordSuccess(cdn);
          return response;
        }
        
        // 5xx errors: try next CDN
        if (response.status >= 500) {
          this.recordError(cdn, response.status);
          continue;
        }
        
        // 4xx errors: likely content issue, not CDN
        throw new ContentError(response.status);
        
      } catch (error) {
        if (error instanceof ContentError) throw error;
        
        // Network error or timeout: record it and try the next CDN
        this.recordError(cdn, error);
      }
    }
    
    // Every CDN returned 5xx, timed out, or failed at the network level
    throw new AllCDNsFailedError(cdns.map(c => c.name));
  }
}
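The `weighted` routing mode above calls a `selectByWeight` helper that is not shown. One possible shape for it, as a minimal sketch, is weighted random selection; the `WeightedTarget` type, the weights, and the injectable `random` parameter are assumptions made so the example is self-contained and testable.

```typescript
interface WeightedTarget {
  name: string;
  weight: number; // relative share of traffic, e.g. from contract commitments
}

// Picks a target with probability proportional to its weight.
// `random` is injectable so the behaviour can be tested deterministically.
function pickWeighted(
  targets: WeightedTarget[],
  random: () => number = Math.random
): WeightedTarget {
  const eligible = targets.filter((t) => t.weight > 0);
  const total = eligible.reduce((sum, t) => sum + t.weight, 0);

  let threshold = random() * total;
  let chosen: WeightedTarget | undefined;
  for (const target of eligible) {
    chosen = target; // remember the latest candidate as a floating-point fallback
    threshold -= target.weight;
    if (threshold < 0) break;
  }

  if (!chosen) throw new Error("No CDN with a positive weight");
  return chosen;
}

const split: WeightedTarget[] = [
  { name: "cloudflare", weight: 70 },
  { name: "fastly", weight: 20 },
  { name: "akamai", weight: 10 },
];

console.log(pickWeighted(split).name);
```

Adjusting the weights shifts traffic gradually, which is useful when draining a degraded provider or ramping up a new one.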

CDN Selection Decision Matrix
Scenario	Primary Strategy	Secondary Strategy	Fallback
Normal traffic	Performance-based	Cost optimization	Random healthy CDN
Traffic spike	Capacity distribution	Health-based	All available CDNs
Regional outage	Failover to backup	Global CDN	Origin direct
Cost constraints	Cost-optimized	Traffic shaping	Lower quality

Cache Invalidation Strategies

When video content changes—transcoding updates, content takedowns, or metadata corrections—cached copies across the CDN must be invalidated. This is famously "one of the two hard problems in computer science."

Invalidation Approaches

•URL versioning — Include version/hash in URL path. New version = new URL = cache miss. Old version eventually expires. Preferred for video segments.
•Purge API — Call CDN API to explicitly remove content from cache. Fast but expensive at scale. Best for targeted invalidation.
•Short TTL — Set low max-age so content naturally expires. Simple but increases origin load. Best for frequently-changing content.
•Surrogate keys (tags) — Tag content with keys, invalidate by tag. Efficient for grouped invalidation. Not universally supported.
•Stale-while-revalidate — Serve stale content while fetching fresh. Reduces latency impact of invalidation.
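To illustrate how surrogate keys differ from path-based purging, here is a minimal in-memory sketch of a cache that indexes entries by tag. `TaggedCache` and the `video:<id>` tag format are hypothetical; a real CDN maintains this index across many servers.

```typescript
class TaggedCache {
  private entries = new Map<string, string>();
  private keysByTag = new Map<string, Set<string>>();

  // Store an object and register it under each of its tags.
  set(key: string, body: string, tags: string[]): void {
    this.entries.set(key, body);
    for (const tag of tags) {
      let keys = this.keysByTag.get(tag);
      if (!keys) {
        keys = new Set<string>();
        this.keysByTag.set(tag, keys);
      }
      keys.add(key);
    }
  }

  get(key: string): string | undefined {
    return this.entries.get(key);
  }

  // Remove every entry carrying the tag; returns how many entries were removed.
  // Keys may linger in other tags' sets, which is harmless: deleting a key
  // that is already gone is a no-op.
  purgeTag(tag: string): number {
    const keys = this.keysByTag.get(tag);
    if (!keys) return 0;
    let removed = 0;
    keys.forEach((key) => {
      if (this.entries.delete(key)) removed++;
    });
    this.keysByTag.delete(tag);
    return removed;
  }
}

const cache = new TaggedCache();
cache.set("/v/abc123/720p/seg-0001.m4s", "a1", ["video:abc123", "rendition:720p"]);
cache.set("/v/abc123/720p/seg-0002.m4s", "a2", ["video:abc123", "rendition:720p"]);
cache.set("/v/xyz789/720p/seg-0001.m4s", "x1", ["video:xyz789", "rendition:720p"]);

const purged = cache.purgeTag("video:abc123");
console.log(`purged ${purged} entries`);
```

A single tag purge removes every segment of the video without the caller having to enumerate paths, which is the main advantage over a path-based purge API.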

cache-invalidation.ts
// ================================================================
// INVALIDATION STRATEGIES
// ================================================================
 
interface InvalidationRequest {
  videoId: string;
  reason: 'content-update' | 'takedown' | 'error-fix' | 'metadata-update';
  scope: 'all' | 'manifest' | 'segments' | 'thumbnails';
  urgent: boolean;
}
 
class CacheInvalidator {
  
  // STRATEGY 1: URL Versioning (preferred for segments)
  // New content gets new URL, old naturally expires
  async invalidateViaVersioning(videoId: string): Promise<void> {
    const currentVersion = await this.getVersion(videoId);
    const newVersion = currentVersion + 1;
    
    // Update manifest to point to new segment URLs
    await this.updateManifest(videoId, {
      segmentUrlPattern: `/v/${videoId}/v${newVersion}/seg-{N}.m4s`
    });
    
    // Record version change
    await this.setVersion(videoId, newVersion);
    
    // Purge the cached manifest so players pick up the new segment URLs
    // (VOD manifests may otherwise stay cached for days)
    await this.invalidateManifest(videoId);
    
    // Old segments will expire naturally (long TTL)
    // This is acceptable for most use cases
  }
  
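  // STRATEGY 2: CDN Purge API (for urgent takedowns)
  async purgeFromAllCDNs(paths: string[]): Promise<PurgeResult[]> {
    // Purge all providers in parallel; one provider failing must not
    // block or delay the others
    return Promise.all(
      this.cdnProviders.map(async (cdn): Promise<PurgeResult> => {
        try {
          switch (cdn.name) {
            case 'cloudflare':
              return await this.purgeCloudflare(cdn, paths);
            case 'fastly':
              return await this.purgeFastly(cdn, paths);
            case 'akamai':
              return await this.purgeAkamai(cdn, paths);
            default:
              // Report unknown providers instead of silently skipping them
              return { cdn: cdn.name, success: false, error: 'No purge handler for this provider' };
          }
        } catch (error) {
          return {
            cdn: cdn.name,
            success: false,
            // `error` is `unknown` in a catch clause under strict TypeScript
            error: error instanceof Error ? error.message : String(error),
          };
        }
      })
    );
  }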
  
  private async purgeCloudflare(cdn: CDNProvider, paths: string[]): Promise<PurgeResult> {
    const response = await fetch(
      `https://api.cloudflare.com/client/v4/zones/${cdn.zoneId}/purge_cache`,
      {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${cdn.apiToken}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          files: paths.map(p => `${cdn.baseUrl}${p}`),
        }),
      }
    );
    
    return {
      cdn: 'cloudflare',
      success: response.ok,
      purgeId: (await response.json()).result?.id,
    };
  }
  
  // STRATEGY 3: Tag-based invalidation (Fastly Surrogate-Key)
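  async invalidateByTag(tags: string[]): Promise<void> {
    // Fastly supports Surrogate-Key header for tag-based invalidation
    // Much more efficient than path-based purge for grouped content
    
    const response = await fetch(`https://api.fastly.com/service/${this.serviceId}/purge`, {
      method: 'POST',
      headers: {
        'Fastly-Key': this.apiKey,
        'Surrogate-Key': tags.join(' '),
      },
    });
    
    // Surface failures so callers do not assume the content is gone
    if (!response.ok) {
      throw new Error(`Fastly tag purge failed with status ${response.status}`);
    }
  }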
  
  // High-priority takedown flow
  async emergencyTakedown(videoId: string): Promise<void> {
    // 1. Immediately block at origin
    await this.blockAtOrigin(videoId);
    
    // 2. Purge from all CDNs simultaneously
    const paths = await this.getAllPaths(videoId);
    await this.purgeFromAllCDNs(paths);
    
    // 3. Update manifest to return 404/410
    await this.tombstoneManifest(videoId);
    
    // 4. Verify purge completion
    await this.verifyPurge(videoId, paths);
    
    // 5. Log for compliance
    await this.logTakedown(videoId, Date.now());
  }
}
 
// ================================================================
// STALE-WHILE-REVALIDATE PATTERN
// ================================================================
 
// Headers for graceful invalidation
function getSWRHeaders(): Record<string, string> {
  return {
    // Fresh for 1 hour; for the following hour, serve stale while revalidating
    'Cache-Control': 'public, max-age=3600, stale-while-revalidate=3600',
    
    // CDN-targeted override (RFC 9213), supported by Cloudflare among others
    'CDN-Cache-Control': 'max-age=3600, stale-while-revalidate=3600',
    
    // Fastly-specific
    'Surrogate-Control': 'max-age=3600, stale-while-revalidate=3600',
  };
}
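How a cache interprets `max-age` together with `stale-while-revalidate` can be sketched as a small decision function. This is a simplified model of the behaviour described in RFC 5861; `classifyFreshness` and its types are hypothetical names, and real caches also account for `Age` headers, `stale-if-error`, and request directives.

```typescript
type Freshness = "fresh" | "stale-revalidate" | "expired";

interface SwrPolicy {
  maxAgeSeconds: number;
  staleWhileRevalidateSeconds: number;
}

// Decide what a cache may do with a stored response of the given age.
function classifyFreshness(ageSeconds: number, policy: SwrPolicy): Freshness {
  if (ageSeconds < policy.maxAgeSeconds) {
    return "fresh"; // serve from cache, no origin contact
  }
  if (ageSeconds < policy.maxAgeSeconds + policy.staleWhileRevalidateSeconds) {
    return "stale-revalidate"; // serve the stale copy now, refresh in the background
  }
  return "expired"; // must fetch or revalidate before serving
}

// Matches 'max-age=3600, stale-while-revalidate=3600'.
const policy: SwrPolicy = { maxAgeSeconds: 3600, staleWhileRevalidateSeconds: 3600 };

for (const age of [1800, 5400, 9000]) {
  console.log(`age ${age}s: ${classifyFreshness(age, policy)}`);
}
```

The middle state is what hides revalidation latency from viewers: the request is answered immediately while the cache refreshes its copy.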

Takedown Requirements

Legal/copyright takedowns require rapid, verifiable cache purging. Build automated systems with compliance logging. Failure to remove content quickly can result in legal liability.

Origin Shield and Mid-Tier Caching

An origin shield (or mid-tier cache) sits between edge PoPs and your origin storage. It aggregates cache misses from multiple edges, dramatically reducing origin load and improving cache efficiency.

Origin Shield Architecture
┌─────────────────────────────────────────────────────────────────────────────────┐
│                         WITHOUT ORIGIN SHIELD                                     │
└─────────────────────────────────────────────────────────────────────────────────┘
 
   Cache Miss          Cache Miss          Cache Miss
       │                   │                   │
       ▼                   ▼                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Edge (LA)  │     │  Edge (NY)  │     │Edge (London)│
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                           │  3 separate requests to origin
                           │  (same content fetched 3 times)
                           ▼
                    ┌─────────────┐
                    │   ORIGIN    │  Origin overloaded
                    │   (S3/GCS)  │  High egress costs
                    └─────────────┘
 
 
┌─────────────────────────────────────────────────────────────────────────────────┐
│                          WITH ORIGIN SHIELD                                       │
└─────────────────────────────────────────────────────────────────────────────────┘
 
   Cache Miss          Cache Miss          Cache Miss
       │                   │                   │
       ▼                   ▼                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Edge (LA)  │     │  Edge (NY)  │     │Edge (London)│
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       ▼                   ▼                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Shield (US) │     │ Shield (US) │     │ Shield (EU) │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       └─────────┬─────────┘                   │
              HIT│from Shield                  │Miss
                 │                             │
                 ▼                             ▼
          [Served from                  ┌─────────────┐
           US Shield]                   │   ORIGIN    │  Only 1 request
                                        │   (S3/GCS)  │  Much lower load
                                        └─────────────┘

Origin Shield Benefits
Metric	Without Shield	With Shield	Improvement
Origin requests/sec	50,000	5,000	90% reduction
Origin egress cost	$100K/month	$15K/month	85% reduction
Cache hit ratio	85%	98%	+13 percentage points
p99 latency (cold)	800ms	200ms	75% reduction
Origin availability requirement	99.99%	99.9%	Less stringent
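The relationships behind the table can be checked with a few lines of arithmetic. The sketch below assumes the table's figures are illustrative and that the shield absorbs a fixed fraction of edge misses; the function names are hypothetical.

```typescript
// Requests that still reach the origin after the shield absorbs its share of edge misses.
function originRequestsPerSec(edgeMissesPerSec: number, shieldHitRatio: number): number {
  return edgeMissesPerSec * (1 - shieldHitRatio);
}

// Overall hit ratio seen by the origin when two cache tiers are stacked.
function combinedHitRatio(edgeHitRatio: number, shieldHitRatio: number): number {
  return 1 - (1 - edgeHitRatio) * (1 - shieldHitRatio);
}

// Relative reduction between two values, in percent.
function percentReduction(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

// 50,000 edge misses/sec with a shield that serves 90% of them.
const toOrigin = originRequestsPerSec(50_000, 0.9);
console.log(`origin requests/sec: ${Math.round(toOrigin)}`);
console.log(`request reduction: ${percentReduction(50_000, toOrigin).toFixed(0)}%`);

// An 85% edge hit ratio plus a shield absorbing 87% of the misses.
console.log(`combined hit ratio: ${(combinedHitRatio(0.85, 0.87) * 100).toFixed(1)}%`);
```

Note that the hit-ratio row is a change in percentage points, whereas the request and cost rows are relative reductions.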

shield-configuration.ts
// ================================================================
// ORIGIN SHIELD CONFIGURATION
// ================================================================
 
interface ShieldConfig {
  // Regional shield locations
  regions: ShieldRegion[];
  
  // Caching behavior
  caching: {
    defaultTTL: number;
    respectOriginHeaders: boolean;
    negativeCache: boolean;      // Cache 404s briefly
    staleOnOriginError: boolean; // Serve stale if origin fails
  };
  
  // Connection to origin
  origin: {
    host: string;
    connectionLimit: number;     // Limit concurrent connections
    timeout: number;
    retries: number;
  };
  
  // Request coalescing
  coalescing: {
    enabled: boolean;
    maxWait: number;             // Max time to wait for coalescing
  };
}
 
interface ShieldRegion {
  name: string;
  location: string;              // e.g., "us-east-1"
  capacity: StorageCapacity;     // Cache size
  edgePoPs: string[];            // Which edges this shield serves
}
 
const shieldConfig: ShieldConfig = {
  regions: [
    {
      name: 'us-shield',
      location: 'us-east-1',
      capacity: { ssd: '100TB', memory: '1TB' },
      edgePoPs: ['us-east-*', 'us-west-*', 'ca-*'],
    },
    {
      name: 'eu-shield',
      location: 'eu-west-1',
      capacity: { ssd: '80TB', memory: '800GB' },
      edgePoPs: ['eu-*', 'uk-*', 'me-*'],
    },
    {
      name: 'asia-shield',
      location: 'ap-southeast-1',
      capacity: { ssd: '60TB', memory: '600GB' },
      edgePoPs: ['ap-*', 'au-*'],
    },
  ],
  
  caching: {
    defaultTTL: 86400,
    respectOriginHeaders: true,
    negativeCache: true,
    staleOnOriginError: true,
  },
  
  origin: {
    host: 'storage.googleapis.com',
    connectionLimit: 1000,        // Limit connections to origin
    timeout: 30000,
    retries: 2,
  },
  
  coalescing: {
    enabled: true,
    maxWait: 100,                 // 100ms to coalesce requests
  },
};
 
// Request coalescing: multiple edge requests for same object
// become single origin request
class RequestCoalescer {
  private pendingRequests: Map<string, Promise<Response>> = new Map();
  
  async fetch(url: string): Promise<Response> {
    // Check if request already in flight
    let pending = this.pendingRequests.get(url);
    
    if (!pending) {
      // New request: start the origin fetch and share its promise
      pending = this.doFetch(url).finally(() => {
        // Clean up once the response (or an error) arrives
        this.pendingRequests.delete(url);
      });
      this.pendingRequests.set(url, pending);
    }
    
    // A Response body can be read only once, so every caller (including
    // the first) receives its own clone of the shared response
    const response = await pending;
    return response.clone();
  }
  
  // Origin fetch; a real shield would add timeouts, retries, and connection limits
  private doFetch(url: string): Promise<Response> {
    return fetch(url);
  }
}

Request Coalescing

When viral content causes a thundering herd, the shield coalesces identical requests—100 edge requests for the same uncached segment become 1 origin request. The slight delay (50-100ms) is vastly better than origin overload.
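The thundering-herd case can be demonstrated with a small generic coalescer and a counter on the loader. This is a sketch under stated assumptions: `Coalescer` and the fake loader are hypothetical, and a production shield would also bound wait time and handle partial failures.

```typescript
// Shares one in-flight load per key among all concurrent callers.
class Coalescer<T> {
  private inFlight = new Map<string, Promise<T>>();
  private load: (key: string) => Promise<T>;

  constructor(load: (key: string) => Promise<T>) {
    this.load = load;
  }

  get(key: string): Promise<T> {
    const pending = this.inFlight.get(key);
    if (pending) return pending; // join the request that is already running

    // First caller for this key: start the load and forget it once settled.
    const request = this.load(key).finally(() => {
      this.inFlight.delete(key);
    });
    this.inFlight.set(key, request);
    return request;
  }
}

// Fake origin that counts how often it is actually called.
let originCalls = 0;
const coalescer = new Coalescer<string>(async (key) => {
  originCalls++;
  return `bytes for ${key}`;
});

// 100 edge requests for the same uncached segment arrive at once.
const waiters = Array.from({ length: 100 }, () =>
  coalescer.get("/v/abc123/720p/seg-0001.m4s")
);

console.log(`origin calls: ${originCalls}, waiting callers: ${waiters.length}`);
```

Because entries are removed as soon as the load settles, this only deduplicates concurrent requests; serving later requests is the job of the cache itself.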

CDN Cost Optimization

At exabyte-scale delivery, even small per-GB cost differences compound into millions of dollars. CDN cost optimization is a critical discipline for video platforms.

CDN Cost Components
Cost Factor	Range	Optimization Strategy	Potential Savings
Egress bandwidth	$0.01-0.10/GB	Multi-CDN negotiation, commit discounts	30-50%
Origin egress	$0.05-0.15/GB	Origin shield, higher edge TTL	80-90%
Cache storage	$0.02-0.05/GB-month	Eviction policies, compression	20-30%
Request fees	$0.0001-0.001/request	Segment size optimization	Variable
Premium features	Variable	Use only where needed	10-20%
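
To see why the origin-egress row carries the largest percentage saving, the arithmetic can be sketched as below. The traffic volume, prices, and hit ratios are assumptions picked from inside the ranges in the table, not measured figures, and `DeliveryProfile` and `monthlyCost` are hypothetical names.

```typescript
interface DeliveryProfile {
  deliveredGB: number;        // GB served to viewers per month
  edgeRatePerGB: number;      // CDN egress price, USD/GB
  originRatePerGB: number;    // origin egress price, USD/GB
  edgeByteHitRatio: number;   // fraction of bytes served from edge cache
  shieldByteHitRatio: number; // fraction of edge misses absorbed by the shield (0 = no shield)
}

function monthlyCost(p: DeliveryProfile) {
  // Only bytes that miss at the edge AND at the shield reach the origin
  const edgeMissGB = p.deliveredGB * (1 - p.edgeByteHitRatio);
  const originGB = edgeMissGB * (1 - p.shieldByteHitRatio);
  const cdnEgress = p.deliveredGB * p.edgeRatePerGB;
  const originEgress = originGB * p.originRatePerGB;
  return { originGB, cdnEgress, originEgress, total: cdnEgress + originEgress };
}

// Assumed example: 1 PB/month, $0.02/GB CDN egress, $0.10/GB origin egress
const noShield = monthlyCost({
  deliveredGB: 1_000_000,
  edgeRatePerGB: 0.02,
  originRatePerGB: 0.10,
  edgeByteHitRatio: 0.95,
  shieldByteHitRatio: 0,
});
const withShield = monthlyCost({
  deliveredGB: 1_000_000,
  edgeRatePerGB: 0.02,
  originRatePerGB: 0.10,
  edgeByteHitRatio: 0.95,
  shieldByteHitRatio: 0.9,
});

const originSavings = 1 - withShield.originEgress / noShield.originEgress;
```

Under these assumptions origin egress falls from roughly $5,000 to roughly $500 per month, a saving of about 90%, while the CDN egress line (about $20,000) is unchanged. That is why CDN egress is addressed through pricing and origin egress through cache hierarchy.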

cost-optimization.ts
// ================================================================
// CDN COST OPTIMIZATION STRATEGIES
// ================================================================
 
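// STRATEGY 1: Traffic shaping based on cost
interface CommitmentTier {
  gbIncluded: number; // GB covered by the monthly commitment
  overage: number;    // USD per GB beyond the commitment
}

class CostAwareRouter {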
  private monthlyCommitments: Map<string, CommitmentTier> = new Map([
    ['cloudflare', { gbIncluded: 1_000_000, overage: 0.02 }],
    ['fastly', { gbIncluded: 500_000, overage: 0.025 }],
    ['akamai', { gbIncluded: 2_000_000, overage: 0.015 }],
  ]);
  
  selectCDN(request: VideoRequest): CDNProvider {
    // Check which CDN has unused commitment capacity
    const usageByProvider = this.getCurrentMonthUsage();
    
    for (const [provider, commitment] of this.monthlyCommitments) {
      const usage = usageByProvider.get(provider) || 0;
      if (usage < commitment.gbIncluded * 0.95) {
        // Use included capacity first
        return this.getProvider(provider);
      }
    }
    
    // All commitments used; route to cheapest overage
    return this.selectByOverageCost();
  }
  
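  // Enforce commitment minimums to maintain contract pricing
  async balanceTrafficToCommitments(): Promise<void> {
    const daysRemaining = this.getDaysRemainingInMonth();
    const now = new Date();
    const daysInMonth = new Date(now.getFullYear(), now.getMonth() + 1, 0).getDate();
    // At least one elapsed day avoids dividing by zero on the 1st
    const daysElapsed = Math.max(1, daysInMonth - daysRemaining);
    
    for (const [provider, commitment] of this.monthlyCommitments) {
      const currentUsage = await this.getUsage(provider);
      // Run-rate projection of month-end usage
      const projectedUsage = (currentUsage / daysElapsed) * daysInMonth;
      
      if (projectedUsage < commitment.gbIncluded * 0.8) {
        // Risk missing commitment minimum - increase traffic share
        await this.increaseTrafficShare(provider, commitment.gbIncluded - projectedUsage);
      }
    }
  }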
}
 
// STRATEGY 2: Bitrate-based routing
// Route high-bitrate content to cheaper CDN for same user
class BitrateAwareRouter {
  selectCDN(request: VideoRequest, quality: VideoQuality): CDNProvider {
    // 4K content = high bandwidth = expensive
    if (quality.height >= 2160) {
      // Route to CDN with lower per-GB cost
      return this.cheapestCDN(request.region);
    }
    
    // Low-res content = many requests, low bandwidth
    if (quality.height <= 360) {
      // Route to CDN with lower per-request cost
      return this.lowestRequestCostCDN(request.region);
    }
    
    // Standard quality: optimize for performance
    return this.bestPerformanceCDN(request.region);
  }
}
 
// STRATEGY 3: Segment size optimization
// Larger segments = fewer requests = lower request fees
// But larger segments = less ABR flexibility
 
const segmentOptimization = {
  // Default: 4 second segments
  defaultDuration: 4,
  
  // For high-bitrate content: larger segments reduce request overhead
  highBitrate: {
    threshold: 10_000_000, // 10 Mbps, in bits per second
    duration: 6,           // 6 second segments
  },
  
  // For low-latency live: smaller segments
  lowLatencyLive: {
    duration: 2,
  },
} as const;
 
// STRATEGY 4: Tiered storage for long-tail content
class TieredStorage {
  async getSegmentLocation(videoId: string, segmentIndex: number): Promise<StorageLocation> {
    const video = await this.getVideoMetadata(videoId);
    const accessPattern = await this.getAccessPattern(videoId);
    
    // Hot content: SSD edge cache
    if (accessPattern.dailyViews > 10000) {
      return { tier: 'hot', location: 'edge-ssd' };
    }
    
    // Warm content: Regional cache
    if (accessPattern.weeklyViews > 1000) {
      return { tier: 'warm', location: 'regional-hdd' };
    }
    
    // Cold content: Origin with on-demand edge caching
    return { tier: 'cold', location: 'origin-archive' };
  }
}
 
// Monthly cost analysis
interface MonthlyCostBreakdown {
  egress: {
    byProvider: Map<string, number>;
    byRegion: Map<string, number>;
    total: number;
  };
  originEgress: number;
  storage: number;
  requests: number;
  otherFees: number;
  total: number;
  
  // Optimization opportunities
  opportunities: CostOpportunity[];
}
 
function analyzeCosts(month: string): Partial<MonthlyCostBreakdown> {
  // ... analysis logic
  
  // Illustrative result; the per-category fields are elided here
  return {
    total: 2_500_000, // $2.5M example
    opportunities: [
      {
        type: 'unused-commitment',
        description: 'Fastly commitment underutilized by 20%',
        potentialSavings: 50_000,
      },
      {
        type: 'origin-egress',
        description: 'High origin egress in APAC - add shield',
        potentialSavings: 80_000,
      },
    ],
  };
}

Negotiation Leverage

Multi-CDN architectures provide negotiation leverage. CDN providers offer better pricing when competing for traffic share. Annual commitments unlock significant discounts but require accurate traffic forecasting.
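
The cost of a forecasting miss can be sketched with a simple commitment model. The pricing structure here (a committed volume paid in full, plus a per-GB overage rate) is an assumption for illustration, and `Commitment`, `monthlyBill`, `effectiveRate`, and `projectMonthEnd` are hypothetical names; real contracts vary.

```typescript
interface Commitment {
  committedGB: number;   // volume paid for whether or not it is used
  committedRate: number; // USD/GB inside the commitment
  overageRate: number;   // USD/GB beyond the commitment
}

// The committed volume is billed in full; only the excess pays the overage rate.
function monthlyBill(c: Commitment, usedGB: number): number {
  const overageGB = Math.max(0, usedGB - c.committedGB);
  return c.committedGB * c.committedRate + overageGB * c.overageRate;
}

// What each delivered GB actually cost once the commitment is accounted for.
function effectiveRate(c: Commitment, usedGB: number): number {
  return monthlyBill(c, usedGB) / usedGB;
}

// Run-rate projection of month-end usage from usage so far.
function projectMonthEnd(usedGB: number, daysElapsed: number, daysInMonth: number): number {
  return (usedGB / Math.max(1, daysElapsed)) * daysInMonth;
}

const contract: Commitment = { committedGB: 1_000_000, committedRate: 0.015, overageRate: 0.02 };

const onTarget = effectiveRate(contract, 1_000_000); // forecast was accurate
const underUsed = effectiveRate(contract, 600_000);  // traffic came in 40% low
const overBill = monthlyBill(contract, 1_200_000);   // traffic came in 20% high
const projection = projectMonthEnd(400_000, 10, 30); // 400 TB used after 10 days
```

With these assumed numbers, hitting the forecast costs about $0.015/GB, while using only 60% of the commitment raises the effective rate to about $0.025/GB, which is worse than the overage price. Overshooting is milder: the extra 200,000 GB simply bills at the overage rate.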

CDN Monitoring and Observability

With billions of daily requests across multiple CDN providers, robust monitoring is essential for maintaining quality and optimizing performance.

Key CDN Metrics

•Cache hit ratio — Percentage of requests served from cache. Target: >95%. Low hit ratio indicates caching issues or unusual traffic patterns.
•Origin shield hit ratio — Shield-level cache effectiveness. Target: >99%. Low ratio means excessive origin load.
•Response latency (p50, p95, p99) — Time to first byte. Segmented by region, CDN, and content type. Alerts on regression.
•Error rate (4xx, 5xx) — Percentage of failed requests. 4xx often indicates content issues; 5xx indicates CDN/origin problems.
•Throughput — Bytes served per second. Track capacity utilization and peak handling.
•Time to first frame — End-to-end metric from player. Combines CDN performance with client-side decode.
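
Several of these metrics can be derived directly from request logs. The sketch below assumes a simplified log record (`RequestLog` is a hypothetical shape, not a particular CDN's log format) and uses the nearest-rank method for percentiles; it also separates request hit ratio from byte hit ratio, since large objects that miss can dominate origin traffic even when most requests hit.

```typescript
interface RequestLog {
  status: number;    // HTTP status code
  bytes: number;     // response size in bytes
  ttfbMs: number;    // time to first byte, in milliseconds
  cacheHit: boolean; // served from cache?
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

function summarize(logs: RequestLog[]) {
  let hits = 0, hitBytes = 0, totalBytes = 0, serverErrors = 0;
  const ttfb: number[] = [];
  for (const log of logs) {
    totalBytes += log.bytes;
    if (log.cacheHit) { hits++; hitBytes += log.bytes; }
    if (log.status >= 500) serverErrors++;
    ttfb.push(log.ttfbMs);
  }
  return {
    requestHitRatio: hits / logs.length,
    byteHitRatio: hitBytes / totalBytes,
    errorRate5xx: serverErrors / logs.length,
    p50: percentile(ttfb, 50),
    p95: percentile(ttfb, 95),
  };
}

// Made-up sample: 8 small cache hits, 2 large misses, one of which failed
const sampleTtfb = [20, 25, 30, 35, 40, 45, 50, 60, 300, 800];
const sampleLogs: RequestLog[] = sampleTtfb.map((ttfbMs, i) => ({
  status: i === 9 ? 503 : 200,
  bytes: i < 8 ? 1_000_000 : 4_000_000,
  ttfbMs,
  cacheHit: i < 8,
}));

const summary = summarize(sampleLogs);
```

In this sample the request hit ratio is 0.8 but the byte hit ratio is only 0.5, and the p95 latency (800 ms) is dominated by the misses even though the median is 40 ms. That gap is the reason to alert on tail percentiles rather than averages.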

cdn-monitoring.ts
// ================================================================
// CDN HEALTH MONITORING
// ================================================================
 
interface CDNHealthMetrics {
  provider: string;
  region: string;
  timestamp: Date;
  
  // Availability
  availability: number;           // 0-100%
  errorRate: number;             // 0-1
  
  // Performance
  latencyP50: number;            // ms
  latencyP95: number;
  latencyP99: number;
  throughputMbps: number;
  
  // Cache efficiency
  cacheHitRatio: number;         // 0-1
  cacheBytesServed: number;
  originBytesServed: number;
  
  // Capacity
  currentLoad: number;           // 0-1
  peakLoad: number;
}
 
class CDNMonitor {
  private healthScores: Map<string, number> = new Map();
  
  async collectMetrics(): Promise<CDNHealthMetrics[]> {
    const metrics: CDNHealthMetrics[] = [];
    
    for (const provider of this.providers) {
      // Collect from CDN analytics APIs
      const providerMetrics = await this.fetchProviderMetrics(provider);
      metrics.push(...providerMetrics);
      
      // Also collect from client-side analytics
      const clientMetrics = await this.fetchClientMetrics(provider);
      
      // Calculate health score
      const score = this.calculateHealthScore(providerMetrics, clientMetrics);
      this.healthScores.set(provider.name, score);
    }
    
    return metrics;
  }
  
  private calculateHealthScore(
    cdnMetrics: CDNHealthMetrics[],
    clientMetrics: ClientMetrics[]
  ): number {
    // Weighted score across dimensions
    let score = 100;
    
    const avgMetrics = this.aggregateMetrics(cdnMetrics);
    
    // Penalize for errors
    score -= avgMetrics.errorRate * 100 * 2; // Heavy penalty for errors
    
    // Penalize for high latency
    if (avgMetrics.latencyP95 > 200) {
      score -= (avgMetrics.latencyP95 - 200) / 10;
    }
    
    // Penalize for low cache hit ratio
    if (avgMetrics.cacheHitRatio < 0.95) {
      score -= (0.95 - avgMetrics.cacheHitRatio) * 100;
    }
    
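    // Factor in client-reported metrics (ground truth)
    const clientScore = this.calculateClientScore(clientMetrics);
    
    // Clamp so large penalties cannot push the CDN-side score outside 0-100
    const cdnScore = Math.max(0, Math.min(100, score));
    
    // Blend CDN-reported (40%) and client-reported (60%)
    return cdnScore * 0.4 + clientScore * 0.6;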
  }
  
  // Alerting based on metrics
  async checkAlerts(): Promise<Alert[]> {
    const alerts: Alert[] = [];
    const metrics = await this.getLatestMetrics();
    
    for (const m of metrics) {
      // High error rate
      if (m.errorRate > 0.01) { // >1% errors
        alerts.push({
          severity: m.errorRate > 0.05 ? 'critical' : 'warning',
          type: 'high-error-rate',
          provider: m.provider,
          region: m.region,
          value: m.errorRate,
          message: `Error rate ${(m.errorRate * 100).toFixed(2)}% exceeds threshold`,
        });
      }
      
      // Latency regression
      const baseline = await this.getBaselineLatency(m.provider, m.region);
      if (m.latencyP95 > baseline * 1.5) {
        alerts.push({
          severity: 'warning',
          type: 'latency-regression',
          provider: m.provider,
          region: m.region,
          value: m.latencyP95,
          message: `P95 latency ${m.latencyP95}ms, baseline ${baseline}ms`,
        });
      }
      
      // Cache hit ratio drop
      if (m.cacheHitRatio < 0.90) {
        alerts.push({
          severity: 'warning',
          type: 'low-cache-hit',
          provider: m.provider,
          region: m.region,
          value: m.cacheHitRatio,
          message: `Cache hit ratio ${(m.cacheHitRatio * 100).toFixed(1)}% below 90%`,
        });
      }
    }
    
    return alerts;
  }
}
 
// Real User Monitoring (RUM) for ground truth
class RealUserMonitor {
  collectFromPlayer(player: VideoPlayer): void {
    player.on('segment-loaded', (event) => {
      this.report({
        type: 'segment-load',
        cdn: this.extractCDNFromUrl(event.url),
        latencyMs: event.loadTime,
        bytesLoaded: event.bytes,
        cacheStatus: event.headers['x-cache'],
        region: this.getUserRegion(),
      });
    });
    
    player.on('error', (event) => {
      this.report({
        type: 'error',
        cdn: event.cdn,
        errorCode: event.code,
        errorMessage: event.message,
      });
    });
  }
}

Trust Client Metrics

CDN-reported metrics show what the CDN sees; client-reported metrics show what users experience. A CDN might report 100ms latency, but if the user is on a slow ISP, they experience 500ms. Use Real User Monitoring (RUM) for ground truth.

CDN Integration Summary

We've explored the architecture, strategies, and operational considerations for delivering video at planetary scale. Let's consolidate the key takeaways:

Key Design Decisions

•Multi-tier caching hierarchy — Edge PoPs for user-facing traffic, regional shields to aggregate cache misses, origin as source of truth. Each tier reduces load on the next.
•Video-specific caching — Long TTL for immutable segments, short TTL for live manifests, popularity-based cache prioritization for the long tail.
•Multi-CDN architecture — Multiple providers for redundancy, performance optimization, and cost leverage. Route dynamically based on performance and cost.
•Origin shield/mid-tier — Request coalescing and regional caching dramatically reduce origin load and costs. Essential for high-traffic content.
•Strategic invalidation — URL versioning for routine updates, API purge for urgent takedowns, stale-while-revalidate for graceful transitions.
•Cost optimization — Commit-based pricing, traffic shaping to optimize provider spend, segment size tuning, tiered storage for long-tail content.
•Comprehensive monitoring — CDN-reported metrics plus client-side RUM. Health scores for routing decisions. Alert on regression.

What's next:

With video efficiently delivered worldwide, we need to help users discover content they'll enjoy. The final page covers the Recommendation Engine—the machine learning systems that power personalized suggestions and drive engagement.

Page Complete

You now understand the architecture of global video delivery at scale. From CDN hierarchies to multi-CDN routing to cost optimization, these patterns enable low-latency, high-reliability video streaming worldwide.

CDN Architecture Overview

CDN Architecture
┌─────────────────────────────────────────────────────────────────────────────────┐
│                              CDN ARCHITECTURE                                    │
└─────────────────────────────────────────────────────────────────────────────────┘
 
                               ORIGIN INFRASTRUCTURE
                        ┌─────────────────────────────────────┐
                        │            ORIGIN                   │
                        │  • Primary video storage (S3/GCS)   │
                        │  • Origin shield/mid-tier cache     │
                        │  • Manifest generation              │
                        │  • DRM license servers              │
                        └─────────────────────────────────────┘
                                          │
                    ┌─────────────────────┼─────────────────────┐
                    │                     │                     │
                    ▼                     ▼                     ▼
           ┌───────────────┐     ┌───────────────┐     ┌───────────────┐
           │  REGIONAL     │     │  REGIONAL     │     │  REGIONAL     │
           │  MID-TIER     │     │  MID-TIER     │     │  MID-TIER     │
           │  (US-WEST)    │     │  (EU)         │     │  (ASIA)       │
           │               │     │               │     │               │
           │  Cache: 50TB  │     │  Cache: 50TB  │     │  Cache: 50TB  │
           │  Aggregation  │     │  Aggregation  │     │  Aggregation  │
           └───────┬───────┘     └───────┬───────┘     └───────┬───────┘
                   │                     │                     │
    ┌──────────────┼──────────────┐     ...                   ...
    │              │              │
    ▼              ▼              ▼
┌─────────┐  ┌─────────┐  ┌─────────┐
│  EDGE   │  │  EDGE   │  │  EDGE   │     Thousands of edge locations
│  PoP    │  │  PoP    │  │  PoP    │     worldwide
│ (LA)    │  │ (SF)    │  │(Seattle)│
│         │  │         │  │         │     Each PoP:
│Cache:5TB│  │Cache:5TB│  │Cache:5TB│     • Multiple servers
│         │  │         │  │         │     • Local cache (SSD/RAM)
└────┬────┘  └────┬────┘  └────┬────┘     • Direct user serving
     │            │            │
     └────────────┼────────────┘
                  │ DNS-based traffic steering
                  │ Users directed to nearest PoP
                  ▼
          ┌───────────────┐
          │    USERS      │
          │  Worldwide    │
          └───────────────┘
 
 
                        REQUEST FLOW
                        ════════════
                        
User Request ──▶ Edge PoP ──┐
                    │       │
              Cache HIT?    │
                    │       │
         ┌──────YES─┴──NO───┴───┐
         │                      │
         ▼                      ▼
  Serve from            Request from Parent
  Edge Cache            (Mid-tier or Origin)
                              │
                        Cache at Edge
                              │
                              ▼
                        Serve to User

CDN Hierarchy Layers

•Origin — Your primary storage (S3, GCS, or custom). The source of truth for all content. Only accessed when content isn't cached anywhere in the CDN.
•Origin Shield / Mid-Tier — Regional caching layer between edge and origin. Aggregates cache misses from multiple edge PoPs to reduce origin load. One origin request serves multiple edges.
•Edge PoPs — Servers deployed in hundreds to thousands of locations worldwide. Direct user-facing layer. Optimized for low latency and high throughput.
•Last Mile — ISP integration (like Netflix Open Connect). Physical servers inside ISP networks for ultimate proximity. Not common for general platforms.
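
The way each layer shields the one above it can be sketched as a chain of caches that fill on the way back from a miss. `TierCache` is a hypothetical class for illustration: it has no eviction, TTLs, or capacity limit, and the origin is modelled as a tier that always has the object.

```typescript
class TierCache {
  private store: Record<string, boolean> = {};
  requests = 0;

  constructor(public readonly name: string, private readonly parent: TierCache | null) {}

  // Returns the name of the tier that ultimately served the object.
  get(key: string): string {
    this.requests++;
    if (this.parent === null) return this.name; // origin: source of truth
    if (this.store[key]) return this.name;      // cache hit at this tier
    const servedBy = this.parent.get(key);      // miss: ask the parent tier
    this.store[key] = true;                     // fill on the way back
    return servedBy;
  }
}

const originTier = new TierCache('origin', null);
const shieldTier = new TierCache('us-west-shield', originTier);
const edgeTiers = ['edge-la', 'edge-sf', 'edge-seattle'].map(
  name => new TierCache(name, shieldTier)
);

const objectKey = '/v/abc123/720p/seg-0001.m4s';
const firstPass = edgeTiers.map(edge => edge.get(objectKey));  // cold caches
const secondPass = edgeTiers.map(edge => edge.get(objectKey)); // warm caches
```

On the first pass only the first edge's miss reaches the origin; the other two edges are served by the shield. On the second pass every edge serves from its own cache, so the origin sees one request in total for six user requests.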

YouTube's Custom CDN

Video-Specific Caching Strategies

Video content has unique caching characteristics that differ significantly from web pages or API responses. Understanding these patterns enables effective cache utilization.

Video Content Caching Characteristics
Content Type	Size	Popularity Pattern	TTL Strategy	Cache Priority
Manifests (.m3u8/.mpd)	1-10 KB	Every playback session	Short (10-60s) for live, long for VOD	High (small, frequent)
Init segments	1-50 KB	Once per playback	Long (days/weeks)	Medium
Media segments	100KB-5MB	Power law (viral vs. long-tail)	Long (days/weeks)	Varies by popularity
Thumbnails	5-50 KB	Browse/search pages	Long (weeks)	Medium
Captions/subtitles	10-100 KB	Subset of viewers	Long (weeks)	Low

caching-strategy.ts
// ================================================================
// CACHE KEY DESIGN
// ================================================================
 
interface CacheKeyComponents {
  // Video identification
  videoId: string;
  
  // Quality variant
  renditionId: string;     // e.g., "720p-h264-2500k"
  
  // Segment identification
  segmentIndex: number;    // e.g., 0, 1, 2, ...
  
  // Content type
  contentType: 'manifest' | 'init' | 'segment' | 'thumbnail' | 'caption';
  
  // Versioning (for cache invalidation)
  version?: string;        // Increment on content update
}
 
function buildCacheKey(components: CacheKeyComponents): string {
  // Example: /v/abc123/v2/720p-h264-2500k/seg-0042.m4s
  // Short, deterministic keys avoid duplicate cache entries for one object
  
  const parts = ['v', components.videoId];
  
  // Version goes in the path, before the file name, so that a new
  // version yields new URLs for every object of the video
  if (components.version) {
    parts.push(`v${components.version}`);
  }
  
  parts.push(components.renditionId);
  
  if (components.contentType === 'segment') {
    parts.push(`seg-${components.segmentIndex.toString().padStart(4, '0')}.m4s`);
  } else if (components.contentType === 'init') {
    parts.push('init.mp4');
  } else if (components.contentType === 'manifest') {
    parts.push('manifest.m3u8');
  }
  
  return '/' + parts.join('/');
}
 
// ================================================================
// CACHE-CONTROL HEADERS
// ================================================================
 
interface CacheHeaders {
  'Cache-Control': string;
  'CDN-Cache-Control'?: string;  // CDN-specific override
  'Surrogate-Control'?: string;  // Fastly-style override
  'ETag'?: string;
  'Last-Modified'?: string;
  'Vary'?: string;
}
 
type ContentType = CacheKeyComponents['contentType'];

// segmentETag is supplied by the caller, which knows the segment's content
function getCacheHeaders(
  contentType: ContentType,
  isLive: boolean,
  segmentETag?: string
): CacheHeaders {
  switch (contentType) {
    case 'manifest':
      if (isLive) {
        return {
          // Live manifests change frequently
          'Cache-Control': 'public, max-age=2, s-maxage=1',
          'CDN-Cache-Control': 'max-age=1',
        };
      } else {
        return {
          // VOD manifests are stable
          'Cache-Control': 'public, max-age=86400, s-maxage=604800',
          'CDN-Cache-Control': 'max-age=604800', // 1 week at edge
        };
      }
      
    case 'init':
      return {
        // Init segments never change once published
        'Cache-Control': 'public, max-age=31536000, immutable',
        'CDN-Cache-Control': 'max-age=31536000',
      };
      
    case 'segment':
      return {
        // Media segments are immutable once encoded
        'Cache-Control': 'public, max-age=31536000, immutable',
        'CDN-Cache-Control': 'max-age=31536000',
        // Include ETag for conditional requests when one is available
        ...(segmentETag ? { 'ETag': segmentETag } : {}),
      };
      
    case 'thumbnail':
      return {
        'Cache-Control': 'public, max-age=604800', // 1 week
        'CDN-Cache-Control': 'max-age=2592000',    // 30 days at edge
        // WebP support is signalled through the Accept request header
        'Vary': 'Accept',
      };
      
    default:
      return {
        'Cache-Control': 'public, max-age=3600',
      };
  }
}
 
// ================================================================
// POPULARITY-BASED CACHING
// ================================================================
 
class PopularityTracker {
  // Track access patterns for cache prioritization
  private accessCounts: Map<string, AccessStats> = new Map();
  
  recordAccess(videoId: string, segmentIndex: number): void {
    const key = `${videoId}:${segmentIndex}`;
    const stats = this.accessCounts.get(key) || {
      count: 0,
      firstAccess: Date.now(),
      lastAccess: Date.now(),
    };
    
    stats.count++;
    stats.lastAccess = Date.now();
    this.accessCounts.set(key, stats);
  }
  
  // Identify hot segments that should be aggressively cached
  getHotSegments(): HotSegment[] {
    const now = Date.now();
    const oneHour = 3600 * 1000;
    
    return Array.from(this.accessCounts.entries())
      .filter(([_, stats]) => {
        // Accessed within the last hour and more than 100 times overall
        const recency = now - stats.lastAccess;
        return recency < oneHour && stats.count > 100;
      })
      .map(([key, stats]) => {
        const [videoId, segmentIndex] = key.split(':');
        // Floor the window at 1s so a brand-new entry cannot divide by zero
        const elapsedSeconds = Math.max(1, (now - stats.firstAccess) / 1000);
        return {
          videoId,
          segmentIndex: parseInt(segmentIndex, 10),
          accessRate: stats.count / elapsedSeconds, // per second
        };
      })
      .sort((a, b) => b.accessRate - a.accessRate);
  }
  
  // Predict which segments to pre-warm based on viewing patterns
  predictNextSegments(videoId: string, currentSegment: number): number[] {
    // Most viewers watch sequentially
    const sequential = [currentSegment + 1, currentSegment + 2, currentSegment + 3];
    
    // Some videos have common skip points (e.g., post-intro)
    const skipPoints = this.getSkipPoints(videoId);
    
    return [...new Set([...sequential, ...skipPoints])];
  }
}

The Long-Tail Problem

Multi-CDN Architecture

Enterprise video platforms typically use multiple CDN providers simultaneously. This multi-CDN approach provides redundancy, geographic optimization, and cost leverage.

Multi-CDN Benefits

•Redundancy — Single CDN outage doesn't affect service
•Performance optimization — Route to best-performing CDN per region
•Cost leverage — Negotiate better rates with competition
•Capacity — Aggregate capacity for traffic spikes
•Geographic coverage — Different CDNs excel in different regions

Multi-CDN Challenges

•Complexity — More integration, more monitoring, more contracts
•Cache fragmentation — Same content cached by multiple CDNs
•Inconsistent features — Different CDN capabilities
•Cost tracking — Multi-vendor billing complexity
•Debugging difficulty — Issues span multiple providers

multi-cdn-routing.ts
// ================================================================
// MULTI-CDN TRAFFIC STEERING
// ================================================================
 
interface CDNProvider {
  name: string;
  baseUrl: string;
  regions: string[];
  costPerGB: number;
  capabilities: CDNCapabilities;
  healthScore: number; // 0-100, from real-time monitoring
}
 
interface CDNCapabilities {
  supportsHLS: boolean;
  supportsDASH: boolean;
  supportsHTTP2: boolean;
  supportsHTTP3: boolean;
  supportsTokenAuth: boolean;
  maxBitrate: number;
}
 
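class CDNRouter {
  private routingMode: 'performance' | 'cost' | 'weighted' | 'failover' = 'performance';
  private performanceMetrics: Map<string, RegionMetrics> = new Map();
  
  private providers: CDNProvider[] = [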
    {
      name: 'cloudflare',
      baseUrl: 'https://video.cf-cdn.example.com',
      regions: ['us', 'eu', 'asia-pacific'],
      costPerGB: 0.02,
      capabilities: { supportsHLS: true, supportsDASH: true, supportsHTTP3: true, ... },
      healthScore: 98,
    },
    {
      name: 'fastly',
      baseUrl: 'https://video.fastly-cdn.example.com',
      regions: ['us', 'eu', 'latam'],
      costPerGB: 0.025,
      capabilities: { supportsHLS: true, supportsDASH: true, supportsHTTP3: true, ... },
      healthScore: 99,
    },
    {
      name: 'akamai',
      baseUrl: 'https://video.akamai-cdn.example.com',
      regions: ['us', 'eu', 'asia', 'india', 'africa'],
      costPerGB: 0.03,
      capabilities: { supportsHLS: true, supportsDASH: true, supportsHTTP2: true, ... },
      healthScore: 97,
    },
  ];
  
  // Select CDN for a user request
  selectCDN(request: VideoRequest): CDNProvider {
    // 1. Filter by region coverage
    const regionCDNs = this.providers.filter(
      cdn => cdn.regions.includes(request.userRegion)
    );
    
    if (regionCDNs.length === 0) {
      // Fallback to any available CDN
      return this.selectByHealth(this.providers);
    }
    
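    // 2. Filter by required capabilities
    const capableCDNs = regionCDNs.filter(cdn => 
      this.meetsRequirements(cdn, request)
    );
    
    if (capableCDNs.length === 0) {
      // No CDN in the region meets the requirements; fall back on health
      return this.selectByHealth(regionCDNs);
    }
    
    // 3. Apply routing strategy
    return this.applyRoutingStrategy(capableCDNs, request);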
  }
  
  private applyRoutingStrategy(
    cdns: CDNProvider[], 
    request: VideoRequest
  ): CDNProvider {
    switch (this.routingMode) {
      case 'performance':
        // Route to historically best-performing CDN for this region
        return this.selectByPerformance(cdns, request.userRegion);
        
      case 'cost':
        // Route to cheapest CDN with acceptable performance
        return this.selectByCost(cdns);
        
      case 'weighted':
        // Weighted distribution based on contracts/performance
        return this.selectByWeight(cdns);
        
      case 'failover':
        // Primary with fallback to secondary
        return this.selectWithFailover(cdns);
        
      default:
        return cdns[0];
    }
  }
  
  // Real-time performance-based selection
  private selectByPerformance(cdns: CDNProvider[], region: string): CDNProvider {
    const metrics = this.performanceMetrics.get(region);
    
    // Score each CDN based on recent metrics
    const scored = cdns.map(cdn => ({
      cdn,
      score: this.calculatePerformanceScore(cdn, metrics),
    }));
    
    // Add some randomization to gather fresh data
    const exploration = Math.random() < 0.05; // 5% exploration
    if (exploration) {
      return scored[Math.floor(Math.random() * scored.length)].cdn;
    }
    
    return scored.sort((a, b) => b.score - a.score)[0].cdn;
  }
  
  private calculatePerformanceScore(
    cdn: CDNProvider, 
    metrics: RegionMetrics | undefined
  ): number {
    // A region with no recorded metrics yields the default score
    const cdnMetrics = metrics?.byCDN.get(cdn.name);
    if (!cdnMetrics) return 50; // Default for unknown
    
    // Weighted score; each component is kept on a 0-100 scale
    const latencyScore = Math.max(0, 100 - (cdnMetrics.p95Latency / 10)); // Lower latency is better
    const errorScore = (1 - cdnMetrics.errorRate) * 100;
    const throughputScore = Math.min(cdnMetrics.avgThroughput / 50, 100); // Mbps
    
    return (
      latencyScore * 0.3 +
      errorScore * 0.4 +
      throughputScore * 0.2 +
      cdn.healthScore * 0.1
    );
  }
}
 
// ================================================================
// CDN FAILOVER HANDLING
// ================================================================
 
class CDNFailoverHandler {
  async fetchWithFailover(url: string, cdns: CDNProvider[]): Promise<Response> {
    const orderedCDNs = this.orderByPreference(cdns);
    
    for (let i = 0; i < orderedCDNs.length; i++) {
      const cdn = orderedCDNs[i];
      const cdnUrl = this.rewriteUrl(url, cdn);
      
      try {
        const response = await fetch(cdnUrl, {
          signal: AbortSignal.timeout(5000), // 5s timeout
        });
        
        if (response.ok) {
          this.recordSuccess(cdn);
          return response;
        }
        
        // 5xx errors: try next CDN
        if (response.status >= 500) {
          this.recordError(cdn, response.status);
          continue;
        }
        
        // 4xx errors: likely content issue, not CDN
        throw new ContentError(response.status);
        
      } catch (error) {
        if (error instanceof ContentError) throw error;
        
        // Network error or timeout: record it and try the next CDN
        this.recordError(cdn, error);
      }
    }
    
    // Every CDN returned 5xx or failed at the network level
    throw new AllCDNsFailedError(cdns.map(c => c.name));
  }
}

CDN Selection Decision Matrix
Scenario	Primary Strategy	Secondary Strategy	Fallback
Normal traffic	Performance-based	Cost optimization	Random healthy CDN
Traffic spike	Capacity distribution	Health-based	All available CDNs
Regional outage	Failover to backup	Global CDN	Origin direct
Cost constraints	Cost-optimized	Traffic shaping	Lower quality
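
The "weighted" strategy combined with a health-based fallback can be sketched as a weighted random pick over the CDNs that pass a health threshold. The weights and scores below are made up for illustration, `WeightedCDN` and `pickWeighted` are hypothetical names, and the random draw is passed in as a parameter so the behaviour is reproducible.

```typescript
interface WeightedCDN {
  name: string;
  weight: number;      // intended traffic share (any positive scale)
  healthScore: number; // 0-100, from monitoring
}

// r is a uniform random number in [0, 1), e.g. Math.random()
function pickWeighted(cdns: WeightedCDN[], minHealth: number, r: number): string | null {
  const healthy = cdns.filter(c => c.healthScore >= minHealth && c.weight > 0);
  if (healthy.length === 0) return null; // caller decides on the fallback

  const totalWeight = healthy.reduce((sum, c) => sum + c.weight, 0);
  let remaining = r * totalWeight;
  for (const cdn of healthy) {
    remaining -= cdn.weight;
    if (remaining < 0) return cdn.name;
  }
  return healthy[healthy.length - 1].name; // guards against rounding when r is near 1
}

const pool: WeightedCDN[] = [
  { name: 'cloudflare', weight: 50, healthScore: 98 },
  { name: 'fastly', weight: 30, healthScore: 99 },
  { name: 'akamai', weight: 20, healthScore: 60 }, // degraded in this example
];

const lowDraw = pickWeighted(pool, 80, 0.0);  // first healthy CDN
const highDraw = pickWeighted(pool, 80, 0.7); // falls in the second CDN's share
const relaxed = pickWeighted(pool, 50, 0.9);  // lower threshold readmits the third
const nonePass = pickWeighted(pool, 100, 0.5);
```

Because the degraded CDN is filtered out before weights are summed, its share is redistributed proportionally among the healthy ones rather than dropped. Returning `null` when nothing passes keeps the last-resort decision (any available CDN, or origin direct) with the caller, as in the fallback column above.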

Cache Invalidation Strategies

Invalidation Approaches

•URL versioning — Include version/hash in URL path. New version = new URL = cache miss. Old version eventually expires. Preferred for video segments.
•Purge API — Call CDN API to explicitly remove content from cache. Fast but expensive at scale. Best for targeted invalidation.
•Short TTL — Set low max-age so content naturally expires. Simple but increases origin load. Best for frequently-changing content.
•Surrogate keys (tags) — Tag content with keys, invalidate by tag. Efficient for grouped invalidation. Not universally supported.
•Stale-while-revalidate — Serve stale content while fetching fresh. Reduces latency impact of invalidation.
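
How a cache decides between these states under stale-while-revalidate can be sketched from the `Cache-Control` header alone. This is a simplified model: it reads only `max-age` and `stale-while-revalidate`, ignores `s-maxage` and other directives, and the header value in the example is an assumed one for a live manifest. The function and type names are hypothetical.

```typescript
type Freshness = 'fresh' | 'serve-stale-and-revalidate' | 'fetch-before-serving';

// Reads a numeric directive such as max-age=2; returns 0 when absent.
function directiveSeconds(cacheControl: string, name: string): number {
  for (const part of cacheControl.split(',')) {
    const [key, value] = part.trim().split('=');
    if (key.toLowerCase() === name) return parseInt(value, 10) || 0;
  }
  return 0;
}

function classify(cacheControl: string, ageSeconds: number): Freshness {
  const maxAge = directiveSeconds(cacheControl, 'max-age');
  const staleWindow = directiveSeconds(cacheControl, 'stale-while-revalidate');

  // Within max-age: serve from cache with no origin contact
  if (ageSeconds < maxAge) return 'fresh';
  // Within the stale window: serve the cached copy now, refresh in the background
  if (ageSeconds < maxAge + staleWindow) return 'serve-stale-and-revalidate';
  // Too old: the request must wait for the origin
  return 'fetch-before-serving';
}

const liveManifestHeader = 'public, max-age=2, stale-while-revalidate=4';
const atOneSecond = classify(liveManifestHeader, 1);
const atThreeSeconds = classify(liveManifestHeader, 3);
const atSixSeconds = classify(liveManifestHeader, 6);
```

With this header, a manifest that is 3 seconds old is still served immediately while the cache refreshes it, so viewers do not pay origin latency at each expiry. Only a copy older than the combined 6 seconds forces the request to wait.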

cache-invalidation.ts
// ================================================================
// INVALIDATION STRATEGIES
// ================================================================
 
interface InvalidationRequest {
  videoId: string;
  reason: 'content-update' | 'takedown' | 'error-fix' | 'metadata-update';
  scope: 'all' | 'manifest' | 'segments' | 'thumbnails';
  urgent: boolean;
}
 
class CacheInvalidator {
  
  // STRATEGY 1: URL Versioning (preferred for segments)
  // New content gets new URL, old naturally expires
  async invalidateViaVersioning(videoId: string): Promise<void> {
    const currentVersion = await this.getVersion(videoId);
    const newVersion = currentVersion + 1;
    
    // Update manifest to point to new segment URLs
    await this.updateManifest(videoId, {
      segmentUrlPattern: `/v/${videoId}/v${newVersion}/seg-{N}.m4s`
    });
    
    // Record version change
    await this.setVersion(videoId, newVersion);
    
    // Invalidate manifest (short TTL anyway)
    await this.invalidateManifest(videoId);
    
    // Old segments will expire naturally (long TTL)
    // This is acceptable for most use cases
  }
  
  // STRATEGY 2: CDN Purge API (for urgent takedowns)
  // Providers are purged in parallel so one slow API does not delay the others
  async purgeFromAllCDNs(paths: string[]): Promise<PurgeResult[]> {
    return Promise.all(
      this.cdnProviders.map(async (cdn): Promise<PurgeResult> => {
        try {
          switch (cdn.name) {
            case 'cloudflare':
              return await this.purgeCloudflare(cdn, paths);
            case 'fastly':
              return await this.purgeFastly(cdn, paths);
            case 'akamai':
              return await this.purgeAkamai(cdn, paths);
            default:
              // Unknown provider: report it rather than silently skipping it
              return { cdn: cdn.name, success: false, error: 'unsupported provider' };
          }
        } catch (error) {
          return {
            cdn: cdn.name,
            success: false,
            // A caught value is not guaranteed to be an Error instance
            error: error instanceof Error ? error.message : String(error),
          };
        }
      })
    );
  }
  
  private async purgeCloudflare(cdn: CDNProvider, paths: string[]): Promise<PurgeResult> {
    const response = await fetch(
      `https://api.cloudflare.com/client/v4/zones/${cdn.zoneId}/purge_cache`,
      {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${cdn.apiToken}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          files: paths.map(p => `${cdn.baseUrl}${p}`),
        }),
      }
    );
    
    return {
      cdn: 'cloudflare',
      success: response.ok,
      purgeId: (await response.json()).result?.id,
    };
  }
  
  // STRATEGY 3: Tag-based invalidation (Fastly Surrogate-Key)
  async invalidateByTag(tags: string[]): Promise<void> {
    // Fastly supports Surrogate-Key header for tag-based invalidation
    // Much more efficient than path-based purge for grouped content
    
    await fetch(`https://api.fastly.com/service/${this.serviceId}/purge`, {
      method: 'POST',
      headers: {
        'Fastly-Key': this.apiKey,
        'Surrogate-Key': tags.join(' '),
      },
    });
  }
  
  // High-priority takedown flow
  async emergencyTakedown(videoId: string): Promise<void> {
    // 1. Immediately block at origin
    await this.blockAtOrigin(videoId);
    
    // 2. Purge from all CDNs simultaneously
    const paths = await this.getAllPaths(videoId);
    await this.purgeFromAllCDNs(paths);
    
    // 3. Update manifest to return 404/410
    await this.tombstoneManifest(videoId);
    
    // 4. Verify purge completion
    await this.verifyPurge(videoId, paths);
    
    // 5. Log for compliance
    await this.logTakedown(videoId, Date.now());
  }
}
 
// ================================================================
// STALE-WHILE-REVALIDATE PATTERN
// ================================================================
 
// Headers for graceful invalidation
// Returns a plain header map (an object literal is not a Headers instance)
function getSWRHeaders(contentType: ContentType): Record<string, string> {
  return {
    // Fresh for 1 hour, then served stale for up to 1 more hour while revalidating
    'Cache-Control': 'public, max-age=3600, stale-while-revalidate=3600',
    
    // CDN-targeted override (RFC 9213); honored by Cloudflare, among others
    'CDN-Cache-Control': 'max-age=3600, stale-while-revalidate=3600',
    
    // Fastly-specific
    'Surrogate-Control': 'max-age=3600, stale-while-revalidate=3600',
  };
}

Takedown Requirements

Legal/copyright takedowns require rapid, verifiable cache purging. Build automated systems with compliance logging. Failure to remove content quickly can result in legal liability.
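
To make "verifiable" concrete, a takedown flow can re-probe each purged path until the edge stops serving it. The sketch below is one possible shape for such a check, under stated assumptions: `StatusProbe` is a hypothetical callback that returns the HTTP status an edge gives for a path, and a 404 or 410 is treated as proof of removal. A production check would likely probe several PoPs per CDN and write the result to the compliance log.

```typescript
// Hypothetical probe: resolves to the HTTP status an edge returns for a path.
type StatusProbe = (path: string) => Promise<number>;

interface PurgeVerification {
  verified: boolean;      // true when no path is served any more
  stillServed: string[];  // paths that kept returning something other than 404/410
  attempts: number;       // probe rounds actually performed
}

async function verifyPurged(
  paths: string[],
  probe: StatusProbe,
  maxAttempts = 5,
  delayMs = 200,
): Promise<PurgeVerification> {
  let remaining = [...paths];
  let attempts = 0;

  while (remaining.length > 0 && attempts < maxAttempts) {
    attempts++;
    const statuses = await Promise.all(remaining.map(p => probe(p)));

    // Keep only the paths that are still being served.
    remaining = remaining.filter((_, i) => statuses[i] !== 404 && statuses[i] !== 410);

    // Purges propagate asynchronously, so pause before the next round.
    if (remaining.length > 0 && attempts < maxAttempts) {
      await new Promise<void>(resolve => setTimeout(resolve, delayMs));
    }
  }

  return { verified: remaining.length === 0, stillServed: remaining, attempts };
}
```

Returning the list of paths that are still served, rather than a bare boolean, gives the compliance log something specific to record and lets the caller re-issue a purge for just those paths.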

Origin Shield and Mid-Tier Caching

Origin Shield Architecture
┌─────────────────────────────────────────────────────────────────────────────────┐
│                         WITHOUT ORIGIN SHIELD                                     │
└─────────────────────────────────────────────────────────────────────────────────┘
 
   Cache Miss          Cache Miss          Cache Miss
       │                   │                   │
       ▼                   ▼                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Edge (LA)  │     │  Edge (NY)  │     │Edge (London)│
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                           │  3 separate requests to origin
                           │  (same content fetched 3 times)
                           ▼
                    ┌─────────────┐
                    │   ORIGIN    │  Origin overloaded
                    │   (S3/GCS)  │  High egress costs
                    └─────────────┘
 
 
┌─────────────────────────────────────────────────────────────────────────────────┐
│                          WITH ORIGIN SHIELD                                       │
└─────────────────────────────────────────────────────────────────────────────────┘
 
   Cache Miss          Cache Miss          Cache Miss
       │                   │                   │
       ▼                   ▼                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Edge (LA)  │     │  Edge (NY)  │     │Edge (London)│
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       ▼                   ▼                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Shield (US) │     │ Shield (US) │     │ Shield (EU) │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       └─────────┬─────────┘                   │
              HIT│from Shield                  │Miss
                 │                             │
                 ▼                             ▼
          [Served from                  ┌─────────────┐
           US Shield]                   │   ORIGIN    │  Only 1 request
                                        │   (S3/GCS)  │  Much lower load
                                        └─────────────┘

Origin Shield Benefits
Metric	Without Shield	With Shield	Improvement
Origin requests/sec	50,000	5,000	90% reduction
Origin egress cost	$100K/month	$15K/month	85% reduction
Cache hit ratio	85%	98%	+13 percentage points
p99 latency (cold)	800ms	200ms	75% reduction
Origin availability requirement	99.99%	99.9%	Less stringent
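
The effect of a shield on origin load follows from multiplying miss rates across tiers: only requests that miss at the edge and then miss at the shield reach origin. The sketch below works through that arithmetic. The request rate and hit ratios are illustrative assumptions chosen so the result lines up with the first table row; they are not measurements, and `originRequestsPerSec` and `percentReduction` are hypothetical helper names.

```typescript
interface TierStats {
  clientRequestsPerSec: number;
  edgeHitRatio: number;    // fraction of client requests served at the edge, 0-1
  shieldHitRatio: number;  // fraction of edge misses served by the shield, 0-1 (0 = no shield)
}

// Requests that miss both the edge and the shield reach origin.
function originRequestsPerSec(stats: TierStats): number {
  const edgeMisses = stats.clientRequestsPerSec * (1 - stats.edgeHitRatio);
  return edgeMisses * (1 - stats.shieldHitRatio);
}

// Percentage drop from `before` to `after`.
function percentReduction(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

// Assumed inputs: 1M client requests/sec and a 95% edge hit ratio.
const noShield = originRequestsPerSec({
  clientRequestsPerSec: 1_000_000,
  edgeHitRatio: 0.95,
  shieldHitRatio: 0,
});
const shielded = originRequestsPerSec({
  clientRequestsPerSec: 1_000_000,
  edgeHitRatio: 0.95,
  shieldHitRatio: 0.9,
});

console.log(Math.round(noShield), Math.round(shielded));
console.log(Math.round(percentReduction(noShield, shielded)));
```

With these assumed inputs, a shield that absorbs 90% of edge misses cuts origin traffic from about 50,000 to about 5,000 requests per second, which is the 90% reduction shown in the table.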

shield-configuration.ts
// ================================================================
// ORIGIN SHIELD CONFIGURATION
// ================================================================
 
interface ShieldConfig {
  // Regional shield locations
  regions: ShieldRegion[];
  
  // Caching behavior
  caching: {
    defaultTTL: number;
    respectOriginHeaders: boolean;
    negativeCache: boolean;      // Cache 404s briefly
    staleOnOriginError: boolean; // Serve stale if origin fails
  };
  
  // Connection to origin
  origin: {
    host: string;
    connectionLimit: number;     // Limit concurrent connections
    timeout: number;
    retries: number;
  };
  
  // Request coalescing
  coalescing: {
    enabled: boolean;
    maxWait: number;             // Max time to wait for coalescing
  };
}
 
interface ShieldRegion {
  name: string;
  location: string;              // e.g., "us-east-1"
  capacity: StorageCapacity;     // Cache size
  edgePoPs: string[];            // Which edges this shield serves
}
 
const shieldConfig: ShieldConfig = {
  regions: [
    {
      name: 'us-shield',
      location: 'us-east-1',
      capacity: { ssd: '100TB', memory: '1TB' },
      edgePoPs: ['us-east-*', 'us-west-*', 'ca-*'],
    },
    {
      name: 'eu-shield',
      location: 'eu-west-1',
      capacity: { ssd: '80TB', memory: '800GB' },
      edgePoPs: ['eu-*', 'uk-*', 'me-*'],
    },
    {
      name: 'asia-shield',
      location: 'ap-southeast-1',
      capacity: { ssd: '60TB', memory: '600GB' },
      edgePoPs: ['ap-*', 'au-*'],
    },
  ],
  
  caching: {
    defaultTTL: 86400,
    respectOriginHeaders: true,
    negativeCache: true,
    staleOnOriginError: true,
  },
  
  origin: {
    host: 'storage.googleapis.com',
    connectionLimit: 1000,        // Limit connections to origin
    timeout: 30000,
    retries: 2,
  },
  
  coalescing: {
    enabled: true,
    maxWait: 100,                 // 100ms to coalesce requests
  },
};
 
// Request coalescing: multiple edge requests for same object
// become single origin request
class RequestCoalescer {
  private pendingRequests: Map<string, Promise<Response>> = new Map();
  
  async fetch(url: string): Promise<Response> {
    // Check if request already in flight
    let request = this.pendingRequests.get(url);
    
    if (!request) {
      // New request - fetch and cache promise; clean up once it settles
      request = this.doFetch(url).finally(() => {
        this.pendingRequests.delete(url);
      });
      this.pendingRequests.set(url, request);
    }
    
    // A Response body can be read only once, so every caller (including
    // the first) gets its own clone and the shared original stays unread
    const response = await request;
    return response.clone();
  }
  
  // The single upstream fetch that all coalesced callers share
  private doFetch(url: string): Promise<Response> {
    return fetch(url);
  }
}

Request Coalescing
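
When many edges miss on the same object at the same moment, the shield can collapse them into one origin fetch. The following is a minimal, self-contained sketch of that idea, independent of HTTP: `Coalescer` and its `load` callback are hypothetical names, and the payload type is generic so the behavior can be checked without a network.

```typescript
// Hypothetical single-flight helper: concurrent get() calls for the same key
// share one in-flight load instead of each triggering their own.
class Coalescer<T> {
  private readonly inFlight = new Map<string, Promise<T>>();
  private readonly load: (key: string) => Promise<T>;

  constructor(load: (key: string) => Promise<T>) {
    this.load = load;
  }

  get(key: string): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing) {
      return existing; // join the load that is already running
    }

    const request = this.load(key).finally(() => {
      // Forget the entry once it settles so later calls load fresh data.
      this.inFlight.delete(key);
    });
    this.inFlight.set(key, request);
    return request;
  }
}
```

Calling `get('/seg-1.m4s')` three times before the first load settles results in one call to `load`; a call made after it settles starts a new load. Entries are removed on failure as well as success, so a failed origin fetch is not cached as a permanent error.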

CDN Cost Optimization

At exabyte-scale delivery, even small per-GB cost differences compound into millions of dollars. CDN cost optimization is a critical discipline for video platforms.
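
A quick calculation shows how fast a small rate difference adds up. The sketch below assumes decimal units (1 EB = 10^9 GB), one exabyte of egress per day, and a price difference of $0.001 per GB; all three are illustrative assumptions, and `monthlyCostDelta` is a hypothetical helper.

```typescript
const GB_PER_EXABYTE = 1_000_000_000; // decimal units: 1 EB = 10^9 GB

// Extra (or saved) spend per billing period for a given per-GB price difference.
function monthlyCostDelta(
  dailyEgressExabytes: number,
  perGbDeltaUsd: number,
  days = 30,
): number {
  return dailyEgressExabytes * GB_PER_EXABYTE * perGbDeltaUsd * days;
}

// Assumed inputs: 1 EB/day of egress and a $0.001/GB rate difference.
const delta = monthlyCostDelta(1, 0.001);
console.log(`$${Math.round(delta).toLocaleString('en-US')} per month`);
```

Under these assumptions, a tenth of a cent per GB works out to about $30 million per month, which is why per-GB rates are negotiated so carefully at this scale.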

CDN Cost Components
Cost Factor	Range	Optimization Strategy	Potential Savings
Egress bandwidth	$0.01-0.10/GB	Multi-CDN negotiation, commit discounts	30-50%
Origin egress	$0.05-0.15/GB	Origin shield, higher edge TTL	80-90%
Cache storage	$0.02-0.05/GB-month	Eviction policies, compression	20-30%
Request fees	$0.0001-0.001/request	Segment size optimization	Variable
Premium features	Variable	Use only where needed	10-20%

cost-optimization.ts
// ================================================================
// CDN COST OPTIMIZATION STRATEGIES
// ================================================================
 
// STRATEGY 1: Traffic shaping based on cost
class CostAwareRouter {
  private monthlyCommitments: Map<string, CommitmentTier> = new Map([
    ['cloudflare', { gbIncluded: 1_000_000, overage: 0.02 }],
    ['fastly', { gbIncluded: 500_000, overage: 0.025 }],
    ['akamai', { gbIncluded: 2_000_000, overage: 0.015 }],
  ]);
  
  selectCDN(request: VideoRequest): CDNProvider {
    // Check which CDN has unused commitment capacity
    const usageByProvider = this.getCurrentMonthUsage();
    
    for (const [provider, commitment] of this.monthlyCommitments) {
      const usage = usageByProvider.get(provider) || 0;
      if (usage < commitment.gbIncluded * 0.95) {
        // Use included capacity first
        return this.getProvider(provider);
      }
    }
    
    // All commitments used; route to cheapest overage
    return this.selectByOverageCost();
  }
  
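  // Enforce commitment minimums to maintain contract pricing
  async balanceTrafficToCommitments(): Promise<void> {
    const now = new Date();
    // Day 0 of next month is the last day of this month
    const daysInMonth = new Date(now.getFullYear(), now.getMonth() + 1, 0).getDate();
    const daysRemaining = this.getDaysRemainingInMonth();
    // Guard against dividing by zero at the very start of the month
    const daysElapsed = Math.max(1, daysInMonth - daysRemaining);
    
    for (const [provider, commitment] of this.monthlyCommitments) {
      const currentUsage = await this.getUsage(provider);
      // Linear projection of month-end usage from usage to date
      const projectedUsage = (currentUsage / daysElapsed) * daysInMonth;
      
      if (projectedUsage < commitment.gbIncluded * 0.8) {
        // Risk missing commitment minimum - increase traffic share
        await this.increaseTrafficShare(provider, commitment.gbIncluded - projectedUsage);
      }
    }
  }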
}
 
// STRATEGY 2: Bitrate-based routing
// Route high-bitrate content to cheaper CDN for same user
class BitrateAwareRouter {
  selectCDN(request: VideoRequest, quality: VideoQuality): CDNProvider {
    // 4K content = high bandwidth = expensive
    if (quality.height >= 2160) {
      // Route to CDN with lower per-GB cost
      return this.cheapestCDN(request.region);
    }
    
    // Low-res content = many requests, low bandwidth
    if (quality.height <= 360) {
      // Route to CDN with lower per-request cost
      return this.lowestRequestCostCDN(request.region);
    }
    
    // Standard quality: optimize for performance
    return this.bestPerformanceCDN(request.region);
  }
}
 
// STRATEGY 3: Segment size optimization
// Larger segments = fewer requests = lower request fees
// But larger segments = less ABR flexibility
 
// Plain config object: an interface describes types and cannot hold these values
const segmentOptimization = {
  // Default: 4 second segments
  defaultDuration: 4,
  
  // For high-bitrate content: larger segments reduce request overhead
  highBitrate: {
    threshold: 10_000_000, // 10 Mbps
    duration: 6,           // 6 second segments
  },
  
  // For low-latency live: smaller segments
  lowLatencyLive: {
    duration: 2,
  },
} as const;
 
// STRATEGY 4: Tiered storage for long-tail content
class TieredStorage {
  async getSegmentLocation(videoId: string, segmentIndex: number): Promise<StorageLocation> {
    const video = await this.getVideoMetadata(videoId);
    const accessPattern = await this.getAccessPattern(videoId);
    
    // Hot content: SSD edge cache
    if (accessPattern.dailyViews > 10000) {
      return { tier: 'hot', location: 'edge-ssd' };
    }
    
    // Warm content: Regional cache
    if (accessPattern.weeklyViews > 1000) {
      return { tier: 'warm', location: 'regional-hdd' };
    }
    
    // Cold content: Origin with on-demand edge caching
    return { tier: 'cold', location: 'origin-archive' };
  }
}
 
// Monthly cost analysis
interface MonthlyCostBreakdown {
  egress: {
    byProvider: Map<string, number>;
    byRegion: Map<string, number>;
    total: number;
  };
  originEgress: number;
  storage: number;
  requests: number;
  otherFees: number;
  total: number;
  
  // Optimization opportunities
  opportunities: CostOpportunity[];
}
 
function analyzeCosts(month: string): MonthlyCostBreakdown {
  // ... analysis logic
  
  return {
    egress: { ... },
    total: 2_500_000, // $2.5M example
    opportunities: [
      {
        type: 'unused-commitment',
        description: 'Fastly commitment underutilized by 20%',
        potentialSavings: 50_000,
      },
      {
        type: 'origin-egress',
        description: 'High origin egress in APAC - add shield',
        potentialSavings: 80_000,
      },
    ],
  };
}

Negotiation Leverage

CDN Monitoring and Observability

With billions of daily requests across multiple CDN providers, robust monitoring is essential for maintaining quality and optimizing performance.

Key CDN Metrics

•Cache hit ratio — Percentage of requests served from cache. Target: >95%. Low hit ratio indicates caching issues or unusual traffic patterns.
•Origin shield hit ratio — Shield-level cache effectiveness. Target: >99%. Low ratio means excessive origin load.
•Response latency (p50, p95, p99) — Time to first byte. Segmented by region, CDN, and content type. Alerts on regression.
•Error rate (4xx, 5xx) — Percentage of failed requests. 4xx often indicates content issues; 5xx indicates CDN/origin problems.
•Throughput — Bytes served per second. Track capacity utilization and peak handling.
•Time to first frame — End-to-end metric from player. Combines CDN performance with client-side decode.
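
Two of the metrics above, latency percentiles and cache hit ratio, reduce to short calculations over raw samples. The sketch below uses the nearest-rank percentile definition, which is one of several in common use; monitoring backends often interpolate or use approximate sketches instead. `percentile` and `cacheHitRatio` are hypothetical helper names.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error('percentile requires at least one sample');
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  // Clamp so p = 0 maps to the minimum and p > 100 maps to the maximum.
  const index = Math.min(Math.max(rank, 1), sorted.length) - 1;
  return sorted[index]!;
}

// Fraction of requests served from cache; 0 when there was no traffic.
function cacheHitRatio(hits: number, misses: number): number {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

// Example: time-to-first-byte samples in milliseconds.
const ttfbMs = [100, 30, 70, 10, 50, 90, 20, 60, 40, 80];
console.log(percentile(ttfbMs, 50), percentile(ttfbMs, 95), percentile(ttfbMs, 99));
console.log(cacheHitRatio(950, 50));
```

With only ten samples, p95 and p99 both land on the slowest request, which is a reminder that tail percentiles need large sample counts per region and CDN before they are worth alerting on.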

cdn-monitoring.ts
// ================================================================
// CDN HEALTH MONITORING
// ================================================================
 
interface CDNHealthMetrics {
  provider: string;
  region: string;
  timestamp: Date;
  
  // Availability
  availability: number;           // 0-100%
  errorRate: number;             // 0-1
  
  // Performance
  latencyP50: number;            // ms
  latencyP95: number;
  latencyP99: number;
  throughputMbps: number;
  
  // Cache efficiency
  cacheHitRatio: number;         // 0-1
  cacheBytesServed: number;
  originBytesServed: number;
  
  // Capacity
  currentLoad: number;           // 0-1
  peakLoad: number;
}
 
class CDNMonitor {
  private healthScores: Map<string, number> = new Map();
  
  async collectMetrics(): Promise<CDNHealthMetrics[]> {
    const metrics: CDNHealthMetrics[] = [];
    
    for (const provider of this.providers) {
      // Collect from CDN analytics APIs
      const providerMetrics = await this.fetchProviderMetrics(provider);
      metrics.push(...providerMetrics);
      
      // Also collect from client-side analytics
      const clientMetrics = await this.fetchClientMetrics(provider);
      
      // Calculate health score
      const score = this.calculateHealthScore(providerMetrics, clientMetrics);
      this.healthScores.set(provider.name, score);
    }
    
    return metrics;
  }
  
  private calculateHealthScore(
    cdnMetrics: CDNHealthMetrics[],
    clientMetrics: ClientMetrics[]
  ): number {
    // Weighted score across dimensions
    let score = 100;
    
    const avgMetrics = this.aggregateMetrics(cdnMetrics);
    
    // Penalize for errors
    score -= avgMetrics.errorRate * 100 * 2; // Heavy penalty for errors
    
    // Penalize for high latency
    if (avgMetrics.latencyP95 > 200) {
      score -= (avgMetrics.latencyP95 - 200) / 10;
    }
    
    // Penalize for low cache hit ratio
    if (avgMetrics.cacheHitRatio < 0.95) {
      score -= (0.95 - avgMetrics.cacheHitRatio) * 100;
    }
    
    // Factor in client-reported metrics (ground truth)
    const clientScore = this.calculateClientScore(clientMetrics);
    
    // Blend CDN-reported and client-reported; clamp first so stacked
    // penalties cannot push the CDN-side score below zero
    return Math.max(0, score) * 0.4 + clientScore * 0.6;
  }
  
  // Alerting based on metrics
  async checkAlerts(): Promise<Alert[]> {
    const alerts: Alert[] = [];
    const metrics = await this.getLatestMetrics();
    
    for (const m of metrics) {
      // High error rate
      if (m.errorRate > 0.01) { // >1% errors
        alerts.push({
          severity: m.errorRate > 0.05 ? 'critical' : 'warning',
          type: 'high-error-rate',
          provider: m.provider,
          region: m.region,
          value: m.errorRate,
          message: `Error rate ${(m.errorRate * 100).toFixed(2)}% exceeds threshold`,
        });
      }
      
      // Latency regression
      const baseline = await this.getBaselineLatency(m.provider, m.region);
      if (m.latencyP95 > baseline * 1.5) {
        alerts.push({
          severity: 'warning',
          type: 'latency-regression',
          provider: m.provider,
          region: m.region,
          value: m.latencyP95,
          message: `P95 latency ${m.latencyP95}ms, baseline ${baseline}ms`,
        });
      }
      
      // Cache hit ratio drop
      if (m.cacheHitRatio < 0.90) {
        alerts.push({
          severity: 'warning',
          type: 'low-cache-hit',
          provider: m.provider,
          region: m.region,
          value: m.cacheHitRatio,
          message: `Cache hit ratio ${(m.cacheHitRatio * 100).toFixed(1)}% below 90%`,
        });
      }
    }
    
    return alerts;
  }
}
 
// Real User Monitoring (RUM) for ground truth
class RealUserMonitor {
  collectFromPlayer(player: VideoPlayer): void {
    player.on('segment-loaded', (event) => {
      this.report({
        type: 'segment-load',
        cdn: this.extractCDNFromUrl(event.url),
        latencyMs: event.loadTime,
        bytesLoaded: event.bytes,
        cacheStatus: event.headers['x-cache'],
        region: this.getUserRegion(),
      });
    });
    
    player.on('error', (event) => {
      this.report({
        type: 'error',
        cdn: event.cdn,
        errorCode: event.code,
        errorMessage: event.message,
      });
    });
  }
}

Trust Client Metrics

CDN Integration Summary

We've explored the architecture, strategies, and operational considerations for delivering video at planetary scale. Let's consolidate the key takeaways:

Key Design Decisions

•Multi-tier caching hierarchy — Edge PoPs for user-facing traffic, regional shields to aggregate cache misses, origin as source of truth. Each tier reduces load on the next.
•Video-specific caching — Long TTL for immutable segments, short TTL for live manifests, popularity-based cache prioritization for the long tail.
•Multi-CDN architecture — Multiple providers for redundancy, performance optimization, and cost leverage. Route dynamically based on performance and cost.
•Origin shield/mid-tier — Request coalescing and regional caching dramatically reduce origin load and costs. Essential for high-traffic content.
•Strategic invalidation — URL versioning for routine updates, API purge for urgent takedowns, stale-while-revalidate for graceful transitions.
•Cost optimization — Commit-based pricing, traffic shaping to optimize provider spend, segment size tuning, tiered storage for long-tail content.
•Comprehensive monitoring — CDN-reported metrics plus client-side RUM. Health scores for routing decisions. Alert on regression.
