Design a global Content Delivery Network (CDN) with edge computing capabilities. The system deploys 200+ edge Points of Presence (PoPs) worldwide and routes each user request to the optimal PoP via Anycast or GeoDNS. It caches static content in tiered storage (RAM + SSD) with configurable TTLs and sub-5-second global purge propagation, and supports edge compute (V8 isolates / Wasm) for A/B testing, auth, and geo-personalisation. Video is delivered via HLS/DASH with adaptive bitrate and request collapsing for live streaming. The edge also provides DDoS absorption (200+ Tbps) and a WAF, and accelerates dynamic content via connection pooling and smart routing. Targets: > 90% cache hit ratio and < 50ms TTFB for cached content.
| Metric | Value |
|---|---|
| Edge PoPs | 200+ globally |
| Aggregate network capacity | 200+ Tbps |
| HTTP requests/sec (global) | 20 million+ |
| Cache hit ratio (target) | > 90% |
| TTFB for cached content | < 50ms |
| Purge propagation time | < 5 seconds globally |
| Edge compute cold start | < 5ms (V8 isolates) |
| TLS handshake (to nearest PoP) | < 20ms |
| Video segments served/sec | Millions |
| Internet traffic served by CDNs | 30%+ |
Global content distribution: cache and serve static content (HTML, CSS, JS, images, videos, fonts) from edge servers (PoPs — Points of Presence) deployed in 200+ locations worldwide; user requests routed to the nearest PoP; significantly reduce latency compared to fetching from the origin server
Request routing: intelligently route each user request to the optimal edge PoP; routing criteria: geographic proximity, network latency (RTT), edge server health and load, content availability at the edge; techniques: Anycast (BGP-based, same IP announced from all PoPs, routing to nearest), DNS-based (GeoDNS resolves to nearest PoP IP), HTTP redirect (302 to optimal PoP URL)
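The GeoDNS path above can be sketched as a resolver that picks the healthy PoP closest to the client. This is a minimal sketch: the PoP codes, coordinates, and `nearest_pop` helper are illustrative, and a real resolver would also weigh measured RTT and current PoP load, not just distance.

```python
import math

# Hypothetical PoP coordinates (lat, lon) -- illustrative only.
POPS = {
    "fra": (50.11, 8.68),    # Frankfurt
    "iad": (38.95, -77.45),  # N. Virginia
    "sin": (1.35, 103.99),   # Singapore
}

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371 * math.asin(math.sqrt(h))

def nearest_pop(client_loc, healthy=POPS):
    """GeoDNS resolver core: return the closest healthy PoP code.
    Unhealthy PoPs would simply be dropped from `healthy` before the call."""
    return min(healthy, key=lambda p: haversine_km(client_loc, healthy[p]))
```

A client resolving from Paris would be directed to the Frankfurt PoP; with Anycast the same effect is achieved at the BGP layer, with no application-level lookup at all.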
Caching layer: edge servers cache content with configurable TTL (Cache-Control headers); cache HIT → serve directly from edge (< 10ms); cache MISS → fetch from origin → cache at edge → serve to user; support cache hierarchies: L1 edge PoP → L2 regional hub → origin (reduces origin load on cache misses)
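The L1 edge → L2 regional hub → origin flow can be sketched as a two-tier TTL cache. This is a simplified model (no eviction, single process, in-memory dicts standing in for RAM/SSD tiers); the `origin_fetch` callable and the `now` clock parameter are assumptions for illustration and testability.

```python
import time

class TieredCache:
    """Two-tier lookup: L1 edge -> L2 regional hub -> origin fetch.
    On a miss, the fetched object is cached at both tiers with its TTL;
    an L2 hit is promoted into L1."""

    def __init__(self, origin_fetch, now=time.monotonic):
        self.l1, self.l2 = {}, {}
        self.origin_fetch = origin_fetch  # callable: url -> (body, ttl_seconds)
        self.now = now
        self.hits = self.misses = 0

    def _get(self, tier, url):
        entry = tier.get(url)               # entry is (body, expiry)
        if entry and entry[1] > self.now():  # still fresh
            return entry[0]
        tier.pop(url, None)                  # expired: drop it
        return None

    def get(self, url):
        for tier in (self.l1, self.l2):
            body = self._get(tier, url)
            if body is not None:
                self.hits += 1
                self.l1[url] = tier[url]     # promote L2 hits into L1
                return body
        self.misses += 1
        body, ttl = self.origin_fetch(url)   # cache miss: go to origin
        self.l1[url] = self.l2[url] = (body, self.now() + ttl)
        return body
```

The L2 tier is what keeps a burst of edge-side misses from translating into a burst of origin fetches: only the first regional miss reaches the origin.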
Cache invalidation: when origin content changes → purge stale cached content from all edge PoPs; support: (a) time-based expiration (TTL); (b) explicit purge (API call: purge URL / purge by tag / purge all); (c) stale-while-revalidate (serve stale content while fetching fresh in background); purge propagation to all 200+ PoPs within seconds
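Purge-by-tag needs a secondary index from tag to cached URLs, plus a control-plane fan-out to every PoP. A minimal sketch, with `EdgeCache` and `global_purge` as illustrative names; in production the fan-out is a pub/sub broadcast that must converge across all 200+ PoPs in under 5 seconds, not a synchronous loop.

```python
class EdgeCache:
    """Per-PoP cache keyed by URL, with a tag index for purge-by-tag."""

    def __init__(self):
        self.store = {}      # url -> body
        self.tag_index = {}  # tag -> set of urls carrying that tag

    def put(self, url, body, tags=()):
        self.store[url] = body
        for t in tags:
            self.tag_index.setdefault(t, set()).add(url)

    def purge_url(self, url):
        self.store.pop(url, None)

    def purge_tag(self, tag):
        for url in self.tag_index.pop(tag, set()):
            self.store.pop(url, None)

def global_purge(pops, tag):
    """Control plane: fan a purge out to every PoP. In a real CDN this is
    an async broadcast (gossip or pub/sub) racing a < 5s propagation SLO."""
    for pop in pops:
        pop.purge_tag(tag)
```

Time-based expiry and stale-while-revalidate layer on top of this: a purge is the explicit path, TTL the passive one.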
Edge computing / serverless at edge: run custom application logic AT the edge PoPs (not at origin); use cases: A/B testing (route users to variant at edge), authentication/authorisation (validate JWT at edge → reject unauthorised requests before they hit origin), request rewriting (URL rewrite, header modification), geo-based personalisation (show regional content/pricing)
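The A/B testing use case hinges on deterministic bucketing: the same user must land in the same variant on every PoP, with no shared state. One common approach (sketched here; the function and parameter names are illustrative) is to hash the user ID with the experiment name:

```python
import hashlib

def ab_variant(user_id, experiment, split=0.5):
    """Deterministic edge-side bucketing: the same (user, experiment) pair
    always maps to the same variant on every PoP, with ~`split` of users
    in variant 'b'. No coordination or shared state needed."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "b" if bucket < split else "a"
```

Because the hash includes the experiment name, buckets are independent across experiments; a user in variant "b" of one test is not biased toward "b" in another.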
Dynamic content acceleration: for non-cacheable dynamic requests → optimise the path from edge to origin; techniques: persistent connections between edge and origin (connection pooling, pre-warmed TLS), route optimisation (Argo-like smart routing — choose fastest network path, not shortest), TCP optimisation (congestion control tuning, TLS 1.3 0-RTT), request collapsing (coalesce identical concurrent requests to origin)
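Request collapsing can be sketched as a per-URL in-flight table: the first caller becomes the leader and performs the origin fetch, later identical requests block on the same result. A minimal thread-based sketch (class name and structure are illustrative; production edges do this with async I/O):

```python
import threading

class RequestCollapser:
    """Coalesce identical concurrent origin fetches: one fetch to origin,
    every concurrent caller for the same URL shares its result."""

    def __init__(self, fetch):
        self.fetch = fetch        # callable: url -> body
        self.lock = threading.Lock()
        self.inflight = {}        # url -> (done_event, result_box)

    def get(self, url):
        with self.lock:
            entry = self.inflight.get(url)
            leader = entry is None
            if leader:
                entry = (threading.Event(), [])
                self.inflight[url] = entry
        event, box = entry
        if leader:
            try:
                box.append(self.fetch(url))   # only the leader hits origin
            finally:
                with self.lock:
                    del self.inflight[url]
                event.set()                   # wake all waiters
        else:
            event.wait()                      # piggyback on leader's fetch
        return box[0]
```

This is the same idea that protects a live-streaming origin: thousands of viewers requesting the newest HLS segment collapse into a single origin fetch.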
Video / large file delivery: support streaming video (HLS/DASH — chunked delivery, adaptive bitrate); large file downloads (range requests, resumable); byte-range caching (cache individual chunks independently); prefetching next chunks based on playback position; video on demand (VoD) and live streaming support
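Byte-range caching works by aligning arbitrary Range requests to fixed-size chunks, so each chunk is fetched from origin at most once and can satisfy any overlapping future range. A sketch, assuming a dict-backed cache and a `fetch_chunk(url, start, end)` origin callable (both illustrative):

```python
def serve_range(url, start, end, cache, fetch_chunk, chunk=1 << 20):
    """Assemble an inclusive [start, end] byte-range response from
    chunk-aligned cache entries (default chunk: 1 MiB). Each chunk is
    fetched from origin at most once, then reused for any future range
    that overlaps it."""
    first, last = start // chunk, end // chunk
    body = b""
    for idx in range(first, last + 1):
        key = (url, idx)
        if key not in cache:
            # Fetch the whole aligned chunk, even if the request only
            # needs part of it -- that is what makes it reusable.
            cache[key] = fetch_chunk(url, idx * chunk, (idx + 1) * chunk - 1)
        body += cache[key]
    offset = start - first * chunk
    return body[offset:offset + (end - start + 1)]
```

Prefetching the next chunk based on playback position is the natural extension: while serving chunk `n`, warm `(url, n + 1)` in the background.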
DDoS protection and WAF: edge PoPs absorb DDoS attacks (volumetric, SYN flood) by distributing traffic across the global network; Web Application Firewall (WAF) at the edge — inspect HTTP requests, block SQL injection / XSS / bot traffic before it reaches origin; rate limiting per IP/region
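Per-IP rate limiting at the edge is commonly a token bucket: a steady refill rate with a bounded burst allowance. A minimal sketch (the class and the injectable `now` clock are illustrative; real edges shard these buckets per PoP and often sync coarse counts globally):

```python
import time

class TokenBucket:
    """Per-client token bucket: refills at `rate` tokens/sec, holds at
    most `capacity` tokens, so clients can burst briefly but not sustain
    more than `rate` requests/sec."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate, self.capacity, self.now = rate, capacity, now
        self.tokens, self.last = capacity, now()

    def allow(self):
        t = self.now()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over limit: reject (e.g. HTTP 429) at the edge
```

Volumetric DDoS absorption is different in kind: it relies on Anycast spreading attack traffic across the whole 200+ Tbps network so no single PoP saturates, with the WAF and rate limiters handling the application-layer remainder.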
TLS termination and certificate management: terminate TLS at the edge (reduce TLS handshake latency — user negotiates with nearby edge, not distant origin); manage SSL certificates (auto-provisioning via Let's Encrypt, custom certificates); support TLS 1.3, HTTP/2, HTTP/3 (QUIC); edge-to-origin connection can be separate TLS or plaintext internal
Analytics and observability: real-time analytics per PoP: cache hit ratio, bandwidth served, request count, latency percentiles, error rates, top URLs; per-customer dashboards; access logs (streaming to customer's storage — S3/BigQuery); alerting on origin errors, cache hit ratio drops, latency spikes; global traffic map visualisation
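Two of the headline metrics above reduce to simple computations over per-PoP counters and latency samples. A sketch using nearest-rank percentiles (helper names are illustrative; real pipelines use streaming sketches like t-digest rather than sorting raw samples):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: p=0.99 returns the P99 value."""
    s = sorted(samples)
    k = max(0, math.ceil(p * len(s)) - 1)
    return s[k]

def cache_hit_ratio(hits, misses):
    """Fraction of requests served from edge cache (target: > 0.90)."""
    total = hits + misses
    return hits / total if total else 0.0
```

Alerting then becomes threshold checks on these values per PoP, e.g. page when `cache_hit_ratio` dips below 0.90 or P99 TTFB for cached content exceeds 50ms.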
Non-functional requirements define the system qualities critical to your users. Frame them as 'The system should be able to...' statements. These will guide your deep dives later.
Think about CAP theorem trade-offs, scalability limits, latency targets, durability guarantees, security requirements, fault tolerance, and compliance needs.
Frame NFRs for this specific system. 'TTFB under 50ms for cached content' is far more valuable than just 'low latency'.
Add concrete numbers: 'P99 response time < 500ms', '99.9% availability', '10M DAU'. This drives architectural decisions.
Choose the 3-5 most critical NFRs. Every system should be 'scalable', but what makes THIS system's scaling uniquely challenging?