Every TCP connection requires a handshake. Every TLS connection requires cryptographic negotiation. These setup costs, measured in round trips and milliseconds, dominate latency for short-lived requests—exactly the pattern seen in API calls and dynamic content.
Edge termination (covered earlier) solves this for user-to-edge connections. But what about edge-to-origin? If each user request creates a new connection from edge to origin, we've just shifted the problem rather than solving it.
Connection reuse is the solution: CDN edge servers maintain pools of pre-established, warmed-up connections to origin servers, eliminating handshake overhead entirely for forwarded requests.
This page covers connection pooling architecture, HTTP/1.1 keep-alive limitations, HTTP/2 multiplexing advantages, pool sizing strategies, and the operational considerations of maintaining persistent connections across global infrastructure.
CDN connection reuse operates at two distinct layers, each providing cumulative benefits:
```
LAYER 1: User ↔ Edge (Client-Facing)
┌─────────────────────────────────────────────────────────────────┐
│ Users maintain keep-alive connections to nearby edge servers    │
│ Benefits: TLS session resumption, HTTP/2 multiplexing           │
│ Scope: Per-user connection state                                │
└─────────────────────────────────────────────────────────────────┘
                         │
                         │ User request arrives at edge
                         ▼
LAYER 2: Edge ↔ Origin (Backend)
┌─────────────────────────────────────────────────────────────────┐
│ Edge servers maintain shared connection pools to origins        │
│ Benefits: Zero handshake latency, warmed TCP windows            │
│ Scope: Shared across many user requests                         │
└─────────────────────────────────────────────────────────────────┘

Example flow:
1. User request arrives at edge (uses existing keep-alive)
2. Edge checks pool for available connection to origin
3. If available: forward immediately (0ms handshake)
4. If not available: create new connection OR queue request
5. Response returns through same connections
6. Connections return to pools for next request
```

The multiplicative advantage:
Consider a CDN edge server handling 10,000 requests per second to a single origin. Without connection reuse, every one of those requests pays a fresh TCP and TLS handshake—two to three round trips—before a single byte of application data moves. With connection reuse, forwarded requests start immediately on pre-established, warmed connections, and the per-request handshake cost drops to zero.
Connection reuse also provides a connection aggregation benefit. Instead of the origin server seeing 10,000 concurrent connections from 10,000 users, it sees perhaps 100-500 connections from the CDN edge. This dramatically reduces origin server connection overhead and improves its scalability.
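The handshake and aggregation arithmetic above can be made concrete with a small back-of-envelope calculation. All numbers below are illustrative assumptions (RTT, pool size), not measurements from any particular CDN:

```typescript
// Illustrative back-of-envelope numbers, not measurements from a real CDN.
const requestsPerSecond = 10_000; // edge-to-origin request rate
const rttMs = 50;                 // assumed edge-to-origin round-trip time
const handshakeRtts = 3;          // 1 RTT TCP + 2 RTT TLS 1.2 (TLS 1.3 needs 1)

// Without reuse: every request pays the full handshake before data moves.
const handshakeCostMs = handshakeRtts * rttMs;
const wastedMsPerSecond = requestsPerSecond * handshakeCostMs;

// With reuse: requests ride pre-established connections (0 handshake RTTs).
console.log(`Per-request handshake cost avoided: ${handshakeCostMs} ms`);
console.log(`Handshake time avoided per second: ${wastedMsPerSecond / 1000} s`);

// Aggregation: the origin sees the pool, not one connection per user.
const pooledConnections = 300; // hypothetical pool size
const reductionFactor = requestsPerSecond / pooledConnections;
console.log(`Origin connection reduction: ~${reductionFactor.toFixed(0)}x`);
```

At a 50 ms RTT, each avoided handshake saves 150 ms of pure waiting, and the origin handles two orders of magnitude fewer connections.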
HTTP/1.1 introduced persistent connections via the Connection: keep-alive header, allowing multiple HTTP requests to share a single TCP connection. This was a major improvement over HTTP/1.0 (new connection per request), but still carries significant limitations.
```http
# Request with keep-alive
GET /api/users/123 HTTP/1.1
Host: api.example.com
Connection: keep-alive

# Response allowing keep-alive
HTTP/1.1 200 OK
Content-Type: application/json
Connection: keep-alive
Keep-Alive: timeout=30, max=100
Content-Length: 256

{"id": 123, "name": "Example User", ...}

# Same connection reused for next request
GET /api/users/123/orders HTTP/1.1
Host: api.example.com
Connection: keep-alive
...
```
```
Connection 1 (single TCP connection, sequential processing):

Time (ms)  | Activity
-----------+--------------------------------------------------------
0          | Request A sent (large report, ~500ms to generate)
0-500      | Request B waiting (even though it's just a quick lookup)
0-500      | Request C waiting
500        | Response A returns
500        | Request B can now be processed (10ms)
510        | Response B returns
510        | Request C can now be processed (10ms)
520        | Response C returns

Total time: 520ms
If parallelized: max(500, 10, 10) = 500ms
Requests B and C delayed 490ms unnecessarily!
```

CDN strategies for HTTP/1.1 origins:
Even when CDN edges use HTTP/2 or HTTP/3 to users, many origin servers still speak HTTP/1.1. CDN edges handle this mismatch by maintaining larger pools of parallel HTTP/1.1 connections (so concurrent requests don't queue behind one another), fanning user-side HTTP/2 streams out across those pooled connections, and aggressively reusing idle keep-alive connections.
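Because each HTTP/1.1 connection carries only one in-flight request, the pool must explicitly track which connections are busy. A minimal TypeScript sketch of that busy/idle bookkeeping (class and method names are hypothetical, and real handshake logic is elided):

```typescript
// Minimal sketch of an HTTP/1.1 origin pool. Each connection can carry only
// one in-flight request, so the pool tracks busy vs. idle state explicitly.
type Conn = { id: number; inUse: boolean };

class Http1Pool {
  private conns: Conn[] = [];
  private nextId = 0;

  constructor(private maxConns: number) {}

  // Borrow an idle connection, or open a new one up to the cap.
  acquire(): Conn | undefined {
    const idle = this.conns.find(c => !c.inUse);
    if (idle) {
      idle.inUse = true;
      return idle;
    }
    if (this.conns.length < this.maxConns) {
      // Real code would perform the TCP + TLS handshake here.
      const conn = { id: this.nextId++, inUse: true };
      this.conns.push(conn);
      return conn;
    }
    return undefined; // pool exhausted: caller must queue or shed the request
  }

  release(conn: Conn): void {
    conn.inUse = false; // connection stays open, ready for the next request
  }

  size(): number {
    return this.conns.length;
  }
}
```

Contrast this with HTTP/2, where a single connection multiplexes many streams and no busy/idle tracking is needed.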
HTTP/2 fundamentally solves HTTP/1.1's connection limitations through multiplexing: multiple independent request/response streams share a single TCP connection, with responses arriving as they're ready rather than in request order.
```
Single HTTP/2 Connection (multiple concurrent streams):

Time (ms)  | Stream 1          | Stream 2          | Stream 3
-----------+-------------------+-------------------+------------------
0          | Request A sent    | Request B sent    | Request C sent
0-500      | (processing)      | (processing)      | (processing)
10         |                   | Response B ready  |
10         |                   | Response B sent   |
15         |                   |                   | Response C ready
15         |                   |                   | Response C sent
500        | Response A ready  |                   |
500        | Response A sent   |                   |

Total time: 500ms (limited by slowest request)
Requests B and C: ~10-15ms each (no waiting!)

Same results as parallel connections, but with ONE connection.
```

| Characteristic | HTTP/1.1 | HTTP/2 |
|---|---|---|
| Requests per connection | 1 active at a time | 100+ concurrent |
| Response ordering | Sequential (HoL) | Any order (no HoL) |
| Header handling | Repeated per request | HPACK compression |
| Connections needed | 6-8 per origin | 1 per origin |
| Handshake overhead | Multiplied by connections | Once per origin |
| Server resource usage | High (many connections) | Low (few connections) |
HTTP/2 for CDN edge-to-origin connections:
HTTP/2 is transformative for edge-to-origin communication:
Single connection per origin per edge server: Instead of 100+ HTTP/1.1 connections, one HTTP/2 connection handles all traffic
Optimal congestion window: That single connection maintains a fully warmed congestion window with maximum throughput capacity
Reduced origin load: Origin servers see dramatically fewer connections (1 per edge server vs. 100+ per edge server)
Header compression: HPACK compression reduces repetitive header overhead by 80-90%
Priority signaling: Edge can signal request priorities to origin for intelligent scheduling
```typescript
class HTTP2ConnectionPool {
  private connections: Map<string, HTTP2Connection> = new Map();

  async getConnection(origin: string): Promise<HTTP2Connection> {
    let conn = this.connections.get(origin);
    if (!conn || !conn.isOpen()) {
      // Create new HTTP/2 connection to origin
      conn = await this.createConnection(origin);
      this.connections.set(origin, conn);
    }
    // HTTP/2: single connection handles all streams
    // No need to track "in-use" vs "available" like HTTP/1.1
    return conn;
  }

  async forwardRequest(request: Request): Promise<Response> {
    const conn = await this.getConnection(request.origin);

    // Create a new stream on the existing connection
    const stream = conn.createStream();

    // Send request - doesn't block other streams
    await stream.sendHeaders(request.headers);
    if (request.body) {
      await stream.sendData(request.body);
    }

    // Receive response - other streams continue independently
    const responseHeaders = await stream.receiveHeaders();
    const responseBody = await stream.receiveData();

    return new Response(responseHeaders, responseBody);
  }
}
```

HTTP/2 eliminates application-layer HoL blocking, but TCP underneath still has it. If a TCP packet is lost, all HTTP/2 streams pause until retransmission completes. This is why HTTP/3 (QUIC) moves to UDP—eliminating TCP-level HoL blocking entirely.
Connection pools must be sized appropriately—too small creates queuing delays; too large wastes resources and can overwhelm origins. Optimal sizing depends on traffic patterns, origin capacity, and latency requirements.
```
Goal: Size pool to handle peak request rate without queuing

Variables:
  - R = Request rate (requests/second) to this origin
  - L = Average request latency (seconds) through origin
  - C = Connections in pool
  - For HTTP/1.1: Each connection handles 1/L requests/second
  - For HTTP/2: Single connection handles many concurrent streams

HTTP/1.1 Pool Sizing:
  Capacity per connection = 1/L requests/second
  Required connections: C = R × L

  Example: 1000 req/s, 100ms average latency
    C = 1000 × 0.1 = 100 connections needed
    With headroom: C = 100 × 1.5 = 150 connections

HTTP/2 Pool Sizing:
  Typically 1-2 connections per origin per edge server
  Each connection handles 100+ concurrent streams
  Streams limited by: MAX_CONCURRENT_STREAMS setting (default 100)

  Example: 1000 req/s, 100ms average latency
    Concurrent requests: 1000 × 0.1 = 100
    One HTTP/2 connection with 100 max streams suffices
    Add second for redundancy and extreme bursts
```
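The sizing formulas above translate directly into code. A small TypeScript sketch using the same example numbers (the 1.5× headroom factor is the illustrative default from the formula, not a universal constant):

```typescript
// Little's Law-style pool sizing, following the formulas above.
function http1PoolSize(reqPerSec: number, avgLatencySec: number,
                       headroom = 1.5): number {
  // Each HTTP/1.1 connection serves 1/L requests per second,
  // so C = R × L connections are needed to avoid queuing.
  return Math.ceil(reqPerSec * avgLatencySec * headroom);
}

function http2ConnectionCount(reqPerSec: number, avgLatencySec: number,
                              maxConcurrentStreams = 100): number {
  // Concurrent in-flight requests = R × L; each HTTP/2 connection
  // carries up to MAX_CONCURRENT_STREAMS of them.
  const concurrent = reqPerSec * avgLatencySec;
  return Math.max(1, Math.ceil(concurrent / maxConcurrentStreams));
}

console.log(http1PoolSize(1000, 0.1));        // 150 (100 × 1.5 headroom)
console.log(http2ConnectionCount(1000, 0.1)); // 1
```

The same load that needs 150 HTTP/1.1 connections fits on a single HTTP/2 connection, which is why the pool structures in the two cases look so different.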
```nginx
upstream origin_api {
    # Origin server address
    server origin.example.com:443;

    # Connection pool settings
    keepalive 100;             # Max idle keep-alive connections cached per worker
    keepalive_requests 10000;  # Max requests per connection before recycling
    keepalive_timeout 60s;     # Idle timeout
}

server {
    listen 443 ssl http2;

    location /api/ {
        proxy_pass https://origin_api;

        # Connection reuse settings
        proxy_socket_keepalive on;
        proxy_connect_timeout 10s;
        proxy_read_timeout 120s;

        # Upstream keep-alive requires HTTP/1.1 and a cleared Connection
        # header (nginx's proxy module speaks HTTP/1.1 to upstreams, not HTTP/2)
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # For WebSocket endpoints, forward upgrade headers instead:
        # proxy_set_header Upgrade $http_upgrade;
        # proxy_set_header Connection $connection_upgrade;
    }
}
```

Origin servers have their own connection limits. A CDN with 200 edge PoPs, each maintaining 100 connections, creates 20,000 origin connections. This can overwhelm origins not designed for it. Origin shield (covered in Module 4) helps by consolidating edge-to-origin traffic.
Connection warming ensures that when traffic arrives, pooled connections are already established and ready. Cold pools (no pre-established connections) add latency for initial requests after traffic lulls.
```typescript
class ConnectionPool {
  private warmConnections: number;
  private maxConnections: number;
  private connections: Connection[] = [];
  private origin: string;

  constructor(config: PoolConfig) {
    this.origin = config.origin;
    this.warmConnections = config.warm;  // e.g., 10
    this.maxConnections = config.max;    // e.g., 100

    // Immediately establish warm connections
    this.warmPool();

    // Periodically check and restore warm level
    setInterval(() => this.maintainWarmth(), 10000);
  }

  private availableCount(): number {
    return this.connections.length;
  }

  private async warmPool(): Promise<void> {
    const toCreate = this.warmConnections - this.availableCount();
    if (toCreate > 0) {
      console.log(`Warming pool: creating ${toCreate} connections`);
      await Promise.all(
        Array(toCreate).fill(0).map(() => this.createConnection())
      );
    }
  }

  private async maintainWarmth(): Promise<void> {
    // Remove dead connections
    this.connections = this.connections.filter(c => c.isHealthy());

    // Health check remaining connections
    await Promise.all(
      this.connections.map(c => c.healthCheck())
    );

    // Restore to warm level if below
    await this.warmPool();
  }

  private async createConnection(): Promise<Connection> {
    const conn = new Connection(this.origin);

    // Full TCP + TLS handshake happens HERE, before real traffic
    await conn.connect();

    // Send HTTP/2 SETTINGS, wait for response
    await conn.establishHTTP2();

    // Now connection is fully warmed
    this.connections.push(conn);
    return conn;
  }
}
```

Client-side preconnect hints:
CDNs can also leverage client-side preconnect to warm connections proactively. The <link rel="preconnect"> hint tells browsers to establish connections before they're needed.
```html
<!-- In HTML head, hint browser to preconnect to CDN edges -->
<link rel="preconnect" href="https://cdn.example.com">
<link rel="dns-prefetch" href="https://cdn.example.com">

<!-- For cross-origin resources with credentialed requests -->
<link rel="preconnect" href="https://api.example.com" crossorigin>

<!--
Browser behavior:
1. DNS resolution happens immediately (dns-prefetch)
2. TCP connection established before needed (preconnect)
3. TLS handshake completed before needed
4. When actual request happens: connection already ready

Savings: Typically 200-400ms for first request to this origin
-->
```

HTTP 103 Early Hints allows servers to send preconnect hints before the final response is ready. The edge can respond with 103 immediately, telling the browser to preconnect to other origins while the edge fetches dynamic content from the origin.
Pooled connections require active management. Network conditions change, servers restart, and keep-alive timeouts vary. A robust connection pool must detect and handle these conditions to maintain reliability.
```typescript
class HealthAwareConnection {
  private createdAt: number = Date.now();
  private lastActivity: number = Date.now();
  private requestCount: number = 0;
  private consecutiveErrors: number = 0;
  private socket: Socket;  // underlying transport, established elsewhere

  constructor(
    private maxAge: number = 3600000,     // 1 hour max lifetime
    private maxRequests: number = 10000,  // recycle after N requests
    private idleTimeout: number = 30000,  // 30 second idle detection
    private maxErrors: number = 3         // mark unhealthy after N errors
  ) {}

  isHealthy(): boolean {
    // Check all health conditions
    if (!this.socket.isConnected()) return false;
    if (Date.now() - this.createdAt > this.maxAge) return false;
    if (this.requestCount >= this.maxRequests) return false;
    if (this.consecutiveErrors >= this.maxErrors) return false;
    return true;
  }

  needsHealthCheck(): boolean {
    // Check if idle too long (might be half-open)
    return Date.now() - this.lastActivity > this.idleTimeout;
  }

  async healthCheck(): Promise<boolean> {
    try {
      // For HTTP/2: send PING frame
      await this.socket.sendPing();
      // Or for HTTP/1.1: send OPTIONS or HEAD request
      // await this.socket.sendHeadRequest('/health');

      this.consecutiveErrors = 0;
      this.lastActivity = Date.now();
      return true;
    } catch (error) {
      this.consecutiveErrors++;
      return false;
    }
  }

  async sendRequest(request: Request): Promise<Response> {
    // Check if health check needed before use
    if (this.needsHealthCheck()) {
      if (!await this.healthCheck()) {
        throw new ConnectionUnhealthyError('Connection failed health check');
      }
    }

    try {
      const response = await this._send(request);
      this.requestCount++;
      this.consecutiveErrors = 0;
      this.lastActivity = Date.now();
      return response;
    } catch (error) {
      this.consecutiveErrors++;
      throw error;
    }
  }
}
```

Connection recycling strategies:
Age-based recycling: Close connections older than N hours to prevent accumulation of subtle issues
Request-based recycling: Close after N requests to prevent memory/state buildup
Error-based recycling: After consecutive errors, mark connection unhealthy and replace
Graceful drain: When recycling, stop sending new requests but allow in-flight to complete
Exponential backoff: After connection failures, delay reconnection attempts with increasing intervals
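The exponential backoff strategy in the last item can be sketched in a few lines. Base and cap values below are illustrative defaults, and "full jitter" (randomizing over the whole window) is one common variant among several:

```typescript
// Exponential backoff with full jitter for reconnection attempts.
// Base/cap values are illustrative, not from any specific CDN.
function maxReconnectDelayMs(attempt: number, baseMs = 100,
                             capMs = 30_000): number {
  // Deterministic upper bound: 100, 200, 400, ... capped at 30s.
  return Math.min(capMs, baseMs * 2 ** attempt);
}

function reconnectDelayMs(attempt: number, baseMs = 100,
                          capMs = 30_000): number {
  // "Full jitter": pick uniformly in [0, bound) so simultaneous
  // failures don't reconnect in a synchronized thundering herd.
  return Math.random() * maxReconnectDelayMs(attempt, baseMs, capMs);
}
```

Without jitter, every edge server that lost connections in the same network event would retry at the same instants, hammering a recovering origin in waves.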
HTTP/3 builds on QUIC, a UDP-based transport protocol, to provide connection reuse benefits that surpass even HTTP/2. QUIC addresses HTTP/2's remaining limitations, particularly TCP-level head-of-line blocking and connection migration.
| Feature | HTTP/2 (over TCP) | HTTP/3 (over QUIC) |
|---|---|---|
| Transport | TCP | UDP with QUIC |
| Handshake | TCP + TLS (2-3 RTT) | Combined (1 RTT, 0-RTT for resume) |
| HoL blocking | TCP-level exists | None (independent streams) |
| Connection migration | Breaks on IP change | Survives network changes |
| Multiplexing | Single connection | Same, but truly independent streams |
| Loss recovery | TCP retransmit delays all | Per-stream, no cross-blocking |
```
First connection to server:

Client                                 Server
  |                                      |
  |-- QUIC Initial + ClientHello ------->|  (includes 0-RTT attempt)
  |<-- QUIC Initial + ServerHello -------|  RTT 1
  |-- QUIC Handshake Done -------------->|
  |<-- QUIC Handshake Done --------------|
  |                                      |
  |-- HTTP Request --------------------->|  RTT 2 (with data)
  |<-- HTTP Response --------------------|

(Server provides session ticket for future 0-RTT)

Subsequent connections (0-RTT):

Client                                 Server
  |                                      |
  |-- QUIC Initial + ClientHello ------->|
  |   + 0-RTT HTTP Request               |  <- Data in first packet!
  |<-- QUIC Initial + ServerHello -------|
  |   + 0.5-RTT HTTP Response            |  RTT 1 (response starts!)
  |-- Handshake complete --------------->|
  |<-- Handshake complete ---------------|
  |                                      |

Savings: Full round trip for repeat connections!
```

QUIC advantages for CDN edge-to-origin:
Connection migration: If an edge server's IP changes (network event), QUIC connections survive via connection IDs. No need to re-establish.
True stream independence: Lost packet in stream A doesn't block streams B, C, D. Crucial for multiplexed dynamic content where one slow response shouldn't slow others.
0-RTT for repeat connections: When resuming a connection to a previously seen server, data can be sent immediately without any handshake delay.
Improved loss recovery: QUIC's ACK frames and loss detection are more sophisticated than TCP, recovering from loss faster.
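The handshake savings in the table and diagram above reduce to simple arithmetic. A quick TypeScript sketch (the 60 ms RTT is an illustrative assumption; RTT counts are the commonly cited ones for each protocol):

```typescript
// Connection setup time before request bytes can flow, by protocol.
// TCP (1 RTT) + TLS 1.3 (1 RTT) = 2 RTTs; QUIC combines transport and
// crypto handshakes into 1 RTT; 0-RTT resumption sends the request in
// the very first flight.
const rttMs = 60; // assumed edge-to-origin round-trip time

const setupRtts: Record<string, number> = {
  "tcp+tls1.3": 2,
  "quic-1rtt": 1,
  "quic-0rtt": 0,
};

for (const [proto, rtts] of Object.entries(setupRtts)) {
  console.log(`${proto}: ${rtts * rttMs} ms of setup before data flows`);
}
```

On this path, QUIC saves 60 ms on a fresh connection and the full 120 ms on a resumed one, before any application processing begins.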
Major CDNs (Cloudflare, Fastly, Akamai) support HTTP/3 for edge-to-user connections. Edge-to-origin HTTP/3 is less common due to origin server support and UDP handling in corporate networks. Adoption is growing rapidly as benefits become clear.
Connection reuse forms a critical pillar of CDN dynamic content acceleration. By maintaining pools of ready-to-use connections, CDNs eliminate the handshake overhead that would otherwise dominate latency for short-lived requests.
What's next:
The next page explores route optimization—how CDNs select the fastest network paths for traffic, dynamically adapting to congestion, outages, and changing network conditions to minimize latency.
You now understand connection reuse strategies that CDNs employ to eliminate per-request connection overhead. Combined with edge termination and TCP optimization, connection reuse explains how CDNs dramatically accelerate even completely uncacheable dynamic content.