Every TCP connection requires a handshake. Every TLS connection requires cryptographic negotiation. These setup costs, measured in round trips and milliseconds, dominate latency for short-lived requests—exactly the pattern seen in API calls and dynamic content.
Edge termination (covered earlier) solves this for user-to-edge connections. But what about edge-to-origin? If each user request creates a new connection from edge to origin, we've just shifted the problem rather than solving it.
Connection reuse is the solution: CDN edge servers maintain pools of pre-established, warmed-up connections to origin servers, eliminating handshake overhead entirely for forwarded requests.
This page covers connection pooling architecture, HTTP/1.1 keep-alive limitations, HTTP/2 multiplexing advantages, pool sizing strategies, and the operational considerations of maintaining persistent connections across global infrastructure.
CDN connection reuse operates at two distinct layers, each providing cumulative benefits:
```
LAYER 1: User ↔ Edge (Client-Facing)
┌─────────────────────────────────────────────────────────────────┐
│ Users maintain keep-alive connections to nearby edge servers    │
│ Benefits: TLS session resumption, HTTP/2 multiplexing           │
│ Scope: Per-user connection state                                │
└─────────────────────────────────────────────────────────────────┘
                         │
                         │ User request arrives at edge
                         ▼
LAYER 2: Edge ↔ Origin (Backend)
┌─────────────────────────────────────────────────────────────────┐
│ Edge servers maintain shared connection pools to origins        │
│ Benefits: Zero handshake latency, warmed TCP windows            │
│ Scope: Shared across many user requests                         │
└─────────────────────────────────────────────────────────────────┘

Example flow:
1. User request arrives at edge (uses existing keep-alive)
2. Edge checks pool for available connection to origin
3. If available: forward immediately (0ms handshake)
4. If not available: create new connection OR queue request
5. Response returns through same connections
6. Connections return to pools for next request
```

The multiplicative advantage:
Consider a CDN edge server handling 10,000 requests per second to a single origin. Without connection reuse, every one of those requests pays a fresh TCP and TLS handshake—two to three round trips—before a single byte of application data moves. With connection reuse, forwarded requests start immediately on pre-established, warmed connections, and the per-request handshake cost drops to zero.
Connection reuse also provides a connection aggregation benefit. Instead of the origin server seeing 10,000 concurrent connections from 10,000 users, it sees perhaps 100-500 connections from the CDN edge. This dramatically reduces origin server connection overhead and improves its scalability.
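The handshake and aggregation arithmetic above can be made concrete with a small back-of-envelope calculation. All numbers below are illustrative assumptions (RTT, pool size), not measurements from any particular CDN:

```typescript
// Illustrative back-of-envelope numbers, not measurements from a real CDN.
const requestsPerSecond = 10_000; // edge-to-origin request rate
const rttMs = 50;                 // assumed edge-to-origin round-trip time
const handshakeRtts = 3;          // 1 RTT TCP + 2 RTT TLS 1.2 (TLS 1.3 needs 1)

// Without reuse: every request pays the full handshake before data moves.
const handshakeCostMs = handshakeRtts * rttMs;
const wastedMsPerSecond = requestsPerSecond * handshakeCostMs;

// With reuse: requests ride pre-established connections (0 handshake RTTs).
console.log(`Per-request handshake cost avoided: ${handshakeCostMs} ms`);
console.log(`Handshake time avoided per second: ${wastedMsPerSecond / 1000} s`);

// Aggregation: the origin sees the pool, not one connection per user.
const pooledConnections = 300; // hypothetical pool size
const reductionFactor = requestsPerSecond / pooledConnections;
console.log(`Origin connection reduction: ~${reductionFactor.toFixed(0)}x`);
```

At a 50 ms RTT, each avoided handshake saves 150 ms of pure waiting, and the origin handles two orders of magnitude fewer connections.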
HTTP/1.1 introduced persistent connections via the Connection: keep-alive header, allowing multiple HTTP requests to share a single TCP connection. This was a major improvement over HTTP/1.0 (new connection per request), but still carries significant limitations.
```http
# Request with keep-alive
GET /api/users/123 HTTP/1.1
Host: api.example.com
Connection: keep-alive

# Response allowing keep-alive
HTTP/1.1 200 OK
Content-Type: application/json
Connection: keep-alive
Keep-Alive: timeout=30, max=100
Content-Length: 256

{"id": 123, "name": "Example User", ...}

# Same connection reused for next request
GET /api/users/123/orders HTTP/1.1
Host: api.example.com
Connection: keep-alive
...
```
```
Connection 1 (single TCP connection, sequential processing):

Time (ms)  | Activity
-----------+--------------------------------------------------------
0          | Request A sent (large report, ~500ms to generate)
0-500      | Request B waiting (even though it's just a quick lookup)
0-500      | Request C waiting
500        | Response A returns
500        | Request B can now be processed (10ms)
510        | Response B returns
510        | Request C can now be processed (10ms)
520        | Response C returns

Total time: 520ms
If parallelized: max(500, 10, 10) = 500ms
Requests B and C delayed 490ms unnecessarily!
```

CDN strategies for HTTP/1.1 origins:
Even when CDN edges use HTTP/2 or HTTP/3 to users, many origin servers still speak HTTP/1.1. CDN edges handle this mismatch by maintaining larger pools of parallel HTTP/1.1 connections (so concurrent requests don't queue behind one another), fanning user-side HTTP/2 streams out across those pooled connections, and aggressively reusing idle keep-alive connections.
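Because each HTTP/1.1 connection carries only one in-flight request, the pool must explicitly track which connections are busy. A minimal TypeScript sketch of that busy/idle bookkeeping (class and method names are hypothetical, and real handshake logic is elided):

```typescript
// Minimal sketch of an HTTP/1.1 origin pool. Each connection can carry only
// one in-flight request, so the pool tracks busy vs. idle state explicitly.
type Conn = { id: number; inUse: boolean };

class Http1Pool {
  private conns: Conn[] = [];
  private nextId = 0;

  constructor(private maxConns: number) {}

  // Borrow an idle connection, or open a new one up to the cap.
  acquire(): Conn | undefined {
    const idle = this.conns.find(c => !c.inUse);
    if (idle) {
      idle.inUse = true;
      return idle;
    }
    if (this.conns.length < this.maxConns) {
      // Real code would perform the TCP + TLS handshake here.
      const conn = { id: this.nextId++, inUse: true };
      this.conns.push(conn);
      return conn;
    }
    return undefined; // pool exhausted: caller must queue or shed the request
  }

  release(conn: Conn): void {
    conn.inUse = false; // connection stays open, ready for the next request
  }

  size(): number {
    return this.conns.length;
  }
}
```

Contrast this with HTTP/2, where a single connection multiplexes many streams and no busy/idle tracking is needed.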
HTTP/2 fundamentally solves HTTP/1.1's connection limitations through multiplexing: multiple independent request/response streams share a single TCP connection, with responses arriving as they're ready rather than in request order.
```
Single HTTP/2 Connection (multiple concurrent streams):

Time (ms)  | Stream 1          | Stream 2          | Stream 3
-----------+-------------------+-------------------+------------------
0          | Request A sent    | Request B sent    | Request C sent
0-500      | (processing)      | (processing)      | (processing)
10         |                   | Response B ready  |
10         |                   | Response B sent   |
15         |                   |                   | Response C ready
15         |                   |                   | Response C sent
500        | Response A ready  |                   |
500        | Response A sent   |                   |

Total time: 500ms (limited by slowest request)
Requests B and C: ~10-15ms each (no waiting!)

Same results as parallel connections, but with ONE connection.
```

| Characteristic | HTTP/1.1 | HTTP/2 |
|---|---|---|
| Requests per connection | 1 active at a time | 100+ concurrent |
| Response ordering | Sequential (HoL) | Any order (no HoL) |
| Header handling | Repeated per request | HPACK compression |
| Connections needed | 6-8 per origin | 1 per origin |
| Handshake overhead | Multiplied by connections | Once per origin |
| Server resource usage | High (many connections) | Low (few connections) |
HTTP/2 for CDN edge-to-origin connections:
HTTP/2 is transformative for edge-to-origin communication:
Single connection per origin per edge server: Instead of 100+ HTTP/1.1 connections, one HTTP/2 connection handles all traffic
Optimal congestion window: That single connection maintains a fully warmed congestion window with maximum throughput capacity
Reduced origin load: Origin servers see dramatically fewer connections (1 per edge server vs. 100+ per edge server)
Header compression: HPACK compression reduces repetitive header overhead by 80-90%
Priority signaling: Edge can signal request priorities to origin for intelligent scheduling
```typescript
class HTTP2ConnectionPool {
  private connections: Map<string, HTTP2Connection> = new Map();

  async getConnection(origin: string): Promise<HTTP2Connection> {
    let conn = this.connections.get(origin);
    if (!conn || !conn.isOpen()) {
      // Create new HTTP/2 connection to origin
      conn = await this.createConnection(origin);
      this.connections.set(origin, conn);
    }
    // HTTP/2: single connection handles all streams
    // No need to track "in-use" vs "available" like HTTP/1.1
    return conn;
  }

  async forwardRequest(request: Request): Promise<Response> {
    const conn = await this.getConnection(request.origin);

    // Create a new stream on the existing connection
    const stream = conn.createStream();

    // Send request - doesn't block other streams
    await stream.sendHeaders(request.headers);
    if (request.body) {
      await stream.sendData(request.body);
    }

    // Receive response - other streams continue independently
    const responseHeaders = await stream.receiveHeaders();
    const responseBody = await stream.receiveData();

    return new Response(responseHeaders, responseBody);
  }
}
```

HTTP/2 eliminates application-layer HoL blocking, but TCP underneath still has it. If a TCP packet is lost, all HTTP/2 streams pause until retransmission completes. This is why HTTP/3 (QUIC) moves to UDP—eliminating TCP-level HoL blocking entirely.
Connection pools must be sized appropriately—too small creates queuing delays; too large wastes resources and can overwhelm origins. Optimal sizing depends on traffic patterns, origin capacity, and latency requirements.
```
Goal: Size pool to handle peak request rate without queuing

Variables:
  - R = Request rate (requests/second) to this origin
  - L = Average request latency (seconds) through origin
  - C = Connections in pool
  - For HTTP/1.1: Each connection handles 1/L requests/second
  - For HTTP/2: Single connection handles many concurrent streams

HTTP/1.1 Pool Sizing:
  Capacity per connection = 1/L requests/second
  Required connections: C = R × L

  Example: 1000 req/s, 100ms average latency
    C = 1000 × 0.1 = 100 connections needed
    With headroom: C = 100 × 1.5 = 150 connections

HTTP/2 Pool Sizing:
  Typically 1-2 connections per origin per edge server
  Each connection handles 100+ concurrent streams
  Streams limited by: MAX_CONCURRENT_STREAMS setting (default 100)

  Example: 1000 req/s, 100ms average latency
    Concurrent requests: 1000 × 0.1 = 100
    One HTTP/2 connection with 100 max streams suffices
    Add second for redundancy and extreme bursts
```
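The sizing formulas above translate directly into code. A small TypeScript sketch using the same example numbers (the 1.5× headroom factor is the illustrative default from the formula, not a universal constant):

```typescript
// Little's Law-style pool sizing, following the formulas above.
function http1PoolSize(reqPerSec: number, avgLatencySec: number,
                       headroom = 1.5): number {
  // Each HTTP/1.1 connection serves 1/L requests per second,
  // so C = R × L connections are needed to avoid queuing.
  return Math.ceil(reqPerSec * avgLatencySec * headroom);
}

function http2ConnectionCount(reqPerSec: number, avgLatencySec: number,
                              maxConcurrentStreams = 100): number {
  // Concurrent in-flight requests = R × L; each HTTP/2 connection
  // carries up to MAX_CONCURRENT_STREAMS of them.
  const concurrent = reqPerSec * avgLatencySec;
  return Math.max(1, Math.ceil(concurrent / maxConcurrentStreams));
}

console.log(http1PoolSize(1000, 0.1));        // 150 (100 × 1.5 headroom)
console.log(http2ConnectionCount(1000, 0.1)); // 1
```

The same load that needs 150 HTTP/1.1 connections fits on a single HTTP/2 connection, which is why the pool structures in the two cases look so different.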
```nginx
upstream origin_api {
    # Origin server address
    server origin.example.com:443;

    # Connection pool settings
    keepalive 100;             # Max idle keep-alive connections cached per worker
    keepalive_requests 10000;  # Max requests per connection before recycling
    keepalive_timeout 60s;     # Idle timeout
}

server {
    listen 443 ssl http2;

    location /api/ {
        proxy_pass https://origin_api;

        # Connection reuse settings
        proxy_socket_keepalive on;
        proxy_connect_timeout 10s;
        proxy_read_timeout 120s;

        # Upstream keep-alive requires HTTP/1.1 and a cleared Connection
        # header (nginx's proxy module speaks HTTP/1.1 to upstreams, not HTTP/2)
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # For WebSocket endpoints, forward upgrade headers instead:
        # proxy_set_header Upgrade $http_upgrade;
        # proxy_set_header Connection $connection_upgrade;
    }
}
```

Origin servers have their own connection limits. A CDN with 200 edge PoPs, each maintaining 100 connections, creates 20,000 origin connections. This can overwhelm origins not designed for it. Origin shield (covered in Module 4) helps by consolidating edge-to-origin traffic.
Connection warming ensures that when traffic arrives, pooled connections are already established and ready. Cold pools (no pre-established connections) add latency for initial requests after traffic lulls.
```typescript
class ConnectionPool {
  private warmConnections: number;
  private maxConnections: number;
  private connections: Connection[] = [];
  private origin: string;

  constructor(config: PoolConfig) {
    this.origin = config.origin;
    this.warmConnections = config.warm;  // e.g., 10
    this.maxConnections = config.max;    // e.g., 100

    // Immediately establish warm connections
    this.warmPool();

    // Periodically check and restore warm level
    setInterval(() => this.maintainWarmth(), 10000);
  }

  private availableCount(): number {
    return this.connections.length;
  }

  private async warmPool(): Promise<void> {
    const toCreate = this.warmConnections - this.availableCount();
    if (toCreate > 0) {
      console.log(`Warming pool: creating ${toCreate} connections`);
      await Promise.all(
        Array(toCreate).fill(0).map(() => this.createConnection())
      );
    }
  }

  private async maintainWarmth(): Promise<void> {
    // Remove dead connections
    this.connections = this.connections.filter(c => c.isHealthy());

    // Health check remaining connections
    await Promise.all(
      this.connections.map(c => c.healthCheck())
    );

    // Restore to warm level if below
    await this.warmPool();
  }

  private async createConnection(): Promise<Connection> {
    const conn = new Connection(this.origin);

    // Full TCP + TLS handshake happens HERE, before real traffic
    await conn.connect();

    // Send HTTP/2 SETTINGS, wait for response
    await conn.establishHTTP2();

    // Now connection is fully warmed
    this.connections.push(conn);
    return conn;
  }
}
```

Client-side preconnect hints:
CDNs can also leverage client-side preconnect to warm connections proactively. The <link rel="preconnect"> hint tells browsers to establish connections before they're needed.
```html
<!-- In HTML head, hint browser to preconnect to CDN edges -->
<link rel="preconnect" href="https://cdn.example.com">
<link rel="dns-prefetch" href="https://cdn.example.com">

<!-- For cross-origin resources with credentialed requests -->
<link rel="preconnect" href="https://api.example.com" crossorigin>

<!--
Browser behavior:
1. DNS resolution happens immediately (dns-prefetch)
2. TCP connection established before needed (preconnect)
3. TLS handshake completed before needed
4. When actual request happens: connection already ready

Savings: Typically 200-400ms for first request to this origin
-->
```

HTTP 103 Early Hints allows servers to send preconnect hints before the final response is ready. The edge can respond with 103 immediately, telling the browser to preconnect to other origins while the edge fetches dynamic content from the origin.
Pooled connections require active management. Network conditions change, servers restart, and keep-alive timeouts vary. A robust connection pool must detect and handle these conditions to maintain reliability.
```typescript
class HealthAwareConnection {
  private createdAt: number = Date.now();
  private lastActivity: number = Date.now();
  private requestCount: number = 0;
  private consecutiveErrors: number = 0;
  private socket: Socket;  // underlying transport, established elsewhere

  constructor(
    private maxAge: number = 3600000,     // 1 hour max lifetime
    private maxRequests: number = 10000,  // recycle after N requests
    private idleTimeout: number = 30000,  // 30 second idle detection
    private maxErrors: number = 3         // mark unhealthy after N errors
  ) {}

  isHealthy(): boolean {
    // Check all health conditions
    if (!this.socket.isConnected()) return false;
    if (Date.now() - this.createdAt > this.maxAge) return false;
    if (this.requestCount >= this.maxRequests) return false;
    if (this.consecutiveErrors >= this.maxErrors) return false;
    return true;
  }

  needsHealthCheck(): boolean {
    // Check if idle too long (might be half-open)
    return Date.now() - this.lastActivity > this.idleTimeout;
  }

  async healthCheck(): Promise<boolean> {
    try {
      // For HTTP/2: send PING frame
      await this.socket.sendPing();
      // Or for HTTP/1.1: send OPTIONS or HEAD request
      // await this.socket.sendHeadRequest('/health');

      this.consecutiveErrors = 0;
      this.lastActivity = Date.now();
      return true;
    } catch (error) {
      this.consecutiveErrors++;
      return false;
    }
  }

  async sendRequest(request: Request): Promise<Response> {
    // Check if health check needed before use
    if (this.needsHealthCheck()) {
      if (!await this.healthCheck()) {
        throw new ConnectionUnhealthyError('Connection failed health check');
      }
    }

    try {
      const response = await this._send(request);
      this.requestCount++;
      this.consecutiveErrors = 0;
      this.lastActivity = Date.now();
      return response;
    } catch (error) {
      this.consecutiveErrors++;
      throw error;
    }
  }
}
```

Connection recycling strategies:
Age-based recycling: Close connections older than N hours to prevent accumulation of subtle issues
Request-based recycling: Close after N requests to prevent memory/state buildup
Error-based recycling: After consecutive errors, mark connection unhealthy and replace
Graceful drain: When recycling, stop sending new requests but allow in-flight to complete
Exponential backoff: After connection failures, delay reconnection attempts with increasing intervals
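The exponential backoff strategy in the last item can be sketched in a few lines. Base and cap values below are illustrative defaults, and "full jitter" (randomizing over the whole window) is one common variant among several:

```typescript
// Exponential backoff with full jitter for reconnection attempts.
// Base/cap values are illustrative, not from any specific CDN.
function maxReconnectDelayMs(attempt: number, baseMs = 100,
                             capMs = 30_000): number {
  // Deterministic upper bound: 100, 200, 400, ... capped at 30s.
  return Math.min(capMs, baseMs * 2 ** attempt);
}

function reconnectDelayMs(attempt: number, baseMs = 100,
                          capMs = 30_000): number {
  // "Full jitter": pick uniformly in [0, bound) so simultaneous
  // failures don't reconnect in a synchronized thundering herd.
  return Math.random() * maxReconnectDelayMs(attempt, baseMs, capMs);
}
```

Without jitter, every edge server that lost connections in the same network event would retry at the same instants, hammering a recovering origin in waves.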
HTTP/3 builds on QUIC, a UDP-based transport protocol, to provide connection reuse benefits that surpass even HTTP/2. QUIC addresses HTTP/2's remaining limitations, particularly TCP-level head-of-line blocking and connection migration.
| Feature | HTTP/2 (over TCP) | HTTP/3 (over QUIC) |
|---|---|---|
| Transport | TCP | UDP with QUIC |
| Handshake | TCP + TLS (2-3 RTT) | Combined (1 RTT, 0-RTT for resume) |
| HoL blocking | TCP-level exists | None (independent streams) |
| Connection migration | Breaks on IP change | Survives network changes |
| Multiplexing | Single connection | Same, but truly independent streams |
| Loss recovery | TCP retransmit delays all | Per-stream, no cross-blocking |
```
First connection to server:

Client                                 Server
  |                                      |
  |-- QUIC Initial + ClientHello ------->|  (includes 0-RTT attempt)
  |<-- QUIC Initial + ServerHello -------|  RTT 1
  |-- QUIC Handshake Done -------------->|
  |<-- QUIC Handshake Done --------------|
  |                                      |
  |-- HTTP Request --------------------->|  RTT 2 (with data)
  |<-- HTTP Response --------------------|

(Server provides session ticket for future 0-RTT)

Subsequent connections (0-RTT):

Client                                 Server
  |                                      |
  |-- QUIC Initial + ClientHello ------->|
  |   + 0-RTT HTTP Request               |  <- Data in first packet!
  |<-- QUIC Initial + ServerHello -------|
  |   + 0.5-RTT HTTP Response            |  RTT 1 (response starts!)
  |-- Handshake complete --------------->|
  |<-- Handshake complete ---------------|
  |                                      |

Savings: Full round trip for repeat connections!
```

QUIC advantages for CDN edge-to-origin:
Connection migration: If an edge server's IP changes (network event), QUIC connections survive via connection IDs. No need to re-establish.
True stream independence: Lost packet in stream A doesn't block streams B, C, D. Crucial for multiplexed dynamic content where one slow response shouldn't slow others.
0-RTT for repeat connections: When resuming a connection to a previously seen server, data can be sent immediately without any handshake delay.
Improved loss recovery: QUIC's ACK frames and loss detection are more sophisticated than TCP, recovering from loss faster.
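The handshake savings in the table and diagram above reduce to simple arithmetic. A quick TypeScript sketch (the 60 ms RTT is an illustrative assumption; RTT counts are the commonly cited ones for each protocol):

```typescript
// Connection setup time before request bytes can flow, by protocol.
// TCP (1 RTT) + TLS 1.3 (1 RTT) = 2 RTTs; QUIC combines transport and
// crypto handshakes into 1 RTT; 0-RTT resumption sends the request in
// the very first flight.
const rttMs = 60; // assumed edge-to-origin round-trip time

const setupRtts: Record<string, number> = {
  "tcp+tls1.3": 2,
  "quic-1rtt": 1,
  "quic-0rtt": 0,
};

for (const [proto, rtts] of Object.entries(setupRtts)) {
  console.log(`${proto}: ${rtts * rttMs} ms of setup before data flows`);
}
```

On this path, QUIC saves 60 ms on a fresh connection and the full 120 ms on a resumed one, before any application processing begins.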
Major CDNs (Cloudflare, Fastly, Akamai) support HTTP/3 for edge-to-user connections. Edge-to-origin HTTP/3 is less common due to origin server support and UDP handling in corporate networks. Adoption is growing rapidly as benefits become clear.
Connection reuse forms a critical pillar of CDN dynamic content acceleration. By maintaining pools of ready-to-use connections, CDNs eliminate the handshake overhead that would otherwise dominate latency for short-lived requests.
What's next:
The next page explores route optimization—how CDNs select the fastest network paths for traffic, dynamically adapting to congestion, outages, and changing network conditions to minimize latency.
You now understand connection reuse strategies that CDNs employ to eliminate per-request connection overhead. Combined with edge termination and TCP optimization, connection reuse explains how CDNs dramatically accelerate even completely uncacheable dynamic content.