Every time your application connects to a database, cache, or external service, it pays a connection establishment tax. This tax includes TCP handshakes, TLS negotiations, authentication exchanges, and protocol initialization. For a single connection, this might seem trivial—perhaps 10-50 milliseconds. But multiply that by thousands of requests per second, and connection overhead becomes a throughput killer.
Consider a web application making a single database query. Total time: 12.1ms, but only 1ms of that was the query itself. The rest was connection overhead.
Now imagine this application handles 1,000 requests per second. That's 1,000 × 10ms = 10 cumulative seconds of connection setup for every second of wall-clock time. Concurrency can hide some of that latency, but the extra packets and the CPU cost of TLS handshakes alone would cripple throughput long before the database did any useful work.
This page explores connection reuse strategies that eliminate establishment overhead: connection pooling, HTTP keep-alive, connection multiplexing (HTTP/2, gRPC), and persistent connections. You'll understand how to configure these mechanisms and the trade-offs involved in each approach.
To understand why connection reuse matters, let's dissect what happens when establishing a new connection.
TCP Three-Way Handshake (Every TCP Connection):
Client Server
│ │
│──────── SYN (seq=x) ──────────────▶│ RTT/2
│ │
│◀─────── SYN-ACK (seq=y, ack=x+1) ──│ RTT/2
│ │
│──────── ACK (ack=y+1) ────────────▶│ RTT/2
│ │
│ Connection Established │
│ │
Minimum time: 1.5 × RTT (Round-Trip Time)
- Same datacenter: 0.5-2ms RTT → ~1-3ms handshake
- Cross-datacenter: 10-50ms RTT → ~15-75ms handshake
- Cross-continent: 100-200ms RTT → ~150-300ms handshake
TLS Handshake (HTTPS, Secure Connections):
TLS 1.2 (Full Handshake):
1. ClientHello → (cipher suites, random)
2. ServerHello ← (chosen cipher, random)
3. Certificate ← (server's certificate)
4. ServerKeyExchange ← (DH parameters)
5. ClientKeyExchange → (encrypted pre-master)
6. ChangeCipherSpec → / ←
7. Finished → / ←
Total: 2 additional RTTs (on top of TCP handshake)
TLS 1.3 (Optimized):
- 1 RTT for full handshake
- 0 RTT for resumption (with PSK)
Cost:
- TLS 1.2: 3.5 RTTs total (TCP + TLS)
- TLS 1.3: 2.5 RTTs (TCP + TLS)
- TLS 1.3 with 0-RTT: 1.5 RTTs (TCP + TLS resumption)
| Protocol | Components | RTTs Required | Typical Latency (Same DC) |
|---|---|---|---|
| TCP only | 3-way handshake | 1.5 RTT | 1-3ms |
| TCP + TLS 1.2 | TCP + full TLS handshake | 3.5 RTT | 3-7ms |
| TCP + TLS 1.3 | TCP + optimized TLS | 2.5 RTT | 2-5ms |
| PostgreSQL | TCP + TLS + auth + startup | 4-6 RTT | 10-30ms |
| MySQL | TCP + TLS + auth packets | 4-5 RTT | 8-20ms |
| Redis (AUTH) | TCP + AUTH command | 2 RTT | 2-4ms |
| HTTP/1.1 + TLS | TCP + TLS | 2.5-3.5 RTT | 3-7ms |
| gRPC (HTTP/2) | TCP + TLS + HTTP/2 preface | 3-4 RTT | 4-10ms |
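As a rough sanity check on the table above, setup latency is just RTT multiplied by the round trips a protocol requires. A throwaway helper (illustrative arithmetic only; real handshakes also burn CPU on TLS):

```typescript
// Estimate connection setup cost from RTT and protocol round trips.
// Illustrative arithmetic only - ignores TLS CPU cost and queuing.
function setupLatencyMs(rttMs: number, roundTrips: number): number {
  return rttMs * roundTrips;
}

// Total handshake time incurred per second if every request reconnects
function overheadPerSecondMs(reqPerSec: number, rttMs: number, roundTrips: number): number {
  return reqPerSec * setupLatencyMs(rttMs, roundTrips);
}

console.log(setupLatencyMs(2, 3.5));            // TCP + TLS 1.2 at 2ms RTT → 7
console.log(setupLatencyMs(50, 1.5));           // cross-DC TCP at 50ms RTT → 75
console.log(overheadPerSecondMs(1000, 2, 3.5)); // → 7000 (7s of handshakes per second)
```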
A microservices request that fans out to 5 services, each establishing new connections, pays 5× the connection overhead. With cross-datacenter calls, this can add hundreds of milliseconds to every request—unacceptable for user-facing latency.
Connection pooling is the most important connection reuse technique. Instead of creating a new connection for each operation, applications maintain a pool of pre-established connections that are borrowed, used, and returned.
Without Pooling:                    With Pooling:

Request 1:                          Request 1:
  connect (10ms)                      borrow from pool (0.01ms)
  query (1ms)                         query (1ms)
  disconnect                          return to pool (0.01ms)

Request 2:                          Request 2:
  connect (10ms)                      borrow from pool (0.01ms)
  query (1ms)                         query (1ms)
  disconnect                          return to pool (0.01ms)

Total: 22ms                         Total: 2.04ms (~10x faster!)
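The borrow/return cycle above can be sketched in a few dozen lines. This is a minimal illustration with a fake connection factory so it is self-contained, not a real pool (production pools such as pg-pool add timeouts, validation, and error handling):

```typescript
// A minimal connection pool sketch: pre-established resources are borrowed
// and returned instead of recreated per request.
class SimplePool<T> {
  private idle: T[] = [];
  private waiters: Array<(conn: T) => void> = [];
  private total = 0;

  constructor(
    private factory: () => Promise<T>,  // the expensive "connect" step
    private max: number,
  ) {}

  async acquire(): Promise<T> {
    const conn = this.idle.pop();
    if (conn !== undefined) return conn;  // reuse: essentially free
    if (this.total < this.max) {          // grow: pay the connect cost once
      this.total++;
      return this.factory();
    }
    // At capacity: wait until someone releases a connection
    return new Promise<T>((resolve) => this.waiters.push(resolve));
  }

  release(conn: T): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter(conn);   // hand off directly to a waiter
    else this.idle.push(conn);  // otherwise park it for reuse
  }
}

// Usage with a fake "connection" object (no real database needed)
let connects = 0;
const pool = new SimplePool(async () => ({ id: ++connects }), 2);

(async () => {
  const a = await pool.acquire();  // creates connection 1
  const b = await pool.acquire();  // creates connection 2
  pool.release(a);
  const c = await pool.acquire();  // reuses connection 1 - no new connect
  console.log(connects);           // → 2
  console.log(c.id);               // → 1
})();
```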
Pool Architecture:
┌─────────────────────────────────────┐
│ Application Process │
│ │
│ ┌───────────────────────────────┐ │
│ │ Connection Pool │ │
│ │ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │ │
│ │ │Conn│ │Conn│ │Conn│ │Conn│ │ │
│ │ │ 1 │ │ 2 │ │ 3 │ │ 4 │ │ │
│ │ │idle│ │busy│ │idle│ │busy│ │ │
│ │ └──┬─┘ └──┬─┘ └──┬─┘ └──┬─┘ │ │
│ └─────┼─────┼─────┼─────┼─────┘ │
└────────┼─────┼─────┼─────┼────────┘
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────────────────────────┐
│ Database Server │
└─────────────────────────────────────┘
```typescript
// Database connection pool configuration examples

// PostgreSQL with pg-pool
import { Pool } from 'pg';

const pgPool = new Pool({
  host: 'localhost',
  database: 'myapp',
  user: 'appuser',
  password: 'secret',

  // Pool sizing
  min: 5,   // Keep 5 connections warm
  max: 20,  // Never exceed 20 connections

  // Timeouts
  connectionTimeoutMillis: 5000,  // Wait up to 5s for a connection
  idleTimeoutMillis: 30000,       // Close idle connections after 30s

  // Health checking
  allowExitOnIdle: false,  // Keep pool alive
});

// Usage - connection automatically returned to pool
const result = await pgPool.query('SELECT * FROM users WHERE id = $1', [userId]);

// TypeORM configuration
import { DataSource } from 'typeorm';

const dataSource = new DataSource({
  type: 'postgres',
  host: 'localhost',
  database: 'myapp',
  // Pool configuration
  extra: {
    min: 5,
    max: 20,
    idleTimeoutMillis: 30000,
    // Connection validation
    validateConnection: true,
    validationQuery: 'SELECT 1',
  },
});

// Redis connection pool (ioredis)
import Redis from 'ioredis';

const redisCluster = new Redis.Cluster([
  { host: 'redis-1', port: 6379 },
  { host: 'redis-2', port: 6379 },
], {
  // Per-node connection settings
  redisOptions: {
    connectTimeout: 5000,
    maxRetriesPerRequest: 3,
  },
  // Pool-like behavior
  scaleReads: 'slave',  // Read from replicas
  natMap: {},           // NAT translation

  // Connection reuse
  enableOfflineQueue: true,  // Queue commands during reconnect
  enableReadyCheck: true,    // Validate before use
});
```

Determining optimal pool size is crucial: too small limits throughput, too large wastes resources and can overwhelm backends.
The Pool Size Formula:
For CPU-bound database work:
$$ \text{Optimal Pool Size} = \text{CPU Cores} \times 2 $$
For I/O-bound work (waiting on disk/network), size by core count, not thread count:
$$ \text{Pool Size} = \text{Cores} \times \left(1 + \frac{\text{Wait Time}}{\text{Compute Time}}\right) $$
PostgreSQL's Recommendation:
The PostgreSQL wiki suggests a surprisingly small pool size:
$$ \text{connections} = (\text{core_count} \times 2) + \text{effective_spindle_count} $$
For SSD-based systems, this often means pool size < 20 even for high-traffic applications.
Why Small Pools Work:
Scenario: 10,000 requests/second, 10ms average query time
Calculation:
- Concurrent queries needed = 10,000 req/s × 0.01s = 100
- But database can execute ~20 queries truly in parallel (CPU-bound)
- Remaining 80 queries are just waiting in database queue
Conclusion:
- Pool of 20 connections: Database works at capacity
- Pool of 100 connections: 80 connections idle, competing for locks
- Pool of 500 connections: Memory wasted, context switching overhead
The database doesn't go faster with more connections!
Counter-intuitively, smaller pools often yield HIGHER throughput. Each database connection consumes memory (~5-10MB in PostgreSQL), holds locks, and competes for CPU. Beyond a threshold, adding connections causes contention that reduces overall throughput. Pool size should reflect the database's parallel execution capacity, not application concurrency.
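The sizing formulas above can be written as small helpers (the I/O-bound variant uses core count as the multiplier, per the classic cores × (1 + wait/compute) heuristic; inputs below are illustrative):

```typescript
// Pool size for I/O-bound work: enough connections to keep every
// core busy while other queries wait on disk/network.
function poolSizeIoBound(cores: number, waitMs: number, computeMs: number): number {
  return Math.ceil(cores * (1 + waitMs / computeMs));
}

// PostgreSQL wiki heuristic: (core_count × 2) + effective_spindle_count
function pgWikiPoolSize(coreCount: number, effectiveSpindleCount: number): number {
  return coreCount * 2 + effectiveSpindleCount;
}

// 8 cores, queries spend 9ms waiting on I/O per 1ms of compute:
console.log(poolSizeIoBound(8, 9, 1));  // → 80
// 8-core database on SSD (effective spindle count ≈ 1):
console.log(pgWikiPoolSize(8, 1));      // → 17
```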
Connection Pool Anti-Pattern: Per-Request Connections
```typescript
// WRONG: New pool per request (defeats the purpose!)
async function handleRequest(req) {
  const pool = new Pool({ max: 5 });  // Creates new pool!
  const result = await pool.query('SELECT...');
  await pool.end();  // Destroys connections!
}

// RIGHT: Shared pool
const pool = new Pool({ max: 20 });  // Created once at startup

async function handleRequest(req) {
  const result = await pool.query('SELECT...');
  // Connection automatically returned to pool
}
```
| Application Type | Pool Size Formula | Typical Range |
|---|---|---|
| Web API (Node.js, 1 process) | 2-3 × CPU cores | 10-30 |
| Web API (multi-process) | Total budget (cores × 2) split evenly across processes | 5-10 per process |
| Background workers | Workers × 2 | 2-4 per worker |
| Connection proxy (PgBouncer) | Based on backend database capacity | 50-400 |
| Microservice (high fanout) | Keep small, use proxy | 5-10 |
When many application instances need database access, individual pools can overwhelm the database. Connection proxies solve this by multiplexing many client connections over fewer backend connections.
The Problem: Connection Explosion
100 App Instances
(20 connections each)
│
▼
┌───────────────┐
│ Database │
│ (max 500 conn)│─── OVERWHELMED!
│ │ 2000 connections requested
└───────────────┘ but only 500 supported
The Solution: Connection Proxy
100 App Instances
(20 connections each)
│
2000 connections
│
▼
┌───────────────┐
│ PgBouncer │
│ (Proxy) │
└───────┬───────┘
│
100 connections
│
▼
┌───────────────┐
│ Database │
│ (happy!) │
└───────────────┘
```ini
; PgBouncer configuration for high-throughput applications

[databases]
; Connection string to actual PostgreSQL
myapp = host=pg-primary.internal port=5432 dbname=myapp

[pgbouncer]
; Listening configuration
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

; Pool mode - transaction gives best throughput
;   session:     1 client = 1 backend (no multiplexing)
;   transaction: multiplex after each transaction
;   statement:   multiplex after each statement (dangerous!)
pool_mode = transaction

; Pool sizing
default_pool_size = 20    ; Connections per database/user pair
min_pool_size = 5         ; Keep this many warm
reserve_pool_size = 5     ; Extra for burst traffic
reserve_pool_timeout = 3  ; Wait before using reserve

; Connection limits
max_client_conn = 10000   ; Accept up to 10K clients
max_db_connections = 100  ; But only 100 to actual database!

; Timeouts
server_connect_timeout = 3
server_idle_timeout = 600  ; Close idle backend after 10min
server_lifetime = 3600     ; Force reconnect after 1 hour
client_idle_timeout = 0    ; Never timeout idle clients

; Query timeout
query_timeout = 30         ; Kill queries over 30s
query_wait_timeout = 120   ; Wait up to 2min for connection

; Stats
stats_period = 60
log_connections = 0        ; Don't log every connect
log_disconnections = 0     ; Don't log every disconnect
```

HTTP/1.0 originally created a new TCP connection for every request—catastrophically inefficient. HTTP Keep-Alive (persistent connections) allows connection reuse across multiple HTTP exchanges.
HTTP/1.0 (Without Keep-Alive):
Request 1: TCP connect → TLS → Send GET → Receive → TCP close
Request 2: TCP connect → TLS → Send GET → Receive → TCP close
Request 3: TCP connect → TLS → Send GET → Receive → TCP close
Cost per request: ~5-10ms overhead
HTTP/1.1 (Keep-Alive Default):
TCP connect → TLS → Send GET 1 → Receive
→ Send GET 2 → Receive
→ Send GET 3 → Receive
... (reuse indefinitely)
→ TCP close (after timeout)
Cost per request: ~0.1ms overhead (after connection established)
```typescript
import http from 'http';
import https from 'https';

// Node.js HTTP Agent with Keep-Alive
// Reuses connections for requests to the same host

const httpAgent = new http.Agent({
  keepAlive: true,       // Enable keep-alive (default: false!)
  keepAliveMsecs: 1000,  // TCP keep-alive probe interval
  maxSockets: 50,        // Max connections per host
  maxFreeSockets: 10,    // Max idle connections to keep
  timeout: 60000,        // Socket timeout (60s)
  scheduling: 'fifo',    // Use first-in-first-out for fairness
});

const httpsAgent = new https.Agent({
  keepAlive: true,
  maxSockets: 50,
  maxFreeSockets: 10,
  // TLS session caching (reuse TLS session tickets)
  maxCachedSessions: 100,  // Cache up to 100 TLS sessions
});

// Using with node-fetch (note: Node's built-in fetch from v18+ is undici,
// which pools and reuses connections itself and ignores `agent`)
const response = await fetch('https://api.example.com/data', {
  agent: httpsAgent,  // Reuses connections
});

// Using with axios
import axios from 'axios';

const client = axios.create({
  httpAgent,
  httpsAgent,
  timeout: 10000,
});

// All requests through this client reuse connections:
await client.get('https://api.example.com/users');
await client.get('https://api.example.com/orders');
await client.get('https://api.example.com/products');
// ^^^ These likely reuse the same TCP connection!

// Express server: Keep-Alive configuration
import express from 'express';

const app = express();
const server = app.listen(3000);

// Keep-Alive timeout (how long to keep idle connections)
server.keepAliveTimeout = 65000;  // 65 seconds

// Headers timeout (must be > keepAliveTimeout for ALB compatibility)
server.headersTimeout = 66000;

// Useful when behind AWS ALB (which has a 60s idle timeout)
// Client ← 65s → Your Server
// ALB has a 60s timeout, so 65s ensures the server doesn't close first
```

When running behind a load balancer (ALB, ELB, NGINX), ensure your server's keep-alive timeout EXCEEDS the load balancer's. Otherwise, the server might close a connection while the LB still thinks it's valid, causing 502 errors. AWS ALB has a 60-second idle timeout, so set your server to 65+ seconds.
HTTP/1.1 Keep-Alive has a limitation: head-of-line blocking. Only one request can be in-flight per connection at a time. To send multiple requests simultaneously, HTTP/1.1 clients open multiple connections (typically 6-8 per domain).
HTTP/2 solves this with multiplexing—multiple concurrent requests share a single TCP connection.
HTTP/1.1 with Keep-Alive:
Connection 1: [Req A]----[Resp A]
[Req B]----[Resp B]
[Req C]----[Resp C]
(sequential)
Connection 2: [Req D]----[Resp D]
[Req E]----[Resp E]
(parallel connections)
Connection 3: [Req F]----[Resp F]
...
6-8 connections needed for
concurrency
HTTP/2 Multiplexing:
Single Connection:
[Req A]─┐ [Resp A chunk]───┐
[Req B]─┤ [Resp B chunk]───┤
[Req C]─┤ [Resp A chunk]───┤
[Req D]─┤ [Resp C chunk]───┤
│ [Resp B chunk]───┤
│ [Resp D chunk]───┤
│ ... │
└───────────────────┘
All requests interleaved on
ONE connection!
| Aspect | HTTP/1.1 | HTTP/2 |
|---|---|---|
| Concurrent requests per connection | 1 | Many (limit negotiated via SETTINGS_MAX_CONCURRENT_STREAMS, commonly ~100) |
| Connections needed for 100 parallel requests | 100 | 1 |
| Head-of-line blocking | Yes (application layer) | No (stream-level) |
| Header compression | No | HPACK (significant savings) |
| Server push | No | Yes (proactive resource sending) |
| Connection establishment | Per-connection | Once, then multiplex |
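To make multiplexing concrete, here is a self-contained sketch using Node's built-in http2 module: a cleartext (h2c) server plus a client session that carries three concurrent requests over a single TCP connection. The paths are made up for the demo.

```typescript
import http2 from 'http2';
import type { AddressInfo } from 'net';

// Start an h2c server, then issue three concurrent requests over ONE session.
async function demo(): Promise<string[]> {
  const server = http2.createServer();
  server.on('stream', (stream, headers) => {
    stream.respond({ ':status': 200, 'content-type': 'text/plain' });
    stream.end(`response for ${headers[':path']}`);
  });
  await new Promise<void>((resolve) => server.listen(0, resolve));
  const { port } = server.address() as AddressInfo;

  // One TCP connection (session) carries every request below
  const session = http2.connect(`http://localhost:${port}`);
  const get = (path: string) =>
    new Promise<string>((resolve, reject) => {
      const req = session.request({ ':path': path });  // a new stream, not a new connection
      let body = '';
      req.setEncoding('utf8');
      req.on('data', (chunk) => (body += chunk));
      req.on('end', () => resolve(body));
      req.on('error', reject);
    });

  // Issued concurrently; interleaved as streams on the same connection
  const bodies = await Promise.all([get('/users'), get('/orders'), get('/products')]);
  session.close();
  server.close();
  return bodies;
}

demo().then((bodies) => console.log(bodies.join(' | ')));
// → response for /users | response for /orders | response for /products
```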
gRPC and HTTP/2:
gRPC is built on HTTP/2, inheriting multiplexing benefits:
Single gRPC Connection:
┌─────────────────────────────────────────────┐
│ HTTP/2 Connection │
│ │
│ Stream 1: GetUser(id=1) → User{...} │
│ Stream 3: GetUser(id=2) → User{...} │
│ Stream 5: ListOrders() → Order{...} │
│ → Order{...} │
│ → Order{...} │
│ Stream 7: CreateOrder(...) → OrderID │
│ │
│ All streams share ONE TCP connection! │
└─────────────────────────────────────────────┘
Benefits for microservices:
Because gRPC connections are long-lived and multiplexed, traditional L4 load balancers (which balance per-connection) are ineffective. Use L7 load balancing (Envoy, NGINX with gRPC support) or client-side load balancing (gRPC's built-in mechanisms) to distribute requests across backend instances.
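For illustration, a sketch of the client side with @grpc/grpc-js (the address, service name, and generated client are hypothetical; the channel options are standard gRPC channel arguments): keepalive pings keep the long-lived connection healthy, and a round_robin service config enables client-side load balancing across all addresses behind the DNS name.

```typescript
import { credentials } from '@grpc/grpc-js';
// Hypothetical generated client - your proto toolchain provides the real one
import { UserServiceClient } from './gen/user_service';

const client = new UserServiceClient(
  'dns:///user-service.internal:50051',  // resolve ALL backend addresses
  credentials.createInsecure(),
  {
    // Keepalive pings detect dead peers on the long-lived connection
    'grpc.keepalive_time_ms': 30000,           // ping every 30s when idle
    'grpc.keepalive_timeout_ms': 5000,         // fail if no ack within 5s
    'grpc.keepalive_permit_without_calls': 1,  // ping even with no active RPCs
    // Client-side load balancing across the resolved addresses
    'grpc.service_config': JSON.stringify({
      loadBalancingConfig: [{ round_robin: {} }],
    }),
  },
);
```

This is configuration, not a runnable program on its own; the key point is that request distribution happens per-call via the load-balancing policy, not per-connection.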
Long-lived connections require careful lifecycle management to handle failures, rebalancing, and resource cleanup.
```typescript
// Robust connection management with health checks and reconnection.
// `Connection` and `createConnection` stand in for your driver's API.

class ManagedConnection {
  private connection: Connection | null = null;
  private lastUsed: number = Date.now();
  private reconnectAttempts: number = 0;
  private readonly maxReconnectDelay = 30000;    // 30s max
  private readonly baseReconnectDelay = 100;     // Start at 100ms
  private readonly maxLifetime = 3600000;        // 1 hour max lifetime
  private readonly idleTimeout = 300000;         // 5 min idle timeout
  private readonly healthCheckInterval = 30000;  // Check every 30s (would drive a background timer)
  private createdAt: number = 0;

  async getConnection(): Promise<Connection> {
    // Check if connection needs refresh
    if (this.connection) {
      const age = Date.now() - this.createdAt;
      const idle = Date.now() - this.lastUsed;

      if (age > this.maxLifetime) {
        // Force reconnect if too old (prevents stale connections)
        await this.close('max lifetime exceeded');
      } else if (idle > this.idleTimeout) {
        // Close if idle too long
        await this.close('idle timeout');
      } else if (!(await this.isHealthy())) {
        // Validate connection health
        await this.close('health check failed');
      }
    }

    // Establish new connection if needed
    if (!this.connection) {
      await this.connect();
    }

    this.lastUsed = Date.now();
    return this.connection!;
  }

  private async connect(): Promise<void> {
    while (true) {
      try {
        this.connection = await createConnection({
          host: 'db.example.com',
          port: 5432,
          // Enable TCP keep-alive for early dead peer detection
          keepAlive: true,
          keepAliveInitialDelayMillis: 10000,
        });
        this.createdAt = Date.now();
        this.reconnectAttempts = 0;
        console.log('Connection established');
        return;
      } catch (error) {
        this.reconnectAttempts++;
        // Exponential backoff with jitter
        const delay = Math.min(
          this.baseReconnectDelay * Math.pow(2, this.reconnectAttempts),
          this.maxReconnectDelay
        );
        const jitter = delay * 0.2 * Math.random();
        console.error(`Connection failed, retry in ${delay + jitter}ms`);
        await sleep(delay + jitter);
      }
    }
  }

  private async isHealthy(): Promise<boolean> {
    try {
      // Simple query to verify the connection works
      await this.connection!.query('SELECT 1');
      return true;
    } catch {
      return false;
    }
  }

  // Graceful shutdown with connection draining
  async gracefulClose(drainTimeout: number = 30000): Promise<void> {
    console.log('Starting graceful connection shutdown');
    // Keep a reference so we can still end() it after nulling the field
    const conn = this.connection;
    // Stop handing out the connection to new requests
    this.connection = null;
    // Wait for in-flight requests (simplified)
    await sleep(drainTimeout);
    // Force close
    if (conn) {
      console.log('Closing connection: graceful shutdown');
      await conn.end();
    }
  }

  private async close(reason: string): Promise<void> {
    if (this.connection) {
      console.log(`Closing connection: ${reason}`);
      await this.connection.end();
      this.connection = null;
    }
  }
}

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}
```

We've explored connection reuse as a critical throughput optimization. The key insights:

- Connection establishment is expensive: 1.5 RTTs for TCP alone, and 4-6 RTTs for an authenticated database connection.
- Connection pooling amortizes that cost, and pools sized to the database's parallel capacity (often under 20 connections) beat large ones.
- Connection proxies like PgBouncer multiplex thousands of client connections onto a small number of backend connections.
- HTTP keep-alive and HTTP/2 multiplexing provide the same reuse for service-to-service traffic.
- Long-lived connections need lifecycle management: maximum lifetime, idle timeouts, health checks, and graceful draining.
What's next:
Connection reuse eliminates establishment overhead, but request handling still requires synchronous processing. The next page explores queue-based processing—decoupling request acceptance from processing to handle load spikes, enable throttling, and achieve higher overall throughput.
You now understand connection reuse as a throughput optimization—from the physics of connection establishment, through pooling strategies and sizing, to HTTP/2 multiplexing and lifecycle management. Next, we'll examine queue-based processing patterns.