Every time your application connects to a database, cache, or external service, it pays a connection establishment tax. This tax includes TCP handshakes, TLS negotiations, authentication exchanges, and protocol initialization. For a single connection, this might seem trivial—perhaps 10-50 milliseconds. But multiply that by thousands of requests per second, and connection overhead becomes a throughput killer.
Consider a web application making a single database query. Total time: 12.1ms, but only 1ms of that was the query itself. The rest was connection overhead.
Now imagine this application handles 1,000 requests per second. That's 1,000 × 10ms = 10 cumulative seconds of connection setup for every second of wall-clock time. Concurrency can hide some of that latency, but the extra packets and the CPU cost of TLS handshakes alone would cripple throughput long before the database did any useful work.
This page explores connection reuse strategies that eliminate establishment overhead: connection pooling, HTTP keep-alive, connection multiplexing (HTTP/2, gRPC), and persistent connections. You'll understand how to configure these mechanisms and the trade-offs involved in each approach.
To understand why connection reuse matters, let's dissect what happens when establishing a new connection.
TCP Three-Way Handshake (Every TCP Connection):
Client Server
│ │
│──────── SYN (seq=x) ──────────────▶│ RTT/2
│ │
│◀─────── SYN-ACK (seq=y, ack=x+1) ──│ RTT/2
│ │
│──────── ACK (ack=y+1) ────────────▶│ RTT/2
│ │
│ Connection Established │
│ │
Minimum time: 1.5 × RTT (Round-Trip Time)
- Same datacenter: 0.5-2ms RTT → ~1-3ms handshake
- Cross-datacenter: 10-50ms RTT → ~15-75ms handshake
- Cross-continent: 100-200ms RTT → ~150-300ms handshake
TLS Handshake (HTTPS, Secure Connections):
TLS 1.2 (Full Handshake):
1. ClientHello → (cipher suites, random)
2. ServerHello ← (chosen cipher, random)
3. Certificate ← (server's certificate)
4. ServerKeyExchange ← (DH parameters)
5. ClientKeyExchange → (encrypted pre-master)
6. ChangeCipherSpec → / ←
7. Finished → / ←
Total: 2 additional RTTs (on top of TCP handshake)
TLS 1.3 (Optimized):
- 1 RTT for full handshake
- 0 RTT for resumption (with PSK)
Cost:
- TLS 1.2: 3.5 RTTs total (TCP + TLS)
- TLS 1.3: 2.5 RTTs (TCP + TLS)
- TLS 1.3 with 0-RTT: 1.5 RTTs (TCP + TLS resumption)
| Protocol | Components | RTTs Required | Typical Latency (Same DC) |
|---|---|---|---|
| TCP only | 3-way handshake | 1.5 RTT | 1-3ms |
| TCP + TLS 1.2 | TCP + full TLS handshake | 3.5 RTT | 3-7ms |
| TCP + TLS 1.3 | TCP + optimized TLS | 2.5 RTT | 2-5ms |
| PostgreSQL | TCP + TLS + auth + startup | 4-6 RTT | 10-30ms |
| MySQL | TCP + TLS + auth packets | 4-5 RTT | 8-20ms |
| Redis (AUTH) | TCP + AUTH command | 2 RTT | 2-4ms |
| HTTP/1.1 + TLS | TCP + TLS | 2.5-3.5 RTT | 3-7ms |
| gRPC (HTTP/2) | TCP + TLS + HTTP/2 preface | 3-4 RTT | 4-10ms |
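As a rough sanity check on the table above, setup latency is just RTT multiplied by the round trips a protocol requires. A throwaway helper (illustrative arithmetic only; real handshakes also burn CPU on TLS):

```typescript
// Estimate connection setup cost from RTT and protocol round trips.
// Illustrative arithmetic only - ignores TLS CPU cost and queuing.
function setupLatencyMs(rttMs: number, roundTrips: number): number {
  return rttMs * roundTrips;
}

// Total handshake time incurred per second if every request reconnects
function overheadPerSecondMs(reqPerSec: number, rttMs: number, roundTrips: number): number {
  return reqPerSec * setupLatencyMs(rttMs, roundTrips);
}

console.log(setupLatencyMs(2, 3.5));            // TCP + TLS 1.2 at 2ms RTT → 7
console.log(setupLatencyMs(50, 1.5));           // cross-DC TCP at 50ms RTT → 75
console.log(overheadPerSecondMs(1000, 2, 3.5)); // → 7000 (7s of handshakes per second)
```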
A microservices request that fans out to 5 services, each establishing new connections, pays 5× the connection overhead. With cross-datacenter calls, this can add hundreds of milliseconds to every request—unacceptable for user-facing latency.
Connection pooling is the most important connection reuse technique. Instead of creating a new connection for each operation, applications maintain a pool of pre-established connections that are borrowed, used, and returned.
Without Pooling:                    With Pooling:

Request 1:                          Request 1:
  connect (10ms)                      borrow from pool (0.01ms)
  query (1ms)                         query (1ms)
  disconnect                          return to pool (0.01ms)

Request 2:                          Request 2:
  connect (10ms)                      borrow from pool (0.01ms)
  query (1ms)                         query (1ms)
  disconnect                          return to pool (0.01ms)

Total: 22ms                         Total: 2.04ms (~10x faster!)
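The borrow/return cycle above can be sketched in a few dozen lines. This is a minimal illustration with a fake connection factory so it is self-contained, not a real pool (production pools such as pg-pool add timeouts, validation, and error handling):

```typescript
// A minimal connection pool sketch: pre-established resources are borrowed
// and returned instead of recreated per request.
class SimplePool<T> {
  private idle: T[] = [];
  private waiters: Array<(conn: T) => void> = [];
  private total = 0;

  constructor(
    private factory: () => Promise<T>,  // the expensive "connect" step
    private max: number,
  ) {}

  async acquire(): Promise<T> {
    const conn = this.idle.pop();
    if (conn !== undefined) return conn;  // reuse: essentially free
    if (this.total < this.max) {          // grow: pay the connect cost once
      this.total++;
      return this.factory();
    }
    // At capacity: wait until someone releases a connection
    return new Promise<T>((resolve) => this.waiters.push(resolve));
  }

  release(conn: T): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter(conn);   // hand off directly to a waiter
    else this.idle.push(conn);  // otherwise park it for reuse
  }
}

// Usage with a fake "connection" object (no real database needed)
let connects = 0;
const pool = new SimplePool(async () => ({ id: ++connects }), 2);

(async () => {
  const a = await pool.acquire();  // creates connection 1
  const b = await pool.acquire();  // creates connection 2
  pool.release(a);
  const c = await pool.acquire();  // reuses connection 1 - no new connect
  console.log(connects);           // → 2
  console.log(c.id);               // → 1
})();
```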
Pool Architecture:
┌─────────────────────────────────────┐
│ Application Process │
│ │
│ ┌───────────────────────────────┐ │
│ │ Connection Pool │ │
│ │ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │ │
│ │ │Conn│ │Conn│ │Conn│ │Conn│ │ │
│ │ │ 1 │ │ 2 │ │ 3 │ │ 4 │ │ │
│ │ │idle│ │busy│ │idle│ │busy│ │ │
│ │ └──┬─┘ └──┬─┘ └──┬─┘ └──┬─┘ │ │
│ └─────┼─────┼─────┼─────┼─────┘ │
└────────┼─────┼─────┼─────┼────────┘
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────────────────────────┐
│ Database Server │
└─────────────────────────────────────┘
```typescript
// Database connection pool configuration examples

// PostgreSQL with pg-pool
import { Pool } from 'pg';

const pgPool = new Pool({
  host: 'localhost',
  database: 'myapp',
  user: 'appuser',
  password: 'secret',

  // Pool sizing
  min: 5,   // Keep 5 connections warm
  max: 20,  // Never exceed 20 connections

  // Timeouts
  connectionTimeoutMillis: 5000,  // Wait up to 5s for a connection
  idleTimeoutMillis: 30000,       // Close idle connections after 30s

  // Health checking
  allowExitOnIdle: false,  // Keep pool alive
});

// Usage - connection automatically returned to pool
const result = await pgPool.query('SELECT * FROM users WHERE id = $1', [userId]);

// TypeORM configuration
import { DataSource } from 'typeorm';

const dataSource = new DataSource({
  type: 'postgres',
  host: 'localhost',
  database: 'myapp',
  // Pool configuration
  extra: {
    min: 5,
    max: 20,
    idleTimeoutMillis: 30000,
    // Connection validation
    validateConnection: true,
    validationQuery: 'SELECT 1',
  },
});

// Redis connection pool (ioredis)
import Redis from 'ioredis';

const redisCluster = new Redis.Cluster([
  { host: 'redis-1', port: 6379 },
  { host: 'redis-2', port: 6379 },
], {
  // Per-node connection settings
  redisOptions: {
    connectTimeout: 5000,
    maxRetriesPerRequest: 3,
  },
  // Pool-like behavior
  scaleReads: 'slave',  // Read from replicas
  natMap: {},           // NAT translation

  // Connection reuse
  enableOfflineQueue: true,  // Queue commands during reconnect
  enableReadyCheck: true,    // Validate before use
});
```

Determining optimal pool size is crucial: too small limits throughput, too large wastes resources and can overwhelm backends.
The Pool Size Formula:
For CPU-bound database work:
$$ \text{Optimal Pool Size} = \text{CPU Cores} \times 2 $$
For I/O-bound work (waiting on disk/network), size by core count, not thread count:
$$ \text{Pool Size} = \text{Cores} \times \left(1 + \frac{\text{Wait Time}}{\text{Compute Time}}\right) $$
PostgreSQL's Recommendation:
The PostgreSQL wiki suggests a surprisingly small pool size:
$$ \text{connections} = (\text{core_count} \times 2) + \text{effective_spindle_count} $$
For SSD-based systems, this often means pool size < 20 even for high-traffic applications.
Why Small Pools Work:
Scenario: 10,000 requests/second, 10ms average query time
Calculation:
- Concurrent queries needed = 10,000 req/s × 0.01s = 100
- But database can execute ~20 queries truly in parallel (CPU-bound)
- Remaining 80 queries are just waiting in database queue
Conclusion:
- Pool of 20 connections: Database works at capacity
- Pool of 100 connections: 80 connections idle, competing for locks
- Pool of 500 connections: Memory wasted, context switching overhead
The database doesn't go faster with more connections!
Counter-intuitively, smaller pools often yield HIGHER throughput. Each database connection consumes memory (~5-10MB in PostgreSQL), holds locks, and competes for CPU. Beyond a threshold, adding connections causes contention that reduces overall throughput. Pool size should reflect the database's parallel execution capacity, not application concurrency.
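The sizing formulas above can be written as small helpers (the I/O-bound variant uses core count as the multiplier, per the classic cores × (1 + wait/compute) heuristic; inputs below are illustrative):

```typescript
// Pool size for I/O-bound work: enough connections to keep every
// core busy while other queries wait on disk/network.
function poolSizeIoBound(cores: number, waitMs: number, computeMs: number): number {
  return Math.ceil(cores * (1 + waitMs / computeMs));
}

// PostgreSQL wiki heuristic: (core_count × 2) + effective_spindle_count
function pgWikiPoolSize(coreCount: number, effectiveSpindleCount: number): number {
  return coreCount * 2 + effectiveSpindleCount;
}

// 8 cores, queries spend 9ms waiting on I/O per 1ms of compute:
console.log(poolSizeIoBound(8, 9, 1));  // → 80
// 8-core database on SSD (effective spindle count ≈ 1):
console.log(pgWikiPoolSize(8, 1));      // → 17
```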
Connection Pool Anti-Pattern: Per-Request Connections
```typescript
// WRONG: New pool per request (defeats the purpose!)
async function handleRequest(req) {
  const pool = new Pool({ max: 5 });  // Creates new pool!
  const result = await pool.query('SELECT...');
  await pool.end();  // Destroys connections!
}

// RIGHT: Shared pool
const pool = new Pool({ max: 20 });  // Created once at startup

async function handleRequest(req) {
  const result = await pool.query('SELECT...');
  // Connection automatically returned to pool
}
```
| Application Type | Pool Size Formula | Typical Range |
|---|---|---|
| Web API (Node.js, 1 process) | 2-3 × CPU cores | 10-30 |
| Web API (multi-process) | Total budget (cores × 2) split evenly across processes | 5-10 per process |
| Background workers | Workers × 2 | 2-4 per worker |
| Connection proxy (PgBouncer) | Based on backend database capacity | 50-400 |
| Microservice (high fanout) | Keep small, use proxy | 5-10 |
When many application instances need database access, individual pools can overwhelm the database. Connection proxies solve this by multiplexing many client connections over fewer backend connections.
The Problem: Connection Explosion
100 App Instances
(20 connections each)
│
▼
┌───────────────┐
│ Database │
│ (max 500 conn)│─── OVERWHELMED!
│ │ 2000 connections requested
└───────────────┘ but only 500 supported
The Solution: Connection Proxy
100 App Instances
(20 connections each)
│
2000 connections
│
▼
┌───────────────┐
│ PgBouncer │
│ (Proxy) │
└───────┬───────┘
│
100 connections
│
▼
┌───────────────┐
│ Database │
│ (happy!) │
└───────────────┘
```ini
; PgBouncer configuration for high-throughput applications

[databases]
; Connection string to actual PostgreSQL
myapp = host=pg-primary.internal port=5432 dbname=myapp

[pgbouncer]
; Listening configuration
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

; Pool mode - transaction gives best throughput
;   session:     1 client = 1 backend (no multiplexing)
;   transaction: multiplex after each transaction
;   statement:   multiplex after each statement (dangerous!)
pool_mode = transaction

; Pool sizing
default_pool_size = 20    ; Connections per database/user pair
min_pool_size = 5         ; Keep this many warm
reserve_pool_size = 5     ; Extra for burst traffic
reserve_pool_timeout = 3  ; Wait before using reserve

; Connection limits
max_client_conn = 10000   ; Accept up to 10K clients
max_db_connections = 100  ; But only 100 to actual database!

; Timeouts
server_connect_timeout = 3
server_idle_timeout = 600  ; Close idle backend after 10min
server_lifetime = 3600     ; Force reconnect after 1 hour
client_idle_timeout = 0    ; Never timeout idle clients

; Query timeout
query_timeout = 30         ; Kill queries over 30s
query_wait_timeout = 120   ; Wait up to 2min for connection

; Stats
stats_period = 60
log_connections = 0        ; Don't log every connect
log_disconnections = 0     ; Don't log every disconnect
```

HTTP/1.0 originally created a new TCP connection for every request—catastrophically inefficient. HTTP Keep-Alive (persistent connections) allows connection reuse across multiple HTTP exchanges.
HTTP/1.0 (Without Keep-Alive):
Request 1: TCP connect → TLS → Send GET → Receive → TCP close
Request 2: TCP connect → TLS → Send GET → Receive → TCP close
Request 3: TCP connect → TLS → Send GET → Receive → TCP close
Cost per request: ~5-10ms overhead
HTTP/1.1 (Keep-Alive Default):
TCP connect → TLS → Send GET 1 → Receive
→ Send GET 2 → Receive
→ Send GET 3 → Receive
... (reuse indefinitely)
→ TCP close (after timeout)
Cost per request: ~0.1ms overhead (after connection established)
```typescript
import http from 'http';
import https from 'https';

// Node.js HTTP Agent with Keep-Alive
// Reuses connections for requests to the same host

const httpAgent = new http.Agent({
  keepAlive: true,       // Enable keep-alive (default: false!)
  keepAliveMsecs: 1000,  // TCP keep-alive probe interval
  maxSockets: 50,        // Max connections per host
  maxFreeSockets: 10,    // Max idle connections to keep
  timeout: 60000,        // Socket timeout (60s)
  scheduling: 'fifo',    // Use first-in-first-out for fairness
});

const httpsAgent = new https.Agent({
  keepAlive: true,
  maxSockets: 50,
  maxFreeSockets: 10,
  // TLS session caching (reuse TLS session tickets)
  maxCachedSessions: 100,  // Cache up to 100 TLS sessions
});

// Using with node-fetch (note: Node's built-in fetch from v18+ is undici,
// which pools and reuses connections itself and ignores `agent`)
const response = await fetch('https://api.example.com/data', {
  agent: httpsAgent,  // Reuses connections
});

// Using with axios
import axios from 'axios';

const client = axios.create({
  httpAgent,
  httpsAgent,
  timeout: 10000,
});

// All requests through this client reuse connections:
await client.get('https://api.example.com/users');
await client.get('https://api.example.com/orders');
await client.get('https://api.example.com/products');
// ^^^ These likely reuse the same TCP connection!

// Express server: Keep-Alive configuration
import express from 'express';

const app = express();
const server = app.listen(3000);

// Keep-Alive timeout (how long to keep idle connections)
server.keepAliveTimeout = 65000;  // 65 seconds

// Headers timeout (must be > keepAliveTimeout for ALB compatibility)
server.headersTimeout = 66000;

// Useful when behind AWS ALB (which has a 60s idle timeout)
// Client ← 65s → Your Server
// ALB has a 60s timeout, so 65s ensures the server doesn't close first
```

When running behind a load balancer (ALB, ELB, NGINX), ensure your server's keep-alive timeout EXCEEDS the load balancer's. Otherwise, the server might close a connection while the LB still thinks it's valid, causing 502 errors. AWS ALB has a 60-second idle timeout, so set your server to 65+ seconds.
HTTP/1.1 Keep-Alive has a limitation: head-of-line blocking. Only one request can be in-flight per connection at a time. To send multiple requests simultaneously, HTTP/1.1 clients open multiple connections (typically 6-8 per domain).
HTTP/2 solves this with multiplexing—multiple concurrent requests share a single TCP connection.
HTTP/1.1 with Keep-Alive:
Connection 1: [Req A]----[Resp A]
[Req B]----[Resp B]
[Req C]----[Resp C]
(sequential)
Connection 2: [Req D]----[Resp D]
[Req E]----[Resp E]
(parallel connections)
Connection 3: [Req F]----[Resp F]
...
6-8 connections needed for
concurrency
HTTP/2 Multiplexing:
Single Connection:
[Req A]─┐ [Resp A chunk]───┐
[Req B]─┤ [Resp B chunk]───┤
[Req C]─┤ [Resp A chunk]───┤
[Req D]─┤ [Resp C chunk]───┤
│ [Resp B chunk]───┤
│ [Resp D chunk]───┤
│ ... │
└───────────────────┘
All requests interleaved on
ONE connection!
| Aspect | HTTP/1.1 | HTTP/2 |
|---|---|---|
| Concurrent requests per connection | 1 | Many (limit negotiated via SETTINGS_MAX_CONCURRENT_STREAMS, commonly ~100) |
| Connections needed for 100 parallel requests | 100 | 1 |
| Head-of-line blocking | Yes (application layer) | No (stream-level) |
| Header compression | No | HPACK (significant savings) |
| Server push | No | Yes (proactive resource sending) |
| Connection establishment | Per-connection | Once, then multiplex |
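To make multiplexing concrete, here is a self-contained sketch using Node's built-in http2 module: a cleartext (h2c) server plus a client session that carries three concurrent requests over a single TCP connection. The paths are made up for the demo.

```typescript
import http2 from 'http2';
import type { AddressInfo } from 'net';

// Start an h2c server, then issue three concurrent requests over ONE session.
async function demo(): Promise<string[]> {
  const server = http2.createServer();
  server.on('stream', (stream, headers) => {
    stream.respond({ ':status': 200, 'content-type': 'text/plain' });
    stream.end(`response for ${headers[':path']}`);
  });
  await new Promise<void>((resolve) => server.listen(0, resolve));
  const { port } = server.address() as AddressInfo;

  // One TCP connection (session) carries every request below
  const session = http2.connect(`http://localhost:${port}`);
  const get = (path: string) =>
    new Promise<string>((resolve, reject) => {
      const req = session.request({ ':path': path });  // a new stream, not a new connection
      let body = '';
      req.setEncoding('utf8');
      req.on('data', (chunk) => (body += chunk));
      req.on('end', () => resolve(body));
      req.on('error', reject);
    });

  // Issued concurrently; interleaved as streams on the same connection
  const bodies = await Promise.all([get('/users'), get('/orders'), get('/products')]);
  session.close();
  server.close();
  return bodies;
}

demo().then((bodies) => console.log(bodies.join(' | ')));
// → response for /users | response for /orders | response for /products
```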
gRPC and HTTP/2:
gRPC is built on HTTP/2, inheriting multiplexing benefits:
Single gRPC Connection:
┌─────────────────────────────────────────────┐
│ HTTP/2 Connection │
│ │
│ Stream 1: GetUser(id=1) → User{...} │
│ Stream 3: GetUser(id=2) → User{...} │
│ Stream 5: ListOrders() → Order{...} │
│ → Order{...} │
│ → Order{...} │
│ Stream 7: CreateOrder(...) → OrderID │
│ │
│ All streams share ONE TCP connection! │
└─────────────────────────────────────────────┘
Benefits for microservices:
Because gRPC connections are long-lived and multiplexed, traditional L4 load balancers (which balance per-connection) are ineffective. Use L7 load balancing (Envoy, NGINX with gRPC support) or client-side load balancing (gRPC's built-in mechanisms) to distribute requests across backend instances.
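For illustration, a sketch of the client side with @grpc/grpc-js (the address, service name, and generated client are hypothetical; the channel options are standard gRPC channel arguments): keepalive pings keep the long-lived connection healthy, and a round_robin service config enables client-side load balancing across all addresses behind the DNS name.

```typescript
import { credentials } from '@grpc/grpc-js';
// Hypothetical generated client - your proto toolchain provides the real one
import { UserServiceClient } from './gen/user_service';

const client = new UserServiceClient(
  'dns:///user-service.internal:50051',  // resolve ALL backend addresses
  credentials.createInsecure(),
  {
    // Keepalive pings detect dead peers on the long-lived connection
    'grpc.keepalive_time_ms': 30000,           // ping every 30s when idle
    'grpc.keepalive_timeout_ms': 5000,         // fail if no ack within 5s
    'grpc.keepalive_permit_without_calls': 1,  // ping even with no active RPCs
    // Client-side load balancing across the resolved addresses
    'grpc.service_config': JSON.stringify({
      loadBalancingConfig: [{ round_robin: {} }],
    }),
  },
);
```

This is configuration, not a runnable program on its own; the key point is that request distribution happens per-call via the load-balancing policy, not per-connection.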
Long-lived connections require careful lifecycle management to handle failures, rebalancing, and resource cleanup.
```typescript
// Robust connection management with health checks and reconnection.
// `Connection` and `createConnection` stand in for your driver's API.

class ManagedConnection {
  private connection: Connection | null = null;
  private lastUsed: number = Date.now();
  private reconnectAttempts: number = 0;
  private readonly maxReconnectDelay = 30000;    // 30s max
  private readonly baseReconnectDelay = 100;     // Start at 100ms
  private readonly maxLifetime = 3600000;        // 1 hour max lifetime
  private readonly idleTimeout = 300000;         // 5 min idle timeout
  private readonly healthCheckInterval = 30000;  // Check every 30s (would drive a background timer)
  private createdAt: number = 0;

  async getConnection(): Promise<Connection> {
    // Check if connection needs refresh
    if (this.connection) {
      const age = Date.now() - this.createdAt;
      const idle = Date.now() - this.lastUsed;

      if (age > this.maxLifetime) {
        // Force reconnect if too old (prevents stale connections)
        await this.close('max lifetime exceeded');
      } else if (idle > this.idleTimeout) {
        // Close if idle too long
        await this.close('idle timeout');
      } else if (!(await this.isHealthy())) {
        // Validate connection health
        await this.close('health check failed');
      }
    }

    // Establish new connection if needed
    if (!this.connection) {
      await this.connect();
    }

    this.lastUsed = Date.now();
    return this.connection!;
  }

  private async connect(): Promise<void> {
    while (true) {
      try {
        this.connection = await createConnection({
          host: 'db.example.com',
          port: 5432,
          // Enable TCP keep-alive for early dead peer detection
          keepAlive: true,
          keepAliveInitialDelayMillis: 10000,
        });
        this.createdAt = Date.now();
        this.reconnectAttempts = 0;
        console.log('Connection established');
        return;
      } catch (error) {
        this.reconnectAttempts++;
        // Exponential backoff with jitter
        const delay = Math.min(
          this.baseReconnectDelay * Math.pow(2, this.reconnectAttempts),
          this.maxReconnectDelay
        );
        const jitter = delay * 0.2 * Math.random();
        console.error(`Connection failed, retry in ${delay + jitter}ms`);
        await sleep(delay + jitter);
      }
    }
  }

  private async isHealthy(): Promise<boolean> {
    try {
      // Simple query to verify the connection works
      await this.connection!.query('SELECT 1');
      return true;
    } catch {
      return false;
    }
  }

  // Graceful shutdown with connection draining
  async gracefulClose(drainTimeout: number = 30000): Promise<void> {
    console.log('Starting graceful connection shutdown');
    // Keep a reference so we can still end() it after nulling the field
    const conn = this.connection;
    // Stop handing out the connection to new requests
    this.connection = null;
    // Wait for in-flight requests (simplified)
    await sleep(drainTimeout);
    // Force close
    if (conn) {
      console.log('Closing connection: graceful shutdown');
      await conn.end();
    }
  }

  private async close(reason: string): Promise<void> {
    if (this.connection) {
      console.log(`Closing connection: ${reason}`);
      await this.connection.end();
      this.connection = null;
    }
  }
}

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}
```

We've explored connection reuse as a critical throughput optimization. The key insights:

- Connection establishment is expensive: 1.5 RTTs for TCP alone, and 4-6 RTTs for an authenticated database connection.
- Connection pooling amortizes that cost, and pools sized to the database's parallel capacity (often under 20 connections) beat large ones.
- Connection proxies like PgBouncer multiplex thousands of client connections onto a small number of backend connections.
- HTTP keep-alive and HTTP/2 multiplexing provide the same reuse for service-to-service traffic.
- Long-lived connections need lifecycle management: maximum lifetime, idle timeouts, health checks, and graceful draining.
What's next:
Connection reuse eliminates establishment overhead, but request handling still requires synchronous processing. The next page explores queue-based processing—decoupling request acceptance from processing to handle load spikes, enable throttling, and achieve higher overall throughput.
You now understand connection reuse as a throughput optimization—from the physics of connection establishment, through pooling strategies and sizing, to HTTP/2 multiplexing and lifecycle management. Next, we'll examine queue-based processing patterns.