When your code executes await fetch('https://api.example.com/users/123'), it appears instantaneous—a simple line that returns data. But beneath this simplicity lies a complex journey involving dozens of systems, protocols, and decisions.
A request must traverse DNS resolution, TCP connection establishment, TLS handshake, HTTP framing, network routing, server processing, and response transmission—each stage adding latency and potential failure modes.
Understanding the complete request lifecycle is essential for diagnosing latency, reasoning about failure modes at each stage, and choosing optimizations that actually move the needle.
This page dissects every stage of the request lifecycle, providing the knowledge to reason about network behavior like an expert.
By the end of this page, you will understand every stage of the HTTP request lifecycle from DNS lookup to response processing. You'll comprehend the timing of each stage, know what resources are consumed, understand failure modes at each step, and be able to optimize each phase for performance and reliability.
A complete HTTP request lifecycle consists of eight distinct phases. Understanding each phase—and their relative contributions to total latency—is fundamental to system optimization.
The Eight Phases:
┌──────────────────────────────────────────────────────────────────────────────┐
│ Request Lifecycle Timeline │
├──────────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. DNS Resolution [█████ ] 20-120ms (uncached) │
│ [█ ] 0-1ms (cached) │
│ │
│ 2. TCP Handshake [███ ] 15-150ms (1 RTT) │
│ [ ] 0ms (connection reuse) │
│ │
│ 3. TLS Handshake [██████ ] 30-150ms (1-2 RTT) │
│ [█ ] 0ms (session resume) │
│ │
│ 4. Request Sending [█ ] 1-10ms (typical) │
│ │
│ 5. Server Processing [███████████ ] Variable (1-5000ms) │
│ │
│ 6. Response Sending [████ ] 5-500ms (depends on size) │
│ │
│ 7. Response Parsing [█ ] 1-20ms (depends on size/format) │
│ │
│ 8. Cleanup [ ] <1ms │
│ │
└──────────────────────────────────────────────────────────────────────────────┘
Latency Breakdown for a Typical API Call:
For a new HTTPS connection to an API server with a 50ms round-trip time (RTT):
| Phase | First Request | Subsequent Request (same connection) |
|---|---|---|
| DNS Resolution | 50ms | 0ms (cached) |
| TCP Handshake | 50ms | 0ms (reused) |
| TLS Handshake | 100ms | 0ms (reused) |
| Request Send | 5ms | 5ms |
| Server Processing | 50ms | 50ms |
| Response Receive | 10ms | 10ms |
| Total | 265ms | 65ms |
In this example, connection reuse cuts total latency from 265ms to 65ms, roughly a 75% reduction. This is why persistent connections and connection pooling are so critical.
The first request to a new endpoint pays a 'tax' of DNS + TCP + TLS that subsequent requests avoid. For latency-sensitive applications, techniques like connection pre-warming (opening connections before they're needed) and DNS prefetching can eliminate this tax from the critical path.
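As a minimal sketch of pre-warming (assuming Node.js with the built-in https module; the api.example.com host and /healthz path are illustrative assumptions), a service can open keep-alive connections during startup so the first real request skips the DNS + TCP + TLS tax:

```typescript
import * as https from 'https';

// Shared keep-alive agent: sockets opened here stay in the pool for reuse
const agent = new https.Agent({ keepAlive: true, maxSockets: 10 });

// Issue a lightweight request to force DNS + TCP + TLS before real traffic.
// The host and /healthz path are placeholders for your own endpoints.
function prewarmConnection(host: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const req = https.request(
      { host, path: '/healthz', method: 'HEAD', agent },
      (res) => {
        res.resume(); // Drain the response so the socket returns to the pool
        resolve();
      }
    );
    req.on('error', reject);
    req.end();
  });
}

// During startup, warm connections to known dependencies in parallel
async function prewarmAll(): Promise<void> {
  await Promise.allSettled(['api.example.com'].map(prewarmConnection));
}
```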
Before any network communication can occur, the client must translate the hostname (e.g., api.example.com) into an IP address. This is DNS resolution.
The DNS Resolution Process:
┌────────────┐      ┌────────────────┐      ┌─────────────┐      ┌──────────────┐
│   Client   │─────▶│ Local Resolver │─────▶│  Root DNS   │─────▶│   TLD DNS    │
│ (Browser/  │      │  (ISP/Local)   │      │   Servers   │      │   Servers    │
│  Service)  │      └────────────────┘      └─────────────┘      │ (.com, etc)  │
└────────────┘                                                   └──────┬───────┘
                                                                        │
      ┌──────────────┐      ┌─────────────────┐                         │
      │  IP Address  │◀─────│  Authoritative  │◀───────────────────────┘
      │   Returned   │      │   DNS Server    │
      └──────────────┘      │  (example.com)  │
                            └─────────────────┘
Step by Step:
1. The client first checks its local caches (browser cache, then the OS cache).
2. On a miss, it asks the configured local resolver (typically the ISP's, or a local caching resolver).
3. If the resolver has no cached answer, it walks the hierarchy: root servers point to the TLD servers (.com, etc.), which point to the domain's authoritative server.
4. The authoritative server returns the record, and each layer caches it according to its TTL.
DNS Latency Factors:
| Scenario | Typical Latency | Notes |
|---|---|---|
| Browser cache hit | 0ms | Instant, no network |
| OS cache hit | <1ms | System call, no network |
| Local resolver cache hit | 1-5ms | Single network hop |
| Resolver recursive lookup | 20-150ms | Multiple network hops |
| Authoritative server far away | 50-200ms | Geographic latency |
DNS Resolution Optimizations:
1. DNS Prefetching: Hint to the browser/client that a hostname will be needed soon:
<link rel="dns-prefetch" href="//api.example.com">
For services, pre-resolve DNS during startup or idle periods.
2. Reduced TTLs for Flexibility (with Trade-offs): Short TTLs let you shift traffic quickly (failover, migrations), but they increase lookup frequency and make clients more dependent on resolver availability.
3. Multiple A Records (Round-Robin DNS): Return multiple IP addresses; client typically uses first, but has fallbacks.
4. Local DNS Caching: Run a local caching resolver (systemd-resolved, dnsmasq) to reduce latency.
5. DNS over HTTPS (DoH) / DNS over TLS (DoT): Encrypted DNS prevents eavesdropping but may add latency. Use with persistent connection pooling.
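As an illustration of DoH, here's a minimal sketch querying Cloudflare's public JSON endpoint (any resolver that supports application/dns-json works; the endpoint choice is an assumption, not a recommendation):

```typescript
// Resolving a hostname over DNS-over-HTTPS via a JSON API
interface DoHAnswer {
  name: string;
  TTL: number;
  data: string; // The IP address, for A records
}

async function resolveOverHttps(hostname: string): Promise<DoHAnswer[]> {
  const url = `https://cloudflare-dns.com/dns-query?name=${encodeURIComponent(hostname)}&type=A`;
  const response = await fetch(url, {
    headers: { accept: 'application/dns-json' },
  });
  if (!response.ok) {
    throw new Error(`DoH query failed: ${response.status}`);
  }
  const result = await response.json();
  return (result.Answer ?? []) as DoHAnswer[];
}

// Usage: the query itself rides an HTTPS connection, so pooling that
// connection amortizes its handshake cost across many lookups.
resolveOverHttps('example.com').then(answers =>
  answers.forEach(a => console.log(`${a.name} -> ${a.data} (TTL ${a.TTL}s)`))
);
```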
Failure Modes:
- NXDOMAIN (ENOTFOUND): the name doesn't resolve; a typo, expired registration, or missing record.
- Timeout (ETIMEDOUT): the resolver is unreachable or overloaded.
- SERVFAIL: the upstream DNS server failed to answer.
- Stale cache: a cached record outlives a server move, sending traffic to a dead IP until the TTL expires.
```typescript
import * as dns from 'dns';
import { performance } from 'perf_hooks';

interface DNSResolutionResult {
  hostname: string;
  addresses: Array<{ address: string; ttl: number }>;
  latencyMs: number;
  wasCached: boolean;
}

// DNS resolution with timing and caching insights
async function resolveDNSWithMetrics(hostname: string): Promise<DNSResolutionResult> {
  const startTime = performance.now();

  return new Promise((resolve, reject) => {
    // dns.resolve4 gets IPv4 addresses
    dns.resolve4(hostname, { ttl: true }, (err, addresses) => {
      const latencyMs = performance.now() - startTime;

      if (err) {
        reject({
          hostname,
          error: err.code,
          latencyMs,
          // Common errors:
          // ENOTFOUND - Domain doesn't exist
          // ETIMEDOUT - DNS server timeout
          // ESERVFAIL - DNS server error
        });
        return;
      }

      resolve({
        hostname,
        addresses: addresses.map(a => ({
          address: a.address,
          ttl: a.ttl, // How long to cache
        })),
        latencyMs,
        // Estimate if this was cached
        wasCached: latencyMs < 5, // <5ms suggests cache hit
      });
    });
  });
}

// DNS prefetching for known endpoints
class DNSPrefetcher {
  private cache = new Map<string, { addresses: string[]; expiry: number }>();

  async prefetch(hostnames: string[]): Promise<void> {
    console.log(`Prefetching DNS for ${hostnames.length} hostnames...`);

    const results = await Promise.allSettled(
      hostnames.map(h => this.resolveAndCache(h))
    );

    const successful = results.filter(r => r.status === 'fulfilled').length;
    console.log(`DNS prefetch complete: ${successful}/${hostnames.length} successful`);
  }

  private async resolveAndCache(hostname: string): Promise<void> {
    const existing = this.cache.get(hostname);
    if (existing && existing.expiry > Date.now()) {
      return; // Already cached and valid
    }

    const result = await resolveDNSWithMetrics(hostname);

    // Cache with TTL (minimum 60s to avoid hammering DNS)
    const minTTL = 60;
    const ttl = Math.max(
      minTTL,
      Math.min(...result.addresses.map(a => a.ttl))
    );

    this.cache.set(hostname, {
      addresses: result.addresses.map(a => a.address),
      expiry: Date.now() + ttl * 1000,
    });
  }

  getAddress(hostname: string): string | undefined {
    const cached = this.cache.get(hostname);
    if (cached && cached.expiry > Date.now()) {
      // Return the first cached address
      return cached.addresses[0];
    }
    return undefined;
  }
}

// Usage in service startup
const prefetcher = new DNSPrefetcher();

async function initializeService() {
  // Prefetch DNS for all dependent services during startup
  await prefetcher.prefetch([
    'database.internal.example.com',
    'cache.internal.example.com',
    'auth.internal.example.com',
    'metrics.internal.example.com',
  ]);

  console.log('Service initialized with pre-resolved DNS');
}
```

With an IP address obtained, the client must establish a TCP connection to the server. This is the famous three-way handshake.
The Three-Way Handshake:
Client Server
│ │
│ ─────────── SYN (seq=x) ───────────────▶ │ t=0
│ │
│ ◀──────── SYN-ACK (seq=y, ack=x+1) ────── │ t=RTT/2
│ │
│ ─────────── ACK (ack=y+1) ──────────────▶ │ t=RTT
│ │
│ [Connection Established] │
│ │
Total time: 1 RTT (one round-trip time)
Time before client data reaches the server: 1.5 RTT (the client can send data together with the final ACK at t=RTT; that data arrives half a round trip later)
Step by Step:
1. SYN (Synchronize): The client sends a SYN packet with its initial sequence number (x).
2. SYN-ACK: The server acknowledges the SYN (ack=x+1) and sends its own SYN (seq=y).
3. ACK: The client acknowledges the server's SYN (ack=y+1).
Why This Matters:
The three-way handshake ensures both parties are ready to communicate and establishes initial sequence numbers for reliable, ordered delivery. However, it costs 1 RTT before any data can flow.
For a server 50ms away (100ms RTT), this adds 100ms to every new connection.
| Server Location | Approx. RTT | Handshake Time | Impact |
|---|---|---|---|
| Same data center | 0.5-2ms | 1-2ms | Negligible |
| Same region | 5-20ms | 5-20ms | Minor |
| Across continent | 30-70ms | 30-70ms | Noticeable |
| Intercontinental | 100-200ms | 100-200ms | Significant |
| Opposite hemisphere | 200-300ms | 200-300ms | Severe |
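To ground the numbers in this table, here's a minimal sketch that times just the TCP connect from Node.js (host and port are placeholders):

```typescript
import * as net from 'net';
import { performance } from 'perf_hooks';

// Times only the TCP three-way handshake: the connect callback fires once
// the SYN-ACK has arrived and the final ACK is sent, i.e., about one RTT.
function measureTcpHandshake(host: string, port: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const start = performance.now();
    const socket = net.createConnection({ host, port }, () => {
      const handshakeMs = performance.now() - start;
      socket.end();
      resolve(handshakeMs);
    });
    socket.on('error', reject);
  });
}

// Usage: compare your own endpoints against the table above
measureTcpHandshake('example.com', 443).then(ms =>
  console.log(`TCP handshake: ${ms.toFixed(1)}ms`)
);
```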
TCP Fast Open (TFO):
TCP Fast Open is an extension that allows data to be sent in the initial SYN packet (for repeat connections):
First connection (obtains TFO cookie):
SYN → SYN-ACK w/ cookie → ACK + data
Subsequent connections:
SYN + cookie + data → SYN-ACK + response
Saves 1 RTT on connection establishment!
TFO requires OS support on both client and server, explicit opt-in by the application, and a cookie obtained on a prior connection. Because data in the SYN can be replayed, it should only carry idempotent requests; middleboxes that strip the TFO option can also force a fallback to the standard handshake.
Connection States and Resources:
Each TCP connection consumes server resources:
| Resource | Typical Cost | Concern |
|---|---|---|
| File descriptor | 1 | Limited per process (ulimit) |
| Memory (buffers) | 10-50KB | Scales with connection count |
| CPU (state management) | Minimal | Context switching at scale |
| Ephemeral port | 1 (client-side) | ~64K theoretical max per destination IP:port; OS default ranges are smaller |
Connection Limits:
After closing a TCP connection, it enters TIME_WAIT state for 60 seconds (typically). A high-volume client making many short connections can exhaust ephemeral ports or accumulate memory in TIME_WAIT sockets. This is a key reason to use connection pooling and persistent connections.
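As a minimal sketch of that advice in Node.js (the agent tuning values are illustrative assumptions, not recommendations), a single shared keep-alive agent reuses sockets instead of opening, and later TIME_WAIT-ing, a connection per request:

```typescript
import * as https from 'https';

// One shared agent per process: sockets are reused across requests,
// so short-lived connections don't pile up in TIME_WAIT.
const pooledAgent = new https.Agent({
  keepAlive: true,     // Reuse sockets instead of closing after each request
  maxSockets: 50,      // Cap concurrent connections per host (illustrative)
  maxFreeSockets: 10,  // Idle sockets kept warm in the pool (illustrative)
  timeout: 30_000,     // Destroy sockets idle longer than this
});

// All requests through this helper share the same pool
function apiRequest(path: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const req = https.request(
      { host: 'api.example.com', path, agent: pooledAgent },
      (res) => {
        let body = '';
        res.on('data', chunk => (body += chunk));
        res.on('end', () => resolve(body));
      }
    );
    req.on('error', reject);
    req.end();
  });
}
```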
For HTTPS connections, after TCP is established, the TLS handshake must complete before any HTTP data can be exchanged. This is the most complex phase of connection establishment.
TLS 1.2 Handshake (Legacy):
Client Server
│ │
│ ──────── ClientHello ─────────────────────▶ │ t=0
│ (cipher suites, random) │
│ │
│ ◀─────── ServerHello ───────────────────── │ t=RTT/2
│ ◀─────── Certificate ───────────────────── │
│ ◀─────── ServerKeyExchange ─────────────── │
│ ◀─────── ServerHelloDone ──────────────── │
│ │
│ ──────── ClientKeyExchange ────────────────▶│ t=RTT
│ ──────── ChangeCipherSpec ─────────────────▶│
│ ──────── Finished ─────────────────────────▶│
│ │
│ ◀─────── ChangeCipherSpec ──────────────── │ t=1.5 RTT
│ ◀─────── Finished ──────────────────────── │
│ │
│ [Encrypted Tunnel Established] │ t=2 RTT
Total time: 2 RTT additional (on top of TCP handshake)
TLS 1.3 Handshake (Modern):
TLS 1.3 reduces the handshake to just 1 RTT:
Client Server
│ │
│ ──────── ClientHello ─────────────────────▶ │ t=0
│ (cipher suites, key_share) │
│ [Key material included!] │
│ │
│ ◀─────── ServerHello ───────────────────── │ t=RTT/2
│ ◀─────── EncryptedExtensions ──────────── │
│ ◀─────── Certificate ───────────────────── │
│ ◀─────── CertificateVerify ─────────────── │
│ ◀─────── Finished ──────────────────────── │
│ │
│ ──────── Finished ─────────────────────────▶│ t=RTT
│ │
│ [Encrypted Tunnel Established] │
Total time: 1 RTT additional (on top of TCP handshake)
TLS 1.3 with 0-RTT Resumption:
For resumed connections with session tickets:
Client Server
│ │
│ ──── ClientHello + Early Data ────────────▶ │ t=0
│ (session ticket + encrypted data) │
│ │
│ ◀──── ServerHello + Response ───────────── │ t=RTT/2
│ │
Total additional time: 0 RTT!
Data sent immediately with first packet.
| Scenario | TLS 1.2 | TLS 1.3 | TLS 1.3 0-RTT |
|---|---|---|---|
| New connection (50ms RTT) | +100ms | +50ms | N/A (no session) |
| Resumed connection (50ms RTT) | +50ms (w/ tickets) | +50ms | +0ms |
| New connection (200ms RTT) | +400ms | +200ms | N/A |
| Resumed connection (200ms RTT) | +200ms | +200ms | +0ms |
Certificate Validation:
During the TLS handshake, the client must validate the server's certificate:
- Chain of trust: the certificate must chain up to a root CA the client trusts.
- Validity period: the current time must fall within the certificate's notBefore/notAfter window.
- Hostname match: the requested hostname must match the certificate's Subject Alternative Name.
- Revocation: the certificate must not appear on a CRL or be reported revoked via OCSP.
OCSP Stapling:
Without stapling, the client must contact the CA's OCSP server to check revocation—adding latency. OCSP stapling lets the server include a signed, time-stamped OCSP response, eliminating this round-trip:
```nginx
# Nginx OCSP stapling configuration
ssl_stapling on;
ssl_stapling_verify on;
resolver 8.8.8.8 8.8.4.4 valid=300s;
resolver_timeout 5s;
```
Session Resumption:
To avoid full handshake overhead on reconnect:
- Session IDs (TLS 1.2): the server keeps session state and the client presents the ID when reconnecting.
- Session tickets (TLS 1.2): the server hands the client an encrypted, self-contained ticket, so no server-side state is needed.
- PSK resumption (TLS 1.3): the server sends NewSessionTicket messages after the handshake; the ticket acts as a pre-shared key and is what enables 0-RTT.
Session resumption can reduce handshake from 2 RTT to 1 RTT (TLS 1.2) or enable 0-RTT (TLS 1.3).
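To observe resumption in practice, here's a minimal Node.js sketch using the tls module (the target host is a placeholder): it saves the session ticket from a first connection and presents it on a second.

```typescript
import * as tls from 'tls';

// Connect and capture the session ticket. For TLS 1.3 the ticket arrives
// after the handshake completes, so we listen for the 'session' event
// rather than calling getSession() inside the connect callback.
function connectOnce(host: string, session?: Buffer): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    const socket = tls.connect({ host, port: 443, servername: host, session });
    socket.once('secureConnect', () => {
      console.log(`Session reused: ${socket.isSessionReused()}`);
    });
    socket.once('session', (ticket) => {
      socket.end();
      resolve(ticket);
    });
    socket.on('error', reject);
  });
}

async function demoResumption(host: string): Promise<void> {
  const ticket = await connectOnce(host);  // Full handshake; ticket saved
  await connectOnce(host, ticket);         // Abbreviated handshake, if the server supports it
}

demoResumption('example.com').catch(console.error);
```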
TLS 1.3 0-RTT early data is not protected against replay attacks. An attacker could capture and resend early data. Only use 0-RTT for idempotent requests (GET, HEAD), never for state-changing operations (POST, PUT, DELETE).
With the connection established and encrypted, the HTTP request can finally be transmitted. These phases are where application logic takes over from protocol mechanics.
Phase 4: Request Transmission
The client sends the HTTP request over the established connection:

Client → [Serialize Request] → [Encrypt (TLS)] → [TCP Segments] → [Network] → Server
Factors Affecting Request Transmission Time:
| Factor | Impact | Optimization |
|---|---|---|
| Request body size | Linear with size | Compress, minimize |
| Available bandwidth | Direct | Often limited by last mile |
| Header size | Per-request overhead | Use HTTP/2 (HPACK) |
| Number of segments | More segments = more overhead | Large MTU if available |
| Network quality | Packet loss causes retransmits | QoS, redundant paths |
For typical API requests (1-10KB), transmission takes 1-10ms.
Phase 5: Server Processing
This is where the actual work happens and is entirely application-dependent:
┌─────────────────────────────────────────────────────────────────────┐
│ Server Processing Breakdown │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ 1. Connection Accept & Request Parsing [1-5ms] │
│ - Accept connection from listener │
│ - Parse HTTP headers │
│ - Route to handler │
│ │
│ 2. Authentication & Authorization [5-50ms] │
│ - JWT validation │
│ - Permission checks │
│ - Rate limit evaluation │
│ │
│ 3. Business Logic [Variable] │
│ - Input validation │
│ - Core processing │
│ - External service calls (can add 10-500ms each!) │
│ │
│ 4. Data Access [1-100ms] │
│ - Database queries │
│ - Cache lookups │
│ - File system access │
│ │
│ 5. Response Construction [1-20ms] │
│ - Serialize response body │
│ - Set headers │
│ - Compression (if enabled) │
│ │
└─────────────────────────────────────────────────────────────────────┘
Server Processing Optimization:
1. Minimize External Calls: Each synchronous call to another service adds its full request lifecycle. Where possible, batch multiple operations into a single call, parallelize independent calls, cache responses, and move non-critical work off the request path (queues, background jobs).
2. Optimize Data Access: Add the right indexes, cache hot reads, batch queries to avoid N+1 patterns, and pool database connections.
3. Efficient Serialization: JSON serialization of large objects is measurable; consider faster serializers or binary formats for large or high-frequency payloads.
4. Response Compression: Compress responses above a size threshold when the client advertises support, and skip content that is already compressed (see the sketch below).
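A minimal sketch of threshold-based gzip compression with Node's built-in zlib (the 1KB threshold is an illustrative assumption):

```typescript
import * as zlib from 'zlib';
import * as http from 'http';

const COMPRESSION_THRESHOLD = 1024; // Don't compress tiny bodies: overhead exceeds savings

function sendJson(req: http.IncomingMessage, res: http.ServerResponse, data: unknown): void {
  const body = Buffer.from(JSON.stringify(data));
  const acceptsGzip = /\bgzip\b/.test(req.headers['accept-encoding'] ?? '');

  if (acceptsGzip && body.length >= COMPRESSION_THRESHOLD) {
    zlib.gzip(body, (err, compressed) => {
      res.setHeader('Content-Type', 'application/json');
      if (err || compressed.length >= body.length) {
        // Fall back to uncompressed on error or if compression didn't help
        res.end(body);
        return;
      }
      res.setHeader('Content-Encoding', 'gzip');
      res.end(compressed);
    });
  } else {
    res.setHeader('Content-Type', 'application/json');
    res.end(body);
  }
}
```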
```typescript
// Middleware for detailed request lifecycle timing
import { performance } from 'perf_hooks';

// Application-specific helpers assumed to exist elsewhere
declare function authenticateRequest(req: any): Promise<void>;
declare function logRequestTiming(req: any, timing: Partial<RequestTiming>): void;
declare const db: { query(sql: string, params: unknown[]): Promise<any> };
declare const profileService: { get(id: string): Promise<unknown> };

interface RequestTiming {
  parseStart: number;
  parseEnd: number;
  authStart: number;
  authEnd: number;
  handlerStart: number;
  handlerEnd: number;
  dbQueries: Array<{ query: string; durationMs: number }>;
  externalCalls: Array<{ service: string; durationMs: number }>;
  serializeStart: number;
  serializeEnd: number;
  totalServerTime: number;
}

function timingMiddleware(req: any, res: any, next: () => void) {
  const timing: Partial<RequestTiming> = {
    dbQueries: [],
    externalCalls: [],
  };

  const startTime = performance.now();
  timing.parseStart = startTime;

  // Attach timing object to request
  req.timing = timing;

  // Instrument response completion
  const originalEnd = res.end;
  res.end = function (...args: unknown[]) {
    timing.totalServerTime = performance.now() - startTime;

    // Add Server-Timing header for client visibility
    res.setHeader('Server-Timing', buildServerTimingHeader(timing));

    // Log detailed timing
    logRequestTiming(req, timing);

    return originalEnd.apply(this, args);
  };

  timing.parseEnd = performance.now();
  next();
}

function buildServerTimingHeader(timing: Partial<RequestTiming>): string {
  const entries: string[] = [];

  entries.push(`parse;dur=${((timing.parseEnd ?? 0) - (timing.parseStart ?? 0)).toFixed(1)}`);

  if (timing.authEnd && timing.authStart) {
    entries.push(`auth;dur=${(timing.authEnd - timing.authStart).toFixed(1)}`);
  }

  if (timing.handlerEnd && timing.handlerStart) {
    entries.push(`handler;dur=${(timing.handlerEnd - timing.handlerStart).toFixed(1)}`);
  }

  const totalDb = timing.dbQueries?.reduce((sum, q) => sum + q.durationMs, 0) ?? 0;
  if (totalDb > 0) {
    entries.push(`db;dur=${totalDb.toFixed(1)}`);
  }

  const totalExternal = timing.externalCalls?.reduce((sum, c) => sum + c.durationMs, 0) ?? 0;
  if (totalExternal > 0) {
    entries.push(`external;dur=${totalExternal.toFixed(1)}`);
  }

  entries.push(`total;dur=${timing.totalServerTime?.toFixed(1)}`);

  return entries.join(', ');
}

// Usage in handler
async function getUserHandler(req: any, res: any) {
  const timing = req.timing as RequestTiming;

  // Auth timing
  timing.authStart = performance.now();
  await authenticateRequest(req);
  timing.authEnd = performance.now();

  // Handler timing
  timing.handlerStart = performance.now();

  // Database query (instrumented)
  const dbStart = performance.now();
  const user = await db.query('SELECT * FROM users WHERE id = $1', [req.params.id]);
  timing.dbQueries.push({
    query: 'getUserById',
    durationMs: performance.now() - dbStart,
  });

  // External call (instrumented)
  if (user.needsProfileEnrichment) {
    const extStart = performance.now();
    const profile = await profileService.get(user.id);
    timing.externalCalls.push({
      service: 'profile-service',
      durationMs: performance.now() - extStart,
    });
  }

  timing.handlerEnd = performance.now();

  // Serialize response
  timing.serializeStart = performance.now();
  const responseBody = JSON.stringify(user);
  timing.serializeEnd = performance.now();

  res.json(user);
}
```

The final phases complete the request lifecycle, returning data to the client and releasing resources.
Phase 6: Response Transmission
The server sends the HTTP response back through the same connection:
Server → [Serialize Response] → [Encrypt (TLS)] → [TCP Segment] → [Network] → Client
Response Size Impact:
| Response Size | Typical Transmission Time (100 Mbps) | Notes |
|---|---|---|
| 1 KB | <1ms | Single packet |
| 10 KB | ~1ms | Few packets |
| 100 KB | ~8ms | Many packets; latency becomes noticeable |
| 1 MB | ~80ms | Significant, consider streaming |
| 10 MB | ~800ms | Very significant, definitely stream |
Streaming vs Buffered Responses:
For large responses, streaming allows the client to begin processing before the full response is transmitted:
Buffered: [--------- Server builds full response ---------] → [Transmit all]
|<------------- Client waits --------------->| |<- Process ->|
Streaming: [Build+Send chunk 1][Chunk 2][Chunk 3][Chunk N]
|<- Process ->| |<- ... ->| |<- Final ->|
First byte arrives much sooner!
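For the server side, here's a minimal sketch that streams rows as NDJSON with Node's http module, so the first byte leaves as soon as the first row is ready (the row source is a stand-in for a DB cursor or generator):

```typescript
import * as http from 'http';

// Stream rows as newline-delimited JSON: the first byte is sent as soon
// as the first row is serialized, not after the whole result set.
async function streamRows(
  res: http.ServerResponse,
  rows: AsyncIterable<unknown> // stand-in for a DB cursor or generator
): Promise<void> {
  res.setHeader('Content-Type', 'application/x-ndjson');
  // No Content-Length set: Node falls back to chunked transfer encoding

  for await (const row of rows) {
    const ok = res.write(JSON.stringify(row) + '\n');
    if (!ok) {
      // Respect backpressure: wait until the socket buffer drains
      await new Promise<void>(resolve => res.once('drain', () => resolve()));
    }
  }
  res.end();
}
```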
Phase 7: Response Processing (Client-Side)
The client processes the received response: check the status code, read headers, decompress the body if needed, deserialize it (e.g., JSON.parse), and hand the result to application code.
Phase 8: Cleanup
Resources are released or prepared for reuse: the connection is either closed (freeing the socket and buffers) or returned to a keep-alive pool, and TLS session tickets may be stored for later resumption.
```typescript
// Complete request lifecycle with client-side timing

interface RequestLifecycleMetrics {
  dnsStart: number;
  dnsEnd: number;
  tcpStart: number;
  tcpEnd: number;
  tlsStart: number;
  tlsEnd: number;
  requestStart: number;
  requestEnd: number;
  responseStart: number;  // First byte
  responseEnd: number;    // Last byte
  parseStart: number;
  parseEnd: number;
  // Derived metrics
  dnsTime: number;
  connectionTime: number; // TCP + TLS
  ttfb: number;           // Time to First Byte
  downloadTime: number;
  totalTime: number;
}

// In browsers, use the Performance API
function measureWithPerformanceAPI(url: string): PerformanceResourceTiming | null {
  const entries = performance.getEntriesByName(url);
  if (entries.length === 0) return null;

  const entry = entries[entries.length - 1] as PerformanceResourceTiming;

  console.log(`Request to ${url}:`);
  console.log(`  DNS: ${entry.domainLookupEnd - entry.domainLookupStart}ms`);
  console.log(`  TCP: ${entry.connectEnd - entry.connectStart}ms`);
  console.log(`  TLS: ${entry.secureConnectionStart ? entry.connectEnd - entry.secureConnectionStart : 0}ms`);
  console.log(`  TTFB: ${entry.responseStart - entry.requestStart}ms`);
  console.log(`  Download: ${entry.responseEnd - entry.responseStart}ms`);
  console.log(`  Total: ${entry.responseEnd - entry.startTime}ms`);

  return entry;
}

// For Node.js, manual instrumentation
async function measureRequestLifecycle<T>(
  url: string,
  options: RequestInit = {}
): Promise<{ data: T; metrics: RequestLifecycleMetrics }> {
  const metrics: Partial<RequestLifecycleMetrics> = {};

  // Note: DNS and TCP timing require lower-level access.
  // This example focuses on what's measurable at the HTTP level.
  metrics.requestStart = performance.now();

  const response = await fetch(url, options);
  metrics.responseStart = performance.now();
  metrics.ttfb = metrics.responseStart - metrics.requestStart;

  // Read response body
  const text = await response.text();
  metrics.responseEnd = performance.now();
  metrics.downloadTime = metrics.responseEnd - metrics.responseStart;

  // Parse response
  metrics.parseStart = performance.now();
  const data = JSON.parse(text) as T;
  metrics.parseEnd = performance.now();

  metrics.totalTime = metrics.parseEnd - metrics.requestStart;

  // Server timing (if provided)
  const serverTiming = response.headers.get('server-timing');
  if (serverTiming) {
    console.log('Server timing:', serverTiming);
  }

  return {
    data,
    metrics: metrics as RequestLifecycleMetrics,
  };
}

// Application-specific chunk handler assumed to exist elsewhere
declare function processChunk(chunk: string): Promise<void>;

// Streaming response processing
async function processStreamingResponse(url: string): Promise<void> {
  const response = await fetch(url);
  if (!response.body) {
    throw new Error('No response body');
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let bytesReceived = 0;
  const startTime = performance.now();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    bytesReceived += value.length;
    const elapsed = performance.now() - startTime;
    const throughput = (bytesReceived / 1024) / (elapsed / 1000); // KB/s
    console.log(`Received ${bytesReceived} bytes, throughput: ${throughput.toFixed(1)} KB/s`);

    // Process chunk immediately instead of waiting for the full response
    const chunk = decoder.decode(value, { stream: true });
    await processChunk(chunk);
  }

  console.log(`Total: ${bytesReceived} bytes in ${performance.now() - startTime}ms`);
}
```

Understanding the request lifecycle enables systematic optimization. Here's a framework for reducing end-to-end latency:
Optimization Priority (by typical impact):
1. Connection Reuse [Eliminates DNS + TCP + TLS overhead]
↓
2. Server Processing Time [Often the largest component]
↓
3. Geographic Proximity [CDN, edge deployment]
↓
4. Response Size [Compression, minimal payloads]
↓
5. Protocol Selection [HTTP/2, HTTP/3 for specific cases]
Always measure which lifecycle phases dominate your latency before optimizing. If server processing is 90% of your latency, no amount of connection optimization will help significantly. Use distributed tracing (OpenTelemetry, Jaeger) and client-side metrics (Resource Timing API) to identify bottlenecks.
We've traced the complete journey of an HTTP request from initiation to completion. This deep understanding is essential for building and debugging high-performance distributed systems.
What's Next:
Now that we understand the complete request lifecycle, we'll examine connection management—how to efficiently manage pools of connections, handle connection failures, and tune connection parameters for optimal performance and resource utilization.
You now have a comprehensive understanding of every phase of the HTTP request lifecycle. This knowledge enables you to identify latency bottlenecks, make informed optimization decisions, and reason about network behavior at a professional level.