Consider this scenario: a server needs to send a dynamically generated report that takes 30 seconds to compute. Under HTTP/1.0's model, the server had two options—both terrible:
Buffer the entire response: Wait for all 30 seconds of computation, accumulate the complete report in memory, then send it with a Content-Length header. Users stare at a blank screen.
Close the connection: Send data as it's generated without Content-Length, relying on connection close to signal completion. This breaks persistent connections and requires a new TCP handshake for the next request.
Chunked transfer encoding solved this dilemma elegantly. Introduced in HTTP/1.1, chunked encoding allows servers to send data in discrete chunks, each prefixed with its size, without knowing the total size upfront. The response streams to clients in real-time while the connection remains open for subsequent requests.
This page provides comprehensive coverage of chunked transfer encoding: the problem it solves, the wire-level format, implementation details, trailer headers, streaming use cases, and the interplay with compression. You'll understand how chunked encoding enables modern streaming patterns while maintaining HTTP/1.1's persistent connection model.
HTTP/1.0's response model was fundamentally based on known content length. Every response was expected to include a Content-Length header specifying the exact size of the body in bytes. This requirement created several serious problems:
Problem 1: Dynamic content generation
Web applications often generate content on-the-fly. A database query might return anywhere from 0 to millions of rows. A server-side template might produce different output sizes based on data. Computing the exact byte count required either:

- Buffering the entire response in memory before sending a single byte, or
- Generating the content twice: once to count bytes, once to send

Neither approach scales for large responses or high-traffic servers.
Problem 2: Compression compatibility
HTTP compression (gzip, deflate) transforms response bodies before transmission. The compressed size differs from the original size and isn't known until compression completes. Streaming compressed content was impossible—the server had to compress entirely, then count bytes, then send.
Problem 3: Proxies and transformations
Intermediate proxies might transform responses—adding headers, modifying content, or changing encoding. Each transformation potentially changed the content length, requiring the proxy to buffer the entire response before forwarding.
Problem 4: Real-time streaming
Some applications naturally produce unbounded streams: live video, server-sent events, real-time logs. These have no inherent "end" that would allow calculating a total length.
The HTTP/1.0 workaround:
HTTP/1.0's only escape hatch was to omit Content-Length entirely and signal "end of response" by closing the TCP connection. The client would read until the connection closed, knowing the response was complete. But this approach destroyed connection reuse:
HTTP/1.0 200 OK
Content-Type: text/html
[No Content-Length header]
<html>...dynamic content...</html>
[Server closes connection to signal end]
[Next request requires new TCP handshake]
Using connection close to signal response end works, but at massive cost. Every subsequent request pays the TCP handshake tax (1.5 RTT minimum, plus TLS overhead for HTTPS). On modern pages with 50+ resources, this eliminates all persistent connection benefits—returning to HTTP/1.0's catastrophic performance.
Chunked transfer encoding introduces a simple but powerful framing format. Instead of one large body with a known length, the response is divided into chunks, each prefixed with its size in hexadecimal. A final zero-length chunk signals the end of the response.
The format:
[chunk-size in hex]\r\n
[chunk-data]\r\n
[chunk-size in hex]\r\n
[chunk-data]\r\n
...
0\r\n
[optional trailer headers]\r\n
\r\n
Each chunk is self-describing: the hex size tells the client exactly how many bytes to read, followed by CRLF (\r\n), then the data bytes, followed by another CRLF. The terminating chunk has size 0 and no data.
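The framing rule can be sketched in a few lines. In this sketch, `encodeChunk` and `endChunks` are illustrative helpers, not a standard API:

```typescript
// Minimal sketch: frame one piece of data as an HTTP/1.1 chunk.
// These helpers are illustrative, not part of any standard library.
function encodeChunk(data: Buffer): Buffer {
  // Hex size line, CRLF, data bytes, CRLF
  const sizeLine = Buffer.from(data.length.toString(16) + '\r\n', 'ascii');
  return Buffer.concat([sizeLine, data, Buffer.from('\r\n', 'ascii')]);
}

function endChunks(): Buffer {
  // Zero-length chunk plus the final CRLF terminates the body (no trailers)
  return Buffer.from('0\r\n\r\n', 'ascii');
}

// "Hello, world!" is 13 bytes, so the size line is "d"
encodeChunk(Buffer.from('Hello, world!')).toString(); // "d\r\nHello, world!\r\n"
```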
```
HTTP/1.1 200 OK
Content-Type: text/html
Transfer-Encoding: chunked

2C
<html><head><title>Test</title></head><body>
11
<h1>Welcome!</h1>
28
<p>This content is being streamed...</p>
0

# Breakdown:
# Chunk 1: Size 0x2C (44 bytes) = "<html><head><title>Test</title></head><body>"
# Chunk 2: Size 0x11 (17 bytes) = "<h1>Welcome!</h1>"
# Chunk 3: Size 0x28 (40 bytes) = "<p>This content is being streamed...</p>"
# Chunk 4: Size 0 (0 bytes) = End of response

# Note: The hex size does NOT include the trailing CRLF
# Each chunk ends with CRLF, then next chunk begins
```

Key format details:
- Sizes are hexadecimal: A = 10 bytes, FF = 255 bytes, 1000 = 4096 bytes
- Hex digits are case-insensitive: a, A, 0a, 0A are all valid for 10 bytes
- Leading zeros are permitted: 0000A is valid for 10 bytes
- A chunk declaring size 5 contains exactly 5 bytes of data, plus the surrounding CRLFs
- The Transfer-Encoding: chunked header replaces (and is mutually exclusive with) Content-Length

There's no minimum or maximum chunk size (other than implementation limits). A server might send 1-byte chunks for maximum streaming granularity, or 64 KB chunks to minimize framing overhead. The protocol doesn't specify optimal chunk sizes—implementations choose based on buffering strategies and latency requirements.
| Hex Value | Decimal Bytes | Typical Use Case |
|---|---|---|
| 1 | 1 | Character-by-character streaming |
| 10 | 16 | Fine-grained streaming |
| 100 | 256 | Small buffered chunks |
| 400 | 1,024 | Standard chunk boundary |
| 1000 | 4,096 | Typical server buffer size |
| 10000 | 65,536 | Large efficient chunks |
| 100000 | 1,048,576 | 1 MB chunks (bulk transfer) |
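The tradeoff behind this table is easy to quantify: each chunk spends its hex size digits plus two CRLFs on framing. A rough sketch (`chunkOverhead` is a hypothetical helper):

```typescript
// Per-chunk framing overhead: hex digits of the size + two CRLFs (4 bytes)
function chunkOverhead(chunkSize: number): number {
  return chunkSize.toString(16).length + 4;
}

chunkOverhead(1);    // 5 bytes of framing per 1 byte of data (500% overhead)
chunkOverhead(4096); // 8 bytes per 4,096 bytes (~0.2% overhead)
```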
Chunked encoding unlocked entirely new patterns for web applications. Servers could now send data as it became available, fundamentally changing the relationship between content generation and transmission.
Pattern 1: Progressive HTML rendering
Servers can send HTML in stages as it's generated:
- The <head> section immediately (critical CSS, preloads)
- Subsequent body sections as their data becomes available

Browsers begin rendering as chunks arrive, improving perceived performance dramatically. Users see content within milliseconds instead of waiting for full generation.
```typescript
// Progressive HTML rendering with chunked encoding
import { Response } from 'express';

async function streamDashboard(res: Response, userId: string) {
  res.setHeader('Content-Type', 'text/html');
  res.setHeader('Transfer-Encoding', 'chunked');

  // Chunk 1: Critical head section (immediate)
  res.write(`<!DOCTYPE html><html><head>
  <link rel="stylesheet" href="/critical.css">
  <link rel="preload" href="/main.js" as="script">
</head><body>
  <nav>Dashboard</nav>
  <div class="loading-skeleton">Loading...</div>`);

  // Chunk 2: User data (after DB query)
  const user = await database.getUser(userId);
  res.write(`
  <header>
    <h1>Welcome, ${user.name}!</h1>
  </header>`);

  // Chunk 3: Dashboard widgets (parallel queries)
  const [stats, notifications, activity] = await Promise.all([
    database.getUserStats(userId),
    database.getNotifications(userId),
    database.getRecentActivity(userId),
  ]);
  res.write(`
  <section class="stats">${renderStats(stats)}</section>
  <section class="notifications">${renderNotifications(notifications)}</section>
  <section class="activity">${renderActivity(activity)}</section>`);

  // Final chunk: closing tags
  res.write(`
  <script src="/main.js"></script>
</body></html>`);

  res.end(); // Sends zero-length terminating chunk
}
```

Pattern 2: Streaming API responses
Modern APIs often return large result sets. Chunked encoding enables streaming JSON arrays:
// Traditional (buffered): Wait for all results, send once
{"users": [{...}, {...}, {...}, ...(10000 items)...]}
// Streamed (chunked): Send items as they're retrieved
{"users": [
{...}, // Chunk 1: First item
{...}, // Chunk 2: Second item
{...}, // Chunk N: Nth item
]} // Final chunk
Clients can begin processing results immediately, potentially rendering items while remaining data is still in transit.
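The key client-side detail is that chunk boundaries rarely align with item boundaries, so the client buffers partial data and emits only completed items. A minimal sketch for newline-delimited items (`makeLineAccumulator` is a hypothetical helper, not a library API):

```typescript
// Sketch: emit only complete items from a stream arriving in arbitrary
// chunk boundaries. The trailing partial line is buffered between calls.
function makeLineAccumulator() {
  let partial = '';
  return function feed(chunk: string): unknown[] {
    partial += chunk;
    const lines = partial.split('\n');
    partial = lines.pop()!; // keep the incomplete tail for the next chunk
    return lines.filter((l) => l.trim() !== '').map((l) => JSON.parse(l));
  };
}

const feed = makeLineAccumulator();
feed('{"id":1}\n{"id'); // → [ { id: 1 } ]  (second object incomplete, buffered)
feed('":2}\n');         // → [ { id: 2 } ]
```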
Chunked encoding dramatically improves TTFB for dynamically generated content. Instead of waiting until generation completes, the server sends the first chunk as soon as any data is ready. For slow operations (complex queries, external API calls), this can reduce perceived latency from seconds to milliseconds.
An often-overlooked feature of chunked encoding is trailer headers—HTTP headers sent after the body content. Trailers solve a specific problem: headers whose values depend on the body content itself.
The use case:
Consider sending a large file with an integrity checksum. The checksum can only be calculated after processing the entire file. Without trailers, the server would have to read the whole file just to compute the checksum header before sending the first byte. With trailers, the checksum header is sent after the body, computed as data streams through.
```
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Transfer-Encoding: chunked
Trailer: Content-MD5, Server-Timing

1000
[4096 bytes of file data...]
1000
[4096 more bytes...]
800
[2048 final bytes...]
0
Content-MD5: Q2hlY2sgSW50ZWdyaXR5IQ==
Server-Timing: db;dur=123, render;dur=45

# Breakdown:
# - "Trailer:" header announces which headers will appear as trailers
# - Body is sent in chunks
# - Zero-length chunk signals body end
# - Trailer headers follow the zero-length chunk
# - Empty line terminates the message
```

Trailer header rules:
- The Trailer header in the initial response lists which headers will appear as trailers
- Certain headers must never be sent as trailers, because receivers need them before reading the body:
  - Transfer-Encoding (controls message framing)
  - Content-Length (would contradict chunked encoding)
  - Trailer (recursive)
  - Host (request header)

Practical trailer applications:
| Trailer Header | Purpose | Example Value |
|---|---|---|
| Content-MD5 | Integrity verification after streaming | rL0Y20zC+Fzt72VPzMSk2A== |
| Content-SHA256 | Stronger integrity check | [base64-encoded SHA256] |
| Server-Timing | Performance metrics calculated during processing | db;dur=53.2, render;dur=12.3 |
| X-Request-Id | Request tracking ID generated during handling | req-abc123-def456 |
| Digest | RFC 3230 content digest | sha-256=:base64hash: |
Modern browsers have inconsistent trailer support. While the Fetch API can access trailers in some browsers, XMLHttpRequest typically ignores them. Server-to-server communication and specialized clients handle trailers more reliably. Always verify client support before depending on trailer headers.
Chunked encoding interacts closely with HTTP compression. When both are used, the order of encoding matters, and the relationship can be confusing.
The encoding order:
HTTP distinguishes between two types of encoding:

- Content-Encoding: a property of the resource representation itself (e.g., gzip compression of the body)
- Transfer-Encoding: a property of the message in transit (e.g., chunked framing)

These encodings are applied in a specific order:
Original Content
↓
Content-Encoding (gzip/deflate) ← Applied first, to content
↓
Transfer-Encoding (chunked) ← Applied second, to message
↓
Transmitted bytes
The receiver reverses the process: dechunk first, then decompress.
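That ordering can be demonstrated end to end. In this sketch, `chunkify` and `dechunkify` are illustrative helpers for the Transfer-Encoding framing, while gzip plays the role of the Content-Encoding:

```typescript
// Sender: gzip first, then chunk. Receiver: dechunk first, then gunzip.
import { gzipSync, gunzipSync } from 'node:zlib';

function chunkify(body: Buffer, chunkSize: number): Buffer {
  const parts: Buffer[] = [];
  for (let i = 0; i < body.length; i += chunkSize) {
    const piece = body.subarray(i, i + chunkSize);
    parts.push(Buffer.from(piece.length.toString(16) + '\r\n'), piece, Buffer.from('\r\n'));
  }
  parts.push(Buffer.from('0\r\n\r\n')); // terminating chunk
  return Buffer.concat(parts);
}

function dechunkify(wire: Buffer): Buffer {
  const parts: Buffer[] = [];
  let pos = 0;
  for (;;) {
    const lineEnd = wire.indexOf('\r\n', pos);
    const size = parseInt(wire.subarray(pos, lineEnd).toString(), 16);
    if (size === 0) break;
    parts.push(wire.subarray(lineEnd + 2, lineEnd + 2 + size));
    pos = lineEnd + 2 + size + 2; // skip data plus its trailing CRLF
  }
  return Buffer.concat(parts);
}

// Sender: apply Content-Encoding (gzip), then Transfer-Encoding (chunked)
const wire = chunkify(gzipSync('hello hello hello'), 8);

// Receiver: reverse the order
const text = gunzipSync(dechunkify(wire)).toString(); // "hello hello hello"
```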
```
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Transfer-Encoding: chunked

3E8
[1000 bytes of gzip-compressed data]
200
[512 bytes of gzip-compressed data]
C8
[200 bytes of gzip-compressed data]
0

# The flow:
# 1. Server streams HTML through gzip compressor
# 2. Compressed data accumulates in buffer
# 3. When buffer reaches threshold, send as chunk
# 4. Continue until content complete
# 5. Flush final compressed data
# 6. Send zero-length chunk

# Client processing:
# 1. Receive chunks and dechunk (reassemble stream)
# 2. Pass assembled stream through gzip decompressor
# 3. Render resulting HTML
```

Streaming compression:
Chunked encoding enables streaming compression, where data is compressed incrementally rather than buffered entirely: the server generates content, feeds it through the compressor, and sends each piece of compressor output as a chunk; the client dechunks, feeds the reassembled stream through the decompressor, and processes the output as it arrives.
This pipeline minimizes memory usage at every stage—neither server nor client needs to buffer the complete uncompressed or compressed content.
Stream compressors like gzip maintain internal buffers for efficiency. For real-time streaming, servers can "flush" the compressor periodically, forcing it to emit whatever compressed data is available. This trades compression ratio for latency—more flushes mean faster delivery but larger total size.
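In Node.js, this flush behavior is available on zlib streams. A sketch (the event strings are illustrative payloads):

```typescript
// Sketch: incremental gzip with an explicit flush for low latency
import { createGzip, gunzipSync, constants } from 'node:zlib';

const gzip = createGzip();
const collected: Buffer[] = [];

gzip.on('data', (chunk: Buffer) => {
  // In a real server, each piece would be sent immediately: res.write(chunk)
  collected.push(chunk);
});

gzip.write('first event\n');
// Z_SYNC_FLUSH forces the compressor to emit whatever it has buffered,
// trading compression ratio for latency
gzip.flush(constants.Z_SYNC_FLUSH);
gzip.write('second event\n');
gzip.end(); // flush remaining data and the gzip trailer

gzip.on('end', () => {
  // Round-trip check: the flushed pieces reassemble into valid gzip data
  const original = gunzipSync(Buffer.concat(collected)).toString();
  console.log(original); // "first event\nsecond event\n"
});
```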
The Transfer-Encoding: chunked, gzip ambiguity:
RFC 7230 technically allows stacking multiple transfer encodings:
Transfer-Encoding: gzip, chunked
This means: "First gzip the message, then chunk it." However, this creates complications: few implementations support transfer codings other than chunked, chunked must always be the final (outermost) transfer coding, and intermediaries may mishandle stacked codings.
In practice, servers use Content-Encoding for compression and Transfer-Encoding: chunked for framing, avoiding stacked transfer encodings entirely.
| Header | Purpose | Values | Applied To |
|---|---|---|---|
| Content-Encoding | Compress/transform content | gzip, deflate, br, identity | Resource representation |
| Transfer-Encoding | Frame for transmission | chunked, (gzip), (deflate) | HTTP message |
| Accept-Encoding | Client compression prefs | gzip, deflate, br, * | Request header |
| TE | Client transfer prefs | trailers, chunked, gzip | Request header |
Chunked encoding is the foundation for HTTP-based streaming technologies. One of the most important is Server-Sent Events (SSE)—a standardized mechanism for servers to push events to clients over HTTP.
How SSE leverages chunked encoding:
SSE uses chunked encoding to maintain a long-lived HTTP response that never truly "ends." The server sends event data as chunks whenever events occur:
```
GET /events HTTP/1.1
Accept: text/event-stream
```
```
HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
Transfer-Encoding: chunked
Connection: keep-alive

# First event chunk
36
data: {"type":"greeting","message":"Connected!"}

# Second event chunk (sent 5 seconds later)
42
data: {"type":"update","count":42,"timestamp":1705500000}

# Third event chunk (sent 2 minutes later)
38
event: notification
data: {"text":"You have a new message"}

# Connection remains open indefinitely...
# Server sends chunks whenever events occur
# Client receives them as a continuous stream

# Event format (text/event-stream):
# - "data:" prefix for event data
# - "event:" prefix for event type (optional)
# - "id:" prefix for event ID (optional)
# - Empty line separates events
```

Other streaming patterns enabled by chunked encoding:
```typescript
// Streaming NDJSON (Newline-Delimited JSON) response
import { Response } from 'express';

async function streamSearchResults(
  res: Response,
  query: string
): Promise<void> {
  res.setHeader('Content-Type', 'application/x-ndjson');
  res.setHeader('Transfer-Encoding', 'chunked');

  const startTime = Date.now();

  // Stream results from database cursor
  const cursor = database.searchCursor(query);
  let resultCount = 0;

  for await (const result of cursor) {
    // Each result is a complete JSON object on its own line
    const jsonLine = JSON.stringify(result) + '\n';
    res.write(jsonLine);
    resultCount++;

    // Client can process each result immediately
    // No need to wait for all 10,000 results
  }

  // Send summary as final object
  res.write(JSON.stringify({
    type: 'summary',
    totalResults: resultCount,
    queryTime: Date.now() - startTime,
  }) + '\n');

  res.end();
}

// Client can process line-by-line:
// const reader = response.body.getReader();
// const decoder = new TextDecoder();
// while (true) {
//   const { value, done } = await reader.read();
//   if (done) break;
//   const lines = decoder.decode(value).split('\n');
//   for (const line of lines) {
//     if (line) processResult(JSON.parse(line));
//   }
// }
```

WebSockets provide bidirectional communication, while chunked HTTP responses are unidirectional (server to client). For pure server-to-client streaming (dashboards, notifications, logs), chunked encoding or SSE are simpler and work through more proxies. For interactive applications requiring client-to-server messages, WebSockets are preferred.
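Tying this back to SSE: on the client side, the text/event-stream format is simple enough to parse by hand. A simplified sketch (the real EventSource algorithm also handles "id:", "retry:", comments, and buffering across chunk boundaries):

```typescript
// Simplified sketch of parsing text/event-stream by hand
interface SseEvent {
  event: string;
  data: string;
}

function parseSse(streamText: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of streamText.split('\n\n')) {
    let event = 'message'; // default event type
    const data: string[] = [];
    for (const line of block.split('\n')) {
      if (line.startsWith('data:')) data.push(line.slice(5).trim());
      else if (line.startsWith('event:')) event = line.slice(6).trim();
    }
    if (data.length > 0) events.push({ event, data: data.join('\n') });
  }
  return events;
}

const parsed = parseSse('event: notification\ndata: {"text":"hi"}\n\ndata: ping\n\n');
// parsed[0] → { event: 'notification', data: '{"text":"hi"}' }
// parsed[1] → { event: 'message', data: 'ping' }
```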
Implementing chunked encoding correctly requires attention to several details and edge cases.
Chunk parsing correctness:
Parsing chunked responses requires careful handling:
```typescript
// Robust chunked encoding parser
class ChunkedDecoder {
  private buffer: Buffer = Buffer.alloc(0);
  private state: 'SIZE' | 'DATA' | 'DATA_CRLF' | 'TRAILER' | 'DONE' = 'SIZE';
  private currentChunkSize: number = 0;
  private bytesRemaining: number = 0;

  decode(input: Buffer): Buffer[] {
    this.buffer = Buffer.concat([this.buffer, input]);
    const chunks: Buffer[] = [];

    while (this.buffer.length > 0 && this.state !== 'DONE') {
      switch (this.state) {
        case 'SIZE': {
          const crlfIndex = this.buffer.indexOf('\r\n');
          if (crlfIndex === -1) return chunks; // Need more data
          const sizeHex = this.buffer.slice(0, crlfIndex).toString();
          // Handle optional chunk extensions (rare)
          const sizePart = sizeHex.split(';')[0].trim();
          this.currentChunkSize = parseInt(sizePart, 16);
          if (isNaN(this.currentChunkSize)) {
            throw new Error(`Invalid chunk size: ${sizeHex}`);
          }
          this.buffer = this.buffer.slice(crlfIndex + 2);
          if (this.currentChunkSize === 0) {
            this.state = 'TRAILER';
          } else {
            this.bytesRemaining = this.currentChunkSize;
            this.state = 'DATA';
          }
          break;
        }

        case 'DATA': {
          const toRead = Math.min(this.bytesRemaining, this.buffer.length);
          if (toRead > 0) {
            chunks.push(this.buffer.slice(0, toRead));
            this.buffer = this.buffer.slice(toRead);
            this.bytesRemaining -= toRead;
          }
          if (this.bytesRemaining === 0) {
            this.state = 'DATA_CRLF';
          } else {
            return chunks; // Need more data
          }
          break;
        }

        case 'DATA_CRLF': {
          if (this.buffer.length < 2) return chunks;
          if (this.buffer[0] !== 0x0d || this.buffer[1] !== 0x0a) {
            throw new Error('Missing CRLF after chunk data');
          }
          this.buffer = this.buffer.slice(2);
          this.state = 'SIZE';
          break;
        }

        case 'TRAILER': {
          // Read trailer headers until empty line
          const emptyLineIndex = this.buffer.indexOf('\r\n\r\n');
          if (emptyLineIndex === -1) {
            // Check for just \r\n (no trailers)
            if (this.buffer.length >= 2 &&
                this.buffer[0] === 0x0d && this.buffer[1] === 0x0a) {
              this.buffer = this.buffer.slice(2);
              this.state = 'DONE';
            }
            return chunks;
          }
          // Parse trailer headers here if needed
          this.buffer = this.buffer.slice(emptyLineIndex + 4);
          this.state = 'DONE';
          break;
        }
      }
    }
    return chunks;
  }

  isDone(): boolean {
    return this.state === 'DONE';
  }
}
```

Edge cases and error handling:
- Chunk extensions: a size line may carry extensions (e.g., 1A;ext=value\r\n). Parse and ignore unrecognized extensions.
- Empty chunks: only the final chunk of size 0 is the terminator. Mid-stream empty chunks are illegal.
- Conflicting headers: Content-Length and Transfer-Encoding: chunked are mutually exclusive. If both present, chunked takes precedence.

Ambiguity between Content-Length and Transfer-Encoding handling has enabled "HTTP request smuggling" attacks. When a frontend proxy and backend server disagree on message boundaries, attackers can inject malicious requests. Always prefer Transfer-Encoding over Content-Length when both are present, and use strict parsing.
Chunked transfer encoding represents one of HTTP/1.1's most impactful and successful innovations. Unlike pipelining, which failed in practice, chunked encoding became universally adopted and remains essential even as HTTP evolves.
What's next:
With chunked encoding enabling flexible response handling, we'll examine another critical HTTP/1.1 feature: the Host header. This seemingly simple addition was essential for virtual hosting—running multiple websites on a single IP address—and fundamentally changed how web infrastructure scales.
You now understand chunked transfer encoding comprehensively: the Content-Length problem it solves, the wire format, streaming patterns it enables, interaction with compression, and implementation considerations. This knowledge is essential for understanding HTTP streaming, Server-Sent Events, and modern real-time web applications.