For over 15 years, HTTP/1.1 served as the backbone of the web. But its fundamental design—text-based headers, one request per connection, synchronous request-response cycles—created bottlenecks that became increasingly painful as applications grew more complex and demanding.
HTTP/2, standardized in 2015 (RFC 7540), represents a complete reimagining of how browsers and servers communicate. It maintains HTTP semantics (verbs, headers, status codes) while revolutionizing the underlying transport with binary framing, multiplexing, and stream-based communication.
For gRPC, HTTP/2 isn't just beneficial—it's essential. gRPC's streaming capabilities, efficient binary transport, and multiplexed connections all depend on HTTP/2 features that are simply impossible on HTTP/1.1.
By the end of this page, you will understand HTTP/2's architecture at the frame level: binary framing, stream multiplexing, header compression with HPACK, flow control, server push, and prioritization. You'll see why these features make HTTP/2 the ideal transport for high-performance RPC frameworks like gRPC.
To appreciate HTTP/2's innovations, we must first understand the specific problems it solves. HTTP/1.1's design constraints created significant performance bottlenecks for modern applications.
Head-of-Line Blocking:
HTTP/1.1 processes requests sequentially within a connection. If you send requests A, B, and C, you must wait for response A before receiving B, and wait for B before C—even if B is ready first. A single slow response blocks everything behind it.
HTTP/1.1 Timeline:
```
Request A  ─────►
Response A ─────────────────►
Request B  ─────►  (BLOCKED)
Response B ─────►
Request C  ─────►  (BLOCKED)
Response C ─►

Total time: sum of all response times (sequential)
```
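To make the difference concrete, here is a small illustrative calculation (the response times are made up, matching the A/B/C timeline above) contrasting sequential HTTP/1.1 delivery with concurrent HTTP/2 streams:

```typescript
// Illustrative sketch: total wall-clock time for three responses over
// HTTP/1.1 (one connection, strictly sequential) versus HTTP/2
// (all streams in flight concurrently). Times are hypothetical.
const responseTimesMs = [120, 450, 80]; // responses A, B, C

// HTTP/1.1: each response must finish before the next begins.
const http11TotalMs = responseTimesMs.reduce((sum, t) => sum + t, 0);

// HTTP/2: streams progress concurrently; the slowest response dominates.
const http2TotalMs = Math.max(...responseTimesMs);

console.log(http11TotalMs); // 650
console.log(http2TotalMs);  // 450
```

Even in this toy model, the slow response B no longer delays A and C; it only determines when the last byte arrives.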
Workarounds and Their Costs:
Developers invented workarounds, but each has significant downsides:
| Workaround | How It Works | Problems |
|---|---|---|
| Multiple Connections | Open 6-8 parallel connections | Socket exhaustion, server memory, TCP slow start repeated |
| Domain Sharding | Spread resources across subdomains | DNS lookups, SSL handshakes, operational complexity |
| Image Sprites | Combine images into one file | Download entire sprite for one image, cache invalidation |
| CSS/JS Bundling | Concatenate files together | Single file invalidates entire cache, delay loading |
| Data URI Inlining | Base64 embed resources in HTML | 33% size increase, no separate caching |
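The 33% figure for Data URI inlining follows directly from how Base64 works: every 3 input bytes become 4 output characters. A quick sketch:

```typescript
// Why Base64 inlining costs ~33%: the encoded size is 4 * ceil(n / 3)
// characters for n input bytes (padding included).
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

const raw = 30_000; // a ~30 KB image, say (hypothetical size)
const encoded = base64Length(raw);
const overheadPct = ((encoded - raw) / raw) * 100;

console.log(encoded);                 // 40000
console.log(overheadPct.toFixed(1));  // 33.3
```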
Text-Based Protocol Overhead:
HTTP/1.1 headers are plain text, repeated verbatim with every request:
```
GET /api/users/12345 HTTP/1.1
Host: api.example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)...
Accept: application/json
Accept-Language: en-US,en;q=0.9
Accept-Encoding: gzip, deflate, br
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
Cookie: session=abc; tracking=xyz; preferences=dark-mode
Connection: keep-alive
```
This header block can be 500-800 bytes—sent with every single request. For an API making 100 requests/second, that's 50-80 KB/sec just for headers that rarely change.
HTTP/1.1 defined pipelining (send multiple requests without waiting), but it was never widely deployed. Responses still had to arrive in order, many proxies didn't support it, and any error required connection reset. Most browsers disabled it by default. HTTP/2's multiplexing is the proper solution.
HTTP/2's most fundamental change is the introduction of a binary framing layer between the application (HTTP semantics) and the transport (TCP). This layer divides all HTTP/2 communication into discrete, binary-encoded frames.
Frame Structure:
Every HTTP/2 frame consists of a 9-byte header followed by the payload:
```
HTTP/2 Frame Format (RFC 7540 Section 4.1)

+-----------------------------------------------+
|                 Length (24)                   |
+---------------+---------------+---------------+
|   Type (8)    |   Flags (8)   |
+-+-------------+---------------+-------------------------------+
|R|                 Stream Identifier (31)                      |
+=+=============================================================+
|                   Frame Payload (0...)                        |
+---------------------------------------------------------------+
```

Total header: 9 bytes

- Length: 24 bits (payload length, max 16,384 bytes by default)
- Type: 8 bits (DATA, HEADERS, PRIORITY, RST_STREAM, etc.)
- Flags: 8 bits (END_STREAM, END_HEADERS, PADDED, PRIORITY)
- Reserved: 1 bit (must be 0)
- Stream ID: 31 bits (which stream this frame belongs to)

| Type | Code | Purpose | Key Flags |
|---|---|---|---|
| DATA | 0x00 | Application data (request/response bodies) | END_STREAM, PADDED |
| HEADERS | 0x01 | HTTP headers (compressed with HPACK) | END_STREAM, END_HEADERS, PRIORITY |
| PRIORITY | 0x02 | Stream dependency and weight | None |
| RST_STREAM | 0x03 | Immediately terminate a stream | None |
| SETTINGS | 0x04 | Connection-level configuration | ACK |
| PUSH_PROMISE | 0x05 | Server push notification | END_HEADERS, PADDED |
| PING | 0x06 | Connection liveness and RTT measurement | ACK |
| GOAWAY | 0x07 | Graceful connection shutdown | None |
| WINDOW_UPDATE | 0x08 | Flow control credit | None |
| CONTINUATION | 0x09 | Continue HEADERS if too large | END_HEADERS |
```typescript
// Example: Parsing an HTTP/2 frame header

interface HTTP2FrameHeader {
  length: number;   // Payload length (24 bits)
  type: number;     // Frame type (8 bits)
  flags: number;    // Frame flags (8 bits)
  streamId: number; // Stream identifier (31 bits)
}

const FRAME_TYPES = {
  DATA: 0x00,
  HEADERS: 0x01,
  PRIORITY: 0x02,
  RST_STREAM: 0x03,
  SETTINGS: 0x04,
  PUSH_PROMISE: 0x05,
  PING: 0x06,
  GOAWAY: 0x07,
  WINDOW_UPDATE: 0x08,
  CONTINUATION: 0x09,
} as const;

const FLAGS = {
  END_STREAM: 0x01,  // No more data for this stream
  END_HEADERS: 0x04, // Headers complete (no CONTINUATION)
  PADDED: 0x08,      // Payload has padding
  PRIORITY: 0x20,    // Includes priority information
} as const;

function parseFrameHeader(buffer: Uint8Array): HTTP2FrameHeader {
  if (buffer.length < 9) {
    throw new Error('Insufficient data for frame header');
  }

  // Parse 24-bit length (big-endian)
  const length = (buffer[0] << 16) | (buffer[1] << 8) | buffer[2];

  // Parse type and flags
  const type = buffer[3];
  const flags = buffer[4];

  // Parse 31-bit stream ID (ignore reserved bit)
  const streamId =
    ((buffer[5] & 0x7f) << 24) | // Mask out reserved bit
    (buffer[6] << 16) |
    (buffer[7] << 8) |
    buffer[8];

  return { length, type, flags, streamId };
}

// Example: A HEADERS frame establishing a new request
const headersFrame = new Uint8Array([
  0x00, 0x00, 0x3d,       // Length: 61 bytes
  0x01,                   // Type: HEADERS
  0x04,                   // Flags: END_HEADERS
  0x00, 0x00, 0x00, 0x01, // Stream ID: 1
  // ... payload: HPACK-encoded headers
]);

const parsed = parseFrameHeader(headersFrame);
console.log(parsed);
// { length: 61, type: 1, flags: 4, streamId: 1 }
// This is a HEADERS frame for stream 1, headers complete
```

Binary parsing is deterministic and fast: read fixed offsets, apply bit masks, done. Text parsing requires scanning for delimiters (`\r\n`), handling variable-length lines, and string comparisons. HTTP/2's binary framing enables predictable, efficient parsing without ambiguity.
HTTP/2's most impactful feature is stream multiplexing: the ability to interleave multiple independent request-response exchanges on a single TCP connection.
Streams, Messages, and Frames:
HTTP/2 introduces a hierarchy:

- Connection: one TCP connection carrying all traffic
- Stream: an independent, bidirectional exchange within the connection
- Message: a complete request or response, composed of frames
- Frame: the smallest unit of communication (HEADERS, DATA, etc.)
Multiple streams coexist on one connection, frames from different streams interleave freely, and there's no blocking between streams.
```
HTTP/2 Multiplexing on a Single Connection

Time ──────────────────────────────────────────────────►

Stream 1 (API call):
├── HEADERS (request) ──►
│   ◄── HEADERS (response)
│   ◄── DATA (part 1)
│   ◄── DATA (part 2, END_STREAM)

Stream 3 (Image download):
│   HEADERS ──►
│   ◄── HEADERS
│   ◄── DATA (chunk 1)
│   ◄── DATA (chunk 2)
│   ◄── DATA (chunk 3, END_STREAM)

Stream 5 (WebSocket-like bidirectional):
│   HEADERS (upgrade) ──►
│   ◄── HEADERS (accept)
│   DATA (client msg) ──►
│   ◄── DATA (server msg)
│   DATA (client msg) ──►
│   ◄── DATA (server msg ...)

Wire order (interleaved frames):
┌─────────┬─────────┬─────────┬─────────┬─────────┐
│Stream 1 │Stream 3 │Stream 1 │Stream 5 │Stream 3 │
│HEADERS  │HEADERS  │DATA     │HEADERS  │DATA     │
└─────────┴─────────┴─────────┴─────────┴─────────┘

All three streams progress concurrently on ONE TCP connection!
```

Stream Lifecycle:
Streams have a well-defined state machine:
```typescript
// HTTP/2 Stream States (RFC 7540 Section 5.1)

enum StreamState {
  IDLE = 'idle',
  RESERVED_LOCAL = 'reserved (local)',
  RESERVED_REMOTE = 'reserved (remote)',
  OPEN = 'open',
  HALF_CLOSED_LOCAL = 'half-closed (local)',
  HALF_CLOSED_REMOTE = 'half-closed (remote)',
  CLOSED = 'closed',
}

/*
State Transitions:

                          +--------+
                  send PP |        | recv PP
                 ,--------|  idle  |--------.
                /         |        |         \
               v          +--------+          v
        +----------+          |           +----------+
        |          |          | send H /  |          |
 ,------| reserved |          | recv H    | reserved |------.
 |      | (local)  |          |           | (remote) |      |
 |      +----------+          v           +----------+      |
 |          |             +--------+            |           |
 |          |     recv ES |        | send ES    |           |
 |   send H |     ,-------|  open  |-------.    | recv H    |
 |          |    /        |        |        \   |           |
 |          v   v         +--------+         v  v           |
 |      +----------+          |           +----------+      |
 |      |   half   |          |           |   half   |      |
 |      |  closed  |          | send R /  |  closed  |      |
 |      | (remote) |          | recv R    | (local)  |      |
 |      +----------+          |           +----------+      |
 |           |                |                 |           |
 |           | send ES /      |       recv ES / |           |
 |           | send R /       v        send R / |           |
 |           | recv R     +--------+   recv R   |           |
 | send R /  '----------->|        |<-----------'  send R / |
 | recv R                 | closed |               recv R   |
 '----------------------->|        |<-----------------------'
                          +--------+

Legend:
  H:  HEADERS frame
  PP: PUSH_PROMISE frame
  ES: END_STREAM flag
  R:  RST_STREAM frame
*/

interface Stream {
  id: number;
  state: StreamState;
  localWindowSize: number;
  remoteWindowSize: number;
  priority: StreamPriority;
}

interface StreamPriority {
  dependencyId: number; // Parent stream (0 = root)
  weight: number;       // 1-256, relative priority
  exclusive: boolean;   // Become sole child of dependency
}

// Stream ID rules:
// - Client-initiated streams: ODD numbers (1, 3, 5, 7...)
// - Server-initiated streams: EVEN numbers (2, 4, 6, 8...)
// - Stream 0 is CONNECTION-level (SETTINGS, PING, GOAWAY)

function isClientStream(streamId: number): boolean {
  return streamId % 2 === 1;
}

function isServerStream(streamId: number): boolean {
  return streamId !== 0 && streamId % 2 === 0;
}
```

HTTP/2 solves HTTP-level head-of-line blocking, but TCP still delivers packets in order.
If a packet is lost, TCP blocks all streams until retransmission. This is why HTTP/3 uses QUIC (UDP-based), which provides per-stream packet ordering. For now, HTTP/2 over TCP remains the standard for gRPC.
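A toy model (deliberately simplified, not a real TCP implementation) shows why one lost segment stalls every stream on the connection:

```typescript
// Illustrative sketch: frames from several HTTP/2 streams ride in
// ordered TCP segments. If one segment is lost, it and every segment
// after it wait in the receive buffer until the retransmit arrives,
// stalling ALL streams — not just the one whose data was lost.
interface Segment { seq: number; streamId: number; }

// Returns the stream IDs whose data is stuck at or behind the lost segment.
function stalledStreams(segments: Segment[], lostSeq: number): number[] {
  const stalled = new Set<number>();
  for (const seg of segments) {
    if (seg.seq >= lostSeq) stalled.add(seg.streamId);
  }
  return [...stalled].sort((a, b) => a - b);
}

const wire: Segment[] = [
  { seq: 1, streamId: 1 },
  { seq: 2, streamId: 3 }, // lost in transit
  { seq: 3, streamId: 5 },
  { seq: 4, streamId: 1 },
];

// Losing stream 3's segment also stalls streams 1 and 5:
console.log(stalledStreams(wire, 2)); // streams 1, 3, and 5 all stall
```

Under QUIC, only stream 3 would wait for the retransmit; streams 1 and 5 would keep delivering.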
HTTP headers are often repetitive—the same User-Agent, Accept, Authorization, and Cookie headers appear with every request. HPACK (RFC 7541) dramatically reduces header overhead through static and dynamic tables combined with Huffman coding.
HPACK Components:
- Static table: 61 predefined entries covering the most common headers (e.g., `:method: GET`, `accept-encoding: gzip`)
- Dynamic table: a per-connection table that learns header entries as they are sent
- Huffman coding: a fixed codebook giving common characters shorter bit codes in string literals
```typescript
// HPACK Static Table (RFC 7541 Appendix A)
// Pre-defined entries that never change

const STATIC_TABLE = [
  // Index 1-15: Pseudo-headers and common names
  { index: 1, name: ':authority', value: '' },
  { index: 2, name: ':method', value: 'GET' },
  { index: 3, name: ':method', value: 'POST' },
  { index: 4, name: ':path', value: '/' },
  { index: 5, name: ':path', value: '/index.html' },
  { index: 6, name: ':scheme', value: 'http' },
  { index: 7, name: ':scheme', value: 'https' },
  { index: 8, name: ':status', value: '200' },
  { index: 9, name: ':status', value: '204' },
  { index: 10, name: ':status', value: '206' },
  { index: 11, name: ':status', value: '304' },
  { index: 12, name: ':status', value: '400' },
  { index: 13, name: ':status', value: '404' },
  { index: 14, name: ':status', value: '500' },
  { index: 15, name: 'accept-charset', value: '' },
  // ... up to index 61
  { index: 61, name: 'www-authenticate', value: '' },
];

// HPACK Encoding Examples

// Request 1: First request in connection
// Headers: :method GET, :path /api/users, authorization: Bearer xxx
//
// :method GET               → Static index 2 = 1 byte: 0x82
// :path /api/users          → Name index 4, literal value = ~12 bytes
// authorization: Bearer xxx → Literal with indexing = ~25 bytes
// Total: ~38 bytes (vs ~80+ in HTTP/1.1)

// Request 2: Same endpoint, same auth
// Dynamic table now contains:
//   Index 62: :path /api/users
//   Index 63: authorization: Bearer xxx
//
// :method GET   → Static index 2 = 1 byte: 0x82
// :path         → Dynamic index 62 = 1 byte: 0xBE
// authorization → Dynamic index 63 = 1 byte: 0xBF
// Total: 3 bytes! (96% reduction)

// Huffman encoding for string literals
// HTTP header chars have specific frequency distributions
// Common chars get shorter codes:
//   'e' = 5 bits, 'a' = 5 bits, '/' = 6 bits
//   'X' = 10 bits, '{' = 15 bits
//
// "GET" in Huffman: 15 bits vs 24 bits (37% saving)
// "/api/users" in Huffman: ~40 bits vs 80 bits (50% saving)

interface DynamicTable {
  maxSize: number; // Configured via SETTINGS_HEADER_TABLE_SIZE
  entries: { name: string; value: string }[];
  currentSize: number; // Sum of (name.length + value.length + 32) per entry
}

function addToDynamicTable(
  table: DynamicTable,
  name: string,
  value: string
): void {
  const entrySize = name.length + value.length + 32; // Per RFC 7541

  // Evict oldest entries if needed
  while (
    table.currentSize + entrySize > table.maxSize &&
    table.entries.length > 0
  ) {
    const evicted = table.entries.pop()!;
    table.currentSize -= evicted.name.length + evicted.value.length + 32;
  }

  // Add new entry at beginning (most recent = lowest dynamic index)
  if (entrySize <= table.maxSize) {
    table.entries.unshift({ name, value });
    table.currentSize += entrySize;
  }
}
```

| Scenario | HTTP/1.1 Headers | HPACK First Request | HPACK Subsequent | Savings |
|---|---|---|---|---|
| Simple GET | ~500 bytes | ~100 bytes | ~3 bytes | 99.4% |
| API with Auth | ~800 bytes | ~150 bytes | ~8 bytes | 99.0% |
| Rich Cookies | ~2000 bytes | ~400 bytes | ~15 bytes | 99.2% |
| gRPC metadata | ~200 bytes | ~50 bytes | ~5 bytes | 97.5% |
HPACK was designed to resist compression oracle attacks (like CRIME/BREACH) that exploited GZIP compression in TLS. HPACK uses Huffman coding (fixed codebook) instead of adaptive compression, and the dynamic table is per-connection, not shared. Never compress secrets with attacker-controlled data on the same connection.
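One more HPACK detail worth seeing: every table index and string length on the wire uses the prefix-integer encoding from RFC 7541 Section 5.1. A minimal sketch:

```typescript
// HPACK integer representation (RFC 7541 Section 5.1): values that fit
// in an N-bit prefix take one byte; larger values fill the prefix with
// all ones, then spill into 7-bit continuation bytes.
function encodeInteger(value: number, prefixBits: number): number[] {
  const max = (1 << prefixBits) - 1;
  if (value < max) return [value]; // fits in the prefix

  const bytes = [max];
  let rest = value - max;
  while (rest >= 128) {
    bytes.push((rest % 128) + 128); // set continuation bit
    rest = Math.floor(rest / 128);
  }
  bytes.push(rest);
  return bytes;
}

// 10 with a 5-bit prefix encodes as a single byte [10];
// 1337 with a 5-bit prefix encodes as [31, 154, 10]
// (the worked example in RFC 7541 Appendix C.1).
console.log(encodeInteger(10, 5));
console.log(encodeInteger(1337, 5));
```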
With multiplexing, a fast sender could overwhelm a slow receiver with data from multiple streams simultaneously. HTTP/2 implements flow control to prevent this: a credit-based system where receivers must grant permission before senders can transmit.
Flow Control Design:
```typescript
// HTTP/2 Flow Control Implementation

interface FlowControl {
  connectionWindow: number;           // Connection-level credit
  streamWindows: Map<number, number>; // Per-stream credit
}

const DEFAULT_WINDOW_SIZE = 65535;  // 64 KB initial window (RFC 7540)
const MAX_WINDOW_SIZE = 2147483647; // 2^31 - 1 (~2 GB)

class HTTP2FlowControl {
  private connectionWindow = DEFAULT_WINDOW_SIZE;
  private streamWindows = new Map<number, number>();

  // Called when we receive a SETTINGS frame with initial window size
  handleSettingsWindowSize(newSize: number): void {
    const delta = newSize - DEFAULT_WINDOW_SIZE;
    for (const [streamId, window] of this.streamWindows) {
      const updated = window + delta;
      if (updated > MAX_WINDOW_SIZE || updated < 0) {
        // Protocol error: FLOW_CONTROL_ERROR
        this.sendGoaway('FLOW_CONTROL_ERROR');
        return;
      }
      this.streamWindows.set(streamId, updated);
    }
  }

  // Called when we receive a WINDOW_UPDATE frame
  handleWindowUpdate(streamId: number, increment: number): void {
    if (increment === 0) {
      // Protocol error: increment must be 1 to 2^31-1
      this.sendRstStream(streamId, 'PROTOCOL_ERROR');
      return;
    }

    if (streamId === 0) {
      // Connection-level update
      const newWindow = this.connectionWindow + increment;
      if (newWindow > MAX_WINDOW_SIZE) {
        this.sendGoaway('FLOW_CONTROL_ERROR');
        return;
      }
      this.connectionWindow = newWindow;
    } else {
      // Stream-level update
      const current = this.streamWindows.get(streamId) ?? DEFAULT_WINDOW_SIZE;
      const newWindow = current + increment;
      if (newWindow > MAX_WINDOW_SIZE) {
        this.sendRstStream(streamId, 'FLOW_CONTROL_ERROR');
        return;
      }
      this.streamWindows.set(streamId, newWindow);
    }

    // Check if any pending data can now be sent
    this.flushPendingData();
  }

  // Called before sending DATA
  canSendData(streamId: number, dataLength: number): boolean {
    const streamWindow =
      this.streamWindows.get(streamId) ?? DEFAULT_WINDOW_SIZE;
    // Must have credit at BOTH connection and stream level
    return dataLength <= Math.min(this.connectionWindow, streamWindow);
  }

  // Called after sending DATA
  consumeFlowCredit(streamId: number, dataLength: number): void {
    this.connectionWindow -= dataLength;
    const current = this.streamWindows.get(streamId) ?? DEFAULT_WINDOW_SIZE;
    this.streamWindows.set(streamId, current - dataLength);
  }

  private sendGoaway(error: string): void { /* ... */ }
  private sendRstStream(streamId: number, error: string): void { /* ... */ }
  private flushPendingData(): void { /* ... */ }
}

/*
Flow Control Example Timeline:

1. Connection established, both sides have 65535-byte windows
2. Client sends 10000 bytes on stream 1
   - Client's stream 1 window: 55535 remaining
   - Client's connection window: 55535 remaining
3. Client sends 30000 bytes on stream 3
   - Client's stream 3 window: 35535 remaining
   - Client's connection window: 25535 remaining
4. Server processes stream 1 data, sends WINDOW_UPDATE
   - Stream 1: +10000 → client can send more on stream 1
   - Connection: +10000 → 35535 total available
5. Client sends 20000 bytes on stream 1
   - Connection window: 15535 remaining
6. Client wants to send 20000 bytes on stream 3
   - Stream 3 window: 35535 (OK)
   - Connection window: 15535 (NOT ENOUGH)
   - Client must wait for WINDOW_UPDATE on the connection
*/
```

For gRPC streaming, tune window sizes based on your use case. Large windows (megabytes) for bulk data transfer minimize round-trips; smaller windows for many concurrent streams prevent any single stream from monopolizing bandwidth. gRPC libraries typically handle this automatically with BDP (Bandwidth-Delay Product) estimation.
HTTP/2 allows servers to proactively send resources before the client requests them. When a client requests an HTML page, the server can push the CSS and JavaScript that it knows will be needed, eliminating round-trip latency.
How Server Push Works:
1. Client requests `/page.html` (stream 1)
2. Server sends PUSH_PROMISE for `/style.css` (assigns stream 2)
3. Server sends PUSH_PROMISE for `/app.js` (assigns stream 4)
4. Server sends `/page.html` on stream 1
5. Server sends `/style.css` on stream 2 (promised)
6. Server sends `/app.js` on stream 4 (promised)
```typescript
// Server Push Frame Structure

// PUSH_PROMISE frame (Type 0x05)
// Sent on the stream of the original request
// Contains: promised stream ID + request headers for the pushed resource

interface PushPromiseFrame {
  type: 0x05;
  flags: number;
  streamId: number;         // Original request stream
  promisedStreamId: number; // New stream for pushed resource
  headerBlock: Uint8Array;  // HPACK-encoded request headers
}

// Example: Server pushes CSS when client requests HTML
//
// Client → Server:
//   HEADERS (stream 1): GET /page.html
//
// Server → Client:
//   PUSH_PROMISE (stream 1): [promises stream 2 for GET /style.css]
//   PUSH_PROMISE (stream 1): [promises stream 4 for GET /app.js]
//   HEADERS (stream 1): 200 OK, content-type: text/html
//   DATA (stream 1): <html>...</html>
//   HEADERS (stream 2): 200 OK, content-type: text/css
//   DATA (stream 2): body { ... }
//   HEADERS (stream 4): 200 OK, content-type: application/javascript
//   DATA (stream 4): function app() { ... }

// Client can reject pushed resources with RST_STREAM
// Reasons to reject:
// - Already in cache
// - Don't want the resource
// - Too many concurrent streams

// gRPC does NOT use server push
// Why?
// 1. gRPC is RPC, not web page loading
// 2. Client knows what it's requesting
// 3. Bidirectional streaming serves a similar purpose
// 4. Push adds complexity without clear RPC benefit
```

Despite its promise, server push has seen limited adoption. Issues include difficulty knowing what's already cached, increased complexity, potential bandwidth waste, and better alternatives (preload hints, HTTP 103 Early Hints). Most modern deployments don't use it, and gRPC explicitly ignores PUSH_PROMISE frames.
HTTP/2 allows clients to express which streams are more important through a priority system. This enables intelligent resource allocation: render-blocking CSS loads before analytics scripts, critical API responses before background sync.
Priority Model:
Each stream can declare:

- Dependency: the parent stream it depends on (stream 0 = root)
- Weight: a value from 1 to 256 expressing relative priority among siblings
- Exclusive flag: whether the stream becomes the sole child of its dependency
```typescript
// HTTP/2 Priority Tree Example

/*
Initial state (all streams depend on root with weight 16):

        root (0)
       /   |   \
    [16] [16] [16]
     S1   S3   S5

After S7 is added with exclusive dependency on S3:

        root (0)
       /   |   \
    [16] [16] [16]
     S1   S3   S5
           |
         [16]
          S7

After S9 is added depending on S3 with weight 64:

        root (0)
       /   |   \
    [16] [16] [16]
     S1   S3   S5
         /  \
      [16]  [64]
       S7    S9

Bandwidth allocation (simplified):
- Root level: S1, S3, S5 each get 1/3
- Under S3: S7 gets 16/80 of S3's share, S9 gets 64/80
- If S3 has 300 KB/s, S7 gets 60 KB/s, S9 gets 240 KB/s
*/

interface StreamPriority {
  streamId: number;
  dependencyId: number; // 0 = root
  weight: number;       // 1-256
  exclusive: boolean;
}

class PriorityTree {
  private tree = new Map<number, {
    parent: number;
    weight: number;
    children: number[];
  }>();

  constructor() {
    // Root node
    this.tree.set(0, { parent: -1, weight: 256, children: [] });
  }

  addStream(priority: StreamPriority): void {
    const { streamId, dependencyId, weight, exclusive } = priority;
    const parent = this.tree.get(dependencyId);
    if (!parent) {
      // Invalid dependency, use root
      this.addStream({ ...priority, dependencyId: 0 });
      return;
    }

    if (exclusive) {
      // Exclusive: all current children become children of the new stream
      const formerChildren = [...parent.children];
      parent.children = [streamId];
      this.tree.set(streamId, {
        parent: dependencyId,
        weight,
        children: formerChildren,
      });
      for (const child of formerChildren) {
        const childNode = this.tree.get(child)!;
        childNode.parent = streamId;
      }
    } else {
      // Non-exclusive: add as sibling
      parent.children.push(streamId);
      this.tree.set(streamId, {
        parent: dependencyId,
        weight,
        children: [],
      });
    }
  }

  removeStream(streamId: number): void {
    const node = this.tree.get(streamId);
    if (!node) return;

    // Reparent children to this node's parent
    const parent = this.tree.get(node.parent);
    if (parent) {
      parent.children = parent.children
        .filter(id => id !== streamId)
        .concat(node.children);
    }
    for (const childId of node.children) {
      const child = this.tree.get(childId)!;
      child.parent = node.parent;
    }
    this.tree.delete(streamId);
  }

  calculateBandwidthAllocation(
    available: number,
    streamId: number = 0
  ): Map<number, number> {
    const result = new Map<number, number>();
    const node = this.tree.get(streamId)!;

    if (node.children.length === 0) {
      if (streamId !== 0) {
        result.set(streamId, available);
      }
      return result;
    }

    // Distribute based on weights
    const totalWeight = node.children
      .map(id => this.tree.get(id)!.weight)
      .reduce((a, b) => a + b, 0);

    for (const childId of node.children) {
      const child = this.tree.get(childId)!;
      const share = (child.weight / totalWeight) * available;
      // Recursively allocate to the child's subtree
      const childAllocation = this.calculateBandwidthAllocation(share, childId);
      for (const [id, bw] of childAllocation) {
        result.set(id, bw);
      }
    }
    return result;
  }
}
```

HTTP/2 priority is complex and inconsistently implemented. RFC 9218 (2022) deprecated the original scheme and introduced Extensible Priorities, a simpler urgency/incremental model. For gRPC, priority is less critical since RPC calls are typically independent with similar urgency; gRPC relies on application-level prioritization instead.
gRPC was designed from the ground up to exploit HTTP/2's capabilities. Every gRPC feature maps directly to HTTP/2 primitives:
Mapping RPC to HTTP/2:
| gRPC Concept | HTTP/2 Implementation | Details |
|---|---|---|
| RPC Request | HEADERS + DATA frames | Stream with method path, metadata as headers |
| RPC Response | HEADERS + DATA frames | Response status, trailing metadata with gRPC status |
| Metadata | HTTP headers | Key-value pairs, binary values base64-encoded |
| Deadline/Timeout | grpc-timeout header | Propagated across services |
| Cancellation | RST_STREAM frame | Immediate stream termination |
| Unary RPC | Single HEADERS + DATA | Request stream, response stream |
| Server Streaming | Multiple DATA frames | One request, many response DATA |
| Client Streaming | Multiple DATA frames | Many request DATA, one response |
| Bidirectional | Interleaved DATA | Full-duplex on single stream |
| Multiplexing | Multiple streams | Concurrent RPCs on one connection |
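As one concrete slice of this mapping, here is a sketch of parsing the `grpc-timeout` header from the table above. Per the gRPC HTTP/2 protocol spec the value is 1-8 ASCII digits followed by a unit suffix; `parseGrpcTimeoutMs` is an illustrative name, not a real library function:

```typescript
// grpc-timeout units per the gRPC over HTTP/2 spec:
// H (hours), M (minutes), S (seconds), m (milli), u (micro), n (nano)
const TIMEOUT_UNITS_MS: Record<string, number> = {
  H: 3_600_000, M: 60_000, S: 1_000, m: 1, u: 0.001, n: 0.000001,
};

function parseGrpcTimeoutMs(header: string): number {
  const match = /^(\d{1,8})([HMSmun])$/.exec(header);
  if (!match) throw new Error(`Malformed grpc-timeout: ${header}`);
  return Number(match[1]) * TIMEOUT_UNITS_MS[match[2]];
}

console.log(parseGrpcTimeoutMs('10S'));  // 10000 (milliseconds)
console.log(parseGrpcTimeoutMs('500m')); // 500
```

Because the deadline travels as an ordinary header, each hop in a chain of services can re-encode the remaining time and propagate it downstream.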
```typescript
// gRPC over HTTP/2 Wire Format

// Example: Unary RPC call
// Service: UserService
// Method: GetUser(GetUserRequest) returns (GetUserResponse)

// Request Frame Sequence:
// 1. HEADERS frame on stream 1:
//      :method: POST
//      :path: /com.example.UserService/GetUser
//      :scheme: https
//      :authority: api.example.com
//      content-type: application/grpc+proto
//      te: trailers
//      grpc-timeout: 10S
//      grpc-encoding: gzip
//      authorization: Bearer <token>
//      custom-metadata-bin: <base64>  // Binary metadata
//
// 2. DATA frame on stream 1:
//      +-------------------------+
//      | Compressed Flag (1 byte)|  0 or 1
//      | Message Length (4 bytes)|  Big-endian
//      | Message Data (N bytes)  |  Protobuf-encoded request
//      +-------------------------+

// Response Frame Sequence:
// 1. HEADERS frame (response headers):
//      :status: 200
//      content-type: application/grpc+proto
//      grpc-encoding: gzip
//
// 2. DATA frame (response body):
//      Same format as request DATA
//
// 3. HEADERS frame (trailers - END_STREAM):
//      grpc-status: 0
//      grpc-message:
//      custom-trailer-bin: <base64>

// gRPC Message Framing
interface GrpcMessage {
  compressed: boolean; // 1 byte (0 or 1)
  length: number;      // 4 bytes big-endian
  data: Uint8Array;    // length bytes
}

function encodeGrpcMessage(data: Uint8Array, compress: boolean): Uint8Array {
  const result = new Uint8Array(5 + data.length);
  result[0] = compress ? 1 : 0;
  // Big-endian length
  result[1] = (data.length >> 24) & 0xff;
  result[2] = (data.length >> 16) & 0xff;
  result[3] = (data.length >> 8) & 0xff;
  result[4] = data.length & 0xff;
  result.set(data, 5);
  return result;
}

function decodeGrpcMessage(frame: Uint8Array): GrpcMessage {
  const compressed = frame[0] === 1;
  const length =
    (frame[1] << 24) | (frame[2] << 16) | (frame[3] << 8) | frame[4];
  const data = frame.slice(5, 5 + length);
  return { compressed, length, data };
}

// gRPC Status Codes
const GRPC_STATUS = {
  OK: 0,
  CANCELLED: 1,
  UNKNOWN: 2,
  INVALID_ARGUMENT: 3,
  DEADLINE_EXCEEDED: 4,
  NOT_FOUND: 5,
  ALREADY_EXISTS: 6,
  PERMISSION_DENIED: 7,
  RESOURCE_EXHAUSTED: 8,
  FAILED_PRECONDITION: 9,
  ABORTED: 10,
  OUT_OF_RANGE: 11,
  UNIMPLEMENTED: 12,
  INTERNAL: 13,
  UNAVAILABLE: 14,
  DATA_LOSS: 15,
  UNAUTHENTICATED: 16,
};
```

gRPC cannot work over HTTP/1.1 because: (1) streaming requires bidirectional data flow on a single connection, (2) trailers (grpc-status) are not reliably supported in HTTP/1.1, (3) multiplexing concurrent RPCs would require connection pools, and (4) text overhead would negate Protobuf's efficiency. HTTP/2 is not optional; it's fundamental.
HTTP/2 provides the transport capabilities that make gRPC possible. Understanding this foundation is essential for optimizing and troubleshooting gRPC services.
What's Next:
With Protocol Buffers for serialization and HTTP/2 for transport, we're ready to explore gRPC's most distinctive feature: streaming capabilities. The next page covers unary, server streaming, client streaming, and bidirectional streaming patterns—the communication modes that enable real-time, high-throughput distributed systems.
You now understand HTTP/2's architecture from frames to flow control. You can explain why HTTP/2 is essential for gRPC's streaming, multiplexing, and efficiency. This knowledge helps you tune connections, debug transport issues, and design for scale.