Traditional HTTP follows a strict request-response model: clients ask, servers answer. This seemingly obvious pattern hides a fundamental inefficiency—servers know what clients will need next, but must wait for explicit requests.
When you request index.html, the server knows you'll immediately need style.css and app.js. Yet in HTTP/1.1, the server sends HTML and waits. The browser parses HTML, discovers resources, and makes separate requests. Each round trip adds latency.
HTTP/2 Server Push breaks this pattern entirely. Servers can proactively send resources based on their knowledge of application structure, eliminating round-trip delays for critical assets. When implemented well, server push can shave hundreds of milliseconds from page loads.
By the end of this page, you will understand: (1) The push promise mechanism and PUSH_PROMISE frames, (2) How pushed resources are associated with requests, (3) Cache interaction and validation challenges, (4) Browser handling of pushed resources, (5) When server push helps and when it hurts performance, (6) Why server push has seen limited adoption and what alternatives exist.
To appreciate server push, we must understand the latency cost of sequential resource discovery.
Traditional Resource Loading:
Time 0ms: Browser → Server: GET /index.html
Time 50ms: Server → Browser: index.html (contains <link href='style.css'>)
Time 50ms: Browser parses HTML, discovers style.css
Time 50ms: Browser → Server: GET /style.css
Time 100ms: Server → Browser: style.css (contains url('font.woff'))
Time 100ms: Browser parses CSS, discovers font.woff
Time 100ms: Browser → Server: GET /font.woff
Time 150ms: Server → Browser: font.woff
Time 150ms: First meaningful render possible
Three sequential round trips! On a 100ms RTT connection, that's 300ms of pure waiting—not data transfer, just latency.
| Chain | Resources | RTT Cost | Impact |
|---|---|---|---|
| HTML → CSS | HTML references stylesheets | 1 RTT | Blocks rendering |
| CSS → Fonts | CSS @font-face references fonts | 1 RTT | Blocks text rendering (FOIT) |
| CSS → Images | CSS background-image references | 1 RTT | Delays visual completion |
| HTML → JS → API | App.js initializes and calls API | 2 RTT | Delays interactivity |
| JS → Lazy modules | Dynamic imports load code-split chunks | 1+ RTT | Delays feature availability |
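The arithmetic behind these chains can be sketched directly: total discovery latency is driven by the depth of the longest dependency chain, not by the total number of resources. A toy calculation, with the RTT value assumed:

```javascript
// A chain of depth N costs N sequential round trips: each resource is
// discovered only after its parent has been fetched and parsed.
const RTT_MS = 100; // assumed round-trip time

const chain = ['index.html', 'style.css', 'font.woff'];
const latencyMs = chain.length * RTT_MS;

console.log(`${chain.length} sequential round trips = ${latencyMs}ms of pure waiting`);
```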
The Server's Knowledge Advantage:
The server generating index.html knows what resources it references—the template includes specific CSS and JS files. Yet it sends only what was requested, waiting for the browser to parse and request dependencies.
Traditional workarounds include inlining critical CSS and JS into the HTML (which bloats every response and defeats caching) and `<link rel='preload'>` hints (which still require the browser to receive and parse the HTML first). Server push offers a protocol-level solution: start sending resources before the client knows it needs them.
On fast networks, data transfer is nearly instantaneous—latency is the bottleneck. Downloading a 10KB CSS file takes milliseconds, but discovering-then-requesting it takes a full RTT (50-150ms typically). Eliminating that discovery step is where server push provides value.
Server push uses the PUSH_PROMISE frame to initiate proactive resource delivery. This frame announces what the server intends to push and reserves a stream for the pushed response.
PUSH_PROMISE Frame Structure:
+---------------+
|Pad Length? (8)|
+-+-------------+-----------------------------------------------+
|R| Promised Stream ID (31) |
+-+-----------------------------+-------------------------------+
| Header Block Fragment (*) |
+---------------------------------------------------------------+
| Padding (*) |
+---------------------------------------------------------------+
Key fields:
- Promised Stream ID (31 bits): the even-numbered stream the server reserves for the pushed response
- Header Block Fragment: HPACK-compressed request headers describing the request the push answers (:method, :path, :authority, :scheme)
- Pad Length / Padding: optional padding, as in HEADERS frames
- R: a single reserved bit
The Push Sequence:
Client requests resource — Standard HEADERS frame on odd stream ID (e.g., Stream 1)
Server sends PUSH_PROMISE — On the original stream (1), server announces it will push a resource. The PUSH_PROMISE contains:
- The promised even-numbered stream ID (e.g., Stream 2)
- Complete request headers for the pushed resource, as if the client had requested it itself
Client reserves stream — Upon receiving PUSH_PROMISE, client reserves Stream 2 for the promised response. The stream enters 'reserved (remote)' state.
Server sends pushed response — On the promised stream (2), server sends HEADERS and DATA frames as normal.
Push completes — Server sends DATA with END_STREAM on Stream 2. Client caches the response.
Original response continues — Stream 1 proceeds normally, interleaved with push.
A critical rule: the PUSH_PROMISE for a resource must be sent BEFORE the response DATA that references it. The server must promise the CSS before the HTML that links to it goes out; otherwise the browser might parse the HTML and request the CSS before the PUSH_PROMISE arrives, wasting the push.
Pushed streams follow a specific lifecycle that differs from regular client-initiated streams.
Push Stream State Machine:
+--------+
recv PP | | send PP
,--------->+ idle +<---------.
| | | |
| +---+----+ |
v | v
+----------+ | +----------+
| | | | |
| reserved | | | reserved |
| (remote) | | | (local) |
| | | | |
+----+-----+ | +----+-----+
| | |
| recv H | send H |
| | |
v v v
+----------+ +--------+ NOT USED
| half | | | for push
| closed | | open |
| (local) | | |
| | +--------+
+----+-----+
|
| recv DATA + ES
|
v
+--------+
| |
| closed |
| |
+--------+
The client's perspective:
- Receiving PUSH_PROMISE moves the promised stream from idle to reserved (remote)
- Receiving the pushed response's HEADERS moves it to half-closed (local)
- Receiving DATA with END_STREAM closes the stream
Why Half-Closed (Local)?
A pushed stream is half-closed (local) from the client's perspective because the client will never send request headers or a body on it: the server synthesized the entire request in the PUSH_PROMISE, so the client's sending direction is closed from the start.
However, the client CAN send:
- RST_STREAM, to cancel the push
- WINDOW_UPDATE, to manage flow control for the pushed data
- PRIORITY, to reprioritize the pushed stream
Canceling Pushes:
Clients can reject pushes by sending RST_STREAM with error code CANCEL or REFUSED_STREAM:
Client receives: PUSH_PROMISE (Stream 2, /style.css)
Client has cached: /style.css (fresh)
Client sends: RST_STREAM (Stream 2, CANCEL)
Server response: Stops sending Stream 2 data
This is crucial—without cancellation, servers might push resources clients already have, wasting bandwidth.
Server-initiated streams (for push) use even stream IDs (2, 4, 6, ...), while client-initiated streams use odd IDs (1, 3, 5, ...). This prevents ID conflicts. The server allocates push stream IDs from its pool, typically sequentially.
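The parity rule makes stream origin trivial to determine when reading frame logs; a hypothetical helper:

```javascript
// Odd stream IDs are client-initiated; even IDs are server pushes.
// Stream 0 is reserved for connection-level frames (SETTINGS, PING, ...).
function streamInitiator(id) {
  if (id === 0) return 'connection';
  return id % 2 === 1 ? 'client' : 'server-push';
}

console.log(streamInitiator(1)); // 'client'
console.log(streamInitiator(2)); // 'server-push'
```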
Server push's interaction with browser caching is one of its most complex aspects—and a primary reason for its limited adoption.
The Fundamental Challenge:
When a server pushes /style.css, the browser may already have it cached. Options:
1. Accept the push and overwrite the cached copy (wastes bandwidth re-downloading what it already has)
2. Cancel the push with RST_STREAM and use the cached copy
3. Accept the push fully, then discard it (all of the bandwidth cost, none of the benefit)
Browsers typically implement option 2, but the race condition is problematic:
T0: Server sends PUSH_PROMISE for /style.css
T1: Server begins sending CSS data
T2: Browser receives PUSH_PROMISE, checks cache
T3: Browser finds cached CSS, sends RST_STREAM
T4: Server receives RST_STREAM, stops sending
Problem: Bytes sent between T1 and T4 are wasted
| Cache State | Push Behavior | Outcome | Efficiency |
|---|---|---|---|
| No cache entry | Accept push | Resource cached | Optimal ✓ |
| Fresh cache entry | Cancel push | Use existing cache | Wasted bytes ✗ |
| Stale cache entry | Accept push or revalidate | Complex decision | Depends ⚠ |
| Vary mismatch | May accept or reject | Difficult to validate | Problematic ✗ |
Cache Digest Proposals:
Several proposals attempted to solve the cache coordination problem:
1. Cache Digests (abandoned): Client sends a Bloom filter of cached resources to the server. Server consults digest before pushing. Never standardized due to privacy concerns (reveals browsing history) and implementation complexity.
2. Accept-Push-Policy (proposed): Header signaling client's push preferences. Never gained traction.
3. No automatic solution (current state): Servers push blindly; clients cancel as fast as possible. Wasteful but functional.
The lack of cache coordination is a major reason server push hasn't achieved its potential. Servers can't know what clients have cached, so they either over-push (wasting bandwidth) or under-push (missing optimization opportunities).
Aggressive pushing on repeat visits is particularly problematic. A returning user with warm caches sees zero benefit from push—every pushed resource is canceled. On bandwidth-constrained connections, this overhead can actually hurt performance compared to not pushing at all.
Different browsers handle server push differently, and these differences affect practical deployment.
Push Cache vs HTTP Cache:
Browsers typically maintain a separate "push cache" distinct from the standard HTTP cache:
Push arrives → Stored in push cache (per-connection)
↓
Navigation requests resource → Check push cache first
↓
If found: Move to HTTP cache, use response
If not: Normal HTTP cache check, then network request
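The lookup order above can be modeled in a few lines; the cache objects and promotion behavior here are a toy illustration, not any browser's actual implementation:

```javascript
// Toy model of pushed-resource lookup: push cache first, then HTTP
// cache, then network. Push cache entries are single-use and are
// promoted into the HTTP cache when claimed.
function lookup(url, pushCache, httpCache) {
  if (pushCache.has(url)) {
    const response = pushCache.get(url);
    pushCache.delete(url);        // claimed entries leave the push cache
    httpCache.set(url, response); // ...and land in the HTTP cache
    return { source: 'push-cache', response };
  }
  if (httpCache.has(url)) {
    return { source: 'http-cache', response: httpCache.get(url) };
  }
  return { source: 'network', response: null }; // would issue a request
}

const pushCache = new Map([['/style.css', 'css-bytes']]);
const httpCache = new Map();
console.log(lookup('/style.css', pushCache, httpCache).source); // first hit claims the push
console.log(lookup('/style.css', pushCache, httpCache).source); // second hit uses the HTTP cache
```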
The push cache is usually:
- Tied to the HTTP/2 connection: if the connection closes before a pushed resource is claimed, the resource is lost
- Single-use: once a request matches a pushed entry, the entry is consumed
- Short-lived: unclaimed entries are evicted after a brief period
Pushed resources must be used quickly. If the HTML takes too long to parse and reference the pushed CSS, browsers may evict the push from the temporary push cache. Servers should push only resources that will be requested within seconds of the push arriving.
Server push provides genuine benefits in specific scenarios. Understanding these scenarios helps identify where push deployment is worthwhile.
Optimal Push Scenarios:
First-Time Visitors: Empty cache means pushed resources are always useful. No wasted bandwidth on cache collisions.
Critical Path Resources: CSS that blocks rendering, Web Fonts that cause FOIT, JS required for interactivity. Eliminating one RTT for these high-impact resources matters.
Dynamic Resource Discovery: Resources discovered only after executing JavaScript (not in static HTML). Push delivers them before the JS even runs.
Known Application Structure: SPA frameworks where the server knows exactly what resources the shell needs. Every push is certain to be used.
| Scenario | Typical Benefit | RTT Saved | Recommendation |
|---|---|---|---|
| First visit, critical CSS | 100-200ms improvement | 1 RTT | Push (high value) |
| First visit, web fonts | 100-300ms reduction in FOIT | 1-2 RTT | Push (high value) |
| First visit, hero image | 50-150ms improvement | 1 RTT | Maybe push (large size) |
| Repeat visit, cached resources | 0ms (negative with overhead) | 0 | Don't push |
| SPA navigation | Varies by architecture | 1+ RTT | Consider alternatives |
Quantifying the Benefit:
Consider a page with critical CSS (15KB):
Without push (100ms RTT):
T0: Request HTML
T100: Receive HTML, start parsing
T100: Request CSS (discovered in HTML)
T200: Receive CSS, start rendering
Total: 200ms to first render
With push:
T0: Request HTML
T0-50: Push CSS arrives (started immediately)
T100: Receive HTML, CSS already in push cache
T100: Start rendering immediately
Total: 100ms to first render
Savings: 100ms (1 RTT) — Significant for perceived performance.
Push works best when: (1) Resources are critical for rendering, (2) Visitors have cold caches, (3) Resources are small-to-medium size (not large images/videos), (4) The server confidently knows what to push. Meeting all four criteria is the sweet spot.
Server push can actually degrade performance in several scenarios. Understanding anti-patterns prevents well-intentioned push implementations from causing harm.
Push Anti-Patterns:
- Pushing everything: saturating the connection with resources the page may never use
- Pushing to repeat visitors whose caches already contain the resources
- Pushing large assets (hero images, video) that monopolize bandwidth
- Pushing ahead of the explicitly requested response without correct prioritization
The Priority Inversion Problem:
One of push's most serious issues: pushed resources compete for bandwidth with the main response.
Scenario: Server pushes 50KB CSS while sending 30KB HTML
Bandwidth allocation might be:
[HTML chunk][CSS chunk][HTML chunk][CSS chunk]...
Result:
- HTML takes longer to complete
- Browser can't parse HTML until more arrives
- Net effect: SLOWER first render despite push
The HTML (which the client explicitly requested and needs to discover other resources) is delayed by the pushed CSS. Proper priority handling helps, but many servers don't implement it correctly.
On bandwidth-constrained connections (mobile networks, slow WiFi), push can be actively harmful. Every byte sent as push is a byte not used for the primary response. If the pushed resource isn't needed immediately or is already cached, that bandwidth is wasted at the expense of what the user actually requested.
Real-World Disappointment:
Many high-profile sites experimented with push and abandoned it. Chrome's telemetry found push used on only a small fraction of connections, with little measurable performance benefit, and Chrome removed HTTP/2 server push support entirely in 2022 (version 106); other browsers and CDNs have since deprecated or dropped the feature as well.
The consensus: Push's complexity exceeds its benefits in most cases. The cache coordination problem is essentially unsolvable at the protocol level, making push a footgun unless carefully tuned for specific first-visit scenarios.
For those who do implement server push, certain patterns maximize benefits while minimizing harm.
Pattern 1: Cookie-Based Push Decision
Use a cookie to track whether this is a first visit:
// Server-side logic
if (!request.cookies.has('visited')) {
// First visit: push critical resources
response.push('/critical.css');
response.push('/app.js');
response.setCookie('visited', 'true', { maxAge: 86400 });
} else {
// Repeat visit: don't push (resources likely cached)
// Client will request if needed
}
This simple heuristic avoids the most wasteful scenario (pushing to cached clients) while still benefiting first-time visitors.
Pattern 2: Conditional Push with Cache-Control
Align push decisions with cache lifetimes:
# Nginx configuration example
location / {
http2_push /static/critical.css;
http2_push /static/framework.js;
# Only push resources with short cache lifetimes
# Long-cache resources are likely cached
}
location /static/ {
# Resources served with long cache headers
add_header Cache-Control "public, max-age=31536000";
}
Pattern 3: Push on Specific Entry Points
Push only on pages that begin user sessions:
// Node.js HTTP/2 server push example
// (uses the core http2 module directly; Express does not expose a push API)
const http2 = require('http2');
const fs = require('fs');

const server = http2.createSecureServer({
  key: fs.readFileSync('server.key'),
  cert: fs.readFileSync('server.crt'),
});

server.on('stream', (stream, headers) => {
  if (headers[':path'] !== '/') {
    stream.respond({ ':status': 404 });
    stream.end();
    return;
  }

  // First-visit heuristic: no 'visited' cookie yet
  const isFirstVisit = !/(^|;\s*)visited=/.test(headers.cookie || '');

  // Only push on first visit to the main page
  if (isFirstVisit && stream.pushAllowed) {
    // Push critical CSS (the PUSH_PROMISE precedes the HTML data below)
    stream.pushStream({ ':path': '/css/critical.css' }, (err, pushStream) => {
      if (err) return;
      pushStream.respond({ ':status': 200, 'content-type': 'text/css' });
      pushStream.end(fs.readFileSync('./public/css/critical.css'));
    });

    // Push framework JS
    stream.pushStream({ ':path': '/js/app.js' }, (err, pushStream) => {
      if (err) return;
      pushStream.respond({ ':status': 200, 'content-type': 'application/javascript' });
      pushStream.end(fs.readFileSync('./public/js/app.js'));
    });
  }

  stream.respond({
    ':status': 200,
    'content-type': 'text/html',
    'set-cookie': 'visited=true; Max-Age=86400',
  });
  stream.end(fs.readFileSync('./public/index.html'));
});

server.listen(443);
Some CDNs support push via Link headers. Add Link: </style.css>; rel=preload; nopush to hint resources without pushing, or omit 'nopush' to push. This delegates push decisions to the CDN edge, which may have better cache visibility. However, CDN support varies significantly.
Given push's complexity and limited benefits, several alternatives often provide better results with less effort.
1. Resource Hints (Preload, Preconnect, Prefetch)
<!-- Preload: Fetch this resource for current navigation -->
<link rel="preload" href="/critical.css" as="style">
<!-- Preconnect: Establish connection to origin early -->
<link rel="preconnect" href="https://fonts.googleapis.com">
<!-- Prefetch: Fetch for future navigation (low priority) -->
<link rel="prefetch" href="/next-page.js">
Preload triggers immediately after HTML headers arrive, before full HTML parsing. The RTT cost remains, but the request starts earlier.
| Feature | Server Push | Preload | 103 Early Hints | Inline |
|---|---|---|---|---|
| RTT Saved | 1 RTT (ideally) | 0.5 RTT (starts earlier) | 1 RTT | 1 RTT |
| Cache Interaction | Complex, wasteful | Normal caching | Normal caching | No caching |
| Browser Support | All H2 browsers | Excellent | Growing | All browsers |
| Server Complexity | High | Low (HTML change) | Medium | Low |
| Bandwidth Efficiency | Poor (repeat visits) | Good | Good | Poor (no caching) |
2. HTTP 103 Early Hints
A newer approach that avoids push's problems:
HTTP/1.1 103 Early Hints
Link: </style.css>; rel=preload; as=style
Link: </app.js>; rel=preload; as=script
HTTP/1.1 200 OK
Content-Type: text/html
...
The server sends 103 Early Hints while it is still preparing the final response, and the browser can start fetching the hinted resources immediately. Benefits:
- The browser decides whether to fetch, so normal HTTP caching applies; nothing is re-sent to clients with fresh cached copies
- No new cache or stream semantics: the hints are ordinary Link headers
- Roughly the same RTT saving as push whenever the server's response takes time to generate
3. Service Worker Caching
Service workers can precache resources during installation:
self.addEventListener('install', (event) => {
event.waitUntil(
caches.open('v1').then((cache) => {
return cache.addAll(['/style.css', '/app.js', '/fonts/...']);
})
);
});
After first visit, all resources are local—zero network latency.
For most sites: Use preload for critical resources, 103 Early Hints where supported, and Service Workers for repeat visit performance. Server push is typically not worth the complexity. Reserve push for highly controlled environments (internal applications, CDN edge personalization) where cache state is known.
Server push embodies an elegant idea—let servers proactively send resources—but practical challenges have limited its adoption. Understanding push completes your HTTP/2 knowledge while illustrating why simpler alternatives often prevail.
What's Next:
With push explored, we complete our HTTP/2 module with Stream Prioritization. Priority allows clients to signal which resources matter most, enabling servers to schedule frame transmission intelligently. While complex and variably implemented, understanding prioritization is essential for optimizing HTTP/2 performance.
You now understand HTTP/2 server push—its mechanics, cache challenges, and why it hasn't revolutionized web performance as hoped. This knowledge helps you evaluate push for your own applications and understand why the industry has moved toward simpler alternatives for proactive resource delivery.