In 2006, The New York Times faced a challenge that would reshape how we think about web communication. They wanted their website to display election results in real-time as votes were counted across the country. The problem? The web was fundamentally built around a request-response model where clients ask and servers answer. There was no native mechanism for servers to push updates to browsers when new data arrived.
The solution they implemented—long polling—was elegantly simple yet profoundly effective. Instead of the browser constantly asking "any new results?" every few seconds, the server would hold the connection open and only respond when there was actually something to say. This simple inversion of timing transformed short polling's wasteful chatter into an efficient, near-real-time communication channel.
Long polling became the foundation for the real-time web we know today. Gmail used it for instant email notifications. Facebook deployed it for chat. Twitter leveraged it for live timeline updates. Even as WebSockets and Server-Sent Events have emerged as "native" push technologies, long polling remains relevant—sometimes preferred—in specific architectural contexts.
By the end of this page, you will understand why long polling exists, how it inverts the timing of traditional polling to simulate push behavior, and the fundamental tradeoffs this approach involves. You'll develop intuition for when long polling is the right tool versus a legacy compromise.
To understand why long polling was invented, we must first understand what problem it solved. Let's examine traditional short polling—the naive approach to checking for updates.
The Short Polling Pattern:
In short polling, the client repeatedly sends requests to the server at fixed intervals, asking "do you have new data for me?" The server immediately responds—either with new data or an empty response—and the connection closes.
```typescript
// Classic short polling - the naive approach
function startShortPolling(interval: number = 5000) {
  setInterval(async () => {
    try {
      const response = await fetch('/api/notifications');
      const data = await response.json();

      if (data.notifications.length > 0) {
        // Process new notifications
        displayNotifications(data.notifications);
      }
      // If no notifications, we just wasted a request
    } catch (error) {
      console.error('Polling failed:', error);
    }
  }, interval);
}

// Server side: immediate response
app.get('/api/notifications', async (req, res) => {
  const userId = req.user.id;
  const notifications = await getUnreadNotifications(userId);

  // Always respond immediately
  res.json({ notifications });
});
```

Why Short Polling Falls Apart at Scale:
This approach works for small-scale applications, but its inefficiencies compound catastrophically as user counts grow:
| Users | Poll Interval | Requests/Minute | Requests/Hour | Daily Request Load |
|---|---|---|---|---|
| 1,000 | 5 seconds | 12,000 | 720,000 | 17.3 million |
| 10,000 | 5 seconds | 120,000 | 7.2 million | 173 million |
| 100,000 | 5 seconds | 1.2 million | 72 million | 1.7 billion |
| 1,000,000 | 5 seconds | 12 million | 720 million | 17.3 billion |
In a typical notification system, users might receive updates only a few times per day. With 5-second polling, 99.99% of requests return empty responses. You're maintaining an army of servers just to repeatedly say "nothing new." This is the problem long polling was designed to solve.
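The table's numbers follow directly from users × seconds-per-day ÷ poll interval. A quick sketch makes the arithmetic concrete (the function name is just for illustration):

```typescript
// Back-of-envelope request volume for short polling.
// Each user issues one request every `pollIntervalSeconds`.
function shortPollLoad(users: number, pollIntervalSeconds: number) {
  return {
    requestsPerMinute: (users * 60) / pollIntervalSeconds,
    requestsPerHour: (users * 3_600) / pollIntervalSeconds,
    requestsPerDay: (users * 86_400) / pollIntervalSeconds,
  };
}

// 100,000 users polling every 5 seconds:
// => 1.2M requests/minute, 72M/hour, ~1.73B/day (matching the table above)
console.log(shortPollLoad(100_000, 5));
```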
The Latency-Efficiency Tradeoff:
Short polling forces an impossible tradeoff. If you poll frequently (every 1 second), you achieve low latency but waste enormous resources. If you poll infrequently (every 60 seconds), you conserve resources but users experience unacceptable delays.
| Poll Interval | Average Update Latency | Server Load | User Experience |
|---|---|---|---|
| 1 second | 0.5 seconds | Extreme | Excellent |
| 5 seconds | 2.5 seconds | High | Good |
| 30 seconds | 15 seconds | Moderate | Acceptable |
| 60 seconds | 30 seconds | Low | Poor |
There's no winning with short polling. The fundamental architecture is flawed because it doesn't align with how data actually flows. Updates arrive unpredictably, but polling happens predictably. These rhythms don't match.
Long polling's genius lies in a simple inversion: instead of responding immediately to every request, the server holds the connection open until it has something meaningful to say.
This transforms the communication pattern from:
Client: "Any updates?"
Server: "No."
[5 seconds later]
Client: "Any updates?"
Server: "No."
[5 seconds later]
Client: "Any updates?"
Server: "Yes! Here's the data."
To:
Client: "Let me know when you have updates."
[Server holds connection open...]
[Server holds connection open...]
[Server holds connection open...]
Server: "Here's the data."
Client: "Great! Let me know when you have more."
[Connection cycles...]
Long polling doesn't reduce the number of connections; it changes WHEN responses occur. By aligning response timing with data availability rather than arbitrary intervals, nearly every response carries useful information, with only the occasional empty timeout response.
```typescript
// Small helper used on both sides below
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Client-side long polling
async function startLongPolling() {
  while (true) {
    try {
      // This request may take 30+ seconds to complete
      const response = await fetch('/api/notifications/long-poll', {
        signal: AbortSignal.timeout(35000) // Slightly longer than server timeout
      });

      if (response.status === 200) {
        const data = await response.json();
        displayNotifications(data.notifications);
      }
      // Immediately reconnect for next update
    } catch (error) {
      if (error.name === 'TimeoutError') {
        // Normal timeout - just reconnect
        continue;
      }
      // Actual error - back off before retry
      await sleep(5000);
    }
  }
}

// Server-side long polling handler
app.get('/api/notifications/long-poll', async (req, res) => {
  const userId = req.user.id;
  const pollTimeout = 30000; // 30 seconds
  const startTime = Date.now();

  // Poll until we have data or timeout
  while (Date.now() - startTime < pollTimeout) {
    const notifications = await getUnreadNotifications(userId);

    if (notifications.length > 0) {
      return res.json({ notifications });
    }

    // Sleep briefly before checking again
    await sleep(500);
  }

  // Timeout - return empty response, client will reconnect
  res.status(204).end();
});
```

Understanding the Request Flow:
Let's trace a complete long polling cycle:
1. Client initiates request: The browser sends an HTTP request to /api/notifications/long-poll.
2. Server enters waiting state: Instead of querying and responding immediately, the server holds the request handler open, periodically checking for new data.
3. Connection remains open: The HTTP connection stays established but idle; no data flows in either direction.
4. Event occurs: A notification is created for this user elsewhere in the system.
5. Server responds: The polling loop detects the new data and sends the response.
6. Client reconnects: Upon receiving the response, the client immediately initiates a new long poll request.
7. Cycle repeats: The pattern continues indefinitely.
From the user's perspective, long polling is indistinguishable from true server push. When an event occurs, the user sees it almost immediately (within the polling loop's check interval). That the client technically initiated the request is an implementation detail hidden behind the timing inversion.
The Illusion of Push:
The key realization: True push and simulated push converge in terms of user experience. The difference lies in implementation complexity, resource consumption, and edge case handling—not in the fundamental capability to deliver timely updates.
Latency Analysis:
In a well-implemented long polling system, the perceived latency has two components: the detection delay (how quickly the waiting server handler learns of the event, near zero when event-driven, up to one check interval for a polling loop) and the delivery delay (the network time for the response to reach the client).
Total perceived latency: roughly 120-600ms in typical conditions
This is fast enough for most real-time use cases. Human perception of "instant" is roughly anything under 100-300ms, depending on the interaction. Long polling comfortably achieves this threshold while maintaining HTTP compatibility.
Research on human-computer interaction consistently shows that delays under about 200ms feel instantaneous to users. Long polling's typical 100-400ms latency falls within or near this threshold, which explains why it delivers a satisfying real-time experience despite technically being poll-based.
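As a rough sanity check on those figures, here is an illustrative back-of-envelope model. The check interval comes from the earlier handler; the network figure is an assumption, not a measurement:

```typescript
// Illustrative latency model: perceived latency = detection delay + delivery time.
const checkIntervalMs = 500;                 // from the polling-loop handler above
const avgDetectionMs = checkIntervalMs / 2;  // events arrive uniformly -> ~250 ms average
const eventDrivenDetectionMs = 5;            // near zero for event-driven handlers
const networkMs = 80;                        // assumed delivery time to the client

console.log(`polling-loop handler:  ~${avgDetectionMs + networkMs} ms`);          // ~330 ms
console.log(`event-driven handler:  ~${eventDrivenDetectionMs + networkMs} ms`);  // ~85 ms
```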
The naive polling loop shown earlier is inefficient—it repeatedly queries the database even when nothing has changed. Production systems use an event-driven approach where the server efficiently waits for notifications without constant database polling.
```typescript
import { EventEmitter } from 'events';

// Shared event bus (in production: Redis pub/sub or similar)
const notificationBus = new EventEmitter();

// When a notification is created anywhere in the system
async function createNotification(userId: string, notification: Notification) {
  // Save to database
  await db.notifications.create({ data: { userId, ...notification } });

  // Broadcast to any waiting long poll handlers
  notificationBus.emit(`user:${userId}`, notification);
}

// Event-driven long poll handler - no polling loop needed
app.get('/api/notifications/long-poll', async (req, res) => {
  const userId = req.user.id;
  const timeout = 30000;

  // Check for existing notifications first
  const existing = await getUnreadNotifications(userId);
  if (existing.length > 0) {
    return res.json({ notifications: existing });
  }

  // No existing notifications - wait for new ones
  const cleanup = () => {
    notificationBus.off(`user:${userId}`, handler);
  };

  const timer = setTimeout(() => {
    cleanup();
    res.status(204).end();
  }, timeout);

  const handler = (notification: Notification) => {
    clearTimeout(timer);
    cleanup();
    res.json({ notifications: [notification] });
  };

  // Listen for this user's notifications
  notificationBus.once(`user:${userId}`, handler);

  // Handle client disconnect
  req.on('close', () => {
    clearTimeout(timer);
    cleanup();
  });
});
```

Why Event-Driven is Essential:
The event-driven approach transforms long polling's resource profile:
| Approach | CPU Usage While Waiting | Database Load | Memory per Connection |
|---|---|---|---|
| Polling Loop | Continuous (checking) | High (repeated queries) | Low |
| Event-Driven | Zero (blocked on event) | Zero (no queries) | Minimal (event listener) |
With event-driven long polling, 10,000 waiting connections consume essentially zero server resources until an event occurs. This is the architecture that enabled Gmail and Facebook to scale to millions of concurrent users.
In distributed systems, the event bus must span multiple server instances. Redis Pub/Sub is the common choice: when a notification is created on Server A, Redis broadcasts to Server B where the user's long poll connection is waiting. This horizontal scalability is key to long polling at scale.
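A minimal sketch of that idea using the node-redis v4 API; the `user:<id>` channel naming and the helper names are assumptions, not a prescribed design:

```typescript
import { createClient } from 'redis';

// A Redis client in subscriber mode can't run other commands,
// so we keep separate connections for publishing and subscribing.
const publisher = createClient();
const subscriber = publisher.duplicate();
await publisher.connect();
await subscriber.connect();

// Server A: broadcast when a notification is created
async function broadcastNotification(userId: string, notification: unknown) {
  await publisher.publish(`user:${userId}`, JSON.stringify(notification));
}

// Server B: a waiting long poll handler subscribes for its user
function waitForNotification(userId: string, timeoutMs: number): Promise<unknown | null> {
  const channel = `user:${userId}`;
  return new Promise((resolve) => {
    // Timeout path: unsubscribe and let the caller respond 204
    const timer = setTimeout(async () => {
      await subscriber.unsubscribe(channel);
      resolve(null);
    }, timeoutMs);

    // Data path: first message wins, then tear down the subscription
    subscriber.subscribe(channel, async (message: string) => {
      clearTimeout(timer);
      await subscriber.unsubscribe(channel);
      resolve(JSON.parse(message));
    });
  });
}
```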
Long polling connections have a well-defined lifecycle that differs significantly from traditional HTTP requests. Understanding this lifecycle is crucial for proper implementation.
The Complete Long Polling Lifecycle:
```
Timeline: Long Polling Request Lifecycle
═══════════════════════════════════════════════════════════════════

Client                          Server                    Event Source
  │                               │                            │
  │──── HTTP Request ────────────▶│                            │
  │     GET /long-poll            │                            │
  │                               │                            │
  │                               │◀──── Check existing data   │
  │                               │      (none found)          │
  │                               │                            │
  │                               │◀──── Register event        │
  │                               │      listener              │
  │                               │                            │
  │      Connection held open     │   (Zero CPU usage)         │
  │      30 second timeout        │                            │
  │      counting down...         │                            │
  │                               │                            │
  │                               │               Event fires ─┤
  │                               │◀─────────── Notification   │
  │                               │                            │
  │◀──── HTTP Response ───────────│                            │
  │      200 OK + data            │                            │
  │                               │                            │
  │──── New HTTP Request ────────▶│    (Cycle restarts)        │
  │     GET /long-poll            │                            │
  ▼                               ▼                            ▼
```

Key Lifecycle Events:
1. Connection Establishment: The client opens a standard HTTP request; the server checks for existing data and, finding none, registers an event listener instead of responding.
2. Wait State: The connection stays open but idle while the server's timeout counts down. No CPU is consumed and no data flows.
3. Resolution (one of three outcomes):
| Outcome | Trigger | Server Response | Client Action |
|---|---|---|---|
| Data Available | Event fires | 200 OK + payload | Process data, reconnect |
| Timeout | Timer expires | 204 No Content | Reconnect immediately |
| Client Disconnect | Network issue | N/A (cleanup) | Reconnect after delay |
4. Cleanup: Whatever the outcome, the server must remove its event listener and cancel its timeout timer.
Memory leaks from orphaned event listeners or timers are the most common long polling bug. Every code path must ensure cleanup occurs—normal completion, timeout, client disconnect, and error cases. Always use try/finally patterns or dedicated cleanup functions.
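One way to honor that rule is to funnel every exit path through a single finally block. A minimal sketch, reusing the `notificationBus`, `Notification` type, and Express app from the event-driven example above (the promise-wrapping approach is one option among several):

```typescript
app.get('/api/notifications/long-poll', async (req, res) => {
  const channel = `user:${req.user.id}`;
  let timer: NodeJS.Timeout | undefined;
  let handler: ((n: Notification) => void) | undefined;

  try {
    // All three resolution paths converge on this one promise
    const notification = await new Promise<Notification | null>((resolve) => {
      handler = resolve;
      notificationBus.once(channel, handler);           // data path
      timer = setTimeout(() => resolve(null), 30_000);  // timeout path
      req.on('close', () => resolve(null));             // disconnect path
    });

    if (notification) {
      res.json({ notifications: [notification] });
    } else if (!res.writableEnded) {
      res.status(204).end();
    }
  } finally {
    // Runs on success, timeout, disconnect, and thrown errors alike -
    // no orphaned listeners or timers on any code path
    if (timer) clearTimeout(timer);
    if (handler) notificationBus.off(channel, handler);
  }
});
```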
Long polling wasn't invented in isolation—it emerged from constraints that defined web development in the mid-2000s.
The Pre-WebSocket Era:
Before 2011, when the WebSocket protocol was standardized, web developers had limited options for real-time updates: browser plugins such as Flash sockets or Java applets, or HTTP workarounds like short polling and hidden iframe streaming.
Long polling emerged as the most practical solution: it worked in all browsers, required no plugins, and used standard HTTP infrastructure.
The Companies That Scaled It:
| Company | Use Case | Scale | Innovation |
|---|---|---|---|
| Gmail (2004) | Email notifications | Millions of users | First major AJAX app with real-time |
| Facebook (2008) | Chat and notifications | 100M+ concurrent | Comet servers, connection coalescing |
| FriendFeed (2008) | Real-time feed | Millions of updates/sec | SUP protocol, feed aggregation |
| Twitter (2009) | Live timeline | Massive spike handling | Streaming API development |
Comet: The Original Name
Long polling was originally popularized under the name "Comet" (a play on AJAX: Ajax and Comet are both household cleaning products). The term encompassed several HTTP-based push simulation techniques: XHR long polling, "forever frame" hidden iframe streaming, and XHR streaming.
The Comet pattern was so influential that libraries like CometD, Atmosphere, and Socket.io were built specifically to abstract its complexities. These libraries handled browser inconsistencies, reconnection logic, and transport fallbacks.
Many modern real-time libraries still use long polling as a fallback transport. Socket.io, SignalR, and SockJS all negotiate the best available transport but fall back to long polling when WebSockets are blocked by proxies or firewalls. Understanding long polling helps you debug these fallback scenarios.
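For example, Socket.IO v4 starts on HTTP long polling by default and upgrades to WebSocket when the network allows. With socket.io-client you can make the order explicit and observe which transport is actually in use (the URL is a placeholder):

```typescript
import { io } from 'socket.io-client';

const socket = io('https://example.com', {
  transports: ['polling', 'websocket'], // the default order, made explicit
});

socket.on('connect', () => {
  // The underlying engine.io client exposes the active transport
  console.log('connected via', socket.io.engine.transport.name); // "polling"

  socket.io.engine.on('upgrade', (transport) => {
    console.log('upgraded to', transport.name); // "websocket"
  });
});
```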
Long polling exists on a spectrum of techniques for simulating server push. Understanding this spectrum helps you appreciate what long polling offers relative to alternatives.
From Polling to True Push:
| Technique | Connection Pattern | Waste & Load | Typical Latency | Infrastructure |
|---|---|---|---|---|
| Short Polling | New request every N seconds | Maximum waste, high load | ~half the poll interval | All HTTP proxies OK |
| Long Polling | New request per event batch | Low waste, low load | Near-immediate | All HTTP proxies OK |
| HTTP Streaming | Single open connection, chunked responses | Zero waste, low load | Immediate | Most HTTP proxies OK |
| Server-Sent Events | Single open connection, text/event-stream | Zero waste, low load | Immediate | Most browsers (not IE) |
| WebSocket | Single open connection, full-duplex | Zero waste, lowest load | Immediate | Requires WS proxy support |

Techniques run from most polling-like at the top to most push-like at the bottom.

Long Polling's Position:
Long polling occupies a sweet spot: it provides near-push timing while maintaining complete HTTP compatibility. Key characteristics:
- Near-immediate delivery: responses arrive when data does, not on a fixed schedule
- One request per event batch, plus occasional empty timeout responses
- Works through virtually all HTTP proxies, load balancers, and firewalls
- Higher per-message overhead than a persistent connection, since each cycle is a full HTTP request
Long polling's killer feature is infrastructure compatibility. Corporate proxies often block WebSocket connections. CDNs may not support SSE properly. But every piece of HTTP infrastructure supports long polling because it's just... HTTP. This compatibility sometimes outweighs the efficiency gains of newer technologies.
We've established the foundational concepts of long polling as a push simulation technique. Let's consolidate the key insights:
- Short polling forces a latency-versus-load tradeoff that compounds catastrophically at scale
- Long polling inverts response timing: the server answers only when it has data, so nearly every response is useful
- Production implementations are event-driven, letting thousands of waiting connections consume almost no resources
- Every connection follows a defined lifecycle (establish, wait, resolve, clean up), and missed cleanup is the classic source of leaks
- On the push spectrum, long polling trades some efficiency for unmatched HTTP infrastructure compatibility
What's Next:
Now that we understand why long polling exists and what it conceptually achieves, we'll dive into the mechanics of how it actually works. The next page explores the precise choreography of timeouts, reconnection, and state management that make long polling reliable in production.
You now understand the fundamental concept of long polling and why it was invented, and you've seen how holding HTTP connections open can simulate push behavior efficiently.