In October 1986, the Internet nearly collapsed. Throughput on congested paths plummeted to a few tens of bits per second, a reduction of roughly a thousand-fold from normal capacity. The culprit wasn't hardware failure or a malicious attack; it was congestion collapse, a phenomenon in which a network becomes so overwhelmed that useful throughput approaches zero even as traffic demand soars.
This crisis forced Van Jacobson, a researcher at Lawrence Berkeley Laboratory, to fundamentally rethink how TCP connections should behave when joining a network. His solution, published in his landmark 1988 paper "Congestion Avoidance and Control," introduced slow start—arguably the most important algorithm in modern networking.
Despite its name, slow start isn't actually slow. It's a carefully engineered mechanism that allows new TCP connections to rapidly discover available network capacity without overwhelming intermediate routers. Understanding slow start is essential for any engineer working with networked systems, as it directly impacts application performance, server scalability, and network stability.
By the end of this page, you will understand why slow start exists, how it operates at the packet level, why its exponential growth is critical for performance, and how modern networks depend on this 35-year-old algorithm for stability. You'll gain insight into one of the most elegant solutions in computer science.
Before understanding slow start, we must first understand the problem it solves. Consider what happens when a new TCP connection starts transmitting data without any awareness of network conditions:
The naive approach:
Imagine a sender with a 1 Gbps connection to its local switch. Without congestion control, this sender might immediately begin transmitting at full speed—up to 1 million packets per second. But the path to the receiver might traverse dozens of routers, each with different capacities. A core router might handle hundreds of gigabits per second, while an edge router serving a residential connection might only handle 100 Mbps.
What happens next is predictable and devastating:
- queues at the slowest router along the path fill within milliseconds,
- packets are dropped and the sender's retransmission timers fire,
- retransmissions add even more traffic to an already saturated link, and
- useful throughput collapses while routers stay busy forwarding packets that will ultimately be discarded.
The 1986 congestion collapse wasn't a theoretical scenario—it happened. The LBL-to-UC-Berkeley link, normally capable of 32 Kbps (already modest), degraded to just 40 bits per second. Packets were being retransmitted so aggressively that almost no original data got through. This event proved that TCP needed fundamental changes to survive.
The core insight:
The fundamental problem is that a sender has no inherent knowledge of the path's capacity or current load. The sender might be communicating with a server on a different continent, traversing undersea cables, satellite links, and networks in countries with varying infrastructure quality.
The sender needs a way to:
- start transmitting without assuming anything about the path's capacity,
- probe upward quickly to discover how much bandwidth is actually available, and
- back off when feedback (packet loss) signals that the network is congested.
This is precisely what slow start achieves.
To implement slow start, TCP introduces a critical state variable called the congestion window (cwnd). This variable represents the sender's estimate of how many bytes (or segments) it can safely have "in flight" on the network at any given time.
The congestion window is entirely local to the sender. Unlike the receiver's advertised window (rwnd), which is communicated in TCP headers, cwnd exists only in the sender's TCP stack. The receiver has no visibility into the sender's cwnd value.
The effective sending rate:
The actual amount of data a sender can transmit is constrained by both windows:
Effective Window = min(cwnd, rwnd)
This means the sender cannot exceed either:
- the network capacity it has estimated so far (cwnd), or
- the buffer space the receiver has advertised (rwnd).
| Property | Congestion Window (cwnd) | Receiver Window (rwnd) |
|---|---|---|
| Purpose | Prevent network congestion | Prevent receiver buffer overflow |
| Controlled by | Sender's congestion control algorithm | Receiver's available buffer space |
| Visibility | Local to sender only | Advertised in TCP header |
| Initial value | 1-10 segments (OS dependent) | Based on receiver's socket buffer |
| Changes based on | Network feedback (ACKs, losses) | Application consumption rate |
| Units | Bytes or segments | Bytes |
Why two windows?
These two windows address fundamentally different constraints:
A receiver might have plenty of buffer space (large rwnd), but the network path might be constrained (small cwnd). Conversely, the network might have abundant capacity, but the receiver might be slow at processing data.
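As a quick illustration, here is a minimal sketch of how a sender combines the two windows to decide how much more data it may put on the wire. The function and variable names are hypothetical, not from any real TCP stack:

```python
def can_send_bytes(cwnd: int, rwnd: int, bytes_in_flight: int) -> int:
    """Return how many more bytes the sender may transmit right now.

    cwnd            -- sender's congestion window (its own capacity estimate)
    rwnd            -- receiver's advertised window from the TCP header
    bytes_in_flight -- data already sent but not yet acknowledged
    """
    effective_window = min(cwnd, rwnd)           # bounded by BOTH constraints
    return max(0, effective_window - bytes_in_flight)

# Example: the network estimate (cwnd) is the tighter limit here.
print(can_send_bytes(cwnd=14_600, rwnd=65_535, bytes_in_flight=5_840))  # 8760
```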
Modern initial cwnd values:
Historically, TCP started with cwnd = 1 segment (typically 536 or 1460 bytes). This was extremely conservative and penalized high-latency connections significantly.
Modern implementations (since RFC 6928, adopted around 2013) use an initial window of 10 segments (~14,600 bytes with standard MSS). This change alone improved web page load times by 10-15% for typical browsing scenarios by allowing more data to be transmitted before the first RTT completes.
The ideal cwnd value equals the bandwidth-delay product (BDP) of the path: BDP = Bandwidth × RTT. For a 100 Mbps link with 50ms RTT, the BDP is 625,000 bytes. If cwnd is smaller than BDP, the connection underutilizes available capacity. If cwnd exceeds BDP, packets queue in router buffers, increasing latency. Slow start's job is to find this optimal point.
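The arithmetic is easy to check directly; the short sketch below (illustrative values only) computes the BDP for the example path above, both in bytes and in segments:

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bits/s x seconds, converted to bytes."""
    return bandwidth_bps * rtt_seconds / 8

MSS = 1460  # typical maximum segment size in bytes

bdp = bdp_bytes(100e6, 0.050)   # 100 Mbps link, 50 ms RTT
print(bdp)                      # 625000.0 bytes
print(round(bdp / MSS))         # ~428 segments must be in flight to fill the path
```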
The slow start algorithm operates with elegant simplicity, yet produces powerful emergent behavior. The core rule can be stated in one sentence:
For each acknowledgment received, increase cwnd by one Maximum Segment Size (MSS).
While this rule appears to produce linear growth (one ACK = one segment increase), the actual effect is exponential. Here's why:
Step-by-step example (initial cwnd = 1 MSS):
| Round | cwnd at Start | Segments Sent | ACKs Received | cwnd at End | Cumulative Data Sent |
|---|---|---|---|---|---|
| 1 | 1 MSS | 1 | 1 | 2 MSS | 1,460 bytes |
| 2 | 2 MSS | 2 | 2 | 4 MSS | 4,380 bytes |
| 3 | 4 MSS | 4 | 4 | 8 MSS | 10,220 bytes |
| 4 | 8 MSS | 8 | 8 | 16 MSS | 21,900 bytes |
| 5 | 16 MSS | 16 | 16 | 32 MSS | 45,260 bytes |
| 6 | 32 MSS | 32 | 32 | 64 MSS | 91,980 bytes |
| 7 | 64 MSS | 64 | 64 | 128 MSS | 185,420 bytes |
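The table above can be reproduced with a few lines of code. This is a simplified model (one segment size, no losses, all ACKs for a round arriving together), not a real TCP implementation:

```python
MSS = 1460          # bytes per segment (typical)
cwnd = 1            # segments, matching the table's initial window of 1 MSS
total_sent = 0      # cumulative bytes transmitted

for rtt_round in range(1, 8):
    segments_this_round = cwnd
    total_sent += segments_this_round * MSS
    cwnd += segments_this_round            # one MSS per ACK received
    print(f"round {rtt_round}: sent {segments_this_round} segments, "
          f"cwnd now {cwnd} MSS, cumulative {total_sent} bytes")
```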
The mathematical pattern:
After n round-trip times, the congestion window is:
cwnd = initial_cwnd × 2^n
This is exponential growth—the window doubles every RTT. Starting from 1 MSS:
- after 1 RTT, cwnd is 2 MSS;
- after 5 RTTs, 32 MSS (about 47 KB in flight per RTT);
- after 10 RTTs, 1,024 MSS (about 1.5 MB in flight per RTT).
This exponential growth is why slow start isn't actually slow—it's remarkably fast at finding available capacity. A connection can grow from 1 segment to a window that fills a 10 Gbps path in fewer than 20 RTTs, which might be only a second or two even with moderate latency.
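That claim follows from the doubling formula; the sketch below works it out for a 10 Gbps path at a few RTTs (an idealized calculation that ignores losses, delayed ACKs, and receiver limits):

```python
import math

MSS = 1460  # bytes per segment

def rtts_to_fill(bandwidth_bps: float, rtt_s: float, initial_segments: int = 1) -> int:
    """Round trips of doubling needed before cwnd covers the path's BDP."""
    bdp_segments = bandwidth_bps * rtt_s / 8 / MSS
    return math.ceil(math.log2(bdp_segments / initial_segments))

for rtt_ms in (10, 50, 100):
    n = rtts_to_fill(10e9, rtt_ms / 1000)
    print(f"10 Gbps, {rtt_ms} ms RTT: {n} RTTs (~{n * rtt_ms / 1000:.2f} s)")
```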
The name 'slow start' is relative to the alternative—immediately blasting data at the sender's line rate. Compared to not having any congestion control, slow start is indeed slow. But compared to linear probing (adding one segment per RTT), slow start is exponentially faster. The name reflects historical context, not absolute speed.
The pseudocode implementation:
Here's how slow start is typically implemented in a TCP stack:
```
// Slow Start Algorithm
// Called when a new ACK is received

function on_ack_received(ack):
    if cwnd < ssthresh:
        // In slow start phase
        cwnd = cwnd + MSS
        // This happens for EACH ACK, so if we sent N segments
        // and receive N ACKs, cwnd increases by N × MSS
        // Effectively doubling cwnd per RTT
    else:
        // In congestion avoidance phase (covered later)
        cwnd = cwnd + MSS × (MSS / cwnd)
        // This produces linear growth

// On connection initialization
function connection_init():
    cwnd = INITIAL_WINDOW    // Modern default: 10 × MSS
    ssthresh = 65535         // Initial threshold (often infinite)

// On packet loss detected (timeout)
function on_timeout():
    ssthresh = max(cwnd / 2, 2 × MSS)   // Save half of current window
    cwnd = 1 × MSS                      // Reset to slow start
    // Retransmit lost segment and resume slow start
```

The choice of exponential growth in slow start isn't arbitrary—it represents an optimal balance between probing speed and congestion risk. Let's analyze why this matters:
The probing dilemma:
A new connection faces a fundamental uncertainty: the network might have 1 Kbps available or 10 Gbps available. The difference spans seven orders of magnitude. Any probing algorithm must be able to discover this capacity efficiently while avoiding catastrophic over-injection.
Why not linear growth?
With linear growth (adding one segment per RTT), reaching a 10 Gbps capacity from a standing start would take thousands of RTTs—potentially minutes. For short-lived connections (like HTTP requests), linear growth would mean the connection terminates before ever discovering available capacity.
Why not immediate maximum rate?
Sending at maximum rate immediately is exactly what caused the 1986 congestion collapse. The sender has no information about path capacity and could inject thousands of packets into buffers that can only hold dozens.
The mathematical elegance:
Exponential growth has a crucial property: the amount of "wasted" transmission when you overshoot is bounded.
When slow start continues until packet loss occurs, the sender has pushed cwnd to approximately 2× the path capacity (since it doubled from a working value to a failing one). This means at most half the transmitted data in the final RTT is wasted—a constant fraction regardless of the path's actual capacity.
This is a better tradeoff than either alternative: linear probing wastes almost nothing when it overshoots but takes far too long to find capacity, while transmitting at line rate immediately can dump thousands of excess segments into the network at once.
Real-world impact:
For a 100 Mbps path with 50ms RTT (BDP = 625 KB, roughly 428 segments):
- slow start reaches a window covering the BDP in about 9 RTTs, roughly half a second;
- linear growth of one segment per RTT would need over 400 RTTs, more than 20 seconds;
- sending at line rate immediately would overflow the bottleneck queue within the first RTT.
The difference is dramatic and directly affects every TCP connection's performance.
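Those figures come from a rough calculation like the one below, an idealized model that ignores losses and delayed ACKs and simply counts the round trips each strategy needs before the window covers the path's BDP:

```python
import math

MSS = 1460
RTT = 0.050                       # 50 ms
BDP_SEGMENTS = 625_000 // MSS     # ~428 segments for 100 Mbps x 50 ms

# Exponential (slow start): cwnd doubles each RTT, starting from 1 segment.
slow_start_rtts = math.ceil(math.log2(BDP_SEGMENTS))
# Linear probing: cwnd grows by 1 segment each RTT, starting from 1 segment.
linear_rtts = BDP_SEGMENTS - 1

print(f"slow start: {slow_start_rtts} RTTs (~{slow_start_rtts * RTT:.2f} s)")
print(f"linear:     {linear_rtts} RTTs (~{linear_rtts * RTT:.1f} s)")
```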
Understanding how slow start manifests in real network traces is essential for diagnosing performance issues. Let's examine what slow start looks like in practice:
Observing slow start with packet captures:
When you capture TCP traffic (using tools like Wireshark or tcpdump), slow start has a distinctive pattern:
```
# Typical slow start packet pattern (simplified)

Time    Direction   Seq        Size   Notes
────────────────────────────────────────────────────────
0.000   →           1          1460   First segment (cwnd=1)
0.050   ←           ACK 1461          Acknowledgment received
0.050   →           1461       1460   Second segment (cwnd=2)
0.050   →           2921       1460   Third segment
0.100   ←           ACK 2921
0.100   ←           ACK 4381          Two ACKs → cwnd now 4
0.100   →           4381       1460   (cwnd=4, can send 4)
0.100   →           5841       1460
0.100   →           7301       1460
0.100   →           8761       1460
0.150   ←           ACK 5841
0.150   ←           ACK 7301
0.150   ←           ACK 8761
0.150   ←           ACK 10221         Four ACKs → cwnd now 8

# Pattern continues: 8, 16, 32, 64 segments per RTT...
```

Key observations in traces:
Clustered transmissions — Segments are sent in bursts equal to cwnd, then the sender waits for ACKs
Burst doubling — Each round of ACKs enables double the previous transmission burst
ACK clocking — The arrival rate of ACKs "clocks" the transmission of new data
RTT visibility — The spacing between clusters reveals round-trip time
Common diagnostic scenarios:
| Scenario | Observation | Root Cause | Impact |
|---|---|---|---|
| High-latency path | Slow start takes many seconds | Large RTT means each doubling takes longer | Long time to full throughput |
| Lossy network | cwnd resets repeatedly | Packet loss triggers ssthresh reduction | Never achieves steady state |
| Small receiver buffer | cwnd limited early | rwnd smaller than BDP | Cannot use full slow start potential |
| Initial window too small | Extended ramp-up period | Legacy OS with IW=1 | Poor performance for short flows |
| Competing traffic | Variable ACK spacing | Cross-traffic affecting queues | Longer effective RTT during slow start |
When troubleshooting slow performance, check the initial window size using ss -i on Linux or inspecting the first few data segments. If you see single-segment transmissions initially on a modern system, the TCP stack may be misconfigured or an intermediary (like a WAN optimizer) may be interfering with slow start.
Slow start doesn't continue indefinitely—exponential growth must eventually stop. There are three conditions that terminate slow start:
Termination Condition 1: Packet Loss
The most common termination is packet loss. When a segment is lost (detected by timeout or duplicate ACKs), it indicates the network cannot handle the current transmission rate. In classic TCP:
- ssthresh is set to half of the current congestion window (with a floor of 2 MSS),
- cwnd is reset to 1 MSS, and
- the lost segment is retransmitted and slow start begins again, this time capped by the new ssthresh.
Termination Condition 2: ssthresh Reached
If the sender has prior knowledge (from an earlier loss), the ssthresh variable stores an estimate of the network's capacity. When cwnd reaches ssthresh, TCP transitions from slow start to congestion avoidance (linear growth). This prevents the aggressive exponential growth from inevitably overshooting.
Termination Condition 3: Receiver Window Limit
If the receiver's advertised window (rwnd) is smaller than cwnd, the sender is limited by the receiver, not the network. Slow start effectively pauses—further cwnd increases don't translate to additional transmissions.
The interplay of termination conditions:
In practice, these conditions interact:
A new connection starts with a very high ssthresh (often 65535 or "infinity"). Slow start continues until loss occurs, which sets ssthresh to a reasonable estimate.
After recovery from loss, slow start resumes but only until cwnd reaches the new ssthresh. This prevents the sender from repeating the same overshoot that caused the original loss.
Once in congestion avoidance (cwnd ≥ ssthresh), the connection probes for additional capacity linearly, which is far less likely to cause loss.
This adaptive behavior means TCP "learns" about the path over time. Initial connections may experience loss during slow start, but subsequent data transfers on the same connection benefit from the learned ssthresh value.
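The "learning" described above can be seen in a toy simulation. The sketch below assumes a fixed path capacity and timeout-style recovery, which is a deliberate simplification of real TCP behavior:

```python
MSS = 1460
CAPACITY = 64 * MSS            # pretend the path can hold 64 segments in flight

cwnd, ssthresh = 1 * MSS, float("inf")

for rtt in range(1, 16):
    if cwnd > CAPACITY:                        # overshoot -> loss detected
        ssthresh = max(cwnd // 2, 2 * MSS)     # remember half the failing window
        cwnd = 1 * MSS                         # classic timeout response: restart
        print(f"rtt {rtt:2d}: LOSS, ssthresh={ssthresh // MSS} MSS")
    elif cwnd < ssthresh:                      # slow start: double per RTT
        cwnd *= 2
        print(f"rtt {rtt:2d}: slow start, cwnd={cwnd // MSS} MSS")
    else:                                      # congestion avoidance: +1 MSS per RTT
        cwnd += MSS
        print(f"rtt {rtt:2d}: congestion avoidance, cwnd={cwnd // MSS} MSS")
```

Running it shows the first slow start overshooting and losing, then the second slow start stopping at the learned ssthresh and handing off to linear growth.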
HTTP/1.1 web requests often transfer small objects. With a 10-segment initial window and typical web object sizes of 10-50 KB, many requests complete entirely during slow start—never reaching full capacity. This is why HTTP/2's multiplexing and connection reuse are critical for web performance: longer-lived connections can exit slow start and utilize more bandwidth.
The original slow start algorithm from 1988 has been refined considerably. Modern TCP implementations incorporate several enhancements:
RFC 6928: Initial Window Increase (2013)
The most significant change increased the initial window from 1-4 segments to 10 segments. This change alone:
- lets a sender put roughly 15 KB on the wire in the first round trip,
- allows many small web responses to complete within a single RTT, and
- improved typical web page load times by roughly 10-15%, as noted earlier.
HyStart (Hybrid Slow Start)
Developed for CUBIC TCP (Linux default), HyStart attempts to detect when slow start is approaching capacity before packet loss occurs. It monitors:
- the spacing of arriving ACKs (ACK trains), and
- increases in measured round-trip time (delay samples), which indicate that queues along the path are beginning to build.
When HyStart detects these signals, it exits slow start early, avoiding the packet loss that would otherwise reduce cwnd.
| Era | Initial Window | Key Features | Typical Impact |
|---|---|---|---|
| TCP Tahoe (1988) | 1 segment | Original slow start | Very conservative, slow for short flows |
| RFC 2414 (1998) | 2-4 segments | Experimental increase | Modest improvement for web traffic |
| RFC 3390 (2002) | 2-4 segments | Standardized larger IW | Better compatibility |
| RFC 6928 (2013) | 10 segments | Modern standard | Significant web performance gains |
| CUBIC + HyStart | 10+ segments | Early exit on delay signals | Reduced loss-based exits |
| BBR v2 | Variable | Rate-based probing | Fundamentally different approach |
Pacing during slow start:
Modern kernels (Linux 4.13+) can pace segments during slow start rather than sending bursts. Instead of transmitting 10 segments back-to-back when the window allows, the sender spaces them across the RTT estimate. This:
- avoids sudden bursts that can overflow shallow router buffers,
- reduces losses caused by the burst itself rather than by a genuine lack of capacity, and
- produces smoother queueing behavior and cleaner RTT measurements.
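The idea is easy to quantify. The sketch below uses illustrative numbers, not the kernel's actual pacing logic, to show the gap a paced sender would leave between segments instead of sending them back-to-back:

```python
MSS = 1460
cwnd_segments = 10        # modern initial window
rtt = 0.050               # 50 ms RTT estimate

# Unpaced: all 10 segments leave the NIC essentially back-to-back.
# Paced: spread the window's segments across (roughly) one RTT.
inter_segment_gap = rtt / cwnd_segments
print(f"send one {MSS}-byte segment every {inter_segment_gap * 1000:.0f} ms")  # ~5 ms
```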
RACK-TLP (Recent ACK and Tail Loss Probe):
Modern loss detection uses timing information rather than just duplicate ACK counts. During slow start, this means:
- a loss can be detected as soon as later-sent segments are acknowledged and enough time has elapsed, rather than waiting for three duplicate ACKs, and
- a loss at the tail of a burst can be recovered by a quick probe instead of an expensive retransmission timeout that would reset cwnd to 1 MSS.
These enhancements preserve the core slow start principle—exponential probing with cwnd—while addressing practical limitations discovered through decades of operational experience.
Google's BBR congestion control algorithm takes a fundamentally different approach. Instead of growing cwnd until loss occurs (loss-based), BBR measures actual bandwidth and RTT to determine pacing rate (model-based). While BBR doesn't use traditional slow start, it has a similar 'startup' phase that probes for bandwidth. This represents an evolution beyond Jacobson's original design while respecting its core insight: TCP must actively probe to discover capacity.
We've thoroughly explored slow start—the mechanism that saved the Internet from congestion collapse and continues to enable reliable high-speed data transfer. Let's consolidate the key insights:
- A new sender knows nothing about path capacity; transmitting blindly is what caused the 1986 congestion collapse.
- The congestion window (cwnd) is the sender's local estimate of how much data can safely be in flight; the effective window is min(cwnd, rwnd).
- Increasing cwnd by one MSS per ACK doubles the window every RTT, so capacity is found in a logarithmic number of round trips.
- Slow start ends on packet loss, when cwnd reaches ssthresh, or when the receiver window becomes the limiting factor.
- Modern refinements (a 10-segment initial window, HyStart, pacing, RACK-TLP) adjust the details while preserving exponential probing.
What's next:
Now that we understand the slow start phase itself, the next page dives deeper into the exponential growth pattern—examining its mathematical properties, visualizing its behavior, and analyzing why this specific growth rate represents an optimal tradeoff between speed and safety.
You now understand TCP slow start—why it exists, how it operates, and why exponential growth is the key to its effectiveness. Slow start transforms TCP from a protocol that could collapse networks into one that cooperatively discovers and utilizes available capacity. Next, we'll examine the exponential growth pattern in greater mathematical detail.