TCP Tahoe solved the Internet's congestion collapse crisis, but its conservative philosophy came with a significant cost: complete window reset on every loss event. Engineers quickly recognized that this one-size-fits-all response was excessive for many real-world scenarios.
Consider the situation when Fast Retransmit triggers: three duplicate ACKs arrived, which means three subsequent packets were successfully delivered. The network isn't completely congested—it's still forwarding segments. Resetting the entire congestion window seems like an overreaction.
This insight led to the development of TCP Reno, released in 1990 with BSD 4.3 Reno. Reno introduced a crucial new mechanism called Fast Recovery that fundamentally changed how TCP responds to fast retransmit events, dramatically improving throughput while maintaining the safety guarantees that made Tahoe successful.
Reno became the de facto standard TCP implementation for over a decade and remains influential in shaping our understanding of congestion control. Its core ideas—distinguishing severity of congestion signals and optimizing recovery accordingly—continue to inform modern protocol design.
By the end of this page, you will understand TCP Reno's Fast Recovery mechanism, how it differs from Tahoe's response to loss, why Reno maintains different behaviors for timeout versus triple-dupack losses, the performance improvements Reno achieves, and the remaining limitations that motivated further evolution.
To understand why Reno was necessary, we must examine Tahoe's behavior more closely and recognize the opportunities it was leaving on the table.
Tahoe's Conservative Philosophy:
Tahoe treats all loss events identically: set ssthresh to half the current window, reset cwnd to 1 MSS, and re-enter Slow Start. This ensures safety—if the network is severely congested, this dramatic backoff prevents making things worse.
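Tahoe's uniform response can be sketched in a few lines of C (an illustrative sketch; `tahoe_on_loss` and the state struct are hypothetical, not from a real stack):

```c
#include <stdint.h>

typedef struct {
    uint32_t cwnd;      /* congestion window, in bytes */
    uint32_t ssthresh;  /* slow start threshold, in bytes */
    uint32_t mss;       /* maximum segment size */
} TahoeState;

/* Tahoe reacts identically to every loss signal: remember half the
 * current window in ssthresh, then restart from 1 MSS in Slow Start. */
void tahoe_on_loss(TahoeState* t) {
    t->ssthresh = t->cwnd / 2;
    if (t->ssthresh < 2 * t->mss) {
        t->ssthresh = 2 * t->mss;   /* conventional floor of 2 MSS */
    }
    t->cwnd = t->mss;               /* back to 1 MSS; re-enter Slow Start */
}
```

Whether the loss was a timeout or a triple duplicate ACK, the outcome is the same: a 100-segment window collapses to a single segment.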
But consider two very different scenarios:

Scenario A - Timeout (Severe Congestion): The retransmission timer expires because no ACKs are returning at all. The path may be heavily congested or even broken, so a drastic backoff is justified.

Scenario B - Triple Duplicate ACK (Mild Congestion): Three duplicate ACKs arrive, proving that packets sent after the lost one are still being delivered. The network is still forwarding traffic; most likely only a single segment was dropped.
The Key Observation:
When three duplicate ACKs arrive, each one represents a successfully delivered packet beyond the loss point. These ACKs tell us two things: the path is still delivering packets end to end, and the receiver is buffering the out-of-order segments, waiting only for the gap to be filled.
Reno exploits this information: instead of going back to Slow Start, it retransmits the suspected lost packet and then uses the dup ACKs to pace the injection of new data. The network continues to carry traffic at a reduced but substantial rate, avoiding the deep throughput valley that Tahoe creates.
Each arriving ACK (including duplicates) indicates that a segment has left the network. This is the 'self-clocking' principle: ACKs naturally pace new transmissions at the rate the network can handle. Fast Recovery leverages this by treating dup ACKs as permission to send new data.
Fast Recovery is the defining innovation of TCP Reno. It replaces Tahoe's "always go to Slow Start" rule with a more nuanced response specifically for triple-duplicate-ACK loss events.
The Core Idea:
Instead of dropping cwnd to 1, Fast Recovery halves the window and then uses incoming duplicate ACKs to artificially inflate the window temporarily. This inflation accounts for the segments that have left the network (evidenced by the dup ACKs) and allows new transmissions to continue.
When the retransmitted segment is finally ACKed (indicating the receiver has filled the gap), cwnd is "deflated" back to its normal size and Congestion Avoidance resumes.
```c
/* TCP Reno Fast Recovery Algorithm.
 * Helper routines (retransmit_*, highest_seq_sent, can_send_new_data,
 * send_new_segment) are assumed to be provided by the TCP stack. */
#include <stdint.h>

typedef enum {
    SLOW_START,
    CONGESTION_AVOIDANCE,
    FAST_RECOVERY              /* New state in Reno */
} RenoPhase;

typedef struct {
    uint32_t  cwnd;
    uint32_t  ssthresh;
    uint32_t  mss;
    uint8_t   dupacks;
    RenoPhase phase;
    uint32_t  recover;         /* Highest seq sent before entering FR */
} TCPReno;

void reno_on_dup_ack(TCPReno* r) {
    r->dupacks++;

    if (r->dupacks == 3 && r->phase != FAST_RECOVERY) {
        /*
         * FAST RETRANSMIT + enter FAST RECOVERY
         *
         * Step 1: Set ssthresh to half of flight size.
         */
        r->ssthresh = r->cwnd / 2;
        if (r->ssthresh < 2 * r->mss) {
            r->ssthresh = 2 * r->mss;
        }

        /* Step 2: Retransmit the missing segment. */
        retransmit_from_last_ack();

        /*
         * Step 3: Set cwnd = ssthresh + 3*MSS.
         *
         * The "+3*MSS" accounts for the 3 dup ACKs already received.
         * Each dup ACK indicates a segment has left the network.
         */
        r->cwnd = r->ssthresh + 3 * r->mss;

        /* Step 4: Record the recovery point. */
        r->recover = highest_seq_sent();

        /* Step 5: Enter the Fast Recovery phase. */
        r->phase = FAST_RECOVERY;
    } else if (r->phase == FAST_RECOVERY) {
        /*
         * Already in Fast Recovery: each dup ACK inflates cwnd.
         *
         * Each dup ACK means another segment left the network,
         * so we can transmit a new segment to replace it.
         */
        r->cwnd += r->mss;

        /* Transmit a new segment if the window allows. */
        if (can_send_new_data(r)) {
            send_new_segment();
        }
    }
}

void reno_on_new_ack(TCPReno* r, uint32_t ack_num) {
    r->dupacks = 0;

    if (r->phase == FAST_RECOVERY) {
        if (ack_num >= r->recover) {
            /*
             * FULL ACK: acknowledges all data up to the recover point.
             * Fast Recovery is complete; "deflate" the window.
             */
            r->cwnd = r->ssthresh;           /* Deflate to halved value */
            r->phase = CONGESTION_AVOIDANCE;
        } else {
            /*
             * PARTIAL ACK: more losses exist in the same window.
             * Original Reno retransmits but exits Fast Recovery
             * anyway; this is where Reno has problems (see NewReno).
             */
            retransmit_from_ack(ack_num);
            r->cwnd = r->ssthresh;
            r->phase = CONGESTION_AVOIDANCE;
        }
    } else {
        /* Normal ACK processing (Slow Start or CA). */
        if (r->cwnd < r->ssthresh) {
            r->cwnd += r->mss;                       /* Slow Start */
            r->phase = SLOW_START;
        } else {
            r->cwnd += (r->mss * r->mss) / r->cwnd;  /* CA */
            r->phase = CONGESTION_AVOIDANCE;
        }
    }
}

void reno_on_timeout(TCPReno* r) {
    /*
     * TIMEOUT: indicates severe congestion.
     * Same as Tahoe: full reset to Slow Start.
     */
    r->ssthresh = r->cwnd / 2;
    if (r->ssthresh < 2 * r->mss) {
        r->ssthresh = 2 * r->mss;
    }
    r->cwnd = r->mss;          /* Reset to 1 MSS */
    r->dupacks = 0;
    r->phase = SLOW_START;

    /* Retransmit all outstanding segments. */
    retransmit_all_outstanding();
}
```

Step-by-Step Walkthrough:
Trigger (3rd duplicate ACK): Fast Retransmit activates, indicating a likely single segment loss.
Set ssthresh = cwnd/2: Record the congestion point for future reference.
Retransmit: Immediately send the presumed lost segment.
Set cwnd = ssthresh + 3×MSS: The "+3×MSS" represents the three segments the receiver has confirmed receiving (via the dup ACKs). We subtract them conceptually since they've left the network.
Enter Fast Recovery: A new state where we're actively recovering from a loss.
On each subsequent dup ACK: Increment cwnd by 1 MSS. This "inflates" the window, allowing new transmissions. Each dup ACK proves another segment left the network.
On the recovery ACK (the cumulative ACK covering the retransmitted segment plus the receiver's buffered out-of-order data): "Deflate" cwnd to ssthresh and resume Congestion Avoidance. Recovery is complete.
The inflation during Fast Recovery is artificial—it temporarily allows extra segments in flight to maintain throughput. The deflation on recovery returns to the 'true' halved window. This technique keeps the pipe full while recovery occurs.
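The inflation/deflation arithmetic from the walkthrough can be captured in a few lines (a sketch counting cwnd in whole segments; the `fr_*` function names are illustrative, not from a real implementation):

```c
#include <stdint.h>

typedef struct {
    uint32_t cwnd;      /* congestion window, in segments */
    uint32_t ssthresh;  /* slow start threshold, in segments */
} FRTrace;

/* Entering Fast Recovery on the 3rd dup ACK: halve the window,
 * then add 3 for the segments the dup ACKs prove have departed. */
void fr_enter(FRTrace* f) {
    f->ssthresh = f->cwnd / 2;
    f->cwnd = f->ssthresh + 3;
}

/* Each further dup ACK inflates the window by one segment. */
void fr_dup_ack(FRTrace* f) { f->cwnd += 1; }

/* The recovery ACK deflates back to the halved value. */
void fr_full_ack(FRTrace* f) { f->cwnd = f->ssthresh; }
```

Starting from cwnd = 10, entering Fast Recovery yields ssthresh = 5 and cwnd = 8; two further dup ACKs inflate the window to 10, and the recovery ACK deflates it back to 5.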
The fundamental difference between Reno and Tahoe lies in how they transition after detecting loss via triple duplicate ACKs. Let's compare the state machines:
| Event | TCP Tahoe | TCP Reno |
|---|---|---|
| 3 Duplicate ACKs | ssthresh = cwnd/2<br>cwnd = 1 MSS<br>Enter Slow Start | ssthresh = cwnd/2<br>cwnd = ssthresh + 3×MSS<br>Enter Fast Recovery |
| Additional Dup ACKs | N/A (already in SS) | cwnd += MSS (inflate)<br>Send new data if possible |
| ACK for retransmit | Normal SS processing | cwnd = ssthresh (deflate)<br>Enter Congestion Avoidance |
| Timeout | ssthresh = cwnd/2<br>cwnd = 1 MSS<br>Enter Slow Start | ssthresh = cwnd/2<br>cwnd = 1 MSS<br>Enter Slow Start |
| Recovery Time | log₂(ssthresh) RTTs to reach ssthresh | Immediate: already at ssthresh |
Critical Observation:
The key difference is where cwnd ends up after detecting triple-dupack loss: Tahoe restarts from 1 MSS, while Reno continues from ssthresh + 3×MSS, roughly half the previous window.
This means Reno recovers from a single-packet loss in roughly 1 RTT (the time to receive the recovery ACK), while Tahoe takes log₂(ssthresh) RTTs just to reach the same window size.
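The recovery-time gap can be sketched numerically (a simplification that ignores delayed ACKs and assumes cwnd doubles every Slow Start RTT; the helper names are illustrative):

```c
#include <stdint.h>

/* RTTs of Slow Start for Tahoe to climb from cwnd = 1 back to
 * ssthresh, doubling once per RTT: 1, 2, 4, 8, ... */
uint32_t tahoe_rtts_to_ssthresh(uint32_t ssthresh_segs) {
    uint32_t cwnd = 1, rtts = 0;
    while (cwnd < ssthresh_segs) {
        cwnd *= 2;
        rtts++;
    }
    return rtts;
}

/* Reno deflates cwnd directly to ssthresh on the recovery ACK,
 * so no Slow Start climb is needed. */
uint32_t reno_rtts_to_ssthresh(uint32_t ssthresh_segs) {
    (void)ssthresh_segs;
    return 0;
}
```

With ssthresh = 50 segments, the loop gives 6 RTTs for Tahoe (1→2→4→8→16→32→64) versus 0 for Reno, consistent with the log₂(ssthresh) figure above.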
Reno keeps Tahoe's timeout behavior exactly: full reset to cwnd = 1 and Slow Start. The assumption is that timeout indicates a more severe problem—perhaps the network path has fundamentally changed or is heavily congested. The conservative response remains appropriate here.
Fast Recovery's elegance lies in its exploitation of TCP's self-clocking property—a fundamental characteristic first articulated by Van Jacobson in his original congestion control work.
What is Self-Clocking?
In steady state, a TCP connection achieves an equilibrium where new packets are sent at the same rate old packets (and their ACKs) complete. ACKs naturally arrive at the rate the network can deliver them, creating a "clock" that paces new transmissions.
Mathematically, if the bottleneck bandwidth is B and the RTT is R, then keeping the pipe full requires a window of W = B × R (the bandwidth-delay product); in steady state, ACKs return at rate B and clock out new segments at exactly the rate the bottleneck drains them.
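For concreteness, a tiny helper (hypothetical; `bdp_bytes` is not a real API) turns the W = B × R formula into arithmetic:

```c
#include <stdint.h>

/* Window needed to keep the pipe full, in bytes:
 * bandwidth (bits/sec) converted to bytes/sec, times RTT (ms). */
uint64_t bdp_bytes(uint64_t bandwidth_bits_per_sec, uint32_t rtt_ms) {
    return (bandwidth_bits_per_sec / 8) * rtt_ms / 1000;
}
```

At 100 Mbit/s with an 80 ms RTT this gives 1,000,000 bytes, roughly 685 segments of 1460 bytes each.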
```c
/*
 * Self-Clocking in Action
 *
 * Scenario: cwnd = 10 segments, 1 segment lost (segment #5)
 *
 * Sender transmits: [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
 *                                    ↓
 *                                [5] is LOST
 *
 * Receiver sees:    [1] [2] [3] [4]     [6] [7] [8] [9] [10]
 *
 * Receiver sends:
 *   ACK 2 (received [1])
 *   ACK 3 (received [2])
 *   ACK 4 (received [3])
 *   ACK 5 (received [4])
 *   ACK 5 (DUPACK #1, received [6], still expecting [5])
 *   ACK 5 (DUPACK #2, received [7])
 *   ACK 5 (DUPACK #3, received [8]) -> FAST RETRANSMIT
 *   ACK 5 (DUPACK #4, received [9])
 *   ACK 5 (DUPACK #5, received [10])
 *
 * Key insight: Each ACK (including dup ACKs) represents a segment
 * that has left the network. The network still has capacity for
 * that segment.
 *
 * With Fast Recovery window inflation:
 *   - Enter FR: cwnd = ssthresh + 3*MSS = 5 + 3 = 8 segments
 *   - After DUPACK #4: cwnd = 9 -> can send [11]
 *   - After DUPACK #5: cwnd = 10 -> can send [12]
 *
 * The pipe stays full! Dup ACKs clock new transmissions.
 */

/*
 * Without Fast Recovery (Tahoe):
 *   - On 3 dupacks: cwnd = 1, can only send 1 segment
 *   - Must wait through entire Slow Start to recover throughput
 *   - Network capacity is wasted during recovery
 */
```

Why Dup ACKs Are Valuable:
Each duplicate ACK isn't just noise—it carries critical information:
Segment Departed: The receiver got another out-of-order segment, which means a segment left the network.
Network Functioning: The path is working (otherwise no ACKs would return).
Clock Signal: Each dup ACK grants "permission" to send one new segment without violating the congestion window constraint.
By inflating cwnd with each dup ACK, Fast Recovery maintains the "ACK clock" throughout recovery. New segments enter the network at exactly the rate old ones leave, keeping the pipe optimally full.
Tahoe's reset to cwnd = 1 breaks the ACK clock—there's only one segment in flight, generating only one ACK, which allows only one more segment. The exponential growth of Slow Start rebuilds the clock, but this takes time. Reno's Fast Recovery never breaks the clock; it maintains continuous transmission throughout.
Let's quantify Reno's performance advantage over Tahoe. The improvement is most significant for long-lived flows experiencing sporadic losses—exactly the scenario that characterizes most bulk data transfers.
Analytical Comparison:
Consider a connection whose congestion window has grown to 100 segments when a single segment is lost:
| Phase | TCP Tahoe | TCP Reno |
|---|---|---|
| Pre-loss cwnd | 100 segments | 100 segments |
| Post-loss ssthresh | 50 segments | 50 segments |
| Post-loss cwnd | 1 segment | ~50 segments |
| Time to reach 50 segs | log₂(50) ≈ 6 RTTs | 0 RTTs (immediate) |
| Time to reach 100 segs | 6 + 50 = 56 RTTs | 50 RTTs |
| Throughput during recovery | ~25% average | ~75% average |
The Throughput Model:
For a simple analysis, assume losses occur at random with probability p. The steady-state throughput of TCP can be modeled as:
Tahoe Throughput:
Throughput ≈ C / (RTT × √p)
where C is a constant around 0.7-0.9, depending on the recovery behavior.
Reno Throughput:
Throughput ≈ 1.22 / (RTT × √p)
The key difference is the constant factor. Reno's faster recovery translates to a roughly 30-50% throughput improvement over Tahoe in typical conditions.
Visual Comparison:
```
cwnd (segments)
  ^
  |        Tahoe                        Reno
100 |        /\                           /\      /\
    |       /  \                         /  \    /  \
 50 |      /    \         ____          /    \__/    \
    |     /      \       / CA          /      CA
 25 |    /   CA   \     /             /   < No deep drop >
 10 |   /          \   / SS          /
  1 |  /            \_/             /
    +----------------------------------------------> time
            Loss                        Loss

Tahoe: deep valley, slow climb from cwnd = 1
Reno:  shallow drop, immediate recovery from cwnd ≈ ssthresh
```

The area under each curve represents data transferred: Reno's smaller drop means more data moved over the same interval.

The performance gap between Reno and Tahoe grows with the bandwidth-delay product. On paths where the optimal window is hundreds or thousands of segments, Tahoe's climb from 1 segment takes many RTTs, while Reno recovers almost immediately. Modern high-speed networks strongly favor Reno's approach.
TCP Reno excels when a single packet is lost from a window. But its performance degrades dramatically when multiple packets are lost from the same window—a scenario that's common during congestion events.
The Problem:
Fast Recovery assumes that there's only one lost segment. When the retransmitted segment is ACKed, Reno exits Fast Recovery and resumes Congestion Avoidance. But what if there were actually two lost segments?
```c
/*
 * Reno's Failure with Multiple Losses
 *
 * Scenario: Segments [5] and [8] are both lost from a window
 *
 * Sender: [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
 *                          X           X
 *                        LOST        LOST
 *
 * Step 1: Receiver generates dup ACKs for [5]
 * Step 2: After 3 dupacks, sender fast-retransmits [5]
 * Step 3: Sender inflates cwnd, continues sending [11], [12], ...
 * Step 4: Receiver gets retransmitted [5], fills that gap
 * Step 5: Receiver sends ACK for [7] (cumulative—segments 5,6,7 complete)
 *         But still missing [8]! This is a PARTIAL ACK.
 *
 * Here's where Reno fails:
 *
 * Original Reno behavior on partial ACK:
 *   - Exit Fast Recovery (premature!)
 *   - Set cwnd = ssthresh
 *   - Resume Congestion Avoidance
 *
 * Problem: [8] is still missing!
 *   - No more dup ACKs arriving (receiver already reported the gap)
 *   - Can't trigger another Fast Retransmit
 *   - Must wait for RTO to retransmit [8]
 *
 * Result: Timeout occurs, cwnd drops to 1, massive throughput loss
 */

/*
 * Performance Impact:
 *
 * Timeline with single loss:
 *   t=0:    Loss detected, Fast Recovery begins
 *   t=1RTT: Recovery ACK received, exit Fast Recovery
 *   Result: Minimal throughput impact
 *
 * Timeline with multiple losses (Reno):
 *   t=0:    First loss detected, Fast Recovery begins
 *   t=1RTT: Partial ACK (second loss still pending)
 *   t=1RTT: Exit Fast Recovery prematurely
 *   t=RTO:  Timeout for second loss (often 1-3 seconds!)
 *   t=RTO+: Full Slow Start from cwnd=1
 *   Result: Catastrophic throughput loss
 *
 * The more losses in a window, the more timeouts Reno experiences.
 */
```

Understanding Partial ACKs:
A partial ACK is an ACK that acknowledges some but not all of the data that was outstanding when Fast Recovery was entered. It indicates that the retransmitted segment arrived and filled its gap, but at least one more segment from the same window is still missing.
Reno's original specification didn't handle partial ACKs correctly. It would exit Fast Recovery immediately, leaving subsequent lost segments unrecovered until timeout.
| Losses per Window | Frequency* | Reno Behavior | Impact |
|---|---|---|---|
| 1 | Common | Fast Recovery succeeds | Minimal—recovery in ~1 RTT |
| 2 | Occasional | Fast Recovery + Timeout | Severe—adds full RTO delay |
| 3+ | During heavy congestion | Multiple timeouts possible | Catastrophic—connection stalls |
| Tail loss | All traffic patterns | No dup ACKs to trigger FR | Always requires timeout |
When congestion is worst (multiple losses per window), Reno performs most poorly. This is backwards—we want robust recovery precisely when the network is stressed. Reno's performance cliff during heavy congestion motivated the development of TCP NewReno, which handles partial ACKs correctly.
Implementing TCP Reno correctly requires attention to several subtle details that aren't immediately obvious from the high-level algorithm description. Let's examine the key implementation considerations.
```c
/* Production-Quality Reno Implementation Considerations.
 * Helper routines (retransmit_from_ack, maybe_send_new_data,
 * go_back_n_retransmit) are assumed to be provided by the stack. */
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t cwnd;
    uint32_t ssthresh;
    uint32_t mss;
    uint8_t  dupacks;
    bool     in_fast_recovery;
    uint32_t recover;    /* Highest SEQ sent when entering FR */
    uint32_t high_data;  /* Highest data sequence sent */
    uint32_t high_ack;   /* Highest ACK received */
} TCPRenoState;

/*
 * Key Implementation Detail #1: The Recover Variable
 *
 * Purpose: Track whether we're still in recovery for the original loss.
 *
 * When Fast Recovery begins, record the highest sequence number sent.
 * Only exit Fast Recovery when an ACK arrives that acknowledges
 * data at or beyond this point.
 *
 * This prevents exiting FR on partial ACKs (important for correctness).
 */
void enter_fast_recovery(TCPRenoState* s) {
    s->ssthresh = s->cwnd / 2;
    if (s->ssthresh < 2 * s->mss) {
        s->ssthresh = 2 * s->mss;
    }
    s->recover = s->high_data;           /* Record recovery point */
    retransmit_from_ack(s->high_ack);    /* Retransmit first missing segment */
    s->cwnd = s->ssthresh + 3 * s->mss;  /* Inflate window */
    s->in_fast_recovery = true;
}

/*
 * Key Implementation Detail #2: Not Entering FR Multiple Times
 *
 * If we're already in Fast Recovery and see 3 more dup ACKs,
 * we should NOT restart Fast Recovery. The multiplier effect
 * would be devastating (halving ssthresh repeatedly).
 */
void on_dup_ack(TCPRenoState* s) {
    s->dupacks++;
    if (s->dupacks >= 3) {
        if (!s->in_fast_recovery) {
            /* First time: enter Fast Recovery */
            enter_fast_recovery(s);
        } else {
            /* Already in FR: just inflate the window */
            s->cwnd += s->mss;
            maybe_send_new_data(s);
        }
    }
}

/*
 * Key Implementation Detail #3: Handling the Recovery ACK
 *
 * RFC 6582 (NewReno) defines the "full ACK" as one that acknowledges
 * data at or beyond the 'recover' point.
 */
void on_ack(TCPRenoState* s, uint32_t ack_num, uint32_t bytes_acked) {
    (void)bytes_acked;  /* not needed in this sketch */
    if (s->in_fast_recovery) {
        if (ack_num >= s->recover) {
            /* FULL ACK - exit Fast Recovery */
            s->cwnd = s->ssthresh;  /* Deflate to normal */
            s->in_fast_recovery = false;
            s->dupacks = 0;
        } else {
            /* PARTIAL ACK - more losses exist */
            /* Original Reno: exit anyway (problematic) */
            /* NewReno: stay in FR, retransmit next loss */
        }
    } else {
        s->dupacks = 0;
        /* Normal Slow Start or Congestion Avoidance */
        if (s->cwnd < s->ssthresh) {
            s->cwnd += s->mss;                       /* Slow Start */
        } else {
            s->cwnd += (s->mss * s->mss) / s->cwnd;  /* CA */
        }
    }
    s->high_ack = ack_num;
}

/*
 * Key Implementation Detail #4: Timeout During Fast Recovery
 *
 * If RTO fires while in Fast Recovery, it indicates severe problems.
 * Full reset to Slow Start is appropriate.
 */
void on_timeout(TCPRenoState* s) {
    s->ssthresh = s->cwnd / 2;
    if (s->ssthresh < 2 * s->mss) {
        s->ssthresh = 2 * s->mss;
    }
    s->cwnd = s->mss;  /* Reset to 1 MSS */
    s->in_fast_recovery = false;
    s->dupacks = 0;
    go_back_n_retransmit();
}
```

While TCP Reno was the dominant implementation through the 1990s and early 2000s, its role in today's networks is largely historical. Understanding where Reno stands today helps contextualize its importance and limitations.
Current Deployment:
Pure TCP Reno is rarely used in modern systems. Most implementations have evolved to at least NewReno (which fixes the multiple-loss problem) or beyond (CUBIC, BBR). However, Reno's algorithms form the foundation that all subsequent variants build upon.
| TCP Variant | Usage | Where Found |
|---|---|---|
| Tahoe | Essentially none | Educational implementations, legacy embedded |
| Reno | Rare (<1%) | Very old systems, some embedded devices |
| NewReno | Declining (~5%) | Older Linux kernels, some BSD systems |
| CUBIC | Dominant (~65%) | Linux default since 2.6.19 (2006), macOS, Android |
| BBR | Growing (~20%) | Google services, Linux option, some CDNs |
| Others | ~10% | DCTCP (data centers), specialized variants |
Why Study Reno?
Despite its limited contemporary deployment, studying Reno remains essential:
Conceptual Foundation: Fast Recovery's principles apply to all subsequent variants. Understanding Reno means understanding the building blocks.
Debugging Insight: Network performance issues often relate to congestion control behavior. Knowing Reno helps diagnose problems even in CUBIC/BBR systems.
Protocol Evolution: Reno's limitations motivated specific innovations in NewReno, CUBIC, and BBR. Knowing Reno explains why later variants work as they do.
Academic Standard: Much of the congestion control literature models or compares against Reno. Reading research requires understanding Reno's behavior.
NewReno, SACK-based variants, and even CUBIC are often called 'Reno-like' because they share the same core structure: Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery. The differences lie in how aggressively they grow during CA and how they handle specific loss scenarios.
TCP Reno represents a critical evolutionary step in congestion control, introducing the concept of differentiated loss response through Fast Recovery. Let's consolidate the essential concepts:
What's Next:
In the next page, we explore TCP NewReno, which directly addresses Reno's multiple-loss weakness. NewReno's clever handling of partial ACKs allows it to recover from multiple losses within a single Fast Recovery episode, avoiding the catastrophic timeouts that plague Reno during congestion events.
You now have comprehensive understanding of TCP Reno—its motivation, Fast Recovery mechanism, performance characteristics, and limitations. You can explain the differences from Tahoe, analyze Reno's behavior in various scenarios, and understand why its multiple-loss weakness necessitated further evolution.