Fast Recovery - Learning Module

Fast Recovery Phase

The Evolution of TCP Loss Recovery

In the landscape of TCP congestion control, the Fast Recovery phase represents one of the most significant algorithmic innovations in the protocol's evolution. While Fast Retransmit (covered in Module 1) enables TCP to detect and respond to packet loss without waiting for expensive timeout events, it was the introduction of Fast Recovery that truly transformed TCP's ability to maintain high throughput in the face of transient network conditions.

To fully appreciate Fast Recovery's importance, we must first understand the problem it was designed to solve. Before its introduction, even when TCP could quickly detect loss through duplicate ACKs, the sender's response was devastatingly conservative: it would reset the congestion window (cwnd) to 1 MSS and restart slow start from scratch. This behavior, while safe, was deeply inefficient on networks where packet loss events were typically isolated incidents rather than indicators of severe congestion collapse.

What You Will Learn

This page provides comprehensive coverage of the Fast Recovery phase, including: the fundamental concept and objectives of Fast Recovery, state machine transitions and lifecycle, relationship between Fast Recovery and congestion window management, the self-clocking mechanism during recovery, conditions for entering and exiting Fast Recovery, and the critical distinction between Fast Recovery and slow start.

Understanding Fast Recovery

Fast Recovery is a congestion control mechanism that allows TCP to maintain relatively high throughput during and after packet loss events by avoiding the costly return to slow start. Introduced as part of TCP Reno, Fast Recovery fundamentally changed how TCP responds to detected packet loss.

The Core Insight:

The key insight behind Fast Recovery is deceptively simple yet profoundly impactful: when duplicate ACKs arrive at the sender, they indicate that packets are still successfully reaching the receiver. Each duplicate ACK confirms that:

At least one packet has left the network (arrived at receiver)
The network path is still functioning
Only specific packets appear to be lost, not the entire pipe

This observation leads to a critical realization: if data is still flowing through the network, perhaps severe congestion window reduction isn't necessary. Instead of assuming catastrophic congestion and restarting from scratch, TCP can reduce its sending rate more moderately and continue transmitting.

Fast Recovery Objectives

•Maintain Network Utilization — Keep the network pipeline reasonably full even during loss recovery, preventing unnecessary throughput collapse.
•Avoid Slow Start Penalty — Bypass the expensive exponential growth phase that would otherwise follow a loss event, saving potentially dozens of RTTs of recovery time.
•React Proportionally — Reduce the sending rate in proportion to actual network conditions rather than assuming the worst-case scenario.
•Preserve Flow Continuity — Ensure that the continuous flow of acknowledgments (the 'ACK clock') is maintained to enable ongoing transmission.
•Efficient Bandwidth Probing — Allow TCP to quickly probe available bandwidth after recovering from loss without the long ramp-up of slow start.

Historical Context

Fast Recovery was introduced in 1990 by Van Jacobson and first implemented in TCP Reno. It built upon the Fast Retransmit mechanism that had been developed earlier. Together, these two mechanisms represented a fundamental shift from timeout-based loss detection to proactive, acknowledgment-based recovery strategies.

The Fast Recovery State Machine

Understanding Fast Recovery requires viewing TCP's congestion control as a state machine with well-defined transitions. The Fast Recovery phase is one such state, entered under specific conditions and exited when recovery completes or fails.

State Machine Overview:

TCP congestion control operates in one of several states:

Slow Start: Initial exponential growth phase
Congestion Avoidance: Linear growth after reaching threshold
Fast Recovery: Loss recovery without slow start penalty

The transitions between these states are triggered by specific events: acknowledgments, timeouts, and duplicate ACKs.

Entry Conditions for Fast Recovery:

TCP enters the Fast Recovery phase when the following conditions are met:

Three Duplicate ACKs Received: The sender receives three duplicate acknowledgments, each repeating the same acknowledgment number (four identical ACKs in total, counting the original).
Fast Retransmit Triggered: Immediately upon receiving the third duplicate ACK, TCP performs Fast Retransmit of the presumed lost segment.
Threshold Adjustment: Before entering Fast Recovery, TCP sets the slow start threshold (ssthresh) to half of the current flight size.
Window Inflation: The congestion window is set to ssthresh plus 3×MSS (accounting for the three duplicate ACKs already received).

Fast Recovery Entry Actions
Step	Action	Purpose	Formula
1	Record FlightSize	Capture current bytes in flight	FlightSize = SND.NXT - SND.UNA
2	Set ssthresh	Reduce threshold to half	ssthresh = max(FlightSize/2, 2×MSS)
3	Retransmit lost segment	Fast Retransmit	Retransmit segment at SND.UNA
4	Set cwnd	Inflate window for recovery	cwnd = ssthresh + 3×MSS
5	Enter Fast Recovery	Change state	State = FAST_RECOVERY
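
The entry steps in the table can be sketched in Python. This is a minimal illustration, not a real TCP stack: the RenoSender class, its field names, and the 1460-byte MSS are assumptions made for the example, and retransmission is modelled by simply recording which sequence number would be resent.

```python
from dataclasses import dataclass
from typing import Optional

MSS = 1460  # assumed segment size in bytes, for illustration only


@dataclass
class RenoSender:
    """Hypothetical sender state; fields mirror the table, not a real stack."""
    snd_una: int                      # oldest unacknowledged byte (SND.UNA)
    snd_nxt: int                      # next byte to be sent (SND.NXT)
    cwnd: int                         # congestion window in bytes
    ssthresh: int                     # slow start threshold in bytes
    state: str = "CONGESTION_AVOIDANCE"
    retransmitted: Optional[int] = None


def enter_fast_recovery(s: RenoSender) -> None:
    """Apply the five entry steps on the third duplicate ACK."""
    flight_size = s.snd_nxt - s.snd_una            # step 1: bytes in flight
    s.ssthresh = max(flight_size // 2, 2 * MSS)    # step 2: halve, floor of 2 MSS
    s.retransmitted = s.snd_una                    # step 3: resend segment at SND.UNA
    s.cwnd = s.ssthresh + 3 * MSS                  # step 4: inflate by three segments
    s.state = "FAST_RECOVERY"                      # step 5: change state


sender = RenoSender(snd_una=0, snd_nxt=64 * MSS, cwnd=64 * MSS, ssthresh=128 * MSS)
enter_fast_recovery(sender)
print(sender.ssthresh // MSS, sender.cwnd // MSS, sender.state)  # 32 35 FAST_RECOVERY
```

With 64 segments in flight, the threshold becomes 32 MSS and the inflated window 35 MSS, the same figures used in the Reno example later on this page.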

The 3×MSS Addition

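The addition of 3×MSS to the congestion window when entering Fast Recovery compensates for the three segments that have left the network (as indicated by the three duplicate ACKs). Because FlightSize still counts those segments, new data is usually released only after further duplicate ACKs inflate cwnd past FlightSize; the 3×MSS credit ensures the three ACKs that triggered recovery are counted toward that point rather than wasted.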

Behavior During Fast Recovery

Once TCP enters the Fast Recovery phase, its behavior changes significantly from normal congestion avoidance. The primary goal during this phase is to maintain the 'ACK clock' - the self-clocking mechanism that paces TCP transmission based on acknowledgment arrival.

ACK Clock Preservation:

The ACK clock is TCP's natural pacing mechanism. When packets arrive at the receiver, they trigger ACKs that travel back to the sender. These incoming ACKs permit the sender to transmit new data. This creates a self-regulating feedback loop that naturally adapts to network conditions.

During Fast Recovery, maintaining this clock is critical. Each duplicate ACK, while not acknowledging new data, does indicate that one segment has left the network and been received. This information is used to 'inflate' the congestion window temporarily.

Actions for Each Duplicate ACK During Fast Recovery

•Window Inflation — For each additional duplicate ACK received, increment cwnd by 1 MSS. This reflects that another packet has left the network.
•Transmit New Data — If the updated cwnd allows (cwnd - FlightSize ≥ MSS), transmit new data segments. This keeps the pipe full during recovery.
•Maintain Recovery State — Continue tracking the recovery point (the highest sequence number transmitted when Fast Recovery began).
•No ssthresh Modification — The slow start threshold remains unchanged during the recovery phase.

Pseudocode

// Fast Recovery: handling one duplicate ACK (after RFC 5681)
 
ON receiving duplicate ACK during Fast Recovery:
    // Step 1: Inflate the congestion window
    cwnd = cwnd + MSS
    
    // Step 2: Calculate bytes the sender still counts as in flight
    // (SND.UNA has not advanced, so duplicate ACKs do not shrink this)
    FlightSize = SND.NXT - SND.UNA
    
    // Step 3: Check if new data can be sent
    // (a real sender is also bounded by the receiver's advertised window)
    IF (cwnd - FlightSize >= MSS) THEN
        // Whole segments that fit in the inflated window
        available = floor((cwnd - FlightSize) / MSS)
        
        FOR i = 1 TO available:
            // SND.MAX is assumed to mean the end of the data queued for sending
            IF (SND.NXT < SND.MAX) THEN
                send_new_segment(SND.NXT)
                SND.NXT = SND.NXT + MSS
            END IF
        END FOR
    END IF
    
    // Step 4: Remain in Fast Recovery state
    // (Exit only occurs on new ACK or timeout)

The Window Inflation Rationale:

Window inflation during Fast Recovery may seem counterintuitive at first. Why increase cwnd when we've just detected a loss? The answer lies in understanding what the congestion window actually represents.

The cwnd limits the number of bytes that can be outstanding (unacknowledged) in the network. When a duplicate ACK arrives, it signals that a previously transmitted packet has reached the receiver and is now buffered there (otherwise, no ACK would be generated). This packet has effectively 'left the network' from the sender's perspective.

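By incrementing cwnd, TCP compensates for this reduction in outstanding bytes, which the sender's own FlightSize calculation (SND.NXT - SND.UNA) cannot see because SND.UNA is stuck at the missing segment. Once the inflated cwnd exceeds FlightSize, each further duplicate ACK releases one new segment, so the amount of data actually in the network settles near ssthresh instead of draining to zero.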
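
A small simulation makes the inflation arithmetic concrete. It is a sketch under simplifying assumptions that are not part of the original text: exactly one segment is lost, every delivered segment produces one duplicate ACK, the application always has data to send, and the receiver window never limits the sender. All quantities are in MSS units, and simulate_dup_acks is a hypothetical helper.

```python
def simulate_dup_acks(window_mss, dup_acks):
    """Track cwnd and FlightSize (in MSS units) across a run of duplicate ACKs.

    Returns (cwnd, flight, new_segments_sent, dup_ack_index_of_first_send).
    """
    flight = window_mss              # SND.NXT - SND.UNA; SND.UNA is stuck at the hole
    ssthresh = max(flight // 2, 2)
    cwnd = window_mss
    new_segments = 0
    first_send_at = None
    for n in range(1, dup_acks + 1):
        if n < 3:
            continue                 # below the Fast Retransmit threshold
        if n == 3:
            cwnd = ssthresh + 3      # entry: retransmit, then inflate by three
        else:
            cwnd += 1                # one more segment has left the network
        while cwnd - flight >= 1:    # room for another full segment
            flight += 1              # sending new data advances SND.NXT
            new_segments += 1
            if first_send_at is None:
                first_send_at = n
    return cwnd, flight, new_segments, first_send_at


print(simulate_dup_acks(64, 63))     # (95, 95, 31, 33)
```

For a 64-segment window, the first new segment goes out on the 33rd duplicate ACK, and 31 new segments are sent by the time all 63 duplicate ACKs have arrived. The sender then counts 95 MSS as outstanding, but 63 of those are already buffered at the receiver, so about 32 MSS (the retransmission plus the 31 new segments) are actually in transit, which matches ssthresh.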

Visualizing Window Inflation

Think of the network path as a pipe. Each duplicate ACK indicates water has drained from the end of the pipe (packet arrived at receiver). Fast Recovery first lets the pipe drain to about half its previous level, then adds water at the entrance (transmits new data) at the same rate it drains. From that point the level in the pipe stays constant: you're not adding congestion, just maintaining steady flow at the reduced level.

Exiting Fast Recovery

Fast Recovery is a transient phase - TCP must eventually exit it. There are two possible exit paths, each with dramatically different implications for subsequent behavior.

Exit Condition 1: Successful Recovery (New ACK)

The desired exit from Fast Recovery occurs when TCP receives an ACK that acknowledges new data - specifically, data at or beyond the sequence number that was retransmitted. This 'new ACK' indicates that:

The retransmitted segment has been successfully received
All segments up to that point have been delivered
The network has recovered from whatever condition caused the loss

Upon receiving a new ACK, TCP performs 'window deflation' and transitions to Congestion Avoidance.

Fast Recovery Exit Actions (New ACK)
Step	Action	Purpose	Formula
1	Deflate cwnd	Remove inflation	cwnd = ssthresh
2	Exit Fast Recovery	Change state	State = CONGESTION_AVOIDANCE
3	Continue with AIMD	Resume normal operation	cwnd += MSS×MSS/cwnd per ACK
4	Reset dup ACK counter	Clear recovery state	dupACKcount = 0
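
The exit steps, and the additive increase that follows, might look like this in Python. The function names and the 1460-byte MSS are assumptions for the example; the one-byte floor on the increment reflects integer arithmetic, where MSS×MSS/cwnd could otherwise round down to zero for very large windows.

```python
from typing import Tuple

MSS = 1460  # assumed segment size in bytes, for illustration only


def exit_on_new_ack(ssthresh: int) -> Tuple[int, str, int]:
    """Return (cwnd, state, dup_ack_count) after the ACK that ends recovery."""
    return ssthresh, "CONGESTION_AVOIDANCE", 0   # deflate, change state, reset counter


def congestion_avoidance_ack(cwnd: int) -> int:
    """Additive increase: roughly one MSS per RTT, spread over the window's ACKs."""
    return cwnd + max(MSS * MSS // cwnd, 1)


cwnd, state, dup_acks = exit_on_new_ack(ssthresh=32 * MSS)
for _ in range(32):                  # about one RTT of ACKs at cwnd = 32 MSS
    cwnd = congestion_avoidance_ack(cwnd)
print(state, round(cwnd / MSS, 2))
```

After one round trip of ACKs the window has grown by just under one MSS, which is the linear growth that the AIMD row of the table describes.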

Exit Condition 2: Recovery Failure (Timeout)

If Fast Recovery fails - meaning the retransmitted segment is also lost, or multiple segments need retransmission - the sender will eventually experience a retransmission timeout (RTO). This timeout triggers the most conservative response:

ssthresh reduction: Set to half of the current flight size (may already be set from Fast Recovery entry)
cwnd reset: Congestion window drops to 1 MSS
Slow Start restart: TCP returns to the slow start phase and must exponentially grow cwnd from scratch

This represents a significant performance penalty and highlights why Fast Recovery's success is crucial for maintaining throughput.
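
For contrast with the successful exit, the timeout response can be sketched as below. The function name and the dictionary layout are hypothetical, and the 2×MSS floor on ssthresh is carried over from the entry table rather than stated in the timeout steps themselves.

```python
MSS = 1460  # assumed segment size in bytes, for illustration only


def on_retransmission_timeout(flight_size: int) -> dict:
    """Return the sender's congestion state after an RTO."""
    return {
        "ssthresh": max(flight_size // 2, 2 * MSS),  # halve, with a two-segment floor
        "cwnd": MSS,                                  # back to a single segment
        "state": "SLOW_START",                        # exponential growth restarts
    }


after = on_retransmission_timeout(flight_size=35 * MSS)
print(after["ssthresh"] // MSS, after["cwnd"] // MSS, after["state"])  # 17 1 SLOW_START
```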

Successful Recovery

•New ACK acknowledges retransmitted data
•cwnd deflates to ssthresh
•Transitions to Congestion Avoidance
•Maintains approximately half of original rate
•Linear growth resumes immediately
•Minimal throughput disruption

Recovery Failure

•Retransmission timeout occurs
•cwnd drops to 1 MSS
•Transitions to Slow Start
•Rate drops by factor of N (where N = original cwnd/MSS)
•Exponential growth from scratch
•Full recovery may take many RTTs

Partial ACKs and Recovery

In basic TCP Reno, receiving an ACK that advances the window but doesn't acknowledge all data outstanding when recovery began (a 'partial ACK') causes an exit from Fast Recovery. The sender then needs three more duplicate ACKs to re-enter Fast Recovery for the next lost segment, and falls back to a timeout if too few arrive. This behavior, inefficient for multiple losses, was addressed in TCP NewReno with improved partial ACK handling.

Fast Recovery vs. Slow Start

Understanding the difference between Fast Recovery and Slow Start is essential for appreciating why Fast Recovery was such a significant advancement. Both phases involve managing the congestion window, but their philosophies and impacts are fundamentally different.

Slow Start's Approach:

Slow Start begins with an extremely conservative assumption: we know nothing about the network's capacity. Starting from cwnd = 1 MSS (or a small initial value), the sender doubles cwnd for each RTT. While 'slow' is a misnomer (it's actually exponential growth), the starting point is so low that reaching full capacity can take many RTTs.

Fast Recovery's Approach:

Fast Recovery assumes the network was recently working at a known capacity (the cwnd before loss detection) and that isolated losses don't indicate complete congestion collapse. It halves the rate (ssthresh = cwnd/2) but maintains transmission during recovery, then immediately resumes from that halfway point in Congestion Avoidance.

Detailed Comparison: Fast Recovery vs. Slow Start
Aspect	Slow Start	Fast Recovery
Initial cwnd	1-4 MSS (varies by implementation)	ssthresh + 3×MSS
Growth Pattern	Exponential (2× per RTT)	Maintained or slight increase during recovery
Philosophy	Probe from zero	Reduce from known working rate
Assumption	Network capacity unknown	Recent capacity is good estimate
Trigger	Timeout or connection start	3 duplicate ACKs + Fast Retransmit
ACK Response	cwnd += MSS per new ACK	cwnd += MSS per duplicate ACK (inflation)
Network Impact	Minimal initial load	Maintains substantial load
Recovery Time	O(log₂(cwnd_optimal))	O(1) RTT for single loss
Best For	Unknown/severely congested networks	Transient, isolated packet loss

Quantifying the Difference:

Consider a TCP connection that was operating with cwnd = 64 MSS before detecting a loss:

With Slow Start (Tahoe-style response):

ssthresh = 32 MSS, cwnd drops to 1 MSS
After 1 RTT: cwnd = 2 MSS
After 2 RTTs: cwnd = 4 MSS
After 3 RTTs: cwnd = 8 MSS
After 5 RTTs: cwnd = 32 MSS (ssthresh reached; slow start ends)
Total to regain half the original rate: 5+ RTTs, most of them spent far below it

With Fast Recovery (Reno-style response):

ssthresh = 32 MSS, cwnd = 35 MSS (32 + 3)
Recovery completes in ~1 RTT (upon new ACK)
cwnd deflates to 32 MSS
Connection continues at ~50% of original rate immediately

In both cases the climb from 32 MSS back to 64 MSS is linear (about 1 MSS per RTT, so roughly 32 RTTs), because ssthresh was halved either way. The difference lies in getting to 32 MSS: for a network with 100ms RTT, the Tahoe-style response needs 500+ ms, while Fast Recovery gets there in approximately 100ms. This difference is significant for user-perceived performance and overall throughput.
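
The doubling arithmetic in this example can be checked with a few lines of Python. The sketch assumes idealized slow start (cwnd doubles exactly once per RTT starting from 1 MSS, with no delayed ACKs); slow_start_rtts is a hypothetical helper, and the 100 ms RTT is the figure used above.

```python
def slow_start_rtts(target_mss: int, initial_mss: int = 1) -> int:
    """Count the RTTs of per-RTT doubling needed for cwnd to reach target_mss."""
    cwnd, rtts = initial_mss, 0
    while cwnd < target_mss:
        cwnd *= 2
        rtts += 1
    return rtts


RTT_MS = 100
for target in (32, 64):
    n = slow_start_rtts(target)
    print(f"reach {target} MSS: {n} RTTs, about {n * RTT_MS} ms")
# reach 32 MSS: 5 RTTs, about 500 ms
# reach 64 MSS: 6 RTTs, about 600 ms
```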

The Practical Impact

Fast Recovery's efficiency makes it the preferred loss recovery mechanism for modern networks. Instead of treating every loss as an emergency requiring complete restart, it responds proportionally to the evidence available - three duplicate ACKs indicate a functioning network path, justifying a moderate rather than extreme response.

Self-Clocking During Recovery

One of Fast Recovery's most elegant aspects is how it leverages TCP's self-clocking mechanism to maintain flow during the recovery process. This self-clocking, also known as 'ACK-clocking,' is fundamental to understanding why Fast Recovery works.

The Self-Clocking Principle:

TCP's transmission rate is naturally regulated by the acknowledgment stream. When a packet arrives at the receiver, an ACK is generated and sent back. Only when an ACK arrives at the sender can new data be transmitted (within cwnd limits). This creates a natural feedback loop:

Packets leave sender at rate limited by cwnd
Packets traverse network with inherent delay
Packets arrive at receiver, triggering ACKs
ACKs traverse network back to sender
ACKs permit new data transmission

This cycle creates a steady-state where the transmission rate automatically matches the network's capacity - the 'ACK clock' paces the sender.

Why Self-Clocking Matters During Fast Recovery:

When packet loss occurs, the self-clocking mechanism is at risk of breaking down. Without incoming ACKs, the sender cannot transmit new data. The network pipeline begins to drain, and if not carefully managed, the connection could stall.

Fast Recovery preserves the clock through window inflation:

Duplicate ACKs as Clock Ticks: Each duplicate ACK indicates a packet reached the receiver. Even though it doesn't acknowledge new data, it represents a 'clock tick' - proof that the path is functioning.
Window Inflation Permits Transmission: By increasing cwnd with each duplicate ACK, Fast Recovery creates room to send new data. This new data will generate future ACKs, keeping the clock running.
Recovery Point Tracking: TCP tracks the highest sequence number sent when recovery began. Only ACKs beyond this point indicate full recovery.
Continuous Flow: Throughout recovery, data continues flowing in both directions - retransmissions, new data (if cwnd permits), duplicate ACKs, and eventually new ACKs.

The Clock Metaphor

Think of TCP's flow as a clock where ACKs are the pendulum. During normal operation, each ACK swings the pendulum, permitting new transmission. During Fast Recovery, duplicate ACKs act as weaker pendulum swings - they're not acknowledging new data, but they still indicate the clock is running and prevent it from stopping entirely.

Recovery Point and Partial ACKs

A critical concept in Fast Recovery is the recovery point - the sequence number that marks the boundary between the 'lost segment region' and segments transmitted during recovery. Understanding this boundary is essential for correct recovery behavior.

Defining the Recovery Point:

When TCP enters Fast Recovery, it records the highest sequence number that has been transmitted:

recovery_point = SND.NXT - 1

This recovery point serves as a threshold. Only when TCP receives an ACK acknowledging data at or beyond this point can it conclude that recovery is complete. This ensures that:

All segments outstanding when loss was detected have been acknowledged
The retransmitted segment has been received
The receiver has successfully reconstructed the stream up to the recovery point

ACK Types During Fast Recovery
ACK Type	Definition	Fast Recovery Action	State Transition
Duplicate ACK	ACK number unchanged (no new data acknowledged)	Inflate cwnd by MSS	Remain in Fast Recovery
Partial ACK	ACK number advances but does not cover the recovery point	Treated as a new ACK (deflate, exit); re-entry needs three more duplicate ACKs	Exit, possibly re-enter (Reno)
Full ACK	ACK number covers the recovery point (ACK > recovery_point)	Deflate cwnd, complete recovery	Exit to Congestion Avoidance
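
The three ACK types can be told apart with two comparisons. In the sketch below, classify_ack is a hypothetical helper, ack_no is the cumulative acknowledgment number (the next byte the receiver expects), and sequence-number wraparound is ignored for simplicity. Because an ACK number names the next expected byte, "covers the recovery point" becomes ack_no > recovery_point.

```python
def classify_ack(ack_no: int, snd_una: int, recovery_point: int) -> str:
    """Classify an ACK received during Fast Recovery.

    recovery_point is the highest byte sent when recovery began (SND.NXT - 1).
    """
    if ack_no <= snd_una:
        return "duplicate"           # no new data acknowledged
    if ack_no > recovery_point:
        return "full"                # everything outstanding at entry is covered
    return "partial"                 # progress, but another hole remains


# Ten 1000-byte segments (bytes 0..9999) were sent; segment 3 starts at byte 2000.
recovery_point = 10_000 - 1
print(classify_ack(2000, snd_una=2000, recovery_point=recovery_point))    # duplicate
print(classify_ack(6000, snd_una=2000, recovery_point=recovery_point))    # partial
print(classify_ack(10_000, snd_una=2000, recovery_point=recovery_point))  # full
```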

The Partial ACK Problem:

A partial ACK occurs when the receiver acknowledges some new data but not all data up to the recovery point. This typically indicates that multiple segments were lost in the same window of data.

Consider this scenario:

Sender has transmitted segments 1-10
Segments 3 and 7 are lost
Receiver gets 1, 2, then 4-6, 8-10
Duplicate ACKs trigger Fast Recovery
Sender retransmits segment 3
Receiver gets segment 3, sends ACK for sequence up to segment 6
This is a partial ACK (expected ACK was for segment 10)
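
The receiver's side of this scenario can be replayed in Python to see where the duplicate ACKs and the partial ACK come from. The sketch works in whole segment numbers rather than byte sequence numbers, and cumulative_ack is a hypothetical helper that reports the highest segment received in order.

```python
from typing import Set


def cumulative_ack(received: Set[int]) -> int:
    """Return the largest n such that segments 1..n have all arrived."""
    n = 0
    while n + 1 in received:
        n += 1
    return n


received: Set[int] = set()
acks = []
for seg in [1, 2, 4, 5, 6, 8, 9, 10]:      # segments 3 and 7 are lost
    received.add(seg)
    acks.append(cumulative_ack(received))
print(acks)                                 # [1, 2, 2, 2, 2, 2, 2, 2]

received.add(3)                             # the retransmission of segment 3 arrives
print(cumulative_ack(received))             # 6
```

Segments 4-6 and 8-10 each repeat the acknowledgment for segment 2, giving six duplicate ACKs. Once segment 3 is filled in, the acknowledgment jumps to segment 6 and stops at the second hole, short of segment 10.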

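In TCP Reno, partial ACKs cause the sender to exit Fast Recovery and deflate cwnd. To repair the next loss it must collect three more duplicate ACKs and re-enter Fast Recovery, or wait for a retransmission timeout if the reduced window yields too few. This inefficiency was addressed in later TCP variants.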

Reno's Partial ACK Weakness

TCP Reno's handling of partial ACKs leads to multiple halvings of ssthresh when multiple segments are lost in one window. Each exit and re-entry of Fast Recovery halves the threshold again. This can cause severe throughput reduction that's disproportionate to the actual congestion level.

Improved Partial ACK Handling (NewReno Preview):

TCP NewReno introduced improved partial ACK handling that stays in Fast Recovery when receiving partial ACKs:

Immediately retransmit the next unacknowledged segment
Deflate cwnd by the amount of data acknowledged
Add back 1 MSS (for the segment that left the network)
Remain in Fast Recovery until a full ACK is received
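
The four steps above translate into a short window adjustment. This is a sketch only: newreno_partial_ack is a hypothetical function, the 1460-byte MSS is an assumption, and the condition that the one-MSS credit applies only when at least one full MSS was acknowledged follows the NewReno specification (RFC 6582) rather than the list itself.

```python
from typing import Tuple

MSS = 1460  # assumed segment size in bytes, for illustration only


def newreno_partial_ack(cwnd: int, snd_una: int, ack_no: int) -> Tuple[int, int, int]:
    """Return (new_cwnd, new_snd_una, seq_to_retransmit) for a partial ACK."""
    acked = ack_no - snd_una         # new data covered by this ACK
    cwnd -= acked                    # deflate by the amount acknowledged
    if acked >= MSS:
        cwnd += MSS                  # add back one MSS
    # The first unacknowledged segment now starts at ack_no; retransmit it
    # and stay in Fast Recovery.
    return cwnd, ack_no, ack_no


# Segment 3 (starting at byte 2*MSS) was repaired; the ACK stops at segment 7.
cwnd, snd_una, resend = newreno_partial_ack(cwnd=40 * MSS, snd_una=2 * MSS, ack_no=6 * MSS)
print(cwnd // MSS, snd_una // MSS, resend // MSS)  # 37 6 6
```

The window shrinks by the four segments acknowledged and regains one, and ssthresh is left untouched, which is how NewReno avoids the repeated halving described above.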

This approach prevents multiple ssthresh halvings and provides more efficient recovery from multiple losses. We'll explore this in detail in the Reno implementation page.

Summary: Fast Recovery Phase

The Fast Recovery phase represents a fundamental advancement in TCP congestion control, enabling networks to maintain high utilization even in the presence of transient packet loss. Let's consolidate the key concepts covered:

Key Takeaways

•Fast Recovery avoids slow start — By halving the congestion window rather than resetting it, Fast Recovery maintains approximately 50% of the original throughput immediately after loss detection.
•Entry is triggered by Fast Retransmit — Upon receiving 3 duplicate ACKs, TCP performs Fast Retransmit and immediately enters Fast Recovery with cwnd = ssthresh + 3×MSS.
•Window inflation maintains flow — Each duplicate ACK during Fast Recovery inflates cwnd by 1 MSS, permitting new data transmission and preserving the ACK clock.
•Exit requires a new ACK — Fast Recovery completes when an ACK acknowledges data beyond the recovery point, causing cwnd to deflate to ssthresh.
•Timeouts cause fallback to slow start — If Fast Recovery fails and a timeout occurs, TCP falls back to slow start with cwnd = 1 MSS.
•Self-clocking is preserved — The key insight is that duplicate ACKs indicate packets are leaving the network, justifying continued transmission during recovery.
•Partial ACKs indicate multiple losses — When ACKs don't fully acknowledge the recovery region, additional segments were likely lost, requiring further recovery action.

What's Next:

With the Fast Recovery phase mechanism understood, we'll explore how TCP responds to congestion signals during and after recovery. The next page examines the congestion response strategies that determine how aggressively TCP reduces its sending rate.

Page Complete

You now understand the Fast Recovery phase - its purpose, mechanics, state transitions, and relationship to TCP's self-clocking mechanism. This foundation is essential for understanding the sophisticated congestion response strategies covered in the following pages.
