Fast Retransmit - Learning Module

Retransmission Trigger

From Signal to Action

In the previous page, we established that duplicate acknowledgments serve as TCP's implicit signal of potential packet loss. But a signal alone is meaningless without a well-defined response. The retransmission trigger is the precise mechanism by which TCP converts the observation of duplicate ACKs into decisive action—specifically, the immediate retransmission of the presumed-lost segment.

This trigger is one of TCP's most impactful performance optimizations. Before fast retransmit (introduced in TCP Tahoe, 1988), TCP had only one recovery mechanism: the retransmission timeout (RTO). An RTO can last hundreds of milliseconds to several seconds, during which the connection sits idle. Fast retransmit, triggered by duplicate ACKs, can begin recovery about one round-trip time after the loss, which is often 10x to 100x sooner than waiting for the RTO.

What You Will Master

By the end of this page, you will understand: (1) the precise conditions under which fast retransmit triggers, (2) the algorithm for selecting the segment to retransmit, (3) how the trigger interacts with retransmission timers, (4) edge cases and failure modes, and (5) how different TCP variants implement this mechanism.

The Three Duplicate ACK Threshold

The fast retransmit algorithm is simple in its core logic. RFC 5681 (Section 3.2) specifies, in paraphrase:

When the third duplicate ACK is received, TCP SHOULD retransmit the segment that appears to be missing (i.e., the segment beginning at the sequence number in the ACK field of the duplicate ACK), without waiting for the retransmission timer to expire.

The operative phrase is "third duplicate ACK": not the second, not the fourth. Three is the standard threshold and the default in virtually all TCP implementations, although some stacks expose it as a tunable (Linux's net.ipv4.tcp_reordering sysctl, for example, defaults to 3).

Counting Semantics:

The counting of duplicate ACKs requires careful attention:

Original ACK: The first ACK with a given acknowledgment number is the original, not a duplicate
Duplicate #1: The second ACK with the same number is duplicate #1
Duplicate #2: The third ACK is duplicate #2
Duplicate #3: The fourth ACK is duplicate #3 → TRIGGER!

So when we say "three duplicate ACKs," we mean the sender has received four total ACKs with the same acknowledgment number: one original plus three duplicates.

ACK Counting Sequence Leading to Fast Retransmit
ACK Received	ACK Number	Classification	Dup Count	Action
1st	5000	Original ACK	0	Normal processing; slide window
2nd	5000	Duplicate #1	1	Note: possibly reordering
3rd	5000	Duplicate #2	2	Likely loss; continue monitoring
4th	5000	Duplicate #3	3	FAST RETRANSMIT triggered!
5th	5000	Duplicate #4	4	In fast recovery (if Reno+)
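
The counting rule in the table can be sketched in a few lines of Python. This is a minimal illustration, not an implementation from any TCP stack: the function name fast_retransmit_index is hypothetical, ACK numbers are assumed to be non-decreasing, and every repeated ACK is assumed to satisfy the RFC 5681 duplicate criteria.

```python
from typing import Iterable, Optional


def fast_retransmit_index(ack_numbers: Iterable[int], threshold: int = 3) -> Optional[int]:
    """Return the 0-based index of the ACK that triggers fast retransmit, or None."""
    last_ack = None
    dupacks = 0
    for index, ack in enumerate(ack_numbers):
        if last_ack is not None and ack == last_ack:
            # Same ACK number as before: this one counts as a duplicate.
            dupacks += 1
            if dupacks == threshold:
                return index
        else:
            # First ACK carrying this number: the original, not a duplicate.
            last_ack = ack
            dupacks = 0
    return None


# Five ACKs for 5000: the trigger is the 4th ACK (index 3), not the 3rd.
print(fast_retransmit_index([5000, 5000, 5000, 5000, 5000]))  # 3
print(fast_retransmit_index([5000, 5000, 5000]))              # None
```

Three ACKs with the same number yield only two duplicates, which is why the second call returns None.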

Common Misconception

Many explanations incorrectly state that fast retransmit occurs after "receiving 3 ACKs with the same acknowledgment number." This is off-by-one. It occurs after receiving 4 ACKs with the same number: original + 3 duplicates. The distinction matters for implementation correctness.

Why Exactly Three?

As discussed in the previous page, three represents the empirically-validated balance between:

Avoiding false positives: Single or double reordering events are common; treating them as loss wastes bandwidth on unnecessary retransmissions and triggers congestion control penalties
Minimizing recovery delay: Waiting for more duplicates delays recovery; the goal is to retransmit as soon as we're confident loss occurred

Research by Vern Paxson and others in the 1990s established that network reordering rarely exceeds 2-3 packets. The threshold of 3 duplicates provides the safety margin while enabling rapid recovery.

The Fast Retransmit Algorithm

The fast retransmit algorithm, as specified in RFC 5681 and refined over subsequent RFCs, can be expressed in precise algorithmic form. While implementations vary in detail, the core logic is universal.

Formal Algorithm (RFC 5681 Section 3.2):

fast_retransmit_algorithm.pseudo
// Fast Retransmit Algorithm (RFC 5681)
// Called when an ACK is received
 
function process_ack(ack):
    if ack.ack_number > snd_una:
        // NEW ACK - acknowledges new data
        dupacks = 0
        snd_una = ack.ack_number
        
        // Exit fast recovery if we were in it
        if state == FAST_RECOVERY:
            exit_fast_recovery()
            cwnd = ssthresh  // Deflate window
        
        // Normal ACK processing
        update_rtt_estimator(ack)
        advance_send_window()
        maybe_send_new_data()
        
    else if ack.ack_number == snd_una:
        // DUPLICATE ACK - same ack number as before
        
        if is_duplicate_ack(ack):  // Check RFC 5681 criteria
            dupacks = dupacks + 1
            
            if dupacks == 3:
                // FAST RETRANSMIT TRIGGER!
                
                // Step 1: Reduce ssthresh to half the data in flight
                ssthresh = max(FlightSize / 2, 2 * SMSS)
                
                // Step 2: Retransmit the missing segment
                retransmit_segment(snd_una)
                
                // Step 3: Enter fast recovery (TCP Reno and later)
                cwnd = ssthresh + 3 * SMSS
                state = FAST_RECOVERY
                
            else if dupacks > 3 AND state == FAST_RECOVERY:
                // Each additional dup ACK inflates cwnd
                cwnd = cwnd + SMSS
                maybe_send_new_data()  // Transmit if cwnd permits
 
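// Helper: Check if ACK qualifies as duplicate (RFC 5681, Section 2)
function is_duplicate_ack(ack):
    return (snd_nxt > snd_una) AND                 // (a) sender has outstanding data
           (ack.data_length == 0) AND              // (b) ACK carries no data
           (NOT ack.has_SYN_flag) AND              // (c) SYN and FIN are both off
           (NOT ack.has_FIN_flag) AND
           (ack.ack_number == snd_una) AND         // (d) same ACK number as the highest seen
           (ack.window == last_advertised_window)  // (e) advertised window unchanged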

Algorithm Breakdown:

Step 1: Threshold Calculation

ssthresh = max(FlightSize / 2, 2 * SMSS)

FlightSize: Amount of data currently in flight (sent but not yet acknowledged)
SMSS: Sender Maximum Segment Size
The /2 reflects the AIMD (Additive Increase, Multiplicative Decrease) principle
The max with 2 * SMSS ensures we never set ssthresh below 2 segments
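
A short worked example may make the threshold formula concrete. The sketch below assumes byte-based accounting and an SMSS of 1460 bytes; compute_ssthresh is an illustrative name, not a function from any real stack.

```python
def compute_ssthresh(flight_size: int, smss: int) -> int:
    """ssthresh = max(FlightSize / 2, 2 * SMSS), using integer bytes."""
    return max(flight_size // 2, 2 * smss)


SMSS = 1460  # assumed segment size in bytes

# Ten segments in flight: half of 14600 is 7300 bytes (five segments).
print(compute_ssthresh(10 * SMSS, SMSS))  # 7300

# One segment in flight: half would be 730, so the 2 * SMSS floor applies.
print(compute_ssthresh(1 * SMSS, SMSS))   # 2920
```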

Step 2: Immediate Retransmission

retransmit_segment(snd_una)

snd_una: Send Unacknowledged—the oldest byte not yet acknowledged
The segment beginning at snd_una is exactly what the duplicate ACKs are asking for
This is the segment whose loss caused the receiver to generate duplicate ACKs

Step 3: Enter Fast Recovery (Reno+)

cwnd = ssthresh + 3 * SMSS

The + 3 * SMSS "inflates" cwnd to account for the 3 segments that generated the duplicate ACKs
These segments are buffered at the receiver—they're out of the network
This is the beginning of the fast recovery phase (covered in subsequent pages)
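
The three steps above can be put together in a small runnable model. This is a simplified Reno-style sketch under stated assumptions: the class RenoSender and its fields are hypothetical, the sender's snd_una already equals the ACK number (so the original ACK has been processed), every repeated ACK is assumed to pass the duplicate check, and no data is actually transmitted.

```python
from dataclasses import dataclass, field
from typing import List

SMSS = 1460  # assumed segment size in bytes


@dataclass
class RenoSender:
    snd_una: int
    flight_size: int
    cwnd: int
    ssthresh: int = 65535
    dupacks: int = 0
    in_fast_recovery: bool = False
    retransmitted: List[int] = field(default_factory=list)

    def on_ack(self, ack_number: int) -> None:
        if ack_number > self.snd_una:
            # New data acknowledged: reset the counter, deflate if recovering.
            self.flight_size -= ack_number - self.snd_una
            self.snd_una = ack_number
            self.dupacks = 0
            if self.in_fast_recovery:
                self.cwnd = self.ssthresh
                self.in_fast_recovery = False
        elif ack_number == self.snd_una:
            self.dupacks += 1
            if self.dupacks == 3:
                # Step 1: threshold, Step 2: retransmit, Step 3: inflate cwnd.
                self.ssthresh = max(self.flight_size // 2, 2 * SMSS)
                self.retransmitted.append(self.snd_una)
                self.cwnd = self.ssthresh + 3 * SMSS
                self.in_fast_recovery = True
            elif self.dupacks > 3 and self.in_fast_recovery:
                # Each further duplicate means one more segment left the network.
                self.cwnd += SMSS


sender = RenoSender(snd_una=5000, flight_size=10 * SMSS, cwnd=10 * SMSS)
for _ in range(3):
    sender.on_ack(5000)  # three duplicates of the already-seen ACK 5000
print(sender.retransmitted, sender.ssthresh, sender.cwnd)  # [5000] 7300 11680
```

A fourth duplicate would inflate cwnd by one more SMSS, and the first ACK above 5000 would deflate cwnd back to ssthresh.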

Evolution Across TCP Variants

TCP Tahoe (original) performs fast retransmit but then enters slow start. TCP Reno and later variants enter fast recovery instead, maintaining higher throughput. The retransmission trigger itself is identical—the difference is what happens afterward.

Which Segment to Retransmit

When fast retransmit triggers, the sender must decide which segment to retransmit. This decision seems obvious—retransmit the lost segment—but several subtleties arise in practice:

Without SACK: The Simple Case

When Selective Acknowledgment is not enabled, TCP has limited information. The duplicate ACKs tell us:

All data up to snd_una - 1 has been received
Data at snd_una has NOT been received (or hasn't been received contiguously)

The only logical choice is to retransmit the segment starting at snd_una. There's no information about what else might be lost.

With SACK: Targeted Retransmission

When SACK is negotiated during connection establishment, the situation improves dramatically. SACK blocks in the duplicate ACKs tell the sender exactly which byte ranges have been received:

Duplicate ACK contents:
  ACK: 5000
  SACK: [6460-12000)

Segments that arrive back to back are reported as one contiguous SACK block, and the right edge of a block is the sequence number just past the last byte received. The sender can now deduce:

Bytes 0-4999: Acknowledged (received and in order)
Bytes 5000-6459: The only missing range → Retransmit this
Bytes 6460-11999: Buffered at receiver (via SACK)

This precision matters when multiple segments are lost. Without SACK, after retransmitting the first lost segment, TCP must wait for its ACK before discovering additional losses. With SACK, all losses can be identified (and retransmitted) immediately.
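
The deduction a SACK-aware sender makes can be sketched as a gap computation. This is a simplified illustration, not how any particular stack stores its scoreboard: missing_ranges is a hypothetical helper, ranges are half-open [start, end) as in the SACK option, and sequence-number wraparound is ignored.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) in sequence-number space


def missing_ranges(ack: int, sack_blocks: List[Range]) -> List[Range]:
    """Return the holes between the cumulative ACK and the highest SACKed byte."""
    holes = []
    cursor = ack
    for left, right in sorted(sack_blocks):
        if left > cursor:
            # Nothing has been reported for [cursor, left): presumed missing.
            holes.append((cursor, left))
        cursor = max(cursor, right)
    return holes


# One hole: only the segment starting at 5000 is missing.
print(missing_ranges(5000, [(6460, 12000)]))                # [(5000, 6460)]

# Two holes: both can be identified from a single ACK.
print(missing_ranges(5000, [(6460, 7920), (9380, 12000)]))  # [(5000, 6460), (7920, 9380)]
```

The second call shows why SACK helps with multiple losses: both holes are visible at once, whereas a cumulative ACK alone reveals only the first.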

Retransmission Strategy Comparison
Scenario	Without SACK	With SACK
Single segment lost	Retransmit snd_una segment	Same—retransmit snd_una segment
Multiple segments lost	Retransmit first; wait for ACK; repeat	Retransmit all lost segments immediately
Last segment lost	May need timeout (insufficient dup ACKs)	Same problem: no later data arrives, so no SACK blocks are generated; needs TLP or RTO
Recovery time for N losses	~N × RTT	~1 RTT (all retransmits at once)
Bandwidth efficiency	Lower—unnecessary retransmits possible	Higher—precise retransmission

SACK Is Nearly Universal

As of 2024, SACK is enabled by default on virtually all operating systems (Linux, Windows, macOS, BSDs) and is negotiated in ~95% of TCP connections on the internet. Implementations that don't support SACK are increasingly rare outside of embedded systems.

Interaction with Retransmission Timer

Fast retransmit operates alongside—not instead of—the retransmission timeout (RTO) mechanism. Understanding their interaction is critical for robust TCP implementation.

The Timer States:

RTO Timer Behavior with Fast Retransmit

•Timer Running: When data is in flight, the RTO timer is always running. It's set when the first unacknowledged segment is sent.
•Fast Retransmit Does NOT Cancel Timer: When fast retransmit fires, the RTO timer continues running. It serves as a backup in case the retransmission also fails.
•Timer Reset on Fast Retransmit: Some implementations reset (restart) the timer when fast retransmit occurs. This gives the retransmission a full RTO period to succeed.
•Timer Restart on ACK Progress: When an ACK acknowledges new data (advancing snd_una), the timer is restarted for the new oldest unacknowledged segment. Once all outstanding data is acknowledged, the timer is stopped.
•Timer Expiry = RTO Recovery: If the timer expires before fast retransmit (or despite it), RTO recovery takes over—this is the fallback mechanism.

Race Conditions:

Several scenarios create interesting timing interactions:

Scenario 1: Fast Retransmit Wins (Common Case)

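Time 0ms:    Segment 5 sent, followed by segments 6, 7, 8
Time 10ms:   Segment 5 lost in network
Time 20ms:   Segments 6, 7, 8 arrive at receiver; each triggers a duplicate ACK
Time 40ms:   3 duplicate ACKs arrive at sender
Time 40ms:   Fast retransmit triggered → Segment 5 resent
Time 60ms:   Retransmitted segment 5 arrives at receiver
Time 80ms:   Cumulative ACK covering segments 5-8 arrives at sender
[RTO timer never fires; the new ACK restarts or stops it]

Recovery time: ~80ms (2 RTT at a 40ms RTT)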

Scenario 2: RTO Fires (Duplicate ACKs Insufficient)

Time 0ms:    Segments 5, 6, 7 sent (window = 3)
Time 10ms:   Segment 5 lost
Time 20ms:   Segments 6, 7 arrive → 2 duplicate ACKs (not enough!)
Time 200ms:  RTO timer expires → Segment 5 resent
Time 240ms:  Cumulative ACK covering segments 5-7 arrives (all recovered)
[Slow start initiated due to RTO]

Recovery time: ~240ms (RTO + 1 RTT at a 40ms RTT)
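
The difference between the two scenarios can be approximated with a small latency model. This is a deliberately simplified sketch: the function names are hypothetical, the 40 ms RTT and 200 ms RTO are illustrative assumptions, and transmission and processing times are ignored, so the totals are approximations rather than measurements.

```python
def fast_retransmit_recovery_ms(rtt_ms: float) -> float:
    """One RTT for the duplicate ACKs to return, one more for the retransmission's ACK."""
    return 2 * rtt_ms


def rto_recovery_ms(rto_ms: float, rtt_ms: float) -> float:
    """Wait out the RTO, then one RTT for the retransmission's ACK."""
    return rto_ms + rtt_ms


RTT_MS = 40.0   # assumed round-trip time
RTO_MS = 200.0  # assumed retransmission timeout

fast = fast_retransmit_recovery_ms(RTT_MS)
slow = rto_recovery_ms(RTO_MS, RTT_MS)
print(fast, slow, slow / fast)  # 80.0 240.0 3.0
```

With a one-second RTO, a common minimum outside Linux, the same model gives 1040 ms against 80 ms, which is where the order-of-magnitude claims come from.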

Fast Retransmit vs RTO: Performance Impact
Metric	Fast Retransmit	RTO-Based Recovery
Detection latency	~1 RTT (dup ACK arrival)	RTO (typically 200ms-3s)
Congestion window impact	Roughly halved (ssthresh = max(FlightSize/2, 2*SMSS))	Reset to 1 segment (slow start)
Throughput during recovery	Maintained (fast recovery)	Near-zero during timeout
Connection idle time	None	Full RTO duration
Recovery efficiency	High	Low

RTO Is Still Essential

Fast retransmit is an optimization, not a replacement for RTO. Many loss scenarios cannot generate 3 duplicate ACKs: tail losses, small-window situations, flights with multiple losses. The RTO remains TCP's ultimate fallback and must always be implemented correctly.

Edge Cases and Failure Modes

While fast retransmit is remarkably effective in common scenarios, several edge cases expose its limitations. Understanding these cases is essential for both implementation and debugging.

Edge Case 1: Tail Loss

When the last segment(s) of a transmission are lost, there are no subsequent segments to generate duplicate ACKs. Fast retransmit cannot trigger.

tail_loss_scenario.txt
Scenario: Tail Loss (Last Segment Lost)
 
Sender transmits:
  Seg 1: Seq 1000, Len 500  ✓ Received
  Seg 2: Seq 1500, Len 500  ✓ Received  
  Seg 3: Seq 2000, Len 500  ✓ Received
  Seg 4: Seq 2500, Len 500  ✓ Received
  Seg 5: Seq 3000, Len 500  ❌ LOST (last segment!)
 
Receiver ACKs:
  After Seg 1: ACK 1500
  After Seg 2: ACK 2000
  After Seg 3: ACK 2500
  After Seg 4: ACK 3000
  [No more segments → No more ACKs → No duplicate ACKs]
 
Result: Sender sees ACK 3000, expects ACK 3500
        No duplicate ACKs ever arrive
        Must wait for RTO to recover
 
Recovery: RTO timer fires → Retransmit Seg 5 → Eventually ACK 3500
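
A toy receiver model shows why tail loss produces no duplicate ACKs while a mid-flight loss does. This is a minimal sketch under stated assumptions: cumulative_acks and duplicate_count are hypothetical helpers, the receiver sends one ACK per arriving segment (no delayed ACKs), and segments arrive in send order.

```python
from typing import Dict, List, Set, Tuple


def cumulative_acks(segments: List[Tuple[int, int]], lost: Set[int], first_seq: int) -> List[int]:
    """Return the ACK numbers a receiver emits; segments are (seq, length) in send order."""
    rcv_nxt = first_seq
    buffered: Dict[int, int] = {}
    acks = []
    for seq, length in segments:
        if seq in lost:
            continue  # dropped in the network: the receiver sees nothing
        if seq == rcv_nxt:
            rcv_nxt += length
            # Drain any buffered out-of-order segments that are now contiguous.
            while rcv_nxt in buffered:
                rcv_nxt += buffered.pop(rcv_nxt)
        elif seq > rcv_nxt:
            buffered[seq] = length
        acks.append(rcv_nxt)
    return acks


def duplicate_count(acks: List[int]) -> int:
    """Cumulative ACKs never decrease, so repeats are exactly the duplicates."""
    return len(acks) - len(set(acks))


segments = [(1000, 500), (1500, 500), (2000, 500), (2500, 500), (3000, 500)]

tail = cumulative_acks(segments, lost={3000}, first_seq=1000)
print(tail, duplicate_count(tail))      # [1500, 2000, 2500, 3000] 0

middle = cumulative_acks(segments, lost={1500}, first_seq=1000)
print(middle, duplicate_count(middle))  # [1500, 1500, 1500, 1500] 3
```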

Edge Case 2: Small Window

With a window of 3 or fewer segments, classic fast retransmit is impossible: after a single loss, at most 2 later segments arrive, so the receiver generates at most 2 duplicate ACKs.

Window = 2 Segments

•Segment 1 sent, Segment 2 sent
•Segment 1 lost
•Segment 2 arrives → 1 dup ACK
•Window full—cannot send more
•Only 1 dup ACK—fast retransmit impossible
•Must wait for RTO

Window = 4 Segments

•Segments 1, 2, 3, 4 sent
•Segment 1 lost
•Segments 2, 3, 4 arrive → 3 dup ACKs
•Fast retransmit triggers!
•Recovery completes about 1 RTT after the trigger
•No RTO wait required
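
The window arithmetic behind these two cases is simple enough to write down. The sketch below assumes a single loss, that every segment sent after the lost one arrives, and one ACK per arriving segment; both function names are illustrative.

```python
DUPACK_THRESHOLD = 3


def dupacks_after_single_loss(window_segments: int, lost_position: int = 1) -> int:
    """Duplicate ACKs produced when the segment at lost_position (1-based) is lost."""
    return max(window_segments - lost_position, 0)


def can_fast_retransmit(window_segments: int, lost_position: int = 1) -> bool:
    return dupacks_after_single_loss(window_segments, lost_position) >= DUPACK_THRESHOLD


for window in (2, 3, 4):
    print(window, dupacks_after_single_loss(window), can_fast_retransmit(window))
# 2 1 False
# 3 2 False
# 4 3 True
```

The same arithmetic covers losses near the end of a larger window: with 10 segments in flight, losing the 9th leaves only one later segment, so can_fast_retransmit(10, lost_position=9) is False.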

Edge Case 3: Burst Losses

When multiple consecutive segments are lost, fast retransmit handles only the first loss. Without SACK, subsequent losses require additional RTTs or RTO to discover.

Edge Case 4: ACK Loss

If the duplicate ACKs themselves are lost, the sender never receives the signal. The segment loss is eventually recovered via RTO, but the fast path is lost.

Edge Case 5: Spurious Retransmission

If network conditions cause significant reordering (3+ packets), fast retransmit may trigger spuriously—retransmitting a segment that will arrive on its own. This wastes bandwidth and may cause the sender to incorrectly infer congestion.

Mitigations:

Modern Solutions to Edge Cases

•Tail Loss Probe (TLP): Sends a probe segment after a short probe timeout (PTO, roughly two smoothed RTTs) that is much shorter than the RTO; the ACK it elicits reveals tail loss without waiting for the full RTO
•Early Retransmit: Allows fast retransmit with fewer than 3 dup ACKs when the flight size is known to be small
•RACK (Recent Acknowledgment): Uses timestamps instead of dup ACK counting, detecting loss more reliably
•F-RTO (Forward RTO-Recovery): Detects spurious timeouts and avoids unnecessary congestion response
•Eifel Algorithm: Uses timestamps to detect spurious retransmissions after the fact
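
As one concrete example of these mitigations, the Early Retransmit idea can be sketched as a lowered duplicate-ACK threshold. This loosely follows the segment-based variant described in RFC 5827 and is a simplification: dupack_threshold is a hypothetical name, and restricting the rule to 2 or 3 outstanding segments is a choice made here for clarity.

```python
def dupack_threshold(outstanding_segments: int, has_unsent_data: bool, standard: int = 3) -> int:
    """Return how many duplicate ACKs should trigger a retransmission."""
    if 2 <= outstanding_segments < 4 and not has_unsent_data:
        # Too few segments in flight to ever produce 3 duplicates:
        # lower the bar to the most duplicates a single loss can generate.
        return outstanding_segments - 1
    return standard


print(dupack_threshold(3, has_unsent_data=False))   # 2
print(dupack_threshold(2, has_unsent_data=False))   # 1
print(dupack_threshold(10, has_unsent_data=False))  # 3
print(dupack_threshold(3, has_unsent_data=True))    # 3
```

When unsent data is available, the sender can instead transmit new segments on the first duplicate ACKs to elicit more of them, so the standard threshold is kept.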

Linux Default: RACK

As of Linux 4.11+, RACK is the default loss detection algorithm, supplementing (not replacing) the traditional dup ACK counting. RACK provides faster and more accurate loss detection, particularly in edge cases where duplicate ACK counting fails.

Implementation Across TCP Variants

While the core fast retransmit trigger (3 duplicate ACKs) is consistent across TCP variants, what happens after the trigger differs significantly. This affects both implementation and performance.

TCP Tahoe (1988) — Original Fast Retransmit

Tahoe was the first TCP variant to implement fast retransmit. Its behavior:

Detect 3 duplicate ACKs
Set ssthresh = max(flight_size/2, 2*SMSS)
Retransmit the lost segment
Set cwnd = 1 SMSS (enter slow start!)
Proceed with normal slow start recovery

The problem: entering slow start after fast retransmit is extremely conservative. Throughput drops to near-zero momentarily before rebuilding.

Fast Retransmit Behavior Across TCP Variants
TCP Variant	Year	Post-Trigger Action	Window After Trigger
Tahoe	1988	Slow Start	cwnd = 1 SMSS
Reno	1990	Fast Recovery	cwnd = ssthresh + 3*SMSS
NewReno	1999	Improved Fast Recovery	cwnd = ssthresh + 3*SMSS
SACK-based	1996+	Selective Retransmit	Depends on SACK info
CUBIC	2006+	Fast Recovery + CUBIC curve	Complex formula
BBR	2016+	Model-based (no dup ACK trigger)	Based on BtlBw/RTprop

TCP Reno (1990) — The Standard

Reno introduced fast recovery to complement fast retransmit:

Detect 3 duplicate ACKs
Set ssthresh = max(flight_size/2, 2*SMSS)
Retransmit the lost segment
Set cwnd = ssthresh + 3*SMSS (inflate for buffered segments)
For each additional dup ACK: cwnd += SMSS
On new ACK: exit fast recovery, cwnd = ssthresh

Reno avoids slow start, maintaining ~50% of previous throughput during recovery.
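
The Tahoe and Reno step lists above differ only in the congestion window they set after retransmitting. A small comparison makes the gap visible; window_after_trigger is an illustrative helper, and the 20-segment flight with a 1460-byte SMSS is an assumed example.

```python
from typing import Tuple


def window_after_trigger(variant: str, flight_size: int, smss: int) -> Tuple[int, int]:
    """Return (ssthresh, cwnd) in bytes right after the fast retransmit trigger."""
    ssthresh = max(flight_size // 2, 2 * smss)
    if variant == "tahoe":
        return ssthresh, smss                 # back to slow start
    if variant == "reno":
        return ssthresh, ssthresh + 3 * smss  # enter fast recovery
    raise ValueError(f"unknown variant: {variant}")


SMSS = 1460
FLIGHT = 20 * SMSS  # 29200 bytes in flight when the third duplicate arrives

print(window_after_trigger("tahoe", FLIGHT, SMSS))  # (14600, 1460)
print(window_after_trigger("reno", FLIGHT, SMSS))   # (14600, 18980)
```

Both variants compute the same ssthresh; Tahoe then has to grow cwnd from one segment back up to it, while Reno starts recovery already above it.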

TCP NewReno (1999) — Multiple Loss Handling

NewReno fixes Reno's behavior when multiple segments are lost. Reno would exit fast recovery on the first partial ACK; NewReno stays in fast recovery until all data sent before fast retransmit is acknowledged.
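
The partial-ACK distinction can be sketched as a three-way classification. This is a simplified reading of the NewReno idea, not code from a real stack: classify_ack is a hypothetical name, and recovery_point is assumed to hold the next sequence number to be sent (snd_nxt) at the moment fast retransmit triggered.

```python
def classify_ack(ack_number: int, snd_una: int, recovery_point: int) -> str:
    """Classify an ACK that arrives while the sender is in fast recovery."""
    if ack_number <= snd_una:
        return "duplicate"  # no progress
    if ack_number < recovery_point:
        return "partial"    # progress, but another hole remains at ack_number
    return "full"           # everything sent before the trigger is acknowledged


# Segments at 1000 and 3000 were lost; data up to 6000 was sent before the trigger.
print(classify_ack(1000, snd_una=1000, recovery_point=6000))  # duplicate
print(classify_ack(3000, snd_una=1000, recovery_point=6000))  # partial
print(classify_ack(6000, snd_una=3000, recovery_point=6000))  # full
```

On a "partial" result, NewReno retransmits the segment starting at ack_number and stays in fast recovery, whereas Reno would treat the same ACK as the end of recovery.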

Modern Algorithms: CUBIC, BBR

CUBIC (Linux default since 2.6.19): Uses a cubic function to calculate cwnd, providing more aggressive growth in high-bandwidth networks while remaining fair in lower-bandwidth scenarios. Fast retransmit triggers the same way, but the recovery curve differs.

BBR (Bottleneck Bandwidth and Round-trip propagation time): Takes a fundamentally different approach to congestion control. Rather than treating packet loss as the primary congestion signal, BBR models the path's bottleneck bandwidth and round-trip propagation time and paces sending accordingly. Loss detection and retransmission are still performed by the TCP stack (duplicate ACKs, SACK, or RACK); BBR changes how the sending rate responds, not whether lost segments are retransmitted.

The Trigger Is Universal

Regardless of TCP variant, the fast retransmit trigger—3 duplicate ACKs—remains the same. What differs is the congestion control response. This separation of concerns (loss detection vs. congestion response) is a key principle in modern TCP design.

Debugging Fast Retransmit Issues

In production environments, understanding whether fast retransmit is working correctly—or failing—is essential for debugging performance issues. Here's a systematic approach to analyzing fast retransmit behavior.

Packet Capture Analysis:

The most definitive way to verify fast retransmit is examining packet captures. Look for the characteristic pattern:

wireshark_fast_retransmit.txt
# Wireshark filter: tcp.analysis.fast_retransmission
 
# Expected pattern for fast retransmit:
 
No.   Time      Source    Dest      Length  Info
1     0.000     A         B         1500    Seq=1000 Len=1460
2     0.001     A         B         1500    Seq=2460 Len=1460   [Lost: Seq=2460]
3     0.002     A         B         1500    Seq=3920 Len=1460
4     0.003     A         B         1500    Seq=5380 Len=1460
5     0.004     A         B         1500    Seq=6840 Len=1460
 
6     0.020     B         A         66      Ack=2460
7     0.021     B         A         66      Ack=2460 [TCP Dup ACK #1]
8     0.022     B         A         66      Ack=2460 [TCP Dup ACK #2]
9     0.023     B         A         66      Ack=2460 [TCP Dup ACK #3]
 
10    0.024     A         B         1500    Seq=2460 [TCP Fast Retransmission]
 
11    0.044     B         A         66      Ack=8300 [Cumulative ACK - recovery complete]
 
# Key observations:
# - Packet 2 was lost (sequence gap visible in capture)
# - Packets 7-9 are duplicate ACKs with same ack number
# - Packet 10 is fast retransmission (< RTO, triggered by dup ACKs)
# - Packet 11 shows recovery (ACK jumps past retransmitted data)

ss and netstat Statistics (Linux):

Linux exposes TCP retransmission statistics that can reveal fast retransmit effectiveness:

linux_tcp_stats.sh
# View TCP retransmission statistics
$ ss -ti dst 10.0.0.1
 
# Output includes:
#   retrans:X/Y    - X current, Y total retransmissions for this connection
#   reordering:N   - detected reordering distance
 
# Kernel-wide statistics
$ cat /proc/net/snmp | grep Tcp
# Key fields:
#   RetransSegs    - Total retransmitted segments
#   InSegs         - Total received segments
#   OutSegs        - Total sent segments
 
# Calculate retransmission rate (-a prints absolute counters, -z includes zero counters)
$ nstat -az TcpRetransSegs TcpOutSegs
# RetransSegs / OutSegs = retransmission rate (should be < 1%)
 
# Fast retransmit-specific counters (TcpExt group, sourced from /proc/net/netstat)
$ nstat -az TcpExtTCPFastRetrans TcpExtTCPSlowStartRetrans TcpExtTCPLossProbes
# TcpExtTCPFastRetrans      - Count of fast retransmissions
# TcpExtTCPSlowStartRetrans - Retransmissions during slow start
# TcpExtTCPLossProbes       - TLP probes sent
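
The retransmission-rate calculation can also be scripted. The sketch below parses text in the two-line header/value layout that /proc/net/snmp uses for its Tcp group; the helper names are hypothetical and the SAMPLE values are synthetic, made up for illustration rather than captured from a real host.

```python
from typing import Dict


def parse_tcp_counters(snmp_text: str) -> Dict[str, int]:
    """Pair the 'Tcp:' header line with the 'Tcp:' value line."""
    lines = [line.split() for line in snmp_text.splitlines() if line.startswith("Tcp:")]
    header, values = lines[0], lines[1]
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def retransmission_rate(counters: Dict[str, int]) -> float:
    out_segs = counters["OutSegs"]
    return counters["RetransSegs"] / out_segs if out_segs else 0.0


# Synthetic sample in the /proc/net/snmp layout (values are invented).
SAMPLE = """\
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts InCsumErrors
Tcp: 1 200 120000 -1 100 50 2 1 10 200000 100000 250 0 5 0
"""

counters = parse_tcp_counters(SAMPLE)
print(f"{retransmission_rate(counters):.2%}")  # 0.25%
```

On a Linux host, the same functions could be fed the contents of /proc/net/snmp read with open("/proc/net/snmp").read().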

Common Issues and Causes

•High RTO retransmissions despite packet loss: Indicates fast retransmit isn't triggering. Causes: small window, tail loss, ACK loss, or implementation bugs.
•Spurious fast retransmissions: Too many retransmissions when loss rate is low. Causes: high network reordering exceeding threshold of 3.
•No recovery progress: Connection stalls after fast retransmit. Causes: multiple losses without SACK, wrong segment retransmitted, or receiver-side issues.
•Repeated fast retransmits for same data: Implementation bug—not tracking retransmitted segments properly.

Production Monitoring

For production monitoring, track the ratio of fast retransmits to RTO retransmits. A healthy ratio is >10:1 (most losses recovered via fast retransmit). If RTO retransmits dominate, investigate window sizing, tail loss, or SACK negotiation failures.

Summary: The Heart of Fast Loss Recovery

The retransmission trigger—the mechanism that converts three duplicate ACKs into immediate retransmission—is the heart of TCP's fast loss recovery. It transforms what would be catastrophic connection stalls into brief hiccups, enabling TCP to maintain high throughput over lossy networks.

Key Takeaways

•Three duplicate ACKs is the universal trigger — This means 4 total ACKs with the same acknowledgment number: 1 original + 3 duplicates
•The algorithm is simple but precise — Record ssthresh = cwnd/2, retransmit the segment at snd_una, enter fast recovery (on Reno+)
•Segment selection depends on SACK — Without SACK, only snd_una is retransmitted; with SACK, all lost segments can be identified
•Timer interaction is critical — Fast retransmit doesn't cancel RTO; it provides a faster path while RTO remains the fallback
•Edge cases require mitigations — Tail loss, small windows, and burst losses limit fast retransmit; modern mechanisms (TLP, RACK) address these
•TCP variants differ in post-trigger behavior — Tahoe enters slow start; Reno/NewReno use fast recovery; CUBIC/BBR have specialized responses

What's Next:

With fast retransmit triggered, the sender has retransmitted the lost segment. But the story doesn't end there. The next page explores timeout avoidance—how fast retransmit prevents the devastating performance impact of RTO expiration, and the quantitative analysis of performance improvement.

Page Complete

You now understand the precise mechanics of TCP's fast retransmit trigger—the threshold, the algorithm, the segment selection, timer interaction, edge cases, and variant-specific behaviors. This knowledge is essential for implementing, debugging, and optimizing TCP-based systems.

Retransmission Trigger

From Signal to Action

What You Will Master

The Three Duplicate ACK Threshold

The fast retransmit algorithm is elegantly simple in its core logic. RFC 5681 specifies:

When the third duplicate ACK is received, TCP SHOULD retransmit the segment that appears to be missing (i.e., the segment beginning at the sequence number in the ACK field of the duplicate ACK), without waiting for the retransmission timer to expire.

The operative phrase is "third duplicate ACK"—not the fourth, not any number configurable at runtime, but precisely three. This number is hardcoded in virtually all TCP implementations.

Counting Semantics:

The counting of duplicate ACKs requires careful attention:

Original ACK: The first ACK with a given acknowledgment number is the original, not a duplicate
Duplicate #1: The second ACK with the same number is duplicate #1
Duplicate #2: The third ACK is duplicate #2
Duplicate #3: The fourth ACK is duplicate #3 → TRIGGER!

So when we say "three duplicate ACKs," we mean the sender has received four total ACKs with the same acknowledgment number: one original plus three duplicates.

ACK Counting Sequence Leading to Fast Retransmit
ACK Received	ACK Number	Classification	Dup Count	Action
1st	5000	Original ACK	0	Normal processing; slide window
2nd	5000	Duplicate #1	1	Note: possibly reordering
3rd	5000	Duplicate #2	2	Likely loss; continue monitoring
4th	5000	Duplicate #3	3	FAST RETRANSMIT triggered!
5th	5000	Duplicate #4	4	In fast recovery (if Reno+)

Common Misconception

Why Exactly Three?

As discussed in the previous page, three represents the empirically-validated balance between:

Avoiding false positives: Single or double reordering events are common; treating them as loss wastes bandwidth on unnecessary retransmissions and triggers congestion control penalties
Minimizing recovery delay: Waiting for more duplicates delays recovery; the goal is to retransmit as soon as we're confident loss occurred

Research by Vern Paxson and others in the 1990s established that network reordering rarely exceeds 2-3 packets. The threshold of 3 duplicates provides the safety margin while enabling rapid recovery.

The Fast Retransmit Algorithm

Formal Algorithm (RFC 5681 Section 3.2):

fast_retransmit_algorithm.pseudo
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
// Fast Retransmit Algorithm (RFC 5681)
// Called when an ACK is received
 
function process_ack(ack):
    if ack.ack_number > snd_una:
        // NEW ACK - acknowledges new data
        dupacks = 0
        snd_una = ack.ack_number
        
        // Exit fast recovery if we were in it
        if state == FAST_RECOVERY:
            exit_fast_recovery()
            cwnd = ssthresh  // Deflate window
        
        // Normal ACK processing
        update_rtt_estimator(ack)
        advance_send_window()
        maybe_send_new_data()
        
    else if ack.ack_number == snd_una:
        // DUPLICATE ACK - same ack number as before
        
        if is_duplicate_ack(ack):  // Check RFC 5681 criteria
            dupacks = dupacks + 1
            
            if dupacks == 3:
                // FAST RETRANSMIT TRIGGER!
                
                // Step 1: Record current ssthresh
                ssthresh = max(FlightSize / 2, 2 * SMSS)
                
                // Step 2: Retransmit the missing segment
                retransmit_segment(snd_una)
                
                // Step 3: Enter fast recovery (TCP Reno and later)
                cwnd = ssthresh + 3 * SMSS
                state = FAST_RECOVERY
                
            else if dupacks > 3 AND state == FAST_RECOVERY:
                // Each additional dup ACK inflates cwnd
                cwnd = cwnd + SMSS
                maybe_send_new_data()  // Transmit if cwnd permits
 
// Helper: Check if ACK qualifies as duplicate (RFC 5681)
function is_duplicate_ack(ack):
    return (ack.data_length == 0) AND
           (ack.window == last_advertised_window) AND
           (NOT ack.has_SYN_flag) AND
           (NOT ack.has_FIN_flag) AND
           (ack.ack_number == snd_una)

Algorithm Breakdown:

Step 1: Threshold Calculation

ssthresh = max(FlightSize / 2, 2 * SMSS)

FlightSize: Amount of data currently in flight (sent but not yet acknowledged)
SMSS: Sender Maximum Segment Size
The /2 reflects the AIMD (Additive Increase, Multiplicative Decrease) principle
The max with 2 * SMSS ensures we never set ssthresh below 2 segments

Step 2: Immediate Retransmission

retransmit_segment(snd_una)

snd_una: Send Unacknowledged—the oldest byte not yet acknowledged
The segment beginning at snd_una is exactly what the duplicate ACKs are asking for
This is the segment whose loss caused the receiver to generate duplicate ACKs

Step 3: Enter Fast Recovery (Reno+)

cwnd = ssthresh + 3 * SMSS

The + 3 * SMSS "inflates" cwnd to account for the 3 segments that generated the duplicate ACKs
These segments are buffered at the receiver—they're out of the network
This is the beginning of the fast recovery phase (covered in subsequent pages)

Evolution Across TCP Variants

Which Segment to Retransmit

When fast retransmit triggers, the sender must decide which segment to retransmit. This decision seems obvious—retransmit the lost segment—but several subtleties arise in practice:

Without SACK: The Simple Case

When Selective Acknowledgment is not enabled, TCP has limited information. The duplicate ACKs tell us:

All data up to snd_una - 1 has been received
Data at snd_una has NOT been received (or hasn't been received contiguously)

The only logical choice is to retransmit the segment starting at snd_una. There's no information about what else might be lost.

Converting Mermaid diagram...

With SACK: Targeted Retransmission

When SACK is negotiated during connection establishment, the situation improves dramatically. SACK blocks in the duplicate ACKs tell the sender exactly which byte ranges have been received:

Duplicate ACK contents:
  ACK: 5000
  SACK: [6460-7920], [7920-9380], [9380-10840], [10840-12000]

The sender can now deduce:

Bytes 0-4999: Acknowledged (received and in order)
Bytes 5000-6459: The only missing range → Retransmit this
Bytes 6460-12000: Buffered at receiver (via SACK)

Retransmission Strategy Comparison
Scenario	Without SACK	With SACK
Single segment lost	Retransmit snd_una segment	Same—retransmit snd_una segment
Multiple segments lost	Retransmit first; wait for ACK; repeat	Retransmit all lost segments immediately
Last segment lost	May need timeout (insufficient dup ACKs)	SACK identifies loss; can retransmit
Recovery time for N losses	~N × RTT	~1 RTT (all retransmits at once)
Bandwidth efficiency	Lower—unnecessary retransmits possible	Higher—precise retransmission

SACK Is Nearly Universal

Interaction with Retransmission Timer

Fast retransmit operates alongside—not instead of—the retransmission timeout (RTO) mechanism. Understanding their interaction is critical for robust TCP implementation.

The Timer States:

RTO Timer Behavior with Fast Retransmit

•Timer Running: When data is in flight, the RTO timer is always running. It's set when the first unacknowledged segment is sent.
•Fast Retransmit Does NOT Cancel Timer: When fast retransmit fires, the RTO timer continues running. It serves as a backup in case the retransmission also fails.
•Timer Reset on Fast Retransmit: Some implementations reset (restart) the timer when fast retransmit occurs. This gives the retransmission a full RTO period to succeed.
•Timer Cancel on ACK Progress: When new data is acknowledged (ACK advances past snd_una), the timer is reset for the new oldest unacknowledged segment.
•Timer Expiry = RTO Recovery: If the timer expires before fast retransmit (or despite it), RTO recovery takes over—this is the fallback mechanism.

Race Conditions:

Several scenarios create interesting timing interactions:

Scenario 1: Fast Retransmit Wins (Common Case)

Time 0ms:    Segment 5 sent
Time 10ms:   Segment 5 lost in network
Time 20ms:   Segments 6, 7, 8 arrive at receiver
Time 40ms:   3 duplicate ACKs arrive at sender
Time 40ms:   Fast retransmit triggered → Segment 5 resent
Time 60ms:   Retransmitted segment 5 acknowledged
[RTO timer never fires—cancelled by ACK]

Recovery time: ~60ms (2 RTT)

Scenario 2: RTO Fires (Duplicate ACKs Insufficient)

Time 0ms:    Segments 5, 6, 7 sent (window = 3)
Time 10ms:   Segment 5 lost
Time 20ms:   Segments 6, 7 arrive → 2 duplicate ACKs (not enough!)
Time 200ms:  RTO timer expires → Segment 5 resent
Time 220ms:  ACK for segment 7 arrives (all recovered)
[Slow start initiated due to RTO]

Recovery time: ~220ms (RTO penalty)

Fast Retransmit vs RTO: Performance Impact
Metric	Fast Retransmit	RTO-Based Recovery
Detection latency	~1 RTT (dup ACK arrival)	RTO (typically 200ms-3s)
Congestion window impact	Halved (ssthresh = cwnd/2)	Reset to 1 segment (slow start)
Throughput during recovery	Maintained (fast recovery)	Near-zero during timeout
Connection idle time	None	Full RTO duration
Recovery efficiency	High	Low

RTO Is Still Essential

Edge Cases and Failure Modes

While fast retransmit is remarkably effective in common scenarios, several edge cases expose its limitations. Understanding these cases is essential for both implementation and debugging.

Edge Case 1: Tail Loss

When the last segment(s) of a transmission are lost, there are no subsequent segments to generate duplicate ACKs. Fast retransmit cannot trigger.

tail_loss_scenario.txt
Scenario: Tail Loss (Last Segment Lost)
 
Sender transmits:
  Seg 1: Seq 1000, Len 500  ✓ Received
  Seg 2: Seq 1500, Len 500  ✓ Received  
  Seg 3: Seq 2000, Len 500  ✓ Received
  Seg 4: Seq 2500, Len 500  ✓ Received
  Seg 5: Seq 3000, Len 500  ❌ LOST (last segment!)
 
Receiver ACKs:
  After Seg 1: ACK 1500
  After Seg 2: ACK 2000
  After Seg 3: ACK 2500
  After Seg 4: ACK 3000
  [No more segments → No more ACKs → No duplicate ACKs]
 
Result: Sender sees ACK 3000, expects ACK 3500
        No duplicate ACKs ever arrive
        Must wait for RTO to recover
 
Recovery: RTO timer fires → Retransmit Seg 5 → Eventually ACK 3500
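To see why tail loss produces no signal, it may help to simulate the receiver's cumulative ACKs. The sketch below assumes a simplified receiver that ACKs every arriving segment immediately (no delayed ACKs, no SACK); the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def cumulative_acks(arrivals: Iterable[Tuple[int, int]],
                    first_seq: int) -> List[int]:
    """ACK numbers emitted, one per arriving (seq, length) segment."""
    expected = first_seq
    buffered: Dict[int, int] = {}  # out-of-order segments: seq -> length
    acks: List[int] = []
    for seq, length in arrivals:
        buffered[seq] = length
        # Advance past every segment that is now contiguous.
        while expected in buffered:
            expected += buffered.pop(expected)
        acks.append(expected)
    return acks


def count_dup_acks(acks: List[int]) -> int:
    """Number of repeats of the final ACK number (0 if it appears once)."""
    return acks.count(acks[-1]) - 1 if acks else 0


# Tail loss: segments 1-4 arrive, segment 5 (Seq 3000) never does.
tail = cumulative_acks([(1000, 500), (1500, 500), (2000, 500), (2500, 500)], 1000)
# Mid-stream loss: segment 2 (Seq 1500) is lost, segments 3-5 arrive.
mid = cumulative_acks([(1000, 500), (2000, 500), (2500, 500), (3000, 500)], 1000)
print(tail, count_dup_acks(tail))
print(mid, count_dup_acks(mid))
```

In the tail-loss trace every ACK number is distinct, so the duplicate count stays at zero and only the RTO can recover the segment.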

Edge Case 2: Small Window

With a congestion window of fewer than 4 segments, classic fast retransmit cannot trigger: at most 2 segments follow the lost one, so at most 2 duplicate ACKs can come back.

Window = 2 Segments

•Segment 1 sent, Segment 2 sent
•Segment 1 lost
•Segment 2 arrives → 1 dup ACK
•Window full—cannot send more
•Only 1 dup ACK—fast retransmit impossible
•Must wait for RTO

Window = 4 Segments

•Segments 1, 2, 3, 4 sent
•Segment 1 lost
•Segments 2, 3, 4 arrive → 3 dup ACKs
•Fast retransmit triggers!
•Loss detected in ~1 RTT; recovery completes about 1 RTT later
•No RTO wait required
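The window comparison above reduces to counting the segments that follow the lost one. A minimal sketch, assuming a single loss, no new data sent after the window fills, and one ACK per arriving segment (the helper names are hypothetical):

```python
def max_dup_acks(window_segments: int, lost_index: int = 0) -> int:
    """Dup ACKs generated when segment `lost_index` of the window is lost."""
    if not 0 <= lost_index < window_segments:
        raise ValueError("lost_index must fall inside the window")
    # Each segment sent after the lost one triggers exactly one dup ACK.
    return window_segments - lost_index - 1


def fast_retransmit_possible(window_segments: int, lost_index: int = 0,
                             threshold: int = 3) -> bool:
    """True if enough dup ACKs can arrive to reach the threshold."""
    return max_dup_acks(window_segments, lost_index) >= threshold
```

The same count explains tail loss: when the lost segment is the last one in the window, `max_dup_acks` returns 0 regardless of window size.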

Edge Case 3: Burst Losses

When multiple consecutive segments are lost, fast retransmit handles only the first loss. Without SACK, subsequent losses require additional RTTs or RTO to discover.

Edge Case 4: ACK Loss

If the duplicate ACKs themselves are lost, the sender never receives the signal. The segment loss is eventually recovered via RTO, but the fast path is lost.

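Edge Case 5: Spurious Retransmission

When the network reorders packets by three or more positions, the receiver emits three duplicate ACKs even though nothing was lost. The sender then retransmits a segment that was merely delayed and reduces its congestion window unnecessarily.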

Mitigations:

Modern Solutions to Edge Cases

•Tail Loss Probe (TLP): Proactively sends a probe segment after a short probe timeout (PTO, roughly 2 smoothed RTTs, well below the RTO), generating ACKs that reveal tail loss without waiting for the full RTO
•Early Retransmit: Allows fast retransmit with fewer than 3 dup ACKs when the flight size is known to be small
•RACK (Recent Acknowledgment): Uses timestamps instead of dup ACK counting, detecting loss more reliably
•F-RTO (Forward RTO-Recovery): Detects spurious timeouts and avoids unnecessary congestion response
•Eifel Algorithm: Uses timestamps to detect spurious retransmissions after the fact
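As one concrete illustration, the Early Retransmit idea can be sketched as a reduced duplicate-ACK threshold. The rule below loosely follows the segment-based variant described in RFC 5827 (threshold lowered to outstanding segments minus one when fewer than 4 segments are in flight and nothing new can be sent). The function name is hypothetical, and treating a single outstanding segment as "no reduction" is an assumption of this sketch, since no duplicate ACK can arrive in that case anyway.

```python
def dupack_threshold(outstanding_segments: int, has_unsent_data: bool) -> int:
    """Duplicate-ACK threshold under a segment-based Early Retransmit rule."""
    if 1 < outstanding_segments < 4 and not has_unsent_data:
        # Small flight and nothing more to send: at most
        # outstanding_segments - 1 dup ACKs can ever arrive.
        return outstanding_segments - 1
    # Otherwise keep the standard threshold.
    return 3
```

With 2 segments outstanding and no unsent data the threshold drops to 1, so a single duplicate ACK is enough to trigger the retransmission.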

Linux Default: RACK

Implementation Across TCP Variants

While the core fast retransmit trigger (3 duplicate ACKs) is consistent across TCP variants, what happens after the trigger differs significantly. This affects both implementation and performance.

TCP Tahoe (1988) — Original Fast Retransmit

Tahoe was the first TCP variant to implement fast retransmit. Its behavior:

1. Detect 3 duplicate ACKs
2. Set ssthresh = max(flight_size/2, 2*SMSS)
3. Retransmit the lost segment
4. Set cwnd = 1 SMSS (enter slow start!)
5. Proceed with normal slow start recovery

The problem: entering slow start after fast retransmit is extremely conservative. Throughput drops to near-zero momentarily before rebuilding.

Fast Retransmit Behavior Across TCP Variants
TCP Variant	Year	Post-Trigger Action	Window After Trigger
Tahoe	1988	Slow Start	cwnd = 1 SMSS
Reno	1990	Fast Recovery	cwnd = ssthresh + 3*SMSS
NewReno	1999	Improved Fast Recovery	cwnd = ssthresh + 3*SMSS
SACK-based	1996+	Selective Retransmit	Depends on SACK info
CUBIC	2006+	Fast Recovery + CUBIC curve	cwnd reduced to ~0.7 × cwnd, then cubic growth
BBR	2016+	Model-based rate control (loss still triggers the retransmit)	Based on BtlBw/RTprop

TCP Reno (1990) — The Standard

Reno introduced fast recovery to complement fast retransmit:

1. Detect 3 duplicate ACKs
2. Set ssthresh = max(flight_size/2, 2*SMSS)
3. Retransmit the lost segment
4. Set cwnd = ssthresh + 3*SMSS (inflate for buffered segments)
5. For each additional dup ACK: cwnd += SMSS
6. On new ACK: exit fast recovery, cwnd = ssthresh

Reno avoids slow start, maintaining ~50% of previous throughput during recovery.
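The Tahoe and Reno step lists can be compared side by side in code. This is a sketch of the window arithmetic only, under stated assumptions: `SMSS` is fixed at 1460 bytes, windows are tracked in bytes, and the `CongestionState` class and function names are hypothetical rather than taken from any real stack.

```python
from dataclasses import dataclass

SMSS = 1460  # assumed sender maximum segment size, in bytes


@dataclass
class CongestionState:
    cwnd: int
    ssthresh: int
    in_fast_recovery: bool = False


def on_third_dup_ack(flight_size: int, variant: str) -> CongestionState:
    """Window state right after the fast retransmit trigger fires."""
    ssthresh = max(flight_size // 2, 2 * SMSS)
    if variant == "tahoe":
        # Tahoe collapses to one segment and re-enters slow start.
        return CongestionState(cwnd=SMSS, ssthresh=ssthresh)
    if variant == "reno":
        # Reno inflates by the 3 segments known to have left the network.
        return CongestionState(cwnd=ssthresh + 3 * SMSS, ssthresh=ssthresh,
                               in_fast_recovery=True)
    raise ValueError(f"unknown variant: {variant}")


def on_additional_dup_ack(state: CongestionState) -> None:
    # Each further dup ACK means another segment has left the network.
    if state.in_fast_recovery:
        state.cwnd += SMSS


def on_new_ack(state: CongestionState) -> None:
    # New data acknowledged: deflate the window and leave fast recovery.
    if state.in_fast_recovery:
        state.cwnd = state.ssthresh
        state.in_fast_recovery = False
```

For a flight of 10 segments, the Tahoe branch leaves a 1-segment window while the Reno branch leaves ssthresh plus 3 segments, which is the difference the text describes.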

TCP NewReno (1999) — Multiple Loss Handling


Modern Algorithms: CUBIC, BBR

The Trigger Is Universal

Debugging Fast Retransmit Issues

Packet Capture Analysis:

The most definitive way to verify fast retransmit is examining packet captures. Look for the characteristic pattern:

wireshark_fast_retransmit.txt
# Wireshark filter: tcp.analysis.fast_retransmission
 
# Expected pattern for fast retransmit:
 
No.   Time      Source    Dest      Length  Info
1     0.000     A         B         1500    Seq=1000 Len=1460
2     0.001     A         B         1500    Seq=2460 Len=1460   [Lost: Seq=2460]
3     0.002     A         B         1500    Seq=3920 Len=1460
4     0.003     A         B         1500    Seq=5380 Len=1460
5     0.004     A         B         1500    Seq=6840 Len=1460
 
6     0.020     B         A         66      Ack=2460
7     0.021     B         A         66      Ack=2460 [TCP Dup ACK #1]
8     0.022     B         A         66      Ack=2460 [TCP Dup ACK #2]
9     0.023     B         A         66      Ack=2460 [TCP Dup ACK #3]
 
10    0.024     A         B         1500    Seq=2460 [TCP Fast Retransmission]
 
11    0.044     B         A         66      Ack=8300 [Cumulative ACK - recovery complete]
 
# Key observations:
# - Packet 2 was lost (sequence gap visible in capture)
# - Packets 7-9 are duplicate ACKs with same ack number
# - Packet 10 is fast retransmission (< RTO, triggered by dup ACKs)
# - Packet 11 shows recovery (ACK jumps past retransmitted data)
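The same pattern can be searched for programmatically once a capture has been reduced to a list of events. The sketch below assumes a hypothetical pre-parsed format, a list of `("data", seq)` and `("ack", ack_number)` tuples in capture order as seen at the sender; it does not read pcap files and is not how Wireshark itself classifies packets.

```python
from typing import List, Optional, Set, Tuple

Event = Tuple[str, int]  # ("data", seq_number) or ("ack", ack_number)


def find_fast_retransmit(events: List[Event],
                         threshold: int = 3) -> Optional[int]:
    """Index of the first segment resent right after `threshold` dup ACKs."""
    last_ack: Optional[int] = None
    dup_count = 0
    sent: Set[int] = set()
    for i, (kind, number) in enumerate(events):
        if kind == "ack":
            if number == last_ack:
                dup_count += 1          # same ACK number again: a duplicate
            else:
                last_ack, dup_count = number, 0
        elif kind == "data":
            # A resend of the segment the dup ACKs are asking for.
            if number in sent and number == last_ack and dup_count >= threshold:
                return i
            sent.add(number)
    return None


capture = [("data", 1000), ("data", 2460), ("data", 3920), ("data", 5380),
           ("data", 6840), ("ack", 2460), ("ack", 2460), ("ack", 2460),
           ("ack", 2460), ("data", 2460), ("ack", 8300)]
print(find_fast_retransmit(capture))
```

The function returns a zero-based index into the event list, so the retransmission shown as packet 10 in the capture corresponds to index 9.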

ss and netstat Statistics (Linux):

Linux exposes TCP retransmission statistics that can reveal fast retransmit effectiveness:

linux_tcp_stats.sh
# View TCP retransmission statistics
$ ss -ti dst 10.0.0.1
 
# Output includes:
#   retrans:X/Y    - X current, Y total retransmissions for this connection
#   reordering:N   - detected reordering distance
 
# Kernel-wide statistics
$ cat /proc/net/snmp | grep Tcp
# Key fields:
#   RetransSegs    - Total retransmitted segments
#   InSegs         - Total received segments
#   OutSegs        - Total sent segments
 
# Calculate retransmission rate
$ nstat -z TcpRetransSegs TcpOutSegs
# RetransSegs / OutSegs = retransmission rate (should be < 1%)
 
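# Fast retransmit-specific counters (TcpExt group of /proc/net/netstat)
# Note: the field names there start with "TCP", so a case-sensitive grep for "Tcp(Fast|Loss)" matches nothing
$ nstat -az TcpExtTCPFastRetrans TcpExtTCPSlowStartRetrans TcpExtTCPLossProbes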
# TCPFastRetrans  - Count of fast retransmissions
# TCPSlowStartRetrans - Retransmissions during slow start
# TCPLossProbes - TLP probes sent
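The retransmission-rate calculation can also be scripted. The sketch below assumes the usual two-line layout of the Tcp group in /proc/net/snmp (a `Tcp:` line of field names followed by a `Tcp:` line of values); the function names are hypothetical, and the parser works on a string so it can be tried on any saved copy of the file.

```python
from typing import Dict


def parse_snmp_tcp(text: str) -> Dict[str, int]:
    """Pair the 'Tcp:' header line with the 'Tcp:' value line."""
    rows = [line.split()[1:] for line in text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) != 2:
        raise ValueError("expected one Tcp: header line and one Tcp: value line")
    names, values = rows
    return {name: int(value) for name, value in zip(names, values)}


def retransmission_rate(counters: Dict[str, int]) -> float:
    """RetransSegs / OutSegs, or 0.0 when nothing has been sent."""
    out_segs = counters["OutSegs"]
    return counters["RetransSegs"] / out_segs if out_segs else 0.0


# Example: on a Linux host, pass in the contents of /proc/net/snmp, e.g.
#   with open("/proc/net/snmp") as f:
#       print(retransmission_rate(parse_snmp_tcp(f.read())))
```

Because these counters are cumulative since boot, sampling them twice and dividing the differences gives the rate over a recent interval rather than a lifetime average.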

Common Issues and Causes

•High RTO retransmissions despite packet loss: Indicates fast retransmit isn't triggering. Causes: small window, tail loss, ACK loss, or implementation bugs.
•Spurious fast retransmissions: Too many retransmissions when loss rate is low. Causes: high network reordering exceeding threshold of 3.
•No recovery progress: Connection stalls after fast retransmit. Causes: multiple losses without SACK, wrong segment retransmitted, or receiver-side issues.
•Repeated fast retransmits for same data: Implementation bug—not tracking retransmitted segments properly.

Production Monitoring

Summary: The Heart of Fast Loss Recovery

Key Takeaways

•Three duplicate ACKs is the universal trigger — This means 4 total ACKs with the same acknowledgment number: 1 original + 3 duplicates
•The algorithm is simple but precise — Record ssthresh = cwnd/2, retransmit the segment at snd_una, enter fast recovery (on Reno+)
•Segment selection depends on SACK — Without SACK, only snd_una is retransmitted; with SACK, all lost segments can be identified
•Timer interaction is critical — Fast retransmit doesn't cancel RTO; it provides a faster path while RTO remains the fallback
•Edge cases require mitigations — Tail loss, small windows, and burst losses limit fast retransmit; modern mechanisms (TLP, RACK) address these
•TCP variants differ in post-trigger behavior — Tahoe enters slow start; Reno/NewReno use fast recovery; CUBIC/BBR have specialized responses
