Understanding fast retransmit conceptually is essential, but understanding its implementation reveals the real-world engineering decisions that make it work reliably at scale. Production TCP stacks—in Linux, Windows, FreeBSD, and network devices—implement fast retransmit with careful attention to edge cases, performance, and interoperability.
This page bridges theory and practice, examining the data structures, state machines, and code patterns that bring fast retransmit to life. Whether you're implementing a custom TCP stack, debugging production issues, or simply deepening your understanding, this implementation-focused perspective is invaluable.
By the end of this page, you will understand: (1) the data structures used to track retransmission state, (2) the state machine governing fast retransmit and recovery, (3) pseudo-code and real code patterns, (4) platform-specific implementation differences, and (5) common bugs and debugging techniques.
Fast retransmit implementation requires several key data structures to track connection state, identify losses, and manage recovery. Let's examine each component.
Transmission Control Block (TCB):
The TCB is the primary per-connection data structure, containing all state for a TCP connection. Fast retransmit uses these fields:
```c
// TCP Control Block - Fast Retransmit Fields
struct tcp_sock {
    // === Sequence Number State ===
    u32 snd_una;            // Send unacknowledged - oldest unacked byte
    u32 snd_nxt;            // Send next - next byte to transmit
    u32 snd_wnd;            // Send window (from receiver)

    // === Congestion Control State ===
    u32 cwnd;               // Congestion window (segments or bytes)
    u32 ssthresh;           // Slow start threshold

    // === Fast Retransmit State ===
    u32 dupacks;            // Count of duplicate ACKs received
    u32 high_seq;           // Highest seq sent at time of fast recovery entry
    u32 prior_ssthresh;     // ssthresh before entering fast recovery
    u8  frto_counter;       // F-RTO algorithm counter

    // === State Flags ===
    u8  ca_state;           // Congestion control state (OPEN, DISORDER, CWR, RECOVERY, LOSS)

    // === SACK State (if enabled) ===
    struct tcp_sack_block recv_sack_cache[4]; // SACK blocks from receiver
    u32 sacked_out;         // Count of SACKed segments
    u32 lost_out;           // Estimated lost segments (based on SACK)
    u32 retrans_out;        // Segments currently being retransmitted

    // === Scoreboard (for SACK-based recovery) ===
    struct rb_root retransmit_tree; // Red-black tree of sent segments

    // === Timing ===
    u32 rto;                // Retransmission timeout (in jiffies/ms)
    u64 retrans_stamp;      // Time of last retransmission
};

// Congestion control states
enum tcp_ca_state {
    TCP_CA_Open     = 0,    // Normal operation
    TCP_CA_Disorder = 1,    // Received dupacks, possible reordering
    TCP_CA_CWR      = 2,    // Congestion Window Reduced (ECN)
    TCP_CA_Recovery = 3,    // Fast recovery in progress
    TCP_CA_Loss     = 4,    // RTO fired, in loss recovery
};
```

The Send Buffer and Retransmission Queue:
TCP must maintain sent-but-unacknowledged data for potential retransmission. This is typically implemented as:
```c
// Segment metadata for retransmission tracking
struct tcp_skb_cb {
    u32 seq;            // Sequence number of first byte
    u32 end_seq;        // Sequence number after last byte
    u8  sacked;         // SACK state flags
    u8  tcp_flags;      // SYN, FIN, etc.
    u64 tx_time;        // Transmission timestamp (for RTT and RACK)
    u8  retrans_count;  // Number of times retransmitted
};

// SACK state flags per segment
#define TCPCB_SACKED_ACKED   0x01  // SACKed by receiver
#define TCPCB_SACKED_RETRANS 0x02  // Currently being retransmitted
#define TCPCB_LOST           0x04  // Declared lost
#define TCPCB_TAGBITS        0x07  // All flags

// The send queue - list of unacknowledged segments
struct sk_buff_head write_queue; // Ordered by sequence number
```

With SACK enabled, implementations maintain a 'scoreboard'—a data structure tracking the SACK state of each segment individually. This allows identifying exactly which segments are lost vs SACKed, enabling precise retransmission. The scoreboard adds memory overhead but dramatically improves multi-loss recovery.
Fast retransmit operates as part of TCP's broader congestion control state machine. Understanding state transitions is key to correct implementation.
Linux TCP Congestion Control States:
State Descriptions:
OPEN (Normal Operation): The default state. ACKs advance snd_una, cwnd grows via slow start or congestion avoidance, and no loss is suspected.

DISORDER (Possible Reordering): One or more duplicate ACKs have arrived, but fewer than the threshold. The missing segment may merely be reordered, so TCP waits before retransmitting.

RECOVERY (Fast Recovery): The duplicate ACK threshold was reached. The presumed-lost segment has been retransmitted, and cwnd is governed by fast recovery until all data sent before recovery entry (high_seq) is acknowledged.

LOSS (RTO Recovery): The retransmission timer fired. cwnd collapses to one segment and the connection restarts in slow start.
```
// State transition logic (simplified from Linux tcp_fastretrans_alert)
function on_ack_received(ack):
    if is_new_ack(ack):
        handle_new_ack(ack)
    else if is_duplicate_ack(ack):
        handle_dup_ack(ack)

function handle_new_ack(ack):
    if ca_state == RECOVERY:
        if ack >= high_seq:
            // Recovery complete: all pre-recovery data acked
            exit_recovery()
            ca_state = OPEN
        else:
            // Partial ACK during recovery
            // NewReno: retransmit next lost segment
            // SACK: retransmit based on scoreboard
            maybe_retransmit_on_partial_ack()
    else:
        ca_state = OPEN
    dupacks = 0
    update_cwnd_on_ack(ack)

function handle_dup_ack(ack):
    dupacks += 1
    if ca_state == OPEN:
        ca_state = DISORDER
    if ca_state == DISORDER or ca_state == RECOVERY:
        if dupacks == 3 and ca_state != RECOVERY:
            // === FAST RETRANSMIT TRIGGER ===
            enter_recovery()
            retransmit_segment(snd_una)
        else if ca_state == RECOVERY:
            // Window inflation: one SMSS per additional dup ACK
            cwnd += SMSS
            try_send_more_data()
```

The 'high_seq' variable (snd_nxt at recovery entry) is critical for knowing when recovery is complete. TCP must ACK all data sent before entering recovery, not just the retransmitted segment. Bugs in high_seq tracking cause premature or delayed recovery exit.
Let's examine a complete pseudo-implementation of fast retransmit with fast recovery, incorporating all the details discussed so far.
Full Algorithm Implementation:
```c
// Complete Fast Retransmit + Fast Recovery Implementation
// Based on TCP Reno (RFC 5681) with NewReno extensions (RFC 6582)

struct tcp_connection {
    // ... all fields from earlier ...
    uint32_t snd_una, snd_nxt, snd_wnd;
    uint32_t cwnd, ssthresh;
    uint32_t dupacks;
    uint32_t high_seq;
    enum tcp_ca_state ca_state;
    uint32_t mss;  // Max segment size
};

#define DUPACK_THRESHOLD 3

void tcp_process_ack(struct tcp_connection *conn, uint32_t ack_num) {
    // Check if this is a new ACK (advances snd_una)
    if (seq_after(ack_num, conn->snd_una)) {
        //
        // NEW ACK PROCESSING
        //
        uint32_t bytes_acked = ack_num - conn->snd_una;
        conn->snd_una = ack_num;

        // Remove acknowledged data from send buffer
        tcp_clean_rtx_queue(conn, ack_num);

        // State-specific handling
        switch (conn->ca_state) {
        case TCP_CA_Recovery:
            if (seq_geq(ack_num, conn->high_seq)) {
                // Full recovery complete
                tcp_end_recovery(conn);
                conn->ca_state = TCP_CA_Open;
                // Deflate window to ssthresh
                conn->cwnd = min(conn->ssthresh,
                                 tcp_packets_in_flight(conn) + conn->mss);
            } else {
                // Partial ACK - more segments to retransmit (NewReno)
                tcp_retransmit_one(conn);  // Retransmit next lost
                // Deflate cwnd by amount acked, then add 1 mss
                conn->cwnd -= bytes_acked;
                conn->cwnd += conn->mss;
            }
            break;

        case TCP_CA_Loss:
            // In loss state, new ACK allows progress
            conn->ca_state = TCP_CA_Open;
            break;

        case TCP_CA_Disorder:
        case TCP_CA_Open:
        default:
            // Normal cwnd increase
            tcp_cong_avoid(conn, bytes_acked);
            conn->ca_state = TCP_CA_Open;
            break;
        }

        // Reset duplicate ACK counter on new ACK
        conn->dupacks = 0;

        // Try to send more data
        tcp_write_xmit(conn);

    } else if (ack_num == conn->snd_una) {
        //
        // DUPLICATE ACK PROCESSING
        //
        if (!tcp_is_valid_dupack(conn, ack_num)) {
            return;  // Not a true dup ACK (RFC 5681 criteria)
        }

        conn->dupacks++;

        switch (conn->ca_state) {
        case TCP_CA_Open:
            // First dup ACK - enter disorder
            conn->ca_state = TCP_CA_Disorder;
            // Fall through
        case TCP_CA_Disorder:
            if (conn->dupacks == DUPACK_THRESHOLD) {
                //
                // === FAST RETRANSMIT TRIGGER ===
                //
                tcp_enter_recovery(conn);
                tcp_retransmit_one(conn);  // Retransmit snd_una
            }
            break;

        case TCP_CA_Recovery:
            // Already in recovery - inflate window
            conn->cwnd += conn->mss;
            tcp_write_xmit(conn);  // Send new data if window allows
            break;

        case TCP_CA_Loss:
            // In loss state, dup ACKs ignored
            break;
        }
    }
}

void tcp_enter_recovery(struct tcp_connection *conn) {
    // Record high_seq for recovery exit condition
    conn->high_seq = conn->snd_nxt;

    // Set new ssthresh (RFC 5681: max(FlightSize/2, 2*MSS))
    uint32_t flight = tcp_packets_in_flight(conn);
    conn->ssthresh = max(flight / 2, 2);  // In segments

    // Set cwnd for recovery phase:
    // ssthresh + 3*MSS for the 3 segments that triggered dup ACKs
    conn->cwnd = conn->ssthresh + DUPACK_THRESHOLD * conn->mss;

    // Enter recovery state
    conn->ca_state = TCP_CA_Recovery;

    // Record stats
    TCP_INC_STATS(TCP_MIB_FASTRETR);
}

void tcp_end_recovery(struct tcp_connection *conn) {
    conn->dupacks = 0;
    conn->ca_state = TCP_CA_Open;
}

bool tcp_is_valid_dupack(struct tcp_connection *conn, uint32_t ack) {
    // RFC 5681 duplicate ACK criteria:
    // 1. ACK carries no data
    // 2. ACK carries no window change
    // 3. ACK carries no SYN or FIN
    // 4. ACK number equals snd_una
    // (These checks done before calling this function)
    return true;  // Simplified - actual impl checks packet flags
}
```

Production implementations track statistics (TCP_MIB_FASTRETR, etc.) for monitoring. These counters appear in /proc/net/snmp and /proc/net/netstat on Linux, enabling runtime visibility into TCP behavior.
When Selective Acknowledgment (SACK) is negotiated, fast retransmit becomes significantly more powerful. SACK provides explicit information about which segments the receiver has, enabling precise identification of lost segments.
SACK Processing:
```c
// SACK block structure
struct tcp_sack_block {
    uint32_t start;  // First sequence number in block
    uint32_t end;    // Sequence number after last in block
};

// Process SACK option from ACK packet
void tcp_process_sack(struct tcp_connection *conn,
                      struct tcp_sack_block *blocks, int num_blocks) {
    // Update scoreboard based on SACK blocks
    for (int i = 0; i < num_blocks; i++) {
        tcp_sack_mark_range(conn, blocks[i].start, blocks[i].end);
    }

    // Identify lost segments using SACK information
    tcp_update_lost_estimation(conn);
}

// Mark segments in range as SACKed
void tcp_sack_mark_range(struct tcp_connection *conn,
                         uint32_t start, uint32_t end) {
    struct sk_buff *skb;

    // Walk send queue, mark matching segments
    foreach_skb_in_write_queue(skb, conn) {
        struct tcp_skb_cb *cb = TCP_SKB_CB(skb);

        if (seq_within(cb->seq, start, end) ||
            seq_within(cb->end_seq - 1, start, end)) {
            // Segment overlaps SACK block
            cb->sacked |= TCPCB_SACKED_ACKED;
            conn->sacked_out++;
        }
    }
}

// Identify lost segments based on SACK
void tcp_update_lost_estimation(struct tcp_connection *conn) {
    struct sk_buff *skb;
    int fack_count = 0;  // Forward ACK count

    // FACK-based loss detection:
    // Segment is lost if 3 or more segments after it have been SACKed
    foreach_skb_in_write_queue_reverse(skb, conn) {
        struct tcp_skb_cb *cb = TCP_SKB_CB(skb);

        if (cb->sacked & TCPCB_SACKED_ACKED) {
            fack_count++;
        } else {
            // Not SACKed - check if lost
            if (fack_count >= DUPACK_THRESHOLD) {
                // 3+ SACKed segments after this one = lost
                if (!(cb->sacked & TCPCB_LOST)) {
                    cb->sacked |= TCPCB_LOST;
                    conn->lost_out++;
                }
            }
        }
    }
}

// Retransmit all segments marked as lost
void tcp_retransmit_lost(struct tcp_connection *conn) {
    struct sk_buff *skb;

    foreach_skb_in_write_queue(skb, conn) {
        struct tcp_skb_cb *cb = TCP_SKB_CB(skb);

        if ((cb->sacked & TCPCB_LOST) &&
            !(cb->sacked & TCPCB_SACKED_RETRANS)) {
            // Lost and not already being retransmitted
            tcp_retransmit_skb(conn, skb);
            cb->sacked |= TCPCB_SACKED_RETRANS;
            conn->retrans_out++;
        }
    }
}
```

| Capability | Without SACK | With SACK |
|---|---|---|
| Know which segments lost | Only first (snd_una) | All lost segments explicitly |
| Retransmit strategy | One at a time | All lost at once |
| Recovery for N losses | ~N RTTs | ~1 RTT |
| Spurious retransmit risk | Higher | Lower (explicit info) |
| Memory overhead | Lower | Higher (scoreboard) |
FACK is a loss detection algorithm that uses SACK information. It considers a segment lost if 3 or more segments after it have been SACKed. This is similar to the 3 dup ACK rule but based on explicit SACK data rather than counting ACKs.
While the core fast retransmit algorithm is standardized, implementations vary across operating systems. Understanding these differences is valuable for debugging and optimization.
Linux TCP:
- Key files: net/ipv4/tcp_input.c (main ACK processing), tcp_output.c (retransmission), tcp_recovery.c (RACK)
- Key function: tcp_fastretrans_alert() - called on every ACK, handles all fast retransmit/recovery logic
- Tunables: /proc/sys/net/ipv4/tcp_early_retrans, tcp_recovery (RACK on/off)
```c
// Simplified view of Linux's tcp_fastretrans_alert()
// Called from tcp_ack() for every incoming ACK

static void tcp_fastretrans_alert(struct sock *sk,
                                  struct tcp_ack_state *ack_state)
{
    struct inet_connection_sock *icsk = inet_csk(sk);
    struct tcp_sock *tp = tcp_sk(sk);
    int dup_ackers = ack_state->dup_acks;

    // Update reordering estimation
    if (dup_ackers > tp->reordering) {
        tcp_check_reordering(sk, dup_ackers);
    }

    // State machine transitions
    switch (icsk->icsk_ca_state) {
    case TCP_CA_Open:
        if (dup_ackers >= 1) {
            tcp_enter_cwr(sk);  // Or Disorder
        }
        break;

    case TCP_CA_Disorder:
        // Check for fast retransmit trigger
        if (tcp_time_to_recover(sk, dup_ackers)) {
            tcp_enter_recovery(sk, true);
            tcp_xmit_retransmit_queue(sk);
        }
        break;

    case TCP_CA_Recovery:
        // Handle partial/full ACKs
        if (ack_state->flag & FLAG_SND_UNA_ADVANCED) {
            tcp_try_undo_partial(sk);
            tcp_check_space(sk);
        }
        // Inflate cwnd on dup ACKs
        tcp_cwnd_reduction(sk, dup_ackers);
        break;

    case TCP_CA_Loss:
        tcp_try_undo_loss(sk, true);
        break;
    }

    // Retransmit lost segments (SACK-based if available)
    if (tcp_is_in_recovery(sk)) {
        tcp_xmit_retransmit_queue(sk);
    }
}
```

Windows TCP:
- Tunable: TcpMaxDupAcks (default 2 on older versions, 3 since Win10)

FreeBSD/macOS (BSD Stack):
- SACK recovery logic: tcp_sack.c
- Main ACK processing: sys/netinet/tcp_input.c

For deep understanding, reading the actual kernel source is invaluable. Start with Linux's tcp_input.c and trace the path from tcp_rcv_established() → tcp_ack() → tcp_fastretrans_alert(). The code is well-commented with RFC references.
Fast retransmit implementations are notoriously tricky. Subtle bugs can cause performance degradation, spurious retransmissions, or complete failure to recover. Here are common pitfalls:
Bug 1: Incorrect Duplicate ACK Counting
```c
// BUG: Counting ACKs instead of duplicate ACKs
void buggy_handle_ack(struct tcp_conn *c, u32 ack_num) {
    if (ack_num == c->snd_una) {
        c->ack_count++;  // WRONG: counts ALL acks with same number.
                         // Should filter by RFC 5681 criteria.
        if (c->ack_count >= 3) {
            fast_retransmit();
        }
    }
}

// CORRECT: Filter by RFC 5681 duplicate ACK criteria
void correct_handle_ack(struct tcp_conn *c, packet *pkt) {
    if (pkt->ack_num == c->snd_una &&
        pkt->data_len == 0 &&             // No data
        pkt->window == c->last_window &&  // No window change
        !(pkt->flags & (SYN | FIN))) {    // No SYN/FIN
        c->dupacks++;
        if (c->dupacks == 3) {
            fast_retransmit();
        }
    }
}
```

Bug 2: Not Resetting Counter on New ACK
```c
// BUG: Forgetting to reset dupacks on new ACK
void buggy_new_ack(struct tcp_conn *c, u32 ack_num) {
    c->snd_una = ack_num;
    // MISSING: c->dupacks = 0;
    // Result: Can trigger spurious fast retransmit later
}

// CORRECT: Always reset on new ACK
void correct_new_ack(struct tcp_conn *c, u32 ack_num) {
    c->snd_una = ack_num;
    c->dupacks = 0;  // Critical!
}
```

A historically famous bug: early implementations exited fast recovery on any new ACK, even partial ACKs (that advance snd_una but don't acknowledge all data). NewReno (RFC 6582) requires staying in recovery until all pre-recovery data is acked. This bug caused severe performance issues with multiple losses.
Verifying fast retransmit correctness requires careful testing. Production-grade TCP stacks use extensive test suites and packet traces.
Testing Approaches:
```bash
#!/bin/bash
# Test fast retransmit using network emulation

# Setup: Add packet loss
sudo tc qdisc add dev eth0 root netem loss 5%

# Start packet capture
tcpdump -i eth0 -w test_capture.pcap tcp port 5201 &

# Run iperf test
iperf3 -c 10.0.0.1 -t 30

# Stop capture
kill %1

# Analyze with tshark
echo "=== Fast Retransmit Analysis ==="
tshark -r test_capture.pcap -Y "tcp.analysis.fast_retransmission" \
    -T fields -e frame.number -e tcp.seq | wc -l
echo "Fast retransmissions triggered"

echo "=== Duplicate ACK Analysis ==="
tshark -r test_capture.pcap -Y "tcp.analysis.duplicate_ack" \
    -T fields -e frame.number | wc -l
echo "Duplicate ACKs sent"

# Check kernel counters
nstat -sz | grep TcpFastRetrans

# Cleanup
sudo tc qdisc del dev eth0 root
```

Debugging Techniques:
Using ss for live connection state:
ss -ti dst 10.0.0.1 | grep -E "(retrans|cwnd|ssthresh)"
Kernel tracing (Linux):
# Trace TCP events
echo 1 > /sys/kernel/debug/tracing/events/tcp/tcp_retransmit_skb/enable
cat /sys/kernel/debug/tracing/trace_pipe
Wireshark filters:
- tcp.analysis.fast_retransmission: Show fast retransmissions
- tcp.analysis.duplicate_ack: Show duplicate ACKs
- tcp.analysis.retransmission: All retransmissions
- tcp.analysis.spurious_retransmission: Unnecessary retransmissions

Google's packetdrill tool allows scripting precise packet sequences to test TCP behavior. It's invaluable for testing fast retransmit edge cases—you can inject exactly 3 dup ACKs and verify the exact retransmission behavior. See: https://github.com/google/packetdrill
Implementing fast retransmit correctly is a non-trivial engineering challenge. The algorithm is elegant in concept but demands careful attention to state management, edge cases, and interaction with other TCP mechanisms.
Module Complete:
You have now completed the comprehensive exploration of TCP Fast Retransmit. From the foundational understanding of duplicate ACKs, through the trigger mechanism, timeout avoidance benefits, performance analysis, and implementation details—you possess expert-level knowledge of this critical TCP optimization.
This knowledge is directly applicable to implementing custom TCP stacks, debugging production network issues, and tuning congestion control behavior.
Congratulations! You have mastered TCP Fast Retransmit—from the theoretical foundations of duplicate ACKs to the practical realities of production implementation. This knowledge positions you to understand, debug, and optimize TCP behavior in any environment.