Dynamic Timeout - Learning Module

Loading content...

0/228

Karn's Algorithm: Resolving Retransmission Ambiguity

The Measurement That Cannot Be Made

Consider a fundamental problem in measurement: you're timing how long it takes for a message to reach its destination and for a reply to return. Simple enough. But what if the message gets lost and you must send it again? When the reply finally arrives, how do you know which message it's responding to—the original or the retransmission?

This is the retransmission ambiguity problem, and it plagued early TCP implementations. Worse, naively using RTT samples from retransmitted segments led to a pernicious failure mode: the RTT estimator would become corrupted, causing either massive over-timeouts or catastrophic under-timeouts that triggered retransmission storms.

Phil Karn, in collaboration with Craig Partridge, published an elegant solution in 1987. Karn's algorithm addresses this problem with a simple but powerful insight: if you can't make a clean measurement, don't make the measurement at all.

What You Will Learn

By the end of this page, you will understand the retransmission ambiguity problem in depth, why incorrect handling can corrupt RTT estimates, Karn's two-part solution (don't measure + back off), how Karn's algorithm complements Jacobson's algorithm, and how modern TCP timestamps eliminate the ambiguity entirely.

The Retransmission Ambiguity Problem in Depth

We introduced this problem briefly in Page 1, but let's examine it thoroughly. The problem occurs in any scenario where:

A segment is transmitted
The retransmission timeout expires (no ACK received in time)
The segment is retransmitted
An ACK arrives

The fundamental question: Which transmission does the ACK acknowledge?

Converting Mermaid diagram...

Two Possible Scenarios (Indistinguishable)

Scenario A: Original was delivered, ACK acknowledges original

Original segment arrived successfully at the receiver
Receiver sent ACK promptly, but ACK was delayed or lost
Sender retransmitted before ACK arrived
Original ACK (or a replacement ACK) arrives
True RTT: The longer measurement (T₃ - T₁)

Scenario B: Original was lost, ACK acknowledges retransmission

Original segment was lost in the network
Retransmission was received successfully
Receiver sent ACK for the retransmission
True RTT: The shorter measurement (T₃ - T₂)

The critical problem: Standard TCP acknowledgments carry no information to distinguish these scenarios. The ACK simply says "I've received all bytes up to X." It doesn't say "I received them from the transmission you sent at time Y."

Not Just Unknown—Actively Harmful

The problem isn't merely that we don't know the true RTT. Either interpretation we choose has a 50% chance of being wrong, and the wrong choice corrupts our RTT estimator. Repeated wrong choices cause the estimator to drift, potentially by orders of magnitude.

Why Guessing Goes Wrong: The Corruption Cascade

Let's trace through what happens when TCP implementations guess incorrectly.

Guessing Strategy 1: Use the Original Send Time (T₁)

This strategy timestamps RTT from the first transmission:

SampleRTT = T₃ - T₁

Impact of Using Original Send Time
Actual Scenario	Measured RTT	Actual RTT	Effect on Estimator
A (Original delivered)	T₃ - T₁	T₃ - T₁	Correct
B (Original lost)	T₃ - T₁ (long)	T₃ - T₂ (short)	Inflated: SRTT and RTTVAR increase

When Scenario B occurs but we use T₁:

Measured RTT = (T₂ - T₁) + actual_RTT = RTO + actual_RTT
This sample is at least one full RTO larger than the true RTT
SRTT increases, RTTVAR increases
RTO for subsequent segments becomes inflated
If this happens repeatedly, RTO can grow unboundedly

Failure mode: "RTO inflation" — Every timeout leads to a larger RTO, creating a positive feedback loop where retransmission becomes increasingly conservative. Throughput drops to near zero on lossy links.

Guessing Strategy 2: Use the Retransmission Time (T₂)

This strategy timestamps RTT from the retransmission:

SampleRTT = T₃ - T₂

Impact of Using Retransmission Time
Actual Scenario	Measured RTT	Actual RTT	Effect on Estimator
A (Original delivered)	T₃ - T₂ (short)	T₃ - T₁ (long)	Deflated: SRTT and RTTVAR decrease
B (Original lost)	T₃ - T₂	T₃ - T₂	Correct

When Scenario A occurs but we use T₂:

Measured RTT = T₃ - T₂, but ACK was actually for the original
This sample is much smaller than the true RTT (it's only the time since retransmission)
SRTT decreases, and crucially, RTTVAR may decrease
RTO shrinks
Subsequent segments may timeout spuriously

Failure mode: "Retransmission storm" — Deflated RTO causes spurious retransmissions, which cause more ambiguous samples, which further deflate RTO. This creates a positive feedback loop that floods the network with duplicate packets, worsening congestion.

Both Strategies Fail

Neither guessing strategy is safe. One leads to unbounded RTO growth (connection stalls). The other leads to RTO collapse and retransmission storms (congestion amplification). The only safe approach is to not use ambiguous samples at all.

Karn's Solution: Two Simple Rules

Phil Karn and Craig Partridge published their solution in 1987 in a paper titled "Improving Round-Trip Time Estimates in Reliable Transport Protocols." The algorithm consists of two synergistic rules:

Rule 1: Don't Use Ambiguous Samples

When a segment is retransmitted, do not use the resulting ACK to update the RTT estimator (SRTT and RTTVAR).

This rule is elegantly simple: if we can't be sure which transmission the ACK corresponds to, we simply don't use it. We wait for an unambiguous measurement—one where the segment was not retransmitted.

Rule 2: Back Off and Hold

When a timeout occurs and triggers retransmission, back off the RTO (typically doubling it) and keep it at that backed-off value until a successful (non-retransmitted) transmission completes.

This rule is essential because Rule 1 alone would be dangerous. Consider:

Timeout occurs → retransmit → don't update estimator
What if the network has genuinely slowed down?
If we keep using the old RTO, we'll keep timing out
We need some mechanism to increase RTO even without new samples

The solution: exponential backoff. Each timeout doubles the RTO. This backed-off value is maintained until we get a clean sample from a non-retransmitted segment.

karns_algorithm.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
class TCPConnection:
    def __init__(self):
        # RTT estimator state
        self.srtt = None       # Smoothed RTT
        self.rttvar = None     # RTT Variance
        self.rto = 1000        # Initial RTO: 1 second (ms)
        
        # Segment tracking
        self.pending_segments = {}  # seq_num -> (send_time, retransmitted_flag)
        self.num_retransmits = {}   # seq_num -> count
    
    def send_segment(self, segment):
        """Record send time and track if this is a retransmission."""
        seq_num = segment.sequence_number
        send_time = current_timestamp()
        
        if seq_num in self.pending_segments:
            # This is a retransmission
            old_send_time, _ = self.pending_segments[seq_num]
            self.pending_segments[seq_num] = (old_send_time, True)  # Mark as retransmitted
            self.num_retransmits[seq_num] = self.num_retransmits.get(seq_num, 0) + 1
        else:
            # First transmission
            self.pending_segments[seq_num] = (send_time, False)
        
        self.transmit(segment)
    
    def receive_ack(self, ack):
        """Process ACK and potentially update RTT estimate."""
        ack_num = ack.acknowledgment_number
        receive_time = current_timestamp()
        
        for seq_num in list(self.pending_segments.keys()):
            if seq_num < ack_num:
                send_time, was_retransmitted = self.pending_segments.pop(seq_num)
                
                # KARN'S RULE 1: Only update estimator if NOT retransmitted
                if not was_retransmitted:
                    sample_rtt = receive_time - send_time
                    self.update_rtt_estimator(sample_rtt)
                    
                    # KARN'S RULE 2 (reset): Successful non-retransmit
                    # means we can now trust our estimates again
                    # RTO is recalculated from the estimator, not held at backed-off value
                else:
                    # Retransmitted segment acknowledged
                    # DO NOT update SRTT/RTTVAR
                    # Keep the backed-off RTO
                    pass
                
                # Clean up retransmit count
                if seq_num in self.num_retransmits:
                    del self.num_retransmits[seq_num]
    
    def handle_timeout(self, seq_num):
        """Handle retransmission timeout."""
        # KARN'S RULE 2: Exponential backoff
        self.rto = min(self.rto * 2, MAX_RTO)  # Double RTO, up to maximum
        
        # Retransmit the segment (will be marked as retransmitted)
        segment = self.get_segment(seq_num)
        self.send_segment(segment)
    
    def update_rtt_estimator(self, sample_rtt):
        """Jacobson's algorithm for updating SRTT/RTTVAR."""
        if self.srtt is None:
            # First sample
            self.srtt = sample_rtt
            self.rttvar = sample_rtt / 2
        else:
            # Subsequent samples
            err = sample_rtt - self.srtt
            self.srtt = self.srtt + (err / 8)
            self.rttvar = self.rttvar + (abs(err) - self.rttvar) / 4
        
        # Recalculate RTO
        self.rto = self.srtt + 4 * self.rttvar
        self.rto = max(self.rto, MIN_RTO)  # Apply minimum bound

The Key Insight

Karn's algorithm doesn't try to guess which scenario occurred. It accepts that ambiguous measurements are worse than no measurements. By combining "don't measure" with "back off," it preserves the integrity of the RTT estimator while still adapting to worsening network conditions through the backoff mechanism.

Why Both Rules Are Necessary

It's tempting to simplify Karn's algorithm to just one of the rules. Let's see why both are essential:

Without Rule 1 (Only Backoff)

If we still update SRTT/RTTVAR using ambiguous samples but apply backoff:

Ambiguous samples still corrupt the estimator
SRTT could drift high or low depending on scenarios
Once we get a clean sample, RTO calculation uses corrupted SRTT/RTTVAR
The backoff is undone by a corrupted small RTO calculation

Result: The estimator corruption problem remains; backoff just delays its effects.

Without Rule 2 (Only "Don't Measure")

If we refuse ambiguous samples but don't back off:

Scenario: Network RTT Increases Permanently
Event	RTO	Actual RTT	Outcome
Initial state	200ms	100ms	Working fine
Network degrades	200ms	500ms	Timeout (expected)
Retransmit	200ms (unchanged)	500ms	Segment sent; don't measure
ACK arrives	200ms (no update)	500ms	Timeout again on next segment
Retransmit again	200ms (still)	500ms	Stuck in perpetual timeout loop

Without backoff, if the network genuinely slows down and causes a timeout, we can't adapt. We refuse to update from the resulting sample (correctly, due to ambiguity), but we also don't increase RTO. Every subsequent segment times out.

Result: The connection becomes stuck in a perpetual timeout-retransmit loop, never adapting to the new network conditions.

Rule 1 Alone

•Protects estimator integrity
•But cannot adapt to worsening conditions
•Leads to perpetual timeouts
•Connection effectively dead

Rule 2 Alone

•Allows adaptation via backoff
•But corrupted estimator remains
•Once clean sample arrives, RTO corrupted
•Potential for storms or stalls

Synergy of the Two Rules

Together, the rules form a complete solution: Rule 1 protects the long-term estimator from corruption, while Rule 2 provides short-term adaptation to worsening conditions. When conditions stabilize, clean samples eventually arrive, the estimator updates correctly, and the backed-off RTO is replaced with an accurate calculation.

Interaction with Jacobson's Algorithm

Karn's algorithm and Jacobson's algorithm are designed to work together as complementary pieces of the RTO calculation:

The Division of Labor

Algorithm	Responsibility
Jacobson's	How to update SRTT and RTTVAR given an RTT sample; how to calculate RTO from these estimates
Karn's	Which samples to feed into Jacobson's algorithm; how to adjust RTO when samples are unavailable

Neither algorithm is complete without the other:

Jacobson's algorithm needs samples, but doesn't specify how to handle retransmission ambiguity
Karn's algorithm specifies which samples to use, but doesn't say how to process them

The Complete Algorithm

Combining both algorithms gives us the complete TCP RTO calculation:

complete_rto_algorithm.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
"""
Complete TCP RTO Calculation: Jacobson + Karn
 
This implementation follows RFC 6298 which incorporates both algorithms.
"""
 
class TCPTimer:
    # Constants (from RFC 6298)
    ALPHA = 1/8       # SRTT smoothing factor
    BETA = 1/4        # RTTVAR smoothing factor  
    K = 4             # Variance multiplier
    MIN_RTO = 1000    # 1 second minimum (ms)
    MAX_RTO = 60000   # 60 second maximum (ms)
    
    def __init__(self):
        self.srtt = None
        self.rttvar = None
        self.rto = self.MIN_RTO  # Initial RTO
        self._rto_backed_off = False
    
    def process_ack(self, sample_rtt: float, was_retransmitted: bool):
        """
        Process an acknowledgment and potentially update RTO.
        
        Args:
            sample_rtt: Measured RTT for this sample (ms)
            was_retransmitted: True if the segment was retransmitted
        """
        # KARN'S RULE 1: Don't use retransmitted samples
        if was_retransmitted:
            # Do not update SRTT/RTTVAR
            # Keep current (possibly backed-off) RTO
            return
        
        # Unambiguous sample: apply Jacobson's algorithm
        if self.srtt is None:
            # First measurement (RFC 6298 Section 2.2)
            self.srtt = sample_rtt
            self.rttvar = sample_rtt / 2
        else:
            # Subsequent measurements (RFC 6298 Section 2.3)
            # Note: Use SRTT from before update for RTTVAR calculation
            err = sample_rtt - self.srtt
            
            # Update RTTVAR first (uses old SRTT)
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(err)
            
            # Update SRTT
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample_rtt
        
        # Calculate new RTO
        self.rto = self.srtt + self.K * self.rttvar
        
        # Apply bounds
        self.rto = max(self.rto, self.MIN_RTO)
        self.rto = min(self.rto, self.MAX_RTO)
        
        # Clear backoff flag - we have a fresh measurement
        self._rto_backed_off = False
    
    def handle_timeout(self):
        """
        Handle a retransmission timeout.
        
        KARN'S RULE 2: Back off the RTO.
        """
        # Exponential backoff: double the RTO
        self.rto = min(self.rto * 2, self.MAX_RTO)
        self._rto_backed_off = True
        
        # The backed-off RTO is used until a successful
        # non-retransmitted segment is acknowledged
    
    def get_rto(self) -> float:
        """Get the current RTO value for segment transmission."""
        return self.rto
 
 
# Example: Simulating TCP behavior
timer = TCPTimer()
 
# First segment sent and acknowledged (RTT = 100ms)
timer.process_ack(sample_rtt=100, was_retransmitted=False)
print(f"After first sample: RTO = {timer.rto:.1f}ms")
# RTO = 100 + 4*50 = 300ms
 
# Second segment acknowledged (RTT = 120ms)
timer.process_ack(sample_rtt=120, was_retransmitted=False)
print(f"After second sample: RTO = {timer.rto:.1f}ms")
 
# Timeout occurs on third segment
timer.handle_timeout()
print(f"After timeout (backoff): RTO = {timer.rto:.1f}ms")
 
# Retransmission acknowledged - don't update estimator
timer.process_ack(sample_rtt=150, was_retransmitted=True)
print(f"After retransmit ACK (no update): RTO = {timer.rto:.1f}ms")
 
# Next segment succeeds without retransmission
timer.process_ack(sample_rtt=110, was_retransmitted=False)
print(f"After clean sample: RTO = {timer.rto:.1f}ms")

RFC 6298 Clarifications

RFC 6298 (which obsoletes RFC 2988) provides the standard specification for TCP RTO computation. It explicitly incorporates both Jacobson's variance-based estimation and Karn's handling of retransmission ambiguity. Implementations should follow RFC 6298 for interoperability.

TCP Timestamps: The Modern Solution

Karn's algorithm solves the ambiguity problem by refusing to measure ambiguous samples. But what if we could eliminate the ambiguity entirely? That's what TCP timestamps (RFC 7323) accomplish.

How TCP Timestamps Work

The TCP Timestamps option adds two 32-bit fields to the TCP header:

TSval (Timestamp Value): The sender's timestamp when the segment was sent
TSecr (Timestamp Echo Reply): The most recent TSval received from the other side

When a sender transmits a segment, it includes its current timestamp in TSval. When the receiver acknowledges, it echoes that timestamp back in TSecr.

Converting Mermaid diagram...

Eliminating Ambiguity

With timestamps, the sender knows exactly which transmission the ACK corresponds to:

Original sent at TSval=1000
Retransmission sent at TSval=1200
ACK arrives with TSecr=1200
Conclusion: The ACK is for the retransmission (TSval=1200 matches)
RTT sample: Current time minus 1200

Even though the segment was retransmitted, we can now safely use the RTT sample because we know which transmission it corresponds to.

Benefits of TCP Timestamps

Advantages of TCP Timestamps

•Eliminates retransmission ambiguity — Every ACK can be precisely matched to a specific transmission.
•More RTT samples — Without Karn's Rule 1 restriction, every ACK provides a valid RTT sample.
•Better estimator accuracy — More samples lead to more accurate SRTT and RTTVAR values.
•Protection Against Wrapped Sequence (PAWS) — Timestamps help detect ancient duplicate segments even after sequence number wrap.
•Essential for high-bandwidth networks — With window scaling, sequence numbers can wrap quickly; PAWS becomes essential.

Karn's Algorithm in the Timestamp Era

With timestamps, Karn's Rule 1 becomes less critical—we can safely use samples from retransmitted segments. However, Karn's Rule 2 (exponential backoff) remains important:

Even with timestamps, a timeout indicates something is wrong
Backoff still provides conservative adaptation to unknown conditions
The timeout might be due to congestion; aggressive retransmission would worsen it

Modern TCPs with timestamp support thus:

Use samples from retransmitted segments (if timestamped)
Still apply exponential backoff on timeout
Recalculate RTO normally once a clean, timestamped sample arrives

Timestamp Adoption

TCP timestamps are widely supported and often enabled by default in modern operating systems. They add 12 bytes to each TCP header (10 bytes option + 2 bytes padding for alignment), but the benefits for RTT measurement and PAWS protection far outweigh this modest overhead in most scenarios.

Practical Considerations and Edge Cases

Implementing Karn's algorithm correctly requires attention to several practical details:

Handling Multiple Retransmissions

If a segment is retransmitted multiple times before an ACK arrives:

T₁: Send segment (seq=1000)
T₂: Timeout → retransmit
T₃: Timeout → retransmit again
T₄: ACK arrives

Without timestamps, all three transmissions are ambiguous. The segment is marked as retransmitted, and no sample is taken.

With timestamps, the TSecr in the ACK indicates which transmission was received. We can calculate RTT relative to that specific transmission.

Exponential Backoff Limits

Backoff should be bounded to prevent RTO from growing unboundedly:

backoff_limits.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
# Typical backoff implementation with limits
 
MAX_RTO = 60000  # 60 seconds maximum
MAX_RETRIES = 15  # Maximum retransmission attempts
 
def handle_timeout(self, retransmit_count):
    """Apply exponential backoff with limits."""
    
    # Check if we've exceeded maximum retries
    if retransmit_count >= MAX_RETRIES:
        # Connection presumed dead, abort
        self.abort_connection()
        return
    
    # Apply exponential backoff
    # Note: each timeout doubles RTO from its CURRENT value
    # not from the original calculated value
    self.rto = min(self.rto * 2, MAX_RTO)
    
    # Some implementations also track backoff depth
    # for diagnostic purposes
    self.backoff_depth = retransmit_count

Race Conditions with Multiple Segments

Consider:

Segment A sent (seq=1000) at T₁
Segment B sent (seq=1500) at T₂
Segment A times out, retransmitted at T₃
ACK for both A and B arrives (ack=2000) at T₄

In this case:

Segment A is marked as retransmitted → no sample (per Karn's Rule 1)
Segment B was never retransmitted → safe to use RTT = T₄ - T₂

Modern implementations track per-segment retransmission status to maximize usable samples while still protecting the estimator.

Spurious Timeouts

If a timeout was spurious (ACK was just delayed, not lost), the estimator should ideally not be penalized. Some TCP extensions (like F-RTO, RFC 5682) attempt to detect spurious timeouts and undo the backoff. However, the basic Karn's algorithm doesn't distinguish spurious from genuine timeouts—it backs off either way, which is conservative but safe.

Implementation Variation

Different TCP stacks implement variations on Karn's algorithm. Some are more aggressive in keeping the backed-off RTO; others more quickly return to the calculated value. When diagnosing network performance issues, consulting the specific implementation's behavior is important.

Summary: The Wisdom of Not Measuring

Karn's algorithm embodies a profound engineering principle: when you can't make a clean measurement, it's better to not measure than to measure incorrectly. Let's consolidate the key takeaways:

Key Takeaways

•Retransmission creates ambiguity — When a segment is retransmitted, the resulting ACK could acknowledge either the original or the retransmission, and there's no way to tell which.
•Both guess strategies fail — Using original send time causes RTO inflation; using retransmit time causes RTO collapse. Neither is acceptable.
•Rule 1: Don't use ambiguous samples — Protect the RTT estimator by refusing to update it with samples that might corrupt it.
•Rule 2: Apply exponential backoff — Since we can't measure, adapt to potentially worsening conditions by doubling RTO on each timeout.
•Both rules are necessary — Rule 1 alone can't adapt to changing conditions; Rule 2 alone still corrupts the estimator.
•TCP timestamps eliminate ambiguity — By echoing the sender's timestamp, the receiver enables precise matching of ACKs to specific transmissions.
•Backoff still applies with timestamps — Even with perfect RTT measurement, backoff provides conservative congestion response.

What's next:

We've now covered how to measure RTT (Page 1), how to compute smooth estimates and variance (Page 2 - Jacobson's algorithm), and which samples to use (this page - Karn's algorithm). The next page brings it all together: RTO Calculation—the complete formula for the Retransmission Timeout, including all the bounds, adjustments, and practical considerations specified in RFC 6298.

Page Complete

You now understand Karn's algorithm for handling retransmission ambiguity—the problem it solves, why both rules are necessary, how it interacts with Jacobson's algorithm, and how TCP timestamps provide a modern solution. Next, we'll see the complete RTO calculation as specified in RFC 6298.

Karn's Algorithm: Resolving Retransmission Ambiguity

The Measurement That Cannot Be Made

What You Will Learn

The Retransmission Ambiguity Problem in Depth

We introduced this problem briefly in Page 1, but let's examine it thoroughly. The problem occurs in any scenario where:

A segment is transmitted
The retransmission timeout expires (no ACK received in time)
The segment is retransmitted
An ACK arrives

The fundamental question: Which transmission does the ACK acknowledge?

Converting Mermaid diagram...

Two Possible Scenarios (Indistinguishable)

Scenario A: Original was delivered, ACK acknowledges original

Original segment arrived successfully at the receiver
Receiver sent ACK promptly, but ACK was delayed or lost
Sender retransmitted before ACK arrived
Original ACK (or a replacement ACK) arrives
True RTT: The longer measurement (T₃ - T₁)

Scenario B: Original was lost, ACK acknowledges retransmission

Original segment was lost in the network
Retransmission was received successfully
Receiver sent ACK for the retransmission
True RTT: The shorter measurement (T₃ - T₂)

Not Just Unknown—Actively Harmful

Why Guessing Goes Wrong: The Corruption Cascade

Let's trace through what happens when TCP implementations guess incorrectly.

Guessing Strategy 1: Use the Original Send Time (T₁)

This strategy timestamps RTT from the first transmission:

SampleRTT = T₃ - T₁

Impact of Using Original Send Time
Actual Scenario	Measured RTT	Actual RTT	Effect on Estimator
A (Original delivered)	T₃ - T₁	T₃ - T₁	Correct
B (Original lost)	T₃ - T₁ (long)	T₃ - T₂ (short)	Inflated: SRTT and RTTVAR increase

When Scenario B occurs but we use T₁:

Measured RTT = (T₂ - T₁) + actual_RTT = RTO + actual_RTT
This sample is at least one full RTO larger than the true RTT
SRTT increases, RTTVAR increases
RTO for subsequent segments becomes inflated
If this happens repeatedly, RTO can grow unboundedly

Guessing Strategy 2: Use the Retransmission Time (T₂)

This strategy timestamps RTT from the retransmission:

SampleRTT = T₃ - T₂

Impact of Using Retransmission Time
Actual Scenario	Measured RTT	Actual RTT	Effect on Estimator
A (Original delivered)	T₃ - T₂ (short)	T₃ - T₁ (long)	Deflated: SRTT and RTTVAR decrease
B (Original lost)	T₃ - T₂	T₃ - T₂	Correct

When Scenario A occurs but we use T₂:

Measured RTT = T₃ - T₂, but ACK was actually for the original
This sample is much smaller than the true RTT (it's only the time since retransmission)
SRTT decreases, and crucially, RTTVAR may decrease
RTO shrinks
Subsequent segments may timeout spuriously

Both Strategies Fail

Karn's Solution: Two Simple Rules

Rule 1: Don't Use Ambiguous Samples

When a segment is retransmitted, do not use the resulting ACK to update the RTT estimator (SRTT and RTTVAR).

Rule 2: Back Off and Hold

When a timeout occurs and triggers retransmission, back off the RTO (typically doubling it) and keep it at that backed-off value until a successful (non-retransmitted) transmission completes.

This rule is essential because Rule 1 alone would be dangerous. Consider:

Timeout occurs → retransmit → don't update estimator
What if the network has genuinely slowed down?
If we keep using the old RTO, we'll keep timing out
We need some mechanism to increase RTO even without new samples

The solution: exponential backoff. Each timeout doubles the RTO. This backed-off value is maintained until we get a clean sample from a non-retransmitted segment.

karns_algorithm.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
class TCPConnection:
    def __init__(self):
        # RTT estimator state
        self.srtt = None       # Smoothed RTT
        self.rttvar = None     # RTT Variance
        self.rto = 1000        # Initial RTO: 1 second (ms)
        
        # Segment tracking
        self.pending_segments = {}  # seq_num -> (send_time, retransmitted_flag)
        self.num_retransmits = {}   # seq_num -> count
    
    def send_segment(self, segment):
        """Record send time and track if this is a retransmission."""
        seq_num = segment.sequence_number
        send_time = current_timestamp()
        
        if seq_num in self.pending_segments:
            # This is a retransmission
            old_send_time, _ = self.pending_segments[seq_num]
            self.pending_segments[seq_num] = (old_send_time, True)  # Mark as retransmitted
            self.num_retransmits[seq_num] = self.num_retransmits.get(seq_num, 0) + 1
        else:
            # First transmission
            self.pending_segments[seq_num] = (send_time, False)
        
        self.transmit(segment)
    
    def receive_ack(self, ack):
        """Process ACK and potentially update RTT estimate."""
        ack_num = ack.acknowledgment_number
        receive_time = current_timestamp()
        
        for seq_num in list(self.pending_segments.keys()):
            if seq_num < ack_num:
                send_time, was_retransmitted = self.pending_segments.pop(seq_num)
                
                # KARN'S RULE 1: Only update estimator if NOT retransmitted
                if not was_retransmitted:
                    sample_rtt = receive_time - send_time
                    self.update_rtt_estimator(sample_rtt)
                    
                    # KARN'S RULE 2 (reset): Successful non-retransmit
                    # means we can now trust our estimates again
                    # RTO is recalculated from the estimator, not held at backed-off value
                else:
                    # Retransmitted segment acknowledged
                    # DO NOT update SRTT/RTTVAR
                    # Keep the backed-off RTO
                    pass
                
                # Clean up retransmit count
                if seq_num in self.num_retransmits:
                    del self.num_retransmits[seq_num]
    
    def handle_timeout(self, seq_num):
        """Handle retransmission timeout."""
        # KARN'S RULE 2: Exponential backoff
        self.rto = min(self.rto * 2, MAX_RTO)  # Double RTO, up to maximum
        
        # Retransmit the segment (will be marked as retransmitted)
        segment = self.get_segment(seq_num)
        self.send_segment(segment)
    
    def update_rtt_estimator(self, sample_rtt):
        """Jacobson's algorithm for updating SRTT/RTTVAR."""
        if self.srtt is None:
            # First sample
            self.srtt = sample_rtt
            self.rttvar = sample_rtt / 2
        else:
            # Subsequent samples
            err = sample_rtt - self.srtt
            self.srtt = self.srtt + (err / 8)
            self.rttvar = self.rttvar + (abs(err) - self.rttvar) / 4
        
        # Recalculate RTO
        self.rto = self.srtt + 4 * self.rttvar
        self.rto = max(self.rto, MIN_RTO)  # Apply minimum bound

The Key Insight

Why Both Rules Are Necessary

It's tempting to simplify Karn's algorithm to just one of the rules. Let's see why both are essential:

Without Rule 1 (Only Backoff)

If we still update SRTT/RTTVAR using ambiguous samples but apply backoff:

Ambiguous samples still corrupt the estimator
SRTT could drift high or low depending on scenarios
Once we get a clean sample, RTO calculation uses corrupted SRTT/RTTVAR
The backoff is undone by a corrupted small RTO calculation

Result: The estimator corruption problem remains; backoff just delays its effects.

Without Rule 2 (Only "Don't Measure")

If we refuse ambiguous samples but don't back off:

Scenario: Network RTT Increases Permanently
Event	RTO	Actual RTT	Outcome
Initial state	200ms	100ms	Working fine
Network degrades	200ms	500ms	Timeout (expected)
Retransmit	200ms (unchanged)	500ms	Segment sent; don't measure
ACK arrives	200ms (no update)	500ms	Timeout again on next segment
Retransmit again	200ms (still)	500ms	Stuck in perpetual timeout loop

Result: The connection becomes stuck in a perpetual timeout-retransmit loop, never adapting to the new network conditions.

Rule 1 Alone

•Protects estimator integrity
•But cannot adapt to worsening conditions
•Leads to perpetual timeouts
•Connection effectively dead

Rule 2 Alone

•Allows adaptation via backoff
•But corrupted estimator remains
•Once clean sample arrives, RTO corrupted
•Potential for storms or stalls

Synergy of the Two Rules

Interaction with Jacobson's Algorithm

Karn's algorithm and Jacobson's algorithm are designed to work together as complementary pieces of the RTO calculation:

The Division of Labor

Algorithm	Responsibility
Jacobson's	How to update SRTT and RTTVAR given an RTT sample; how to calculate RTO from these estimates
Karn's	Which samples to feed into Jacobson's algorithm; how to adjust RTO when samples are unavailable

Neither algorithm is complete without the other:

Jacobson's algorithm needs samples, but doesn't specify how to handle retransmission ambiguity
Karn's algorithm specifies which samples to use, but doesn't say how to process them

The Complete Algorithm

Combining both algorithms gives us the complete TCP RTO calculation:

complete_rto_algorithm.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
"""
Complete TCP RTO Calculation: Jacobson + Karn
 
This implementation follows RFC 6298 which incorporates both algorithms.
"""
 
class TCPTimer:
    # Constants (from RFC 6298)
    ALPHA = 1/8       # SRTT smoothing factor
    BETA = 1/4        # RTTVAR smoothing factor  
    K = 4             # Variance multiplier
    MIN_RTO = 1000    # 1 second minimum (ms)
    MAX_RTO = 60000   # 60 second maximum (ms)
    
    def __init__(self):
        self.srtt = None
        self.rttvar = None
        self.rto = self.MIN_RTO  # Initial RTO
        self._rto_backed_off = False
    
    def process_ack(self, sample_rtt: float, was_retransmitted: bool):
        """
        Process an acknowledgment and potentially update RTO.
        
        Args:
            sample_rtt: Measured RTT for this sample (ms)
            was_retransmitted: True if the segment was retransmitted
        """
        # KARN'S RULE 1: Don't use retransmitted samples
        if was_retransmitted:
            # Do not update SRTT/RTTVAR
            # Keep current (possibly backed-off) RTO
            return
        
        # Unambiguous sample: apply Jacobson's algorithm
        if self.srtt is None:
            # First measurement (RFC 6298 Section 2.2)
            self.srtt = sample_rtt
            self.rttvar = sample_rtt / 2
        else:
            # Subsequent measurements (RFC 6298 Section 2.3)
            # Note: Use SRTT from before update for RTTVAR calculation
            err = sample_rtt - self.srtt
            
            # Update RTTVAR first (uses old SRTT)
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(err)
            
            # Update SRTT
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample_rtt
        
        # Calculate new RTO
        self.rto = self.srtt + self.K * self.rttvar
        
        # Apply bounds
        self.rto = max(self.rto, self.MIN_RTO)
        self.rto = min(self.rto, self.MAX_RTO)
        
        # Clear backoff flag - we have a fresh measurement
        self._rto_backed_off = False
    
    def handle_timeout(self):
        """
        Handle a retransmission timeout.
        
        KARN'S RULE 2: Back off the RTO.
        """
        # Exponential backoff: double the RTO
        self.rto = min(self.rto * 2, self.MAX_RTO)
        self._rto_backed_off = True
        
        # The backed-off RTO is used until a successful
        # non-retransmitted segment is acknowledged
    
    def get_rto(self) -> float:
        """Get the current RTO value for segment transmission."""
        return self.rto
 
 
# Example: Simulating TCP behavior
timer = TCPTimer()
 
# First segment sent and acknowledged (RTT = 100ms)
timer.process_ack(sample_rtt=100, was_retransmitted=False)
print(f"After first sample: RTO = {timer.rto:.1f}ms")
# RTO = 100 + 4*50 = 300ms
 
# Second segment acknowledged (RTT = 120ms)
timer.process_ack(sample_rtt=120, was_retransmitted=False)
print(f"After second sample: RTO = {timer.rto:.1f}ms")
 
# Timeout occurs on third segment
timer.handle_timeout()
print(f"After timeout (backoff): RTO = {timer.rto:.1f}ms")
 
# Retransmission acknowledged - don't update estimator
timer.process_ack(sample_rtt=150, was_retransmitted=True)
print(f"After retransmit ACK (no update): RTO = {timer.rto:.1f}ms")
 
# Next segment succeeds without retransmission
timer.process_ack(sample_rtt=110, was_retransmitted=False)
print(f"After clean sample: RTO = {timer.rto:.1f}ms")

RFC 6298 Clarifications

TCP Timestamps: The Modern Solution

Karn's algorithm solves the ambiguity problem by refusing to measure ambiguous samples. But what if we could eliminate the ambiguity entirely? That's what TCP timestamps (RFC 7323) accomplish.

How TCP Timestamps Work

The TCP Timestamps option adds two 32-bit fields to the TCP header:

TSval (Timestamp Value): The sender's timestamp when the segment was sent
TSecr (Timestamp Echo Reply): The most recent TSval received from the other side

When a sender transmits a segment, it includes its current timestamp in TSval. When the receiver acknowledges, it echoes that timestamp back in TSecr.

Converting Mermaid diagram...

Eliminating Ambiguity

With timestamps, the sender knows exactly which transmission the ACK corresponds to:

Original sent at TSval=1000
Retransmission sent at TSval=1200
ACK arrives with TSecr=1200
Conclusion: The ACK is for the retransmission (TSval=1200 matches)
RTT sample: Current time minus 1200

Even though the segment was retransmitted, we can now safely use the RTT sample because we know which transmission it corresponds to.

Benefits of TCP Timestamps

Advantages of TCP Timestamps

•Eliminates retransmission ambiguity — Every ACK can be precisely matched to a specific transmission.
•More RTT samples — Without Karn's Rule 1 restriction, every ACK provides a valid RTT sample.
•Better estimator accuracy — More samples lead to more accurate SRTT and RTTVAR values.
•Protection Against Wrapped Sequence (PAWS) — Timestamps help detect ancient duplicate segments even after sequence number wrap.
•Essential for high-bandwidth networks — With window scaling, sequence numbers can wrap quickly; PAWS becomes essential.

Karn's Algorithm in the Timestamp Era

With timestamps, Karn's Rule 1 becomes less critical—we can safely use samples from retransmitted segments. However, Karn's Rule 2 (exponential backoff) remains important:

Even with timestamps, a timeout indicates something is wrong
Backoff still provides conservative adaptation to unknown conditions
The timeout might be due to congestion; aggressive retransmission would worsen it

Modern TCPs with timestamp support thus:

Use samples from retransmitted segments (if timestamped)
Still apply exponential backoff on timeout
Recalculate RTO normally once a clean, timestamped sample arrives

Timestamp Adoption

Practical Considerations and Edge Cases

Implementing Karn's algorithm correctly requires attention to several practical details:

Handling Multiple Retransmissions

If a segment is retransmitted multiple times before an ACK arrives:

T₁: Send segment (seq=1000)
T₂: Timeout → retransmit
T₃: Timeout → retransmit again
T₄: ACK arrives

Without timestamps, all three transmissions are ambiguous. The segment is marked as retransmitted, and no sample is taken.

With timestamps, the TSecr in the ACK indicates which transmission was received. We can calculate RTT relative to that specific transmission.

Exponential Backoff Limits

Backoff should be bounded to prevent RTO from growing unboundedly:

backoff_limits.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
# Typical backoff implementation with limits
 
MAX_RTO = 60000  # 60 seconds maximum
MAX_RETRIES = 15  # Maximum retransmission attempts
 
def handle_timeout(self, retransmit_count):
    """Apply exponential backoff with limits."""
    
    # Check if we've exceeded maximum retries
    if retransmit_count >= MAX_RETRIES:
        # Connection presumed dead, abort
        self.abort_connection()
        return
    
    # Apply exponential backoff
    # Note: each timeout doubles RTO from its CURRENT value
    # not from the original calculated value
    self.rto = min(self.rto * 2, MAX_RTO)
    
    # Some implementations also track backoff depth
    # for diagnostic purposes
    self.backoff_depth = retransmit_count

Race Conditions with Multiple Segments

Consider:

Segment A sent (seq=1000) at T₁
Segment B sent (seq=1500) at T₂
Segment A times out, retransmitted at T₃
ACK for both A and B arrives (ack=2000) at T₄

In this case:

Segment A is marked as retransmitted → no sample (per Karn's Rule 1)
Segment B was never retransmitted → safe to use RTT = T₄ - T₂

Modern implementations track per-segment retransmission status to maximize usable samples while still protecting the estimator.

Spurious Timeouts

Implementation Variation

Summary: The Wisdom of Not Measuring

Karn's algorithm embodies a profound engineering principle: when you can't make a clean measurement, it's better to not measure than to measure incorrectly. Let's consolidate the key takeaways:

Key Takeaways

•Retransmission creates ambiguity — When a segment is retransmitted, the resulting ACK could acknowledge either the original or the retransmission, and there's no way to tell which.
•Both guess strategies fail — Using original send time causes RTO inflation; using retransmit time causes RTO collapse. Neither is acceptable.
•Rule 1: Don't use ambiguous samples — Protect the RTT estimator by refusing to update it with samples that might corrupt it.
•Rule 2: Apply exponential backoff — Since we can't measure, adapt to potentially worsening conditions by doubling RTO on each timeout.
•Both rules are necessary — Rule 1 alone can't adapt to changing conditions; Rule 2 alone still corrupts the estimator.
•TCP timestamps eliminate ambiguity — By echoing the sender's timestamp, the receiver enables precise matching of ACKs to specific transmissions.
•Backoff still applies with timestamps — Even with perfect RTT measurement, backoff provides conservative congestion response.

What's next:

Page Complete