Flow Control - Learning Module


Zero Window

When the Receiver Says Stop

What happens when a receiver's buffer fills completely? The answer is both simple and profound: the receiver advertises a window of zero bytes, and the sender must halt all data transmission. This is the zero window condition—the ultimate expression of receiver-based flow control.

But stopping is easy; knowing when to resume is hard. If the receiver's buffer clears but the window update is lost, both endpoints could wait forever—the sender waiting for permission, the receiver waiting for data. TCP solves this potential deadlock through clever mechanisms, but understanding them requires diving deep into protocol behavior.

This page examines the zero window condition comprehensively: its causes, consequences, the persist timer mechanism, zero window probes, and the relationship to silly window syndrome.

What You Will Learn

By the end of this page, you will understand: (1) What causes a zero window condition, (2) Why zero window can lead to deadlock without countermeasures, (3) How the persist timer and zero window probes break deadlocks, (4) Clark's algorithm for receiver-side SWS avoidance, (5) The complete protocol flow during zero window scenarios, and (6) Performance implications and monitoring.

Understanding the Zero Window Condition

A zero window occurs when the receiver has no capacity to accept more data. This is a normal condition, not an error—it means flow control is working as designed.

When Zero Window Occurs

The receiver advertises rwnd = 0 when:

rwnd = RcvBuffer - (LastByteRcvd - LastByteRead) = 0, which is equivalent to LastByteRcvd - LastByteRead = RcvBuffer

In words: the receive buffer is completely occupied by data awaiting application consumption. There is literally no memory available for additional bytes.
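As a minimal sketch of that bookkeeping, the advertised window can be computed from the three quantities in the formula. The function name `advertised_window` and the sample numbers are illustrative assumptions, not taken from any particular TCP stack.

```python
def advertised_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Return rwnd: the free space left in the receive buffer, in bytes."""
    buffered = last_byte_rcvd - last_byte_read  # bytes waiting for the application
    if not 0 <= buffered <= rcv_buffer:
        raise ValueError("buffered bytes must lie between 0 and rcv_buffer")
    return rcv_buffer - buffered


# A 64 KiB buffer holding 16 KiB of unread data leaves 48 KiB of window.
print(advertised_window(65536, 16384, 0))  # 49152
# When unread data fills the whole buffer, the window is zero.
print(advertised_window(65536, 65536, 0))  # 0
```

The zero window case is simply the point where the unread data equals the buffer size.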

Common Causes

Slow application consumption: The receiving application can't keep up with network arrival rate
Application pause: Application stops reading (e.g., user pauses video)
Application blocking: Application blocks on I/O (e.g., database write)
Processing overhead: Application processing each unit takes significant time
Memory pressure: System has reduced the receive buffer due to resource constraints


The Sender's Response

When the sender receives an ACK with window = 0:

Stop transmitting: No new data may be sent (the window is exhausted)
Start persist timer: Begin timing to probe the receiver periodically
Wait for window update: Hope for an unsolicited ACK with window > 0
Retain unsent data: Application data remains in send buffer
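The sender's decision after each ACK might be sketched as follows. The function names and the returned action strings are hypothetical; the usable-window expression follows the usual SND.UNA + SND.WND - SND.NXT bookkeeping, and congestion control is deliberately ignored.

```python
def usable_window(snd_una: int, snd_nxt: int, snd_wnd: int) -> int:
    """Bytes the sender may still put on the wire: SND.UNA + SND.WND - SND.NXT."""
    return max(0, snd_una + snd_wnd - snd_nxt)


def on_ack(snd_una: int, snd_nxt: int, advertised_window: int) -> str:
    """Decide what the sender does after an ACK carrying advertised_window."""
    if advertised_window == 0:
        # Zero window: stop sending new data and arm the persist timer.
        return "start_persist_timer"
    if usable_window(snd_una, snd_nxt, advertised_window) > 0:
        return "send_data"
    # The window is open but fully used by data already in flight.
    return "wait_for_ack"


print(on_ack(snd_una=5000, snd_nxt=5000, advertised_window=0))     # start_persist_timer
print(on_ack(snd_una=5000, snd_nxt=5000, advertised_window=4096))  # send_data
print(on_ack(snd_una=5000, snd_nxt=9096, advertised_window=4096))  # wait_for_ack
```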

Important Distinction

Zero window is not connection termination or error. The connection remains fully established. The sender still:

Accepts ACKs for previously sent data
May receive data in the reverse direction: TCP is full-duplex, and each direction has its own window
Maintains keepalives (if configured)
Retransmits unacknowledged data if needed

Zero Window ≠ Connection Problem

A zero window is a healthy flow control response—it prevents receiver buffer overflow. Administrators seeing zero windows should not panic; they should investigate why the receiving application isn't consuming data. The network and TCP stack are working correctly.

The Deadlock Problem

Zero window creates a potential deadlock—a situation where both endpoints wait forever, each expecting the other to act first. Understanding this problem is essential to understanding the solution.

The Scenario

Receiver's buffer is full; advertises rwnd = 0
Sender receives the zero window; stops transmitting
Application reads data; buffer space becomes available
Receiver sends window update (ACK with rwnd > 0)
Window update is lost
Sender: "Waiting for non-zero window..."
Receiver: "Waiting for data..."
Both wait forever—deadlock

Why It Happens

Pure ACKs (including window updates) are not retransmitted. TCP retransmits only segments that occupy sequence space, which on an established connection means data. If a pure ACK (no data) is lost, the information it carried is lost until something triggers a new ACK.

In the deadlock scenario:

The receiver has no data to send that would piggyback a window update
The sender has stopped sending, so no data arrives to trigger an ACK
Neither side has a reason to transmit—perfect, catastrophic silence
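A toy simulation can make the silence concrete. This is a sketch under simplifying assumptions: time advances in abstract ticks, the receiver sends exactly one window update when space opens, and the sender has no probing mechanism. The function name and parameters are hypothetical.

```python
def run_without_probes(update_lost: bool, max_ticks: int = 1000) -> bool:
    """Return True if the sender ever sees a non-zero window within max_ticks."""
    sender_window = 0     # the sender's view of rwnd
    receiver_free = 0     # actual free space at the receiver
    update_sent = False

    for tick in range(max_ticks):
        if tick == 1:
            receiver_free = 4096          # the application reads some data
        if receiver_free > 0 and not update_sent:
            update_sent = True            # a pure ACK: sent once, never retransmitted
            if not update_lost:
                sender_window = receiver_free
        if sender_window > 0:
            return True                   # the sender resumes
    return False                          # nobody ever had a reason to transmit


print(run_without_probes(update_lost=False))  # True
print(run_without_probes(update_lost=True))   # False
```

Raising `max_ticks` does not change the second result, because no event in the model can ever trigger another transmission.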


Pure ACKs Are Not Retransmitted

This is the root cause of the potential deadlock. TCP retransmits data segments when unacknowledged, but pure ACKs (with no data payload) are never retransmitted. If the window update ACK is lost, the receiver has no automatic mechanism to resend it.

Without a Solution

Without countermeasures, connections could hang indefinitely in this deadlock state:

User sees application frozen
Neither side detects an error
Connection remains technically "established"
No timeout occurs (retransmission timers aren't running—there's nothing to retransmit)
Recovery requires application-level intervention or connection restart

This is clearly unacceptable. TCP needs a mechanism to break the deadlock.

The Persist Timer: Breaking the Deadlock

TCP's solution to the zero window deadlock is the persist timer—a mechanism that allows the sender to periodically "probe" the receiver to discover if the window has opened.

How the Persist Timer Works

Timer starts: When sender receives zero window, start persist timer
Timer expires: When timer fires, send a zero window probe (ZWP)
Probe triggers response: Receiver must ACK the probe with current window
Check response: If window > 0, resume sending; if still zero, restart timer
Repeat: Continue probing until window opens

Persist Timer Timing

The persist timer uses exponential backoff:

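Initial timeout: Typically 1-5 seconds, derived from the current RTO estimate
Doubles on each retry: 1s → 2s → 4s → 8s → 16s → ...
Maximum: 60-120 seconds (implementation-dependent)
Never gives up: Unlike the retransmission timer, the persist timer keeps probing for as long as the receiver acknowledges the probes

The persist timer itself never causes a connection abort while the receiver keeps answering probes. Even if the receiver's buffer remains full for hours, the connection persists (hence the name). If probes go unanswered, the peer is unreachable rather than slow, and an implementation may eventually time the connection out.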
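The backoff schedule can be sketched as a generator. The 1-second start and 60-second cap are assumptions picked from the ranges quoted above; real stacks derive the first interval from the RTO.

```python
from itertools import islice
from typing import Iterator


def persist_intervals(initial_s: float = 1.0, max_s: float = 60.0) -> Iterator[float]:
    """Yield successive persist timeouts: doubling each time, capped at max_s."""
    interval = initial_s
    while True:               # the schedule itself never terminates
        yield interval
        interval = min(interval * 2, max_s)


print(list(islice(persist_intervals(), 8)))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Once the cap is reached, every further probe is spaced `max_s` seconds apart.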


Persist Timer vs Retransmission Timer

These two timers are often confused. Here's the distinction:

Aspect	Retransmission Timer	Persist Timer
Purpose	Recover lost data	Discover window opening
Triggered by	Data sent, ACK pending	Zero window received
On expiration	Retransmit data	Send zero window probe
Gives up?	Yes (connection abort)	No (keeps probing while probes are ACKed)
Exponential backoff	Yes	Yes

The retransmission timer protects against data loss; the persist timer protects against window update loss.

persist_timer.py
# TCP Persist Timer Implementation
 
class PersistTimer:
    """
    Models the TCP persist timer used to probe zero-window receivers.
    
    The persist timer ensures the sender can discover when a
    zero-window condition clears, preventing deadlock.
    """
    
    def __init__(self, initial_rto_ms: float = 1000):
        self.initial_rto = initial_rto_ms
        self.current_timeout = initial_rto_ms
        self.max_timeout = 60000  # 60 seconds maximum
        self.probe_count = 0
        self.is_running = False
    
    def start(self):
        """
        Start persist timer when zero window is received.
        
        Called when an ACK with window=0 arrives.
        """
        self.is_running = True
        self.current_timeout = self.initial_rto
        self.probe_count = 0
        print(f"Persist timer started: {self.current_timeout}ms")
    
    def on_expire(self) -> str:
        """
        Handle persist timer expiration.
        
        Returns: the action to take, always 'send_zero_window_probe'.
        """
        self.probe_count += 1
        print(f"Persist timer expired (probe #{self.probe_count})")
        
        # Exponential backoff for next timeout
        self.current_timeout = min(
            self.current_timeout * 2,
            self.max_timeout
        )
        
        return 'send_zero_window_probe'
    
    def on_window_update(self, new_window: int):
        """
        Handle receiving a window update.
        
        If window > 0, stop persist timer and resume sending.
        If window = 0, continue probing (on_expire already backed off the timeout).
        """
        if new_window > 0:
            self.is_running = False
            print(f"Window opened ({new_window} bytes), persist timer stopped")
            return 'resume_sending'
        else:
            # Still zero, keep probing
            print(f"Window still zero, will probe in {self.current_timeout}ms")
            return 'continue_probing'
    
    def stop(self):
        """Stop the persist timer (e.g., on connection close)."""
        self.is_running = False
        self.current_timeout = self.initial_rto
        self.probe_count = 0

Zero Window Probes (ZWP)

When the persist timer expires, the sender transmits a zero window probe—a special segment designed to elicit an ACK with the receiver's current window status.

ZWP Characteristics

A zero window probe is typically:

One byte of data: Contains a single byte from the send buffer
Sequence number: The next expected byte (one past the last ACK'd byte)
Designed to be ACKed: Forces receiver to respond

Alternatively, some implementations send:

Zero-length probe: A segment with no data (just headers)
The same byte repeatedly: Send the same one byte on each probe

Why One Byte?

Sending one byte has important properties:

Forces ACK: Any data segment requires acknowledgment
Tests window: If receiver can accept even one byte, window > 0
Restarts the exchange: Gets the receiver talking again
Minimal waste: If the window is still zero, at most one byte of payload is discarded

Receiver Response to ZWP

When the receiver receives a zero window probe:

If window is zero:

Cannot accept the data byte (no buffer space)
Sends ACK with current acknowledgment number (does NOT advance)
ACK contains window = 0
May (or may not) buffer the probe byte in some implementations

If window is non-zero:

Accepts the data byte (if one was sent)
Sends ACK with next expected sequence number
ACK contains current window value (> 0)
Sender can now resume transmission

Zero Window Probe Handling
Receiver State	Action	ACK Contents	Sender's Next Step
Window still zero	Cannot accept probe	Same ACK num, window=0	Restart persist, probe later
Window now > 0	Accept probe byte	ACK num+1, window=N	Resume sending
Window > 0, different data expected	Out of order	ACK current, window=N	Sender may need to retransmit
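The three rows of the table can be modeled with a small receiver-side function. This is an illustrative sketch: `Ack`, `handle_probe`, and the parameter names are hypothetical, and real stacks differ in details such as whether they ever buffer the probe byte.

```python
from typing import NamedTuple


class Ack(NamedTuple):
    ack_num: int   # next sequence number the receiver expects
    window: int    # advertised window in bytes


def handle_probe(rcv_nxt: int, free_space: int, probe_seq: int, probe_len: int) -> Ack:
    """Model the receiver's reply to a zero window probe."""
    in_order = probe_seq == rcv_nxt
    if in_order and free_space >= probe_len:
        # Room for the probe payload: accept it and advance the ACK number.
        return Ack(rcv_nxt + probe_len, free_space - probe_len)
    # No room, or not the expected sequence number: repeat the current ACK.
    return Ack(rcv_nxt, free_space)


print(handle_probe(rcv_nxt=1000, free_space=0, probe_seq=1000, probe_len=1))     # window still zero
print(handle_probe(rcv_nxt=1000, free_space=4096, probe_seq=1000, probe_len=1))  # probe byte accepted
print(handle_probe(rcv_nxt=1000, free_space=4096, probe_seq=999, probe_len=1))   # unexpected sequence
```

In every case the reply carries the current window, which is the information the sender was missing.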

ZWP and Retransmission

Zero window probes are NOT retransmissions—they're new transmissions from the sender's perspective. If the probe is lost, the persist timer will fire again and send another probe. The probe itself is not retransmitted by the retransmission timer; the persist timer handles retrying.

zero_window_probe.py
# Zero Window Probe Generation
 
class ZeroWindowProbeGenerator:
    """
    Generates zero window probes when persist timer fires.
    """
    
    def __init__(self, send_buffer: bytes, snd_una: int, snd_nxt: int):
        self.send_buffer = send_buffer
        self.snd_una = snd_una  # Oldest unacked sequence
        self.snd_nxt = snd_nxt  # Next sequence to send
    
    def generate_probe(self) -> dict:
        """
        Generate a zero window probe segment.
        
        The probe contains one byte starting at SND.NXT.
        This byte will be acknowledged if the receiver can accept it.
        """
        # Get the next byte from send buffer
        buffer_offset = self.snd_nxt - self.snd_una
        
        if buffer_offset < len(self.send_buffer):
            probe_byte = self.send_buffer[buffer_offset:buffer_offset + 1]
        else:
            # No data to probe with (unusual but possible)
            probe_byte = b''
        
        return {
            'sequence_number': self.snd_nxt,
            'data': probe_byte,
            'data_length': len(probe_byte),
            'flags': {'ACK': True},  # Probe also ACKs any receiver data
            'is_zwp': True
        }
    
    def handle_probe_response(self, ack_num: int, window: int) -> dict:
        """
        Process the response to a zero window probe.
        """
        if window > 0:
            return {
                'action': 'resume_sending',
                'window_opened': True,
                'available_window': window,
                'new_ack': ack_num > self.snd_una
            }
        else:
            return {
                'action': 'continue_probing',
                'window_opened': False,
                'message': 'Receiver still has zero window'
            }

Silly Window Syndrome Avoidance

The zero window handling reveals a related problem: Silly Window Syndrome (SWS). Understanding SWS is essential because it affects when receivers advertise non-zero windows.

The Silly Window Problem

Imagine this scenario:

Receiver's buffer is full → advertises window = 0
Application reads 50 bytes → buffer has 50 bytes free
Receiver sends window update: window = 50
Sender sends 50 bytes (tiny segment!)
Buffer full again → window = 0
Application reads 30 bytes → window = 30
Send 30 bytes... repeat forever

The result: countless tiny segments, each carrying at least 40 bytes of TCP and IP headers (20 bytes each, without options) for a few bytes of payload. Network efficiency collapses.
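The cost is easy to quantify. The sketch below assumes 40 bytes of headers per segment (a 20-byte TCP header plus a 20-byte IPv4 header, no options) and ignores link-layer framing; the function name is hypothetical.

```python
HEADER_BYTES = 40  # assumption: 20-byte TCP header + 20-byte IPv4 header, no options


def payload_efficiency(payload: int, header: int = HEADER_BYTES) -> float:
    """Fraction of the bytes on the wire that are application data."""
    return payload / (payload + header)


for size in (30, 50, 1460):
    print(f"{size:>5} B payload -> {payload_efficiency(size):.1%} data")
# 30 B -> 42.9%, 50 B -> 55.6%, 1460 B -> 97.3%
```

A full-sized segment spends under 3% of its bytes on headers; a 30-byte segment spends more than half.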

The Name

"Silly Window Syndrome" was coined because it seems silly to send 50-byte segments when the overhead approaches 50%. It's a pathological state where both endpoints are technically correct but efficiency is terrible.


Clark's Algorithm (Receiver-Side SWS Avoidance)

David Clark proposed a simple rule for receivers:

Do not advertise a small window. After advertising zero, wait until the free space is at least the smaller of one Maximum Segment Size (MSS) or half the receive buffer before advertising a non-zero window.

This forces the receiver to hold off sending window updates until a "meaningful" amount of space is available—preventing the cascade of tiny windows.

clarks_algorithm.py
# Clark's Algorithm: Receiver-Side SWS Avoidance
 
class ClarksSWSAvoidance:
    """
    Implements Clark's algorithm for avoiding silly window syndrome.
    
    The receiver should not advertise small windows. Wait until
    sufficient space is available to make transmission worthwhile.
    """
    
    def __init__(self, rcv_buffer_size: int, mss: int = 1460):
        self.rcv_buffer_size = rcv_buffer_size
        self.mss = mss
        self.last_advertised_window = rcv_buffer_size
        self.was_zero_window = False
    
    def calculate_advertisable_window(self, actual_free_space: int) -> int:
        """
        Calculate the window to advertise, applying Clark's algorithm.
        
        If coming out of zero window, don't advertise until
        sufficient space is available.
        """
        # Threshold for advertising after zero window
        threshold = min(self.mss, self.rcv_buffer_size // 2)
        
        if self.was_zero_window:
            # We previously advertised zero
            if actual_free_space >= threshold:
                # Enough space now; advertise the full available space
                self.was_zero_window = False
                return actual_free_space
            else:
                # Not enough space yet; keep advertising zero
                # (This prevents SWS)
                return 0
        else:
            # Normal operation; advertise actual space
            # But still avoid going to tiny windows
            if actual_free_space < threshold:
                self.was_zero_window = True
                return 0
            return actual_free_space
    
    def on_window_advertised(self, window: int):
        """Track what was advertised."""
        self.last_advertised_window = window
        if window == 0:
            self.was_zero_window = True
 
 
# Example
sws = ClarksSWSAvoidance(rcv_buffer_size=65536, mss=1460)
 
# Buffer full, zero window
print(sws.calculate_advertisable_window(0))  # 0
 
# 50 bytes freed - NOT enough
print(sws.calculate_advertisable_window(50))  # Still 0 (Clark's algorithm)
 
# 2000 bytes freed - enough!
print(sws.calculate_advertisable_window(2000))  # 2000 (full available)

Sender-Side SWS Avoidance: Nagle's Algorithm

The sender also has SWS avoidance: Nagle's algorithm. It prevents sending small segments when there's outstanding unacknowledged data. Together, Clark's algorithm (receiver) and Nagle's algorithm (sender) prevent SWS from both ends.
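The core of Nagle's rule fits in a few lines. This is a simplified sketch: it ignores the receiver's window and options such as TCP_NODELAY, and the function and parameter names are hypothetical.

```python
def nagle_should_send(queued_bytes: int, unacked_bytes: int, mss: int = 1460) -> bool:
    """Decide whether the sender may transmit now under Nagle's rule."""
    if queued_bytes == 0:
        return False               # nothing to send
    if queued_bytes >= mss:
        return True                # a full-sized segment is always worth sending
    # A small segment goes out only when nothing is in flight;
    # otherwise it waits and coalesces with later writes until an ACK arrives.
    return unacked_bytes == 0


print(nagle_should_send(queued_bytes=1460, unacked_bytes=1000))  # True
print(nagle_should_send(queued_bytes=10, unacked_bytes=0))       # True
print(nagle_should_send(queued_bytes=10, unacked_bytes=1000))    # False
```

The effect is that at most one small segment is outstanding at a time, which bounds the header overhead even when the application writes a few bytes at a time.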

Complete Zero Window Protocol Flow

Let's trace through a complete zero window scenario from start to resolution, integrating all the mechanisms we've discussed.


Zero Window Flow Summary

1. Normal transfer: Data flows, windows shrink as buffer fills
2. Buffer fills: Last segment fills buffer; receiver ACKs with window=0
3. Sender stops: Starts persist timer (initial timeout ~1-5 seconds)
4. Persist timer fires: Sender transmits zero window probe (1 byte)
5. Receiver responds: ACK with current window (still 0 if buffer still full)
6. Sender backs off: Doubles persist timeout, waits again
7. Application reads: Buffer space becomes available
8. Clark's check: If free space >= min(MSS, 50% of buffer), advertise
9. Window update: Receiver sends ACK with non-zero window
10. Sender resumes: Cancels persist timer, transmits data

Parallel Window Updates

The window update (step 9) might arrive before the persist timer fires if the application reads quickly. The sender doesn't need to wait for its probe—any ACK with window > 0 cancels the persist timer and allows transmission.
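The whole episode can be traced with a small event simulation. This is a sketch under stated assumptions: time moves in one-second steps, the first persist timeout is 1 second, the application reads once, and the `update_lost` flag decides whether the unsolicited window update reaches the sender. All names are hypothetical.

```python
from typing import List, Tuple


def zero_window_episode(app_reads_at: int, bytes_read: int, update_lost: bool,
                        mss: int = 1460, buffer_size: int = 8192,
                        max_interval: int = 60, horizon: int = 600) -> List[Tuple[int, str]]:
    """Return (time_s, event) pairs for one zero window episode."""
    threshold = min(mss, buffer_size // 2)   # Clark's check
    free = 0                                 # receiver buffer is full at t=0
    interval = 1                             # first persist timeout, in seconds
    next_probe = interval
    events = [(0, "receiver advertises window=0; sender starts persist timer")]

    for t in range(1, horizon + 1):
        if t == app_reads_at:
            free = bytes_read
            events.append((t, f"application reads {bytes_read} bytes"))
            if free >= threshold:
                if update_lost:
                    events.append((t, "window update sent but lost"))
                else:
                    events.append((t, f"window update arrives, window={free}; sender resumes"))
                    return events
        if t == next_probe:
            advertised = free if free >= threshold else 0
            events.append((t, f"probe sent; ACK reports window={advertised}"))
            if advertised > 0:
                events.append((t, "sender cancels persist timer and resumes"))
                return events
            interval = min(interval * 2, max_interval)   # back off
            next_probe = t + interval
    return events


for t, event in zero_window_episode(app_reads_at=5, bytes_read=4096, update_lost=True):
    print(f"t={t:>3}s  {event}")
```

With the update lost, the sender recovers at the next probe (t=7 in this run) instead of hanging; with `update_lost=False` it resumes at t=5 without waiting for a probe.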

Monitoring and Diagnostics

Zero window conditions are visible in TCP diagnostics and can indicate application or system problems.

What Zero Windows Indicate

Frequent or prolonged zero windows suggest:

Application bottleneck: Receiver application can't keep up
I/O blocking: Application blocked on disk or database
CPU starvation: Application not getting enough CPU time
Memory pressure: System reducing buffer sizes
Lock contention: Application threads blocked on locks

Diagnostic Tools

netstat / ss (Linux)

ss -ti state established | grep -E 'wscale|rcv_space'
# Shows window scaling and receive buffer usage

netstat -s | grep -i 'prune\|zero'
# Shows pruned packets (memory pressure) and zero window events

Wireshark

Filter: tcp.analysis.zero_window
# Shows segments with zero window advertisement

Filter: tcp.analysis.zero_window_probe
# Shows zero window probe segments

Zero Window Diagnostic Interpretation
Observation	Likely Cause	Investigation	Resolution
Occasional zero window	Normal burst handling	May be fine	Monitor; optimize if persistent
Frequent zero window	App can't keep up	Profile receiver app	Optimize app or add resources
Prolonged zero window	App blocked/hung	Check app state	Fix blocking issue
Zero window + high CPU	CPU bottleneck	Profile app	Optimize or scale
Zero window + low CPU	I/O or lock blocking	Check I/O and locks	Fix blocking
System-wide zero windows	OS memory pressure	Check tcp_mem thresholds	Increase memory or limits
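For trend monitoring, kernel counters can be scraped rather than read by eye. The sketch below assumes a Linux host that exposes `/proc/net/netstat` in its usual layout (a line of counter names followed by a line of values for each group) and assumes counter names containing `ZeroWindow`, such as `TCPToZeroWindowAdv`; the exact names available depend on the kernel version. The sample text is invented for illustration.

```python
from typing import Dict

# Invented sample in the two-lines-per-group layout of /proc/net/netstat.
SAMPLE = """\
TcpExt: PruneCalled TCPFromZeroWindowAdv TCPToZeroWindowAdv TCPWantZeroWindowAdv
TcpExt: 0 12 12 340
IpExt: InOctets OutOctets
IpExt: 1000 2000
"""


def parse_netstat(text: str) -> Dict[str, int]:
    """Parse name/value line pairs into a flat {"Group.Counter": value} dict."""
    lines = [line for line in text.splitlines() if line.strip()]
    counters: Dict[str, int] = {}
    for header, values in zip(lines[0::2], lines[1::2]):
        group, names = header.split(":", 1)
        _, numbers = values.split(":", 1)
        for name, number in zip(names.split(), numbers.split()):
            counters[f"{group}.{name}"] = int(number)
    return counters


def zero_window_counters(counters: Dict[str, int]) -> Dict[str, int]:
    """Keep only the counters whose name mentions zero windows."""
    return {name: value for name, value in counters.items() if "ZeroWindow" in name}


print(zero_window_counters(parse_netstat(SAMPLE)))
```

On a live system the same functions could be fed the contents of `/proc/net/netstat`; sampling twice and subtracting gives a rate rather than a lifetime total.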

monitor_zero_window.sh
#!/bin/bash
# Monitor zero window conditions on Linux
 
echo "=== TCP Memory Status ==="
cat /proc/net/sockstat | grep TCP
 
echo ""
echo "=== TCP Memory Thresholds ==="
cat /proc/sys/net/ipv4/tcp_mem
echo "(low / pressure / high in pages)"
 
echo ""
echo "=== Current TCP Stats ==="
netstat -s | grep -iE 'prune|collapse|zero|rcv|memory'
 
echo ""
echo "=== Connections with Small Receive Windows ==="
ss -ti state established | grep -E 'rcv_space:[0-9]{1,3}[^0-9]'
 
echo ""
echo "=== Active Zero Window Connections ==="
ss -ti state established | grep 'wnd:0'

False Positive: Paused Applications

Not all zero windows indicate problems. A video player paused by the user will cause zero window until play resumes. A batch job waiting for downstream confirmation is not broken. Distinguish between intentional pauses and pathological blocking.

Summary: Zero Window

We have examined the zero window condition—when the receiver cannot accept more data—and the mechanisms that handle it. Let us consolidate the key insights:

Key Takeaways

• Zero window is normal flow control: It occurs when the receive buffer is full and indicates flow control is working
• Zero window can cause deadlock: If the window update is lost, both endpoints wait forever
• Persist timer breaks deadlock: Sender probes receiver periodically to discover window opening
• Zero window probes elicit ACKs: Small segments force receiver to respond with current window
• Exponential backoff prevents overload: Persist timer intervals increase to avoid overwhelming a slow receiver
• Clark's algorithm prevents SWS: Receiver waits for sufficient space before advertising non-zero window
• Zero windows are diagnosable: Tools like ss, netstat, and Wireshark reveal zero window conditions

Module Complete

This page concludes Module 1: Flow Control. We have progressed from understanding flow control's purpose, through receiver-based control and window advertisement, to buffer management and the zero window edge case. You now possess a comprehensive understanding of how TCP prevents receiver buffer overflow through coordinated sender-receiver communication.

What's Next

The next module will explore TCP's sliding window mechanism in greater depth, examining how the window slides as data is transmitted and acknowledged, window scaling for high-bandwidth networks, and the effective window calculation that combines flow control with congestion control.

Module 1 Complete

Congratulations! You have mastered TCP flow control. You understand why it exists (preventing receiver overflow), how it works (receiver-based window advertisement), its physical basis (buffer management), and its critical edge case (zero window). This knowledge forms the foundation for understanding TCP's broader reliability and performance mechanisms.

Zero Window

When the Receiver Says Stop

This page examines the zero window condition comprehensively: its causes, consequences, the persist timer mechanism, zero window probes, and the relationship to silly window syndrome.

What You Will Learn

Understanding the Zero Window Condition

A zero window occurs when the receiver has no capacity to accept more data. This is a normal condition, not an error—it means flow control is working as designed.

When Zero Window Occurs

The receiver advertises rwnd = 0 when:

RcvBuffer = LastByteRcvd - LastByteRead

In words: the receive buffer is completely occupied by data awaiting application consumption. There is literally no memory available for additional bytes.

Common Causes

Slow application consumption: The receiving application can't keep up with network arrival rate
Application pause: Application stops reading (e.g., user pauses video)
Application blocking: Application blocks on I/O (e.g., database write)
Processing overhead: Application processing each unit takes significant time
Memory pressure: System has reduced the receive buffer due to resource constraints

Converting Mermaid diagram...

The Sender's Response

When the sender receives an ACK with window = 0:

Stop transmitting: No new data may be sent (the window is exhausted)
Start persist timer: Begin timing to probe the receiver periodically
Wait for window update: Hope for an unsolicited ACK with window > 0
Retain unsent data: Application data remains in send buffer

Important Distinction

Zero window is not connection termination or error. The connection remains fully established. The sender still:

Accepts ACKs for previously sent data
May receive data if the connection is full-duplex and the other direction has window
Maintains keepalives (if configured)
Retransmits unacknowledged data if needed

Zero Window ≠ Connection Problem

The Deadlock Problem

The Scenario

Receiver's buffer is full; advertises rwnd = 0
Sender receives the zero window; stops transmitting
Application reads data; buffer space becomes available
Receiver sends window update (ACK with rwnd > 0)
Window update is lost
Sender: "Waiting for non-zero window..."
Receiver: "Waiting for data..."
Both wait forever—deadlock

Why It Happens

ACK packets (including window updates) are not retransmitted. TCP only retransmits data. If a pure ACK (no data) is lost, the information it carried is lost until something triggers a new ACK.

In the deadlock scenario:

The receiver has no data to send that would piggyback a window update
The sender has stopped sending, so no data arrives to trigger an ACK
Neither side has a reason to transmit—perfect, catastrophic silence

Converting Mermaid diagram...

Pure ACKs Are Not Retransmitted

Without a Solution

Without countermeasures, connections could hang indefinitely in this deadlock state:

User sees application frozen
Neither side detects an error
Connection remains technically "established"
No timeout occurs (retransmission timers aren't running—there's nothing to retransmit)
Recovery requires application-level intervention or connection restart

This is clearly unacceptable. TCP needs a mechanism to break the deadlock.

The Persist Timer: Breaking the Deadlock

TCP's solution to the zero window deadlock is the persist timer—a mechanism that allows the sender to periodically "probe" the receiver to discover if the window has opened.

How the Persist Timer Works

Timer starts: When sender receives zero window, start persist timer
Timer expires: When timer fires, send a zero window probe (ZWP)
Probe triggers response: Receiver must ACK the probe with current window
Check response: If window > 0, resume sending; if still zero, restart timer
Repeat: Continue probing until window opens

Persist Timer Timing

The persist timer uses exponential backoff:

Initial timeout: 1-5 seconds (depends on RTO estimate)
Doubles on each retry: 1s → 2s → 4s → 8s → 16s → ...
Maximum: 60-120 seconds (implementation-dependent)
Never gives up: Unlike retransmission timer, persist timer runs forever

The persist timer never causes connection abort. Even if the receiver's buffer remains full for hours, the connection persists (hence the name).

Converting Mermaid diagram...

Persist Timer vs Retransmission Timer

These two timers are often confused. Here's the distinction:

Aspect	Retransmission Timer	Persist Timer
Purpose	Recover lost data	Discover window opening
Triggered by	Data sent, ACK pending	Zero window received
On expiration	Retransmit data	Send zero window probe
Gives up?	Yes (connection abort)	No (probes forever)
exponential backoff	Yes	Yes

The retransmission timer protects against data loss; the persist timer protects against window update loss.

persist_timer.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
# TCP Persist Timer Implementation
 
class PersistTimer:
    """
    Models the TCP persist timer used to probe zero-window receivers.
    
    The persist timer ensures the sender can discover when a
    zero-window condition clears, preventing deadlock.
    """
    
    def __init__(self, initial_rto_ms: float = 1000):
        self.initial_rto = initial_rto_ms
        self.current_timeout = initial_rto_ms
        self.max_timeout = 60000  # 60 seconds maximum
        self.probe_count = 0
        self.is_running = False
    
    def start(self):
        """
        Start persist timer when zero window is received.
        
        Called when an ACK with window=0 arrives.
        """
        self.is_running = True
        self.current_timeout = self.initial_rto
        self.probe_count = 0
        print(f"Persist timer started: {self.current_timeout}ms")
    
    def on_expire(self) -> str:
        """
        Handle persist timer expiration.
        
        Returns: Action to take ('probe' or 'continue')
        """
        self.probe_count += 1
        print(f"Persist timer expired (probe #{self.probe_count})")
        
        # Exponential backoff for next timeout
        self.current_timeout = min(
            self.current_timeout * 2,
            self.max_timeout
        )
        
        return 'send_zero_window_probe'
    
    def on_window_update(self, new_window: int):
        """
        Handle receiving a window update.
        
        If window > 0, stop persist timer and resume sending.
        If window = 0, continue probing (timer already restarted).
        """
        if new_window > 0:
            self.is_running = False
            print(f"Window opened ({new_window} bytes), persist timer stopped")
            return 'resume_sending'
        else:
            # Still zero, keep probing
            print(f"Window still zero, will probe in {self.current_timeout}ms")
            return 'continue_probing'
    
    def stop(self):
        """Stop the persist timer (e.g., on connection close)."""
        self.is_running = False
        self.current_timeout = self.initial_rto
        self.probe_count = 0

Zero Window Probes (ZWP)

When the persist timer expires, the sender transmits a zero window probe—a special segment designed to elicit an ACK with the receiver's current window status.

ZWP Characteristics

A zero window probe is typically:

One byte of data: Contains a single byte from the send buffer
Sequence number: The next expected byte (one past the last ACK'd byte)
Designed to be ACKed: Forces receiver to respond

Alternatively, some implementations send:

Zero-length probe: A segment with no data (just headers)
The same byte repeatedly: Send the same one byte on each probe

Why One Byte?

Sending one byte has important properties:

Forces ACK: Any data segment requires acknowledgment
Tests window: If receiver can accept even one byte, window > 0
Resumes discussion: Gets the receiver talking again
Minimal waste: Only one byte uses buffer if receiver still has zero window

Receiver Response to ZWP

When the receiver receives a zero window probe:

If window is zero:

Cannot accept the data byte (no buffer space)
Sends ACK with current acknowledgment number (does NOT advance)
ACK contains window = 0
May (or may not) buffer the probe byte in some implementations

If window is non-zero:

Accepts the data byte (if one was sent)
Sends ACK with next expected sequence number
ACK contains current window value (> 0)
Sender can now resume transmission

Zero Window Probe Handling
Receiver State	Action	ACK Contents	Sender's Next Step
Window still zero	Cannot accept probe	Same ACK num, window=0	Restart persist, probe later
Window now > 0	Accept probe byte	ACK num+1, window=N	Resume sending
Window > 0, different data expected	Out of order	ACK current, window=N	Sender may need to retransmit

ZWP and Retransmission

zero_window_probe.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
# Zero Window Probe Generation
 
class ZeroWindowProbeGenerator:
                            """
    Generates zero window probes when persist timer fires.
    """
    
    def __init__(self, send_buffer: bytes, snd_una: int, snd_nxt: int):
        self.send_buffer = send_buffer
        self.snd_una = snd_una  # Oldest unacked sequence
        self.snd_nxt = snd_nxt  # Next sequence to send
    
    def generate_probe(self) -> dict:
        """
        Generate a zero window probe segment.
        
        The probe contains one byte starting at SND.NXT.
        This byte will be acknowledged if the receiver can accept it.
        """
        # Get the next byte from send buffer
        buffer_offset = self.snd_nxt - self.snd_una
        
        if buffer_offset < len(self.send_buffer):
            probe_byte = self.send_buffer[buffer_offset:buffer_offset + 1]
        else:
            # No data to probe with (unusual but possible)
            probe_byte = b''
        
        return {
            'sequence_number': self.snd_nxt,
            'data': probe_byte,
            'data_length': len(probe_byte),
            'flags': {'ACK': True},  # Probe also ACKs any receiver data
            'is_zwp': True
        }
    
    def handle_probe_response(self, ack_num: int, window: int) -> dict:
        """
        Process the response to a zero window probe.
        """
        if window > 0:
            return {
                'action': 'resume_sending',
                'window_opened': True,
                'available_window': window,
                'new_ack': ack_num > self.snd_una
            }
        else:
            return {
                'action': 'continue_probing',
                'window_opened': False,
                'message': 'Receiver still has zero window'
            }

Silly Window Syndrome Avoidance

The zero window handling reveals a related problem: Silly Window Syndrome (SWS). Understanding SWS is essential because it affects when receivers advertise non-zero windows.

The Silly Window Problem

Imagine this scenario:

Receiver's buffer is full → advertises window = 0
Application reads 50 bytes → buffer has 50 bytes free
Receiver sends window update: window = 50
Sender sends 50 bytes (tiny segment!)
Buffer full again → window = 0
Application reads 30 bytes → window = 30
Send 30 bytes... repeat forever

The result: countless tiny segments, each carrying at least 40 bytes of TCP/IP headers (20 bytes of IPv4 header plus 20 bytes of TCP header) for a few bytes of payload. Network efficiency collapses.
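
To make the cost concrete, the sketch below computes what fraction of each packet is payload. It assumes 40 bytes of headers per segment (a 20-byte IPv4 header plus a 20-byte TCP header, no options); the function name is illustrative.

```python
HEADER_BYTES = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header, no options


def payload_efficiency(payload_bytes: int, header_bytes: int = HEADER_BYTES) -> float:
    """Return the fraction of bytes on the wire that are application payload."""
    return payload_bytes / (payload_bytes + header_bytes)


for size in (30, 50, 1460):
    print(f"{size:>5} byte payload: {payload_efficiency(size):.1%} efficient")
```

Under these assumptions a 30-byte segment is about 42.9% payload and a 50-byte segment about 55.6%, while a full 1460-byte segment reaches about 97.3%.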

The Name


Clark's Algorithm (Receiver-Side SWS Avoidance)

David Clark proposed a simple rule for receivers:

"Do not advertise a small window. Wait until the window is at least:

One Maximum Segment Size (MSS), or
Half the receive buffer

(whichever is smaller) before advertising a non-zero window after advertising zero."

This forces the receiver to hold off sending window updates until a "meaningful" amount of space is available—preventing the cascade of tiny windows.

clarks_algorithm.py
# Clark's Algorithm: Receiver-Side SWS Avoidance
 
class ClarksSWSAvoidance:
    """
    Implements Clark's algorithm for avoiding silly window syndrome.
    
    The receiver should not advertise small windows. Wait until
    sufficient space is available to make transmission worthwhile.
    """
    
    def __init__(self, rcv_buffer_size: int, mss: int = 1460):
        self.rcv_buffer_size = rcv_buffer_size
        self.mss = mss
        self.last_advertised_window = rcv_buffer_size
        self.was_zero_window = False
    
    def calculate_advertisable_window(self, actual_free_space: int) -> int:
        """
        Calculate the window to advertise, applying Clark's algorithm.
        
        If coming out of zero window, don't advertise until
        sufficient space is available.
        """
        # Threshold for advertising after zero window
        threshold = min(self.mss, self.rcv_buffer_size // 2)
        
        if self.was_zero_window:
            # We previously advertised zero
            if actual_free_space >= threshold:
                # Enough space now; advertise the full available space
                self.was_zero_window = False
                return actual_free_space
            else:
                # Not enough space yet; keep advertising zero
                # (This prevents SWS)
                return 0
        else:
            # Normal operation; advertise actual space
            # But still avoid going to tiny windows
            if actual_free_space < threshold:
                self.was_zero_window = True
                return 0
            return actual_free_space
    
    def on_window_advertised(self, window: int):
        """Track what was advertised."""
        self.last_advertised_window = window
        if window == 0:
            self.was_zero_window = True
 
 
# Example
sws = ClarksSWSAvoidance(rcv_buffer_size=65536, mss=1460)
 
# Buffer full, zero window
print(sws.calculate_advertisable_window(0))  # 0
 
# 50 bytes freed - NOT enough
print(sws.calculate_advertisable_window(50))  # Still 0 (Clark's algorithm)
 
# 2000 bytes freed - enough!
print(sws.calculate_advertisable_window(2000))  # 2000 (full available)

Sender-Side SWS Avoidance: Nagle's Algorithm

Complete Zero Window Protocol Flow

Let's trace through a complete zero window scenario from start to resolution, integrating all the mechanisms we've discussed.


Zero Window Flow Summary

•Normal transfer: Data flows, windows shrink as buffer fills
•Buffer fills: Last segment fills buffer; receiver ACKs with window=0
•Sender stops: Starts persist timer (initial timeout ~1-5 seconds)
•Persist timer fires: Sender transmits zero window probe (1 byte)
•Receiver responds: ACK with current window (still 0 if buffer still full)
•Sender backs off: Doubles persist timeout, waits again
•Application reads: Buffer space becomes available
•Clark's check: If space >= min(MSS, 50% buffer), advertise
•Window update: Receiver sends ACK with non-zero window
•Sender resumes: Cancels persist timer, transmits data
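
The "sender backs off" step can be sketched as a generator of successive persist intervals. The 5-second starting value and 60-second cap are illustrative assumptions, not values taken from a specific TCP implementation, and `persist_intervals` is a hypothetical name.

```python
from itertools import islice
from typing import Iterator


def persist_intervals(initial: float = 5.0, maximum: float = 60.0) -> Iterator[float]:
    """Yield successive persist timer intervals in seconds.

    Each interval doubles the previous one until it reaches `maximum`.
    Both defaults are assumptions chosen for illustration.
    """
    interval = initial
    while True:
        yield interval
        interval = min(interval * 2, maximum)


# First six waits between probes while the window stays at zero.
print(list(islice(persist_intervals(), 6)))  # [5.0, 10.0, 20.0, 40.0, 60.0, 60.0]
```

Capping the interval keeps the sender probing at a steady, low rate, so a reopened window is still discovered even if the receiver's window update is lost.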

Parallel Window Updates

Monitoring and Diagnostics

Zero window conditions are visible in TCP diagnostics and can indicate application or system problems.

What Zero Windows Indicate

Frequent or prolonged zero windows suggest:

Application bottleneck: Receiver application can't keep up
I/O blocking: Application blocked on disk or database
CPU starvation: Application not getting enough CPU time
Memory pressure: System reducing buffer sizes
Lock contention: Application threads blocked on locks

Diagnostic Tools

netstat / ss (Linux)

ss -ti state established | grep -E 'wscale|rcv_space'
# Shows window scaling and receive buffer usage

netstat -s | grep -i 'prune\|zero'
# Shows pruned packets (memory pressure) and zero window events
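
The counters behind `netstat -s` can also be read programmatically. The sketch below parses the `TcpExt` section of `/proc/net/netstat` and picks out three zero window counters. It assumes a Linux kernel that exposes `TCPFromZeroWindowAdv`, `TCPToZeroWindowAdv`, and `TCPWantZeroWindowAdv`; the function names and the sample text are made up for illustration.

```python
ZERO_WINDOW_COUNTERS = (
    "TCPFromZeroWindowAdv",   # window reopened after a zero advertisement
    "TCPToZeroWindowAdv",     # window advertisement dropped to zero
    "TCPWantZeroWindowAdv",   # stack wanted to advertise zero
)


def parse_tcpext(text: str) -> dict[str, int]:
    """Parse the TcpExt section of /proc/net/netstat into a name -> value map.

    Each section is stored as a line of counter names followed by a line
    of values, both prefixed with the section name.
    """
    lines = [line.split() for line in text.splitlines() if line.startswith("TcpExt:")]
    counters: dict[str, int] = {}
    # Lines come in (names, values) pairs.
    for names, values in zip(lines[0::2], lines[1::2]):
        counters.update(zip(names[1:], map(int, values[1:])))
    return counters


def zero_window_counters(text: str) -> dict[str, int]:
    """Return only the zero window counters that are present."""
    counters = parse_tcpext(text)
    return {name: counters[name] for name in ZERO_WINDOW_COUNTERS if name in counters}


# Made-up sample in the same layout as /proc/net/netstat.
SAMPLE = (
    "TcpExt: SyncookiesSent TCPFromZeroWindowAdv TCPToZeroWindowAdv TCPWantZeroWindowAdv\n"
    "TcpExt: 0 12 12 340\n"
    "IpExt: InNoRoutes InTruncatedPkts\n"
    "IpExt: 0 0\n"
)
print(zero_window_counters(SAMPLE))
```

On a live system, pass the contents of `/proc/net/netstat` instead of `SAMPLE`. Sampling the counters twice and comparing the values shows how often zero windows occurred in between.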

Wireshark

Filter: tcp.analysis.zero_window
# Shows segments with zero window advertisement

Filter: tcp.analysis.zero_window_probe
# Shows zero window probe segments

Zero Window Diagnostic Interpretation
Observation	Likely Cause	Investigation	Resolution
Occasional zero window	Normal burst handling	May be fine	Monitor; optimize if persistent
Frequent zero window	App can't keep up	Profile receiver app	Optimize app or add resources
Prolonged zero window	App blocked/hung	Check app state	Fix blocking issue
Zero window + high CPU	CPU bottleneck	Profile app	Optimize or scale
Zero window + low CPU	I/O or lock blocking	Check I/O and locks	Fix blocking
System-wide zero windows	OS memory pressure	Check tcp_mem thresholds	Increase memory or limits
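
Telling "frequent" from "prolonged" in the table above comes down to counting zero window episodes and measuring how long each lasts. The sketch below does this for a time-ordered list of (timestamp, advertised window) samples, such as values exported from a packet capture. The function name and the trace data are hypothetical.

```python
from typing import Iterable


def zero_window_episodes(samples: Iterable[tuple[float, int]]) -> list[float]:
    """Return the duration in seconds of each zero window episode.

    An episode starts at the first sample with window == 0 and ends at
    the next sample with window > 0. An episode still open at the end of
    the trace is ignored because its duration is unknown.
    """
    durations: list[float] = []
    started_at = None
    for timestamp, window in samples:
        if window == 0 and started_at is None:
            started_at = timestamp
        elif window > 0 and started_at is not None:
            durations.append(timestamp - started_at)
            started_at = None
    return durations


# Made-up trace: two episodes, lasting 2.0 s and 0.25 s.
trace = [(0.0, 8192), (1.0, 0), (1.5, 0), (3.0, 4096), (4.0, 0), (4.25, 2048)]
episodes = zero_window_episodes(trace)
print(len(episodes), max(episodes))  # 2 2.0
```

Many short episodes point toward a receiver that is merely slow, while a few long ones point toward an application that is blocked or hung.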

monitor_zero_window.sh
#!/bin/bash
# Monitor zero window conditions on Linux
 
echo "=== TCP Memory Status ==="
grep TCP /proc/net/sockstat
 
echo ""
echo "=== TCP Memory Thresholds ==="
cat /proc/sys/net/ipv4/tcp_mem
echo "(low / pressure / high in pages)"
 
echo ""
echo "=== Current TCP Stats ==="
netstat -s | grep -iE 'prune|collapse|zero|rcv|memory'
 
echo ""
echo "=== Connections with Small Receive Windows ==="
# Match rcv_space values of one to three digits, including at end of line
ss -ti state established | grep -E 'rcv_space:[0-9]{1,3}([^0-9]|$)'
 
echo ""
echo "=== Active Zero Window Connections ==="
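# Matches window fields such as snd_wnd:0; whether ss prints these
# fields depends on the iproute2 and kernel versions in use
ss -ti state established | grep 'wnd:0'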

False Positive: Paused Applications

Summary: Zero Window

We have examined the zero window condition—when the receiver cannot accept more data—and the mechanisms that handle it. Let us consolidate the key insights:

Key Takeaways

•Zero window is normal flow control: It occurs when the receive buffer is full and indicates that flow control is working
•Zero window can cause deadlock: If a window update is lost, both endpoints wait forever
•Persist timer breaks deadlock: The sender probes the receiver periodically to discover when the window reopens
•Zero window probes elicit ACKs: One-byte segments force the receiver to respond with its current window
•Exponential backoff prevents overload: Persist timer intervals increase to avoid overwhelming a slow receiver
•Clark's algorithm prevents SWS: The receiver waits for sufficient space before advertising a non-zero window
•Zero windows are diagnosable: Tools like ss, netstat, and Wireshark reveal zero window conditions
