From the sender's viewpoint, TCP communication presents a deceptively simple challenge: transmit a stream of bytes as quickly as possible while respecting two constraints—don't overflow the receiver's buffer, and don't overwhelm the network. The send window is the sender's tool for managing the first constraint.
The send window encapsulates everything the sender knows about what it's permitted to transmit. It tracks which bytes have been acknowledged, which are in flight, which are ready to send, and which must wait. It incorporates feedback from the receiver's window advertisements. And it interacts with congestion control to determine actual transmission behavior.
This page examines the send window mechanism in exhaustive detail. We'll explore the data structures that implement it, the algorithms that manage it, and the edge cases that challenge implementations.
By the end of this page, you will understand: how the sender maintains its window state variables, the precise conditions under which data can be transmitted, how acknowledgments advance the window, the interaction between send window and retransmission, and common implementation pitfalls that cause performance problems.
TCP, as specified in RFC 793 and subsequent RFCs, defines specific state variables that the sender maintains. Understanding these variables is essential for understanding send window behavior.
The Send Sequence Space:
RFC 793 defines the send sequence space with these critical variables:
| Variable | Full Name | Description |
|---|---|---|
| SND.UNA | Send Unacknowledged | The oldest sequence number that has been sent but not yet acknowledged. This is the left edge of the send window. |
| SND.NXT | Send Next | The next sequence number to use for new data. Increments as data is transmitted. |
| SND.WND | Send Window | The current window size advertised by the receiver. Represents how many bytes beyond SND.UNA the sender may transmit. |
| SND.UP | Send Urgent Pointer | Points to the sequence number of the last urgent data byte (used with URG flag). |
| SND.WL1 | Segment Sequence for Window Update | The sequence number of the segment that last updated SND.WND. Used to filter stale updates. |
| SND.WL2 | ACK Number for Window Update | The acknowledgment number of the segment that last updated SND.WND. |
| ISS | Initial Send Sequence Number | The first sequence number used by this connection. Chosen randomly for security. |
Relationships and invariants:
These variables maintain specific relationships that implementations must preserve:
ISS ≤ SND.UNA ≤ SND.NXT ≤ SND.UNA + SND.WND
SND.UNA is always ≤ SND.NXT (data cannot be acknowledged before it has been sent)
SND.NXT is always ≤ SND.UNA + SND.WND (you can't send beyond the window)
SND.UNA advances toward SND.NXT as ACKs arrive
SND.NXT advances toward the window's right edge as data is sent
Derived quantities:
Several useful quantities derive from these variables:
| Quantity | Calculation | Meaning |
|---|---|---|
| Bytes in flight | SND.NXT - SND.UNA | Data sent but not acknowledged; at risk during loss |
| Usable window | SND.UNA + SND.WND - SND.NXT | Bytes that can be sent immediately |
| Window right edge | SND.UNA + SND.WND | The highest sequence number permitted by receiver |
| Window fullness | (SND.NXT - SND.UNA) / SND.WND | Fraction of window currently in flight (0 to 1) |
When deciding whether to send data, the sender calculates the usable window. If usable window > 0, more data can be sent. If usable window = 0, the sender is "window-limited" and must wait for ACKs to advance SND.UNA or for a window update to enlarge SND.WND.
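The derived quantities in the table can be sketched as a small Python class. This is an illustrative model, not code from any real TCP stack: the class name `SendState` and its method names are assumptions, and arithmetic is done modulo 2^32 so the results stay correct when sequence numbers wrap.

```python
from dataclasses import dataclass

MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class SendState:
    """Hypothetical holder for the three variables used in the table."""
    snd_una: int  # oldest unacknowledged sequence number
    snd_nxt: int  # next sequence number to send
    snd_wnd: int  # receiver-advertised window, in bytes

    def bytes_in_flight(self) -> int:
        return (self.snd_nxt - self.snd_una) % MOD

    def right_edge(self) -> int:
        return (self.snd_una + self.snd_wnd) % MOD

    def usable_window(self) -> int:
        # Clamped at zero in case the advertised window is smaller than
        # the amount of data already in flight.
        return max(0, self.snd_wnd - self.bytes_in_flight())


state = SendState(snd_una=1000, snd_nxt=5000, snd_wnd=8000)
print(state.bytes_in_flight(), state.usable_window(), state.right_edge())
# 4000 4000 9000
```

With 4000 bytes in flight and an 8000-byte window, 4000 more bytes may be sent before the sender becomes window-limited.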
The send window operates over the send buffer—a region of memory that holds data from the application until TCP has successfully delivered and acknowledged it. Understanding how the send buffer relates to the window is crucial.
Buffer organization:
The send buffer is logically organized into regions that correspond to the send sequence space:
Send Buffer Memory Layout:

┌─────────────────────────────────────────────────────────────────────────────┐
│                                 SEND BUFFER                                 │
├──────────────────┬────────────────────────┬─────────────────────────────────┤
│ [FREED SPACE]    │ [RETAINED DATA]        │ [AVAILABLE SPACE]               │
│ (acknowledged    │ (sent or ready)        │ (empty, accepting writes)       │
│ and released)    │                        │                                 │
├──────────────────┼────────────────────────┼─────────────────────────────────┤
│ Recycled for     │ Cannot be freed        │ Application can write()         │
│ new app writes   │ until ACKed            │ new data here                   │
└──────────────────┴────────────────────────┴─────────────────────────────────┘
                   ^                        ^
                   │                        │
                SND.UNA           Buffer write position

The "Retained Data" region subdivides further:

┌──────────────────────────┬────────────────────────┐
│ SENT, UNACKNOWLEDGED     │ NOT YET SENT           │
│ (bytes SND.UNA to        │ (bytes SND.NXT to      │
│ SND.NXT - 1)             │ write position - 1)    │
├──────────────────────────┼────────────────────────┤
│ Must retain for          │ Ready to transmit      │
│ retransmission           │ when window allows     │
└──────────────────────────┴────────────────────────┘
^                          ^
│                          │
SND.UNA                    SND.NXT

Buffer size vs. window size:
The send buffer size and window size are related but distinct concepts:
Send buffer size: The total memory allocated for outbound data. Configured by the application or OS. Determines how much data the application can write before write() blocks.
Send window (SND.WND): The receiver's advertised capacity. Determines how much data TCP can have "in flight" at once.
The send buffer must be at least as large as the send window to allow the window to be fully utilized. If buffer_size < SND.WND, the sender may be artificially limited even when the receiver is willing to accept more.
Best practice:
Send Buffer Size ≥ Expected Maximum SND.WND + Headroom for application writes
Undersized send buffers are a common cause of poor performance. Even if the receiver advertises a large window, the sender can't fill it if the send buffer is too small. On high-BDP paths, both send and receive buffers should be sized to at least the bandwidth-delay product.
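As a rough worked example of the sizing rule, the bandwidth-delay product can be computed directly. The function name `bdp_bytes` and the link figures are illustrative assumptions, not measurements.

```python
def bdp_bytes(bandwidth_bits_per_sec: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: bits/s * seconds / 8."""
    return bandwidth_bits_per_sec * rtt_ms // (8 * 1000)


# Hypothetical path: 1 Gbit/s with a 50 ms round-trip time.
print(bdp_bytes(1_000_000_000, 50))  # 6250000
```

On such a path, a send buffer much smaller than about 6.25 MB would cap the data in flight below what the path can carry, regardless of the window the receiver advertises.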
Buffer operations:
The send buffer supports three categories of operations:
Application writes: Data enters the buffer from the application. If the buffer is full, the write blocks (or returns an error for non-blocking sockets). This adds to the "not yet sent" region.
TCP transmission: Data in the "not yet sent" region moves to "sent, unacknowledged" when TCP transmits it. The SND.NXT pointer advances.
ACK processing: When acknowledgments arrive, data moves from "sent, unacknowledged" to freed space. The buffer space is recycled for new application writes.
This creates a continuous flow: application in → TCP out → ACK frees → more application in.
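The three buffer operations can be modelled with a toy class. This is a minimal sketch under simplifying assumptions: the class `SendBuffer` and its methods are hypothetical, sequence numbers are plain integers with no wraparound, and a full buffer accepts a partial write instead of blocking.

```python
class SendBuffer:
    """Toy send buffer holding bytes from SND.UNA up to the write position."""

    def __init__(self, capacity: int, iss: int = 0):
        self.capacity = capacity
        self.data = bytearray()
        self.snd_una = iss
        self.snd_nxt = iss

    def write(self, payload: bytes) -> int:
        """Application write: accept what fits, return bytes accepted."""
        room = self.capacity - len(self.data)
        accepted = payload[:room]
        self.data += accepted
        return len(accepted)

    def transmit(self, max_bytes: int) -> bytes:
        """TCP transmission: unsent bytes become sent, unacknowledged."""
        start = self.snd_nxt - self.snd_una
        segment = bytes(self.data[start:start + max_bytes])
        self.snd_nxt += len(segment)
        return segment

    def ack(self, ack_num: int) -> int:
        """ACK processing: free acknowledged bytes, return how many."""
        if not (self.snd_una < ack_num <= self.snd_nxt):
            return 0
        freed = ack_num - self.snd_una
        del self.data[:freed]
        self.snd_una = ack_num
        return freed


buf = SendBuffer(capacity=8)
print(buf.write(b"0123456789"))  # 8: only eight bytes fit
print(buf.transmit(5))           # b'01234'
print(buf.ack(3))                # 3: three bytes freed
print(buf.write(b"89"))          # 2: freed space is reusable
```

The last call succeeds only because the ACK released space, which is the continuous flow described above.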
Given the send window state, when does TCP actually transmit data? The decision is more nuanced than simply "send if usable window > 0."
The basic transmission condition:
Can transmit if: SND.NXT < SND.UNA + SND.WND
AND data exists at SND.NXT in send buffer
AND (cwnd constraint also satisfied, but that's congestion control)
Segment sizing:
Once TCP decides to transmit, it must determine how many bytes to send. The segment size considers:
| Factor | Constraint | Typical Value |
|---|---|---|
| Maximum Segment Size (MSS) | Negotiated during handshake; based on MTU | 1460 bytes (Ethernet) |
| Usable window | Cannot exceed receiver's capacity | Variable (0 to SND.WND) |
| Congestion window | Cannot exceed inferred network capacity | Variable (slow start/AIMD) |
| Available data | Cannot send more than application provided | Variable |
| Nagle's algorithm | May delay small segments to coalesce | MSS threshold |
The actual segment size is:
Segment Size = min(MSS, Usable Window, cwnd - bytes_in_flight, Available Data)
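The formula translates directly into code. The sketch below assumes all quantities are in bytes; the function name `next_segment_size` is illustrative, and Nagle's algorithm is deliberately left out.

```python
def next_segment_size(mss: int, usable_window: int, cwnd: int,
                      bytes_in_flight: int, available_data: int) -> int:
    """Bytes to place in the next segment; 0 means nothing may be sent."""
    congestion_room = max(0, cwnd - bytes_in_flight)
    return max(0, min(mss, usable_window, congestion_room, available_data))


print(next_segment_size(1460, 4000, 14600, 4000, 10000))  # 1460: MSS-limited
print(next_segment_size(1460, 500, 14600, 4000, 10000))   # 500: window-limited
print(next_segment_size(1460, 4000, 4000, 4000, 10000))   # 0: cwnd is full
```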
Nagle's algorithm interaction:
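Nagle's algorithm (RFC 896) adds a timing constraint to improve efficiency: while previously sent data remains unacknowledged, the sender holds back segments smaller than the MSS until either the outstanding data is acknowledged or a full-sized segment accumulates.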
This prevents "silly window syndrome" from the sender side. We'll cover this in detail in a later module.
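In simplified form, the Nagle decision reduces to one predicate: a full-sized segment may always go out, and a small one only when nothing is outstanding. The function name and parameters below are assumptions made for illustration, and real stacks add further conditions that this sketch omits.

```python
def nagle_allows_send(segment_len: int, mss: int, bytes_in_flight: int) -> bool:
    """Simplified Nagle test for whether a segment may be sent now."""
    if segment_len >= mss:
        return True               # full-sized segments are never delayed
    return bytes_in_flight == 0   # small segment: only when the pipe is idle


print(nagle_allows_send(1460, 1460, 3000))  # True
print(nagle_allows_send(10, 1460, 3000))    # False: wait for ACK or more data
print(nagle_allows_send(10, 1460, 0))       # True
```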
In practice, TCP calculates an effective window that is the minimum of the send window (receiver constraint) and congestion window (network constraint). We'll explore this in detail in a later page. For now, understand that the send window alone doesn't determine transmission—congestion control adds another layer of restriction.
Push (PSH) flag considerations:
When the application calls a "push" operation or TCP reaches the end of currently available data, it sets the PSH flag. This tells the receiver to deliver buffered data to the application immediately rather than waiting.
PSH doesn't affect send window mechanics directly, but it influences when the receiver processes and potentially ACKs data, which indirectly affects window dynamics.
When an ACK arrives, the sender must update its window state. This process is more subtle than it appears.
Validating the ACK:
First, TCP validates that the ACK is acceptable:
Valid ACK if: SND.UNA < ACK ≤ SND.NXT
Updating SND.UNA:
For a valid, non-duplicate ACK:
SND.UNA = ACK_Number
This advances the left edge of the window. All bytes from the old SND.UNA to the new SND.UNA - 1 are now considered acknowledged and can be freed from the send buffer.
Updating SND.WND:
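The ACK segment also carries a window field. However, TCP must be careful about accepting window updates because segments can arrive out of order. RFC 793 guards the update with a freshness check on the segment's sequence and acknowledgment numbers; the form below is the stricter variant found in BSD-derived stacks, which adds a third clause for a pure window update (RFC 793 itself accepts any segment with SEG.SEQ == SND.WL1 and SEG.ACK ≥ SND.WL2):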
Accept new SND.WND if:
SEG.SEQ > SND.WL1 (segment has newer sequence number)
OR (SEG.SEQ == SND.WL1 AND SEG.ACK > SND.WL2) (same seq but newer ACK)
OR (SEG.SEQ == SND.WL1 AND SEG.ACK == SND.WL2 AND SEG.WND > SND.WND)
The SND.WL1 and SND.WL2 variables record the sequence and ACK numbers of the last segment that updated the window. This prevents an old, delayed segment from overwriting a more recent window value, whether that would shrink or expand the window.
Imagine the receiver advertises window=10000, then later advertises window=5000 as data is received. If the first segment with window=10000 is delayed in the network and arrives after the window=5000 segment, accepting it would incorrectly expand the window. The WL1/WL2 mechanism prevents this.
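The reordering scenario can be replayed with a small sketch of the three-clause check. The names `WindowState`, `seq_gt` and `maybe_update_window` are hypothetical, and the sequence and ACK numbers are made up for the example.

```python
from dataclasses import dataclass


def seq_gt(a: int, b: int) -> bool:
    """True if a is after b in 32-bit sequence space."""
    return 0 < ((a - b) % 2**32) < 2**31


@dataclass
class WindowState:
    snd_wnd: int
    snd_wl1: int  # SEG.SEQ of the last segment that updated the window
    snd_wl2: int  # SEG.ACK of the last segment that updated the window


def maybe_update_window(st: WindowState, seg_seq: int, seg_ack: int,
                        seg_wnd: int) -> bool:
    """Apply the window update if the segment is not stale."""
    fresh = (
        seq_gt(seg_seq, st.snd_wl1)
        or (seg_seq == st.snd_wl1 and seq_gt(seg_ack, st.snd_wl2))
        or (seg_seq == st.snd_wl1 and seg_ack == st.snd_wl2
            and seg_wnd > st.snd_wnd)
    )
    if fresh:
        st.snd_wnd, st.snd_wl1, st.snd_wl2 = seg_wnd, seg_seq, seg_ack
    return fresh


st = WindowState(snd_wnd=8000, snd_wl1=100, snd_wl2=1000)
# The later segment (window=5000, ACK=7000) overtakes the earlier one.
print(maybe_update_window(st, seg_seq=100, seg_ack=7000, seg_wnd=5000))   # True
# The delayed segment (window=10000, ACK=2000) arrives afterwards.
print(maybe_update_window(st, seg_seq=100, seg_ack=2000, seg_wnd=10000))  # False
print(st.snd_wnd)  # 5000
```

The delayed segment is rejected because its ACK number is older than SND.WL2, so the window stays at 5000.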
procedure Process_ACK(segment):
    ack_num = segment.acknowledgment_number
    seq_num = segment.sequence_number
    win = segment.window

    // Step 1: Validate ACK
    if ack_num <= SND.UNA:
        // Duplicate ACK - may trigger fast retransmit
        handle_duplicate_ack(segment)
        return

    if ack_num > SND.NXT:
        // Invalid - ACKing data we never sent
        // This shouldn't happen; may indicate attack or bug
        drop_segment()
        return

    // Step 2: Advance SND.UNA
    bytes_acked = ack_num - SND.UNA
    SND.UNA = ack_num
    release_buffer_bytes(bytes_acked)

    // Step 3: Update retransmission timer
    if SND.UNA == SND.NXT:
        // All data acknowledged
        stop_retransmission_timer()
    else:
        restart_retransmission_timer()

    // Step 4: Update window (with staleness check)
    if seq_num > SND.WL1
       or (seq_num == SND.WL1 and ack_num > SND.WL2)
       or (seq_num == SND.WL1 and ack_num == SND.WL2 and win > SND.WND):
        SND.WND = win
        SND.WL1 = seq_num
        SND.WL2 = ack_num

    // Step 5: Trigger new transmissions if window opened
    if usable_window() > 0 and data_pending():
        schedule_transmission()

Retransmission adds complexity to send window management. When TCP detects loss (via timeout or duplicate ACKs), it must retransmit data that's still in the "sent, unacknowledged" region.
Retransmission doesn't advance SND.NXT:
A critical point: retransmitting data does not advance SND.NXT. The retransmitted bytes occupy the range [SND.UNA, SND.NXT) that has already been sent. Retransmission is re-sending that same range, not sending new data.
Before sending: SND.UNA = 1000, SND.NXT = 5000 (4000 bytes in flight)
After loss detected: SND.UNA = 1000, SND.NXT = 5000 (unchanged)
After retransmit: SND.UNA = 1000, SND.NXT = 5000 (still unchanged)
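A short sketch makes the same point in code: choosing what to retransmit only reads the window state and never writes SND.NXT. The helper name `retransmit_oldest` and the MSS value are assumptions for illustration.

```python
MOD = 1 << 32  # sequence numbers wrap at 2**32


def retransmit_oldest(snd_una: int, snd_nxt: int, mss: int):
    """Return (seq, length) of the oldest unacknowledged segment to resend."""
    in_flight = (snd_nxt - snd_una) % MOD
    return snd_una, min(mss, in_flight)


snd_una, snd_nxt = 1000, 5000
seq, length = retransmit_oldest(snd_una, snd_nxt, mss=1460)
print(seq, length, snd_nxt)  # 1000 1460 5000
```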
Go-back-N vs. Selective Retransmission:
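TCP implementations vary in how they handle retransmission: a sender without selective acknowledgments may fall back to go-back-N behavior and resend everything from SND.UNA onward, while a sender using SACK (RFC 2018) resends only the ranges the receiver has not reported receiving.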
SACK and the send window:
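With SACK, the sender maintains a more sophisticated view of which bytes have been received: alongside the cumulative ACK point SND.UNA, it records the SACKed ranges inside [SND.UNA, SND.NXT) and treats the gaps between them as candidates for retransmission.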
SACK doesn't change the fundamental window mechanics but dramatically improves retransmission efficiency.
A subtle point: SACK information is advisory, not guaranteed. The receiver is permitted to "renege" (discard SACKed data if buffer pressure requires). Therefore, the sender must retain all data from SND.UNA forward, even if SACKed. Only cumulative ACKs truly free data.
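One way to picture the sender's SACK bookkeeping is to compute the unreported gaps inside [SND.UNA, SND.NXT). The function below is a simplified sketch: the name `sack_holes` is hypothetical, sequence numbers are plain integers with no wraparound, and every gap is reported as a candidate, whereas real loss-recovery logic is more selective about what it treats as lost.

```python
def sack_holes(snd_una: int, snd_nxt: int, sack_blocks):
    """Return [start, end) ranges in [snd_una, snd_nxt) not covered by SACK."""
    holes = []
    cursor = snd_una
    for start, end in sorted(sack_blocks):
        # Clamp each block to the sent, unacknowledged range.
        start, end = max(start, snd_una), min(end, snd_nxt)
        if start >= end:
            continue
        if start > cursor:
            holes.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < snd_nxt:
        holes.append((cursor, snd_nxt))
    return holes


print(sack_holes(1000, 5000, [(2000, 3000), (4000, 4500)]))
# [(1000, 2000), (3000, 4000), (4500, 5000)]
```

Consistent with the note on reneging, nothing here frees buffer space: the SACKed ranges are only skipped during retransmission, and SND.UNA is untouched.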
A challenging situation arises when the receiver advertises a window of zero: SND.WND = 0. The sender cannot transmit any data. But how does the sender know when the receiver's buffer has freed up?
The zero window problem:
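The receiver eventually frees buffer space and sends an ACK advertising a nonzero window, but pure ACKs are not retransmitted, so if that update is lost the sender keeps waiting for the window to open while the receiver keeps waiting for data. This is a classic lost-ACK deadlock scenario. TCP solves it with the persist timer and window probes.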
The persist timer:
When the sender receives a zero window, it starts a persist timer. When this timer expires, the sender transmits a window probe—a tiny segment (often just 1 byte) designed to elicit an ACK with the current window value.
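Window probe characteristics: a probe typically carries a single byte of data, the interval between probes backs off exponentially, and probing continues for as long as the receiver keeps answering with a zero window.
The probe solves the deadlock: the receiver must answer each probe with an ACK carrying its current window, so even if the original window update was lost, the sender learns that the window has reopened from the reply to a later probe.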
A persistent zero window (minutes to hours) often indicates an application problem: the receiving application isn't reading data. This can happen with hung processes, deadlocked threads, or applications that opened a connection but never process received data. Monitoring for extended zero-window conditions is important for diagnosing application issues.
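The persist timer described above is usually driven by exponential backoff with a ceiling. The sketch below shows only the schedule: the function name is hypothetical, and the 5 second starting interval and 60 second cap are illustrative assumptions, not values taken from this page.

```python
def persist_intervals(initial: float = 5.0, maximum: float = 60.0,
                      probes: int = 6):
    """Seconds to wait before each successive window probe."""
    intervals = []
    delay = initial
    for _ in range(probes):
        intervals.append(delay)
        delay = min(delay * 2, maximum)  # double, but never exceed the cap
    return intervals


print(persist_intervals())  # [5.0, 10.0, 20.0, 40.0, 60.0, 60.0]
```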
Implementing the send window correctly requires attention to several subtle issues that can cause bugs or performance problems.
Sequence number wraparound:
TCP sequence numbers are 32-bit unsigned integers. On a fast connection, they wrap around (from 2³²-1 back to 0). All sequence number comparisons must use modular arithmetic:
// WRONG: Simple comparison fails at wraparound
if (seq1 < seq2) { ... }
// RIGHT: Signed comparison handles wraparound
if ((int32_t)(seq1 - seq2) < 0) { ... }
This applies to all window calculations: SND.UNA, SND.NXT, window boundaries, etc.
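The same comparison can be tried out in Python, which has no fixed-width integers, by masking the difference to 32 bits and testing the sign bit. The helper name `seq_lt` is an assumption for this sketch.

```python
def seq_lt(a: int, b: int) -> bool:
    """True if sequence number a comes before b in 32-bit sequence space."""
    diff = (a - b) & 0xFFFFFFFF
    return diff >= 0x80000000  # top bit set means negative as an int32


before_wrap, after_wrap = 0xFFFFFFF0, 0x00000010
print(before_wrap < after_wrap)         # False: the naive comparison is wrong
print(seq_lt(before_wrap, after_wrap))  # True: 0xFFFFFFF0 precedes 0x10
print(seq_lt(after_wrap, before_wrap))  # False
```

The comparison is only meaningful when the two values are less than 2^31 apart.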
Buffer management efficiency:
The send buffer is a critical data structure. Efficient implementations:
Interaction with the network stack:
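The send window is one component of a larger system. It must coordinate with congestion control (cwnd), the retransmission and persist timers, Nagle's algorithm, and the send buffer that the application writes into.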
When debugging throughput problems, packet captures are invaluable. Look at the Window field in received ACKs to see receiver-advertised window. Compare SND.NXT - SND.UNA (bytes in flight) to the advertised window. If in-flight bytes equal advertised window, you're window-limited—the bottleneck is receiver capacity, not network.
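The comparison suggested here can be automated over samples taken from a capture. This is a rough sketch: the function name, the sample values, and the choice of one MSS (1460 bytes) as the "effectively full" threshold are all assumptions.

```python
def window_limited_fraction(samples, slack: int = 1460) -> float:
    """Fraction of (bytes_in_flight, advertised_window) samples that leave
    less than `slack` bytes of room in the advertised window."""
    if not samples:
        return 0.0
    limited = sum(1 for in_flight, window in samples
                  if window - in_flight < slack)
    return limited / len(samples)


# Hypothetical samples read from a capture.
samples = [(64500, 65535), (65535, 65535), (20000, 65535), (65000, 65535)]
print(window_limited_fraction(samples))  # 0.75
```

A fraction close to 1 suggests the receiver's advertised window is the bottleneck; a low fraction points toward congestion control or the application instead.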
We've explored the send window in depth—the sender's mechanism for managing what data it can transmit. Let's consolidate the key concepts:
Usable window: SND.UNA + SND.WND - SND.NXT, the bytes that can be sent immediately.
What's next:
We've examined the sender's perspective. Next, we'll explore the receive window—the receiver's side of the sliding window mechanism. We'll see how the receiver advertises its capacity, manages its buffer, and generates the window values that govern sender behavior.
You now understand the send window in depth—the state variables, buffer management, transmission decisions, ACK processing, and edge cases. This knowledge is essential for understanding TCP performance and debugging throughput issues. Next, we'll examine the complementary receive window.