While the sender pushes data into the network, the receiver faces a different challenge: accepting that data reliably, buffering it appropriately, and delivering it to the application in order—all while providing feedback that prevents the sender from overwhelming its capacity.
The receive window is the receiver's tool for this task. It represents the receiver's current capacity to accept data and is advertised to the sender in every ACK. When the application reads slowly, the window shrinks. When it catches up, the window expands. This dynamic feedback creates a self-regulating system that adapts to application behavior.
This page examines receive window mechanics in exhaustive detail: how the receiver manages its buffer, calculates its available capacity, advertises the window, and handles edge cases like out-of-order arrival and zero window conditions.
By the end of this page, you will understand: how the receiver maintains its buffer and window state, the calculation of the advertised window value, how out-of-order segments are handled, the generation of acknowledgments, and how receiver behavior affects sender throughput.
Just as the sender maintains state for transmission, the receiver maintains state for reception and window advertisement. RFC 793 specifies these variables:
The Receive Sequence Space:
| Variable | Full Name | Description |
|---|---|---|
| RCV.NXT | Receive Next | The next expected sequence number. Bytes before this have been received and acknowledged. |
| RCV.WND | Receive Window | The window size the receiver is willing to accept. Advertised to the sender. |
| RCV.UP | Receive Urgent Pointer | Points to urgent data if URG flag is set. |
| IRS | Initial Receive Sequence Number | The peer's initial sequence number, taken from its SYN (the sender's ISS). The first data byte is IRS + 1. |
Understanding RCV.NXT:
RCV.NXT is the cornerstone of receive-side TCP. It represents:
The receive window boundaries:
Acceptable sequence numbers: [RCV.NXT, RCV.NXT + RCV.WND)
A critical point: RCV.NXT advances only when contiguous data is received. If bytes 0-999 are received, then bytes 2000-2999 arrive (out of order), RCV.NXT stays at 1000. Only when bytes 1000-1999 arrive (filling the gap) does RCV.NXT jump to 3000. This is why a cumulative ACK carries the next expected sequence number, which is one past the highest contiguously received byte.
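The acceptable range above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `in_receive_window` is hypothetical, and it assumes the standard 32-bit TCP sequence space, so the comparison is done modulo 2^32 to survive wraparound.

```python
SEQ_MOD = 2**32  # TCP sequence numbers are 32 bits wide and wrap around


def in_receive_window(seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """Return True if seq lies in [RCV.NXT, RCV.NXT + RCV.WND), modulo 2**32."""
    # Distance from the left edge, measured forward around the sequence space.
    offset = (seq - rcv_nxt) % SEQ_MOD
    return offset < rcv_wnd


# RCV.NXT = 1000 and RCV.WND = 5000 give the acceptable range [1000, 6000).
print(in_receive_window(1000, 1000, 5000))  # left edge
print(in_receive_window(6000, 1000, 5000))  # right edge is exclusive
```

Measuring the offset forward from RCV.NXT avoids separate comparisons against both edges, which would break when the window straddles the wrap point.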
Receive Buffer State:

```
Scenario: Window = 5000, received bytes 0-999 and 2000-3999 (out of order)

0          1000        2000             4000             6000
|----------|-----------|----------------|----------------|-------->
[DELIVERED][   GAP    ][ OUT OF ORDER  ][  FREE WINDOW  ][BEYOND]
   0-999    1000-1999      2000-3999        4000-5999     6000+
           ^                                             ^
           RCV.NXT = 1000                                RCV.NXT + RCV.WND = 6000

GAP          = hole (not yet received)
OUT OF ORDER = buffered, waiting for the gap to fill

ACK value = RCV.NXT = 1000 (indicating "give me byte 1000")
Window advertised = RCV.WND (5000 in this example)

When bytes 1000-1999 arrive:
- Gap filled, RCV.NXT advances to 4000 (past the buffered 2000-3999)
- Bytes 0-3999 are now contiguous; bytes 1000-3999 can be delivered to the application
- ACK value becomes 4000
```

The receive buffer is the memory region that holds incoming data between network arrival and application consumption. Its management is central to receive window behavior.
Buffer organization:
The receive buffer is logically organized into regions:
Receive Buffer Memory Layout:

```
┌─────────────────────────────────────────────────────────────────────┐
│                           RECEIVE BUFFER                            │
├──────────────────┬──────────────────────────────────┬───────────────┤
│ [FREE SPACE]     │ [PENDING DATA]                   │ [AVAILABLE]   │
│ (recycled from   │ (received, not yet read by app)  │ (window)      │
│ app reads)       │                                  │               │
├──────────────────┼──────────────────────────────────┼───────────────┤
│ Can receive      │ Holding data for                 │ Can accept    │
│ more data        │ application                      │ new data      │
│ indirectly       │                                  │               │
└──────────────────┴──────────────────────────────────┴───────────────┘

The PENDING DATA region may contain:
┌────────────────────────────┬──────────┬──────────────────┐
│ IN-ORDER (deliverable)     │ GAP      │ OUT-OF-ORDER     │
│ (bytes up to RCV.NXT-1)    │ (hole)   │ (future bytes)   │
├────────────────────────────┼──────────┼──────────────────┤
│ Ready for app read()       │ Empty    │ Waiting for      │
│                            │          │ gap to fill      │
└────────────────────────────┴──────────┴──────────────────┘
```

Window calculation:
The receive window advertised to the sender is calculated based on available buffer space:
RCV.WND = Buffer_Size - (Bytes_Buffered_Not_Yet_Read_By_App)
More precisely:
RCV.WND = Buffer_Size - (Highest_Sequence_Received - Last_Sequence_Read_By_App)
This includes both in-order and out-of-order data. If the application reads slowly, buffered data accumulates, and RCV.WND shrinks. If the application reads quickly, buffer space frees, and RCV.WND grows.
A subtle but important point: out-of-order data occupies buffer space even though it can't be delivered yet. If a sender transmits bytes 0-999, 2000-2999, 4000-4999 (skipping 1000-1999 and 3000-3999), all 3000 received bytes consume buffer space, not just the 1000 in-order bytes. This reduces the window the receiver can advertise.
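To make the arithmetic concrete, here is a minimal sketch of the first formula (buffer size minus bytes held). The function name `advertisable_window` and the 10,000-byte buffer are assumptions for illustration; real stacks track memory in more detail than a simple byte count.

```python
def advertisable_window(buffer_size, unread_in_order, out_of_order):
    """Buffer space left after counting every byte the receiver is holding.

    unread_in_order: in-order bytes the application has not read yet.
    out_of_order:    list of (first_byte, last_byte) ranges, inclusive.
    """
    held_out_of_order = sum(last - first + 1 for first, last in out_of_order)
    held = unread_in_order + held_out_of_order
    return max(0, buffer_size - held)


# Example from the text: bytes 0-999 arrived in order (still unread),
# while 2000-2999 and 4000-4999 arrived out of order.
window = advertisable_window(
    buffer_size=10_000,  # assumed buffer size, not from the text
    unread_in_order=1000,
    out_of_order=[(2000, 2999), (4000, 4999)],
)
print(window)  # 7000: all 3000 held bytes count against the buffer
```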
Buffer size configuration:
The receive buffer size is configured at the socket level, often with OS defaults that may need adjustment for high-performance scenarios:
| Platform | Configuration Method | Typical Default | Maximum |
|---|---|---|---|
| Linux | setsockopt(SO_RCVBUF) or sysctl | 87 KB - 6 MB (auto-tuned) | 16+ MB configurable |
| Windows | setsockopt(SO_RCVBUF) or registry | 64 KB - 1 MB | 16+ MB configurable |
| macOS | setsockopt(SO_RCVBUF) or sysctl | 128 KB - 4 MB | Limited by sysctl |
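As a rough illustration of the socket-level method in the table, the following Python sketch reads and sets `SO_RCVBUF` on an unconnected TCP socket. The requested size of 256 KB is an arbitrary assumption. The value read back may differ from the request: Linux, for example, doubles the requested value for bookkeeping overhead and caps it at a system-wide limit, and explicitly setting the option there typically turns off auto-tuning for that socket.

```python
import socket

REQUESTED_BYTES = 256 * 1024  # arbitrary example size, not a recommendation

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    # Size the OS chose by default for this socket.
    default_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

    # Request a different receive buffer size.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, REQUESTED_BYTES)

    # Read back what the OS actually applied; it may not equal the request.
    effective_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

print(f"default={default_size} requested={REQUESTED_BYTES} effective={effective_size}")
```

Setting the size before the connection is established is the safer choice, since the window scale factor is negotiated during the handshake.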
Auto-tuning:
Modern operating systems implement TCP receive buffer auto-tuning:
This balances performance with memory efficiency across diverse connections.
The Internet provides no ordering guarantees. Packets can take different paths, experience different delays, and arrive out of order. The receiver must handle this gracefully.
RFC 793's original guidance:
The original TCP specification suggested that receivers could discard out-of-order segments. The sender would eventually retransmit, and hopefully the retransmission would arrive in order. This is correct but inefficient.
Modern receiver behavior:
Modern TCP implementations buffer out-of-order segments:
Reassembly data structures:
Implementations typically use one of these data structures for out-of-order segment management:
| Data Structure | Insertion | Gap Query | Coalescing | Memory |
|---|---|---|---|---|
| Linked list of ranges | O(n) | O(n) | O(1) merge | Minimal per-range |
| Red-black tree | O(log n) | O(log n) | O(log n) merge | Nodes + data |
| Segment bitmap | O(1) | O(1) | O(byte range) | High for large windows |
The linked list approach is common for connections with infrequent reordering. Trees scale better for connections with persistent reordering.
```
Initial state: RCV.NXT = 1000, no buffered data

Segment arrives: seq=2000, len=1000
  - Out of order (expected 1000, got 2000)
  - Buffer: [(2000, 2999)]
  - RCV.NXT still 1000

Segment arrives: seq=4000, len=500
  - Out of order
  - Buffer: [(2000, 2999), (4000, 4499)]
  - RCV.NXT still 1000

Segment arrives: seq=1000, len=1000
  - In order! Fills first gap
  - Delivers bytes 1000-2999 to app (including buffered 2000-2999)
  - RCV.NXT advances to 3000
  - Buffer: [(4000, 4499)]

Segment arrives: seq=3000, len=1500
  - In order (starts at RCV.NXT)
  - Covers the buffered range (4000, 4499)
  - Delivers bytes 3000-4499 to app
  - RCV.NXT advances to 4500
  - Buffer: [] (empty)
```

The Selective Acknowledgment (SACK) option allows the receiver to report which out-of-order blocks it has received. This helps the sender retransmit only the missing data. Each SACK block is a (start, end) sequence number range. The receiver can report up to 4 SACK blocks per ACK (limited by TCP options space).
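The reassembly walkthrough can be expressed as a short sketch using the sorted-list-of-ranges approach from the table. The class name `Reassembler` and its interface are hypothetical; the sketch tracks only sequence ranges (not payload bytes) and ignores window checks and sequence wraparound to stay short.

```python
class Reassembler:
    """Tracks RCV.NXT and out-of-order ranges as inclusive (start, end) pairs."""

    def __init__(self, rcv_nxt):
        self.rcv_nxt = rcv_nxt
        self.ooo = []  # sorted, disjoint, non-adjacent ranges beyond rcv_nxt

    def receive(self, seq, length):
        """Process one segment; return how many bytes became deliverable."""
        start, end = seq, seq + length - 1
        if length <= 0 or end < self.rcv_nxt:
            return 0  # empty segment or a pure duplicate of delivered data
        start = max(start, self.rcv_nxt)  # trim any already-delivered prefix

        # Merge the new range with every range it overlaps or touches.
        kept = []
        for s, e in self.ooo:
            if e + 1 < start or end + 1 < s:
                kept.append((s, e))  # separated by a gap: keep unchanged
            else:
                start, end = min(start, s), max(end, e)
        kept.append((start, end))
        kept.sort()
        self.ooo = kept

        # After merging, at most one range can begin exactly at RCV.NXT.
        if self.ooo[0][0] == self.rcv_nxt:
            s, e = self.ooo.pop(0)
            self.rcv_nxt = e + 1
            return e - s + 1
        return 0


r = Reassembler(rcv_nxt=1000)
trace = [r.receive(2000, 1000), r.receive(4000, 500),
         r.receive(1000, 1000), r.receive(3000, 1500)]
print(trace, r.rcv_nxt, r.ooo)  # [0, 0, 2000, 1500] 4500 []
```

The ranges left in `ooo` at any moment are the blocks a SACK-capable receiver would have available to report.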
The receiver communicates its capacity to the sender through the Window field in the TCP header. This 16-bit field (extended by window scaling) indicates how many bytes beyond the acknowledged byte the receiver can accept.
When to advertise:
The receiver includes a window value in every segment it sends, particularly:
Computing the advertised value:
Advertised_Window = min(RCV.WND, Maximum_Representable_Value)
With window scaling, Maximum_Representable_Value = 65535 × 2^scale_factor. Without scaling, it's just 65535.
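The clamping can be sketched as follows. The helper names `window_field` and `effective_window` are hypothetical; the sketch assumes the usual shift-based encoding, in which the 16-bit field carries the window divided by 2^scale_factor, and a scale factor between 0 and 14.

```python
MAX_FIELD = 0xFFFF  # largest value the 16-bit Window field can carry


def window_field(rcv_wnd: int, scale_factor: int = 0) -> int:
    """Value placed in the 16-bit Window field for a window of rcv_wnd bytes."""
    if not 0 <= scale_factor <= 14:
        raise ValueError("scale factor must be between 0 and 14")
    return min(rcv_wnd >> scale_factor, MAX_FIELD)


def effective_window(rcv_wnd: int, scale_factor: int = 0) -> int:
    """Window in bytes that the sender reconstructs from the field."""
    return window_field(rcv_wnd, scale_factor) << scale_factor


print(effective_window(100_000))                  # 65535: clamped without scaling
print(effective_window(100_000, scale_factor=2))  # 100000: fits once scaled
```

One side effect of the shift is that the advertised value is rounded down to a multiple of 2^scale_factor, so a scaled window can be slightly smaller than RCV.WND.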
Window shrinking considerations:
A subtle issue: should the receiver ever shrink the window's right edge? Consider:
This would move the window right edge from 15000 to 8000—a dangerous backward movement. The sender may have already transmitted bytes 8000-14999, which the receiver now says it can't accept.
RFC 793 advises against shrinking:
The standard recommends the receiver not shrink the right edge of the window. Instead:
However, implementations vary, and robust senders must handle window shrinkage gracefully.
If the receiver advertised every tiny increase in buffer space, the sender might send small segments, leading to poor efficiency. Clark's solution: only advertise a window increase when either (a) the window has grown by at least one MSS, or (b) half the buffer is free. This prevents advertising small windows that would trigger inefficient small segments.
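One way to express this rule in code is shown below. It is a sketch, not a stack's actual logic: the function name `should_advertise` is hypothetical, and it uses the RFC 1122 style formulation, where an update is sent once the increase reaches the smaller of one MSS and half the buffer.

```python
def should_advertise(old_wnd: int, new_wnd: int, mss: int, buffer_size: int) -> bool:
    """Receiver-side silly window avoidance: suppress small window increases."""
    increase = new_wnd - old_wnd
    # "increase >= MSS or increase >= half the buffer" is the same test as
    # comparing against the smaller of the two thresholds.
    threshold = min(mss, buffer_size // 2)
    return increase >= threshold


# 10,000-byte buffer, 1000-byte MSS (assumed example values)
print(should_advertise(2000, 6000, mss=1000, buffer_size=10_000))  # True
print(should_advertise(2000, 2300, mss=1000, buffer_size=10_000))  # False
```

The half-buffer term matters mainly for small buffers, where waiting for a full MSS of free space could stall the connection.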
The receiver generates ACKs to inform the sender of successful reception. ACK generation strategy significantly impacts performance.
What the ACK communicates:
Immediate vs. Delayed ACKs:
RFC 1122 allows ACKs to be delayed to improve efficiency:
RFC 1122 guidelines:
The duplicate ACK mechanism:
When an out-of-order segment arrives, the receiver:
When the receive buffer is completely full—typically because the application isn't reading data—the receiver advertises a zero window. This is a normal (if undesirable) condition that TCP handles gracefully.
How zero window arises:
Recovery from zero window:
The receiver's options for window reopening:
Window update segments:
A window update is an ACK segment with:
This segment serves only to inform the sender that the window has reopened.
Why window updates can fail:
Window updates are "pure ACKs" with no data. They are not retransmitted if lost:
This is why the sender's persist timer is essential—it recovers from lost window updates.
A zero window is not a connection failure; it is normal flow control. The connection remains open, and transmission will resume when the receiver advertises space. However, prolonged zero windows (minutes or longer) often indicate application problems: the application isn't reading data, perhaps because the process is hung or stuck in a loop that never reads from the socket. Monitoring zero window duration is valuable for detecting application issues.
Receiver behavior during zero window:
The receiver should not simply ignore the sender; it must respond to probes to prevent permanent deadlock.
The ultimate purpose of the receive buffer is to deliver data to the application. This process affects window behavior and overall throughput.
Delivery semantics:
TCP delivers a stream of bytes to the application:
The read system call:
When the application calls read() (or recv()):
```
Scenario: Receive buffer = 10000 bytes, MSS = 1000 bytes

Initial state:
  - Buffer: 8000 bytes used (data waiting for app)
  - RCV.WND = 2000 bytes (available space)
  - Sender is constrained by small window

Application calls read(4000):
  - 4000 bytes moved from buffer to application
  - Buffer: 4000 bytes used
  - RCV.WND = 6000 bytes (more space!)

Should receiver send window update?
  - Old window = 2000, New window = 6000
  - Increase = 4000 bytes (4 MSS)
  - Clark's SWS solution: Update if >= 1 MSS or >= half buffer
  - 4000 >= 1000 (MSS), so yes, send update

Receiver sends: ACK=X, Window=6000
Sender receives update, can now send more data
```

Push (PSH) flag handling:
The PSH flag tells the receiver to deliver buffered data to the application promptly rather than waiting for more. When TCP receives a segment with PSH set:
Blocking vs. non-blocking read:
| Mode | Buffer Empty Behavior | Buffer Partially Full Behavior |
|---|---|---|
| Blocking | Block until data arrives | Return whatever is available |
| Non-blocking | Return immediately with error (EAGAIN) | Return whatever is available |
In both cases, TCP delivers whatever contiguous data is available. It never delivers data out of order, and it does not wait for more data when some is already ready.
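The two rows of the table can be observed with a small Python experiment. This is a sketch under one notable assumption: it uses `socket.socketpair()`, which gives a pair of connected stream sockets on the local machine rather than a real TCP connection over a network, because the read semantics shown here are the same and no server setup is needed.

```python
import select
import socket

# Connected pair of stream sockets; a local stand-in for a TCP connection.
writer, reader = socket.socketpair()
reader.setblocking(False)

# Buffer empty: a non-blocking read fails immediately instead of waiting.
try:
    reader.recv(4096)
    empty_result = "got data"
except BlockingIOError:  # raised for EAGAIN / EWOULDBLOCK
    empty_result = "would block"

# Buffer partially full: recv returns what is there, not the 4096 asked for.
writer.sendall(b"hello")
select.select([reader], [], [], 1.0)  # wait until the data is readable
partial = reader.recv(4096)

writer.close()
reader.close()
print(empty_result, partial)
```

In blocking mode the first `recv` would simply wait until data arrived, while the second call would behave the same way in both modes.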
Applications that read data slowly constrain TCP throughput. The window shrinks, the sender throttles, and throughput drops. For maximum performance: (1) read data promptly, (2) use large read buffers to reduce syscall overhead, and (3) consider using async I/O to overlap reading with processing.
We've explored the receive window in depth—the receiver's mechanism for managing incoming data and advertising capacity. Let's consolidate the key concepts:
What's next:
We've examined both send and receive windows. However, there's a critical limitation: the 16-bit window field in the TCP header limits windows to 65,535 bytes. For modern high-bandwidth, high-latency networks, this is woefully inadequate. Next, we'll explore Window Scaling—the TCP option that extends window sizes to match contemporary network demands.
You now understand the receive window mechanism: how the receiver manages its buffer, advertises capacity, handles out-of-order data, and delivers to applications. This knowledge is essential for understanding TCP performance characteristics and diagnosing throughput issues. Next, we'll explore window scaling for high-performance networks.