Imagine reading a novel where chapters arrive out of order—chapter 3, then 1, then 5, then 2. You'd need to collect all chapters, sort them, then read. Now imagine if the chapter numbers were missing—chaos would ensue.
Network packets face precisely this challenge. Sent sequentially, they may travel different paths through the internet and arrive in any order. A file's bytes might arrive scrambled: the ending before the beginning, middle sections interleaved randomly.
TCP solves this problem completely. No matter how packets arrive at the receiver—out of order, with gaps, with duplicates—TCP reassembles them into the exact original sequence. The application sees a perfectly ordered stream, as if the data had traveled through a direct pipe.
In this page, we'll explore the mechanisms that make ordered delivery possible: how TCP uses sequence numbers for reassembly, how the receive buffer manages out-of-order data, and why this ordering guarantee matters for applications.
By the end of this page, you will understand how TCP maintains ordering through sequence-based reassembly, how receive buffers hold out-of-order segments, the head-of-line blocking problem this creates, and the implications of ordered delivery for protocol design. You'll see both the benefits and costs of TCP's strict ordering guarantee.
Before examining how TCP maintains order, let's understand why ordering is essential for many applications.
Applications that require ordering:
| Application Type | Why Order Matters | Disorder Impact |
|---|---|---|
| File Transfer | Bytes must be in original positions | Corrupted, unreadable files |
| Web Pages (HTTP) | HTML must be parsed sequentially | Garbled rendering, broken pages |
| Database Queries | Query results must match row order | Wrong data, logic errors |
| Terminal Sessions (SSH) | Commands must execute in sequence | Unpredictable behavior, security risks |
| Email (SMTP) | Message headers before body | Malformed messages, delivery failure |
| TLS/SSL | Handshake steps are sequence-dependent | Cryptographic failure, no connection |
Contrast with order-tolerant applications:
Not all applications require strict ordering. Live audio/video streaming (a late frame is useless anyway), DNS lookups (each query stands alone), real-time game state updates, and metrics/telemetry reporting can all process or discard data in any order.
These applications can use UDP (which doesn't guarantee ordering) or application-layer mechanisms to handle reordering. But applications requiring strict ordering depend on TCP's guarantee.
The stream abstraction:
TCP presents data as a continuous, ordered stream—like reading from a file. The sender writes bytes 1, 2, 3, 4, 5... and the receiver reads them in exactly that order. This abstraction hides the underlying packet-based, potentially-disordered reality of the network.
TCP guarantees order, not timing. Bytes will arrive in the correct sequence, but there's no guarantee about when. Network delays, congestion, and loss recovery all affect timing. Applications needing real-time delivery often prefer protocols that sacrifice ordering for timeliness.
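To make the stream abstraction concrete, here is a minimal sketch using a local stream socket pair (`socket.socketpair()`; a TCP connection behaves the same way): two separate writes arrive as one ordered run of bytes, with the order preserved but the write boundaries erased.

```python
import socket

# Stream abstraction sketch: two writes on one end arrive as a single
# ordered byte run on the other. Order is preserved; write boundaries
# are not. (socketpair() gives a local stream socket; TCP acts the same.)
sender, receiver = socket.socketpair()
sender.sendall(b"Hel")
sender.sendall(b"lo")
sender.close()  # signal end-of-stream

data = b''
while True:
    chunk = receiver.recv(1024)  # may return any prefix of what remains
    if not chunk:
        break
    data += chunk
receiver.close()

print(data)  # b'Hello' -- exactly the bytes written, in exactly that order
```

Note that the receiver loops: a single `recv()` call may return fewer bytes than were written, but never out of order.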
Understanding TCP's ordering mechanisms requires first understanding how packets end up out of order. Multiple factors contribute:
1. Multi-path routing:
The internet is a mesh of interconnected networks. Packets from A to B might travel different routes: packet 1 over a congested primary path, packet 2 over a faster alternate path.
Packet 2 arrives first, despite being sent second. This is particularly common when routes change mid-connection or when per-packet load balancing spreads a flow's traffic across parallel links.
2. Router queue variations:
Even on the same path, packets experience different queuing delays:
Packet 1: Arrives at router when queue is deep (100 packets ahead)
→ Waits 50ms for queue to drain
Packet 2: Arrives at router when queue is shallow (10 packets ahead)
→ Waits 5ms
Packet 3: Arrives at router when queue is empty
→ Immediate forwarding
Packet 3 might arrive before Packets 1 and 2, even though sent last.
3. Retransmission-caused reordering:
When a packet is lost and retransmitted, later packets may have already arrived:
Original: Packet 1 ━━━━ [LOST]
Original: Packet 2 ━━━━━━━→ Arrives at t=100ms
Original: Packet 3 ━━━━━━━━━→ Arrives at t=150ms
Retransmit: Packet 1 ━━━━━━━━━━━→ Arrives at t=300ms
The receiver gets packets in order 2, 3, 1—but needs to deliver 1, 2, 3.
4. Parallelism in network hardware:
Modern routers and NICs use parallelism. Multiple cores/threads process packets simultaneously, and slight timing variations can reorder packets even in a single device.
Some reordering is expected in any network. Studies show that 0.1-3% of packets arrive out of order on typical internet paths. TCP is designed to handle this gracefully. The challenge is distinguishing reordering (benign) from loss (requires action). This is why fast retransmit waits for 3 duplicate ACKs rather than one.
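That distinction can be sketched as a duplicate-ACK counter on the sender side (helper name hypothetical): mild reordering produces one or two duplicate ACKs that are ignored, while a genuine loss keeps generating them until the threshold of three is reached.

```python
DUPACK_THRESHOLD = 3  # RFC 5681: three duplicate ACKs trigger fast retransmit

def check_fast_retransmit(acks):
    """Return the ACK number to fast-retransmit for, or None.

    `acks` is the sequence of cumulative ACK numbers received.
    """
    dup_count = 0
    last_ack = None
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count >= DUPACK_THRESHOLD:
                return ack      # probable loss: retransmit from here
        else:
            dup_count = 0       # new data ACKed: it was just reordering
            last_ack = ack
    return None

# Mild reordering: only two duplicate ACKs, no retransmit
print(check_fast_retransmit([1000, 1500, 1500, 1500, 2000]))  # None
# Loss: duplicate ACKs keep arriving until the threshold fires
print(check_fast_retransmit([1000, 1500, 1500, 1500, 1500]))  # 1500
```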
TCP's ordering mechanism is beautifully simple: sequence numbers define position, the receiver reassembles.
The rule:
Each byte has a unique sequence number defining its position in the stream. The receiver uses these to place data in the correct order, regardless of arrival order.
Sequence Number = Position in byte stream (starting from ISN)
If ISN = 1000:
Byte 0 of data → Seq 1000
Byte 1 of data → Seq 1001
Byte 500 of data → Seq 1500
...
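This mapping can be written down directly. One detail the table omits: sequence numbers are 32-bit values that wrap around modulo 2^32, which a small hypothetical helper makes explicit:

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

def seq_for_offset(isn, offset):
    """Sequence number of the byte at `offset` in the stream."""
    return (isn + offset) % SEQ_SPACE

# Matches the example above (ISN = 1000)
print(seq_for_offset(1000, 0))    # 1000
print(seq_for_offset(1000, 1))    # 1001
print(seq_for_offset(1000, 500))  # 1500

# Wraparound near the top of the sequence space
print(seq_for_offset(SEQ_SPACE - 2, 5))  # 3
```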
Reassembly algorithm:
```python
class TCPReceiveBuffer:
    """
    TCP Receive Buffer with reassembly support.

    Maintains:
    - rcv_nxt: Next expected sequence number (in-order frontier)
    - in_order_buffer: Data ready for application (contiguous from ISN)
    - out_of_order_queue: Segments received ahead of gaps
    """

    def __init__(self, initial_seq):
        self.rcv_nxt = initial_seq          # Next expected byte
        self.in_order_buffer = bytearray()  # Ready for application
        self.out_of_order_queue = {}        # seq -> data
        self.window_size = 65535            # Receive window

    def receive_segment(self, seq, data):
        """
        Process incoming segment, maintaining order.

        Returns: (new_rcv_nxt, data_available_for_app)
        """
        seg_len = len(data)
        seg_end = seq + seg_len

        # === Case 1: In-order segment ===
        if seq == self.rcv_nxt:
            # Perfect! Append directly to in-order buffer
            self.in_order_buffer.extend(data)
            self.rcv_nxt = seg_end
            # Now check: can we deliver any buffered out-of-order segments?
            # They might now be contiguous with what we just received
            self._process_out_of_order_queue()
            return self.rcv_nxt, len(self.in_order_buffer) > 0

        # === Case 2: Duplicate/old segment ===
        if seg_end <= self.rcv_nxt:
            # Already received and delivered
            return self.rcv_nxt, False

        # === Case 3: Out-of-order segment (seq > rcv_nxt) ===
        if seq > self.rcv_nxt:
            # Buffer it for later
            # Handle overlaps with existing buffered segments
            self._store_out_of_order(seq, data)
            # Return current rcv_nxt (gap still exists)
            return self.rcv_nxt, False

        # === Case 4: Partial overlap (seq < rcv_nxt < seg_end) ===
        # Some data is duplicate, some is new
        new_start = self.rcv_nxt
        new_data = data[new_start - seq:]
        self.in_order_buffer.extend(new_data)
        self.rcv_nxt = seg_end
        self._process_out_of_order_queue()
        return self.rcv_nxt, True

    def _store_out_of_order(self, seq, data):
        """Store out-of-order segment, handling overlaps."""
        # Simple implementation: just store, handle overlaps on flush
        self.out_of_order_queue[seq] = data

    def _process_out_of_order_queue(self):
        """
        Check if any out-of-order segments are now contiguous
        with the in-order frontier and can be delivered.
        """
        while True:
            # Look for segment starting at rcv_nxt
            if self.rcv_nxt in self.out_of_order_queue:
                data = self.out_of_order_queue.pop(self.rcv_nxt)
                self.in_order_buffer.extend(data)
                self.rcv_nxt += len(data)
            else:
                # No contiguous segment found
                break

    def read(self, max_bytes):
        """Application reads from buffer."""
        read_amount = min(max_bytes, len(self.in_order_buffer))
        data = bytes(self.in_order_buffer[:read_amount])
        self.in_order_buffer = self.in_order_buffer[read_amount:]
        return data


# === Demonstration ===
buf = TCPReceiveBuffer(initial_seq=1000)

# Segments arrive out of order
print("Segments arriving:")
segments = [
    (1500, b"world"),  # Out of order (gap 1000-1499)
    (2000, b"!----"),  # Out of order (bigger gap)
    (1000, b"Hello"),  # In order! Fills gap
]

for seq, data in segments:
    rcv_nxt, has_data = buf.receive_segment(seq, data)
    print(f"  Seq={seq}: rcv_nxt now {rcv_nxt}, data ready: {has_data}")

# After last segment, everything is in order
print(f"Buffer contains: {buf.read(100).decode()}")  # "Helloworld!----"
```

Real TCP implementations use sophisticated data structures (e.g., skip lists, red-black trees) for the out-of-order queue to handle overlapping segments efficiently and support fast insertion/lookup. The simple dictionary approach shown here works conceptually but doesn't handle overlaps optimally.
TCP's receive buffer is central to ordered delivery. It serves two purposes: holding in-order data until the application reads it, and holding out-of-order data until the gaps before it are filled.
Buffer anatomy:
```
TCP Receive Buffer Layout
═══════════════════════════════════════════════════════════════════

Total Buffer Size: 64KB (example)

┌──────────────────────────────────────────────────────────────────┐
│                       RECEIVE BUFFER (64KB)                      │
├─────────────────┬────────────────┬─────────────────┬─────────────┤
│    IN-ORDER     │      GAP       │  OUT-OF-ORDER   │   UNUSED    │
│   (Delivered)   │   (Missing)    │   (Buffered)    │  (Window)   │
└─────────────────┴────────────────┴─────────────────┴─────────────┘
        ↑                                                  ↑
  Application reads                               New data limit
    from here                                  (advertised window)

Sequence number view (example with ISN=1000):

  1000      1500      2000      2500      3000
   |=========×=========|~~~~~~~~~|=========|.........

Legend:
  ═════ IN-ORDER:     Received, waiting for application to read (bytes 1000-1999)
  ~~~~~ GAP:          Not yet received (bytes 2000-2499)
  ═════ OUT-OF-ORDER: Received, buffered until gap fills (bytes 2500-2999)
  ..... UNUSED:       Available for new data (bytes 3000+)

  RCV.NXT = 2000 (next expected byte)
  Application has read up to byte 1500 (the × marker)
  Available window = Buffer size - (RCV.NXT - last read) - out-of-order size
```

Buffer sizing considerations:
| Factor | Impact on Buffer Size |
|---|---|
| Bandwidth-Delay Product | Larger buffer needed for high-bandwidth, high-latency paths |
| Application read speed | Slow reader → buffer fills faster |
| Expected reordering | More reordering → more out-of-order storage needed |
| Memory constraints | Embedded devices may have small buffers |
| Connection count | More connections → less memory per connection |
The receive window (rwnd):
The receive buffer directly controls flow control. The receive window advertised to the sender equals:
rwnd = Buffer Size - Data Awaiting Application Read
= Buffer Size - (Last Byte Received - Last Byte Read by App)
This includes both in-order and out-of-order buffered data. If the buffer fills (large gaps or slow application), rwnd drops to zero, stopping the sender.
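As a quick numeric sketch of that formula (helper name hypothetical):

```python
def advertised_window(buffer_size, last_byte_received, last_byte_read):
    """rwnd = buffer size minus everything buffered but not yet read.

    `last_byte_received` counts both in-order and out-of-order data,
    so buffered out-of-order segments shrink the window too.
    """
    return buffer_size - (last_byte_received - last_byte_read)

# 64 KB buffer, 20 KB received (including out-of-order), app has read 12 KB
print(advertised_window(65536, 20000, 12000))  # 57536

# Application fully caught up: the whole buffer is advertised
print(advertised_window(10000, 5000, 5000))    # 10000
```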
Too-large receive buffers can cause "buffer bloat": excessive buffering leads to high latency as data queues up. Applications that prefer low latency over throughput may benefit from smaller buffers. Modern systems try to auto-tune buffer sizes based on network conditions (TCP autotuning).
When segments arrive out of order, TCP must decide what to do. The receiver's behavior significantly affects performance and reliability.
Options for handling out-of-order segments: the receiver can discard them (simple, but forces the sender to retransmit data that already arrived once) or buffer them (costs memory, but avoids redundant retransmission).
Modern TCP behavior:
RFC 5681 recommends buffering out-of-order segments and sending immediate acknowledgments:
"Out-of-order data segments SHOULD be queued by the receiver... A TCP receiver SHOULD send an immediate duplicate ACK when an out-of-order segment arrives."
The duplicate ACK serves two purposes: it tells the sender exactly which byte is still expected, and a run of duplicate ACKs signals probable loss, triggering fast retransmit.
Overlap handling:
Segments may overlap—the sender might retransmit more data than was actually lost, or network issues might cause partial duplication. TCP must handle overlaps correctly:
Buffered: Bytes 1500-1999
Arriving: Bytes 1400-1700
→ Bytes 1400-1499 = new data (extends backward)
→ Bytes 1500-1700 = overlap (discard duplicate)
The receiver should accept the new bytes and ignore duplicates. Sequence numbers make this determination trivial—compare arriving sequence ranges against already-received ranges.
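A minimal sketch of that trimming logic against a single already-received range (hypothetical helper using half-open byte ranges; a real stack compares against many buffered ranges):

```python
def trim_against(seq, data, have_start, have_end):
    """Split an arriving segment [seq, seq+len) against an
    already-received range [have_start, have_end); return only
    the (start, bytes) pieces that are genuinely new."""
    seg_end = seq + len(data)
    new_parts = []
    if seq < have_start:                  # new bytes before the range
        new_parts.append((seq, data[:min(have_start, seg_end) - seq]))
    if seg_end > have_end:                # new bytes after the range
        start = max(seq, have_end)
        new_parts.append((start, data[start - seq:]))
    return new_parts

# Mirrors the example: bytes 1500-1999 buffered, bytes 1400-1700 arrive
parts = trim_against(1400, bytes(300), 1500, 2000)
print([(s, len(d)) for s, d in parts])  # [(1400, 100)] -- only 1400-1499 is new

# Fully duplicate segment: nothing new survives
print(trim_against(1500, bytes(100), 1500, 2000))  # []
```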
Selective Acknowledgment (SACK) dramatically improves efficiency for out-of-order delivery. Instead of just saying "I need byte 1000", SACK says "I need 1000, but I already have 1500-1999 and 2500-2999". The sender can retransmit only the truly missing data, not everything from the gap onward.
TCP's strict ordering guarantee has a significant cost: head-of-line (HOL) blocking. When one segment is missing, all subsequent data—even if already received—waits.
The problem:
Application wants to read:
→ TCP has bytes 1000-1499 ✓
→ TCP needs bytes 1500-1999 (LOST, being retransmitted)
→ TCP has bytes 2000-4999 ✓ (buffered, waiting)
Application receives: nothing past byte 1499 (blocked waiting for 1500)
The application waits for the lost segment to be retransmitted and received. Meanwhile, 3000 bytes of perfectly good data sit in the buffer, unusable.
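The arithmetic above can be sketched as a tiny check of how much data is deliverable: only bytes contiguous from the in-order frontier count, no matter how much sits buffered beyond the gap (helper name hypothetical; ranges are half-open).

```python
def deliverable_bytes(frontier, received_ranges):
    """Bytes the app can read: contiguous from `frontier` only."""
    total = 0
    for start, end in sorted(received_ranges):  # half-open [start, end)
        if start > frontier:
            break                # gap: everything after it is blocked
        end = max(end, frontier)
        total += end - frontier
        frontier = end
    return total

# Mirrors the example: 1000-1499 held, 1500-1999 lost, 2000-4999 buffered
print(deliverable_bytes(1000, [(1000, 1500), (2000, 5000)]))  # 500

# Once the retransmitted 1500-1999 arrives, everything unblocks
print(deliverable_bytes(1000, [(1000, 1500), (1500, 2000), (2000, 5000)]))  # 4000
```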
HOL blocking duration:
The block lasts until the missing segment is retransmitted and arrives: the loss must first be detected (three duplicate ACKs, or a retransmission timeout of hundreds of milliseconds), and the retransmission then needs another round trip to reach the receiver — often one to several RTTs in total.
For latency-sensitive applications, this can be devastating.
Why HOL blocking is problematic for HTTP/2:
HTTP/2 multiplexes multiple request/response streams over a single TCP connection. If a packet is lost, every stream stalls — even streams whose data arrived intact — because TCP withholds all bytes behind the gap.
This was a major motivation for QUIC/HTTP/3, which uses UDP and implements reliability per-stream rather than for the entire connection.
QUIC (used by HTTP/3) addresses HOL blocking by implementing reliability per-stream rather than per-connection. If a packet for Stream 1 is lost, only Stream 1 blocks. Streams 2 and 3 continue delivering data to the application immediately. This is a fundamental architectural change that TCP cannot provide.
TCP's ordered delivery guarantee affects how applications should be designed. Understanding these implications leads to better protocol choices.
1. Use TCP when order matters:
If your application requires sequential processing of data, TCP is the right choice. Don't try to implement ordering over UDP unless you have very specific requirements (like per-stream ordering).
2. Consider message framing carefully:
Since TCP is a byte stream (not message-oriented), applications must implement their own message boundaries. Ordering helps here—you know that if you read bytes, they're the next bytes in sequence.
```python
# TCP provides ordered bytes - application provides message structure

# === Option 1: Length-Prefixed Messages ===
# Every message starts with its length
def send_message(sock, message: bytes):
    length = len(message)
    # Send 4-byte length, then message
    sock.sendall(length.to_bytes(4, 'big'))
    sock.sendall(message)

def recv_message(sock) -> bytes:
    # Read 4-byte length first (TCP guarantees order)
    length_bytes = recv_exactly(sock, 4)
    length = int.from_bytes(length_bytes, 'big')
    # Now read that many bytes (will arrive in order)
    return recv_exactly(sock, length)

# === Option 2: Delimiter-Based ===
# Messages end with a special delimiter (e.g., newline)
def recv_line(sock) -> str:
    data = b''
    while True:
        byte = sock.recv(1)
        if byte == b'' or byte == b'\n':  # connection closed or delimiter
            break
        data += byte
    return data.decode()

# === Option 3: Fixed-Size Messages ===
# Every message is exactly N bytes
def recv_fixed_message(sock, size=256) -> bytes:
    return recv_exactly(sock, size)

# TCP guarantees that:
# - Bytes arrive in order, so length prefix → data works correctly
# - Delimiters appear in correct sequence positions
# - Fixed-size reads don't mix messages

# The helper function that handles TCP's byte-stream nature:
def recv_exactly(sock, n: int) -> bytes:
    """Receive exactly n bytes (may require multiple recv calls)."""
    data = b''
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("Connection closed")
        data += chunk
    return data
```

3. Be aware of HOL blocking impact:
If your application multiplexes independent data over a single TCP connection, be aware that one lost segment blocks everything. Options: open multiple TCP connections (at the cost of extra handshakes and congestion-control state), accept the blocking, or move to QUIC/HTTP/3, which multiplexes streams without a shared ordering constraint.
4. Buffer sizes affect ordering behavior:
Larger receive buffers can hold more out-of-order data, potentially improving throughput under light loss. But they also increase maximum latency during HOL blocking.
5. Application-level acknowledgment:
TCP's ordered delivery guarantees data reaches the remote TCP stack. If you need to know the remote application processed it, implement application-level acknowledgment:
Client: "Process transaction X"
Server: Receives (TCP ACKs)
Server: Processes transaction
Server: "Transaction X complete" (application ACK)
Client: Now knows transaction was processed
Without this, the client only knows data was delivered to the remote kernel, not that it was acted upon.
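The exchange above can be sketched over a local stream socket pair (TCP sockets behave the same way; the message format here is made up for illustration): the client only treats the transaction as done when the server's own reply arrives, not when the kernel accepts the write.

```python
import socket

# Application-level acknowledgment sketch: TCP's ACK only means the bytes
# reached the remote stack; the "DONE" reply means the app processed them.
client, server = socket.socketpair()

client.sendall(b"TXN:X\n")       # "Process transaction X"
request = server.recv(1024)      # server receives (TCP would ACK here)
# ... server processes the transaction ...
server.sendall(b"DONE:X\n")      # application-level ACK, sent only after work

reply = client.recv(1024)        # client now knows X was actually processed
print(reply)

client.close()
server.close()
```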
Not every application needs TCP's ordering. Evaluate: Is out-of-order data acceptable? Is some data loss tolerable? Would stale data be harmful? If the answers suggest flexibility, UDP or QUIC might be better choices. TCP's ordering is a guarantee, but also a constraint.
The receive window has important interactions with ordering. Out-of-order data consumes buffer space even though it can't be delivered yet, affecting flow control.
How out-of-order data affects the window:
Buffer Size: 10,000 bytes
In-order data (awaiting app): 2,000 bytes
Out-of-order data (buffered): 3,000 bytes
Advertised Window = 10,000 - 2,000 - 3,000 = 5,000 bytes
The sender sees rwnd=5000, but 3000 bytes of that
reduction is due to buffered out-of-order data that
the sender didn't intend to "cost" window space.
This creates a feedback loop: more out-of-order data → smaller window → slower sending → longer gaps → more buffering needed.
SACK helps here too:
With SACK, the sender knows exactly which bytes are buffered out of order. It can retransmit precisely the missing bytes, avoid sending duplicates, and the receiver can more efficiently manage buffer space.
Window update after gap fill:
When a gap is filled: the buffered out-of-order data becomes deliverable all at once, RCV.NXT jumps forward past it, the application can drain the buffer, and the next ACK advertises a much larger window.
This is why a single lost segment can temporarily stall transmission, then throughput resumes quickly once the gap is filled—buffered data is delivered rapidly, the window expands, and sending continues at full speed.
Some receivers implement "receiver-side autotuning": dynamically adjusting buffer sizes based on network conditions. Paths with a high bandwidth-delay product benefit from larger buffers to hold more in-flight data; stable networks with little loss need less buffering for out-of-order segments.
TCP's ordered delivery guarantee transforms the chaotic, unordered world of packet switching into a clean, sequential byte stream. Let's consolidate what we've learned:
TCP presents applications with a simple, powerful abstraction: data written to one end appears in the same order at the other end. This stream model has enabled decades of internet applications without developers needing to worry about packet-level chaos.
What's next:
We've covered ordering in one direction. The next page explores TCP's full-duplex nature—how both directions operate independently and simultaneously, enabling efficient bidirectional communication with piggybacking and independent sequence spaces.