TCP Sequence Numbers

Level: Intermediate

Duration: 60 mins

Topic: TCP Sequence Numbers

Byte-Oriented Transmission

Numbering Every Single Byte

When you send an email, make an API call, or stream a video over TCP, you're not really sending "packets." From TCP's perspective, you're contributing to a continuous stream of bytes—and TCP assigns a sequence number to every single one of those bytes.

This design choice—numbering bytes rather than packets or messages—is one of TCP's most fundamental architectural decisions. It defines how TCP presents data to applications, how it handles varying network conditions, and why the same application data might be split across multiple segments or combined into one.

This page explores TCP's byte-oriented transmission in depth: why it exists, how it works, and what it means for reliable communication.

What You Will Learn

By the end of this page, you will understand: the byte-stream abstraction TCP provides to applications, why bytes are numbered instead of packets, how segmentation and reassembly work at the byte level, the relationship between application writes and TCP segments, and how byte-orientation enables TCP's flexibility in handling network dynamics.

The Byte-Stream Abstraction

TCP provides applications with a byte-stream abstraction—the appearance of a continuous, reliable, ordered stream of bytes flowing between endpoints. This abstraction hides all the complexity of the underlying network:

No visible packet boundaries: Applications write and read bytes; they don't see segments
No visible reordering: Bytes arrive in exactly the order they were sent
No visible losses: Lost data is retransmitted automatically
No visible duplicates: Each byte is delivered exactly once

Think of it like a pipe between two applications. One end puts bytes in; the other end receives the same bytes in the same order. The pipe handles all the messy details of the unreliable network in between.

Key Insight: The sender writes "Hello World!" as one call, but TCP might split it across multiple segments. The receiver might receive the segments out of order. Yet the receiving application sees "Hello World!" exactly as sent—the byte-stream abstraction is preserved.

No Message Boundaries:

Unlike UDP, TCP does not preserve application message boundaries. If an application makes three send() calls:

send("Hello");
send(" ");
send("World");

Depending on timing and buffering, the receiver might see this data as two reads:

One recv() returning "Hello W"
Another recv() returning "orld"

Or as a single read:

One recv() returning "Hello World"

TCP guarantees the bytes arrive in order, but not that message boundaries are preserved. Applications must implement their own message framing if needed.

Application-Level Framing Required

Since TCP doesn't preserve message boundaries, protocols built on TCP must define their own framing. HTTP uses Content-Length or chunked transfer encoding. Many protocols use length prefixes or delimiters. This is the application's responsibility—TCP only guarantees byte-order, not message structure.
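
One common way to frame messages is a length prefix. The sketch below assumes a 4-byte big-endian length header; `encode_frame` and `FrameDecoder` are illustrative names, not part of any standard API. The decoder is fed chunks that deliberately ignore message boundaries, the way recv() might return them.

```python
import struct

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian payload length


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length so the receiver can find the boundary."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Accumulates arbitrary chunks and returns only complete messages."""

    def __init__(self) -> None:
        self._pending = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        self._pending.extend(chunk)
        messages = []
        while len(self._pending) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._pending)
            end = HEADER.size + length
            if len(self._pending) < end:
                break  # payload incomplete; wait for more bytes
            messages.append(bytes(self._pending[HEADER.size:end]))
            del self._pending[:end]
        return messages


# Three messages go out, but the stream arrives in chunks that cut across them.
stream = b"".join(encode_frame(m) for m in (b"Hello", b" ", b"World"))
decoder = FrameDecoder()
received = []
for chunk in (stream[:7], stream[7:12], stream[12:]):
    received.extend(decoder.feed(chunk))
print(received)  # [b'Hello', b' ', b'World']
```

The decoder never assumes that one chunk equals one message; it only trusts the length header, which is what restores the boundaries TCP does not keep.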

Why Bytes, Not Packets?

Other protocols number packets or messages. Why does TCP number individual bytes? This design was intentional and provides crucial flexibility.

Consider the alternative: Packet-numbered protocol

If TCP numbered packets instead of bytes:

Packet #1 carries 1000 bytes
Packet #2 carries 1000 bytes
Packet #1 is lost

To acknowledge progress, the receiver could only report whole packets: "I got packet #2, but not packet #1." It could not describe progress at any finer granularity. And if the sender retransmits packet #1 with a different segment size (perhaps 500 bytes each because the path MTU changed), the receiver has no obvious way to correlate the new, smaller packets with the original one.

Byte numbering solves this elegantly:

Problems with Packet Numbering

• Segment size must be fixed or tracked separately
• Partial ACKs require complex tracking
• Path MTU changes complicate retransmission
• Receiver buffering tied to packet boundaries
• Nagle's algorithm harder to implement
• Selective ACK becomes complicated

Benefits of Byte Numbering

• ACK specifies exactly which byte next
• Segment size can vary dynamically
• Retransmissions can use any segmentation
• Partial data delivery is natural
• Buffering is position-based, not packet-based
• Works with any MTU, any network path

Concrete Example: Dynamic Segmentation

Consider a sender transmitting bytes 1000-3999 (3000 bytes). Initially, the path supports 1000-byte segments:

Segment 1: SEQ=1000, LEN=1000 (bytes 1000-1999) — Lost!
Segment 2: SEQ=2000, LEN=1000 (bytes 2000-2999) — Delivered
Segment 3: SEQ=3000, LEN=1000 (bytes 3000-3999) — Delivered

The path MTU then drops so that only 500-byte segments fit. The sender retransmits the lost byte range as two smaller segments:

Segment 1a: SEQ=1000, LEN=500 (bytes 1000-1499) — Delivered
Segment 1b: SEQ=1500, LEN=500 (bytes 1500-1999) — Delivered

Because sequence numbers identify bytes, the receiver seamlessly accepts the retransmission even though segmentation differs. ACK=4000 confirms all 3000 bytes received.

Decoupling Data from Segmentation

Byte-oriented numbering decouples the data stream from its physical segmentation. Segments are merely containers for byte ranges. The same data can be carried in different-sized segments, retransmitted with different segmentation, and acknowledged by byte position. This flexibility is essential for TCP's adaptability.
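
The dynamic segmentation example can be traced with a few lines of Python. This is a minimal sketch, assuming segments are described as (SEQ, LEN) pairs and ignoring 32-bit sequence-number wraparound; `cumulative_ack` is a hypothetical helper name.

```python
def cumulative_ack(rcv_nxt: int, received: list[tuple[int, int]]) -> int:
    """Return the next expected sequence number after merging (seq, length) segments."""
    for seq, length in sorted(received):
        if seq > rcv_nxt:
            break  # gap: bytes rcv_nxt .. seq-1 are still missing
        rcv_nxt = max(rcv_nxt, seq + length)
    return rcv_nxt


received = [(2000, 1000), (3000, 1000)]   # segments 2 and 3 arrive; segment 1 is lost
print(cumulative_ack(1000, received))     # 1000 (nothing contiguous yet)

received.append((1000, 500))              # retransmission 1a
print(cumulative_ack(1000, received))     # 1500

received.append((1500, 500))              # retransmission 1b
print(cumulative_ack(1000, received))     # 4000 (all 3000 bytes acknowledged)
```

The function never looks at how many segments carried the data, only at which byte ranges are covered, which is why the retransmission can use a different segment size.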

Sequence Numbers and Byte Positions

Each TCP segment's sequence number identifies the first byte of data in that segment. The relationship between sequence numbers and byte positions is fundamental to understanding TCP's operation.

Formal Definition:

For a segment with:

SEQ = S (sequence number in TCP header)
LEN = L (number of data bytes)

The bytes in this segment occupy positions:

First byte: S
Second byte: S + 1
...
Last byte: S + L - 1
Next expected byte: S + L

byte_position_calculation.py
Python
def calculate_byte_range(seq_number: int, data_length: int) -> tuple:
    """
    Calculate the byte range covered by a TCP segment.
    
    Args:
        seq_number: The sequence number from TCP header
        data_length: Number of data bytes in the segment
    
    Returns:
        Tuple of (first_byte, last_byte, next_expected)
    """
    first_byte = seq_number
    last_byte = seq_number + data_length - 1
    next_expected = seq_number + data_length
    
    return (first_byte, last_byte, next_expected)
 
# Example: Segment with SEQ=5000 carrying 1460 bytes
first, last, next_exp = calculate_byte_range(5000, 1460)
print(f"Bytes {first} to {last} in this segment")
print(f"Next segment should start at {next_exp}")
# Output:
# Bytes 5000 to 6459 in this segment
# Next segment should start at 6460
 
# Example: Multiple segments forming a stream
segments = [
    (1000, 500),   # SEQ=1000, 500 bytes
    (1500, 1000),  # SEQ=1500, 1000 bytes
    (2500, 750),   # SEQ=2500, 750 bytes
]
 
print("\nByte stream composition:")
for seq, length in segments:
    first, last, next_exp = calculate_byte_range(seq, length)
    print(f"  SEQ={seq}: bytes {first}-{last}")
    
# Output:
# Byte stream composition:
#   SEQ=1000: bytes 1000-1499
#   SEQ=1500: bytes 1500-2499
#   SEQ=2500: bytes 2500-3249

Segment Size Independence:

The same byte stream can be segmented in countless ways. All of the following represent the same 3000 bytes:

Segmentation A (3 segments of 1000 bytes each):

SEQ=1000, LEN=1000 (bytes 1000-1999)
SEQ=2000, LEN=1000 (bytes 2000-2999)
SEQ=3000, LEN=1000 (bytes 3000-3999)

Segmentation B (6 segments of 500 bytes each):

SEQ=1000, LEN=500 (bytes 1000-1499)
SEQ=1500, LEN=500 (bytes 1500-1999)
... and so on

Segmentation C (Mixed sizes):

SEQ=1000, LEN=1200 (bytes 1000-2199)
SEQ=2200, LEN=1800 (bytes 2200-3999)

All three segmentations describe the identical byte stream. The receiver treats them equivalently.
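
To check this equivalence concretely, the sketch below cuts the same 3000 bytes according to segmentations A, B, and C and reassembles each one by sequence number. The helper names `segment_stream` and `reassemble` are assumptions made for illustration.

```python
def segment_stream(start_seq: int, data: bytes, sizes: list[int]) -> list[tuple[int, bytes]]:
    """Cut data into (seq, payload) segments using the given payload sizes."""
    if sum(sizes) != len(data):
        raise ValueError("sizes must add up to the length of the data")
    segments, offset = [], 0
    for size in sizes:
        segments.append((start_seq + offset, data[offset:offset + size]))
        offset += size
    return segments


def reassemble(start_seq: int, segments: list[tuple[int, bytes]]) -> bytes:
    """Place each payload by its sequence number, regardless of list order."""
    out = bytearray(sum(len(payload) for _, payload in segments))
    for seq, payload in segments:
        position = seq - start_seq
        out[position:position + len(payload)] = payload
    return bytes(out)


data = bytes(i % 256 for i in range(3000))           # bytes 1000-3999 of the stream
seg_a = segment_stream(1000, data, [1000] * 3)       # Segmentation A
seg_b = segment_stream(1000, data, [500] * 6)        # Segmentation B
seg_c = segment_stream(1000, data, [1200, 1800])     # Segmentation C

for name, segs in (("A", seg_a), ("B", seg_b), ("C", seg_c)):
    # Reverse the arrival order to show that only SEQ determines placement.
    same = reassemble(1000, list(reversed(segs))) == data
    print(name, [seq for seq, _ in segs], "identical stream:", same)
```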

Maximum Segment Size (MSS)

Although an IPv4 datagram can in theory be as large as 65,535 bytes, practical constraints keep segments far smaller. Each endpoint announces its MSS (Maximum Segment Size) in an option on its SYN segment during connection setup; the value is typically derived from the MTU minus the IP and TCP header sizes. A common value is 1460 bytes (a 1500-byte Ethernet MTU minus a 20-byte IPv4 header and a 20-byte TCP header). Smaller values such as 1360 bytes appear on paths that add further headers.
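
The arithmetic behind the common 1460-byte value is easy to reproduce. A minimal sketch, assuming headers without options (20 bytes for IPv4, 40 for IPv6, 20 for TCP); `mss_for_mtu` is a hypothetical helper:

```python
IPV4_HEADER = 20  # bytes, no IP options
IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, no TCP options


def mss_for_mtu(mtu: int, ip_header: int = IPV4_HEADER, tcp_header: int = TCP_HEADER) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - ip_header - tcp_header


print(mss_for_mtu(1500))                         # 1460 (Ethernet, IPv4)
print(mss_for_mtu(1500, ip_header=IPV6_HEADER))  # 1440 (Ethernet, IPv6)
```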

Application Writes vs Segment Boundaries

A critical implication of byte-oriented transmission is that application write boundaries do not correspond to segment boundaries. TCP's segmentation is independent of how the application writes data.

The TCP Send Buffer:

When an application calls send() or write(), data enters TCP's send buffer—a contiguous byte queue. TCP extracts bytes from this buffer and packages them into segments based on:

Maximum Segment Size (MSS): Upper bound on segment payload
Congestion window (cwnd): Sender's view of network capacity
Receive window (rwnd): Receiver's advertised buffer space
Nagle's Algorithm: Delays small writes to coalesce data
Push conditions: Application-requested immediate transmission

write_segmentation_example.txt
Application makes these writes:
  write(100 bytes)  → Send buffer: [100 bytes]
  write(200 bytes)  → Send buffer: [300 bytes total]
  write(1500 bytes) → Send buffer: [1800 bytes total]
  
TCP with MSS=1000 might segment as:
  Segment 1: 1000 bytes (all of writes 1 and 2, plus the first 700 bytes of write 3)
  Segment 2: 800 bytes (the remaining 800 bytes of write 3)
  
═════════════════════════════════════════════════════════════
Application Writes:   |  100  |    200    |        1500        |
                      └───────┴───────────┴───────────────────┘
                      
TCP Segmentation:     |         1000            |     800     |
                      └─────────────────────────┴─────────────┘
                           Segment 1               Segment 2
═════════════════════════════════════════════════════════════
 
Note: Application boundaries (|) don't align with TCP boundaries!

Coalescing and Splitting:

Coalescing: Multiple small writes might be combined into one segment
Splitting: One large write might be split across multiple segments

Both happen transparently. The byte-stream abstraction is maintained regardless of segmentation.
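
The following sketch shows how application writes map onto segments when MSS is the only limit. It is a simplification that assumes all writes are already in the send buffer and ignores cwnd, rwnd, and Nagle's algorithm; `segment_writes` is an illustrative name.

```python
def segment_writes(write_sizes: list[int], mss: int) -> list[dict[int, int]]:
    """For each segment, map write number -> bytes taken from that write."""
    segments = []
    current: dict[int, int] = {}
    room = mss
    for index, size in enumerate(write_sizes, start=1):
        remaining = size
        while remaining > 0:
            take = min(remaining, room)
            current[index] = current.get(index, 0) + take
            remaining -= take
            room -= take
            if room == 0:              # segment is full: emit it and start a new one
                segments.append(current)
                current, room = {}, mss
    if current:                        # trailing partial segment
        segments.append(current)
    return segments


for number, parts in enumerate(segment_writes([100, 200, 1500], mss=1000), start=1):
    print(f"Segment {number}: {sum(parts.values())} bytes from writes {parts}")
# Segment 1: 1000 bytes from writes {1: 100, 2: 200, 3: 700}
# Segment 2: 800 bytes from writes {3: 800}
```

The first segment coalesces two whole writes with part of a third, and the third write is split across both segments, so both effects appear in one run.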

Coalescing Example

• Application: write(50), write(75), write(25)
• Send buffer: 150 bytes accumulated
• Nagle's algorithm holds small writes back while earlier data is still unacknowledged
• TCP sends: 1 segment of 150 bytes
• Receiver sees: 150 contiguous bytes

Splitting Example

• Application: write(5000 bytes)
• MSS = 1460 bytes
• TCP sends: 4 segments
• Segment 1: 1460 bytes (SEQ=N)
• Segment 2: 1460 bytes (SEQ=N+1460)
• Segment 3: 1460 bytes (SEQ=N+2920)
• Segment 4: 620 bytes (SEQ=N+4380)

Don't Assume Message Alignment

A common mistake is assuming that a complete JSON object passed to one send() will arrive in one recv(). That is not guaranteed: TCP may deliver part of an object, or several objects together. Always design your application to handle arbitrary byte boundaries in received data.
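
One way to cope with arbitrary boundaries is to buffer incoming bytes until a delimiter appears. The sketch below assumes a newline-delimited JSON convention (one object per line), which is a choice made for this example rather than anything TCP prescribes; `JsonLineDecoder` is a hypothetical class.

```python
import json


class JsonLineDecoder:
    """Assumed framing: each JSON object is serialized on one newline-terminated line."""

    def __init__(self) -> None:
        self._pending = b""

    def feed(self, chunk: bytes) -> list:
        """Add a received chunk and return every object completed so far."""
        self._pending += chunk
        # Everything before the last newline is complete; the tail stays buffered.
        *complete, self._pending = self._pending.split(b"\n")
        return [json.loads(line) for line in complete if line]


objects = [{"id": 1, "msg": "hello"}, {"id": 2, "msg": "world"}]
wire = b"".join(json.dumps(obj).encode() + b"\n" for obj in objects)

decoder = JsonLineDecoder()
decoded = []
# Chunk boundaries fall in the middle of both objects.
for chunk in (wire[:10], wire[10:40], wire[40:]):
    decoded.extend(decoder.feed(chunk))
print(decoded == objects)  # True
```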

Receiver Reassembly

The receiver uses sequence numbers to reassemble the byte stream correctly, regardless of segment arrival order or size. This reassembly process is central to TCP's reliability.

Reassembly Buffer:

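The receiver maintains a reassembly buffer indexed by sequence number. Each arriving segment places its bytes at the correct positions:

Buffer position = Segment SEQ - Initial Receive Sequence (IRS)

(In real TCP, sequence numbers are 32-bit and wrap around, and the SYN itself consumes sequence number IRS, so the first data byte is IRS + 1. The formula and the code on this page ignore both details for simplicity.)

This means:

Segments can arrive in any order and be placed correctly
Duplicate bytes are detected and discarded
Gaps are identified as missing data
Contiguous data starting at RCV.NXT can be delivered to the application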

reassembly_buffer.py
Python
class TCPReassemblyBuffer:
    """
    Simplified TCP reassembly buffer demonstrating byte-oriented 
    receive processing.
    """
    def __init__(self, initial_receive_seq: int) -> None:
        self.irs = initial_receive_seq      # Initial Receive Sequence
        self.buffer = {}                    # SEQ -> byte value (one entry per byte)
        self.rcv_nxt = initial_receive_seq  # Next expected SEQ
        
    def receive_segment(self, seq: int, data: bytes) -> bytes:
        """
        Process an incoming segment. Returns bytes deliverable 
        to the application (contiguous from rcv_nxt).

        Simplifications: no receive-window limit and no 32-bit
        sequence-number wraparound.
        """
        # Place each byte of the segment at its own sequence number
        for i, byte in enumerate(data):
            byte_seq = seq + i
            
            # Skip bytes that were already delivered to the application
            if byte_seq < self.rcv_nxt:
                continue
            
            # Store the byte unless a copy is already buffered
            if byte_seq not in self.buffer:
                self.buffer[byte_seq] = byte
        
        # Deliver contiguous bytes to application
        deliverable = bytearray()
        while self.rcv_nxt in self.buffer:
            deliverable.append(self.buffer.pop(self.rcv_nxt))
            self.rcv_nxt += 1
            
        return bytes(deliverable)
    
# Example: Out-of-order arrival
buffer = TCPReassemblyBuffer(initial_receive_seq=1000)
 
# Segments arrive out of order
result1 = buffer.receive_segment(1005, b"World")  # Future: buffered
print(f"After SEQ=1005: delivered {len(result1)} bytes")
# Output: After SEQ=1005: delivered 0 bytes
 
result2 = buffer.receive_segment(1000, b"Hello")  # Expected: fills gap!
print(f"After SEQ=1000: delivered {len(result2)} bytes")
print(f"Delivered: {result2.decode()}")
# Output: After SEQ=1000: delivered 10 bytes
# Output: Delivered: HelloWorld

Key Reassembly Properties:

Position Independence: Segments place bytes at positions determined by SEQ, not arrival order
Gap Tolerance: Missing bytes create gaps in the buffer; receipt of the missing segment fills the gap
Duplicate Handling: Bytes already in buffer or delivered are discarded
Contiguous Delivery: Only contiguous bytes from RCV.NXT are delivered to the application
Buffer Indexing: The buffer is logically indexed by sequence number, enabling O(1) placement

Out-of-Order Buffering Improves Performance

By buffering out-of-order segments, TCP avoids unnecessary retransmissions. If segments 1, 3, and 4 arrive but segment 2 is lost, only segment 2 needs retransmission; once it arrives, the receiver can acknowledge everything through segment 4. Without buffering, segments 2, 3, and 4 would all need retransmitting. Virtually all modern TCP implementations buffer out-of-order data.
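
A receiver that buffers out-of-order data can work out exactly which byte ranges are still missing. The sketch below is an illustration under simple assumptions (segments given as (SEQ, LEN) pairs, no sequence-number wraparound); `missing_ranges` is a hypothetical helper, not a real TCP API.

```python
def missing_ranges(rcv_nxt: int, buffered: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return (first, last) sequence ranges missing below the highest buffered byte."""
    gaps = []
    expected = rcv_nxt
    for seq, length in sorted(buffered):
        if seq > expected:
            gaps.append((expected, seq - 1))  # hole between contiguous data and this segment
        expected = max(expected, seq + length)
    return gaps


# Four 1000-byte segments start at SEQ=1000. Segment 1 was delivered in order,
# so RCV.NXT is 2000; segment 2 is lost; segments 3 and 4 are buffered.
print(missing_ranges(2000, [(3000, 1000), (4000, 1000)]))  # [(2000, 2999)]
```

Only the 1000 bytes of segment 2 show up as missing, which matches the claim that nothing else needs to be sent again.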

Implications for Flow Control

Byte-oriented numbering profoundly impacts how TCP implements flow control. The receiver's window advertisement specifies exactly how many bytes can be accepted, not how many packets.

Receive Window (rwnd):

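The receiver advertises its receive window in the Window field of every segment it sends, including each ACK:

Window = RCV.BUFFER_SIZE - bytes_buffered

where bytes_buffered counts the bytes that have been received in order (below RCV.NXT) but not yet read by the application.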

This directly tells the sender: "You may send bytes with sequence numbers from RCV.NXT up to RCV.NXT + rwnd - 1."

Byte-Oriented Flow Control
Concept	Byte-Oriented Behavior	Why It Matters
Window Size	Exact byte count	Receiver knows precisely how much buffer space to allocate
Window Update	Any byte count	Can open/close by exact bytes as app reads data
Zero Window	No bytes allowed	Sender stops until window opens (window probe continues)
Window Scaling	Multiplier for bytes	Allows larger windows (up to 1 GB) for high-BDP paths
Silly Window Prevention	Byte-level decisions	Clark's algorithm: don't advertise tiny windows

Precise Control:

Byte-oriented windows enable precise control:

Receiver with 8192 bytes free advertises exactly 8192
Application reads 4096 bytes → window opens to 12288
Sender fills 8192 bytes → receiver window drops accordingly

This fine-grained control prevents receiver buffer overflow while maximizing throughput.

flow_control_example.txt
Byte-Oriented Flow Control Example:
═══════════════════════════════════════════════════════════════════
 
Initial state:
  Receiver buffer: 65535 bytes (about 64 KB)
  RCV.NXT = 10000
  Advertised window: 65535 bytes
  
Sender may send: bytes 10000 through 75534 (65535 bytes)
 
─────────────────────────────────────────────────────────────────
Event: Sender transmits 20000 bytes (SEQ 10000-29999)
─────────────────────────────────────────────────────────────────
  Receiver state:
    RCV.NXT = 30000 (after receiving all 20000)
    Buffer used: 20000 bytes
    Advertised window: 65535 - 20000 = 45535 bytes
    
  Sender may now send: bytes 30000 through 75534 (45535 bytes)
 
─────────────────────────────────────────────────────────────────
Event: Application reads 10000 bytes from receive buffer
─────────────────────────────────────────────────────────────────
  Receiver state:
    RCV.NXT = 30000 (unchanged - no new data)
    Buffer used: 10000 bytes (app consumed 10000)
    Advertised window: 65535 - 10000 = 55535 bytes
    
  Sender may now send: bytes 30000 through 85534 (55535 bytes)
  
  ↑ Window "opened" by exactly the bytes the application read
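
The same walkthrough can be reproduced in code. This is a minimal sketch of the receiver-side bookkeeping only, assuming in-order arrival and no window scaling; the class and method names are illustrative.

```python
class ReceiveWindow:
    """Tracks RCV.NXT and the advertised window for one connection (simplified)."""

    def __init__(self, capacity: int, rcv_nxt: int) -> None:
        self.capacity = capacity  # receive buffer size in bytes
        self.rcv_nxt = rcv_nxt    # next expected sequence number
        self.buffered = 0         # received in order, not yet read by the application

    @property
    def window(self) -> int:
        return self.capacity - self.buffered

    def permitted_range(self) -> tuple[int, int]:
        """First and last sequence numbers the sender may currently use."""
        return (self.rcv_nxt, self.rcv_nxt + self.window - 1)

    def on_data(self, length: int) -> None:
        if length > self.window:
            raise ValueError("sender exceeded the advertised window")
        self.rcv_nxt += length
        self.buffered += length

    def on_app_read(self, length: int) -> int:
        taken = min(length, self.buffered)
        self.buffered -= taken
        return taken


rx = ReceiveWindow(capacity=65535, rcv_nxt=10000)
print(rx.window, rx.permitted_range())  # 65535 (10000, 75534)
rx.on_data(20000)
print(rx.window, rx.permitted_range())  # 45535 (30000, 75534)
rx.on_app_read(10000)
print(rx.window, rx.permitted_range())  # 55535 (30000, 85534)
```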

Window Scaling for Modern Networks

The original 16-bit window field limits advertised windows to 64 KB. For high-bandwidth, high-latency paths (high bandwidth-delay product), this is insufficient. RFC 7323 defines window scaling—a multiplier negotiated during handshake that allows effective windows up to 1 GB. The scaling factor applies to the byte count, maintaining byte-oriented semantics.
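
The effect of the scale factor is a left shift of the 16-bit window field. The sketch below assumes the RFC 7323 maximum shift count of 14; the helper names and the 1 Gbit/s, 100 ms path are made up for illustration.

```python
MAX_SHIFT = 14  # largest shift count RFC 7323 permits


def effective_window(window_field: int, shift: int) -> int:
    """Scale the 16-bit window field by the negotiated shift count."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift


def min_shift_for(window_bytes: int) -> int:
    """Smallest shift whose maximum scaled window covers window_bytes."""
    for shift in range(MAX_SHIFT + 1):
        if (0xFFFF << shift) >= window_bytes:
            return shift
    raise ValueError("window too large even with the maximum shift")


print(effective_window(0xFFFF, 0))          # 65535 (unscaled limit)
print(effective_window(0xFFFF, MAX_SHIFT))  # 1073725440 (about 1 GB)

# Hypothetical path: 1 Gbit/s with a 100 ms round-trip time.
bdp_bytes = 1_000_000_000 // 8 * 100 // 1000
print(bdp_bytes, min_shift_for(bdp_bytes))  # 12500000 8
```

The window is still a byte count after scaling; the shift only changes the granularity at which it can be expressed.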

Contrast with Message-Oriented Protocols

To fully appreciate TCP's byte-oriented design, let's contrast it with message-oriented protocols like UDP and SCTP.

UDP (User Datagram Protocol):

UDP is message-oriented. Each sendto() creates exactly one datagram; each recvfrom() returns exactly one datagram. Message boundaries are preserved. There are no sequence numbers—datagrams are independent.

Byte-Oriented vs Message-Oriented
Aspect	TCP (Byte-Oriented)	UDP (Message-Oriented)
Data unit	Continuous byte stream	Discrete datagrams
Boundaries	Not preserved	Preserved exactly
Ordering	Guaranteed in-order delivery	No ordering guarantee
Partial delivery	Possible (any byte count)	All-or-nothing per datagram
Sequence tracking	Per-byte sequence numbers	None (or application-layer)
Reassembly	Automatic by TCP	Application responsibility
Best for	Continuous data streams	Independent messages/requests

SCTP (Stream Control Transmission Protocol):

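SCTP is a reliable transport that is message-oriented rather than byte-oriented:

Streams: an SCTP association can carry several independent, ordered sequences of messages (despite the name, these are not byte streams)
Message boundaries: each application message is delivered whole, never merged with its neighbors

SCTP uses Transmission Sequence Numbers (TSN) that number data chunks (each carrying a message or a fragment of one), not bytes. This provides message-boundary preservation with reliability guarantees.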

When Each Model Fits:

Byte-oriented (TCP): File transfers, HTTP, database connections—where data is naturally a stream
Message-oriented (UDP/SCTP): DNS queries, VoIP packets, game state updates—where each message is independent

Choosing the Right Abstraction

If your application naturally works with independent messages and needs reliability, consider whether TCP's byte stream adds complexity (requiring framing) or if SCTP's message mode or a UDP-based reliable protocol (like QUIC) might be more natural. TCP's byte stream is powerful but not always the best fit.

Summary: The Power of Byte-Oriented Design

TCP's byte-oriented design is a fundamental architectural choice that shapes every aspect of the protocol. Let's consolidate what we've learned:

Key Takeaways

• Byte-stream abstraction: TCP presents a continuous stream of bytes to applications, hiding all network complexity
• Per-byte sequence numbers: Every byte has a unique sequence number, enabling precise tracking and acknowledgment
• No message boundaries: Applications must implement their own framing; TCP doesn't preserve write() boundaries
• Segmentation independence: The same byte stream can be segmented differently for initial transmission and retransmission
• Position-based reassembly: Receivers place bytes at positions determined by SEQ, not arrival order
• Precise flow control: Windows advertise exact byte counts, enabling fine-grained buffer management
• Flexibility and adaptability: Byte-orientation allows TCP to adapt to varying MTUs, network conditions, and application patterns

What's Next:

We've established that every byte has a sequence number—but where do those sequence numbers start? The next page explores the Initial Sequence Number (ISN)—how it's chosen, why randomization matters for security, and its role in TCP connection establishment.

Page Complete

You now understand TCP's byte-oriented nature: how every byte is numbered, how applications interact with the byte stream, how receivers reassemble data, and how this design enables TCP's flexibility. Next, we examine how the initial sequence number is selected and why it matters.

2 / 5

Loading learning content...

Computer NetworksTCP Sequence Numbers

TCP Sequence Numbers

LevelIntermediate

Duration60 mins

TopicTCP Sequence Numbers

2 / 5

Byte-Oriented Transmission

Numbering Every Single Byte

This page explores TCP's byte-oriented transmission in depth: why it exists, how it works, and what it means for reliable communication.

What You Will Learn

The Byte-Stream Abstraction

No visible packet boundaries: Applications write and read bytes; they don't see segments
No visible reordering: Bytes arrive in exactly the order they were sent
No visible losses: Lost data is retransmitted automatically
No visible duplicates: Each byte is delivered exactly once

Converting Mermaid diagram...

No Message Boundaries:

Unlike UDP, TCP does not preserve application message boundaries. If an application makes three send() calls:

send("Hello");
send(" ");
send("World");

The receiver might see this data in any of these forms:

One recv() returning "Hello W"
Another recv() returning "orld"

Or:

One recv() returning "Hello World"

TCP guarantees the bytes arrive in order, but not that message boundaries are preserved. Applications must implement their own message framing if needed.

Application-Level Framing Required

Why Bytes, Not Packets?

Other protocols number packets or messages. Why does TCP number individual bytes? This design was intentional and provides crucial flexibility.

Consider the alternative: Packet-numbered protocol

If TCP numbered packets instead of bytes:

Packet #1 carries 1000 bytes
Packet #2 carries 1000 bytes
Packet #1 is lost

Byte numbering solves this elegantly:

Problems with Packet Numbering

•Segment size must be fixed or tracked separately
•Partial ACKs require complex tracking
•Path MTU changes complicate retransmission
•Receiver buffering tied to packet boundaries
•Nagle's algorithm harder to implement
•Selective ACK becomes complicated

Benefits of Byte Numbering

•ACK specifies exactly which byte next
•Segment size can vary dynamically
•Retransmissions can use any segmentation
•Partial data delivery is natural
•Buffering is position-based, not packet-based
•Works with any MTU, any network path

Concrete Example: Dynamic Segmentation

Consider a sender transmitting bytes 1000-3999 (3000 bytes). Initially, the path supports 1000-byte segments:

Segment 1: SEQ=1000, LEN=1000 (bytes 1000-1999) — Lost!
Segment 2: SEQ=2000, LEN=1000 (bytes 2000-2999) — Delivered
Segment 3: SEQ=3000, LEN=1000 (bytes 3000-3999) — Delivered

Path MTU changes, now only 500-byte segments work. The retransmission:

Segment 1a: SEQ=1000, LEN=500 (bytes 1000-1499) — Delivered
Segment 1b: SEQ=1500, LEN=500 (bytes 1500-1999) — Delivered

Because sequence numbers identify bytes, the receiver seamlessly accepts the retransmission even though segmentation differs. ACK=4000 confirms all 3000 bytes received.

Decoupling Data from Segmentation

Sequence Numbers and Byte Positions

Each TCP segment's sequence number identifies the first byte of data in that segment. The relationship between sequence numbers and byte positions is fundamental to understanding TCP's operation.

Formal Definition:

For a segment with:

SEQ = S (sequence number in TCP header)
LEN = L (number of data bytes)

The bytes in this segment occupy positions:

First byte: S
Second byte: S + 1
...
Last byte: S + L - 1
Next expected byte: S + L

byte_position_calculation.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
def calculate_byte_range(seq_number: int, data_length: int) -> tuple:
    """
    Calculate the byte range covered by a TCP segment.
    
    Args:
        seq_number: The sequence number from TCP header
        data_length: Number of data bytes in the segment
    
    Returns:
        Tuple of (first_byte, last_byte, next_expected)
    """
    first_byte = seq_number
    last_byte = seq_number + data_length - 1
    next_expected = seq_number + data_length
    
    return (first_byte, last_byte, next_expected)
 
# Example: Segment with SEQ=5000 carrying 1460 bytes
first, last, next_exp = calculate_byte_range(5000, 1460)
print(f"Bytes {first} to {last} in this segment")
print(f"Next segment should start at {next_exp}")
# Output:
# Bytes 5000 to 6459 in this segment
# Next segment should start at 6460
 
# Example: Multiple segments forming a stream
segments = [
    (1000, 500),   # SEQ=1000, 500 bytes
    (1500, 1000),  # SEQ=1500, 1000 bytes
    (2500, 750),   # SEQ=2500, 750 bytes
]
 
print("\nByte stream composition:")
for seq, length in segments:
    first, last, next_exp = calculate_byte_range(seq, length)
    print(f"  SEQ={seq}: bytes {first}-{last}")
    
# Output:
# Byte stream composition:
#   SEQ=1000: bytes 1000-1499
#   SEQ=1500: bytes 1500-2499
#   SEQ=2500: bytes 2500-3249

Segment Size Independence:

The same byte stream can be segmented in countless ways. All of the following represent the same 3000 bytes:

Segmentation A (3 segments of 1000 bytes each):

SEQ=1000, LEN=1000 (bytes 1000-1999)
SEQ=2000, LEN=1000 (bytes 2000-2999)
SEQ=3000, LEN=1000 (bytes 3000-3999)

Segmentation B (6 segments of 500 bytes each):

SEQ=1000, LEN=500 (bytes 1000-1499)
SEQ=1500, LEN=500 (bytes 1500-1999)
... and so on

Segmentation C (Mixed sizes):

SEQ=1000, LEN=1200 (bytes 1000-2199)
SEQ=2200, LEN=1800 (bytes 2200-3999)

All three segmentations describe the identical byte stream. The receiver treats them equivalently.

Maximum Segment Size (MSS)

Application Writes vs Segment Boundaries

The TCP Send Buffer:

When an application calls send() or write(), data enters TCP's send buffer—a contiguous byte queue. TCP extracts bytes from this buffer and packages them into segments based on:

Maximum Segment Size (MSS): Upper bound on segment payload
Congestion window (cwnd): Sender's view of network capacity
Receive window (rwnd): Receiver's advertised buffer space
Nagle's Algorithm: Delays small writes to coalesce data
Push conditions: Application-requested immediate transmission

write_segmentation_example.txt
Application makes these writes:
  write(100 bytes)  → Send buffer: [100 bytes]
  write(200 bytes)  → Send buffer: [300 bytes total]
  write(1500 bytes) → Send buffer: [1800 bytes total]
  
TCP with MSS=1000 might segment as:
  Segment 1: 1000 bytes (partial from writes 1, 2, and 3)
  Segment 2: 800 bytes (remainder of writes)
  
═════════════════════════════════════════════════════════════
Application Writes:   |  100  |    200    |        1500        |
                      └───────┴───────────┴───────────────────┘
                      
TCP Segmentation:     |         1000            |     800     |
                      └─────────────────────────┴─────────────┘
                           Segment 1               Segment 2
═════════════════════════════════════════════════════════════
 
Note: Application boundaries (|) don't align with TCP boundaries!

Coalescing and Splitting:

Coalescing: Multiple small writes might be combined into one segment
Splitting: One large write might be split across multiple segments

Both happen transparently. The byte-stream abstraction is maintained regardless of segmentation.

Coalescing Example

•Application: write(50), write(75), write(25)
•Send buffer: 150 bytes accumulated
•After Nagle delay or buffer full:
•TCP sends: 1 segment of 150 bytes
•Receiver sees: 150 contiguous bytes

Splitting Example

•Application: write(5000 bytes)
•MSS = 1460 bytes
•TCP sends: 4 segments
•Segment 1: 1460 bytes (SEQ=N)
•Segment 2: 1460 bytes (SEQ=N+1460)
•Segment 3: 1460 bytes (SEQ=N+2920)
•Segment 4: 620 bytes (SEQ=N+4380)

Don't Assume Message Alignment

Receiver Reassembly

The receiver uses sequence numbers to reassemble the byte stream correctly, regardless of segment arrival order or size. This reassembly process is central to TCP's reliability.

Reassembly Buffer:

The receiver maintains a reassembly buffer indexed by sequence number. Each arriving segment places its bytes at the correct positions:

Buffer position = Segment SEQ - Initial Receive Sequence (IRS)

This means:

Segments can arrive in any order and be placed correctly
Duplicate bytes are detected and discarded
Gaps are identified as missing data
Contiguous data from position 0 can be delivered to the application

reassembly_buffer.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
class TCPReassemblyBuffer:
    """
    Simplified TCP reassembly buffer demonstrating byte-oriented 
    receive processing.
    """
    def __init__(self, initial_receive_seq):
        self.irs = initial_receive_seq  # Initial Receive Sequence
        self.buffer = {}                 # SEQ -> bytes mapping
        self.rcv_nxt = initial_receive_seq  # Next expected SEQ
        
    def receive_segment(self, seq: int, data: bytes) -> bytes:
        """
        Process an incoming segment. Returns bytes deliverable 
        to the application (contiguous from rcv_nxt).
        """
        # Calculate buffer positions for this segment's bytes
        for i, byte in enumerate(data):
            byte_seq = seq + i
            
            # Only accept bytes within receive window
            if byte_seq < self.rcv_nxt:
                # Duplicate byte - already received
                continue
            
            # Store byte at its sequence number position
            if byte_seq not in self.buffer:
                self.buffer[byte_seq] = byte
        
        # Deliver contiguous bytes to application
        deliverable = bytearray()
        while self.rcv_nxt in self.buffer:
            deliverable.append(self.buffer.pop(self.rcv_nxt))
            self.rcv_nxt += 1
            
        return bytes(deliverable)
    
# Example: Out-of-order arrival
buffer = TCPReassemblyBuffer(initial_receive_seq=1000)
 
# Segments arrive out of order
result1 = buffer.receive_segment(1005, b"World")  # Future: buffered
print(f"After SEQ=1005: delivered {len(result1)} bytes")
# Output: After SEQ=1005: delivered 0 bytes
 
result2 = buffer.receive_segment(1000, b"Hello")  # Expected: fills gap!
print(f"After SEQ=1000: delivered {len(result2)} bytes")
print(f"Delivered: {result2.decode()}")
# Output: After SEQ=1000: delivered 10 bytes
# Output: Delivered: HelloWorld

Key Reassembly Properties:

Position Independence: Segments place bytes at positions determined by SEQ, not arrival order
Gap Tolerance: Missing bytes create gaps in the buffer; receipt of the missing segment fills the gap
Duplicate Handling: Bytes already in buffer or delivered are discarded
Contiguous Delivery: Only contiguous bytes from RCV.NXT are delivered to the application
Buffer Indexing: The buffer is logically indexed by sequence number, enabling O(1) placement

Out-of-Order Buffering Improves Performance

Implications for Flow Control

Byte-oriented numbering profoundly impacts how TCP implements flow control. The receiver's window advertisement specifies exactly how many bytes can be accepted, not how many packets.

Receive Window (rwnd):

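The receiver advertises its receive window in the Window field of every segment it sends, including pure ACKs. The window is the buffer size minus the bytes that have been received in order but not yet read by the application:

Window = RCV.BUFFER_SIZE - (RCV.NXT - next_seq_unread_by_app)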

This directly tells the sender: "You may send bytes with sequence numbers from RCV.NXT up to RCV.NXT + rwnd - 1."
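
As a rough illustration, the window calculation and the resulting range of acceptable sequence numbers can be written as two small functions. The names `advertised_window`, `acceptable_range`, and `next_unread_seq` are assumptions made for this sketch; it also ignores sequence-number wraparound and any out-of-order data held in the buffer.

```python
from typing import Optional


def advertised_window(buffer_size: int, rcv_nxt: int, next_unread_seq: int) -> int:
    """Free buffer space in bytes.

    next_unread_seq is the sequence number of the first byte the
    application has not read yet (hypothetical name for this sketch).
    """
    buffered = rcv_nxt - next_unread_seq   # received in order, not yet read
    return max(0, buffer_size - buffered)


def acceptable_range(rcv_nxt: int, rwnd: int) -> Optional[tuple[int, int]]:
    """First and last sequence numbers the sender may use, or None if rwnd is 0."""
    if rwnd == 0:
        return None
    return rcv_nxt, rcv_nxt + rwnd - 1


rwnd = advertised_window(buffer_size=16384, rcv_nxt=5000, next_unread_seq=3000)
print(rwnd)                          # 14384
print(acceptable_range(5000, rwnd))  # (5000, 19383)
```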

Byte-Oriented Flow Control
Concept	Byte-Oriented Behavior	Why It Matters
Window Size	Exact byte count	Receiver knows precisely how much buffer space to allocate
Window Update	Any byte count	Can open/close by exact bytes as app reads data
Zero Window	No bytes allowed	Sender stops sending data and sends periodic window probes until the window reopens
Window Scaling	Multiplier for bytes	Allows larger windows (up to 1 GB) for high-BDP paths
Silly Window Prevention	Byte-level decisions	Clark's algorithm: don't advertise tiny windows
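
The window scaling row can be made concrete with a little arithmetic. The sketch below assumes the rules of the TCP Window Scale option in RFC 7323: a 16-bit Window field and a shift count of at most 14. The function names are hypothetical.

```python
MAX_WINDOW_FIELD = 0xFFFF   # largest value of the 16-bit Window field
MAX_SHIFT = 14              # largest shift count permitted by RFC 7323


def effective_window(window_field: int, shift: int) -> int:
    """Window in bytes represented by a header field and a negotiated shift."""
    if not 0 <= window_field <= MAX_WINDOW_FIELD:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift


def min_shift_for(window_bytes: int) -> int:
    """Smallest shift count whose maximum scaled window covers window_bytes."""
    for shift in range(MAX_SHIFT + 1):
        if (MAX_WINDOW_FIELD << shift) >= window_bytes:
            return shift
    raise ValueError("window too large even with the maximum shift")


print(effective_window(65535, 0))    # 65535 (no scaling)
print(effective_window(65535, 14))   # 1073725440 (just under 1 GiB)

# Bandwidth-delay product of a 1 Gbit/s path with a 100 ms round-trip time
bdp_bytes = 1_000_000_000 // 8 // 10
print(bdp_bytes, min_shift_for(bdp_bytes))   # 12500000 8
```

One consequence of scaling is coarser granularity: with a shift of `s`, the window can only be expressed in multiples of `2**s` bytes.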

Precise Control:

Because the window is counted in bytes, it tracks free buffer space exactly:

Receiver with 8192 bytes free advertises exactly 8192
Application reads 4096 bytes → window opens to 12288
Sender then sends 8192 bytes → window drops to 4096

This fine-grained control prevents receiver buffer overflow while maximizing throughput.

flow_control_example.txt
Byte-Oriented Flow Control Example:
═══════════════════════════════════════════════════════════════════
 
Initial state:
  Receiver buffer: 65535 bytes (just under 64 KB, the largest unscaled window)
  RCV.NXT = 10000
  Advertised window: 65535 bytes
  
Sender may send: bytes 10000 through 75534 (65535 bytes)
 
─────────────────────────────────────────────────────────────────
Event: Sender transmits 20000 bytes (SEQ 10000-29999)
─────────────────────────────────────────────────────────────────
  Receiver state:
    RCV.NXT = 30000 (after receiving all 20000)
    Buffer used: 20000 bytes
    Advertised window: 65535 - 20000 = 45535 bytes
    
  Sender may now send: bytes 30000 through 75534 (45535 bytes)
 
─────────────────────────────────────────────────────────────────
Event: Application reads 10000 bytes from receive buffer
─────────────────────────────────────────────────────────────────
  Receiver state:
    RCV.NXT = 30000 (unchanged - no new data)
    Buffer used: 10000 bytes (app consumed 10000)
    Advertised window: 65535 - 10000 = 55535 bytes
    
  Sender may now send: bytes 30000 through 85534 (55535 bytes)
  
  ↑ Window "opened" by exactly the bytes the application read
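
The same trace can be reproduced with a toy receiver model. The `ReceiverWindow` class is a hypothetical sketch: it assumes all data arrives in order, treats 65535 bytes as the buffer capacity as in the example above, and ignores sequence-number wraparound.

```python
class ReceiverWindow:
    """Toy model of a receiver's advertised window (in-order data only)."""

    def __init__(self, rcv_nxt: int, capacity: int = 65535) -> None:
        self.rcv_nxt = rcv_nxt    # next sequence number expected
        self.capacity = capacity  # receive buffer size in bytes
        self.buffered = 0         # bytes received but not yet read by the app

    @property
    def window(self) -> int:
        """Bytes the receiver can still accept."""
        return self.capacity - self.buffered

    def receive(self, nbytes: int) -> None:
        """Accept nbytes of in-order data from the sender."""
        if nbytes > self.window:
            raise ValueError("sender exceeded the advertised window")
        self.rcv_nxt += nbytes
        self.buffered += nbytes

    def app_read(self, nbytes: int) -> int:
        """Application consumes up to nbytes; returns the number actually read."""
        nbytes = min(nbytes, self.buffered)
        self.buffered -= nbytes
        return nbytes

    def sendable(self) -> tuple[int, int]:
        """First and last sequence numbers the sender may use (window > 0)."""
        return self.rcv_nxt, self.rcv_nxt + self.window - 1


rx = ReceiverWindow(rcv_nxt=10000)
print(rx.window, rx.sendable())   # 65535 (10000, 75534)

rx.receive(20000)                 # SEQ 10000-29999 arrives
print(rx.window, rx.sendable())   # 45535 (30000, 75534)

rx.app_read(10000)                # application drains 10000 bytes
print(rx.window, rx.sendable())   # 55535 (30000, 85534)
```

Note that `rcv_nxt` moves only when data arrives, while the right edge of the window moves only when the application reads.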

Window Scaling for Modern Networks

Contrast with Message-Oriented Protocols

To fully appreciate TCP's byte-oriented design, let's contrast it with message-oriented protocols like UDP and SCTP.

UDP (User Datagram Protocol):

Byte-Oriented vs Message-Oriented
Aspect	TCP (Byte-Oriented)	UDP (Message-Oriented)
Data unit	Continuous byte stream	Discrete datagrams
Boundaries	Not preserved	Preserved exactly
Ordering	Guaranteed in-order delivery	No ordering guarantee
Partial delivery	Possible (any byte count)	All-or-nothing per datagram
Sequence tracking	Per-byte sequence numbers	None (or application-layer)
Reassembly	Automatic by TCP	Application responsibility
Best for	Continuous data streams	Independent messages/requests
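
Because TCP does not preserve boundaries, an application that needs discrete messages has to add its own framing. The sketch below shows one common approach, a length prefix, under assumed choices: the 4-byte big-endian header and the `frame` and `Deframer` names are illustrative, not a standard. The receiver is fed arbitrary 5-byte chunks to mimic reads that do not line up with the sender's writes.

```python
import struct

HEADER = struct.Struct("!I")   # assumed format: 4-byte big-endian length prefix


def frame(message: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Rebuilds whole messages from byte-stream chunks of any size."""

    def __init__(self) -> None:
        self._pending = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        """Add received bytes; return every message completed so far."""
        self._pending += chunk
        messages = []
        while len(self._pending) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._pending)
            end = HEADER.size + length
            if len(self._pending) < end:
                break                      # body not fully received yet
            messages.append(bytes(self._pending[HEADER.size:end]))
            del self._pending[:end]
        return messages


stream = frame(b"GET /a") + frame(b"GET /b") + frame(b"")

deframer = Deframer()
received = []
for i in range(0, len(stream), 5):         # chunks ignore message boundaries
    received += deframer.feed(stream[i:i + 5])

print(received)   # [b'GET /a', b'GET /b', b'']
```

With UDP none of this is needed, since each datagram is delivered whole or not at all.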

SCTP (Stream Control Transmission Protocol):

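SCTP is message-oriented like UDP but reliable like TCP. An SCTP "stream" is an ordered sequence of messages within an association, not a byte stream:

Ordered delivery: Within each stream, messages are delivered in the order they were sent, with their boundaries intact
Unordered delivery: A message can be marked for delivery as soon as it arrives, regardless of order

SCTP uses Transmission Sequence Numbers (TSNs) that number DATA chunks (each carrying a message or a fragment of one), not bytes. This provides message-boundary preservation with reliability guarantees.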

When Each Model Fits:

Byte-oriented (TCP): File transfers, HTTP, database connections—where data is naturally a stream
Message-oriented (UDP/SCTP): DNS queries, VoIP packets, game state updates—where each message is independent

Choosing the Right Abstraction

Summary: The Power of Byte-Oriented Design

TCP's byte-oriented design is a fundamental architectural choice that shapes every aspect of the protocol. Let's consolidate what we've learned:

Key Takeaways

•Byte-stream abstraction — TCP presents a continuous stream of bytes to applications, hiding all network complexity
•Per-byte sequence numbers — Every byte has a unique sequence number, enabling precise tracking and acknowledgment
•No message boundaries — Applications must implement their own framing; TCP doesn't preserve write() boundaries
•Segmentation independence — The same byte stream can be segmented differently for initial transmission and retransmission
•Position-based reassembly — Receivers place bytes at positions determined by SEQ, not arrival order
•Precise flow control — Windows advertise exact byte counts, enabling fine-grained buffer management
•Flexibility and adaptability — Byte-orientation allows TCP to adapt to varying MTUs, network conditions, and application patterns
