Imagine sending a 10-volume encyclopedia through a postal system that might lose packages, deliver them out of order, or duplicate them. How would you ensure the recipient can reconstruct the complete, correctly ordered text? You'd need a system to identify each piece, detect gaps, and request missing portions.
TCP sequence numbers solve exactly this problem for digital data. They are the fundamental mechanism that transforms the unreliable, best-effort service of IP into the reliable, ordered byte stream that applications depend upon. Every byte transmitted over a TCP connection is assigned a sequence number, creating an unambiguous identity for each piece of data.
This page explores the conceptual foundations of TCP sequence numbers—why they exist, how they work, and why their design is critical to TCP's reliability guarantees.
By the end of this page, you will understand: why sequence numbers are necessary for reliable transport, how TCP uses them to identify every byte of data, the mathematical space in which sequence numbers operate, how receivers use sequence numbers to detect loss and reordering, and the fundamental role sequence numbers play in all of TCP's reliability mechanisms.
To understand why sequence numbers are essential, we must first understand what TCP is built upon. The Internet Protocol (IP) provides an unreliable, connectionless datagram service: packets may be lost, duplicated, or delivered out of order, and IP itself does nothing to repair these faults.
TCP's job is to provide reliable, ordered byte stream delivery on top of this unreliable substrate. Sequence numbers are the foundation that makes this possible.
TCP presents a byte stream abstraction to applications—data flows in as a continuous stream of bytes, not as discrete packets. Sequence numbers identify positions within this byte stream, much like byte offsets in a file. This abstraction is preserved across packet boundaries, reordering, and retransmissions.
A TCP sequence number is a 32-bit unsigned integer that identifies the first byte of data in a segment. Let's examine the mechanics in detail.
Mathematical Space: With 32 bits, sequence numbers range from 0 to 4,294,967,295 (2³² - 1). This creates a circular sequence space—after reaching the maximum value, sequence numbers wrap around to 0. This wrapping behavior, known as sequence number wraparound, has important implications for TCP's operation.
| Property | Value | Significance |
|---|---|---|
| Bit Width | 32 bits | Defines the range of possible sequence numbers |
| Minimum Value | 0 | Lowest possible sequence number |
| Maximum Value | 4,294,967,295 | Highest value before wraparound |
| Total Space | 4 GB (2³² bytes) | Maximum data identifiable without repetition |
| Arithmetic | Modulo 2³² | All comparisons use modular arithmetic |
| Wraparound | Approximately every 4 GB | At 1 Gbps, occurs roughly every 34 seconds |
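The wraparound time in the last row follows from dividing the size of the sequence space, in bits, by the data rate. A minimal sketch of that arithmetic, assuming the sender keeps the link saturated with payload bytes and ignoring header overhead; the function name `wrap_seconds` is illustrative:

```c
/* Seconds needed to consume the whole 32-bit sequence space (2^32 bytes)
 * at a given payload rate in bits per second. Illustrative helper. */
static double wrap_seconds(double bits_per_second)
{
    const double space_bits = 4294967296.0 * 8.0; /* 2^32 bytes, in bits */
    return space_bits / bits_per_second;
}
```

Called with `1e9` (1 Gbps), this gives roughly 34.4 seconds; at 10 Gbps the same space is consumed in roughly 3.4 seconds.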
Segment-to-Sequence Mapping:
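Each TCP segment's sequence number field contains the sequence number of the first byte of data in that segment. For a segment with sequence number SEQ carrying N bytes of data, the first byte is numbered SEQ, the last byte is numbered SEQ + N - 1, and the next contiguous segment starts at SEQ + N.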
Example: If a segment has sequence number 1000 and carries 500 bytes, the bytes are numbered 1000, 1001, 1002, ..., 1499. The next contiguous segment would start with sequence number 1500.
```
Segment 1: SEQ=1000, LEN=500 → Bytes 1000-1499
Segment 2: SEQ=1500, LEN=800 → Bytes 1500-2299
Segment 3: SEQ=2300, LEN=200 → Bytes 2300-2499

Data Stream Position Mapping:
┌──────────────────────────────────────────────────────────────────┐
│ Position:   0    1    2    ...  499  500  501  ...  1299 1300    │
│ SEQ Number: 1000 1001 1002 ...  1499 1500 1501 ...  2299 2300    │
│ Segment:    ←──── Segment 1 ────→    ←─── Segment 2 ───→ ←Seg 3→ │
└──────────────────────────────────────────────────────────────────┘
```

SYN and FIN flags each consume one sequence number, even though they carry no data. This is crucial: a SYN segment with SEQ=100 means the first data byte will have SEQ=101. Similarly, a FIN with SEQ=5000 occupies that sequence number, so any subsequent data would start at 5001. This design ensures these critical control events are reliably delivered and acknowledged.
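The mapping rules, including the extra sequence number taken by SYN and FIN, can be sketched in a few lines. The `struct segment` type and both helper names are hypothetical and keep only the fields needed for the arithmetic:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a segment: only the fields needed here. */
struct segment {
    uint32_t seq;      /* sequence number of the first byte (or of the SYN) */
    uint32_t data_len; /* payload bytes carried */
    bool syn;
    bool fin;
};

/* Sequence numbers consumed: payload bytes plus one each for SYN and FIN. */
static uint32_t seq_consumed(const struct segment *s)
{
    return s->data_len + (s->syn ? 1u : 0u) + (s->fin ? 1u : 0u);
}

/* Sequence number that follows this segment.
 * Unsigned arithmetic wraps modulo 2^32, matching the sequence space. */
static uint32_t next_seq(const struct segment *s)
{
    return s->seq + seq_consumed(s);
}
```

For a data segment with SEQ=1000 and 500 bytes, `next_seq` returns 1500; for a bare SYN with SEQ=100 it returns 101.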
Because sequence numbers wrap around, comparing them requires modular arithmetic. A naive comparison (is 4,294,967,290 < 10?) would give the wrong answer for wrapped sequences.
TCP defines sequence number S1 as "less than" S2 if:
(S1 - S2) is negative when interpreted as a signed 32-bit integer
This means sequence numbers within 2³¹ (about 2 billion) of each other can be correctly compared across wraparound boundaries.
```c
/* TCP sequence number comparison functions */
#include <stdbool.h>
#include <stdint.h>

/* Returns true if s1 < s2 in modular sequence space */
static inline bool before(uint32_t s1, uint32_t s2) {
    return (int32_t)(s1 - s2) < 0;
}

/* Returns true if s1 > s2 in modular sequence space */
static inline bool after(uint32_t s1, uint32_t s2) {
    return (int32_t)(s1 - s2) > 0;
}

/* Example: Is 4294967290 before 10 after wraparound? */
int main(void) {
    uint32_t s1 = 4294967290u; /* Near max value */
    uint32_t s2 = 10;          /* After wraparound */

    /* Calculate: (4294967290 - 10) = 4294967280 */
    /* As signed int32: 4294967280 = -16 (negative!) */
    /* Therefore: 4294967290 is BEFORE 10 ✓ */
    bool result = before(s1, s2); /* true */
    return result ? 0 : 1;
}
```

Visual Representation of Circular Sequence Space:
Imagine sequence numbers arranged in a circle, like numbers on a clock but with 2³² positions. Any two sequence numbers divide the circle into two arcs. We consider S1 "before" S2 if the shorter path from S1 to S2 goes clockwise (in the direction of increasing sequence numbers).
This comparison works correctly as long as the two sequence numbers are within 2³¹ of each other—about 2 billion sequence numbers apart. For typical network speeds and TCP timer settings, this constraint is easily satisfied.
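The 2³¹ limit can be checked directly. The sketch below repeats the signed-difference test under an illustrative name, `seq_before`, and adds a hypothetical `seq_distance` helper that measures the clockwise distance between two sequence numbers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed-difference comparison, as described in the text. */
static bool seq_before(uint32_t s1, uint32_t s2)
{
    return (int32_t)(s1 - s2) < 0;
}

/* Forward (clockwise) distance from s1 to s2, modulo 2^32. */
static uint32_t seq_distance(uint32_t s1, uint32_t s2)
{
    return s2 - s1;
}
```

`seq_before(4294967290u, 10u)` is true, and the forward distance is only 16. `seq_before(0u, 0x7FFFFFFFu)` is also true, but `seq_before(0u, 0x80000001u)` is false: the forward distance now exceeds 2³¹, so the shorter arc runs the other way and the comparison no longer reflects sending order.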
TCP can only correctly compare sequence numbers that are within 2³¹ of each other. This is why high-bandwidth connections use TCP timestamps (RFC 7323) to extend sequence space protection, preventing old segments from being mistaken for new ones after wraparound.
Let's trace how sequence numbers enable reliable delivery through a concrete example. Consider a client sending 3000 bytes to a server, assuming the first data byte carries sequence number 1000 and the maximum segment size is 1000 bytes, so the data travels as three segments with SEQ=1000, SEQ=2000, and SEQ=3000.
Key Observations:
- Each segment's position is unambiguous: SEQ=2000 always refers to the same byte position, regardless of when it arrives
- Reordering is transparent: Even if segment 2 (SEQ=2000) arrives before segment 1 (SEQ=1000), the server can buffer segment 2 and deliver data in order
- Gaps indicate loss: If SEQ=2000 arrives but SEQ=1000 never does, the server detects a gap and can report it via acknowledgments
- Progress is measurable: The acknowledgment number tells the sender exactly how much data has been received contiguously
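These observations can be made concrete by computing the cumulative acknowledgment number from whichever segments have arrived. This is a minimal sketch, not how a real stack stores its reassembly queue; `struct range` and `cumulative_ack` are hypothetical names, and the segments are assumed to span far less than 2³² bytes in total:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one received segment. */
struct range { uint32_t seq; uint32_t len; };

/* Cumulative ACK: the first sequence number not yet received contiguously,
 * starting from rcv_nxt and given the segments that have arrived so far. */
static uint32_t cumulative_ack(uint32_t rcv_nxt, const struct range *segs, size_t n)
{
    int progressed = 1;
    while (progressed) {
        progressed = 0;
        for (size_t i = 0; i < n; i++) {
            /* Offset of rcv_nxt inside this segment, modulo 2^32. */
            uint32_t offset = rcv_nxt - segs[i].seq;
            if (offset < segs[i].len) {      /* segment covers rcv_nxt */
                rcv_nxt = segs[i].seq + segs[i].len;
                progressed = 1;
            }
        }
    }
    return rcv_nxt;
}
```

With only SEQ=2000 and SEQ=3000 received (1000 bytes each) and `rcv_nxt` at 1000, the function returns 1000, which exposes the gap. Once SEQ=1000 is added, it returns 4000.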
The sender and receiver maintain different views of sequence space. Understanding these perspectives is crucial for grasping TCP's reliability mechanisms.
Sender's View:
The sender tracks several key sequence numbers:
| Variable | Description | Updates When |
|---|---|---|
| SND.UNA | Oldest unacknowledged byte | ACK received from receiver |
| SND.NXT | Next byte to be sent | Data transmitted |
| SND.WND | Receiver's advertised window | Window update received |
| ISS | Initial Send Sequence number | Connection established |
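The "Updates When" column can be read as a small state machine. The sketch below is a simplified illustration with hypothetical names; in particular, it updates the window on every accepted ACK, whereas real implementations apply additional rules before trusting a window update:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified sender state; field names mirror the table, in lowercase. */
struct snd_state {
    uint32_t iss;     /* initial send sequence number */
    uint32_t snd_una; /* oldest unacknowledged byte */
    uint32_t snd_nxt; /* next byte to be sent */
    uint32_t snd_wnd; /* receiver's advertised window */
};

/* Bytes the sender may still transmit: window minus bytes already in flight. */
static uint32_t usable_window(const struct snd_state *s)
{
    uint32_t in_flight = s->snd_nxt - s->snd_una; /* modulo 2^32 */
    return in_flight < s->snd_wnd ? s->snd_wnd - in_flight : 0;
}

/* Data transmitted: SND.NXT advances by the number of bytes sent. */
static void on_send(struct snd_state *s, uint32_t len)
{
    s->snd_nxt += len;
}

/* ACK received: accept it only if SND.UNA < ack <= SND.NXT (modular compare). */
static bool on_ack(struct snd_state *s, uint32_t ack, uint32_t wnd)
{
    bool advances = (int32_t)(ack - s->snd_una) > 0;
    bool not_beyond = (int32_t)(ack - s->snd_nxt) <= 0;
    if (!advances || !not_beyond)
        return false; /* duplicate ACK, or ACK for data never sent */
    s->snd_una = ack;
    s->snd_wnd = wnd;
    return true;
}
```

With SND.UNA = 1500, SND.NXT = 2800, and SND.WND = 4000, `usable_window` returns 2700: the sender may transmit up to, but not including, byte 5500.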
The sender's data falls into four categories:
```
Sender's Sequence Space View:
═══════════════════════════════════════════════════════════════════
│ Acknowledged │ Sent/Unacked │ Allowed to Send │   Not Allowed   │
│ (discarded)  │ (in flight)  │ (can transmit)  │     (wait)      │
═══════════════════════════════════════════════════════════════════
               ↑              ↑                 ↑
            SND.UNA        SND.NXT      SND.UNA + SND.WND

Example State:
  ISS     = 1000  (Initial Sequence Number)
  SND.UNA = 1500  (last ACK received was for 1500)
  SND.NXT = 2800  (next byte to send is 2800)
  SND.WND = 4000  (receiver window is 4000 bytes)

  → Bytes 1000-1499: Acknowledged, freed from buffer
  → Bytes 1500-2799: In flight, awaiting acknowledgment
  → Bytes 2800-5499: Can be sent immediately
  → Bytes 5500+:     Must wait for window to advance
```

Receiver's View:
The receiver tracks incoming sequence numbers to determine what has been received and what to expect next:
| Variable | Description | Updates When |
|---|---|---|
| RCV.NXT | Next expected sequence number | In-order data received |
| RCV.WND | Receive window size | Application reads data |
| IRS | Initial Receive Sequence number | Connection established |
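The receiver categorizes each incoming segment by comparing its sequence number with RCV.NXT and the receive window: data it has already received (a duplicate), the data it expects next, data beyond RCV.NXT but inside the window (out of order), or data outside the window.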
Modern TCP implementations buffer out-of-order segments rather than discarding them. This optimization dramatically improves performance when packets are reordered in the network. When the missing segment arrives, all buffered segments can be delivered immediately. Without this buffering, every out-of-order segment would need retransmission.
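A receiver's first decision about a segment can be sketched as a comparison against RCV.NXT and RCV.WND. The four category names below are illustrative labels rather than terms from the TCP specification, and the sketch assumes `seg_len > 0` and a non-zero window:

```c
#include <stdint.h>

enum seg_class {
    SEG_DUPLICATE,      /* every byte is below RCV.NXT */
    SEG_EXPECTED,       /* covers RCV.NXT, deliverable now */
    SEG_OUT_OF_ORDER,   /* ahead of RCV.NXT but inside the window: buffer it */
    SEG_OUTSIDE_WINDOW  /* too far ahead to accept */
};

/* Classify a data segment relative to RCV.NXT and the receive window. */
static enum seg_class classify(uint32_t rcv_nxt, uint32_t rcv_wnd,
                               uint32_t seg_seq, uint32_t seg_len)
{
    /* Signed distances from RCV.NXT, valid within 2^31 of it. */
    int32_t start = (int32_t)(seg_seq - rcv_nxt);
    int32_t end = (int32_t)(seg_seq + seg_len - rcv_nxt); /* one past last byte */

    if (end <= 0)
        return SEG_DUPLICATE;
    if (start <= 0)
        return SEG_EXPECTED; /* may carry some old bytes in front */
    if ((uint32_t)start < rcv_wnd)
        return SEG_OUT_OF_ORDER;
    return SEG_OUTSIDE_WINDOW;
}
```

With RCV.NXT = 1000 and an 8000-byte window, a 1000-byte segment at SEQ=2000 is classified as out of order and would be buffered, while the same segment at SEQ=1000 is the expected one.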
Sequence numbers are the foundation upon which TCP builds its reliability guarantees. Let's examine how they enable each guarantee:
Ordered Delivery:
Applications receive data in the exact order it was sent. The receiver uses sequence numbers to maintain ordering:
```
Initial state: RCV.NXT = 1000

Arrival order: SEQ=2000 → SEQ=3000 → SEQ=1000 → SEQ=4000

Processing:
  SEQ=2000 arrives: Future data (gap at 1000). Buffer it.
  SEQ=3000 arrives: Future data (still gap at 1000). Buffer it.
  SEQ=1000 arrives: Expected! Deliver 1000-1999, then 2000-2999, then 3000-3999
  SEQ=4000 arrives: Expected (RCV.NXT now 4000). Deliver immediately.

Application sees: Bytes 1000, 1001, 1002, ... (perfect order)
```

No Duplicates:
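The same data is never delivered twice. When a segment arrives, the receiver compares its sequence numbers with RCV.NXT: bytes numbered below RCV.NXT have already been delivered, so they are discarded rather than passed to the application again.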
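One way to express duplicate suppression is to count how many leading bytes of a segment fall below RCV.NXT and skip them. A minimal sketch, with the hypothetical helper name `already_received`:

```c
#include <stdint.h>

/* Number of leading bytes in a segment that the receiver already has,
 * i.e. bytes numbered below RCV.NXT. Returns seg_len if the whole
 * segment is old, and 0 if it starts at or after RCV.NXT. */
static uint32_t already_received(uint32_t rcv_nxt, uint32_t seg_seq, uint32_t seg_len)
{
    int32_t behind = (int32_t)(rcv_nxt - seg_seq); /* > 0 if segment starts in the past */
    if (behind <= 0)
        return 0;
    return (uint32_t)behind < seg_len ? (uint32_t)behind : seg_len;
}
```

With RCV.NXT = 4000, a retransmitted 1000-byte segment at SEQ=1000 is entirely old (1000 bytes skipped), while a segment at SEQ=3500 has 500 old bytes followed by 500 new ones.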
No Missing Data:
The acknowledgment mechanism ensures no data is lost. The receiver's ACK tells the sender how much contiguous data has been received. Any gaps indicate loss and trigger retransmission through timeout or fast retransmit.
Sequence numbers + Acknowledgments + Retransmission Timers = Reliable Delivery. The receiver uses sequence numbers to detect problems; acknowledgments communicate this detection back to the sender; timers ensure the sender eventually retransmits unacknowledged data. Together, they guarantee that every byte eventually arrives, in order, exactly once.
Understanding sequence numbers has practical implications for network engineers, developers, and security professionals:
Performance Analysis:
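When diagnosing TCP performance issues, the sequence numbers in a packet capture show where retransmissions, gaps, and out-of-order arrivals occurred.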
Security Considerations:
Sequence numbers were originally predictable, which let attackers guess valid values and hijack connections or inject data.
Modern TCP implementations use randomized Initial Sequence Numbers (ISN) to mitigate these attacks. RFC 6528 specifies secure ISN generation based on connection identifiers and a secret key.
Never implement TCP with predictable sequence numbers. Attackers can exploit predictability to hijack connections or inject malicious data. Always use cryptographically strong randomization for ISN selection. We explore ISN generation in depth in a later section of this module.
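The structure of the RFC 6528 scheme is ISN = M + F(local IP, local port, remote IP, remote port, secret key), where M is a timer and F is a keyed pseudorandom function. The sketch below shows only that structure. As a loud assumption, `placeholder_prf` is a non-cryptographic FNV-1a hash used purely so the example is self-contained; it must be replaced by a real keyed cryptographic function in practice, and every name here is hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* PLACEHOLDER mixing function (FNV-1a). NOT cryptographically secure.
 * It stands in for the keyed pseudorandom function F for illustration only. */
static uint32_t placeholder_prf(const uint8_t *buf, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 16777619u;
    }
    return h;
}

/* ISN = M + F(connection identifiers, secret key).
 * timer_ticks plays the role of M (RFC 6528 describes a 4-microsecond timer). */
static uint32_t isn_sketch(uint32_t timer_ticks,
                           uint32_t local_ip, uint16_t local_port,
                           uint32_t remote_ip, uint16_t remote_port,
                           const uint8_t key[16])
{
    uint8_t buf[28]; /* 4 + 2 + 4 + 2 + 16 bytes */
    size_t n = 0;
    for (int i = 0; i < 4; i++) buf[n++] = (uint8_t)(local_ip >> (8 * i));
    for (int i = 0; i < 2; i++) buf[n++] = (uint8_t)(local_port >> (8 * i));
    for (int i = 0; i < 4; i++) buf[n++] = (uint8_t)(remote_ip >> (8 * i));
    for (int i = 0; i < 2; i++) buf[n++] = (uint8_t)(remote_port >> (8 * i));
    for (int i = 0; i < 16; i++) buf[n++] = key[i];
    return timer_ticks + placeholder_prf(buf, n); /* wraps modulo 2^32 */
}
```

For a fixed connection and key, the result still advances with the timer, which keeps sequence numbers monotonic across reuses of the same connection identifiers. Without the key, the offset for a different connection is meant to be unpredictable.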
We've explored the conceptual foundations of TCP sequence numbers—the mechanism that enables reliable, ordered delivery over an unreliable network. Let's consolidate the key insights:
What's Next:
Now that we understand the concept of sequence numbers, we'll examine TCP's byte-oriented nature in detail. Unlike protocols that number packets or messages, TCP numbers individual bytes within the data stream. This design decision has profound implications for how TCP handles varying segment sizes, partial transmissions, and the byte-stream abstraction presented to applications.
You now understand the fundamental concept of TCP sequence numbers and their critical role in reliable data delivery. Next, we examine why TCP numbers bytes rather than packets, and how this byte-oriented design shapes the protocol's behavior.