Protocols And Standards - Learning Module

Protocol Elements

The Building Blocks of All Protocols

Every network protocol—from the simplest acknowledgment scheme to the most sophisticated secure transport layer—is composed of a small set of fundamental building blocks. These elements recur across layers, technologies, and decades of protocol evolution.

Understanding these elements provides a powerful mental framework: when you encounter any new protocol, you can immediately recognize its components and understand how they work together. Instead of memorizing each protocol as an isolated entity, you see them as assemblies of familiar parts arranged in different configurations.

What You Will Learn

By the end of this page, you will master the core elements of protocols: addressing, framing, error detection and correction, sequencing, acknowledgments, flow control, congestion control, and multiplexing. These concepts apply universally—from Ethernet frames to TCP segments to HTTP/3 streams.

Addressing

Addressing is the fundamental mechanism for identifying the source and destination of communication. Without addresses, messages would have no way to reach their intended recipients in a network of connected devices.

The addressing problem:

In a network with N nodes, how does Node A deliver data specifically to Node C, not to B, D, or E? The answer is to assign each node a unique identifier—an address—and include the destination address in every message.

Types of addresses in networking:

Address Types Across Network Layers
Layer	Address Type	Format	Scope	Example
Physical/Data Link	MAC Address	48 bits (6 octets)	Local network segment	`00:1A:2B:3C:4D:5E`
Network	IP Address	32 bits (IPv4) or 128 bits (IPv6)	Global internet	`192.168.1.1` or `2001:db8::1`
Transport	Port Number	16 bits	Within a single host	`443` (HTTPS), `22` (SSH)
Application	Domain Name / URI	Variable string	Human-readable global	`www.example.com/path`

Address properties:

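Uniqueness: Addresses must be unique within their scope. MAC addresses are intended to be globally unique (manufacturers assign them from blocks allocated by the IEEE), although software can override them. IP addresses must be unique within the network where they are used (assigned by administrators or DHCP).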
Hierarchical vs. Flat: IP addresses are hierarchical (network portion + host portion), enabling efficient routing by aggregating routes. MAC addresses are flat—no inherent structure helps route them.
Static vs. Dynamic: Some addresses are permanently assigned (burned-in MAC addresses), while others are dynamically assigned (DHCP-assigned IP addresses).

Addressing modes:

Unicast: One source to one destination (most common)
Broadcast: One source to all destinations on a network (255.255.255.255 in IPv4)
Multicast: One source to a group of interested destinations (224.0.0.0/4 in IPv4)
Anycast: One source to the nearest destination from a group (same address on multiple hosts)
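
The addressing modes above can be told apart, in part, from the address value alone. The following is a minimal sketch using Python's standard `ipaddress` module; the function name `addressing_mode` and the sample addresses are illustrative assumptions, not part of any protocol.

```python
import ipaddress

# The IPv4 limited-broadcast address mentioned in the list above.
LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")

def addressing_mode(addr):
    """Classify an IP address by the delivery mode its value implies."""
    ip = ipaddress.ip_address(addr)
    if ip == LIMITED_BROADCAST:
        return "broadcast"
    if ip.is_multicast:  # 224.0.0.0/4 for IPv4, ff00::/8 for IPv6
        return "multicast"
    # Unicast and anycast addresses look identical: anycast is a routing
    # arrangement, not something visible in the address itself.
    return "unicast (or anycast)"

for a in ["192.168.1.1", "255.255.255.255", "224.0.0.251", "2001:db8::1", "ff02::1"]:
    print(a, "->", addressing_mode(a))
```

Note that the sketch cannot distinguish anycast from unicast, which reflects how anycast works: the same address is simply announced from several hosts.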

Address Resolution

Networks often need to translate between address types. ARP (Address Resolution Protocol) translates IP addresses to MAC addresses. DNS (Domain Name System) translates domain names to IP addresses. These translation protocols are themselves built from the same fundamental elements we're studying.

Framing

Framing is the process of dividing a continuous stream of bits into discrete frames or packets—self-contained units of data with clear boundaries.

The framing problem:

Physical transmission media transmit continuous streams of bits. How does a receiver know where one message ends and another begins? How does it distinguish data bits from control information?

Framing techniques:

Common Framing Methods

•Character/Byte Count: A length field at the start specifies how many bytes follow. Simple but vulnerable—if the count is corrupted, synchronization is lost until a timeout.
•Flag Bytes with Byte Stuffing: Special flag bytes mark frame boundaries. If the flag pattern appears in data, it's escaped ('stuffed') with an escape byte. PPP uses this with flag byte 0x7E.
•Flag Bits with Bit Stuffing: Special bit patterns mark boundaries. HDLC uses 01111110. To prevent this pattern in data, the sender inserts a 0 after five consecutive 1s; the receiver removes it.
•Physical Layer Violations: Use encoding violations as delimiters. Manchester encoding can signal boundaries using invalid transitions that can't occur in normal data.
•Fixed-Length Frames: Each frame has exactly N bytes. Simple but inflexible—wastes space for small messages, can't send larger messages.
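
Flag bytes with byte stuffing can be sketched in a few lines. The flag value 0x7E comes from the list above; the escape byte 0x7D and the XOR with 0x20 follow PPP's HDLC-like framing, and the function names are assumptions made for this illustration.

```python
FLAG = 0x7E  # frame delimiter
ESC = 0x7D   # escape byte (PPP's HDLC-like framing)

def byte_stuff(payload):
    """Wrap payload in flag bytes, escaping any flag or escape byte inside it."""
    out = bytearray([FLAG])
    for b in payload:
        if b in (FLAG, ESC):
            out += bytes([ESC, b ^ 0x20])  # escape, then the byte with bit 5 flipped
        else:
            out.append(b)
    out.append(FLAG)
    return bytes(out)

def byte_unstuff(frame):
    """Recover the payload from one complete stuffed frame."""
    if len(frame) < 2 or frame[0] != FLAG or frame[-1] != FLAG:
        raise ValueError("frame must start and end with the flag byte")
    out = bytearray()
    body = iter(frame[1:-1])
    for b in body:
        if b == ESC:
            nxt = next(body, None)
            if nxt is None:
                raise ValueError("dangling escape byte")
            b = nxt ^ 0x20  # undo the bit flip applied by the sender
        out.append(b)
    return bytes(out)

payload = bytes([0x01, 0x7E, 0x7D, 0x02])
frame = byte_stuff(payload)
print(frame.hex(" "))
print(byte_unstuff(frame) == payload)
```

Because every flag or escape byte in the payload is rewritten, the flag value can only appear at a real frame boundary.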

Bit Stuffing Example

Flag Pattern: 01111110 (marks start and end of frame)
 
Original data:     01111111 10100010
                      ↑
                      Five consecutive 1s - must stuff a 0
 
After bit stuffing: 011111011 10100010
                          ↑
                          Inserted 0 to break the pattern
 
Transmitted frame: [01111110] 011111011 10100010 [01111110]
                    ↑ flag                         ↑ flag
 
Receiver removes stuffed bits to recover original data.
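
The same example can be reproduced in code. This is a small sketch that works on strings of '0' and '1' for readability; the helper names `bit_stuff` and `bit_unstuff` are hypothetical, and a real link layer would do this in hardware on the raw bit stream.

```python
FLAG_BITS = "01111110"

def bit_stuff(bits):
    """Insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            out.append("0")  # stuffed bit
            run = 0
    return "".join(out)

def bit_unstuff(bits):
    """Drop the bit that follows every run of five consecutive 1s."""
    out, run, i = [], 0, 0
    while i < len(bits):
        b = bits[i]
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            i += 1  # skip the stuffed 0
            run = 0
        i += 1
    return "".join(out)

data = "0111111110100010"
stuffed = bit_stuff(data)
print("stuffed:    ", stuffed)
print("transmitted:", FLAG_BITS + stuffed + FLAG_BITS)
print("recovered:  ", bit_unstuff(stuffed))
```

After stuffing, the payload can never contain six 1s in a row, so the flag pattern cannot appear inside a frame.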

Frame components:

A typical frame contains several parts:

Preamble/Start delimiter: Marks the beginning and allows clock synchronization
Header: Contains control information—addresses, length, sequence numbers, flags
Payload: The actual user data being transmitted
Trailer: Contains error-detection codes (e.g., CRC) and possibly end delimiter

Why framing matters:

Without proper framing, a receiver cannot interpret the bit stream. It wouldn't know where addresses are, where data starts, or where frames end. Framing is so fundamental that protocols at every layer implement some form of it—Ethernet frames, IP packets, TCP segments, HTTP/2 frames.

Frame Synchronization Loss

When framing synchronization is lost (due to noise, buffer overflow, or bugs), receivers may fail to detect frame boundaries correctly. This can cascade into interpreting headers as data or data as headers. Protocols must include mechanisms to re-synchronize after such failures—often by waiting for timeouts or distinctive patterns.

Error Detection

Physical transmission media are imperfect. Electrical interference, cosmic rays, signal degradation, and component failures can flip bits during transmission. Error detection mechanisms allow receivers to identify when corruption has occurred.

The error detection problem:

A sender transmits 1000 bits. The receiver receives 1000 bits. How does it know if any bits changed during transmission?

Common error detection techniques:

Error Detection Methods Comparison
Method	How It Works	Detection Capability	Typical Usage
Parity Bit	Add 1 bit so total 1s are even (even parity) or odd (odd parity)	Detects single-bit errors; misses even-count errors	Memory, serial communication
2D Parity	Parity bits for rows AND columns	Detects all 1, 2, 3-bit errors; can correct 1-bit	Memory systems
Checksum	Sum data words, append sum; receiver recalculates to verify	Detects many errors but can miss compensating errors (for example, two words swapped)	IPv4 header, TCP and UDP segments
CRC (Cyclic Redundancy Check)	Treat data as polynomial, divide by generator, append remainder	Detects all burst errors of up to n bits (n = CRC length) and most longer ones	Ethernet, WiFi, storage
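
To make the first rows of the table concrete, here is a hedged sketch of a single even-parity bit and of a 16-bit one's-complement checksum in the style used by IP, TCP, and UDP. The function names and sample bytes are assumptions for illustration only.

```python
def even_parity_bit(bits):
    """Return the bit that makes the total number of 1s even."""
    return str(bits.count("1") % 2)

def internet_checksum(data):
    """16-bit one's-complement checksum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

msg = b"\x12\x34\x56\x78"
csum = internet_checksum(msg)
print(f"checksum: {csum:#06x}")

# The receiver sums data plus checksum; a result of 0 means no error was detected.
print(internet_checksum(msg + csum.to_bytes(2, "big")))

# Compensating error: swapping two words leaves the sum, and so the checksum, unchanged.
print(internet_checksum(b"\x56\x78\x12\x34") == csum)

# Parity misses an even number of flipped bits: both strings get the same parity bit.
print(even_parity_bit("1011"), even_parity_bit("0111"))
```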

CRC deep dive:

CRC is the gold standard for error detection in networking due to its excellent detection capabilities relative to overhead.

How CRC works:

Treat the data as a giant binary number (polynomial)
Divide by a fixed generator polynomial using XOR-based division
Append the remainder (the CRC value) to the data
Receiver divides received data (including CRC) by the same generator
If remainder is zero, no detected errors; if non-zero, errors occurred

CRC-32 detects:

All single-bit errors
All double-bit errors
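All odd numbers of bit errors, but only if the generator polynomial has (x + 1) as a factor; the IEEE 802.3 polynomial does not, so Ethernet's CRC-32 catches these with very high probability rather than with certainty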
All burst errors up to 32 bits
Most longer burst errors (probability of missing ≈ 2^-32)
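
The single-bit guarantee is easy to check empirically. This sketch uses `zlib.crc32` from the Python standard library, which computes the common CRC-32; the message bytes and the `flip_bit` helper are made up for the demonstration.

```python
import zlib

def flip_bit(data, bit_index):
    """Return a copy of data with one bit inverted (bit 0 is the MSB of byte 0)."""
    out = bytearray(data)
    out[bit_index // 8] ^= 0x80 >> (bit_index % 8)
    return bytes(out)

message = b"protocol elements"
crc = zlib.crc32(message)

# Try every possible single-bit error and collect those that keep the CRC unchanged.
undetected = [i for i in range(len(message) * 8)
              if zlib.crc32(flip_bit(message, i)) == crc]

print(f"CRC-32: {crc:#010x}")
print("single-bit errors missed:", len(undetected))
```

Every single-bit corruption changes the CRC, so the list of missed errors is empty.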

crc_example.py
# Simplified CRC demonstration (conceptual)
# Real implementations use optimized lookup tables
 
def crc_remainder(data_bits, generator):
    """
    Compute CRC by polynomial division
    data_bits: binary string representing data
    generator: binary string representing generator polynomial
    """
    # Append zeros for division (length = generator - 1)
    data = data_bits + '0' * (len(generator) - 1)
    
    data = list(data)
    for i in range(len(data_bits)):
        if data[i] == '1':  # Only divide if MSB is 1
            for j in range(len(generator)):
                # XOR each bit
                data[i + j] = str(int(data[i + j]) ^ int(generator[j]))
    
    # Remainder is the last (len(generator) - 1) bits
    remainder = ''.join(data[-(len(generator) - 1):])
    return remainder
 
# Example: CRC-3 with generator 1011
data = "11010011101100"
generator = "1011"
crc = crc_remainder(data, generator)
print(f"Data: {data}")
print(f"CRC:  {crc}")
print(f"Transmitted: {data}{crc}")
 
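# Verification: dividing the transmitted bits (data + CRC) by the generator
# leaves an all-zero remainder when no errors occurred. crc_remainder appends
# extra zeros first, which does not change divisibility by the generator.
check = crc_remainder(data + crc, generator)
print(f"Check: {check}")  # all zeros means no error was detected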

Detection vs. Correction

Error detection tells you something went wrong but not what. Error correction (covered next) actually repairs the damage. Detection is simpler and cheaper—if errors are rare, it's more efficient to detect and retransmit than to add correction overhead to every message.

Error Correction

Error correction (or Forward Error Correction, FEC) enables receivers to not only detect errors but automatically repair them without retransmission. This is crucial when retransmission is impractical—satellite communications, real-time media streaming, or one-way broadcast.

The Hamming Distance Principle:

Error correction codes work by adding redundancy that increases the "distance" between valid codewords. If valid messages are far apart in Hamming distance, a few bit flips will land in the space between them, and the receiver can determine which valid codeword was intended.

Hamming distance = number of positions where two codewords differ

To detect d errors: need minimum Hamming distance of d + 1
To correct d errors: need minimum Hamming distance of 2d + 1
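
These two rules can be applied mechanically to any small code. The sketch below computes the minimum Hamming distance of a set of codewords and derives how many errors it can detect and correct; the function names and the two sample codes are illustrative assumptions.

```python
from itertools import combinations

def hamming_distance(a, b):
    """Number of positions where two equal-length codewords differ."""
    if len(a) != len(b):
        raise ValueError("codewords must have equal length")
    return sum(x != y for x, y in zip(a, b))

def code_capabilities(codewords):
    """Return (minimum distance, detectable errors, correctable errors)."""
    d_min = min(hamming_distance(a, b) for a, b in combinations(codewords, 2))
    return d_min, d_min - 1, (d_min - 1) // 2

# 3-bit repetition code: distance 3, so it detects 2 errors or corrects 1.
print(code_capabilities(["000", "111"]))

# 2 data bits plus even parity: distance 2, so it detects 1 error and corrects none.
print(code_capabilities(["000", "011", "101", "110"]))
```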

Block Codes

•Hamming Codes — Classic single-error-correcting codes; adding one overall parity bit gives single-error-correcting, double-error-detecting (SECDED) behavior. Used in ECC memory.
•Reed-Solomon Codes — Correct burst errors by operating on multi-bit symbols. Used in CDs, DVDs, QR codes, deep space.
•BCH Codes — Generalization of Hamming for multiple bit error correction.
•LDPC (Low-Density Parity Check) — Near Shannon-limit performance. Used in WiFi 6, 5G, SSD storage.

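Convolutional and Other Modern Codes

•Convolutional Codes — Encode continuous streams (not fixed blocks). Decoded with Viterbi algorithm.
•Turbo Codes — Two convolutional encoders working in parallel, separated by an interleaver. Revolutionary performance, used in 3G/4G.
•Polar Codes — A block code rather than a convolutional one, listed here with the other modern designs. Provably achieves the capacity of symmetric binary-input memoryless channels as the block length grows. Adopted for 5G control channels.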

Hamming (7,4) Code Example:

The Hamming (7,4) code encodes 4 data bits into 7 bits (adding 3 parity bits). It can correct any single-bit error.

Bit positions: 1, 2, 3, 4, 5, 6, 7

Positions 1, 2, 4 are parity bits (powers of 2)
Positions 3, 5, 6, 7 are data bits

Each parity bit covers specific positions:

P1 (position 1): covers positions with bit 0 set in binary (1, 3, 5, 7)
P2 (position 2): covers positions with bit 1 set in binary (2, 3, 6, 7)
P4 (position 4): covers positions with bit 2 set in binary (4, 5, 6, 7)

To find the error, the receiver computes syndrome bits—if a parity check fails, its position bit is 1. The syndrome value gives the position of the flipped bit!
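
The position layout and parity coverage described above translate directly into code. This is a minimal sketch with even parity and bits held in Python lists; the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword laid out as positions 1..7."""
    d3, d5, d6, d7 = data
    p1 = d3 ^ d5 ^ d7  # covers positions 1, 3, 5, 7
    p2 = d3 ^ d6 ^ d7  # covers positions 2, 3, 6, 7
    p4 = d5 ^ d6 ^ d7  # covers positions 4, 5, 6, 7
    return [p1, p2, d3, p4, d5, d6, d7]

def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data bits, syndrome)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 0 = no error, otherwise the 1-based position
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], syndrome

word = hamming74_encode([1, 0, 1, 1])
print("codeword:", word)

corrupted = list(word)
corrupted[4] ^= 1  # flip position 5
print("decoded: ", hamming74_decode(corrupted))
```

The syndrome printed for the corrupted word is 5, the position that was flipped. Two simultaneous errors would produce a misleading syndrome, which is the limitation the extra SECDED parity bit addresses.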

When to Use FEC vs. Retransmission

Use FEC when: latency is critical (real-time voice/video), retransmission is impossible (broadcast), or round-trip time is very long (deep space). Use retransmission when: bandwidth is precious, errors are rare, and latency is acceptable. Many modern systems use hybrid approaches—FEC handles most errors, retransmission handles the rest.

Sequencing and Ordering

In many networks, messages can arrive out of order, be duplicated, or be lost entirely. Sequencing mechanisms assign identifiers to messages that allow receivers to detect and handle these conditions.

The sequencing problem:

A sender transmits packets A, B, C in order. Due to different paths through the network or retransmissions, they might arrive as B, A, C, A, or just A, C. How does the receiver:

Detect that the order is wrong?
Reorder them correctly?
Detect duplicates?
Detect missing packets?

Sequence numbers:

The solution is to assign each packet a sequence number—a monotonically increasing identifier. The receiver tracks the expected next sequence number and can:

Detect reordering: Receiving sequence 5 when expecting 4 means 4 is delayed or lost
Reorder: Buffer out-of-order packets and deliver in sequence order when gaps fill
Detect duplicates: Same sequence number received twice
Detect loss: After timeout, missing sequence numbers indicate lost packets
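
A receiver that does the first three of these jobs can be sketched as a small reorder buffer. The class name `ReorderBuffer` and its interface are assumptions for illustration; the sketch ignores wraparound and loss timers to stay short.

```python
class ReorderBuffer:
    """Deliver payloads in sequence order, buffering gaps and dropping duplicates."""

    def __init__(self, first_seq=0):
        self.expected = first_seq  # next sequence number to deliver
        self.pending = {}          # out-of-order packets waiting for a gap to fill

    def receive(self, seq, payload):
        """Return the payloads that become deliverable, in order."""
        if seq < self.expected or seq in self.pending:
            return []  # duplicate: already delivered or already buffered
        self.pending[seq] = payload
        delivered = []
        while self.expected in self.pending:
            delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        return delivered

buf = ReorderBuffer()
# Packets A, B, C carry sequence numbers 0, 1, 2 and arrive as B, A, C, A.
for seq, payload in [(1, "B"), (0, "A"), (2, "C"), (0, "A")]:
    print(f"got seq={seq}: delivered {buf.receive(seq, payload)}")
```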

Sequence number space:

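Sequence numbers have finite bits and eventually wrap around. A 32-bit sequence number wraps after ~4 billion packets. The sequence number space must be larger than the maximum number of packets in flight at once (at least twice the window size for selective-repeat schemes); otherwise, the receiver cannot tell a new packet from a delayed copy of an old one that carries the same number.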

TCP sequence numbers:

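TCP uses 32-bit sequence numbers that count bytes, not packets, so the space wraps after 2^32 bytes (4 GiB). On a 10 Gbps link that takes only a few seconds, so an old delayed segment could carry a sequence number that is valid again. TCP uses timestamps (PAWS, Protection Against Wrapped Sequences) to disambiguate.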
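
Comparing sequence numbers that wrap requires modular arithmetic rather than a plain less-than. The following sketch shows one common way to do it for a 32-bit space, plus a helper that estimates how quickly the space wraps; both function names are hypothetical.

```python
MOD = 1 << 32  # size of a 32-bit sequence space

def seq_lt(a, b):
    """True if a comes before b, treating the space as circular.

    Only meaningful when the two numbers are less than half the space apart.
    """
    return a != b and (b - a) % MOD < MOD // 2

def seconds_to_wrap(rate_bps):
    """Time to send 2**32 bytes at the given bit rate."""
    return MOD * 8 / rate_bps

print(seq_lt(0xFFFFFFF0, 0x00000010))  # just before the wrap vs. just after it
print(seq_lt(0x00000010, 0xFFFFFFF0))
print(f"wrap time at 10 Gbps: {seconds_to_wrap(10e9):.2f} s")
```

The first comparison is true even though the left value is numerically larger, because the right value lies a short distance ahead once the counter wraps.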

Idempotency with Sequence Numbers

Sequence numbers also enable idempotent processing. If a receiver processes packet SEQ=5 and then receives SEQ=5 again (duplicate), it can safely discard it. This property is essential for building reliable systems on unreliable networks—operations can be safely retried without duplicate side effects.

Acknowledgments and Retransmission

Acknowledgment (ACK) is a message sent by a receiver to confirm that data has been received correctly. Combined with retransmission on timeout or negative acknowledgment (NAK), ACKs enable reliable delivery over unreliable channels.

The acknowledgment problem:

Sender transmits packet. Did it arrive? Without feedback, the sender doesn't know. Did the receiver process it? Did it get corrupted and dropped? The sender needs confirmation.

ACK strategies:

Acknowledgment Approaches

•Positive ACK: Receiver sends ACK when packet received correctly. Sender retransmits if no ACK before timeout.
•Negative ACK (NAK): Receiver sends NAK when it detects a missing or corrupted packet. Sender retransmits immediately. Often combined with positive ACKs.
•Cumulative ACK: ACK number indicates all bytes up to that point received. "ACK 1000" means bytes 0-999 received. Simple but can't acknowledge out-of-order data.
•Selective ACK (SACK): Receiver reports exactly which segments received, including non-contiguous blocks. Allows efficient retransmission of only lost packets.
•Delayed ACK: Receiver waits briefly before ACKing, hoping to piggyback ACK on a data packet. Reduces ACK-only packets but adds latency.
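
The difference between cumulative and selective acknowledgment shows up clearly when a segment is missing. In this sketch the receiver tracks the starting byte offsets of fixed-size segments it has received; the function name, the fixed segment length, and the sample offsets are assumptions, and real TCP limits how many SACK blocks fit in one header.

```python
def ack_state(received_offsets, seg_len=100):
    """Return (cumulative ACK, SACK blocks) for a set of received segment offsets.

    The cumulative ACK is the next byte expected in order. Each SACK block is a
    (left edge, right edge) pair, where the right edge is one past the last byte.
    """
    cum = 0
    while cum in received_offsets:
        cum += seg_len
    blocks = []
    for start in sorted(s for s in received_offsets if s > cum):
        if blocks and blocks[-1][1] == start:
            blocks[-1] = (blocks[-1][0], start + seg_len)  # extend a contiguous block
        else:
            blocks.append((start, start + seg_len))
    return cum, blocks

# Segments at bytes 200 and 500 were lost in transit.
cum, sack = ack_state({0, 100, 300, 400, 600})
print("cumulative ACK:", cum)
print("SACK blocks:   ", sack)
```

With only the cumulative ACK, the sender learns that byte 200 is missing but nothing about what arrived after it. The SACK blocks let it retransmit just the two missing segments.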

Timeout and retransmission:

How long should a sender wait for an ACK before concluding the packet was lost? This is the Retransmission Timeout (RTO) problem.

Too short: Retransmit unnecessarily, wasting bandwidth and potentially causing congestion
Too long: Waste time waiting when the packet was clearly lost

Adaptive RTO:

Modern protocols (like TCP) dynamically estimate RTO based on measured Round-Trip Time (RTT):

Measure RTT for each acknowledged packet
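Maintain smoothed RTT estimate: SRTT = α × SRTT + (1-α) × measured_RTT
Maintain RTT variance estimate: RTTVAR = β × RTTVAR + (1-β) × |SRTT - measured_RTT|, using the SRTT value from before this sample was folded in
Calculate RTO: RTO = SRTT + 4 × RTTVAR
In this form α and β weight the old estimate; typical values are α = 7/8 and β = 3/4, which correspond to the weights of 1/8 and 1/4 that RFC 6298 places on the new sample.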

This adapts to network conditions—faster retransmission on low-latency LANs, longer timeouts for transcontinental connections.
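
The estimator can be sketched as a small class. The constants α = 7/8 and β = 3/4, the first-sample initialization (SRTT = sample, RTTVAR = sample / 2), and the update order follow RFC 6298; the class name and the `min_rto` parameter are assumptions for this illustration. RFC 6298 recommends a lower bound of 1 second, which is left at 0 here so the adaptation is visible.

```python
class RtoEstimator:
    """Adaptive retransmission timeout from RTT samples (times in seconds)."""

    def __init__(self, alpha=7 / 8, beta=3 / 4, min_rto=0.0):
        self.alpha, self.beta, self.min_rto = alpha, beta, min_rto
        self.srtt = None
        self.rttvar = None

    def update(self, rtt):
        """Fold in one RTT sample and return the new RTO."""
        if self.srtt is None:
            self.srtt = rtt  # first sample initializes both estimates
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the SRTT from before this sample.
            self.rttvar = self.beta * self.rttvar + (1 - self.beta) * abs(self.srtt - rtt)
            self.srtt = self.alpha * self.srtt + (1 - self.alpha) * rtt
        return max(self.min_rto, self.srtt + 4 * self.rttvar)

est = RtoEstimator()
for sample in [0.100, 0.100, 0.100, 0.300, 0.100]:
    print(f"rtt={sample:.3f}s -> rto={est.update(sample):.3f}s")
```

Stable samples shrink the variance term and tighten the timeout, while the single 300 ms outlier widens it again.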

The ACK Lost Problem

ACKs can be lost too! If the sender transmits packet SEQ=1, receiver gets it and sends ACK=1, but the ACK is lost, the sender will timeout and retransmit SEQ=1. The receiver must detect the duplicate (using sequence numbers) and discard it while resending ACK=1. This is why sequence numbers and idempotency are essential—the system must handle any message being lost or duplicated.

Flow Control

Flow control ensures that a fast sender doesn't overwhelm a slow receiver. It's a point-to-point mechanism between directly communicating peers.

The flow control problem:

A high-performance server can send data at 10 Gbps. A mobile phone on a congested WiFi connection can only receive at 50 Mbps. If the server blasts data at full speed, the receiver's buffers overflow and data is lost.

Flow control mechanisms:

Stop-and-Wait

•Sender transmits one packet
•Waits for ACK before sending next
•Simple but extremely inefficient
•Channel utilization: transmission_time / (RTT + transmission_time)
•On high-latency links, most time spent waiting

Sliding Window

•Sender can have W packets in flight (without ACKs)
•W (window size) controls how fast sender transmits
•Receiver advertises how much buffer space it has
•Sender adjusts window to match receiver capacity
•High efficiency—keeps channel full
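
A quick calculation shows how much the window size matters. For stop-and-wait, utilization is transmission_time / (RTT + transmission_time); with W packets in flight it becomes W times that, capped at 1. The function name and the link parameters below are made-up example values.

```python
def utilization(window, frame_bits, rate_bps, rtt_s):
    """Fraction of time the sender is transmitting, for a window of `window` frames."""
    t_tx = frame_bits / rate_bps  # time to put one frame on the wire
    return min(1.0, window * t_tx / (rtt_s + t_tx))

FRAME_BITS = 1500 * 8  # 1500-byte frames
RATE = 100e6           # 100 Mbps link
RTT = 0.050            # 50 ms round trip

for w in [1, 10, 100, 418]:
    print(f"window={w:4d}  utilization={utilization(w, FRAME_BITS, RATE, RTT):.4f}")
```

With a window of 1 (stop-and-wait) the link sits idle more than 99% of the time on this path. Roughly 418 frames in flight are needed to keep it full.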

TCP's sliding window in depth:

TCP implements flow control through the receive window (rwnd):

Receiver maintains a buffer to hold incoming data before the application reads it
Each ACK includes the advertised window—how many bytes the receiver can accept
Sender limits data in flight to min(rwnd, congestion_window)
As application reads data, buffer space frees, receiver advertises larger window
If receiver buffer fills (application not reading), advertised window drops to 0—zero window

Zero window situation:

When the receiver advertises window size 0, the sender must stop. But how does the sender know when to resume? TCP uses window probes—the sender periodically sends tiny segments to elicit an ACK with the current window size. When the window opens, transmission resumes.
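
The interaction between the receive buffer and the advertised window can be modeled in a few lines. This is a toy sketch, not TCP itself: the class name `ReceiveBuffer`, the `sendable` helper, and the byte counts are assumptions.

```python
class ReceiveBuffer:
    """Toy receiver whose advertised window is its free buffer space."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffered = 0  # bytes received but not yet read by the application

    @property
    def rwnd(self):
        return self.capacity - self.buffered

    def on_segment(self, nbytes):
        """Accept data up to the free space; return the window to advertise."""
        self.buffered += min(nbytes, self.rwnd)
        return self.rwnd

    def app_read(self, nbytes):
        """The application consumes data, freeing space; return the new window."""
        self.buffered -= min(nbytes, self.buffered)
        return self.rwnd

def sendable(rwnd, cwnd, in_flight):
    """Bytes the sender may still transmit right now."""
    return max(0, min(rwnd, cwnd) - in_flight)

buf = ReceiveBuffer(capacity=4000)
print("after 3000 bytes:", buf.on_segment(3000))
print("after 1000 more: ", buf.on_segment(1000))  # zero window: sender must stop
print("sender may send: ", sendable(buf.rwnd, cwnd=10000, in_flight=0))
print("app reads 2500:  ", buf.app_read(2500))    # window reopens
print("sender may send: ", sendable(buf.rwnd, cwnd=10000, in_flight=0))
```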

Flow Control vs. Congestion Control

Flow control protects the receiver. Congestion control protects the network. A receiver might have plenty of buffer space, but the network path might be congested. Modern protocols implement both independently—the sender transmits at the minimum of what the receiver can handle and what the network can carry.

Congestion Control

Congestion control prevents too much data from being injected into the network, which would cause router buffers to overflow and packets to be dropped. Unlike flow control (receiver protection), congestion control protects the network itself.

The tragedy of the commons:

If every sender transmits as fast as possible, network queues overflow, packets are dropped, and senders retransmit, which makes congestion worse: congestion collapse. The early Internet suffered a series of such collapses starting in October 1986, triggering Van Jacobson's landmark congestion control work for TCP.

Key concepts:

Congestion window (cwnd): Sender-side limit on data in flight, based on network conditions (not receiver capacity)
Effective window: min(rwnd, cwnd)—sender uses the smaller of receiver window and congestion window
Congestion signals: Packet loss, delay increase, or explicit signals (ECN) indicate congestion

TCP congestion control phases:

1. Slow Start:

Start cwnd = 1 MSS (Maximum Segment Size) in the classic algorithm; current stacks usually start larger, commonly 10 MSS
For each ACK, cwnd += 1 MSS (doubles each RTT)
Exponential growth until loss or threshold (ssthresh)

2. Congestion Avoidance:

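After reaching ssthresh, grow linearly: cwnd += MSS × MSS / cwnd per ACK (that is, 1/cwnd of a segment when cwnd is counted in segments)
Roughly 1 MSS increase per RTT: the Additive Increase in AIMD (Additive Increase, Multiplicative Decrease)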
Continue until loss detected

3. Fast Retransmit & Fast Recovery:

3 duplicate ACKs indicate loss (not just timeout)
Retransmit lost segment immediately (don't wait for timeout)
ssthresh = cwnd/2, cwnd = ssthresh + 3 MSS
This halving is the Multiplicative Decrease in AIMD

4. Timeout:

If RTO expires, assume severe congestion
ssthresh = cwnd/2, cwnd = 1 MSS (back to slow start)
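
The four phases can be condensed into a single state-update function. This is a simplified Reno-style sketch with cwnd and ssthresh counted in MSS units; the function name and event labels are assumptions. The floor of 2 MSS on ssthresh follows RFC 5681, and the window deflation at the end of fast recovery is omitted.

```python
def reno_step(cwnd, ssthresh, event):
    """Apply one event ('ack', 'dupack3', or 'timeout'); return (cwnd, ssthresh)."""
    if event == "ack":
        if cwnd < ssthresh:
            cwnd += 1.0         # slow start: +1 MSS per ACK, doubling each RTT
        else:
            cwnd += 1.0 / cwnd  # congestion avoidance: about +1 MSS per RTT
    elif event == "dupack3":
        ssthresh = max(cwnd / 2, 2.0)  # multiplicative decrease
        cwnd = ssthresh + 3            # fast retransmit / fast recovery
    elif event == "timeout":
        ssthresh = max(cwnd / 2, 2.0)
        cwnd = 1.0                     # back to slow start
    else:
        raise ValueError(f"unknown event: {event}")
    return cwnd, ssthresh

cwnd, ssthresh = 1.0, 8.0
for event in ["ack"] * 9 + ["dupack3"] + ["ack"] * 3 + ["timeout", "ack", "ack"]:
    cwnd, ssthresh = reno_step(cwnd, ssthresh, event)
    print(f"{event:8s} cwnd={cwnd:6.3f} ssthresh={ssthresh:6.3f}")
```

The trace shows exponential growth up to ssthresh, slow linear growth after it, a halving on triple duplicate ACKs, and a collapse to 1 MSS on timeout.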

Modern Congestion Control

Classic TCP congestion control (Reno, NewReno) uses loss as the primary congestion signal. Modern algorithms like BBR (Bottleneck Bandwidth and Round-trip propagation time) instead model the network path: they estimate the bottleneck bandwidth and the minimum RTT and try to keep the data in flight close to their product, sustaining high throughput without building long queues. This is particularly important on today's bufferbloated networks, where loss-based algorithms fill oversized buffers and add delay.

Multiplexing and Demultiplexing

Multiplexing allows multiple communication streams to share a single lower-layer connection or channel. Demultiplexing is the reverse—delivering incoming data to the correct upper-layer recipient.

The multiplexing problem:

A single computer runs multiple applications—web browser, email client, SSH session, video stream. All share one network interface and one IP address. How does the OS know which incoming packet belongs to which application?

Multiplexing Mechanisms by Layer

•Transport Layer (Port Numbers): 16-bit port numbers multiplex multiple applications per IP address. Socket = (IP, port). TCP segment to port 443 goes to HTTPS server; port 22 goes to SSH.
•Network Layer (Protocol Field): IP header's protocol field (8 bits) demultiplexes to transport protocols. Protocol 6 = TCP, Protocol 17 = UDP, Protocol 1 = ICMP.
•Data Link Layer (EtherType): Ethernet frame's type field demultiplexes to network layer. 0x0800 = IPv4, 0x86DD = IPv6, 0x0806 = ARP.
•Physical Layer (Frequency/Time): Multiple channels share the same medium. WiFi uses channel frequencies; cellular uses time slots.
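
The chain of demultiplexing keys (EtherType, then IP protocol number, then port) can be followed through a raw frame. The sketch below handles only untagged Ethernet carrying IPv4; the `demux` function is hypothetical, and the example frame is hand-built with made-up field values.

```python
import struct

ETHERTYPES = {0x0800: "IPv4", 0x86DD: "IPv6", 0x0806: "ARP"}
IP_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}

def demux(frame):
    """Return (network protocol, transport protocol, destination port)."""
    _dst_mac, _src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    network = ETHERTYPES.get(ethertype, "unknown")
    if network != "IPv4":
        return network, None, None
    ip = frame[14:]
    header_len = (ip[0] & 0x0F) * 4  # IHL field counts 32-bit words
    transport = IP_PROTOCOLS.get(ip[9], "unknown")  # protocol field is byte 9
    if transport not in ("TCP", "UDP"):
        return network, transport, None
    # In both TCP and UDP, the destination port is bytes 2-3 of the header.
    (dst_port,) = struct.unpack("!H", ip[header_len + 2:header_len + 4])
    return network, transport, dst_port

# Hand-built example: Ethernet header, 20-byte IPv4 header, start of a TCP header.
ethernet = bytes(6) + bytes(6) + struct.pack("!H", 0x0800)
ipv4 = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                   bytes([1, 2, 3, 4]), bytes([10, 0, 0, 1]))
tcp = struct.pack("!HH", 54321, 443) + bytes(16)

print(demux(ethernet + ipv4 + tcp))
```

Each layer reads only its own key and hands the rest of the bytes upward, which is what lets the layers evolve independently.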

How TCP demultiplexes connections:

TCP identifies a connection by the 4-tuple: (source IP, source port, destination IP, destination port).

A web server listening on port 443 can handle thousands of simultaneous connections because each has a unique 4-tuple:

Source IP	Source Port	Dest IP	Dest Port
1.2.3.4	54321	10.0.0.1	443
1.2.3.4	54322	10.0.0.1	443
5.6.7.8	12345	10.0.0.1	443

Even if the source IP and destination are identical, different source ports create distinct connections. The kernel's socket table maps incoming segments to the correct application based on the complete 4-tuple.
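
The lookup described above amounts to a dictionary keyed by the 4-tuple, with a fallback to the listening socket for new connections. The class name `SocketTable` and the handler labels are assumptions; a real kernel stores socket structures rather than strings.

```python
class SocketTable:
    """Toy demultiplexer: established connections first, then listeners."""

    def __init__(self):
        self.connections = {}  # (src_ip, src_port, dst_ip, dst_port) -> handler
        self.listeners = {}    # (dst_ip, dst_port) -> application name

    def listen(self, ip, port, app):
        self.listeners[(ip, port)] = app

    def accept(self, src_ip, src_port, dst_ip, dst_port):
        """Register a new connection and return its handler label."""
        app = self.listeners[(dst_ip, dst_port)]
        handler = f"{app}#{len(self.connections) + 1}"
        self.connections[(src_ip, src_port, dst_ip, dst_port)] = handler
        return handler

    def lookup(self, src_ip, src_port, dst_ip, dst_port):
        """Find which handler should receive an incoming segment."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key in self.connections:
            return self.connections[key]
        return self.listeners.get((dst_ip, dst_port))  # None if nobody listens

table = SocketTable()
table.listen("10.0.0.1", 443, "https")
for src in [("1.2.3.4", 54321), ("1.2.3.4", 54322), ("5.6.7.8", 12345)]:
    table.accept(*src, "10.0.0.1", 443)

print(table.lookup("1.2.3.4", 54322, "10.0.0.1", 443))  # existing connection
print(table.lookup("9.9.9.9", 40000, "10.0.0.1", 443))  # new client: the listener
print(table.lookup("9.9.9.9", 40000, "10.0.0.1", 22))   # nothing bound to port 22
```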

Port Number Ranges

Port 0-1023: Well-known ports (binding usually requires root/admin privileges on Unix-like systems). Port 1024-49151: Registered ports (assigned by IANA). Port 49152-65535: Dynamic/ephemeral ports (the IANA-suggested range for client-side source ports; some operating systems use a different range). When you connect to a web server, your browser gets a high-numbered ephemeral port chosen by the OS; the server typically listens on 80 or 443.

Summary: Protocol Elements

We've explored the fundamental building blocks from which all network protocols are constructed. These elements recur at every layer of the network stack, combined in different configurations to meet different requirements.

Key Takeaways

•Addressing identifies sources and destinations at each layer—MAC, IP, ports, URIs—enabling routing and delivery.
•Framing creates discrete units from continuous bit streams, establishing message boundaries.
•Error detection (parity, checksum, CRC) identifies corruption; error correction (Hamming, Reed-Solomon) repairs it.
•Sequencing with sequence numbers enables reordering, duplicate detection, and loss detection.
•Acknowledgments and retransmission provide reliable delivery over unreliable channels.
•Flow control (sliding window) protects receivers from being overwhelmed by fast senders.
•Congestion control (slow start, AIMD) protects the network from collapse under heavy load.
•Multiplexing allows multiple streams to share channels; demultiplexing delivers to correct recipients.

What's next:

Now that we understand the individual elements, we'll see how they're organized into layers in the protocol stack. The next page examines how protocols at different layers cooperate through well-defined interfaces, enabling the modular, extensible architecture that makes the Internet possible.

Page Complete

You now have a comprehensive understanding of protocol elements. When you encounter any new protocol—from Bluetooth to QUIC—you can immediately identify which elements it uses and how they work together. This analytical framework is essential for networking at any level.

Protocol Elements

The Building Blocks of All Protocols

What You Will Learn

Addressing

The addressing problem:

Types of addresses in networking:

Address Types Across Network Layers
Layer	Address Type	Format	Scope	Example
Physical/Data Link	MAC Address	48 bits (6 octets)	Local network segment	`00:1A:2B:3C:4D:5E`
Network	IP Address	32 bits (IPv4) or 128 bits (IPv6)	Global internet	`192.168.1.1` or `2001:db8::1`
Transport	Port Number	16 bits	Within a single host	`443` (HTTPS), `22` (SSH)
Application	Domain Name / URI	Variable string	Human-readable global	`www.example.com/path`

Address properties:

Uniqueness: Addresses must be unique within their scope. MAC addresses are globally unique (assigned by manufacturers). IP addresses must be unique on a network (assigned by administrators or DHCP).
Hierarchical vs. Flat: IP addresses are hierarchical (network portion + host portion), enabling efficient routing by aggregating routes. MAC addresses are flat—no inherent structure helps route them.
Static vs. Dynamic: Some addresses are permanently assigned (burned-in MAC addresses), while others are dynamically assigned (DHCP-assigned IP addresses).

Addressing modes:

Unicast: One source to one destination (most common)
Broadcast: One source to all destinations on a network (255.255.255.255 in IPv4)
Multicast: One source to a group of interested destinations (224.0.0.0/4 in IPv4)
Anycast: One source to the nearest destination from a group (same address on multiple hosts)

Address Resolution

Framing

Framing is the process of dividing a continuous stream of bits into discrete frames or packets—self-contained units of data with clear boundaries.

The framing problem:

Physical transmission media transmit continuous streams of bits. How does a receiver know where one message ends and another begins? How does it distinguish data bits from control information?

Framing techniques:

Common Framing Methods

•Character/Byte Count: A length field at the start specifies how many bytes follow. Simple but vulnerable—if the count is corrupted, synchronization is lost until a timeout.
•Flag Bytes with Byte Stuffing: Special flag bytes mark frame boundaries. If the flag pattern appears in data, it's escaped ('stuffed') with an escape byte. PPP uses this with flag byte 0x7E.
•Flag Bits with Bit Stuffing: Special bit patterns mark boundaries. HDLC uses 01111110. To prevent this pattern in data, the sender inserts a 0 after five consecutive 1s; the receiver removes it.
•Physical Layer Violations: Use encoding violations as delimiters. Manchester encoding can signal boundaries using invalid transitions that can't occur in normal data.
•Fixed-Length Frames: Each frame has exactly N bytes. Simple but inflexible—wastes space for small messages, can't send larger messages.

Bit Stuffing Example

Diagram

Flag Pattern: 01111110 (marks start and end of frame)
 
Original data:     01111111 10100010
                      ↑
                      Five consecutive 1s - must stuff a 0
 
After bit stuffing: 011111011 10100010
                          ↑
                          Inserted 0 to break the pattern
 
Transmitted frame: [01111110] 011111011 10100010 [01111110]
                    ↑ flag                         ↑ flag
 
Receiver removes stuffed bits to recover original data.

Frame components:

A typical frame contains several parts:

Preamble/Start delimiter: Marks the beginning and allows clock synchronization
Header: Contains control information—addresses, length, sequence numbers, flags
Payload: The actual user data being transmitted
Trailer: Contains error-detection codes (e.g., CRC) and possibly end delimiter

Why framing matters:

Frame Synchronization Loss

Error Detection

The error detection problem:

A sender transmits 1000 bits. The receiver receives 1000 bits. How does it know if any bits changed during transmission?

Common error detection techniques:

Error Detection Methods Comparison
Method	How It Works	Detection Capability	Typical Usage
Parity Bit	Add 1 bit so total 1s are even (even parity) or odd (odd parity)	Detects single-bit errors; misses even-count errors	Memory, serial communication
2D Parity	Parity bits for rows AND columns	Detects all 1, 2, 3-bit errors; can correct 1-bit	Memory systems
Checksum	Sum data words, append sum; receiver recalculates to verify	Detects many errors but can miss compensating errors	IP, TCP, UDP headers
CRC (Cyclic Redundancy Check)	Treat data as polynomial, divide by generator, append remainder	Detects all errors shorter than n+1 bits (n = CRC length)	Ethernet, WiFi, storage

CRC deep dive:

CRC is the gold standard for error detection in networking due to its excellent detection capabilities relative to overhead.

How CRC works:

Treat the data as a giant binary number (polynomial)
Divide by a fixed generator polynomial using XOR-based division
Append the remainder (the CRC value) to the data
Receiver divides received data (including CRC) by the same generator
If remainder is zero, no detected errors; if non-zero, errors occurred

CRC-32 detects:

All single-bit errors
All double-bit errors
All odd numbers of bit errors
All burst errors up to 32 bits
Most longer burst errors (probability of missing ≈ 2^-32)

crc_example.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
# Simplified CRC demonstration (conceptual)
# Real implementations use optimized lookup tables
 
def crc_remainder(data_bits, generator):
    """
    Compute CRC by polynomial division
    data_bits: binary string representing data
    generator: binary string representing generator polynomial
    """
    # Append zeros for division (length = generator - 1)
    data = data_bits + '0' * (len(generator) - 1)
    
    data = list(data)
    for i in range(len(data_bits)):
        if data[i] == '1':  # Only divide if MSB is 1
            for j in range(len(generator)):
                # XOR each bit
                data[i + j] = str(int(data[i + j]) ^ int(generator[j]))
    
    # Remainder is the last (len(generator) - 1) bits
    remainder = ''.join(data[-(len(generator) - 1):])
    return remainder
 
# Example: CRC-3 with generator 1011
data = "11010011101100"
generator = "1011"
crc = crc_remainder(data, generator)
print(f"Data: {data}")
print(f"CRC:  {crc}")
print(f"Transmitted: {data}{crc}")
 
# Verification: divide transmitted data by generator
# If no errors, remainder should be all zeros

Detection vs. Correction

Error Correction

The Hamming Distance Principle:

Hamming distance = number of positions where two codewords differ

To detect d errors: need minimum Hamming distance of d + 1
To correct d errors: need minimum Hamming distance of 2d + 1

Block Codes

•Hamming Codes — Classic single-error-correcting, double-error-detecting (SECDED). Used in ECC memory.
•Reed-Solomon Codes — Correct burst errors by operating on multi-bit symbols. Used in CDs, DVDs, QR codes, deep space.
•BCH Codes — Generalization of Hamming for multiple bit error correction.
•LDPC (Low-Density Parity Check) — Near Shannon-limit performance. Used in WiFi 6, 5G, SSD storage.

Convolutional Codes

•Convolutional Codes — Encode continuous streams (not fixed blocks). Decoded with Viterbi algorithm.
•Turbo Codes — Two convolutional codes interleaved. Revolutionary performance, used in 3G/4G.
•Polar Codes — Provably achieve channel capacity. Adopted for 5G control channels.

Hamming (7,4) Code Example:

The Hamming (7,4) code encodes 4 data bits into 7 bits (adding 3 parity bits). It can correct any single-bit error.

Bit positions: 1, 2, 3, 4, 5, 6, 7

Positions 1, 2, 4 are parity bits (powers of 2)
Positions 3, 5, 6, 7 are data bits

Each parity bit covers specific positions:

P1 (position 1): covers positions with bit 0 set in binary (1, 3, 5, 7)
P2 (position 2): covers positions with bit 1 set in binary (2, 3, 6, 7)
P4 (position 4): covers positions with bit 2 set in binary (4, 5, 6, 7)

To find the error, the receiver computes syndrome bits—if a parity check fails, its position bit is 1. The syndrome value gives the position of the flipped bit!

When to Use FEC vs. Retransmission

Sequencing and Ordering

The sequencing problem:

A sender transmits packets A, B, C in order. Due to different paths through the network or retransmissions, they might arrive as B, A, C, A (reordered, with a duplicate) or just A, C (B lost). How does the receiver:

Detect that the order is wrong?
Reorder them correctly?
Detect duplicates?
Detect missing packets?

Sequence numbers:

The solution is to assign each packet a sequence number—a monotonically increasing identifier. The receiver tracks the expected next sequence number and can:

Detect reordering: Receiving sequence 5 when expecting 4 means 4 is delayed or lost
Reorder: Buffer out-of-order packets and deliver in sequence order when gaps fill
Detect duplicates: Same sequence number received twice
Detect loss: After timeout, missing sequence numbers indicate lost packets
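
A receiver built around an expected sequence number can be sketched as below. The class name `ReorderBuffer` and its methods are hypothetical, and the sketch assumes per-packet sequence numbers that never wrap around.

```python
class ReorderBuffer:
    """Delivers payloads in sequence order, dropping duplicates."""

    def __init__(self, first_seq: int = 0):
        self.expected = first_seq  # next sequence number to deliver
        self.pending = {}          # out-of-order packets awaiting a gap fill
        self.duplicates = 0

    def receive(self, seq: int, payload: str) -> list:
        """Accept one packet; return the payloads now deliverable in order."""
        if seq < self.expected or seq in self.pending:
            self.duplicates += 1   # already delivered or already buffered
            return []
        self.pending[seq] = payload
        delivered = []
        while self.expected in self.pending:
            delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        return delivered

    def missing(self) -> list:
        """Sequence numbers below the highest buffered one not yet seen."""
        if not self.pending:
            return []
        return [s for s in range(self.expected, max(self.pending))
                if s not in self.pending]


# Arrival order B, A, C, A (A=0, B=1, C=2).
buf = ReorderBuffer()
deliveries = [buf.receive(seq, name)
              for seq, name in [(1, "B"), (0, "A"), (2, "C"), (0, "A")]]

# Arrival order A, C: packet 1 (B) is missing.
lossy = ReorderBuffer()
lossy.receive(0, "A")
lossy.receive(2, "C")
gaps = lossy.missing()
print(deliveries, buf.duplicates, gaps)
```

A gap reported by `missing()` is only a candidate loss; as noted above, the receiver cannot tell delay from loss until a timeout expires.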

Sequence number space:

TCP sequence numbers:


Idempotency with Sequence Numbers

Acknowledgments and Retransmission

The acknowledgment problem:

The sender transmits a packet. Did it arrive? Without feedback, the sender doesn't know. Did the receiver process it? Was it corrupted and dropped? The sender needs confirmation.

ACK strategies:

Acknowledgment Approaches

•Positive ACK: Receiver sends ACK when packet received correctly. Sender retransmits if no ACK before timeout.
•Negative ACK (NAK): Receiver sends NAK when it detects a missing or corrupted packet. Sender retransmits immediately. Often combined with positive ACKs.
•Cumulative ACK: ACK number indicates all bytes up to that point received. "ACK 1000" means bytes 0-999 received. Simple but can't acknowledge out-of-order data.
•Selective ACK (SACK): Receiver reports exactly which segments received, including non-contiguous blocks. Allows efficient retransmission of only lost packets.
•Delayed ACK: Receiver waits briefly before ACKing, hoping to piggyback ACK on a data packet. Reduces ACK-only packets but adds latency.
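
The difference between cumulative and selective acknowledgment can be illustrated with a short sketch. It is a simplification that numbers whole packets rather than bytes, and the function names are hypothetical.

```python
def cumulative_ack(received: set, start: int = 0) -> int:
    """Next sequence number expected: everything below it has arrived."""
    nxt = start
    while nxt in received:
        nxt += 1
    return nxt


def sack_blocks(received: set, start: int = 0) -> list:
    """Contiguous (begin, end) ranges received beyond the cumulative ACK.

    Ranges are half-open: end is one past the last packet in the block.
    """
    ack = cumulative_ack(received, start)
    blocks = []
    for seq in sorted(s for s in received if s > ack):
        if blocks and blocks[-1][1] == seq:
            blocks[-1][1] = seq + 1      # extend the current block
        else:
            blocks.append([seq, seq + 1])  # start a new block
    return [tuple(b) for b in blocks]


# Packets 3, 6 and 7 were lost; everything else up to 8 arrived.
received = {0, 1, 2, 4, 5, 8}
ack = cumulative_ack(received)
blocks = sack_blocks(received)
print(ack, blocks)
```

With only the cumulative ACK, the sender learns that packet 3 is missing but nothing about 4, 5, or 8. The SACK blocks let it retransmit 3, 6, and 7 and skip the rest.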

Timeout and retransmission:

How long should a sender wait for an ACK before concluding the packet was lost? This is the Retransmission Timeout (RTO) problem.

Too short: Retransmit unnecessarily, wasting bandwidth and potentially causing congestion
Too long: Waste time waiting when the packet was clearly lost

Adaptive RTO:

Modern protocols (like TCP) dynamically estimate RTO based on measured Round-Trip Time (RTT):

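Measure RTT for each acknowledged packet that was not retransmitted (the ACK for a retransmitted packet is ambiguous)
Maintain smoothed RTT estimate: SRTT = α × SRTT + (1-α) × measured_RTT (typically α = 7/8)
Maintain RTT variance estimate: RTTVAR = β × RTTVAR + (1-β) × |SRTT - measured_RTT| (typically β = 3/4)
Calculate RTO: RTO = SRTT + 4 × RTTVAR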

This adapts to network conditions—faster retransmission on low-latency LANs, longer timeouts for transcontinental connections.
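
The estimator can be sketched in a few lines. This is an illustration under stated assumptions rather than any stack's actual code: the class name is hypothetical, the weights on the old estimates default to α = 7/8 and β = 3/4, the first sample initializes SRTT to the sample and RTTVAR to half of it, and no minimum or maximum RTO clamp is applied.

```python
class RtoEstimator:
    """Tracks smoothed RTT and RTT variation to derive a retransmission timeout."""

    def __init__(self, alpha: float = 0.875, beta: float = 0.75):
        self.alpha = alpha   # weight kept on the old SRTT
        self.beta = beta     # weight kept on the old RTTVAR
        self.srtt = None
        self.rttvar = None

    def update(self, measured_rtt: float) -> float:
        """Fold in one RTT sample (seconds) and return the new RTO."""
        if self.srtt is None:
            # First sample: no history yet.
            self.srtt = measured_rtt
            self.rttvar = measured_rtt / 2
        else:
            # Update the variation first, using the SRTT from before this sample.
            deviation = abs(self.srtt - measured_rtt)
            self.rttvar = self.beta * self.rttvar + (1 - self.beta) * deviation
            self.srtt = self.alpha * self.srtt + (1 - self.alpha) * measured_rtt
        return self.rto

    @property
    def rto(self) -> float:
        return self.srtt + 4 * self.rttvar


est = RtoEstimator()
history = [est.update(sample) for sample in [0.100, 0.100, 0.500, 0.100]]
print([round(r, 4) for r in history])
```

Steady samples shrink RTTVAR and pull the RTO toward SRTT, while a single delayed sample widens it quickly, which is the behavior the 4 × RTTVAR term is there to provide.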

The ACK Lost Problem

Flow Control

Flow control ensures that a fast sender doesn't overwhelm a slow receiver. It operates between the two communicating endpoints and concerns only the receiver's capacity, not the state of the network between them.

The flow control problem:

Flow control mechanisms:

Stop-and-Wait

•Sender transmits one packet
•Waits for ACK before sending next
•Simple but extremely inefficient
•Channel utilization: transmission_time / (transmission_time + RTT)
•On high-latency links, most time spent waiting

Sliding Window

•Sender can have up to W unacknowledged packets in flight
•W (window size) controls how fast sender transmits
•Receiver advertises how much buffer space it has
•Sender adjusts window to match receiver capacity
•High efficiency—keeps channel full
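
A rough calculation shows how large the gap between the two schemes can be. The sketch takes utilization to be the fraction of time the sender spends transmitting, ignores ACK transmission time and processing delay, and uses hypothetical link figures (1500-byte packets, 100 Mbit/s, 50 ms RTT).

```python
import math


def utilization(window: int, transmission_time: float, rtt: float) -> float:
    """Fraction of time the link is busy with `window` packets per cycle.

    One cycle lasts transmission_time + rtt: send the first packet, then
    wait one RTT for its ACK. window = 1 is stop-and-wait.
    """
    cycle = transmission_time + rtt
    return min(1.0, window * transmission_time / cycle)


def window_to_fill(transmission_time: float, rtt: float) -> int:
    """Smallest window (in packets) that keeps the link continuously busy."""
    return math.ceil((transmission_time + rtt) / transmission_time)


# Hypothetical link: 1500-byte packets at 100 Mbit/s with a 50 ms RTT.
t_tx = (1500 * 8) / 100e6   # 0.00012 s per packet
rtt = 0.050

stop_and_wait = utilization(1, t_tx, rtt)
needed = window_to_fill(t_tx, rtt)
sliding = utilization(needed, t_tx, rtt)
print(f"stop-and-wait: {stop_and_wait:.4%}, window needed: {needed}, sliding: {sliding:.0%}")
```

On this link stop-and-wait keeps the channel busy well under 1% of the time, and it takes a window of several hundred packets to fill it.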

TCP's sliding window in depth:

TCP implements flow control through the receive window (rwnd):

Receiver maintains a buffer to hold incoming data before the application reads it
Each ACK includes the advertised window—how many bytes the receiver can accept
Sender limits data in flight to min(rwnd, congestion_window)
As application reads data, buffer space frees, receiver advertises larger window
If receiver buffer fills (application not reading), advertised window drops to 0—zero window
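
The receive-window steps above can be sketched with a byte counter. The class and function names are hypothetical, and the sketch tracks only buffer occupancy, not the data itself.

```python
class ReceiveBuffer:
    """Tracks buffer occupancy and the window the receiver would advertise."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buffered = 0  # bytes received but not yet read by the application

    @property
    def rwnd(self) -> int:
        """Advertised window: free space remaining in the buffer."""
        return self.capacity - self.buffered

    def on_segment(self, nbytes: int) -> int:
        """Store an arriving segment; return the window to advertise in the ACK."""
        self.buffered += min(nbytes, self.rwnd)
        return self.rwnd

    def app_read(self, nbytes: int) -> int:
        """Application consumes data, freeing buffer space."""
        taken = min(nbytes, self.buffered)
        self.buffered -= taken
        return taken


def sendable(rwnd: int, cwnd: int, in_flight: int) -> int:
    """Bytes the sender may still transmit: min(rwnd, cwnd) minus data in flight."""
    return max(0, min(rwnd, cwnd) - in_flight)


buf = ReceiveBuffer(capacity=4096)
after_first = buf.on_segment(3000)    # window shrinks
after_second = buf.on_segment(1096)   # buffer full: zero window
blocked = sendable(after_second, cwnd=10000, in_flight=0)
buf.app_read(2000)                    # application drains some data
reopened = sendable(buf.rwnd, cwnd=10000, in_flight=500)
print(after_first, after_second, blocked, reopened)
```

Once the advertised window reaches zero, the sender can transmit nothing until the application reads and a later ACK carries a larger window.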

Zero window situation:

Flow Control vs. Congestion Control

Congestion Control

The tragedy of the commons:

Key concepts:

Congestion window (cwnd): Sender-side limit on data in flight, based on network conditions (not receiver capacity)
Effective window: min(rwnd, cwnd)—sender uses the smaller of receiver window and congestion window
Congestion signals: Packet loss, delay increase, or explicit signals (ECN) indicate congestion

TCP congestion control phases:

1. Slow Start:

Start cwnd = 1 MSS (Maximum Segment Size) in the classic algorithm; modern stacks commonly start with up to 10 MSS
For each ACK, cwnd += 1 MSS (doubles each RTT)
Exponential growth until loss or threshold (ssthresh)

2. Congestion Avoidance:

After reaching ssthresh, grow linearly: cwnd += MSS × MSS / cwnd per ACK (equivalently, 1/cwnd when cwnd is counted in segments)
Roughly 1 MSS increase per RTT: the Additive Increase half of AIMD
Continue until loss detected

3. Fast Retransmit & Fast Recovery:

3 duplicate ACKs signal a lost segment without waiting for the retransmission timer
Retransmit lost segment immediately (don't wait for timeout)
ssthresh = cwnd/2, cwnd = ssthresh + 3 MSS
Halving the window is the Multiplicative Decrease half of AIMD

4. Timeout:

If RTO expires, assume severe congestion
ssthresh = cwnd/2, cwnd = 1 MSS (back to slow start)
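
The four phases can be traced with a small state-update sketch. It is a simplification in the style of classic Reno, not a faithful TCP implementation: cwnd is counted in segments (1.0 = 1 MSS), one ACK arrives per segment, the function names are hypothetical, and ssthresh is floored at 2 segments as an assumption.

```python
def on_ack(cwnd: float, ssthresh: float) -> float:
    """Window growth for one new ACK."""
    if cwnd < ssthresh:
        return cwnd + 1.0        # slow start: +1 MSS per ACK
    return cwnd + 1.0 / cwnd     # congestion avoidance: about +1 MSS per RTT


def on_triple_dup_ack(cwnd: float) -> tuple:
    """Fast retransmit / fast recovery: returns (new cwnd, new ssthresh)."""
    ssthresh = max(cwnd / 2, 2.0)
    return ssthresh + 3.0, ssthresh


def on_timeout(cwnd: float) -> tuple:
    """RTO expiry: returns (new cwnd, new ssthresh), restarting slow start."""
    ssthresh = max(cwnd / 2, 2.0)
    return 1.0, ssthresh


def run_rtt(cwnd: float, ssthresh: float) -> float:
    """Apply one round trip's worth of ACKs (one per segment in the window)."""
    for _ in range(int(cwnd)):
        cwnd = on_ack(cwnd, ssthresh)
    return cwnd


cwnd, ssthresh = 1.0, 16.0
trace = [cwnd]
for _ in range(5):
    cwnd = run_rtt(cwnd, ssthresh)
    trace.append(cwnd)

after_dup_acks = on_triple_dup_ack(20.0)
after_timeout = on_timeout(20.0)
print([round(c, 2) for c in trace], after_dup_acks, after_timeout)
```

The trace doubles each round trip until it reaches ssthresh, then grows by just under one segment per round trip. The two loss handlers show why a timeout costs far more than three duplicate ACKs.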


Modern Congestion Control

Multiplexing and Demultiplexing

The multiplexing problem:

Multiplexing Mechanisms by Layer

•Transport Layer (Port Numbers): 16-bit port numbers multiplex multiple applications per IP address. Socket = (IP, port). TCP segment to port 443 goes to HTTPS server; port 22 goes to SSH.
•Network Layer (Protocol Field): IP header's protocol field (8 bits) demultiplexes to transport protocols. Protocol 6 = TCP, Protocol 17 = UDP, Protocol 1 = ICMP.
•Data Link Layer (EtherType): Ethernet frame's type field demultiplexes to network layer. 0x0800 = IPv4, 0x86DD = IPv6, 0x0806 = ARP.
•Physical Layer (Frequency/Time): Multiple channels share the same medium. WiFi uses channel frequencies; cellular uses time slots.
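
The layer-by-layer lookups can be chained into one illustrative dispatch. The tables below contain only the values listed above, and the function name `demux_path` is hypothetical. A real stack parses these fields out of frame and packet headers, and IPv6 carries its own ICMP variant under a different protocol number.

```python
ETHERTYPES = {0x0800: "IPv4", 0x86DD: "IPv6", 0x0806: "ARP"}
IP_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}
WELL_KNOWN_PORTS = {443: "HTTPS", 22: "SSH"}


def demux_path(ethertype: int, ip_protocol: int = None, dst_port: int = None) -> list:
    """Return the chain of handlers a frame would be passed through."""
    path = [ETHERTYPES.get(ethertype, "unknown")]
    if path[0] in ("IPv4", "IPv6") and ip_protocol is not None:
        # Network layer: the protocol field selects the transport protocol.
        path.append(IP_PROTOCOLS.get(ip_protocol, "unknown"))
        if path[-1] in ("TCP", "UDP") and dst_port is not None:
            # Transport layer: the destination port selects the application.
            path.append(WELL_KNOWN_PORTS.get(dst_port, "unknown"))
    return path


https_path = demux_path(0x0800, ip_protocol=6, dst_port=443)
arp_path = demux_path(0x0806)
ping_path = demux_path(0x0800, ip_protocol=1)
print(https_path, arp_path, ping_path)
```

Each layer reads only its own field and hands the payload upward, which is why a new transport protocol or application can be added without changing the layers beneath it.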

How TCP demultiplexes connections:

TCP identifies a connection by the 4-tuple: (source IP, source port, destination IP, destination port).

A web server listening on port 443 can handle thousands of simultaneous connections because each has a unique 4-tuple:

Source IP	Source Port	Dest IP	Dest Port
1.2.3.4	54321	10.0.0.1	443
1.2.3.4	54322	10.0.0.1	443
5.6.7.8	12345	10.0.0.1	443
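
Lookup by 4-tuple can be sketched with a dictionary. The class and method names are hypothetical, and the sketch skips the TCP handshake: the first segment to a listening port simply creates the connection entry.

```python
from typing import NamedTuple


class FourTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class Demultiplexer:
    """Routes incoming segments to per-connection queues keyed by 4-tuple."""

    def __init__(self):
        self.listening = set()   # (ip, port) pairs with a listening socket
        self.connections = {}    # FourTuple -> list of payloads received

    def listen(self, ip: str, port: int) -> None:
        self.listening.add((ip, port))

    def deliver(self, key: FourTuple, payload: bytes) -> str:
        if key in self.connections:
            self.connections[key].append(payload)
            return "existing"
        if (key.dst_ip, key.dst_port) in self.listening:
            self.connections[key] = [payload]
            return "new"
        return "rejected"        # no socket is bound to that destination


demux = Demultiplexer()
demux.listen("10.0.0.1", 443)

a = FourTuple("1.2.3.4", 54321, "10.0.0.1", 443)
b = FourTuple("1.2.3.4", 54322, "10.0.0.1", 443)
c = FourTuple("5.6.7.8", 12345, "10.0.0.1", 443)

results = [demux.deliver(k, b"hello") for k in (a, b, c, a)]
closed_port = demux.deliver(FourTuple("1.2.3.4", 54323, "10.0.0.1", 8080), b"x")
print(results, closed_port, len(demux.connections))
```

Rows a and b share a source IP and differ only in source port, yet they map to separate connections. That is how one client can open many parallel connections to the same server port.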

Port Number Ranges

Summary: Protocol Elements

Key Takeaways

•Addressing identifies sources and destinations at each layer—MAC, IP, ports, URIs—enabling routing and delivery.
•Framing creates discrete units from continuous bit streams, establishing message boundaries.
•Error detection (parity, checksum, CRC) identifies corruption; error correction (Hamming, Reed-Solomon) repairs it.
•Sequencing with sequence numbers enables reordering, duplicate detection, and loss detection.
•Acknowledgments and retransmission provide reliable delivery over unreliable channels.
•Flow control (sliding window) protects receivers from being overwhelmed by fast senders.
•Congestion control (slow start, AIMD) protects the network from collapse under heavy load.
•Multiplexing allows multiple streams to share channels; demultiplexing delivers to correct recipients.

What's next:
