Multilevel Coding - Learning Module

8B/10B Block Encoding

The Quest for Perfect Signal Balance

As networking speeds climbed from 100 Mbps to 1 Gbps and beyond, the limitations of 4B/5B encoding became increasingly problematic. While 4B/5B guaranteed sufficient transitions for clock recovery, it could not guarantee DC balance—and this became a critical constraint for the demanding links of Gigabit-class systems.

The Problem: AC-coupled interfaces (using capacitors or transformers to block DC) require signals with zero average voltage. Long runs of certain data patterns in 4B/5B could cause baseline wander—a slow drift in the receiver's reference voltage that would eventually cause bit errors. This limited both the distance and the data pattern tolerance of 4B/5B systems.

The Solution: In 1983, IBM engineers Al Widmer and Peter Franaszek invented 8B/10B encoding, a revolutionary scheme that guarantees DC balance through a technique called running disparity control. Each 8-bit byte is encoded as a 10-bit symbol, with two possible encodings for most bytes—one that adds positive disparity, one that adds negative. The encoder dynamically chooses between them to keep the cumulative disparity bounded.

8B/10B became the encoding of choice for an entire generation of high-speed interfaces: Gigabit Ethernet (1000BASE-X), Fibre Channel, USB 3.0, SATA, DisplayPort, PCI Express Gen 1/2, and many others. (HDMI's TMDS also maps 8 bits to 10, but it is a different code.)

What You Will Master

By completing this page, you will understand the complete theory and implementation of 8B/10B encoding: the 3B/4B and 5B/6B sub-encoding structure, running disparity management, the complete K-code control symbol set, error detection through code space violations, and implementations in Gigabit Ethernet, Fibre Channel, and PCI Express. You'll gain the depth expected of a systems architect who must make informed physical layer decisions.

The Architecture of 8B/10B Encoding

8B/10B encoding maps each 8-bit byte to a 10-bit symbol. Unlike 4B/5B's simple lookup, 8B/10B uses a sophisticated two-stage encoding with disparity tracking to guarantee DC balance.

The Two-Stage Structure

8B/10B is not a single 8-to-10-bit mapping. Instead, the 8-bit input is split into two parts:

Input: HGFEDCBA (8 bits)
       └─┬─┘└┬┘
         │   └── Lower 5 bits (EDCBA) → 5B/6B encoder → abcdei (6 bits)
         └────── Upper 3 bits (HGF)   → 3B/4B encoder → fghj  (4 bits)

Output: abcdei fghj (10 bits, transmitted in this order, bit a first)

Why this split?

Reduced Lookup Table Size: A direct 8-to-10 mapping would require 256 entries × 10 bits × 2 disparities = 5120 bits of ROM. The split approach uses two smaller tables.
Disparity Management: The 5B/6B and 3B/4B sub-codes are designed so that their disparities can be independently tracked and compensated.
Historical Optimization: The original IBM implementation used discrete logic where smaller tables meant fewer gates.
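The table-size argument above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes, for illustration, that each sub-table also stores both disparity variants of every entry, which a real design may avoid.

```python
# Direct mapping: 256 entries, 10 bits each, two disparity variants
direct_bits = 256 * 10 * 2

# Split mapping (assumption: both variants stored in each sub-table)
bits_5b6b = 32 * 6 * 2   # 32 inputs, 6-bit outputs
bits_3b4b = 8 * 4 * 2    # 8 inputs, 4-bit outputs
split_bits = bits_5b6b + bits_3b4b

print(direct_bits, split_bits)  # 5120 448
```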

Understanding Disparity

Disparity Definition:

Disparity = (number of 1s) - (number of 0s)

For a code word:

If ones > zeros: positive disparity
If ones < zeros: negative disparity
If ones = zeros: neutral (zero disparity)
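The definition translates directly into code. The helper names below (`disparity`, `classify`) are illustrative and not part of any standard library.

```python
def disparity(code: int, width: int) -> int:
    """Number of 1s minus number of 0s in a code word of the given width."""
    ones = bin(code).count("1")
    return ones - (width - ones)


def classify(code: int, width: int) -> str:
    """Label a code word as positive, negative, or neutral."""
    d = disparity(code, width)
    if d > 0:
        return "positive"
    if d < 0:
        return "negative"
    return "neutral"


print(disparity(0b100111, 6), classify(0b100111, 6))  # 2 positive
print(disparity(0b111000, 6), classify(0b111000, 6))  # 0 neutral
```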

Running Disparity (RD):

The encoder maintains a running disparity state: either RD- (negative) or RD+ (positive).

After transmitting a positive disparity code, RD becomes RD+
After transmitting a negative disparity code, RD becomes RD-
After transmitting a neutral code, RD is unchanged

The encoder always selects the code version that drives RD toward balance:

If RD-, select the positive disparity version
If RD+, select the negative disparity version

The Disparity Inverting Trick

Most 8B/10B codes come in complementary pairs: one with positive disparity, one with negative. These pairs are bit-wise complements of each other! If code X has disparity +2, then ~X (all bits inverted) has disparity -2. This elegant property makes the encoder trivial: keep one version in the table and XOR with a mask when the other is needed.
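A small sketch of the complement trick: inverting every bit of a code word negates its disparity. The function names are hypothetical and the example uses the D.0 sub-block patterns 100111 and 011000.

```python
def complement(code: int, width: int) -> int:
    """Invert all bits of a code word by XOR with an all-ones mask."""
    mask = (1 << width) - 1
    return code ^ mask


def disparity(code: int, width: int) -> int:
    """Number of 1s minus number of 0s."""
    ones = bin(code).count("1")
    return 2 * ones - width


d0_rd_minus = 0b100111                    # D.0 sub-block used at RD-
d0_rd_plus = complement(d0_rd_minus, 6)   # derived RD+ version

print(f"{d0_rd_plus:06b}")                                   # 011000
print(disparity(d0_rd_minus, 6), disparity(d0_rd_plus, 6))   # 2 -2
```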

The 5B/6B Sub-Code

The lower 5 bits (EDCBA) encode to 6 bits (abcdei). The 5B/6B code must ensure:

No code has more than 4 consecutive identical bits
Every code has disparity 0 or ±2 (never ±4 or ±6)
Codes with nonzero disparity come in complementary pairs

Example 5B/6B Mappings:

D.x (Data)	EDCBA	abcdei (RD-)	abcdei (RD+)	Disparity
D.0	00000	100111	011000	±2
D.1	00001	011101	100010	±2
D.7	00111	111000	000111	0/0
D.11	01011	110100	110100	0
D.31	11111	101011	010100	±2

The 3B/4B Sub-Code

The upper 3 bits (HGF) encode to 4 bits (fghj). The 3B/4B code follows similar principles:

D.y (Data)	HGF	fghj (RD-)	fghj (RD+)	Disparity
.0	000	1011	0100	±2
.1	001	1001	-	0
.2	010	0101	-	0
.3	011	1100	0011	0 (version chosen by RD)
.7	111	1110	0001	±2

Combined Notation: D.x.y

Data codes are written as D.x.y where:

x = decimal value of EDCBA (0-31)
y = decimal value of HGF (0-7)

For example, ASCII 'A' (0x41 = 0100 0001 = HGFEDCBA = 010 00001):

x = 00001 = 1
y = 010 = 2
Notation: D.1.2
Encoding: abcdei from D.1 + fghj from .2
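The naming rule can be sketched as a pair of helpers. `dxy_name` and `dxy_to_byte` are made-up names for illustration.

```python
def dxy_name(byte: int) -> str:
    """Return the D.x.y name of an 8-bit value (bits HGFEDCBA)."""
    x = byte & 0x1F          # EDCBA, lower 5 bits
    y = (byte >> 5) & 0x07   # HGF, upper 3 bits
    return f"D.{x}.{y}"


def dxy_to_byte(x: int, y: int) -> int:
    """Inverse mapping: rebuild the byte from x (0-31) and y (0-7)."""
    return ((y & 0x07) << 5) | (x & 0x1F)


print(dxy_name(0x41))            # D.1.2
print(hex(dxy_to_byte(1, 2)))    # 0x41
```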

Complete D.x.y Encoding Process Example
Step	Operation	Bits	Notes
Input	ASCII 'A'	01000001	Hex 0x41
Split	HGF + EDCBA	010 + 00001	Upper 3, lower 5
Lookup x	D.1 at RD-	011101	5B/6B code
Lookup y	.2 at RD-	0101	3B/4B code
Combine	abcdei + fghj	0111010101	10-bit symbol
Disparity	Sum 1s and 0s	6-4 = +2	New RD = RD+

Running Disparity: The Heart of DC Balance

Running disparity (RD) is the mechanism that guarantees 8B/10B's DC balance. Understanding RD is essential to understanding why 8B/10B succeeds where 4B/5B falls short.

The Running Disparity Algorithm

Initial State: At system startup, RD is typically initialized to RD- (though the choice is arbitrary—the system will converge to correct operation within a few symbols).

Encoding Rule:

1. For the current byte, look up the 10-bit code
2. If the code has disparity 0, use it directly; RD unchanged
3. If the code has disparity ±2:
   - If RD-, use the code version with disparity +2
   - If RD+, use the code version with disparity -2
4. Update RD based on the disparity of the transmitted code:
   - Positive disparity: RD becomes RD+
   - Negative disparity: RD becomes RD-
   - Zero disparity: RD unchanged

Why RD Stays Bounded

Key Insight: Because every non-neutral code flips the RD state, and the encoder always selects the disparity opposite to current RD, the running disparity oscillates between RD- and RD+ but never accumulates indefinitely.

Bounds:

Running digital sum within a symbol: at most ±3 (temporary excursion)
At each symbol boundary: ±1 (the two states RD- and RD+)
Long-term average: 0 (guaranteed DC balance)
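To see the bounded behaviour, the sketch below tracks the running sum bit by bit over a short stream. It assumes the common convention that RD- corresponds to a running sum of -1, and it uses two code words taken from this page: K.28.5 at RD- (0011111010) and D.1.2 at RD+ (1000100101).

```python
def running_sum_trace(bits: str, start: int = -1) -> list[int]:
    """Running (ones minus zeros) sum after each transmitted bit."""
    total = start
    trace = []
    for bit in bits:
        total += 1 if bit == "1" else -1
        trace.append(total)
    return trace


K28_5_RD_MINUS = "0011111010"   # disparity +2, moves RD- to RD+
D1_2_RD_PLUS = "1000100101"     # disparity -2, moves RD+ to RD-

stream = (K28_5_RD_MINUS + D1_2_RD_PLUS) * 2
trace = running_sum_trace(stream)

print(min(trace), max(trace))        # -3 2
print(trace[9::10])                  # [1, -1, 1, -1]
```

In this stream the sum alternates between +1 and -1 at symbol boundaries and never drifts, however many times the pair is repeated.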

8b10b_encoder.py
Python
class Encoder8B10B:
    """
    8B/10B Encoder with Running Disparity Tracking
    
    Demonstrates the core 8B/10B algorithm with disparity management.
    This simplified implementation shows the encoding logic; production
    implementations use full lookup tables for all 256 data codes.
    """
    
    def __init__(self):
        self.rd_negative = True  # Running disparity: True = RD-, False = RD+
        
        # Simplified 5B/6B table (selected entries for demonstration)
        # Format: (RD- code, RD+ code, is_neutral)
        self.table_5b6b = {
            0b00000: (0b100111, 0b011000, False),  # D.0
            0b00001: (0b011101, 0b100010, False),  # D.1
            0b00010: (0b101101, 0b010010, False),  # D.2
            0b00111: (0b111000, 0b000111, True),   # D.7 (neutral)
            0b01011: (0b110100, 0b110100, True),   # D.11 (neutral)
            0b10111: (0b111010, 0b000101, False),  # D.23
            0b11011: (0b110110, 0b001001, False),  # D.27
            0b11100: (0b001110, 0b001110, True),   # D.28 (neutral)
            0b11111: (0b101011, 0b010100, False),  # D.31
        }
        
        # Simplified 3B/4B table
        self.table_3b4b = {
            0b000: (0b1011, 0b0100, False),  # .0
            0b001: (0b1001, 0b1001, True),   # .1 (neutral)
            0b010: (0b0101, 0b0101, True),   # .2 (neutral)
            0b011: (0b1100, 0b0011, True),   # .3 (zero disparity, version chosen by RD)
            0b100: (0b1101, 0b0010, False),  # .4
            0b101: (0b1010, 0b1010, True),   # .5 (neutral)
            0b110: (0b0110, 0b0110, True),   # .6 (neutral)
            0b111: (0b1110, 0b0001, False),  # .7 (special handling for some x values)
        }
    
    def encode_byte(self, byte: int) -> tuple[int, int]:
        """
        Encode an 8-bit byte to a 10-bit symbol.
        
        Returns:
            Tuple of (10-bit code, disparity change)
        """
        # Split into x (lower 5 bits) and y (upper 3 bits)
        x = byte & 0x1F        # EDCBA
        y = (byte >> 5) & 0x07  # HGF
        
        # Look up 5B/6B code for x
        if x not in self.table_5b6b:
            # For demo, return a placeholder
            abcdei = 0b111000  # D.7 as fallback
            x_neutral = True
        else:
            rd_neg_code, rd_pos_code, x_neutral = self.table_5b6b[x]
            abcdei = rd_neg_code if self.rd_negative else rd_pos_code
        
        # Calculate intermediate disparity after 6B code
        ones_6b = bin(abcdei).count('1')
        # ones - zeros = ones - (6 - ones) = 2*ones - 6
        disp_6b = 2 * ones_6b - 6
        
        # Update intermediate RD
        intermediate_rd_negative = self.rd_negative
        if not x_neutral:
            intermediate_rd_negative = disp_6b < 0
        
        # Look up 3B/4B code for y
        if y not in self.table_3b4b:
            fghj = 0b0101  # .2 as fallback
            y_neutral = True
        else:
            rd_neg_code, rd_pos_code, y_neutral = self.table_3b4b[y]
            fghj = rd_neg_code if intermediate_rd_negative else rd_pos_code
        
        # Combine into 10-bit code: abcdei in the upper 6 bits, fghj in the
        # lower 4, so the binary string reads "abcdei fghj" as in the text
        code_10b = (abcdei << 4) | fghj
        
        # Calculate final disparity
        ones_4b = bin(fghj).count('1')
        disp_4b = 2 * ones_4b - 4
        total_disp = disp_6b + disp_4b
        
        # Update running disparity
        if total_disp > 0:
            self.rd_negative = False
        elif total_disp < 0:
            self.rd_negative = True
        # If zero, RD unchanged
        
        return code_10b, total_disp
    
    def encode_stream(self, data: bytes) -> list[tuple[int, int]]:
        """Encode a byte stream, returning (code, disparity) pairs."""
        results = []
        for byte in data:
            code, disp = self.encode_byte(byte)
            results.append((code, disp))
        return results
    
    def analyze_balance(self, codes: list[tuple[int, int]]) -> dict:
        """Analyze DC balance of encoded stream."""
        total_ones = 0
        total_bits = 0
        rd_trace = []
        
        self.rd_negative = True  # Reset for analysis
        for code, disp in codes:
            ones = bin(code).count('1')
            total_ones += ones
            total_bits += 10
            rd_trace.append('RD-' if self.rd_negative else 'RD+')
            
            # Update RD
            if disp > 0:
                self.rd_negative = False
            elif disp < 0:
                self.rd_negative = True
        
        return {
            "total_bits": total_bits,
            "total_ones": total_ones,
            "total_zeros": total_bits - total_ones,
            "dc_offset": (total_ones - (total_bits - total_ones)) / total_bits,
            "rd_trace": rd_trace[:10],  # First 10 RD states
            "is_dc_balanced": abs(total_ones - (total_bits - total_ones)) <= 4
        }
 
 
# Demonstration
if __name__ == "__main__":
    encoder = Encoder8B10B()
    
    # Encode sample data
    test_data = bytes([0x00, 0xFF, 0xA5, 0x5A])
    print("8B/10B Encoding Demonstration")
    print("=" * 60)
    
    results = encoder.encode_stream(test_data)
    for byte, (code, disp) in zip(test_data, results):
        print(f"Byte 0x{byte:02X}: Code {code:010b} (disp {disp:+d})")
    
    # Analyze larger stream
    encoder = Encoder8B10B()  # Reset
    large_data = bytes(range(256))  # All possible bytes
    results = encoder.encode_stream(large_data)
    analysis = encoder.analyze_balance(results)
    print(f"\nDC Balance Analysis (256 bytes):")
    for k, v in analysis.items():
        if k != "rd_trace":
            print(f"  {k}: {v}")

Disparity Error Detection

Code Space Violations:

Not all 10-bit patterns are valid 8B/10B codes. There are 2¹⁰ = 1024 possible patterns, but only:

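256 data codes (D.x.y) × 2 disparities = at most 512 patterns (neutral codes use the same pattern for both disparities)
12 control codes (K.x.y) × 2 disparities = 24 patterns

Distinct valid patterns: fewer than 536 of the 1024 (the exact count depends on how many codes are neutral). Any other received pattern is a code violation.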

Disparity Violations:

Even if a received pattern is a valid code, receiving it with the wrong disparity indicates an error:

Example:
- Expecting RD- context, receive code for RD+ context
- This is a "disparity error"
- Indicates bit errors corrupted a code into a different valid code

Error Detection Capability:

8B/10B can detect:

Any single-bit error (it produces an invalid code or a disparity violation, though the violation may only surface in a later symbol)
Most multi-bit errors (very few code-to-code transitions)
All run-length violations (codes exceeding 5 consecutive identical bits)

This provides strong error detection at the physical layer, reducing the burden on higher-layer CRC checks.
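The disparity rule can be sketched as a receiver-side check. This is a simplified illustration, not a full decoder: it looks only at whole-symbol disparity, whereas a real decoder also validates the pattern against the code tables and checks each sub-block. The function name `check_symbol` is an assumption.

```python
def check_symbol(code: int, rd_negative: bool) -> tuple[bool, bool]:
    """Return (ok, next_rd_negative) for one received 10-bit symbol."""
    ones = bin(code & 0x3FF).count("1")
    disp = 2 * ones - 10               # ones minus zeros
    if disp not in (-2, 0, 2):
        return False, rd_negative      # no valid code has this disparity
    if disp == 2 and not rd_negative:
        return False, rd_negative      # positive code while already at RD+
    if disp == -2 and rd_negative:
        return False, rd_negative      # negative code while already at RD-
    if disp == 0:
        return True, rd_negative       # neutral code leaves RD unchanged
    return True, disp < 0


# D.1.2 encoded for RD- is 0111010101 (disparity +2)
print(check_symbol(0b0111010101, rd_negative=True))   # (True, False)
print(check_symbol(0b0111010101, rd_negative=False))  # (False, False)
```

A bit flip that turns a ±2 symbol into a zero-disparity pattern passes this check, which is why the table lookup is still needed to catch invalid codes.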

Undetectable Errors

While 8B/10B has excellent error detection, some multi-bit errors can transform one valid code into another without violating disparity—these are undetectable at the 8B/10B layer. This is why higher-layer CRCs remain necessary. The probability of undetectable errors is very low (~0.1% of multi-bit error patterns) but non-zero.

K-Codes: Special Control Symbols

8B/10B reserves 12 special codes called K-codes (the K prefix marks control symbols, just as D marks data symbols). K-codes are distinguished from data codes by 10-bit patterns that never appear in the data code space.

The K-Code Space

K-codes are identified by specific x.y combinations that produce unique 10-bit patterns:

K-Code	Notation	10-bit (RD-)	10-bit (RD+)	Primary Use
K.28.0	K28.0	001111 0100	110000 1011	Fibre Channel
K.28.1	K28.1	001111 1001	110000 0110	Control (contains comma pattern)
K.28.2	K28.2	001111 0101	110000 1010	Alignment
K.28.3	K28.3	001111 0011	110000 1100	Alignment
K.28.4	K28.4	001111 0010	110000 1101	Alignment
K.28.5	K28.5	001111 1010	110000 0101	Comma Character
K.28.6	K28.6	001111 0110	110000 1001	Alignment
K.28.7	K28.7	001111 1000	110000 0111	Reserved
K.23.7	K23.7	111010 1000	000101 0111	Carrier Extend
K.27.7	K27.7	110110 1000	001001 0111	Start of Packet
K.29.7	K29.7	101110 1000	010001 0111	End of Packet
K.30.7	K30.7	011110 1000	100001 0111	Error Propagation

The Comma Character: K.28.5

The most important K-code is K.28.5, known as the "comma" character. Its pattern contains a unique bit sequence that cannot appear within any sequence of data codes:

K.28.5 (RD-): 0011111010  (contains "0011111")
K.28.5 (RD+): 1100000101  (contains "1100000")

The Comma Property: The 7-bit patterns 0011111 and 1100000 never appear across data code boundaries or within data codes. This makes K.28.5 unambiguously identifiable regardless of where in the bit stream the receiver starts looking.

Uses of Comma:

Word Alignment: Receiver searches for comma pattern to find 10-bit symbol boundaries
Lane Alignment: In multi-lane links, commas synchronize all lanes
Clock Recovery Training: Comma-rich idle patterns help PLL lock
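Word alignment by comma search can be sketched on a bit string. The helper name and the three stray leading bits in the example are hypothetical; a hardware aligner would do the same search on a sliding window of the serial stream.

```python
COMMAS = ("0011111", "1100000")


def find_symbol_boundary(bits: str):
    """Index of the first comma, which marks the start of a symbol, or None."""
    positions = [bits.find(comma) for comma in COMMAS]
    positions = [p for p in positions if p != -1]
    return min(positions) if positions else None


# Three leftover bits, then K.28.5 (RD-), then D.1.2 (RD+)
stream = "101" + "0011111010" + "1000100101"
start = find_symbol_boundary(stream)

print(start)                                             # 3
print(stream[start:start + 10], stream[start + 10:])     # 0011111010 1000100101
```

Once the boundary is known, every following group of 10 bits is one symbol.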

Why K.28 Sub-codes?

The K.28 family is special because x=28 (binary 11100) produces the 5B/6B pattern 001111 (RD-) or 110000 (RD+), the only 6-bit sub-block that ends in a run of four identical bits. When the following 3B/4B sub-block begins with the same bit value, the run extends to five and forms the comma pattern. The other K-codes (K.23.7, K.27.7, K.29.7, K.30.7) use y=7 to create distinct control markers such as start-of-packet and end-of-packet.
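A quick check against the RD- patterns in the K-code table shows which members of the K.28 family actually carry the 7-bit comma sequence. The dictionary below simply copies those table entries.

```python
# RD- encodings of K.28.0 through K.28.7, copied from the K-code table
K28_RD_MINUS = {
    0: "0011110100",
    1: "0011111001",
    2: "0011110101",
    3: "0011110011",
    4: "0011110010",
    5: "0011111010",
    6: "0011110110",
    7: "0011111000",
}

with_comma = [y for y, bits in K28_RD_MINUS.items() if "0011111" in bits]
print(with_comma)  # [1, 5, 7]
```

K.28.1 and K.28.7 contain the same sequence as K.28.5, which is why protocols treat all three as comma-bearing symbols even though K.28.5 is the one normally used for alignment.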

K-Code Usage in Gigabit Ethernet

1000BASE-X Ordered Sets:

Gigabit Ethernet over fiber (1000BASE-X) uses K-codes in ordered sets:

Ordered Set	K-Code Sequence	Purpose
/I1/ (Idle 1)	K.28.5 D.5.6	Idle between frames
/I2/ (Idle 2)	K.28.5 D.16.2	Alternate idle
/C1/ (Config 1)	K.28.5 D.21.5 D.x.y D.x.y	Auto-negotiation
/C2/ (Config 2)	K.28.5 D.2.2 D.x.y D.x.y	Auto-negotiation
/R/ (Carrier Extend)	K.23.7	Extends carrier for half-duplex
/S/ (Start of Packet)	K.27.7	Immediately precedes frame
/T/ (End of Packet)	K.29.7	Immediately follows FCS
/V/ (Error Propagation)	K.30.7	Propagates error indication

Frame Delimiting:

...I I I I /S/ [Preamble] [SFD] [Frame...] [FCS] /T/ R R I I I...
            ↑                                   ↑
        K.27.7                               K.29.7

Unlike 4B/5B's J-K delimiter, 8B/10B uses single K-code markers (/S/ and /T/), providing cleaner frame boundaries.
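The delimiting layout in the diagram can be sketched as a list of symbol names. This is a simplification under stated assumptions: `frame_symbols` is a hypothetical helper, each idle is shown as a single /I/ token although it is a two-symbol ordered set, and the number of /R/ symbols is fixed at two to mirror the diagram.

```python
def frame_symbols(frame: bytes, idles: int = 2, extends: int = 2) -> list[str]:
    """Symbol-name sequence for one frame wrapped in /S/ and /T/ markers."""
    def data_name(byte: int) -> str:
        return f"D.{byte & 0x1F}.{byte >> 5}"

    sequence = ["/I/"] * idles + ["/S/"]
    sequence += [data_name(b) for b in frame]
    sequence += ["/T/"] + ["/R/"] * extends + ["/I/"] * idles
    return sequence


# One preamble byte (0x55) and the SFD (0xD5) stand in for a whole frame
print(frame_symbols(bytes([0x55, 0xD5])))
# ['/I/', '/I/', '/S/', 'D.21.2', 'D.21.6', '/T/', '/R/', '/R/', '/I/', '/I/']
```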

K-Code Validity Rules

K-codes follow the same disparity rules as data codes, but they can only appear in specific contexts:

K-codes cannot appear as user data (they're filtered at higher layers)
Receiving an unexpected K-code is a protocol error
K-codes maintain running disparity just like data codes
Consecutive K-codes must form valid ordered sets

K-Code Functions

•K.28.5 (Comma): Word alignment, lane synchronization, clock training
•K.27.7 (/S/): Start of frame marker in Gigabit Ethernet
•K.29.7 (/T/): End of frame marker, immediately follows FCS
•K.28.5 + D.5.6 or D.16.2 (/I1/, /I2/): Idle ordered sets between frames
•K.30.7 (/V/): Error propagation across link segments
•K.23.7 (/R/): Carrier extension for half-duplex burst mode

8B/10B in Gigabit Ethernet and Fibre Channel

8B/10B encoding found its most prominent applications in Gigabit Ethernet and Fibre Channel—two technologies that defined high-speed networking in the late 1990s and continue operating in millions of installations today.

1000BASE-X Gigabit Ethernet

Standard Variants:

1000BASE-SX: Short-wavelength laser (850 nm), multimode fiber, up to 550m
1000BASE-LX: Long-wavelength laser (1310 nm), single/multimode fiber, up to 5km
1000BASE-CX: Twinax copper, 25m (data center short runs)

Signal Chain:

[GMII] → [PCS: 8B/10B] → [PMA: SerDes] → [PMD: Laser/LED] → Fiber
  ↓          ↓              ↓
1 Gbps    1.25 Gbaud     Optical
(8-bit)   (10-bit)       modulation

Key Parameters:

Parameter	Value	Derivation
Data Rate	1000 Mbps	User payload
Line Rate	1250 Mbaud	1000 × 10/8
Code-Group Period	8 ns	10 bits × 0.8 ns
Bit Period	0.8 ns	1/1.25 Gbaud
Minimum IPG	12 bytes = 96 bit times	96 ns at 1 Gbps
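The line-rate derivation is simple enough to verify directly. The variable names are illustrative.

```python
data_rate_bps = 1_000_000_000             # 1 Gbps of user data

# Every 8 data bits become 10 line bits
line_rate_baud = data_rate_bps * 10 // 8

bit_period_ns = 1e9 / line_rate_baud      # duration of one line bit
code_group_ns = 10 * bit_period_ns        # duration of one 10-bit code group

print(line_rate_baud)                     # 1250000000
print(round(bit_period_ns, 3), round(code_group_ns, 3))  # 0.8 8.0
```

One 10-bit code group lasts 8 ns, which is exactly the time one data byte takes at 1 Gbps.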

Fibre Channel

Fibre Channel is the dominant storage area network (SAN) interconnect, using 8B/10B for all speeds up to 8 Gbps.

Speed Evolution:

Generation	Line Rate	Payload Rate	8B/10B
1GFC	1.0625 Gbaud	0.85 Gbps	Yes
2GFC	2.125 Gbaud	1.7 Gbps	Yes
4GFC	4.25 Gbaud	3.4 Gbps	Yes
8GFC	8.5 Gbaud	6.8 Gbps	Yes
16GFC	14.025 Gbaud	13.6 Gbps	No (64B/66B)

Note: Starting with 16GFC, Fibre Channel switched to 64B/66B encoding for improved efficiency (96.97% vs 80%).

Fibre Channel Frame Structure

[SOF] [Frame Header 24B] [Payload 0-2112B] [CRC 4B] [EOF]
  ↑                                                   ↑
K.28.5 + ordered set                          K-code ordered set

Ordered Sets:

SOFi (Start of Frame Initiate): K.28.5, D.21.5, D.22.2, D.22.2
SOFn (Start of Frame Normal): K.28.5, D.21.5, D.22.1, D.22.1
EOFt (End of Frame Terminate): K.28.5, D.21.4, D.21.3, D.21.3
EOFa (End of Frame Abort): K.28.5, D.21.4, D.21.7, D.21.7

Why 8B/10B Persists in Storage

Fibre Channel continued using 8B/10B through 8GFC (2006) while Ethernet had moved to 64B/66B at 10 Gbps. This wasn't technical inertia—FC's deterministic latency requirements and simpler ordered sets made 8B/10B's 20% overhead acceptable. Only at 16GFC did the efficiency penalty become compelling enough to switch.

Other 8B/10B Applications

USB 3.0 SuperSpeed:

5 Gbaud line rate → 4 Gbps payload rate after 8B/10B
Uses scrambling in addition to 8B/10B
K-codes frame packets and build link ordered sets; LFPS (Low Frequency Periodic Signaling) is a separate low-speed mechanism outside 8B/10B
Note: USB 3.1 Gen 2 (10 Gbps) switched to 128B/132B

Serial ATA (SATA):

SATA I/II/III: 1.5/3/6 Gbps → 1.5/3/6 Gbaud
Uses 8B/10B with OOB (Out-of-Band) signaling
COMRESET, COMINIT use comma patterns outside normal data

DisplayPort:

1.62/2.7/5.4/8.1 Gbps per lane
8B/10B for DisplayPort 1.4 and earlier
128B/132B for DisplayPort 2.0+

PCI Express Gen 1/2:

Gen 1: 2.5 GT/s (2.0 Gbps effective)
Gen 2: 5 GT/s (4.0 Gbps effective)
Gen 3+ switched to 128B/130B for higher efficiency

8B/10B Technology Adoption
Technology	Line Rate	Data Rate	Status
Gigabit Ethernet	1.25 Gbaud	1 Gbps	Active, mature
Fibre Channel 1-8G	1.0625-8.5 Gbaud	0.85-6.8 Gbps	Active, legacy
USB 3.0	5 Gbaud	4 Gbps	Active
SATA I/II/III	1.5-6 Gbaud	1.2-4.8 Gbps	Active
PCIe Gen 1/2	2.5-5 GT/s	2-4 Gbps	Legacy
DisplayPort 1.0-1.4	1.62-8.1 Gbaud	1.296-6.48 Gbps	Active, mature

8B/10B: Comparison and Evolution to 64B/66B

8B/10B dominated high-speed encoding for nearly two decades, but its 20% overhead eventually became prohibitive at 10 Gbps and beyond. Understanding why—and what replaced it—completes our understanding of block coding evolution.

Comprehensive Encoding Comparison

Block Encoding Scheme Comparison
Property	4B/5B	8B/10B	64B/66B	128B/130B
Efficiency	80%	80%	96.97%	98.46%
DC Balance	Not guaranteed	Guaranteed	Scrambled	Scrambled
Run Length	≤3 zeros	≤5 same	Scrambled	Scrambled
Error Detect	Invalid codes	Disparity + code	Sync header	Sync header
Comma Align	J-K pattern	K.28.5	01/10 header	01/10 header
Complexity	Low	Medium	Higher	Higher
Max Speed*	~200 Mbps	~10 Gbps	100+ Gbps	100+ Gbps

*Maximum practical speed is limited by other factors; these are typical deployment ceilings.

Why 8B/10B Efficiency Matters at Speed

The 10 Gigabit Ethernet Problem:

At 10 Gbps data rate:

8B/10B would require: 10 × 10/8 = 12.5 Gbaud line rate
64B/66B requires: 10 × 66/64 = 10.3125 Gbaud line rate

Impact of extra bandwidth:

Higher symbol rates require faster SerDes (more expensive, more power)
More cable/fiber bandwidth consumed
Tighter timing margins
More crosstalk in copper

The 17.5% reduction (12.5 → 10.3125 Gbaud) translates to meaningful cost savings at 10G and becomes even more significant at 40G, 100G, and beyond.
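The comparison generalizes to any block code: line rate equals data rate times block size over payload size. A minimal sketch, with `line_rate` as an illustrative helper name:

```python
def line_rate(data_gbps: float, payload_bits: int, block_bits: int) -> float:
    """Line rate in Gbaud needed to carry data_gbps with a given block code."""
    return data_gbps * block_bits / payload_bits


rate_8b10b = line_rate(10, 8, 10)
rate_64b66b = line_rate(10, 64, 66)
saving = (rate_8b10b - rate_64b66b) / rate_8b10b

print(rate_8b10b, rate_64b66b)   # 12.5 10.3125
print(f"{saving:.1%}")           # 17.5%
```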

64B/66B: The Successor

How 64B/66B Works:

Block Structure: Every 64 data bits are prefixed with a 2-bit sync header
Sync Headers:
- 01 = Data block (all 8 bytes are data)
- 10 = Control block (mix of control and data)
Scrambling: The 64-bit payload is scrambled (polynomial x⁵⁸ + x³⁹ + 1) for DC balance and run-length
No Disparity Tracking: Scrambling provides statistical DC balance without deterministic disparity control
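The scrambling step can be sketched as a self-synchronous scrambler built from the polynomial above: each output bit is the input bit XORed with the scrambled bits sent 39 and 58 positions earlier. This is an illustrative bit-list model only; bit ordering within a block and the initial register contents are assumptions here, not details taken from the standard.

```python
MASK = (1 << 58) - 1   # 58-bit shift register


def scramble(bits: list[int], state: int = MASK) -> list[int]:
    """Self-synchronous scrambler for x^58 + x^39 + 1."""
    out = []
    for d in bits:
        s = d ^ ((state >> 38) & 1) ^ ((state >> 57) & 1)
        out.append(s)
        state = ((state << 1) | s) & MASK   # register holds past OUTPUT bits
    return out


def descramble(bits: list[int], state: int = MASK) -> list[int]:
    """Matching descrambler; its register holds past RECEIVED bits."""
    out = []
    for s in bits:
        out.append(s ^ ((state >> 38) & 1) ^ ((state >> 57) & 1))
        state = ((state << 1) | s) & MASK
    return out


data = [1, 0, 0, 1, 1, 0, 1] * 30
scrambled = scramble(data)
print(descramble(scrambled) == data)                      # True
# A descrambler with the wrong starting state recovers after 58 bits
print(descramble(scrambled, state=0)[58:] == data[58:])   # True
```

Because the descrambler register is filled from received bits, no seed exchange is needed, which is the self-synchronizing property.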

Advantages over 8B/10B:

97% efficiency vs 80%
Simpler encoder/decoder logic (scrambler vs disparity FSM)
Guaranteed sync header transitions (01 or 10 every 66 bits)

Disadvantages:

No deterministic DC balance (rare pathological patterns possible)
Error detection through sync only (weaker than 8B/10B disparity)
Block-level errors vs symbol-level errors

When to Use Which Encoding

8B/10B remains preferred when deterministic timing and strong error detection are paramount (FC legacy, embedded systems). 64B/66B and beyond are preferred when bandwidth efficiency matters more (10G+ Ethernet, high-density data center links). The choice is fundamentally an engineering trade-off, not a matter of one being universally "better."

8B/10B Advantages

•Guaranteed DC balance per code
•Strong error detection (disparity + code)
•Proven reliability over 20+ years
•Deterministic behavior (no scrambler)
•Symbol-level error localization
•Wide industry support and tools

8B/10B Limitations

•20% overhead limits efficiency
•Complex disparity state machine
•Large lookup tables (512+ entries)
•Not practical beyond ~10 Gbps
•Higher power at equivalent data rate
•Replaced by 64B/66B in modern standards

Summary: Mastering 8B/10B Encoding

8B/10B encoding represents the pinnacle of deterministic block coding—a scheme that provides guaranteed DC balance, strong error detection, and reliable clock recovery through elegant mathematical design. Its influence extends far beyond its direct applications, shaping how engineers think about physical layer encoding.

Let's consolidate the essential knowledge:

Core Concepts to Remember

•Two-Stage Structure: 8-bit input splits into 5B/6B (lower) and 3B/4B (upper) sub-encodings
•Running Disparity: Encoder tracks RD state and selects positive/negative disparity codes to maintain DC balance
•Bounded Disparity: The running digital sum returns to ±1 at every symbol boundary and never exceeds ±3 in between, guaranteeing DC balance regardless of data pattern
•K-Codes: 12 special codes (K.28.5, K.27.7, K.29.7, etc.) provide word alignment, frame delimiting, and control functions
•Comma Character: K.28.5 contains unique 7-bit pattern for unambiguous symbol boundary detection
•80% Efficiency: 20% of the line rate is coding overhead (the line runs 25% faster than the data), acceptable for speeds up to ~10 Gbps, superseded by 64B/66B at higher rates

8B/10B Quick Reference Card
Parameter	Value	Notes
Code Rate	8:10	8 data bits → 10 code bits
Efficiency	80%	Same as 4B/5B, better properties
Max Run Length	5 identical	Guaranteed by code design
Data Codes	256 (D.x.y)	x: 0-31, y: 0-7
Control Codes	12 (K.x.y)	K.28.0-7, K.23/27/29/30.7
Comma Pattern	K.28.5	0011111010 or 1100000101
DC Balance	Guaranteed	Running digital sum bounded to ±3
Primary Uses	GbE, FC 1-8G, USB 3.0	Pre-10G high-speed serial

Looking Ahead:

With 8B/10B mastered, you now understand the complete evolution of block coding from 4B/5B through the modern era. The final page in this module covers High-Speed Applications—synthesizing everything you've learned to understand how modern 100G, 400G, and terabit links combine advanced modulation (PAM-4, PAM-16), forward error correction, and high-efficiency encoding to push the boundaries of what's possible over copper and fiber.

Page Complete

You now possess a comprehensive understanding of 8B/10B encoding—from its mathematical foundations through its deployment in Gigabit Ethernet, Fibre Channel, and numerous other standards. This knowledge is essential for anyone working with high-speed serial interfaces or designing systems where physical layer characteristics matter.