Line Coding - Basics

Level: Intermediate

Duration: 75 mins

Topic: Line Coding Basics

Coding Efficiency: Measuring Line Code Performance

The Ultimate Trade-off Question

Throughout this module, we've encountered a recurring theme: every line coding scheme involves trade-offs. NRZ offers excellent bandwidth efficiency but fails at synchronization. Manchester guarantees transitions but doubles bandwidth requirements. Block codes like 8B/10B provide both synchronization and DC balance but add overhead.

The natural question is: How do we quantify these trade-offs? How can we compare one line code against another in a rigorous, principled way? What metrics capture the essence of a good line code?

This page introduces the analytical framework for evaluating line code efficiency. You'll learn to calculate spectral efficiency, compare coding overhead, and understand the fundamental limits that govern what any line code can achieve. This knowledge transforms line code selection from guesswork into engineering discipline.

What You Will Learn

By the end of this page, you will master the key efficiency metrics for line codes: bit rate to baud rate ratio, bandwidth efficiency, overhead calculation, and figure of merit analysis. You'll be able to make quantitative comparisons between encoding schemes and select the optimal code for given requirements.

Fundamental Efficiency Concepts

Before diving into specific metrics, let's establish the fundamental concepts that underpin efficiency analysis.

Key Definitions:

Essential Terminology for Efficiency Analysis
Term	Definition	Unit	Example
Bit Rate (R_b)	Number of data bits transmitted per second	bps	1 Gbps
Baud Rate (R_s)	Number of symbols (signal elements) transmitted per second	Baud or symbols/s	500 MBaud
Symbol	One discrete signal element (can represent 1+ bits)	—	Voltage level, phase state
Signaling Rate	Same as baud rate; rate of symbol changes	Baud	—
Bandwidth (B)	Frequency range occupied by the signal	Hz	125 MHz
Overhead	Extra bits added by encoding (beyond data bits)	% or ratio	25% for 8B/10B

The Critical Distinction: Bits vs. Symbols:

Bit: A binary digit (0 or 1) representing one unit of information
Symbol: A single transmission element that represents one or more bits

In NRZ, 1 symbol = 1 bit (bit rate = baud rate). In multilevel signaling, 1 symbol can represent multiple bits. In block codes like 8B/10B, we send 10 symbols per 8 data bits.

The Efficiency Question:

How many data bits can we transmit per unit bandwidth? This captures the essence of spectral efficiency:

$$\eta = \frac{R_b}{B}\ \text{(bits/second per Hz)}$$

Higher η means more data through the same bandwidth—crucial when bandwidth is scarce or expensive.

Why Baud Rate Matters

Baud rate determines the signal's bandwidth requirements, while bit rate determines data throughput. A clever encoding scheme might achieve high bit rate at modest baud rate (through multilevel signaling), or it might sacrifice efficiency for other goals (synchronization, error detection). Understanding both is essential.

Bit Rate to Baud Rate Ratio

The bit rate to baud rate ratio (R_b/R_s) indicates how many data bits each symbol conveys—a fundamental measure of coding efficiency.

Formula:

$$\frac{R_b}{R_s} = \log_2(L)$$

Where L is the number of distinct signal levels (for multilevel signaling).

For binary signaling (L = 2): $$\frac{R_b}{R_s} = \log_2(2) = 1$$

For 4-level signaling (L = 4): $$\frac{R_b}{R_s} = \log_2(4) = 2\text{ bits per symbol}$$

But Wait—Encoding Overhead Matters:

The above formula assumes each symbol represents pure data. Block codes add overhead:

$$\frac{R_b}{R_s} = \frac{k}{n} \times \log_2(L)$$

Where k = data bits, n = total bits (including overhead), L = signal levels.

Bit-to-Baud Ratios for Common Line Codes
Line Code	Signal Levels	k/n Ratio	R_b/R_s	Notes
NRZ-L	2	1/1	1.0	No overhead, baseline efficiency
NRZ-I	2	1/1	1.0	Same efficiency as NRZ-L
Manchester	2	1/2	0.5	50% efficiency (guaranteed transitions)
AMI	3 (but binary info)	1/1	1.0	Bipolar; DC balance for free
4B/5B	2	4/5	0.8	25% overhead for run-length limiting
8B/10B	2	8/10	0.8	25% overhead (for DC balance + sync)
PAM-4	4	1/1	2.0	2 bits per symbol, used in 100GbE
64B/66B	2	64/66	0.97	3% overhead (modern high-speed)

Interpretation:

R_b/R_s > 1: Multilevel signaling—each symbol carries multiple bits
R_b/R_s = 1: Binary signaling with no overhead (NRZ ideal case)
R_b/R_s < 1: Coding overhead—more symbols than data bits

Example Calculations:

Manchester Encoding at 10 Mbps:

Data rate: 10 Mbps
R_b/R_s = 0.5
Baud rate: R_s = R_b / 0.5 = 20 MBaud
Requires twice the bandwidth of equivalent NRZ

8B/10B at 1 Gbps:

Data rate: 1 Gbps
R_b/R_s = 0.8
Baud rate: R_s = R_b / 0.8 = 1.25 GBaud
25% higher symbol rate than data rate
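
Both calculations above follow the same pattern, which a small helper makes explicit. This is a minimal sketch: the function name `baud_rate` and its parameters are illustrative choices, not part of any standard library.

```python
import math

def baud_rate(bit_rate: float, k: int, n: int, levels: int = 2) -> float:
    """Symbol rate needed to carry `bit_rate` data bits per second.

    Applies R_b / R_s = (k / n) * log2(L) from the text.
    """
    bits_per_symbol = (k / n) * math.log2(levels)
    return bit_rate / bits_per_symbol

# Manchester: 1 data bit per 2 signal elements
print(f"Manchester @ 10 Mbps: {baud_rate(10e6, k=1, n=2) / 1e6:.2f} MBaud")
# 8B/10B: 8 data bits per 10 code bits
print(f"8B/10B @ 1 Gbps:      {baud_rate(1e9, k=8, n=10) / 1e9:.2f} GBaud")
# PAM-4 without block coding: 2 bits per symbol
print(f"PAM-4 @ 1 Gbps:       {baud_rate(1e9, k=1, n=1, levels=4) / 1e9:.2f} GBaud")
```

Keeping `k`, `n`, and `levels` as separate arguments mirrors the formula, so block-code overhead and multilevel signaling can be varied independently.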

The Hidden Cost of Transitions

Manchester's R_b/R_s = 0.5 might seem like 50% efficiency, but it's actually paying for something valuable: guaranteed transitions for clock recovery. The 'lost' 50% is the price of synchronization. Whether this is acceptable depends on whether synchronization could be achieved more cheaply by other means.

Bandwidth Efficiency

Bandwidth efficiency (also called spectral efficiency) measures how effectively a line code uses the available frequency spectrum.

Definition:

$$\eta_B = \frac{R_b}{B}\text{ (bits/s/Hz)}$$

Where R_b is data rate and B is the bandwidth required.

Theoretical Limits:

The Nyquist theorem states that for a channel with bandwidth B, the maximum symbol rate is:

$$R_s^{max} = 2B\text{ symbols/second (for baseband)}$$

Combining with multilevel signaling:

$$\eta_B^{max} = 2\log_2(L)\text{ bits/s/Hz}$$

This is the Nyquist limit for ideal (noiseless) channels. Real channels with noise are constrained by the Shannon-Hartley theorem, but for line coding analysis, Nyquist provides the relevant bound.
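
To make the bound concrete, the sketch below evaluates it for a 125 MHz channel (the example bandwidth from the terminology table) at several level counts. The function name is an illustrative assumption, and the result is the noiseless upper bound only.

```python
import math

def nyquist_max_bit_rate(bandwidth_hz: float, levels: int) -> float:
    """Upper bound on bit rate for an ideal noiseless baseband channel."""
    max_symbol_rate = 2 * bandwidth_hz          # R_s^max = 2B
    return max_symbol_rate * math.log2(levels)  # each symbol carries log2(L) bits

bandwidth = 125e6  # 125 MHz
for levels in (2, 4, 8):
    rate = nyquist_max_bit_rate(bandwidth, levels)
    print(f"L={levels}: {rate / 1e6:.0f} Mbps, {rate / bandwidth:.1f} bits/s/Hz")
```

Doubling the number of levels adds one bit per symbol, so the bound grows only logarithmically with L.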

Bandwidth Efficiency Comparison
Line Code	Bandwidth (approx.)	η_B (bits/s/Hz)	Relative to binary Nyquist limit of 2 bits/s/Hz (%)
NRZ-L/I	R_b / 2	2.0	100% (theoretical ideal)
Manchester	R_b	1.0	50%
AMI	R_b / 2	2.0	100% (for line rate)
4B/5B + NRZ-I	~0.625 × R_b	1.6	80%
8B/10B	~0.625 × R_b	1.6	80%
MLT-3	R_b / 4	4.0	200% (3-level, clever transitions)
PAM-4	R_b / 4	4.0	200% (4-level)

bandwidth_efficiency.py
Python
import math
 
class LineCodingScheme:
    """Model a line coding scheme for efficiency analysis."""
    
    def __init__(self, name: str, signal_levels: int, 
                 data_bits: int, code_bits: int,
                 bandwidth_factor: float):
        """
        Args:
            name: Name of the encoding scheme
            signal_levels: Number of distinct signal levels (L)
            data_bits: Data bits per block (k)
            code_bits: Total bits per block (n)
            bandwidth_factor: Bandwidth as fraction of symbol rate
        """
        self.name = name
        self.L = signal_levels
        self.k = data_bits
        self.n = code_bits
        self.bw_factor = bandwidth_factor
        
    @property
    def bits_per_symbol(self) -> float:
        """Bits per symbol (considering multilevel)."""
        return math.log2(self.L)
    
    @property
    def code_rate(self) -> float:
        """k/n ratio (data bits per code bits)."""
        return self.k / self.n
    
    @property
    def bit_to_baud(self) -> float:
        """Data bit rate to baud rate ratio."""
        return self.code_rate * self.bits_per_symbol
    
    def symbol_rate(self, data_rate: float) -> float:
        """Symbol rate required for given data rate."""
        return data_rate / self.bit_to_baud
    
    def bandwidth(self, data_rate: float) -> float:
        """Bandwidth required for given data rate."""
        return self.symbol_rate(data_rate) * self.bw_factor
    
    def spectral_efficiency(self, data_rate: float) -> float:
        """Spectral efficiency in bits/s/Hz."""
        return data_rate / self.bandwidth(data_rate)
    
    def overhead_percent(self) -> float:
        """Encoding overhead as a percentage of the data bits, (n - k) / k."""
        return (self.n - self.k) / self.k * 100
    
    def __str__(self):
        return (f"{self.name}: {self.L}-level, {self.k}:{self.n} rate, "
                f"η={self.spectral_efficiency(1e6):.2f} bits/s/Hz, "
                f"overhead={self.overhead_percent():.1f}%")
 
# Define common line coding schemes
schemes = [
    LineCodingScheme("NRZ-L", signal_levels=2, data_bits=1, code_bits=1, 
                     bandwidth_factor=0.5),
    LineCodingScheme("Manchester", signal_levels=2, data_bits=1, code_bits=2, 
                     bandwidth_factor=0.5),
    LineCodingScheme("4B/5B", signal_levels=2, data_bits=4, code_bits=5, 
                     bandwidth_factor=0.5),
    LineCodingScheme("8B/10B", signal_levels=2, data_bits=8, code_bits=10, 
                     bandwidth_factor=0.5),
    LineCodingScheme("64B/66B", signal_levels=2, data_bits=64, code_bits=66, 
                     bandwidth_factor=0.5),
    LineCodingScheme("PAM-4", signal_levels=4, data_bits=2, code_bits=2, 
                     bandwidth_factor=0.5),
]
 
data_rate = 10e9  # 10 Gbps

print("Line Code Efficiency Analysis")
print("=" * 60)
print(f"Target Data Rate: {data_rate / 1e9:.0f} Gbps\n")
 
print(f"{'Scheme':<12} {'Symbol Rate':<14} {'Bandwidth':<12} {'η (b/s/Hz)':<12}")
print("-" * 60)
 
for scheme in schemes:
    sr = scheme.symbol_rate(data_rate)
    bw = scheme.bandwidth(data_rate)
    eta = scheme.spectral_efficiency(data_rate)
    print(f"{scheme.name:<12} {sr/1e9:.2f} GBaud     {bw/1e9:.2f} GHz     {eta:.2f}")

Practical vs. Theoretical Bandwidth

The theoretical Nyquist bandwidth is half the symbol rate (R_s/2). The 'first null' of a rectangular NRZ pulse spectrum lies at R_s, twice that. Practical systems need more than the Nyquist minimum to preserve signal integrity. A common rule: practical bandwidth ≈ 0.5 to 1.0 × symbol rate for NRZ-like codes. The exact factor depends on filtering, pulse shaping, and tolerable inter-symbol interference.

Coding Overhead Analysis

Coding overhead quantifies the extra bandwidth consumed by encoding beyond the bare minimum required for data transmission.

Definition:

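$$\text{Overhead (\%)} = \frac{n - k}{k} \times 100$$

Where k = data bits per block, n = total transmitted bits per block. Overhead is measured relative to the data bits, which is why 8B/10B is quoted at 25% (2 extra bits per 8 data bits).

The complementary measure is efficiency (the code rate), expressed relative to the transmitted bits:

$$\text{Efficiency (\%)} = \frac{k}{n} \times 100$$

Because the two use different denominators, they do not sum to 100%: 8B/10B has 25% overhead and 80% efficiency. Equivalently, 20% of the line rate carries no data.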

Overhead Analysis of Common Codes
Code	Data Bits (k)	Code Bits (n)	Overhead (%)	What Overhead Buys
NRZ-L/I	1	1	0%	Nothing—no synchronization, no DC balance
Manchester	1	2	100%	Guaranteed mid-bit transition, DC balance
4B/5B	4	5	25%	Max 3 consecutive zeros, improved sync
8B/10B	8	10	25%	Bounded run length, DC balance, error detection (via violations)
64B/66B	64	66	3.125%	Sync header, low overhead (with scrambling)
128B/130B	128	130	1.56%	Ultra-low overhead, sync header
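
The percentages in the table can be reproduced in a few lines. The sketch below measures overhead relative to the data bits, (n − k)/k, which is the convention the table's figures follow, and prints the code rate k/n alongside it. The helper names are illustrative.

```python
from fractions import Fraction

def overhead_percent(k: int, n: int) -> float:
    """Extra code bits as a percentage of the data bits: (n - k) / k."""
    return float(Fraction(n - k, k) * 100)

def efficiency_percent(k: int, n: int) -> float:
    """Data bits as a percentage of all transmitted bits: k / n."""
    return float(Fraction(k, n) * 100)

# (k, n) pairs taken from the table above
codes = {
    "NRZ-L/I": (1, 1),
    "Manchester": (1, 2),
    "4B/5B": (4, 5),
    "8B/10B": (8, 10),
    "64B/66B": (64, 66),
    "128B/130B": (128, 130),
}

for name, (k, n) in codes.items():
    print(f"{name:<10} overhead={overhead_percent(k, n):8.3f}%  "
          f"efficiency={efficiency_percent(k, n):7.3f}%")
```

Using `Fraction` keeps the ratios exact until the final conversion, which avoids rounding surprises for values such as 2/64.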

The Evolution of Overhead:

Notice the historical trend:

Manchester (1980s Ethernet)    → 100% overhead
8B/10B (1990s Gigabit Ethernet) → 25% overhead  
64B/66B (2000s 10GbE)          → 3.125% overhead
128B/130B (PCIe Gen3+)         → 1.56% overhead

As data rates increased, the overhead 'tax' became increasingly expensive. At 100 Gbps, the 25% overhead of 8B/10B would require a 125 GBaud line rate, 25 GBaud more than the data itself needs. Engineers developed more efficient codes to reduce this burden.

The Scrambling Revolution:

64B/66B achieves low overhead by using scrambling instead of carefully-designed code words:

2-bit sync header (provides framing)
64 data bits are scrambled with LFSR
Scrambling provides statistical DC balance and transition density
No lookup table needed; computationally simple
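
The scrambling step can be sketched as a self-synchronizing scrambler. The example below assumes the polynomial 1 + x^39 + x^58, the one commonly cited for 10GBASE-R. The seed, the payload, and the function names are arbitrary illustrative choices, and the 2-bit sync header (which is not scrambled) is left out.

```python
import random

TAPS = (39, 58)   # delays for the polynomial 1 + x^39 + x^58
STATE_LEN = 58

def scramble(bits, state=None):
    """Self-synchronizing scrambler: y[i] = x[i] ^ y[i-39] ^ y[i-58]."""
    # history[0] is the most recent output bit, history[57] the oldest
    history = list(state) if state is not None else [0] * STATE_LEN
    out = []
    for x in bits:
        y = x ^ history[TAPS[0] - 1] ^ history[TAPS[1] - 1]
        out.append(y)
        history = [y] + history[:-1]
    return out

def descramble(bits, state=None):
    """Inverse operation; the state is fed by the received bits, not the output."""
    history = list(state) if state is not None else [0] * STATE_LEN
    out = []
    for y in bits:
        out.append(y ^ history[TAPS[0] - 1] ^ history[TAPS[1] - 1])
        history = [y] + history[:-1]
    return out

rng = random.Random(42)                      # arbitrary seed, for illustration only
seed = [rng.randint(0, 1) for _ in range(STATE_LEN)]
payload = [0] * 128                          # all zeros: no transitions before scrambling
line_bits = scramble(payload, seed)

transitions = sum(a != b for a, b in zip(line_bits, line_bits[1:]))
print("transitions in scrambled output:", transitions)
print("round trip ok:", descramble(line_bits, seed) == payload)
# A receiver that starts with the wrong state still recovers after 58 bits
print("self-sync ok:", descramble(line_bits)[STATE_LEN:] == payload[STATE_LEN:])
```

Because the descrambler's state is filled from the received bit stream, it needs no explicit seed exchange. The price is that the balance and transition guarantees are statistical rather than strict.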

Overhead is Investment, Not Waste

Frame overhead as 'buying features' rather than 'wasting bandwidth.' 8B/10B's 25% overhead purchases: (1) guaranteed run length ≤5, (2) guaranteed DC balance, (3) error detection via code violations, (4) special control characters. Whether this purchase is worthwhile depends on the application.

Figure of Merit Comparisons

No single metric captures all aspects of line code performance. Engineers use multiple figures of merit (FOM) to compare codes across different dimensions.

Common Figures of Merit:

Line Code Evaluation Metrics

•Spectral Efficiency (η_B): Data rate per unit bandwidth—measures spectrum usage efficiency.
•Normalized Transition Density: Average transitions per bit—indicates synchronization capability.
•Maximum Run Length: Longest sequence without transition—determines PLL requirements.
•DC Free / DC Balance: Zero or bounded DC content—enables transformer coupling, AC coupling.
•Error Detection Capability: Ability to detect transmission errors through code violations.
•Implementation Complexity: Hardware/software resources for encoding and decoding.
•Power Spectral Density Shape: Frequency distribution of signal energy—affects interference and filtering.

Multi-Dimensional Line Code Comparison
Code	η_B	Transitions	Max Run	DC Free	Error Det.	Complexity
NRZ-L	2.0	0-1.0	Unlimited	No	None	Trivial
NRZ-I	2.0	0-1.0	Unlimited	No	None	Simple
Manchester	1.0	1.0-2.0	1 bit	Yes	Via edge	Simple
Diff Manchester	1.0	1.0-2.0	1 bit	Yes	Via edge	Simple
AMI	2.0	~0.5	Unlimited	Long-term	Bipolar violation	Simple
B8ZS	2.0	~0.5	7 bits	Yes	Multiple	Moderate
HDB3	2.0	~0.5	3 bits	Yes	Multiple	Moderate
4B/5B	1.6	~0.8	3 bits	No (needs NRZ-I)	Code violation	Lookup table
8B/10B	1.6	0.3-0.8	5 bits	Yes	Code violation	Lookup table
64B/66B	1.94	~0.5	~66 bits	Stat.	CRC/frame	Scrambler + header

Weighted Scoring Approach:

When selecting a line code, assign weights based on application priorities:

FOM = w1×(Spectral Efficiency) + w2×(Sync Capability) + w3×(DC Balance) 
      - w4×(Complexity) - w5×(Overhead)

Where w1 + w2 + w3 + w4 + w5 = 1 (normalized weights)

Example: Gigabit Ethernet Link

Priorities:

Spectral efficiency: Important but not critical (w1 = 0.2)
Sync capability: Essential (w2 = 0.3)
DC balance: Essential (transformer coupling) (w3 = 0.3)
Complexity: Low concern at this scale (w4 = 0.1)
Overhead: Moderate concern (w5 = 0.1)

Result: 8B/10B scores well—excellent sync and DC balance, acceptable overhead, reasonable complexity.
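
The weighted scoring above can be sketched in code. The weights are the ones listed for the Gigabit Ethernet example. The per-code scores are hypothetical values in [0, 1] chosen only for illustration: the spectral score is η_B/2 and the overhead score is the overhead fraction, while the sync, DC balance, and complexity scores are assumptions, not measured data.

```python
# Weights from the Gigabit Ethernet example (they sum to 1)
WEIGHTS = {
    "spectral": 0.2,
    "sync": 0.3,
    "dc_balance": 0.3,
    "complexity": 0.1,
    "overhead": 0.1,
}

# HYPOTHETICAL normalized scores in [0, 1], for illustration only
SCORES = {
    "NRZ-L":      {"spectral": 1.0, "sync": 0.0, "dc_balance": 0.0,
                   "complexity": 0.1, "overhead": 0.0},
    "Manchester": {"spectral": 0.5, "sync": 1.0, "dc_balance": 1.0,
                   "complexity": 0.2, "overhead": 1.0},
    "8B/10B":     {"spectral": 0.8, "sync": 0.8, "dc_balance": 1.0,
                   "complexity": 0.5, "overhead": 0.25},
}

def figure_of_merit(scores: dict, weights: dict = WEIGHTS) -> float:
    """FOM = weighted benefits minus weighted costs, as in the formula above."""
    benefit = sum(weights[m] * scores[m] for m in ("spectral", "sync", "dc_balance"))
    cost = sum(weights[m] * scores[m] for m in ("complexity", "overhead"))
    return benefit - cost

ranked = sorted(SCORES, key=lambda name: figure_of_merit(SCORES[name]), reverse=True)
for name in ranked:
    print(f"{name:<10} FOM = {figure_of_merit(SCORES[name]):.3f}")
```

With these placeholder scores 8B/10B ranks first, but the ordering is sensitive to both the weights and the scores, so any real comparison needs scores justified for the application at hand.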

No Universal Winner

Every line code wins for some applications and loses for others. NRZ-L is perfect for short PCB traces with separate clock. Manchester is ideal for 10 Mbps Ethernet. 8B/10B suits Gigabit links. 64B/66B dominates at 10Gb+. There's no 'best' code—only best-for-this-application.

Trade-off Analysis Framework

Understanding how line code properties trade against each other helps engineers make informed decisions. Let's formalize these trade-offs.

Bandwidth Efficiency vs. Synchronization Trade-off:

Guaranteeing transitions requires 'spending' signal changes that could otherwise carry data.

The Spectrum:

NRZ (0 guaranteed transitions): Maximum bandwidth efficiency (η = 2.0)
8B/10B (1 per 5 bits guaranteed): Good balance (η = 1.6)
Manchester (1 per bit guaranteed): Synchronization optimized (η = 1.0)

Mathematical Relationship:

Let T = guaranteed transitions per bit. Then approximately:

$$\eta_B \approx \frac{2}{1 + T}$$

More transitions → lower efficiency. This is fundamental—you can't get both for free.
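
A quick check of the approximation against the three points on the spectrum above shows where it is exact and where it is only close. The function name is illustrative, and the relationship itself is a rule of thumb, not a derived bound.

```python
def approx_efficiency(transitions_per_bit: float) -> float:
    """Rule-of-thumb spectral efficiency: eta_B ~ 2 / (1 + T)."""
    return 2.0 / (1.0 + transitions_per_bit)

# (name, guaranteed transitions per bit, eta_B quoted in the text)
points = [
    ("NRZ", 0.0, 2.0),
    ("8B/10B", 1 / 5, 1.6),
    ("Manchester", 1.0, 1.0),
]

for name, t, quoted in points:
    print(f"{name:<10} T={t:.2f}  approx={approx_efficiency(t):.2f}  quoted={quoted:.2f}")
```

The endpoints match exactly. For 8B/10B the rule gives about 1.67 against the quoted 1.6, which is why the relationship is stated as approximate.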

When to Favor Synchronization:

Long cable runs (clock must survive cable delay)
No external clock available
Burst/packetized communication (must lock quickly)

When to Favor Bandwidth:

Short, controlled interconnects
External clock distribution possible
Bandwidth is scarce/expensive

The Design Space

Line code selection is constrained optimization: maximize efficiency subject to minimum synchronization, maximum run length, DC balance requirement, and complexity budget. Different applications have different constraints, leading to different optimal codes.

Real-World Selection Cases

Let's apply the efficiency framework to real engineering decisions, examining how actual standards selected their line codes.

Line Code Selection in Major Standards
Standard	Line Code	Why Selected	Key Trade-offs Made
10BASE-T Ethernet	Manchester	Reliable sync over cat3 cable, transformer coupling	Accepted 50% efficiency for guaranteed transitions
100BASE-TX	4B/5B + MLT-3	Higher efficiency than Manchester, adequate sync	3-level signaling for reduced bandwidth
1000BASE-T	PAM-5 (4D)	4 pairs × 5 levels = massive parallelism	Extreme complexity for 1Gbps on cat5
10GBASE-R	64B/66B	Low overhead critical at 10Gbps	Scrambling instead of mapping for DC balance
USB 2.0 FS	NRZI + bit stuffing	Simple, adequate for 12 Mbps	Bit stuffing limits run length without lookup tables
USB 3.0	8B/10B	DC balance required for AC coupling	25% overhead acceptable; proven technology
PCIe Gen3	128B/130B	Ultra-low overhead at 8 GT/s	Scrambling for spectral properties, CRC for errors
SATA	8B/10B	Reliable operation over cables	Same as Fibre Channel origin

Case Study: Evolution of Ethernet Coding:

10BASE-T (1990): Manchester @ 10 Mbps
- Required: 20 MBaud → 10 MHz bandwidth
- 50% efficiency acceptable for 10 Mbps

100BASE-TX (1995): 4B/5B + MLT-3 @ 100 Mbps
- 125 MBaud → ~31.25 MHz bandwidth
- MLT-3 reduces bandwidth vs. binary
- Efficiency: ~80% (vs. 50% for Manchester)

1000BASE-X (1998): 8B/10B @ 1.25 Gbps
- Fiber: 1.25 GBaud → ~625 MHz bandwidth
- 8B/10B provides DC balance for fiber optics
- Efficiency: 80%

10GBASE-R (2002): 64B/66B @ 10.3125 Gbps
- ~5.15 GHz bandwidth requirements
- 25% overhead would add 2.5 Gbps
- 64B/66B saves ~2.2 Gbps

25/40/100GBASE-R: 64B/66B + RS-FEC
- Forward Error Correction added at physical layer
- Overhead: 3% encoding + ~7% FEC = ~10% total

Key Insight: As data rates increased 10,000× (10 Mbps → 100 Gbps), tolerable line-coding overhead dropped from 100% to about 3%, with FEC overhead budgeted separately. The technology evolved to meet these constraints.
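
The 10GBASE-R figures in the case study can be checked with exact arithmetic. This is a small sketch: `line_rate` is an illustrative helper, and the calculation covers block-code overhead only (no FEC or framing).

```python
from fractions import Fraction

def line_rate(data_rate_bps: int, k: int, n: int) -> Fraction:
    """Coded bit rate on the wire for a block code with k data bits per n code bits."""
    return Fraction(data_rate_bps) * n / k

data_rate = 10_000_000_000  # 10 Gbps of payload

with_8b10b = line_rate(data_rate, 8, 10)
with_64b66b = line_rate(data_rate, 64, 66)

print(f"8B/10B  line rate: {float(with_8b10b) / 1e9:.4f} Gbps")
print(f"64B/66B line rate: {float(with_64b66b) / 1e9:.4f} Gbps")
print(f"Saving:            {float(with_8b10b - with_64b66b) / 1e9:.4f} Gbps")
```

The 64B/66B result is the 10.3125 Gbps line rate quoted for 10GBASE-R, and the difference of roughly 2.2 Gbps is the saving the case study refers to.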

Legacy Constraints Matter

Many line code choices reflect historical constraints that no longer apply. 8B/10B was designed when lookup tables were expensive; today, scrambling is cheaper. But backwards compatibility often locks in older choices. SATA uses 8B/10B because it inherited Fibre Channel's physical layer—not because 8B/10B is optimal for disk drives.

Summary: Quantifying Line Code Quality

Coding efficiency provides the quantitative foundation for line code selection. With these metrics, you can make principled decisions rather than relying on intuition or tradition.

Key Takeaways

•Bit rate ≠ baud rate — Line codes may trade efficiency for features; baud rate determines bandwidth requirements.
•Spectral efficiency has limits — Nyquist bounds what's achievable; NRZ approaches the limit, others trade efficiency for other properties.
•Overhead buys features — Synchronization, DC balance, and error detection all cost something; evaluate what you get for the price.
•Multiple metrics needed — No single FOM captures everything; use spectral efficiency, transition density, run length, DC balance, and complexity together.
•Trade-offs are fundamental — Bandwidth, synchronization, and DC balance compete; every choice privileges some over others.
•Context determines optimum — Different applications have different constraints; there's no universally 'best' line code.

Module Completion:

This completes our deep dive into Line Coding - Basics. You now have a comprehensive understanding of:

NRZ-L and NRZ-I — The foundational line codes and their characteristics
Self-Synchronization — Why transitions matter for clock recovery
DC Component — How signal averages affect AC-coupled channels
Baseline Wandering — Dynamic manifestation of DC problems
Coding Efficiency — Metrics for quantitative code comparison

With this foundation, you're prepared to study more advanced line coding schemes—Manchester encoding, bipolar codes (AMI, B8ZS, HDB3), and modern block codes (8B/10B, 64B/66B)—understanding not just how they work, but why they were designed as they were.

Module Complete

Congratulations! You've mastered the fundamentals of line coding. You can now analyze any encoding scheme using the metrics and frameworks introduced here. The next modules will build on this foundation with Manchester encoding, bipolar schemes, and multilevel coding—each addressing the limitations of basic NRZ in different ways.

5 / 5

Loading learning content...

Computer NetworksLine Coding Basics

Line Coding - Basics

LevelIntermediate

Duration75 mins

TopicLine Coding Basics

5 / 5

Coding Efficiency: Measuring Line Code Performance

The Ultimate Trade-off Question

The natural question is: How do we quantify these trade-offs? How can we compare one line code against another in a rigorous, principled way? What metrics capture the essence of a good line code?

What You Will Learn

Fundamental Efficiency Concepts

Before diving into specific metrics, let's establish the fundamental concepts that underpin efficiency analysis.

Key Definitions:

Essential Terminology for Efficiency Analysis
Term	Definition	Unit	Example
Bit Rate (R_b)	Number of data bits transmitted per second	bps	1 Gbps
Baud Rate (R_s)	Number of signal changes (symbols) per second	Baud or symbols/s	500 MBaud
Symbol	One discrete signal element (can represent 1+ bits)	—	Voltage level, phase state
Signaling Rate	Same as baud rate; rate of symbol changes	Baud	—
Bandwidth (B)	Frequency range occupied by the signal	Hz	125 MHz
Overhead	Extra bits added by encoding (beyond data bits)	% or ratio	25% for 8B/10B

The Critical Distinction: Bits vs. Symbols:

Bit: A binary digit (0 or 1) representing one unit of information
Symbol: A single transmission element that represents one or more bits

In NRZ, 1 symbol = 1 bit (bit rate = baud rate). In multilevel signaling, 1 symbol can represent multiple bits. In block codes like 8B/10B, we send 10 symbols per 8 data bits.

The Efficiency Question:

How many data bits can we transmit per unit bandwidth? This captures the essence of spectral efficiency:

$$\eta = \frac{R_b}{B}\ \text{(bits/second per Hz)}$$

Higher η means more data through the same bandwidth—crucial when bandwidth is scarce or expensive.

Why Baud Rate Matters

Bit Rate to Baud Rate Ratio

The bit rate to baud rate ratio (R_b/R_s) indicates how many data bits each symbol conveys—a fundamental measure of coding efficiency.

Formula:

$$\frac{R_b}{R_s} = \log_2(L)$$

Where L is the number of distinct signal levels (for multilevel signaling).

For binary signaling (L = 2): $$\frac{R_b}{R_s} = \log_2(2) = 1$$

For 4-level signaling (L = 4): $$\frac{R_b}{R_s} = \log_2(4) = 2\text{ bits per symbol}$$

But Wait—Encoding Overhead Matters:

The above formula assumes each symbol represents pure data. Block codes add overhead:

$$\frac{R_b}{R_s} = \frac{k}{n} \times \log_2(L)$$

Where k = data bits, n = total bits (including overhead), L = signal levels.

Bit-to-Baud Ratios for Common Line Codes
Line Code	Signal Levels	k/n Ratio	R_b/R_s	Notes
NRZ-L	2	1/1	1.0	No overhead, baseline efficiency
NRZ-I	2	1/1	1.0	Same efficiency as NRZ-L
Manchester	2	1/2	0.5	50% efficiency (guaranteed transitions)
AMI	3 (but binary info)	1/1	1.0	Bipolar; DC balance for free
4B/5B	2	4/5	0.8	20% overhead for run-length limiting
8B/10B	2	8/10	0.8	25% overhead (for DC balance + sync)
PAM-4	4	1/1	2.0	2 bits per symbol, used in 100GbE
64B/66B	2	64/66	0.97	3% overhead (modern high-speed)

Interpretation:

R_b/R_s > 1: Multilevel signaling—each symbol carries multiple bits
R_b/R_s = 1: Binary signaling with no overhead (NRZ ideal case)
R_b/R_s < 1: Coding overhead—more symbols than data bits

Example Calculations:

Manchester Encoding at 10 Mbps:

Data rate: 10 Mbps
R_b/R_s = 0.5
Baud rate: R_s = R_b / 0.5 = 20 MBaud
Requires twice the bandwidth of equivalent NRZ

8B/10B at 1 Gbps:

Data rate: 1 Gbps
R_b/R_s = 0.8
Baud rate: R_s = R_b / 0.8 = 1.25 GBaud
25% higher symbol rate than data rate

The Hidden Cost of Transitions

Bandwidth Efficiency

Bandwidth efficiency (also called spectral efficiency) measures how effectively a line code uses the available frequency spectrum.

Definition:

$$\eta_B = \frac{R_b}{B}\text{ (bits/s/Hz)}$$

Where R_b is data rate and B is the bandwidth required.

Theoretical Limits:

The Nyquist theorem states that for a channel with bandwidth B, the maximum symbol rate is:

$$R_s^{max} = 2B\text{ symbols/second (for baseband)}$$

Combining with multilevel signaling:

$$\eta_B^{max} = 2\log_2(L)\text{ bits/s/Hz}$$

Bandwidth Efficiency Comparison
Line Code	Bandwidth (approx.)	η_B (bits/s/Hz)	Nyquist Efficiency (%)
NRZ-L/I	R_b / 2	2.0	100% (theoretical ideal)
Manchester	R_b	1.0	50%
AMI	R_b / 2	2.0	100% (for line rate)
4B/5B + NRZ-I	~0.625 × R_b	1.6	80%
8B/10B	~0.625 × R_b	1.6	80%
MLT-3	R_b / 4	4.0	200% (3-level, clever transitions)
PAM-4	R_b / 4	4.0	200% (4-level)

bandwidth_efficiency.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
import math
 
class LineCodingScheme:
    """Model a line coding scheme for efficiency analysis."""
    
    def __init__(self, name: str, signal_levels: int, 
                 data_bits: int, code_bits: int,
                 bandwidth_factor: float):
        """
        Args:
            name: Name of the encoding scheme
            signal_levels: Number of distinct signal levels (L)
            data_bits: Data bits per block (k)
            code_bits: Total bits per block (n)
            bandwidth_factor: Bandwidth as fraction of symbol rate
        """
        self.name = name
        self.L = signal_levels
        self.k = data_bits
        self.n = code_bits
        self.bw_factor = bandwidth_factor
        
    @property
    def bits_per_symbol(self) -> float:
        """Bits per symbol (considering multilevel)."""
        return math.log2(self.L)
    
    @property
    def code_rate(self) -> float:
        """k/n ratio (data bits per code bits)."""
        return self.k / self.n
    
    @property
    def bit_to_baud(self) -> float:
        """Data bit rate to baud rate ratio."""
        return self.code_rate * self.bits_per_symbol
    
    def symbol_rate(self, data_rate: float) -> float:
        """Symbol rate required for given data rate."""
        return data_rate / self.bit_to_baud
    
    def bandwidth(self, data_rate: float) -> float:
        """Bandwidth required for given data rate."""
        return self.symbol_rate(data_rate) * self.bw_factor
    
    def spectral_efficiency(self, data_rate: float) -> float:
        """Spectral efficiency in bits/s/Hz."""
        return data_rate / self.bandwidth(data_rate)
    
    def overhead_percent(self) -> float:
        """Encoding overhead as percentage."""
        return (1 - self.code_rate) * 100
    
    def __str__(self):
        return (f"{self.name}: {self.L}-level, {self.k}:{self.n} rate, "
                f"η={self.spectral_efficiency(1e6):.2f} bits/s/Hz, "
                f"overhead={self.overhead_percent():.1f}%")
 
# Define common line coding schemes
schemes = [
    LineCodingScheme("NRZ-L", signal_levels=2, data_bits=1, code_bits=1, 
                     bandwidth_factor=0.5),
    LineCodingScheme("Manchester", signal_levels=2, data_bits=1, code_bits=2, 
                     bandwidth_factor=0.5),
    LineCodingScheme("4B/5B", signal_levels=2, data_bits=4, code_bits=5, 
                     bandwidth_factor=0.5),
    LineCodingScheme("8B/10B", signal_levels=2, data_bits=8, code_bits=10, 
                     bandwidth_factor=0.5),
    LineCodingScheme("64B/66B", signal_levels=2, data_bits=64, code_bits=66, 
                     bandwidth_factor=0.5),
    LineCodingScheme("PAM-4", signal_levels=4, data_bits=2, code_bits=2, 
                     bandwidth_factor=0.5),
]
 
print("Line Code Efficiency Analysis")
print("=" * 60)
print(f"Target Data Rate: 10 Gbps\n")
 
data_rate = 10e9  # 10 Gbps
 
print(f"{'Scheme':<12} {'Symbol Rate':<14} {'Bandwidth':<12} {'η (b/s/Hz)':<12}")
print("-" * 60)
 
for scheme in schemes:
    sr = scheme.symbol_rate(data_rate)
    bw = scheme.bandwidth(data_rate)
    eta = scheme.spectral_efficiency(data_rate)
    print(f"{scheme.name:<12} {sr/1e9:.2f} GBaud     {bw/1e9:.2f} GHz     {eta:.2f}")

Practical vs. Theoretical Bandwidth

Coding Overhead Analysis

Coding overhead quantifies the extra bandwidth consumed by encoding beyond the bare minimum required for data transmission.

Definition:

$$Overhead\ (%) = \left(1 - \frac{k}{n}\right) \times 100$$

Where k = data bits per block, n = total transmitted bits per block.

Alternatively expressed as efficiency:

$$Efficiency\ (%) = \frac{k}{n} \times 100$$

Overhead + Efficiency = 100%

Overhead Analysis of Common Codes
Code	Data Bits (k)	Code Bits (n)	Overhead (%)	What Overhead Buys
NRZ-L/I	1	1	0%	Nothing—no synchronization, no DC balance
Manchester	1	2	100%	Guaranteed mid-bit transition, DC balance
4B/5B	4	5	25%	Max 3 consecutive zeros, improved sync
8B/10B	8	10	25%	Bounded run length, DC balance, error detection (via violations)
64B/66B	64	66	3.125%	Sync header, low overhead (with scrambling)
128B/130B	128	130	1.56%	Ultra-low overhead, sync header

The Evolution of Overhead:

Notice the historical trend:

Manchester (1980s Ethernet)    → 100% overhead
8B/10B (1990s Gigabit Ethernet) → 25% overhead  
64B/66B (2000s 10GbE)          → 3.125% overhead
128B/130B (PCIe Gen3+)         → 1.56% overhead

The Scrambling Revolution:

64B/66B achieves low overhead by using scrambling instead of carefully-designed code words:

2-bit sync header (provides framing)
64 data bits are scrambled with LFSR
Scrambling provides statistical DC balance and transition density
No lookup table needed; computationally simple

Overhead is Investment, Not Waste

Figure of Merit Comparisons

No single metric captures all aspects of line code performance. Engineers use multiple figures of merit (FOM) to compare codes across different dimensions.

Common Figures of Merit:

Line Code Evaluation Metrics

•Spectral Efficiency (η_B): Data rate per unit bandwidth—measures spectrum usage efficiency.
•Normalized Transition Density: Average transitions per bit—indicates synchronization capability.
•Maximum Run Length: Longest sequence without transition—determines PLL requirements.
•DC Free / DC Balance: Zero or bounded DC content—enables transformer coupling, AC coupling.
•Error Detection Capability: Ability to detect transmission errors through code violations.
•Implementation Complexity: Hardware/software resources for encoding and decoding.
•Power Spectral Density Shape: Frequency distribution of signal energy—affects interference and filtering.

Multi-Dimensional Line Code Comparison
Code	η_B	Transitions	Max Run	DC Free	Error Det.	Complexity
NRZ-L	2.0	0-1.0	Unlimited	No	None	Trivial
NRZ-I	2.0	0-1.0	Unlimited	No	None	Simple
Manchester	1.0	1.0-2.0	1 bit	Yes	Via edge	Simple
Diff Manchester	1.0	1.0-2.0	1 bit	Yes	Via edge	Simple
AMI	2.0	~0.5	Unlimited	Long-term	Bipolar violation	Simple
B8ZS	2.0	~0.5	8 bits	Yes	Multiple	Moderate
HDB3	2.0	~0.5	4 bits	Yes	Multiple	Moderate
4B/5B	1.6	~0.8	3 bits	No (needs NRZ-I)	Code violation	Lookup table
8B/10B	1.6	0.3-0.8	5 bits	Yes	Code violation	Lookup table
64B/66B	1.94	~0.5	~66 bits	Stat.	CRC/frame	Scrambler + header

Weighted Scoring Approach:

When selecting a line code, assign weights based on application priorities:

FOM = w1×(Spectral Efficiency) + w2×(Sync Capability) + w3×(DC Balance) 
      - w4×(Complexity) - w5×(Overhead)

Where w1 + w2 + w3 + w4 + w5 = 1 (normalized weights)

Example: Gigabit Ethernet Link

Priorities:

Spectral efficiency: Important but not critical (w1 = 0.2)
Sync capability: Essential (w2 = 0.3)
DC balance: Essential (transformer coupling) (w3 = 0.3)
Complexity: Low concern at this scale (w4 = 0.1)
Overhead: Moderate concern (w5 = 0.1)

Result: 8B/10B scores well—excellent sync and DC balance, acceptable overhead, reasonable complexity.

No Universal Winner

Trade-off Analysis Framework

Understanding how line code properties trade against each other helps engineers make informed decisions. Let's formalize these trade-offs.

Bandwidth Efficiency vs. Synchronization Trade-off:

Guaranteeing transitions requires 'spending' signal changes that could otherwise carry data.

The Spectrum:

NRZ (0 guaranteed transitions): Maximum bandwidth efficiency (η = 2.0)
8B/10B (1 per 5 bits guaranteed): Good balance (η = 1.6)
Manchester (1 per bit guaranteed): Synchronization optimized (η = 1.0)

Mathematical Relationship:

Let T = guaranteed transitions per bit. Then approximately:

$$\eta_B \approx \frac{2}{1 + T}$$

More transitions → lower efficiency. This is fundamental—you can't get both for free.
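
As a quick check of the approximation, the sketch below evaluates it at the three points on the spectrum above. The function name is hypothetical, and the formula is only the rough rule of thumb stated in the text, so small deviations from the tabulated values are expected.

```python
def eta_from_guaranteed_transitions(t):
    """Approximate spectral efficiency (bps/Hz) for t guaranteed transitions per bit."""
    return 2.0 / (1.0 + t)


# (code name, guaranteed transitions per bit, efficiency quoted in the text)
points = [
    ("NRZ", 0.0, 2.0),
    ("8B/10B", 1 / 5, 1.6),
    ("Manchester", 1.0, 1.0),
]

for name, t, quoted in points:
    approx = eta_from_guaranteed_transitions(t)
    print(f"{name:10s} T = {t:.2f}  approx = {approx:.2f}  quoted = {quoted:.2f}")
```

The end points match exactly, while 8B/10B comes out near 1.67 rather than the 1.6 given by its code rate, which shows that the relationship is an approximation rather than an identity.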

When to Favor Synchronization:

Long cable runs (a separately routed clock would skew against the data)
No external clock available
Burst/packetized communication (must lock quickly)

When to Favor Bandwidth:

Short, controlled interconnects
External clock distribution possible
Bandwidth is scarce/expensive

The Design Space

Real-World Selection Cases

Let's apply the efficiency framework to real engineering decisions, examining how actual standards selected their line codes.

Line Code Selection in Major Standards
Standard	Line Code	Why Selected	Key Trade-offs Made
10BASE-T Ethernet	Manchester	Reliable sync over Cat 3 cable, transformer coupling	Accepted 50% efficiency for guaranteed transitions
100BASE-TX	4B/5B + MLT-3	Higher efficiency than Manchester, adequate sync	3-level signaling for reduced bandwidth
1000BASE-T	PAM-5 (4D)	4 pairs × 5 levels = massive parallelism	Extreme complexity for 1 Gbps on Cat 5
10GBASE-R	64B/66B	Low overhead critical at 10 Gbps	Scrambling instead of mapping for DC balance
USB 2.0 FS	NRZI + bit stuffing	Simple, adequate for 12 Mbps	Bit stuffing limits run length without lookup tables
USB 3.0	8B/10B	DC balance required for AC coupling	25% overhead acceptable; proven technology
PCIe Gen3	128B/130B	Ultra-low overhead at 8 GT/s	Scrambling for spectral properties, CRC for errors
SATA	8B/10B	Reliable operation over cables	Reuses the proven 8B/10B code from Fibre Channel; accepts 25% overhead

Case Study: Evolution of Ethernet Coding:

10BASE-T (1990): Manchester @ 10 Mbps
- Required: 20 MBaud → 10 MHz bandwidth
- 50% efficiency acceptable for 10 Mbps

100BASE-TX (1995): 4B/5B + MLT-3 @ 100 Mbps
- 125 MBaud → ~31.25 MHz bandwidth
- MLT-3 reduces bandwidth vs. binary
- Efficiency: ~80% (vs. 50% for Manchester)

1000BASE-X (1998): 8B/10B @ 1 Gbps (1.25 Gbps line rate)
- Fiber: 1.25 GBaud → ~625 MHz bandwidth
- 8B/10B provides DC balance for fiber optics
- Efficiency: 80%

10GBASE-R (2002): 64B/66B @ 10 Gbps (10.3125 Gbps line rate)
- 10.3125 GBaud → ~5.16 GHz bandwidth
- With 8B/10B, the 25% overhead would add 2.5 Gbps (12.5 Gbps line rate)
- 64B/66B adds only 0.3125 Gbps, saving ~2.2 Gbps of line rate

25/40/100GBASE-R: 64B/66B + RS-FEC
- Forward Error Correction added at physical layer
- Overhead: 3% encoding + ~7% FEC = ~10% total

Key Insight: As data rates increased 10,000× (10 Mbps → 100 Gbps), line-coding overhead dropped from 100% (Manchester) to about 3% (64B/66B), before any FEC is added. Coding techniques evolved to meet these constraints.
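
The arithmetic behind the case study can be reproduced with a short sketch. The function names are illustrative, and the bandwidth estimate assumes the simple rule used above: the fastest repeating pattern spans 2 symbols per cycle for binary signaling and 4 symbols per cycle for MLT-3.

```python
def line_rate_baud(data_rate_bps, data_bits, code_bits):
    """Symbol rate on the wire after an mB/nB expansion (binary symbols assumed)."""
    return data_rate_bps * code_bits / data_bits


def min_bandwidth_hz(baud, symbols_per_cycle=2):
    """Fundamental frequency of the fastest repeating pattern."""
    return baud / symbols_per_cycle


# (standard, data rate in bps, data bits, code bits, symbols per cycle)
standards = [
    ("10BASE-T (Manchester)", 10e6, 1, 2, 2),
    ("100BASE-TX (4B/5B + MLT-3)", 100e6, 4, 5, 4),
    ("1000BASE-X (8B/10B)", 1e9, 8, 10, 2),
    ("10GBASE-R (64B/66B)", 10e9, 64, 66, 2),
]

for name, rate, m, n, spc in standards:
    baud = line_rate_baud(rate, m, n)
    bandwidth = min_bandwidth_hz(baud, spc)
    print(f"{name:28s} {baud / 1e6:10.2f} MBaud  {bandwidth / 1e6:9.2f} MHz")

# Line rate saved at 10 Gbps by choosing 64B/66B over 8B/10B.
saving = line_rate_baud(10e9, 8, 10) - line_rate_baud(10e9, 64, 66)
print(f"64B/66B saves {saving / 1e9:.4f} Gbps versus 8B/10B")
```

These are first-order estimates only; real links also depend on pulse shaping, channel loss, and receiver design.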

Legacy Constraints Matter

Summary: Quantifying Line Code Quality

Coding efficiency provides the quantitative foundation for line code selection. With these metrics, you can make principled decisions rather than relying on intuition or tradition.

Key Takeaways

•Bit rate ≠ baud rate — Line codes may trade efficiency for features; baud rate determines bandwidth requirements.
•Spectral efficiency has limits — Nyquist bounds what's achievable; NRZ approaches the limit, others trade efficiency for other properties.
•Overhead buys features — Synchronization, DC balance, and error detection all cost something; evaluate what you get for the price.
•Multiple metrics needed — No single FOM captures everything; use spectral efficiency, transition density, run length, DC balance, and complexity together.
•Trade-offs are fundamental — Bandwidth, synchronization, and DC balance compete; every choice privileges some over others.
•Context determines optimum — Different applications have different constraints; there's no universally 'best' line code.

Module Completion:

This completes our deep dive into Line Coding - Basics. You now have a comprehensive understanding of:

NRZ-L and NRZ-I — The foundational line codes and their characteristics
Self-Synchronization — Why transitions matter for clock recovery
DC Component — How signal averages affect AC-coupled channels
Baseline Wandering — Dynamic manifestation of DC problems
Coding Efficiency — Metrics for quantitative code comparison
