Computer NetworksData Link Layer

Framing: Byte Stuffing

LevelIntermediate

Duration75 mins

TopicData Link Layer

3 / 5

Byte Stuffing: The Complete Algorithm

Unifying Flag Bytes and Escape Sequences

In the previous two pages, we explored flag bytes for frame delimitation and escape sequences for representing reserved bytes within payload data. Now we bring these concepts together into the complete byte stuffing algorithm—the unified mechanism that enables reliable, transparent byte-oriented framing.

Byte stuffing (also called character stuffing or octet stuffing) is the process by which a transmitter inserts additional bytes into frame data to avoid confusion with protocol control bytes, and the corresponding process by which a receiver removes these inserted bytes to recover the original data.

The term "stuffing" comes from the visual metaphor of inserting (stuffing) extra bytes into the data stream, similar to stuffing padding into a cushion. Just as the padding can be removed to reveal the original cushion shape, the stuffed bytes can be removed to reveal the original data.

What You Will Master

By the end of this page, you will understand the complete byte stuffing algorithm from end to end, be able to implement production-quality stuffing and unstuffing functions, analyze the efficiency characteristics and overhead of byte stuffing, and compare byte stuffing with alternative framing approaches.

The Complete Byte Stuffing Algorithm

The byte stuffing algorithm consists of two complementary procedures: stuffing at the transmitter and unstuffing at the receiver.

Transmitter Algorithm (Stuffing):

TRANSMIT_FRAME(data):
  1. Output FLAG byte (0x7E)
  2. For each byte B in data:
     a. If B is FLAG (0x7E):
        Output ESCAPE (0x7D), then output 0x5E
     b. Else if B is ESCAPE (0x7D):
        Output ESCAPE (0x7D), then output 0x5D
     c. Else if B is in ACCM escape set:
        Output ESCAPE (0x7D), then output B XOR 0x20
     d. Else:
        Output B unchanged
  3. Output FLAG byte (0x7E)

Receiver Algorithm (Unstuffing):

RECEIVE_FRAME():
  1. Wait for FLAG byte (0x7E)
  2. Initialize empty buffer
  3. Loop:
     a. Receive next byte B
     b. If B is FLAG (0x7E):
        Return buffer as complete frame
     c. Else if B is ESCAPE (0x7D):
        Receive next byte B'
        Append (B' XOR 0x20) to buffer
     d. Else:
        Append B to buffer

Algorithm Properties:

Correctness: The unstuffing procedure perfectly inverts the stuffing procedure. For any input data D, UNSTUFF(STUFF(D)) = D.
Completeness: The algorithm handles all possible byte values (0x00-0xFF) correctly, ensuring full data transparency.
Determinism: Both procedures are deterministic; given the same input, they always produce the same output.
Single-Pass: Both stuffing and unstuffing are single-pass algorithms with O(n) time complexity, where n is the input length.
Streaming: The algorithms can process data byte-by-byte, making them suitable for resource-constrained environments.

The Key Insight:

The elegance of byte stuffing lies in the invariant it maintains:

After stuffing, the only occurrence of the flag byte (0x7E) in the byte stream is at actual frame boundaries.

This invariant ensures the receiver can unambiguously identify frame boundaries by scanning for flag bytes, regardless of what data the frame contains.

The XOR Transformation

The XOR with 0x20 serves a dual purpose: it transforms the reserved byte into a different value (removing the conflict), and it is its own inverse (XOR 0x20 twice returns the original). This mathematical property makes the encoding/decoding symmetric and eliminates the need for lookup tables during decoding.

Step-by-Step Worked Example

Let's trace through the complete byte stuffing process with a detailed example. This example will solidify your understanding of both the stuffing and unstuffing algorithms.

Original Payload Data:

Payload: [0x41, 0x7E, 0x42, 0x7D, 0x43, 0x7E, 0x7D, 0x44]
         'A'   FLAG  'B'   ESC   'C'   FLAG  ESC   'D'

This 8-byte payload contains two flag bytes (0x7E) and two escape bytes (0x7D)—requiring stuffing.

Step 1: Transmitter Stuffing

Processing each byte:

Input Byte	Hex	Action	Output Bytes
'A'	0x41	Pass through	0x41
FLAG	0x7E	Escape	0x7D, 0x5E
'B'	0x42	Pass through	0x42
ESC	0x7D	Escape	0x7D, 0x5D
'C'	0x43	Pass through	0x43
FLAG	0x7E	Escape	0x7D, 0x5E
ESC	0x7D	Escape	0x7D, 0x5D
'D'	0x44	Pass through	0x44

Complete Transmitted Frame:

[0x7E] [0x41] [0x7D 0x5E] [0x42] [0x7D 0x5D] [0x43] [0x7D 0x5E] [0x7D 0x5D] [0x44] [0x7E]
  ↑      ↑       ↑          ↑       ↑          ↑       ↑          ↑          ↑      ↑
START   'A'   escaped     'B'   escaped      'C'   escaped    escaped     'D'    END
              FLAG              ESC                FLAG        ESC

Size Analysis:

Original payload: 8 bytes
Stuffed payload: 12 bytes (4 bytes of escape sequences)
Total frame: 14 bytes (with start/end flags)
Overhead: 50% increase due to stuffing, plus 2 flag bytes

Step 2: Receiver Unstuffing

The receiver processes the frame between flags:

Input stream: [0x41] [0x7D 0x5E] [0x42] [0x7D 0x5D] [0x43] [0x7D 0x5E] [0x7D 0x5D] [0x44]

Processing with state machine:

State	Input	Action	Output	New State
NORMAL	0x41	Regular byte	0x41	NORMAL
NORMAL	0x7D	Escape detected	—	ESCAPED
ESCAPED	0x5E	XOR 0x20 = 0x7E	0x7E	NORMAL
NORMAL	0x42	Regular byte	0x42	NORMAL
NORMAL	0x7D	Escape detected	—	ESCAPED
ESCAPED	0x5D	XOR 0x20 = 0x7D	0x7D	NORMAL
NORMAL	0x43	Regular byte	0x43	NORMAL
NORMAL	0x7D	Escape detected	—	ESCAPED
ESCAPED	0x5E	XOR 0x20 = 0x7E	0x7E	NORMAL
NORMAL	0x7D	Escape detected	—	ESCAPED
ESCAPED	0x5D	XOR 0x20 = 0x7D	0x7D	NORMAL
NORMAL	0x44	Regular byte	0x44	NORMAL

Recovered Payload:

[0x41, 0x7E, 0x42, 0x7D, 0x43, 0x7E, 0x7D, 0x44]

This exactly matches the original payload!

Perfect Reconstruction

The byte stuffing algorithm guarantees perfect reconstruction. No matter what byte sequence the original data contains—including patterns that look like escape sequences themselves—the unstuffing process will always recover the exact original data.

Production-Quality Implementation

Let's develop a complete, production-quality implementation of byte stuffing that combines frame construction, CRC calculation, and escape processing. This implementation reflects real-world requirements including buffer management, error handling, and efficiency optimization.

byte_stuffing.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
"""
Complete Byte Stuffing Implementation
 
This module provides production-quality byte stuffing and unstuffing
for PPP-style framing, including CRC calculation and full frame construction.
"""
 
from typing import Optional, Tuple, List
from dataclasses import dataclass
from enum import Enum, auto
import struct
 
# CRC-16-CCITT lookup table for efficient CRC calculation
CRC16_TABLE = []
for i in range(256):
    crc = 0
    for _ in range(8):
        if (i ^ crc) & 1:
            crc = (crc >> 1) ^ 0x8408  # Reversed polynomial
        else:
            crc >>= 1
        i >>= 1
    CRC16_TABLE.append(crc)
 
 
def crc16_ccitt(data: bytes, initial: int = 0xFFFF) -> int:
    """
    Calculate CRC-16-CCITT (used by PPP/HDLC).
    
    The CRC is calculated on the unstuffed data.
    """
    crc = initial
    for byte in data:
        crc = (crc >> 8) ^ CRC16_TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFF
 
 
class ByteStuffer:
    """
    Complete byte stuffing implementation for PPP-style framing.
    
    This class handles the complete transmit path:
    1. Constructs frame with header fields
    2. Calculates and appends FCS (CRC)
    3. Performs byte stuffing on the complete frame
    4. Adds surrounding flag bytes
    """
    
    FLAG = 0x7E
    ESCAPE = 0x7D
    XOR_MASK = 0x20
    
    # Default PPP header bytes (can be compressed)
    ADDRESS = 0xFF
    CONTROL = 0x03
    
    def __init__(self, accm: int = 0xFFFFFFFF):
        """
        Initialize with Async Control Character Map.
        
        Args:
            accm: 32-bit bitmap for control character escaping
        """
        self.accm = accm
        self._build_escape_table()
    
    def _build_escape_table(self):
        """Pre-build escape lookup table for O(1) decisions."""
        self._must_escape = [False] * 256
        self._must_escape[self.FLAG] = True
        self._must_escape[self.ESCAPE] = True
        
        for i in range(32):
            if self.accm & (1 << i):
                self._must_escape[i] = True
    
    def stuff_byte(self, byte: int, output: List[int]):
        """Stuff a single byte, appending to output list."""
        if self._must_escape[byte]:
            output.append(self.ESCAPE)
            output.append(byte ^ self.XOR_MASK)
        else:
            output.append(byte)
    
    def stuff_bytes(self, data: bytes) -> bytes:
        """Stuff a sequence of bytes, returning stuffed result."""
        output = []
        for byte in data:
            self.stuff_byte(byte, output)
        return bytes(output)
    
    def create_frame(self, protocol: int, payload: bytes) -> bytes:
        """
        Create a complete PPP frame with stuffing.
        
        Args:
            protocol: PPP protocol field (e.g., 0x0021 for IP)
            payload: Raw payload data
        
        Returns:
            Complete frame including flags, ready for transmission
        """
        # Build unstuffed frame content
        frame_content = bytes([
            self.ADDRESS,
            self.CONTROL,
            (protocol >> 8) & 0xFF,  # Protocol high byte
            protocol & 0xFF,          # Protocol low byte
        ]) + payload
        
        # Calculate FCS on unstuffed content
        fcs = crc16_ccitt(frame_content)
        frame_content += struct.pack('<H', fcs)  # Little-endian FCS
        
        # Stuff the frame content
        stuffed = self.stuff_bytes(frame_content)
        
        # Add flags
        return bytes([self.FLAG]) + stuffed + bytes([self.FLAG])
    
    def calculate_stuffed_length(self, data: bytes) -> int:
        """Calculate stuffed length without actually stuffing."""
        escape_count = sum(1 for b in data if self._must_escape[b])
        return len(data) + escape_count
 
 
class UnstuffState(Enum):
    HUNT = auto()      # Looking for opening flag
    NORMAL = auto()    # Receiving normal data
    ESCAPED = auto()   # Previous byte was escape
 
 
@dataclass
class UnstuffResult:
    """Result of frame unstuffing operation."""
    success: bool
    payload: Optional[bytes]
    protocol: Optional[int]
    error: Optional[str]
 
 
class ByteUnstuffer:
    """
    Complete byte unstuffing implementation for PPP-style framing.
    
    This class handles the complete receive path:
    1. Detects frame boundaries (flags)
    2. Removes escape sequences (unstuffing)
    3. Validates FCS (CRC)
    4. Extracts protocol and payload
    """
    
    FLAG = 0x7E
    ESCAPE = 0x7D
    XOR_MASK = 0x20
    MAX_FRAME_SIZE = 4096
    
    def __init__(self):
        self.state = UnstuffState.HUNT
        self.buffer: List[int] = []
        self.stats = {
            'frames_received': 0,
            'frames_valid': 0,
            'frames_crc_error': 0,
            'frames_too_short': 0,
            'frames_too_long': 0,
        }
    
    def reset(self):
        """Reset to initial state."""
        self.state = UnstuffState.HUNT
        self.buffer.clear()
    
    def receive_byte(self, byte: int) -> Optional[UnstuffResult]:
        """
        Process a single received byte.
        
        Returns UnstuffResult when a complete frame is received,
        None otherwise.
        """
        if self.state == UnstuffState.HUNT:
            return self._handle_hunt(byte)
        elif self.state == UnstuffState.NORMAL:
            return self._handle_normal(byte)
        else:  # ESCAPED
            return self._handle_escaped(byte)
    
    def _handle_hunt(self, byte: int) -> Optional[UnstuffResult]:
        """In HUNT state: waiting for opening flag."""
        if byte == self.FLAG:
            self.buffer.clear()
            self.state = UnstuffState.NORMAL
        return None
    
    def _handle_normal(self, byte: int) -> Optional[UnstuffResult]:
        """In NORMAL state: receiving and unstuffing data."""
        if byte == self.FLAG:
            # Frame complete (or empty/consecutive flags)
            if len(self.buffer) == 0:
                # Empty frame (consecutive flags) - stay in NORMAL
                return None
            
            result = self._process_complete_frame()
            self.buffer.clear()
            # Stay in NORMAL state (this flag starts next frame)
            return result
        
        elif byte == self.ESCAPE:
            self.state = UnstuffState.ESCAPED
            return None
        
        else:
            if len(self.buffer) < self.MAX_FRAME_SIZE:
                self.buffer.append(byte)
            else:
                # Buffer overflow
                self.stats['frames_too_long'] += 1
                self.state = UnstuffState.HUNT
            return None
    
    def _handle_escaped(self, byte: int) -> Optional[UnstuffResult]:
        """In ESCAPED state: unstuff the escaped byte."""
        unstuffed = byte ^ self.XOR_MASK
        
        if len(self.buffer) < self.MAX_FRAME_SIZE:
            self.buffer.append(unstuffed)
        
        self.state = UnstuffState.NORMAL
        return None
    
    def _process_complete_frame(self) -> UnstuffResult:
        """Process a complete unstuffed frame buffer."""
        self.stats['frames_received'] += 1
        frame_data = bytes(self.buffer)
        
        # Minimum frame: Address + Control + Protocol + FCS = 6 bytes
        if len(frame_data) < 6:
            self.stats['frames_too_short'] += 1
            return UnstuffResult(
                success=False,
                payload=None,
                protocol=None,
                error=f"Frame too short: {len(frame_data)} bytes"
            )
        
        # Verify FCS (CRC should be 0xF0B8 after including FCS bytes)
        calculated_fcs = crc16_ccitt(frame_data)
        
        # The "magic" remainder for good CRC is 0xF0B8
        if calculated_fcs != 0xF0B8:
            self.stats['frames_crc_error'] += 1
            return UnstuffResult(
                success=False,
                payload=None,
                protocol=None,
                error="CRC error"
            )
        
        # Extract fields (FCS already verified, strip it)
        address = frame_data[0]
        control = frame_data[1]
        protocol = (frame_data[2] << 8) | frame_data[3]
        payload = frame_data[4:-2]  # Exclude FCS
        
        self.stats['frames_valid'] += 1
        
        return UnstuffResult(
            success=True,
            payload=payload,
            protocol=protocol,
            error=None
        )
    
    def receive_stream(self, data: bytes) -> List[UnstuffResult]:
        """
        Process a stream of bytes, returning all complete frames.
        
        Useful for batch processing captured data.
        """
        results = []
        for byte in data:
            result = self.receive_byte(byte)
            if result is not None:
                results.append(result)
        return results
 
 
def demonstrate_byte_stuffing():
    """Comprehensive demonstration of byte stuffing."""
    
    print("=" * 60)
    print("COMPLETE BYTE STUFFING DEMONSTRATION")
    print("=" * 60)
    
    # Create stuffer and unstuffer
    stuffer = ByteStuffer(accm=0x00000000)  # Minimal ACCM
    unstuffer = ByteUnstuffer()
    
    # Test payload with embedded reserved bytes
    test_payload = bytes([
        0x45, 0x00, 0x00, 0x28,  # IP header start
        0x7E, 0x7D, 0x00, 0x00,  # Contains FLAG and ESCAPE!
        0x40, 0x01, 0x00, 0x00,  # More IP header
    ])
    
    print(f"
Original payload ({len(test_payload)} bytes):")
    print(f"  {test_payload.hex(' ')}")
    print(f"  Contains FLAG (7E) and ESCAPE (7D) bytes!")
    
    # Create stuffed frame
    protocol = 0x0021  # IP
    frame = stuffer.create_frame(protocol, test_payload)
    
    print(f"
Transmitted frame ({len(frame)} bytes):")
    print(f"  {frame.hex(' ')}")
    
    # Receive and unstuff
    print(f"
Receiving and unstuffing...")
    results = unstuffer.receive_stream(frame)
    
    for result in results:
        if result.success:
            print(f"  ✓ Frame received successfully")
            print(f"    Protocol: 0x{result.protocol:04X}")
            print(f"    Payload: {result.payload.hex(' ')}")
            print(f"    Match: {result.payload == test_payload}")
        else:
            print(f"  ✗ Frame error: {result.error}")
    
    # Statistics
    print(f"
Receiver Statistics:")
    for stat, value in unstuffer.stats.items():
        print(f"  {stat}: {value}")
    
    # Overhead analysis
    print(f"
{'='*60}")
    print("OVERHEAD ANALYSIS")
    print("=" * 60)
    
    # Random data (approximates real-world)
    import random
    random.seed(42)
    random_payload = bytes(random.randint(0, 255) for _ in range(1000))
    
    full_accm_stuffer = ByteStuffer(accm=0xFFFFFFFF)
    min_accm_stuffer = ByteStuffer(accm=0x00000000)
    
    frame_full = full_accm_stuffer.create_frame(0x0021, random_payload)
    frame_min = min_accm_stuffer.create_frame(0x0021, random_payload)
    
    print(f"
1000-byte random payload:")
    print(f"  Full ACCM frame size: {len(frame_full)} bytes")
    print(f"  Min ACCM frame size:  {len(frame_min)} bytes")
    print(f"  Overhead (Full): {(len(frame_full) - len(random_payload)) / len(random_payload) * 100:.1f}%")
    print(f"  Overhead (Min):  {(len(frame_min) - len(random_payload)) / len(random_payload) * 100:.1f}%")
 
 
if __name__ == "__main__":
    demonstrate_byte_stuffing()

Efficiency Analysis

A critical aspect of any framing mechanism is its efficiency—how much overhead does it add to the original data? Let's analyze byte stuffing mathematically and empirically.

Theoretical Analysis:

For a payload of N bytes, the stuffed length L depends on how many bytes require escaping:

L = N + E

where:
  N = original payload length
  E = number of bytes requiring escaping

Each escaped byte becomes two bytes (escape byte + XOR'd byte), adding exactly one byte of overhead per escaped byte.

Probability Model:

Assuming uniformly random data (each byte value equally likely):

Probability of FLAG (0x7E): P(FLAG) = 1/256 ≈ 0.39%
Probability of ESCAPE (0x7D): P(ESCAPE) = 1/256 ≈ 0.39%
With minimal ACCM: P(escape) = 2/256 ≈ 0.78%
With full ACCM: P(escape) = 34/256 ≈ 13.28%

Expected Overhead:

E[overhead] = N × P(escape)

Minimal ACCM: E[L] = N × (1 + 2/256) ≈ 1.008N
Full ACCM:    E[L] = N × (1 + 34/256) ≈ 1.133N

Byte Stuffing Overhead by Data Type and ACCM
Data Type	Min ACCM Overhead	Full ACCM Overhead	Notes
Random binary	~0.8%	~13%	Uniform byte distribution
ASCII text	~0%	~0%	No reserved bytes in printable ASCII
Compressed data	~0.8%	~13%	Near-random distribution
Worst case	100%	100%+	All bytes are 0x7E or 0x7D
IP headers	~0.5%	~5%	Structured data, some control bytes

Worst-Case Analysis:

The absolute worst case occurs when every byte in the payload requires escaping:

Payload: [0x7E, 0x7E, 0x7E, ...] (N bytes)
Stuffed: [0x7D, 0x5E, 0x7D, 0x5E, ...] (2N bytes)

This represents 100% overhead. However, this case is extremely rare in practice:

For random data, probability of all N bytes needing escape: (2/256)^N
For N=100: essentially zero

Best-Case Analysis:

The best case occurs when no bytes require escaping:

Payload: [0x41, 0x42, 0x43, ...] (N bytes, no 0x7E/0x7D)
Stuffed: [0x41, 0x42, 0x43, ...] (N bytes, unchanged)

This is common for ASCII text and carefully constructed data.

Comparison with Fixed Overhead:

Byte stuffing has variable overhead that depends on data content. This contrasts with fixed overhead approaches like character count framing, which always add the same number of header bytes regardless of content.

Approach	Fixed Overhead	Variable Overhead
Character count	2-4 bytes	None
Byte stuffing	2 bytes (flags)	0-100%
Bit stuffing	~1.8% average	~20% worst case

Practical Efficiency

In practice, most data yields byte stuffing overhead well under 5%. Compressed and encrypted data approach the random model (~0.8% with minimal ACCM). For long-running links, negotiating minimal ACCM significantly reduces overhead compared to the default full ACCM.

Byte Stuffing vs. Bit Stuffing

While this module focuses on byte stuffing, it's essential to understand its relationship to bit stuffing, the alternative approach used in bit-oriented protocols like HDLC.

Bit Stuffing Overview:

In bit stuffing (used by HDLC and its derivatives), the flag pattern is 01111110 (same as 0x7E, but processed at the bit level). The transparency rule is:

After any sequence of five consecutive 1-bits in the data, insert a 0-bit.

The receiver performs the reverse: after seeing five consecutive 1-bits followed by a 0, it removes that 0.

Example:

Original:  01111110 11111 01111110
           FLAG     data  FLAG
                          
Bit-stuffed data: 011111010 → 0111110 10
                         ↑ inserted bit

Byte Stuffing Characteristics

•Granularity: Byte-level operations
•Implementation: Simple software, standard UARTs
•Overhead: Variable (0-100%), ~0.8% average
•Expansion: Up to 2x worst case
•Use cases: Serial links, async communication
•Examples: PPP, SLIP, BISYNC

Bit Stuffing Characteristics

•Granularity: Bit-level operations
•Implementation: Requires hardware or bit manipulation
•Overhead: Lower average (~1.8%)
•Expansion: Up to 20% worst case
•Use cases: Synchronous links, high-speed
•Examples: HDLC, LAPB, Frame Relay

Mathematical Comparison:

For bit stuffing:

Worst case: Every 5 bits followed by a 0 → 6/5 = 1.2x = 20% overhead
Average (random data): ~1 stuffed bit per 32 data bits → ~1.8% overhead

For byte stuffing (minimal ACCM):

Worst case: Every byte escaped → 2x = 100% overhead
Average (random data): ~2/256 bytes escaped → ~0.8% overhead

Why Use Byte Stuffing?

Despite bit stuffing's lower average overhead, byte stuffing remains prevalent because:

Hardware Simplicity: Standard UART chips output bytes, not bits. Bit stuffing requires additional hardware or software bit manipulation.
Debugging: Byte-oriented frames are easier to inspect in hex dumps and protocol analyzers.
Asynchronous Compatibility: Byte stuffing works naturally with asynchronous serial communication.
Legacy Systems: Enormous installed base of byte-oriented equipment.

When to Use Each:

Byte stuffing: Serial lines, modems, tunneling protocols, software implementations
Bit stuffing: High-speed synchronous links, dedicated hardware, WAN protocols

PPP's Dual Nature

PPP is defined for both asynchronous links (using byte stuffing) and synchronous links (using HDLC-style bit stuffing). The framing method is determined by the underlying physical layer. This flexibility allows PPP to serve as a universal link-layer protocol.

Implementation Optimizations

Production implementations of byte stuffing must be highly optimized, especially for high-speed links or resource-constrained embedded systems. Let's explore key optimization strategies.

1. Lookup Table for Escape Decisions:

Rather than computing "should this byte be escaped?" with bit operations, pre-compute a 256-entry lookup table:

# Initialization (once)
escape_table = [False] * 256
escape_table[0x7E] = True
escape_table[0x7D] = True
for i in range(32):  # ACCM bits
    if accm & (1 << i):
        escape_table[i] = True

# Usage (per byte) - single array lookup
if escape_table[byte]:
    # Escape needed

This trades 256 bytes of memory for O(1) escape decisions.

2. SIMD Processing:

Modern CPUs can check multiple bytes simultaneously using SIMD instructions:

// Process 16 bytes at once with SSE
__m128i data = _mm_loadu_si128(input);
__m128i flag_match = _mm_cmpeq_epi8(data, _mm_set1_epi8(0x7E));
__m128i esc_match = _mm_cmpeq_epi8(data, _mm_set1_epi8(0x7D));
__m128i needs_escape = _mm_or_si128(flag_match, esc_match);
int mask = _mm_movemask_epi8(needs_escape);

if (mask == 0) {
    // Fast path: no escaping needed, copy all 16 bytes
}

3. In-Place Unstuffing:

Since unstuffing always shrinks (or maintains) data length, it can be done in-place:

def unstuff_in_place(buffer):
    read_idx = 0
    write_idx = 0
    
    while read_idx < len(buffer):
        if buffer[read_idx] == ESCAPE:
            read_idx += 1
            buffer[write_idx] = buffer[read_idx] ^ 0x20
        else:
            buffer[write_idx] = buffer[read_idx]
        read_idx += 1
        write_idx += 1
    
    return write_idx  # New length

4. Scatter-Gather I/O:

Instead of copying data to insert escape bytes, use scatter-gather to describe the output as segments:

# Instead of copying everything:
output = flag + stuffed_header + stuffed_payload + stuffed_fcs + flag

# Use scatter-gather:
iov = [
    (flag_buffer, 1),
    (stuffed_header, len_header),
    (stuffed_payload, len_payload),
    (stuffed_fcs, len_fcs),
    (flag_buffer, 1),
]
writev(socket, iov)  # Single system call

Optimization Impact on Performance
Optimization	Speedup Factor	Memory Cost	Complexity
Lookup table	2-5x	256 bytes	Low
SIMD (SSE/AVX)	5-10x	None	Medium
In-place unstuffing	1.5-2x	None (saves memory)	Low
Scatter-gather I/O	2-3x	None	Medium
Hardware offload	100x+	Hardware	High

5. Branch Prediction Hints:

Most bytes don't need escaping. Hint this to the compiler:

if (__builtin_expect(escape_table[byte], 0)) {
    // Unlikely: escape needed
    *output++ = ESCAPE;
    *output++ = byte ^ 0x20;
} else {
    // Likely: pass through
    *output++ = byte;
}

6. Pre-calculated Frame Length:

When buffer allocation needs to be exact, scan once for escape count:

def calculate_stuffed_length(data):
    return len(data) + sum(1 for b in data if escape_table[b])

# Allocate exact buffer size
output = bytearray(2 + calculate_stuffed_length(payload))

This avoids reallocation during stuffing at the cost of an extra scan.

Profile Before Optimizing

The optimal strategy depends on your specific context: data patterns, CPU architecture, memory constraints, and throughput requirements. Always profile with realistic data before investing in complex optimizations. The lookup table approach provides excellent cost-benefit for most applications.

Error Handling and Edge Cases

Robust byte stuffing implementations must handle numerous error conditions and edge cases. Let's examine the most important ones.

1. Incomplete Escape Sequence:

A frame ends with an escape byte (0x7D) followed immediately by the end flag:

[0x7E] ... [0x7D] [0x7E]  ← Where is the escaped byte?

This is a protocol error. The receiver should:

Discard the incomplete frame
Log the error for diagnostics
Continue processing (the closing flag may be the start of a valid frame)

2. Invalid Escaped Byte:

In strict PPP, only certain escape sequences are valid. What if we receive:

[0x7D] [0x20]  ← XOR 0x20 = 0x00 (valid)
[0x7D] [0x40]  ← XOR 0x20 = 0x60 (unexpected original)

The second case produces 0x60, which is not normally escaped. Options:

Accept it (lenient): Decode as 0x60
Reject it (strict): Discard frame as malformed

Most implementations are lenient for interoperability.

3. Maximum Frame Size Exceeded:

If no flag arrives within MAX_FRAME_SIZE bytes:

[0x7E] [data...data...data...]  ← Buffer full, still no flag!

The receiver must:

Discard accumulated data
Return to HUNT state
Wait for next flag to resynchronize

Edge Cases Checklist

•Empty frame: Two consecutive flags with nothing between → Ignore (inter-frame fill)
•Zero-length payload: Valid frame with only headers and FCS → Process normally
•Maximum-length payload: Frame at MTU limit → Verify buffer handles it
•All bytes escaped: Pathological worst-case → Verify 2x buffer capacity
•Garbage before first flag: Entered mid-stream → HUNT mode discards correctly
•Flag inside escape sequence: Cannot happen (flag is always escaped) → Invariant maintained
•Multiple consecutive escapes: Each is independent → Process sequentially

4. CRC Errors After Unstuffing:

The CRC is calculated on unstuffed data. If CRC fails:

Unstuffed frame: [addr] [ctrl] [proto] [payload] [FCS]
CRC check: FAIL

This could mean:

Bit errors during transmission
Stuffing/unstuffing bug
Missed synchronization (interpreting garbage as frame)

The receiver should discard the frame, log the error, and increment an error counter.

5. Abort Sequence:

Some protocols define an "abort sequence" to forcibly terminate a frame without completing it. In HDLC/PPP:

[0x7E] [partial frame...] [0x7E 0x7E 0x7E...] [0x7E] [new frame]

Multiple consecutive flags are treated as inter-frame fill, effectively aborting any partial frame.

Defensive Implementation:

class RobustUnstuffer:
    def __init__(self):
        self.error_counts = {
            'incomplete_escape': 0,
            'buffer_overflow': 0,
            'crc_error': 0,
            'frame_too_short': 0,
        }
    
    def log_error(self, error_type, details):
        self.error_counts[error_type] += 1
        if self.error_counts[error_type] <= 10:  # Limit log spam
            logger.warning(f"Frame error: {error_type} - {details}")

Never Trust Input

In any parsing code, especially at the data link layer where data comes directly from the physical medium, assume all input is potentially malformed. Validate lengths, check for impossible states, and handle errors gracefully. A buffer overflow in frame parsing can be a security vulnerability.

Summary: The Complete Byte Stuffing Algorithm

We have now fully explored byte stuffing as the complete algorithm for byte-oriented frame delimitation with data transparency. Let's consolidate our understanding:

Key Takeaways

•Byte stuffing combines flag bytes and escape sequences into a unified framing mechanism
•The stuffing algorithm replaces reserved bytes with escape sequences; unstuffing reverses this exactly
•The key invariant: After stuffing, only frame boundaries contain the flag byte (0x7E)
•Efficiency varies with data: 0-100% overhead, typically < 1% for random data with minimal ACCM
•Production implementations use lookup tables, SIMD, and other optimizations for high throughput
•Robust error handling is essential for operation over unreliable physical media

The Module So Far:

Flag Bytes (Page 1): Frame boundary markers
Escape Sequences (Page 2): Reserved byte representation
Byte Stuffing (This Page): Complete algorithm combining both

What's Next:

The next page explores transparency in greater theoretical depth—examining the formal properties that make byte stuffing work, proving correctness, and understanding the concept of transparent channels. Following that, we'll see byte stuffing in action in a complete PPP example, tying together all the concepts with a real-world protocol walkthrough.

Page Complete

You now understand the complete byte stuffing algorithm—from theoretical foundations through production implementation. You can implement stuffing and unstuffing, analyze efficiency, and handle error conditions. Next, we'll explore the theoretical underpinnings of transparency and then see everything in action with PPP.

3 / 5

Loading learning content...

Computer NetworksData Link Layer

Framing: Byte Stuffing

LevelIntermediate

Duration75 mins

TopicData Link Layer

3 / 5

Byte Stuffing: The Complete Algorithm

Unifying Flag Bytes and Escape Sequences

What You Will Master

The Complete Byte Stuffing Algorithm

The byte stuffing algorithm consists of two complementary procedures: stuffing at the transmitter and unstuffing at the receiver.

Transmitter Algorithm (Stuffing):

TRANSMIT_FRAME(data):
  1. Output FLAG byte (0x7E)
  2. For each byte B in data:
     a. If B is FLAG (0x7E):
        Output ESCAPE (0x7D), then output 0x5E
     b. Else if B is ESCAPE (0x7D):
        Output ESCAPE (0x7D), then output 0x5D
     c. Else if B is in ACCM escape set:
        Output ESCAPE (0x7D), then output B XOR 0x20
     d. Else:
        Output B unchanged
  3. Output FLAG byte (0x7E)

Receiver Algorithm (Unstuffing):

RECEIVE_FRAME():
  1. Wait for FLAG byte (0x7E)
  2. Initialize empty buffer
  3. Loop:
     a. Receive next byte B
     b. If B is FLAG (0x7E):
        Return buffer as complete frame
     c. Else if B is ESCAPE (0x7D):
        Receive next byte B'
        Append (B' XOR 0x20) to buffer
     d. Else:
        Append B to buffer

Algorithm Properties:

Correctness: The unstuffing procedure perfectly inverts the stuffing procedure. For any input data D, UNSTUFF(STUFF(D)) = D.
Completeness: The algorithm handles all possible byte values (0x00-0xFF) correctly, ensuring full data transparency.
Determinism: Both procedures are deterministic; given the same input, they always produce the same output.
Single-Pass: Both stuffing and unstuffing are single-pass algorithms with O(n) time complexity, where n is the input length.
Streaming: The algorithms can process data byte-by-byte, making them suitable for resource-constrained environments.

The Key Insight:

The elegance of byte stuffing lies in the invariant it maintains:

After stuffing, the only occurrence of the flag byte (0x7E) in the byte stream is at actual frame boundaries.

This invariant ensures the receiver can unambiguously identify frame boundaries by scanning for flag bytes, regardless of what data the frame contains.

The XOR Transformation

Step-by-Step Worked Example

Let's trace through the complete byte stuffing process with a detailed example. This example will solidify your understanding of both the stuffing and unstuffing algorithms.

Original Payload Data:

Payload: [0x41, 0x7E, 0x42, 0x7D, 0x43, 0x7E, 0x7D, 0x44]
         'A'   FLAG  'B'   ESC   'C'   FLAG  ESC   'D'

This 8-byte payload contains two flag bytes (0x7E) and two escape bytes (0x7D)—requiring stuffing.

Step 1: Transmitter Stuffing

Processing each byte:

Input Byte	Hex	Action	Output Bytes
'A'	0x41	Pass through	0x41
FLAG	0x7E	Escape	0x7D, 0x5E
'B'	0x42	Pass through	0x42
ESC	0x7D	Escape	0x7D, 0x5D
'C'	0x43	Pass through	0x43
FLAG	0x7E	Escape	0x7D, 0x5E
ESC	0x7D	Escape	0x7D, 0x5D
'D'	0x44	Pass through	0x44

Complete Transmitted Frame:

[0x7E] [0x41] [0x7D 0x5E] [0x42] [0x7D 0x5D] [0x43] [0x7D 0x5E] [0x7D 0x5D] [0x44] [0x7E]
  ↑      ↑       ↑          ↑       ↑          ↑       ↑          ↑          ↑      ↑
START   'A'   escaped     'B'   escaped      'C'   escaped    escaped     'D'    END
              FLAG              ESC                FLAG        ESC

Size Analysis:

Original payload: 8 bytes
Stuffed payload: 12 bytes (4 bytes of escape sequences)
Total frame: 14 bytes (with start/end flags)
Overhead: 50% increase due to stuffing, plus 2 flag bytes

Step 2: Receiver Unstuffing

The receiver processes the frame between flags:

Input stream: [0x41] [0x7D 0x5E] [0x42] [0x7D 0x5D] [0x43] [0x7D 0x5E] [0x7D 0x5D] [0x44]

Processing with state machine:

State	Input	Action	Output	New State
NORMAL	0x41	Regular byte	0x41	NORMAL
NORMAL	0x7D	Escape detected	—	ESCAPED
ESCAPED	0x5E	XOR 0x20 = 0x7E	0x7E	NORMAL
NORMAL	0x42	Regular byte	0x42	NORMAL
NORMAL	0x7D	Escape detected	—	ESCAPED
ESCAPED	0x5D	XOR 0x20 = 0x7D	0x7D	NORMAL
NORMAL	0x43	Regular byte	0x43	NORMAL
NORMAL	0x7D	Escape detected	—	ESCAPED
ESCAPED	0x5E	XOR 0x20 = 0x7E	0x7E	NORMAL
NORMAL	0x7D	Escape detected	—	ESCAPED
ESCAPED	0x5D	XOR 0x20 = 0x7D	0x7D	NORMAL
NORMAL	0x44	Regular byte	0x44	NORMAL

Recovered Payload:

[0x41, 0x7E, 0x42, 0x7D, 0x43, 0x7E, 0x7D, 0x44]

This exactly matches the original payload!

Perfect Reconstruction

Production-Quality Implementation

byte_stuffing.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
"""
Complete Byte Stuffing Implementation
 
This module provides production-quality byte stuffing and unstuffing
for PPP-style framing, including CRC calculation and full frame construction.
"""
 
from typing import Optional, Tuple, List
from dataclasses import dataclass
from enum import Enum, auto
import struct
 
# CRC-16-CCITT lookup table for efficient CRC calculation
CRC16_TABLE = []
for i in range(256):
    crc = 0
    for _ in range(8):
        if (i ^ crc) & 1:
            crc = (crc >> 1) ^ 0x8408  # Reversed polynomial
        else:
            crc >>= 1
        i >>= 1
    CRC16_TABLE.append(crc)
 
 
def crc16_ccitt(data: bytes, initial: int = 0xFFFF) -> int:
    """
    Calculate CRC-16-CCITT (used by PPP/HDLC).
    
    The CRC is calculated on the unstuffed data.
    """
    crc = initial
    for byte in data:
        crc = (crc >> 8) ^ CRC16_TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFF
 
 
class ByteStuffer:
    """
    Complete byte stuffing implementation for PPP-style framing.
    
    This class handles the complete transmit path:
    1. Constructs frame with header fields
    2. Calculates and appends FCS (CRC)
    3. Performs byte stuffing on the complete frame
    4. Adds surrounding flag bytes
    """
    
    FLAG = 0x7E
    ESCAPE = 0x7D
    XOR_MASK = 0x20
    
    # Default PPP header bytes (can be compressed)
    ADDRESS = 0xFF
    CONTROL = 0x03
    
    def __init__(self, accm: int = 0xFFFFFFFF):
        """
        Initialize with Async Control Character Map.
        
        Args:
            accm: 32-bit bitmap for control character escaping
        """
        self.accm = accm
        self._build_escape_table()
    
    def _build_escape_table(self):
        """Pre-build escape lookup table for O(1) decisions."""
        self._must_escape = [False] * 256
        self._must_escape[self.FLAG] = True
        self._must_escape[self.ESCAPE] = True
        
        for i in range(32):
            if self.accm & (1 << i):
                self._must_escape[i] = True
    
    def stuff_byte(self, byte: int, output: List[int]):
        """Stuff a single byte, appending to output list."""
        if self._must_escape[byte]:
            output.append(self.ESCAPE)
            output.append(byte ^ self.XOR_MASK)
        else:
            output.append(byte)
    
    def stuff_bytes(self, data: bytes) -> bytes:
        """Stuff a sequence of bytes, returning stuffed result."""
        output = []
        for byte in data:
            self.stuff_byte(byte, output)
        return bytes(output)
    
    def create_frame(self, protocol: int, payload: bytes) -> bytes:
        """
        Create a complete PPP frame with stuffing.
        
        Args:
            protocol: PPP protocol field (e.g., 0x0021 for IP)
            payload: Raw payload data
        
        Returns:
            Complete frame including flags, ready for transmission
        """
        # Build unstuffed frame content
        frame_content = bytes([
            self.ADDRESS,
            self.CONTROL,
            (protocol >> 8) & 0xFF,  # Protocol high byte
            protocol & 0xFF,          # Protocol low byte
        ]) + payload
        
        # Calculate FCS on unstuffed content
        fcs = crc16_ccitt(frame_content)
        frame_content += struct.pack('<H', fcs)  # Little-endian FCS
        
        # Stuff the frame content
        stuffed = self.stuff_bytes(frame_content)
        
        # Add flags
        return bytes([self.FLAG]) + stuffed + bytes([self.FLAG])
    
    def calculate_stuffed_length(self, data: bytes) -> int:
        """Calculate stuffed length without actually stuffing."""
        escape_count = sum(1 for b in data if self._must_escape[b])
        return len(data) + escape_count
 
 
class UnstuffState(Enum):
    HUNT = auto()      # Looking for opening flag
    NORMAL = auto()    # Receiving normal data
    ESCAPED = auto()   # Previous byte was escape
 
 
@dataclass
class UnstuffResult:
    """Result of frame unstuffing operation."""
    success: bool
    payload: Optional[bytes]
    protocol: Optional[int]
    error: Optional[str]
 
 
class ByteUnstuffer:
    """
    Complete byte unstuffing implementation for PPP-style framing.
    
    This class handles the complete receive path:
    1. Detects frame boundaries (flags)
    2. Removes escape sequences (unstuffing)
    3. Validates FCS (CRC)
    4. Extracts protocol and payload
    """
    
    FLAG = 0x7E
    ESCAPE = 0x7D
    XOR_MASK = 0x20
    MAX_FRAME_SIZE = 4096
    
    def __init__(self):
        self.state = UnstuffState.HUNT
        self.buffer: List[int] = []
        self.stats = {
            'frames_received': 0,
            'frames_valid': 0,
            'frames_crc_error': 0,
            'frames_too_short': 0,
            'frames_too_long': 0,
        }
    
    def reset(self):
        """Reset to initial state."""
        self.state = UnstuffState.HUNT
        self.buffer.clear()
    
    def receive_byte(self, byte: int) -> Optional[UnstuffResult]:
        """
        Process a single received byte.
        
        Returns UnstuffResult when a complete frame is received,
        None otherwise.
        """
        if self.state == UnstuffState.HUNT:
            return self._handle_hunt(byte)
        elif self.state == UnstuffState.NORMAL:
            return self._handle_normal(byte)
        else:  # ESCAPED
            return self._handle_escaped(byte)
    
    def _handle_hunt(self, byte: int) -> Optional[UnstuffResult]:
        """In HUNT state: waiting for opening flag."""
        if byte == self.FLAG:
            self.buffer.clear()
            self.state = UnstuffState.NORMAL
        return None
    
    def _handle_normal(self, byte: int) -> Optional[UnstuffResult]:
        """In NORMAL state: receiving and unstuffing data."""
        if byte == self.FLAG:
            # Frame complete (or empty/consecutive flags)
            if len(self.buffer) == 0:
                # Empty frame (consecutive flags) - stay in NORMAL
                return None
            
            result = self._process_complete_frame()
            self.buffer.clear()
            # Stay in NORMAL state (this flag starts next frame)
            return result
        
        elif byte == self.ESCAPE:
            self.state = UnstuffState.ESCAPED
            return None
        
        else:
            if len(self.buffer) < self.MAX_FRAME_SIZE:
                self.buffer.append(byte)
            else:
                # Buffer overflow
                self.stats['frames_too_long'] += 1
                self.state = UnstuffState.HUNT
            return None
    
    def _handle_escaped(self, byte: int) -> Optional[UnstuffResult]:
        """In ESCAPED state: unstuff the escaped byte."""
        unstuffed = byte ^ self.XOR_MASK
        
        if len(self.buffer) < self.MAX_FRAME_SIZE:
            self.buffer.append(unstuffed)
        
        self.state = UnstuffState.NORMAL
        return None
    
    def _process_complete_frame(self) -> UnstuffResult:
        """Process a complete unstuffed frame buffer."""
        self.stats['frames_received'] += 1
        frame_data = bytes(self.buffer)
        
        # Minimum frame: Address + Control + Protocol + FCS = 6 bytes
        if len(frame_data) < 6:
            self.stats['frames_too_short'] += 1
            return UnstuffResult(
                success=False,
                payload=None,
                protocol=None,
                error=f"Frame too short: {len(frame_data)} bytes"
            )
        
        # Verify FCS (CRC should be 0xF0B8 after including FCS bytes)
        calculated_fcs = crc16_ccitt(frame_data)
        
        # The "magic" remainder for good CRC is 0xF0B8
        if calculated_fcs != 0xF0B8:
            self.stats['frames_crc_error'] += 1
            return UnstuffResult(
                success=False,
                payload=None,
                protocol=None,
                error="CRC error"
            )
        
        # Extract fields (FCS already verified, strip it)
        address = frame_data[0]
        control = frame_data[1]
        protocol = (frame_data[2] << 8) | frame_data[3]
        payload = frame_data[4:-2]  # Exclude FCS
        
        self.stats['frames_valid'] += 1
        
        return UnstuffResult(
            success=True,
            payload=payload,
            protocol=protocol,
            error=None
        )
    
    def receive_stream(self, data: bytes) -> List[UnstuffResult]:
        """
        Process a stream of bytes, returning all complete frames.
        
        Useful for batch processing captured data.
        """
        results = []
        for byte in data:
            result = self.receive_byte(byte)
            if result is not None:
                results.append(result)
        return results
 
 
def demonstrate_byte_stuffing():
    """Comprehensive demonstration of byte stuffing."""
    
    print("=" * 60)
    print("COMPLETE BYTE STUFFING DEMONSTRATION")
    print("=" * 60)
    
    # Create stuffer and unstuffer
    stuffer = ByteStuffer(accm=0x00000000)  # Minimal ACCM
    unstuffer = ByteUnstuffer()
    
    # Test payload with embedded reserved bytes
    test_payload = bytes([
        0x45, 0x00, 0x00, 0x28,  # IP header start
        0x7E, 0x7D, 0x00, 0x00,  # Contains FLAG and ESCAPE!
        0x40, 0x01, 0x00, 0x00,  # More IP header
    ])
    
    print(f"
Original payload ({len(test_payload)} bytes):")
    print(f"  {test_payload.hex(' ')}")
    print(f"  Contains FLAG (7E) and ESCAPE (7D) bytes!")
    
    # Create stuffed frame
    protocol = 0x0021  # IP
    frame = stuffer.create_frame(protocol, test_payload)
    
    print(f"
Transmitted frame ({len(frame)} bytes):")
    print(f"  {frame.hex(' ')}")
    
    # Receive and unstuff
    print(f"
Receiving and unstuffing...")
    results = unstuffer.receive_stream(frame)
    
    for result in results:
        if result.success:
            print(f"  ✓ Frame received successfully")
            print(f"    Protocol: 0x{result.protocol:04X}")
            print(f"    Payload: {result.payload.hex(' ')}")
            print(f"    Match: {result.payload == test_payload}")
        else:
            print(f"  ✗ Frame error: {result.error}")
    
    # Statistics
    print(f"
Receiver Statistics:")
    for stat, value in unstuffer.stats.items():
        print(f"  {stat}: {value}")
    
    # Overhead analysis
    print(f"
{'='*60}")
    print("OVERHEAD ANALYSIS")
    print("=" * 60)
    
    # Random data (approximates real-world)
    import random
    random.seed(42)
    random_payload = bytes(random.randint(0, 255) for _ in range(1000))
    
    full_accm_stuffer = ByteStuffer(accm=0xFFFFFFFF)
    min_accm_stuffer = ByteStuffer(accm=0x00000000)
    
    frame_full = full_accm_stuffer.create_frame(0x0021, random_payload)
    frame_min = min_accm_stuffer.create_frame(0x0021, random_payload)
    
    print(f"
1000-byte random payload:")
    print(f"  Full ACCM frame size: {len(frame_full)} bytes")
    print(f"  Min ACCM frame size:  {len(frame_min)} bytes")
    print(f"  Overhead (Full): {(len(frame_full) - len(random_payload)) / len(random_payload) * 100:.1f}%")
    print(f"  Overhead (Min):  {(len(frame_min) - len(random_payload)) / len(random_payload) * 100:.1f}%")
 
 
if __name__ == "__main__":
    demonstrate_byte_stuffing()

Efficiency Analysis

A critical aspect of any framing mechanism is its efficiency—how much overhead does it add to the original data? Let's analyze byte stuffing mathematically and empirically.

Theoretical Analysis:

For a payload of N bytes, the stuffed length L depends on how many bytes require escaping:

L = N + E

where:
  N = original payload length
  E = number of bytes requiring escaping

Each escaped byte becomes two bytes (escape byte + XOR'd byte), adding exactly one byte of overhead per escaped byte.

Probability Model:

Assuming uniformly random data (each byte value equally likely):

Probability of FLAG (0x7E): P(FLAG) = 1/256 ≈ 0.39%
Probability of ESCAPE (0x7D): P(ESCAPE) = 1/256 ≈ 0.39%
With minimal ACCM: P(escape) = 2/256 ≈ 0.78%
With full ACCM: P(escape) = 34/256 ≈ 13.28%

Expected Overhead:

E[overhead] = N × P(escape)

Minimal ACCM: E[L] = N × (1 + 2/256) ≈ 1.008N
Full ACCM:    E[L] = N × (1 + 34/256) ≈ 1.133N

Byte Stuffing Overhead by Data Type and ACCM
Data Type	Min ACCM Overhead	Full ACCM Overhead	Notes
Random binary	~0.8%	~13%	Uniform byte distribution
ASCII text	~0%	~0%	No reserved bytes in printable ASCII
Compressed data	~0.8%	~13%	Near-random distribution
Worst case	100%	100%+	All bytes are 0x7E or 0x7D
IP headers	~0.5%	~5%	Structured data, some control bytes

Worst-Case Analysis:

The absolute worst case occurs when every byte in the payload requires escaping:

Payload: [0x7E, 0x7E, 0x7E, ...] (N bytes)
Stuffed: [0x7D, 0x5E, 0x7D, 0x5E, ...] (2N bytes)

This represents 100% overhead. However, this case is extremely rare in practice:

For random data, probability of all N bytes needing escape: (2/256)^N
For N=100: essentially zero

Best-Case Analysis:

The best case occurs when no bytes require escaping:

Payload: [0x41, 0x42, 0x43, ...] (N bytes, no 0x7E/0x7D)
Stuffed: [0x41, 0x42, 0x43, ...] (N bytes, unchanged)

This is common for ASCII text and carefully constructed data.

Comparison with Fixed Overhead:

Approach	Fixed Overhead	Variable Overhead
Character count	2-4 bytes	None
Byte stuffing	2 bytes (flags)	0-100%
Bit stuffing	~1.8% average	~20% worst case

Practical Efficiency

Byte Stuffing vs. Bit Stuffing

While this module focuses on byte stuffing, it's essential to understand its relationship to bit stuffing, the alternative approach used in bit-oriented protocols like HDLC.

Bit Stuffing Overview:

In bit stuffing (used by HDLC and its derivatives), the flag pattern is 01111110 (same as 0x7E, but processed at the bit level). The transparency rule is:

After any sequence of five consecutive 1-bits in the data, insert a 0-bit.

The receiver performs the reverse: after seeing five consecutive 1-bits followed by a 0, it removes that 0.

Example:

Original:  01111110 11111 01111110
           FLAG     data  FLAG
                          
Bit-stuffed data: 011111010 → 0111110 10
                         ↑ inserted bit

Byte Stuffing Characteristics

•Granularity: Byte-level operations
•Implementation: Simple software, standard UARTs
•Overhead: Variable (0-100%), ~0.8% average
•Expansion: Up to 2x worst case
•Use cases: Serial links, async communication
•Examples: PPP, SLIP, BISYNC

Bit Stuffing Characteristics

•Granularity: Bit-level operations
•Implementation: Requires hardware or bit manipulation
•Overhead: Lower average (~1.8%)
•Expansion: Up to 20% worst case
•Use cases: Synchronous links, high-speed
•Examples: HDLC, LAPB, Frame Relay

Mathematical Comparison:

For bit stuffing:

Worst case: Every 5 bits followed by a 0 → 6/5 = 1.2x = 20% overhead
Average (random data): ~1 stuffed bit per 32 data bits → ~1.8% overhead

For byte stuffing (minimal ACCM):

Worst case: Every byte escaped → 2x = 100% overhead
Average (random data): ~2/256 bytes escaped → ~0.8% overhead

Why Use Byte Stuffing?

Despite bit stuffing's lower average overhead, byte stuffing remains prevalent because:

Hardware Simplicity: Standard UART chips output bytes, not bits. Bit stuffing requires additional hardware or software bit manipulation.
Debugging: Byte-oriented frames are easier to inspect in hex dumps and protocol analyzers.
Asynchronous Compatibility: Byte stuffing works naturally with asynchronous serial communication.
Legacy Systems: Enormous installed base of byte-oriented equipment.

When to Use Each:

Byte stuffing: Serial lines, modems, tunneling protocols, software implementations
Bit stuffing: High-speed synchronous links, dedicated hardware, WAN protocols

PPP's Dual Nature

Implementation Optimizations

Production implementations of byte stuffing must be highly optimized, especially for high-speed links or resource-constrained embedded systems. Let's explore key optimization strategies.

1. Lookup Table for Escape Decisions:

Rather than computing "should this byte be escaped?" with bit operations, pre-compute a 256-entry lookup table:

# Initialization (once)
escape_table = [False] * 256
escape_table[0x7E] = True
escape_table[0x7D] = True
for i in range(32):  # ACCM bits
    if accm & (1 << i):
        escape_table[i] = True

# Usage (per byte) - single array lookup
if escape_table[byte]:
    # Escape needed

This trades 256 bytes of memory for O(1) escape decisions.

2. SIMD Processing:

Modern CPUs can check multiple bytes simultaneously using SIMD instructions:

// Process 16 bytes at once with SSE
__m128i data = _mm_loadu_si128(input);
__m128i flag_match = _mm_cmpeq_epi8(data, _mm_set1_epi8(0x7E));
__m128i esc_match = _mm_cmpeq_epi8(data, _mm_set1_epi8(0x7D));
__m128i needs_escape = _mm_or_si128(flag_match, esc_match);
int mask = _mm_movemask_epi8(needs_escape);

if (mask == 0) {
    // Fast path: no escaping needed, copy all 16 bytes
}

3. In-Place Unstuffing:

Since unstuffing always shrinks (or maintains) data length, it can be done in-place:

def unstuff_in_place(buffer):
    read_idx = 0
    write_idx = 0
    
    while read_idx < len(buffer):
        if buffer[read_idx] == ESCAPE:
            read_idx += 1
            buffer[write_idx] = buffer[read_idx] ^ 0x20
        else:
            buffer[write_idx] = buffer[read_idx]
        read_idx += 1
        write_idx += 1
    
    return write_idx  # New length

4. Scatter-Gather I/O:

Instead of copying data to insert escape bytes, use scatter-gather to describe the output as segments:

# Instead of copying everything:
output = flag + stuffed_header + stuffed_payload + stuffed_fcs + flag

# Use scatter-gather:
iov = [
    (flag_buffer, 1),
    (stuffed_header, len_header),
    (stuffed_payload, len_payload),
    (stuffed_fcs, len_fcs),
    (flag_buffer, 1),
]
writev(socket, iov)  # Single system call

Optimization Impact on Performance
Optimization	Speedup Factor	Memory Cost	Complexity
Lookup table	2-5x	256 bytes	Low
SIMD (SSE/AVX)	5-10x	None	Medium
In-place unstuffing	1.5-2x	None (saves memory)	Low
Scatter-gather I/O	2-3x	None	Medium
Hardware offload	100x+	Hardware	High

5. Branch Prediction Hints:

Most bytes don't need escaping. Hint this to the compiler:

if (__builtin_expect(escape_table[byte], 0)) {
    // Unlikely: escape needed
    *output++ = ESCAPE;
    *output++ = byte ^ 0x20;
} else {
    // Likely: pass through
    *output++ = byte;
}

6. Pre-calculated Frame Length:

When buffer allocation needs to be exact, scan once for escape count:

def calculate_stuffed_length(data):
    return len(data) + sum(1 for b in data if escape_table[b])

# Allocate exact buffer size
output = bytearray(2 + calculate_stuffed_length(payload))

This avoids reallocation during stuffing at the cost of an extra scan.

Profile Before Optimizing

Error Handling and Edge Cases

Robust byte stuffing implementations must handle numerous error conditions and edge cases. Let's examine the most important ones.

1. Incomplete Escape Sequence:

A frame ends with an escape byte (0x7D) followed immediately by the end flag:

[0x7E] ... [0x7D] [0x7E]  ← Where is the escaped byte?

This is a protocol error. The receiver should:

Discard the incomplete frame
Log the error for diagnostics
Continue processing (the closing flag may be the start of a valid frame)

2. Invalid Escaped Byte:

In strict PPP, only certain escape sequences are valid. What if we receive:

[0x7D] [0x20]  ← XOR 0x20 = 0x00 (valid)
[0x7D] [0x40]  ← XOR 0x20 = 0x60 (unexpected original)

The second case produces 0x60, which is not normally escaped. Options:

Accept it (lenient): Decode as 0x60
Reject it (strict): Discard frame as malformed

Most implementations are lenient for interoperability.

3. Maximum Frame Size Exceeded:

If no flag arrives within MAX_FRAME_SIZE bytes:

[0x7E] [data...data...data...]  ← Buffer full, still no flag!

The receiver must:

Discard accumulated data
Return to HUNT state
Wait for next flag to resynchronize

Edge Cases Checklist

•Empty frame: Two consecutive flags with nothing between → Ignore (inter-frame fill)
•Zero-length payload: Valid frame with only headers and FCS → Process normally
•Maximum-length payload: Frame at MTU limit → Verify buffer handles it
•All bytes escaped: Pathological worst-case → Verify 2x buffer capacity
•Garbage before first flag: Entered mid-stream → HUNT mode discards correctly
•Flag inside escape sequence: Cannot happen (flag is always escaped) → Invariant maintained
•Multiple consecutive escapes: Each is independent → Process sequentially

4. CRC Errors After Unstuffing:

The CRC is calculated on unstuffed data. If CRC fails:

Unstuffed frame: [addr] [ctrl] [proto] [payload] [FCS]
CRC check: FAIL

This could mean:

Bit errors during transmission
Stuffing/unstuffing bug
Missed synchronization (interpreting garbage as frame)

The receiver should discard the frame, log the error, and increment an error counter.

5. Abort Sequence:

Some protocols define an "abort sequence" to forcibly terminate a frame without completing it. In HDLC/PPP:

[0x7E] [partial frame...] [0x7E 0x7E 0x7E...] [0x7E] [new frame]

Multiple consecutive flags are treated as inter-frame fill, effectively aborting any partial frame.

Defensive Implementation:

class RobustUnstuffer:
    def __init__(self):
        self.error_counts = {
            'incomplete_escape': 0,
            'buffer_overflow': 0,
            'crc_error': 0,
            'frame_too_short': 0,
        }
    
    def log_error(self, error_type, details):
        self.error_counts[error_type] += 1
        if self.error_counts[error_type] <= 10:  # Limit log spam
            logger.warning(f"Frame error: {error_type} - {details}")

Never Trust Input

Summary: The Complete Byte Stuffing Algorithm

We have now fully explored byte stuffing as the complete algorithm for byte-oriented frame delimitation with data transparency. Let's consolidate our understanding:

Key Takeaways

•Byte stuffing combines flag bytes and escape sequences into a unified framing mechanism
•The stuffing algorithm replaces reserved bytes with escape sequences; unstuffing reverses this exactly
•The key invariant: After stuffing, only frame boundaries contain the flag byte (0x7E)
•Efficiency varies with data: 0-100% overhead, typically < 1% for random data with minimal ACCM
•Production implementations use lookup tables, SIMD, and other optimizations for high throughput
•Robust error handling is essential for operation over unreliable physical media

The Module So Far:

Flag Bytes (Page 1): Frame boundary markers
Escape Sequences (Page 2): Reserved byte representation
Byte Stuffing (This Page): Complete algorithm combining both

What's Next:

Page Complete

3 / 5