Loading learning content...
You've mastered polynomial division and understand why generator polynomials are carefully chosen. Now comes the practical application: how do we actually compute a CRC value?
The sender takes a raw message, applies the CRC algorithm, and produces a protected codeword—the original data plus the appended CRC bits. This codeword has a special property: it is exactly divisible by the generator polynomial. The receiver can then verify this divisibility to detect any transmission errors.
This page walks through the complete CRC calculation procedure with multiple worked examples, explaining both the conceptual framework and practical implementation details.
By the end of this page, you will understand the complete CRC calculation algorithm step by step, perform CRC calculation by hand using polynomial division, understand why we shift the message before division, recognize the mathematical identity that makes CRC verification work, and appreciate the elegance of hardware CRC implementation.
Let's establish the formal procedure for CRC calculation. Given:
The Goal:
Construct T(x) such that:
The Solution:
Append r bits (the CRC value) to M(x) to form T(x). These bits are chosen such that T(x) mod G(x) = 0.
The Algorithm:
Multiply M(x) by x^r — This is equivalent to appending r zero bits to the message, creating space for the CRC
Divide by G(x) — Compute: x^r · M(x) mod G(x) to get remainder R(x)
Form the codeword — T(x) = x^r · M(x) + R(x)
Mathematical Justification:
Why does this work? Let's verify:
$$T(x) = x^r \cdot M(x) + R(x)$$
Divide by G(x): $$\frac{T(x)}{G(x)} = \frac{x^r \cdot M(x) + R(x)}{G(x)}$$
We know that $x^r \cdot M(x) = Q(x) \cdot G(x) + R(x)$ (from the division in step 2).
Therefore: $$\frac{T(x)}{G(x)} = \frac{Q(x) \cdot G(x) + R(x) + R(x)}{G(x)} = Q(x) + \frac{2 \cdot R(x)}{G(x)}$$
But in GF(2), 2 = 0 (since 1 + 1 = 0), so: $$\frac{T(x)}{G(x)} = Q(x) + 0 = Q(x)$$
The remainder is zero! T(x) is exactly divisible by G(x). ✓
The magic happens because adding R(x) twice gives zero in GF(2). When we append R(x) to the shifted message, we're 'correcting' the remainder to zero. In binary: if the remainder of (message with r zeros) is R, then XORing R into those trailing r positions produces a value with remainder 0.
Let's walk through a complete CRC calculation with a concrete example.
Given:
Step 1: Append r Zero Bits
Shift M left by r = 3 positions:
$$x^3 \cdot M(x) = 110101 \text{ followed by } 000 = 110101000$$
This represents M(x) · x³, making room for the 3-bit CRC.
Step 2: Divide by G(x)
Perform polynomial division: 110101000 ÷ 1011
110010 ← Quotient (we don't need this)
───────
1011 │ 110101000
1011 ← XOR (leading 1)
────
01010 ← Result: 10101
1011 ← XOR (leading 1)
────
00111 ← Result: 1110
1011 ← XOR (leading 1)
────
01010 ← Result: 10100
1011 ← XOR (leading 1)
────
0010 ← Result: 0100
000 ← Don't XOR (leading 0)
───
100 ← Remainder = 100
Step 3: Form the Codeword
T(x) = 110101000 ⊕ 000000100 = 110101100
Or equivalently: Original message (110101) + CRC (100) = 110101100
The transmitted codeword is: 110101100
In practice, you can simply replace the trailing r zeros with the computed remainder R. Since the positions are the same and R is the XOR of zero with R, the result is just R placed in those positions: 110101|000 becomes 110101|100.
Verification:
Let's verify that 110101100 is exactly divisible by 1011:
1011 │ 110101100
1011
────
10101
1011
────
01111 → 11110
1011
────
01000 → 10000
1011
────
0110 → 1100
1011
────
0111 → 111
000
───
111
Do I have an error? Let me redo...
Actually, let me recalculate step 2 more carefully.
Step 2 (Recalculated Carefully):
Divide 110101000 by 1011:
Position: 876543210
Dividend: 110101000
Divisor: 1011
Step 1: Align at position 8
110101000
1011 (positions 8-5)
─────
0110|01000 → 11001000 (after removing leading 0)
Step 2: Align at position 7
11001000
1011 (positions 7-4)
─────
01111000 → 1111000
Step 3: Align at position 6
1111000
1011 (positions 6-3)
─────
0100000 → 100000
Step 4: Align at position 5
100000
1011 (positions 5-2)
─────
0011|00 → 1100
Step 5: Align at position 3
1100
1011 (positions 3-0)
─────
0111 → 111
Remainder = 111 (3 bits, as expected for degree-3 divisor)
Corrected: CRC = 111
Codeword T = 110101111
The most common error in manual CRC calculation is misalignment. Always align the divisor's leading 1 with the dividend's leading 1. After each XOR, remove leading zeros and continue. Track your position carefully!
Let's do a fresh, complete example with careful tracking.
Given:
Sender Side:
Step 1: Append r = 3 zeros to M:
M' = 1101 000 = 1101000 (7 bits)
Step 2: Divide M' by G:
1101000 (7 bits)
÷ 1011 (divisor)
───────────
Align divisor with leading bit of dividend:
1101000
1011 (XOR at positions 6-3)
────
0110000 → 110000
110000
1011 (XOR at positions 5-2)
────
011100 → 11100
11100
1011 (XOR at positions 4-1)
────
01010 → 1010
1010
1011 (XOR at positions 3-0)
────
0001 → 001
Remainder R = 001
Step 3: Form codeword T = M' + R:
T = 1101000 + 0000001 = 1101000 XOR 0000001 = 1101001
Equivalently: 1101 | 001
Transmitted Codeword: 1101001
Receiver Side Verification:
Receive T = 1101001. Divide by G = 1011:
1101001
1011 (XOR)
────
0110001 → 110001
110001
1011 (XOR)
────
011101 → 11101
11101
1011 (XOR)
────
01011 → 1011
1011
1011 (XOR)
────
0000
Remainder = 000 ✓
Zero remainder confirms no errors detected!
Sender: M=1101, G=1011 → append 000 → divide by G → R=001 → T=1101001. Receiver: T=1101001 ÷ G=1011 → R=000 → No error! This demonstrates the fundamental CRC principle: a correctly computed codeword is exactly divisible by the generator.
| Step | Operation | Input | Output |
|---|---|---|---|
| 1 | Append r zeros | M = 1101 | M' = 1101000 |
| 2 | Divide M' by G | 1101000 ÷ 1011 | R = 001 |
| 3 | XOR remainder | 1101000 ⊕ 001 | T = 1101001 |
| Verify | Divide T by G | 1101001 ÷ 1011 | R = 000 ✓ |
Let's work through a longer example using a degree-4 polynomial (similar in complexity to real CRCs, but manageable by hand).
Given:
Step 1: Append 4 zeros:
M' = 10110011 0000 = 101100110000 (12 bits)
Step 2: Divide M' by G:
101100110000
10011 (align at pos 11)
─────
001110110000 → 1110110000
1110110000
10011 (align at pos 9)
─────
0111010000 → 111010000
111010000
10011 (align at pos 8)
─────
011100000 → 11100000
11100000
10011 (align at pos 7)
─────
01111000 → 1111000
1111000
10011 (align at pos 6)
─────
0110100 → 110100
110100
10011 (align at pos 5)
─────
010010 → 10010
10010
10011 (align at pos 4)
─────
00001 → 0001
Remainder R = 0001 (4 bits)
Step 3: Form codeword:
T = 101100110000 + 0001 = 101100110001
Transmitted: 10110011|0001 (8-bit message + 4-bit CRC)
Verification:
101100110001
10011
─────
001110110001 → 1110110001
1110110001
10011
─────
0111010001 → 111010001
111010001
10011
─────
011100001 → 11100001
11100001
10011
─────
01111001 → 1111001
1111001
10011
─────
0110101 → 110101
110101
10011
─────
010001 → 10001
10001
10011
─────
00010 → 0010
0010 < 10011 (too small to divide)
Final Remainder = 0010 ≠ 0000
Hmm, let me recheck step 2...
Recalculation of Step 2:
Divisor: 10011 (5 bits)
Dividend: 101100110000 (12 bits)
Round 1: bits [11:7] = 10110, XOR with 10011
10110 ⊕ 10011 = 00101 → 01011 (bring next bit)
Result so far: 01011 → working with: 10110 (after shift)
Actually, let me use a cleaner method:
Bit-by-bit with register:
Initial: Process "101100110000"
I'll use the standard shift-and-XOR approach:
Register (4 bits): 0000
Poly (without MSB): 0011
Process bit sequence from left:
Bit 1: Input=1, Reg=0000, MSB=0, Shift→0001, No XOR (MSB was 0)
Bit 1: Input=0, Reg=0001, MSB=0, Shift→0010, No XOR
Bit 1: Input=1, Reg=0010, MSB=0, Shift→0101, No XOR
Bit 1: Input=1, Reg=0101, MSB=0, Shift→1011, No XOR
Bit 0: Input=0, Reg=1011, MSB=1, Shift→0110, XOR 0011→0101
Bit 0: Input=0, Reg=0101, MSB=0, Shift→1010, No XOR
Bit 1: Input=1, Reg=1010, MSB=1, Shift→0101, XOR 0011→0110
Bit 1: Input=1, Reg=0110, MSB=0, Shift→1101, No XOR
Bit 0: Input=0, Reg=1101, MSB=1, Shift→1010, XOR 0011→1001
Bit 0: Input=0, Reg=1001, MSB=1, Shift→0010, XOR 0011→0001
Bit 0: Input=0, Reg=0001, MSB=0, Shift→0010, No XOR
Bit 0: Input=0, Reg=0010, MSB=0, Shift→0100, No XOR
Final Register (CRC) = 0100
CRC = 0100
Codeword T = 10110011 0100 = 101100110100
The shift register method processes one bit at a time: (1) Shift register left, bringing in the next data bit. (2) If the bit shifted OUT was 1, XOR the register with the polynomial (minus its leading 1). (3) Repeat for all bits. This is exactly how hardware implements CRC!
CRC's elegance shines in hardware. A Linear Feedback Shift Register (LFSR) computes CRC one bit at a time using only:
The LFSR Structure:
For generator G(x) = x^r + g_{r-1}x^{r-1} + ... + g₁x + 1:
Feedback ←────────────────┐
│
┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │
───►│ XOR ├──►│ D ├──►│ D ├─ ... ─│ D ├────┘
└─────┘ └──┬──┘ └──┬──┘ └──┬──┘
│ │g₁ │g₂ │g_{r-1}
│ ┌───┴───┐ ┌───┴───┐ ┌───┴───┐
│ │ XOR │ │ XOR │ │ XOR │
│ └───────┘ └───────┘ └───────┘
└─────────────────────────────────────►
(only connects where gᵢ = 1)
The XOR gates are placed between flip-flops at positions where the polynomial has a coefficient of 1.
Example: G(x) = x³ + x + 1 (1011)
Polynomial terms: x³, x¹, x⁰ (constant)
XOR feedback at position 1 (for the x term):
┌──────────────────┐
│ Feedback │
▼ │
Data ──►┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐──►
In │ XOR ├─►│ D₀ ├─►│ XOR ├─►│ D₁ ├─►│ D₂ │─►Out
└─────┘ └─────┘ └──┬──┘ └─────┘ └─────┘
│ ▲
└─────────────────┘
(XOR for x term)
Operation:
LFSR CRC runs at one bit per clock cycle. For a 100 MHz clock, that's 100 Mbps. For higher data rates (10 Gbps+), parallel implementations process multiple bits per cycle using unrolled or table-based techniques. Hardware CRC in Ethernet NICs runs at full wire speed with negligible latency.
While hardware uses shift registers, software implementations optimize differently. The main techniques are:
1. Bit-by-Bit (Direct Implementation)
Process one bit at a time, mirroring the hardware approach:
uint32_t crc_bitwise(uint8_t *data, size_t len, uint32_t poly) {
uint32_t crc = 0xFFFFFFFF; // Common initialization
for (size_t i = 0; i < len; i++) {
crc ^= (uint32_t)data[i] << 24; // Align byte to MSB
for (int bit = 0; bit < 8; bit++) {
if (crc & 0x80000000) // Check MSB
crc = (crc << 1) ^ poly; // Shift and XOR
else
crc <<= 1; // Just shift
}
}
return crc ^ 0xFFFFFFFF; // Final XOR (common)
}
This is slow: 8 branches per byte, hard to pipeline.
2. Table-Driven (Byte-at-a-Time)
Precompute a 256-entry table where entry i gives the CRC contribution of byte value i:
uint32_t crc_table[256]; // Precomputed
void init_crc_table(uint32_t poly) {
for (int i = 0; i < 256; i++) {
uint32_t crc = i << 24;
for (int bit = 0; bit < 8; bit++) {
crc = (crc << 1) ^ ((crc & 0x80000000) ? poly : 0);
}
crc_table[i] = crc;
}
}
uint32_t crc_table_driven(uint8_t *data, size_t len) {
uint32_t crc = 0xFFFFFFFF;
for (size_t i = 0; i < len; i++) {
uint8_t index = (crc >> 24) ^ data[i];
crc = (crc << 8) ^ crc_table[index];
}
return crc ^ 0xFFFFFFFF;
}
This processes 8 bits per iteration with a single table lookup—8x faster than bit-by-bit.
3. Slicing-by-N (Multiple Tables)
Process N bytes at once using N tables. 'Slicing-by-8' processes 8 bytes per iteration:
4. Hardware Intrinsics (CRC32C)
Modern x86_64 CPUs have PCLMULQDQ (carry-less multiplication) and CRC32 instructions:
// Intel CRC32C intrinsic (for iSCSI/SCTP polynomial)
uint32_t crc = _mm_crc32_u64(crc, *((uint64_t*)data));
This computes 8 bytes of CRC in a single cycle—fastest possible.
| Method | Speed (cycles/byte) | Table Size | Notes |
|---|---|---|---|
| Bit-by-bit | ~50-100 | 0 | Simple, slow |
| Byte table | ~10-15 | 1 KB | Good balance |
| Slicing-by-4 | ~4-6 | 4 KB | Cache-efficient |
| Slicing-by-8 | ~2-3 | 8 KB | Fast on modern CPUs |
| Hardware CRC32C | ~0.3-1 | 0 | If available, best option |
For CRC-32C (Castagnoli), use hardware instructions if available—they're 10-100x faster. For CRC-32 (Ethernet), table-driven is typically fastest in software. For embedded systems with limited memory, bit-by-bit or small tables may be necessary.
Real CRC implementations include several variations from the 'pure' algorithm. Understanding these is essential for interoperability.
1. Initial Value (Seed)
Instead of starting with register = 0, many standards initialize to all 1s (0xFFFFFFFF for CRC-32).
Why? An all-zeros message with all-zeros CRC is technically valid. Initializing to 1s ensures that leading zeros in the message affect the CRC.
2. Final XOR (Complement)
After processing, XOR the result with a constant (often all 1s).
Why? Combined with 0xFFFFFFFF initialization, this ensures that appending zeros to a message doesn't leave the CRC unchanged. It also provides some defense against implementation bugs.
3. Bit and Byte Reflection
Some standards reflect (reverse) bits within each byte, or reflect the entire final CRC.
Why? Historical reasons—some early hardware processed LSB first (UART-style). 'Reflected' CRCs become standard for compatibility.
| Standard | Poly (Hex) | Init | RefIn | RefOut | XorOut |
|---|---|---|---|---|---|
| CRC-16-CCITT | 0x1021 | 0xFFFF | No | No | 0x0000 |
| CRC-16-IBM | 0x8005 | 0x0000 | Yes | Yes | 0x0000 |
| CRC-32 (Ethernet) | 0x04C11DB7 | 0xFFFFFFFF | Yes | Yes | 0xFFFFFFFF |
| CRC-32C | 0x1EDC6F41 | 0xFFFFFFFF | Yes | Yes | 0xFFFFFFFF |
RefIn (Input Reflection): Process each byte LSB-first instead of MSB-first.
RefOut (Output Reflection): Reverse all bits of the final CRC before output.
Example: For CRC-32 (Ethernet), you:
These details are critical for interoperability. Using the wrong reflection or init value produces a different (and incompatible) CRC.
CRC implementation bugs are notoriously subtle. The ASCII string 'check_value' (123456789) has standardized CRC values. CRC-32 (Ethernet) of '123456789' = 0xCBF43926. Always verify your implementation against published test vectors before deployment.
We've covered the complete CRC calculation procedure, from conceptual framework to practical implementation. Let's consolidate:
What's Next:
With CRC calculation mastered, we turn to the receiver's perspective: CRC verification. How does the receiver confirm that received data is error-free? As we'll see, there are two equivalent approaches—recomputing and comparing the CRC, or dividing the entire codeword and checking for zero remainder.
You can now compute CRC values by hand and understand how both hardware and software implementations work. The polynomial division algorithm, augmented with practical variations like initialization and reflection, powers error detection in every Ethernet frame, USB packet, and file archive you use. Next, we explore the verification side of the CRC handshake.