In 1950, Richard Hamming was working at Bell Labs when he became frustrated with the unreliable machines of his era. When weekend operators encountered errors in his programs—errors his code had no way to recover from—the machines would simply abort his calculations. Out of this frustration came one of the most elegant concepts in information theory: a way to mathematically measure how "far apart" two pieces of data are.
This measure, now called Hamming distance, is deceptively simple: it counts the number of positions where two binary strings differ. Yet this simple counting unlocks a profound understanding of error detection and correction. Every error control code—from simple parity to sophisticated Reed-Solomon—derives its power from the Hamming distances between its codewords.
By the end of this page, you will understand the mathematical definition of Hamming distance, how to calculate it between any two binary strings, why it provides a unified framework for understanding all error control codes, and how the distances between codewords determine what errors can be detected and corrected.
The Hamming distance between two binary strings of equal length is the number of positions at which the corresponding bits differ. Named after Richard Hamming, this metric provides a rigorous mathematical framework for understanding how transmission errors affect data.
Formal Definition:
Given two binary strings a and b of length n, the Hamming distance d(a, b) is:
$$d(a, b) = \sum_{i=1}^{n} (a_i \oplus b_i)$$
Where ⊕ denotes the XOR (exclusive OR) operation. In simpler terms: XOR the two strings bit-by-bit, then count the number of 1s in the result.
XOR produces a 1 only where the input bits differ. This means XOR'ing two strings highlights exactly the positions where they disagree. Counting the 1s in the XOR result gives the Hamming distance. This is the most efficient way to compute Hamming distance in practice.
Worked Example:
Let's calculate the Hamming distance between 10110101 and 10011001:
```
String a:  1 0 1 1 0 1 0 1
String b:  1 0 0 1 1 0 0 1
           ───────────────
a XOR b:   0 0 1 0 1 1 0 0
               ↑   ↑ ↑
```
Counting the 1s: positions 3, 5, and 6 differ.
d(10110101, 10011001) = 3
This means converting 10110101 into 10011001 requires exactly 3 bit flips. Equivalently stated: if transmission errors flip exactly these 3 bits, the first string becomes the second.
| String A | String B | XOR Result | Hamming Distance |
|---|---|---|---|
| 0000 | 0000 | 0000 | 0 (identical) |
| 0000 | 1111 | 1111 | 4 (opposite) |
| 1010 | 1001 | 0011 | 2 |
| 11100 | 00111 | 11011 | 4 |
| 1101011 | 1001001 | 0100010 | 2 |
Hamming distance isn't just a counting measure; it's a true metric in the mathematical sense, satisfying the three axioms required of any distance function on the space of binary strings:

- **Identity:** d(a, b) = 0 if and only if a = b
- **Symmetry:** d(a, b) = d(b, a)
- **Triangle inequality:** d(a, c) ≤ d(a, b) + d(b, c) for any string b

Understanding these properties reveals why Hamming distance works so well for error analysis.
Why the Triangle Inequality Matters:
The triangle inequality has profound implications for error correction. Suppose two codewords A and B are far apart, and errors corrupt A into a received word r. The triangle inequality gives d(A, B) ≤ d(A, r) + d(r, B), which rearranges to d(r, B) ≥ d(A, B) − d(A, r): if only a few errors occurred, r must still be far from B. This guarantee is what allows code designers to ensure that a lightly corrupted codeword is never mistaken for a different one.
Bounds on Hamming Distance:
For two binary strings of length n:

$$0 \le d(a, b) \le n$$

The distance is 0 exactly when the strings are identical, and n exactly when they are bitwise complements. This means Hamming distance falls within [0, n] for n-bit strings.
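These properties are small enough to verify by brute force. The following sketch (the helper name `d` and the choice n = 4 are mine, purely for illustration) exhaustively checks the axioms and the bounds over every pair of 4-bit strings:

```python
from itertools import product

def d(a: str, b: str) -> int:
    """Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))

# All 16 binary strings of length 4.
strings = ["".join(bits) for bits in product("01", repeat=4)]

for a in strings:
    for b in strings:
        assert (d(a, b) == 0) == (a == b)  # identity: d = 0 iff equal
        assert d(a, b) == d(b, a)          # symmetry
        assert 0 <= d(a, b) <= 4           # bounds: [0, n]
        for c in strings:
            assert d(a, c) <= d(a, b) + d(b, c)  # triangle inequality

print("All metric axioms hold for every 4-bit string.")
```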
The set of all n-bit binary strings, equipped with Hamming distance as the metric, forms a mathematical structure called Hamming space (or Hamming cube). In this space, each string is a point, and the distance between any two points is their Hamming distance. Error control coding is essentially the art of choosing codewords that are far apart in this space.
Understanding Hamming distance becomes much more intuitive when we visualize it geometrically. For small values of n, we can represent all possible n-bit strings as vertices of an n-dimensional hypercube, where edges connect strings that differ by exactly 1 bit.
The 3-bit Hamming Cube:
Consider all 3-bit strings: 000, 001, 010, 011, 100, 101, 110, 111. These form the vertices of a cube. Two vertices are connected by an edge if and only if their Hamming distance is 1.
Interpreting the Cube:
The shortest path between any two vertices (counting edges) equals their Hamming distance. To get from 000 to 111, you must flip 3 bits, so you must traverse at least 3 edges.
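This shortest-path claim can be checked directly by treating the cube as a graph and running breadth-first search. A minimal sketch, with function names of my own choosing:

```python
from collections import deque
from itertools import product

# Vertices of the 3-bit Hamming cube.
vertices = ["".join(bits) for bits in product("01", repeat=3)]

def neighbors(v: str):
    """Strings reachable from v by flipping exactly one bit."""
    for i in range(len(v)):
        yield v[:i] + ("1" if v[i] == "0" else "0") + v[i + 1:]

def shortest_path_length(start: str, goal: str) -> int:
    """Number of edges on a shortest path (breadth-first search)."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        v, dist = queue.popleft()
        if v == goal:
            return dist
        for w in neighbors(v):
            if w not in seen:
                seen.add(w)
                queue.append((w, dist + 1))

# Path length on the cube agrees with Hamming distance for every pair.
for a in vertices:
    for b in vertices:
        assert shortest_path_length(a, b) == sum(x != y for x, y in zip(a, b))

print(shortest_path_length("000", "111"))  # 3: flip all three bits
```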
Error as Movement in Hamming Space:
When transmission errors occur, they move the received string from one point in Hamming space to another. A single-bit error moves exactly one edge. A 2-bit error moves two edges. This geometric perspective reveals the core insight of error control:
If valid codewords are sufficiently far apart in Hamming space, any small number of errors cannot turn one valid codeword into another—the error is detectable.
A "sphere" of radius r around a codeword in Hamming space contains all strings within Hamming distance ≤ r of that codeword. If these spheres don't overlap, any received string can be uniquely decoded to the nearest codeword—this is the basis of error correction.
In practice, Hamming distance computation needs to be fast. Network equipment, storage controllers, and CPUs implement these calculations in hardware. Understanding the underlying algorithms helps appreciate both the theory and its real-world implementation.
```python
def hamming_distance(a: int, b: int) -> int:
    """
    Calculate Hamming distance between two integers.
    Uses XOR to find differing bits, then counts the 1s.
    This is the most efficient software approach.
    """
    xor_result = a ^ b  # Bits that differ are 1

    # Count the number of 1 bits (population count)
    distance = 0
    while xor_result:
        distance += xor_result & 1  # Check lowest bit
        xor_result >>= 1            # Shift right
    return distance


def hamming_distance_strings(s1: str, s2: str) -> int:
    """
    Calculate Hamming distance between two binary strings.
    Strings must be equal length.
    """
    if len(s1) != len(s2):
        raise ValueError("Strings must have equal length")
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


# Examples
print(hamming_distance(0b10110101, 0b10011001))          # Output: 3
print(hamming_distance_strings("10110101", "10011001"))  # Output: 3


# Using Python's built-in popcount (Python 3.10+)
def hamming_distance_modern(a: int, b: int) -> int:
    return (a ^ b).bit_count()
```

Hardware Acceleration:
Modern processors include dedicated instructions for population count (counting 1-bits):
| Architecture | Instruction | Introduced |
|---|---|---|
| x86/x64 | POPCNT | SSE4.2 (2008) |
| ARM | CNT (NEON), VCNT | ARMv8 |
| RISC-V | cpop | Bitmanip extension |
This means Hamming distance between two 64-bit values can be computed in 2-3 clock cycles on modern hardware: one XOR, one population count.
The Hamming weight (or population count) of a binary string is the number of 1s it contains. Hamming distance is the Hamming weight of the XOR of two strings. This relationship is why efficient popcount operations directly enable fast Hamming distance calculations.
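In code, this relationship is a one-liner. A tiny sketch using the int.bit_count() method shown earlier:

```python
a, b = 0b10110101, 0b10011001

# Hamming distance = Hamming weight (popcount) of the XOR.
assert (a ^ b).bit_count() == 3
```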
The power of Hamming distance lies in how it connects to error detection and correction. This connection is both beautiful and practical—it transforms the abstract question "Can we detect/correct errors?" into the concrete question "How far apart are our codewords?"
The Fundamental Insight:
Consider a transmission where the original codeword c is corrupted into received word r. The number of bit errors is exactly d(c, r)—the Hamming distance between what was sent and what was received.
Now, if we have a set of valid codewords {c₁, c₂, c₃, ...}, and we receive a word r, there are three possibilities:

- r equals the codeword that was sent: d(c, r) = 0, so the transmission was error-free.
- r is not any valid codeword: at least one error occurred, and the receiver can detect it.
- r equals a different valid codeword: errors transformed one legal codeword into another, and the corruption is undetectable.
The Critical Question:
How many bit errors turn one valid codeword into another? The answer is the minimum Hamming distance between any pair of valid codewords in the code. This single number—denoted dmin—determines the entire error-handling capability of the code.
Example:
Suppose our code has only two codewords: 00000 and 11111. Their Hamming distance is 5.

- Any error flipping 1 to 4 bits of 00000 produces a word containing both 0s and 1s, which is not a valid codeword, so the error is detected.
- Only an error flipping all 5 bits lands exactly on 11111, silently turning one valid codeword into the other.

This simple example shows that a minimum distance of 5 allows detection of up to 4-bit errors.
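The claim is easy to confirm exhaustively. A sketch (the `flip` helper is mine) that tries every possible error pattern on 00000:

```python
from itertools import combinations

codewords = {"00000", "11111"}

def flip(word: str, positions) -> str:
    """Return word with the bits at the given positions inverted."""
    bits = list(word)
    for i in positions:
        bits[i] = "1" if bits[i] == "0" else "0"
    return "".join(bits)

# Every 1- to 4-bit error is detected; only the 5-bit error slips through.
for k in range(1, 6):
    for positions in combinations(range(5), k):
        received = flip("00000", positions)
        detected = received not in codewords
        assert detected == (k < 5)
```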
When errors transform one valid codeword into another valid codeword, detection is impossible; the receiver has no way to know anything went wrong. This is why large distance between codewords is crucial: it determines how many errors must occur before this undetectable transformation becomes possible.
While individual Hamming distances matter, what ultimately determines a code's power is its minimum Hamming distance—the smallest distance between any pair of valid codewords.
Formal Definition:
For a code C containing codewords {c₁, c₂, ..., cₖ}, the minimum distance is:
$$d_{min}(C) = \min_{i \neq j} d(c_i, c_j)$$
This is the "weakest link" in the code—the pair of codewords most easily confused by errors.
| Codeword | 000000 | 001110 | 010101 | 011011 | 110010 | 101001 |
|---|---|---|---|---|---|---|
| 000000 | — | 3 | 3 | 4 | 3 | 3 |
| 001110 | 3 | — | 4 | 3 | 4 | 4 |
| 010101 | 3 | 4 | — | 3 | 4 | 4 |
| 011011 | 4 | 3 | 3 | — | 3 | 3 |
| 110010 | 3 | 4 | 4 | 3 | — | 4 |
| 101001 | 3 | 4 | 4 | 3 | 4 | — |
In this example code, all pairwise distances are either 3 or 4. The minimum distance dmin = 3.
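The distance matrix above can be reproduced in a few lines. A brute-force sketch over all M(M − 1)/2 = 15 pairs:

```python
from itertools import combinations

def d(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))

code = ["000000", "001110", "010101", "011011", "110010", "101001"]

distances = [d(a, b) for a, b in combinations(code, 2)]
print(sorted(set(distances)))  # [3, 4]: every pair is at distance 3 or 4
print(min(distances))          # 3: the minimum distance dmin
```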
Why Minimum Distance Dominates:
Why focus on the minimum rather than average distance? Because error control must work for the worst case. If any two codewords are separated by only 2 bits, then a 2-bit error pattern can turn one into the other—making that error undetectable. It doesn't matter that other codeword pairs are far apart; the code fails at its weakest point.
This leads to the central principle of error control coding:
A code's reliability is determined by its minimum distance. All error detection and correction capabilities derive from this single number.
A code is often characterized by the notation (n, M, dmin) or sometimes [n, k, dmin], where n is the codeword length, M (or 2^k) is the number of codewords, and dmin is the minimum distance. For example, a (7, 16, 3) code has 7-bit codewords, 16 valid codewords, and minimum distance 3.
Analyzing a code by computing distances between all codeword pairs is computationally expensive. For a code with M codewords, there are M(M-1)/2 pairs to check. However, a special class of codes called linear codes provides a powerful shortcut.
What is a Linear Code?
A binary linear code is a set of codewords that forms a vector space over GF(2) (the field with two elements: 0 and 1). In practical terms, this means:

- The all-zeros string 00...0 is always a codeword.
- The XOR of any two codewords is also a codeword (closure under addition over GF(2)).
The Simplification:
For linear codes, there's a remarkable property:
$$d_{min} = \min_{c \neq 0} w(c)$$
Where w(c) is the Hamming weight (number of 1s) of codeword c.
This means we can find the minimum distance by simply finding the non-zero codeword with the smallest Hamming weight—no need to compute distances between all pairs!
Since 0...0 is a codeword, the distance from any codeword c to the all-zeros codeword equals the Hamming weight of c. Since XOR of any two codewords c₁ ⊕ c₂ is also a codeword, and d(c₁, c₂) = w(c₁ ⊕ c₂), the minimum distance between any two codewords equals the minimum weight of any non-zero codeword.
Example: A Simple Linear Code
Consider the code with codewords: {0000, 0110, 1001, 1111}
Let's verify it's linear:

- 0000 is in the code ✓
- 0110 ⊕ 1001 = 1111 ✓
- 0110 ⊕ 1111 = 1001 ✓
- 1001 ⊕ 1111 = 0110 ✓
- Any codeword XOR'd with itself gives 0000 ✓

Hamming weights of the non-zero codewords:

- w(0110) = 2
- w(1001) = 2
- w(1111) = 4

Minimum non-zero weight = 2, so dmin = 2.
This is much faster than computing all 6 pairwise distances!
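Both the linearity check and the weight shortcut are mechanical to verify; a short sketch with helper names of my own:

```python
from itertools import combinations

code = ["0000", "0110", "1001", "1111"]

def d(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))

def weight(c: str) -> int:
    return c.count("1")

def xor(a: str, b: str) -> str:
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

# Linearity: the XOR of any two codewords is again a codeword.
assert all(xor(a, b) in code for a in code for b in code)

# Weight shortcut: minimum non-zero weight equals minimum pairwise distance.
min_weight = min(weight(c) for c in code if c != "0000")
min_distance = min(d(a, b) for a, b in combinations(code, 2))
assert min_weight == min_distance == 2
```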
Hamming distance provides the mathematical vocabulary for discussing error control. Before Hamming's work, engineers knew empirically that adding redundancy helped catch errors—but they lacked a rigorous framework to analyze and optimize their codes. Hamming distance changed everything.
What's Next:
Now that we understand what Hamming distance is and why minimum distance matters, we'll explore exactly how minimum distance translates into error detection capability. The next page establishes the precise mathematical relationship: given a minimum distance dmin, how many errors can we reliably detect?
You now understand Hamming distance as the fundamental metric of error control coding. It's a simple counting measure—bits that differ—yet it provides the entire analytical framework for understanding what errors can be detected and corrected. Next, we'll see exactly how to use minimum distance to determine error detection power.