Computer NetworksDiffie-Hellman

Diffie-Hellman Key Exchange

LevelAdvanced

Duration60 mins

TopicDiffie-Hellman

3 / 5

Mathematical Basis of Diffie-Hellman

The Elegant Mathematics Behind Secure Communication

Diffie-Hellman's elegance arises from deep mathematical structures that have been studied for centuries. The protocol isn't merely a clever trick—it's built on the rigorous foundations of abstract algebra, number theory, and computational complexity theory.

Understanding these foundations accomplishes two critical goals: it illuminates why the protocol works rather than just how, and it equips you to reason about security guarantees and potential weaknesses. Engineers who understand the mathematics can make informed decisions about parameter selection, evaluate new attacks, and understand why certain variations (like ECDH) offer efficiency improvements.

What You Will Learn

By the end of this page, you will understand group theory fundamentals as they apply to cryptography, the structure of the multiplicative group modulo a prime, what makes a good generator, the precise hardness assumptions underlying DH security, and how computational complexity informs our security estimates.

Group Theory Fundamentals

At the heart of Diffie-Hellman lies the mathematical structure called a group. While this might seem like abstract formalism, it's precisely this structure that enables secure key exchange.

Definition: A Group

A group (G, •) consists of:

A set G of elements
A binary operation • that combines any two elements

Satisfying four axioms:

The Four Group Axioms

•Closure: For all a, b in G, the result a • b is also in G. The operation never takes you outside the group.
•Associativity: For all a, b, c in G, we have (a • b) • c = a • (b • c). Parentheses don't matter for consecutive operations.
•Identity element: There exists an element e in G such that for all a in G, e • a = a • e = a. This element "does nothing."
•Inverse elements: For each a in G, there exists an element a⁻¹ in G such that a • a⁻¹ = a⁻¹ • a = e. Every element can be "undone."

Optional additional property:

Commutativity (Abelian groups): For all a, b in G, a • b = b • a. Order doesn't matter.

The groups used in Diffie-Hellman are abelian (commutative). This commutativity is exactly what makes the protocol work: (g^a)^b = (g^b)^a.

group_examples.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
"""
Examples of Groups Relevant to Cryptography
 
Understanding groups through concrete examples helps
build intuition for the abstract definitions.
"""
 
# EXAMPLE 1: The integers under addition (Z, +)
# ============================================
# Set: {..., -2, -1, 0, 1, 2, 3, ...}
# Operation: addition (+)
# Identity: 0 (since a + 0 = a)
# Inverse of a: -a (since a + (-a) = 0)
# This is an INFINITE group
 
# EXAMPLE 2: Integers modulo n under addition (Z_n, +)
# ===================================================
# Set: {0, 1, 2, ..., n-1}
# Operation: addition mod n
# Identity: 0
# Inverse of a: (n - a) mod n
 
def addition_group_mod_n(n: int):
    """Demonstrate Z_n under addition."""
    print(f"Group Z_{n} under addition:")
    print(f"  Elements: {{0, 1, ..., {n-1}}}")
    print(f"  Identity: 0")
    
    # Show some operations
    a, b = 3, 5
    result = (a + b) % n
    print(f"  Example: {a} + {b} = {result} (mod {n})")
    
    # Show inverses
    for x in range(n):
        inv = (n - x) % n
        check = (x + inv) % n
        print(f"  Inverse of {x}: {inv} (check: {x}+{inv}={check})")
 
# EXAMPLE 3: Multiplicative group modulo p (Z*_p, ×)
# THIS IS THE GROUP USED IN DIFFIE-HELLMAN!
# =================================================
# Set: {1, 2, 3, ..., p-1} (all integers from 1 to p-1)
# Operation: multiplication mod p
# Identity: 1 (since a × 1 = a)
# Closure: For prime p, if a,b ≠ 0 mod p, then ab ≠ 0 mod p
 
def multiplicative_group_mod_p(p: int):
    """
    Demonstrate Z*_p - the multiplicative group modulo prime p.
    
    This is the group Diffie-Hellman operates in!
    """
    print(f"
Multiplicative Group Z*_{p}:")
    print(f"  Elements: {{1, 2, ..., {p-1}}}")
    print(f"  Order (size): {p-1}")
    print(f"  Identity: 1")
    
    # Show multiplication table for small p
    if p <= 7:
        print(f"  Multiplication table:")
        for a in range(1, p):
            row = [str((a * b) % p) for b in range(1, p)]
            print(f"    {a}: {' '.join(row)}")
    
    # Compute inverses using Fermat's little theorem
    # a^(p-1) ≡ 1 mod p, so a^(p-2) ≡ a^(-1) mod p
    print(f"  Inverses (a × a⁻¹ ≡ 1 mod {p}):")
    for a in range(1, p):
        inv = pow(a, p-2, p)  # a^(p-2) mod p
        check = (a * inv) % p
        print(f"    {a}⁻¹ = {inv} (check: {a}×{inv} ≡ {check})")
 
# Run examples
addition_group_mod_n(6)
multiplicative_group_mod_p(7)
 
# KEY INSIGHT: Why we need p to be PRIME
# =====================================
# For composite n, Z*_n has fewer elements (those coprime to n)
# and the structure is more complex. Primes give clean groups.
 
print("
" + "="*50)
print("Why primes? Compare Z*_8 vs Z*_7:")
print("Z*_8 = {1,3,5,7} - elements coprime to 8, size = 4")
print("Z*_7 = {1,2,3,4,5,6} - all non-zero elements, size = 6")
print("Primes give maximal, well-structured groups.")

Why Groups Matter for DH

The group structure guarantees that all DH operations stay within a well-defined set, every public key has a unique shared secret for any private key, and the mathematical properties (like commutativity) that make the protocol work are guaranteed.

The Multiplicative Group Z*_p

Diffie-Hellman operates specifically in the multiplicative group of integers modulo a prime p, denoted Z*_p (also written as (ℤ/pℤ)*). Let's examine this structure carefully.

Definition:

$$\mathbb{Z}_p^* = {1, 2, 3, \ldots, p-1}$$

with the operation being multiplication modulo p.

Key properties:

Size (order): |Z*_p| = p - 1. This is critical for security analysis.
Closure: For any a, b ∈ Z_p, the product ab mod p is also in Z_p. This is guaranteed because p is prime: if p doesn't divide a or b, it can't divide ab.
Inverses exist: For every element a ∈ Z*_p, there exists a unique a⁻¹ such that a × a⁻¹ ≡ 1 (mod p). This follows from Bézout's identity and Fermat's little theorem.

modular_arithmetic.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
"""
Deep Dive into Modular Arithmetic for Cryptography
 
Understanding these operations is essential for implementing
and analyzing Diffie-Hellman correctly.
"""
 
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """
    Extended Euclidean Algorithm.
    
    Returns (gcd, x, y) such that:
        a*x + b*y = gcd(a, b)
    
    This is fundamental for computing modular inverses.
    """
    if a == 0:
        return b, 0, 1
    
    gcd, x1, y1 = extended_gcd(b % a, a)
    x = y1 - (b // a) * x1
    y = x1
    
    return gcd, x, y
 
 
def modular_inverse(a: int, p: int) -> int:
    """
    Compute the multiplicative inverse of a modulo p.
    
    Two methods:
    1. Extended GCD: a*x + p*y = 1, so a*x ≡ 1 (mod p)
    2. Fermat: a^(p-1) ≡ 1 (mod p), so a^(p-2) ≡ a⁻¹ (mod p)
    """
    # Method 1: Extended GCD
    gcd, x, _ = extended_gcd(a, p)
    if gcd != 1:
        raise ValueError(f"{a} has no inverse modulo {p}")
    return x % p
 
 
def fermats_little_theorem_demo(p: int):
    """
    Demonstrate Fermat's Little Theorem:
        a^(p-1) ≡ 1 (mod p) for all a in Z*_p
    
    This is the cornerstone of modular arithmetic in cryptography.
    """
    print(f"
Fermat's Little Theorem for p = {p}:")
    print(f"For all a ∈ {{1,2,...,{p-1}}}: a^{p-1} ≡ 1 (mod {p})")
    print()
    
    for a in range(1, p):
        result = pow(a, p-1, p)
        # Show the exponentiation steps for small values
        if p <= 11:
            intermediate = a
            steps = [str(a)]
            for i in range(2, p):
                intermediate = (intermediate * a) % p
                if i == p-1:
                    steps.append(f"... a^{i} = {intermediate}")
                elif i <= 4:
                    steps.append(f"a^{i}={intermediate}")
            
            print(f"  a={a}: {' → '.join(steps[:4])} → a^{p-1} = {result}")
        else:
            print(f"  a={a}: a^{p-1} mod {p} = {result}")
    
    print(f"
All results are 1 ✓")
 
 
def group_structure_analysis(p: int):
    """
    Analyze the structure of Z*_p.
    
    Key concepts:
    - Order of an element
    - Primitive roots (generators)
    - Subgroup structure
    """
    print(f"
{'='*50}")
    print(f"GROUP STRUCTURE OF Z*_{p}")
    print(f"{'='*50}")
    print(f"Group size: |Z*_{p}| = {p-1}")
    
    # Find the order of each element
    # Order of a = smallest k such that a^k ≡ 1 (mod p)
    print(f"
Orders of elements:")
    
    generators = []
    for a in range(1, p):
        power = 1
        order = 0
        for k in range(1, p):
            power = (power * a) % p
            if power == 1:
                order = k
                break
        
        is_generator = (order == p - 1)
        if is_generator:
            generators.append(a)
        
        label = "← GENERATOR" if is_generator else ""
        print(f"  ord({a}) = {order} {label}")
    
    print(f"
Primitive roots (generators): {generators}")
    print(f"Number of generators: {len(generators)}")
    print(f"Expected: φ({p-1}) = {euler_phi(p-1)}")
    
    return generators
 
 
def euler_phi(n: int) -> int:
    """Euler's totient function: count of integers ≤ n coprime to n."""
    result = 0
    for i in range(1, n + 1):
        if gcd(i, n) == 1:
            result += 1
    return result
 
 
def gcd(a: int, b: int) -> int:
    """Greatest common divisor via Euclidean algorithm."""
    while b:
        a, b = b, a % b
    return a
 
 
# Demonstrations
fermats_little_theorem_demo(7)
generators = group_structure_analysis(11)
 
print(f"
{'='*50}")
print("WHY GENERATORS MATTER FOR DIFFIE-HELLMAN")
print(f"{'='*50}")
print("A generator g produces ALL elements of the group:")
for gen in generators[:2]:
    print(f"
Generator g = {gen}:")
    elements = []
    power = 1
    for i in range(1, 11):
        power = (power * gen) % 11
        elements.append(power)
    print(f"  Powers: g^1, g^2, ..., g^10 = {elements}")
    print(f"  Produces all {len(set(elements))} elements ✓")

Fermat's Little Theorem: The Foundation

For any a in Z*_p: a^(p-1) ≡ 1 (mod p). This theorem is used constantly: to compute inverses (a⁻¹ = a^(p-2)), to find element orders, and in security proofs. It's why DH operations 'wrap around' predictably.

Primitive Roots and Generators

A primitive root (or generator) of Z*_p is an element g whose powers produce every element of the group. This concept is central to DH security.

Definition:

An element g ∈ Z*_p is a primitive root modulo p if:

$${g^1 \mod p, g^2 \mod p, \ldots, g^{p-1} \mod p} = {1, 2, 3, \ldots, p-1}$$

In group theory terms, g is a generator if its order equals the group order p-1.

Order of an element:

The order of a ∈ Z*_p, written ord(a), is the smallest positive integer k such that a^k ≡ 1 (mod p).

Orders of Elements in Z*_11
Element a	Sequence a, a², a³, ... mod 11	Order	Is Generator?
1	1	1	No
2	2, 4, 8, 5, 10, 9, 7, 3, 6, 1	10	✓ YES
3	3, 9, 5, 4, 1	5	No
4	4, 5, 9, 3, 1	5	No
5	5, 3, 4, 9, 1	5	No
6	6, 3, 7, 9, 10, 5, 8, 4, 2, 1	10	✓ YES
7	7, 5, 2, 3, 10, 4, 6, 9, 8, 1	10	✓ YES
8	8, 9, 6, 4, 10, 3, 2, 5, 7, 1	10	✓ YES
9	9, 4, 3, 5, 1	5	No
10	10, 1	2	No

Key observations:

Not every element is a generator. In Z*_11, only 4 out of 10 elements are generators.
Order divides group order. By Lagrange's theorem, ord(a) always divides |Z*_p| = p-1.
Number of generators. There are exactly φ(p-1) primitive roots modulo p, where φ is Euler's totient function.
Security implication: If the generator has small order, the discrete log problem becomes easier. We want generators of maximal order.

generator_selection.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
"""
Generator Selection and Verification
 
Choosing and verifying generators is critical for DH security.
Incorrect choice can catastrophically weaken the protocol.
"""
 
def is_generator(g: int, p: int) -> bool:
    """
    Check if g is a primitive root modulo p.
    
    Efficient test: g is a generator iff g^((p-1)/q) ≠ 1 mod p
    for all prime factors q of (p-1).
    
    This is much faster than computing all powers.
    """
    from sympy import primefactors
    
    order = p - 1
    factors = primefactors(order)
    
    for q in factors:
        # Check g^((p-1)/q) mod p
        exponent = order // q
        if pow(g, exponent, p) == 1:
            # g has order dividing (p-1)/q, so it's NOT a generator
            return False
    
    return True
 
 
def find_all_generators(p: int) -> list[int]:
    """Find all primitive roots modulo prime p."""
    return [g for g in range(2, p) if is_generator(g, p)]
 
 
def verify_safe_prime_generator(p: int, g: int) -> dict:
    """
    For safe primes p = 2q + 1, verify generator properties.
    
    Safe primes are special because (p-1) = 2q has only two 
    prime factors: 2 and q. This makes generator verification simple.
    """
    from sympy import isprime
    
    q = (p - 1) // 2
    
    results = {
        "p": p,
        "is_prime": isprime(p),
        "q = (p-1)/2": q,
        "q_is_prime": isprime(q),
        "is_safe_prime": isprime(p) and isprime(q),
        "g": g,
    }
    
    if results["is_safe_prime"]:
        # For safe primes, g is a generator iff:
        # g^2 ≠ 1 mod p (rules out g = ±1)
        # g^q ≠ 1 mod p (rules out quadratic residues)
        
        g_squared = pow(g, 2, p)
        g_to_q = pow(g, q, p)
        
        results["g^2 mod p"] = g_squared
        results["g^q mod p"] = g_to_q
        results["is_generator"] = (g_squared != 1) and (g_to_q != 1)
    
    return results
 
 
def subgroup_attack_demo():
    """
    Demonstrate why small-order subgroups are dangerous.
    
    If an attacker can force key exchange into a small subgroup,
    the discrete log becomes easy within that subgroup.
    """
    print("SMALL SUBGROUP ATTACK DEMONSTRATION")
    print("="*50)
    
    # Consider a prime p where p-1 has a small factor
    p = 97  # p - 1 = 96 = 2^5 * 3
    
    # Find an element of small order
    for g in range(2, p):
        order = 1
        val = g
        while val != 1:
            val = (val * g) % p
            order += 1
        
        if order == 3:  # Found small-order element
            print(f"Element {g} has order {order}")
            print(f"Subgroup: {{{g}^1={g}, {g}^2={pow(g,2,p)}, {g}^3={pow(g,3,p)}=1}}")
            print()
            print("If attacker forces Bob to use this subgroup,")
            print("the shared secret can only be one of 3 values!")
            print("Brute force becomes trivial.")
            break
    
    print()
    print("MITIGATION: Use safe primes!")
    print("For safe prime p = 2q+1, the only subgroups have")
    print("orders 1, 2, q, and 2q. No small subgroups exist.")
 
 
# Demo
print("Generators of Z*_23:")
gens = find_all_generators(23)
print(f"Generators: {gens}")
print(f"Count: {len(gens)}")
 
print("
Verifying g=5 for p=23 (safe prime, q=11):")
result = verify_safe_prime_generator(23, 5)
for k, v in result.items():
    print(f"  {k}: {v}")
 
print()
subgroup_attack_demo()

Safe Primes Prevent Subgroup Attacks

A safe prime p = 2q + 1 (where q is also prime) ensures Z*_p has no small subgroups. This is why safe primes or validated groups from standards are essential. Using arbitrary primes risks devastating subgroup attacks.

The Discrete Logarithm Problem (DLP)

The security of Diffie-Hellman depends on the Discrete Logarithm Problem (DLP) being computationally intractable. Let's define it precisely and examine why it's believed to be hard.

Formal Definition:

Given: A prime p, a generator g of Z_p, and an element h ∈ Z_p

Find: The unique integer x ∈ [0, p-2] such that g^x ≡ h (mod p)

We write x = log_g(h) and call x the discrete logarithm of h base g.

Why "discrete"?

Unlike continuous logarithms (where ln(ab) = ln(a) + ln(b) provides algebraic tools), the modular structure destroys these relationships. Values "wrap around," and there's no smooth function to invert.

discrete_log_algorithms.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
"""
Discrete Logarithm Algorithms: From Naive to Number Field Sieve
 
Understanding attack algorithms helps appreciate parameter choice.
"""
 
import math
from collections import defaultdict
 
def naive_discrete_log(h: int, g: int, p: int) -> int:
    """
    Naive algorithm: try each exponent sequentially.
    
    Time complexity: O(p)
    Space complexity: O(1)
    
    For 2048-bit p, this requires ~10^600 operations.
    Completely impractical.
    """
    power = 1
    for x in range(p):
        if power == h:
            return x
        power = (power * g) % p
    return -1
 
 
def baby_step_giant_step(h: int, g: int, p: int) -> int:
    """
    Shanks' Baby-Step Giant-Step Algorithm.
    
    Time complexity: O(√p)
    Space complexity: O(√p)
    
    Key insight: if x = im + j where m = ⌈√n⌉, then
    g^x = g^(im+j) implies g^j = h * (g^-m)^i
    
    "Baby steps": precompute g^j for j = 0..m-1
    "Giant steps": compute h * (g^-m)^i for i = 0..m-1
    Look for collision.
    """
    n = p - 1  # Group order
    m = int(math.ceil(math.sqrt(n)))
    
    # Baby steps: build lookup table g^j -> j
    baby_steps = {}
    power = 1
    for j in range(m):
        baby_steps[power] = j
        power = (power * g) % p
    
    # Compute g^(-m) mod p
    g_inv_m = pow(g, -m, p)  # Python 3.8+ supports negative exponents
    
    # Giant steps: look for collision
    gamma = h
    for i in range(m):
        if gamma in baby_steps:
            j = baby_steps[gamma]
            x = i * m + j
            # Verify
            if pow(g, x, p) == h:
                return x
        gamma = (gamma * g_inv_m) % p
    
    return -1
 
 
def pollard_rho(h: int, g: int, p: int) -> int:
    """
    Pollard's Rho Algorithm for Discrete Log.
    
    Time complexity: O(√p)
    Space complexity: O(1)
    
    Uses cycle detection (like for detecting linked list cycles).
    More memory-efficient than baby-step giant-step.
    """
    # This is a simplified conceptual version
    # Full implementation requires careful handling of random walk
    
    n = p - 1  # Group order
    
    def f(x, a, b):
        """Partition {1,...,p-1} into 3 sets; define iteration."""
        s = x % 3
        if s == 0:
            return (x * x) % p, (2 * a) % n, (2 * b) % n
        elif s == 1:
            return (x * g) % p, (a + 1) % n, b
        else:
            return (x * h) % p, a, (b + 1) % n
    
    # Start Floyd's cycle-detection
    x, a, b = 1, 0, 0
    X, A, B = x, a, b
    
    for _ in range(n):
        x, a, b = f(x, a, b)
        X, A, B = f(*f(X, A, B))  # X moves twice as fast
        
        if x == X:
            # Collision: g^a * h^b = g^A * h^B
            # So h^(b-B) = g^(A-a)
            # If b ≠ B: h = g^((A-a)/(b-B))
            r = (b - B) % n
            if r != 0:
                from math import gcd
                d = gcd(r, n)
                if d == 1:
                    result = ((A - a) * pow(r, -1, n)) % n
                    if pow(g, result, p) == h:
                        return result
            break
    
    return -1
 
 
def complexity_comparison():
    """
    Compare algorithm complexities for different prime sizes.
    """
    print("DISCRETE LOG ALGORITHM COMPLEXITIES")
    print("="*60)
    print()
    
    bit_sizes = [64, 128, 256, 512, 1024, 2048, 3072]
    
    print(f"{'Bits':>6} | {'Naive O(p)':>15} | {'Baby-Giant O(√p)':>18} | {'NFS (subexp)':>15}")
    print("-" * 60)
    
    for bits in bit_sizes:
        p = 2**bits
        naive = p
        baby_giant = int(math.sqrt(p))
        # Number Field Sieve: L_p[1/3, c] ≈ exp(c * (ln p)^(1/3) * (ln ln p)^(2/3))
        # Simplified approximation
        nfs = int(math.exp(1.9 * (bits * math.log(2))**(1/3) * (math.log(bits * math.log(2)))**(2/3)))
        
        naive_str = f"2^{bits}"
        baby_str = f"2^{bits//2}"
        nfs_str = f"~2^{int(math.log2(nfs))}" if nfs > 0 else "~2^?"
        
        print(f"{bits:>6} | {naive_str:>15} | {baby_str:>18} | {nfs_str:>15}")
    
    print()
    print("Notes:")
    print("- 2^80 operations ≈ 1 year on powerful hardware (security boundary)")
    print("- 2^112 operations ≈ practical limit for nation-states")
    print("- 2^128+ operations ≈ currently infeasible")
    print("- 2048-bit primes give ~112-bit security")
    print("- 3072-bit primes give ~128-bit security")
 
 
# Demo on small primes
print("Discrete log examples on small primes:
")
test_cases = [
    (11, 2, 7),  # Find x such that 2^x ≡ 7 (mod 11)
    (23, 5, 8),  # Find x such that 5^x ≡ 8 (mod 23)
]
 
for p, g, h in test_cases:
    x_naive = naive_discrete_log(h, g, p)
    x_baby = baby_step_giant_step(h, g, p)
    print(f"Solve {g}^x ≡ {h} (mod {p})")
    print(f"  Naive:      x = {x_naive}")
    print(f"  Baby-Giant: x = {x_baby}")
    print(f"  Verify: {g}^{x_naive} mod {p} = {pow(g, x_naive, p)}")
    print()
 
complexity_comparison()

Best Known Algorithms for Discrete Logarithm
Algorithm	Time Complexity	Space	Notes
Brute Force	O(p)	O(1)	Try each exponent; completely impractical
Baby-step Giant-step	O(√p)	O(√p)	Time-space tradeoff; still exponential
Pollard's Rho	O(√p)	O(1)	Random walk; more practical than BSGS
Pollard's Kangaroo	O(√p)	O(1)	When log is known to lie in interval
Index Calculus	Subexponential	Large	Uses factor base; specific to Z*_p
Number Field Sieve	L_p[1/3, 1.9]	Massive	Best for very large p; used in records

Subexponential but Not Polynomial

The Number Field Sieve runs in time L_p[1/3, c] = exp(c · (ln p)^(1/3) · (ln ln p)^(2/3)). This is subexponential but superpolynomial—still infeasible for 2048+ bit primes, but faster than √p. This is why EC-based DH (next module) is more efficient.

Computational Hardness Assumptions

Cryptographic security proofs reduce to hardness assumptions. For Diffie-Hellman, three related but distinct problems are relevant:

1. Discrete Logarithm Problem (DLP):

Given: (p, g, g^a)
Find: a
This is what we've discussed. If you can solve DLP, you break DH trivially.

2. Computational Diffie-Hellman (CDH) Problem:

Given: (p, g, g^a, g^b)
Find: g^(ab)
This is exactly what Eve needs to compute. CDH ≤ DLP (DLP solves CDH), but the reverse is not proven.

3. Decisional Diffie-Hellman (DDH) Problem:

Given: (p, g, g^a, g^b, h) where h is either g^(ab) or random
Decide: which case h represents
DDH ≤ CDH ≤ DLP. DDH is the strongest (easiest) assumption.

Converting Mermaid diagram...

Why these distinctions matter:

For basic key exchange: CDH hardness suffices. Eve needs to compute g^(ab).
For key derivation security: DDH hardness is needed. The shared secret should be indistinguishable from random.
Security proofs: Many protocols prove security under DDH, which assumes CDH (and thus implicitly DLP).

Are these actually hard?

No mathematical proof exists that DLP is hard. However:

Decades of cryptanalytic effort have produced only subexponential (not polynomial) algorithms
The best attacks require massive resources; DLP records involve years of computation
Quantum computers would change everything (Shor's algorithm solves DLP in polynomial time), but practical quantum computers of sufficient size don't exist yet

Discrete Log Computation Records
Year	Bits	Method	Time/Resources
2016	1024	NFS (special prime)	~2 months, many cores
2019	795	NFS (general)	~3100 core-years
2023	912	NFS	Massive compute cluster
—	2048+	—	Currently infeasible

Logjam Attack: Weak Groups Are Real Threats

The 2015 Logjam attack showed that 512-bit and 768-bit DH primes could be broken in practical time. Even 1024-bit groups are within reach of well-funded adversaries. The current recommendation is 2048-bit minimum, with 3072-bit for long-term security.

Why Modular Arithmetic Creates Hardness

It's reasonable to ask: why is modular exponentiation hard to invert when regular exponentiation is easy? Understanding this provides intuition for the security model.

Regular logarithm:

Given y = 10^x, finding x is trivial:

x = log₁₀(y)
Use a calculator, Taylor series, or binary search
Smooth, monotonic function with analytical tools

Discrete logarithm:

Given y = 10^x mod 101, finding x is hard:

The modular reduction destroys smoothness
The sequence 10¹, 10², 10³, ... mod 101 appears pseudorandom
No monotonicity, no calculus, no binary search

modular_chaos.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
"""
Visualizing Why Modular Arithmetic Creates Hardness
 
The 'chaos' introduced by modular reduction is why
discrete logarithms are hard.
"""
 
def visualize_modular_sequence(g: int, p: int):
    """
    Show how modular exponentiation scrambles values.
    """
    print(f"Powers of {g} modulo {p}:")
    print("=" * 50)
    
    powers = []
    val = 1
    for i in range(1, p):
        val = (val * g) % p
        powers.append(val)
        print(f"  {g}^{i:2} mod {p} = {val:3}", end="")
        
        # Show as "percentage" of range for visual
        bar_len = int(val * 40 / p)
        print(f"  |{'#' * bar_len}")
    
    print()
    print("Observations:")
    print(f"  - Values jump unpredictably between 1 and {p-1}")
    print(f"  - No monotonic pattern")
    print(f"  - Knowing g^x doesn't help find g^(x+1) without knowing x")
    print(f"  - The only structure: values eventually cycle with period {p-1}")
    
    return powers
 
 
def statistical_analysis(g: int, p: int):
    """
    Analyze statistical properties of the sequence.
    
    A good cryptographic group should have:
    - Uniform distribution
    - Long period
    - No discernible pattern
    """
    from statistics import mean, stdev
    
    powers = []
    val = 1
    for i in range(1, p):
        val = (val * g) % p
        powers.append(val)
    
    print(f"
Statistical Analysis (g={g}, p={p}):")
    print(f"  Period: {len(powers)}")
    print(f"  Mean: {mean(powers):.2f} (expected: {(p-1)/2:.2f})")
    print(f"  StdDev: {stdev(powers):.2f} (uniform would be ~{(p-1)/(2*1.73):.2f})")
    
    # Check if all values appear exactly once
    unique = set(powers)
    if len(unique) == p - 1:
        print(f"  Generator: YES (all {p-1} non-zero values appear)")
    else:
        print(f"  Generator: NO (only {len(unique)} unique values)")
 
 
def compare_continuous_discrete():
    """
    Compare continuous vs discrete logarithms.
    """
    import math
    
    print("
CONTINUOUS VS DISCRETE LOGARITHM")
    print("=" * 50)
    
    # Continuous: finding x such that 10^x = y
    y_continuous = 1000
    x_continuous = math.log10(y_continuous)
    print(f"
Continuous: 10^x = 1000")
    print(f"  Solution: x = log₁₀(1000) = {x_continuous}")
    print(f"  Method: Direct formula, O(1)")
    
    # Discrete: finding x such that 10^x ≡ y (mod 101)
    g, p, y_discrete = 10, 101, 78
    print(f"
Discrete: 10^x ≡ {y_discrete} (mod 101)")
    
    # Find by brute force
    val = 1
    for x in range(1, p):
        val = (val * g) % p
        if val == y_discrete:
            print(f"  Solution: x = {x}")
            break
    
    print(f"  Method: Must try values (at best O(√p))")
    print(f"  Why hard: modular reduction destroys log properties:")
    print(f"           log(ab mod p) ≠ log(a) + log(b)")
 
 
# Run demonstrations
visualize_modular_sequence(3, 17)
statistical_analysis(5, 23)
compare_continuous_discrete()

The Information-Theoretic View

Modular reduction is a many-to-one mapping. Many different products can reduce to the same value mod p. This ambiguity—this 'information loss'—is what makes inversion hard. The discrete log problem asks you to recover information that has been systematically destroyed.

Implications for Parameter Selection

The mathematics we've covered directly informs how DH parameters should be chosen in practice. Each mathematical concept translates to a specific security requirement.

Parameter Selection Guidelines

•Prime size → Security level: 2048 bits ≈ 112-bit security; 3072 bits ≈ 128-bit security. Match to your threat model.
•Safe prime → No small subgroups: Use p = 2q + 1 where q is prime. This prevents subgroup confinement attacks.
•Primitive root → Full group coverage: The generator g must have order p-1 (or q for safe primes). Small generators like 2, 3, 5 are fine if verified.
•Prime structure → Algorithm resistance: Avoid primes with special structure (e.g., p-1 with many small factors) that could be exploited by index calculus variants.
•Standardized groups → Vetted security: RFC 3526, RFC 7919 provide audited parameter sets. Prefer these over custom generation.

Recommended DH Parameter Sizes (NIST/IETF)
Security Level	Prime Size (bits)	Comparable Symmetric	Use Case
112-bit	2048	3DES, AES-128	General purpose, current standard
128-bit	3072	AES-128+	Medium-term security (10+ years)
192-bit	7680	AES-192	Long-term, high security
256-bit	15360	AES-256	Post-quantum consideration (pre-QC)

Modern Recommendation

For new deployments: use either RFC 7919 FFDHE groups (if you must use traditional DH) or switch entirely to ECDH with curves like X25519 or P-256. ECDH provides equivalent security with much smaller key sizes and faster operations.

Summary: Mathematical Foundations

We've built a rigorous mathematical foundation for understanding Diffie-Hellman security. Let's consolidate these concepts:

Key Mathematical Concepts

•Group theory provides the algebraic structure: closure, associativity, identity, inverses. DH operates in the abelian group Z*_p.
•Modular arithmetic in Z*_p has special properties: Fermat's theorem (a^(p-1) ≡ 1), existence of inverses, and well-defined structure.
•Primitive roots (generators) produce all group elements. Choosing good generators ensures full security.
•The discrete logarithm problem is the foundation of DH security. Best algorithms are subexponential, making large primes secure.
•Hardness assumptions (DLP, CDH, DDH) form a hierarchy of security guarantees used in cryptographic proofs.
•Parameter selection translates mathematical requirements into practical choices: prime size, safe primes, verified generators.

What's next:

With the mathematical foundation established, we'll explore a crucial security property that modern systems demand: forward secrecy. You'll learn why key reuse is dangerous, how ephemeral keys protect past sessions, and why protocols like TLS 1.3 mandate ephemeral Diffie-Hellman.

Page Complete

You now understand the algebraic structures underlying Diffie-Hellman: groups, modular arithmetic, generators, and the discrete logarithm problem. This foundation explains why the protocol is secure and how to choose parameters correctly. Next, we examine forward secrecy and its critical importance.

3 / 5

Loading learning content...

Computer NetworksDiffie-Hellman

Diffie-Hellman Key Exchange

LevelAdvanced

Duration60 mins

TopicDiffie-Hellman

3 / 5

Mathematical Basis of Diffie-Hellman

The Elegant Mathematics Behind Secure Communication

What You Will Learn

Group Theory Fundamentals

At the heart of Diffie-Hellman lies the mathematical structure called a group. While this might seem like abstract formalism, it's precisely this structure that enables secure key exchange.

Definition: A Group

A group (G, •) consists of:

A set G of elements
A binary operation • that combines any two elements

Satisfying four axioms:

The Four Group Axioms

•Closure: For all a, b in G, the result a • b is also in G. The operation never takes you outside the group.
•Associativity: For all a, b, c in G, we have (a • b) • c = a • (b • c). Parentheses don't matter for consecutive operations.
•Identity element: There exists an element e in G such that for all a in G, e • a = a • e = a. This element "does nothing."
•Inverse elements: For each a in G, there exists an element a⁻¹ in G such that a • a⁻¹ = a⁻¹ • a = e. Every element can be "undone."

Optional additional property:

Commutativity (Abelian groups): For all a, b in G, a • b = b • a. Order doesn't matter.

The groups used in Diffie-Hellman are abelian (commutative). This commutativity is exactly what makes the protocol work: (g^a)^b = (g^b)^a.

group_examples.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
"""
Examples of Groups Relevant to Cryptography
 
Understanding groups through concrete examples helps
build intuition for the abstract definitions.
"""
 
# EXAMPLE 1: The integers under addition (Z, +)
# ============================================
# Set: {..., -2, -1, 0, 1, 2, 3, ...}
# Operation: addition (+)
# Identity: 0 (since a + 0 = a)
# Inverse of a: -a (since a + (-a) = 0)
# This is an INFINITE group
 
# EXAMPLE 2: Integers modulo n under addition (Z_n, +)
# ===================================================
# Set: {0, 1, 2, ..., n-1}
# Operation: addition mod n
# Identity: 0
# Inverse of a: (n - a) mod n
 
def addition_group_mod_n(n: int):
    """Demonstrate Z_n under addition."""
    print(f"Group Z_{n} under addition:")
    print(f"  Elements: {{0, 1, ..., {n-1}}}")
    print(f"  Identity: 0")
    
    # Show some operations
    a, b = 3, 5
    result = (a + b) % n
    print(f"  Example: {a} + {b} = {result} (mod {n})")
    
    # Show inverses
    for x in range(n):
        inv = (n - x) % n
        check = (x + inv) % n
        print(f"  Inverse of {x}: {inv} (check: {x}+{inv}={check})")
 
# EXAMPLE 3: Multiplicative group modulo p (Z*_p, ×)
# THIS IS THE GROUP USED IN DIFFIE-HELLMAN!
# =================================================
# Set: {1, 2, 3, ..., p-1} (all integers from 1 to p-1)
# Operation: multiplication mod p
# Identity: 1 (since a × 1 = a)
# Closure: For prime p, if a,b ≠ 0 mod p, then ab ≠ 0 mod p
 
def multiplicative_group_mod_p(p: int):
    """
    Demonstrate Z*_p - the multiplicative group modulo prime p.
    
    This is the group Diffie-Hellman operates in!
    """
    print(f"
Multiplicative Group Z*_{p}:")
    print(f"  Elements: {{1, 2, ..., {p-1}}}")
    print(f"  Order (size): {p-1}")
    print(f"  Identity: 1")
    
    # Show multiplication table for small p
    if p <= 7:
        print(f"  Multiplication table:")
        for a in range(1, p):
            row = [str((a * b) % p) for b in range(1, p)]
            print(f"    {a}: {' '.join(row)}")
    
    # Compute inverses using Fermat's little theorem
    # a^(p-1) ≡ 1 mod p, so a^(p-2) ≡ a^(-1) mod p
    print(f"  Inverses (a × a⁻¹ ≡ 1 mod {p}):")
    for a in range(1, p):
        inv = pow(a, p-2, p)  # a^(p-2) mod p
        check = (a * inv) % p
        print(f"    {a}⁻¹ = {inv} (check: {a}×{inv} ≡ {check})")
 
# Run examples
addition_group_mod_n(6)
multiplicative_group_mod_p(7)
 
# KEY INSIGHT: Why we need p to be PRIME
# =====================================
# For composite n, Z*_n has fewer elements (those coprime to n)
# and the structure is more complex. Primes give clean groups.
 
print("
" + "="*50)
print("Why primes? Compare Z*_8 vs Z*_7:")
print("Z*_8 = {1,3,5,7} - elements coprime to 8, size = 4")
print("Z*_7 = {1,2,3,4,5,6} - all non-zero elements, size = 6")
print("Primes give maximal, well-structured groups.")

Why Groups Matter for DH

The Multiplicative Group Z*_p

Diffie-Hellman operates specifically in the multiplicative group of integers modulo a prime p, denoted Z*_p (also written as (ℤ/pℤ)*). Let's examine this structure carefully.

Definition:

$$\mathbb{Z}_p^* = {1, 2, 3, \ldots, p-1}$$

with the operation being multiplication modulo p.

Key properties:

Size (order): |Z*_p| = p - 1. This is critical for security analysis.
Closure: For any a, b ∈ Z_p, the product ab mod p is also in Z_p. This is guaranteed because p is prime: if p doesn't divide a or b, it can't divide ab.
Inverses exist: For every element a ∈ Z*_p, there exists a unique a⁻¹ such that a × a⁻¹ ≡ 1 (mod p). This follows from Bézout's identity and Fermat's little theorem.

modular_arithmetic.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
"""
Deep Dive into Modular Arithmetic for Cryptography
 
Understanding these operations is essential for implementing
and analyzing Diffie-Hellman correctly.
"""
 
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """
    Extended Euclidean Algorithm.
    
    Returns (gcd, x, y) such that:
        a*x + b*y = gcd(a, b)
    
    This is fundamental for computing modular inverses.
    """
    if a == 0:
        return b, 0, 1
    
    gcd, x1, y1 = extended_gcd(b % a, a)
    x = y1 - (b // a) * x1
    y = x1
    
    return gcd, x, y
 
 
def modular_inverse(a: int, p: int) -> int:
    """
    Compute the multiplicative inverse of a modulo p.
    
    Two methods:
    1. Extended GCD: a*x + p*y = 1, so a*x ≡ 1 (mod p)
    2. Fermat: a^(p-1) ≡ 1 (mod p), so a^(p-2) ≡ a⁻¹ (mod p)
    """
    # Method 1: Extended GCD
    gcd, x, _ = extended_gcd(a, p)
    if gcd != 1:
        raise ValueError(f"{a} has no inverse modulo {p}")
    return x % p
 
 
def fermats_little_theorem_demo(p: int):
    """
    Demonstrate Fermat's Little Theorem:
        a^(p-1) ≡ 1 (mod p) for all a in Z*_p
    
    This is the cornerstone of modular arithmetic in cryptography.
    """
    print(f"
Fermat's Little Theorem for p = {p}:")
    print(f"For all a ∈ {{1,2,...,{p-1}}}: a^{p-1} ≡ 1 (mod {p})")
    print()
    
    for a in range(1, p):
        result = pow(a, p-1, p)
        # Show the exponentiation steps for small values
        if p <= 11:
            intermediate = a
            steps = [str(a)]
            for i in range(2, p):
                intermediate = (intermediate * a) % p
                if i == p-1:
                    steps.append(f"... a^{i} = {intermediate}")
                elif i <= 4:
                    steps.append(f"a^{i}={intermediate}")
            
            print(f"  a={a}: {' → '.join(steps[:4])} → a^{p-1} = {result}")
        else:
            print(f"  a={a}: a^{p-1} mod {p} = {result}")
    
    print(f"
All results are 1 ✓")
 
 
def group_structure_analysis(p: int):
    """
    Analyze the structure of Z*_p.
    
    Key concepts:
    - Order of an element
    - Primitive roots (generators)
    - Subgroup structure
    """
    print(f"
{'='*50}")
    print(f"GROUP STRUCTURE OF Z*_{p}")
    print(f"{'='*50}")
    print(f"Group size: |Z*_{p}| = {p-1}")
    
    # Find the order of each element
    # Order of a = smallest k such that a^k ≡ 1 (mod p)
    print(f"
Orders of elements:")
    
    generators = []
    for a in range(1, p):
        power = 1
        order = 0
        for k in range(1, p):
            power = (power * a) % p
            if power == 1:
                order = k
                break
        
        is_generator = (order == p - 1)
        if is_generator:
            generators.append(a)
        
        label = "← GENERATOR" if is_generator else ""
        print(f"  ord({a}) = {order} {label}")
    
    print(f"
Primitive roots (generators): {generators}")
    print(f"Number of generators: {len(generators)}")
    print(f"Expected: φ({p-1}) = {euler_phi(p-1)}")
    
    return generators
 
 
def euler_phi(n: int) -> int:
    """Euler's totient function: count of integers ≤ n coprime to n."""
    result = 0
    for i in range(1, n + 1):
        if gcd(i, n) == 1:
            result += 1
    return result
 
 
def gcd(a: int, b: int) -> int:
    """Greatest common divisor via Euclidean algorithm."""
    while b:
        a, b = b, a % b
    return a
 
 
# Demonstrations
fermats_little_theorem_demo(7)
generators = group_structure_analysis(11)
 
print(f"
{'='*50}")
print("WHY GENERATORS MATTER FOR DIFFIE-HELLMAN")
print(f"{'='*50}")
print("A generator g produces ALL elements of the group:")
for gen in generators[:2]:
    print(f"
Generator g = {gen}:")
    elements = []
    power = 1
    for i in range(1, 11):
        power = (power * gen) % 11
        elements.append(power)
    print(f"  Powers: g^1, g^2, ..., g^10 = {elements}")
    print(f"  Produces all {len(set(elements))} elements ✓")

Fermat's Little Theorem: The Foundation

Primitive Roots and Generators

A primitive root (or generator) of Z*_p is an element g whose powers produce every element of the group. This concept is central to DH security.

Definition:

An element g ∈ Z*_p is a primitive root modulo p if:

$${g^1 \mod p, g^2 \mod p, \ldots, g^{p-1} \mod p} = {1, 2, 3, \ldots, p-1}$$

In group theory terms, g is a generator if its order equals the group order p-1.

Order of an element:

The order of a ∈ Z*_p, written ord(a), is the smallest positive integer k such that a^k ≡ 1 (mod p).

Orders of Elements in Z*_11
Element a	Sequence a, a², a³, ... mod 11	Order	Is Generator?
1	1	1	No
2	2, 4, 8, 5, 10, 9, 7, 3, 6, 1	10	✓ YES
3	3, 9, 5, 4, 1	5	No
4	4, 5, 9, 3, 1	5	No
5	5, 3, 4, 9, 1	5	No
6	6, 3, 7, 9, 10, 5, 8, 4, 2, 1	10	✓ YES
7	7, 5, 2, 3, 10, 4, 6, 9, 8, 1	10	✓ YES
8	8, 9, 6, 4, 10, 3, 2, 5, 7, 1	10	✓ YES
9	9, 4, 3, 5, 1	5	No
10	10, 1	2	No

Key observations:

Not every element is a generator. In Z*_11, only 4 out of 10 elements are generators.
Order divides group order. By Lagrange's theorem, ord(a) always divides |Z*_p| = p-1.
Number of generators. There are exactly φ(p-1) primitive roots modulo p, where φ is Euler's totient function.
Security implication: If the generator has small order, the discrete log problem becomes easier. We want generators of maximal order.

generator_selection.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
"""
Generator Selection and Verification
 
Choosing and verifying generators is critical for DH security.
Incorrect choice can catastrophically weaken the protocol.
"""
 
def is_generator(g: int, p: int) -> bool:
    """
    Check if g is a primitive root modulo p.
    
    Efficient test: g is a generator iff g^((p-1)/q) ≠ 1 mod p
    for all prime factors q of (p-1).
    
    This is much faster than computing all powers.
    """
    from sympy import primefactors
    
    order = p - 1
    factors = primefactors(order)
    
    for q in factors:
        # Check g^((p-1)/q) mod p
        exponent = order // q
        if pow(g, exponent, p) == 1:
            # g has order dividing (p-1)/q, so it's NOT a generator
            return False
    
    return True
 
 
def find_all_generators(p: int) -> list[int]:
    """Find all primitive roots modulo prime p."""
    return [g for g in range(2, p) if is_generator(g, p)]
 
 
def verify_safe_prime_generator(p: int, g: int) -> dict:
    """
    For safe primes p = 2q + 1, verify generator properties.
    
    Safe primes are special because (p-1) = 2q has only two 
    prime factors: 2 and q. This makes generator verification simple.
    """
    from sympy import isprime
    
    q = (p - 1) // 2
    
    results = {
        "p": p,
        "is_prime": isprime(p),
        "q = (p-1)/2": q,
        "q_is_prime": isprime(q),
        "is_safe_prime": isprime(p) and isprime(q),
        "g": g,
    }
    
    if results["is_safe_prime"]:
        # For safe primes, g is a generator iff:
        # g^2 ≠ 1 mod p (rules out g = ±1)
        # g^q ≠ 1 mod p (rules out quadratic residues)
        
        g_squared = pow(g, 2, p)
        g_to_q = pow(g, q, p)
        
        results["g^2 mod p"] = g_squared
        results["g^q mod p"] = g_to_q
        results["is_generator"] = (g_squared != 1) and (g_to_q != 1)
    
    return results
 
 
def subgroup_attack_demo():
    """
    Demonstrate why small-order subgroups are dangerous.
    
    If an attacker can force key exchange into a small subgroup,
    the discrete log becomes easy within that subgroup.
    """
    print("SMALL SUBGROUP ATTACK DEMONSTRATION")
    print("="*50)
    
    # Consider a prime p where p-1 has a small factor
    p = 97  # p - 1 = 96 = 2^5 * 3
    
    # Find an element of small order
    for g in range(2, p):
        order = 1
        val = g
        while val != 1:
            val = (val * g) % p
            order += 1
        
        if order == 3:  # Found small-order element
            print(f"Element {g} has order {order}")
            print(f"Subgroup: {{{g}^1={g}, {g}^2={pow(g,2,p)}, {g}^3={pow(g,3,p)}=1}}")
            print()
            print("If attacker forces Bob to use this subgroup,")
            print("the shared secret can only be one of 3 values!")
            print("Brute force becomes trivial.")
            break
    
    print()
    print("MITIGATION: Use safe primes!")
    print("For safe prime p = 2q+1, the only subgroups have")
    print("orders 1, 2, q, and 2q. No small subgroups exist.")
 
 
# Demo
print("Generators of Z*_23:")
gens = find_all_generators(23)
print(f"Generators: {gens}")
print(f"Count: {len(gens)}")
 
print("
Verifying g=5 for p=23 (safe prime, q=11):")
result = verify_safe_prime_generator(23, 5)
for k, v in result.items():
    print(f"  {k}: {v}")
 
print()
subgroup_attack_demo()

Safe Primes Prevent Subgroup Attacks

The Discrete Logarithm Problem (DLP)

The security of Diffie-Hellman depends on the Discrete Logarithm Problem (DLP) being computationally intractable. Let's define it precisely and examine why it's believed to be hard.

Formal Definition:

Given: A prime p, a generator g of Z_p, and an element h ∈ Z_p

Find: The unique integer x ∈ [0, p-2] such that g^x ≡ h (mod p)

We write x = log_g(h) and call x the discrete logarithm of h base g.

Why "discrete"?

discrete_log_algorithms.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
"""
Discrete Logarithm Algorithms: From Naive to Number Field Sieve
 
Understanding attack algorithms helps appreciate parameter choice.
"""
 
import math
from collections import defaultdict
 
def naive_discrete_log(h: int, g: int, p: int) -> int:
    """
    Naive algorithm: try each exponent sequentially.
    
    Time complexity: O(p)
    Space complexity: O(1)
    
    For 2048-bit p, this requires ~10^600 operations.
    Completely impractical.
    """
    power = 1
    for x in range(p):
        if power == h:
            return x
        power = (power * g) % p
    return -1
 
 
def baby_step_giant_step(h: int, g: int, p: int) -> int:
    """
    Shanks' Baby-Step Giant-Step Algorithm.
    
    Time complexity: O(√p)
    Space complexity: O(√p)
    
    Key insight: if x = im + j where m = ⌈√n⌉, then
    g^x = g^(im+j) implies g^j = h * (g^-m)^i
    
    "Baby steps": precompute g^j for j = 0..m-1
    "Giant steps": compute h * (g^-m)^i for i = 0..m-1
    Look for collision.
    """
    n = p - 1  # Group order
    m = int(math.ceil(math.sqrt(n)))
    
    # Baby steps: build lookup table g^j -> j
    baby_steps = {}
    power = 1
    for j in range(m):
        baby_steps[power] = j
        power = (power * g) % p
    
    # Compute g^(-m) mod p
    g_inv_m = pow(g, -m, p)  # Python 3.8+ supports negative exponents
    
    # Giant steps: look for collision
    gamma = h
    for i in range(m):
        if gamma in baby_steps:
            j = baby_steps[gamma]
            x = i * m + j
            # Verify
            if pow(g, x, p) == h:
                return x
        gamma = (gamma * g_inv_m) % p
    
    return -1
 
 
def pollard_rho(h: int, g: int, p: int) -> int:
    """
    Pollard's Rho Algorithm for Discrete Log.
    
    Time complexity: O(√p)
    Space complexity: O(1)
    
    Uses cycle detection (like for detecting linked list cycles).
    More memory-efficient than baby-step giant-step.
    """
    # This is a simplified conceptual version
    # Full implementation requires careful handling of random walk
    
    n = p - 1  # Group order
    
    def f(x, a, b):
        """Partition {1,...,p-1} into 3 sets; define iteration."""
        s = x % 3
        if s == 0:
            return (x * x) % p, (2 * a) % n, (2 * b) % n
        elif s == 1:
            return (x * g) % p, (a + 1) % n, b
        else:
            return (x * h) % p, a, (b + 1) % n
    
    # Start Floyd's cycle-detection
    x, a, b = 1, 0, 0
    X, A, B = x, a, b
    
    for _ in range(n):
        x, a, b = f(x, a, b)
        X, A, B = f(*f(X, A, B))  # X moves twice as fast
        
        if x == X:
            # Collision: g^a * h^b = g^A * h^B
            # So h^(b-B) = g^(A-a)
            # If b ≠ B: h = g^((A-a)/(b-B))
            r = (b - B) % n
            if r != 0:
                from math import gcd
                d = gcd(r, n)
                if d == 1:
                    result = ((A - a) * pow(r, -1, n)) % n
                    if pow(g, result, p) == h:
                        return result
            break
    
    return -1
 
 
def complexity_comparison():
    """
    Compare algorithm complexities for different prime sizes.
    """
    print("DISCRETE LOG ALGORITHM COMPLEXITIES")
    print("="*60)
    print()
    
    bit_sizes = [64, 128, 256, 512, 1024, 2048, 3072]
    
    print(f"{'Bits':>6} | {'Naive O(p)':>15} | {'Baby-Giant O(√p)':>18} | {'NFS (subexp)':>15}")
    print("-" * 60)
    
    for bits in bit_sizes:
        p = 2**bits
        naive = p
        baby_giant = int(math.sqrt(p))
        # Number Field Sieve: L_p[1/3, c] ≈ exp(c * (ln p)^(1/3) * (ln ln p)^(2/3))
        # Simplified approximation
        nfs = int(math.exp(1.9 * (bits * math.log(2))**(1/3) * (math.log(bits * math.log(2)))**(2/3)))
        
        naive_str = f"2^{bits}"
        baby_str = f"2^{bits//2}"
        nfs_str = f"~2^{int(math.log2(nfs))}" if nfs > 0 else "~2^?"
        
        print(f"{bits:>6} | {naive_str:>15} | {baby_str:>18} | {nfs_str:>15}")
    
    print()
    print("Notes:")
    print("- 2^80 operations ≈ 1 year on powerful hardware (security boundary)")
    print("- 2^112 operations ≈ practical limit for nation-states")
    print("- 2^128+ operations ≈ currently infeasible")
    print("- 2048-bit primes give ~112-bit security")
    print("- 3072-bit primes give ~128-bit security")
 
 
# Demo on small primes
print("Discrete log examples on small primes:
")
test_cases = [
    (11, 2, 7),  # Find x such that 2^x ≡ 7 (mod 11)
    (23, 5, 8),  # Find x such that 5^x ≡ 8 (mod 23)
]
 
for p, g, h in test_cases:
    x_naive = naive_discrete_log(h, g, p)
    x_baby = baby_step_giant_step(h, g, p)
    print(f"Solve {g}^x ≡ {h} (mod {p})")
    print(f"  Naive:      x = {x_naive}")
    print(f"  Baby-Giant: x = {x_baby}")
    print(f"  Verify: {g}^{x_naive} mod {p} = {pow(g, x_naive, p)}")
    print()
 
complexity_comparison()

Best Known Algorithms for Discrete Logarithm
Algorithm	Time Complexity	Space	Notes
Brute Force	O(p)	O(1)	Try each exponent; completely impractical
Baby-step Giant-step	O(√p)	O(√p)	Time-space tradeoff; still exponential
Pollard's Rho	O(√p)	O(1)	Random walk; more practical than BSGS
Pollard's Kangaroo	O(√p)	O(1)	When log is known to lie in interval
Index Calculus	Subexponential	Large	Uses factor base; specific to Z*_p
Number Field Sieve	L_p[1/3, 1.9]	Massive	Best for very large p; used in records

Subexponential but Not Polynomial

Computational Hardness Assumptions

Cryptographic security proofs reduce to hardness assumptions. For Diffie-Hellman, three related but distinct problems are relevant:

1. Discrete Logarithm Problem (DLP):

Given: (p, g, g^a)
Find: a
This is what we've discussed. If you can solve DLP, you break DH trivially.

2. Computational Diffie-Hellman (CDH) Problem:

Given: (p, g, g^a, g^b)
Find: g^(ab)
This is exactly what Eve needs to compute. CDH ≤ DLP (DLP solves CDH), but the reverse is not proven.

3. Decisional Diffie-Hellman (DDH) Problem:

Given: (p, g, g^a, g^b, h) where h is either g^(ab) or random
Decide: which case h represents
DDH ≤ CDH ≤ DLP. DDH is the strongest (easiest) assumption.

Converting Mermaid diagram...

Why these distinctions matter:

For basic key exchange: CDH hardness suffices. Eve needs to compute g^(ab).
For key derivation security: DDH hardness is needed. The shared secret should be indistinguishable from random.
Security proofs: Many protocols prove security under DDH, which assumes CDH (and thus implicitly DLP).

Are these actually hard?

No mathematical proof exists that DLP is hard. However:

Decades of cryptanalytic effort have produced only subexponential (not polynomial) algorithms
The best attacks require massive resources; DLP records involve years of computation
Quantum computers would change everything (Shor's algorithm solves DLP in polynomial time), but practical quantum computers of sufficient size don't exist yet

Discrete Log Computation Records
Year	Bits	Method	Time/Resources
2016	1024	NFS (special prime)	~2 months, many cores
2019	795	NFS (general)	~3100 core-years
2023	912	NFS	Massive compute cluster
—	2048+	—	Currently infeasible

Logjam Attack: Weak Groups Are Real Threats

Why Modular Arithmetic Creates Hardness

It's reasonable to ask: why is modular exponentiation hard to invert when regular exponentiation is easy? Understanding this provides intuition for the security model.

Regular logarithm:

Given y = 10^x, finding x is trivial:

x = log₁₀(y)
Use a calculator, Taylor series, or binary search
Smooth, monotonic function with analytical tools

Discrete logarithm:

Given y = 10^x mod 101, finding x is hard:

The modular reduction destroys smoothness
The sequence 10¹, 10², 10³, ... mod 101 appears pseudorandom
No monotonicity, no calculus, no binary search

modular_chaos.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
"""
Visualizing Why Modular Arithmetic Creates Hardness
 
The 'chaos' introduced by modular reduction is why
discrete logarithms are hard.
"""
 
def visualize_modular_sequence(g: int, p: int):
    """
    Show how modular exponentiation scrambles values.
    """
    print(f"Powers of {g} modulo {p}:")
    print("=" * 50)
    
    powers = []
    val = 1
    for i in range(1, p):
        val = (val * g) % p
        powers.append(val)
        print(f"  {g}^{i:2} mod {p} = {val:3}", end="")
        
        # Show as "percentage" of range for visual
        bar_len = int(val * 40 / p)
        print(f"  |{'#' * bar_len}")
    
    print()
    print("Observations:")
    print(f"  - Values jump unpredictably between 1 and {p-1}")
    print(f"  - No monotonic pattern")
    print(f"  - Knowing g^x doesn't help find g^(x+1) without knowing x")
    print(f"  - The only structure: values eventually cycle with period {p-1}")
    
    return powers
 
 
def statistical_analysis(g: int, p: int):
    """
    Analyze statistical properties of the sequence.
    
    A good cryptographic group should have:
    - Uniform distribution
    - Long period
    - No discernible pattern
    """
    from statistics import mean, stdev
    
    powers = []
    val = 1
    for i in range(1, p):
        val = (val * g) % p
        powers.append(val)
    
    print(f"
Statistical Analysis (g={g}, p={p}):")
    print(f"  Period: {len(powers)}")
    print(f"  Mean: {mean(powers):.2f} (expected: {(p-1)/2:.2f})")
    print(f"  StdDev: {stdev(powers):.2f} (uniform would be ~{(p-1)/(2*1.73):.2f})")
    
    # Check if all values appear exactly once
    unique = set(powers)
    if len(unique) == p - 1:
        print(f"  Generator: YES (all {p-1} non-zero values appear)")
    else:
        print(f"  Generator: NO (only {len(unique)} unique values)")
 
 
def compare_continuous_discrete():
    """
    Compare continuous vs discrete logarithms.
    """
    import math
    
    print("
CONTINUOUS VS DISCRETE LOGARITHM")
    print("=" * 50)
    
    # Continuous: finding x such that 10^x = y
    y_continuous = 1000
    x_continuous = math.log10(y_continuous)
    print(f"
Continuous: 10^x = 1000")
    print(f"  Solution: x = log₁₀(1000) = {x_continuous}")
    print(f"  Method: Direct formula, O(1)")
    
    # Discrete: finding x such that 10^x ≡ y (mod 101)
    g, p, y_discrete = 10, 101, 78
    print(f"
Discrete: 10^x ≡ {y_discrete} (mod 101)")
    
    # Find by brute force
    val = 1
    for x in range(1, p):
        val = (val * g) % p
        if val == y_discrete:
            print(f"  Solution: x = {x}")
            break
    
    print(f"  Method: Must try values (at best O(√p))")
    print(f"  Why hard: modular reduction destroys log properties:")
    print(f"           log(ab mod p) ≠ log(a) + log(b)")
 
 
# Run demonstrations
visualize_modular_sequence(3, 17)
statistical_analysis(5, 23)
compare_continuous_discrete()

The Information-Theoretic View

Implications for Parameter Selection

The mathematics we've covered directly informs how DH parameters should be chosen in practice. Each mathematical concept translates to a specific security requirement.

Parameter Selection Guidelines

•Prime size → Security level: 2048 bits ≈ 112-bit security; 3072 bits ≈ 128-bit security. Match to your threat model.
•Safe prime → No small subgroups: Use p = 2q + 1 where q is prime. This prevents subgroup confinement attacks.
•Primitive root → Full group coverage: The generator g must have order p-1 (or q for safe primes). Small generators like 2, 3, 5 are fine if verified.
•Prime structure → Algorithm resistance: Avoid primes with special structure (e.g., p-1 with many small factors) that could be exploited by index calculus variants.
•Standardized groups → Vetted security: RFC 3526, RFC 7919 provide audited parameter sets. Prefer these over custom generation.

Recommended DH Parameter Sizes (NIST/IETF)
Security Level	Prime Size (bits)	Comparable Symmetric	Use Case
112-bit	2048	3DES, AES-128	General purpose, current standard
128-bit	3072	AES-128+	Medium-term security (10+ years)
192-bit	7680	AES-192	Long-term, high security
256-bit	15360	AES-256	Post-quantum consideration (pre-QC)

Modern Recommendation

Summary: Mathematical Foundations

We've built a rigorous mathematical foundation for understanding Diffie-Hellman security. Let's consolidate these concepts:

Key Mathematical Concepts

•Group theory provides the algebraic structure: closure, associativity, identity, inverses. DH operates in the abelian group Z*_p.
•Modular arithmetic in Z*_p has special properties: Fermat's theorem (a^(p-1) ≡ 1), existence of inverses, and well-defined structure.
•Primitive roots (generators) produce all group elements. Choosing good generators ensures full security.
•The discrete logarithm problem is the foundation of DH security. Best algorithms are subexponential, making large primes secure.
•Hardness assumptions (DLP, CDH, DDH) form a hierarchy of security guarantees used in cryptographic proofs.
•Parameter selection translates mathematical requirements into practical choices: prime size, safe primes, verified generators.

What's next:

Page Complete

3 / 5