Loading content...
Having established why the key exchange problem is so challenging, we now turn to the elegant solution devised by Whitfield Diffie and Martin Hellman. The Diffie-Hellman key exchange protocol achieves what seemed impossible: it allows two parties who have never communicated before to establish a shared secret over a completely public channel.
What makes this protocol remarkable is its simplicity. The core algorithm involves just a handful of mathematical operations, yet its security rests on a problem that has resisted the efforts of mathematicians for decades. In this page, we'll walk through every step of the protocol, examine real numerical examples, and understand exactly how and why it works.
By the end of this page, you will be able to execute the Diffie-Hellman protocol by hand using small numbers, understand each participant's computations, trace the flow of public and private values, and explain why an eavesdropper who observes all public communications cannot compute the shared secret.
Before diving into the mathematics, let's understand the protocol's structure at a high level. The Diffie-Hellman key exchange involves five phases:
Only two things are secret in the entire protocol: Alice's private exponent 'a' and Bob's private exponent 'b'. Everything else—p, g, A = g^a mod p, B = g^b mod p—can be transmitted openly over insecure channels. The final shared secret K = g^(ab) mod p is never transmitted at all.
The security of Diffie-Hellman depends critically on the choice of parameters. Let's examine each carefully.
The Prime Modulus p:
The prime p defines the mathematical structure within which all operations occur. This prime must be:
p where (p-1)/2 is also prime. This ensures the group has no small subgroups that could be exploited.123456789101112131415161718192021222324252627282930313233
import secretsfrom sympy import isprime, randprime def generate_safe_prime(bits: int) -> int: """ Generate a safe prime p where (p-1)/2 is also prime. Safe primes prevent small subgroup attacks and ensure the generator has maximum order. """ while True: # Generate a prime q of (bits-1) length q = randprime(2**(bits-2), 2**(bits-1)) # Compute potential safe prime p = 2q + 1 p = 2 * q + 1 if isprime(p): print(f"Found safe prime: p = {p}") print(f"Sophie Germain prime: q = {q}") return p # For demonstration, we'll use a small safe prime# In production, use at least 2048 bits!demo_p = 23 # Small safe prime: (23-1)/2 = 11 is primeprint(f"Demo prime p = {demo_p}")print(f"Check: (p-1)/2 = {(demo_p-1)//2}, is prime: {isprime((demo_p-1)//2)}") # Standard groups from RFC 3526 (for real applications):# - Group 14: 2048-bit MODP Group# - Group 15: 3072-bit MODP Group # - Group 16: 4096-bit MODP Group# These are carefully vetted and widely usedThe Generator g:
The generator g is an element that, when repeatedly exponentiated, can produce any element of the multiplicative group modulo p. Specifically, g should be a primitive root modulo p, meaning the sequence:
g¹ mod p, g² mod p, g³ mod p, ..., g^(p-1) mod p
produces all integers from 1 to p-1 (in some order) before repeating.
Properties of a good generator:
| Exponent x | 5^x mod 23 | Value |
|---|---|---|
| 1 | 5¹ mod 23 | 5 |
| 2 | 5² mod 23 | 2 |
| 3 | 5³ mod 23 | 10 |
| 4 | 5⁴ mod 23 | 4 |
| 5 | 5⁵ mod 23 | 20 |
| 6 | 5⁶ mod 23 | 8 |
| 7 | 5⁷ mod 23 | 17 |
| 8 | 5⁸ mod 23 | 16 |
| 9 | 5⁹ mod 23 | 11 |
| 10 | 5¹⁰ mod 23 | 9 |
| 11 | 5¹¹ mod 23 | 22 |
| ... | ... | ... |
| 22 | 5²² mod 23 | 1 (cycle completes) |
In practice, most implementations use standardized groups from RFCs. RFC 3526 defines MODP groups (traditional DH), while RFC 7748 defines modern elliptic curve parameters. Using standard groups prevents subtle parameter weaknesses and ensures interoperability.
Let's execute the Diffie-Hellman protocol with concrete numbers. We'll use small values for clarity—in production, all numbers would be thousands of bits long.
Given parameters (public):
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148
"""Complete Diffie-Hellman Key Exchange Implementation This implementation demonstrates the protocol step-by-stepwith detailed explanations at each phase.""" import secrets class DiffieHellman: def __init__(self, p: int, g: int, name: str): """ Initialize a DH participant with public parameters. Args: p: The prime modulus (public) g: The generator (public) name: Participant name for logging """ self.p = p self.g = g self.name = name self.private_key = None # Will be generated self.public_key = None # Will be computed self.shared_secret = None # Will be derived def generate_private_key(self) -> int: """ STEP 1: Generate a random private exponent. The private key must be: - Randomly chosen from [2, p-2] - Kept absolutely secret - Never transmitted or logged In production, use cryptographically secure randomness. """ # For demonstration, use small range # Production: self.private_key = secrets.randbelow(self.p - 2) + 2 self.private_key = secrets.randbelow(self.p - 3) + 2 print(f"[{self.name}] Generated private key: {self.private_key} (SECRET!)") return self.private_key def compute_public_key(self) -> int: """ STEP 2: Compute public key from private key. Formula: public_key = g^private_key mod p This is a one-way operation: - Computing g^a mod p is efficient: O(log a) multiplications - Reversing (finding a from g^a mod p) is the DLP """ if self.private_key is None: raise ValueError("Must generate private key first") self.public_key = pow(self.g, self.private_key, self.p) print(f"[{self.name}] Computed public key: {self.g}^{self.private_key} mod {self.p} = {self.public_key}") return self.public_key def compute_shared_secret(self, other_public_key: int) -> int: """ STEP 3: Compute shared secret using other party's public key. Formula: shared_secret = other_public^my_private mod p Alice computes: B^a mod p = (g^b)^a mod p = g^(ab) mod p Bob computes: A^b mod p = (g^a)^b mod p = g^(ab) mod p Both arrive at the same value! """ if self.private_key is None: raise ValueError("Must generate private key first") self.shared_secret = pow(other_public_key, self.private_key, self.p) print(f"[{self.name}] Computed shared secret: {other_public_key}^{self.private_key} mod {self.p} = {self.shared_secret}") return self.shared_secret def run_key_exchange(): """ Execute a complete Diffie-Hellman key exchange between Alice and Bob. """ print("=" * 60) print("DIFFIE-HELLMAN KEY EXCHANGE DEMONSTRATION") print("=" * 60) # Public parameters (would be standardized in practice) p = 23 # Prime modulus g = 5 # Generator print(f"📋 PUBLIC PARAMETERS (visible to everyone, including Eve):") print(f" Prime p = {p}") print(f" Generator g = {g}") # Create participants alice = DiffieHellman(p, g, "Alice") bob = DiffieHellman(p, g, "Bob") # Step 1: Each party generates a private key print(f"🔐 STEP 1: PRIVATE KEY GENERATION") alice.generate_private_key() bob.generate_private_key() # Step 2: Each party computes their public key print(f"📤 STEP 2: PUBLIC KEY COMPUTATION") alice_public = alice.compute_public_key() bob_public = bob.compute_public_key() # These public keys can be transmitted openly print(f"📡 STEP 3: PUBLIC KEY EXCHANGE") print(f" Alice sends to Bob: A = {alice_public}") print(f" Bob sends to Alice: B = {bob_public}") print(f" (Eve sees both: A = {alice_public}, B = {bob_public})") # Step 4: Each party computes the shared secret print(f"🔑 STEP 4: SHARED SECRET COMPUTATION") alice_secret = alice.compute_shared_secret(bob_public) bob_secret = bob.compute_shared_secret(alice_public) # Verify keys match print(f"✅ VERIFICATION:") print(f" Alice's shared secret: {alice_secret}") print(f" Bob's shared secret: {bob_secret}") assert alice_secret == bob_secret, "CRITICAL: Keys don't match!" print(f" MATCH: Both parties have identical key = {alice_secret}") # Eve's perspective print(f"👀 EVE'S VIEW (what the eavesdropper sees):") print(f" Public parameters: p = {p}, g = {g}") print(f" Alice's public key: A = {alice_public}") print(f" Bob's public key: B = {bob_public}") print(f" Shared secret: ??? (must solve discrete log to find)") return alice_secret if __name__ == "__main__": shared_key = run_key_exchange() print(f"🎉 Key exchange complete! Shared secret = {shared_key}")Example execution with specific values:
| Step | Alice | Bob | What Eve Sees |
|---|---|---|---|
| Parameters | p=23, g=5 | p=23, g=5 | p=23, g=5 |
| Generate private | a=6 (secret!) | b=15 (secret!) | Nothing |
| Compute public | A = 5⁶ mod 23 = 8 | B = 5¹⁵ mod 23 = 19 | A=8, B=19 |
| Exchange | Receives B=19 | Receives A=8 | Sees both |
| Compute shared | K = 19⁶ mod 23 = 2 | K = 8¹⁵ mod 23 = 2 | ??? |
Verification that both get the same key:
Alice computes: B^a mod p = 19⁶ mod 23
Bob computes: A^b mod p = 8¹⁵ mod 23
Both arrive at shared secret K = 2.
The correctness of Diffie-Hellman rests on a fundamental property of exponentiation: the commutativity of repeated exponentiation.
The key identity:
$$(g^a)^b = g^{ab} = g^{ba} = (g^b)^a$$
This holds in any group, including the multiplicative group of integers modulo a prime.
Proof of protocol correctness:
Let's trace exactly what each party computes:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364
"""Mathematical Proof of Diffie-Hellman Correctness We prove that Alice and Bob always compute the same shared secret.""" # Setup: Public parameters# - p: a prime# - g: a generator modulo p # Alice:# 1. Chooses random private key: a# 2. Computes public key: A = g^a mod p# 3. Sends A to Bob # Bob:# 1. Chooses random private key: b# 2. Computes public key: B = g^b mod p# 3. Sends B to Alice # Alice's shared secret computation:# K_alice = B^a mod p# = (g^b mod p)^a mod p [substitute B = g^b mod p]# = g^(ba) mod p [property of modular exponentiation]# = g^(ab) mod p [commutativity of multiplication: ba = ab] # Bob's shared secret computation:# K_bob = A^b mod p# = (g^a mod p)^b mod p [substitute A = g^a mod p]# = g^(ab) mod p [property of modular exponentiation] # CONCLUSION:# K_alice = g^(ab) mod p = K_bob## Both parties compute the identical value g^(ab) mod p# QED def verify_correctness(p: int, g: int, a: int, b: int): """ Verify that DH produces identical keys for any valid inputs. """ # Alice's computations A = pow(g, a, p) # Public key K_alice = pow(pow(g, b, p), a, p) # Using Bob's public key # Bob's computations B = pow(g, b, p) # Public key K_bob = pow(pow(g, a, p), b, p) # Using Alice's public key # Direct computation of g^(ab) mod p K_direct = pow(g, a * b, p) print(f"Alice computes: (g^b)^a = {K_alice}") print(f"Bob computes: (g^a)^b = {K_bob}") print(f"Direct: g^(ab) = {K_direct}") assert K_alice == K_bob == K_direct print("✓ All three methods yield the same value!") return K_alice # Test with various parametersverify_correctness(p=23, g=5, a=6, b=15)verify_correctness(p=101, g=3, a=17, b=42)verify_correctness(p=997, g=7, a=123, b=456)Notice how the protocol exploits mathematical symmetry. Alice and Bob perform the same operations, just with different inputs. Neither needs to know the other's private value, yet both arrive at the same shared secret. This symmetry is what makes the protocol so beautiful and resistant to analysis.
Understanding why the protocol is secure is as important as understanding why it's correct. Let's analyze exactly what an eavesdropper (Eve) can and cannot do.
What Eve observes:
What Eve wants to compute:
The Diffie-Hellman Problem (DHP):
Given p, g, g^a mod p, and g^b mod p, compute g^(ab) mod p.
This is the Computational Diffie-Hellman (CDH) problem. It is believed to be computationally hard—no efficient algorithm is known.
| Attack Strategy | What Eve Attempts | Why It Fails |
|---|---|---|
| Solve discrete log for a | From A = g^a mod p, find a | DLP is computationally intractable for large primes |
| Solve discrete log for b | From B = g^b mod p, find b | Same: DLP is infeasible |
| Compute g^(ab) directly | Combine A and B somehow | No known algebraic method exists |
| Brute force a or b | Try all possible values | 2^2048 possibilities: more than atoms in universe |
| Meet-in-the-middle | Split the search space | Still requires √p operations: ~2^1024 |
Relationship between CDH and DLP:
If Eve could solve the Discrete Logarithm Problem (DLP), she could trivially solve CDH:
However, nobody has proven that CDH is as hard as DLP. It's possible (though considered unlikely) that a way exists to solve CDH without solving DLP.
The Decisional Diffie-Hellman (DDH) assumption:
An even stronger assumption states that given (g, g^a, g^b, g^c), Eve cannot efficiently distinguish whether c = ab or c is random. This is crucial for semantic security.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677
"""Analysis from Eve's Perspective: What the Eavesdropper Sees This demonstrates the computational barrier Eve faces.""" def eves_attack_naive(p: int, g: int, A: int, B: int) -> int: """ Eve's most naive attack: brute force discrete logarithm. Try every possible value of 'a' until we find A = g^a mod p. Then compute B^a mod p. Time complexity: O(p) - completely impractical for large p """ print(f"Eve attempts brute force on p = {p}") for a_guess in range(1, p): if pow(g, a_guess, p) == A: print(f" Found a = {a_guess} (required {a_guess} attempts)") secret = pow(B, a_guess, p) return secret return None def eves_attack_baby_giant(p: int, g: int, A: int, B: int) -> int: """ Baby-step Giant-step algorithm: Eve's smarter approach. Trade time for space: O(√p) time and O(√p) space. Still impractical for cryptographic-sized primes. """ import math n = p - 1 # Order of the group m = int(math.ceil(math.sqrt(n))) # Baby step: compute g^j mod p for j = 0, 1, ..., m-1 baby_steps = {} power = 1 for j in range(m): baby_steps[power] = j power = (power * g) % p # Giant step: compute A * g^(-im) mod p for i = 0, 1, ... # Looking for collision with baby steps factor = pow(g, -m, p) # g^(-m) mod p gamma = A for i in range(m): if gamma in baby_steps: a = i * m + baby_steps[gamma] print(f" Baby-giant found a = {a}") return pow(B, a, p) gamma = (gamma * factor) % p return None # Demonstrate on small parameters (where Eve can succeed)print("=== Small Parameter Example (Eve CAN break this) ===")p_small = 23g_small = 5A_small = 8 # = 5^6 mod 23B_small = 19 # = 5^15 mod 23 secret = eves_attack_naive(p_small, g_small, A_small, B_small)print(f"Eve recovered shared secret: {secret}") # Now imagine cryptographic-sized parametersprint("=== Why Real Parameters Are Safe ===")print("For a 2048-bit prime p:")print(f" - Brute force: ~2^2048 operations")print(f" - Baby-giant: ~2^1024 operations")print(f" - Neither is feasible with any technology")print(f" - 2^80 operations is roughly the security boundary")print(f" - 2^1024 is approximately 10^300 times larger")The security level of DH is approximately half the bit length of the prime p. A 2048-bit prime provides roughly 112 bits of security, while a 3072-bit prime provides about 128 bits. Always use standardized groups with vetted security levels.
Implementing Diffie-Hellman correctly requires attention to several subtle but critical details. Many real-world vulnerabilities have emerged from implementation mistakes rather than mathematical weaknesses.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121
"""Secure Diffie-Hellman Implementation with Best Practices This demonstrates production-quality implementation patterns.""" import secretsimport hashlibimport hmac class SecureDiffieHellman: """ Production-oriented DH implementation with security best practices. """ # RFC 3526 Group 14: 2048-bit MODP Group # This is a well-vetted, standardized parameter set RFC3526_P = int( "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1" "29024E088A67CC74020BBEA63B139B22514A08798E3404DD" "EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245" "E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED" "EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D" "C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F" "83655D23DCA3AD961C62F356208552BB9ED529077096966D" "670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B" "E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9" "DE2BCBF6955817183995497CEA956AE515D2261898FA0510" "15728E5A8AACAA68FFFFFFFFFFFFFFFF", 16 ) RFC3526_G = 2 def __init__(self): self.p = self.RFC3526_P self.g = self.RFC3526_G self.private_key = None self.public_key = None def generate_keypair(self) -> int: """ Generate a cryptographically secure key pair. CRITICAL: Use secrets module, not random! """ # Private key must be in range [2, p-2] # Using secrets ensures cryptographic randomness self.private_key = secrets.randbelow(self.p - 3) + 2 # Public key computation self.public_key = pow(self.g, self.private_key, self.p) return self.public_key def validate_public_key(self, other_public: int) -> bool: """ Validate that a received public key is legitimate. This prevents small subgroup attacks and invalid key attacks. """ # Check range: must be in [2, p-2] if not (2 <= other_public <= self.p - 2): return False # For safe primes, check that key^q = 1 mod p # where q = (p-1)/2 # This ensures the key has the correct order q = (self.p - 1) // 2 if pow(other_public, q, self.p) != 1: return False return True def compute_shared_secret(self, other_public: int) -> bytes: """ Compute shared secret with proper key derivation. NEVER use raw DH output directly as a key! """ if not self.validate_public_key(other_public): raise ValueError("Invalid public key received") if self.private_key is None: raise ValueError("Must generate keypair first") # Raw DH shared secret raw_secret = pow(other_public, self.private_key, self.p) # Convert to bytes secret_bytes = raw_secret.to_bytes( (raw_secret.bit_length() + 7) // 8, 'big' ) # CRITICAL: Apply key derivation function # This provides: # - Domain separation # - Uniform output distribution # - Protection against related-key attacks derived_key = self._kdf(secret_bytes, info=b"DH-shared-secret", length=32) return derived_key def _kdf(self, secret: bytes, info: bytes, length: int) -> bytes: """ HKDF-style key derivation (simplified). """ # In production, use actual HKDF from cryptography library return hashlib.pbkdf2_hmac('sha256', secret, info, 1, length) # Usage examplealice = SecureDiffieHellman()bob = SecureDiffieHellman() alice_public = alice.generate_keypair()bob_public = bob.generate_keypair() alice_key = alice.compute_shared_secret(bob_public)bob_key = bob.compute_shared_secret(alice_public) assert alice_key == bob_keyprint(f"Shared key (hex): {alice_key.hex()}")Raw DH output has biased bits and predictable structure. Using it directly as an encryption key is a serious vulnerability. Always process through a key derivation function (KDF) like HKDF to produce uniform, pseudorandom key material.
Basic Diffie-Hellman has a critical limitation: it provides no authentication. The protocol ensures that two parties establish a shared secret, but neither party can verify who they're communicating with. This vulnerability enables the Man-in-the-Middle (MitM) attack.
How the attack works:
Why basic DH is vulnerable:
The protocol has no way for Alice to verify that the "g^b" she receives actually came from Bob. Nothing in pure mathematics proves identity.
Solutions:
Bare DH is rarely used in practice. Real protocols (TLS, SSH, IPsec) combine DH with authentication mechanisms. Never deploy unauthenticated key exchange in production systems.
We've thoroughly examined the Diffie-Hellman key exchange protocol—from its high-level structure through implementation details and security considerations. Let's consolidate the key points:
What's next:
With the algorithm thoroughly understood, we'll next explore the mathematical foundations in greater depth. You'll learn about group theory, the structure of Z*_p, why certain primes are better than others, and the precise computational assumptions that underpin DH security. This deeper understanding will prepare you for modern elliptic curve variants.
You can now execute Diffie-Hellman by hand, explain each step's purpose, analyze why eavesdroppers cannot compute the shared secret, and identify the protocol's authentication limitation. The next page will provide the rigorous mathematical foundation that makes all of this work.