In 2012, LinkedIn suffered a breach that exposed 6.5 million password hashes. The hashes were unsalted SHA-1—trivially cracked by modern standards. Within hours, attackers had recovered millions of plaintext passwords, enabling credential stuffing attacks across the internet.
This breach, and countless others, underscore a critical truth: how you store passwords determines whether a breach is a minor incident or a catastrophe. Even when attackers breach your system and steal your entire database, properly stored passwords should remain computationally infeasible to recover.
This page explores the evolution from plaintext to modern password hashing, the mathematics behind secure storage, and the specific algorithms that represent current best practices.
By the end of this page, you will understand why plaintext and simple hashing fail, how salting defeats precomputation attacks, why key derivation functions are essential, and how to implement bcrypt, scrypt, and Argon2 correctly.
Password storage has evolved through several generations, each addressing vulnerabilities of the previous approach.
Generation 1: Plaintext Storage
The earliest systems stored passwords directly:
username:password
alice:secret123
bob:password456
Fatal flaw: Any database breach instantly exposes every password. Insider threats, SQL injection, backup exposure—all lead to complete compromise.
Generation 2: Simple Hashing
Cryptographic hash functions provide one-way transformation:
username:hash(password)
alice:5f4dcc3b5aa765d61d8327deb882cf99 (MD5 of 'password')
Improvement: Attacker sees hash, not password. Weakness: Identical passwords produce identical hashes. Rainbow tables precompute hash→password mappings for millions of common passwords.
| Generation | Approach | Breach Impact | Attack Resistance |
|---|---|---|---|
| 1. Plaintext | Store password directly | Catastrophic | None |
| 2. Simple hashing | H(password) | Severe | Defeated by rainbow tables |
| 3. Salted hashing | H(salt + password) | Moderate | Per-password attack required |
| 4. KDF | KDF(password, salt, cost) | Minimal | High computational cost per attempt |
Generation 3: Salted Hashing
A unique random salt for each password:
username:salt:hash(salt + password)
alice:x7Km9:a3f2b8c1d4e5...
bob:Pq2Rv:9e8d7c6b5a4...
Improvement: Same password produces different hashes. Rainbow tables useless—would need separate table per salt. Weakness: Fast hash functions (MD5, SHA-1, SHA-256) enable billions of attempts per second on GPU.
Generation 4: Key Derivation Functions (KDFs)
Purposefully slow, memory-hard functions:
username:algorithm$cost$salt$hash
alice:argon2id$19$x7Km9$a3f2b8c1d4e5...
Key insight: Make each verification attempt computationally expensive. If one attempt takes 100ms instead of 1μs, brute force becomes 100,000× slower.
SHA-256, SHA-512, and MD5 are designed to be fast. A single GPU can compute billions of SHA-256 hashes per second. These are appropriate for data integrity, not password storage. Always use dedicated password hashing functions.
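To make the cost gap concrete, the following minimal sketch times one SHA-256 call against one scrypt call using only Python's standard library. scrypt stands in here for any deliberately slow KDF. The parameters (`n=2**14, r=8, p=1`) and the helper name `seconds_per_call` are illustrative assumptions, not tuned recommendations, and absolute timings vary by machine.

```python
import hashlib
import os
import time


def seconds_per_call(func, repeats: int) -> float:
    """Return the average wall-clock time of one call to func."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats


password = b"correct horse battery staple"
salt = os.urandom(16)

# Fast general-purpose hash: averaged over many calls because one call is tiny.
fast = seconds_per_call(lambda: hashlib.sha256(salt + password).digest(), 10_000)

# Deliberately slow, memory-hard KDF (about 16 MiB of memory with these parameters).
slow = seconds_per_call(
    lambda: hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32), 3
)

print(f"SHA-256: {fast * 1e6:.2f} microseconds per hash")
print(f"scrypt:  {slow * 1e3:.2f} milliseconds per hash")
print(f"Slowdown per guess: about {slow / fast:,.0f}x")
```

The printed ratio is the factor by which each attacker guess becomes more expensive. A CPU measurement like this understates the gap on GPUs, where fast hashes parallelize far better than memory-hard functions.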
Salts are random values concatenated with passwords before hashing. Understanding salt requirements is essential for secure implementation.
Salt Requirements:
Why Salts Work:
Without salt, identical passwords produce identical hashes:
md5("password") = 5f4dcc3b5aa765d61d8327deb882cf99
Attacker precomputes hash for every common password once, then looks up any stolen hash.
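The precomputation idea can be sketched with a plain dictionary standing in for a rainbow table (a real rainbow table is a space-optimized version of the same lookup). The password list and account names below are made up for illustration.

```python
import hashlib

# Hypothetical list of common passwords the attacker hashes in advance.
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein", "secret123"]


def md5_hex(text: str) -> str:
    """Unsalted MD5, as used by the weak 'Generation 2' scheme."""
    return hashlib.md5(text.encode()).hexdigest()


# Built once by the attacker, reusable against every unsalted MD5 database.
lookup_table = {md5_hex(p): p for p in COMMON_PASSWORDS}

# Stolen unsalted hashes (hypothetical accounts).
stolen = {
    "alice": md5_hex("secret123"),
    "bob": md5_hex("password"),
    "carol": md5_hex("x9!Lq7#vTz"),  # not in the attacker's list
}

# Cracking is a dictionary lookup per account; no hashing happens at attack time.
cracked = {user: lookup_table.get(h) for user, h in stolen.items()}
print(cracked)
```

Every account whose password appears in the precomputed list falls immediately, and the same table works against any other database that uses the same unsalted hash.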
With salt, each account needs separate attack:
hash("salt1" + "password") = a3f2b8c1...
hash("salt2" + "password") = 7e9d4f2a...
Precomputation becomes infeasible—would need rainbow table for each possible salt (2¹²⁸ possibilities with 128-bit salt).
```python
"""Salt Generation and Usage Demonstration"""
import hashlib
import secrets


def generate_salt(length: int = 16) -> bytes:
    """Generate cryptographically secure random salt."""
    return secrets.token_bytes(length)


def hash_password_with_salt(password: str, salt: bytes) -> bytes:
    """Hash password with salt (educational - use KDF in production)."""
    return hashlib.sha256(salt + password.encode()).digest()


def demonstrate_salting():
    """Show why salting prevents rainbow table attacks."""
    password = "password123"

    # Without salt - same password, same hash
    print("Without salting:")
    for i in range(3):
        h = hashlib.sha256(password.encode()).hexdigest()
        print(f"  User {i+1}: {h[:32]}...")
    print("  All identical! Rainbow table attack works.\n")

    # With salt - same password, different hashes
    print("With per-user salt:")
    for i in range(3):
        salt = generate_salt()
        h = hash_password_with_salt(password, salt)
        print(f"  User {i+1}: salt={salt.hex()[:16]}... hash={h.hex()[:32]}...")
    print("  All different! Each requires separate attack.")


if __name__ == "__main__":
    demonstrate_salting()
```

Unlike the password, the salt is stored in plaintext alongside the hash. Its purpose is to ensure uniqueness, not secrecy. An attacker with database access sees the salt, but still must attack each password individually.
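Because the salt sits in the clear next to the hash, verification simply reads it back and recomputes. The sketch below shows that round trip with the standard library, assuming a hypothetical record layout (a dict with `salt` and `hash` fields) and using scrypt rather than a bare fast hash. The function names are illustrative.

```python
import hashlib
import hmac
import secrets


def make_record(password: str) -> dict:
    """Create a stored record; the salt is kept in the clear next to the hash."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)
    return {"salt": salt.hex(), "hash": digest.hex()}


def check_password(password: str, record: dict) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt = bytes.fromhex(record["salt"])
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)
    return hmac.compare_digest(digest.hex(), record["hash"])


# Two users pick the same password but end up with unrelated records.
alice = make_record("password123")
bob = make_record("password123")
print(alice["hash"] != bob["hash"])
print(check_password("password123", alice), check_password("wrong", alice))
```

`hmac.compare_digest` is used for the comparison so that the time taken does not depend on how many leading characters match.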
Key Derivation Functions (KDFs) are specifically designed for password hashing with adjustable computational cost.
Design Goals:
The Big Three:
bcrypt (1999)
scrypt (2009)
Argon2 (2015)
| Algorithm | Memory Hardness | GPU Resistance | Recommended For |
|---|---|---|---|
| bcrypt | Low (~4KB) | Moderate | Legacy compatibility |
| scrypt | High (configurable) | High | Cryptocurrency, storage |
| Argon2id | High (configurable) | Very High | New applications (preferred) |
```python
"""Modern Password Hashing with Argon2, bcrypt, and scrypt"""
import argon2
import bcrypt
import hashlib
import os
import time


# ============ Argon2 (Recommended) ============
def hash_argon2(password: str) -> str:
    """
    Hash password with Argon2id (recommended algorithm).

    Default parameters provide good security/performance balance.
    """
    ph = argon2.PasswordHasher(
        time_cost=3,          # Number of iterations
        memory_cost=65536,    # 64 MiB of memory (value is in KiB)
        parallelism=4,        # 4 parallel threads
        hash_len=32,          # 256-bit hash
        salt_len=16,          # 128-bit salt
        type=argon2.Type.ID,  # Argon2id (hybrid)
    )
    return ph.hash(password)


def verify_argon2(password: str, hash_str: str) -> bool:
    """Verify password against Argon2 hash."""
    ph = argon2.PasswordHasher()
    try:
        ph.verify(hash_str, password)
        return True
    except argon2.exceptions.VerifyMismatchError:
        return False


# ============ bcrypt ============
def hash_bcrypt(password: str, rounds: int = 12) -> str:
    """
    Hash password with bcrypt.

    Cost factor 12 = 2^12 iterations = ~250ms on modern CPU.
    """
    salt = bcrypt.gensalt(rounds=rounds)
    return bcrypt.hashpw(password.encode(), salt).decode()


def verify_bcrypt(password: str, hash_str: str) -> bool:
    """Verify password against bcrypt hash."""
    return bcrypt.checkpw(password.encode(), hash_str.encode())


# ============ scrypt ============
def hash_scrypt(password: str) -> str:
    """Hash password with scrypt."""
    salt = os.urandom(16)
    hash_bytes = hashlib.scrypt(
        password.encode(),
        salt=salt,
        n=2**14,   # CPU/memory cost
        r=8,       # Block size
        p=1,       # Parallelization
        dklen=32,  # Output length
    )
    return f"{salt.hex()}${hash_bytes.hex()}"


def benchmark_algorithms():
    """Compare hashing times."""
    password = "SecurePassword123!"

    for name, func in [
        ("Argon2id", lambda: hash_argon2(password)),
        ("bcrypt (12)", lambda: hash_bcrypt(password, 12)),
        ("scrypt", lambda: hash_scrypt(password)),
    ]:
        start = time.time()
        result = func()
        elapsed = time.time() - start
        print(f"{name}: {elapsed*1000:.1f}ms")
        print(f"  Hash: {result[:50]}...")


if __name__ == "__main__":
    benchmark_algorithms()
```

Set work factor so hashing takes 100-500ms on your production hardware. This is imperceptible to legitimate users but makes brute force attacks take years. Increase work factor as hardware improves. Argon2 stores parameters in the hash for automatic detection.
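One way to pick a work factor for a target latency is to time increasing costs on the deployment hardware and stop at the first one that is slow enough. The sketch below does this for scrypt with the standard library. The function names, the search bounds, and the 256 MiB memory ceiling are assumptions for illustration, and a single timing sample is noisy, so a real calibration would average several runs.

```python
import hashlib
import os
import time

MAXMEM = 256 * 1024 * 1024  # assumed ceiling; scrypt needs roughly 128 * r * n bytes


def time_scrypt(n: int, r: int = 8, p: int = 1) -> float:
    """Return the seconds taken by a single scrypt call with the given cost."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.scrypt(b"benchmark-password", salt=salt, n=n, r=r, p=p,
                   maxmem=MAXMEM, dklen=32)
    return time.perf_counter() - start


def calibrate_scrypt(target_seconds: float = 0.1,
                     min_log2_n: int = 14,
                     max_log2_n: int = 17) -> int:
    """Return the smallest power-of-two n whose measured time meets the target.

    Falls back to the largest allowed n if the target is never reached.
    """
    for log2_n in range(min_log2_n, max_log2_n + 1):
        n = 2 ** log2_n
        if time_scrypt(n) >= target_seconds:
            return n
    return 2 ** max_log2_n


if __name__ == "__main__":
    chosen = calibrate_scrypt(target_seconds=0.1)
    print(f"Chosen scrypt n = {chosen} (2^{chosen.bit_length() - 1})")
```

The same doubling search applies to the bcrypt cost factor or the Argon2 time and memory costs; only the timed call changes.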
Modern password hashes encode algorithm, parameters, salt, and hash in a single string for portability and future-proofing.
Modular Crypt Format (MCF):
Used by Unix systems and many applications:
$<algorithm>$<parameters>$<salt>$<hash>
Examples:
# bcrypt
$2b$12$N9qo8u7LlDKjuFVXPBK5r.l5sT7W5kxA8MvQ1Z2nQ3eDwYgPM.Bu
  2b: algorithm identifier
  12: cost factor (2^12 iterations)
  remainder: salt + hash
# Argon2id
$argon2id$v=19$m=65536,t=3,p=4$c2FsdHNhbHRzYWx0$hash...
  argon2id: algorithm identifier
  v=19: version
  m=65536: memory cost (KiB)
  t=3: iterations (time cost)
  p=4: parallelism
  c2FsdHNhbHRzYWx0: salt (base64)
  hash...: hash (base64)
# SHA-512 crypt (Linux)
$6$rounds=5000$saltvalue$hashvalue...
  6: algorithm (6 = SHA-512)
  rounds=5000: iteration count
  saltvalue: salt
  hashvalue...: hash
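Since the fields are `$`-separated, an encoded Argon2 string can be split into its parts with a few lines of code. The sketch below is a minimal parser for illustration only: the function name and returned keys are hypothetical, the hash portion in the example is a placeholder, and in practice the hashing library reads these fields itself during verification.

```python
def parse_argon2_encoded(encoded: str) -> dict:
    """Split an Argon2 encoded string into its fields (the hash is not validated)."""
    parts = encoded.split("$")
    # The leading "$" yields an empty first element:
    # ["", algorithm, version, parameters, salt, hash]
    if len(parts) != 6 or parts[0] != "":
        raise ValueError("unexpected encoded-hash format")
    _, algorithm, version, params, salt_b64, hash_b64 = parts
    fields = dict(item.split("=", 1) for item in params.split(","))
    return {
        "algorithm": algorithm,
        "version": int(version.split("=", 1)[1]),
        "memory_kib": int(fields["m"]),
        "time_cost": int(fields["t"]),
        "parallelism": int(fields["p"]),
        "salt_b64": salt_b64,
        "hash_b64": hash_b64,
    }


# "EXAMPLEHASH" is a placeholder, not a real Argon2 output.
info = parse_argon2_encoded(
    "$argon2id$v=19$m=65536,t=3,p=4$c2FsdHNhbHRzYWx0$EXAMPLEHASH"
)
print(info["algorithm"], info["memory_kib"], info["time_cost"], info["parallelism"])
```

Having the parameters embedded in each stored value is what lets a system raise its cost settings later while still verifying older hashes.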
Storage Considerations:
When upgrading from weaker algorithms, rehash passwords opportunistically: after a successful login against the old hash, compute the new hash from the password the user just supplied and update storage. Never retain plaintext passwords in order to rehash them later; verify-then-upgrade is the only safe pattern.
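The verify-then-upgrade flow can be sketched as follows. To stay within the standard library, the example upgrades an scrypt cost parameter rather than switching algorithms, and it assumes a hypothetical storage format `scrypt$<log2 n>$<salt hex>$<hash hex>`. All names and the cost policy are illustrative; the same control flow applies when moving from an old algorithm to a new one.

```python
import hashlib
import hmac
import os
from typing import Tuple

CURRENT_LOG2_N = 15        # hypothetical current cost policy (n = 2**15)
R, P = 8, 1                # scrypt block size and parallelism
MAXMEM = 64 * 1024 * 1024  # allow up to 64 MiB so n = 2**15 fits


def _derive(password: str, salt: bytes, log2_n: int) -> bytes:
    return hashlib.scrypt(password.encode(), salt=salt, n=2 ** log2_n,
                          r=R, p=P, maxmem=MAXMEM, dklen=32)


def hash_password(password: str, log2_n: int = CURRENT_LOG2_N) -> str:
    """Encode cost, salt, and hash in one string (hypothetical format)."""
    salt = os.urandom(16)
    return f"scrypt${log2_n}${salt.hex()}${_derive(password, salt, log2_n).hex()}"


def verify_and_upgrade(password: str, stored: str) -> Tuple[bool, str]:
    """Verify with the stored parameters; on success, rehash if they are outdated.

    Returns (is_valid, value_to_store). The stored value is unchanged when
    verification fails or the parameters already meet the current policy.
    """
    _, log2_n_text, salt_hex, hash_hex = stored.split("$")
    log2_n = int(log2_n_text)
    candidate = _derive(password, bytes.fromhex(salt_hex), log2_n)
    if not hmac.compare_digest(candidate, bytes.fromhex(hash_hex)):
        return False, stored
    if log2_n < CURRENT_LOG2_N:
        # The plaintext is available only now, during a successful login.
        return True, hash_password(password)
    return True, stored


old = hash_password("hunter2", log2_n=12)      # legacy record with a weak cost
ok, updated = verify_and_upgrade("hunter2", old)
print(ok, updated.split("$")[1])
```

The caller is expected to write `value_to_store` back to the database whenever it differs from the stored value.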
While salts provide per-password uniqueness, peppers add an additional secret not stored in the database.
Pepper Concept:
A pepper is a secret key applied during hashing:
hash = KDF(password + pepper, salt)
Unlike salt (stored with hash), pepper is:
Security Benefit:
If attacker steals only the database, they have hashes and salts but not the pepper. Without pepper, they cannot even verify guesses—adding an additional barrier.
Implementation Approaches:
1. HMAC Pepper (Recommended):
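```python
import hashlib
import hmac

from argon2 import PasswordHasher


def hash_with_pepper(password: str, pepper: bytes) -> str:
    # First: HMAC the password, using the pepper as the HMAC key
    peppered = hmac.new(pepper, password.encode(), hashlib.sha256).digest()
    # Then: run the KDF on the peppered value (hex-encoded to a str)
    return PasswordHasher().hash(peppered.hex())
```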
2. Encryption Layer: Encrypt the password hash with a key not in the database:
stored = encrypt(hash(password, salt), encryption_key)
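Returning to the HMAC approach (option 1), the effect of the pepper can be checked end to end with the standard library. In this sketch scrypt stands in for the KDF, `os.urandom(32)` stands in for a pepper that would normally come from a secrets manager or HSM, and the function names are hypothetical.

```python
import hashlib
import hmac
import os


def peppered_hash(password: str, salt: bytes, pepper: bytes) -> bytes:
    """HMAC the password with the pepper, then run the result through a KDF."""
    keyed = hmac.new(pepper, password.encode(), hashlib.sha256).digest()
    return hashlib.scrypt(keyed, salt=salt, n=2**14, r=8, p=1, dklen=32)


def verify_peppered(password: str, salt: bytes, pepper: bytes, stored: bytes) -> bool:
    """Recompute with the same salt and pepper, then compare in constant time."""
    return hmac.compare_digest(peppered_hash(password, salt, pepper), stored)


pepper = os.urandom(32)       # stand-in: kept outside the database in practice
salt = os.urandom(16)         # stored next to the hash, as usual
stored = peppered_hash("secret123", salt, pepper)

# The application, which holds the pepper, verifies normally.
print(verify_peppered("secret123", salt, pepper, stored))

# An attacker with only the database (salt + hash) must guess the pepper too;
# even the correct password fails to verify under a wrong pepper.
print(verify_peppered("secret123", salt, os.urandom(32), stored))
```

Using the pepper as an HMAC key, rather than concatenating it with the password, keeps the construction independent of password length and avoids ambiguity about where the password ends and the pepper begins.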
| Layer | What It Protects Against | Where Stored |
|---|---|---|
| Hashing (KDF) | Plaintext recovery | N/A (irreversible) |
| Salt | Rainbow tables, identical hash detection | With hash in database |
| Pepper/HMAC | Database-only breach | Secrets manager, HSM, config |
| Encryption | Database-only breach | Key management system |
| HSM | Server memory attacks | Hardware module |
Hardware Security Modules (HSMs):
For highest security, password operations occur within tamper-resistant hardware:
Peppering Caveats:
Peppers add defense in depth but are not a substitute for strong KDFs. Start with Argon2id; add a pepper as an additional layer for high-value targets.
The strongest password storage combines: strong KDF (Argon2id) + unique salt + pepper/HMAC + hardware protection. Each layer protects against different breach scenarios, and an attacker must defeat all layers to recover passwords.
Understanding attacker capabilities helps calibrate password storage parameters.
Attacker Hardware Capabilities (2024):
| Hardware | SHA-256/sec | bcrypt (cost 12)/sec | Argon2id (64MB)/sec |
|---|---|---|---|
| Single CPU | ~10M | ~4 | ~3 |
| Single GPU | ~10B | ~20K | Limited by memory |
| GPU Cluster (100) | ~1T | ~2M | Memory-bound |
| ASIC Farm | ~100T | ~10M | Not cost-effective |
Time to Crack Analysis:
For a password with about 40 bits of entropy (typical of human-chosen passwords), as rough order-of-magnitude estimates:
| Storage Method | GPU Cluster Time |
|---|---|
| MD5/SHA-256 | < 1 second |
| bcrypt cost 10 | ~6 hours |
| bcrypt cost 12 | ~1 day |
| Argon2id (64MB, t=3) | ~1 week |
For a password with about 60 bits of entropy (a strong passphrase), again as rough order-of-magnitude estimates:
| Storage Method | GPU Cluster Time |
|---|---|
| MD5/SHA-256 | ~1 day |
| bcrypt cost 12 | ~100,000 years |
| Argon2id (64MB, t=3) | ~10 million years |
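Estimates of this kind come from dividing the size of the search space by the attacker's guess rate. The sketch below works that arithmetic through, taking the GPU-cluster rates from the hardware table (about 1T SHA-256 and 2M bcrypt cost-12 guesses per second) as inputs. Results depend entirely on the assumed rate, so they should be read as orders of magnitude and will not line up exactly with rounded table entries. The function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def worst_case_seconds(entropy_bits: int, guesses_per_second: float) -> float:
    """Time to try every candidate in a space of 2**entropy_bits passwords."""
    return 2 ** entropy_bits / guesses_per_second


# Assumed rates, taken from the GPU cluster row of the hardware table.
SHA256_CLUSTER = 1e12      # guesses per second
BCRYPT12_CLUSTER = 2e6     # guesses per second

for bits in (40, 60):
    sha = worst_case_seconds(bits, SHA256_CLUSTER)
    bc = worst_case_seconds(bits, BCRYPT12_CLUSTER)
    print(f"{bits}-bit password:")
    print(f"  SHA-256:        {sha:.3g} s ({sha / SECONDS_PER_DAY:.3g} days)")
    print(f"  bcrypt cost 12: {bc / SECONDS_PER_DAY:.3g} days "
          f"({bc / SECONDS_PER_YEAR:.3g} years)")
```

On average an attacker finds the password after searching half the space, so expected times are half the worst-case values. Each extra bit of entropy doubles the time, as does each increment of the bcrypt cost factor.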
Even the best storage cannot protect weak passwords. 'password123' will fall to targeted attack regardless of algorithm. Strong storage buys time and raises costs—it doesn't make weak passwords secure. Combine proper storage with password policies and breach detection.
Proper implementation requires attention to details that libraries handle correctly but custom code often gets wrong.
Recommended Library Choices:
| Language | Recommended Library |
|---|---|
| Python | argon2-cffi, bcrypt |
| JavaScript/Node | argon2, bcrypt |
| Java | Spring Security, jBCrypt |
| Go | golang.org/x/crypto/argon2, golang.org/x/crypto/bcrypt |
| PHP | password_hash() (built-in, uses bcrypt/argon2) |
| Ruby | bcrypt-ruby, argon2 |
| .NET | BCrypt.Net-Next, Konscious.Security.Cryptography |
Implementation Checklist:
Parameter Recommendations (2024):
Argon2id:
bcrypt:
scrypt:
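bcrypt truncates input at 72 bytes. For long passwords/passphrases, first hash with SHA-256, encode the digest as base64 or hex, then pass the result to bcrypt: bcrypt(base64(sha256(password))). The encoding step matters because a raw digest can contain NUL bytes, which many bcrypt implementations treat as the end of the input. This preserves up to 256 bits of entropy while avoiding truncation.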
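The pre-hashing step can be sketched with the standard library alone. The helper name `prehash_for_bcrypt` is hypothetical. The digest is base64-encoded here as a safeguard, since a raw SHA-256 digest can contain NUL bytes that many bcrypt implementations treat as the end of the input. The result would then be passed to the bcrypt library in place of the raw password.

```python
import base64
import hashlib


def prehash_for_bcrypt(password: str) -> bytes:
    """Compress a password of any length into 44 NUL-free bytes for bcrypt."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()  # 32 raw bytes
    return base64.b64encode(digest)                              # 44 ASCII bytes


# Two long passphrases that are identical in their first 72 bytes.
shared_prefix = "a" * 72
first = prehash_for_bcrypt(shared_prefix + "-ending-one")
second = prehash_for_bcrypt(shared_prefix + "-ending-two")

# Plain bcrypt would see only the shared 72-byte prefix; the pre-hashes differ.
print(len(first), first != second)

# With the bcrypt package, the call would then look like:
#   bcrypt.hashpw(prehash_for_bcrypt(password), bcrypt.gensalt())
```

The same helper must be applied at verification time, so it is worth recording in the stored format (or in application configuration) that pre-hashing is in use.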
Password storage is the last line of defense when everything else fails. Proper implementation transforms a database breach from catastrophic credential exposure to a computationally infeasible attack.
Module Complete:
This module has covered the complete authentication landscape: from password-based authentication through multi-factor mechanisms, biometric systems, authentication protocols, and secure credential storage. Together, these concepts form the foundation for building secure systems that properly verify user identity while protecting credentials against compromise.
You now understand the complete password storage landscape—from historical failures through modern best practices. Apply this knowledge to protect user credentials: use Argon2id with appropriate parameters, validate your implementation, and plan for algorithm evolution.