Hash Functions - Learning Module

Loading content...

0/276

Determinism — Same Input, Same Output

The Non-Negotiable Property of Hash Functions

Imagine you're building a massive online shopping platform. A customer creates an account with the password "SecurePass123". Your system hashes this password and stores the hash. Later, when the customer tries to log in, you hash their entered password and compare it to the stored hash.

Now imagine this scenario: the hash function produces a different output for the same password on the second attempt. The customer is locked out of their account forever. Not because they forgot their password. Not because of a security breach. But because the hash function was non-deterministic.

This seemingly simple requirement—same input must always produce same output—is the most fundamental property of any hash function. Without it, hash tables, password storage, data integrity verification, and countless other applications would be impossible.

What You Will Learn

By the end of this page, you will understand: (1) What determinism means in the context of hash functions, (2) Why determinism is an absolute requirement rather than a nice-to-have, (3) What can break determinism and how to avoid it, (4) How determinism enables hash table correctness, and (5) The mathematical formalization of deterministic hashing.

Defining Determinism Precisely

Before we can deeply understand determinism in hash functions, we need to define it with mathematical precision. In computer science, a function is deterministic if, given the same input, it always produces the same output—no matter when it's called, how many times it's called, or what state the system is in.

Formally, if we have a hash function h, then determinism is expressed as:

1
2
3
4
5
6
For all inputs x and y:
  If x = y, then h(x) = h(y)
 
Equivalently:
  h: Domain → Codomain is a well-defined mathematical function
  Each element in the domain maps to exactly one element in the codomain

This may seem trivially obvious—isn't that what a function is? In mathematics, yes. But in computing, many operations that look like functions can exhibit non-deterministic behavior due to:

Random number generators embedded in the computation
System state that influences the result (time, memory addresses, etc.)
Floating-point precision that varies across platforms
Uninitialized memory being read as part of the input
Race conditions in concurrent environments

A Critical Distinction

Don't confuse determinism with reversibility. A deterministic function always gives the same output for the same input, but this doesn't mean you can reverse the process to get the input from the output. Hash functions are deterministic but explicitly designed to be non-reversible (one-way). The determinism flows in one direction only.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
# DETERMINISTIC: Same input → Same output, always
def deterministic_hash(key: str) -> int:
    """A simple deterministic hash function."""
    hash_value = 0
    for char in key:
        hash_value = (hash_value * 31 + ord(char)) & 0xFFFFFFFF
    return hash_value
 
# Testing determinism
print(deterministic_hash("hello"))  # Always: 99162322
print(deterministic_hash("hello"))  # Always: 99162322
print(deterministic_hash("hello"))  # Always: 99162322
 
# NON-DETERMINISTIC: Same input → Different output
import random
import time
 
def non_deterministic_hash(key: str) -> int:
    """A broken hash function that uses random values."""
    hash_value = 0
    for char in key:
        # BUG: Adding random value makes output unpredictable
        hash_value += ord(char) + random.randint(0, 100)
    return hash_value
 
# Testing shows non-determinism
print(non_deterministic_hash("hello"))  # 587
print(non_deterministic_hash("hello"))  # 734
print(non_deterministic_hash("hello"))  # 512
 
# Another example of accidental non-determinism
def time_influenced_hash(key: str) -> int:
    """A hash that accidentally depends on time."""
    # BUG: Using current time in hash computation
    base = int(time.time() * 1000) % 1000000
    hash_value = base
    for char in key:
        hash_value = (hash_value * 31 + ord(char)) & 0xFFFFFFFF
    return hash_value

Why Determinism Is Non-Negotiable

Determinism isn't just a "nice property" of hash functions—it's the foundational assumption upon which all hash-based data structures and algorithms are built. Without determinism, hash tables become useless, password verification breaks, file integrity checks fail, and distributed systems become inconsistent.

Let's examine each critical application that depends on deterministic hashing:

Hash Table Correctness

•Insertion: When you insert a key-value pair, the hash function computes an index: index = hash(key) % table_size. The key-value pair is stored at this index.
•Lookup: When you look up a key, the same hash function must compute the same index. If the hash changes, you'll look in the wrong location and never find your data.
•Deletion: To delete a key, you must find it first—which requires computing the same hash. A different hash means looking in the wrong place.
•The Contract: Hash tables promise O(1) average-time operations. This promise is only valid if the hash function reliably maps keys to their stored locations.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
class HashTable:
    def __init__(self, size=16):
        self.size = size
        self.buckets = [[] for _ in range(size)]
        self.hash_function = self._deterministic_hash
    
    def _deterministic_hash(self, key):
        """Hash must be deterministic for table to work."""
        h = 0
        for char in str(key):
            h = (h * 31 + ord(char)) & 0xFFFFFFFF
        return h % self.size
    
    def insert(self, key, value):
        # Step 1: Compute index using hash(key)
        index = self.hash_function(key)
        
        # Step 2: Store at that index
        bucket = self.buckets[index]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # Update existing
                return
        bucket.append((key, value))  # Insert new
    
    def lookup(self, key):
        # CRITICAL: Must compute SAME index as insert()
        index = self.hash_function(key)  # Same hash function!
        
        # Look in that bucket
        bucket = self.buckets[index]
        for k, v in bucket:
            if k == key:
                return v
        return None  # Not found
    
    def delete(self, key):
        # CRITICAL: Must compute SAME index
        index = self.hash_function(key)  # Same hash function!
        
        bucket = self.buckets[index]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False
 
# Demonstration of determinism enabling correctness
table = HashTable()
table.insert("user:1001", {"name": "Alice", "balance": 5000})
table.insert("user:1002", {"name": "Bob", "balance": 3000})
 
# This works ONLY because hash("user:1001") is the same as before
print(table.lookup("user:1001"))  # {"name": "Alice", "balance": 5000}
 
# If hash were non-deterministic, this would randomly fail
# and silently return None despite the data existing in the table

Password Storage & Verification

•Storage: When a user creates a password, you hash it and store the hash (never the password itself).
•Verification: When they log in, you hash the entered password and compare it to the stored hash.
•Requirement: The hash of the same password must be identical every time. If it differs, legitimate users cannot authenticate.
•Security Implication: Non-deterministic password hashing would either lock out users or require storing passwords in plaintext—a catastrophic security failure.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
import hashlib
 
class SecureAuthSystem:
    def __init__(self):
        self.password_hashes = {}
    
    def register_user(self, username: str, password: str):
        """Store hash of password, not the password itself."""
        # Hash the password (with salt in production)
        password_hash = self._hash_password(password)
        self.password_hashes[username] = password_hash
        print(f"Registered {username} with hash: {password_hash[:20]}...")
    
    def authenticate(self, username: str, password: str) -> bool:
        """Verify password by comparing hashes."""
        if username not in self.password_hashes:
            return False
        
        stored_hash = self.password_hashes[username]
        
        # CRITICAL: This must produce the SAME hash as registration
        entered_hash = self._hash_password(password)
        
        # Determinism guarantee: same password → same hash
        return stored_hash == entered_hash
    
    def _hash_password(self, password: str) -> str:
        """Deterministic password hashing."""
        # SHA-256 is deterministic: same input always gives same output
        return hashlib.sha256(password.encode()).hexdigest()
 
# Demonstration
auth = SecureAuthSystem()
auth.register_user("alice", "MySecretPass123")
 
# This works because SHA-256 is deterministic
print(auth.authenticate("alice", "MySecretPass123"))  # True
print(auth.authenticate("alice", "WrongPassword"))    # False
 
# If hashing were non-deterministic:
# - Registration stores hash H1
# - Login computes hash H2 ≠ H1 (different due to non-determinism)
# - User is locked out despite correct password!

Data Integrity Verification

•File Downloads: When you download a file, the website often provides a checksum (hash). You compute the hash of your download and compare.
•Git Version Control: Git uses SHA-1 (now SHA-256) hashes to identify every commit, tree, and blob. The same content always has the same hash.
•Blockchain: Each block contains the hash of the previous block. Non-determinism would break the chain entirely.
•Database Replication: Distributed databases use hashes to detect data divergence between replicas.

The Catastrophe of Non-Determinism

If hash functions were non-deterministic: (1) Hash tables would randomly lose data, (2) Users would be locked out of their accounts, (3) File integrity checks would always fail, (4) Git repositories would become corrupted, (5) Blockchains would collapse. Determinism is not optional—it's the foundation of modern computing infrastructure.

What Can Break Determinism

Determinism can be accidentally broken in subtle ways. Understanding these failure modes helps you avoid them and debug issues when they arise. Here are the most common ways determinism fails:

Common Sources of Non-Determinism in Hash Functions
Source	What Happens	How to Avoid
Memory Address Hashing	Hash depends on where object is stored in memory; changes between runs	Hash the object's content, not its memory address
Random Seeding	Some languages add random seeds to prevent hash collision attacks	Use explicit seed or content-based hash for persistence
Floating-Point Precision	IEEE 754 floating-point operations may vary across platforms	Use integer arithmetic or fixed-point representations
Object Field Ordering	Unordered containers (dicts, sets) may iterate differently	Sort keys before hashing or use ordered containers
Thread Race Conditions	Multiple threads modifying shared state during hashing	Ensure thread safety; hash immutable data
Platform Differences	Different byte orders, word sizes, or encodings	Specify encoding explicitly; test cross-platform

Python's Hash Randomization: A Case Study

Since Python 3.3, the default hash function for strings includes a random seed that changes each time Python starts. This was introduced to prevent hash collision denial-of-service attacks where attackers craft inputs that all hash to the same bucket, degrading O(1) performance to O(n).

While this is a security feature, it breaks determinism across program executions:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
# Python's built-in hash() is NOT deterministic across runs
# Run this script multiple times and observe different outputs:
 
print("Hash of 'hello':", hash("hello"))
 
# Output (Run 1): Hash of 'hello': 4891469850553217823
# Output (Run 2): Hash of 'hello': -8542176538266937945
# Output (Run 3): Hash of 'hello': 7267123495821043857
 
# This is intentional! It protects against hash collision attacks.
# But for persistent storage or cross-process communication, 
# you must use a deterministic alternative.
 
import hashlib
 
def consistent_hash(s: str) -> int:
    """A deterministic hash that's the same across all runs."""
    # SHA-256 is deterministic
    return int(hashlib.sha256(s.encode()).hexdigest(), 16)
 
# This is ALWAYS the same:
print("Deterministic hash:", consistent_hash("hello"))
# Output: 2cf24dba5fb0a30e26e83b2ac5b9e29e...
 
# You can disable Python's hash randomization with:
# PYTHONHASHSEED=0 python script.py
# But relying on this is fragile; use explicit hashing instead.

Memory Address Hashing: The Java Default Hash Problem

In Java, the default hashCode() method for objects without an override often returns a value derived from the object's memory address. This creates non-determinism because objects are allocated at different addresses each time the program runs:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
// BAD: Default hashCode() is often memory-address based
class User {
    String name;
    int age;
    
    User(String name, int age) {
        this.name = name;
        this.age = age;
    }
    
    // No hashCode() override = uses memory address
    // Two runs of the same program may give different hashes!
}
 
User u1 = new User("Alice", 30);
System.out.println(u1.hashCode());  // 366712642 (first run)
                                     // 1318246816 (second run)
 
// GOOD: Override hashCode() to be content-based
class UserFixed {
    String name;
    int age;
    
    UserFixed(String name, int age) {
        this.name = name;
        this.age = age;
    }
    
    @Override
    public int hashCode() {
        // Content-based: same fields → same hash
        int result = 17;
        result = 31 * result + (name != null ? name.hashCode() : 0);
        result = 31 * result + age;
        return result;
    }
    
    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        UserFixed other = (UserFixed) obj;
        return age == other.age && 
               Objects.equals(name, other.name);
    }
}
 
// Now hash is deterministic based on content
UserFixed u2 = new UserFixed("Alice", 30);
System.out.println(u2.hashCode());  // Always: -1404139888

The Golden Rule of Deterministic Hashing

Hash the content, not the container. If you hash based on what an object contains (its field values), the hash is deterministic. If you hash based on where the object lives (its memory address) or when it was created (timestamp, random seed), the hash becomes non-deterministic.

Formal Properties and Invariants

To reason rigorously about hash functions, we need to formalize determinism and its relationship to other properties. This formal understanding helps when designing systems and debugging issues.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
DEFINITION (Hash Function Determinism)
=========================================
A hash function h: K → H is deterministic if and only if:
 
  ∀ k₁, k₂ ∈ K : k₁ = k₂ ⟹ h(k₁) = h(k₂)
 
Stated in words:
"For all keys k₁ and k₂ in the key domain K,
 if k₁ equals k₂, then h(k₁) equals h(k₂)."
 
CONTRAPOSITIVE (Useful for Debugging)
=====================================
  h(k₁) ≠ h(k₂) ⟹ k₁ ≠ k₂
 
"If the hashes are different, the keys must be different."
 
This contrapositive is useful because:
- If you're debugging and see different hashes for what you 
  think is the same key, the keys are NOT actually equal.
- Check for trailing whitespace, encoding issues, invisible 
  characters, or floating-point precision differences.
 
NON-DETERMINISM DETECTION
=========================
A hash function is NON-deterministic if:
 
  ∃ k ∈ K : h(k) computed at time t₁ ≠ h(k) computed at time t₂
 
"There exists some key whose hash varies over time."
 
Test for this by hashing the same value multiple times:
  h("test") at 10:00:00 → 42
  h("test") at 10:00:01 → 42
  h("test") at 10:00:02 → 42
  ✓ Consistent → Likely deterministic
 
  h("test") at 10:00:00 → 42
  h("test") at 10:00:01 → 87
  h("test") at 10:00:02 → 156
  ✗ Inconsistent → Non-deterministic!

Important Corollaries and Relationships:

Key Relationships in Hash Function Properties
Property	Formal Statement	Implication
Determinism	k₁ = k₂ ⟹ h(k₁) = h(k₂)	Same key always produces same hash
Function Property	Each k maps to exactly one h(k)	Hash function is a valid mathematical function
NOT Injectivity	h(k₁) = h(k₂) ⟹̸ k₁ = k₂	Different keys CAN have same hash (collisions allowed)
NOT Surjectivity	Not every value in H is necessarily produced	Some hash values may never occur
Consistency Across Runs	h(k) is same between program executions	Persistent storage and distribution work
Consistency Across Machines	h(k) is same on different hardware	Distributed systems can agree on hash values

Determinism ≠ Injectivity

A common misconception: determinism means 'no collisions.' This is false. Determinism only guarantees that the SAME input produces the SAME output. Two DIFFERENT inputs can still produce the same output (a collision). Collisions are inevitable when hashing an infinite key space to a finite hash space.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
def simple_hash(key: str, table_size: int = 10) -> int:
    """A simple deterministic hash function."""
    h = 0
    for char in key:
        h = (h * 31 + ord(char))
    return h % table_size
 
# Collisions happen but don't violate determinism:
print(simple_hash("ab"))  # 5
print(simple_hash("bB"))  # 5  (collision!)
 
# These two different keys hash to the same value.
# This is NOT a bug—it's expected behavior.
 
# BUT: determinism is preserved:
for _ in range(1000):
    assert simple_hash("ab") == 5  # Always 5
    assert simple_hash("bB") == 5  # Always 5
 
print("Determinism verified: same inputs always produce same outputs")
 
# Key insight:
# - h("ab") = 5 (always)
# - h("bB") = 5 (always)  
# - h("ab") = h("bB") is a COLLISION
# - Both h("ab") and h("bB") are deterministic
# - Determinism and collision-freeness are independent properties

Testing for Determinism

Given the subtle ways determinism can break, it's critical to test hash functions before using them in production. Here are systematic approaches to verify determinism:

Determinism Testing Strategies

•Repeated Execution Test: Hash the same input many times in a single execution. All outputs must be identical.
•Cross-Execution Test: Run the program multiple times and compare hash outputs. They should match across runs.
•Cross-Platform Test: Run on different operating systems, architectures, and language versions. Outputs should match.
•Edge Case Test: Test empty strings, null values, very long inputs, unicode characters, and boundary values.
•Stress Test: Hash millions of inputs rapidly. No variance should appear under load.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
import hashlib
import json
from typing import Callable, Any, List
 
def test_determinism_repeated(
    hash_fn: Callable[[Any], int], 
    inputs: List[Any], 
    iterations: int = 10000
) -> bool:
    """Test that hash function gives same output for repeated calls."""
    print(f"Testing repeated execution ({iterations} iterations per input)...")
    
    for inp in inputs:
        first_hash = hash_fn(inp)
        for i in range(iterations):
            current_hash = hash_fn(inp)
            if current_hash != first_hash:
                print(f"  ✗ FAILED: Input '{inp}' gave {first_hash} then {current_hash}")
                return False
        print(f"  ✓ Input '{inp}' → {first_hash} (consistent {iterations}x)")
    
    return True
 
def test_determinism_saved_values(
    hash_fn: Callable[[Any], int],
    saved_file: str = "hash_reference.json"
) -> bool:
    """Test against previously saved reference values."""
    test_inputs = [
        "",
        "hello",
        "Hello",  # Case sensitivity
        "hello world",
        "hello  world",  # Extra space
        "hello
world",  # Newline
        "🔥🚀🌍",  # Unicode
        "a" * 10000,  # Long string
    ]
    
    try:
        with open(saved_file, 'r') as f:
            saved_hashes = json.load(f)
        
        print(f"Comparing against saved hashes from {saved_file}...")
        for inp in test_inputs:
            current = hash_fn(inp)
            saved = saved_hashes.get(inp)
            if saved is None:
                print(f"  ⚠ Input '{inp[:20]}...' not in saved file")
            elif current != saved:
                print(f"  ✗ MISMATCH: '{inp[:20]}...' was {saved}, now {current}")
                return False
            else:
                print(f"  ✓ Input '{inp[:20]}...' matches saved value")
        return True
        
    except FileNotFoundError:
        print(f"Creating reference file {saved_file}...")
        hashes = {inp: hash_fn(inp) for inp in test_inputs}
        with open(saved_file, 'w') as f:
            json.dump(hashes, f, indent=2)
        print("  Reference file created. Run again to verify.")
        return True
 
# Example: Testing a good hash function
def deterministic_hash(s: str) -> int:
    """Known-good deterministic hash."""
    return int(hashlib.sha256(str(s).encode()).hexdigest()[:16], 16)
 
# Run tests
test_inputs = ["hello", "world", "", "test123"]
assert test_determinism_repeated(deterministic_hash, test_inputs)
print("\n✓ All determinism tests passed!")

Cross-Process Testing

For production systems, save hash outputs to a file and verify they match on subsequent runs. This catches Python's hash randomization, memory-address hashing, and other cross-execution non-determinism that single-execution tests miss.

Determinism Enables Hash Table Correctness

Now let's see how determinism is the foundation of hash table correctness. Every hash table operation depends on the ability to compute the same index for the same key, every time.

With Determinism

•insert('key') stores at index 5
•lookup('key') computes index 5
•Key is found at index 5 ✓
•O(1) lookup achieved
•System works correctly

Without Determinism

•insert('key') stores at index 5
•lookup('key') computes index 8
•Index 8 is empty—key 'not found'
•Data appears lost (but exists!)
•System silently fails

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
import random
 
class BrokenHashTable:
    """A hash table with a non-deterministic hash function (DON'T DO THIS)."""
    
    def __init__(self, size=16):
        self.size = size
        self.buckets = [[] for _ in range(size)]
    
    def _broken_hash(self, key):
        """BUG: Uses random offset, making it non-deterministic."""
        base = sum(ord(c) for c in str(key))
        random_offset = random.randint(0, 10)  # ← THE BUG
        return (base + random_offset) % self.size
    
    def insert(self, key, value):
        index = self._broken_hash(key)
        for i, (k, v) in enumerate(self.buckets[index]):
            if k == key:
                self.buckets[index][i] = (key, value)
                return
        self.buckets[index].append((key, value))
        print(f"  Inserted '{key}' at index {index}")
    
    def lookup(self, key):
        index = self._broken_hash(key)
        print(f"  Looking for '{key}' at index {index}")
        for k, v in self.buckets[index]:
            if k == key:
                return v
        return None  # Not found!
 
# Demonstration of failure
print("=== Demonstrating Non-Deterministic Hash Table Failure ===")
print()
 
broken_table = BrokenHashTable()
 
# Insert a key
print("Step 1: Insert key 'user:1001'")
broken_table.insert("user:1001", {"name": "Alice"})
 
# Try to find it multiple times
print("\nStep 2: Try to lookup the same key multiple times")
for attempt in range(5):
    result = broken_table.lookup("user:1001")
    if result:
        print(f"  Attempt {attempt+1}: FOUND - {result}")
    else:
        print(f"  Attempt {attempt+1}: NOT FOUND (data lost!)")
 
# Sample output:
# Step 1: Insert key 'user:1001'
#   Inserted 'user:1001' at index 7
#
# Step 2: Try to lookup the same key multiple times
#   Looking for 'user:1001' at index 12
#   Attempt 1: NOT FOUND (data lost!)
#   Looking for 'user:1001' at index 3
#   Attempt 2: NOT FOUND (data lost!)
#   Looking for 'user:1001' at index 7
#   Attempt 3: FOUND - {'name': 'Alice'}  ← Randomly works!
#   Looking for 'user:1001' at index 15
#   Attempt 4: NOT FOUND (data lost!)

Silent Data Loss

Notice the insidious nature of this bug: the data isn't deleted—it's still in the table at index 7. But because the hash function gives different indices, lookups fail most of the time. The failure is silent, intermittent, and extremely hard to debug. This is why determinism is non-negotiable.

Summary: Determinism as Foundation

We've explored the foundational property of hash function determinism in depth. Let's consolidate the key insights:

Key Takeaways

•Determinism defined: Same input must always produce same output. This is the mathematical definition of a function applied to hash functions.
•Non-negotiable requirement: Hash tables, password storage, data integrity, and distributed systems all fundamentally require deterministic hashing.
•Common pitfalls: Memory addresses, random seeds, floating-point precision, unordered containers, and platform differences can break determinism.
•Python's hash(): Intentionally non-deterministic across runs for security. Use hashlib for persistent, cross-platform deterministic hashing.
•Testing is essential: Verify determinism with repeated execution, cross-execution, and cross-platform tests before production use.
•Hash tables depend on it: Insert and lookup compute the same index only if the hash is deterministic. Non-determinism causes silent data loss.

What's next:

Determinism ensures correctness, but a deterministic hash function can still be terrible if it produces the same output for many different inputs (poor distribution). The next page explores uniformity—the property that ensures hash values are spread evenly across the output space, minimizing collisions and maintaining O(1) performance.

Page Complete

You now understand determinism—the most fundamental property of hash functions. You can identify what breaks it, test for it, and understand why it's the foundation of all hash-based systems. Next, we'll explore uniformity: how to spread hash values evenly to minimize collisions.