All the algorithms we've examined—Round Robin, Weighted Round Robin, Least Connections—share a fundamental property: each request is routed independently of previous requests. The load balancer has no memory.
This works perfectly for stateless applications. But many applications keep state on the server: login sessions, shopping carts, cached user data.
If a user's first request creates session data on Server A, subsequent requests routed to Server B won't find that data. The session breaks.
IP Hash solves this by using the client's IP address to deterministically select a server. The same IP always routes to the same server (as long as the server pool is stable), providing session affinity without shared session storage.
By the end of this page, you will understand IP Hash mechanics and hash function selection, its strengths and critical limitations, implementation in production load balancers, and alternative approaches to session affinity.
IP Hash uses a deterministic hash function to map client IP addresses to backend servers.
The Basic Algorithm:
1. Compute `hash(client_ip)`
2. Compute `server_index = hash(client_ip) % number_of_servers`
3. Route the request to `servers[server_index]`

The Determinism Property:
Given the same inputs (IP address, server list), the same output (server selection) always results. This is what provides session affinity—a client with IP 192.168.1.50 will always reach the same server.
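To make the arithmetic concrete (the hash value here is illustrative, not from any particular function): if `hash("192.168.1.50")` = 2,847,195,481 and there are 3 servers, then 2,847,195,481 % 3 = 1, so the client is routed to `servers[1]`—and will be again on every subsequent request.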
```python
import hashlib
import socket
import struct
import threading
from typing import List


class IPHashLoadBalancer:
    """
    IP Hash load balancer implementation.

    Routes requests based on client IP address, ensuring the
    same client always reaches the same server.
    """

    def __init__(self, servers: List[str]):
        """
        Initialize with list of backend server addresses.

        Args:
            servers: Ordered list of server addresses
        """
        if not servers:
            raise ValueError("At least one server required")
        self._servers = list(servers)
        self._lock = threading.Lock()

    def _ip_to_int(self, ip: str) -> int:
        """
        Convert IP address string to integer for hashing.
        Handles both IPv4 and IPv6 addresses.
        """
        try:
            # Try IPv4 first
            packed = socket.inet_aton(ip)
            return struct.unpack("!I", packed)[0]
        except socket.error:
            try:
                # Try IPv6
                packed = socket.inet_pton(socket.AF_INET6, ip)
                # Use first 64 bits for hashing
                return struct.unpack("!Q", packed[:8])[0]
            except socket.error:
                # Fallback: derive a deterministic integer from the string.
                # (Python's built-in hash() is salted per process and would
                # break determinism across restarts.)
                digest = hashlib.md5(ip.encode()).digest()
                return struct.unpack("!Q", digest[:8])[0]

    def _compute_hash(self, ip: str) -> int:
        """
        Compute hash of IP address.

        Uses MD5 for consistent, well-distributed hashing.
        (Not for cryptographic purposes - just distribution)
        """
        ip_int = self._ip_to_int(ip)
        # Use MD5 for good distribution
        h = hashlib.md5(str(ip_int).encode())
        return int(h.hexdigest(), 16)

    def get_server(self, client_ip: str) -> str:
        """
        Select server for the given client IP.

        Deterministic: same IP always returns same server
        (as long as server list is unchanged).

        Time Complexity: O(1)
        """
        with self._lock:
            if not self._servers:
                raise RuntimeError("No servers available")
            hash_value = self._compute_hash(client_ip)
            index = hash_value % len(self._servers)
            return self._servers[index]

    def add_server(self, server: str) -> None:
        """
        Add a server to the pool.

        WARNING: This changes the hash distribution!
        Many clients will be remapped to different servers.
        """
        with self._lock:
            if server not in self._servers:
                self._servers.append(server)

    def remove_server(self, server: str) -> None:
        """
        Remove a server from the pool.

        WARNING: This changes the hash distribution!
        Clients of the removed server AND some other clients
        will be remapped.
        """
        with self._lock:
            if server in self._servers:
                self._servers.remove(server)


# Demonstration
if __name__ == "__main__":
    servers = ["10.0.1.1:8080", "10.0.1.2:8080", "10.0.1.3:8080"]
    lb = IPHashLoadBalancer(servers)

    # Simulate clients
    clients = [
        "192.168.1.50",
        "192.168.1.51",
        "192.168.1.52",
        "10.0.0.1",
        "172.16.0.100",
    ]

    print("IP Hash Distribution:")
    print("-" * 50)
    for client_ip in clients:
        server = lb.get_server(client_ip)
        print(f"Client {client_ip:15} → {server}")

    print("\nSame client, multiple requests (should be same server):")
    for i in range(3):
        server = lb.get_server("192.168.1.50")
        print(f"  Request {i+1}: 192.168.1.50 → {server}")

    print("\nAfter removing a server:")
    lb.remove_server("10.0.1.2:8080")
    for client_ip in clients:
        server = lb.get_server(client_ip)
        print(f"Client {client_ip:15} → {server}")
```

Use a well-distributed hash function like MD5, SHA-1, or xxHash. The built-in `hash()` function in many languages is not suitable—it may differ between runs or machines, breaking the determinism property. Cryptographic strength isn't needed; distribution quality is what matters.
IP Hash has a significant problem: server pool changes cause massive remapping.
Example: Adding a Server
Original pool: [A, B, C] (3 servers)
After adding server D: [A, B, C, D] (4 servers)
Result: roughly 75% of clients get remapped to different servers—their sessions break!
The Math:
When changing from n to n+1 servers:
Fraction of clients remapped ≈ n / (n + 1)

This gets worse as you scale. Adding one server to a 100-server pool remaps roughly 99% of clients!
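You can check this empirically. Below is a standalone sketch (using MD5-modulo selection, mirroring the implementation above) that measures how many simulated clients change servers when one server is added:

```python
import hashlib

def pick(ip: str, n_servers: int) -> int:
    """Modulo-hash server index for a client IP."""
    digest = hashlib.md5(ip.encode()).hexdigest()
    return int(digest, 16) % n_servers

# 10,000 synthetic client IPs
clients = [f"10.0.{i >> 8 & 255}.{i & 255}" for i in range(10_000)]

for n in (3, 100):
    moved = sum(pick(ip, n) != pick(ip, n + 1) for ip in clients)
    print(f"{n} -> {n + 1} servers: {moved / len(clients):.0%} remapped")
# Prints roughly 75% for 3 -> 4 and 99% for 100 -> 101
```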
Simple IP Hash is only suitable for stable server pools. If you're frequently adding/removing servers (autoscaling, deployments, failures), the session disruption is unacceptable. Use Consistent Hashing instead (covered in the next page).
Choosing the right IP address to hash is more nuanced than it appears.
| Source | Description | Considerations |
|---|---|---|
| Direct Connection IP | IP of the TCP connection | Only correct if no proxies/CDN between client and LB |
| X-Forwarded-For header | Client IP as reported by upstream proxies | Can be spoofed! Must trust upstream proxy |
| X-Real-IP header | Alternative client IP header | Same spoofing concerns as X-Forwarded-For |
| CF-Connecting-IP | Cloudflare's client IP header | Only valid behind Cloudflare |
| True-Client-IP | Akamai's client IP header | Only valid behind Akamai |
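When forwarded headers are in play, the safe pattern is to walk X-Forwarded-For from right to left, skip every proxy you trust, and hash the first untrusted address. Here's a minimal sketch of that logic (the trusted CIDR ranges are placeholders for your own infrastructure):

```python
import ipaddress
from typing import Optional

# Placeholder trusted-proxy ranges - substitute your own infrastructure
TRUSTED_PROXIES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
]

def _is_trusted(ip: str) -> bool:
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # Malformed entries are never trusted
    return any(addr in net for net in TRUSTED_PROXIES)

def client_ip_for_hashing(peer_ip: str, xff: Optional[str]) -> str:
    """
    Pick the IP to feed into the hash. Walks X-Forwarded-For right
    to left past trusted proxies; never believes a header sent by
    an untrusted peer.
    """
    if not _is_trusted(peer_ip) or not xff:
        return peer_ip  # Direct connection, or untrusted peer's header
    for hop in reversed([h.strip() for h in xff.split(",")]):
        if not _is_trusted(hop):
            return hop  # First untrusted hop = real client, as best we know
    return peer_ip  # Entire chain is trusted proxies; fall back
```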
NAT and Shared IP Problems:

Many clients share IP addresses:

- Corporate networks: an entire office exits through one NAT gateway
- Mobile carriers: carrier-grade NAT can put thousands of subscribers behind a single address
- Universities and public Wi-Fi: large user populations share egress IPs

Impact: all users behind a shared address hash to the same backend server, creating hot spots and skewing the load distribution.

Mitigations: hash on the subnet instead (see the sketch below), combine the IP with another request attribute, or prefer cookie-based affinity, which identifies users individually.
```python
import hashlib
import socket
import struct
from typing import List


class SubnetHashLoadBalancer:
    """
    IP Hash using subnet portion of address.

    Hashes the first 24 bits of IPv4 addresses (the /24 subnet),
    which can help with some NAT scenarios while maintaining
    geographic locality.
    """

    def __init__(self, servers: List[str], subnet_bits: int = 24):
        """
        Args:
            servers: Backend server list
            subnet_bits: Number of bits to use for hashing (1-32 for IPv4)
        """
        self._servers = list(servers)
        self._subnet_bits = subnet_bits
        self._mask = (0xFFFFFFFF << (32 - subnet_bits)) & 0xFFFFFFFF

    def _get_subnet(self, ip: str) -> int:
        """Extract subnet portion of IP address."""
        try:
            packed = socket.inet_aton(ip)
            ip_int = struct.unpack("!I", packed)[0]
            return ip_int & self._mask
        except socket.error:
            # Deterministic fallback for non-IPv4 input.
            # (Python's built-in hash() is salted per process.)
            digest = hashlib.md5(ip.encode()).digest()
            return struct.unpack("!I", digest[:4])[0]

    def get_server(self, client_ip: str) -> str:
        """Select server based on subnet hash."""
        subnet = self._get_subnet(client_ip)
        h = hashlib.md5(str(subnet).encode())
        hash_value = int(h.hexdigest(), 16)
        index = hash_value % len(self._servers)
        return self._servers[index]


# Demonstration
if __name__ == "__main__":
    servers = ["Server-A", "Server-B", "Server-C"]
    lb = SubnetHashLoadBalancer(servers, subnet_bits=24)

    # Clients in same /24 subnet
    clients = [
        "192.168.1.1",
        "192.168.1.50",
        "192.168.1.200",
        "192.168.2.1",   # Different subnet
        "192.168.2.50",
    ]

    print("Subnet-based IP Hash (/24):")
    print("-" * 50)
    for client_ip in clients:
        server = lb.get_server(client_ip)
        subnet = client_ip.rsplit('.', 1)[0] + ".0/24"
        print(f"Client {client_ip:15} (subnet {subnet}) → {server}")
```

IPv6 addresses are 128 bits. The first 64 bits typically identify the network, while the last 64 identify the host. For IP hash, using the first 64 bits (or even 48) is usually appropriate. Full address hashing may be unnecessarily granular.
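To illustrate that note, here's a small helper (a sketch—adjust `prefix_bits` to taste) that extracts the leading bits of an IPv6 address for use as a hash key:

```python
import socket
import struct

def ipv6_hash_key(ip: str, prefix_bits: int = 64) -> int:
    """
    Return the leading prefix_bits (up to 64) of an IPv6 address
    as an integer, suitable as a key for subnet-style affinity.
    """
    if not 1 <= prefix_bits <= 64:
        raise ValueError("prefix_bits must be between 1 and 64")
    packed = socket.inet_pton(socket.AF_INET6, ip)
    high = struct.unpack("!Q", packed[:8])[0]  # first 64 bits
    return high >> (64 - prefix_bits) if prefix_bits < 64 else high

# Two hosts in the same /64 network share a key; /48 groups whole sites
print(ipv6_hash_key("2001:db8:85a3::1") == ipv6_hash_key("2001:db8:85a3::2"))  # True
```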
Let's examine IP Hash configuration in production load balancers.
```nginx
# NGINX IP Hash Configuration

upstream backend_ip_hash {
    # Enable IP hash
    ip_hash;

    # Server list - order matters for consistency!
    # Servers should always be listed in the same order
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;

    # Graceful removal: mark as down, not removed
    # This minimizes remapping - only clients of this server affected
    # server 10.0.1.4:8080 down;

    # Note: the "backup" parameter is NOT allowed with ip_hash;
    # a hot-spare must be handled at another layer:
    # server 10.0.1.5:8080 backup;  # INVALID with ip_hash
}

upstream ip_hash_with_weights {
    ip_hash;

    # Weights work differently with ip_hash:
    # Higher weight = more hash slots, more clients routed here
    server 10.0.1.1:8080 weight=3;
    server 10.0.1.2:8080 weight=2;
    server 10.0.1.3:8080 weight=1;
    # Approximate distribution: 3:2:1
}

# For X-Forwarded-For handling
upstream ip_hash_xff {
    ip_hash;
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}

server {
    listen 80;

    # Trust proxy headers from internal sources only
    set_real_ip_from 10.0.0.0/8;
    set_real_ip_from 172.16.0.0/12;
    real_ip_header X-Forwarded-For;
    real_ip_recursive on;  # Skip trusted proxies; use rightmost untrusted IP

    location / {
        proxy_pass http://backend_ip_hash;

        # Forward client IP for logging/app use
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```
```haproxy
# HAProxy IP Hash (Source) Configuration

global
    maxconn 50000

defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 60s

backend app_source_hash
    # Source IP hash balancing
    balance source

    # Hash options
    hash-type consistent    # Use consistent hashing (recommended!)
    # hash-type map-based   # Simple modulo (not recommended)

    server app1 10.0.1.1:8080 check
    server app2 10.0.1.2:8080 check
    server app3 10.0.1.3:8080 check

backend app_source_with_xff
    balance source
    hash-type consistent

    # Use X-Forwarded-For for client IP
    # Only if you trust the upstream proxy!
    http-request set-header X-Client-IP %[req.hdr(X-Forwarded-For),word(1,',')] if { req.hdr(X-Forwarded-For) -m found }

    server app1 10.0.1.1:8080 check
    server app2 10.0.1.2:8080 check
    server app3 10.0.1.3:8080 check

backend app_header_hash
    # Hash on arbitrary header (alternative to IP)
    balance hdr(X-User-ID)
    hash-type consistent

    server app1 10.0.1.1:8080 check
    server app2 10.0.1.2:8080 check

backend app_url_hash
    # Hash on URL parameter
    balance url_param userid
    hash-type consistent

    server app1 10.0.1.1:8080 check
    server app2 10.0.1.2:8080 check

frontend http_front
    bind *:80
    # Source IP is determined before routing
    default_backend app_source_hash

listen stats
    bind *:8404
    stats enable
    stats uri /stats
```

Note the `hash-type consistent` directive in HAProxy. It uses consistent hashing instead of simple modulo, dramatically reducing remapping when servers change. Always use consistent hashing for production IP hash deployments!
IP Hash isn't the only way to achieve session affinity. Each approach has different tradeoffs.
| Method | Mechanism | Pros | Cons |
|---|---|---|---|
| IP Hash | Hash client IP to select server | Simple, no client cooperation needed | NAT problems, remap on server changes |
| Cookie-based | Set cookie with server ID, route by cookie | Per-user granularity, survives IP changes | Requires cookie support; LB must inspect HTTP |
| URL Parameter | Encode server ID in URL | Works without cookies | Ugly URLs, requires app cooperation |
| Header-based | Custom header identifies user/session | Flexible, app-controlled | Requires app to set header |
| Consistent Hash | Hash with virtual nodes for smooth remapping | Minimal disruption on changes | More complex to implement |
| Centralized Session Store | Store sessions in Redis/DB, any server works | True stateless backends | Additional infrastructure, latency |
Cookie-Based Affinity in Detail:
Cookie-based affinity is often superior to IP hash. Here's how it works:
```nginx
# NGINX Cookie-Based Session Affinity

# Method 1: Sticky cookie (NGINX Plus only)
upstream backend_sticky {
    sticky cookie srv_id expires=1h domain=.example.com path=/;
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
    # NGINX Plus sets a cookie (srv_id) on first request
    # Subsequent requests route based on that cookie
}

# Method 2: Route based on existing session cookie
upstream backend_route {
    # Use hash of session ID cookie
    hash $cookie_JSESSIONID consistent;
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}

# Method 3: App sets server hint in cookie
map $cookie_server_hint $backend_server {
    default backend_round_robin;
    "app1"  backend_app1;
    "app2"  backend_app2;
    "app3"  backend_app3;
}

server {
    listen 80;
    location / {
        proxy_pass http://$backend_server;
    }
}
```

The cleanest solution is often to externalize session state to Redis, Memcached, or a database. With shared session storage, any server can handle any request—session affinity becomes unnecessary. This enables true horizontal scaling without load balancer complexity.
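To make the externalized-state option concrete, here's a minimal sketch using the redis-py client (the host name and key format are illustrative):

```python
import json
import uuid
from typing import Optional

import redis  # pip install redis

# Illustrative host - point at your own session store
r = redis.Redis(host="sessions.internal", port=6379, decode_responses=True)
SESSION_TTL = 3600  # seconds

def create_session(user_id: str) -> str:
    """Any backend can create the session; state lives in Redis."""
    session_id = str(uuid.uuid4())
    r.setex(f"session:{session_id}", SESSION_TTL, json.dumps({"user_id": user_id}))
    return session_id

def load_session(session_id: str) -> Optional[dict]:
    """Any backend can load it - no affinity needed at the load balancer."""
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw else None
```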
For most modern web applications: 1) Externalize session state to Redis/database, eliminating the need for affinity, OR 2) Use cookie-based affinity which is more reliable than IP hash. Reserve IP hash for Layer 4 LB scenarios or non-HTTP protocols.
What's Next:
IP Hash's remap problem is severe. The next page explores Consistent Hashing—an elegant algorithm that minimizes remapping to only K/n clients (where K is total clients and n is servers) when the server pool changes. This makes hash-based load balancing practical for dynamic environments.
You now understand IP Hash load balancing—its mechanics, critical limitations, and alternatives. You can make informed decisions about when IP Hash is appropriate versus when cookie-based affinity or consistent hashing would serve better.