Each major cloud provider has developed sophisticated load balancing services, each with unique capabilities, pricing models, and integration patterns. For architects operating in multi-cloud environments or evaluating cloud migration, understanding these differences is essential for informed decision-making.
This page synthesizes our exploration of load balancing technologies by comparing the managed offerings of the major cloud providers, weighing them against self-managed alternatives, and distilling the entire module into actionable decision frameworks.
By the end of this page, you will understand the load balancing offerings across major cloud providers, be able to apply selection criteria to different workload types, and know the best practices that apply to any production load balancer implementation.
Google Cloud Load Balancing (GCLB) represents Google's approach to traffic distribution, leveraging the same infrastructure that powers Google's own services. A distinctive feature is the global load balancing architecture—a single anycast IP serves traffic worldwide, with routing handled by Google's global network.
GCP Load Balancer Types:
| Load Balancer | Layer | Scope | Key Use Case |
|---|---|---|---|
| External HTTP(S) LB | Layer 7 | Global | Web applications, APIs, CDN integration |
| External TCP/UDP LB | Layer 4 | Regional or Global | Non-HTTP protocols, TCP/UDP services |
| Internal HTTP(S) LB | Layer 7 | Regional | Internal microservices, private APIs |
| Internal TCP/UDP LB | Layer 4 | Regional | Internal databases, caches |
| External Network LB | Layer 4 | Regional | Raw TCP/UDP with source IP preservation |
Key GCP Differentiators:
- A single global anycast IP with automatic nearest-region routing, rather than regional load balancers stitched together with DNS
- Cloud CDN and Cloud Armor (WAF and rate limiting) enabled directly on the backend service, as the configuration below shows
- Traffic enters at the nearest Google edge and rides Google's private backbone to your backends
```hcl
# Global HTTP Load Balancer with Cloud CDN
resource "google_compute_global_address" "default" {
  name = "global-lb-ip"
}

resource "google_compute_global_forwarding_rule" "default" {
  name       = "global-http-rule"
  target     = google_compute_target_https_proxy.default.id
  port_range = "443"
  ip_address = google_compute_global_address.default.address
}

resource "google_compute_target_https_proxy" "default" {
  name             = "https-proxy"
  url_map          = google_compute_url_map.default.id
  ssl_certificates = [google_compute_managed_ssl_certificate.default.id]
}

resource "google_compute_managed_ssl_certificate" "default" {
  name = "managed-cert"
  managed {
    domains = ["api.example.com"]
  }
}

# URL Map for path-based routing
resource "google_compute_url_map" "default" {
  name            = "url-map"
  default_service = google_compute_backend_service.default.id

  host_rule {
    hosts        = ["api.example.com"]
    path_matcher = "api-paths"
  }

  path_matcher {
    name            = "api-paths"
    default_service = google_compute_backend_service.api_v1.id

    path_rule {
      paths   = ["/v2/*"]
      service = google_compute_backend_service.api_v2.id
    }

    path_rule {
      paths   = ["/static/*"]
      service = google_compute_backend_service.static.id
    }
  }
}

# Backend Service with Cloud CDN and Cloud Armor
resource "google_compute_backend_service" "default" {
  name        = "backend-service"
  protocol    = "HTTP"
  port_name   = "http"
  timeout_sec = 30

  enable_cdn = true # Cloud CDN enabled

  # Cloud Armor security policy
  security_policy = google_compute_security_policy.default.id

  backend {
    group           = google_compute_instance_group_manager.default.instance_group
    balancing_mode  = "UTILIZATION"
    max_utilization = 0.8
    capacity_scaler = 1.0
  }

  # Health check
  health_checks = [google_compute_health_check.default.id]

  # Connection draining
  connection_draining_timeout_sec = 30

  # Cloud CDN configuration
  cdn_policy {
    cache_mode       = "CACHE_ALL_STATIC"
    default_ttl      = 3600
    max_ttl          = 86400
    negative_caching = true
  }
}

# Cloud Armor security policy
resource "google_compute_security_policy" "default" {
  name = "security-policy"

  # Default rule
  rule {
    action   = "allow"
    priority = "2147483647"
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "Default allow rule"
  }

  # Rate limiting rule
  rule {
    action   = "rate_based_ban"
    priority = "1000"
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    rate_limit_options {
      conform_action = "allow"
      exceed_action  = "deny(429)"
      enforce_on_key = "IP"
      rate_limit_threshold {
        count        = 1000
        interval_sec = 60
      }
    }
  }
}
```

GCP's global load balancer uses anycast IP addressing. A single IP (like 35.190.27.1) is announced from all Google edge locations. Traffic naturally routes to the nearest edge via BGP, then traverses Google's private network to reach backends. This provides sub-50ms latency worldwide without geographic DNS configuration.
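The rate_based_ban rule in the configuration above counts requests per client IP within a fixed interval. As a rough illustration of that counting logic (Cloud Armor's actual implementation is internal to Google; this sketch only mirrors the threshold and interval from the rule), consider:

```python
import time
from collections import defaultdict

THRESHOLD = 1000  # mirrors rate_limit_threshold.count above
INTERVAL = 60     # mirrors rate_limit_threshold.interval_sec above

# ip -> [window_start_timestamp, request_count]
windows = defaultdict(lambda: [0.0, 0])

def allow_request(ip: str) -> bool:
    """Fixed-window counter: True to forward, False to reject with 429."""
    now = time.time()
    window = windows[ip]
    if now - window[0] >= INTERVAL:
        window[0], window[1] = now, 0  # reset at each interval boundary
    window[1] += 1
    return window[1] <= THRESHOLD
```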
Microsoft Azure provides a comprehensive set of load balancing services integrated with its enterprise-focused cloud platform. Azure's offerings span from basic regional load balancing to sophisticated global traffic management.
Azure Load Balancing Types:
| Service | Layer | Scope | Key Use Case |
|---|---|---|---|
| Azure Load Balancer | Layer 4 | Regional | High-performance TCP/UDP, zone redundancy |
| Application Gateway | Layer 7 | Regional | Web apps, WAF, SSL termination |
| Azure Front Door | Layer 7 | Global | Global HTTP acceleration, caching, WAF |
| Traffic Manager | DNS-based | Global | Geographic routing, failover, DR |
| Azure Gateway Load Balancer | Layer 4 | Regional | Chaining third-party network virtual appliances (NVAs) |
Key Azure Differentiators:
- Front Door combines global anycast entry, CDN caching, and WAF in a single service
- Application Gateway v2 adds autoscaling and Key Vault certificate integration, as the configuration below shows
- Deep integration with Active Directory, hybrid networking, and enterprise governance tooling
```hcl
# Azure Application Gateway v2
resource "azurerm_application_gateway" "main" {
  name                = "app-gateway"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location

  sku {
    name = "Standard_v2"
    tier = "Standard_v2"
  }

  # Auto-scaling
  autoscale_configuration {
    min_capacity = 2
    max_capacity = 10
  }

  # Gateway IP configuration
  gateway_ip_configuration {
    name      = "gateway-ip-config"
    subnet_id = azurerm_subnet.gateway.id
  }

  # Frontend IP
  frontend_ip_configuration {
    name                 = "frontend-ip"
    public_ip_address_id = azurerm_public_ip.gateway.id
  }

  # Frontend port
  frontend_port {
    name = "https-port"
    port = 443
  }

  # SSL certificate
  ssl_certificate {
    name                = "gateway-cert"
    key_vault_secret_id = azurerm_key_vault_certificate.gateway.secret_id
  }

  # HTTP listener
  http_listener {
    name                           = "https-listener"
    frontend_ip_configuration_name = "frontend-ip"
    frontend_port_name             = "https-port"
    protocol                       = "Https"
    ssl_certificate_name           = "gateway-cert"
    host_name                      = "api.example.com"
  }

  # Backend pool
  backend_address_pool {
    name         = "backend-pool"
    ip_addresses = var.backend_ips
  }

  # Backend HTTP settings
  backend_http_settings {
    name                  = "http-settings"
    cookie_based_affinity = "Disabled"
    port                  = 8080
    protocol              = "Http"
    request_timeout       = 30

    connection_draining {
      enabled           = true
      drain_timeout_sec = 30
    }
  }

  # Health probe
  probe {
    name                = "health-probe"
    protocol            = "Http"
    path                = "/health"
    host                = "api.internal"
    interval            = 30
    timeout             = 30
    unhealthy_threshold = 3
  }

  # URL path map for routing
  url_path_map {
    name                               = "path-map"
    default_backend_address_pool_name  = "backend-pool"
    default_backend_http_settings_name = "http-settings"

    path_rule {
      name                       = "api-v2"
      paths                      = ["/api/v2/*"]
      backend_address_pool_name  = "api-v2-pool"
      backend_http_settings_name = "http-settings"
    }
  }

  # Request routing rule
  request_routing_rule {
    name               = "routing-rule"
    rule_type          = "PathBasedRouting"
    http_listener_name = "https-listener"
    url_path_map_name  = "path-map"
    priority           = 100
  }

  # WAF configuration (if using WAF_v2 SKU)
  # waf_configuration {
  #   enabled          = true
  #   firewall_mode    = "Prevention"
  #   rule_set_type    = "OWASP"
  #   rule_set_version = "3.2"
  # }
}
```

For global HTTP workloads, prefer Azure Front Door over Traffic Manager + Application Gateway combinations. Front Door provides integrated caching, WAF, and performance-based routing with anycast entry points. Traffic Manager is DNS-based and adds latency; Front Door is a true global Layer 7 service.
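Both Traffic Manager's priority routing mode and Front Door's priority groups reduce to "send traffic to the highest-priority endpoint that is currently healthy." A minimal sketch of that selection logic, using hypothetical endpoint records:

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    name: str
    priority: int  # lower number = preferred
    healthy: bool  # driven by health probes in the real services

def select_endpoint(endpoints: list[Endpoint]) -> Endpoint | None:
    """Return the highest-priority healthy endpoint, or None if all are down."""
    healthy = [e for e in endpoints if e.healthy]
    return min(healthy, key=lambda e: e.priority, default=None)

# Hypothetical DR scenario: primary region fails its probes.
endpoints = [
    Endpoint("eastus-primary", priority=1, healthy=False),
    Endpoint("westeurope-dr", priority=2, healthy=True),
]
print(select_endpoint(endpoints).name)  # -> westeurope-dr after failover
```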
The following matrix provides a feature-by-feature comparison across AWS, GCP, and Azure for common load balancing requirements.
| Capability | AWS | GCP | Azure |
|---|---|---|---|
| Global HTTP LB | CloudFront + ALB | Global HTTP(S) LB ✓ | Front Door ✓ |
| Static anycast IP | Global Accelerator | Native ✓ | Front Door (limited) |
| Layer 4 Global | Global Accelerator + NLB | Global TCP/UDP LB ✓ | Traffic Manager (DNS) |
| Layer 7 Regional | ALB ✓ | Regional HTTP(S) LB ✓ | Application Gateway ✓ |
| Layer 4 Regional | NLB ✓ | Network LB ✓ | Azure Load Balancer ✓ |
| Native WAF | AWS WAF + ALB | Cloud Armor ✓ | WAF policies ✓ |
| Integrated CDN | CloudFront separate | Cloud CDN integrated | Front Door integrated |
| Serverless targets | Lambda via ALB | Cloud Run NEG | Functions via App GW |
| Service mesh control | App Mesh (Envoy) | Traffic Director (Envoy) | OSM (limited) |
| Multi-region failover | Route 53 health checks | Global LB auto-failover | Front Door priority groups |
| Private endpoints | PrivateLink via NLB | Private Service Connect | Private Link ✓ |
Key Observations:
AWS: Most comprehensive but fragmented. Often requires combining multiple services (ALB + CloudFront + WAF + Route 53) to achieve what GCP or Azure provide as integrated solutions. Deepest Kubernetes integration via the AWS Load Balancer Controller (formerly the ALB Ingress Controller).
GCP: Most elegant global architecture. Single anycast IP with automatic nearest-region routing is simpler than AWS's multi-service approach. Cloud Armor integration is seamless. Traffic Director provides best-in-class managed Envoy control plane.
Azure: Strong enterprise integration. Front Door combines global routing, CDN, and WAF. Application Gateway v2 bridges regional and enterprise requirements. Deepest integration with Active Directory and hybrid scenarios.
For true multi-cloud architectures, avoid deep integration with provider-specific load balancers. Consider using self-managed solutions (NGINX, HAProxy, Envoy) deployed identically across clouds, fronted by provider load balancers for ingress only. This maintains consistent behavior while leveraging cloud infrastructure.
The choice between self-managed load balancers (NGINX, HAProxy, Envoy) and cloud-managed services (ALB, GCP LB, Azure App Gateway) represents a fundamental architectural decision with implications for operations, cost, and capability.
| Dimension | Self-Managed | Cloud-Managed |
|---|---|---|
| Operational burden | High: upgrades, scaling, HA | Low: fully managed |
| Cost at low traffic | Higher (EC2 instances) | Lower (pay per request) |
| Cost at high traffic | Lower (amortized) | Higher (scales with load) |
| Customization | Unlimited | Limited to service features |
| Performance | Maximum control | Sufficient for most use cases |
| Multi-cloud portability | High (same config) | None (vendor-specific) |
| Configuration complexity | Higher | Lower (UI, IaC support) |
| Scaling speed | Depends on implementation | Automatic, may be slow for spikes |
| Debugging visibility | Full access | Limited to metrics/logs |
| Compliance | You manage everything | Shared responsibility model |
Detailed Cost Analysis:
Let's examine a concrete example: 10 million requests per month at 10KB average response size. At this volume, a managed load balancer like AWS ALB charges a small hourly fee plus usage-based capacity units, while self-managed NGINX on EC2 requires always-on instances (typically a redundant pair) regardless of traffic. Rerun the same comparison at 100M requests per month and the economics shift: instance costs amortize across the higher volume while the managed service's usage charges keep scaling with load.
The crossover point where self-managed becomes cheaper varies, but typically occurs around 50-100M requests/month. However, the hidden cost is engineering time—if your team spends 4 hours/month on load balancer operations at $100/hour, that's $400/month in implicit cost.
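A simple way to find your own crossover point is to model both cost curves explicitly. The sketch below does this with placeholder rates; every constant is an assumption for illustration and should be replaced with your provider's current published pricing and your team's actual numbers:

```python
# Back-of-the-envelope crossover model: managed LB vs. self-managed pair.
# All rates below are ASSUMPTIONS for illustration only.
HOURS_PER_MONTH = 730

LB_HOURLY = 0.0225             # assumed managed-LB hourly charge (USD)
LB_PER_UNIT_HOURLY = 0.008     # assumed per-capacity-unit hourly charge (USD)
INSTANCE_PAIR_MONTHLY = 120.0  # assumed cost of two HA NGINX instances (USD)
OPS_HOURS = 4                  # engineering hours/month (from the text above)
OPS_RATE = 100.0               # USD per engineering hour (from the text above)

def managed_cost(req_per_month: float, resp_kb: float) -> float:
    """Monthly cost of a managed LB under a simplified capacity-unit model."""
    req_per_sec = req_per_month / (HOURS_PER_MONTH * 3600)
    gb_per_hour = req_per_month * resp_kb / 1e6 / HOURS_PER_MONTH
    # Capacity units bill on the most-consumed dimension
    # (assumed: 25 new conns/sec or 1 GB/hour per unit).
    units = max(1.0, req_per_sec / 25, gb_per_hour / 1.0)
    return (LB_HOURLY + LB_PER_UNIT_HOURLY * units) * HOURS_PER_MONTH

def self_managed_cost() -> float:
    """Instances plus the implicit cost of engineering time."""
    return INSTANCE_PAIR_MONTHLY + OPS_HOURS * OPS_RATE

for millions in (10, 50, 100, 500):
    m = managed_cost(millions * 1e6, resp_kb=10)
    print(f"{millions:>4}M req/mo: managed ~${m:,.0f}, "
          f"self-managed ~${self_managed_cost():,.0f}")
```

Under different assumptions (per-request pricing, larger responses, bigger instances, more ops time) the curves cross at very different volumes, which is why the 50-100M figure above is a heuristic rather than a rule.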
Many organizations use a hybrid model: cloud load balancer (NLB/ALB) at the edge for DDoS protection and auto-scaling, with self-managed NGINX/Envoy internally for sophisticated routing. This captures the benefits of both: managed infrastructure at the edge, control on the interior.
The following matrix synthesizes the entire module into actionable guidance for technology selection based on specific requirements.
| Requirement | Recommended Solution | Alternative |
|---|---|---|
| Simple web app, low traffic | Cloud LB (ALB/GCP HTTP LB) | NGINX on single instance |
| High-traffic HTTP API | Cloud LB or NGINX cluster | HAProxy if latency-critical |
| TCP/database proxy | NLB, HAProxy | Cloud Network LB |
| Global user distribution | GCP Global LB / CloudFront+ALB | Azure Front Door |
| Microservices routing | ALB with path rules, NGINX | Envoy + xDS control plane |
| Service mesh sidecar | Envoy | NGINX (limited) |
| gRPC services | Envoy, ALB (limited) | NLB passthrough |
| WebSocket applications | ALB, NGINX | HAProxy |
| UDP gaming/VOIP | NLB, HAProxy | GCP UDP LB |
| Rate limiting as primary feature | HAProxy (stick tables) | NGINX (rate limiting module) |
| A/B testing, canary deploys | ALB weighted routing, Envoy | NGINX split_clients |
| Maximum performance, low latency | HAProxy | NLB |
| Maximum flexibility, extensibility | NGINX+Lua or Envoy+Wasm | HAProxy (less extensible) |
| Zero ops overhead | Cloud-managed LBs | Managed Kubernetes + Ingress |
| Multi-cloud portability | NGINX, HAProxy, Envoy | Avoid cloud-specific LBs |
Synthesis of Module Learnings:
| Technology | Sweet Spot | Avoid When |
|---|---|---|
| NGINX | Versatile HTTP workloads, existing expertise, edge caching | Pure TCP, maximum performance needed |
| HAProxy | High-performance proxying, TCP protocols, advanced rate limiting | Need static file serving, extensive scripting |
| Envoy | Cloud-native/Kubernetes, service mesh, dynamic configuration | Simple architectures, limited ops expertise |
| AWS ALB | AWS-native apps, Lambda integration, low-ops HTTP routing | Multi-cloud, extreme performance |
| AWS NLB | TCP/UDP, static IPs, PrivateLink | Content-based routing |
| GCP Global LB | Global user base, integrated CDN/WAF | Regional-only requirements |
| Azure App GW | Enterprise Azure environments, WAF | Global distribution (use Front Door) |
There is no single 'best' load balancer. The optimal choice depends on your specific requirements, team expertise, cloud strategy, and operational maturity. A startup building quickly might choose ALB; a high-frequency trading firm might choose HAProxy; a platform team might choose Envoy. All can be correct for their context.
Regardless of which load balancing technology you select, certain best practices apply universally for production deployments.
Health Check Design:
A well-designed health check endpoint should:
- Respond quickly, well under the load balancer's probe timeout
- Verify that the instance can actually serve requests, not merely that the process exists
- Report the status of critical dependencies without failing outright when a shared dependency is degraded
- Return structured, machine-readable output, for example:
```
GET /health
Response: 200 OK

{
  "status": "healthy",
  "checks": {
    "database": { "status": "up", "latency_ms": 5 },
    "cache": { "status": "up", "latency_ms": 2 },
    "dependencies": { "status": "up" }
  },
  "version": "1.2.3"
}
```
If your health check fails when any dependency is down, a single dependency failure can cascade and take down your entire service tier. Consider separating liveness (is the process running?) from readiness (can it handle requests?) checks. In Kubernetes, use separate endpoints for liveness and readiness probes.
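As a sketch of that separation, here is a minimal pair of endpoints (Flask is assumed; check_database is a hypothetical placeholder, and the /livez and /readyz paths follow common Kubernetes convention):

```python
import time
from flask import Flask, jsonify

app = Flask(__name__)

def check_database() -> dict:
    # Placeholder: replace with a real round-trip, e.g. "SELECT 1".
    start = time.monotonic()
    ok = True
    return {"status": "up" if ok else "down",
            "latency_ms": round((time.monotonic() - start) * 1000, 1)}

@app.get("/livez")
def liveness():
    # Liveness: "is the process running?" Never checks dependencies,
    # so a dependency outage cannot trigger a restart/removal cascade.
    return jsonify({"status": "healthy"}), 200

@app.get("/readyz")
def readiness():
    # Readiness: "can this instance serve traffic right now?"
    checks = {"database": check_database()}
    ready = all(c["status"] == "up" for c in checks.values())
    return jsonify({"status": "healthy" if ready else "degraded",
                    "checks": checks}), (200 if ready else 503)
```

Point the load balancer's health probe (and a Kubernetes readinessProbe) at /readyz, and reserve /livez for the livenessProbe.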
Load balancers occupy a critical position in your security architecture—they're the first (or last) point of defense against external threats and must be configured with security as a primary concern.
```nginx
# Strong TLS configuration
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384';
ssl_prefer_server_ciphers on;

# Security headers
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-Frame-Options "DENY" always;
add_header X-XSS-Protection "1; mode=block" always;

# Hide server information
server_tokens off;
proxy_hide_header X-Powered-By;

# Request limits
client_max_body_size 10m;
client_body_timeout 30s;
client_header_timeout 30s;

# Rate limiting
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/s;
limit_conn_zone $binary_remote_addr zone=conn_limit:10m;

server {
    location /api/ {
        limit_req zone=api_limit burst=200 nodelay;
        limit_conn conn_limit 100;

        # Validate content type
        if ($content_type !~ "application/json") {
            return 415;
        }
    }
}
```

For high-security environments, implement mutual TLS (mTLS) for all service-to-service communication. Service meshes (Istio, Linkerd) automate this, but you can also implement it manually with Envoy or NGINX. mTLS ensures that both client and server authenticate each other, preventing unauthorized service communication.
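As a sketch of what mTLS enforcement looks like outside a mesh, here is a minimal server using Python's standard ssl module; the certificate file names are illustrative assumptions:

```python
import http.server
import ssl

# TLS context that REQUIRES a valid client certificate (mutual TLS).
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
ctx.load_cert_chain(certfile="server.crt", keyfile="server.key")
ctx.load_verify_locations(cafile="internal-ca.crt")  # trust only our internal CA
ctx.verify_mode = ssl.CERT_REQUIRED  # reject clients without a valid cert

server = http.server.HTTPServer(("0.0.0.0", 8443),
                                http.server.SimpleHTTPRequestHandler)
server.socket = ctx.wrap_socket(server.socket, server_side=True)
server.serve_forever()
```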
This module has provided a comprehensive exploration of load balancing technologies across the spectrum from open-source software to cloud-managed services. Let's consolidate the key takeaways:
The Selection Framework:
When selecting a load balancing solution, evaluate along these dimensions:
- Traffic profile: protocols (HTTP, TCP/UDP, gRPC, WebSocket), volume, and geographic distribution
- Operational capacity: team expertise and willingness to run infrastructure versus paying for managed services
- Feature requirements: WAF, CDN integration, rate limiting, canary routing, service mesh support
- Cloud strategy: single-cloud depth versus multi-cloud portability
- Cost: both direct spend and the engineering time hidden in self-managed operations
No single technology dominates all dimensions. The "best" choice is context-dependent—a function of your specific requirements, constraints, and organizational capabilities.
Moving Forward:
Load balancing is a foundational capability that enables virtually all other distributed systems patterns we'll explore: caching, database scaling, microservices architecture, and more. The principles learned here—health checking, session persistence, algorithm selection, failover handling—recur throughout system design.
As you design systems, remember: load balancing is not just about distributing traffic—it's about building resilient, observable, scalable infrastructure that serves your users reliably under any conditions.
Congratulations! You have completed the Load Balancer Comparison module. You now possess comprehensive knowledge spanning NGINX, HAProxy, Envoy, AWS ALB/NLB, and cross-cloud solutions. Apply this framework to evaluate load balancing options in your own system designs, and remember that the best choice always depends on context.