Load Balancer Comparison - Learning Module

AWS ALB/NLB — Managed AWS Options

The Case for Managed Load Balancing

Operating load balancers in production—whether NGINX, HAProxy, or Envoy—requires specialized expertise. Configuration management, capacity planning, health monitoring, TLS certificate rotation, and high availability all demand ongoing operational investment. For many organizations, this operational burden represents undifferentiated heavy lifting that diverts resources from core business objectives.

Amazon Web Services (AWS) Elastic Load Balancing (ELB) addresses this challenge by providing fully managed load balancing services that abstract away operational complexity. AWS operates the underlying infrastructure, handles scaling, ensures high availability across Availability Zones, and integrates seamlessly with the AWS ecosystem.

AWS offers three load balancer types:

Application Load Balancer (ALB) — Layer 7 (HTTP/HTTPS) load balancing
Network Load Balancer (NLB) — Layer 4 (TCP/UDP) load balancing
Gateway Load Balancer (GWLB) — For deploying third-party appliances

This page focuses on ALB and NLB, as they represent the primary choices for application workloads. Understanding when and how to use these managed services is essential for effective AWS architecture.

Learning Objectives

By completing this page, you will understand the architectural differences between ALB and NLB, master their configuration for common use cases, comprehend their scaling behavior and limitations, and develop decision frameworks for choosing between managed and self-managed load balancing.

Application Load Balancer (ALB): Layer 7 Intelligence

The Application Load Balancer (ALB) operates at Layer 7 (application layer), enabling sophisticated routing decisions based on HTTP/HTTPS request content. ALB terminates client connections, inspects request attributes, and routes to backend targets based on configurable rules.

Core Components:

ALB Architecture Components

•Listeners — Define the port and protocol (HTTP/HTTPS) for accepting connections. Multiple listeners can exist on a single ALB.
•Rules — Each listener has rules that match requests based on conditions (path, host, headers, etc.) and specify actions (forward, redirect, fixed response).
•Target Groups — Logical groupings of targets (EC2 instances, IP addresses, Lambda functions, or other ALBs) that receive traffic.
•Health Checks — ALB continuously monitors target health and only routes to healthy targets.

ALB Configuration via Terraform

terraform

# Application Load Balancer
resource "aws_lb" "api_alb" {
  name               = "api-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets           = var.public_subnet_ids
  
  # Enable access logs
  access_logs {
    bucket  = aws_s3_bucket.alb_logs.bucket
    prefix  = "api-alb"
    enabled = true
  }
  
  # Enable deletion protection for production
  enable_deletion_protection = true
  
  # Enable HTTP/2
  enable_http2 = true
  
  # Idle timeout
  idle_timeout = 60
  
  tags = {
    Environment = "production"
    Service     = "api"
  }
}
 
# HTTPS Listener with default action
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.api_alb.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate.api.arn
  
  # Default action: forward to main target group
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api_main.arn
  }
}
 
# HTTP Listener with redirect to HTTPS
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.api_alb.arn
  port              = 80
  protocol          = "HTTP"
  
  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}
 
# Path-based routing rule
resource "aws_lb_listener_rule" "api_v2" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 100
  
  condition {
    path_pattern {
      values = ["/api/v2/*"]
    }
  }
  
  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api_v2.arn
  }
}
 
# Host-based routing rule
resource "aws_lb_listener_rule" "admin" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 50
  
  condition {
    host_header {
      values = ["admin.example.com"]
    }
  }
  
  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.admin.arn
  }
}
 
# Weighted routing for canary deployments
resource "aws_lb_listener_rule" "canary" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 200
  
  condition {
    path_pattern {
      values = ["/feature/*"]
    }
  }
  
  action {
    type = "forward"
    forward {
      target_group {
        arn    = aws_lb_target_group.stable.arn
        weight = 90
      }
      target_group {
        arn    = aws_lb_target_group.canary.arn  
        weight = 10
      }
      stickiness {
        enabled  = true
        duration = 3600
      }
    }
  }
}

ALB Routing Capabilities:

ALB supports sophisticated routing based on multiple request attributes:

Condition Type	Match Against	Example Use Case
Host header	Domain name	Multi-tenant routing
Path pattern	URL path	API versioning, service routing
HTTP header	Any header value	A/B testing via custom headers
HTTP method	GET, POST, etc.	Read/write separation
Query string	URL parameters	Feature flags
Source IP	Client CIDR	Internal vs external traffic

Rules are evaluated by priority (lowest number first). When a match is found, the associated action is executed.
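The less common condition types in the table can be combined within one rule. The following is a minimal sketch, not a recommended rule set: the header name, query parameter, CIDR range, and priorities are assumptions chosen for illustration, and the target groups are the ones referenced in the configuration above.

```terraform
# Hypothetical rule: route read-only requests that carry a custom header.
# Multiple condition blocks in one rule must all match (logical AND).
resource "aws_lb_listener_rule" "experiment_reads" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 300

  condition {
    http_header {
      http_header_name = "X-Experiment" # assumed header name
      values           = ["variant-b"]
    }
  }

  condition {
    http_request_method {
      values = ["GET", "HEAD"]
    }
  }

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.canary.arn
  }
}

# Hypothetical rule: expose a beta flag only to callers from an internal range.
resource "aws_lb_listener_rule" "internal_beta" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 310

  condition {
    query_string {
      key   = "beta" # assumed parameter name
      value = "true"
    }
  }

  condition {
    source_ip {
      values = ["10.0.0.0/8"] # assumed internal CIDR
    }
  }

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api_v2.arn
  }
}
```

Note that the source IP condition matches the address of the connection that reaches the ALB, not the contents of X-Forwarded-For, so it behaves differently when another proxy sits in front of the load balancer.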

Rule Limit Awareness

By default, an ALB supports up to 100 rules per load balancer, not counting default rules (the quota is adjustable). Complex microservices architectures may hit this limit. Strategies include using multiple ALBs, consolidating rules, or using path prefixes to reduce rule count. Monitor your rule usage as your architecture evolves.
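One way to consolidate rules is to list several patterns in a single condition when they share the same action. This is a sketch under assumptions: the paths and the priority are hypothetical, and the number of values allowed in one rule is itself limited, so very long lists still need to be split.

```terraform
# One rule instead of three: any of the listed paths forwards to the same
# target group. Paths and priority are illustrative.
resource "aws_lb_listener_rule" "commerce_paths" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 110

  condition {
    path_pattern {
      values = ["/orders/*", "/carts/*", "/checkout/*"]
    }
  }

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api_main.arn
  }
}
```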

Target Groups: Flexible Backend Configuration

Target Groups define the backend resources that receive traffic from the load balancer. ALB supports multiple target types, each suited to different deployment models.

ALB Target Types
Target Type	Description	Use Case
instance	EC2 instances by ID	Traditional EC2 deployments
ip	IP addresses	ECS tasks, containers, on-premises via Direct Connect
lambda	Lambda function	Serverless backends
alb	Another ALB	ALB chaining for migration/separation

Target Group Configuration

terraform

# Target Group for EC2 instances
resource "aws_lb_target_group" "api_main" {
  name        = "api-main-tg"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "instance"
  
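  # Load balancing algorithm
  # Options include round_robin (default) and least_outstanding_requests.
  # Slow start cannot be combined with least_outstanding_requests,
  # so this example uses round_robin.
  load_balancing_algorithm_type = "round_robin"
  
  # Slow start for gradual traffic ramp-up
  slow_start = 60  # seconds (valid range 30-900; 0 disables)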
  
  # Deregistration delay
  deregistration_delay = 30  # seconds to drain before removing target
  
  # Stickiness configuration
  stickiness {
    type            = "lb_cookie"
    cookie_duration = 86400  # 1 day
    enabled         = true
  }
  
  # Health check configuration
  health_check {
    enabled             = true
    path                = "/health"
    port                = "traffic-port"
    protocol            = "HTTP"
    healthy_threshold   = 2
    unhealthy_threshold = 3
    timeout             = 5
    interval            = 30
    matcher             = "200-299"
  }
  
  tags = {
    Name = "api-main-tg"
  }
}
 
# Target Group for ECS Fargate (IP target type)
resource "aws_lb_target_group" "ecs_service" {
  name        = "ecs-service-tg"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip"  # Required for Fargate
  
  health_check {
    enabled             = true
    path                = "/health"
    protocol            = "HTTP"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 5
    interval            = 10  # Faster for containers
    matcher             = "200"
  }
}
 
# Target Group for Lambda
resource "aws_lb_target_group" "lambda" {
  name        = "lambda-tg"
  target_type = "lambda"
  
  # Lambda-specific: no VPC, port, or protocol
  health_check {
    enabled             = false  # Lambda has its own health management
  }
}
 
# Attach Lambda function to target group
resource "aws_lb_target_group_attachment" "lambda" {
  target_group_arn = aws_lb_target_group.lambda.arn
  target_id        = aws_lambda_function.api.arn
  depends_on       = [aws_lambda_permission.alb]
}
 
# Lambda permission for ALB invocation
resource "aws_lambda_permission" "alb" {
  statement_id  = "AllowExecutionFromALB"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.api.function_name
  principal     = "elasticloadbalancing.amazonaws.com"
  source_arn    = aws_lb_target_group.lambda.arn
}

Target Group Best Practices

•Use slow start — Ramp up traffic to new targets over 30-60 seconds to allow JIT compilation and cache warming
•Tune deregistration delay — Balance between draining time and deployment speed; 30 seconds works for most HTTP services
•Health check tuning — Faster intervals (10s) for containers, slower (30s) for stable VMs; match to your service's startup time
•Cross-zone load balancing — Enabled by default for ALB; ensures even distribution across AZs even with unequal target counts

Lambda Cold Starts

When using Lambda targets, be aware of cold start latency (100ms - 2s+ depending on runtime and package size). Health checks are disabled by default for Lambda target groups; if you enable them, every check invokes the function and is billed like any other invocation, and a slow cold start can exceed a tightly configured health check timeout. Consider provisioned concurrency for latency-sensitive endpoints.
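Provisioned concurrency is configured on the Lambda side rather than on the ALB. A minimal sketch, assuming the function `aws_lambda_function.api` from the configuration above is defined with `publish = true` so that a numbered version exists; the alias name and the concurrency value are hypothetical.

```terraform
# Hypothetical alias pointing at the latest published version of the function.
resource "aws_lambda_alias" "live" {
  name             = "live"
  function_name    = aws_lambda_function.api.function_name
  function_version = aws_lambda_function.api.version # requires publish = true
}

# Keep a small number of execution environments initialized for that alias.
resource "aws_lambda_provisioned_concurrency_config" "api_live" {
  function_name                     = aws_lambda_function.api.function_name
  qualifier                         = aws_lambda_alias.live.name
  provisioned_concurrent_executions = 2 # illustrative value; size to your traffic
}
```

For the ALB to benefit, the target group attachment and the invoke permission would need to reference the alias (for example `aws_lambda_alias.live.arn` as the `target_id`) instead of the unqualified function, since provisioned concurrency applies only to the qualified version or alias.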

Network Load Balancer (NLB): Layer 4 Performance

The Network Load Balancer (NLB) operates at Layer 4 (transport layer), routing connections based on IP protocol data without inspecting application-layer content. This enables NLB to achieve significantly higher throughput and lower latency than ALB, making it ideal for performance-critical workloads.

Key NLB Characteristics:

NLB Capabilities

•Extreme performance — Handles millions of requests per second with single-digit millisecond latencies
•Static IP addresses — Each NLB gets one static IP per AZ; can assign Elastic IPs for fixed whitelisting
•Connection preservation — Direct connection between client and target; preserves source IP
•TCP/UDP/TLS support — Native support for TCP, UDP, and TLS (terminated or passthrough)
•PrivateLink integration — Expose services via AWS PrivateLink endpoints

NLB Configuration

terraform

# Network Load Balancer
resource "aws_lb" "api_nlb" {
  name               = "api-nlb"
  internal           = false
  load_balancer_type = "network"
  subnets           = var.public_subnet_ids
  
  # Assign Elastic IPs for static addressing
  # (Alternative: let AWS assign IPs)
  # Note: requires subnet_mapping instead of subnets
  
  # Enable cross-zone load balancing
  enable_cross_zone_load_balancing = true
  
  # Deletion protection
  enable_deletion_protection = true
  
  tags = {
    Environment = "production"
  }
}
 
# With Elastic IP assignment
resource "aws_lb" "api_nlb_static" {
  name               = "api-nlb-static"
  internal           = false
  load_balancer_type = "network"
  
  subnet_mapping {
    subnet_id     = var.public_subnet_a
    allocation_id = aws_eip.nlb_a.id
  }
  
  subnet_mapping {
    subnet_id     = var.public_subnet_b
    allocation_id = aws_eip.nlb_b.id
  }
}
 
# TCP Listener
resource "aws_lb_listener" "tcp" {
  load_balancer_arn = aws_lb.api_nlb.arn
  port              = 443
  protocol          = "TCP"
  
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.tcp.arn
  }
}
 
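# TLS Listener (NLB terminates TLS)
# A load balancer allows only one listener per port, so this listener uses
# 8443 to coexist with the TCP passthrough listener on 443 above.
# In practice, choose either passthrough or termination for port 443.
resource "aws_lb_listener" "tls" {
  load_balancer_arn = aws_lb.api_nlb.arn
  port              = 8443
  protocol          = "TLS"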
  certificate_arn   = aws_acm_certificate.api.arn
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }
}
 
# UDP Listener (for DNS, gaming, etc.)
resource "aws_lb_listener" "udp" {
  load_balancer_arn = aws_lb.api_nlb.arn
  port              = 53
  protocol          = "UDP"
  
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.dns.arn
  }
}
 
# TCP Target Group
resource "aws_lb_target_group" "tcp" {
  name        = "tcp-tg"
  port        = 443
  protocol    = "TCP"
  vpc_id      = var.vpc_id
  target_type = "instance"
  
  # Preserve client IP
  preserve_client_ip = true
  
  # Connection termination on deregistration
  connection_termination = true
  
  # Health check
  health_check {
    enabled             = true
    port                = "traffic-port"
    protocol            = "TCP"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    interval            = 10
  }
  
  # Stickiness (source IP based)
  stickiness {
    enabled = true
    type    = "source_ip"
  }
}

NLB vs ALB: Traffic Flow Comparison:

ALB (Layer 7):

Client → ALB (terminates TCP) → New TCP to Target

Client sees ALB IP as server
Target sees ALB IP as client
Full HTTP inspection

NLB (Layer 4):

Client → NLB (routes packets) → Target

Client sees NLB IP (or Elastic IP) as server
Target sees original client IP (with proxy protocol or preserve_client_ip)
No application-layer inspection

This architectural difference explains NLB's performance advantage: no connection termination, no HTTP parsing, just highly optimized packet routing.
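When client IP preservation is not available or not wanted, the proxy protocol mentioned above is the other way for targets to learn the original client address. A minimal sketch, assuming IP targets that are able to parse a Proxy Protocol v2 header; the resource and target group names are hypothetical.

```terraform
# Hypothetical target group: the NLB prepends a Proxy Protocol v2 header
# carrying the client address to each connection it opens to a target.
resource "aws_lb_target_group" "tcp_proxy_protocol" {
  name        = "tcp-proxy-protocol-tg"
  port        = 443
  protocol    = "TCP"
  vpc_id      = var.vpc_id
  target_type = "ip"

  # Targets see the NLB's address at the TCP level...
  preserve_client_ip = false
  # ...and read the real client address from the proxy protocol header.
  proxy_protocol_v2 = true

  health_check {
    enabled  = true
    port     = "traffic-port"
    protocol = "TCP"
  }
}
```

Enable this only when every target understands the header; a server that does not expect it will treat the extra bytes as malformed application data.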

When to Use NLB TLS

NLB can terminate TLS (protocol: TLS) or pass it through (protocol: TCP). Terminate TLS at NLB when you want simplified certificate management via ACM. Use TCP passthrough when targets need to see client certificates (mTLS) or when you need end-to-end encryption without NLB having access to plaintext.

ALB vs NLB: Decision Framework

Choosing between ALB and NLB requires understanding their fundamental differences and matching capabilities to requirements.

ALB vs NLB Feature Comparison
Capability	ALB	NLB
OSI Layer	Layer 7 (HTTP/HTTPS)	Layer 4 (TCP/UDP/TLS)
Performance	Good (100K+ req/s)	Extreme (millions req/s)
Latency	~2-5ms overhead	~<1ms overhead
Content-based routing	Yes (path, host, headers)	No
WebSocket support	Yes	Yes (TCP passthrough)
HTTP/2 support	Yes (including gRPC)	Passthrough only
Static IP	No (use Global Accelerator)	Yes (EIP per AZ)
Source IP preservation	Via X-Forwarded-For header	Native (preserve_client_ip)
Lambda targets	Yes	No
UDP support	No	Yes
PrivateLink	Limited	Full support
Price model	Hourly charge + LCU usage	Hourly charge + NLCU usage

Choose ALB When

•You need content-based routing (path, host, headers)
•Running microservices that require path routing
•Using Lambda as backend targets
•Need built-in authentication (Cognito, OIDC)
•Require detailed HTTP-level metrics
•Want managed WebSocket handling
•Applications communicate via HTTP/HTTPS only

Choose NLB When

•Ultra-low latency is critical
•Need static IPs for whitelisting
•Handling non-HTTP protocols (databases, gaming, custom TCP/UDP protocols)
•Extreme throughput requirements
•Exposing services via PrivateLink
•Source IP preservation is required
•UDP traffic (DNS, VOIP, gaming)

Common Architectural Patterns:

Pattern 1: ALB + NLB Combination. Place NLB in front of ALB when you need both static IPs and content-based routing. NLB provides the static IPs; ALB handles HTTP routing.

Pattern 2: NLB for Internal, ALB for External. Use NLB for service-to-service communication (lower latency, simple routing) and ALB for public-facing APIs (rich routing, authentication).

Pattern 3: NLB with Global Accelerator. Combine NLB with AWS Global Accelerator for static anycast IPs with geographic routing and DDoS protection.
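Pattern 1 can be sketched by registering the ALB as a target of the NLB. This is an illustration under assumptions: it reuses `aws_lb.api_alb`, its HTTPS listener, and `aws_lb.api_nlb_static` from the earlier examples, and the new resource names are hypothetical.

```terraform
# Target group of type "alb": the NLB forwards TCP connections to the ALB.
resource "aws_lb_target_group" "alb_behind_nlb" {
  name        = "alb-behind-nlb-tg"
  target_type = "alb"
  port        = 443
  protocol    = "TCP"
  vpc_id      = var.vpc_id

  health_check {
    protocol = "HTTPS"
    path     = "/health"
  }
}

# Register the ALB itself. The ALB must already have a listener on this port.
resource "aws_lb_target_group_attachment" "alb_behind_nlb" {
  target_group_arn = aws_lb_target_group.alb_behind_nlb.arn
  target_id        = aws_lb.api_alb.arn
  port             = 443
  depends_on       = [aws_lb_listener.https]
}

# NLB listener on the Elastic IP addresses, passing TCP through to the ALB.
resource "aws_lb_listener" "nlb_to_alb" {
  load_balancer_arn = aws_lb.api_nlb_static.arn
  port              = 443
  protocol          = "TCP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.alb_behind_nlb.arn
  }
}
```

TLS is still terminated at the ALB in this layout, so certificates, listener rules, and authentication stay where they were; the NLB only contributes the fixed addresses.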

Cost Optimization

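Both ALB and NLB charge an hourly fee plus usage measured in capacity units. ALB usage is billed in Load Balancer Capacity Units (LCU), which measure new connections, active connections, processed bytes, and rule evaluations, so complex rule sets increase costs. NLB usage is billed in Network LCUs (NLCU), which measure new connections or flows, active connections or flows, and processed bytes. For high-throughput, simple routing, NLB is often cheaper.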

Health Checking and Failover Strategies

Both ALB and NLB implement robust health checking to ensure traffic only reaches healthy targets. Understanding the configuration options is essential for reliable operation.

Health Check Parameters
Parameter	ALB Default	NLB Default	Recommendation
Interval	30 seconds	30 seconds	10s for containers, 30s for VMs
Healthy threshold	5 checks	3 checks	2-3 for faster recovery
Unhealthy threshold	2 checks	3 checks	2-3 to avoid flapping
Timeout	5 seconds	10 seconds	Less than interval; adjust for cold starts
Protocol	HTTP	TCP	HTTP when possible for deeper checks
Path	/	N/A (TCP)	Dedicated /health endpoint

Health Check Best Practices

terraform

# Production-optimized health check for containerized service
resource "aws_lb_target_group" "production" {
  name        = "production-tg"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip"
  
  health_check {
    enabled             = true
    path                = "/health"
    port                = "traffic-port"
    protocol            = "HTTP"
    
    # Fast detection: 10s interval, 2 fails = 20s to detect
    interval            = 10
    healthy_threshold   = 2
    unhealthy_threshold = 2
    
    # Timeout must be less than interval
    timeout             = 5
    
    # Accept any 2xx status
    matcher             = "200-299"
  }
  
  # Slow start prevents thundering herd on new instances
  slow_start = 60
  
  # Quick deregistration for fast deployments
  deregistration_delay = 30
}
 
# NLB with HTTP health check (for TCP targets)
resource "aws_lb_target_group" "nlb_http_check" {
  name        = "nlb-http-check-tg"
  port        = 443
  protocol    = "TCP"
  vpc_id      = var.vpc_id
  target_type = "instance"
  
  # Use HTTP health check even for TCP target group
  health_check {
    enabled             = true
    port                = 8080        # Dedicated health check port
    protocol            = "HTTP"      # HTTP check for TCP target
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    interval            = 10
  }
}

Failover Behavior:

Unhealthy target detection: After unhealthy_threshold consecutive failed checks, target is marked unhealthy
Traffic drain: New requests stop routing to unhealthy target
Existing connections: ALB terminates; NLB depends on connection_termination setting
Recovery: After healthy_threshold successful checks, target receives traffic again
Slow start (if configured): Traffic ramps up gradually on recovered targets

Cross-Zone Load Balancing:

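ALB: Enabled by default at the load balancer level with no cross-AZ data transfer charge; it can be turned off for individual target groups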
NLB: Disabled by default, can be enabled (incurs cross-AZ data transfer charges)

Enable cross-zone for balanced distribution when target counts vary between AZs.

Health Check Costs

Health checks consume target resources. With 100 targets and 10s intervals, a single load balancer node generates 600 health check requests per minute for the target group, and each node (at least one per enabled AZ) runs its own checks. Design your health endpoints to be lightweight: avoid database queries or expensive computations in health checks.
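The arithmetic behind that figure can be worked through with Terraform locals, which is handy when the interval and target count already live in variables. This is a sketch only: the node count of 3 is an assumption (one node per enabled AZ), and real load balancers may run more nodes as they scale.

```terraform
locals {
  target_count           = 100
  check_interval_seconds = 10
  load_balancer_nodes    = 3 # hypothetical: one node in each of three AZs

  # 60 / 10 = 6 checks per target per minute, times 100 targets = 600
  checks_per_minute_per_node = local.target_count * (60 / local.check_interval_seconds)

  # 600 * 3 = 1800 across all nodes
  checks_per_minute_total = local.checks_per_minute_per_node * local.load_balancer_nodes
}

output "checks_per_minute_per_node" {
  value = local.checks_per_minute_per_node
}

output "checks_per_minute_total" {
  value = local.checks_per_minute_total
}
```

Seen from a single target, the same assumptions mean 18 health check requests per minute, which is why the endpoint should stay cheap.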

Advanced ALB/NLB Features

AWS load balancers include sophisticated features that extend beyond basic load distribution.

Advanced Capabilities

•ALB Authentication — Integrate with Cognito or any OIDC provider for built-in authentication before requests reach targets
•ALB Fixed Response — Return custom responses (maintenance pages, health checks) without targets
•Lambda Warming — For ALB Lambda targets, keep functions warm with Lambda provisioned concurrency (configured on the function, not on the ALB)
•NLB PrivateLink — Expose services to other VPCs or AWS accounts without VPC peering
•Connection Draining — Gracefully complete in-flight requests during target deregistration
•Access Logging — Detailed request logs to S3 for analysis and compliance
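The PrivateLink item above amounts to publishing an NLB as a VPC endpoint service. A minimal sketch, assuming an internal NLB named `aws_lb.internal_nlb` (hypothetical, not defined in this page) and a placeholder consumer account ID.

```terraform
# Publish the NLB as an endpoint service that other VPCs or accounts can
# connect to through interface endpoints, without VPC peering.
resource "aws_vpc_endpoint_service" "api" {
  network_load_balancer_arns = [aws_lb.internal_nlb.arn] # hypothetical NLB
  acceptance_required        = true # connection requests must be approved

  # Placeholder account; replace with the principals allowed to connect.
  allowed_principals = ["arn:aws:iam::123456789012:root"]

  tags = {
    Name = "api-endpoint-service"
  }
}
```

Consumers then create an interface VPC endpoint against the service name exported by this resource, and traffic stays on the AWS network.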

Advanced ALB Authentication

terraform

# ALB with Cognito authentication
resource "aws_lb_listener_rule" "protected_api" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 10
  
  action {
    type = "authenticate-cognito"
    authenticate_cognito {
      user_pool_arn       = aws_cognito_user_pool.api.arn
      user_pool_client_id = aws_cognito_user_pool_client.alb.id
      user_pool_domain    = aws_cognito_user_pool_domain.api.domain
      
      session_cookie_name = "AWSELBAuthSession"
      session_timeout     = 3600
      
      on_unauthenticated_request = "authenticate"
      # Options: deny, allow, authenticate
    }
    order = 1
  }
  
  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.protected.arn
    order            = 2
  }
  
  condition {
    path_pattern {
      values = ["/admin/*", "/internal/*"]
    }
  }
}
 
# ALB with OIDC authentication (generic OpenID Connect)
resource "aws_lb_listener_rule" "oidc_protected" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 20
  
  action {
    type = "authenticate-oidc"
    authenticate_oidc {
      authorization_endpoint = "https://idp.example.com/authorize"
      client_id              = var.oidc_client_id
      client_secret          = var.oidc_client_secret
      issuer                 = "https://idp.example.com"
      token_endpoint         = "https://idp.example.com/token"
      user_info_endpoint     = "https://idp.example.com/userinfo"
      
      on_unauthenticated_request = "authenticate"
    }
    order = 1
  }
  
  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.protected.arn
    order            = 2
  }
  
  condition {
    path_pattern {
      values = ["/dashboard/*"]
    }
  }
}
 
# ALB fixed response for health checks and maintenance
resource "aws_lb_listener_rule" "maintenance" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 5
  
  action {
    type = "fixed-response"
    fixed_response {
      content_type = "application/json"
      message_body = jsonencode({
        status  = "maintenance"
        message = "Service is under maintenance. Please try again later."
      })
      status_code = "503"
    }
  }
  
  condition {
    path_pattern {
      values = ["/api/*"]
    }
  }
  
  # Only active when maintenance variable is true
  count = var.maintenance_mode ? 1 : 0
}

WAF Integration

ALB integrates natively with AWS WAF (Web Application Firewall) for protection against common web exploits such as SQL injection and XSS, and for bot detection. Attach a WAF WebACL to your ALB for defense-in-depth security. WAF rate-based rules can also implement rate limiting before requests reach your targets.
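Attaching a WebACL with a rate-based rule might look like the following. It is a sketch under assumptions: the names and the request limit are hypothetical, and a production WebACL would normally include additional rule groups.

```terraform
resource "aws_wafv2_web_acl" "api" {
  name  = "api-web-acl"
  scope = "REGIONAL" # ALB associations use the regional scope

  default_action {
    allow {}
  }

  # Block any single IP that exceeds the limit within the evaluation window.
  rule {
    name     = "rate-limit-per-ip"
    priority = 1

    action {
      block {}
    }

    statement {
      rate_based_statement {
        limit              = 2000 # illustrative threshold
        aggregate_key_type = "IP"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "rate-limit-per-ip"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "api-web-acl"
    sampled_requests_enabled   = true
  }
}

# Associate the WebACL with the ALB defined earlier.
resource "aws_wafv2_web_acl_association" "api_alb" {
  resource_arn = aws_lb.api_alb.arn
  web_acl_arn  = aws_wafv2_web_acl.api.arn
}
```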

Scaling Behavior and Limits

AWS load balancers scale automatically in response to traffic, but understanding their scaling behavior and limits is crucial for capacity planning.

AWS Load Balancer Limits
Resource	ALB Default	NLB Default	Notes
Load balancers per region	50	50	Soft limit, request increase
Target groups per region	3,000	3,000	Soft limit
Targets per target group	1,000	500 (instance), 500 (IP)	Hard limits vary by type
Listeners per load balancer	50	50	Soft limit
Rules per load balancer	100	N/A	Not counting default rules; plan for microservices scale
Certificates per listener	25 (default) + SNI	25 (default) + SNI	SNI for multi-domain

Scaling Behavior:

ALB Scaling:

Scales based on traffic patterns (connections, requests, bandwidth)
May take 1-15 minutes to scale up for traffic spikes
Uses multiple nodes behind DNS for capacity
Pre-warming available via AWS Support for expected traffic spikes

NLB Scaling:

Designed for instant scaling to millions of requests
Uses flow-based hashing for distribution
Elastic IPs remain static even during scaling
Generally no pre-warming required

Pre-warming (ALB):

For expected traffic spikes (product launches, events), contact AWS Support to pre-warm your ALB. Provide:

Expected peak requests per second
Expected traffic duration
Average request/response size
Timeline for traffic increase

DNS Caching and Scaling

ALB scaling adds new nodes behind the DNS name. Clients caching DNS may not see new capacity immediately. Load balancer DNS records are published with a 60-second TTL, and Route 53 alias records that point to an ALB inherit it, so make sure client applications honor DNS TTLs instead of caching resolved addresses indefinitely. For traffic-critical applications, consider Global Accelerator, which provides static anycast IPs.
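Pointing a domain at the ALB is usually done with a Route 53 alias record rather than a hard-coded address, so that clients always resolve the current set of nodes. A minimal sketch, assuming a hosted zone ID supplied through a hypothetical variable `var.hosted_zone_id` and a placeholder record name.

```terraform
resource "aws_route53_record" "api" {
  zone_id = var.hosted_zone_id # hypothetical variable
  name    = "api.example.com"
  type    = "A"

  # Alias records track the load balancer's DNS name, so no TTL is set here.
  alias {
    name                   = aws_lb.api_alb.dns_name
    zone_id                = aws_lb.api_alb.zone_id
    evaluate_target_health = true
  }
}
```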

Managed vs Self-Managed: Decision Framework

The decision between AWS managed load balancers and self-managed solutions (NGINX, HAProxy, Envoy) involves trade-offs across operational burden, cost, features, and control.

Choose AWS Managed When

•Operational simplicity is a priority
•Team lacks dedicated infrastructure expertise
•Native AWS integration is valuable (IAM, CloudWatch, ACM)
•Compliance requires managed services (SOC2, HIPAA)
•Auto-scaling without capacity planning is desired
•Traffic patterns are unpredictable
•Starting a new project with limited ops resources

Choose Self-Managed When

•Maximum performance is critical (HAProxy)
•Complex routing logic beyond ALB rules (NGINX Lua, Envoy Wasm)
•Multi-cloud or hybrid deployments
•Cost optimization at very high scale
•Custom rate limiting algorithms (HAProxy stick tables)
•Extensive customization requirements
•Team has strong infrastructure expertise

Cost Comparison Considerations:

Low traffic: AWS managed is often cheaper (no EC2 costs for self-managed)
Medium traffic: Roughly equivalent, depends on request patterns
High traffic: Self-managed can be 50-70% cheaper at scale
Hidden costs of self-managed: Engineer time, on-call burden, upgrade complexity

Hybrid Approaches:

Many organizations use both:

AWS LB → Self-managed → Backends: ALB/NLB handles external traffic, NGINX/Envoy handles internal routing
NLB → NGINX: Combine NLB's static IPs and HA with NGINX's routing flexibility
Global Accelerator → NLB → Services: Global distribution with AWS-managed infrastructure
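The Global Accelerator → NLB arrangement in the last item can be sketched as an accelerator, a listener, and an endpoint group that targets the NLB. This is an illustration, not a complete multi-region setup: the resource names are hypothetical and only one endpoint group is shown.

```terraform
resource "aws_globalaccelerator_accelerator" "api" {
  name            = "api-accelerator"
  ip_address_type = "IPV4"
  enabled         = true
}

resource "aws_globalaccelerator_listener" "tcp_443" {
  accelerator_arn = aws_globalaccelerator_accelerator.api.id
  protocol        = "TCP"

  port_range {
    from_port = 443
    to_port   = 443
  }
}

# One endpoint group per region; this one sends all traffic to the NLB
# defined earlier in this page.
resource "aws_globalaccelerator_endpoint_group" "primary" {
  listener_arn = aws_globalaccelerator_listener.tcp_443.id

  endpoint_configuration {
    endpoint_id = aws_lb.api_nlb.arn
    weight      = 100
  }
}
```

Adding a second endpoint group in another region, with its own NLB, is what gives the geographic routing and failover described above.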

Page Complete

You now possess comprehensive knowledge of AWS ALB and NLB—from their architectural differences to configuration patterns, health checking, and the decision framework for choosing managed versus self-managed load balancing. In the final page, we'll synthesize everything into a comprehensive cloud load balancer comparison across major providers.

AWS ALB/NLB — Managed AWS Options

The Case for Managed Load Balancing

AWS offers three load balancer types:

Application Load Balancer (ALB) — Layer 7 (HTTP/HTTPS) load balancing
Network Load Balancer (NLB) — Layer 4 (TCP/UDP) load balancing
Gateway Load Balancer (GWLB) — For deploying third-party appliances

Learning Objectives

Application Load Balancer (ALB): Layer 7 Intelligence

Core Components:

ALB Architecture Components

•Listeners — Define the port and protocol (HTTP/HTTPS) for accepting connections. Multiple listeners can exist on a single ALB.
•Rules — Each listener has rules that match requests based on conditions (path, host, headers, etc.) and specify actions (forward, redirect, fixed response).
•Target Groups — Logical groupings of targets (EC2 instances, IP addresses, Lambda functions, or other ALBs) that receive traffic.
•Health Checks — ALB continuously monitors target health and only routes to healthy targets.

ALB Configuration via Terraform

terraform

# Application Load Balancer
resource "aws_lb" "api_alb" {
  name               = "api-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets           = var.public_subnet_ids
  
  # Enable access logs
  access_logs {
    bucket  = aws_s3_bucket.alb_logs.bucket
    prefix  = "api-alb"
    enabled = true
  }
  
  # Enable deletion protection for production
  enable_deletion_protection = true
  
  # Enable HTTP/2
  enable_http2 = true
  
  # Idle timeout
  idle_timeout = 60
  
  tags = {
    Environment = "production"
    Service     = "api"
  }
}
 
# HTTPS Listener with default action
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.api_alb.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate.api.arn
  
  # Default action: forward to main target group
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api_main.arn
  }
}
 
# HTTP Listener with redirect to HTTPS
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.api_alb.arn
  port              = 80
  protocol          = "HTTP"
  
  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}
 
# Path-based routing rule
resource "aws_lb_listener_rule" "api_v2" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 100
  
  condition {
    path_pattern {
      values = ["/api/v2/*"]
    }
  }
  
  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api_v2.arn
  }
}
 
# Host-based routing rule
resource "aws_lb_listener_rule" "admin" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 50
  
  condition {
    host_header {
      values = ["admin.example.com"]
    }
  }
  
  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.admin.arn
  }
}
 
# Weighted routing for canary deployments
resource "aws_lb_listener_rule" "canary" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 200
  
  condition {
    path_pattern {
      values = ["/feature/*"]
    }
  }
  
  action {
    type = "forward"
    forward {
      target_group {
        arn    = aws_lb_target_group.stable.arn
        weight = 90
      }
      target_group {
        arn    = aws_lb_target_group.canary.arn  
        weight = 10
      }
      stickiness {
        enabled  = true
        duration = 3600
      }
    }
  }
}

ALB Routing Capabilities:

ALB supports sophisticated routing based on multiple request attributes:

Condition Type	Match Against	Example Use Case
Host header	Domain name	Multi-tenant routing
Path pattern	URL path	API versioning, service routing
HTTP header	Any header value	A/B testing via custom headers
HTTP method	GET, POST, etc.	Read/write separation
Query string	URL parameters	Feature flags
Source IP	Client CIDR	Internal vs external traffic

Rules are evaluated by priority (lowest number first). When a match is found, the associated action is executed.

Rule Limit Awareness

Target Groups: Flexible Backend Configuration

Target Groups define the backend resources that receive traffic from the load balancer. ALB supports multiple target types, each suited to different deployment models.

ALB Target Types
Target Type	Description	Use Case
instance	EC2 instances by ID	Traditional EC2 deployments
ip	IP addresses	ECS tasks, containers, on-premises via Direct Connect
lambda	Lambda function	Serverless backends
alb	Another ALB	ALB chaining for migration/separation

Target Group Configuration

terraform

# Target Group for EC2 instances
resource "aws_lb_target_group" "api_main" {
  name        = "api-main-tg"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "instance"
  
  # Load balancing algorithm
  load_balancing_algorithm_type = "least_outstanding_requests"
  # Options: round_robin, least_outstanding_requests
  
  # Slow start for gradual traffic ramp-up
  slow_start = 60  # seconds
  
  # Deregistration delay
  deregistration_delay = 30  # seconds to drain before removing target
  
  # Stickiness configuration
  stickiness {
    type            = "lb_cookie"
    cookie_duration = 86400  # 1 day
    enabled         = true
  }
  
  # Health check configuration
  health_check {
    enabled             = true
    path                = "/health"
    port                = "traffic-port"
    protocol            = "HTTP"
    healthy_threshold   = 2
    unhealthy_threshold = 3
    timeout             = 5
    interval            = 30
    matcher             = "200-299"
  }
  
  tags = {
    Name = "api-main-tg"
  }
}
 
# Target Group for ECS Fargate (IP target type)
resource "aws_lb_target_group" "ecs_service" {
  name        = "ecs-service-tg"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip"  # Required for Fargate
  
  health_check {
    enabled             = true
    path                = "/health"
    protocol            = "HTTP"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 5
    interval            = 10  # Faster for containers
    matcher             = "200"
  }
}
 
# Target Group for Lambda
resource "aws_lb_target_group" "lambda" {
  name        = "lambda-tg"
  target_type = "lambda"
  
  # Lambda-specific: no VPC, port, or protocol
  health_check {
    # Off by default for Lambda targets; enabling it invokes (and bills) the function
    enabled = false
  }
}
 
# Attach Lambda function to target group
resource "aws_lb_target_group_attachment" "lambda" {
  target_group_arn = aws_lb_target_group.lambda.arn
  target_id        = aws_lambda_function.api.arn
  depends_on       = [aws_lambda_permission.alb]
}
 
# Lambda permission for ALB invocation
resource "aws_lambda_permission" "alb" {
  statement_id  = "AllowExecutionFromALB"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.api.function_name
  principal     = "elasticloadbalancing.amazonaws.com"
  source_arn    = aws_lb_target_group.lambda.arn
}

Target Group Best Practices

•Use slow start — Ramp up traffic to new targets over 30-60 seconds to allow JIT compilation and cache warming
•Tune deregistration delay — Balance between draining time and deployment speed; 30 seconds works for most HTTP services
•Health check tuning — Faster intervals (10s) for containers, slower (30s) for stable VMs; match to your service's startup time
•Cross-zone load balancing — Enabled by default for ALB; ensures even distribution across AZs even with unequal target counts

Lambda Cold Starts

Network Load Balancer (NLB): Layer 4 Performance

Key NLB Characteristics:

NLB Capabilities

•Extreme performance — Handles millions of requests per second with single-digit millisecond latencies
•Static IP addresses — Each NLB gets one static IP per AZ; can assign Elastic IPs for fixed whitelisting
•Connection preservation — Direct connection between client and target; preserves source IP
•TCP/UDP/TLS support — Native support for TCP, UDP, and TLS (terminated or passthrough)
•PrivateLink integration — Expose services via AWS PrivateLink endpoints
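
The PrivateLink integration can be sketched as an endpoint service backed by an internal NLB. The resource names, `var.private_subnet_ids`, and the consumer account ID below are placeholders assumed for illustration.

```terraform
# Internal NLB that fronts the service (hypothetical names throughout)
resource "aws_lb" "internal_nlb" {
  name               = "internal-nlb"
  internal           = true
  load_balancer_type = "network"
  subnets            = var.private_subnet_ids
}

# Publish the NLB as a PrivateLink endpoint service
resource "aws_vpc_endpoint_service" "api" {
  acceptance_required        = true # connection requests need manual approval
  network_load_balancer_arns = [aws_lb.internal_nlb.arn]
}

# Allow one consumer account to create endpoints (placeholder account ID)
resource "aws_vpc_endpoint_service_allowed_principal" "consumer" {
  vpc_endpoint_service_id = aws_vpc_endpoint_service.api.id
  principal_arn           = "arn:aws:iam::111122223333:root"
}
```

Consumers then create an interface VPC endpoint in their own VPC that points at this service, so no VPC peering is needed.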

NLB Configuration

terraform

# Network Load Balancer
resource "aws_lb" "api_nlb" {
  name               = "api-nlb"
  internal           = false
  load_balancer_type = "network"
  subnets            = var.public_subnet_ids
  
  # Assign Elastic IPs for static addressing
  # (Alternative: let AWS assign IPs)
  # Note: requires subnet_mapping instead of subnets
  
  # Enable cross-zone load balancing
  enable_cross_zone_load_balancing = true
  
  # Deletion protection
  enable_deletion_protection = true
  
  tags = {
    Environment = "production"
  }
}
 
# With Elastic IP assignment
resource "aws_lb" "api_nlb_static" {
  name               = "api-nlb-static"
  internal           = false
  load_balancer_type = "network"
  
  subnet_mapping {
    subnet_id     = var.public_subnet_a
    allocation_id = aws_eip.nlb_a.id
  }
  
  subnet_mapping {
    subnet_id     = var.public_subnet_b
    allocation_id = aws_eip.nlb_b.id
  }
}
 
# TCP Listener
resource "aws_lb_listener" "tcp" {
  load_balancer_arn = aws_lb.api_nlb.arn
  port              = 443
  protocol          = "TCP"
  
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.tcp.arn
  }
}
 
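# TLS Listener (NLB terminates TLS)
# Attached to the second NLB: a load balancer allows only one listener per
# port, and api_nlb already has the TCP listener on 443.
resource "aws_lb_listener" "tls" {
  load_balancer_arn = aws_lb.api_nlb_static.arn
  port              = 443
  protocol          = "TLS"
  certificate_arn   = aws_acm_certificate.api.arn
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  
  default_action {
    type             = "forward"
    # TCP or TLS target group, assumed to be defined elsewhere
    target_group_arn = aws_lb_target_group.api.arn
  }
}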
 
# UDP Listener (for DNS, gaming, etc.)
resource "aws_lb_listener" "udp" {
  load_balancer_arn = aws_lb.api_nlb.arn
  port              = 53
  protocol          = "UDP"
  
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.dns.arn
  }
}
 
# TCP Target Group
resource "aws_lb_target_group" "tcp" {
  name        = "tcp-tg"
  port        = 443
  protocol    = "TCP"
  vpc_id      = var.vpc_id
  target_type = "instance"
  
  # Preserve client IP
  preserve_client_ip = true
  
  # Connection termination on deregistration
  connection_termination = true
  
  # Health check
  health_check {
    enabled             = true
    port                = "traffic-port"
    protocol            = "TCP"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    interval            = 10
  }
  
  # Stickiness (source IP based)
  stickiness {
    enabled = true
    type    = "source_ip"
  }
}

NLB vs ALB Traffic Flow Comparison:

ALB (Layer 7):

Client → ALB (terminates TCP) → New TCP to Target

Client sees ALB IP as server
Target sees ALB IP as client
Full HTTP inspection

NLB (Layer 4):

Client → NLB (routes packets) → Target

Client sees NLB IP (or Elastic IP) as server
Target sees original client IP (with proxy protocol or preserve_client_ip)
No application-layer inspection

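This architectural difference explains NLB's performance advantage: for TCP and UDP listeners there is no connection termination and no HTTP parsing, just highly optimized packet routing. TLS listeners are the exception, because the NLB terminates TLS before forwarding.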
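
The traffic-flow comparison mentions proxy protocol as one way for targets to learn the client address. A minimal sketch of that option follows; the target group name and port are assumptions, and the application behind it must be configured to parse the PROXY protocol v2 header, otherwise connections will fail.

```terraform
# NLB target group that passes client details via PROXY protocol v2
# (hypothetical name and port)
resource "aws_lb_target_group" "tcp_proxy_protocol" {
  name        = "tcp-proxy-protocol-tg"
  port        = 8443
  protocol    = "TCP"
  vpc_id      = var.vpc_id
  target_type = "ip"

  # The packet source becomes the NLB address; the original client address
  # travels in the PROXY protocol header instead
  preserve_client_ip = false
  proxy_protocol_v2  = true
}
```

This variant is mainly useful when client IP preservation is unavailable or undesirable, for example when clients and targets could otherwise run into routing or hairpinning issues.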

When to Use NLB TLS

ALB vs NLB: Decision Framework

Choosing between ALB and NLB requires understanding their fundamental differences and matching capabilities to requirements.

ALB vs NLB Feature Comparison
Capability	ALB	NLB
OSI Layer	Layer 7 (HTTP/HTTPS)	Layer 4 (TCP/UDP/TLS)
Performance	Good (100K+ req/s)	Extreme (millions req/s)
Latency	~2-5 ms overhead	<1 ms overhead
Content-based routing	Yes (path, host, headers)	No
WebSocket support	Yes	Yes (TCP passthrough)
HTTP/2 support	Yes (including gRPC targets)	Passthrough only
Static IP	No (use Global Accelerator)	Yes (EIP per AZ)
Source IP preservation	Via X-Forwarded-For header	Native (preserve_client_ip)
Lambda targets	Yes	No
UDP support	No	Yes
PrivateLink	Indirect (as a target behind an NLB)	Full support
Price model	Hourly charge + LCUs (capacity units)	Hourly charge + NLCUs (capacity units)
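
For the HTTP/2 row, an ALB target group can be set to speak gRPC to its targets. The sketch below assumes a service on port 50051 that implements the standard gRPC health checking service; the name, port, and health check path are illustrative assumptions. gRPC target groups need an HTTPS listener on the ALB.

```terraform
# ALB target group for gRPC backends (hypothetical name and port)
resource "aws_lb_target_group" "grpc" {
  name             = "grpc-tg"
  port             = 50051
  protocol         = "HTTP"
  protocol_version = "GRPC"
  vpc_id           = var.vpc_id
  target_type      = "ip"

  health_check {
    enabled = true
    # Assumes the service exposes the standard gRPC health service
    path    = "/grpc.health.v1.Health/Check"
    # For gRPC target groups the matcher is a gRPC status code; 0 means OK
    matcher = "0"
  }
}
```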

Choose ALB When

•You need content-based routing (path, host, headers)
•Running microservices that require path routing
•Using Lambda as backend targets
•Need built-in authentication (Cognito, OIDC)
•Require detailed HTTP-level metrics
•Want managed WebSocket handling
•Applications communicate via HTTP/HTTPS only

Choose NLB When

•Ultra-low latency is critical
•Need static IPs for whitelisting
•Handling non-HTTP protocols (databases, gaming, custom TCP) or passing gRPC through without Layer 7 inspection
•Extreme throughput requirements
•Exposing services via PrivateLink
•Source IP preservation is required
•UDP traffic (DNS, VOIP, gaming)

Common Architectural Patterns:

Pattern 1: ALB + NLB Combination. Place an NLB in front of an ALB when you need both static IPs and content-based routing. The NLB provides the static addresses; the ALB handles HTTP routing.

Pattern 2: NLB for Internal, ALB for External. Use NLB for service-to-service communication (lower latency, simple routing) and ALB for public-facing APIs (rich routing, authentication).

Pattern 3: NLB with Global Accelerator. Combine NLB with AWS Global Accelerator for static anycast IPs with geographic routing and DDoS protection.
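
Pattern 1 can be sketched with a target group of type alb. The names `edge_nlb` and `app_alb` are hypothetical load balancers assumed to exist in the same VPC, and the ALB is assumed to have a listener on port 443.

```terraform
# Target group whose single target is an ALB
resource "aws_lb_target_group" "alb_behind_nlb" {
  name        = "alb-behind-nlb-tg"
  target_type = "alb"
  port        = 443
  protocol    = "TCP" # ALB-type target groups use TCP
  vpc_id      = var.vpc_id

  health_check {
    protocol = "HTTPS"
    path     = "/health"
  }
}

# Register the ALB; the port must match an ALB listener port
resource "aws_lb_target_group_attachment" "alb" {
  target_group_arn = aws_lb_target_group.alb_behind_nlb.arn
  target_id        = aws_lb.app_alb.arn # hypothetical ALB
  port             = 443
}

# NLB listener that hands TCP 443 to the ALB
resource "aws_lb_listener" "nlb_to_alb" {
  load_balancer_arn = aws_lb.edge_nlb.arn # hypothetical NLB with Elastic IPs
  port              = 443
  protocol          = "TCP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.alb_behind_nlb.arn
  }
}
```

TLS is still terminated on the ALB in this layout, so certificates and HTTP rules stay where they were.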

Cost Optimization

Health Checking and Failover Strategies

Both ALB and NLB implement robust health checking to ensure traffic only reaches healthy targets. Understanding the configuration options is essential for reliable operation.

Health Check Parameters
Parameter	ALB Default	NLB Default	Recommendation
Interval	30 seconds	30 seconds	10s for containers, 30s for VMs
Healthy threshold	5 checks	3 checks	2-3 for faster recovery
Unhealthy threshold	2 checks	3 checks	2-3 to avoid flapping
Timeout	5 seconds	10 seconds	Less than interval; adjust for cold starts
Protocol	HTTP	TCP	HTTP when possible for deeper checks
Path	/	N/A (TCP)	Dedicated /health endpoint

Health Check Best Practices

terraform

# Production-optimized health check for containerized service
resource "aws_lb_target_group" "production" {
  name        = "production-tg"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip"
  
  health_check {
    enabled             = true
    path                = "/health"
    port                = "traffic-port"
    protocol            = "HTTP"
    
    # Fast detection: 10s interval, 2 fails = 20s to detect
    interval            = 10
    healthy_threshold   = 2
    unhealthy_threshold = 2
    
    # Timeout must be less than interval
    timeout             = 5
    
    # Accept any 2xx status
    matcher             = "200-299"
  }
  
  # Slow start ramps traffic to newly registered targets instead of sending a full share at once
  slow_start = 60
  
  # Quick deregistration for fast deployments
  deregistration_delay = 30
}
 
# NLB with HTTP health check (for TCP targets)
resource "aws_lb_target_group" "nlb_http_check" {
  name        = "nlb-http-check-tg"
  port        = 443
  protocol    = "TCP"
  vpc_id      = var.vpc_id
  target_type = "instance"
  
  # Use HTTP health check even for TCP target group
  health_check {
    enabled             = true
    port                = 8080        # Dedicated health check port
    protocol            = "HTTP"      # HTTP check for TCP target
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    interval            = 10
  }
}

Failover Behavior:

Unhealthy target detection: After unhealthy_threshold consecutive failed checks, target is marked unhealthy
Traffic drain: New requests stop routing to unhealthy target
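Existing connections: ALB terminates; for NLB, connection_termination only applies to deregistering targets, and connections to unhealthy targets are governed by a separate target group attribute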
Recovery: After healthy_threshold successful checks, target receives traffic again
Slow start (if configured): Traffic ramps up gradually on recovered targets
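
The detection and recovery steps above translate into simple arithmetic: thresholds multiplied by the check interval. The sketch below uses the container-oriented values from this section; the local and output names are made up for illustration, and real timings can run somewhat longer because of check timeouts and scheduling.

```terraform
locals {
  hc_interval_seconds    = 10
  hc_unhealthy_threshold = 2
  hc_healthy_threshold   = 2

  # Approximate time until a failing target stops receiving new traffic
  detect_seconds = local.hc_interval_seconds * local.hc_unhealthy_threshold # 20

  # Approximate time until a recovered target receives traffic again
  recover_seconds = local.hc_interval_seconds * local.hc_healthy_threshold # 20
}

output "health_check_timing" {
  value = {
    detect_seconds  = local.detect_seconds
    recover_seconds = local.recover_seconds
  }
}
```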

Cross-Zone Load Balancing:

ALB: Enabled by default at the load balancer level (can be turned off per target group); no extra charge for cross-AZ traffic
NLB: Disabled by default, can be enabled (incurs cross-AZ data transfer charges)

Enable cross-zone for balanced distribution when target counts vary between AZs.
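
Besides the load-balancer-level flag, the AWS provider also exposes a per-target-group override. The sketch below turns cross-zone on for a single NLB target group; the name is hypothetical, and the attribute takes string values.

```terraform
# Per-target-group cross-zone setting (hypothetical target group)
resource "aws_lb_target_group" "nlb_cross_zone" {
  name        = "nlb-cross-zone-tg"
  port        = 443
  protocol    = "TCP"
  vpc_id      = var.vpc_id
  target_type = "instance"

  # "true", "false", or "use_load_balancer_configuration" (the default)
  load_balancing_cross_zone_enabled = "true"
}
```

Scoping the setting to one target group limits cross-AZ data transfer charges to the services that actually have uneven target counts.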

Health Check Costs

Advanced ALB/NLB Features

AWS load balancers include sophisticated features that extend beyond basic load distribution.

Advanced Capabilities

•ALB Authentication: Integrate with Cognito or any OIDC provider for built-in authentication before requests reach targets
•ALB Fixed Response: Return custom responses (maintenance pages, health checks) without targets
•ALB Lambda Targets: Invoke Lambda functions directly; cold starts are reduced with Lambda provisioned concurrency, which is configured on the function rather than on the ALB
•NLB PrivateLink: Expose services to other VPCs or AWS accounts without VPC peering
•Connection Draining: Gracefully complete in-flight requests during target deregistration
•Access Logging: Detailed request logs to S3 for analysis and compliance
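
Access logging is configured on the load balancer itself. A minimal sketch follows, assuming a pre-existing S3 bucket passed in as `var.access_log_bucket` and a security group ID in `var.alb_security_group_id` (both hypothetical variables). The bucket policy must allow Elastic Load Balancing to write to it, which is not shown here.

```terraform
# ALB with access logs delivered to S3 (hypothetical names and variables)
resource "aws_lb" "logged_alb" {
  name               = "logged-alb"
  internal           = false
  load_balancer_type = "application"
  subnets            = var.public_subnet_ids
  security_groups    = [var.alb_security_group_id]

  access_logs {
    bucket  = var.access_log_bucket
    prefix  = "alb/prod" # illustrative key prefix
    enabled = true
  }
}
```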

Advanced ALB Authentication

terraform

# ALB with Cognito authentication
resource "aws_lb_listener_rule" "protected_api" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 10
  
  action {
    type = "authenticate-cognito"
    authenticate_cognito {
      user_pool_arn       = aws_cognito_user_pool.api.arn
      user_pool_client_id = aws_cognito_user_pool_client.alb.id
      user_pool_domain    = aws_cognito_user_pool_domain.api.domain
      
      session_cookie_name = "AWSELBAuthSession"
      session_timeout     = 3600
      
      on_unauthenticated_request = "authenticate"
      # Options: deny, allow, authenticate
    }
    order = 1
  }
  
  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.protected.arn
    order            = 2
  }
  
  condition {
    path_pattern {
      values = ["/admin/*", "/internal/*"]
    }
  }
}
 
# ALB with OIDC authentication (generic OpenID Connect)
resource "aws_lb_listener_rule" "oidc_protected" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 20
  
  action {
    type = "authenticate-oidc"
    authenticate_oidc {
      authorization_endpoint = "https://idp.example.com/authorize"
      client_id              = var.oidc_client_id
      client_secret          = var.oidc_client_secret
      issuer                 = "https://idp.example.com"
      token_endpoint         = "https://idp.example.com/token"
      user_info_endpoint     = "https://idp.example.com/userinfo"
      
      on_unauthenticated_request = "authenticate"
    }
    order = 1
  }
  
  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.protected.arn
    order            = 2
  }
  
  condition {
    path_pattern {
      values = ["/dashboard/*"]
    }
  }
}
 
# ALB fixed response for maintenance mode
resource "aws_lb_listener_rule" "maintenance" {
  # Only created when the maintenance variable is true
  count = var.maintenance_mode ? 1 : 0

  listener_arn = aws_lb_listener.https.arn
  priority     = 5  # lower number than the rules above, so it is evaluated first
  
  action {
    type = "fixed-response"
    fixed_response {
      content_type = "application/json"
      message_body = jsonencode({
        status  = "maintenance"
        message = "Service is under maintenance. Please try again later."
      })
      status_code = "503"
    }
  }
  
  condition {
    path_pattern {
      values = ["/api/*"]
    }
  }
}

WAF Integration

Scaling Behavior and Limits

AWS load balancers scale automatically in response to traffic, but understanding their scaling behavior and limits is crucial for capacity planning.

AWS Load Balancer Limits
Resource	ALB Default	NLB Default	Notes
Load balancers per region	50	50	Soft limit, request increase
Target groups per region	3,000	3,000	Soft limit
Targets per target group	1,000	500 (instance), 500 (IP)	Hard limits vary by type
Listeners per load balancer	50	50	Soft limit
Rules per load balancer	100 (excluding default rules)	N/A	Plan for microservices scale
Certificates per listener	25 (default) + SNI	25 (default) + SNI	SNI for multi-domain

Scaling Behavior:

ALB Scaling:

Scales based on traffic patterns (connections, requests, bandwidth)
May take 1-15 minutes to scale up for traffic spikes
Uses multiple nodes behind DNS for capacity
Pre-warming available via AWS Support for expected traffic spikes

NLB Scaling:

Designed for instant scaling to millions of requests
Uses flow-based hashing for distribution
Elastic IPs remain static even during scaling
Generally no pre-warming required

Pre-warming (ALB):

For expected traffic spikes (product launches, events), contact AWS Support to pre-warm your ALB. Provide:

Expected peak requests per second
Expected traffic duration
Average request/response size
Timeline for traffic increase

DNS Caching and Scaling

Managed vs Self-Managed: Decision Framework

The decision between AWS managed load balancers and self-managed solutions (NGINX, HAProxy, Envoy) involves trade-offs across operational burden, cost, features, and control.

Choose AWS Managed When

•Operational simplicity is a priority
•Team lacks dedicated infrastructure expertise
•Native AWS integration is valuable (IAM, CloudWatch, ACM)
•Compliance work (SOC 2, HIPAA) is easier with services already covered by AWS compliance programs
•Auto-scaling without capacity planning is desired
•Traffic patterns are unpredictable
•Starting a new project with limited ops resources

Choose Self-Managed When

•Maximum performance is critical (HAProxy)
•Complex routing logic beyond ALB rules (NGINX Lua, Envoy Wasm)
•Multi-cloud or hybrid deployments
•Cost optimization at very high scale
•Custom rate limiting algorithms (HAProxy stick tables)
•Extensive customization requirements
•Team has strong infrastructure expertise

Cost Comparison Considerations:

Low traffic: AWS managed is often cheaper (no EC2 costs for self-managed)
Medium traffic: Roughly equivalent, depends on request patterns
High traffic: Self-managed can be 50-70% cheaper at scale
Hidden costs of self-managed: Engineer time, on-call burden, upgrade complexity

Hybrid Approaches:

Many organizations use both:

AWS LB → Self-managed → Backends: ALB/NLB handles external traffic, NGINX/Envoy handles internal routing
NLB → NGINX: Combine NLB's static IPs and HA with NGINX's routing flexibility
Global Accelerator → NLB → Services: Global distribution with AWS-managed infrastructure
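
The Global Accelerator layout in the last item can be sketched as follows. The accelerator and listener names are hypothetical; `aws_lb.api_nlb` refers to the NLB defined earlier on this page, and the endpoint group is assumed to live in the provider's configured region.

```terraform
# Accelerator providing static anycast IPs (hypothetical names)
resource "aws_globalaccelerator_accelerator" "edge" {
  name            = "edge-accelerator"
  ip_address_type = "IPV4"
  enabled         = true
}

# Accept TCP 443 at the edge
resource "aws_globalaccelerator_listener" "tcp_443" {
  accelerator_arn = aws_globalaccelerator_accelerator.edge.id
  protocol        = "TCP"

  port_range {
    from_port = 443
    to_port   = 443
  }
}

# Send traffic to the regional NLB
resource "aws_globalaccelerator_endpoint_group" "primary" {
  listener_arn = aws_globalaccelerator_listener.tcp_443.id

  endpoint_configuration {
    endpoint_id = aws_lb.api_nlb.arn
    weight      = 100
  }
}
```

Adding a second endpoint group in another region would extend this to multi-region failover, at the cost of running an NLB in each region.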
