One of the most common sources of confusion in infrastructure design is the relationship between API Gateways and Load Balancers. They both sit in front of your services. They both route traffic. They both can perform health checks. So what's the difference?
The confusion is understandable—many gateway products include load balancing features, and many load balancers have grown to include API-like capabilities. But they solve fundamentally different problems, operate at different layers, and have different core competencies.
This page will dissect the differences with precision, helping you understand when to use each, how they complement each other, and common architectural patterns that combine both effectively.
By the end of this page, you will understand the fundamental differences between API Gateways and Load Balancers, their respective strengths, how they work together in production architectures, and how to make informed decisions about when to use each.
At their core, API Gateways and Load Balancers solve different problems:
Load Balancer: How do I distribute traffic efficiently across server instances?
API Gateway: How do I manage and secure API traffic at the application layer?
This distinction manifests in their design, capabilities, and position in the stack.
| Dimension | Load Balancer | API Gateway |
|---|---|---|
| OSI Layer | Layer 4 (TCP/UDP) or Layer 7 (HTTP) | Layer 7 (HTTP/API) exclusively |
| Primary Purpose | Distribute load across instances | Manage API traffic and enforce policies |
| Traffic Understanding | Connections, packets, or basic HTTP | Deep API semantics: paths, methods, headers, bodies, auth |
| State Awareness | Typically stateless (connection level) | API-aware: sessions, tokens, rate limits, quotas |
| Authentication | None or basic (IP, certificate) | Full: JWT, OAuth2, API keys, mTLS |
| Rate Limiting | Basic (connections/sec) | Sophisticated (per-user, per-endpoint, token bucket) |
| Request Transformation | None or minimal | Full: header manipulation, body transformation, protocol translation |
| Response Handling | Pass-through | Transformation, caching, error normalization |
| Observability | Connection metrics, health checks | Full API metrics: latency by endpoint, error rates, business metrics |
| Typical Products | HAProxy, NGINX (LB mode), AWS ELB/ALB | Kong, AWS API Gateway, Apigee, NGINX (API mode) |
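The "token bucket" entry in the Rate Limiting row can be made concrete with a small sketch. This is a minimal in-memory illustration, not any particular gateway's implementation; the class name `TokenBucketLimiter`, its parameters, and the use of a caller-supplied timestamp are all assumptions made for the example.

```typescript
// Hypothetical per-key token bucket. Each key (for example a user ID or API key)
// gets its own bucket, which is what makes the limit "per-user".
interface Bucket {
  tokens: number;        // tokens currently available
  lastRefillMs: number;  // timestamp of the last refill calculation
}

class TokenBucketLimiter {
  private readonly capacity: number;         // maximum burst size
  private readonly refillPerSecond: number;  // sustained request rate
  private readonly buckets = new Map<string, Bucket>();

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true if the request for `key` may proceed at time `nowMs`.
  allow(key: string, nowMs: number): boolean {
    const bucket = this.buckets.get(key) ?? { tokens: this.capacity, lastRefillMs: nowMs };

    // Refill in proportion to elapsed time, never exceeding capacity.
    const elapsedSeconds = Math.max(0, nowMs - bucket.lastRefillMs) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.lastRefillMs = nowMs;

    const allowed = bucket.tokens >= 1;
    if (allowed) {
      bucket.tokens -= 1;
    }
    this.buckets.set(key, bucket);
    return allowed;
  }
}
```

A connection-rate limit on a load balancer has no notion of `key`; the per-consumer bucket is the part that requires the gateway to have identified the caller first. A production gateway would typically keep this state in a shared store so that all gateway instances see the same counts.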
Understanding the OSI layer distinction is key:
Layer 4 (Transport Layer) Load Balancers:
Layer 7 (Application Layer) Load Balancers:
API Gateways:
    Incoming Request:
    ───────────────────────────────────────────────────────────────────────
    POST /api/v2/orders HTTP/1.1
    Host: api.example.com
    Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...
    Content-Type: application/json
    X-Correlation-ID: abc-123

    {
      "customerId": "cust_12345",
      "items": [
        { "productId": "prod_001", "quantity": 2 }
      ],
      "paymentMethod": "card_ending_4242"
    }
    ───────────────────────────────────────────────────────────────────────

    Layer 4 Load Balancer sees:
    ───────────────────────────────────────────────────────────────────────
      Source IP:   203.0.113.45:52841
      Dest IP:     10.0.1.50:443
      Protocol:    TCP
      Bytes:       847
      Connection:  New

    Decision: Route to server 10.0.1.100:8080 (least connections)
    ───────────────────────────────────────────────────────────────────────

    Layer 7 Load Balancer sees:
    ───────────────────────────────────────────────────────────────────────
      + Everything L4 sees, plus:
      Method:          POST
      Path:            /api/v2/orders
      Host:            api.example.com
      Headers:         Authorization, Content-Type, X-Correlation-ID
      Content-Length:  147

    Decision: Route to order-service backend pool
              (path matching /api/v2/orders*)
    ───────────────────────────────────────────────────────────────────────

    API Gateway sees:
    ───────────────────────────────────────────────────────────────────────
      + Everything L7 sees, plus:
      API Version:          v2
      API Operation:        CreateOrder
      Auth Token:           Valid JWT, expires in 45 minutes
      User ID:              user_789 (from token)
      User Roles:           [customer, premium-tier]
      Tenant ID:            tenant_abc
      Rate Limit Status:    47/100 requests remaining this minute
      Request Body Parsed:  Valid JSON, matches CreateOrderRequest schema
      Customer ID:          cust_12345
      Product IDs:          [prod_001]

    Decision:
      1. ✓ Authentication valid
      2. ✓ Authorization: premium-tier can access CreateOrder
      3. ✓ Rate limit: under quota
      4. ✓ Schema validation: request body valid
      5. Route to order-service-v2 (internal endpoint)
      6. Add headers: X-User-ID, X-Tenant-ID, X-Request-ID (for tracing)
      7. Record metrics: CreateOrder request, latency, user tier
    ───────────────────────────────────────────────────────────────────────

The key insight: a load balancer sees traffic as connections or HTTP requests. An API Gateway sees traffic as API operations with full semantic understanding: who is calling, what are they authorized to do, is this request valid, and how does it fit into the broader API contract?
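The first two decisions in the walkthrough above (least connections at L4, path matching at L7) can be sketched as two small functions. This is an illustrative sketch only; the type names, the `pool` strings, and the longest-prefix rule are assumptions chosen for the example and do not describe a specific product.

```typescript
// What an L4 balancer can reason about: backends and their connection counts.
interface Backend {
  address: string;
  activeConnections: number;
}

// L4-style decision: pick the backend with the fewest active connections.
function pickLeastConnections(backends: Backend[]): Backend {
  if (backends.length === 0) {
    throw new Error("no backends available");
  }
  return backends.reduce((best, b) => (b.activeConnections < best.activeConnections ? b : best));
}

// What an L7 balancer additionally sees after parsing the HTTP request.
interface HttpRequestLine {
  method: string;
  path: string;
  host: string;
}

interface PathRule {
  prefix: string;  // e.g. "/api/v2/orders"
  pool: string;    // hypothetical backend pool name
}

// L7-style decision: the longest matching path prefix selects the pool.
function pickPoolByPath(req: HttpRequestLine, rules: PathRule[]): string | undefined {
  let match: PathRule | undefined;
  for (const rule of rules) {
    const matches = req.path.startsWith(rule.prefix);
    if (matches && (match === undefined || rule.prefix.length > match.prefix.length)) {
      match = rule;
    }
  }
  return match?.pool;
}
```

Neither function knows who the caller is or whether the body is valid; those checks need the identity and schema context that the gateway adds.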
Load balancers are purpose-built for traffic distribution and excel in specific scenarios:
Layer 4 load balancers can handle millions of connections per second with minimal latency overhead. When raw throughput is the priority:
Performance Comparison (typical values):

| Component | Throughput (RPS) | Added Latency | Connections/sec |
|---|---|---|---|
| L4 Load Balancer | 1,000,000+ | < 50 μs | 100,000+ |
| L7 Load Balancer | 100,000-500,000 | 100-500 μs | 50,000+ |
| API Gateway | 10,000-100,000 | 1-10 ms | 10,000+ |

Note: Values are illustrative. Actual performance depends heavily on hardware, configuration, and workload characteristics.

The performance difference reflects the work each layer does:
- L4: Minimal processing, just routes packets
- L7: Parses HTTP, inspects headers
- Gateway: Full request processing, authentication, rate limiting

For service-to-service communication within a trusted zone, full API gateway capabilities are often unnecessary. A simple L7 load balancer or service mesh provides:
Load balancers handle stateful connections effectively:
Global load balancers (AWS Global Accelerator, Cloudflare) route users to the nearest healthy endpoint:
| Scenario | Why Load Balancer Excels | Example |
|---|---|---|
| Database Proxying | L4 handling of MySQL/PostgreSQL protocol | PgBouncer, ProxySQL |
| Internal Microservices | Simple, fast routing without API overhead | Kubernetes Service load balancing |
| Non-HTTP Protocols | Protocol-agnostic at L4 | MQTT, custom TCP, gRPC passed through as opaque TCP |
| Extreme Scale | Millions of RPS with microsecond latency | CDN origin load balancing |
| Long-Lived Connections | Connection affinity and health checking | WebSocket game servers |
In production, load balancers often sit in front of API Gateways. The load balancer handles connection distribution across gateway instances; the gateway handles API logic. The load balancer is infrastructure for the gateway, not a replacement for it.
API Gateways shine when you need to manage APIs as first-class entities—not just route traffic, but govern it.
When exposing APIs to external consumers (mobile apps, third parties, partners), you need:
APIs have a lifecycle: they're designed, deployed, versioned, deprecated, and retired. Gateways support this lifecycle:
API Lifecycle
─────────────────────────────────────────────────────────────────
Design Deploy Operate Retire
│ │ │ │
│ ┌─────────────┴────────────────┴────────────────┴─────────┐
│ │ API GATEWAY │
│ │ │
▼ │ Version Multiple versions Deprecation │
Schema │ Registration simultaneously warnings │
Def. │ │
│ Documentation Traffic Usage │
│ Publishing splitting analytics │
│ │
│ Consumer Canary Sunset │
│ Onboarding deployments notifications │
│ │
└────────────────────────────────────────────────────────┘
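The "traffic splitting" and "canary deployments" boxes in the lifecycle diagram can be illustrated with a weighted version chooser. This is a sketch under assumptions: the function names are hypothetical, the hash is a simple FNV-1a-style hash over UTF-16 code units, and hashing the consumer ID (rather than choosing randomly per request) is one possible design that keeps each consumer pinned to a single version.

```typescript
interface VersionWeight {
  version: string;        // e.g. "v1" (stable) or "v2" (canary)
  weightPercent: number;  // share of consumers, all weights must sum to 100
}

// FNV-1a-style 32-bit hash; good enough for bucketing, not for security.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Deterministically maps a consumer to one API version according to the weights.
function chooseVersion(consumerId: string, weights: VersionWeight[]): string {
  const total = weights.reduce((sum, w) => sum + w.weightPercent, 0);
  if (total !== 100) {
    throw new Error("weights must sum to 100");
  }
  const bucket = fnv1a(consumerId) % 100; // 0..99
  let cumulative = 0;
  for (const w of weights) {
    cumulative += w.weightPercent;
    if (bucket < cumulative) {
      return w.version;
    }
  }
  throw new Error("unreachable when weights sum to 100");
}
```

Raising the canary weight step by step (for example 5, then 25, then 100) moves consumers to the new version gradually while each consumer keeps seeing a consistent version between steps.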
Gateways bridge different protocols and data formats:
    // Gateway configuration: REST to gRPC translation

    // Minimal shapes for the two types this snippet references but does not
    // define. They are assumptions for illustration, not a real gateway's API.
    type JSONSchema = Record<string, unknown>;

    interface FieldMapping {
      from: string;        // dotted path in the source (path, body, identity, response)
      to: string;          // dotted path in the target message
      transform?: string;  // optional named transform, e.g. 'timestampToISO'
    }

    interface ProtocolTranslationRoute {
      publicApi: {
        method: 'GET' | 'POST' | 'PUT' | 'DELETE';
        path: string;
        requestSchema?: JSONSchema;
      };
      backendGrpc: {
        service: string;
        method: string;
        requestMapping: FieldMapping[];
        responseMapping: FieldMapping[];
      };
    }

    const userServiceRoutes: ProtocolTranslationRoute[] = [
      {
        // External REST API
        publicApi: {
          method: 'GET',
          path: '/api/v1/users/{userId}',
        },
        // Internal gRPC service
        backendGrpc: {
          service: 'user-service.grpc.internal:50051',
          method: 'UserService.GetUser',
          requestMapping: [
            { from: 'path.userId', to: 'request.user_id' },
          ],
          responseMapping: [
            { from: 'response.user.id', to: 'id' },
            { from: 'response.user.email', to: 'email' },
            { from: 'response.user.created_at', to: 'createdAt', transform: 'timestampToISO' },
            // Exclude internal fields from external response:
            // 'response.user.internal_notes' is not mapped
          ],
        },
      },
      {
        publicApi: {
          method: 'POST',
          path: '/api/v1/users',
          requestSchema: {
            type: 'object',
            required: ['email', 'name'],
            properties: {
              email: { type: 'string', format: 'email' },
              name: { type: 'string', minLength: 1 },
            },
          },
        },
        backendGrpc: {
          service: 'user-service.grpc.internal:50051',
          method: 'UserService.CreateUser',
          requestMapping: [
            { from: 'body.email', to: 'request.email' },
            { from: 'body.name', to: 'request.name' },
            { from: 'identity.tenantId', to: 'request.tenant_id' }, // Inject from auth
          ],
          responseMapping: [
            { from: 'response.user', to: 'user' },
            { from: 'response.created', to: 'created' },
          ],
        },
      },
    ];

For APIs consumed by developers (internal or external), the gateway enables:
When your API is a product—something external developers use—the gateway is your storefront. It handles the developer experience: signup, keys, documentation, usage limits, and support. No load balancer provides this.
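Returning to the REST-to-gRPC route configuration earlier in this section: the `from`/`to` mappings imply a small piece of machinery that copies values between dotted paths. The sketch below shows one way that step might look. The helper names (`getPath`, `setPath`, `applyMappings`) are hypothetical, named transforms are left out, and a real gateway would also need to guard against unsafe keys in user-supplied paths.

```typescript
interface FieldMapping {
  from: string;  // dotted source path, e.g. "body.email"
  to: string;    // dotted target path, e.g. "request.email"
}

type Json = { [key: string]: unknown };

// Reads a dotted path; returns undefined if any segment is missing.
function getPath(source: Json, path: string): unknown {
  let current: unknown = source;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) {
      return undefined;
    }
    current = (current as Json)[key];
  }
  return current;
}

// Writes a value at a dotted path, creating intermediate objects as needed.
function setPath(target: Json, path: string, value: unknown): void {
  const keys = path.split(".");
  const last = keys.pop();
  if (last === undefined) {
    return;
  }
  let current: Json = target;
  for (const key of keys) {
    const existing = current[key];
    if (typeof existing === "object" && existing !== null) {
      current = existing as Json;
    } else {
      const created: Json = {};
      current[key] = created;
      current = created;
    }
  }
  current[last] = value;
}

// Builds the outgoing message from only the mapped fields.
function applyMappings(context: Json, mappings: FieldMapping[]): Json {
  const out: Json = {};
  for (const m of mappings) {
    const value = getPath(context, m.from);
    if (value !== undefined) {
      setPath(out, m.to, value);
    }
  }
  return out;
}
```

Because only mapped fields are copied, anything not listed (such as the unmapped `internal_notes` field in the configuration) never reaches the other side. The mapping acts as an allowlist.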
In production architectures, load balancers and API gateways are complementary, not competing. Understanding their typical arrangements clarifies their relationship.
The most common production pattern:
                              INTERNET
                                  │
                                  ▼
    ┌────────────────────────────────────────────────────────────┐
    │  CLOUD LOAD BALANCER (L4/L7)                               │
    │  (AWS ALB, GCP LB, Azure LB)                               │
    │                                                            │
    │  Responsibilities:                                         │
    │    ✓ TLS termination (SSL offloading)                      │
    │    ✓ Connection distribution across gateway instances      │
    │    ✓ Health checks on gateway                              │
    │    ✓ Geographic routing (if global)                        │
    │    ✓ DDoS protection (L4 attacks)                          │
    │                                                            │
    │  Does NOT do:                                              │
    │    ✗ Authentication                                        │
    │    ✗ Rate limiting per user                                │
    │    ✗ API versioning                                        │
    │    ✗ Request transformation                                │
    └─────────────────────────────┬──────────────────────────────┘
                                  │
            ┌─────────────────────┼─────────────────────┐
            ▼                     ▼                     ▼
    ┌────────────────┐    ┌────────────────┐    ┌────────────────┐
    │  API Gateway   │    │  API Gateway   │    │  API Gateway   │
    │  Instance 1    │    │  Instance 2    │    │  Instance 3    │
    │  (AZ-1a)       │    │  (AZ-1b)       │    │  (AZ-1c)       │
    └───────┬────────┘    └───────┬────────┘    └───────┬────────┘
            │                     │                     │
    ┌───────┴─────────────────────┴─────────────────────┴────────┐
    │  API GATEWAY CLUSTER                                       │
    │                                                            │
    │  Responsibilities:                                         │
    │    ✓ Authentication (JWT, API keys)                        │
    │    ✓ Authorization                                         │
    │    ✓ Rate limiting per user/key                            │
    │    ✓ Request routing to services                           │
    │    ✓ Request/response transformation                       │
    │    ✓ API versioning                                        │
    │    ✓ Observability (metrics, traces)                       │
    └─────────────────────────────┬──────────────────────────────┘
                                  │
            ┌─────────────────────┼─────────────────────┐
            ▼                     ▼                     ▼
    ┌────────────────┐    ┌────────────────┐    ┌────────────────┐
    │  Product       │    │  User          │    │  Order         │
    │  Service       │    │  Service       │    │  Service       │
    └────────────────┘    └────────────────┘    └────────────────┘

Why this pattern works:
Separation of Concerns: The load balancer handles connection-level distribution; the gateway handles API-level logic
Scalability: Gateway instances scale horizontally; the load balancer distributes evenly
High Availability: Load balancer health checks ensure traffic only goes to healthy gateways
SSL Offloading: TLS termination can happen at the load balancer, reducing gateway CPU load
Cloud Integration: Cloud load balancers integrate with auto-scaling, monitoring, and WAF services
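The scalability and high-availability points above come down to one behavior: the load balancer rotates across gateway instances and skips any that fail health checks. A minimal sketch of that behavior follows. The class and method names are hypothetical, and health state is set by hand here, whereas a real load balancer would probe each instance on a schedule.

```typescript
interface GatewayInstance {
  id: string;
  healthy: boolean;
}

class RoundRobinBalancer {
  private cursor = 0;
  private readonly instances: GatewayInstance[];

  constructor(instances: GatewayInstance[]) {
    this.instances = instances;
  }

  // Records the result of a (hypothetical) health probe for one instance.
  markHealth(id: string, healthy: boolean): void {
    for (const instance of this.instances) {
      if (instance.id === id) {
        instance.healthy = healthy;
      }
    }
  }

  // Returns the next healthy instance in rotation, or undefined if none is healthy.
  next(): GatewayInstance | undefined {
    const count = this.instances.length;
    for (let i = 0; i < count; i++) {
      const candidate = this.instances[(this.cursor + i) % count];
      if (candidate !== undefined && candidate.healthy) {
        this.cursor = (this.cursor + i + 1) % count;
        return candidate;
      }
    }
    return undefined;
  }
}
```

Note that the balancer never inspects tokens, quotas, or request bodies. It only needs to know which gateway instances are alive, which is why it can stay simple and fast.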
Some gateways (Kong, Envoy) include built-in load balancing for backend services:
┌─────────────────────────────────────┐
│ API GATEWAY │
│ │
Request ───────────────►│ Auth ──► Rate Limit ──► Transform │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Built-in │ │
│ │Load Balancer│ │
│ └──────┬──────┘ │
└────────────────┼────────────────────┘
│
┌─────────────────────┼─────────────────────┐
│ │ │
▼ ▼ ▼
Instance 1 Instance 2 Instance 3
This simplifies deployments when:
For east-west (service-to-service) traffic, many organizations replace internal load balancers with a service mesh:
North-South Traffic
│
▼
┌───────────────────────┐
│ API Gateway │
│ (External Traffic) │
└───────────┬───────────┘
│
┌───────────────────────┼───────────────────────┐
│ │ │
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Service A │◄─────►│ Service B │◄─────►│ Service C │
│ + Sidecar │ │ + Sidecar │ │ + Sidecar │
└───────────────┘ └───────────────┘ └───────────────┘
▲ ▲ ▲
│ │ │
└───────────────────────┴───────────────────────┘
East-West Traffic
(Managed by Service Mesh Sidecars)
In this pattern:
Pattern 1 (LB + Gateway) suits most organizations. Pattern 2 (gateway-integrated) works for simpler deployments. Pattern 3 (mesh) is best for large microservice deployments where service-to-service concerns (mTLS, observability) are complex.
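The request path inside a gateway (Auth, then Rate Limit, then Transform, as drawn for Pattern 2) is essentially a chain of stages in which any stage may reject the request. The sketch below models that chain. The `Stage` type, the stage names, and the `x-api-key` and `x-request-id` headers are illustrative assumptions, not a specific gateway's plugin API.

```typescript
interface GatewayRequest {
  path: string;
  headers: Record<string, string>;
}

// A stage either passes a (possibly modified) request on, or rejects it.
type StageResult =
  | { ok: true; request: GatewayRequest }
  | { ok: false; status: number; reason: string };

type Stage = (request: GatewayRequest) => StageResult;

// Runs stages in order and stops at the first rejection.
function runPipeline(request: GatewayRequest, stages: Stage[]): StageResult {
  let current = request;
  for (const stage of stages) {
    const result = stage(current);
    if (!result.ok) {
      return result;
    }
    current = result.request;
  }
  return { ok: true, request: current };
}

// Hypothetical authentication stage: checks an API key header against a known set.
const requireApiKey = (validKeys: Set<string>): Stage => (req) => {
  const key = req.headers["x-api-key"];
  return key !== undefined && validKeys.has(key)
    ? { ok: true, request: req }
    : { ok: false, status: 401, reason: "missing or invalid API key" };
};

// Hypothetical transform stage: attaches a request ID for tracing.
const addRequestId = (id: string): Stage => (req) => ({
  ok: true,
  request: { ...req, headers: { ...req.headers, "x-request-id": id } },
});
```

Only requests that survive every stage reach the final routing step, whether that step is the gateway's built-in balancer or a separate load balancer behind it.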
Let's address frequent misconceptions about gateways and load balancers:
| Need | Use Load Balancer | Use API Gateway |
|---|---|---|
| Distribute TCP connections | ✓ | |
| Route internal microservice HTTP | ✓ (or mesh) | |
| SSL termination only | ✓ | |
| Validate API keys | | ✓ |
| Enforce per-user rate limits | | ✓ |
| Protocol translation (REST→gRPC) | | ✓ |
| API versioning | | ✓ |
| Developer portal | | ✓ |
| Non-HTTP protocols | ✓ | |
| External API security | | ✓ |
| High-throughput static content | ✓ (with CDN) | |
| GraphQL federation | | ✓ |
Modern products blur the lines: NGINX Plus has API gateway features; Kong includes load balancing; AWS ALB has some API-like routing. Understand the core competencies rather than product branding. Choose based on what the product does well, not what it claims to be.
Use this framework to decide what you need for a given use case:
    Is traffic external (clients, partners)?
    │
    ├── Yes ──► You need an API GATEWAY
    │           + Authentication
    │           + Rate Limiting
    │           + API Management
    │           │
    │           ▼
    │           Do you have multiple gateway instances?
    │           │
    │           ├── Yes ──► Put a LOAD BALANCER IN FRONT of Gateway
    │           │           Distribute across gateway instances
    │           │
    │           └── No ───► Gateway built-in load balancing might suffice
    │
    └── No ───► Is it HTTP traffic?
                │
                ├── Yes ──► Use L7 LOAD BALANCER or SERVICE MESH
                │           HTTP routing, health checks
                │
                └── No ───► Use L4 LOAD BALANCER
                            TCP/UDP distribution
                            Connection balancing
                            Max throughput

For Startups / Small Teams:
For Mid-Size Organizations:
For Large Enterprises:
If you're unsure, use both: a cloud load balancer (AWS ALB, GCP Load Balancer) in front of an API gateway cluster. This pattern covers nearly all requirements and scales with your growth. You can always simplify or extend this baseline later.
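The decision framework in this section reduces to three yes/no questions, so it can be written down as a small function. This is only a restatement of the framework as code. The type names and the recommendation strings are made up for the example, and real decisions will weigh more factors than these three flags.

```typescript
interface TrafficProfile {
  external: boolean;                  // clients, partners, third parties
  http: boolean;                      // only consulted for internal traffic
  multipleGatewayInstances: boolean;  // only consulted for external traffic
}

type Recommendation =
  | "api-gateway-behind-load-balancer"
  | "api-gateway-with-built-in-balancing"
  | "l7-load-balancer-or-service-mesh"
  | "l4-load-balancer";

function recommend(profile: TrafficProfile): Recommendation {
  if (profile.external) {
    // External traffic needs authentication, rate limiting, and API management.
    return profile.multipleGatewayInstances
      ? "api-gateway-behind-load-balancer"
      : "api-gateway-with-built-in-balancing";
  }
  // Internal traffic: the protocol decides the layer.
  return profile.http ? "l7-load-balancer-or-service-mesh" : "l4-load-balancer";
}
```

For example, a public REST API served by three gateway instances maps to `"api-gateway-behind-load-balancer"`, which is the "use both" baseline recommended above.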
Understanding the distinction between API Gateways and Load Balancers is essential for correct infrastructure design. Let's consolidate the key insights:
Module Complete:
You've now completed Module 1: What Is an API Gateway? You understand:
Next, we'll explore Routing and Traffic Management—the sophisticated techniques gateways use to direct traffic to the right backends with path-based routing, header-based decisions, traffic splitting, and request transformation.
Congratulations! You've completed the foundational module on API Gateways. You now understand what an API Gateway is, its responsibilities, where it fits in your architecture, and how it differs from load balancers. This knowledge forms the essential foundation for the advanced gateway topics covered in subsequent modules.