What Is an API Gateway?

Level: Intermediate

Duration: 60 mins

Topic: What Is an API Gateway?

Gateway vs Load Balancer

The Confusion Between Gateways and Load Balancers

One of the most common sources of confusion in infrastructure design is the relationship between API Gateways and Load Balancers. Both sit in front of your services, both route traffic, and both can perform health checks. So what's the difference?

The confusion is understandable—many gateway products include load balancing features, and many load balancers have grown to include API-like capabilities. But they solve fundamentally different problems, operate at different layers, and have different core competencies.

This page will dissect the differences with precision, helping you understand when to use each, how they complement each other, and common architectural patterns that combine both effectively.

What You Will Learn

By the end of this page, you will understand the fundamental differences between API Gateways and Load Balancers, their respective strengths, how they work together in production architectures, and how to make informed decisions about when to use each.

Fundamental Differences

At their core, API Gateways and Load Balancers solve different problems:

Load Balancer: How do I distribute traffic efficiently across server instances?

API Gateway: How do I manage and secure API traffic at the application layer?

This distinction manifests in their design, capabilities, and position in the stack.

API Gateway vs. Load Balancer Comparison
Dimension	Load Balancer	API Gateway
OSI Layer	Layer 4 (TCP/UDP) or Layer 7 (HTTP)	Layer 7 (HTTP/API) exclusively
Primary Purpose	Distribute load across instances	Manage API traffic and enforce policies
Traffic Understanding	Connections, packets, or basic HTTP	Deep API semantics: paths, methods, headers, bodies, auth
State Awareness	Typically stateless (connection level)	API-aware: sessions, tokens, rate limits, quotas
Authentication	None or basic (IP, certificate)	Full: JWT, OAuth2, API keys, mTLS
Rate Limiting	Basic (connections/sec)	Sophisticated (per-user, per-endpoint, token bucket)
Request Transformation	None or minimal	Full: header manipulation, body transformation, protocol translation
Response Handling	Pass-through	Transformation, caching, error normalization
Observability	Connection metrics, health checks	Full API metrics: latency by endpoint, error rates, business metrics
Typical Products	HAProxy, NGINX (LB mode), AWS ELB/ALB	Kong, AWS API Gateway, Apigee, NGINX (API mode)

Layer 4 vs. Layer 7: The OSI Model Perspective

Understanding the OSI layer distinction is key:

Layer 4 (Transport Layer) Load Balancers:

Operate at TCP/UDP level
See only IPs, ports, and connection metadata
Cannot inspect HTTP content
Ultra-fast: no protocol parsing overhead
Use cases: Database load balancing, non-HTTP protocols, maximum performance

Layer 7 (Application Layer) Load Balancers:

Operate at HTTP level
Can inspect headers, paths, cookies
Perform basic content-aware routing
Slightly more latency than L4
Use cases: HTTP traffic distribution, SSL termination, sticky sessions

API Gateways:

Operate exclusively at Layer 7, with deep API semantics
Understand not just HTTP, but API patterns (REST, GraphQL, gRPC)
Parse and validate request bodies
Manage API lifecycle: versioning, deprecation, documentation
Use cases: API management, security, developer experience

Incoming Request:
───────────────────────────────────────────────────────────────────────
POST /api/v2/orders HTTP/1.1
Host: api.example.com
Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...
Content-Type: application/json
X-Correlation-ID: abc-123
 
{
  "customerId": "cust_12345",
  "items": [
    { "productId": "prod_001", "quantity": 2 }
  ],
  "paymentMethod": "card_ending_4242"
}
───────────────────────────────────────────────────────────────────────
 
 
Layer 4 Load Balancer sees:
───────────────────────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────────────────────┐
│  Source IP: 203.0.113.45:52841                                      │
│  Dest IP: 10.0.1.50:443                                             │
│  Protocol: TCP                                                       │
│  Bytes: 847                                                          │
│  Connection: New                                                     │
└─────────────────────────────────────────────────────────────────────┘
Decision: Route to server 10.0.1.100:8080 (least connections)
───────────────────────────────────────────────────────────────────────
 
 
Layer 7 Load Balancer sees:
───────────────────────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────────────────────┐
│  + Everything L4 sees, plus:                                        │
│  Method: POST                                                        │
│  Path: /api/v2/orders                                                │
│  Host: api.example.com                                               │
│  Headers: Authorization, Content-Type, X-Correlation-ID              │
│  Content-Length: 147                                                 │
└─────────────────────────────────────────────────────────────────────┘
Decision: Route to order-service backend pool (path matching /api/v2/orders*)
───────────────────────────────────────────────────────────────────────
 
 
API Gateway sees:
───────────────────────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────────────────────┐
│  + Everything L7 sees, plus:                                        │
│  API Version: v2                                                     │
│  API Operation: CreateOrder                                          │
│  Auth Token: Valid JWT, expires in 45 minutes                        │
│  User ID: user_789 (from token)                                      │
│  User Roles: [customer, premium-tier]                                │
│  Tenant ID: tenant_abc                                               │
│  Rate Limit Status: 47/100 requests remaining this minute            │
│  Request Body Parsed: Valid JSON, matches CreateOrderRequest schema  │
│  Customer ID: cust_12345                                             │
│  Product IDs: [prod_001]                                             │
└─────────────────────────────────────────────────────────────────────┘
Decision: 
  1. ✓ Authentication valid
  2. ✓ Authorization: premium-tier can access CreateOrder
  3. ✓ Rate limit: under quota
  4. ✓ Schema validation: request body valid
  5. Route to order-service-v2 (internal endpoint)
  6. Add headers: X-User-ID, X-Tenant-ID, X-Request-ID (for tracing)
  7. Record metrics: CreateOrder request, latency, user tier
───────────────────────────────────────────────────────────────────────

Depth of Understanding

The key insight: a load balancer sees traffic as connections or HTTP requests. An API Gateway sees traffic as API operations with full semantic understanding: who is calling, what are they authorized to do, is this request valid, and how does it fit into the broader API contract?
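To make that concrete, the gateway's decision list for the CreateOrder request can be sketched as a short pipeline. This is a minimal illustration under stated assumptions, not how any particular gateway product works: the `decide` function, the `knownTokens` table (a stand-in for real JWT signature verification), the `premium-tier` policy, and the simplified body check are all hypothetical.

```typescript
// Hypothetical request shape; header names are assumed to be lower-cased already.
interface ApiRequest {
  method: string;
  path: string;
  headers: Record<string, string>;
  body: unknown;
}

// Claims the gateway would extract from a verified token.
interface Identity {
  userId: string;
  tenantId: string;
  roles: string[];
}

type Decision =
  | { allowed: true; upstream: string; addedHeaders: Record<string, string> }
  | { allowed: false; status: number; reason: string };

// Stand-in for JWT verification: a fixed lookup table instead of signature checks.
const knownTokens: Record<string, Identity> = {
  "demo-token": { userId: "user_789", tenantId: "tenant_abc", roles: ["customer", "premium-tier"] },
};

function authenticate(req: ApiRequest): Identity | undefined {
  const header = req.headers["authorization"] ?? "";
  const token = header.startsWith("Bearer ") ? header.slice("Bearer ".length) : "";
  return knownTokens[token];
}

// Simplified schema check for the CreateOrder body shown in the example request.
function isValidCreateOrder(body: unknown): boolean {
  if (typeof body !== "object" || body === null) return false;
  const fields = body as Record<string, unknown>;
  const items = fields["items"];
  return typeof fields["customerId"] === "string" && Array.isArray(items) && items.length > 0;
}

// Runs the checks in the same order as the decision list: authn, authz, quota, schema, route.
function decide(req: ApiRequest, remainingQuota: number): Decision {
  const identity = authenticate(req);
  if (identity === undefined) return { allowed: false, status: 401, reason: "invalid or missing token" };
  if (!identity.roles.includes("premium-tier")) return { allowed: false, status: 403, reason: "role not permitted" };
  if (remainingQuota <= 0) return { allowed: false, status: 429, reason: "rate limit exceeded" };
  if (!isValidCreateOrder(req.body)) return { allowed: false, status: 400, reason: "body fails schema check" };
  return {
    allowed: true,
    upstream: "order-service-v2",
    addedHeaders: { "x-user-id": identity.userId, "x-tenant-id": identity.tenantId },
  };
}
```

None of these checks are available to a load balancer, because each one depends on information (identity, quota, body shape) that only exists once the request is interpreted as an API operation.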

When Load Balancers Excel

Load balancers are purpose-built for traffic distribution and excel in specific scenarios:

1. High-Throughput Traffic Distribution

Layer 4 load balancers can sustain very high connection and packet rates with minimal latency overhead, because they forward traffic without parsing the application protocol. When raw throughput is the priority:

Distributing database connections (in front of MySQL or PostgreSQL replicas or proxies)
Gaming servers with thousands of simultaneous connections
Real-time streaming applications
Any non-HTTP protocol (MQTT, custom TCP protocols), plus HTTP/2-based protocols such as gRPC when opaque TCP passthrough is enough

Performance Comparison (typical values):
───────────────────────────────────────────────────────────────────────
Component          │ Throughput (RPS) │ Added Latency │ Connections/sec
───────────────────────────────────────────────────────────────────────
L4 Load Balancer   │ 1,000,000+       │ < 50 μs       │ 100,000+
L7 Load Balancer   │ 100,000-500,000  │ 100-500 μs    │ 50,000+
API Gateway        │ 10,000-100,000   │ 1-10 ms       │ 10,000+
───────────────────────────────────────────────────────────────────────
 
Note: Values are illustrative. Actual performance depends heavily on 
hardware, configuration, and workload characteristics.
 
The performance difference reflects the work each layer does:
- L4: Minimal processing, just routes packets
- L7: Parses HTTP, inspects headers
- Gateway: Full request processing, authentication, rate limiting
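The difference in work per request can be seen in how each layer picks a backend. The sketch below is illustrative only: `routeL4` hashes the connection tuple (one common strategy; least-connections is another), while `routeL7` needs the parsed HTTP path before it can decide. All names, addresses, and rules are assumptions for the example.

```typescript
interface Connection {
  srcIp: string;
  srcPort: number;
  dstPort: number;
}

interface HttpRequest {
  method: string;
  path: string;
  host: string;
}

interface PathRule {
  prefix: string;
  pool: string;
}

// Simple non-cryptographic string hash, kept in unsigned 32-bit range.
function hashString(s: string): number {
  let h = 0;
  for (let i = 0; i < s.length; i++) {
    h = (h * 31 + s.charCodeAt(i)) >>> 0;
  }
  return h;
}

// L4: only addresses and ports are visible, so the choice is made per connection.
function routeL4(conn: Connection, backends: string[]): string {
  const key = `${conn.srcIp}:${conn.srcPort}->${conn.dstPort}`;
  const chosen = backends[hashString(key) % backends.length];
  if (chosen === undefined) throw new Error("no backends configured");
  return chosen;
}

// L7: the HTTP request is parsed first, so routing can depend on the path.
// The longest matching prefix wins.
function routeL7(req: HttpRequest, rules: PathRule[], defaultPool: string): string {
  let best: PathRule | undefined;
  for (const rule of rules) {
    const matches = req.path.startsWith(rule.prefix);
    if (matches && (best === undefined || rule.prefix.length > best.prefix.length)) {
      best = rule;
    }
  }
  return best !== undefined ? best.pool : defaultPool;
}
```

Because `routeL4` never looks past the connection tuple, every request on the same connection lands on the same backend, which is why it is cheap and also why it cannot route by endpoint.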

2. Internal Service Load Balancing

For service-to-service communication within a trusted zone, full API gateway capabilities are often unnecessary. A simple L7 load balancer or service mesh provides:

Health checking and automatic failover
Round-robin or least-connections distribution
Circuit breaking at the connection level
mTLS termination
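As a rough sketch of the first two items, least-connections selection combined with health checking might look like the following. The `Backend` shape, the `HealthTracker` class, and the consecutive-failure threshold are assumptions for illustration; real load balancers and meshes expose this through configuration rather than code.

```typescript
interface Backend {
  address: string;
  healthy: boolean;
  activeConnections: number;
}

// Least-connections: among healthy backends, pick the one with the fewest open connections.
function pickLeastConnections(backends: Backend[]): Backend | undefined {
  let best: Backend | undefined;
  for (const candidate of backends) {
    if (!candidate.healthy) continue;
    if (best === undefined || candidate.activeConnections < best.activeConnections) {
      best = candidate;
    }
  }
  return best;
}

// Marks a backend unhealthy after `threshold` consecutive failed probes,
// and healthy again as soon as one probe succeeds.
class HealthTracker {
  private readonly failures = new Map<string, number>();
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  record(backend: Backend, probeSucceeded: boolean): void {
    const count = probeSucceeded ? 0 : (this.failures.get(backend.address) ?? 0) + 1;
    this.failures.set(backend.address, count);
    backend.healthy = count < this.threshold;
  }
}
```

Automatic failover then falls out of the selection step: an unhealthy backend is simply never returned until a probe succeeds again.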

3. Stateful Protocol Handling

Load balancers handle stateful connections effectively:

WebSocket connection distribution
Long-lived HTTP/2 connections (multiplexed streams)
TCP keep-alive connections to databases
MQTT for IoT devices

4. Geographic Distribution

Global load balancers (AWS Global Accelerator, Cloudflare) route users to the nearest healthy endpoint:

Anycast networking for automatic geo-routing
Health checks across regions
Failover between regions on outage
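Anycast routing itself happens in the network, not in application code, but the selection rule it approximates ("nearest healthy endpoint, fail over on outage") is easy to sketch. The region names and latency figures below are made up for the example.

```typescript
interface Region {
  name: string;
  latencyMs: number; // assumed to come from measurements or routing metadata
  healthy: boolean;  // assumed to come from cross-region health checks
}

// Returns the lowest-latency healthy region, or undefined if every region is down.
function pickRegion(regions: Region[]): Region | undefined {
  return regions
    .filter((region) => region.healthy) // filter() copies, so sort() does not mutate the input
    .sort((x, y) => x.latencyMs - y.latencyMs)[0];
}
```

When the nearest region fails its health check, the next call returns the second-nearest region, which is the failover behavior described in the list above.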

Load Balancer Sweet Spots
Scenario	Why Load Balancer Excels	Example
Database Proxying	Connection-level distribution of MySQL/PostgreSQL traffic	HAProxy in TCP mode, often alongside poolers such as PgBouncer or ProxySQL
Internal Microservices	Simple, fast routing without API overhead	Kubernetes Service load balancing
Non-HTTP Protocols	Protocol-agnostic at L4	MQTT, custom TCP, gRPC passed through as opaque HTTP/2
Extreme Scale	Millions of RPS with microsecond latency	CDN origin load balancing
Long-Lived Connections	Connection affinity and health checking	WebSocket game servers

Load Balancers as Gateway Infrastructure

In production, load balancers often sit in front of API Gateways. The load balancer handles connection distribution across gateway instances; the gateway handles API logic. The load balancer is infrastructure for the gateway, not a replacement for it.

When API Gateways Excel

API Gateways shine when you need to manage APIs as first-class entities—not just route traffic, but govern it.

1. External API Exposure

When exposing APIs to external consumers (mobile apps, third parties, partners), you need:

Authentication: Validate JWTs, API keys, OAuth tokens
Authorization: Enforce who can access what
Rate Limiting: Protect from abuse, enforce quotas
Versioning: Manage multiple API versions simultaneously
Documentation: Serve OpenAPI specs, developer portals
Analytics: Track usage by consumer, endpoint, error rates

Gateway Security Capabilities

•JWT Validation — Cryptographically verify tokens, check expiration, extract claims
•API Key Management — Issue, rotate, revoke keys; track usage per key
•OAuth 2.0 Flows — Authorization code, client credentials, token exchange
•Request Validation — Schema validation, parameter sanitization, injection prevention
•IP Allowlisting/Blocklisting — Restrict access by source IP range
•Mutual TLS (mTLS) — Certificate-based client authentication
•Fine-Grained Permissions — Scope-based access to specific operations
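Per-consumer rate limiting is one of the capabilities that most clearly separates a gateway from a load balancer. A minimal token-bucket sketch, keyed by API key or user ID, is shown below. The class name, parameters, and in-memory `Map` are assumptions for illustration; a clustered gateway would normally keep this state in a shared store.

```typescript
interface Bucket {
  tokens: number;
  lastRefillMs: number;
}

class TokenBucketLimiter {
  private readonly buckets = new Map<string, Bucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // `nowMs` is passed in explicitly so the behavior is deterministic and easy to test.
  tryConsume(key: string, nowMs: number): boolean {
    // A new consumer starts with a full bucket.
    const bucket = this.buckets.get(key) ?? { tokens: this.capacity, lastRefillMs: nowMs };

    // Refill in proportion to elapsed time, never exceeding capacity.
    const elapsedSeconds = Math.max(0, nowMs - bucket.lastRefillMs) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.lastRefillMs = nowMs;
    this.buckets.set(key, bucket);

    if (bucket.tokens < 1) return false; // over quota: the gateway would answer 429
    bucket.tokens -= 1;
    return true;
  }
}
```

With `new TokenBucketLimiter(2, 1)`, a consumer can burst two requests immediately and then sustain one request per second, while other consumers keep their own independent buckets.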

2. API Lifecycle Management

APIs have a lifecycle: they're designed, deployed, versioned, deprecated, and retired. Gateways support this lifecycle:

API Lifecycle
─────────────────────────────────────────────────────────────────

  Design          Deploy          Operate          Retire
    │                │                │                │
    │  ┌─────────────┴────────────────┴────────────────┴─────────┐
    │  │                    API GATEWAY                          │
    │  │                                                         │
    ▼  │  Version        Multiple versions    Deprecation       │
 Schema │  Registration    simultaneously        warnings        │
 Def.   │                                                        │
        │  Documentation   Traffic           Usage             │
        │  Publishing      splitting         analytics          │
        │                                                        │
        │  Consumer        Canary            Sunset             │
        │  Onboarding      deployments       notifications      │
        │                                                        │
        └────────────────────────────────────────────────────────┘
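The "multiple versions simultaneously" and "sunset notifications" parts of the lifecycle can be sketched as a version table consulted on every request. This is a hypothetical example: the `versions` table, upstream names, and dates are invented, and the `Sunset` response header follows RFC 8594 as one possible way to signal retirement.

```typescript
type VersionStatus = "active" | "deprecated" | "retired";

interface VersionEntry {
  upstream: string;
  status: VersionStatus;
  sunset?: string; // HTTP-date at which a deprecated version will be retired
}

interface RouteResult {
  status: number;
  upstream?: string;
  headers: Record<string, string>;
}

// Hypothetical lifecycle state for three versions of the same API.
const versions: Record<string, VersionEntry> = {
  v1: { upstream: "order-service-v1", status: "retired" },
  v2: { upstream: "order-service-v2", status: "deprecated", sunset: "Wed, 31 Dec 2025 23:59:59 GMT" },
  v3: { upstream: "order-service-v3", status: "active" },
};

function routeByVersion(path: string): RouteResult {
  // Extract the version segment from paths shaped like /api/v2/orders.
  const version = /^\/api\/(v\d+)\//.exec(path)?.[1];
  const entry: VersionEntry | undefined = version !== undefined ? versions[version] : undefined;

  if (entry === undefined) return { status: 404, headers: {} };
  if (entry.status === "retired") return { status: 410, headers: {} }; // Gone

  const headers: Record<string, string> = {};
  if (entry.status === "deprecated" && entry.sunset !== undefined) {
    headers["Sunset"] = entry.sunset; // warn consumers while still serving them
  }
  return { status: 200, upstream: entry.upstream, headers };
}
```

Deprecated versions keep working but announce their end date on every response, which gives consumers a machine-readable migration deadline.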

3. Protocol Translation

Gateways bridge different protocols and data formats:

REST ↔ gRPC: External REST clients, internal gRPC services
JSON ↔ XML: Modern clients, legacy backends
GraphQL Federation: Compose GraphQL from multiple sources
SOAP to REST: Modernize legacy integrations

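// Gateway configuration: REST to gRPC translation

// Minimal supporting types so the configuration type-checks on its own.
// These are illustrative definitions, not a real gateway's schema types.
type JSONSchema = {
  type: string;
  required?: string[];
  properties?: Record<string, JSONSchema>;
  format?: string;
  minLength?: number;
};

interface FieldMapping {
  from: string;       // dotted source path, e.g. 'path.userId'
  to: string;         // dotted target path, e.g. 'request.user_id'
  transform?: string; // optional named transform, e.g. 'timestampToISO'
}

// One route: a public REST endpoint mapped onto an internal gRPC method.
interface ProtocolTranslationRoute {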
  publicApi: {
    method: 'GET' | 'POST' | 'PUT' | 'DELETE';
    path: string;
    requestSchema?: JSONSchema;
  };
  backendGrpc: {
    service: string;
    method: string;
    requestMapping: FieldMapping[];
    responseMapping: FieldMapping[];
  };
}
 
const userServiceRoutes: ProtocolTranslationRoute[] = [
  {
    // External REST API
    publicApi: {
      method: 'GET',
      path: '/api/v1/users/{userId}',
    },
    // Internal gRPC service
    backendGrpc: {
      service: 'user-service.grpc.internal:50051',
      method: 'UserService.GetUser',
      requestMapping: [
        { from: 'path.userId', to: 'request.user_id' },
      ],
      responseMapping: [
        { from: 'response.user.id', to: 'id' },
        { from: 'response.user.email', to: 'email' },
        { from: 'response.user.created_at', to: 'createdAt', transform: 'timestampToISO' },
        // Exclude internal fields from external response
        // 'response.user.internal_notes' is not mapped
      ],
    },
  },
  {
    publicApi: {
      method: 'POST',
      path: '/api/v1/users',
      requestSchema: {
        type: 'object',
        required: ['email', 'name'],
        properties: {
          email: { type: 'string', format: 'email' },
          name: { type: 'string', minLength: 1 },
        },
      },
    },
    backendGrpc: {
      service: 'user-service.grpc.internal:50051',
      method: 'UserService.CreateUser',
      requestMapping: [
        { from: 'body.email', to: 'request.email' },
        { from: 'body.name', to: 'request.name' },
        { from: 'identity.tenantId', to: 'request.tenant_id' }, // Inject from auth
      ],
      responseMapping: [
        { from: 'response.user', to: 'user' },
        { from: 'response.created', to: 'created' },
      ],
    },
  },
];
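The configuration above only declares the mappings. A hedged sketch of how a gateway might apply a `responseMapping` at runtime follows. The `applyMappings` and `getPath` helpers, the `transforms` registry, and the assumption that a gRPC timestamp arrives as `{ seconds: number }` are all hypothetical, and the sketch handles flat target keys only.

```typescript
interface FieldMapping {
  from: string;
  to: string;
  transform?: string;
}

// Hypothetical transform registry, looked up by the name used in the mapping.
const transforms: Record<string, (value: unknown) => unknown> = {
  // Assumes the timestamp is an object with a `seconds` field (epoch seconds).
  timestampToISO: (value) => new Date((value as { seconds: number }).seconds * 1000).toISOString(),
};

// Reads a dotted path such as "response.user.id"; returns undefined if any step is missing.
function getPath(source: unknown, path: string): unknown {
  let current: unknown = source;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Builds the external payload from mapped fields only; unmapped fields never leave the gateway.
function applyMappings(source: Record<string, unknown>, mappings: FieldMapping[]): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const mapping of mappings) {
    let value = getPath(source, mapping.from);
    const transform = mapping.transform !== undefined ? transforms[mapping.transform] : undefined;
    if (transform !== undefined && value !== undefined) value = transform(value);
    out[mapping.to] = value;
  }
  return out;
}
```

Because the output is built from an allowlist of mappings, internal fields such as `internal_notes` are excluded by default rather than by a separate filtering step.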

4. Developer Experience

For APIs consumed by developers (internal or external), the gateway enables:

Developer Portal: Self-service key generation, documentation
Sandbox Environments: Safe testing without production access
Usage Dashboards: Visibility into API consumption
Quota Management: Request quota allocation and tracking
Changelog Communication: Notify consumers of API changes

API as Product

When your API is a product—something external developers use—the gateway is your storefront. It handles the developer experience: signup, keys, documentation, usage limits, and support. No load balancer provides this.

How They Work Together

In production architectures, load balancers and API gateways are complementary, not competing. Understanding their typical arrangements clarifies their relationship.

Pattern 1: Load Balancer in Front of Gateway

The most common production pattern:

┌─────────────────────────────────────────────────────────────────────────────────┐
│                                   INTERNET                                       │
│                                      │                                           │
│                                      ▼                                           │
├─────────────────────────────────────────────────────────────────────────────────┤
│  ┌─────────────────────────────────────────────────────────────────────────────┐│
│  │                       CLOUD LOAD BALANCER (L4/L7)                           ││
│  │                         (AWS ALB, GCP LB, Azure LB)                         ││
│  │                                                                             ││
│  │  Responsibilities:                                                          ││
│  │  ✓ TLS termination (SSL offloading)                                         ││
│  │  ✓ Connection distribution across gateway instances                         ││
│  │  ✓ Health checks on gateway                                                 ││
│  │  ✓ Geographic routing (if global)                                           ││
│  │  ✓ DDoS protection (L4 attacks)                                             ││
│  │                                                                             ││
│  │  Does NOT do:                                                               ││
│  │  ✗ Authentication                                                           ││
│  │  ✗ Rate limiting per user                                                   ││
│  │  ✗ API versioning                                                           ││
│  │  ✗ Request transformation                                                   ││
│  └─────────────────────────────────────────────────────────────────────────────┘│
│                                      │                                           │
│            ┌─────────────────────────┼─────────────────────────┐                 │
│            ▼                         ▼                         ▼                 │
│  ┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐          │
│  │  API Gateway     │     │  API Gateway     │     │  API Gateway     │          │
│  │  Instance 1      │     │  Instance 2      │     │  Instance 3      │          │
│  │  (AZ-1a)         │     │  (AZ-1b)         │     │  (AZ-1c)         │          │
│  └──────────────────┘     └──────────────────┘     └──────────────────┘          │
│            │                         │                         │                 │
│  ┌─────────┴─────────────────────────┴─────────────────────────┴─────────┐       │
│  │                         API GATEWAY CLUSTER                            │       │
│  │                                                                        │       │
│  │  Responsibilities:                                                     │       │
│  │  ✓ Authentication (JWT, API keys)                                      │       │
│  │  ✓ Authorization                                                       │       │
│  │  ✓ Rate limiting per user/key                                          │       │
│  │  ✓ Request routing to services                                         │       │
│  │  ✓ Request/response transformation                                     │       │
│  │  ✓ API versioning                                                      │       │
│  │  ✓ Observability (metrics, traces)                                     │       │
│  └────────────────────────────────────────────────────────────────────────┘       │
│                                      │                                           │
│            ┌─────────────────────────┼─────────────────────────┐                 │
│            ▼                         ▼                         ▼                 │
│     ┌────────────┐           ┌────────────┐           ┌────────────┐             │
│     │  Product   │           │   User     │           │   Order    │             │
│     │  Service   │           │  Service   │           │  Service   │             │
│     └────────────┘           └────────────┘           └────────────┘             │
└─────────────────────────────────────────────────────────────────────────────────┘

Why this pattern works:

Separation of Concerns: The load balancer handles connection-level distribution; the gateway handles API-level logic
Scalability: Gateway instances scale horizontally; the load balancer distributes evenly
High Availability: Load balancer health checks ensure traffic only goes to healthy gateways
SSL Offloading: TLS termination can happen at the load balancer, reducing gateway CPU load
Cloud Integration: Cloud load balancers integrate with auto-scaling, monitoring, and WAF services

Pattern 2: Gateway with Internal Load Balancing

Some gateways (Kong, Envoy) include built-in load balancing for backend services:

                        ┌─────────────────────────────────────┐
                        │            API GATEWAY              │
                        │                                     │
Request ───────────────►│  Auth ──► Rate Limit ──► Transform  │
                        │                │                    │
                        │                ▼                    │
                        │         ┌─────────────┐             │
                        │         │   Built-in  │             │
                        │         │Load Balancer│             │
                        │         └──────┬──────┘             │
                        └────────────────┼────────────────────┘
                                         │
                   ┌─────────────────────┼─────────────────────┐
                   │                     │                     │
                   ▼                     ▼                     ▼
              Instance 1            Instance 2            Instance 3

This simplifies deployments when:

Backend services have known, relatively static endpoints
You want fewer infrastructure components
Your gateway product has robust load balancing features

Pattern 3: Service Mesh for Internal Traffic

For east-west (service-to-service) traffic, many organizations replace internal load balancers with a service mesh:

                             North-South Traffic
                                    │
                                    ▼
                        ┌───────────────────────┐
                        │      API Gateway      │
                        │   (External Traffic)  │
                        └───────────┬───────────┘
                                    │
            ┌───────────────────────┼───────────────────────┐
            │                       │                       │
            ▼                       ▼                       ▼
    ┌───────────────┐       ┌───────────────┐       ┌───────────────┐
    │   Service A   │◄─────►│   Service B   │◄─────►│   Service C   │
    │   + Sidecar   │       │   + Sidecar   │       │   + Sidecar   │
    └───────────────┘       └───────────────┘       └───────────────┘
            ▲                       ▲                       ▲
            │                       │                       │
            └───────────────────────┴───────────────────────┘
                              East-West Traffic
                       (Managed by Service Mesh Sidecars)

In this pattern:

API Gateway: Handles north-south (external to internal) traffic
Service Mesh: Handles east-west (internal) traffic with sidecars
No Separate Internal Load Balancers: Mesh provides load balancing, mTLS, retries

Choosing the Right Pattern

Pattern 1 (LB + Gateway) suits most organizations. Pattern 2 (gateway-integrated) works for simpler deployments. Pattern 3 (mesh) is best for large microservice deployments where service-to-service concerns (mTLS, observability) are complex.

Common Misconceptions

Let's address frequent misconceptions about gateways and load balancers:

Misconceptions to Avoid

•"They're interchangeable" — They are not. A load balancer cannot validate JWTs, enforce per-user rate limits, or manage API versions. A gateway is overkill for simple TCP load balancing.
•"I only need one or the other" — Most production systems need both: load balancers for connection distribution, gateways for API management. They're layers, not alternatives.
•"NGINX is a load balancer" — NGINX can be configured as an L7 load balancer OR as an API gateway. The product is the same; the configuration determines the role.
•"AWS ALB replaces API Gateway" — ALB handles path-based routing and TLS termination. AWS API Gateway adds authentication, rate limiting, API management, and developer portal features. Different tools.
•"Gateway is slower, avoid if possible" — Gateways add latency, but they prevent security breaches, outages from abuse, and enable API evolution. The latency is a worthwhile trade-off for API traffic.
•"API Gateway duplicates service mesh functionality" — Meshes excel at east-west traffic (service-to-service). Gateways excel at north-south traffic (external APIs). They're complementary.

Use the Right Tool for the Job
Need	Use Load Balancer	Use API Gateway
Distribute TCP connections	✓
Route internal microservice HTTP	✓ (or mesh)
SSL termination only	✓
Validate API keys		✓
Enforce per-user rate limits		✓
Protocol translation (REST→gRPC)		✓
API versioning		✓
Developer portal		✓
Non-HTTP protocols	✓
External API security		✓
High-throughput static content	✓ (with CDN)
GraphQL federation		✓

Product Overlap

Modern products blur the lines: NGINX Plus has API gateway features; Kong includes load balancing; AWS ALB has some API-like routing. Understand the core competencies rather than product branding. Choose based on what the product does well, not what it claims to be.

Decision Framework: Gateway, Load Balancer, or Both?

Use this framework to decide what you need for a given use case:

                          ┌─────────────────────────┐
                          │  Is traffic external   │
                          │  (clients, partners)?  │
                          └───────────┬────────────┘
                                      │
                    ┌─────────────────┴─────────────────┐
                    │                                   │
                   Yes                                 No
                    │                                   │
                    ▼                                   ▼
        ┌──────────────────────┐          ┌───────────────────────┐
        │ You need an          │          │ Is it HTTP traffic?   │
        │ API GATEWAY          │          └───────────┬───────────┘
        │                      │                      │
        │ + Authentication     │        ┌─────────────┴─────────────┐
        │ + Rate Limiting      │        │                           │
        │ + API Management     │       Yes                         No
        └──────────┬───────────┘        │                           │
                   │                    ▼                           ▼
                   │        ┌──────────────────────┐   ┌──────────────────────┐
                   │        │ Use L7 LOAD BALANCER │   │ Use L4 LOAD BALANCER │
                   │        │ or SERVICE MESH      │   │                      │
                   │        │                      │   │ TCP/UDP distribution │
                   │        │ HTTP routing,        │   │ Connection balancing │
                   │        │ health checks        │   │ Max throughput       │
                   │        └──────────────────────┘   └──────────────────────┘
                   │
                   ▼
        ┌──────────────────────┐
        │ Do you have multiple │
        │ gateway instances?   │
        └───────────┬──────────┘
                    │
        ┌───────────┴───────────┐
        │                       │
       Yes                     No
        │                       │
        ▼                       │
┌──────────────────────┐        │
│ Put a LOAD BALANCER  │        │
│ IN FRONT of Gateway  │        │
│                      │        │
│ Distribute across    │        │
│ gateway instances    │        │
└──────────────────────┘        │
                                │
                                ▼
                     ┌──────────────────────┐
                     │ Gateway built-in     │
                     │ load balancing might │
                     │ suffice              │
                     └──────────────────────┘
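The same decision tree can be written as a small function, which is sometimes handy in architecture review checklists. The `TrafficProfile` fields and recommendation labels are names invented for this sketch; the branching mirrors the flowchart above.

```typescript
interface TrafficProfile {
  external: boolean;        // does traffic come from clients or partners outside your network?
  http: boolean;            // is it HTTP traffic? (only consulted for internal traffic)
  gatewayInstances: number; // how many gateway instances will run?
}

type Recommendation =
  | "api-gateway-behind-load-balancer"
  | "api-gateway-with-built-in-balancing"
  | "l7-load-balancer-or-service-mesh"
  | "l4-load-balancer";

function recommend(profile: TrafficProfile): Recommendation {
  if (profile.external) {
    // External traffic needs a gateway; multiple instances need something to spread load across them.
    return profile.gatewayInstances > 1
      ? "api-gateway-behind-load-balancer"
      : "api-gateway-with-built-in-balancing";
  }
  // Internal traffic: choose by protocol.
  return profile.http ? "l7-load-balancer-or-service-mesh" : "l4-load-balancer";
}
```

For example, `recommend({ external: true, http: true, gatewayInstances: 3 })` returns `"api-gateway-behind-load-balancer"`, which is Pattern 1.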

Practical Recommendations

For Startups / Small Teams:

Start with a cloud-managed gateway (AWS API Gateway, Google Cloud Endpoints)
With a managed gateway, the cloud provider handles load balancing and high availability of the gateway itself
Keep it simple; add complexity when you have concrete problems

For Mid-Size Organizations:

L7 load balancer (ALB, Cloud Load Balancer) in front of gateway cluster
API Gateway (Kong, NGINX API Gateway) for API management
Consider service mesh for internal traffic if microservices are complex

For Large Enterprises:

Global load balancing (anycast, multi-region)
CDN + Edge + API Gateway tiers
Service mesh for internal traffic
Potentially domain-specific internal gateways
Full observability stack integrated across all layers

When in Doubt

If you're unsure, use both: a cloud load balancer (AWS ALB, GCP Load Balancer) in front of an API gateway cluster. This pattern covers nearly all requirements and scales with your growth. You can always simplify or extend it later.

Summary: Gateway vs. Load Balancer

Understanding the distinction between API Gateways and Load Balancers is essential for correct infrastructure design. Let's consolidate the key insights:

Key Takeaways

•Different Problems, Different Tools — Load balancers distribute connections; gateways manage APIs. They solve different problems at different layers.
•Layer 4 vs. Layer 7 vs. API-Aware — Load balancers operate at connection or HTTP level; gateways understand full API semantics (auth, rate limits, versioning).
•Load Balancers Excel at Distribution — High throughput, low latency, non-HTTP protocols, and connection-level concerns are load balancer territory.
•Gateways Excel at API Management — Authentication, authorization, rate limiting, versioning, protocol translation, and developer experience require a gateway.
•They're Complementary, Not Competing — Production architectures typically use both: load balancers distribute traffic across gateway instances.
•Choose Based on Requirements — External API traffic needs a gateway. Internal service traffic often needs just load balancing. Many systems need both.
•Avoid Misconceptions — Products overlap in features, but their core competencies differ. Don't confuse an L7 load balancer with path routing for a full API gateway.

Module Complete:

You've now completed Module 1: What Is an API Gateway? You understand:

The gateway as a single entry point and its role in distributed systems
The comprehensive responsibilities of an API Gateway
How to position gateways in your architecture
The critical distinction between gateways and load balancers

Next, we'll explore Routing and Traffic Management—the sophisticated techniques gateways use to direct traffic to the right backends with path-based routing, header-based decisions, traffic splitting, and request transformation.

Module Complete

Congratulations! You've completed the foundational module on API Gateways. You now understand what an API Gateway is, its responsibilities, where it fits in your architecture, and how it differs from load balancers. This knowledge forms the essential foundation for the advanced gateway topics covered in subsequent modules.

4 / 4

Loading learning content...

System Design (HLD)What Is an API Gateway?

What Is an API Gateway?

LevelIntermediate

Duration60 mins

TopicWhat Is an API Gateway?

4 / 4

Gateway vs Load Balancer

The Confusion Between Gateways and Load Balancers

This page will dissect the differences with precision, helping you understand when to use each, how they complement each other, and common architectural patterns that combine both effectively.

Fundamental Differences

At their core, API Gateways and Load Balancers solve different problems:

Load Balancer: How do I distribute traffic efficiently across server instances?

API Gateway: How do I manage and secure API traffic at the application layer?

This distinction manifests in their design, capabilities, and position in the stack.

API Gateway vs. Load Balancer Comparison
Dimension	Load Balancer	API Gateway
OSI Layer	Layer 4 (TCP/UDP) or Layer 7 (HTTP)	Layer 7 (HTTP/API) exclusively
Primary Purpose	Distribute load across instances	Manage API traffic and enforce policies
Traffic Understanding	Connections, packets, or basic HTTP	Deep API semantics: paths, methods, headers, bodies, auth
State Awareness	Typically stateless (connection level)	API-aware: sessions, tokens, rate limits, quotas
Authentication	None or basic (IP, certificate)	Full: JWT, OAuth2, API keys, mTLS
Rate Limiting	Basic (connections/sec)	Sophisticated (per-user, per-endpoint, token bucket)
Request Transformation	None or minimal	Full: header manipulation, body transformation, protocol translation
Response Handling	Pass-through	Transformation, caching, error normalization
Observability	Connection metrics, health checks	Full API metrics: latency by endpoint, error rates, business metrics
Typical Products	HAProxy, NGINX (LB mode), AWS ELB/ALB	Kong, AWS API Gateway, Apigee, NGINX (API mode)

Layer 4 vs. Layer 7: The OSI Model Perspective

Understanding the OSI layer distinction is key:

Layer 4 (Transport Layer) Load Balancers:

Operate at TCP/UDP level
See only IPs, ports, and connection metadata
Cannot inspect HTTP content
Ultra-fast: no protocol parsing overhead
Use cases: Database load balancing, non-HTTP protocols, maximum performance

Layer 7 (Application Layer) Load Balancers:

Operate at HTTP level
Can inspect headers, paths, cookies
Perform basic content-aware routing
Slightly more latency than L4
Use cases: HTTP traffic distribution, SSL termination, sticky sessions

API Gateways:

Operate exclusively at Layer 7, with deep API semantics
Understand not just HTTP, but API patterns (REST, GraphQL, gRPC)
Parse and validate request bodies
Manage API lifecycle: versioning, deprecation, documentation
Use cases: API management, security, developer experience
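To make the layer distinction concrete, here is a minimal sketch of how the routing decision differs between L4 and L7. The type names, the hash function, and the pool names are illustrative assumptions, not taken from any particular product.

```typescript
// Hypothetical shapes: the fields each kind of balancer can inspect.
interface L4Connection {
  srcIp: string;
  srcPort: number;
  dstIp: string;
  dstPort: number;
  protocol: "TCP" | "UDP";
}

interface L7Request extends L4Connection {
  method: string;
  path: string;
  host: string;
}

// Small deterministic string hash (djb2 variant), for illustration only.
function hashString(input: string): number {
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    hash = (hash * 33 + input.charCodeAt(i)) >>> 0; // keep within unsigned 32 bits
  }
  return hash;
}

// L4: choose a backend from the connection tuple alone; no HTTP parsing.
function pickL4Backend(conn: L4Connection, backends: string[]): string {
  if (backends.length === 0) throw new Error("no backends configured");
  const key =
    conn.srcIp + ":" + conn.srcPort + ">" + conn.dstIp + ":" + conn.dstPort + "/" + conn.protocol;
  return backends[hashString(key) % backends.length] as string;
}

// L7: choose a backend pool by the longest matching path prefix.
function pickL7Pool(req: L7Request, pools: { [prefix: string]: string }): string | undefined {
  let best: string | undefined;
  for (const prefix of Object.keys(pools)) {
    const matches = req.path.indexOf(prefix) === 0;
    if (matches && (best === undefined || prefix.length > best.length)) best = prefix;
  }
  return best === undefined ? undefined : pools[best];
}
```

The L4 function never looks past addresses and ports, which is why it stays cheap. The L7 function needs the parsed path, so it can send /api/v2/orders to a dedicated pool, at the cost of parsing HTTP first.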

Incoming Request:
───────────────────────────────────────────────────────────────────────
POST /api/v2/orders HTTP/1.1
Host: api.example.com
Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...
Content-Type: application/json
X-Correlation-ID: abc-123
 
{
  "customerId": "cust_12345",
  "items": [
    { "productId": "prod_001", "quantity": 2 }
  ],
  "paymentMethod": "card_ending_4242"
}
───────────────────────────────────────────────────────────────────────
 
 
Layer 4 Load Balancer sees:
───────────────────────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────────────────────┐
│  Source IP: 203.0.113.45:52841                                      │
│  Dest IP: 10.0.1.50:443                                             │
│  Protocol: TCP                                                       │
│  Bytes: 847                                                          │
│  Connection: New                                                     │
└─────────────────────────────────────────────────────────────────────┘
Decision: Route to server 10.0.1.100:8080 (least connections)
───────────────────────────────────────────────────────────────────────
 
 
Layer 7 Load Balancer sees:
───────────────────────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────────────────────┐
│  + Everything L4 sees, plus:                                        │
│  Method: POST                                                        │
│  Path: /api/v2/orders                                                │
│  Host: api.example.com                                               │
│  Headers: Authorization, Content-Type, X-Correlation-ID              │
│  Content-Length: 147                                                 │
└─────────────────────────────────────────────────────────────────────┘
Decision: Route to order-service backend pool (path matching /api/v2/orders*)
───────────────────────────────────────────────────────────────────────
 
 
API Gateway sees:
───────────────────────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────────────────────┐
│  + Everything L7 sees, plus:                                        │
│  API Version: v2                                                     │
│  API Operation: CreateOrder                                          │
│  Auth Token: Valid JWT, expires in 45 minutes                        │
│  User ID: user_789 (from token)                                      │
│  User Roles: [customer, premium-tier]                                │
│  Tenant ID: tenant_abc                                               │
│  Rate Limit Status: 47/100 requests remaining this minute            │
│  Request Body Parsed: Valid JSON, matches CreateOrderRequest schema  │
│  Customer ID: cust_12345                                             │
│  Product IDs: [prod_001]                                             │
└─────────────────────────────────────────────────────────────────────┘
Decision: 
  1. ✓ Authentication valid
  2. ✓ Authorization: premium-tier can access CreateOrder
  3. ✓ Rate limit: under quota
  4. ✓ Schema validation: request body valid
  5. Route to order-service-v2 (internal endpoint)
  6. Add headers: X-User-ID, X-Tenant-ID, X-Request-ID (for tracing)
  7. Record metrics: CreateOrder request, latency, user tier
───────────────────────────────────────────────────────────────────────
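The gateway's decision steps in the walkthrough above can be sketched as an ordered chain of checks. This is a simplified illustration under stated assumptions: the context fields, the policy table, and the status codes chosen for each failure are hypothetical, and token signature verification is assumed to have happened before this function runs.

```typescript
// Hypothetical request context after the gateway has parsed the request.
interface GatewayContext {
  operation: string;
  tokenExpiresAtMs: number; // expiry taken from an already verified token
  roles: string[];
  remainingQuota: number;
  bodyIsValid: boolean;     // result of schema validation
}

type Verdict =
  | { allowed: true; upstream: string }
  | { allowed: false; status: number; reason: string };

// Hypothetical policy and routing tables.
const allowedRoles: { [operation: string]: string[] } = {
  CreateOrder: ["customer", "premium-tier"],
};
const upstreamFor: { [operation: string]: string } = {
  CreateOrder: "order-service-v2",
};

function evaluate(ctx: GatewayContext, nowMs: number): Verdict {
  // 1. Authentication: reject expired tokens.
  if (ctx.tokenExpiresAtMs <= nowMs) {
    return { allowed: false, status: 401, reason: "token expired" };
  }
  // 2. Authorization: the caller needs at least one permitted role.
  const permitted = allowedRoles[ctx.operation] || [];
  const hasRole = ctx.roles.some((role) => permitted.indexOf(role) !== -1);
  if (!hasRole) return { allowed: false, status: 403, reason: "role not permitted" };
  // 3. Rate limit: the caller must have quota left.
  if (ctx.remainingQuota <= 0) return { allowed: false, status: 429, reason: "quota exhausted" };
  // 4. Schema validation.
  if (!ctx.bodyIsValid) return { allowed: false, status: 400, reason: "invalid request body" };
  // 5. Routing to the internal endpoint.
  const upstream = upstreamFor[ctx.operation];
  if (upstream === undefined) return { allowed: false, status: 404, reason: "unknown operation" };
  return { allowed: true, upstream };
}
```

The ordering matters: cheap rejections (expired token, missing role) run before anything that touches the backend, so unauthenticated traffic never consumes upstream capacity.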

When Load Balancers Excel

Load balancers are purpose-built for traffic distribution and excel in specific scenarios:

1. High-Throughput Traffic Distribution

Layer 4 load balancers can sustain millions of concurrent connections while adding only tens of microseconds of latency. When raw throughput is the priority:

Database connection pooling (MySQL, PostgreSQL proxies)
Gaming servers with thousands of simultaneous connections
Real-time streaming applications
Any non-HTTP protocol (MQTT, custom TCP protocols); gRPC always runs over HTTP/2, but an L4 balancer can still pass it through as opaque TCP

Performance Comparison (typical values):
───────────────────────────────────────────────────────────────────────
Component          │ Throughput (RPS) │ Added Latency │ Connections/sec
───────────────────────────────────────────────────────────────────────
L4 Load Balancer   │ 1,000,000+       │ < 50 μs       │ 100,000+
L7 Load Balancer   │ 100,000-500,000  │ 100-500 μs    │ 50,000+
API Gateway        │ 10,000-100,000   │ 1-10 ms       │ 10,000+
───────────────────────────────────────────────────────────────────────
 
Note: Values are illustrative. Actual performance depends heavily on 
hardware, configuration, and workload characteristics.
 
The performance difference reflects the work each layer does:
- L4: Minimal processing, just routes packets
- L7: Parses HTTP, inspects headers
- Gateway: Full request processing, authentication, rate limiting

2. Internal Service Load Balancing

For service-to-service communication within a trusted zone, full API gateway capabilities are often unnecessary. A simple L7 load balancer or service mesh provides:

Health checking and automatic failover
Round-robin or least-connections distribution
Circuit breaking at the connection level
mTLS termination
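As a rough illustration of the first two items, here is a minimal round-robin balancer that skips unhealthy backends. The class and method names are hypothetical, and a real balancer would update health from periodic probes rather than manual calls.

```typescript
interface Backend {
  address: string;
  healthy: boolean;
}

class RoundRobinBalancer {
  private backends: Backend[];
  private cursor = 0;

  constructor(backends: Backend[]) {
    this.backends = backends;
  }

  // Returns the next healthy backend, or undefined if every backend is down.
  next(): Backend | undefined {
    const count = this.backends.length;
    for (let i = 0; i < count; i++) {
      const candidate = this.backends[(this.cursor + i) % count];
      if (candidate !== undefined && candidate.healthy) {
        this.cursor = (this.cursor + i + 1) % count;
        return candidate;
      }
    }
    return undefined;
  }

  // Called by a health checker when a probe result changes a backend's state.
  markHealth(address: string, healthy: boolean): void {
    for (const backend of this.backends) {
      if (backend.address === address) backend.healthy = healthy;
    }
  }
}
```

Failover here is implicit: once a backend is marked unhealthy it is skipped, and it rejoins the rotation as soon as it is marked healthy again.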

3. Stateful Protocol Handling

Load balancers handle stateful connections effectively:

WebSocket connection distribution
Long-lived HTTP/2 connections (multiplexed streams)
TCP keep-alive connections to databases
MQTT for IoT devices
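For long-lived connections, request counts say little about load, so balancers often track open connections instead. The following is a minimal sketch of least-connections selection; the class and method names are hypothetical.

```typescript
class LeastConnectionsBalancer {
  private names: string[];
  private active: { [backend: string]: number } = {};

  constructor(backends: string[]) {
    this.names = backends.slice();
    for (const name of this.names) this.active[name] = 0;
  }

  // Assign a new connection to the backend with the fewest open connections.
  // Ties go to the backend listed first.
  acquire(): string {
    let best: string | undefined;
    for (const name of this.names) {
      if (best === undefined || (this.active[name] as number) < (this.active[best] as number)) {
        best = name;
      }
    }
    if (best === undefined) throw new Error("no backends configured");
    this.active[best] = (this.active[best] as number) + 1;
    return best;
  }

  // Call when a connection closes so the counts reflect live connections.
  release(backend: string): void {
    const current = this.active[backend];
    if (current !== undefined && current > 0) this.active[backend] = current - 1;
  }
}
```

With WebSockets or MQTT, a connection may stay open for hours, so calling release on close is what keeps the distribution fair over time.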

4. Geographic Distribution

Global load balancers (AWS Global Accelerator, Cloudflare) route users to the nearest healthy endpoint:

Anycast networking for automatic geo-routing
Health checks across regions
Failover between regions on outage

Load Balancer Sweet Spots
Scenario	Why Load Balancer Excels	Example
Database Proxying	Connection-level handling of the MySQL/PostgreSQL wire protocols	PgBouncer, ProxySQL
Internal Microservices	Simple, fast routing without API overhead	Kubernetes Service load balancing
Non-HTTP Protocols	Protocol-agnostic at L4	MQTT, custom TCP, gRPC as opaque TCP passthrough
Extreme Scale	Millions of RPS with microsecond latency	CDN origin load balancing
Long-Lived Connections	Connection affinity and health checking	WebSocket game servers

When API Gateways Excel

API Gateways shine when you need to manage APIs as first-class entities—not just route traffic, but govern it.

1. External API Exposure

When exposing APIs to external consumers (mobile apps, third parties, partners), you need:

Authentication: Validate JWTs, API keys, OAuth tokens
Authorization: Enforce who can access what
Rate Limiting: Protect from abuse, enforce quotas
Versioning: Manage multiple API versions simultaneously
Documentation: Serve OpenAPI specs, developer portals
Analytics: Track usage by consumer, endpoint, error rates

Gateway Security Capabilities

•JWT Validation — Cryptographically verify tokens, check expiration, extract claims
•API Key Management — Issue, rotate, revoke keys; track usage per key
•OAuth 2.0 Flows — Authorization code, client credentials, token exchange
•Request Validation — Schema validation, parameter sanitization, injection prevention
•IP Allowlisting/Blocklisting — Restrict access by source IP range
•Mutual TLS (mTLS) — Certificate-based client authentication
•Fine-Grained Permissions — Scope-based access to specific operations
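IP allowlisting is the simplest of these checks to show end to end. The sketch below matches an IPv4 address against CIDR ranges; it is IPv4-only by assumption, and the function names are illustrative.

```typescript
// Convert dotted-quad IPv4 text to a number in the range 0 .. 2^32 - 1.
function ipv4ToNumber(ip: string): number {
  const parts = ip.split(".");
  if (parts.length !== 4) throw new Error("invalid IPv4 address: " + ip);
  let value = 0;
  for (const part of parts) {
    const octet = Number(part);
    if (!/^\d{1,3}$/.test(part) || octet > 255) throw new Error("invalid IPv4 address: " + ip);
    value = value * 256 + octet;
  }
  return value;
}

// True when `ip` falls inside the CIDR block, e.g. "203.0.113.0/24".
function inCidr(ip: string, cidr: string): boolean {
  const slash = cidr.indexOf("/");
  if (slash === -1) throw new Error("invalid CIDR block: " + cidr);
  const bitsText = cidr.slice(slash + 1);
  const bits = Number(bitsText);
  if (!/^\d{1,2}$/.test(bitsText) || bits > 32) throw new Error("invalid CIDR block: " + cidr);
  // Addresses share a block when they agree on the top `bits` bits.
  const blockSize = Math.pow(2, 32 - bits);
  const base = ipv4ToNumber(cidr.slice(0, slash));
  return Math.floor(ipv4ToNumber(ip) / blockSize) === Math.floor(base / blockSize);
}

function isAllowed(ip: string, allowlist: string[]): boolean {
  return allowlist.some((cidr) => inCidr(ip, cidr));
}
```

Division and Math.floor are used instead of bitwise operators because JavaScript bitwise operations work on signed 32-bit values, which makes addresses above 127.255.255.255 easy to get wrong.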

2. API Lifecycle Management

APIs have a lifecycle: they're designed, deployed, versioned, deprecated, and retired. Gateways support this lifecycle:

API Lifecycle
─────────────────────────────────────────────────────────────────

  Design          Deploy          Operate          Retire
    │                │                │                │
    │  ┌─────────────┴────────────────┴────────────────┴─────────┐
    │  │                    API GATEWAY                          │
    │  │                                                         │
    ▼  │  Version        Multiple versions    Deprecation       │
 Schema │  Registration    simultaneously        warnings        │
 Def.   │                                                        │
        │  Documentation   Traffic           Usage             │
        │  Publishing      splitting         analytics          │
        │                                                        │
        │  Consumer        Canary            Sunset             │
        │  Onboarding      deployments       notifications      │
        │                                                        │
        └────────────────────────────────────────────────────────┘
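One way a gateway can support the operate and retire stages is a version registry consulted on every request. The sketch below is an assumption-laden illustration: the registry contents, the upstream names, and the choice of 410 for retired versions are hypothetical, and it uses the Sunset response header (RFC 8594) to announce a planned retirement date.

```typescript
type VersionStatus = "active" | "deprecated" | "retired";

interface ApiVersion {
  status: VersionStatus;
  upstream: string;
  sunsetAt?: Date; // planned retirement date for deprecated versions
}

interface VersionDecision {
  status: number;                      // 200 here means "forward to upstream"
  upstream?: string;                   // set only when the request is forwarded
  headers: { [name: string]: string }; // extra response headers to attach
}

// Hypothetical registry; versions, dates, and upstreams are made up.
const versionRegistry: { [version: string]: ApiVersion } = {
  v1: { status: "retired", upstream: "order-service-v1" },
  v2: { status: "deprecated", upstream: "order-service-v2", sunsetAt: new Date("2030-01-01T00:00:00Z") },
  v3: { status: "active", upstream: "order-service-v3" },
};

function resolveVersion(path: string): VersionDecision {
  // Expect paths shaped like /api/v2/orders.
  const match = /^\/api\/(v\d+)(\/|$)/.exec(path);
  const version = match ? versionRegistry[match[1] as string] : undefined;
  if (version === undefined) return { status: 404, headers: {} };
  if (version.status === "retired") return { status: 410, headers: {} };

  const headers: { [name: string]: string } = {};
  if (version.status === "deprecated" && version.sunsetAt !== undefined) {
    headers["Sunset"] = version.sunsetAt.toUTCString(); // HTTP-date format
  }
  return { status: 200, upstream: version.upstream, headers: headers };
}
```

Deprecated versions keep working while clients are warned on every response, and retired versions fail fast at the gateway without reaching any backend.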

3. Protocol Translation

Gateways bridge different protocols and data formats:

REST ↔ gRPC: External REST clients, internal gRPC services
JSON ↔ XML: Modern clients, legacy backends
GraphQL Federation: Compose GraphQL from multiple sources
SOAP to REST: Modernize legacy integrations

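// Gateway configuration: REST to gRPC translation

// Supporting types, assumed here so the configuration type-checks.
// JSONSchema is a loose placeholder, not a full JSON Schema definition.
type JSONSchema = { [keyword: string]: unknown };

interface FieldMapping {
  from: string;        // dotted path into the incoming request or gRPC response
  to: string;          // dotted path in the outgoing message
  transform?: string;  // optional named transform, e.g. 'timestampToISO'
}
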
interface ProtocolTranslationRoute {
  publicApi: {
    method: 'GET' | 'POST' | 'PUT' | 'DELETE';
    path: string;
    requestSchema?: JSONSchema;
  };
  backendGrpc: {
    service: string;
    method: string;
    requestMapping: FieldMapping[];
    responseMapping: FieldMapping[];
  };
}
 
const userServiceRoutes: ProtocolTranslationRoute[] = [
  {
    // External REST API
    publicApi: {
      method: 'GET',
      path: '/api/v1/users/{userId}',
    },
    // Internal gRPC service
    backendGrpc: {
      service: 'user-service.grpc.internal:50051',
      method: 'UserService.GetUser',
      requestMapping: [
        { from: 'path.userId', to: 'request.user_id' },
      ],
      responseMapping: [
        { from: 'response.user.id', to: 'id' },
        { from: 'response.user.email', to: 'email' },
        { from: 'response.user.created_at', to: 'createdAt', transform: 'timestampToISO' },
        // Exclude internal fields from external response
        // 'response.user.internal_notes' is not mapped
      ],
    },
  },
  {
    publicApi: {
      method: 'POST',
      path: '/api/v1/users',
      requestSchema: {
        type: 'object',
        required: ['email', 'name'],
        properties: {
          email: { type: 'string', format: 'email' },
          name: { type: 'string', minLength: 1 },
        },
      },
    },
    backendGrpc: {
      service: 'user-service.grpc.internal:50051',
      method: 'UserService.CreateUser',
      requestMapping: [
        { from: 'body.email', to: 'request.email' },
        { from: 'body.name', to: 'request.name' },
        { from: 'identity.tenantId', to: 'request.tenant_id' }, // Inject from auth
      ],
      responseMapping: [
        { from: 'response.user', to: 'user' },
        { from: 'response.created', to: 'created' },
      ],
    },
  },
];
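The configuration above only declares the mappings. A minimal sketch of how a gateway might apply them follows; the helper names are hypothetical, and the timestampToISO transform assumes the backend sends seconds since the Unix epoch, which the original configuration does not specify.

```typescript
interface FieldMapping {
  from: string;
  to: string;
  transform?: string;
}

type JsonObject = { [key: string]: unknown };

// Read a value at a dotted path such as "response.user.email".
function getPath(source: JsonObject, path: string): unknown {
  let current: unknown = source;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as JsonObject)[key];
  }
  return current;
}

// Write a value at a dotted path, creating intermediate objects as needed.
function setPath(target: JsonObject, path: string, value: unknown): void {
  const keys = path.split(".");
  let node = target;
  for (let i = 0; i < keys.length - 1; i++) {
    const key = keys[i] as string;
    const child = node[key];
    if (typeof child !== "object" || child === null) node[key] = {};
    node = node[key] as JsonObject;
  }
  node[keys[keys.length - 1] as string] = value;
}

// Named transforms. Assumption: timestamps arrive as epoch seconds.
const transforms: { [name: string]: (value: unknown) => unknown } = {
  timestampToISO: (value) => new Date(Number(value) * 1000).toISOString(),
};

// Build a new message containing only the mapped fields.
function applyMappings(source: JsonObject, mappings: FieldMapping[]): JsonObject {
  const out: JsonObject = {};
  for (const mapping of mappings) {
    let value = getPath(source, mapping.from);
    if (value === undefined) continue; // absent fields are skipped
    if (mapping.transform !== undefined) {
      const fn = transforms[mapping.transform];
      if (fn === undefined) throw new Error("unknown transform: " + mapping.transform);
      value = fn(value);
    }
    setPath(out, mapping.to, value);
  }
  return out;
}
```

Because the output starts empty and only mapped fields are copied, anything left out of the mapping, such as internal_notes, never reaches the external response.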

4. Developer Experience

For APIs consumed by developers (internal or external), the gateway enables:

Developer Portal: Self-service key generation, documentation
Sandbox Environments: Safe testing without production access
Usage Dashboards: Visibility into API consumption
Quota Management: Request quota allocation and tracking
Changelog Communication: Notify consumers of API changes

How They Work Together

In production architectures, load balancers and API gateways are complementary, not competing. Understanding their typical arrangements clarifies their relationship.

Pattern 1: Load Balancer in Front of Gateway

The most common production pattern:

┌─────────────────────────────────────────────────────────────────────────────────┐
│                                   INTERNET                                       │
│                                      │                                           │
│                                      ▼                                           │
├─────────────────────────────────────────────────────────────────────────────────┤
│  ┌─────────────────────────────────────────────────────────────────────────────┐│
│  │                       CLOUD LOAD BALANCER (L4/L7)                           ││
│  │                         (AWS ALB, GCP LB, Azure LB)                         ││
│  │                                                                             ││
│  │  Responsibilities:                                                          ││
│  │  ✓ TLS termination (SSL offloading)                                         ││
│  │  ✓ Connection distribution across gateway instances                         ││
│  │  ✓ Health checks on gateway                                                 ││
│  │  ✓ Geographic routing (if global)                                           ││
│  │  ✓ DDoS protection (L4 attacks)                                             ││
│  │                                                                             ││
│  │  Does NOT do:                                                               ││
│  │  ✗ Authentication                                                           ││
│  │  ✗ Rate limiting per user                                                   ││
│  │  ✗ API versioning                                                           ││
│  │  ✗ Request transformation                                                   ││
│  └─────────────────────────────────────────────────────────────────────────────┘│
│                                      │                                           │
│            ┌─────────────────────────┼─────────────────────────┐                 │
│            ▼                         ▼                         ▼                 │
│  ┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐          │
│  │  API Gateway     │     │  API Gateway     │     │  API Gateway     │          │
│  │  Instance 1      │     │  Instance 2      │     │  Instance 3      │          │
│  │  (AZ-1a)         │     │  (AZ-1b)         │     │  (AZ-1c)         │          │
│  └──────────────────┘     └──────────────────┘     └──────────────────┘          │
│            │                         │                         │                 │
│  ┌─────────┴─────────────────────────┴─────────────────────────┴─────────┐       │
│  │                         API GATEWAY CLUSTER                            │       │
│  │                                                                        │       │
│  │  Responsibilities:                                                     │       │
│  │  ✓ Authentication (JWT, API keys)                                      │       │
│  │  ✓ Authorization                                                       │       │
│  │  ✓ Rate limiting per user/key                                          │       │
│  │  ✓ Request routing to services                                         │       │
│  │  ✓ Request/response transformation                                     │       │
│  │  ✓ API versioning                                                      │       │
│  │  ✓ Observability (metrics, traces)                                     │       │
│  └────────────────────────────────────────────────────────────────────────┘       │
│                                      │                                           │
│            ┌─────────────────────────┼─────────────────────────┐                 │
│            ▼                         ▼                         ▼                 │
│     ┌────────────┐           ┌────────────┐           ┌────────────┐             │
│     │  Product   │           │   User     │           │   Order    │             │
│     │  Service   │           │  Service   │           │  Service   │             │
│     └────────────┘           └────────────┘           └────────────┘             │
└─────────────────────────────────────────────────────────────────────────────────┘

Why this pattern works:

Separation of Concerns: The load balancer handles connection-level distribution; the gateway handles API-level logic
Scalability: Gateway instances scale horizontally; the load balancer distributes evenly
High Availability: Load balancer health checks ensure traffic only goes to healthy gateways
SSL Offloading: TLS termination can happen at the load balancer, reducing gateway CPU load
Cloud Integration: Cloud load balancers integrate with auto-scaling, monitoring, and WAF services
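The high-availability point depends on how the load balancer decides a gateway instance is healthy. A common approach is to require several consecutive probe results before changing state, which avoids flapping on a single dropped probe. The sketch below assumes that approach; the class name and thresholds are illustrative.

```typescript
class HealthTracker {
  private unhealthyThreshold: number;
  private healthyThreshold: number;
  private healthy = true;
  private streak = 0; // consecutive probes that contradict the current state

  constructor(unhealthyThreshold: number, healthyThreshold: number) {
    this.unhealthyThreshold = unhealthyThreshold;
    this.healthyThreshold = healthyThreshold;
  }

  // Record one probe result and return the resulting health state.
  record(probeSucceeded: boolean): boolean {
    if (probeSucceeded === this.healthy) {
      this.streak = 0; // result agrees with current state; reset the streak
      return this.healthy;
    }
    this.streak += 1;
    const threshold = this.healthy ? this.unhealthyThreshold : this.healthyThreshold;
    if (this.streak >= threshold) {
      this.healthy = !this.healthy;
      this.streak = 0;
    }
    return this.healthy;
  }
}
```

With thresholds of 3 and 2, a gateway instance leaves rotation after three failed probes in a row and returns after two successful ones.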

Pattern 2: Gateway with Internal Load Balancing

Some gateways (Kong, Envoy) include built-in load balancing for backend services:

                        ┌─────────────────────────────────────┐
                        │            API GATEWAY              │
                        │                                     │
Request ───────────────►│  Auth ──► Rate Limit ──► Transform  │
                        │                │                    │
                        │                ▼                    │
                        │         ┌─────────────┐             │
                        │         │   Built-in  │             │
                        │         │Load Balancer│             │
                        │         └──────┬──────┘             │
                        └────────────────┼────────────────────┘
                                         │
                   ┌─────────────────────┼─────────────────────┐
                   │                     │                     │
                   ▼                     ▼                     ▼
              Instance 1            Instance 2            Instance 3

This simplifies deployments when:

Backend services have known, relatively static endpoints
You want fewer infrastructure components
Your gateway product has robust load balancing features

Pattern 3: Service Mesh for Internal Traffic

For east-west (service-to-service) traffic, many organizations replace internal load balancers with a service mesh:

                             North-South Traffic
                                    │
                                    ▼
                        ┌───────────────────────┐
                        │      API Gateway      │
                        │   (External Traffic)  │
                        └───────────┬───────────┘
                                    │
            ┌───────────────────────┼───────────────────────┐
            │                       │                       │
            ▼                       ▼                       ▼
    ┌───────────────┐       ┌───────────────┐       ┌───────────────┐
    │   Service A   │◄─────►│   Service B   │◄─────►│   Service C   │
    │   + Sidecar   │       │   + Sidecar   │       │   + Sidecar   │
    └───────────────┘       └───────────────┘       └───────────────┘
            ▲                       ▲                       ▲
            │                       │                       │
            └───────────────────────┴───────────────────────┘
                              East-West Traffic
                       (Managed by Service Mesh Sidecars)

In this pattern:

API Gateway: Handles north-south (external to internal) traffic
Service Mesh: Handles east-west (internal) traffic with sidecars
No Separate Internal Load Balancers: Mesh provides load balancing, mTLS, retries

Common Misconceptions

Let's address frequent misconceptions about gateways and load balancers:

Misconceptions to Avoid

•"They're interchangeable" — They are not. A load balancer cannot validate JWTs, enforce per-user rate limits, or manage API versions. A gateway is overkill for simple TCP load balancing.
•"I only need one or the other" — Most production systems need both: load balancers for connection distribution, gateways for API management. They're layers, not alternatives.
•"NGINX is a load balancer" — NGINX can be configured as an L7 load balancer OR as an API gateway. The product is the same; the configuration determines the role.
•"AWS ALB replaces API Gateway" — ALB handles path-based routing and TLS termination. AWS API Gateway adds authentication, rate limiting, API management, and developer portal features. Different tools.
•"Gateway is slower, avoid if possible" — Gateways add latency, but they help prevent security breaches and abuse-driven outages, and they enable API evolution. For API traffic, the added latency is usually a worthwhile trade-off.
•"API Gateway duplicates service mesh functionality" — Meshes excel at east-west traffic (service-to-service). Gateways excel at north-south traffic (external APIs). They're complementary.

Use the Right Tool for the Job
Need	Use Load Balancer	Use API Gateway
Distribute TCP connections	✓
Route internal microservice HTTP	✓ (or mesh)
SSL termination only	✓
Validate API keys		✓
Enforce per-user rate limits		✓
Protocol translation (REST→gRPC)		✓
API versioning		✓
Developer portal		✓
Non-HTTP protocols	✓
External API security		✓
High-throughput static content	✓ (with CDN)
GraphQL federation		✓

Decision Framework: Gateway, Load Balancer, or Both?

Use this framework to decide what you need for a given use case:

                          ┌─────────────────────────┐
                          │  Is traffic external   │
                          │  (clients, partners)?  │
                          └───────────┬────────────┘
                                      │
                    ┌─────────────────┴─────────────────┐
                    │                                   │
                   Yes                                 No
                    │                                   │
                    ▼                                   ▼
        ┌──────────────────────┐          ┌───────────────────────┐
        │ You need an          │          │ Is it HTTP traffic?   │
        │ API GATEWAY          │          └───────────┬───────────┘
        │                      │                      │
        │ + Authentication     │        ┌─────────────┴─────────────┐
        │ + Rate Limiting      │        │                           │
        │ + API Management     │       Yes                         No
        └──────────┬───────────┘        │                           │
                   │                    ▼                           ▼
                   │        ┌──────────────────────┐   ┌──────────────────────┐
                   │        │ Use L7 LOAD BALANCER │   │ Use L4 LOAD BALANCER │
                   │        │ or SERVICE MESH      │   │                      │
                   │        │                      │   │ TCP/UDP distribution │
                   │        │ HTTP routing,        │   │ Connection balancing │
                   │        │ health checks        │   │ Max throughput       │
                   │        └──────────────────────┘   └──────────────────────┘
                   │
                   ▼
        ┌──────────────────────┐
        │ Do you have multiple │
        │ gateway instances?   │
        └───────────┬──────────┘
                    │
        ┌───────────┴───────────┐
        │                       │
       Yes                     No
        │                       │
        ▼                       │
┌──────────────────────┐        │
│ Put a LOAD BALANCER  │        │
│ IN FRONT of Gateway  │        │
│                      │        │
│ Distribute across    │        │
│ gateway instances    │        │
└──────────────────────┘        │
                                │
                                ▼
                     ┌──────────────────────┐
                     │ Gateway built-in     │
                     │ load balancing might │
                     │ suffice              │
                     └──────────────────────┘
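The decision tree above can also be written as a small function, which makes the branches easy to check. The input shape and the wording of the recommendations are assumptions made for this sketch.

```typescript
interface TrafficProfile {
  external: boolean;        // clients or partners outside the trusted zone?
  http: boolean;            // is the traffic HTTP-based?
  gatewayInstances: number; // only relevant for external traffic
}

function recommend(profile: TrafficProfile): string[] {
  // Internal traffic: the protocol decides between L7 and L4.
  if (!profile.external) {
    return profile.http ? ["L7 load balancer or service mesh"] : ["L4 load balancer"];
  }
  // External traffic always gets a gateway; the instance count decides
  // whether a separate load balancer sits in front of it.
  const components = ["API gateway"];
  components.push(
    profile.gatewayInstances > 1
      ? "load balancer in front of the gateway"
      : "gateway built-in load balancing may suffice"
  );
  return components;
}
```

For example, external traffic served by three gateway instances yields both an API gateway and a load balancer in front of it, matching the left-hand branch of the diagram.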

Practical Recommendations

For Startups / Small Teams:

Start with a cloud-managed gateway (AWS API Gateway, Google Cloud Endpoints)
Cloud provider's load balancer handles gateway HA automatically
Keep it simple; add complexity when you have concrete problems

For Mid-Size Organizations:

L7 load balancer (ALB, Cloud Load Balancer) in front of gateway cluster
API Gateway (Kong, NGINX API Gateway) for API management
Consider service mesh for internal traffic if microservices are complex

For Large Enterprises:

Global load balancing (anycast, multi-region)
CDN + Edge + API Gateway tiers
Service mesh for internal traffic
Potentially domain-specific internal gateways
Full observability stack integrated across all layers

Summary: Gateway vs. Load Balancer

Understanding the distinction between API Gateways and Load Balancers is essential for correct infrastructure design. Let's consolidate the key insights:

Key Takeaways

•Different Problems, Different Tools — Load balancers distribute connections; gateways manage APIs. They solve different problems at different layers.
•Layer 4 vs. Layer 7 vs. API-Aware — Load balancers operate at connection or HTTP level; gateways understand full API semantics (auth, rate limits, versioning).
•Load Balancers Excel at Distribution — High throughput, low latency, non-HTTP protocols, and connection-level concerns are load balancer territory.
•Gateways Excel at API Management — Authentication, authorization, rate limiting, versioning, protocol translation, and developer experience require a gateway.
•They're Complementary, Not Competing — Production architectures typically use both: load balancers distribute traffic across gateway instances.
•Choose Based on Requirements — External API traffic needs a gateway. Internal service traffic often needs just load balancing. Many systems need both.
•Avoid Misconceptions — Products overlap in features, but their core competencies differ. Don't confuse an L7 load balancer with path routing for a full API gateway.
