In 2015, when Mashape (now Kong Inc.) open-sourced Kong, they fundamentally changed the API Gateway landscape. Until then, organizations faced a stark choice: build custom gateway infrastructure from scratch, or purchase expensive proprietary solutions from vendors like Apigee, MuleSoft, or IBM. Kong introduced a third option—a production-grade, open-source API Gateway built on battle-tested foundations, with an architecture designed for extensibility.
Today, Kong is one of the most widely deployed API Gateways in the world, powering API infrastructure at organizations ranging from startups to Fortune 500 enterprises. Its plugin-based architecture has become an industry reference point, influencing how we think about gateway extensibility. Whether you're evaluating Kong for your organization or simply seeking to understand gateway design patterns, a deep understanding of Kong's architecture provides invaluable insights.
This page provides an exhaustive exploration of Kong—its architectural foundations, plugin ecosystem, deployment models, operational characteristics, and the considerations that determine when Kong is the right choice for your infrastructure.
By the end of this page, you will understand Kong's architectural philosophy and core components, how its plugin system enables extensibility without code forks, the differences between Kong OSS, Kong Enterprise, and Kong Konnect, how to reason about Kong's performance characteristics, and when Kong is—and isn't—the optimal choice for your gateway needs.
Kong's architecture is built on a foundation of proven, high-performance components. Understanding this foundation is essential for reasoning about Kong's behavior, performance characteristics, and operational requirements.
At its core, Kong is built atop NGINX, the world's most widely deployed web server and reverse proxy. NGINX handles all low-level HTTP processing: connection management, TLS termination, HTTP parsing, and proxying. This choice wasn't accidental—NGINX's event-driven, non-blocking architecture can handle tens of thousands of concurrent connections with minimal memory footprint.
NGINX provides Kong with a mature, battle-tested HTTP engine: efficient event-driven connection handling, TLS termination, request parsing, and proxying, along with the security hardening and performance work of a project deployed across much of the internet.
OpenResty extends NGINX with LuaJIT, a Just-In-Time compiler for the Lua programming language. This combination allows Kong to execute Lua code at various points in the request/response lifecycle—without the per-request overhead of spawning external processes or the complexity of writing NGINX modules in C.
OpenResty's architecture is critical to understanding Kong's plugin model:
┌─────────────────────────────────────────────────────────────────────┐
│                  NGINX/OpenResty Request Lifecycle                  │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   CLIENT REQUEST                                                    │
│         │                                                           │
│         ▼                                                           │
│   ┌─────────────────┐                                               │
│   │    ssl_phase    │ ← TLS handshake, certificate validation       │
│   └────────┬────────┘                                               │
│            ▼                                                        │
│   ┌─────────────────┐                                               │
│   │  rewrite_phase  │ ← URL rewriting, request transformation [LUA] │
│   └────────┬────────┘                                               │
│            ▼                                                        │
│   ┌─────────────────┐                                               │
│   │  access_phase   │ ← AuthN/AuthZ, rate limiting [LUA]            │
│   └────────┬────────┘                                               │
│            ▼                                                        │
│   ┌─────────────────┐                                               │
│   │  content_phase  │ ← Upstream selection, load balancing [LUA]    │
│   └────────┬────────┘                                               │
│            ▼                                                        │
│        UPSTREAM                                                     │
│            │                                                        │
│            ▼                                                        │
│   ┌─────────────────┐                                               │
│   │  header_filter  │ ← Response header modification [LUA]          │
│   └────────┬────────┘                                               │
│            ▼                                                        │
│   ┌─────────────────┐                                               │
│   │   body_filter   │ ← Response body modification [LUA]            │
│   └────────┬────────┘                                               │
│            ▼                                                        │
│   ┌─────────────────┐                                               │
│   │    log_phase    │ ← Request/response logging, metrics [LUA]     │
│   └────────┬────────┘                                               │
│            ▼                                                        │
│   CLIENT RESPONSE                                                   │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

LuaJIT is arguably the fastest dynamic language implementation available. Its trace-based JIT compiler generates highly optimized machine code for hot paths, achieving performance within an order of magnitude of hand-written C for many workloads.
This matters because Kong plugins execute on every request; a poorly performing scripting language would add unacceptable latency. LuaJIT's characteristics fit the role well: a trace-based JIT that compiles hot code paths to machine code, low per-coroutine memory overhead, and an FFI for calling into C libraries when Lua alone is not enough.
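To make this concrete, here is a minimal plain-OpenResty sketch (not Kong itself) showing Lua short-circuiting a request in the access phase; the location path and upstream name are illustrative:

```nginx
location /api/ {
    access_by_lua_block {
        -- Runs on every request in the access phase; LuaJIT compiles this hot path
        local auth = ngx.var.http_authorization
        if not auth then
            -- Reject before any upstream work happens
            return ngx.exit(ngx.HTTP_UNAUTHORIZED)
        end
    }
    proxy_pass http://backend_pool;
}
```

Kong generalizes exactly this pattern: instead of ad-hoc Lua blocks in NGINX configuration, plugins register handlers for each phase.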
Kong adds its own layers atop NGINX/OpenResty:
| Layer | Component | Responsibility |
|---|---|---|
| L1: HTTP Engine | NGINX | Connection handling, TLS, HTTP parsing, proxying |
| L2: Lua Runtime | OpenResty/LuaJIT | Execute Lua at request phases, provide NGINX APIs |
| L3: Kong Core | Kong PDK | Request/response abstraction, plugin lifecycle, routing |
| L4: Data Plane | Kong Workers | Stateless request processing, plugin execution |
| L5: Control Plane | Kong Admin API | Configuration management, clustering, health checks |
| L6: Persistence | PostgreSQL/Cassandra | Configuration storage, cluster coordination |
Kong's choice to build on NGINX and OpenResty rather than implementing HTTP handling from scratch is a masterclass in software architecture. Rather than reinventing the wheel, Kong inherits decades of battle-testing, security patches, and performance optimizations. This pattern—building domain-specific logic atop proven infrastructure—is a hallmark of successful open-source projects.
Kong's plugin architecture is its defining characteristic—the feature that distinguishes it from other gateways and enables its remarkable flexibility. Understanding this architecture is essential for anyone deploying, extending, or evaluating Kong.
Kong follows a philosophy of minimal core, maximal plugins. The Kong core provides only the essentials: request routing, upstream load balancing and health checking, the plugin lifecycle, and the request/response abstractions of the PDK.
Nearly everything else—authentication, rate limiting, logging, request transformation, caching—is implemented as plugins. This design has profound implications: features evolve without forking the core, each route pays latency only for the plugins it enables, and third parties extend Kong through the same interfaces the bundled plugins use.
```lua
-- Example: Custom Rate Limiting Plugin
-- File: kong/plugins/custom-rate-limit/handler.lua

local BasePlugin = require "kong.plugins.base_plugin"
local CustomRateLimitHandler = BasePlugin:extend()

-- Plugin priority determines execution order (higher = earlier).
-- Bundled auth plugins sit around 1000+ and logging plugins near the
-- bottom, so 901 runs after authentication but before logging.
CustomRateLimitHandler.PRIORITY = 901
CustomRateLimitHandler.VERSION = "1.0.0"

function CustomRateLimitHandler:new()
  CustomRateLimitHandler.super.new(self, "custom-rate-limit")
end

-- ACCESS phase: runs after authentication, before upstream proxying
function CustomRateLimitHandler:access(conf)
  CustomRateLimitHandler.super.access(self)

  -- Get consumer identity (set by auth plugins), falling back to client IP
  local consumer = kong.client.get_consumer()
  local identifier = consumer and consumer.id or kong.client.get_forwarded_ip()

  -- Check rate limit using a simple fixed-TTL counter in the node cache
  local cache_key = "ratelimit:" .. identifier .. ":" .. conf.window
  local current = kong.cache:get(cache_key) or 0

  if current >= conf.limit then
    -- Rate limit exceeded: respond immediately, never touching the upstream
    kong.response.set_header("X-RateLimit-Limit", conf.limit)
    kong.response.set_header("X-RateLimit-Remaining", 0)
    kong.response.set_header("Retry-After", conf.window)
    return kong.response.exit(429, {
      message = "Rate limit exceeded",
      retry_after = conf.window
    })
  end

  -- Increment counter; entry expires after the configured window
  kong.cache:set(cache_key, current + 1, conf.window)

  -- Set informational response headers
  kong.response.set_header("X-RateLimit-Limit", conf.limit)
  kong.response.set_header("X-RateLimit-Remaining", conf.limit - current - 1)
end

-- LOG phase: runs after the response is sent to the client
function CustomRateLimitHandler:log(conf)
  CustomRateLimitHandler.super.log(self)

  -- Log rate limit metrics for observability
  local status = kong.response.get_status()
  -- NGINX's $request_time is total time in seconds; convert to ms
  local latency = (tonumber(ngx.var.request_time) or 0) * 1000

  kong.log.info("Request completed",
    " status=", status,
    " latency=", latency, "ms",
    " consumer=", kong.client.get_consumer() and kong.client.get_consumer().username or "anonymous"
  )
end

return CustomRateLimitHandler
```
```lua
-- File: kong/plugins/custom-rate-limit/schema.lua
-- Defines plugin configuration schema with validation

local typedefs = require "kong.db.schema.typedefs"

return {
  name = "custom-rate-limit",
  fields = {
    { consumer = typedefs.no_consumer }, -- Plugin applies globally or to routes/services
    { protocols = typedefs.protocols_http },
    { config = {
        type = "record",
        fields = {
          -- Rate limit: max requests per window
          { limit = {
              type = "integer",
              required = true,
              gt = 0, -- Greater than 0
              default = 100,
          }},
          -- Time window in seconds
          { window = {
              type = "integer",
              required = true,
              gt = 0,
              default = 60,
          }},
          -- Storage backend for counters
          { storage = {
              type = "string",
              one_of = { "local", "redis", "cluster" },
              default = "local",
          }},
          -- Redis configuration (if storage = "redis")
          { redis_host = {
              type = "string",
              default = "127.0.0.1",
          }},
          { redis_port = {
              type = "integer",
              default = 6379,
          }},
          { redis_password = {
              type = "string",
              encrypted = true, -- Stored encrypted in DB
          }},
          -- Response customization
          { error_message = {
              type = "string",
              default = "API rate limit exceeded",
          }},
          { hide_client_headers = {
              type = "boolean",
              default = false,
          }},
        },
    }},
  },
}
```
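With handler and schema in place, enabling the plugin from declarative configuration is a one-stanza affair. A sketch, attaching it to a route (the route name and values are illustrative):

```yaml
plugins:
  - name: custom-rate-limit
    route: payment-route     # any route defined elsewhere in the file
    config:
      limit: 50              # 50 requests...
      window: 60             # ...per 60-second window
      storage: local
```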
Kong plugins execute in a deterministic order controlled by their priority values: within each phase, plugins with higher priority run first. This ordering is critical for correct behavior, since authentication must establish the consumer's identity before authorization, rate limiting, or logging plugins can make use of it. The bundled plugin priorities below illustrate the convention:
| Plugin | Priority | Category | Purpose |
|---|---|---|---|
| correlation-id | 100001 | Observability | Add request correlation ID |
| jwt | 1005 | Authentication | Validate JWT tokens |
| key-auth | 1003 | Authentication | API key authentication |
| oauth2 | 1004 | Authentication | OAuth 2.0 flows |
| acl | 950 | Authorization | Access control lists |
| rate-limiting | 901 | Traffic Control | Request rate limiting |
| request-transformer | 801 | Transformation | Modify request headers/body |
| response-transformer | 800 | Transformation | Modify response headers/body |
| proxy-cache | 100 | Performance | Cache upstream responses |
| http-log | 12 | Logging | HTTP request logging |
| file-log | 9 | Logging | Log to file system |
When writing custom plugins, carefully consider priority values. A custom auth plugin with priority lower than rate-limiting will fail—rate limiting runs before identity is established. Always document your plugin's priority assumptions and test with realistic plugin combinations.
Kong provides a comprehensive Plugin Development Kit (PDK) that abstracts away OpenResty/NGINX internals. The PDK offers stable, versioned APIs for inspecting and mutating requests and responses (kong.request, kong.response), identifying clients and authenticated consumers (kong.client), controlling the request sent upstream (kong.service.request), and structured logging (kong.log).
The PDK is the only supported interface for plugin development. Direct OpenResty API usage, while possible, is discouraged and may break across Kong versions.
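A minimal sketch of what PDK calls look like inside a plugin's access handler; all functions shown are documented PDK APIs:

```lua
local function access(conf)
  -- Inspect the incoming request
  local ua = kong.request.get_header("User-Agent")
  if not ua then
    -- Short-circuit: respond without contacting the upstream
    return kong.response.exit(400, { message = "User-Agent required" })
  end

  -- Mutate the request that will be proxied upstream
  kong.service.request.set_header("X-Gateway", "kong")

  -- Structured logging through the PDK
  kong.log.debug("client ip: ", kong.client.get_ip())
end
```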
Kong ships with a rich ecosystem of built-in plugins covering the essential gateway functionality. Understanding these categories helps you design your plugin configuration strategy.
Kong supports multiple authentication mechanisms, often used in combination:
| Plugin | Use Case | Key Features |
|---|---|---|
| key-auth | Simple API keys | Header/query param API keys, consumer mapping |
| jwt | Stateless tokens | RS256/HS256 validation, claims extraction, consumer mapping |
| oauth2 | Delegated authorization | Full OAuth2 flows, authorization code, client credentials |
| basic-auth | Username/password | Base64 credentials, consumer mapping |
| hmac-auth | Request signing | AWS Signature v4 compatible, replay protection |
| ldap-auth | Enterprise directories | LDAP/AD integration, group mapping |
| openid-connect | Modern identity (Enterprise) | OIDC flows, IdP integration, JWT validation |
| mtls-auth | Certificate-based | Client certificate validation, consumer mapping |
```yaml
# Kong declarative configuration (kong.yaml)
# Demonstrates layered authentication strategy

_format_version: "2.1"

services:
  - name: payment-api
    url: http://payment-service:8080
    routes:
      - name: payment-route
        paths:
          - /api/v1/payments

plugins:
  # Primary auth: JWT validation
  - name: jwt
    service: payment-api
    config:
      key_claim_name: iss
      claims_to_verify:
        - exp
        - nbf
      run_on_preflight: false

  # Secondary auth: mTLS for service-to-service
  - name: mtls-auth
    service: payment-api
    config:
      revocation_check_mode: SKIP  # Or: IGNORE_CA_ERROR
      authenticated_group_by: CN
      # Only applies when client presents certificate

  # Fallback: API key for legacy clients
  - name: key-auth
    service: payment-api
    config:
      key_names:
        - X-API-Key
        - apikey
      hide_credentials: true
      anonymous: anonymous-consumer  # Allow anonymous with limits

consumers:
  - username: anonymous-consumer
    custom_id: anonymous
  - username: premium-client
    custom_id: client-001
    # Associated credentials would be created separately
```

Traffic control plugins protect your infrastructure from overload and abuse:
| Plugin | Mechanism | Granularity | Storage |
|---|---|---|---|
| rate-limiting | Token bucket / Sliding window | Consumer, IP, Service, Route, Header | Local, Cluster, Redis |
| rate-limiting-advanced (EE) | Multiple limits per config | Same + Custom identifiers | Redis with sync |
| request-size-limiting | Body size enforcement | Per-route configurable | N/A (stateless) |
| response-ratelimiting | Based on response headers | X-RateLimit header from upstream | Local, Cluster, Redis |
| request-termination | Static response | Block routes entirely | N/A |
```yaml
# Different rate limits for different consumer tiers
plugins:
  # Default rate limit for all consumers
  - name: rate-limiting
    service: api-service
    config:
      minute: 100
      hour: 1000
      policy: redis
      redis_host: redis.internal
      redis_port: 6379
      redis_database: 0
      fault_tolerant: true       # Allow requests if Redis is down
      hide_client_headers: false

  # Override for premium consumers
  - name: rate-limiting
    service: api-service
    consumer: premium-tier
    config:
      minute: 1000
      hour: 50000
      policy: redis
      redis_host: redis.internal

  # Strict limits for trial/anonymous
  - name: rate-limiting
    service: api-service
    consumer: trial-tier
    config:
      minute: 10
      hour: 100
      policy: local  # Local storage is fine for low limits
```

Transformation plugins modify requests and responses without changing backend services:
| Plugin | Direction | Capabilities |
|---|---|---|
| request-transformer | Request | Add/remove/rename headers, query params, body fields |
| response-transformer | Response | Add/remove/rename headers, body transformation |
| correlation-id | Request | Generate unique request ID, propagate through system |
| request-validator | Request | JSON Schema validation, OpenAPI spec enforcement |
| grpc-web | Both | Translate gRPC-Web protocol to/from gRPC |
| grpc-gateway | Request | RESTful JSON to gRPC transcoding |
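For instance, a sketch of a request-transformer configuration that strips an internal header, adds a version header, and renames a legacy one (service name as in the earlier examples; header names illustrative):

```yaml
plugins:
  - name: request-transformer
    service: user-service
    config:
      remove:
        headers:
          - X-Internal-Debug        # never leak internal headers upstream
      add:
        headers:
          - X-API-Version:v1        # "name:value" pairs
      rename:
        headers:
          - X-Legacy-Id:X-Request-Id  # "old:new" pairs
```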
Kong's observability plugins integrate with standard monitoring stacks:
| Plugin | Output | Data | Use Case |
|---|---|---|---|
| http-log | HTTP endpoint | JSON request/response | Central logging, SIEM |
| tcp-log | TCP socket | JSON request/response | Log aggregators |
| udp-log | UDP socket | JSON request/response | Low-latency logging |
| syslog | Syslog daemon | Structured logs | Traditional Unix logging |
| file-log | Local file | JSON lines | Local debugging, rotation |
| prometheus | HTTP /metrics | Prometheus format | Metrics collection, Grafana |
| datadog | Datadog agent | Metrics + traces | Datadog APM integration |
| zipkin | Zipkin collector | Distributed traces | OpenZipkin ecosystem |
| opentelemetry (EE) | OTLP endpoint | Traces, metrics, logs | Modern observability |
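As a brief sketch, enabling Prometheus metrics alongside asynchronous HTTP logging in declarative config might look like this (the collector endpoint is illustrative):

```yaml
plugins:
  - name: prometheus          # exposes metrics for scraping
  - name: http-log
    config:
      http_endpoint: http://log-collector.internal:9200/kong
      method: POST
      timeout: 10000          # ms
      keepalive: 60000        # ms
```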
Each enabled plugin adds latency. In performance-critical scenarios, measure the impact of your plugin stack. Typical figures: auth plugins add 1-5ms (depending on cryptographic operations), rate limiting with Redis adds 1-2ms of network RTT, and logging plugins add negligible latency (they are asynchronous by default). The total plugin overhead should be evaluated against your latency budget.
Kong supports multiple deployment models, each with distinct characteristics for different organizational needs and scale requirements.
In DB-less mode, Kong operates without a database. Configuration is provided via a declarative YAML file loaded at startup; the kong.conf sketch below switches Kong into this mode, and a complete declarative file follows.
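A minimal sketch of the kong.conf side, assuming the declarative file lives at /etc/kong/kong.yaml:

```ini
# kong.conf: switch Kong into DB-less mode
database = off
declarative_config = /etc/kong/kong.yaml

# The file can be validated offline with: kong config parse /etc/kong/kong.yaml
```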
```yaml
# kong.yaml - Complete gateway configuration
_format_version: "2.1"
_transform: true

services:
  - name: user-service
    url: http://user-service.internal:8080
    connect_timeout: 5000
    read_timeout: 60000
    write_timeout: 60000
    retries: 3
    routes:
      - name: user-routes
        paths:
          - /api/v1/users
        methods:
          - GET
          - POST
          - PUT
          - DELETE
        strip_path: false
        preserve_host: true

  - name: order-service
    url: http://order-service.internal:8080
    routes:
      - name: order-routes
        paths:
          - /api/v1/orders

plugins:
  # Global plugins (apply to all routes)
  - name: prometheus
    config:
      per_consumer: true
      status_code_metrics: true
      latency_metrics: true

  - name: correlation-id
    config:
      header_name: X-Request-ID
      generator: uuid
      echo_downstream: true

  # Service-specific plugins
  - name: jwt
    service: user-service
    config:
      key_claim_name: iss

  - name: rate-limiting
    service: user-service
    config:
      minute: 100
      policy: local

upstreams:
  - name: user-service-upstream
    algorithm: round-robin
    healthchecks:
      active:
        healthy:
          interval: 5
          successes: 2
        unhealthy:
          interval: 5
          tcp_failures: 2
          http_failures: 2
    targets:
      - target: user-pod-1:8080
        weight: 100
      - target: user-pod-2:8080
        weight: 100
```

Traditional mode uses PostgreSQL or Cassandra for configuration storage:
┌─────────────────────────────────────────────────────────────────────────────┐
│ Kong Traditional Mode │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Kong #1 │ │ Kong #2 │ │ Kong #3 │ │
│ │ (Worker) │ │ (Worker) │ │ (Worker) │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │ │
│ └───────────────────┼───────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ PostgreSQL │ │
│ │ (Primary + Replica)│ │
│ └─────────────────────┘ │
│ │
│ Admin API → Any Kong node → Database → Propagates to all nodes │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Characteristics: every node connects directly to the database; the Admin API can be served from any node; and configuration written to the database propagates to all nodes, each of which keeps an in-memory configuration cache. The database is therefore a hard runtime dependency: if it becomes unreachable, nodes keep serving cached configuration but cannot accept changes.
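As a concrete sketch of this workflow, configuration in traditional mode is written through the Admin API on any node (port 8001 is Kong's default Admin API port; the service name and URL are illustrative):

```bash
# Register a service...
curl -X POST http://localhost:8001/services \
  --data name=user-service \
  --data url=http://user-service.internal:8080

# ...and attach a route to it; both persist to the database
# and propagate to every Kong node in the cluster.
curl -X POST http://localhost:8001/services/user-service/routes \
  --data name=user-routes \
  --data "paths[]=/api/v1/users"
```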
Kong's Hybrid Mode is the recommended production architecture for larger deployments. It separates the cluster into a control plane (CP), which owns the database and serves the Admin API, and a data plane (DP) of stateless proxy nodes that receive their configuration from the CP:
┌──────────────────────────────────────────────────────────────────────┐
│                          Kong Hybrid Mode                            │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  CONTROL PLANE CLUSTER (Protected Network)                           │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │   ┌─────────────┐     ┌─────────────┐                          │  │
│  │   │    CP #1    │     │    CP #2    │  ← Admin API requests    │  │
│  │   │  (Primary)  │     │ (Secondary) │                          │  │
│  │   └──────┬──────┘     └──────┬──────┘                          │  │
│  │          └─────────┬─────────┘                                 │  │
│  │                    ▼                                           │  │
│  │         ┌─────────────────────┐                                │  │
│  │         │     PostgreSQL      │                                │  │
│  │         └─────────────────────┘                                │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                       │                                              │
│                       │ mTLS WebSocket (Config Push)                 │
│                       ▼                                              │
│  DATA PLANE CLUSTER (DMZ / Edge)                                     │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │  ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐      │  │
│  │  │   DP #1   │ │   DP #2   │ │   DP #3   │ │   DP #N   │      │  │
│  │  │(Stateless)│ │(Stateless)│ │(Stateless)│ │(Stateless)│      │  │
│  │  └───────────┘ └───────────┘ └───────────┘ └───────────┘      │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                       ▲                                              │
│                       │ Internet Traffic                             │
└──────────────────────────────────────────────────────────────────────┘

In hybrid mode, configuration changes propagate from CP to DPs within seconds via persistent WebSocket connections secured with mTLS. DPs cache the full configuration locally, so they survive CP failures gracefully. This architecture is used by most large-scale Kong production deployments.
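A sketch of the kong.conf settings that assign roles in hybrid mode (hostnames and certificate paths are illustrative; 8005 is Kong's default clustering port):

```ini
# Control plane node
role = control_plane
cluster_cert = /etc/kong/cluster.crt
cluster_cert_key = /etc/kong/cluster.key

# Data plane node
role = data_plane
database = off                            # DPs hold no database connection
cluster_control_plane = cp.internal:8005  # CP clustering endpoint
cluster_cert = /etc/kong/cluster.crt
cluster_cert_key = /etc/kong/cluster.key
```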
Kong is available in multiple editions with different capabilities and licensing. Understanding these options is essential for planning and budgeting.
The core open-source Kong, licensed under Apache 2.0, includes the full gateway engine, roughly forty bundled plugins, the Admin API, and all three deployment topologies (DB-less, traditional, and hybrid).
Commercial offering with additional enterprise features:
| Category | Feature | Description |
|---|---|---|
| Authentication | OpenID Connect | Full OIDC flows, IdP integration |
| Authentication | SAML | Enterprise SSO integration |
| Authorization | OPA (Open Policy Agent) | Fine-grained policy-as-code authorization |
| Security | Secrets Management | HashiCorp Vault, AWS Secrets Manager integration |
| Security | GraphQL Security | Query depth limiting, cost analysis |
| Traffic Control | Rate Limiting Advanced | Multiple rate limits, custom windows, sliding window |
| Traffic Control | Canary Release | Gradual traffic shifting with automatic rollback |
| Developer Portal | Dev Portal | Self-service API documentation and key provisioning |
| Observability | OpenTelemetry | OTLP traces, metrics, logs |
| Observability | Vitals | Built-in analytics dashboard |
| Operations | RBAC | Role-based access control for Admin API |
| Operations | Workspaces | Multi-tenant configuration isolation |
| Support | 24/7 Support | Dedicated enterprise support SLA |
Kong's fully managed SaaS offering: Kong operates the control plane as a cloud service, while you run only the data planes in your own environment.
Pricing Model: Typically based on API requests/month or connected services.
| Capability | OSS | Enterprise | Konnect |
|---|---|---|---|
| Core Gateway | ✅ | ✅ | ✅ |
| Essential Plugins (~40) | ✅ | ✅ | ✅ |
| Enterprise Plugins (~30+) | ❌ | ✅ | ✅ |
| Developer Portal | ❌ | ✅ | ✅ (Managed) |
| RBAC / Workspaces | ❌ | ✅ | ✅ |
| OIDC / SAML | ❌ | ✅ | ✅ |
| Managed Control Plane | ❌ | ❌ | ✅ |
| Enterprise Support | ❌ | ✅ | ✅ |
| Self-hosted | ✅ | ✅ | DP only |
| License Cost | Free | $$$ | $$$ (SaaS) |
Start with Kong OSS for proof-of-concept and development. Evaluate Enterprise if you need OIDC integration, advanced rate limiting, developer portal, or enterprise support. Consider Konnect if operational overhead of running CPs is a concern and you prefer consumption-based pricing.
Kong's performance varies significantly based on configuration, plugin selection, and tuning. Understanding these factors helps you plan capacity and meet latency requirements.
Kong's raw proxying performance (minimal plugins) is impressive due to NGINX's efficiency:
| Metric | No Plugins | Auth + Rate Limit | Full Stack (8 plugins) |
|---|---|---|---|
| Requests/second (per core) | ~30,000 | ~15,000 | ~8,000 |
| P50 Latency (added) | < 1ms | 1-2ms | 3-5ms |
| P99 Latency (added) | < 2ms | 3-5ms | 8-15ms |
| Memory per 10K RPS | ~100MB | ~150MB | ~250MB |
| CPU utilization @ 10K RPS | ~30% | ~50% | ~80% |
These are approximate figures from various benchmarks. Your actual performance depends on: hardware (CPU, memory, network), request/response sizes, plugin configuration complexity, upstream latency, TLS termination overhead, and workload characteristics. Always benchmark with your specific configuration and traffic patterns.
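A simple way to isolate gateway overhead is to drive identical load through Kong and directly at the upstream, then compare latency distributions. This sketch uses wrk with illustrative hosts (8000 is Kong's default proxy port):

```bash
# Through the gateway...
wrk -t8 -c256 -d60s --latency http://kong.internal:8000/api/v1/users

# ...and directly against the upstream; the delta is Kong's contribution
wrk -t8 -c256 -d60s --latency http://user-service.internal:8080/users
```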
Critical configuration options for production deployments:
```ini
# Worker processes - typically set to number of CPU cores
# Auto-detection: nginx_worker_processes = auto
nginx_worker_processes = 4

# Worker connections - max concurrent connections per worker
# Default: 16384, increase for high-concurrency workloads
nginx_worker_connections = 65535

# Upstream keepalive - reuse connections to upstreams
# Critical for reducing upstream connection overhead
nginx_http_upstream_keepalive = 1000
nginx_http_upstream_keepalive_requests = 10000
nginx_http_upstream_keepalive_timeout = 60s

# DNS resolver - cache DNS lookups for upstreams
dns_stale_ttl = 60
dns_not_found_ttl = 10
dns_order = LAST,SRV,A,CNAME

# Database connection pooling (Traditional mode)
pg_max_concurrent_queries = 0   # Unlimited
pg_semaphore_timeout = 60000    # 60 seconds

# Cache tuning - for plugin data caching
mem_cache_size = 128m
db_cache_ttl = 0                # Cache indefinitely (Hybrid mode)

# Logging - reduce I/O in high-throughput scenarios
proxy_access_log = off          # Use logging plugins instead
proxy_error_log = /dev/stderr notice

# Timeouts
nginx_proxy_connect_timeout = 10s
nginx_proxy_read_timeout = 60s
nginx_proxy_send_timeout = 60s

# Buffer sizes for large headers/bodies
nginx_http_client_max_body_size = 10m
nginx_http_client_body_buffer_size = 10m
nginx_http_proxy_buffer_size = 160k
nginx_http_proxy_buffers = 64 160k
```

Different plugin categories have different performance characteristics:
| Category | Typical Latency | Bottleneck | Optimization Strategy |
|---|---|---|---|
| Correlation ID | < 0.1ms | UUID generation | No optimization needed |
| Key Auth (local) | 0.2-0.5ms | Cache lookup | Ensure cache is sized appropriately |
| JWT Validation | 0.5-2ms | Crypto (RS256 > HS256) | Prefer HS256 where a shared secret is feasible |
| Rate Limiting (local) | 0.2-0.5ms | Shared dict access | Use local policy when possible |
| Rate Limiting (Redis) | 1-3ms | Network RTT to Redis | Co-locate Redis, use pipelining |
| Request Transformer | 0.1-0.5ms | Regex operations | Use simple transforms, avoid regex |
| Response Transformer | 0.5-2ms | Body parsing/modification | Avoid body transforms if possible |
| Logging (async) | < 0.1ms | Minimal (async) | Ensure buffer sizes are adequate |
| Logging (sync HTTP) | 5-50ms | External HTTP call | Avoid sync logging; use async |
Kong is an excellent choice for many scenarios, but it's not the right tool for every situation. Understanding Kong's strengths and limitations helps you make informed decisions.
Use this framework when evaluating Kong against alternatives:
1. Extensibility Requirements: If you need custom gateway logic, Kong's plugin system and PDK are a major advantage; if the bundled and community plugins already cover your needs, a simpler gateway may suffice.
2. Deployment Environment: Kong runs wherever NGINX runs: bare metal, VMs, containers, and Kubernetes. Fully managed or serverless-first environments may favor a hosted gateway instead.
3. Protocol Requirements: Kong natively handles HTTP/HTTPS and gRPC (including gRPC-Web translation and REST-to-gRPC transcoding); workloads dominated by other protocols warrant careful evaluation.
4. Operational Model: Decide whether you will operate the gateways and their control plane yourself (OSS/Enterprise) or delegate the control plane to Kong (Konnect).
5. Budget Constraints: Kong OSS is free under Apache 2.0; Enterprise and Konnect add licensing costs that should be weighed against the feature comparison above.
Kong's greatest strength is its mature plugin ecosystem. Before building custom solutions, search the Kong Hub for existing plugins. Community plugins cover domains from GraphQL rate limiting to Kafka logging to HashiCorp Vault integration. The quality varies, but many are production-ready.
We've explored Kong's architecture, plugin system, deployment models, and decision criteria in depth. The essentials: Kong builds on NGINX and OpenResty for high-performance HTTP handling; a deliberately minimal core delegates nearly all functionality to plugins that execute at fixed lifecycle phases in priority order; DB-less, traditional, and hybrid modes trade operational simplicity against dynamism and scale; and the OSS, Enterprise, and Konnect editions differ mainly in advanced plugins, governance features, and who operates the control plane.
What's Next:
Having explored Kong as the archetypal open-source, plugin-based gateway, we'll next examine AWS API Gateway—a fundamentally different approach. Where Kong emphasizes self-hosted flexibility, AWS API Gateway offers a fully managed, serverless-integrated alternative with distinct trade-offs.
You now have a comprehensive understanding of Kong Gateway—its NGINX/OpenResty foundations, plugin architecture, deployment models, and decision criteria. You can evaluate Kong for your use cases, understand its performance characteristics, and reason about its trade-offs compared to alternatives. Next, we'll explore AWS API Gateway's serverless-native approach.