
Authentication at Gateway

Level: Intermediate

Duration: 90 mins

Topic: API Gateway Patterns


Centralized Authentication

The Case for Centralized Authentication

In a microservices architecture, every service potentially exposes endpoints that require authentication. Without a centralized approach, each service must independently implement authentication logic—validating tokens, checking credentials, managing sessions, and handling edge cases like token expiration and revocation.

This distributed authentication model creates several fundamental problems:

Code Duplication: Every service contains nearly identical authentication code, multiplied across potentially hundreds of services.

Inconsistent Security: Different teams implement authentication differently, leading to security gaps. One service might properly validate JWT signatures while another skips expiration checks.

Operational Overhead: Rotating secrets, updating authentication libraries, or patching vulnerabilities requires coordinated changes across all services.

Latency Multiplication: Each service independently calling identity providers creates redundant network requests and increased latency.

Centralized authentication at the API Gateway solves these problems by consolidating authentication logic at a single, well-audited entry point. The gateway authenticates all incoming requests before they reach backend services, allowing services to trust that authenticated context has already been established.

The Security Perimeter Principle

Centralized authentication establishes a clear security perimeter. Every request entering your system passes through the gateway, which acts as a checkpoint. Backend services operate behind this perimeter and can focus on authorization (what can authenticated users do?) rather than authentication (who is making this request?).

Architectural Foundations

Centralized authentication at the gateway layer requires careful architectural consideration. The gateway becomes a critical component in the request path—every authenticated request depends on it. This section examines the foundational architecture.

The Gateway as Authentication Proxy

The API Gateway sits at the edge of your system, receiving all inbound requests before they reach internal services. In its authentication role, the gateway performs several key functions:

1. Credential Extraction: The gateway extracts authentication credentials from incoming requests. These credentials might be:

Bearer tokens in the Authorization header
Session cookies
API keys in headers or query parameters
Client certificates (for mTLS)

2. Token Validation: For token-based authentication (the dominant model in modern systems), the gateway validates:

Signature verification: The token was issued by a trusted authority
Expiration checks: The token hasn't exceeded its lifetime
Issuer validation: The token comes from an expected identity provider
Audience validation: The token was intended for this service/API

3. Identity Resolution: Once the token is validated, the gateway resolves it to an identity, typically a user ID or service account. This identity becomes the security context for the request.

4. Context Propagation: The gateway propagates the authenticated identity to downstream services, typically via:

Custom headers (e.g., X-User-ID, X-Tenant-ID)
A new, internal token scoped to the request
Request metadata in the service mesh
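
The first of these functions, credential extraction, can be sketched as follows. This is a minimal illustration rather than a complete parser: the function name `extractCredential`, the `x-api-key` header, and the `session_id` cookie name are all assumptions made for the example, and `headers` is assumed to be a plain object with lower-cased header names.

```javascript
// Hypothetical helper: return the first credential found on a request.
// `headers` is assumed to be a plain object with lower-cased header names.
function extractCredential(headers) {
    const auth = headers["authorization"];
    if (typeof auth === "string") {
        // The scheme name is case-insensitive; the token itself is not
        const match = /^Bearer +([A-Za-z0-9\-._~+\/]+=*)$/i.exec(auth.trim());
        if (match) return { type: "bearer", value: match[1] };
    }

    const apiKey = headers["x-api-key"]; // assumed header name
    if (typeof apiKey === "string" && apiKey.length > 0) {
        return { type: "api_key", value: apiKey };
    }

    const cookie = headers["cookie"];
    if (typeof cookie === "string") {
        for (const part of cookie.split(";")) {
            const [name, ...rest] = part.trim().split("=");
            if (name === "session_id") { // assumed cookie name
                return { type: "session", value: rest.join("=") };
            }
        }
    }
    return null; // No credential present: the caller should answer 401
}
```

Returning a tagged object keeps the later validation step explicit about which pattern (token, API key, or session) applies to the request.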

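[Diagram: clients in the untrusted external zone reach backend services in the trusted internal zone only through the API Gateway]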

Trust Boundaries

The diagram illustrates a critical concept: trust boundaries. The external zone (clients) is untrusted—all requests must be authenticated. The internal zone (services behind the gateway) is trusted—services can rely on the gateway having authenticated requests. This trust model simplifies backend services but requires absolute confidence in the gateway's authentication logic.
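
One practical consequence of this trust model is that identity headers arriving from the external zone must never survive the gateway. The sketch below removes them before authentication runs. The prefix list is an assumption based on the header names used in this lesson, and `stripIdentityHeaders` is a hypothetical helper name.

```javascript
// Header prefixes reserved for gateway-asserted identity (assumed naming scheme)
const RESERVED_PREFIXES = ["x-authenticated-", "x-user-", "x-tenant-", "x-token-"];

// Returns a copy of `headers` without any client-supplied identity headers.
function stripIdentityHeaders(headers) {
    const clean = {};
    for (const [name, value] of Object.entries(headers)) {
        const lower = name.toLowerCase();
        if (!RESERVED_PREFIXES.some((prefix) => lower.startsWith(prefix))) {
            clean[name] = value;
        }
    }
    return clean;
}
```

Note that a gateway using header-based tenant identification would need to read its tenant hint before this step, or exempt that one header from the reserved list.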

Authentication Flow Patterns

Centralized authentication can be implemented through several distinct patterns, each with trade-offs in latency, security, and operational complexity. Understanding these patterns is essential for making appropriate architectural decisions.

Pattern 1: Inline Token Validation

In inline validation, the gateway validates tokens synchronously as part of request processing. This is the most common pattern for JWT-based authentication.

Inline JWT Validation Flow

Gateway Logic (Pseudocode)

// Inline Token Validation - Gateway Middleware
 
async function authenticateRequest(request, response, next) {
    // Step 1: Extract token from Authorization header
    const authHeader = request.headers["authorization"];
    if (!authHeader || !authHeader.startsWith("Bearer ")) {
        return response.status(401).json({
            error: "missing_token",
            message: "Authorization header with Bearer token required"
        });
    }
    
    const token = authHeader.substring(7);
    
    // Step 2: Decode token header (without verifying signature yet)
    // A malformed token must produce a 401, not an unhandled exception
    let header;
    try {
        header = decodeJwtHeader(token);
    } catch (err) {
        return response.status(401).json({
            error: "invalid_token",
            message: "Token is malformed"
        });
    }
    
    // Step 3: Retrieve signing key based on the 'kid' (key ID) header parameter
    // JWKS is cached locally with periodic refresh (e.g., every 5 minutes)
    let signingKey = jwksCache.getKey(header.kid);
    if (!signingKey) {
        // Unknown key ID: refresh the JWKS and retry once
        await jwksCache.refresh();
        signingKey = jwksCache.getKey(header.kid);
        if (!signingKey) {
            return response.status(401).json({
                error: "unknown_signing_key",
                message: "Token signed by unknown key"
            });
        }
    }
    
    // Step 4: Verify signature and decode claims
    let claims;
    try {
        claims = verifyAndDecode(token, signingKey, {
            algorithms: ["RS256"],           // Only allow expected algorithms
            issuer: config.expectedIssuer,    // Verify 'iss' claim
            audience: config.expectedAudience // Verify 'aud' claim
        });
    } catch (err) {
        if (err instanceof TokenExpiredError) {
            return response.status(401).json({
                error: "token_expired",
                message: "Access token has expired"
            });
        }
        return response.status(401).json({
            error: "invalid_token",
            message: "Token validation failed"
        });
    }
    
    // Step 5: Additional claim validation
    if (!claims.sub) {
        return response.status(401).json({
            error: "invalid_claims",
            message: "Token missing subject claim"
        });
    }
    
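    // Step 6: Propagate identity to downstream services
    // 'scope' is a space-delimited string in OAuth 2.0 tokens; some issuers emit an array
    const scopes = Array.isArray(claims.scope)
        ? claims.scope
        : (claims.scope || "").split(" ").filter(Boolean);
    request.headers["X-Authenticated-User-Id"] = claims.sub;
    request.headers["X-Authenticated-User-Email"] = claims.email || "";
    request.headers["X-Token-Scopes"] = scopes.join(",");
    request.headers["X-Request-Id"] = generateRequestId();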
    
    // Continue to routing
    next();
}

Characteristics of Inline Validation:

Latency: Minimal added latency (typically <1ms) since JWT validation is a local cryptographic operation
Dependencies: Requires JWKS caching; occasional dependency on identity provider for key refresh
Failure Mode: Gateway can continue operating with cached keys even if identity provider is temporarily unavailable
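
To make the "local cryptographic operation" concrete, here is a minimal sketch of RS256 verification using only Node's built-in `crypto` module. It is not a replacement for a maintained JWT library. The function names are hypothetical, `signRs256` exists only so the example has a token to verify, and the caller is assumed to pass the current time in seconds.

```javascript
const crypto = require("node:crypto");

const b64url = (value) => Buffer.from(value).toString("base64url");

// Example-only helper: sign a JWT so there is something to verify
function signRs256(claims, privateKey, kid) {
    const header = b64url(JSON.stringify({ alg: "RS256", typ: "JWT", kid }));
    const payload = b64url(JSON.stringify(claims));
    const signature = crypto.sign("sha256", Buffer.from(`${header}.${payload}`), privateKey);
    return `${header}.${payload}.${signature.toString("base64url")}`;
}

// Verify the signature first, then issuer, audience and expiry. Throws on failure.
function verifyRs256(token, publicKey, { issuer, audience, nowSeconds }) {
    const parts = token.split(".");
    if (parts.length !== 3) throw new Error("malformed_token");
    const [header, payload, signature] = parts;

    const { alg } = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
    if (alg !== "RS256") throw new Error("unexpected_algorithm");

    const valid = crypto.verify(
        "sha256",
        Buffer.from(`${header}.${payload}`),
        publicKey,
        Buffer.from(signature, "base64url")
    );
    if (!valid) throw new Error("invalid_signature");

    // Claims are only parsed and trusted after the signature checks out
    const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
    if (claims.iss !== issuer) throw new Error("invalid_issuer");
    const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
    if (!audiences.includes(audience)) throw new Error("invalid_audience");
    if (typeof claims.exp !== "number" || nowSeconds >= claims.exp) {
        throw new Error("token_expired");
    }
    return claims;
}
```

No network call appears anywhere in `verifyRs256`, which is why this pattern keeps working from cached keys while the identity provider is unreachable.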

Pattern 2: Token Introspection

For opaque tokens (non-JWT access tokens), the gateway must call the identity provider's introspection endpoint to validate tokens. This pattern is common with OAuth2 servers that issue reference tokens.

Token Introspection Flow

Gateway Logic

// Token Introspection - Gateway Middleware
 
async function authenticateRequest(request, response, next) {
    const token = extractBearerToken(request);
    if (!token) {
        return response.status(401).json({ error: "missing_token" });
    }
    
    // Check token cache first (avoid introspection call if recently validated)
    const cachedResult = tokenCache.get(hashToken(token));
    if (cachedResult) {
        if (cachedResult.active) {
            propagateIdentity(request, cachedResult.claims);
            return next();
        } else {
            return response.status(401).json({ error: "inactive_token" });
        }
    }
    
    // Call introspection endpoint
    const introspectionResult = await httpClient.post(
        config.introspectionEndpoint,
        {
            token: token,
            token_type_hint: "access_token"
        },
        {
            auth: {
                username: config.clientId,
                password: config.clientSecret
            },
            timeout: 500, // Aggressive timeout
            retry: { attempts: 2, delay: 50 }
        }
    );
    
    // Cache the result (with short TTL to handle revocation)
    tokenCache.set(hashToken(token), {
        active: introspectionResult.active,
        claims: {
            sub: introspectionResult.sub,
            scope: introspectionResult.scope,
            exp: introspectionResult.exp
        }
    }, { ttl: 30 }); // 30 second cache
    
    if (!introspectionResult.active) {
        return response.status(401).json({ error: "inactive_token" });
    }
    
    propagateIdentity(request, introspectionResult);
    next();
}

Characteristics of Token Introspection:

Latency: Higher latency (10-100ms) due to external call; mitigated by caching
Dependencies: Strong dependency on identity provider availability
Revocation: Near-immediate; a revoked token stops being accepted once its cached result expires, so the cache TTL bounds the delay
Trade-off: Real-time token status vs. latency and availability
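
The caching trade-off can be made tighter by never caching a result for longer than the token itself remains valid. The sketch below shows one way to do that. `introspectionCacheTtl` and `TtlCache` are hypothetical names, the 30-second cap matches the example above, and the clock is injectable only to make the behavior testable.

```javascript
// Cache an introspection result no longer than the token itself remains valid.
// `exp` and `nowSeconds` are Unix timestamps in seconds.
function introspectionCacheTtl(exp, nowSeconds, maxTtlSeconds = 30) {
    if (typeof exp !== "number") return maxTtlSeconds; // No exp reported: use the cap
    return Math.max(0, Math.min(maxTtlSeconds, exp - nowSeconds));
}

// Minimal in-memory TTL cache with an injectable millisecond clock
class TtlCache {
    constructor(now = () => Date.now()) {
        this.now = now;
        this.entries = new Map();
    }

    set(key, value, ttlSeconds) {
        if (ttlSeconds <= 0) return; // Never cache an already-expired result
        this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
    }

    get(key) {
        const entry = this.entries.get(key);
        if (!entry) return undefined;
        if (this.now() >= entry.expiresAt) {
            this.entries.delete(key); // Lazy eviction on read
            return undefined;
        }
        return entry.value;
    }
}
```

A production cache would also bound its size; this sketch only evicts entries when they are read after expiry.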

Pattern 3: Session-Based Authentication

For web applications with traditional session-based authentication, the gateway validates session cookies against a shared session store.

Session Validation Flow

Gateway Logic

// Session-Based Authentication - Gateway Middleware
 
async function authenticateRequest(request, response, next) {
    // Extract session ID from secure, httpOnly cookie
    const sessionId = request.cookies[config.sessionCookieName];
    if (!sessionId) {
        return response.status(401).json({ error: "no_session" });
    }
    
    // Validate session ID format (prevent injection attacks)
    if (!isValidSessionIdFormat(sessionId)) {
        return response.status(401).json({ error: "invalid_session_format" });
    }
    
    // Lookup session in distributed store (Redis, Memcached, etc.)
    const session = await sessionStore.get(sessionId);
    
    if (!session) {
        // Clear the invalid cookie
        response.clearCookie(config.sessionCookieName);
        return response.status(401).json({ error: "session_not_found" });
    }
    
    // Check session expiration
    if (Date.now() > session.expiresAt) {
        await sessionStore.delete(sessionId);
        response.clearCookie(config.sessionCookieName);
        return response.status(401).json({ error: "session_expired" });
    }
    
    // Sliding window expiration: extend session on activity
    const newExpiry = Date.now() + config.sessionTtlMs;
    await sessionStore.extend(sessionId, newExpiry);
    
    // Propagate identity
    request.headers["X-Authenticated-User-Id"] = session.userId;
    request.headers["X-Session-Id"] = sessionId;
    
    next();
}
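
The `isValidSessionIdFormat` check used above is not defined in the lesson. One plausible sketch follows, under the assumption that session IDs are 32 random bytes encoded as base64url (43 characters, no padding). A different ID scheme would need a different pattern.

```javascript
const crypto = require("node:crypto");

// Assumed format: 32 random bytes, base64url-encoded (43 characters, no padding)
const SESSION_ID_PATTERN = /^[A-Za-z0-9_-]{43}$/;

function generateSessionId() {
    return crypto.randomBytes(32).toString("base64url");
}

// Cheap structural check performed before any session store lookup
function isValidSessionIdFormat(sessionId) {
    return typeof sessionId === "string" && SESSION_ID_PATTERN.test(sessionId);
}
```

Rejecting malformed IDs before the store lookup keeps arbitrary client input out of store keys and avoids a network round trip for obviously bogus cookies.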

Authentication Pattern Comparison
| Pattern | Latency | Revocation Speed | IDP Dependency | Best For |
| --- | --- | --- | --- | --- |
| Inline JWT Validation | ~1ms | Token lifetime (minutes-hours) | Low (JWKS cache) | Microservices, APIs |
| Token Introspection | 10-100ms | Cache TTL (seconds) | High (every validation) | Opaque tokens, strict revocation |
| Session-Based | 1-5ms | Immediate (session deletion) | Medium (session store) | Web applications |

Gateway Architecture for High Availability

When authentication is centralized at the gateway, the gateway becomes a single point of failure. A gateway outage means no requests can be authenticated, effectively taking down the entire system. Designing for high availability is non-negotiable.

Redundancy Patterns

Multi-Instance Deployment: Deploy multiple gateway instances behind a load balancer. Each instance must be capable of independently validating tokens without coordination with other instances.

High Availability Gateway Deployment
Kubernetes
# api-gateway-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway
  namespace: edge
spec:
  replicas: 5  # Multiple replicas for redundancy
  selector:
    matchLabels:
      app: api-gateway
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1      # Maintain N-1 instances during updates
      maxSurge: 2
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      # Anti-affinity: spread across failure domains
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: api-gateway
                topologyKey: topology.kubernetes.io/zone
      containers:
        - name: gateway
          image: api-gateway:v2.3.1
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "500m"
              memory: "256Mi"
            limits:
              cpu: "2000m"
              memory: "1Gi"
          # Health checks
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          env:
            - name: JWKS_CACHE_TTL
              value: "300"  # 5 minute JWKS cache
            - name: JWKS_REFRESH_INTERVAL
              value: "60"   # Proactive refresh every minute
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-gateway-pdb
  namespace: edge
spec:
  minAvailable: 3  # Always maintain at least 3 healthy pods
  selector:
    matchLabels:
      app: api-gateway

Stateless Design

Each gateway instance must be completely stateless regarding authentication. This means:

1. No Local Session State: Sessions, if used, must be stored in a distributed store accessible to all instances.

2. Cached JWKS Must Be Eventually Consistent: Each instance maintains its own JWKS cache, refreshing independently. This means a key rotation might briefly cause some instances to reject tokens while others accept them—typically acceptable given the short refresh interval.

3. No Instance Affinity: Requests from the same client can be routed to any gateway instance. The gateway must not assume it will see previous requests from the same client.

Graceful Degradation

When dependencies (identity provider, session store) become unavailable, the gateway should degrade gracefully rather than failing completely.

Graceful Degradation Logic

Gateway Configuration

// Graceful Degradation Configuration
 
const authConfig = {
    // JWT Validation Degradation
    jwt: {
        // If JWKS refresh fails, continue using cached keys
        useStaleCacheOnRefreshFailure: true,
        staleCacheMaxAge: 3600,  // Accept stale cache for up to 1 hour
        
        // Log warning when operating with stale keys
        alertOnStaleCache: true,
    },
    
    // Token Introspection Degradation
    introspection: {
        // If introspection endpoint is unavailable
        fallbackBehavior: "USE_CACHED_OR_REJECT",  // Options: USE_CACHED_OR_REJECT, REJECT_ALL, PASS_THROUGH
        
        // Maximum age of cached introspection result to accept during outage
        maxCacheAgeOnFailure: 300,  // 5 minutes
        
        // Circuit breaker for introspection endpoint
        circuitBreaker: {
            failureThreshold: 5,
            recoveryTimeout: 30,
            halfOpenRequests: 2,
        },
    },
    
    // Session Store Degradation
    session: {
        // If session store is unavailable
        fallbackBehavior: "REJECT_ALL",  // Sessions require store access
        
        // Connection pool settings for resilience
        pool: {
            minConnections: 5,
            maxConnections: 50,
            connectionTimeout: 500,
            idleTimeout: 30000,
        },
        
        // Read from replicas, write to primary
        readFromReplicas: true,
    },
};

Degradation Trade-offs

Every degradation strategy involves a security vs. availability trade-off. Using stale JWKS means a key compromise takes longer to propagate. Using cached introspection results means revoked tokens remain valid briefly. Document these trade-offs explicitly and ensure they align with your organization's risk tolerance.
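
The `circuitBreaker` settings in the configuration above imply a small state machine. The sketch below is one possible reading of those three options, assuming `recoveryTimeout` is expressed in seconds; the class and method names are hypothetical, and the clock is injectable so the transitions can be tested deterministically.

```javascript
// Circuit breaker sketch. Option names mirror the configuration above.
class CircuitBreaker {
    constructor({ failureThreshold, recoveryTimeout, halfOpenRequests }, now = () => Date.now()) {
        this.failureThreshold = failureThreshold;
        this.recoveryTimeoutMs = recoveryTimeout * 1000; // assumes seconds
        this.halfOpenRequests = halfOpenRequests;
        this.now = now;
        this.state = "CLOSED";
        this.failures = 0;
        this.openedAt = 0;
        this.trialsLeft = 0;
    }

    // Returns true if a call to the dependency may be attempted
    allowRequest() {
        if (this.state === "OPEN" && this.now() - this.openedAt >= this.recoveryTimeoutMs) {
            this.state = "HALF_OPEN";
            this.trialsLeft = this.halfOpenRequests;
        }
        if (this.state === "CLOSED") return true;
        if (this.state === "HALF_OPEN" && this.trialsLeft > 0) {
            this.trialsLeft -= 1;
            return true;
        }
        return false; // OPEN, or HALF_OPEN with no trial requests left
    }

    recordSuccess() {
        this.state = "CLOSED";
        this.failures = 0;
    }

    recordFailure() {
        this.failures += 1;
        // Any failure during a half-open trial reopens the circuit immediately
        if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
            this.state = "OPEN";
            this.openedAt = this.now();
        }
    }
}
```

As a simplification, a single successful trial closes the circuit here; a stricter variant could require all `halfOpenRequests` trials to succeed. When `allowRequest()` returns false, the gateway would apply the configured `fallbackBehavior` instead of calling the introspection endpoint.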

Identity Propagation Strategies

Once the gateway authenticates a request, it must communicate the authenticated identity to downstream services. This is called identity propagation. Several strategies exist, each with distinct security and operational characteristics.

Strategy 1: Header-Based Propagation

The gateway adds custom headers containing identity information. Services read these headers and trust them implicitly.

Header-Based Identity Propagation

// Gateway: Inject identity headers after authentication
 
function propagateIdentity(request, claims) {
    // Set standard identity headers
    request.headers["X-User-Id"] = claims.sub;
    request.headers["X-User-Email"] = claims.email || "";
    request.headers["X-User-Roles"] = (claims.roles || []).join(",");
    
    // Set organization/tenant context (for multi-tenant systems)
    request.headers["X-Tenant-Id"] = claims.tenant_id || "";
    request.headers["X-Org-Id"] = claims.org_id || "";
    
    // Set token metadata
    request.headers["X-Token-Issuer"] = claims.iss;
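    // 'scope' is usually a space-delimited string; accept an array as well
    const scopes = Array.isArray(claims.scope) ? claims.scope : (claims.scope || "").split(" ").filter(Boolean);
    request.headers["X-Token-Scopes"] = scopes.join(",");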
    request.headers["X-Token-Expires"] = claims.exp.toString();
    
    // Request correlation
    request.headers["X-Request-Id"] = request.headers["X-Request-Id"] || generateUUID();
    request.headers["X-Correlation-Id"] = request.headers["X-Correlation-Id"] || generateUUID();
    
    // IMPORTANT: Remove the original Authorization header or replace it
    // to prevent downstream services from trusting the original token
    delete request.headers["Authorization"];
}

Critical Security Requirement

Header-based propagation is only secure if no external traffic can reach services directly. If an attacker can bypass the gateway and send requests directly to services, they can forge identity headers. Use network policies, service mesh, or firewall rules to ensure services only receive traffic from the gateway.

Strategy 2: Internal Token Minting

Instead of trusting headers, the gateway mints a new, cryptographically signed internal token for each request. Services validate this token, providing defense-in-depth.

Internal Token Minting

// Gateway: Mint internal JWT for downstream propagation
 
function mintInternalToken(claims, originalTokenExpiry) {
    // Internal token has short lifetime (5 minutes max)
    // or remaining lifetime of original token, whichever is shorter
    const expiresIn = Math.min(
        300,  // 5 minutes
        originalTokenExpiry - Math.floor(Date.now() / 1000)
    );
    
    const internalClaims = {
        // Standard claims
        iss: "gateway.internal",
        sub: claims.sub,
        aud: "internal-services",
        iat: Math.floor(Date.now() / 1000),
        exp: Math.floor(Date.now() / 1000) + expiresIn,
        
        // Custom claims (propagated from original token)
        email: claims.email,
        roles: claims.roles,
        tenant_id: claims.tenant_id,
        scopes: claims.scope,
        
        // Request context
        request_id: generateUUID(),
        original_issuer: claims.iss,
    };
    
    // Sign with gateway's private key (rotated regularly)
    return jwt.sign(internalClaims, gatewayPrivateKey, {
        algorithm: "ES256",  // ECDSA for fast signing
        keyid: currentKeyId,
    });
}
 
function propagateIdentity(request, claims) {
    const internalToken = mintInternalToken(claims, claims.exp);
    request.headers["Authorization"] = `Bearer ${internalToken}`;
}
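
On the receiving side, each service verifies the internal token before trusting its claims. The sketch below uses Node's built-in `crypto` module rather than a JWT library and assumes the service already holds the gateway's public key; `verifyInternalToken` is a hypothetical name, and the issuer and audience values are the ones minted above.

```javascript
const crypto = require("node:crypto");

// Service side: verify a gateway-minted ES256 token with the gateway's public key.
function verifyInternalToken(token, gatewayPublicKey, nowSeconds) {
    const parts = token.split(".");
    if (parts.length !== 3) throw new Error("malformed_token");
    const [header, payload, signature] = parts;

    const { alg } = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
    if (alg !== "ES256") throw new Error("unexpected_algorithm");

    // JWS encodes ECDSA signatures as raw r||s (IEEE P1363), not DER
    const valid = crypto.verify(
        "sha256",
        Buffer.from(`${header}.${payload}`),
        { key: gatewayPublicKey, dsaEncoding: "ieee-p1363" },
        Buffer.from(signature, "base64url")
    );
    if (!valid) throw new Error("invalid_signature");

    const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
    if (claims.iss !== "gateway.internal") throw new Error("invalid_issuer");
    if (claims.aud !== "internal-services") throw new Error("invalid_audience");
    if (typeof claims.exp !== "number" || nowSeconds >= claims.exp) {
        throw new Error("token_expired");
    }
    return claims;
}
```

Because the service checks the signature itself, a request that bypasses the gateway cannot simply assert an identity, which is the defense-in-depth property this strategy is meant to provide.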

Identity Propagation Comparison
| Strategy | Security | Performance | Complexity | Use Case |
| --- | --- | --- | --- | --- |
| Headers Only | Requires network isolation | Fastest (no crypto) | Low | Trusted internal network |
| Internal Token | Cryptographic verification | ~1ms signing overhead | Medium | Zero-trust internal |
| Pass-Through | Services validate original | Varies | Low | Services need full claims |

Multi-Tenant Authentication

Multi-tenant systems serve multiple customers (tenants) from a single deployment. Centralized gateway authentication must securely isolate tenants while efficiently routing requests to tenant-specific contexts.

Tenant Identification

The gateway must determine which tenant a request belongs to. Common approaches:

1. Subdomain-Based Tenancy

tenant-a.api.example.com/users
tenant-b.api.example.com/users

2. Path-Based Tenancy

api.example.com/tenants/tenant-a/users
api.example.com/tenants/tenant-b/users

3. Header-Based Tenancy

GET /users
X-Tenant-ID: tenant-a

4. Token-Embedded Tenancy

The tenant ID is embedded in the access token as a claim.

Multi-Tenant Authentication Logic

Gateway Middleware

// Multi-Tenant Gateway Authentication
 
async function authenticateMultiTenant(request, response, next) {
    // Step 1: Identify tenant from request
    const tenantId = resolveTenantId(request);
    if (!tenantId) {
        return response.status(400).json({
            error: "tenant_not_identified",
            message: "Unable to determine tenant context"
        });
    }
    
    // Step 2: Validate tenant exists and is active
    const tenant = await tenantRegistry.get(tenantId);
    if (!tenant) {
        return response.status(404).json({ error: "tenant_not_found" });
    }
    if (tenant.status !== "active") {
        return response.status(403).json({
            error: "tenant_suspended",
            message: "Tenant account is suspended"
        });
    }
    
    // Step 3: Extract and validate token
    const token = extractBearerToken(request);
    if (!token) {
        return response.status(401).json({ error: "missing_token" });
    }
    
    // Step 4: Get tenant-specific JWKS configuration
    // (tenants may have different identity providers)
    const idpConfig = tenant.identityProvider || defaultIdpConfig;
    const jwks = await getJwksForTenant(tenantId, idpConfig);
    
    // Step 5: Validate token
    let claims;
    try {
        claims = await validateToken(token, jwks, idpConfig);
    } catch (err) {
        return response.status(401).json({ error: "invalid_token" });
    }
    
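    // Step 6: Verify token belongs to this tenant
    // Critical: Prevent cross-tenant authentication.
    // A missing tenant_id claim is treated as a mismatch; otherwise a token
    // without the claim would be accepted for any tenant sharing the same IdP.
    // If a tenant's dedicated IdP cannot emit the claim, bind by issuer instead.
    if (claims.tenant_id !== tenantId) {
        // Token was issued for a different tenant (or carries no tenant claim)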
        logSecurityAlert("cross_tenant_attempt", {
            tokenTenant: claims.tenant_id,
            requestTenant: tenantId,
            userId: claims.sub,
        });
        return response.status(403).json({
            error: "tenant_mismatch",
            message: "Token not valid for this tenant"
        });
    }
    
    // Step 7: Propagate identity with tenant context
    request.headers["X-Tenant-Id"] = tenantId;
    request.headers["X-User-Id"] = claims.sub;
    request.headers["X-Tenant-Plan"] = tenant.plan;  // For rate limiting, features
    
    next();
}
 
function resolveTenantId(request) {
    // Priority 1: Subdomain
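    // Host may be absent or carry a port; normalize before matching
    const host = (request.headers["host"] || "").split(":")[0].toLowerCase();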
    const subdomainMatch = host.match(/^([a-z0-9-]+)\.api\.example\.com$/);
    if (subdomainMatch) {
        return subdomainMatch[1];
    }
    
    // Priority 2: Path prefix
    const pathMatch = request.path.match(/^\/tenants\/([a-z0-9-]+)\//);
    if (pathMatch) {
        return pathMatch[1];
    }
    
    // Priority 3: Header
    if (request.headers["x-tenant-id"]) {
        return request.headers["x-tenant-id"];
    }
    
    return null;
}

Tenant Isolation is Critical

Cross-tenant data access is one of the most severe security vulnerabilities in multi-tenant systems. Always verify that the authenticated user's tenant claim matches the request's target tenant. Log and alert on any mismatches—they may indicate an attack attempt.
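
The tenant check described here is safest when it fails closed. A minimal sketch, assuming the tenant is carried in a `tenant_id` claim as in the middleware above; `checkTenantBinding` and the returned error codes are hypothetical.

```javascript
// Fail-closed tenant check: a token is accepted only when its tenant claim
// is present and equals the tenant resolved from the request.
function checkTenantBinding(claims, requestTenantId) {
    if (typeof claims.tenant_id !== "string" || claims.tenant_id.length === 0) {
        return { ok: false, status: 403, error: "tenant_claim_missing" };
    }
    if (claims.tenant_id !== requestTenantId) {
        return { ok: false, status: 403, error: "tenant_mismatch" };
    }
    return { ok: true };
}
```

Distinguishing a missing claim from a mismatch is useful for alerting: the first usually indicates an identity provider misconfiguration, the second a possible cross-tenant attempt.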

Performance Optimization

Authentication adds latency to every request. At high traffic volumes, even small inefficiencies compound into significant performance impacts. This section covers optimization techniques for high-throughput gateway authentication.

JWKS Caching and Preloading

JWKS (JSON Web Key Set) contains the public keys needed to verify JWT signatures. Fetching JWKS on every request would be catastrophically slow.

Optimized JWKS Cache
TypeScript
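// High-Performance JWKS Cache with Proactive Refresh
import * as crypto from "node:crypto";

// Shapes inferred from how the class below uses them
interface JwksCacheConfig {
    cacheTtl: number;      // Cache lifetime in milliseconds
    fetchTimeout: number;  // JWKS fetch timeout in milliseconds
}

interface JwksCacheEntry {
    keys: Map<string, crypto.KeyObject>;
    fetchedAt: number;     // Epoch milliseconds
    expiresAt: number;     // Epoch milliseconds
    stale: boolean;        // True while serving keys after a failed refresh
}
 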
class JwksCacheManager {
    private cache: Map<string, JwksCacheEntry> = new Map();
    private refreshLock: Map<string, Promise<void>> = new Map();
    
    constructor(private readonly config: JwksCacheConfig) {}
    
    async getSigningKey(issuer: string, kid: string): Promise<crypto.KeyObject> {
        let entry = this.cache.get(issuer);
        
        // Cache miss or expired: fetch JWKS
        if (!entry || this.isExpired(entry)) {
            await this.refreshJwks(issuer);
            entry = this.cache.get(issuer);
        }
        // Near expiry: trigger background refresh (non-blocking)
        else if (this.isNearExpiry(entry)) {
            this.backgroundRefresh(issuer);
        }
        
        const key = entry?.keys.get(kid);
        if (!key) {
            throw new Error(`Unknown key ID: ${kid}`);
        }
        
        return key;
    }
    
    private async refreshJwks(issuer: string): Promise<void> {
        // Coalesce concurrent refresh requests
        let existingRefresh = this.refreshLock.get(issuer);
        if (existingRefresh) {
            return existingRefresh;
        }
        
        const refreshPromise = this.doRefresh(issuer);
        this.refreshLock.set(issuer, refreshPromise);
        
        try {
            await refreshPromise;
        } finally {
            this.refreshLock.delete(issuer);
        }
    }
    
    private async doRefresh(issuer: string): Promise<void> {
        const jwksUri = await this.discoverJwksUri(issuer);
        
        // A network error or timeout rejects the fetch; treat it like a non-2xx response
        const response = await fetch(jwksUri, {
            headers: { "Accept": "application/json" },
            signal: AbortSignal.timeout(this.config.fetchTimeout),
        }).catch(() => null);
        
        if (!response || !response.ok) {
            // Keep stale cache if refresh fails
            const existing = this.cache.get(issuer);
            if (existing) {
                existing.stale = true;
                this.emitAlert("jwks_refresh_failed", { issuer });
                return;
            }
            throw new Error(`JWKS fetch failed: ${response ? response.status : "network error"}`);
        }
        
        const jwks = await response.json();
        const keys = new Map<string, crypto.KeyObject>();
        
        for (const jwk of jwks.keys) {
            const keyObject = crypto.createPublicKey({ key: jwk, format: "jwk" });
            keys.set(jwk.kid, keyObject);
        }
        
        this.cache.set(issuer, {
            keys,
            fetchedAt: Date.now(),
            expiresAt: Date.now() + this.config.cacheTtl,
            stale: false,
        });
    }
    
    private backgroundRefresh(issuer: string): void {
        // Non-blocking refresh in background
        setImmediate(() => {
            this.refreshJwks(issuer).catch(err => {
                console.warn(`Background JWKS refresh failed: ${err.message}`);
            });
        });
    }
    
    private isExpired(entry: JwksCacheEntry): boolean {
        return Date.now() > entry.expiresAt && !entry.stale;
    }
    
    private isNearExpiry(entry: JwksCacheEntry): boolean {
        // Refresh when 75% through TTL
        const threshold = entry.fetchedAt + (this.config.cacheTtl * 0.75);
        return Date.now() > threshold;
    }
}

Algorithm Selection for Signature Verification

The choice of JWT signing algorithm significantly impacts verification performance:

JWT Algorithm Performance Comparison
| Algorithm | Type | Approx. Verify Time | Security | Recommendation |
| --- | --- | --- | --- | --- |
| HS256 | Symmetric (HMAC) | ~2μs | Shared secret risk | Internal only, never for external tokens |
| RS256 | Asymmetric (RSA) | ~50μs | Strong (2048+ bit) | Standard choice, widely supported |
| RS384/RS512 | Asymmetric (RSA) | ~60-80μs | Stronger | When extra security needed |
| ES256 | Asymmetric (ECDSA) | ~30μs | Strong (P-256) | Faster than RSA, smaller keys |
| ES384/ES512 | Asymmetric (ECDSA) | ~40-60μs | Stronger | When extra security needed |
| EdDSA (Ed25519) | Asymmetric | ~20μs | Modern, strong | Best performance, newer support |

Performance Tip

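At 10,000 requests/second and roughly 50μs per verification, RS256 adds about 500ms of cumulative CPU time per second (10,000 × 50μs). By the table's figures, ES256 at roughly 30μs would reduce this to about 300ms, and EdDSA at roughly 20μs to about 200ms. Treat these numbers as illustrative: actual timings depend heavily on hardware, key size, and crypto library, and in many implementations RSA verification is as fast as or faster than ECDSA verification, with ECDSA's clearer advantages being signing speed and key size. Benchmark in your own environment, and confirm that your JWT library and identity provider support an algorithm before switching to it.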
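
The arithmetic behind this tip, and a way to measure verification cost on your own hardware, can be sketched as follows. Both function names are hypothetical, and the benchmark is deliberately rough: it times raw signature verification with Node's `crypto` module, not a full JWT validation.

```javascript
const crypto = require("node:crypto");

// CPU milliseconds consumed per wall-clock second by signature verification
function verifyCpuMsPerSecond(requestsPerSecond, verifyMicros) {
    return (requestsPerSecond * verifyMicros) / 1000;
}

// Rough micro-benchmark: average verification time in microseconds
function measureVerifyMicros(keyType, keyOptions, iterations = 200) {
    const { publicKey, privateKey } = crypto.generateKeyPairSync(keyType, keyOptions);
    const data = Buffer.from("header.payload");
    const signature = crypto.sign("sha256", data, privateKey);

    const start = process.hrtime.bigint();
    for (let i = 0; i < iterations; i++) {
        crypto.verify("sha256", data, publicKey, signature);
    }
    const elapsedNs = Number(process.hrtime.bigint() - start);
    return elapsedNs / iterations / 1000;
}

console.log("Illustrative RS256 cost (ms/s):", verifyCpuMsPerSecond(10000, 50));
console.log("RSA-2048 µs/verify:", measureVerifyMicros("rsa", { modulusLength: 2048 }).toFixed(1));
console.log("ECDSA P-256 µs/verify:", measureVerifyMicros("ec", { namedCurve: "prime256v1" }).toFixed(1));
```

The measured values will differ between machines and Node versions, so the output is best used to compare algorithms on the same host rather than as absolute figures.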

Connection Pooling and Keep-Alive

For authentication patterns requiring external calls (introspection, session store), connection management is critical.

Connection Pool Configuration
Node.js
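const https = require("node:https");
const Redis = require("ioredis");  // Assumes the ioredis client

// HTTP Agent for Identity Provider Connections
const idpAgent = new https.Agent({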
    keepAlive: true,
    keepAliveMsecs: 60000,
    maxSockets: 100,           // Max connections per host
    maxFreeSockets: 20,        // Keep idle connections
    timeout: 5000,
    scheduling: 'fifo',        // First-in-first-out for consistent latency
});
 
// Redis Client for Session Store
const sessionRedis = new Redis.Cluster([
    { host: 'redis-node-1', port: 6379 },
    { host: 'redis-node-2', port: 6379 },
    { host: 'redis-node-3', port: 6379 },
], {
    redisOptions: {
        connectTimeout: 500,
        commandTimeout: 100,     // Fast fail for auth path
        enableReadyCheck: true,
        lazyConnect: false,
    },
    scaleReads: 'slave',         // Read from replicas
    slotsRefreshTimeout: 2000,
    clusterRetryStrategy: (times) => Math.min(100 * times, 1000),
});

Security Hardening

Centralized authentication concentrates security responsibility—and risk—at the gateway. Rigorous security hardening is essential.

Algorithm Enforcement

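Never allow the JWT to dictate which algorithm to use. Doing so enables the "algorithm confusion" attack, in which an attacker rewrites the token header to name an algorithm the verifier did not intend, for example HS256 (so that the RSA public key is treated as an HMAC secret) or "none" (no signature at all).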

Secure JWT Validation

Secure Configuration

// SECURE: Explicitly specify allowed algorithms
const decoded = jwt.verify(token, publicKey, {
    algorithms: ["RS256", "ES256"],  // Whitelist only expected algorithms
    issuer: "https://auth.example.com",
    audience: "api.example.com",
    complete: true,  // Get header too for logging
});
 
// INSECURE: Never do this!
// const decoded = jwt.verify(token, secretOrPublicKey);
// Without algorithm restriction, attacker can use HS256 with public key as secret
 
// Also reject the "none" algorithm explicitly (run this check before verification)
const header = decodeJwtHeader(token);
if (typeof header.alg !== "string" || header.alg.toLowerCase() === "none") {
    throw new SecurityError("Algorithm 'none' is forbidden");
}
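
The same idea can be applied as a standalone pre-check that runs before any key lookup. This is a sketch with a hypothetical function name; it complements, and does not replace, passing an explicit algorithm list to the verification call.

```javascript
// Reject a token whose header names an algorithm outside the allow-list,
// before any key lookup or signature work is done.
function assertAllowedAlgorithm(token, allowedAlgorithms) {
    const [encodedHeader] = String(token).split(".");
    let header;
    try {
        header = JSON.parse(Buffer.from(encodedHeader, "base64url").toString("utf8"));
    } catch {
        throw new Error("malformed_header");
    }
    if (header === null || typeof header !== "object" || typeof header.alg !== "string") {
        throw new Error("missing_algorithm");
    }
    // Exact, case-sensitive match: "none", "None" and "HS256" all fail here
    if (!allowedAlgorithms.includes(header.alg)) {
        throw new Error(`algorithm_not_allowed: ${header.alg}`);
    }
    return header;
}
```

An allow-list is stricter than a deny-list for "none": any algorithm the gateway was not explicitly configured for is rejected, including ones added to the JWT ecosystem later.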

Rate Limiting Authentication Endpoints

Authentication endpoints are prime targets for brute-force and credential stuffing attacks. Apply aggressive rate limiting.

Authentication Rate Limiting
Configuration
# Rate limiting for authentication flows
rate_limits:
  # Token validation (normal API traffic)
  token_validation:
    requests_per_second: 10000  # High limit, mostly legitimate
    burst: 500
    
  # Login/token issuance (target for attacks)
  login:
    # Per IP limits
    per_ip:
      requests_per_minute: 10
      burst: 5
    # Per username limits (slow distributed guessing against a single account)
    per_username:
      requests_per_minute: 5
      burst: 3
    # Global limits
    global:
      requests_per_second: 1000
      
  # Failed authentication (more aggressive)  
  failed_auth:
    per_ip:
      # After 5 failures, impose 5-minute delay
      threshold: 5
      lockout_seconds: 300
    per_username:
      threshold: 10
      lockout_seconds: 900  # 15 minutes after 10 failures
      
  # Password reset (frequently abused)
  password_reset:
    per_ip:
      requests_per_hour: 5
    per_email:
      requests_per_hour: 3
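
One way to read the `requests_per_minute` and `burst` pairs above is as a token bucket: `burst` is the bucket capacity and `requests_per_minute` is the refill rate. The sketch below is an assumption about how such limits might be enforced, not the behavior of any particular gateway product. The class name `TokenBucketLimiter` is hypothetical, and state is held in memory for brevity.

```javascript
// Sketch of a per-key token bucket (key = client IP, username, etc.).
class TokenBucketLimiter {
    constructor({ requestsPerMinute, burst }) {
        this.refillPerMs = requestsPerMinute / 60000; // tokens added per millisecond
        this.capacity = burst;                        // maximum tokens a bucket can hold
        this.buckets = new Map();                     // key -> { tokens, updatedAt }
    }

    // Returns true if the request identified by `key` may proceed.
    // `now` is injectable so the behavior can be tested deterministically.
    allow(key, now = Date.now()) {
        const bucket = this.buckets.get(key) || { tokens: this.capacity, updatedAt: now };
        const elapsed = Math.max(0, now - bucket.updatedAt);
        bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsed * this.refillPerMs);
        bucket.updatedAt = now;
        const allowed = bucket.tokens >= 1;
        if (allowed) {
            bucket.tokens -= 1;
        }
        this.buckets.set(key, bucket);
        return allowed;
    }
}

// Mirrors login.per_ip from the configuration above.
const loginPerIp = new TokenBucketLimiter({ requestsPerMinute: 10, burst: 5 });
```

Because gateway instances are meant to be stateless and interchangeable, a production limiter would normally keep these counters in a shared store rather than in process memory.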

Additional Security Measures

•Clock Skew Tolerance: Allow a small clock difference (e.g., 30 seconds) when validating exp/nbf claims; a larger tolerance extends the usable life of expired tokens
•Token Binding: Where supported, bind tokens to client certificates or DPoP proofs to prevent token theft
•Constant-Time Comparison: Use constant-time string comparison for secrets and tokens to prevent timing attacks
•Secure Headers: Set security headers (HSTS, X-Content-Type-Options, etc.) on all responses
•Audit Logging: Log every authentication decision (success and failure) with request context for forensic analysis
•Secret Rotation: Implement seamless rotation of signing keys, client secrets, and API keys without downtime
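
As an illustration of the constant-time comparison item, here is a minimal sketch built on Node's `crypto.timingSafeEqual`. The function name `constantTimeEqual` is an assumption. Both inputs are hashed first because `timingSafeEqual` requires buffers of equal length and throws otherwise.

```javascript
const crypto = require("crypto");

// Compare two secrets without leaking, through timing, where they first differ.
function constantTimeEqual(a, b) {
    // Hashing yields fixed-length digests, so inputs of different lengths are safe to compare.
    const digestA = crypto.createHash("sha256").update(String(a)).digest();
    const digestB = crypto.createHash("sha256").update(String(b)).digest();
    return crypto.timingSafeEqual(digestA, digestB);
}
```

A typical use would be checking a presented API key against the stored value instead of using `===`.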

Summary: Centralizing Authentication

Centralized authentication at the API Gateway is a foundational pattern for microservices security. Let's consolidate the key principles:

Key Takeaways

•Single Point of Authentication: The gateway validates all external requests, allowing internal services to focus on authorization rather than authentication.
•Multiple Validation Patterns: Inline JWT validation (fast, independent), token introspection (real-time status), and session-based authentication (stateful web apps) each serve different use cases.
•High Availability is Non-Negotiable: Deploy multiple stateless gateway instances with graceful degradation when dependencies fail.
•Identity Propagation Choices: Header-based propagation (simple, requires network isolation) vs. internal token minting (cryptographic verification, defense-in-depth).
•Multi-Tenant Security: Always verify tenant context matches token claims to prevent cross-tenant access.
•Performance Optimization: JWKS caching, algorithm selection, and connection pooling are essential for high-throughput gateways.
•Security Hardening: Enforce algorithm restrictions, rate-limit authentication endpoints, and maintain comprehensive audit logs.

Next Up

With centralized authentication established, the next page explores OAuth2 and JWT Validation in depth—understanding token formats, validation flows, and the specific security considerations for these dominant authentication standards.

Architectural Foundations

The Gateway as Authentication Proxy

The API Gateway sits at the edge of your system, receiving all inbound requests before they reach internal services. In its authentication role, the gateway performs several key functions:

1. Credential Extraction: The gateway extracts authentication credentials from incoming requests. These credentials might be:

•Bearer tokens in the Authorization header
•Session cookies
•API keys in headers or query parameters
•Client certificates (for mTLS)

2. Token Validation: For token-based authentication (the dominant model in modern systems), the gateway validates:

•Signature verification: The token was issued by a trusted authority
•Expiration checks: The token hasn't exceeded its lifetime
•Issuer validation: The token comes from an expected identity provider
•Audience validation: The token was intended for this service/API

3. Identity Resolution: Once validated, the gateway resolves the token to an identity, typically a user ID or service account. This identity becomes the security context for the request.

4. Context Propagation: The gateway propagates the authenticated identity to downstream services, typically via:

•Custom headers (e.g., X-User-ID, X-Tenant-ID)
•A new, internal token scoped to the request
•Request metadata in the service mesh
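
The credential extraction step can be sketched as a small function that returns the first credential it recognizes. This is an illustrative assumption rather than a prescribed implementation: the cookie name `session_id`, the `x-api-key` header, and the `api_key` query parameter are hypothetical, and client certificates are left out because they are handled at the TLS layer.

```javascript
// Sketch: pull the first recognizable credential out of a request.
// Header names are expected in lowercase, as Node.js provides them.
const SESSION_COOKIE = "session_id"; // hypothetical cookie name

// Parse a Cookie header ("a=1; b=2") into an object.
function parseCookies(cookieHeader) {
    const cookies = {};
    for (const pair of String(cookieHeader || "").split(";")) {
        const index = pair.indexOf("=");
        if (index === -1) continue;
        cookies[pair.slice(0, index).trim()] = pair.slice(index + 1).trim();
    }
    return cookies;
}

function extractCredentials(headers, query = {}) {
    // 1. Bearer token in the Authorization header
    const auth = headers["authorization"];
    if (typeof auth === "string") {
        const match = /^Bearer\s+(\S+)$/i.exec(auth.trim());
        if (match) return { type: "bearer", value: match[1] };
    }
    // 2. Session cookie
    const session = parseCookies(headers["cookie"])[SESSION_COOKIE];
    if (session) return { type: "session", value: session };
    // 3. API key in a header or query parameter
    const apiKey = headers["x-api-key"] || query.api_key;
    if (apiKey) return { type: "api_key", value: apiKey };
    return null; // no credential present
}
```

Returning a tagged object lets the gateway route each credential type to the matching validation pattern.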


Trust Boundaries

Authentication Flow Patterns

Pattern 1: Inline Token Validation

In inline validation, the gateway validates tokens synchronously as part of request processing. This is the most common pattern for JWT-based authentication.

Inline JWT Validation Flow

Gateway Logic (Pseudocode)

// Inline Token Validation - Gateway Middleware
 
async function authenticateRequest(request, response, next) {
    // Step 1: Extract token from Authorization header
    const authHeader = request.headers["authorization"];
    if (!authHeader || !authHeader.startsWith("Bearer ")) {
        return response.status(401).json({
            error: "missing_token",
            message: "Authorization header with Bearer token required"
        });
    }
    
    const token = authHeader.substring(7);
    
    // Step 2: Decode token header (without verifying signature yet)
    const header = decodeJwtHeader(token);
    
    // Step 3: Retrieve signing key based on the 'kid' (key ID) header parameter
    // JWKS is cached locally with periodic refresh (e.g., every 5 minutes)
    let signingKey = jwksCache.getKey(header.kid);
    if (!signingKey) {
        // Unknown key ID: refresh the JWKS and retry once (handles key rotation)
        await jwksCache.refresh();
        signingKey = jwksCache.getKey(header.kid);
        if (!signingKey) {
            return response.status(401).json({
                error: "unknown_signing_key",
                message: "Token signed by unknown key"
            });
        }
    }
    
    // Step 4: Verify signature and decode claims
    let claims;
    try {
        claims = verifyAndDecode(token, signingKey, {
            algorithms: ["RS256"],           // Only allow expected algorithms
            issuer: config.expectedIssuer,    // Verify 'iss' claim
            audience: config.expectedAudience // Verify 'aud' claim
        });
    } catch (err) {
        if (err instanceof TokenExpiredError) {
            return response.status(401).json({
                error: "token_expired",
                message: "Access token has expired"
            });
        }
        return response.status(401).json({
            error: "invalid_token",
            message: "Token validation failed"
        });
    }
    
    // Step 5: Additional claim validation
    if (!claims.sub) {
        return response.status(401).json({
            error: "invalid_claims",
            message: "Token missing subject claim"
        });
    }
    
    // Step 6: Propagate identity to downstream services
    request.headers["X-Authenticated-User-Id"] = claims.sub;
    request.headers["X-Authenticated-User-Email"] = claims.email || "";
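    // OAuth 2.0 'scope' is a space-delimited string, so split it before joining
    request.headers["X-Token-Scopes"] = String(claims.scope || "").split(" ").filter(Boolean).join(",");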
    request.headers["X-Request-Id"] = generateRequestId();
    
    // Continue to routing
    next();
}

Characteristics of Inline Validation:

•Latency: Minimal added latency (typically <1ms) since JWT validation is a local cryptographic operation
•Dependencies: Requires JWKS caching; occasional dependency on identity provider for key refresh
•Failure Mode: Gateway can continue operating with cached keys even if identity provider is temporarily unavailable

Pattern 2: Token Introspection

Token Introspection Flow

Gateway Logic

// Token Introspection - Gateway Middleware
 
async function authenticateRequest(request, response, next) {
    const token = extractBearerToken(request);
    if (!token) {
        return response.status(401).json({ error: "missing_token" });
    }
    
    // Check token cache first (avoid introspection call if recently validated)
    const cachedResult = tokenCache.get(hashToken(token));
    if (cachedResult) {
        if (cachedResult.active) {
            propagateIdentity(request, cachedResult.claims);
            return next();
        } else {
            return response.status(401).json({ error: "inactive_token" });
        }
    }
    
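    // Call introspection endpoint; fail closed if the identity provider cannot be reached
    let introspectionResult;
    try {
        introspectionResult = await httpClient.post(
            config.introspectionEndpoint,
            {
                token: token,
                token_type_hint: "access_token"
            },
            {
                auth: {
                    username: config.clientId,
                    password: config.clientSecret
                },
                timeout: 500, // Aggressive timeout
                retry: { attempts: 2, delay: 50 }
            }
        );
    } catch (err) {
        return response.status(503).json({ error: "introspection_unavailable" });
    }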
    
    // Cache the result (the short TTL bounds how long a revoked token stays usable)
    const claims = {
        sub: introspectionResult.sub,
        scope: introspectionResult.scope,
        exp: introspectionResult.exp
    };
    tokenCache.set(hashToken(token), {
        active: introspectionResult.active,
        claims
    }, { ttl: 30 }); // 30 second cache
    
    if (!introspectionResult.active) {
        return response.status(401).json({ error: "inactive_token" });
    }
    
    // Propagate the same claims object used on the cached path
    propagateIdentity(request, claims);
    next();
}

Characteristics of Token Introspection:

•Latency: Higher latency (10-100ms) due to the external call; mitigated by caching
•Dependencies: Strong dependency on identity provider availability
•Revocation: Takes effect within the cache TTL (seconds), or immediately if results are not cached
•Trade-off: Real-time token status vs. latency and availability
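
The revocation trade-off depends on how the introspection cache expires entries. The sketch below shows one possible in-memory cache whose entries never outlive either the configured TTL or the token's own `exp`. The class name `IntrospectionCache` is an assumption, and `hashToken` is a guess at what the helper used in the gateway logic might do (a SHA-256 digest, so raw tokens are not kept as cache keys).

```javascript
const crypto = require("crypto");

// Store a digest rather than the raw token as the cache key.
function hashToken(token) {
    return crypto.createHash("sha256").update(token).digest("hex");
}

class IntrospectionCache {
    constructor(maxTtlSeconds = 30) {
        this.maxTtlMs = maxTtlSeconds * 1000;
        this.entries = new Map(); // digest -> { result, expiresAt }
    }

    // `result` is the introspection response ({ active, sub, scope, exp, ... }).
    set(token, result, now = Date.now()) {
        let expiresAt = now + this.maxTtlMs;
        if (result.active && typeof result.exp === "number") {
            // Never cache an active result past the token's own expiry (exp is in seconds).
            expiresAt = Math.min(expiresAt, result.exp * 1000);
        }
        this.entries.set(hashToken(token), { result, expiresAt });
    }

    // Returns the cached result, or undefined on a miss or an expired entry.
    get(token, now = Date.now()) {
        const key = hashToken(token);
        const entry = this.entries.get(key);
        if (!entry) return undefined;
        if (now >= entry.expiresAt) {
            this.entries.delete(key);
            return undefined;
        }
        return entry.result;
    }
}
```

Lowering `maxTtlSeconds` shortens the window in which a revoked token is still accepted, at the cost of more calls to the identity provider.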

Pattern 3: Session-Based Authentication

For web applications with traditional session-based authentication, the gateway validates session cookies against a shared session store.

Session Validation Flow

Gateway Logic

// Session-Based Authentication - Gateway Middleware
 
async function authenticateRequest(request, response, next) {
    // Extract session ID from secure, httpOnly cookie
    const sessionId = request.cookies[config.sessionCookieName];
    if (!sessionId) {
        return response.status(401).json({ error: "no_session" });
    }
    
    // Validate session ID format (prevent injection attacks)
    if (!isValidSessionIdFormat(sessionId)) {
        return response.status(401).json({ error: "invalid_session_format" });
    }
    
    // Lookup session in distributed store (Redis, Memcached, etc.)
    const session = await sessionStore.get(sessionId);
    
    if (!session) {
        // Clear the invalid cookie
        response.clearCookie(config.sessionCookieName);
        return response.status(401).json({ error: "session_not_found" });
    }
    
    // Check session expiration
    if (Date.now() > session.expiresAt) {
        await sessionStore.delete(sessionId);
        response.clearCookie(config.sessionCookieName);
        return response.status(401).json({ error: "session_expired" });
    }
    
    // Sliding window expiration: extend session on activity
    const newExpiry = Date.now() + config.sessionTtlMs;
    await sessionStore.extend(sessionId, newExpiry);
    
    // Propagate identity
    request.headers["X-Authenticated-User-Id"] = session.userId;
    request.headers["X-Session-Id"] = sessionId;
    
    next();
}
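
The middleware above calls `isValidSessionIdFormat` without defining it. A minimal sketch follows, assuming session IDs are 32 random bytes encoded as base64url (43 characters); that encoding and the `generateSessionId` helper are assumptions for illustration, not something the text specifies.

```javascript
const crypto = require("crypto");

// 32 random bytes encoded as unpadded base64url are always 43 characters long.
const SESSION_ID_PATTERN = /^[A-Za-z0-9_-]{43}$/;

// Hypothetical generator, shown so the format check has something to match.
function generateSessionId() {
    return crypto.randomBytes(32).toString("base64url");
}

// Reject anything that is not shaped like a generated ID before it reaches the session store.
function isValidSessionIdFormat(sessionId) {
    return typeof sessionId === "string" && SESSION_ID_PATTERN.test(sessionId);
}
```

A strict format check keeps malformed or oversized cookie values from being used as store keys.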

Authentication Pattern Comparison
Pattern	Latency	Revocation Speed	IDP Dependency	Best For
Inline JWT Validation	~1ms	Token lifetime (minutes-hours)	Low (JWKS cache)	Microservices, APIs
Token Introspection	10-100ms	Cache TTL (seconds)	High (every validation)	Opaque tokens, strict revocation
Session-Based	1-5ms	Immediate (session deletion)	Medium (session store)	Web applications

Gateway Architecture for High Availability

Redundancy Patterns

Multi-Instance Deployment: Deploy multiple gateway instances behind a load balancer. Each instance must be capable of independently validating tokens without coordination with other instances.

High Availability Gateway Deployment
Kubernetes
# api-gateway-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway
  namespace: edge
spec:
  replicas: 5  # Multiple replicas for redundancy
  selector:
    matchLabels:
      app: api-gateway
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1      # Maintain N-1 instances during updates
      maxSurge: 2
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      # Anti-affinity: spread across failure domains
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: api-gateway
                topologyKey: topology.kubernetes.io/zone
      containers:
        - name: gateway
          image: api-gateway:v2.3.1
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "500m"
              memory: "256Mi"
            limits:
              cpu: "2000m"
              memory: "1Gi"
          # Health checks
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          env:
            - name: JWKS_CACHE_TTL
              value: "300"  # 5 minute JWKS cache
            - name: JWKS_REFRESH_INTERVAL
              value: "60"   # Proactive refresh every minute
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-gateway-pdb
  namespace: edge
spec:
  minAvailable: 3  # Always maintain at least 3 healthy pods
  selector:
    matchLabels:
      app: api-gateway

Stateless Design

Each gateway instance must be completely stateless regarding authentication. This means:

1. No Local Session State: Sessions, if used, must be stored in a distributed store accessible to all instances.

3. No Instance Affinity: Requests from the same client can be routed to any gateway instance. The gateway must not assume it will see previous requests from the same client.

Graceful Degradation

When dependencies (identity provider, session store) become unavailable, the gateway should degrade gracefully rather than failing completely.

Graceful Degradation Logic

Gateway Configuration

// Graceful Degradation Configuration
 
const authConfig = {
    // JWT Validation Degradation
    jwt: {
        // If JWKS refresh fails, continue using cached keys
        useStaleCacheOnRefreshFailure: true,
        staleCacheMaxAge: 3600,  // Accept stale cache for up to 1 hour
        
        // Log warning when operating with stale keys
        alertOnStaleCache: true,
    },
    
    // Token Introspection Degradation
    introspection: {
        // If introspection endpoint is unavailable
        fallbackBehavior: "USE_CACHED_OR_REJECT",  // Options: USE_CACHED_OR_REJECT, REJECT_ALL, PASS_THROUGH
        
        // Maximum age of cached introspection result to accept during outage
        maxCacheAgeOnFailure: 300,  // 5 minutes
        
        // Circuit breaker for introspection endpoint
        circuitBreaker: {
            failureThreshold: 5,
            recoveryTimeout: 30,
            halfOpenRequests: 2,
        },
    },
    
    // Session Store Degradation
    session: {
        // If session store is unavailable
        fallbackBehavior: "REJECT_ALL",  // Sessions require store access
        
        // Connection pool settings for resilience
        pool: {
            minConnections: 5,
            maxConnections: 50,
            connectionTimeout: 500,
            idleTimeout: 30000,
        },
        
        // Read from replicas, write to primary
        readFromReplicas: true,
    },
};
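
The `circuitBreaker` settings above map onto a three-state machine (closed, open, half-open). The sketch below is one possible interpretation of those three parameters. The class and method names are hypothetical, and the recovery timeout is expressed in milliseconds here, which is an assumption because the configuration does not state its unit.

```javascript
// Sketch of a circuit breaker for the introspection endpoint.
class CircuitBreaker {
    constructor({ failureThreshold = 5, recoveryTimeoutMs = 30000, halfOpenRequests = 2 } = {}) {
        this.failureThreshold = failureThreshold;
        this.recoveryTimeoutMs = recoveryTimeoutMs;
        this.halfOpenRequests = halfOpenRequests;
        this.state = "CLOSED";
        this.failures = 0;        // consecutive failures while CLOSED
        this.openedAt = 0;        // timestamp of the last trip to OPEN
        this.trialsLeft = 0;      // probe requests still allowed while HALF_OPEN
        this.trialSuccesses = 0;  // successful probes while HALF_OPEN
    }

    // Returns true if a call to the dependency may be attempted now.
    canRequest(now = Date.now()) {
        if (this.state === "OPEN" && now - this.openedAt >= this.recoveryTimeoutMs) {
            this.state = "HALF_OPEN";
            this.trialsLeft = this.halfOpenRequests;
            this.trialSuccesses = 0;
        }
        if (this.state === "CLOSED") return true;
        if (this.state === "HALF_OPEN" && this.trialsLeft > 0) {
            this.trialsLeft -= 1;
            return true;
        }
        return false;
    }

    recordSuccess() {
        if (this.state === "HALF_OPEN") {
            this.trialSuccesses += 1;
            if (this.trialSuccesses >= this.halfOpenRequests) {
                this.state = "CLOSED"; // all probes succeeded: resume normal traffic
                this.failures = 0;
            }
        } else {
            this.failures = 0;
        }
    }

    recordFailure(now = Date.now()) {
        this.failures += 1;
        // A failed probe, or too many consecutive failures, opens the circuit.
        if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
            this.state = "OPEN";
            this.openedAt = now;
            this.failures = 0;
        }
    }
}
```

While the breaker reports that no request may be made, the gateway would apply the configured `fallbackBehavior` instead of waiting on a dependency that is known to be failing.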

Degradation Trade-offs

Identity Propagation Strategies

Strategy 1: Header-Based Propagation

The gateway adds custom headers containing identity information. Services read these headers and trust them implicitly.

Header-Based Identity Propagation

// Gateway: Inject identity headers after authentication
 
function propagateIdentity(request, claims) {
    // Set standard identity headers
    request.headers["X-User-Id"] = claims.sub;
    request.headers["X-User-Email"] = claims.email || "";
    request.headers["X-User-Roles"] = (claims.roles || []).join(",");
    
    // Set organization/tenant context (for multi-tenant systems)
    request.headers["X-Tenant-Id"] = claims.tenant_id || "";
    request.headers["X-Org-Id"] = claims.org_id || "";
    
    // Set token metadata
    request.headers["X-Token-Issuer"] = claims.iss;
    // 'scope' may be a space-delimited string (OAuth 2.0) or an array, depending on the issuer
    const scopes = Array.isArray(claims.scope)
        ? claims.scope
        : String(claims.scope || "").split(" ").filter(Boolean);
    request.headers["X-Token-Scopes"] = scopes.join(",");
    request.headers["X-Token-Expires"] = String(claims.exp ?? "");
    
    // Request correlation (Node.js exposes incoming header names in lowercase)
    request.headers["X-Request-Id"] = request.headers["x-request-id"] || generateUUID();
    request.headers["X-Correlation-Id"] = request.headers["x-correlation-id"] || generateUUID();
    
    // IMPORTANT: Remove the original Authorization header or replace it
    // to prevent downstream services from trusting the original token
    delete request.headers["authorization"];
}
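
Because downstream services trust these headers implicitly, any identity header supplied by the client has to be discarded before the gateway sets its own values. The sketch below shows one way to do that. The prefix list is an assumption derived from the header names used in this section, and `stripClientIdentityHeaders` is a hypothetical name.

```javascript
// Header name prefixes reserved for gateway-issued identity context (assumed list).
const IDENTITY_HEADER_PREFIXES = ["x-user-", "x-tenant-", "x-org-", "x-token-", "x-authenticated-"];

// Return a copy of the incoming headers with client-supplied identity headers removed.
// Names are lowercased so that "X-User-Id" and "x-user-id" are treated alike.
function stripClientIdentityHeaders(headers) {
    const clean = {};
    for (const [name, value] of Object.entries(headers)) {
        const lower = name.toLowerCase();
        const reserved = IDENTITY_HEADER_PREFIXES.some((prefix) => lower.startsWith(prefix));
        if (!reserved) {
            clean[lower] = value;
        }
    }
    return clean;
}
```

Running this before authentication means a request that arrives with `X-User-Id: admin` cannot carry that value through to a backend service.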

Critical Security Requirement

Strategy 2: Internal Token Minting

Instead of trusting headers, the gateway mints a new, cryptographically signed internal token for each request. Services validate this token, providing defense-in-depth.

Internal Token Minting

// Gateway: Mint internal JWT for downstream propagation
 
function mintInternalToken(claims, originalTokenExpiry) {
    // Internal token has short lifetime (5 minutes max)
    // or remaining lifetime of original token, whichever is shorter
    const expiresIn = Math.min(
        300,  // 5 minutes
        originalTokenExpiry - Math.floor(Date.now() / 1000)
    );
    
    const internalClaims = {
        // Standard claims
        iss: "gateway.internal",
        sub: claims.sub,
        aud: "internal-services",
        iat: Math.floor(Date.now() / 1000),
        exp: Math.floor(Date.now() / 1000) + expiresIn,
        
        // Custom claims (propagated from original token)
        email: claims.email,
        roles: claims.roles,
        tenant_id: claims.tenant_id,
        scopes: claims.scope,
        
        // Request context
        request_id: generateUUID(),
        original_issuer: claims.iss,
    };
    
    // Sign with gateway's private key (rotated regularly)
    return jwt.sign(internalClaims, gatewayPrivateKey, {
        algorithm: "ES256",  // ECDSA for fast signing
        keyid: currentKeyId,
    });
}
 
function propagateIdentity(request, claims) {
    const internalToken = mintInternalToken(claims, claims.exp);
    request.headers["Authorization"] = `Bearer ${internalToken}`;
}
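
On the receiving side, each service checks the internal token before trusting its claims. The sketch below shows the signing and verification steps with Node's built-in `crypto` module only, so that it runs without dependencies; a production service would more likely use a vetted JWT library. The function names are assumptions, and the `iss` and `aud` values are taken from the minting code above.

```javascript
const crypto = require("crypto");

const b64url = (value) => Buffer.from(value).toString("base64url");

// Gateway side (sketch): sign claims as a compact ES256 JWT.
function signEs256(claims, privateKey, kid) {
    const header = { alg: "ES256", typ: "JWT", kid };
    const signingInput = `${b64url(JSON.stringify(header))}.${b64url(JSON.stringify(claims))}`;
    const signature = crypto.sign("sha256", Buffer.from(signingInput), {
        key: privateKey,
        dsaEncoding: "ieee-p1363", // JWS uses the raw r||s form, not DER
    });
    return `${signingInput}.${signature.toString("base64url")}`;
}

// Service side (sketch): verify signature, issuer, audience, and expiry.
// `now` is in seconds and injectable for testing.
function verifyInternalToken(token, publicKey, now = Math.floor(Date.now() / 1000)) {
    const [headerPart, payloadPart, signaturePart] = String(token).split(".");
    if (!headerPart || !payloadPart || !signaturePart) {
        throw new Error("Malformed token");
    }
    const header = JSON.parse(Buffer.from(headerPart, "base64url").toString("utf8"));
    if (header.alg !== "ES256") {
        throw new Error("Unexpected algorithm"); // never trust the header's choice
    }
    const valid = crypto.verify(
        "sha256",
        Buffer.from(`${headerPart}.${payloadPart}`),
        { key: publicKey, dsaEncoding: "ieee-p1363" },
        Buffer.from(signaturePart, "base64url")
    );
    if (!valid) throw new Error("Bad signature");
    const claims = JSON.parse(Buffer.from(payloadPart, "base64url").toString("utf8"));
    if (claims.iss !== "gateway.internal") throw new Error("Unexpected issuer");
    if (claims.aud !== "internal-services") throw new Error("Unexpected audience");
    if (typeof claims.exp !== "number" || claims.exp <= now) throw new Error("Token expired");
    return claims;
}
```

Services only need the gateway's public key, so a compromised service cannot mint tokens that other services would accept.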

Identity Propagation Comparison
Strategy	Security	Performance	Complexity	Use Case
Headers Only	Requires network isolation	Fastest (no crypto)	Low	Trusted internal network
Internal Token	Cryptographic verification	~1ms signing overhead	Medium	Zero-trust internal
Pass-Through	Services validate original	Varies	Low	Services need full claims

Multi-Tenant Authentication

Tenant Identification

The gateway must determine which tenant a request belongs to. Common approaches:

1. Subdomain-Based Tenancy

tenant-a.api.example.com/users
tenant-b.api.example.com/users

2. Path-Based Tenancy

api.example.com/tenants/tenant-a/users
api.example.com/tenants/tenant-b/users

3. Header-Based Tenancy

GET /users
X-Tenant-ID: tenant-a

4. Token-Embedded Tenancy

The tenant ID is embedded in the access token as a claim.

Multi-Tenant Authentication Logic

Gateway Middleware

// Multi-Tenant Gateway Authentication
 
async function authenticateMultiTenant(request, response, next) {
    // Step 1: Identify tenant from request
    const tenantId = resolveTenantId(request);
    if (!tenantId) {
        return response.status(400).json({
            error: "tenant_not_identified",
            message: "Unable to determine tenant context"
        });
    }
    
    // Step 2: Validate tenant exists and is active
    const tenant = await tenantRegistry.get(tenantId);
    if (!tenant) {
        return response.status(404).json({ error: "tenant_not_found" });
    }
    if (tenant.status !== "active") {
        return response.status(403).json({
            error: "tenant_suspended",
            message: "Tenant account is suspended"
        });
    }
    
    // Step 3: Extract and validate token
    const token = extractBearerToken(request);
    if (!token) {
        return response.status(401).json({ error: "missing_token" });
    }
    
    // Step 4: Get tenant-specific JWKS configuration
    // (tenants may have different identity providers)
    const idpConfig = tenant.identityProvider || defaultIdpConfig;
    const jwks = await getJwksForTenant(tenantId, idpConfig);
    
    // Step 5: Validate token
    let claims;
    try {
        claims = await validateToken(token, jwks, idpConfig);
    } catch (err) {
        return response.status(401).json({ error: "invalid_token" });
    }
    
    // Step 6: Verify token belongs to this tenant
    // Critical: Prevent cross-tenant authentication. A token with no tenant_id
    // claim is rejected too (this assumes every issuer includes the claim).
    if (claims.tenant_id !== tenantId) {
        // Token was issued for a different tenant!
        logSecurityAlert("cross_tenant_attempt", {
            tokenTenant: claims.tenant_id,
            requestTenant: tenantId,
            userId: claims.sub,
        });
        return response.status(403).json({
            error: "tenant_mismatch",
            message: "Token not valid for this tenant"
        });
    }
    
    // Step 7: Propagate identity with tenant context
    request.headers["X-Tenant-Id"] = tenantId;
    request.headers["X-User-Id"] = claims.sub;
    request.headers["X-Tenant-Plan"] = tenant.plan;  // For rate limiting, features
    
    next();
}
 
function resolveTenantId(request) {
    // Priority 1: Subdomain (Host may be absent or carry a port; normalize case)
    const host = (request.headers["host"] || "").split(":")[0].toLowerCase();
    const subdomainMatch = host.match(/^([a-z0-9-]+)\.api\.example\.com$/);
    if (subdomainMatch) {
        return subdomainMatch[1];
    }
    
    // Priority 2: Path prefix
    const pathMatch = request.path.match(/^\/tenants\/([a-z0-9-]+)\//);
    if (pathMatch) {
        return pathMatch[1];
    }
    
    // Priority 3: Header
    if (request.headers["x-tenant-id"]) {
        return request.headers["x-tenant-id"];
    }
    
    return null;
}

Tenant Isolation is Critical

Performance Optimization

JWKS Caching and Preloading

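JWKS (JSON Web Key Set) contains the public keys needed to verify JWT signatures. Fetching the JWKS on every request would add a network round trip to each request and tie gateway availability to the identity provider, so the keys are cached and refreshed in the background.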

Optimized JWKS Cache
TypeScript
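// High-Performance JWKS Cache with Proactive Refresh
import * as crypto from "node:crypto";

// Shapes inferred from how the class below uses them
interface JwksCacheEntry {
    keys: Map<string, crypto.KeyObject>;
    fetchedAt: number;   // epoch milliseconds
    expiresAt: number;   // epoch milliseconds
    stale: boolean;
}

interface JwksCacheConfig {
    cacheTtl: number;      // milliseconds
    fetchTimeout: number;  // milliseconds
}
 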
class JwksCacheManager {
    private cache: Map<string, JwksCacheEntry> = new Map();
    private refreshLock: Map<string, Promise<void>> = new Map();
    
    constructor(private readonly config: JwksCacheConfig) {}
    
    async getSigningKey(issuer: string, kid: string): Promise<crypto.KeyObject> {
        let entry = this.cache.get(issuer);
        
        // Cache miss or expired: fetch JWKS
        if (!entry || this.isExpired(entry)) {
            await this.refreshJwks(issuer);
            entry = this.cache.get(issuer);
        }
        // Near expiry: trigger background refresh (non-blocking)
        else if (this.isNearExpiry(entry)) {
            this.backgroundRefresh(issuer);
        }
        
        const key = entry?.keys.get(kid);
        if (!key) {
            throw new Error(`Unknown key ID: ${kid}`);
        }
        
        return key;
    }
    
    private async refreshJwks(issuer: string): Promise<void> {
        // Coalesce concurrent refresh requests
        let existingRefresh = this.refreshLock.get(issuer);
        if (existingRefresh) {
            return existingRefresh;
        }
        
        const refreshPromise = this.doRefresh(issuer);
        this.refreshLock.set(issuer, refreshPromise);
        
        try {
            await refreshPromise;
        } finally {
            this.refreshLock.delete(issuer);
        }
    }
    
    private async doRefresh(issuer: string): Promise<void> {
        // discoverJwksUri and emitAlert are assumed to be implemented elsewhere in this class
        let response: Response;
        try {
            const jwksUri = await this.discoverJwksUri(issuer);
            response = await fetch(jwksUri, {
                headers: { "Accept": "application/json" },
                signal: AbortSignal.timeout(this.config.fetchTimeout),
            });
            if (!response.ok) {
                throw new Error(`JWKS fetch failed: ${response.status}`);
            }
        } catch (err) {
            // Keep stale cache if refresh fails (HTTP error, network error, or timeout)
            const existing = this.cache.get(issuer);
            if (existing) {
                existing.stale = true;
                this.emitAlert("jwks_refresh_failed", { issuer });
                return;
            }
            throw err;
        }
        
        const jwks = await response.json();
        const keys = new Map<string, crypto.KeyObject>();
        
        for (const jwk of jwks.keys) {
            const keyObject = crypto.createPublicKey({ key: jwk, format: "jwk" });
            keys.set(jwk.kid, keyObject);
        }
        
        this.cache.set(issuer, {
            keys,
            fetchedAt: Date.now(),
            expiresAt: Date.now() + this.config.cacheTtl,
            stale: false,
        });
    }
    
    private backgroundRefresh(issuer: string): void {
        // Non-blocking refresh in background
        setImmediate(() => {
            this.refreshJwks(issuer).catch(err => {
                console.warn(`Background JWKS refresh failed: ${err.message}`);
            });
        });
    }
    
    private isExpired(entry: JwksCacheEntry): boolean {
        return Date.now() > entry.expiresAt && !entry.stale;
    }
    
    private isNearExpiry(entry: JwksCacheEntry): boolean {
        // Refresh when 75% through TTL
        const threshold = entry.fetchedAt + (this.config.cacheTtl * 0.75);
        return Date.now() > threshold;
    }
}

Algorithm Selection for Signature Verification

The choice of JWT signing algorithm significantly impacts verification performance:

JWT Algorithm Performance Comparison
Algorithm	Type	Verify Time (~, illustrative; varies with hardware and library)	Security	Recommendation
HS256	Symmetric (HMAC)	~2μs	Shared secret risk	Internal only, never for external tokens
RS256	Asymmetric (RSA)	~50μs	Strong (2048+ bit)	Standard choice, widely supported
RS384/RS512	Asymmetric (RSA)	~60-80μs	Stronger	When extra security needed
ES256	Asymmetric (ECDSA)	~30μs	Strong (P-256)	Smaller keys and signatures, faster signing than RSA; benchmark verification on your own stack
ES384/ES512	Asymmetric (ECDSA)	~40-60μs	Stronger	When extra security needed
EdDSA (Ed25519)	Asymmetric	~20μs	Modern, strong	Best performance, newer support

Performance Tip

Connection Pooling and Keep-Alive

For authentication patterns requiring external calls (introspection, session store), connection management is critical.

Connection Pool Configuration
Node.js
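// Assumes Node's built-in https module and the "ioredis" package
const https = require("https");
const Redis = require("ioredis");

// HTTP Agent for Identity Provider Connections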
const idpAgent = new https.Agent({
    keepAlive: true,
    keepAliveMsecs: 60000,
    maxSockets: 100,           // Max connections per host
    maxFreeSockets: 20,        // Keep idle connections
    timeout: 5000,
    scheduling: 'fifo',        // First-in-first-out for consistent latency
});
 
// Redis Client for Session Store
const sessionRedis = new Redis.Cluster([
    { host: 'redis-node-1', port: 6379 },
    { host: 'redis-node-2', port: 6379 },
    { host: 'redis-node-3', port: 6379 },
], {
    redisOptions: {
        connectTimeout: 500,
        commandTimeout: 100,     // Fast fail for auth path
        enableReadyCheck: true,
        lazyConnect: false,
    },
    scaleReads: 'slave',         // Read from replicas
    slotsRefreshTimeout: 2000,
    clusterRetryStrategy: (times) => Math.min(100 * times, 1000),
});
