In a microservices architecture, every service potentially exposes endpoints that require authentication. Without a centralized approach, each service must independently implement authentication logic—validating tokens, checking credentials, managing sessions, and handling edge cases like token expiration and revocation.
This distributed authentication model creates several fundamental problems:
Code Duplication: Every service contains nearly identical authentication code, multiplied across potentially hundreds of services.
Inconsistent Security: Different teams implement authentication differently, leading to security gaps. One service might properly validate JWT signatures while another skips expiration checks.
Operational Overhead: Rotating secrets, updating authentication libraries, or patching vulnerabilities requires coordinated changes across all services.
Latency Multiplication: Each service independently calling identity providers creates redundant network requests and increased latency.
Centralized authentication at the API Gateway solves these problems by consolidating authentication logic at a single, well-audited entry point. The gateway authenticates all incoming requests before they reach backend services, allowing services to trust that authenticated context has already been established.
Centralized authentication establishes a clear security perimeter. Every request entering your system passes through the gateway, which acts as a checkpoint. Backend services operate behind this perimeter and can focus on authorization (what can authenticated users do?) rather than authentication (who is making this request?).
Centralized authentication at the gateway layer requires careful architectural consideration. The gateway becomes a critical component in the request path—every authenticated request depends on it. This section examines the foundational architecture.
The API Gateway sits at the edge of your system, receiving all inbound requests before they reach internal services. In its authentication role, the gateway performs several key functions:
1. Credential Extraction: The gateway extracts authentication credentials from incoming requests, most commonly a bearer token carried in the Authorization header.
2. Token Validation: For token-based authentication (the dominant model in modern systems), the gateway validates the token before the request goes any further, checking its signature, expiration, issuer, and audience.
3. Identity Resolution: Once the token is validated, the gateway resolves it to an identity, typically a user ID or service account. This identity becomes the security context for the request.
4. Context Propagation: The gateway propagates the authenticated identity to downstream services, typically via request headers (for example X-User-ID, X-Tenant-ID).
This flow illustrates a critical concept: trust boundaries. The external zone (clients) is untrusted, so all requests must be authenticated. The internal zone (services behind the gateway) is trusted, so services can rely on the gateway having authenticated requests. This trust model simplifies backend services but requires absolute confidence in the gateway's authentication logic.
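The trust boundary only holds if clients cannot set identity headers themselves. A minimal sketch of stripping such headers at the edge, before authentication runs, might look like the following. The prefix list and the function name `stripIdentityHeaders` are assumptions made for illustration, not part of any particular gateway product.

```javascript
// Hypothetical list of header prefixes reserved for gateway-set identity.
const RESERVED_PREFIXES = ["x-user-", "x-tenant-", "x-authenticated-"];

// Returns a copy of the inbound headers (names lowercased) with any
// client-supplied identity headers removed, so only the gateway can set them.
function stripIdentityHeaders(headers) {
  const clean = {};
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (!RESERVED_PREFIXES.some((prefix) => lower.startsWith(prefix))) {
      clean[lower] = value;
    }
  }
  return clean;
}

// A client tries to smuggle in an identity header alongside its token.
const inbound = {
  Authorization: "Bearer abc",
  "X-User-ID": "admin",
  Accept: "application/json",
};
console.log(stripIdentityHeaders(inbound));
```

The forged `X-User-ID` value is dropped, while the Authorization header survives so that normal token validation can proceed.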
Centralized authentication can be implemented through several distinct patterns, each with trade-offs in latency, security, and operational complexity. Understanding these patterns is essential for making appropriate architectural decisions.
In inline validation, the gateway validates tokens synchronously as part of request processing. This is the most common pattern for JWT-based authentication.
```javascript
// Inline Token Validation - Gateway Middleware
// decodeJwtHeader, jwksCache, verifyAndDecode, TokenExpiredError, config and
// generateRequestId are assumed to be provided elsewhere in the gateway.
async function authenticateRequest(request, response, next) {
  // Step 1: Extract token from Authorization header
  const authHeader = request.headers["authorization"];
  if (!authHeader || !authHeader.startsWith("Bearer ")) {
    return response.status(401).json({
      error: "missing_token",
      message: "Authorization header with Bearer token required"
    });
  }
  const token = authHeader.substring(7);

  // Step 2: Decode token header (without verifying signature yet)
  let header;
  try {
    header = decodeJwtHeader(token);
  } catch (err) {
    return response.status(401).json({
      error: "invalid_token",
      message: "Token is malformed"
    });
  }

  // Step 3: Retrieve signing key based on the 'kid' (key ID) header parameter
  // JWKS is cached locally with periodic refresh (e.g., every 5 minutes)
  let signingKey = jwksCache.getKey(header.kid);
  if (!signingKey) {
    // Refresh the JWKS and retry once
    await jwksCache.refresh();
    signingKey = jwksCache.getKey(header.kid);
    if (!signingKey) {
      return response.status(401).json({
        error: "unknown_signing_key",
        message: "Token signed by unknown key"
      });
    }
  }

  // Step 4: Verify signature and decode claims
  let claims;
  try {
    claims = verifyAndDecode(token, signingKey, {
      algorithms: ["RS256"],             // Only allow expected algorithms
      issuer: config.expectedIssuer,     // Verify 'iss' claim
      audience: config.expectedAudience  // Verify 'aud' claim
    });
  } catch (err) {
    if (err instanceof TokenExpiredError) {
      return response.status(401).json({
        error: "token_expired",
        message: "Access token has expired"
      });
    }
    return response.status(401).json({
      error: "invalid_token",
      message: "Token validation failed"
    });
  }

  // Step 5: Additional claim validation
  if (!claims.sub) {
    return response.status(401).json({
      error: "invalid_claims",
      message: "Token missing subject claim"
    });
  }

  // Step 6: Propagate identity to downstream services
  // 'scope' is usually a space-delimited string; accept an array as well
  const scopes = Array.isArray(claims.scope)
    ? claims.scope
    : String(claims.scope || "").split(" ").filter(Boolean);
  request.headers["X-Authenticated-User-Id"] = claims.sub;
  request.headers["X-Authenticated-User-Email"] = claims.email || "";
  request.headers["X-Token-Scopes"] = scopes.join(",");
  request.headers["X-Request-Id"] = generateRequestId();

  // Continue to routing
  next();
}
```
Characteristics of Inline Validation: roughly 1ms of added latency per request, low dependency on the identity provider (only the JWKS cache needs refreshing), and revocation that takes effect only when the token expires.
For opaque tokens (non-JWT access tokens), the gateway must call the identity provider's introspection endpoint to validate tokens. This pattern is common with OAuth2 servers that issue reference tokens.
```javascript
// Token Introspection - Gateway Middleware
// extractBearerToken, hashToken, tokenCache, httpClient, config and
// propagateIdentity are assumed to be provided elsewhere in the gateway.
async function authenticateRequest(request, response, next) {
  const token = extractBearerToken(request);
  if (!token) {
    return response.status(401).json({ error: "missing_token" });
  }

  // Check token cache first (avoid introspection call if recently validated)
  const cacheKey = hashToken(token);
  const cachedResult = tokenCache.get(cacheKey);
  if (cachedResult) {
    if (cachedResult.active) {
      propagateIdentity(request, cachedResult.claims);
      return next();
    }
    return response.status(401).json({ error: "inactive_token" });
  }

  // Call introspection endpoint
  let introspectionResult;
  try {
    introspectionResult = await httpClient.post(
      config.introspectionEndpoint,
      { token: token, token_type_hint: "access_token" },
      {
        auth: {
          username: config.clientId,
          password: config.clientSecret
        },
        timeout: 500, // Aggressive timeout
        retry: { attempts: 2, delay: 50 }
      }
    );
  } catch (err) {
    // Identity provider unreachable or timed out: fail closed
    return response.status(503).json({ error: "introspection_unavailable" });
  }

  // Keep only the claims the gateway needs, so cached and fresh
  // results propagate the same shape
  const claims = {
    sub: introspectionResult.sub,
    scope: introspectionResult.scope,
    exp: introspectionResult.exp
  };

  // Cache the result (with short TTL to handle revocation)
  tokenCache.set(cacheKey, {
    active: introspectionResult.active,
    claims
  }, { ttl: 30 }); // 30 second cache

  if (!introspectionResult.active) {
    return response.status(401).json({ error: "inactive_token" });
  }

  propagateIdentity(request, claims);
  next();
}
```
Characteristics of Token Introspection: typically 10-100ms of added latency on a cache miss, high dependency on the identity provider, and revocation that takes effect within the cache TTL.
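One detail worth making explicit for introspection caching: a cached "active" result should not outlive the token itself. The sketch below caps the cache TTL at the token's remaining lifetime. The function name `cacheTtlSeconds` and the 30-second default are illustrative assumptions; `active` and `exp` are the standard introspection response fields.

```javascript
// Returns how long (in seconds) an introspection result may be cached.
// maxTtl bounds how long a revoked token can keep working from cache.
function cacheTtlSeconds(introspection, nowSeconds, maxTtl = 30) {
  // Inactive results, or results without an expiry, just use the cap.
  if (!introspection.active || typeof introspection.exp !== "number") {
    return maxTtl;
  }
  // Active results: never cache past the token's own expiry.
  const remaining = introspection.exp - nowSeconds;
  return Math.max(0, Math.min(maxTtl, remaining));
}

console.log(cacheTtlSeconds({ active: true, exp: 1010 }, 1000)); // 10
console.log(cacheTtlSeconds({ active: true, exp: 2000 }, 1000)); // 30
```

A TTL of 0 signals that the result should not be cached at all, which happens when the token expires at or before the current second.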
For web applications with traditional session-based authentication, the gateway validates session cookies against a shared session store.
```javascript
// Session-Based Authentication - Gateway Middleware
// sessionStore, isValidSessionIdFormat and config are assumed to be
// provided elsewhere in the gateway.
async function authenticateRequest(request, response, next) {
  // Extract session ID from secure, httpOnly cookie
  const sessionId = request.cookies[config.sessionCookieName];
  if (!sessionId) {
    return response.status(401).json({ error: "no_session" });
  }

  // Validate session ID format (prevent injection attacks)
  if (!isValidSessionIdFormat(sessionId)) {
    return response.status(401).json({ error: "invalid_session_format" });
  }

  // Lookup session in distributed store (Redis, Memcached, etc.)
  const session = await sessionStore.get(sessionId);
  if (!session) {
    // Clear the invalid cookie
    response.clearCookie(config.sessionCookieName);
    return response.status(401).json({ error: "session_not_found" });
  }

  // Check session expiration
  if (Date.now() > session.expiresAt) {
    await sessionStore.delete(sessionId);
    response.clearCookie(config.sessionCookieName);
    return response.status(401).json({ error: "session_expired" });
  }

  // Sliding window expiration: extend session on activity
  const newExpiry = Date.now() + config.sessionTtlMs;
  await sessionStore.extend(sessionId, newExpiry);

  // Propagate identity
  request.headers["X-Authenticated-User-Id"] = session.userId;
  request.headers["X-Session-Id"] = sessionId;

  next();
}
```

| Pattern | Latency | Revocation Speed | IDP Dependency | Best For |
|---|---|---|---|---|
| Inline JWT Validation | ~1ms | Token lifetime (minutes-hours) | Low (JWKS cache) | Microservices, APIs |
| Token Introspection | 10-100ms | Cache TTL (seconds) | High (every validation) | Opaque tokens, strict revocation |
| Session-Based | 1-5ms | Immediate (session deletion) | Medium (session store) | Web applications |
When authentication is centralized at the gateway, the gateway becomes a single point of failure. A gateway outage means no requests can be authenticated, effectively taking down the entire system. Designing for high availability is non-negotiable.
Multi-Instance Deployment: Deploy multiple gateway instances behind a load balancer. Each instance must be capable of independently validating tokens without coordination with other instances.
```yaml
# api-gateway-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway
  namespace: edge
spec:
  replicas: 5  # Multiple replicas for redundancy
  selector:
    matchLabels:
      app: api-gateway
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # Maintain N-1 instances during updates
      maxSurge: 2
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      # Anti-affinity: spread across failure domains
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: api-gateway
                topologyKey: topology.kubernetes.io/zone
      containers:
        - name: gateway
          image: api-gateway:v2.3.1
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "500m"
              memory: "256Mi"
            limits:
              cpu: "2000m"
              memory: "1Gi"
          # Health checks
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          env:
            - name: JWKS_CACHE_TTL
              value: "300"  # 5 minute JWKS cache
            - name: JWKS_REFRESH_INTERVAL
              value: "60"   # Proactive refresh every minute
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-gateway-pdb
  namespace: edge
spec:
  minAvailable: 3  # Always maintain at least 3 healthy pods
  selector:
    matchLabels:
      app: api-gateway
```
Each gateway instance must be completely stateless regarding authentication. This means:
1. No Local Session State: Sessions, if used, must be stored in a distributed store accessible to all instances.
2. Cached JWKS Must Be Eventually Consistent: Each instance maintains its own JWKS cache, refreshing independently. This means a key rotation might briefly cause some instances to reject tokens while others accept them—typically acceptable given the short refresh interval.
3. No Instance Affinity: Requests from the same client can be routed to any gateway instance. The gateway must not assume it will see previous requests from the same client.
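Because each instance refreshes its JWKS cache independently, an instance that meets an unfamiliar key ID after a rotation typically refetches the key set. A rough sketch of guarding that refetch with a cooldown, so a flood of tokens carrying bogus key IDs cannot hammer the identity provider, is shown below. `RefreshGate`, `lookupKey`, and the `fetchKeys` callback are hypothetical names used only for illustration.

```javascript
// Limits how often an unknown `kid` may trigger a JWKS refetch.
class RefreshGate {
  constructor(cooldownMs) {
    this.cooldownMs = cooldownMs;
    this.lastRefreshAt = -Infinity;
  }

  // Returns true (and records the time) if a refresh is allowed now.
  tryAcquire(nowMs) {
    if (nowMs - this.lastRefreshAt < this.cooldownMs) return false;
    this.lastRefreshAt = nowMs;
    return true;
  }
}

// Per-instance lookup: refetch only on a miss, and only when the gate allows.
// `cache` is { keys: Map<kid, key> }; `fetchKeys` resolves to a fresh Map.
async function lookupKey(kid, cache, gate, fetchKeys, nowMs = Date.now()) {
  if (!cache.keys.has(kid) && gate.tryAcquire(nowMs)) {
    cache.keys = await fetchKeys();
  }
  return cache.keys.get(kid) ?? null;
}

const gate = new RefreshGate(10000);
console.log(gate.tryAcquire(0));     // true: first refresh is allowed
console.log(gate.tryAcquire(5000));  // false: still inside the cooldown
console.log(gate.tryAcquire(10000)); // true: cooldown has elapsed
```

Each instance holds its own gate and cache, so no coordination between instances is needed, which keeps the gateway stateless in the sense described above.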
When dependencies (identity provider, session store) become unavailable, the gateway should degrade gracefully rather than failing completely.
```javascript
// Graceful Degradation Configuration
const authConfig = {
  // JWT Validation Degradation
  jwt: {
    // If JWKS refresh fails, continue using cached keys
    useStaleCacheOnRefreshFailure: true,
    staleCacheMaxAge: 3600, // Accept stale cache for up to 1 hour

    // Log warning when operating with stale keys
    alertOnStaleCache: true,
  },

  // Token Introspection Degradation
  introspection: {
    // If introspection endpoint is unavailable
    // Options: USE_CACHED_OR_REJECT, REJECT_ALL, PASS_THROUGH
    fallbackBehavior: "USE_CACHED_OR_REJECT",

    // Maximum age of cached introspection result to accept during outage
    maxCacheAgeOnFailure: 300, // 5 minutes

    // Circuit breaker for introspection endpoint
    circuitBreaker: {
      failureThreshold: 5,
      recoveryTimeout: 30,
      halfOpenRequests: 2,
    },
  },

  // Session Store Degradation
  session: {
    // If session store is unavailable
    fallbackBehavior: "REJECT_ALL", // Sessions require store access

    // Connection pool settings for resilience
    pool: {
      minConnections: 5,
      maxConnections: 50,
      connectionTimeout: 500,
      idleTimeout: 30000,
    },

    // Read from replicas, write to primary
    readFromReplicas: true,
  },
};
```
Every degradation strategy involves a security vs. availability trade-off. Using stale JWKS means a key compromise takes longer to propagate. Using cached introspection results means revoked tokens remain valid briefly. Document these trade-offs explicitly and ensure they align with your organization's risk tolerance.
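The circuit breaker settings in the configuration (a failure threshold, a recovery timeout, and a number of half-open probe requests) can be illustrated with a small state machine. This is a simplified sketch under stated assumptions, not the behavior of any specific library: the class name is hypothetical, times are passed in as milliseconds, and a single successful probe closes the circuit.

```javascript
// Minimal circuit breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED.
class CircuitBreaker {
  constructor({ failureThreshold = 5, recoveryTimeoutMs = 30000, halfOpenRequests = 2 } = {}) {
    this.failureThreshold = failureThreshold;
    this.recoveryTimeoutMs = recoveryTimeoutMs;
    this.halfOpenRequests = halfOpenRequests;
    this.state = "CLOSED";
    this.failures = 0;
    this.openedAt = 0;
    this.probes = 0;
  }

  // Decide whether a call to the dependency may be attempted right now.
  allowRequest(nowMs) {
    if (this.state === "OPEN") {
      if (nowMs - this.openedAt < this.recoveryTimeoutMs) return false;
      this.state = "HALF_OPEN"; // recovery window elapsed: allow probes
      this.probes = 0;
    }
    if (this.state === "HALF_OPEN") {
      if (this.probes >= this.halfOpenRequests) return false;
      this.probes += 1;
    }
    return true;
  }

  recordSuccess() {
    this.state = "CLOSED";
    this.failures = 0;
  }

  recordFailure(nowMs) {
    this.failures += 1;
    // A failed probe, or too many failures while closed, opens the circuit.
    if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
      this.state = "OPEN";
      this.openedAt = nowMs;
    }
  }
}

const breaker = new CircuitBreaker({ failureThreshold: 2, recoveryTimeoutMs: 1000, halfOpenRequests: 1 });
breaker.recordFailure(0);
breaker.recordFailure(0);
console.log(breaker.state); // OPEN
```

While the circuit is open, the gateway would apply the configured fallback (for example, using a cached introspection result or rejecting) instead of waiting on a timeout for every request.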
Once the gateway authenticates a request, it must communicate the authenticated identity to downstream services. This is called identity propagation. Several strategies exist, each with distinct security and operational characteristics.
The gateway adds custom headers containing identity information. Services read these headers and trust them implicitly.
```javascript
// Gateway: Inject identity headers after authentication
// generateUUID is assumed to be provided elsewhere in the gateway.
function propagateIdentity(request, claims) {
  // 'scope' may arrive as a space-delimited string or as an array
  const scopes = Array.isArray(claims.scope)
    ? claims.scope
    : String(claims.scope || "").split(" ").filter(Boolean);

  // Set standard identity headers
  request.headers["X-User-Id"] = claims.sub;
  request.headers["X-User-Email"] = claims.email || "";
  request.headers["X-User-Roles"] = (claims.roles || []).join(",");

  // Set organization/tenant context (for multi-tenant systems)
  request.headers["X-Tenant-Id"] = claims.tenant_id || "";
  request.headers["X-Org-Id"] = claims.org_id || "";

  // Set token metadata
  request.headers["X-Token-Issuer"] = claims.iss || "";
  request.headers["X-Token-Scopes"] = scopes.join(",");
  request.headers["X-Token-Expires"] = String(claims.exp ?? "");

  // Request correlation (Node.js exposes incoming header names in lowercase)
  request.headers["X-Request-Id"] =
    request.headers["x-request-id"] || generateUUID();
  request.headers["X-Correlation-Id"] =
    request.headers["x-correlation-id"] || generateUUID();

  // IMPORTANT: Remove the original Authorization header or replace it
  // to prevent downstream services from trusting the original token
  delete request.headers["authorization"];
}
```
Header-based propagation is only secure if no external traffic can reach services directly. If an attacker can bypass the gateway and send requests directly to services, they can forge identity headers. Use network policies, service mesh, or firewall rules to ensure services only receive traffic from the gateway.
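On the receiving side, a downstream service turns the gateway-set headers back into an identity object. The sketch below assumes the header names used by `propagateIdentity` (as seen lowercased by a Node.js service); `identityFromHeaders` itself is a hypothetical helper.

```javascript
// Builds a request identity from gateway-set headers.
// Returns null when the gateway did not attach a user ID.
function identityFromHeaders(headers) {
  const userId = headers["x-user-id"];
  if (!userId) return null;

  // Comma-separated header value -> array, ignoring empty entries.
  const list = (value) => (value ? value.split(",").filter(Boolean) : []);

  return {
    userId,
    tenantId: headers["x-tenant-id"] || null,
    roles: list(headers["x-user-roles"]),
    scopes: list(headers["x-token-scopes"]),
  };
}

console.log(identityFromHeaders({ "x-user-id": "u1", "x-user-roles": "admin,editor" }));
console.log(identityFromHeaders({})); // null
```

A service would normally reject the request when this returns null, since that indicates traffic that did not pass through the gateway's authentication step.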
Instead of trusting headers, the gateway mints a new, cryptographically signed internal token for each request. Services validate this token, providing defense-in-depth.
```javascript
// Gateway: Mint internal JWT for downstream propagation
// jwt (e.g. the jsonwebtoken package), gatewayPrivateKey, currentKeyId and
// generateUUID are assumed to be provided elsewhere in the gateway.
function mintInternalToken(claims, originalTokenExpiry) {
  const now = Math.floor(Date.now() / 1000);

  // Internal token has short lifetime (5 minutes max)
  // or remaining lifetime of original token, whichever is shorter
  const expiresIn = Math.min(
    300, // 5 minutes
    originalTokenExpiry - now
  );
  if (expiresIn <= 0) {
    throw new Error("Original token has already expired");
  }

  const internalClaims = {
    // Standard claims
    iss: "gateway.internal",
    sub: claims.sub,
    aud: "internal-services",
    iat: now,
    exp: now + expiresIn,

    // Custom claims (propagated from original token)
    email: claims.email,
    roles: claims.roles,
    tenant_id: claims.tenant_id,
    scopes: claims.scope,

    // Request context
    request_id: generateUUID(),
    original_issuer: claims.iss,
  };

  // Sign with gateway's private key (rotated regularly)
  return jwt.sign(internalClaims, gatewayPrivateKey, {
    algorithm: "ES256", // ECDSA for fast signing
    keyid: currentKeyId,
  });
}

function propagateIdentity(request, claims) {
  const internalToken = mintInternalToken(claims, claims.exp);
  request.headers["Authorization"] = `Bearer ${internalToken}`;
}
```

| Strategy | Security | Performance | Complexity | Use Case |
|---|---|---|---|---|
| Headers Only | Requires network isolation | Fastest (no crypto) | Low | Trusted internal network |
| Internal Token | Cryptographic verification | ~1ms signing overhead | Medium | Zero-trust internal |
| Pass-Through | Services validate original | Varies | Low | Services need full claims |
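
For the internal-token strategy, each service verifies the gateway's signature before trusting the claims. The following is a self-contained sketch using only Node's built-in `crypto` module; the `signEs256` helper exists purely to produce a demo token (it stands in for the gateway's minting step), and the key pair is generated on the spot rather than loaded from a real key store. The issuer and audience values match those used in `mintInternalToken`.

```javascript
const crypto = require("node:crypto");

const b64url = (input) => Buffer.from(input).toString("base64url");

// Demo-only signer standing in for the gateway's minting step.
function signEs256(claims, privateKey) {
  const header = b64url(JSON.stringify({ alg: "ES256", typ: "JWT" }));
  const signingInput = `${header}.${b64url(JSON.stringify(claims))}`;
  const signature = crypto.sign("sha256", Buffer.from(signingInput), {
    key: privateKey,
    dsaEncoding: "ieee-p1363", // raw r||s signature format used by JWS
  });
  return `${signingInput}.${signature.toString("base64url")}`;
}

// Service-side check: pinned algorithm, signature, issuer, audience, expiry.
function verifyInternalToken(token, publicKey, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("malformed token");
  const [headerPart, payloadPart, signaturePart] = parts;

  const header = JSON.parse(Buffer.from(headerPart, "base64url").toString("utf8"));
  if (header.alg !== "ES256") throw new Error("unexpected algorithm");

  const valid = crypto.verify(
    "sha256",
    Buffer.from(`${headerPart}.${payloadPart}`),
    { key: publicKey, dsaEncoding: "ieee-p1363" },
    Buffer.from(signaturePart, "base64url")
  );
  if (!valid) throw new Error("bad signature");

  const claims = JSON.parse(Buffer.from(payloadPart, "base64url").toString("utf8"));
  if (claims.iss !== "gateway.internal") throw new Error("unexpected issuer");
  if (claims.aud !== "internal-services") throw new Error("unexpected audience");
  if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) {
    throw new Error("token expired");
  }
  return claims;
}

const { publicKey, privateKey } = crypto.generateKeyPairSync("ec", { namedCurve: "P-256" });
const now = Math.floor(Date.now() / 1000);
const token = signEs256(
  { iss: "gateway.internal", aud: "internal-services", sub: "user-123", exp: now + 300 },
  privateKey
);
console.log(verifyInternalToken(token, publicKey).sub); // user-123
```

In practice a maintained JWT library would usually replace the hand-rolled parsing; the point here is only which checks a service should make before trusting an internal token.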
Multi-tenant systems serve multiple customers (tenants) from a single deployment. Centralized gateway authentication must securely isolate tenants while efficiently routing requests to tenant-specific contexts.
The gateway must determine which tenant a request belongs to. Common approaches:
1. Subdomain-Based Tenancy
tenant-a.api.example.com/users
tenant-b.api.example.com/users
2. Path-Based Tenancy
api.example.com/tenants/tenant-a/users
api.example.com/tenants/tenant-b/users
3. Header-Based Tenancy
GET /users
X-Tenant-ID: tenant-a
4. Token-Embedded Tenancy
The tenant ID is embedded in the access token as a claim.
```javascript
// Multi-Tenant Gateway Authentication
// tenantRegistry, extractBearerToken, getJwksForTenant, validateToken,
// defaultIdpConfig and logSecurityAlert are assumed to exist elsewhere.
async function authenticateMultiTenant(request, response, next) {
  // Step 1: Identify tenant from request
  const tenantId = resolveTenantId(request);
  if (!tenantId) {
    return response.status(400).json({
      error: "tenant_not_identified",
      message: "Unable to determine tenant context"
    });
  }

  // Step 2: Validate tenant exists and is active
  const tenant = await tenantRegistry.get(tenantId);
  if (!tenant) {
    return response.status(404).json({ error: "tenant_not_found" });
  }
  if (tenant.status !== "active") {
    return response.status(403).json({
      error: "tenant_suspended",
      message: "Tenant account is suspended"
    });
  }

  // Step 3: Extract and validate token
  const token = extractBearerToken(request);
  if (!token) {
    return response.status(401).json({ error: "missing_token" });
  }

  // Step 4: Get tenant-specific JWKS configuration
  // (tenants may have different identity providers)
  const idpConfig = tenant.identityProvider || defaultIdpConfig;
  const jwks = await getJwksForTenant(tenantId, idpConfig);

  // Step 5: Validate token
  let claims;
  try {
    claims = await validateToken(token, jwks, idpConfig);
  } catch (err) {
    return response.status(401).json({ error: "invalid_token" });
  }

  // Step 6: Verify token belongs to this tenant
  // Critical: Prevent cross-tenant authentication
  // Note: a token with no tenant_id claim passes this check, so tenants
  // sharing an identity provider should require the claim to be present.
  if (claims.tenant_id && claims.tenant_id !== tenantId) {
    // Token was issued for a different tenant!
    logSecurityAlert("cross_tenant_attempt", {
      tokenTenant: claims.tenant_id,
      requestTenant: tenantId,
      userId: claims.sub,
    });
    return response.status(403).json({
      error: "tenant_mismatch",
      message: "Token not valid for this tenant"
    });
  }

  // Step 7: Propagate identity with tenant context
  request.headers["X-Tenant-Id"] = tenantId;
  request.headers["X-User-Id"] = claims.sub;
  request.headers["X-Tenant-Plan"] = tenant.plan; // For rate limiting, features

  next();
}

function resolveTenantId(request) {
  // Priority 1: Subdomain (ignore any port; tolerate a missing Host header)
  const host = (request.headers["host"] || "").split(":")[0];
  const subdomainMatch = host.match(/^([a-z0-9-]+)\.api\.example\.com$/);
  if (subdomainMatch) {
    return subdomainMatch[1];
  }

  // Priority 2: Path prefix
  const pathMatch = request.path.match(/^\/tenants\/([a-z0-9-]+)\//);
  if (pathMatch) {
    return pathMatch[1];
  }

  // Priority 3: Header
  if (request.headers["x-tenant-id"]) {
    return request.headers["x-tenant-id"];
  }

  return null;
}
```
Cross-tenant data access is one of the most severe security vulnerabilities in multi-tenant systems. Always verify that the authenticated user's tenant claim matches the request's target tenant. Log and alert on any mismatches, since they may indicate an attack attempt.
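The tenant check can be isolated into a small, easily tested function. The sketch below is one possible shape, with a `requireClaim` option (an assumption introduced here) that also rejects tokens carrying no tenant claim at all, which matters when several tenants share one identity provider.

```javascript
// Returns null when the token may act on the requested tenant,
// otherwise a short reason string suitable for logging.
function tenantMismatchReason(claims, requestTenantId, { requireClaim = true } = {}) {
  if (!claims.tenant_id) {
    return requireClaim ? "missing_tenant_claim" : null;
  }
  return claims.tenant_id === requestTenantId ? null : "tenant_mismatch";
}

console.log(tenantMismatchReason({ tenant_id: "tenant-a" }, "tenant-a")); // null
console.log(tenantMismatchReason({ tenant_id: "tenant-b" }, "tenant-a")); // tenant_mismatch
console.log(tenantMismatchReason({}, "tenant-a"));                        // missing_tenant_claim
```

Keeping the decision separate from the HTTP handling makes it straightforward to unit-test every mismatch case and to reuse the same rule in services that perform their own checks.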
Authentication adds latency to every request. At high traffic volumes, even small inefficiencies compound into significant performance impacts. This section covers optimization techniques for high-throughput gateway authentication.
JWKS (JSON Web Key Set) contains the public keys needed to verify JWT signatures. Fetching JWKS on every request would be catastrophically slow.
```typescript
// High-Performance JWKS Cache with Proactive Refresh
import * as crypto from "node:crypto";

interface JwksCacheEntry {
  keys: Map<string, crypto.KeyObject>;
  fetchedAt: number;
  expiresAt: number;
  stale: boolean;
}

interface JwksCacheConfig {
  cacheTtl: number;     // milliseconds
  fetchTimeout: number; // milliseconds
}

class JwksCacheManager {
  private cache: Map<string, JwksCacheEntry> = new Map();
  private refreshLock: Map<string, Promise<void>> = new Map();

  constructor(private readonly config: JwksCacheConfig) {}

  // discoverJwksUri() and emitAlert() are assumed helpers, omitted here.

  async getSigningKey(issuer: string, kid: string): Promise<crypto.KeyObject> {
    let entry = this.cache.get(issuer);

    // Cache miss or expired: fetch JWKS
    if (!entry || this.isExpired(entry)) {
      await this.refreshJwks(issuer);
      entry = this.cache.get(issuer);
    }
    // Near expiry: trigger background refresh (non-blocking)
    else if (this.isNearExpiry(entry)) {
      this.backgroundRefresh(issuer);
    }

    const key = entry?.keys.get(kid);
    if (!key) {
      throw new Error(`Unknown key ID: ${kid}`);
    }
    return key;
  }

  private async refreshJwks(issuer: string): Promise<void> {
    // Coalesce concurrent refresh requests
    const existingRefresh = this.refreshLock.get(issuer);
    if (existingRefresh) {
      return existingRefresh;
    }

    const refreshPromise = this.doRefresh(issuer);
    this.refreshLock.set(issuer, refreshPromise);
    try {
      await refreshPromise;
    } finally {
      this.refreshLock.delete(issuer);
    }
  }

  private async doRefresh(issuer: string): Promise<void> {
    const jwksUri = await this.discoverJwksUri(issuer);

    let response: Response;
    try {
      response = await fetch(jwksUri, {
        headers: { "Accept": "application/json" },
        signal: AbortSignal.timeout(this.config.fetchTimeout),
      });
    } catch (err) {
      // Network error or timeout: keep stale cache if we have one
      if (this.markStale(issuer)) return;
      throw err;
    }

    if (!response.ok) {
      // Keep stale cache if refresh fails
      if (this.markStale(issuer)) return;
      throw new Error(`JWKS fetch failed: ${response.status}`);
    }

    const jwks = await response.json();
    const keys = new Map<string, crypto.KeyObject>();
    for (const jwk of jwks.keys) {
      const keyObject = crypto.createPublicKey({ key: jwk, format: "jwk" });
      keys.set(jwk.kid, keyObject);
    }

    this.cache.set(issuer, {
      keys,
      fetchedAt: Date.now(),
      expiresAt: Date.now() + this.config.cacheTtl,
      stale: false,
    });
  }

  // Flags the existing entry as stale; returns false if there is none.
  private markStale(issuer: string): boolean {
    const existing = this.cache.get(issuer);
    if (!existing) return false;
    existing.stale = true;
    this.emitAlert("jwks_refresh_failed", { issuer });
    return true;
  }

  private backgroundRefresh(issuer: string): void {
    // Non-blocking refresh in background
    setImmediate(() => {
      this.refreshJwks(issuer).catch(err => {
        console.warn(`Background JWKS refresh failed: ${err.message}`);
      });
    });
  }

  private isExpired(entry: JwksCacheEntry): boolean {
    // Stale entries keep being served rather than treated as expired;
    // bound this with a maximum stale age in production.
    return Date.now() > entry.expiresAt && !entry.stale;
  }

  private isNearExpiry(entry: JwksCacheEntry): boolean {
    // Refresh when 75% through TTL
    const threshold = entry.fetchedAt + (this.config.cacheTtl * 0.75);
    return Date.now() > threshold;
  }
}
```
The choice of JWT signing algorithm significantly impacts verification performance:
| Algorithm | Type | Verify Time (~) | Security | Recommendation |
|---|---|---|---|---|
| HS256 | Symmetric (HMAC) | ~2μs | Shared secret risk | Internal only, never for external tokens |
| RS256 | Asymmetric (RSA) | ~50μs | Strong (2048+ bit) | Standard choice, widely supported |
| RS384/RS512 | Asymmetric (RSA) | ~60-80μs | Stronger | When extra security needed |
| ES256 | Asymmetric (ECDSA) | ~30μs | Strong (P-256) | Faster than RSA, smaller keys |
| ES384/ES512 | Asymmetric (ECDSA) | ~40-60μs | Stronger | When extra security needed |
| EdDSA (Ed25519) | Asymmetric | ~20μs | Modern, strong | Best performance, newer support |
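At 10,000 requests/second, RS256 verification at roughly 50μs per token adds about 500ms of cumulative CPU time per second (10,000 × 50μs), or half of one core. The figures in the table are rough and vary considerably with library, key size, and hardware. In many OpenSSL-based stacks RSA verification is in fact cheaper than ECDSA verification, while ECDSA and EdDSA are much cheaper for signing and use smaller keys. Benchmark verification on your own gateway before switching algorithms, and confirm that your JWT library and identity provider support the algorithm you choose.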
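The arithmetic behind the 500ms figure is simply request rate multiplied by per-token verification time. A small sketch makes it easy to rerun with measured values; the per-algorithm timings below are the approximate figures from the table above, not benchmarks, and the function name is hypothetical.

```javascript
// Cumulative CPU milliseconds spent on signature verification
// per second of wall-clock time.
function verifyCpuMsPerSecond(requestsPerSecond, verifyMicros) {
  return (requestsPerSecond * verifyMicros) / 1000;
}

// Approximate per-verification times in microseconds (from the table).
const approxVerifyMicros = { RS256: 50, ES256: 30, EdDSA: 20 };

for (const [alg, micros] of Object.entries(approxVerifyMicros)) {
  const ms = verifyCpuMsPerSecond(10000, micros);
  console.log(`${alg}: ${ms} ms CPU per second (${(ms / 10).toFixed(0)}% of one core)`);
}
```

Replacing the placeholder timings with numbers measured on the actual gateway hardware and JWT library gives a more trustworthy capacity estimate.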
For authentication patterns requiring external calls (introspection, session store), connection management is critical.
```javascript
const https = require("node:https");
const Redis = require("ioredis");

// HTTP Agent for Identity Provider Connections
const idpAgent = new https.Agent({
  keepAlive: true,
  keepAliveMsecs: 60000,
  maxSockets: 100,     // Max connections per host
  maxFreeSockets: 20,  // Keep idle connections
  timeout: 5000,
  scheduling: 'fifo',  // First-in-first-out for consistent latency
});

// Redis Client for Session Store
const sessionRedis = new Redis.Cluster([
  { host: 'redis-node-1', port: 6379 },
  { host: 'redis-node-2', port: 6379 },
  { host: 'redis-node-3', port: 6379 },
], {
  redisOptions: {
    connectTimeout: 500,
    commandTimeout: 100, // Fast fail for auth path
    enableReadyCheck: true,
    lazyConnect: false,
  },
  scaleReads: 'slave', // Read from replicas
  slotsRefreshTimeout: 2000,
  clusterRetryStrategy: (times) => Math.min(100 * times, 1000),
});
```
Centralized authentication concentrates security responsibility, and risk, at the gateway. Rigorous security hardening is essential.
Never allow the JWT to dictate which algorithm to use—this enables the "Algorithm Confusion" attack where an attacker modifies the token header to use a weaker algorithm.
```javascript
// Reject the "none" algorithm explicitly, before attempting verification
const header = decodeJwtHeader(token);
if (typeof header.alg !== "string" || header.alg.toLowerCase() === "none") {
  throw new SecurityError("Algorithm 'none' is forbidden");
}

// SECURE: Explicitly specify allowed algorithms
const decoded = jwt.verify(token, publicKey, {
  algorithms: ["RS256", "ES256"], // Whitelist only expected algorithms
  issuer: "https://auth.example.com",
  audience: "api.example.com",
  complete: true, // Get header too for logging
});

// INSECURE: Never do this!
// const decoded = jwt.verify(token, secretOrPublicKey);
// Without algorithm restriction, attacker can use HS256 with public key as secret
```
Authentication endpoints are prime targets for brute-force and credential stuffing attacks. Apply aggressive rate limiting.
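One common building block for this is a failure lockout: after a threshold of failed attempts from one key (an IP address or username), further attempts are refused for a fixed window. The in-memory sketch below is illustrative only. The class name and default values are assumptions, and a multi-instance gateway would keep these counters in a shared store rather than in process memory.

```javascript
// Tracks failed authentication attempts per key and enforces a lockout.
class FailureLockout {
  constructor(threshold = 5, lockoutMs = 300000) {
    this.threshold = threshold;
    this.lockoutMs = lockoutMs;
    this.entries = new Map(); // key -> { failures, lockedUntil }
  }

  isLocked(key, nowMs) {
    const entry = this.entries.get(key);
    return Boolean(entry && entry.lockedUntil > nowMs);
  }

  recordFailure(key, nowMs) {
    const entry = this.entries.get(key) || { failures: 0, lockedUntil: 0 };
    entry.failures += 1;
    if (entry.failures >= this.threshold) {
      entry.lockedUntil = nowMs + this.lockoutMs; // start the lockout window
      entry.failures = 0;                         // count afresh afterwards
    }
    this.entries.set(key, entry);
  }

  // A successful login clears the failure history for that key.
  recordSuccess(key) {
    this.entries.delete(key);
  }
}

const lockout = new FailureLockout(3, 1000);
for (let i = 0; i < 3; i++) lockout.recordFailure("203.0.113.7", 0);
console.log(lockout.isLocked("203.0.113.7", 500));  // true
console.log(lockout.isLocked("203.0.113.7", 1000)); // false
```

Applying the lockout per IP and per username separately, with different thresholds, limits both credential stuffing from one source and distributed guessing against one account.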
```yaml
# Rate limiting for authentication flows
rate_limits:
  # Token validation (normal API traffic)
  token_validation:
    requests_per_second: 10000  # High limit, mostly legitimate
    burst: 500

  # Login/token issuance (target for attacks)
  login:
    # Per IP limits
    per_ip:
      requests_per_minute: 10
      burst: 5
    # Per username limits (prevent account lockout attacks)
    per_username:
      requests_per_minute: 5
      burst: 3
    # Global limits
    global:
      requests_per_second: 1000

  # Failed authentication (more aggressive)
  failed_auth:
    per_ip:
      # After 5 failures, impose 5-minute delay
      threshold: 5
      lockout_seconds: 300
    per_username:
      threshold: 10
      lockout_seconds: 900  # 15 minutes after 10 failures

  # Password reset (frequently abused)
  password_reset:
    per_ip:
      requests_per_hour: 5
    per_email:
      requests_per_hour: 3
```
Centralized authentication at the API Gateway is a foundational pattern for microservices security. Let's consolidate the key principles:
With centralized authentication established, the next page explores OAuth2 and JWT Validation in depth—understanding token formats, validation flows, and the specific security considerations for these dominant authentication standards.