Knowing what an API Gateway does is only half the picture. Equally critical is understanding where it sits in your architecture. Gateway placement affects security posture, latency budgets, operational complexity, and cost. A gateway at the edge serves different purposes than a gateway between internal services, and many organizations deploy multiple gateways at different layers.
This page explores gateway placement strategies in depth, from simple single-gateway architectures to sophisticated multi-tier deployments used by the world's largest organizations.
By the end of this page, you will understand edge gateway placement, internal/mesh gateway patterns, multi-tier gateway architectures, cloud vs. self-hosted placement decisions, and the trade-offs that inform where to position your gateway infrastructure.
The most common gateway placement is at the edge—the boundary between the public internet and your internal infrastructure. This is the gateway that terminates TLS from external clients, validates authentication, and routes requests to internal services.
```
INTERNET
  Mobile App    Web Browser    IoT Devices    Partner APIs    Attacker Bots
                                   │
                               HTTPS/TLS
                                   │
                                   ▼
EDGE LAYER
  CDN / WAF / DDoS Protection (Cloudflare, AWS Shield, Akamai, etc.)
                                   │
                                   ▼
  EDGE API GATEWAY
    • TLS Termination      • Authentication      • Rate Limiting
    • Request Validation   • Routing             • Observability
    • IP Allowlisting      • Request Transform   • Response Caching
                                   │
                                   ▼
PRIVATE NETWORK (VPC)
  Product Service          User Service          Order Service
```

The edge gateway is the most security-critical component. Consider:
Network Segmentation: The gateway sits in a DMZ or dedicated security zone, separate from backend services. Even if compromised, attackers cannot directly access internal networks.
Minimal Attack Surface: The gateway should run minimal software—just routing, security validation, and proxying. No business logic, no databases, no state.
Hardened Infrastructure: Operating system hardening, restricted ports, centralized logging of all access, and regular security patching.
Defense in Depth: The edge gateway works in concert with CDN-level protection (Cloudflare, AWS Shield), WAF rules (OWASP Top 10 protection), and network-level security groups.
Zero Trust Architecture: Even with a secure edge, internal services should not blindly trust traffic. The gateway enriches requests with verified identity, but services may perform additional authorization.
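The identity-enrichment and zero-trust points above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the header names `x-user-id` and `x-user-roles`, the `payments-admin` role, and both function names are assumptions made for the example. The gateway discards any identity headers the client sent and re-adds them from the identity it has already verified; the downstream service still performs its own authorization check.

```typescript
// Hypothetical header names chosen for this sketch.
const IDENTITY_HEADERS = new Set(["x-user-id", "x-user-roles"]);

interface VerifiedIdentity {
  userId: string;
  roles: string[];
}

type HeaderMap = Record<string, string>;

// Gateway side: discard identity headers supplied by the client, then attach
// values taken from the identity the gateway has already verified.
function enrichHeaders(incoming: HeaderMap, identity: VerifiedIdentity): HeaderMap {
  const forwarded: HeaderMap = {};
  for (const name of Object.keys(incoming)) {
    const lower = name.toLowerCase();
    if (!IDENTITY_HEADERS.has(lower)) {
      forwarded[lower] = incoming[name];
    }
  }
  forwarded["x-user-id"] = identity.userId;
  forwarded["x-user-roles"] = identity.roles.join(",");
  return forwarded;
}

// Service side: the propagated identity is accepted, but authorization for
// the specific operation is still decided locally.
function canRefund(headers: HeaderMap): boolean {
  const roles = (headers["x-user-roles"] ?? "").split(",");
  return roles.includes("payments-admin");
}
```

Stripping before enriching matters: without it, a client could send its own `x-user-id` header and have it forwarded as if the gateway had verified it.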
While logically a single entry point, the edge gateway must be physically highly available. Deploy across multiple availability zones with load balancing. Ensure shared state (rate limit counters, etc.) uses distributed stores. Monitor health proactively and have runbooks for gateway failures.
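To illustrate why shared state matters once the gateway runs as several instances, here is a small sketch of a fixed-window rate limiter whose counters live in a store shared by every instance. The `CounterStore` interface and the in-memory implementation are assumptions for the example; a real deployment would back the same contract with a distributed store, and the call would typically be asynchronous.

```typescript
// Minimal counter-store contract assumed for this sketch.
interface CounterStore {
  increment(key: string): number;
}

// In-memory stand-in for a distributed store, useful for local testing only.
function createInMemoryStore(): CounterStore {
  const counts = new Map<string, number>();
  return {
    increment(key: string): number {
      const next = (counts.get(key) ?? 0) + 1;
      counts.set(key, next);
      return next;
    },
  };
}

// Each gateway instance gets its own limiter function but the same store,
// so the limit applies to a client across the whole fleet.
function createRateLimiter(store: CounterStore, limit: number, windowMs: number) {
  return function allow(clientId: string, nowMs: number): boolean {
    const windowIndex = Math.floor(nowMs / windowMs);
    return store.increment(`${clientId}:${windowIndex}`) <= limit;
  };
}

// Two instances in different availability zones sharing one store.
const sharedStore = createInMemoryStore();
const zoneA = createRateLimiter(sharedStore, 3, 1000);
const zoneB = createRateLimiter(sharedStore, 3, 1000);
```

If each instance kept its own counters instead, a client spread across N instances by the load balancer could make up to N times the intended number of requests.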
Beyond edge placement, many organizations deploy internal gateways to manage service-to-service communication within their infrastructure. This creates a multi-tier gateway architecture where different gateways serve different purposes.
| Use Case | Description | Example |
|---|---|---|
| Domain Boundaries | Separate gateway per business domain | E-commerce: Order Gateway, Inventory Gateway, Payment Gateway |
| Team Autonomy | Each team manages their domain's gateway | Payments team controls payment API versioning and routing |
| Legacy Integration | Gateway mediating between modern and legacy systems | REST gateway translating to SOAP backend |
| Cross-Platform Communication | Gateway bridging different technology stacks | Node.js services calling Java services through gateway |
| Security Zones | Gateway between trust zones within the network | Gateway between production and data analytics zone |
| Multi-Cloud/Hybrid | Gateway connecting cloud and on-premises | AWS services communicating with on-prem mainframe |
```
INTERNET
    │
    ▼
EDGE GATEWAY TIER
  EDGE API GATEWAY (Public-facing: Auth, Rate Limiting, TLS Termination)
    │
    ▼
INTERNAL GATEWAY TIER
  COMMERCE GATEWAY      PAYMENTS GATEWAY      USER GATEWAY
  • Product API         • Payments API        • Auth API
  • Cart API            • Invoicing           • Profile API
  • Search API          • Refunds             • Prefs API
    │                     │                     │
    ▼                     ▼                     ▼
SERVICE TIER
  Commerce Services     Payments Services     User Services
  • Products            • Payment Svc         • Auth Svc
  • Inventory           • Invoice Svc         • Profile Svc
  • Cart                • Refund Svc          • Notif Svc
  • Search              • Ledger
```

Internal gateways differ from edge gateways in their responsibilities:
| Responsibility | Edge Gateway | Internal Gateway |
|---|---|---|
| TLS Termination | Always (public traffic) | Optional (mTLS between zones) |
| Authentication | Full validation (JWT, API keys) | Identity propagation (trust upstream) |
| Rate Limiting | Per-client protection | Per-service protection (backpressure) |
| Observability | External request metrics | Internal call tracing |
| Protocol Translation | HTTP ↔ gRPC | Service-specific protocols |
| Routing | Route to domains | Route within domain |
| Versioning | Public API versioning | Internal API contracts |
Internal gateways typically operate in a higher-trust environment, assuming traffic has already passed through external security checks.
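The "per-service protection (backpressure)" row in the table above can be sketched as a cap on in-flight calls per downstream service. This is one possible approach under stated assumptions: the function name, the service names, and the choice of a concurrency cap (rather than a token bucket or queue) are all illustrative.

```typescript
// Tracks in-flight calls per downstream service and sheds load above a cap.
function createBackpressureGuard(maxInFlight: number) {
  const inFlight = new Map<string, number>();
  return {
    // Returns false when the service is saturated; the caller would then
    // reject the request quickly instead of queueing it.
    tryAcquire(service: string): boolean {
      const current = inFlight.get(service) ?? 0;
      if (current >= maxInFlight) return false;
      inFlight.set(service, current + 1);
      return true;
    },
    // Must be called once for every successful tryAcquire, after the
    // downstream call completes or fails.
    release(service: string): void {
      const current = inFlight.get(service) ?? 0;
      inFlight.set(service, Math.max(0, current - 1));
    },
  };
}

// Hypothetical usage inside a payments-domain gateway.
const guard = createBackpressureGuard(2);
```

Unlike the per-client limits at the edge, this guard protects the callee: it does not care who is calling, only how much work the service already has outstanding.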
For service-to-service communication, a service mesh (Istio, Linkerd) often replaces internal gateways. The mesh deploys sidecars with each service, handling mTLS, retries, and observability without a centralized gateway. Meshes excel at east-west traffic; gateways excel at north-south traffic.
As organizations scale, they often evolve from a single gateway to multi-tier architectures. Understanding common patterns helps you design for your organization's needs.
This pattern separates external traffic management from internal domain routing:
For global applications with aggressive latency targets:
```
CLIENT REQUEST
    │ (DNS Query)
    ▼
TIER 1: CDN EDGE LAYER
  Edge PoPs: New York, London, Tokyo, Sydney
  • DDoS Mitigation
  • Static Asset Caching
  • TLS Termination
  • Geographic Routing
  • Edge Compute (Cloudflare Workers / Lambda@Edge)
    │ (Origin Request)
    ▼
TIER 2: EDGE API GATEWAY (REGIONAL GATEWAY CLUSTER)
  • Authentication / Authorization (JWT validation)
  • Rate Limiting (per-user, per-tenant)
  • Request Validation (schema validation, size limits)
  • API Versioning (route to correct service version)
  • Observability (metrics, tracing, logging)
  • Request Transformation (header enrichment)
    │
    ▼
TIER 3: APPLICATION GATEWAYS / BFFs
  Mobile BFF                Web BFF                       GraphQL Gateway
  • Response Compression    • SSR Support / Aggregation   • Schema Federation
  • Payload Optimization    • Session Management          • Query Optimization
    │                         │                             │
    ▼                         ▼                             ▼
BACKEND SERVICES
```

Global organizations with multiple regions and data residency requirements:
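```typescript
// Supporting types (assumed shapes; the original listing does not define them)
interface ServiceEndpoint {
  name: string;
  endpoint: string;
}

interface RegionalFallback {
  enabled: boolean;
  regions: string[];          // Failover order
  excludePatterns: string[];  // Paths that must never fail over cross-region
}

interface Identity {
  metadata?: Record<string, unknown>;
}

interface RegionalGatewayConfig {
  region: string;
  primaryServices: ServiceEndpoint[];
  crossRegionFallback?: RegionalFallback;
  dataResidency: DataResidencyConfig;
}

interface DataResidencyConfig {
  enforced: boolean;          // If true, data never leaves region
  allowedRegions: string[];   // Regions data can flow to
  piiFields: string[];        // Fields that require residency enforcement
}

// Helpers assumed to be implemented elsewhere in the gateway (not shown here)
declare function getRegionalGateway(region: string): string;
declare function redirectToRegion(request: Request, gateway: string): Response;
declare function findService(url: string, services: ServiceEndpoint[]): ServiceEndpoint;
declare function proxyRequest(request: Request, service: ServiceEndpoint): Promise<Response>;
declare function fallbackToOtherRegion(request: Request, fallback: RegionalFallback): Promise<Response>;

// Example: EU Region Gateway Configuration
const euGatewayConfig: RegionalGatewayConfig = {
  region: 'eu-west-1',
  primaryServices: [
    { name: 'user-service', endpoint: 'http://user-service.eu-west-1.internal' },
    { name: 'order-service', endpoint: 'http://order-service.eu-west-1.internal' },
    { name: 'product-service', endpoint: 'http://product-service.eu-west-1.internal' },
  ],
  crossRegionFallback: {
    enabled: true,
    regions: ['eu-central-1', 'us-east-1'], // Failover order
    // Only fallback for non-user-data requests
    excludePatterns: ['/api/users/**', '/api/orders/**'],
  },
  dataResidency: {
    enforced: true, // GDPR compliance
    allowedRegions: ['eu-west-1', 'eu-central-1'], // EU only
    piiFields: ['email', 'name', 'address', 'phone', 'ip_address'],
  },
};

// Routing decision with data residency awareness
async function routeRequest(
  request: Request,
  identity: Identity,
  config: RegionalGatewayConfig,
): Promise<Response> {
  // The region may be absent from the identity; such requests are handled locally
  const userRegion = identity.metadata?.region as string | undefined;
  const requestPath = new URL(request.url).pathname;

  // Enforce data residency for user data
  if (config.dataResidency.enforced) {
    const isUserDataRequest =
      requestPath.includes('/users') ||
      requestPath.includes('/orders') ||
      requestPath.includes('/profile');

    if (
      isUserDataRequest &&
      userRegion !== undefined &&
      !config.dataResidency.allowedRegions.includes(userRegion)
    ) {
      // Must redirect to user's home region
      const homeGateway = getRegionalGateway(userRegion);
      return redirectToRegion(request, homeGateway);
    }
  }

  // Normal routing within region
  const service = findService(request.url, config.primaryServices);

  try {
    return await proxyRequest(request, service);
  } catch (error) {
    // Try cross-region fallback if configured, but never for excluded paths.
    // Simplified glob handling: a trailing "**" is treated as a path prefix.
    const fallback = config.crossRegionFallback;
    const excluded =
      fallback?.excludePatterns.some((pattern) =>
        requestPath.startsWith(pattern.replace(/\*+$/, '')),
      ) ?? false;

    if (fallback?.enabled && !excluded) {
      return await fallbackToOtherRegion(request, fallback);
    }
    throw error;
  }
}
```

More tiers add latency and complexity. Start with the simplest architecture that meets your needs: single edge gateway for most startups, two-tier (edge + domain) for team autonomy at scale, three-tier (CDN + edge + BFFs) for global applications with diverse client types. Add tiers when you have concrete problems to solve, not preemptively.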
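The latency cost of each added tier can be made concrete with a simple budget calculation. The per-tier overheads below (5, 10, and 15 ms), the 80 ms backend time, and the 100 ms budget are invented numbers for illustration only; measure your own hops before relying on any such figures.

```typescript
interface Tier {
  name: string;
  overheadMs: number; // Added latency per request for this hop (assumed value)
}

// Total latency the gateway tiers add on top of the backend's own time.
function addedLatencyMs(tiers: Tier[]): number {
  return tiers.reduce((sum, tier) => sum + tier.overheadMs, 0);
}

// True when gateway overhead plus backend time stays within the budget.
function fitsBudget(tiers: Tier[], backendMs: number, budgetMs: number): boolean {
  return addedLatencyMs(tiers) + backendMs <= budgetMs;
}

// Hypothetical per-hop overheads.
const singleTier: Tier[] = [{ name: "edge gateway", overheadMs: 10 }];
const threeTier: Tier[] = [
  { name: "CDN edge", overheadMs: 5 },
  { name: "edge gateway", overheadMs: 10 },
  { name: "BFF", overheadMs: 15 },
];

const singleOk = fitsBudget(singleTier, 80, 100); // 10 + 80 = 90, within 100
const threeOk = fitsBudget(threeTier, 80, 100);   // 30 + 80 = 110, over 100
```

In practice a CDN tier can also reduce end-to-end latency by serving cached responses or terminating TLS closer to the client, so a sum like this is an upper bound for uncached requests rather than a full model.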
A critical placement decision is whether to use cloud-managed gateway services (AWS API Gateway, Azure API Management, Google Cloud Endpoints) or self-hosted solutions (Kong, NGINX, Envoy, Traefik). This decision affects cost, control, and operational complexity.
| Dimension | Cloud-Managed | Self-Hosted |
|---|---|---|
| Operational Burden | Low — Provider manages scaling, HA, patches | High — You manage infrastructure, updates, monitoring |
| Customization | Limited to provider's features | Full control — custom plugins, logic, protocols |
| Cost Structure | Per-request pricing (can be expensive at scale) | Infrastructure cost (predictable at scale) |
| Latency | Additional hop to provider's infrastructure | Can colocate with services for minimal latency |
| Lock-in | Provider-specific features create lock-in | Portable across clouds and on-premises |
| Compliance | Provider certifications (SOC2, HIPAA, etc.) | You manage compliance; full audit control |
| Feature Velocity | New features when provider releases them | Immediate access to open-source innovations |
| Debugging | Limited visibility into provider internals | Full access to logs, configurations, code |
Choose cloud-managed gateways when:
Choose self-hosted gateways when:
```typescript
// Cost comparison: AWS API Gateway vs. Self-hosted Kong on EKS
// All prices are illustrative assumptions; check current price lists before deciding.

interface CostBreakdown {
  compute: number;
  requests: number;
  dataTransfer: number;
  operations?: number;
  total: number;
}

interface CostCalculation {
  requestsPerMonth: number;
  avgPayloadKB: number;
  requiredInstances: number;
  calculateMonthlyCost(): CostBreakdown;
}

// AWS API Gateway (HTTP API) pricing
class AWSApiGatewayCost implements CostCalculation {
  constructor(
    public requestsPerMonth: number,
    public avgPayloadKB: number,
    public requiredInstances: number = 0, // Managed, no instances
  ) {}

  calculateMonthlyCost(): CostBreakdown {
    // Tiered rates assumed for this example:
    // $1.00 per million requests for first 300M
    // $0.90 per million for next 700M
    // $0.80 per million beyond 1B
    // Data transfer adds ~$0.09/GB
    const first300M = Math.min(this.requestsPerMonth, 300_000_000);
    const next700M = Math.min(Math.max(0, this.requestsPerMonth - 300_000_000), 700_000_000);
    const beyond1B = Math.max(0, this.requestsPerMonth - 1_000_000_000);

    const requestCost =
      (first300M / 1_000_000) * 1.00 +
      (next700M / 1_000_000) * 0.90 +
      (beyond1B / 1_000_000) * 0.80;

    // KB -> GB using decimal units (1 GB = 1,000,000 KB)
    const dataTransferGB = (this.requestsPerMonth * this.avgPayloadKB) / 1_000_000;
    const dataTransferCost = dataTransferGB * 0.09;

    return {
      compute: 0,
      requests: requestCost,
      dataTransfer: dataTransferCost,
      total: requestCost + dataTransferCost,
    };
  }
}

// Self-hosted Kong on EKS
class SelfHostedKongCost implements CostCalculation {
  constructor(
    public requestsPerMonth: number,
    public avgPayloadKB: number,
    // Assume 10K RPS per m5.xlarge instance, sized on average (not peak) load
    public requiredInstances: number = Math.ceil(
      (requestsPerMonth / (30 * 24 * 60 * 60)) / 10000
    ),
  ) {}

  calculateMonthlyCost(): CostBreakdown {
    // m5.xlarge: ~$139/month (on-demand), ~$90/month (reserved 1yr)
    const instanceCost = this.requiredInstances * 90 * 2; // 2x for HA

    // Redis for rate limiting: r6g.large ~$120/month
    const redisCost = 120;

    // EKS cluster: $72/month
    const eksCost = 72;

    // Data transfer: ~$0.01/GB internal
    const dataTransferGB = (this.requestsPerMonth * this.avgPayloadKB) / 1_000_000;
    const dataTransferCost = dataTransferGB * 0.01;

    // Operations overhead: ~0.5 FTE @ $150K/yr = $6,250/month
    const operationsCost = 6250;

    return {
      compute: instanceCost + redisCost + eksCost,
      requests: 0, // No per-request cost
      dataTransfer: dataTransferCost,
      operations: operationsCost,
      total: instanceCost + redisCost + eksCost + dataTransferCost + operationsCost,
    };
  }
}

// Example: 500 million requests/month, 5KB average payload
const managed = new AWSApiGatewayCost(500_000_000, 5, 0);
const selfHosted = new SelfHostedKongCost(500_000_000, 5);

console.log('AWS API Gateway:', managed.calculateMonthlyCost());
// { compute: 0, requests: 480, dataTransfer: 225, total: 705 }

console.log('Self-Hosted Kong:', selfHosted.calculateMonthlyCost());
// ~193 RPS average needs 1 instance (2 with HA): 180 + 120 + 72 = 372 compute
// { compute: 372, requests: 0, dataTransfer: 25, operations: 6250, total: 6647 }

// At 500M requests/month, AWS is far cheaper once operations cost is counted.
// At 5B requests/month under these assumptions the totals are close
// (about $6,380 managed vs. $6,872 self-hosted); self-hosted wins clearly
// only if the operations capacity already exists.
```

The infrastructure cost comparison often misses the biggest cost: operations. On-call rotations, incident response, upgrades, security patches, and capacity planning require skilled engineers. Only include self-hosted in your calculation if you have (or can justify building) this operational capacity.
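A rough break-even volume follows directly from these inputs. The sketch below is a deliberate simplification: it assumes a flat per-million managed price with no tiering, ignores data transfer on both sides, and treats the self-hosted cost as fixed regardless of volume. The function name is hypothetical. The inputs reuse the figures assumed above: two reserved m5.xlarge instances at $90 each, Redis at $120, EKS at $72 (so $372 of infrastructure), and $6,250 of operations.

```typescript
// Monthly request volume at which a flat per-million managed price equals
// a fixed monthly self-hosted cost.
function breakEvenRequests(fixedMonthlyCost: number, pricePerMillion: number): number {
  if (pricePerMillion <= 0) {
    throw new Error("pricePerMillion must be positive");
  }
  return (fixedMonthlyCost / pricePerMillion) * 1_000_000;
}

const infrastructureOnly = 2 * 90 + 120 + 72;    // $372/month
const withOperations = infrastructureOnly + 6250; // $6,622/month

// At an assumed flat $1.00 per million requests:
const breakEvenInfraOnly = breakEvenRequests(infrastructureOnly, 1.0); // 372 million requests/month
const breakEvenWithOps = breakEvenRequests(withOperations, 1.0);       // 6.622 billion requests/month
```

The gap between the two results is the point of the paragraph above: counting operations moves the break-even from hundreds of millions of requests per month to several billion.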
Based on patterns observed across successful organizations, here are best practices for gateway placement:
1. The Monolithic Gateway
Placing all routing logic, transformations, and policies in a single gateway creates a bottleneck. As the configuration grows, changes become risky, deployments slow, and single team ownership becomes impossible.
2. Gateway Per Service
Going to the opposite extreme—a dedicated gateway for each microservice—adds unnecessary latency, operational complexity, and confusion. Services can communicate directly within a trust zone.
3. Business Logic in Gateway
When gateways start containing business rules (if premium user, calculate discount...), you've created a distributed monolith. Keep gateways focused on infrastructure concerns.
4. Inconsistent Gateway Technologies
Using Kong for one domain, NGINX for another, and AWS API Gateway for a third creates operational fragmentation. Standardize on one or two gateway technologies to build operational expertise.
A good heuristic: one edge gateway for external traffic, plus one internal gateway per major domain boundary where teams need independent deployment and versioning control. For most organizations, this means 1-5 gateways total, not dozens.
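Under this heuristic, the edge gateway's routing table stays small: one entry per domain gateway, with routing inside each domain left to that domain's gateway. A minimal sketch, assuming longest-prefix matching on path segments; the hostnames and path prefixes are hypothetical.

```typescript
interface DomainRoute {
  prefix: string;  // Path prefix owned by a domain
  gateway: string; // Internal gateway that handles the domain
}

// Hypothetical internal hostnames, one per domain boundary.
const edgeRoutes: DomainRoute[] = [
  { prefix: "/api/payments", gateway: "payments-gateway.internal" },
  { prefix: "/api/users", gateway: "user-gateway.internal" },
  { prefix: "/api", gateway: "commerce-gateway.internal" },
];

// Longest matching prefix wins. A prefix only matches at a path-segment
// boundary, so "/api/paymentsx" does not match "/api/payments".
function resolveGateway(path: string, routes: DomainRoute[]): string | undefined {
  let best: DomainRoute | undefined;
  for (const route of routes) {
    const matches = path === route.prefix || path.startsWith(route.prefix + "/");
    if (matches && (best === undefined || route.prefix.length > best.prefix.length)) {
      best = route;
    }
  }
  return best?.gateway;
}

const refundTarget = resolveGateway("/api/payments/refunds/42", edgeRoutes);
// "payments-gateway.internal"
```

The edge table changes only when a domain is added or removed; versioning and per-endpoint routing changes stay within the owning team's gateway.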
Gateway placement is a strategic architectural decision that affects security, performance, cost, and operational complexity. Let's consolidate the key insights:
What's Next:
With a thorough understanding of gateway placement, we'll now explore the critical distinction between API Gateways and Load Balancers—two often-confused components that serve different purposes in your architecture. Understanding their differences is essential for correct infrastructure design.
You now understand where to place API Gateways in your architecture. From edge placement for external traffic to internal gateways for domain boundaries, from simple single-tier to sophisticated multi-tier architectures, you can make informed decisions about gateway placement for your organization's needs.