A mobile app needs to display a user's dashboard: their profile, recent orders, loyalty points, and personalized recommendations. In a microservices architecture, this data lives in four different services. Should the mobile app make four separate API calls, managing failures, latency, and aggregation on the client? Or should a single API call return everything the screen needs?
Service Composition is the pattern where the API Gateway orchestrates calls to multiple backend services, aggregates their responses, and returns a unified response to the client. This moves complexity from clients to the infrastructure, enabling simpler client code and more efficient network usage.
But composition is a double-edged sword. Done well, it dramatically improves client experience. Done poorly, it turns the gateway into a tangled mess of business logic. This page provides a comprehensive exploration of service composition patterns, implementation strategies, and the critical decision of when to compose at the gateway versus elsewhere.
By the end of this page, you will understand service composition patterns (aggregation, orchestration, choreography), implementation strategies for parallel and sequential calls, error handling in composed requests, performance optimization techniques, and criteria for deciding when gateway composition is appropriate.
In a microservices architecture, a single client request often requires data from multiple services. Without composition, clients face significant challenges:
The Client-Side Aggregation Problem:

When aggregation lives in the client, the app must issue one network call per service, absorb the combined latency and failure modes of each, and stay coupled to every backend's API.
The Composition Solution:
Service composition moves aggregation to the server side. The client makes a single request; the gateway calls multiple services in parallel (or sequence), combines responses, and returns a unified result.
| Aspect | Client-Side | Gateway Composition |
|---|---|---|
| Network Calls | Multiple (one per service) | Single (to gateway) |
| Latency | Sum of sequential calls | Max of parallel calls |
| Bandwidth | Multiple request/response cycles | Single optimized payload |
| Error Handling | Client must aggregate errors | Gateway provides unified error |
| Service Coupling | Client knows all services | Client knows one endpoint |
| Caching | Per-service caching only | Gateway can cache composed response |
Service composition at the gateway is closely related to the Backend-for-Frontend (BFF) pattern. A BFF is a specialized API layer tailored to a specific client's needs (mobile, web, IoT). Gateway composition can implement BFF patterns without separate services.
Different composition patterns address different needs. Understanding these patterns helps you choose the right approach for each use case.
Pattern Overview:
```yaml
# Kong configuration for parallel aggregation
routes:
  - name: user-dashboard
    paths: ["/api/dashboard"]
    plugins:
      - name: request-termination
        config:
          # This route doesn't call a single backend
          # Instead, composition plugin handles it
      - name: composition
        config:
          # All calls execute in parallel
          parallel: true
          timeout: 3000  # Overall timeout

          # Backend calls
          calls:
            - name: profile
              url: "http://user-service/users/{userId}"
              timeout: 1000
            - name: orders
              url: "http://order-service/users/{userId}/recent?limit=5"
              timeout: 1500
            - name: loyalty
              url: "http://loyalty-service/users/{userId}/points"
              timeout: 1000
            - name: notifications
              url: "http://notification-service/users/{userId}/unread?limit=10"
              timeout: 1000

          # Response template
          response:
            template: |
              {
                "profile": ${profile.body},
                "recentOrders": ${orders.body.items},
                "loyaltyPoints": ${loyalty.body.balance},
                "notifications": ${notifications.body.items}
              }
```

Use simple aggregation when calls are independent. Use sequential orchestration when there are true data dependencies. Avoid conditional composition in the gateway when conditions reflect business logic—that belongs in a service.
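To make the sequential case concrete, here is a minimal TypeScript sketch in which the second call genuinely depends on the first call's result. The service URLs, the `fetchJson` helper, and the `shipmentId` field are illustrative assumptions, not part of any specific gateway product.

```typescript
// Minimal sketch of sequential orchestration: the second call depends on the first.
// fetchJson and the service URLs are illustrative, not a real gateway API.
async function fetchJson<T>(url: string, timeoutMs: number): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(url, { signal: controller.signal });
    if (!res.ok) throw new Error(`Upstream returned ${res.status}`);
    return (await res.json()) as T;
  } finally {
    clearTimeout(timer);
  }
}

interface Order { id: string; shipmentId: string }
interface Shipment { id: string; status: string; eta: string }

// Step 2 cannot even be constructed until step 1 completes:
// the shipment lookup needs the shipmentId from the order.
async function getOrderWithShipment(orderId: string) {
  const order = await fetchJson<Order>(
    `http://order-service/orders/${orderId}`, 1000);
  const shipment = await fetchJson<Shipment>(
    `http://shipping-service/shipments/${order.shipmentId}`, 1000);
  return { order, shipment };
}
```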
Implementing composition requires careful attention to concurrency, timeout management, and response handling. Here's how production systems approach these challenges.
Parallel Execution:
```typescript
// Parallel composition with Promise.allSettled
interface ServiceCall {
  name: string;
  url: string;
  timeout: number;
  optional?: boolean;
}

// Minimal request context forwarded to each backend call
interface RequestContext {
  userId: string;
  requestId: string;
  clientIp: string;
}

interface ComposedResponse {
  [key: string]: unknown;
  _meta: {
    latency: number;
    services: {
      [name: string]: {
        status: 'success' | 'failed' | 'timeout';
        latency: number;
      };
    };
  };
}

async function composeServices(
  calls: ServiceCall[],
  context: RequestContext
): Promise<ComposedResponse> {
  const startTime = Date.now();

  // Execute all calls in parallel
  const promises = calls.map(call =>
    executeWithTimeout(call, context)
  );

  // Wait for all, don't fail fast
  const results = await Promise.allSettled(promises);

  // Build composed response
  const response: ComposedResponse = {
    _meta: {
      latency: Date.now() - startTime,
      services: {},
    },
  };

  for (let i = 0; i < calls.length; i++) {
    const call = calls[i];
    const result = results[i];

    if (result.status === 'fulfilled') {
      response[call.name] = result.value.body;
      response._meta.services[call.name] = {
        status: 'success',
        latency: result.value.latency,
      };
    } else {
      // Record the failure; an aborted fetch means the per-call timeout fired
      const timedOut = (result.reason as Error)?.name === 'AbortError';
      response._meta.services[call.name] = {
        status: timedOut ? 'timeout' : 'failed',
        latency: 0,
      };
      // Required services still appear in the payload as an explicit null
      if (!call.optional) {
        response[call.name] = null;
      }
    }
  }

  return response;
}

async function executeWithTimeout(
  call: ServiceCall,
  context: RequestContext
): Promise<{ body: unknown; latency: number }> {
  const controller = new AbortController();
  const timeout = setTimeout(
    () => controller.abort(),
    call.timeout
  );

  const startTime = Date.now();

  try {
    const response = await fetch(
      call.url.replace('{userId}', context.userId),
      {
        headers: {
          'X-Request-ID': context.requestId,
          'X-Forwarded-For': context.clientIp,
        },
        signal: controller.signal,
      }
    );

    if (!response.ok) {
      throw new Error(`Service returned ${response.status}`);
    }

    return {
      body: await response.json(),
      latency: Date.now() - startTime,
    };
  } finally {
    clearTimeout(timeout);
  }
}
```

Timeout Management:
Composed requests have an overall timeout, but individual service timeouts matter too. Key considerations:
| Timeout Level | Purpose | Typical Value |
|---|---|---|
| Overall composition | Maximum time for entire operation | 3-5 seconds |
| Individual service | Prevent one slow service from blocking all | 1-2 seconds |
| Connection timeout | Fail fast on unreachable services | 100-500ms |
| Read timeout | Handle slow response streaming | 2-3 seconds |
For sequential composition, the timeout budget is distributed across steps: with 3 sequential calls and a 3-second overall timeout, each call gets roughly 1 second. For parallel calls, the overall timeout only needs to cover the longest individual timeout. Plan accordingly.
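One way to make that budgeting concrete is a shared deadline that sequential steps draw down, sketched below. The `Deadline` class, per-call caps, and `fetchJson` helper are assumptions for illustration, not a prescribed implementation.

```typescript
// Sketch: a shared deadline that sequential composition steps draw down.
// Each step gets whatever time remains, capped by its own per-call limit.
class Deadline {
  private readonly endsAt: number;
  constructor(totalMs: number) {
    this.endsAt = Date.now() + totalMs;
  }
  // Remaining budget, optionally capped by a per-call timeout
  remaining(capMs?: number): number {
    const left = Math.max(0, this.endsAt - Date.now());
    return capMs === undefined ? left : Math.min(left, capMs);
  }
}

async function fetchJson<T>(url: string, timeoutMs: number): Promise<T> {
  const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
  if (!res.ok) throw new Error(`Upstream returned ${res.status}`);
  return (await res.json()) as T;
}

async function sequentialCompose(userId: string) {
  const deadline = new Deadline(3000); // 3s overall budget for the whole composition

  // Step 1 may use up to 1.5s, but never more than what remains overall
  const user = await fetchJson(
    `http://user-service/users/${userId}`, deadline.remaining(1500));

  // Step 2 inherits whatever budget step 1 left behind
  const orders = await fetchJson(
    `http://order-service/users/${userId}/recent`, deadline.remaining(1500));

  return { user, orders };
}
```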
When composing multiple services, partial failures are common. How you handle these failures dramatically affects user experience.
Error Handling Strategies:
```yaml
# Gateway composition with partial response handling
routes:
  - name: dashboard
    paths: ["/api/dashboard"]
    plugins:
      - name: composition
        config:
          # Don't fail entire request on partial failure
          fail_on_error: false

          calls:
            # Critical - dashboard fails without this
            - name: profile
              url: "http://user-service/users/{userId}"
              required: true

            # Optional - dashboard can render without
            - name: orders
              url: "http://order-service/users/{userId}/recent"
              required: false
              fallback:
                value: { items: [], message: "Orders temporarily unavailable" }

            - name: loyalty
              url: "http://loyalty-service/users/{userId}/points"
              required: false
              fallback:
                cached: true  # Use cached value if available
                ttl: 300

            - name: recommendations
              url: "http://recommendation-service/users/{userId}"
              required: false
              fallback:
                url: "http://content-service/trending"  # Call different service

          response:
            # Include status for each service
            include_errors: true
            template: |
              {
                "data": {
                  "profile": ${profile.body},
                  "orders": ${orders.body ?? orders.fallback},
                  "loyalty": ${loyalty.body ?? loyalty.fallback},
                  "recommendations": ${recommendations.body ?? recommendations.fallback}
                },
                "status": {
                  "orders": ${orders.success ? "live" : "stale"},
                  "loyalty": ${loyalty.success ? "live" : "cached"},
                  "recommendations": ${recommendations.success ? "personalized" : "trending"}
                }
              }
```

When returning cached or fallback data, tell the client. Response headers like `X-Data-Source: cached` or response fields like `"loyalty.source": "cached"` help clients render appropriate UI (e.g., showing 'Loyalty points as of 5 min ago').
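As a rough sketch of that tip, the handler below tags each composed fragment with where its data came from and when it was produced, so the client can render a staleness hint. The cache shape, field names, and loyalty-service URL are assumptions, not a specific gateway's API.

```typescript
// Sketch: annotate a composed fragment with its data source so clients
// can render "as of N minutes ago" UI. Cache and field names are illustrative.
type Source = 'live' | 'cached' | 'fallback';

interface Fragment<T> {
  data: T;
  source: Source;
  asOf: string; // ISO timestamp of when the data was produced
}

async function loadLoyalty(
  userId: string,
  cache: Map<string, { points: number; asOf: string }>
): Promise<Fragment<{ points: number }>> {
  try {
    const res = await fetch(`http://loyalty-service/users/${userId}/points`, {
      signal: AbortSignal.timeout(1000),
    });
    if (!res.ok) throw new Error(`loyalty returned ${res.status}`);
    const body = await res.json();
    return { data: { points: body.balance }, source: 'live', asOf: new Date().toISOString() };
  } catch {
    // Prefer a cached value over an empty fallback, and say which one was used
    const cached = cache.get(userId);
    if (cached) {
      return { data: { points: cached.points }, source: 'cached', asOf: cached.asOf };
    }
    return { data: { points: 0 }, source: 'fallback', asOf: new Date().toISOString() };
  }
}
```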
Composition adds latency inherently—the gateway must call services, wait for responses, and aggregate. Optimizing this path is critical for user experience.
Optimization Techniques:
| Technique | Description | Impact |
|---|---|---|
| Parallel Execution | Call independent services simultaneously | Latency = max(calls) instead of sum |
| Connection Pooling | Reuse HTTP connections to backends | Eliminate connection overhead (50-100ms) |
| Response Caching | Cache composed responses or individual service responses | Eliminate backend calls entirely |
| Request Deduplication | Batch identical concurrent requests | Reduce backend load |
| Field Selection | Only request needed fields from backends | Reduce payload size and serialization |
| Streaming Aggregation | Stream partial responses as they arrive | Faster time-to-first-byte |
```typescript
// Request deduplication / batching using DataLoader pattern
import DataLoader from 'dataloader';

// Create loaders for each service (per-request scope)
function createLoaders() {
  return {
    users: new DataLoader<string, User>(async (userIds) => {
      // Batch all user requests into single call
      const users = await userService.getBatch(userIds);
      // Return in same order as requested
      return userIds.map(id =>
        users.find(u => u.id === id) ?? null
      );
    }, {
      maxBatchSize: 100,
      batchScheduleFn: (callback) => setTimeout(callback, 10),
    }),

    orders: new DataLoader<string, Order[]>(async (userIds) => {
      // Batch order requests
      const ordersByUser = await orderService.getRecentForUsers(userIds);
      return userIds.map(id => ordersByUser[id] ?? []);
    }),
  };
}

// Usage in composition
async function composeDashboard(userId: string, loaders: Loaders) {
  // These may be called multiple times but only execute once per unique userId
  const [user, orders, recommendations] = await Promise.all([
    loaders.users.load(userId),
    loaders.orders.load(userId),
    // Recommendations might need user data first
    loaders.users.load(userId).then(u =>
      recommendationService.getForPreferences(u.preferences)
    ),
  ]);

  return { user, orders, recommendations };
}
```

Caching composed responses is powerful but requires careful invalidation. If any underlying service data changes, the composed cache must be invalidated. Event-driven invalidation or short TTLs with stale-while-revalidate provide good tradeoffs.
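As a sketch of the stale-while-revalidate idea mentioned above, the small in-memory cache below serves a recently expired composed response immediately and refreshes it in the background. The TTL values, class name, and the `composeDashboard` dependency are illustrative assumptions.

```typescript
// Sketch: cache composed responses with a short TTL plus stale-while-revalidate.
// Within ttlMs we serve from cache; between ttlMs and staleMs we serve the
// stale entry immediately and refresh in the background; beyond that we recompute.
interface CacheEntry { value: unknown; storedAt: number }

class ComposedCache {
  private entries = new Map<string, CacheEntry>();
  constructor(private ttlMs = 30_000, private staleMs = 300_000) {}

  async get(key: string, compute: () => Promise<unknown>): Promise<unknown> {
    const entry = this.entries.get(key);
    const age = entry ? Date.now() - entry.storedAt : Infinity;

    if (entry && age < this.ttlMs) {
      return entry.value; // fresh hit
    }
    if (entry && age < this.staleMs) {
      // Stale hit: return immediately, refresh in the background
      void compute()
        .then(value => this.entries.set(key, { value, storedAt: Date.now() }))
        .catch(() => { /* keep serving the stale entry if the refresh fails */ });
      return entry.value;
    }
    // Miss (or too stale): recompute synchronously
    const value = await compute();
    this.entries.set(key, { value, storedAt: Date.now() });
    return value;
  }
}

// Usage (composeDashboard is the hypothetical composition function from earlier):
// const cache = new ComposedCache();
// const dashboard = await cache.get(`dashboard:${userId}`, () => composeDashboard(userId, loaders));
```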
Gateway composition is powerful, but it's not always the right choice. Overusing composition turns the gateway into a monolith-in-disguise—a 'god gateway' that knows too much about business logic.
Signs Gateway Composition is Wrong:

- Composition configs full of conditionals, loops, or business rules rather than simple fan-out and response templating
- Gateway deployments required for every feature change because product logic lives in composition definitions
- Composed endpoints that need their own domain models, test suites, or data access to behave correctly
Alternatives to Gateway Composition:
| Alternative | When to Use |
|---|---|
| Backend-for-Frontend (BFF) | Create dedicated services for each client type (mobile BFF, web BFF) |
| GraphQL | Let clients specify exactly what they need; federation composes automatically |
| Aggregator Service | Dedicated microservice for complex composition and business logic |
| Client-Side Composition | When clients need fine-grained control or offline capability |
If you're writing 'if' statements or loops in your gateway composition, you've gone too far. Gateways should be configured, not programmed. When composition becomes complex, extract it into a dedicated service that can be properly tested, deployed, and maintained.
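As a rough illustration of that extraction, the sketch below moves the conditional logic into a dedicated aggregator service, leaving the gateway with a plain route to `/dashboard`. The Express setup, downstream URLs, and the `tier` field are assumptions, not a prescribed design.

```typescript
// Sketch: a dedicated aggregator service owns the business logic the gateway
// should not contain. Express and the downstream URLs are illustrative.
import express from 'express';

const app = express();

app.get('/dashboard/:userId', async (req, res) => {
  const { userId } = req.params;

  const profile = await (await fetch(`http://user-service/users/${userId}`)).json();

  // Business rule lives here, not in gateway config: premium users get
  // personalized recommendations, everyone else gets trending content.
  const recsUrl = profile.tier === 'premium'
    ? `http://recommendation-service/users/${userId}`
    : `http://content-service/trending`;

  // Remaining calls are independent, so fan out in parallel
  const [ordersRes, recsRes] = await Promise.all([
    fetch(`http://order-service/users/${userId}/recent?limit=5`),
    fetch(recsUrl),
  ]);
  const [orders, recommendations] = await Promise.all([ordersRes.json(), recsRes.json()]);

  res.json({ profile, orders, recommendations });
});

app.listen(3000);
```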
GraphQL offers an alternative approach to service composition that's worth understanding. Instead of the gateway defining composed endpoints, clients specify exactly what they need, and the GraphQL layer assembles data from multiple services.
GraphQL Federation:
```graphql
# Client query - automatically composed from multiple services
query DashboardQuery($userId: ID!) {
  user(id: $userId) {
    # From User Service
    id
    name
    email

    # From Order Service (User type extended)
    orders(first: 5) {
      edges {
        node {
          id
          total
          status
          # From Product Service (Order type extended)
          items {
            product {
              name
              imageUrl
            }
            quantity
          }
        }
      }
    }

    # From Loyalty Service (User type extended)
    loyaltyAccount {
      points
      tier
      expiringPoints(within: "30d")
    }

    # From Recommendation Service
    recommendations(limit: 4) {
      id
      reason
      product {
        name
        price
        imageUrl
      }
    }
  }
}

# Gateway automatically:
# 1. Parses query
# 2. Creates query plan
# 3. Fetches from User Service
# 4. In parallel: Order, Loyalty, Recommendation Services
# 5. Resolves Product references
# 6. Assembles and returns response
```

REST vs. GraphQL Composition:
| Aspect | REST Gateway Composition | GraphQL Federation |
|---|---|---|
| Query Flexibility | Fixed endpoints | Client-specified queries |
| Caching | HTTP caching (simple) | Needs special handling |
| Error Model | HTTP status codes | Partial data + errors |
| Learning Curve | Lower | Higher |
| Tooling | Standard HTTP tools | GraphQL-specific tools |
| Best For | Simple aggregation | Complex, evolving data needs |
GraphQL excels when clients have diverse data needs and when field-level granularity matters. For simple aggregation of a few well-defined endpoints, REST composition is simpler. Choose based on your actual use case, not technology hype.
Service composition is a powerful capability that, used judiciously, dramatically improves client experience in microservices architectures. Let's consolidate the key insights:

- Compose at the gateway when the work is simple fan-out: parallel calls to independent services merged into one response.
- Budget timeouts deliberately: parallel composition costs roughly the slowest call, while sequential composition divides the overall budget across steps.
- Design for partial failure: mark services as required or optional, provide fallbacks, and tell clients when data is cached or stale.
- Optimize the hot path with connection pooling, response caching, request deduplication, and field selection.
- Keep business logic out of the gateway; when composition needs conditionals or domain rules, move it to a BFF, an aggregator service, or GraphQL federation.
Module Complete:
This concludes our deep dive into API Gateway in Microservices. Across five pages, we've explored gateway responsibilities, request routing, authentication aggregation, rate limiting, and service composition. You now have a comprehensive understanding of how API Gateways serve as the critical entry point for distributed systems, handling everything from security to traffic management to developer experience.
You've mastered API Gateway concepts in microservices architectures. From routing and security to rate limiting and composition, you understand how gateways enable scalable, secure, and developer-friendly APIs. Apply these patterns to design gateway strategies that enhance your distributed systems without creating new bottlenecks.