Modern microservices architectures decompose functionality across many services, each responsible for a single domain. While this decomposition offers tremendous benefits for team autonomy and system scalability, it creates a new problem: a single user operation often requires data from multiple services.
Consider a simple e-commerce product page. Displaying it requires data from: the Product Catalog Service (name, description, images), the Inventory Service (stock levels), the Pricing Service (current price, discounts), the Review Service (ratings, comments), the Recommendation Service (related products), and potentially the User Service (personalization data).
Without a BFF, clients must either make six separate API calls (slow, complex, battery-draining) or backend services must be coupled together to produce composite responses (defeating the purpose of decomposition). API aggregation in the BFF layer solves this problem elegantly.
By the end of this page, you will master API aggregation patterns including parallel fetching, sequential dependencies, partial failure handling, data joining strategies, aggregation composition patterns, caching strategies, and performance optimization techniques that enable BFFs to compose data from dozens of services into cohesive client responses.
API aggregation is the process of collecting data from multiple sources and presenting it as a unified response. This seemingly simple concept involves sophisticated engineering to execute well.
Every aggregation operation follows a conceptual pipeline (fetch, join, transform, compose), and its design is guided by five principles:
Minimize total latency — The aggregation layer should add as little latency as possible above the slowest downstream call.
Maximize parallelism — Independent calls should execute concurrently; serialization only where dependencies require it.
Fail gracefully — Partial failures should produce partial responses, not complete failures.
Maintain data integrity — Aggregated data must be consistent and coherent even when sources have different update frequencies.
Hide service complexity — Clients should be unaware of how many services were called or how data was assembled.
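The fetch, join, transform, compose flow behind these principles can be sketched as a generic function. All names and types here are illustrative, not a specific framework's API:

```typescript
// Illustrative sketch of the fetch → join → transform → compose pipeline.
// Real BFFs will have richer, domain-specific types at each stage.

type Fetched = Record<string, unknown>;        // raw results keyed by service name
type Joined = Record<string, unknown>;         // results linked by shared IDs
type ClientResponse = Record<string, unknown>; // final client-facing shape

async function aggregate(
  fetch: () => Promise<Fetched>,
  join: (f: Fetched) => Joined,
  transform: (j: Joined) => Joined,
  compose: (j: Joined) => ClientResponse,
): Promise<ClientResponse> {
  const fetched = await fetch();     // 1. gather data (maximally parallel)
  const joined = join(fetched);      // 2. connect records across services
  const shaped = transform(joined);  // 3. localize, format, filter fields
  return compose(shaped);            // 4. assemble the client-facing response
}
```

Each stage is a pure function of the previous stage's output, which makes the pipeline easy to test in isolation.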
The choice between parallel and sequential fetching is the most impactful decision in aggregation design. Getting this wrong can make responses 5-10x slower than necessary.
The fundamental question: does call B require data from call A's response?
```typescript
// ❌ ANTIPATTERN: Sequential fetching when parallel is possible

async function getProductPage(productId: string) {
  // Each call awaits before the next starts.
  // Total time: sum of all calls.
  const product = await productService.get(productId);         // 100ms
  const inventory = await inventoryService.get(productId);     // + 80ms
  const pricing = await pricingService.get(productId);         // + 60ms
  const reviews = await reviewService.get(productId);          // + 120ms
  const related = await recommendationService.get(productId);  // + 150ms
  // Total: ~510ms 😱
  return { product, inventory, pricing, reviews, related };
}
```
```typescript
// ✅ CORRECT: Parallel fetching for independent calls

async function getProductPage(productId: string) {
  // All calls start simultaneously.
  // Total time: slowest call only.
  const [product, inventory, pricing, reviews, related] = await Promise.all([
    productService.get(productId),
    inventoryService.get(productId),
    pricingService.get(productId),
    reviewService.get(productId),
    recommendationService.get(productId),
  ]);
  // Total: ~150ms (slowest call) ✨ — a 70% latency reduction!
  return { product, inventory, pricing, reviews, related };
}
```

Real-world aggregations often have complex dependency graphs. The key is to extract the maximum possible parallelism at each stage:
```typescript
// Complex dependency graph with optimized execution

interface DependencyGraph {
  // User profile: no dependencies
  // Product catalog: no dependencies
  // Pricing: depends on product (needs product tier)
  // Recommendations: depends on user (needs preferences)
  // Dynamic bundle: depends on product AND pricing
  // Personalized price: depends on user AND pricing
}

async function getComplexProductPage(userId: string, productId: string) {
  // Stage 1: Independent calls in parallel
  const [user, product] = await Promise.all([
    userService.getProfile(userId),
    productService.get(productId),
  ]);
  // Time: max(userService, productService) ≈ 100ms

  // Stage 2: Calls that depend on Stage 1, parallel within the stage
  const [pricing, recommendations] = await Promise.all([
    pricingService.get(productId, { tier: product.pricingTier }),
    recommendationService.get(userId, { preferences: user.preferences }),
  ]);
  // Time: max(pricingService, recommendationService) ≈ 80ms

  // Stage 3: Calls that depend on Stage 2
  const [dynamicBundle, personalizedPrice] = await Promise.all([
    bundleService.calculate(product, pricing),
    pricingService.personalize(pricing, user.membershipLevel),
  ]);
  // Time: max(bundleService, personalize) ≈ 50ms

  // Total: 100 + 80 + 50 = 230ms
  // Fully sequential would be: 100 + 100 + 80 + 80 + 50 + 50 = 460ms
  // 50% improvement through dependency analysis
  return { user, product, pricing, recommendations, dynamicBundle, personalizedPrice };
}

// Visualizing the execution:
//
// Time →
// [0ms]                 [100ms]               [180ms]        [230ms]
//   │                      │                     │               │
//   ├─ userService ───────►│                     │               │
//   ├─ productService ────►│                     │               │
//   │                      ├─ pricingService ───►│               │
//   │                      ├─ recommendations ──►│               │
//   │                      │                     ├─ bundle ─────►│
//   │                      │                     ├─ personalize ►│
```

For complex BFFs, consider building a dependency graph analyzer that validates your aggregation code against the actual dependency requirements. This prevents accidental sequential execution when parallel execution is possible, a common source of latency regressions.
In a microservices environment, partial failures are not exceptional—they're routine. Any downstream service may be slow, return errors, or be completely unavailable. The BFF aggregation layer must handle these failures gracefully.
Different failures require different handling strategies:
| Failure Type | Example | Handling Strategy |
|---|---|---|
| Critical Data Missing | Product service returns 404 | Fail the entire request; partial response would be meaningless |
| Non-Critical Data Missing | Reviews service timeout | Continue with empty/default reviews section |
| Degraded Data Available | Recommendation service returns cached data | Use degraded data; indicate staleness to client |
| Transient Error | Service returns 503 | Retry with backoff; fallback if retries exhausted |
| Semantic Error | Invalid product ID format | Fail fast; no retry will help |
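The "retry with backoff" strategy for transient errors can be sketched as a small helper. The exponential delays, jitter, and the `RangeError` stand-in for "non-retryable semantic error" are all illustrative choices, not a prescribed API:

```typescript
// Hypothetical retry helper for transient (503-style) failures.
// Retries with exponential backoff plus jitter; gives up after maxAttempts.

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Semantic errors (bad input) will never succeed on retry: rethrow.
      if (error instanceof RangeError) throw error; // stand-in for "non-retryable"
      if (attempt === maxAttempts) break;
      // Exponential backoff with jitter to avoid thundering herds.
      const delay = baseDelayMs * 2 ** (attempt - 1) + Math.random() * baseDelayMs;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

In a BFF, a helper like this would typically wrap only the calls classified as transient-retryable in the table above.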
JavaScript's `Promise.allSettled` is the foundation of partial failure handling. Unlike `Promise.all`, which fails fast on any rejection, `allSettled` completes all promises and reports individual outcomes:
```typescript
// Comprehensive partial failure handling

type FetchResult<T> =
  | { status: 'success'; data: T; latencyMs: number }
  | { status: 'failed'; error: Error; fallback: T }
  | { status: 'degraded'; data: T; reason: string };

interface AggregatedProductPage {
  product: ProductDetails;     // Critical
  inventory: InventoryStatus;  // Critical
  pricing: PricingInfo;        // Critical
  reviews: ReviewSummary;      // Non-critical
  recommendations: Product[];  // Non-critical
  _meta: {
    degradedSections: string[];
    fetchTimes: Record<string, number>;
  };
}

class ResilientAggregator {
  async getProductPage(productId: string): Promise<AggregatedProductPage> {
    // Classify services by criticality. Critical calls have no fallback,
    // so any failure rejects the Promise.all below.
    const criticalCalls = [
      this.fetchWithMeta('product', () => this.productService.get(productId)),
      this.fetchWithMeta('inventory', () => this.inventoryService.get(productId)),
      this.fetchWithMeta('pricing', () => this.pricingService.get(productId)),
    ];

    const nonCriticalCalls = [
      this.fetchWithMeta('reviews', () => this.reviewService.getSummary(productId), {
        fallback: { rating: 0, count: 0, reviews: [] },
        timeout: 500, // Aggressive timeout for non-critical data
      }),
      this.fetchWithMeta('recommendations', () => this.recService.get(productId), {
        fallback: [],
        timeout: 500,
      }),
    ];

    let criticalResults;
    try {
      criticalResults = await Promise.all(criticalCalls);
    } catch (error) {
      throw new AggregationError(
        `Critical service failure: ${(error as Error).message}`
      );
    }

    // Non-critical calls never reject: with a fallback configured they
    // resolve to a 'failed' result carrying that fallback.
    const nonCriticalResults = await Promise.all(nonCriticalCalls);

    const degradedSections: string[] = [];
    const [reviews, recommendations] = nonCriticalResults.map((result) => {
      if (result.status === 'failed') {
        degradedSections.push(result.name);
        return result.fallback;
      }
      if (result.status === 'degraded') degradedSections.push(result.name);
      return result.data;
    });

    return {
      product: criticalResults[0].data,
      inventory: criticalResults[1].data,
      pricing: criticalResults[2].data,
      reviews,
      recommendations,
      _meta: {
        degradedSections,
        fetchTimes: Object.fromEntries(
          [...criticalResults, ...nonCriticalResults]
            .filter(r => r.status === 'success')
            .map(r => [r.name, r.latencyMs])
        ),
      },
    };
  }

  private async fetchWithMeta<T>(
    name: string,
    fetcher: () => Promise<T>,
    options: { fallback?: T; timeout?: number } = {}
  ): Promise<FetchResult<T> & { name: string }> {
    const start = Date.now();
    try {
      const data = options.timeout
        ? await Promise.race([fetcher(), this.timeoutAfter<T>(options.timeout)])
        : await fetcher();
      return { name, status: 'success', data, latencyMs: Date.now() - start };
    } catch (error) {
      // With a fallback configured, failures degrade instead of propagating.
      if (options.fallback !== undefined) {
        return { name, status: 'failed', error: error as Error, fallback: options.fallback };
      }
      throw error;
    }
  }

  private timeoutAfter<T>(ms: number): Promise<T> {
    return new Promise((_, reject) =>
      setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms)
    );
  }
}
```

There's a semantic difference between "reviews are unavailable" (service down) and "no reviews exist" (service returned an empty array). Clients may want to display these differently: a spinner for retrying vs. "Be the first to review". Your aggregation layer should preserve this distinction.
Aggregated data often needs to be joined—connecting user IDs to user names, product IDs to product details, etc. This is analogous to database JOINs but executed across service boundaries.
A common anti-pattern is fetching a list and then fetching related data for each item individually:
```typescript
// ❌ N+1 ANTIPATTERN

async function getOrderHistory(userId: string) {
  // 1 call for orders
  const orders = await orderService.list(userId);

  // N calls for products - BAD!
  const enrichedOrders = await Promise.all(
    orders.map(async (order) => {
      const products = await Promise.all(
        order.productIds.map(id =>
          productService.get(id) // 💀 Called for EACH product
        )
      );
      return { ...order, products };
    })
  );

  // If the user has 20 orders with 5 products each:
  // 1 + (20 * 5) = 101 API calls! 😱
  return enrichedOrders;
}
```
```typescript
// ✅ BATCH FETCH PATTERN

async function getOrderHistory(userId: string) {
  // 1 call for orders
  const orders = await orderService.list(userId);

  // Collect all unique product IDs
  const productIds = [...new Set(orders.flatMap(o => o.productIds))];

  // 1 batch call for all products
  const products = await productService.batchGet(productIds);

  // Build a lookup map
  const productMap = new Map(products.map(p => [p.id, p]));

  // Enrich orders in memory
  const enrichedOrders = orders.map(order => ({
    ...order,
    products: order.productIds.map(id => productMap.get(id)),
  }));

  // Total: 2 API calls regardless of data size ✨
  return enrichedOrders;
}
```

The DataLoader pattern (popularized by Facebook for GraphQL) automates batch fetching with request deduplication:
```typescript
// DataLoader: Automatic batching and caching

import DataLoader from 'dataloader';

class AggregationContext {
  // Create loaders per request to ensure request-scoped caching
  productLoader = new DataLoader<string, Product>(async (ids) => {
    // This function receives ALL IDs requested during the current tick
    console.log(`Batching ${ids.length} product fetches`);
    const products = await this.productService.batchGet([...ids]);

    // Must return results in the same order as the input IDs
    const productMap = new Map(products.map(p => [p.id, p]));
    return ids.map(id =>
      productMap.get(id) ?? new Error(`Product ${id} not found`)
    );
  }, {
    maxBatchSize: 100, // Limit batch size
    cache: true,       // Enable request-scoped caching
  });

  userLoader = new DataLoader<string, User>(async (ids) => {
    const users = await this.userService.batchGet([...ids]);
    const userMap = new Map(users.map(u => [u.id, u]));
    return ids.map(id =>
      userMap.get(id) ?? new Error(`User ${id} not found`)
    );
  });
}

// Usage becomes simple - DataLoader handles batching transparently
async function getCommentsWithAuthors(postId: string, ctx: AggregationContext) {
  const comments = await commentService.getByPost(postId);

  // Each .load() call is automatically batched
  const enrichedComments = await Promise.all(
    comments.map(async (comment) => ({
      ...comment,
      author: await ctx.userLoader.load(comment.authorId),
      // Even if the same user commented twice, only 1 fetch occurs (caching)
    }))
  );

  // If 50 comments by 30 unique users:
  // Without DataLoader: 50 API calls
  // With DataLoader: 1 batched API call for 30 users
  return enrichedComments;
}
```

DataLoader instances should be created per request, not globally. Global DataLoaders cause cross-request cache pollution and memory leaks. Create fresh loaders for each incoming request to the BFF.
Once data is fetched and joined, it must be composed into the final response structure. Several patterns exist for organizing this composition logic.
For complex responses with many conditional sections, the Builder pattern provides clarity:
```typescript
// Builder pattern for complex response composition

class HomeScreenResponseBuilder {
  private response: Partial<HomeScreenResponse> = {};
  private degradedSections: string[] = [];

  withUser(user: User | null): this {
    if (user) {
      this.response.user = {
        id: user.id,
        name: user.displayName,
        avatar: user.avatarUrl,
        membershipTier: user.subscription?.tier ?? 'free',
      };
    } else {
      this.response.user = null;
      this.degradedSections.push('user');
    }
    return this;
  }

  withContinueWatching(items: WatchHistoryItem[]): this {
    this.response.continueWatching = items
      .filter(item => !item.completed)
      .slice(0, 10)
      .map(item => ({
        id: item.contentId,
        title: item.title,
        thumbnail: item.thumbnailUrl,
        progress: Math.round((item.position / item.duration) * 100),
        resumePosition: item.position,
      }));
    return this;
  }

  withRecommendations(
    recommendations: Recommendation[],
    fallback: boolean = false
  ): this {
    this.response.recommendations = recommendations.slice(0, 20).map(rec => ({
      id: rec.contentId,
      title: rec.title,
      thumbnail: rec.thumbnailUrl,
      reason: rec.recommendationReason,
      score: rec.confidenceScore,
    }));
    if (fallback) {
      this.degradedSections.push('recommendations');
      this.response.recommendations.forEach(r => r.reason = 'Popular content');
    }
    return this;
  }

  withFeaturedContent(featured: FeaturedContent | null): this {
    if (featured) {
      this.response.featured = {
        type: featured.contentType,
        id: featured.contentId,
        title: featured.title,
        hero: featured.heroImageUrl,
        cta: featured.callToAction,
      };
    }
    return this;
  }

  build(): HomeScreenResponse {
    return {
      ...this.response as HomeScreenResponse,
      _meta: {
        timestamp: Date.now(),
        degradedSections: this.degradedSections,
        version: '2.0',
      },
    };
  }
}

// Helper: unwrap a settled promise, or return null if it rejected
function extract<T>(result: PromiseSettledResult<T>): T | null {
  return result.status === 'fulfilled' ? result.value : null;
}

// Usage
async function getHomeScreen(userId: string): Promise<HomeScreenResponse> {
  const [user, history, recs, featured] = await Promise.allSettled([
    userService.get(userId),
    historyService.getRecent(userId),
    recommendationService.getForUser(userId),
    contentService.getFeatured(),
  ]);

  return new HomeScreenResponseBuilder()
    .withUser(extract(user))
    .withContinueWatching(extract(history) ?? [])
    .withRecommendations(
      extract(recs) ?? await getFallbackRecommendations(),
      recs.status === 'rejected'
    )
    .withFeaturedContent(extract(featured))
    .build();
}
```

For responses that undergo multiple transformation stages, a pipeline pattern ensures clean separation of concerns:
```typescript
// Transformer pipeline for multi-stage processing

type Transformer<T> = (data: T) => T | Promise<T>;

class TransformationPipeline<T> {
  private transformers: Transformer<T>[] = [];

  add(transformer: Transformer<T>): this {
    this.transformers.push(transformer);
    return this;
  }

  async execute(initial: T): Promise<T> {
    let result = initial;
    for (const transformer of this.transformers) {
      result = await transformer(result);
    }
    return result;
  }
}

// Define individual transformers
const addLocalizedStrings: Transformer<ProductResponse> = (data) => ({
  ...data,
  title: localize(data.titleKey),
  description: localize(data.descriptionKey),
});

const addPricingDisplay: Transformer<ProductResponse> = (data) => ({
  ...data,
  displayPrice: formatCurrency(data.price, data.currency),
  originalPrice: data.discount
    ? formatCurrency(data.originalPrice, data.currency)
    : null,
});

const addAvailabilityBadge: Transformer<ProductResponse> = (data) => ({
  ...data,
  // Check the out-of-stock case first; zero inventory is also "< 5"
  badge: data.inventory === 0
    ? { type: 'error', text: 'Out of stock' }
    : data.inventory < 5
      ? { type: 'warning', text: 'Low stock' }
      : null,
});

const filterSensitiveFields: Transformer<ProductResponse> = (data) => {
  const { internalCost, supplierMargin, ...safe } = data;
  return safe;
};

// Compose the pipeline
const productPipeline = new TransformationPipeline<ProductResponse>()
  .add(addLocalizedStrings)
  .add(addPricingDisplay)
  .add(addAvailabilityBadge)
  .add(filterSensitiveFields);

// Execute
const finalResponse = await productPipeline.execute(rawProductData);
```

Caching is essential for aggregation performance, but aggregated responses create unique caching challenges. You must decide where to cache, what to cache, and how to handle cache invalidation across multiple data sources.
| Layer | What's Cached | TTL Strategy | Invalidation |
|---|---|---|---|
| Response Cache | Complete aggregated responses | Short (1-5 min) | Time-based or user-action triggered |
| Service Result Cache | Individual service responses | Varies by service (1 min - 1 hour) | Service-specific webhooks or polling |
| Computed Value Cache | Expensive transformations | Until source data changes | Dependency tracking |
| Reference Data Cache | Categories, locales, configs | Long (hours - days) | Deployment or admin action |
When aggregating cached data from multiple sources, each source may have different freshness. The aggregated response is only as fresh as its stalest component:
Response freshness = min(freshness of each cached component)
This creates a tradeoff: aggressive caching improves performance but risks showing stale data. Conservative caching ensures freshness but increases latency and downstream load.
```typescript
// Freshness-aware caching strategy

interface CachedData<T> {
  data: T;
  cachedAt: number;
  ttlMs: number;
  source: 'cache' | 'fresh';
}

const FRESHNESS_CONFIG = {
  user:            { ttl: 60_000,    critical: true  }, // 1 min, must be fresh
  product:         { ttl: 300_000,   critical: true  }, // 5 min, must be fresh
  reviews:         { ttl: 3_600_000, critical: false }, // 1 hour, stale OK
  recommendations: { ttl: 1_800_000, critical: false }, // 30 min, stale OK
};

class FreshnessAwareAggregator {
  async aggregate(productId: string, userId: string) {
    const keys = ['user', 'product', 'reviews', 'recommendations'] as const;

    // Fetch all with freshness metadata
    const results = await Promise.all([
      this.getWithFreshness('user', () => this.userService.get(userId)),
      this.getWithFreshness('product', () => this.productService.get(productId)),
      this.getWithFreshness('reviews', () => this.reviewService.get(productId)),
      this.getWithFreshness('recommendations', () => this.recService.get(productId)),
    ]);

    // The response is only as fresh as its stalest component
    const oldestCacheTime = Math.min(...results.map(r => r.cachedAt));
    const ageMs = Date.now() - oldestCacheTime;

    // Remaining validity determines the overall cache-control max-age
    const maxAgeSeconds = Math.floor(
      Math.min(...results.map(r => (r.ttlMs - (Date.now() - r.cachedAt)) / 1000))
    );

    return {
      user: results[0].data,
      product: results[1].data,
      reviews: results[2].data,
      recommendations: results[3].data,
      _cache: {
        age: Math.floor(ageMs / 1000),
        maxAge: Math.max(0, maxAgeSeconds),
        stale: keys.filter((_, i) => results[i].source === 'cache'),
      },
    };
  }

  private async getWithFreshness<T>(
    key: keyof typeof FRESHNESS_CONFIG,
    fetcher: () => Promise<T>
  ): Promise<CachedData<T>> {
    const config = FRESHNESS_CONFIG[key];
    const cached = await this.cache.get<CachedData<T>>(key);

    // Fresh enough? Serve from cache.
    if (cached && (Date.now() - cached.cachedAt) < config.ttl) {
      return { ...cached, source: 'cache' };
    }

    // Critical data (or a cold cache) must be fetched fresh
    if (config.critical || !cached) {
      const fresh = await fetcher();
      const entry: CachedData<T> = {
        data: fresh,
        cachedAt: Date.now(),
        ttlMs: config.ttl,
        source: 'fresh',
      };
      await this.cache.set(key, entry, config.ttl);
      return entry;
    }

    // Non-critical: return stale data now, refresh in the background
    this.refreshInBackground(key, fetcher);
    return { ...cached, source: 'cache' };
  }
}
```

The stale-while-revalidate pattern serves cached data immediately while refreshing it in the background. This gives users instant responses with eventual freshness. Combine it with the Cache-Control header for CDN and browser support.
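A minimal sketch of such a stale-while-revalidate cache, with an in-memory map standing in for whatever cache backend a real BFF would use (the class and its interface are illustrative, not a specific library):

```typescript
// Minimal stale-while-revalidate cache sketch (hypothetical interface).
interface Entry<T> { data: T; cachedAt: number }

class SwrCache<T> {
  private entries = new Map<string, Entry<T>>();
  private inflight = new Map<string, Promise<T>>();

  constructor(private ttlMs: number) {}

  async get(key: string, fetcher: () => Promise<T>): Promise<T> {
    const entry = this.entries.get(key);
    if (entry) {
      const stale = Date.now() - entry.cachedAt > this.ttlMs;
      if (stale) this.revalidate(key, fetcher); // refresh in the background
      return entry.data;                        // serve immediately either way
    }
    return this.revalidate(key, fetcher);       // cold cache: must wait
  }

  private revalidate(key: string, fetcher: () => Promise<T>): Promise<T> {
    // Deduplicate concurrent refreshes for the same key.
    let p = this.inflight.get(key);
    if (!p) {
      p = fetcher()
        .then(data => {
          this.entries.set(key, { data, cachedAt: Date.now() });
          return data;
        })
        .finally(() => this.inflight.delete(key));
      this.inflight.set(key, p);
    }
    return p;
  }
}
```

Only the cold-cache path makes the caller wait; every subsequent hit returns instantly, at the cost of occasionally serving data up to one refresh cycle old.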
Beyond parallelization and caching, several advanced techniques can dramatically improve aggregation performance.
Different components have different acceptable latencies. A sophisticated timeout strategy treats each service appropriately:
```typescript
// Adaptive timeout strategy

interface TimeoutConfig {
  initial: number;    // Starting timeout
  max: number;        // Maximum timeout
  percentile: number; // Target percentile (e.g., p99)
}

class AdaptiveTimeoutManager {
  private latencyHistories = new Map<string, number[]>();

  getTimeout(serviceName: string): number {
    const history = this.latencyHistories.get(serviceName);
    if (!history || history.length < 100) {
      return DEFAULT_TIMEOUTS[serviceName]?.initial ?? 1000;
    }

    // Calculate the timeout as p99 latency plus a buffer
    const sorted = [...history].sort((a, b) => a - b);
    const p99Index = Math.floor(sorted.length * 0.99);
    const p99Latency = sorted[p99Index];

    const config = DEFAULT_TIMEOUTS[serviceName];
    return Math.min(
      Math.max(p99Latency * 1.2, config.initial), // 20% buffer
      config.max
    );
  }

  recordLatency(serviceName: string, latencyMs: number): void {
    if (!this.latencyHistories.has(serviceName)) {
      this.latencyHistories.set(serviceName, []);
    }
    const history = this.latencyHistories.get(serviceName)!;
    history.push(latencyMs);

    // Keep only recent samples
    if (history.length > 1000) {
      history.shift();
    }
  }
}

// Usage with racing
async function fetchWithAdaptiveTimeout<T>(
  serviceName: string,
  fetcher: () => Promise<T>,
  fallback?: T
): Promise<T> {
  const timeout = timeoutManager.getTimeout(serviceName);
  const start = Date.now();

  try {
    const result = await Promise.race([
      fetcher(),
      new Promise<never>((_, reject) =>
        setTimeout(
          () => reject(new TimeoutError(`${serviceName} timed out after ${timeout}ms`)),
          timeout
        )
      ),
    ]);
    timeoutManager.recordLatency(serviceName, Date.now() - start);
    return result;
  } catch (error) {
    if (error instanceof TimeoutError && fallback !== undefined) {
      metrics.increment('aggregation.timeout', { service: serviceName });
      return fallback;
    }
    throw error;
  }
}
```

HTTP connection establishment is expensive. BFFs should maintain keep-alive connection pools to downstream services.
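As a sketch of connection pooling in a Node.js BFF, a shared keep-alive `https.Agent` lets downstream calls reuse TCP and TLS sessions instead of paying a handshake per request. The agent options are real Node.js options; the `getJson` helper and its limits are illustrative:

```typescript
// Shared keep-alive agent so downstream calls reuse sockets.
import https from 'node:https';

const keepAliveAgent = new https.Agent({
  keepAlive: true,    // reuse sockets across requests
  maxSockets: 50,     // cap concurrent connections per downstream host
  maxFreeSockets: 10, // idle sockets retained for reuse
  timeout: 30_000,    // socket inactivity timeout (ms)
});

// Hypothetical helper: pass the agent explicitly with the classic https
// client. (Node 18+'s global fetch pools connections internally via undici.)
function getJson(url: string): Promise<unknown> {
  return new Promise((resolve, reject) => {
    https
      .get(url, { agent: keepAliveAgent }, (res) => {
        let body = '';
        res.on('data', chunk => (body += chunk));
        res.on('end', () => resolve(JSON.parse(body)));
      })
      .on('error', reject);
  });
}
```

One shared agent per downstream service also gives you a natural place to enforce per-service connection limits.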
API aggregation is the heart of BFF functionality: the core capability that justifies the pattern's existence. Mastering aggregation transforms BFFs from simple proxies into powerful composition layers.
What's Next:
With aggregation patterns mastered, the next page explores Request Coalescing—techniques for combining multiple client requests into efficient batched backend calls, further reducing downstream service load and improving overall system efficiency.
You now have deep expertise in API aggregation patterns. You can design aggregation logic that maximizes performance, handles failures gracefully, and produces clean, client-optimal responses from complex microservices ecosystems.