In a monolithic database, answering complex queries is straightforward. Need to find all orders from customers in California with products in a specific category? A single SQL JOIN across three tables returns the answer in milliseconds.
In microservices, this becomes fundamentally harder. The Customer Service owns customer data (including location). The Order Service owns orders. The Catalog Service owns product categories. There is no shared database to JOIN across.
This is the distributed join problem: how do you answer queries that require data from multiple autonomous services, each with its own database, without reintroducing the tight coupling that microservices aim to eliminate?
The answer involves trade-offs between latency, consistency, complexity, and coupling. There's no free lunch—but there are proven patterns that work at scale.
By the end of this page, you will understand the fundamental patterns for cross-service queries: API composition, materialized views, CQRS, and data meshes. You'll know when to apply each pattern, their trade-offs, and how companies like Netflix and Amazon solve this at planetary scale.
Before exploring solutions, it helps to understand why cross-service queries are hard. The challenge isn't just technical: it's a fundamental tension between service autonomy and query flexibility.
The query taxonomy:
Not all cross-service queries are equal. Understanding the query type guides the solution:
| Query Type | Example | Challenge Level |
|---|---|---|
| Point lookup | Get order with customer name | Low — single ID lookup |
| Filtered search | Find orders with product category | Medium — filter on foreign data |
| Aggregation | Count orders by customer region | High — aggregate across domains |
| Ad-hoc reporting | Complex business intelligence | Very High — arbitrary combinations |
Simple point lookups (enriching an order with customer name) can use straightforward API calls. Complex analytical queries (revenue by customer segment by region by product line) require dedicated infrastructure. Don't over-engineer for simple queries or under-engineer for complex ones.
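To see why a filtered search ranks above a point lookup, here is a minimal sketch of the "find orders with product category" row done by hand. The in-memory arrays stand in for responses from the Catalog and Order services, and every name (`findOrdersByCategory`, `CatalogProduct`, `OrderRecord`) is hypothetical rather than part of any real API.

```typescript
// Hypothetical stand-ins for data owned by two separate services.
interface CatalogProduct { id: string; category: string; }
interface OrderRecord { id: string; productIds: string[]; }

// Filtered search across services: resolve the filter in the service that
// owns the attribute (category), then apply the resulting IDs to orders.
function findOrdersByCategory(
  catalog: CatalogProduct[], // stands in for a Catalog Service response
  orders: OrderRecord[],     // stands in for an Order Service response
  category: string,
): OrderRecord[] {
  // Step 1: ask the catalog which products match the filter.
  const matchingIds = new Set(
    catalog.filter(p => p.category === category).map(p => p.id),
  );
  // Step 2: keep orders that contain at least one matching product.
  return orders.filter(o => o.productIds.some(id => matchingIds.has(id)));
}

const catalog: CatalogProduct[] = [
  { id: 'p1', category: 'books' },
  { id: 'p2', category: 'games' },
];
const orders: OrderRecord[] = [
  { id: 'o1', productIds: ['p1'] },
  { id: 'o2', productIds: ['p2'] },
  { id: 'o3', productIds: ['p1', 'p2'] },
];

const bookOrders = findOrdersByCategory(catalog, orders, 'books');
console.log(bookOrders.map(o => o.id)); // o1 and o3
```

With real services, step 1 may return thousands of product IDs that then have to be shipped to the Order Service or filtered in memory, which is the cost that pushes this query type toward pre-computed views.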
API Composition (also called the Aggregator pattern) is the most straightforward approach: a client or gateway calls multiple services and combines their responses.
How it works:
```typescript
// ===================================================
// API COMPOSITION PATTERN
// ===================================================
// An API Gateway or BFF (Backend for Frontend)
// composes data from multiple services.
// ===================================================

interface EnrichedOrder {
  orderId: string;
  orderDate: Date;
  status: string;
  customer: {
    id: string;
    name: string;
    email: string;
  };
  items: Array<{
    productId: string;
    productName: string;
    category: string;
    quantity: number;
    unitPrice: number;
  }>;
  totalAmount: number;
}

class OrderCompositionService {
  private orderClient: OrderServiceClient;
  private customerClient: CustomerServiceClient;
  private catalogClient: CatalogServiceClient;

  async getEnrichedOrder(orderId: string): Promise<EnrichedOrder> {
    // Step 1: Get the core order data
    const order = await this.orderClient.getOrder(orderId);

    // Step 2: Fetch related data in parallel
    // These calls are independent and can execute concurrently
    const [customer, products] = await Promise.all([
      this.customerClient.getCustomer(order.customerId),
      this.catalogClient.getProducts(
        order.lineItems.map(item => item.productId)
      ),
    ]);

    // Step 3: Compose the enriched response
    const productMap = new Map(products.map(p => [p.id, p]));

    return {
      orderId: order.id,
      orderDate: order.createdAt,
      status: order.status,
      customer: {
        id: customer.id,
        name: customer.name,
        email: customer.email,
      },
      items: order.lineItems.map(item => {
        const product = productMap.get(item.productId)!;
        return {
          productId: item.productId,
          productName: product.name,
          category: product.category,
          quantity: item.quantity,
          unitPrice: item.unitPrice,
        };
      }),
      totalAmount: order.totalAmount,
    };
  }

  async getOrdersForCustomer(customerId: string): Promise<EnrichedOrder[]> {
    // Get all orders for a customer
    const orders = await this.orderClient.getOrdersByCustomer(customerId);

    // Get customer info once
    const customer = await this.customerClient.getCustomer(customerId);

    // Get all unique products across all orders
    const allProductIds = new Set<string>();
    orders.forEach(order => {
      order.lineItems.forEach(item => allProductIds.add(item.productId));
    });
    const products = await this.catalogClient.getProducts([...allProductIds]);
    const productMap = new Map(products.map(p => [p.id, p]));

    // Compose all orders
    return orders.map(order => ({
      orderId: order.id,
      orderDate: order.createdAt,
      status: order.status,
      customer: {
        id: customer.id,
        name: customer.name,
        email: customer.email,
      },
      items: order.lineItems.map(item => ({
        productId: item.productId,
        productName: productMap.get(item.productId)?.name ?? 'Unknown',
        category: productMap.get(item.productId)?.category ?? 'Unknown',
        quantity: item.quantity,
        unitPrice: item.unitPrice,
      })),
      totalAmount: order.totalAmount,
    }));
  }
}
```

When to use API Composition:
When to avoid:
Materialized Views pre-compute query results by combining data from multiple services into a queryable store. Instead of composing data at query time, you build and maintain a denormalized view optimized for specific queries.
How it works:
```typescript
// ===================================================
// MATERIALIZED VIEW PATTERN
// ===================================================
// Build a pre-computed view that supports specific
// cross-service query patterns efficiently.
// ===================================================

// The materialized view model - optimized for common queries
interface OrderSearchView {
  // Order fields
  orderId: string;
  orderDate: Date;
  orderStatus: string;
  totalAmount: number;

  // Customer fields (denormalized from Customer Service)
  customerId: string;
  customerName: string;
  customerEmail: string;
  customerRegion: string; // For regional queries

  // Product fields (denormalized from Catalog Service)
  productIds: string[];
  productNames: string[];
  categories: string[]; // For category filtering

  // Derived fields for search
  searchText: string; // Combined searchable text
  lastUpdated: Date;
}

class OrderSearchViewBuilder {
  private viewStore: OrderSearchViewRepository; // e.g., Elasticsearch
  private orderClient: OrderServiceClient;
  private customerClient: CustomerServiceClient;
  private catalogClient: CatalogServiceClient;

  // ==========================================
  // EVENT HANDLERS - Build view incrementally
  // ==========================================

  async handleOrderCreated(event: OrderCreatedEvent): Promise<void> {
    // Fetch related data to build the view
    const [customer, products] = await Promise.all([
      this.customerClient.getCustomer(event.data.customerId),
      this.catalogClient.getProducts(
        event.data.lineItems.map(i => i.productId)
      ),
    ]);

    const view: OrderSearchView = {
      orderId: event.aggregateId,
      orderDate: event.data.orderDate,
      orderStatus: event.data.status,
      totalAmount: event.data.totalAmount,
      customerId: customer.id,
      customerName: customer.name,
      customerEmail: customer.email,
      customerRegion: customer.region,
      productIds: products.map(p => p.id),
      productNames: products.map(p => p.name),
      categories: [...new Set(products.map(p => p.category))],
      // The order ID lives on the event envelope, not in event.data
      searchText: this.buildSearchText(event.aggregateId, customer, products),
      lastUpdated: new Date(),
    };

    await this.viewStore.index(view);
  }

  async handleOrderStatusUpdated(event: OrderStatusUpdatedEvent): Promise<void> {
    // Partial update - only change the status field
    await this.viewStore.updatePartial(event.aggregateId, {
      orderStatus: event.data.newStatus,
      lastUpdated: new Date(),
    });
  }

  async handleCustomerUpdated(event: CustomerUpdatedEvent): Promise<void> {
    // Update all orders for this customer
    // This is the trade-off: fan-out updates when source data changes
    const ordersForCustomer = await this.viewStore.findByCustomerId(
      event.aggregateId
    );

    for (const order of ordersForCustomer) {
      await this.viewStore.updatePartial(order.orderId, {
        customerName: event.data.name,
        customerEmail: event.data.email,
        customerRegion: event.data.region ?? order.customerRegion,
        lastUpdated: new Date(),
      });
    }
  }

  async handleProductUpdated(event: ProductUpdatedEvent): Promise<void> {
    // Update all orders containing this product
    const ordersWithProduct = await this.viewStore.findByProductId(
      event.aggregateId
    );

    for (const order of ordersWithProduct) {
      // Refetch all products for this order to rebuild product fields
      const products = await this.catalogClient.getProducts(order.productIds);
      await this.viewStore.updatePartial(order.orderId, {
        productNames: products.map(p => p.name),
        categories: [...new Set(products.map(p => p.category))],
        lastUpdated: new Date(),
      });
    }
  }

  private buildSearchText(orderId: string, customer: any, products: any[]): string {
    return [
      orderId,
      customer.name,
      customer.email,
      ...products.map(p => p.name),
      ...products.map(p => p.category),
    ].join(' ').toLowerCase();
  }
}

// ==========================================
// QUERY SERVICE - Uses the materialized view
// ==========================================

class OrderSearchService {
  private viewStore: OrderSearchViewRepository;

  async searchOrders(query: OrderSearchQuery): Promise<OrderSearchResult> {
    // All filtering happens on the materialized view
    // No cross-service calls at query time!
    const filters: any = {};

    if (query.customerRegion) {
      filters.customerRegion = query.customerRegion;
    }
    if (query.categories?.length) {
      filters.categories = { $in: query.categories };
    }
    if (query.dateRange) {
      filters.orderDate = {
        $gte: query.dateRange.start,
        $lte: query.dateRange.end,
      };
    }
    if (query.searchText) {
      filters.searchText = { $contains: query.searchText.toLowerCase() };
    }

    const results = await this.viewStore.search({
      filters,
      sort: query.sortBy ?? 'orderDate',
      sortOrder: query.sortOrder ?? 'desc',
      limit: query.limit ?? 50,
      offset: query.offset ?? 0,
    });

    return {
      orders: results.hits,
      total: results.totalCount,
      hasMore: results.totalCount > (query.offset ?? 0) + results.hits.length,
    };
  }

  // Complex queries are now fast!
  async getRevenueByRegion(): Promise<Map<string, number>> {
    return await this.viewStore.aggregate({
      groupBy: 'customerRegion',
      sum: 'totalAmount',
    });
  }
}
```

Trade-offs of Materialized Views:
| Aspect | Benefit | Cost |
|---|---|---|
| Query latency | Fast—single datastore | N/A |
| Query flexibility | Optimized for defined patterns | Can't support arbitrary queries |
| Data freshness | Eventual consistency | Staleness during sync |
| Write fan-out | N/A | One source change updates many views |
| Storage | N/A | Stores data redundantly |
| Operational | N/A | Additional system to maintain |
Elasticsearch (or OpenSearch, Solr) is commonly used for materialized views because it supports full-text search, filtering, sorting, and aggregations—exactly what cross-service queries need. Consider it for any significant materialized view use case.
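As a rough illustration of how the generic `$in`/`$gte` filters from the query service could map onto a search engine, the sketch below builds an Elasticsearch/OpenSearch request body using the standard `bool`, `term`, `terms`, `range`, and `sum` constructs of the query DSL. The function name, the `RegionCategoryQuery` type, and the field mappings (`customerRegion` and `categories` as keyword fields, `orderDate` as a date, `totalAmount` as a number) are assumptions made for this example, not something the page's view store prescribes.

```typescript
// Hypothetical query shape, mirroring the filters used on this page.
interface RegionCategoryQuery {
  customerRegion?: string;
  categories?: string[];
  dateRange?: { start: string; end: string }; // ISO-8601 dates
}

// Builds a search request body in Elasticsearch/OpenSearch query DSL.
// Assumes keyword mappings for customerRegion and categories, a date
// mapping for orderDate, and a numeric mapping for totalAmount.
function buildOrderSearchBody(q: RegionCategoryQuery): Record<string, unknown> {
  const filter: Record<string, unknown>[] = [];

  if (q.customerRegion) {
    filter.push({ term: { customerRegion: q.customerRegion } });
  }
  if (q.categories?.length) {
    filter.push({ terms: { categories: q.categories } });
  }
  if (q.dateRange) {
    filter.push({
      range: { orderDate: { gte: q.dateRange.start, lte: q.dateRange.end } },
    });
  }

  return {
    // Filter context: clauses are ANDed and do not affect scoring.
    query: { bool: { filter } },
    // Revenue per region, computed in the same request as the search.
    aggs: {
      revenueByRegion: {
        terms: { field: 'customerRegion' },
        aggs: { revenue: { sum: { field: 'totalAmount' } } },
      },
    },
  };
}

const body = buildOrderSearchBody({
  customerRegion: 'CA',
  categories: ['books', 'games'],
});
console.log(JSON.stringify(body, null, 2));
```

The point of the sketch is that both the filter on customer region and the filter on product category run against one index, so no service is called at query time.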
CQRS takes materialized views to the architectural level. Instead of a single model for reads and writes, you maintain separate read and write models, each optimized for its purpose.
The core insight: Commands (writes) have different requirements than Queries (reads). Commands need strong consistency, validation, and business rules. Queries need speed, flexibility, and often span domains. Separating them allows optimization of each.
CQRS Architecture:
```text
                    ┌───────────────────────┐
                    │      API Gateway      │
                    └───────────┬───────────┘
                                │
          ┌─────────────────────┴─────────────────────┐
          │                                           │
  ┌───────▼───────┐                           ┌───────▼───────┐
  │   Commands    │                           │    Queries    │
  └───────┬───────┘                           └───────┬───────┘
          │                                           │
┌─────────▼─────────┐                       ┌─────────▼─────────┐
│    Write Model    │─────── Events ───────▶│    Read Model     │
│ (Domain Services) │                       │ (Query Services)  │
│                   │                       │                   │
│ ┌───────────────┐ │                       │ ┌───────────────┐ │
│ │ Customer DB   │ │                       │ │ Order Search  │ │
│ │ Order DB      │ │                       │ │ (Elastic)     │ │
│ │ Catalog DB    │ │                       │ │               │ │
│ └───────────────┘ │                       │ │ Analytics     │ │
│                   │                       │ │ (ClickHouse)  │ │
└───────────────────┘                       │ └───────────────┘ │
                                            └───────────────────┘
```
How CQRS solves cross-service queries:
| Read Model | Technology | Use Case | Data Sources |
|---|---|---|---|
| Order Search | Elasticsearch | Full-text search, filtering | Orders, Customers, Catalog |
| Customer 360 | PostgreSQL | Customer profile with history | Customers, Orders, Support |
| Analytics Cube | ClickHouse | Business intelligence | All domain events |
| Real-time Dashboard | Redis | Live metrics | Aggregated events |
| Recommendation | Neo4j | Graph-based recommendations | Purchases, Products, Preferences |
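The idea of several read models fed by the same events can be sketched in a few lines. This is a deliberately tiny, in-memory illustration: a `Map` and an array stand in for the write database and the event bus, and all class and event names are hypothetical.

```typescript
// Hypothetical domain event emitted by the write side.
interface OrderPlaced {
  type: 'OrderPlaced';
  orderId: string;
  customerRegion: string;
  totalAmount: number;
}

type Subscriber = (event: OrderPlaced) => void;

// Write side: validates the command, records the order, publishes an event.
class OrderCommandHandler {
  private orders = new Map<string, OrderPlaced>();
  private subscribers: Subscriber[] = [];

  subscribe(fn: Subscriber): void {
    this.subscribers.push(fn);
  }

  placeOrder(orderId: string, customerRegion: string, totalAmount: number): void {
    if (totalAmount <= 0) throw new Error('totalAmount must be positive');
    if (this.orders.has(orderId)) throw new Error(`duplicate order ${orderId}`);
    const event: OrderPlaced = { type: 'OrderPlaced', orderId, customerRegion, totalAmount };
    this.orders.set(orderId, event);
    // Delivered synchronously here; a real event bus delivers asynchronously,
    // which is where eventual consistency comes from.
    this.subscribers.forEach(fn => fn(event));
  }
}

// Read model 1: revenue per region.
class RevenueByRegionProjection {
  private totals = new Map<string, number>();

  apply(e: OrderPlaced): void {
    const current = this.totals.get(e.customerRegion) ?? 0;
    this.totals.set(e.customerRegion, current + e.totalAmount);
  }

  revenueFor(region: string): number {
    return this.totals.get(region) ?? 0;
  }
}

// Read model 2: a different shape built from the same events.
class OrderCountProjection {
  private count = 0;

  apply(_e: OrderPlaced): void {
    this.count += 1;
  }

  total(): number {
    return this.count;
  }
}

const commands = new OrderCommandHandler();
const revenue = new RevenueByRegionProjection();
const counts = new OrderCountProjection();
commands.subscribe(e => revenue.apply(e));
commands.subscribe(e => counts.apply(e));

commands.placeOrder('o1', 'CA', 120);
commands.placeOrder('o2', 'CA', 80);
commands.placeOrder('o3', 'NY', 50);

console.log(revenue.revenueFor('CA')); // 200
console.log(counts.total());           // 3
```

Queries never touch the write side: each projection answers its own question from state it built out of events, and a new read model can be added by subscribing another projection.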
CQRS is powerful but complex. You're maintaining multiple models, handling eventual consistency, and building event-synchronization infrastructure. Only adopt CQRS if your query requirements genuinely can't be met with simpler patterns.
When to use CQRS:
When to avoid:
Data Mesh is an architectural paradigm that treats data as a product, owned by domain teams. It's particularly relevant for analytical and reporting queries that span the entire organization.
The Data Mesh principle: Instead of a central data team extracting data from all services into a monolithic warehouse, each domain team publishes curated data products that other teams can consume.
Key concepts:
```typescript
// ===================================================
// DATA MESH: Domain Data Products
// ===================================================
// Each domain publishes curated, documented data products
// for consumption by other teams and analytics systems.
// ===================================================

// Orders Domain publishes this data product
interface OrdersDataProduct {
  // Metadata (discoverable in data catalog)
  productName: 'orders-v2';
  owner: 'order-team';
  description: 'Complete order data for analytics and reporting';
  sla: {
    freshness: '< 5 minutes';
    availability: '99.9%';
  };
  schema: typeof OrderDataSchema;

  // Available interfaces
  interfaces: {
    // Streaming for real-time consumers
    stream: {
      type: 'kafka';
      topic: 'orders.events.v2';
      format: 'avro';
    };
    // Batch for analytics warehouses
    batch: {
      type: 's3';
      path: 's3://data-lake/orders/daily/';
      format: 'parquet';
      partitionBy: ['date', 'region'];
    };
    // Query for ad-hoc analysis
    query: {
      type: 'presto';
      catalog: 'orders';
      table: 'orders_v2';
    };
  };
}

const OrderDataSchema = {
  orderId: 'string',
  orderDate: 'timestamp',
  status: 'string',
  customerId: 'string',
  totalAmount: 'decimal(10,2)',
  currency: 'string',
  itemCount: 'int',
  region: 'string',
  channel: 'string', // web, mobile, api
  // Note: No PII like customer name/email
  // Consumers JOIN with Customers data product if needed
};

// ==========================================
// CONSUMING DATA PRODUCTS
// ==========================================

// Analytics team builds cross-domain reports
class RevenueReportBuilder {
  private ordersProduct: DataProductClient<OrdersDataProduct>;
  private customersProduct: DataProductClient<CustomersDataProduct>;
  private productsProduct: DataProductClient<ProductsDataProduct>;
  // Federated query engine client (type name is illustrative)
  private queryEngine: QueryEngineClient;

  async buildRevenueBySegmentReport(): Promise<Report> {
    // Query each data product
    // The data mesh platform handles the distributed query
    // Note: the join on primaryProductId assumes orders_v2 exposes that
    // column; it is not listed in OrderDataSchema above.
    const query = `
      SELECT
        c.segment,
        c.region,
        p.category,
        SUM(o.totalAmount) as revenue,
        COUNT(DISTINCT o.orderId) as orderCount,
        COUNT(DISTINCT o.customerId) as customerCount
      FROM orders.orders_v2 o
      JOIN customers.customers_v2 c ON o.customerId = c.customerId
      JOIN products.products_v2 p ON o.primaryProductId = p.productId
      WHERE o.orderDate >= CURRENT_DATE - INTERVAL '30' DAY
        AND o.status = 'completed'
      GROUP BY c.segment, c.region, p.category
      ORDER BY revenue DESC
    `;

    // The query engine (e.g., Presto/Trino) federates across products
    const results = await this.queryEngine.execute(query);
    return this.formatReport(results);
  }
}
```

When Data Mesh applies:
Trade-offs:
Data Mesh is primarily about analytical data—reports, BI, ML training. For operational queries (user-facing features), the other patterns (API composition, materialized views, CQRS) are usually more appropriate.
With multiple patterns available, how do you choose? The decision depends on query characteristics, consistency requirements, and system constraints.
| Factor | API Composition | Materialized View | CQRS | Data Mesh |
|---|---|---|---|---|
| Query complexity | Low | Medium-High | High | Very High |
| Latency requirement | Tolerant (100ms+) | Strict (<50ms) | Strict | Tolerant |
| Consistency | Fresh per call, no cross-service snapshot | Eventual | Eventual | Eventual |
| Query volume | Low-Medium | High | Very High | Varies |
| Implementation cost | Low | Medium | High | Very High |
| Operational cost | Low | Medium | High | High |
| Use case | Enrichment | Search/Filter | Complex apps | Analytics |
Decision flowchart:
```text
Is this an analytical/BI query?
├── Yes → Consider Data Mesh or dedicated data warehouse
└── No  → Continue...
          │
          ▼
Can you tolerate eventual consistency?
├── No  → Must use API Composition (accept latency)
└── Yes → Continue...
          │
          ▼
Need to filter/sort by data from multiple services?
├── No  → API Composition is sufficient
└── Yes → Continue...
          │
          ▼
How many different query patterns?
├── Few (1-3) → Materialized Views
└── Many      → Consider CQRS
```
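The flowchart above can also be written down as a small function, which makes the order of the questions explicit. This is only a sketch of the decision logic as stated on this page; the type names and the `QueryProfile` fields are invented for the example.

```typescript
type Pattern =
  | 'data-mesh-or-warehouse'
  | 'api-composition'
  | 'materialized-view'
  | 'cqrs';

// Hypothetical description of one query use case.
interface QueryProfile {
  analytical: boolean;                   // BI/reporting rather than user-facing
  toleratesEventualConsistency: boolean;
  filtersOrSortsOnForeignData: boolean;  // needs data from multiple services
  distinctQueryPatterns: number;
}

function choosePattern(q: QueryProfile): Pattern {
  if (q.analytical) return 'data-mesh-or-warehouse';
  // No tolerance for staleness: compose live data and accept the latency.
  if (!q.toleratesEventualConsistency) return 'api-composition';
  if (!q.filtersOrSortsOnForeignData) return 'api-composition';
  // "Few (1-3)" query patterns in the flowchart.
  return q.distinctQueryPatterns <= 3 ? 'materialized-view' : 'cqrs';
}

const orderSearch: QueryProfile = {
  analytical: false,
  toleratesEventualConsistency: true,
  filtersOrSortsOnForeignData: true,
  distinctQueryPatterns: 2,
};
console.log(choosePattern(orderSearch)); // materialized-view
```

Real decisions weigh more factors than four fields (query volume and operational cost appear in the table above), so treat the function as a starting point for discussion rather than a rule engine.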
Combining patterns:
Real systems often use multiple patterns:
Each pattern addresses different needs. The goal is matching the right pattern to each use case, not picking one pattern for everything.
Begin with API Composition, since it's the simplest. Add materialized views when specific queries become bottlenecks. Adopt CQRS only when you have a clear need for complex read models. This incremental approach avoids premature optimization.
Regardless of which pattern you choose, several implementation concerns apply across the board.
API Composition can easily degrade into an N+1 call pattern if you're not careful. Fetching an order list, then one customer per order, then one product per line item multiplies round trips quickly. Always batch requests: collect all customer IDs first, then fetch all customers in one call.
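A minimal sketch of that batching advice follows. `FakeCustomerClient` is a made-up in-memory client whose `calls` counter stands in for network round trips; a real Customer Service would need to expose a comparable batch endpoint, which is an assumption here.

```typescript
interface Customer { id: string; name: string; }
interface OrderSummary { id: string; customerId: string; }

// Hypothetical batch-capable client; `calls` counts round trips.
class FakeCustomerClient {
  calls = 0;
  private readonly customers = new Map<string, Customer>([
    ['c1', { id: 'c1', name: 'Ada' }],
    ['c2', { id: 'c2', name: 'Grace' }],
  ]);

  // One round trip regardless of how many IDs are requested.
  async getCustomers(ids: string[]): Promise<Customer[]> {
    this.calls += 1;
    return ids
      .map(id => this.customers.get(id))
      .filter((c): c is Customer => c !== undefined);
  }
}

async function attachCustomerNames(
  orders: OrderSummary[],
  client: FakeCustomerClient,
): Promise<Array<{ orderId: string; customerName: string }>> {
  // Collect each customer ID once, even if many orders share a customer.
  const uniqueIds = Array.from(new Set(orders.map(o => o.customerId)));
  // Single batched call instead of one call per order.
  const customers = await client.getCustomers(uniqueIds);
  const byId = new Map<string, Customer>(
    customers.map((c): [string, Customer] => [c.id, c]),
  );
  return orders.map(o => ({
    orderId: o.id,
    customerName: byId.get(o.customerId)?.name ?? 'Unknown',
  }));
}

const demoOrders: OrderSummary[] = [
  { id: 'o1', customerId: 'c1' },
  { id: 'o2', customerId: 'c2' },
  { id: 'o3', customerId: 'c1' },
];
const demoClient = new FakeCustomerClient();
attachCustomerNames(demoOrders, demoClient).then(rows => {
  console.log(rows.map(r => r.customerName)); // Ada, Grace, Ada
  console.log(demoClient.calls);              // 1 round trip for 3 orders
});
```

The naive version would call the client once per order, so the number of round trips would grow with the size of the order list instead of staying constant.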
Cross-service queries are an inevitable challenge in microservices. There's no perfect solution—only trade-offs between latency, consistency, complexity, and coupling. Understanding the patterns and when to apply each is essential for designing queryable distributed systems.
What's next:
With query patterns established, the final piece is ensuring data consistency even when it's distributed. The next page explores data consistency patterns—how to maintain correctness across services when traditional ACID transactions don't apply.
You now understand the major patterns for querying across service boundaries: API Composition, Materialized Views, CQRS, and Data Mesh. Each addresses different trade-offs between simplicity, latency, consistency, and complexity. Start simple with API Composition and evolve to more sophisticated patterns as needs demand. Next, we'll tackle data consistency patterns for distributed systems.