Once you've separated commands from queries, the read side becomes a canvas for performance engineering. Unlike the write side—constrained by transactional consistency, normalization rules, and business validation—the read side exists purely to serve data as fast as possible.
This freedom is liberating but also demanding. A poorly designed read model is just a denormalized mess. A well-designed read model anticipates query patterns, optimizes for actual access paths, and delivers consistently low-latency responses even as load grows.
The mantra of read model design: Shape data for how it will be consumed, not how it was produced.
This page covers read model design principles, denormalization strategies, materialized view patterns, polyglot storage selection, and real-world optimization techniques. You'll learn how to build read models that serve millions of queries per second with consistent low latency.
Effective read models are purpose-built for queries, not reverse-engineered from write models. This fundamental shift in thinking drives every design decision.
Principle 1: One Read Model Per View
The most powerful insight in read model design is that you don't need a single universal read model. You can have many, each optimized for specific query patterns. A product search page has different needs than a product detail page, which differs from an admin dashboard.
Principle 2: Pre-compute Everything
Every computation that can be done during write-time should be done then, not at read-time. If you always display 'average rating,' compute and store it when reviews are added—not when the page loads.
Principle 3: Embrace Redundancy
Normalization avoids redundancy; read models embrace it. Storing customer name inside every order record is wasteful from a storage perspective but eliminates a join at query time. Storage is cheap; latency is expensive.
Principle 4: Optimize for Access Patterns
Design your read model's structure around how data will be accessed. If users always fetch orders with their items, store them together. If you need to sort by date, make date the primary sort key.
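To make this concrete, here is a minimal sketch of a read store shaped around two known access patterns: "fetch one order with its items" (single-key lookup) and "list a customer's orders, newest first" (pre-sorted at write time). The class and interface names are illustrative, not from any specific library.

```typescript
interface OrderListItem {
  orderId: string;
  orderDate: string; // ISO timestamp, doubles as the sort key
  total: number;
  itemCount: number;
}

class OrderReadStore {
  // Pattern 1: direct lookup by order id — one key, one document
  private byId = new Map<string, OrderListItem>();
  // Pattern 2: per-customer listing, kept sorted so reads never sort
  private byCustomer = new Map<string, OrderListItem[]>();

  upsert(customerId: string, order: OrderListItem): void {
    this.byId.set(order.orderId, order);
    const existing = this.byCustomer.get(customerId) ?? [];
    const next = [...existing.filter(o => o.orderId !== order.orderId), order]
      // Pay the sorting cost at write time, not on every read
      .sort((a, b) => b.orderDate.localeCompare(a.orderDate));
    this.byCustomer.set(customerId, next);
  }

  getOrder(orderId: string): OrderListItem | undefined {
    return this.byId.get(orderId); // O(1), no join
  }

  listOrders(customerId: string, limit = 20): OrderListItem[] {
    return (this.byCustomer.get(customerId) ?? []).slice(0, limit);
  }
}
```

The same shape translates directly to a document database (one document per order) or a key-value store (composite keys per customer, sorted by date).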
| Design Aspect | Write Model | Read Model |
|---|---|---|
| Optimization Goal | Consistency, correctness | Query speed, scalability |
| Normalization | Highly normalized (3NF+) | Heavily denormalized |
| Data Redundancy | Minimized | Embraced strategically |
| Computation | Deferred to read time | Pre-computed at write time |
| Schema Changes | Carefully migrated | Rebuilt from events |
| Number of Models | Single source of truth | Multiple per query pattern |
| Indexes | Few (write overhead) | Many (read optimization) |
Modern storage economics favor read optimization. A 10TB SSD costs a few hundred dollars. A 100ms latency increase costs users—and potentially customers. Don't hesitate to store the same data multiple ways if it serves different query patterns better.
Denormalization is the art of trading storage and write complexity for read performance. There are several key strategies, each suited to different scenarios.
Strategy 1: Embedding Related Data
Instead of foreign keys that require joins, embed the needed data directly into the parent document. When displaying an order, embed customer name, shipping address, and item details rather than referencing separate tables.
```jsonc
// Normalized write model (relational):
// orders → customers (FK), order_items → products (FK)

// Denormalized read model (embedded):
{
  "orderId": "ord_12345",
  "orderDate": "2024-01-15T10:30:00Z",
  "status": "shipped",

  // Customer data embedded (no join needed)
  "customer": {
    "id": "cust_789",
    "name": "Alice Johnson",
    "email": "alice@example.com"
  },

  // Shipping address embedded
  "shippingAddress": {
    "line1": "123 Main St",
    "city": "Seattle",
    "state": "WA",
    "zip": "98101",
    "country": "US"
  },

  // Order items with product data embedded
  "items": [
    {
      "productId": "prod_abc",
      "productName": "Wireless Headphones",
      "productImage": "https://cdn.example.com/headphones.jpg",
      "quantity": 1,
      "unitPrice": 149.99
    },
    {
      "productId": "prod_def",
      "productName": "USB-C Cable Pack",
      "productImage": "https://cdn.example.com/cables.jpg",
      "quantity": 2,
      "unitPrice": 19.99
    }
  ],

  // Pre-computed totals
  "subtotal": 189.97,
  "tax": 18.52,
  "shipping": 0,
  "total": 208.49,
  "itemCount": 3
}
```

Strategy 2: Pre-aggregated Summaries
Instead of computing aggregates at query time (COUNT, AVG, SUM), maintain running totals that update when underlying data changes.
```typescript
// Product read model with pre-aggregated stats
interface ProductReadModel {
  productId: string;
  name: string;
  description: string;
  price: number;

  // Pre-computed aggregates (updated by event handlers)
  reviewStats: {
    averageRating: number;   // Maintained incrementally
    totalReviews: number;    // Counter, not computed
    ratingDistribution: {    // Histogram for UI
      1: number;
      2: number;
      3: number;
      4: number;
      5: number;
    };
  };

  inventoryStats: {
    totalStock: number;      // Sum across warehouses
    warehouseCount: number;
    isAvailable: boolean;    // Computed flag
    lowStockAlert: boolean;  // Business rule flag
  };

  salesStats: {
    totalSold: number;       // Running counter
    lastWeekSold: number;    // Rolling window
    trendingScore: number;   // Computed ranking signal
  };
}

// Event handler maintains aggregates incrementally
class ProductProjection {
  async handleReviewAdded(event: ReviewAddedEvent): Promise<void> {
    const product = await this.readStore.getProduct(event.productId);

    // Incremental average calculation
    const newTotal = product.reviewStats.totalReviews + 1;
    const newAverage = (
      (product.reviewStats.averageRating * product.reviewStats.totalReviews) +
      event.rating
    ) / newTotal;

    // Update atomically
    await this.readStore.updateProduct(event.productId, {
      'reviewStats.totalReviews': newTotal,
      'reviewStats.averageRating': newAverage,
      [`reviewStats.ratingDistribution.${event.rating}`]:
        product.reviewStats.ratingDistribution[event.rating] + 1
    });
  }
}
```

Strategy 3: Materialized Joins
For queries that always need data from multiple domains, pre-join the data into a dedicated read table.
```sql
-- Admin dashboard needs: order + customer + shipping status
-- Instead of joining at query time, maintain a materialized view

CREATE TABLE order_dashboard_view (
  order_id                TEXT PRIMARY KEY,
  order_date              TIMESTAMPTZ,
  order_status            TEXT,

  -- Embedded customer data
  customer_id             TEXT,
  customer_name           TEXT,
  customer_tier           TEXT,  -- "gold", "silver", "standard"
  customer_lifetime_value DECIMAL,

  -- Embedded shipping data
  shipping_carrier        TEXT,
  tracking_number         TEXT,
  estimated_delivery      DATE,
  last_tracking_update    TIMESTAMPTZ,

  -- Pre-computed flags for filtering
  is_delayed              BOOLEAN,
  requires_attention      BOOLEAN,

  -- Aggregated order data
  item_count              INTEGER,
  total_amount            DECIMAL
);

-- Indexes for common query patterns
-- (PostgreSQL uses separate CREATE INDEX statements, not inline INDEX clauses)
CREATE INDEX idx_status    ON order_dashboard_view (order_status);
CREATE INDEX idx_date      ON order_dashboard_view (order_date DESC);
CREATE INDEX idx_customer  ON order_dashboard_view (customer_id);
CREATE INDEX idx_attention ON order_dashboard_view (requires_attention)
  WHERE requires_attention = true;

-- This single table serves the entire dashboard without joins
-- Event handlers keep it synchronized with source data
```

One of CQRS's most powerful capabilities is using different storage technologies for different query patterns. Rather than forcing all queries through a single database type, you can choose the optimal store for each use case.
The key insight: Different query patterns have fundamentally different performance characteristics. A full-text search engine excels at fuzzy matching but struggles with transactional consistency. A graph database navigates relationships efficiently but isn't designed for aggregations. Choose the right tool for each job.
| Query Pattern | Optimal Store Type | Example Technologies | Trade-offs |
|---|---|---|---|
| Full-text search, fuzzy matching | Search Engine | Elasticsearch, Algolia, Meilisearch | Eventually consistent, not ACID |
| Key-value lookups, caching | Key-Value Store | Redis, Memcached, DynamoDB | Limited query flexibility |
| Document retrieval, flexible schema | Document Database | MongoDB, Couchbase, Firestore | Less efficient joins |
| Graph traversal, relationships | Graph Database | Neo4j, Amazon Neptune, ArangoDB | Not optimized for aggregations |
| Time-series, metrics, logs | Time-Series Database | InfluxDB, TimescaleDB, Prometheus | Optimized for append, not update |
| Complex queries, reporting | Relational / Analytical Warehouse | PostgreSQL, Snowflake, BigQuery | Write overhead, scaling limits |
| Real-time analytics | Columnar Store | ClickHouse, Apache Druid, Redshift | Not for transactional workloads |
Real-World Example: E-commerce Product Service
Consider an e-commerce platform's product service. Different features require different query capabilities:
```
┌──────────────────────────────────────────────────────────────────────────┐
│                        WRITE SIDE (PostgreSQL)                           │
│                      Source of Truth for Products                        │
│  ┌─────────────────────────────────────────────────────────────────┐     │
│  │ products, categories, inventory, pricing, reviews (normalized)  │     │
│  └───────────────────────────────┬─────────────────────────────────┘     │
└──────────────────────────────────┼───────────────────────────────────────┘
                                   │ Events
                                   ▼
                  ┌─────────────────────────────────┐
                  │        EVENT PROJECTORS         │
                  │   (Consume events, update       │
                  │    multiple read stores)        │
                  └────────────────┬────────────────┘
                                   │
         ┌─────────────────────────┼─────────────────────────┐
         │                         │                         │
         ▼                         ▼                         ▼
┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  ELASTICSEARCH   │    │      REDIS       │    │     MONGODB      │
│                  │    │                  │    │                  │
│ • Product search │    │ • Product cache  │    │ • Product pages  │
│ • Autocomplete   │    │ • Session data   │    │ • Review docs    │
│ • Faceted browse │    │ • Cart state     │    │ • User profiles  │
│ • Fuzzy matching │    │ • View counts    │    │ • Order history  │
└────────┬─────────┘    └────────┬─────────┘    └────────┬─────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                              QUERY ROUTER                                │
│   /search?q=...      → Elasticsearch                                     │
│   /products/:id      → Redis (cache) → MongoDB (fallback)                │
│   /users/:id/orders  → MongoDB                                           │
│   /autocomplete      → Elasticsearch                                     │
└──────────────────────────────────────────────────────────────────────────┘
```

Each additional store adds operational burden: monitoring, backup procedures, failover handling, and team expertise requirements. Ensure the performance benefits justify the operational costs. Start with one or two specialized stores before expanding.
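The query-router layer in the diagram can be sketched as a simple route table. The route patterns mirror the diagram; the store clients themselves are omitted, and all names here are illustrative, not a real routing library's API.

```typescript
type Store = "elasticsearch" | "redis" | "mongodb";

interface Route {
  pattern: RegExp;
  primary: Store;
  fallback?: Store; // e.g. a cache miss falls through to the document store
}

// Each route maps a URL shape to the store best suited to its query pattern
const routes: Route[] = [
  { pattern: /^\/search/,               primary: "elasticsearch" },
  { pattern: /^\/autocomplete/,         primary: "elasticsearch" },
  { pattern: /^\/products\/[^/]+$/,     primary: "redis", fallback: "mongodb" },
  { pattern: /^\/users\/[^/]+\/orders/, primary: "mongodb" },
];

// Returns the ordered list of stores to try for a given request path
function resolveStores(path: string): Store[] {
  const route = routes.find(r => r.pattern.test(path));
  if (!route) throw new Error(`no read model serves ${path}`);
  return route.fallback ? [route.primary, route.fallback] : [route.primary];
}
```

In a real service this dispatch usually lives in the query handlers themselves rather than a standalone router, but the principle is the same: the caller names the query, and the read side decides which store answers it.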
With CQRS, caching becomes significantly more tractable. Because you know exactly when data changes (through events), you can implement sophisticated caching strategies with precise invalidation.
Write-Through Caching:
When projecting events to read models, simultaneously update the cache. The read path always hits the cache first. Events trigger both the database and cache updates together; since the two writes are not truly atomic, a TTL on cache entries acts as a safety net if either update is missed.
```typescript
class ProductProjection {
  constructor(
    private readStore: ProductReadStore,
    private cache: RedisClient
  ) {}

  async handleProductUpdated(event: ProductUpdatedEvent): Promise<void> {
    // Build the read model
    const productView = this.buildProductView(event);

    // Batch the cache updates in a pipeline
    const pipeline = this.cache.pipeline();

    // 1. Update main product cache
    pipeline.setex(
      `product:${event.productId}`,
      3600, // 1 hour TTL as safety net
      JSON.stringify(productView)
    );

    // 2. Invalidate related caches
    pipeline.del(`category:${event.categoryId}:products`);
    // Note: DEL does not expand wildcards; pattern-based invalidation
    // (e.g. cached search results) requires SCAN — see the dedicated
    // invalidation handler below.

    // 3. Update listing caches
    pipeline.zadd(
      `category:${event.categoryId}:by_price`,
      productView.price,
      event.productId
    );

    // Execute both updates together (best-effort, not a distributed transaction)
    await Promise.all([
      this.readStore.upsertProduct(productView),
      pipeline.exec()
    ]);
  }

  async getProduct(productId: string): Promise<ProductView> {
    // Try cache first
    const cached = await this.cache.get(`product:${productId}`);
    if (cached) {
      return JSON.parse(cached);
    }

    // Cache miss - fetch from read store and populate cache
    const product = await this.readStore.getProduct(productId);
    await this.cache.setex(
      `product:${productId}`,
      3600,
      JSON.stringify(product)
    );
    return product;
  }
}
```

Multi-Layer Caching:
For maximum performance, implement multiple cache layers with increasing latency but also increasing capacity: typically an in-process cache (microseconds, per-instance), then a shared distributed cache such as Redis (sub-millisecond to a few milliseconds, cluster-wide), and finally the read store itself.
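A minimal sketch of such a layered read path follows, assuming a generic `CacheLayer` interface (the interface and class names are illustrative). Layers are checked fastest-first; on a hit, the faster layers that missed are backfilled so subsequent reads get cheaper.

```typescript
interface CacheLayer {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

class LayeredCache {
  // layers ordered fastest-first, e.g. [inProcess, redis]
  constructor(private layers: CacheLayer[]) {}

  async get(key: string, loader: () => Promise<string>): Promise<string> {
    const missed: CacheLayer[] = [];
    for (const layer of this.layers) {
      const hit = await layer.get(key);
      if (hit !== null) {
        // Backfill the faster layers that missed
        await Promise.all(missed.map(l => l.set(key, hit)));
        return hit;
      }
      missed.push(layer);
    }
    // All layers missed — fall back to the read store
    const value = await loader();
    await Promise.all(missed.map(l => l.set(key, value)));
    return value;
  }
}
```

Production versions would add per-layer TTLs and size bounds on the in-process layer; the read-through-with-backfill flow shown here is the core idea.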
Event-Driven Invalidation:
The key advantage of CQRS for caching: you know exactly when to invalidate. Instead of TTL-based expiration or manual purging, events trigger precise invalidation.
```typescript
class CacheInvalidationHandler {
  constructor(private cache: RedisClient) {}

  private invalidationMap: Map<string, (event: DomainEvent) => string[]> =
    new Map([
      ['ProductUpdated', (e) => [
        `product:${e.productId}`,
        `category:${e.categoryId}:products`,
        `homepage:featured`
      ]],
      ['ReviewAdded', (e) => [
        `product:${e.productId}`,
        `product:${e.productId}:reviews`,
      ]],
      ['InventoryChanged', (e) => [
        `product:${e.productId}`,
        `product:${e.productId}:availability`,
        ...e.affectedCategories.map(c => `category:${c}:in_stock`)
      ]],
      ['PriceChanged', (e) => [
        `product:${e.productId}`,
        `category:${e.categoryId}:by_price`,
        `deals:*` // Invalidate all deals caches
      ]]
    ]);

  async handleEvent(event: DomainEvent): Promise<void> {
    const invalidator = this.invalidationMap.get(event.type);
    if (!invalidator) return;

    const keysToInvalidate = invalidator(event);

    // Handle pattern-based invalidation
    for (const pattern of keysToInvalidate) {
      if (pattern.includes('*')) {
        // Scan and delete matching keys (DEL alone does not expand wildcards)
        const keys = await this.cache.scan(pattern);
        if (keys.length > 0) {
          await this.cache.del(...keys);
        }
      } else {
        await this.cache.del(pattern);
      }
    }

    // Log for observability
    console.log(`Invalidated ${keysToInvalidate.length} cache keys for ${event.type}`);
  }
}
```

Beyond architectural patterns, specific optimization techniques ensure read models perform at scale.
Technique 1: Covering Indexes
Design indexes that contain all columns needed for a query. The database serves the query entirely from the index without accessing the main table (index-only scans).
```sql
-- Query: Get product names and prices for a category
SELECT product_id, name, price, average_rating
FROM products_read_model
WHERE category_id = $1
ORDER BY average_rating DESC
LIMIT 20;

-- Covering index: includes ALL selected columns
CREATE INDEX idx_category_products_covering
ON products_read_model (category_id, average_rating DESC)
INCLUDE (product_id, name, price);

-- Now the query is served entirely from the index
-- No table access required = much faster
```

Technique 2: Pagination Optimization
Traditional OFFSET-based pagination degrades as offset increases. Use keyset (cursor) pagination for consistent performance.
```sql
-- OFFSET pagination: page 1000 forces the database to skip 19,980 rows
SELECT * FROM products
ORDER BY created_at DESC
LIMIT 20 OFFSET 19980;

-- Performance degrades with offset
-- Offset 0:      ~1ms
-- Offset 10000:  ~50ms
-- Offset 100000: ~500ms+

-- Keyset pagination: use the last seen values as a cursor.
-- A row-value comparison keeps ordering correct when timestamps tie.
SELECT * FROM products
WHERE (created_at, id) < ($last_seen_date, $last_seen_id)
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Performance is constant
-- Any page: ~1-2ms
-- Uses an index on (created_at DESC, id DESC) efficiently
```

Technique 3: Query-Specific Projections
Don't serve one giant read model for all queries. Create specialized projections for distinct query patterns.
```typescript
// DON'T: One massive product model for everything
interface ProductReadModel {
  // 50+ fields for all possible use cases
  // Wastes bandwidth, memory, parsing time
}

// DO: Specialized projections per use case

// For search results listing
interface ProductSearchResult {
  productId: string;
  name: string;
  thumbnailUrl: string;
  price: number;
  averageRating: number;
  inStock: boolean;
}

// For product detail page
interface ProductDetailView {
  productId: string;
  name: string;
  description: string;
  images: string[];
  price: number;
  specifications: Record<string, string>;
  reviews: ReviewSummary;
  relatedProducts: ProductSearchResult[];
}

// For admin dashboard
interface ProductAdminView {
  productId: string;
  name: string;
  sku: string;
  inventoryLevel: number;
  lastUpdated: Date;
  salesStats: SalesStats;
  performanceMetrics: PerformanceMetrics;
}

// Query handlers return the appropriate projection
class ProductQueryService {
  async searchProducts(query: SearchQuery): Promise<ProductSearchResult[]> {
    // Query lightweight search index
  }

  async getProductDetails(productId: string): Promise<ProductDetailView> {
    // Query full product read model
  }

  async getProductAdmin(productId: string): Promise<ProductAdminView> {
    // Query admin-specific read model
  }
}
```

Always measure actual query performance before optimizing. Use query explain plans, track p95/p99 latencies, and identify which queries are actually bottlenecks. Don't optimize queries that aren't performance-critical.
One of CQRS's most powerful capabilities is the ability to rebuild read models from scratch. When combined with event sourcing, you can replay the entire event history to regenerate optimized read models—enabling schema evolution, bug fixes, and new query capabilities without data migration.
When to Rebuild:

- Schema evolution: a projection needs new fields or a different shape
- Bug fixes: a projection handler produced incorrect data that must be regenerated
- New query capabilities: a brand-new read model must be backfilled from event history
- Corruption recovery: a read store was lost or damaged
```typescript
class ReadModelRebuilder {
  constructor(
    private eventStore: EventStore,
    private projections: Map<string, Projection>,
    private readStores: Map<string, ReadStore>
  ) {}

  async rebuildReadModel(
    projectionName: string,
    options: RebuildOptions = {}
  ): Promise<RebuildResult> {
    const projection = this.projections.get(projectionName);
    const readStore = this.readStores.get(projectionName);

    // 1. Create new version of read store (blue-green)
    const newStore = await readStore.createNewVersion();

    // 2. Configure for bulk loading (disable indexes, constraints)
    await newStore.prepareForBulkLoad();

    // 3. Stream events and apply projection
    let processedCount = 0;
    const startTime = Date.now();

    const eventStream = this.eventStore.streamAllEvents({
      fromPosition: options.fromPosition || 0,
      batchSize: options.batchSize || 1000
    });

    for await (const batch of eventStream) {
      // Apply projection to each event
      const updates = batch.map(event => projection.apply(event));

      // Bulk insert to new store
      await newStore.bulkUpsert(updates);

      processedCount += batch.length;

      // Progress reporting
      if (processedCount % 10000 === 0) {
        console.log(`Rebuilt ${processedCount} events...`);
      }
    }

    // 4. Rebuild indexes after bulk load
    await newStore.finalizeAfterBulkLoad();

    // 5. Atomic swap to new version
    await readStore.swapToNewVersion(newStore);

    // 6. Clean up old version
    await readStore.removeOldVersion();

    return {
      projectionName,
      eventsProcessed: processedCount,
      durationMs: Date.now() - startTime
    };
  }

  // Parallel rebuilding for multiple read models
  async rebuildAll(projectionNames: string[]): Promise<RebuildResult[]> {
    return Promise.all(
      projectionNames.map(name => this.rebuildReadModel(name))
    );
  }
}
```

For large event stores (millions of events), rebuilds can take hours. Plan for this: schedule during low-traffic periods, ensure adequate resources, and have rollback procedures ready. Consider incremental rebuilding strategies for very large stores.
Read models require careful monitoring to ensure they're serving accurate data with acceptable latency. Unlike the write side where failures are immediately visible, read model issues can be subtle—stale data, slow queries, or projection lag.
Key Metrics to Monitor:
| Metric Category | Key Metrics | Alert Threshold Examples |
|---|---|---|
| Query Performance | p50/p95/p99 latency, throughput | p99 > 100ms, throughput drop > 20% |
| Projection Lag | Time since last event processed | Lag > 30 seconds |
| Cache Health | Hit rate, eviction rate, memory usage | Hit rate < 80%, memory > 90% |
| Data Freshness | Age of oldest unprocessed event | Age > 1 minute |
| Error Rates | Failed queries, projection errors | Error rate > 0.1% |
| Resource Utilization | CPU, memory, connections, disk I/O | CPU > 80%, connections > 90% of max |
```typescript
class ReadModelHealthCheck {
  async checkHealth(): Promise<HealthStatus> {
    const checks: HealthCheck[] = [];

    // 1. Check projection lag
    const lagCheck = await this.checkProjectionLag();
    checks.push(lagCheck);

    // 2. Check database connectivity
    const dbCheck = await this.checkDatabaseConnectivity();
    checks.push(dbCheck);

    // 3. Check cache health
    const cacheCheck = await this.checkCacheHealth();
    checks.push(cacheCheck);

    // 4. Check query performance (sample query)
    const queryCheck = await this.checkQueryPerformance();
    checks.push(queryCheck);

    // Aggregate status
    const overallStatus =
      checks.every(c => c.status === 'healthy') ? 'healthy' :
      checks.some(c => c.status === 'unhealthy') ? 'unhealthy' :
      'degraded';

    return { status: overallStatus, checks };
  }

  private async checkProjectionLag(): Promise<HealthCheck> {
    const lastProcessedPosition = await this.projector.getLastPosition();
    const currentEventPosition = await this.eventStore.getCurrentPosition();
    const lag = currentEventPosition - lastProcessedPosition;
    const lagSeconds = await this.estimateLagInSeconds(lag);

    return {
      name: 'projection_lag',
      status:
        lagSeconds < 30 ? 'healthy' :
        lagSeconds < 120 ? 'degraded' :
        'unhealthy',
      metrics: { eventsBehind: lag, estimatedLagSeconds: lagSeconds }
    };
  }

  private async checkQueryPerformance(): Promise<HealthCheck> {
    const start = performance.now();

    // Execute a representative sample query
    await this.readStore.executeHealthCheckQuery();

    const durationMs = performance.now() - start;
    return {
      name: 'query_performance',
      status:
        durationMs < 50 ? 'healthy' :
        durationMs < 200 ? 'degraded' :
        'unhealthy',
      metrics: { sampleQueryDurationMs: durationMs }
    };
  }
}
```

We've explored the art and science of building read models that serve queries at massive scale. Let's consolidate the essential insights:

- Shape read models for how data will be consumed, not how it was produced
- Pre-compute aggregates at write time and embrace strategic redundancy
- Build one specialized projection per query pattern instead of one universal model
- Choose storage per query pattern (search, key-value, document, columnar), weighing operational cost
- Use events for precise cache invalidation and to rebuild read models from scratch
- Monitor projection lag, query latency, and cache health continuously
What's Next:
With optimized read models in place, we must address the elephant in the room: eventual consistency. When writes and reads are separated, data synchronization isn't instantaneous. The next page explores strategies for handling the gap between when data is written and when it becomes visible to readers.
You now understand how to design and optimize read models for high-performance query serving. From denormalization strategies to polyglot storage selection to cache management, you have the tools to build read paths that scale to millions of queries per second.