Designing Amazon: E-Commerce at Massive Scale

Level: Advanced

Duration: 120 mins

Topic: Amazon E-Commerce

Product Catalog at Scale

The Heart of E-Commerce

The product catalog is the foundation upon which all e-commerce functionality is built. Every search query, every product page view, every recommendation, every price display—all depend on a catalog system that can serve accurate, up-to-date product information at massive scale.

Amazon's catalog contains over 350 million unique products from 2+ million active sellers. Each product has dozens of attributes, multiple variants, high-resolution images, customer reviews, pricing rules, and availability information that varies by region and fulfillment center. The catalog must support:

•70,000+ page views per second during normal operation
•Full-text search with faceted filtering across all products in under 300ms
•Real-time updates as sellers modify listings, prices change, and inventory fluctuates
•Consistency between search results and product pages (no broken links or stale data)

This page will take you through the complete architecture of a catalog system designed for this scale.

Learning Objectives

By the end of this page, you will understand how to design a product catalog architecture that separates concerns between primary storage, search, and caching; how to model complex product data with variants and attributes; and how to maintain consistency across a distributed catalog system.

Product Data Model Design

Before diving into architecture, we must understand what we're storing. Product data is surprisingly complex—far more than a simple table of items with names and prices.

The Core Entities:

A product in an e-commerce catalog is actually a hierarchical structure:

// Core Product Entity - The "Parent" product
interface Product {
  id: string;                          // Globally unique identifier (ASIN equivalent)
  sellerId: string;                    // Merchant/brand who owns the listing
  
  // Basic Information
  title: string;                       // "Sony WH-1000XM5 Wireless Noise Canceling Headphones"
  brand: string;                       // "Sony"
  manufacturer: string;                // May differ from brand
  description: string;                 // HTML-formatted long description
  bulletPoints: string[];              // Key feature highlights
  
  // Categorization
  categoryPath: string[];              // ["Electronics", "Audio", "Headphones", "Over-Ear"]
  categoryIds: string[];               // Internal category identifiers
  
  // Attributes (category-specific)
  attributes: Record<string, Attribute>;
  
  // Search optimization
  keywords: string[];                  // Seller-provided search terms
  
  // State management
  status: 'draft' | 'pending_review' | 'active' | 'suppressed' | 'archived';
  createdAt: Timestamp;
  updatedAt: Timestamp;
  
  // Relationships
  variants: ProductVariant[];          // Color/size variations
  
  // Aggregate data (computed)
  averageRating: number;               // 4.7
  reviewCount: number;                 // 12,847
  priceRange: PriceRange;              // Min/max across variants
}
 
// Product Variants - Each purchasable item
interface ProductVariant {
  id: string;                          // SKU-level identifier
  productId: string;                   // Parent product reference
  
  // Variant-defining attributes
  variantAttributes: {
    color?: string;                    // "Black", "Silver", "Midnight Blue"
    size?: string;                     // "Small", "Medium", "Large"
    configuration?: string;            // "256GB", "512GB"
    style?: string;                    // "Standard", "Premium Edition"
  };
  
  // Pricing (can vary by variant)
  listPrice: Money;                    // MSRP
  currentPrice: Money;                 // Active selling price
  dealPrice?: Money;                   // Special promotion price
  dealEndTime?: Timestamp;             // When deal expires
  
  // Availability
  inStock: boolean;                    // Aggregated stock status
  stockLevel: StockLevel;              // 'in_stock' | 'low_stock' | 'out_of_stock'
  
  // Media
  images: ProductImage[];              // Variant-specific images
  
  // Shipping
  dimensions: Dimensions;              // For shipping calculation
  weight: Weight;
  fulfillmentOptions: FulfillmentOption[];
}
 
// Complex attribute structure supporting rich product data
interface Attribute {
  name: string;                        // "Noise Cancellation Type"
  value: string | string[] | number;   // "Active" or ["Active", "Passive"]
  unit?: string;                       // "hours", "inches", "GB"
  filterable: boolean;                 // Can be used in search facets
  displayOrder: number;                // UI rendering order
  normalizedValue?: string;            // Standardized for comparison
}

Why Parent-Variant Separation?

This parent-variant model is crucial for UX and data management. When a customer searches for 'Sony headphones', they want to see one result per product—not separate listings for every color. But when they add to cart, they're buying a specific variant. The model must support both views efficiently.
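
The two views can be sketched as a small roll-up from variants to a single search row. This is a minimal illustration under stated assumptions: `VariantSummary`, `ListingSummary`, and `toListingSummary` are hypothetical names that do not appear in the model above, and prices are assumed to be integer cents in one currency.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface VariantSummary {
  id: string;
  priceCents: number; // assumption: integer cents, single currency
  inStock: boolean;
}

interface ListingSummary {
  productId: string;
  minPriceCents: number;
  maxPriceCents: number;
  inStock: boolean; // true if at least one variant can be bought
  variantCount: number;
}

// Collapse all variants of one parent into the single row shown in search results.
function toListingSummary(productId: string, variants: VariantSummary[]): ListingSummary {
  if (variants.length === 0) {
    throw new Error(`product ${productId} has no variants`);
  }
  const prices = variants.map((v) => v.priceCents);
  return {
    productId,
    minPriceCents: Math.min(...prices),
    maxPriceCents: Math.max(...prices),
    inStock: variants.some((v) => v.inStock),
    variantCount: variants.length,
  };
}
```

The search result uses the summary; the add-to-cart action still references one specific variant id.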

Category-Specific Attributes:

One of the most challenging aspects of catalog design is that different product categories have vastly different attributes. Electronics have specifications (battery life, connectivity); clothing has sizes; furniture has dimensions; food has nutritional information.

This requires a flexible schema approach:

// Category defines which attributes are relevant and how they're validated
interface CategorySchema {
  categoryId: string;
  name: string;
  parentId?: string;                    // For hierarchy
  
  // Required attributes for this category
  requiredAttributes: AttributeDefinition[];
  
  // Optional but recommended attributes
  optionalAttributes: AttributeDefinition[];
  
  // Validation rules
  validationRules: ValidationRule[];
}
 
interface AttributeDefinition {
  name: string;
  displayName: string;                  // User-facing name
  type: 'string' | 'number' | 'boolean' | 'enum' | 'multi-enum' | 'range';
  enumValues?: string[];                // For enum types
  unit?: string;
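  filterable?: boolean;                 // Show in faceted search (treated as false when omitted)
  comparable?: boolean;                 // Show in product comparison (treated as false when omitted)
  searchable?: boolean;                 // Include in full-text search (treated as false when omitted)
}
 
// Cross-attribute rule, expressed declaratively and evaluated by the listing validator
interface ValidationRule {
  rule: string;                         // Stable rule identifier
  condition: string;                    // When the rule applies
  requirement: string;                  // What must hold when it applies
}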
 
// Example: Electronics > Audio > Headphones category schema
const headphonesCategorySchema: CategorySchema = {
  categoryId: "electronics_audio_headphones",
  name: "Headphones",
  parentId: "electronics_audio",
  requiredAttributes: [
    { name: "headphone_type", displayName: "Type", type: "enum", 
      enumValues: ["Over-Ear", "On-Ear", "In-Ear", "Earbuds"], filterable: true },
    { name: "connectivity", displayName: "Connectivity", type: "multi-enum",
      enumValues: ["Wireless", "Bluetooth", "3.5mm Jack", "USB-C"], filterable: true },
    { name: "noise_cancellation", displayName: "Noise Cancellation", type: "enum",
      enumValues: ["Active", "Passive", "None"], filterable: true },
  ],
  optionalAttributes: [
    { name: "battery_life", displayName: "Battery Life", type: "number", 
      unit: "hours", filterable: true, searchable: false },
    { name: "driver_size", displayName: "Driver Size", type: "number",
      unit: "mm", filterable: true },
    { name: "frequency_response", displayName: "Frequency Response", type: "range",
      unit: "Hz" },
  ],
  validationRules: [
    { rule: "battery_life_required_for_wireless",
      condition: "connectivity includes 'Wireless'",
      requirement: "battery_life is required" }
  ]
};
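
A schema like this is only useful if listings are checked against it. The following is a minimal sketch of such a check, not a full validator: `AttrDef`, `AttrValue`, `validateAttributes`, and `matchesType` are hypothetical names, the `range` type is left out for brevity, and cross-attribute rules such as the battery-life rule above are not evaluated.

```typescript
// Trimmed-down copies of the shapes above, reduced to what type checking needs.
type AttrType = 'string' | 'number' | 'boolean' | 'enum' | 'multi-enum';

interface AttrDef {
  name: string;
  type: AttrType;
  enumValues?: string[];
}

type AttrValue = string | number | boolean | string[];

// True when the value has the shape the definition asks for.
function matchesType(def: AttrDef, value: AttrValue): boolean {
  const allowed = def.enumValues ?? [];
  switch (def.type) {
    case 'string':
      return typeof value === 'string';
    case 'number':
      return typeof value === 'number' && Number.isFinite(value);
    case 'boolean':
      return typeof value === 'boolean';
    case 'enum':
      return typeof value === 'string' && allowed.includes(value);
    case 'multi-enum':
      return Array.isArray(value) && value.every((v) => allowed.includes(v));
  }
}

// Returns human-readable problems; an empty list means the listing passes.
function validateAttributes(
  required: AttrDef[],
  optional: AttrDef[],
  values: Record<string, AttrValue>,
): string[] {
  const errors: string[] = [];
  for (const def of required) {
    if (!(def.name in values)) {
      errors.push(`missing required attribute: ${def.name}`);
    }
  }
  for (const def of [...required, ...optional]) {
    const value = values[def.name];
    if (value !== undefined && !matchesType(def, value)) {
      errors.push(`invalid value for ${def.name}`);
    }
  }
  return errors;
}
```

Returning a list of problems rather than throwing on the first one lets a seller portal show every issue with a listing in a single pass.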

Storage Architecture

A production catalog system uses multiple storage layers optimized for different access patterns. No single database can efficiently handle all catalog operations.

The Three-Layer Storage Pattern:

Storage Layer Architecture

•Primary Store (Source of Truth) — Relational or document database holding the authoritative product data. Used for writes, complex queries by sellers, and as the source for sync to other stores.
•Search Store — Elasticsearch or similar inverted-index database optimized for full-text search and faceted navigation. Powers customer search and browse experiences.
•Read Cache — Redis or Memcached holding hot product data for direct lookups. Powers product detail pages and API responses.

┌─────────────────────────────────────────────────────────────────────────┐
│                        CATALOG STORAGE ARCHITECTURE                      │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  ┌──────────────────┐                                                   │
│  │   WRITE PATH     │                                                   │
│  │                  │                                                   │
│  │  Seller Portal   │──────┐                                            │
│  │  Admin Tools     │      │                                            │
│  │  Bulk Import     │      │                                            │
│  └──────────────────┘      │                                            │
│                            ▼                                            │
│  ┌───────────────────────────────────────────┐                         │
│  │         PRIMARY STORE (Source of Truth)    │                         │
│  │    ┌────────────────────────────────────┐ │                         │
│  │    │       PostgreSQL / DynamoDB         │ │                         │
│  │    │  • All product data with history    │ │                         │
│  │    │  • Strong consistency on writes     │ │                         │
│  │    │  • Complex seller queries           │ │                         │
│  │    │  • ACID transactions for updates    │ │                         │
│  │    └────────────────────────────────────┘ │                         │
│  └─────────────────────┬─────────────────────┘                         │
│                        │                                                │
│                        │ Change Data Capture (CDC)                      │
│                        │                                                │
│         ┌──────────────┴──────────────┐                                │
│         │                             │                                │
│         ▼                             ▼                                │
│  ┌─────────────────────┐    ┌─────────────────────┐                   │
│  │    SEARCH STORE     │    │     READ CACHE      │                   │
│  │  ┌───────────────┐  │    │  ┌───────────────┐  │                   │
│  │  │ Elasticsearch │  │    │  │     Redis     │  │                   │
│  │  │               │  │    │  │               │  │                   │
│  │  │• Full-text    │  │    │  │• Hot products │  │                   │
│  │  │• Faceted nav  │  │    │  │• Sub-ms reads │  │                   │
│  │  │• Aggregations │  │    │  │• TTL-based    │  │                   │
│  │  │• Analytics    │  │    │  │• Cache-aside  │  │                   │
│  │  └───────────────┘  │    │  └───────────────┘  │                   │
│  └──────────┬──────────┘    └─────────┬───────────┘                   │
│             │                         │                                │
│             └────────────┬────────────┘                                │
│                          │                                             │
│                          ▼                                             │
│           ┌────────────────────────────┐                              │
│           │      READ PATH (APIs)       │                              │
│           │                             │                              │
│           │  • Product pages → Cache    │                              │
│           │  • Search → Elasticsearch   │                              │
│           │  • Browse → Elasticsearch   │                              │
│           └────────────────────────────┘                              │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Primary Store Choice: PostgreSQL vs DynamoDB

The choice of primary store depends on your specific requirements:

PostgreSQL with JSON columns works well when:

•You need complex relational queries (seller dashboards, analytics)
•Your team has strong SQL expertise
•You can accept the operational overhead of managing PostgreSQL at scale
•You need ACID transactions across product updates

DynamoDB works well when:

•You have well-defined access patterns (primarily key-based lookups)
•You need predictable performance at any scale
•You want managed infrastructure with minimal operations
•You can model data to fit DynamoDB's single-table design

For Amazon-scale catalogs, a hybrid approach is common: DynamoDB for the high-throughput primary product data with PostgreSQL for seller analytics and complex reporting.
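
To make "single-table design" concrete, here is one possible key layout in which a parent product and its variants share a partition, so a single partition query returns the whole hierarchy. The layout and the names `ItemKey`, `productKey`, `variantKey`, and `assemble` are assumptions made for illustration; they do not describe Amazon's actual schema.

```typescript
// Assumed key layout (illustrative only):
//   pk = PRODUCT#<productId>, sk = META                -> the parent product item
//   pk = PRODUCT#<productId>, sk = VARIANT#<variantId> -> one item per variant
interface ItemKey {
  pk: string;
  sk: string;
}

interface Item extends ItemKey {
  [attribute: string]: unknown;
}

const productKey = (productId: string): ItemKey => ({
  pk: `PRODUCT#${productId}`,
  sk: 'META',
});

const variantKey = (productId: string, variantId: string): ItemKey => ({
  pk: `PRODUCT#${productId}`,
  sk: `VARIANT#${variantId}`,
});

// Split the items returned for one partition key into parent and variants.
function assemble(items: Item[]): { product: Item | undefined; variants: Item[] } {
  return {
    product: items.find((item) => item.sk === 'META'),
    variants: items.filter((item) => item.sk.startsWith('VARIANT#')),
  };
}
```

The trade-off is the one listed above: this layout serves the "load one product with its variants" pattern well, while ad hoc queries such as seller reporting need a secondary index or a separate store.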

Storage Layer Characteristics
Characteristic	Primary Store	Search Store	Cache Layer
Technology	PostgreSQL or DynamoDB	Elasticsearch	Redis Cluster
Data Volume	350M products × 50KB = 17.5TB	350M docs × 5KB = 1.75TB (indexed)	Hot 10M × 5KB = 50GB
Write Pattern	Thousands of updates/second	Async sync from CDC	On-demand populate
Read Latency	5-50ms	20-100ms	<5ms
Consistency	Strong (source of truth)	Eventual (seconds lag)	Eventual (TTL-based)
Primary Use	Authoritative data	Search & browse	Product page speed
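
The data-volume row is simple multiplication, reproduced here as a sanity check. The sketch assumes decimal units (1 KB = 1,000 bytes), which is what the round numbers in the table imply; the replica multiplier for the search store uses the 2-replica setting from the index configuration on this page.

```typescript
const KB = 1_000; // decimal units, matching the table's round numbers
const GB = 1_000_000_000;
const TB = 1_000_000_000_000;

function totalBytes(itemCount: number, bytesPerItem: number): number {
  return itemCount * bytesPerItem;
}

const primaryTB = totalBytes(350_000_000, 50 * KB) / TB; // primary store
const searchTB = totalBytes(350_000_000, 5 * KB) / TB;   // one copy of the search index
const cacheGB = totalBytes(10_000_000, 5 * KB) / GB;     // hot set held in Redis

// With 2 replicas, every shard exists 3 times (1 primary + 2 replicas).
const searchWithReplicasTB = searchTB * 3;

console.log({ primaryTB, searchTB, cacheGB, searchWithReplicasTB });
```

The search figure in the table is for a single copy of the index; provisioned disk has to cover all replicas plus headroom for segment merges.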

Search Architecture Deep Dive

Search is the primary way customers interact with the catalog. A well-designed search system must handle:

•Full-text queries with typo tolerance and synonym matching
•Faceted filtering (brand, price range, ratings, category-specific attributes)
•Relevance ranking that considers query match, popularity, and personalization
•Real-time updates as products change
•100,000+ queries per second during peak traffic

Elasticsearch Architecture:

For a catalog of 350M+ products, a single Elasticsearch node isn't sufficient. The index is split into shards that are distributed and replicated across the nodes of a cluster:

{
  "settings": {
    "number_of_shards": 48,
    "number_of_replicas": 2,
    "refresh_interval": "30s",
    "analysis": {
      "analyzer": {
        "product_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "snowball", "synonym_filter"]
        },
        "autocomplete_analyzer": {
          "type": "custom",
          "tokenizer": "edge_ngram_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 15,
          "token_chars": ["letter", "digit"]
        }
      },
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms_path": "synonyms.txt"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "product_id": { "type": "keyword" },
      "title": { 
        "type": "text",
        "analyzer": "product_analyzer",
        "fields": {
          "autocomplete": { "type": "text", "analyzer": "autocomplete_analyzer" },
          "exact": { "type": "keyword" }
        }
      },
      "description": { "type": "text", "analyzer": "product_analyzer" },
      "brand": { 
        "type": "text",
        "fields": { "keyword": { "type": "keyword" } }
      },
      "category_path": { "type": "keyword" },
      "price": { "type": "scaled_float", "scaling_factor": 100 },
      "rating": { "type": "float" },
      "review_count": { "type": "integer" },
      "in_stock": { "type": "boolean" },
      "attributes": { "type": "flattened" },
      "popularity_score": { "type": "float" },
      "search_keywords": { "type": "text", "analyzer": "product_analyzer" },
      "suggestion": {
        "type": "completion",
        "analyzer": "product_analyzer"
      }
    }
  }
}

Search Index Design Principles

•Shard Count — 48 shards for 350M products allows ~7M docs per shard, enabling parallel search across nodes
•Replica Count — 2 replicas provide read scalability and fault tolerance; each query can hit any replica
•Refresh Interval — 30s refresh reduces indexing overhead; real-time isn't needed for search (new products can wait)
•Multi-field Mapping — Title indexed multiple ways: full-text for search, edge-ngrams for autocomplete, keyword for exact match
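•Flattened Attributes — Category-specific attributes stored as flattened type for flexible faceting without schema changes. Every value in a flattened field is indexed as a keyword, so numeric attributes such as battery_life need their own typed fields if they require true numeric range filters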
•Completion Suggester — Built-in autocomplete using specialized data structure for prefix-based suggestions
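
To see why the edge-ngram sub-field supports autocomplete, it helps to look at the terms it produces. The function below is a simplified stand-in for the tokenizer configured above (min_gram 2, max_gram 15) applied to a single, already lowercased token; the real tokenizer also splits the input on characters outside `token_chars`. The name `edgeNgrams` is illustrative.

```typescript
// Emits the prefixes an edge n-gram tokenizer would index for one token.
function edgeNgrams(token: string, minGram = 2, maxGram = 15): string[] {
  const grams: string[] = [];
  const longest = Math.min(maxGram, token.length);
  for (let length = minGram; length <= longest; length++) {
    grams.push(token.slice(0, length));
  }
  return grams;
}

console.log(edgeNgrams('sony')); // ["so", "son", "sony"]
```

Because every prefix is stored as its own term, a partial query such as "son" matches with an ordinary term lookup. A common refinement, not shown in the mapping above, is to set a plain `search_analyzer` on that sub-field so the query text itself is not expanded into n-grams.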

{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": [
            {
              "multi_match": {
                "query": "wireless headphones",
                "fields": ["title^3", "brand^2", "description", "search_keywords"],
                "type": "best_fields",
                "fuzziness": "AUTO"
              }
            }
          ],
          "filter": [
            { "term": { "in_stock": true } },
            { "range": { "price": { "lte": 200 } } },
            { "terms": { "attributes.noise_cancellation": ["Active"] } }
          ]
        }
      },
      "functions": [
        {
          "field_value_factor": {
            "field": "popularity_score",
            "factor": 1.2,
            "modifier": "log1p"
          }
        },
        {
          "field_value_factor": {
            "field": "rating",
            "factor": 1.5,
            "modifier": "sqrt"
          }
        },
        {
          "filter": { "range": { "review_count": { "gte": 100 } } },
          "weight": 1.3
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "multiply"
    }
  },
  "aggs": {
    "brand_facet": {
      "terms": { "field": "brand.keyword", "size": 20 }
    },
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 50 },
          { "from": 50, "to": 100 },
          { "from": 100, "to": 200 },
          { "from": 200 }
        ]
      }
    },
    "rating_histogram": {
      "histogram": { "field": "rating", "interval": 1 }
    }
  },
  "highlight": {
    "fields": {
      "title": {},
      "description": { "number_of_fragments": 2 }
    }
  },
  "size": 24,
  "from": 0
}

Function Scoring for Relevance

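The function_score query combines text relevance with business signals (popularity, rating, review count) to rank results. Because both score_mode and boost_mode are set to multiply, a product whose popularity_score or rating is 0 ends up with a final score of 0, so new listings need a floor value or a missing default. Tuning these weights is an ongoing process driven by A/B tests on click-through and conversion rates.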
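
The combination in the query above can be written out as plain arithmetic. This is a sketch of the scoring math only, assuming the documented `field_value_factor` behaviour (the factor is applied to the field value before the modifier, and `log1p` is the base-10 logarithm of 1 plus that product). The names `Signals` and `finalScore` are illustrative, and `textScore` stands in for the relevance score of the `multi_match` clause.

```typescript
interface Signals {
  textScore: number;       // relevance score from the text query
  popularityScore: number; // popularity_score field
  rating: number;          // rating field
  reviewCount: number;     // review_count field
}

function finalScore(s: Signals): number {
  const popularity = Math.log10(1 + 1.2 * s.popularityScore); // factor 1.2, modifier log1p
  const rating = Math.sqrt(1.5 * s.rating);                   // factor 1.5, modifier sqrt
  const reviewBoost = s.reviewCount >= 100 ? 1.3 : 1;         // weight applies only when the filter matches
  // score_mode "multiply" combines the functions; boost_mode "multiply" applies them to the text score.
  return s.textScore * popularity * rating * reviewBoost;
}
```

Writing the formula out makes it easier to reason about weight changes before running an experiment.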

Multi-Layer Caching Strategy

Caching is critical for catalog performance. With 70,000 QPS hitting product pages, even a small cache improvement has massive impact. We employ multiple caching layers:

Caching Layers in Product Catalog
Layer	Technology	TTL	Hit Rate Target	Purpose
CDN Edge	CloudFront/Fastly	5 min	40%	Static assets, popular product pages
Application Cache	Redis Cluster	15 min	85%	Product data, computed fields
Local Cache	In-process LRU	1 min	60%	Extremely hot keys, reduce Redis hops
Search Cache	Elasticsearch cache	30 sec	70%	Common search queries, facet aggregations

// Cache key design for product catalog
 
interface CacheKeyStrategy {
  // Product data - versioned to enable instant invalidation
  productData: (productId: string, version: number) => string;
  
  // Price data - separate cache with shorter TTL (prices change frequently)
  productPrice: (productId: string, variantId: string, userId?: string) => string;
  
  // Search results - include normalized query and filters
  searchResults: (queryHash: string, page: number) => string;
  
  // Category data - hierarchical with parent invalidation
  categoryProducts: (categoryId: string, filters: string, page: number) => string;
}
 
const cacheKeys: CacheKeyStrategy = {
  productData: (productId, version) => 
    `product:data:${productId}:v${version}`,
  
  productPrice: (productId, variantId, userId) =>
    userId 
      ? `product:price:${productId}:${variantId}:user:${userId}`  // Personalized
      : `product:price:${productId}:${variantId}:default`,         // Standard
  
  searchResults: (queryHash, page) =>
    `search:results:${queryHash}:page:${page}`,
  
  categoryProducts: (categoryId, filters, page) =>
    `category:${categoryId}:filters:${filters}:page:${page}`
};
 
// Cache-aside pattern with version checking
async function getProduct(productId: string): Promise<Product> {
  // Get current version from lightweight metadata
  const currentVersion = await redis.get(`product:version:${productId}`);
  
  // Try cache with version
  const cacheKey = cacheKeys.productData(productId, parseInt(currentVersion || '0'));
  const cached = await redis.get(cacheKey);
  
  if (cached) {
    return JSON.parse(cached);
  }
  
  // Cache miss - fetch from primary store
  const product = await primaryStore.getProduct(productId);
  
  // Store in cache with TTL
  await redis.setex(cacheKey, 900, JSON.stringify(product));  // 15 min TTL
  
  return product;
}
 
// On product update, just bump the version
async function invalidateProduct(productId: string): Promise<void> {
  await redis.incr(`product:version:${productId}`);
  // Old cached entries will naturally expire
  // New requests get new version number, triggering cache miss
}

Advanced Caching Patterns

•Cache Stampede Prevention — Use probabilistic early expiration or locking to prevent thundering herd on popular products
•Write-Through for Critical Data — Prices and availability bypass cache-aside to ensure immediate consistency
•Negative Caching — Cache '404' results for deleted products to prevent repeated DB lookups
•Tiered TTLs — Static product info (description) has long TTL; dynamic data (price, stock) has short TTL
•Cache Warming — Pre-populate cache for predicted hot products before major sale events
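
Of these patterns, stampede prevention is the easiest to get wrong. One locking-style variant is request coalescing: concurrent misses for the same key share one load instead of each hitting the primary store. The sketch below works within a single process only, and `SingleFlight` is an illustrative name; a fleet-wide version would need a distributed lock or probabilistic early expiration instead.

```typescript
// Coalesces concurrent loads of the same key inside one process.
class SingleFlight<T> {
  private readonly inFlight = new Map<string, Promise<T>>();

  load(key: string, loader: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing) {
      return existing; // join the load that is already running
    }
    const promise = loader();
    // Forget the entry once the load settles, whether it succeeded or failed.
    const cleanup = (): void => {
      this.inFlight.delete(key);
    };
    promise.then(cleanup, cleanup);
    this.inFlight.set(key, promise);
    return promise;
  }
}
```

On a cache miss for a hot product, the first caller triggers the primary-store read and every other caller that arrives before it finishes awaits the same promise.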

The Cache Consistency Challenge

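Cache invalidation is one of the hardest problems in distributed systems. With product data held in the CDN, Redis, Elasticsearch, and local caches simultaneously, ensuring consistency is complex. The version-based approach above invalidates Redis entries as soon as the version counter is bumped, but the system as a whole is only eventually consistent: layers that are not explicitly purged keep serving stale data until their TTL expires, and if an invalidation event is delayed or lost, staleness is bounded by the longest TTL (15 minutes with the values above). This is acceptable for descriptive catalog data, which is why prices and availability are handled through shorter-lived paths.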

Cross-Store Data Synchronization

With data distributed across primary store, search index, and cache, keeping them synchronized is a critical architectural concern. We use Change Data Capture (CDC) as the foundation for reliable sync.

┌─────────────────────────────────────────────────────────────────────────────┐
│                     CATALOG SYNCHRONIZATION PIPELINE                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   ┌──────────────┐                                                          │
│   │  PostgreSQL  │                                                          │
│   │  (Primary)   │                                                          │
│   └──────┬───────┘                                                          │
│          │ WAL (Write-Ahead Log)                                            │
│          │                                                                   │
│          ▼                                                                   │
│   ┌──────────────┐                                                          │
│   │   Debezium   │  Captures all changes as events                          │
│   │ (CDC Engine) │                                                          │
│   └──────┬───────┘                                                          │
│          │                                                                   │
│          ▼                                                                   │
│   ┌──────────────────────────────────────────────────────┐                  │
│   │                    Apache Kafka                       │                  │
│   │   ┌─────────────────────────────────────────────┐    │                  │
│   │   │  catalog.products.changes (partitioned by   │    │                  │
│   │   │  productId for ordering guarantees)         │    │                  │
│   │   └─────────────────────────────────────────────┘    │                  │
│   └─────────────────────────┬────────────────────────────┘                  │
│                             │                                                │
│        ┌────────────────────┼────────────────────┐                          │
│        │                    │                    │                          │
│        ▼                    ▼                    ▼                          │
│  ┌───────────┐       ┌───────────┐       ┌───────────┐                     │
│  │   ES      │       │   Redis   │       │   CDN     │                     │
│  │ Indexer   │       │ Updater   │       │ Purger    │                     │
│  │           │       │           │       │           │                     │
│  │• Batch    │       │• Increment│       │• API call │                     │
│  │  indexing │       │  version  │       │  to purge │                     │
│  │• Bulk API │       │• Update   │       │• Selective│                     │
│  │• Retry on │       │  price if │       │  invalidate│                    │
│  │  failure  │       │  changed  │       │           │                     │
│  └───────────┘       └───────────┘       └───────────┘                     │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

// Kafka consumer processing catalog change events
 
interface ProductChangeEvent {
  eventId: string;
  timestamp: number;
  operation: 'INSERT' | 'UPDATE' | 'DELETE';
  productId: string;
  before?: Partial<Product>;   // Previous state (for updates)
  after?: Product;             // New state (for inserts/updates)
  changedFields?: string[];    // Which fields changed
}
 
class CatalogSyncConsumer {
  async processEvent(event: ProductChangeEvent): Promise<void> {
    const { productId, operation, after, changedFields } = event;
    
    try {
      // Always sync to Elasticsearch (full document)
      if (operation === 'DELETE') {
        await this.elasticsearch.delete('products', productId);
      } else if (after) {
        // INSERT and UPDATE events carry the new state in `after`
        await this.elasticsearch.index('products', productId, 
          this.transformForSearch(after));
      }
      
      // Increment cache version (lazy invalidation)
      await this.redis.incr(`product:version:${productId}`);
      
      // If price changed, also update price cache immediately
      if (changedFields?.includes('currentPrice')) {
        await this.updatePriceCache(productId, after);
      }
      
      // If it's a popular product, purge CDN cache
      if (await this.isHighTrafficProduct(productId)) {
        await this.cdn.purge(`/products/${productId}*`);
      }
      
      // Record sync completion for monitoring
      await this.metrics.recordSync(productId, event.timestamp);
      
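    } catch (error) {
      // Park the failed event in the dead-letter queue for a later retry.
      // Not rethrowing lets the consumer commit the offset and move on; rethrowing
      // here would also trigger redelivery, so the event would be handled twice.
      await this.deadLetterQueue.send(event, error);
    }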
  }
  
  // Transform product for search-optimized format
  private transformForSearch(product: Product): SearchDocument {
    return {
      product_id: product.id,
      title: product.title,
      description: product.description,
      brand: product.brand,
      category_path: product.categoryPath,
      price: this.getLowestVariantPrice(product),
      rating: product.averageRating,
      review_count: product.reviewCount,
      in_stock: this.hasAnyInStockVariant(product),
      attributes: this.flattenAttributes(product.attributes),
      popularity_score: this.calculatePopularity(product),
      search_keywords: product.keywords.join(' '),
      updated_at: new Date().toISOString()
    };
  }
}

Why Kafka for CDC?

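Kafka provides durable, replayable event storage and ordered delivery within a partition. If the Elasticsearch indexer goes down, it can resume from its last committed offset without losing events. Delivery to external stores such as Elasticsearch and Redis is at-least-once rather than exactly-once (Kafka's exactly-once guarantees cover Kafka-to-Kafka transactions), so the consumers must apply events idempotently. This durability is essential for keeping stores in sync.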
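
A consumer that writes to an external store can see the same event again after a restart or rebalance, so a version guard is a common safeguard. The sketch below assumes each event carries a per-product, monotonically increasing `sequence`; that field and the names `SequencedEvent` and `IdempotentApplier` are hypothetical and are not part of `ProductChangeEvent` above.

```typescript
interface SequencedEvent {
  productId: string;
  sequence: number; // assumption: increases with every change to the product
}

class IdempotentApplier {
  // Highest sequence applied so far, per product (in memory for this sketch).
  private readonly lastApplied = new Map<string, number>();

  // Returns true if the event was applied, false if it was a duplicate or stale.
  apply(event: SequencedEvent, sideEffect: (e: SequencedEvent) => void): boolean {
    const last = this.lastApplied.get(event.productId) ?? -1;
    if (event.sequence <= last) {
      return false; // already seen: skip the write
    }
    sideEffect(event);
    this.lastApplied.set(event.productId, event.sequence);
    return true;
  }
}
```

In a real deployment the watermark would live in the target store rather than in memory; Elasticsearch, for example, supports external versioning on index requests for this purpose.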

High Availability Architecture

A catalog outage means customers can't browse, search, or view products—effectively shutting down the entire e-commerce operation. The catalog service requires 99.99% availability, allowing only ~52 minutes of downtime per year.
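
The downtime figure follows directly from the availability target. A short calculation, assuming a 365-day year (the function name is illustrative):

```typescript
const MINUTES_PER_YEAR = 365 * 24 * 60; // 525,600 in a non-leap year

// Allowed downtime per year, in minutes, for a given availability percentage.
function downtimeMinutesPerYear(availabilityPercent: number): number {
  return MINUTES_PER_YEAR * (1 - availabilityPercent / 100);
}

console.log(downtimeMinutesPerYear(99.99).toFixed(2)); // "52.56"
console.log(downtimeMinutesPerYear(99.9).toFixed(1));  // "525.6"
```

Each additional nine cuts the budget by a factor of ten, which is why the step from 99.9% to 99.99% usually requires multi-region redundancy rather than faster manual recovery.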

Multi-Region Deployment Strategy:

                                    ┌───────────────┐
                                    │    Global     │
                                    │ Load Balancer │
                                    │  (Route 53)   │
                                    └───────┬───────┘
                                            │
           ┌────────────────────────────────┼────────────────────────────────┐
           │                                │                                │
           ▼                                ▼                                ▼
   ┌───────────────┐               ┌───────────────┐               ┌───────────────┐
   │   US-EAST     │               │   EU-WEST     │               │  AP-SOUTHEAST │
   │ (Primary for  │               │ (Primary for  │               │ (Primary for  │
   │  Americas)    │               │  Europe)      │               │  Asia)        │
   └───────┬───────┘               └───────┬───────┘               └───────┬───────┘
           │                                │                                │
   ┌───────┴───────┐               ┌───────┴───────┐               ┌───────┴───────┐
   │               │               │               │               │               │
   ▼               ▼               ▼               ▼               ▼               ▼
┌──────┐       ┌──────┐       ┌──────┐       ┌──────┐       ┌──────┐       ┌──────┐
│Redis │       │ ES   │       │Redis │       │ ES   │       │Redis │       │ ES   │
│Cluster│      │Cluster│      │Cluster│      │Cluster│      │Cluster│      │Cluster│
└──────┘       └──────┘       └──────┘       └──────┘       └──────┘       └──────┘
           │                                │                                │
           └────────────────────────────────┼────────────────────────────────┘
                                            │
                           ┌────────────────┴────────────────┐
                           │                                  │
                           ▼                                  ▼
                   ┌───────────────┐                 ┌───────────────┐
                   │ Primary DB    │ ───────────────→│ Read Replicas │
                   │ (US-EAST)     │  Cross-region   │ (EU, APAC)    │
                   │               │   replication   │               │
                   └───────────────┘                 └───────────────┘
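
The routing decision in the diagram above can be sketched as follows. This is a simplified illustration, not Route 53's API: the region identifiers, the failover preference order, and the function names are assumptions. The sketch assumes reads are served by the nearest healthy region, while writes always target the single primary database in US-EAST.

```typescript
type Region = "us-east" | "eu-west" | "ap-southeast";

// Hypothetical failover preference for each home region.
// The ordering beyond each region's own entry is an assumption.
const failoverOrder: Record<Region, Region[]> = {
  "us-east": ["us-east", "eu-west", "ap-southeast"],
  "eu-west": ["eu-west", "us-east", "ap-southeast"],
  "ap-southeast": ["ap-southeast", "us-east", "eu-west"],
};

// Reads: pick the first healthy region in the home region's preference list.
function pickReadRegion(home: Region, healthy: ReadonlySet<Region>): Region | undefined {
  return failoverOrder[home].find((region) => healthy.has(region));
}

// Writes: the diagram shows one primary database, located in US-EAST.
const WRITE_REGION: Region = "us-east";

// Returns undefined when the primary's region is down, which signals that a
// read replica must be promoted before writes can resume.
function pickWriteRegion(healthy: ReadonlySet<Region>): Region | undefined {
  return healthy.has(WRITE_REGION) ? WRITE_REGION : undefined;
}
```

Separating the two functions makes the asymmetry explicit: losing EU-WEST or AP-SOUTHEAST only reroutes reads, whereas losing US-EAST also blocks writes until a replica is promoted.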

Failure Handling Strategies

•Regional Failover — If US-EAST fails, Route 53 health checks trigger automatic failover to EU-WEST within 60 seconds
•Read Replica Promotion — A cross-region read replica (EU or APAC) can be promoted to primary if the US-EAST primary database is lost
•Cache Fallback — If Elasticsearch is unavailable, degrade to Redis-based filtered listings (reduced functionality)
•Circuit Breakers — Prevent cascade failures when downstream services are struggling
•Graceful Degradation — Show cached product data even if real-time inventory check fails
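
The circuit breaker and graceful degradation strategies above can be combined in one small wrapper. The following is a minimal sketch rather than a production implementation: the class name, the thresholds, and the injectable clock are assumptions, and it does not limit concurrent trial calls in the half-open state.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private consecutiveFailures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now; // injectable clock keeps the breaker testable
  }

  // Returns false while the breaker is open, so callers fail fast.
  // After the reset timeout, the breaker moves to half-open and allows a trial call.
  allowRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.state = "closed";
    this.consecutiveFailures = 0;
  }

  // Opens the breaker when a trial call fails or the failure threshold is reached.
  recordFailure(): void {
    this.consecutiveFailures += 1;
    if (this.state === "half-open" || this.consecutiveFailures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }
}

// Graceful degradation: try the primary call, otherwise serve the fallback
// (for example, cached product data without a live inventory check).
async function withFallback<T>(
  breaker: CircuitBreaker,
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<T> {
  if (!breaker.allowRequest()) {
    return fallback();
  }
  try {
    const result = await primary();
    breaker.recordSuccess();
    return result;
  } catch {
    breaker.recordFailure();
    return fallback();
  }
}
```

A caller would typically keep one breaker per downstream dependency (inventory, Elasticsearch, and so on), so that a struggling service is skipped quickly while the rest of the page still renders.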

Health Monitoring

•Synthetic monitoring with end-to-end transaction tests
•p50, p95, p99 latency tracking per endpoint
•Cache hit rate monitoring with alerts
•Search relevance quality metrics
•Sync lag measurement across stores
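
Two of the metrics above are simple to compute from raw samples. The sketch below uses the nearest-rank method for percentiles, which is one common definition and an assumption here; the function names are illustrative. Real monitoring systems usually compute percentiles from histograms or sketches rather than sorting every sample.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it.
function percentile(samplesMs: readonly number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error("at least one sample is required");
  }
  if (p <= 0 || p > 100) {
    throw new RangeError("p must be in the range (0, 100]");
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point drift in p / 100.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Cache hit rate as a fraction between 0 and 1; returns 0 when there was no traffic.
function cacheHitRate(hits: number, misses: number): number {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}
```

Tracking p99 alongside p50 matters because a single slow outlier barely moves the median but dominates the tail, and the tail is what the alerting thresholds are defined on.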

Alerting Thresholds

•Page load p99 > 500ms → Page oncall
•Search p99 > 1s → Escalate immediately
•Error rate > 0.1% → Automatic rollback
•Cache hit rate < 70% → Investigate
•Sync lag > 5 min → High priority alert
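
The thresholds above can be expressed as data and evaluated against a metrics snapshot. This is a hedged sketch: the `CatalogMetrics` shape, the action names, and the choice to express rates as fractions are assumptions made for illustration.

```typescript
interface CatalogMetrics {
  pageLoadP99Ms: number;
  searchP99Ms: number;
  errorRate: number;      // fraction of requests, so 0.001 means 0.1%
  cacheHitRate: number;   // fraction of lookups, so 0.70 means 70%
  syncLagSeconds: number;
}

type AlertAction =
  | "page_oncall"
  | "escalate_immediately"
  | "automatic_rollback"
  | "investigate"
  | "high_priority_alert";

interface AlertRule {
  name: string;
  breached: (m: CatalogMetrics) => boolean;
  action: AlertAction;
}

// One rule per threshold listed above; comparisons are strict, as in the list.
const alertRules: AlertRule[] = [
  { name: "Page load p99 > 500ms", breached: (m) => m.pageLoadP99Ms > 500, action: "page_oncall" },
  { name: "Search p99 > 1s", breached: (m) => m.searchP99Ms > 1000, action: "escalate_immediately" },
  { name: "Error rate > 0.1%", breached: (m) => m.errorRate > 0.001, action: "automatic_rollback" },
  { name: "Cache hit rate < 70%", breached: (m) => m.cacheHitRate < 0.7, action: "investigate" },
  { name: "Sync lag > 5 min", breached: (m) => m.syncLagSeconds > 300, action: "high_priority_alert" },
];

// Returns the actions for every breached rule, in rule order.
function evaluateAlerts(metrics: CatalogMetrics): AlertAction[] {
  return alertRules.filter((rule) => rule.breached(metrics)).map((rule) => rule.action);
}
```

Keeping the rules in a table rather than scattered `if` statements makes the thresholds easy to review and adjust without touching the evaluation logic.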

Catalog Architecture Summary

We've covered the complete architecture of a product catalog system designed for Amazon-scale operations. Let's consolidate the key architectural decisions:

Key Architectural Principles

•Polyglot Persistence — Use the right store for each access pattern: relational for complex queries, Elasticsearch for search, Redis for speed
•CDC for Consistency — Change Data Capture enables reliable, ordered synchronization across all stores without tight coupling
•Multi-Layer Caching — CDN, Redis, and local caches each serve different purposes with tiered TTLs
•Version-Based Invalidation — Lazy invalidation via version numbers provides efficient cache management at scale
•Multi-Region with Active-Active Reads — Each region serves reads from its own Redis and Elasticsearch clusters, while writes are coordinated globally through a single primary database
•Graceful Degradation — System continues functioning with reduced features when components fail

What's Next:

With the catalog architecture established, we'll dive into the Shopping Cart Service—one of the most stateful components in e-commerce. You'll learn how to design a cart that persists across sessions and devices, handles race conditions when inventory changes, and seamlessly merges guest and authenticated user carts.

Page Complete

You now understand how to architect a product catalog system that can serve 350M+ products to millions of concurrent users with sub-200ms latency. The key insight is that 'the catalog' is actually multiple specialized systems working together, each optimized for its specific purpose.


System Design HLDAmazon E-Commerce

Designing Amazon: E-Commerce at Massive Scale

LevelAdvanced

Duration120 mins

TopicAmazon E-Commerce

2 / 6

Product Catalog at Scale

The Heart of E-Commerce

70,000+ page views per second during normal operation
Full-text search with faceted filtering across all products in under 300ms
Real-time updates as sellers modify listings, prices change, and inventory fluctuates
Consistency between search results and product pages (no broken links or stale data)

This page will take you through the complete architecture of a catalog system designed for this scale.

Learning Objectives

Product Data Model Design

Before diving into architecture, we must understand what we're storing. Product data is surprisingly complex—far more than a simple table of items with names and prices.

The Core Entities:

A product in an e-commerce catalog is actually a hierarchical structure:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
// Core Product Entity - The "Parent" product
interface Product {
  id: string;                          // Globally unique identifier (ASIN equivalent)
  sellerId: string;                    // Merchant/brand who owns the listing
  
  // Basic Information
  title: string;                       // "Sony WH-1000XM5 Wireless Noise Canceling Headphones"
  brand: string;                       // "Sony"
  manufacturer: string;                // May differ from brand
  description: string;                 // HTML-formatted long description
  bulletPoints: string[];              // Key feature highlights
  
  // Categorization
  categoryPath: string[];              // ["Electronics", "Audio", "Headphones", "Over-Ear"]
  categoryIds: string[];               // Internal category identifiers
  
  // Attributes (category-specific)
  attributes: Record<string, Attribute>;
  
  // Search optimization
  keywords: string[];                  // Seller-provided search terms
  
  // State management
  status: 'draft' | 'pending_review' | 'active' | 'suppressed' | 'archived';
  createdAt: Timestamp;
  updatedAt: Timestamp;
  
  // Relationships
  variants: ProductVariant[];          // Color/size variations
  
  // Aggregate data (computed)
  averageRating: number;               // 4.7
  reviewCount: number;                 // 12,847
  priceRange: PriceRange;              // Min/max across variants
}
 
// Product Variants - Each purchasable item
interface ProductVariant {
  id: string;                          // SKU-level identifier
  productId: string;                   // Parent product reference
  
  // Variant-defining attributes
  variantAttributes: {
    color?: string;                    // "Black", "Silver", "Midnight Blue"
    size?: string;                     // "Small", "Medium", "Large"
    configuration?: string;            // "256GB", "512GB"
    style?: string;                    // "Standard", "Premium Edition"
  };
  
  // Pricing (can vary by variant)
  listPrice: Money;                    // MSRP
  currentPrice: Money;                 // Active selling price
  dealPrice?: Money;                   // Special promotion price
  dealEndTime?: Timestamp;             // When deal expires
  
  // Availability
  inStock: boolean;                    // Aggregated stock status
  stockLevel: StockLevel;              // 'in_stock' | 'low_stock' | 'out_of_stock'
  
  // Media
  images: ProductImage[];              // Variant-specific images
  
  // Shipping
  dimensions: Dimensions;              // For shipping calculation
  weight: Weight;
  fulfillmentOptions: FulfillmentOption[];
}
 
// Complex attribute structure supporting rich product data
interface Attribute {
  name: string;                        // "Noise Cancellation Type"
  value: string | string[] | number;   // "Active" or ["Active", "Passive"]
  unit?: string;                       // "hours", "inches", "GB"
  filterable: boolean;                 // Can be used in search facets
  displayOrder: number;                // UI rendering order
  normalizedValue?: string;            // Standardized for comparison
}

Why Parent-Variant Separation?

Category-Specific Attributes:

This requires a flexible schema approach:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
// Category defines which attributes are relevant and how they're validated
interface CategorySchema {
  categoryId: string;
  name: string;
  parentId?: string;                    // For hierarchy
  
  // Required attributes for this category
  requiredAttributes: AttributeDefinition[];
  
  // Optional but recommended attributes
  optionalAttributes: AttributeDefinition[];
  
  // Validation rules
  validationRules: ValidationRule[];
}
 
interface AttributeDefinition {
  name: string;
  displayName: string;                  // User-facing name
  type: 'string' | 'number' | 'boolean' | 'enum' | 'multi-enum' | 'range';
  enumValues?: string[];                // For enum types
  unit?: string;
  filterable: boolean;                  // Show in faceted search
  comparable: boolean;                  // Show in product comparison
  searchable: boolean;                  // Include in full-text search
}
 
// Example: Electronics > Audio > Headphones category schema
const headphonesCategorySchema: CategorySchema = {
  categoryId: "electronics_audio_headphones",
  name: "Headphones",
  parentId: "electronics_audio",
  requiredAttributes: [
    { name: "headphone_type", displayName: "Type", type: "enum", 
      enumValues: ["Over-Ear", "On-Ear", "In-Ear", "Earbuds"], filterable: true },
    { name: "connectivity", displayName: "Connectivity", type: "multi-enum",
      enumValues: ["Wireless", "Bluetooth", "3.5mm Jack", "USB-C"], filterable: true },
    { name: "noise_cancellation", displayName: "Noise Cancellation", type: "enum",
      enumValues: ["Active", "Passive", "None"], filterable: true },
  ],
  optionalAttributes: [
    { name: "battery_life", displayName: "Battery Life", type: "number", 
      unit: "hours", filterable: true, searchable: false },
    { name: "driver_size", displayName: "Driver Size", type: "number",
      unit: "mm", filterable: true },
    { name: "frequency_response", displayName: "Frequency Response", type: "range",
      unit: "Hz" },
  ],
  validationRules: [
    { rule: "battery_life_required_for_wireless",
      condition: "connectivity includes 'Wireless'",
      requirement: "battery_life is required" }
  ]
};

Storage Architecture

A production catalog system uses multiple storage layers optimized for different access patterns. No single database can efficiently handle all catalog operations.

The Three-Layer Storage Pattern:

Storage Layer Architecture

•Primary Store (Source of Truth) — Relational or document database holding the authoritative product data. Used for writes, complex queries by sellers, and as the source for sync to other stores.
•Search Store — Elasticsearch or similar inverted-index database optimized for full-text search and faceted navigation. Powers customer search and browse experiences.
•Read Cache — Redis or Memcached holding hot product data for direct lookups. Powers product detail pages and API responses.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
┌─────────────────────────────────────────────────────────────────────────┐
│                        CATALOG STORAGE ARCHITECTURE                      │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  ┌──────────────────┐                                                   │
│  │   WRITE PATH     │                                                   │
│  │                  │                                                   │
│  │  Seller Portal   │──────┐                                            │
│  │  Admin Tools     │      │                                            │
│  │  Bulk Import     │      │                                            │
│  └──────────────────┘      │                                            │
│                            ▼                                            │
│  ┌───────────────────────────────────────────┐                         │
│  │         PRIMARY STORE (Source of Truth)    │                         │
│  │    ┌────────────────────────────────────┐ │                         │
│  │    │       PostgreSQL / DynamoDB         │ │                         │
│  │    │  • All product data with history    │ │                         │
│  │    │  • Strong consistency on writes     │ │                         │
│  │    │  • Complex seller queries           │ │                         │
│  │    │  • ACID transactions for updates    │ │                         │
│  │    └────────────────────────────────────┘ │                         │
│  └─────────────────────┬─────────────────────┘                         │
│                        │                                                │
│                        │ Change Data Capture (CDC)                      │
│                        │                                                │
│         ┌──────────────┴──────────────┐                                │
│         │                             │                                │
│         ▼                             ▼                                │
│  ┌─────────────────────┐    ┌─────────────────────┐                   │
│  │    SEARCH STORE     │    │     READ CACHE      │                   │
│  │  ┌───────────────┐  │    │  ┌───────────────┐  │                   │
│  │  │ Elasticsearch │  │    │  │     Redis     │  │                   │
│  │  │               │  │    │  │               │  │                   │
│  │  │• Full-text    │  │    │  │• Hot products │  │                   │
│  │  │• Faceted nav  │  │    │  │• Sub-ms reads │  │                   │
│  │  │• Aggregations │  │    │  │• TTL-based    │  │                   │
│  │  │• Analytics    │  │    │  │• Cache-aside  │  │                   │
│  │  └───────────────┘  │    │  └───────────────┘  │                   │
│  └──────────┬──────────┘    └─────────┬───────────┘                   │
│             │                         │                                │
│             └────────────┬────────────┘                                │
│                          │                                             │
│                          ▼                                             │
│           ┌────────────────────────────┐                              │
│           │      READ PATH (APIs)       │                              │
│           │                             │                              │
│           │  • Product pages → Cache    │                              │
│           │  • Search → Elasticsearch   │                              │
│           │  • Browse → Elasticsearch   │                              │
│           └────────────────────────────┘                              │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Primary Store Choice: PostgreSQL vs DynamoDB

The choice of primary store depends on your specific requirements:

PostgreSQL with JSON columns works well when:

You need complex relational queries (seller dashboards, analytics)
Your team has strong SQL expertise
You can accept the operational overhead of managing PostgreSQL at scale
You need ACID transactions across product updates

DynamoDB works well when:

You have well-defined access patterns (primarily key-based lookups)
You need predictable performance at any scale
You want managed infrastructure with minimal operations
You can model data to fit DynamoDB's single-table design

For Amazon-scale catalogs, a hybrid approach is common: DynamoDB for the high-throughput primary product data with PostgreSQL for seller analytics and complex reporting.

Storage Layer Characteristics
Characteristic	Primary Store	Search Store	Cache Layer
Technology	PostgreSQL or DynamoDB	Elasticsearch	Redis Cluster
Data Volume	350M products × 50KB = 17.5TB	350M docs × 5KB = 1.75TB (indexed)	Hot 10M × 5KB = 50GB
Write Pattern	Thousands of updates/second	Async sync from CDC	On-demand populate
Read Latency	5-50ms	20-100ms	<5ms
Consistency	Strong (source of truth)	Eventual (seconds lag)	Eventual (TTL-based)
Primary Use	Authoritative data	Search & browse	Product page speed

Search Architecture Deep Dive

Search is the primary way customers interact with the catalog. A well-designed search system must handle:

Full-text queries with typo tolerance and synonym matching
Faceted filtering (brand, price range, ratings, category-specific attributes)
Relevance ranking that considers query match, popularity, and personalization
Real-time updates as products change
100,000+ queries per second during peak traffic

Elasticsearch Architecture:

For a catalog of 350M+ products, a single Elasticsearch cluster isn't sufficient. We need a distributed architecture:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
{
  "settings": {
    "number_of_shards": 48,
    "number_of_replicas": 2,
    "refresh_interval": "30s",
    "analysis": {
      "analyzer": {
        "product_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "snowball", "synonym_filter"]
        },
        "autocomplete_analyzer": {
          "type": "custom",
          "tokenizer": "edge_ngram_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 15,
          "token_chars": ["letter", "digit"]
        }
      },
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms_path": "synonyms.txt"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "product_id": { "type": "keyword" },
      "title": { 
        "type": "text",
        "analyzer": "product_analyzer",
        "fields": {
          "autocomplete": { "type": "text", "analyzer": "autocomplete_analyzer" },
          "exact": { "type": "keyword" }
        }
      },
      "description": { "type": "text", "analyzer": "product_analyzer" },
      "brand": { 
        "type": "text",
        "fields": { "keyword": { "type": "keyword" } }
      },
      "category_path": { "type": "keyword" },
      "price": { "type": "scaled_float", "scaling_factor": 100 },
      "rating": { "type": "float" },
      "review_count": { "type": "integer" },
      "in_stock": { "type": "boolean" },
      "attributes": { "type": "flattened" },
      "popularity_score": { "type": "float" },
      "search_keywords": { "type": "text", "analyzer": "product_analyzer" },
      "suggestion": {
        "type": "completion",
        "analyzer": "product_analyzer"
      }
    }
  }
}

Search Index Design Principles

•Shard Count — 48 shards for 350M products allows ~7M docs per shard, enabling parallel search across nodes
•Replica Count — 2 replicas provide read scalability and fault tolerance; each query can hit any replica
•Refresh Interval — 30s refresh reduces indexing overhead; real-time isn't needed for search (new products can wait)
•Multi-field Mapping — Title indexed multiple ways: full-text for search, edge-ngrams for autocomplete, keyword for exact match
•Flattened Attributes — Category-specific attributes stored as flattened type for flexible faceting without schema changes
•Completion Suggester — Built-in autocomplete using specialized data structure for prefix-based suggestions

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": [
            {
              "multi_match": {
                "query": "wireless headphones",
                "fields": ["title^3", "brand^2", "description", "search_keywords"],
                "type": "best_fields",
                "fuzziness": "AUTO"
              }
            }
          ],
          "filter": [
            { "term": { "in_stock": true } },
            { "range": { "price": { "lte": 200 } } },
            { "terms": { "attributes.noise_cancellation": ["Active"] } }
          ]
        }
      },
      "functions": [
        {
          "field_value_factor": {
            "field": "popularity_score",
            "factor": 1.2,
            "modifier": "log1p"
          }
        },
        {
          "field_value_factor": {
            "field": "rating",
            "factor": 1.5,
            "modifier": "sqrt"
          }
        },
        {
          "filter": { "range": { "review_count": { "gte": 100 } } },
          "weight": 1.3
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "multiply"
    }
  },
  "aggs": {
    "brand_facet": {
      "terms": { "field": "brand.keyword", "size": 20 }
    },
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 50 },
          { "from": 50, "to": 100 },
          { "from": 100, "to": 200 },
          { "from": 200 }
        ]
      }
    },
    "rating_histogram": {
      "histogram": { "field": "rating", "interval": 1 }
    }
  },
  "highlight": {
    "fields": {
      "title": {},
      "description": { "number_of_fragments": 2 }
    }
  },
  "size": 24,
  "from": 0
}

Function Scoring for Relevance

Multi-Layer Caching Strategy

Caching is critical for catalog performance. With 70,000 QPS hitting product pages, even a small cache improvement has massive impact. We employ multiple caching layers:

Caching Layers in Product Catalog
Layer	Technology	TTL	Hit Rate Target	Purpose
CDN Edge	CloudFront/Fastly	5 min	40%	Static assets, popular product pages
Application Cache	Redis Cluster	15 min	85%	Product data, computed fields
Local Cache	In-process LRU	1 min	60%	Extremely hot keys, reduce Redis hops
Search Cache	Elasticsearch cache	30 sec	70%	Common search queries, facet aggregations

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
// Cache key design for product catalog
 
interface CacheKeyStrategy {
  // Product data - versioned to enable instant invalidation
  productData: (productId: string, version: number) => string;
  
  // Price data - separate cache with shorter TTL (prices change frequently)
  productPrice: (productId: string, variantId: string, userId?: string) => string;
  
  // Search results - include normalized query and filters
  searchResults: (queryHash: string, page: number) => string;
  
  // Category data - hierarchical with parent invalidation
  categoryProducts: (categoryId: string, filters: string, page: number) => string;
}
 
const cacheKeys: CacheKeyStrategy = {
  productData: (productId, version) => 
    `product:data:${productId}:v${version}`,
  
  productPrice: (productId, variantId, userId) =>
    userId 
      ? `product:price:${productId}:${variantId}:user:${userId}`  // Personalized
      : `product:price:${productId}:${variantId}:default`,         // Standard
  
  searchResults: (queryHash, page) =>
    `search:results:${queryHash}:page:${page}`,
  
  categoryProducts: (categoryId, filters, page) =>
    `category:${categoryId}:filters:${filters}:page:${page}`
};
 
// Cache-aside pattern with version checking
async function getProduct(productId: string): Promise<Product> {
  // Get current version from lightweight metadata
  const currentVersion = await redis.get(`product:version:${productId}`);
  
  // Try cache with version
  const cacheKey = cacheKeys.productData(productId, parseInt(currentVersion || '0'));
  const cached = await redis.get(cacheKey);
  
  if (cached) {
    return JSON.parse(cached);
  }
  
  // Cache miss - fetch from primary store
  const product = await primaryStore.getProduct(productId);
  
  // Store in cache with TTL
  await redis.setex(cacheKey, 900, JSON.stringify(product));  // 15 min TTL
  
  return product;
}
 
// On product update, just bump the version
async function invalidateProduct(productId: string): Promise<void> {
  await redis.incr(`product:version:${productId}`);
  // Old cached entries will naturally expire
  // New requests get new version number, triggering cache miss
}

Advanced Caching Patterns

•Cache Stampede Prevention — Use probabilistic early expiration or locking to prevent thundering herd on popular products
•Write-Through for Critical Data — Prices and availability bypass cache-aside to ensure immediate consistency
•Negative Caching — Cache '404' results for deleted products to prevent repeated DB lookups
•Tiered TTLs — Static product info (description) has long TTL; dynamic data (price, stock) has short TTL
•Cache Warming — Pre-populate cache for predicted hot products before major sale events

The Cache Consistency Challenge

Cross-Store Data Synchronization

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
┌─────────────────────────────────────────────────────────────────────────────┐
│                     CATALOG SYNCHRONIZATION PIPELINE                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   ┌──────────────┐                                                          │
│   │  PostgreSQL  │                                                          │
│   │  (Primary)   │                                                          │
│   └──────┬───────┘                                                          │
│          │ WAL (Write-Ahead Log)                                            │
│          │                                                                   │
│          ▼                                                                   │
│   ┌──────────────┐                                                          │
│   │   Debezium   │  Captures all changes as events                          │
│   │ (CDC Engine) │                                                          │
│   └──────┬───────┘                                                          │
│          │                                                                   │
│          ▼                                                                   │
│   ┌──────────────────────────────────────────────────────┐                  │
│   │                    Apache Kafka                       │                  │
│   │   ┌─────────────────────────────────────────────┐    │                  │
│   │   │  catalog.products.changes (partitioned by   │    │                  │
│   │   │  productId for ordering guarantees)         │    │                  │
│   │   └─────────────────────────────────────────────┘    │                  │
│   └─────────────────────────┬────────────────────────────┘                  │
│                             │                                                │
│        ┌────────────────────┼────────────────────┐                          │
│        │                    │                    │                          │
│        ▼                    ▼                    ▼                          │
│  ┌───────────┐       ┌───────────┐       ┌───────────┐                     │
│  │   ES      │       │   Redis   │       │   CDN     │                     │
│  │ Indexer   │       │ Updater   │       │ Purger    │                     │
│  │           │       │           │       │           │                     │
│  │• Batch    │       │• Increment│       │• API call │                     │
│  │  indexing │       │  version  │       │  to purge │                     │
│  │• Bulk API │       │• Update   │       │• Selective│                     │
│  │• Retry on │       │  price if │       │  invalidate│                    │
│  │  failure  │       │  changed  │       │           │                     │
│  └───────────┘       └───────────┘       └───────────┘                     │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
// Kafka consumer processing catalog change events
 
interface ProductChangeEvent {
  eventId: string;
  timestamp: number;
  operation: 'INSERT' | 'UPDATE' | 'DELETE';
  productId: string;
  before?: Partial<Product>;   // Previous state (for updates)
  after?: Product;             // New state (for inserts/updates)
  changedFields?: string[];    // Which fields changed
}
 
class CatalogSyncConsumer {
  async processEvent(event: ProductChangeEvent): Promise<void> {
    const { productId, operation, after, changedFields } = event;
    
    try {
      // Always sync to Elasticsearch (full document)
      if (operation === 'DELETE') {
        await this.elasticsearch.delete('products', productId);
      } else {
        await this.elasticsearch.index('products', productId, 
          this.transformForSearch(after));
      }
      
      // Increment cache version (lazy invalidation)
      await this.redis.incr(`product:version:${productId}`);
      
      // If price changed, also update price cache immediately
      if (changedFields?.includes('currentPrice')) {
        await this.updatePriceCache(productId, after);
      }
      
      // If it's a popular product, purge CDN cache
      if (await this.isHighTrafficProduct(productId)) {
        await this.cdn.purge(`/products/${productId}*`);
      }
      
      // Record sync completion for monitoring
      await this.metrics.recordSync(productId, event.timestamp);
      
    } catch (error) {
      // Send to dead-letter queue for retry
      await this.deadLetterQueue.send(event, error);
      throw error;
    }
  }
  
  // Transform product for search-optimized format
  private transformForSearch(product: Product): SearchDocument {
    return {
      product_id: product.id,
      title: product.title,
      description: product.description,
      brand: product.brand,
      category_path: product.categoryPath,
      price: this.getLowestVariantPrice(product),
      rating: product.averageRating,
      review_count: product.reviewCount,
      in_stock: this.hasAnyInStockVariant(product),
      attributes: this.flattenAttributes(product.attributes),
      popularity_score: this.calculatePopularity(product),
      search_keywords: product.keywords.join(' '),
      updated_at: new Date().toISOString()
    };
  }
}

Why Kafka for CDC?

High Availability Architecture

Multi-Region Deployment Strategy:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
                                    ┌───────────────┐
                                    │    Global     │
                                    │ Load Balancer │
                                    │  (Route 53)   │
                                    └───────┬───────┘
                                            │
           ┌────────────────────────────────┼────────────────────────────────┐
           │                                │                                │
           ▼                                ▼                                ▼
   ┌───────────────┐               ┌───────────────┐               ┌───────────────┐
   │   US-EAST     │               │   EU-WEST     │               │  AP-SOUTHEAST │
   │ (Primary for  │               │ (Primary for  │               │ (Primary for  │
   │  Americas)    │               │  Europe)      │               │  Asia)        │
   └───────┬───────┘               └───────┬───────┘               └───────┬───────┘
           │                                │                                │
   ┌───────┴───────┐               ┌───────┴───────┐               ┌───────┴───────┐
   │               │               │               │               │               │
   ▼               ▼               ▼               ▼               ▼               ▼
┌───────┐       ┌───────┐       ┌───────┐       ┌───────┐       ┌───────┐       ┌───────┐
│ Redis │       │  ES   │       │ Redis │       │  ES   │       │ Redis │       │  ES   │
│Cluster│       │Cluster│       │Cluster│       │Cluster│       │Cluster│       │Cluster│
└───────┘       └───────┘       └───────┘       └───────┘       └───────┘       └───────┘
           │                                │                                │
           └────────────────────────────────┼────────────────────────────────┘
                                            │
                           ┌────────────────┴────────────────┐
                           │                                  │
                           ▼                                  ▼
                   ┌───────────────┐                 ┌───────────────┐
                   │ Primary DB    │ ───────────────→│ Read Replicas │
                   │ (US-EAST)     │  Cross-region   │ (EU, APAC)    │
                   │               │   replication   │               │
                   └───────────────┘                 └───────────────┘

Failure Handling Strategies

•Regional Failover — If US-EAST fails, Route 53 health checks trigger automatic failover to EU-WEST within 60 seconds
•Read Replica Promotion — A cross-region read replica (EU or APAC) can be promoted to primary if the US-EAST primary database is lost
•Cache Fallback — If Elasticsearch is unavailable, degrade to Redis-based filtered listings (reduced functionality)
•Circuit Breakers — Prevent cascade failures when downstream services are struggling
•Graceful Degradation — Show cached product data even if real-time inventory check fails
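
The circuit breaker and graceful degradation strategies above can be sketched together. This is a minimal illustration, not a production implementation: the class name CircuitBreaker, the helper withFallback, and the threshold and cooldown parameters are all assumptions made for the example. The breaker opens after a run of consecutive failures, rejects calls during a cooldown, then lets trial traffic through; the helper returns a cached value whenever the live call is blocked or fails.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Hypothetical breaker: names and parameters are illustrative assumptions.
class CircuitBreaker {
  private state: BreakerState = "closed";
  private consecutiveFailures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number; // consecutive failures before opening
  private readonly cooldownMs: number;       // how long to stay open before a trial
  private readonly now: () => number;        // injectable clock, useful in tests

  constructor(
    failureThreshold: number,
    cooldownMs: number,
    now: () => number = () => Date.now()
  ) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True if a call to the downstream service may be attempted right now.
  allowRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: allow trial traffic
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
    // A failed trial, or too many failures in a row, (re)opens the breaker.
    if (this.state === "half-open" || this.consecutiveFailures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}

// Graceful degradation: prefer the live call, otherwise serve the fallback
// (for example, cached product data when the inventory check is failing).
async function withFallback<T>(
  breaker: CircuitBreaker,
  live: () => Promise<T>,
  fallback: () => T
): Promise<T> {
  if (!breaker.allowRequest()) return fallback();
  try {
    const result = await live();
    breaker.recordSuccess();
    return result;
  } catch {
    breaker.recordFailure();
    return fallback();
  }
}
```

In a product page handler, live would wrap the real-time inventory call and fallback would read the last cached availability, so a struggling inventory service slows nothing down once the breaker is open.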

Health Monitoring

•Synthetic monitoring with end-to-end transaction tests
•p50, p95, p99 latency tracking per endpoint
•Cache hit rate monitoring with alerts
•Search relevance quality metrics
•Sync lag measurement across stores
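
As a rough illustration of the latency tracking listed above, the sketch below computes p50, p95, and p99 from a batch of samples using the nearest-rank method. The function names percentile and latencySummary are assumptions made for this example; a real monitoring stack would more likely use streaming histograms than sort raw samples.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("percentile of empty sample set");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b); // numeric sort, copy first
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// Summary for one endpoint's latency samples (milliseconds).
function latencySummary(samplesMs: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}
```

The gap between p50 and p99 is the useful signal here: a healthy median with a poor p99 usually points at a slow shard, a cold cache, or a struggling dependency rather than a uniformly slow service.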

Alerting Thresholds

•Page load p99 > 500ms → Page oncall
•Search p99 > 1s → Escalate immediately
•Error rate > 0.1% → Automatic rollback
•Cache hit rate < 70% → Investigate
•Sync lag > 5 min → High priority alert
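
The thresholds above can be expressed as data and evaluated against a metrics snapshot. The following is a sketch under stated assumptions: the metric names, the ALERT_RULES table, and the evaluateAlerts function are hypothetical, and only the threshold values and actions come from the list above.

```typescript
type MetricName =
  | "pageLoadP99Ms"
  | "searchP99Ms"
  | "errorRatePct"
  | "cacheHitRatePct"
  | "syncLagMin";

interface AlertRule {
  metric: MetricName;
  breached: (value: number) => boolean;
  action: string;
}

// Threshold values and actions mirror the list above; names are illustrative.
const ALERT_RULES: AlertRule[] = [
  { metric: "pageLoadP99Ms", breached: (v) => v > 500, action: "Page oncall" },
  { metric: "searchP99Ms", breached: (v) => v > 1000, action: "Escalate immediately" },
  { metric: "errorRatePct", breached: (v) => v > 0.1, action: "Automatic rollback" },
  { metric: "cacheHitRatePct", breached: (v) => v < 70, action: "Investigate" },
  { metric: "syncLagMin", breached: (v) => v > 5, action: "High priority alert" },
];

interface FiredAlert {
  metric: MetricName;
  value: number;
  action: string;
}

// Returns one entry per breached rule, in rule order.
function evaluateAlerts(snapshot: Record<MetricName, number>): FiredAlert[] {
  return ALERT_RULES
    .filter((rule) => rule.breached(snapshot[rule.metric]))
    .map((rule) => ({
      metric: rule.metric,
      value: snapshot[rule.metric],
      action: rule.action,
    }));
}
```

Keeping the rules in a table rather than scattered conditionals makes the thresholds easy to review and tune; note that the comparisons are strict, so a page load p99 of exactly 500ms does not fire.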

Catalog Architecture Summary

We've covered the complete architecture of a product catalog system designed for Amazon-scale operations. Let's consolidate the key architectural decisions:

Key Architectural Principles

•Polyglot Persistence — Use the right store for each access pattern: relational for complex queries, Elasticsearch for search, Redis for speed
•CDC for Consistency — Change Data Capture enables reliable, ordered synchronization across all stores without tight coupling
•Multi-Layer Caching — CDN, Redis, and local caches each serve different purposes with tiered TTLs
•Version-Based Invalidation — Lazy invalidation via version numbers provides efficient cache management at scale
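•Active-Active Multi-Region — Each region serves reads from its own Redis and Elasticsearch clusters, while writes are coordinated through the primary database in US-EAST and replicated to the other regions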
•Graceful Degradation — System continues functioning with reduced features when components fail
