Amazon's recommendation engine is responsible for an estimated 35% of all purchases on the platform. When you see 'Customers who bought this item also bought...', 'Frequently bought together', or the personalized homepage, you're witnessing one of the most sophisticated recommendation systems ever built.
At Amazon's scale, the recommendation system must serve billions of requests against a catalog of millions of items while keeping latency low. This isn't just about showing related products: it's about understanding user intent, predicting future needs, and presenting the right product at the right moment. The difference between good and great recommendations translates directly into billions of dollars in revenue.
This page explores the architecture that makes this possible.
By the end of this page, you will understand the key recommendation algorithms (collaborative filtering, content-based, hybrid); how to architect a real-time recommendation serving system; strategies for handling cold start and data sparsity; and the infrastructure needed to train and deploy models at scale.
Different recommendation surfaces serve different purposes. Each requires a tailored approach optimized for its specific context and user intent.
| Type | Location | Primary Goal | Algorithm Style |
|---|---|---|---|
| Personalized Homepage | Homepage | Re-engagement, discovery | User-based CF + trending |
| Similar Items | Product page | Comparison shopping | Item-based CF + content |
| Frequently Bought Together | Product page | Bundle completion | Association rules |
| Customers Also Viewed | Product page | Alternative exploration | Session-based patterns |
| Recently Viewed | Homepage/Cart | Continuation | User history retrieval |
| Search Recommendations | Search results | Query refinement | Semantic similarity |
| Cart Recommendations | Shopping cart | Cross-sell, upsell | Complementary items |
| Email Recommendations | Marketing email | Re-engagement | User preference + recency |
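One way to read the table is as a dispatch problem: each surface routes to a different algorithm family. A minimal sketch of that routing in TypeScript (slot names and strategy labels are illustrative, not Amazon's actual identifiers):

```typescript
type SlotType =
  | "homepage" | "similar_items" | "frequently_bought_together"
  | "also_viewed" | "recently_viewed" | "search" | "cart" | "email";

// Hypothetical registry mirroring the table above: each surface
// maps to the algorithm style used to fill it.
const slotStrategy: Record<SlotType, string> = {
  homepage: "user_cf+trending",
  similar_items: "item_cf+content",
  frequently_bought_together: "association_rules",
  also_viewed: "session_patterns",
  recently_viewed: "user_history",
  search: "semantic_similarity",
  cart: "complementary_items",
  email: "user_preference+recency",
};

function strategyFor(slot: SlotType): string {
  return slotStrategy[slot];
}
```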
Production recommendation systems combine multiple approaches. Each algorithm has strengths and weaknesses that complement each other.
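Both collaborative filtering variants below rely on a vector similarity measure. A minimal cosine similarity for reference (in production this runs inside an approximate nearest neighbor index, never pairwise):

```typescript
// Cosine similarity between two interaction vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector carries no signal; define its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```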
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106
```typescript
// User-Based Collaborative Filtering
// "Users similar to you also liked..."

class UserBasedCF {
  // Find similar users based on their interaction history
  async findSimilarUsers(userId: string, k: number = 50): Promise<SimilarUser[]> {
    const userVector = await this.getUserInteractionVector(userId);

    // Compute cosine similarity with other users
    // In practice, use approximate nearest neighbors (ANN) for scale
    const similarities = await this.annIndex.query(userVector, k);

    return similarities.map(s => ({
      userId: s.id,
      similarity: s.score,
    }));
  }

  async recommendItems(userId: string, count: number = 20): Promise<Recommendation[]> {
    const similarUsers = await this.findSimilarUsers(userId, 100);
    const userPurchases = new Set(await this.getUserPurchases(userId));

    // Aggregate items from similar users, weighted by similarity
    const itemScores = new Map<string, number>();
    for (const { userId: similarUserId, similarity } of similarUsers) {
      const theirPurchases = await this.getUserPurchases(similarUserId);
      for (const itemId of theirPurchases) {
        if (!userPurchases.has(itemId)) { // Don't recommend already purchased
          const currentScore = itemScores.get(itemId) || 0;
          itemScores.set(itemId, currentScore + similarity);
        }
      }
    }

    // Sort by score and return top N
    return Array.from(itemScores.entries())
      .sort((a, b) => b[1] - a[1])
      .slice(0, count)
      .map(([itemId, score]) => ({ itemId, score, source: 'user_cf' }));
  }
}

// Item-Based Collaborative Filtering
// "Customers who bought this also bought..."

class ItemBasedCF {
  // Precompute item-item similarities during batch processing
  async computeItemSimilarities(): Promise<void> {
    // Build co-occurrence matrix from purchase history:
    // for each pair of items, count how many users bought both,
    // then normalize to get similarity scores
    const items = await this.getAllItems();

    for (const item of items) {
      const buyersOfItem = await this.getBuyers(item.id);
      const relatedItems = new Map<string, number>();

      for (const buyerId of buyersOfItem) {
        const theirPurchases = await this.getUserPurchases(buyerId);
        for (const otherItem of theirPurchases) {
          if (otherItem !== item.id) {
            const count = relatedItems.get(otherItem) || 0;
            relatedItems.set(otherItem, count + 1);
          }
        }
      }

      // Normalize by geometric mean of item popularities
      const itemPopularity = buyersOfItem.length;
      const similarities: ItemSimilarity[] = [];

      for (const [otherItemId, coCount] of relatedItems) {
        const otherPopularity = await this.getItemPopularity(otherItemId);
        const similarity = coCount / Math.sqrt(itemPopularity * otherPopularity);
        similarities.push({ itemId: otherItemId, similarity });
      }

      // Store top-K similar items per item
      const topK = similarities
        .sort((a, b) => b.similarity - a.similarity)
        .slice(0, 100);
      await this.storeItemSimilarities(item.id, topK);
    }
  }

  // Real-time: Get recommendations for a specific product
  async getSimilarItems(itemId: string): Promise<Recommendation[]> {
    // Direct lookup of precomputed similarities
    const similarities = await this.cache.get(`item_sim:${itemId}`);
    if (similarities) {
      return similarities.map(s => ({
        itemId: s.itemId,
        score: s.similarity,
        source: 'item_cf',
      }));
    }

    // Fall back to content-based if no collaborative data
    return this.contentBasedFallback(itemId);
  }
}
```
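The normalization step above, `coCount / sqrt(popA * popB)`, is cosine similarity over binary purchase vectors. A worked toy example on four users (purchase data is made up for illustration):

```typescript
// Toy purchase history: userId -> items purchased.
const purchases: Record<string, string[]> = {
  u1: ["A", "B"],
  u2: ["A", "B"],
  u3: ["A", "C"],
  u4: ["B"],
};

// Co-occurrence count normalized by the geometric mean of the
// two items' popularities, as in computeItemSimilarities above.
function itemSimilarity(itemX: string, itemY: string): number {
  const users = Object.values(purchases);
  const popX = users.filter(p => p.includes(itemX)).length;
  const popY = users.filter(p => p.includes(itemY)).length;
  const coCount = users.filter(
    p => p.includes(itemX) && p.includes(itemY)
  ).length;
  return coCount / Math.sqrt(popX * popY);
}
```

Here A and B are each bought by 3 users with 2 buyers in common, so their similarity is 2/3, while A and C share only one niche buyer and score 1/√3.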
```typescript
// Content-Based Filtering
// "Based on the attributes of items you've viewed..."

class ContentBasedRecommender {
  // Build item feature vectors from product attributes
  // (async because text encoding calls a pre-trained model)
  async buildItemVector(item: Product): Promise<number[]> {
    const features: number[] = [];

    // Category embedding (learned or one-hot)
    features.push(...this.categoryEmbedding.get(item.categoryId));

    // Brand embedding
    features.push(...this.brandEmbedding.get(item.brand));

    // Price bucket (normalized)
    features.push(this.normalizePriceBucket(item.price));

    // Attribute embeddings (varies by category)
    for (const attr of item.attributes) {
      if (this.attributeEmbeddings.has(attr.name)) {
        features.push(...this.attributeEmbeddings.get(attr.name).get(attr.value));
      }
    }

    // Text embeddings from title/description (using pre-trained model)
    const textEmbedding = await this.textEncoder.encode(
      `${item.title} ${item.description}`
    );
    features.push(...textEmbedding);

    return features;
  }

  // Build user preference vector from their interaction history
  async buildUserPreferenceVector(userId: string): Promise<number[]> {
    const interactions = await this.getUserInteractions(userId);

    // Weighted average of item vectors based on interaction type and recency
    let combinedVector = new Array(this.vectorDimension).fill(0);
    let totalWeight = 0;

    for (const interaction of interactions) {
      const itemVector = await this.getItemVector(interaction.itemId);

      // Weight by interaction type
      const typeWeight = this.interactionWeights[interaction.type];

      // Decay by recency (exponential decay, 7-day time constant)
      const recencyWeight = Math.exp(
        -(Date.now() - interaction.timestamp) / (7 * 24 * 60 * 60 * 1000)
      );

      const weight = typeWeight * recencyWeight;
      for (let i = 0; i < itemVector.length; i++) {
        combinedVector[i] += itemVector[i] * weight;
      }
      totalWeight += weight;
    }

    // Normalize
    return combinedVector.map(v => v / totalWeight);
  }

  async recommend(userId: string, count: number): Promise<Recommendation[]> {
    const userVector = await this.buildUserPreferenceVector(userId);

    // Find items most similar to user preference vector,
    // using approximate nearest neighbors for scale
    const candidates = await this.annIndex.query(userVector, count * 3);

    // Filter out already purchased
    const purchased = new Set(await this.getUserPurchases(userId));
    return candidates
      .filter(c => !purchased.has(c.itemId))
      .slice(0, count)
      .map(c => ({ itemId: c.itemId, score: c.similarity, source: 'content' }));
  }

  private interactionWeights = {
    'purchase': 5.0,
    'add_to_cart': 3.0,
    'product_view': 1.0,
    'search_click': 0.5,
    'impression': 0.1,
  };
}
```
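The recency weighting above uses exponential decay with a 7-day time constant (an interaction loses a factor of e in weight every 7 days). Isolated as a pure function for inspection (the constants mirror the snippet, not a known production value):

```typescript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Combined interaction weight: base weight for the interaction type,
// decayed exponentially by the interaction's age.
function interactionWeight(typeWeight: number, ageMs: number): number {
  return typeWeight * Math.exp(-ageMs / SEVEN_DAYS_MS);
}
```

A purchase made just now keeps its full weight of 5.0; the same purchase a week old is down to about 5/e ≈ 1.84, so recent behavior dominates the preference vector.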
```typescript
// Association Rule Mining
// "Frequently bought together: X + Y + Z"

class AssociationRuleMiner {
  // Apriori algorithm for finding frequent itemsets
  async findFrequentItemsets(
    minSupport: number,
    maxSize: number = 3
  ): Promise<FrequentItemset[]> {
    const transactions = await this.getRecentTransactions(90); // Last 90 days
    const totalTransactions = transactions.length;

    // Find frequent single items
    const itemCounts = new Map<string, number>();
    for (const tx of transactions) {
      for (const item of tx.items) {
        itemCounts.set(item.sku, (itemCounts.get(item.sku) || 0) + 1);
      }
    }

    let frequentItemsets: FrequentItemset[] = [];
    let currentLevel = Array.from(itemCounts.entries())
      .filter(([_, count]) => count / totalTransactions >= minSupport)
      .map(([sku, count]) => ({
        items: [sku],
        support: count / totalTransactions,
      }));
    frequentItemsets.push(...currentLevel);

    // Generate larger itemsets
    for (let size = 2; size <= maxSize; size++) {
      const candidates = this.generateCandidates(currentLevel, size);
      const nextLevel: FrequentItemset[] = [];

      for (const candidate of candidates) {
        const count = this.countSupport(candidate, transactions);
        const support = count / totalTransactions;
        if (support >= minSupport) {
          nextLevel.push({ items: candidate, support });
        }
      }

      if (nextLevel.length === 0) break;
      frequentItemsets.push(...nextLevel);
      currentLevel = nextLevel;
    }

    return frequentItemsets;
  }

  // Generate association rules from frequent itemsets
  async generateRules(minConfidence: number): Promise<AssociationRule[]> {
    const itemsets = await this.findFrequentItemsets(0.001, 3);
    const rules: AssociationRule[] = [];

    for (const itemset of itemsets) {
      if (itemset.items.length < 2) continue;

      // Generate all possible A -> B rules from itemset
      for (let i = 1; i < Math.pow(2, itemset.items.length) - 1; i++) {
        const antecedent: string[] = [];
        const consequent: string[] = [];
        for (let j = 0; j < itemset.items.length; j++) {
          if (i & (1 << j)) {
            antecedent.push(itemset.items[j]);
          } else {
            consequent.push(itemset.items[j]);
          }
        }

        // Calculate confidence: P(consequent | antecedent)
        const antecedentSupport = await this.getItemsetSupport(antecedent);
        const confidence = itemset.support / antecedentSupport;

        if (confidence >= minConfidence) {
          // Calculate lift: confidence / P(consequent)
          const consequentSupport = await this.getItemsetSupport(consequent);
          const lift = confidence / consequentSupport;

          rules.push({
            antecedent,
            consequent,
            support: itemset.support,
            confidence,
            lift,
          });
        }
      }
    }

    return rules.sort((a, b) => b.lift - a.lift);
  }

  // Real-time lookup
  async getFrequentlyBoughtTogether(itemId: string): Promise<Recommendation[]> {
    // Precomputed and cached
    const rules = await this.cache.get(`fbt:${itemId}`);
    if (rules) {
      return rules.map(r => ({
        itemId: r.consequent[0],
        score: r.lift * r.confidence,
        source: 'fbt',
        bundleDiscount: r.bundleDiscount,
      }));
    }
    return [];
  }
}
```

Modern recommendation systems increasingly use deep learning to capture complex patterns in user behavior. Neural networks can learn rich representations that go beyond simple similarity measures.
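Before moving on to the neural approaches, the three rule metrics above (support, confidence, lift) can be checked by hand on a toy basket set. A sketch, with made-up transactions:

```typescript
// Four toy transactions.
const transactions: string[][] = [
  ["bread", "butter"],
  ["bread", "butter", "milk"],
  ["bread"],
  ["milk"],
];

// support(S) = fraction of transactions containing every item in S
function support(items: string[]): number {
  const hits = transactions.filter(tx => items.every(i => tx.includes(i)));
  return hits.length / transactions.length;
}

// Rule A -> B: confidence = P(B | A), lift = confidence / P(B)
function confidence(a: string[], b: string[]): number {
  return support([...a, ...b]) / support(a);
}

function lift(a: string[], b: string[]): number {
  return confidence(a, b) / support(b);
}
```

Here {bread, butter} appears in 2 of 4 baskets (support 0.5); bread → butter has confidence 0.5 / 0.75 = 2/3, and since butter's base rate is 0.5 the lift is 4/3, meaning butter is a third more likely given bread than on its own.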
```typescript
// Two-Tower Architecture for Candidate Generation
// Separate encoder networks for users and items

interface TwoTowerModel {
  userTower: NeuralNetwork;  // Encodes user to embedding
  itemTower: NeuralNetwork;  // Encodes item to embedding
  embeddingDim: number;      // Shared embedding dimension
}

class NeuralRetrievalModel {
  constructor(private model: TwoTowerModel) {}

  // User tower takes user features and history
  async encodeUser(userId: string): Promise<Float32Array> {
    const userFeatures = await this.getUserFeatures(userId);
    // Features include:
    // - User demographics (age bucket, location, etc.)
    // - Aggregated interaction history
    // - Recent click sequence (processed by RNN/Transformer)
    // - Time features (day of week, time of day)
    return this.model.userTower.forward(userFeatures);
  }

  // Item tower takes item features
  async encodeItem(itemId: string): Promise<Float32Array> {
    const itemFeatures = await this.getItemFeatures(itemId);
    // Features include:
    // - Category embeddings
    // - Text embeddings (title, description)
    // - Image embeddings (from CNN)
    // - Price, rating, popularity signals
    return this.model.itemTower.forward(itemFeatures);
  }

  // Training: contrastive learning with in-batch negatives
  async train(batch: TrainingBatch): Promise<void> {
    const userEmbeddings = await Promise.all(
      batch.users.map(u => this.encodeUser(u.id))
    );
    const itemEmbeddings = await Promise.all(
      batch.positiveItems.map(i => this.encodeItem(i.id))
    );

    // Compute similarity matrix (all user-item pairs in batch)
    const similarities = this.computeSimilarityMatrix(
      userEmbeddings,
      itemEmbeddings
    );

    // Loss: maximize similarity for positive pairs,
    // minimize similarity for in-batch negatives
    const loss = this.inBatchSoftmaxLoss(similarities, batch.labels);
    await this.optimizer.step(loss);
  }

  // Serving: encode user once, retrieve from item index
  async recommend(userId: string, count: number): Promise<Recommendation[]> {
    const userEmbedding = await this.encodeUser(userId);

    // Query approximate nearest neighbor index of item embeddings
    const candidates = await this.itemAnnIndex.query(userEmbedding, count * 10);

    // Re-rank with full model if needed
    return this.rerank(userId, candidates, count);
  }
}

// Sequential Recommendation with Transformers
class SequentialRecommender {
  // Process user's click/purchase sequence
  async recommendFromSequence(
    sequence: UserEvent[],
    count: number
  ): Promise<Recommendation[]> {
    // Embed each item in sequence
    const itemEmbeddings = await Promise.all(
      sequence.map(e => this.itemEncoder.encode(e.itemId))
    );

    // Add positional encoding
    const positionedEmbeddings = itemEmbeddings.map((emb, i) =>
      this.addPositionalEncoding(emb, i)
    );

    // Self-attention over sequence
    const contextualizedEmbeddings = await this.transformer.forward(
      positionedEmbeddings
    );

    // Use last position as sequence representation
    const sequenceEmbedding =
      contextualizedEmbeddings[contextualizedEmbeddings.length - 1];

    // Predict next items
    return this.predictNextItems(sequenceEmbedding, count);
  }
}
```

Separating user and item encoders allows precomputing item embeddings offline. At serving time, we only need to compute the user embedding once, then perform a fast approximate nearest neighbor search over millions of pre-indexed items. This enables sub-100ms recommendations at scale.
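That fast path boils down to one dot product per candidate item. A brute-force stand-in for the ANN index, useful for understanding what the index accelerates (the real system would use a structure like HNSW rather than a full scan):

```typescript
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

// Brute-force nearest neighbors over precomputed item embeddings;
// an ANN index replaces this linear scan in production.
function topK(
  userEmb: number[],
  items: { id: string; emb: number[] }[],
  k: number
): string[] {
  return items
    .map(it => ({ id: it.id, score: dot(userEmb, it.emb) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(it => it.id);
}
```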
Serving recommendations at scale requires a specialized architecture that can handle billions of requests while maintaining low latency.
```
                     RECOMMENDATION SERVING PIPELINE

  Request (userId, context)
            │
            ▼
  STAGE 1: CANDIDATE GENERATION  (retrieve 1000s of candidates, <50ms)
    ├─ User-based CF ......... top 500
    ├─ Item-based CF ......... top 500
    ├─ Content-based ......... top 300
    └─ Trending / Popular .... top 200
            │
            ▼
  Deduplicate & merge (~1500 candidates)
            │
            ▼
  STAGE 2: SCORING / RANKING  (score all candidates, <30ms)
    Deep ranking model (neural network)
      Inputs:  user features (demographics, history, context)
               item features (embeddings, attributes)
               cross features (user-item interaction signals)
      Outputs: P(click), P(add_to_cart), P(purchase), E(revenue)
            │
            ▼
  Score & sort (~100 items)
            │
            ▼
  STAGE 3: POST-PROCESSING  (filter, reorder, diversify, <20ms)
    ├─ Filter ineligible items
    ├─ Diversify categories & brands
    ├─ Business rules (boosting)
    └─ Explanation generation ("Because you...")
            │
            ▼
  Final results (10-50 items)
```
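Stage 1's merge step deduplicates candidates arriving from several generators. A minimal version that keeps the highest score seen for each item (the production merger would likely also track which sources proposed it):

```typescript
interface Candidate {
  itemId: string;
  score: number;
  source: string;
}

// Merge candidate sets from multiple generators, keeping the
// best-scoring entry per item.
function mergeCandidates(sets: Candidate[][]): Candidate[] {
  const best = new Map<string, Candidate>();
  for (const set of sets) {
    for (const c of set) {
      const existing = best.get(c.itemId);
      if (!existing || c.score > existing.score) {
        best.set(c.itemId, c);
      }
    }
  }
  return Array.from(best.values());
}
```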
```typescript
class RecommendationService {
  async getRecommendations(request: RecommendationRequest): Promise<RecommendationResponse> {
    const startTime = performance.now();
    const { userId, context, slot, count } = request;

    // Load user features (cached, ~5ms)
    const userFeatures = await this.userFeatureStore.get(userId);

    // STAGE 1: Candidate Generation (parallel, ~40ms total)
    const candidatePromises = [
      this.userCF.recommend(userId, 500),
      this.itemCF.recommendFromRecent(userFeatures.recentItems, 500),
      this.contentBased.recommend(userId, 300),
      this.trending.getByCategory(userFeatures.preferredCategories, 200),
      this.neural.retrieve(userId, 500),
    ];
    const candidateSets = await Promise.all(candidatePromises);

    // Merge and deduplicate
    const candidates = this.mergeCandidates(candidateSets);

    // STAGE 2: Scoring (~25ms)
    const scored = await this.ranker.scoreAll(userId, candidates, context);

    // STAGE 3: Post-processing (~15ms)
    const processed = await this.postProcess(scored, {
      diversityConstraint: { maxPerCategory: 3, maxPerBrand: 2 },
      exclude: new Set(userFeatures.recentPurchases),
      boost: slot.boostCategories,
      includeExplanations: request.includeExplanations,
    });

    // Take top N
    const results = processed.slice(0, count);

    // Log for model training
    await this.logRequest({
      userId,
      slot,
      candidates: scored.map(c => c.itemId),
      displayed: results.map(r => r.itemId),
      latencyMs: performance.now() - startTime,
    });

    return {
      recommendations: results,
      metadata: {
        algorithmVersion: this.modelVersion,
        latencyMs: performance.now() - startTime,
        candidateCount: candidates.length,
      },
    };
  }

  private async postProcess(
    scored: ScoredCandidate[],
    options: PostProcessOptions
  ): Promise<Recommendation[]> {
    let results = scored
      // Filter excluded items
      .filter(c => !options.exclude.has(c.itemId))
      // Filter out-of-stock
      .filter(c => c.inStock);

    // Apply business boosting rules
    for (const category of options.boost) {
      results = results.map(r => ({
        ...r,
        score: r.category === category ? r.score * 1.2 : r.score,
      }));
    }

    // Diversify with greedy per-category/per-brand caps
    results = this.diversify(results, options.diversityConstraint);

    // Generate explanations
    if (options.includeExplanations) {
      results = results.map(r => ({
        ...r,
        explanation: this.generateExplanation(r),
      }));
    }

    return results;
  }

  private diversify(
    items: ScoredCandidate[],
    constraint: DiversityConstraint
  ): ScoredCandidate[] {
    const result: ScoredCandidate[] = [];
    const categoryCount = new Map<string, number>();
    const brandCount = new Map<string, number>();

    for (const item of items) {
      const catCount = categoryCount.get(item.category) || 0;
      const brdCount = brandCount.get(item.brand) || 0;
      if (catCount < constraint.maxPerCategory &&
          brdCount < constraint.maxPerBrand) {
        result.push(item);
        categoryCount.set(item.category, catCount + 1);
        brandCount.set(item.brand, brdCount + 1);
      }
    }

    return result;
  }
}
```

The cold start problem—recommending for new users or new items—is one of the hardest challenges in recommendation systems. Without historical data, collaborative filtering fails completely.
```typescript
// Multi-Armed Bandit for Exploration-Exploitation
class EpsilonGreedyRecommender {
  private epsilon = 0.1; // 10% exploration rate

  async recommend(
    userId: string,
    count: number
  ): Promise<Recommendation[]> {
    const recommendations: Recommendation[] = [];

    for (let i = 0; i < count; i++) {
      if (Math.random() < this.epsilon) {
        // EXPLORE: Show a random/new item to gather data
        const exploreItem = await this.selectExplorationCandidate();
        recommendations.push({ ...exploreItem, isExploration: true });
      } else {
        // EXPLOIT: Show highest predicted value item
        const exploitItem = await this.selectBestPredicted(userId);
        recommendations.push({ ...exploitItem, isExploration: false });
      }
    }

    return recommendations;
  }

  private async selectExplorationCandidate(): Promise<Recommendation> {
    // Thompson-style sampling: weight exploration toward items
    // whose expected reward is still uncertain
    const candidates = await this.getNewItemCandidates();

    // For each candidate, sample from the posterior distribution
    // of expected reward
    const sampledValues = candidates.map(c => ({
      item: c,
      sample: this.samplePosterior(c.conversionMean, c.conversionStdDev),
    }));

    // Select highest sampled value
    return sampledValues.sort((a, b) => b.sample - a.sample)[0].item;
  }

  private samplePosterior(mean: number, stdDev: number): number {
    // Gaussian approximation to the conversion-rate posterior
    // (a Beta posterior is the exact choice for binary rewards)
    return mean + stdDev * this.gaussianRandom();
  }
}

// Contextual Bandits for Cold Start
class ContextualBanditRecommender {
  // Use user context even without history
  async recommendForNewUser(context: UserContext): Promise<Recommendation[]> {
    // Extract signal from available context
    const features = this.extractContextFeatures(context);
    // {
    //   device: 'mobile',
    //   platform: 'ios',
    //   hour: 14,
    //   dayOfWeek: 'tuesday',
    //   location: 'US-CA',
    //   referrer: 'google',
    //   searchQuery: 'wireless earbuds',
    // }

    // Find similar contexts and their successful recommendations
    const similarContexts = await this.contextIndex.query(features, 100);

    // Aggregate successful items from similar contexts
    const itemScores = new Map<string, number>();
    for (const ctx of similarContexts) {
      const weight = ctx.similarity;
      for (const item of ctx.clickedItems) {
        itemScores.set(item, (itemScores.get(item) || 0) + weight);
      }
    }

    return Array.from(itemScores.entries())
      .sort((a, b) => b[1] - a[1])
      .slice(0, 20)
      .map(([itemId, score]) => ({ itemId, score, source: 'contextual_bandit' }));
  }
}
```

Supporting recommendation systems at Amazon's scale requires specialized ML infrastructure for feature storage, model training, and real-time serving.
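One note on the epsilon-greedy policy above: epsilon sets the exploration share directly, which is easy to verify if the random draw is injectable. A minimal sketch (function names are illustrative):

```typescript
// Epsilon-greedy arm choice with an injectable RNG, so the
// explore/exploit boundary can be tested deterministically.
function chooseArm(
  epsilon: number,
  explore: () => string,
  exploit: () => string,
  rng: () => number = Math.random
): { arm: string; isExploration: boolean } {
  if (rng() < epsilon) {
    return { arm: explore(), isExploration: true };
  }
  return { arm: exploit(), isExploration: false };
}
```

With epsilon = 0 the policy always exploits; with epsilon = 1 it always explores; at 0.1 roughly one slot in ten is spent gathering data on unproven items.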
```
                       FEATURE STORE ARCHITECTURE

  DATA SOURCES              FEATURE COMPUTATION          FEATURE SERVING

  Clickstream events ──┐
  Transaction data   ──┤──▶  Batch processing     ──▶  Offline store (S3/HDFS)
  Product catalog    ──┤     (Spark/Flink)                     │
  User profiles      ──┘       Aggregations:                   ▼
                               - user purchase history   Training pipeline
                               - item popularity
                               - user-item interactions

  User profiles      ──┐
  Real-time events   ──┴──▶  Stream processing     ──▶  Online store (Redis/DynamoDB)
                             (Kafka Streams)                   │
                               Real-time:                      ▼
                               - session context         Serving layer (<5ms)
                               - recent clicks
                               - cart state

  FEATURE TYPES:
  • USER FEATURES:    demographics, preferences, engagement scores, lifetime value
  • ITEM FEATURES:    embeddings, category, price, popularity, freshness
  • CROSS FEATURES:   user-item interaction counts, affinity scores
  • CONTEXT FEATURES: time, device, location, session depth
```

We've explored the complete architecture of an e-commerce recommendation system.
The key principles carry across every surface: combine complementary algorithms rather than betting on one; precompute expensive similarity and embedding work offline so serving stays a cheap lookup; and structure serving as candidate generation, ranking, and post-processing within a strict latency budget.
Congratulations! You've now completed the comprehensive deep-dive into Amazon e-commerce system design and learned how to architect each of its core subsystems.
These patterns form the foundation of every major e-commerce platform worldwide.
You now have a Principal Engineer's understanding of e-commerce architecture. The key insight across all components: e-commerce combines extreme scale, real-time requirements, and financial accountability. Every subsystem must be designed with these constraints in mind—eventual consistency where possible, strong consistency where required, and always with clear failure modes.