Finding matching suggestions is only half the battle. A prefix like "how to" might match millions of suggestions in a large corpus. The difference between a good and great typeahead system lies in ranking—ensuring the most relevant suggestions appear in the precious few slots visible to users.
Ranking transforms raw matches into an ordered list where position matters: the #1 suggestion is clicked roughly 6-7x as often as #5. A ranking algorithm that surfaces the right suggestion in position #1 rather than #5 can therefore dramatically improve user satisfaction.
This page explores ranking from first principles through production-grade implementations used by tech giants. You'll understand both classical approaches and modern machine learning methods, along with the trade-offs that guide real-world decisions.
By the end of this page, you will understand frequency-based scoring, recency and trending signals, multi-factor ranking formulas, learning-to-rank (LTR) models, real-time ranking at scale, and the offline-online architecture that enables sophisticated ranking within latency constraints.
Not all suggestion positions are created equal. Eye-tracking studies and click data consistently show dramatic position bias:
| Position | Click Share | Relative to #1 |
|---|---|---|
| #1 | 35-45% | 1.00x |
| #2 | 20-25% | 0.50x |
| #3 | 12-15% | 0.35x |
| #4 | 8-10% | 0.25x |
| #5 | 5-7% | 0.15x |
| #6+ | < 5% | < 0.10x |
This means that moving the "right" suggestion from position #5 to #1 can increase its selection rate by 6-7x. For a typeahead system handling billions of queries, this translates to massive improvements in user satisfaction and engagement.
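The arithmetic behind that claim is simple. A small sketch, using the midpoints of the click-share ranges in the table above (the exact shares vary by product):

```typescript
// Approximate click share by position (midpoints of the ranges above)
const clickShare: Record<number, number> = {
  1: 0.40, 2: 0.225, 3: 0.135, 4: 0.09, 5: 0.06,
};

// Multiplier on selection rate from moving a suggestion between positions
function positionLift(from: number, to: number): number {
  return clickShare[to] / clickShare[from];
}

// Moving the right suggestion from #5 to #1:
const lift = positionLift(5, 1); // ≈ 6.7x
```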
A ranking system optimizes for multiple objectives that are often in tension.
Ranking decisions encode product values. Prioritizing popularity creates conformity; prioritizing diversity enables discovery. Boosting sponsored content trades user trust for revenue. These are business decisions as much as engineering decisions.
The simplest and often surprisingly effective approach: rank by how often each suggestion is searched. If "python tutorial" is searched 100,000 times/day and "python terrarium" is searched 100 times/day, the former should rank higher.
```typescript
interface Suggestion {
  text: string;
  searchCount: number; // Total historical searches
  clickCount: number;  // Times selected from typeahead
}

function scoreByFrequency(suggestion: Suggestion): number {
  return suggestion.searchCount;
}

function rankByFrequency(suggestions: Suggestion[]): Suggestion[] {
  return [...suggestions].sort((a, b) =>
    scoreByFrequency(b) - scoreByFrequency(a)
  );
}

// Simple, fast, interpretable
// But: old evergreen content dominates forever
```

Raw frequency has a problem: popular queries dominate disproportionately. A query with 10M searches shouldn't necessarily rank 1000x higher than one with 10K. Log transformation compresses the range:
```typescript
function scoreByLogFrequency(suggestion: Suggestion): number {
  // Add 1 to avoid log(0)
  return Math.log10(suggestion.searchCount + 1);
}

// Comparison:
// 10M searches: log10(10,000,001) ≈ 7.0
// 10K searches: log10(10,001)     ≈ 4.0
// 1K searches:  log10(1,001)      ≈ 3.0
// 100 searches: log10(101)        ≈ 2.0

// Now 10M scores only 1.75x as high as 10K, not 1000x
// This allows other signals to influence ranking
```

A suggestion that peaked in popularity 3 years ago shouldn't dominate forever. Apply time decay:
```typescript
interface SuggestionWithTime extends Suggestion {
  lastSearchedAt: Date;             // Most recent search
  dailyCounts: Map<string, number>; // date string → count
}

function scoreWithTimeDecay(
  suggestion: SuggestionWithTime,
  halfLifeDays: number = 30 // Score halves every 30 days
): number {
  const now = Date.now();
  const msPerDay = 24 * 60 * 60 * 1000;

  let decayedScore = 0;
  for (const [dateStr, count] of suggestion.dailyCounts) {
    const date = new Date(dateStr).getTime();
    const daysAgo = (now - date) / msPerDay;
    // Exponential decay: score × 0.5^(daysAgo / halfLife)
    const decayFactor = Math.pow(0.5, daysAgo / halfLifeDays);
    decayedScore += count * decayFactor;
  }

  return decayedScore;
}

// Effect:
// A query searched 1000 times yesterday:    ~1000
// A query searched 1000 times 30 days ago:  ~500
// A query searched 1000 times 60 days ago:  ~250
// This naturally promotes fresh content
```

The optimal half-life depends on your domain. News sites might use 1-7 days (very fresh). E-commerce might use 30-90 days (seasonal). Documentation search might use 180+ days (evergreen). Analyze your click data to find where recency correlates with satisfaction.
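To make the half-life choice concrete, a self-contained sketch (the decay factor alone, with hypothetical volumes) shows how much of a 30-day-old count survives under each domain's setting:

```typescript
// Fraction of a count that survives after `daysAgo` days for a given half-life
function decayFactor(daysAgo: number, halfLifeDays: number): number {
  return Math.pow(0.5, daysAgo / halfLifeDays);
}

// 1000 searches from 30 days ago under different domain half-lives:
const news = 1000 * decayFactor(30, 7);    // ≈ 51  (news: 7-day half-life)
const retail = 1000 * decayFactor(30, 30); // = 500 (e-commerce: 30-day)
const docs = 1000 * decayFactor(30, 180);  // ≈ 891 (documentation: 180-day)
```

With a 7-day half-life, a month-old spike is nearly erased; with 180 days, it still carries most of its weight.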
Real ranking uses multiple signals combined into a unified score. The challenge is weighting them appropriately.
| Signal | What It Measures | Pros | Cons |
|---|---|---|---|
| Search Volume | How often this suggestion is searched | Strong relevance signal | Favors head queries, slow to update |
| Click-Through Rate (CTR) | When shown, how often clicked | Direct user preference signal | Biased by position, cold start problem |
| Conversion Rate | Leads to desired action (purchase, signup) | Business-aligned | Sparse for long-tail queries |
| Recency | How recently popular | Captures trending topics | May surface noise/fads |
| Term Match Quality | How well query matches suggestion | Respects user input | Doesn't account for intent |
| Category Relevance | Matches user's context/category | Contextual relevance | Requires category detection |
The most common approach: assign weights to each signal and sum:
```typescript
interface RankingSignals {
  logFrequency: number;  // Log-transformed search count
  ctr: number;           // Click-through rate [0, 1]
  recencyScore: number;  // Time-decayed score
  matchQuality: number;  // Prefix match quality [0, 1]
  categoryMatch: number; // Does category match context? [0, 1]
}

interface RankingWeights {
  wFrequency: number;
  wCtr: number;
  wRecency: number;
  wMatch: number;
  wCategory: number;
}

function computeScore(signals: RankingSignals, weights: RankingWeights): number {
  // Normalize signals to similar scales first
  const normalizedFreq = Math.min(signals.logFrequency / 7, 1); // Max ~10M = 1.0
  const normalizedRecency = Math.min(signals.recencyScore / 10000, 1);

  return (
    weights.wFrequency * normalizedFreq +
    weights.wCtr * signals.ctr +
    weights.wRecency * normalizedRecency +
    weights.wMatch * signals.matchQuality +
    weights.wCategory * signals.categoryMatch
  );
}

// Example weights (tuned empirically):
const defaultWeights: RankingWeights = {
  wFrequency: 0.25,
  wCtr: 0.30,
  wRecency: 0.15,
  wMatch: 0.20,
  wCategory: 0.10,
};

// Note: Weights should sum to 1.0 for interpretability
// but any non-negative weights work mathematically
```

Alternatively, multiply factors for a more aggressive combination:
```typescript
function computeMultiplicativeScore(signals: RankingSignals): number {
  // Base score from frequency
  let score = 1 + signals.logFrequency;

  // Boost by CTR (a 50% CTR multiplies the score by 1.5)
  score *= (1 + signals.ctr);

  // Boost by recency (recent items can 2x)
  score *= (0.5 + signals.recencyScore / 10000);

  // Require match quality (poor match can halve score)
  score *= (0.5 + 0.5 * signals.matchQuality);

  // Category bonus (matching category adds 20%)
  score *= (1 + 0.2 * signals.categoryMatch);

  return score;
}

// Multiplicative is more aggressive:
// - A poor signal can dramatically hurt ranking
// - Encourages all signals to be good, not just one
// - Harder to tune than linear
```

Manual tuning works for simple formulas but becomes intractable as signals increase. Common approaches: (1) grid search with held-out validation data, (2) Bayesian optimization for efficient search, (3) learning-to-rank models that learn weights from data. We'll cover LTR in detail below.
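As a minimal sketch of option (1), a coarse grid search over two of the weights. The `evaluate` callback is a hypothetical stand-in for scoring a weight combination against held-out click data (e.g., NDCG@5 over logged queries):

```typescript
// Hypothetical evaluator: scores a weight combination on held-out validation
// data. In practice this would replay logged queries and measure NDCG or MRR.
type Evaluator = (weights: { wCtr: number; wFrequency: number }) => number;

function gridSearchWeights(evaluate: Evaluator): { wCtr: number; wFrequency: number } {
  let best = { wCtr: 0, wFrequency: 0 };
  let bestScore = -Infinity;

  // Sweep two weights in steps of 0.1; a full search would sweep all five,
  // with the remaining weight mass allocated to the other signals.
  for (let wCtr = 0; wCtr <= 1.0001; wCtr += 0.1) {
    for (let wFrequency = 0; wFrequency + wCtr <= 1.0001; wFrequency += 0.1) {
      const score = evaluate({ wCtr, wFrequency });
      if (score > bestScore) {
        bestScore = score;
        best = { wCtr, wFrequency };
      }
    }
  }
  return best;
}
```

Grid search is exhaustive and embarrassingly parallel, but cost grows exponentially with the number of weights, which is exactly why LTR takes over at higher signal counts.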
Users expect typeahead to reflect what's happening now. When a major event occurs, related suggestions should surface immediately, not after the standard batch processing window.
Detect topics gaining search volume faster than their historical baseline:
```typescript
interface SuggestionTimeSeries {
  text: string;
  hourlyVolume: number[]; // Last 168 hours (1 week)
}

function computeTrendScore(series: SuggestionTimeSeries): number {
  const recentHours = 4;
  const baselineHours = 168; // 1 week

  // Recent volume (last 4 hours)
  const recentVolume = series.hourlyVolume
    .slice(-recentHours)
    .reduce((a, b) => a + b, 0);

  // Baseline: average of same hours in previous weeks
  // (simplification: just use overall average here)
  const baselineVolume = series.hourlyVolume
    .slice(0, -recentHours)
    .reduce((a, b) => a + b, 0) / (baselineHours - recentHours) * recentHours;

  // Velocity: how much faster than baseline
  const velocity = baselineVolume > 0
    ? (recentVolume - baselineVolume) / baselineVolume
    : 0;

  return velocity; // 0.0 = normal, 1.0 = 2x baseline, 2.0 = 3x baseline
}

function isTrending(velocity: number, threshold: number = 0.5): boolean {
  return velocity > threshold;
}

// If velocity > 0.5 (50% above baseline), mark as trending
// Trending items get a ranking boost
```

Burst detection goes further: it flags sudden spikes that stand out from a suggestion's normal variance, which is more robust than simple velocity:
```typescript
interface BurstDetector {
  windowMinutes: number;
  sensitivitySigma: number; // Z-score threshold
}

function detectBurst(
  recentCount: number,
  historicalMean: number,
  historicalStdDev: number,
  config: BurstDetector
): { isBurst: boolean; severity: number } {
  // Z-score: how many standard deviations from mean
  const zScore = historicalStdDev > 0
    ? (recentCount - historicalMean) / historicalStdDev
    : 0;

  return {
    isBurst: zScore > config.sensitivitySigma,
    severity: Math.max(0, zScore - config.sensitivitySigma),
  };
}

// Configuration:
// sensitivitySigma = 2.0: Triggers at ~97.7th percentile
// sensitivitySigma = 3.0: Triggers at ~99.9th percentile (rare)

// Example: "olympics" normally has σ = 100 searches/hour, mean = 500
// During opening ceremony: 5000 searches/hour
// Z-score = (5000 - 500) / 100 = 45 → massive burst

// Boost formula for bursts:
function burstBoost(severity: number): number {
  // Logarithmic boost to prevent extreme domination
  return 1 + Math.log1p(severity) * 0.3; // severity=10 → 1.72x boost
}
```

Detecting trends in real time requires a streaming architecture:
```
┌─────────────┐    ┌──────────────┐    ┌───────────────┐
│ Search Logs │───▶│ Kafka Topic  │───▶│ Flink/Spark   │
│ (queries)   │    │ (raw events) │    │ Streaming     │
└─────────────┘    └──────────────┘    └───────┬───────┘
                                               │
        ┌──────────────────────────────────────┴──────┐
        │                                             │
        ▼                                             ▼
┌────────────────┐                        ┌────────────────┐
│ Sliding Window │                        │ Baseline Stats │
│ Aggregation    │                        │ (Redis TTL)    │
│ (last 5 min)   │                        │                │
└───────┬────────┘                        └───────┬────────┘
        │                                         │
        └───────────────────┬─────────────────────┘
                            │
                            ▼
                  ┌──────────────────┐
                  │  Burst Detector  │
                  │ (compare window  │
                  │  to baseline)    │
                  └────────┬─────────┘
                           │
                           ▼
                  ┌────────────────────┐
                  │  Trending Index    │
                  │ (Redis Sorted Set  │
                  │  with TTL)         │
                  └────────────────────┘

Latency: ~10-30 seconds from query to trending detection
```

Trending systems are targets for manipulation (coordinated searches to boost a topic). Mitigations: (1) require distributed geographic sources, (2) rate-limit contribution per user/IP, (3) require corroborating signals (social mentions, news articles), (4) human review for extreme bursts.
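A minimal sketch of mitigation (2), capping how much any single user can contribute to a suggestion's trend counter per window. This uses an in-memory map for illustration; a production system would typically use Redis counters with TTLs:

```typescript
// Caps each user's contribution to a suggestion's trend counter per window.
class TrendContributionLimiter {
  private counts = new Map<string, number>(); // "userId|suggestion" → count

  constructor(private maxPerUserPerWindow: number = 3) {}

  // Returns true if this search should count toward trending
  recordSearch(userId: string, suggestion: string): boolean {
    const key = `${userId}|${suggestion}`;
    const seen = this.counts.get(key) ?? 0;
    if (seen >= this.maxPerUserPerWindow) return false; // ignore the excess
    this.counts.set(key, seen + 1);
    return true;
  }

  // Call when the sliding window rolls over
  resetWindow(): void {
    this.counts.clear();
  }
}
```

With a cap of 3, a bot issuing thousands of identical searches moves the trend counter no more than three legitimate users would.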
Hand-tuning ranking formulas becomes intractable as signals multiply. Learning to Rank (LTR) uses machine learning to learn optimal ranking functions from data.
Given a query prefix, a set of candidate suggestions, and historical interaction data (impressions and clicks), learn a scoring function f(q, s) that orders suggestions by relevance.
| Approach | Description | Pros | Cons |
|---|---|---|---|
| Pointwise | Predict relevance score for each item independently | Simple, any regression model works | Ignores relative ordering |
| Pairwise | Predict which of two items should rank higher | Directly optimizes ordering | O(n²) pairs, can be slow |
| Listwise | Optimize the entire ranked list at once | Best aligns with metrics | More complex training |
```typescript
// Feature engineering: extract features for each (query, suggestion) pair
interface RankingFeatures {
  // Query-specific
  queryLength: number;
  queryWordCount: number;

  // Suggestion-specific
  suggestionLength: number;
  logSearchVolume: number;
  historicalCtr: number;
  recencyScore: number;

  // Query-suggestion interaction
  prefixMatchRatio: number; // How much of suggestion matched
  editDistance: number;     // If fuzzy matched
  wordOverlap: number;      // Jaccard similarity of words
  positionOfMatch: number;  // 0 = first word, 1 = second, etc.

  // Context
  isMobile: number;  // 0 or 1
  hourOfDay: number; // 0-23
  dayOfWeek: number; // 0-6
}

// Training data: collect (features, clicked) pairs
interface TrainingExample {
  features: RankingFeatures;
  clicked: boolean; // Did user click this suggestion?
}

// Train a gradient boosted tree to predict click probability
// (illustrative API — in practice the model is usually trained in Python
// with XGBoost/LightGBM/CatBoost and exported for serving)
import { XGBClassifier } from 'xgboost';

const model = new XGBClassifier({
  maxDepth: 6,
  nEstimators: 100,
  learningRate: 0.1,
  objective: 'binary:logistic',
});

model.fit(trainingFeatures, trainingLabels);

// At inference: compute features, predict scores, sort
function rankWithLTR(
  query: string,
  candidates: Suggestion[],
  context: QueryContext
): Suggestion[] {
  const scored = candidates.map(s => ({
    suggestion: s,
    score: model.predict(extractFeatures(query, s, context)),
  }));

  return scored
    .sort((a, b) => b.score - a.score)
    .map(x => x.suggestion);
}
```

LambdaMART combines gradient boosting with listwise optimization. It directly optimizes for ranking metrics like NDCG (Normalized Discounted Cumulative Gain):
```python
import lightgbm as lgb

# Prepare data in LightGBM format
# Groups indicate which items belong to the same query
train_data = lgb.Dataset(
    features,
    label=relevance_labels,
    group=query_groups  # [10, 8, 15, ...] items per query
)

# LambdaMART parameters
params = {
    'objective': 'lambdarank',
    'metric': 'ndcg',
    'ndcg_eval_at': [1, 3, 5, 10],  # Evaluate NDCG@1, @3, @5, @10
    'num_leaves': 64,
    'learning_rate': 0.05,
    'min_data_in_leaf': 20,
    'feature_fraction': 0.8,
}

model = lgb.train(
    params,
    train_data,
    num_boost_round=500,
    valid_sets=[validation_data],
    early_stopping_rounds=50
)

# Model directly optimizes for ranking quality metrics
```

LTR models are typically trained offline on historical data and deployed as scoring functions. Feature computation must be very fast (< 1ms per candidate). Tree-based models (XGBoost, LightGBM) are popular because they're fast at inference and handle mixed feature types well.
Production systems can't run expensive ML models on millions of candidates per query. The solution: a two-stage (or multi-stage) ranking architecture.
Quickly narrow from millions to hundreds/thousands using simple, fast scoring:
```typescript
interface Stage1Scorer {
  // Pre-computed scores stored in trie nodes
  staticScore: number; // Log frequency + base CTR
}

function stage1Retrieve(
  prefix: string,
  index: PrefixIndex,
  maxCandidates: number = 1000
): Suggestion[] {
  // Trie lookup + top-K by static score
  // All data is co-located in memory, no computation needed
  return index.getTopK(prefix, maxCandidates);
}

// Latency budget: 5-10ms
// Retrieves: 100-1000 candidates
// Uses: Pre-computed scores, no feature extraction
```

Apply the full LTR model to the smaller candidate set:
```typescript
interface Stage2Ranker {
  model: LTRModel;
}

async function stage2Rank(
  ranker: Stage2Ranker,
  query: string,
  candidates: Suggestion[],
  context: QueryContext,
  topK: number = 10
): Promise<Suggestion[]> {
  // Extract features for each candidate
  const features = candidates.map(c => extractFeatures(query, c, context));

  // Batch predict with ML model
  const scores = await ranker.model.batchPredict(features);

  // Combine with candidates and sort
  const scored = candidates.map((c, i) => ({
    suggestion: c,
    score: scores[i],
  }));

  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(x => x.suggestion);
}

// Latency budget: 10-20ms
// Processes: 100-1000 candidates
// Returns: Top 10
```

```
User types "python"
        │
        ▼ (< 1ms)
┌─────────────────────────────────────────┐
│ Stage 0: Query Processing               │
│ - Normalization, tokenization           │
│ - Feature extraction (query-level)      │
└─────────────────┬───────────────────────┘
                  │
                  ▼ (5-10ms)
┌─────────────────────────────────────────┐
│ Stage 1: Candidate Retrieval            │
│ - Trie prefix lookup                    │
│ - Return top 500 by static score        │
│   Candidates: "python tutorial" (5.2),  │
│   "python programming" (5.1),           │
│   "python for beginners" (4.9)          │
│   ... (500 total)                       │
└─────────────────┬───────────────────────┘
                  │
                  ▼ (10-20ms)
┌─────────────────────────────────────────┐
│ Stage 2: ML Ranking                     │
│ - Extract features for 500 candidates   │
│ - Apply LambdaMART model                │
│ - Score and sort                        │
└─────────────────┬───────────────────────┘
                  │
                  ▼ (< 1ms)
┌─────────────────────────────────────────┐
│ Stage 3: Post-Processing                │
│ - Diversity re-ranking                  │
│ - Business rules (demote, boost)        │
│ - Content filtering                     │
│   Return top 10 to user                 │
└─────────────────────────────────────────┘

Total latency: 20-35ms end-to-end
```

Large-scale systems may use more stages. For example: (1) coarse retrieval (10K candidates), (2) lightweight model (1K candidates), (3) full model (100 candidates), (4) post-processing (10 results). Each stage trades computation for accuracy.
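The multi-stage idea generalizes to a simple cascade abstraction: each stage is a (scorer, cutoff) pair applied to the survivors of the previous stage. A sketch, with the stage scorers left as assumptions:

```typescript
interface CascadeStage<T> {
  name: string;
  score: (item: T) => number; // cheap scorers early, expensive ones late
  keep: number;               // survivors passed to the next stage
}

// Run items through each stage in order, keeping the top `keep` per stage
function runCascade<T>(items: T[], stages: CascadeStage<T>[]): T[] {
  let survivors = items;
  for (const stage of stages) {
    survivors = survivors
      .map(item => ({ item, s: stage.score(item) }))
      .sort((a, b) => b.s - a.s)
      .slice(0, stage.keep)
      .map(x => x.item);
  }
  return survivors;
}
```

The key invariant: each stage's scorer only needs to be accurate enough to avoid dropping items the next (better) stage would have ranked highly.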
After ML ranking, additional transformations ensure the final list meets business requirements.
Avoid showing variations of the same suggestion. Use MMR (Maximal Marginal Relevance):
```typescript
function diversityRerank(
  suggestions: ScoredSuggestion[],
  targetK: number,
  lambda: number = 0.7 // Balance relevance vs diversity
): ScoredSuggestion[] {
  const selected: ScoredSuggestion[] = [];
  const remaining = [...suggestions];

  // Greedily select, maximizing MMR
  while (selected.length < targetK && remaining.length > 0) {
    let bestIdx = -1;
    let bestMMR = -Infinity;

    for (let i = 0; i < remaining.length; i++) {
      const candidate = remaining[i];

      // Relevance: original ML score
      const relevance = candidate.score;

      // Similarity to already selected
      const maxSimilarity = selected.length === 0
        ? 0
        : Math.max(...selected.map(s => similarity(candidate, s)));

      // MMR formula
      const mmr = lambda * relevance - (1 - lambda) * maxSimilarity;

      if (mmr > bestMMR) {
        bestMMR = mmr;
        bestIdx = i;
      }
    }

    if (bestIdx !== -1) {
      selected.push(remaining[bestIdx]);
      remaining.splice(bestIdx, 1);
    }
  }

  return selected;
}

function similarity(a: ScoredSuggestion, b: ScoredSuggestion): number {
  // Example: Jaccard similarity of words
  const wordsA = new Set(a.text.toLowerCase().split(/\s+/));
  const wordsB = new Set(b.text.toLowerCase().split(/\s+/));
  const intersection = [...wordsA].filter(w => wordsB.has(w)).length;
  const union = new Set([...wordsA, ...wordsB]).size;
  return intersection / union;
}

// Result: Instead of 5 variations of "python tutorial",
// show diverse suggestions covering different intents
```

After diversity re-ranking, explicit business rules boost, demote, remove, or pin suggestions:

```typescript
interface BusinessRule {
  name: string;
  condition: (s: Suggestion, context: QueryContext) => boolean;
  action: 'boost' | 'demote' | 'remove' | 'pin';
  magnitude: number; // For boost/demote: multiplier; for pin: position
}

const rules: BusinessRule[] = [
  {
    name: 'boost_sponsored',
    condition: (s) => s.metadata?.sponsored === true,
    action: 'boost',
    magnitude: 1.5, // 50% score increase
  },
  {
    name: 'demote_competitor',
    condition: (s) => s.text.toLowerCase().includes('competitor_brand'),
    action: 'demote',
    magnitude: 0.5, // 50% score reduction
  },
  {
    name: 'remove_blocked',
    condition: (s) => blocklist.has(s.text.toLowerCase()),
    action: 'remove',
    magnitude: 0,
  },
  {
    name: 'pin_current_promo',
    condition: (s, ctx) =>
      s.metadata?.promoId === currentPromoId && ctx.platform === 'web',
    action: 'pin',
    magnitude: 1, // Always position #1
  },
];

function applyBusinessRules(
  suggestions: ScoredSuggestion[],
  context: QueryContext
): ScoredSuggestion[] {
  let result = suggestions.filter(s => {
    // Apply remove rules
    return !rules
      .filter(r => r.action === 'remove')
      .some(r => r.condition(s, context));
  });

  // Apply boost/demote
  result = result.map(s => {
    let score = s.score;
    for (const rule of rules) {
      if ((rule.action === 'boost' || rule.action === 'demote') &&
          rule.condition(s, context)) {
        score *= rule.magnitude;
      }
    }
    return { ...s, score };
  });

  // Re-sort
  result.sort((a, b) => b.score - a.score);

  // Apply pins
  for (const rule of rules.filter(r => r.action === 'pin')) {
    const pinned = result.filter(s => rule.condition(s, context));
    const others = result.filter(s => !rule.condition(s, context));
    result = [...pinned, ...others]; // Pinned first
  }

  return result;
}
```

Business rules can create unexpected behavior when combined. Document rule priorities, test extensively, and monitor for anomalies. A rule that seemed harmless can tank suggestion quality if it interacts poorly with others.
Ranking is where typeahead quality is won or lost. Let's consolidate:
✓ Baseline metrics established (MRR, NDCG, CTR)
✓ A/B testing infrastructure for ranking changes
✓ Feature importance analysis for LTR models
✓ Cold start handling for new suggestions
✓ Latency monitoring for ranking pipeline
✓ Diversity metrics to prevent monotony
✓ Business rule audit trail
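The first checklist item can start as small as a Mean Reciprocal Rank computation over logged sessions. A sketch, where `clickedPositions` holds the 1-based rank of the clicked suggestion per session, or null if nothing was clicked:

```typescript
// Mean Reciprocal Rank over logged typeahead sessions.
// Each entry: 1-based rank of the clicked suggestion, or null if no click.
function meanReciprocalRank(clickedPositions: (number | null)[]): number {
  if (clickedPositions.length === 0) return 0;
  const total = clickedPositions.reduce<number>(
    (sum, pos) => sum + (pos === null ? 0 : 1 / pos),
    0
  );
  return total / clickedPositions.length;
}
```

MRR rewards getting the clicked suggestion near the top (a click at #1 scores 1.0, at #5 only 0.2), which makes it a natural headline metric for position-sensitive typeahead.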
What's next:
Ranking quality improves dramatically with Personalization—tailoring suggestions to individual users based on their history, preferences, and context. The next page explores how to deliver personalized typeahead at scale while respecting privacy constraints.
You now understand the full ranking stack for production typeahead: from simple frequency scoring through ML-based learning-to-rank, two-stage architecture, and business rule application. This knowledge enables you to build ranking systems that feel magical to users.