Loading system design...
Design a search autocomplete (typeahead) system like Google Search Suggestions. As the user types each character into the search box, the system returns the top K most popular query suggestions matching the typed prefix — all within 100ms for a seamless, lag-free typing experience.
| Metric | Value |
|---|---|
| Daily active users (DAU) | 300 million |
| Searches per day | 10 billion |
| Average characters typed per search | 15 (each triggers an autocomplete request) |
| Autocomplete queries per day | 10B × 15 × 0.5 (debounce) ≈ 75 billion |
| Peak QPS | ~2 million autocomplete lookups / sec |
| Unique searchable queries | 100 million (after deduplication and filtering) |
| Trie node count (compressed) | ~500 million nodes |
| Trie memory (in-memory) | ~10–20 GB per replica |
| Trie server replicas | 20–50 per region |
| K (suggestions per request) | 10 |
As the user types each character in the search box, return the top K (e.g., K=10) most popular search query suggestions that match the typed prefix
Suggestions must be returned within 100ms of each keystroke for a fluid, lag-free typing experience
Rank suggestions by popularity (search frequency); the most frequently searched queries appear first
Update suggestion rankings based on new search data — trending queries should surface quickly (within minutes to hours)
Support multi-language autocomplete (English, Chinese, Japanese, Hindi, etc.) with proper Unicode handling
Personalised suggestions: boost results based on the user's own search history in addition to global popularity
Filter out offensive, NSFW, or policy-violating suggestions before displaying to the user
Handle typos and spelling corrections: 'iphne' should still suggest 'iphone 15 pro max'
Support trending/breaking queries: rapidly surging searches (e.g., breaking news events) surface in autocomplete within minutes
Work offline/client-side: cache the most common prefixes locally on the client for instant zero-latency suggestions
Non-functional requirements define the system qualities critical to your users. Frame them as 'The system should be able to...' statements. These will guide your deep dives later.
Think about CAP theorem trade-offs, scalability limits, latency targets, durability guarantees, security requirements, fault tolerance, and compliance needs.
Frame NFRs for this specific system. 'Low latency search under 100ms' is far more valuable than just 'low latency'.
Add concrete numbers: 'P99 response time < 500ms', '99.9% availability', '10M DAU'. This drives architectural decisions.
Choose the 3-5 most critical NFRs. Every system should be 'scalable', but what makes THIS system's scaling uniquely challenging?