When you open Twitter (now X), your timeline loads in milliseconds—a personalized stream of tweets from the hundreds or thousands of accounts you follow. This seemingly simple interaction masks one of the most challenging distributed systems problems in modern software engineering.
Consider the scale:
- 200-250 million daily active users posting roughly 500 million tweets per day
- Over 100 billion timeline reads per day, a read:write ratio near 200:1
- A handful of accounts with more than 100 million followers each
- Traffic spikes of 3-5x the average during major events
Designing a system that can handle this scale while maintaining sub-second response times requires deep understanding of distributed systems, data modeling, caching strategies, and algorithmic trade-offs. In this module, we'll design Twitter's core features from scratch.
By the end of this page, you will master the requirements analysis for a Twitter-like system. You'll understand how to decompose the problem into functional and non-functional requirements, identify critical constraints, estimate scale, and establish clear success criteria—the foundation for all subsequent design decisions.
Before diving into technical requirements, we must deeply understand what Twitter/X actually does. This understanding guides every technical decision we make.
Twitter's Core Value Proposition:
Twitter is a real-time information network where users share short-form content with followers. Unlike Facebook's bidirectional friendships, Twitter uses a unidirectional follow model—you can follow anyone without their approval. This asymmetric relationship creates unique technical challenges:
| Characteristic | Twitter/X | Facebook | Instagram |
|---|---|---|---|
| Relationship Model | Unidirectional (follow) | Bidirectional (friend) | Unidirectional (follow) |
| Content Format | Text-first, 280 chars | Multi-format, unlimited | Image/video-first |
| Primary Use Case | Real-time information | Social connections | Visual storytelling |
| Feed Expectation | Chronological + ranked | Heavily ranked | Ranked by engagement |
| Follower Distribution | Extreme power law | Moderate variance | Strong power law |
| Time Sensitivity | Seconds matter | Hours acceptable | Hours acceptable |
The Unidirectional Follow Model Implications:
This seemingly simple design choice has profound technical implications:
- Anyone can follow anyone, so follower counts follow an extreme power law: most users have a few hundred followers, while a handful of celebrities exceed 100 million
- A single tweet from a highly followed account must be delivered to tens of millions of timelines
- The graph is asymmetric: the accounts you follow determine what you read, while the accounts that follow you determine where your tweets must go
When designing Twitter in an interview, always clarify the relationship model first. The unidirectional follow model (vs. bidirectional friendship) fundamentally changes your architecture. This question demonstrates that you understand the problem before jumping to solutions.
Functional requirements define what the system must do. For Twitter, we'll focus on three core features: posting tweets, following users, and viewing timelines. Each has nuanced sub-requirements that significantly impact design.
The tweet posting feature is deceptively complex. Let's decompose it (a validation sketch follows this list):
- Create a tweet of 1-280 characters (up to 4,000 for premium accounts)
- Reply to, quote, and retweet existing tweets
- Attach up to 4 media items, a poll, or a geo tag
- Delete your own tweets
- Track engagement metrics: likes, retweets, replies, and views
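To make these rules concrete, here is a minimal Python sketch of the validation implied by the API contract later on this page. The names (`TweetDraft`, `validate_tweet`) and the premium flag are illustrative assumptions, not Twitter's actual code:

```python
from dataclasses import dataclass, field
from typing import Optional

MAX_CHARS = 280           # standard accounts
MAX_CHARS_PREMIUM = 4000  # premium accounts, per the API contract below
MAX_MEDIA = 4             # at most 4 media attachments per tweet

@dataclass
class TweetDraft:
    content: str
    reply_to_id: Optional[str] = None     # set when replying
    quote_tweet_id: Optional[str] = None  # set when quote-tweeting
    media_ids: list[str] = field(default_factory=list)

def validate_tweet(draft: TweetDraft, premium: bool = False) -> list[str]:
    """Return a list of validation errors; an empty list means the draft is valid."""
    errors = []
    limit = MAX_CHARS_PREMIUM if premium else MAX_CHARS
    if not (1 <= len(draft.content) <= limit):
        errors.append(f"content must be 1-{limit} characters")
    if len(draft.media_ids) > MAX_MEDIA:
        errors.append(f"at most {MAX_MEDIA} media attachments allowed")
    return errors

# An empty tweet with five attachments fails both checks.
bad = TweetDraft(content="", media_ids=["m1", "m2", "m3", "m4", "m5"])
assert len(validate_tweet(bad)) == 2
```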
The follow system establishes the social graph that powers content distribution (see the sketch after this list):
- Follow any account instantly; protected accounts require approval before the follow takes effect
- Unfollow at any time
- List an account's followers and followees with cursor-based pagination
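A toy in-memory sketch makes the asymmetry visible: timeline assembly asks "whom do I follow?", while tweet delivery asks "who follows me?", so the graph is typically stored as two adjacency sets. A production system would shard the roughly 120 billion edges estimated below; this only shows the shape of the data:

```python
from collections import defaultdict

class FollowGraph:
    """Unidirectional follow model: no approval step, no symmetry."""

    def __init__(self):
        self.following = defaultdict(set)  # user_id -> ids of accounts they follow
        self.followers = defaultdict(set)  # user_id -> ids of accounts following them

    def follow(self, follower_id: str, followee_id: str) -> None:
        if follower_id == followee_id:
            raise ValueError("users cannot follow themselves")
        self.following[follower_id].add(followee_id)
        self.followers[followee_id].add(follower_id)

    def unfollow(self, follower_id: str, followee_id: str) -> None:
        self.following[follower_id].discard(followee_id)
        self.followers[followee_id].discard(follower_id)

graph = FollowGraph()
graph.follow("alice", "bob")   # alice follows bob; bob need not reciprocate
assert "bob" in graph.following["alice"]
assert "alice" in graph.followers["bob"]
```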
The timeline is the most complex and performance-critical feature:
- Home timeline: a merged feed of tweets from every followed account, in chronological or algorithmic order
- Profile timeline: a single user's tweets, optionally including replies and retweets
- Mentions timeline: tweets mentioning the authenticated user
- Efficient pagination via cursors, incremental polling via since_id, and gap detection when a client has missed tweets
In a 45-minute interview, you cannot design all these features. Confirm with your interviewer: 'Should I focus on the core tweet-follow-timeline loop, or dive deep into a specific area like real-time updates or media handling?' This shows prioritization skills.
Non-functional requirements (NFRs) define how well the system must perform. For Twitter, these NFRs drive architectural decisions more than functional requirements do.
| Metric | Estimate | Implications |
|---|---|---|
| Daily Active Users (DAU) | 200-250 million | Global distribution required |
| Monthly Active Users (MAU) | 350-400 million | Account storage at scale |
| Tweets per day | 500 million | ~6,000 writes/second average |
| Timeline reads per day | 100+ billion | ~1.2 million reads/second average |
| Read:Write ratio | ~200:1 | Heavily read-optimized architecture |
| Average followers per user | 200-400 | Moderate fan-out for most users |
| Max followers (celebrities) | 100+ million | Extreme fan-out edge cases |
| Peak traffic multiplier | 3-5x average | Spikes during major events |
Twitter is a critical communication platform, especially during emergencies and breaking news. Availability therefore takes priority: the core timeline path targets 99.99% uptime (about 4.3 minutes of downtime per month), and the system should degrade gracefully under extreme load rather than fail outright.
User experience depends critically on response times:
| Operation | P50 Target | P99 Target | Rationale |
|---|---|---|---|
| Home timeline load | < 100ms | < 300ms | Primary user experience |
| Post a tweet | < 200ms | < 500ms | Immediate feedback required |
| Follow/unfollow | < 100ms | < 300ms | Should feel instant |
| Search results | < 200ms | < 500ms | Users expect fast search |
| Real-time tweet delivery | < 5 seconds | < 30 seconds | "Real-time" expectation |
| Profile load | < 150ms | < 400ms | Frequent navigation target |
Twitter can tolerate some consistency trade-offs in favor of availability and performance: a timeline may briefly omit a just-posted tweet, and engagement counts such as likes and retweets can lag by a few seconds without hurting the experience.
Twitter explicitly chooses Availability and Partition Tolerance over perfect Consistency. A user seeing a slightly stale timeline is acceptable; the timeline being unavailable is not. This trade-off shapes the entire architecture.
Before designing any system, we must estimate the scale. These calculations inform technology choices, capacity planning, and architecture patterns.
```
// Daily Active Users (DAU)
DAU = 200 million users

// Tweet writes
Tweets per day = 500 million
Tweets per second (average) = 500M / 86,400 ≈ 5,800 TPS
Peak tweet write rate = 5,800 × 5 = ~30,000 TPS

// Timeline reads
Assume each DAU opens timeline 10 times/day
Timeline reads per day = 200M × 10 = 2 billion
Timeline reads per second (average) = 2B / 86,400 ≈ 23,000 RPS
Peak read rate = 23,000 × 5 = ~115,000 RPS

// Read:Write ratio
Ratio = 2 billion reads / 500 million writes ≈ 4:1 (timeline loads)

// But each timeline load fetches many tweets!
Avg tweets fetched per timeline = 50
Effective read ratio = 50 × 4 = 200:1 (tweet reads)

// Tweet storage
Average tweet size:
  - Tweet ID: 8 bytes
  - User ID: 8 bytes
  - Content: 280 chars × 4 bytes (UTF-8 worst case) = 1,120 bytes
  - Timestamp: 8 bytes
  - Metadata (likes, RTs, replies counts): 24 bytes
  - Media references: 50 bytes (URLs/IDs)
Total per tweet ≈ 1,200 bytes ≈ 1.2 KB

Tweets per year = 500M × 365 = 182.5 billion tweets
Tweet storage per year = 182.5B × 1.2 KB ≈ 219 TB

// Media storage (separate system)
Assume 20% of tweets have media
Media tweets per year = 182.5B × 0.2 = 36.5 billion
Average media size = 500 KB (images compressed)
Media storage per year = 36.5B × 500 KB ≈ 18.25 PB

// Follow graph storage
Total users = 400 million
Average follows per user = 300
Total follow edges = 400M × 300 = 120 billion
Edge size = 16 bytes (follower_id + followee_id)
Follow graph storage = 120B × 16 bytes ≈ 1.92 TB

// Outbound bandwidth (serving timelines)
Timeline reads per second (peak) = 115,000
Tweets per timeline = 50
Tweet size (with metadata) = 2 KB
Outbound per request = 50 × 2 KB = 100 KB

Peak outbound = 115,000 × 100 KB = 11.5 GB/s = 92 Gbps

// Inbound bandwidth (receiving tweets)
Tweet writes per second (peak) = 30,000
Average tweet size = 1.2 KB
Inbound text = 30,000 × 1.2 KB = 36 MB/s

// Media upload bandwidth (peak)
Media tweets per second = 30,000 × 0.2 = 6,000
Average media size = 500 KB
Media inbound = 6,000 × 500 KB = 3 GB/s = 24 Gbps

Total peak inbound ≈ 25 Gbps
```

In interviews, round aggressively and show your reasoning. '200 million × 10 × 50 = 100 billion tweet reads daily' is better than getting lost in precise calculations. The goal is order-of-magnitude accuracy to guide architectural decisions.
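To sanity-check the arithmetic, the same traffic estimates fit in a few lines of Python, with assumptions identical to the worksheet above:

```python
# Back-of-envelope traffic estimates (same assumptions as the worksheet above).
DAU = 200_000_000
TWEETS_PER_DAY = 500_000_000
TIMELINE_OPENS_PER_USER = 10
TWEETS_PER_TIMELINE = 50
PEAK_MULTIPLIER = 5
SECONDS_PER_DAY = 86_400

write_tps = TWEETS_PER_DAY / SECONDS_PER_DAY                        # ~5,800
timeline_reads_per_day = DAU * TIMELINE_OPENS_PER_USER              # 2 billion
read_rps = timeline_reads_per_day / SECONDS_PER_DAY                 # ~23,000
tweet_reads_per_day = timeline_reads_per_day * TWEETS_PER_TIMELINE  # 100 billion

print(f"avg write TPS: {write_tps:,.0f} (peak ~{write_tps * PEAK_MULTIPLIER:,.0f})")
print(f"avg timeline RPS: {read_rps:,.0f} (peak ~{read_rps * PEAK_MULTIPLIER:,.0f})")
print(f"effective read:write ratio: {tweet_reads_per_day / TWEETS_PER_DAY:.0f}:1")
```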
Every system has constraints that shape its design. For Twitter, several constraints are particularly critical:
The most challenging constraint in Twitter's design is the power-law distribution of followers:
| Account Type | Followers | When They Tweet | Fan-out Impact |
|---|---|---|---|
| Regular user | 200 | Deliver to 200 timelines | Trivial |
| Micro-influencer | 10,000 | Deliver to 10K timelines | Moderate load |
| Celebrity | 10 million | Deliver to 10M timelines | Significant spike |
| @BarackObama | 130+ million | Deliver to 130M timelines | Potential outage |
| @elonmusk | 170+ million | Deliver to 170M timelines | System-wide impact |
This creates a fundamental design challenge: should we pre-compute timelines (fanout on write) or compute them on request (fanout on read)? We'll explore both approaches in detail later.
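As a preview, here is a deliberately naive sketch contrasting the two strategies, reusing the `FollowGraph` shape from the earlier sketch. Real systems keep pre-computed timelines in a distributed cache, not Python dicts:

```python
from collections import defaultdict

timelines = defaultdict(list)         # user_id -> pre-computed list of tweet ids
tweets_by_author = defaultdict(list)  # author_id -> that author's tweet dicts

def fanout_on_write(tweet: dict, graph) -> None:
    """Push model: on post, append to every follower's timeline.
    Reads become trivial, but one celebrity tweet means ~170M timeline writes."""
    for follower_id in graph.followers[tweet["author_id"]]:
        timelines[follower_id].append(tweet["id"])

def fanout_on_read(user_id: str, graph, limit: int = 50) -> list:
    """Pull model: on request, merge recent tweets from every followee.
    Writes become trivial, but each timeline load touches hundreds of authors."""
    candidates = []
    for followee_id in graph.following[user_id]:
        candidates.extend(tweets_by_author[followee_id])
    candidates.sort(key=lambda t: t["created_at"], reverse=True)
    return candidates[:limit]
```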
Users expect tweets to appear in their timeline within seconds of posting. For breaking news and live events, this expectation is even higher. This constraint shapes the fan-out strategy, cache invalidation, and the delivery pipeline, all of which must support the SLO of tweets reaching timelines within 5 seconds.
When content goes viral, millions of users may request the same tweet simultaneously. Without careful caching, those requests converge on the same database rows and cache keys, creating hot spots that can overwhelm individual storage nodes.
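One common mitigation is a tiny in-process cache in front of the shared cache and database, so a viral tweet is served from local memory for a few seconds. A minimal TTL-cache sketch (the class name and TTL are illustrative):

```python
import time

class HotTweetCache:
    """In-process TTL cache that absorbs repeated reads of a viral tweet."""

    def __init__(self, ttl_seconds: float = 5.0):
        self.ttl = ttl_seconds
        self._store = {}  # tweet_id -> (expires_at, tweet)

    def get(self, tweet_id: str, loader):
        """Return the cached tweet, or call `loader` (e.g. a database
        fetch) once and serve the result locally until the TTL expires."""
        entry = self._store.get(tweet_id)
        now = time.monotonic()
        if entry and entry[0] > now:
            return entry[1]
        tweet = loader(tweet_id)
        self._store[tweet_id] = (now + self.ttl, tweet)
        return tweet

cache = HotTweetCache()
fetches = []
load = lambda tid: fetches.append(tid) or {"id": tid}
for _ in range(10_000):     # ten thousand reads of the same viral tweet...
    cache.get("viral123", load)
assert len(fetches) == 1    # ...hit the backing store exactly once
```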
A system that works for the average user but fails for edge cases is not production-ready. Your design must handle both @regularuser with 200 followers AND @elonmusk with 170 million. This duality defines Twitter's architecture.
Before designing internal systems, let's define the external API contract. This clarifies exactly what the system must support.
```
## POST /tweets
Create a new tweet.

Request:
{
  "content": "string (1-280 chars, 1-4000 for premium)",
  "reply_to_id": "string (optional, for replies)",
  "quote_tweet_id": "string (optional, for quote tweets)",
  "media_ids": ["string"] (optional, up to 4),
  "poll": { "options": ["string"], "duration_minutes": int } (optional),
  "geo": { "lat": float, "lon": float } (optional)
}

Response: 201 Created
{
  "id": "1234567890",
  "author_id": "user_abc",
  "content": "Hello, world!",
  "created_at": "2024-01-15T10:30:00Z",
  "metrics": { "likes": 0, "retweets": 0, "replies": 0, "views": 0 }
}

---

## DELETE /tweets/{tweet_id}
Delete a tweet (author only).

Response: 204 No Content

---

## POST /tweets/{tweet_id}/retweet
Retweet an existing tweet.

Response: 201 Created
{ "retweet_id": "9876543210" }

---

## POST /tweets/{tweet_id}/like
Like a tweet.

Response: 200 OK
{ "liked": true }
```
```
## POST /users/{user_id}/follow
Follow a user.

Response: 200 OK
{
  "following": true,
  "pending_approval": false  // true for protected accounts
}

---

## DELETE /users/{user_id}/follow
Unfollow a user.

Response: 200 OK
{ "following": false }

---

## GET /users/{user_id}/followers
Get paginated list of followers.

Query params:
  - cursor: string (pagination token)
  - limit: int (default 20, max 100)

Response:
{
  "users": [
    { "id": "user_xyz", "username": "johndoe", "display_name": "John Doe", ... }
  ],
  "next_cursor": "abc123",
  "has_more": true
}

---

## GET /users/{user_id}/following
Get paginated list of accounts user follows.
(Same response format as /followers)
```
```
## GET /timeline/home
Get authenticated user's home timeline.

Query params:
  - cursor: string (pagination token for older tweets)
  - since_id: string (fetch tweets newer than this ID)
  - limit: int (default 50, max 200)
  - include_replies: boolean (default true)
  - ranking: "chronological" | "algorithmic" (default "algorithmic")

Response:
{
  "tweets": [
    {
      "id": "1234567890",
      "author": { "id": "user_abc", "username": "alice", ... },
      "content": "Just shipped a new feature!",
      "created_at": "2024-01-15T10:30:00Z",
      "metrics": { "likes": 42, "retweets": 5, "replies": 3, "views": 1200 },
      "in_reply_to": null,
      "retweet_of": null,
      "media": []
    },
    ...
  ],
  "next_cursor": "eyJsYXN0X2lkIjogIjEyMzQ1Njc4OTAifQ==",
  "has_more": true,
  "gap_detected": false  // true if user missed tweets
}

---

## GET /users/{user_id}/tweets
Get a user's profile timeline.

Query params:
  - cursor, limit (same as home timeline)
  - include_replies: boolean
  - include_retweets: boolean

---

## GET /timeline/mentions
Get tweets mentioning the authenticated user.
```

Note the use of cursor-based pagination (not offset-based), which performs better at scale. Also note 'since_id' for efficient polling—clients can ask 'what's new since I last checked?' without fetching the entire timeline.
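The sample `next_cursor` above is simply base64-encoded JSON recording the last tweet ID the client saw. A sketch of how such opaque cursors might be minted and parsed (illustrative, not Twitter's actual scheme):

```python
import base64
import json

def encode_cursor(last_id: str) -> str:
    """Opaque pagination token: base64 of a small JSON payload."""
    return base64.b64encode(json.dumps({"last_id": last_id}).encode()).decode()

def decode_cursor(cursor: str) -> str:
    return json.loads(base64.b64decode(cursor))["last_id"]

# Round-trips to the exact sample cursor shown in the response above.
assert encode_cursor("1234567890") == "eyJsYXN0X2lkIjogIjEyMzQ1Njc4OTAifQ=="
assert decode_cursor("eyJsYXN0X2lkIjogIjEyMzQ1Njc4OTAifQ==") == "1234567890"
```

Because the cursor pins a position by ID rather than by row offset, new tweets arriving mid-pagination don't shift the window, which is why cursors outperform offsets at this scale.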
Before building, we must define how we'll measure success. Service Level Indicators (SLIs) and Service Level Objectives (SLOs) provide concrete targets.
| Service | SLI (What We Measure) | SLO (Target) | Error Budget |
|---|---|---|---|
| Timeline API | P99 latency | < 300ms | 0.1% of requests can exceed |
| Timeline API | Availability | 99.99% | 4.32 min downtime/month |
| Tweet Post API | P99 latency | < 500ms | 0.1% can exceed |
| Tweet Post API | Success rate | 99.9% | 0.1% can fail |
| Tweet Delivery | Time to timeline | < 5 seconds | 95th percentile |
| Follow API | P99 latency | < 300ms | 0.1% can exceed |
| Search | P99 latency | < 500ms | 0.5% can exceed |
Define these metrics before writing code. Instrument every API endpoint with latency histograms, error rates, and throughput counters. Without observability, you're flying blind—you won't know if your system meets requirements until users complain.
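As a minimal illustration of that advice, here is a decorator that records per-endpoint latencies and error counts; in production you would export these to a metrics system such as Prometheus rather than keep lists in memory:

```python
import time
from collections import defaultdict
from functools import wraps

latency_ms = defaultdict(list)  # endpoint -> observed latencies in ms
error_count = defaultdict(int)  # endpoint -> failed requests

def instrumented(endpoint: str):
    """Wrap a handler so every call records its latency and failures are counted."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                return fn(*args, **kwargs)
            except Exception:
                error_count[endpoint] += 1
                raise
            finally:
                latency_ms[endpoint].append((time.monotonic() - start) * 1000)
        return wrapper
    return decorator

def p99(endpoint: str) -> float:
    """Naive P99 over all samples so far; real systems use histograms."""
    samples = sorted(latency_ms[endpoint])
    return samples[int(len(samples) * 0.99)] if samples else 0.0

@instrumented("timeline.home")
def get_home_timeline(user_id: str) -> list:
    return []  # placeholder handler

get_home_timeline("alice")
print(f"timeline.home P99: {p99('timeline.home'):.2f} ms")
```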
We've completed a thorough requirements analysis for our Twitter-like system. Let's consolidate our understanding:
- Functional requirements: posting tweets, following users, and viewing timelines, each with nuanced sub-requirements
- Scale: roughly 200 million DAU, ~6,000 tweet writes/second and over a million tweet reads/second on average, a ~200:1 read:write ratio
- Latency: sub-300ms P99 timeline loads and tweet delivery within 5 seconds
- Constraints: the celebrity fan-out problem, real-time expectations, and viral read spikes
- Success criteria: explicit SLIs and SLOs with error budgets
What's Next:
With requirements established, we'll explore how to actually build this system. The next page dives into Feed Generation Approaches—the architectural patterns for assembling a user's timeline from hundreds of followed accounts efficiently. We'll examine pull-based, push-based, and hybrid approaches, each with distinct trade-offs.
The requirements we've defined here will directly inform which approach works best for different user segments. The celebrity problem, in particular, will drive us toward a hybrid architecture that treats different account types differently.
You've mastered the requirements analysis for a Twitter-like system. You understand the functional requirements (post, follow, timeline), non-functional requirements (scale, latency, availability), critical constraints (celebrity problem, real-time expectations), and success metrics. This foundation enables informed architectural decisions in the pages ahead.