We've designed individual components: timeline generation, tweet storage, trending topics. Now we must address the meta-challenge: How do all these components work together at global scale?
Twitter's operating reality: roughly 200 million daily active users producing around 500 million tweets per day, all expecting low-latency access from anywhere in the world.
This final page synthesizes everything into a cohesive, production-ready system architecture. We'll cover global distribution, caching strategies, capacity planning, graceful degradation, and operational patterns that enable Twitter-scale systems to run reliably.
By the end of this page, you'll understand multi-region architecture patterns, edge caching strategies, capacity planning methodologies, graceful degradation techniques, and operational best practices for systems serving hundreds of millions of users.
A single data center cannot serve a global user base with acceptable latency. Multi-region deployment is essential.
| Region | Location | Primary Role | User Base |
|---|---|---|---|
| US-East | Virginia | Primary write region for Americas | North/South America |
| US-West | Oregon | Read replicas, failover for US-East | West Coast Americas |
| Europe | Ireland/Frankfurt | Primary write region for EMEA | Europe, Middle East, Africa |
| Asia-Pacific | Singapore/Tokyo | Primary write region for APAC | Asia-Pacific, Oceania |
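As a rough sketch of how request routing might use this layout, the snippet below prefers the viewer's closest region and walks down a fallback list when that region is unhealthy. The proximity orderings and the `is_healthy` check are illustrative assumptions, not Twitter's actual routing logic.

```python
# Hypothetical sketch: latency-aware region selection with failover.
# The proximity orderings and the health checker are assumptions for illustration.
REGIONS_BY_PROXIMITY = {
    "americas": ["us-east", "us-west", "europe", "asia-pacific"],
    "emea": ["europe", "us-east", "asia-pacific", "us-west"],
    "apac": ["asia-pacific", "us-west", "europe", "us-east"],
}

def pick_serving_region(user_geo: str, is_healthy) -> str:
    """Prefer the closest region; fall back down the list if it is unhealthy."""
    for region in REGIONS_BY_PROXIMITY.get(user_geo, ["us-east"]):
        if is_healthy(region):
            return region
    return "us-east"  # Last resort: primary Americas region
```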
Critical question: Should each region have all data, or should data be partitioned by user location?
Twitter uses a hybrid model optimized for its access patterns:
"""Twitter's Data Distribution Strategy Different data types have different distribution patternsbased on their access characteristics.""" class DataDistributionStrategy: """ Data Distribution by Type ========================= 1. USER DATA (profile, settings, auth) - Primary: User's "home" region (based on signup location) - Replicated: All regions (read-only copies) - Rationale: Users mostly access their own profile; others need fast reads globally 2. TWEETS - Primary: Author's home region - Replicated: All regions asynchronously - Rationale: Tweets are immutable; eventual consistency OK Most reads are for recent tweets of followed users 3. TIMELINE CACHES - Primary: User's current region (not home region) - Not replicated: Generated locally from replicated data - Rationale: Timelines are computed, not stored Each region builds from its local tweet copy 4. SOCIAL GRAPH (follows) - Primary: User's home region - Replicated: All regions frequently - Rationale: Read-heavy, rarely changes Needs to be consistent for fanout 5. ENGAGEMENT COUNTERS (likes, retweets) - Primary: Sharded globally (not regional) - Updates: Async aggregation from all regions - Rationale: Write-heavy, eventual consistency OK Exact counts not critical """ def route_write(self, data_type: str, user_id: str) -> str: """Determine write region for a data type.""" home_region = self.get_user_home_region(user_id) if data_type in ["user", "tweet", "follow"]: return home_region elif data_type == "timeline_cache": return self.get_current_region() # Where user is now elif data_type == "counter": return self.get_counter_shard_region(user_id) return home_region def route_read(self, data_type: str, user_id: str) -> str: """Determine read region for a data type.""" current_region = self.get_current_region() # All reads prefer local region (replicas available) return current_region def should_wait_for_replication( self, data_type: str, operation: str ) -> bool: """ Should we wait for cross-region replication? Generally: Write locally, read eventually consistent. Exception: User's own writes should be visible immediately. """ if operation == "read_own_write": # User reading data they just wrote # Route to primary or use read-your-writes consistency return True return FalseEven in an eventually consistent system, users expect to see their own changes immediately. When Alice posts a tweet, she should see it in her timeline instantly—even if replication to other regions takes seconds. Achieve this by routing her reads to the same region that processed her write, or by reading from a cache populated at write time.
Caching is the single most important technique for achieving Twitter-scale performance. Every layer of the stack uses caching to reduce latency and load on downstream systems.
| Layer | Technology | TTL | Data Cached | Hit Rate Target |
|---|---|---|---|---|
| CDN Edge | Cloudflare/Fastly | 5-60s | Static assets, media, API responses | 95% |
| API Gateway | Nginx cache | 1-10s | Trending topics, public profiles | 80% |
| Application | Local in-memory | Request-scoped | Frequently accessed objects | 90% |
| Distributed | Redis Cluster | Minutes-hours | Timelines, tweet objects | 95% |
| Local Replicas | SSD-backed cache | Hours-days | User data, less hot content | 80% |
"""Multi-Layer Cache Architecture Each layer handles different access patterns and providesdifferent consistency guarantees.""" class CacheHierarchy: def __init__( self, local_cache, # In-process (Guava, Caffeine) redis_cache, # Distributed Redis cluster cdn_cache # Edge CDN cache ): self.local = local_cache self.redis = redis_cache self.cdn = cdn_cache async def get_timeline( self, user_id: str, viewer_id: str ) -> List[Tweet]: """ Get timeline with multi-layer cache lookup. Cache key strategy: - Private timeline: "tl:{user_id}:{viewer_id}" (viewer-specific) - Public profile: "tl:{user_id}:public" (shared across viewers) """ is_own_profile = (user_id == viewer_id) if is_own_profile: # Own timeline: check Redis directly (personalized) return await self._get_private_timeline(user_id) else: # Viewing other's profile: use multi-layer cache return await self._get_public_timeline(user_id, viewer_id) async def _get_private_timeline(self, user_id: str) -> List[Tweet]: """Private timeline - user's own home timeline.""" cache_key = f"home_tl:{user_id}" # Skip local cache (too personalized to share) # Check Redis cached = await self.redis.get(cache_key) if cached: return cached # Build timeline timeline = await self._build_home_timeline(user_id) # Cache in Redis (short TTL for freshness) await self.redis.set(cache_key, timeline, ex=60) return timeline async def _get_public_timeline( self, user_id: str, viewer_id: str ) -> List[Tweet]: """Public profile timeline - can be heavily cached.""" cache_key = f"profile_tl:{user_id}" # Check local cache first (hot profiles) cached = self.local.get(cache_key) if cached: return self._filter_for_viewer(cached, viewer_id) # Check Redis cached = await self.redis.get(cache_key) if cached: self.local.set(cache_key, cached, ttl=30) # Cache locally return self._filter_for_viewer(cached, viewer_id) # Build timeline timeline = await self._build_profile_timeline(user_id) # Cache at multiple layers await self.redis.set(cache_key, timeline, ex=300) self.local.set(cache_key, timeline, ttl=30) return self._filter_for_viewer(timeline, viewer_id) def _filter_for_viewer( self, tweets: List[Tweet], viewer_id: str ) -> List[Tweet]: """Apply viewer-specific filtering (blocks, mutes).""" blocked_users = self.get_blocked_muted(viewer_id) return [t for t in tweets if t.author_id not in blocked_users]12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182
"""Cache Invalidation Patterns The famous quote: "There are only two hard things in computer science:cache invalidation and naming things." Here's how we handle it.""" class CacheInvalidator: """ Invalidation Strategies by Data Type ===================================== 1. TIME-BASED (TTL) - Data naturally expires - Simple, no coordination needed - Use for: Trending topics, aggregate counts, public timelines 2. EVENT-BASED (Explicit Invalidation) - Writer invalidates cache after update - Requires coordination between services - Use for: Home timelines (on new tweet), user profiles (on edit) 3. VERSION-BASED (Cache Keys with Versions) - Include version number in cache key - Old keys naturally expire, new key used immediately - Use for: Data that changes infrequently but needs immediate update 4. BACKGROUND REFRESH (Stale-While-Revalidate) - Serve stale data while refreshing in background - Never blocks on cache miss - Use for: Non-critical data where freshness is acceptable """ async def invalidate_user_timeline(self, user_id: str): """ Invalidate a user's home timeline cache. Called when: new tweet from followed account, new follow, unfollow. """ cache_key = f"home_tl:{user_id}" # Option 1: Simple delete (next read will rebuild) await self.redis.delete(cache_key) # Option 2: Refresh in background (non-blocking) # await self.queue_timeline_refresh(user_id) async def invalidate_tweet(self, tweet_id: str, author_id: str): """ Invalidate caches when a tweet is deleted or modified. Challenge: Tweet might be cached in: - Author's profile timeline - All followers' home timelines - Quote tweet caches - Reply thread caches """ # Invalidate author's profile await self.redis.delete(f"profile_tl:{author_id}") # For followers' timelines: don't invalidate immediately # Instead, filter out deleted tweets at read time # (Eventually, TTL will expire old timeline caches) # Mark tweet as deleted in tweet cache await self.redis.hset(f"tweet:{tweet_id}", "deleted", "true") async def invalidate_all_user_caches(self, user_id: str): """ Nuclear option: invalidate all caches for a user. Used for: Account suspension, privacy changes, data deletion requests. """ patterns = [ f"home_tl:{user_id}", f"profile_tl:{user_id}", f"user:{user_id}:*", f"tweets:{user_id}:*", ] for pattern in patterns: keys = await self.redis.keys(pattern) if keys: await self.redis.delete(*keys)For very expensive invalidations (like removing a deleted tweet from millions of follower timelines), prefer lazy invalidation: filter out deleted content at read time, let TTL handle cache expiry. It's eventually consistent but much cheaper than eager invalidation.
Capacity planning ensures the system can handle expected load with headroom for growth and spikes.
"""Capacity Planning Framework Systematic approach to sizing infrastructure for Twitter-scale systems.""" class CapacityPlanner: def __init__( self, dau: int = 200_000_000, # Daily Active Users peak_multiplier: float = 3.0, # Peak vs average ratio growth_rate: float = 0.25, # 25% annual growth headroom: float = 0.30 # 30% safety margin ): self.dau = dau self.peak_multiplier = peak_multiplier self.growth_rate = growth_rate self.headroom = headroom def calculate_timeline_api_capacity(self) -> dict: """ Calculate required capacity for Timeline API servers. """ # Average requests per user per day timeline_loads_per_user = 20 # Total daily requests daily_requests = self.dau * timeline_loads_per_user # Average requests per second avg_rps = daily_requests / 86400 # Peak requests per second peak_rps = avg_rps * self.peak_multiplier # With headroom target_capacity = peak_rps * (1 + self.headroom) # Server sizing (assume 2000 RPS per server) rps_per_server = 2000 servers_needed = int(target_capacity / rps_per_server) + 1 return { "avg_rps": int(avg_rps), "peak_rps": int(peak_rps), "target_capacity": int(target_capacity), "servers_needed": servers_needed, "per_region": servers_needed // 4, # 4 regions } def calculate_redis_capacity(self) -> dict: """ Calculate required Redis cache capacity. """ # Timeline cache per user timeline_size_bytes = 800 * 100 # 800 tweet IDs × 100 bytes each # Active users have cached timelines active_fraction = 0.7 # 70% of DAU are active enough to cache cached_timelines = int(self.dau * active_fraction) timeline_storage = cached_timelines * timeline_size_bytes # Tweet cache (recent tweets) tweets_per_day = 500_000_000 tweet_cache_days = 7 tweet_size = 1000 # 1KB per tweet tweet_storage = tweets_per_day * tweet_cache_days * tweet_size # Hot user cache hot_users = 10_000_000 # 10M "hot" users user_size = 500 # 500 bytes per user object user_storage = hot_users * user_size total_bytes = timeline_storage + tweet_storage + user_storage total_gb = total_bytes / (1024**3) # With replication replication_factor = 3 required_gb = total_gb * replication_factor # Redis cluster sizing (assume 256GB per node max) gb_per_node = 200 # Leave headroom nodes_needed = int(required_gb / gb_per_node) + 1 return { "timeline_cache_gb": timeline_storage / (1024**3), "tweet_cache_gb": tweet_storage / (1024**3), "user_cache_gb": user_storage / (1024**3), "total_gb": total_gb, "with_replication_gb": required_gb, "redis_nodes_needed": nodes_needed } def calculate_database_capacity(self) -> dict: """ Calculate required database storage. """ tweets_per_day = 500_000_000 tweet_size_bytes = 1000 # 3 years of tweets years = 3 total_tweets = tweets_per_day * 365 * years tweet_storage = total_tweets * tweet_size_bytes # Index overhead (~30%) index_overhead = 1.3 total_with_indexes = tweet_storage * index_overhead # Replication factor replication = 3 total_replicated = total_with_indexes * replication total_pb = total_replicated / (1024**5) return { "total_tweets": total_tweets, "raw_storage_tb": tweet_storage / (1024**4), "with_indexes_tb": total_with_indexes / (1024**4), "with_replication_pb": total_pb, }Static capacity planning isn't enough—systems must scale dynamically:
Static capacity planning isn't enough—systems must scale dynamically:
```yaml
# Kubernetes Horizontal Pod Autoscaler Configuration
# For Timeline API Service

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: timeline-api-hpa
  namespace: twitter-core
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: timeline-api

  # Scaling bounds
  minReplicas: 50    # Never go below 50 pods per region
  maxReplicas: 500   # Can burst to 500 for major events

  # Scaling metrics
  metrics:
    # CPU-based scaling (primary)
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65   # Scale up at 65% CPU

    # RPS-based scaling (secondary)
    - type: Pods
      pods:
        metric:
          name: requests_per_second
        target:
          type: AverageValue
          averageValue: "1500"     # Scale up if RPS exceeds 1500/pod

    # Latency-based scaling (prevent degradation)
    - type: Pods
      pods:
        metric:
          name: response_time_p99_ms
        target:
          type: AverageValue
          averageValue: "200"      # Scale up if P99 exceeds 200ms

  # Scaling behavior
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60    # Wait 60s before scaling up more
      policies:
        - type: Percent
          value: 50                     # Add up to 50% more pods
          periodSeconds: 60
        - type: Pods
          value: 20                     # Or add up to 20 pods
          periodSeconds: 60
      selectPolicy: Max                 # Use whichever adds more
    scaleDown:
      stabilizationWindowSeconds: 300   # Wait 5min before scaling down
      policies:
        - type: Percent
          value: 10                     # Remove at most 10% of pods
          periodSeconds: 120
      selectPolicy: Min                 # Conservative scale-down
```

Scaling down too aggressively causes oscillation (scale down → load spikes → scale up → repeat). Use longer stabilization windows for scale-down, and remove capacity gradually. It's better to pay for a few extra servers than to create instability.
When systems are overloaded, the goal is to degrade gracefully—maintain core functionality while sacrificing non-essential features.
| Level | Trigger | Actions | User Impact |
|---|---|---|---|
| Normal | All metrics healthy | Full feature set | None |
| Advisory | P99 > 200ms | Reduce non-essential API calls | Minimal |
| Warning | P99 > 400ms | Disable luxury features | Noticeable but acceptable |
| Critical | P99 > 1s or errors > 1% | Emergency load shedding | Significant |
| Survival | System at risk | Core features only | Major, but system survives |
"""Degradation Controller Coordinates graceful degradation across services based onreal-time system health indicators.""" from enum import IntEnumfrom dataclasses import dataclassfrom typing import Set class DegradationLevel(IntEnum): NORMAL = 0 ADVISORY = 1 WARNING = 2 CRITICAL = 3 SURVIVAL = 4 @dataclassclass FeatureFlag: name: str min_level: DegradationLevel # Feature disabled at this level and above description: str class DegradationController: # Features that can be disabled during degradation DEGRADABLE_FEATURES = [ FeatureFlag("algorithmic_timeline", DegradationLevel.WARNING, "Serve chronological timeline instead of ranked"), FeatureFlag("trending_topics", DegradationLevel.WARNING, "Hide trending topics section"), FeatureFlag("who_to_follow", DegradationLevel.ADVISORY, "Disable follow suggestions"), FeatureFlag("analytics_tracking", DegradationLevel.ADVISORY, "Stop non-essential analytics"), FeatureFlag("media_previews", DegradationLevel.WARNING, "Show links instead of previews"), FeatureFlag("real_time_updates", DegradationLevel.CRITICAL, "Disable WebSocket updates, require refresh"), FeatureFlag("search", DegradationLevel.CRITICAL, "Disable search, show cached trends"), FeatureFlag("new_tweets", DegradationLevel.SURVIVAL, "Disable posting, read-only mode"), ] def __init__(self, metrics_client, feature_flags): self.metrics = metrics_client self.flags = feature_flags self.current_level = DegradationLevel.NORMAL def evaluate_health(self) -> DegradationLevel: """ Evaluate current system health and return appropriate level. """ p99_latency = self.metrics.get("timeline_api.latency.p99") error_rate = self.metrics.get("timeline_api.error_rate") cpu_usage = self.metrics.get("cluster.cpu.avg") if error_rate > 0.05 or p99_latency > 2000: return DegradationLevel.SURVIVAL elif error_rate > 0.01 or p99_latency > 1000: return DegradationLevel.CRITICAL elif p99_latency > 400 or cpu_usage > 0.85: return DegradationLevel.WARNING elif p99_latency > 200 or cpu_usage > 0.75: return DegradationLevel.ADVISORY else: return DegradationLevel.NORMAL def get_disabled_features(self) -> Set[str]: """Get features that should be disabled at current level.""" disabled = set() for feature in self.DEGRADABLE_FEATURES: if self.current_level >= feature.min_level: disabled.add(feature.name) return disabled def update_and_notify(self): """ Update degradation level and notify relevant services. """ new_level = self.evaluate_health() if new_level != self.current_level: old_level = self.current_level self.current_level = new_level # Update feature flags disabled = self.get_disabled_features() for feature in self.DEGRADABLE_FEATURES: self.flags.set( feature.name, feature.name not in disabled ) # Alert operations team if new_level >= DegradationLevel.WARNING: self.alert_ops(old_level, new_level) # Log for analysis self.log_level_change(old_level, new_level) def is_feature_enabled(self, feature_name: str) -> bool: """Check if a feature should be enabled.""" disabled = self.get_disabled_features() return feature_name not in disabledWhen overwhelmed, the system must shed load to protect core functionality:
When overwhelmed, the system must shed load to protect core functionality:
"""Load Shedding Strategies When the system cannot handle all requests, strategicallyreject some to protect the rest.""" class LoadShedder: """ Load Shedding Priorities ======================== Priority 1 (Never Shed): Core functionality - User authentication - Tweet posting (with rate limit) - Critical notifications Priority 2 (Shed Last): Primary features - Home timeline - User profile views - Notifications Priority 3 (Shed Early): Secondary features - Search - Explore/Discovery - Analytics Priority 4 (Shed First): Luxury features - Who to follow - Trending topics - Related tweets """ PRIORITY_MAP = { "/auth/*": 1, "/tweets/create": 1, "/timeline/home": 2, "/users/*/profile": 2, "/notifications": 2, "/search/*": 3, "/explore/*": 3, "/trending": 4, "/suggestions/*": 4, } def __init__(self, target_rps: int, current_rps_getter): self.target_rps = target_rps self.get_current_rps = current_rps_getter def should_shed(self, request_path: str) -> bool: """ Determine if a request should be shed. """ current_rps = self.get_current_rps() if current_rps <= self.target_rps: return False # Under capacity, accept all overload_ratio = current_rps / self.target_rps priority = self._get_priority(request_path) # Higher overload + lower priority = more likely to shed shed_probability = self._calculate_shed_probability( overload_ratio, priority ) return random.random() < shed_probability def _get_priority(self, path: str) -> int: """Get priority for a request path.""" for pattern, priority in self.PRIORITY_MAP.items(): if self._match_pattern(path, pattern): return priority return 3 # Default to middle priority def _calculate_shed_probability( self, overload_ratio: float, priority: int ) -> float: """ Calculate probability of shedding this request. At 1.5x overload: - Priority 1: 0% shed - Priority 2: 10% shed - Priority 3: 30% shed - Priority 4: 50% shed At 2x overload: - Priority 1: 0% shed - Priority 2: 25% shed - Priority 3: 50% shed - Priority 4: 75% shed """ if priority == 1: return 0.0 # Never shed priority 1 base_shed = (overload_ratio - 1) * 0.5 # 50% at 2x overload priority_multiplier = (priority - 1) / 3 # 0 to 1 based on priority return min(0.95, base_shed * (1 + priority_multiplier))When features are degraded, communicate this to users. A banner saying 'Some features are temporarily limited' is better than confusing silent failures. Users are surprisingly understanding when you're transparent about issues.
Running a Twitter-scale system requires operational practices beyond the code itself.
| Dashboard | Key Metrics | Alert Thresholds |
|---|---|---|
| API Health | RPS, Latency P50/P99, Error Rate | P99 > 300ms, Errors > 0.1% |
| Infrastructure | CPU, Memory, Disk, Network | CPU > 80%, Memory > 85% |
| Database | QPS, Replication Lag, Connection Pool | Lag > 10s, Pool > 90% |
| Cache | Hit Rate, Evictions, Memory | Hit Rate < 90%, Memory > 90% |
| Fanout | Queue Depth, Processing Lag | Depth > 1M, Lag > 60s |
```markdown
# Timeline API Incident Response Runbook

## Severity Levels
- SEV1: Complete outage, >50% of users affected
- SEV2: Major degradation, >10% of users affected
- SEV3: Minor degradation, <10% of users affected
- SEV4: No user impact, potential risk

## Immediate Actions (First 5 minutes)

### 1. Assess Scope
- Check: Is this regional or global?
- Check: Which services are affected?
- Check: When did it start? (correlate with deploys)

### 2. Communicate
- SEV1/SEV2: Page incident commander immediately
- Update status page within 10 minutes
- Open incident Slack channel

### 3. Mitigate First, Debug Later
- Recent deploy? ROLLBACK IMMEDIATELY
- Traffic spike? Enable aggressive load shedding
- Single region? Failover to healthy region
- Database issue? Failover to replica

## Common Issues and Fixes

### Timeline API Latency Spike
1. Check Redis cache hit rate (dashboard: Cache Health)
2. If hit rate dropped → Check for cache invalidation storm
3. If hit rate OK → Check database latency
4. If database OK → Check downstream services (Fanout, User Service)

### Fanout Queue Backing Up
1. Check for celebrity tweet (high follower count)
2. Scale up fanout workers (kubectl scale --replicas=200)
3. If still backing up → Enable emergency fanout limits
4. Consider temporarily disabling push for high-follower accounts

### Database Connection Exhaustion
1. Check for connection leaks (long-running queries)
2. Kill long-running queries if safe
3. Scale database replicas
4. Enable connection pool shedding

## Post-Incident
- Blameless post-mortem within 48 hours
- Action items with owners and due dates
- Update runbooks with learnings
```

Chaos engineering isn't about breaking things—it's about building confidence that your systems handle failures gracefully. Run chaos experiments continuously in staging, and periodically in production during low-traffic periods. The bugs you find proactively are far less costly than the ones found during peak traffic.
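A chaos experiment can be as simple as wrapping a dependency client and injecting latency or failures at a controlled rate during a staging run. The sketch below is a generic illustration, not a specific chaos tool's API; the rates and latency values are assumptions.

```python
import asyncio
import random

class ChaosProxy:
    """Hypothetical wrapper that injects faults into calls to a dependency."""

    def __init__(self, client, failure_rate: float = 0.05, added_latency_ms: int = 200):
        self._client = client
        self.failure_rate = failure_rate
        self.added_latency_ms = added_latency_ms

    async def call(self, method: str, *args, **kwargs):
        # Randomly fail a fraction of calls to exercise retry/fallback paths.
        if random.random() < self.failure_rate:
            raise ConnectionError("chaos: injected dependency failure")
        # Add artificial latency to exercise timeouts and degradation logic.
        await asyncio.sleep(self.added_latency_ms / 1000)
        return await getattr(self._client, method)(*args, **kwargs)
```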
We've designed a complete Twitter-like system, from timeline generation and tweet storage to trending topics, caching, and global distribution, and seen how all the pieces fit together.
Throughout this module, we've applied fundamental system design principles:
- Trade-offs are inevitable: Every decision involves trade-offs. We chose availability over consistency, favored read performance over write simplicity, and balanced cost against latency.
- Design for scale from day one: The celebrity problem wasn't an afterthought—it shaped our entire architecture. Anticipate extreme cases.
- Caching is king: Multi-layer caching transformed impossible latency targets into achievable ones. Know your cache hit rates.
- Fail gracefully: From circuit breakers to load shedding to degradation levels, we planned for failure at every layer.
- Observability enables everything: You can't optimize what you can't measure. Comprehensive monitoring is as important as the features themselves.
Congratulations! You've completed a comprehensive deep-dive into designing a Twitter-like social media platform. You now have the knowledge to discuss Twitter's architecture at a principal engineer level—understanding not just what to build, but why each decision was made and what trade-offs were accepted. This foundation applies to any large-scale content distribution system.