Caching is the classic answer to performance problems. Store frequently accessed data closer to the consumer, and you eliminate expensive recomputation. But Facebook's feed presents a paradox:
Every user gets a unique feed. How do you cache 2 billion unique feeds?
Traditional caching assumes shared content—everyone gets the same homepage, the same product page, the same article. But feeds are personalized at the individual level. User A sees different posts than User B, ranked in different orders, with different engagement context.
Yet Facebook's feed loads in under 500ms. The secret isn't avoiding caching—it's rethinking what to cache. Instead of caching final feeds, we cache components and build feeds from cached building blocks.
This architectural insight transforms an impossible caching problem (2B unique feeds) into a tractable one (millions of shared components).
By the end of this page, you will understand Facebook's multi-layer caching architecture, master cache strategies for different data types, learn cache invalidation approaches at scale, and explore the trade-offs between freshness and performance.
Facebook employs a sophisticated multi-layer caching strategy where each layer optimizes for different access patterns, latencies, and data types.
| Layer | Latency | Capacity | TTL | Consistency |
|---|---|---|---|---|
| Client Memory | ~1ms | ~10MB/user | Session | Request-local |
| Client Storage | ~10ms | ~100MB/user | Days | Eventually consistent |
| CDN/Edge | ~5ms | ~100TB/region | Hours | TTL-based |
| L1 Process | ~0.1ms | ~1GB/server | Request | Process-local |
| L2 Distributed | ~1ms | ~100TB | Minutes-Hours | Eventually consistent |
| Feed Cache | ~5ms | ~200TB | Hours | Eventually consistent |
| TAO Cache | ~1ms | ~500TB | Write-through | Write-through |
Each cache layer should be roughly 10x faster and 10x smaller than the layer below it. If L2 cache takes 1ms, L1 should take 0.1ms. If L2 holds 100TB, L1 holds about 10TB in aggregate (roughly 1GB per server across the fleet). This creates a natural hierarchy where hot data migrates up to faster, smaller caches.
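The "hot data migrates up" behavior can be sketched with a minimal two-tier wrapper. This is a hypothetical `TieredCache`, not Facebook's implementation: a small in-process L1 map in front of a larger L2 (here a plain `Map` stands in for a distributed cache), with promotion on L2 hits.

```typescript
// Hypothetical two-tier cache: a small, fast L1 in front of a larger L2.
// An L1 miss that hits L2 promotes the entry into L1 ("hot data migrates up").
class TieredCache<V> {
  private l1 = new Map<string, V>();

  constructor(
    private l1Capacity: number,
    private l2: Map<string, V>, // stand-in for a distributed cache
  ) {}

  get(key: string): V | undefined {
    const hot = this.l1.get(key);
    if (hot !== undefined) return hot;

    const cold = this.l2.get(key);
    if (cold !== undefined) this.promote(key, cold);
    return cold;
  }

  private promote(key: string, value: V): void {
    // Evict the oldest L1 entry when full (FIFO as a simple stand-in for LRU)
    if (this.l1.size >= this.l1Capacity) {
      const oldest = this.l1.keys().next().value;
      if (oldest !== undefined) this.l1.delete(oldest);
    }
    this.l1.set(key, value);
  }
}
```

Production L1 caches would use LRU or LFU eviction and TTLs; the point here is only the promotion path between layers.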
Instead of caching complete feeds (impossible at 2B users), we cache the components that feeds are built from. This enables high cache hit rates while supporting personalization.
```typescript
// Cache Strategy: Components, not Complete Feeds

// ❌ Naive approach: Cache complete feed per user
interface CachedFeed {
  userId: string;
  posts: FullPost[]; // Complete post data
}
// Problem: 2B users × 50KB/feed = 100 PB of cache (impossible)
// Problem: Any change to any post invalidates caches

// ✅ Better approach: Cache components separately

// 1. Feed Candidate Cache - Post IDs only
interface FeedCandidateCache {
  userId: string;
  postIds: string[]; // Just IDs, not content
  generatedAt: timestamp;
}
// Size: 2B users × 1500 posts × 32 bytes = 96 TB (feasible)

// 2. Post Content Cache - Shared across all users
interface PostCache {
  postId: string;
  content: PostContent;
  author: UserReference;
  mediaIds: string[];
}
// Size: Billions of posts, but each cached once

// 3. Engagement Cache - Updated independently
interface EngagementCache {
  postId: string;
  likes: number;
  comments: number;
  shares: number;
  lastUpdated: timestamp;
}
// Size: Small per-post, high update frequency

// 4. User Profile Cache - Shared across all posts
interface UserProfileCache {
  userId: string;
  name: string;
  profilePicUrl: string;
  // ... other profile fields
}
// Size: 3B users × 1KB = 3 TB

// Feed Assembly (at request time)
async function assembleFeed(userId: string): Promise<Feed> {
  // 1. Get candidate post IDs (cached)
  const candidates = await feedCandidateCache.get(userId);

  // 2. Fetch post content (cached, high hit rate)
  const posts = await postCache.batchGet(candidates.postIds);

  // 3. Fetch engagement counts (cached, short TTL)
  const engagement = await engagementCache.batchGet(candidates.postIds);

  // 4. Fetch author profiles (cached, high hit rate)
  const authorIds = posts.map(p => p.authorId);
  const profiles = await userProfileCache.batchGet(authorIds);

  // 5. Rank and assemble (computed, not cached)
  return rankAndAssemble(posts, engagement, profiles);
}
```

| Component | Hit Rate | Miss Cost | Optimization Strategy |
|---|---|---|---|
| Feed Candidates | ~70% | High (re-aggregate) | Long TTL, background refresh |
| Post Content | ~95% | Medium (DB query) | Write-through, CDN for media |
| Engagement Counts | ~99% | Low (fast counter) | Short TTL, coalesced updates |
| User Profiles | ~99% | Low (small data) | Write-through, replicated |
| Social Graph | ~99% | Medium (TAO query) | TAO handles caching |
| Ranking Features | ~60% | High (ML computation) | Feature store, pre-computation |
Post content and user profiles achieve 95-99% hit rates because they're shared across users. A viral post with 1M views gets its content cached once and served 1M times. The personalization happens in which posts to include, not in the post content itself.
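Hit rates translate directly into expected read latency, which is why the miss costs in the table matter as much as the rates. A back-of-the-envelope helper (the latencies below are illustrative assumptions, not Facebook's measurements):

```typescript
// Expected cost of a cached read:
// hitRate × hitLatency + (1 − hitRate) × missLatency
function expectedLatencyMs(hitRate: number, hitMs: number, missMs: number): number {
  return hitRate * hitMs + (1 - hitRate) * missMs;
}

// Illustrative: a 95%-hit post-content cache (1ms hit, 20ms DB miss)
// vs. a 70%-hit feed-candidate cache (5ms hit, 150ms re-aggregation miss).
const postContent = expectedLatencyMs(0.95, 1, 20);    // 1.95ms
const feedCandidates = expectedLatencyMs(0.70, 5, 150); // 48.5ms
```

Under these assumptions the 70%-hit candidate cache dominates end-to-end latency, which is why feed candidates get long TTLs and background refresh while post content does not need them.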
Different data types require different caching strategies. Understanding when to use each pattern is crucial for designing effective cache systems.
```typescript
// Cache-Aside Pattern
// Application manages cache explicitly
// Best for: Read-heavy data with infrequent updates

class CacheAsidePostStore {
  private cache: DistributedCache;
  private database: Database;

  async getPost(postId: string): Promise<Post | null> {
    // 1. Try cache first
    const cached = await this.cache.get(`post:${postId}`);
    if (cached) {
      return cached;
    }

    // 2. Cache miss: query database
    const post = await this.database.query(
      `SELECT * FROM posts WHERE id = ?`, [postId]
    );

    if (post) {
      // 3. Populate cache for next request
      await this.cache.set(`post:${postId}`, post, { ttl: 3600 });
    }

    return post;
  }

  async updatePost(postId: string, updates: Partial<Post>): Promise<void> {
    // 1. Update database (source of truth)
    await this.database.update('posts', postId, updates);

    // 2. Invalidate cache (or update it)
    await this.cache.delete(`post:${postId}`);
    // Alternative: await this.cache.set(`post:${postId}`, updatedPost);
  }
}

// Pros: Simple, only caches what's needed
// Cons: Cache misses hit database, potential cache stampede
```
```typescript
// Write-Through Pattern
// Cache is always updated with the database
// Best for: Data requiring strong consistency, frequently read after write

class WriteThroughProfileStore {
  private cache: DistributedCache;
  private database: Database;

  async getProfile(userId: string): Promise<Profile> {
    // Always read from cache (guaranteed to be populated)
    return this.cache.get(`profile:${userId}`);
  }

  async updateProfile(userId: string, updates: Partial<Profile>): Promise<void> {
    // 1. Start transaction
    const tx = await this.database.beginTransaction();

    try {
      // 2. Update database
      await tx.update('profiles', userId, updates);

      // 3. Update cache (synchronously, before commit)
      const updatedProfile = await tx.query(
        `SELECT * FROM profiles WHERE id = ?`, [userId]
      );
      await this.cache.set(`profile:${userId}`, updatedProfile);

      // 4. Commit transaction
      await tx.commit();
    } catch (e) {
      await tx.rollback();
      throw e;
    }
  }
}

// Pros: Cache always consistent with DB, no cold start
// Cons: Write latency increased, requires careful error handling
```
```typescript
// Write-Behind Pattern
// Write to cache immediately, async persist to database
// Best for: High write throughput, eventual consistency acceptable

class WriteBehindEngagementStore {
  private cache: DistributedCache;
  private database: Database;
  private writeQueue: WriteQueue;

  async incrementLikes(postId: string): Promise<number> {
    // 1. Increment in cache immediately (fast)
    const newCount = await this.cache.incr(`likes:${postId}`);

    // 2. Queue database write (async, batched)
    this.writeQueue.enqueue({
      type: 'increment',
      table: 'post_engagement',
      postId,
      field: 'likes',
      delta: 1,
    });

    return newCount;
  }

  // Background job processes write queue
  async processWriteQueue(): Promise<void> {
    const batch = await this.writeQueue.dequeue(100);

    // Coalesce increments for same post
    const coalesced = this.coalesceIncrements(batch);

    // Batch write to database
    await this.database.batchUpdate(coalesced);
  }

  coalesceIncrements(items: WriteItem[]): CoalescedWrite[] {
    const byPost = new Map();
    for (const item of items) {
      const key = `${item.postId}:${item.field}`;
      byPost.set(key, (byPost.get(key) || 0) + item.delta);
    }
    return [...byPost.entries()].map(([key, delta]) => ({
      postId: key.split(':')[0],
      field: key.split(':')[1],
      delta,
    }));
  }
}

// Pros: Very fast writes, natural batching
// Cons: Data loss risk if cache fails before persist, complex recovery
```

| Pattern | Use When | Data Type Examples |
|---|---|---|
| Cache-Aside | Read-heavy, eventual consistency OK | Post content, user profiles |
| Write-Through | Strong consistency required | Privacy settings, blocking data |
| Write-Behind | High write volume, eventual consistency OK | Engagement counts, view counts |
| Read-Through | Always cache on miss, auto-loading | Configuration, reference data |
| Refresh-Ahead | Predictable access patterns | Feed candidates, trending data |
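Refresh-ahead is the one pattern in this table without a code example above. A minimal sketch, assuming a caller-supplied `loader` that can rebuild the value (the class and its parameters are hypothetical, not a real Facebook API):

```typescript
// Hypothetical refresh-ahead wrapper: when a read finds an entry past a
// refresh threshold (but not yet expired), serve it and rebuild in the
// background, so predictable hot keys rarely expire in a user's face.
interface Entry<V> { value: V; storedAt: number; }

class RefreshAheadCache<V> {
  private store = new Map<string, Entry<V>>();

  constructor(
    private ttlMs: number,
    private refreshAfterMs: number, // e.g. 80% of ttlMs
    private loader: (key: string) => Promise<V>,
  ) {}

  async get(key: string): Promise<V> {
    const entry = this.store.get(key);
    const now = Date.now();

    if (!entry || now - entry.storedAt > this.ttlMs) {
      // Expired or missing: load synchronously (the slow path)
      return this.load(key);
    }

    if (now - entry.storedAt > this.refreshAfterMs) {
      // Still fresh enough to serve, but refresh in the background
      void this.load(key).catch(() => { /* keep serving the old value */ });
    }
    return entry.value;
  }

  private async load(key: string): Promise<V> {
    const value = await this.loader(key);
    this.store.set(key, { value, storedAt: Date.now() });
    return value;
  }
}
```

This fits feed candidates well: access is predictable (the same user keeps scrolling), so refreshing slightly before expiry trades a little extra load for never paying the re-aggregation cost on the request path.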
As Phil Karlton famously said: "There are only two hard things in Computer Science: cache invalidation and naming things." In personalized feed systems, invalidation is particularly challenging because changes in one place can affect many cached items.
```typescript
// Event-driven cache invalidation

class CacheInvalidator {
  private redis: RedisCluster;
  private eventBus: EventBus;

  constructor() {
    // Subscribe to relevant events
    this.eventBus.subscribe('post:updated', this.onPostUpdated.bind(this));
    this.eventBus.subscribe('post:deleted', this.onPostDeleted.bind(this));
    this.eventBus.subscribe('user:profile_updated', this.onProfileUpdated.bind(this));
    this.eventBus.subscribe('user:privacy_changed', this.onPrivacyChanged.bind(this));
    this.eventBus.subscribe('friendship:changed', this.onFriendshipChanged.bind(this));
  }

  async onPostUpdated(event: PostUpdatedEvent) {
    // Invalidate post content cache
    await this.redis.del(`post:${event.postId}`);

    // Invalidate CDN cache for media
    if (event.mediaChanged) {
      await this.cdnInvalidate(`/posts/${event.postId}/media/*`);
    }
  }

  async onPostDeleted(event: PostDeletedEvent) {
    // Invalidate all caches related to this post
    await this.redis.del([
      `post:${event.postId}`,
      `engagement:${event.postId}`,
      `comments:${event.postId}`,
    ]);

    // Invalidate feed caches that might contain this post
    // This is expensive - defer to background processing
    await this.queueFeedInvalidation({
      postId: event.postId,
      authorId: event.authorId,
    });
  }

  async onPrivacyChanged(event: PrivacyChangedEvent) {
    // Privacy changes require broader invalidation
    const userId = event.userId;

    // Invalidate all posts by this user
    const postIds = await this.getPostsByUser(userId);
    await this.redis.del(postIds.map(id => `post:${id}`));

    // Invalidate feed candidates for all friends
    // (they may need to see less or more content now)
    const friends = await this.getFriends(userId);
    await this.invalidateFeedCandidates(friends);
  }

  async invalidateFeedCandidates(userIds: string[]) {
    // For large invalidations, use background job
    if (userIds.length > 1000) {
      await this.queueBulkInvalidation(userIds);
      return;
    }

    // Direct invalidation for smaller batches
    await this.redis.del(userIds.map(id => `feed_candidates:${id}`));
  }
}
```
```typescript
// Cascade invalidation can be expensive
// Example: User changes name → affects all their posts in all friends' feeds

class CascadeInvalidator {
  // Strategy 1: Lazy invalidation (check freshness on read)
  async lazyProfileCheck(post: CachedPost, viewer: string): Promise<Post> {
    const author = await profileCache.get(post.authorId);

    // Check if profile is newer than cached post
    if (author.lastModified > post.cachedAt) {
      // Profile changed, update embedded author data
      post.author = await this.refreshAuthorData(post, author);
    }
    return post;
  }

  // Strategy 2: Tag-based invalidation
  async tagBasedInvalidation(userId: string) {
    // All posts tagged with this author
    await this.redis.eval(`
      local keys = redis.call('keys', 'post:*:author:' .. ARGV[1])
      for i, key in ipairs(keys) do
        redis.call('del', key)
      end
      return #keys
    `, 0, userId);
  }

  // Strategy 3: TTL with version check
  async versionedRead(postId: string): Promise<Post> {
    const cached = await this.redis.hgetall(`post:${postId}`);

    // Check freshness version
    const currentVersion = await this.getPostVersion(postId);
    if (cached.version !== currentVersion) {
      // Stale cache, refresh
      return this.refreshPostCache(postId);
    }
    return cached.data;
  }

  // Strategy 4: Probabilistic expiration (avoid thundering herd)
  cacheWithProbabilisticExpiry(key: string, data: any, baseTTL: number) {
    // Add random jitter of up to +10% to the TTL
    const ttl = baseTTL + Math.random() * baseTTL * 0.1;
    return this.redis.set(key, JSON.stringify(data), 'EX', Math.floor(ttl));
  }
}
```

When a popular cached item expires, many simultaneous requests hit the database to rebuild it, which can overwhelm the database. Solutions include probabilistic early expiration, mutex locks during rebuild, or serving stale data while rebuilding in the background.
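The mutex-lock option can be approximated in-process with request coalescing. Here is a hypothetical single-flight helper (Facebook's lease mechanism, shown later, solves the same problem server-side across machines):

```typescript
// In-process request coalescing: concurrent misses for the same key share
// one rebuild instead of each issuing its own database query.
class SingleFlight<V> {
  private inFlight = new Map<string, Promise<V>>();

  async do(key: string, rebuild: () => Promise<V>): Promise<V> {
    const existing = this.inFlight.get(key);
    if (existing) return existing; // piggyback on the ongoing rebuild

    const p = rebuild().finally(() => this.inFlight.delete(key));
    this.inFlight.set(key, p);
    return p;
  }
}
```

This only deduplicates within one process; a fleet of feed servers still needs a cross-machine mechanism such as leases to fully prevent the stampede.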
Facebook has built custom caching systems optimized for specific access patterns. Understanding these helps appreciate why generic caching often isn't enough.
```typescript
// TAO API for social graph operations

// Object operations
const user = await tao.objectGet(userId);
await tao.objectSet(userId, { name: 'New Name', lastModified: now() });

// Association operations
// Create friendship edge
await tao.assocAdd(userId1, 'friend', userId2, {
  timestamp: now(),
  initiator: userId1,
});

// Query all friends (paginated)
const friends = await tao.assocRange(
  userId,   // Source object
  'friend', // Association type
  0,        // Start offset
  100       // Count
);

// Query specific association
const isFriend = await tao.assocGet(userId1, 'friend', userId2);

// Count associations
const friendCount = await tao.assocCount(userId, 'friend');

// Multi-get for batch efficiency
const profiles = await tao.objectMultiGet([userId1, userId2, userId3]);

// TAO handles caching automatically:
// - Objects cached individually
// - Association lists cached as sorted sets
// - Writes invalidate relevant caches synchronously
// - Read scalability via cache replicas
```
```typescript
// Lease-based cache get to prevent thundering herd

class LeaseMemcache {
  async getWithLease(key: string): Promise<{ value: any, lease?: string }> {
    const result = await this.memcache.getsWithLease(key);

    if (result.value !== null) {
      // Cache hit
      return { value: result.value };
    }

    // Cache miss - server issues lease
    if (result.lease) {
      // This request has the lease to rebuild cache
      return { value: null, lease: result.lease };
    }

    // Another request has the lease, wait briefly
    await sleep(10);
    return this.getWithLease(key); // Retry
  }

  async setWithLease(key: string, value: any, lease: string) {
    // Only succeeds if lease is still valid
    return this.memcache.casWithLease(key, value, lease);
  }

  // Usage pattern
  async getCachedPost(postId: string): Promise<Post> {
    const { value, lease } = await this.getWithLease(`post:${postId}`);

    if (value) {
      return value; // Cache hit
    }

    // Cache miss with lease - rebuild
    const post = await database.getPost(postId);
    if (lease) {
      await this.setWithLease(`post:${postId}`, post, lease);
    }
    return post;
  }
}

// Lease mechanism ensures only ONE request rebuilds cache
// All other requests wait and then get the cached value
// This prevents N simultaneous database queries
```
```typescript
// Feature Store for ranking model inputs

interface FeatureStore {
  // Pre-computed features updated periodically
  userFeatures: Map<UserId, UserFeatureVector>;
  contentFeatures: Map<ContentId, ContentFeatureVector>;

  // Real-time features computed per-request
  realtimeFeatures: (context: Context) => FeatureVector;
}

class FeedFeatureStore {
  private offlineStore: RocksDB;    // Pre-computed features
  private onlineStore: Redis;       // Real-time features
  private featureCache: LocalCache; // In-process cache

  async getUserFeatures(userId: string): Promise<UserFeatures> {
    // 1. Check process-local cache (hot users)
    const local = await this.featureCache.get(`user:${userId}`);
    if (local) return local;

    // 2. Check online store (recent computations)
    const online = await this.onlineStore.get(`user_features:${userId}`);
    if (online) {
      await this.featureCache.set(`user:${userId}`, online);
      return online;
    }

    // 3. Fall back to offline store (batch-computed features)
    const offline = await this.offlineStore.get(`user_features:${userId}`);
    await this.onlineStore.set(`user_features:${userId}`, offline, 'EX', 3600);
    await this.featureCache.set(`user:${userId}`, offline);
    return offline;
  }

  // Daily batch job pre-computes user features
  // Avoids computing 200+ features at request time
  async batchComputeUserFeatures(userIds: string[]) {
    for (const userId of userIds) {
      const features = await this.computeUserFeatures(userId);
      await this.offlineStore.put(`user_features:${userId}`, features);
    }
  }
}
```

With multiple cache layers, debugging performance issues and optimizing cache efficiency requires comprehensive monitoring.
| Metric | Target | Alert Threshold | Impact if Degraded |
|---|---|---|---|
| Hit Rate | 95% | < 90% | Increased DB load, higher latency |
| Latency p50 | < 1ms | 2ms | Feed loading slower |
| Latency p99 | < 5ms | 20ms | Tail latency spikes |
| Eviction Rate | < 5%/hour | 10%/hour | Cache thrashing, need more memory |
| Memory Usage | < 90% | 95% | Evictions, potential OOM |
| Connection Errors | < 0.01% | 0.1% | Cache unavailability |
```typescript
// Cache monitoring instrumentation

class CacheMetrics {
  private metrics: MetricsClient;

  async instrumentedGet(cache: Cache, key: string): Promise<any> {
    const startTime = Date.now();

    try {
      const result = await cache.get(key);
      const latency = Date.now() - startTime;

      // Record metrics
      this.metrics.histogram('cache.latency', latency, {
        cache: cache.name,
        operation: 'get',
      });

      if (result !== null) {
        this.metrics.increment('cache.hit', 1, {
          cache: cache.name,
          key_pattern: this.extractPattern(key),
        });
      } else {
        this.metrics.increment('cache.miss', 1, {
          cache: cache.name,
          key_pattern: this.extractPattern(key),
        });
      }

      return result;
    } catch (error) {
      this.metrics.increment('cache.error', 1, {
        cache: cache.name,
        error_type: error.name,
      });
      throw error;
    }
  }

  // Dashboard alerts
  setupAlerts() {
    alert({
      name: 'cache_hit_rate_low',
      query: 'rate(cache_hit) / rate(cache_hit + cache_miss) < 0.9',
      severity: 'warning',
      message: 'Cache hit rate below 90%',
    });

    alert({
      name: 'cache_latency_high',
      query: 'percentile(cache_latency, 99) > 20ms',
      severity: 'critical',
      message: 'Cache p99 latency exceeds 20ms',
    });

    alert({
      name: 'cache_connection_errors',
      query: 'rate(cache_error) / rate(cache_*) > 0.001',
      severity: 'critical',
      message: 'Cache error rate exceeds 0.1%',
    });
  }

  // Extract key pattern for aggregation (e.g., "post:123" → "post:*")
  extractPattern(key: string): string {
    return key.replace(/[0-9]+/g, '*');
  }
}
```

When hit rates drop: 1) check whether new code is using different key patterns, 2) look for TTL misconfiguration, 3) check whether evictions are happening (memory pressure), 4) verify the cache isn't being incorrectly invalidated by events.
We've explored the multi-layer caching architecture that makes Facebook's personalized feed fast: cache shared components at high hit rates, assemble unique feeds at request time, and match each data type to the cache strategy its consistency needs allow.
What's Next:
With caching covered, the final page explores Privacy Considerations—the architectural patterns that ensure user privacy and data protection in personalized feed systems.
You now understand the multi-layer caching strategy that makes personalized feeds fast. The key insight is caching shareable components while assembling unique feeds at request time—turning an impossible caching problem into a tractable one.