The Instagram home feed is where users spend most of their time—scrolling through a curated stream of photos, videos, and stories from accounts they follow. What appears to be a simple chronological list is actually the output of one of the most sophisticated content ranking systems in existence.
Consider the challenge: A typical user follows 200-500 accounts. Those accounts collectively produce dozens of new posts per day. Instagram must determine which posts to show, in what order, and when—all while rendering a feed that feels personal, timely, and engaging.
Now multiply this by 500 million daily active users, each requesting their feed multiple times per day. The feed generation system handles 10+ billion feed renders per day, each requiring aggregation from hundreds of sources, ranking through ML models, and delivery in under 200 milliseconds.
By the end of this page, you will understand: (1) The fanout problem and Instagram's hybrid push/pull solution, (2) How feed candidates are generated and ranked using ML, (3) The social graph infrastructure that powers following relationships, (4) Feed caching strategies and invalidation patterns, (5) Real-time feed updates and notification systems, and (6) Pagination and infinite scroll implementation at scale.
The fanout problem is the central architectural challenge in social feed systems. When a user posts a photo, how do we ensure it appears in all their followers' feeds?
There are two fundamental approaches, each with profound trade-offs:
Fanout-on-Write (Push Model):
When a user posts, immediately write to every follower's pre-computed feed cache.
User A posts photo
→ For each follower of A:
→ Append photo to follower's feed cache
→ All followers have updated feeds immediately
Fanout-on-Read (Pull Model):
When a user requests their feed, query all followed accounts and aggregate their recent posts.
User B requests feed
→ For each account B follows:
→ Fetch recent posts from that account
→ Merge and rank all posts
→ Return top N posts as feed
The Celebrity Problem:
Neither approach works universally. Consider these extremes:
| User Type | Followers | Best Approach | Why |
|---|---|---|---|
| Average user | 500 | Fanout-on-write | 500 writes is cheap; fast reads for followers |
| Micro-influencer | 50,000 | Hybrid | Significant but manageable write amplification |
| Major celebrity | 500M | Fanout-on-read | 500M writes per post is catastrophic |
If Cristiano Ronaldo (600M+ followers) used fanout-on-write, a single post would generate 600 million write operations. At 10 posts/day, that's 6 billion writes daily from one user—more than Instagram's total daily post volume.
Instagram uses a hybrid approach with a follower threshold (approximately 10,000-50,000 followers). Accounts below the threshold use fanout-on-write (push to followers). Accounts above use fanout-on-read (pulled at feed generation time). This hybrid optimizes for the common case (regular users) while handling extreme cases (celebrities) gracefully.
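The hybrid routing above can be sketched in a few lines. This is a minimal illustration, not Instagram's actual code: the threshold constant and the helper signatures (`get_followers`, the `feed_cache` dict) are assumptions chosen for clarity.

```python
# Sketch of hybrid fanout routing. PUSH_THRESHOLD and the helpers are
# illustrative assumptions, not Instagram's real API.
PUSH_THRESHOLD = 10_000  # followers; accounts above this are "pull" accounts

def fanout_strategy(follower_count: int) -> str:
    """Choose the fanout strategy for an account at post time."""
    return "push" if follower_count < PUSH_THRESHOLD else "pull"

def fanout_on_post(author_id, post_id, follower_count, get_followers, feed_cache):
    """On a new post: push into follower feed caches for small accounts,
    defer to feed-request time for large ones."""
    if fanout_strategy(follower_count) == "push":
        # Fanout-on-write: bounded number of cheap cache appends
        for follower_id in get_followers(author_id):
            feed_cache.setdefault(follower_id, []).append(post_id)
        return "pushed"
    # Fanout-on-read: nothing to do now; the post is merged in when
    # each follower next requests their feed
    return "deferred"
```

The key property: write cost is bounded by the threshold for push accounts, while celebrity posts cost nothing at write time and are amortized across read requests.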
At the heart of feed generation is the social graph—the data structure representing who follows whom. This graph is massive and accessed on nearly every operation.
Graph Scale:
| Metric | Estimated Value |
|---|---|
| Nodes (users) | 2+ billion |
| Edges (follow relationships) | 100+ billion |
| Average following per user | ~200 |
| Average followers per user | ~200 (but highly skewed) |
| Daily edge changes (follow/unfollow) | 100+ million |
Graph Operations Needed: fetch a user's following list, fetch an account's followers, check whether A follows B, and read follower/following counts—each at very high request rates, since nearly every feed render touches the graph.
Graph Storage Strategies:
Storing 100+ billion edges efficiently requires specialized infrastructure:
Option 1: Adjacency List in Key-Value Store
# Store outgoing edges (following)
Key: user_id:following
Value: [followed_user_id_1, followed_user_id_2, ...]
# Store incoming edges (followers)
Key: user_id:followers
Value: [follower_id_1, follower_id_2, ...]
Option 2: Edge Table in Relational/Wide-Column DB
CREATE TABLE follows (
follower_id BIGINT,
followed_id BIGINT,
created_at TIMESTAMP,
PRIMARY KEY (follower_id, followed_id)
);
-- Index for reverse lookup
CREATE INDEX idx_followed ON follows(followed_id, follower_id);
Option 3: Graph Database
// Neo4j style
(:User {id: 'A'})-[:FOLLOWS]->(:User {id: 'B'})
Instagram's Approach: TAO (The Associations and Objects)
Meta (Facebook/Instagram) uses TAO, a purpose-built graph storage system optimized for social graph access patterns:
| TAO Feature | Benefit |
|---|---|
| Distributed cache layer | Sub-millisecond reads for hot data |
| Persistent storage tier (MySQL) | Durability with replication |
| Edge pagination | Efficient handling of celebrity-scale follower lists |
| Edge counts caching | Follower counts without scanning lists |
| Bidirectional edges | Following stored in both directions for fast lookups |
| Eventual consistency | Trades strict consistency for availability |
Graph Sharding:
With billions of users, the graph must be sharded across many machines:
# Shard by user_id hash
Shard(user_id) = hash(user_id) % num_shards
# User A's following list → Shard for A
# User B's followers list → Shard for B
# Problem: Following query (A follows B) might span two shards
# Solution: Store edges in both shards (write amplification for read efficiency)
Storing each edge twice (once in follower's shard, once in followed's shard) doubles write cost but ensures any relationship query touches only one shard. At Instagram's scale, the read/write ratio strongly favors this trade-off—edges are read billions of times but written once.
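The double-write scheme can be illustrated with an in-memory sketch. The shard count, data layout, and class name below are assumptions for demonstration; production systems use consistent hashing and persistent storage, not a list of dicts.

```python
import zlib

NUM_SHARDS = 1024  # illustrative shard count

def shard_for(user_id: str) -> int:
    # Deterministic hash so routing is stable across processes
    return zlib.crc32(user_id.encode()) % NUM_SHARDS

class ShardedEdgeStore:
    """Toy model of double-written edges: each follow edge is stored once in
    the follower's shard and once in the followed user's shard, so both
    'who does A follow?' and 'who follows B?' are single-shard reads."""

    def __init__(self):
        self.shards = [{"following": {}, "followers": {}} for _ in range(NUM_SHARDS)]

    def add_follow(self, follower: str, followed: str):
        # Write 1: follower's shard answers "who does A follow?"
        s1 = self.shards[shard_for(follower)]
        s1["following"].setdefault(follower, set()).add(followed)
        # Write 2: followed user's shard answers "who follows B?"
        s2 = self.shards[shard_for(followed)]
        s2["followers"].setdefault(followed, set()).add(follower)

    def is_following(self, follower: str, followed: str) -> bool:
        # Single-shard read thanks to the double write
        s = self.shards[shard_for(follower)]
        return followed in s["following"].get(follower, set())
```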
When a user requests their home feed, the system must determine which posts are candidates for inclusion. This happens in stages:
Stage 1: Identify Content Sources
User requests feed
→ Get list of followed accounts
→ Separate into 'push' accounts (regular users) and 'pull' accounts (celebrities)
→ Identify other content sources (ads, suggested posts, etc.)
Stage 2: Fetch Pre-pushed Content
For regular accounts (fanout-on-write), the user has a pre-computed feed cache:
Feed cache structure:
{
  user_id: "viewer_123",
  posts: [
    {post_id: "abc", author_id: "456", timestamp: 1704300000, score_features: {...}},
    {post_id: "def", author_id: "789", timestamp: 1704290000, score_features: {...}},
    // ... up to ~1000 recent posts
  ],
  last_updated: 1704310000
}
Stage 3: Pull Celebrity Content
For high-follower accounts, fetch their recent posts at request time:
async def fetch_celebrity_posts(user_id: str, celebrity_ids: List[str]) -> List[Post]:
    """
    Fetch recent posts from celebrity accounts the user follows.
    Runs at feed-request time for accounts above the push threshold.
    """
    # Parallel fetch from all celebrity accounts
    tasks = [fetch_recent_posts(celeb_id, limit=20) for celeb_id in celebrity_ids]
    results = await asyncio.gather(*tasks)

    # Flatten and deduplicate
    all_posts = []
    seen_post_ids = set()
    for posts in results:
        for post in posts:
            if post.id not in seen_post_ids:
                all_posts.append(post)
                seen_post_ids.add(post.id)
    return all_posts
Stage 4: Merge Candidate Pools
Candidate pool sources:
├── Pre-pushed feed items (from fanout-on-write)
├── Pulled celebrity posts (from fanout-on-read)
├── Suggested posts (from recommendation system)
├── Ads (from advertising system)
├── Reshared content (friends reshared to Stories)
└── Cross-posted content (e.g., Reels that appear in feed)
Merge all into single candidate pool → 500-2000 candidates
The candidate pool is intentionally larger than what will be shown. A typical feed load shows 10-20 posts, but ranking considers 500-2000 candidates. This allows the ranking system to select the optimal posts, not just the most recent ones. Over-generating candidates ensures the final feed is high-quality.
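The merge step itself is straightforward deduplication across sources. A minimal sketch, assuming each candidate is a `(post_id, timestamp)` pair and that a seen-post filter is applied at merge time (both simplifications):

```python
def merge_candidates(pushed, pulled, suggested, seen_ids=frozenset()):
    """Merge candidate sources into one deduplicated pool, dropping
    already-seen posts. Candidates are (post_id, timestamp) pairs here."""
    pool, seen = [], set(seen_ids)
    for source in (pushed, pulled, suggested):  # source order doesn't matter pre-ranking
        for post_id, ts in source:
            if post_id not in seen:
                pool.append((post_id, ts))
                seen.add(post_id)
    # Newest-first is only a starting order; the ML ranker reorders later
    pool.sort(key=lambda p: p[1], reverse=True)
    return pool
```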
Once candidates are assembled, the ranking system determines the order in which posts appear. Modern Instagram feeds are heavily optimized by machine learning models that predict which posts each user will find most engaging.
Ranking Objectives:
The ranking system optimizes for multiple outcomes simultaneously:
| Objective | Signal | Weight Consideration |
|---|---|---|
| Engagement | Predicted P(like), P(comment), P(share), P(save) | Primary signal |
| Time spent | Predicted view duration | Engagement depth |
| Relationship | Historical interaction with author | Social connection |
| Recency | Time since post creation | Timeliness |
| Content diversity | Content type, topic variety | Avoid repetition |
| Creator balance | Distribution across followed accounts | Fairness |
| User satisfaction | Long-term retention signals | Avoid addictive patterns |
The Ranking Pipeline:
Multi-Stage Ranking:
Ranking uses a cascade of increasingly sophisticated (and expensive) models:
| Stage | Candidates In | Candidates Out | Model | Latency Budget |
|---|---|---|---|---|
| Candidate fetch | 0 | ~2000 | N/A | 50ms |
| First-pass ranking | ~2000 | ~200 | Lightweight NN or GBDT | 20ms |
| Second-pass ranking | ~200 | ~50 | Heavy transformer/NN | 30ms |
| Business rules | ~50 | ~30 | Rules engine | 5ms |
| Final ordering | ~30 | ~20 | Diversity/position rules | 5ms |
| Total | ~2000 | ~20 | — | <150ms |
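The cascade pattern in the table can be sketched generically: a cheap scoring function prunes the pool so the expensive model only runs on survivors. The scoring callables below are stand-ins for the real lightweight and heavy models.

```python
def cascade_rank(candidates, cheap_score, heavy_score,
                 first_pass_keep=200, second_pass_keep=50):
    """Two-stage ranking cascade: a cheap model prunes, an expensive
    model refines. Scoring functions are stand-ins for real models."""
    # Stage 1: score everything with the cheap model, keep the top slice
    survivors = sorted(candidates, key=cheap_score, reverse=True)[:first_pass_keep]
    # Stage 2: spend the heavy model's latency budget only on survivors
    return sorted(survivors, key=heavy_score, reverse=True)[:second_pass_keep]
```

The economics are the point: if the heavy model costs 100x the cheap one, scoring 200 survivors instead of 2000 candidates cuts heavy-model compute by 10x while rarely losing a post the heavy model would have ranked highly.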
Feature Categories for Ranking:
from dataclasses import dataclass
from typing import List, Dict
import numpy as np

@dataclass
class RankingFeatures:
    # Post features
    post_age_hours: float
    content_type: str  # 'photo', 'video', 'carousel'
    has_location: bool
    caption_length: int
    hashtag_count: int

    # Author features
    author_follower_count: int
    author_post_frequency: float  # posts per week
    author_avg_engagement_rate: float
    is_verified: bool

    # User-author interaction features
    user_liked_author_count_7d: int
    user_commented_author_count_7d: int
    user_viewed_author_profile_7d: bool
    days_since_last_interaction: float

    # Content features (from ML embeddings)
    visual_embedding: List[float]   # 512-d
    caption_embedding: List[float]  # 768-d

    # Contextual features
    hour_of_day: int
    day_of_week: int
    posts_seen_this_session: int

def compute_ranking_features(
    user_id: str,
    post: Post,
    user_context: UserContext,
    interaction_history: InteractionHistory,
) -> RankingFeatures:
    """
    Compute all features needed for ranking a single post for a user.
    In production, this is heavily optimized with batch lookups and caching.
    """
    return RankingFeatures(
        # Post features
        post_age_hours=(time.now() - post.created_at).total_seconds() / 3600,
        content_type=post.content_type,
        has_location=post.location_id is not None,
        caption_length=len(post.caption),
        hashtag_count=count_hashtags(post.caption),
        # Author features (cached per author)
        author_follower_count=get_cached_follower_count(post.author_id),
        author_post_frequency=get_posting_frequency(post.author_id),
        author_avg_engagement_rate=get_engagement_rate(post.author_id),
        is_verified=is_verified(post.author_id),
        # Interaction features
        user_liked_author_count_7d=interaction_history.likes_to_author_7d(post.author_id),
        user_commented_author_count_7d=interaction_history.comments_to_author_7d(post.author_id),
        user_viewed_author_profile_7d=interaction_history.profile_view_7d(post.author_id),
        days_since_last_interaction=interaction_history.days_since_interaction(post.author_id),
        # Content features (pre-computed during upload)
        visual_embedding=post.visual_embedding,
        caption_embedding=get_text_embedding(post.caption),
        # Context
        hour_of_day=user_context.local_hour,
        day_of_week=user_context.day_of_week,
        posts_seen_this_session=user_context.session_post_count,
    )

The ranking model itself runs in milliseconds, but computing features for ~1000 candidates is expensive. Optimizations include: batch fetching from caches, pre-computing features at post time, using approximate features when exact ones are too slow, and aggressive caching of user interaction histories.
With billions of feed requests daily, aggressive caching is essential. However, feed caching is complex because feeds are personalized, time-sensitive, and constantly changing.
What Can Be Cached?
| Component | Cacheability | Cache Strategy |
|---|---|---|
| Feed candidate list | Per-user, short TTL | Cache 5-15 minutes, invalidate on new post |
| Ranked feed | Per-user, very short TTL | Cache 1-5 minutes, personalization changes |
| Post content | Global, long TTL | Cache for hours/days, immutable content |
| User metadata | Global, medium TTL | Cache 15-60 minutes, changes infrequently |
| Engagement counts | Global, short TTL | Cache 1-5 minutes, eventual consistency OK |
| Social graph | Per-user, medium TTL | Cache 30 minutes, invalidate on follow/unfollow |
Feed Cache Architecture:
Cache Invalidation Strategies:
The classic challenge: "When do we invalidate the cache?"
1. Time-based Expiration (TTL):
# Simple but can serve stale content
feed_cache.set(user_id, feed_data, ttl=300) # 5-minute TTL
2. Event-driven Invalidation:
# When followed account posts
def on_new_post(author_id: str, post_id: str):
    # Get all followers who have cached feeds
    followers = get_followers(author_id)
    for follower_id in followers:
        # Invalidate or update their feed cache
        feed_cache.invalidate(follower_id)
        # Or: Append new post to cached feed (lazy update)
        feed_cache.append_post(follower_id, post_id)

# When user follows/unfollows
def on_follow_change(user_id: str, target_id: str):
    feed_cache.invalidate(user_id)  # Feed sources changed
3. Hybrid: TTL + Incremental Updates:
# Cache stores feed with version number
cached_feed = {
    'version': 123,
    'posts': [...],
    'last_update': timestamp
}

# On request, check for updates since last version
def get_feed_with_refresh(user_id: str):
    cached = feed_cache.get(user_id)
    if cached:
        new_posts = get_posts_since(user_id, cached['last_update'])
        if new_posts:
            # Merge new posts into cached feed and re-rank
            merged = merge_and_rank(cached['posts'], new_posts)
            cached['posts'] = merged
            cached['version'] += 1
            cached['last_update'] = time.now()  # so the next refresh fetches only newer posts
            feed_cache.set(user_id, cached)
        return cached['posts']
    else:
        # Full feed generation
        return generate_full_feed(user_id)
When a celebrity with 100M followers posts, invalidating 100M feed caches simultaneously creates a thundering herd of cache misses. Solutions include: staggered invalidation (spread over seconds), lazy invalidation (invalidate on next request), and pull-based updates (don't push-invalidate for celebrities at all).
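The staggered-invalidation idea reduces to assigning each follower a random delay inside a window, so cache misses arrive as a trickle rather than a spike. A minimal sketch; the window size and function name are illustrative:

```python
import random

def staggered_invalidation_delays(follower_ids, window_seconds=30.0, seed=None):
    """Spread cache invalidations over a time window to avoid a thundering
    herd. Returns (follower_id, delay) pairs; a scheduler would enqueue
    each invalidation to fire after its delay."""
    rng = random.Random(seed)  # seedable for reproducible tests
    return [(fid, rng.uniform(0.0, window_seconds)) for fid in follower_ids]
```

With a 30-second window, 100M invalidations average ~3.3M per second instead of 100M at once—still heavy, which is why celebrity posts usually skip push-invalidation entirely in favor of pull.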
Modern users expect feeds to update in real-time. When a friend posts while you're scrolling, you expect to see "New posts available" appear. This requires push infrastructure that can reach hundreds of millions of concurrent users.
Real-Time Update Channels:
| Update Type | Delivery Method | Latency Target |
|---|---|---|
| New post indicator | Push notification or in-app pill | <10 seconds |
| Live engagement counts | WebSocket streaming | <2 seconds |
| Story ring update | Background refresh + push | <30 seconds |
| Comment thread | Real-time sync | <1 second |
| Like feedback | Optimistic UI + confirmation | Instant (<100ms) |
Push Notification Architecture:
from dataclasses import dataclass
from enum import Enum
from typing import List, Set

class UpdateType(Enum):
    NEW_POST = "new_post"
    ENGAGEMENT_UPDATE = "engagement"
    STORY_UPDATE = "story"

@dataclass
class FeedUpdate:
    update_type: UpdateType
    payload: dict
    target_users: Set[str]

class RealTimeFeedService:
    """
    Manages real-time feed updates to connected clients.
    """
    def __init__(self, connection_manager, message_queue):
        self.connections = connection_manager  # Tracks WebSocket connections
        self.queue = message_queue  # Kafka/SQS for durability

    async def on_new_post(self, author_id: str, post_id: str):
        """
        Called when a user publishes a new post.
        Notifies followers who are currently active.
        """
        # Get followers who are currently online
        all_followers = await get_followers(author_id)
        online_followers = await self.connections.filter_online(all_followers)

        # For online users, send real-time notification
        if len(online_followers) < 50000:  # Regular users
            update = FeedUpdate(
                update_type=UpdateType.NEW_POST,
                payload={
                    "author_id": author_id,
                    "post_id": post_id,
                    "preview_url": get_preview_url(post_id),
                    "author_name": get_username(author_id),
                },
                target_users=set(online_followers)
            )
            await self.push_update(update)
        else:
            # Celebrity post - don't push directly, let clients poll
            # or use progressively expanding fanout
            await self.queue_delayed_push(author_id, post_id, online_followers)

    async def push_update(self, update: FeedUpdate):
        """
        Push update to all target users via WebSocket.
        """
        for user_id in update.target_users:
            connection = self.connections.get(user_id)
            if connection:
                await connection.send_json({
                    "type": update.update_type.value,
                    "data": update.payload
                })

# Client-side handling
# When client receives NEW_POST notification:
# 1. Show "New posts available" pill at top of feed
# 2. On tap, scroll to top and refresh feed
# 3. OR: Auto-refresh if at top of feed and not actively scrolling

WebSocket Connection Management:
Maintaining persistent connections to 100+ million concurrent users is a significant infrastructure challenge:
| Challenge | Solution |
|---|---|
| Connection count | Millions of connections per server using epoll/kqueue |
| Connection routing | Consistent hashing to route user to specific server |
| Connection migration | Graceful handoff during deployments |
| Mobile connection stability | Automatic reconnection with exponential backoff |
| Battery impact | Batch updates, reduce heartbeat frequency |
| Presence tracking | Distributed presence service (who's online) |
Long-Polling Fallback:
For clients that can't maintain WebSocket connections (firewalls, proxies, old devices):
# Long-polling endpoint
GET /api/v1/feed/updates?since=<timestamp>&timeout=30
# Server holds connection open for up to 30 seconds
# Returns immediately if updates available
# Returns empty after timeout if no updates
# Client immediately reconnects after each response
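The client side of the long-poll protocol above is a tight request loop. A minimal sketch where `fetch_updates` stands in for issuing `GET /api/v1/feed/updates` (the callable signature is an assumption for illustration):

```python
def long_poll_loop(fetch_updates, handle, max_iterations=3):
    """Client-side long-poll loop: each call blocks server-side up to the
    timeout, then the client immediately re-issues the request.
    fetch_updates(since, timeout) -> (updates, server_time) stands in for
    the HTTP call; real clients loop until shutdown, not max_iterations."""
    since = 0.0
    for _ in range(max_iterations):
        updates, server_time = fetch_updates(since=since, timeout=30)
        for u in updates:
            handle(u)
        # Advance the watermark so the next poll asks only for newer updates
        since = server_time
    return since
```

Using the server-returned timestamp as the next `since` value (rather than the client clock) avoids missing updates due to clock skew.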
Mobile apps must balance real-time updates against battery drain and data usage. Instagram intelligently reduces update frequency when battery is low, user is on cellular, or app is backgrounded. The 'New posts' indicator is a UX compromise—it's not instant updates, but it's battery-efficient and still feels responsive.
Instagram's feed is an infinite scroll experience—users continuously scroll and new content loads seamlessly. Implementing this correctly is more complex than it appears.
The Pagination Challenge:
Traditional offset-based pagination fails for dynamic feeds:
-- Offset pagination: Simple but broken for feeds
SELECT * FROM posts ORDER BY rank DESC LIMIT 20 OFFSET 40;
-- Problem: If new posts are inserted or rankings change,
-- the user may see duplicate posts or miss posts entirely
Cursor-based Pagination:
Instagram uses cursor-based (keyset) pagination for stable scrolling:
# Cursor encodes the position in the feed
# Using last-seen post ID and its rank score

@dataclass
class FeedCursor:
    user_id: str
    last_post_id: str
    last_rank_score: float
    page_number: int
    session_id: str  # For analytics

def get_feed_page(cursor: FeedCursor) -> Tuple[List[Post], FeedCursor]:
    """
    Get next page of feed starting after the cursor position.
    """
    # Fetch posts ranked after the cursor
    posts = db.query(
        """SELECT * FROM ranked_feed
           WHERE user_id = :user_id
             AND (rank_score, post_id) < (:last_score, :last_post_id)
           ORDER BY rank_score DESC, post_id DESC
           LIMIT 20""",
        user_id=cursor.user_id,
        last_score=cursor.last_rank_score,
        last_post_id=cursor.last_post_id
    )

    # Create cursor for next page
    if posts:
        next_cursor = FeedCursor(
            user_id=cursor.user_id,
            last_post_id=posts[-1].id,
            last_rank_score=posts[-1].rank_score,
            page_number=cursor.page_number + 1,
            session_id=cursor.session_id
        )
    else:
        next_cursor = None
    return posts, next_cursor
Feed Session Stability:
A user's feed should remain stable during a scrolling session, even if rankings change:
| Scenario | Without Session Stability | With Session Stability |
|---|---|---|
| New post appears | May insert mid-scroll, causing confusion | Held until session refresh |
| Ranking update | Cards reorder unexpectedly | Order frozen within session |
| Seen post | May reappear if re-ranked | Filtered out for session duration |
| Deleted post | Card disappears mid-scroll | Card removed gracefully |
Implementation: Session-Scoped Feed Snapshot:
class FeedSession:
    """
    Represents a single scrolling session with stable feed content.
    """
    def __init__(self, user_id: str):
        self.session_id = generate_uuid()
        self.user_id = user_id
        self.created_at = time.now()
        self.seen_post_ids: Set[str] = set()
        self.ranked_posts: List[RankedPost] = []

    def initialize(self):
        """Generate and cache the ranked feed for this session."""
        candidates = fetch_candidates(self.user_id)
        self.ranked_posts = rank_candidates(self.user_id, candidates)
        # Store in Redis with TTL
        cache.set(
            f"feed_session:{self.session_id}",
            self.serialize(),
            ttl=3600  # 1 hour max session
        )

    def get_page(self, page_num: int, page_size: int = 20) -> List[Post]:
        """Get a page from the frozen session feed."""
        start = page_num * page_size
        end = start + page_size
        posts = self.ranked_posts[start:end]
        # Track seen posts
        for post in posts:
            self.seen_post_ids.add(post.id)
        return posts
Instagram shows 'You're All Caught Up' when users have seen all posts from the past 48 hours. This is both a UX feature (promoting healthy usage) and a technical boundary—the feed doesn't try to rank/show posts older than 2 days. Beyond this point, Explore takes over for content discovery.
Pure engagement-based ranking can create degenerate feeds—too many posts from one author, too much of one content type, or filter bubbles that limit exposure. Instagram applies diversity rules after ML ranking to ensure feed quality.
Post-Ranking Rules:
Diversity Re-Ranking Algorithm:
def apply_diversity_rules(ranked_posts: List[RankedPost]) -> List[RankedPost]:
    """
    Apply diversity rules to re-order posts after ML ranking.
    Maintains a sliding window to enforce spacing constraints.
    """
    final_feed = []
    remaining = list(ranked_posts)

    # Track recent authors/topics for spacing
    recent_authors = []  # Last 5 authors
    recent_topics = []   # Last 6 topics (spacing check uses the last 3)

    while remaining and len(final_feed) < MAX_FEED_LENGTH:
        # Find highest-ranked post that satisfies constraints
        for i, post in enumerate(remaining):
            # Check author spacing
            if post.author_id in recent_authors:
                continue
            # Check topic diversity
            if post.primary_topic in recent_topics[-3:]:
                # Allow if high enough rank (don't sacrifice too much quality)
                if i < 3:  # Top 3 ranked remaining
                    pass  # Allow despite topic repeat
                else:
                    continue
            # Post passes constraints - add to feed
            final_feed.append(post)
            remaining.pop(i)
            # Update tracking windows
            recent_authors.append(post.author_id)
            if len(recent_authors) > 5:
                recent_authors.pop(0)
            recent_topics.append(post.primary_topic)
            if len(recent_topics) > 6:
                recent_topics.pop(0)
            break
        else:
            # No post satisfied constraints - take top remaining
            final_feed.append(remaining.pop(0))
    return final_feed
Ad Slot Injection:
Ads are inserted after ranking, not ranked alongside organic content:
def inject_ads(feed: List[Post], user_context: UserContext) -> List[FeedItem]:
    """Insert ads at appropriate intervals."""
    feed_with_ads = []
    ad_interval = calculate_ad_interval(user_context)  # ~6-8 for most users
    for i, post in enumerate(feed):
        feed_with_ads.append(post)
        if (i + 1) % ad_interval == 0:
            ad = fetch_ad_for_position(user_context, position=i)
            if ad:
                feed_with_ads.append(ad)
    return feed_with_ads
Pure engagement optimization might show you 10 posts from your closest friend if that maximizes predicted engagement. But users report preferring diverse feeds even if individual post engagement is slightly lower. Instagram balances short-term engagement with long-term satisfaction through these rules.
The feed generation system is Instagram's core value creator—it turns raw uploads into personalized, engaging experiences. The key learnings: hybrid push/pull fanout keyed on follower count, a multi-stage ranking cascade that spends expensive compute only on promising candidates, layered caching with event-driven invalidation, session-scoped feed snapshots for stable infinite scroll, and real-time updates delivered over WebSockets with a long-polling fallback.
What's Next: Stories Architecture
Stories present unique architectural challenges distinct from the main feed—24-hour ephemeral content, sequential viewing within stories, and the prominent 'story ring' UI—and the next page explores how the architecture handles each.
You now understand how Instagram transforms followed accounts into a personalized, endlessly scrollable feed experience. The patterns here—fanout, ranking cascades, session stability, real-time updates—are foundational to any social content platform. Next, we'll see how Stories adds an ephemeral layer with its own unique challenges.