The Instagram home feed is where users spend most of their time—scrolling through a curated stream of photos, videos, and stories from accounts they follow. What appears to be a simple chronological list is actually the output of one of the most sophisticated content ranking systems in existence.
Consider the challenge: A typical user follows 200-500 accounts. Those accounts collectively produce dozens of new posts per day. Instagram must determine which posts to show, in what order, and when—all while rendering a feed that feels personal, timely, and engaging.
Now multiply this by 500 million daily active users, each requesting their feed multiple times per day. The feed generation system handles 10+ billion feed renders per day, each requiring aggregation from hundreds of sources, ranking through ML models, and delivery in under 200 milliseconds.
By the end of this page, you will understand: (1) The fanout problem and Instagram's hybrid push/pull solution, (2) How feed candidates are generated and ranked using ML, (3) The social graph infrastructure that powers following relationships, (4) Feed caching strategies and invalidation patterns, (5) Real-time feed updates and notification systems, and (6) Pagination and infinite scroll implementation at scale.
The fanout problem is the central architectural challenge in social feed systems. When a user posts a photo, how do we ensure it appears in all their followers' feeds?
There are two fundamental approaches, each with profound trade-offs:
Fanout-on-Write (Push Model):
When a user posts, immediately write to every follower's pre-computed feed cache.
User A posts photo
→ For each follower of A:
→ Append photo to follower's feed cache
→ All followers have updated feeds immediately
Fanout-on-Read (Pull Model):
When a user requests their feed, query all followed accounts and aggregate their recent posts.
User B requests feed
→ For each account B follows:
→ Fetch recent posts from that account
→ Merge and rank all posts
→ Return top N posts as feed
The Celebrity Problem:
Neither approach works universally. Consider these extremes:
| User Type | Followers | Best Approach | Why |
|---|---|---|---|
| Average user | 500 | Fanout-on-write | 500 writes is cheap; fast reads for followers |
| Micro-influencer | 50,000 | Hybrid | Significant but manageable write amplification |
| Major celebrity | 500M | Fanout-on-read | 500M writes per post is catastrophic |
If Cristiano Ronaldo (600M+ followers) used fanout-on-write, a single post would generate 600 million write operations. At 10 posts/day, that's 6 billion writes daily from one user—more than Instagram's total daily post volume.
Instagram uses a hybrid approach with a follower threshold (approximately 10,000-50,000 followers). Accounts below the threshold use fanout-on-write (push to followers). Accounts above use fanout-on-read (pulled at feed generation time). This hybrid optimizes for the common case (regular users) while handling extreme cases (celebrities) gracefully.
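The hybrid routing above can be sketched in a few lines. This is a minimal illustration, not Instagram's actual code: the threshold constant and the helper signatures (`get_followers`, the `feed_cache` dict) are assumptions chosen for clarity.

```python
# Sketch of hybrid fanout routing. PUSH_THRESHOLD and the helpers are
# illustrative assumptions, not Instagram's real API.
PUSH_THRESHOLD = 10_000  # followers; accounts above this are "pull" accounts

def fanout_strategy(follower_count: int) -> str:
    """Choose the fanout strategy for an account at post time."""
    return "push" if follower_count < PUSH_THRESHOLD else "pull"

def fanout_on_post(author_id, post_id, follower_count, get_followers, feed_cache):
    """On a new post: push into follower feed caches for small accounts,
    defer to feed-request time for large ones."""
    if fanout_strategy(follower_count) == "push":
        # Fanout-on-write: bounded number of cheap cache appends
        for follower_id in get_followers(author_id):
            feed_cache.setdefault(follower_id, []).append(post_id)
        return "pushed"
    # Fanout-on-read: nothing to do now; the post is merged in when
    # each follower next requests their feed
    return "deferred"
```

The key property: write cost is bounded by the threshold for push accounts, while celebrity posts cost nothing at write time and are amortized across read requests.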
At the heart of feed generation is the social graph—the data structure representing who follows whom. This graph is massive and accessed on nearly every operation.
Graph Scale:
| Metric | Estimated Value |
|---|---|
| Nodes (users) | 2+ billion |
| Edges (follow relationships) | 100+ billion |
| Average following per user | ~200 |
| Average followers per user | ~200 (but highly skewed) |
| Daily edge changes (follow/unfollow) | 100+ million |
Graph Operations Needed: fetch a user's following list, fetch an account's followers, check whether A follows B, and read follower/following counts—each at very high request rates, since nearly every feed render touches the graph.
Graph Storage Strategies:
Storing 100+ billion edges efficiently requires specialized infrastructure:
Option 1: Adjacency List in Key-Value Store
# Store outgoing edges (following)
Key: user_id:following
Value: [followed_user_id_1, followed_user_id_2, ...]
# Store incoming edges (followers)
Key: user_id:followers
Value: [follower_id_1, follower_id_2, ...]
Option 2: Edge Table in Relational/Wide-Column DB
CREATE TABLE follows (
follower_id BIGINT,
followed_id BIGINT,
created_at TIMESTAMP,
PRIMARY KEY (follower_id, followed_id)
);
-- Index for reverse lookup
CREATE INDEX idx_followed ON follows(followed_id, follower_id);
Option 3: Graph Database
// Neo4j style
(:User {id: 'A'})-[:FOLLOWS]->(:User {id: 'B'})
Instagram's Approach: TAO (The Associations and Objects)
Meta (Facebook/Instagram) uses TAO, a purpose-built graph storage system optimized for social graph access patterns:
| TAO Feature | Benefit |
|---|---|
| Distributed cache layer | Sub-millisecond reads for hot data |
| Persistent storage tier (MySQL) | Durability with replication |
| Edge pagination | Efficient handling of celebrity-scale follower lists |
| Edge counts caching | Follower counts without scanning lists |
| Bidirectional edges | Following stored in both directions for fast lookups |
| Eventual consistency | Trades strict consistency for availability |
Graph Sharding:
With billions of users, the graph must be sharded across many machines:
# Shard by user_id hash
Shard(user_id) = hash(user_id) % num_shards
# User A's following list → Shard for A
# User B's followers list → Shard for B
# Problem: Following query (A follows B) might span two shards
# Solution: Store edges in both shards (write amplification for read efficiency)
Storing each edge twice (once in follower's shard, once in followed's shard) doubles write cost but ensures any relationship query touches only one shard. At Instagram's scale, the read/write ratio strongly favors this trade-off—edges are read billions of times but written once.
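The double-write scheme can be illustrated with an in-memory sketch. The shard count, data layout, and class name below are assumptions for demonstration; production systems use consistent hashing and persistent storage, not a list of dicts.

```python
import zlib

NUM_SHARDS = 1024  # illustrative shard count

def shard_for(user_id: str) -> int:
    # Deterministic hash so routing is stable across processes
    return zlib.crc32(user_id.encode()) % NUM_SHARDS

class ShardedEdgeStore:
    """Toy model of double-written edges: each follow edge is stored once in
    the follower's shard and once in the followed user's shard, so both
    'who does A follow?' and 'who follows B?' are single-shard reads."""

    def __init__(self):
        self.shards = [{"following": {}, "followers": {}} for _ in range(NUM_SHARDS)]

    def add_follow(self, follower: str, followed: str):
        # Write 1: follower's shard answers "who does A follow?"
        s1 = self.shards[shard_for(follower)]
        s1["following"].setdefault(follower, set()).add(followed)
        # Write 2: followed user's shard answers "who follows B?"
        s2 = self.shards[shard_for(followed)]
        s2["followers"].setdefault(followed, set()).add(follower)

    def is_following(self, follower: str, followed: str) -> bool:
        # Single-shard read thanks to the double write
        s = self.shards[shard_for(follower)]
        return followed in s["following"].get(follower, set())
```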
When a user requests their home feed, the system must determine which posts are candidates for inclusion. This happens in stages:
Stage 1: Identify Content Sources
User requests feed
→ Get list of followed accounts
→ Separate into 'push' accounts (regular users) and 'pull' accounts (celebrities)
→ Identify other content sources (ads, suggested posts, etc.)
Stage 2: Fetch Pre-pushed Content
For regular accounts (fanout-on-write), the user has a pre-computed feed cache:
Feed cache structure:
{
  user_id: "viewer_123",
  posts: [
    {post_id: "abc", author_id: "456", timestamp: 1704300000, score_features: {...}},
    {post_id: "def", author_id: "789", timestamp: 1704290000, score_features: {...}},
    // ... up to ~1000 recent posts
  ],
  last_updated: 1704310000
}
Stage 3: Pull Celebrity Content
For high-follower accounts, fetch their recent posts at request time:
async def fetch_celebrity_posts(user_id: str, celebrity_ids: List[str]) -> List[Post]:
    """
    Fetch recent posts from celebrity accounts the user follows.
    Runs at feed-request time for accounts above the push threshold.
    """
    # Parallel fetch from all celebrity accounts
    tasks = [fetch_recent_posts(celeb_id, limit=20) for celeb_id in celebrity_ids]
    results = await asyncio.gather(*tasks)

    # Flatten and deduplicate
    all_posts = []
    seen_post_ids = set()
    for posts in results:
        for post in posts:
            if post.id not in seen_post_ids:
                all_posts.append(post)
                seen_post_ids.add(post.id)
    return all_posts
Stage 4: Merge Candidate Pools
Candidate pool sources:
├── Pre-pushed feed items (from fanout-on-write)
├── Pulled celebrity posts (from fanout-on-read)
├── Suggested posts (from recommendation system)
├── Ads (from advertising system)
├── Reshared content (friends reshared to Stories)
└── Cross-posted content (e.g., Reels that appear in feed)
Merge all into single candidate pool → 500-2000 candidates
The candidate pool is intentionally larger than what will be shown. A typical feed load shows 10-20 posts, but ranking considers 500-2000 candidates. This allows the ranking system to select the optimal posts, not just the most recent ones. Over-generating candidates ensures the final feed is high-quality.
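The merge step itself is straightforward deduplication across sources. A minimal sketch, assuming each candidate is a `(post_id, timestamp)` pair and that a seen-post filter is applied at merge time (both simplifications):

```python
def merge_candidates(pushed, pulled, suggested, seen_ids=frozenset()):
    """Merge candidate sources into one deduplicated pool, dropping
    already-seen posts. Candidates are (post_id, timestamp) pairs here."""
    pool, seen = [], set(seen_ids)
    for source in (pushed, pulled, suggested):  # source order doesn't matter pre-ranking
        for post_id, ts in source:
            if post_id not in seen:
                pool.append((post_id, ts))
                seen.add(post_id)
    # Newest-first is only a starting order; the ML ranker reorders later
    pool.sort(key=lambda p: p[1], reverse=True)
    return pool
```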
Once candidates are assembled, the ranking system determines the order in which posts appear. Modern Instagram feeds are heavily optimized by machine learning models that predict which posts each user will find most engaging.
Ranking Objectives:
The ranking system optimizes for multiple outcomes simultaneously:
| Objective | Signal | Weight Consideration |
|---|---|---|
| Engagement | Predicted P(like), P(comment), P(share), P(save) | Primary signal |
| Time spent | Predicted view duration | Engagement depth |
| Relationship | Historical interaction with author | Social connection |
| Recency | Time since post creation | Timeliness |
| Content diversity | Content type, topic variety | Avoid repetition |
| Creator balance | Distribution across followed accounts | Fairness |
| User satisfaction | Long-term retention signals | Avoid addictive patterns |
The Ranking Pipeline:
Multi-Stage Ranking:
Ranking uses a cascade of increasingly sophisticated (and expensive) models:
| Stage | Candidates In | Candidates Out | Model | Latency Budget |
|---|---|---|---|---|
| Candidate fetch | 0 | ~2000 | N/A | 50ms |
| First-pass ranking | ~2000 | ~200 | Lightweight NN or GBDT | 20ms |
| Second-pass ranking | ~200 | ~50 | Heavy transformer/NN | 30ms |
| Business rules | ~50 | ~30 | Rules engine | 5ms |
| Final ordering | ~30 | ~20 | Diversity/position rules | 5ms |
| Total | ~2000 | ~20 | — | <150ms |
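The cascade pattern in the table can be sketched generically: a cheap scoring function prunes the pool so the expensive model only runs on survivors. The scoring callables below are stand-ins for the real lightweight and heavy models.

```python
def cascade_rank(candidates, cheap_score, heavy_score,
                 first_pass_keep=200, second_pass_keep=50):
    """Two-stage ranking cascade: a cheap model prunes, an expensive
    model refines. Scoring functions are stand-ins for real models."""
    # Stage 1: score everything with the cheap model, keep the top slice
    survivors = sorted(candidates, key=cheap_score, reverse=True)[:first_pass_keep]
    # Stage 2: spend the heavy model's latency budget only on survivors
    return sorted(survivors, key=heavy_score, reverse=True)[:second_pass_keep]
```

The economics are the point: if the heavy model costs 100x the cheap one, scoring 200 survivors instead of 2000 candidates cuts heavy-model compute by 10x while rarely losing a post the heavy model would have ranked highly.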
Feature Categories for Ranking:
from dataclasses import dataclass
from typing import List, Dict
import numpy as np

@dataclass
class RankingFeatures:
    # Post features
    post_age_hours: float
    content_type: str  # 'photo', 'video', 'carousel'
    has_location: bool
    caption_length: int
    hashtag_count: int

    # Author features
    author_follower_count: int
    author_post_frequency: float  # posts per week
    author_avg_engagement_rate: float
    is_verified: bool

    # User-author interaction features
    user_liked_author_count_7d: int
    user_commented_author_count_7d: int
    user_viewed_author_profile_7d: bool
    days_since_last_interaction: float

    # Content features (from ML embeddings)
    visual_embedding: List[float]   # 512-d
    caption_embedding: List[float]  # 768-d

    # Contextual features
    hour_of_day: int
    day_of_week: int
    posts_seen_this_session: int

def compute_ranking_features(
    user_id: str,
    post: Post,
    user_context: UserContext,
    interaction_history: InteractionHistory,
) -> RankingFeatures:
    """
    Compute all features needed for ranking a single post for a user.
    In production, this is heavily optimized with batch lookups and caching.
    """
    return RankingFeatures(
        # Post features
        post_age_hours=(time.now() - post.created_at).total_seconds() / 3600,
        content_type=post.content_type,
        has_location=post.location_id is not None,
        caption_length=len(post.caption),
        hashtag_count=count_hashtags(post.caption),
        # Author features (cached per author)
        author_follower_count=get_cached_follower_count(post.author_id),
        author_post_frequency=get_posting_frequency(post.author_id),
        author_avg_engagement_rate=get_engagement_rate(post.author_id),
        is_verified=is_verified(post.author_id),
        # Interaction features
        user_liked_author_count_7d=interaction_history.likes_to_author_7d(post.author_id),
        user_commented_author_count_7d=interaction_history.comments_to_author_7d(post.author_id),
        user_viewed_author_profile_7d=interaction_history.profile_view_7d(post.author_id),
        days_since_last_interaction=interaction_history.days_since_interaction(post.author_id),
        # Content features (pre-computed during upload)
        visual_embedding=post.visual_embedding,
        caption_embedding=get_text_embedding(post.caption),
        # Context
        hour_of_day=user_context.local_hour,
        day_of_week=user_context.day_of_week,
        posts_seen_this_session=user_context.session_post_count,
    )

The ranking model itself runs in milliseconds, but computing features for ~1000 candidates is expensive. Optimizations include: batch fetching from caches, pre-computing features at post time, using approximate features when exact ones are too slow, and aggressive caching of user interaction histories.
With billions of feed requests daily, aggressive caching is essential. However, feed caching is complex because feeds are personalized, time-sensitive, and constantly changing.
What Can Be Cached?
| Component | Cacheability | Cache Strategy |
|---|---|---|
| Feed candidate list | Per-user, short TTL | Cache 5-15 minutes, invalidate on new post |
| Ranked feed | Per-user, very short TTL | Cache 1-5 minutes, personalization changes |
| Post content | Global, long TTL | Cache for hours/days, immutable content |
| User metadata | Global, medium TTL | Cache 15-60 minutes, changes infrequently |
| Engagement counts | Global, short TTL | Cache 1-5 minutes, eventual consistency OK |
| Social graph | Per-user, medium TTL | Cache 30 minutes, invalidate on follow/unfollow |
Feed Cache Architecture:
Cache Invalidation Strategies:
The classic challenge: "When do we invalidate the cache?"
1. Time-based Expiration (TTL):
# Simple but can serve stale content
feed_cache.set(user_id, feed_data, ttl=300) # 5-minute TTL
2. Event-driven Invalidation:
# When followed account posts
def on_new_post(author_id: str, post_id: str):
    # Get all followers who have cached feeds
    followers = get_followers(author_id)
    for follower_id in followers:
        # Invalidate or update their feed cache
        feed_cache.invalidate(follower_id)
        # Or: Append new post to cached feed (lazy update)
        feed_cache.append_post(follower_id, post_id)

# When user follows/unfollows
def on_follow_change(user_id: str, target_id: str):
    feed_cache.invalidate(user_id)  # Feed sources changed
3. Hybrid: TTL + Incremental Updates:
# Cache stores feed with version number
cached_feed = {
    'version': 123,
    'posts': [...],
    'last_update': timestamp
}

# On request, check for updates since last version
def get_feed_with_refresh(user_id: str):
    cached = feed_cache.get(user_id)
    if cached:
        new_posts = get_posts_since(user_id, cached['last_update'])
        if new_posts:
            # Merge new posts into cached feed and re-rank
            merged = merge_and_rank(cached['posts'], new_posts)
            cached['posts'] = merged
            cached['version'] += 1
            cached['last_update'] = time.now()  # so the next refresh fetches only newer posts
            feed_cache.set(user_id, cached)
        return cached['posts']
    else:
        # Full feed generation
        return generate_full_feed(user_id)
When a celebrity with 100M followers posts, invalidating 100M feed caches simultaneously creates a thundering herd of cache misses. Solutions include: staggered invalidation (spread over seconds), lazy invalidation (invalidate on next request), and pull-based updates (don't push-invalidate for celebrities at all).
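The staggered-invalidation idea reduces to assigning each follower a random delay inside a window, so cache misses arrive as a trickle rather than a spike. A minimal sketch; the window size and function name are illustrative:

```python
import random

def staggered_invalidation_delays(follower_ids, window_seconds=30.0, seed=None):
    """Spread cache invalidations over a time window to avoid a thundering
    herd. Returns (follower_id, delay) pairs; a scheduler would enqueue
    each invalidation to fire after its delay."""
    rng = random.Random(seed)  # seedable for reproducible tests
    return [(fid, rng.uniform(0.0, window_seconds)) for fid in follower_ids]
```

With a 30-second window, 100M invalidations average ~3.3M per second instead of 100M at once—still heavy, which is why celebrity posts usually skip push-invalidation entirely in favor of pull.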
Modern users expect feeds to update in real-time. When a friend posts while you're scrolling, you expect to see "New posts available" appear. This requires push infrastructure that can reach hundreds of millions of concurrent users.
Real-Time Update Channels:
| Update Type | Delivery Method | Latency Target |
|---|---|---|
| New post indicator | Push notification or in-app pill | <10 seconds |
| Live engagement counts | WebSocket streaming | <2 seconds |
| Story ring update | Background refresh + push | <30 seconds |
| Comment thread | Real-time sync | <1 second |
| Like feedback | Optimistic UI + confirmation | Instant (<100ms) |
Push Notification Architecture:
from dataclasses import dataclass
from enum import Enum
from typing import List, Set

class UpdateType(Enum):
    NEW_POST = "new_post"
    ENGAGEMENT_UPDATE = "engagement"
    STORY_UPDATE = "story"

@dataclass
class FeedUpdate:
    update_type: UpdateType
    payload: dict
    target_users: Set[str]

class RealTimeFeedService:
    """
    Manages real-time feed updates to connected clients.
    """
    def __init__(self, connection_manager, message_queue):
        self.connections = connection_manager  # Tracks WebSocket connections
        self.queue = message_queue  # Kafka/SQS for durability

    async def on_new_post(self, author_id: str, post_id: str):
        """
        Called when a user publishes a new post.
        Notifies followers who are currently active.
        """
        # Get followers who are currently online
        all_followers = await get_followers(author_id)
        online_followers = await self.connections.filter_online(all_followers)

        # For online users, send real-time notification
        if len(online_followers) < 50000:  # Regular users
            update = FeedUpdate(
                update_type=UpdateType.NEW_POST,
                payload={
                    "author_id": author_id,
                    "post_id": post_id,
                    "preview_url": get_preview_url(post_id),
                    "author_name": get_username(author_id),
                },
                target_users=set(online_followers)
            )
            await self.push_update(update)
        else:
            # Celebrity post - don't push directly, let clients poll
            # or use progressively expanding fanout
            await self.queue_delayed_push(author_id, post_id, online_followers)

    async def push_update(self, update: FeedUpdate):
        """
        Push update to all target users via WebSocket.
        """
        for user_id in update.target_users:
            connection = self.connections.get(user_id)
            if connection:
                await connection.send_json({
                    "type": update.update_type.value,
                    "data": update.payload
                })

# Client-side handling
# When client receives NEW_POST notification:
# 1. Show "New posts available" pill at top of feed
# 2. On tap, scroll to top and refresh feed
# 3. OR: Auto-refresh if at top of feed and not actively scrolling

WebSocket Connection Management:
Maintaining persistent connections to 100+ million concurrent users is a significant infrastructure challenge:
| Challenge | Solution |
|---|---|
| Connection count | Millions of connections per server using epoll/kqueue |
| Connection routing | Consistent hashing to route user to specific server |
| Connection migration | Graceful handoff during deployments |
| Mobile connection stability | Automatic reconnection with exponential backoff |
| Battery impact | Batch updates, reduce heartbeat frequency |
| Presence tracking | Distributed presence service (who's online) |
Long-Polling Fallback:
For clients that can't maintain WebSocket connections (firewalls, proxies, old devices):
# Long-polling endpoint
GET /api/v1/feed/updates?since=<timestamp>&timeout=30
# Server holds connection open for up to 30 seconds
# Returns immediately if updates available
# Returns empty after timeout if no updates
# Client immediately reconnects after each response
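The client side of the long-poll protocol above is a tight request loop. A minimal sketch where `fetch_updates` stands in for issuing `GET /api/v1/feed/updates` (the callable signature is an assumption for illustration):

```python
def long_poll_loop(fetch_updates, handle, max_iterations=3):
    """Client-side long-poll loop: each call blocks server-side up to the
    timeout, then the client immediately re-issues the request.
    fetch_updates(since, timeout) -> (updates, server_time) stands in for
    the HTTP call; real clients loop until shutdown, not max_iterations."""
    since = 0.0
    for _ in range(max_iterations):
        updates, server_time = fetch_updates(since=since, timeout=30)
        for u in updates:
            handle(u)
        # Advance the watermark so the next poll asks only for newer updates
        since = server_time
    return since
```

Using the server-returned timestamp as the next `since` value (rather than the client clock) avoids missing updates due to clock skew.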
Mobile apps must balance real-time updates against battery drain and data usage. Instagram intelligently reduces update frequency when battery is low, user is on cellular, or app is backgrounded. The 'New posts' indicator is a UX compromise—it's not instant updates, but it's battery-efficient and still feels responsive.
Instagram's feed is an infinite scroll experience—users continuously scroll and new content loads seamlessly. Implementing this correctly is more complex than it appears.
The Pagination Challenge:
Traditional offset-based pagination fails for dynamic feeds:
-- Offset pagination: Simple but broken for feeds
SELECT * FROM posts ORDER BY rank DESC LIMIT 20 OFFSET 40;
-- Problem: If new posts are inserted or rankings change,
-- the user may see duplicate posts or miss posts entirely
Cursor-based Pagination:
Instagram uses cursor-based (keyset) pagination for stable scrolling:
# Cursor encodes the position in the feed
# Using last-seen post ID and its rank score

@dataclass
class FeedCursor:
    user_id: str
    last_post_id: str
    last_rank_score: float
    page_number: int
    session_id: str  # For analytics

def get_feed_page(cursor: FeedCursor) -> Tuple[List[Post], FeedCursor]:
    """
    Get next page of feed starting after the cursor position.
    """
    # Fetch posts ranked after the cursor
    posts = db.query(
        """SELECT * FROM ranked_feed
           WHERE user_id = :user_id
             AND (rank_score, post_id) < (:last_score, :last_post_id)
           ORDER BY rank_score DESC, post_id DESC
           LIMIT 20""",
        user_id=cursor.user_id,
        last_score=cursor.last_rank_score,
        last_post_id=cursor.last_post_id
    )

    # Create cursor for next page
    if posts:
        next_cursor = FeedCursor(
            user_id=cursor.user_id,
            last_post_id=posts[-1].id,
            last_rank_score=posts[-1].rank_score,
            page_number=cursor.page_number + 1,
            session_id=cursor.session_id
        )
    else:
        next_cursor = None
    return posts, next_cursor
Feed Session Stability:
A user's feed should remain stable during a scrolling session, even if rankings change:
| Scenario | Without Session Stability | With Session Stability |
|---|---|---|
| New post appears | May insert mid-scroll, causing confusion | Held until session refresh |
| Ranking update | Cards reorder unexpectedly | Order frozen within session |
| Seen post | May reappear if re-ranked | Filtered out for session duration |
| Deleted post | Card disappears mid-scroll | Card removed gracefully |
Implementation: Session-Scoped Feed Snapshot:
class FeedSession:
    """
    Represents a single scrolling session with stable feed content.
    """
    def __init__(self, user_id: str):
        self.session_id = generate_uuid()
        self.user_id = user_id
        self.created_at = time.now()
        self.seen_post_ids: Set[str] = set()
        self.ranked_posts: List[RankedPost] = []

    def initialize(self):
        """Generate and cache the ranked feed for this session."""
        candidates = fetch_candidates(self.user_id)
        self.ranked_posts = rank_candidates(self.user_id, candidates)
        # Store in Redis with TTL
        cache.set(
            f"feed_session:{self.session_id}",
            self.serialize(),
            ttl=3600  # 1 hour max session
        )

    def get_page(self, page_num: int, page_size: int = 20) -> List[Post]:
        """Get a page from the frozen session feed."""
        start = page_num * page_size
        end = start + page_size
        posts = self.ranked_posts[start:end]
        # Track seen posts
        for post in posts:
            self.seen_post_ids.add(post.id)
        return posts
Instagram shows 'You're All Caught Up' when users have seen all posts from the past 48 hours. This is both a UX feature (promoting healthy usage) and a technical boundary—the feed doesn't try to rank/show posts older than 2 days. Beyond this point, Explore takes over for content discovery.
Pure engagement-based ranking can create degenerate feeds—too many posts from one author, too much of one content type, or filter bubbles that limit exposure. Instagram applies diversity rules after ML ranking to ensure feed quality.
Post-Ranking Rules:
Diversity Re-Ranking Algorithm:
def apply_diversity_rules(ranked_posts: List[RankedPost]) -> List[RankedPost]:
    """
    Apply diversity rules to re-order posts after ML ranking.
    Maintains a sliding window to enforce spacing constraints.
    """
    final_feed = []
    remaining = list(ranked_posts)

    # Track recent authors/topics for spacing
    recent_authors = []  # Last 5 authors
    recent_topics = []   # Last 6 topics (spacing check uses the last 3)

    while remaining and len(final_feed) < MAX_FEED_LENGTH:
        # Find highest-ranked post that satisfies constraints
        for i, post in enumerate(remaining):
            # Check author spacing
            if post.author_id in recent_authors:
                continue
            # Check topic diversity
            if post.primary_topic in recent_topics[-3:]:
                # Allow if high enough rank (don't sacrifice too much quality)
                if i < 3:  # Top 3 ranked remaining
                    pass  # Allow despite topic repeat
                else:
                    continue
            # Post passes constraints - add to feed
            final_feed.append(post)
            remaining.pop(i)
            # Update tracking windows
            recent_authors.append(post.author_id)
            if len(recent_authors) > 5:
                recent_authors.pop(0)
            recent_topics.append(post.primary_topic)
            if len(recent_topics) > 6:
                recent_topics.pop(0)
            break
        else:
            # No post satisfied constraints - take top remaining
            final_feed.append(remaining.pop(0))
    return final_feed
Ad Slot Injection:
Ads are inserted after ranking, not ranked alongside organic content:
def inject_ads(feed: List[Post], user_context: UserContext) -> List[FeedItem]:
    """Insert ads at appropriate intervals."""
    feed_with_ads = []
    ad_interval = calculate_ad_interval(user_context)  # ~6-8 for most users
    for i, post in enumerate(feed):
        feed_with_ads.append(post)
        if (i + 1) % ad_interval == 0:
            ad = fetch_ad_for_position(user_context, position=i)
            if ad:
                feed_with_ads.append(ad)
    return feed_with_ads
Pure engagement optimization might show you 10 posts from your closest friend if that maximizes predicted engagement. But users report preferring diverse feeds even if individual post engagement is slightly lower. Instagram balances short-term engagement with long-term satisfaction through these rules.
The feed generation system is Instagram's core value creator—it turns raw uploads into personalized, engaging experiences. The key learnings: hybrid push/pull fanout keyed on follower count, a multi-stage ranking cascade that spends expensive compute only on promising candidates, layered caching with event-driven invalidation, session-scoped feed snapshots for stable infinite scroll, and real-time updates delivered over WebSockets with a long-polling fallback.
What's Next: Stories Architecture
Stories present unique architectural challenges distinct from the main feed—24-hour ephemeral content, sequential viewing within stories, and the prominent 'story ring' UI—and the next page explores how the architecture handles each.
You now understand how Instagram transforms followed accounts into a personalized, endlessly scrollable feed experience. The patterns here—fanout, ranking cascades, session stability, real-time updates—are foundational to any social content platform. Next, we'll see how Stories adds an ephemeral layer with its own unique challenges.