When a user taps 'Share' on their carefully filtered photo, they expect it to appear instantly in their profile and followers' feeds. What seems like a simple operation—saving an image to the internet—actually triggers one of the most sophisticated media processing systems ever built.
The photo embarks on a journey through upload ingestion, format normalization, filter rendering, multi-resolution generation, content analysis, safety scanning, and distributed storage—all completing within seconds from the user's perspective. Behind this seamless experience lies a pipeline processing over 25,000 photos per second during peak hours, generating petabytes of derived assets daily.
This page dissects Instagram's image processing pipeline—the architectural backbone that transforms raw user uploads into the optimized, analyzed, safely-stored assets that power billions of visual experiences daily.
By the end of this page, you will understand: (1) How Instagram handles upload ingestion including chunked uploads and network resilience, (2) The asynchronous processing workflow that decouples user experience from actual processing, (3) How filters are 'baked' into images rather than applied at render time, (4) Multi-resolution variant generation strategy, (5) ML-based content analysis for safety, accessibility, and recommendations, and (6) Exabyte-scale storage architecture for media assets.
The upload ingestion layer is the first system to receive user photos. Its primary responsibility is to accept uploads reliably over unreliable mobile networks and stage them durably for downstream processing.
The Chunked Upload Pattern:
Mobile networks are inherently unreliable. Users upload from subways, elevators, and areas with spotty coverage. A naive approach where the client uploads the entire photo in a single HTTP request would fail catastrophically:
Instagram uses chunked uploads to solve this problem:
```
// Phase 1: Initiate Upload Session
POST /api/v1/media/upload/initialize
{
    "content_type": "image/jpeg",
    "content_length": 10485760,  // 10MB
    "chunk_size": 524288,        // 512KB chunks
    "client_context": "uuid-abc123",
    "metadata": {
        "device": "iPhone 15 Pro",
        "capture_time": "2024-01-15T14:30:00Z",
        "filter": "clarendon"
    }
}

// Response: Session Created
{
    "upload_id": "upload_xyz789",
    "upload_url": "https://upload.instagram.com/v1/xyz789",
    "chunk_count": 20,
    "session_expires_at": "2024-01-15T15:00:00Z"
}

// Phase 2: Upload Chunks (in parallel or sequence)
PUT /v1/xyz789/chunk/0
Content-Range: bytes 0-524287/10485760
[binary data: 512KB]

PUT /v1/xyz789/chunk/1
Content-Range: bytes 524288-1048575/10485760
[binary data: 512KB]

// ... chunks 2-18 ...

PUT /v1/xyz789/chunk/19
Content-Range: bytes 9961472-10485759/10485760
[binary data: remaining bytes]

// Phase 3: Finalize Upload
POST /v1/xyz789/finalize
{
    "checksum": "sha256:abc123...",
    "caption": "Beautiful sunset! 🌅",
    "location_id": "123456",
    "tagged_users": ["user_id_1", "user_id_2"]
}

// Response: Upload Queued
{
    "media_id": "media_12345",
    "status": "processing",
    "estimated_completion_ms": 3000
}
```

Chunk Upload Mechanics:
| Aspect | Implementation | Why |
|---|---|---|
| Chunk size | 256KB - 1MB | Small enough for quick transmission, large enough for efficiency |
| Parallel uploads | Up to 4 concurrent chunks | Utilizes available bandwidth without congestion |
| Retry logic | 3 retries with exponential backoff | Handles transient failures gracefully |
| Chunk verification | MD5/SHA256 per chunk | Detects corruption during transmission |
| Session timeout | 30 minutes | Allows interrupted uploads to resume |
| Idempotency | Chunk number acts as idempotent key | Duplicate chunk uploads are safely ignored |
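The retry, parallelism, and checksum mechanics in the table can be sketched from the client's side. This is an illustrative asyncio sketch, not Instagram's actual client code; `upload_chunk` is a placeholder for the HTTP PUT with a `Content-Range` header.

```python
import asyncio
import hashlib

CHUNK_SIZE = 512 * 1024   # 512 KB, within the 256 KB - 1 MB range above
MAX_PARALLEL = 4          # up to 4 concurrent chunks
MAX_RETRIES = 3

async def upload_chunk(upload_url: str, index: int, data: bytes) -> None:
    """Placeholder for PUT {upload_url}/chunk/{index} with Content-Range."""
    ...

async def upload_with_retry(upload_url: str, index: int, data: bytes) -> None:
    """Retry a chunk with exponential backoff (1s, 2s, 4s)."""
    for attempt in range(MAX_RETRIES + 1):
        try:
            await upload_chunk(upload_url, index, data)
            return
        except ConnectionError:
            if attempt == MAX_RETRIES:
                raise
            await asyncio.sleep(2 ** attempt)

async def upload_file(upload_url: str, payload: bytes) -> str:
    """Split payload into chunks, upload with bounded parallelism,
    and return the whole-file checksum for the finalize call."""
    chunks = [payload[i:i + CHUNK_SIZE]
              for i in range(0, len(payload), CHUNK_SIZE)]
    sem = asyncio.Semaphore(MAX_PARALLEL)

    async def bounded(i: int, data: bytes) -> None:
        async with sem:
            await upload_with_retry(upload_url, i, data)

    await asyncio.gather(*(bounded(i, c) for i, c in enumerate(chunks)))
    return "sha256:" + hashlib.sha256(payload).hexdigest()
```

The checksum returned here is what a Phase 3 finalize request would carry, letting the server verify the reassembled file end to end.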
Upload Ingestion Servers:
Instagram operates dedicated upload ingestion clusters optimized for receiving large binary payloads:
Upload servers receive chunks and immediately stream them to temporary object storage (S3, GCS, or internal blob store). They don't buffer the entire upload in memory. This allows upload servers to handle high throughput without massive memory requirements. The finalization step then triggers processing from the temporary storage location.
Once an upload is finalized, it enters Instagram's asynchronous processing pipeline. The key insight is that the user doesn't need to wait for full processing to complete—they only need to see their post 'submitted'. The actual processing happens in the background.
The Optimistic UI Pattern:
This optimistic UI approach decouples perceived latency from actual processing time, enabling complex operations while maintaining snappy user experience.
Processing Pipeline Stages:
The processing pipeline is modeled as a directed acyclic graph (DAG) of processing stages. Some stages can run in parallel; others have dependencies:
Stage Dependencies:
| Stage | Depends On | Output | Parallelizable With |
|---|---|---|---|
| Format Decode | Chunk Assembly | Raw pixel data | — |
| EXIF Extraction | Format Decode | Metadata JSON | Orientation Fix, Safety Scan |
| Orientation Fix | Format Decode | Correctly rotated image | EXIF Extraction, Safety Scan |
| Filter Application | Orientation Fix | Filtered image | — |
| Resolution Variants | Filter Application | 5 image sizes | — |
| Safety Scan | Format Decode | Safety classification | EXIF, Orientation |
| Object Detection | Format Decode | Object labels | EXIF, Orientation, Safety |
| Policy Check | Safety Scan | Publish permission | — |
| Feed Fanout | Policy Check, Variants | Follower notifications | — |
Workflow Orchestration:
Instagram uses workflow engines (similar to Temporal, Airflow, or custom solutions) to orchestrate this DAG:
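A toy version of this DAG execution is sketched below, under the assumption that each stage is an async callable. Real workflow engines add durable state, retries, and timeouts on top of this basic dependency-waiting pattern.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Stage = Callable[[], Awaitable[None]]

async def run_dag(stages: Dict[str, Stage],
                  deps: Dict[str, List[str]]) -> List[str]:
    """Run every stage as soon as all of its dependencies have finished.
    Returns the completion order."""
    done: Dict[str, asyncio.Event] = {name: asyncio.Event() for name in stages}
    order: List[str] = []

    async def run(name: str) -> None:
        for d in deps.get(name, []):
            await done[d].wait()     # block until each dependency completes
        await stages[name]()
        order.append(name)
        done[name].set()             # wake up dependents

    await asyncio.gather(*(run(n) for n in stages))
    return order

# Dependencies mirror the stage table: EXIF, orientation, and safety
# all fan out from decode; fanout waits on both policy and variants.
DEPS = {
    "decode": [],
    "exif": ["decode"],
    "orientation": ["decode"],
    "filter": ["orientation"],
    "variants": ["filter"],
    "safety": ["decode"],
    "policy": ["safety"],
    "fanout": ["policy", "variants"],
}
```

Stages with no ordering between them (e.g., safety scan and orientation fix) naturally run concurrently, which is what keeps the end-to-end latency close to the longest path rather than the sum of all stages.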
Instagram targets processing completion within 5 seconds for the P95 of uploads. This includes decoding, filtering, generating all variants, running safety checks, storing to object storage, and priming the CDN. Achieving this at 25K+ uploads/second requires massive parallelization and highly optimized processing code.
Users upload images in countless formats from diverse devices. The processing pipeline must normalize this chaos into a consistent internal representation.
Supported Input Formats:
| Format | Source | Challenges | Handling |
|---|---|---|---|
| JPEG | Most cameras, Android | Varied quality levels, EXIF orientation | Decode, apply orientation, re-encode |
| HEIC/HEIF | iPhone (iOS 11+) | Hardware decoder requirements, licensing | Transcode to JPEG with libheif |
| PNG | Screenshots, graphics | Large files, transparency | Flatten alpha, convert to JPEG |
| WebP | Chrome, Android | Varied support historically | Decode with libwebp |
| GIF | Animations (legacy) | Animation support varies | Extract first frame or convert to video |
| RAW formats | Pro cameras | RAW processing complexity | Generally rejected, guide to use JPEG |
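The normalization rules in the table can be sketched with Pillow. This is a simplified illustration: HEIC additionally requires a decoder plugin (such as pillow-heif) to be registered, which is not shown here.

```python
from io import BytesIO
from PIL import Image

def normalize_to_jpeg(data: bytes, quality: int = 85) -> bytes:
    """Normalize any supported input into a flat RGB JPEG:
    take the first frame of animations, flatten transparency,
    and convert other color modes to RGB."""
    img = Image.open(BytesIO(data))

    if getattr(img, "is_animated", False):
        img.seek(0)  # GIF: extract the first frame

    if img.mode in ("RGBA", "LA", "P"):
        # Flatten alpha onto a white background before dropping it
        rgba = img.convert("RGBA")
        background = Image.new("RGB", rgba.size, (255, 255, 255))
        background.paste(rgba, mask=rgba.split()[-1])
        img = background
    elif img.mode != "RGB":
        img = img.convert("RGB")

    out = BytesIO()
    img.save(out, format="JPEG", quality=quality)
    return out.getvalue()
```

The choice of a white matte for flattened transparency is an assumption for illustration; the real pipeline's matte color and quality settings are not public.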
The Orientation Problem:
One of the most common image display bugs stems from EXIF orientation. Cameras often store images in landscape orientation with an EXIF tag indicating how to rotate for display. Many applications (including earlier Instagram versions) ignore this tag, resulting in sideways or upside-down photos.
Instagram's pipeline explicitly reads the EXIF orientation tag, applies the corresponding rotation or flip to the pixel data, and strips the tag from the output.
This 'bakes' the orientation into the pixel data, ensuring consistent display everywhere.
```python
from PIL import Image

def normalize_orientation(image_path: str) -> Image.Image:
    """
    Load an image and apply EXIF orientation to the pixel data.
    Returns an image in standard orientation.
    (Pillow's ImageOps.exif_transpose offers the same behavior.)
    """
    img = Image.open(image_path)

    # Get EXIF data if present
    exif = img.getexif()
    orientation = exif.get(274)  # 274 is the orientation tag

    # Apply transforms based on orientation value
    # Values 1-8 represent different rotation/flip combinations
    transforms = {
        1: lambda x: x,                                   # Normal
        2: lambda x: x.transpose(Image.FLIP_LEFT_RIGHT),  # Mirrored
        3: lambda x: x.rotate(180),                       # Rotated 180°
        4: lambda x: x.rotate(180).transpose(Image.FLIP_LEFT_RIGHT),
        5: lambda x: x.rotate(-90, expand=True).transpose(Image.FLIP_LEFT_RIGHT),
        6: lambda x: x.rotate(-90, expand=True),          # Rotated 90° CW
        7: lambda x: x.rotate(90, expand=True).transpose(Image.FLIP_LEFT_RIGHT),
        8: lambda x: x.rotate(90, expand=True),           # Rotated 90° CCW
    }

    if orientation in transforms:
        img = transforms[orientation](img)

    # Strip EXIF (including orientation) by rebuilding the image from
    # raw pixel data; a plain copy() would keep the metadata
    clean = Image.new(img.mode, img.size)
    clean.putdata(list(img.getdata()))
    return clean
```

Color Space Handling:
Images come in various color spaces (sRGB, Adobe RGB, Display P3, etc.). For consistent display across devices, the pipeline converts everything to sRGB—applying any embedded ICC profile first—before encoding output variants.
Quality vs. Size Optimization:
Instagram uses adaptive quality encoding: rather than one fixed JPEG quality setting, the encoder chooses a quality level per image, so visually simple images aren't over-allocated bytes and detailed images aren't over-compressed.
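One plausible form of adaptive quality encoding—an assumption for illustration, not Instagram's published algorithm—is to binary-search the JPEG quality setting for the highest value whose output fits a per-variant byte budget:

```python
from io import BytesIO
from PIL import Image

def encode_to_budget(img: Image.Image, max_bytes: int,
                     q_lo: int = 40, q_hi: int = 95) -> bytes:
    """Binary-search JPEG quality for the largest encode that fits
    max_bytes; fall back to q_lo if nothing fits the budget."""
    best = None
    lo, hi = q_lo, q_hi
    while lo <= hi:
        q = (lo + hi) // 2
        buf = BytesIO()
        img.save(buf, format="JPEG", quality=q)
        data = buf.getvalue()
        if len(data) <= max_bytes:
            best, lo = data, q + 1   # fits: try higher quality
        else:
            hi = q - 1               # too big: lower quality
    if best is None:                 # even q_lo exceeds the budget
        buf = BytesIO()
        img.save(buf, format="JPEG", quality=q_lo)
        best = buf.getvalue()
    return best
```

Each probe is a full encode, so this costs a handful of encodes per image; production systems often predict quality from image statistics instead to avoid the search.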
Instagram's filters—Clarendon, Juno, Valencia, and dozens more—are defining features of the platform. Understanding how filters are applied reveals important architectural decisions.
The 'Baking' Approach:
Instagram bakes filters into stored images rather than applying them at render time. This means the filter runs once, during processing, and every stored variant already contains the filtered pixels—clients simply display what they download.
Why Bake Filters?

Baking trades storage for serving-time simplicity: every client on every platform displays identical pixels, no per-view filter computation is needed, and CDN caching works because the stored bytes at a URL never change.
Filter Implementation:
Instagram filters are implemented as chains of image processing operations:
| Operation | Description | Parameters |
|---|---|---|
| Curves | Tone mapping (shadows, midtones, highlights) | RGB curve control points |
| Vignette | Darkening edges | Radius, feather, intensity |
| Saturation | Color intensity adjustment | -100 to +100 |
| Contrast | Tonal range adjustment | -100 to +100 |
| Temperature | Warm/cool shift | Kelvin value |
| Grain | Film-like texture | Intensity, size |
| Fade | Reduced contrast, lifted blacks | Amount |
| Tint | Color overlay | Hue, saturation |
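These operations compose into a chain, and the whole chain can be precomputed into the 3D LUT used at serving time. A minimal sketch, with illustrative ops rather than a real filter recipe:

```python
import numpy as np

def saturation(rgb: np.ndarray, amount: float) -> np.ndarray:
    """amount in [-1, 1]; 0 = unchanged. rgb is float in [0, 1]."""
    gray = rgb.mean(axis=-1, keepdims=True)
    return np.clip(gray + (rgb - gray) * (1 + amount), 0, 1)

def lift_blacks(rgb: np.ndarray, amount: float) -> np.ndarray:
    """'Fade'-style op: raise the black point."""
    return rgb * (1 - amount) + amount

def bake_lut(ops, size: int = 33) -> np.ndarray:
    """Apply an op chain to an identity color grid, producing a
    (size, size, size, 3) LUT with values in [0, 1]."""
    axis = np.linspace(0.0, 1.0, size)
    r, g, b = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.stack([r, g, b], axis=-1)
    for op in ops:
        grid = op(grid)
    return grid

# Bake a hypothetical two-op "filter" into a single LUT
lut = bake_lut([lambda x: saturation(x, 0.3),
                lambda x: lift_blacks(x, 0.05)])
```

Because the LUT is just the filter evaluated on a coarse color grid, any chain of per-pixel color ops collapses into one table, no matter how many operations the chain contains.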
Processing Filters at Scale:
At 25K+ images/second, filter processing must be highly optimized:
```python
import numpy as np

def apply_filter_with_lut(image: np.ndarray, lut_3d: np.ndarray) -> np.ndarray:
    """
    Apply a color grading filter using a 3D LUT lookup.

    3D LUTs precompute color transformations, allowing complex
    color grading to be applied with simple array lookups.

    Args:
        image: RGB image array (H, W, 3) with values 0-255
        lut_3d: 3D lookup table (N, N, N, 3), typically 33x33x33

    Returns:
        Filtered image array
    """
    lut_size = lut_3d.shape[0]

    # Normalize pixel values to LUT index space
    scale = (lut_size - 1) / 255.0
    scaled = image.astype(np.float32) * scale

    # Get integer indices and fractional parts for trilinear interpolation
    indices = np.floor(scaled).astype(np.int32)
    fractions = scaled - indices

    # Clamp indices to valid range
    indices = np.clip(indices, 0, lut_size - 2)

    r, g, b = indices[..., 0], indices[..., 1], indices[..., 2]
    fr, fg, fb = fractions[..., 0], fractions[..., 1], fractions[..., 2]

    # Trilinear interpolation for smooth color mapping
    # (simplified - the full implementation interpolates across
    # the 8 surrounding LUT entries)
    result = lut_3d[r, g, b]

    return result.astype(np.uint8)

# Pre-computed LUT for the "Clarendon" filter (example).
# LUTs are generated offline and loaded at startup;
# load_lut here stands in for a .cube file parser.
CLARENDON_LUT = load_lut("clarendon_33x33x33.cube")

# Application is just a single function call per image
filtered = apply_filter_with_lut(image, CLARENDON_LUT)
```

3D Lookup Tables (LUTs) are the secret to fast filter application. Instead of computing curves, saturation adjustments, and color grading for each pixel, you precompute the transformation for a grid of colors (typically 33×33×33 = ~36K entries) and interpolate. This reduces complex color grading to a single array lookup per pixel.
Instagram serves images on devices ranging from budget Android phones with 720p screens to 4K tablets and high-DPI Retina displays. Serving a single resolution would waste bandwidth on small screens or appear blurry on large ones.
The Variant Strategy:
For each uploaded photo, Instagram generates multiple resolution variants:
| Variant Name | Max Dimension | Use Cases | Typical Size |
|---|---|---|---|
| thumbnail_150 | 150×150px | Grid previews, notifications, search results | 8-15 KB |
| small_320 | 320×320px | Low-bandwidth preview, placeholder | 20-40 KB |
| standard_640 | 640×640px (or aspect preserved) | Feed on medium-DPI devices | 60-120 KB |
| large_1080 | 1080×1080px (or aspect preserved) | Feed on high-DPI, full-screen view | 150-300 KB |
| original_capped | Up to 1440×1440px | Pinch-to-zoom, highest quality | 200-500 KB |
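The variant table above can be sketched as a single generation pass: downscale with Lanczos so the longest side fits each cap, never upscaling. Variant names mirror the table; the quality setting is illustrative.

```python
from io import BytesIO
from PIL import Image

VARIANTS = {
    "thumbnail_150": 150,
    "small_320": 320,
    "standard_640": 640,
    "large_1080": 1080,
    "original_capped": 1440,
}

def generate_variants(img: Image.Image) -> dict:
    """Produce aspect-preserving JPEG variants, one per size cap."""
    out = {}
    for name, cap in VARIANTS.items():
        scale = min(cap / max(img.size), 1.0)  # never upscale
        size = (max(1, round(img.width * scale)),
                max(1, round(img.height * scale)))
        resized = img.resize(size, Image.LANCZOS)
        buf = BytesIO()
        resized.convert("RGB").save(buf, format="JPEG", quality=85)
        out[name] = buf.getvalue()
    return out
```

A real pipeline would additionally center-crop the square grid thumbnail and vary quality per variant, but the aspect-preserving cap logic is the core of the strategy.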
Aspect Ratio Handling:
Instagram supports multiple aspect ratios: square (1:1), portrait (up to 4:5), and landscape (down to 1.91:1).
The pipeline generates variants that preserve the uploaded aspect ratio within allowed bounds. Extreme aspect ratios are cropped to fit within permitted ranges.
Resizing Algorithm Selection:
Not all resizing algorithms are equal. Instagram uses Lanczos resampling (or similar high-quality algorithms) for downscaling:
| Algorithm | Quality | Speed | When Used |
|---|---|---|---|
| Nearest Neighbor | Poor | Fastest | Almost never (too blocky) |
| Bilinear | Fair | Fast | Quick previews, thumbnails |
| Bicubic | Good | Medium | Standard resizing |
| Lanczos | Excellent | Slower | Final variants, quality-critical |
| AI Upscaling | Excellent | Slowest | Potentially for zoom enhancement |
Thumbnail Generation Strategy:
Thumbnails require special handling beyond simple shrinking: aggressive downscaling softens detail, so small variants typically receive a sharpening pass after resizing and benefit from compact formats (note the `thumb_150.webp` key in the storage examples later).
The Lazy Generation Debate:
Should all variants be generated eagerly (at upload) or lazily (on first request)?
| Approach | Tradeoff |
|---|---|
| Eager (Instagram's approach) | Higher processing cost upfront, but guaranteed fast delivery. No cold-start latency for new posts. |
| Lazy with caching | Lower initial cost, but first viewer pays latency penalty. Cache misses create load spikes. |
| Hybrid | Generate most-used variants eagerly (thumb, standard, large), rare variants lazily (ultra-high-res). |
At Instagram's scale, the eager approach wins because the probability of every variant being requested is nearly 100% for popular content. Posts that never get viewed are rare, and the processing cost is amortized across many views.
Generating 5 variants increases storage by roughly 30-50% over storing just the largest version. For 6 PB/day of uploads, this means ~8-9 PB/day of actual storage. However, this is far cheaper than computing variants on-demand—CPU cycles cost more than disk at this scale.
Every uploaded image passes through multiple ML models that analyze its content for safety, accessibility, recommendations, and business purposes. This ML pipeline runs in parallel with image processing to avoid blocking publication.
Analysis Categories: safety (policy enforcement), scene and object classification, accessibility (generated alt text), quality signals, and visual embeddings for recommendations.
Safety Detection Architecture:
Safety is the highest priority ML workload. The system must prevent policy-violating content from ever being published:
| Check | Model Type | Action on Detection | Latency Budget |
|---|---|---|---|
| CSAM detection | Specialized CNN + hash matching | Immediate block, report to NCMEC | <100ms (blocking) |
| Nudity classification | Multi-label classification | Flag for review or auto-block | <200ms (can be async) |
| Violence/gore | Object detection + classification | Flag for review | <200ms |
| Hate symbols | Pattern matching + CNN | Flag for review | <200ms |
| Text policy (OCR) | OCR + NLP classification | Flag for review | <500ms |
Model Serving at Scale:
Running ML inference on 25K+ images/second requires specialized infrastructure:
```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ContentAnalysisResult:
    image_id: str

    # Safety results (blocking)
    is_csam: bool
    csam_hash_match: Optional[str]
    nudity_score: float
    violence_score: float
    policy_violation: Optional[str]

    # Classification (non-blocking)
    scene_labels: List[str]        # ["beach", "sunset", "vacation"]
    objects_detected: List[str]    # ["person", "dog", "surfboard"]
    text_extracted: Optional[str]

    # Accessibility
    alt_text_generated: str

    # Quality signals
    quality_score: float           # 0-1, overall perceived quality
    is_screenshot: bool
    blur_score: float

    # Embeddings for recommendations
    visual_embedding: List[float]  # 512-d or 2048-d feature vector

async def analyze_image(image_bytes: bytes, image_id: str) -> ContentAnalysisResult:
    """
    Run all content analysis models on an uploaded image.
    Safety checks run first and can block publication.
    (decode_image, run_safety_models, etc. are model-serving
    calls defined elsewhere.)
    """
    # Decode image once, share across models
    image_tensor = decode_image(image_bytes)

    # Run safety checks first (can block publication)
    safety_task = asyncio.create_task(run_safety_models(image_tensor))

    # Run other analyses in parallel (non-blocking)
    classification_task = asyncio.create_task(run_classification(image_tensor))
    accessibility_task = asyncio.create_task(generate_alt_text(image_tensor))
    quality_task = asyncio.create_task(assess_quality(image_tensor))
    embedding_task = asyncio.create_task(generate_embedding(image_tensor))

    # Wait for safety first - this is blocking
    safety_result = await safety_task
    if safety_result.is_csam:
        # Immediate escalation, do not publish
        await report_to_ncmec(image_id, image_bytes)
        raise PolicyViolation("CSAM detected")

    # Gather remaining results
    classification, alt_text, quality, embedding = await asyncio.gather(
        classification_task, accessibility_task, quality_task, embedding_task
    )

    return ContentAnalysisResult(
        image_id=image_id,
        **safety_result.__dict__,
        **classification.__dict__,
        alt_text_generated=alt_text,
        **quality.__dict__,
        visual_embedding=embedding,
    )
```

At 2 billion daily uploads, a 0.1% false positive rate means 2 million wrongly flagged photos per day. Each requires human review or causes user frustration. Safety models must balance sensitivity (catching bad content) against specificity (avoiding false accusations). This is why multi-stage review with human-in-the-loop is essential.
With 6+ petabytes of new media stored daily, Instagram requires storage infrastructure that exceeds typical enterprise scale by orders of magnitude. This section explores how media assets are stored, replicated, and accessed.
Storage Requirements:
| Requirement | Value | Implication |
|---|---|---|
| Daily ingest | 6+ PB | Massive write throughput |
| Total storage | 20+ exabytes | At this scale, even small optimizations save petabytes |
| Durability | 11 nines (99.999999999%) | Data must never be lost |
| Availability | 99.99% for reads | Users must always see their photos |
| Read latency | <50ms p50, <200ms p99 | From anywhere in the world |
| Write latency | <1 second | For processing pipeline |
Storage Tiers:
Not all photos are accessed equally. Instagram uses tiered storage to optimize cost:
| Tier | Use Case | Storage Type | Cost | Access Time |
|---|---|---|---|---|
| Hot | Recent photos (<7 days) | SSD-backed object store | $$$$ | <10ms |
| Warm | Popular older photos | HDD-backed object store | $$ | <50ms |
| Cold | Rarely accessed photos | Archive storage (S3 Glacier class) | $ | <1 hour |
| Archive | Compliance/legal hold | Deep archive | ¢ | Hours to days |
Object Naming & Organization:
With trillions of objects, the naming scheme matters enormously:
```
# Object Key Structure
/{region}/{bucket_shard}/{media_id}/{variant}.{format}

# Examples:
/us-east/shard-0042/abc123def456/large_1080.jpg
/us-east/shard-0042/abc123def456/thumb_150.webp
/eu-west/shard-1337/xyz789ghi012/standard_640.jpg
```
Sharding Strategy: media IDs are hashed into a fixed set of bucket shards (e.g., `shard-0042` above), spreading reads and writes evenly and avoiding hot key prefixes.
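A hash-based sharding sketch consistent with the key layout above—the shard count and hash function here are assumptions for illustration:

```python
import hashlib

NUM_SHARDS = 4096  # assumed shard count

def object_key(region: str, media_id: str, variant: str, fmt: str) -> str:
    """Derive a deterministic object key: hashing the media ID spreads
    objects uniformly across bucket shards."""
    digest = hashlib.md5(media_id.encode()).hexdigest()
    shard = int(digest, 16) % NUM_SHARDS
    return f"/{region}/shard-{shard:04d}/{media_id}/{variant}.{fmt}"
```

Because the shard is a pure function of the media ID, any service can compute an object's location without a directory lookup.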
Replication Architecture:
For durability and availability, every object is replicated:
Durability calculation for 3-copy replication:
- Single copy failure rate: 0.1% per year
- 3-copy failure rate (all must fail): (0.001)³ = 10⁻⁹
- 11 nines achieved with additional measures (checksums, scrubbing, cross-region)
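The calculation above, spelled out (assuming independent per-copy failures):

```python
import math

p = 0.001                    # 0.1% annual failure rate per copy
p_loss = p ** 3              # all three independent copies must fail
nines = -math.log10(p_loss)  # "nines" of durability from replication alone
```

Real durability also depends on correlated failures (same rack, same power domain), which is why copies are spread across zones and regions and continuously checksummed and scrubbed.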
Write Path: the processing pipeline writes each variant to the primary region's object store, and replication then propagates copies across zones and regions per the scheme above.
Read Path: clients request images through the CDN; on a cache miss, the edge fetches from the nearest healthy replica and caches the result.
Lifecycle Policies:
| Age | Action |
|---|---|
| 0-7 days | Stay in hot tier (frequent access) |
| 7-90 days | Migrate to warm tier (access patterns stabilize) |
| 90+ days | Candidate for cold tier (based on access frequency) |
| 1+ year, no access | Archive tier |
| Account deleted | Legal hold period, then permanent deletion |
In practice, >90% of all photo views are for content <7 days old. Recent photos get shared, appear in feeds, and drive engagement. Old photos are mostly archival—accessed occasionally for memories or profile browsing. This extreme temporal skew makes tiered storage highly cost-effective.
No matter how fast your object storage is, physics limits speed-of-light latency across continents. A user in Tokyo shouldn't wait 300ms for an image to travel from a US data center. Content Delivery Networks (CDNs) solve this by caching content at edge locations worldwide.
Instagram's CDN Requirements:
| Requirement | Target | Why |
|---|---|---|
| Global coverage | 200+ edge locations | Minimize latency worldwide |
| Cache hit rate | 90% | Reduce origin load and latency |
| Edge capacity | 10+ Tbps aggregate | Handle peak traffic globally |
| Purge latency | <1 minute | Remove deleted content quickly |
| HTTPS everywhere | 100% | Security and privacy |
CDN Architecture:
Edge Caching Strategy:
| Content Type | TTL (Time-to-Live) | Rationale |
|---|---|---|
| Image variants | 1 year | Immutable—content at URL never changes |
| Profile pictures | 24 hours | Rarely changed, but when changed, should update |
| Story images | 24 hours + stale-if-error | Match story expiration |
| Thumbnails | 1 year | Same as images |
CDN Priming (Pre-warming):
When a new photo is published, waiting for the first viewer to trigger CDN caching creates a bad experience for that viewer. Priming proactively pushes content to edge caches—typically the POPs closest to the author's audience—as soon as processing completes.
URL Structure and Caching:
```
# Image URL structure
https://instagram.fcdn.net/v/t51.2885-15/
    media_id/variant/filter_params/quality/
    final_filename.jpg?signature=...

# Key insight: URL contains all parameters
# Same content = Same URL = Cache hit
# Filter or size change = Different URL = Fresh fetch
```
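A hypothetical sketch of such signed, fully-parameterized URLs. The path layout and HMAC scheme here are assumptions; the point is that every rendering parameter is part of the signed string, so identical content always maps to an identical, cacheable URL.

```python
import hashlib
import hmac

SECRET = b"cdn-signing-key"  # illustrative only

def signed_image_url(media_id: str, variant: str) -> str:
    """Build a deterministic, HMAC-signed image URL; the signature
    prevents URL tampering (e.g., requesting arbitrary variants)."""
    path = f"/v/t51.2885-15/{media_id}/{variant}.jpg"
    sig = hmac.new(SECRET, path.encode(), hashlib.sha256).hexdigest()[:16]
    return f"https://instagram.fcdn.net{path}?signature={sig}"
```

Determinism is what makes the CDN effective: two viewers of the same variant request byte-identical URLs, so the second request is an edge cache hit.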
Purging and Invalidation:
When content is deleted (user deletion, policy violation, or DMCA takedown), it must be purged from all CDN edges quickly:
Meta (Instagram's parent company) operates one of the largest CDN infrastructures in the world. While they use commercial CDNs (Akamai, Cloudflare) for some traffic, much of Instagram's media is served through Meta's proprietary CDN infrastructure, including Facebook's extensive network of edge POPs and private peering arrangements with ISPs worldwide.
We've traced the complete journey of an Instagram photo from upload to delivery. The architectural principles that make this pipeline work at planetary scale: decouple perceived latency from actual processing, bake expensive transformations in once at write time, parallelize independent stages, and tier storage and delivery by access pattern.
What's Next: Feed Generation
With photos processed and stored, we turn to the next challenge: feed generation. How does Instagram decide which photos appear in your home feed, in what order? How does it balance recency, engagement, relationship strength, and content type? The feed generation system is where all the content comes together into the personalized experience users see.
You now understand how Instagram transforms raw uploads into the optimized, analyzed, globally-available assets that power billions of photo views daily. The principles here—chunked upload, async processing, baked transformations, tiered storage, and CDN delivery—apply to any large-scale media platform. Next, we'll see how these assets are assembled into personalized feeds.