When a user clicks a shortened URL, they expect instantaneous redirection. Tens of milliseconds of added delay are noticeable; a few hundred milliseconds makes the service feel broken.
Consider the redirect path:

1. The user clicks the short link.
2. DNS resolves the short domain.
3. A TCP connection is established.
4. The TLS handshake completes and the request reaches our service.
5. Our service looks up the short code and returns a 301/302 response with the long URL.
6. The redirect response travels back to the client.
7. The browser follows the redirect to the destination.
Steps 2-4 and 6-7 are network overhead (100-400ms typically). Our goal is to make step 5—our processing—essentially invisible at under 10ms for cached lookups and under 50ms worst-case.
At 1 billion redirects per day (roughly 12,000 per second on average, 50,000+ per second at peak), this latency target requires sophisticated caching, global distribution, and optimized data paths.
By the end of this page, you will master multi-layer caching strategies (CDN → in-memory → distributed cache → database), understand database selection and optimization for redirect workloads, and design global distribution patterns for consistent sub-50ms latency worldwide.
Before optimizing, we must understand where time is spent in a redirect request. Let's trace a request through the system:
```
Redirect Request Latency Breakdown
==================================

[Client] → [DNS] → [CDN Edge] → [Load Balancer] → [App Server] → [Cache] → [DB]

Component Latency (typical):
────────────────────────────
DNS Lookup:              1-50ms    (cached: 0ms, cold: 50ms+)
CDN Edge:                1-5ms     (regional, very fast)
TLS Handshake:           10-50ms   (session resumption helps)
Load Balancer:           0.5-2ms   (minimal overhead)
App Server Processing:   1-10ms    (parse request, logic)
Cache Lookup (Redis):    0.5-2ms   (same region)
Database Lookup:         2-20ms    (depends on indexing)

TOTAL (cache hit):       15-70ms   (dominated by network)
TOTAL (cache miss):      25-100ms  (+ database latency)

What we control directly:
- App Server Processing: 1-10ms
- Cache Lookup:          0.5-2ms
- Database Lookup:       2-20ms
─────────────────────────────
Our Budget: <50ms total, target <10ms for cache hit
```

Average latency is misleading. What matters is the tail latency—how slow the slowest requests are:
| Percentile | Target | What It Means | At 1B Requests/Day |
|---|---|---|---|
| P50 (median) | <10ms | Half of requests under 10ms | 500M requests under 10ms |
| P90 | <25ms | 90% of requests under 25ms | 100M may exceed 25ms |
| P99 | <50ms | 99% under 50ms | 10M may exceed 50ms |
| P99.9 | <100ms | 99.9% under 100ms | 1M may exceed 100ms |
| P99.99 | <500ms | Avoid timeouts | 100K may be slow |
At 1B requests per day, even the P99.99 tail (500ms) affects 100,000 requests daily, and the users behind those requests will perceive the service as broken. Tail latency optimization is critical—and often harder than median optimization because it means eliminating edge cases, garbage collection pauses, and cache misses.
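As a quick illustration of how such figures are read off real traffic, the sketch below computes nearest-rank percentiles from a window of observed latencies (the sample values and function name are purely illustrative, not part of the system described here):

```typescript
// Nearest-rank percentile: the smallest observed value such that at least
// p% of observations are less than or equal to it.
function percentile(latenciesMs: number[], p: number): number {
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[Math.min(rank, sorted.length) - 1];
}

const sampleMs = [3, 4, 4, 5, 6, 7, 9, 12, 18, 48, 95, 210]; // one monitoring window
console.log(`P50: ${percentile(sampleMs, 50)}ms`); // typical request
console.log(`P99: ${percentile(sampleMs, 99)}ms`); // the tail users complain about
```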
URL shortener redirects are perfectly cacheable: the mapping from short code to long URL rarely changes. We exploit this with aggressive, multi-layer caching.
```
Multi-Layer Cache Architecture
==============================

┌────────────────────────────────────────────────┐
│ LAYER 1: CDN EDGE CACHE                       │
│ • 200+ global edge locations                  │
│ • Latency: 1-5ms                              │
│ • Capacity: Effectively unlimited             │
│ • TTL: 24 hours (popular URLs always cached)  │
│ • Hit rate target: 60-70% of all requests     │
└────────────────────────────────────────────────┘
                    ↓ cache miss
┌────────────────────────────────────────────────┐
│ LAYER 2: LOCAL IN-MEMORY CACHE                │
│ • Per-server LRU cache (HashMap)              │
│ • Latency: <0.1ms (memory access)             │
│ • Capacity: 1-5GB per server (10-50M entries) │
│ • TTL: 10 minutes                             │
│ • Hit rate target: 20-30% of CDN misses       │
└────────────────────────────────────────────────┘
                    ↓ cache miss
┌────────────────────────────────────────────────┐
│ LAYER 3: DISTRIBUTED REDIS CACHE              │
│ • Redis Cluster across region                 │
│ • Latency: 0.5-2ms                            │
│ • Capacity: 100GB+ per region                 │
│ • TTL: 1 hour                                 │
│ • Hit rate target: 95%+ of local cache misses │
└────────────────────────────────────────────────┘
                    ↓ cache miss
┌────────────────────────────────────────────────┐
│ LAYER 4: DATABASE (Source)                    │
│ • Primary database or read replica            │
│ • Latency: 2-20ms                             │
│ • Contains all URL mappings                   │
│ • Target: <5% of total requests reach DB      │
└────────────────────────────────────────────────┘
```

Content Delivery Networks (CDNs) like Cloudflare, AWS CloudFront, or Fastly can cache redirect responses at edge locations worldwide:
```typescript
// CDN Cache Configuration for Redirects

// HTTP Response Headers for Caching
const redirectResponse = {
  statusCode: 302, // Temporary redirect (allows analytics tracking)
  headers: {
    'Location': longUrl,

    // CDN Cache Control
    'Cache-Control': 'public, max-age=86400, s-maxage=86400',
    // public:   CDN can cache
    // max-age:  browser cache for 24h
    // s-maxage: CDN cache for 24h (overrides max-age for CDN)

    // CDN-specific headers
    'CDN-Cache-Control': 'public, max-age=86400',
    'Surrogate-Control': 'max-age=86400', // Fastly/Varnish

    // Cache key variants (cache per-URL only, not per-user)
    'Vary': 'Accept-Encoding', // Only vary on encoding, not cookies

    // Analytics bypass hint
    'X-Cache-Status': 'origin', // Track origin vs edge hits
  }
};

// CDN Edge Configuration (conceptual - varies by provider)
const cdnConfig = {
  // Cache based only on path (not query params or headers)
  cacheKeyRules: {
    includeQueryString: false, // /abc123?ref=twitter same as /abc123
    includeHost: true,         // Different domains cached separately
    includeCookies: false,     // User-specific cookies don't vary cache
  },

  // Stale-while-revalidate for high availability
  staleContentRules: {
    serveStaleOnError: true, // Return cached version if origin down
    staleMaxAge: 3600,       // Serve stale up to 1 hour
  },

  // Origin shield (reduce origin load)
  originShield: {
    enabled: true,
    region: 'us-east-1', // Single region contacts origin
  }
};
```

301 (Permanent) redirects are cached indefinitely by browsers—subsequent visits never hit your service. This prevents analytics tracking. 302 (Temporary) redirects allow CDN caching while ensuring browsers check back. Use 302 for analytics, 301 for static/permanent links.
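If a service needs both behaviors, the status code can be chosen per link. A minimal sketch, assuming a hypothetical `isPermanentLink` flag on the stored URL record:

```typescript
// Sketch only: pick the redirect status per link.
// `isPermanentLink` is an assumed field, not defined elsewhere on this page.
function redirectStatus(isPermanentLink: boolean): 301 | 302 {
  // 301: browsers may cache the redirect indefinitely, so repeat clicks never
  //      reach our service and per-click analytics are lost.
  // 302: browsers re-request on later visits, keeping every click countable
  //      while the CDN still absorbs most of the traffic.
  return isPermanentLink ? 301 : 302;
}
```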
When the CDN misses, requests hit our application servers. Here we employ two further cache layers: a local in-memory cache and a distributed Redis cache.
```typescript
/**
 * Local In-Memory LRU Cache
 *
 * Each application server maintains its own cache for the hottest URLs.
 * Eliminates the network round-trip for frequently accessed URLs.
 */

class LocalUrlCache {
  private cache: Map<string, CacheEntry>;
  private readonly maxSize: number;
  private readonly ttlMs: number;

  constructor(maxSize: number = 1_000_000, ttlMs: number = 600_000) {
    this.cache = new Map();
    this.maxSize = maxSize; // 1M entries ≈ 300MB memory
    this.ttlMs = ttlMs;     // 10 minutes TTL
  }

  get(shortCode: string): string | null {
    const entry = this.cache.get(shortCode);
    if (!entry) return null;

    // Check expiration
    if (Date.now() > entry.expiresAt) {
      this.cache.delete(shortCode);
      return null;
    }

    // LRU: Move to end (most recently used)
    this.cache.delete(shortCode);
    this.cache.set(shortCode, entry);

    return entry.longUrl;
  }

  set(shortCode: string, longUrl: string): void {
    // Evict oldest if at capacity (LRU eviction)
    if (this.cache.size >= this.maxSize) {
      const oldestKey = this.cache.keys().next().value;
      this.cache.delete(oldestKey);
    }

    this.cache.set(shortCode, {
      longUrl,
      expiresAt: Date.now() + this.ttlMs,
    });
  }

  invalidate(shortCode: string): void {
    this.cache.delete(shortCode);
  }
}

interface CacheEntry {
  longUrl: string;
  expiresAt: number;
}

// Memory footprint estimation:
// 1M entries × 300 bytes = 300MB per server
// Typical server RAM: 16GB → cache uses ~2% of memory
```

Redis provides shared caching across servers with sub-millisecond latency:
```typescript
/**
 * Distributed Redis Cache Layer
 *
 * Shared cache across all application servers.
 * Provides cache coherence and higher capacity than the local cache.
 */

import Redis from 'ioredis';

class RedisUrlCache {
  private redis: Redis.Cluster;
  private readonly defaultTtl: number = 3600; // 1 hour

  constructor(nodes: { host: string; port: number }[]) {
    this.redis = new Redis.Cluster(nodes, {
      redisOptions: {
        connectTimeout: 5000,
        commandTimeout: 100, // Fail fast - 100ms timeout
      },
      scaleReads: 'slave', // Read from replicas for scalability
      enableReadyCheck: true,
    });
  }

  async get(shortCode: string): Promise<string | null> {
    try {
      const longUrl = await this.redis.get(`url:${shortCode}`);
      return longUrl;
    } catch (error) {
      // Redis failure should not block redirects
      console.error('Redis get failed:', error);
      return null;
    }
  }

  async set(shortCode: string, longUrl: string): Promise<void> {
    try {
      await this.redis.setex(`url:${shortCode}`, this.defaultTtl, longUrl);
    } catch (error) {
      // Log but don't fail - cache is optimization, not requirement
      console.error('Redis set failed:', error);
    }
  }

  async getMulti(shortCodes: string[]): Promise<Map<string, string>> {
    const keys = shortCodes.map(code => `url:${code}`);
    const values = await this.redis.mget(...keys);

    const result = new Map<string, string>();
    values.forEach((value, index) => {
      if (value) {
        result.set(shortCodes[index], value);
      }
    });
    return result;
  }

  async invalidate(shortCode: string): Promise<void> {
    await this.redis.del(`url:${shortCode}`);
  }
}

// Redis Cluster Topology for URL Shortener (6 nodes per region):
//
//   ┌─────────┐    ┌──────────┐    ┌───────────┐
//   │Master 1 │    │Master 2  │    │Master 3   │
//   │ Slots   │    │ Slots    │    │ Slots     │
//   │ 0-5460  │    │5461-10922│    │10923-16383│
//   └────┬────┘    └────┬─────┘    └────┬──────┘
//        │              │               │
//   ┌────┴────┐    ┌────┴─────┐    ┌────┴──────┐
//   │Replica 1│    │Replica 2 │    │Replica 3  │
//   └─────────┘    └──────────┘    └───────────┘
```

On server startup or after a cache flush, cold caches cause latency spikes. Implement cache warming: pre-populate the cache with the top 1000-10000 most accessed URLs from the last 24 hours. This ensures immediate high hit rates.
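A warming pass can run on startup, before the instance joins the load-balancer pool. The sketch below is illustrative: `getTopShortCodes` is a hypothetical analytics query, while the cache and database objects are the classes used throughout this page.

```typescript
/**
 * Cache warming sketch (illustrative, not a prescribed API).
 * Loads the hottest short codes from the last 24 hours into both
 * the local LRU cache and the shared Redis layer.
 */
async function warmCaches(
  analytics: { getTopShortCodes(hours: number, limit: number): Promise<string[]> },
  database: { getLongUrl(code: string): Promise<string | null> },
  localCache: LocalUrlCache,
  redisCache: RedisUrlCache,
  limit = 10_000
): Promise<void> {
  const hotCodes = await analytics.getTopShortCodes(24, limit);

  for (const code of hotCodes) {
    const longUrl = await database.getLongUrl(code);
    if (!longUrl) continue;

    localCache.set(code, longUrl);        // warm this server's in-memory LRU
    await redisCache.set(code, longUrl);  // warm the shared Redis layer
  }
}
```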
Let's implement the complete redirect flow with all cache layers:
```typescript
/**
 * Production Redirect Handler
 *
 * Multi-layer cache lookup with graceful fallback.
 */

class RedirectHandler {
  private localCache: LocalUrlCache;
  private redisCache: RedisUrlCache;
  private database: UrlDatabase;
  private analytics: AnalyticsEmitter;

  async handleRedirect(
    shortCode: string,
    request: Request
  ): Promise<Response> {
    const startTime = performance.now();
    let cacheLevel = 'none';

    try {
      // Layer 1: Check local in-memory cache (fastest)
      let longUrl = this.localCache.get(shortCode);
      if (longUrl) {
        cacheLevel = 'local';
      } else {
        // Layer 2: Check Redis distributed cache
        longUrl = await this.redisCache.get(shortCode);
        if (longUrl) {
          cacheLevel = 'redis';
          // Promote to local cache for future requests
          this.localCache.set(shortCode, longUrl);
        } else {
          // Layer 3: Database lookup (slowest)
          longUrl = await this.database.getLongUrl(shortCode);
          if (longUrl) {
            cacheLevel = 'database';
            // Populate both cache layers
            this.localCache.set(shortCode, longUrl);
            // Don't await - async cache population
            this.redisCache.set(shortCode, longUrl).catch(() => {});
          }
        }
      }

      // URL not found
      if (!longUrl) {
        return this.notFoundResponse(shortCode);
      }

      // Emit analytics asynchronously (never block redirect)
      this.emitAnalytics(shortCode, request, cacheLevel);

      // Return redirect response
      const latency = performance.now() - startTime;
      return this.redirectResponse(longUrl, latency, cacheLevel);

    } catch (error) {
      // Fail gracefully - try to serve from any available source
      return this.handleError(shortCode, error);
    }
  }

  private redirectResponse(
    longUrl: string,
    latencyMs: number,
    cacheLevel: string
  ): Response {
    return new Response(null, {
      status: 302,
      headers: {
        'Location': longUrl,
        'Cache-Control': 'public, max-age=86400, s-maxage=86400',
        'X-Response-Time': `${latencyMs.toFixed(2)}ms`,
        'X-Cache-Level': cacheLevel,
      },
    });
  }

  private notFoundResponse(shortCode: string): Response {
    return new Response('URL not found', {
      status: 404,
      headers: {
        'Cache-Control': 'no-store', // Don't cache 404s
      },
    });
  }

  private emitAnalytics(
    shortCode: string,
    request: Request,
    cacheLevel: string
  ): void {
    // Fire-and-forget analytics emission
    setImmediate(() => {
      this.analytics.emit({
        shortCode,
        timestamp: Date.now(),
        ip: request.headers.get('cf-connecting-ip') ?? request.ip,
        userAgent: request.headers.get('user-agent'),
        referer: request.headers.get('referer'),
        cacheLevel,
      });
    });
  }
}
```

When cache misses occur, database performance is critical. With proper caching, only 1-5% of requests reach the database, but that's still 10-50 million queries per day!
| Database | Strengths | Weaknesses | Best For |
|---|---|---|---|
| PostgreSQL | Mature, ACID, rich indexing | Horizontal scaling complex | Moderate scale, complex queries |
| MySQL | Simple, well-understood, replication | Limited horizontal scaling | Simple use cases, read replicas |
| DynamoDB | Serverless, auto-scaling, global tables | Limited query flexibility | AWS-native, global scale |
| Cassandra | Write-heavy, linear scalability | Eventual consistency, no joins | Extreme write scale |
| MongoDB | Flexible schema, sharding built-in | Less mature transactions | Rapid iteration, flexible needs |
For pure key-value lookups, DynamoDB (or similar) excels:
```typescript
/**
 * DynamoDB Table Design for URL Shortener
 */

// AWS SDK v2 DocumentClient (assumed here so the lookup below is self-contained)
import { DynamoDB } from 'aws-sdk';
const dynamodb = new DynamoDB.DocumentClient();

// Primary table: URLs
const urlsTable = {
  TableName: 'Urls',

  // Simple primary key - short code only
  KeySchema: [
    { AttributeName: 'shortCode', KeyType: 'HASH' }
  ],

  AttributeDefinitions: [
    { AttributeName: 'shortCode', AttributeType: 'S' },
    { AttributeName: 'userId', AttributeType: 'S' },
    { AttributeName: 'createdAt', AttributeType: 'N' },
  ],

  // Global Secondary Index for a user's URLs
  GlobalSecondaryIndexes: [
    {
      IndexName: 'UserUrlsIndex',
      KeySchema: [
        { AttributeName: 'userId', KeyType: 'HASH' },
        { AttributeName: 'createdAt', KeyType: 'RANGE' }
      ],
      Projection: { ProjectionType: 'ALL' }
    }
  ],

  // On-demand capacity for auto-scaling
  BillingMode: 'PAY_PER_REQUEST',
};

// Sample item structure
const urlItem = {
  shortCode: 'a7Xk2B',      // Partition key
  longUrl: 'https://example.com/very/long/path?with=params',
  userId: 'user_12345',
  createdAt: 1704067200000, // Unix timestamp
  expiresAt: null,          // Optional TTL
  clickCount: 1523,         // Denormalized for quick access
  customAlias: false,
  metadata: {
    title: 'My Campaign Link',
    tags: ['marketing', 'q1-2024'],
  }
};

// Redirect lookup query
const getLongUrl = async (shortCode: string): Promise<string | null> => {
  const result = await dynamodb.get({
    TableName: 'Urls',
    Key: { shortCode },
    ProjectionExpression: 'longUrl, expiresAt', // Only fetch needed fields
    ConsistentRead: false, // Eventually consistent = faster
  }).promise();

  if (!result.Item) return null;

  // Check expiration
  if (result.Item.expiresAt && result.Item.expiresAt < Date.now()) {
    return null; // Expired
  }

  return result.Item.longUrl;
};
```

URL shortener redirects are pure key-value lookups with no joins or complex queries. DynamoDB provides single-digit millisecond latency, automatic horizontal scaling, global tables for multi-region, and serverless operation. For this access pattern, it's nearly ideal.
Users click short URLs from everywhere. A user in Tokyo shouldn't wait 200ms for a round-trip to US servers. Global distribution minimizes latency for all users.
```
Global Distribution Architecture
================================

                   ┌─────────────────────┐
                   │    DNS (GeoDNS)     │
                   │  Route to nearest   │
                   │    edge region      │
                   └──────────┬──────────┘
                              │
        ┌─────────────────────┼──────────────────────┐
        │                     │                      │
        ▼                     ▼                      ▼
┌───────────────┐      ┌───────────────┐      ┌───────────────┐
│   US-EAST     │      │   EU-WEST     │      │ AP-NORTHEAST  │
│   Region      │      │   Region      │      │   Region      │
├───────────────┤      ├───────────────┤      ├───────────────┤
│ • CDN Edge    │      │ • CDN Edge    │      │ • CDN Edge    │
│ • App Servers │      │ • App Servers │      │ • App Servers │
│ • Redis Cache │      │ • Redis Cache │      │ • Redis Cache │
│ • DB Replica  │◄─────┤ • DB Replica  │◄─────┤ • DB Replica  │
└───────┬───────┘      └───────────────┘      └───────────────┘
        │
        │ Replication
        ▼
┌───────────────┐
│  PRIMARY DB   │
│  (US-EAST)    │
│               │
│  All writes   │
│  go here      │
└───────────────┘

Data Flow:
- READS:  Served from nearest region (local replica)
- WRITES: Routed to primary region, async replicated

Latency from any major city:
- With local region:    10-30ms
- Without local region: 100-300ms (unacceptable)
```
```typescript
/**
 * Geographic DNS Routing Configuration
 *
 * Route users to the nearest healthy region based on:
 *   1. Geographic location (latency-based)
 *   2. Health checks (failover to next-closest)
 *   3. Load balancing (within region)
 */

// AWS Route 53 Latency-Based Routing (conceptual)
const dnsConfig = {
  recordSets: [
    {
      name: 'short.url',
      type: 'A',
      region: 'us-east-1',
      aliasTarget: 'alb-us-east.elb.amazonaws.com',
      healthCheckId: 'hc-us-east',
    },
    {
      name: 'short.url',
      type: 'A',
      region: 'eu-west-1',
      aliasTarget: 'alb-eu-west.elb.amazonaws.com',
      healthCheckId: 'hc-eu-west',
    },
    {
      name: 'short.url',
      type: 'A',
      region: 'ap-northeast-1',
      aliasTarget: 'alb-ap-ne.elb.amazonaws.com',
      healthCheckId: 'hc-ap-ne',
    },
  ],

  healthChecks: {
    type: 'HTTPS',
    path: '/health',
    interval: 10,        // Check every 10 seconds
    failureThreshold: 2, // 2 failures = unhealthy
    regions: ['us-east-1', 'eu-west-1', 'ap-northeast-1'],
  },

  routingPolicy: 'latency', // Route to lowest-latency healthy region
};

// Failover behavior:
// 1. US-East goes down → Route 53 detects via health check
// 2. US users routed to EU-West (next closest healthy)
// 3. When US-East recovers → traffic automatically returns
```

CDNs like Cloudflare use Anycast IP addressing—the same IP address is announced from multiple locations. Users are automatically routed to the nearest point of presence based on BGP routing, often faster than DNS-based routing.
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
When a URL is updated or deleted, cached versions must be invalidated across ALL layers and regions.
| Event | What to Invalidate | Urgency | Strategy |
|---|---|---|---|
| URL destination changed | All caches for that code | High | Active invalidation + short TTL |
| URL deleted | All caches for that code | High | Active invalidation |
| URL expired (TTL) | All caches for that code | Medium | TTL-based expiration |
| URL made private by its owner | All caches for that code | High | Immediate invalidation |
| Security issue (phishing) | All caches for affected URLs | Critical | Emergency purge |
```typescript
/**
 * Multi-Layer Cache Invalidation
 */

class CacheInvalidator {
  private localCache: LocalUrlCache;
  private redisCache: RedisUrlCache;
  private cdnPurger: CdnPurger;
  private pubsub: PubSub;

  /**
   * Invalidate a short code across all cache layers and regions.
   * Uses pub/sub to notify all app server instances.
   */
  async invalidate(shortCode: string): Promise<void> {
    const invalidationId = generateUuid();
    console.log(`[Invalidation ${invalidationId}] Starting for ${shortCode}`);

    // 1. Invalidate local cache (this instance only)
    this.localCache.invalidate(shortCode);

    // 2. Invalidate distributed Redis cache
    await this.redisCache.invalidate(shortCode);

    // 3. Publish invalidation event to all app servers (all regions)
    await this.pubsub.publish('cache-invalidation', {
      shortCode,
      invalidationId,
      timestamp: Date.now(),
    });

    // 4. Purge from CDN edge caches
    await this.cdnPurger.purge(`/${shortCode}`);

    console.log(`[Invalidation ${invalidationId}] Complete`);
  }

  /**
   * Subscribe to invalidation events from other instances
   */
  async subscribeToInvalidations(): Promise<void> {
    await this.pubsub.subscribe('cache-invalidation', (message) => {
      // Invalidate local cache when notified by other instances
      this.localCache.invalidate(message.shortCode);
      console.log(`Received invalidation for ${message.shortCode}`);
    });
  }
}

// CDN Purge API (example for Cloudflare)
class CdnPurger {
  async purge(path: string): Promise<void> {
    await fetch('https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.CF_API_TOKEN}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        files: [`https://short.url${path}`],
      }),
    });
  }

  async purgeAll(): Promise<void> {
    // Emergency: purge entire cache
    await fetch('https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.CF_API_TOKEN}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ purge_everything: true }),
    });
  }
}
```

Even with active invalidation, there's a propagation window. CDN purges take 1-30 seconds. Pub/sub messages take 100-500ms. Browser caches may not be clearable at all (if a user has cached a 301 redirect). Design for eventual consistency—absolute immediate invalidation is impossible in distributed systems.
We've built a comprehensive latency optimization strategy for URL shortener redirects. Let's consolidate the key approaches:
| Technique | Latency Saved | Implementation Effort | Hit Rate |
|---|---|---|---|
| CDN Edge Caching | 100ms+ (eliminates origin round-trip) | Medium | 60-70% |
| Local In-Memory Cache | 1-2ms (eliminates Redis hop) | Low | 20-30% of CDN misses |
| Redis Distributed Cache | 5-15ms (eliminates DB lookup) | Medium | 95%+ of local-cache misses |
| Database Read Replicas | 10-50ms (reduces read latency) | Medium | N/A |
| Global Multi-Region | 50-200ms (geographic locality) | High | N/A |
| Key-value DB (DynamoDB) | 5-10ms vs. a relational DB | Medium | N/A |
You now understand how to achieve sub-50ms redirect latency at scale. Next, we'll explore analytics collection—how to gather click data from billions of redirects without impacting that hard-won latency.