Every click on a short URL is a data point. At 1 billion redirects per day, that's 1 billion data points—a treasure trove of insights about user behavior, campaign effectiveness, geographic distribution, and content engagement.
For many URL shortener businesses (like Bitly), analytics are the product. The URL shortening is merely the means to collect valuable marketing intelligence. Understanding who clicks, from where, when, and on what devices transforms a simple redirect service into a powerful analytics platform.
But collecting analytics at this scale is challenging: event capture must never slow the redirect path, raw events arrive at roughly 500 GB per day, unique visitors can only be counted approximately, and storage must serve both live dashboards and historical queries.
By the end of this page, you will understand event collection patterns, streaming architectures with Kafka, aggregation strategies for time-series data, storage optimization techniques, and how to serve analytics queries efficiently.
Before building the collection pipeline, we must define what data we're collecting and how we'll structure it.
```typescript
/**
 * Raw Click Event
 *
 * Captured at the moment of redirect, before any aggregation.
 * This is the highest-fidelity data we collect.
 */
interface RawClickEvent {
  // Identity
  eventId: string;        // UUID for deduplication
  shortCode: string;      // Which short URL was clicked
  timestamp: number;      // Unix milliseconds

  // User Information (derived from request)
  ip: string;             // Visitor IP (for geo-lookup)
  userAgent: string;      // Browser/device identification
  acceptLanguage: string; // Language preference

  // Traffic Source
  referer: string | null;     // Referring page
  utmSource: string | null;   // utm_source parameter
  utmMedium: string | null;   // utm_medium parameter
  utmCampaign: string | null; // utm_campaign parameter

  // Session/Visitor Tracking
  visitorId: string | null; // Cookie-based visitor ID
  sessionId: string | null; // Session identifier

  // Technical Details
  protocol: string;     // HTTP or HTTPS
  responseTime: number; // Redirect latency in ms
  cacheLevel: string;   // local/redis/database
}

// Estimated size per event: ~500 bytes
// 1 billion events × 500 bytes = 500 GB/day of raw events

/**
 * Enriched Click Event
 *
 * After processing, we add derived fields from lookups.
 */
interface EnrichedClickEvent extends RawClickEvent {
  // Geo-derived (from IP lookup)
  country: string;  // ISO country code
  region: string;   // State/province
  city: string;     // City name
  latitude: number; // Coordinates
  longitude: number;

  // Device-derived (from User-Agent parsing)
  deviceType: 'desktop' | 'mobile' | 'tablet' | 'bot';
  os: string;       // Operating system
  osVersion: string;
  browser: string;  // Browser name
  browserVersion: string;
  isBot: boolean;   // Crawler detection

  // Time-derived
  hour: number;      // 0-23
  dayOfWeek: number; // 0-6
  isWeekend: boolean;

  // Visitor classification
  isUnique: boolean;    // First visit to this short URL
  isReturning: boolean; // Seen this visitor before
}
```

Raw events are aggregated into time-bucketed summaries for efficient querying:
```typescript
/**
 * Aggregated Click Metrics
 *
 * Pre-computed aggregations stored at various time granularities.
 */

// Hourly aggregation (most detailed, retained 30 days)
interface HourlyMetrics {
  shortCode: string;
  hour: string; // "2024-01-15T14:00:00Z"
  clicks: number;
  uniqueVisitors: number;

  // Grouped distributions
  byCountry: Record<string, number>; // {"US": 523, "UK": 234, ...}
  byDevice: Record<string, number>;  // {"mobile": 612, "desktop": 145}
  byBrowser: Record<string, number>; // {"Chrome": 432, "Safari": 215}
  byReferer: Record<string, number>; // {"twitter.com": 324, "direct": 433}
}

// Daily aggregation (retained 1 year)
interface DailyMetrics {
  shortCode: string;
  date: string; // "2024-01-15"
  clicks: number;
  uniqueVisitors: number;
  byCountry: Record<string, number>;
  byDevice: Record<string, number>;
  peakHour: number; // Hour with most clicks
  avgResponseTime: number;
}

// Monthly aggregation (retained indefinitely)
interface MonthlyMetrics {
  shortCode: string;
  month: string; // "2024-01"
  clicks: number;
  uniqueVisitors: number;
  topCountries: { country: string; clicks: number }[]; // Top 10
  topReferers: { referer: string; clicks: number }[];  // Top 10
}

// Real-time counters (for dashboard display)
interface RealtimeCounter {
  shortCode: string;
  windowStart: number;    // Unix timestamp
  windowDuration: number; // Seconds (60, 300, 3600)
  clicks: number;
}

/**
 * Data Retention Policy
 *
 * Raw events: 7 days (then deleted)
 * Hourly:     30 days
 * Daily:      1 year
 * Monthly:    Forever
 * Real-time:  Rolling (auto-expires)
 */
```

One million clicks on a URL over a day become 24 hourly records instead of 1 million raw events. This roughly 40,000x reduction makes long-term storage and querying feasible. The trade-off: you lose individual event details after the raw-event retention window.
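To make the rollup concrete, here is a minimal sketch of collapsing one day of `HourlyMetrics` into a single `DailyMetrics` record, using the interfaces above. The `mergeCounts` helper and the peak-hour logic are illustrative, not part of the original design.

```typescript
// Sketch: roll up one day's HourlyMetrics into a DailyMetrics record.
// Assumes the HourlyMetrics and DailyMetrics interfaces defined above.

function mergeCounts(
  target: Record<string, number>,
  source: Record<string, number>,
): void {
  for (const [key, count] of Object.entries(source)) {
    target[key] = (target[key] ?? 0) + count;
  }
}

function rollupDaily(date: string, hours: HourlyMetrics[]): DailyMetrics {
  const byCountry: Record<string, number> = {};
  const byDevice: Record<string, number> = {};
  let clicks = 0;
  let peakHour = 0;
  let peakClicks = -1;

  for (const h of hours) {
    clicks += h.clicks;
    mergeCounts(byCountry, h.byCountry);
    mergeCounts(byDevice, h.byDevice);

    // Track the hour of day with the most clicks
    const hourOfDay = new Date(h.hour).getUTCHours();
    if (h.clicks > peakClicks) {
      peakClicks = h.clicks;
      peakHour = hourOfDay;
    }
  }

  return {
    shortCode: hours[0]?.shortCode ?? '',
    date,
    clicks,
    // Caveat: summing hourly uniques overcounts visitors who span hours;
    // exact daily uniques need HyperLogLog merging (covered later).
    uniqueVisitors: hours.reduce((sum, h) => sum + h.uniqueVisitors, 0),
    byCountry,
    byDevice,
    peakHour,
    avgResponseTime: 0, // Omitted: needs a click-weighted average per hour
  };
}
```

Note the unique-visitor caveat: uniques are not additive across windows, which is why the ClickHouse materialized view later on this page marks the same sum as approximate.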
The cardinal rule of analytics collection is: never let analytics slow down the user's redirect. We achieve this through asynchronous, decoupled event emission.
```
Analytics Collection Pipeline
==============================

┌──────────────┐       ┌──────────────────────────────────┐
│  Redirect    │       │        Analytics Pipeline        │
│  Service     │       │                                  │
│              │       │  ┌───────────────────────────┐   │
│  ┌────────┐  │       │  │      Kafka / Kinesis      │   │
│  │Redirect│  │   ┌──▶│  │     (Event Streaming)     │   │
│  │Handler │  │   │   │  └────────────┬──────────────┘   │
│  └───┬────┘  │   │   │               │                  │
│      │ fire  │   │   │               ▼                  │
│      │ and   │   │   │  ┌───────────────────────────┐   │
│      │ forget│   │   │  │     Stream Processors     │   │
│      ▼       │   │   │  │      (Flink / Spark)      │   │
│  ┌────────┐  │   │   │  │  • Enrich (geo, device)   │   │
│  │ Async  │──┼───┘   │  │  • Deduplicate            │   │
│  │Producer│  │       │  │  • Aggregate by time      │   │
│  └────────┘  │       │  └────────────┬──────────────┘   │
│              │       │               │                  │
└──────────────┘       │               ▼                  │
                       │  ┌───────────────────────────┐   │
                       │  │    Time-Series Storage    │   │
                       │  │  (InfluxDB / TimescaleDB) │   │
                       │  └────────────┬──────────────┘   │
                       │               │                  │
                       │               ▼                  │
                       │  ┌───────────────────────────┐   │
                       │  │       Analytics API       │   │
                       │  │      (Query Service)      │   │
                       │  └───────────────────────────┘   │
                       │                                  │
                       └──────────────────────────────────┘

Key Design Principles:
1. Redirect path is isolated from analytics processing
2. Kafka provides durability buffer (survives downstream failures)
3. Stream processing handles enrichment and aggregation
4. Time-series DB optimized for analytics queries
```

The redirect handler must emit events without blocking:
```typescript
/**
 * Asynchronous Event Emission
 *
 * Fire-and-forget event publishing that never blocks redirects.
 */

import { Kafka, Producer } from 'kafkajs';
import crypto from 'node:crypto';

class AnalyticsEmitter {
  private producer: Producer;
  private localBuffer: RawClickEvent[] = [];
  private readonly BATCH_SIZE = 100;
  private readonly FLUSH_INTERVAL_MS = 100;

  constructor(kafka: Kafka) {
    this.producer = kafka.producer({
      allowAutoTopicCreation: false,
      idempotent: true, // Exactly-once semantics
    });

    // Connect in the background; events buffer locally until ready
    this.producer.connect().catch(err =>
      console.error('Kafka producer connect failed:', err));

    // Start background flush loop
    setInterval(() => this.flush(), this.FLUSH_INTERVAL_MS);
  }

  /**
   * Emit a click event. Returns immediately.
   * Event is buffered and sent in background.
   */
  emit(event: RawClickEvent): void {
    // Generate unique event ID for deduplication
    event.eventId = crypto.randomUUID();

    // Add to local buffer (fast memory operation)
    this.localBuffer.push(event);

    // If buffer is full, trigger immediate flush (still async)
    if (this.localBuffer.length >= this.BATCH_SIZE) {
      setImmediate(() => this.flush());
    }
  }

  /**
   * Flush buffered events to Kafka.
   * Runs in background; errors are logged but never thrown to the caller.
   */
  private async flush(): Promise<void> {
    if (this.localBuffer.length === 0) return;

    // Swap buffer (atomic operation)
    const events = this.localBuffer;
    this.localBuffer = [];

    try {
      await this.producer.send({
        topic: 'click-events',
        messages: events.map(event => ({
          key: event.shortCode, // Partition by short code
          value: JSON.stringify(event),
          timestamp: String(event.timestamp),
        })),
      });
    } catch (error) {
      // Log error but don't crash - analytics are non-critical
      console.error('Failed to send analytics batch:', error);
      // Optional: retry failed events
      // this.retryBuffer.push(...events);
    }
  }
}

// Usage in redirect handler:
const analytics = new AnalyticsEmitter(kafka);

function handleRedirect(request: Request): Response {
  const shortCode = parseShortCode(request.url);
  const longUrl = lookupUrl(shortCode);

  // Emit analytics event (non-blocking)
  analytics.emit({
    shortCode,
    timestamp: Date.now(),
    ip: request.headers.get('cf-connecting-ip') ?? '',
    userAgent: request.headers.get('user-agent') ?? '',
    referer: request.headers.get('referer'),
    // ... other fields
  });

  // Return redirect immediately
  return Response.redirect(longUrl, 302);
}
```

Sending 1 billion individual Kafka messages per day is expensive. Batching 100 events per send reduces this to 10 million requests, a 100x reduction in Kafka overhead. The trade-off is slight latency (a 100 ms buffer window) before events appear in the pipeline.
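The `retryBuffer` hinted at in the catch block can be made concrete. Here is a minimal sketch, not part of the original design: a bounded buffer whose cap (an illustrative value) ensures a prolonged Kafka outage degrades to dropped analytics rather than unbounded memory growth.

```typescript
// Sketch: bounded retry buffer for failed batches (illustrative assumption,
// not from the original design). Oldest events are dropped first when the
// cap is hit, since losing analytics beats exhausting memory.

const MAX_RETRY_EVENTS = 10_000; // Assumption: tune to available memory

class RetryBuffer {
  private events: RawClickEvent[] = [];

  push(failed: RawClickEvent[]): void {
    this.events.push(...failed);
    // Enforce the cap by discarding the oldest events
    const overflow = this.events.length - MAX_RETRY_EVENTS;
    if (overflow > 0) {
      this.events.splice(0, overflow);
    }
  }

  /** Drain up to `batchSize` events for the next flush attempt. */
  drain(batchSize: number): RawClickEvent[] {
    return this.events.splice(0, batchSize);
  }

  get size(): number {
    return this.events.length;
  }
}
```

On each flush cycle, drained retry events would simply be prepended to the outgoing batch.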
Raw events need enrichment (geo-location, device parsing) and aggregation before storage. Stream processing frameworks like Apache Flink or Kafka Streams handle this in real-time.
"""Analytics Stream Processor (Apache Flink / Conceptual) Processes raw click events through enrichment and aggregation stages.""" from pyflink.datastream import StreamExecutionEnvironmentfrom pyflink.table import StreamTableEnvironment # Stage 1: Parse and Validatedef parse_event(raw_event: str) -> ClickEvent: """Parse JSON and validate required fields.""" event = json.loads(raw_event) # Validate required fields required = ['shortCode', 'timestamp', 'ip', 'userAgent'] if not all(field in event for field in required): raise ValueError(f"Missing required field in event") return ClickEvent(**event) # Stage 2: Enrich with Geo Datadef enrich_geo(event: ClickEvent) -> EnrichedEvent: """Add geographic information from IP address.""" geo = geo_ip_lookup(event.ip) # MaxMind or similar return EnrichedEvent( **event.__dict__, country=geo.country_code, region=geo.region, city=geo.city, latitude=geo.latitude, longitude=geo.longitude, ) # Stage 3: Parse User-Agentdef enrich_device(event: EnrichedEvent) -> EnrichedEvent: """Add device information from User-Agent.""" ua = parse_user_agent(event.user_agent) # ua-parser library event.device_type = ua.device.family # mobile, desktop, etc. event.os = ua.os.family event.os_version = ua.os.version_string event.browser = ua.browser.family event.browser_version = ua.browser.version_string event.is_bot = ua.is_bot return event # Stage 4: Deduplicatedef deduplicate(events: Stream[EnrichedEvent]) -> Stream[EnrichedEvent]: """Remove duplicate events within a time window.""" # Key by event_id, keep first occurrence return events \ .key_by(lambda e: e.event_id) \ .reduce(lambda a, b: a) # Keep first # Stage 5: Time-Window Aggregationdef aggregate_hourly(events: Stream[EnrichedEvent]): """Aggregate events into hourly buckets.""" return events \ .key_by(lambda e: e.short_code) \ .window(TumblingEventTimeWindows.of(Time.hours(1))) \ .aggregate( HourlyAggregator(), # Custom aggregator ProcessWindowFunction() ) class HourlyAggregator: """Aggregate click metrics for one hour window.""" def create_accumulator(self): return { 'clicks': 0, 'unique_visitors': set(), 'by_country': Counter(), 'by_device': Counter(), 'by_browser': Counter(), 'by_referer': Counter(), } def add(self, event: EnrichedEvent, accumulator): accumulator['clicks'] += 1 accumulator['unique_visitors'].add(event.visitor_id) accumulator['by_country'][event.country] += 1 accumulator['by_device'][event.device_type] += 1 accumulator['by_browser'][event.browser] += 1 referer = extract_domain(event.referer) or 'direct' accumulator['by_referer'][referer] += 1 return accumulator def get_result(self, accumulator): return HourlyMetrics( clicks=accumulator['clicks'], unique_visitors=len(accumulator['unique_visitors']), by_country=dict(accumulator['by_country']), by_device=dict(accumulator['by_device']), by_browser=dict(accumulator['by_browser']), by_referer=dict(accumulator['by_referer']), )Events may arrive late due to network delays or mobile devices coming online. Use event-time (timestamp in event) not processing-time, and configure watermarks with allowed lateness (e.g., 1 hour). Late events trigger window re-computation.
Different analytics use cases have different freshness requirements. A Lambda or Kappa architecture can serve both the real-time and batch ends of this spectrum, as the routing sketch after the table illustrates.
| Use Case | Freshness Needed | Acceptable Latency | Approach |
|---|---|---|---|
| Live dashboard 'Now' | Real-time | < 1 second | Streaming counters |
| Hourly reports | Near real-time | < 5 minutes | Micro-batch |
| Daily summaries | Batch | < 1 hour | Daily batch jobs |
| Historical analysis | Batch | Next day | Nightly aggregation |
| Ad-hoc queries | On-demand | Minutes | Query on aggregates |
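One way to realize this table in code is a thin router that picks the backing store by requested freshness. A minimal sketch, assuming the Redis counters and ClickHouse aggregates described later on this page; `queryRealtimeCounters` and `queryAggregates` are hypothetical helpers, not real APIs.

```typescript
// Sketch: route a clicks query to the store matching its freshness need.
// queryRealtimeCounters / queryAggregates are hypothetical helpers standing
// in for the Redis and ClickHouse access shown later on this page.

declare function queryRealtimeCounters(
  shortCode: string, windowSeconds: number): Promise<number>;
declare function queryAggregates(
  shortCode: string, windowSeconds: number, table: string): Promise<number>;

type Freshness = 'realtime' | 'near-realtime' | 'batch';

async function getClicks(
  shortCode: string,
  windowSeconds: number,
  freshness: Freshness,
): Promise<number> {
  switch (freshness) {
    case 'realtime':
      // Live dashboard: Redis counters, sub-second, recent data only
      return queryRealtimeCounters(shortCode, windowSeconds);
    case 'near-realtime':
      // Hourly reports: micro-batched aggregates, minutes of lag
      return queryAggregates(shortCode, windowSeconds, 'hourly_metrics');
    case 'batch':
      // Historical analysis: daily rollups, updated nightly
      return queryAggregates(shortCode, windowSeconds, 'daily_metrics');
  }
}
```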
For truly real-time metrics ('clicks in last 60 seconds'), we use Redis sorted sets or HyperLogLog:
```typescript
/**
 * Real-Time Click Counters
 *
 * Uses Redis for sub-second counter updates and queries.
 */

import Redis from 'ioredis';

class RealtimeCounters {
  private redis: Redis;

  constructor(redis: Redis) {
    this.redis = redis;
  }

  /**
   * Record a click in the rolling window.
   * Called from stream processor, not redirect handler.
   */
  async incrementClick(shortCode: string, timestamp: number): Promise<void> {
    const key = `clicks:${shortCode}:realtime`;

    // ZADD to sorted set with the event timestamp as score.
    // Member must be unique per click (timestamp + random suffix).
    await this.redis.zadd(key, timestamp, `${timestamp}:${Math.random()}`);

    // Trim entries older than 1 hour, then refresh expiration
    await this.redis.zremrangebyscore(key, 0, timestamp - 3_600_000);
    await this.redis.expire(key, 3600);
  }

  /**
   * Get clicks in the last N seconds.
   */
  async getClicksInWindow(shortCode: string, windowSeconds: number): Promise<number> {
    const key = `clicks:${shortCode}:realtime`;
    const now = Date.now();
    const windowStart = now - (windowSeconds * 1000);

    // Count members with score >= windowStart
    return await this.redis.zcount(key, windowStart, now);
  }

  /**
   * Track unique visitors using HyperLogLog (approximate, memory efficient).
   */
  async addUniqueVisitor(shortCode: string, visitorId: string, date: string): Promise<void> {
    const key = `unique:${shortCode}:${date}`;
    await this.redis.pfadd(key, visitorId);
    await this.redis.expire(key, 86400 * 7); // Keep 7 days
  }

  async getUniqueVisitors(shortCode: string, date: string): Promise<number> {
    const key = `unique:${shortCode}:${date}`;
    return await this.redis.pfcount(key);
  }
}

// HyperLogLog properties:
// - Uses only 12KB per counter regardless of cardinality
// - 0.81% standard error (accurate enough for analytics)
// - Can merge multiple HLLs for combined counts

// For 1M short URLs with daily unique tracking:
// Memory: 1M × 12KB = 12GB (very efficient!)
```

Counting exact unique visitors requires storing every visitor ID, which is impossible at scale. HyperLogLog provides approximate counts (±1% error) using only 12KB per counter. It's the industry standard for unique counting in web analytics.
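The merge property noted in the comments above is what makes HyperLogLog practical for rollups: daily HLLs combine into a weekly unique count without rescanning visitor IDs. A sketch using ioredis, assuming the daily `unique:{code}:{date}` keys created by `addUniqueVisitor`; the scratch-key name is an illustrative choice.

```typescript
import Redis from 'ioredis';

const redis = new Redis();

/**
 * Weekly unique visitors by merging 7 daily HyperLogLogs.
 * PFMERGE unions the sketches; PFCOUNT reads the merged estimate.
 * Note: uniques are NOT additive, so summing daily counts would overcount.
 */
async function getWeeklyUniques(shortCode: string, dates: string[]): Promise<number> {
  const dailyKeys = dates.map(date => `unique:${shortCode}:${date}`);
  const mergedKey = `unique:${shortCode}:weekly:tmp`; // Illustrative scratch key

  await redis.pfmerge(mergedKey, ...dailyKeys);
  const count = await redis.pfcount(mergedKey);
  await redis.del(mergedKey); // Discard the scratch sketch after reading

  return count;
}
```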
Choosing the right storage for analytics data depends on query patterns, retention requirements, and scale.
| Database | Best For | Query Speed | Write Speed | Scalability |
|---|---|---|---|---|
| InfluxDB | Time-series metrics | Fast for time queries | High throughput | Clustering available |
| TimescaleDB | Time-series with SQL | Excellent (PostgreSQL) | Very high | Excellent |
| ClickHouse | OLAP analytics | Extremely fast | Very high | Excellent |
| Druid | Real-time OLAP | Sub-second | High | Excellent |
| BigQuery | Ad-hoc analytics | Fast for aggregations | Batch preferred | Serverless |
ClickHouse is purpose-built for analytics workloads with columnar storage and vectorized query execution:
```sql
-- ClickHouse Schema for URL Shortener Analytics

-- Table for hourly aggregated metrics
CREATE TABLE hourly_metrics (
    short_code LowCardinality(String),
    hour DateTime,

    -- Core metrics
    clicks UInt64,
    unique_visitors UInt64,

    -- Distributions (stored as nested structures)
    country_clicks Nested(
        country LowCardinality(String),
        count UInt32
    ),
    device_clicks Nested(
        device LowCardinality(String),
        count UInt32
    ),
    browser_clicks Nested(
        browser LowCardinality(String),
        count UInt32
    ),
    referer_clicks Nested(
        referer String,
        count UInt32
    )
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(hour)    -- Partition by month
ORDER BY (short_code, hour)    -- Primary key
TTL hour + INTERVAL 30 DAY;    -- Auto-delete after 30 days

-- Table for daily rollups (longer retention)
CREATE TABLE daily_metrics (
    short_code LowCardinality(String),
    date Date,
    clicks UInt64,
    unique_visitors UInt64,

    -- Top-N only (reduces storage)
    top_countries Array(Tuple(String, UInt32)),  -- Top 10
    top_devices Array(Tuple(String, UInt32)),
    top_referers Array(Tuple(String, UInt32)),

    peak_hour UInt8,
    avg_response_ms Float32
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY (short_code, date)
TTL date + INTERVAL 1 YEAR;

-- Materialized view for automatic rollup from hourly to daily
CREATE MATERIALIZED VIEW daily_metrics_mv
TO daily_metrics
AS SELECT
    short_code,
    toDate(hour) AS date,
    sum(clicks) AS clicks,
    sum(unique_visitors) AS unique_visitors,  -- Approximate!
    -- ... aggregation logic
FROM hourly_metrics
GROUP BY short_code, toDate(hour);
```
```sql
-- Common Analytics Queries

-- 1. Clicks over time for a short URL (dashboard chart)
SELECT
    toStartOfHour(hour) AS time,
    sum(clicks) AS clicks
FROM hourly_metrics
WHERE short_code = 'a7Xk2B'
  AND hour >= now() - INTERVAL 7 DAY
GROUP BY time
ORDER BY time;

-- 2. Top countries for a URL
SELECT
    country_clicks.country AS country,
    sum(country_clicks.count) AS clicks
FROM hourly_metrics
ARRAY JOIN country_clicks
WHERE short_code = 'a7Xk2B'
  AND hour >= now() - INTERVAL 30 DAY
GROUP BY country
ORDER BY clicks DESC
LIMIT 10;

-- 3. Total clicks across all URLs (global stats)
SELECT sum(clicks) AS total_clicks
FROM daily_metrics
WHERE date = today();

-- 4. Top performing URLs today
SELECT
    short_code,
    sum(clicks) AS clicks
FROM hourly_metrics
WHERE hour >= today()
GROUP BY short_code
ORDER BY clicks DESC
LIMIT 100;

-- Query performance: sub-second even on billions of rows
-- ClickHouse scans 100M+ rows/second on commodity hardware
```

ClickHouse stores data by column, not by row. For a query selecting only `clicks` from 1 billion rows, it reads only the `clicks` column (8 bytes × 1B = 8 GB), not entire rows (500 bytes × 1B = 500 GB). This roughly 60x reduction in data scanned is what makes analytics queries so fast.
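For completeness, here is a sketch of how the stream processor's output might land in `hourly_metrics` using the official `@clickhouse/client` Node package. The flattening of `Nested` columns into parallel arrays follows ClickHouse's storage model for `Nested`; the endpoint and client options are assumptions for illustration.

```typescript
import { createClient } from '@clickhouse/client';

const clickhouse = createClient({ url: 'http://localhost:8123' }); // Assumed endpoint

/** Insert one hourly aggregate. Nested columns are written as parallel arrays. */
async function insertHourlyMetrics(m: HourlyMetrics): Promise<void> {
  await clickhouse.insert({
    table: 'hourly_metrics',
    format: 'JSONEachRow',
    values: [{
      short_code: m.shortCode,
      hour: m.hour,
      clicks: m.clicks,
      unique_visitors: m.uniqueVisitors,
      // Nested(country, count) maps to two parallel arrays
      'country_clicks.country': Object.keys(m.byCountry),
      'country_clicks.count': Object.values(m.byCountry),
      'device_clicks.device': Object.keys(m.byDevice),
      'device_clicks.count': Object.values(m.byDevice),
      'browser_clicks.browser': Object.keys(m.byBrowser),
      'browser_clicks.count': Object.values(m.byBrowser),
      'referer_clicks.referer': Object.keys(m.byReferer),
      'referer_clicks.count': Object.values(m.byReferer),
    }],
  });
}
```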
Counting unique visitors is one of the most challenging analytics problems, especially in a privacy-conscious world where cookies are blocked and IP addresses are shared.
| Approach | Accuracy | Privacy Impact | Limitations |
|---|---|---|---|
| First-Party Cookie | High (when available) | Low (consented) | Blocked by ~30% of users |
| IP + User-Agent Hash | Medium | Medium | Shared IPs, VPNs, NAT |
| Browser Fingerprinting | High | High (controversial) | Privacy regulations restrict |
| Login-based | Perfect | Low (explicit) | Only works for logged-in users |
| Statistical Estimation | Medium | None | Approximate, not individual-level |
```typescript
/**
 * Visitor Identification Strategy
 *
 * Privacy-respecting approach combining multiple signals.
 */

import crypto from 'node:crypto';

interface VisitorIdentification {
  visitorId: string | null; // Primary identifier
  sessionId: string | null; // Session tracking
  confidence: 'high' | 'medium' | 'low';
  method: 'cookie' | 'fingerprint' | 'statistical';
}

class VisitorIdentifier {
  /**
   * Identify visitor from request headers and optional cookie.
   * Falls back to fingerprint, then to no identification.
   */
  identify(request: Request): VisitorIdentification {
    // Tier 1: First-party cookie (highest accuracy, consented)
    const cookieId = this.extractVisitorCookie(request);
    if (cookieId) {
      return {
        visitorId: cookieId,
        sessionId: this.extractSessionCookie(request),
        confidence: 'high',
        method: 'cookie',
      };
    }

    // Tier 2: IP + User-Agent hash (medium accuracy)
    const ip = request.headers.get('cf-connecting-ip') ?? '';
    const userAgent = request.headers.get('user-agent') ?? '';
    if (ip && userAgent) {
      const fingerprint = this.hashFingerprint(ip, userAgent);
      return {
        visitorId: fingerprint,
        sessionId: null,
        confidence: 'medium',
        method: 'fingerprint',
      };
    }

    // Tier 3: No identification possible
    return {
      visitorId: null,
      sessionId: null,
      confidence: 'low',
      method: 'statistical',
    };
  }

  private hashFingerprint(ip: string, userAgent: string): string {
    // Hash IP and UA together - not unique, but better than nothing.
    // Add a date component so the same visitor is counted fresh each day.
    const date = new Date().toISOString().split('T')[0];
    const input = `${ip}:${userAgent}:${date}`;
    return crypto.createHash('sha256').update(input).digest('hex').slice(0, 16);
  }

  /**
   * Set visitor cookie in redirect response.
   * Only if privacy policy allows and user hasn't opted out.
   */
  setVisitorCookie(response: Response, visitorId?: string): Response {
    const id = visitorId ?? crypto.randomUUID();
    response.headers.set('Set-Cookie',
      `vid=${id}; ` +
      `Max-Age=31536000; ` + // 1 year
      `Path=/; ` +
      `Secure; ` +
      `HttpOnly; ` +
      `SameSite=Lax`
    );
    return response;
  }
}

// Privacy considerations:
// - Cookie requires user consent in GDPR regions
// - Fingerprinting may be restricted by ePrivacy regulations
// - Always honor DNT (Do Not Track) header
// - Provide opt-out mechanism in privacy policy
```

GDPR, CCPA, and ePrivacy regulations restrict tracking technologies: cookies require consent banners, and fingerprinting is increasingly restricted. Always consult legal counsel before implementing visitor tracking, and provide clear opt-out mechanisms.
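The DNT and opt-out handling mentioned in the comments can be a small guard in front of the identifier. A sketch, assuming the `VisitorIdentifier` above; the `analytics_optout` cookie name is an illustrative assumption.

```typescript
// Sketch: honor Do Not Track and explicit opt-out before identifying.
// The 'analytics_optout' cookie name is an illustrative assumption.

function identifyWithConsent(
  request: Request,
  identifier: VisitorIdentifier,
): VisitorIdentification {
  const dnt = request.headers.get('dnt');
  const cookies = request.headers.get('cookie') ?? '';
  const optedOut = cookies.includes('analytics_optout=1');

  if (dnt === '1' || optedOut) {
    // Still count the click, but attach no identity at all
    return {
      visitorId: null,
      sessionId: null,
      confidence: 'low',
      method: 'statistical',
    };
  }

  return identifier.identify(request);
}
```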
The analytics API serves dashboards and integrations. It must be fast, flexible, and protect user data.
```typescript
/**
 * Analytics API Endpoints
 */

// GET /api/v1/analytics/{shortCode}
// Get overview metrics for a short URL
interface AnalyticsOverviewResponse {
  shortCode: string;
  shortUrl: string;
  longUrl: string;

  // Summary metrics
  totalClicks: number;
  uniqueVisitors: number;

  // Period comparisons
  clicksToday: number;
  clicksYesterday: number;
  clicks7Days: number;
  clicks30Days: number;

  // Growth indicators
  dailyGrowth: number; // Percentage
  weeklyGrowth: number;

  // Metadata
  createdAt: string;
  lastClickAt: string;
}

// GET /api/v1/analytics/{shortCode}/timeseries
// Get clicks over time for charting
interface TimeseriesRequest {
  shortCode: string;
  startDate: string; // ISO date
  endDate: string;
  granularity: 'hour' | 'day' | 'week' | 'month';
  metrics: ('clicks' | 'uniqueVisitors' | 'avgResponseTime')[];
}

interface TimeseriesResponse {
  data: {
    timestamp: string;
    clicks: number;
    uniqueVisitors: number;
    avgResponseTime?: number;
  }[];
  granularity: string;
  timezone: string;
}

// GET /api/v1/analytics/{shortCode}/breakdown
// Get breakdown by dimension
interface BreakdownRequest {
  shortCode: string;
  startDate: string;
  endDate: string;
  dimension: 'country' | 'device' | 'browser' | 'os' | 'referer';
  limit?: number; // Default 10
}

interface BreakdownResponse {
  dimension: string;
  data: {
    value: string; // Country code, device name, etc.
    clicks: number;
    percentage: number;
  }[];
  total: number;
  other: number; // Clicks in items beyond limit
}

// GET /api/v1/analytics/{shortCode}/realtime
// Get live metrics (last 5 minutes)
interface RealtimeResponse {
  shortCode: string;
  clicksLastMinute: number;
  clicksLast5Minutes: number;
  clicksLastHour: number;
  activeNow: number; // Approximate concurrent viewers
  recentClicks: {
    timestamp: string;
    country: string;
    device: string;
    referer: string;
  }[]; // Last 10 clicks
}
```

For dashboard views, pre-fetch the most common queries (overview, 7-day chart, top 5 countries) in parallel. Users perceive a faster load when data appears progressively rather than waiting on a single large query.
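A minimal sketch of that progressive-loading pattern, using the endpoints above. The `fetchJson` helper, the `render` callback, and the concrete dates are illustrative assumptions.

```typescript
// Sketch: fetch the three most common dashboard queries in parallel and
// render each panel as soon as its data arrives.

async function fetchJson<T>(url: string): Promise<T> {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`${url}: ${res.status}`);
  return res.json() as Promise<T>;
}

async function loadDashboard(
  shortCode: string,
  render: (panel: string, data: unknown) => void,
): Promise<void> {
  const base = `/api/v1/analytics/${shortCode}`;

  // Fire all three requests at once; each panel renders independently
  // instead of the whole dashboard blocking on the slowest query.
  await Promise.all([
    fetchJson<AnalyticsOverviewResponse>(base)
      .then(data => render('overview', data)),
    fetchJson<TimeseriesResponse>(
      `${base}/timeseries?granularity=day&startDate=2024-01-08&endDate=2024-01-15`,
    ).then(data => render('chart', data)),
    fetchJson<BreakdownResponse>(`${base}/breakdown?dimension=country&limit=5`)
      .then(data => render('countries', data)),
  ]);
}
```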
We've built a comprehensive analytics collection and serving system. Let's consolidate the key architectural decisions:
| Component | Technology | Purpose |
|---|---|---|
| Event Emission | Async producer + Kafka | Decouple from redirect, handle bursts |
| Stream Processing | Flink / Kafka Streams | Enrich, dedupe, aggregate in real-time |
| Real-time Counters | Redis HyperLogLog | Sub-second unique counts, live dashboards |
| Time-series Storage | ClickHouse | Fast analytical queries on aggregates |
| Raw Event Storage | S3 / GCS (7-day) | Audit trail, reprocessing capability |
| API Layer | REST + caching | Serve dashboards and integrations |
You now understand how to collect, process, and serve analytics from billions of redirect events. Next, we'll explore custom short URLs—how to let users choose their own short codes while maintaining uniqueness and security.