Edge computing promises low latency and local processing—but computing requires data, and data at the edge is fundamentally harder than data in a centralized cloud.
When your data lives in one place, you have one source of truth. When your data spans 200+ edge locations, each with potential connectivity issues, varying capacities, and independent state, you've entered a realm of distributed systems challenges that centralized architectures never face.
Edge data challenges are the hard problems that determine whether your edge architecture succeeds or becomes an operational nightmare. This final page addresses these challenges head-on: cache coherence, data sovereignty, storage limitations, and the patterns that production edge systems use to navigate this complexity.
By the end of this page, you will understand the fundamental challenges of managing state at the edge, specific technical problems like cache invalidation and data sovereignty, edge storage options and their trade-offs, patterns for handling edge data effectively, and strategies for testing and operating edge data systems.
Edge data systems face a fundamental tension: data wants to be centralized for consistency, but latency demands it be distributed. This tension manifests in several core challenges:
The Scale of the Problem:
Consider a global edge deployment serving a catalog of 10 million items, averaging 100 KB each, from 200 edge locations:

- Full replication would require: 10M items × 100KB × 200 locations = 200 TB of distributed storage
- Full propagation (at 100ms per item per location): 10M × 200 × 0.1s = 200 million seconds of update work per full cycle
Clearly, naive approaches don't scale. Edge data architecture must be strategic about what lives where and how it stays synchronized.
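The back-of-envelope arithmetic above can be verified in a few lines, using the illustrative deployment figures from the text:

```javascript
// Back-of-envelope cost of naive full replication (figures from the text)
const items = 10_000_000;     // catalog size
const itemBytes = 100 * 1000; // 100 KB per item
const locations = 200;        // edge locations

// Storage if every location holds a full copy of every item
const totalBytes = items * itemBytes * locations;
console.log(totalBytes / 1e12); // TB of distributed storage

// Sequential propagation time at 100 ms per item per location
const propagationSeconds = items * locations * 0.1;
console.log(propagationSeconds / 1e6); // millions of seconds per update cycle
```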
Traditional database knowledge—ACID transactions, normalized schemas, SQL queries—doesn't directly apply at the edge. Edge data systems use different primitives: key-value stores, eventually consistent replication, CRDTs, and cache hierarchies. Treat edge data as its own discipline, not an extension of traditional databases.
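As a concrete taste of one of these primitives, here is a minimal sketch of a grow-only counter CRDT. Each edge location increments only its own slot, and merging takes the per-slot maximum, so replicas converge regardless of message order or duplication. The function names are illustrative, not any platform's API:

```javascript
// Minimal grow-only counter (G-counter) CRDT sketch
function gcounterNew() {
  return {}; // map of locationId -> local count
}

function gcounterIncrement(counter, locationId, amount = 1) {
  // Each location only ever increments its own slot
  return { ...counter, [locationId]: (counter[locationId] || 0) + amount };
}

function gcounterMerge(a, b) {
  // Per-slot max: merging is commutative, associative, and idempotent
  const merged = { ...a };
  for (const [loc, n] of Object.entries(b)) {
    merged[loc] = Math.max(merged[loc] || 0, n);
  }
  return merged;
}

function gcounterValue(counter) {
  return Object.values(counter).reduce((sum, n) => sum + n, 0);
}
```

Two locations can count independently while disconnected; merging their states in either order yields the same total, which is exactly the property that makes CRDTs useful for eventually consistent edge replication.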
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
At the edge, cache invalidation isn't just hard—it's the defining data management challenge. When does cached data become stale? How do you update 200+ locations simultaneously? What happens during propagation?
Version URLs embed a content version in the path or query string (e.g., `/api/products?v=a3f2b`). When the content changes, the URL changes, automatically bypassing any old cached copy. This works well for static content but is impractical for dynamic data.

| Strategy | Staleness Window | Complexity | Best For |
|---|---|---|---|
| TTL-based | Up to TTL duration | Low | Reference data, static content |
| Event-based | Seconds (propagation delay) | High | Frequently changing hot data |
| Purge API | API call + propagation | Medium | Critical data requiring immediate update |
| Version URLs | None (new URL) | Medium | Static assets, versioned content |
| Stale-while-revalidate | Background refresh duration | Low | High-traffic pages, personalization |
The Propagation Problem:
Even with instant invalidation at origin, propagation to all edge locations takes time. During this window, different locations serve different versions of the same data, and a client whose requests land on two different locations may see an update appear and then seemingly vanish.
One common mitigation is to serve stale content while revalidating in the background, bounded by a hard TTL:
```javascript
// Edge worker with stale-while-revalidate and TTL fallback
export default {
  async fetch(request, env, ctx) {
    const cacheKey = new Request(request.url, request);
    const cache = caches.default;

    // Check cache first
    let response = await cache.match(cacheKey);
    if (response) {
      // Check if within SWR window (custom header)
      const cachedAt = response.headers.get('X-Cached-At');
      const maxAge = 300;  // 5 minutes hard TTL
      const swrWindow = 60; // Revalidate if older than 1 minute
      const age = (Date.now() - new Date(cachedAt).getTime()) / 1000;

      if (age > maxAge) {
        // Hard expired - must fetch fresh
        response = null;
      } else if (age > swrWindow) {
        // In SWR window - return stale, revalidate in background
        ctx.waitUntil(revalidateCache(request, env, cache, cacheKey));
        return addCacheHeaders(response, 'HIT-STALE');
      } else {
        // Fresh hit
        return addCacheHeaders(response, 'HIT');
      }
    }

    // Cache miss or expired - fetch from origin
    const originResponse = await fetchFromOrigin(request, env);

    // Cache the response
    const responseToCache = originResponse.clone();
    ctx.waitUntil(cacheResponse(cache, cacheKey, responseToCache));

    return addCacheHeaders(originResponse, 'MISS');
  }
};

async function revalidateCache(request, env, cache, cacheKey) {
  try {
    const freshResponse = await fetchFromOrigin(request, env);
    await cacheResponse(cache, cacheKey, freshResponse);
    console.log(`Revalidated: ${request.url}`);
  } catch (error) {
    console.error(`Revalidation failed: ${error.message}`);
    // Continue serving stale - don't fail silently
  }
}

async function cacheResponse(cache, cacheKey, response) {
  const headers = new Headers(response.headers);
  headers.set('X-Cached-At', new Date().toISOString());
  headers.set('Cache-Control', 'public, max-age=300');
  const cachedResponse = new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers
  });
  await cache.put(cacheKey, cachedResponse);
}

function addCacheHeaders(response, status) {
  const newResponse = new Response(response.body, response);
  newResponse.headers.set('X-Cache-Status', status);
  return newResponse;
}

async function fetchFromOrigin(request, env) {
  // Origin fetch with a 5-second timeout
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), 5000);
  try {
    return await fetch(request, { signal: controller.signal });
  } finally {
    clearTimeout(timeout);
  }
}
```

Edge computing doesn't exist in a legal vacuum. Data sovereignty laws—GDPR, CCPA, data residency requirements—impose hard constraints on where data can physically reside and be processed. Edge architecture must account for these realities.
Architectural Patterns for Data Sovereignty:
This section provides patterns, not legal advice. Data sovereignty regulations are complex, vary by jurisdiction, and change frequently. Work with legal counsel to determine specific requirements for your application. Document your edge data processing in your privacy policy and data processing agreements.
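One common pattern is geo-fencing: route requests so that regulated data is processed and stored only in its home region. The sketch below is illustrative, not legal guidance; the country list and origin URLs are hypothetical, and in a Cloudflare Worker the country code would typically come from `request.cf.country`:

```javascript
// Sketch: geo-fencing pattern - keep EU users' data on an EU-resident origin.
// Country list and origin URLs are hypothetical placeholders.
const EU_COUNTRIES = new Set(['DE', 'FR', 'IE', 'NL', 'ES', 'IT', 'PL', 'SE']);

function originFor(countryCode) {
  // Process and store personal data in the region it was collected in
  return EU_COUNTRIES.has(countryCode)
    ? 'https://eu.origin.example.com'
    : 'https://global.origin.example.com';
}

console.log(originFor('DE')); // EU-resident origin
console.log(originFor('US')); // default global origin
```

The same routing function can also gate caching: data subject to residency rules can be marked uncacheable at edge locations outside its home region.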
Different edge platforms offer different storage primitives. Understanding these options is essential for designing edge data architecture:
| Storage Type | Consistency | Latency (read) | Capacity | Cost Model |
|---|---|---|---|---|
| Workers KV | Eventually consistent | <50ms global | 25GB-unlimited | per operation |
| Durable Objects | Strongly consistent | Variable (affinity) | 128KB-unlimited per object | per operation + duration |
| Cache API | None (per-location) | <1ms local | Limited per location | included |
| DynamoDB Global | Eventually or strong (config) | Single-digit ms in region | Unlimited | per request + storage |
| Redis (ElastiCache) | Single-master or CRDT | Sub-ms in region | Node-limited | per hour + transfer |
Production edge systems typically combine multiple storage tiers. Example: Cache API for response caching (fastest, ephemeral), Workers KV for reference data (fast reads, eventual consistency), Durable Objects for coordination (strong consistency, higher latency), and origin database for writes and complex queries. Design your data architecture to place each data type at the appropriate tier.
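One way to make that tiering explicit is a small routing function. The tier names and data-shape flags below are illustrative, not a platform API:

```javascript
// Sketch: route each data type to the appropriate storage tier
// (flags and tier names are illustrative, mirroring the tiers described above)
function tierFor(data) {
  if (data.ephemeral) return 'cache-api';              // fastest, per-location, ephemeral
  if (data.readMostly) return 'workers-kv';            // global reads, eventual consistency
  if (data.needsCoordination) return 'durable-object'; // strong consistency, higher latency
  return 'origin-db';                                  // writes and complex queries
}

console.log(tierFor({ ephemeral: true }));          // 'cache-api'
console.log(tierFor({ readMostly: true }));         // 'workers-kv'
console.log(tierFor({ needsCoordination: true }));  // 'durable-object'
console.log(tierFor({}));                           // 'origin-db'
```

Encoding the decision in one place keeps the tiering policy reviewable and testable as the data model grows.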
Reads at edge are well-understood (cache + replicate). Writes introduce complexity: how do you accept user mutations at edge while maintaining data integrity at origin?
```javascript
// Asynchronous write-behind with at-least-once delivery
export default {
  async fetch(request, env, ctx) {
    if (request.method !== 'POST') {
      return fetch(request); // Non-writes pass through
    }

    const body = await request.json();

    // Validate write at edge
    const validation = validateWrite(body);
    if (!validation.valid) {
      return new Response(JSON.stringify({ error: validation.error }), {
        status: 400,
        headers: { 'Content-Type': 'application/json' }
      });
    }

    // Generate unique write ID for idempotency
    const writeId = crypto.randomUUID();
    const writeRecord = {
      id: writeId,
      payload: body,
      timestamp: Date.now(),
      retryCount: 0
    };

    // Store in durable write queue
    const writeQueue = env.WRITE_QUEUE.get(
      env.WRITE_QUEUE.idFromName('global')
    );
    await writeQueue.fetch(new Request('https://queue/enqueue', {
      method: 'POST',
      body: JSON.stringify(writeRecord)
    }));

    // Acknowledge immediately
    return new Response(JSON.stringify({
      accepted: true,
      writeId: writeId,
      note: 'Write queued for processing'
    }), {
      status: 202, // Accepted
      headers: { 'Content-Type': 'application/json' }
    });
  }
};

// Durable Object for write queue with retry
export class WriteQueue {
  constructor(state, env) {
    this.state = state;
    this.env = env;
    this.queue = [];
  }

  async fetch(request) {
    const url = new URL(request.url);
    if (url.pathname === '/enqueue') {
      const write = await request.json();
      this.queue.push(write);
      await this.state.storage.put('queue', this.queue);
      // Schedule processing if not already running
      this.scheduleProcess();
      return new Response('OK');
    }
    return new Response('Not Found', { status: 404 });
  }

  async scheduleProcess() {
    // Process queue in background
    while (this.queue.length > 0) {
      const write = this.queue[0];
      try {
        await this.sendToOrigin(write);
        this.queue.shift(); // Remove on success
        await this.state.storage.put('queue', this.queue);
      } catch (error) {
        // Retry with exponential backoff
        write.retryCount++;
        if (write.retryCount > 5) {
          // Move to dead letter queue
          await this.deadLetter(write, error);
          this.queue.shift();
        } else {
          // Wait before retry
          await new Promise(r => setTimeout(r,
            Math.min(1000 * Math.pow(2, write.retryCount), 30000)
          ));
        }
      }
    }
  }

  async sendToOrigin(write) {
    const response = await fetch(this.env.ORIGIN_URL, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-Idempotency-Key': write.id
      },
      body: JSON.stringify(write.payload)
    });
    if (!response.ok) {
      throw new Error(`Origin returned ${response.status}`);
    }
  }

  async deadLetter(write, error) {
    // Log failed writes for manual investigation
    console.error('Write failed permanently:', write.id, error);
    // Could also send to a dead letter storage
  }
}

function validateWrite(body) {
  if (!body.type || !body.data) {
    return { valid: false, error: 'Missing required fields' };
  }
  // Additional validation...
  return { valid: true };
}
```

Edge systems introduce failure modes that don't exist in centralized architectures. Your data strategy must account for these scenarios:
Resilience Strategies:
| Failure | Detection | Response | Recovery |
|---|---|---|---|
| Origin unreachable | Fetch timeouts, health checks | Serve stale, extend TTL, fail writes | Resume normal on origin recovery |
| Edge location down | Anycast auto-reroute | Traffic goes to next-nearest edge | Automatic when PoP recovers |
| Data propagation delay | Version checks, lag monitoring | Serve with staleness warning, or fail | Wait for propagation completion |
| Cache corruption | Validation checks, checksums | Invalidate entry, fetch fresh | Investigate source, purge affected |
| Write queue overflow | Queue depth monitoring | Reject new writes, alert ops | Drain queue, increase capacity |
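The "origin unreachable" row deserves a concrete shape. Below is a minimal sketch of the serve-stale fallback, reduced to a pure function so the control flow is visible; the function name and return shape are illustrative:

```javascript
// Sketch: serve stale content and extend its lifetime when the origin fails
async function fetchWithStaleFallback(fetchOrigin, cachedBody) {
  try {
    // Happy path: origin responds, serve fresh content
    return { body: await fetchOrigin(), status: 'FRESH' };
  } catch (err) {
    if (cachedBody !== null && cachedBody !== undefined) {
      // Origin down: fall back to the stale copy rather than failing reads
      return { body: cachedBody, status: 'STALE-EXTENDED' };
    }
    throw err; // No cached copy either: surface the failure
  }
}
```

Reads degrade gracefully to stale data, while writes (which have no safe stale equivalent) should fail fast or queue, as in the write-behind pattern above.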
You cannot reason your way to correct failure handling—you must test it. Conduct regular chaos engineering: kill edge locations, block origin connectivity, corrupt cache entries, flood write queues. Verify your resilience strategies work as designed before production incidents reveal gaps.
Observability for edge data systems requires different approaches than centralized systems. You can't simply inspect one database—you're monitoring 200+ independent data stores with varying states.
```javascript
// Adding observability to edge data operations
export default {
  async fetch(request, env, ctx) {
    const startTime = Date.now();
    const colo = request.cf?.colo || 'unknown';
    const metrics = {
      colo,
      cacheStatus: 'MISS',
      dataVersion: null,
      staleness: null,
      originLatency: null
    };

    // Check cache with version tracking
    const cache = caches.default;
    const cacheKey = new Request(request.url, request);
    let response = await cache.match(cacheKey);

    if (response) {
      metrics.cacheStatus = 'HIT';
      metrics.dataVersion = response.headers.get('X-Data-Version');
      const cachedAt = response.headers.get('X-Cached-At');
      if (cachedAt) {
        metrics.staleness = Date.now() - new Date(cachedAt).getTime();
      }
    } else {
      // Fetch from origin
      const originStart = Date.now();
      response = await fetch(request);
      metrics.originLatency = Date.now() - originStart;
      metrics.dataVersion = response.headers.get('X-Data-Version');

      // Cache response
      const responseToCache = response.clone();
      ctx.waitUntil(cacheWithHeaders(cache, cacheKey, responseToCache));
    }

    // Emit metrics
    ctx.waitUntil(emitMetrics(env, metrics, startTime));

    // Add observability headers to response
    const finalResponse = new Response(response.body, response);
    finalResponse.headers.set('X-Edge-Colo', colo);
    finalResponse.headers.set('X-Cache-Status', metrics.cacheStatus);
    if (metrics.staleness) {
      finalResponse.headers.set('X-Data-Age-Ms', String(metrics.staleness));
    }
    return finalResponse;
  }
};

async function cacheWithHeaders(cache, key, response) {
  const headers = new Headers(response.headers);
  headers.set('X-Cached-At', new Date().toISOString());
  await cache.put(key, new Response(response.body, {
    status: response.status,
    headers
  }));
}

async function emitMetrics(env, metrics, startTime) {
  const totalLatency = Date.now() - startTime;

  // Emit to analytics (e.g., Cloudflare Analytics Engine or external service)
  const data = {
    timestamp: Date.now(),
    colo: metrics.colo,
    cache_status: metrics.cacheStatus,
    data_version: metrics.dataVersion,
    staleness_ms: metrics.staleness,
    origin_latency_ms: metrics.originLatency,
    total_latency_ms: totalLatency
  };

  // Sample 1% for detailed logging
  if (Math.random() < 0.01) {
    console.log('Edge data metrics:', JSON.stringify(data));
  }

  // Push to analytics endpoint (fire-and-forget)
  if (env.ANALYTICS_URL) {
    await fetch(env.ANALYTICS_URL, {
      method: 'POST',
      body: JSON.stringify(data)
    }).catch(() => {}); // Don't fail on analytics error
  }
}
```

We've explored the hard problems of managing data at the edge—from cache invalidation to data sovereignty, from write handling to failure modes.
Module Complete:
You've now completed the Edge Computing module. From the fundamentals of why edge exists (physics of latency) through edge function platforms, use cases, edge vs. origin decisions, and finally the hardest part—managing data at the edge—you have a comprehensive foundation for designing and operating edge-enabled systems.
Edge computing is a powerful tool, but not a universal solution. Apply it where latency, bandwidth, or compliance demands it. Start simple, measure impact, and expand edge capabilities progressively as you prove value and build operational expertise.
Congratulations! You've mastered the fundamentals of edge computing: what it is, when to use it, how to implement edge functions, where edge provides value, how to partition workloads, and how to navigate edge data challenges. You're now equipped to design, implement, and operate edge-enabled systems that deliver ultra-low latency while maintaining data integrity and operational excellence.