Caching represents the most aggressive form of denormalization—creating complete, pre-constructed copies of data optimized entirely for read performance. While in-table denormalization (column replication, derived columns) trades minimal write overhead for join elimination, caching trades significant complexity for orders-of-magnitude performance improvements.
Every high-scale system relies on caching. The question is never whether to cache but what, where, how long, and how to handle staleness. Mastering caching strategies is essential for any engineer building systems that serve millions of users.
By the end of this page, you will understand caching architectures, cache-aside vs. read-through patterns, time-based vs. event-based invalidation, cache stampedes and their prevention, and production patterns for maintaining cache consistency with databases.
Before diving into strategies, let's understand the caching landscape in modern systems. Caching exists at multiple levels, each with different latency, capacity, and management characteristics:
Cache Hierarchy (from the application outward toward the client):
| Cache Layer | Latency | Capacity | Consistency Challenge |
|---|---|---|---|
| In-Process (HashMap) | ~100ns | MB (heap-limited) | Multi-instance coordination |
| Redis (network) | ~500µs-1ms | GB-TB (cluster) | Cache-DB sync |
| CDN Edge | ~10-50ms (miss) | TB (distributed) | Global invalidation delay |
| Browser | 0ms (local) | MB per origin | Version/freshness signaling |
Database Caching Focus:
For database denormalization, we primarily focus on application-level caching—specifically distributed caches like Redis or Memcached that sit between the application and database:
┌──────────┐      ┌───────────────┐      ┌──────────────┐
│   App    │ ──── │  Redis Cache  │ ──── │  PostgreSQL  │
└──────────┘      └───────────────┘      └──────────────┘
     │                    │                     │
   ~50µs                ~500µs               ~5-50ms
This cache layer exists to absorb read traffic before it reaches the database. In most systems, 20% of data receives 80% of reads. Effective caching exploits this skew: cache the hot data, let the cold data flow through to the database. Cache hit rates of 95%+ are common for well-designed systems.
How your application interacts with the cache matters significantly for performance, consistency, and complexity. The major patterns are:
Cache-Aside (Lazy Loading)
The most common pattern. Application manages cache explicitly:
def get_user(user_id):
    # Try cache first
    user = cache.get(f'user:{user_id}')
    if user is not None:
        return user

    # Cache miss: load from database
    user = db.query('SELECT * FROM users WHERE id = ?', user_id)

    # Populate cache for next time
    cache.set(f'user:{user_id}', user, ttl=3600)
    return user
Pros:
- Simple to implement; only data that is actually requested gets cached
- Resilient: if the cache is down, reads fall back to the database
- Works with any cache; no special infrastructure required

Cons:
- Every miss costs three trips (cache check, database query, cache populate)
- The first request for each key is always slow (cold cache)
- Application code is responsible for keeping cache and database consistent
Pattern Selection Guide:
| Pattern | Read Performance | Write Performance | Consistency | Use Case |
|---|---|---|---|---|
| Cache-Aside | Good (after warm) | N/A (DB direct) | Eventual | Most CRUD apps |
| Read-Through | Good (after warm) | N/A (DB direct) | Eventual | Centralized loading |
| Write-Through | Good | Slower | Strong | Read-heavy, consistency-critical |
| Write-Behind | Good | Very Fast | Eventual | Write-heavy, loss-tolerant |
Recommendation: Most applications start with cache-aside for its simplicity and flexibility. Move to read-through when you have many cache access points that benefit from centralized loading. Use write-through/write-behind only when requirements demand them.
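The table references write-through without showing it, so here is a minimal sketch for contrast with cache-aside above, using the same hypothetical db and cache clients from the earlier examples: every write updates the database and then synchronously refreshes the cache, so subsequent reads are already warm.
def write_through_update_user(user_id, data):
    # Write the authoritative copy first
    db.execute('UPDATE users SET name = ? WHERE id = ?', data['name'], user_id)

    # Then synchronously refresh the cache so the next read never misses
    user = db.query('SELECT * FROM users WHERE id = ?', user_id)
    cache.set(f'user:{user_id}', user, ttl=3600)
    return user
The cost is visible in the write path: the caller waits for both the database and the cache, which is why the table lists write-through as "Slower" on writes.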
As Phil Karlton famously said: "There are only two hard things in Computer Science: cache invalidation and naming things."
Cache invalidation is the process of removing or updating cached data when the source data changes. Get it wrong, and users see stale data. Get it too aggressive, and you lose caching benefits.
TTL-Based Invalidation:
# Simple TTL - user profile cached for 1 hour
cache.set(f'user:{user_id}', user_data, ttl=3600)
Pros: Simple, self-healing (stale data eventually expires)
Cons: Users see stale data for up to the TTL duration
TTL Selection Considerations:
| Data Type | Typical TTL | Rationale |
|---|---|---|
| Static config | Hours-Days | Rarely changes; tolerate some staleness |
| User profile | 15-60 min | Balance freshness with load reduction |
| Product info | 5-15 min | Changes moderately; users expect current price |
| Real-time data | Seconds | Near-current accuracy needed |
| Session data | Hours | Tied to session lifetime |
Explicit Invalidation:
def update_user(user_id, data):
    # Update database
    db.execute('UPDATE users SET name = ? WHERE id = ?', data['name'], user_id)

    # Invalidate cache immediately
    cache.delete(f'user:{user_id}')
    # OR update cache with new data
    cache.set(f'user:{user_id}', data, ttl=3600)
The choice between delete and update:
Updating cache after DB write has a race condition: if a read fetches old DB data between your write and cache update, it may recache stale data. Delete is safer—the penalty is one cache miss, not stale data.
Event-Driven Invalidation (CDC):
PostgreSQL → Debezium (CDC) → Kafka → Cache Invalidation Service → Redis
Change Data Capture captures database writes at the transaction log level and publishes them as events. Advantages:
- Catches every change, including writes from other services, background jobs, or direct database access
- Decouples invalidation from application code, so no cache logic is scattered across write paths
- Cannot miss an invalidation because an application crashed between the DB write and the cache call
Latency is typically 50-500ms for invalidation, which is acceptable for most applications.
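A minimal sketch of the invalidation consumer at the end of that pipeline, assuming Debezium's default JSON envelope, a hypothetical topic named dbserver.public.users, and the kafka-python and redis client libraries (adjust names to your setup):
import json
import redis
from kafka import KafkaConsumer

cache = redis.Redis()
consumer = KafkaConsumer(
    'dbserver.public.users',              # hypothetical Debezium topic for the users table
    bootstrap_servers='localhost:9092',
    value_deserializer=lambda m: json.loads(m.decode('utf-8')),
)

for message in consumer:
    event = message.value
    # Debezium wraps row changes in a payload whose "after" field holds the new row image
    # ("before" for deletes); the exact shape depends on connector configuration.
    payload = event.get('payload', event)
    row = payload.get('after') or payload.get('before') or {}
    user_id = row.get('id')
    if user_id is not None:
        cache.delete(f'user:{user_id}')   # delete, not update: the next read refetches from the DB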
Version-Based Invalidation:
# Include version in key
version = db.query('SELECT cache_version FROM users WHERE id = ?', user_id)
key = f'user:{user_id}:v{version}'

def update_user(user_id, data):
    db.execute(
        'UPDATE users SET name = ?, cache_version = cache_version + 1 WHERE id = ?',
        data['name'], user_id
    )
    # No explicit cache invalidation needed - new reads use new version key
Old versions naturally expire via TTL. New reads always get new version key.
Downside: Unused cache entries accumulate (but TTL eventually clears them).
A cache stampede (or thundering herd) occurs when many requests simultaneously find the cache empty and all attempt to reload from the database. This can overwhelm the database and cause cascading failures.
Scenario:
1. A popular cache entry expires (or is evicted)
2. Thousands of concurrent requests miss the cache at the same moment
3. Each one issues the same expensive query against the database
4. The database saturates, latency spikes, and requests back up across the system

This is especially problematic for:
- Hot keys (home pages, popular products, celebrity profiles)
- Expensive queries (aggregations, large joins) where even a few concurrent reloads hurt
- Synchronized expirations (many keys written with the same TTL at the same time, or a cache flush after a deploy)
Locking Implementation:
import time

def get_user_with_lock(user_id):
    cache_key = f'user:{user_id}'
    lock_key = f'lock:user:{user_id}'

    # Try cache first
    user = cache.get(cache_key)
    if user is not None:
        return user

    # Try to acquire lock
    if cache.set(lock_key, '1', nx=True, ttl=10):  # nx=True means "if not exists"
        try:
            # We own the lock - load from DB
            user = db.query_user(user_id)
            cache.set(cache_key, user, ttl=3600)
            return user
        finally:
            cache.delete(lock_key)
    else:
        # Another request is loading - wait and retry
        time.sleep(0.05)  # Short sleep
        return get_user_with_lock(user_id)  # Retry
Probabilistic Early Expiration (XFetch):
import math
import random

def get_with_early_expiration(cache_key, ttl, beta=1.0):
    result = cache.get_with_meta(cache_key)  # Returns value + remaining TTL
    if result is None:
        return refresh_and_cache(cache_key, ttl)

    value, remaining_ttl = result

    # Probabilistically refresh before expiration
    # Higher beta = earlier refresh; log(random) is always negative
    delta = ttl * beta * math.log(random.random())
    should_refresh = remaining_ttl + delta < 0

    if should_refresh:
        # Refresh in background, return current value
        async_refresh(cache_key)

    return value
This algorithm makes entries "feel" older probabilistically, spreading refreshes over time instead of having all expire simultaneously.
Stale-While-Revalidate:
def get_with_swr(cache_key):
    result = cache.get_with_meta(cache_key)
    if result is None:
        return refresh_and_cache(cache_key)

    value, remaining_ttl, is_stale = result

    if is_stale:
        # Return stale immediately, refresh async
        trigger_background_refresh(cache_key)

    return value  # Always return immediately (stale or fresh)
HTTP caching formalizes this with Cache-Control: stale-while-revalidate=<seconds>.
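At the HTTP layer this is just a response header. A minimal sketch, assuming a Flask app and a hypothetical load_product helper (the 60s/300s values are illustrative):
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/products/<int:product_id>')
def product(product_id):
    resp = jsonify(load_product(product_id))  # load_product is a hypothetical loader
    # Clients and CDNs may reuse this response for 60s, and serve it stale for up to
    # another 300s while revalidating in the background.
    resp.headers['Cache-Control'] = 'max-age=60, stale-while-revalidate=300'
    return resp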
For predictably hot data, proactively populate caches before load arrives. Example: Before a sale event, pre-cache all sale products. This eliminates the cold-cache stampede entirely.
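A minimal warming-job sketch, assuming the same hypothetical db and cache helpers used above and run from a cron job or deploy hook before traffic arrives:
def warm_sale_products():
    # Load every product in the upcoming sale in one pass
    products = db.query('SELECT * FROM products WHERE on_sale = true')
    for product in products:
        # Pre-populate with a TTL long enough to cover the sale window
        cache.set(f"product:{product['id']}", product, ttl=6 * 3600)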
Beyond individual cache access patterns, the overall architecture of your caching layer matters significantly:
Multi-Tier Cache Implementation:
import redis

class TieredCache:
    def __init__(self):
        self.l1 = {}              # In-process (dict with LRU eviction in production)
        self.l2 = redis.Redis()   # Distributed cache

    def get(self, key):
        # Check L1 first (microseconds)
        if key in self.l1:
            return self.l1[key]

        # Check L2 (sub-millisecond)
        value = self.l2.get(key)
        if value:
            self.l1[key] = value  # Promote to L1
            return value

        return None  # Cache miss

    def set(self, key, value, ttl):
        self.l2.set(key, value, ex=ttl)
        self.l1[key] = value

    def invalidate(self, key):
        self.l2.delete(key)
        self.l1.pop(key, None)            # Remove from local cache
        self.broadcast_invalidation(key)  # Tell other app instances
L1 Invalidation Challenge:
With multiple application instances, each has its own L1 cache. When data changes, all L1 caches must be invalidated. Common approaches:
- Broadcast invalidations over a pub/sub channel (what broadcast_invalidation above hints at; see the sketch below)
- Keep L1 TTLs very short (seconds) so stale entries age out quickly
- Use versioned keys so stale L1 entries are simply never read again
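A minimal pub/sub sketch of that broadcast using redis-py; each instance publishes its own invalidations and subscribes to everyone else's (channel name and local_l1 are assumptions for illustration):
import redis

r = redis.Redis()
local_l1 = {}  # this instance's in-process cache

def broadcast_invalidation(key):
    r.publish('cache-invalidation', key)

def _handle_invalidation(message):
    # Drop the key from our local L1 when any instance invalidates it
    local_l1.pop(message['data'].decode(), None)

pubsub = r.pubsub(ignore_subscribe_messages=True)
pubsub.subscribe(**{'cache-invalidation': _handle_invalidation})
listener_thread = pubsub.run_in_thread(sleep_time=0.1, daemon=True)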
Cache Sharding:
For very large datasets, a single Redis instance may not suffice. Options:
- Redis Cluster, which partitions the keyspace across nodes using hash slots
- Client-side sharding, where the application hashes each key to pick a shard
- A sharding proxy (e.g., Twemproxy or Envoy) in front of multiple cache nodes
Sharding key selection matters:
# Good: Distribute across shards
key = f'user:{user_id}' # user_id has good distribution
# Bad: Hot shard
key = 'popular:products'  # one key holds the whole popular list, so every request hits the same shard
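A minimal client-side sharding sketch (simple modulo placement, which reshuffles keys when the shard count changes; Redis Cluster or consistent hashing avoids that; the ports are hypothetical):
import zlib
import redis

# Hypothetical shard set: three Redis instances on different ports
SHARDS = [redis.Redis(port=p) for p in (6379, 6380, 6381)]

def shard_for(key):
    # Stable hash of the full key; Python's built-in hash() is randomized per process
    return SHARDS[zlib.crc32(key.encode()) % len(SHARDS)]

# Reads and writes for a given key always land on the same shard
shard_for('user:12345').get('user:12345')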
Production caches require monitoring: hit rate, memory usage, eviction rate, latency percentiles. A drop in hit rate often indicates workload change or cache misconfiguration. Set alerts for hit rate thresholds.
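For Redis specifically, the server already counts hits and misses; a minimal sketch of computing hit rate from INFO stats via redis-py:
import redis

r = redis.Redis()

def redis_hit_rate():
    stats = r.info('stats')
    hits = stats['keyspace_hits']
    misses = stats['keyspace_misses']
    total = hits + misses
    return hits / total if total else None  # None until the cache has served traffic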
Cache key design significantly impacts cache effectiveness, debuggability, and invalidation granularity. Poor key design leads to cache collisions, unnecessary misses, and maintenance headaches.
Key Design Principles:
- Namespace by entity type: user:123, product:456, session:abc. Prevents collisions and aids debugging.
- Keep parameter order consistent: always product:123:region:us, not sometimes product:123:us.

Examples of Good and Bad Keys:
# Good: Clear namespace, consistent structure
'user:profile:12345'
'product:listing:67890:region:us'
'order:history:12345:page:1'
# Bad: Ambiguous, collision-prone
'12345' # What entity? Collision between user 12345 and product 12345
'data' # Which data?
# Bad: Non-deterministic
f'user:{user_id}:{datetime.now()}' # Timestamp causes cache miss every time
# Bad: Missing parameter
'search:results:laptop' # Missing: page, sort order, filters - cache collision
# Good: Complete parameters, sorted
'search:results:q:laptop:page:1:sort:price_asc:brand:dell'
Versioned Keys for Schema Changes:
When cached data structure changes, old cache entries become incompatible:
# v1: Cached user had {name, email}
# v2: Cached user now has {name, email, avatar_url}
# Include version in key
key = f'user:v2:{user_id}'
On schema change, increment version. Old keys naturally expire.
Hash-Based Keys for Complex Queries:
For complex queries with many parameters, use hash:
import hashlib
import json

def cache_key(query_params):
    # Sort for determinism
    normalized = json.dumps(query_params, sort_keys=True)
    param_hash = hashlib.md5(normalized.encode()).hexdigest()[:16]

    # Readable debug entry stored separately, so the hash can be mapped back to its parameters
    cache.set(f'debug:search:{param_hash}', normalized)

    return f'search:results:{param_hash}'
Document your cache key schema in a central location. When debugging production issues, engineers must quickly understand what key pattern maps to what data—this is often the first step in diagnosing caching bugs.
Maintaining consistency between cache and database is the fundamental challenge of caching. Several patterns exist, each with different guarantees:
Pattern Spectrum:
Eventual ◄─────────────────────────────────────────────► Strong

TTL-only        Event-driven        Write-through        Distributed
                invalidation        sync                 transactions
| Pattern | Consistency | Write Latency | Complexity | Use Case |
|---|---|---|---|---|
| TTL-Only | Eventual (TTL bound) | Low | Simple | Static/semi-static data |
| TTL + Invalidation | Eventual (seconds) | Low-Medium | Medium | Most CRUD applications |
| Write-Through | Strong | High | Medium | Read-your-writes required |
| Cache Transactions | Strong | Very High | Very High | Rarely used (Lua scripts) |
The Read-After-Write Problem:
Common consistency issue: user updates data, then immediately reads and sees old value.
T1: Write to DB
T2: Invalidate cache (async)
T3: Read hits old cache (invalidation not yet processed)
Solutions:

1. Synchronous invalidation: delete the cache entry before returning from the write.

def update_user(user_id, data):
    db.update(user_id, data)
    cache.delete(f'user:{user_id}')  # Wait for completion

2. Write-through style update: put the new value in the cache as part of the write.

def update_user(user_id, data):
    new_data = db.update(user_id, data)
    cache.set(f'user:{user_id}', new_data)  # Update, not delete
Session Affinity:
Route a user's reads to the instance (or primary) that handled their write for a short window, so they see their own changes even while caches elsewhere are still stale.

Read-Your-Writes Token:
After a write, hand the client a version token (e.g., the new cache_version or a timestamp). Subsequent reads carrying the token reject cached entries older than the token, or bypass the cache entirely, until the cache has caught up. A sketch follows below.
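A minimal sketch of the token approach, reusing the cache_version column from the version-based invalidation example and assuming the cached user row includes that column (db and cache are the same hypothetical helpers used throughout):
def update_user(user_id, data):
    db.execute(
        'UPDATE users SET name = ?, cache_version = cache_version + 1 WHERE id = ?',
        data['name'], user_id
    )
    # Return the new version to the client as its read-your-writes token
    return db.query('SELECT cache_version FROM users WHERE id = ?', user_id)

def get_user(user_id, min_version=None):
    cached = cache.get(f'user:{user_id}')
    # Serve from cache only if it is at least as new as the client's token
    if cached is not None and (min_version is None or cached['cache_version'] >= min_version):
        return cached

    user = db.query('SELECT * FROM users WHERE id = ?', user_id)
    cache.set(f'user:{user_id}', user, ttl=3600)
    return user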
The Double-Write Problem:
T1: Write to DB
-- SYSTEM CRASH (before the cache update) --
Result: DB has new value, cache has the old value until TTL or invalidation. Recoverable.
OR:
T1: Update cache
-- SYSTEM CRASH (before the DB write) --
Result: DB has old value, cache has new value. INCONSISTENT!
Rule: Always write to the authoritative source (database) first. Cache is derivative.
When data can be modified by external systems (background jobs, other services, direct DB access), prefer cache deletion over cache updates. You may not have the complete new value, and deleting ensures the next read fetches from the authoritative source.
We've covered the essential caching strategies that complement in-table denormalization: access patterns (cache-aside, read-through, write-through, write-behind), invalidation approaches (TTL, explicit, event-driven CDC, versioned keys), stampede prevention (locking, probabilistic early expiration, stale-while-revalidate, warming), tiered cache architectures, key design, and cache-database consistency patterns.
What's Next:
Now that you understand caching strategies, we'll explore materialized views—database-native denormalization that offers a middle ground between in-table denormalization and external caching. Materialized views provide query acceleration with database-managed consistency.
You now have comprehensive knowledge of caching strategies as a denormalization approach. These patterns are essential for any high-traffic system and complement the in-table patterns covered earlier.