Caching creates copies of data for faster access. But what happens when one copy changes? If process A modifies a cached file and process B reads its own cached copy, B sees stale data. If node A updates a distributed cache and node B serves the old version, users see inconsistencies.
This is the cache coherence problem—ensuring that multiple caches holding copies of the same data present a consistent view. The challenge spans from CPU caches (nanoseconds, same chip) to distributed systems (milliseconds, global network), with fundamentally different solutions at each scale.
By the end of this page, you will understand the cache coherence problem at different system levels, coherence protocols and their trade-offs, how operating systems maintain file cache coherence, and the challenges of coherence in distributed systems.
A cache system is coherent if reads return the value of the most recent write to a location (write propagation) and all caches observe writes to the same location in the same order (write serialization). The problem appears at several levels, each with its own timescale and solutions:
| Level | Caches Involved | Timescale | Primary Solution |
|---|---|---|---|
| CPU Cache | L1/L2/L3 per core | Nanoseconds | Hardware protocols (MESI) |
| Process Memory | Per-process page cache view | Microseconds | Shared page cache, mmap coherence |
| File System | Buffer/page cache across processes | Milliseconds | Kernel-managed coherence, locks |
| Distributed Cache | Nodes across network | Milliseconds-seconds | Invalidation, TTL, consensus |
Coherence mechanisms trade off three properties:
Latency: How quickly can reads complete? Checking coherence adds delay.
Bandwidth: How much communication is needed? Broadcasts scale poorly.
Staleness: How old can cached data be? Stricter freshness costs more.
No solution optimizes all three. Systems choose based on requirements—strong coherence for database caches, eventual consistency for CDNs.
Multi-core CPUs maintain coherence across per-core caches using hardware protocols. Understanding these illuminates principles that apply at higher levels.
MESI (Modified, Exclusive, Shared, Invalid) is the most common coherence protocol. Each cache line exists in one of four states:
| State | Description | Other Caches | Action on Local Read | Action on Local Write |
|---|---|---|---|---|
| Modified (M) | Dirty, only copy | Invalid | Hit, stay M | Write locally, stay M |
| Exclusive (E) | Clean, only copy | Invalid | Hit, stay E | Write, transition to M |
| Shared (S) | Clean, possibly shared | May have copies | Hit, stay S | Invalidate others, go to M |
| Invalid (I) | Not valid | N/A | Miss, fetch from memory/other cache | Fetch exclusive, invalidate others, go to M |
```c
/*
 * MESI Protocol State Machine (Conceptual)
 *
 * Shows how cache line states transition on various events
 */
#include <stdint.h>

enum mesi_state { INVALID, SHARED, EXCLUSIVE, MODIFIED };

struct cache_line {
    enum mesi_state state;
    uint64_t data;      /* simplified: one word of cached data */
    uint64_t tag;
};

/*
 * Local processor read
 */
void local_read(struct cache_line *line, uint64_t addr)
{
    switch (line->state) {
    case MODIFIED:
    case EXCLUSIVE:
    case SHARED:
        /* Hit - return data, state unchanged */
        return;
    case INVALID:
        /* Miss - need to fetch */
        if (other_cache_has_modified(addr)) {
            /* Get from other cache, both go SHARED */
            fetch_from_other_cache(line, addr);
            line->state = SHARED;
        } else if (other_cache_has(addr)) {
            /* Get from memory, go SHARED */
            fetch_from_memory(line, addr);
            line->state = SHARED;
        } else {
            /* Get from memory, go EXCLUSIVE (only copy) */
            fetch_from_memory(line, addr);
            line->state = EXCLUSIVE;
        }
        break;
    }
}

/*
 * Local processor write
 */
void local_write(struct cache_line *line, uint64_t addr, uint64_t value)
{
    switch (line->state) {
    case MODIFIED:
        /* Already own exclusive dirty copy */
        line->data = value;
        return;
    case EXCLUSIVE:
        /* Have exclusive copy, now modifying */
        line->data = value;
        line->state = MODIFIED;
        return;
    case SHARED:
        /* Must invalidate other copies first */
        broadcast_invalidate(addr);
        wait_for_acks();
        line->data = value;
        line->state = MODIFIED;
        return;
    case INVALID:
        /* Fetch with intent to modify */
        broadcast_read_exclusive(addr);
        line->data = value;
        line->state = MODIFIED;
        return;
    }
}

/*
 * Snoop: another processor wants to read
 */
void snoop_read(struct cache_line *line, uint64_t addr)
{
    if (line->tag != addr)
        return;

    switch (line->state) {
    case MODIFIED:
        /* Must supply data, write back, go SHARED */
        supply_data_to_bus(line);
        write_back_to_memory(line);
        line->state = SHARED;
        break;
    case EXCLUSIVE:
        /* Others can share, go SHARED */
        line->state = SHARED;
        break;
    case SHARED:
    case INVALID:
        /* No action needed */
        break;
    }
}

/*
 * Snoop: another processor wants exclusive access
 */
void snoop_read_exclusive(struct cache_line *line, uint64_t addr)
{
    if (line->tag != addr)
        return;

    switch (line->state) {
    case MODIFIED:
        supply_data_to_bus(line);
        /* Fall through - must invalidate */
    case EXCLUSIVE:
    case SHARED:
        line->state = INVALID;
        break;
    case INVALID:
        break;
    }
}
```

Hardware coherence succeeds because all caches sit on the same bus (or a coherence fabric) and can snoop every memory transaction. This doesn't scale to distributed systems, where communication latency prevents such tight coupling.
Operating systems must maintain coherence for file data cached in the page cache. Multiple processes may have the same file open, and changes must be visible appropriately.
On a single system, the kernel provides natural coherence because all processes share the same page cache:
```c
/*
 * File cache coherence on a single system
 *
 * The key insight: there's only ONE page cache copy
 * of each file page, shared by all processes
 */

/*
 * Process A writes to file at offset X
 */
void process_a_write(int fd, off_t offset, char *data, size_t len)
{
    /* Write goes to the page cache page for (inode, offset/PAGE_SIZE) */
    struct page *page = find_or_create_page(
        file->f_mapping, offset >> PAGE_SHIFT);

    /* Copy data into page */
    memcpy(page_address(page) + (offset % PAGE_SIZE), data, len);
    SetPageDirty(page);
    /* Page now contains new data */
}

/*
 * Process B reads from same file at offset X
 */
void process_b_read(int fd, off_t offset, char *buf, size_t len)
{
    /* Read comes from the SAME page cache page */
    struct page *page = find_get_page(
        file->f_mapping, offset >> PAGE_SHIFT);

    /* Gets data including A's modifications */
    memcpy(buf, page_address(page) + (offset % PAGE_SIZE), len);
}

/*
 * Memory-mapped coherence is also maintained
 */

/* Process A maps file */
void *a_mapping = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

/* Process B maps same file */
void *b_mapping = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

/*
 * Both mappings point to the SAME page cache pages!
 *
 * When A writes:          a_mapping[100] = 'X';
 * B immediately sees it:  assert(b_mapping[100] == 'X');
 *
 * This works because page table entries point to shared
 * page cache pages, not private copies.
 */
```

Coherence semantics differ between read()/write() and memory-mapped access:
read()/write(): Always reads from and writes to current page cache state. Coherence is automatic.
mmap(): Changes are visible immediately to other mmap() users of the same file. However, coherence with read()/write() depends on implementation—POSIX doesn't require it.
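To make the mmap() case concrete, here is a small sketch, assuming Linux: two MAP_SHARED mappings of the same file (created in one process to stand in for two processes) see each other's stores immediately, and on Linux a write() through a file descriptor lands in the same page cache pages as well.

```python
"""Sketch: MAP_SHARED mappings share page cache pages (Linux behavior)."""

import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shared.dat")
with open(path, "wb") as f:
    f.write(b"\0" * 4096)                       # one page of zeroes

fd_a = os.open(path, os.O_RDWR)                 # stand-in for "process A"
fd_b = os.open(path, os.O_RDWR)                 # stand-in for "process B"
map_a = mmap.mmap(fd_a, 4096, mmap.MAP_SHARED)
map_b = mmap.mmap(fd_b, 4096, mmap.MAP_SHARED)

map_a[100:101] = b"X"                           # store through one mapping...
assert map_b[100:101] == b"X"                   # ...immediately visible in the other

# On Linux, write() goes to the same page cache pages, so an existing
# mapping observes it too (POSIX does not guarantee this).
os.pwrite(fd_a, b"Y", 200)
assert map_b[200:201] == b"Y"

map_a.close(); map_b.close()
os.close(fd_a); os.close(fd_b)
```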
O_DIRECT bypasses the page cache, creating coherence challenges:
O_DIRECT reads/writes go directly to disk, bypassing cached data. If one process uses normal I/O (cached) and another uses O_DIRECT, they see different data. Applications using O_DIRECT (like databases) typically have exclusive access to their files to avoid this problem.
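The usual mitigation looks something like the sketch below (assuming Linux; tablespace.dat is a hypothetical, pre-existing database file): open with O_DIRECT and take an exclusive advisory lock so no other process mixes cached I/O with direct I/O on the same file.

```python
import fcntl
import os

# Hypothetical database-style open: direct I/O plus an exclusive advisory lock
fd = os.open("tablespace.dat", os.O_RDWR | os.O_DIRECT)
fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)   # fail fast if another process holds it

# All reads/writes on fd now bypass the page cache; O_DIRECT also requires
# suitably aligned buffers, offsets, and transfer sizes.
```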
When files are accessed over a network (NFS, SMB/CIFS), each client has its own cache, creating true distributed coherence challenges.
NFS uses close-to-open consistency:
```c
/*
 * NFS close-to-open consistency model
 */

/* Client A opens file, modifies, closes */
fd = open("/nfs/shared/file", O_RDWR);
write(fd, "new data", 8);
close(fd);                  /* Flushes to server */

/* Client B opens file after A's close */
fd = open("/nfs/shared/file", O_RDONLY);
/* Validates cache against server - sees A's changes */
read(fd, buf, 8);           /* Gets "new data" */

/*
 * BUT: If B already has the file open when A modifies it:
 */

/* Client B opens file */
fd_b = open("/nfs/shared/file", O_RDONLY);

/* Client A opens, modifies, closes */
fd_a = open("/nfs/shared/file", O_RDWR);
write(fd_a, "changed", 7);
close(fd_a);

/* Client B reads - may see old data from its local cache! */
read(fd_b, buf, 7);         /* Might NOT see "changed" */

/*
 * NFS attribute cache timeout controls revalidation
 *
 * mount options:
 *   actimeo=N - cache attributes for N seconds
 *   noac      - no attribute caching (strongest coherence)
 *
 * Shorter timeout = stronger coherence, higher overhead
 */
```

NFSv4 introduces delegations: the server grants a client exclusive (write) or read access to a file. While holding a delegation, the client can satisfy reads (and, with a write delegation, buffer writes) from its local cache and handle opens, closes, and lock requests without contacting the server. The server recalls the delegation when another client requests conflicting access, and the holder flushes any cached changes before the conflicting open proceeds.
Windows file sharing (SMB/CIFS) uses oplocks (opportunistic locks) for the same purpose: the server grants a client permission to cache reads, writes, and locks locally, and it breaks the oplock when another client opens the file in a conflicting way, forcing the holder to flush cached writes and stop serving data from its local cache.
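Delegations and oplocks follow the same grant-and-recall pattern. The toy model below sketches only that pattern; the classes and method names are invented for illustration, and real protocols add callbacks, timeouts, lease durations, and far more state.

```python
"""Toy model of server-granted cache rights (NFSv4 delegations, SMB oplocks)."""


class GrantingServer:
    """Authoritative server that lets one client cache the file at a time."""

    def __init__(self):
        self.holder = None          # client currently holding the grant
        self.data = b"original"     # authoritative file contents

    def open(self, client):
        # A conflicting open recalls/breaks the existing grant first
        if self.holder is not None and self.holder is not client:
            self.holder.recall(self)    # holder flushes and drops its cache
        self.holder = client
        return self.data                # client may cache this under the grant


class Client:
    def __init__(self):
        self.cached = None
        self.dirty = False

    def open_file(self, server):
        self.cached = server.open(self)     # cache contents under the grant

    def write_local(self, data):
        self.cached = data                  # buffered locally, no server round trip
        self.dirty = True

    def recall(self, server):
        if self.dirty:
            server.data = self.cached       # flush dirty data back on recall
        self.cached = None                  # stop serving from the local cache
        self.dirty = False


server = GrantingServer()
a, b = Client(), Client()
a.open_file(server)
a.write_local(b"changed")   # cached on A; server not contacted
b.open_file(server)         # conflicting open recalls A's grant first
print(server.data)          # b'changed' - A flushed before B's open completed
```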
In distributed systems (Redis clusters, CDNs, application caches), maintaining coherence across network boundaries requires different approaches than hardware or single-system solutions.
| Strategy | Mechanism | Consistency | Latency Cost | Use Case |
|---|---|---|---|---|
| Invalidation | Notify caches to drop entry | Strong on update | Update cost | Database caches |
| TTL-based | Entries expire after time limit | Eventual | Refetch on expiry | CDN, session data |
| Write-through | Update cache and store together | Strong | Slower writes | Critical data |
| Pub/Sub | Publish changes, subscribers update | Eventual | Async | Real-time updates |
"""Distributed cache coherence patterns""" class InvalidationCoherence: """ Write-invalidate: On update, invalidate all cached copies. Next read fetches fresh data from source. """ def __init__(self, cache_nodes, source): self.caches = cache_nodes # List of cache servers self.source = source # Authoritative data store def read(self, key): # Try local cache first value = self.caches[0].get(key) if value is not None: return value # Cache miss - fetch from source value = self.source.get(key) self.caches[0].set(key, value) return value def write(self, key, value): # Update source of truth self.source.set(key, value) # Invalidate ALL cache copies for cache in self.caches: cache.delete(key) # Could also use pub/sub for async invalidation # publish("invalidate", key) class TTLCoherence: """ TTL-based: Entries live for fixed time, then expire. Simple but only provides eventual consistency. """ def __init__(self, cache, source, ttl_seconds): self.cache = cache self.source = source self.ttl = ttl_seconds def read(self, key): # Try cache (includes TTL check) value = self.cache.get(key) if value is not None: return value # Expired or missing - refresh value = self.source.get(key) self.cache.set(key, value, ttl=self.ttl) return value def write(self, key, value): # Update source self.source.set(key, value) # Optionally update cache too (reduces stale reads) self.cache.set(key, value, ttl=self.ttl) # But other cache nodes will still serve stale # until their TTL expires class VersionedCoherence: """ Version-based: Each entry has version number. Readers validate version, refetch if stale. """ def read(self, key): cached = self.cache.get(key) # Check version against source (cheap metadata query) current_version = self.source.get_version(key) if cached and cached.version >= current_version: return cached.value # Stale - refetch full value value, version = self.source.get_with_version(key) self.cache.set(key, CachedEntry(value, version)) return valueThe CAP theorem states you can have at most two of: Consistency, Availability, Partition tolerance. Distributed caches typically choose AP (available even during partitions, eventual consistency) or CP (consistent but may be unavailable during partitions). Your coherence strategy should align with this choice.
Practical coherence implementation involves tracking what's cached where and efficiently propagating changes.
Maintain a directory tracking which caches hold each entry:
"""Directory-based coherence for distributed cache"""import threadingfrom typing import Set, Dict, Any class CoherenceDirectory: """ Tracks which cache nodes hold copies of each key. Enables targeted invalidation instead of broadcast. """ def __init__(self): # key -> set of cache node IDs holding it self.entries: Dict[str, Set[str]] = {} self.lock = threading.Lock() def register(self, key: str, cache_id: str): """Record that cache_id has cached key.""" with self.lock: if key not in self.entries: self.entries[key] = set() self.entries[key].add(cache_id) def unregister(self, key: str, cache_id: str): """Record that cache_id no longer has key.""" with self.lock: if key in self.entries: self.entries[key].discard(cache_id) def get_holders(self, key: str) -> Set[str]: """Get all caches holding this key.""" with self.lock: return self.entries.get(key, set()).copy() def invalidate(self, key: str, cache_nodes: dict): """Invalidate key on all holders.""" holders = self.get_holders(key) for cache_id in holders: try: cache_nodes[cache_id].delete(key) self.unregister(key, cache_id) except Exception: # Cache node unreachable - will timeout eventually pass class CoherentCache: """Cache with directory-based coherence.""" def __init__(self, cache_id, directory, source): self.cache_id = cache_id self.directory = directory self.source = source self.local_cache = {} def get(self, key): if key in self.local_cache: return self.local_cache[key] # Fetch and register value = self.source.get(key) self.local_cache[key] = value self.directory.register(key, self.cache_id) return value def set(self, key, value): self.source.set(key, value) # Invalidate all other copies self.directory.invalidate(key, all_caches)Leases combine caching with time-limited validity:
What's Next: The final page examines cache effectiveness—measuring and optimizing cache performance for real workloads.
You now understand cache coherence from hardware protocols to distributed systems. This knowledge enables you to reason about consistency guarantees, choose appropriate coherence mechanisms, and debug subtle caching bugs.