Caching creates copies of data for faster access. But what happens when one copy changes? If process A modifies a cached file and process B reads its own cached copy, B sees stale data. If node A updates a distributed cache and node B serves the old version, users see inconsistencies.
This is the cache coherence problem—ensuring that multiple caches holding copies of the same data present a consistent view. The challenge spans from CPU caches (nanoseconds, same chip) to distributed systems (milliseconds, global network), with fundamentally different solutions at each scale.
By the end of this page, you will understand the cache coherence problem at different system levels, coherence protocols and their trade-offs, how operating systems maintain file cache coherence, and the challenges of coherence in distributed systems.
A cache system is coherent if reads return the value of the most recent write to a location (write propagation) and all caches observe writes to the same location in the same order (write serialization). The problem appears at several levels, each with its own timescale and solutions:
| Level | Caches Involved | Timescale | Primary Solution |
|---|---|---|---|
| CPU Cache | L1/L2/L3 per core | Nanoseconds | Hardware protocols (MESI) |
| Process Memory | Per-process page cache view | Microseconds | Shared page cache, mmap coherence |
| File System | Buffer/page cache across processes | Milliseconds | Kernel-managed coherence, locks |
| Distributed Cache | Nodes across network | Milliseconds-seconds | Invalidation, TTL, consensus |
Coherence mechanisms trade off three properties:
Latency: How quickly can reads complete? Checking coherence adds delay.
Bandwidth: How much communication is needed? Broadcasts scale poorly.
Staleness: How old can cached data be? Stricter freshness costs more.
No solution optimizes all three. Systems choose based on requirements—strong coherence for database caches, eventual consistency for CDNs.
Multi-core CPUs maintain coherence across per-core caches using hardware protocols. Understanding these illuminates principles that apply at higher levels.
MESI (Modified, Exclusive, Shared, Invalid) is the most common coherence protocol. Each cache line exists in one of four states:
| State | Description | Other Caches | Action on Local Read | Action on Local Write |
|---|---|---|---|---|
| Modified (M) | Dirty, only copy | Invalid | Hit, stay M | Write locally, stay M |
| Exclusive (E) | Clean, only copy | Invalid | Hit, stay E | Write, transition to M |
| Shared (S) | Clean, possibly shared | May have copies | Hit, stay S | Invalidate others, go to M |
| Invalid (I) | Not valid | N/A | Miss, fetch from memory/other cache | Fetch exclusive, invalidate others, go to M |
```c
/*
 * MESI Protocol State Machine (Conceptual)
 *
 * Shows how cache line states transition on various events
 */
#include <stdint.h>

enum mesi_state { INVALID, SHARED, EXCLUSIVE, MODIFIED };

struct cache_line {
    enum mesi_state state;
    uint64_t data;      /* simplified: one word of cached data */
    uint64_t tag;
};

/*
 * Local processor read
 */
void local_read(struct cache_line *line, uint64_t addr)
{
    switch (line->state) {
    case MODIFIED:
    case EXCLUSIVE:
    case SHARED:
        /* Hit - return data, state unchanged */
        return;
    case INVALID:
        /* Miss - need to fetch */
        if (other_cache_has_modified(addr)) {
            /* Get from other cache, both go SHARED */
            fetch_from_other_cache(line, addr);
            line->state = SHARED;
        } else if (other_cache_has(addr)) {
            /* Get from memory, go SHARED */
            fetch_from_memory(line, addr);
            line->state = SHARED;
        } else {
            /* Get from memory, go EXCLUSIVE (only copy) */
            fetch_from_memory(line, addr);
            line->state = EXCLUSIVE;
        }
        break;
    }
}

/*
 * Local processor write
 */
void local_write(struct cache_line *line, uint64_t addr, uint64_t value)
{
    switch (line->state) {
    case MODIFIED:
        /* Already own exclusive dirty copy */
        line->data = value;
        return;
    case EXCLUSIVE:
        /* Have exclusive copy, now modifying */
        line->data = value;
        line->state = MODIFIED;
        return;
    case SHARED:
        /* Must invalidate other copies first */
        broadcast_invalidate(addr);
        wait_for_acks();
        line->data = value;
        line->state = MODIFIED;
        return;
    case INVALID:
        /* Fetch with intent to modify */
        broadcast_read_exclusive(addr);
        line->data = value;
        line->state = MODIFIED;
        return;
    }
}

/*
 * Snoop: another processor wants to read
 */
void snoop_read(struct cache_line *line, uint64_t addr)
{
    if (line->tag != addr)
        return;

    switch (line->state) {
    case MODIFIED:
        /* Must supply data, write back, go SHARED */
        supply_data_to_bus(line);
        write_back_to_memory(line);
        line->state = SHARED;
        break;
    case EXCLUSIVE:
        /* Others can share, go SHARED */
        line->state = SHARED;
        break;
    case SHARED:
    case INVALID:
        /* No action needed */
        break;
    }
}

/*
 * Snoop: another processor wants exclusive access
 */
void snoop_read_exclusive(struct cache_line *line, uint64_t addr)
{
    if (line->tag != addr)
        return;

    switch (line->state) {
    case MODIFIED:
        supply_data_to_bus(line);
        /* Fall through - must invalidate */
    case EXCLUSIVE:
    case SHARED:
        line->state = INVALID;
        break;
    case INVALID:
        break;
    }
}
```

Hardware coherence succeeds because all caches sit on the same bus (or a coherence fabric) and can snoop every memory transaction. This doesn't scale to distributed systems, where communication latency prevents such tight coupling.
Operating systems must maintain coherence for file data cached in the page cache. Multiple processes may have the same file open, and changes must be visible appropriately.
On a single system, the kernel provides natural coherence because all processes share the same page cache:
```c
/*
 * File cache coherence on a single system
 *
 * The key insight: there's only ONE page cache copy
 * of each file page, shared by all processes
 */

/*
 * Process A writes to file at offset X
 */
void process_a_write(int fd, off_t offset, char *data, size_t len)
{
    /* Write goes to the page cache page for (inode, offset/PAGE_SIZE) */
    struct page *page = find_or_create_page(
        file->f_mapping, offset >> PAGE_SHIFT);

    /* Copy data into page */
    memcpy(page_address(page) + (offset % PAGE_SIZE), data, len);
    SetPageDirty(page);
    /* Page now contains new data */
}

/*
 * Process B reads from same file at offset X
 */
void process_b_read(int fd, off_t offset, char *buf, size_t len)
{
    /* Read comes from the SAME page cache page */
    struct page *page = find_get_page(
        file->f_mapping, offset >> PAGE_SHIFT);

    /* Gets data including A's modifications */
    memcpy(buf, page_address(page) + (offset % PAGE_SIZE), len);
}

/*
 * Memory-mapped coherence is also maintained
 */

/* Process A maps file */
void *a_mapping = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

/* Process B maps same file */
void *b_mapping = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

/*
 * Both mappings point to the SAME page cache pages!
 *
 * When A writes:          a_mapping[100] = 'X';
 * B immediately sees it:  assert(b_mapping[100] == 'X');
 *
 * This works because page table entries point to shared
 * page cache pages, not private copies.
 */
```

Coherence semantics differ between read()/write() and memory-mapped access:
read()/write(): Always reads from and writes to current page cache state. Coherence is automatic.
mmap(): Changes are visible immediately to other mmap() users of the same file. However, coherence with read()/write() depends on implementation—POSIX doesn't require it.
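To make the mmap() case concrete, here is a small sketch, assuming Linux: two MAP_SHARED mappings of the same file (created in one process to stand in for two processes) see each other's stores immediately, and on Linux a write() through a file descriptor lands in the same page cache pages as well.

```python
"""Sketch: MAP_SHARED mappings share page cache pages (Linux behavior)."""

import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shared.dat")
with open(path, "wb") as f:
    f.write(b"\0" * 4096)                       # one page of zeroes

fd_a = os.open(path, os.O_RDWR)                 # stand-in for "process A"
fd_b = os.open(path, os.O_RDWR)                 # stand-in for "process B"
map_a = mmap.mmap(fd_a, 4096, mmap.MAP_SHARED)
map_b = mmap.mmap(fd_b, 4096, mmap.MAP_SHARED)

map_a[100:101] = b"X"                           # store through one mapping...
assert map_b[100:101] == b"X"                   # ...immediately visible in the other

# On Linux, write() goes to the same page cache pages, so an existing
# mapping observes it too (POSIX does not guarantee this).
os.pwrite(fd_a, b"Y", 200)
assert map_b[200:201] == b"Y"

map_a.close(); map_b.close()
os.close(fd_a); os.close(fd_b)
```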
O_DIRECT bypasses the page cache, creating coherence challenges:
O_DIRECT reads/writes go directly to disk, bypassing cached data. If one process uses normal I/O (cached) and another uses O_DIRECT, they see different data. Applications using O_DIRECT (like databases) typically have exclusive access to their files to avoid this problem.
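The usual mitigation looks something like the sketch below (assuming Linux; tablespace.dat is a hypothetical, pre-existing database file): open with O_DIRECT and take an exclusive advisory lock so no other process mixes cached I/O with direct I/O on the same file.

```python
import fcntl
import os

# Hypothetical database-style open: direct I/O plus an exclusive advisory lock
fd = os.open("tablespace.dat", os.O_RDWR | os.O_DIRECT)
fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)   # fail fast if another process holds it

# All reads/writes on fd now bypass the page cache; O_DIRECT also requires
# suitably aligned buffers, offsets, and transfer sizes.
```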
When files are accessed over a network (NFS, SMB/CIFS), each client has its own cache, creating true distributed coherence challenges.
NFS uses close-to-open consistency:
```c
/*
 * NFS close-to-open consistency model
 */

/* Client A opens file, modifies, closes */
fd = open("/nfs/shared/file", O_RDWR);
write(fd, "new data", 8);
close(fd);                  /* Flushes to server */

/* Client B opens file after A's close */
fd = open("/nfs/shared/file", O_RDONLY);
/* Validates cache against server - sees A's changes */
read(fd, buf, 8);           /* Gets "new data" */

/*
 * BUT: If B already has the file open when A modifies it:
 */

/* Client B opens file */
fd_b = open("/nfs/shared/file", O_RDONLY);

/* Client A opens, modifies, closes */
fd_a = open("/nfs/shared/file", O_RDWR);
write(fd_a, "changed", 7);
close(fd_a);

/* Client B reads - may see old data from its local cache! */
read(fd_b, buf, 7);         /* Might NOT see "changed" */

/*
 * NFS attribute cache timeout controls revalidation
 *
 * mount options:
 *   actimeo=N - cache attributes for N seconds
 *   noac      - no attribute caching (strongest coherence)
 *
 * Shorter timeout = stronger coherence, higher overhead
 */
```

NFSv4 introduces delegations: the server grants a client exclusive (write) or read access to a file. While holding a delegation, the client can satisfy reads (and, with a write delegation, buffer writes) from its local cache and handle opens, closes, and lock requests without contacting the server. The server recalls the delegation when another client requests conflicting access, and the holder flushes any cached changes before the conflicting open proceeds.
Windows file sharing (SMB/CIFS) uses oplocks (opportunistic locks) for the same purpose: the server grants a client permission to cache reads, writes, and locks locally, and it breaks the oplock when another client opens the file in a conflicting way, forcing the holder to flush cached writes and stop serving data from its local cache.
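Delegations and oplocks follow the same grant-and-recall pattern. The toy model below sketches only that pattern; the classes and method names are invented for illustration, and real protocols add callbacks, timeouts, lease durations, and far more state.

```python
"""Toy model of server-granted cache rights (NFSv4 delegations, SMB oplocks)."""


class GrantingServer:
    """Authoritative server that lets one client cache the file at a time."""

    def __init__(self):
        self.holder = None          # client currently holding the grant
        self.data = b"original"     # authoritative file contents

    def open(self, client):
        # A conflicting open recalls/breaks the existing grant first
        if self.holder is not None and self.holder is not client:
            self.holder.recall(self)    # holder flushes and drops its cache
        self.holder = client
        return self.data                # client may cache this under the grant


class Client:
    def __init__(self):
        self.cached = None
        self.dirty = False

    def open_file(self, server):
        self.cached = server.open(self)     # cache contents under the grant

    def write_local(self, data):
        self.cached = data                  # buffered locally, no server round trip
        self.dirty = True

    def recall(self, server):
        if self.dirty:
            server.data = self.cached       # flush dirty data back on recall
        self.cached = None                  # stop serving from the local cache
        self.dirty = False


server = GrantingServer()
a, b = Client(), Client()
a.open_file(server)
a.write_local(b"changed")   # cached on A; server not contacted
b.open_file(server)         # conflicting open recalls A's grant first
print(server.data)          # b'changed' - A flushed before B's open completed
```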
In distributed systems (Redis clusters, CDNs, application caches), maintaining coherence across network boundaries requires different approaches than hardware or single-system solutions.
| Strategy | Mechanism | Consistency | Latency Cost | Use Case |
|---|---|---|---|---|
| Invalidation | Notify caches to drop entry | Strong on update | Update cost | Database caches |
| TTL-based | Entries expire after time limit | Eventual | Refetch on expiry | CDN, session data |
| Write-through | Update cache and store together | Strong | Slower writes | Critical data |
| Pub/Sub | Publish changes, subscribers update | Eventual | Async | Real-time updates |
"""Distributed cache coherence patterns""" class InvalidationCoherence: """ Write-invalidate: On update, invalidate all cached copies. Next read fetches fresh data from source. """ def __init__(self, cache_nodes, source): self.caches = cache_nodes # List of cache servers self.source = source # Authoritative data store def read(self, key): # Try local cache first value = self.caches[0].get(key) if value is not None: return value # Cache miss - fetch from source value = self.source.get(key) self.caches[0].set(key, value) return value def write(self, key, value): # Update source of truth self.source.set(key, value) # Invalidate ALL cache copies for cache in self.caches: cache.delete(key) # Could also use pub/sub for async invalidation # publish("invalidate", key) class TTLCoherence: """ TTL-based: Entries live for fixed time, then expire. Simple but only provides eventual consistency. """ def __init__(self, cache, source, ttl_seconds): self.cache = cache self.source = source self.ttl = ttl_seconds def read(self, key): # Try cache (includes TTL check) value = self.cache.get(key) if value is not None: return value # Expired or missing - refresh value = self.source.get(key) self.cache.set(key, value, ttl=self.ttl) return value def write(self, key, value): # Update source self.source.set(key, value) # Optionally update cache too (reduces stale reads) self.cache.set(key, value, ttl=self.ttl) # But other cache nodes will still serve stale # until their TTL expires class VersionedCoherence: """ Version-based: Each entry has version number. Readers validate version, refetch if stale. """ def read(self, key): cached = self.cache.get(key) # Check version against source (cheap metadata query) current_version = self.source.get_version(key) if cached and cached.version >= current_version: return cached.value # Stale - refetch full value value, version = self.source.get_with_version(key) self.cache.set(key, CachedEntry(value, version)) return valueThe CAP theorem states you can have at most two of: Consistency, Availability, Partition tolerance. Distributed caches typically choose AP (available even during partitions, eventual consistency) or CP (consistent but may be unavailable during partitions). Your coherence strategy should align with this choice.
Practical coherence implementation involves tracking what's cached where and efficiently propagating changes.
Maintain a directory tracking which caches hold each entry:
"""Directory-based coherence for distributed cache"""import threadingfrom typing import Set, Dict, Any class CoherenceDirectory: """ Tracks which cache nodes hold copies of each key. Enables targeted invalidation instead of broadcast. """ def __init__(self): # key -> set of cache node IDs holding it self.entries: Dict[str, Set[str]] = {} self.lock = threading.Lock() def register(self, key: str, cache_id: str): """Record that cache_id has cached key.""" with self.lock: if key not in self.entries: self.entries[key] = set() self.entries[key].add(cache_id) def unregister(self, key: str, cache_id: str): """Record that cache_id no longer has key.""" with self.lock: if key in self.entries: self.entries[key].discard(cache_id) def get_holders(self, key: str) -> Set[str]: """Get all caches holding this key.""" with self.lock: return self.entries.get(key, set()).copy() def invalidate(self, key: str, cache_nodes: dict): """Invalidate key on all holders.""" holders = self.get_holders(key) for cache_id in holders: try: cache_nodes[cache_id].delete(key) self.unregister(key, cache_id) except Exception: # Cache node unreachable - will timeout eventually pass class CoherentCache: """Cache with directory-based coherence.""" def __init__(self, cache_id, directory, source): self.cache_id = cache_id self.directory = directory self.source = source self.local_cache = {} def get(self, key): if key in self.local_cache: return self.local_cache[key] # Fetch and register value = self.source.get(key) self.local_cache[key] = value self.directory.register(key, self.cache_id) return value def set(self, key, value): self.source.set(key, value) # Invalidate all other copies self.directory.invalidate(key, all_caches)Leases combine caching with time-limited validity:
What's Next: The final page examines cache effectiveness—measuring and optimizing cache performance for real workloads.
You now understand cache coherence from hardware protocols to distributed systems. This knowledge enables you to reason about consistency guarantees, choose appropriate coherence mechanisms, and debug subtle caching bugs.