Throughout this module, we've examined how the buffer cache accelerates file system performance by keeping data in fast memory. But performance without durability is hollow—data that exists only in volatile memory vanishes on power failure. Sync operations bridge this gap: they are the mechanisms that force data from the buffer cache onto stable, persistent storage.
Understanding sync operations—starting with fsync() and what it actually guarantees—is essential for any engineer building reliable software. In this final page of the buffer cache module, we'll examine every aspect of synchronization: from high-level system calls to kernel implementation to storage device behavior.
By the end of this page, you will understand: (1) The complete sync() family of system calls and their guarantees, (2) How the kernel implements these operations internally, (3) The difference between data sync and metadata sync, (4) Storage device flush commands and their interaction with software sync, (5) Performance implications and optimization strategies, and (6) Real-world sync patterns for databases and critical applications.
POSIX defines several system calls for synchronizing cached data to storage. Each has different scope, guarantees, and performance characteristics.
The Complete Family:
| Call | Scope | Waits? | Syncs Data? | Syncs Metadata? |
|---|---|---|---|---|
| sync() | All filesystems | Usually no* | Yes | Yes |
| syncfs(fd) | Single filesystem | Yes | Yes | Yes |
| fsync(fd) | Single file | Yes | Yes | Yes |
| fdatasync(fd) | Single file | Yes | Yes | Partial** |
| sync_file_range() | Byte range | Configurable | Yes | No |
| msync(addr, len, flags) | Mapped region | Configurable | Yes | Yes |
*sync() historically returned immediately after initiating writeback; modern Linux waits for completion.
**fdatasync() skips metadata that doesn't affect data retrieval (e.g., atime, mtime) but includes metadata that does (e.g., file size).
```c
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * sync() - Synchronize all filesystems
 *
 * Schedules or performs writeback of all dirty data and metadata
 * across all mounted filesystems.
 *
 * Returns: void (cannot fail in traditional UNIX semantics)
 */
void sync(void);   /* Flushes entire system */

/*
 * syncfs(fd) - Synchronize single filesystem
 *
 * Forces writes for all dirty data on the filesystem containing fd.
 * More targeted than sync() for systems with many filesystems.
 *
 * Returns: 0 on success, -1 on error
 */
int backup_filesystem_sync(const char *mountpoint) {
    int fd = open(mountpoint, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* Sync only this filesystem */
    if (syncfs(fd) != 0) {
        perror("syncfs");
        close(fd);
        return -1;
    }

    close(fd);
    return 0;
}

/*
 * fsync(fd) - Synchronize single file
 *
 * Forces all data and metadata for fd to stable storage.
 * Returns only after data is confirmed durable.
 *
 * This is the PRIMARY durability mechanism for applications.
 */
int durable_write(int fd, const void *data, size_t len) {
    /* Write data to kernel buffer */
    ssize_t written = write(fd, data, len);
    if (written < 0 || (size_t)written != len) {
        return -1;
    }

    /* Force to stable storage */
    if (fsync(fd) != 0) {
        perror("fsync");
        return -1;  /* Data may not be durable! */
    }

    return 0;  /* Data is now on stable storage */
}

/*
 * fdatasync(fd) - Synchronize file data only
 *
 * Like fsync() but omits metadata that doesn't affect data retrieval.
 * Faster when you don't need timestamp updates to be durable.
 */
int durable_data_write(int fd, const void *data, size_t len) {
    if (write(fd, data, len) != (ssize_t)len)
        return -1;

    /* Only sync data and essential metadata (like file size) */
    if (fdatasync(fd) != 0) {
        return -1;
    }

    return 0;  /* File content is durable; mtime may not be */
}

/*
 * sync_file_range() - Fine-grained sync control (Linux-specific)
 *
 * Offers surgical control over which portions of a file to sync
 * and whether to wait for completion.
 */
#define SYNC_FILE_RANGE_WAIT_BEFORE 1
#define SYNC_FILE_RANGE_WRITE       2
#define SYNC_FILE_RANGE_WAIT_AFTER  4

void streaming_write_example(int fd) {
    static char buffer[1024 * 1024];  /* 1MB chunks */
    off_t offset = 0;
    int have_more_data = 1;

    while (have_more_data) {
        /* Fill buffer with data ... */

        /* Write to kernel buffer */
        pwrite(fd, buffer, sizeof(buffer), offset);

        /* Initiate async writeback for this chunk.
         * Don't wait - let it happen in the background. */
        sync_file_range(fd, offset, sizeof(buffer),
                        SYNC_FILE_RANGE_WRITE);

        offset += sizeof(buffer);
    }

    /* At end, ensure everything is synced */
    sync_file_range(fd, 0, offset,
                    SYNC_FILE_RANGE_WAIT_BEFORE |
                    SYNC_FILE_RANGE_WRITE |
                    SYNC_FILE_RANGE_WAIT_AFTER);
}

/*
 * msync() - Synchronize memory-mapped file region
 */
void mmap_sync_example(void *addr, size_t len) {
    /* MS_SYNC: Wait for writeback to complete */
    if (msync(addr, len, MS_SYNC) != 0) {
        perror("msync");
    }

    /* MS_ASYNC: Schedule writeback, don't wait */
    msync(addr, len, MS_ASYNC);

    /* MS_INVALIDATE: Invalidate cached pages (force re-read) */
    msync(addr, len, MS_INVALIDATE);
}
```

If fsync() fails, your data may not be durable! Ignoring fsync() errors has caused data loss in real applications. Always check the return value, and if fsync fails, consider the data at risk. Some file systems (historically ext3/4) had issues reporting errors correctly—test your specific setup.
Understanding how the kernel implements sync operations helps explain their behavior and performance. Let's trace through the implementation path.
fsync() Implementation Flow:
```c
/*
 * Linux fsync() implementation (simplified)
 * Real code in fs/sync.c and specific FS implementations
 */

/* System call entry point */
SYSCALL_DEFINE1(fsync, unsigned int, fd)
{
    struct fd f = fdget(fd);
    int ret;

    if (!f.file)
        return -EBADF;

    ret = vfs_fsync(f.file, 0);  /* 0 = sync data AND metadata */
    fdput(f);
    return ret;
}

/* VFS layer fsync implementation */
int vfs_fsync(struct file *file, int datasync)
{
    return vfs_fsync_range(file, 0, LLONG_MAX, datasync);
}

int vfs_fsync_range(struct file *file, loff_t start, loff_t end,
                    int datasync)
{
    struct inode *inode = file_inode(file);
    int ret;

    /* Call filesystem-specific fsync if provided */
    if (file->f_op->fsync) {
        return file->f_op->fsync(file, start, end, datasync);
    }

    /* Generic implementation */
    ret = sync_inode_metadata(inode, 1);  /* 1 = wait */
    if (!ret)
        ret = filemap_fdatawrite_range(file->f_mapping, start, end);
    if (!ret)
        ret = filemap_fdatawait_range(file->f_mapping, start, end);
    return ret;
}

/* Write all dirty pages in the specified range */
int filemap_fdatawrite_range(struct address_space *mapping,
                             loff_t start, loff_t end)
{
    struct writeback_control wbc = {
        .sync_mode   = WB_SYNC_ALL,  /* Wait for completion */
        .nr_to_write = LONG_MAX,     /* No page limit */
        .range_start = start,
        .range_end   = end,
    };

    return mapping->a_ops->writepages(mapping, &wbc);
}

/* Wait for all I/O on pages in range to complete */
int filemap_fdatawait_range(struct address_space *mapping,
                            loff_t start, loff_t end)
{
    pgoff_t start_idx = start >> PAGE_SHIFT;
    pgoff_t end_idx   = end >> PAGE_SHIFT;
    struct page *page;
    int ret = 0;

    /* Iterate through all pages in range */
    for (pgoff_t idx = start_idx; idx <= end_idx; idx++) {
        page = find_get_page(mapping, idx);
        if (!page)
            continue;

        /* Wait if page has writeback in progress */
        if (PageWriteback(page))
            wait_on_page_writeback(page);

        /* Check for I/O error */
        if (PageError(page))
            ret = -EIO;

        put_page(page);
    }

    return ret;
}

/*
 * Ext4-specific fsync (more sophisticated)
 */
int ext4_sync_file(struct file *file,
                   loff_t start, loff_t end, int datasync)
{
    struct inode *inode = file_inode(file);
    int ret;

    /* Handle journaled data case */
    if (EXT4_JOURNAL(inode)) {
        /* Wait for journal commit covering our data */
        ret = ext4_jbd2_inode_add_wait(inode, start, end);
        if (ret)
            return ret;
    }

    /* Write file data */
    ret = file_write_and_wait_range(file, start, end);
    if (ret)
        return ret;

    /* Sync metadata if needed */
    if (!datasync || ext4_inode_data_dirty(inode)) {
        ret = ext4_write_inode(inode, WB_SYNC_ALL);
        if (ret)
            return ret;
    }

    /* Issue storage device flush */
    if (needs_barrier(inode->i_sb)) {
        ret = blkdev_issue_flush(inode->i_sb->s_bdev,
                                 GFP_KERNEL, NULL);
    }

    return ret;
}
```

Note the blkdev_issue_flush() at the end. Even after data is written to the device, it may sit in the device's volatile cache. The flush command forces data to stable media. Without this, fsync() could return success while data is still in volatile device memory—leading to data loss on power failure.
Understanding the distinction between data and metadata sync is crucial for performance optimization. File operations affect both data and metadata, but not all metadata is equally important for durability.
What is File Metadata?
| Metadata | Synced by fsync? | Synced by fdatasync? | Why it matters |
|---|---|---|---|
| File size | Yes | Yes* | Must know how much data to read |
| Block pointers | Yes | Yes | Must find where data is stored |
| mtime (modify time) | Yes | No | Nice-to-have, not for data retrieval |
| atime (access time) | Yes | No | Often disabled entirely |
| ctime (change time) | Yes | No | Nice-to-have |
| Permissions | Yes | No | Doesn't affect data retrieval |
| Owner/group | Yes | No | Doesn't affect data retrieval |
*fdatasync syncs file size only if it changed, because knowing the correct size is essential for reading the file correctly.
```c
/*
 * When to use fsync() vs fdatasync()
 */

#include <unistd.h>
#include <fcntl.h>
#include <libgen.h>
#include <string.h>

/*
 * SCENARIO 1: Appending to a log file
 *
 * Each append extends the file, so size changes.
 * fdatasync() syncs both data and the new size.
 * Timestamps don't matter for log integrity.
 */
void append_log_entry(int fd, const char *entry, size_t len) {
    write(fd, entry, len);
    fdatasync(fd);  /* Syncs data + size */
    /* Faster than fsync() because it skips mtime/ctime */
}

/*
 * SCENARIO 2: Overwriting existing data (no size change)
 *
 * If file size doesn't change, fdatasync() only syncs data.
 * This is even faster - no metadata I/O at all.
 */
void update_record(int fd, off_t offset, const void *data, size_t len) {
    pwrite(fd, data, len, offset);
    fdatasync(fd);  /* Only data, no metadata (size unchanged) */
    /* This can be 2x faster than fsync() */
}

/*
 * SCENARIO 3: Creating a new file (fsync required)
 *
 * New file needs directory entry synced too!
 * fsync() on file PLUS fsync() on directory.
 */
void create_durable_file(const char *path, const void *data, size_t len) {
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    write(fd, data, len);
    fsync(fd);  /* Use fsync for new files */
    close(fd);

    /* CRITICAL: Sync the directory to make the filename durable.
     * dirname() may modify its argument, so pass a copy. */
    char pathcopy[4096];
    strncpy(pathcopy, path, sizeof(pathcopy) - 1);
    pathcopy[sizeof(pathcopy) - 1] = '\0';
    int dir_fd = open(dirname(pathcopy), O_RDONLY);
    fsync(dir_fd);
    close(dir_fd);
}

/*
 * SCENARIO 4: Preserving timestamps matters
 *
 * Some applications rely on mtime for synchronization
 * (e.g., rsync, make). Use fsync() in these cases.
 */
void timestamp_critical_write(int fd, const void *data, size_t len) {
    write(fd, data, len);
    fsync(fd);  /* Ensure mtime is also persisted */
}

/*
 * Performance comparison: fsync vs fdatasync
 *
 * On a test system with ext4:
 * - fsync() on overwrite:     ~8ms
 * - fdatasync() on overwrite: ~4ms (2x faster!)
 *
 * The difference comes from:
 * 1. Writing inode block for timestamp update
 * 2. Possibly a journal transaction for metadata
 *
 * For high-frequency sync (like database commits),
 * fdatasync() can significantly improve throughput.
 */
```

PostgreSQL uses fdatasync() by default for WAL (Write-Ahead Log) syncing because WAL entries don't need accurate timestamps. This can provide 30-50% better transaction throughput compared to fsync(). The choice can be configured via wal_sync_method.
The operating system's sync operations ultimately depend on storage devices honoring flush commands. Understanding these commands and their behavior across device types is essential for true durability guarantees.
The Storage Stack:
```
Application:     fsync(fd)
      ↓
VFS Layer:       Writes dirty pages
      ↓
Block Layer:     Submits I/O requests
      ↓
Device Driver:   Sends commands to device
      ↓
Storage Device:  Writes to volatile cache
      ↓  (flush command)
Storage Media:   Data on stable storage
```
| Interface | Flush Command | Description |
|---|---|---|
| SATA/ATA | FLUSH CACHE / FLUSH CACHE EXT | Flushes volatile write cache to platters/cells |
| SCSI/SAS | SYNCHRONIZE CACHE | Ensures data in cache reaches medium |
| NVMe | Flush | Commits data in volatile write cache |
| MMC/eMMC | CACHE_FLUSH | Flushes internal cache to flash |
```c
/*
 * How the kernel issues device flush commands
 */

/*
 * Issue a flush/sync to a block device
 * Called at end of fsync() path
 */
int blkdev_issue_flush(struct block_device *bdev, gfp_t gfp_mask,
                       sector_t *error_sector)
{
    struct bio *bio;
    int ret = 0;

    /* Create a bio (block I/O request) with no data */
    bio = bio_alloc(gfp_mask, 0);
    bio_set_dev(bio, bdev);
    bio->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;

    /* Submit and wait for completion */
    ret = submit_bio_wait(bio);
    bio_put(bio);
    return ret;
}

/*
 * NVMe flush command handling (in driver)
 */
void nvme_execute_flush(struct nvme_ns *ns, struct request *req)
{
    struct nvme_command cmd = {
        .common = {
            .opcode = nvme_cmd_flush,
            .nsid   = cpu_to_le32(ns->head->ns_id),
        },
    };

    /* Send flush command to controller */
    nvme_submit_sync_cmd(ns->ctrl, &cmd, NULL, 0);
    /* Returns when flush is complete */
}

/*
 * Force Unit Access (FUA) - Alternative to flush
 *
 * FUA writes bypass the volatile cache entirely.
 * Used for individual critical writes without draining entire cache.
 */
void write_with_fua(struct block_device *bdev, void *data,
                    sector_t sector, size_t len)
{
    struct bio *bio = bio_alloc(GFP_KERNEL, 1);

    bio_set_dev(bio, bdev);
    bio->bi_iter.bi_sector = sector;
    bio_add_page(bio, virt_to_page(data), len, offset_in_page(data));

    /* REQ_FUA: Force Unit Access - bypass volatile cache */
    bio->bi_opf = REQ_OP_WRITE | REQ_FUA;

    submit_bio_wait(bio);
    bio_put(bio);
    /* This write is directly on stable media */
}
```

Some storage devices have non-battery-backed volatile write caches that ignore flush commands (for performance). Data 'synced' to such devices is NOT durable! Enterprise SSDs typically have power-loss protection (capacitors). Check your device specs. For critical data, disable volatile write caches or use devices with PLP (Power Loss Protection).
```bash
#!/bin/bash
# Managing storage device write cache

# Check current write cache status (SATA/ATA)
hdparm -W /dev/sda
# /dev/sda:
#  write-caching = 1 (on)

# Disable write cache (sacrifices performance for safety)
hdparm -W0 /dev/sda

# For SCSI devices
sdparm --get=WCE /dev/sdb     # Get Write Cache Enable
sdparm --set=WCE=0 /dev/sdb   # Disable

# Check NVMe volatile write cache
nvme id-ctrl /dev/nvme0 | grep vwc
# vwc : 1  (volatile write cache present)

# Enterprise SSDs with Power Loss Protection (PLP)
# These are SAFE to have write cache enabled because
# capacitors flush data on power loss
```

Sync operations have profound performance implications. Understanding these helps you make informed tradeoffs between durability and speed.
The Cost of Sync:
Each fsync() involves multiple expensive operations:
| Component | HDD Time | SSD Time | NVMe Time |
|---|---|---|---|
| Dirty page writeback | 5-10ms | 0.5-2ms | 0.1-0.5ms |
| Metadata/Journal write | 2-5ms | 0.5-1ms | 0.1-0.3ms |
| Device flush command | 5-15ms | 1-5ms | 0.5-2ms |
| Total fsync() | 12-30ms | 2-8ms | 0.7-3ms |
| Maximum TPS* | ~30-80 | ~120-500 | ~300-1400 |
*TPS = Transactions Per Second if each transaction requires one fsync.
Optimization Strategies:
```c
/*
 * Performance optimization strategies for sync-heavy workloads
 */

#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>

/*
 * STRATEGY 1: Group Commit
 *
 * Instead of syncing each write, batch multiple writes
 * and sync once for the entire batch.
 */
struct batch {
    int count;
    int max_count;
    int fd;
};

void batch_write(struct batch *b, const void *data, size_t len) {
    write(b->fd, data, len);
    b->count++;

    if (b->count >= b->max_count) {
        /* Batch is full, sync now */
        fdatasync(b->fd);
        b->count = 0;
    }
}

/* Result:
 * 100 individual writes with sync: 100 × 5ms = 500ms
 * 100 writes batched, 1 sync:      100 × 0.001ms + 5ms ≈ 5ms
 * 100x improvement!
 */

/*
 * STRATEGY 2: Parallel Sync
 *
 * Multiple devices can be synced in parallel.
 * Distribute data across devices for higher throughput.
 */
static void *fsync_thread(void *arg) {
    fsync(*(int *)arg);
    return NULL;
}

void parallel_sync(int *fds, int count) {
    pthread_t threads[count];

    for (int i = 0; i < count; i++) {
        pthread_create(&threads[i], NULL, fsync_thread, &fds[i]);
    }
    for (int i = 0; i < count; i++) {
        pthread_join(threads[i], NULL);
    }
    /* With 4 SSDs: 4 × 200 TPS = 800 TPS aggregate */
}

/*
 * STRATEGY 3: Asynchronous Writeback + Barrier
 *
 * Use sync_file_range() to start writeback early,
 * then sync at commit points.
 */
void pipelined_write(int fd, char *data, size_t chunk_size, int chunks) {
    for (int i = 0; i < chunks; i++) {
        off_t offset = (off_t)i * chunk_size;

        /* Write to page cache */
        pwrite(fd, data + offset, chunk_size, offset);

        /* Start async writeback (don't wait) */
        sync_file_range(fd, offset, chunk_size, SYNC_FILE_RANGE_WRITE);
    }

    /* Final sync - wait for all writeback */
    sync_file_range(fd, 0, (off_t)chunks * chunk_size,
                    SYNC_FILE_RANGE_WAIT_BEFORE |
                    SYNC_FILE_RANGE_WRITE |
                    SYNC_FILE_RANGE_WAIT_AFTER);
    fdatasync(fd);
}

/*
 * STRATEGY 4: Use O_DSYNC or O_SYNC for implicit sync
 *
 * Opens file with automatic sync on every write.
 * Good for small, critical writes.
 */
void always_sync_writes(const void *data, size_t len) {
    /* O_DSYNC: Every write is like write + fdatasync */
    int fd = open("critical.dat", O_WRONLY | O_DSYNC);
    write(fd, data, len);   /* Returns after data durable */

    /* O_SYNC: Every write is like write + fsync */
    int fd2 = open("more_critical.dat", O_WRONLY | O_SYNC);
    write(fd2, data, len);  /* Returns after data+metadata durable */
}
```

Production databases like PostgreSQL, MySQL, and MongoDB use group commit extensively. PostgreSQL's commit_delay parameter introduces a brief wait (0-100ms) after the first transaction to accumulate more transactions for a single WAL sync. This can improve throughput by 10x or more under concurrent load.
Databases have the most demanding sync requirements: they must provide ACID durability while maintaining high transaction throughput. Let's examine the sync patterns used by real database systems.
The WAL Sync Challenge:
Relational databases use Write-Ahead Logging (WAL): every transaction's changes are written to a log and synced before the transaction commits. This creates a sync bottleneck—every commit requires at least one fsync.
```c
/*
 * Database Sync Patterns
 */

/*
 * PATTERN 1: Synchronous Commit (Default PostgreSQL)
 *
 * Every commit waits for WAL fsync.
 * Maximum durability, lowest throughput.
 */
void synchronous_commit(struct transaction *txn) {
    /* Write transaction to WAL buffer */
    write_wal_record(txn);

    /* Force WAL to disk */
    fdatasync(wal_fd);  /* ~4ms */

    /* Acknowledge commit */
    txn->status = COMMITTED;
    /* Client can proceed - data is durable */
}
/* Throughput: ~250 TPS per disk */

/*
 * PATTERN 2: Group Commit (PostgreSQL, MySQL)
 *
 * Batch multiple commits into one fsync.
 * Slight latency increase, huge throughput gain.
 */
void group_commit() {
    /* Collect commits for a short window */
    collect_pending_commits(COMMIT_DELAY_MS);

    /* Write all pending to WAL */
    for_each_pending(txn) {
        write_wal_record(txn);
    }

    /* Single fsync for all */
    fdatasync(wal_fd);

    /* Acknowledge all commits */
    for_each_pending(txn) {
        txn->status = COMMITTED;
        wake_client(txn);
    }
}
/* Throughput: 10,000+ TPS with concurrent connections */

/*
 * PATTERN 3: Asynchronous Commit (PostgreSQL)
 *
 * Don't wait for fsync; accept data loss risk.
 * Maximum throughput, potential loss window.
 */
void async_commit(struct transaction *txn) {
    /* Write to WAL buffer (not yet on disk) */
    write_wal_record(txn);

    /* Immediately acknowledge - data NOT durable! */
    txn->status = COMMITTED;

    /* Background thread eventually syncs */
    /* Risk: last ~10ms of transactions lost on crash */
}
/* Throughput: 50,000+ TPS (memory speed) */

/*
 * PATTERN 4: Commit=No_wait + Group Sync (MySQL InnoDB)
 *
 * innodb_flush_log_at_trx_commit settings:
 *   0: Log written & synced once/second (may lose 1s data)
 *   1: Log written & synced on each commit (full durability)
 *   2: Log written on commit, synced once/second (lose 1s)
 */

/*
 * PATTERN 5: Parallel WAL (PostgreSQL 9.4+)
 *
 * Multiple backends can populate WAL concurrently.
 * Single sync still needed, but write phase parallelized.
 */

/*
 * PATTERN 6: Double-Write Buffer (MySQL InnoDB)
 *
 * Problem: Partial page writes on crash (torn pages).
 * Solution: Write pages to doublewrite buffer first.
 */
void doublewrite_flush(struct page **pages, int count) {
    /* Write all pages to sequential doublewrite space */
    for (int i = 0; i < count; i++) {
        pwrite(dblwr_fd, pages[i], PAGE_SIZE, dblwr_offset);
        dblwr_offset += PAGE_SIZE;
    }

    /* Single fsync for doublewrite */
    fdatasync(dblwr_fd);

    /* Now safe to write to actual locations */
    for (int i = 0; i < count; i++) {
        pwrite(datafile_fd, pages[i], PAGE_SIZE, pages[i]->offset);
    }

    /* On crash: recover torn pages from doublewrite */
}
```

| Database | Setting | Effect |
|---|---|---|
| PostgreSQL | synchronous_commit = on | Full durability, lower TPS |
| PostgreSQL | synchronous_commit = off | Higher TPS, risk window |
| PostgreSQL | commit_delay = 10 | 10 µs wait for group commit |
| MySQL InnoDB | innodb_flush_log_at_trx_commit = 1 | Full durability |
| MySQL InnoDB | innodb_flush_log_at_trx_commit = 2 | OS-level sync timing |
| MongoDB | j: true (write concern) | Wait for journal fsync |
| MongoDB | w: majority, j: false | Replicated but not synced |
Modern databases often use synchronous replication instead of (or alongside) fsync for durability. Writing to two nodes before acknowledging protects against single-node failure and can be faster than fsync if replicas are memory-acknowledged. This is why cloud databases often offer 'durability' without waiting for disk sync.
Sync behavior is notoriously difficult to test—you can't easily simulate power failures in software. However, several techniques help verify that your application's durability guarantees hold.
Testing Strategies:
```bash
#!/bin/bash
# Testing sync behavior and durability

# 1. STRACE: Verify sync calls are happening
strace -e fsync,fdatasync,sync,sync_file_range \
    ./my_database_app 2>&1 | head -50

# Expected output:
# fsync(5)     = 0
# fdatasync(5) = 0
# ...

# 2. Track device I/O during sync
# In terminal 1:
iostat -x 1 /dev/sda

# In terminal 2:
fsync_test_program

# Look for:
# - Writes during sync
# - Device utilization spikes
# - await (average wait time)

# 3. BLKTRACE: Detailed block I/O tracing
blktrace -d /dev/sda -o trace &
./my_app --run-test
kill %1
blkparse trace.* | grep -i flush
# Look for FLUSH operations

# 4. Use dm-flakey to simulate failures
# Create a device that periodically drops writes
dmsetup create flakey --table "0 $(blockdev --getsz /dev/loop0) \
    flakey /dev/loop0 0 30 5 1 drop_writes"
# 30 seconds good, 5 seconds dropping writes

# Run your application against /dev/mapper/flakey
# Check data integrity after simulated failure

# 5. Crash testing with VMs
# - Run workload in VM
# - Abruptly terminate VM process
# - Boot VM and check data integrity

# 6. SystemTap probes for sync tracking
stap -e 'probe kernel.function("blkdev_issue_flush") {
    printf("Flush on %s\n", kernel_string($bdev->bd_disk->disk_name))
}'
```

Power-Loss Testing:
For critical applications, simulate actual power loss:
1. Write known, checksummed data and sync it.
2. Cut power at the source—an actual power cut, not just killing the process (kill -9 leaves the kernel and device caches intact, so it tests far less than a real outage).
3. Power back on and verify that every record acknowledged before the cut is intact.
We've completed our deep exploration of sync operations—the critical mechanisms that transform cached data into durable, persistent storage. This knowledge completes our understanding of the buffer cache and its role in file system performance and reliability.
Module Complete:
Congratulations! You've completed the Buffer Cache module—the foundational layer of file system performance.
This knowledge forms the foundation for understanding file system performance optimization, database internals, and building reliable software systems.
You have mastered the Buffer Cache module. You understand the complete lifecycle of data from application write through cache residence to stable storage, including all the performance and durability tradeoffs involved. This knowledge will serve you in database development, system administration, performance tuning, and building any software that cares about data persistence.