Sequential access treats a file like a river—you can only go with the flow, moving steadily from source to sea. But what if you need to teleport upstream? What if you must read byte 1,000,000 without first reading bytes 0 through 999,999?
Direct access (also called random access) answers this need. It provides the ability to position the file pointer at any arbitrary offset and perform read/write operations there—instantly, without traversing intervening data.
The term 'random' doesn't imply randomness in the probabilistic sense. It means access at any position, in any order, as opposed to the strictly linear pattern of sequential access. Think of it like the difference between a cassette tape (sequential: must fast-forward/rewind through all intermediate content) and a CD (random: the laser can jump directly to any track).
Direct access is the foundation of all indexed data structures: databases, file system metadata, search indexes, and countless applications where looking up specific records by key is the dominant operation. Without it, modern computing as we know it—where billions of database queries execute per second—would be impossible.
By the end of this page, you will master direct file access—understanding the lseek() system call in depth, how random positioning interacts with read/write operations, the performance costs of breaking sequential patterns, and the critical use cases where random access provides orders-of-magnitude improvements over sequential scanning.
Direct access reframes how we think about files. Instead of a stream to be consumed, a file becomes an array of bytes that can be addressed by index. Just as you can access array[500] without first touching array[0] through array[499], direct access lets you read file[500] directly.
Key characteristics of direct access:

Arbitrary positioning — the file pointer can be moved to any byte offset, forward or backward, in any order.
Position-independent cost — reaching offset K does not require reading the K bytes before it.
Explicit offsets — the program, not the file's linear structure, decides where the next read or write happens.
The array metaphor:
Think of a file as a very large byte array persisted to disk:
File: [B₀][B₁][B₂][B₃] ... [B₁₀₀₀₀₀₀] ... [Bₙ₋₁]
       ↑                        ↑
   offset 0             offset 1,000,000
With direct access, you can position at offset 1,000,000 and read that byte immediately, without touching any of the bytes before it.
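To make the array metaphor concrete, here is a minimal sketch (the function name and test file are illustrative, not from a real API) that fetches a single byte at an arbitrary offset without reading anything before it:

```c
#include <fcntl.h>
#include <unistd.h>

/* Read the single byte at `offset`, touching nothing before it.
 * Returns the byte value (0-255), or -1 on error / past EOF. */
int read_byte_at(const char *path, off_t offset) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    unsigned char b;
    /* pread positions and reads in one call -- conceptually file[offset] */
    ssize_t n = pread(fd, &b, 1, offset);
    close(fd);

    return (n == 1) ? b : -1;
}
```

Conceptually this is `file[offset]`: one open, one positioned read, with cost independent of how large the file is.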
Historical Context:
Direct access became practical with magnetic disk drives in the 1960s. Unlike tape (strictly sequential), disks had movable read/write heads that could position over any track. While seeking wasn't free (it required mechanical head movement), it was vastly faster than reading all intervening data.
Modern SSDs take this further—with no mechanical parts, positioning is essentially instantaneous. The byte-addressable abstraction that seemed like a convenient fiction on HDDs becomes physically accurate on solid-state media.
The file abstraction presents a clean byte-array interface regardless of underlying storage. Whether the file lives on spinning rust (HDD), flash chips (SSD), network storage (NFS), or RAM-based filesystem (tmpfs), the lseek/read/write interface remains identical. This virtualization is a core operating system contribution.
The lseek() system call is the gateway to direct access. It repositions the file offset (file pointer) for an open file descriptor, determining where the next read or write will occur.
The signature:
#include <unistd.h>
off_t lseek(int fd, off_t offset, int whence);
Parameters in depth:
fd — An open file descriptor. Must refer to a seekable file (regular files and block devices are seekable; pipes, FIFOs, sockets, and terminal devices are not).
offset — A signed integer specifying the offset. Its interpretation depends on whence.
whence — The reference point for the offset. Three standard values:
SEEK_SET — Offset from the beginning of the file. New position = offset.
SEEK_CUR — Offset from the current position. New position = current + offset.
SEEK_END — Offset from the end of the file. New position = file_size + offset.

Return value: The new file offset measured from the beginning, or -1 on error (with errno set).
| Call | Current Pos | File Size | New Position |
|---|---|---|---|
| lseek(fd, 0, SEEK_SET) | any | any | 0 (file start) |
| lseek(fd, 100, SEEK_SET) | any | any | 100 |
| lseek(fd, 50, SEEK_CUR) | 100 | any | 150 |
| lseek(fd, -30, SEEK_CUR) | 100 | any | 70 |
| lseek(fd, 0, SEEK_END) | any | 1000 | 1000 (EOF) |
| lseek(fd, -100, SEEK_END) | any | 1000 | 900 |
| lseek(fd, 100, SEEK_END) | any | 1000 | 1100 (past EOF) |
```c
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

/*
 * Comprehensive lseek() demonstration
 */
void demonstrate_lseek(const char *filename) {
    int fd = open(filename, O_RDWR);
    if (fd < 0) {
        perror("open");
        return;
    }

    // Get file size
    struct stat st;
    fstat(fd, &st);
    printf("File size: %lld bytes\n", (long long)st.st_size);

    // SEEK_SET: Absolute positioning
    off_t pos = lseek(fd, 1000, SEEK_SET);
    printf("After SEEK_SET to 1000: position = %lld\n", (long long)pos);

    // SEEK_CUR: Relative positioning
    pos = lseek(fd, 500, SEEK_CUR);    // Move forward 500
    printf("After SEEK_CUR +500: position = %lld\n", (long long)pos);

    pos = lseek(fd, -200, SEEK_CUR);   // Move backward 200
    printf("After SEEK_CUR -200: position = %lld\n", (long long)pos);

    // SEEK_END: Position relative to file end
    pos = lseek(fd, 0, SEEK_END);      // Go to EOF
    printf("After SEEK_END +0: position = %lld (EOF)\n", (long long)pos);

    pos = lseek(fd, -100, SEEK_END);   // 100 bytes before EOF
    printf("After SEEK_END -100: position = %lld\n", (long long)pos);

    // Get current position without moving (common idiom)
    pos = lseek(fd, 0, SEEK_CUR);
    printf("Current position (via SEEK_CUR +0): %lld\n", (long long)pos);

    // Rewind to beginning
    lseek(fd, 0, SEEK_SET);
    printf("Rewound to start\n");

    close(fd);
}

/*
 * Reading a specific record using direct access
 * Assumes fixed-size records of sizeof(Record) bytes each
 */
typedef struct {
    int id;
    char name[48];
    double value;
    char padding[40];  // Pads the record (with alignment, sizeof(Record) == 104 on typical LP64)
} Record;

Record read_record_by_number(int fd, int record_num) {
    Record record;

    // Calculate byte offset: record_num * record_size
    off_t offset = (off_t)record_num * sizeof(Record);

    // Seek directly to that record
    lseek(fd, offset, SEEK_SET);

    // Read the single record
    read(fd, &record, sizeof(Record));

    return record;
}
```

Calling lseek() on pipes, FIFOs, sockets, or terminals returns -1 with errno set to ESPIPE. These are stream-oriented and fundamentally cannot support random positioning. Always verify that lseek() succeeds when working with diverse file types.
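That check can be packaged as a small probe — a sketch using the side-effect-free zero-offset SEEK_CUR idiom (the function name is mine, not a standard API):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Probe whether a file descriptor supports repositioning.
 * lseek(fd, 0, SEEK_CUR) moves nothing: on a seekable file it returns
 * the current offset; on a pipe/FIFO/socket it fails with ESPIPE.
 * Returns 1 if seekable, 0 otherwise. */
int fd_is_seekable(int fd) {
    return lseek(fd, 0, SEEK_CUR) != (off_t)-1;
}
```

A library that accepts arbitrary descriptors can run this probe once at open time and fall back to a buffering strategy for streams.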
One of the most interesting and often misunderstood aspects of lseek() is that you can seek past the end of a file. The position is not bounded by the file's current size.
What happens when you seek past EOF and write?
When you seek to a position beyond the current file size and then write data, the file is extended. But here's the crucial insight: the gap between the old EOF and the new write position doesn't necessarily consume disk space.
This creates what's called a sparse file or a file with holes. The file system records that those bytes exist (they read as zeros) but doesn't allocate actual disk blocks for them.
Example scenario:
```c
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Creating a sparse file demonstration
 */
void create_sparse_file(const char *filename) {
    // Create a new file
    int fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    // Write some data at the beginning
    const char *start_data = "START DATA";
    write(fd, start_data, strlen(start_data));
    printf("Wrote %zu bytes at start\n", strlen(start_data));

    // Seek way past the current file size (1 GB ahead!)
    off_t target = 1024 * 1024 * 1024;  // 1 GB
    lseek(fd, target, SEEK_SET);

    // Write some data at the 1 GB position
    const char *end_data = "END DATA";
    write(fd, end_data, strlen(end_data));
    printf("Wrote %zu bytes at offset %lld\n",
           strlen(end_data), (long long)target);

    close(fd);

    // Check file size vs. disk usage
    struct stat st;
    stat(filename, &st);
    printf("\nFile analysis:\n");
    printf("  Logical size (st_size):    %lld bytes (~1 GB)\n",
           (long long)st.st_size);
    printf("  Actual blocks (st_blocks): %lld\n", (long long)st.st_blocks);
    printf("  Disk usage: %lld bytes (~%lld KB)\n",
           (long long)st.st_blocks * 512,
           (long long)st.st_blocks * 512 / 1024);
    printf("\nThe 'hole' (1 GB gap) uses no disk space!\n");
}

/*
 * Typical output:
 *
 * Wrote 10 bytes at start
 * Wrote 8 bytes at offset 1073741824
 *
 * File analysis:
 *   Logical size (st_size):    1073741832 bytes (~1 GB)
 *   Actual blocks (st_blocks): 16
 *   Disk usage: 8192 bytes (~8 KB)
 *
 * The 'hole' (1 GB gap) uses no disk space!
 */
```

Use cases for sparse files:
Virtual machine disk images — A 100GB virtual disk can be created instantly; only blocks actually written consume space.
Database pre-allocation — Reserve logical space for growth without consuming physical storage immediately.
Log files with timestamps — Seek to byte offset based on timestamp for time-based access.
Core dumps — Large regions of unmapped memory appear as holes in the dump file.
Torrent downloads — File is created at full size immediately; pieces are filled in as downloaded.
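The "created at full size immediately" trick doesn't even require a seek-and-write: `ftruncate()` extends a file's logical size directly, and on hole-supporting file systems the entire extension is a hole. A sketch (function name and sizes are illustrative):

```c
#include <fcntl.h>
#include <unistd.h>

/* Create `path` with logical size `size` without writing any data.
 * On file systems that support holes, this consumes almost no blocks;
 * every byte reads back as zero until written.
 * Returns 0 on success, -1 on error. */
int create_at_full_size(const char *path, off_t size) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;

    int rc = ftruncate(fd, size);  /* extend: the new region is one big hole */
    close(fd);
    return rc;
}
```

This is how a downloader can reserve a multi-gigabyte file instantly and then fill in pieces with pwrite() as they arrive.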
Reading holes:
When you read from a hole (a region that was never written), the file system returns zeros. This is transparent to the application—it appears as if the file contains zeros in those positions.
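A sketch verifying that behavior: write one byte far past EOF, then scan the untouched gap and confirm it reads as zeros (the helper name is mine):

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if the `len` bytes starting at `offset` in `path` are all
 * zero, 0 if any byte is nonzero, -1 on I/O error. */
int region_is_zero(const char *path, off_t offset, size_t len) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    char buf[256];
    char zeros[256] = {0};
    int result = 1;

    while (len > 0) {
        size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
        // pread reads the hole region; the kernel materializes zeros
        if (pread(fd, buf, chunk, offset) != (ssize_t)chunk) {
            result = -1;
            break;
        }
        if (memcmp(buf, zeros, chunk) != 0) {
            result = 0;
            break;
        }
        offset += chunk;
        len -= chunk;
    }
    close(fd);
    return result;
}
```

The application never sees a "hole" as a distinct thing; it just receives zero bytes, exactly as if they had been written.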
Caution with holes:
cp and tar may expand holes — Copying a sparse file with naive tools can create a dense file that consumes the full logical size. Use 'cp --sparse=always' or 'tar --sparse'.
Not all file systems support holes — FAT32, for example, does not. Holes become real zeros.
Disk quotas count logical size — Some quota systems count the full logical size, not actual blocks.
Fragmentation — Sparse files with scattered writes can become severely fragmented.
A common pattern in direct-access code is:
lseek(fd, offset, SEEK_SET);
read(fd, buffer, count);
This works, but it has a problem: it's not atomic. In a multi-threaded application where multiple threads share a file descriptor, another thread could seek between your lseek and read, corrupting both operations.
The pread() and pwrite() system calls solve this by combining the seek and I/O into a single atomic operation:
#include <unistd.h>
ssize_t pread(int fd, void *buf, size_t count, off_t offset);
ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset);
Key differences from lseek + read/write:
| Aspect | lseek + read | pread |
|---|---|---|
| Atomicity | Two separate syscalls; race-prone | Single atomic operation |
| File pointer | Modified by both operations | Not modified at all |
| System call count | 2 | 1 |
| Thread safety | Requires external locking | Inherently thread-safe per-call |
| Use case | Sequential with occasional seeks | True random access patterns |
```c
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
#include <string.h>

/*
 * Thread-safe random access using pread/pwrite
 * Multiple threads can safely access different offsets simultaneously
 */

#define RECORD_SIZE 100
#define NUM_RECORDS 10000

typedef struct {
    int fd;
    int record_num;
    char data[RECORD_SIZE];
} ThreadArgs;

// Thread-safe record read using pread
void* read_record_thread(void *arg) {
    ThreadArgs *args = (ThreadArgs*)arg;
    off_t offset = (off_t)args->record_num * RECORD_SIZE;

    // pread is atomic and doesn't modify the shared file pointer
    ssize_t bytes = pread(args->fd, args->data, RECORD_SIZE, offset);
    if (bytes != RECORD_SIZE) {
        fprintf(stderr, "Short read for record %d\n", args->record_num);
    }
    return NULL;
}

// Thread-safe record write using pwrite
void* write_record_thread(void *arg) {
    ThreadArgs *args = (ThreadArgs*)arg;
    off_t offset = (off_t)args->record_num * RECORD_SIZE;

    // pwrite is atomic and doesn't modify the shared file pointer
    ssize_t bytes = pwrite(args->fd, args->data, RECORD_SIZE, offset);
    if (bytes != RECORD_SIZE) {
        fprintf(stderr, "Short write for record %d\n", args->record_num);
    }
    return NULL;
}

/*
 * Demonstration: Multiple threads reading different records concurrently
 */
void concurrent_random_access(const char *filename) {
    int fd = open(filename, O_RDONLY);
    if (fd < 0) return;

    pthread_t threads[10];
    ThreadArgs args[10];

    // Launch 10 threads reading 10 different records simultaneously
    for (int i = 0; i < 10; i++) {
        args[i].fd = fd;
        args[i].record_num = i * 1000;  // Records 0, 1000, 2000, ...
        pthread_create(&threads[i], NULL, read_record_thread, &args[i]);
    }

    // Wait for all threads
    for (int i = 0; i < 10; i++) {
        pthread_join(threads[i], NULL);
        printf("Thread %d read record %d\n", i, args[i].record_num);
    }

    close(fd);
}
```

Use pread() and pwrite() for: (1) Multi-threaded access to shared file descriptors, (2) Database-style random record access, (3) Any situation where you need to read/write at a specific offset without affecting or being affected by the file pointer. They're the gold standard for modern random-access file I/O.
While direct access provides immense flexibility, it comes with performance trade-offs that every systems programmer must understand. The costs vary dramatically based on storage technology.
Hard Disk Drives (HDDs):
On spinning disks, random access incurs physical costs that sequential access avoids:
Seek time — Moving the actuator arm to a different track takes 3-15ms on average.
Rotational latency — Waiting for the target sector to rotate under the head adds another 2-8ms on average.
No read-ahead benefit — The OS cannot prefetch data since it doesn't know where you'll seek next.
For random 4KB reads on an HDD, throughput is on the order of 100-200 IOPS — well under 1 MB/s of useful data. Compare sequential 4KB reads, which stream at 100-200 MB/s once the head is positioned. The ratio: random access on HDD is 100-500x slower than sequential.
| Storage Type | Random 4KB Read | Sequential Read | Random Penalty |
|---|---|---|---|
| HDD (7200 RPM) | ~100 IOPS (0.4 MB/s) | ~150 MB/s | ~375x |
| HDD (15K RPM) | ~200 IOPS (0.8 MB/s) | ~200 MB/s | ~250x |
| SATA SSD | ~50K IOPS (200 MB/s) | ~500 MB/s | ~2.5x |
| NVMe SSD | ~500K IOPS (2 GB/s) | ~5 GB/s | ~2.5x |
| Intel Optane | ~500K IOPS (2 GB/s) | ~2.5 GB/s | ~1.25x |
| RAM (tmpfs) | ~10M+ IOPS | ~10 GB/s | ~1x |
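Numbers like these come from micro-benchmarks along the following lines — a hedged sketch (block count and iteration counts are illustrative, and a real benchmark must also defeat the page cache, e.g. via O_DIRECT or by dropping caches between runs):

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK 4096

/* Read `count` 4KB blocks front-to-back; returns total bytes read. */
long read_sequential(int fd, int count) {
    char buf[BLOCK];
    long total = 0;
    lseek(fd, 0, SEEK_SET);           // start at the beginning
    for (int i = 0; i < count; i++)
        total += read(fd, buf, BLOCK);
    return total;
}

/* Read `count` 4KB blocks at pseudo-random offsets within the first
 * `nblocks` blocks of the file; returns total bytes read. */
long read_random(int fd, int count, int nblocks) {
    char buf[BLOCK];
    long total = 0;
    for (int i = 0; i < count; i++) {
        off_t offset = (off_t)(rand() % nblocks) * BLOCK;
        total += pread(fd, buf, BLOCK, offset);  // seek+read in one call
    }
    return total;
}
```

Wrap each function in clock_gettime() timing to reproduce the table's ratios. On a fully cached file both paths look equally fast — itself an instructive result, since the penalty lives in the storage device, not the syscall interface.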
Solid State Drives (SSDs):
SSDs dramatically improve random access performance because they have no mechanical parts.
The random/sequential gap narrows significantly but doesn't disappear:
Page granularity — SSDs read in pages (4-16KB). Small random reads still incur per-page overhead.
Controller queuing — Deep command queues help random workloads; shallow queues favor sequential.
Internal organization — Sequential writes align with erase block boundaries; random writes cause write amplification.
Practical implications:
SSDs have transformed random access from 'avoid at all costs' to 'use judiciously'. Workloads that were impossible on HDD (e.g., serving millions of small random reads per second) are routine on SSD. This shift underpins modern databases, caching systems, and cloud storage.
Direct access isn't just an alternative to sequential access—for certain workloads, it's the only viable approach. Let's examine the canonical use cases:
1. Database Systems
Databases are the poster child for random access.
A single SQL query like SELECT * FROM users WHERE id = 42 might require 3-5 random reads (index levels + data page).
2. File System Metadata
File systems themselves rely heavily on random access — locating inodes, directory entries, and free-space maps all means jumping to specific on-disk offsets.
3. Memory-Mapped File Editing
Document editors and IDEs use memory mapping (covered later), which inherently provides random access — an edit touches only the bytes in the modified region, not the whole file.
4. Virtual Machine Disk Images
VM disk images emulate block devices: the guest OS issues reads and writes at arbitrary block addresses, which the host translates into random access within the image file.
5. Game Asset Loading
Modern games bundle assets in archive files; loading a specific texture or model means seeking directly to its recorded offset within the archive.
| Application | Read Pattern | Write Pattern | Random % |
|---|---|---|---|
| OLTP Database | Index lookups | Row updates, log appends | 80-95% |
| File System | Metadata ops | Block alloc, journal | 60-80% |
| Text Editor | Scroll, search | Local modifications | 70-90% |
| VM Disk Image | Guest I/O | Guest I/O | 50-90% |
| Game Engine | Asset loading | Save states | 40-70% |
| Web Server | Small file reads | Logging (sequential) | 20-40% |
Real applications typically exhibit hybrid access patterns. A database has random reads for queries but sequential writes for WAL. A web server serves mostly small random files but writes logs sequentially. Understanding your workload's access pattern distribution is key to optimization.
A powerful application of direct access is record-based file organization, where a file contains a sequence of fixed-size records that can be accessed by record number.
The model:
File: [Record 0][Record 1][Record 2]...[Record N-1]
| | |
0 100 200 ... (byte offsets)
If each record is 100 bytes, accessing record K means seeking to offset K * 100.
Complete implementation:
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>

/*
 * Fixed-size record file implementation
 * Provides O(1) access to any record by number
 */

typedef struct {
    int id;
    char name[64];
    double balance;
    int flags;
    char reserved[8];
} Record;  // sizeof(Record) == 96 with typical alignment; stored padded to 100

#define RECORD_SIZE 100
_Static_assert(sizeof(Record) <= RECORD_SIZE, "Record too large");

typedef struct {
    int fd;
    int record_count;
} RecordFile;

// Open or create a record file
RecordFile* record_file_open(const char *filename, int create) {
    RecordFile *rf = malloc(sizeof(RecordFile));
    int flags = O_RDWR;
    if (create) flags |= O_CREAT | O_TRUNC;

    rf->fd = open(filename, flags, 0644);
    if (rf->fd < 0) {
        free(rf);
        return NULL;
    }

    // Determine record count from file size
    off_t size = lseek(rf->fd, 0, SEEK_END);
    rf->record_count = size / RECORD_SIZE;
    return rf;
}

// Read a record by number (0-indexed)
int record_read(RecordFile *rf, int record_num, Record *out) {
    if (record_num < 0 || record_num >= rf->record_count) {
        return -1;  // Out of bounds
    }

    char buffer[RECORD_SIZE];
    off_t offset = (off_t)record_num * RECORD_SIZE;

    // Use pread for thread safety
    ssize_t bytes = pread(rf->fd, buffer, RECORD_SIZE, offset);
    if (bytes != RECORD_SIZE) {
        return -1;  // Read error
    }

    memcpy(out, buffer, sizeof(Record));
    return 0;
}

// Write a record by number (0-indexed)
int record_write(RecordFile *rf, int record_num, const Record *record) {
    char buffer[RECORD_SIZE];
    memset(buffer, 0, RECORD_SIZE);
    memcpy(buffer, record, sizeof(Record));

    off_t offset = (off_t)record_num * RECORD_SIZE;

    // Use pwrite for thread safety
    ssize_t bytes = pwrite(rf->fd, buffer, RECORD_SIZE, offset);
    if (bytes != RECORD_SIZE) {
        return -1;  // Write error
    }

    // Update record count if we extended the file
    if (record_num >= rf->record_count) {
        rf->record_count = record_num + 1;
    }
    return 0;
}

// Append a new record, return its record number
int record_append(RecordFile *rf, const Record *record) {
    int new_num = rf->record_count;
    if (record_write(rf, new_num, record) != 0) {
        return -1;
    }
    return new_num;
}

// Delete a record (mark as deleted; actual deletion is complex)
int record_delete(RecordFile *rf, int record_num) {
    Record empty = {0};
    empty.id = -1;  // Convention: id=-1 means deleted
    return record_write(rf, record_num, &empty);
}

// Usage example
void demo_record_file() {
    RecordFile *rf = record_file_open("accounts.dat", 1);

    // Add some records
    Record r1 = {.id = 1001, .name = "Alice",   .balance = 5000.00};
    Record r2 = {.id = 1002, .name = "Bob",     .balance = 3200.50};
    Record r3 = {.id = 1003, .name = "Charlie", .balance = 8100.75};

    int n1 = record_append(rf, &r1);
    int n2 = record_append(rf, &r2);
    int n3 = record_append(rf, &r3);
    printf("Created records at positions: %d, %d, %d\n", n1, n2, n3);

    // Random access: read record 1 (Bob)
    Record lookup;
    record_read(rf, 1, &lookup);
    printf("Record 1: %s, balance: %.2f\n", lookup.name, lookup.balance);

    // Update record 1
    lookup.balance -= 500.00;
    record_write(rf, 1, &lookup);

    close(rf->fd);
    free(rf);
}
```

Always pad records to fixed sizes that align with disk block sizes (ideally powers of 2: 64, 128, 256, 512 bytes). This prevents records from spanning block boundaries, which doubles the I/O cost and complicates atomic updates.
We've thoroughly explored direct (random) file access—the ability to read and write at arbitrary file positions without processing intervening data. Let's consolidate the critical points:
What's next:
Direct access gives us the ability to seek anywhere, but what if we need to find records by key rather than by position? What if records are variable-size? The next page explores Indexed Access—how to build and use indexes that map logical keys to physical file locations, enabling efficient lookup without knowing byte offsets in advance.
You now command a deep understanding of direct file access—the lseek/pread/pwrite system calls, sparse files, performance implications across storage media, and critical use cases. This foundation prepares you for understanding how indexes build efficient key-based lookup on top of random access primitives.