Secondary storage represents a fundamental shift in the memory hierarchy. Unlike registers, caches, and RAM—which are volatile and lose data when power is removed—secondary storage is persistent. It retains data across power cycles, system reboots, and hardware changes. This persistence is what makes durable file systems, databases, and long-term data retention possible.
However, persistence comes at a cost: secondary storage is dramatically slower than main memory. Understanding this tier's technology, performance characteristics, and interface with the operating system is essential for systems engineering.
By the end of this page, you will understand: the fundamental technologies behind HDDs and SSDs; their drastically different performance characteristics; storage interfaces and protocols; the OS storage stack; and emerging technologies bridging the gap between memory and storage.
The Hard Disk Drive (HDD) has been the dominant mass storage technology for over 50 years. Despite the rise of SSDs, HDDs remain important for cost-sensitive bulk storage and archival applications.
Physical construction:

An HDD contains:

- Platters: rigid disks coated with magnetic material, stacked on a common spindle
- Spindle motor: spins the platters at a constant rate (commonly 5,400-15,000 RPM)
- Read/write heads: one per platter surface, flying a few nanometers above it
- Actuator arm: swings the heads across the platters to reach different tracks
- Controller electronics: on-board cache and logic that translate host commands into head movements

Data organization:

Data on a platter is organized as:

- Tracks: concentric circles on each platter surface
- Sectors: fixed-size slices of a track (traditionally 512 bytes, 4 KB on modern drives)
- Cylinders: the set of tracks at the same radius across all platters
- Logical block addresses (LBAs): the linear numbering the host uses; the drive maps LBAs to physical sectors

Access latency components:

HDD access time is dominated by mechanical delays:

- Seek time: moving the actuator arm to the target track
- Rotational latency: waiting for the target sector to rotate under the head (on average, half a revolution)
- Transfer time: reading the data as it passes under the head (small compared to the other two)
| Specification | 7,200 RPM Desktop | 15,000 RPM Enterprise | 5,400 RPM Laptop |
|---|---|---|---|
| Average seek time | 8-10 ms | 3-4 ms | 10-14 ms |
| Rotational latency | 4.17 ms | 2 ms | 5.56 ms |
| Average access time | ~12-14 ms | ~5-6 ms | ~15-20 ms |
| Sequential read | 150-200 MB/s | 200-300 MB/s | 100-140 MB/s |
| Random 4KB IOPS | 75-150 | 180-300 | 50-80 |
| Power (active) | 6-10W | 10-15W | 1.5-3W |
HDDs are fundamentally sequential devices. Sequential access achieves 150+ MB/s, while random 4KB access achieves only about 0.3 MB/s of effective bandwidth (75 IOPS × 4 KB ≈ 0.3 MB/s). This roughly 500× difference makes HDDs unsuitable for random access workloads like databases without extensive caching.
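To make that gap concrete, here is a small back-of-the-envelope calculation in C using the 7,200 RPM figures from the table above (the numbers are illustrative, not from any specific drive):

```c
/* Back-of-the-envelope HDD math using the 7,200 RPM column above. */
#include <stdio.h>

int main(void) {
    double seek_ms     = 10.0;   /* average seek time                                   */
    double rotation_ms = 4.17;   /* average rotational latency: half of 60,000/7,200 ms */
    double seq_mbs     = 150.0;  /* sequential transfer rate                            */
    double request_kb  = 4.0;    /* random 4 KB request                                 */

    double transfer_ms = request_kb / 1024.0 / seq_mbs * 1000.0;  /* ~0.03 ms, negligible */
    double access_ms   = seek_ms + rotation_ms + transfer_ms;

    double iops       = 1000.0 / access_ms;          /* one request in flight at a time */
    double random_mbs = iops * request_kb / 1024.0;  /* effective random bandwidth      */

    printf("access time : %.1f ms\n", access_ms);          /* ~14.2 ms   */
    printf("random IOPS : %.0f\n", iops);                   /* ~70        */
    printf("random MB/s : %.2f\n", random_mbs);             /* ~0.28 MB/s */
    printf("seq/random  : %.0fx\n", seq_mbs / random_mbs);  /* ~545x, i.e. roughly the 500x above */
    return 0;
}
```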
Solid State Drives (SSDs) use NAND flash memory to store data, eliminating mechanical components entirely. This fundamental change brings dramatically different performance characteristics.
NAND flash basics:
NAND flash stores data in floating-gate transistors:

- Each cell traps charge on an electrically isolated (floating) gate; the amount of trapped charge shifts the transistor's threshold voltage
- Reading senses the threshold voltage; programming injects charge; erasing removes it
- Cells are organized into pages for reading and programming, but can only be erased a whole block at a time

Cell types by bits per cell:

- SLC (single-level cell): 1 bit per cell — fastest and most durable, most expensive
- MLC (multi-level cell): 2 bits per cell
- TLC (triple-level cell): 3 bits per cell — the consumer mainstream
- QLC (quad-level cell): 4 bits per cell — highest density, lowest endurance and slowest writes
SSD internal organization:
SSDs have a complex internal structure:

- Controller: an embedded processor running the flash translation layer (FTL), wear leveling, and error correction
- DRAM cache: holds the logical-to-physical mapping table and buffers writes (some budget designs are DRAM-less)
- NAND packages spread across multiple channels, so many dies can be read or written in parallel
- Pages (commonly 4-16 KB) grouped into blocks (hundreds of pages), the unit of erasure
The program/erase asymmetry:
NAND flash has a critical constraint: you cannot overwrite data in place. Pages can only be programmed after the entire block containing them has been erased, and erasing is far slower than programming.

This creates the garbage collection problem: to reuse a "used" page, the SSD must:

- Write the new data to a free page elsewhere and mark the old page invalid
- Eventually pick a mostly-invalid block and copy its remaining valid pages to fresh locations
- Erase the block so its pages become writable again
This background activity can cause performance variability and write amplification.
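Write amplification can be quantified as the ratio of bytes physically written to NAND (host writes plus garbage-collection copies) to bytes the host requested. A minimal sketch with purely illustrative numbers:

```c
/* Write amplification factor (WAF) = NAND writes / host writes.
 * The figures below are illustrative, not measured from a real drive. */
#include <stdio.h>

int main(void) {
    double host_writes_gb = 100.0;  /* data the application wrote       */
    double gc_copies_gb   = 80.0;   /* valid pages relocated during GC  */

    double nand_writes_gb = host_writes_gb + gc_copies_gb;
    double waf = nand_writes_gb / host_writes_gb;

    printf("WAF = %.2f\n", waf);    /* 1.80: each host GB costs 1.8 GB of NAND wear */
    return 0;
}
```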
| Specification | Consumer NVMe SSD | Enterprise NVMe SSD | SATA SSD |
|---|---|---|---|
| Sequential read | 3-7 GB/s | 7-14 GB/s | 500-550 MB/s |
| Sequential write | 2-5 GB/s | 3-10 GB/s | 450-520 MB/s |
| Random 4KB read IOPS | 400K-1M | 1-2M | 90-100K |
| Random 4KB write IOPS | 200K-700K | 200K-500K | 70-90K |
| Average latency (read) | ~10-50 μs | ~10-30 μs | ~100 μs |
| Average latency (write) | ~20-100 μs | ~20-50 μs | ~100 μs |
| Endurance (rated writes) | 300-600 TBW | 5-50 PBW | 150-300 TBW |
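Endurance ratings are quoted either as total bytes written (TBW/PBW) or as drive writes per day (DWPD); converting between them is simple arithmetic. The sketch below assumes a hypothetical 1 TB drive rated at 600 TBW over a 5-year warranty:

```c
/* TBW -> DWPD conversion for a hypothetical drive. */
#include <stdio.h>

int main(void) {
    double capacity_tb  = 1.0;    /* drive capacity             */
    double rated_tbw    = 600.0;  /* rated terabytes written    */
    double warranty_yrs = 5.0;

    double dwpd = rated_tbw / (capacity_tb * warranty_yrs * 365.0);
    printf("DWPD = %.2f\n", dwpd);  /* ~0.33 full drive writes per day */
    return 0;
}
```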
The interface between storage devices and the computer system significantly impacts performance. Different interfaces evolved for different use cases, and modern NVMe represents a clean-slate design for flash storage.
Major storage interfaces:

- SATA (Serial ATA): the legacy consumer interface, limited to ~550 MB/s and a single 32-entry command queue
- SAS (Serial Attached SCSI): the enterprise evolution of SCSI, with higher bandwidth, deeper queues, and dual-porting for redundancy
- NVMe (Non-Volatile Memory Express): a PCIe-attached protocol designed specifically for flash, with massive parallelism and minimal protocol overhead
NVMe deep dive:
NVMe was designed from the ground up for low-latency, high-parallelism flash storage:
Queue architecture: up to 64K submission/completion queue pairs, each up to 64K entries deep, typically one pair per CPU core so cores can submit I/O without locking each other.

Streamlined command set: a small set of required I/O commands (read, write, flush, and a handful more) instead of the layered SCSI/ATA command sets.

Low latency optimizations: commands are posted to queues in host memory and announced with a single doorbell register write; completions arrive via MSI-X interrupts (or polling), with no host bus adapter translation layer in the path.
| Interface | Max Bandwidth | Queue Depth | Latency | Use Case |
|---|---|---|---|---|
| SATA 3.0 | ~550 MB/s | 32 (single queue) | ~100 μs | Consumer HDD/SSD |
| SAS-3 | ~1.2 GB/s | 256+ per LUN | ~70-100 μs | Enterprise HDD/SSD |
| NVMe PCIe 3.0 x4 | ~3.5 GB/s | 64K × 64K | ~10-30 μs | Consumer NVMe SSD |
| NVMe PCIe 4.0 x4 | ~7 GB/s | 64K × 64K | ~10-30 μs | Modern NVMe SSD |
| NVMe PCIe 5.0 x4 | ~14 GB/s | 64K × 64K | ~10-30 μs | Latest Gen NVMe |
Deep queues are essential for SSD performance. With 30 μs latency, one outstanding request at a time (queue depth 1) yields only ~33K IOPS; with 32 commands in flight, the same drive can exceed 1M IOPS. NVMe's per-CPU queues eliminate the contention that SATA's single queue creates.
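The arithmetic behind that note is just Little's Law: achievable IOPS ≈ outstanding requests ÷ per-request latency. A minimal sketch (it assumes the device can overlap all outstanding requests perfectly):

```c
/* Little's Law: IOPS ~= queue depth / per-request latency. */
#include <stdio.h>

int main(void) {
    double latency_us = 30.0;   /* per-request device latency */

    for (int qd = 1; qd <= 32; qd *= 2) {
        double iops = qd / (latency_us / 1e6);   /* assumes perfect overlap of requests */
        printf("queue depth %2d -> %9.0f IOPS\n", qd, iops);
    }
    return 0;   /* queue depth 1 -> ~33K IOPS; queue depth 32 -> ~1.07M IOPS */
}
```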
Understanding storage performance requires considering multiple dimensions beyond simple bandwidth numbers. Workload characteristics dramatically affect achieved performance.
Key performance metrics:

- Bandwidth (throughput): MB/s or GB/s moved, the headline number for large transfers
- IOPS: I/O operations per second, the limiting factor for small requests
- Latency: time to complete a single operation, including any queueing delay
- Queue depth: how many operations are outstanding at once

Workload patterns:

Sequential access: large contiguous transfers (streaming, backups, log writes). Both HDDs and SSDs deliver their best bandwidth here.

Random access: small requests scattered across the device (databases, virtual machines, metadata). HDDs collapse to double-digit IOPS; SSDs remain fast because there is no seek.

Mixed workloads: real applications blend reads and writes at varying sizes and queue depths; the read/write ratio and request size often matter more than any single headline figure.
Latency under load:
As queue depth increases, latency typically increases (more operations waiting). SSDs handle this better than HDDs. At very high loads, SSDs may exhibit garbage collection pauses causing latency spikes.
| Metric | 7200 RPM HDD | SATA SSD | NVMe SSD | Difference |
|---|---|---|---|---|
| Sequential read | 180 MB/s | 550 MB/s | 3500 MB/s | ~20× (NVMe vs HDD) |
| Sequential write | 170 MB/s | 520 MB/s | 3000 MB/s | ~18× (NVMe vs HDD) |
| Random 4KB read | ~100 IOPS | ~95K IOPS | ~500K IOPS | ~5000× (NVMe vs HDD) |
| Random 4KB write | ~100 IOPS | ~85K IOPS | ~400K IOPS | ~4000× (NVMe vs HDD) |
| Access latency | ~10 ms | ~100 μs | ~30 μs | ~300× (NVMe vs HDD) |
| Power (active) | 8W | 3-5W | 5-10W | Varies |
Real-world performance differs from benchmarks. SSD performance degrades as drives fill up (less free space for garbage collection). Sustained writes may throttle due to heat. Consumer SSDs often slow dramatically once their fast SLC write cache is exhausted. Always benchmark with realistic workloads.
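As a starting point for such measurements, here is a minimal random-read micro-benchmark sketch in C. The file path and size are placeholders, and it uses O_DIRECT (covered in the storage-stack section below) so the page cache does not distort the result:

```c
/* Minimal random 4 KB read micro-benchmark (Linux).
 * Path and size are placeholders: point it at a large existing file
 * (or a raw device you can safely read) at least file_size bytes long. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    const char *path      = "/path/to/testfile";  /* placeholder */
    const off_t file_size = 1L << 30;             /* assume 1 GiB of readable data */
    const int   n  = 10000;
    const int   bs = 4096;

    int fd = open(path, O_RDONLY | O_DIRECT);     /* O_DIRECT: keep the page cache out */
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    if (posix_memalign(&buf, 4096, bs) != 0)      /* O_DIRECT requires aligned buffers */
        return 1;

    long blocks = file_size / bs;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < n; i++) {
        off_t off = (off_t)(rand() % blocks) * bs;            /* block-aligned offset */
        if (pread(fd, buf, bs, off) != bs) { perror("pread"); return 1; }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%.0f IOPS, %.1f us average latency\n", n / secs, secs / n * 1e6);

    free(buf);
    close(fd);
    return 0;
}
```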
The operating system provides a complex software stack between applications and storage hardware. This stack provides abstraction, caching, scheduling, and management services.
Linux storage stack (representative):

From top to bottom:

- Application: issues read()/write()/mmap() or async I/O (io_uring)
- VFS (virtual file system): the common API over all file systems
- File system: ext4, XFS, Btrfs, and others, mapping files to block addresses
- Page cache: caches file data in RAM and absorbs writes
- Block layer (blk-mq): request queues, merging, and the I/O scheduler
- Device driver: nvme, ahci/libata, or SCSI drivers that speak the hardware protocol
- Hardware: the controller and the storage device itself
I/O Schedulers:
I/O schedulers order and merge requests to optimize performance:
none (noop): No scheduling—pass requests directly to device. Best for NVMe SSDs with internal parallelism and no seek penalty.
mq-deadline: Batch requests while ensuring no request waits too long (deadline). Good balance for SSDs.
BFQ (Budget Fair Queueing): Per-process fairness and latency guarantees. Better for interactive workloads on slower devices.
kyber: Simple, low-overhead latency-targeting scheduler for fast devices.
Historical (now deprecated): the single-queue CFQ, deadline, and noop schedulers were removed along with the legacy block layer (around Linux 5.0); their roles are now filled by BFQ, mq-deadline, and none respectively.
For NVMe SSDs, the 'none' scheduler is often optimal. The SSD controller has sophisticated internal scheduling; adding OS-level scheduling introduces latency without benefit. The multi-queue block layer (blk-mq) enables parallel I/O submission to all hardware queues.
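The active scheduler for a device is exposed through sysfs at /sys/block/&lt;device&gt;/queue/scheduler, with the active one shown in brackets. A minimal sketch that prints it; the device name nvme0n1 is a placeholder for whatever your system uses:

```c
/* Print the active I/O scheduler for a block device via sysfs.
 * The device name is a placeholder; adjust for your system (e.g. sda, nvme0n1). */
#include <stdio.h>

int main(void) {
    const char *path = "/sys/block/nvme0n1/queue/scheduler";
    char line[256];

    FILE *f = fopen(path, "r");
    if (!f) { perror("fopen"); return 1; }

    /* Output looks like "[none] mq-deadline kyber bfq"; brackets mark the active one. */
    if (fgets(line, sizeof line, f))
        printf("%s", line);

    fclose(f);
    return 0;
}
```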
Direct I/O and Bypass:
For applications needing direct device access (databases, high-performance storage):
O_DIRECT: Bypasses page cache, reading/writing directly to device. Avoids double-buffering for applications managing their own cache.
io_uring: Modern Linux async I/O interface with submission/completion rings in shared memory. Achieves near-hardware latency with batched syscalls (a minimal read example appears below).
SPDK (Storage Performance Development Kit): User-space NVMe driver completely bypassing kernel. Achieves maximum performance but requires application modification.
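To illustrate the io_uring model, here is a minimal single-read sketch using the liburing helper library (link with -luring); the file path is a placeholder and error handling is trimmed for brevity:

```c
/* Minimal io_uring read using liburing: one 4 KB read from offset 0. */
#include <fcntl.h>
#include <liburing.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    struct io_uring ring;
    if (io_uring_queue_init(8, &ring, 0) < 0) return 1;    /* 8-entry SQ/CQ rings */

    int fd = open("/path/to/file", O_RDONLY);               /* placeholder path    */
    if (fd < 0) { perror("open"); return 1; }

    char *buf = malloc(4096);

    /* Fill a submission queue entry, then submit (one syscall rings the doorbell). */
    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    if (!sqe) return 1;
    io_uring_prep_read(sqe, fd, buf, 4096, 0);
    io_uring_submit(&ring);

    /* Reap the completion from the shared-memory completion ring. */
    struct io_uring_cqe *cqe;
    io_uring_wait_cqe(&ring, &cqe);
    printf("read returned %d bytes\n", cqe->res);
    io_uring_cqe_seen(&ring, cqe);

    io_uring_queue_exit(&ring);
    free(buf);
    close(fd);
    return 0;
}
```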
Device drivers are kernel modules that interface between the OS block layer and specific storage hardware. Well-designed drivers are essential for performance and reliability.
NVMe driver architecture (Linux nvme):
The NVMe driver is relatively simple compared to SCSI/ATA drivers because NVMe is a clean protocol:
Key driver responsibilities:

- Creating the admin queue and per-CPU I/O submission/completion queue pairs at probe time
- Translating block-layer requests into NVMe commands and mapping buffers for DMA
- Ringing doorbell registers to notify the device and handling completions via interrupts or polling
- Namespace discovery, error handling, controller resets, and power management
Interrupt modes:
Legacy interrupts: Single shared interrupt line. High overhead from shared interrupt handling. Rarely used for modern storage.
MSI (Message Signaled Interrupts): Dedicated interrupt per device. Lower latency than legacy.
MSI-X (Extended MSI): Multiple interrupts per device—typically one per CPU core/queue. Enables parallel completion processing without contention.
Polling mode: Driver continuously polls for completions instead of waiting for interrupts. Lower latency at cost of CPU usage. Useful for ultra-low-latency NVMe.
```c
// Simplified NVMe I/O flow

// 1. Application requests read
read(fd, buffer, 4096);

// 2. VFS → file system → block layer
block_request = create_request(LBA, length, READ);

// 3. Block layer dispatches to NVMe driver
nvme_queue_request(queue, block_request);

// 4. Driver builds NVMe command
nvme_cmd = {
    opcode: NVME_READ,
    lba:    block_request->lba,
    length: block_request->length / 512 - 1,
    prp1:   dma_addr(buffer),   // Physical address for DMA
};

// 5. Submit to hardware queue
submission_queue[tail] = nvme_cmd;
writel(tail, doorbell_register);   // Ring doorbell

// 6. Device processes request via DMA
// ...hardware reads from NAND, DMAs to buffer...

// 7. Completion interrupt fires
irq_handler() {
    while (completion_queue[head].valid) {
        complete_block_request(completion_queue[head].request);
        head++;
    }
}

// 8. Application unblocks with data in buffer
```

The traditional gap between volatile memory (DRAM) and persistent storage (HDD/SSD) is being bridged by new technologies that combine the characteristics of both.
Storage-class memory technologies:

- 3D XPoint (Intel Optane): byte-addressable and persistent, with latencies in the hundreds of nanoseconds — between DRAM and NAND (since discontinued, but influential)
- NVDIMMs: DRAM modules backed by flash and an energy store so contents survive power loss
- Emerging candidates such as MRAM and ReRAM targeting the same byte-addressable, persistent niche
Persistent memory programming:
With byte-addressable persistent memory, traditional file I/O becomes optional: applications can map persistent memory directly into their address space (DAX mappings) and update durable data structures with ordinary loads and stores, using cache-line flushes and fences (or libraries such as PMDK) to control when data actually reaches persistence.
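As a rough illustration of the load/store programming model, the sketch below maps a file and updates it with an ordinary store. The path is a placeholder, and on real persistent memory the file would sit on a DAX-mounted filesystem with PMDK-style cache flushes standing in for msync:

```c
/* Store-to-persistence sketch: map a file and update it with ordinary stores.
 * The path is a placeholder; on a real pmem setup the file would live on a
 * DAX-mounted filesystem, and libpmem would replace msync with flush + fence. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("/mnt/pmem/counter", O_RDWR | O_CREAT, 0644);   /* placeholder path */
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, 4096) < 0) { perror("ftruncate"); return 1; }

    long *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    *p += 1;                      /* ordinary load/store, no read()/write() syscalls */
    msync(p, 4096, MS_SYNC);      /* force the update to persistent media            */
    printf("counter = %ld\n", *p);

    munmap(p, 4096);
    close(fd);
    return 0;
}
```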
ZNS (Zoned Namespace) SSDs:
A new SSD interface where the drive is divided into zones that must be written sequentially: the host writes each zone at its write pointer and resets whole zones to reclaim space. Because data placement is controlled by the host, the device needs little or no internal garbage collection, which reduces write amplification, over-provisioning, and on-device DRAM requirements.
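The write-pointer discipline is easy to model. The toy sketch below is not a real ZNS API, just an illustration of the rules: writes are accepted only at the zone's write pointer, and space is reclaimed only by resetting the whole zone:

```c
/* Toy model of a ZNS zone (illustrative, not a real ZNS API). */
#include <stdbool.h>
#include <stdio.h>

struct zone {
    long start;     /* first LBA of the zone             */
    long capacity;  /* writable blocks in the zone       */
    long wp;        /* write pointer, relative to start  */
};

/* Append-only write: succeeds only at the current write pointer. */
bool zone_write(struct zone *z, long lba, long blocks) {
    if (lba != z->start + z->wp || z->wp + blocks > z->capacity)
        return false;            /* out-of-order or overflowing writes are rejected */
    z->wp += blocks;
    return true;
}

void zone_reset(struct zone *z) { z->wp = 0; }   /* erase the zone, reclaim its space */

int main(void) {
    struct zone z = { .start = 0, .capacity = 1024, .wp = 0 };
    printf("sequential write: %s\n", zone_write(&z, 0, 8)   ? "ok" : "rejected");
    printf("random write:     %s\n", zone_write(&z, 512, 8) ? "ok" : "rejected");
    zone_reset(&z);
    return 0;
}
```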
The traditional 3-tier hierarchy (cache, RAM, disk) is becoming more nuanced: registers → cache → DRAM → Optane/NVRAM → fast NVMe SSD → slower QLC SSD → HDD → tape. Operating systems and applications are adapting to manage these multiple tiers intelligently.
We've explored secondary storage in depth—from spinning platters to flash cells, from SATA to NVMe, from device drivers to emerging technologies. Storage is where persistence meets performance, and understanding this tier is essential for systems design.
What's next:
Having explored all tiers of the memory hierarchy—registers, caches, main memory, and secondary storage—we'll now examine the access times and tradeoffs across the entire hierarchy. We'll quantify the performance gaps, understand the economic factors driving design decisions, and learn how to reason about memory hierarchy when designing systems and writing performance-critical code.
You now understand secondary storage technologies, interfaces, and OS integration. This knowledge is essential for understanding file systems, optimizing I/O-intensive applications, and designing storage architectures.