When you execute a database query and wait for results, that wait time is composed of countless smaller delays—each governed by the laws of physics and the constraints of engineering. Understanding access time at a fundamental level transforms vague notions of "fast" and "slow" into precise, actionable knowledge.
Every nanosecond of storage access is accounted for. Light traveling through fiber optic cable covers about 20 centimeters per nanosecond. Electrical signals propagating through copper traces travel at roughly 60-70% of light speed. A magnetic disk platter spinning at 7,200 RPM completes one rotation in 8.33 milliseconds. These physical constants are immutable; no amount of software optimization can transcend them.
This page dissects access time at each tier of the memory hierarchy, revealing the physics behind the numbers. When you understand why an HDD access takes 10 milliseconds while a cache access takes 1 nanosecond, you can predict performance behavior, identify bottlenecks, and make informed architectural decisions.
By the end of this page, you will understand the physical and architectural sources of latency at each memory tier, how to calculate expected access times for different storage operations, the distinction between latency and throughput, and practical methods for measuring and monitoring access times in production systems.
Storage access time is not a single monolithic quantity—it's the sum of multiple components, each with its own physical origin. Understanding these components is essential for diagnosing performance issues and optimizing system behavior.
The Fundamental Equation:
Total Access Time = Command Overhead + Seek/Address Time + Transfer Time + Protocol Overhead
Each component has a distinct origin:
- Command Overhead: time for the host and device controller to issue, decode, and schedule the request.
- Seek/Address Time: time to locate the data, whether mechanical head movement on an HDD or address decoding and row activation in DRAM and flash.
- Transfer Time: time to move the requested bytes once they are located, set by media and interface bandwidth.
- Protocol Overhead: time consumed by the interface protocol (SATA, NVMe, network) for framing, acknowledgments, and interrupts.
Latency vs. Throughput:
Two distinct metrics characterize storage performance: latency, the time to complete a single operation (nanoseconds to milliseconds), and throughput, the rate of work completed per unit time (MB/s or IOPS).
These metrics don't always correlate:
| Storage Type | Latency | Throughput | Notes |
|---|---|---|---|
| RAM | Very low | Very high | Consistent—both excellent |
| NVMe SSD | Low | Very high | Good at both |
| HDD | High | Moderate | High latency but decent sequential throughput |
| Network Storage | Variable | Potentially very high | Latency can be poor, throughput can exceed local disk |
Queue Depth and Parallelism:
Modern storage systems can process multiple requests concurrently. The queue depth is the number of outstanding requests. Higher queue depths can mask latency by keeping the device busy:
Effective Throughput = IOPS × Transfer Size
IOPS = 1 / Latency (for QD=1)
IOPS = Queue Depth / Latency (for higher QD, up to device limits)
An SSD with 100μs latency achieves 10,000 IOPS at queue depth 1. At queue depth 32, it can achieve 300,000+ IOPS if the internal parallelism supports it.
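The relationship can be made concrete with a small C sketch that applies these formulas to a hypothetical device; the 100 μs latency, the 4 KiB transfer size, and the parallelism cap are assumptions, not measurements.

```c
#include <stdio.h>

/* Rough model: IOPS ~ queue_depth / latency, capped by the device's
 * internal parallelism. All parameters are illustrative assumptions. */
int main(void) {
    const double latency_s     = 100e-6;   /* 100 us per request (assumed)      */
    const double max_iops      = 320000.0; /* assumed device parallelism limit  */
    const double transfer_size = 4096.0;   /* 4 KiB per request                 */
    const int    depths[]      = {1, 4, 16, 32, 64};

    for (int i = 0; i < 5; i++) {
        double iops = depths[i] / latency_s;       /* QD / latency          */
        if (iops > max_iops) iops = max_iops;      /* device limit          */
        double mb_s = iops * transfer_size / 1e6;  /* effective throughput  */
        printf("QD=%3d  IOPS=%8.0f  throughput=%7.1f MB/s\n",
               depths[i], iops, mb_s);
    }
    return 0;
}
```

At queue depth 1 this reproduces the 10,000 IOPS figure above; beyond the device's parallelism limit, additional queue depth no longer adds throughput and only adds waiting time.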
At the top of the memory hierarchy, CPU registers and caches operate at speeds difficult to comprehend. Understanding their timing illuminates why databases invest heavily in cache-efficient algorithms.
CPU Registers:
Registers are the fastest possible storage: they are wired directly into the CPU execution pipeline and are read and written within a single cycle as instructions execute.
Registers do not have "access time" in the traditional sense—they are inputs and outputs of CPU instructions, not addressable memory.
L1 Cache:
The L1 cache is the first true memory level, sitting closest to the CPU cores:
| Property | L1 Data Cache | L1 Instruction Cache |
|---|---|---|
| Typical size | 32-64 KB per core | 32-64 KB per core |
| Access time | 3-5 CPU cycles (~1-1.5ns at 3GHz) | 1-2 cycles (pipelined fetch) |
| Associativity | 8-12 way set associative | 4-8 way |
| Line size | 64 bytes | 64 bytes |
| Bandwidth | ~200 GB/s per core | ~100 GB/s per core |
| Hit rate (typical) | 95-99% | 98-99% |
L2 Cache:
The L2 cache serves as a larger, slightly slower backing store for L1:
| Property | Typical Value |
|---|---|
| Size | 256KB - 1MB per core |
| Access time | 10-15 cycles (~4-5ns) |
| Associativity | 8-16 way |
| Bandwidth | ~100 GB/s per core |
| Hit rate | 80-95% of L1 misses |
L3 Cache (Last-Level Cache):
Shared across all cores, L3 is the last line of defense before main memory:
| Property | Typical Value |
|---|---|
| Size | 8-64 MB total |
| Access time | 30-50 cycles (~12-20ns) |
| Associativity | 12-20 way |
| Bandwidth | ~200-500 GB/s shared |
| Hit rate | 80-90% of L2 misses |
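To connect these tables, here is a minimal C sketch that computes an average memory access time (AMAT) through the three cache levels, using illustrative midpoints of the latencies and hit rates listed above; exact figures vary by microarchitecture.

```c
#include <stdio.h>

/* Average memory access time through a three-level cache, using
 * illustrative latencies and hit rates taken from the tables above. */
int main(void) {
    const double l1_ns = 1.5, l2_ns = 4.5, l3_ns = 15.0, dram_ns = 70.0;
    const double l1_hit = 0.97, l2_hit = 0.90, l3_hit = 0.85; /* of accesses reaching each level */

    /* Work inward: the penalty for missing L3 is a DRAM access, and so on. */
    double l3_amat = l3_ns + (1.0 - l3_hit) * dram_ns;
    double l2_amat = l2_ns + (1.0 - l2_hit) * l3_amat;
    double l1_amat = l1_ns + (1.0 - l1_hit) * l2_amat;

    printf("AMAT ~ %.2f ns per memory reference\n", l1_amat);
    return 0;
}
```

With hit rates this high, the effective cost per reference stays within roughly a factor of two of L1 latency; as hit rates fall, it drifts toward the ~70ns DRAM figure.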
Why Cache Latency Matters for Databases:
Query processing involves billions of memory references per second. Consider a hash join probing a hash table: if the table fits in L1 cache, each probe costs roughly a nanosecond; if every probe misses all the way to DRAM, each one costs on the order of 100ns.
The difference between L1-resident and RAM-resident is 100x. Cache-conscious algorithms can achieve dramatic speedups by ensuring that hot data structures fit in cache.
When you access one byte, the entire 64-byte cache line is loaded. If your data structure places related items in adjacent bytes, subsequent accesses are 'free.' If related items are scattered across memory, you pay for a cache line load each time. This is why columnar storage formats outperform row storage for analytics—needed columns are packed together.
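The layout effect can be sketched in C: summing one 8-byte field from a 64-byte "row" touches a fresh cache line for every record, while the same values packed in a separate column array deliver eight useful values per cache line. The struct layout and sizes below are illustrative, not drawn from any particular engine.

```c
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 20)

/* Row-like layout: each record occupies a full 64-byte cache line,
 * so summing only 'price' still loads one line per record. */
struct row { int id; double price; char padding[48]; };

int main(void) {
    struct row *rows = malloc(N * sizeof *rows);   /* row store:    64 MB */
    double *prices   = malloc(N * sizeof *prices); /* column store:  8 MB */
    if (!rows || !prices) return 1;

    for (int i = 0; i < N; i++) { rows[i].price = i; prices[i] = i; }

    double sum_rows = 0, sum_cols = 0;
    for (int i = 0; i < N; i++) sum_rows += rows[i].price; /* ~1 cache line per value  */
    for (int i = 0; i < N; i++) sum_cols += prices[i];     /* 8 values per cache line  */

    printf("row-layout sum=%.0f column-layout sum=%.0f\n", sum_rows, sum_cols);
    free(rows);
    free(prices);
    return 0;
}
```

Timing the two loops (for example with clock_gettime) typically shows the column loop running several times faster once the data no longer fits in cache, for exactly the cache-line reason described above.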
Main memory access involves a complex choreography between the CPU's memory controller and the DRAM modules. Understanding this process reveals why DRAM latency has improved slowly compared to CPU speeds.
The Memory Access Sequence:
Address Decode: The memory controller decodes the address to identify channel, rank, bank, row, and column.
Row Activation (tRCD): If the target row is not already active, it must be opened; its contents are sensed into the bank's row buffer, taking roughly 14ns (the RAS-to-CAS delay).
Column Read (tCAS): The column address selects the target bytes from the open row buffer; this CAS latency adds roughly another 14-16ns.
Data Transfer: Data travels from DRAM to the CPU across the memory bus in a burst covering one 64-byte cache line.
Precharge (tRP): To access a different row in the same bank, the open row must first be closed and the bit lines precharged, costing roughly 14ns before the next activation can begin.
| Parameter | DDR4-2400 | DDR4-3200 | DDR5-4800 | DDR5-6400 |
|---|---|---|---|---|
| CAS Latency (CL) | 16-17 | 18-22 | 36-42 | 40-48 |
| tRCD (ns) | ~14ns | ~14ns | ~14ns | ~14ns |
| tRP (ns) | ~14ns | ~14ns | ~14ns | ~14ns |
| Absolute latency* | ~70ns | ~65ns | ~75ns | ~70ns |
| Bandwidth (per ch) | 19.2 GB/s | 25.6 GB/s | 38.4 GB/s | 51.2 GB/s |
*CAS latency in nanoseconds = CL × clock period; higher CL counts at higher frequencies work out to roughly the same 13-16ns across generations. The absolute latency row shows a full random access as seen by the CPU, which also includes row activation, precharge, and memory-controller overhead.
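The conversion in the footnote is easy to verify; this sketch derives CAS latency in nanoseconds from the transfer rate, picking one CL value from each range in the table above.

```c
#include <stdio.h>

/* CAS latency in ns: the I/O clock runs at half the transfer rate (DDR
 * transfers twice per clock), so clock period = 2000 / MT_s ns and
 * CAS ns = CL * clock period. CL values are taken from the table above. */
int main(void) {
    struct { const char *name; double mt_s; int cl; } parts[] = {
        {"DDR4-2400", 2400, 17},
        {"DDR4-3200", 3200, 22},
        {"DDR5-4800", 4800, 40},
        {"DDR5-6400", 6400, 46},
    };
    for (int i = 0; i < 4; i++) {
        double period_ns = 2000.0 / parts[i].mt_s;  /* I/O clock period in ns */
        printf("%s: CL%d -> %.1f ns CAS latency\n",
               parts[i].name, parts[i].cl, parts[i].cl * period_ns);
    }
    return 0;
}
```

The output clusters around 14-17ns for every generation, which is why faster DDR standards raise bandwidth far more than they lower latency.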
Access Time Scenarios:
| Scenario | Time | Explanation |
|---|---|---|
| Row buffer hit | ~15-20ns | Column access only, row already open |
| Row closed, same bank | ~50-70ns | Must activate row (tRCD + tCAS) |
| Row conflict, same bank | ~70-90ns | Must precharge, then activate (tRP + tRCD + tCAS) |
| Bank miss with parallelism | ~40-50ns | Different bank, can overlap with other operations |
Why DRAM Is "Slow":
DRAM stores each bit as charge on a tiny capacitor gated by a single access transistor, unlike SRAM caches, which use multi-transistor flip-flops. This has consequences: the charge leaks and must be refreshed every 64ms or less, reads are destructive and the row must be rewritten afterward, and sensing such a small charge takes time that has improved only slowly across process generations.
Implications for Databases:
DRAM access patterns significantly affect the buffer pool and query processing: sequential scans benefit from open-row hits, prefetching, and full bandwidth, while pointer-chasing structures such as B-tree descents and hash probes pay close to the full random-access latency on every hop.
DDR5 offers ~50% higher bandwidth than DDR4 but similar latency. For bandwidth-limited workloads (large sequential scans), DDR5 helps. For latency-limited workloads (random lookups), DDR5 offers minimal improvement. Database buffer pool operations are often latency-bound, not bandwidth-bound.
SSD access times are dominated by the characteristics of NAND flash memory and the overhead of the Flash Translation Layer (FTL). Unlike RAM's relatively predictable timing, SSD latency varies significantly based on the operation type, queue depth, and device state.
NAND Flash Operation Times:
| Operation | SLC | MLC | TLC | QLC |
|---|---|---|---|---|
| Page Read | 25 μs | 50 μs | 75-100 μs | 100-150 μs |
| Page Program (Write) | 200 μs | 600 μs | 1-2 ms | 2-5 ms |
| Block Erase | 1.5 ms | 3 ms | 5 ms | 10+ ms |
Why Read Latency Varies:
NAND flash reads involve selecting the target page's word line, sensing each cell's threshold voltage against one or more reference levels, and streaming the page out through the die's I/O circuitry. Storing more bits per cell means more voltage levels to distinguish and often multiple sensing passes, which is why TLC and QLC reads are slower than SLC and MLC reads.
Controller Overhead:
Beyond raw NAND timing, SSD controllers add latency:
| Component | Latency Added | Notes |
|---|---|---|
| Command parsing | 1-5 μs | Decode NVMe/SATA command |
| FTL lookup | 1-2 μs | Logical-to-physical address translation |
| Queue management | 1-2 μs | Select command from queue |
| Data path | 2-5 μs | Move data through controller |
| Interface | 2-10 μs | PCIe/SATA protocol overhead |
Total observed latency = NAND time + controller overhead = 25-100μs for reads
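As a rough sanity check on that total, the sketch below adds an assumed TLC page-read time to midpoints of the controller overheads tabulated above; real devices vary widely.

```c
#include <stdio.h>

/* Sum illustrative SSD read-latency components: an assumed TLC page-read
 * time plus midpoints of the controller overheads listed in the table. */
int main(void) {
    double nand_read_us  = 85.0;  /* TLC page read (assumed midpoint)   */
    double controller_us = 3.0    /* command parsing                    */
                         + 1.5    /* FTL lookup                         */
                         + 1.5    /* queue management                   */
                         + 3.5    /* data path                          */
                         + 6.0;   /* PCIe/NVMe interface                */
    printf("Estimated read latency: %.0f us NAND + %.1f us controller = ~%.0f us\n",
           nand_read_us, controller_us, nand_read_us + controller_us);
    return 0;
}
```

For a TLC drive this lands at the upper end of the quoted range; SLC and MLC reads sit much lower.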
SSD latency is not constant. Garbage collection, wear leveling, and background operations can spike latency to several milliseconds. Enterprise SSDs include capacitors to complete in-flight writes during power loss, which allows more aggressive background operations. Monitoring p99 latency is essential—average latency masks problematic spikes.
Read vs. Write Asymmetry:
Writes are inherently slower than reads on NAND flash: programming uses iterative program-and-verify pulses to push charge into the cells, a page can only be written into a previously erased block, and sustained writes eventually force garbage collection that copies live pages before blocks can be erased.
Queue Depth Effects:
SSDs contain multiple NAND dies operating in parallel. Higher queue depths exploit this parallelism:
| Queue Depth | Latency (p50) | IOPS | Notes |
|---|---|---|---|
| 1 | 80 μs | 12,500 | Minimal parallelism |
| 4 | 85 μs | 47,000 | Some die parallelism |
| 16 | 100 μs | 160,000 | Good parallelism |
| 64 | 200 μs | 320,000 | Maximum throughput |
| 128+ | 500+ μs | 350,000 | Queue depth too high, latency degrades |
Database workloads should target the sweet spot—enough concurrency to achieve high throughput without latency degradation.
NVMe vs. SATA:
| Aspect | SATA SSD | NVMe SSD |
|---|---|---|
| Interface latency | 10-20 μs | 2-5 μs |
| Queue depth | 32 (single queue) | 64K queues × 64K entries |
| CPU overhead | Higher (AHCI) | Lower (direct PCIe) |
| Peak IOPS | ~100K | 1M+ |
NVMe SSDs are strictly superior for database workloads, especially OLTP with high concurrency.
Hard disk drive access times are dominated by mechanical motion. Understanding the physics of head movement and platter rotation explains why HDD latency is orders of magnitude higher than electronic storage.
The Three Components of HDD Access Time:
HDD Access Time = Seek Time + Rotational Latency + Transfer Time
Let's analyze each in detail.
Seek Time Details:
The actuator arm uses a voice coil motor (essentially a speaker coil) to move the heads. A seek has three phases: acceleration toward the target cylinder, a coast at maximum velocity on longer seeks, and deceleration followed by settling precisely onto the target track.
Seek distance affects time non-linearly:
| Seek Distance | Approximate Time |
|---|---|
| Adjacent track | 0.5-1 ms |
| 1/3 stroke | 5-7 ms |
| 1/2 stroke | 7-10 ms |
| Full stroke | 12-18 ms |
| Component | 5,400 RPM | 7,200 RPM | 10,000 RPM | 15,000 RPM |
|---|---|---|---|---|
| Average Seek | 14 ms | 9 ms | 5 ms | 3.5 ms |
| Rotational Latency (avg) | 5.56 ms | 4.17 ms | 3 ms | 2 ms |
| Transfer (4KB) | 0.02 ms | 0.02 ms | 0.02 ms | 0.02 ms |
| Total Average | 19.6 ms | 13.2 ms | 8 ms | 5.5 ms |
| IOPS (random 4KB) | ~50 | ~75 | ~125 | ~180 |
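The 7,200 RPM column can be reproduced from the formula above; this sketch assumes a 9 ms average seek and a ~200 MB/s media transfer rate.

```c
#include <stdio.h>

/* Reconstruct average random-access time and IOPS for a 7,200 RPM drive
 * from the three components in the formula above. The seek time and
 * media transfer rate are assumed typical values. */
int main(void) {
    double rpm            = 7200.0;
    double seek_ms        = 9.0;                             /* average seek (assumed)   */
    double rotation_ms    = 60000.0 / rpm;                   /* 8.33 ms per revolution   */
    double rot_latency_ms = rotation_ms / 2.0;               /* on average, half a turn  */
    double transfer_ms    = 4.0 / (200.0 * 1000.0) * 1000.0; /* 4 KB at 200 MB/s         */

    double total_ms = seek_ms + rot_latency_ms + transfer_ms;
    printf("Average access: %.2f ms -> ~%.0f random 4KB IOPS (~%.2f MB/s)\n",
           total_ms, 1000.0 / total_ms, (1000.0 / total_ms) * 4.0 / 1024.0);
    return 0;
}
```

The resulting ~0.3 MB/s of random throughput, set against 150-250 MB/s sequential, is exactly the gap quantified in the table below.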
A 7,200 RPM HDD can perform only ~75 random IOPS. A database processing 500 random reads per second saturates 7 such drives. An NVMe SSD achieving 300,000 IOPS replaces 4,000 HDDs for random workloads. This is why SSDs have transformed OLTP database deployments.
Sequential vs. Random Performance:
For sequential access, seek and rotational latency are incurred only once. Subsequent sectors stream continuously:
| Metric | Random 4KB | Sequential 1MB |
|---|---|---|
| Seeks required | 1 per 4KB | 1 total |
| Effective throughput | 0.3-1 MB/s | 150-250 MB/s |
| Throughput ratio | 1x | 200-500x |
Database Design Implications: the enormous sequential advantage of HDDs drives classic design choices such as append-only write-ahead logs, large sequential scans instead of scattered index probes for big result sets, and I/O schedulers that sort and merge requests to minimize head movement.
Modern database deployments increasingly involve network-attached storage, cloud block storage, and distributed storage systems. Understanding network latency components is essential for architecting distributed databases.
Components of Network Storage Latency:
Network Storage Latency = Software Stack + Network Transit + Remote Storage Access
| Component | Typical Range | Notes |
|---|---|---|
| Application → OS kernel | 1-5 μs | System call overhead |
| OS network stack | 5-20 μs | TCP/IP processing |
| NIC processing | 2-10 μs | DMA, interrupt handling |
| Switch hop (datacenter) | 0.5-2 μs | Cut-through switching |
| Switch hop (cloud) | 10-50 μs | Virtual networking overhead |
| Fiber propagation (300m) | 1.5 μs | Speed of light: ~5μs/km |
| Cross-datacenter (1000km) | 10 ms | Speed of light limited |
| Remote storage processing | 20-200 μs | SAN/NAS controller |
| Remote media access | 25-10000 μs | SSD to HDD range |
Storage Area Network (SAN) Latency:
Enterprise databases often use Fibre Channel or iSCSI SANs:
| SAN Type | Protocol Overhead | Typical Round-Trip |
|---|---|---|
| Fibre Channel (local) | 10-20 μs | 100-500 μs |
| iSCSI (local) | 50-100 μs | 200-1000 μs |
| Cloud Block Storage | Variable | 500-5000 μs |
| NFS/CIFS | Higher | 500-2000 μs |
Cloud Block Storage:
Cloud providers offer block storage with varying characteristics:
| Service Tier | IOPS | Latency | Use Case |
|---|---|---|---|
| Standard SSD | 3,000-16,000 | 1-4 ms | General purpose |
| Provisioned IOPS | 16,000-256,000 | 0.5-2 ms | OLTP databases |
| Local NVMe | 100,000-3,000,000 | 100-500 μs | High-performance |
Cloud latency includes the hypervisor and virtual NIC on the host, the provider's software-defined network, replication to multiple storage nodes for durability, and per-volume IOPS or throughput throttling layered on top of the raw media access.
Light in fiber travels roughly 200 km per millisecond (about 5 μs/km, two-thirds of its vacuum speed). New York to London is ~5,500 km, giving a minimum round trip of roughly 55 ms. This fundamental limit affects distributed databases, replication, and geo-distributed queries. No technology can overcome it; only geographic placement or architectural changes (async replication, eventual consistency) can address it.
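A quick propagation-delay estimate using the ~5 μs/km figure from the table above; the distances are rough great-circle values.

```c
#include <stdio.h>

/* Minimum round-trip propagation delay over fiber at ~5 us per km
 * (about two-thirds of the vacuum speed of light). Distances are
 * rough great-circle estimates. */
int main(void) {
    struct { const char *route; double km; } routes[] = {
        {"Within a datacenter (0.3 km)",   0.3},
        {"Across a metro area (80 km)",   80.0},
        {"New York to London (5500 km)", 5500.0},
    };
    for (int i = 0; i < 3; i++) {
        double rtt_ms = 2.0 * routes[i].km * 5.0 / 1000.0; /* 5 us/km each way */
        printf("%-32s min RTT ~ %7.3f ms\n", routes[i].route, rtt_ms);
    }
    return 0;
}
```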
RDMA and Kernel Bypass:
For ultra-low-latency storage access, modern systems use:
RDMA (Remote Direct Memory Access): the NIC reads and writes remote memory directly, bypassing the remote CPU and the kernel network stack; round-trip latencies of a few microseconds are achievable on InfiniBand or RoCE fabrics.
NVMe-oF (NVMe over Fabrics): extends the NVMe command set across RDMA, Fibre Channel, or TCP transports, typically adding on the order of 10-20 μs over local NVMe access on RDMA fabrics.
Database Architecture Impact: network latency determines how far apart synchronous replicas can be placed, whether storage should be local or disaggregated, and how aggressively a database must cache, batch, and pipeline requests to hide round trips.
Understanding theoretical access times is valuable, but production systems require measurement. Monitoring storage latency reveals bottlenecks, validates configurations, and enables capacity planning.
Key Metrics to Monitor: average latency, tail latency (p95/p99/p99.9), IOPS, throughput, queue depth, and device utilization. Tail percentiles deserve the most attention, because a single slow I/O can stall an entire transaction commit or query.
Measurement Tools:
Operating System Level:
| Tool | Platform | What It Shows |
|---|---|---|
| iostat | Linux | Per-device IOPS, throughput, utilization, await |
| iotop | Linux | Per-process I/O usage |
| blktrace/blkparse | Linux | Block-level tracing with microsecond timing |
| Performance Monitor | Windows | Disk counters, queue length, latency |
| Resource Monitor | Windows | Real-time disk activity by process |
Example iostat Output Interpretation:
Device r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
nvme0 12500 8000 49 31 8.0 1.2 0.06 0.04 82%
Here the device is serving about 20,500 combined IOPS at an average request size of 4 KB (avgrq-sz is in 512-byte sectors), with an average wait (await) of 0.06 ms, roughly 60 μs per request, and an average queue depth of 1.2. The 82% utilization is the figure to watch: it already exceeds the ~70% guideline discussed below, so latency will be sensitive to any further load.
Database-Level Monitoring:
| Database | Key Views/Tables | Metrics |
|---|---|---|
| PostgreSQL | pg_stat_io, pg_stat_bgwriter | Buffer hits, reads, writes |
| MySQL | information_schema.INNODB_METRICS | Buffer pool reads, writes |
| Oracle | v$iostat_file, v$system_event | Wait events, I/O statistics |
| SQL Server | sys.dm_io_virtual_file_stats | Stall times, bytes read/written |
Storage latency increases non-linearly as utilization approaches 100%. Below 50% utilization, latency is near-constant. From 50-80%, latency increases gradually. Above 80%, latency climbs steeply (queuing theory). Keep production storage under 70% utilization for predictable latency.
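The shape of that curve can be illustrated with the classic M/M/1 queueing approximation, where mean response time equals service time divided by (1 − utilization). Real devices with internal parallelism behave better than this single-server model, but the steep climb near saturation looks the same.

```c
#include <stdio.h>

/* M/M/1 approximation: response time = service_time / (1 - utilization).
 * A rough single-server model; it shows why latency climbs steeply past ~80%. */
int main(void) {
    double service_us = 100.0; /* assumed 100 us per I/O on an idle device */
    double utils[] = {0.30, 0.50, 0.70, 0.80, 0.90, 0.95};
    for (int i = 0; i < 6; i++)
        printf("utilization %2.0f%% -> mean latency ~ %5.0f us\n",
               utils[i] * 100.0, service_us / (1.0 - utils[i]));
    return 0;
}
```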
Access time is the fundamental currency of storage performance. We've dissected the sources of latency from CPU registers to network-attached storage. Let's consolidate the key insights:
- Total access time is a sum of components: command overhead, seek or addressing time, transfer time, and protocol overhead.
- Each tier is separated by one to three orders of magnitude: ~1ns for L1 cache, ~70-100ns for DRAM, ~25-100μs for SSD reads, ~5-20ms for HDD random access, plus microseconds to milliseconds for network hops.
- Latency and throughput are distinct metrics; queue depth and parallelism can raise throughput without improving, and eventually while worsening, latency.
- Sequential access amortizes or hides most latency; random access exposes it fully, especially on HDDs.
- Physics sets hard floors: the speed of light, rotational mechanics, and DRAM cell behavior cannot be optimized away.
- Measure what matters in production: tail latencies (p99) and utilization, keeping devices below roughly 70% busy for predictable behavior.
What's Next:
We've examined the physics of storage performance. But performance is only half the equation—cost matters equally in real-world deployments. The next page explores Cost Considerations, analyzing the economics of storage choices and how to optimize the cost-performance ratio for database workloads.
You now understand the fundamental physics and engineering factors that determine storage access times. This knowledge enables you to predict performance, diagnose latency issues, and make informed storage architecture decisions based on physical reality rather than marketing claims.