Every computation, every file access, every network packet, every database query ultimately depends on one critical operation: moving data. While processors execute billions of instructions per second and memory responds in nanoseconds, the speed at which data flows through I/O subsystems often determines the actual performance users experience. A report whose computation takes seconds can spend minutes waiting on disk I/O. A real-time video stream stutters not because of codec complexity, but because data cannot arrive fast enough.
At the heart of understanding I/O performance lies a deceptively simple concept: throughput—the rate at which data moves through a system. Yet beneath this simple definition lies a rich landscape of engineering trade-offs, physical constraints, protocol overhead, and optimization opportunities that separate ordinary systems from high-performance ones.
By the end of this page, you will understand how to measure, analyze, and reason about I/O throughput. You'll learn the difference between theoretical and achievable throughput, understand the factors that limit real-world performance, and develop the mental models needed to diagnose and optimize I/O bottlenecks in production systems.
I/O throughput is formally defined as the amount of data successfully transferred per unit time between a source and destination. It is the fundamental metric for quantifying data movement capacity and is typically expressed in bytes per second (B/s) or its multiples (KB/s, MB/s, GB/s).
Alternatively, throughput may be expressed in bits per second (bps) and its multiples (Kbps, Mbps, Gbps), particularly for network interfaces.
The conversion factor of 8 bits per byte means that a 1 Gbps network link theoretically transfers 125 MB/s of payload data—though as we'll see, reality is considerably more nuanced.
Storage manufacturers often use decimal prefixes (1 GB = 10⁹ bytes), while operating systems traditionally use binary prefixes (1 GiB = 2³⁰ bytes = 1,073,741,824 bytes). This ~7.4% difference causes confusion when comparing advertised capacities versus reported values. Always verify which convention is in use when analyzing throughput measurements.
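As a concrete illustration of the gap, a drive advertised as 1 TB (decimal) holds

$$\frac{10^{12}\ \text{bytes}}{2^{30}\ \text{bytes/GiB}} \approx 931\ \text{GiB}$$

so a tool reporting binary units shows roughly 931 GiB for a "1 TB" device. The same care applies to throughput figures: 1 MB/s ($10^{6}$ B/s) and 1 MiB/s ($2^{20}$ B/s) differ by about 4.9%.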
The Throughput Equation
At its most fundamental level, throughput can be expressed as:
$$\text{Throughput} = \frac{\text{Data Transferred}}{\text{Time Elapsed}}$$
However, this simple equation obscures critical details. A more accurate model accounts for the lifecycle of an I/O operation:
$$T_{effective} = \frac{D}{T_{setup} + T_{transfer} + T_{completion}}$$
Where:
- $D$ is the payload size of the transfer
- $T_{setup}$ is the time to initiate the operation (command construction, queuing, seek or connection setup)
- $T_{transfer}$ is the time the data spends actually moving across the interface
- $T_{completion}$ is the time to finalize the operation (status handling, interrupt processing)
This decomposition reveals why small I/O operations often achieve much lower throughput than large ones: the fixed overhead ($T_{setup} + T_{completion}$) dominates when $D$ is small.
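A quick worked example makes the effect concrete; the numbers are illustrative, assuming 100 µs of fixed per-operation overhead and a 1 GB/s raw transfer rate:

$$T_{4\,\text{KB}} = \frac{4\ \text{KB}}{100\ \mu\text{s} + 4\ \mu\text{s}} \approx 39\ \text{MB/s} \qquad T_{1\,\text{MB}} = \frac{1\ \text{MB}}{100\ \mu\text{s} + 1000\ \mu\text{s}} \approx 910\ \text{MB/s}$$

The same device delivers more than twenty times the throughput simply because larger requests amortize the fixed overhead.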
| Category | Definition | Use Case |
|---|---|---|
| Raw Throughput | Maximum theoretical data rate of the physical medium | Interface specifications, hardware design limits |
| Effective Throughput | Actual data rate achieved after protocol overhead | Application-level performance measurement |
| Sustained Throughput | Throughput maintained over extended periods | Long-running batch operations, streaming workloads |
| Peak Throughput | Maximum momentary throughput during burst transfers | Cache hits, burst I/O patterns |
| Aggregate Throughput | Combined throughput across multiple channels or devices | RAID arrays, parallel I/O subsystems |
Accurate throughput measurement is both critical and surprisingly complex. Different measurement methodologies yield different results, and understanding these differences is essential for proper system analysis.
Sequential vs Random I/O Throughput
For storage devices, throughput varies dramatically based on access patterns:
Sequential throughput measures performance when accessing contiguous data blocks in order. This pattern allows devices to optimize for streaming transfers—HDDs can minimize seek overhead, SSDs can leverage internal parallelism, and caches achieve high hit rates.
Random throughput measures performance when accessing non-contiguous blocks in unpredictable order. This worst-case pattern exposes all overhead costs: seek latency, rotational delay, flash translation layer lookups, and cache misses.
Block Size Impact on Measured Throughput
The size of individual I/O requests profoundly affects measured throughput. Consider the relationship:
$$\text{Throughput} = \text{IOPS} \times \text{Block Size}$$
Where IOPS (I/O Operations Per Second) represents the rate of completed I/O requests. For a device capable of 10,000 IOPS:
| Block Size | Calculated Throughput |
|---|---|
| 4 KB | 40 MB/s |
| 64 KB | 640 MB/s |
| 256 KB | 2,560 MB/s |
| 1 MB | 10,000 MB/s |
This relationship explains why database workloads (typically 4-16 KB blocks) achieve vastly different throughput than backup operations (often 256 KB+ blocks) on identical hardware.
```c
/**
 * Throughput Measurement Framework
 *
 * Demonstrates proper methodology for measuring I/O throughput
 * with consideration for warmup, multiple iterations, and
 * statistical analysis of results.
 */
#define _GNU_SOURCE           /* required for O_DIRECT on Linux */

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>
#include <string.h>
#include <errno.h>
#include <math.h>             /* sqrt() for standard deviation */

#define KB (1024ULL)
#define MB (1024ULL * KB)
#define GB (1024ULL * MB)

#define WARMUP_ITERATIONS      3
#define MEASUREMENT_ITERATIONS 10
#define DEFAULT_TRANSFER_SIZE  (1 * GB)
#define MAX_BLOCK_SIZE         (1 * MB)

typedef struct {
    double throughput_mbps;     // Measured MB/s
    double elapsed_seconds;     // Total time
    size_t bytes_transferred;   // Actual bytes moved
    int    error_count;         // Any I/O errors
} BenchmarkResult;

typedef struct {
    double mean;      // Average throughput
    double std_dev;   // Standard deviation
    double min;       // Minimum observed
    double max;       // Maximum observed
    double p95;       // 95th percentile
} ThroughputStatistics;

/**
 * High-resolution timer for nanosecond precision
 */
static inline double get_time_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/**
 * Measure sequential read throughput
 *
 * Key considerations:
 * - Use O_DIRECT to bypass OS buffer cache (measures device speed)
 * - Align buffer to page boundary (required for O_DIRECT)
 * - Use large block sizes to minimize syscall overhead
 */
BenchmarkResult measure_seq_read_throughput(
    const char* device_path,
    size_t total_bytes,
    size_t block_size)
{
    BenchmarkResult result = {0};

    // O_DIRECT bypasses OS caching for accurate device measurement
    int fd = open(device_path, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("Failed to open device");
        result.error_count = 1;
        return result;
    }

    // Allocate aligned buffer (O_DIRECT requirement)
    void* buffer;
    if (posix_memalign(&buffer, 4096, block_size) != 0) {
        result.error_count = 1;
        close(fd);
        return result;
    }

    // Perform measurement
    double start_time = get_time_seconds();
    size_t total_read = 0;

    while (total_read < total_bytes) {
        ssize_t bytes = read(fd, buffer, block_size);
        if (bytes <= 0) {
            if (bytes < 0 && errno != EINTR) {
                result.error_count++;
            }
            break;
        }
        total_read += bytes;
    }

    double end_time = get_time_seconds();

    // Calculate results (floating-point division to avoid truncation)
    result.elapsed_seconds   = end_time - start_time;
    result.bytes_transferred = total_read;
    result.throughput_mbps   = ((double)total_read / MB) / result.elapsed_seconds;

    free(buffer);
    close(fd);
    return result;
}

/**
 * Calculate statistical summary of throughput measurements
 */
ThroughputStatistics calculate_statistics(
    double* measurements,
    int count)
{
    ThroughputStatistics stats = {0};

    // Calculate mean, min, max
    for (int i = 0; i < count; i++) {
        stats.mean += measurements[i];
        if (measurements[i] < stats.min || stats.min == 0)
            stats.min = measurements[i];
        if (measurements[i] > stats.max)
            stats.max = measurements[i];
    }
    stats.mean /= count;

    // Calculate standard deviation
    for (int i = 0; i < count; i++) {
        double diff = measurements[i] - stats.mean;
        stats.std_dev += diff * diff;
    }
    stats.std_dev = sqrt(stats.std_dev / count);

    // Sort for percentile calculation
    // (simplified insertion sort for demonstration)
    for (int i = 1; i < count; i++) {
        double key = measurements[i];
        int j = i - 1;
        while (j >= 0 && measurements[j] > key) {
            measurements[j + 1] = measurements[j];
            j--;
        }
        measurements[j + 1] = key;
    }
    stats.p95 = measurements[(int)(count * 0.95)];

    return stats;
}

/**
 * Run comprehensive throughput benchmark
 */
void run_throughput_benchmark(const char* device_path) {
    size_t block_sizes[] = {4*KB, 16*KB, 64*KB, 256*KB, 1*MB};
    int num_block_sizes = sizeof(block_sizes) / sizeof(block_sizes[0]);

    printf("\n%-12s %12s %12s %12s %12s\n",
           "Block Size", "Mean (MB/s)", "StdDev", "Min", "Max");
    printf("%-12s %12s %12s %12s %12s\n",
           "----------", "-----------", "------", "---", "---");

    for (int bs = 0; bs < num_block_sizes; bs++) {
        double measurements[MEASUREMENT_ITERATIONS];

        // Warmup iterations (discard results)
        for (int i = 0; i < WARMUP_ITERATIONS; i++) {
            measure_seq_read_throughput(
                device_path,
                DEFAULT_TRANSFER_SIZE / 10,
                block_sizes[bs]
            );
        }

        // Actual measurement iterations
        for (int i = 0; i < MEASUREMENT_ITERATIONS; i++) {
            BenchmarkResult result = measure_seq_read_throughput(
                device_path,
                DEFAULT_TRANSFER_SIZE,
                block_sizes[bs]
            );
            measurements[i] = result.throughput_mbps;
        }

        ThroughputStatistics stats =
            calculate_statistics(measurements, MEASUREMENT_ITERATIONS);

        printf("%-12zu %12.2f %12.2f %12.2f %12.2f\n",
               block_sizes[bs] / KB,
               stats.mean, stats.std_dev, stats.min, stats.max);
    }
}

int main(int argc, char** argv) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <device-or-file-path>\n", argv[0]);
        return 1;
    }
    run_throughput_benchmark(argv[1]);
    return 0;
}
```

Always include warmup iterations before measurement to stabilize caches and trigger any just-in-time optimizations. Collect multiple samples and report statistical measures (mean, standard deviation, percentiles) rather than single values. Use O_DIRECT when measuring actual device throughput to bypass OS caching, but remember that applications typically benefit from OS caching.
A fundamental reality of I/O systems is that practical throughput never achieves theoretical maximum. Understanding this gap—and the factors that cause it—is essential for realistic capacity planning and performance optimization.
Theoretical Throughput
Theoretical throughput represents the maximum data rate that the physical interface can sustain under ideal conditions. For example:
| Interface | Theoretical Throughput | Calculation |
|---|---|---|
| SATA III | 6 Gbps = 600 MB/s | 6 Gbps line rate ÷ 10 bits per byte (8b/10b) |
| PCIe 4.0 x4 | 64 Gbps = 8 GB/s | 16 GT/s × 4 lanes ÷ 8 bits per byte (before encoding) |
| NVMe over PCIe 4.0 x4 | ~7.88 GB/s | After 128b/130b encoding |
| USB 3.2 Gen 2 | 10 Gbps = 1.25 GB/s | SuperSpeed+ specification |
| 10 GbE | 10 Gbps = 1.25 GB/s | Wire speed maximum |
These numbers represent wire speed—the maximum signaling capacity of the physical layer.
The Overhead Cascade
Multiple layers of overhead reduce practical throughput:
1. Encoding Overhead Physical interfaces use line coding to maintain signal integrity. SATA uses 8b/10b encoding (20% overhead), while PCIe 4.0+ uses 128b/130b (~1.5% overhead). This is a fixed tax on all transfers.
2. Protocol Overhead Every I/O operation includes framing, addressing, commands, and status information beyond actual data. A 4 KB NVMe read might require:
- a 64-byte submission queue entry plus a doorbell write to start the command
- one or more PCIe transaction-layer packets, each carrying header and framing bytes alongside the payload
- a 16-byte completion queue entry and an interrupt (or a polled completion) to finish the operation
3. Software Stack Overhead Data traverses multiple software layers, each adding latency and consuming CPU cycles: the application's buffers, the system-call interface, the filesystem and page cache, the block layer and I/O scheduler, and finally the device driver.
4. Interleaving Overhead Physical media cannot always stream continuously. HDDs have seek time and rotational delay. SSDs have internal garbage collection. Networks have packet gaps and retransmissions.
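To put protocol overhead (point 2 above) in perspective, consider TCP over Ethernet with a standard 1500-byte MTU. Each frame occupies 1538 bytes on the wire (preamble, Ethernet header, FCS, and inter-frame gap) but carries at most 1460 bytes of TCP payload:

$$\eta_{protocol} = \frac{1460}{1538} \approx 0.949$$

which caps 10 GbE payload throughput near 1.19 GB/s before any TCP or host-side inefficiencies, consistent with the sustained figures in the table below.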
| Interface | Theoretical Max | Typical Sustained | Efficiency | Primary Loss Factors |
|---|---|---|---|---|
| SATA III SSD | 600 MB/s | 550 MB/s | ~92% | 8b/10b encoding, command overhead |
| 7,200 RPM HDD (seq) | 200 MB/s | 150-180 MB/s | ~80% | Track switching, zone density variation |
| PCIe 4.0 NVMe SSD | 7,880 MB/s | 5,000-7,000 MB/s | 65-90% | Controller limits, thermal throttling |
| 10 GbE (TCP) | 1,250 MB/s | 1,100-1,180 MB/s | ~92% | Ethernet framing, IP/TCP headers |
| USB 3.2 Gen 2 | 1,250 MB/s | 900-1,050 MB/s | ~80% | Protocol overhead, cable quality |
The Reality Check Formula
A useful heuristic for estimating practical throughput:
$$T_{practical} \approx T_{theoretical} \times \eta_{encoding} \times \eta_{protocol} \times \eta_{media}$$
Where:
- $\eta_{encoding}$ is the efficiency of the physical line coding (e.g., 0.8 for 8b/10b, ~0.985 for 128b/130b)
- $\eta_{protocol}$ is the fraction of transferred bits that are payload rather than headers, commands, and acknowledgments
- $\eta_{media}$ captures device-specific losses such as seeks, garbage collection, controller limits, and thermal throttling
Example: Estimating NVMe SSD Throughput
For a PCIe 4.0 x4 NVMe SSD: $T_{theoretical} \approx 8$ GB/s, $\eta_{encoding} \approx 0.985$ (128b/130b), $\eta_{protocol} \approx 0.95$, and $\eta_{media} \approx 0.85$ to $0.95$, giving $T_{practical} \approx 8 \times 0.985 \times 0.95 \times 0.9 \approx 6.7$ GB/s.
This matches observed real-world throughput of 6-7 GB/s for high-end NVMe drives.
While theoretical throughput provides upper bounds, real-world performance depends heavily on workload characteristics. A database performing random 4 KB reads achieves vastly different throughput than a video editor streaming sequential 1 MB blocks—even on identical hardware. Always measure throughput under representative workload conditions.
I/O throughput is influenced by a complex interplay of hardware capabilities, software design, workload characteristics, and environmental factors. Understanding these dependencies enables targeted optimization and realistic performance predictions.
Hardware Factors
On the hardware side, the interface generation and lane count, the storage medium and its controller, DMA and interrupt handling capability, and the host's CPU and memory bandwidth all set upper bounds on achievable throughput.
Software Factors
The software stack profoundly impacts achieved throughput, often more than hardware selection: the choice of filesystem and its allocation strategy, the block-layer I/O scheduler, page-cache and read-ahead policy, system-call and memory-copy overhead, and whether I/O is issued synchronously or asynchronously all shape the request stream the device actually sees.
Workload Characteristics
The pattern of I/O requests fundamentally shapes achievable throughput:
| Pattern | Description | Throughput Impact |
|---|---|---|
| Sequential Large | Contiguous blocks, large requests (1 MB+) | Highest throughput; approaches interface limits |
| Sequential Small | Contiguous blocks, small requests (4-16 KB) | Good throughput; limited by IOPS × block size |
| Random Large | Non-contiguous blocks, large requests | Moderate throughput; media seek/access overhead |
| Random Small | Non-contiguous blocks, small requests | Lowest throughput; dominated by latency |
| Mixed | Combination of patterns | Varies; often worse than either pure pattern |
The Queue Depth Effect
Queue depth—the number of simultaneous outstanding I/O requests—dramatically influences throughput, particularly for devices with internal parallelism:
| Queue Depth | Typical Impact |
|---|---|
| 1 | ~30-40% of peak throughput; device idles between requests |
| 4 | ~50-70% of peak; some pipelining possible |
| 16 | ~80-90% of peak; good internal parallelism exploitation |
| 32+ | ~95%+ of peak; diminishing returns beyond this |
NVMe SSDs are designed for high queue depths (up to 64K queues × 64K depth), which is why they excel in enterprise workloads that generate many concurrent requests, but may show similar performance to SATA SSDs in single-threaded desktop workloads.
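Little's law explains why: the number of outstanding requests needed equals throughput (in operations per second) times average completion latency. Assuming, for illustration, 64 KB requests and 100 µs average completion latency on a 7,000 MB/s drive:

$$\text{QD} \approx \text{IOPS} \times \text{latency} \approx \frac{7000\ \text{MB/s}}{64\ \text{KB}} \times 100\ \mu\text{s} \approx 107{,}000 \times 0.0001\ \text{s} \approx 11$$

A single thread issuing one synchronous request at a time can therefore leave roughly 90% of that capacity idle.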
Modern file systems and applications should use asynchronous I/O (io_uring on Linux, IOCP on Windows) to maintain appropriate queue depths. Synchronous I/O with single-threaded applications fundamentally limits queue depth to 1, leaving most device capacity unused.
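The sketch below shows one way to keep a fixed number of reads in flight with Linux's io_uring (via the liburing helpers). It is a minimal illustration, not a complete benchmark: the file name, queue depth, and block size are arbitrary choices, and error handling is abbreviated. Build with `-luring`.

```c
// Minimal io_uring sketch: submit QUEUE_DEPTH reads before reaping any
// completions, so the device always sees a deep queue. Illustrative only.
#include <liburing.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define QUEUE_DEPTH 32
#define BLOCK_SIZE  (64 * 1024)

int main(int argc, char **argv) {
    if (argc < 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct io_uring ring;
    if (io_uring_queue_init(QUEUE_DEPTH, &ring, 0) < 0) { perror("queue_init"); return 1; }

    char *buffers[QUEUE_DEPTH];

    // Queue QUEUE_DEPTH reads at increasing offsets, then submit them all at once.
    for (int i = 0; i < QUEUE_DEPTH; i++) {
        buffers[i] = malloc(BLOCK_SIZE);
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buffers[i], BLOCK_SIZE,
                           (unsigned long long)i * BLOCK_SIZE);
    }
    io_uring_submit(&ring);

    // Reap completions; a real benchmark would resubmit here to keep the queue full.
    for (int i = 0; i < QUEUE_DEPTH; i++) {
        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);
        if (cqe->res < 0)
            fprintf(stderr, "read failed: %d\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);
    }

    for (int i = 0; i < QUEUE_DEPTH; i++) free(buffers[i]);
    io_uring_queue_exit(&ring);
    close(fd);
    return 0;
}
```

The key point is that all 32 reads are in flight before any completion is handled, which is what a synchronous read loop can never achieve.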
Throughput manifests differently across storage, network, and peripheral I/O subsystems, each with unique characteristics and optimization strategies.
Storage I/O Throughput
Storage throughput depends heavily on the storage medium and access pattern. Modern storage hierarchies exhibit vast throughput ranges:
| Storage Type | Sequential Read | Sequential Write | Random Read (4K) | Random Write (4K) |
|---|---|---|---|---|
| 7,200 RPM HDD | 150-200 MB/s | 150-180 MB/s | 0.5-2 MB/s | 0.5-2 MB/s |
| 15,000 RPM SAS HDD | 200-250 MB/s | 200-230 MB/s | 1-2 MB/s | 1-2 MB/s |
| SATA SSD | 550 MB/s | 520 MB/s | 150-350 MB/s | 100-300 MB/s |
| NVMe SSD (PCIe 3.0) | 3,500 MB/s | 3,000 MB/s | 400-600 MB/s | 250-400 MB/s |
| NVMe SSD (PCIe 4.0) | 7,000 MB/s | 5,500 MB/s | 600-1000 MB/s | 400-600 MB/s |
| Intel Optane (3D XPoint) | 2,700 MB/s | 2,200 MB/s | 2,400 MB/s | 1,900 MB/s |
The contrast between sequential and random throughput is striking. An HDD with 180 MB/s sequential throughput achieves only 1 MB/s for random 4K reads—a 180× difference. This gap, caused by mechanical seek latency (~10 ms per seek), fundamentally shapes storage system design.
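The arithmetic behind that gap: if each random 4 KB read pays roughly 10 ms of seek and rotational delay, then

$$T_{random} \approx \frac{4\ \text{KB}}{10\ \text{ms}} \approx 0.4\ \text{MB/s}$$

so even modest request reordering and locality are needed to reach the 0.5-2 MB/s range shown in the table above.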
Network I/O Throughput
Network throughput is bounded by link capacity, protocol overhead, and congestion:
| Network Type | Wire Speed | Practical TCP | Overhead Sources |
|---|---|---|---|
| 1 GbE | 125 MB/s | 110-120 MB/s | Ethernet framing, IP/TCP headers, ACKs |
| 10 GbE | 1.25 GB/s | 1.1-1.18 GB/s | Same + switch latency, buffer limits |
| 25 GbE | 3.125 GB/s | 2.8-3.0 GB/s | Same + NIC processing limits |
| 100 GbE | 12.5 GB/s | 10-11 GB/s | Same + multiple streams needed |
| InfiniBand HDR | 25 GB/s | 23-24 GB/s | RDMA bypasses OS stack |
Network applications often require multiple parallel streams or RDMA (Remote Direct Memory Access) to saturate high-speed links. A single TCP stream frequently tops out at a few gigabits per second, particularly over higher-latency paths, because the congestion window and the bandwidth-delay product cap the amount of data in flight.
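The bandwidth-delay product makes the single-stream limit concrete. To keep a 10 GbE link full at a 10 ms round-trip time, the sender must have

$$1.25\ \text{GB/s} \times 0.010\ \text{s} = 12.5\ \text{MB}$$

of unacknowledged data in flight; with only a 1 MB effective window the same stream tops out around $1\ \text{MB} / 10\ \text{ms} = 100\ \text{MB/s}$ (about 0.8 Gbps), regardless of link capacity.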
Peripheral I/O Throughput
Peripheral throughput varies enormously based on device class:
| Peripheral Type | Typical Throughput | Limiting Factor |
|---|---|---|
| USB Mouse/Keyboard | 1-10 KB/s | Low-frequency polling |
| USB Audio (48 kHz stereo) | 384 KB/s | Audio sample rate |
| USB 4K Webcam | 45-100 MB/s | Video resolution/framerate |
| Thunderbolt 4 Storage | 3,000 MB/s | PCIe 3.0 x4 tunneling |
| PCIe GPU (PCIe 4.0 x16) | 32 GB/s | PCIe link bandwidth |
Peripheral throughput optimization often focuses on reducing CPU overhead (via DMA), batching transfers, and matching buffer sizes to typical payload sizes.
In real systems, multiple I/O subsystems compete for shared resources (PCIe lanes, memory bandwidth, CPU attention). Total system throughput may be less than the sum of individual device throughputs due to contention and scheduling overhead. Careful system design balances load across independent I/O paths.
Optimizing I/O throughput requires a systematic approach across hardware selection, software architecture, and operational tuning.
Hardware-Level Optimization
At the hardware level, throughput gains come from faster interfaces (NVMe over PCIe 4.0/5.0 rather than SATA), striping across multiple devices, placing devices on independent PCIe lanes and NUMA nodes, and provisioning enough RAM for effective caching.
Software-Level Optimization
Software optimizations often provide the largest throughput gains:
```bash
#!/bin/bash
# Linux I/O Throughput Optimization Checklist

# ============================================
# 1. SCHEDULER SELECTION
# ============================================
# NVMe devices benefit from 'none' or 'mq-deadline'
# HDDs benefit from 'bfq' for fairness or 'mq-deadline'

# Check current scheduler for nvme0n1
cat /sys/block/nvme0n1/queue/scheduler

# Set optimal scheduler (requires root)
echo "none" > /sys/block/nvme0n1/queue/scheduler        # For NVMe
echo "mq-deadline" > /sys/block/sda/queue/scheduler     # For SSD/HDD

# ============================================
# 2. QUEUE DEPTH TUNING
# ============================================
# Increase queue depth for high-throughput workloads
# Default is often 128-256; can increase for NVMe

# Check current queue depth
cat /sys/block/nvme0n1/queue/nr_requests

# Increase queue depth (power of 2 recommended)
echo 1024 > /sys/block/nvme0n1/queue/nr_requests

# ============================================
# 3. READ-AHEAD TUNING
# ============================================
# Increase read-ahead for sequential workloads
# Value is in kilobytes (read_ahead_kb)

# Check current read-ahead (in KB)
cat /sys/block/nvme0n1/queue/read_ahead_kb

# Set 16 MB read-ahead for streaming workloads
echo 16384 > /sys/block/nvme0n1/queue/read_ahead_kb

# ============================================
# 4. FILESYSTEM MOUNT OPTIONS
# ============================================
# XFS/ext4 options for throughput:
# - noatime: Skip access time updates
# - nodiratime: Skip directory access time
# - discard: Enable TRIM (or use fstrim.timer)
# - nobarrier: Disable write barriers (DANGER: data loss risk)

# Example fstab entry for NVMe data volume:
# /dev/nvme0n1p1  /data  xfs  defaults,noatime,discard  0 2

# ============================================
# 5. MEMORY AND CACHING
# ============================================
# Increase dirty page limits for write-heavy workloads

# Current settings
sysctl vm.dirty_ratio vm.dirty_background_ratio

# Aggressive write caching (up to 40% of RAM as dirty pages)
sysctl -w vm.dirty_ratio=40
sysctl -w vm.dirty_background_ratio=10

# ============================================
# 6. NUMA AWARENESS
# ============================================
# Bind I/O-intensive processes to NUMA node with storage
# Check NUMA topology
lscpu | grep NUMA
cat /sys/block/nvme0n1/device/numa_node

# Run application on correct NUMA node
numactl --cpunodebind=0 --membind=0 ./io_intensive_app

# ============================================
# 7. INTERRUPT AFFINITY
# ============================================
# Balance IRQs across CPUs for high-throughput NICs/storage
# Install irqbalance or manually tune

# Check NVMe interrupt distribution
cat /proc/interrupts | grep nvme

# Distribute NVMe interrupts across CPUs (example)
# Set affinity mask for each interrupt queue
```

Application-Level Optimization
Beyond system tuning, application design choices profoundly affect throughput: issuing large sequential requests where possible, batching small writes, keeping enough asynchronous requests in flight, advising the kernel of access patterns, and avoiding unnecessary data copies and synchronous flushes.
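As a minimal sketch of these ideas (assuming Linux/glibc; the file name and request size are illustrative), the following reads a file sequentially with large requests and tells the kernel the access pattern up front so it can read ahead aggressively:

```c
// Sketch: large sequential reads plus an access-pattern hint via posix_fadvise().
#define _POSIX_C_SOURCE 200112L

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define READ_CHUNK (1024 * 1024)   /* 1 MB requests amortize per-call overhead */

int main(void) {
    int fd = open("large_input.dat", O_RDONLY);   /* hypothetical input file */
    if (fd < 0) { perror("open"); return 1; }

    /* Hint that access will be sequential so the kernel reads ahead aggressively. */
    posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    char *buf = malloc(READ_CHUNK);
    if (!buf) { close(fd); return 1; }

    ssize_t n;
    unsigned long long total = 0;
    while ((n = read(fd, buf, READ_CHUNK)) > 0)
        total += (unsigned long long)n;   /* process the data here */

    printf("read %llu bytes\n", total);
    free(buf);
    close(fd);
    return 0;
}
```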
Throughput optimization often trades off against other qualities. Aggressive caching risks data loss on power failure. Large I/O requests improve throughput but increase latency for individual operations. Deep queues benefit throughput but may starve low-priority requests. Always consider the broader system requirements when tuning for throughput.
Production systems require continuous throughput monitoring to detect degradation, plan capacity, and diagnose issues. Effective monitoring combines real-time metrics, historical trending, and diagnostic deep-dives.
Essential Metrics
Beyond raw throughput (MB/s), comprehensive monitoring tracks:
| Metric | Description | Interpretation |
|---|---|---|
| Read throughput | Bytes read per second | Baseline for read-heavy workloads |
| Write throughput | Bytes written per second | Watch for write amplification |
| IOPS | I/O operations per second | Complements throughput for small I/O |
| Queue depth | Outstanding I/O requests | Low depth suggests application limits |
| I/O wait | CPU time waiting for I/O | High iowait indicates I/O bottleneck |
| Device utilization | Percentage of time device busy | 100% indicates saturation |
```bash
# Real-time I/O throughput monitoring commands

# ============================================
# iostat - Comprehensive I/O statistics
# ============================================
# -x: Extended statistics
# -m: Display in MB/s (not sectors)
# 1: Update every 1 second

iostat -xm 1

# Sample output:
# Device    r/s     w/s     rMB/s   wMB/s   await   %util
# nvme0n1   8524    4218    432.1   215.6   0.12    78.5
#
# Key columns:
# - r/s, w/s: Read/write IOPS
# - rMB/s, wMB/s: Throughput in MB/s
# - await: Average queue wait + service time (ms)
# - %util: Percentage time device was busy

# ============================================
# iotop - Per-process I/O monitoring
# ============================================
# -o: Only show processes doing I/O
# -b: Batch mode (for scripting)

sudo iotop -o

# Identify which processes consume I/O bandwidth

# ============================================
# blktrace - Detailed block layer tracing
# ============================================
# Captures low-level I/O events for deep analysis

# Start trace on device
sudo blktrace -d /dev/nvme0n1 -o trace_output

# Analyze trace
blkparse trace_output | head -100

# Generate statistical summary
# (blkparse -d produces the binary stream that btt expects)
blkparse -d trace_output.bin trace_output
btt -i trace_output.bin

# ============================================
# dstat - Combined system statistics
# ============================================
# Shows I/O, CPU, network together

dstat -cdnm 1
# c: CPU stats
# d: Disk I/O
# n: Network I/O
# m: Memory stats

# ============================================
# nfsstat/nfsiostat - NFS throughput
# ============================================
# For NFS-mounted filesystems

nfsiostat 1

# ============================================
# Network throughput monitoring
# ============================================
# iftop for real-time network throughput
sudo iftop -i eth0

# nethogs for per-process network bandwidth
sudo nethogs eth0

# ============================================
# Continuous logging for trending
# ============================================
# Collect iostat data for historical analysis

iostat -xm 60 >> /var/log/iostat.log &

# Or use structured collection with sar
sar -d 60 >> /var/log/sar_disk.log &
```

Interpreting Monitoring Data
Raw metrics require context for meaningful interpretation:
1. Baseline Establishment Before identifying problems, establish normal throughput patterns. What does healthy throughput look like during peak hours? Off-peak? During batch jobs?
2. Saturation Detection Device utilization at or near 100% indicates saturation—the device cannot handle additional load without queuing delays. However, NVMe devices may show low utilization while achieving high throughput due to command parallelism.
3. Queue Depth Analysis If throughput is below expectations but queue depth is low, the bottleneck is likely in the application (not generating enough requests) rather than the device. Conversely, high queue depth with high utilization indicates true device saturation.
4. Trending Analysis Sudden throughput drops often indicate a failed or degraded device, a RAID rebuild in progress, a newly deployed competing workload, thermal throttling, or a configuration change.
Gradual throughput decline may indicate data sets outgrowing the cache, filesystem fragmentation, an SSD filling up and spending more time on garbage collection, or slowly increasing request load.
Set alerts for throughput anomalies before users notice performance degradation. Alert when throughput drops below 80% of baseline, when utilization consistently exceeds 85%, or when queue depth grows unexpectedly. Early warning enables proactive remediation rather than reactive firefighting.
I/O throughput—the rate of data movement—is a foundational metric for understanding and optimizing system performance. While the concept is simple, achieving high throughput in practice requires deep understanding of hardware, software, and workload interactions.
What's Next
Throughput tells us how much data moves, but not how quickly individual operations complete. The next page examines latency considerations—the other half of the I/O performance equation—and explores how throughput and latency interact in complex ways that shape real-world system behavior.
You now have a comprehensive understanding of I/O throughput: its measurement, the factors that affect it, the gap between theory and practice, and strategies for optimization. This foundation prepares you to analyze and optimize data movement in any computing system.