RAID was designed to solve two problems: reliability (surviving disk failures) and performance (exceeding single-disk capabilities). While reliability is often the primary motivation for deploying RAID, performance characteristics frequently determine whether a storage system meets application requirements.
Performance in RAID is not a single number—it's a multi-dimensional space defined by throughput (MB/s), IOPS (I/O operations per second), latency (response time), and how these metrics change under different workloads and failure conditions. Understanding these dimensions is essential for capacity planning, array design, and troubleshooting bottlenecks.
By the end of this page, you will understand how to model RAID throughput and IOPS for different configurations, identify performance bottlenecks in storage systems, predict performance degradation during disk failures and rebuilds, optimize RAID configuration for specific workload patterns, and apply queuing theory concepts to understand latency behavior.
Before analyzing RAID-specific performance, we must establish a foundation in storage performance metrics. These metrics describe different aspects of storage system capability:
Throughput (Bandwidth)
Measured in MB/s (megabytes per second) or GB/s. Throughput represents the raw data transfer rate of the storage system. It's the primary metric for sequential workloads like video streaming, backup, or large file transfers.
IOPS (Input/Output Operations Per Second)
The number of discrete read or write operations completed per second. IOPS is the primary metric for random-access workloads like database transactions, email servers, or virtualization. A workload may be IOPS-bound even when throughput is low.
Latency (Response Time)
The time between issuing an I/O request and receiving the response. Components of latency include queue wait time, seek and rotational delay (for HDDs), data transfer time, and controller overhead.
The Relationship Between Metrics:
These metrics are interrelated through a fundamental equation:
$$\text{Throughput} = \text{IOPS} \times \text{I/O Size}$$
For example, 1,000 IOPS at a 64 KB I/O size yields 1,000 × 64 KB ≈ 64 MB/s, while the same 64 MB/s could also come from 16,000 IOPS at 4 KB.
Little's Law for Storage:
Queuing theory provides insight into latency:
$$\text{Average Queue Length} = \text{Arrival Rate} \times \text{Average Wait Time}$$
Or equivalently: $$L = \lambda \times W$$
Combined with the queuing behavior of a busy device, this means that as utilization increases, queue lengths and latencies grow non-linearly: at 70% utilization, average latency is roughly 3.3× the service time; at 90%, approximately 10× the service time.
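A minimal sketch of both relationships in Python, using an illustrative 5 ms service time (the function names are ours, not from any benchmarking library); it reproduces the multipliers quoted above:

```python
def littles_law_outstanding_ios(arrival_rate_iops: float, avg_response_s: float) -> float:
    """L = lambda * W: average number of I/Os in the system (queued + in service)."""
    return arrival_rate_iops * avg_response_s

def mm1_response_time(service_time_s: float, utilization: float) -> float:
    """M/M/1 approximation: response time grows as 1 / (1 - utilization)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_s / (1 - utilization)

# Illustrative numbers: a disk with a 5 ms service time
service = 0.005
for rho in (0.5, 0.7, 0.9, 0.95):
    resp = mm1_response_time(service, rho)
    # At utilization rho, the arrival rate is rho / service_time
    in_flight = littles_law_outstanding_ios(rho / service, resp)
    print(f"util={rho:.0%}  response={resp*1000:.1f} ms  avg I/Os in system={in_flight:.1f}")
```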
| Media Type | Sequential Read | Random Read IOPS | Random Write IOPS | Latency |
|---|---|---|---|---|
| 7200 RPM HDD | 150-200 MB/s | 80-150 | 80-150 | 4-8 ms |
| 10K RPM HDD | 200-250 MB/s | 150-200 | 150-200 | 3-5 ms |
| 15K RPM HDD | 250-300 MB/s | 200-300 | 200-300 | 2-4 ms |
| SATA SSD | 500-550 MB/s | 50K-100K | 30K-70K | 0.1-0.2 ms |
| NVMe SSD | 3-7 GB/s | 500K-1M | 300K-700K | 0.02-0.1 ms |
Real workloads are mixtures of reads and writes, sequential and random access, and various I/O sizes. Characterizing your workload (read/write ratio, random/sequential ratio, I/O size distribution) is the first step in predicting RAID performance.
Throughput in RAID arrays depends on the ability to parallelize I/O across multiple disks. Let's analyze each RAID level:
RAID 0 Throughput:
With n disks, theoretical maximum: $$T_{RAID0} = n \times T_{disk}$$
For large sequential I/O that spans all disks, this linear scaling is achievable. With four 150 MB/s HDDs: $$T_{RAID0} = 4 \times 150 = 600 \text{ MB/s}$$
RAID 1 Throughput:
Reads: Can load-balance across mirrors $$T_{RAID1,read} = n \times T_{disk}$$ (for n-way mirror)
Writes: Must write to all mirrors; limited by slowest $$T_{RAID1,write} = T_{disk}$$
For a 2-way mirror: 2× read throughput, 1× write throughput.
RAID 5 Throughput:
Reads: All data disks contribute $$T_{RAID5,read} = n \times T_{disk}$$
Full-stripe writes: Very efficient $$T_{RAID5,fullstripe} = (n-1) \times T_{disk}$$
Random small writes: Severely limited by read-modify-write $$T_{RAID5,random\ write} = \frac{n \times T_{disk}}{4}$$
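That 4× factor comes from the read-modify-write sequence: read the old data block, read the old parity, write the new data, write the new parity. The following toy sketch shows the parity arithmetic behind it, using plain Python bytes rather than real disk I/O:

```python
def update_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """P_new = P_old XOR D_old XOR D_new -- the core of a RAID 5 small write."""
    return bytes(p ^ d_old ^ d_new for p, d_old, d_new in zip(old_parity, old_data, new_data))

# Toy 4-byte "blocks" on three data disks plus parity
d0, d1, d2 = b"\x10\x20\x30\x40", b"\x01\x02\x03\x04", b"\xaa\xbb\xcc\xdd"
parity = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))

# Small write to d1: two reads (old d1, old parity) + two writes (new d1, new parity)
new_d1 = b"\xff\xee\xdd\xcc"
parity = update_parity(d1, new_d1, parity)
d1 = new_d1

# The updated parity still equals the XOR of all current data blocks
assert parity == bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))
```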
RAID 6 Throughput:
Similar to RAID 5, but with dual parity overhead:
Reads: All disks contribute $$T_{RAID6,read} = n \times T_{disk}$$
Full-stripe writes: $$T_{RAID6,fullstripe} = (n-2) \times T_{disk}$$
Random small writes: Limited by a 6× read-modify-write penalty $$T_{RAID6,random\ write} = \frac{n \times T_{disk}}{6}$$
RAID 10 Throughput:
Reads: All disks can contribute (mirror pairs share load) $$T_{RAID10,read} = n \times T_{disk}$$
Writes: Limited by mirror pair throughput $$T_{RAID10,write} = \frac{n}{2} \times T_{disk}$$
For 8 disks (4 mirror pairs) at 150 MB/s each: reads reach 8 × 150 = 1,200 MB/s, while writes reach 4 × 150 = 600 MB/s.
```python
def calculate_raid_throughput(
    num_disks: int,
    disk_throughput_mbs: float,
    raid_level: str,
    workload: str = "sequential_read"
) -> float:
    """
    Calculate theoretical RAID throughput.

    Args:
        num_disks: Total number of drives in the array
        disk_throughput_mbs: Sequential throughput of single disk in MB/s
        raid_level: One of "0", "1", "5", "6", "10"
        workload: "sequential_read", "sequential_write", "random_write"

    Returns:
        Theoretical throughput in MB/s
    """
    T = disk_throughput_mbs
    n = num_disks

    if raid_level == "0":
        # All stripes contribute for both reads and writes
        return n * T
    elif raid_level == "1":
        if workload == "sequential_read":
            return 2 * T  # 2-way mirror
        else:
            return T  # Writes go to both mirrors
    elif raid_level == "5":
        if workload == "sequential_read":
            return n * T
        elif workload == "sequential_write":
            return (n - 1) * T  # One disk for parity
        else:  # random_write
            return n * T / 4  # 4x write penalty
    elif raid_level == "6":
        if workload == "sequential_read":
            return n * T
        elif workload == "sequential_write":
            return (n - 2) * T  # Two disks for parity
        else:  # random_write
            return n * T / 6  # 6x write penalty
    elif raid_level == "10":
        num_pairs = n // 2
        if workload == "sequential_read":
            return n * T  # All disks contribute
        else:  # writes
            return num_pairs * T  # One write per pair


# Example: Compare 8-disk arrays
for raid in ["0", "5", "6", "10"]:
    for workload in ["sequential_read", "sequential_write", "random_write"]:
        throughput = calculate_raid_throughput(8, 150, raid, workload)
        print(f"RAID {raid:>2} - {workload:<18}: {throughput:>6.0f} MB/s")
    print()

# Output:
# RAID  0 - sequential_read   :   1200 MB/s
# RAID  0 - sequential_write  :   1200 MB/s
# RAID  0 - random_write      :   1200 MB/s
#
# RAID  5 - sequential_read   :   1200 MB/s
# RAID  5 - sequential_write  :   1050 MB/s
# RAID  5 - random_write      :    300 MB/s
#
# RAID  6 - sequential_read   :   1200 MB/s
# RAID  6 - sequential_write  :    900 MB/s
# RAID  6 - random_write      :    200 MB/s
#
# RAID 10 - sequential_read   :   1200 MB/s
# RAID 10 - sequential_write  :    600 MB/s
# RAID 10 - random_write      :    600 MB/s
```

These are theoretical maximums. Real-world throughput is limited by controller processing power, bus bandwidth (PCIe, SAS lanes), cache hit rates, workload alignment, and the gap between spindle speed and interface speed. Expect 60-85% of theoretical in well-tuned systems.
For transaction-oriented workloads, IOPS is the critical metric. RAID's impact on IOPS differs significantly between reads and writes.
Read IOPS:
For all RAID levels, read IOPS scales roughly linearly with disk count:
$$IOPS_{array,read} = n \times IOPS_{disk}$$
With 8 HDDs at 150 IOPS each: 1,200 read IOPS. With 8 SSDs at 50K IOPS each: 400K read IOPS.
Mirror-based RAID (1, 10) can sometimes exceed this slightly due to read load balancing optimizations.
Write IOPS:
The write penalty dramatically impacts effective write IOPS:
| RAID Level | Write Operations per Logical Write | Effective Write IOPS |
|---|---|---|
| RAID 0 | 1 | n × IOPS_disk |
| RAID 1 | 2 (parallel) | n × IOPS_disk / 2 |
| RAID 5 | 4 (2 read + 2 write) | n × IOPS_disk / 4 |
| RAID 6 | 6 (3 read + 3 write) | n × IOPS_disk / 6 |
| RAID 10 | 2 (parallel) | n × IOPS_disk / 2 |
Mixed Workload IOPS:
Real applications perform both reads and writes. For a workload with read fraction r and write fraction w (where r + w = 1):
$$IOPS_{effective} = \frac{IOPS_{disk} \times n}{r \times 1 + w \times \text{write penalty}}$$
Example: 70% read, 30% write workload on 8-disk RAID 5
$$IOPS_{effective} = \frac{150 \times 8}{0.7 \times 1 + 0.3 \times 4}$$ $$= \frac{1200}{0.7 + 1.2} = \frac{1200}{1.9} \approx 632\ IOPS$$
Compare to RAID 10 with same workload: $$IOPS_{effective} = \frac{150 \times 8}{0.7 \times 1 + 0.3 \times 2}$$ $$= \frac{1200}{0.7 + 0.6} = \frac{1200}{1.3} \approx 923\ IOPS$$
RAID 10 delivers ~46% more IOPS for this write-moderate workload.
```python
def calculate_mixed_iops(
    num_disks: int,
    disk_iops: float,
    raid_level: str,
    read_percent: float
) -> dict:
    """
    Calculate effective IOPS for mixed read/write workload.

    Returns dictionary with detailed breakdown.
    """
    write_percent = 1.0 - read_percent

    write_penalties = {
        "0": 1,
        "1": 2,
        "5": 4,
        "6": 6,
        "10": 2
    }
    penalty = write_penalties[raid_level]
    raw_iops = num_disks * disk_iops

    # Weighted average of read (1x) and write (penalty x)
    effective_multiplier = read_percent * 1 + write_percent * penalty
    effective_iops = raw_iops / effective_multiplier

    # Calculate breakdown
    read_iops = effective_iops * read_percent
    write_iops = effective_iops * write_percent
    physical_ops = read_iops + (write_iops * penalty)

    return {
        "raid_level": raid_level,
        "raw_iops": raw_iops,
        "effective_iops": effective_iops,
        "read_iops": read_iops,
        "write_iops": write_iops,
        "physical_ops_per_second": physical_ops,
        "write_penalty": penalty,
        "efficiency": (effective_iops / raw_iops) * 100
    }


# Compare RAID levels for database workload (70% read, 30% write)
print("8-disk array, 150 IOPS/disk, 70% read / 30% write workload:\n")
print(f"{'RAID':^6} | {'Effective':^10} | {'Efficiency':^10} | {'Writes':^10}")
print("-" * 50)

for raid in ["0", "5", "6", "10"]:
    result = calculate_mixed_iops(8, 150, raid, 0.70)
    print(f"{raid:^6} | {result['effective_iops']:>8.0f} | "
          f"{result['efficiency']:>8.1f}% | {result['write_iops']:>8.0f}")

print("\n100% write workload (worst case):\n")
for raid in ["0", "5", "6", "10"]:
    result = calculate_mixed_iops(8, 150, raid, 0.0)
    print(f"RAID {raid}: {result['effective_iops']:.0f} effective IOPS "
          f"({result['efficiency']:.1f}% efficiency)")
```

When sizing storage, work backwards: determine required application IOPS, apply the write penalty formula to find required raw IOPS, then calculate the number of disks needed. Always add headroom (30-50%) because performance degrades sharply at high utilization.
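A minimal sketch of that backwards-sizing calculation, assuming the write penalties used throughout this page and an illustrative 40% headroom (the helper name and example inputs are ours):

```python
import math

WRITE_PENALTY = {"0": 1, "1": 2, "5": 4, "6": 6, "10": 2}

def disks_needed(required_iops: float, read_fraction: float,
                 raid_level: str, disk_iops: float,
                 headroom: float = 0.40) -> int:
    """Work backwards from application IOPS to a disk count, with headroom."""
    penalty = WRITE_PENALTY[raid_level]
    write_fraction = 1.0 - read_fraction
    # Physical IOPS the disks must deliver per logical IOPS
    multiplier = read_fraction + write_fraction * penalty
    raw_iops_needed = required_iops * multiplier * (1 + headroom)
    return math.ceil(raw_iops_needed / disk_iops)

# Example: 5,000 application IOPS at 70% read on 15K RPM HDDs (250 IOPS each)
for raid in ("5", "10"):
    print(f"RAID {raid}: {disks_needed(5000, 0.70, raid, 250)} disks")
# With these illustrative inputs: RAID 5 -> 54 disks, RAID 10 -> 37 disks
```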
Latency is often the most critical performance metric for interactive applications. Users notice latency; they rarely notice throughput directly. RAID impacts latency through several mechanisms:
Components of RAID Latency:
Media latency: Time for the disk to complete the I/O
Controller overhead: RAID calculations, cache lookup
Queue wait time: Time waiting for disk availability
Parity operations: Additional I/O for parity reads/writes
The Queue Depth Effect:
Latency is not constant—it increases as the array becomes more loaded. For a disk with service time S and utilization ρ (fraction of time busy), the average response time follows the M/M/1 queuing model:
$$T_{response} = \frac{S}{1 - \rho}$$
This relationship is non-linear and becomes severe at high utilization:
| Utilization | Response Time (multiple of S) |
|---|---|
| 50% | 2× |
| 70% | 3.3× |
| 80% | 5× |
| 90% | 10× |
| 95% | 20× |
| 99% | 100× |
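For example, with an illustrative 5 ms service time at 90% utilization:

$$T_{response} = \frac{5\ \text{ms}}{1 - 0.9} = 50\ \text{ms}$$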
This is why storage systems should never run at sustained high utilization—latency becomes unacceptable well before theoretical IOPS limits.
Latency Impact by RAID Level:
Read Latency: A read is normally serviced by a single disk in every RAID level, so read latency stays close to single-disk latency plus a small controller overhead.
Write Latency (without cache): RAID 0, 1, and 10 complete a write in roughly one disk service time (mirror copies are written in parallel); RAID 5 and 6 writes take noticeably longer because the read-modify-write sequence adds extra, partly serialized I/Os.
Write Latency (with write-back cache): The controller acknowledges the write as soon as it lands in cache, hiding most of the parity penalty; under sustained write load the cache eventually fills and latency reverts toward uncached behavior.
Average latency can be misleading. For user-facing applications, the 99th or 99.9th percentile latency (tail latency) often determines user experience. RAID 5/6 can have severe tail latency spikes during parity operations, even when average latency is acceptable.
When a disk fails, the array enters degraded mode. Understanding degraded performance is crucial because this is precisely when you need your storage to keep working—and it's when performance is worst.
RAID 5/6 Degraded Performance:
Every read to a block that was on the failed disk now requires reading the corresponding blocks from all surviving disks and XORing them together to reconstruct the missing data.
For a 5-disk RAID 5 array, a single logical read of data on the failed disk becomes four physical reads plus a parity computation.
This means every surviving disk absorbs extra load on top of its normal workload:
| RAID Level | Read Performance | Write Performance | Overall Impact |
|---|---|---|---|
| RAID 0 | N/A (no fault tolerance) | N/A | Array fails completely |
| RAID 1 | ~50% of normal (one mirror left) | Essentially unchanged | Minimal; only redundancy is lost |
| RAID 5 | ~25-50% of normal | Severely degraded (6× ops) | Major degradation |
| RAID 6 (1 failure) | ~25-50% of normal | Severely degraded (8× ops) | Major degradation |
| RAID 6 (2 failures) | ~10-25% of normal | Extremely degraded | Critical degradation |
| RAID 10 | ~87% of normal | ~87% of normal | Minor degradation |
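To make the reconstruction cost concrete, here is a toy sketch of a degraded RAID 5 read: the block that lived on the failed disk is rebuilt by XORing the corresponding blocks from every surviving disk. Plain Python bytes stand in for disk blocks; no real I/O is performed:

```python
from functools import reduce

def reconstruct_missing(surviving_blocks: list[bytes]) -> bytes:
    """XOR all surviving blocks of a stripe (data + parity) to rebuild the lost one."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), surviving_blocks)

# One stripe on a 5-disk RAID 5: four data blocks plus their parity
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x77\x88"]
parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), data)

# The disk holding data[2] fails: reading that block now costs 4 physical reads + XOR
survivors = [data[0], data[1], data[3], parity]
assert reconstruct_missing(survivors) == data[2]
```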
Why RAID 10 Degrades Gracefully:
In RAID 10, a disk failure affects only its mirror pair: reads that would have gone to the failed disk are served by its surviving mirror, while every other pair continues at full speed, so only a small fraction of the array's capability is lost.
Rebuild Performance Impact:
During rebuild, the array must serve normal application I/O, read the surviving disks to reconstruct the lost data, and write the reconstructed data to the replacement disk.
This triple demand creates severe contention for disk time, controller cycles, and bus bandwidth.
Empirical degradation during rebuild:
| Priority Setting | Application Performance | Rebuild Time |
|---|---|---|
| High (aggressive) | 20-40% of normal | Fastest |
| Medium | 50-70% of normal | Moderate |
| Low (background) | 80-90% of normal | Very slow |
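A rough rebuild-time estimate helps put these settings in context. A minimal sketch, where the rebuild rate stands for whatever sustained MB/s the chosen priority leaves for resilvering (the 16 TB capacity and the rates are illustrative):

```python
def rebuild_time_hours(disk_capacity_tb: float, rebuild_rate_mbs: float) -> float:
    """Hours to resilver one disk at a sustained rebuild rate in MB/s."""
    capacity_mb = disk_capacity_tb * 1_000_000  # decimal TB -> MB
    return capacity_mb / rebuild_rate_mbs / 3600

# 16 TB drive: aggressive rebuild at ~150 MB/s vs. a heavily throttled one at ~30 MB/s
for rate in (150, 30):
    print(f"{rate:>3} MB/s -> {rebuild_time_hours(16, rate):.0f} hours")
# ~30 hours at 150 MB/s vs. ~148 hours (about six days) at 30 MB/s
```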
A RAID 5 array in degraded mode while rebuilding is at maximum vulnerability with minimum performance. This is when second failures occur (due to rebuild stress) and when users complain most (due to slowness). Plan for this: schedule rebuilds during low-activity periods, consider RAID 6/10 for critical systems.
RAID arrays can be bottlenecked at multiple points. Identifying the limiting factor is essential for optimization:
Potential Bottlenecks:
Disk spindle IOPS (for random workloads)
Disk throughput (for sequential workloads)
Controller processing power
Controller cache
Interface bandwidth (SAS, PCIe)
123456789101112131415161718192021222324252627282930313233
```bash
# Monitor disk I/O utilization
iostat -xz 1

# Key metrics to watch:
# %util    - Disk utilization (100% = bottleneck)
# await    - Average I/O wait time in ms
# r/s, w/s - Read/write IOPS
# avgqu-sz - Average queue depth

# Example output:
# Device    r/s     w/s      rkB/s   wkB/s    await   %util
# sda       12.00   284.00   48.00   1136.00  2.84    88.40
# sdb       8.00    280.00   32.00   1120.00  3.12    85.60

# Monitor RAID array status (mdadm for Linux software RAID)
cat /proc/mdstat
mdadm --detail /dev/md0

# Monitor cache effectiveness
# For hardware RAID, check controller-specific tools
# For bcache or dm-cache:
cat /sys/block/bcache0/bcache/stats_total/cache_hit_ratio

# Check for I/O wait bottleneck at system level
vmstat 1
# Look at 'wa' column - high values indicate I/O bottleneck

# Trace I/O latency distribution
# Using bcc/BPF tools
biolatency -D 10

# Check queue depths per disk
cat /sys/block/sd*/queue/nr_requests
```

Diagnostic Decision Tree:
1. Is any disk at 100% utilization?
YES → Disk spindle bottleneck
NO → Continue
2. Is total throughput near theoretical max?
YES → Array performing optimally (or interface bottleneck)
NO → Continue
3. Is RAID controller CPU high?
YES → Controller processing bottleneck
NO → Continue
4. Is cache hit rate low with working set > cache size?
YES → Cache capacity bottleneck
NO → Continue
5. Are latencies high despite low utilization?
YES → Check for lock contention, misalignment, or controller issues
NO → Investigate application-level issues
Common Performance Anti-patterns:
RAID 5 for database transaction logs: Transaction logs require fast synchronous sequential writes. RAID 5's write penalty makes this a poor choice; RAID 1 or RAID 10 is preferred.
Too-small stripe size for random I/O: Small stripes spread single operations across multiple disks, adding coordination overhead without benefit for random access.
Misaligned partitions: If the partition start doesn't align with stripe boundaries, every I/O potentially crosses stripes, requiring multiple disk accesses.
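One way to reason about that last anti-pattern is to check whether a partition's starting offset is a whole multiple of the full stripe width. A minimal sketch with illustrative numbers; in practice the start sector would come from the partition table (for example via `fdisk -l` or `/sys/block/<disk>/<partition>/start`):

```python
def is_stripe_aligned(partition_start_sector: int, sector_size: int,
                      stripe_unit_kb: int, data_disks: int) -> bool:
    """True if the partition begins on a full-stripe boundary."""
    start_bytes = partition_start_sector * sector_size
    full_stripe_bytes = stripe_unit_kb * 1024 * data_disks
    return start_bytes % full_stripe_bytes == 0

# 5-disk RAID 5 (4 data disks per stripe) with a 64 KB stripe unit (256 KB full stripe)
print(is_stripe_aligned(2048, 512, 64, 4))  # start at 1 MiB -> True, aligned
print(is_stripe_aligned(63, 512, 64, 4))    # legacy 63-sector start -> False, misaligned
```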
Armed with an understanding of RAID performance characteristics, let's explore optimization strategies:
Workload Matching:
| Workload Profile | Optimal Configuration |
|---|---|
| Read-heavy, large sequential | RAID 5/6 with small stripe, many disks |
| Write-heavy, small random | RAID 10, large cache with write-back |
| Mixed, transaction processing | RAID 10 with SSD |
| Streaming media | RAID 0 (if replaceable) or RAID 5 |
| Virtualization | RAID 10 with SSD, separate arrays per purpose |
SSD-Specific Considerations:
SSDs fundamentally change RAID performance calculus:
No seek time: Random and sequential IOPS are similar. Stripe size matters less.
Parallelism within SSDs: A single SSD already has internal parallelism (multiple channels, dies). RAID adds another layer.
Write amplification: Both RAID parity and SSD wear-leveling cause write amplification. Combined effect can be severe.
TRIM/UNMAP: File system must support TRIM, RAID layer must pass it through, and SSDs must support it. Chain must be complete.
Latency still matters: At 100K IOPS, queuing effects dominate. Keep utilization below 70% for consistent latency.
80% of performance issues come from fundamental choices: wrong RAID level for workload, insufficient disk count, or missing write-back cache. Address these first before pursuing advanced optimizations. Measure before and after every change.
We've explored the multi-dimensional nature of RAID performance: throughput scales with how many disks can work in parallel, write penalties dominate IOPS for parity RAID, latency grows non-linearly with utilization, and degraded-mode and rebuild behavior can matter more than healthy-state numbers.
What's Next:
The final page of this module examines RAID reliability from a mathematical perspective. We'll calculate Mean Time To Data Loss (MTTDL) for various configurations, understand how drive capacity, array size, and rebuild time affect failure probability, and learn to make informed reliability trade-offs.
You now have a comprehensive understanding of RAID performance characteristics, modeling techniques, and optimization strategies. This knowledge enables you to design storage systems that meet performance requirements while maintaining appropriate reliability—the central engineering challenge in storage architecture.