RAID levels represent different strategies for balancing the fundamental storage tradeoffs: performance, capacity efficiency, and fault tolerance. No single RAID level is universally optimal—each makes different sacrifices to optimize for specific workload characteristics.
The original Berkeley paper defined RAID levels 1-5. Over time, additional levels (RAID 6, RAID 10, and hybrid variations) emerged to address evolving storage requirements. Today, five RAID levels dominate production database deployments: RAID 0, RAID 1, RAID 5, RAID 6, and RAID 10.
This page provides an in-depth examination of each level, explaining not just what they do, but why they behave as they do—the architectural decisions and mathematical properties that determine their characteristics.
By the end of this page, you will understand the internal architecture of each major RAID level, be able to calculate capacity and performance characteristics, and know when each level is appropriate for database workloads.
RAID 0 is technically not 'redundant' at all—the 'R' in RAID doesn't apply. It uses pure striping to distribute data across multiple drives without any redundancy information. Despite the lack of fault tolerance, RAID 0 remains relevant for specific use cases where performance matters more than durability.
Architecture:
Data is divided into stripes of configurable size (typically 64KB-256KB) and written sequentially across all drives in the array. With N drives, the first stripe unit goes to drive 0, the second to drive 1, and so on, wrapping around after drive N-1 back to drive 0.
```
// RAID 0 with 4 drives, 64KB stripe units
// Writing a 512KB file (8 stripe units)

//          Drive 0   Drive 1   Drive 2   Drive 3
// Row 0:   S0        S1        S2        S3
// Row 1:   S4        S5        S6        S7

// Reading the full file:
// - All 4 drives can read simultaneously
// - Maximum parallelism: 4x single-drive throughput

// Writing the full file:
// - All 4 drives can write simultaneously
// - Maximum parallelism: 4x single-drive throughput

// Key formulas:
// Usable capacity  = N × (smallest drive capacity)
// Read throughput  = N × (single-drive throughput)
// Write throughput = N × (single-drive throughput)
// Fault tolerance  = 0 (any failure = total data loss)
```
| Property | Value | Explanation |
|---|---|---|
| Minimum Drives | 2 | Need at least 2 drives to stripe across |
| Usable Capacity | N × drive capacity | 100% capacity efficiency—no redundancy overhead |
| Read Performance | Excellent (N× improvement) | Parallel reads across all drives |
| Write Performance | Excellent (N× improvement) | Parallel writes, no parity calculation |
| Fault Tolerance | None (0 drives) | Any single drive failure causes complete data loss |
| Rebuild Capability | None | No redundant data to rebuild from |
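To make the striping layout above concrete, here is a minimal Python sketch of the RAID 0 address calculation. The function name, the 4-drive/64KB geometry, and the return format are illustrative assumptions, not any particular controller's implementation.

```python
# Minimal sketch of RAID 0 address mapping (illustrative, not a real driver).
STRIPE_UNIT = 64 * 1024   # bytes per stripe unit (assumed)
NUM_DRIVES = 4            # assumed array width

def raid0_map(logical_offset: int) -> tuple[int, int]:
    """Map a logical byte offset to (drive index, physical byte offset)."""
    stripe_unit_index = logical_offset // STRIPE_UNIT   # which stripe unit overall
    within_unit = logical_offset % STRIPE_UNIT          # offset inside that unit
    drive = stripe_unit_index % NUM_DRIVES              # round-robin across drives
    row = stripe_unit_index // NUM_DRIVES               # stripe row on that drive
    return drive, row * STRIPE_UNIT + within_unit

# A 512KB file occupies 8 stripe units: S0..S3 land on drives 0..3 (row 0),
# then S4..S7 wrap around to drives 0..3 again (row 1), matching the diagram.
for unit in range(8):
    print(unit, raid0_map(unit * STRIPE_UNIT))
```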
When RAID 0 Is Appropriate:
Despite its fragility, RAID 0 has legitimate uses for disposable data that can be regenerated on demand, such as scratch space, temporary files, and intermediate processing output.
When RAID 0 Is Absolutely Wrong:
With N drives in a RAID 0 array, the probability of data loss is roughly N times that of a single drive. An 8-drive RAID 0 array will fail, on average, about 8 times more frequently than a single drive. For databases containing business-critical data, RAID 0 is never acceptable, regardless of performance requirements.
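A quick back-of-the-envelope calculation illustrates the multiplier. The 2% annualized failure rate below is an assumed figure for illustration; real AFRs vary by drive model and operating conditions.

```python
# Illustrative reliability math for RAID 0 (assumed per-drive AFR of 2%).
afr = 0.02            # probability a single drive fails within a year (assumption)
n_drives = 8

# RAID 0 loses all data if ANY drive fails:
p_array_loss = 1 - (1 - afr) ** n_drives
print(f"Single drive:   {afr:.1%} annual loss probability")
print(f"8-drive RAID 0: {p_array_loss:.1%} annual loss probability")
# ≈ 14.9%, i.e. roughly 8× the single-drive figure when the AFR is small.
```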
RAID 1 is the simplest redundant configuration: data is written identically to two (or more) drives simultaneously. If any drive fails, an identical copy remains available. Mirroring's simplicity makes it extremely reliable and efficient for read-intensive workloads.
Architecture:
Every write operation is duplicated to all drives in the mirror set. Reads can be served by any drive, allowing the controller to optimize read distribution (e.g., choosing the drive with the lowest seek time for a particular request).
```
// RAID 1 with 2 drives (simple mirror)
// Writing blocks A, B, C, D

//          Drive 0   Drive 1 (Mirror)
// Row 0:   A         A
// Row 1:   B         B
// Row 2:   C         C
// Row 3:   D         D

// Write operation for block A:
// 1. Write A to Drive 0
// 2. Write A to Drive 1
// 3. Confirm completion when BOTH succeed
// → Write performance: 1x (limited by slowest mirror)

// Read operation for block A:
// Option 1: Read from Drive 0
// Option 2: Read from Drive 1
// → Controller chooses optimal drive
// → Read performance: up to 2x (parallel reads possible)

// Key formulas:
// Usable capacity  = Total capacity / N (where N = number of mirrors)
// Read throughput  = N × single-drive throughput (theoretical max)
// Write throughput = 1× single-drive throughput (must write all copies)
// Fault tolerance  = N-1 drives (can lose all but one mirror)
```
RAID 1 Advantages:
- Reads can be served by any mirror, giving excellent read IOPS and latency.
- Rebuilds are simple, fast block copies with no parity math.
- The array survives every failure short of losing all mirrors (N-1 drive tolerance).

RAID 1 Disadvantages:
- Capacity efficiency is only 1/N (50% for a two-way mirror).
- Write throughput is limited to that of a single drive, since every mirror must receive each write.
| Property | Value | Explanation |
|---|---|---|
| Minimum Drives | 2 | Need at least 2 drives for a mirror pair |
| Usable Capacity | 1/N × total capacity | 50% for 2-way mirror, 33% for 3-way mirror |
| Read Performance | Excellent (N× IOPS) | Reads can be distributed across all mirrors |
| Write Performance | Same as single drive | All mirrors must receive each write |
| Fault Tolerance | N-1 drives | Survives all failures except complete mirror loss |
| Rebuild Speed | Fast (sequential copy) | No parity calculations; simple block copy |
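The write fan-out and read distribution described above can be sketched in a few lines of Python. The `Drive` class and the queue-depth heuristic are hypothetical simplifications; a real controller also handles error reporting, resynchronization, and caching.

```python
# Minimal sketch of RAID 1 write fan-out and read selection (illustrative).
class Drive:
    def __init__(self, name: str):
        self.name = name
        self.blocks: dict[int, bytes] = {}
        self.queue_depth = 0          # stand-in for "how busy is this drive"

    def write(self, lba: int, data: bytes) -> None:
        self.blocks[lba] = data

    def read(self, lba: int) -> bytes:
        return self.blocks[lba]

class Raid1:
    def __init__(self, mirrors: list[Drive]):
        self.mirrors = mirrors

    def write(self, lba: int, data: bytes) -> None:
        # Every mirror must receive the write; completion waits for all copies.
        for drive in self.mirrors:
            drive.write(lba, data)

    def read(self, lba: int) -> bytes:
        # Any mirror holds a full copy; pick the least-busy one.
        best = min(self.mirrors, key=lambda d: d.queue_depth)
        return best.read(lba)

array = Raid1([Drive("sda"), Drive("sdb")])
array.write(0, b"block A")
assert array.read(0) == b"block A"
```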
For extremely critical data (like the transaction log root), some systems use 3-way mirrors. This provides tolerance for 2 simultaneous failures and further improves read performance. ZFS supports this as a three-way mirror vdev, and it's common for metadata-heavy workloads.
RAID 1 for Databases:
RAID 1 is ideal for small, critical volumes such as transaction logs and metadata, where write latency, fast rebuilds, and simplicity matter more than capacity efficiency.
RAID 5 uses striping with distributed parity to provide fault tolerance while maintaining high capacity efficiency. Parity information is calculated across each stripe row and distributed across all drives—no single 'parity drive' creates a bottleneck.
Architecture:
For each stripe row, one stripe unit contains parity (XOR of all data stripe units in that row). The parity position rotates across drives to distribute the write load evenly. This distributed parity design was a significant improvement over RAID 3/4, which used dedicated parity drives.
```
// RAID 5 with 4 drives, distributed parity
// D = Data, P = Parity (XOR of data in that row)

//          Drive 0   Drive 1   Drive 2   Drive 3
// Row 0:   D0        D1        D2        P0     (P0 = D0 ⊕ D1 ⊕ D2)
// Row 1:   D3        D4        P1        D5     (P1 = D3 ⊕ D4 ⊕ D5)
// Row 2:   D6        P2        D7        D8     (P2 = D6 ⊕ D7 ⊕ D8)
// Row 3:   P3        D9        D10       D11    (P3 = D9 ⊕ D10 ⊕ D11)
// Row 4:   D12       D13       D14       P4     (pattern repeats)

// Parity position cycles: 3, 2, 1, 0, 3, 2, 1, 0, ...
// This distributes parity write load across all drives

// Small write process (update D1):
// 1. Read old D1
// 2. Read old P0 (P0 = D0 ⊕ D1 ⊕ D2)
// 3. Calculate new P0 = old_P0 ⊕ old_D1 ⊕ new_D1
// 4. Write new D1
// 5. Write new P0
// Total: 2 reads + 2 writes = 4 I/O operations (4x write penalty)

// Key formulas:
// Usable capacity     = (N-1) × drive capacity
// Capacity efficiency = (N-1)/N = 75% for 4 drives, 87.5% for 8 drives
// Fault tolerance     = 1 drive
```
The RAID 5 Write Penalty Explained:
Small writes in RAID 5 require the read-modify-write sequence shown above. Each logical write translates to 4 physical I/O operations: read the old data block, read the old parity block, write the new data block, and write the new parity block.
This 4x write penalty significantly impacts OLTP workloads with many small random writes. For sequential writes that span full stripes, the penalty is reduced because parity can be calculated directly from new data without reading old values.
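The parity arithmetic behind the 4x penalty is plain XOR, which makes it easy to verify. The sketch below uses tiny 4-byte blocks purely for illustration; real arrays operate on full stripe units.

```python
# Sketch of RAID 5 parity math for one stripe row (pure Python, illustrative).
def xor_bytes(*blocks: bytes) -> bytes:
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

# Full-stripe write: parity is computed directly from the new data (no penalty).
d0, d1, d2 = b"\x11" * 4, b"\x22" * 4, b"\x33" * 4
p0 = xor_bytes(d0, d1, d2)

# Small write to D1: the read-modify-write path from the diagram above.
new_d1 = b"\x99" * 4
new_p0 = xor_bytes(p0, d1, new_d1)   # fed by 2 reads (old D1, old P0)
# ...followed by 2 writes (new D1, new P0): 4 physical I/Os per logical write.

# The shortcut yields the same parity as recomputing from scratch:
assert new_p0 == xor_bytes(d0, new_d1, d2)
```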
| Property | Value | Explanation |
|---|---|---|
| Minimum Drives | 3 | Need at least 3 drives (2 data + 1 parity equivalent) |
| Usable Capacity | (N-1) × drive capacity | One drive worth of capacity used for parity |
| Read Performance | Excellent ((N-1)× for data) | Parallel reads across all data stripe units |
| Write Performance | Reduced (4× write penalty) | Each write requires 2 reads + 2 writes |
| Fault Tolerance | 1 drive | Survives exactly one drive failure |
| Rebuild Time | Long (full parity reconstruction) | Must read all surviving drives to rebuild |
If power fails during a RAID 5 write (after data is written but before parity is updated), the array becomes inconsistent. This 'write hole' means parity may not match data after an unclean shutdown. Modern RAID controllers use battery-backed cache or journaling to prevent this; software RAID solutions like ZFS use copy-on-write semantics to eliminate the write hole entirely.
Degraded Mode Performance:
When a RAID 5 drive fails, the array enters degraded mode: any read that targets the failed drive must be reconstructed by reading the corresponding stripe units from every surviving drive and XORing them, so throughput drops and latency rises until the failed drive is replaced and rebuilt.
Rebuild Process:
Rebuilding a RAID 5 array after drive replacement requires reading every stripe unit from every surviving drive and XORing them to regenerate the missing data onto the replacement drive, all while the array continues to serve (degraded) production I/O.
With modern drive capacities (8TB+), RAID 5 rebuild times can exceed 24 hours. During this window, the array is vulnerable to a second failure that would cause complete data loss. Additionally, the rebuild process aggressively reads all drives, which can trigger latent sector errors on stressed drives. For drives larger than 1TB, RAID 6 or RAID 10 is strongly recommended.
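The rebuild-time warning follows directly from drive capacity divided by the effective rebuild rate. The 100 MB/s figure below is an assumption representing a rebuild throttled by foreground I/O, not a measured number.

```python
# Rough rebuild-time estimate (assumed effective rate under production load).
drive_capacity_tb = 18
rebuild_rate_mb_s = 100   # assumption; real rebuilds compete with foreground I/O

seconds = (drive_capacity_tb * 1e12) / (rebuild_rate_mb_s * 1e6)
print(f"Estimated rebuild time: {seconds / 3600:.0f} hours")   # ≈ 50 hours
```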
RAID 6 extends RAID 5 by adding a second independent parity calculation, enabling the array to survive any two simultaneous drive failures. This dual-parity approach addresses the rebuild vulnerability of RAID 5 with large drives.
Architecture:
RAID 6 uses two different parity calculations: P, a standard XOR row parity identical to RAID 5's, and Q, a Reed-Solomon code computed over a Galois field.
Because P and Q are calculated differently, any two missing blocks can be reconstructed using the remaining data, P, and Q.
```
// RAID 6 with 5 drives, dual distributed parity
// D = Data, P = Row Parity (XOR), Q = Diagonal Parity (Galois Field)

//          Drive 0   Drive 1   Drive 2   Drive 3   Drive 4
// Row 0:   D0        D1        D2        P0        Q0
// Row 1:   D3        D4        P1        Q1        D5
// Row 2:   D6        P2        Q2        D7        D8
// Row 3:   P3        Q3        D9        D10       D11
// Row 4:   Q4        D12       D13       D14       P4

// Both P and Q positions rotate to distribute load

// P calculation: P = D0 ⊕ D1 ⊕ D2 (standard XOR)
// Q calculation: Q = g⁰·D0 ⊕ g¹·D1 ⊕ g²·D2 (Reed-Solomon, Galois Field)

// Small write process (update D1):
// 1. Read old D1
// 2. Read old P0
// 3. Read Q0 or calculate Q coefficient
// 4. Write new D1
// 5. Write new P0 = old_P0 ⊕ old_D1 ⊕ new_D1
// 6. Write new Q0 (using Galois Field math)
// Total: 3 reads + 3 writes = 6 I/O operations (6× write penalty)

// Key formulas:
// Usable capacity     = (N-2) × drive capacity
// Capacity efficiency = (N-2)/N = 60% for 5 drives, 75% for 8 drives
// Fault tolerance     = 2 drives
```
Reed-Solomon Coding:
The Q parity in RAID 6 uses Reed-Solomon error correction codes based on Galois Field mathematics. Unlike simple XOR, Reed-Solomon can distinguish which blocks are missing, enabling recovery of any two failures.
The mathematics are complex, but the practical effect is straightforward: Q provides a second, independent redundancy check. Modern CPUs provide instructions that accelerate Galois Field arithmetic (for example, the carry-less multiply instruction PCLMULQDQ in the CLMUL extension), minimizing the performance impact.
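For readers who want to see the Q calculation in code, the sketch below computes P and Q for one stripe row in pure Python using GF(2^8) arithmetic with the 0x11d reduction polynomial commonly used for RAID 6. It is a didactic illustration, not an optimized implementation.

```python
# Illustrative dual-parity (P and Q) calculation over GF(2^8).
def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
    return result

def raid6_pq(data_blocks: list[list[int]]) -> tuple[list[int], list[int]]:
    """Compute P (XOR) and Q (Reed-Solomon) parity bytes for one stripe row."""
    length = len(data_blocks[0])
    p = [0] * length
    q = [0] * length
    g = 1                          # g^0 coefficient for the first data block
    for block in data_blocks:
        for i, byte in enumerate(block):
            p[i] ^= byte           # P = D0 ⊕ D1 ⊕ D2 ...
            q[i] ^= gf_mul(g, byte)  # Q = g^0·D0 ⊕ g^1·D1 ⊕ g^2·D2 ...
        g = gf_mul(g, 2)           # advance to the next power of the generator
    return p, q

d0, d1, d2 = [0x11] * 4, [0x22] * 4, [0x33] * 4
p, q = raid6_pq([d0, d1, d2])
# P alone recovers any single missing data block (as in RAID 5);
# P and Q together recover any two missing blocks.
```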
| Property | Value | Explanation |
|---|---|---|
| Minimum Drives | 4 | Need at least 4 drives (2 data + 2 parity equivalent) |
| Usable Capacity | (N-2) × drive capacity | Two drives worth of capacity for dual parity |
| Read Performance | Excellent ((N-2)× for data) | Parallel reads across all data stripe units |
| Write Performance | Reduced (6× write penalty) | Each write requires 3 reads + 3 writes |
| Fault Tolerance | 2 drives | Survives any two simultaneous drive failures |
| Rebuild Time | Long but safer | Can lose another drive during rebuild |
RAID 6 vs. RAID 5 Comparison:
| Aspect | RAID 5 | RAID 6 |
|---|---|---|
| Fault tolerance | 1 drive | 2 drives |
| Capacity overhead | 1/N | 2/N |
| Write penalty | 4× | 6× |
| Rebuild safety | Vulnerable | Safe during rebuild |
| Recommended for | Small arrays, ≤1TB drives | Large arrays, >1TB drives |
Why RAID 6 Is Now Standard:
With 18TB drives and larger becoming common, rebuild windows stretch from hours into days, and the probability of a second drive failure or a latent sector error during that window becomes too high for single-parity RAID 5 to be considered safe.
ZFS offers RAIDZ3 (triple parity), tolerating 3 simultaneous failures. For extremely large arrays (20+ drives) or ultra-critical data, triple parity provides additional safety margin against correlated failures during extended rebuilds.
RAID 10 (also written RAID 1+0) combines mirroring and striping to deliver both high performance and fault tolerance without the write penalty of parity-based RAID. Data is first mirrored (RAID 1), then the mirror pairs are striped (RAID 0).
Architecture:
A RAID 10 array consists of multiple mirror pairs. Each pair contains identical copies of data. Writes go to both drives in a pair; reads can come from either. The pairs are then striped to distribute data and I/O load across the array.
```
// RAID 10 with 6 drives (3 mirror pairs, striped)

//          Pair 0         Pair 1         Pair 2
//          D0a    D0b     D1a    D1b     D2a    D2b
// Row 0:   A      A       B      B       C      C
// Row 1:   D      D       E      E       F      F
// Row 2:   G      G       H      H       I      I

// Logical view: A B C D E F G H I (striped across 3 pairs)
// Physical: Each element exists on 2 drives (mirrored)

// Write operation for block A:
// 1. Write A to D0a
// 2. Write A to D0b (in parallel)
// 3. Confirm when both complete
// → Write performance: Near 3× (3 pairs can write in parallel)
// → No parity calculation overhead

// Read operation for block A:
// Option 1: Read from D0a
// Option 2: Read from D0b
// → Controller chooses optimal drive
// → Read performance: Up to 6× (all drives can serve reads)

// Key formulas:
// Usable capacity  = (N/2) × drive capacity = 50% efficiency
// Fault tolerance  = 1 per mirror pair (up to N/2 total if lucky)
// Write throughput = (N/2) × single-drive throughput
// Read throughput  = N × single-drive throughput
```
RAID 10 Fault Tolerance Nuance:
RAID 10 can survive multiple failures, but with critical caveats:
With a 6-drive RAID 10, you can survive 1, 2, or even 3 failures—as long as no mirror pair loses both drives. This means RAID 10 can be more or less reliable than RAID 6 depending on failure patterns:
| Scenario | RAID 10 (6 drives) | RAID 6 (6 drives) |
|---|---|---|
| 1 failure | ✓ Survives | ✓ Survives |
| 2 failures (different pairs) | ✓ Survives | ✓ Survives |
| 2 failures (same pair) | ✗ Failed | ✓ Survives |
| 3 failures (one per pair) | ✓ Survives | ✗ Failed |
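The scenario table above can be expressed as two small predicates. The drive-numbering convention (drives 2k and 2k+1 form mirror pair k) is an assumption made for the sketch.

```python
# Survival rules behind the scenario table (illustrative).
def raid10_survives(failed: set[int], num_drives: int = 6) -> bool:
    # RAID 10 survives as long as no mirror pair has lost BOTH members.
    pairs = [(i, i + 1) for i in range(0, num_drives, 2)]
    return all(not (a in failed and b in failed) for a, b in pairs)

def raid6_survives(failed: set[int]) -> bool:
    # RAID 6 survives any combination of at most two failed drives.
    return len(failed) <= 2

print(raid10_survives({0}), raid6_survives({0}))              # True True
print(raid10_survives({0, 2}), raid6_survives({0, 2}))        # True True  (different pairs)
print(raid10_survives({0, 1}), raid6_survives({0, 1}))        # False True (same pair)
print(raid10_survives({1, 2, 4}), raid6_survives({1, 2, 4}))  # True False (one per pair)
```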
| Property | Value | Explanation |
|---|---|---|
| Minimum Drives | 4 | Need at least 2 mirror pairs to stripe |
| Usable Capacity | N/2 × drive capacity | 50% capacity efficiency (mirroring overhead) |
| Read Performance | Excellent (N× IOPS) | All drives can serve reads independently |
| Write Performance | Excellent ((N/2)× throughput) | No parity overhead; parallel writes to pairs |
| Fault Tolerance | 1+ drives (pair-dependent) | Survives 1 per pair, up to N/2 total |
| Rebuild Speed | Fast (simple mirror copy) | Only affects one pair; no parity reconstruction |
RAID 10 (mirrors then stripes) differs from RAID 01 (stripes then mirrors). RAID 10 is superior because a single drive failure only degrades one mirror pair, while RAID 01 degrades an entire stripe set. Always prefer RAID 10 over RAID 01 for this reason.
Why RAID 10 for Databases:
RAID 10 is the gold standard for database workloads because it delivers high random-write IOPS with no parity penalty, excellent read performance, fast mirror-copy rebuilds, and minimal performance loss in degraded mode.
The capacity cost is the only significant drawback—you lose 50% of raw capacity. For high-performance OLTP databases where IOPS matter more than capacity, this tradeoff is almost always worthwhile.
The following comprehensive comparison consolidates the key characteristics of each RAID level to facilitate selection decisions.
| Property | RAID 0 | RAID 1 | RAID 5 | RAID 6 | RAID 10 |
|---|---|---|---|---|---|
| Min Drives | 2 | 2 | 3 | 4 | 4 |
| Usable Capacity | 100% | 50% | (N-1)/N | (N-2)/N | 50% |
| Usable Capacity (8-drive example) | 8 drives | 4 drives | 7 drives | 6 drives | 4 drives |
| Fault Tolerance | 0 | N-1 | 1 | 2 | 1-N/2 |
| Read IOPS | Excellent | Excellent | Good | Good | Excellent |
| Write IOPS | Excellent | Good | Poor | Poor | Excellent |
| Write Penalty | None | None | 4× | 6× | None |
| Rebuild Time | N/A | Fast | Slow | Slow | Fast |
| Degraded Performance | N/A | Minimal | Significant | Significant | Minimal |
| Best For | Scratch/temp | Small critical | Read-heavy | Large arrays | OLTP databases |
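To see how the write penalties in the table translate into effective random-write IOPS, here is a rough calculation. The per-drive figure of 200 IOPS is an assumed HDD value, and RAID 10 is modeled with a penalty of 2 because each logical write lands on both members of a mirror pair (there is no parity read-modify-write, which is what the "None" in the table refers to).

```python
# Rough effective random-write IOPS under different write penalties (illustrative).
def effective_write_iops(num_drives: int, per_drive_iops: int, penalty: int) -> float:
    # Each logical write consumes `penalty` physical I/Os spread across the array.
    return num_drives * per_drive_iops / penalty

drives, iops = 8, 200   # assumed 8-drive array, ~200 random-write IOPS per HDD
print("RAID 10:", effective_write_iops(drives, iops, 2))   # 800  (mirror duplication)
print("RAID 5: ", effective_write_iops(drives, iops, 4))   # 400  (2 reads + 2 writes)
print("RAID 6: ", effective_write_iops(drives, iops, 6))   # ≈267 (3 reads + 3 writes)
```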
Capacity Efficiency vs. Drive Count:
Parity-based RAID becomes more capacity-efficient as you add drives:
| Drives | RAID 5 Efficiency | RAID 6 Efficiency | RAID 10 Efficiency |
|---|---|---|---|
| 4 | 75% | 50% | 50% |
| 6 | 83.3% | 66.7% | 50% |
| 8 | 87.5% | 75% | 50% |
| 12 | 91.7% | 83.3% | 50% |
| 16 | 93.75% | 87.5% | 50% |
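The efficiency figures above follow directly from the capacity formulas given earlier; a small helper reproduces them for any drive count.

```python
# Reproduces the capacity-efficiency table using the formulas from this page.
def usable_fraction(level: str, n: int) -> float:
    if level == "raid0":
        return 1.0                  # striping only, no redundancy
    if level == "raid10":
        return 0.5                  # mirrored pairs, regardless of drive count
    if level == "raid5":
        return (n - 1) / n          # one drive's worth of parity
    if level == "raid6":
        return (n - 2) / n          # two drives' worth of parity
    raise ValueError(level)

for n in (4, 6, 8, 12, 16):
    print(n,
          f"RAID5 {usable_fraction('raid5', n):.1%}",
          f"RAID6 {usable_fraction('raid6', n):.1%}",
          f"RAID10 {usable_fraction('raid10', n):.1%}")
```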
For large arrays (12+ drives), RAID 6 approaches RAID 10's capacity efficiency while providing superior fault tolerance. However, the write penalty remains, making RAID 10 superior for write-intensive workloads regardless of array size.
With SSDs, the write penalty of parity RAID hurts less: random I/O is fast enough that the extra parity reads and writes add little latency. However, the additional write amplification from parity updates accelerates SSD wear. For SSD arrays, consider RAID 10 if write endurance is a concern, or accept the wear tradeoff for capacity efficiency with RAID 5/6.
We've examined the five major RAID levels in depth. The key decision factors: RAID 0 is acceptable only for disposable scratch data; RAID 1 suits small, critical volumes; RAID 5 fits read-heavy workloads on small arrays of small drives; RAID 6 is the default for large-capacity arrays; and RAID 10 is the preferred choice for write-intensive OLTP databases.
What's Next:
With solid understanding of RAID levels, we'll now examine their performance characteristics in greater depth. The next page provides detailed performance analysis, including benchmarking methodologies, real-world performance patterns, and optimization strategies for database workloads.
You now understand the architecture, characteristics, and tradeoffs of RAID 0, 1, 5, 6, and 10. You can calculate capacity efficiency, understand write penalties, and recognize the fault tolerance implications of each level. Next, we'll dive deeper into performance analysis.