Having explored the mechanics, consequences, and rollback implications of dirty reads, we arrive at the most practical question: How do we prevent them?
The good news is that preventing dirty reads is straightforward in modern database systems. The challenge lies in understanding the available mechanisms, their trade-offs, and when each is appropriate. Prevention strategies range from simple configuration changes to sophisticated architectural patterns—and often, the right choice depends on your specific requirements.
This page covers the complete spectrum of dirty read prevention: SQL isolation levels and their implementation, locking protocols, Multi-Version Concurrency Control (MVCC), application-level defensive patterns, and architectural strategies. You'll finish with a practical playbook for ensuring your applications never suffer from dirty read anomalies.
The SQL standard defines four isolation levels that provide different guarantees against concurrency anomalies. Any isolation level except READ UNCOMMITTED prevents dirty reads.
The Isolation Level Hierarchy:
| Isolation Level | Dirty Reads? | Non-Repeatable Reads? | Phantoms? | Typical Use |
|---|---|---|---|---|
| READ UNCOMMITTED | ✓ Possible | ✓ Possible | ✓ Possible | Approximate analytics only |
| READ COMMITTED | ✗ Prevented | ✓ Possible | ✓ Possible | General OLTP (most common) |
| REPEATABLE READ | ✗ Prevented | ✗ Prevented | ✓ Possible | Consistent reporting |
| SERIALIZABLE | ✗ Prevented | ✗ Prevented | ✗ Prevented | Critical operations |
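The hierarchy above can be encoded as a small lookup table. This is an illustrative sketch (the names and structure are ours, not any database API): a level prevents dirty reads exactly when the `dirtyRead` anomaly is absent from its permitted list.

```typescript
// Which anomalies each SQL isolation level permits, per the table above.
type Anomaly = "dirtyRead" | "nonRepeatableRead" | "phantom";

const permittedAnomalies: Record<string, Anomaly[]> = {
  READ_UNCOMMITTED: ["dirtyRead", "nonRepeatableRead", "phantom"],
  READ_COMMITTED: ["nonRepeatableRead", "phantom"],
  REPEATABLE_READ: ["phantom"],
  SERIALIZABLE: [],
};

// A level prevents dirty reads iff "dirtyRead" is absent from its list.
function preventsDirtyReads(level: string): boolean {
  return !permittedAnomalies[level].includes("dirtyRead");
}
```

Only READ UNCOMMITTED fails this check — which is why every other level is a valid defense.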
READ COMMITTED: The Standard Defense
READ COMMITTED is the default isolation level in most major database systems (PostgreSQL, Oracle, SQL Server). It provides a simple guarantee:
A transaction may only read data that has been committed by other transactions.
This directly prevents dirty reads by definition. Uncommitted data is invisible to other transactions.
```sql
-- PostgreSQL: Setting isolation levels

-- Session-level default (affects all transactions in session)
SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Transaction-specific setting
BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- All reads in this transaction see only committed data
SELECT balance FROM accounts WHERE id = 1;
COMMIT;

-- Or use the combined syntax
BEGIN ISOLATION LEVEL READ COMMITTED;
SELECT balance FROM accounts WHERE id = 1;
COMMIT;

-- Checking current isolation level
SHOW transaction_isolation;

-- PostgreSQL note: READ UNCOMMITTED is treated as READ COMMITTED
-- PostgreSQL's MVCC doesn't actually support true dirty reads
```

PostgreSQL, Oracle, and SQL Server default to READ COMMITTED; MySQL/InnoDB defaults to REPEATABLE READ. All of these prevent dirty reads. Only explicit use of READ UNCOMMITTED enables dirty reads, and some databases (like PostgreSQL) don't support true dirty reads even at that level.
In lock-based concurrency control systems, preventing dirty reads relies on holding write locks until commit and requiring read locks that conflict with uncommitted writes.
The Lock Compatibility Matrix:
| Request \ Held | None | Shared (S) | Exclusive (X) |
|---|---|---|---|
| Shared (S) | ✓ Grant | ✓ Grant | ✗ Wait |
| Exclusive (X) | ✓ Grant | ✗ Wait | ✗ Wait |
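The matrix translates directly into a grant check. A minimal sketch (the type and function names are illustrative, not a real lock manager's API):

```typescript
type LockMode = "none" | "shared" | "exclusive";

// Returns true if the requested lock is compatible with the held lock.
// Mirrors the compatibility matrix: S + S is the only compatible pair.
function canGrant(requested: "shared" | "exclusive", held: LockMode): boolean {
  if (held === "none") return true; // Nothing held: always grant
  return requested === "shared" && held === "shared";
}
```

A writer holding an X lock until commit means `canGrant("shared", "exclusive")` is false — and that refusal is precisely what blocks the dirty read.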
How Locking Prevents Dirty Reads:
When a transaction writes data:

- It acquires an exclusive (X) lock on the item before writing.
- It holds that lock until the transaction commits or aborts.

When another transaction tries to read the same data:

- It must acquire a shared (S) lock, which conflicts with the held X lock.
- It is blocked until the writer commits or aborts and releases the X lock.
The key insight: The reader is blocked from seeing uncommitted data because it cannot acquire a lock on data that is being written.
```text
LOCK-BASED DIRTY READ PREVENTION
═══════════════════════════════════════════════════════════════════════

Timeline showing how locks prevent dirty reads:

Time   T1 (Writer)                  T2 (Reader)
────────────────────────────────────────────────────────────────────────
t1     BEGIN
t2     REQUEST X-LOCK(A)
t3     GRANT X-LOCK(A)
t4     WRITE A = 200
       (A is now dirty, locked)
t5                                  BEGIN
t6                                  REQUEST S-LOCK(A)
t7                                  ╳ BLOCKED ╳ (X-lock held by T1)
t8                                  | Waiting...
t9     COMMIT                       |
t10    RELEASE X-LOCK(A)            |
t11                                 GRANT S-LOCK(A)
t12                                 READ A → 200 (COMMITTED value!)
t13                                 RELEASE S-LOCK(A)
t14                                 COMMIT

Result: T2 read A = 200, but this is the COMMITTED value.
        T2 never saw the uncommitted state because it was blocked.

ALTERNATIVE: If T1 had aborted instead of committed:

t9'    ABORT
t10'   ROLLBACK A = 100
t11'   RELEASE X-LOCK(A)
t12'                                GRANT S-LOCK(A)
t13'                                READ A → 100 (original value!)

Either way: T2 only ever reads committed data.
```

The most common locking protocol, Two-Phase Locking (2PL), holds locks until transaction completion in its strict variant. This guarantees that readers are blocked from accessing uncommitted writes, preventing dirty reads. We'll explore 2PL in detail in the Locking Protocols chapter.
Multi-Version Concurrency Control (MVCC) provides an alternative approach to dirty read prevention that avoids the blocking behavior of locking. Instead of making readers wait for uncommitted writes, MVCC makes uncommitted writes invisible to readers.
The MVCC Visibility Model:
In MVCC, each write operation creates a new version of the data item. Each version is tagged with:

- The transaction that created it
- Whether (and when) that transaction committed
- The transaction that deleted it, if any
When a transaction reads data, it uses visibility rules to determine which version to see:
```text
MVCC VISIBILITY ALGORITHM (Simplified)
═══════════════════════════════════════════════════════════════════════

FUNCTION get_visible_version(data_item X, reader_transaction T_r):

    versions = get_all_versions(X)   # Ordered newest to oldest

    FOR EACH version V IN versions:
        creator_txn = V.creating_transaction

        # Rule 1: Transaction sees its own uncommitted changes
        IF creator_txn == T_r:
            RETURN V

        # Rule 2: Skip uncommitted versions from other transactions
        IF NOT is_committed(creator_txn):
            CONTINUE   # Skip this version, try older one

        # Rule 3: Skip versions committed after our snapshot
        IF commit_time(creator_txn) > T_r.snapshot_time:
            CONTINUE   # Skip this version, try older one

        # Rule 4: Skip versions deleted before our snapshot
        IF V.deleted_by IS NOT NULL:
            deleter = V.deleted_by
            IF is_committed(deleter) AND commit_time(deleter) <= T_r.snapshot_time:
                CONTINUE   # This version was deleted before our snapshot

        # This version is visible
        RETURN V

    RETURN NULL   # No visible version exists

KEY: Uncommitted versions are ALWAYS skipped (Rule 2)
     This inherently prevents dirty reads!
```

MVCC Advantages for Dirty Read Prevention:

- Readers never block writers, and writers never block readers.
- Uncommitted versions are structurally invisible: dirty reads are prevented by the read path itself, not by configuration.
- Each transaction reads from a consistent snapshot, so reads stay fast even under heavy write load.
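The visibility rules translate almost line-for-line into executable form. The following is a simplified sketch (the types and scenario are illustrative, not any real engine's internals), showing that a reader skips an uncommitted version and lands on the committed one beneath it:

```typescript
interface Txn { id: number; committed: boolean; commitTime: number | null; }
interface Version { value: number; creator: Txn; deletedBy: Txn | null; }
interface Reader { txn: Txn; snapshotTime: number; }

// versions is ordered newest to oldest, as in the pseudocode.
function getVisibleVersion(versions: Version[], reader: Reader): Version | null {
  for (const v of versions) {
    // Rule 1: a transaction sees its own uncommitted changes
    if (v.creator === reader.txn) return v;
    // Rule 2: skip uncommitted versions from other transactions
    if (!v.creator.committed) continue;
    // Rule 3: skip versions committed after our snapshot
    if ((v.creator.commitTime ?? Infinity) > reader.snapshotTime) continue;
    // Rule 4: skip versions deleted before our snapshot
    const deleter = v.deletedBy;
    if (deleter !== null && deleter.committed &&
        (deleter.commitTime ?? Infinity) <= reader.snapshotTime) continue;
    return v; // This version is visible
  }
  return null; // No visible version exists
}

// Scenario: T1 committed value 100 at time 5; T2 wrote 200 but hasn't committed.
const t1: Txn = { id: 1, committed: true, commitTime: 5 };
const t2: Txn = { id: 2, committed: false, commitTime: null };
const versions: Version[] = [
  { value: 200, creator: t2, deletedBy: null }, // newest, uncommitted (dirty)
  { value: 100, creator: t1, deletedBy: null }, // committed
];
const t3: Txn = { id: 3, committed: false, commitTime: null };
const visible = getVisibleVersion(versions, { txn: t3, snapshotTime: 10 });
// Rule 2 skips T2's dirty version: the reader sees 100, never 200.
```

Note that if T2 itself reads, Rule 1 returns its own version 200 — a transaction always sees its own writes, which is not a dirty read.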
PostgreSQL uses MVCC so thoroughly that it doesn't even support true READ UNCOMMITTED. If you request READ UNCOMMITTED, PostgreSQL silently upgrades it to READ COMMITTED. The MVCC architecture makes dirty reads architecturally impossible in the standard read path.
Each major database system has specific configurations and behaviors regarding dirty read prevention. Here's a practical guide:
PostgreSQL:
```sql
-- PostgreSQL's MVCC makes dirty reads effectively impossible

-- Check default isolation level (should be 'read committed')
SHOW default_transaction_isolation;

-- Set in postgresql.conf for cluster-wide default
-- default_transaction_isolation = 'read committed'

-- Even if you explicitly request READ UNCOMMITTED:
BEGIN ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM accounts;  -- Still behaves as READ COMMITTED!
COMMIT;

-- This is by design: PostgreSQL's documentation states:
-- "Read Uncommitted [is treated] the same as Read Committed...
--  PostgreSQL's Read Committed mode does not allow dirty reads."

-- For additional safety, ensure you're not accidentally
-- using external systems that bypass MVCC:
-- - Avoid pg_dump --data-only during active transactions
-- - Don't use filesystem backups without proper WAL handling
```

MySQL/InnoDB:
```sql
-- MySQL/InnoDB defaults to REPEATABLE READ (prevents dirty reads)

-- Check current setting
SELECT @@transaction_isolation;

-- Ensure InnoDB is the storage engine (MyISAM has no transactions!)
SHOW CREATE TABLE your_table;

-- Set server-wide default in my.cnf:
-- [mysqld]
-- transaction-isolation = READ-COMMITTED

-- MySQL DOES support true READ UNCOMMITTED
-- Avoid using it:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;  -- DANGEROUS!
SELECT * FROM accounts;  -- Can see uncommitted data
COMMIT;

-- Audit your application for any READ UNCOMMITTED usage:
-- Search codebase for: 'READ UNCOMMITTED' or 'TRANSACTION_READ_UNCOMMITTED'

-- Best practice: Set application-level default
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
```

SQL Server:
```sql
-- SQL Server defaults to READ COMMITTED (prevents dirty reads)

-- Check database settings
SELECT name, is_read_committed_snapshot_on, snapshot_isolation_state_desc
FROM sys.databases WHERE name = DB_NAME();

-- Enable READ COMMITTED SNAPSHOT for non-blocking reads (recommended)
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;
-- This uses MVCC-style behavior for READ COMMITTED isolation

-- AVOID NOLOCK hint / READ UNCOMMITTED (common anti-pattern!)
-- BAD:
SELECT * FROM accounts WITH (NOLOCK);  -- Allows dirty reads!
SELECT * FROM accounts (NOLOCK);       -- Same thing, older syntax

-- EQUIVALENT BAD:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM accounts;

-- Search your codebase for:
-- 'NOLOCK', 'WITH (NOLOCK)', 'READ UNCOMMITTED'
-- These are often used for "performance" but enable dirty reads

-- Good alternative for non-blocking reads:
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;
-- Now READ COMMITTED uses snapshots, no blocking OR dirty reads
```

SQL Server's WITH (NOLOCK) hint is one of the most common sources of dirty reads in production systems. It's often used for 'performance' without understanding the consequences. Audit your SQL Server applications for NOLOCK usage and replace it with READ COMMITTED SNAPSHOT for safe non-blocking reads.
Beyond database configuration, application design can incorporate additional defenses against dirty reads and their consequences.
Connection Configuration:
Ensure your database connections always use appropriate isolation:
```typescript
// Connection Pool Configuration Examples

// Prisma (Node.js)
// In schema.prisma, no special config needed - uses database default
// For explicit control, use interactive transactions:
await prisma.$transaction(async (tx) => {
  // This transaction uses the database's default isolation
  const account = await tx.account.findUnique({ where: { id: 1 } });
  // All operations see consistent, committed data
}, {
  isolationLevel: 'ReadCommitted', // Explicit isolation level
});

// TypeORM (Node.js)
const dataSource = new DataSource({
  // ...connection options
  extra: {
    // For MySQL: set default isolation at connection
    connectionLimit: 10,
  },
});

// Run queries with explicit isolation
await dataSource.transaction('READ COMMITTED', async (manager) => {
  const account = await manager.findOne(Account, { where: { id: 1 } });
});

// JDBC (Java)
connection.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);

// Ensure you NEVER use:
// Connection.TRANSACTION_READ_UNCOMMITTED <-- DANGEROUS
```

Defensive Query Patterns:
```typescript
// Example: Defensive balance check with explicit locking

async function transferFunds(
  fromAccountId: string,
  toAccountId: string,
  amount: number
): Promise<TransferResult> {
  return await prisma.$transaction(async (tx) => {
    // SELECT FOR UPDATE: Locks the row, guarantees committed data
    // ($queryRaw returns an array of rows, so destructure the first)
    const [fromAccount] = await tx.$queryRaw<{ balance: number }[]>`
      SELECT balance FROM accounts
      WHERE id = ${fromAccountId}
      FOR UPDATE
    `;

    // Validate with committed, locked data
    if (fromAccount.balance < amount) {
      throw new InsufficientFundsError();
    }

    // Safe to proceed - we have a committed, locked view
    await tx.account.update({
      where: { id: fromAccountId },
      data: { balance: { decrement: amount } }
    });

    await tx.account.update({
      where: { id: toAccountId },
      data: { balance: { increment: amount } }
    });

    return { success: true };
  }, {
    isolationLevel: 'Serializable', // Maximum safety for financial ops
    maxWait: 5000,
    timeout: 10000,
  });
}
```

Combine multiple layers of protection: database defaults, connection settings, and explicit transaction control in application code. This way, even if one layer fails (e.g., a misconfigured connection pool), the other layers maintain protection.
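One cheap extra layer is to centralize transaction creation behind a helper that refuses unsafe isolation levels. A sketch (the level names follow Prisma's convention, but the helper itself is hypothetical):

```typescript
type IsolationLevel =
  | "ReadUncommitted" | "ReadCommitted" | "RepeatableRead" | "Serializable";

// Strength ordering; a higher index means stronger guarantees.
const strength: IsolationLevel[] =
  ["ReadUncommitted", "ReadCommitted", "RepeatableRead", "Serializable"];

// Reject any transaction weaker than the configured floor (ReadCommitted
// by default), so a stray "ReadUncommitted" can never slip in by copy-paste.
function checkIsolation(
  requested: IsolationLevel,
  floor: IsolationLevel = "ReadCommitted"
): IsolationLevel {
  if (strength.indexOf(requested) < strength.indexOf(floor)) {
    throw new Error(`Isolation level ${requested} is below the floor ${floor}`);
  }
  return requested;
}
```

Routing every `isolationLevel` option through `checkIsolation(...)` turns a silent misconfiguration into a loud startup error.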
Certain architectural patterns provide structural protection against dirty reads and related concurrency issues.
1. Read Replicas with Committed Data Only
Separating read traffic onto read replicas that only expose committed data:

- The primary ships changes to replicas through its replication log, and replicas make transactions visible only once they have committed on the primary.
- Read queries routed to replicas therefore can never observe in-flight writes.
- The trade-off is replication lag: replica reads may be slightly stale, but never dirty.
2. Event Sourcing with Committed Events
Event sourcing architectures naturally prevent dirty reads when events are only published after commit:
```typescript
// Event Sourcing: Events are only visible after commit

interface DomainEvent {
  eventId: string;
  aggregateId: string;
  eventType: string;
  data: any;
  timestamp: Date;
}

class AccountAggregate {
  private uncommittedEvents: DomainEvent[] = [];

  transfer(amount: number, toAccount: string) {
    // Create event but don't publish yet
    this.uncommittedEvents.push({
      eventId: uuid(),
      aggregateId: this.id,
      eventType: 'FundsTransferred',
      data: { amount, toAccount },
      timestamp: new Date(),
    });
  }

  async commit(eventStore: EventStore): Promise<void> {
    // Atomically save events to event store
    await eventStore.saveEvents(this.uncommittedEvents);

    // Only NOW publish events for other services
    // Other services never see uncommitted transfers
    await eventBus.publish(this.uncommittedEvents);

    this.uncommittedEvents = [];
  }
}

// Consumers only receive committed events
// No dirty reads possible at the application level
```

3. CQRS (Command Query Responsibility Segregation)
Separating read models from write models ensures that all reads come from consistent, committed snapshots.
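A minimal sketch of the read side, assuming (hypothetically) a projector subscribed to a committed-event stream: because the projection is built only from committed events, queries against it can never surface an uncommitted write.

```typescript
interface CommittedEvent { aggregateId: string; eventType: string; data: any; }

// Read model: a denormalized balance view, updated only by committed events.
class BalanceReadModel {
  private balances = new Map<string, number>();

  // Called by the event bus AFTER the write side has committed.
  apply(event: CommittedEvent): void {
    const current = this.balances.get(event.aggregateId) ?? 0;
    if (event.eventType === "FundsDeposited") {
      this.balances.set(event.aggregateId, current + event.data.amount);
    } else if (event.eventType === "FundsTransferred") {
      this.balances.set(event.aggregateId, current - event.data.amount);
    }
  }

  // Queries see only the committed projection - possibly stale, never dirty.
  getBalance(accountId: string): number | undefined {
    return this.balances.get(accountId);
  }
}
```

An in-flight transfer simply hasn't been applied yet, so a query returns the last committed balance rather than a value that might be rolled back.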
These architectural patterns trade immediate consistency for guaranteed correctness. Replicas and CQRS read models may show slightly stale data (eventual consistency), but they never show uncommitted data that might be rolled back. For most applications, this trade-off is highly favorable.
Despite all the warnings, there are narrow scenarios where READ UNCOMMITTED might be acceptable. Understanding these helps avoid both over-restriction and under-protection.
Potentially Acceptable Use Cases:
| Scenario | Why It Might Be OK | Residual Risk |
|---|---|---|
| Approximate row counts | COUNT(*) on large tables where exact count unnecessary | Count might include rolled-back rows |
| Real-time monitoring dashboards | Showing 'approximately N active sessions' | Display may briefly show phantom data |
| Existence checks for logging | Checking if record exists for debug logging only | May log incorrect existence state |
| Long-running analytics on stable data | Historical data that's no longer being modified | New data is still at risk |
Criteria for Acceptable READ UNCOMMITTED:
All of these criteria must be met:

- The result is informational or approximate only; no business decision or write depends on it.
- An occasionally wrong value is harmless and self-correcting on the next read.
- The data is not used for auditing, billing, or anything with legal or financial weight.
- The performance benefit has actually been measured, not assumed.
If you're not absolutely certain a use case meets ALL the criteria above, do not use READ UNCOMMITTED. The performance benefit is rarely significant enough to justify the risk. Use READ COMMITTED SNAPSHOT or similar for non-blocking reads without dirty read exposure.
Better Alternatives:
For most 'performance' use cases that motivate READ UNCOMMITTED:

- READ COMMITTED SNAPSHOT (SQL Server) or any MVCC database gives non-blocking reads without dirty data.
- Read replicas offload heavy read traffic onto committed, slightly stale copies.
- Planner statistics (e.g., PostgreSQL's pg_class.reltuples) provide approximate row counts without scanning the table.
- Materialized views or caching handle expensive aggregate queries.
Preventing dirty reads is straightforward in principle but requires attention across multiple layers. Here's a consolidated prevention playbook:

1. Verify that your database's default isolation level is READ COMMITTED or stronger.
2. Audit your codebase for READ UNCOMMITTED and NOLOCK usage, and remove it.
3. On SQL Server, enable READ_COMMITTED_SNAPSHOT for non-blocking reads without dirty read exposure.
4. Set explicit isolation levels in connection pools and ORM transaction APIs rather than relying on implicit defaults.
5. Use SELECT ... FOR UPDATE (or SERIALIZABLE transactions) for critical read-then-write flows.
6. At the architecture level, prefer read replicas, event sourcing with post-commit publishing, or CQRS so uncommitted data never leaves the write path.
The Module in Review:
Across this module, we've built a comprehensive understanding of the Dirty Read Problem: its formal definition and mechanics, its real-world consequences, the rollback scenarios that make it dangerous, and the prevention strategies covered on this page.
Dirty reads are one of the most fundamental concurrency problems, and understanding them establishes a foundation for understanding all other transaction anomalies.
Congratulations! You now have a comprehensive, world-class understanding of the Dirty Read Problem. You can define it formally, explain its consequences, analyze rollback scenarios, and implement prevention strategies across different database systems and application architectures. This knowledge is foundational for transaction management and essential for building reliable database applications.