Throughout this module, we've established a fundamental tension in database systems: concurrent execution is essential for performance and resource utilization, yet uncontrolled interleaving of transactions can silently corrupt data.
The resolution of this tension is concurrency control—active mechanisms that manage how transactions interleave to preserve correctness while maximizing concurrency benefits.
This is not optional. A database without concurrency control is not a usable database. It's a data corruption engine that happens to be fast. Every production database system—from SQLite to Oracle to distributed systems like Spanner—implements sophisticated concurrency control. The question is never whether to control concurrency, but how.
By the end of this page, you will understand why concurrency control is non-negotiable, see real-world consequences of inadequate control, understand the core principles that underpin all concurrency control mechanisms, and preview the techniques (locking, timestamps, MVCC) that implement these principles.
Let's be absolutely clear about why passive approaches—hoping for the best, testing thoroughly, or trusting application code—are fundamentally inadequate for managing database concurrency.
Why 'Hope' Doesn't Work:
The number of possible interleavings is combinatorially explosive. For just 10 transactions with 10 operations each, there are more possible interleavings than atoms in the observable universe. Among these, the vast majority produce correct results, but some corrupt data, and nothing about unmanaged execution steers the system away from the bad ones.
Law of large numbers: At high transaction volumes, even improbable interleavings become inevitable. A 1-in-a-billion bad interleaving will occur multiple times per day on a busy system.
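To get a sense of the scale behind the interleaving count claimed above, consider the count itself. The sketch below is an illustrative Python calculation (not part of any database): operations within each transaction keep their relative order, so the number of interleavings is the multinomial coefficient 100! / (10!)^10.

```python
from math import factorial

# 10 transactions x 10 operations each; operations within a transaction keep
# their relative order, so the interleaving count is 100! / (10!)^10.
interleavings = factorial(100) // factorial(10) ** 10

print(f"{float(interleavings):.2e}")  # ~2.4e+92, versus roughly 1e80 atoms
                                      # in the observable universe
```

No test suite, and no amount of production uptime, samples more than a vanishing fraction of that space.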
Why Testing Is Insufficient:
Testing cannot guarantee the absence of concurrency bugs: each test run exercises only one scheduler-dependent interleaving out of an astronomical number, and the harmful ones may simply never occur under test conditions.
A system that passes millions of test runs may still fail in production when a rare timing condition occurs.
Why Application Logic Cannot Substitute:
Developers sometimes attempt to handle concurrency at the application level—implementing their own locking, version checking, or retry logic. This approach has fundamental problems:
```
-- EXAMPLE: Application-level concurrency control attempt
-- Developer's plan: Read current balance, check, then update
-- Implemented in application code (pseudocode):

def withdraw(account_id, amount):
    # Step 1: Read current balance
    balance = db.query("SELECT balance FROM accounts WHERE id = ?", account_id)

    # Step 2: Check if sufficient funds
    if balance >= amount:
        # Step 3: Perform withdrawal
        new_balance = balance - amount
        db.execute("UPDATE accounts SET balance = ? WHERE id = ?",
                   new_balance, account_id)
        return True
    return False

-- PROBLEM: Race condition between Step 1 and Step 3!

-- Timeline:
-- T1: withdraw(1, 100)           T2: withdraw(1, 100)
-- ---------------------          ---------------------
-- Read balance: $150
--                                Read balance: $150
-- Check: 150 >= 100? Yes
--                                Check: 150 >= 100? Yes
-- Update: balance = 50
--                                Update: balance = 50 (OVERWRITES!)

-- Result: Two $100 withdrawals succeeded against a $150 account
-- Final balance: $50 (the second withdrawal should have failed;
-- honoring both would leave -$50)
-- Lost update! Application logic did NOT prevent this.

-- The developer might try:

def withdraw_v2(account_id, amount):
    db.execute("UPDATE accounts SET balance = balance - ? "
               "WHERE id = ? AND balance >= ?",
               amount, account_id, amount)
    return db.rows_affected > 0

-- This is better but still has issues:
-- - Other operations (reads, calculations) remain unprotected
-- - Complex operations spanning multiple tables are still vulnerable
-- - The developer must remember the pattern everywhere
-- - Still no guarantee of isolation for multi-step transactions
```

Application-level checks can complement database concurrency control but cannot replace it. The database remains the ultimate arbiter of data integrity because it sees all access—from all applications, all users, all paths. Relying solely on application logic creates gaps that will eventually be exploited by bugs, direct database access, or malicious actors.
When concurrency control fails—either through bugs in the database system or misconfigured isolation levels—the consequences can be severe. Let's examine real-world impact categories.
Financial Data Corruption:
Banking and payment systems are particularly vulnerable: lost updates can let two withdrawals both succeed against the same funds, dirty reads can expose transfers that are later rolled back, and non-serializable balance checks can allow overdrafts.
Even small errors compound: a $0.01 error occurring 10 million times per day is $100,000 daily loss.
| Domain | Scenario | Consequence | Root Cause |
|---|---|---|---|
| E-commerce | Inventory oversell | Orders placed for unavailable items | Lost update on stock count |
| Banking | Double withdrawal | Account overdrafts, bank loss | Non-serializable balance check |
| Healthcare | Duplicate medication order | Patient safety risk | Phantom read in order list |
| Ticketing | Double booking | Sold same seat twice | Lost update on availability |
| Trading | Position overcounting | Regulatory violations | Dirty read of pending trades |
Data Integrity Violations:
Beyond financial impact, concurrency failures can violate fundamental data integrity: invariants that span multiple rows or tables (totals matching their detail rows, "exactly one active record per customer") can be silently broken, and audit histories can end up recording a sequence of events that never actually occurred.
The Compounding Effect:
Concurrency bugs rarely occur in isolation. One corrupted record affects every query that reads it: derived values inherit the error, reports and caches propagate it, and downstream decisions are made on bad data.
The most dangerous concurrency failures are silent ones—data corruption that isn't immediately detected. Unlike crashes (which are obvious), corrupted data may propagate for days or weeks before being discovered, by which point recovery may be impossible. This is why prevention through proper concurrency control is infinitely better than detection after the fact.
Given the impossibility of testing all interleavings and the inadequacy of application-level control, how can we guarantee correctness? The answer lies in the serializability principle—the foundational concept of database concurrency control.
The Serializability Guarantee:
A concurrent execution is serializable if its outcome (final database state and all values read by transactions) is identical to some serial execution of the same transactions.
In other words: the transactions may have executed concurrently with interleaved operations, but the result is indistinguishable from running them one at a time in some order.
Why Serializability Works:
```
-- Understanding serializability

-- Three transactions execute concurrently:
-- T1: Credit Account A with $100
-- T2: Debit Account A with $50
-- T3: Read and display Account A balance

-- Serial executions possible (6 orderings for 3 transactions):
-- S1: T1, T2, T3 → A +100 -50 = A +50, T3 reads A+50
-- S2: T1, T3, T2 → A +100, T3 reads A+100, then -50
-- S3: T2, T1, T3 → A -50 +100 = A +50, T3 reads A+50
-- S4: T2, T3, T1 → A -50, T3 reads A-50, then +100
-- S5: T3, T1, T2 → T3 reads original A, then +100 -50
-- S6: T3, T2, T1 → T3 reads original A, then -50 +100

-- A CONCURRENT EXECUTION is serializable if it produces a result
-- identical to one of S1-S6 (any one will do).

-- Example concurrent execution (interleaved):
-- r3(A) r1(A) w1(A) r2(A) w2(A)
--
-- Analysis:
-- T3 reads the original A (before any changes)
-- T1 then T2 modify A
--
-- Is this equivalent to some serial schedule?
-- Matches S5: T3, T1, T2 (it would match S6 only if T2 had read before T1's write)
--
-- YES - this concurrent execution is SERIALIZABLE!

-- Example NON-serializable execution:
-- r1(A) r2(A) w1(A) w2(A) r3(A)
--
-- T1 and T2 both read the original A, then both write based on that value.
-- T1's write is overwritten by T2's write (LOST UPDATE!)
-- This outcome doesn't match any of S1-S6.
--
-- NO - this concurrent execution is NOT serializable.
```

The Practical Implication:
Concurrency control mechanisms are designed to ensure that whatever interleaving naturally occurs will be serializable. They do this by constraining which interleavings are allowed: blocking conflicting operations until it is safe to proceed (locking), ordering operations by transaction timestamps, or validating at commit time that no conflict occurred.
As long as the mechanism guarantees serializability, programmers can write transactions as if they execute in isolation. The database handles the complexity of making concurrent execution safe.
Serializability is a powerful abstraction: it allows programmers to reason about transactions individually, assuming isolation, while the database maintains this illusion across concurrent executions. This separation of concerns is what makes complex database applications feasible to write correctly.
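One practical consequence: a database enforcing serializability may abort one of two conflicting transactions rather than let a bad interleaving complete, so applications pair "write each transaction as if it runs alone" with a simple retry loop. Below is a minimal sketch in the same pseudocode style as the earlier withdraw examples; the `db` helper and the exception name are illustrative assumptions, not a specific driver's API.

```python
class SerializationFailure(Exception):
    """Assumed to be raised by the db helper when the database aborts a
    transaction to preserve serializability (SQLSTATE 40001 in PostgreSQL)."""

def transfer(db, src, dst, amount, max_retries=3):
    for _ in range(max_retries):
        try:
            db.execute("BEGIN ISOLATION LEVEL SERIALIZABLE")
            db.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                       amount, src)
            db.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                       amount, dst)
            db.execute("COMMIT")
            return True
        except SerializationFailure:
            db.execute("ROLLBACK")  # discard the aborted attempt and try again
    return False
```

Note that the transaction body contains no concurrency logic at all; the database decides whether the interleaving was safe.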
While serializability is the gold standard for correctness, it comes with performance costs. Database systems offer isolation levels—different degrees of concurrency control with different performance and correctness trade-offs.
The SQL Standard Isolation Levels:
In order of increasing strictness (and typically decreasing performance), the SQL standard defines Read Uncommitted, Read Committed, Repeatable Read, and Serializable, summarized in the table below.
Applications choose isolation levels based on their tolerance for anomalies versus their need for performance.
| Level | Dirty Read | Non-Repeatable Read | Phantom | Typical Use Case |
|---|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible | Approximate analytics, non-critical dashboards |
| Read Committed | Prevented | Possible | Possible | Most OLTP applications (default) |
| Repeatable Read | Prevented | Prevented | Possible | Reports requiring consistent view |
| Serializable | Prevented | Prevented | Prevented | Financial transactions, safety-critical systems |
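To make one row of the table concrete, here is how a non-repeatable read appears at Read Committed and disappears at Repeatable Read, again in the pseudocode style of the earlier examples (the `db` helper is an illustrative assumption):

```python
def balance_report(db):
    # At READ COMMITTED, each statement sees the latest committed data,
    # so the two reads below can disagree if another session commits a
    # change to account 1 between them.
    db.execute("BEGIN ISOLATION LEVEL READ COMMITTED")
    first = db.query("SELECT balance FROM accounts WHERE id = 1")
    # ... another session commits: UPDATE accounts SET balance = balance - 100
    second = db.query("SELECT balance FROM accounts WHERE id = 1")
    db.execute("COMMIT")
    return first, second  # may differ: a non-repeatable read

# At REPEATABLE READ or SERIALIZABLE, the whole transaction reads from one
# consistent snapshot, so both reads return the same value.
```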
Choosing the Right Level:
The choice of isolation level comes down to understanding your application's requirements: which anomalies it can tolerate, which operations are correctness-critical, and how much concurrency it must sustain.
Mixed Levels:
Advanced applications may use different isolation levels for different transactions: Serializable for money movement or inventory reservation, Read Committed for dashboards and routine reads.
This provides appropriate protection where needed while minimizing performance impact.
```sql
-- Setting isolation level in SQL

-- PostgreSQL: Set for the current transaction
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Critical operations here
COMMIT;

-- MySQL: Set for the session
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- SQL Server: Set for the connection (remains in effect until changed)
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
-- Operations here
COMMIT;

-- Oracle: supports "Read Committed" and "Serializable" only
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Checking the current isolation level:

-- PostgreSQL
SHOW TRANSACTION ISOLATION LEVEL;

-- MySQL
SELECT @@transaction_isolation;

-- SQL Server
DBCC USEROPTIONS;

-- Best practice: explicitly set the isolation level when it matters
-- rather than relying on defaults, which vary by database.
```

Default isolation levels differ by database: PostgreSQL and SQL Server default to Read Committed; MySQL/InnoDB defaults to Repeatable Read; Oracle defaults to Read Committed. Know your database's default and explicitly set a higher level when needed for critical operations.
Database systems implement concurrency control through several fundamental mechanisms. Each has trade-offs in terms of performance, complexity, and the types of anomalies they prevent.
The Major Concurrency Control Mechanisms:
Locking (Pessimistic Control): transactions acquire shared locks to read and exclusive locks to write; conflicting transactions wait until the lock is released.
Timestamp Ordering (Optimistic/Deterministic): each transaction is assigned a timestamp, and operations are accepted only if they are consistent with timestamp order; violating transactions are aborted and restarted.
Multi-Version Concurrency Control (MVCC): writers create new versions of data instead of overwriting in place, so readers see a consistent snapshot without blocking writers, and vice versa.
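The difference between the pessimistic and optimistic ends of this spectrum is easiest to see side by side. Here is a sketch in the pseudocode style used earlier; the `db` helper and the `version` column are illustrative assumptions, not a particular database's API.

```python
# Pessimistic (locking): lock the row up front, so no conflicting write can
# sneak in between the read and the update.
def withdraw_pessimistic(db, account_id, amount):
    db.execute("BEGIN")
    balance = db.query("SELECT balance FROM accounts WHERE id = ? FOR UPDATE",
                       account_id)               # blocks other writers
    if balance < amount:
        db.execute("ROLLBACK")
        return False
    db.execute("UPDATE accounts SET balance = ? WHERE id = ?",
               balance - amount, account_id)
    db.execute("COMMIT")
    return True

# Optimistic (validation): read without locking, then make the update
# conditional on nothing having changed in the meantime.
def withdraw_optimistic(db, account_id, amount):
    row = db.query("SELECT balance, version FROM accounts WHERE id = ?",
                   account_id)
    if row.balance < amount:
        return False
    db.execute("UPDATE accounts SET balance = ?, version = version + 1 "
               "WHERE id = ? AND version = ?",
               row.balance - amount, account_id, row.version)
    return db.rows_affected > 0   # 0 rows means another transaction got there first
```

The pessimistic version trades waiting for certainty; the optimistic version never waits but must be prepared to find that zero rows were updated and retry.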
Trade-offs Between Mechanisms:
| Mechanism | Strengths | Weaknesses | Best For |
|---|---|---|---|
| Locking | Proven, predictable | Blocking, deadlocks | Mixed workloads |
| Timestamp | No deadlocks | High abort rate under contention | Distributed systems |
| MVCC | Readers don't block | Storage overhead, garbage collection | Read-heavy workloads |
| Optimistic | Maximum concurrency | Wasted work on abort | Low contention |
Modern databases often combine mechanisms: PostgreSQL and MySQL/InnoDB, for example, use MVCC so that readers never block while still taking row-level locks for writes, and PostgreSQL layers optimistic validation on top (Serializable Snapshot Isolation) to achieve full serializability.
There is no universally 'best' concurrency control mechanism. The right choice depends on workload characteristics (read vs. write ratio), contention patterns (hot spots vs. distributed access), consistency requirements (full serializability vs. acceptable anomalies), and performance goals (throughput vs. latency).
Concurrency control has costs—reduced throughput, increased latency, higher resource usage. Understanding these costs helps in making informed decisions about isolation levels and mechanism choices.
Costs of Concurrency Control: lock management and validation overhead, time spent blocked waiting for other transactions, deadlock detection and resolution, extra storage and cleanup for old versions (MVCC), and work thrown away when transactions abort.
Benefits of Concurrency Control: results that are correct under any interleaving, integrity that holds even under heavy load, application code that can be written as if each transaction runs alone, and failures that surface as clean aborts rather than silent corruption.
```sql
-- Observing concurrency control impact (PostgreSQL example)

-- Check sessions currently waiting on locks
SELECT * FROM pg_stat_activity WHERE wait_event_type = 'Lock';

-- View lock waits: who is blocked, and by whom
SELECT blocked_locks.pid          AS blocked_pid,
       blocked_activity.usename   AS blocked_user,
       blocking_locks.pid         AS blocking_pid,
       blocking_activity.usename  AS blocking_user,
       blocked_activity.query     AS blocked_statement
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity
  ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
  ON blocking_locks.locktype = blocked_locks.locktype
 AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
 AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
 AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity
  ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;

-- Check for deadlock occurrences
SELECT * FROM pg_stat_database WHERE deadlocks > 0;

-- MVCC bloat monitoring (dead row versions awaiting vacuum)
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup * 100.0 / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 0
ORDER BY n_dead_tup DESC;

-- Serialization failure rate (for Serializable isolation)
SELECT setting FROM pg_settings WHERE name = 'default_transaction_isolation';
-- Then check application logs for "could not serialize access" errors
```

The 'right' amount of concurrency control depends on your requirements. Over-controlling (unnecessarily strict isolation) wastes performance. Under-controlling (too-weak isolation) risks data corruption. The goal is the minimum control level that meets your correctness requirements.
This module has established the foundational understanding of why concurrency control exists and matters. The subsequent modules in this chapter dive deep into each specific concurrency problem, followed by chapters covering the mechanisms that solve them.
The Learning Path Ahead:
This Chapter (Concurrency Problems): the remaining modules examine the lost update, dirty read, non-repeatable read, and phantom read anomalies, then the isolation levels that control them.
Next Chapters: the mechanisms that solve these problems, first lock-based protocols, then timestamp ordering and MVCC, as mapped in the table below.
| Topic | Where Covered | Dependencies |
|---|---|---|
| Lost Update details | Module 2 | This module |
| Dirty Read details | Module 3 | This module |
| Non-Repeatable Read details | Module 4 | This module |
| Phantom Read details | Module 5 | This module |
| Isolation Levels | Module 6 | Modules 2-5 |
| Lock-based solutions | Chapter 24 | This chapter |
| Timestamp/MVCC solutions | Chapter 25 | This chapter |
How to Study This Material:
Master each problem individually: Understand the specific scenario, mechanism, and consequences of each concurrency anomaly.
Relate problems to isolation levels: Learn which isolation levels prevent which problems and why.
Connect problems to solutions: When studying locking and MVCC later, connect back to which problems each mechanism solves.
Practice with examples: Work through interleaving scenarios to build intuition.
Think in producers and consumers: Concurrency problems occur when one transaction produces data that another consumes—understanding this flow helps identify risks.
The goal isn't to memorize definitions but to build mental models that let you recognize potential concurrency issues in real application code. As you learn each problem and solution, think about how they'd manifest in systems you work with—banking apps, e-commerce, booking systems, etc.
We have completed our foundational examination of concurrency in database systems. This module has established the vocabulary, concepts, and motivation that underpin all subsequent study of concurrency control.
The Big Picture:
Concurrency control is one of the most sophisticated aspects of database systems, decades in development and the subject of ongoing research. Yet its purpose is simple: allow the benefits of concurrent execution while preventing the problems concurrent execution can cause.
As you proceed through the detailed study of each concurrency problem and the mechanisms that solve them, keep this purpose in mind. Every lock acquired, every version maintained, every validation check performed exists to ensure that your data remains correct even when hundreds or thousands of transactions access it simultaneously.
You now have the foundation to understand why these mechanisms exist and what they're trying to achieve.
Congratulations! You have completed Module 1: Concurrency Overview of Chapter 23. You now understand why concurrent execution is both essential and dangerous, how interleaving creates the potential for problems, and why active concurrency control is non-negotiable. The following modules will examine each concurrency problem in detail, building the complete picture of what database systems must protect against.