Imagine a narrow bridge that can only support one vehicle at a time. Without traffic control, cars approaching from both directions would collide mid-span, causing catastrophic failure. The solution is obvious: install a traffic light that grants exclusive passage to one direction at a time.
Mutexes (Mutual Exclusion locks) are the traffic lights of concurrent programming. They ensure that when multiple threads need to access a shared resource—a bank account balance, a file, a database connection—only one thread can do so at any given moment. This seemingly simple concept is the foundation upon which all thread-safe systems are built.
But like many foundational concepts, the devil lies in the details. This page explores mutexes and their lock variants with the depth required to use them correctly in production systems.
By completing this page, you will understand the theoretical foundations of mutual exclusion, the anatomy of mutex implementations, different lock types (reentrant, read-write, spin), their performance characteristics, correct usage patterns, and common pitfalls that lead to bugs in production systems.
Before diving into solutions, we must viscerally understand the problem. The core issue is race conditions—scenarios where the behavior of a program depends on the relative timing of events, particularly the order in which threads execute.
Consider a simple counter increment operation:
counter = counter + 1
This single line of code appears atomic, but at the CPU level, it involves three distinct operations:
1. READ the current value of counter from memory into a CPU register
2. ADD 1 to the value in the register
3. WRITE the result from the register back to memory

When two threads execute this operation simultaneously on an initial counter value of 0, here's what can go wrong:
| Step | Thread A | Thread B | Register A | Register B | Memory |
|---|---|---|---|---|---|
| Initial | — | — | ? | ? | 0 |
| 1 | READ counter | — | 0 | ? | 0 |
| 2 | — | READ counter | 0 | 0 | 0 |
| 3 | ADD 1 | — | 1 | 0 | 0 |
| 4 | — | ADD 1 | 1 | 1 | 0 |
| 5 | WRITE counter | — | 1 | 1 | 1 |
| 6 | — | WRITE counter | 1 | 1 | 1 |
Both threads incremented the counter, but the final value is 1 instead of 2. Thread B's update was 'lost' because it read the value before Thread A's write completed. In a banking system, this means money literally disappearing. In inventory systems, it means overselling products. In metrics systems, it means incorrect analytics driving wrong business decisions.
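You can reproduce the lost update directly. The following is a minimal, self-contained sketch (class name and iteration count are illustrative); on multi-core hardware it almost always prints a total well below 200,000:

```java
// Minimal demonstration of the lost-update race (unprotected counter)
public class LostUpdateDemo {
    static int counter = 0; // shared, no synchronization

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter = counter + 1; // read-modify-write, not atomic
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start(); b.start();
        a.join(); b.join();
        // Almost always less than 200000: updates were lost
        System.out.println("Expected 200000, got " + counter);
    }
}
```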
This problem is particularly insidious because:
- It is timing-dependent: the same code can pass thousands of test runs and fail only under production load.
- Observation changes the outcome: adding logging or attaching a debugger shifts thread timing, often making the bug vanish (a 'Heisenbug').
- Failures leave little evidence: a lost update looks like any other write, so the corruption is usually discovered long after its cause.
Mutexes solve this by ensuring atomicity at a higher level than individual CPU instructions. They create protected regions called critical sections where only one thread can execute at a time.
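As a point of comparison, here is the counter made safe by guarding the read-modify-write with a mutex. This is a minimal sketch using Java's ReentrantLock; the class and method names are illustrative:

```java
import java.util.concurrent.locks.ReentrantLock;

// The same counter, now protected by a mutex
public class SafeCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int counter = 0;

    public void increment() {
        lock.lock();               // acquire: at most one thread proceeds
        try {
            counter = counter + 1; // critical section: now effectively atomic
        } finally {
            lock.unlock();         // release: always, even on exceptions
        }
    }

    public int get() {
        lock.lock();
        try {
            return counter;
        } finally {
            lock.unlock();
        }
    }
}
```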
A mutex is conceptually simple: it has two states (locked/unlocked) and two operations (acquire/release). But understanding how these operations work at a lower level is essential for using mutexes effectively.
The Mutex Contract:
- Mutual exclusion: at most one thread holds the lock at any moment; all others block until it is released.
- Ownership: only the thread that acquired the lock may release it.
- Memory visibility: all writes made while holding the lock become visible to the next thread that acquires it.
The last guarantee—memory visibility—is often overlooked but critically important. Modern CPUs reorder instructions and cache values aggressively. A mutex release includes a memory barrier (or fence) that ensures all writes by the releasing thread are visible to subsequent lock acquirers.
Pseudocode for Mutex Operations:
```java
// Conceptual mutex structure
class Mutex {
    private boolean locked = false;
    private Thread owner = null;
    private Queue<Thread> waitQueue = new Queue<>();

    // Acquire the lock (blocking)
    void lock() {
        while (true) {
            // Atomic check-and-set operation (hardware supported)
            if (atomicCompareAndSwap(locked, false, true)) {
                // Successfully acquired
                owner = currentThread();
                memoryBarrier(); // Ensure visibility
                return;
            }
            // Failed to acquire - add to wait queue and sleep
            // (conceptual: real implementations guard against the race
            // between enqueueing here and the wakeup in unlock())
            waitQueue.enqueue(currentThread());
            parkThread(currentThread()); // Yield CPU
        }
    }

    // Release the lock
    void unlock() {
        if (owner != currentThread()) {
            throw new IllegalMonitorStateException("Not lock owner");
        }
        memoryBarrier(); // Flush all writes
        owner = null;
        locked = false;

        // Wake up a waiting thread
        if (!waitQueue.isEmpty()) {
            Thread next = waitQueue.dequeue();
            unparkThread(next);
        }
    }
}
```

Critical Implementation Details:
1. Atomic Compare-And-Swap (CAS):
The atomicCompareAndSwap operation is the magic that makes everything work. It atomically checks whether a variable holds an expected value and, only if so, updates it to a new value. On x86 this is a single instruction (CMPXCHG); on ARM it is a small load-linked/store-conditional pair (LDREX/STREX). Either way, the check-and-update cannot be interleaved with another thread's access.
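Java exposes CAS through java.util.concurrent.atomic. As an illustration, here is a toy spin lock built directly on compareAndSet. This is a sketch, not a production lock (real spin locks add backoff and fairness):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy spin lock built directly on the CAS primitive
class CasSpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    void lock() {
        // compareAndSet(expected, new) is CAS: it succeeds only if
        // the current value equals 'expected', atomically
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // hint that we are busy-waiting (Java 9+)
        }
    }

    void unlock() {
        locked.set(false); // volatile write doubles as the release barrier
    }
}
```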
2. Wait Queue Management: When a thread fails to acquire the lock, it must either spin (busy-wait) or sleep. Sleeping requires OS kernel involvement to manage the wait queue and wake threads. This is expensive but necessary for long waits.
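The parkThread/unparkThread calls in the pseudocode correspond to real primitives; in Java they map onto LockSupport.park and LockSupport.unpark. A minimal sketch (the sleep exists only to make the ordering visible in a demo):

```java
import java.util.concurrent.locks.LockSupport;

public class ParkDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread waiter = new Thread(() -> {
            System.out.println("waiter: parking (descheduled, no CPU burned)");
            // Real code re-checks its condition: park may return spuriously
            LockSupport.park();
            System.out.println("waiter: woken up");
        });
        waiter.start();
        Thread.sleep(100);           // demo only: give the waiter time to park
        LockSupport.unpark(waiter);  // the 'wake one waiter' step of unlock()
        waiter.join();
    }
}
```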
3. Memory Barriers: The barriers ensure the happens-before relationship: everything that happened before a thread released the lock is visible to the thread that subsequently acquires it. Without this, threads might see stale cached values.
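To make the happens-before guarantee concrete, here is a small sketch (the Publisher class is hypothetical): because both methods use the same lock, a reader that observes ready == true is guaranteed to also observe the payload written before the unlock.

```java
// Publishing data safely via a lock's happens-before edge
class Publisher {
    private final Object lock = new Object();
    private int payload;
    private boolean ready = false;

    void publish(int value) {
        synchronized (lock) {
            payload = value; // write 1
            ready = true;    // write 2
        } // unlock: both writes become visible to the next acquirer
    }

    Integer tryConsume() {
        synchronized (lock) { // lock: sees everything before the prior unlock
            return ready ? payload : null;
        }
    }
}
```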
Mutexes ultimately rely on hardware atomic instructions. Without CPU support for atomic operations, implementing correct synchronization would require disabling interrupts—possible only in kernel mode. User-space locks leverage kernel assistance (futexes on Linux, Critical Sections on Windows) to efficiently coordinate without burning CPU cycles.
Not all locks are created equal. Different concurrency scenarios call for different lock behaviors. Understanding the variations helps you select the right tool for each situation.
The Standard Mutex (Non-Reentrant)
The simplest form of mutex allows exactly one thread to hold it. If the same thread attempts to acquire a lock it already holds, it deadlocks against itself.
Characteristics:
- Exactly one owner at a time; all other threads block on acquire.
- Not reentrant: the owning thread deadlocks if it tries to acquire again (see the example below).
- Minimal state and bookkeeping, which keeps it simple and cheap.
```java
// Java provides ReentrantLock, but you can use synchronized
// (which IS reentrant) or a Semaphore(1) for non-reentrant behavior
import java.util.concurrent.Semaphore;

class NonReentrantLockExample {
    // Semaphore with 1 permit acts as non-reentrant mutex
    private final Semaphore mutex = new Semaphore(1);
    private int sharedCounter = 0;

    public void increment() throws InterruptedException {
        mutex.acquire(); // Block until permit available
        try {
            sharedCounter++; // Protected critical section
        } finally {
            mutex.release(); // Always release in finally
        }
    }

    // WARNING: This would deadlock!
    public void nestedOperation() throws InterruptedException {
        mutex.acquire();
        try {
            // This will block forever - we already hold the permit
            // increment(); // DEADLOCK!
        } finally {
            mutex.release();
        }
    }
}
```
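For contrast, synchronized (and ReentrantLock) track the owning thread and a hold count, so the nested call that deadlocks above becomes legal. A minimal sketch (class name illustrative):

```java
class ReentrantExample {
    private int sharedCounter = 0;

    public synchronized void increment() {
        sharedCounter++;
    }

    // Safe: synchronized is reentrant, so re-acquiring our own
    // monitor just bumps the hold count instead of deadlocking
    public synchronized void nestedOperation() {
        increment(); // no deadlock
    }
}
```

Acquiring locks correctly is only half the battle. How you structure code around locks determines whether your concurrent programs are robust or riddled with subtle bugs.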
```java
// BAD: I/O inside critical section
void updateFromServer() {
    lock.lock();
    try {
        // Network call takes 100ms!
        var data = httpClient.fetch(url);
        cache.put(key, data);
    } finally {
        lock.unlock();
    }
}
```
```java
// GOOD: I/O outside critical section
void updateFromServer() {
    // Network call without lock
    var data = httpClient.fetch(url);
    lock.lock();
    try {
        cache.put(key, data); // Only this
    } finally {
        lock.unlock();
    }
}
```

Moving a 100ms network call outside the lock transforms a system from 10 operations/second (fully serialized behind the lock) to potentially 1,000+ operations/second (parallel I/O with minimal lock contention). Critical section minimization is one of the highest-leverage performance optimizations in concurrent systems.
Even experienced developers make locking mistakes. Understanding common pitfalls helps you write more robust concurrent code from the start.
```java
// DEADLOCK SCENARIO
class BankAccount {
    private final Object lock = new Object();
    private int balance;

    // Thread A: transfer(accountX, accountY, 100)
    // Thread B: transfer(accountY, accountX, 50)
    // If they interleave, both wait forever
    void transferDeadlock(BankAccount from, BankAccount to, int amt) {
        synchronized (from.lock) {     // A locks X, B locks Y
            // Context switch!
            synchronized (to.lock) {   // A waits for Y, B waits for X
                from.balance -= amt;   // Neither reaches here
                to.balance += amt;
            }
        }
    }

    // FIX: Always lock in consistent order (by account ID)
    void transferSafe(BankAccount from, BankAccount to, int amt) {
        // hashCode as a stand-in for a unique account ID
        // (production code needs a tie-breaker for equal hashCodes)
        BankAccount first = from.hashCode() < to.hashCode() ? from : to;
        BankAccount second = first == from ? to : from;
        synchronized (first.lock) {
            synchronized (second.lock) {
                from.balance -= amt;
                to.balance += amt;
            }
        }
    }
}
```
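Consistent ordering is the standard fix. Another common defense, sketched here with ReentrantLock (class and field names illustrative), is tryLock with a timeout, which turns a potential deadlock into a retryable failure:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TimeoutTransfer {
    private final ReentrantLock lock = new ReentrantLock();
    private int balance;

    // Returns false instead of deadlocking if either lock stays busy too long
    static boolean transfer(TimeoutTransfer from, TimeoutTransfer to, int amt)
            throws InterruptedException {
        if (!from.lock.tryLock(50, TimeUnit.MILLISECONDS)) return false;
        try {
            if (!to.lock.tryLock(50, TimeUnit.MILLISECONDS)) return false;
            try {
                from.balance -= amt;
                to.balance += amt;
                return true;
            } finally {
                to.lock.unlock();
            }
        } finally {
            from.lock.unlock();
        }
    }
}
```

Lock performance depends on contention level, critical section duration, and hardware factors. Understanding these trade-offs helps you choose the right lock type and optimize for your workload.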
| Lock Type | Uncontended Cost | Contended Cost | Best For | Avoid When |
|---|---|---|---|---|
| Basic Mutex | ~25ns | ~1-10μs | General purpose | Very short critical sections |
| Spin Lock | ~5ns | Burns CPU | < 1μs hold time | Single core, long waits |
| Reentrant Lock | ~30ns | ~1-10μs | Nested locking | Performance-critical paths |
| Read-Write Lock | ~35ns | ~1-10μs | 80% reads | Write-heavy workloads |
| Biased Lock | ~1ns | ~25ns first time | Thread-local access | High thread contention |
Uncontended lock acquisition is remarkably cheap—often just a single atomic instruction. The real cost comes from contention: when threads must wait. Under high contention, the choice of lock type matters far less than reducing contention itself through better data structures or finer granularity.
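As one concrete illustration of the table's read-write row, here is a sketch using Java's ReentrantReadWriteLock (class name illustrative): readers proceed in parallel, and only writers serialize.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Read-write lock: many concurrent readers, exclusive writers
class CachedValue {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private int value;

    int read() {
        rw.readLock().lock();    // shared: other readers may enter too
        try {
            return value;
        } finally {
            rw.readLock().unlock();
        }
    }

    void write(int v) {
        rw.writeLock().lock();   // exclusive: blocks readers and writers
        try {
            value = v;
        } finally {
            rw.writeLock().unlock();
        }
    }
}
```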
Key Performance Insights:
- Uncontended acquisition costs only nanoseconds, so don't avoid locks out of reflex.
- Contention, not locking itself, is the dominant cost; measure before switching lock types.
- Shrinking critical sections and splitting hot locks into finer-grained ones usually beats any exotic lock choice.
We've explored mutexes and locks from first principles to production patterns. Let's consolidate the key takeaways:
- Race conditions arise because read-modify-write sequences are not atomic; lost updates corrupt state silently.
- A mutex guarantees mutual exclusion, ownership, and memory visibility, built on hardware CAS plus OS-assisted waiting.
- Match the lock variant to the workload: non-reentrant for simplicity, reentrant for nested calls, read-write for read-heavy data, spin locks for very short critical sections.
- Keep critical sections minimal, never perform I/O while holding a lock, always release in finally, and acquire multiple locks in a consistent global order.
What's Next:
Mutexes are powerful but limited to binary (locked/unlocked) states. The next page explores Semaphores—a generalization that allows controlling access to a limited number of resources, enabling patterns like connection pools, rate limiting, and bounded parallelism.
You now understand mutexes and locks at a level sufficient for professional concurrent programming. You can select appropriate lock types, implement them correctly with exception safety, and avoid common pitfalls. Next, we'll explore semaphores—locks that count.