A hospital emergency room has a sophisticated triage system. Urgent cases receive immediate attention, but no patient waits forever. Priority is balanced with fairness. Timeouts trigger escalation. The system is designed not just to treat the most urgent cases, but to ensure every patient eventually receives care.
Database concurrency control requires the same holistic design. Individual mechanisms—fairness, priority, timeouts—each address part of the starvation problem. But building a truly starvation-free system requires integrating these mechanisms into a coherent strategy. The goal isn't just preventing obvious starvation scenarios; it's designing systems where starvation is structurally impossible.
This page synthesizes everything we've learned into practical starvation prevention strategies applicable to real database systems: combining fairness policies with priority and timeouts, implementing defense in depth, designing for production workloads, and establishing monitoring that detects emerging starvation. You'll leave with a practical framework for building starvation-free database systems.
Effective starvation prevention employs defense in depth—multiple layers of protection that work together. If one layer fails, others catch the problem. This three-layer model provides comprehensive protection:
Layer 1: Structural Prevention
These are proactive design choices that make starvation difficult or impossible by construction:
Layer 2: Dynamic Detection
Even with good structural design, workload anomalies can create unexpected starvation conditions. Detection systems identify emerging problems:
Layer 3: Intervention & Recovery
When prevention fails and detection identifies problems, intervention mechanisms resolve them:
Structural prevention is cheapest—it's built in and requires no runtime overhead. Detection is moderately expensive but essential for visibility. Intervention is most expensive (transactions are aborted, work lost) but necessary as a fallback. Invest effort up front in good structural design.
A fair lock manager is the foundation of starvation prevention. Here is a complete design that integrates fairness, priority, and timeout:
class FairLockManager:
    """
    Lock manager with guaranteed starvation freedom.

    Properties:
    - Bounded bypass: at most MAX_BYPASS requests can pass a waiting request
    - Priority with aging: effective priority increases with wait time
    - Timeout enforcement: no wait exceeds its timeout
    - Position preservation: retried transactions keep their original position
    """

    # Configuration
    MAX_BYPASS = 10
    AGING_RATE = 1.0        # priority units per second
    DEFAULT_TIMEOUT = 30    # seconds

    # State
    lock_tables = ConcurrentHashMap<ResourceId, LockTable>()
    ticket_counter = AtomicLong(0)

    def request_lock(transaction, resource, lock_type, timeout=None):
        timeout = timeout or DEFAULT_TIMEOUT
        table = lock_tables.get_or_create(resource)

        # Determine ticket number
        if transaction.has_preserved_ticket(resource):
            ticket = transaction.get_preserved_ticket(resource)
        else:
            ticket = ticket_counter.increment_and_get()

        request = LockRequest(
            transaction=transaction,
            resource=resource,
            type=lock_type,
            ticket=ticket,
            base_priority=transaction.base_priority,
            request_time=now(),
            bypass_count=0
        )

        # Try to grant immediately
        if try_grant_immediately(table, request):
            return LockGranted(request)

        # Add to wait queue
        table.wait_queue.add(request)

        # Wait with timeout
        deadline = now() + timeout
        while now() < deadline:
            if request.granted:
                return LockGranted(request)
            wait_until(deadline, or_until=request.granted_signal)

        # Timeout - preserve ticket for potential retry
        table.wait_queue.remove(request)
        transaction.preserve_ticket(resource, ticket)
        raise LockTimeoutError(resource, timeout)

    def try_grant_immediately(table, request):
        """Attempt to grant without waiting"""
        # Check compatibility with current holders
        for holder in table.holders:
            if conflicts(holder.type, request.type):
                return False

        # Check if anyone with a better position is waiting
        for waiter in table.wait_queue:
            if should_serve_before(waiter, request):
                return False

        # Grant immediately
        table.holders.add(request)
        request.granted = True
        return True

    def should_serve_before(existing, new_request):
        """Determine if existing request should be served before new_request"""
        # Conflicting lock types?
        if not conflicts(existing.type, new_request.type):
            return False  # Compatible, can serve in parallel

        # Compare effective priority (includes aging)
        existing_priority = effective_priority(existing)
        new_priority = effective_priority(new_request)

        if existing_priority > new_priority:
            return True
        if existing_priority < new_priority:
            # Check bypass limit
            if existing.bypass_count >= MAX_BYPASS:
                return True  # Existing has been bypassed enough
            existing.bypass_count += 1
            return False

        # Equal priority - use ticket order (FCFS)
        return existing.ticket < new_request.ticket

    def effective_priority(request):
        """Calculate priority with aging"""
        wait_time = now() - request.request_time
        age_bonus = wait_time.seconds * AGING_RATE
        return request.base_priority + age_bonus

    def release_lock(transaction, resource):
        table = lock_tables.get(resource)
        if not table:
            return

        # Remove from holders
        table.holders.remove_by_transaction(transaction)

        # Grant to highest-priority waiters
        grant_waiting_requests(table)

    def grant_waiting_requests(table):
        """Grant locks to eligible waiters after a release"""
        # Sort by effective priority, then ticket
        sorted_waiters = sorted(
            table.wait_queue,
            key=lambda r: (-effective_priority(r), r.ticket)
        )
        for waiter in sorted_waiters:
            if can_grant(table, waiter):
                table.wait_queue.remove(waiter)
                table.holders.add(waiter)
                waiter.granted = True
                waiter.granted_signal.notify()

This design ensures starvation-freedom through: (1) Bounded bypass—no request can be bypassed more than MAX_BYPASS times; (2) Priority aging—waiting increases priority; (3) Ticket preservation—retried transactions keep their original queue position; (4) Timeout—a guaranteed upper bound on wait time.
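The bounded-bypass rule is the piece that turns priority scheduling into a starvation-freedom guarantee. The sketch below is a simplified, runnable Python model of that rule alone (no aging, and with illustrative names such as `serve_one`): a higher-priority request may overtake a given waiter at most MAX_BYPASS times, after which the waiter is served regardless of priority.

```python
MAX_BYPASS = 3  # small bound for a short demo; the manager above uses 10

def serve_one(queue):
    """Serve one request from `queue` (dicts with 'name', 'priority',
    'bypass_count'), honoring the bounded-bypass rule."""
    # Highest priority wins; ties go to arrival order (list position).
    candidate = max(queue, key=lambda r: r['priority'])
    for waiter in queue:
        if waiter is candidate:
            break
        if waiter['bypass_count'] >= MAX_BYPASS:
            candidate = waiter  # bypass budget exhausted: serve the waiter
            break
        waiter['bypass_count'] += 1  # candidate overtakes this waiter
    queue.remove(candidate)
    return candidate['name']

# One low-priority request competes with a stream of high-priority arrivals.
queue = [{'name': 'low', 'priority': 1, 'bypass_count': 0}]
served = []
for i in range(6):
    queue.append({'name': f'high{i}', 'priority': 9, 'bypass_count': 0})
    served.append(serve_one(queue))
```

After the loop, `served` shows the low-priority request granted right after its MAX_BYPASS-th bypass rather than waiting forever behind the high-priority stream.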
The reader-writer problem is the most common source of starvation in databases. Here are proven strategies for preventing both reader and writer starvation:
Phase-Fair Reader-Writer Lock
Alternates between read phases and write phases. Each phase processes all pending requests of that type before switching.
Algorithm:
class PhaseFairRWLock:
    def __init__(self):
        self.phase = 'NONE'  # NONE, READ, WRITE
        self.reader_count = 0
        self.writer_waiting = False
        self.read_queue = Queue()
        self.write_queue = Queue()

    def acquire_read(self):
        if self.phase == 'WRITE' or self.writer_waiting:
            self.read_queue.put(current_thread())
            wait()
        self.phase = 'READ'
        self.reader_count += 1

    def release_read(self):
        self.reader_count -= 1
        if self.reader_count == 0:
            self.transition_to_write_phase()

    def acquire_write(self):
        self.writer_waiting = True
        if self.phase == 'READ' and self.reader_count > 0:
            self.write_queue.put(current_thread())
            wait()
        self.phase = 'WRITE'
        self.writer_waiting = False

    def release_write(self):
        self.transition_to_read_phase()

    def transition_to_write_phase(self):
        if not self.write_queue.empty():
            wake(self.write_queue.get())
        else:
            self.transition_to_read_phase()

    def transition_to_read_phase(self):
        if self.read_queue.empty():
            self.phase = 'NONE'
        else:
            # Wake all pending readers; they set phase = 'READ' on resume
            while not self.read_queue.empty():
                wake(self.read_queue.get())
Properties:
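The algorithm above is pseudocode (`wait()` and `wake()` are abstract primitives). As a hedged, runnable alternative, here is a FIFO reader-writer lock in Python built on `threading.Condition`. It enforces a related fairness discipline, strict arrival order with consecutive readers admitted together, rather than the exact phase-fair alternation; the names (`FairRWLock`, `acquire_read`, and so on) are illustrative.

```python
import threading
from collections import deque

class FairRWLock:
    """FIFO reader-writer lock sketch: requests are served in arrival
    order, with consecutive readers at the head of the queue admitted
    together. A writer waits only for requests that arrived before it,
    so neither readers nor writers can starve."""

    def __init__(self):
        self._cond = threading.Condition()
        self._queue = deque()   # pending ('R'|'W', id) in FIFO order
        self._readers = 0       # active readers
        self._writer = False    # active writer?
        self._next_id = 0

    def _enqueue(self, kind):
        self._next_id += 1
        token = (kind, self._next_id)
        self._queue.append(token)
        return token

    def acquire_read(self):
        with self._cond:
            token = self._enqueue('R')
            # Admitted when no writer is active and every request ahead
            # of us is also a reader (those readers are admitted with us).
            self._cond.wait_for(lambda: not self._writer and all(
                kind == 'R' for kind, rid in self._queue if rid < token[1]))
            self._queue.remove(token)
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            token = self._enqueue('W')
            # Admitted when the lock is free and we are at the queue head.
            self._cond.wait_for(lambda: not self._writer
                                and self._readers == 0
                                and self._queue[0] == token)
            self._queue.popleft()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

# Usage: readers share, writers are exclusive.
lock = FairRWLock()
lock.acquire_read()
lock.acquire_read()
lock.release_read()
lock.release_read()
lock.acquire_write()
lock.release_write()
```

Because a newly arriving reader waits behind any queued writer, a continuous stream of readers cannot starve a writer; because the writer releases and wakes all waiters, queued readers then proceed as a batch.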
When many transactions contend for the same resource, even fair scheduling may result in long waits. Hotspot mitigation reduces contention by spreading load or reducing critical section duration.
Before:
SELECT quantity FROM inventory WHERE product_id = 123 FOR UPDATE;
-- Application logic: new_quantity = quantity - order_amount
UPDATE inventory SET quantity = :new_quantity WHERE product_id = 123;

After (using atomic decrement):
UPDATE inventory
SET quantity = quantity - :order_amount
WHERE product_id = 123 AND quantity >= :order_amount
RETURNING quantity;
-- No explicit lock needed—single atomic statement
-- Contention reduced because lock held only during UPDATE, not during app logic

When to Apply Each Technique:
| Hotspot Type | Best Mitigation |
|---|---|
| Single popular row (counter) | Atomic operations, eventual consistency |
| Popular table range | Lock splitting, row-level locking |
| Read-heavy contention | MVCC, read replicas, caching |
| Write-heavy contention | Sharding, async queuing |
| Sequential processing bottleneck | Parallel batching, pipeline design |
Key Insight: The best starvation prevention is reducing the need for locking entirely. If contention is eliminated, starvation cannot occur.
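As one illustration of removing contention, a hot counter row can be split into shards, as suggested in the table above. The sketch below is a minimal in-memory model (the class name and the SQL shown in comments are hypothetical): writers update one randomly chosen shard, contending roughly 1/N as often, and readers sum all shards for the total.

```python
import random

class ShardedCounter:
    """Spread a hot counter across N shards to reduce lock contention."""

    def __init__(self, num_shards=8):
        self.shards = [0] * num_shards

    def add(self, amount):
        # Database analogue (hypothetical schema):
        #   UPDATE counter_shards SET value = value + :amount
        #   WHERE shard_id = :shard;
        shard = random.randrange(len(self.shards))
        self.shards[shard] += amount

    def total(self):
        # Database analogue: SELECT SUM(value) FROM counter_shards;
        return sum(self.shards)

counter = ShardedCounter(num_shards=4)
for _ in range(100):
    counter.add(3)
```

The trade-off: writes scale with the shard count, but reading the total now touches every shard, so sharding suits write-heavy counters whose exact total is read less often.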
Each database system provides specific mechanisms for starvation prevention. Here's how to configure popular databases for optimal fairness:
PostgreSQL Starvation Prevention Configuration:
-- MVCC eliminates most reader-writer conflicts
-- Readers never block writers, writers never block readers
-- (except when explicit table-level locks, such as DDL's ACCESS EXCLUSIVE, are involved)
-- Timeout configuration
SET lock_timeout = '10s'; -- Per-lock timeout
SET statement_timeout = '60s'; -- Total statement time
SET deadlock_timeout = '1s'; -- Wait on a lock before running the deadlock check
-- Idle transaction cleanup (prevents lock hoarding)
SET idle_in_transaction_session_timeout = '5min';
-- Use SKIP LOCKED for queue-like processing
SELECT * FROM job_queue
WHERE status = 'pending'
FOR UPDATE SKIP LOCKED
LIMIT 10;
-- Advisory locks for application-level coordination
SELECT pg_advisory_lock(hashtext('inventory_update'));
-- ... do work ...
SELECT pg_advisory_unlock(hashtext('inventory_update'));
-- Monitor for lock contention
SELECT * FROM pg_stat_activity WHERE wait_event_type = 'Lock';
PostgreSQL Tips:
Even with good prevention, you must monitor for emerging starvation conditions. Here's a comprehensive monitoring strategy:
-- PostgreSQL: Lock wait monitoring dashboard

-- Current lock waits
SELECT blocked.pid AS blocked_pid,
       blocked.usename AS blocked_user,
       blocked.query AS blocked_query,
       blocking.pid AS blocking_pid,
       blocking.usename AS blocking_user,
       blocking.query AS blocking_query,
       NOW() - blocked.query_start AS wait_duration
FROM pg_stat_activity blocked
JOIN pg_locks blocked_locks ON blocked.pid = blocked_locks.pid
JOIN pg_locks blocking_locks
  ON blocked_locks.locktype = blocking_locks.locktype
 AND blocked_locks.database IS NOT DISTINCT FROM blocking_locks.database
 AND blocked_locks.relation IS NOT DISTINCT FROM blocking_locks.relation
 AND blocked_locks.page IS NOT DISTINCT FROM blocking_locks.page
 AND blocked_locks.tuple IS NOT DISTINCT FROM blocking_locks.tuple
 AND blocked_locks.transactionid IS NOT DISTINCT FROM blocking_locks.transactionid
 AND blocked_locks.classid IS NOT DISTINCT FROM blocking_locks.classid
 AND blocked_locks.objid IS NOT DISTINCT FROM blocking_locks.objid
 AND blocked_locks.objsubid IS NOT DISTINCT FROM blocking_locks.objsubid
 AND blocked_locks.pid != blocking_locks.pid
JOIN pg_stat_activity blocking ON blocking_locks.pid = blocking.pid
WHERE NOT blocked_locks.granted
ORDER BY wait_duration DESC;

-- Lock wait time percentiles (requires pg_stat_statements extension)
-- Use application-level logging for accurate tracking

-- Alert query: transactions waiting > 10 seconds
SELECT COUNT(*) AS long_waiters
FROM pg_stat_activity
WHERE wait_event_type = 'Lock'
  AND NOW() - query_start > INTERVAL '10 seconds';

-- MySQL: Lock monitoring
SELECT * FROM performance_schema.events_waits_summary_by_event_name
WHERE event_name LIKE 'wait/synch/mutex/innodb%'
ORDER BY sum_timer_wait DESC;

SELECT waiting_pid, waiting_query, blocking_pid, blocking_query
FROM sys.innodb_lock_waits;

Database-level monitoring provides aggregate views, but application-level instrumentation gives per-transaction insight. Log lock wait time for every transaction and tag it with the transaction type. Build dashboards showing wait times by type over time. This reveals starvation affecting specific workloads before it becomes visible in database metrics.
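A minimal sketch of such application-level instrumentation in Python, assuming a simple in-process metrics store (names like `track_lock_wait` and `lock_wait_samples` are illustrative):

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Wait-time samples in seconds, keyed by transaction type
# (e.g. 'checkout', 'analytics').
lock_wait_samples = defaultdict(list)

@contextmanager
def track_lock_wait(txn_type):
    """Record how long the wrapped lock acquisition took, tagged by
    transaction type, so dashboards can surface per-workload waits."""
    start = time.monotonic()
    try:
        yield
    finally:
        lock_wait_samples[txn_type].append(time.monotonic() - start)

def p99(samples):
    """Crude p99 over the recorded samples."""
    ordered = sorted(samples)
    return ordered[max(0, int(len(ordered) * 0.99) - 1)]

# Usage: wrap the lock acquisition, not the whole transaction.
with track_lock_wait('checkout'):
    pass  # e.g. cursor.execute("SELECT ... FOR UPDATE")
```

Exporting these samples to an external metrics system per workload tag is what lets a rising checkout-wait curve stand out while aggregate database metrics still look healthy.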
When starvation is detected in production, follow this systematic response playbook:
| Symptom | Likely Cause | Quick Fix | Long-term Fix |
|---|---|---|---|
| Writers never complete | Reader-preference lock | Kill excess readers | Switch to fair RW lock |
| Slow transactions starve fast ones | FCFS with long holders | Decrease lock_timeout | Optimize slow queries |
| One transaction type always wins | Unfair priority configuration | Boost starved priority | Review priority policy |
| Retried transactions never succeed | Queue position lost on retry | Manual priority boost | Implement position preservation |
| Some resources always contended | Hotspot resource | Cache, replicas | Shard or redesign schema |
| Random transactions hang | Undetected deadlock | Decrease deadlock_timeout | Review access order |
Killing the blocking transaction provides immediate relief but doesn't fix the underlying issue. If a long-running query is holding locks legitimately, killing it will only delay the work—it will be retried. Address the root cause: optimize the query, change the locking strategy, or schedule the work during low-traffic periods.
Let's walk through a realistic case study of diagnosing and fixing starvation in a production system.
Scenario: Flash Sale Disaster
An e-commerce platform runs a flash sale. During the sale:
Investigation:
-- Step 1: Identify affected transactions
SELECT application_name, state, wait_event_type, wait_event,
       COUNT(*) AS count,
       AVG(EXTRACT(EPOCH FROM (now() - query_start))) AS avg_wait_sec
FROM pg_stat_activity
WHERE state = 'active' OR wait_event_type = 'Lock'
GROUP BY 1, 2, 3, 4
ORDER BY count DESC;

-- Result:
-- application_name | state  | wait_event_type | wait_event | count | avg_wait_sec
-- checkout_service | active | Lock            | tuple      | 47    | 18.3
-- analytics_batch  | active | NULL            | NULL       | 3     | 145.7

-- Step 2: Identify the blocker
SELECT blocked.pid, blocked.application_name,
       blocking.pid AS blocking_pid,
       blocking.application_name AS blocking_app,
       blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_locks bl ON blocked.pid = bl.pid AND NOT bl.granted
JOIN pg_locks holding ON bl.relation = holding.relation AND holding.granted
JOIN pg_stat_activity blocking ON holding.pid = blocking.pid
WHERE blocked.application_name = 'checkout_service';

-- Result: analytics_batch is holding locks on inventory table

Root Cause:
The analytics batch job was running a large aggregation query that acquired table-level locks on the inventory table. Checkout transactions needed row-level locks on inventory for decrementing stock. The batch query held locks for minutes while checkout transactions timed out after 5 seconds.
Solution: Multi-Layer Fix
Results:
| Metric | Before | After |
|---|---|---|
| Checkout p99 latency | 8.5s | 0.4s |
| Checkout timeout rate | 12% | 0.01% |
| Flash sale revenue | Lost ~$50K | Record sales |
| Analytics impact | Blocked checkouts | Zero impact |
Key Lesson: Workload isolation prevents cross-workload starvation. Analytics and transactional workloads should never compete for the same locks.
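A minimal sketch of that isolation at the connection-routing layer, with hypothetical DSNs and workload names (in production these would come from configuration):

```python
# Hypothetical endpoints -- real values come from deployment config.
PRIMARY_DSN = "postgresql://primary.db.internal/shop"
REPLICA_DSN = "postgresql://replica.db.internal/shop"

ROUTES = {
    "checkout": PRIMARY_DSN,   # transactional: needs row locks on the primary
    "analytics": REPLICA_DSN,  # batch reads: never touch primary locks
}

def dsn_for(workload):
    """Route each workload class to its own endpoint so analytics can
    never hold locks that checkout transactions are waiting on."""
    return ROUTES[workload]
```

With this routing in place, an analytics query can run for hours without appearing anywhere in the primary's lock graph, which is exactly the guarantee the flash-sale fix needed.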
We've covered a comprehensive approach to preventing starvation in database systems. Let's consolidate the key strategies:
Module Conclusion:
Throughout this module, we've built a comprehensive understanding of starvation:
With this knowledge, you can design, configure, and operate database systems that are fundamentally starvation-free—systems where every transaction, no matter how low its priority or how contended its resources, will eventually complete.
Congratulations! You've completed the Starvation module. You now understand starvation as a fundamental concurrency problem and have the tools to prevent it: fairness policies, priority schemes, timeout handling, and integrated prevention strategies. You're prepared to build and operate database systems that guarantee progress for every transaction.