Your transaction has executed all its operations successfully. You've read the data you needed, performed calculations, updated rows, inserted records—everything went according to plan. Now you issue COMMIT. Is your data safe?
Not yet.
Between the moment you request a commit and the moment your transaction's effects become permanent, there exists a critical transitional state: Partially Committed. This state represents a transaction that has completed all its operations but whose changes have not yet been guaranteed to survive a system crash.
The Partially Committed state exists because achieving durability (the 'D' in ACID) requires physical I/O operations that cannot be completed instantaneously. During this brief window, the database system is working to make your transaction's effects permanent—but if a failure occurs at precisely the wrong moment, even a 'committed' transaction might be lost.
By the end of this page, you will understand why the Partially Committed state exists, what operations the DBMS performs during this critical phase, how different databases implement commit processing, the distinction between logical and physical commit, and what happens when failures occur during partial commitment.
Let's establish a precise understanding of what it means for a transaction to be Partially Committed.
Formal Definition:
A transaction T is in the Partially Committed state if and only if it has finished executing its final operation and has requested a commit, but the DBMS has not yet confirmed that its effects are durable, and no failure has occurred during commit processing.
Using state machine notation:
T ∈ PartiallyCommitted ⟺
(previous_state(T) = Active) ∧
(commit_requested(T) = true) ∧
(commit_complete(T) = false) ∧
(failure_during_commit(T) = false)
The Key Distinction:
The Partially Committed state represents the logical completion of a transaction that is waiting for physical confirmation. The transaction has done everything it logically intended to do, but the system has not yet made those changes crash-proof.
| Aspect | Active State | Partially Committed State |
|---|---|---|
| Operations permitted | Read, Write, Savepoint, Control | None—awaiting commit completion |
| Reversibility | Can be rolled back at any time | Attempting to achieve irreversibility |
| Durability status | Changes only in buffer/log | Forcing log to stable storage |
| Next possible states | Partially Committed or Failed | Committed or Failed |
| Resource locks | Held for isolation | Still held, released after commit |
| Log records | In log buffer | Being forced to disk |
You might wonder why we need a Partially Committed state at all. The reason is hardware reality: writing to stable storage takes time. While waiting for disk I/O to confirm, failures can still occur. The Partially Committed state acknowledges that the logical decision to commit has been made, but physical durability is still being established. This separation allows the DBMS to properly handle failures during commit processing.
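To make the transitions concrete, here is a minimal Python sketch of the transaction state machine described above. The enum values and transition rules mirror the states discussed in this chapter; the class and method names are illustrative, not taken from any particular DBMS.

```python
from enum import Enum, auto

class TxnState(Enum):
    ACTIVE = auto()
    PARTIALLY_COMMITTED = auto()
    COMMITTED = auto()
    FAILED = auto()
    ABORTED = auto()

# Legal transitions in the classic five-state transaction model
TRANSITIONS = {
    TxnState.ACTIVE: {TxnState.PARTIALLY_COMMITTED, TxnState.FAILED},
    TxnState.PARTIALLY_COMMITTED: {TxnState.COMMITTED, TxnState.FAILED},
    TxnState.FAILED: {TxnState.ABORTED},
    TxnState.COMMITTED: set(),   # terminal
    TxnState.ABORTED: set(),     # terminal
}

class Transaction:
    def __init__(self, txn_id: int):
        self.txn_id = txn_id
        self.state = TxnState.ACTIVE

    def transition(self, new_state: TxnState) -> None:
        # Reject anything the state diagram does not allow
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

# A successful commit passes through Partially Committed on its way to Committed
t = Transaction(42)
t.transition(TxnState.PARTIALLY_COMMITTED)  # COMMIT was issued
t.transition(TxnState.COMMITTED)            # log force succeeded
```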
A transaction transitions from Active to Partially Committed when it issues a commit request. Let's examine this transition in detail.
The COMMIT Command:
When you execute COMMIT (or the equivalent in your database), you're signaling that the transaction has performed every operation it intended to perform and that you want its effects made permanent.
At this moment, the transaction's state changes to Partially Committed, and a series of critical actions begin.
What Happens Immediately After COMMIT:
The instant a transaction enters Partially Committed:
Commit Record Generation — A commit log record is created, marking this transaction as intending to commit
Operation Prohibition — No further read or write operations are permitted; the transaction's work is complete
Log Force Initiation — The DBMS begins forcing all log records (including the commit record) to stable storage
Lock Retention — All locks are still held; releasing them prematurely could violate isolation
Wait State — The transaction waits for the log force to complete
```
// Pseudocode: Commit processing from Active to Committed
function process_commit(transaction T) {
    // PRE-CONDITION: T is in Active state
    assert(T.state == ACTIVE);

    // Transition to Partially Committed
    T.state = PARTIALLY_COMMITTED;

    // Step 1: Write the commit log record to the log buffer
    commit_record = create_log_record(
        type: COMMIT,
        transaction_id: T.id,
        timestamp: current_time()
    );
    append_to_log_buffer(commit_record);

    // Step 2: Force log to stable storage (THE CRITICAL STEP)
    // This is where durability is achieved
    try {
        force_log_to_disk(up_to: commit_record.LSN);

        // If we reach here, the log force succeeded
        // Step 3: Transition to Committed
        T.state = COMMITTED;

        // Step 4: Release all locks (now safe)
        release_all_locks(T);

        // Step 5: Clean up transaction resources
        deallocate_transaction_descriptor(T);

        return SUCCESS;
    } catch (IOError e) {
        // Log force failed - the commit cannot complete
        T.state = FAILED;  // Transaction will need to be aborted
        return FAILURE;
    }
}
```

The transition from Partially Committed to Committed happens at exactly one point: when the log force completes successfully. If the system crashes AFTER the log force but BEFORE updating the transaction state to Committed, recovery will still see the commit record in the log and complete the commit. This is why log-based recovery works.
The most critical operation in the Partially Committed state is the log force. Understanding this operation is essential for grasping how databases achieve durability.
What is a Log Force?
A log force (also called log flush or log sync) is the operation of writing all log records from the in-memory log buffer to stable storage (disk) and waiting for confirmation that the write has completed. Only when this confirmation arrives is durability guaranteed.
Why Force the Log?
Consider the alternative: if we only kept log records in memory and the system crashed, all those records would be lost. Without the log records, the DBMS would have no way to know which transactions had committed, which changes need to be redone, or which incomplete changes need to be undone.
The log force ensures that the commit decision survives any crash.
| Log Record Type | Contents | Purpose |
|---|---|---|
| BEGIN | Transaction ID, Start timestamp | Mark transaction start for recovery |
| UPDATE (each) | TID, Table, OldValue, NewValue, LSN | Provide redo and undo information |
| INSERT (each) | TID, Table, NewRow, LSN | Enable redo; undo deletes the row |
| DELETE (each) | TID, Table, OldRow, LSN | Enable redo; undo reinserts the row |
| COMMIT | Transaction ID, Commit timestamp | Mark transaction as durably committed |
The Write-Ahead Logging (WAL) Rule:
Databases follow the Write-Ahead Logging rule: before any change is written to the actual data files, the corresponding log record must be written to stable storage first. This ensures that undo information is on disk before a data page is overwritten, and that redo information is on disk before a commit is acknowledged.
In the Partially Committed state, we go one step further: we force ALL log records for the transaction, including the commit record, to disk.
```
// Detailed log force operation during Partially Committed
function force_log_to_disk(up_to_lsn: LSN) {
    // Get the current state of the log buffer
    buffer_end_lsn = log_buffer.last_lsn();

    // We may force more than requested if the buffer has additional records
    // (batching for efficiency)
    lsn_to_force = max(up_to_lsn, buffer_end_lsn);

    // Create the I/O operation
    io_request = {
        data: log_buffer.contents_up_to(lsn_to_force),
        destination: log_file.next_write_position(),
        flags: O_SYNC | O_DIRECT  // Ensure actual disk write, bypass OS cache
    };

    // Issue the write and WAIT for completion
    // This is a SYNCHRONOUS operation - critical for durability
    result = disk_write_sync(io_request);
    if (result.bytes_written != io_request.data.size) {
        throw IOError("Incomplete log write");
    }

    // Optionally: force the disk controller to flush its cache
    // (Important for true durability on some hardware)
    disk_sync(log_file);

    // Update bookkeeping
    log_buffer.mark_flushed_up_to(lsn_to_force);

    // Now durability is guaranteed up to lsn_to_force
}

// Group commit optimization: batch multiple commits together
function group_commit() {
    // Instead of forcing for each transaction individually,
    // collect multiple commits and force once
    wait_for_commit_batch(timeout: 1ms);  // Brief delay to collect commits

    // Force all accumulated commits in one I/O
    force_log_to_disk(max_lsn_of_waiting_commits);

    // Wake up all transactions whose commits are now durable
    for each transaction in waiting_commits:
        transaction.state = COMMITTED;
        release_locks(transaction);
}
```

Modern databases use 'group commit' to amortize the cost of log forces across multiple transactions. Instead of each transaction forcing the log independently, the system briefly waits to batch multiple commit records together. One I/O operation then makes all transactions durable. This dramatically improves throughput in high-concurrency systems.
The Partially Committed state's duration and reliability depend heavily on hardware characteristics. Understanding these factors is crucial for database administrators and engineers designing storage systems.
The Lying Disk Problem:
Some disk drives (and operating systems) report that a write has completed when the data is actually only in a volatile cache on the disk controller. If power is lost at this moment, the 'durable' data is lost. This is why true durability requires disabling (or explicitly flushing) volatile write caches, using battery-backed or capacitor-backed controller caches, or issuing cache-flush commands that the hardware genuinely honors.
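To illustrate what 'forcing to stable storage' means at the operating-system level, here is a minimal Python sketch of a durable log append on Linux. The file name and helper are hypothetical; the pattern of write-then-fsync (including an fsync of the directory after creating the file) is the standard technique.

```python
import os

def durable_append(path: str, record: bytes) -> None:
    """Append a log record and do not return until it is on stable storage."""
    # Open for appending; O_APPEND keeps concurrent appends ordered
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, record)
        # fsync() asks the kernel (and, ideally, the drive) to flush the data;
        # without it, the bytes may sit in the OS page cache or disk cache
        os.fsync(fd)
    finally:
        os.close(fd)

    # If the file was just created, its directory entry must also be durable
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

# Example: append a (fake) commit record for transaction 42
durable_append("wal.log", b"COMMIT txn=42\n")
```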
Latency in Partially Committed State:
The time a transaction spends in Partially Committed is dominated by I/O latency. Typical values:
| Storage Type | Log Force Latency | Impact on Commit Rate |
|---|---|---|
| HDD (7.2K RPM) | 5-10 ms | ~100-200 commits/sec (individual) |
| Enterprise SSD | 0.1-0.5 ms | ~2,000-10,000 commits/sec |
| NVMe SSD | 0.02-0.1 ms | ~10,000-50,000 commits/sec |
| Optane/PMEM | 0.005-0.02 ms | ~50,000-200,000 commits/sec |
Note: With group commit, actual throughput can be much higher than individual commit latency would suggest.
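A rough back-of-the-envelope calculation shows why. In the sketch below, the 5 ms force latency comes from the HDD row of the table above, while the batch size of 40 is an illustrative assumption, not a measured value.

```python
# Individual commits on a 7.2K RPM HDD: one log force per commit
force_latency_s = 0.005                 # ~5 ms per log force
print(1 / force_latency_s)              # ~200 commits/sec when each commit forces alone

# Group commit: many commit records share a single log force
batch_size = 40                         # assumed average commits per batch
print(batch_size / force_latency_s)     # ~8,000 commits/sec on the same hardware
```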
Database Configuration for Durability:
```sql
-- PostgreSQL durability settings

-- Full durability (default, recommended for most workloads)
SET synchronous_commit = on;

-- Group commit delay (wait up to 10μs to batch commits)
SET commit_delay = 10;    -- microseconds
SET commit_siblings = 5;  -- minimum concurrent transactions before delaying

-- For write-heavy workloads: slightly relaxed durability
-- CAUTION: Data may be lost on crash (up to 3 * wal_writer_delay)
SET synchronous_commit = off;

-- Force WAL sync method (varies by OS)
-- SHOW wal_sync_method;  -- see current setting
-- Options: fsync, fdatasync, open_sync, open_datasync

-- Check if the file system/disk combination is trustworthy
-- Linux: Check /sys/block/sdX/queue/write_cache for write caching status
```

Some applications trade durability for performance by configuring asynchronous commit or reduced flush frequency. This is acceptable ONLY when the application can tolerate losing the last few seconds of transactions after a crash. Never disable durability guarantees without understanding the business impact and having compensating controls.
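If only some transactions can tolerate relaxed durability, PostgreSQL lets you scope the setting to a single transaction with SET LOCAL. The sketch below uses psycopg2; the connection string and the page_views table are assumptions for illustration.

```python
import psycopg2

conn = psycopg2.connect("dbname=app")  # assumed connection string

with conn:  # psycopg2 commits the transaction on clean exit from this block
    with conn.cursor() as cur:
        # Applies only to this transaction; reverts automatically at COMMIT/ROLLBACK
        cur.execute("SET LOCAL synchronous_commit = off")
        # A low-value write we can afford to lose in a crash (hypothetical table)
        cur.execute("INSERT INTO page_views (url) VALUES (%s)", ("/home",))
# The COMMIT for this transaction returns without waiting for the WAL flush
```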
What happens if something goes wrong while a transaction is Partially Committed? This is a critical scenario that database recovery systems must handle correctly.
Types of Failures During Partial Commitment:
Three things can go wrong while a transaction is Partially Committed: an I/O error during the log force, a system crash before the log force completes, and a crash after the log force completes but before the client learns of it.
Scenario Analysis:
Scenario 1: I/O Error During Log Force
If the database cannot complete the log force due to a disk error, the commit cannot be guaranteed: the transaction transitions to Failed, its changes are rolled back, and an error is returned to the client. Persistent log-device errors are serious enough that many systems stop accepting new work until the problem is resolved.
Scenario 2: Crash Before Complete Log Force
If the system crashes before the log force completes, the commit record never reaches stable storage. During recovery, the DBMS finds no commit record for the transaction, treats it as incomplete, and undoes its changes. Nothing was promised to the client, because the commit was never acknowledged.
Scenario 3: Crash After Complete Log Force
If the system crashes just after the log force succeeds but before the transaction state is updated to Committed, the commit record is already on stable storage. Recovery finds it, redoes the transaction's changes, and treats the transaction as committed, even though the in-memory state change and the acknowledgment to the client never happened.
The presence or absence of the COMMIT record in the recovery log is the definitive answer to 'did this transaction commit?' This is why the log force is the critical operation. Once the commit record reaches stable storage, the transaction WILL commit, even if the system crashes immediately afterward. Before the commit record reaches stable storage, the transaction WILL abort if the system crashes.
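The following Python sketch shows how a greatly simplified recovery pass applies this rule: it scans the durable log once, collects the set of transactions with a COMMIT record, then redoes committed work and undoes everything else. The log format here is invented for illustration; real recovery algorithms such as ARIES are considerably more involved.

```python
def recover(log_records):
    """log_records: list of dicts already read from stable storage, in log order."""
    committed = {r["txn"] for r in log_records if r["type"] == "COMMIT"}

    # Redo pass: reapply changes of transactions whose COMMIT record survived
    for r in log_records:
        if r["type"] == "UPDATE" and r["txn"] in committed:
            apply_value(r["page"], r["new_value"])       # redo

    # Undo pass: roll back transactions with no COMMIT record, in reverse order
    for r in reversed(log_records):
        if r["type"] == "UPDATE" and r["txn"] not in committed:
            apply_value(r["page"], r["old_value"])       # undo

def apply_value(page, value):
    # Placeholder for writing the value back to a data page
    print(f"page {page} <- {value}")

# A crash left T1's COMMIT on disk but not T2's: T1 is redone, T2 is undone
recover([
    {"type": "UPDATE", "txn": "T1", "page": 7, "old_value": 100, "new_value": 150},
    {"type": "UPDATE", "txn": "T2", "page": 9, "old_value": 20,  "new_value": 80},
    {"type": "COMMIT", "txn": "T1"},
])
```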
Client-Side Uncertainty:
A subtle but important issue arises from Scenario 3. Consider this sequence: the client issues COMMIT, the database completes the log force, and then the server crashes (or the network connection drops) before the acknowledgment reaches the client.
The transaction did commit, but the client is uncertain. This is an inherent limitation of distributed systems—the 'two generals problem.' Applications must be designed to handle this uncertainty, typically through idempotency keys, checking the transaction's outcome after reconnecting, or application-level reconciliation, as the example below demonstrates.
```python
import uuid
import psycopg2
from psycopg2 import OperationalError

def transfer_with_idempotency(conn, from_account, to_account, amount):
    """
    Idempotent funds transfer that handles commit uncertainty.
    Uses a unique transfer ID to prevent duplicate execution.
    """
    # Generate unique transfer ID OUTSIDE the transaction
    transfer_id = str(uuid.uuid4())
    max_retries = 3

    for attempt in range(max_retries):
        try:
            with conn.cursor() as cur:
                cur.execute("BEGIN")

                # First, check if this transfer already happened
                # (handles reconnection after uncertain commit)
                cur.execute("""
                    SELECT status FROM transfers
                    WHERE transfer_id = %s
                """, (transfer_id,))
                existing = cur.fetchone()

                if existing:
                    # Transfer already recorded - check its status
                    if existing[0] == 'COMPLETED':
                        conn.commit()
                        return {'status': 'already_completed', 'id': transfer_id}
                    # If not completed, continue with the transfer

                # Record the transfer attempt
                cur.execute("""
                    INSERT INTO transfers (transfer_id, from_acct, to_acct, amount, status)
                    VALUES (%s, %s, %s, %s, 'PENDING')
                    ON CONFLICT (transfer_id) DO NOTHING
                """, (transfer_id, from_account, to_account, amount))

                # Perform the actual transfer
                cur.execute("""
                    UPDATE accounts SET balance = balance - %s
                    WHERE account_id = %s AND balance >= %s
                """, (amount, from_account, amount))

                if cur.rowcount == 0:
                    conn.rollback()
                    return {'status': 'insufficient_funds'}

                cur.execute("""
                    UPDATE accounts SET balance = balance + %s
                    WHERE account_id = %s
                """, (amount, to_account))

                # Mark transfer complete
                cur.execute("""
                    UPDATE transfers SET status = 'COMPLETED'
                    WHERE transfer_id = %s
                """, (transfer_id,))

                # COMMIT - entering Partially Committed state
                conn.commit()  # This might raise an exception if connection lost

                return {'status': 'completed', 'id': transfer_id}

        except OperationalError as e:
            # Connection lost - we don't know if commit succeeded
            # On retry, the idempotency check will tell us
            conn.close()
            conn = get_new_connection()  # Reconnect (application-provided helper)
            continue

    return {'status': 'failed_after_retries'}
```

Different database systems implement the Partially Committed state with varying nuances. Understanding these differences is important for database selection and performance tuning.
PostgreSQL Commit Processing:
PostgreSQL uses Write-Ahead Logging (WAL) for durability. During commit:
1. A commit record is appended to the WAL buffer
2. The WAL is flushed to disk up to and including the commit record (controlled by the synchronous_commit setting)
3. The transaction is marked committed in pg_xact (commit status) in shared memory
4. Locks are released and the client receives the commit acknowledgment
The commit is durable after step 2, regardless of whether data pages are updated.
```sql
-- Examine WAL activity during commit

-- Monitor WAL write position
SELECT pg_current_wal_lsn() AS current_wal_position;

-- Before commit
BEGIN;
UPDATE test_table SET value = 'new_value' WHERE id = 1;

-- Check WAL position growth
SELECT pg_current_wal_lsn() AS after_update;

-- Commit and observe
COMMIT;
SELECT pg_current_wal_lsn() AS after_commit;

-- View commit timestamp (for investigating transaction timing)
SELECT pg_xact_commit_timestamp('123456'::xid);  -- If track_commit_timestamp is on

-- Check if WAL has been acknowledged by replicas (synchronous replication)
SELECT application_name,
       state,
       sent_lsn,
       write_lsn,
       flush_lsn,   -- Confirmed durable on replica
       replay_lsn
FROM pg_stat_replication;
```

MySQL/InnoDB Commit Processing:
MySQL separates the redo log (InnoDB) from the binary log (replication). A commit involves:
1. A prepare phase writes the transaction's changes to the InnoDB redo log
2. The transaction is written to (and optionally synced in) the binary log, controlled by sync_binlog
3. The commit marker is written to the redo log, with flushing controlled by innodb_flush_log_at_trx_commit
With innodb_flush_log_at_trx_commit=1 and sync_binlog=1, both logs are forced to disk at each commit.
SQL Server Commit Processing:
SQL Server uses a sophisticated logging mechanism: log records accumulate in the log buffer, and at commit the relevant log blocks are hardened to the transaction log file before the commit is acknowledged, with the log writer batching flushes from multiple sessions into fewer I/Os.
SQL Server supports delayed durability as an option, where commit returns before the log is forced to disk, trading durability for latency.
Oracle Commit Processing:
Oracle uses redo logs and has an efficient commit mechanism: the commit assigns a system change number (SCN), places a commit record in the redo log buffer, and the session waits (by default) for the Log Writer to flush the buffer to the online redo log files before the commit returns.
Oracle's Log Writer is a dedicated background process that batches commits for efficiency.
| Aspect | PostgreSQL | MySQL/InnoDB | SQL Server | Oracle |
|---|---|---|---|---|
| Log Mechanism | WAL | Redo Log + Binlog | Transaction Log | Redo Log |
| Commit Batching | Group Commit | Group Commit | Group Commit | LGWR Batching |
| Relaxed Durability | synchronous_commit=off | innodb_flush_log_at_trx_commit=2 | Delayed Durability | COMMIT_WRITE NOWAIT |
| Typical Commit Latency | ~0.1-1ms | ~0.1-0.5ms | ~0.1-0.5ms | ~0.05-0.2ms |
The Partially Committed state is typically fleeting—lasting only as long as the log force operation, often just milliseconds. However, in certain circumstances, transactions can be observed in this state.
When Partially Committed Becomes Visible:
You can catch transactions in this state when commit processing is slow: sessions waiting on WAL write or sync events, commits waiting for a synchronous replica to acknowledge, deliberate group-commit delays, or an overloaded log device all stretch the window between the commit request and its completion.
Monitoring Commit Latency:
```sql
-- Enable timing for session
\timing on

-- Measure commit latency directly
BEGIN;
UPDATE small_table SET value = 'test' WHERE id = 1;
COMMIT;  -- Timing shows commit duration (includes Partially Committed time)

-- Monitor WAL write/sync statistics
SELECT *,
       CASE WHEN wal_buffers_full > 0
            THEN 'May indicate commit contention'
            ELSE 'Normal'
       END AS assessment
FROM pg_stat_wal;

-- Check for transactions waiting on WAL-related events
SELECT wait_event_type, wait_event, count(*)
FROM pg_stat_activity
WHERE wait_event ILIKE '%wal%'
GROUP BY wait_event_type, wait_event;

-- View bgwriter/checkpointer stats (related to durability)
SELECT * FROM pg_stat_bgwriter;
```

Troubleshooting Long Partially Committed Times:
If you observe transactions spending excessive time in the commit phase:
Check disk latency: use iostat or other disk monitoring tools and look for high await times on the log device
Check replication: a slow or unreachable synchronous standby stalls every commit
Check configuration: commit_delay, wal_sync_method, and WAL placed on a shared, saturated volume can all add latency
Application-perceived commit latency directly affects user experience. In a typical OLTP application, commit latency contributes to overall response time. Moving from HDD to SSD can reduce commit latency by 10-100x, dramatically improving application responsiveness. For write-heavy applications, commit optimization is often the highest-ROI performance improvement.
We've thoroughly explored the Partially Committed state—the critical bridge between logical commit and physical durability. Let's consolidate the key concepts:
Partially Committed is the window between the commit request and the durable completion of the commit
The log force is the critical operation: once the commit record reaches stable storage, the transaction will commit even if the system crashes immediately afterward
Failures before the log force cause the transaction to abort; failures after it are resolved by recovery in favor of the commit
Locks are held throughout the Partially Committed state and released only after the transition to Committed
Group commit and fast log devices are the main levers for reducing time spent in this state
Clients can be left uncertain about a commit's outcome, so applications should use idempotent, retryable designs
What's Next:
With an understanding of both Active and Partially Committed states, we'll now explore the Committed state—the final, successful outcome where a transaction's effects become permanent and irreversible. We'll examine what 'committed' truly means, the guarantees it provides, and how the system handles post-commit cleanup.
You now understand the Partially Committed state in depth—why it exists, what happens during this phase, how durability is achieved through log forcing, and how to handle failures during commitment. This understanding is essential for building robust applications and troubleshooting database performance issues.