Database logs are the unsung heroes of data integrity. Every insert, update, and delete passes through the transaction log before being written to data files. Without properly managed logs, databases cannot guarantee durability, cannot recover from crashes, and cannot be replicated to standby systems.
But logs are also voracious consumers of disk. Left unmanaged, they grow without bound, consuming storage until the database grinds to a halt. A transaction log that fills its disk is one of the most common causes of production database outages—and often the most preventable.
Log management is essential for both reliability and operational stability.
By the end of this page, you will understand transaction log architecture, master log size management and growth strategies, implement log backup and archival procedures, configure log shipping and rotation, and develop production log management policies.
Every modern database maintains a transaction log (also called write-ahead log, redo log, or WAL) that records all modifications before they're applied to data files. This write-ahead logging (WAL) protocol is the foundation of database durability and recovery.
Why write-ahead logging exists: it guarantees durability (a committed change survives a crash because it is already on disk in the log), it enables crash recovery (the log is replayed or rolled back when the database restarts), and it provides the ordered change stream that replication and point-in-time recovery are built on.

How the major engines name and manage these concepts:
| Concept | SQL Server | PostgreSQL | MySQL/InnoDB | Oracle |
|---|---|---|---|---|
| Transaction Log | Transaction Log (.ldf) | Write-Ahead Log (WAL) | Redo Log + Binary Log | Redo Log |
| Log Files | Single virtual log file with VLFs | Segment files (16MB each) | ib_logfile0, ib_logfile1 | Multiple redo log groups |
| Log Backup | Transaction log backup | WAL archiving | Binlog + Redo archive | Archived redo logs |
| Log Truncation | Checkpoint + backup | Checkpoint + WAL archive | Checkpoint | Log switch/archive |
| Log Reuse | VLF status becomes reusable | Old segments removed | Circular reuse after checkpoint | Circular reuse or archive |
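The queries on this page focus on SQL Server, but the same checks exist elsewhere. As a rough, hedged sketch, inspecting log configuration in PostgreSQL and MySQL might look like this (setting names vary by version):

```sql
-- PostgreSQL: inspect WAL configuration and current position
SHOW wal_level;
SELECT name, setting
FROM pg_settings
WHERE name IN ('max_wal_size', 'checkpoint_timeout', 'archive_mode', 'archive_command');
SELECT pg_current_wal_lsn();   -- current WAL write position (PostgreSQL 10+)

-- MySQL/InnoDB: inspect redo log and binary log configuration
SHOW VARIABLES LIKE 'innodb_log_file_size';
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
SHOW BINARY LOGS;              -- list binary log files and their sizes
```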
The log lifecycle: a change is written to the log buffer, hardened to the log file when the transaction commits, later applied to the data files by a checkpoint, and the log space it occupied is then reused or archived for point-in-time recovery.
Log writes are sequential (append-only) and synchronous at commit. Data file writes are random and can be deferred. Logging may seem like extra work, but it actually improves performance: sequential I/O is much faster than random I/O, especially on spinning disks.
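To see this difference on a live SQL Server instance, you can compare cumulative write activity against the data and log files. A minimal sketch using sys.dm_io_virtual_file_stats (counters are cumulative since instance start):

```sql
-- Compare write activity on data vs. log files for the current database
SELECT
    mf.name AS file_name,
    mf.type_desc,                                        -- ROWS (data) or LOG
    vfs.num_of_writes,
    vfs.num_of_bytes_written / 1024 / 1024 AS written_mb,
    vfs.io_stall_write_ms                                -- time spent waiting on writes
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
JOIN sys.master_files AS mf
    ON vfs.database_id = mf.database_id
   AND vfs.file_id = mf.file_id;
```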
Proper log file sizing balances several concerns: preventing uncontrolled growth, avoiding frequent auto-growth events, ensuring sufficient space for peak transaction volumes, and minimizing wasted disk space.
Common log growth problems include unbounded growth when log backups stop running, frequent small auto-growth events that fragment the log into thousands of VLFs, and logs that cannot be truncated because of long-running transactions or lagging replicas. The queries below diagnose and correct these issues:
```sql
-- SQL Server: Log file size management

-- Check current log file status
SELECT
    DB_NAME(database_id) AS database_name,
    name AS log_file_name,
    type_desc,
    size * 8 / 1024 AS size_mb,
    max_size,
    growth,
    is_percent_growth
FROM sys.master_files
WHERE type_desc = 'LOG'
  AND database_id = DB_ID();

-- Check log space usage
DBCC SQLPERF(LOGSPACE);

-- View Virtual Log Files (VLFs)
DBCC LOGINFO;
-- Too many VLFs (>500) indicates fragmented log growth

-- Check why log cannot be truncated
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = DB_NAME();
-- Common values:
-- NOTHING            - Log can be truncated
-- LOG_BACKUP         - Waiting for log backup
-- ACTIVE_TRANSACTION - Long-running transaction
-- DATABASE_MIRRORING - Mirror not caught up
-- REPLICATION        - Pending replication

-- Resize log file properly
-- Step 1: Backup the log (if full recovery)
BACKUP LOG [OrdersDB] TO DISK = 'D:\Backups\OrdersDB_Log.trn';

-- Step 2: Shrink the log (avoid in production if possible)
DBCC SHRINKFILE (OrdersDB_Log, 1024);  -- Shrink to 1GB

-- Step 3: Grow to proper size with a good increment
ALTER DATABASE [OrdersDB]
MODIFY FILE (NAME = OrdersDB_Log, SIZE = 8GB);

-- Set appropriate auto-growth (avoid small increments)
ALTER DATABASE [OrdersDB]
MODIFY FILE (NAME = OrdersDB_Log, FILEGROWTH = 512MB);

-- NEVER use percent growth for large databases
-- 10% of 100GB = 10GB growth events (too large)
```

When SQL Server grows the log file, it creates Virtual Log Files (VLFs). Many small growths create many small VLFs, slowing log operations. Target < 500 VLFs. If you have thousands, shrink and pre-grow the log in large increments (512MB-1GB) to consolidate.
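On SQL Server 2016 SP2 and later, the sys.dm_db_log_info function returns one row per VLF, which makes counting them easier than eyeballing DBCC LOGINFO output. A quick sketch:

```sql
-- Count VLFs for the current database (SQL Server 2016 SP2+)
SELECT COUNT(*) AS vlf_count
FROM sys.dm_db_log_info(DB_ID());

-- VLF counts for every database, worst first
SELECT d.name, v.vlf_count
FROM sys.databases AS d
CROSS APPLY (
    SELECT COUNT(*) AS vlf_count
    FROM sys.dm_db_log_info(d.database_id)
) AS v
ORDER BY v.vlf_count DESC;
```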
Transaction log backups serve two purposes: enabling point-in-time recovery and allowing log truncation. Without regular log backups, the transaction log grows indefinitely.
Log backup frequency considerations:
| Workload Type | Recommended Frequency | RPO | Typical Log Size |
|---|---|---|---|
| Low transaction (reporting) | Every 1-4 hours | 1-4 hours data loss | Small (MB/hour) |
| Medium transaction (typical OLTP) | Every 15-30 minutes | 15-30 min data loss | Moderate (10s MB/hour) |
| High transaction (e-commerce) | Every 5-15 minutes | 5-15 min data loss | Large (100s MB/hour) |
| Critical (financial) | Every 1-5 minutes | 1-5 min data loss | Very large (GB/hour) |
| Near-zero data loss | Continuous (log shipping) | Seconds | Shipped immediately |
```sql
-- SQL Server: Transaction log backup strategies

-- Check recovery model (must be FULL for log backups)
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'OrdersDB';

-- Set full recovery model if needed
ALTER DATABASE [OrdersDB] SET RECOVERY FULL;

-- Basic log backup
BACKUP LOG [OrdersDB]
TO DISK = 'D:\Backups\OrdersDB_Log_20240115_0800.trn'
WITH COMPRESSION, CHECKSUM;

-- Log backup with automatic naming
DECLARE @BackupFile NVARCHAR(500) =
    'D:\Backups\OrdersDB_Log_' + FORMAT(GETDATE(), 'yyyyMMdd_HHmmss') + '.trn';

BACKUP LOG [OrdersDB]
TO DISK = @BackupFile
WITH COMPRESSION, CHECKSUM, STATS = 10;

-- Tail-log backup (before disaster recovery)
-- Captures log generated since last backup
BACKUP LOG [OrdersDB]
TO DISK = 'D:\Backups\OrdersDB_TailLog.trn'
WITH NO_TRUNCATE, CHECKSUM;  -- NO_TRUNCATE for damaged database

-- Copy-only log backup (doesn't break backup chain)
BACKUP LOG [OrdersDB]
TO DISK = 'D:\Backups\OrdersDB_Log_CopyOnly.trn'
WITH COPY_ONLY, COMPRESSION;

-- Log backup to multiple files (striping)
BACKUP LOG [OrdersDB]
TO DISK = 'D:\Backups\Log1.trn',
   DISK = 'E:\Backups\Log2.trn'
WITH COMPRESSION, CHECKSUM;

-- Scheduled log backup job (SQL Agent)
-- Create a job that runs every 15 minutes and
-- calls the backup command with a timestamp-based filename

-- Verify integrity of log backup
RESTORE VERIFYONLY
FROM DISK = 'D:\Backups\OrdersDB_Log_20240115_0800.trn';

-- Check backup history
SELECT
    bs.database_name,
    bs.backup_start_date,
    bs.backup_finish_date,
    bs.type AS backup_type,
    DATEDIFF(SECOND, bs.backup_start_date, bs.backup_finish_date) AS duration_sec,
    bs.backup_size / 1024 / 1024 AS size_mb,
    bmf.physical_device_name
FROM msdb.dbo.backupset bs
JOIN msdb.dbo.backupmediaset bms ON bs.media_set_id = bms.media_set_id
JOIN msdb.dbo.backupmediafamily bmf ON bms.media_set_id = bmf.media_set_id
WHERE bs.database_name = 'OrdersDB'
  AND bs.type = 'L'  -- L = Log backup
ORDER BY bs.backup_start_date DESC;
```

Log backups form a chain from the last full backup to the present. Breaking the chain (e.g., switching from FULL to SIMPLE recovery, or missing a log backup file) prevents point-in-time recovery. Always verify backup chains are complete before deleting old backups.
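The scheduled backup job is only sketched in comments above. A minimal SQL Agent job that runs the log backup every 15 minutes might look like the following; the job name, backup path, and schedule are illustrative assumptions, not a prescribed setup:

```sql
USE msdb;
GO

-- Create the job
EXEC dbo.sp_add_job
    @job_name = N'OrdersDB - Log Backup (15 min)';

-- Single T-SQL step: back up the log to a timestamped file
EXEC dbo.sp_add_jobstep
    @job_name  = N'OrdersDB - Log Backup (15 min)',
    @step_name = N'Backup log',
    @subsystem = N'TSQL',
    @database_name = N'master',
    @command = N'
DECLARE @f NVARCHAR(500) =
    ''D:\Backups\OrdersDB_Log_'' + FORMAT(GETDATE(), ''yyyyMMdd_HHmmss'') + ''.trn'';
BACKUP LOG [OrdersDB] TO DISK = @f WITH COMPRESSION, CHECKSUM;';

-- Schedule: every 15 minutes, all day, every day
EXEC dbo.sp_add_jobschedule
    @job_name = N'OrdersDB - Log Backup (15 min)',
    @name = N'Every 15 minutes',
    @freq_type = 4,               -- daily
    @freq_interval = 1,
    @freq_subday_type = 4,        -- unit: minutes
    @freq_subday_interval = 15;

-- Target the local server
EXEC dbo.sp_add_jobserver
    @job_name = N'OrdersDB - Log Backup (15 min)',
    @server_name = N'(local)';
```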
Log shipping is a disaster recovery technique that automatically sends transaction log backups from a primary server to one or more secondary servers. It provides warm standby capability with configurable lag time.
Log shipping vs. synchronous replication: log shipping is asynchronous and applies whole log backups in batches, so the secondary typically lags by minutes and some data loss is possible on failover. Synchronous options (database mirroring, availability groups in synchronous-commit mode) harden every commit on the replica for near-zero data loss, at the cost of commit latency and tighter network requirements.
```sql
-- SQL Server: Log shipping setup

-- On PRIMARY server:
-- 1. Enable log backup job (creates .trn files)
-- 2. Configure backup share accessible to secondary

-- Create backup directory share
-- \\PRIMARY\LogShipping$

-- Backup job (run every 15 minutes)
BACKUP LOG [OrdersDB]
TO DISK = '\\PRIMARY\LogShipping$\OrdersDB_LS.trn'
WITH INIT, COMPRESSION;

-- On SECONDARY server:
-- 1. Restore full backup with NORECOVERY
RESTORE DATABASE [OrdersDB]
FROM DISK = '\\PRIMARY\LogShipping$\OrdersDB_Full.bak'
WITH NORECOVERY,
MOVE 'OrdersDB' TO 'D:\Data\OrdersDB_LS.mdf',
MOVE 'OrdersDB_log' TO 'D:\Logs\OrdersDB_LS.ldf';

-- 2. Create copy job (copies files from primary share)
--    SQL Agent job that runs every 15 minutes

-- 3. Create restore job (restores copied log files)
--    SQL Agent job example:
RESTORE LOG [OrdersDB]
FROM DISK = 'D:\LSCopy\OrdersDB_LS.trn'
WITH NORECOVERY;

-- Monitor log shipping status
SELECT
    primary_server,
    primary_database,
    secondary_server,
    secondary_database,
    last_copied_file,
    last_copied_date,
    last_restored_file,
    last_restored_date,
    DATEDIFF(MINUTE, last_restored_date, GETDATE()) AS lag_minutes
FROM msdb.dbo.log_shipping_monitor_secondary;

-- Failover (when primary fails):
-- 1. Apply remaining logs with RECOVERY
RESTORE LOG [OrdersDB]
FROM DISK = 'D:\LSCopy\OrdersDB_LS.trn'
WITH RECOVERY;
-- Database now online as primary

-- Using GUI: Database Properties > Transaction Log Shipping
-- (Easier setup for most scenarios)
```

Always monitor replication lag. A standby that falls behind may have too much to recover during failover, extending downtime. Set alerts for lag exceeding acceptable thresholds (e.g., > 1 hour for log shipping, > 1 minute for streaming replication).
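One way to act on that advice for log shipping is a small scheduled check that emails when restore lag crosses a threshold. A sketch assuming Database Mail is already configured; the profile name, recipients, and 60-minute threshold are placeholders:

```sql
-- Alert when log shipping restore lag exceeds 60 minutes
DECLARE @lag_minutes INT;

SELECT @lag_minutes = MAX(DATEDIFF(MINUTE, last_restored_date, GETDATE()))
FROM msdb.dbo.log_shipping_monitor_secondary;

IF @lag_minutes > 60
BEGIN
    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = N'DBA',                    -- hypothetical mail profile
        @recipients   = N'dba-team@example.com',   -- hypothetical recipients
        @subject      = N'Log shipping lag alert',
        @body         = N'Log shipping restore lag has exceeded 60 minutes.';
END;
```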
Log files—whether transaction logs, error logs, slow query logs, or audit logs—must be managed proactively. Without rotation and cleanup, they consume all available storage.
| Log Type | Purpose | Rotation Strategy | Retention |
|---|---|---|---|
| Transaction/Redo Log | Durability and recovery | Circular reuse after checkpoint/backup | N/A (reused in place) |
| WAL Archive/Binlog | Point-in-time recovery | Time-based cleanup after PITR window | Last full backup + N days |
| Error Log | Troubleshooting | Size or date-based rotation | 30-90 days typically |
| Slow Query Log | Performance tuning | Size-based rotation | 7-30 days |
| Audit Log | Compliance, security | Time-based, compressed archive | Years (compliance requirements) |
| Connection Log | Security, debugging | Size or date-based rotation | 7-30 days |
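For the WAL-archive/binlog row above, cleanup is usually delegated to the engine rather than done with ad-hoc file deletion. Hedged examples for MySQL (8.0 variable names); for PostgreSQL, the pg_archivecleanup command-line utility plays the same role against the archive directory:

```sql
-- MySQL: let the server expire binary logs automatically (7 days shown)
SET GLOBAL binlog_expire_logs_seconds = 604800;

-- ...or purge explicitly, only after confirming no replica still needs them
PURGE BINARY LOGS BEFORE DATE_SUB(NOW(), INTERVAL 7 DAY);
```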
```sql
-- SQL Server: Log file rotation and cleanup

-- Cycle error log (rotates, keeping 6 previous by default)
EXEC sp_cycle_errorlog;
-- Configure max error logs in SQL Server Configuration Manager

-- Increase number of error log files kept
EXEC xp_instance_regwrite
    N'HKEY_LOCAL_MACHINE',
    N'SOFTWARE\Microsoft\MSSQLServer\MSSQLServer',
    N'NumErrorLogs',
    REG_DWORD,
    12;

-- Clean up old transaction log backups
-- (Use SQL Agent job or maintenance plan)
DECLARE @RetentionDays INT = 7;
DECLARE @CleanupDate DATETIME = DATEADD(DAY, -@RetentionDays, GETDATE());

-- Using xp_delete_file (recommended)
EXEC xp_delete_file
    0,                -- Type: 0 = backup files
    'D:\Backups\',    -- Folder path
    'trn',            -- Extension
    @CleanupDate,     -- Delete older than this date
    1;                -- Include subfolders

-- Clean up backup history in msdb
DECLARE @OldestDate DATETIME = DATEADD(MONTH, -6, GETDATE());
EXEC msdb.dbo.sp_delete_backuphistory @OldestDate;

-- Clean up job history
EXEC msdb.dbo.sp_purge_jobhistory @oldest_date = @OldestDate;

-- Clean up mail log
EXEC msdb.dbo.sysmail_delete_mailitems_sp @sent_before = @OldestDate;

-- Maintenance plan approach:
-- Create maintenance plan with:
-- 1. Backup Log task (every 15 min)
-- 2. Cleanup task (delete files older than 7 days)
```

Log-related issues are among the most common causes of database outages. Understanding how to diagnose and resolve these issues quickly is essential for production database administration.
```sql
-- SQL Server: Log troubleshooting queries

-- 1. Why can't the log be truncated?
SELECT name, log_reuse_wait_desc, recovery_model_desc
FROM sys.databases
WHERE database_id = DB_ID();

-- Interpret log_reuse_wait_desc:
-- CHECKPOINT:           Normal, waiting for checkpoint
-- LOG_BACKUP:           Need to take log backup (FULL recovery)
-- ACTIVE_TRANSACTION:   Long-running transaction blocking
-- DATABASE_MIRRORING:   Mirror is behind
-- REPLICATION:          Transaction replication needs the log
-- AVAILABILITY_REPLICA: AG replica behind

-- 2. Find long-running transactions
SELECT
    s.session_id,
    s.login_name,
    s.host_name,
    t.transaction_id,
    t.transaction_begin_time,
    DATEDIFF(MINUTE, t.transaction_begin_time, GETDATE()) AS duration_min,
    t.transaction_state
FROM sys.dm_tran_active_transactions t
JOIN sys.dm_tran_session_transactions st ON t.transaction_id = st.transaction_id
JOIN sys.dm_exec_sessions s ON st.session_id = s.session_id
ORDER BY t.transaction_begin_time ASC;

-- 3. Check log space usage for the current database
SELECT
    DB_NAME(database_id) AS database_name,
    CAST(total_log_size_in_bytes / 1024.0 / 1024 AS DECIMAL(10,2)) AS total_log_mb,
    CAST(used_log_space_in_bytes / 1024.0 / 1024 AS DECIMAL(10,2)) AS used_log_mb,
    CAST(used_log_space_in_percent AS DECIMAL(5,2)) AS used_pct
FROM sys.dm_db_log_space_usage;

-- 4. Emergency: Log full, need immediate space
-- Option A: Take log backup immediately
BACKUP LOG [OrdersDB]
TO DISK = 'D:\Emergency\OrdersDB_Log.trn'
WITH COMPRESSION;

-- Option B: If no backup needed (will break backup chain!)
-- Switch to SIMPLE recovery temporarily
ALTER DATABASE [OrdersDB] SET RECOVERY SIMPLE;
CHECKPOINT;
ALTER DATABASE [OrdersDB] SET RECOVERY FULL;
-- MUST take new full backup to restart log chain!

-- 5. VLF count and potential consolidation
DBCC LOGINFO;
-- Count rows to get VLF count
-- If > 500, consider shrink and regrow
```

Effective log management requires proactive policies, monitoring, and automation. The following best practices help prevent log-related incidents in production.
Set alerts at 80% log space usage. This gives you time to react before reaching 100%. When the alert fires, take an immediate log backup and investigate why usage is higher than normal.
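In SQL Server, that 80% alert can be implemented as an Agent alert on the Percent Log Used performance counter. A sketch; the exact performance-condition string can vary by instance, and the operator name is a hypothetical that must already exist:

```sql
USE msdb;
GO

-- Fire when OrdersDB log usage exceeds 80%
EXEC dbo.sp_add_alert
    @name = N'OrdersDB log > 80 percent full',
    @performance_condition = N'Databases|Percent Log Used|OrdersDB|>|80';

-- Notify an existing operator by email (operator name is hypothetical)
EXEC dbo.sp_add_notification
    @alert_name = N'OrdersDB log > 80 percent full',
    @operator_name = N'DBA Team',
    @notification_method = 1;    -- 1 = email
```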
Log management is often overlooked until something goes wrong. But proper log management is essential for database durability, recoverability, and operational stability. Neglecting logs leads to outages; mastering logs prevents them.
What's next:
Logs keep databases running. But the software itself needs maintenance too. In the next page, we'll explore Patching—how to plan, test, and apply database patches with minimal risk and downtime.
You now understand transaction log architecture, sizing strategies, backup and archival procedures, log shipping, rotation policies, and troubleshooting techniques. Apply these principles to keep your databases reliable and recoverable.