You've decided to decompose your shared database. You understand why Database per Service is the right architecture. Now comes the hard part: actually moving the data.
Data migration in a live system is one of the most challenging operations in software engineering. Unlike code deployments, which can often be rolled back instantly, data migrations involve state changes that are much harder to reverse. A botched migration can mean hours of downtime, data corruption, or worse—permanent data loss.
The stakes are high, but the techniques are well-established. With careful planning and the right patterns, you can migrate data safely, incrementally, and with minimal disruption.
This page provides comprehensive coverage of data migration strategies. You'll learn the Parallel Run pattern for safe migrations, the Strangler Fig approach applied to data, techniques for maintaining data synchronization during migration, strategies for handling rollback, and practical considerations for executing migrations in production systems.
The single most important principle in data migration is: never do a big-bang migration if you can possibly avoid it.
A big-bang migration attempts to move all data at once, typically during a maintenance window. While conceptually simple, this approach carries extreme risk: there is no way to validate the migration incrementally, the cutover is all-or-nothing, and if anything goes wrong mid-migration, recovery means restoring from backups while the system stays down.
The Incremental Migration Mindset
Incremental migration means moving data in small, verifiable steps: keep both old and new databases running, migrate one slice at a time, validate each step before proceeding, and retain the ability to roll back at any point.
Modern users expect 24/7 availability. Your migration strategy should target zero downtime. While brief "freeze" periods might be necessary for final cutover, the goal is that users never notice the migration happening. Every technique in this page aims for this goal.
The Parallel Run Pattern is the most robust approach to database migration. It involves running both the old and new databases simultaneously, with mechanisms to keep them synchronized and compare their outputs.
How it works:
```
Phase 1: Initial Sync

  Shared Database (Primary)
  ├── users table
  ├── orders table
  └── products table
        │
        │  Initial data copy (bulk migration)
        ▼
  User Service Database (Shadow)
  └── users table (copy of users, read-only initially)

Phase 2: Dual Write

  User Service
  │
  ├── Write ──> Shared Database (Primary source of truth)
  │        └──> User Service DB (Shadow, verified for correctness)
  │
  └── Read ───> Shared Database (still primary)

Phase 3: Shadow Read Comparison

  User Service
  │
  └── Read ───> Shared Database (returns response to user)
           └──> User Service DB (compared, discrepancies logged)

  Comparison engine logs any differences
  Team investigates and resolves before cutover

Phase 4: Traffic Shift

  User Service
  │
  ├── Read ─[10%]─> User Service DB (new primary)
  │        ─[90%]─> Shared Database
  │
  └── Write ──────> Both databases (dual write continues)

Phase 5: Complete Cutover

  User Service
  │
  ├── Read  ─────> User Service DB (100%)
  └── Write ─────> User Service DB (new source of truth)

  Shared Database (read-only backup, pending decommission)
```

Implementation: Dual Write with Comparison
The critical mechanism is dual-write combined with read comparison:
```typescript
class UserRepository {
  constructor(
    private legacyDb: LegacyDatabase,
    private newDb: NewDatabase,
    private migrationConfig: MigrationConfig,
  ) {}

  async createUser(userData: CreateUserData): Promise<User> {
    // Always write to legacy (source of truth during migration)
    const user = await this.legacyDb.users.create(userData);

    // Also write to new database (shadow)
    try {
      await this.newDb.users.create({
        id: user.id, // Use same ID for correlation
        ...userData,
      });
    } catch (error) {
      // Log but don't fail - shadow write failure is not critical
      this.metrics.increment('migration.shadow_write_failure');
      this.logger.error('Shadow write failed', { error, userId: user.id });
    }

    return user;
  }

  async getUser(userId: string): Promise<User | null> {
    // Read from legacy (source of truth)
    const legacyUser = await this.legacyDb.users.findById(userId);

    // Optionally compare with new database
    if (this.migrationConfig.enableReadComparison) {
      this.compareInBackground(userId, legacyUser);
    }

    // Optionally read from new database based on traffic percentage
    if (this.shouldReadFromNew()) {
      return this.newDb.users.findById(userId);
    }

    return legacyUser;
  }

  private async compareInBackground(userId: string, legacyUser: User | null) {
    // Non-blocking comparison
    setImmediate(async () => {
      try {
        const newUser = await this.newDb.users.findById(userId);
        const discrepancies = this.findDiscrepancies(legacyUser, newUser);

        if (discrepancies.length > 0) {
          this.metrics.increment('migration.read_discrepancy');
          this.logger.warn('Data discrepancy detected', {
            userId,
            discrepancies,
          });
        }
      } catch (error) {
        this.logger.error('Comparison failed', { error, userId });
      }
    });
  }

  private shouldReadFromNew(): boolean {
    // Gradual traffic shift based on configuration
    const percentage = this.migrationConfig.newDbReadPercentage;
    return Math.random() * 100 < percentage;
  }
}
```

Read comparisons can be expensive. Consider sampling (compare 1% of reads) rather than comparing every read.
Use background jobs to periodically reconcile the entire dataset. Log discrepancies with enough context to investigate root causes.
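The sampling decision is best made deterministic: hash the record's ID so the same slice of records is compared on every read, which makes recurring discrepancies on a single record easy to spot in logs. A minimal sketch, using a small FNV-1a hash purely for illustration (any stable hash works):

```typescript
// Illustrative FNV-1a hash; any stable, well-distributed hash is fine here.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Compare only `samplePercent` of reads, but always the SAME keys,
// so a record with a persistent discrepancy shows up repeatedly.
function shouldCompare(userId: string, samplePercent: number): boolean {
  return fnv1a(userId) % 100 < samplePercent;
}
```

Unlike `Math.random()`-based sampling, this keeps the comparison set stable across requests and across service instances.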
Before parallel running can begin, you need to populate the new database with existing data. This initial synchronization is often the most time-consuming part of the migration.
Approach 1: Direct Database Copy
For smaller datasets, direct tools work well:
```bash
# PostgreSQL: pg_dump piped into psql
pg_dump -h legacy-db -d monolith -t users -t user_profiles \
  | psql -h user-service-db -d userservice

# MySQL: mysqldump
mysqldump -h legacy-db monolith users user_profiles \
  | mysql -h user-service-db userservice

# AWS DMS for cloud migrations
aws dms create-replication-task \
  --replication-task-identifier user-migration \
  --source-endpoint-arn arn:aws:dms:...:endpoint:source \
  --target-endpoint-arn arn:aws:dms:...:endpoint:target \
  --migration-type full-load-and-cdc \
  --table-mappings file://user-table-mappings.json
```

Approach 2: ETL Pipeline for Transformations
When the new schema differs from the legacy schema, an ETL pipeline handles transformation:
```typescript
class UserMigrationPipeline {
  async runInitialMigration() {
    const batchSize = 10000;
    let offset = 0;
    let hasMore = true;

    while (hasMore) {
      // Extract: Fetch batch from legacy database
      const legacyUsers = await this.legacyDb.query(`
        SELECT id, email, first_name, last_name, phone,
               street_address, city, state, zip, country,
               created_at, updated_at
        FROM users
        ORDER BY id
        LIMIT $1 OFFSET $2
      `, [batchSize, offset]);

      if (legacyUsers.length === 0) {
        hasMore = false;
        continue;
      }

      // Transform: Convert to new schema
      const transformedUsers = legacyUsers.map(legacy => ({
        id: legacy.id,
        email: legacy.email.toLowerCase(), // Normalize email
        name: {
          first: legacy.first_name,
          last: legacy.last_name,
        },
        contact: {
          phone: this.normalizePhone(legacy.phone),
        },
        address: {
          street: legacy.street_address,
          city: legacy.city,
          state: legacy.state,
          postalCode: legacy.zip,
          country: this.normalizeCountry(legacy.country),
        },
        metadata: {
          migratedAt: new Date(),
          legacyId: legacy.id,
        },
        createdAt: legacy.created_at,
        updatedAt: legacy.updated_at,
      }));

      // Load: Insert into new database
      await this.newDb.users.bulkInsert(transformedUsers);

      // Progress tracking
      offset += batchSize;
      this.logger.info(`Migrated ${offset} users`);

      // Throttle to avoid overwhelming databases
      await this.sleep(100);
    }

    this.logger.info('Initial migration complete');
  }

  private normalizePhone(phone: string): string {
    // Remove non-numeric characters, format consistently
    return phone?.replace(/\D/g, '') || null;
  }

  private normalizeCountry(country: string): string {
    // Convert to ISO country code
    return countryCodeMapping[country?.toLowerCase()] || country;
  }
}
```

Approach 3: Change Data Capture (CDC)
For large datasets where initial copy takes hours or days, CDC ensures you don't fall behind during the copy:
```
Timeline of CDC Migration:

T=0:         Start CDC capture on legacy database
             All changes are captured and queued

T=0 to T+8h: Run initial bulk copy
             - Legacy: 10 million records copied
             - CDC queue: 50,000 changes accumulated

T+8h:        Apply accumulated CDC changes
             - Process 50,000 queued changes
             - Takes 10 minutes

T+8h 10m:    Steady state
             - Bulk copy done
             - CDC queue drained
             - Real-time sync begins

CDC Workflow:

  Legacy DB ──[Change Log/WAL]──> CDC Connector
                                       │
                                       ▼
                              Kafka / Event Queue
                                       │
                                       ▼
  New DB <───────────────────── CDC Consumer
           applies changes
```

Popular CDC tools include Debezium (open source, works with Kafka), AWS DMS, Google Datastream, and Azure Data Factory. These tools capture changes at the database log level, ensuring no changes are missed even during high-load periods.
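As a concrete illustration, a Debezium connector is typically registered by POSTing a JSON configuration to Kafka Connect. The sketch below assumes the Postgres connector; hostnames, credentials, and the table list are placeholders, and exact property names vary by Debezium version (for example, `topic.prefix` replaced the older `database.server.name`):

```json
{
  "name": "legacy-users-cdc",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "legacy-db",
    "database.port": "5432",
    "database.user": "cdc_user",
    "database.password": "********",
    "database.dbname": "monolith",
    "topic.prefix": "legacy",
    "table.include.list": "public.users,public.user_profiles"
  }
}
```

A consumer on the resulting `legacy.public.users` topic would then apply each change event to the new database.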
The trickiest part of migration is handling writes while both databases are active. Several strategies exist, each with tradeoffs.
Strategy 1: Legacy-Primary Dual Write
Writes go to legacy first, then replicate to new. Legacy remains the source of truth.
```typescript
async createUser(userData: CreateUserData): Promise<User> {
  // Step 1: Write to legacy (synchronous, must succeed)
  const user = await this.legacyDb.users.create(userData);

  // Step 2: Replicate to new (async, failure tolerated)
  this.replicateToNew(user).catch(err => {
    this.logger.error('Replication failed', { err, userId: user.id });
    this.queueForRetry(user); // Will retry later
  });

  return user;
}

// Periodic job catches up any missed replications
async reconcileDatabases() {
  const legacyUsers = await this.legacyDb.users.findModifiedSince(
    this.lastReconcileTime
  );

  for (const user of legacyUsers) {
    await this.replicateToNew(user);
  }
}
```

Strategy 2: Transaction Outbox Pattern
For reliable replication without distributed transactions, use the outbox pattern:
```typescript
async createUser(userData: CreateUserData): Promise<User> {
  // Single transaction ensures atomicity
  return this.legacyDb.transaction(async (tx) => {
    // Create the user
    const user = await tx.users.create(userData);

    // Record the change in outbox table (same transaction)
    await tx.outbox.create({
      aggregateType: 'User',
      aggregateId: user.id,
      eventType: 'UserCreated',
      payload: JSON.stringify(user),
      status: 'pending',
      createdAt: new Date(),
    });

    return user;
  });
}

// Separate process polls outbox and applies to new database
class OutboxProcessor {
  async process() {
    const pendingEvents = await this.legacyDb.outbox.findByStatus('pending');

    for (const event of pendingEvents) {
      try {
        await this.applyToNewDatabase(event);
        await this.legacyDb.outbox.updateStatus(event.id, 'processed');
      } catch (error) {
        await this.legacyDb.outbox.updateStatus(event.id, 'failed');
        this.logger.error('Outbox processing failed', { event, error });
      }
    }
  }
}
```

Strategy 3: New-Primary with Backfill
In the final phase of migration, the new database becomes primary:
```typescript
class UserRepository {
  constructor(private config: MigrationConfig) {}

  async createUser(userData: CreateUserData): Promise<User> {
    if (this.config.newDatabaseIsPrimary) {
      // New database is source of truth
      const user = await this.newDb.users.create(userData);

      // Backfill to legacy for rollback safety
      this.backfillToLegacy(user).catch(err => {
        this.logger.warn('Legacy backfill failed', { err, userId: user.id });
        // Not critical - legacy is no longer primary
      });

      return user;
    } else {
      // Legacy is still primary (previous code path)
      return this.legacyPrimaryCreate(userData);
    }
  }
}
```

| Strategy | Complexity | Consistency | Rollback Safety |
|---|---|---|---|
| Legacy-Primary Dual Write | Low | Eventual (new may lag) | Excellent |
| Transaction Outbox | Medium | Guaranteed delivery | Excellent |
| CDC Streaming | High | Near real-time | Excellent |
| New-Primary with Backfill | Low | New is authoritative | Good (requires backfill) |
The cardinal rule of write handling: never lose a write. If replication fails, queue for retry. If the queue fails, log persistently. Have reconciliation jobs that catch any gaps. Audit regularly to ensure counts match between databases.
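The retry part of that rule can be sketched as a small queue that never drops a failed write. This is an in-memory illustration only; in production the queue would be durable (a database table or log), and `replicate` stands in for whatever replication call your system uses:

```typescript
// Sketch of a never-drop retry queue for failed shadow writes.
// In production, back this with durable storage, not process memory.
type PendingWrite = { payload: unknown; attempts: number };

class RetryQueue {
  private pending: PendingWrite[] = [];

  enqueue(payload: unknown): void {
    this.pending.push({ payload, attempts: 0 });
  }

  // Attempt every queued write once; failures stay queued for the next pass.
  async drain(replicate: (payload: unknown) => Promise<void>): Promise<number> {
    let replicated = 0;
    const stillFailing: PendingWrite[] = [];

    for (const item of this.pending) {
      try {
        await replicate(item.payload);
        replicated++;
      } catch {
        item.attempts++;          // track attempts for alerting/backoff
        stillFailing.push(item);  // never drop: retained for the next drain
      }
    }
    this.pending = stillFailing;
    return replicated;
  }

  get depth(): number {
    return this.pending.length;
  }
}
```

Queue depth is itself a metric worth alerting on: a steadily growing queue means replication is falling behind.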
Once the new database is synchronized and verified, you can begin shifting traffic. The goal is gradual, controllable, reversible traffic migration.
Percentage-Based Traffic Split
The most common approach: route a percentage of traffic to the new database, increasing over time.
```typescript
class UserRepository {
  async getUser(userId: string): Promise<User | null> {
    const config = await this.featureFlags.get('user_db_migration');

    // Route based on percentage
    if (this.shouldUseNewDatabase(config.readPercentage)) {
      try {
        return await this.newDb.users.findById(userId);
      } catch (error) {
        // Fallback to legacy on error
        this.metrics.increment('migration.new_db_fallback');
        return this.legacyDb.users.findById(userId);
      }
    }

    return this.legacyDb.users.findById(userId);
  }

  private shouldUseNewDatabase(percentage: number): boolean {
    // Deterministic routing based on request ID for consistency
    const requestHash = hash(this.requestContext.requestId);
    return (requestHash % 100) < percentage;
  }
}

// Progressive rollout schedule
// Day 1:  1% read traffic to new DB, monitor closely
// Day 2:  5% read traffic if Day 1 successful
// Day 3:  10% read traffic
// Day 5:  25% read traffic
// Day 7:  50% read traffic
// Day 10: 100% read traffic
// Day 14: Switch write traffic to new DB as primary
```

User/Segment-Based Migration
For more control, migrate specific user segments first:
```typescript
class UserRepository {
  async getUser(userId: string): Promise<User | null> {
    const user = await this.getRoutingInfo(userId);

    // Internal users first
    if (user.isInternalUser) {
      return this.newDb.users.findById(userId);
    }

    // Then beta users who opted in
    if (user.isBetaTester) {
      return this.newDb.users.findById(userId);
    }

    // Then users by region (easier to support during business hours)
    if (await this.isMigratedRegion(user.region)) {
      return this.newDb.users.findById(userId);
    }

    // Everyone else stays on legacy until their segment is migrated
    return this.legacyDb.users.findById(userId);
  }

  private async isMigratedRegion(region: string): Promise<boolean> {
    const migratedRegions = ['US-WEST', 'EU-WEST']; // Gradually expand
    return migratedRegions.includes(region);
  }
}
```

Sticky Routing
Ensure a user's traffic consistently goes to the same database during their session to avoid confusing experiences:
```typescript
class DatabaseRouter {
  async route(userId: string): Promise<Database> {
    // Use consistent hashing for deterministic routing
    const bucket = consistentHash(userId) % 100;
    const cutoverBucket = await this.featureFlags.get('migration_cutover_bucket');

    if (bucket < cutoverBucket) {
      return this.newDb;
    }
    return this.legacyDb;
  }
}

// The same user always routes to the same database
// Moving cutoverBucket from 0 to 100 gradually migrates all users
// User with bucket=25 moves when cutoverBucket reaches 26
// Prevents user from flip-flopping between databases
```

During traffic shifting, monitor aggressively: latency percentiles, error rates, data discrepancy counts, and business metrics. Automated alerts should trigger if any metric degrades beyond threshold. Use feature flags that can instantly route 100% traffic back to legacy if problems emerge.
A robust migration plan includes detailed rollback procedures at every stage. If anything goes wrong, you must be able to return to a known-good state quickly.
Level 1: Traffic Rollback (Instant)
The fastest rollback: redirect all traffic back to the legacy database.
```typescript
// Emergency rollback - takes effect immediately
await featureFlags.set('user_db_migration', {
  readPercentage: 0,   // All reads go to legacy
  writeToNew: false,   // Stop writing to new DB
  newIsPrimary: false, // Ensure legacy is source of truth
});

// Or via CLI/API call:
// $ curl -X POST https://api.featureflags.io/flags/user_db_migration/disable

// Alert team
await alerting.critical('Database migration rolled back', {
  reason: 'Error rate exceeded threshold',
  rollbackTime: new Date(),
  trafficPercentage: previousPercentage,
});
```

Level 2: Data Rollback (Careful)
If data in the new database has diverged incorrectly, you may need to correct it:
```typescript
// If new DB has incorrect data, re-sync from legacy
class MigrationRollback {
  async resyncFromLegacy(startTime: Date, endTime: Date) {
    // Find all records modified during the problematic period
    const affectedRecords = await this.legacyDb.users.findModifiedBetween(
      startTime,
      endTime
    );

    this.logger.warn(`Rolling back ${affectedRecords.length} records`);

    for (const legacyRecord of affectedRecords) {
      // Overwrite new DB with legacy data
      await this.newDb.users.upsert(
        this.transformForNewSchema(legacyRecord)
      );

      // Log for audit
      await this.auditLog.record({
        action: 'migration_rollback',
        recordId: legacyRecord.id,
        reason: 'data_correction',
        rolledBackAt: new Date(),
      });
    }
  }
}
```

Level 3: Post-Cutover Rollback (Complex)
If you've already cut over to the new database as primary and need to roll back, you must sync changes back to legacy:
```
Post-Cutover Rollback Procedure:

1. STOP:   Halt all writes (brief downtime may be necessary)

2. SYNC:   Apply all changes from new DB back to legacy
           - Query new DB for all records modified since cutover
           - Transform to legacy schema
           - Apply to legacy DB

3. VERIFY: Ensure legacy has all data
           - Run reconciliation queries
           - Compare record counts
           - Validate critical business data

4. SWITCH: Redirect traffic to legacy
           - Update feature flags
           - Verify traffic is flowing to legacy
           - Monitor closely

5. RESUME: Allow writes
           - Legacy is now primary again
           - New DB becomes shadow (or is paused)

Timeline: 15-60 minutes depending on data volume and verification needs
```

The ability to roll back degrades over time. Once you've been running on the new database for weeks and the legacy database is stale, rollback becomes a major undertaking. Define a "point of no return" and ensure you're confident before crossing it. Keep legacy data synchronized for as long as practically possible.
Continuous validation ensures the migration is proceeding correctly. Never assume data made it—verify.
Real-Time Comparison
Compare results from both databases in real-time:
```typescript
class DataComparator {
  async compareRead(userId: string): Promise<void> {
    const [legacyResult, newResult] = await Promise.all([
      this.legacyDb.users.findById(userId),
      this.newDb.users.findById(userId),
    ]);

    const comparison = this.compare(legacyResult, newResult);

    if (!comparison.isEqual) {
      this.metrics.increment('migration.comparison.mismatch');

      await this.discrepancyLog.record({
        entity: 'User',
        entityId: userId,
        differences: comparison.differences,
        legacyValue: this.redact(legacyResult),
        newValue: this.redact(newResult),
        timestamp: new Date(),
      });

      // Alert if mismatch rate exceeds threshold
      if (await this.mismatchRateExceedsThreshold()) {
        await this.alertMigrationTeam('Mismatch rate too high');
      }
    } else {
      this.metrics.increment('migration.comparison.match');
    }
  }

  private compare(legacy: any, newRecord: any): ComparisonResult {
    const differences: Difference[] = [];

    // Compare each field, accounting for schema transformations
    for (const field of this.comparisonFields) {
      const legacyValue = this.extractField(legacy, field.legacyPath);
      const newValue = this.extractField(newRecord, field.newPath);

      if (!field.comparator(legacyValue, newValue)) {
        differences.push({
          field: field.name,
          legacyValue,
          newValue,
        });
      }
    }

    return {
      isEqual: differences.length === 0,
      differences,
    };
  }
}
```

Batch Reconciliation Jobs
Periodic full reconciliation catches any discrepancies missed by real-time comparison:
```typescript
class ReconciliationJob {
  async runFullReconciliation(): Promise<ReconciliationReport> {
    const report: ReconciliationReport = {
      startTime: new Date(),
      totalLegacyRecords: 0,
      totalNewRecords: 0,
      missingInNew: [],
      missingInLegacy: [],
      mismatches: [],
    };

    // Get all IDs from both databases
    const legacyIds = new Set(await this.legacyDb.users.getAllIds());
    const newIds = new Set(await this.newDb.users.getAllIds());

    report.totalLegacyRecords = legacyIds.size;
    report.totalNewRecords = newIds.size;

    // Find records missing in new database
    for (const id of legacyIds) {
      if (!newIds.has(id)) {
        report.missingInNew.push(id);
      }
    }

    // Find records in new but not in legacy (shouldn't happen normally)
    for (const id of newIds) {
      if (!legacyIds.has(id)) {
        report.missingInLegacy.push(id);
      }
    }

    // Compare content of matching records (sample for large datasets)
    const sampleIds = this.sample(Array.from(legacyIds), 10000);
    for (const id of sampleIds) {
      const [legacy, newRecord] = await Promise.all([
        this.legacyDb.users.findById(id),
        this.newDb.users.findById(id),
      ]);

      if (!this.recordsMatch(legacy, newRecord)) {
        report.mismatches.push({
          id,
          differences: this.findDifferences(legacy, newRecord),
        });
      }
    }

    report.endTime = new Date();
    return report;
  }
}
```

| Checkpoint | What to Verify | Action if Failed |
|---|---|---|
| After initial sync | Record counts match; sample data matches | Re-run sync; investigate gaps |
| During dual-write | Writes appear in both DBs within SLA | Check replication; fix and resync |
| Before traffic shift | Full reconciliation passes | Do not proceed until resolved |
| During traffic shift | Error rates stable; latency acceptable | Roll back traffic percentage |
| After cutover | Business metrics normal | Keep legacy hot; prepare rollback |
Automate validation gates in your migration pipeline. Traffic shift to the next percentage level should be blocked until validation passes. Human approval should be required for major milestones (50%, 100%, write cutover).
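A gate of this kind reduces to a pure decision function, which makes it easy to unit-test. The check names, milestone percentages, and return shape below are illustrative, not prescriptive:

```typescript
// Sketch: decide whether the migration may advance to the next traffic step.
type GateCheck = { name: string; passed: boolean };
type GateDecision = { allowed: boolean; reason?: string };

function canAdvance(
  checks: GateCheck[],
  nextPercentage: number,
  hasHumanApproval: boolean,
): GateDecision {
  // Any failed validation blocks progression outright
  const failed = checks.filter(c => !c.passed);
  if (failed.length > 0) {
    return {
      allowed: false,
      reason: `failed checks: ${failed.map(c => c.name).join(', ')}`,
    };
  }

  // Major milestones additionally require explicit human sign-off
  const milestones = [50, 100];
  if (milestones.includes(nextPercentage) && !hasHumanApproval) {
    return { allowed: false, reason: 'human approval required for milestone' };
  }

  return { allowed: true };
}
```

Wiring this into the deployment pipeline means a traffic increase simply cannot ship while any validation is red.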
One of the trickiest aspects of database decomposition is handling foreign key relationships that span what will become separate databases.
The Problem: Cross-Service Foreign Keys
In a shared database, foreign keys enforce referential integrity:
```sql
-- Current state: Foreign keys across domains
CREATE TABLE orders (
  id UUID PRIMARY KEY,
  user_id UUID NOT NULL REFERENCES users(id),       -- Will be in different DB
  product_id UUID NOT NULL REFERENCES products(id), -- Will be in different DB
  created_at TIMESTAMP
);

-- Problem: When orders moves to its own database,
-- these foreign keys cannot exist across databases
```

Solution: Application-Level Referential Integrity
Replace database foreign keys with application-level validation:
```typescript
class OrderService {
  async createOrder(orderData: CreateOrderInput): Promise<Order> {
    // Validate references exist before proceeding
    await this.validateReferences(orderData);

    // Store only the ID reference, not a foreign key
    const order = await this.orderDb.orders.create({
      id: generateId(),
      userId: orderData.userId, // Just an ID, no FK constraint
      productIds: orderData.items.map(i => i.productId), // Just IDs
      ...orderData,
    });

    return order;
  }

  private async validateReferences(orderData: CreateOrderInput): Promise<void> {
    // Check user exists via User Service API
    const user = await this.userServiceClient.getUser(orderData.userId);
    if (!user) {
      throw new ValidationError(`User ${orderData.userId} not found`);
    }
    if (!user.canPlaceOrders) {
      throw new ValidationError(`User ${orderData.userId} cannot place orders`);
    }

    // Check products exist via Product Service API
    const productIds = orderData.items.map(i => i.productId);
    const products = await this.productServiceClient.getProducts(productIds);

    const foundIds = new Set(products.map(p => p.id));
    const missingIds = productIds.filter(id => !foundIds.has(id));

    if (missingIds.length > 0) {
      throw new ValidationError(`Products not found: ${missingIds.join(', ')}`);
    }
  }
}
```

Migration Sequence for FK Dependencies
The order of migration matters when foreign keys are involved:
```
Dependency graph:
  orders -> users    (orders.user_id references users.id)
  orders -> products (orders.product_id references products.id)

Migration sequence:

Step 1: Add application-level validation (while FKs still exist)
  - Order Service validates via API before creating orders
  - Both validation paths active: app + DB FK

Step 2: Drop foreign key constraints
  ALTER TABLE orders DROP CONSTRAINT orders_user_id_fkey;
  ALTER TABLE orders DROP CONSTRAINT orders_product_id_fkey;
  - Application validation is now the only enforcement
  - Test thoroughly

Step 3: Migrate tables to separate databases
  - users    -> User Service DB
  - products -> Product Service DB
  - orders   -> Order Service DB

Step 4: Handle orphaned references (cleanup)
  - Find any orders referencing non-existent users
  - Decide: soft-delete, archive, or flag for review
```

Once you remove FK constraints, orphaned references become possible. A user could be deleted while orders referencing them exist. Your application must handle this gracefully: show 'deleted user' instead of crashing, or implement soft deletes, or use event-driven cleanup to cascade deletions.
Successful migrations require robust tooling. Manual processes at scale are error-prone and exhausting for engineers.
Essential migration tooling includes synchronization pipelines, comparison and reconciliation engines, a migration dashboard, and automated runbooks wired to feature flags.
Migration Dashboard
A dedicated migration dashboard provides visibility into progress:
```
┌─────────────────────────────────────────────────────────────────────┐
│                  USER SERVICE MIGRATION DASHBOARD                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  SYNC STATUS                        TRAFFIC ROUTING                 │
│  ════════════                       ═══════════════                 │
│  Legacy records: 10,234,567         Read traffic:                   │
│  New DB records: 10,234,565           ├── Legacy: 60%               │
│  Sync lag: 2 records                  └── New DB: 40%               │
│  Last sync: 2 seconds ago                                           │
│                                     Write traffic:                  │
│                                       └── Both (dual write)         │
│                                                                     │
│  DATA QUALITY                       PERFORMANCE                     │
│  ════════════                       ═══════════                     │
│  Comparison checks: 1,234,567       Legacy p99: 45ms                │
│  Matches: 1,234,550                 New DB p99: 38ms                │
│  Mismatches: 17 (0.001%)            Error rate: 0.01%               │
│  Last mismatch: 12 min ago                                          │
│                                                                     │
│  ROLLBACK STATUS                    MIGRATION PHASE                 │
│  ═══════════════                    ═══════════════                 │
│  Rollback ready: ✓ Yes              Phase: Traffic Shifting         │
│  Legacy current: ✓ Yes              Progress: 40% reads to new      │
│  Backfill running: ✓ Yes            Next milestone: 50% (Day 7)     │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```

Runbook Automation
Codify migration procedures as automated runbooks:
```typescript
class MigrationRunbook {
  async increaseTrafficToNewDatabase(targetPercentage: number) {
    // Pre-flight checks
    await this.validatePreConditions([
      'sync_lag_under_threshold',
      'mismatch_rate_acceptable',
      'new_db_latency_acceptable',
      'error_rate_acceptable',
    ]);

    // Record current state for rollback
    const previousPercentage = await this.getCurrentPercentage();
    await this.recordCheckpoint(previousPercentage);

    // Gradual increase (not instant jump)
    const steps = this.calculateSteps(previousPercentage, targetPercentage);
    for (const step of steps) {
      await this.setTrafficPercentage(step.percentage);
      await this.wait(step.observationPeriodMs);

      const health = await this.checkSystemHealth();
      if (!health.isHealthy) {
        await this.rollbackToPercentage(previousPercentage);
        throw new MigrationError(`Health check failed at ${step.percentage}%`);
      }
    }

    await this.notifyTeam(`Traffic increased to ${targetPercentage}%`);
  }
}
```

The time spent building migration tooling pays dividends. You'll likely decompose multiple services over time, and robust tooling can be reused. Consider migration tooling as platform investment, not one-time project cost.
Data migration is challenging but manageable with the right strategies and tooling. The key is incremental, verifiable progress with rollback capability at every step.
What's next:
The next page addresses one of the major challenges introduced by Database per Service: Handling Joins Across Services. When you can no longer JOIN tables across services, how do you handle queries that need data from multiple sources? We'll explore API composition, data denormalization, and CQRS patterns.
You now have a thorough understanding of data migration strategies for database decomposition. These patterns—parallel run, dual write, CDC, traffic shifting, and continuous validation—form the practical toolkit for safely moving data to service-specific databases.