Extracting functionality from a monolith is like performing surgery on a patient who must remain conscious and active throughout the procedure. You're removing pieces of a living, running system while ensuring it continues to serve millions of requests. One wrong cut severs a critical dependency; one forgotten connection leaves orphaned functionality.
This is the most technically demanding phase of the Strangler Fig Pattern. The routing façade gave you the traffic control capability. Now you must identify what to extract, how to extract it cleanly, and how to maintain correctness during the transition.
The fundamental challenge:
Monoliths, especially long-lived ones, develop hidden dependencies—code paths that cross module boundaries, data relationships that span domains, and implicit contracts that nobody documented. Extraction requires making these invisible connections visible, then surgically severing them while establishing new, explicit interfaces.
By the end of this page, you will understand how to identify extraction candidates, techniques for boundary discovery, strategies for dependency management, patterns for data migration, and methods for validating extraction completeness.
Not all parts of a monolith should be extracted at the same time, and some perhaps never should be. The art of successful migration begins with identifying the right first candidates—functionality that will demonstrate value quickly while minimizing risk.
The Ideal First Extraction:
An ideal first candidate has these characteristics:
| Factor | Low Risk (Prefer) | Medium Risk | High Risk (Avoid Initially) |
|---|---|---|---|
| Dependencies | No shared database tables | Read-only access to shared tables | Write access to shared tables |
| Data Ownership | Owns all its data exclusively | Owns some, references others | Heavily intertwined with other domains |
| Team Expertise | Dedicated team understands it well | Mixed ownership, documentation exists | Nobody remembers how it works |
| Change Frequency | Actively developed, well-tested | Occasional changes, moderate tests | Rarely touched, minimal tests |
| Business Criticality | Important but not critical | Core but has redundancy | Single point of failure, revenue-critical |
| Technical Debt | Clean, modular code | Some debt, manageable | Spaghetti code, unclear boundaries |
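One way to make the table above concrete is a rough scoring rubric. This is a minimal sketch, not a standard: the factor names and point weights are illustrative assumptions, and any real rubric should be calibrated to your organization.

```typescript
// A rough scoring rubric derived from the risk table above.
// Factor names and point weights are illustrative assumptions.
type Risk = 'low' | 'medium' | 'high';

interface CandidateAssessment {
  dependencies: Risk;
  dataOwnership: Risk;
  teamExpertise: Risk;
  changeFrequency: Risk;
  businessCriticality: Risk;
  technicalDebt: Risk;
}

// High risk is weighted more than linearly: one 'high' factor should
// outweigh several 'medium' ratings.
const riskPoints: Record<Risk, number> = { low: 0, medium: 1, high: 3 };

// Lower totals suggest better first extraction candidates.
function extractionRiskScore(a: CandidateAssessment): number {
  return Object.values(a).reduce((sum, r) => sum + riskPoints[r], 0);
}

// Example: a notification service that mostly owns its data.
const notifications: CandidateAssessment = {
  dependencies: 'low',
  dataOwnership: 'low',
  teamExpertise: 'medium',
  changeFrequency: 'low',
  businessCriticality: 'low',
  technicalDebt: 'medium',
};
```

Comparing scores across several candidates gives you a defensible ordering for the first few extractions, even if the absolute numbers are arbitrary.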
Common First Extraction Candidates:
Notification Services: Email, SMS, and push notifications are typically loosely coupled and communicate via events. They often have independent scaling needs during campaigns.
Image/File Processing: Uploading, transforming, storing files. Usually isolated with clear data ownership and benefits from specialized scaling.
Search Indexing: Building and maintaining search indices. Often a write-behind process that can be extracted without affecting read paths initially.
Analytics/Reporting: Event processing and report generation. Frequently read-only from business data and benefits from different technology choices.
Authentication/Authorization: If not already centralized, auth is a clear bounded context with well-defined contracts.
Avoid starting with core transaction processing, complex workflow orchestration, or functionality with heavy database coupling. These are high-risk extractions that should come later after you've built migration expertise.
Start at the edges and work inward. Functionality at the edges of your monolith (user-facing APIs, background jobs, integrations) typically has fewer internal dependencies than core business logic. Each extraction peels back another layer, gradually exposing the core for later extraction.
Before you can extract functionality, you must understand its actual boundaries—not what the documentation says, not what architects intended, but what the code actually does. This requires systematic discovery.
Static Analysis Approach:
Use code analysis tools to map dependencies:
```typescript
interface ModuleDependency {
  source: string;
  target: string;
  type: 'import' | 'function-call' | 'database' | 'api' | 'event';
  weight: number; // Frequency or importance
}

interface BoundaryAnalysis {
  internalDependencies: ModuleDependency[];
  externalDependencies: ModuleDependency[];
  dataAccessPatterns: DataAccessPattern[];
  apiSurface: ApiEndpoint[];
  eventDependencies: EventDependency[];
}

class BoundaryDiscovery {
  /**
   * Analyze a proposed extraction boundary
   */
  async analyzeExtractionBoundary(
    candidateModules: string[],
    codebase: Codebase
  ): Promise<BoundaryAnalysis> {
    const allDependencies = await this.mapDependencies(codebase);
    const candidateSet = new Set(candidateModules);

    const internalDependencies: ModuleDependency[] = [];
    const externalDependencies: ModuleDependency[] = [];

    for (const dep of allDependencies) {
      const sourceInCandidate = candidateSet.has(dep.source);
      const targetInCandidate = candidateSet.has(dep.target);

      if (sourceInCandidate && targetInCandidate) {
        // Both ends inside candidate - internal dependency
        internalDependencies.push(dep);
      } else if (sourceInCandidate || targetInCandidate) {
        // One end outside - this crosses the extraction boundary
        externalDependencies.push(dep);
      }
      // Neither in candidate - not relevant to this extraction
    }

    // Analyze data access patterns
    const dataPatterns = await this.analyzeDataAccess(candidateModules, codebase);

    // Map API surface (what external code calls into candidate)
    const apiSurface = await this.mapApiSurface(candidateModules, allDependencies);

    // Map event dependencies (events produced and consumed)
    const eventDeps = await this.mapEventDependencies(candidateModules, codebase);

    return {
      internalDependencies,
      externalDependencies,
      dataAccessPatterns: dataPatterns,
      apiSurface,
      eventDependencies: eventDeps,
    };
  }

  /**
   * Calculate extraction complexity score
   */
  calculateExtractionComplexity(analysis: BoundaryAnalysis): number {
    let score = 0;

    // Each external dependency adds complexity
    score += analysis.externalDependencies.length * 10;

    // Write access to shared tables is very complex
    for (const pattern of analysis.dataAccessPatterns) {
      if (!pattern.isOwnedByCandidate) {
        score += pattern.hasWriteAccess ? 50 : 20;
      }
    }

    // Large API surface means more contracts to maintain
    score += analysis.apiSurface.length * 5;

    // Event dependencies require careful handling
    score += analysis.eventDependencies.length * 8;

    return score;
  }

  /**
   * Generate extraction plan based on analysis
   */
  generateExtractionPlan(analysis: BoundaryAnalysis): ExtractionPlan {
    return {
      // Dependencies that must become API calls
      apisToCreate: analysis.externalDependencies
        .filter(d => d.type === 'function-call')
        .map(d => ({
          from: d.source,
          to: d.target,
          suggestedEndpoint: this.suggestEndpoint(d),
        })),

      // Tables that need to be migrated or accessed via API
      dataMigrations: analysis.dataAccessPatterns
        .filter(p => p.isOwnedByCandidate)
        .map(p => p.tableName),

      // Tables that need API wrappers
      dataApis: analysis.dataAccessPatterns
        .filter(p => !p.isOwnedByCandidate)
        .map(p => ({
          table: p.tableName,
          operations: p.hasWriteAccess ? ['read', 'write'] : ['read'],
        })),

      // Events that become published/subscribed
      eventContracts: analysis.eventDependencies,
    };
  }

  private async mapDependencies(codebase: Codebase): Promise<ModuleDependency[]> {
    // Implementation: static analysis of imports, calls, etc.
    return [];
  }

  private async analyzeDataAccess(
    modules: string[],
    codebase: Codebase
  ): Promise<DataAccessPattern[]> {
    // Implementation: trace database queries from module code
    return [];
  }

  private async mapApiSurface(
    modules: string[],
    dependencies: ModuleDependency[]
  ): Promise<ApiEndpoint[]> {
    // Implementation: find all entry points into candidate modules
    return [];
  }

  private async mapEventDependencies(
    modules: string[],
    codebase: Codebase
  ): Promise<EventDependency[]> {
    // Implementation: find event publications and subscriptions
    return [];
  }

  private suggestEndpoint(dep: ModuleDependency): string {
    return `/api/internal/${dep.target.toLowerCase()}`;
  }
}

interface DataAccessPattern {
  tableName: string;
  isOwnedByCandidate: boolean;
  hasWriteAccess: boolean;
  accessingModules: string[];
}

interface ApiEndpoint {
  path: string;
  method: string;
  consumers: string[];
}

interface EventDependency {
  eventName: string;
  direction: 'publish' | 'subscribe';
  counterparties: string[];
}

interface ExtractionPlan {
  apisToCreate: { from: string; to: string; suggestedEndpoint: string }[];
  dataMigrations: string[];
  dataApis: { table: string; operations: string[] }[];
  eventContracts: EventDependency[];
}

interface Codebase {
  // Abstract representation of codebase for analysis
}
```

Dynamic Analysis Approach:
Static analysis shows what could happen. Dynamic analysis shows what actually happens:
Dynamic analysis often reveals surprising truths: that 'isolated' module that actually gets called by every checkout request, or the 'deprecated' code path that still handles 5% of traffic.
Static analysis will show dependencies on code that's never actually executed. Before spending effort handling a dependency, verify it's actually used in production. Many 'complex' dependencies turn out to be dead code that can simply be removed.
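A minimal sketch of how dynamic verification might work: wrap the candidate's entry points so production calls are counted, then compare those counts against the statically discovered dependency list. The `BoundaryTracer` class and its labels are illustrative, not a specific tool.

```typescript
// Sketch: count which statically discovered dependencies are actually
// exercised in production. All names here are illustrative.
class BoundaryTracer {
  private counts = new Map<string, number>();

  /** Wrap a function so every call is counted under a label. */
  trace<A extends unknown[], R>(
    label: string,
    fn: (...args: A) => R
  ): (...args: A) => R {
    return (...args: A): R => {
      this.counts.set(label, (this.counts.get(label) ?? 0) + 1);
      return fn(...args);
    };
  }

  /** Dependencies static analysis found but production never exercised. */
  unusedDependencies(staticDeps: string[]): string[] {
    return staticDeps.filter(d => (this.counts.get(d) ?? 0) === 0);
  }
}

// Usage: wrap the real entry point, run traffic, then compare.
const tracer = new BoundaryTracer();
const getPrice = tracer.trace('pricing.getPrice', (sku: string) => sku.length);
getPrice('sku-123');
```

Anything reported by `unusedDependencies` after a representative traffic window is a candidate for deletion rather than migration.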
Once you've discovered the extraction boundary, you must decide how to handle each dependency that crosses it. There are several strategies, each with different tradeoffs.
Decision Framework for Dependency Resolution:
| Question | If Yes → Strategy |
|---|---|
| Is this a simple data lookup? | API with caching |
| Does the caller need immediate confirmation? | Synchronous API |
| Can the caller tolerate eventual consistency? | Event-based decoupling |
| Is this called in a hot path? | Data duplication or co-extraction |
| Is this utility code with no business logic? | Shared library |
| Is this too intertwined to separate? | Co-extraction (temporary) |
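The decision framework above can be read as a first-match rule chain. Here is a minimal sketch of it in code; the `DependencyProfile` field names are illustrative assumptions, not part of the original framework.

```typescript
// The decision table above, expressed as a first-match rule chain.
// Field names are illustrative assumptions.
interface DependencyProfile {
  simpleLookup: boolean;
  needsImmediateConfirmation: boolean;
  toleratesEventualConsistency: boolean;
  inHotPath: boolean;
  pureUtility: boolean;
}

function resolveDependencyStrategy(p: DependencyProfile): string {
  if (p.simpleLookup) return 'API with caching';
  if (p.needsImmediateConfirmation) return 'Synchronous API';
  if (p.toleratesEventualConsistency) return 'Event-based decoupling';
  if (p.inHotPath) return 'Data duplication or co-extraction';
  if (p.pureUtility) return 'Shared library';
  return 'Co-extraction (temporary)';
}
```

Running every boundary-crossing dependency through a function like this forces an explicit, reviewable decision for each one.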
Every dependency crossing an extraction boundary becomes an explicit contract. Document these contracts—they're the foundation of your microservices architecture.
When creating APIs to replace internal dependencies, don't just expose the monolith's internal data model. Create an anti-corruption layer that presents a clean, domain-appropriate interface. This prevents the monolith's legacy design decisions from infecting your new microservices.
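A minimal sketch of what such an anti-corruption layer might look like. The legacy column names (`cust_nm`, `sts_cd`) and the `Customer` model are invented for illustration; the point is that the translation happens at the boundary, so the clean domain model never sees the legacy shape.

```typescript
// Hypothetical legacy shape from the monolith's database.
interface LegacyCustomerRow {
  cust_id: number;
  cust_nm: string;
  sts_cd: 'A' | 'I'; // active / inactive status code
}

// Clean domain model exposed by the new service.
interface Customer {
  id: string;
  name: string;
  active: boolean;
}

/** Anti-corruption layer: translate the legacy shape at the boundary. */
function toCustomer(row: LegacyCustomerRow): Customer {
  return {
    id: String(row.cust_id),
    name: row.cust_nm.trim(),
    active: row.sts_cd === 'A',
  };
}
```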
Data is typically the hardest part of extraction. Code can be duplicated and tested; data must be migrated carefully to maintain correctness, and a migration often can't be easily reversed.
The Data Migration Spectrum:
From least to most invasive:
Read from monolith DB: New service reads directly from monolith's database. Quick to implement but creates tight coupling.
Read via API: New service calls monolith API for data. Reduces coupling but adds latency.
Replicated Read Model: New service maintains its own copy, synchronized via events. Better autonomy, eventual consistency.
Full Data Migration: Data is migrated to new service's database, with monolith updated to call API. Full autonomy, significant effort.
| Strategy | Coupling | Latency | Consistency | Migration Effort |
|---|---|---|---|---|
| Direct DB Read | Very High | Low | Strong | Low |
| Read via API | Medium | Medium | Strong | Medium |
| Replicated Read Model | Low | Low (local) | Eventual | High |
| Full Migration | None | Low (local) | Strong (within service) | Very High |
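As an illustration of the Replicated Read Model strategy, here is a minimal sketch of a local read model kept in sync by domain events. The `ProductPriceChanged` event shape is an assumption for the example.

```typescript
// Hypothetical domain event published by the monolith.
interface ProductPriceChanged {
  type: 'ProductPriceChanged';
  productId: string;
  price: number;
}

class PriceReadModel {
  private prices = new Map<string, number>();

  /** Apply an event from the stream; last write wins per product. */
  apply(event: ProductPriceChanged): void {
    this.prices.set(event.productId, event.price);
  }

  /** Local, low-latency read; may lag the source (eventual consistency). */
  get(productId: string): number | undefined {
    return this.prices.get(productId);
  }
}
```

Reads never leave the service, which gives the low local latency the table promises, at the cost of the replica briefly lagging the authoritative source.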
The Dual-Write Problem:
During migration, you often need both systems to have up-to-date data. This creates the dual-write challenge: if you write to both databases, you risk inconsistency if one write succeeds and the other fails.
Solutions to dual-write:
Single writer with events: Write to one authoritative source, propagate to others via events. Accept eventual consistency.
Change Data Capture (CDC): Use tools like Debezium to capture database changes and propagate them automatically.
Outbox Pattern: Write to primary database with an 'outbox' table for events. A separate process publishes events from the outbox.
Saga Pattern: Use compensating transactions if one write fails. Requires careful error handling.
Eventual migration: Don't dual-write. Migrate data completely before switching traffic. Simpler but requires downtime for the migrated feature.
```typescript
interface OutboxEvent {
  id: string;
  aggregateType: string;
  aggregateId: string;
  eventType: string;
  payload: Record<string, unknown>;
  createdAt: Date;
  published: boolean;
}

class OutboxPublisher {
  private db: Database;
  private eventBus: EventBus;

  constructor(db: Database, eventBus: EventBus) {
    this.db = db;
    this.eventBus = eventBus;
  }

  /**
   * Write entity change with outbox event in single transaction
   */
  async writeWithOutbox<T>(
    entityWrite: () => Promise<T>,
    event: Omit<OutboxEvent, 'id' | 'createdAt' | 'published'>
  ): Promise<T> {
    return this.db.transaction(async (tx) => {
      // Write the entity
      const result = await entityWrite();

      // Write to outbox in same transaction
      await tx.insert('outbox', {
        id: crypto.randomUUID(),
        ...event,
        createdAt: new Date(),
        published: false,
      });

      return result;
    });
  }

  /**
   * Background process: publish unpublished outbox events
   */
  async pollAndPublish(): Promise<number> {
    // Get unpublished events, oldest first
    const events = await this.db.query<OutboxEvent>(
      'SELECT * FROM outbox WHERE published = false ORDER BY createdAt ASC LIMIT 100'
    );

    let published = 0;
    for (const event of events) {
      try {
        // Publish to event bus
        await this.eventBus.publish({
          type: event.eventType,
          aggregateType: event.aggregateType,
          aggregateId: event.aggregateId,
          payload: event.payload,
          timestamp: event.createdAt.toISOString(),
        });

        // Mark as published
        await this.db.query(
          'UPDATE outbox SET published = true WHERE id = $1',
          [event.id]
        );
        published++;
      } catch (error) {
        // Log but don't fail - retry on next poll
        console.error(`Failed to publish event ${event.id}:`, error);
        break; // Preserve ordering by stopping on first failure
      }
    }

    return published;
  }

  /**
   * Cleanup old published events
   */
  async cleanup(olderThanDays: number = 7): Promise<number> {
    const result = await this.db.query(
      'DELETE FROM outbox WHERE published = true AND createdAt < NOW() - INTERVAL $1 DAY',
      [olderThanDays]
    );
    return result.rowCount;
  }
}

// Usage example
class UserService {
  private outbox: OutboxPublisher;
  private userRepo: UserRepository;

  async updateUserEmail(userId: string, newEmail: string): Promise<void> {
    await this.outbox.writeWithOutbox(
      async () => {
        return this.userRepo.updateEmail(userId, newEmail);
      },
      {
        aggregateType: 'User',
        aggregateId: userId,
        eventType: 'UserEmailUpdated',
        payload: { userId, newEmail },
      }
    );
  }
}

interface Database {
  transaction<T>(fn: (tx: any) => Promise<T>): Promise<T>;
  query<T>(sql: string, params?: unknown[]): Promise<T[]>;
}

interface EventBus {
  publish(event: any): Promise<void>;
}

interface UserRepository {
  updateEmail(userId: string, email: string): Promise<void>;
}
```

If you write to two databases without coordination, you will have inconsistency. It's not a matter of if, but when. Either accept eventual consistency with event-based propagation, or use distributed transactions (with their overhead), but never assume independent writes will stay synchronized.
With boundaries discovered, dependencies mapped, and data strategy chosen, here's the step-by-step process for extracting functionality.
Time Allocation:
Based on industry experience across hundreds of extractions, budget most of your time for discovery, dependency untangling, and validation rather than for writing the new service's code.
Implement the extraction behind a feature flag from day one. This gives you an instant rollback mechanism, the ability to test with specific users, and a clear on/off switch for the migration. When traffic migration is complete and stable, the flag becomes permanent (always on) and can eventually be removed.
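A minimal sketch of how such a flag might gate routing, assuming a hypothetical `FlagStore` and percentage-based user bucketing (neither is a prescribed design). Deterministic bucketing keeps each user on one implementation while the rollout percentage ramps up.

```typescript
// Hypothetical flag store; a real one would back onto your flag service.
interface FlagStore {
  rolloutPercent(flag: string): number; // 0-100
}

class ExtractionRouter {
  constructor(private flags: FlagStore) {}

  /** Deterministic per-user bucketing: a user sticks to one implementation. */
  useNewService(flag: string, userId: string): boolean {
    const pct = this.flags.rolloutPercent(flag);
    if (pct <= 0) return false;   // instant rollback: set percent to 0
    if (pct >= 100) return true;  // migration complete: flag always on
    return this.hash(userId) % 100 < pct;
  }

  /** Simple string hash for stable bucketing (illustrative, not crypto). */
  private hash(s: string): number {
    let h = 0;
    for (const c of s) h = (h * 31 + c.charCodeAt(0)) >>> 0;
    return h;
  }
}
```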
Understanding what to avoid is as important as knowing what to do. These anti-patterns have derailed many extraction efforts.
Anti-Pattern: Extracting five services simultaneously 'for efficiency.' You'll spend all your time coordinating cross-service dependencies and have no time to properly validate any of them.
Instead: Extract one service completely, including cleanup, before starting the next. Learn from each extraction and apply its lessons to subsequent ones. Speed comes from expertise, not parallelism.
The 'Distributed Monolith' Red Flags:
Watch for these signs that your extraction is creating a distributed monolith rather than true microservices: services that must be deployed together in lockstep, chatty synchronous call chains where one request fans out into many inter-service calls, services still reading and writing shared database tables, and one service's failure cascading through the rest of the system.
If you see these patterns, step back and reconsider your boundaries. It's better to have a well-designed monolith than a poorly designed distributed system.
How do you know when an extraction is truly complete? You need objective criteria that prevent premature celebrations and ensure thoroughness.
| Category | Criteria | Evidence Required |
|---|---|---|
| Functionality | All features work identically | Shadow comparison shows 0 discrepancies for 7+ days |
| Performance | Latency meets or exceeds baseline | P99 latency within 10% of monolith |
| Reliability | Error rate equivalent or better | Error rate ≤ monolith rate for 7+ days |
| Scale | Can handle full production load | Load test at 2x peak traffic |
| Independence | No shared database tables in production use | Database dependency map shows clean separation |
| Operations | Runbooks documented and tested | On-call has successfully handled an incident |
| Observability | Full visibility into service health | Dashboard shows all golden signals, alerting works |
| Cleanup | Monolith code removed | Dead code deleted, documentation updated |
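The criteria above can be encoded as an explicit checklist so that completion is a computed fact rather than a judgment call. This is a minimal sketch; the field names and threshold encodings are illustrative, with the thresholds mirroring the table.

```typescript
// Exit criteria from the table above, encoded as a checklist.
// Field names are illustrative assumptions.
interface ExtractionEvidence {
  shadowDiscrepancyFreeDays: number;
  p99LatencyRatio: number;        // service P99 / monolith P99
  errorRateRatio: number;         // service error rate / monolith error rate
  loadTestedAtPeakMultiple: number;
  sharedTablesInUse: number;
  incidentHandledByOnCall: boolean;
  monolithCodeRemoved: boolean;
}

function extractionComplete(e: ExtractionEvidence): boolean {
  return (
    e.shadowDiscrepancyFreeDays >= 7 &&  // 0 discrepancies for 7+ days
    e.p99LatencyRatio <= 1.1 &&          // within 10% of monolith
    e.errorRateRatio <= 1.0 &&           // equivalent or better
    e.loadTestedAtPeakMultiple >= 2 &&   // load test at 2x peak
    e.sharedTablesInUse === 0 &&         // clean database separation
    e.incidentHandledByOnCall &&
    e.monolithCodeRemoved
  );
}

const ready: ExtractionEvidence = {
  shadowDiscrepancyFreeDays: 14,
  p99LatencyRatio: 1.05,
  errorRateRatio: 0.9,
  loadTestedAtPeakMultiple: 2,
  sharedTablesInUse: 0,
  incidentHandledByOnCall: true,
  monolithCodeRemoved: true,
};
```

A single failing criterion blocks sign-off, which is exactly the point: it removes the temptation to declare victory early.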
The Definition of Done for Extraction:
Don't declare victory until the extracted service has run in production for at least two weeks without needing to fall back to the monolith. Many edge cases only appear after days of diverse production traffic.
Extracting functionality is the core work of the Strangler Fig Pattern. It requires systematic discovery, careful dependency management, and disciplined execution.
What's Next:
With functionality extracted, the next challenge is managing the transition: Cutover Strategies. We'll explore techniques for switching from old to new implementations, handling the critical moment when traffic moves, and ensuring zero-downtime transitions.
You now understand how to identify extraction candidates, discover boundaries, manage dependencies, handle data migration, and validate completeness. With these skills, you can systematically decompose a monolith into well-designed microservices. Next, we'll learn how to execute the cutover safely.