Modular Monolith - Learning Module

Preparing for Extraction

Extraction-Ready, Not Extraction-Optimized

One of the greatest advantages of a modular monolith is the optionality it provides. You can stay as a monolith indefinitely, or you can extract modules into services when genuine needs arise. But this optionality requires preparation.

The key principle is extraction-ready, not extraction-optimized. You design modules so they can become services, without paying the full cost of distributed systems until you actually need it.

This page teaches you to walk the fine line: sufficient preparation that extraction is straightforward when needed, without premature complexity that slows you down today. You'll learn what to prepare, what to defer, and the signals that indicate extraction time has arrived.

What You Will Learn

By the end of this page, you will understand how to design module interfaces that translate to network boundaries, strategies for data separation, communication patterns that support later distribution, and how to recognize when extraction is warranted.

The Extraction Decision

Before preparing for extraction, understand that most modules will never be extracted. Extraction adds complexity. The decision should be driven by concrete needs, not abstract architectural preferences.

Legitimate Reasons to Extract:

Valid Extraction Drivers

•Independent Scalability — This module needs to scale 10x while others stay constant. The monolith deployment model is constraining resource allocation.
•Independent Deployment — This module's release cadence doesn't match the rest. It needs to deploy multiple times daily while the core deploys weekly.
•Technology Requirements — This module requires a different language, runtime, or framework that can't coexist in the monolith (e.g., ML models in Python, real-time in Rust).
•Failure Isolation — This module's failures shouldn't take down the rest of the system. The blast radius must be contained.
•Organizational Boundaries — A separate team (perhaps acquired) will own this module, and they need full autonomy including technology choices.
•Regulatory Requirements — Compliance mandates physical separation of certain data or processing.

Invalid Extraction Reasons

•'Everyone is doing microservices' — Popularity isn't architecture. Your constraints are unique.
•'It will help with code organization' — The modular monolith already provides organization. Extraction won't improve it.
•'We want to hire microservices engineers' — Hire good engineers; architecture follows constraints, not talent marketing.
•'Services are more modern' — Age isn't an architectural quality. The right approach depends on your specific situation.
•'We might need to scale someday' — Prepare for potential scale; don't pay for actual distributed systems until scale arrives.

The Premature Extraction Trap

Extracting a module too early adds distributed system complexity without benefits. You get network latency, partial failure modes, and operational overhead—but none of the scaling, independence, or isolation benefits that justified the extraction. Wait for concrete evidence that extraction is needed.

Interface Design for Network Boundaries

When a module becomes a service, its public API becomes a network API. Design module interfaces with this future in mind—without actually using network communication yet.

Principle 1: Coarse-Grained Interfaces

Network calls are expensive. Chatty interfaces that work fine in-process become performance disasters over the network. Design interfaces that accomplish meaningful work in single calls.

Chatty Interface (Problematic)

•getOrderById(id)
•getOrderItems(orderId)
•getOrderShipping(orderId)
•getOrderPayment(orderId)
•getOrderStatus(orderId)
•5 calls to display one order

Coarse-Grained Interface (Better)

•getOrderDetails(id)
•Returns order with items, shipping, payment, status
•All in one call
•1 call to display one order
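The difference between the two lists above can be made concrete by counting boundary crossings. The following is a minimal sketch, not the module's real API: `CountingOrderModule`, `OrderRecord`, and the in-memory `Map` are hypothetical stand-ins used only to show how many calls each style needs.

```typescript
// Hypothetical record shape; a real module would return its own DTOs.
interface OrderRecord {
  readonly id: string;
  readonly status: string;
  readonly items: readonly string[];
  readonly shippingMethod: string;
  readonly paymentMethod: string;
}

// Counts how often a caller crosses the module boundary.
class CountingOrderModule {
  boundaryCalls = 0;

  constructor(private readonly orders: Map<string, OrderRecord>) {}

  // Chatty style: every fragment costs one boundary crossing.
  getOrderStatus(id: string): string | undefined {
    this.boundaryCalls++;
    return this.orders.get(id)?.status;
  }
  getOrderItems(id: string): readonly string[] | undefined {
    this.boundaryCalls++;
    return this.orders.get(id)?.items;
  }
  getOrderShipping(id: string): string | undefined {
    this.boundaryCalls++;
    return this.orders.get(id)?.shippingMethod;
  }
  getOrderPayment(id: string): string | undefined {
    this.boundaryCalls++;
    return this.orders.get(id)?.paymentMethod;
  }

  // Coarse-grained style: one crossing returns everything the screen needs.
  getOrderDetails(id: string): OrderRecord | undefined {
    this.boundaryCalls++;
    return this.orders.get(id);
  }
}

const sampleOrders = new Map<string, OrderRecord>([
  ['o-1', { id: 'o-1', status: 'paid', items: ['sku-1'], shippingMethod: 'ground', paymentMethod: 'card' }],
]);

const chatty = new CountingOrderModule(sampleOrders);
chatty.getOrderStatus('o-1');
chatty.getOrderItems('o-1');
chatty.getOrderShipping('o-1');
chatty.getOrderPayment('o-1');

const coarse = new CountingOrderModule(sampleOrders);
coarse.getOrderDetails('o-1');
// chatty.boundaryCalls is 4, coarse.boundaryCalls is 1
```

In-process the four extra calls cost almost nothing; after extraction each one would become a network round trip, which is why the coarse-grained shape is worth adopting early.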

Principle 2: DTOs Instead of Domain Objects

Expose Data Transfer Objects (DTOs) at module boundaries, not rich domain entities. DTOs are serializable, versionable, and don't leak domain logic. When the module becomes a service, DTOs map directly to API payloads.

DTO-Based Module Interface
TypeScript
// modules/order/api/types.ts - Public DTOs
 
// Immutable, serializable data structures
export interface OrderDTO {
  readonly id: string;
  readonly userId: string;
  readonly status: OrderStatus;
  readonly items: readonly OrderItemDTO[];
  readonly shipping: ShippingInfoDTO;
  readonly payment: PaymentInfoDTO;
  readonly totals: OrderTotalsDTO;
  readonly createdAt: string;  // ISO 8601 string, not Date
  readonly updatedAt: string;
}
 
export interface OrderItemDTO {
  readonly productId: string;
  readonly productName: string;
  readonly quantity: number;
  readonly unitPrice: MoneyDTO;
  readonly subtotal: MoneyDTO;
}
 
export interface MoneyDTO {
  readonly amount: number;
  readonly currency: string;
}
 
// Note: No methods, no behavior, no domain logic
// These are pure data containers that serialize cleanly to JSON
 
// modules/order/api/OrderService.ts
 
export interface OrderService {
  // Returns DTO, not domain entity
  getOrderDetails(orderId: string): Promise<OrderDTO | null>;
  
  // Accepts DTO, not domain entity
  createOrder(request: CreateOrderRequestDTO): Promise<OrderDTO>;
  
  // Uses simple IDs, not entity references
  cancelOrder(orderId: string): Promise<OrderDTO>;
}
 
// Internally, the module uses rich domain objects
// The API boundary handles DTO <-> Domain translation
 
// modules/order/internal/services/OrderApplicationService.ts
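export class OrderApplicationService implements OrderService {
  // Collaborators are injected; OrderRepository and OrderMapper are
  // internal to the module (names assumed for this example)
  constructor(
    private readonly repository: OrderRepository,
    private readonly mapper: OrderMapper
  ) {}

  async getOrderDetails(orderId: string): Promise<OrderDTO | null> {
    // Load domain entity
    const order = await this.repository.findById(orderId);
    if (!order) return null;
    
    // Map to DTO at boundary
    return this.mapper.toDTO(order);
  }

  // createOrder and cancelOrder follow the same pattern (omitted here)
}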

Principle 3: Idempotent Operations

Network calls can fail and be retried. Mutating operations should be idempotent—multiple identical calls produce the same result as one call. Design this into module interfaces now; it's much harder to retrofit.
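One common way to get idempotency is a caller-supplied idempotency key. The sketch below is illustrative only: `IdempotentOrderCreator`, the `idempotencyKey` field, and the in-memory `Map` are assumptions made for this example, not part of the module shown on this page.

```typescript
// Hypothetical request and response shapes for illustration.
interface CreateOrderRequest {
  readonly idempotencyKey: string; // chosen by the caller, reused on retry
  readonly userId: string;
  readonly productIds: readonly string[];
}

interface OrderSummary {
  readonly orderId: string;
  readonly userId: string;
}

class IdempotentOrderCreator {
  // Remembers the outcome of every key that has already been processed.
  private readonly completed = new Map<string, OrderSummary>();
  private created = 0;

  createOrder(request: CreateOrderRequest): OrderSummary {
    // A retry with the same key returns the stored result instead of
    // creating a second order.
    const previous = this.completed.get(request.idempotencyKey);
    if (previous) return previous;

    this.created++;
    const summary: OrderSummary = {
      orderId: `order-${this.created}`,
      userId: request.userId,
    };
    this.completed.set(request.idempotencyKey, summary);
    return summary;
  }

  orderCount(): number {
    return this.created;
  }
}

const creator = new IdempotentOrderCreator();
const firstAttempt = creator.createOrder({ idempotencyKey: 'k-1', userId: 'u-1', productIds: ['p-1'] });
const retry = creator.createOrder({ idempotencyKey: 'k-1', userId: 'u-1', productIds: ['p-1'] });
// firstAttempt.orderId === retry.orderId, and creator.orderCount() is 1
```

In a real module the key and the order would need to be written in the same database transaction, otherwise a crash between the two writes could still produce a duplicate.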

Principle 4: Explicit Error Contracts

Define explicit error types that can travel across boundaries. When the module becomes a service, these map to HTTP status codes or gRPC error codes.

Error Contracts for Extraction
TypeScript
// modules/order/api/errors.ts
 
// Explicit, serializable error types
export type OrderError = 
  | OrderNotFoundError
  | InsufficientInventoryError
  | PaymentDeclinedError
  | InvalidOrderStateError;
 
export interface OrderNotFoundError {
  readonly type: 'ORDER_NOT_FOUND';
  readonly orderId: string;
}
 
export interface InsufficientInventoryError {
  readonly type: 'INSUFFICIENT_INVENTORY';
  readonly productId: string;
  readonly requested: number;
  readonly available: number;
}
 
export interface PaymentDeclinedError {
  readonly type: 'PAYMENT_DECLINED';
  readonly reason: string;
  readonly retryable: boolean;
}
 
export interface InvalidOrderStateError {
  readonly type: 'INVALID_ORDER_STATE';
  readonly currentState: string;
  readonly attemptedOperation: string;
}
 
// Using Result type instead of exceptions
export type Result<T, E> = 
  | { success: true; data: T }
  | { success: false; error: E };
 
// Module API uses explicit results
export interface OrderService {
  createOrder(request: CreateOrderRequestDTO): Promise<Result<OrderDTO, OrderError>>;
  cancelOrder(orderId: string): Promise<Result<OrderDTO, OrderError>>;
}
 
// When extracted to a service:
// - ORDER_NOT_FOUND -> 404
// - INSUFFICIENT_INVENTORY -> 409 Conflict
// - PAYMENT_DECLINED -> 402 Payment Required
// - INVALID_ORDER_STATE -> 409 Conflict
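The status mapping in the comments above can be written as a function when the time comes. This is a sketch under the assumption that the service will speak HTTP; `toHttpStatus` is a hypothetical helper, and the error union is restated inline so the example stands alone.

```typescript
// Same discriminated union as in errors.ts, restated so this block is self-contained.
type OrderError =
  | { readonly type: 'ORDER_NOT_FOUND'; readonly orderId: string }
  | { readonly type: 'INSUFFICIENT_INVENTORY'; readonly productId: string; readonly requested: number; readonly available: number }
  | { readonly type: 'PAYMENT_DECLINED'; readonly reason: string; readonly retryable: boolean }
  | { readonly type: 'INVALID_ORDER_STATE'; readonly currentState: string; readonly attemptedOperation: string };

// Hypothetical helper: translates a module-level error into an HTTP status code.
function toHttpStatus(error: OrderError): number {
  switch (error.type) {
    case 'ORDER_NOT_FOUND':
      return 404;
    case 'INSUFFICIENT_INVENTORY':
      return 409;
    case 'PAYMENT_DECLINED':
      return 402;
    case 'INVALID_ORDER_STATE':
      return 409;
    default: {
      // If a new variant is added to OrderError and not handled above,
      // this assignment stops compiling, which flags the missing case.
      const unhandled: never = error;
      throw new Error(`Unhandled order error: ${JSON.stringify(unhandled)}`);
    }
  }
}
```

Because the error types are plain data with a `type` tag, the same union can be serialized into a response body on the service side and parsed back by the client without losing information.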

The Interface Translation Test

Periodically review module interfaces and ask: 'If this were a REST or gRPC API, would it work well?' Chatty interfaces, complex object graphs, or methods that assume shared memory are signs the interface needs work.

Data Separation Strategies

Database coupling is the hardest part of service extraction. If modules share tables freely, extracting them requires complex data migration. Prepare for this by progressively separating data—without actually splitting the database yet.

Level 1: Logical Separation (Enforce Now)

Each module owns specific tables. No module directly queries another module's tables. Cross-module data access goes through module APIs. This is the minimum for a modular monolith.

Logical Data Separation
TypeScript
// ❌ WRONG: Cross-module database access
class OrderReportService {
  async getOrdersWithUserDetails() {
    // Order module directly querying User tables - coupling!
    return this.db.query(`
      SELECT o.*, u.email, u.name
      FROM orders o
      JOIN users u ON o.user_id = u.id
      WHERE o.status = 'completed'
    `);
  }
}
 
// ✅ RIGHT: API-based cross-module access
class OrderReportService {
  constructor(
    private orderRepository: OrderRepository,  // Own module's data
    private userService: UserService            // Other module's API
  ) {}
 
  async getOrdersWithUserDetails(): Promise<OrderReportDTO[]> {
    // Get orders from own data
    const orders = await this.orderRepository.findCompleted();
    
    // Get user details through User module's API
    const userIds = [...new Set(orders.map(o => o.userId))];
    const users = await this.userService.getUsersByIds(userIds);
    const userMap = new Map(users.map(u => [u.id, u]));
    
    // Combine at application layer
    return orders.map(order => ({
      orderId: order.id,
      status: order.status,
      total: order.total,
      userEmail: userMap.get(order.userId)?.email ?? 'unknown',
      userName: userMap.get(order.userId)?.displayName ?? 'Unknown User',
    }));
  }
}

Level 2: Schema Separation (Consider for Complex Systems)

Use separate database schemas or namespaces per module. Tables live in order_schema.orders, user_schema.users, etc. The database remains shared, but ownership is explicit.

Level 3: Read Replicas for Cross-Module Reads (Optional)

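If cross-module reads are performance-critical, consider giving the consuming module its own read-only copy of the data it needs, such as a projection table or materialized view that it owns. The source module publishes events; the consuming module applies them to keep its read model current and accepts eventual consistency.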

Data That Spans Modules:

Some data genuinely spans boundaries—a foreign key from Order to User, for instance. Handle these relationships carefully:

Strategies for Cross-Module References

•Store IDs, not entities — Order stores userId (a string), not a User entity. The join happens at application layer, not database layer.
•Cache denormalized data — If Order frequently needs userName, cache it in Order's data. Accept eventual consistency.
•Use events for synchronization — When User changes their name, publish an event. Order updates its cached copy.
•Accept application-level joins — Multiple queries with in-memory joins are fine in most cases. Only optimize if proven necessary.

Event-Driven Data Synchronization
TypeScript
// Order module caches user display name for performance
// Synchronized via events
 
// modules/order/internal/tables.ts
const orders = {
  id: uuid(),
  userId: varchar(36),           // Reference to User
  userDisplayName: varchar(100), // Cached, denormalized
  // ... other order fields
};
 
// modules/order/internal/handlers/UserEventHandler.ts
export class UserEventHandler {
  constructor(private orderRepository: OrderRepository) {}
 
  @on('user.profile.updated')
  async handleUserProfileUpdated(event: UserProfileUpdatedEvent) {
    // Update cached display name in all orders for this user
    await this.orderRepository.updateUserDisplayName(
      event.userId,
      event.newDisplayName
    );
  }
 
  @on('user.deleted')
  async handleUserDeleted(event: UserDeletedEvent) {
    // Handle user deletion - perhaps anonymize order data
    await this.orderRepository.anonymizeUserData(event.userId);
  }
}
 
// Benefits:
// - Order module is self-contained; doesn't need to call User for display names
// - User module doesn't know Order exists (decoupled)
// - Extraction is easy: replace event bus with message queue

Foreign Keys Across Modules

Avoid database foreign key constraints across module boundaries. They create tight coupling that prevents extraction. Store the reference as a plain column and enforce referential integrity at the application layer. Once the modules become services with separate databases, the constraint cannot exist anyway.
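Application-layer referential integrity can be as simple as asking the owning module before storing a reference. The sketch below assumes a hypothetical `UserLookup` interface exposed by the User module and an in-memory order list; neither is part of the code shown elsewhere on this page.

```typescript
// Hypothetical slice of the User module's public API.
interface UserLookup {
  userExists(userId: string): Promise<boolean>;
}

type PlaceOrderResult =
  | { success: true; orderId: string }
  | { success: false; error: { type: 'USER_NOT_FOUND'; userId: string } };

class OrderPlacement {
  // Stands in for the Order module's own table; userId is a plain column.
  private readonly orders: Array<{ orderId: string; userId: string }> = [];

  constructor(private readonly users: UserLookup) {}

  async placeOrder(userId: string): Promise<PlaceOrderResult> {
    // The check a foreign key would have performed, done through the
    // owning module's API instead of the database.
    const exists = await this.users.userExists(userId);
    if (!exists) {
      return { success: false, error: { type: 'USER_NOT_FOUND', userId } };
    }

    const orderId = `order-${this.orders.length + 1}`;
    this.orders.push({ orderId, userId });
    return { success: true, orderId };
  }

  storedOrderCount(): number {
    return this.orders.length;
  }
}
```

Unlike a database constraint, this check is not atomic: the user could be deleted just after it passes. That gap is what the `user.deleted` event handler shown earlier is there to clean up.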

Communication Patterns for Distribution

How modules communicate determines how easily they can be distributed. Design communication patterns that work in-process today but translate to network communication tomorrow.

Pattern 1: Command/Query Through Interfaces

Modules expose interfaces for commands (mutations) and queries (reads). Today these are in-process calls; after extraction, they become HTTP/gRPC calls. The consuming code doesn't change—only the implementation of the interface.

Interface-Based Communication
TypeScript
// modules/order/api/OrderService.ts - Interface definition
export interface OrderService {
  createOrder(request: CreateOrderRequestDTO): Promise<Result<OrderDTO, OrderError>>;
  getOrderDetails(orderId: string): Promise<OrderDTO | null>;
}
 
// Today: In-process implementation
// modules/order/internal/OrderServiceImpl.ts
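export class OrderServiceImpl implements OrderService {
  // OrderRepository and OrderMapper are internal collaborators (names assumed)
  constructor(
    private readonly repository: OrderRepository,
    private readonly mapper: OrderMapper
  ) {}

  async createOrder(request: CreateOrderRequestDTO): Promise<Result<OrderDTO, OrderError>> {
    // Direct database access, in-process logic
    const order = Order.create(request);
    await this.repository.save(order);
    return { success: true, data: this.mapper.toDTO(order) };
  }

  // Read methods omitted for brevity
}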
 
// After extraction: HTTP client implementation
// Would live in the consuming module, not Order module
export class OrderServiceHttpClient implements OrderService {
  constructor(private httpClient: HttpClient, private baseUrl: string) {}
 
  async createOrder(request: CreateOrderRequestDTO): Promise<Result<OrderDTO, OrderError>> {
    const response = await this.httpClient.post(
      `${this.baseUrl}/orders`,
      request
    );
    if (response.ok) {
      return { success: true, data: response.body as OrderDTO };
    }
    return { success: false, error: this.mapError(response) };
  }
}
 
// Consuming code doesn't change:
class CheckoutService {
  constructor(private orderService: OrderService) {}  // Interface, not implementation
 
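  async checkout(cart: Cart): Promise<OrderDTO> {
    const result = await this.orderService.createOrder({
      userId: cart.userId,
      items: cart.items.map(i => ({
        productId: i.productId,
        quantity: i.quantity,
      })),
    });
    // Works identically whether OrderService is in-process or remote
    if (result.success === false) {
      // Surface the typed error; how to present it is up to the caller
      throw new Error(`Checkout failed: ${result.error.type}`);
    }
    return result.data;
  }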
}

Pattern 2: Domain Events for Loose Coupling

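Events decouple publishers from subscribers: a publisher doesn't know who subscribes, and the only thing the two sides share is the event contract. When a module is extracted, the in-process event bus becomes a message queue and the event structure stays the same. Delivery behavior does change (a broker may redeliver or reorder messages), so write handlers to tolerate duplicates from the start.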

Event-Based Coupling
TypeScript
// Today: In-process event bus
// shared/events/InProcessEventBus.ts
export class InProcessEventBus implements EventBus {
  private handlers = new Map<string, Array<(event: any) => Promise<void>>>();
 
  subscribe(eventType: string, handler: (event: any) => Promise<void>) {
    const handlers = this.handlers.get(eventType) || [];
    handlers.push(handler);
    this.handlers.set(eventType, handlers);
  }
 
  async publish<T extends DomainEvent>(event: T) {
    const handlers = this.handlers.get(event.type) || [];
    await Promise.all(handlers.map(h => h(event)));
  }
}
 
// After extraction: Message queue implementation
// Would replace InProcessEventBus in DI container
export class RabbitMQEventBus implements EventBus {
  async publish<T extends DomainEvent>(event: T) {
    await this.channel.publish(
      'domain-events',
      event.type,
      Buffer.from(JSON.stringify(event))
    );
  }
}
 
// Publisher code doesn't change:
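await eventBus.publish<OrderCreatedEvent>({
  type: 'order.created',
  orderId: order.id,
  userId: order.userId,
  items: order.items,  // the subscriber below reads event.items
  occurredAt: new Date().toISOString(),  // ISO 8601 string survives JSON serialization
});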
 
// Subscriber code doesn't change:
@on('order.created')
async handleOrderCreated(event: OrderCreatedEvent) {
  await this.inventoryService.reserveStock(event.items);
}

Pattern 3: Saga Pattern for Distributed Transactions

Prepare for the loss of ACID transactions across modules. When you need to coordinate multiple modules atomically today, use the Saga pattern—even though transactions would technically work. This prepares you for the distributed world where transactions across services don't exist.
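A saga replaces one atomic transaction with a sequence of local steps, each paired with a compensating action. The following is a minimal orchestration sketch, not a production saga engine: `SagaStep`, `SagaOutcome`, and `runSaga` are hypothetical names, and it assumes compensations themselves do not fail.

```typescript
// Each step knows how to do its work and how to undo it.
interface SagaStep {
  readonly name: string;
  execute(): Promise<void>;
  compensate(): Promise<void>;
}

type SagaOutcome =
  | { readonly completed: true }
  | { readonly completed: false; readonly failedStep: string; readonly compensated: readonly string[] };

async function runSaga(steps: readonly SagaStep[]): Promise<SagaOutcome> {
  const done: SagaStep[] = [];

  for (const step of steps) {
    try {
      await step.execute();
      done.push(step);
    } catch {
      // Undo the steps that already succeeded, most recent first.
      const compensated: string[] = [];
      for (const finished of done.slice().reverse()) {
        await finished.compensate();
        compensated.push(finished.name);
      }
      return { completed: false, failedStep: step.name, compensated };
    }
  }

  return { completed: true };
}
```

For a checkout flow the steps might be "reserve stock" (compensated by "release stock") followed by "charge payment" (compensated by "refund"). If the charge fails, only the stock reservation is undone, because the failed step never took effect.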

Don't Over-Prepare

Use Sagas only where transactions would be problematic to unwind later. If two modules will likely always be extracted together (or never), using local transactions is fine. Reserve Saga complexity for genuinely independent modules with different extraction timelines.

The Extraction Process

When the time comes to extract a module, follow a disciplined process that minimizes risk and allows rollback at each step.

Step 1: Validate Boundaries (Before Starting)

Verify the module truly has clean boundaries:

•No circular dependencies with other modules
•All access through public API
•Data access isolated to module's tables
•Events published for relevant state changes

If any of these fail, fix them before extraction.
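The first two checks above can be automated once you have a list of imports between files. The sketch below assumes the `modules/<name>/api/` and `modules/<name>/internal/` layout used in this page's examples and takes the import list as input; how that list is produced (a build tool, a linter, a script) is left open, and all function names are hypothetical.

```typescript
interface ImportEdge {
  readonly fromFile: string;
  readonly toFile: string;
}

// Extracts the module name from a path like "modules/order/internal/x.ts".
function moduleOf(path: string): string | null {
  const match = /^modules\/([^/]+)\//.exec(path);
  return match ? match[1] : null;
}

// Cross-module imports are allowed only when they target the other module's api folder.
function findBoundaryViolations(edges: readonly ImportEdge[]): ImportEdge[] {
  return edges.filter((edge) => {
    const from = moduleOf(edge.fromFile);
    const to = moduleOf(edge.toFile);
    if (from === null || to === null || from === to) return false;
    return edge.toFile.indexOf(`modules/${to}/api/`) !== 0;
  });
}

// Detects a dependency cycle between modules with a depth-first search.
function hasModuleCycle(edges: readonly ImportEdge[]): boolean {
  const graph = new Map<string, Set<string>>();
  for (const edge of edges) {
    const from = moduleOf(edge.fromFile);
    const to = moduleOf(edge.toFile);
    if (from === null || to === null || from === to) continue;
    const targets = graph.get(from) || new Set<string>();
    targets.add(to);
    graph.set(from, targets);
  }

  const visiting = new Set<string>();
  const finished = new Set<string>();
  const visit = (node: string): boolean => {
    if (finished.has(node)) return false;
    if (visiting.has(node)) return true; // reached a node already on the current path
    visiting.add(node);
    const targets = graph.get(node);
    if (targets) {
      for (const next of Array.from(targets)) {
        if (visit(next)) return true;
      }
    }
    visiting.delete(node);
    finished.add(node);
    return false;
  };

  return Array.from(graph.keys()).some((name) => visit(name));
}
```

Running checks like these in CI keeps the boundaries honest long before anyone decides to extract, which is cheaper than discovering a cycle in the middle of a migration.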

Step 2: Create Service Shell

Create the new service with:

•HTTP/gRPC endpoints matching the module's API
•Database schema matching the module's tables
•Event publishing to message queue

The service is deployed but not receiving traffic yet.

Step 3: Shadow Traffic

Route traffic to both the monolith module and the new service. Compare responses. Log discrepancies. The service is read-only at this stage; the monolith is still authoritative.
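Shadowing can be sketched as a wrapper around the authoritative handler. This is an illustration under simplifying assumptions: `withShadow` and `Discrepancy` are hypothetical names, responses are compared by their JSON text (which is sensitive to property order), and the shadow call is awaited inline for clarity where a real setup would more likely run it in the background.

```typescript
type Handler<Req, Res> = (request: Req) => Promise<Res>;

interface Discrepancy<Req> {
  readonly request: Req;
  readonly primary: string;
  readonly shadow: string;
}

function withShadow<Req, Res>(
  primary: Handler<Req, Res>,   // the monolith module: authoritative
  shadow: Handler<Req, Res>,    // the new service: observed only
  report: (discrepancy: Discrepancy<Req>) => void
): Handler<Req, Res> {
  return async (request) => {
    const primaryResponse = await primary(request);
    const primaryText = JSON.stringify(primaryResponse);

    try {
      const shadowText = JSON.stringify(await shadow(request));
      if (shadowText !== primaryText) {
        report({ request, primary: primaryText, shadow: shadowText });
      }
    } catch (error) {
      // A failing shadow is logged as a discrepancy and never reaches the caller.
      report({ request, primary: primaryText, shadow: `error: ${String(error)}` });
    }

    // Callers always receive the monolith's answer at this stage.
    return primaryResponse;
  };
}
```

Only shadow requests that are safe to execute twice. Reads are the natural candidates, which matches the service being read-only at this stage.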

Step 4: Migrate Data

Copy the module's data to the service's database. Set up ongoing synchronization. For event-sourced systems, replay events to the new service.

Step 5: Switch Reads

Redirect read traffic to the service while writes still go to the monolith. The service reads from its own database, which is synchronized from the monolith.

Step 6: Switch Writes

Redirect write traffic to the service. The service becomes the authoritative source. The monolith now calls the service for this module's functionality.
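Steps 5 and 6 amount to moving a routing decision forward one phase at a time. The sketch below shows one way to express that; `OrderRouter`, `Phase`, and the in-memory backends are hypothetical, and the data synchronization between monolith and service described in Step 4 is assumed to run separately.

```typescript
interface OrderBackend {
  readonly name: string;
  getStatus(orderId: string): string | undefined;
  setStatus(orderId: string, status: string): void;
}

// In-memory stand-in for either the monolith module or the extracted service.
class InMemoryBackend implements OrderBackend {
  private readonly statuses = new Map<string, string>();

  constructor(readonly name: string) {}

  getStatus(orderId: string): string | undefined {
    return this.statuses.get(orderId);
  }

  setStatus(orderId: string, status: string): void {
    this.statuses.set(orderId, status);
  }
}

type Phase = 'monolith' | 'reads-switched' | 'writes-switched';

class OrderRouter {
  constructor(
    private readonly monolith: OrderBackend,
    private readonly service: OrderBackend,
    public phase: Phase // in practice this would come from a feature flag or config
  ) {}

  // Step 5: reads move to the service first.
  readBackend(): OrderBackend {
    return this.phase === 'monolith' ? this.monolith : this.service;
  }

  // Step 6: writes move only in the final phase.
  writeBackend(): OrderBackend {
    return this.phase === 'writes-switched' ? this.service : this.monolith;
  }
}
```

Because the phase is a single value, rolling back is a matter of setting it to the previous phase, provided the data on both sides has been kept in sync.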

Step 7: Remove Module from Monolith

Once stable, remove the module code from the monolith. Replace with a thin client that calls the service. Clean up synchronized data paths.

Extraction Checklist

•Boundary validation — Clean API, isolated data, no cycles
•Service shell — Endpoints, database, messaging infrastructure
•Shadow traffic — Parallel execution, response comparison
•Data migration — Copy and synchronize module data
•Read switch — Service handles reads, monolith handles writes
•Write switch — Service handles all operations
•Cleanup — Remove module from monolith, replace with client

Rollback at Every Step

Each step should be reversible. If shadow traffic reveals discrepancies, fix them before proceeding. If the write switch causes problems, roll back to the monolith. The phased approach limits blast radius and enables learning.

What NOT to Prepare

Being extraction-ready doesn't mean building a distributed system prematurely. Many preparations are wasted effort that slows you down without providing benefits.

Don't Use Network Protocols Between Modules:

Modules should communicate via function calls, not HTTP or gRPC. Network protocols add latency, failure modes, and debugging complexity. You can add them during extraction.

Don't Run Separate Databases Per Module:

Logical separation is sufficient. Running separate database instances adds operational overhead without benefit. The extraction process handles database separation.

Don't Build Complex Service Discovery:

In a monolith, modules are found through dependency injection, not service registries. Service discovery comes when you actually have services to discover.

Don't Add Message Queues for In-Process Events:

In-process event buses (observer pattern) are simpler and faster, and they avoid the delivery problems a broker introduces, such as duplicated or reordered messages. Add message queues when you extract, not before.

DO Prepare

•Clean module interfaces (APIs)
•DTO-based communication
•Explicit error contracts
•Logical data separation
•Domain events for decoupling
•Idempotent operations
•Architecture enforcement tests

DON'T Prepare

•HTTP/gRPC between modules
•Separate databases per module
•Service discovery infrastructure
•Message queues for internal events
•Circuit breakers for internal calls
•Distributed tracing for internal calls
•API gateways for module routing

The Distributed Monolith Trap (Again)

Adding distributed system infrastructure to a monolith creates the worst of both worlds: you have the operational complexity of microservices without the independent deployment, scaling, or failure isolation benefits. Either stay a real modular monolith or extract real services—don't create a distributed monolith.

Summary: Preparing for Future Extraction

Preparing for extraction is about maintaining optionality without paying for complexity you don't need yet. The key is knowing what to prepare and what to defer. Let's consolidate:

Key Takeaways

•Extraction should be driven by concrete needs — Independent scaling, deployment, technology, isolation, or regulation. Not by fashion or speculation.
•Design interfaces for network translation — Coarse-grained APIs, DTOs instead of domain objects, idempotent operations, explicit error contracts.
•Separate data logically — Modules own their tables exclusively. No cross-module SQL joins. Store IDs, not entities. Use events for synchronization.
•Use patterns that translate to distribution — Interface-based communication, domain events, saga pattern for complex coordination.
•Follow a phased extraction process — Validate, shell, shadow, migrate, switch reads, switch writes, cleanup. Rollback possible at every step.
•Don't over-prepare — No network protocols, separate databases, service discovery, or message queues until you actually extract. Keep monolith simple.

What's Next:

The final page explores the Benefits of Modular Monolith—a comprehensive look at why this architecture succeeds, including real-world case studies, quantified benefits, and common objections addressed. We'll solidify the case for considering this architecture as your primary approach.

Page Complete

You now understand how to prepare a modular monolith for future extraction while avoiding premature complexity. The goal is optionality—the ability to extract when needed, without paying the distributed systems tax until you have to. Next, we'll examine the comprehensive benefits of this architecture.