System Design (LLD) > Domain-Driven Design

Repositories

Level: Intermediate

Duration: 60 mins

Topic: Domain-Driven Design

Repository as Collection Abstraction

The Collection Metaphor

The most powerful insight about the Repository pattern is understanding it as a collection. Not a service. Not a data access layer. A collection—like the List<T>, Set<T>, or Map<K, V> you use every day in your code.

When you work with an in-memory collection, you never think about file I/O, serialization formats, or storage locations. You simply add items, remove items, and query for items. The collection handles the rest. A repository offers the exact same experience, but the 'rest' it handles includes database persistence, caching, and data mapping.

This metaphor isn't just conceptual elegance—it has profound practical implications for how you design repository interfaces, what methods you expose, and how client code interacts with your domain's persistent objects.

What You Will Learn

This page examines the collection abstraction in depth. You'll understand which collection operations map to repositories, why certain operations are absent, how to think about repository state, and how this mental model simplifies both implementation and testing.

Thinking in Collections

Consider how you interact with a standard in-memory collection:

Standard Collection Operations
TypeScript
// In-memory collection behavior
const orders = new Map<string, Order>();

// Adding an item - it's now part of the collection
orders.set(order.id, order);

// Retrieving an item - it's returned from the collection
// (Map.get returns undefined when the key is absent)
const retrieved = orders.get(orderId);

// Querying the collection
const customerOrders = Array.from(orders.values())
    .filter(o => o.customerId === customerId);

// Modifying a retrieved object - changes are in the collection
retrieved?.cancel();  // The order in the map is now cancelled

// Note: No "save" or "persist" operation needed
// Changes to objects in the collection are just... there

// Removing an item - it's no longer part of the collection
orders.delete(orderId);

A repository mirrors this behavior. From the domain code's perspective, the repository is this collection: a conceptually unbounded set containing every aggregate of its type that has been added and not yet removed. The fact that this "collection" is actually backed by a PostgreSQL database, a MongoDB cluster, or a distributed cache is invisible to the caller.

This creates a powerful abstraction: domain code can pretend that all aggregates are always in memory, available for immediate access and modification. The repository maintains this illusion.

Collection Operations Mapped to Repository
Collection Operation	Repository Equivalent	Database Effect
`collection.add(item)`	`repository.add(aggregate)`	INSERT (or queue for INSERT)
`collection.get(key)`	`repository.findById(id)`	SELECT with WHERE clause
`collection.remove(item)`	`repository.remove(aggregate)`	DELETE (or queue for DELETE)
`collection.filter(predicate)`	`repository.findByXxx(criteria)`	SELECT with filtering
`item.mutate()`	Same—mutate retrieved aggregate	UPDATE (tracked by Unit of Work)
`collection.contains(item)`	`repository.exists(id)`	SELECT EXISTS or COUNT

The Missing Operation

Notice that save() or update() is missing from the collection metaphor. You don't call list.save() after modifying an object in a list—the modification is immediate. Similarly, a pure repository pattern (with Unit of Work) doesn't require explicit save calls. Changes to retrieved aggregates are tracked and persisted automatically at the end of the unit of work.

Core Collection Operations

A repository exposes a specific set of collection-like operations. Understanding each operation's semantics is essential for correct usage and implementation.

Add: Inserting a New Aggregate

The add() operation places a new aggregate into the collection. After this call, the aggregate is considered part of the repository's conceptual set.

Critical semantics:

•The aggregate must be new (never previously added or retrieved from this repository)
•The aggregate's identity must be unique within the collection
•After add(), subsequent findById() calls should return this aggregate
•The actual database INSERT may be deferred (via Unit of Work)

Add Operation Semantics
TypeScript
interface OrderRepository {
    add(order: Order): Promise<void>;
}
 
// Usage
class OrderService {
    async placeOrder(customerId: CustomerId, items: OrderItem[]): Promise<Order> {
        // Create a new aggregate
        const order = Order.create(customerId, items);
        
        // Add to the collection - it's now part of the repository
        await this.orderRepository.add(order);
        
        // At this point, logically, findById(order.id) would return this order
        // (though actual persistence may be deferred)
        
        return order;
    }
}
 
// Implementation note: add() should NOT accept existing aggregates
// Trying to add an already-existing aggregate is an error

Find By Identity: The Primary Retrieval

The findById() operation is the most fundamental query. It retrieves an aggregate by its unique identity, returning the fully reconstituted aggregate or null if not found.

Critical semantics:

•Returns the complete aggregate, including all of its entities and value objects
•Returns null (or Optional.empty()) if no aggregate has this identity
•The returned aggregate is connected to the repository, so changes to it are tracked
•Multiple calls with the same ID within one unit of work should return the same instance (identity map)

FindById Operation Semantics
TypeScript
interface OrderRepository {
    findById(orderId: OrderId): Promise<Order | null>;
}
 
// Usage
class OrderService {
    async cancelOrder(orderId: OrderId): Promise<void> {
        // Retrieve the aggregate
        const order = await this.orderRepository.findById(orderId);
        
        if (!order) {
            throw new OrderNotFoundException(orderId);
        }
        
        // Modify the aggregate - changes are tracked
        order.cancel();
        
        // No need to call save() - the Unit of Work will
        // detect changes and persist them
    }
}
 
// Implementation note: identity map pattern ensures that
// multiple findById() calls in the same transaction return
// the same object instance, preserving reference equality

Remove: Deleting an Aggregate

The remove() operation removes an aggregate from the collection. After this call, the aggregate is no longer part of the repository's conceptual set.

Critical semantics:

•The aggregate should have been previously retrieved from this repository
•After remove(), findById() with this ID should return null
•The actual database DELETE may be deferred (via Unit of Work)
•Removing a non-existent aggregate should be handled gracefully (no-op or exception)
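
The interplay between a deferred DELETE and findById() can be sketched as follows. This is a minimal illustration under stated assumptions, not a production implementation: FakeOrderTable, DeferredDeleteOrderRepository, and commit() are hypothetical names, the "table" is a plain Map, and removing an unknown aggregate is treated as a no-op (one of the two options listed above).

```typescript
// Hypothetical minimal types, for illustration only.
type OrderId = string;
interface Order { readonly id: OrderId; }

// Stand-in for the real table: maps id -> stored aggregate.
class FakeOrderTable {
    readonly rows = new Map<OrderId, Order>();
}

class DeferredDeleteOrderRepository {
    private readonly identityMap = new Map<OrderId, Order>();
    private readonly pendingDeletes = new Set<OrderId>();
    private readonly table: FakeOrderTable;

    constructor(table: FakeOrderTable) {
        this.table = table;
    }

    async findById(id: OrderId): Promise<Order | null> {
        // A removed aggregate is no longer part of the collection,
        // even though its row has not been deleted yet.
        if (this.pendingDeletes.has(id)) return null;

        const cached = this.identityMap.get(id);
        if (cached !== undefined) return cached;

        const loaded = this.table.rows.get(id) ?? null;
        if (loaded !== null) this.identityMap.set(id, loaded);
        return loaded;
    }

    async remove(order: Order): Promise<void> {
        this.identityMap.delete(order.id);
        this.pendingDeletes.add(order.id);  // the DELETE itself is deferred
    }

    // Assumed to be called once, at the end of the unit of work.
    async commit(): Promise<void> {
        this.pendingDeletes.forEach(id => this.table.rows.delete(id));
        this.pendingDeletes.clear();
    }
}
```

Between remove() and commit() the row still exists in storage, yet findById() already reports the aggregate as gone, which is what keeps the collection illusion intact for domain code.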

Remove Operation Semantics
TypeScript
interface OrderRepository {
    remove(order: Order): Promise<void>;
}
 
// Usage
class OrderService {
    async deleteOrder(orderId: OrderId): Promise<void> {
        // First retrieve the aggregate
        const order = await this.orderRepository.findById(orderId);
        
        if (!order) {
            throw new OrderNotFoundException(orderId);
        }
        
        // Verify business rules allow deletion
        if (!order.canBeDeleted()) {
            throw new OrderCannotBeDeletedError(orderId);
        }
        
        // Remove from the collection
        await this.orderRepository.remove(order);
        
        // At this point, the order is no longer part of the collection
        // The actual DELETE happens at transaction commit
    }
}
 
// Note: Some implementations use removeById(orderId) for convenience
// when you don't need the aggregate for business logic validation

Rich Query Operations

Beyond basic add/find/remove, repositories provide domain-meaningful query operations. These operations filter the conceptual collection based on criteria that make sense in the domain.

The key principle: Query methods should express domain concepts, not database concepts.

Domain-Meaningful Queries

•findPendingOrders()
•findByCustomer(customerId)
•findOrdersPlacedBetween(start, end)
•findOverdueOrders()
•findHighValueOrders(threshold)

Database-Centric Queries (Avoid)

•findByStatusEquals('PENDING')
•findByCustomerIdIn([...])
•findByCreatedAtBetween(a, b)
•findByDueDateBeforeAndStatusIn(...)
•findByTotalGreaterThan(amount)

The Spring Data Trap

Frameworks like Spring Data JPA encourage method names that mirror SQL (findByFieldNameOperator). While convenient, this leaks database concerns into the domain. Prefer explicit domain-named methods even if they're less 'magical.' The implementation can still use query derivation internally.
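
One way to keep field-and-operator vocabulary out of the domain is to confine it to an infrastructure helper and expose only domain-named methods. The sketch below assumes a hypothetical OrderQueryGateway that filters an in-memory array in place of a real database; the names and the simplified Order shape are illustrative and do not come from any framework.

```typescript
// Hypothetical types for illustration; none of these come from a real library.
type OrderStatus = "PENDING" | "SHIPPED" | "CANCELLED";
interface Order {
    readonly id: string;
    readonly status: OrderStatus;
    readonly total: number;
}

// Infrastructure side: generic, predicate-based querying is confined here.
class OrderQueryGateway {
    private readonly rows: readonly Order[];

    constructor(rows: readonly Order[]) {
        this.rows = rows;
    }

    async findWhere(predicate: (row: Order) => boolean): Promise<Order[]> {
        return this.rows.filter(predicate);
    }
}

// Domain side: method names come from the domain language.
interface OrderRepository {
    findPendingOrders(): Promise<Order[]>;
    findHighValueOrders(threshold: number): Promise<Order[]>;
}

class GatewayOrderRepository implements OrderRepository {
    private readonly gateway: OrderQueryGateway;

    constructor(gateway: OrderQueryGateway) {
        this.gateway = gateway;
    }

    findPendingOrders(): Promise<Order[]> {
        // What "pending" means is decided in one place;
        // callers never mention the status field.
        return this.gateway.findWhere(o => o.status === "PENDING");
    }

    findHighValueOrders(threshold: number): Promise<Order[]> {
        return this.gateway.findWhere(o => o.total > threshold);
    }
}
```

If the definition of "pending" later grows to include another status, only findPendingOrders() changes and every caller keeps working.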

Rich Query Methods
interface OrderRepository {
    // Identity-based retrieval
    findById(orderId: OrderId): Promise<Order | null>;
    
    // Domain-meaningful queries
    findByCustomer(customerId: CustomerId): Promise<Order[]>;
    findPendingOrders(): Promise<Order[]>;
    findOrdersAwaitingShipment(): Promise<Order[]>;
    findOrdersPlacedBetween(start: Date, end: Date): Promise<Order[]>;
    
    // Existence checks
    exists(orderId: OrderId): Promise<boolean>;
    hasCustomerPlacedOrderBefore(customerId: CustomerId): Promise<boolean>;
    
    // Counting (for domain purposes, not pagination)
    countPendingOrders(): Promise<number>;
    
    // Collection operations
    add(order: Order): Promise<void>;
    remove(order: Order): Promise<void>;
}
 
// Note: Each finder method returns complete aggregates
// This ensures invariants are maintained when working with results

Pagination and Sorting Considerations

Pagination and sorting are often needed but create tension with the collection metaphor. A finite page of results doesn't feel like a collection.

Recommendations:

Keep repositories aggregate-focused — Repositories retrieve aggregates for command processing, not for display.
Use CQRS for read-heavy use cases — If you need paginated lists for UI display, create separate read models/projections.
If pagination is unavoidable, make it explicit with domain-meaningful semantics:

Pagination When Necessary
TypeScript
// If you must have pagination in a repository, make it domain-meaningful
interface OrderRepository {
    // Returns oldest pending orders (for processing queue)
    findOldestPendingOrders(limit: number): Promise<Order[]>;
    
    // Returns recent orders for a customer (for recent activity)
    findRecentOrdersByCustomer(
        customerId: CustomerId,
        limit: number
    ): Promise<Order[]>;
    
    // For paging through large result sets (use sparingly)
    findAllOrdersPaged(page: PageRequest): Promise<PagedResult<Order>>;
}
 
// But consider: if you're paginating heavily, you might need CQRS
// with separate read models optimized for querying
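
The interface above refers to PageRequest and PagedResult without defining them. One plausible shape is sketched below; the field names, the zero-based page numbering, and the paginate() helper are assumptions made for illustration, and the helper slices an in-memory array where a real repository would push the limit and offset down to the database.

```typescript
// Hypothetical shapes: the names appear in the interface above,
// but their fields are assumptions made for this sketch.
interface PageRequest {
    readonly pageNumber: number;  // zero-based
    readonly pageSize: number;
}

interface PagedResult<T> {
    readonly items: readonly T[];
    readonly totalCount: number;
    readonly pageNumber: number;
    readonly pageSize: number;
    readonly hasNext: boolean;
}

// Slices one page out of an already ordered list.
function paginate<T>(all: readonly T[], page: PageRequest): PagedResult<T> {
    if (!Number.isInteger(page.pageNumber) || page.pageNumber < 0) {
        throw new RangeError("pageNumber must be a non-negative integer");
    }
    if (!Number.isInteger(page.pageSize) || page.pageSize < 1) {
        throw new RangeError("pageSize must be a positive integer");
    }

    const start = page.pageNumber * page.pageSize;
    const items = all.slice(start, start + page.pageSize);

    return {
        items,
        totalCount: all.length,
        pageNumber: page.pageNumber,
        pageSize: page.pageSize,
        hasNext: start + items.length < all.length,
    };
}
```

An in-memory test double could implement findAllOrdersPaged() by passing its stored orders, in a stable order, through this helper.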

The Identity Map Pattern

For the collection metaphor to hold, retrieving the same aggregate twice must return the same object instance. This is the Identity Map pattern, and it's essential for repository correctness.

Why identity consistency matters:

Consider two service methods that both retrieve the same order:

Identity Map Necessity
TypeScript
// Without identity map - BROKEN
class BrokenOrderService {
    async processOrder(orderId: OrderId) {
        const order1 = await this.orderRepository.findById(orderId);
        order1.approve();  // Sets status to APPROVED
        
        // Some intermediate processing...
        await this.processPayment(orderId);
    }
    
    async processPayment(orderId: OrderId) {
        const order2 = await this.orderRepository.findById(orderId);
        
        // order2 is a DIFFERENT INSTANCE than order1!
        // order2.status is still PENDING, not APPROVED
        // Changes made to order1 are invisible to order2
        
        order2.markAsPaid();  // Now we have conflicting states
        
        // When we commit, which version wins? order1 or order2?
        // This leads to data loss or corruption
    }
}
 
// With identity map - CORRECT
class CorrectOrderService {
    async processOrder(orderId: OrderId) {
        const order1 = await this.orderRepository.findById(orderId);
        order1.approve();
        
        await this.processPayment(orderId);
    }
    
    async processPayment(orderId: OrderId) {
        const order2 = await this.orderRepository.findById(orderId);
        
        // order2 IS THE SAME INSTANCE as order1!
        // order2.status is APPROVED
        // All changes are visible because it's the same object
        
        order2.markAsPaid();  // Consistent modification
        
        // At commit, there's one object with all changes
    }
}

ORM Identity Maps

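Many ORMs (for example Entity Framework, Hibernate, and MikroORM) maintain an identity map automatically within a database context or session. This is one reason repositories often wrap ORM operations rather than raw SQL: the ORM handles the identity-map bookkeeping. Not every data-access library does this. Prisma Client, for instance, returns plain objects without an identity map or change tracking, so a repository built on it has to provide that bookkeeping itself.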

Identity Map Implementation

An identity map is essentially a Map<Identity, Aggregate> that tracks all loaded aggregates. When findById() is called:

•Check if the identity is already in the map
•If yes, return the cached instance
•If no, load from database, add to map, return

The identity map is typically scoped to a Unit of Work (transaction/request).

Identity Map Implementation Sketch
TypeScript
class OrderRepositoryImpl implements OrderRepository {
    // Identity map - scoped to this unit of work
    private identityMap = new Map<string, Order>();
    
    async findById(orderId: OrderId): Promise<Order | null> {
        const key = orderId.value;
        
        // Check identity map first
        if (this.identityMap.has(key)) {
            return this.identityMap.get(key)!;
        }
        
        // Not cached - load from database
        const data = await this.database.query(
            'SELECT * FROM orders WHERE id = ?',
            [key]
        );
        
        if (!data) return null;
        
        // Reconstitute the aggregate
        const order = this.mapper.toDomain(data);
        
        // Store in identity map
        this.identityMap.set(key, order);
        
        return order;
    }
    
    async add(order: Order): Promise<void> {
        // Add to identity map immediately
        this.identityMap.set(order.id.value, order);
        
        // Mark for insertion in Unit of Work
        this.unitOfWork.registerNew(order);
    }
    
    // At unit of work commit, persist all tracked changes
}
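
One subtlety of an asynchronous identity map: if two findById() calls for the same ID overlap, both can miss the map before either load finishes, and each then builds its own instance. A possible remedy, sketched here under assumptions, is to cache the in-flight promise instead of the loaded object. OrderLoader and PromiseCachingIdentityMap are hypothetical names; the loader stands in for the database query plus mapper.

```typescript
// Hypothetical minimal types, for illustration only.
type OrderId = string;
interface Order { readonly id: OrderId; status: string; }

// Stand-in for "query the database and reconstitute the aggregate".
type OrderLoader = (id: OrderId) => Promise<Order | null>;

class PromiseCachingIdentityMap {
    private readonly loads = new Map<OrderId, Promise<Order | null>>();
    private readonly loader: OrderLoader;

    constructor(loader: OrderLoader) {
        this.loader = loader;
    }

    findById(id: OrderId): Promise<Order | null> {
        const inFlight = this.loads.get(id);
        if (inFlight !== undefined) return inFlight;

        const load = this.loader(id).then(
            order => {
                // Do not remember misses: the aggregate may be added later.
                if (order === null) this.loads.delete(id);
                return order;
            },
            (error: unknown) => {
                // Do not remember failures either, so a retry can succeed.
                this.loads.delete(id);
                throw error;
            }
        );

        // Registered synchronously, before any await, so an overlapping
        // call for the same id finds this promise instead of loading again.
        this.loads.set(id, load);
        return load;
    }
}
```

Callers still just await findById(); overlapping calls resolve to the same object, so reference equality holds even under concurrency within one unit of work.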

The Absence of Save()

One of the most counterintuitive aspects of pure repository design is the absence of a save() or update() method. This absence is intentional and flows directly from the collection metaphor.

Consider an in-memory collection:

Collections Don't Have Save
TypeScript
// In-memory collection behavior
const orders = new Map<string, Order>();
 
// Add an order
orders.set(orderId, order);
 
// Retrieve and modify
const retrieved = orders.get(orderId);
retrieved?.approve();  // Change is immediate and visible (Map.get may return undefined)
 
// No "save" needed - the map already has the modified object
// orders.get(orderId).status === 'APPROVED'
 
// You don't call:
// orders.save(retrieved);  // This method doesn't exist!

If the repository is a collection, the same should apply.

When you retrieve an aggregate, modify it, and the unit of work completes, changes are persisted. The repository tracks which aggregates have been retrieved (via identity map), and the Unit of Work pattern detects changes to those aggregates (via dirty tracking or explicit registration).
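
Dirty tracking can be implemented in several ways. The sketch below uses snapshot comparison: the state of each aggregate is serialized when it is loaded and compared again at commit. SnapshotUnitOfWork, registerClean(), and the persist callback are hypothetical names, and JSON.stringify is used as the snapshot only because this toy aggregate has plain, serializable fields.

```typescript
// Hypothetical aggregate with plain serializable state, for illustration only.
class Order {
    readonly id: string;
    status: "PENDING" | "APPROVED";

    constructor(id: string) {
        this.id = id;
        this.status = "PENDING";
    }

    approve(): void {
        this.status = "APPROVED";
    }
}

class SnapshotUnitOfWork {
    private readonly tracked = new Map<string, Order>();
    // Serialized state of each aggregate as of load time or last commit.
    private readonly snapshots = new Map<string, string>();

    // Assumed to be called by the repository whenever it hands out an aggregate.
    registerClean(order: Order): void {
        this.tracked.set(order.id, order);
        this.snapshots.set(order.id, JSON.stringify(order));
    }

    // Aggregates whose current state differs from their snapshot.
    collectDirty(): Order[] {
        return Array.from(this.tracked.values()).filter(
            order => JSON.stringify(order) !== this.snapshots.get(order.id)
        );
    }

    // A real commit would issue UPDATEs inside a transaction; the persist
    // callback is a stand-in supplied by the caller. Returns how many
    // aggregates were written.
    async commit(persist: (order: Order) => Promise<void>): Promise<number> {
        const dirty = this.collectDirty();
        for (const order of dirty) {
            await persist(order);
            this.snapshots.set(order.id, JSON.stringify(order));
        }
        return dirty.length;
    }
}
```

With this in place, service code calls order.approve() and nothing else; the modified aggregate is picked up at commit because its state no longer matches its snapshot, while untouched aggregates are skipped.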

However, pragmatic considerations:

Many real-world codebases include a save() method on repositories for several reasons:

•Explicit intent — Calling save() makes the persistence intent clear in code, aiding readability.
•Simpler implementation — Without sophisticated change tracking, explicit save() is easier to implement.
•Framework constraints — Some ORMs or frameworks don't support implicit change tracking well.
•Transaction boundaries — Explicit save() can help developers understand when persistence occurs.

Pragmatic Repository with Save
TypeScript
// Pragmatic approach: include save() for clarity
interface OrderRepository {
    findById(orderId: OrderId): Promise<Order | null>;
    add(order: Order): Promise<void>;
    save(order: Order): Promise<void>;  // Explicit update
    remove(order: Order): Promise<void>;
}
 
// Usage
class OrderService {
    async approveOrder(orderId: OrderId): Promise<void> {
        const order = await this.orderRepository.findById(orderId);
        if (!order) throw new OrderNotFoundException(orderId);
        
        order.approve();
        
        // Explicit save makes the persistence visible
        await this.orderRepository.save(order);
    }
}
 
// Note: The "pure" DDD approach would omit save() and use
// Unit of Work to detect and persist changes automatically.
// The pragmatic approach includes it for clarity.

Choose Consistently

Whether you include save() or use pure Unit of Work tracking, be consistent across your codebase. A mixed approach where some repositories require save() and others don't leads to confusion and bugs.

Testing with Collection Semantics

The collection abstraction shines brightest in testing. Because a repository behaves like a collection, you can implement test doubles using actual in-memory collections. No mocking frameworks, no database setup: just a Map.

In-Memory Test Repository
TypeScript
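// In-memory implementation for testing
// (only the methods exercised by the tests are shown; a complete double
// would implement every method declared on OrderRepository)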
class InMemoryOrderRepository implements OrderRepository {
    private orders = new Map<string, Order>();
    
    async findById(orderId: OrderId): Promise<Order | null> {
        return this.orders.get(orderId.value) ?? null;
    }
    
    async findByCustomer(customerId: CustomerId): Promise<Order[]> {
        return Array.from(this.orders.values())
            .filter(o => o.customerId.equals(customerId));
    }
    
    async findPendingOrders(): Promise<Order[]> {
        return Array.from(this.orders.values())
            .filter(o => o.status === OrderStatus.PENDING);
    }
    
    async add(order: Order): Promise<void> {
        if (this.orders.has(order.id.value)) {
            throw new Error('Order already exists');
        }
        this.orders.set(order.id.value, order);
    }
    
    async remove(order: Order): Promise<void> {
        this.orders.delete(order.id.value);
    }
    
    // Test helper methods
    clear(): void {
        this.orders.clear();
    }
    
    count(): number {
        return this.orders.size;
    }
    
    getAll(): Order[] {
        return Array.from(this.orders.values());
    }
}

Now tests become trivial:

Testing Domain Logic
TypeScript
describe('OrderService', () => {
    let orderRepository: InMemoryOrderRepository;
    let orderService: OrderService;
    
    beforeEach(() => {
        orderRepository = new InMemoryOrderRepository();
        orderService = new OrderService(orderRepository);
    });
    
    describe('placeOrder', () => {
        it('should create and store a new order', async () => {
            const customerId = CustomerId.create('customer-123');
            const items = [OrderItem.create('product-1', 2, Money.of(100))];
            
            const order = await orderService.placeOrder(customerId, items);
            
            // Verify the order exists in the repository
            const stored = await orderRepository.findById(order.id);
            expect(stored).not.toBeNull();
            expect(stored!.customerId).toEqual(customerId);
            expect(stored!.items).toHaveLength(1);
        });
    });
    
    describe('cancelOrder', () => {
        it('should mark an existing order as cancelled', async () => {
            // Arrange: set up an existing order
            const order = Order.create(
                CustomerId.create('customer-123'),
                [OrderItem.create('product-1', 1, Money.of(50))]
            );
            await orderRepository.add(order);
            
            // Act
            await orderService.cancelOrder(order.id);
            
            // Assert
            const cancelled = await orderRepository.findById(order.id);
            expect(cancelled!.status).toBe(OrderStatus.CANCELLED);
        });
        
        it('should throw when order does not exist', async () => {
            const nonExistentId = OrderId.create('does-not-exist');
            
            await expect(orderService.cancelOrder(nonExistentId))
                .rejects.toThrow(OrderNotFoundException);
        });
    });
});
 
// No database setup, no cleanup, no mocking frameworks
// Tests are fast, deterministic, and focused on domain logic

Test Speed and Reliability

In-memory repository tests run in milliseconds, not seconds. They never fail due to database connection issues. They don't require Docker, migrations, or test data setup. This is the testing power of proper abstractions.
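
An in-memory double is only trustworthy if it honors the same collection semantics as the real repository. One way to guard against drift, sketched here under assumptions, is a small contract check that can be run against any implementation. checkCollectionContract, ensure, and MapOrderRepository are hypothetical names, and the Order type is reduced to its identity for brevity.

```typescript
// Hypothetical minimal types, for illustration only.
type OrderId = string;
interface Order { readonly id: OrderId; }

interface OrderRepository {
    add(order: Order): Promise<void>;
    findById(id: OrderId): Promise<Order | null>;
    remove(order: Order): Promise<void>;
}

function ensure(condition: boolean, message: string): void {
    if (!condition) throw new Error(message);
}

// Rejects if the repository violates the basic add/find/remove semantics.
// The sample aggregate must not already be in the repository.
async function checkCollectionContract(
    repo: OrderRepository,
    sample: Order
): Promise<void> {
    ensure((await repo.findById(sample.id)) === null, "unknown id should yield null");

    await repo.add(sample);
    const found = await repo.findById(sample.id);
    ensure(found !== null && found.id === sample.id, "added aggregate should be found");

    let duplicateRejected = false;
    try {
        await repo.add(sample);
    } catch {
        duplicateRejected = true;
    }
    ensure(duplicateRejected, "adding an existing aggregate should fail");

    await repo.remove(sample);
    ensure((await repo.findById(sample.id)) === null, "removed aggregate should be gone");
}

// Map-backed double, used here to show the check passing.
class MapOrderRepository implements OrderRepository {
    private readonly orders = new Map<OrderId, Order>();

    async add(order: Order): Promise<void> {
        if (this.orders.has(order.id)) throw new Error("Order already exists");
        this.orders.set(order.id, order);
    }

    async findById(id: OrderId): Promise<Order | null> {
        return this.orders.get(id) ?? null;
    }

    async remove(order: Order): Promise<void> {
        this.orders.delete(order.id);
    }
}
```

The same function could be pointed at a database-backed repository in a slower integration suite, so both implementations are held to one definition of "behaves like a collection".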

Summary: Repository as Collection

The collection metaphor is the conceptual foundation of the Repository pattern. Let's consolidate what we've learned:

Key Takeaways

•A repository behaves like an in-memory collection — Add, remove, and query operations mimic Map or Set semantics.
•Query methods express domain concepts — Use findPendingOrders(), not findByStatusEquals('PENDING').
•Identity maps ensure consistency — The same aggregate retrieved twice is the same object instance.
•Save() is optional — Pure repositories with Unit of Work don't need explicit save; pragmatic repositories include it.
•Testing becomes trivial — In-memory implementations using HashMaps satisfy the interface perfectly.
•The metaphor guides design — If an operation doesn't make sense for a collection, it probably doesn't belong on a repository.

What's next:

We've established how to think about repositories conceptually. The next page dives into the practical details of Repository Interface Design—how to craft repository interfaces that are expressive, clean, and aligned with domain needs.

Page Complete

You now understand the collection abstraction that underlies the Repository pattern. This mental model will guide every repository you design—think collection first, infrastructure second.

2 / 4

Loading learning content...

System Design (LLD)Domain-Driven Design

Repositories

LevelIntermediate

Duration60 mins

TopicDomain-Driven Design

2 / 4

Repository as Collection Abstraction

The Collection Metaphor

What You Will Learn

Thinking in Collections

Consider how you interact with a standard in-memory collection:

Standard Collection Operations
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
// In-memory collection behavior
const orders = new Map<string, Order>();
 
// Adding an item - it's now part of the collection
orders.set(order.id, order);
 
// Retrieving an item - it's returned from the collection
const retrieved = orders.get(orderId);
 
// Removing an item - it's no longer part of the collection
orders.delete(orderId);
 
// Querying the collection
const customerOrders = Array.from(orders.values())
    .filter(o => o.customerId === customerId);
 
// Modifying a retrieved object - changes are in the collection
retrieved.cancel();  // The order in the map is now cancelled
 
// Note: No "save" or "persist" operation needed
// Changes to objects in the collection are just... there

This creates a powerful abstraction: domain code can pretend that all aggregates are always in memory, available for immediate access and modification. The repository maintains this illusion.

Collection Operations Mapped to Repository
Collection Operation	Repository Equivalent	Database Effect
`collection.add(item)`	`repository.add(aggregate)`	INSERT (or queue for INSERT)
`collection.get(key)`	`repository.findById(id)`	SELECT with WHERE clause
`collection.remove(item)`	`repository.remove(aggregate)`	DELETE (or queue for DELETE)
`collection.filter(predicate)`	`repository.findByXxx(criteria)`	SELECT with filtering
`item.mutate()`	Same—mutate retrieved aggregate	UPDATE (tracked by Unit of Work)
`collection.contains(item)`	`repository.exists(id)`	SELECT EXISTS or COUNT

The Missing Operation

Core Collection Operations

A repository exposes a specific set of collection-like operations. Understanding each operation's semantics is essential for correct usage and implementation.

Add: Inserting a New Aggregate

The add() operation places a new aggregate into the collection. After this call, the aggregate is considered part of the repository's conceptual set.

Critical semantics:

The aggregate must be new (never previously added or retrieved from this repository)
The aggregate's identity must be unique within the collection
After add(), subsequent findById() calls should return this aggregate
The actual database INSERT may be deferred (via Unit of Work)

Add Operation Semantics
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
interface OrderRepository {
    add(order: Order): Promise<void>;
}
 
// Usage
class OrderService {
    async placeOrder(customerId: CustomerId, items: OrderItem[]): Promise<Order> {
        // Create a new aggregate
        const order = Order.create(customerId, items);
        
        // Add to the collection - it's now part of the repository
        await this.orderRepository.add(order);
        
        // At this point, logically, findById(order.id) would return this order
        // (though actual persistence may be deferred)
        
        return order;
    }
}
 
// Implementation note: add() should NOT accept existing aggregates
// Trying to add an already-existing aggregate is an error

Find By Identity: The Primary Retrieval

The findById() operation is the most fundamental query. It retrieves an aggregate by its unique identity, returning the fully reconstituted aggregate or null if not found.

Critical semantics:

Returns the complete aggregate including all entities and value objects
Returns null (or Optional.empty()) if no aggregate has this identity
The returned aggregate is connected to the repository—changes are tracked
Multiple calls with the same ID should return the same instance (identity map)

FindById Operation Semantics
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
interface OrderRepository {
    findById(orderId: OrderId): Promise<Order | null>;
}
 
// Usage
class OrderService {
    async cancelOrder(orderId: OrderId): Promise<void> {
        // Retrieve the aggregate
        const order = await this.orderRepository.findById(orderId);
        
        if (!order) {
            throw new OrderNotFoundException(orderId);
        }
        
        // Modify the aggregate - changes are tracked
        order.cancel();
        
        // No need to call save() - the Unit of Work will
        // detect changes and persist them
    }
}
 
// Implementation note: identity map pattern ensures that
// multiple findById() calls in the same transaction return
// the same object instance, preserving reference equality

Remove: Deleting an Aggregate

The remove() operation removes an aggregate from the collection. After this call, the aggregate is no longer part of the repository's conceptual set.

Critical semantics:

The aggregate should have been previously retrieved from this repository
After remove(), findById() with this ID should return null
The actual database DELETE may be deferred (via Unit of Work)
Removing a non-existent aggregate should be handled gracefully (no-op or exception)

Remove Operation Semantics
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
interface OrderRepository {
    remove(order: Order): Promise<void>;
}
 
// Usage
class OrderService {
    async deleteOrder(orderId: OrderId): Promise<void> {
        // First retrieve the aggregate
        const order = await this.orderRepository.findById(orderId);
        
        if (!order) {
            throw new OrderNotFoundException(orderId);
        }
        
        // Verify business rules allow deletion
        if (!order.canBeDeleted()) {
            throw new OrderCannotBeDeletedError(orderId);
        }
        
        // Remove from the collection
        await this.orderRepository.remove(order);
        
        // At this point, the order is no longer part of the collection
        // The actual DELETE happens at transaction commit
    }
}
 
// Note: Some implementations use removeById(orderId) for convenience
// when you don't need the aggregate for business logic validation

Rich Query Operations

Beyond basic add/find/remove, repositories provide domain-meaningful query operations. These operations filter the conceptual collection based on criteria that make sense in the domain.

The key principle: Query methods should express domain concepts, not database concepts.

Domain-Meaningful Queries

•findPendingOrders()
•findByCustomer(customerId)
•findOrdersPlacedBetween(start, end)
•findOverdueOrders()
•findHighValueOrders(threshold)

Database-Centric Queries (Avoid)

•findByStatusEquals('PENDING')
•findByCustomerIdIn([...])
•findByCreatedAtBetween(a, b)
•findByDueDateBeforeAndStatusIn(...)
•findByTotalGreaterThan(amount)

The Spring Data Trap

Rich Query Methods
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
interface OrderRepository {
    // Identity-based retrieval
    findById(orderId: OrderId): Promise<Order | null>;
    
    // Domain-meaningful queries
    findByCustomer(customerId: CustomerId): Promise<Order[]>;
    findPendingOrders(): Promise<Order[]>;
    findOrdersAwaitingShipment(): Promise<Order[]>;
    findOrdersPlacedBetween(start: Date, end: Date): Promise<Order[]>;
    
    // Existence checks
    exists(orderId: OrderId): Promise<boolean>;
    hasCustomerPlacedOrderBefore(customerId: CustomerId): Promise<boolean>;
    
    // Counting (for domain purposes, not pagination)
    countPendingOrders(): Promise<number>;
    
    // Collection operations
    add(order: Order): Promise<void>;
    remove(order: Order): Promise<void>;
}
 
// Note: Each finder method returns complete aggregates
// This ensures invariants are maintained when working with results

Pagination and Sorting Considerations

Pagination and sorting are often needed but create tension with the collection metaphor. A finite page of results doesn't feel like a collection.

Recommendations:

Keep repositories aggregate-focused — Repositories retrieve aggregates for command processing, not for display.
Use CQRS for read-heavy use cases — If you need paginated lists for UI display, create separate read models/projections.
If pagination is unavoidable, make it explicit with domain-meaningful semantics:

Pagination When Necessary
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// If you must have pagination in a repository, make it domain-meaningful
interface OrderRepository {
    // Returns oldest pending orders (for processing queue)
    findOldestPendingOrders(limit: number): Promise<Order[]>;
    
    // Returns recent orders for a customer (for recent activity)
    findRecentOrdersByCustomer(
        customerId: CustomerId,
        limit: number
    ): Promise<Order[]>;
    
    // For paging through large result sets (use sparingly)
    findAllOrdersPaged(page: PageRequest): Promise<PagedResult<Order>>;
}
 
// But consider: if you're paginating heavily, you might need CQRS
// with separate read models optimized for querying
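
The CQRS suggestion can be sketched as follows. Everything here is hypothetical: `OrderSummary`, `Page`, `OrderSummaryQuery`, and the in-memory implementation are illustrative names, not part of the original design. The idea is that the read side returns flat, display-ready rows with paging metadata, while `OrderRepository` keeps returning whole aggregates.

```typescript
// Hypothetical read model: a flat row shaped for display, not an aggregate.
interface OrderSummary {
    orderId: string;
    customerName: string;
    total: number;
    placedAt: Date;
}

// Hypothetical paging envelope used only on the read side.
interface Page<T> {
    items: T[];
    pageNumber: number;  // zero-based
    pageSize: number;
    totalItems: number;
}

// Query-side interface, kept separate from OrderRepository.
interface OrderSummaryQuery {
    listNewestFirst(pageNumber: number, pageSize: number): Promise<Page<OrderSummary>>;
}

// In-memory stand-in; a real implementation might read a denormalized table or view.
class InMemoryOrderSummaryQuery implements OrderSummaryQuery {
    private rows: OrderSummary[];

    constructor(rows: OrderSummary[]) {
        this.rows = rows;
    }

    async listNewestFirst(pageNumber: number, pageSize: number): Promise<Page<OrderSummary>> {
        // Copy before sorting so the source rows are not reordered.
        const sorted = this.rows.slice().sort(
            (a, b) => b.placedAt.getTime() - a.placedAt.getTime()
        );
        const start = pageNumber * pageSize;
        return {
            items: sorted.slice(start, start + pageSize),
            pageNumber: pageNumber,
            pageSize: pageSize,
            totalItems: sorted.length,
        };
    }
}
```

Because the query side never hands out aggregates, it can sort, page, and join freely without weakening the repository's collection semantics.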

The Identity Map Pattern

For the collection metaphor to hold, retrieving the same aggregate twice within a single unit of work must return the same object instance. This is the Identity Map pattern, and repository correctness depends on it.

Why identity consistency matters:

Consider two service methods that both retrieve the same order:

Identity Map Necessity
TypeScript
// Without identity map - BROKEN
class BrokenOrderService {
    constructor(private readonly orderRepository: OrderRepository) {}

    async processOrder(orderId: OrderId): Promise<void> {
        const order1 = await this.orderRepository.findById(orderId);
        if (!order1) throw new OrderNotFoundException(orderId);
        order1.approve();  // Sets status to APPROVED
        
        // Some intermediate processing...
        await this.processPayment(orderId);
    }
    
    async processPayment(orderId: OrderId): Promise<void> {
        const order2 = await this.orderRepository.findById(orderId);
        if (!order2) throw new OrderNotFoundException(orderId);
        
        // order2 is a DIFFERENT INSTANCE than order1!
        // order2.status is still PENDING, not APPROVED
        // Changes made to order1 are invisible to order2
        
        order2.markAsPaid();  // Now we have conflicting states
        
        // When we commit, which version wins? order1 or order2?
        // This leads to data loss or corruption
    }
}
 
// With identity map - CORRECT
// The service code is unchanged; only the repository's behavior differs.
class CorrectOrderService {
    constructor(private readonly orderRepository: OrderRepository) {}

    async processOrder(orderId: OrderId): Promise<void> {
        const order1 = await this.orderRepository.findById(orderId);
        if (!order1) throw new OrderNotFoundException(orderId);
        order1.approve();
        
        await this.processPayment(orderId);
    }
    
    async processPayment(orderId: OrderId): Promise<void> {
        const order2 = await this.orderRepository.findById(orderId);
        if (!order2) throw new OrderNotFoundException(orderId);
        
        // order2 IS THE SAME INSTANCE as order1!
        // order2.status is APPROVED
        // All changes are visible because it's the same object
        
        order2.markAsPaid();  // Consistent modification
        
        // At commit, there's one object with all changes
    }
}
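
The behavior described in those comments comes from the repository rather than the service. The following self-contained sketch makes that visible. `Order`, `FreshInstanceRepository`, and `IdentityMappedRepository` are simplified, hypothetical stand-ins, and the `rows` map plays the role of the database.

```typescript
// Simplified stand-in for the Order aggregate.
class Order {
    id: string;
    status: string;

    constructor(id: string, status: string) {
        this.id = id;
        this.status = status;
    }

    approve(): void {
        this.status = 'APPROVED';
    }
}

// Plays the role of a database row.
type OrderRow = { id: string; status: string };

// Without an identity map: every lookup reconstitutes a new object.
class FreshInstanceRepository {
    private rows: Map<string, OrderRow>;

    constructor(rows: Map<string, OrderRow>) {
        this.rows = rows;
    }

    async findById(id: string): Promise<Order | null> {
        const row = this.rows.get(id);
        return row ? new Order(row.id, row.status) : null;
    }
}

// With an identity map: the first lookup is cached and reused.
class IdentityMappedRepository {
    private rows: Map<string, OrderRow>;
    private identityMap = new Map<string, Order>();

    constructor(rows: Map<string, OrderRow>) {
        this.rows = rows;
    }

    async findById(id: string): Promise<Order | null> {
        const cached = this.identityMap.get(id);
        if (cached) return cached;

        const row = this.rows.get(id);
        if (!row) return null;

        const order = new Order(row.id, row.status);
        this.identityMap.set(id, order);
        return order;
    }
}

// Loads order 'o-1', approves it, then loads it again and reports what the second lookup sees.
async function approveThenReload(
    repo: { findById(id: string): Promise<Order | null> }
): Promise<{ sameInstance: boolean; reloadedStatus: string }> {
    const first = await repo.findById('o-1');
    if (!first) throw new Error('order o-1 not found');
    first.approve();

    const second = await repo.findById('o-1');
    if (!second) throw new Error('order o-1 not found');
    return { sameInstance: first === second, reloadedStatus: second.status };
}
```

Given a row for `o-1` with status `PENDING`, `approveThenReload` with `FreshInstanceRepository` reports a different instance whose status is still `PENDING`, while `IdentityMappedRepository` reports the same instance with status `APPROVED`.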

ORM Identity Maps

Identity Map Implementation

An identity map is essentially a Map<Identity, Aggregate> that tracks all loaded aggregates. When findById() is called:

1. Check whether the identity is already in the map.
2. If it is, return the cached instance.
3. If it is not, load the aggregate from the database, add it to the map, and return it.

The identity map is typically scoped to a Unit of Work (transaction/request).

Identity Map Implementation Sketch
TypeScript
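class OrderRepositoryImpl implements OrderRepository {
    // Identity map - scoped to this unit of work
    private identityMap = new Map<string, Order>();

    // Collaborators are injected. Database, OrderMapper and UnitOfWork are
    // assumed infrastructure types that this sketch does not define.
    constructor(
        private readonly database: Database,
        private readonly mapper: OrderMapper,
        private readonly unitOfWork: UnitOfWork
    ) {}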
    
    async findById(orderId: OrderId): Promise<Order | null> {
        const key = orderId.value;
        
        // Check identity map first
        const cached = this.identityMap.get(key);
        if (cached) {
            return cached;
        }
        
        // Not cached - load from database
        const data = await this.database.query(
            'SELECT * FROM orders WHERE id = ?',
            [key]
        );
        
        if (!data) return null;
        
        // Reconstitute the aggregate
        const order = this.mapper.toDomain(data);
        
        // Store in identity map
        this.identityMap.set(key, order);
        
        return order;
    }
    
    async add(order: Order): Promise<void> {
        // Add to identity map immediately
        this.identityMap.set(order.id.value, order);
        
        // Mark for insertion in Unit of Work
        this.unitOfWork.registerNew(order);
    }
    
    // At unit of work commit, persist all tracked changes
}
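
The scoping of the identity map to a unit of work can be illustrated with a small runnable sketch. All names here (`FakeOrderTable`, `ScopedOrderRepository`, `inUnitOfWork`) are hypothetical, and the sketch assumes one repository instance is created per unit of work, which is only one of several ways to achieve that scoping.

```typescript
// Hypothetical minimal aggregate for this sketch.
class Order {
    id: string;
    status: string;

    constructor(id: string, status: string) {
        this.id = id;
        this.status = status;
    }
}

type OrderRow = { id: string; status: string };

// Stand-in for the database; counts reads so the caching effect is observable.
class FakeOrderTable {
    reads = 0;
    private rows = new Map<string, OrderRow>();

    insert(row: OrderRow): void {
        this.rows.set(row.id, row);
    }

    selectById(id: string): OrderRow | undefined {
        this.reads += 1;
        return this.rows.get(id);
    }
}

// One repository instance per unit of work, so the identity map
// is created and discarded together with it.
class ScopedOrderRepository {
    private identityMap = new Map<string, Order>();
    private table: FakeOrderTable;

    constructor(table: FakeOrderTable) {
        this.table = table;
    }

    async findById(id: string): Promise<Order | null> {
        const cached = this.identityMap.get(id);
        if (cached) return cached;

        const row = this.table.selectById(id);
        if (!row) return null;

        const order = new Order(row.id, row.status);
        this.identityMap.set(id, order);
        return order;
    }
}

// Hypothetical helper: runs `work` with a repository scoped to one unit of work.
async function inUnitOfWork<T>(
    table: FakeOrderTable,
    work: (orders: ScopedOrderRepository) => Promise<T>
): Promise<T> {
    return work(new ScopedOrderRepository(table));
}
```

Inside one `inUnitOfWork` call, two lookups of the same id return the same instance and hit the table once. A second `inUnitOfWork` call starts with an empty map, so it reads the table again and gets its own instance, which keeps one request's uncommitted changes from leaking into another.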

The Absence of Save()

One of the most counterintuitive aspects of pure repository design is the absence of a save() or update() method. This absence is intentional and flows directly from the collection metaphor.

Consider an in-memory collection:

Collections Don't Have Save
TypeScript
// In-memory collection behavior
const orders = new Map<string, Order>();
 
// Add an order
orders.set(orderId, order);
 
// Retrieve and modify (Map.get may return undefined, hence the ?.)
const retrieved = orders.get(orderId);
retrieved?.approve();  // Change is immediate and visible
 
// No "save" needed - the map already holds the modified object
// orders.get(orderId)?.status === 'APPROVED'
 
// You don't call:
// orders.save(retrieved);  // This method doesn't exist!

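If the repository is a collection, the same should apply: you modify the retrieved aggregate, and the change is persisted when the Unit of Work commits, with no explicit save() call.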

However, pragmatic considerations:

Many real-world codebases include a save() method on repositories for several reasons:

•Explicit intent — Calling save() makes the persistence intent clear in code, aiding readability.
•Simpler implementation — Without sophisticated change tracking, explicit save() is easier to implement.
•Framework constraints — Some ORMs or frameworks don't support implicit change tracking well.
•Transaction boundaries — Explicit save() can help developers understand when persistence occurs.

Pragmatic Repository with Save
TypeScript
// Pragmatic approach: include save() for clarity
interface OrderRepository {
    findById(orderId: OrderId): Promise<Order | null>;
    add(order: Order): Promise<void>;
    save(order: Order): Promise<void>;  // Explicit update
    remove(order: Order): Promise<void>;
}
 
// Usage
class OrderService {
    async approveOrder(orderId: OrderId): Promise<void> {
        const order = await this.orderRepository.findById(orderId);
        if (!order) throw new OrderNotFoundException(orderId);
        
        order.approve();
        
        // Explicit save makes the persistence visible
        await this.orderRepository.save(order);
    }
}
 
// Note: The "pure" DDD approach would omit save() and use
// Unit of Work to detect and persist changes automatically.
// The pragmatic approach includes it for clarity.
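
For comparison, here is a minimal sketch of how a Unit of Work might detect changes so that no save() call is needed. It assumes snapshot comparison via `JSON.stringify`, chosen only for brevity; real ORMs use a variety of strategies. `SnapshotUnitOfWork`, `registerClean`, and `commit` are hypothetical names.

```typescript
// Simplified aggregate for this sketch.
class Order {
    id: string;
    status: string;

    constructor(id: string, status: string) {
        this.id = id;
        this.status = status;
    }

    approve(): void {
        this.status = 'APPROVED';
    }
}

// Hypothetical Unit of Work that tracks loaded aggregates by snapshot comparison.
class SnapshotUnitOfWork {
    private tracked = new Map<string, { order: Order; snapshot: string }>();

    // Called by the repository whenever it hands out a freshly loaded aggregate.
    registerClean(order: Order): void {
        this.tracked.set(order.id, { order: order, snapshot: JSON.stringify(order) });
    }

    // Returns the aggregates whose state differs from the snapshot taken at load time.
    collectDirty(): Order[] {
        const dirty: Order[] = [];
        this.tracked.forEach(entry => {
            if (JSON.stringify(entry.order) !== entry.snapshot) {
                dirty.push(entry.order);
            }
        });
        return dirty;
    }

    // Writes every dirty aggregate through `persist`, then refreshes its snapshot.
    // Returns the number of aggregates written.
    commit(persist: (order: Order) => void): number {
        const dirty = this.collectDirty();
        dirty.forEach(order => {
            persist(order);
            const entry = this.tracked.get(order.id);
            if (entry) entry.snapshot = JSON.stringify(order);
        });
        return dirty.length;
    }
}

const unitOfWork = new SnapshotUnitOfWork();
const first = new Order('o-1', 'PENDING');
const second = new Order('o-2', 'PENDING');
unitOfWork.registerClean(first);
unitOfWork.registerClean(second);

first.approve();  // No save() call; the change is picked up at commit

const written: string[] = [];
const writtenCount = unitOfWork.commit(order => { written.push(order.id); });
```

Only `o-1` is written at commit, because it is the only aggregate whose state changed after loading. The cost of this approach is the bookkeeping itself, which is one reason many teams prefer the explicit save() shown above.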

Choose Consistently

Testing with Collection Semantics

In-Memory Test Repository
TypeScript
// In-memory implementation for testing
class InMemoryOrderRepository implements OrderRepository {
    private orders = new Map<string, Order>();
    
    async findById(orderId: OrderId): Promise<Order | null> {
        return this.orders.get(orderId.value) ?? null;
    }
    
    async findByCustomer(customerId: CustomerId): Promise<Order[]> {
        return Array.from(this.orders.values())
            .filter(o => o.customerId.equals(customerId));
    }
    
    async findPendingOrders(): Promise<Order[]> {
        return Array.from(this.orders.values())
            .filter(o => o.status === OrderStatus.PENDING);
    }
    
    async add(order: Order): Promise<void> {
        if (this.orders.has(order.id.value)) {
            throw new Error('Order already exists');
        }
        this.orders.set(order.id.value, order);
    }
    
    async remove(order: Order): Promise<void> {
        this.orders.delete(order.id.value);
    }
    
    // Test helper methods
    clear(): void {
        this.orders.clear();
    }
    
    count(): number {
        return this.orders.size;
    }
    
    getAll(): Order[] {
        return Array.from(this.orders.values());
    }
}

Now tests become trivial:

Testing Domain Logic
TypeScript
describe('OrderService', () => {
    let orderRepository: InMemoryOrderRepository;
    let orderService: OrderService;
    
    beforeEach(() => {
        orderRepository = new InMemoryOrderRepository();
        orderService = new OrderService(orderRepository);
    });
    
    describe('placeOrder', () => {
        it('should create and store a new order', async () => {
            const customerId = CustomerId.create('customer-123');
            const items = [OrderItem.create('product-1', 2, Money.of(100))];
            
            const order = await orderService.placeOrder(customerId, items);
            
            // Verify the order exists in the repository
            const stored = await orderRepository.findById(order.id);
            expect(stored).not.toBeNull();
            expect(stored!.customerId).toEqual(customerId);
            expect(stored!.items).toHaveLength(1);
        });
    });
    
    describe('cancelOrder', () => {
        it('should mark an existing order as cancelled', async () => {
            // Arrange: set up an existing order
            const order = Order.create(
                CustomerId.create('customer-123'),
                [OrderItem.create('product-1', 1, Money.of(50))]
            );
            await orderRepository.add(order);
            
            // Act
            await orderService.cancelOrder(order.id);
            
            // Assert
            const cancelled = await orderRepository.findById(order.id);
            expect(cancelled!.status).toBe(OrderStatus.CANCELLED);
        });
        
        it('should throw when order does not exist', async () => {
            const nonExistentId = OrderId.create('does-not-exist');
            
            await expect(orderService.cancelOrder(nonExistentId))
                .rejects.toThrow(OrderNotFoundException);
        });
    });
});
 
// No database setup, no cleanup, no mocking frameworks
// Tests are fast, deterministic, and focused on domain logic

Test Speed and Reliability

Summary: Repository as Collection

The collection metaphor is the conceptual foundation of the Repository pattern. Let's consolidate what we've learned:

Key Takeaways

•A repository behaves like an in-memory collection — Add, remove, and query operations mimic Map or Set semantics.
•Query methods express domain concepts — Use findPendingOrders(), not findByStatusEquals('PENDING').
•Identity maps ensure consistency — The same aggregate retrieved twice is the same object instance.
•Save() is optional — Pure repositories with Unit of Work don't need explicit save; pragmatic repositories include it.
•Testing becomes trivial — In-memory implementations using HashMaps satisfy the interface perfectly.
•The metaphor guides design — If an operation doesn't make sense for a collection, it probably doesn't belong on a repository.

What's next:

Page Complete

You now understand the collection abstraction that underlies the Repository pattern. This mental model will guide every repository you design—think collection first, infrastructure second.
