Repositories

Level: Intermediate

Duration: 60 mins

Topic: Domain-Driven Design

What is a Repository

The Persistence Conundrum

Every domain-driven application faces a fundamental tension: domain objects must be stored and retrieved, yet the mechanics of storage—SQL queries, ORM configurations, connection management, transaction handling—are the antithesis of domain logic. These concerns operate at entirely different levels of abstraction, and allowing them to intermingle creates systems that are difficult to understand, test, and maintain.

Imagine you're building an e-commerce platform. Your Order aggregate encapsulates complex business rules about pricing, discounts, inventory reservations, and fulfillment workflows. The domain expert speaks in terms of placing orders, canceling orders, and finding orders by customer. They never mention SQL queries, database connections, or Entity Framework configurations. Yet somehow, orders must persist beyond the lifetime of a single request.

The Repository pattern bridges this gap. It provides a mechanism for your domain to interact with persistent storage using the language and concepts of the domain itself, completely hiding the infrastructure machinery beneath.

What You Will Learn

By the end of this page, you will understand what a Repository is in the context of DDD, why it exists, how it differs from other data access patterns, and when to apply it. You'll see how repositories enable your domain to remain pure and infrastructure-agnostic while still achieving reliable persistence.

Definition and Core Concept

A Repository is a DDD tactical pattern that mediates between the domain layer and the data mapping layer, acting like an in-memory collection of domain objects.

This definition follows Martin Fowler's wording in Patterns of Enterprise Application Architecture and matches the pattern Eric Evans describes in Domain-Driven Design. It contains several important implications:

It mediates — The repository stands between your domain code and the persistence infrastructure, translating between domain concepts and storage mechanics.
It acts like a collection — From the domain's perspective, a repository behaves as if it were a simple in-memory collection (like a list or set). You add objects, remove objects, and query for objects using domain-meaningful criteria.
It hides mapping — All the complexity of converting domain objects to/from database representations is encapsulated within the repository implementation, invisible to the domain.

The Collection Illusion

The genius of the Repository pattern is that it creates an illusion of a collection. Your domain code can pretend that all domain objects exist in memory, ready to be accessed. The repository maintains this illusion while actually fetching from databases, caches, external services, or any other storage mechanism. This illusion liberates domain logic from persistence concerns.

Evans' original formulation:

"A Repository represents all objects of a certain type as a conceptual set (usually emulated). It acts like a collection, except with more elaborate querying capability. Objects of the appropriate type are added and removed, and the machinery behind the Repository inserts them or deletes them from the database."

This means a repository for Order aggregates conceptually contains all orders that exist. When domain code asks for an order by ID, the repository returns it as if it were always in memory. When domain code adds a new order to the repository, it becomes part of this conceptual collection and will be persisted. The domain never explicitly saves or loads—it simply works with the collection.

Repository Conceptual Interface
// The repository interface expresses domain concepts, not database operations
interface OrderRepository {
    // Find an order by its identity - returns the aggregate root
    findById(orderId: OrderId): Promise<Order | null>;
    
    // Find orders matching domain criteria
    findByCustomer(customerId: CustomerId): Promise<Order[]>;
    findPendingOrders(): Promise<Order[]>;
    
    // Add an order to the collection (will be persisted)
    add(order: Order): Promise<void>;
    
    // Remove an order from the collection (will be deleted)
    remove(order: Order): Promise<void>;
    
    // Note: No "save" or "update" method!
    // Changes to retrieved orders are tracked and persisted
    // automatically by the Unit of Work pattern
}

Notice several critical characteristics in this interface:

No database terminology — There's no mention of SQL, queries, connections, or transactions. The interface speaks the language of the domain: orders, customers, pending status.
Identity-based retrieval — The primary retrieval method takes the aggregate's domain identity (OrderId, a value object), not a raw database key such as an auto-increment integer or a bare GUID string.
Domain-meaningful queries — Query methods express domain concepts like "orders for a customer" or "pending orders," not raw query parameters.
Collection semantics — We add and remove objects, mirroring how we'd work with an in-memory collection.
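No explicit save — The most striking absence is a save() or update() method. In this collection-oriented style, the repository cooperates with a Unit of Work that tracks changes to retrieved aggregates and persists them when the unit of work commits. Where no change tracking is available, a persistence-oriented repository with an explicit save() is a common alternative.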
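To make the "no explicit save" idea concrete, here is a minimal sketch of snapshot-based change tracking. It is an illustration under stated assumptions, not a production Unit of Work: the names UnitOfWork, TrackingOrderRepository, and OrderTable are hypothetical, the Order class is reduced to an id and a status, plain strings stand in for OrderId, and a Map of JSON strings stands in for the database table.

```typescript
type OrderStatus = "PENDING" | "CANCELLED";

// Hypothetical, heavily simplified aggregate used only for this sketch.
class Order {
    readonly id: string;
    status: OrderStatus;

    constructor(id: string, status: OrderStatus = "PENDING") {
        this.id = id;
        this.status = status;
    }

    cancel(): void {
        this.status = "CANCELLED";
    }
}

// Stand-in for a database table: order id -> serialized row.
type OrderTable = Map<string, string>;

// Tracks every aggregate handed out during one business operation.
class UnitOfWork {
    private readonly table: OrderTable;
    private readonly tracked = new Map<string, { order: Order; snapshot: string | null }>();

    constructor(table: OrderTable) {
        this.table = table;
    }

    // Returns the already-tracked instance, so each order is loaded only once.
    find(orderId: string): Order | undefined {
        const entry = this.tracked.get(orderId);
        return entry ? entry.order : undefined;
    }

    // New aggregates have no snapshot, so they always count as changed.
    registerNew(order: Order): void {
        this.tracked.set(order.id, { order, snapshot: null });
    }

    // Loaded aggregates remember their state at load time.
    registerLoaded(order: Order): void {
        this.tracked.set(order.id, { order, snapshot: JSON.stringify(order) });
    }

    // Writes every aggregate whose state differs from its snapshot
    // and returns the number of rows written.
    async commit(): Promise<number> {
        let writes = 0;
        for (const entry of Array.from(this.tracked.values())) {
            const current = JSON.stringify(entry.order);
            if (current !== entry.snapshot) {
                this.table.set(entry.order.id, current);
                entry.snapshot = current;
                writes++;
            }
        }
        return writes;
    }
}

class TrackingOrderRepository {
    private readonly table: OrderTable;
    private readonly unitOfWork: UnitOfWork;

    constructor(table: OrderTable, unitOfWork: UnitOfWork) {
        this.table = table;
        this.unitOfWork = unitOfWork;
    }

    async findById(orderId: string): Promise<Order | null> {
        const tracked = this.unitOfWork.find(orderId);
        if (tracked) return tracked;

        const row = this.table.get(orderId);
        if (row === undefined) return null;

        const data = JSON.parse(row) as { id: string; status: OrderStatus };
        const order = new Order(data.id, data.status);
        this.unitOfWork.registerLoaded(order);
        return order;
    }

    // Nothing is written here; persistence happens when the unit of work commits.
    async add(order: Order): Promise<void> {
        this.unitOfWork.registerNew(order);
    }
}
```

With this shape, an application-layer handler would call findById, invoke order.cancel(), and then call commit() once at the end of the operation. The domain code itself never asks for anything to be saved. Real ORMs usually track changes with more efficient techniques than comparing JSON snapshots.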

Why Repositories Exist

To appreciate the Repository pattern fully, we must understand the problems it solves. Without repositories, domain code becomes polluted with infrastructure concerns, leading to several pathologies:

Problems Without Repositories

•Domain Logic Contamination — Business rules become interleaved with SQL queries, ORM configurations, and connection management. Understanding the domain requires parsing through infrastructure noise.
•Testing Nightmare — Domain logic tests require database setup, connection management, and cleanup. Tests become slow, brittle, and dependent on external resources.
•Technology Lock-in — Domain code references specific database technologies, ORMs, or storage mechanisms. Changing persistence technology requires rewriting domain logic.
•Code Duplication — The same database queries appear scattered throughout the codebase, each with slight variations. Changes require hunting down all occurrences.
•Violation of Single Responsibility — Domain objects become responsible for both business logic and their own persistence, conflating two distinct concerns.
•Impedance Mismatch Bleeding — The differences between object models and relational models (the 'impedance mismatch') leak into domain code, forcing compromises in domain design.

Consider this anti-pattern:

Anti-Pattern: Domain Polluted with Persistence
TypeScript
// ❌ ANTI-PATTERN: Domain service with embedded persistence concerns
class OrderService {
    constructor(private connection: DatabaseConnection) {}
    
    async placeOrder(customerId: string, items: OrderItem[]): Promise<Order> {
        // Business logic mixed with infrastructure
        const customer = await this.connection.query(
            'SELECT * FROM customers WHERE id = ?',
            [customerId]
        );
        
        if (!customer) {
            throw new Error('Customer not found');
        }
        
        // Domain logic
        const order = new Order(generateOrderId(), customerId);
        for (const item of items) {
            order.addItem(item);
        }
        order.calculateTotals();
        
        // More infrastructure concerns
        await this.connection.beginTransaction();
        try {
            await this.connection.query(
                'INSERT INTO orders (id, customer_id, total, status) VALUES (?, ?, ?, ?)',
                [order.id, order.customerId, order.total, order.status]
            );
            
            for (const item of order.items) {
                await this.connection.query(
                    'INSERT INTO order_items (order_id, product_id, quantity, price) VALUES (?, ?, ?, ?)',
                    [order.id, item.productId, item.quantity, item.price]
                );
            }
            
            await this.connection.commit();
        } catch (e) {
            await this.connection.rollback();
            throw e;
        }
        
        return order;
    }
}
 
// Problems with this approach:
// 1. Domain logic is buried in infrastructure code
// 2. Testing requires a real database
// 3. SQL is spread throughout the service
// 4. Transaction management pollutes business logic
// 5. Changing databases requires rewriting domain services

Now contrast with the Repository approach:

Clean Approach: Domain with Repository Abstraction
TypeScript
// ✅ CLEAN: Domain service focused purely on domain logic
class OrderService {
    constructor(
        private orderRepository: OrderRepository,
        private customerRepository: CustomerRepository
    ) {}
    
    async placeOrder(customerId: CustomerId, items: OrderItem[]): Promise<Order> {
        // Domain logic is clear and focused
        const customer = await this.customerRepository.findById(customerId);
        
        if (!customer) {
            throw new CustomerNotFoundException(customerId);
        }
        
        // Pure domain operations
        const order = Order.place(customer, items);
        
        // Repository handles all persistence
        await this.orderRepository.add(order);
        
        return order;
    }
}
 
// Benefits of this approach:
// 1. Domain logic is clear and readable
// 2. Testing with mock repositories is trivial
// 3. No SQL or infrastructure details visible
// 4. Transaction management is handled elsewhere
// 5. Persistence technology can change without affecting this code

The Transformation

Notice how the repository-based version reads like a description of the domain process: find the customer, place the order, add it to the repository. There's no distraction from SQL, transactions, or database mechanics. This is what DDD strives for—code that expresses the domain directly.

Repositories vs Other Data Access Patterns

The Repository pattern is often conflated with other data access patterns. Understanding the distinctions is crucial for applying each pattern appropriately.

Repository vs Other Data Access Patterns
Pattern	Primary Purpose	Abstraction Level	DDD Role
Repository	Collection-like interface for aggregates	Domain concepts	Aggregate persistence
DAO (Data Access Object)	Abstract database operations	Database tables/records	Data layer encapsulation
Active Record	Domain objects manage own persistence	Object ↔ Table mapping	Not recommended in DDD
Table Data Gateway	Gateway to a database table	Single table operations	Low-level data access
Query Object	Encapsulate complex queries	Query construction	Can complement repositories

Repository vs DAO: The Critical Distinction

The most common confusion is between Repository and DAO (Data Access Object). While both abstract data access, they operate at fundamentally different levels:

DAO operates at the data layer:

Abstracts database technology (SQL, NoSQL, etc.)
Works with database concepts (tables, rows, queries)
One DAO typically maps to one table
Methods reflect CRUD operations: insert(), update(), delete(), findAll()

Repository operates at the domain layer:

Abstracts the concept of a collection of domain objects
Works with domain concepts (aggregates, entities, value objects)
One repository per aggregate root
Methods reflect domain operations: add(), remove(), findByXxx()

DAO Characteristics

•Table-centric design
•CRUD-focused methods
•Database terminology (insert, update)
•Returns data transfer objects or primitives
•Often one DAO per table
•Lives in data/infrastructure layer

Repository Characteristics

•Aggregate-centric design
•Collection-focused methods
•Domain terminology (add, remove)
•Returns fully reconstituted aggregates
•One repository per aggregate root
•Interface in domain, implementation in infrastructure

The Implementation Reality

A repository implementation often uses DAOs or ORM features internally. The repository is a higher-level abstraction that may delegate to lower-level data access mechanisms. The key point is that the domain code only sees the repository interface—never the DAOs or ORM details beneath.
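As a rough sketch of that layering, the repository below composes two table-level DAOs and maps their rows into one aggregate. Everything here is an assumption made for illustration: the row shapes, the OrderDao and OrderItemDao contracts, the class name DaoBackedOrderRepository, the use of plain strings for identities, and the in-memory DAO stand-ins that let the example run without a database.

```typescript
// Row shapes as table-centric DAOs might return them (hypothetical schema).
interface OrderRow { id: string; customer_id: string; }
interface OrderItemRow { order_id: string; product_id: string; quantity: number; }

// DAO contracts: one per table, phrased in database terms.
interface OrderDao {
    findById(id: string): Promise<OrderRow | undefined>;
    insert(row: OrderRow): Promise<void>;
}
interface OrderItemDao {
    findByOrderId(orderId: string): Promise<OrderItemRow[]>;
    insert(row: OrderItemRow): Promise<void>;
}

// Simplified aggregate: the root plus its items.
class OrderItem {
    readonly productId: string;
    readonly quantity: number;
    constructor(productId: string, quantity: number) {
        this.productId = productId;
        this.quantity = quantity;
    }
}
class Order {
    readonly id: string;
    readonly customerId: string;
    readonly items: OrderItem[];
    constructor(id: string, customerId: string, items: OrderItem[]) {
        this.id = id;
        this.customerId = customerId;
        this.items = items;
    }
}

// The repository composes the DAOs and maps rows to and from the aggregate.
class DaoBackedOrderRepository {
    private readonly orderDao: OrderDao;
    private readonly itemDao: OrderItemDao;

    constructor(orderDao: OrderDao, itemDao: OrderItemDao) {
        this.orderDao = orderDao;
        this.itemDao = itemDao;
    }

    // Reconstitutes the whole aggregate from two tables.
    async findById(orderId: string): Promise<Order | null> {
        const row = await this.orderDao.findById(orderId);
        if (!row) return null;
        const itemRows = await this.itemDao.findByOrderId(orderId);
        const items = itemRows.map((r) => new OrderItem(r.product_id, r.quantity));
        return new Order(row.id, row.customer_id, items);
    }

    // A real implementation would run these inserts inside one transaction.
    async add(order: Order): Promise<void> {
        await this.orderDao.insert({ id: order.id, customer_id: order.customerId });
        for (const item of order.items) {
            await this.itemDao.insert({
                order_id: order.id,
                product_id: item.productId,
                quantity: item.quantity,
            });
        }
    }
}

// In-memory DAO stand-ins so the sketch runs without a database.
class InMemoryOrderDao implements OrderDao {
    readonly rows: OrderRow[] = [];
    async findById(id: string): Promise<OrderRow | undefined> {
        return this.rows.find((r) => r.id === id);
    }
    async insert(row: OrderRow): Promise<void> {
        this.rows.push(row);
    }
}
class InMemoryOrderItemDao implements OrderItemDao {
    readonly rows: OrderItemRow[] = [];
    async findByOrderId(orderId: string): Promise<OrderItemRow[]> {
        return this.rows.filter((r) => r.order_id === orderId);
    }
    async insert(row: OrderItemRow): Promise<void> {
        this.rows.push(row);
    }
}
```

Callers only ever see Order and OrderItem. The snake_case column names and the two-table split stay inside the repository implementation.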

Repository vs Active Record

The Active Record pattern, popularized by Ruby on Rails, makes domain objects responsible for their own persistence. Each object knows how to save, update, delete, and query itself.

While convenient for simple domains, Active Record violates the Single Responsibility Principle and creates tight coupling between domain objects and database schema. In DDD, we explicitly reject this pattern because:

Domain objects should be persistence-ignorant — They model business concepts, not database operations.
Testing becomes database-dependent — You can't test domain logic without a database.
Schema changes ripple through domain code — The impedance mismatch isn't managed.
It encourages anemic domains — Objects become data containers with persistence methods rather than rich behavioral models.

One Repository Per Aggregate

A foundational rule of DDD repositories is: Create one repository per aggregate root, never for entities or value objects within an aggregate.

This rule flows directly from the aggregate concept. An aggregate is a cluster of objects treated as a unit for data changes, with the aggregate root being the only entry point. Since the aggregate root controls all access to internal entities, it follows that persistence operations should also go through the aggregate root.

Why this matters:

Benefits of Aggregate-Level Repositories

•Maintains aggregate invariants — Loading the whole aggregate gives the root all the state it needs to enforce its invariants on every change.
•Preserves encapsulation — Internal entities can't be accessed or modified without going through the aggregate root.
•Simplifies concurrency — The aggregate becomes the unit of locking/versioning for optimistic concurrency control.
•Reduces repository proliferation — You don't create repositories for every entity, keeping the number of repositories manageable.
•Aligns with bounded context — Repositories reflect the aggregate structure, which in turn reflects the bounded context's model.
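The concurrency point in the list above can be sketched with a single version number that covers the whole aggregate. This is a simplified illustration under assumptions: OrderRecord and VersionedOrderStore are hypothetical names, the store is an in-memory Map rather than a database, and it represents the storage a repository implementation would sit on top of, not the repository interface itself.

```typescript
// Hypothetical stored form of a whole Order aggregate: the root and all of
// its items share ONE version number.
interface OrderRecord {
    version: number;
    items: { productId: string; quantity: number }[];
}

class VersionedOrderStore {
    private readonly records = new Map<string, OrderRecord>();

    // Returns a deep copy so callers cannot mutate stored state directly.
    load(orderId: string): OrderRecord | undefined {
        const record = this.records.get(orderId);
        return record ? (JSON.parse(JSON.stringify(record)) as OrderRecord) : undefined;
    }

    // Succeeds only if nobody else has written since `loaded.version` was read,
    // comparable to `UPDATE ... WHERE id = ? AND version = ?` in SQL.
    // A record that does not exist yet is treated as version 0.
    write(orderId: string, loaded: OrderRecord): void {
        const current = this.records.get(orderId);
        const currentVersion = current ? current.version : 0;
        if (currentVersion !== loaded.version) {
            throw new Error(`Order ${orderId} was modified concurrently`);
        }
        this.records.set(orderId, {
            version: loaded.version + 1,
            items: loaded.items.map((i) => ({ productId: i.productId, quantity: i.quantity })),
        });
    }
}
```

If two users load the same order at version 1 and each changes a different item, the first write succeeds and bumps the version to 2, while the second is rejected even though it touched another item. That is the aggregate acting as the unit of consistency: one version check protects every invariant that spans the root and its items.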

Aggregate-Level Repository Example
TypeScript
// Order Aggregate Structure
// - Order (Aggregate Root)
//   - OrderItem (Entity within aggregate)
//   - ShippingAddress (Value Object)
//   - PaymentInfo (Value Object)
 
// ✅ CORRECT: One repository for the aggregate root
interface OrderRepository {
    findById(orderId: OrderId): Promise<Order | null>;
    findByCustomer(customerId: CustomerId): Promise<Order[]>;
    add(order: Order): Promise<void>;
    remove(order: Order): Promise<void>;
}
 
// ❌ WRONG: Repositories for internal entities/value objects
interface OrderItemRepository {  // Don't do this!
    findById(itemId: OrderItemId): Promise<OrderItem | null>;
    add(item: OrderItem): Promise<void>;
}
 
interface ShippingAddressRepository {  // Don't do this!
    findByOrderId(orderId: OrderId): Promise<ShippingAddress | null>;
}
 
// Correct usage - access internal entities through the aggregate
class OrderService {
    constructor(private orderRepository: OrderRepository) {}
    
    async updateItemQuantity(
        orderId: OrderId,
        itemId: OrderItemId,
        newQuantity: number
    ): Promise<void> {
        // Retrieve the full aggregate
        const order = await this.orderRepository.findById(orderId);
        if (!order) throw new OrderNotFoundException(orderId);
        
        // Modify through aggregate root (enforces invariants)
        order.updateItemQuantity(itemId, newQuantity);
        
        // Repository will persist the entire aggregate
        // Changes are tracked automatically via Unit of Work
    }
}

What About Query Performance?

You might worry that always retrieving full aggregates is inefficient. For writes, it's necessary to maintain invariants. For reads, DDD practitioners commonly pair repositories with CQRS (Command Query Responsibility Segregation), where read-optimized projections bypass aggregates entirely. We'll cover this in later modules.
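As a preview, a query-side component in that style might look like the sketch below. The names OrderSummary, OrderSummaryQuery, and InMemoryOrderSummaryQuery are hypothetical, and the in-memory array stands in for a denormalized read table or database view.

```typescript
// Flat, read-only shape tailored to one screen (hypothetical).
interface OrderSummary {
    readonly orderId: string;
    readonly customerId: string;
    readonly itemCount: number;
    readonly total: number;
}

// Query-side contract: returns plain data, never aggregates,
// and offers no add() or remove().
interface OrderSummaryQuery {
    listForCustomer(customerId: string): Promise<OrderSummary[]>;
}

// In-memory stand-in for a denormalized read table or view.
class InMemoryOrderSummaryQuery implements OrderSummaryQuery {
    private readonly rows: OrderSummary[];

    constructor(rows: OrderSummary[]) {
        this.rows = rows;
    }

    async listForCustomer(customerId: string): Promise<OrderSummary[]> {
        return this.rows.filter((row) => row.customerId === customerId);
    }
}
```

Because nothing returned here can be modified and written back, there are no invariants to protect, so there is no need to load full Order aggregates just to render a list.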

Repository Responsibilities

A well-designed repository has clearly defined responsibilities—and equally important, clear boundaries on what it should not do.

A Repository DOES

•Provide collection semantics — Add, remove, and find operations that mimic an in-memory collection.
•Reconstitute aggregates — Rebuild complete aggregate graphs from stored data, including all entities and value objects.
•Encapsulate query logic — Hide the SQL, ORM queries, or API calls that retrieve data.
•Translate between domain and persistence models — Convert domain objects to/from database representations.
•Implement domain query methods — Expose finder methods that match domain use cases (findByCustomer, findPendingOrders).
•Manage aggregate identity — Track whether an aggregate is new or existing for insert vs update decisions.

A Repository Does NOT

•Contain business logic — Business rules belong in domain objects, not repositories.
•Expose raw queries — Methods like executeQuery(sql), or a findByCriteria(queryObject) whose criteria are expressed in storage terms (column names, SQL fragments), break encapsulation.
•Return partial aggregates — A repository returns the complete aggregate or nothing. A partially loaded aggregate cannot reliably enforce its invariants.
•Cross aggregate boundaries — A repository retrieves one type of aggregate. Joining multiple aggregates is the application layer's job.
•Manage transactions explicitly — Transaction boundaries are typically controlled by the application layer or Unit of Work.
•Implement complex reporting queries — Read-heavy queries should use CQRS read models, not repositories.

The Purity Test

Ask yourself: Can I replace the repository implementation with a simple in-memory HashMap and still satisfy the interface contract? If your repository interface requires knowledge of SQL, database connections, or specific ORM features to use correctly, it's leaking infrastructure concerns into the domain.
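The purity test can be tried directly. The sketch below backs the OrderRepository interface from earlier with a plain Map. It makes simplifying assumptions: the Order class is cut down to three fields, plain strings stand in for OrderId and CustomerId, and InMemoryOrderRepository is a hypothetical name.

```typescript
type OrderStatus = "PENDING" | "SHIPPED";

// Simplified stand-in for the Order aggregate.
class Order {
    readonly id: string;
    readonly customerId: string;
    status: OrderStatus;

    constructor(id: string, customerId: string, status: OrderStatus = "PENDING") {
        this.id = id;
        this.customerId = customerId;
        this.status = status;
    }
}

// Same shape as the conceptual interface, with string identities.
interface OrderRepository {
    findById(orderId: string): Promise<Order | null>;
    findByCustomer(customerId: string): Promise<Order[]>;
    findPendingOrders(): Promise<Order[]>;
    add(order: Order): Promise<void>;
    remove(order: Order): Promise<void>;
}

// Every method is satisfied by ordinary Map operations, with no SQL,
// connections, or ORM features involved.
class InMemoryOrderRepository implements OrderRepository {
    private readonly orders = new Map<string, Order>();

    async findById(orderId: string): Promise<Order | null> {
        return this.orders.get(orderId) ?? null;
    }

    async findByCustomer(customerId: string): Promise<Order[]> {
        return Array.from(this.orders.values()).filter((o) => o.customerId === customerId);
    }

    async findPendingOrders(): Promise<Order[]> {
        return Array.from(this.orders.values()).filter((o) => o.status === "PENDING");
    }

    async add(order: Order): Promise<void> {
        this.orders.set(order.id, order);
    }

    async remove(order: Order): Promise<void> {
        this.orders.delete(order.id);
    }
}
```

If a method on your interface cannot be written this way, for example because it takes a SQL fragment or a connection object, that method is where infrastructure is leaking into the contract.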

Repository in the Architecture

Understanding where repositories fit in the overall architecture is essential for correct implementation. The Repository pattern follows the Dependency Inversion Principle: high-level modules (domain) should not depend on low-level modules (infrastructure); both should depend on abstractions.

The architectural split:

Repository Interface — Defined in the domain layer. This is the contract that domain services depend upon.
Repository Implementation — Lives in the infrastructure layer. This is where ORM code, SQL queries, and database connections reside.

This separation is not merely organizational—it's fundamental to the pattern's value.

Architectural Layering

Folder Structure

src/
├── domain/                      # Core domain - no external dependencies
│   ├── model/
│   │   ├── Order.ts             # Aggregate root
│   │   ├── OrderItem.ts         # Entity within aggregate
│   │   ├── OrderId.ts           # Value object (identity)
│   │   └── ShippingAddress.ts   # Value object
│   │
│   ├── repository/              # Repository INTERFACES
│   │   ├── OrderRepository.ts   # Interface only - no implementation
│   │   └── CustomerRepository.ts
│   │
│   └── service/
│       └── OrderService.ts      # Domain service using repository interface
│
├── infrastructure/              # Infrastructure implementations
│   ├── persistence/
│   │   ├── OrderRepositoryImpl.ts    # SQL/ORM implementation
│   │   ├── CustomerRepositoryImpl.ts
│   │   └── mappers/
│   │       ├── OrderMapper.ts        # Domain ↔ DB mapping
│   │       └── CustomerMapper.ts
│   │
│   └── database/
│       └── connection.ts        # Database connection management
│
└── application/                 # Application layer - orchestration
    ├── commands/
    │   └── PlaceOrderHandler.ts # Coordinates domain and infrastructure
    └── config/
        └── di-container.ts      # Wires implementations to interfaces

The dependency flow:

Application Layer
       ↓
   depends on
       ↓
Domain Layer (including repository interfaces)
       ↓
   implemented by
       ↓
Infrastructure Layer (repository implementations)

Notice that dependencies point inward toward the domain. The domain layer has no knowledge of the infrastructure layer. It defines repository interfaces, and the infrastructure layer provides implementations. The application layer (or a DI container) wires everything together at runtime.

Testing Benefits

Because the domain layer only depends on repository interfaces, you can test domain logic with mock or in-memory repository implementations. No database required. This enables fast, reliable unit tests that focus on business rules without infrastructure noise.
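A small, framework-free test shows what this looks like in practice. The sketch below is an assumption-laden reduction of the earlier OrderService example: Customer and Order are minimal stand-ins, product ids replace OrderItem, plain strings replace CustomerId, and the fake repository classes and test function names are hypothetical.

```typescript
class Customer {
    readonly id: string;
    constructor(id: string) {
        this.id = id;
    }
}

class Order {
    readonly customerId: string;
    readonly productIds: string[];

    private constructor(customerId: string, productIds: string[]) {
        this.customerId = customerId;
        this.productIds = productIds;
    }

    // Factory that enforces a (hypothetical) business rule.
    static place(customer: Customer, productIds: string[]): Order {
        if (productIds.length === 0) {
            throw new Error("An order needs at least one item");
        }
        return new Order(customer.id, productIds.slice());
    }
}

interface CustomerRepository {
    findById(customerId: string): Promise<Customer | null>;
}
interface OrderRepository {
    add(order: Order): Promise<void>;
}

class OrderService {
    private readonly orders: OrderRepository;
    private readonly customers: CustomerRepository;

    constructor(orders: OrderRepository, customers: CustomerRepository) {
        this.orders = orders;
        this.customers = customers;
    }

    async placeOrder(customerId: string, productIds: string[]): Promise<Order> {
        const customer = await this.customers.findById(customerId);
        if (!customer) throw new Error(`Customer ${customerId} not found`);
        const order = Order.place(customer, productIds);
        await this.orders.add(order);
        return order;
    }
}

// Fakes: a few lines each, no database involved.
class FakeCustomerRepository implements CustomerRepository {
    private readonly customers: Customer[];
    constructor(customers: Customer[]) {
        this.customers = customers;
    }
    async findById(customerId: string): Promise<Customer | null> {
        return this.customers.find((c) => c.id === customerId) ?? null;
    }
}
class FakeOrderRepository implements OrderRepository {
    readonly added: Order[] = [];
    async add(order: Order): Promise<void> {
        this.added.push(order);
    }
}

// Each test resolves on success and rejects on failure.
async function testPlaceOrderAddsOrderToRepository(): Promise<void> {
    const orders = new FakeOrderRepository();
    const service = new OrderService(orders, new FakeCustomerRepository([new Customer("c-1")]));
    const order = await service.placeOrder("c-1", ["p-1", "p-2"]);
    if (orders.added.length !== 1 || orders.added[0] !== order) {
        throw new Error("order was not added to the repository");
    }
}

async function testPlaceOrderRejectsUnknownCustomer(): Promise<void> {
    const orders = new FakeOrderRepository();
    const service = new OrderService(orders, new FakeCustomerRepository([]));
    let rejected = false;
    try {
        await service.placeOrder("missing", ["p-1"]);
    } catch (e) {
        rejected = true;
    }
    if (!rejected || orders.added.length !== 0) {
        throw new Error("unknown customer should be rejected without adding an order");
    }
}
```

In a real project these functions would typically live in a test runner such as Jest or Vitest. The point is only that the service is constructed with fakes through the same constructor the production wiring uses.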

Summary: What is a Repository

We've established the foundational understanding of what Repository means in DDD. Let's consolidate the key insights:

Key Takeaways

•A Repository is a collection abstraction — It provides an illusion that all aggregates exist in memory, ready for access.
•Repositories exist at the domain level — The interface is defined in the domain; the implementation lives in infrastructure.
•One repository per aggregate root — Never create repositories for internal entities or value objects.
•Repositories hide persistence mechanics — No SQL, ORM, or database concepts leak into the domain.
•Repository ≠ DAO — DAOs abstract database tables; repositories abstract aggregate collections.
•Repositories enable testability — Domain logic can be tested with in-memory fakes, no database required.

What's next:

Now that we understand what a repository is conceptually, we'll dive deeper into how a repository creates the illusion of an in-memory collection. The next page explores Repository as Collection Abstraction, examining the collection metaphor in detail and how it shapes repository design.

Page Complete

You now understand the Repository pattern's purpose and role in DDD. It bridges the gap between domain purity and persistence necessity by providing a collection-like interface that hides all infrastructure complexity. Next, we'll explore how to think about repositories as true collections.

1 / 4

Loading learning content...

System Design (LLD)Domain-Driven Design

Repositories

LevelIntermediate

Duration60 mins

TopicDomain-Driven Design

1 / 4

What is a Repository

The Persistence Conundrum

What You Will Learn

Definition and Core Concept

A Repository is a DDD tactical pattern that mediates between the domain layer and the data mapping layer, acting like an in-memory collection of domain objects.

This definition, derived from Eric Evans' original Domain-Driven Design work, contains several important implications:

It mediates — The repository stands between your domain code and the persistence infrastructure, translating between domain concepts and storage mechanics.
It acts like a collection — From the domain's perspective, a repository behaves as if it were a simple in-memory collection (like a list or set). You add objects, remove objects, and query for objects using domain-meaningful criteria.
It hides mapping — All the complexity of converting domain objects to/from database representations is encapsulated within the repository implementation, invisible to the domain.

The Collection Illusion

Evans' original formulation:

"A Repository represents all objects of a certain type as a conceptual set (usually emulated). It acts like a collection, except with more elaborate querying capability. Objects of the appropriate type are added and removed, and the machinery behind the Repository inserts them or deletes them from the database."

Repository Conceptual Interface
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// The repository interface expresses domain concepts, not database operations
interface OrderRepository {
    // Find an order by its identity - returns the aggregate root
    findById(orderId: OrderId): Promise<Order | null>;
    
    // Find orders matching domain criteria
    findByCustomer(customerId: CustomerId): Promise<Order[]>;
    findPendingOrders(): Promise<Order[]>;
    
    // Add an order to the collection (will be persisted)
    add(order: Order): Promise<void>;
    
    // Remove an order from the collection (will be deleted)
    remove(order: Order): Promise<void>;
    
    // Note: No "save" or "update" method!
    // Changes to retrieved orders are tracked and persisted
    // automatically by the Unit of Work pattern
}

Notice several critical characteristics in this interface:

No database terminology — There's no mention of SQL, queries, connections, or transactions. The interface speaks the language of the domain: orders, customers, pending status.
Identity-based retrieval — The primary retrieval method uses the aggregate's identity (OrderId), not database primary keys or GUIDs.
Domain-meaningful queries — Query methods express domain concepts like "orders for a customer" or "pending orders," not raw query parameters.
Collection semantics — We add and remove objects, mirroring how we'd work with an in-memory collection.
No explicit save — The most striking absence is a save() or update() method. In a proper DDD implementation, the repository (with a Unit of Work) tracks changes to retrieved aggregates and persists them automatically.

Why Repositories Exist

To appreciate the Repository pattern fully, we must understand the problems it solves. Without repositories, domain code becomes polluted with infrastructure concerns, leading to several pathologies:

Problems Without Repositories

•Domain Logic Contamination — Business rules become interleaved with SQL queries, ORM configurations, and connection management. Understanding the domain requires parsing through infrastructure noise.
•Testing Nightmare — Domain logic tests require database setup, connection management, and cleanup. Tests become slow, brittle, and dependent on external resources.
•Technology Lock-in — Domain code references specific database technologies, ORMs, or storage mechanisms. Changing persistence technology requires rewriting domain logic.
•Code Duplication — The same database queries appear scattered throughout the codebase, each with slight variations. Changes require hunting down all occurrences.
•Violation of Single Responsibility — Domain objects become responsible for both business logic and their own persistence, conflating two distinct concerns.
•Impedance Mismatch Bleeding — The differences between object models and relational models (the 'impedance mismatch') leak into domain code, forcing compromises in domain design.

Consider this anti-pattern:

Anti-Pattern: Domain Polluted with Persistence
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
// ❌ ANTI-PATTERN: Domain service with embedded persistence concerns
class OrderService {
    private connection: DatabaseConnection;
    
    async placeOrder(customerId: string, items: OrderItem[]): Promise<Order> {
        // Business logic mixed with infrastructure
        const customer = await this.connection.query(
            'SELECT * FROM customers WHERE id = ?',
            [customerId]
        );
        
        if (!customer) {
            throw new Error('Customer not found');
        }
        
        // Domain logic
        const order = new Order(generateOrderId(), customerId);
        for (const item of items) {
            order.addItem(item);
        }
        order.calculateTotals();
        
        // More infrastructure concerns
        await this.connection.beginTransaction();
        try {
            await this.connection.query(
                'INSERT INTO orders (id, customer_id, total, status) VALUES (?, ?, ?, ?)',
                [order.id, order.customerId, order.total, order.status]
            );
            
            for (const item of order.items) {
                await this.connection.query(
                    'INSERT INTO order_items (order_id, product_id, quantity, price) VALUES (?, ?, ?, ?)',
                    [order.id, item.productId, item.quantity, item.price]
                );
            }
            
            await this.connection.commit();
        } catch (e) {
            await this.connection.rollback();
            throw e;
        }
        
        return order;
    }
}
 
// Problems with this approach:
// 1. Domain logic is buried in infrastructure code
// 2. Testing requires a real database
// 3. SQL is spread throughout the service
// 4. Transaction management pollutes business logic
// 5. Changing databases requires rewriting domain services

Now contrast with the Repository approach:

Clean Approach: Domain with Repository Abstraction
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// ✅ CLEAN: Domain service focused purely on domain logic
class OrderService {
    constructor(
        private orderRepository: OrderRepository,
        private customerRepository: CustomerRepository
    ) {}
    
    async placeOrder(customerId: CustomerId, items: OrderItem[]): Promise<Order> {
        // Domain logic is clear and focused
        const customer = await this.customerRepository.findById(customerId);
        
        if (!customer) {
            throw new CustomerNotFoundException(customerId);
        }
        
        // Pure domain operations
        const order = Order.place(customer, items);
        
        // Repository handles all persistence
        await this.orderRepository.add(order);
        
        return order;
    }
}
 
// Benefits of this approach:
// 1. Domain logic is clear and readable
// 2. Testing with mock repositories is trivial
// 3. No SQL or infrastructure details visible
// 4. Transaction management is handled elsewhere
// 5. Persistence technology can change without affecting this code

The Transformation

Repositories vs Other Data Access Patterns

The Repository pattern is often confused with or conflated with other data access patterns. Understanding the distinctions is crucial for applying each pattern appropriately.

Repository vs Other Data Access Patterns
Pattern	Primary Purpose	Abstraction Level	DDD Role
Repository	Collection-like interface for aggregates	Domain concepts	Aggregate persistence
DAO (Data Access Object)	Abstract database operations	Database tables/records	Data layer encapsulation
Active Record	Domain objects manage own persistence	Object ↔ Table mapping	Not recommended in DDD
Table Gateway	Gateway to a database table	Single table operations	Low-level data access
Query Object	Encapsulate complex queries	Query construction	Can complement repositories

Repository vs DAO: The Critical Distinction

The most common confusion is between Repository and DAO (Data Access Object). While both abstract data access, they operate at fundamentally different levels:

DAO operates at the data layer:

Abstracts database technology (SQL, NoSQL, etc.)
Works with database concepts (tables, rows, queries)
One DAO typically maps to one table
Methods reflect CRUD operations: insert(), update(), delete(), findAll()

Repository operates at the domain layer:

Abstracts the concept of a collection of domain objects
Works with domain concepts (aggregates, entities, value objects)
One repository per aggregate root
Methods reflect domain operations: add(), remove(), findByXxx()

DAO Characteristics

•Table-centric design
•CRUD-focused methods
•Database terminology (insert, update)
•Returns data transfer objects or primitives
•Often one DAO per table
•Lives in data/infrastructure layer

Repository Characteristics

•Aggregate-centric design
•Collection-focused methods
•Domain terminology (add, remove)
•Returns fully reconstituted aggregates
•One repository per aggregate root
•Interface in domain, implementation in infrastructure
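
The contrast between the two lists can be sketched as a pair of TypeScript interfaces. This is only an illustration: `OrderRow`, `OrderDao`, and the simplified `Order` class with string identifiers are assumptions made for this sketch, not definitions from a particular library or from the earlier examples.

```typescript
// Hypothetical types, for illustration only.

// DAO side: table-centric, speaks in rows and CRUD verbs.
interface OrderRow {
    id: string;
    customer_id: string;
    status: string;
    total_cents: number;
}

interface OrderDao {
    insert(row: OrderRow): Promise<void>;
    update(row: OrderRow): Promise<void>;
    delete(id: string): Promise<void>;
    findAll(): Promise<OrderRow[]>;
}

// Repository side: aggregate-centric, speaks in domain terms.
class Order {
    private status: "PENDING" | "CANCELLED" = "PENDING";

    constructor(readonly id: string, readonly customerId: string) {}

    // Business rule lives on the aggregate, not in the data access code.
    cancel(): void {
        if (this.status === "CANCELLED") {
            throw new Error("Order already cancelled");
        }
        this.status = "CANCELLED";
    }

    isCancelled(): boolean {
        return this.status === "CANCELLED";
    }
}

interface OrderRepository {
    findById(orderId: string): Promise<Order | null>;
    findByCustomer(customerId: string): Promise<Order[]>;
    add(order: Order): Promise<void>;
    remove(order: Order): Promise<void>;
}
```

Note that `OrderDao` hands back rows the caller must interpret, while `OrderRepository` hands back `Order` objects that already carry their behavior. There is also no `update()` on the repository: like an in-memory collection, you retrieve an object and change it.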

The Implementation Reality

Repository vs Active Record

The Active Record pattern, popularized by Ruby on Rails, makes domain objects responsible for their own persistence. Each object knows how to save, update, delete, and query itself.

In DDD this is problematic for several reasons:

Domain objects should be persistence-ignorant — They model business concepts, not database operations.
Testing becomes database-dependent — You can't test domain logic without a database.
Schema changes ripple through domain code — The impedance mismatch isn't managed.
It encourages anemic domains — Objects become data containers with persistence methods rather than rich behavioral models.
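
A minimal sketch of the difference, assuming a hypothetical `Db` handle that stands in for any real database driver. Neither class comes from a real framework; they only show where the persistence knowledge ends up.

```typescript
// Hypothetical database handle (an assumption for this sketch).
interface Db {
    execute(sql: string, params: unknown[]): Promise<void>;
}

// Active Record style: the object carries its own persistence.
class ActiveRecordOrder {
    constructor(
        private db: Db,
        public id: string,
        public totalCents: number
    ) {}

    applyDiscount(percent: number): void {
        this.totalCents = Math.round(this.totalCents * (1 - percent / 100));
    }

    // Storage concern living on the domain object: table and column
    // names are now part of the domain class.
    async save(): Promise<void> {
        await this.db.execute(
            "UPDATE orders SET total_cents = ? WHERE id = ?",
            [this.totalCents, this.id]
        );
    }
}

// Persistence-ignorant style: business behavior only.
class Order {
    constructor(readonly id: string, private totalCents: number) {}

    applyDiscount(percent: number): void {
        if (percent < 0 || percent > 100) {
            throw new Error("Discount must be between 0 and 100");
        }
        this.totalCents = Math.round(this.totalCents * (1 - percent / 100));
    }

    total(): number {
        return this.totalCents;
    }
}
```

The second class can be constructed and exercised in a unit test with no `Db` at all, while the first needs a database handle (or a stand-in for one) just to be instantiated.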

One Repository Per Aggregate

A foundational rule of DDD repositories is: Create one repository per aggregate root, never for entities or value objects within an aggregate.

Why this matters:

Benefits of Aggregate-Level Repositories

•Maintains aggregate invariants — Retrieving the entire aggregate ensures all invariants are checked when reconstituting from storage.
•Preserves encapsulation — Internal entities can't be accessed or modified without going through the aggregate root.
•Simplifies concurrency — The aggregate becomes the unit of locking/versioning for optimistic concurrency control.
•Reduces repository proliferation — You don't create repositories for every entity, keeping the number of repositories manageable.
•Aligns with bounded context — Repositories reflect the aggregate structure, which in turn reflects the bounded context's model.
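
The concurrency point can be made concrete with a small sketch. It assumes a single version number stored per aggregate and checked on every write; `OrderSnapshot`, `VersionedOrderStore`, `ConcurrencyError`, and the explicit `save` method are hypothetical names chosen for brevity and are not part of the repository interface shown in this lesson.

```typescript
class ConcurrencyError extends Error {}

// The whole aggregate, internal items included, is stored under ONE version.
interface OrderSnapshot {
    id: string;
    version: number; // 0 means "not stored yet"
    items: { itemId: string; quantity: number }[];
}

class VersionedOrderStore {
    private readonly rows = new Map<string, OrderSnapshot>();

    load(id: string): OrderSnapshot | null {
        const row = this.rows.get(id);
        // Return a copy so callers cannot mutate stored state directly.
        return row ? { ...row, items: row.items.map(i => ({ ...i })) } : null;
    }

    // Writes only if the caller's version matches the stored version.
    save(snapshot: OrderSnapshot): void {
        const current = this.rows.get(snapshot.id);
        const storedVersion = current ? current.version : 0;
        if (storedVersion !== snapshot.version) {
            throw new ConcurrencyError(
                `Order ${snapshot.id} was modified concurrently`
            );
        }
        this.rows.set(snapshot.id, {
            ...snapshot,
            version: snapshot.version + 1,
            items: snapshot.items.map(i => ({ ...i })),
        });
    }
}
```

Because items are never saved on their own, two writers that touch different items of the same order still conflict on the order's version. That is the sense in which the aggregate is the unit of versioning.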

Aggregate-Level Repository Example
TypeScript
// Order Aggregate Structure
// - Order (Aggregate Root)
//   - OrderItem (Entity within aggregate)
//   - ShippingAddress (Value Object)
//   - PaymentInfo (Value Object)
 
// ✅ CORRECT: One repository for the aggregate root
interface OrderRepository {
    findById(orderId: OrderId): Promise<Order | null>;
    findByCustomer(customerId: CustomerId): Promise<Order[]>;
    add(order: Order): Promise<void>;
    remove(order: Order): Promise<void>;
}
 
// ❌ WRONG: Repositories for internal entities/value objects
interface OrderItemRepository {  // Don't do this!
    findById(itemId: OrderItemId): Promise<OrderItem | null>;
    add(item: OrderItem): Promise<void>;
}
 
interface ShippingAddressRepository {  // Don't do this!
    findByOrderId(orderId: OrderId): Promise<ShippingAddress | null>;
}
 
// Correct usage - access internal entities through the aggregate
class OrderService {
    constructor(private orderRepository: OrderRepository) {}
    
    async updateItemQuantity(
        orderId: OrderId,
        itemId: OrderItemId,
        newQuantity: number
    ): Promise<void> {
        // Retrieve the full aggregate
        const order = await this.orderRepository.findById(orderId);
        if (!order) throw new OrderNotFoundException(orderId);
        
        // Modify through aggregate root (enforces invariants)
        order.updateItemQuantity(itemId, newQuantity);
        
        // No explicit save call: this example assumes a Unit of Work tracks
        // the loaded aggregate and persists it as a whole on commit
    }
}

What About Query Performance?

Repository Responsibilities

A well-designed repository has clearly defined responsibilities—and equally important, clear boundaries on what it should not do.

A Repository DOES

•Provide collection semantics — Add, remove, and find operations that mimic an in-memory collection.
•Reconstitute aggregates — Rebuild complete aggregate graphs from stored data, including all entities and value objects.
•Encapsulate query logic — Hide the SQL, ORM queries, or API calls that retrieve data.
•Translate between domain and persistence models — Convert domain objects to/from database representations.
•Implement domain query methods — Expose finder methods that match domain use cases (findByCustomer, findPendingOrders).
•Manage aggregate identity — Track whether an aggregate is new or existing for insert vs update decisions.
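
The reconstitution and translation responsibilities can be sketched with a small mapper. The row shapes (`OrderRow`, `OrderItemRow`) and the simplified `Order` and `OrderItem` classes are assumptions for illustration; a real schema and aggregate would differ.

```typescript
// Hypothetical persistence shapes (what a SQL query might return).
interface OrderRow {
    id: string;
    customer_id: string;
}
interface OrderItemRow {
    order_id: string;
    sku: string;
    quantity: number;
    unit_price_cents: number;
}

class OrderItem {
    constructor(
        readonly sku: string,
        readonly quantity: number,
        readonly unitPriceCents: number
    ) {
        if (quantity <= 0) throw new Error("Quantity must be positive");
    }
    subtotalCents(): number {
        return this.quantity * this.unitPriceCents;
    }
}

class Order {
    constructor(
        readonly id: string,
        readonly customerId: string,
        private readonly items: OrderItem[]
    ) {}
    totalCents(): number {
        return this.items.reduce((sum, item) => sum + item.subtotalCents(), 0);
    }
    itemList(): readonly OrderItem[] {
        return this.items;
    }
}

const OrderMapper = {
    // Rebuilds the complete aggregate (root plus items) from its rows.
    toDomain(row: OrderRow, itemRows: OrderItemRow[]): Order {
        const items = itemRows
            .filter(r => r.order_id === row.id)
            .map(r => new OrderItem(r.sku, r.quantity, r.unit_price_cents));
        return new Order(row.id, row.customer_id, items);
    },

    // Flattens the aggregate back into rows for storage.
    toRows(order: Order): { row: OrderRow; itemRows: OrderItemRow[] } {
        return {
            row: { id: order.id, customer_id: order.customerId },
            itemRows: order.itemList().map(i => ({
                order_id: order.id,
                sku: i.sku,
                quantity: i.quantity,
                unit_price_cents: i.unitPriceCents,
            })),
        };
    },
};
```

A repository implementation would call `toDomain` after querying and `toRows` before writing, so column names such as `unit_price_cents` never reach the domain classes.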

A Repository Does NOT

•Contain business logic — Business rules belong in domain objects, not repositories.
•Expose raw queries — Methods like executeQuery(sql) or findByCriteria(queryObject) break encapsulation.
•Return partial aggregates — A repository returns the complete aggregate or nothing. Partial loads violate invariants.
•Cross aggregate boundaries — A repository retrieves one type of aggregate. Joining multiple aggregates is the application layer's job.
•Manage transactions explicitly — Transaction boundaries are typically controlled by the application layer or Unit of Work.
•Implement complex reporting queries — Read-heavy queries should use CQRS read models, not repositories.

The Purity Test

Repository in the Architecture

The architectural split:

Repository Interface — Defined in the domain layer. This is the contract that domain services depend upon.
Repository Implementation — Lives in the infrastructure layer. This is where ORM code, SQL queries, and database connections reside.

This separation is not merely organizational—it's fundamental to the pattern's value.

Architectural Layering

Folder Structure

src/
├── domain/                      # Core domain - no external dependencies
│   ├── model/
│   │   ├── Order.ts             # Aggregate root
│   │   ├── OrderItem.ts         # Entity within aggregate
│   │   ├── OrderId.ts           # Value object (identity)
│   │   └── ShippingAddress.ts   # Value object
│   │
│   ├── repository/              # Repository INTERFACES
│   │   ├── OrderRepository.ts   # Interface only - no implementation
│   │   └── CustomerRepository.ts
│   │
│   └── service/
│       └── OrderService.ts      # Domain service using repository interface
│
├── infrastructure/              # Infrastructure implementations
│   ├── persistence/
│   │   ├── OrderRepositoryImpl.ts    # SQL/ORM implementation
│   │   ├── CustomerRepositoryImpl.ts
│   │   └── mappers/
│   │       ├── OrderMapper.ts        # Domain ↔ DB mapping
│   │       └── CustomerMapper.ts
│   │
│   └── database/
│       └── connection.ts        # Database connection management
│
└── application/                 # Application layer - orchestration
    ├── commands/
    │   └── PlaceOrderHandler.ts # Coordinates domain and infrastructure
    └── config/
        └── di-container.ts      # Wires implementations to interfaces

The dependency flow (the infrastructure layer depends on the domain's interfaces, never the reverse):

Application Layer
       ↓
   depends on
       ↓
Domain Layer (including repository interfaces)
       ↓
   implemented by
       ↓
Infrastructure Layer (repository implementations)
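
The layering above can be sketched in a single file. In a real project each part would live in its own folder, and the infrastructure class would talk to a database. Here `InMemoryOrderRepository` is a hypothetical stand-in backed by a `Map`, and `Order` is reduced to two string fields for brevity.

```typescript
// --- domain layer: model and repository INTERFACE ---
class Order {
    constructor(readonly id: string, readonly customerId: string) {}
}

interface OrderRepository {
    findById(orderId: string): Promise<Order | null>;
    findByCustomer(customerId: string): Promise<Order[]>;
    add(order: Order): Promise<void>;
    remove(order: Order): Promise<void>;
}

// --- infrastructure layer: one possible implementation ---
class InMemoryOrderRepository implements OrderRepository {
    private readonly orders = new Map<string, Order>();

    async findById(orderId: string): Promise<Order | null> {
        return this.orders.get(orderId) ?? null;
    }
    async findByCustomer(customerId: string): Promise<Order[]> {
        return Array.from(this.orders.values()).filter(
            o => o.customerId === customerId
        );
    }
    async add(order: Order): Promise<void> {
        this.orders.set(order.id, order);
    }
    async remove(order: Order): Promise<void> {
        this.orders.delete(order.id);
    }
}

// --- application layer: depends only on the interface ---
class PlaceOrderHandler {
    constructor(private readonly orders: OrderRepository) {}

    async handle(orderId: string, customerId: string): Promise<void> {
        if (await this.orders.findById(orderId)) {
            throw new Error(`Order ${orderId} already exists`);
        }
        await this.orders.add(new Order(orderId, customerId));
    }
}

// Wiring (what a DI container would do): choose the implementation here.
const handler = new PlaceOrderHandler(new InMemoryOrderRepository());
```

Swapping `InMemoryOrderRepository` for a SQL-backed class changes only the wiring line. `PlaceOrderHandler` and `Order` stay untouched, which is also why the in-memory version doubles as a test fake.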

Testing Benefits

Summary: What is a Repository

We've established the foundational understanding of what Repository means in DDD. Let's consolidate the key insights:

Key Takeaways

•A Repository is a collection abstraction — It provides an illusion that all aggregates exist in memory, ready for access.
•Repositories exist at the domain level — The interface is defined in the domain; the implementation lives in infrastructure.
•One repository per aggregate root — Never create repositories for internal entities or value objects.
•Repositories hide persistence mechanics — No SQL, ORM, or database concepts leak into the domain.
•Repository ≠ DAO — DAOs abstract database tables; repositories abstract aggregate collections.
•Repositories enable testability — Domain logic can be tested with in-memory fakes, no database required.

