Every software application, from the simplest mobile app to the most complex enterprise system, must grapple with a fundamental challenge: how to persist data beyond the lifetime of a single process execution. When a user closes an application, when a server restarts, or when power is lost—the data must survive. This is the primary responsibility of the persistence layer: to provide reliable, consistent, and efficient storage and retrieval of application data.
The persistence layer is not merely a technical detail to be glossed over—it is a critical architectural component that profoundly impacts application correctness, performance, scalability, and maintainability. A poorly designed persistence layer becomes a source of bugs, performance bottlenecks, and technical debt that can cripple an otherwise well-architected system.
By the end of this page, you will understand the fundamental operations that constitute data storage and retrieval, the design principles that guide effective persistence layer implementation, and the common patterns that experienced engineers apply when building robust data access mechanisms. You'll develop the mental model necessary to reason about persistence challenges at a professional level.
At its core, persistence refers to the characteristic of data that outlives the execution of the program that created it. Non-persistent (or transient) data exists only in memory and is lost when the process terminates. Persistent data, by contrast, is written to durable storage—typically disk-based systems—where it can survive process restarts, system failures, and power outages.
The persistence layer serves as the intermediary between the application's in-memory domain model and the physical storage mechanism. It translates between the rich object structures that applications manipulate and the structured data representations that storage systems understand.
Think of the persistence layer as a contract between your application and the underlying storage system. The application promises to use the persistence layer's API correctly, and the persistence layer promises to store and retrieve data faithfully. This contract allows the application to reason about data without needing to understand the intricacies of file systems, database protocols, or network storage.
The fundamental operations of any persistence layer are captured in the acronym CRUD: Create, Read, Update, and Delete. These four operations represent the complete lifecycle of persistent data, from its initial creation to its eventual removal. Understanding these operations deeply is essential for designing effective data access mechanisms.
While CRUD appears simple on the surface, each operation carries significant complexity when you consider edge cases, error handling, concurrency, and performance optimization.
| Operation | Purpose | Key Challenges | Performance Considerations |
|---|---|---|---|
| Create | Insert new records into storage | Uniqueness constraints, auto-generation of IDs, validation | Batch inserts vs. single inserts, index maintenance |
| Read | Retrieve data from storage into memory | Query complexity, filtering, sorting, projection | Index utilization, caching, lazy loading |
| Update | Modify existing records in storage | Optimistic vs. pessimistic locking, partial updates | Write amplification, concurrent modifications |
| Delete | Remove records from storage | Cascading deletes, soft vs. hard delete, referential integrity | Orphaned data, storage reclamation |
Create Operations:
When creating new data, the persistence layer must validate the input, enforce uniqueness and other constraints, generate identifiers and other server-assigned fields, and persist the record atomically so the caller receives a complete, committed entity.
The create operation must be idempotent in design or handle duplicate creation attempts gracefully. Consider what happens if a network timeout occurs after the data is written but before the confirmation reaches the caller—will a retry create duplicate records?
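One way to achieve retry-safety is a client-supplied idempotency key. The sketch below assumes a unique idempotency_key column and PostgreSQL-style `INSERT ... ON CONFLICT`; both are illustrative, not part of the repository interface that follows:

```typescript
// Hypothetical sketch: retry-safe creation via a client-supplied idempotency key.
// Assumes a UNIQUE index on users.idempotency_key and PostgreSQL-style placeholders.
interface Db {
  query(sql: string, params?: unknown[]): Promise<any[]>;
}

interface UserRow {
  id: string;
  idempotency_key: string;
  name: string;
  email: string;
}

async function createUserIdempotent(
  db: Db,
  key: string, // generated once by the client and re-sent on every retry
  name: string,
  email: string
): Promise<UserRow> {
  // A retried request re-sends the same key; the conflict clause turns the
  // duplicate insert into a no-op instead of creating a second row.
  await db.query(
    `INSERT INTO users (idempotency_key, name, email)
     VALUES ($1, $2, $3)
     ON CONFLICT (idempotency_key) DO NOTHING`,
    [key, name, email]
  );
  // Return the row from whichever attempt actually inserted it.
  const rows = (await db.query(
    'SELECT id, idempotency_key, name, email FROM users WHERE idempotency_key = $1',
    [key]
  )) as UserRow[];
  return rows[0];
}
```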
```typescript
// A well-designed create operation in the persistence layer
interface UserRepository {
  /**
   * Creates a new user in the storage system.
   *
   * @param user - The user data to persist (without ID)
   * @returns The created user with generated ID
   * @throws DuplicateEmailError if email already exists
   * @throws ValidationError if required fields are missing
   */
  create(user: Omit<User, 'id' | 'createdAt'>): Promise<User>;
}

// Implementation considerations:
class SqlUserRepository implements UserRepository {
  async create(userData: Omit<User, 'id' | 'createdAt'>): Promise<User> {
    // 1. Validate input before hitting the database
    this.validateUserData(userData);

    // 2. Transform to storage format (may differ from domain model)
    const storageData = this.toStorageFormat(userData);

    // 3. Execute the insert within a transaction for atomicity
    const result = await this.db.transaction(async (tx) => {
      // Check for uniqueness constraints proactively
      const existing = await tx.query(
        'SELECT id FROM users WHERE email = ?',
        [storageData.email]
      );
      if (existing.length > 0) {
        throw new DuplicateEmailError(storageData.email);
      }

      // Perform the insert
      const insertResult = await tx.query(
        'INSERT INTO users (name, email, created_at) VALUES (?, ?, ?)',
        [storageData.name, storageData.email, new Date()]
      );
      return insertResult.insertId;
    });

    // 4. Fetch and return the created user with all generated fields
    return this.findById(result);
  }
}
```

Read Operations:
Read operations retrieve data from storage and reconstitute it as in-memory objects. Their complexity varies enormously with the query requirements, from single-record lookups by primary key to queries involving filtering, sorting, joins, and projection across many records.
Effective read operations must balance completeness (returning all required data) with efficiency (minimizing resource consumption). This tension leads to patterns like lazy loading, pagination, and projection.
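For example, pagination bounds how much data a single read returns. A minimal sketch of offset-based pagination follows; the PagedUserReader class and its names are illustrative, and cursor-based pagination is often preferable for large or frequently changing data sets:

```typescript
// Minimal sketch of offset-based pagination; names are illustrative.
interface UserRecord {
  id: string;
  name: string;
  email: string;
}

interface Page<T> {
  items: T[];
  total: number; // total matching rows, so callers can compute page count
  offset: number;
  limit: number;
}

class PagedUserReader {
  constructor(private db: { query(sql: string, params?: unknown[]): Promise<any> }) {}

  async findPage(offset: number, limit: number): Promise<Page<UserRecord>> {
    // Projection: select only the columns callers need, never SELECT *
    const items: UserRecord[] = await this.db.query(
      'SELECT id, name, email FROM users ORDER BY id LIMIT ? OFFSET ?',
      [limit, offset]
    );
    const [{ count }] = await this.db.query(
      'SELECT COUNT(*) AS count FROM users'
    );
    return { items, total: Number(count), offset, limit };
  }
}
```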
Update Operations:
Updating persistent data involves modifying existing records while maintaining data integrity and handling concurrent modifications. Update operations must address several key concerns:
1. Partial vs. Full Updates:
Partial updates are generally preferred because they reduce the risk of lost updates (where one client overwrites changes made by another), minimize data transfer, and are more explicit about intent. However, they require more sophisticated handling of null/undefined values—does a missing field mean "keep the existing value" or "set to null"?
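One common convention resolves this ambiguity at the type level: an absent (undefined) field means "keep the existing value," while an explicit null means "clear the value." A minimal sketch, where the UserPatch type and buildUpdate helper are hypothetical:

```typescript
// Hypothetical sketch: a patch type where an absent (undefined) field means
// "leave unchanged" and an explicit null means "set the column to NULL".
interface UserPatch {
  name?: string;
  nickname?: string | null; // nullable column: null clears it, undefined keeps it
}

function buildUpdate(id: string, patch: UserPatch): { sql: string; params: unknown[] } {
  const assignments: string[] = [];
  const params: unknown[] = [];

  // Only fields actually present on the patch become SET clauses.
  // (Real code must whitelist column names; never interpolate untrusted keys.)
  for (const [column, value] of Object.entries(patch)) {
    if (value !== undefined) {
      assignments.push(`${column} = ?`);
      params.push(value); // may be null, which clears the column
    }
  }
  if (assignments.length === 0) {
    throw new Error('Empty patch: nothing to update');
  }
  params.push(id);
  return { sql: `UPDATE users SET ${assignments.join(', ')} WHERE id = ?`, params };
}

// buildUpdate('42', { nickname: null })
// => "UPDATE users SET nickname = ? WHERE id = ?" with params [null, '42']
```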
2. Concurrent Modification Handling:

When two clients read the same record and write back changes, the later write can silently discard the earlier one. Optimistic locking addresses this by storing a version number that must match at update time; if it doesn't, the update is rejected and the caller retries with fresh data:
```typescript
// Optimistic locking using version numbers
interface VersionedEntity {
  id: string;
  version: number; // Incremented on each update
}

interface Account extends VersionedEntity {
  balance: number;
  lastModified: Date;
}

class AccountRepository {
  async updateBalance(
    accountId: string,
    newBalance: number,
    expectedVersion: number
  ): Promise<Account> {
    // Attempt update only if version matches
    const result = await this.db.query(`
      UPDATE accounts
      SET balance = ?, version = version + 1, last_modified = ?
      WHERE id = ? AND version = ?
    `, [newBalance, new Date(), accountId, expectedVersion]);

    if (result.affectedRows === 0) {
      // Either account doesn't exist or version mismatch
      const current = await this.findById(accountId);
      if (!current) {
        throw new EntityNotFoundError('Account', accountId);
      }
      // Version mismatch - concurrent modification occurred
      throw new OptimisticLockException(
        `Account ${accountId} was modified by another transaction. ` +
        `Expected version ${expectedVersion}, found ${current.version}`
      );
    }

    return this.findById(accountId);
  }
}

// Client code must handle conflicts appropriately
async function transferMoney(fromId: string, toId: string, amount: number) {
  const maxRetries = 3;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const fromAccount = await accountRepo.findById(fromId);
      const toAccount = await accountRepo.findById(toId);

      await accountRepo.updateBalance(
        fromId,
        fromAccount.balance - amount,
        fromAccount.version
      );
      await accountRepo.updateBalance(
        toId,
        toAccount.balance + amount,
        toAccount.version
      );
      return; // Success
    } catch (e) {
      if (e instanceof OptimisticLockException && attempt < maxRetries - 1) {
        // Retry with fresh data
        continue;
      }
      throw e;
    }
  }
}
```

Delete Operations:
Deletion is often the most underestimated CRUD operation. While it seems simple—remove data from storage—the implications can be complex:
Hard Delete vs. Soft Delete:

- Hard delete physically removes records from storage. It reclaims space and keeps queries simple, but it is irreversible and destroys history.
- Soft delete marks records as deleted (e.g., with a deleted_at timestamp) while keeping them in storage. It's reversible and preserves history, but requires filtering in all queries and consumes storage (see the sketch after this list).

Referential Integrity: When deleting data that other records reference, you must decide whether to cascade the delete to dependent records, reject the delete while references remain, or null out the references; relational databases expose these options as ON DELETE CASCADE, RESTRICT, and SET NULL.
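A minimal soft-delete sketch, assuming a nullable deleted_at column on the users table (the class and column names are illustrative):

```typescript
// Hypothetical sketch: soft delete via a nullable deleted_at column.
interface UserRecord {
  id: string;
  name: string;
  email: string;
}

class SoftDeleteUserRepository {
  constructor(private db: { query(sql: string, params?: unknown[]): Promise<any> }) {}

  // "Delete" by stamping the row instead of removing it.
  async softDelete(id: string): Promise<void> {
    await this.db.query(
      'UPDATE users SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL',
      [new Date(), id]
    );
  }

  // Every read path must now exclude soft-deleted rows;
  // forgetting this filter is the classic soft-delete bug.
  async findById(id: string): Promise<UserRecord | null> {
    const rows = await this.db.query(
      'SELECT id, name, email FROM users WHERE id = ? AND deleted_at IS NULL',
      [id]
    );
    return rows[0] ?? null;
  }
}
```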
Improperly designed delete operations can leave orphaned data—records that reference deleted entities or are no longer reachable through normal application flows. Orphaned data wastes storage, can cause application errors, and may violate data integrity constraints. Always analyze the full dependency graph before implementing delete operations.
A critical responsibility of the persistence layer is transforming data between its in-memory representation and its storage format. This transformation is bidirectional: on writes, domain objects are serialized into the storage representation; on reads, stored records are reconstituted into domain objects.
This transformation is non-trivial because in-memory object models and storage schemas often differ significantly. Rich domain objects with behavior, nested structures, and complex relationships must be mapped to flat, typed columns in relational databases or to document structures in NoSQL stores.
```typescript
// Domain Model - Rich, expressive objects
class Order {
  constructor(
    readonly id: OrderId,
    readonly customer: Customer,
    readonly items: OrderItem[],
    readonly shippingAddress: Address, // Value Object
    readonly status: OrderStatus, // Enum
    readonly totalAmount: Money, // Value Object
    readonly createdAt: Date
  ) {}

  // Business methods
  addItem(product: Product, quantity: number): void { /* ... */ }
  canBeCancelled(): boolean { /* ... */ }
}

class Address { // Value Object - immutable, no identity
  constructor(
    readonly street: string,
    readonly city: string,
    readonly country: string,
    readonly postalCode: string
  ) {}

  equals(other: Address): boolean {
    return this.street === other.street &&
      this.city === other.city &&
      this.country === other.country &&
      this.postalCode === other.postalCode;
  }
}

class Money { // Value Object
  constructor(
    readonly amount: number,
    readonly currency: string
  ) {}
}

// Storage Schema - Flat, persistence-friendly
interface OrderRecord {
  id: string;
  customer_id: string; // Foreign key, not embedded Customer
  status: string; // Enum stored as string
  total_amount: number; // Money.amount
  total_currency: string; // Money.currency
  // Address embedded as columns
  shipping_street: string;
  shipping_city: string;
  shipping_country: string;
  shipping_postal_code: string;
  created_at: Date;
}

interface OrderItemRecord {
  id: string;
  order_id: string; // Foreign key to Order
  product_id: string;
  quantity: number;
  unit_price: number;
  unit_currency: string;
}

// Mapper - Translates between domain and storage
class OrderMapper {
  constructor(
    private customerRepository: CustomerRepository,
    private itemMapper: OrderItemMapper
  ) {}

  async toDomain(record: OrderRecord, items: OrderItemRecord[]): Promise<Order> {
    return new Order(
      new OrderId(record.id),
      // Customer loaded separately via repository
      await this.customerRepository.findById(record.customer_id),
      items.map(item => this.itemMapper.toDomain(item)),
      new Address(
        record.shipping_street,
        record.shipping_city,
        record.shipping_country,
        record.shipping_postal_code
      ),
      OrderStatus[record.status as keyof typeof OrderStatus],
      new Money(record.total_amount, record.total_currency),
      record.created_at
    );
  }

  toRecord(order: Order): OrderRecord {
    return {
      id: order.id.value,
      customer_id: order.customer.id.value,
      status: order.status.toString(),
      total_amount: order.totalAmount.amount,
      total_currency: order.totalAmount.currency,
      shipping_street: order.shippingAddress.street,
      shipping_city: order.shippingAddress.city,
      shipping_country: order.shippingAddress.country,
      shipping_postal_code: order.shippingAddress.postalCode,
      created_at: order.createdAt,
    };
  }
}
```

Object-Relational Mappers (ORMs) automate much of this transformation, but they introduce their own complexity and performance considerations. Understanding manual mapping helps you reason about what ORMs do under the hood and when to bypass them for critical operations. We'll explore this trade-off in detail in a later module.
Persistence operations can fail for numerous reasons, and a well-designed persistence layer must handle these failures gracefully. The key is to translate storage-specific errors into meaningful domain-level exceptions that callers can understand and handle appropriately.
Storage failures fall into several categories, each requiring different handling strategies:
| Error Category | Examples | Typical Response | Recovery Possible? |
|---|---|---|---|
| Constraint Violations | Duplicate key, foreign key violation, null constraint | Translate to domain exception, inform caller | Yes, with corrected data |
| Concurrency Conflicts | Optimistic lock failure, deadlock detected | Retry operation or escalate to caller | Usually yes, with retry |
| Transient Failures | Connection timeout, temporary unavailability | Retry with backoff, circuit breaker | Usually yes, with waiting |
| Data Corruption | Checksum mismatch, inconsistent state | Log, alert, potentially fail hard | Requires investigation |
| Resource Exhaustion | Connection pool empty, disk full, memory limit | Back-pressure, queue requests | Yes, when resources free |
| Configuration Errors | Invalid credentials, wrong schema version | Fail startup, prevent operation | No, requires fix |
```typescript
// Define domain-level persistence exceptions
// These abstract away storage-specific errors

abstract class PersistenceException extends Error {
  constructor(message: string, public readonly cause?: Error) {
    super(message);
    this.name = this.constructor.name;
  }
}

class EntityNotFoundException extends PersistenceException {
  constructor(
    public readonly entityType: string,
    public readonly id: string
  ) {
    super(`${entityType} with id '${id}' not found`);
  }
}

class DuplicateEntityException extends PersistenceException {
  constructor(
    public readonly entityType: string,
    public readonly field: string,
    public readonly value: string
  ) {
    super(`${entityType} with ${field}='${value}' already exists`);
  }
}

class ConcurrentModificationException extends PersistenceException {
  constructor(
    public readonly entityType: string,
    public readonly id: string,
    public readonly expectedVersion: number,
    public readonly actualVersion: number
  ) {
    super(
      `${entityType} '${id}' was modified concurrently. ` +
      `Expected version ${expectedVersion}, found ${actualVersion}`
    );
  }
}

class PersistenceUnavailableException extends PersistenceException {
  constructor(message: string, cause?: Error) {
    super(`Storage unavailable: ${message}`, cause);
  }
}

// Concrete catch-all for errors we cannot classify
class UnknownPersistenceException extends PersistenceException {}

// Repository implementation translating storage errors
class SqlUserRepository implements UserRepository {
  async create(userData: CreateUserData): Promise<User> {
    try {
      const result = await this.db.query(
        'INSERT INTO users (email, name) VALUES (?, ?)',
        [userData.email, userData.name]
      );
      return this.findById(result.insertId);
    } catch (error) {
      // Translate database-specific errors to domain exceptions
      if (this.isDuplicateKeyError(error)) {
        throw new DuplicateEntityException('User', 'email', userData.email);
      }
      if (this.isConnectionError(error)) {
        throw new PersistenceUnavailableException(
          'Database connection failed',
          error as Error
        );
      }
      // Unknown error - wrap and rethrow
      throw new UnknownPersistenceException(
        'Unexpected error creating user',
        error as Error
      );
    }
  }

  private isDuplicateKeyError(error: unknown): boolean {
    // MySQL: error code 1062
    // PostgreSQL: error code 23505
    return error instanceof Error && (
      error.message.includes('Duplicate entry') ||
      error.message.includes('unique constraint')
    );
  }

  private isConnectionError(error: unknown): boolean {
    return error instanceof Error && (
      error.message.includes('ECONNREFUSED') ||
      error.message.includes('timeout')
    );
  }
}
```

Raw database error messages often contain internal details (table names, column names, SQL syntax) that should not leak to API consumers. They may also expose potential security vulnerabilities. Always translate to clean, domain-appropriate exceptions or error responses before they leave your service boundary.
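As the error-category table above suggests, transient failures are usually recoverable by retrying with backoff. A minimal sketch, reusing the PersistenceUnavailableException class from the example above (the delay values are illustrative):

```typescript
// Hypothetical sketch: retry a persistence operation with exponential backoff.
// Only transient failures are retried; everything else propagates immediately.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const transient = error instanceof PersistenceUnavailableException;
      if (!transient || attempt === maxAttempts) {
        throw error; // constraint violations etc. fail fast; last attempt gives up
      }
      // 100ms, 200ms, 400ms, ... plus jitter to avoid synchronized retries
      const delayMs = baseDelayMs * 2 ** (attempt - 1) + Math.random() * 50;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error('unreachable'); // the loop always returns or throws
}

// Usage: const user = await withRetry(() => userRepository.findById('42'));
```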
Effective persistence layer design follows principles that ensure reliability, maintainability, and performance. These principles have emerged from decades of industry experience and academic research.
The Repository Pattern:
The Repository pattern is the most common abstraction for data storage and retrieval. A repository presents a collection-like interface for domain objects, hiding the complexity of database access behind simple add, remove, and find operations.
Key characteristics of well-designed repositories:

- They present a collection-like, domain-centric interface (add, remove, find) rather than exposing query languages or storage details.
- They accept and return fully formed domain objects, not raw storage records.
- They insulate callers from the underlying storage mechanism, so the storage strategy can change without touching business logic.
- They are typically defined per aggregate root rather than per table.
While Repository and Data Access Object (DAO) patterns are often confused, they differ in abstraction level. A DAO is table-centric—it provides CRUD for a specific database table. A Repository is domain-centric—it provides collection operations for aggregate roots, potentially spanning multiple tables. We'll explore this distinction in a dedicated module.
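To make the distinction concrete, here is a sketch contrasting the two, reusing the Order types from the mapping example above; the interfaces and method names are illustrative, not a prescribed API:

```typescript
// Hypothetical sketch: DAO vs. Repository.

// A DAO mirrors one table: CRUD on rows of `orders`.
interface OrderDao {
  insert(row: OrderRecord): Promise<void>;
  selectById(id: string): Promise<OrderRecord | null>;
  update(row: OrderRecord): Promise<void>;
  deleteById(id: string): Promise<void>;
}

// A Repository mirrors the domain: collection operations on the Order
// aggregate, which spans both the orders and order_items tables.
interface OrderRepository {
  add(order: Order): Promise<void>;             // persists order + items atomically
  findById(id: OrderId): Promise<Order | null>; // reconstitutes the whole aggregate
  findPendingForCustomer(customerId: string): Promise<Order[]>; // intention-revealing
  remove(order: Order): Promise<void>;
}
```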
We've established the fundamental responsibility of the persistence layer: managing data storage and retrieval. Let's consolidate the key concepts:

- Persistence means data outlives the process that created it; the persistence layer mediates between the in-memory domain model and durable storage.
- The CRUD operations look simple, but each carries real complexity: creates must handle duplicates and retries, reads must balance completeness against efficiency, updates must handle concurrent modification, and deletes must preserve referential integrity.
- Data must be transformed bidirectionally between rich domain objects and flat storage records, typically via explicit mappers or an ORM.
- Storage-specific failures should be translated into meaningful domain-level exceptions at the persistence boundary.
- The Repository pattern is the most common abstraction for these responsibilities, presenting a collection-like, domain-centric interface.
What's Next:
Now that we understand the core data operations, the next page explores how the persistence layer abstracts storage details—hiding the complexity of specific database systems, file formats, and storage protocols behind clean, uniform interfaces. This abstraction is what allows applications to evolve their storage strategy without rewriting business logic.
You now understand the fundamental data storage and retrieval responsibilities of the persistence layer. These concepts form the foundation for all persistence patterns we'll explore in subsequent modules. Next, we'll examine how to abstract storage details effectively.