For decades, enterprise architecture followed a simple pattern: one database for everything. Oracle or SQL Server stored user data, transactions, logs, sessions, analytics—everything. This approach had the virtue of simplicity but forced square pegs into round holes.
Modern systems have rejected this uniformity in favor of polyglot persistence: using multiple database technologies, each chosen for its specific strengths. Your e-commerce platform might use PostgreSQL for orders and inventory, Redis for sessions and cart, Elasticsearch for product search, and ClickHouse for analytics.
This isn't complexity for complexity's sake—it's optimization for reality. Different data has different access patterns, consistency requirements, and scale characteristics. Forcing everything into one database either under-serves some use cases or over-engineers for others.
Polyglot persistence is now the norm at scale. Understanding how to design, implement, and maintain multi-database systems is essential for modern system design.
By the end of this page, you will understand the principles of polyglot persistence, when to introduce additional databases, how to handle data consistency across systems, common architectural patterns, and the operational challenges of managing multiple database technologies.
The fundamental argument for polyglot persistence is that no single database excels at everything. Each database technology makes trade-offs, and matching data characteristics to database strengths yields dramatic improvements.
The Trade-off Reality:
| Optimizes For | Sacrifices | Examples |
|---|---|---|
| Strong consistency (ACID) | Horizontal scale, availability | PostgreSQL, MySQL |
| Write throughput | Read latency, consistency | Cassandra, ScyllaDB |
| Sub-millisecond reads | Persistence durability | Redis, Memcached |
| Full-text search | Transaction support | Elasticsearch, Solr |
| Relationship traversal | Query flexibility | Neo4j, JanusGraph |
| Analytical aggregations | Point queries | ClickHouse, Druid |
A Practical Example:
Consider a social media platform. Its data needs include transactional user accounts, a social graph of follows, posts and media, pre-computed timelines, full-text search over content, engagement analytics, and high-volume activity logs.
Putting all of this in one PostgreSQL instance would be possible but suboptimal: timeline reads would require expensive joins on every request, full-text search via LIKE scales poorly, and analytical queries would compete with transactional traffic for the same resources.
Polyglot persistence lets each use case use the optimal tool.
Every additional database adds operational burden: monitoring, backups, failover, upgrades, and team expertise. Don't adopt polyglot persistence prematurely. Start with SQL, add specialized stores when specific pain points emerge and the benefits clearly outweigh the complexity cost.
Let's examine the most common polyglot combinations and why they work together.
Pattern 1: SQL + Redis = Core + Caching
The most common polyglot pattern. PostgreSQL handles transactional data; Redis provides caching and ephemeral storage.
```typescript
// Primary data in PostgreSQL
interface User {
  id: string;
  email: string;
  name: string;
  preferences: Record<string, any>;
}

// Cache layer with Redis
const CACHE_TTL = 300; // 5 minutes

async function getUser(userId: string): Promise<User> {
  // Try cache first
  const cached = await redis.get(`user:${userId}`);
  if (cached) {
    return JSON.parse(cached);
  }

  // Miss: fetch from PostgreSQL
  const user = await db.query("SELECT * FROM users WHERE id = $1", [userId]);

  // Populate cache for next request
  await redis.set(`user:${userId}`, JSON.stringify(user), "EX", CACHE_TTL);

  return user;
}

// Session storage entirely in Redis (no SQL)
async function createSession(userId: string): Promise<string> {
  const sessionId = crypto.randomUUID();
  await redis.set(`session:${sessionId}`, userId, "EX", 86400); // 24h TTL
  return sessionId;
}

// Rate limiting in Redis
async function checkRateLimit(userId: string): Promise<boolean> {
  const key = `ratelimit:${userId}:${Math.floor(Date.now() / 60000)}`;
  const count = await redis.incr(key);
  if (count === 1) {
    await redis.expire(key, 60); // First request, set expiry
  }
  return count <= 100; // 100 requests per minute
}
```

Pattern 2: SQL + Search Engine
PostgreSQL stores authoritative data; Elasticsearch provides full-text search and faceted filtering.
```typescript
// Products stored in PostgreSQL (source of truth)
await db.query(`
  INSERT INTO products (id, name, description, category_id, price)
  VALUES ($1, $2, $3, $4, $5)
`, [product.id, product.name, product.description, product.categoryId, product.price]);

// Index product in Elasticsearch for search
await elasticsearch.index({
  index: 'products',
  id: product.id,
  body: {
    name: product.name,
    description: product.description,
    category: categoryName,
    price: product.price,
    searchable_text: `${product.name} ${product.description}`,
    facets: {
      category: categoryName,
      price_range: getPriceRange(product.price),
      brand: product.brand
    }
  }
});

// Search uses Elasticsearch
async function searchProducts(query: string, filters: Filters) {
  const results = await elasticsearch.search({
    index: 'products',
    body: {
      query: {
        bool: {
          must: { match: { searchable_text: query } },
          filter: buildFilters(filters)
        }
      },
      aggs: {
        categories: { terms: { field: 'facets.category' } },
        price_ranges: { terms: { field: 'facets.price_range' } }
      }
    }
  });

  // Return IDs, fetch full details from PostgreSQL if needed
  return results.hits.hits.map(hit => hit._id);
}
```

Pattern 3: SQL + Time-Series Database
PostgreSQL for application data; InfluxDB or TimescaleDB for metrics and monitoring data.
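Time-series stores earn their place through server-side windowed aggregation (TimescaleDB's `time_bucket()`, InfluxDB's windowing functions). As an illustration of the idea rather than any particular database's API, here is a hedged sketch of the same time-bucketed downsampling done client-side; all names are hypothetical:

```typescript
// Hypothetical sketch: bucket raw metric points into fixed windows and
// average them — the shape of a typical time-series downsampling query.
interface MetricPoint {
  ts: number;    // epoch milliseconds
  value: number;
}

// Floor a timestamp to the start of its bucket (e.g. 60_000 ms = 1 minute)
function bucketStart(ts: number, bucketMs: number): number {
  return Math.floor(ts / bucketMs) * bucketMs;
}

// Average points per bucket, keyed by bucket start time
function downsample(points: MetricPoint[], bucketMs: number): Map<number, number> {
  const sums = new Map<number, { sum: number; count: number }>();
  for (const p of points) {
    const b = bucketStart(p.ts, bucketMs);
    const agg = sums.get(b) ?? { sum: 0, count: 0 };
    agg.sum += p.value;
    agg.count += 1;
    sums.set(b, agg);
  }
  const averages = new Map<number, number>();
  sums.forEach(({ sum, count }, b) => averages.set(b, sum / count));
  return averages;
}
```

The point of a dedicated time-series database is that this bucketing happens at ingest or query time over billions of points, with columnar compression, rather than in application code.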
Don't architect a 5-database system on day one. Start with PostgreSQL. When session performance becomes a bottleneck, add Redis. When search needs outgrow LIKE queries, add Elasticsearch. Each addition should solve a specific, measured problem.
The core challenge of polyglot persistence is keeping data synchronized across multiple systems. When a user updates their profile in PostgreSQL, how does that update reach the Redis cache and Elasticsearch index?
Synchronization Strategies:
- Dual-write: the application writes to every store directly (fragile, as shown below)
- Cache invalidation + events: write the source of truth, then invalidate or notify derived views
- Change Data Capture (CDC): stream changes from the database log to downstream consumers
The Dual-Write Problem:
```typescript
// DANGEROUS: Dual-write pattern
async function updateUser(userId: string, data: UserUpdate) {
  // Write to PostgreSQL
  await db.query("UPDATE users SET name = $1 WHERE id = $2", [data.name, userId]);

  // Write to Redis cache
  await redis.set(`user:${userId}`, JSON.stringify({ ...data }));

  // Write to Elasticsearch
  await elasticsearch.update({
    index: 'users',
    id: userId,
    body: { doc: data }
  });

  // PROBLEM: If elasticsearch.update() fails:
  // - PostgreSQL is updated ✓
  // - Redis is updated ✓
  // - Elasticsearch is stale ✗
  // - No way to roll back PostgreSQL transaction
  // - Systems are now permanently inconsistent
}

// SAFER: Cache invalidation instead of dual-write
async function updateUserSafe(userId: string, data: UserUpdate) {
  // Write to PostgreSQL (source of truth)
  await db.query("UPDATE users SET name = $1 WHERE id = $2", [data.name, userId]);

  // Invalidate cache (next read will refresh)
  await redis.del(`user:${userId}`);

  // Emit event for async processing
  await messageQueue.publish('user.updated', { userId, data });
  // Elasticsearch updated by event consumer, can retry on failure
}
```

Change Data Capture (CDC) Architecture:
CDC captures database changes at the database level, ensuring all changes are captured regardless of which application made them.
```
PostgreSQL
    │
    │ Write-Ahead Log (WAL)
    ▼
┌─────────────┐
│  Debezium   │
│  Connector  │
└─────────────┘
    │
    │ Change Events
    ▼
┌─────────────┐
│    Kafka    │
│   Topics    │
└─────────────┘
   ╱    │    ╲
  ▼     ▼     ▼
┌─────────┐ ┌──────────┐ ┌───────────┐
│  Redis  │ │ Elastic  │ │ Analytics │
│Consumer │ │ Consumer │ │ Consumer  │
└─────────┘ └──────────┘ └───────────┘
```

Benefits:
- All changes captured (including direct SQL updates)
- Consumers can retry failures (Kafka retention)
- Multiple consumers from single stream
- Ordering guarantees per partition

With CDC or event-based sync, secondary systems receive updates with some delay (typically milliseconds to seconds). Your application must accept that a user who just updated their profile might briefly see stale data in search results. This is usually acceptable, but understand the trade-offs.
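The consumer side of this pipeline can be sketched as a pure translation step from a change event to actions against the derived stores. The event shape and names below are hypothetical, loosely modeled on a simplified Debezium payload; real Redis and Elasticsearch calls would wrap this function so that the decision logic stays testable:

```typescript
// Hypothetical, simplified shape of a CDC change event
interface ChangeEvent {
  table: string;
  op: "c" | "u" | "d";               // create / update / delete
  key: { id: string };
  after: Record<string, any> | null; // row state after the change (null on delete)
}

// Commands a consumer would issue to downstream stores
type SyncAction =
  | { kind: "cache-del"; key: string }
  | { kind: "index-upsert"; index: string; id: string; doc: Record<string, any> }
  | { kind: "index-delete"; index: string; id: string };

// Pure translation: given one change event, decide what each derived
// store needs. The cache is always invalidated; the search index is
// upserted or deleted depending on the operation.
function planSync(event: ChangeEvent): SyncAction[] {
  const actions: SyncAction[] = [
    { kind: "cache-del", key: `${event.table}:${event.key.id}` },
  ];
  if (event.op === "d") {
    actions.push({ kind: "index-delete", index: event.table, id: event.key.id });
  } else if (event.after) {
    actions.push({ kind: "index-upsert", index: event.table, id: event.key.id, doc: event.after });
  }
  return actions;
}
```

Because the translation is pure, a failed Elasticsearch write can simply be retried with the same event, which is exactly what Kafka's retention makes cheap.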
In polyglot architectures, exactly one database must be the source of truth for each piece of data. Other databases are derived views—optimized projections for specific access patterns.
The Source of Truth Rules:
- Every piece of data has exactly one source of truth
- Writes go to the source of truth first (or only)
- Derived views are rebuilt or invalidated from the source of truth, never written independently
- When systems disagree, the source of truth wins
Example: E-Commerce System Data Ownership
| Data Type | Source of Truth | Derived Views | Sync Method |
|---|---|---|---|
| User accounts | PostgreSQL | Redis (cache), ES (search) | CDC + cache invalidation |
| Product catalog | PostgreSQL | ES (search), CDN (cache) | CDC to ES, TTL on CDN |
| Inventory levels | PostgreSQL | Redis (fast reads) | Write-through cache |
| Orders | PostgreSQL | Analytics DB | CDC to Kafka |
| Sessions | Redis | (none) | Ephemeral, no sync needed |
| Search index | Elasticsearch | (none) | Derived from PostgreSQL |
| Metrics | InfluxDB | Grafana cache | Native time-series |
What Happens When Rules Are Violated:
Antipattern: Writing inventory to both PostgreSQL and Redis
Day 1: Both systems show 10 units in stock
Day 2: API writes to PostgreSQL (9 units), Redis write fails silently
Day 3: Customer sees 10 units (from Redis), orders, gets "out of stock" at checkout
Day 4: Manual reconciliation, angry customer support tickets
Correct approach: PostgreSQL is source of truth
- API writes to PostgreSQL only
- Redis is invalidated or updated via CDC
- If Redis is stale, worst case is customer sees stale count
- Checkout validates against PostgreSQL before confirming
Create and maintain a data ownership map for your system. For each data type, document: what's the source of truth, what derived views exist, how are they synchronized, and what's the acceptable lag. This prevents confusion and conflicting assumptions across teams.
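One way to keep such a map honest is to make it machine-readable and validate it in CI. A minimal sketch, with hypothetical store names drawn from the table above:

```typescript
// A machine-readable data ownership map (hypothetical names): one source
// of truth per data type, with derived views, sync method, and acceptable
// lag documented where teams can't disagree about them.
interface Ownership {
  sourceOfTruth: string;
  derivedViews: string[];
  syncMethod: string;
  maxAcceptableLagMs: number; // 0 = must be synchronous
}

const DATA_OWNERSHIP: Record<string, Ownership> = {
  userAccounts: {
    sourceOfTruth: "postgresql",
    derivedViews: ["redis-cache", "elasticsearch"],
    syncMethod: "cdc + cache invalidation",
    maxAcceptableLagMs: 5_000,
  },
  inventory: {
    sourceOfTruth: "postgresql",
    derivedViews: ["redis-cache"],
    syncMethod: "write-through cache",
    maxAcceptableLagMs: 0,
  },
  sessions: {
    sourceOfTruth: "redis",
    derivedViews: [],
    syncMethod: "none (ephemeral)",
    maxAcceptableLagMs: 0,
  },
};

// Guard: every data type must name a source of truth, and must not list
// that same store as one of its own derived views.
function validateOwnership(map: Record<string, Ownership>): string[] {
  const errors: string[] = [];
  for (const [name, o] of Object.entries(map)) {
    if (!o.sourceOfTruth) errors.push(`${name}: missing source of truth`);
    if (o.derivedViews.includes(o.sourceOfTruth)) {
      errors.push(`${name}: source of truth listed as derived view`);
    }
  }
  return errors;
}
```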
Let's examine complete polyglot architectures for common application types.
Architecture 1: E-Commerce Platform
```
┌─────────────────────────────────────────────────────────────────┐
│                      E-Commerce Platform                        │
└─────────────────────────────────────────────────────────────────┘

                    Application Layer
                   ╱        │        ╲
    ┌─────────────┐   ┌───────────┐   ┌───────────────┐
    │ PostgreSQL  │   │   Redis   │   │ Elasticsearch │
    │             │   │           │   │               │
    │ ● Users     │   │ ● Cache   │   │ ● Product     │
    │ ● Orders    │   │ ● Sessions│   │   Search      │
    │ ● Products  │   │ ● Cart    │   │ ● Faceted     │
    │ ● Inventory │   │ ● Rate    │   │   Filters     │
    │ ● Payments  │   │   Limits  │   └───────────────┘
    └─────────────┘   └───────────┘
          │
          │ CDC (Debezium)
          ▼
    ┌─────────────┐      ┌─────────────┐
    │    Kafka    │─────►│ ClickHouse  │
    └─────────────┘      │             │
                         │ ● Analytics │
                         │ ● Reports   │
                         │ ● Metrics   │
                         └─────────────┘

Data Flow:
1. User views → PostgreSQL (read) or Redis cache
2. User search → Elasticsearch → Product IDs → PostgreSQL (details)
3. Add to cart → Redis (cart storage)
4. Checkout → PostgreSQL transaction (order, inventory, payment)
5. Order events → Kafka → ClickHouse (analytics)
```

Architecture 2: Social Media Platform
| Feature | Database | Rationale |
|---|---|---|
| User accounts | PostgreSQL | Transactional, relational, secure |
| Social graph | PostgreSQL or Neo4j | Depends on graph query complexity |
| Posts/content | PostgreSQL + S3 | Metadata in PG, media in object storage |
| Timeline/feed | Redis | Pre-computed, sorted sets by timestamp |
| Notifications | Redis + PostgreSQL | Redis for unread, PG for history |
| Search | Elasticsearch | Full-text on posts, profiles, hashtags |
| Analytics | ClickHouse | Engagement metrics, trend detection |
| Sessions | Redis | Fast auth, auto-expiration |
| Activity logs | Cassandra/Scylla | Massive write volume, append-only |
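The pre-computed timeline row above can be sketched with a plain array standing in for a Redis sorted set (`ZADD` to insert, `ZREMRANGEBYRANK` to trim, `ZREVRANGE` to read). The names and the feed length are hypothetical:

```typescript
// Hypothetical sketch of a pre-computed feed: each user's timeline is a
// sorted set scored by post timestamp, trimmed to a fixed length so it
// stays cheap to store and fast to read.
interface FeedEntry {
  postId: string;
  score: number; // epoch milliseconds of the post
}

const MAX_FEED_LENGTH = 800; // hypothetical cap per user

// ZADD + ZREMRANGEBYRANK equivalent: insert, keep newest-first, trim
function pushToFeed(feed: FeedEntry[], entry: FeedEntry): FeedEntry[] {
  const next = [...feed, entry].sort((a, b) => b.score - a.score);
  return next.slice(0, MAX_FEED_LENGTH);
}

// ZREVRANGE equivalent: the n newest post IDs
function latest(feed: FeedEntry[], n: number): string[] {
  return feed.slice(0, n).map(e => e.postId);
}
```

The design choice here is classic polyglot: the feed is a derived view, rebuilt from PostgreSQL if Redis is lost, so losing it costs recomputation, not data.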
Architecture 3: IoT Platform
Notice the patterns: PostgreSQL for core transactional data, Redis for caching and real-time features, Elasticsearch for search, ClickHouse/Cassandra for analytics and logs. These combinations address complementary needs without overlap.
Polyglot persistence adds operational complexity. Each database technology requires different skills, tools, and procedures. Plan for this from the start.
Operational Challenges:
- Monitoring, backup, and failover procedures differ for each database
- Upgrades must be planned and tested per technology
- On-call engineers need working knowledge of every store
- Debugging incidents spans multiple systems with different tools
Mitigation Strategies:
- Prefer managed services to offload routine operations
- Abstract database access behind repositories so business logic stays simple
- Maintain a data ownership map so responsibilities are explicit
- Standardize observability (dashboards, alerting) across all stores
| Aspect | Self-Managed | Managed Service |
|---|---|---|
| Control | Full configuration control | Limited to service options |
| Cost (small scale) | Lower (compute only) | Higher (managed premium) |
| Cost (large scale) | May be lower | May be higher |
| Operational effort | High | Low |
| Expertise needed | Deep | Shallow |
| Availability SLA | Self-guaranteed | Provider SLA |
| Backups | DIY | Automated |
| Upgrades | Manual, planned | Often click-button or automatic |
For most teams, managed services (RDS, ElastiCache, OpenSearch) are the right choice. The operational savings outweigh the cost premium. Only self-manage when you have specific needs (configuration, cost at extreme scale) and dedicated database expertise.
Polyglot persistence isn't always the answer. Sometimes, sticking with one database is the better choice.
Don't Use Polyglot When:
- A single database meets your current requirements
- Your team lacks the capacity to operate additional systems
- You haven't measured a bottleneck that a specialized store would fix
Signs You Don't Need Polyglot Yet:
- Your data fits comfortably in one well-tuned instance
- Query latency meets requirements without a caching layer
- Search needs are simple enough for SQL full-text or LIKE queries
The Pragmatic Approach:
Start: PostgreSQL
When sessions are slow: Add Redis for sessions/cache
├─ Verify session latency is actually the bottleneck
├─ Measure improvement after adding
└─ If not improved, Redis wasn't the answer
When search is inadequate: Add Elasticsearch
├─ Verify search is business-critical (not just nice-to-have)
├─ Plan synchronization strategy before implementing
└─ Accept search may lag primary data by seconds
When analytics impact production: Add analytics database
├─ Only if queries compete for resources
├─ Consider read replicas first (simpler)
└─ ClickHouse/Redshift for truly heavy analytics
Each addition should solve a measured problem with clear ROI.
Every additional database adds: deployment complexity, failure modes, debugging difficulty, team knowledge requirements, backup procedures, and migration challenges. Only add complexity when the benefits clearly outweigh these costs.
When you do adopt polyglot persistence, follow these guidelines for successful implementation:
Guideline 1: Start with Data Access Layer Abstraction
```typescript
// Abstract repository hides database choice from business logic
interface UserRepository {
  findById(id: string): Promise<User | null>;
  findByEmail(email: string): Promise<User | null>;
  save(user: User): Promise<void>;
  search(query: string): Promise<User[]>;
}

class PolyglotUserRepository implements UserRepository {
  constructor(
    private postgres: PostgresClient,
    private redis: RedisClient,
    private elasticsearch: ElasticsearchClient
  ) {}

  async findById(id: string): Promise<User | null> {
    // Check cache first
    const cached = await this.redis.get(`user:${id}`);
    if (cached) return JSON.parse(cached);

    // Miss: fetch from PostgreSQL
    const user = await this.postgres.query("SELECT * FROM users WHERE id = $1", [id]);
    if (user) {
      await this.redis.set(`user:${id}`, JSON.stringify(user), "EX", 300);
    }
    return user;
  }

  async save(user: User): Promise<void> {
    // Write to PostgreSQL (source of truth)
    await this.postgres.query("INSERT INTO users ... ON CONFLICT DO UPDATE ...", [...]);

    // Invalidate cache
    await this.redis.del(`user:${user.id}`);

    // Update search index (or emit event for async processing)
    await this.elasticsearch.index({ index: 'users', id: user.id, body: user });
  }

  async search(query: string): Promise<User[]> {
    // Search goes to Elasticsearch
    const results = await this.elasticsearch.search({ ... });
    return results.hits.hits.map(hit => hit._source as User);
  }
}

// Business logic doesn't know about polyglot complexity
class UserService {
  constructor(private users: UserRepository) {}

  async getProfile(userId: string) {
    return this.users.findById(userId); // Cache/DB abstracted
  }
}
```

Guideline 2: Monitor Cross-System Consistency
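A minimal consistency monitor periodically samples records from the source of truth and diffs them against a derived view, alerting on drift rather than assuming sync worked. A sketch under the assumption that each record can be summarized by a version or content hash (all names hypothetical):

```typescript
// Hypothetical snapshot shape: record id -> version (or content hash).
// One snapshot would be sampled from PostgreSQL, the other from the
// derived view (Redis, Elasticsearch, ...).
interface Snapshot {
  [id: string]: string;
}

// Compare source of truth against a derived view: records the view is
// missing entirely, and records whose version lags the source.
function findDrift(
  source: Snapshot,
  derived: Snapshot
): { missing: string[]; stale: string[] } {
  const missing: string[] = [];
  const stale: string[] = [];
  for (const [id, version] of Object.entries(source)) {
    if (!(id in derived)) missing.push(id);
    else if (derived[id] !== version) stale.push(id);
  }
  return { missing, stale };
}
```

Run this on a schedule over a random sample of IDs; a nonzero, non-shrinking drift count is the signal that a CDC connector or invalidation path has silently failed.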
Guideline 3: Plan for Failure Modes
| Failure | Impact | Mitigation |
|---|---|---|
| Redis unavailable | Cache miss for all reads | Fallback to PostgreSQL (accept higher latency) |
| Elasticsearch down | Search broken | Degrade gracefully with 'search unavailable' UX |
| Sync lag spike | Stale search/cache data | Monitor and alert; users see slightly stale data |
| CDC connector failure | Systems diverge | Alert, restart connector, run catch-up sync |
| PostgreSQL down | Core functionality offline | Promote replica, restore from backup (critical) |
Design for graceful degradation. If Redis is down, app should work (slower). If Elasticsearch is down, show a message but don't crash. Only PostgreSQL (or your source of truth) should be truly critical. Secondary systems should be optional, even if desired.
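This degradation policy can be expressed as a small fallback helper: try the optional store first, and on a miss or an error go to the source of truth instead of surfacing the failure. A sketch under those assumptions (names hypothetical):

```typescript
// Hypothetical graceful-degradation helper: `optional` is a lookup against
// a secondary store (e.g. Redis) that may miss (null) or throw;
// `authoritative` is the source-of-truth lookup (e.g. PostgreSQL).
async function withFallback<T>(
  optional: () => Promise<T | null>,
  authoritative: () => Promise<T>
): Promise<T> {
  try {
    const hit = await optional();
    if (hit !== null) return hit;
  } catch {
    // Optional store is down — degrade to the source of truth.
    // In production you would also increment a metric here so the
    // degradation is visible, not silent.
  }
  return authoritative();
}
```

With this shape, a Redis outage turns into slower reads rather than failed requests, which is exactly the property the failure-mode table above calls for.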
We've covered polyglot persistence comprehensively—the practice of using multiple database technologies, each optimized for specific use cases. Let's consolidate the key takeaways:
- No single database excels at everything; match data characteristics to database strengths
- Designate exactly one source of truth per data type and treat all other stores as derived views
- Avoid dual-writes; synchronize via cache invalidation, events, or CDC, and accept bounded staleness
- Add databases only in response to measured problems, starting from a single SQL database
- Prefer managed services unless you have specific needs and dedicated database expertise
- Design for graceful degradation so that only the source of truth is truly critical
Module Complete:
You now have a comprehensive understanding of SQL vs NoSQL databases: the relational model's foundations, NoSQL's specialized approaches, when to choose each, and how polyglot persistence combines them into optimized architectures.
This knowledge enables you to make informed, defensible database decisions based on actual requirements rather than trends or assumptions.
Congratulations! You've completed the SQL vs NoSQL module. You now understand relational model foundations, NoSQL data models and trade-offs, decision frameworks for database selection, and polyglot persistence strategies. These skills are essential for any system design discussion involving data storage.