By the mid-2000s, internet-scale applications were pushing relational databases to their limits. Google's global infrastructure, Facebook's social graph, Amazon's shopping platform—these systems demanded data storage capabilities that traditional SQL databases struggled to provide: horizontal scalability to thousands of machines, schema flexibility for rapidly evolving features, and specialized data models optimized for specific access patterns.
The response was a proliferation of non-relational databases, collectively dubbed NoSQL (often interpreted as "Not Only SQL" rather than "No SQL"). This wasn't a single technology but a movement—a recognition that the relational model, despite its elegance, isn't the optimal solution for every problem.
Understanding NoSQL isn't just about learning different databases. It's about understanding why different data models exist and when each one provides advantages that outweigh the loss of relational guarantees.
By the end of this page, you will understand the core philosophy behind NoSQL databases, the meaning and implications of schema flexibility, the four major NoSQL data models (key-value, document, wide-column, graph), and the fundamental trade-offs NoSQL makes compared to relational systems.
NoSQL databases emerged from a set of defining principles that contrast with traditional relational systems. These principles explain the design decisions and trade-offs you'll see across NoSQL systems.
The Driving Forces Behind NoSQL:
- Scale: grow horizontally across thousands of commodity machines, not by buying a bigger server
- Availability: stay online through node failures and network partitions
- Flexibility: evolve data structures without lock-step schema migrations
- Specialization: use data models tuned to specific access patterns rather than one general-purpose model
Historical Context: The Web Scale Challenge
The NoSQL movement crystallized around several seminal papers and systems:
1. Google's Bigtable (2006) — Described a distributed storage system for structured data at massive scale. Inspired HBase, Cassandra, and other wide-column stores.
2. Amazon's Dynamo (2007) — Described a highly available key-value store with eventual consistency. Inspired Riak, DynamoDB, and influenced Cassandra's design.
3. MongoDB (2009) — Made document databases accessible, popularizing schema flexibility and JSON-like storage.
These systems shared a willingness to sacrifice some relational guarantees (joins, ACID transactions, strict consistency) in exchange for properties that mattered more at their scale: partition tolerance, availability, and horizontal scalability.
The CAP Theorem Context:
The CAP theorem (Consistency, Availability, Partition tolerance: pick two) provided theoretical framing for NoSQL trade-offs. In a distributed system, when network partitions occur, you must choose between:
- Consistency: every read reflects the most recent write, even if some requests must be refused or delayed
- Availability: every request receives a response, even if it may reflect stale data
Many NoSQL databases chose AP, accepting eventual consistency in exchange for high availability—a rational choice for systems where temporary inconsistency is tolerable.
CAP is often misunderstood. The "pick two" framing is overly simplistic. Modern databases offer tunable consistency—you can often choose consistency levels per-operation. Partitions are rare; most of the time, you can have consistency AND availability. CAP only forces a choice during partitions.
One of NoSQL's most touted features is schema flexibility. But what does this actually mean, and what are the implications?
Schema-on-Write vs Schema-on-Read:
Relational databases enforce schema-on-write: the schema is defined before data is inserted, and every row must conform. Insert invalid data, get an error.
NoSQL databases often use schema-on-read: structure is interpreted when data is accessed, not when stored. You can insert documents with different fields, and the application determines how to handle variations.
```javascript
// Document 1: User with basic info
{
  "_id": "user_1001",
  "name": "Alice Johnson",
  "email": "alice@example.com",
  "created_at": "2024-01-15T10:30:00Z"
}

// Document 2: User with extended info (same collection!)
{
  "_id": "user_1002",
  "name": "Bob Smith",
  "email": "bob@example.com",
  "phone": "+1-555-123-4567",   // Not in Document 1
  "address": {                  // Nested object
    "street": "123 Main St",
    "city": "San Francisco",
    "country": "USA"
  },
  "preferences": {              // Another nested object
    "newsletter": true,
    "dark_mode": true
  },
  "created_at": "2024-01-16T14:45:00Z"
}

// Document 3: User with completely different structure
{
  "_id": "user_1003",
  "name": "Charlie Chen",
  "email": "charlie@example.com",
  "social_profiles": ["twitter", "linkedin"],   // Array field
  "role": "admin",                              // New field
  "permissions": ["read", "write", "delete"],   // Array field
  "created_at": "2024-01-17T09:00:00Z"
}

// All three documents coexist in the same collection
// No schema migration was needed
```
Advantages of Schema Flexibility:
- No migrations for additive changes: new fields appear simply by writing them
- Heterogeneous records coexist in one collection, as the three documents above show
- Nested objects and arrays model hierarchical data directly, without join tables
- Faster iteration early in a product's life, when the data model changes weekly
The Hidden Costs of Schema Flexibility:
Schema flexibility is not without trade-offs. The absence of enforcement shifts responsibility to the application:
- Validation moves into application code, and every service touching the data must repeat it
- The schema still exists, but implicitly, scattered across the code that reads and writes documents
- Readers must code defensively against missing fields, renamed fields, and old document shapes
- Data quality erodes silently: bad writes succeed instead of failing fast
Modern practice increasingly embraces 'schema-lite' approaches: use schemaless databases but with application-level schema validation (JSON Schema, Mongoose schemas, etc.). This provides flexibility with guardrails. Pure schemaless is often regretted as systems mature.
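As a concrete illustration of the schema-lite approach, MongoDB can attach a JSON Schema validator to an otherwise schemaless collection. A minimal sketch (the users collection and its fields are examples, not from this text):

```javascript
// mongosh: enforce a core schema while leaving other fields free-form
db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email", "created_at"],  // enforced on every insert/update
      properties: {
        name:       { bsonType: "string" },
        email:      { bsonType: "string", pattern: "^.+@.+$" },
        created_at: { bsonType: "date" }
        // Fields not listed here (phone, address, ...) remain unconstrained
      }
    }
  },
  validationAction: "error"  // reject non-conforming writes outright
})
```

Writes that omit a required field now fail fast, while documents keep their flexibility everywhere else.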
The simplest NoSQL model is the key-value store: a distributed hash table where each key maps to a value. This extreme simplicity enables extreme performance and scalability.
How Key-Value Stores Work:
```
SET user:1001 "{ name: 'Alice', email: 'alice@example.com' }"
GET user:1001  → "{ name: 'Alice', email: 'alice@example.com' }"
```
The database doesn't interpret the value—it's just bytes. No indexes on value fields, no queries on value contents (in pure key-value stores). You know the key, you get the value. Period.
Why This Simplicity Matters:
| Operation | Time Complexity | Why |
|---|---|---|
| GET by key | O(1) | Hash function → partition → node lookup |
| SET key/value | O(1) | Same as GET, then store |
| DELETE by key | O(1) | Same as GET, then delete |
| Query by value field | O(n) or impossible | No indexes on value structure |
| Range query | O(log n) to O(n) if supported | Depends on implementation |
Prominent Key-Value Stores:
- Redis: in-memory, rich value types (lists, sets, sorted sets), sub-millisecond latency
- Memcached: deliberately minimal in-memory cache, multithreaded, no persistence
- Amazon DynamoDB: fully managed, descended from the Dynamo paper, predictable performance at scale
- Riak: Dynamo-inspired, masterless, tuned for high availability
Key Design Patterns:
Since you can only retrieve by key, key design becomes critical:
```
// Hierarchical keys for namespacing
user:1001                    // User record
user:1001:sessions           // User's sessions
user:1001:cart               // User's shopping cart

// Composite keys for relationships
order:2024-01-15:1001        // Date-prefixed for time-based queries
product:electronics:laptop   // Category-prefixed for scanning

// Unique identifiers
session:a1b2c3d4e5f6         // Session tokens
cache:api:/v1/users/1001     // Cached API responses
```
Key patterns effectively create pseudo-structures within a flat namespace. This is both powerful and primitive—you're building your own access patterns at the key level.
Key-value stores excel for: caching (session data, API responses, computed results), rate limiting, feature flags, leaderboards, real-time counters, job queues, and any access pattern where you know the exact key. They're not suitable when you need to query by attributes or perform joins.
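As a sketch of one such pattern, here is a fixed-window rate limiter built from nothing but key-value operations, using the node-redis client (key names and limits are arbitrary choices, and a local Redis instance is assumed):

```javascript
import { createClient } from "redis";

const redis = createClient();  // assumes Redis on localhost:6379
await redis.connect();

// Fixed-window rate limiter: allow `limit` requests per `windowSeconds` per user.
// The key itself encodes the access pattern, e.g. "ratelimit:user:1001".
async function allowRequest(userId, limit = 100, windowSeconds = 60) {
  const key = `ratelimit:user:${userId}`;
  const count = await redis.incr(key);        // O(1) atomic increment
  if (count === 1) {
    await redis.expire(key, windowSeconds);   // start the window on first hit
  }
  return count <= limit;
}

if (await allowRequest("1001")) {
  // handle the request
}
```

Everything happens through exact-key operations, which is exactly the access pattern key-value stores are built for.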
Document databases extend the key-value model by understanding value structure. Documents are typically JSON (or JSON-like BSON) objects that the database can index, query, and validate.
Key Characteristics:
- Each document is a self-contained, hierarchical record, with nested objects and arrays
- The database understands document structure: secondary indexes can cover nested fields
- Rich queries filter and sort on any field, not just the primary key
- Operations on a single document are atomic, which rewards keeping related data together
```javascript
// Order document with embedded line items (denormalized)
{
  "_id": ObjectId("65abc123def456789"),
  "order_number": "ORD-2024-00001",
  "customer": {
    "id": "cust_1001",
    "name": "Alice Johnson",
    "email": "alice@example.com"
  },
  "items": [
    {
      "product_id": "prod_5001",
      "name": "Mechanical Keyboard",
      "quantity": 1,
      "unit_price": 149.99
    },
    {
      "product_id": "prod_5002",
      "name": "Ergonomic Mouse",
      "quantity": 2,
      "unit_price": 79.99
    }
  ],
  "subtotal": 309.97,
  "tax": 27.90,
  "total": 337.87,
  "status": "processing",
  "shipping_address": {
    "street": "123 Main St",
    "city": "San Francisco",
    "state": "CA",
    "zip": "94102"
  },
  "created_at": ISODate("2024-01-15T10:30:00Z"),
  "updated_at": ISODate("2024-01-15T10:30:00Z")
}

// Query: Find all orders for a customer with total > $200
db.orders.find({
  "customer.id": "cust_1001",
  "total": { $gt: 200 }
}).sort({ created_at: -1 })

// Query: Find orders containing a specific product
db.orders.find({
  "items.product_id": "prod_5001"
})
```
Document Design: Embedding vs Referencing
The critical design decision in document databases is whether to embed related data within documents or reference it by ID:
Embedding (Denormalization): store related data inside the parent document, as the order above does with its line items. One read returns everything, and single-document atomicity covers the whole update, but embedded data is duplicated across documents and must be updated everywhere it appears.
Referencing (Normalization):
Alternatively, store related data in separate documents with references:
```javascript
// Order document (normalized)
{
  "_id": ObjectId("65abc123def456789"),
  "customer_id": ObjectId("65xyz789abc123456"),  // Reference
  "items": [
    { "product_id": ObjectId("..."), "quantity": 1, "unit_price": 149.99 }
  ],
  "total": 337.87
}

// Customer document (separate collection)
{
  "_id": ObjectId("65xyz789abc123456"),
  "name": "Alice Johnson",
  "email": "alice@example.com"
}
```
Referencing requires application-level joins (multiple queries) or aggregation pipeline $lookup operations (similar to SQL joins but typically slower).
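For illustration, a $lookup that reattaches customer data to the normalized orders above might look like this sketch (assuming collections named orders and customers):

```javascript
// Aggregation-pipeline "join": pull each order's customer document in
db.orders.aggregate([
  { $match: { total: { $gt: 200 } } },
  { $lookup: {
      from: "customers",           // the referenced collection
      localField: "customer_id",   // field on orders
      foreignField: "_id",         // field on customers
      as: "customer"               // result lands here as an array
  }},
  { $unwind: "$customer" }         // flatten the one-element array
])
```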
Prominent Document Databases:
- MongoDB: the most popular document database; flexible, scalable, rich query language
- CouchDB: RESTful, multi-master replication, sync-friendly
- Firestore: Google's managed document database; real-time sync, offline support
- Amazon DocumentDB: MongoDB-compatible managed service on AWS
Wide-column stores (also called column-family stores) organize data into tables with rows and columns, but with crucial differences from relational tables:
Conceptual Model:
```
Table: user_activity
────────────────────────────────────────────────────────────────────────
Row Key   │ Column Family: info     │ Column Family: events
          │ name      │ email       │ 2024-01-15    │ 2024-01-16
────────────────────────────────────────────────────────────────────────
user:1001 │ "Alice"   │ "alice@..." │ "login,click" │ "purchase"
user:1002 │ "Bob"     │ "bob@..."   │ "login"       │ [empty]
user:1003 │ "Charlie" │ [empty]     │ "signup"      │ "login,click"
────────────────────────────────────────────────────────────────────────
```
Key observations:
- Rows are identified by row key (often designed for range queries)
- Column families are fixed at schema time, but columns within them are dynamic
- Sparse columns are efficient (no storage for empty cells)
- Each cell can have multiple versions (timestamped)
- Data is sorted by row key, enabling efficient range scans

Why Wide-Column Stores Exist:
Wide-column stores emerged from Google's Bigtable to solve specific challenges:
Sparse Data: When most cells are empty, traditional tables waste space. Wide-column stores only store non-empty cells.
Time-Series Data: Columns can represent timestamps, with each row containing a time range of data. Efficient for metrics, logs, events.
Denormalization for Read Performance: Pre-joining data at write time. Each row contains all data needed for a query.
Massive Scale: Designed for petabytes across thousands of nodes. The row key determines the partition, enabling horizontal scaling (see the sketch below).
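To make "row key determines partition" concrete, here is a toy hash partitioner in JavaScript. It is an illustration only: real systems such as Cassandra use consistent hashing with virtual nodes, while Bigtable and HBase assign sorted key ranges to nodes (which is why timestamp-leading keys hot-spot there):

```javascript
// Toy partitioner: row key → partition index. Every read and write for a
// given key deterministically lands on the same partition (and thus node).
function partitionFor(rowKey, numPartitions) {
  let hash = 0;
  for (const ch of rowKey) {
    hash = (hash * 31 + ch.charCodeAt(0)) | 0;  // simple 32-bit string hash
  }
  return Math.abs(hash) % numPartitions;
}

// The sensor prefix spreads writes; the timestamp suffix keeps keys unique
partitionFor("sensor_5001:2024-01-15T10:30:00Z", 16);
```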
Row Key Design is Critical:
In wide-column stores, the row key determines:
- Placement: which partition (and therefore which node) stores the row
- Order: rows are sorted by key, so the key defines which range scans are cheap
- Load distribution: whether writes spread evenly or pile onto one partition
Poor row key design leads to "hot partitions" where one node handles disproportionate load.
```
// BAD: Timestamp as row key
row_key = "2024-01-15T10:30:00Z"             // All recent writes go to same partition

// BETTER: Include distribution factor
row_key = "sensor_5001:2024-01-15T10:30:00Z" // Spread across partitions

// PATTERN: Reverse domain for hierarchy
row_key = "com.example.user:1001:2024-01-15" // Enables prefix scans
```
Wide-column stores excel for: time-series data (metrics, logs, IoT), write-heavy workloads, analytics on massive datasets, use cases requiring predictable performance at scale. They're inappropriate for: complex queries with joins, transactions across rows, use cases requiring strong consistency.
Graph databases model data as nodes (entities) and edges (relationships). Unlike relational databases where relationships are computed at query time through joins, graph databases store relationships explicitly, making traversals dramatically faster.
The Graph Model:
```cypher
// Create nodes
CREATE (alice:Person {name: 'Alice', age: 30})
CREATE (bob:Person {name: 'Bob', age: 32})
CREATE (charlie:Person {name: 'Charlie', age: 28})
CREATE (neo4j:Company {name: 'Neo4j', founded: 2007})
CREATE (graphdb:Skill {name: 'Graph Databases'})

// Create relationships
CREATE (alice)-[:FRIENDS_WITH {since: 2020}]->(bob)
CREATE (bob)-[:FRIENDS_WITH {since: 2019}]->(charlie)
CREATE (alice)-[:WORKS_AT {role: 'Engineer', since: 2022}]->(neo4j)
CREATE (bob)-[:WORKS_AT {role: 'Manager', since: 2021}]->(neo4j)
CREATE (alice)-[:HAS_SKILL {level: 'expert'}]->(graphdb)
CREATE (bob)-[:HAS_SKILL {level: 'intermediate'}]->(graphdb)

// Query: Find friends of friends who share a skill with Alice
MATCH (alice:Person {name: 'Alice'})-[:FRIENDS_WITH*2]-(friendOfFriend:Person),
      (alice)-[:HAS_SKILL]->(skill:Skill)<-[:HAS_SKILL]-(friendOfFriend)
WHERE alice <> friendOfFriend
RETURN friendOfFriend.name, skill.name
```
Why Graph Databases Outperform for Relationships:
In a relational database, finding friends-of-friends requires self-joins:
```sql
-- Relational: Friends of friends
SELECT DISTINCT f2.friend_id
FROM friendships f1
JOIN friendships f2 ON f1.friend_id = f2.user_id
WHERE f1.user_id = 1001
  AND f2.friend_id != 1001;
```
For 3 hops, add another join. For variable depth, use recursive CTEs. Performance degrades exponentially with depth because the database must scan join indexes repeatedly.
Graph databases store relationships as pointers. Traversing from node to neighbor is O(1)—just follow the pointer. Multi-hop traversals don't require index lookups at each step.
| Depth | Relational (Joins) | Graph (Pointers) |
|---|---|---|
| 1 hop | 1 join, index lookup | ~O(1) pointer follow |
| 2 hops | 2 joins, exponential rows | ~O(k) where k = avg connections |
| 3 hops | 3 joins, potentially millions of rows | ~O(k²) still manageable |
| Variable depth | Recursive CTE, very slow | BFS/DFS traversal, efficient |
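To make the pointer-following concrete, here is a toy in-memory adjacency map in JavaScript: each hop is a direct lookup rather than a join. This is illustrative only; real graph databases persist these adjacency lists on disk:

```javascript
// Toy adjacency structure: each node holds direct references to neighbors.
const friends = new Map([
  ["alice",   ["bob"]],
  ["bob",     ["alice", "charlie"]],
  ["charlie", ["bob"]],
]);

// Friends-of-friends: each hop is a Map lookup plus a pointer follow, never
// a join. Cost grows with edges visited (~k per hop), not with table size.
function friendsOfFriends(start) {
  const result = new Set();
  for (const friend of friends.get(start) ?? []) {
    for (const fof of friends.get(friend) ?? []) {
      if (fof !== start) result.add(fof);
    }
  }
  return [...result];
}

friendsOfFriends("alice");  // → ["charlie"]
```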
Graph Database Use Cases: social networks (friends, followers), recommendation engines ("people who bought X also bought Y"), fraud detection (suspicious relationship patterns), knowledge graphs, and network or dependency analysis. In every case, the questions are about relationships rather than individual records.
Prominent Graph Databases:
- Neo4j: market leader, Cypher query language, ACID compliant
- Amazon Neptune: managed graph database supporting Gremlin and SPARQL
- JanusGraph: distributed, scalable, open source
- TigerGraph: analytics-focused, real-time deep link analysis
- ArangoDB: multi-model (graph + document + key-value)
While relational databases emphasize ACID (Atomicity, Consistency, Isolation, Durability), many NoSQL systems adopt BASE: Basically Available, Soft state, Eventually consistent. Understanding BASE is crucial for working with distributed NoSQL systems.
BASE Properties Explained:
- Basically Available: the system answers every request, though the answer may be stale or a partial failure
- Soft state: state can change without new input as replication catches up in the background
- Eventually consistent: if writes stop, all replicas converge to the same value
Eventual Consistency in Practice:
Eventual consistency means you might read stale data. Consider a shopping cart service:
```
Time 0:    User adds item, write goes to Node A
Time 1ms:  Node A acknowledges write, starts replicating to B and C
Time 2ms:  User's next request routed to Node B (hasn't received replication yet)
Time 2ms:  User reads cart from Node B → Item missing!
Time 10ms: Replication completes, all nodes consistent
Time 15ms: User reads again → Item appears
```
This is the consistency window—the period during which different nodes return different values.
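Applications often mitigate this with read-your-writes techniques: track the version of your last write and make subsequent reads honor it. A hypothetical sketch, with an invented store client API (write, read, readFromPrimary are illustrative, not a real library):

```javascript
// Hypothetical replicated store client, for illustration only.
// After a write, remember the returned version; on read, accept a replica's
// answer only once it has caught up, else fall back to the primary.
async function addToCart(store, userId, item) {
  const { version } = await store.write(`cart:${userId}`, item);
  return version;  // caller keeps this as a session token
}

async function readCart(store, userId, minVersion) {
  for (let attempt = 0; attempt < 3; attempt++) {
    const { value, version } = await store.read(`cart:${userId}`);
    if (version >= minVersion) return value;    // replica has our write
    await new Promise(r => setTimeout(r, 10));  // inside the window: wait, retry
  }
  return store.readFromPrimary(`cart:${userId}`);  // give up, go to the source
}
```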
| Aspect | ACID | BASE |
|---|---|---|
| Consistency | Strong: reads see latest writes | Eventual: reads may be stale |
| Availability | May refuse requests to maintain consistency | Prioritizes availability over consistency |
| Scalability | Harder to scale horizontally | Designed for horizontal scale |
| Application complexity | Simpler—database handles consistency | Harder—app must handle inconsistency |
| Use case fit | Financial, inventory, critical data | Social, caching, analytics |
Tunable Consistency:
Many NoSQL databases offer tunable consistency, letting you choose per-operation:
```sql
-- Cassandra (cqlsh): set the consistency level, then write with QUORUM.
-- (Modern CQL dropped per-statement USING CONSISTENCY; in application code,
-- the driver sets the consistency level on each statement.)
CONSISTENCY QUORUM;
INSERT INTO orders (...) VALUES (...);
```

```javascript
// DynamoDB: Read with strong consistency
await dynamodb.get({
  TableName: 'orders',
  Key: { id: '1001' },
  ConsistentRead: true  // Strong consistency (higher latency)
});
```
This enables using eventual consistency for non-critical reads (user profiles, product catalog) while demanding strong consistency for critical operations (inventory, payments).
Eventual consistency is not always acceptable. Showing a user their own updates (read-your-writes consistency), financial balances, and inventory counts often require strong consistency. Don't blindly accept eventual consistency—understand where it's safe and where it's dangerous.
NoSQL databases don't provide something for nothing. Every advantage comes with trade-offs. Understanding these trade-offs is essential for making informed database choices.
What NoSQL Gives You: horizontal scalability on commodity hardware, high availability through replication, schema flexibility for fast-changing data, and data models (key-value, document, wide-column, graph) purpose-built for specific access patterns.
What NoSQL Costs You: joins and rich ad-hoc queries, multi-record ACID transactions (though some systems now offer limited forms), database-enforced schemas, strong consistency by default, and the mature tooling the relational ecosystem takes for granted.
Choose NoSQL when its advantages matter more than its costs for your specific use case. If you need transactions, complex queries, and schema enforcement—and most applications do—SQL is often the better choice. NoSQL wins when scale, flexibility, or specialized data models provide clear advantages.
We've covered the NoSQL ecosystem comprehensively. Let's consolidate the key takeaways:
- NoSQL is a movement, not a single technology: it trades relational guarantees for scale, availability, and flexibility
- Schema flexibility moves validation from the database into your application; schema-lite validation restores guardrails
- Each of the four models has a sweet spot: key-value for exact-key lookups, document for self-contained records, wide-column for write-heavy data at scale, graph for relationship-centric queries
- BASE and tunable consistency replace ACID in many systems; know where stale reads are safe and where they are dangerous
What's Next:
Now that we understand both the relational model and NoSQL approaches, we need practical guidance: When should you choose SQL? When is NoSQL the right answer? The next page provides a framework for making this crucial decision based on your specific requirements.
You now understand NoSQL's philosophy, schema flexibility implications, the four major data models, and the trade-offs involved. This knowledge enables you to evaluate NoSQL options intelligently rather than following trends. Next, we'll develop decision criteria for choosing between SQL and NoSQL.