If you approach Cassandra with a relational database mindset, you'll quickly run into frustration. Cassandra's data model looks superficially similar to SQL—it has tables, columns, and rows—but the underlying organization and access patterns are fundamentally different.
Cassandra uses a wide-column model, sometimes called a "partitioned row store." Data is organized into partitions, and each partition can contain many rows with different columns. This structure is optimized for specific access patterns: reading an entire partition is fast; reading across partitions is expensive.
Understanding this model is essential for designing effective Cassandra schemas. Get it right, and your queries execute in milliseconds at any scale. Get it wrong, and your cluster grinds to a halt.
By the end of this page, you will understand: (1) The fundamental concepts of partitions, rows, and columns, (2) How partition keys determine data distribution, (3) How clustering columns enable sorted data within partitions, (4) The difference between CQL tables and SSTable storage, (5) How to model data for Cassandra's access patterns, and (6) Common data modeling patterns and anti-patterns.
A partition is the fundamental unit of data distribution in Cassandra. All data with the same partition key is stored together on the same nodes. This has profound implications:
Partition Key Determines Location:
When you insert data, Cassandra hashes the partition key to determine which nodes store that partition. All rows with the same partition key live on the same replica nodes.
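The placement step can be sketched in miniature. This is a toy ring with hypothetical names (`RING`, `token`, `replicas`), not Cassandra's actual partitioner: the real cluster uses Murmur3 tokens, and md5 stands in here only to get a deterministic hash.

```python
import hashlib
from bisect import bisect_right

# Toy token ring: each node owns the range ending at its position.
# Hypothetical layout; Cassandra really uses Murmur3 tokens, and md5
# is used here only as a deterministic stand-in for illustration.
RING = [(2**125, "node-A"), (2**126, "node-B"), (2**127, "node-C")]

def token(partition_key: str) -> int:
    """Hash the partition key to a position on the ring."""
    return int.from_bytes(hashlib.md5(partition_key.encode()).digest(), "big")

def replicas(partition_key: str, rf: int = 2) -> list:
    """Walk clockwise from the key's token, collecting rf replica nodes."""
    t = token(partition_key)
    start = bisect_right([pos for pos, _ in RING], t) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(rf)]

# Every row with the same partition key lands on the same replica set:
assert replicas("alice-uuid") == replicas("alice-uuid")
```

Because the node set is a pure function of the partition key, any coordinator can route a request without a lookup table.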
Partition-Local Operations Are Fast:
Reading all rows within a single partition is extremely efficient—it's essentially a single disk seek and sequential read. Cassandra is optimized for this access pattern.
Cross-Partition Operations Are Expensive:
Queries that span multiple partitions require contacting potentially all nodes in the cluster. This is why Cassandra requires you to specify the partition key in most queries.
```sql
-- Table definition with partition key
CREATE TABLE user_activity (
    user_id UUID,            -- Partition key
    activity_time TIMESTAMP, -- Clustering column
    activity_type TEXT,
    details TEXT,
    PRIMARY KEY (user_id, activity_time)
);

-- Data organization:
--
-- Partition: user_id = 'alice-uuid'
-- ├── Row: activity_time = '2024-01-01 10:00', activity_type = 'login'
-- ├── Row: activity_time = '2024-01-01 10:05', activity_type = 'view_page'
-- ├── Row: activity_time = '2024-01-01 10:10', activity_type = 'purchase'
-- └── Row: activity_time = '2024-01-01 10:15', activity_type = 'logout'
--
-- Partition: user_id = 'bob-uuid'
-- ├── Row: activity_time = '2024-01-01 09:00', activity_type = 'login'
-- └── Row: activity_time = '2024-01-01 09:30', activity_type = 'view_page'

-- This query is efficient (single partition):
SELECT * FROM user_activity WHERE user_id = 'alice-uuid';

-- This query is efficient (single partition, range scan):
SELECT * FROM user_activity
WHERE user_id = 'alice-uuid'
  AND activity_time > '2024-01-01 10:00'
  AND activity_time < '2024-01-01 10:30';

-- This query is DANGEROUS (full cluster scan):
SELECT * FROM user_activity WHERE activity_type = 'login'; -- ❌
```

Partition Size Considerations:
While the storage engine technically allows up to about two billion cells per partition, practical limits are far lower; a common guideline is to keep partitions under roughly 100 MB.
Large partitions cause: slower reads, since more data must be scanned and merged per request; heap pressure and long GC pauses during reads and compaction; and slow repair and streaming, because oversized partitions are expensive to move between nodes.
A common mistake is choosing a partition key that creates unbounded partitions. For example, partitioning time-series data by sensor_id alone means each sensor's partition grows forever. Instead, include a time bucket (sensor_id, date) to limit partition size. Design for known, bounded partition sizes.
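The bucketing rule above can be sketched as a tiny key-derivation helper. The name `bucketed_key` and the day-sized bucket are assumptions for illustration.

```python
from datetime import datetime

# Hypothetical helper for the (sensor_id, date) bucketing scheme.
def bucketed_key(sensor_id: str, reading_time: datetime):
    """Derive the composite partition key for a reading. Bucketing by
    day caps growth: a sensor writes into at most one partition per
    day, so no single partition grows without bound."""
    return (sensor_id, reading_time.date().isoformat())

# Same day -> same partition; next day -> a fresh, bounded partition.
assert bucketed_key("s-1", datetime(2024, 1, 15, 9)) == ("s-1", "2024-01-15")
assert bucketed_key("s-1", datetime(2024, 1, 16, 9)) == ("s-1", "2024-01-16")
```

The bucket size is a design knob: pick it so that the busiest key stays under the partition-size guideline.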
While the partition key determines which nodes store the data, clustering columns determine how data is sorted within a partition. This sorting is physical—data is stored on disk in clustering column order.
Why Clustering Order Matters:
Cassandra stores data sorted by clustering columns. This enables extremely efficient range queries within a partition: reading "all activities between 10:00 and 11:00" for a user is a single sequential disk read, not a random scan.
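A minimal sketch of why sorted storage makes this cheap, assuming a partition is just a Python list kept in clustering order:

```python
from bisect import bisect_left, bisect_right

# One partition's rows, physically sorted by the clustering key
# (activity_time), mirroring the user_activity example above.
rows = [
    ("2024-01-01 10:00", "login"),
    ("2024-01-01 10:05", "view_page"),
    ("2024-01-01 10:10", "purchase"),
    ("2024-01-01 10:15", "logout"),
]

def range_query(rows, lo, hi):
    """Binary-search to the first row after lo, then read the slice
    contiguously up to hi (exclusive bounds, like `> lo AND < hi`)."""
    times = [t for t, _ in rows]
    return rows[bisect_right(times, lo):bisect_left(times, hi)]
```

The search finds the starting offset; everything after that is a sequential read of adjacent entries, which is the same shape as a range slice over a sorted partition on disk.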
Defining Clustering Order:
```sql
-- Clustering columns with custom sort order
CREATE TABLE messages (
    chat_id UUID,
    sent_at TIMESTAMP,
    sender_id UUID,
    message_text TEXT,
    PRIMARY KEY (chat_id, sent_at, sender_id)
) WITH CLUSTERING ORDER BY (sent_at DESC, sender_id ASC);

-- Physical storage order within partition chat_id='chat-123':
--
-- sent_at=2024-01-01 12:00 | sender_id=alice | message_text='Hi!'
-- sent_at=2024-01-01 11:00 | sender_id=bob   | message_text='Hello'
-- sent_at=2024-01-01 10:00 | sender_id=alice | message_text='Hey'
--
-- Note: Newest messages first (DESC), then by sender (ASC)

-- Efficient: Get latest 20 messages in a chat
SELECT * FROM messages WHERE chat_id = 'chat-123' LIMIT 20;
-- Returns newest first!

-- Efficient: Get messages from a specific time range
SELECT * FROM messages
WHERE chat_id = 'chat-123'
  AND sent_at >= '2024-01-01 10:00'
  AND sent_at <= '2024-01-01 12:00';

-- Efficient: Get a specific message
SELECT * FROM messages
WHERE chat_id = 'chat-123'
  AND sent_at = '2024-01-01 11:00'
  AND sender_id = 'bob-uuid';
```

The Clustering Column Hierarchy:
Clustering columns form a hierarchy: a query can fix them with equality from left to right, optionally ending with a range on the next column down. But you cannot skip a level. With PRIMARY KEY (a, b, c, d), where a is the partition key:
✅ WHERE a = 1 — Valid
✅ WHERE a = 1 AND b = 2 — Valid
✅ WHERE a = 1 AND b = 2 AND c = 3 — Valid
✅ WHERE a = 1 AND b > 2 — Valid (range on last specified)
❌ WHERE a = 1 AND c = 3 — Invalid (skipped b)
❌ WHERE a = 1 AND b > 2 AND c = 3 — Invalid (range mid-hierarchy)
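These rules can be expressed as a small validity check. `valid_restriction` is a hypothetical helper written for this page, not a Cassandra API; it only models the prefix rule described above.

```python
# Hypothetical checker for the restriction rules above, for a table
# with PRIMARY KEY (a, b, c, d) where `a` is the partition key.
def valid_restriction(key_columns, equals, range_on=None):
    """equals = columns restricted with '='; range_on = column with a
    range (>, <). Equality must cover a contiguous prefix starting at
    the partition key; a range is only legal on the next level down."""
    if not equals or equals[0] != key_columns[0]:
        return False  # the partition key itself is mandatory
    if list(equals) != key_columns[:len(equals)]:
        return False  # skipped a level in the hierarchy
    if range_on is not None and range_on != key_columns[len(equals)]:
        return False  # range not on the column right after the prefix
    return True

KEY = ["a", "b", "c", "d"]
assert valid_restriction(KEY, ["a", "b", "c"])          # a=1 AND b=2 AND c=3
assert valid_restriction(KEY, ["a"], range_on="b")      # a=1 AND b>2
assert not valid_restriction(KEY, ["a", "c"])           # skipped b
assert not valid_restriction(KEY, ["a"], range_on="c")  # range mid-hierarchy
```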
Choose clustering columns based on your query patterns. If you always want the most recent data first, use DESC ordering. If you frequently filter by a specific field, consider making it a clustering column. The goal is to make your common queries touch contiguous blocks of sorted data.
Cassandra supports both compound primary keys (partition key + clustering columns) and composite partition keys (multiple columns forming the partition key).
Compound Primary Key:
Simple partition key with one or more clustering columns:
```sql
-- Compound primary key
-- Partition key: user_id (single column)
-- Clustering column: post_time
CREATE TABLE user_posts (
    user_id UUID,
    post_time TIMESTAMP,
    content TEXT,
    PRIMARY KEY (user_id, post_time)
);

-- All posts for user_id='alice' are in ONE partition
-- Posts are sorted by post_time within that partition

-- Composite partition key
-- Partition key: (country, city) - TWO columns combined
-- Clustering column: store_id
CREATE TABLE stores (
    country TEXT,
    city TEXT,
    store_id UUID,
    store_name TEXT,
    address TEXT,
    PRIMARY KEY ((country, city), store_id)
);

-- Note the double parentheses: ((country, city))
-- Stores in (US, NYC) are in a DIFFERENT partition than (US, LA)
-- Both columns must be specified to find the partition

-- Valid query:
SELECT * FROM stores WHERE country = 'US' AND city = 'NYC';

-- Invalid query (missing city):
SELECT * FROM stores WHERE country = 'US'; -- ❌ Requires full partition key
```

When to Use Composite Partition Keys:
Composite partition keys are useful when:
You always query by multiple columns together: If you always query by (tenant_id, user_id), make them the composite partition key.
You need to limit partition size: Adding a time bucket to the partition key limits how much data accumulates.
You want to distribute load: A composite key creates more, smaller partitions, spreading load across nodes.
Example: Time-Bucketed Sensor Data:
```sql
-- Anti-pattern: Unbounded partition
CREATE TABLE sensor_readings_bad (
    sensor_id UUID,
    reading_time TIMESTAMP,
    value DOUBLE,
    PRIMARY KEY (sensor_id, reading_time)
);
-- Problem: Each sensor's partition grows forever!

-- Better: Time-bucketed partition key
CREATE TABLE sensor_readings (
    sensor_id UUID,
    date DATE, -- Add date to partition key
    reading_time TIMESTAMP,
    value DOUBLE,
    PRIMARY KEY ((sensor_id, date), reading_time)
);

-- Now each partition contains ONE day of readings
-- Query for today's readings:
SELECT * FROM sensor_readings
WHERE sensor_id = 'sensor-123' AND date = '2024-01-15';

-- Query for a multi-day range requires multiple queries:
-- Day 1: WHERE sensor_id = X AND date = '2024-01-15'
-- Day 2: WHERE sensor_id = X AND date = '2024-01-16'
-- ...
-- Application must combine results

-- Trade-off: More queries, but bounded partition size
```

Single-column partition key: PRIMARY KEY (a, b, c) — 'a' is the partition key, 'b' and 'c' are clustering columns. Composite partition key: PRIMARY KEY ((a, b), c) — 'a' and 'b' together form the partition key, 'c' is the clustering column. The inner parentheses define the partition key.
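The multi-day note above is typically handled client-side with a small fan-out helper. `fetch_day` is a hypothetical stand-in for whatever driver call executes the per-day SELECT.

```python
from datetime import date, timedelta

def day_buckets(start: date, end: date):
    """Yield every date bucket in [start, end], one per partition."""
    d = start
    while d <= end:
        yield d
        d += timedelta(days=1)

def fetch_range(fetch_day, sensor_id, start: date, end: date):
    """Run one per-day query per bucket and concatenate the results.
    Each partition is already sorted by reading_time, so appending the
    day slices in date order keeps the combined result sorted."""
    results = []
    for d in day_buckets(start, end):
        results.extend(fetch_day(sensor_id, d))
    return results
```

In practice the per-day queries can also be issued concurrently, since they hit independent partitions.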
Sometimes you need to store attributes that apply to the entire partition, not individual rows. Cassandra provides static columns for this purpose.
What Are Static Columns?
A static column is shared across all rows in a partition. There's only one value per partition, regardless of how many rows exist.
```sql
-- Table with static columns
CREATE TABLE orders (
    customer_id UUID,
    order_id UUID,
    order_date TIMESTAMP,
    -- Static columns: one value per partition (customer)
    customer_name TEXT STATIC,
    customer_email TEXT STATIC,
    -- Regular columns: one value per row (order)
    order_total DECIMAL,
    status TEXT,
    PRIMARY KEY (customer_id, order_id)
);

-- Insert customer info once:
INSERT INTO orders (customer_id, customer_name, customer_email)
VALUES ('cust-123', 'Alice Smith', 'alice@example.com');

-- Insert orders (customer info is inherited):
INSERT INTO orders (customer_id, order_id, order_date, order_total, status)
VALUES ('cust-123', 'order-1', '2024-01-10', 99.99, 'shipped');

INSERT INTO orders (customer_id, order_id, order_date, order_total, status)
VALUES ('cust-123', 'order-2', '2024-01-15', 149.99, 'pending');

-- Query returns:
-- customer_id | order_id | customer_name | customer_email    | order_total | status
-- cust-123    | order-1  | Alice Smith   | alice@example.com | 99.99       | shipped
-- cust-123    | order-2  | Alice Smith   | alice@example.com | 149.99      | pending
--
-- Note: customer_name and customer_email appear in every row but are stored once
```

Static Column Use Cases:
Common use cases include partition-level metadata (a customer's contact details shared by all of their orders), a latest-status or summary value for the whole partition, and flags that apply to every row in the group.
Important Considerations: a static column cannot be part of the primary key, and a table must have at least one clustering column to declare static columns (otherwise every column is effectively per-partition already). Static values are still replicated and timestamped like any other cells.
Static columns are more efficient than full denormalization. If you stored customer_name in every order row, you'd have N copies. With static columns, you have 1 copy per partition. Updates to static columns are O(1), not O(N) where N is rows in the partition.
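The O(1) claim can be sketched with a toy in-memory layout, assuming a partition holds one static map plus a map of clustered rows (all names here are hypothetical).

```python
# Minimal sketch of how a static column differs from per-row copies.
# One partition = one static value + many clustered rows.
partition = {
    "static": {"customer_name": "Alice Smith"},
    "rows": {
        "order-1": {"order_total": 99.99},
        "order-2": {"order_total": 149.99},
    },
}

def rename_customer(partition, new_name):
    """O(1): one cell changes, no matter how many order rows exist."""
    partition["static"]["customer_name"] = new_name

def read_row(partition, order_id):
    """Reads merge the static value into each row, as CQL results do."""
    return {**partition["static"], **partition["rows"][order_id]}

rename_customer(partition, "Alice Jones")
assert read_row(partition, "order-1")["customer_name"] == "Alice Jones"
assert read_row(partition, "order-2")["customer_name"] == "Alice Jones"
```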
CQL (Cassandra Query Language) presents a familiar tabular interface, but the underlying storage model is different from relational databases. Understanding this gap is crucial for performance tuning.
CQL Abstraction:
CQL makes Cassandra feel like a relational database with tables, rows, and columns. But internally, data is stored as sparse, sorted maps:
Partition Key → SortedMap<ClusteringKey, ColumnValues>
Physical Storage:
```
CQL View vs. Physical Storage
=============================

CQL Table Definition:
CREATE TABLE users (
    user_id UUID,
    name TEXT,
    email TEXT,
    created_at TIMESTAMP,
    PRIMARY KEY (user_id)
);

CQL Result (what you see):
┌─────────┬───────┬───────────────────┬──────────────────┐
│ user_id │ name  │ email             │ created_at       │
├─────────┼───────┼───────────────────┼──────────────────┤
│ uuid-1  │ Alice │ alice@example.com │ 2024-01-01 10:00 │
│ uuid-2  │ Bob   │ bob@example.com   │ 2024-01-02 11:00 │
└─────────┴───────┴───────────────────┴──────────────────┘

Physical Storage (SSTable):
Partition: uuid-1
  └── Row:
      ├── name = 'Alice' @ timestamp T1
      ├── email = 'alice@example.com' @ timestamp T2
      └── created_at = '2024-01-01 10:00' @ timestamp T3

Partition: uuid-2
  └── Row:
      ├── name = 'Bob' @ timestamp T4
      ├── email = 'bob@example.com' @ timestamp T5
      └── created_at = '2024-01-02 11:00' @ timestamp T6

Key insight: Every cell has its own timestamp for conflict resolution
```

With Clustering Columns:
```
CQL Table with Clustering:
CREATE TABLE time_series (
    sensor_id UUID,
    timestamp TIMESTAMP,
    temperature DOUBLE,
    humidity DOUBLE,
    PRIMARY KEY (sensor_id, timestamp)
);

CQL Result:
┌───────────┬──────────────────┬─────────────┬──────────┐
│ sensor_id │ timestamp        │ temperature │ humidity │
├───────────┼──────────────────┼─────────────┼──────────┤
│ sensor-1  │ 2024-01-01 10:00 │ 22.5        │ 45.0     │
│ sensor-1  │ 2024-01-01 10:01 │ 22.6        │ 44.8     │
│ sensor-1  │ 2024-01-01 10:02 │ 22.4        │ 45.2     │
└───────────┴──────────────────┴─────────────┴──────────┘

Physical Storage (SSTable):
Partition: sensor-1
  ├── ClusteringKey(2024-01-01 10:00)
  │   ├── temperature = 22.5
  │   └── humidity = 45.0
  ├── ClusteringKey(2024-01-01 10:01)
  │   ├── temperature = 22.6
  │   └── humidity = 44.8
  └── ClusteringKey(2024-01-01 10:02)
      ├── temperature = 22.4
      └── humidity = 45.2

The clustering key is part of the cell name in storage!
Rows are stored sorted by clustering key.
```

Wide Rows (the 'Wide-Column' in Wide-Column Store):
Historically, Cassandra's data model was described as 'wide rows' because a single partition could have millions of columns. The CQL abstraction maps this to multiple rows with clustering columns, but physically, all these 'rows' are stored contiguously within the partition.
This is what makes range queries within a partition so efficient—they're sequential disk reads of contiguous data.
Every cell (column value) in Cassandra has its own timestamp, used for conflict resolution. This is why Cassandra can resolve concurrent writes: the cell with the highest timestamp wins (last-write-wins). This is fundamentally different from relational databases where rows are atomic units.
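A minimal sketch of cell-level last-write-wins, assuming each cell is a (value, timestamp) pair and `merge_rows` is a hypothetical reconciliation helper:

```python
# Each cell is (value, write_timestamp). Merging two replicas' copies
# of a row keeps, per cell, whichever write has the higher timestamp.
def merge_rows(a, b):
    merged = {}
    for col in a.keys() | b.keys():
        cells = [row[col] for row in (a, b) if col in row]
        merged[col] = max(cells, key=lambda cell: cell[1])
    return merged

replica1 = {"name": ("Alice", 100), "email": ("alice@old.com", 100)}
replica2 = {"email": ("alice@new.com", 250)}  # a later write to one cell

assert merge_rows(replica1, replica2) == {
    "name": ("Alice", 100),           # untouched cell survives
    "email": ("alice@new.com", 250),  # newest timestamp wins, per cell
}
```

Because reconciliation happens cell by cell, two concurrent updates to different columns of the same row both survive; a relational row-level lock model has no equivalent of this.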
Effective Cassandra data modeling requires a fundamentally different mindset from relational modeling. The key principle: model for your queries, not for your entities.
Query-Driven Modeling Process:
1. List every query the application will run, before designing any tables.
2. Create one table per query pattern, so each query reads from a single partition.
3. Pick the partition key from the query's equality filters, keeping partition sizes bounded.
4. Pick clustering columns from the query's sort order and range filters.
5. Denormalize: duplicate data across tables instead of joining at read time.
Common Patterns:
| Pattern | Use Case | Example |
|---|---|---|
| Partition per entity | Lookup by ID | users table partitioned by user_id |
| Time-series | Temporal data with range queries | sensor_readings partitioned by (sensor_id, date) |
| Materialized view tables | Same data, different query patterns | orders_by_user and orders_by_date |
| Bucketing | Limit partition size | posts partitioned by (user_id, month) |
| Inverted index | Search by non-key column | users_by_email table |
```sql
-- Query 1: Get a user's posts (most recent first)
CREATE TABLE posts_by_user (
    user_id UUID,
    post_time TIMESTAMP,
    post_id UUID,
    title TEXT,
    content TEXT,
    PRIMARY KEY (user_id, post_time)
) WITH CLUSTERING ORDER BY (post_time DESC);

-- Query 2: Get posts by hashtag
CREATE TABLE posts_by_hashtag (
    hashtag TEXT,
    post_time TIMESTAMP,
    post_id UUID,
    user_id UUID,
    title TEXT,
    -- Note: same data duplicated for a different access pattern
    PRIMARY KEY (hashtag, post_time)
) WITH CLUSTERING ORDER BY (post_time DESC);

-- When a post is created, write to BOTH tables:
-- Application logic handles this atomicity (or use a logged batch)

BEGIN BATCH
  INSERT INTO posts_by_user (user_id, post_time, post_id, title, content)
  VALUES (?, ?, ?, ?, ?);
  INSERT INTO posts_by_hashtag (hashtag, post_time, post_id, user_id, title)
  VALUES (?, ?, ?, ?, ?);
APPLY BATCH;
```

Denormalization trades write complexity for read performance. Updating denormalized data requires updating multiple tables, which the application must handle carefully. For frequently updated data, consider whether the complexity is worth it, or whether a different database might be more appropriate.
Knowing what not to do is as important as knowing best practices. These anti-patterns cause performance problems and operational headaches:
Anti-Pattern: Partitions that grow without limit
```sql
-- BAD: Partition grows forever
CREATE TABLE events (
    sensor_id UUID,
    event_time TIMESTAMP,
    data TEXT,
    PRIMARY KEY (sensor_id, event_time)
);
-- Each sensor's partition grows indefinitely
```
Problems: reads slow down as the partition grows, compaction must rewrite the entire oversized partition, repair and streaming become expensive, and large partitions put pressure on the JVM heap.
Solution: Bucket by time
```sql
-- GOOD: Bounded partition size
CREATE TABLE events (
    sensor_id UUID,
    event_date DATE, -- Add time bucket
    event_time TIMESTAMP,
    data TEXT,
    PRIMARY KEY ((sensor_id, event_date), event_time)
);
-- Each partition contains one day of data
```
Use tools like Astra DB's Data Modeler or Hackolade to visually design Cassandra schemas. They enforce best practices and generate CQL automatically. For complex schemas, consider the Chebotko Diagram methodology for mapping application queries to physical tables.
We've explored Cassandra's wide-column data model in depth. Let's consolidate the key concepts: partitions are the unit of distribution and must be kept bounded; clustering columns define the physical sort order that makes range queries cheap; composite partition keys cap partition size and spread load; static columns store partition-level values once; CQL's tabular view sits on top of a sorted, timestamped cell store; and schemas are designed query-first, denormalizing across tables as needed.
What's Next:
With the data model understood, we now turn to Cassandra's most celebrated strength: write-optimized performance. The next page explores how Cassandra's write path—commit log, memtables, SSTables, and compaction—enables the extreme write throughput that makes Cassandra the go-to choice for high-velocity data.
You now understand Cassandra's wide-column data model—the foundation for designing effective schemas. Next, we'll explore how Cassandra's write-optimized architecture enables industry-leading write performance and what this means for system design.