Data Fragmentation in Distributed Databases

Level: Advanced

Duration: 90 mins

Topic: Fragmentation

Horizontal Fragmentation

Dividing Data Across the Globe

Imagine you're architecting a global e-commerce platform serving 200 million customers across 50 countries. Your Orders table contains 5 billion rows and grows by 10 million daily. A single database server, regardless of its specifications, cannot efficiently store, process, or serve this data. The solution isn't merely scaling up hardware—it's strategically fragmenting your data across multiple nodes.

Horizontal fragmentation (also known as horizontal partitioning or sharding) is the technique of dividing a table's rows into disjoint subsets, distributing each subset to different database nodes while preserving the table schema. Each fragment contains complete rows but only a subset of the table's total population.

This page explores horizontal fragmentation in rigorous depth—from its theoretical foundations to its practical implementation challenges, equipping you with the knowledge to design fragmentation strategies that scale to billions of records while maintaining query performance and data integrity.

What You Will Learn

By the end of this page, you will understand: (1) The formal definition and mathematical foundations of horizontal fragmentation, (2) The completeness, disjointness, and reconstruction properties that valid fragments must satisfy, (3) Primary and derived horizontal fragmentation strategies, (4) Selection predicates and fragmentation schemas, (5) Allocation strategies and their impact on query performance, and (6) Real-world implementation patterns and their trade-offs.

Formal Foundation of Horizontal Fragmentation

Before diving into implementation details, we must establish the formal mathematical foundation. Understanding these principles rigorously distinguishes ad-hoc partitioning from principled fragmentation design.

Definition: Given a relation R, a horizontal fragmentation of R produces fragments R₁, R₂, ..., Rₙ such that:

Completeness: Every tuple in R must belong to at least one fragment Rᵢ
Disjointness: Fragments are mutually exclusive; no tuple appears in multiple fragments
Reconstruction: The original relation R can be obtained by taking the union of all fragments

Mathematically:

Completeness: ∀t ∈ R, ∃i ∈ {1, ..., n} such that t ∈ Rᵢ
Disjointness: Rᵢ ∩ Rⱼ = ∅ for all i ≠ j
Reconstruction: R = ⋃ᵢ₌₁ⁿ Rᵢ
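The three properties can be checked mechanically on a small in-memory example. The following is a minimal sketch, assuming rows are modelled as hashable Python tuples and fragments as plain lists; the function name `check_fragmentation` and the sample data are illustrative, not part of any database API.

```python
from typing import Dict, Hashable, Iterable, List, Set


def check_fragmentation(original: Iterable[Hashable],
                        fragments: List[Iterable[Hashable]]) -> Dict[str, bool]:
    """Check completeness, disjointness and reconstruction on in-memory rows."""
    r: Set[Hashable] = set(original)
    frags: List[Set[Hashable]] = [set(f) for f in fragments]
    union: Set[Hashable] = set().union(*frags)

    # Completeness: every tuple of R appears in at least one fragment.
    complete = r <= union
    # Disjointness: if no tuple is shared, fragment sizes add up to the union size.
    disjoint = sum(len(f) for f in frags) == len(union)
    # Reconstruction: the union of the fragments is exactly R.
    reconstructs = union == r
    return {"complete": complete, "disjoint": disjoint,
            "reconstructs": reconstructs}


orders = [(1, "US"), (2, "DE"), (3, "JP")]
good = [[(1, "US")], [(2, "DE")], [(3, "JP")]]
lossy = [[(1, "US")], [(2, "DE")]]                      # order 3 is missing
overlapping = [[(1, "US"), (2, "DE")], [(2, "DE"), (3, "JP")]]

print(check_fragmentation(orders, good))
print(check_fragmentation(orders, lossy))
print(check_fragmentation(orders, overlapping))
```

On real tables the same checks would be expressed as counts and set differences in SQL; the set-based version only illustrates what each property asserts.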

Why These Properties Matter

Violating completeness means data loss: some tuples become inaccessible. Violating disjointness wastes storage and creates update anomalies, because the same logical row must be modified in multiple locations. Violating reconstruction means you cannot reassemble the original table for reporting or migration. Production systems should re-verify these properties whenever fragment definitions change and audit them periodically.

Selection Predicates:

Horizontal fragments are defined using selection predicates—Boolean expressions that determine fragment membership. Each fragment Rᵢ is defined as:

Rᵢ = σ(pᵢ)(R)

Where σ denotes the selection operation and pᵢ is the predicate for fragment i. For valid fragmentation, the predicates must be:

Mutually exclusive: pᵢ ∧ pⱼ = false for i ≠ j
Collectively exhaustive: p₁ ∨ p₂ ∨ ... ∨ pₙ = true

Example: Geographic Fragmentation

Consider a Customers table with a country attribute. A geographic fragmentation might define:

p₁: country IN ('US', 'Canada', 'Mexico') → Fragment: North America
p₂: country IN ('UK', 'Germany', 'France', ...) → Fragment: Europe
p₃: country IN ('Japan', 'China', 'India', ...) → Fragment: Asia Pacific
p₄: NOT (p₁ OR p₂ OR p₃) → Fragment: Rest of World

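The fourth predicate is crucial: it catches every country not explicitly assigned, which ensures completeness provided country is never NULL. Under SQL's three-valued logic, a NULL country makes all four predicates evaluate to UNKNOWN, so such a row would belong to no fragment; declare the column NOT NULL or add an explicit IS NULL branch.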
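One way to test a predicate set before deploying it is to run sample rows through every predicate and flag any row that matches zero or several fragments. This is a sketch under assumptions: the country lists are shortened, and the names `PREDICATES`, `matching_fragments` and `find_violations` are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]

NORTH_AMERICA = {"US", "Canada", "Mexico"}
EUROPE = {"UK", "Germany", "France"}
ASIA_PACIFIC = {"Japan", "China", "India"}
ASSIGNED = NORTH_AMERICA | EUROPE | ASIA_PACIFIC

# One predicate per fragment, mirroring p1..p4 in the text.
PREDICATES: Dict[str, Callable[[Row], bool]] = {
    "north_america": lambda r: r["country"] in NORTH_AMERICA,
    "europe": lambda r: r["country"] in EUROPE,
    "asia_pacific": lambda r: r["country"] in ASIA_PACIFIC,
    # Catch-all. Python's `not in` is two-valued, so a missing country
    # (None) lands here; SQL's NOT IN would yield UNKNOWN instead.
    "rest_of_world": lambda r: r["country"] not in ASSIGNED,
}


def matching_fragments(row: Row) -> List[str]:
    """Names of all fragments whose predicate accepts the row."""
    return [name for name, pred in PREDICATES.items() if pred(row)]


def find_violations(rows: List[Row]) -> List[Row]:
    """Rows matched by zero fragments (incomplete) or several (overlap)."""
    return [row for row in rows if len(matching_fragments(row)) != 1]


sample = [{"country": "Canada"}, {"country": "France"},
          {"country": "Brazil"}, {"country": None}]
for row in sample:
    print(row["country"], "->", matching_fragments(row))
print("violations:", find_violations(sample))
```

An empty violations list on a representative sample is evidence, not proof, that the predicates are mutually exclusive and collectively exhaustive.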

Fragmentation Property Violations and Consequences
Property Violated	Condition	Consequence	Detection Method
Completeness	Some tuples match no predicate	Data inaccessible, query results incomplete	COUNT(*) on the union of fragments < COUNT(*) on the original
Disjointness	Some tuples match multiple predicates	Storage waste, update anomalies, delete failures	Sum of fragment counts > COUNT(*) on the union of fragments
Reconstruction	Union cannot recreate original	Schema mismatch, data corruption	Schema comparison, integrity constraint checks

Primary Horizontal Fragmentation

Primary horizontal fragmentation partitions a relation based on predicates defined on the relation's own attributes. This is the most straightforward form of horizontal fragmentation, where the fragmentation criteria are intrinsic to the data being fragmented.

The Design Process:

Designing primary horizontal fragmentation requires analyzing:

Application Queries: What are the most frequent access patterns?
Data Distribution: How are attribute values distributed across the population?
Growth Patterns: How will data volumes evolve over time?
Locality Requirements: Where must data be physically located for performance or compliance?

Simple Predicates:

A simple predicate is an atomic Boolean condition of the form:

attribute θ value

Where θ ∈ {=, ≠, <, ≤, >, ≥}.

Examples of simple predicates:

order_date >= '2024-01-01'
status = 'ACTIVE'
amount > 10000

Minterm Predicates:

Given a set of simple predicates P = {p₁, p₂, ..., pₘ}, a minterm predicate is a conjunction of these predicates where each predicate appears in either its natural form (pᵢ) or negated form (¬pᵢ).

For m simple predicates, there are 2ᵐ possible minterms, though many may be contradictory (always false) and can be eliminated.

Example:

Given simple predicates:

p₁: status = 'ACTIVE'
p₂: balance > 1000

The four minterms are:

m₁: status = 'ACTIVE' AND balance > 1000 (Active, High Balance)
m₂: status = 'ACTIVE' AND balance ≤ 1000 (Active, Low Balance)
m₃: status ≠ 'ACTIVE' AND balance > 1000 (Inactive, High Balance)
m₄: status ≠ 'ACTIVE' AND balance ≤ 1000 (Inactive, Low Balance)
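Minterm enumeration is mechanical, so it can be sketched in a few lines. The example below assumes each simple predicate is given as a label plus a Python callable; the helper name `minterms` is hypothetical. It generates all 2ᵐ sign combinations and does not attempt to eliminate contradictory minterms.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
SimplePredicate = Tuple[str, Callable[[Row], bool]]

simple_predicates: List[SimplePredicate] = [
    ("status = 'ACTIVE'", lambda r: r["status"] == "ACTIVE"),
    ("balance > 1000", lambda r: r["balance"] > 1000),
]


def minterms(preds: List[SimplePredicate]) -> List[Tuple[str, Callable[[Row], bool]]]:
    """Build every conjunction in which each predicate is kept or negated."""
    result = []
    for signs in product([True, False], repeat=len(preds)):
        label = " AND ".join(
            text if keep else f"NOT ({text})"
            for (text, _), keep in zip(preds, signs)
        )

        # Bind `signs` as a default argument so each closure keeps its own copy.
        def test(row: Row, signs: Tuple[bool, ...] = signs) -> bool:
            return all(fn(row) == keep for (_, fn), keep in zip(preds, signs))

        result.append((label, test))
    return result


all_minterms = minterms(simple_predicates)
for label, _ in all_minterms:
    print(label)

row = {"status": "CLOSED", "balance": 5000}
print([label for label, test in all_minterms if test(row)])
```

Because the minterms partition the space of truth assignments, every row satisfies exactly one of them, which is what makes them usable as fragment definitions.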

primary_fragmentation_example.sql
SQL
-- Original Orders table schema
CREATE TABLE Orders (
    order_id        BIGINT PRIMARY KEY,
    customer_id     BIGINT NOT NULL,
    order_date      DATE NOT NULL,
    region          VARCHAR(50) NOT NULL,
    amount          DECIMAL(15, 2) NOT NULL,
    status          VARCHAR(20) NOT NULL
);
 
-- Define fragmentation predicates based on region
-- Fragment 1: North America
CREATE TABLE Orders_NorthAmerica AS
SELECT * FROM Orders
WHERE region IN ('United States', 'Canada', 'Mexico');
 
-- Fragment 2: Europe  
CREATE TABLE Orders_Europe AS
SELECT * FROM Orders
WHERE region IN ('United Kingdom', 'Germany', 'France', 
                 'Italy', 'Spain', 'Netherlands');
 
-- Fragment 3: Asia Pacific
CREATE TABLE Orders_AsiaPacific AS
SELECT * FROM Orders
WHERE region IN ('Japan', 'China', 'India', 'Australia', 
                 'Singapore', 'South Korea');
 
-- Fragment 4: Rest of World (catch-all)
CREATE TABLE Orders_Other AS
SELECT * FROM Orders
WHERE region NOT IN ('United States', 'Canada', 'Mexico',
                     'United Kingdom', 'Germany', 'France',
                     'Italy', 'Spain', 'Netherlands',
                     'Japan', 'China', 'India', 'Australia',
                     'Singapore', 'South Korea');
 
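-- Sanity check: the fragment counts should sum to the original count.
-- (Equal totals are necessary but not sufficient: a lost row and a duplicated row would cancel out.)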
SELECT 
    (SELECT COUNT(*) FROM Orders_NorthAmerica) +
    (SELECT COUNT(*) FROM Orders_Europe) +
    (SELECT COUNT(*) FROM Orders_AsiaPacific) +
    (SELECT COUNT(*) FROM Orders_Other) AS fragment_total,
    (SELECT COUNT(*) FROM Orders) AS original_total;

Choosing the Fragmentation Attribute

The fragmentation attribute should ideally be: (1) Frequently used in WHERE clauses to enable fragment elimination, (2) Relatively stable to avoid frequent tuple migrations, (3) Well-distributed to prevent skewed fragment sizes, and (4) Meaningful for physical co-location when data locality matters. In our example, 'region' allows queries by geography to access only relevant fragments while supporting data sovereignty requirements.
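The "well-distributed" criterion can be estimated from a sample of attribute values before committing to a design. The sketch below assumes one fragment per distinct value and uses a simple ratio, largest fragment over mean fragment size, as the skew measure; both the metric and the name `skew_ratio` are illustrative choices rather than a standard.

```python
from collections import Counter
from typing import Hashable, Iterable


def skew_ratio(values: Iterable[Hashable]) -> float:
    """Largest group size divided by mean group size (1.0 means perfectly even)."""
    counts = Counter(values)
    if not counts:
        raise ValueError("need at least one value")
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean


# Hypothetical sample of the region attribute from six orders.
regions = ["NA", "NA", "NA", "EU", "APAC", "Other"]
print(skew_ratio(regions))        # NA holds 3 of 6 rows across 4 groups
print(skew_ratio(["a", "b"]))     # perfectly even
```

A ratio far above 1.0 suggests that one fragment would dominate storage and traffic, which is a reason to refine the predicates or pick a different attribute.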

Derived Horizontal Fragmentation

Derived horizontal fragmentation partitions a relation based on the fragmentation of another (related) relation. This strategy is essential when tables have referential relationships and frequently join together.

Motivation:

Consider a parent-child relationship between Customers and Orders. If Customers is fragmented by region, queries joining these tables would require cross-fragment joins if Orders fragments don't align. Derived fragmentation solves this by fragmenting the child table (Orders) using a semi-join with the parent fragments.

Formal Definition:

Let R₁, R₂, ..., Rₙ be the fragments of parent relation R. For child relation S with foreign key referencing R, the derived fragments of S are:

Sᵢ = S ⋉ Rᵢ

Where ⋉ denotes the semi-join operation. Each Sᵢ contains tuples from S that reference tuples in Rᵢ.

Link Relation:

The parent relation, whose fragmentation determines the derived fragmentation, is called the owner relation (this page also calls it the link relation); the child is the member relation. The attribute used for the semi-join is the link attribute.

Properties of Derived Fragmentation:

Join Locality: Joins between parent and child fragments can be performed locally at each site
Cascade Design: Multiple levels of derivation form a fragmentation tree
Dependency: Changes to parent fragmentation require re-fragmenting children
Orphan Handling: Child tuples referencing non-existent parents must be addressed
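The semi-join definition and the orphan problem can both be seen in a small in-memory sketch. It assumes rows are dictionaries and that `customer_id` is the link attribute; the function names `derive_fragments` and `find_orphans` and the sample data are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def derive_fragments(child_rows: List[Row],
                     parent_fragments: Dict[str, List[Row]],
                     link_attr: str) -> Dict[str, List[Row]]:
    """For each parent fragment R_i, keep the child rows that reference it (S semi-join R_i)."""
    derived: Dict[str, List[Row]] = {}
    for name, parents in parent_fragments.items():
        parent_keys = {p[link_attr] for p in parents}
        derived[name] = [c for c in child_rows if c[link_attr] in parent_keys]
    return derived


def find_orphans(child_rows: List[Row],
                 parent_fragments: Dict[str, List[Row]],
                 link_attr: str) -> List[Row]:
    """Child rows whose link value appears in no parent fragment."""
    all_keys = {p[link_attr] for parents in parent_fragments.values() for p in parents}
    return [c for c in child_rows if c[link_attr] not in all_keys]


customers = {
    "north_america": [{"customer_id": 1}, {"customer_id": 2}],
    "europe": [{"customer_id": 3}],
}
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},
    {"order_id": 12, "customer_id": 2},
    {"order_id": 13, "customer_id": 99},   # references no known customer
]

derived = derive_fragments(orders, customers, "customer_id")
orphans = find_orphans(orders, customers, "customer_id")
print({name: [o["order_id"] for o in rows] for name, rows in derived.items()})
print([o["order_id"] for o in orphans])
```

The derived fragments are complete only if the orphan list is empty, which is why referential integrity on the link attribute matters for this strategy.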

derived_fragmentation_example.sql
SQL
-- Parent relation: Customers fragmented by region
-- Customers_NorthAmerica, Customers_Europe, Customers_AsiaPacific
 
-- Child relation: Orders
CREATE TABLE Orders (
    order_id        BIGINT PRIMARY KEY,
    customer_id     BIGINT NOT NULL REFERENCES Customers(customer_id),
    order_date      DATE NOT NULL,
    amount          DECIMAL(15, 2) NOT NULL,
    status          VARCHAR(20) NOT NULL
);
 
-- Derived fragmentation: Orders fragments align with Customers fragments
-- This uses a semi-join with the parent fragment
 
-- Fragment 1: Orders for North American customers
CREATE TABLE Orders_NorthAmerica AS
SELECT o.*
FROM Orders o
WHERE EXISTS (
    SELECT 1 FROM Customers_NorthAmerica c
    WHERE c.customer_id = o.customer_id
);
 
-- Fragment 2: Orders for European customers
CREATE TABLE Orders_Europe AS
SELECT o.*
FROM Orders o
WHERE EXISTS (
    SELECT 1 FROM Customers_Europe c
    WHERE c.customer_id = o.customer_id
);
 
-- Fragment 3: Orders for Asia Pacific customers
CREATE TABLE Orders_AsiaPacific AS
SELECT o.*
FROM Orders o
WHERE EXISTS (
    SELECT 1 FROM Customers_AsiaPacific c
    WHERE c.customer_id = o.customer_id
);
 
-- Now, local joins are possible without cross-site communication
-- Each site can locally execute:
SELECT c.customer_name, o.order_date, o.amount
FROM Customers_NorthAmerica c
JOIN Orders_NorthAmerica o ON c.customer_id = o.customer_id
WHERE o.amount > 1000;

Derived Fragmentation Advantages

• Join Locality: Parent-child joins execute locally, avoiding cross-site data transfer for the join
• Query Performance: Common access patterns involving related tables are optimized
• Data Affinity: Related data is stored together, improving cache utilization
• Transaction Locality: Transactions touching parent and child often complete locally
• Simpler Coordination: Reduces the need for distributed join protocols

Derived Fragmentation Challenges

• Cascade Updates: Parent re-fragmentation requires child re-fragmentation
• Design Complexity: Deep hierarchies create complex dependency chains
• Cross-Fragment Queries: Queries not following the derivation path lose locality
• Flexibility Constraints: Child fragmentation is fixed by parent decisions
• Orphan Management: Must handle children whose parents migrate

Fragment Allocation Strategies

Creating fragments is only half the problem—you must also allocate them to physical database nodes. The allocation decision profoundly impacts query performance, availability, and operational complexity.

Allocation Problem:

Given:

n fragments F₁, F₂, ..., Fₙ
m sites S₁, S₂, ..., Sₘ
Query workload Q = {q₁, q₂, ..., qₖ} with frequencies
Network topology and costs

Find an allocation A: fragments → sites that minimizes:

Total query response time
Data transfer costs
Storage costs

Subject to:

Storage capacity constraints
Availability requirements
Data sovereignty regulations

Allocation Strategies:

Non-replicated Allocation: Each fragment resides at exactly one site
- Minimizes storage but creates single points of failure
- Lower update costs but potential query bottlenecks
Replicated Allocation: Fragments may be duplicated across sites
- Improves read performance and availability
- Increases storage costs and update complexity
- Requires consistency protocols (discussed in the Replication module)
Partially Replicated Allocation: Strategic replication based on access patterns
- Hot fragments replicated more widely
- Cold fragments stored at single sites
- Balances performance, cost, and complexity

Allocation Strategy Comparison
Strategy	Read Performance	Write Complexity	Storage Cost	Availability
Non-replicated	May require remote access	Simple: single copy	Minimal	Low: each fragment is a single point of failure
Fully Replicated	Always local	High: every copy must be updated	Maximal	High: data stays available while at least one site is up
Partial Replication	Usually local	Moderate: depends on replication degree	Moderate	Configurable per fragment

Allocation Heuristics:

Optimal allocation is NP-hard for non-trivial workloads. Practical systems employ heuristics:

Query-Based Allocation:
- Analyze query frequency and access patterns
- Place fragments where most frequently accessed
- Replicate to sites with high read demand
Capacity-Based Allocation:
- Consider storage and processing capacity
- Balance load across available sites
- Monitor and rebalance as workloads shift
Locality-Based Allocation:
- Place data near its producers/consumers
- Respect data sovereignty requirements
- Minimize cross-datacenter traffic
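The query-based heuristic can be sketched as a greedy rule: put each fragment's primary copy at the site that accesses it most, and add replicas at sites whose share of accesses crosses a threshold. The access counts, site names, the 25% threshold and the function name `allocate` below are all assumptions made for illustration; the sketch ignores capacity limits and sovereignty constraints, which a real allocator would have to enforce.

```python
from typing import Dict, List, Union

Plan = Dict[str, Dict[str, Union[str, List[str]]]]


def allocate(access_counts: Dict[str, Dict[str, int]],
             replica_threshold: float = 0.25) -> Plan:
    """Greedy query-based allocation from per-site access counts."""
    plan: Plan = {}
    for fragment, by_site in access_counts.items():
        total = sum(by_site.values())
        # Primary copy goes to the site with the most accesses.
        primary = max(by_site, key=by_site.get)
        # Replicate to other sites whose share of accesses meets the threshold.
        replicas = sorted(
            site for site, n in by_site.items()
            if site != primary and total > 0 and n / total >= replica_threshold
        )
        plan[fragment] = {"primary": primary, "replicas": replicas}
    return plan


# Hypothetical accesses per hour, by fragment and originating site.
accesses = {
    "Orders_Europe": {"frankfurt": 700, "us_east": 50, "singapore": 250},
    "Orders_NorthAmerica": {"frankfurt": 100, "us_east": 850, "singapore": 50},
}
plan = allocate(accesses)
for fragment, placement in plan.items():
    print(fragment, placement)
```

Raising the threshold reduces replication (lower storage and update cost); lowering it favours local reads at more sites.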

Example Allocation Decision:

For regional order fragments:

Orders_NorthAmerica → US-East datacenter (primary), US-West (replica)
Orders_Europe → EU-Frankfurt datacenter (primary), EU-London (replica)
Orders_AsiaPacific → Singapore datacenter (primary), Tokyo (replica)

This allocation:

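Keeps European data in European datacenters (whether the London replica satisfies EU residency rules depends on the applicable regulation, since the UK is outside the EU)
Provides regional read performance
Offers disaster recovery across datacenters within each region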

Dynamic Rebalancing

Production systems must continuously monitor fragment sizes and access patterns. As data grows unevenly, fragments may require splitting, merging, or migration. Modern distributed databases like CockroachDB and TiDB perform automatic rebalancing, moving range fragments between nodes to maintain even distribution without manual intervention.

Implementation Patterns and Considerations

Translating theoretical horizontal fragmentation into production systems requires addressing several practical concerns.

Range-, Hash-, and List-Based Partitioning:

The fragmentation predicate implementation typically follows one of three patterns:

Range-Based Partitioning:
- Predicates define value ranges: date >= '2024-01-01' AND date < '2024-02-01'
- Supports range queries efficiently
- Prone to hotspots if ranges receive uneven traffic
- Common for time-series data
Hash-Based Partitioning:
- Predicates based on hash function: hash(customer_id) MOD n = i
- Distributes data uniformly across fragments
- Poor for range queries (must scan all fragments)
- Excellent for point lookups
List-Based Partitioning:
- Explicit enumeration: region IN ('US', 'Canada')
- Full control over fragment membership
- Requires maintenance as new values appear
- Common for categorical attributes
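The three patterns differ mainly in how a row's key is mapped to a fragment. The routing functions below are a sketch under assumptions: the boundaries, the region map, and the use of CRC32 as the hash function are illustrative choices, and real systems use their own hash functions and metadata.

```python
import bisect
import zlib
from datetime import date
from typing import Dict, List

# Sorted boundaries: range partition i covers [BOUNDS[i], BOUNDS[i + 1]).
BOUNDS: List[date] = [date(2024, 1, 1), date(2024, 2, 1), date(2024, 3, 1)]

# Explicit enumeration for list partitioning (hypothetical mapping).
REGION_TO_PARTITION: Dict[str, str] = {
    "US": "americas", "Canada": "americas",
    "Germany": "emea", "Japan": "apac",
}


def range_partition(day: date) -> int:
    """Index of the range partition containing `day`."""
    i = bisect.bisect_right(BOUNDS, day) - 1
    if i < 0 or i >= len(BOUNDS) - 1:
        raise LookupError(f"no range partition covers {day}")
    return i


def hash_partition(customer_id: int, n: int) -> int:
    """Partition number in [0, n). CRC32 is stable across processes,
    unlike Python's built-in hash() for strings."""
    return zlib.crc32(str(customer_id).encode("utf-8")) % n


def list_partition(region: str) -> str:
    """Named partition for a region, falling back to a catch-all."""
    return REGION_TO_PARTITION.get(region, "other")


print(range_partition(date(2024, 1, 15)))
print(hash_partition(42, 8))
print(list_partition("Germany"), list_partition("Brazil"))
```

Note that changing `n` in the hash scheme remaps most keys, which is one reason resharding hash-partitioned data is expensive.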

partitioning_implementations.sql
-- PostgreSQL Declarative Partitioning
 
-- Range partitioning by date (time-series data)
CREATE TABLE events (
    event_id    BIGINT NOT NULL,
    event_time  TIMESTAMP NOT NULL,
    event_type  VARCHAR(50) NOT NULL,
    payload     JSONB
) PARTITION BY RANGE (event_time);
 
-- Create partitions for each month
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- Continue for additional months...
 
-- Hash partitioning for even distribution
CREATE TABLE users (
    user_id     BIGINT NOT NULL,
    email       VARCHAR(255) NOT NULL,
    created_at  TIMESTAMP DEFAULT NOW()
) PARTITION BY HASH (user_id);
 
-- Create 8 hash partitions
CREATE TABLE users_0 PARTITION OF users
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE users_1 PARTITION OF users
    FOR VALUES WITH (MODULUS 8, REMAINDER 1);
-- Continue for remainders 2-7...
 
-- List partitioning by region
CREATE TABLE orders (
    order_id    BIGINT NOT NULL,
    region      VARCHAR(20) NOT NULL,
    amount      DECIMAL(15, 2)
) PARTITION BY LIST (region);
 
CREATE TABLE orders_americas PARTITION OF orders
    FOR VALUES IN ('US', 'CA', 'MX', 'BR');
CREATE TABLE orders_emea PARTITION OF orders
    FOR VALUES IN ('UK', 'DE', 'FR', 'IT');
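CREATE TABLE orders_apac PARTITION OF orders
    FOR VALUES IN ('JP', 'CN', 'IN', 'AU');
-- Catch-all so rows with unlisted regions are not rejected (PostgreSQL 11+)
CREATE TABLE orders_other PARTITION OF orders DEFAULT;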

Critical Implementation Considerations

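• Partition Key Selection: In PostgreSQL, primary key and unique constraints on a partitioned table must include every partition key column. Updating the key moves the row to another partition (PostgreSQL 11 and later do this automatically), so choose stable, frequently filtered attributes.
• NULL Handling: Decide how NULL values are assigned. In PostgreSQL, a list partition can accept NULL explicitly (FOR VALUES IN (NULL)), while a NULL range key matches no range partition and is accepted only if a DEFAULT partition exists. In MySQL, RANGE partitioning places NULL in the lowest partition, LIST partitioning accepts NULL only if a partition lists it, and HASH/KEY partitioning treats NULL as 0.
• Cross-Partition Queries: Queries that cannot eliminate partitions scan all of them. Design fragmentation around actual query patterns.
• Partition Maintenance: For time-based partitions, automate partition creation (future) and archival (past). Many databases, PostgreSQL and MySQL included, do not create new partitions automatically.
• Foreign Keys: PostgreSQL 12 and later allow foreign keys that reference a partitioned table; earlier versions do not. MySQL does not support foreign keys on partitioned InnoDB tables.
• Index Strategy: In PostgreSQL, an index on a partitioned table is implemented as one index per partition. There are no global indexes spanning partitions, which is why unique constraints must include the partition key. Some systems, Oracle for example, do offer global indexes.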

Query Processing with Horizontal Fragments

When relations are horizontally fragmented, the query processor must determine which fragments to access and how to combine results. This process, called fragment elimination or partition pruning, is critical for performance.

Fragment Elimination:

The query optimizer analyzes WHERE clause predicates against fragment definitions. If a predicate contradicts a fragment's definition, that fragment can be skipped entirely.

Example:

Given fragments:

Orders_2024_Q1: order_date FROM '2024-01-01' TO '2024-04-01'
Orders_2024_Q2: order_date FROM '2024-04-01' TO '2024-07-01'
Orders_2024_Q3: order_date FROM '2024-07-01' TO '2024-10-01'

For query:

SELECT * FROM Orders WHERE order_date = '2024-05-15'

Only Orders_2024_Q2 needs to be accessed—the other fragments are eliminated.
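Fragment elimination for range fragments reduces to an interval-overlap test. The following sketch assumes half-open fragment ranges [start, end) and an inclusive query range; the dictionary `FRAGMENTS` mirrors the three quarters above, and the function name `prune` is hypothetical.

```python
from datetime import date
from typing import Dict, List, Tuple

# Each fragment covers the half-open interval [start, end).
FRAGMENTS: Dict[str, Tuple[date, date]] = {
    "Orders_2024_Q1": (date(2024, 1, 1), date(2024, 4, 1)),
    "Orders_2024_Q2": (date(2024, 4, 1), date(2024, 7, 1)),
    "Orders_2024_Q3": (date(2024, 7, 1), date(2024, 10, 1)),
}


def prune(fragments: Dict[str, Tuple[date, date]],
          lo: date, hi: date) -> List[str]:
    """Fragments that can contain rows with lo <= order_date <= hi."""
    return [
        name for name, (start, end) in fragments.items()
        if start <= hi and lo < end   # the two intervals overlap
    ]


# Equality predicate: order_date = '2024-05-15'
print(prune(FRAGMENTS, date(2024, 5, 15), date(2024, 5, 15)))
# Range predicate spanning a quarter boundary
print(prune(FRAGMENTS, date(2024, 3, 15), date(2024, 4, 1)))
```

A fragment is skipped exactly when its defining range and the query range cannot both hold for any row, which is the contradiction test described above.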

Localization Program:

The distributed query processor transforms global queries into fragment-specific subqueries through a localization program:

Replace global relation with union of fragments
Apply algebraic simplification (push selections)
Eliminate fragments that return empty sets
Optimize remaining fragment accesses

Query Rewriting:

A global query on Orders is rewritten as:

-- Original
SELECT * FROM Orders WHERE region = 'US' AND amount > 1000;

-- After localization (fragments by region)
SELECT * FROM Orders_NorthAmerica WHERE region = 'US' AND amount > 1000;
-- Other fragment queries eliminated entirely

partition_pruning_explain.sql
PostgreSQL
-- Demonstrating partition pruning with EXPLAIN
 
-- Time-partitioned table
CREATE TABLE events (
    id          BIGSERIAL,
    occurred_at TIMESTAMP NOT NULL,
    event_type  VARCHAR(50),
    data        JSONB
) PARTITION BY RANGE (occurred_at);
 
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
CREATE TABLE events_2024_03 PARTITION OF events
    FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');
 
-- Query targeting specific date range
EXPLAIN (ANALYZE, COSTS OFF)
SELECT * FROM events
WHERE occurred_at >= '2024-02-10'
  AND occurred_at < '2024-02-20';
 
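-- Illustrative plan shape (timing details omitted; exact output varies by PostgreSQL version,
-- and recent versions drop the Append node when only one partition remains):
-- Append
--   ->  Seq Scan on events_2024_02 events_1
--         Filter: (occurred_at >= '2024-02-10' AND occurred_at < '2024-02-20')
-- (Partitions events_2024_01 and events_2024_03 are pruned)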
 
-- Query without partition filter - scans ALL partitions
EXPLAIN (ANALYZE, COSTS OFF)
SELECT * FROM events
WHERE event_type = 'LOGIN';
 
-- QUERY PLAN OUTPUT:
-- Append
--   ->  Seq Scan on events_2024_01 events_1
--         Filter: (event_type = 'LOGIN')
--   ->  Seq Scan on events_2024_02 events_2
--         Filter: (event_type = 'LOGIN')
--   ->  Seq Scan on events_2024_03 events_3
--         Filter: (event_type = 'LOGIN')
-- (No pruning - all partitions scanned)

Optimizing for Partition Pruning

To maximize fragment elimination: (1) Include the partition key in query predicates, (2) Compare the key directly using equality or range conditions, since wrapping it in a function or expression usually prevents pruning, (3) Avoid OR conditions spanning partitions when possible, (4) Create composite partition keys for multi-dimensional access patterns, and (5) Monitor slow query logs for queries scanning all partitions unexpectedly.

Summary: Horizontal Fragmentation Mastery

Horizontal fragmentation is the foundational technique for scaling relational data beyond single-node limits. Let's consolidate the key concepts:

Key Takeaways

• Formal Properties: Valid fragmentation requires completeness (no data loss), disjointness (no duplicates), and reconstruction (union reproduces original).
• Selection Predicates: Fragments are defined by mutually exclusive, collectively exhaustive predicates determining tuple membership.
• Primary vs. Derived: Primary fragmentation uses intrinsic attributes; derived fragmentation follows parent table fragmentation via semi-joins.
• Allocation Matters: Fragment placement affects query latency, availability, and compliance; replication trades extra storage and update cost for read performance and availability.
• Implementation Patterns: Range, hash, and list partitioning serve different access patterns; choose based on query workload analysis.
• Query Optimization: Partition pruning (fragment elimination) lets a query touch only the relevant fragments; design fragmentation to maximize elimination opportunities.

What's Next:

Horizontal fragmentation divides by rows—but some scenarios require dividing by columns. The next page explores Vertical Fragmentation, where tables are split by attribute groups to optimize different access patterns and enable more granular data distribution.

Page Complete

You now understand horizontal fragmentation at a deep, principled level—from its formal foundations through practical implementation patterns. This knowledge is essential for designing distributed database schemas that scale while maintaining data integrity and query performance.

1 / 5

Loading learning content...

Distributed DatabasesFragmentation

Data Fragmentation in Distributed Databases

LevelAdvanced

Duration90 mins

TopicFragmentation

1 / 5

Horizontal Fragmentation

Dividing Data Across the Globe

What You Will Learn

Formal Foundation of Horizontal Fragmentation

Definition: Given a relation R, a horizontal fragmentation of R produces fragments R₁, R₂, ..., Rₙ such that:

Completeness: Every tuple in R must belong to at least one fragment Rᵢ
Disjointness: Fragments are mutually exclusive; no tuple appears in multiple fragments
Reconstruction: The original relation R can be obtained by taking the union of all fragments

Mathematically:

Completeness: R = R₁ ∪ R₂ ∪ ... ∪ Rₙ
Disjointness: Rᵢ ∩ Rⱼ = ∅ for all i ≠ j
Reconstruction: R = ⋃ᵢ₌₁ⁿ Rᵢ

Why These Properties Matter

Selection Predicates:

Horizontal fragments are defined using selection predicates—Boolean expressions that determine fragment membership. Each fragment Rᵢ is defined as:

Rᵢ = σ(pᵢ)(R)

Where σ denotes the selection operation and pᵢ is the predicate for fragment i. For valid fragmentation, the predicates must be:

Mutually exclusive: pᵢ ∧ pⱼ = false for i ≠ j
Collectively exhaustive: p₁ ∨ p₂ ∨ ... ∨ pₙ = true

Example: Geographic Fragmentation

Consider a Customers table with a country attribute. A geographic fragmentation might define:

p₁: country IN ('US', 'Canada', 'Mexico') → Fragment: North America
p₂: country IN ('UK', 'Germany', 'France', ...) → Fragment: Europe
p₃: country IN ('Japan', 'China', 'India', ...) → Fragment: Asia Pacific
p₄: NOT (p₁ OR p₂ OR p₃) → Fragment: Rest of World

The fourth predicate is crucial—it catches all countries not explicitly assigned, ensuring completeness.

Fragmentation Property Violations and Consequences
Property Violated	Condition	Consequence	Detection Method
Completeness	Some tuples match no predicate	Data inaccessible, query results incomplete	COUNT() on union ≠ COUNT() on original
Disjointness	Some tuples match multiple predicates	Storage waste, update anomalies, delete failures	SUM of fragment counts > original count
Reconstruction	Union cannot recreate original	Schema mismatch, data corruption	Schema comparison, integrity constraint checks

Primary Horizontal Fragmentation

The Design Process:

Designing primary horizontal fragmentation requires analyzing:

Application Queries: What are the most frequent access patterns?
Data Distribution: How are attribute values distributed across the population?
Growth Patterns: How will data volumes evolve over time?
Locality Requirements: Where must data be physically located for performance or compliance?

Simple Predicates:

A simple predicate is an atomic Boolean condition of the form:

attribute θ value

Where θ ∈ {=, ≠, <, ≤, >, ≥}.

Examples of simple predicates:

order_date >= '2024-01-01'
status = 'ACTIVE'
amount > 10000

Minterm Predicates:

For m simple predicates, there are 2ᵐ possible minterms, though many may be contradictory (always false) and can be eliminated.

Example:

Given simple predicates:

p₁: status = 'ACTIVE'
p₂: balance > 1000

The four minterms are:

m₁: status = 'ACTIVE' AND balance > 1000 (Active, High Balance)
m₂: status = 'ACTIVE' AND balance ≤ 1000 (Active, Low Balance)
m₃: status ≠ 'ACTIVE' AND balance > 1000 (Inactive, High Balance)
m₄: status ≠ 'ACTIVE' AND balance ≤ 1000 (Inactive, Low Balance)

primary_fragmentation_example.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
-- Original Orders table schema
CREATE TABLE Orders (
    order_id        BIGINT PRIMARY KEY,
    customer_id     BIGINT NOT NULL,
    order_date      DATE NOT NULL,
    region          VARCHAR(50) NOT NULL,
    amount          DECIMAL(15, 2) NOT NULL,
    status          VARCHAR(20) NOT NULL
);
 
-- Define fragmentation predicates based on region
-- Fragment 1: North America
CREATE TABLE Orders_NorthAmerica AS
SELECT * FROM Orders
WHERE region IN ('United States', 'Canada', 'Mexico');
 
-- Fragment 2: Europe  
CREATE TABLE Orders_Europe AS
SELECT * FROM Orders
WHERE region IN ('United Kingdom', 'Germany', 'France', 
                 'Italy', 'Spain', 'Netherlands');
 
-- Fragment 3: Asia Pacific
CREATE TABLE Orders_AsiaPacific AS
SELECT * FROM Orders
WHERE region IN ('Japan', 'China', 'India', 'Australia', 
                 'Singapore', 'South Korea');
 
-- Fragment 4: Rest of World (catch-all)
CREATE TABLE Orders_Other AS
SELECT * FROM Orders
WHERE region NOT IN ('United States', 'Canada', 'Mexico',
                     'United Kingdom', 'Germany', 'France',
                     'Italy', 'Spain', 'Netherlands',
                     'Japan', 'China', 'India', 'Australia',
                     'Singapore', 'South Korea');
 
-- Verify completeness: count should match original
SELECT 
    (SELECT COUNT(*) FROM Orders_NorthAmerica) +
    (SELECT COUNT(*) FROM Orders_Europe) +
    (SELECT COUNT(*) FROM Orders_AsiaPacific) +
    (SELECT COUNT(*) FROM Orders_Other) AS fragment_total,
    (SELECT COUNT(*) FROM Orders) AS original_total;

Choosing the Fragmentation Attribute

Derived Horizontal Fragmentation

Motivation:

Formal Definition:

Let R₁, R₂, ..., Rₙ be the fragments of parent relation R. For child relation S with foreign key referencing R, the derived fragments of S are:

Sᵢ = S ⋉ Rᵢ

Where ⋉ denotes the semi-join operation. Each Sᵢ contains tuples from S that reference tuples in Rᵢ.

Link Relation:

The relation whose fragmentation determines the derived fragmentation is called the link relation or owner relation. The attribute used for the join is the link attribute.

Properties of Derived Fragmentation:

Join Locality: Joins between parent and child fragments can be performed locally at each site
Cascade Design: Multiple levels of derivation form a fragmentation tree
Dependency: Changes to parent fragmentation require re-fragmenting children
Orphan Handling: Child tuples referencing non-existent parents must be addressed

derived_fragmentation_example.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
-- Parent relation: Customers fragmented by region
-- Customers_NorthAmerica, Customers_Europe, Customers_AsiaPacific
 
-- Child relation: Orders
CREATE TABLE Orders (
    order_id        BIGINT PRIMARY KEY,
    customer_id     BIGINT NOT NULL REFERENCES Customers(customer_id),
    order_date      DATE NOT NULL,
    amount          DECIMAL(15, 2) NOT NULL,
    status          VARCHAR(20) NOT NULL
);
 
-- Derived fragmentation: Orders fragments align with Customers fragments
-- This uses a semi-join with the parent fragment
 
-- Fragment 1: Orders for North American customers
CREATE TABLE Orders_NorthAmerica AS
SELECT o.*
FROM Orders o
WHERE EXISTS (
    SELECT 1 FROM Customers_NorthAmerica c
    WHERE c.customer_id = o.customer_id
);
 
-- Fragment 2: Orders for European customers
CREATE TABLE Orders_Europe AS
SELECT o.*
FROM Orders o
WHERE EXISTS (
    SELECT 1 FROM Customers_Europe c
    WHERE c.customer_id = o.customer_id
);
 
-- Fragment 3: Orders for Asia Pacific customers
CREATE TABLE Orders_AsiaPacific AS
SELECT o.*
FROM Orders o
WHERE EXISTS (
    SELECT 1 FROM Customers_AsiaPacific c
    WHERE c.customer_id = o.customer_id
);
 
-- Now, local joins are possible without cross-site communication
-- Each site can locally execute:
SELECT c.customer_name, o.order_date, o.amount
FROM Customers_NorthAmerica c
JOIN Orders_NorthAmerica o ON c.customer_id = o.customer_id
WHERE o.amount > 1000;

Derived Fragmentation Advantages

•Join Locality — Parent-child joins execute locally, eliminating network overhead
•Query Performance — Common access patterns involving related tables are optimized
•Data Affinity — Related data stored together, improving cache utilization
•Transaction Locality — Transactions touching parent and child often complete locally
•Simpler Coordination — Reduces need for distributed join protocols

Derived Fragmentation Challenges

•Cascade Updates — Parent re-fragmentation requires child re-fragmentation
•Design Complexity — Deep hierarchies create complex dependency chains
•Cross-Fragment Queries — Queries not following the derivation path lose locality
•Flexibility Constraints — Child fragmentation fixed by parent decisions
•Orphan Management — Must handle children whose parents migrate

Fragment Allocation Strategies

Allocation Problem:

Given:

n fragments F₁, F₂, ..., Fₙ
m sites S₁, S₂, ..., Sₘ
Query workload Q = {q₁, q₂, ..., qₖ} with frequencies
Network topology and costs

Find an allocation A: fragments → sites that minimizes:

Total query response time
Data transfer costs
Storage costs

Subject to:

Storage capacity constraints
Availability requirements
Data sovereignty regulations

Allocation Strategies:

Non-replicated Allocation: Each fragment resides at exactly one site
- Minimizes storage but creates single points of failure
- Lower update costs but potential query bottlenecks
Replicated Allocation: Fragments may be duplicated across sites
- Improves read performance and availability
- Increases storage costs and update complexity
- Requires consistency protocols (discussed in the Replication module)
Partially Replicated Allocation: Strategic replication based on access patterns
- Hot fragments replicated more widely
- Cold fragments stored at single sites
- Balances performance, cost, and complexity

Allocation Strategy Comparison
Strategy	Read Performance	Write Complexity	Storage Cost	Availability
Non-replicated	May require remote access	Simple—single copy	Minimal	Low—SPOF per fragment
Fully Replicated	Always local	High—update all copies	Maximal	High—tolerates site failures while at least one copy stays reachable
Partial Replication	Usually local	Moderate—depends on degree	Moderate	Configurable per fragment

Allocation Heuristics:

Optimal allocation is NP-hard for non-trivial workloads. Practical systems employ heuristics:

Query-Based Allocation:
- Analyze query frequency and access patterns
- Place fragments where most frequently accessed
- Replicate to sites with high read demand
Capacity-Based Allocation:
- Consider storage and processing capacity
- Balance load across available sites
- Monitor and rebalance as workloads shift
Locality-Based Allocation:
- Place data near its producers/consumers
- Respect data sovereignty requirements
- Minimize cross-datacenter traffic
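
A query-based heuristic with a capacity check can be sketched in a few lines. This is a minimal greedy example under stated assumptions, not an optimal allocator: the access counts, fragment sizes, site capacities, and the functions `allocate` and `remote_reads` are all hypothetical, and replication is ignored.

```python
# Hypothetical reads per day of each fragment, broken down by originating site.
access = {
    "Orders_NorthAmerica": {"US-East": 900, "EU-Frankfurt": 50, "Singapore": 20},
    "Orders_Europe": {"US-East": 40, "EU-Frankfurt": 700, "Singapore": 10},
    "Orders_AsiaPacific": {"US-East": 30, "EU-Frankfurt": 20, "Singapore": 650},
}
# Hypothetical fragment sizes and site capacities, both in GB.
sizes = {"Orders_NorthAmerica": 400, "Orders_Europe": 300, "Orders_AsiaPacific": 350}
capacity = {"US-East": 500, "EU-Frankfurt": 500, "Singapore": 500}

def allocate(access, sizes, capacity):
    """Greedy non-replicated allocation: place each fragment at the site
    that reads it most often, falling back to the next-best site when the
    preferred one lacks free capacity."""
    remaining = dict(capacity)
    allocation = {}
    # Place the largest fragments first so they are not squeezed out.
    for frag in sorted(sizes, key=sizes.get, reverse=True):
        by_demand = sorted(access[frag], key=access[frag].get, reverse=True)
        for site in by_demand:
            if remaining[site] >= sizes[frag]:
                allocation[frag] = site
                remaining[site] -= sizes[frag]
                break
        else:
            raise ValueError(f"no site can hold {frag}")
    return allocation

def remote_reads(access, allocation):
    """Reads that must cross the network because the fragment lives elsewhere."""
    return sum(
        count
        for frag, per_site in access.items()
        for site, count in per_site.items()
        if site != allocation[frag]
    )

allocation = allocate(access, sizes, capacity)
print(allocation)
print(remote_reads(access, allocation))
```

A greedy pass like this gives no optimality guarantee. It only illustrates how access frequency and capacity interact when choosing a site.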

Example Allocation Decision:

For regional order fragments:

Orders_NorthAmerica → US-East datacenter (primary), US-West (replica)
Orders_Europe → EU-Frankfurt datacenter (primary), EU-London (replica)
Orders_AsiaPacific → Singapore datacenter (primary), Tokyo (replica)

This allocation:

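Keeps each fragment on sites in its own region (strict EU residency would require both copies on EU territory, and London is no longer inside the EU)
Provides regional read performance
Offers disaster recovery through a second datacenter for each fragment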

Implementation Patterns and Considerations

Translating theoretical horizontal fragmentation into production systems requires addressing several practical concerns.

Range-, Hash-, and List-Based Partitioning:

The fragmentation predicate implementation typically follows one of three patterns:

Range-Based Partitioning:
- Predicates define value ranges: date >= '2024-01-01' AND date < '2024-02-01'
- Supports range queries efficiently
- Prone to hotspots if ranges receive uneven traffic
- Common for time-series data
Hash-Based Partitioning:
- Predicates based on hash function: hash(customer_id) MOD n = i
- Distributes data roughly uniformly when the key has many distinct values
- Poor for range queries on the key (must scan all fragments)
- Excellent for point lookups
List-Based Partitioning:
- Explicit enumeration: region IN ('US', 'Canada')
- Full control over fragment membership
- Requires maintenance as new values appear
- Common for categorical attributes

partitioning_implementations.sql
-- PostgreSQL Declarative Partitioning
 
-- Range partitioning by date (time-series data)
CREATE TABLE events (
    event_id    BIGINT NOT NULL,
    event_time  TIMESTAMP NOT NULL,
    event_type  VARCHAR(50) NOT NULL,
    payload     JSONB
) PARTITION BY RANGE (event_time);
 
-- Create partitions for each month
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- Continue for additional months...
 
-- Hash partitioning for even distribution
CREATE TABLE users (
    user_id     BIGINT NOT NULL,
    email       VARCHAR(255) NOT NULL,
    created_at  TIMESTAMP DEFAULT NOW()
) PARTITION BY HASH (user_id);
 
-- Create 8 hash partitions
CREATE TABLE users_0 PARTITION OF users
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE users_1 PARTITION OF users
    FOR VALUES WITH (MODULUS 8, REMAINDER 1);
-- Continue for remainders 2-7...
 
-- List partitioning by region
CREATE TABLE orders (
    order_id    BIGINT NOT NULL,
    region      VARCHAR(20) NOT NULL,
    amount      DECIMAL(15, 2)
) PARTITION BY LIST (region);
 
CREATE TABLE orders_americas PARTITION OF orders
    FOR VALUES IN ('US', 'CA', 'MX', 'BR');
CREATE TABLE orders_emea PARTITION OF orders
    FOR VALUES IN ('UK', 'DE', 'FR', 'IT');
CREATE TABLE orders_apac PARTITION OF orders
    FOR VALUES IN ('JP', 'CN', 'IN', 'AU');
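
To make the three predicate styles concrete, the following Python sketch routes a key to a fragment name the way an application-level router might. It is an illustration under assumptions: the function names are hypothetical, the boundaries and value lists mirror the DDL above, and the SHA-256 based hash is a stand-in that does not match PostgreSQL's internal partition hash.

```python
import bisect
import hashlib
from datetime import date

# Range: each partition covers [lower bound, next lower bound).
RANGE_LOWER_BOUNDS = [date(2024, 1, 1), date(2024, 2, 1)]
RANGE_END = date(2024, 3, 1)  # exclusive upper bound of the last partition
RANGE_NAMES = ["events_2024_01", "events_2024_02"]

def route_range(day):
    """Binary-search the boundary list for the partition covering `day`."""
    if not (RANGE_LOWER_BOUNDS[0] <= day < RANGE_END):
        raise ValueError(f"no partition covers {day}")
    return RANGE_NAMES[bisect.bisect_right(RANGE_LOWER_BOUNDS, day) - 1]

def route_hash(user_id, modulus=8):
    """Stable hash of the key, then modulo the partition count.
    Assumption: SHA-256 is used only because it is deterministic across runs."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return f"users_{int.from_bytes(digest[:8], 'big') % modulus}"

# List: explicit enumeration of the values each partition accepts.
LIST_VALUES = {
    "orders_americas": {"US", "CA", "MX", "BR"},
    "orders_emea": {"UK", "DE", "FR", "IT"},
    "orders_apac": {"JP", "CN", "IN", "AU"},
}

def route_list(region):
    """Look the value up in each partition's list; unknown values are rejected."""
    for name, values in LIST_VALUES.items():
        if region in values:
            return name
    raise ValueError(f"no partition accepts region {region!r}")

print(route_range(date(2024, 2, 1)))
print(route_list("DE"))
print(route_hash(42) == route_hash(42))
```

The range and list routers raise an error for values that no partition covers, the same situation in which PostgreSQL rejects an insert when no DEFAULT partition exists.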

Critical Implementation Considerations

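•Partition Key Selection — In PostgreSQL and MySQL, the partition key must be part of every primary key and unique constraint on the table. Updating a key value moves the row to a different partition (PostgreSQL supports this from version 11), so choose stable, frequently filtered attributes.
•NULL Handling — Decide how NULL values are assigned. In PostgreSQL, a range-partitioned table routes NULL keys only to a DEFAULT partition, and a list partition can name NULL explicitly. MySQL places NULL in the lowest RANGE partition, requires a LIST partition that lists NULL, and treats NULL as 0 for HASH and KEY partitioning.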
•Cross-Partition Queries — Queries that cannot eliminate partitions scan all of them. Design fragmentation around actual query patterns.
•Partition Maintenance — For time-based partitions, automate partition creation (future) and archival (past). Most databases don't auto-create partitions.
•Foreign Keys — PostgreSQL supports foreign keys declared on partitioned tables since version 11 and foreign keys that reference partitioned tables since version 12. MySQL does not support foreign keys on partitioned InnoDB tables in either direction.
•Index Strategy — In PostgreSQL and MySQL, indexes are local to each partition. Neither offers global indexes spanning partitions, which is why a unique constraint must include the partition key.
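
Partition maintenance is usually scripted. As a minimal sketch, the Python function below generates the monthly `PARTITION OF` statements for a range-partitioned table. The function names are hypothetical, the table name is assumed to be a trusted identifier, and executing the statements (for example from a scheduled job) is left out.

```python
from datetime import date

def next_month(d):
    """First day of the month following `d`."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)

def monthly_partition_ddl(table, first, count):
    """Build CREATE TABLE statements for `count` consecutive monthly
    partitions, starting with the month that contains `first`.
    Assumption: `table` is a trusted identifier, not user input."""
    statements = []
    start = date(first.year, first.month, 1)
    for _ in range(count):
        end = next_month(start)
        name = f"{table}_{start:%Y_%m}"
        statements.append(
            f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {table}\n"
            f"    FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
        )
        start = end
    return statements

statements = monthly_partition_ddl("events", date(2024, 12, 15), 2)
for stmt in statements:
    print(stmt)
```

Creating partitions a few months ahead of the current date keeps inserts from failing when a new month begins before the job has run.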

Query Processing with Horizontal Fragments

Fragment Elimination:

The query optimizer analyzes WHERE clause predicates against fragment definitions. If a predicate contradicts a fragment's definition, that fragment can be skipped entirely.

Example:

Given fragments:

Orders_2024_Q1: order_date FROM '2024-01-01' TO '2024-04-01'
Orders_2024_Q2: order_date FROM '2024-04-01' TO '2024-07-01'
Orders_2024_Q3: order_date FROM '2024-07-01' TO '2024-10-01'

For query:

SELECT * FROM Orders WHERE order_date = '2024-05-15'

Only Orders_2024_Q2 needs to be accessed—the other fragments are eliminated.
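
The elimination test for range fragments reduces to an interval-overlap check. The sketch below is a simplified model, not an optimizer: `prune` is a hypothetical helper, each fragment covers the half-open range [start, end), and the query range [lo, hi] is treated as closed so that an equality predicate is the case lo = hi.

```python
from datetime import date

# Fragment definitions from the example: [start, end) per fragment.
FRAGMENTS = {
    "Orders_2024_Q1": (date(2024, 1, 1), date(2024, 4, 1)),
    "Orders_2024_Q2": (date(2024, 4, 1), date(2024, 7, 1)),
    "Orders_2024_Q3": (date(2024, 7, 1), date(2024, 10, 1)),
}

def prune(fragments, lo, hi):
    """Return the fragments whose [start, end) range overlaps the closed
    query range [lo, hi]; all other fragments contradict the predicate."""
    return [
        name
        for name, (start, end) in fragments.items()
        if start <= hi and lo < end
    ]

# Equality predicate: order_date = '2024-05-15'
print(prune(FRAGMENTS, date(2024, 5, 15), date(2024, 5, 15)))
# Range predicate spanning a fragment boundary
print(prune(FRAGMENTS, date(2024, 3, 15), date(2024, 4, 15)))
```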

Localization Program:

The distributed query processor transforms global queries into fragment-specific subqueries through a localization program:

Replace global relation with union of fragments
Apply algebraic simplification (push selections)
Eliminate fragments that return empty sets
Optimize remaining fragment accesses

Query Rewriting:

A global query on Orders is rewritten as:

-- Original
SELECT * FROM Orders WHERE region = 'US' AND amount > 1000;

-- After localization (fragments by region)
SELECT * FROM Orders_NorthAmerica WHERE region = 'US' AND amount > 1000;
-- Other fragment queries eliminated entirely
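
The four localization steps can be traced in a short Python sketch for the region-based rewrite shown above. Everything here is an assumption made for illustration: the mapping of region codes to fragments, the `localize` function, and the string-built SQL, which a real system would replace with parameterized queries.

```python
# Hypothetical fragment definitions: the region values each fragment holds.
FRAGMENT_REGIONS = {
    "Orders_NorthAmerica": {"US", "CA", "MX"},
    "Orders_Europe": {"UK", "DE", "FR"},
    "Orders_AsiaPacific": {"JP", "SG", "AU"},
}

def localize(region, min_amount):
    """Rewrite a global query on Orders into per-fragment subqueries."""
    # Step 1: replace the global relation with the union of all fragments.
    branches = list(FRAGMENT_REGIONS)
    # Steps 2-3: push the selection into each branch and drop every branch
    # whose defining predicate contradicts region = <region>.
    survivors = [f for f in branches if region in FRAGMENT_REGIONS[f]]
    # Step 4: emit one subquery per surviving fragment.
    # Illustration only: production code should bind parameters instead.
    return [
        f"SELECT * FROM {f} WHERE region = '{region}' AND amount > {min_amount}"
        for f in survivors
    ]

for subquery in localize("US", 1000):
    print(subquery)
```

A predicate on a non-fragmentation attribute, such as amount alone, would leave all three branches in place, which is the cross-fragment case described earlier.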

partition_pruning_explain.sql
-- Demonstrating partition pruning with EXPLAIN
 
-- Time-partitioned table
CREATE TABLE events (
    id          BIGSERIAL,
    occurred_at TIMESTAMP NOT NULL,
    event_type  VARCHAR(50),
    data        JSONB
) PARTITION BY RANGE (occurred_at);
 
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
CREATE TABLE events_2024_03 PARTITION OF events
    FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');
 
-- Query targeting specific date range
EXPLAIN (ANALYZE, COSTS OFF)
SELECT * FROM events
WHERE occurred_at >= '2024-02-10'
  AND occurred_at < '2024-02-20';
 
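-- QUERY PLAN OUTPUT (abridged; ANALYZE timing details omitted):
-- Append
--   ->  Seq Scan on events_2024_02 events_1
--         Filter: (occurred_at >= '2024-02-10' AND occurred_at < '2024-02-20')
-- (Partitions events_2024_01 and events_2024_03 are PRUNED.
--  PostgreSQL 12+ omits the Append node when only one partition remains.)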
 
-- Query without partition filter - scans ALL partitions
EXPLAIN (ANALYZE, COSTS OFF)
SELECT * FROM events
WHERE event_type = 'LOGIN';
 
-- QUERY PLAN OUTPUT:
-- Append
--   ->  Seq Scan on events_2024_01 events_1
--         Filter: (event_type = 'LOGIN')
--   ->  Seq Scan on events_2024_02 events_2
--         Filter: (event_type = 'LOGIN')
--   ->  Seq Scan on events_2024_03 events_3
--         Filter: (event_type = 'LOGIN')
-- (No pruning - all partitions scanned)

Summary: Horizontal Fragmentation Mastery

Horizontal fragmentation is the foundational technique for scaling relational data beyond single-node limits. Let's consolidate the key concepts:

Key Takeaways

•Formal Properties — Valid fragmentation requires completeness (no data loss), disjointness (no duplicates), and reconstruction (union reproduces original).
•Selection Predicates — Fragments are defined by mutually exclusive, collectively exhaustive predicates determining tuple membership.
•Primary vs. Derived — Primary fragmentation uses intrinsic attributes; derived fragmentation follows parent table fragmentation via semi-joins.
•Allocation Matters — Fragment placement impacts query latency, availability, and compliance. Replication spends extra storage and update effort to gain read performance and availability.
•Implementation Patterns — Range, hash, and list partitioning serve different access patterns; choose based on query workload analysis.
•Query Optimization — Partition pruning/fragment elimination enables accessing only relevant fragments—design fragmentation to maximize elimination opportunities.
