A logically perfect schema can be devastatingly slow in production. The elegant normalized tables, the carefully designed constraints, the meticulously documented relationships—all of this means nothing if queries take minutes instead of milliseconds. Physical design is where we bridge the gap between logical correctness and real-world performance.
Physical design operates at the intersection of database theory and computer architecture. It acknowledges that data lives on physical storage devices with specific characteristics: disks have seek times, memory has limited capacity, CPUs have cache hierarchies. Ignoring these realities produces databases that are theoretically beautiful but practically unusable.
This phase is also where the expertise of experienced database architects becomes most visible. Two architects given the same logical schema might produce physical designs with 100x performance differences. The difference lies in understanding access patterns, anticipating growth, and making informed trade-offs between competing concerns.
By the end of this page, you will master the physical design process: storage structure selection, indexing strategies and tradeoffs, partitioning schemes for large tables, denormalization decisions, capacity planning, and performance considerations. You will develop the judgment to transform correct schemas into fast schemas.
Physical design determines how the logical schema is implemented on physical storage. It introduces new structures (indexes, partitions) and may modify the logical schema itself (denormalization) to meet performance requirements.
Formal Definition:
Physical database design is the process of producing a description of the implementation of the database on secondary storage: the storage structures and access methods used to achieve efficient access to the data. It involves defining base tables, file organizations, indexes, integrity constraints, and security measures.
Physical Design Scope:
Physical design covers base table definition, file organization and storage structures, indexing, partitioning, denormalization decisions, integrity constraint implementation, and capacity planning.
The Performance Triangle:
Physical design constantly balances three competing concerns:
         Query Performance
               /\
              /  \
             /    \
            /      \
           /________\
Storage Efficiency    Write Performance
Trade-off Examples:
- Adding an index speeds reads but consumes storage and slows every INSERT, UPDATE, and DELETE.
- Denormalization removes join costs but duplicates data, increasing storage use and write complexity.
- Compression saves storage but adds CPU overhead to reads and writes.
Input to Physical Design:
| Input Category | Specific Information | Source |
|---|---|---|
| Workload Profile | Query patterns, transaction mix, read/write ratio | Requirements, logs, benchmarks |
| Volume Estimates | Current row counts, growth projections | Stakeholder interviews, historical data |
| Performance Targets | Response time SLAs, throughput requirements | NFR requirements specification |
| Hardware Profile | CPU, memory, storage type (SSD/HDD), network | Infrastructure team |
| Availability Requirements | Uptime SLA, maintenance windows | Operations requirements |
| Budget Constraints | Storage budget, licensing costs | Project management |
Never design physical structures without understanding the workload. A table used primarily for batch reporting needs different optimization than one handling thousands of real-time transactions per second. Collect or estimate query patterns, frequency, and criticality before making physical design decisions.
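On PostgreSQL, the pg_stat_statements extension is one way to collect this workload evidence. A minimal sketch, assuming the extension is installed and the server runs PostgreSQL 13 or later (where the timing columns are named total_exec_time and mean_exec_time):
-- Top 10 queries by cumulative execution time
SELECT query,
       calls,                 -- how often this statement ran
       mean_exec_time,        -- average milliseconds per call
       total_exec_time        -- cumulative milliseconds across all calls
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;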
The underlying storage structure determines how rows are physically arranged on disk. This affects full table scan performance, clustering of related rows, and index effectiveness.
Primary Storage Structures:
| Structure | Description | Best For | Avoid For |
|---|---|---|---|
| Heap | Rows stored in insertion order, no organization | High-volume inserts, staging tables | Range queries, ordered access |
| Clustered (B-tree organized) | Rows physically ordered by key | Range queries on key, ordered retrieval | High insert volume, random writes |
| Hash organized | Rows placed by hash of key | Equality lookups | Range queries, ordered access |
| Column store | Columns stored separately | Analytics, aggregations, wide tables | OLTP, row-level operations |
| Index-organized (IOT) | Entire row stored in index leaf | Small rows, primarily key lookups | Large rows, many secondary indexes |
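Several of these organizations are declared when the table is created. As one illustration, Oracle declares an index-organized table with ORGANIZATION INDEX; this is a sketch with hypothetical columns:
-- Oracle IOT: the entire row is stored in the B-tree leaf blocks
CREATE TABLE country_codes (
    code VARCHAR2(2) PRIMARY KEY,  -- an IOT requires a primary key
    name VARCHAR2(100)
) ORGANIZATION INDEX;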
Heap vs. Clustered Trade-offs:
Heap Tables (No Clustering):
-- Most DBMS default; explicit in some
CREATE TABLE log_entries (
entry_id BIGINT PRIMARY KEY,
timestamp TIMESTAMP,
message TEXT
); -- Heap by default
Characteristics:
- Rows are appended wherever free space exists, so inserts are fast and cheap.
- No physical ordering: range queries and ordered access need separate indexes or full scans.
- Well suited to append-heavy workloads such as logging and staging tables.
Clustered Tables (B-Tree Organized):
-- SQL Server clustered index
CREATE CLUSTERED INDEX idx_orders_date
ON orders(order_date);
-- PostgreSQL: physically reorder table by an index (one-time operation;
-- the order is not maintained for new rows, so re-run periodically)
CLUSTER orders USING idx_orders_date;
Characteristics:
- Rows are stored in key order, so range queries and ordered retrieval read contiguous pages.
- Inserts in random key order cause page splits, hurting high-volume write workloads.
- Only one physical ordering is possible per table, so choose the clustering key carefully.
Column Store for Analytics:
Column stores organize data by column rather than row, dramatically improving analytical query performance:
Row Store (Traditional):
Page 1: [Row1: id=1, name='Alice', age=30, salary=50000]
[Row2: id=2, name='Bob', age=25, salary=45000]
...
Column Store:
id column: [1, 2, 3, 4, ...]
name column: ['Alice', 'Bob', 'Carol', 'Dave', ...]
age column: [30, 25, 35, 28, ...]
salary column: [50000, 45000, 60000, 55000, ...]
Column Store Advantages:
- Queries read only the columns they reference: SELECT AVG(salary) FROM employees WHERE age > 30 touches just the age and salary columns, skipping the rest of each row.
- Values within a column are homogeneous and repetitive, so they compress very well.
- Aggregations scan long runs of a single data type, which suits CPU caches and vectorized execution.
Column Store Disadvantages:
- Reconstructing a full row means stitching values together from every column, making row-level reads expensive.
- Single-row inserts, updates, and deletes are costly, so pure column stores fit OLTP workloads poorly.
Modern databases often support hybrid approaches. PostgreSQL's cstore_fdw extension adds columnar tables. Oracle's dual-format Database In-Memory keeps on-disk data row-based (for OLTP) while maintaining an in-memory columnar copy (for analytics). SQL Server's columnstore indexes add columnar organization on top of a row-based table.
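As a concrete sketch of the SQL Server hybrid approach, a nonclustered columnstore index can be layered over selected columns of a row-based table (the employees table and its columns here are hypothetical):
-- SQL Server: columnar copy of the analytic columns only;
-- the base table stays row-organized for OLTP access
CREATE NONCLUSTERED COLUMNSTORE INDEX idx_employees_analytics
ON employees (age, salary, department_id);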
Indexes are the most powerful tool for query optimization—and the most commonly misused. An index is an auxiliary data structure that provides fast access to rows based on column values, trading storage space and write performance for read performance.
Index Types:
B-Tree (Balanced Tree) Index:
The default and most versatile index type. A self-balancing tree structure that maintains sorted order of indexed values.
CREATE INDEX idx_customers_email
ON customers(email);
CREATE INDEX idx_orders_customer_date
ON orders(customer_id, order_date DESC);
Supports:
- Equality lookups: WHERE email = 'x@example.com'
- Range queries: WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31'
- Prefix matching: WHERE name LIKE 'Smith%'
Composite Index Column Order:
For multi-column indexes, order matters. The index is sorted by first column, then by second within first, etc.
CREATE INDEX idx_orders_customer_date
ON orders(customer_id, order_date);
This index supports:
- WHERE customer_id = 123 ✓
- WHERE customer_id = 123 AND order_date = '2024-01-15' ✓
- WHERE customer_id = 123 AND order_date > '2024-01-01' ✓
- WHERE order_date = '2024-01-15' ✗ (index not used efficiently)
Rule: Put equality columns first, range columns last.
Covering Indexes:
A covering index includes all columns needed by a query, eliminating the need to access the base table:
-- Query:
-- SELECT customer_id, order_date, total
-- FROM orders WHERE customer_id = 123;
-- Covering index (PostgreSQL 11+ / SQL Server INCLUDE syntax):
CREATE INDEX idx_orders_covering
ON orders(customer_id) INCLUDE (order_date, total);
The query can be satisfied entirely from the index—no table access needed. This is particularly valuable for:
- Hot, frequently executed queries, where the per-execution savings compound
- Wide tables, where skipping the base-table lookup avoids reading large rows
- Reducing random I/O when matching rows are scattered across many pages
Index Selection Guidelines:
Every index has costs: storage space, slower INSERT/UPDATE/DELETE operations, and maintenance overhead. Over-indexing is as harmful as under-indexing. A good rule: start with essential indexes (PKs, FKs, obvious query columns), then add more based on actual query performance monitoring. Drop unused indexes regularly.
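On PostgreSQL, unused indexes can be spotted from the statistics collector. A minimal sketch (idx_scan counts how often each index has been used since statistics were last reset, so check over a representative period):
-- Indexes never scanned since the last stats reset
SELECT relname       AS table_name,
       indexrelname  AS index_name,
       idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname;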
Partitioning divides large tables into smaller, more manageable pieces. From the application perspective, the table remains a single logical entity; physically, it comprises multiple partitions that can be managed, optimized, and queried independently.
Why Partition?
- Partition pruning: queries filtering on the partition key scan only the relevant partitions.
- Manageability: backups, index rebuilds, and statistics can be handled per partition.
- Data lifecycle: old data can be archived or dropped by detaching or dropping a partition rather than deleting rows.
- Parallelism: partitions can be scanned and maintained in parallel.
Partitioning Strategies:
| Strategy | How Data is Distributed | Best Use Cases | Example |
|---|---|---|---|
| Range | By value ranges | Time-series, date-based data, sequential IDs | Orders by year-month |
| List | By explicit value lists | Categorical data, regional data | Customers by country |
| Hash | By hash of partition key | Even distribution, no natural range | Transactions by account_id |
| Composite | Combination of above | Complex access patterns | Orders: range by date, list by region |
-- Range Partitioning by Date (PostgreSQL)
CREATE TABLE orders (
    order_id BIGINT,
    customer_id INTEGER,
    order_date DATE,
    total DECIMAL(10,2)
) PARTITION BY RANGE (order_date);

-- Create partitions for each month
CREATE TABLE orders_2024_01 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

CREATE TABLE orders_2024_02 PARTITION OF orders
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Query with partition pruning:
-- Only scans orders_2024_01 partition
SELECT * FROM orders WHERE order_date = '2024-01-15';

-- List Partitioning by Region
CREATE TABLE customers (
    customer_id INTEGER,
    name VARCHAR(100),
    region VARCHAR(20)
) PARTITION BY LIST (region);

CREATE TABLE customers_americas PARTITION OF customers
    FOR VALUES IN ('US', 'CA', 'MX', 'BR');

CREATE TABLE customers_emea PARTITION OF customers
    FOR VALUES IN ('UK', 'DE', 'FR', 'AE');

CREATE TABLE customers_apac PARTITION OF customers
    FOR VALUES IN ('JP', 'AU', 'SG', 'IN');

-- Hash Partitioning for Distribution
CREATE TABLE transactions (
    transaction_id BIGINT,
    account_id INTEGER,
    amount DECIMAL(12,2)
) PARTITION BY HASH (account_id);

-- Create 8 hash partitions for parallel processing
CREATE TABLE transactions_0 PARTITION OF transactions
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
-- ... repeat for remainder 1-7

Partition Key Selection:
The effectiveness of partitioning depends entirely on choosing the right partition key:
| Criterion | Guidance |
|---|---|
| Query patterns | Choose key that appears in WHERE clauses of most queries |
| Cardinality | Key should have enough distinct values for meaningful distribution |
| Even distribution | Avoid keys that create skewed partitions (99% of data in one partition) |
| Maintenance | Consider how data will age—date keys enable easy archival |
| Growth | Ensure strategy accommodates future data volumes |
Partition Maintenance:
-- Adding new partitions (before data arrives)
CREATE TABLE orders_2024_03 PARTITION OF orders
FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');
-- Detaching old partitions for archival
ALTER TABLE orders DETACH PARTITION orders_2022_01;
-- Now orders_2022_01 is standalone table, can be archived/dropped
-- Dropping old data (instant, no row-by-row delete)
DROP TABLE orders_2020_01;
Always verify that queries benefit from partition pruning using EXPLAIN. If queries scan all partitions, the partitioning key doesn't match query patterns. Adjust the partitioning strategy or ensure queries include the partition key in WHERE clauses.
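A quick sanity check in PostgreSQL (exact plan output varies by version and configuration):
-- If pruning works, the plan should reference only orders_2024_01;
-- if every monthly partition appears, the key doesn't match the query
EXPLAIN SELECT * FROM orders WHERE order_date = '2024-01-15';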
Denormalization is the deliberate introduction of redundancy to improve query performance. It reverses normalization decisions when the cost of joins exceeds the cost of maintaining redundant data.
Important: Denormalization is not poor design—it is an optimization applied after normalization. Never skip normalization and call it denormalization.
When to Denormalize:
- Critical queries are measurably dominated by join costs, and indexing alone cannot meet performance targets.
- The workload is read-heavy, so the write-side cost of maintaining redundant data is amortized over many reads.
- Alternatives such as materialized views, read replicas, and caching have been evaluated and found insufficient.
Denormalization Techniques:
1. Redundant Columns:
Copy frequently accessed columns from related tables:
-- Normalized:
SELECT o.order_id, o.order_date,
c.name AS customer_name
FROM orders o
JOIN customers c
ON o.customer_id = c.customer_id;
-- Denormalized:
-- Add customer_name to orders table
ALTER TABLE orders
ADD COLUMN customer_name VARCHAR(100);
-- Now:
SELECT order_id, order_date, customer_name
FROM orders; -- No join needed
Maintenance: Trigger or application must update customer_name when customer name changes.
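A minimal sketch of such a trigger in PostgreSQL 11+ (function and trigger names are illustrative; production code would also handle changes to customer_id):
-- Propagate name changes to the denormalized copy in orders
CREATE OR REPLACE FUNCTION sync_customer_name() RETURNS trigger AS $$
BEGIN
    UPDATE orders
    SET customer_name = NEW.name
    WHERE customer_id = NEW.customer_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_sync_customer_name
AFTER UPDATE OF name ON customers
FOR EACH ROW
WHEN (OLD.name IS DISTINCT FROM NEW.name)  -- skip no-op updates
EXECUTE FUNCTION sync_customer_name();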
2. Pre-computed Aggregates:
Store aggregated values instead of computing:
-- Computing on-demand (slow):
SELECT customer_id,
COUNT(*) as order_count,
SUM(total) as total_spent
FROM orders
GROUP BY customer_id;
-- Pre-computed in customer table:
ALTER TABLE customers ADD COLUMN
order_count INTEGER DEFAULT 0;
ALTER TABLE customers ADD COLUMN
total_spent DECIMAL(12,2) DEFAULT 0;
-- Now:
SELECT customer_id, order_count, total_spent
FROM customers; -- Instant
Maintenance: Trigger updates aggregates on order insert/update/delete.
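A minimal sketch of that maintenance logic in PostgreSQL (names are illustrative; it handles INSERT and DELETE only, and an UPDATE of total would need to adjust by the difference between OLD and NEW):
CREATE OR REPLACE FUNCTION maintain_order_aggregates() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE customers
        SET order_count = order_count + 1,
            total_spent = total_spent + NEW.total
        WHERE customer_id = NEW.customer_id;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE customers
        SET order_count = order_count - 1,
            total_spent = total_spent - OLD.total
        WHERE customer_id = OLD.customer_id;
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_order_aggregates
AFTER INSERT OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION maintain_order_aggregates();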
3. Merged Tables:
Combine frequently joined 1:1 related tables:
-- Normalized (2 tables):
employees(emp_id, name, hire_date, dept_id)
employee_details(emp_id, emergency_contact, tax_info, photo_url)
-- Denormalized (1 table):
employees(emp_id, name, hire_date, dept_id,
emergency_contact, tax_info, photo_url)
4. Summary Tables (Materialized Views):
Pre-computed query results refreshed periodically:
-- Materialized view for dashboard
CREATE MATERIALIZED VIEW daily_sales_summary AS
SELECT DATE(order_date) as day,
COUNT(*) as order_count,
SUM(total) as revenue,
AVG(total) as avg_order_value
FROM orders
GROUP BY DATE(order_date);
-- Refresh strategy
REFRESH MATERIALIZED VIEW daily_sales_summary;
-- Or: CONCURRENTLY for no locks
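Note that in PostgreSQL, REFRESH MATERIALIZED VIEW CONCURRENTLY requires a unique index on the view:
-- A unique index enables non-blocking refresh
CREATE UNIQUE INDEX idx_daily_sales_day ON daily_sales_summary (day);
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales_summary;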
Denormalization creates data anomaly risks. Redundant data can become inconsistent if update logic fails. Document all denormalization decisions, implement reliable maintenance mechanisms (triggers, application logic), and monitor for data inconsistencies. Consider read replicas or caching as alternatives before denormalizing.
Physical design must account for not just current data volumes but projected growth. A schema that works perfectly at 1GB may completely collapse at 100GB if capacity wasn't considered.
Estimating Table Sizes:
-- Step 1: Calculate average row size
-- Sum all column sizes (including overhead)
/*
orders table:
  order_id     BIGINT          8 bytes
  customer_id  INTEGER         4 bytes
  order_date   DATE            4 bytes
  status       VARCHAR(20)    ~12 bytes (avg) + 4 overhead
  total        DECIMAL(10,2)   8 bytes
  created_at   TIMESTAMP       8 bytes
  Row overhead (PostgreSQL)   ~24 bytes
  -----------------------------------------
  Estimated row size: ~72 bytes
*/

-- Step 2: Estimate row counts
/*
Current orders: 10,000,000
Annual growth rate: 50%
5-year projection: 10M * (1.5)^5 = 76M rows
*/

-- Step 3: Calculate raw table size
-- 76M rows × 72 bytes = 5.5 GB

-- Step 4: Add index overhead
/*
Primary key index (B-tree):    ~20 bytes/row = 1.5 GB
FK index (customer_id):        ~12 bytes/row = 0.9 GB
Covering index (status, date): ~30 bytes/row = 2.3 GB
-----------------------------------------
Index total: 4.7 GB
*/

-- Step 5: Total storage estimate
-- Table: 5.5 GB + Indexes: 4.7 GB + 20% overhead = 12.2 GB

-- Step 6: Apply for all tables, aggregate
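Estimates should be checked against reality once data exists. On PostgreSQL, the actual on-disk size of a table (including its indexes and TOAST data) can be read directly:
-- Actual total size of the orders table and everything attached to it
SELECT pg_size_pretty(pg_total_relation_size('orders'));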
Growth Model Considerations:
| Growth Pattern | Examples | Planning Approach |
|---|---|---|
| Linear | Regulatory filings, employee records | Estimate constant additions/year |
| Exponential | User-generated content, IoT sensor data | Apply compound growth rate |
| Seasonal | E-commerce, tax software | Plan for peak capacity + buffer |
| Episodic | Marketing campaigns, product launches | Reserve capacity for events |
| Declining | Legacy system data, superseded products | Plan for archival, not growth |
Capacity Planning Checklist:
- Estimate row size and projected row counts for every large table, including index overhead.
- Apply the growth model that fits each table over a 3-5 year horizon.
- Reserve headroom for temporary space: sorts, index rebuilds, and backups.
- Define archival paths for aging data before capacity is exhausted.
Physical infrastructure decisions are expensive to change. Plan for at least 3-5 years of growth. It's cheaper to over-provision initially than to perform emergency migrations when capacity is exhausted. Consider cloud elasticity if growth is unpredictable.
Physical design must consider the full spectrum of performance factors: query execution, write efficiency, concurrency, and operational overhead.
Query Performance Factors:
| Performance Issue | Indicators | Physical Design Solutions |
|---|---|---|
| Full table scans | High I/O, long query times | Appropriate indexes, partition pruning |
| Expensive joins | High CPU, memory pressure | FK indexes, denormalization, covering indexes |
| Sorting overhead | temp file usage, slow ORDER BY | Indexes matching sort order, more memory |
| Lock contention | Waits, deadlocks | Better isolation levels, row-level locking, partition by access pattern |
| Buffer pool misses | High disk reads | Larger buffer pool, hot data partitioning |
Write Performance Optimization:
Indexes and constraints that improve read performance often hurt write performance:
- Every index must be updated on each INSERT, UPDATE, and DELETE.
- Constraint checks (primary key, foreign key, unique, check) add per-row validation cost.
- Triggers that maintain denormalized data multiply the work done by each write.
Batch Insert Optimization:
-- Slow: single row inserts with constraint checks
INSERT INTO orders VALUES (...);
INSERT INTO orders VALUES (...);
-- ... 10,000 times
-- Fast: batch insert with deferred constraints
BEGIN;
SET CONSTRAINTS ALL DEFERRED; -- affects only constraints declared DEFERRABLE
COPY orders FROM '/data/orders.csv'; -- Bulk load
COMMIT; -- Deferred constraints checked once, at commit
Index Maintenance Strategies:
- Drop or disable nonessential indexes before large bulk loads, then rebuild them afterward.
- Rebuild or reorganize fragmented indexes during maintenance windows.
- Monitor index usage statistics and drop indexes that are never used.
Compression Trade-offs:
-- PostgreSQL: TOAST compression (automatic for large values)
-- Row-level compression in table

-- SQL Server: Page and row compression
ALTER TABLE orders REBUILD WITH (DATA_COMPRESSION = PAGE);
-- PAGE compression: Better ratio, more CPU overhead
-- ROW compression: Less ratio, less CPU overhead

-- Column store compression (typically 10:1 ratio)
-- Best for analytics, cold data

-- Compression Decision Matrix:
/*
| Scenario              | Recommendation       |
|-----------------------|----------------------|
| OLTP, hot data        | Row or none          |
| OLTP, warm data       | Page compression     |
| Analytics, cold data  | Column store         |
| Archive               | Maximum compression  |
| High write rate       | Avoid compression    |
*/

Never deploy physical design decisions without benchmarking with realistic data volumes and query patterns. What performs well at 10,000 rows may behave completely differently at 10,000,000. Use representative test data and production-like hardware for final validation.
Physical design transforms logically correct schemas into performant production systems. It requires understanding hardware characteristics, workload patterns, and the trade-offs between competing optimization goals.
What's Next:
With physical design complete, we reach the final phase: implementation. Here, we translate all our design work into actual database objects, deploy to production environments, load initial data, and establish ongoing operational procedures.
The next page explores implementation: DDL script development, deployment strategies, data migration, testing, and the transition from design artifacts to running database systems.
You now understand physical design as the bridge between logical correctness and production performance. You can select storage structures, design indexing strategies, implement partitioning, make informed denormalization decisions, plan for capacity growth, and optimize for specific performance goals. Next, we'll bring everything together in the implementation phase.