While individual query analysis tells you what could be indexed, workload analysis tells you what should be indexed. A workload is the complete picture of database activity: all queries, their frequencies, their timing, their interactions, and their collective resource consumption.
Why Workload Analysis Matters:
Optimizing individual queries in isolation can backfire: each new index speeds one query but taxes every write to its table, and a few dozen locally sensible indexes can collectively cripple write throughput, exhaust storage, and crowd working data out of the buffer pool.
Workload analysis provides the holistic view necessary for strategic index decisions.
By the end of this page, you will understand how to characterize database workloads, identify temporal patterns, perform trade-off analysis across competing requirements, and build a workload model that guides long-term indexing strategy.
A workload characterization captures the essential properties of database activity in a structured format that enables systematic analysis. Think of it as creating a profile or fingerprint of your database usage.
| Dimension | What to Measure | Indexing Relevance |
|---|---|---|
| Query Mix | % SELECT / INSERT / UPDATE / DELETE | Determines read vs. write optimization priority |
| Query Frequency | Executions per second/minute for each query type | Identifies high-impact optimization targets |
| Data Access Patterns | Sequential vs. random; point vs. range queries | Influences index type selection (B+-tree, hash, etc.) |
| Temporal Patterns | Peak hours, daily cycles, monthly surges | May require different strategies for different periods |
| Concurrency | Active connections, transactions per second | Affects lock contention considerations |
| Data Characteristics | Table sizes, growth rates, cardinality | Impacts index storage and maintenance costs |
| Resource Constraints | Memory, CPU, disk I/O limits | Constrains what indexing is feasible |
| SLA Requirements | Latency targets, throughput requirements | Defines optimization goals |
Building a Workload Profile:
A formal workload profile documents your findings in a structured format. Here's a template for capturing workload characteristics:
```json
{
  "workload_name": "E-commerce Order Processing",
  "analysis_period": "2024-01-01 to 2024-01-31",
  "environment": "Production - Primary",
  "summary_metrics": {
    "total_queries_per_day": 12500000,
    "read_write_ratio": "92:8",
    "peak_queries_per_second": 850,
    "average_concurrent_connections": 45,
    "peak_concurrent_connections": 180
  },
  "tables": [
    {
      "name": "orders",
      "rows": 45000000,
      "size_gb": 12.5,
      "daily_inserts": 15000,
      "daily_updates": 45000,
      "daily_deletes": 500,
      "growth_rate_percent_monthly": 3.2
    }
  ],
  "query_classes": [
    {
      "class_id": "Q1",
      "description": "Get order by ID",
      "pattern": "SELECT * FROM orders WHERE id = ?",
      "frequency_per_day": 3500000,
      "average_latency_ms": 2.1,
      "target_latency_ms": 5,
      "status": "MEETS_SLA"
    },
    {
      "class_id": "Q2",
      "description": "Recent orders for customer",
      "pattern": "SELECT * FROM orders WHERE customer_id = ? ORDER BY created_at DESC LIMIT 20",
      "frequency_per_day": 850000,
      "average_latency_ms": 45,
      "target_latency_ms": 20,
      "status": "EXCEEDS_SLA"
    }
  ],
  "temporal_patterns": {
    "peak_hours": ["09:00-12:00", "14:00-17:00", "19:00-22:00"],
    "low_activity": ["02:00-06:00"],
    "monthly_spike": "Last 3 days of month",
    "batch_windows": ["03:00-05:00"]
  }
}
```

When building a workload profile, explicitly document your assumptions and data sources. "Peak QPS is 850" is less useful than "Peak QPS is 850 (measured on 2024-01-15, holiday sale event, may not reflect normal operations)." This context is crucial for decision-making.
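A profile in this structured format is easy to process programmatically. As a minimal sketch (the function name `sla_violations` is illustrative; the field names follow the template above), here is how you might flag query classes that miss their latency targets:

```python
import json

def sla_violations(profile: dict) -> list[str]:
    """Return class_ids of query classes whose measured average
    latency exceeds the target latency recorded in the profile."""
    return [
        q["class_id"]
        for q in profile.get("query_classes", [])
        if q["average_latency_ms"] > q["target_latency_ms"]
    ]

# Abbreviated profile using the same schema as the template above
profile = json.loads("""
{
  "query_classes": [
    {"class_id": "Q1", "average_latency_ms": 2.1, "target_latency_ms": 5},
    {"class_id": "Q2", "average_latency_ms": 45, "target_latency_ms": 20}
  ]
}
""")
print(sla_violations(profile))  # ['Q2']
```

Checks like this can run in CI against a committed profile, so SLA regressions surface as soon as the profile is refreshed.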
The read/write ratio is the single most important metric for indexing strategy. It determines whether you should optimize aggressively for reads (add indexes) or conservatively (minimize indexes).
Measuring Read/Write Ratio:
```sql
-- Per-table read/write analysis
SELECT
    relname AS table_name,
    seq_scan + idx_scan AS reads,
    n_tup_ins + n_tup_upd + n_tup_del AS writes,
    ROUND(
        100.0 * (seq_scan + idx_scan)
        / NULLIF(seq_scan + idx_scan + n_tup_ins + n_tup_upd + n_tup_del, 0),
        2
    ) AS read_percent,
    CASE
        WHEN (seq_scan + idx_scan) > 10 * (n_tup_ins + n_tup_upd + n_tup_del) THEN 'READ_HEAVY'
        WHEN (seq_scan + idx_scan) > 4 * (n_tup_ins + n_tup_upd + n_tup_del) THEN 'READ_DOMINANT'
        WHEN (seq_scan + idx_scan) > (n_tup_ins + n_tup_upd + n_tup_del) THEN 'BALANCED'
        WHEN (n_tup_ins + n_tup_upd + n_tup_del) > 4 * (seq_scan + idx_scan) THEN 'WRITE_DOMINANT'
        ELSE 'WRITE_HEAVY'
    END AS workload_type
FROM pg_stat_user_tables
ORDER BY (seq_scan + idx_scan) + (n_tup_ins + n_tup_upd + n_tup_del) DESC;

-- Update detail breakdown (which columns change?)
SELECT
    a.attname AS column_name,
    s.null_frac AS null_percentage,
    s.n_distinct AS distinct_values,
    CASE
        WHEN a.attnum = ANY(i.indkey) THEN 'INDEXED'
        ELSE 'NOT_INDEXED'
    END AS index_status
FROM pg_stats s
JOIN pg_attribute a
    ON a.attname = s.attname
   AND a.attrelid = (SELECT oid FROM pg_class WHERE relname = 'orders')
LEFT JOIN pg_index i
    ON i.indrelid = a.attrelid
   AND a.attnum = ANY(i.indkey)
WHERE s.tablename = 'orders';
```

A database might be 80% reads overall, but a specific logging table might be 95% writes. Apply indexing strategy at the table level, not the database level. Your session table doesn't need the same indexes as your order history table.
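The same bucketing logic can live in a monitoring script outside the database. A minimal Python sketch mirroring the CASE expression above (the threshold values 10× and 4× are the illustrative ones used in the query, not universal constants):

```python
def classify_workload(reads: int, writes: int) -> str:
    """Bucket a table by its read/write balance, mirroring the SQL CASE:
    thresholds of 10x and 4x are illustrative, not universal."""
    if reads > 10 * writes:
        return "READ_HEAVY"
    if reads > 4 * writes:
        return "READ_DOMINANT"
    if reads > writes:
        return "BALANCED"
    if writes > 4 * reads:
        return "WRITE_DOMINANT"
    return "WRITE_HEAVY"

print(classify_workload(reads=920_000, writes=80_000))  # READ_HEAVY
print(classify_workload(reads=5_000, writes=95_000))    # WRITE_DOMINANT
```

Feeding this with per-table counters from `pg_stat_user_tables` lets you track how each table's classification drifts over time.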
Database workloads are not uniform over time. Understanding temporal patterns is critical: the indexes your peak traffic needs may differ from what overnight batch jobs need, and maintenance must be scheduled for when it hurts least.
Common Temporal Patterns:
| Pattern | Example | Indexing Implication |
|---|---|---|
| Diurnal (Daily) | Business hours peak, overnight lull | Schedule index maintenance for off-peak. Design for peak load. |
| Weekly | Monday report surge, weekend quiet | Consider Monday-specific indexes. Heavy reports may need temporary indexes. |
| Monthly | End-of-month billing spike | Partition by date. Archive old data. Partial indexes on current month. |
| Seasonal | Holiday shopping surge | Capacity planning. Consider temporary indexes for seasonal queries. |
| Event-Driven | Sale events, product launches | Pre-create indexes for anticipated patterns. Monitor and add reactively. |
| Batch Windows | 3 AM ETL jobs | Batch queries may warrant different indexes than OLTP queries. |
```sql
-- Query volume by hour of day (requires query logging)
-- This example uses a custom query_log table
SELECT
    EXTRACT(HOUR FROM timestamp) AS hour,
    COUNT(*) AS query_count,
    AVG(duration_ms) AS avg_duration_ms,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY duration_ms) AS p95_ms,
    SUM(CASE WHEN query_type = 'SELECT' THEN 1 ELSE 0 END) AS reads,
    SUM(CASE WHEN query_type IN ('INSERT', 'UPDATE', 'DELETE') THEN 1 ELSE 0 END) AS writes
FROM query_log
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY EXTRACT(HOUR FROM timestamp)
ORDER BY hour;

-- Day-of-week analysis
SELECT
    TO_CHAR(timestamp, 'Day') AS day_of_week,
    COUNT(*) AS query_count,
    AVG(duration_ms) AS avg_duration_ms
FROM query_log
WHERE timestamp > NOW() - INTERVAL '4 weeks'
GROUP BY TO_CHAR(timestamp, 'Day'), EXTRACT(DOW FROM timestamp)
ORDER BY EXTRACT(DOW FROM timestamp);

-- Identify batch job windows (sudden activity spikes)
SELECT
    DATE_TRUNC('hour', timestamp) AS hour,
    COUNT(*) AS queries,
    ROUND(AVG(rows_affected), 0) AS avg_rows_affected
FROM query_log
WHERE query_type IN ('INSERT', 'UPDATE', 'DELETE')
  AND timestamp > NOW() - INTERVAL '7 days'
GROUP BY DATE_TRUNC('hour', timestamp)
HAVING COUNT(*) > 1000 OR AVG(rows_affected) > 1000
ORDER BY queries DESC;
```

Multi-Mode Workloads:
Many production systems exhibit fundamentally different patterns at different times: interactive OLTP traffic during business hours, for example, giving way to heavy batch and reporting workloads overnight.
Each mode may benefit from different indexes. In extreme cases, you might create indexes just before a batch window and drop them afterward, or maintain reporting-only indexes on a read replica rather than the primary.
If your system must meet SLAs during peak load, design indexes for peak load. An index that works at 50 QPS but fails at 500 QPS is useless if your peak is 400 QPS. Always test index configurations under realistic load conditions.
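When validating an index configuration under load, evaluate the latency distribution, not just the mean. A minimal sketch of an SLA check over measured latencies (the function `meets_sla` and its quantile-index approximation are illustrative, not from any particular load-testing tool):

```python
def meets_sla(latencies_ms: list[float], target_ms: float,
              quantile: float = 0.95) -> bool:
    """Return True if the given quantile of measured latencies
    (approximated by the nearest-rank method) is within target."""
    s = sorted(latencies_ms)
    idx = min(len(s) - 1, int(quantile * len(s)))
    return s[idx] <= target_ms

# 1% slow outliers: p95 still fine
print(meets_sla([5.0] * 99 + [50.0], target_ms=20.0))       # True
# 10% slow outliers: p95 blows the budget even though the mean looks fine
print(meets_sla([5.0] * 90 + [50.0] * 10, target_ms=20.0))  # False
```

Run the same check at low and peak request rates; an index whose p95 holds at 50 QPS may still fail this test at 500 QPS.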
Hot spots are tables, indexes, or data ranges that receive disproportionate traffic. Identifying hot spots is essential because they often determine system performance and are the highest-value optimization targets.
Types of Hot Spots:
Hot spots occur at several levels: table-level (a few tables dominate total traffic), index-level (a handful of indexes serve most scans), and page- or row-level (contention concentrated on specific rows, such as a frequently updated counter). The queries below surface each level:
```sql
-- Table hot spots: most accessed tables
SELECT
    schemaname,
    relname AS table_name,
    seq_scan + idx_scan AS total_scans,
    seq_tup_read + idx_tup_fetch AS total_tuples_accessed,
    n_tup_ins AS inserts,
    n_tup_upd AS updates,
    n_tup_del AS deletes,
    ROUND(100.0 * (seq_scan + idx_scan)
          / NULLIF(SUM(seq_scan + idx_scan) OVER (), 0), 2) AS pct_of_total_scans
FROM pg_stat_user_tables
ORDER BY total_scans DESC
LIMIT 10;

-- Index hot spots: most used indexes
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan AS scans,
    idx_tup_read AS tuples_read,
    idx_tup_fetch AS tuples_fetched,
    ROUND(100.0 * idx_scan / NULLIF(SUM(idx_scan) OVER (), 0), 2) AS pct_of_total_scans
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC
LIMIT 15;

-- Page-level access patterns (requires pg_buffercache)
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

SELECT
    c.relname AS table_name,
    COUNT(*) AS buffered_pages,
    ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM pg_buffercache), 2) AS pct_of_buffer,
    SUM(usagecount) AS total_usage
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = c.relfilenode
WHERE c.relkind IN ('r', 'i')  -- tables and indexes
GROUP BY c.relname
ORDER BY total_usage DESC
LIMIT 20;
```

Standard indexing advice may not apply to hot spots. A table handling 80% of your writes needs minimal indexes regardless of query patterns. A constantly-updated row may need to be moved to a separate table. Think creatively about hot spot mitigation.
Index design involves navigating multiple competing concerns. A formal trade-off analysis helps make these tensions explicit and enables informed decision-making.
| Trade-off | Option A | Option B | Decision Factor |
|---|---|---|---|
| Read vs Write | More indexes (faster reads) | Fewer indexes (faster writes) | Read/write ratio of workload |
| Space vs Performance | Covering indexes (larger) | Minimal indexes (smaller) | Available storage and memory |
| Query A vs Query B | Index optimized for A | Index optimized for B | Relative frequency and criticality |
| Simplicity vs Optimization | Fewer, broader indexes | Many specialized indexes | Maintenance capability |
| Latency vs Throughput | Optimize tail latency | Optimize average throughput | SLA requirements |
| Current vs Future | Optimize for today's workload | Build for anticipated growth | Change velocity and predictability |
Quantitative Trade-off Analysis:
For significant indexing decisions, quantify the trade-offs:
```markdown
# Index Addition Trade-off Analysis

## Proposed Index

CREATE INDEX idx_orders_customer_status
ON orders (customer_id, status, created_at DESC);

## Current State
- Table size: 45 million rows
- Current indexes: 4

## Read Benefit Analysis
| Query | Current (ms) | With Index (ms) | Improvement | Daily Executions | Daily Time Saved |
|-------|--------------|-----------------|-------------|------------------|------------------|
| Q1: Customer orders | 450 | 15 | 435 ms | 850,000 | 102 hours |
| Q2: Status by customer | 380 | 8 | 372 ms | 120,000 | 12 hours |
| Q3: Recent customer orders | 520 | 12 | 508 ms | 200,000 | 28 hours |

**Total Daily Read Benefit: 142 compute-hours saved**

## Write Cost Analysis
| Operation | Current (ms) | With Index (ms) | Increase | Daily Ops | Daily Cost Added |
|-----------|--------------|-----------------|----------|-----------|------------------|
| INSERT | 0.8 | 1.1 | 0.3 ms | 15,000 | ~0.001 hours |
| UPDATE (status) | 1.2 | 1.8 | 0.6 ms | 45,000 | ~0.008 hours |
| DELETE | 0.9 | 1.3 | 0.4 ms | 500 | < 0.001 hours |

**Total Daily Write Cost: ~0.01 compute-hours added**

## Net Impact
- **Daily Time Saved: ~142 compute-hours**
- **Storage Cost: +1.8 GB**
- **Memory Requirement: +400 MB buffer preference**

## Recommendation
**CREATE THE INDEX** - at these volumes the read benefit exceeds the write
cost by several orders of magnitude. Even accounting for peak-hour write
concentration, the trade-off is strongly positive.
```

Create a decision log for significant indexing choices. Document what you considered, what you chose, and why. This enables learning from outcomes and provides context for future engineers who inherit your system.
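The table arithmetic is easy to sanity-check in code: each row is just a per-operation latency delta multiplied by daily volume, converted from milliseconds to hours. A short script over the read and write figures from the analysis above:

```python
MS_PER_HOUR = 3_600_000  # milliseconds in one hour

def daily_hours(delta_ms: float, ops_per_day: int) -> float:
    """Per-operation latency delta (ms) times daily volume, in compute-hours."""
    return delta_ms * ops_per_day / MS_PER_HOUR

# (latency delta in ms, daily operations) -- figures from the tables above
read_savings = [(435, 850_000), (372, 120_000), (508, 200_000)]
write_costs = [(0.3, 15_000), (0.6, 45_000), (0.4, 500)]

benefit = sum(daily_hours(d, n) for d, n in read_savings)
cost = sum(daily_hours(d, n) for d, n in write_costs)
print(f"benefit: {benefit:.1f} h/day, cost: {cost:.3f} h/day")
# benefit: 143.3 h/day, cost: 0.009 h/day
```

Automating this kind of check is worthwhile: unit mistakes (milliseconds read as seconds) are easy to make in a hand-built spreadsheet and can distort a trade-off by three orders of magnitude.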
Capacity planning for indexes involves projecting future index size and resource requirements based on data growth patterns. Without capacity planning, systems gradually degrade until a crisis forces emergency action.
```sql
-- Current index sizes with growth projections
WITH index_sizes AS (
    SELECT
        schemaname,
        relname AS tablename,
        indexrelname AS indexname,
        pg_relation_size(indexrelid) AS current_size_bytes
    FROM pg_stat_user_indexes
)
SELECT
    tablename,
    indexname,
    pg_size_pretty(current_size_bytes) AS current_size,
    -- Project growth (assuming 5% monthly growth)
    pg_size_pretty((current_size_bytes * 1.05)::bigint) AS size_1_month,
    pg_size_pretty((current_size_bytes * POWER(1.05, 6))::bigint) AS size_6_months,
    pg_size_pretty((current_size_bytes * POWER(1.05, 12))::bigint) AS size_1_year,
    pg_size_pretty((current_size_bytes * POWER(1.05, 24))::bigint) AS size_2_years
FROM index_sizes
ORDER BY current_size_bytes DESC;

-- Total index capacity projection, compared to the buffer pool
SELECT
    pg_size_pretty(SUM(pg_relation_size(indexrelid))) AS total_index_size,
    pg_size_pretty((SUM(pg_relation_size(indexrelid)) * 1.05)::bigint) AS size_1_month,
    pg_size_pretty((SUM(pg_relation_size(indexrelid)) * POWER(1.05, 6))::bigint) AS size_6_months,
    pg_size_pretty((SUM(pg_relation_size(indexrelid)) * POWER(1.05, 12))::bigint) AS size_1_year,
    current_setting('shared_buffers') AS buffer_pool,
    ROUND(100.0 * SUM(pg_relation_size(indexrelid))
          / pg_size_bytes(current_setting('shared_buffers')), 2) AS pct_of_buffer_now,
    ROUND((100.0 * SUM(pg_relation_size(indexrelid)) * POWER(1.05, 12)
          / pg_size_bytes(current_setting('shared_buffers')))::numeric, 2) AS pct_of_buffer_1_year
FROM pg_stat_user_indexes;
```

Capacity Thresholds to Monitor:
| Metric | Healthy | Warning | Critical |
|---|---|---|---|
| Index size vs. buffer pool | < 50% | 50-80% | > 80% |
| Index growth vs. storage | < 70% capacity | 70-85% | > 85% |
| Maintenance window usage | < 60% of window | 60-85% | > 85% |
| Index build time (for rebuild) | < 2 hours | 2-8 hours | > 8 hours |
Set alerts at warning thresholds to enable proactive intervention before critical levels are reached.
Business growth often accelerates: new products, more users, expanding markets. Your 5% monthly growth assumption may become 15% after a successful product launch. Build headroom into capacity plans and revisit projections quarterly.
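The sensitivity to the growth assumption is easy to quantify: compound growth means a tripled monthly rate does far more than triple the one-year projection. A minimal sketch (the 10 GB starting size and the 5%/15% rates are illustrative):

```python
def project(size_gb: float, monthly_growth: float, months: int) -> float:
    """Compound-growth projection: size * (1 + g)^months."""
    return size_gb * (1 + monthly_growth) ** months

# An index that is 10 GB today, one year out:
print(round(project(10, 0.05, 12), 1))  # ~18.0 GB at 5%/month
print(round(project(10, 0.15, 12), 1))  # ~53.5 GB at 15%/month
```

The 15%/month scenario produces roughly 3× the storage of the 5% scenario after a single year, which is why projections need quarterly revisits rather than a one-time estimate.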
Different workload types call for fundamentally different indexing strategies. Here we map common workload patterns to proven indexing approaches.
Online Transaction Processing (OLTP) workloads prioritize low latency, high concurrency, and data consistency.
Characteristics:
- High volume of short transactions
- Point lookups and small range scans on key columns
- Frequent inserts and updates from many concurrent connections
- Strict latency targets, often single-digit milliseconds
Index Strategy:
- Index primary keys and the foreign keys used in joins
- Add highly selective secondary indexes only for frequent, verified query patterns
- Keep the total index count low to protect write latency
- Consider covering indexes for the hottest point-lookup queries
Workload analysis transforms index design from guesswork into engineering. By understanding your complete workload, you can make strategic decisions that serve the whole system, not just individual queries.
You now understand how to perform comprehensive workload analysis for index design. You can characterize workloads, identify patterns, make trade-off decisions, and plan for growth. In the final page, we'll explore automated index recommendation systems that apply these principles algorithmically.