You've designed the perfect partition key for your DynamoDB table. Your e-commerce orders table uses customerId as the partition key and orderTimestamp as the sort key. Queries for a customer's orders are blazing fast.
Then product management asks: "Can we find all orders for a specific product? We need to track which products are selling best."
Your heart sinks. The table is optimized for customer-centric queries. Product-centric queries would require a full table scan—touching every partition, reading every item, and filtering in application code. For a table with 100 million orders, this is a non-starter.
This is exactly the problem Global Secondary Indexes (GSIs) solve. GSIs allow you to create alternative views of your data with different partition and sort keys, enabling query patterns that the base table cannot efficiently support.
By the end of this page, you will understand what GSIs are and how they differ from Local Secondary Indexes, how to design GSIs for diverse access patterns, the cost and consistency implications of GSIs, sparse indexes and projection strategies for optimization, and common GSI design patterns used in production systems.
A Global Secondary Index (GSI) is an index with a partition key and optional sort key that can be different from those on the base table. "Global" means the index spans all partitions of the base table—it's not constrained to a single partition.
GSIs are essentially separate tables managed automatically by DynamoDB:
| Characteristic | Base Table | Global Secondary Index |
|---|---|---|
| Partition Key | Fixed at table creation | Any attribute from base table items |
| Sort Key | Fixed at table creation (optional) | Any attribute from base table items (optional) |
| Capacity | Table's provisioned/on-demand capacity | Separate provisioned/on-demand capacity |
| Consistency | Supports strong and eventual | Eventual consistency only |
| Item Size Limit | 400 KB | Projected attributes within 400 KB |
| Write Path | Direct writes | Asynchronous replication from base table |
| Creation | At table creation only | Any time (with eventual population) |
```
┌───────────────────────────────────────────────────────────────────┐
│ Base Table: Orders                                                │
│ Partition Key: customerId    Sort Key: orderTimestamp             │
├─────────────┬─────────────────┬─────────┬───────────┬─────────────┤
│ customerId  │ orderTimestamp  │ orderId │ productId │ status      │
├─────────────┼─────────────────┼─────────┼───────────┼─────────────┤
│ CUST-001    │ 2024-06-15T10:00│ ORD-100 │ PROD-500  │ delivered   │
│ CUST-001    │ 2024-06-16T14:30│ ORD-101 │ PROD-200  │ shipped     │
│ CUST-002    │ 2024-06-15T09:00│ ORD-102 │ PROD-500  │ delivered   │
│ CUST-003    │ 2024-06-17T11:15│ ORD-103 │ PROD-300  │ pending     │
└─────────────┴─────────────────┴─────────┴───────────┴─────────────┘
                                  │
                  ┌───────────────┴───────────────┐
                  │  Automatic Async Replication  │
                  └───────────────┬───────────────┘
                                  ▼
┌───────────────────────────────────────────────────────────────────┐
│ GSI: ProductOrders-Index                                          │
│ Partition Key: productId    Sort Key: orderTimestamp              │
├────────────┬─────────────────┬─────────┬────────────┬─────────────┤
│ productId  │ orderTimestamp  │ orderId │ customerId │ (keys only) │
├────────────┼─────────────────┼─────────┼────────────┼─────────────┤
│ PROD-200   │ 2024-06-16T14:30│ ORD-101 │ CUST-001   │             │
│ PROD-300   │ 2024-06-17T11:15│ ORD-103 │ CUST-003   │             │
│ PROD-500   │ 2024-06-15T09:00│ ORD-102 │ CUST-002   │             │
│ PROD-500   │ 2024-06-15T10:00│ ORD-100 │ CUST-001   │             │
└────────────┴─────────────────┴─────────┴────────────┴─────────────┘

Query: "Get all orders for PROD-500"
→ Queries GSI with PK = "PROD-500"
→ Returns ORD-100 and ORD-102 instantly (no table scan!)
```

DynamoDB also offers Local Secondary Indexes (LSIs), which share the base table's partition key but allow an alternative sort key. LSIs share capacity with the base table, support strong consistency, and must be created at table creation time. GSIs are more flexible and are used far more frequently in practice. This page focuses on GSIs.
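The re-keying shown in the diagram can be modeled in a few lines of plain TypeScript. This is only an in-memory sketch to build intuition: DynamoDB maintains the real index itself, and the `OrderItem` shape and `buildProductIndex` helper are names assumed here for illustration.

```typescript
// Illustrative in-memory model of a GSI; not how DynamoDB stores data internally.
interface OrderItem {
  customerId: string;
  orderTimestamp: string;
  orderId: string;
  productId?: string; // items without the index key are simply not indexed
  status: string;
}

// Group items by the index partition key (productId), then sort each group
// by the index sort key (orderTimestamp).
function buildProductIndex(items: OrderItem[]): Record<string, OrderItem[]> {
  const index: Record<string, OrderItem[]> = {};
  for (const item of items) {
    if (item.productId === undefined) continue;
    if (!index[item.productId]) index[item.productId] = [];
    index[item.productId].push(item);
  }
  for (const key of Object.keys(index)) {
    index[key].sort((a, b) =>
      a.orderTimestamp < b.orderTimestamp ? -1 : a.orderTimestamp > b.orderTimestamp ? 1 : 0
    );
  }
  return index;
}

const orders: OrderItem[] = [
  { customerId: "CUST-001", orderTimestamp: "2024-06-15T10:00", orderId: "ORD-100", productId: "PROD-500", status: "delivered" },
  { customerId: "CUST-001", orderTimestamp: "2024-06-16T14:30", orderId: "ORD-101", productId: "PROD-200", status: "shipped" },
  { customerId: "CUST-002", orderTimestamp: "2024-06-15T09:00", orderId: "ORD-102", productId: "PROD-500", status: "delivered" },
  { customerId: "CUST-003", orderTimestamp: "2024-06-17T11:15", orderId: "ORD-103", productId: "PROD-300", status: "pending" },
];

const productIndex = buildProductIndex(orders);
// "Query" the model: one lookup by the index partition key, no scan of all orders.
const prod500OrderIds = (productIndex["PROD-500"] ?? []).map((o) => o.orderId);
console.log(prod500OrderIds); // [ 'ORD-102', 'ORD-100' ] (sorted by orderTimestamp)
```

The same base items appear under a different key, which is why a lookup by `productId` no longer needs to touch every customer's partition.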
Creating a GSI requires several key decisions: the key schema, which attributes to project, and capacity settings. Let's examine each in detail.
```typescript
import {
  DynamoDBClient,
  CreateTableCommand,
  UpdateTableCommand,
} from "@aws-sdk/client-dynamodb";

const dynamodb = new DynamoDBClient({ region: "us-east-1" });

// Create table with GSI at creation time
const createTableWithGSI = async () => {
  await dynamodb.send(new CreateTableCommand({
    TableName: "Orders",

    // Base table key schema
    KeySchema: [
      { AttributeName: "customerId", KeyType: "HASH" },
      { AttributeName: "orderTimestamp", KeyType: "RANGE" }
    ],

    // Define all attributes used in keys (base + GSI)
    AttributeDefinitions: [
      { AttributeName: "customerId", AttributeType: "S" },
      { AttributeName: "orderTimestamp", AttributeType: "S" },
      { AttributeName: "productId", AttributeType: "S" },
      { AttributeName: "status", AttributeType: "S" }
    ],

    // GSI definitions
    GlobalSecondaryIndexes: [
      {
        IndexName: "ProductOrders-Index",
        KeySchema: [
          { AttributeName: "productId", KeyType: "HASH" },
          { AttributeName: "orderTimestamp", KeyType: "RANGE" }
        ],
        // What to copy to the GSI. Base table keys (customerId,
        // orderTimestamp) are always projected, so they are not listed here.
        Projection: {
          ProjectionType: "INCLUDE",
          NonKeyAttributes: ["orderId", "total"]
        },
        // GSI has its own capacity
        ProvisionedThroughput: {
          ReadCapacityUnits: 100,
          WriteCapacityUnits: 50
        }
      },
      {
        IndexName: "StatusOrders-Index",
        KeySchema: [
          { AttributeName: "status", KeyType: "HASH" },
          { AttributeName: "orderTimestamp", KeyType: "RANGE" }
        ],
        Projection: {
          ProjectionType: "KEYS_ONLY" // Only base table keys and index keys projected
        },
        ProvisionedThroughput: {
          ReadCapacityUnits: 50,
          WriteCapacityUnits: 25
        }
      }
    ],

    BillingMode: "PROVISIONED",
    ProvisionedThroughput: {
      ReadCapacityUnits: 500,
      WriteCapacityUnits: 200
    }
  }));
};

// Add GSI to existing table (backfill happens automatically)
const addGSIToExistingTable = async () => {
  await dynamodb.send(new UpdateTableCommand({
    TableName: "Orders",
    GlobalSecondaryIndexUpdates: [
      {
        Create: {
          IndexName: "DateOrders-Index",
          KeySchema: [
            { AttributeName: "orderDate", KeyType: "HASH" },
            { AttributeName: "orderId", KeyType: "RANGE" }
          ],
          Projection: { ProjectionType: "ALL" },
          ProvisionedThroughput: {
            ReadCapacityUnits: 100,
            WriteCapacityUnits: 50
          }
        }
      }
    ],
    // Also need to define every key attribute of the new index
    AttributeDefinitions: [
      { AttributeName: "orderDate", AttributeType: "S" },
      { AttributeName: "orderId", AttributeType: "S" }
    ]
  }));
};
```

Projection Types Explained
Projection determines which attributes from the base table are copied to the GSI:
| Projection Type | Contents | Storage Cost | Query Flexibility |
|---|---|---|---|
| KEYS_ONLY | Only base table keys + GSI keys | Lowest | Must fetch from base table for other attributes |
| INCLUDE | Specified attributes + all keys | Medium | Good balance of cost and flexibility |
| ALL | All attributes from base table | Highest | No fetches needed, full item available |
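To make the three projection types concrete, here is a small sketch that computes which attributes of an item would land in an index. It is a simplified model rather than DynamoDB's implementation; `IndexSpec` and `projectItem` are hypothetical names introduced for this example.

```typescript
type ProjectionType = "KEYS_ONLY" | "INCLUDE" | "ALL";
type Item = Record<string, unknown>;

interface IndexSpec {
  tableKeys: string[];          // base table key attributes (always projected)
  indexKeys: string[];          // GSI key attributes (always projected)
  projectionType: ProjectionType;
  nonKeyAttributes?: string[];  // only meaningful with INCLUDE
}

// Returns the index entry for an item, or undefined if the item is not indexed.
function projectItem(item: Item, spec: IndexSpec): Item | undefined {
  const missingKey = spec.indexKeys.some((k) => item[k] === undefined);
  if (missingKey) return undefined; // items without the index keys are skipped
  if (spec.projectionType === "ALL") return { ...item };

  const names = [...spec.tableKeys, ...spec.indexKeys];
  if (spec.projectionType === "INCLUDE") {
    names.push(...(spec.nonKeyAttributes ?? []));
  }
  const projected: Item = {};
  for (const name of names) {
    if (item[name] !== undefined) projected[name] = item[name];
  }
  return projected;
}

const order: Item = {
  customerId: "CUST-001",
  orderTimestamp: "2024-06-15T10:00",
  orderId: "ORD-100",
  productId: "PROD-500",
  status: "delivered",
  total: 149.99,
};

const keysOnly = projectItem(order, {
  tableKeys: ["customerId", "orderTimestamp"],
  indexKeys: ["status", "orderTimestamp"],
  projectionType: "KEYS_ONLY",
});
const included = projectItem(order, {
  tableKeys: ["customerId", "orderTimestamp"],
  indexKeys: ["productId", "orderTimestamp"],
  projectionType: "INCLUDE",
  nonKeyAttributes: ["orderId", "total"],
});

console.log(Object.keys(keysOnly ?? {}));
console.log(Object.keys(included ?? {}));
```

Any attribute missing from the projected entry is one your application has to fetch from the base table with a second request.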
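Projection Strategy:
- Queries need most or all attributes: ALL (but watch storage costs)
- Queries need only a few specific attributes: INCLUDE with just those
- Queries need only the keys, or other attributes only rarely: KEYS_ONLY and fetch from base table

GSIs are powerful but not free. Understanding their cost model is essential for building cost-effective DynamoDB applications.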
With an ALL projection, you're storing the full table twice.

```typescript
// Scenario: Orders table with 3 GSIs
// Base table keys: customerId (PK), orderTimestamp (SK)

const tableConfig = {
  baseTable: { pk: "customerId", sk: "orderTimestamp" },
  gsis: [
    { name: "ProductOrders-Index", pk: "productId", projected: ["total"] },
    { name: "StatusOrders-Index", pk: "status", projected: ["customerId"] },
    { name: "DateOrders-Index", pk: "orderDate", projected: "ALL" }
  ]
};

// When you write ONE item to the base table:
const newOrder = {
  customerId: "CUST-001",        // Base PK
  orderTimestamp: "2024-06-15",  // Base SK
  productId: "PROD-500",         // GSI 1 PK
  status: "pending",             // GSI 2 PK
  orderDate: "2024-06-15",       // GSI 3 PK
  total: 149.99,
  items: [/* ... */],            // Copied only to DateOrders-Index (ALL projection)
};

// Cost breakdown for this single write:
// 1. Base table write:     1 WCU (assuming item ≤ 1 KB)
// 2. ProductOrders-Index:  1 WCU (pk + base keys + total)
// 3. StatusOrders-Index:   1 WCU (pk + base keys + customerId)
// 4. DateOrders-Index:     1 WCU (all attributes)
// ─────────────────────────────────
// TOTAL: 4 WCU for ONE logical write

// If your application writes 10,000 items/second:
// Base table needs:  10,000 WCU
// GSI 1 needs:       10,000 WCU
// GSI 2 needs:       10,000 WCU
// GSI 3 needs:       10,000 WCU
// TOTAL capacity:    40,000 WCU (4x base table writes!)
```

Every GSI multiplies your write costs. A table with 5 GSIs has ~6x the write cost of a table with no GSIs. This is the most common source of unexpected DynamoDB bills. Carefully consider: Do you really need this GSI? Can you serve this query from an existing GSI? Can you use batch reads from the base table instead?
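The arithmetic in the cost breakdown above can be wrapped in a small estimator. This is a rough sketch that assumes standard (non-transactional) writes of new items, billed at 1 WCU per 1 KB rounded up; the byte sizes are made-up inputs and the function names are hypothetical.

```typescript
interface IndexWrite {
  name: string;
  projectedSizeBytes: number; // size of the entry written to this index
}

// Standard write: 1 WCU per 1 KB, rounded up, with a minimum of 1.
function wcuForSize(sizeBytes: number): number {
  return Math.max(1, Math.ceil(sizeBytes / 1024));
}

// WCU consumed by one new item: the base table write plus one write per index.
function wcuPerLogicalWrite(baseItemBytes: number, indexes: IndexWrite[]): number {
  return indexes.reduce(
    (sum, ix) => sum + wcuForSize(ix.projectedSizeBytes),
    wcuForSize(baseItemBytes)
  );
}

// Illustrative sizes only; measure your own items.
const perWrite = wcuPerLogicalWrite(800, [
  { name: "ProductOrders-Index", projectedSizeBytes: 120 },
  { name: "StatusOrders-Index", projectedSizeBytes: 110 },
  { name: "DateOrders-Index", projectedSizeBytes: 800 },
]);
const requiredWcu = perWrite * 10000; // at 10,000 item writes per second

console.log(perWrite, requiredWcu); // 4 40000
```

The estimate covers inserts only. An update that changes an indexed key attribute costs two writes on that index (a delete plus a put), so update-heavy workloads can exceed this figure.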
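GSI Throttling and Base Table Writes
A critical fact that surprises many developers: an under-provisioned GSI can throttle writes to the base table.
If a GSI doesn't have enough write capacity to keep up with base table writes, DynamoDB throttles the base table writes that would update that index, even when the base table itself has capacity to spare. A table write succeeds only when the table and every affected GSI can absorb it.
This means each GSI's write capacity must be sized for the write rate the base table pushes into it. Separately, because replication is asynchronous, GSI reads can briefly return stale data. Monitor the ThrottledRequests metric and the per-index WriteThrottleEvents metric to catch this before it causes application issues.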
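A rough pre-deployment check for an index that cannot keep up is to compare each index's provisioned write capacity with the write rate the base table is expected to push into it. The sketch below is only a back-of-the-envelope model; `GsiCapacity`, `indexedFraction`, and `underProvisionedIndexes` are hypothetical names, and the traffic numbers are invented.

```typescript
interface GsiCapacity {
  indexName: string;
  provisionedWcu: number;
  wcuPerItemWrite: number;  // WCU one base write costs on this index
  indexedFraction: number;  // share of base writes that touch the index (1 = all)
}

// Returns the names of indexes whose provisioned WCU is below the expected demand.
function underProvisionedIndexes(baseWritesPerSecond: number, gsis: GsiCapacity[]): string[] {
  return gsis
    .filter((g) => baseWritesPerSecond * g.indexedFraction * g.wcuPerItemWrite > g.provisionedWcu)
    .map((g) => g.indexName);
}

// Assumed traffic: 40 base table writes per second, every item indexed.
const atRisk = underProvisionedIndexes(40, [
  { indexName: "ProductOrders-Index", provisionedWcu: 50, wcuPerItemWrite: 1, indexedFraction: 1 },
  { indexName: "StatusOrders-Index", provisionedWcu: 25, wcuPerItemWrite: 1, indexedFraction: 1 },
]);

console.log(atRisk); // [ 'StatusOrders-Index' ]
```

A check like this does not replace CloudWatch monitoring, since real traffic is bursty, but it catches obvious mismatches between table and index capacity before they reach production.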
One of the most powerful (and underutilized) GSI features is the sparse index pattern. A GSI only includes items from the base table that have the GSI's key attributes. Items without those attributes are simply not indexed.
This behavior is automatic—DynamoDB doesn't index what doesn't exist. Savvy designers exploit this to create efficient, focused indexes.
```typescript
// Scenario: Orders table where we need to find orders needing attention
// Most orders are 'completed' or 'shipped' - only ~5% need attention

// ❌ BAD APPROACH: GSI on 'status' field
// Every order is in the GSI, but we only query for a few statuses
// Result: Huge index, most of it never queried

// ✅ GOOD APPROACH: Sparse index on 'needsAttention' attribute
// Only orders requiring attention have this attribute

const orderExamples = [
  // Standard completed order - NO 'needsAttention' attribute
  {
    customerId: "CUST-001",
    orderTimestamp: "2024-06-15T10:00:00Z",
    status: "completed",
    total: 149.99
    // Note: no 'needsAttention' attribute
  },

  // Order with payment issue - HAS 'needsAttention' attribute
  {
    customerId: "CUST-002",
    orderTimestamp: "2024-06-16T14:30:00Z",
    status: "payment_failed",
    total: 89.50,
    needsAttention: "PAYMENT_ISSUE#2024-06-16T14:30:00Z", // GSI PK
    issueDetails: "Card declined - insufficient funds"
  },

  // Order with shipping problem - HAS 'needsAttention' attribute
  {
    customerId: "CUST-003",
    orderTimestamp: "2024-06-17T11:15:00Z",
    status: "shipping_exception",
    total: 200.00,
    needsAttention: "SHIPPING_ISSUE#2024-06-17T11:15:00Z", // GSI PK
    issueDetails: "Address undeliverable"
  }
];

// GSI: NeedsAttention-Index
// PK: needsAttention
// SK: (none needed, or orderTimestamp for sorting)

// Result:
// - GSI contains only ~5% of items (those with issues)
// - Storage cost: ~5% of full index
// - "Get all orders needing attention" is a cheap Scan of this small index
//   (each needsAttention value here is its own partition key)
// - When issue is resolved: remove 'needsAttention' attribute
//   → Item automatically removed from GSI!
```

If you're creating a GSI to find a minority of items (items with errors, featured items, pending reviews), make it sparse. Instead of indexing a status attribute that exists on all items, add a special attribute only to the items you want indexed. The storage and write cost savings can be dramatic.
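The sparse behavior and the "resolve the issue" step can be sketched without touching AWS. The filter below models which items a sparse index would contain, and `resolveIssueParams` builds UpdateItem-style parameters (document-client shape) that remove the index key attribute. Both function names are hypothetical, and nothing here sends a request.

```typescript
type Item = Record<string, unknown>;

// Model of a sparse index: only items that carry the index key attribute appear.
function sparseIndexItems(items: Item[], indexKey: string): Item[] {
  return items.filter((item) => item[indexKey] !== undefined);
}

// Parameters for an update that drops the index key once the issue is resolved.
// Removing the attribute is what takes the item out of the sparse index.
function resolveIssueParams(customerId: string, orderTimestamp: string) {
  return {
    TableName: "Orders",
    Key: { customerId, orderTimestamp },
    UpdateExpression: "REMOVE needsAttention",
  };
}

const sampleOrders: Item[] = [
  { customerId: "CUST-001", orderTimestamp: "2024-06-15T10:00:00Z", status: "completed" },
  { customerId: "CUST-002", orderTimestamp: "2024-06-16T14:30:00Z", status: "payment_failed",
    needsAttention: "PAYMENT_ISSUE#2024-06-16T14:30:00Z" },
  { customerId: "CUST-003", orderTimestamp: "2024-06-17T11:15:00Z", status: "shipping_exception",
    needsAttention: "SHIPPING_ISSUE#2024-06-17T11:15:00Z" },
];

const indexed = sparseIndexItems(sampleOrders, "needsAttention");
console.log(indexed.length, "of", sampleOrders.length, "items are indexed");
console.log(resolveIssueParams("CUST-002", "2024-06-16T14:30:00Z"));
```

In a real application the parameters would be passed to an update call such as the document client's `UpdateCommand`; that wiring is omitted to keep the sketch self-contained.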
Many developers think carefully about base table partition keys but then choose GSI keys carelessly. GSIs have the same partition limits as base tables (3,000 RCU, 1,000 WCU per partition). All the partition key design principles apply equally to GSIs.
```typescript
// Minimal item shape assumed for this example; createdAt is an ISO-8601 string
interface Order {
  createdAt: string;
}

// Assumed helper (defined elsewhere): runs a Query against one partition of a GSI
declare function queryGSI(indexName: string, partitionKeyValue: string): Promise<Order[]>;

// ❌ BAD: Status GSI with low-cardinality partition key
const badGSI = {
  IndexName: "StatusIndex",
  KeySchema: [
    { AttributeName: "status", KeyType: "HASH" },      // 5 distinct values!
    { AttributeName: "createdAt", KeyType: "RANGE" }
  ]
};
// Problem: 90% of orders are "completed" → hot partition on "completed"
// Query: "Find all pending orders" → Works fine (few items)
// Query: "Find all completed orders" → Hits one overloaded partition

// ✅ GOOD: Composite GSI key with date sharding
const goodGSI = {
  IndexName: "StatusDateIndex",
  KeySchema: [
    // Composite key: status#date spreads status across many partitions
    { AttributeName: "statusDate", KeyType: "HASH" },  // "completed#2024-06-15"
    { AttributeName: "createdAt", KeyType: "RANGE" }
  ]
};
// Benefits:
// - Each day's completed orders in separate partition
// - Cardinality = statuses × days (365× improvement per year)
// - Query: "completed orders on 2024-06-15" → Efficient single partition

// ✅ ALTERNATIVE: Sharded status GSI for aggregation queries
const shardedGSI = {
  IndexName: "StatusShardIndex",
  KeySchema: [
    { AttributeName: "statusShard", KeyType: "HASH" }, // "completed#shard-7"
    { AttributeName: "createdAt", KeyType: "RANGE" }
  ]
};

// Writers and readers must agree on the shard count
const SHARD_COUNT = 10;

// Write logic: Assign random shard
function getStatusShard(status: string, shardCount = SHARD_COUNT): string {
  const shard = Math.floor(Math.random() * shardCount);
  return `${status}#shard-${shard}`;
}

// Read logic: Query all shards and merge
async function getOrdersByStatus(status: string, shardCount = SHARD_COUNT): Promise<Order[]> {
  const promises = Array.from({ length: shardCount }, (_, i) =>
    queryGSI("StatusShardIndex", `${status}#shard-${i}`)
  );
  const results = await Promise.all(promises);
  // ISO-8601 strings sort chronologically, so compare them as strings (newest first)
  return results.flat().sort((a, b) => b.createdAt.localeCompare(a.createdAt));
}
```

GSI keys can be attributes that don't exist on all items (sparse indexes), computed/composite values (status#date), or completely synthetic (hash-based shards). This flexibility is powerful. Use it to create GSIs with excellent distribution characteristics regardless of your base table's natural attributes.
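For the hash-based variant of a synthetic key, one option is to derive the shard from a stable attribute instead of a random number, so the same order always maps to the same shard. The sketch below uses an FNV-1a-style hash over UTF-16 code units; the choice of hash, the use of `orderId` as input, and the function names are assumptions made for illustration.

```typescript
const SHARD_TOTAL = 10; // assumed shard count; writers and readers must agree on it

// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Deterministic shard key: the same orderId always yields the same shard.
function shardedStatusKey(status: string, orderId: string, shardCount = SHARD_TOTAL): string {
  return `${status}#shard-${fnv1a(orderId) % shardCount}`;
}

// All partition key values a reader must query to cover one status.
function allShardKeys(status: string, shardCount = SHARD_TOTAL): string[] {
  return Array.from({ length: shardCount }, (_, i) => `${status}#shard-${i}`);
}

const key = shardedStatusKey("completed", "ORD-100");
console.log(key);
console.log(allShardKeys("completed").includes(key)); // true
```

Compared with random assignment, a deterministic shard can be recomputed from the item itself, which helps when an item is rewritten and should stay in the same index partition. The read side is unchanged: query every shard and merge.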
Querying GSIs follows the same patterns as querying base tables, with a few important distinctions. Let's examine common query patterns and their implementations.
```typescript
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import {
  DynamoDBDocumentClient,
  QueryCommand,
  BatchGetCommand,
} from "@aws-sdk/lib-dynamodb";

// Minimal item type assumed for these examples
type Order = Record<string, unknown>;

const dynamoDBClient = new DynamoDBClient({ region: "us-east-1" });
const docClient = DynamoDBDocumentClient.from(dynamoDBClient);

// ============================================
// Pattern 1: Simple GSI Query
// ============================================
// Find all orders for a product
async function getOrdersByProduct(productId: string): Promise<Order[]> {
  const result = await docClient.send(new QueryCommand({
    TableName: "Orders",
    IndexName: "ProductOrders-Index",  // Specify the GSI
    KeyConditionExpression: "productId = :pid",
    ExpressionAttributeValues: {
      ":pid": productId
    }
  }));
  return (result.Items ?? []) as Order[];
}

// ============================================
// Pattern 2: GSI with Sort Key Range
// ============================================
// Find orders for a product in a date range
async function getProductOrdersInRange(
  productId: string,
  startDate: string,
  endDate: string
): Promise<Order[]> {
  const result = await docClient.send(new QueryCommand({
    TableName: "Orders",
    IndexName: "ProductOrders-Index",
    KeyConditionExpression: "productId = :pid AND orderTimestamp BETWEEN :start AND :end",
    ExpressionAttributeValues: {
      ":pid": productId,
      ":start": startDate,
      ":end": endDate
    },
    ScanIndexForward: false  // Newest first
  }));
  return (result.Items ?? []) as Order[];
}

// ============================================
// Pattern 3: GSI Query with Filter
// ============================================
// Find pending orders for a product (filter on non-key attribute)
// The filter can only see attributes projected into the index, so this
// assumes 'status' is part of the index projection.
async function getPendingProductOrders(productId: string): Promise<Order[]> {
  const result = await docClient.send(new QueryCommand({
    TableName: "Orders",
    IndexName: "ProductOrders-Index",
    KeyConditionExpression: "productId = :pid",
    FilterExpression: "#status = :status",  // Post-query filter
    ExpressionAttributeNames: {
      "#status": "status"  // 'status' is reserved word
    },
    ExpressionAttributeValues: {
      ":pid": productId,
      ":status": "pending"
    }
  }));
  return (result.Items ?? []) as Order[];
}
// ⚠️ NOTE: FilterExpression filters AFTER reading items from GSI
// You're charged for all items read, not just those returned
// For frequent filtered queries, consider a more specific GSI

// ============================================
// Pattern 4: Fetch Full Item from Base Table
// ============================================
// When GSI uses KEYS_ONLY or INCLUDE projection, get full item
async function getFullOrderDetails(productId: string): Promise<Order[]> {
  // Step 1: Query GSI to get keys
  const gsiResult = await docClient.send(new QueryCommand({
    TableName: "Orders",
    IndexName: "ProductOrders-Index",
    KeyConditionExpression: "productId = :pid",
    ExpressionAttributeValues: { ":pid": productId }
  }));

  // Step 2: BatchGetItem from base table for full items
  if (!gsiResult.Items?.length) return [];

  const keys = gsiResult.Items.map(item => ({
    customerId: item.customerId,         // Base table PK
    orderTimestamp: item.orderTimestamp  // Base table SK
  }));

  // BatchGetItem accepts at most 100 keys per request, so fetch in chunks
  const orders: Order[] = [];
  for (let i = 0; i < keys.length; i += 100) {
    const batchResult = await docClient.send(new BatchGetCommand({
      RequestItems: {
        Orders: { Keys: keys.slice(i, i + 100) }
      }
    }));
    orders.push(...((batchResult.Responses?.Orders ?? []) as Order[]));
    // A production version should also retry batchResult.UnprocessedKeys
  }
  return orders;
}
```

A common misconception: FilterExpression does NOT reduce the items read from the index; it only filters what's returned to your application. You're charged for all items matching the key conditions, even if 99% are filtered out. If you find yourself filtering heavily, you probably need a better GSI design.
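The cost of heavy filtering is easy to quantify. Query consumption is based on the total size of the items read, rounded up to the next 4 KB, and eventually consistent reads (the only kind a GSI offers) cost half a read unit per 4 KB. The sketch below applies that rule to a single page of results; the item counts and sizes are invented, and `gsiQueryRcu` is a hypothetical helper.

```typescript
// Read capacity consumed by one eventually consistent Query page:
// sum the sizes of all items READ (before filtering), round up to 4 KB, then halve.
function gsiQueryRcu(itemSizesBytes: number[]): number {
  const totalBytes = itemSizesBytes.reduce((sum, size) => sum + size, 0);
  const fourKbBlocks = Math.ceil(totalBytes / 4096);
  return fourKbBlocks * 0.5;
}

// Assumed workload: 1,000 index entries of 400 bytes match the key condition,
// but the FilterExpression keeps only 10 of them.
const matchedByKey = new Array<number>(1000).fill(400);
const keptByFilter = new Array<number>(10).fill(400);

const chargedRcu = gsiQueryRcu(matchedByKey);          // what you pay
const rcuIfKeyWereExact = gsiQueryRcu(keptByFilter);   // what a more specific key would cost

console.log(chargedRcu, rcuIfKeyWereExact); // 49 0.5
```

In this made-up case the filter returns 1% of the items while the query pays roughly 100 times what a key that matched only the wanted items would cost.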
Understanding GSI limits helps avoid surprises during development and scaling. Here are the constraints you must design around:
| Limit | Value | Notes |
|---|---|---|
| GSIs per table | 20 (soft limit) | Can request increase via AWS Support |
| Projected attribute size | Item ≤ 400 KB after projection | Larger items cause write failure |
| GSI partition throughput | 3,000 RCU / 1,000 WCU per partition | Same as base table partitions |
| Strong consistency | Not available | GSI queries are eventually consistent only |
| Transactions | Cannot read from GSI in transaction | Only base table reads in transactions |
| Backfill time | Hours to days for large tables | Depends on table size and capacity |
| Key attribute types | String, Number, Binary only | Complex types cannot be keys |
When you create a GSI on an existing table with data, DynamoDB performs a background backfill. The table remains fully operational, but the GSI is in CREATING status until complete. For large tables (billions of items), this can take days. Plan GSI additions during low-traffic periods and monitor the creation progress via CloudWatch.
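Besides CloudWatch, backfill progress is visible in the table description: each index reports an `IndexStatus` and a `Backfilling` flag. The helper below inspects an object shaped like the `Table` portion of a DescribeTable response and lists the indexes that are not yet ready. The interfaces are trimmed to the fields used here, and `indexesNotReady` is a hypothetical name; fetching the description itself is left out.

```typescript
// Trimmed-down shapes of the fields read from a DescribeTable response.
interface GsiDescriptionLike {
  IndexName?: string;
  IndexStatus?: "CREATING" | "UPDATING" | "DELETING" | "ACTIVE";
  Backfilling?: boolean;
}
interface TableDescriptionLike {
  GlobalSecondaryIndexes?: GsiDescriptionLike[];
}

// Names of indexes that are still being built, changed, or backfilled.
function indexesNotReady(table: TableDescriptionLike): string[] {
  return (table.GlobalSecondaryIndexes ?? [])
    .filter((gsi) => gsi.IndexStatus !== "ACTIVE" || gsi.Backfilling === true)
    .map((gsi) => gsi.IndexName ?? "(unnamed)");
}

// Sample description, as it might look shortly after adding DateOrders-Index.
const described: TableDescriptionLike = {
  GlobalSecondaryIndexes: [
    { IndexName: "ProductOrders-Index", IndexStatus: "ACTIVE" },
    { IndexName: "DateOrders-Index", IndexStatus: "CREATING", Backfilling: true },
  ],
};

console.log(indexesNotReady(described)); // [ 'DateOrders-Index' ]
```

A deployment script could poll with a check like this and only switch queries over to the new index once the list is empty.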
Global Secondary Indexes are essential for building flexible DynamoDB applications. Let's consolidate the key insights:
Projections: KEYS_ONLY for minimal storage, INCLUDE for selected attributes, ALL for full flexibility at higher cost.
What's Next
With partition keys and GSIs covered, we turn to one of DynamoDB's most important trade-offs: Eventual vs Strong Consistency. Understanding when to use each consistency level—and what happens when you choose wrong—is crucial for building systems that behave correctly while maintaining the performance DynamoDB is known for.
You now understand Global Secondary Indexes—how they work, their costs and constraints, and patterns for effective use. You can design GSIs for diverse access patterns, use sparse indexes for efficiency, and avoid the write amplification and hot partition traps that catch many DynamoDB users.