How does a database know where to find a specific record on a page? If records can be variable-length and their positions can change during compaction, how do indexes maintain valid pointers? The answer lies in one of the most elegant abstractions in database design: the slot directory.

The slot directory is a level of indirection between logical record identifiers and physical byte offsets. This seemingly simple concept—an array of pointers—enables the entire machinery of efficient record management, including in-place updates, garbage collection, and stable external references. Without the slot directory, databases would face a choice between rigid fixed-length records and chaotic, fragile direct pointers.
By the end of this page, you will understand the complete mechanics of slot directories, including their structure, how they support record addressing, the relationship between slot numbers and tuple IDs (TIDs/RIDs), and how they enable compaction without invalidating external references. You'll also explore edge cases and implementation variations across major database systems.
Before understanding the solution, let's deeply understand the problem. Consider a page containing multiple records of varying sizes.

The Direct Addressing Problem

Imagine we simply stored records contiguously and addressed them by byte offset:

```
Page with direct addressing:
┌──────────────────────────────────────────┐
│ Offset 24:  Record A (100 bytes)         │
│ Offset 124: Record B (250 bytes)         │
│ Offset 374: Record C (75 bytes)          │
│ Offset 449: Record D (300 bytes)         │
└──────────────────────────────────────────┘

Index entry for Record C: (PageID=47, Offset=374)
```

Now, what happens when we delete Record B?
The Cascading Disaster

With compaction, Records C and D move to new offsets:
- Record C: 374 → 124
- Record D: 449 → 199

But the index entry for Record C still says offset 374! Every reference to these records is now invalid. We face an impossible choice:

1. Never compact — accept permanent space loss and eventual page exhaustion
2. Update all references — scan all indexes, all foreign-key chains, all cached pointers; prohibitively expensive
3. Use fixed-length records — fixed slot positions become possible, but the format is inflexible and wasteful
4. Add an indirection layer — the slot directory solution
Computer science has a famous saying: 'All problems in computer science can be solved by another level of indirection.' The slot directory is a prime example. By inserting a stable layer between references and data, we gain the flexibility to reorganize data without breaking references.
The slot directory is an array of fixed-size entries, each containing information about one record on the page. The array typically grows from just after the page header toward the center of the page.
Slot Entry Contents

Each slot entry contains at minimum:

| Field | Size | Purpose |
|-------|------|---------|
| Offset | 2-4 bytes | Physical byte offset of the record within the page |
| Length | 2 bytes | Size of the record in bytes (optional; can be derived from the next slot's offset) |
| Flags | 1-2 bytes | Status bits: in-use, dead, redirected, etc. |

Total slot entry size is typically 4-8 bytes per slot.
Key Observations

1. Slot numbers are stable — slot 0 is always slot 0, regardless of where its record moves
2. Physical offsets can change — during compaction, records move, but slot entries are updated accordingly
3. Dead slots remain — a deleted record's slot is marked DEAD; the entry persists but points nowhere
4. Records are not ordered — record storage order need not match slot order
5. External references use (PageID, SlotNum) — this pair is called a TID (tuple ID) or RID (row ID)
Dead slots cannot be immediately reused because slot numbers may be referenced externally (in indexes, by other transactions). Reusing a slot number could cause an old reference to reach a new, unrelated record. Slot reuse requires careful coordination—typically only after all potential references are guaranteed gone (e.g., after vacuum in PostgreSQL).
The combination of page number and slot number forms the Tuple Identifier (TID) or Row Identifier (RID)—the universal way to refer to a specific record in the database.

TID Format

```
TID = (BlockNumber, OffsetNumber)
       \__________/ \__________/
         Page ID     Slot Number
        (4-6 bytes)   (2 bytes)

Example: (47, 3) means "Page 47, Slot 3"
```

Where TIDs Appear

TIDs are used throughout the database system:
| Context | How TID Is Used | Example |
|---|---|---|
| B-tree Index Leaf | Index entry contains key + TID pointing to heap | (key='Smith', TID=(47,3)) |
| Index-Only Scan | TID checked against visibility map | If page 47 all-visible, skip heap fetch |
| MVCC Tuple Chains | Old versions point to newer via TID | Tuple at (47,3) → successor at (47,15) |
| Foreign Key Checks | Internal FK validation uses TIDs | Child row references parent at (100,7) |
| CTID System Column | PostgreSQL exposes TID as 'ctid' pseudo-column | SELECT ctid FROM orders → (47,3) |
| HOT Chain | Heap-Only Tuple updates chain via TID | (47,3) → (47,12) → (47,18) on same page |
| Vacuum/Autovacuum | Dead TIDs collected for cleanup | Remove index entries pointing to dead TIDs |
TID Stability Guarantees

The slot directory ensures that a TID remains valid as long as:
1. The record exists (live, or dead but not yet vacuumed)
2. The slot number has not been reused

This stability is critical for:
- Index consistency — index entries don't need updates when heap records move within a page
- Cursor stability — open cursors can rely on TIDs to refetch rows
- Concurrency — transactions can hold TID references without locking
While TIDs are stable during a row's lifetime, they can change after VACUUM FULL, CLUSTER, or similar reorganization operations that move rows between pages. Applications should never store TIDs externally as permanent keys—use primary keys instead. TIDs are for internal database use.
Let's examine how the slot directory is manipulated during common operations:
INSERT: Allocating a New Slot

```
Operation: INSERT new 150-byte record

Before:
  Lower = 40   (24-byte header + 4 slots × 4 bytes/slot)
  Upper = 7000 (first free byte before records)
  Free space = 7000 - 40 = 6960 bytes

Step 1: Check space
  Need: 150 (record) + 4 (new slot) = 154 bytes
  Have: 6960 bytes ✓

Step 2: Allocate record space
  New Upper = 7000 - 150 = 6850
  Copy record to offset 6850

Step 3: Allocate slot
  New slot 4 at offset 40
  Slot 4 = {offset=6850, length=150, flags=USED}
  New Lower = 40 + 4 = 44

Step 4: Update header
  Lower: 40 → 44
  Upper: 7000 → 6850

After:
  Free space = 6850 - 44 = 6806 bytes
  New TID = (PageID, 4)
```
DELETE: Marking a Slot Dead

```
Operation: DELETE record at slot 2

Before:
  Slot 2 = {offset=7500, length=200, flags=USED}

Step 1: Mark slot dead
  Slot 2 = {offset=7500, length=200, flags=DEAD}

Step 2: (Optional) Clear record content
  Some systems zero out the record bytes for security

Note: Lower and Upper unchanged!
  The 200 bytes at offset 7500 are now dead space.
  They cannot be reclaimed until:
    - page compaction occurs, OR
    - vacuum frees the slot for reuse

After:
  Contiguous free space unchanged (Upper - Lower)
  Reclaimable space = contiguous free space + 200 dead bytes
  Dead slot remains to preserve TID stability
```
UPDATE: Modifying in Place or Chaining Versions

Updates are complex because:
- If the new record fits in the old record's space: an in-place update is possible
- If the new record is larger: delete old + insert new (potentially on a different page!)
- With MVCC: the old version is retained and a new version is added

```
Scenario A: In-Place Update (record shrinks or stays the same size)
  Slot 3 = {offset=7200, length=100, flags=USED}
  Update to an 80-byte record

  Result:
    Copy new 80 bytes to offset 7200
    Slot 3 = {offset=7200, length=80, flags=USED}
    20 bytes at 7280-7299 become internal fragmentation

Scenario B: MVCC Update (PostgreSQL style)
  Old tuple at Slot 3; a new version is needed

  Result:
    Insert new tuple → gets Slot 7 at offset 6500
    Old tuple's t_ctid points to (PageID, 7)
    Slot 3 stays readable for concurrent transactions; once no
    transaction can see the old version, it is marked DEAD:
    Slot 3 = {offset=7200, length=100, flags=DEAD, chain→7}
```
PostgreSQL's Heap-Only Tuple (HOT) optimization creates update chains entirely within one page. If an update doesn't change indexed columns, the new version is added to the same page, linked from the old slot. Index entries remain valid (pointing to the chain head), avoiding expensive index updates. HOT can dramatically reduce index maintenance overhead for update-heavy workloads.
Compaction (also called page pruning or defragmentation) reclaims the fragmented space left by deletions and updates. The slot directory makes this safe by providing a stable indirection layer.
Compaction Algorithm

```
Page Compaction Procedure:

1. LOCK page exclusively

2. SCAN slot directory, build list of live records:
     live_records = []
     for slot in slots:
         if slot.flags == LIVE:
             live_records.append({slot_num, slot.offset, slot.length})

3. SORT live records by offset (highest first = closest to page end)

4. COMPACT records toward the page end:
     new_offset = PAGE_SIZE        // start from the end
     for record in live_records (sorted):
         new_offset -= record.length
         memmove(page + new_offset, page + record.offset, record.length)
         slots[record.slot_num].offset = new_offset

5. UPDATE header:
     Upper = new_offset            // new boundary of the record region
     // Lower unchanged (slot directory is the same size)

6. ZERO dead space (security, debugging):
     memset(page + Lower, 0, Upper - Lower)

7. WRITE WAL record (full-page image for safety)

8. RELEASE lock
```

Why slot numbers don't change:

The critical insight is that we only update the offset field in each slot entry. The slot number (the index in the array) remains constant. Any external reference using (PageID, SlotNumber) remains valid—it now resolves to the record's new physical location.
Databases don't compact on every deletion. Common triggers include: (1) INSERT fails due to fragmented but reclaimable space, (2) Explicit VACUUM or maintenance command, (3) Dead tuple threshold exceeded, (4) Background page cleaner process. Eager compaction wastes CPU on pages that may see more deletions; lazy compaction may leave pages inefficiently used longer.
While the slot directory concept is universal, implementations vary across database systems:
| Database | Slot Entry Name | Entry Size | Special Features |
|---|---|---|---|
| PostgreSQL | ItemId (Line Pointer) | 4 bytes | 15-bit offset, 2-bit flags, 15-bit length; redirect chains for HOT |
| MySQL InnoDB | Record Directory | Variable | Implicit in record headers; uses next-record pointers |
| Oracle | Row Directory | 2 bytes/entry | Offset only; row header contains length |
| SQL Server | Slot Array | 2 bytes/slot | Grows from page end; offset to record start |
| SQLite | Cell Pointer Array | 2 bytes/cell | Sorted by key for B-tree pages; unsorted for others |
PostgreSQL's ItemId in Detail

PostgreSQL's line pointer (ItemId) is a 4-byte packed structure:

```c
typedef struct ItemIdData {
    unsigned lp_off:15,    // Offset to tuple (0-32767)
             lp_flags:2,   // LP_UNUSED, LP_NORMAL, LP_REDIRECT, LP_DEAD
             lp_len:15;    // Tuple length in bytes
} ItemIdData;
```

The lp_flags field enables sophisticated behaviors:
- LP_UNUSED — slot is free for reuse
- LP_NORMAL — points to a regular tuple
- LP_REDIRECT — points to another slot (for HOT chains)
- LP_DEAD — tuple is dead, awaiting vacuum

InnoDB's Implicit Directory

InnoDB doesn't keep a dense per-record slot directory. Instead:
- Records form a singly-linked list via "next record" pointers in each record header
- The page header contains a pointer to the first record
- Finding a record requires traversing the list (but the B-tree structure usually navigates directly)
- Infimum/supremum pseudo-records mark the list boundaries
PostgreSQL's explicit directory enables O(1) access to any slot but consumes 4 bytes per record. InnoDB's linked-list approach saves per-record overhead but makes random slot access O(n). Since InnoDB pages are primarily accessed via B-tree traversal (which leads directly to records), the linked list is acceptable. PostgreSQL's heap tables benefit from direct slot access for sequential scans and TID lookups.
Real-world slot directory management involves numerous edge cases:
The Slot Bloat Problem

Consider a pattern where a page sees many insert-delete cycles:

```
Initial: 100 records, 100 slots
Delete all 100 records: 0 live records, 100 dead slots
Insert 100 new records: 100 live records, 200 total slots
Repeat many times...

After N cycles: ~100 live records, ~N×100 slot entries
```

The slot directory grows without bound because dead slots persist until vacuum. Solutions:
1. Aggressive vacuum — reclaim dead slots promptly
2. Page rewrite — VACUUM FULL rewrites pages with fresh slot directories
3. Slot reuse policy — some systems reuse slot numbers more aggressively
In PostgreSQL, you can query pg_stat_user_tables for dead-tuple counts and bloat estimates. High n_dead_tup values indicate vacuum isn't keeping pace. The pageinspect extension lets you examine actual slot directory contents: `SELECT lp, lp_flags, lp_off, lp_len FROM heap_page_items(get_raw_page('tablename', 0));`
The slot directory is a deceptively simple structure with profound implications: a few bytes of indirection per record buy stable external references, safe in-page reorganization, and flexible variable-length storage.
What's Next:

Now that you understand how records are addressed within pages, we'll examine the records themselves. How are individual column values laid out within a record? What about NULL handling, variable-length fields, and type encoding? The next page explores record layout in detail.
You now understand the slot directory—the indirection layer that makes variable-length record management possible. This knowledge is essential for understanding MVCC, index operations, vacuum processes, and storage optimization. Next, we'll dive into how records themselves are structured.