When you have a collection of records to store, the most intuitive approach is often the simplest: just append each new record to the end of the file. This deceptively straightforward strategy—known as heap file organization—forms the foundation upon which all other file organization methods build.
Despite its simplicity, heap file organization is far from obsolete. It remains the default organization in most database systems and is often the optimal choice for specific workloads. Understanding heap files deeply is essential for making informed decisions about data organization in production systems.
By the end of this page, you will understand the internal structure of heap files, analyze the asymptotic performance of fundamental operations, appreciate the trade-offs that make heap files suitable for certain workloads, and recognize when alternative organizations become necessary.
A heap file (also called an unordered file or pile file) is a file organization method where records are stored in no particular order. New records are inserted at the end of the file (or in the first available space if deletions have occurred), without any consideration for the values they contain.
The term "heap" comes from the data structure concept of a heap—not the balanced tree structure, but rather the general notion of an unstructured pile of items. Just as you might toss items onto a heap without organizing them, records in a heap file accumulate without internal ordering.
Formal Definition:
A heap file H is a sequence of pages P₁, P₂, ..., Pₙ where each page holds a set of records, the assignment of records to pages is independent of the records' attribute values, and a new record may be placed in any page with sufficient free space (by default, the last page Pₙ).
Don't confuse heap files with heap data structures (priority queues). The heap file 'heap' refers to an unorganized collection, not the tree-based structure that maintains partial ordering. This naming confusion is historical and unfortunate, but the context (file organization vs. data structures) usually makes the meaning clear.
Key Characteristics:
No Ordering Constraint: Records are not sorted by any attribute, nor are they clustered by value. The physical position of a record has no relationship to its content.
Append-Oriented Insertion: The default insertion strategy adds records at the end of the file, making inserts very fast.
Space Reclamation: When records are deleted, space may be reclaimed immediately, marked for reuse, or left as fragmentation (implementation-dependent).
Full Scan Requirement: Finding a specific record by value requires scanning the entire file (in the worst case) because there's no ordering to exploit.
Minimal Metadata: Heap files require very little auxiliary structure—just enough to track pages and free space.
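These characteristics can be demonstrated with a deliberately naive in-memory sketch (names like `NaiveHeapFile` are hypothetical; a real heap file operates on disk pages):

```python
# Naive in-memory heap file: O(1) append, O(n) search.
# Hypothetical sketch; real heap files operate on fixed-size disk pages.

RECORDS_PER_PAGE = 4  # tiny pages, for illustration

class NaiveHeapFile:
    def __init__(self):
        self.pages = [[]]  # list of pages; each page is a list of records

    def insert(self, record):
        """Append to the last page; O(1) -- no ordering is maintained."""
        if len(self.pages[-1]) >= RECORDS_PER_PAGE:
            self.pages.append([])
        self.pages[-1].append(record)
        return (len(self.pages) - 1, len(self.pages[-1]) - 1)  # RID

    def search(self, predicate):
        """Scan every page; O(n) -- there is no ordering to exploit."""
        pages_read, hits = 0, []
        for page in self.pages:
            pages_read += 1
            hits.extend(r for r in page if predicate(r))
        return hits, pages_read

f = NaiveHeapFile()
for i in [7, 3, 9, 1, 5, 2]:
    f.insert({"id": i})
hits, pages_read = f.search(lambda r: r["id"] == 1)
print(hits, pages_read)  # every page is read, even after the match is found
```

Note that `search` reads every page even when the predicate could match at most one record; that is exactly the "full scan requirement" above.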
Understanding the physical layout of heap files is crucial for analyzing their performance characteristics. A heap file consists of several key components:
1. File Header
Every heap file begins with a header page (or header block) containing metadata:
```
// Heap File Header Structure
struct HeapFileHeader {
    int32 magic_number;        // File type identifier (e.g., 0x48454150, "HEAP" in ASCII)
    int32 version;             // File format version
    int64 num_pages;           // Total pages allocated
    int64 num_records;         // Total records (may be approximate)
    int64 first_data_page;     // Page number of first data page
    int64 last_data_page;      // Page number of last data page
    int64 free_space_map_page; // Page number of free space map (if used)
    int32 page_size;           // Size of each page in bytes
    byte  reserved[200];       // Reserved for future use
};

// Example: 8KB pages, 4-byte magic number, remaining fields as shown.
// The header typically occupies exactly one page (8192 bytes).
```

2. Data Pages
The bulk of a heap file consists of data pages, each containing actual records. Pages are the unit of I/O transfer between disk and memory—they match the disk block size for efficiency (typically 4KB, 8KB, or 16KB).
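Before diving into the byte-level layout, the slot-directory idea that the structures below formalize can be mimicked in a few lines of Python (a hypothetical sketch; a real page is a single byte array with the directory packed at its tail):

```python
# Minimal slotted page: record bytes packed from the front, slot
# directory entries (offset, length) kept separately. Hypothetical
# sketch; real pages interleave both inside one fixed-size byte array.

PAGE_SIZE = 64
DELETED = -1

class SlottedPage:
    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.slots = []        # (offset, length); one entry per insert
        self.free_start = 0    # where the next record's bytes go

    def free_space(self):
        # charge 4 bytes of bookkeeping per directory entry
        return PAGE_SIZE - self.free_start - 4 * len(self.slots)

    def add_record(self, rec: bytes):
        if self.free_space() < len(rec) + 4:
            return None  # page full
        off = self.free_start
        self.data[off:off + len(rec)] = rec
        self.slots.append((off, len(rec)))
        self.free_start += len(rec)
        return len(self.slots) - 1  # slot number: second half of a RID

    def get_record(self, slot_num):
        off, length = self.slots[slot_num]
        return None if off == DELETED else bytes(self.data[off:off + length])

    def delete_record(self, slot_num):
        self.slots[slot_num] = (DELETED, 0)  # slot stays, so RIDs stay stable
```

The key design point the sketch preserves: deleting a record tombstones its slot rather than removing it, so the slot numbers of all other records (and therefore their RIDs) never change.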
Each data page has its own internal structure:
```
// Heap Page Layout (Slotted Page)
struct HeapPage {
    // Header section (fixed position at page start)
    struct PageHeader {
        int32 page_id;          // Unique page identifier
        int32 num_slots;        // Number of slots in directory
        int32 free_space_start; // Offset where free space begins
        int32 free_space_end;   // Offset where slot directory starts
        int16 flags;            // Page flags (e.g., is_full, is_compacted)
        int16 reserved;
    } header;

    // Free space region:
    // record data grows from just after the header toward the page end;
    // the slot directory grows from the page end back toward the header.
    byte free_space[];

    // Slot directory (grows backward from page end).
    // Each slot stores (offset, length).
    struct Slot {
        int16 offset; // Offset of record from page start (-1 if deleted)
        int16 length; // Length of record
    } slots[];        // slots[0] is closest to the page end
};

// Record Identifier (RID)
struct RID {
    int64 page_id;  // Which page contains the record
    int32 slot_num; // Which slot within the page
};

/*
 * Visual layout of a slotted page:
 *
 * ┌─────────────────────────────────────────────┐
 * │ Page Header (page_id, num_slots, etc.)      │
 * ├─────────────────────────────────────────────┤
 * │ Record 3 Data                               │
 * │ Record 1 Data                               │
 * │ (Deleted - now free space)                  │
 * │ Record 2 Data                               │
 * │                                             │
 * │ ═══════════ FREE SPACE ═══════════════════  │
 * │                                             │
 * ├─────────────────────────────────────────────┤
 * │ Slot 3 │ Slot 2 │ Slot 1 │ Slot 0 │← Dir    │
 * └─────────────────────────────────────────────┘
 */
```

3. Free Space Management
Managing free space efficiently is one of the main challenges in heap file implementation. Three common approaches exist:
Approach A: Linked List of Free Pages — the file header points to the first page with free space, and each such page points to the next; an insert walks the list until it finds room.
Approach B: Page Directory (Free Space Map) — a dedicated structure records (approximately) how much free space each page has, so an insert can jump straight to a suitable page.
Approach C: Free Space Bitmap — a few bits per page encode a coarse free-space category; compact enough to cache, at the cost of precision.
| Approach | Space Overhead | Insert Performance | Suitable For |
|---|---|---|---|
| Linked List | O(1) per page | O(n) worst case | Small files, uniform records |
| Page Directory | O(n/k) pages | O(1) to O(n/k) | Large files, variable records |
| Free Space Bitmap | O(n/8) bytes | O(n/block) scan | Very large files, approximate |
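Approach B can be sketched as a simple per-page map (hypothetical names; real systems store coarse free-space categories rather than exact byte counts):

```python
# Free space map sketch: tracks approximate free bytes per page and
# answers "which page can hold k bytes?" Hypothetical in-memory version.

PAGE_CAPACITY = 8192

class FreeSpaceMap:
    def __init__(self):
        self.free = {}  # page_id -> free bytes (approximate)

    def add_page(self, page_id, free_bytes=PAGE_CAPACITY):
        self.free[page_id] = free_bytes

    def update(self, page_id, free_bytes):
        self.free[page_id] = free_bytes

    def find_page_with_space(self, needed):
        """Return any page with enough room, or None (caller allocates)."""
        for page_id, avail in self.free.items():
            if avail >= needed:
                return page_id
        return None

fsm = FreeSpaceMap()
fsm.add_page(0, 100)
fsm.add_page(1, 5000)
print(fsm.find_page_with_space(1200))  # page 1 is the first that qualifies
```

A linear scan over the dict is O(n) in the number of pages; PostgreSQL's free space map instead arranges the entries in a tree so the lookup cost grows only logarithmically with file size.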
Let's analyze the fundamental operations on heap files with rigorous complexity analysis. Understanding these operations is essential for evaluating when heap organization is appropriate.
INSERT Operation
Inserting a record into a heap file is straightforward:
Complexity Analysis: O(1). Locating a page with free space is constant time with a free space map, and appending the record to that page is constant time.
Cost in I/O: typically 1-2 page I/Os — one read of the target page (possibly preceded by a free space map lookup) plus one write when the dirty page is flushed.
```
function insert(HeapFile file, Record record) -> RID {
    // Step 1: Calculate space needed
    int space_needed = sizeof(record) + sizeof(Slot);

    // Step 2: Find page with sufficient space
    PageID page_id = file.free_space_map.find_page_with_space(space_needed);
    if (page_id == NONE) {
        // No existing page has space; allocate new page
        page_id = file.allocate_new_page();
        file.free_space_map.add_page(page_id, PAGE_SIZE - HEADER_SIZE);
    }

    // Step 3: Load the page into buffer pool
    Page page = buffer_pool.get_page(file.file_id, page_id);

    // Step 4: Insert record into page
    int slot_num = page.add_record(record);

    // Step 5: Update free space map
    file.free_space_map.update(page_id, page.free_space());

    // Step 6: Mark page dirty and return RID
    buffer_pool.mark_dirty(page);
    return RID(page_id, slot_num);
}

function Page.add_record(Record record) -> int {
    // Find next available slot
    int slot_num = this.header.num_slots;

    // Record data is placed at the start of the free space region
    int record_offset = this.header.free_space_start;

    // Write record data
    memcpy(this.data + record_offset, record.data, record.length);

    // Update slot directory (grows from the page end toward the header)
    this.slots[slot_num] = Slot(record_offset, record.length);

    // Update header
    this.header.num_slots++;
    this.header.free_space_start += record.length;
    this.header.free_space_end -= sizeof(Slot);

    return slot_num;
}
```

SEARCH Operation
Searching in a heap file—finding records that match a given predicate—is the primary weakness of this organization:
Equality Search (e.g., WHERE id = 42): scan pages in order, checking each record against the predicate. If the attribute is unique, the scan can stop at the first match, touching B/2 pages on average; if duplicates are possible, or no match exists, all B pages must be read.
Range Search (e.g., WHERE price BETWEEN 10 AND 100): any record anywhere in the file could qualify, so all B pages must always be scanned.
Cost in I/O: B/2 on average for a unique-key equality match; B in the worst case and for every range search.
This O(n) search cost is the fundamental limitation of heap files. For any table where you frequently search by a specific attribute, you must either: (1) accept full table scans, (2) build an index on that attribute, or (3) use a different file organization.
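The scan-versus-index decision can be made quantitative with a back-of-the-envelope model. The latencies below (10 ms per random page read, 0.1 ms sequential) are the illustrative figures used elsewhere on this page, and the B-tree height of 3 is an assumed typical value, not a measurement:

```python
# Back-of-the-envelope: full heap scan vs. index lookups.
# Latency figures are illustrative, not measured.

SEQ_MS = 0.1    # sequential page read
RAND_MS = 10.0  # random page read

def scan_time_ms(num_pages: int) -> float:
    """Full table scan: read every page sequentially."""
    return num_pages * SEQ_MS

def index_time_ms(matches: int, btree_height: int = 3) -> float:
    """Index lookup: traverse the index, then one random heap I/O per match."""
    return (btree_height + matches) * RAND_MS

pages = 250_000  # e.g., 10M records at 40 records per page
for matches in (1, 100, 10_000):
    scan, idx = scan_time_ms(pages), index_time_ms(matches)
    winner = "index" if idx < scan else "scan"
    print(f"{matches:>6} matches: scan {scan:,.0f} ms vs index {idx:,.0f} ms -> {winner}")
```

The crossover is the point to notice: for highly selective predicates the index wins by orders of magnitude, but once a query matches thousands of rows, the sequential scan's cheap per-page cost makes the "dumb" heap scan the faster plan.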
DELETE Operation
Deleting a record identified by its RID proceeds as follows: load the page named by the RID, mark the record's slot as deleted, and update the free space map to reflect the reclaimed bytes.
Complexity: O(1) CPU work and a single page read (plus the eventual write-back), since the RID identifies the page and slot directly — no search is needed.
Space Reclamation Strategies:
Immediate Compaction: Shift remaining records to eliminate gap. Pro: No fragmentation. Con: Expensive for each delete.
Lazy Reclamation: Mark slot as deleted, reuse space later. Pro: Fast deletes. Con: Internal fragmentation.
Periodic Vacuum: Batch compaction during maintenance windows. Pro: Amortized cost. Con: Temporary space waste.
```
function delete_by_rid(HeapFile file, RID rid) -> bool {
    // Step 1: Load the page
    Page page = buffer_pool.get_page(file.file_id, rid.page_id);

    // Step 2: Validate slot exists and is not already deleted
    if (rid.slot_num >= page.header.num_slots) {
        return false; // Invalid slot number
    }
    Slot slot = page.slots[rid.slot_num];
    if (slot.offset == DELETED_MARKER) {
        return false; // Already deleted
    }

    // Step 3: Mark slot as deleted
    int freed_space = slot.length;
    page.slots[rid.slot_num].offset = DELETED_MARKER;
    page.slots[rid.slot_num].length = 0;

    // Step 4: Update page statistics
    page.header.deleted_count++;
    page.header.fragmented_bytes += freed_space;

    // Step 5: Optionally compact if fragmentation exceeds threshold
    if (page.fragmentation_ratio() > COMPACTION_THRESHOLD) {
        page.compact(); // Shift records to eliminate gaps
    }

    // Step 6: Update free space map
    file.free_space_map.update(rid.page_id, page.total_free_space());
    buffer_pool.mark_dirty(page);
    return true;
}

function Page.compact() {
    // Create temporary buffer
    byte temp[PAGE_SIZE];
    int write_offset = sizeof(PageHeader);

    // Copy non-deleted records contiguously
    for (int i = 0; i < this.header.num_slots; i++) {
        if (this.slots[i].offset != DELETED_MARKER) {
            int len = this.slots[i].length;
            memcpy(temp + write_offset, this.data + this.slots[i].offset, len);
            this.slots[i].offset = write_offset;
            write_offset += len;
        }
    }

    // Copy compacted data back
    memcpy(this.data + sizeof(PageHeader),
           temp + sizeof(PageHeader),
           write_offset - sizeof(PageHeader));
    this.header.free_space_start = write_offset;
    this.header.fragmented_bytes = 0;
}
```

UPDATE Operation
Updating a record has two cases:
Case 1: Update in Place (record size unchanged or smaller): overwrite the record's bytes within its existing slot. Cost is one page read plus one write, and the RID is unchanged.
Case 2: Record Size Changes: if the record grows but still fits on its page (possibly after compaction), it is rewritten locally and the RID survives. If it no longer fits, the record must be deleted and reinserted on another page, which changes its RID.
The potential for RID changes when records grow is a significant complication, because indexes and other references store RIDs. Systems handle this through: forwarding pointers (a small stub left at the old RID points to the new location), updating every index entry that references the moved record, or reserving slack space on each page for in-place growth (e.g., Oracle's PCTFREE).
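The forwarding-pointer technique can be sketched in Python. Names (`HeapSim`, etc.) are hypothetical, and real systems work on byte-addressed pages rather than Python lists:

```python
# Sketch of forwarding pointers for records that outgrow their page.
# When an updated record no longer fits, the old slot becomes a "fwd"
# stub, so the original RID stays valid for indexes. Hypothetical names.

PAGE_CAPACITY = 100  # bytes of record data per page (tiny, for illustration)

class HeapSim:
    def __init__(self):
        # Each page is a list of slots; a slot is ("rec", data) or ("fwd", rid).
        self.pages = [[]]

    def _used(self, pid):
        return sum(len(s[1]) for s in self.pages[pid] if s[0] == "rec")

    def insert(self, data):
        for pid in range(len(self.pages)):
            if self._used(pid) + len(data) <= PAGE_CAPACITY:
                self.pages[pid].append(("rec", data))
                return (pid, len(self.pages[pid]) - 1)
        self.pages.append([("rec", data)])       # no room anywhere: new page
        return (len(self.pages) - 1, 0)

    def update(self, rid, new_data):
        pid, slot = rid
        kind, payload = self.pages[pid][slot]
        if kind == "fwd":
            self.update(payload, new_data)       # follow the stub
            return
        if self._used(pid) - len(payload) + len(new_data) <= PAGE_CAPACITY:
            self.pages[pid][slot] = ("rec", new_data)  # update in place
        else:
            new_rid = self.insert(new_data)            # relocate the record
            self.pages[pid][slot] = ("fwd", new_rid)   # leave a stub behind

    def fetch(self, rid):
        kind, payload = self.pages[rid[0]][rid[1]]
        return self.fetch(payload) if kind == "fwd" else payload
```

Because the stub stays at the original RID, index entries never need to change when a record moves; the price is one extra page access when fetching a forwarded record. SQL Server's heap "forwarding rows", mentioned later in this page, work on this principle.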
Let's establish a rigorous performance model for heap files. This analysis forms the baseline against which we'll compare other file organizations.
Notation: let n be the number of records in the file, R the number of records that fit on one page, and B the number of pages the file occupies.
Derived Relationships: B = ⌈n / R⌉, so a full scan costs B page I/Os and an average equality search on a unique attribute costs B/2.
| Operation | I/O Cost (Pages) | CPU Cost | Notes |
|---|---|---|---|
| Insert | 1-2 | O(1) | Find page + write page |
| Delete (by RID) | 1 | O(1) | Direct page access |
| Delete (by value) | B/2 avg, B worst | O(n) | Must search first |
| Search (equality) | B/2 avg, B worst | O(n) | Linear scan |
| Search (range) | B | O(n) | Must scan all |
| Full scan | B | O(n) | Sequential I/O |
Concrete Example:
Consider a customer table with the following characteristics: 10 million records of roughly 200 bytes each, stored on 8 KB pages, giving about R = 40 records per page.
Calculations: B = ⌈10,000,000 / 40⌉ = 250,000 pages. A full scan reads all 250,000 pages; an average equality search on a unique key reads B/2 = 125,000 pages.
Search Cost: at 10 ms per random page I/O, the average equality search costs 125,000 × 10 ms = 1,250 seconds — roughly 21 minutes for a single lookup.
This analysis demonstrates why heap files alone are impractical for search-heavy workloads.
The silver lining of heap file scans is that they're sequential I/O—reading contiguous pages from disk. Sequential I/O can be 100x faster than random I/O (10 ms random vs 0.1 ms sequential per page). Thus, full table scans on heap files are much faster than the naive calculation suggests when pages are physically contiguous.
Revised Analysis with Sequential I/O:
With sequential access optimization, the same 250,000-page file scans at 0.1 ms per page: 250,000 × 0.1 ms = 25 seconds for a full scan, or about 12.5 seconds for an average equality search.
This is still slow, but far better than 21 minutes. Modern systems exploit: read-ahead prefetching (fetching upcoming pages before they are requested), multi-page I/O requests that amortize per-call overhead, parallel scans split across worker threads, and buffer pool caching of frequently accessed pages.
With all optimizations: a full scan of 10M records might take 2-5 seconds on modern hardware.
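The arithmetic above can be packaged into a small cost model. The record size, page size, and latency constants are the illustrative figures from this page, not measurements:

```python
# Tiny I/O cost model for heap file scans, using this page's
# illustrative latencies: 10 ms per random page read, 0.1 ms sequential.

import math

def heap_scan_cost_s(num_records, record_bytes=200, page_bytes=8192,
                     ms_per_page=0.1, fraction=1.0):
    """Seconds to read `fraction` of the heap's pages at `ms_per_page` each."""
    records_per_page = page_bytes // record_bytes
    pages = math.ceil(num_records / records_per_page)
    return pages * fraction * ms_per_page / 1000

n = 10_000_000
full_seq = heap_scan_cost_s(n)                                  # sequential full scan
avg_rand = heap_scan_cost_s(n, ms_per_page=10, fraction=0.5)    # random, avg equality search

print(f"pages:                {math.ceil(n / (8192 // 200)):,}")
print(f"sequential full scan: {full_seq:.0f} s")
print(f"random avg search:    {avg_rand / 60:.0f} min")
```

Running it reproduces the numbers in the text: 250,000 pages, a 25-second sequential full scan, and about 21 minutes for the average random-I/O search.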
Despite the search performance limitations, heap files are the optimal choice in several scenarios: bulk loading, where append-only inserts run at maximum speed; analytics and ETL staging tables that are always read in full anyway; small tables where even a full scan touches only a handful of pages; and tables accessed exclusively through secondary indexes, where the heap serves purely as the record store.
PostgreSQL uses heap organization as the default table storage (called 'heap tables'). Every table without explicit organization is a heap file. B-tree indexes point to heap tuple locations (TID = tuple identifier). This separation of data storage (heap) and access paths (indexes) provides flexibility: you can have zero, one, or many indexes on the same heap file.
Production implementations of heap files must address several practical challenges that the theoretical model glosses over:
1. Concurrency Control
Multiple transactions may attempt to access the same page simultaneously. Implementation must ensure: that two concurrent inserts never claim the same slot, that readers never observe a half-written record or inconsistent slot directory, and that page latches are held only for the few instructions needed to modify the page.
```
function concurrent_insert(HeapFile file, Record record) -> RID {
    while (true) {
        // Step 1: Find candidate page (optimistic)
        PageID candidate = file.free_space_map.find_candidate(record.size);

        // Step 2: Pin and latch the page
        Page page = buffer_pool.pin_page(file.file_id, candidate);
        page.acquire_exclusive_latch(); // Short-term lock

        try {
            // Step 3: Verify there's still space (could have filled since check)
            if (page.free_space() >= record.size + SLOT_SIZE) {
                int slot_num = page.add_record(record);
                buffer_pool.mark_dirty(page);
                return RID(candidate, slot_num);
            }
            // Not enough space; fall through and retry with a different page
        } finally {
            page.release_exclusive_latch();
            buffer_pool.unpin_page(page);
        }

        // Update free space map and retry
        file.free_space_map.mark_full(candidate);
    }
}
```

2. Recovery and Logging
For durability, all modifications must be logged: every insert, delete, and update writes a record to the write-ahead log (WAL) before the modified page is allowed to reach disk; each page records the LSN of the last log record that touched it; and recovery uses these log records to redo committed changes and undo uncommitted ones after a crash.
3. Page Splits and Overflow
When a record doesn't fit in any existing page: the file allocates a new page at the end, inserts the record there, and registers the new page in the free space map. A record larger than a single page cannot be handled this way and requires the large-record mechanisms described below.
4. Variable-Length Records
Handling variable-length records adds complexity: the slot directory must store both an offset and a length for every record, deletions leave variable-sized holes that cause internal fragmentation, and pages need periodic compaction to coalesce scattered free bytes into one contiguous region.
5. Large Records (TOAST)
Records exceeding a threshold (e.g., 2KB in PostgreSQL) require special handling: oversized attribute values are compressed and/or moved out of line into separate overflow storage, while the main record keeps only a small pointer to the relocated data.
PostgreSQL's TOAST system automatically handles large attribute values. When a row would exceed the page size, large columns are compressed and/or moved to a separate TOAST table, with the main row storing just a pointer. This is transparent to SQL queries—you simply SELECT the column, and PostgreSQL retrieves and reconstructs the value automatically.
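The pointer-indirection idea behind TOAST can be illustrated with a toy model. The names (`ToastSim`) and the threshold are arbitrary choices for this sketch; real TOAST additionally compresses values and splits them into chunks:

```python
# Toy out-of-line storage in the spirit of TOAST (hypothetical names).
# Values over a threshold move to a side store; the main row keeps a
# small pointer. Real TOAST also compresses and chunks large values.

THRESHOLD = 32  # bytes, chosen arbitrarily; PostgreSQL's is ~2 KB

class ToastSim:
    def __init__(self):
        self.main_rows = {}   # row_id -> {column: value or pointer}
        self.side_store = {}  # oid -> large value
        self._next_oid = 1

    def store(self, row_id, **columns):
        row = {}
        for col, val in columns.items():
            if isinstance(val, (bytes, str)) and len(val) > THRESHOLD:
                oid = self._next_oid
                self._next_oid += 1
                self.side_store[oid] = val
                row[col] = ("toast_ptr", oid)  # small pointer in the main row
            else:
                row[col] = val
        self.main_rows[row_id] = row

    def fetch(self, row_id, col):
        """Transparently follow the pointer, the way SELECT does."""
        val = self.main_rows[row_id][col]
        if isinstance(val, tuple) and val[0] == "toast_ptr":
            return self.side_store[val[1]]
        return val

t = ToastSim()
t.store(1, name="Ada", bio="x" * 1000)
print(t.main_rows[1]["bio"])    # a small pointer, not the 1000-byte value
print(len(t.fetch(1, "bio")))   # the full value, reconstructed on fetch
```

The payoff mirrors the real system: the main row stays small enough to keep many rows per heap page, and the cost of the indirection is paid only by queries that actually read the large column.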
Let's examine how major database systems implement and optimize heap file organization:
| System | Heap Implementation | Notable Features |
|---|---|---|
| PostgreSQL | Heap Tables (default) | MVCC with tuple versioning, TOAST for large values, Visibility Map for vacuum optimization |
| MySQL InnoDB | Primary key clustered only | Tables are always clustered by PK; no pure heap option |
| SQL Server | Heap Tables (explicit) | Optional; rowstore without clustered index, forwarding rows for updates |
| Oracle | Heap-Organized Tables (default) | Rowid-based access, ITL for concurrency, PCT_FREE for updates |
| SQLite | B-tree always | Even 'heap' tables are stored in B-tree by rowid |
PostgreSQL Deep Dive:
PostgreSQL's heap implementation is particularly instructive:
Tuple Headers: Each tuple (record) has a 23-byte header containing: t_xmin and t_xmax (the IDs of the transactions that inserted and deleted the tuple, used for MVCC visibility checks), a command ID field, t_ctid (the tuple's current location, which also links HOT chains), two infomask fields of status hint bits, and t_hoff (the offset to the user data).
Visibility Map: One bit per page indicating whether all tuples on that page are visible to all transactions. Speeds up index-only scans—if all tuples visible, skip heap fetch.
Free Space Map: Tracks free space per page, stored in a separate fork of the relation file. Enables O(1) lookup for pages with sufficient space.
HOT Updates: Heap-Only Tuples optimization. When an update doesn't change indexed columns and new version fits on same page, create a HOT chain without updating indexes.
```sql
-- Analyze heap file structure in PostgreSQL

-- View heap file size and page count
SELECT relname AS table_name,
       pg_size_pretty(pg_relation_size(oid)) AS heap_size,
       relpages AS page_count,
       reltuples AS estimated_rows,
       reltuples / NULLIF(relpages, 0) AS rows_per_page
FROM pg_class
WHERE relname = 'customers';

-- Inspect individual page contents using the pageinspect extension
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- View page header information
SELECT * FROM page_header(get_raw_page('customers', 0));

-- Result columns include:
--   lsn:       Log Sequence Number (for recovery)
--   checksum:  Page checksum (if enabled)
--   flags:     Page flags
--   lower:     Offset to start of free space
--   upper:     Offset to end of free space
--   special:   Offset to special space (used by indexes)
--   pagesize:  Page size (usually 8192)
--   version:   Page version
--   prune_xid: Oldest XID requiring cleanup

-- View heap tuple headers
SELECT lp AS line_pointer,
       lp_off AS offset,
       lp_len AS length,
       t_xmin AS insert_xid,
       t_xmax AS delete_xid,
       t_ctid AS current_tuple_id,
       t_infomask::bit(16) AS flags
FROM heap_page_items(get_raw_page('customers', 0))
LIMIT 10;
```

We've conducted a thorough examination of heap file organization—the foundational storage method in database systems. Let's consolidate the key insights: inserts are O(1) and fast because no ordering is maintained; searches are O(n) and require full scans unless an index is added; sequential I/O makes those scans far cheaper than a naive random-I/O estimate suggests; and heap organization remains the default in systems like PostgreSQL and Oracle precisely because it pairs cleanly with secondary indexes.
What's Next:
Now that we understand heap files—the baseline organization—we'll explore sorted file organization. By maintaining records in sorted order, we can dramatically improve search performance, at the cost of more expensive insertions. This trade-off is the essence of file organization design.
You now have a comprehensive understanding of heap file organization—its structure, operations, performance characteristics, and practical applications. This foundational knowledge will serve as the baseline for comparing other file organizations in subsequent pages.