Loading learning content...
Understanding overlay structure is essential to appreciating how programmers achieved the seemingly impossible: running programs significantly larger than available physical memory. The structure of an overlay program is not arbitrary—it's a carefully engineered architecture that balances memory constraints, execution flow, and data accessibility.
An overlay program consists of distinct components with precisely defined roles and relationships. The root segment provides the permanent foundation, overlay regions offer shared memory spaces, and overlay segments contain the code that dynamically swaps in and out. Together, these components form a coherent system that multiplexes limited memory across a large program.
By the end of this page, you will understand the precise structure of overlay programs: how root segments are designed, how overlay regions are allocated, how overlay trees organize program logic, and the critical design principles that ensure correct and efficient overlay operation.
The root segment (also called the resident segment or main segment) is the portion of an overlay program that remains in memory throughout the program's entire execution. It never unloads, never gets replaced, and provides the stable foundation upon which all overlay operations depend.
Critical Components of the Root Segment:
The root segment must contain everything that:
main() function or equivalent that initiates executionMemory Layout Example:
Consider a compiler with 180KB of code that must run on a 64KB system. A typical root segment design might look like:
Root Segment Sizing Trade-offs:
Designing the root segment involves a critical tension:
The optimal root segment contains exactly what's needed continuously—no more, no less. Determining this requires thorough analysis of the program's call patterns and data access profiles.
A function called by three overlays can be placed in the root (using root space but avoiding duplication) or replicated in each overlay (saving root space but wasting overlay space 3×). With limited memory, these decisions significantly impact overall capacity.
An overlay region is a contiguous block of memory designated to hold overlay segments. When an overlay is loaded, it occupies this region. When a different overlay loads, it replaces the previous contents completely.
Single-Region vs. Multi-Region Architectures:
Simpler overlay systems use a single overlay region—all overlay segments share one memory area. More sophisticated systems support multiple overlay regions, enabling more flexible overlay structures:
Multi-Region Example:
Consider a database application with major modes (Query, Update, Report) that each have sub-functions. A two-region design might use:
This allows the Query Mode overlay to remain loaded while different query sub-functions (Simple Select, Join, Aggregate) swap in and out of the smaller Region 2.
123456789101112131415161718192021222324252627282930313233
MEMORY MAP: Multi-Region Overlay Architecture Address Size Content─────────────────────────────────────────────────0x0000 16KB ROOT SEGMENT ├── Main control logic ├── Overlay manager ├── Shared data structures └── Common utilities─────────────────────────────────────────────────0x4000 30KB OVERLAY REGION 1 (Major Modes) Contains ONE of: ├── Query Mode Overlay (28KB) ├── Update Mode Overlay (30KB) └── Report Mode Overlay (26KB)─────────────────────────────────────────────────0xB800 15KB OVERLAY REGION 2 (Sub-Functions) Contains ONE of: When Query Mode loaded: ├── Simple Select (12KB) ├── Join Operations (15KB) └── Aggregate Functions (14KB) When Update Mode loaded: ├── Insert Logic (10KB) ├── Delete Logic (11KB) └── Modify Logic (13KB)─────────────────────────────────────────────────0xF400 3KB STACK─────────────────────────────────────────────────TOTAL: 64KB BENEFIT: Query Mode can stay loaded while sub-functions swap This reduces disk I/O for common workflowsRegion Sizing Considerations:
The size of an overlay region determines the maximum size of any overlay that uses that region. Careful analysis is required:
A larger overlay region allows bigger overlays but leaves less room for the root segment or other regions. Conversely, a generous root segment constrains overlay size. Finding the optimal balance often required multiple iterations and careful profiling.
Complex overlay programs are often organized as overlay trees (or overlay graphs)—hierarchical structures that define which overlays can call which other overlays and how overlays relate to each other.
The Tree Concept:
In an overlay tree:
Calling between siblings requires unloading one and loading the other. Calling from child to parent (ancestor) works because the parent is already loaded. Calling from parent to child triggers loading the child overlay.
Understanding the Tree Structure:
In the example above:
Call Path Rules:
| Call Direction | What Happens | Disk I/O? |
|---|---|---|
| Root → Level 1 | Load target overlay into Region 1 | Yes |
| Level 1 → Root | Call resident code directly | No |
| Level 1 Sibling → Sibling | Unload current, load target into Region 1 | Yes |
| Level 1 → Level 2 Child | Load child into Region 2 | Yes |
| Level 2 → Parent Level 1 | Call directly (parent still loaded) | No |
| Level 2 → Root | Call directly (root always resident) | No |
| Level 2 Sibling → Sibling | Unload current, load target into Region 2 | Yes |
Path Length and the Critical Invariant:
The path from root to any overlay defines which overlays must be simultaneously loaded. When CF (Const Fold) is executing:
This is the overlay path: ROOT → OPT → CF.
The critical invariant states:
The current execution context requires all overlays along the path from root to the current overlay to be simultaneously present in memory.
This is why multi-region architectures exist—each level in the overlay tree often corresponds to a separate overlay region.
An overlay tree of depth N typically requires N overlay regions (plus the root), since all overlays along a path must coexist. Deeper trees offer more organizational flexibility but require more memory partitioning and management complexity.
The overlay manager requires metadata to locate and load overlay segments correctly. This information is typically stored in overlay tables embedded in the executable or loaded at program startup.
Overlay Table Entry Contents:
Each entry in the overlay table describes one overlay segment:
123456789101112131415161718192021222324252627282930313233343536373839404142
/* * Overlay Table Data Structure * Stored in root segment for runtime access */ typedef struct overlay_entry { uint16_t overlay_id; /* Unique identifier for this overlay */ uint16_t region_id; /* Which overlay region this loads into */ uint16_t parent_id; /* Parent overlay (0 = root is parent) */ uint32_t disk_offset; /* Byte offset in overlay file on disk */ uint32_t segment_size; /* Size of overlay segment in bytes */ uint32_t checksum; /* Integrity verification checksum */ void* load_address; /* Memory address where overlay loads */ void* entry_point; /* First executable address in overlay */ uint8_t flags; /* Status flags */ uint8_t reserved[3]; /* Alignment padding */} overlay_entry_t; /* Overlay flags */#define OVL_LOADED 0x01 /* Currently in memory */#define OVL_DIRTY 0x02 /* Modified (for swap-out purposes) */#define OVL_PINNED 0x04 /* Cannot be unloaded */#define OVL_PRELOAD 0x08 /* Load at program startup */ /* * Master overlay table - allocated in root segment * Size determined at link time based on number of overlays */#define MAX_OVERLAYS 64 typedef struct overlay_table { uint16_t num_overlays; /* Total overlay count */ uint16_t num_regions; /* Number of overlay regions */ uint16_t current[4]; /* Currently loaded in each region */ overlay_entry_t entries[MAX_OVERLAYS]; /* Overlay metadata entries */} overlay_table_t; /* Global overlay table instance - in root segment */overlay_table_t g_overlay_table;Entry Point Tables:
Beyond the overlay table, the system needs to map function names or identifiers to their locations within specific overlays. This enables the overlay manager to:
12345678910111213141516171819202122232425262728293031323334
/* * Entry Point Resolution Table * Maps callable functions to their overlays and offsets */ typedef struct entry_point { char name[32]; /* Function/symbol name */ uint16_t overlay_id; /* Overlay containing this entry point */ uint16_t offset; /* Offset within overlay to function start */ uint8_t calling_convention; /* How to set up call frame */ uint8_t reserved;} entry_point_t; /* Linker generates this table based on overlay structure definition */entry_point_t g_entry_points[] = { /* Lexer overlay (ID: 1) entry points */ { "lexer_init", 1, 0x0000, CC_CDECL, 0 }, { "get_next_token", 1, 0x0120, CC_CDECL, 0 }, { "peek_token", 1, 0x0280, CC_CDECL, 0 }, /* Parser overlay (ID: 2) entry points */ { "parser_init", 2, 0x0000, CC_CDECL, 0 }, { "parse_statement", 2, 0x0100, CC_CDECL, 0 }, { "parse_expression", 2, 0x0400, CC_CDECL, 0 }, { "build_ast_node", 2, 0x0800, CC_CDECL, 0 }, /* Optimizer overlay (ID: 3) entry points */ { "optimize_ast", 3, 0x0000, CC_CDECL, 0 }, { "const_prop", 3, 0x0200, CC_CDECL, 0 }, { "dead_code_elim", 3, 0x0600, CC_CDECL, 0 }, /* ... additional entries ... */ { "", 0, 0, 0, 0 } /* Sentinel entry */};Symbol Resolution at Link Time vs. Run Time:
Overlay systems differ in when they resolve function references:
Link-time resolution: The linker knows the overlay structure and generates proper calling stubs directly. Faster at runtime but requires relink for overlay structure changes.
Run-time resolution: A symbol table is searched at each inter-overlay call. More flexible but adds runtime overhead to every cross-overlay function call.
Most production systems used link-time resolution for performance, with runtime resolution reserved for interpreted or highly dynamic environments.
The linker wasn't just combining object files—it was constructing the entire overlay architecture. It assigned overlays to regions, calculated load addresses, generated overlay tables, created inter-overlay calling stubs, and produced the overlay file on disk. A sophisticated overlay linker was essential infrastructure.
When code in one overlay needs to access code or data in another overlay, careful protocols ensure correctness. Inter-overlay communication encompasses function calls, data sharing, and control transfer.
Function Calls Across Overlays:
When a function in one overlay calls a function in a different overlay (especially a sibling), several steps occur:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546
/* * Inter-Overlay Function Call Sequence * This is the calling stub generated by the overlay linker */ /* * Suppose we're in Overlay A (Lexer) and need to call * parse_expression() which is in Overlay B (Parser) * * The linker generates this stub in the root segment: */ int stub_parse_expression(ast_node_t* context) { /* Step 1: Save current overlay context */ save_overlay_context(); /* Step 2: Check if target overlay already loaded */ if (g_overlay_table.current[REGION_1] != OVERLAY_PARSER) { /* Step 3: Unload current overlay (Lexer) from region 1 */ /* Note: Lexer state must already be in root segment! */ /* Step 4: Load target overlay (Parser) into region 1 */ int result = load_overlay(OVERLAY_PARSER); if (result != OVL_SUCCESS) { error("Failed to load Parser overlay"); return -1; } } /* Step 5: Get function address within loaded overlay */ void* func_addr = get_entry_point("parse_expression"); /* Step 6: Call the actual function with original arguments */ int (*real_func)(ast_node_t*) = (int (*)(ast_node_t*))func_addr; int return_value = real_func(context); /* * Step 7: IMPORTANT - Do NOT automatically restore Lexer! * The caller must explicitly reload if needed. * Parser might make more calls before Lexer is needed again. */ /* Step 8: Restore saved context and return */ restore_overlay_context(); return return_value;}Data Sharing Across Overlays:
Data that multiple overlays access must reside in the root segment or a persistent memory area. Overlays cannot safely store data in the overlay region if another overlay might need that data later—it would be overwritten.
Strategies for Data Sharing:
12345678910111213141516171819202122232425262728293031323334353637383940
/* * Strategies for Data Sharing Between Overlays */ /* Strategy 1: Global variables in root segment *//* Declared in root segment, accessible to all overlays */symbol_table_t g_symbol_table; /* Shared by lexer, parser, codegen */error_list_t g_errors; /* All overlays can report errors */config_t g_config; /* Configuration shared everywhere */ /* Strategy 2: Communication buffer in root *//* A generic buffer for inter-overlay data transfer */#define COMM_BUFFER_SIZE 4096typedef struct { uint8_t data[COMM_BUFFER_SIZE]; uint16_t data_len; uint16_t data_type; uint16_t source_overlay; uint16_t target_overlay;} inter_overlay_comm_t; inter_overlay_comm_t g_comm_buffer; /* Overlay A writes data */void lexer_send_tokens(token_stream_t* tokens) { g_comm_buffer.data_len = serialize_tokens(tokens, g_comm_buffer.data, COMM_BUFFER_SIZE); g_comm_buffer.data_type = TYPE_TOKEN_STREAM; g_comm_buffer.source_overlay = OVERLAY_LEXER;} /* Overlay B receives data */token_stream_t* parser_receive_tokens(void) { if (g_comm_buffer.data_type != TYPE_TOKEN_STREAM) { return NULL; } return deserialize_tokens(g_comm_buffer.data, g_comm_buffer.data_len);}Pointers from one overlay to data within another overlay become dangling pointers when that overlay unloads. This was a major source of bugs. Programmers had to carefully ensure all cross-overlay references pointed into the root segment, not into overlay regions.
Experienced overlay programmers developed a set of design principles that maximized overlay effectiveness while minimizing bugs and performance problems:
Overlay Performance Optimization:
Beyond correctness, performance was crucial. Disk I/O for overlay loading was extremely slow compared to memory access (milliseconds vs. microseconds):
| Technique | Description | Trade-off |
|---|---|---|
| Overlay Caching | Keep recently used overlays in a memory buffer | Uses extra memory for disk buffers |
| Preloading | Load next expected overlay during idle time | Requires accurate prediction of usage patterns |
| Disk Layout Optimization | Place frequently sequential overlays adjacent on disk | Complex disk management; fragmentation issues |
| Overlay Combining | Merge small frequently-called overlays | Reduces flexibility; wastes space if not all code used |
| Background Loading | Load overlays asynchronously while current runs | Complex synchronization; not always possible |
Most programs spend 80% of time in 20% of code. Identifying and placing 'hot' code in the root segment or frequently-resident overlays dramatically reduced overlay transitions. Profiling execution to find hot paths was essential optimization work.
We've thoroughly explored the structure of overlay programs—the architectural decisions that made multiprogramming within tight memory constraints possible:
What's Next:
Now that we understand overlay structure, the next page examines manual memory management in overlay systems—the specific techniques programmers used to ensure correct operation, manage the complexity, and avoid the many pitfalls of manual memory control.
You now understand the complete structure of overlay programs: root segments, overlay regions, overlay trees, metadata tables, and inter-overlay communication. This architectural knowledge reveals the sophisticated engineering required to run large programs on limited memory.