Operating SystemsOverlays

Overlays

LevelIntermediate

Duration50 mins

TopicOverlays

2 / 5

Overlay Structure

Anatomy of an Overlay Program

Understanding overlay structure is essential to appreciating how programmers achieved the seemingly impossible: running programs significantly larger than available physical memory. The structure of an overlay program is not arbitrary—it's a carefully engineered architecture that balances memory constraints, execution flow, and data accessibility.

An overlay program consists of distinct components with precisely defined roles and relationships. The root segment provides the permanent foundation, overlay regions offer shared memory spaces, and overlay segments contain the code that dynamically swaps in and out. Together, these components form a coherent system that multiplexes limited memory across a large program.

What You Will Learn

By the end of this page, you will understand the precise structure of overlay programs: how root segments are designed, how overlay regions are allocated, how overlay trees organize program logic, and the critical design principles that ensure correct and efficient overlay operation.

The Root Segment: The Permanent Foundation

The root segment (also called the resident segment or main segment) is the portion of an overlay program that remains in memory throughout the program's entire execution. It never unloads, never gets replaced, and provides the stable foundation upon which all overlay operations depend.

Critical Components of the Root Segment:

The root segment must contain everything that:

Is needed continuously throughout execution
Is accessed by multiple overlay segments
Cannot tolerate the latency of disk loading
Manages the overlay system itself

Essential Root Segment Contents

•Program Entry Point: The main() function or equivalent that initiates execution
•Overlay Manager: Code to load, unload, and track overlay segments
•Global Data Structures: Variables and structures accessed across multiple overlays
•Shared Subroutines: Utility functions called by multiple overlay segments
•Interrupt Handlers: Time-critical code that must respond immediately
•Stack Space: The program stack (shared across all overlays)
•Heap Manager: Dynamic memory allocation routines
•Error Handlers: Critical error recovery code

Memory Layout Example:

Consider a compiler with 180KB of code that must run on a 64KB system. A typical root segment design might look like:

Converting Mermaid diagram...

Root Segment Sizing Trade-offs:

Designing the root segment involves a critical tension:

Too small: Missing shared subroutines forces duplication in overlays; missing global data causes complex inter-overlay communication
Too large: Reduces overlay region size, limiting maximum overlay segment size; wastes memory on rarely-used code

The optimal root segment contains exactly what's needed continuously—no more, no less. Determining this requires thorough analysis of the program's call patterns and data access profiles.

The Shared Subroutine Decision

A function called by three overlays can be placed in the root (using root space but avoiding duplication) or replicated in each overlay (saving root space but wasting overlay space 3×). With limited memory, these decisions significantly impact overall capacity.

Overlay Regions: Shared Memory Territories

An overlay region is a contiguous block of memory designated to hold overlay segments. When an overlay is loaded, it occupies this region. When a different overlay loads, it replaces the previous contents completely.

Single-Region vs. Multi-Region Architectures:

Simpler overlay systems use a single overlay region—all overlay segments share one memory area. More sophisticated systems support multiple overlay regions, enabling more flexible overlay structures:

Single-Region Architecture

•Simplicity: One overlay area to manage
•Clear semantics: Only one overlay active at a time
•Minimal overhead: Simple overlay table
•Constraint: All overlays mutually exclusive
•Best for: Simple, linear program phases

Multi-Region Architecture

•Flexibility: Independent overlay areas
•Concurrency: Multiple overlays active simultaneously
•Finer control: Different sizes for different uses
•Complexity: More sophisticated management
•Best for: Complex programs with sub-phases

Multi-Region Example:

Consider a database application with major modes (Query, Update, Report) that each have sub-functions. A two-region design might use:

Region 1 (30KB): Major mode overlays (Query Mode, Update Mode, Report Mode)
Region 2 (15KB): Sub-function overlays within each mode

This allows the Query Mode overlay to remain loaded while different query sub-functions (Simple Select, Join, Aggregate) swap in and out of the smaller Region 2.

multi_region_overlay.txt

Overlay Structure

MEMORY MAP: Multi-Region Overlay Architecture
 
Address     Size    Content
─────────────────────────────────────────────────
0x0000     16KB    ROOT SEGMENT
                   ├── Main control logic
                   ├── Overlay manager
                   ├── Shared data structures
                   └── Common utilities
─────────────────────────────────────────────────
0x4000     30KB    OVERLAY REGION 1 (Major Modes)
                   Contains ONE of:
                   ├── Query Mode Overlay (28KB)
                   ├── Update Mode Overlay (30KB)
                   └── Report Mode Overlay (26KB)
─────────────────────────────────────────────────
0xB800     15KB    OVERLAY REGION 2 (Sub-Functions)
                   Contains ONE of:
                   When Query Mode loaded:
                   ├── Simple Select (12KB)
                   ├── Join Operations (15KB)
                   └── Aggregate Functions (14KB)
                   When Update Mode loaded:
                   ├── Insert Logic (10KB)
                   ├── Delete Logic (11KB)
                   └── Modify Logic (13KB)
─────────────────────────────────────────────────
0xF400      3KB    STACK
─────────────────────────────────────────────────
TOTAL:     64KB
 
BENEFIT: Query Mode can stay loaded while sub-functions swap
         This reduces disk I/O for common workflows

Region Sizing Considerations:

The size of an overlay region determines the maximum size of any overlay that uses that region. Careful analysis is required:

Largest Overlay Rule: Region size ≥ largest overlay that will occupy it
Wasted Space: If overlays vary in size, smaller overlays leave unused space
Granularity Trade-off: More regions = more flexibility but more management overhead
Sum Constraint: All regions + root segment ≤ total available memory

The Sizing Paradox

A larger overlay region allows bigger overlays but leaves less room for the root segment or other regions. Conversely, a generous root segment constrains overlay size. Finding the optimal balance often required multiple iterations and careful profiling.

Overlay Trees: Hierarchical Organization

Complex overlay programs are often organized as overlay trees (or overlay graphs)—hierarchical structures that define which overlays can call which other overlays and how overlays relate to each other.

The Tree Concept:

In an overlay tree:

The root segment is the tree's root node
Each overlay segment is a node in the tree
An overlay can only directly call other overlays that are its children or ancestors
Sibling overlays are mutually exclusive—they share the same overlay region

Calling between siblings requires unloading one and loading the other. Calling from child to parent (ancestor) works because the parent is already loaded. Calling from parent to child triggers loading the child overlay.

Converting Mermaid diagram...

Understanding the Tree Structure:

In the example above:

Root can call any Level 1 overlay (LEX, PARSE, OPT, GEN)
Sibling overlays (LEX, PARSE, OPT, GEN) are mutually exclusive—calling from LEX to PARSE requires unloading LEX first
OPT can call its children (CF, DCE, LO) while remaining loaded
Level 2 overlays under OPT share Region 2 and are mutually exclusive with each other
CF, DCE, LO can call back to OPT (their parent) or ROOT (their grandparent)
CF cannot directly call RA—they exist in different subtrees and different contexts

Call Path Rules:

Overlay Call Path Rules
Call Direction	What Happens	Disk I/O?
Root → Level 1	Load target overlay into Region 1	Yes
Level 1 → Root	Call resident code directly	No
Level 1 Sibling → Sibling	Unload current, load target into Region 1	Yes
Level 1 → Level 2 Child	Load child into Region 2	Yes
Level 2 → Parent Level 1	Call directly (parent still loaded)	No
Level 2 → Root	Call directly (root always resident)	No
Level 2 Sibling → Sibling	Unload current, load target into Region 2	Yes

Path Length and the Critical Invariant:

The path from root to any overlay defines which overlays must be simultaneously loaded. When CF (Const Fold) is executing:

ROOT is resident (always)
OPT (Optimizer) is loaded in Region 1 (parent of CF)
CF is loaded in Region 2

This is the overlay path: ROOT → OPT → CF.

The critical invariant states:

The current execution context requires all overlays along the path from root to the current overlay to be simultaneously present in memory.

This is why multi-region architectures exist—each level in the overlay tree often corresponds to a separate overlay region.

Tree Depth vs. Memory Regions

An overlay tree of depth N typically requires N overlay regions (plus the root), since all overlays along a path must coexist. Deeper trees offer more organizational flexibility but require more memory partitioning and management complexity.

Overlay Tables and Metadata

The overlay manager requires metadata to locate and load overlay segments correctly. This information is typically stored in overlay tables embedded in the executable or loaded at program startup.

Overlay Table Entry Contents:

Each entry in the overlay table describes one overlay segment:

overlay_table.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
/*
 * Overlay Table Data Structure
 * Stored in root segment for runtime access
 */
 
typedef struct overlay_entry {
    uint16_t overlay_id;          /* Unique identifier for this overlay */
    uint16_t region_id;           /* Which overlay region this loads into */
    uint16_t parent_id;           /* Parent overlay (0 = root is parent) */
    
    uint32_t disk_offset;         /* Byte offset in overlay file on disk */
    uint32_t segment_size;        /* Size of overlay segment in bytes */
    uint32_t checksum;            /* Integrity verification checksum */
    
    void*    load_address;        /* Memory address where overlay loads */
    void*    entry_point;         /* First executable address in overlay */
    
    uint8_t  flags;               /* Status flags */
    uint8_t  reserved[3];         /* Alignment padding */
} overlay_entry_t;
 
/* Overlay flags */
#define OVL_LOADED      0x01     /* Currently in memory */
#define OVL_DIRTY       0x02     /* Modified (for swap-out purposes) */
#define OVL_PINNED      0x04     /* Cannot be unloaded */
#define OVL_PRELOAD     0x08     /* Load at program startup */
 
/* 
 * Master overlay table - allocated in root segment
 * Size determined at link time based on number of overlays
 */
#define MAX_OVERLAYS 64
 
typedef struct overlay_table {
    uint16_t         num_overlays;          /* Total overlay count */
    uint16_t         num_regions;           /* Number of overlay regions */
    uint16_t         current[4];            /* Currently loaded in each region */
    overlay_entry_t  entries[MAX_OVERLAYS]; /* Overlay metadata entries */
} overlay_table_t;
 
/* Global overlay table instance - in root segment */
overlay_table_t g_overlay_table;

Entry Point Tables:

Beyond the overlay table, the system needs to map function names or identifiers to their locations within specific overlays. This enables the overlay manager to:

Determine which overlay contains a called function
Load the correct overlay if not already present
Vector control to the function within the overlay

entry_point_table.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
/*
 * Entry Point Resolution Table
 * Maps callable functions to their overlays and offsets
 */
 
typedef struct entry_point {
    char     name[32];            /* Function/symbol name */
    uint16_t overlay_id;          /* Overlay containing this entry point */
    uint16_t offset;              /* Offset within overlay to function start */
    uint8_t  calling_convention;  /* How to set up call frame */
    uint8_t  reserved;
} entry_point_t;
 
/* Linker generates this table based on overlay structure definition */
entry_point_t g_entry_points[] = {
    /* Lexer overlay (ID: 1) entry points */
    { "lexer_init",       1, 0x0000, CC_CDECL, 0 },
    { "get_next_token",   1, 0x0120, CC_CDECL, 0 },
    { "peek_token",       1, 0x0280, CC_CDECL, 0 },
    
    /* Parser overlay (ID: 2) entry points */
    { "parser_init",      2, 0x0000, CC_CDECL, 0 },
    { "parse_statement",  2, 0x0100, CC_CDECL, 0 },
    { "parse_expression", 2, 0x0400, CC_CDECL, 0 },
    { "build_ast_node",   2, 0x0800, CC_CDECL, 0 },
    
    /* Optimizer overlay (ID: 3) entry points */
    { "optimize_ast",     3, 0x0000, CC_CDECL, 0 },
    { "const_prop",       3, 0x0200, CC_CDECL, 0 },
    { "dead_code_elim",   3, 0x0600, CC_CDECL, 0 },
    
    /* ... additional entries ... */
    { "", 0, 0, 0, 0 }  /* Sentinel entry */
};

Symbol Resolution at Link Time vs. Run Time:

Overlay systems differ in when they resolve function references:

Link-time resolution: The linker knows the overlay structure and generates proper calling stubs directly. Faster at runtime but requires relink for overlay structure changes.
Run-time resolution: A symbol table is searched at each inter-overlay call. More flexible but adds runtime overhead to every cross-overlay function call.

Most production systems used link-time resolution for performance, with runtime resolution reserved for interpreted or highly dynamic environments.

The Linker's Critical Role

The linker wasn't just combining object files—it was constructing the entire overlay architecture. It assigned overlays to regions, calculated load addresses, generated overlay tables, created inter-overlay calling stubs, and produced the overlay file on disk. A sophisticated overlay linker was essential infrastructure.

Inter-Overlay Communication

When code in one overlay needs to access code or data in another overlay, careful protocols ensure correctness. Inter-overlay communication encompasses function calls, data sharing, and control transfer.

Function Calls Across Overlays:

When a function in one overlay calls a function in a different overlay (especially a sibling), several steps occur:

inter_overlay_call.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
/*
 * Inter-Overlay Function Call Sequence
 * This is the calling stub generated by the overlay linker
 */
 
/*
 * Suppose we're in Overlay A (Lexer) and need to call 
 * parse_expression() which is in Overlay B (Parser)
 * 
 * The linker generates this stub in the root segment:
 */
 
int stub_parse_expression(ast_node_t* context) {
    /* Step 1: Save current overlay context */
    save_overlay_context();
    
    /* Step 2: Check if target overlay already loaded */
    if (g_overlay_table.current[REGION_1] != OVERLAY_PARSER) {
        /* Step 3: Unload current overlay (Lexer) from region 1 */
        /* Note: Lexer state must already be in root segment! */
        
        /* Step 4: Load target overlay (Parser) into region 1 */
        int result = load_overlay(OVERLAY_PARSER);
        if (result != OVL_SUCCESS) {
            error("Failed to load Parser overlay");
            return -1;
        }
    }
    
    /* Step 5: Get function address within loaded overlay */
    void* func_addr = get_entry_point("parse_expression");
    
    /* Step 6: Call the actual function with original arguments */
    int (*real_func)(ast_node_t*) = (int (*)(ast_node_t*))func_addr;
    int return_value = real_func(context);
    
    /* 
     * Step 7: IMPORTANT - Do NOT automatically restore Lexer!
     * The caller must explicitly reload if needed.
     * Parser might make more calls before Lexer is needed again.
     */
    
    /* Step 8: Restore saved context and return */
    restore_overlay_context();
    return return_value;
}

Data Sharing Across Overlays:

Data that multiple overlays access must reside in the root segment or a persistent memory area. Overlays cannot safely store data in the overlay region if another overlay might need that data later—it would be overwritten.

Strategies for Data Sharing:

•Global Variables in Root: All shared data lives in root segment. Simple and reliable, but consumes root space.
•Parameter Passing: Data passed as function arguments through the stack (in root). Works for small, transient data.
•Shared Memory Block: A designated area in root segment used as a communication buffer between overlays.
•Disk-Based Exchange: For large datasets, overlays write to disk and subsequent overlays read from disk. Slow but unlimited size.
•Persistent Heap: Dynamic memory allocation in root-managed heap. Overlays allocate/deallocate; heap persists.

shared_data_design.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
/*
 * Strategies for Data Sharing Between Overlays
 */
 
/* Strategy 1: Global variables in root segment */
/* Declared in root segment, accessible to all overlays */
symbol_table_t g_symbol_table;  /* Shared by lexer, parser, codegen */
error_list_t   g_errors;        /* All overlays can report errors */
config_t       g_config;        /* Configuration shared everywhere */
 
/* Strategy 2: Communication buffer in root */
/* A generic buffer for inter-overlay data transfer */
#define COMM_BUFFER_SIZE 4096
typedef struct {
    uint8_t  data[COMM_BUFFER_SIZE];
    uint16_t data_len;
    uint16_t data_type;
    uint16_t source_overlay;
    uint16_t target_overlay;
} inter_overlay_comm_t;
 
inter_overlay_comm_t g_comm_buffer;
 
/* Overlay A writes data */
void lexer_send_tokens(token_stream_t* tokens) {
    g_comm_buffer.data_len = serialize_tokens(tokens, 
                                               g_comm_buffer.data,
                                               COMM_BUFFER_SIZE);
    g_comm_buffer.data_type = TYPE_TOKEN_STREAM;
    g_comm_buffer.source_overlay = OVERLAY_LEXER;
}
 
/* Overlay B receives data */
token_stream_t* parser_receive_tokens(void) {
    if (g_comm_buffer.data_type != TYPE_TOKEN_STREAM) {
        return NULL;
    }
    return deserialize_tokens(g_comm_buffer.data, 
                              g_comm_buffer.data_len);
}

The Pointer Problem

Pointers from one overlay to data within another overlay become dangling pointers when that overlay unloads. This was a major source of bugs. Programmers had to carefully ensure all cross-overlay references pointed into the root segment, not into overlay regions.

Design Principles and Best Practices

Experienced overlay programmers developed a set of design principles that maximized overlay effectiveness while minimizing bugs and performance problems:

Overlay Design Principles

•Minimize Overlay Transitions: Design program flow to minimize how often overlays swap. Group related functionality into the same overlay.
•Maximize Overlay Size Uniformity: Keep overlays similarly sized to minimize wasted space in the overlay region.
•Careful Root Segment Curation: Include in root only what truly needs persistence. Resist the temptation to put "convenience" functions there.
•Single Responsibility Per Overlay: Each overlay should handle one coherent function (lexing, parsing, etc.). Mixed-purpose overlays create coupling.
•Data Locality: Keep data close to the code that uses it, unless sharing is required. Data in overlays loads with the overlay.
•Clear Entry/Exit Contracts: Document what each overlay expects in shared state on entry and what it produces on exit.
•Defensive Programming: Validate that expected overlays are loaded before accessing their data. Never assume.

Overlay Performance Optimization:

Beyond correctness, performance was crucial. Disk I/O for overlay loading was extremely slow compared to memory access (milliseconds vs. microseconds):

Performance Optimization Techniques
Technique	Description	Trade-off
Overlay Caching	Keep recently used overlays in a memory buffer	Uses extra memory for disk buffers
Preloading	Load next expected overlay during idle time	Requires accurate prediction of usage patterns
Disk Layout Optimization	Place frequently sequential overlays adjacent on disk	Complex disk management; fragmentation issues
Overlay Combining	Merge small frequently-called overlays	Reduces flexibility; wastes space if not all code used
Background Loading	Load overlays asynchronously while current runs	Complex synchronization; not always possible

The 80/20 Rule in Overlays

Most programs spend 80% of time in 20% of code. Identifying and placing 'hot' code in the root segment or frequently-resident overlays dramatically reduced overlay transitions. Profiling execution to find hot paths was essential optimization work.

Summary: Mastering Overlay Structure

We've thoroughly explored the structure of overlay programs—the architectural decisions that made multiprogramming within tight memory constraints possible:

Key Takeaways

•Root segments are the stable foundation — Containing overlay management, shared data, and common utilities, the root never unloads
•Overlay regions are time-shared memory — Multiple overlays share the same physical memory by loading sequentially, not simultaneously
•Overlay trees organize relationships — Hierarchical structures define which overlays can call which, enforcing mutual exclusion between siblings
•Overlay tables enable runtime management — Metadata about each overlay's location, size, and state enables the overlay manager to function
•Inter-overlay communication requires discipline — Shared data must reside in persistent locations; pointer safety is the programmer's responsibility
•Design principles minimize problems — Careful overlay design, minimal transitions, and performance optimization were essential skills

What's Next:

Now that we understand overlay structure, the next page examines manual memory management in overlay systems—the specific techniques programmers used to ensure correct operation, manage the complexity, and avoid the many pitfalls of manual memory control.

Page Complete

You now understand the complete structure of overlay programs: root segments, overlay regions, overlay trees, metadata tables, and inter-overlay communication. This architectural knowledge reveals the sophisticated engineering required to run large programs on limited memory.

2 / 5

Loading learning content...

Operating SystemsOverlays

Overlays

LevelIntermediate

Duration50 mins

TopicOverlays

2 / 5

Overlay Structure

Anatomy of an Overlay Program

What You Will Learn

The Root Segment: The Permanent Foundation

Critical Components of the Root Segment:

The root segment must contain everything that:

Is needed continuously throughout execution
Is accessed by multiple overlay segments
Cannot tolerate the latency of disk loading
Manages the overlay system itself

Essential Root Segment Contents

•Program Entry Point: The main() function or equivalent that initiates execution
•Overlay Manager: Code to load, unload, and track overlay segments
•Global Data Structures: Variables and structures accessed across multiple overlays
•Shared Subroutines: Utility functions called by multiple overlay segments
•Interrupt Handlers: Time-critical code that must respond immediately
•Stack Space: The program stack (shared across all overlays)
•Heap Manager: Dynamic memory allocation routines
•Error Handlers: Critical error recovery code

Memory Layout Example:

Consider a compiler with 180KB of code that must run on a 64KB system. A typical root segment design might look like:

Converting Mermaid diagram...

Root Segment Sizing Trade-offs:

Designing the root segment involves a critical tension:

Too small: Missing shared subroutines forces duplication in overlays; missing global data causes complex inter-overlay communication
Too large: Reduces overlay region size, limiting maximum overlay segment size; wastes memory on rarely-used code

The optimal root segment contains exactly what's needed continuously—no more, no less. Determining this requires thorough analysis of the program's call patterns and data access profiles.

The Shared Subroutine Decision

Overlay Regions: Shared Memory Territories

Single-Region vs. Multi-Region Architectures:

Single-Region Architecture

•Simplicity: One overlay area to manage
•Clear semantics: Only one overlay active at a time
•Minimal overhead: Simple overlay table
•Constraint: All overlays mutually exclusive
•Best for: Simple, linear program phases

Multi-Region Architecture

•Flexibility: Independent overlay areas
•Concurrency: Multiple overlays active simultaneously
•Finer control: Different sizes for different uses
•Complexity: More sophisticated management
•Best for: Complex programs with sub-phases

Multi-Region Example:

Consider a database application with major modes (Query, Update, Report) that each have sub-functions. A two-region design might use:

Region 1 (30KB): Major mode overlays (Query Mode, Update Mode, Report Mode)
Region 2 (15KB): Sub-function overlays within each mode

This allows the Query Mode overlay to remain loaded while different query sub-functions (Simple Select, Join, Aggregate) swap in and out of the smaller Region 2.

multi_region_overlay.txt

Overlay Structure

MEMORY MAP: Multi-Region Overlay Architecture
 
Address     Size    Content
─────────────────────────────────────────────────
0x0000     16KB    ROOT SEGMENT
                   ├── Main control logic
                   ├── Overlay manager
                   ├── Shared data structures
                   └── Common utilities
─────────────────────────────────────────────────
0x4000     30KB    OVERLAY REGION 1 (Major Modes)
                   Contains ONE of:
                   ├── Query Mode Overlay (28KB)
                   ├── Update Mode Overlay (30KB)
                   └── Report Mode Overlay (26KB)
─────────────────────────────────────────────────
0xB800     15KB    OVERLAY REGION 2 (Sub-Functions)
                   Contains ONE of:
                   When Query Mode loaded:
                   ├── Simple Select (12KB)
                   ├── Join Operations (15KB)
                   └── Aggregate Functions (14KB)
                   When Update Mode loaded:
                   ├── Insert Logic (10KB)
                   ├── Delete Logic (11KB)
                   └── Modify Logic (13KB)
─────────────────────────────────────────────────
0xF400      3KB    STACK
─────────────────────────────────────────────────
TOTAL:     64KB
 
BENEFIT: Query Mode can stay loaded while sub-functions swap
         This reduces disk I/O for common workflows

Region Sizing Considerations:

The size of an overlay region determines the maximum size of any overlay that uses that region. Careful analysis is required:

Largest Overlay Rule: Region size ≥ largest overlay that will occupy it
Wasted Space: If overlays vary in size, smaller overlays leave unused space
Granularity Trade-off: More regions = more flexibility but more management overhead
Sum Constraint: All regions + root segment ≤ total available memory

The Sizing Paradox

Overlay Trees: Hierarchical Organization

The Tree Concept:

In an overlay tree:

The root segment is the tree's root node
Each overlay segment is a node in the tree
An overlay can only directly call other overlays that are its children or ancestors
Sibling overlays are mutually exclusive—they share the same overlay region

Converting Mermaid diagram...

Understanding the Tree Structure:

In the example above:

Root can call any Level 1 overlay (LEX, PARSE, OPT, GEN)
Sibling overlays (LEX, PARSE, OPT, GEN) are mutually exclusive—calling from LEX to PARSE requires unloading LEX first
OPT can call its children (CF, DCE, LO) while remaining loaded
Level 2 overlays under OPT share Region 2 and are mutually exclusive with each other
CF, DCE, LO can call back to OPT (their parent) or ROOT (their grandparent)
CF cannot directly call RA—they exist in different subtrees and different contexts

Call Path Rules:

Overlay Call Path Rules
Call Direction	What Happens	Disk I/O?
Root → Level 1	Load target overlay into Region 1	Yes
Level 1 → Root	Call resident code directly	No
Level 1 Sibling → Sibling	Unload current, load target into Region 1	Yes
Level 1 → Level 2 Child	Load child into Region 2	Yes
Level 2 → Parent Level 1	Call directly (parent still loaded)	No
Level 2 → Root	Call directly (root always resident)	No
Level 2 Sibling → Sibling	Unload current, load target into Region 2	Yes

Path Length and the Critical Invariant:

The path from root to any overlay defines which overlays must be simultaneously loaded. When CF (Const Fold) is executing:

ROOT is resident (always)
OPT (Optimizer) is loaded in Region 1 (parent of CF)
CF is loaded in Region 2

This is the overlay path: ROOT → OPT → CF.

The critical invariant states:

The current execution context requires all overlays along the path from root to the current overlay to be simultaneously present in memory.

This is why multi-region architectures exist—each level in the overlay tree often corresponds to a separate overlay region.

Tree Depth vs. Memory Regions

Overlay Tables and Metadata

Overlay Table Entry Contents:

Each entry in the overlay table describes one overlay segment:

overlay_table.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
/*
 * Overlay Table Data Structure
 * Stored in root segment for runtime access
 */
 
typedef struct overlay_entry {
    uint16_t overlay_id;          /* Unique identifier for this overlay */
    uint16_t region_id;           /* Which overlay region this loads into */
    uint16_t parent_id;           /* Parent overlay (0 = root is parent) */
    
    uint32_t disk_offset;         /* Byte offset in overlay file on disk */
    uint32_t segment_size;        /* Size of overlay segment in bytes */
    uint32_t checksum;            /* Integrity verification checksum */
    
    void*    load_address;        /* Memory address where overlay loads */
    void*    entry_point;         /* First executable address in overlay */
    
    uint8_t  flags;               /* Status flags */
    uint8_t  reserved[3];         /* Alignment padding */
} overlay_entry_t;
 
/* Overlay flags */
#define OVL_LOADED      0x01     /* Currently in memory */
#define OVL_DIRTY       0x02     /* Modified (for swap-out purposes) */
#define OVL_PINNED      0x04     /* Cannot be unloaded */
#define OVL_PRELOAD     0x08     /* Load at program startup */
 
/* 
 * Master overlay table - allocated in root segment
 * Size determined at link time based on number of overlays
 */
#define MAX_OVERLAYS 64
 
typedef struct overlay_table {
    uint16_t         num_overlays;          /* Total overlay count */
    uint16_t         num_regions;           /* Number of overlay regions */
    uint16_t         current[4];            /* Currently loaded in each region */
    overlay_entry_t  entries[MAX_OVERLAYS]; /* Overlay metadata entries */
} overlay_table_t;
 
/* Global overlay table instance - in root segment */
overlay_table_t g_overlay_table;

Entry Point Tables:

Beyond the overlay table, the system needs to map function names or identifiers to their locations within specific overlays. This enables the overlay manager to:

Determine which overlay contains a called function
Load the correct overlay if not already present
Vector control to the function within the overlay

entry_point_table.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
/*
 * Entry Point Resolution Table
 * Maps callable functions to their overlays and offsets
 */
 
typedef struct entry_point {
    char     name[32];            /* Function/symbol name */
    uint16_t overlay_id;          /* Overlay containing this entry point */
    uint16_t offset;              /* Offset within overlay to function start */
    uint8_t  calling_convention;  /* How to set up call frame */
    uint8_t  reserved;
} entry_point_t;
 
/* Linker generates this table based on overlay structure definition */
entry_point_t g_entry_points[] = {
    /* Lexer overlay (ID: 1) entry points */
    { "lexer_init",       1, 0x0000, CC_CDECL, 0 },
    { "get_next_token",   1, 0x0120, CC_CDECL, 0 },
    { "peek_token",       1, 0x0280, CC_CDECL, 0 },
    
    /* Parser overlay (ID: 2) entry points */
    { "parser_init",      2, 0x0000, CC_CDECL, 0 },
    { "parse_statement",  2, 0x0100, CC_CDECL, 0 },
    { "parse_expression", 2, 0x0400, CC_CDECL, 0 },
    { "build_ast_node",   2, 0x0800, CC_CDECL, 0 },
    
    /* Optimizer overlay (ID: 3) entry points */
    { "optimize_ast",     3, 0x0000, CC_CDECL, 0 },
    { "const_prop",       3, 0x0200, CC_CDECL, 0 },
    { "dead_code_elim",   3, 0x0600, CC_CDECL, 0 },
    
    /* ... additional entries ... */
    { "", 0, 0, 0, 0 }  /* Sentinel entry */
};

Symbol Resolution at Link Time vs. Run Time:

Overlay systems differ in when they resolve function references:

Link-time resolution: The linker knows the overlay structure and generates proper calling stubs directly. Faster at runtime but requires relink for overlay structure changes.
Run-time resolution: A symbol table is searched at each inter-overlay call. More flexible but adds runtime overhead to every cross-overlay function call.

Most production systems used link-time resolution for performance, with runtime resolution reserved for interpreted or highly dynamic environments.

The Linker's Critical Role

Inter-Overlay Communication

Function Calls Across Overlays:

When a function in one overlay calls a function in a different overlay (especially a sibling), several steps occur:

inter_overlay_call.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
/*
 * Inter-Overlay Function Call Sequence
 * This is the calling stub generated by the overlay linker
 */
 
/*
 * Suppose we're in Overlay A (Lexer) and need to call 
 * parse_expression() which is in Overlay B (Parser)
 * 
 * The linker generates this stub in the root segment:
 */
 
int stub_parse_expression(ast_node_t* context) {
    /* Step 1: Save current overlay context */
    save_overlay_context();
    
    /* Step 2: Check if target overlay already loaded */
    if (g_overlay_table.current[REGION_1] != OVERLAY_PARSER) {
        /* Step 3: Unload current overlay (Lexer) from region 1 */
        /* Note: Lexer state must already be in root segment! */
        
        /* Step 4: Load target overlay (Parser) into region 1 */
        int result = load_overlay(OVERLAY_PARSER);
        if (result != OVL_SUCCESS) {
            error("Failed to load Parser overlay");
            return -1;
        }
    }
    
    /* Step 5: Get function address within loaded overlay */
    void* func_addr = get_entry_point("parse_expression");
    
    /* Step 6: Call the actual function with original arguments */
    int (*real_func)(ast_node_t*) = (int (*)(ast_node_t*))func_addr;
    int return_value = real_func(context);
    
    /* 
     * Step 7: IMPORTANT - Do NOT automatically restore Lexer!
     * The caller must explicitly reload if needed.
     * Parser might make more calls before Lexer is needed again.
     */
    
    /* Step 8: Restore saved context and return */
    restore_overlay_context();
    return return_value;
}

Data Sharing Across Overlays:

Strategies for Data Sharing:

•Global Variables in Root: All shared data lives in root segment. Simple and reliable, but consumes root space.
•Parameter Passing: Data passed as function arguments through the stack (in root). Works for small, transient data.
•Shared Memory Block: A designated area in root segment used as a communication buffer between overlays.
•Disk-Based Exchange: For large datasets, overlays write to disk and subsequent overlays read from disk. Slow but unlimited size.
•Persistent Heap: Dynamic memory allocation in root-managed heap. Overlays allocate/deallocate; heap persists.

shared_data_design.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
/*
 * Strategies for Data Sharing Between Overlays
 */
 
/* Strategy 1: Global variables in root segment */
/* Declared in root segment, accessible to all overlays */
symbol_table_t g_symbol_table;  /* Shared by lexer, parser, codegen */
error_list_t   g_errors;        /* All overlays can report errors */
config_t       g_config;        /* Configuration shared everywhere */
 
/* Strategy 2: Communication buffer in root */
/* A generic buffer for inter-overlay data transfer */
#define COMM_BUFFER_SIZE 4096
typedef struct {
    uint8_t  data[COMM_BUFFER_SIZE];
    uint16_t data_len;
    uint16_t data_type;
    uint16_t source_overlay;
    uint16_t target_overlay;
} inter_overlay_comm_t;
 
inter_overlay_comm_t g_comm_buffer;
 
/* Overlay A writes data */
void lexer_send_tokens(token_stream_t* tokens) {
    g_comm_buffer.data_len = serialize_tokens(tokens, 
                                               g_comm_buffer.data,
                                               COMM_BUFFER_SIZE);
    g_comm_buffer.data_type = TYPE_TOKEN_STREAM;
    g_comm_buffer.source_overlay = OVERLAY_LEXER;
}
 
/* Overlay B receives data */
token_stream_t* parser_receive_tokens(void) {
    if (g_comm_buffer.data_type != TYPE_TOKEN_STREAM) {
        return NULL;
    }
    return deserialize_tokens(g_comm_buffer.data, 
                              g_comm_buffer.data_len);
}

The Pointer Problem

Design Principles and Best Practices

Experienced overlay programmers developed a set of design principles that maximized overlay effectiveness while minimizing bugs and performance problems:

Overlay Design Principles

•Minimize Overlay Transitions: Design program flow to minimize how often overlays swap. Group related functionality into the same overlay.
•Maximize Overlay Size Uniformity: Keep overlays similarly sized to minimize wasted space in the overlay region.
•Careful Root Segment Curation: Include in root only what truly needs persistence. Resist the temptation to put "convenience" functions there.
•Single Responsibility Per Overlay: Each overlay should handle one coherent function (lexing, parsing, etc.). Mixed-purpose overlays create coupling.
•Data Locality: Keep data close to the code that uses it, unless sharing is required. Data in overlays loads with the overlay.
•Clear Entry/Exit Contracts: Document what each overlay expects in shared state on entry and what it produces on exit.
•Defensive Programming: Validate that expected overlays are loaded before accessing their data. Never assume.

Overlay Performance Optimization:

Beyond correctness, performance was crucial. Disk I/O for overlay loading was extremely slow compared to memory access (milliseconds vs. microseconds):

Performance Optimization Techniques
Technique	Description	Trade-off
Overlay Caching	Keep recently used overlays in a memory buffer	Uses extra memory for disk buffers
Preloading	Load next expected overlay during idle time	Requires accurate prediction of usage patterns
Disk Layout Optimization	Place frequently sequential overlays adjacent on disk	Complex disk management; fragmentation issues
Overlay Combining	Merge small frequently-called overlays	Reduces flexibility; wastes space if not all code used
Background Loading	Load overlays asynchronously while current runs	Complex synchronization; not always possible

The 80/20 Rule in Overlays

Summary: Mastering Overlay Structure

We've thoroughly explored the structure of overlay programs—the architectural decisions that made multiprogramming within tight memory constraints possible:

Key Takeaways

•Root segments are the stable foundation — Containing overlay management, shared data, and common utilities, the root never unloads
•Overlay regions are time-shared memory — Multiple overlays share the same physical memory by loading sequentially, not simultaneously
•Overlay trees organize relationships — Hierarchical structures define which overlays can call which, enforcing mutual exclusion between siblings
•Overlay tables enable runtime management — Metadata about each overlay's location, size, and state enables the overlay manager to function
•Inter-overlay communication requires discipline — Shared data must reside in persistent locations; pointer safety is the programmer's responsibility
•Design principles minimize problems — Careful overlay design, minimal transitions, and performance optimization were essential skills

What's Next:

Page Complete

2 / 5