Every program running on your computer operates under a grand illusion: it believes it has exclusive access to a vast, contiguous expanse of memory, starting from address zero and extending to the limits of architectural possibility. A 64-bit program perceives an address space of up to 16 exabytes, roughly a billion times the 16 GB of RAM in a typical desktop machine. This illusion is essential, not accidental. It is one of the most successful abstractions in computing history: the logical address space.
This abstraction liberates programmers from the complexities of physical memory management—from knowing exactly where data resides in hardware, from coordinating with other programs for memory access, from dealing with fragmented or non-contiguous physical storage. Understanding logical address space is fundamental to grasping how operating systems provide isolation, security, and the seamless execution of multiple programs simultaneously.
By the end of this page, you will understand the precise definition and properties of logical address space, how it differs fundamentally from physical memory, why this abstraction was invented, and how it enables the multiprogramming capabilities we take for granted in modern systems. You'll develop the conceptual foundation necessary for understanding address translation, memory protection, and virtual memory.
A logical address (also called a virtual address) is an address generated by the CPU during program execution. The collection of all logical addresses that a program can generate constitutes its logical address space.
More formally:
The logical address space of a process is the set of all addresses that the process can reference during execution, as perceived by the CPU when executing instructions within that process's context.
This definition carries several important implications that we must unpack carefully.
When your program accesses a variable x, the compiled code generates a logical address to access x's memory location.

Address 0x1000 in Process A refers to a different physical location (or no location at all) than address 0x1000 in Process B.

The terms 'logical address' and 'virtual address' are often used interchangeably, though some texts distinguish them: 'logical' refers to the address from the CPU's perspective, while 'virtual' emphasizes the illusion of a larger-than-physical address space. In modern systems with virtual memory, the distinction is minimal. We'll use both terms to match the different literature you may encounter.
Mathematical Representation:
For a system with n-bit logical addresses, the logical address space L is defined as:
L = {0, 1, 2, ..., 2ⁿ - 1}
For a 32-bit system: L = {0, 1, ..., 4,294,967,295} (4 GB of addressable locations)

For a 64-bit system: L = {0, 1, ..., 18,446,744,073,709,551,615} (16 EB theoretical maximum)
In practice, the usable logical address space is often smaller due to architectural constraints, reserved regions, and address space layout policies—but the conceptual size is defined by the address width.
The concept of logical addressing emerged from real engineering problems in early computing. Understanding this history illuminates why the abstraction takes its current form.
The Early Days: Absolute Addressing
In the earliest computers (1940s-1950s), programmers used absolute addresses—physical memory locations hardcoded into programs. If your program stored a variable at address 1000, that meant physical memory location 1000 on the hardware.
This approach had severe limitations:
The Birth of Relocation: Base Registers
The first step toward logical addressing came with relocatable code and the base register (circa 1960s). The idea was simple but powerful:
This introduced a simple form of address translation:
Physical Address = Logical Address + Base Register
With this mechanism, two programs can coexist in memory by using different base values. Program A might be loaded at physical address 0x10000 and Program B at 0x50000, yet each program is written as if it starts at address 0.
```
Memory Layout with Base Register Relocation:

Physical Memory:
┌──────────────────────────────────────────────┐
│ 0x00000: Operating System                    │
├──────────────────────────────────────────────┤
│ 0x10000: Program A (Base = 0x10000)          │
│   Logical Addr 0x0   → Physical 0x10000      │
│   Logical Addr 0x100 → Physical 0x10100      │
├──────────────────────────────────────────────┤
│ 0x50000: Program B (Base = 0x50000)          │
│   Logical Addr 0x0   → Physical 0x50000      │
│   Logical Addr 0x100 → Physical 0x50100      │
├──────────────────────────────────────────────┤
│ 0x90000: Free Memory                         │
└──────────────────────────────────────────────┘

Both programs are written as if they start at address 0.
The hardware adds the base value during every memory access.

Context Switch:
- When switching from Program A to Program B,
- The OS updates the base register from 0x10000 to 0x50000
- All subsequent memory accesses are automatically relocated
```

The Evolution Continues:
Base register relocation was a crucial first step, but it still had limitations. The entire logical address space had to map to a contiguous physical region. As systems grew more complex, the need for more flexible mapping led to:
Each evolution maintained the core principle: the logical address space as an abstraction layer between programs and physical memory.
The shift from absolute to logical addressing exemplifies a key software engineering principle: solve problems through abstraction. Rather than making programs smarter about physical memory, make physical memory invisible to programs. This separation of concerns—program logic versus memory management—enabled decades of independent evolution in both areas.
While a process's logical address space appears as a uniform range of addresses, it is actually organized into distinct regions with different purposes, access permissions, and lifetime characteristics. Understanding this structure is essential for systems programming and security analysis.
| Segment | Contents | Permissions | Growth Direction | Lifetime |
|---|---|---|---|---|
| Text (Code) | Compiled machine instructions | Read + Execute | Fixed size | Process lifetime |
| Data | Initialized global and static variables | Read + Write | Fixed size | Process lifetime |
| BSS | Uninitialized global/static variables (zeroed) | Read + Write | Fixed size | Process lifetime |
| Heap | Dynamically allocated memory (malloc/new) | Read + Write | Grows upward | Until freed or process exit |
| Stack | Function call frames, local variables | Read + Write | Grows downward | Until function returns |
| Kernel | OS code and data (protected) | Kernel only | N/A | System lifetime |
The Stack-Heap Arrangement:
Notice that the stack grows downward (toward lower addresses) while the heap grows upward. This classic arrangement maximizes flexibility: the two dynamic regions can expand into the same gap, with collision occurring only when the combined allocation exceeds available address space.
On a 64-bit system, the gap between heap and stack is enormous: with 48-bit addressing it typically spans tens of terabytes of unmapped addresses. Stack-heap collision is essentially impossible, because a process would exhaust the user address space the OS allows (128 TB on typical x86-64 Linux) long before the two regions met.
Address Space Layout Randomization (ASLR):
Modern operating systems randomize the positions of segments within the logical address space on each program execution. This security measure makes it difficult for attackers to predict where code or data will be located, thwarting many exploitation techniques.
```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// Global variables - Data/BSS segments
int initialized_global = 42;   // Data segment
int uninitialized_global;      // BSS segment

void print_addresses(void) {
    // Local variable - Stack segment
    int stack_var = 100;

    // Dynamic allocation - Heap segment
    int *heap_ptr = malloc(sizeof(int));
    if (heap_ptr == NULL) {
        perror("malloc");
        return;
    }
    *heap_ptr = 200;

    printf("=== Logical Address Space Exploration ===\n\n");

    printf("Text Segment (Code):\n");
    printf("  Function print_addresses: %p\n\n", (void*)print_addresses);

    printf("Data Segment (Initialized Globals):\n");
    printf("  initialized_global: %p\n\n", (void*)&initialized_global);

    printf("BSS Segment (Uninitialized Globals):\n");
    printf("  uninitialized_global: %p\n\n", (void*)&uninitialized_global);

    printf("Heap Segment (Dynamic Allocation):\n");
    printf("  heap_ptr value: %p\n\n", (void*)heap_ptr);

    printf("Stack Segment (Local Variables):\n");
    printf("  stack_var: %p\n\n", (void*)&stack_var);

    // Demonstrate relative positions. Addresses are compared as integers
    // because relational comparison of unrelated pointers is undefined in C.
    printf("=== Address Comparison ===\n");
    printf("Stack is at higher addresses than Heap: %s\n",
           (uintptr_t)&stack_var > (uintptr_t)heap_ptr ? "Yes" : "No");
    printf("Heap is at higher addresses than BSS: %s\n",
           (uintptr_t)heap_ptr > (uintptr_t)&uninitialized_global ? "Yes" : "No");
    printf("BSS is at higher addresses than Data: %s\n",
           (uintptr_t)&uninitialized_global > (uintptr_t)&initialized_global ? "Yes" : "No");

    free(heap_ptr);
}

/*
 * Sample Output (addresses vary due to ASLR):
 *
 * === Logical Address Space Exploration ===
 *
 * Text Segment (Code):
 *   Function print_addresses: 0x55d3a2c00189
 *
 * Data Segment (Initialized Globals):
 *   initialized_global: 0x55d3a2e03010
 *
 * BSS Segment (Uninitialized Globals):
 *   uninitialized_global: 0x55d3a2e03014
 *
 * Heap Segment (Dynamic Allocation):
 *   heap_ptr value: 0x55d3a3a052a0
 *
 * Stack Segment (Local Variables):
 *   stack_var: 0x7ffeba4c01dc
 *
 * Notice: Stack addresses start with 0x7ff... (high addresses)
 *         Other segments start with 0x55... (lower addresses)
 */

int main(void) {
    print_addresses();
    return 0;
}
```

Not every address in the logical address space is valid. Attempting to access unmapped regions (gaps between segments, addresses beyond allocated ranges, or the null pointer at address 0) triggers a hardware exception, reported as a segmentation fault on Unix and an access violation on Windows. By default, the OS terminates the offending process. This is memory protection in action, enforced through the same translation mechanism that enables logical addressing.
The concept of logical addressing manifests differently across various computing contexts. Understanding these variations reveals the universal importance of address abstraction.
| Context | Address Space Size | Translation Mechanism | Key Characteristics |
|---|---|---|---|
| 32-bit Process | 4 GB (2³² bytes) | Page tables + MMU | 3 GB user / 1 GB kernel split common |
| 64-bit Process | 256 TB canonical (48-bit) | 4-level page tables + MMU | Vast address space, sparse mapping |
| Java Virtual Machine | JVM heap size (configurable) | JVM internal + OS translation | Object references, not raw addresses |
| Web Browser JavaScript | ArrayBuffer size limits | V8/SpiderMonkey engine | Sandboxed, no direct memory access |
| Embedded Systems | Varies (often no MMU) | Direct or simple base+offset | Physical addressing may be used |
| GPU Computing | Device memory space | GPU-specific translation | Separate from CPU address space |
32-bit vs 64-bit Address Spaces:
The transition from 32-bit to 64-bit computing represents a fundamental expansion in logical address space:
32-bit Systems (4 GB limit): With 2³² possible addresses, the entire logical address space is 4 GB. This became limiting as programs grew—especially considering that the OS often reserves 1-2 GB of this space for kernel mapping.
64-bit Systems (practical limits): While 64 bits theoretically address 16 EB, practical implementations use 48 bits (256 TB) or 57 bits (128 PB) of virtual addressing. This is vastly more than any current physical memory, allowing for sparse address spaces and memory-mapped files exceeding physical RAM.
The key insight: logical address space size is an architectural decision, independent of physical memory.
```c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void) {
    printf("Pointer size on this system: %zu bytes (%zu bits)\n\n",
           sizeof(void*), sizeof(void*) * 8);

    printf("=== Theoretical Logical Address Space Sizes ===\n\n");

    // 32-bit system
    uint64_t addr_space_32 = 1ULL << 32;  // 2^32
    printf("32-bit address space:\n");
    printf("  Addresses: %" PRIu64 "\n", addr_space_32);
    printf("  Size: %.2f GB\n\n",
           addr_space_32 / (1024.0 * 1024.0 * 1024.0));

    // 48-bit (common 64-bit implementation)
    uint64_t addr_space_48 = 1ULL << 48;  // 2^48
    printf("48-bit address space (common x86-64):\n");
    printf("  Addresses: %" PRIu64 "\n", addr_space_48);
    printf("  Size: %.2f TB\n\n",
           addr_space_48 / (1024.0 * 1024.0 * 1024.0 * 1024.0));

    // 57-bit (Intel 5-level paging)
    uint64_t addr_space_57 = 1ULL << 57;  // 2^57
    printf("57-bit address space (5-level paging):\n");
    printf("  Addresses: %" PRIu64 "\n", addr_space_57);
    printf("  Size: %.2f PB\n\n",
           addr_space_57 / (1024.0 * 1024.0 * 1024.0 * 1024.0 * 1024.0));

    printf("Note: Full 64-bit would be 16 EB, but no current system implements this.\n");
    printf("The unused high bits are sign-extended, creating 'canonical' addresses.\n");

    return 0;
}

/*
 * Output on a 64-bit system:
 *
 * Pointer size on this system: 8 bytes (64 bits)
 *
 * === Theoretical Logical Address Space Sizes ===
 *
 * 32-bit address space:
 *   Addresses: 4294967296
 *   Size: 4.00 GB
 *
 * 48-bit address space (common x86-64):
 *   Addresses: 281474976710656
 *   Size: 256.00 TB
 *
 * 57-bit address space (5-level paging):
 *   Addresses: 144115188075855872
 *   Size: 128.00 PB
 */
```

Some embedded systems and real-time operating systems run without an MMU. In these cases, 'logical addresses' may correspond directly to physical addresses, or simple software-based translation is used. The abstraction benefits are reduced, but simplicity and deterministic timing can be gained. This represents a tradeoff, not an invalidation of the logical address concept.
The logical address space abstraction provides numerous benefits that are foundational to modern computing. Each benefit addresses a specific problem that would otherwise require complex, error-prone solutions from application programmers.
The Sharing Paradox:
Interestingly, logical address spaces also enable controlled sharing—the opposite of isolation. By mapping multiple logical addresses (potentially in different processes) to the same physical memory, the OS enables:
These capabilities would be extremely difficult to implement safely with physical addressing.
Resource Efficiency:
The logical address abstraction dramatically improves resource utilization:
Consider that the same ELF or PE binary can run on machines with 4 GB, 16 GB, or 256 GB of RAM, with different physical memory layouts, different amounts in use by other processes—all without recompilation. This extreme portability is a direct result of the logical address abstraction. The program's view of memory is invariant; only the mapping changes.
Understanding the origin of logical addresses reveals how the abstraction is maintained from source code through compilation to execution.
The Journey of an Address:
```c
/*
 * Address Generation Example: Following 'x' through the compilation pipeline
 */

// SOURCE CODE LEVEL
// Variable 'x' is a name - no address yet
int x = 42;

int main() {
    int y = x + 1;  // Reference to 'x'
    return y;
}

/*
 * AFTER COMPILATION (Assembly/Object Code):
 *
 * .data
 * x:                          ; 'x' is now a symbol
 *     .long 42
 *
 * .text
 * main:
 *     movl x(%rip), %eax      ; Reference to symbol 'x' (relocatable)
 *     addl $1, %eax
 *     ret
 *
 * At this stage, 'x' is a symbol that will be resolved to
 * an actual address during linking.
 */

/*
 * AFTER LINKING (Executable):
 *
 * Symbol table shows x is at logical address 0x404020:
 *
 * .data
 *     .org 0x404020
 *     .long 42
 *
 * .text
 *     .org 0x401000
 * main:
 *     movl 0x404020(%rip), %eax   ; Actual address substituted
 *     addl $1, %eax               ; (or RIP-relative offset)
 *     ret
 *
 * The linker assigned address 0x404020 to 'x' within the
 * process's logical address space.
 */

/*
 * DURING EXECUTION:
 *
 * 1. CPU fetches instruction from logical address 0x401000
 * 2. Instruction encodes access to logical address 0x404020
 * 3. CPU presents 0x404020 to the MMU for translation
 * 4. MMU translates to physical address (e.g., 0x7FFFDE404020)
 * 5. Physical memory is accessed; value 42 is retrieved
 * 6. All of this happens transparently to the program
 */
```

Position-Independent Code (PIC):
Modern shared libraries use Position-Independent Code, which avoids hardcoded absolute addresses. Instead, addresses are computed relative to the current instruction pointer (RIP-relative on x86-64). This allows the same code to be loaded at different logical addresses in different processes—essential for ASLR and shared library efficiency.
Dynamic Address Computation:
Many logical addresses aren't determined until runtime:
Array indexing (arr[i]) computes addresses from base + offset at runtime.

The compiler and linker ensure that statically assigned addresses, such as those of code and global variables, fall within the regions laid out at link time. Addresses computed at run time carry no such guarantee. The contract is therefore split: the toolchain generates addresses consistent with the layout it defined, and the OS ensures that valid addresses map to actual memory and signals an error for those that do not.
The logical address space is not static—it can be expanded, contracted, and reorganized during program execution. Understanding how programs and operating systems manipulate logical address spaces is crucial for systems programming.
| Operation | Mechanism | Example API | Use Case |
|---|---|---|---|
| Expand Heap | brk/sbrk adjustment | sbrk(increment) / malloc() | Dynamic memory allocation |
| Memory Map | Create new mapping | mmap() | File I/O, shared memory, large allocations |
| Unmap Memory | Remove mapping | munmap() | Free large allocations, unload libraries |
| Protect Memory | Change permissions | mprotect() | JIT compilation, guard pages |
| Shared Memory | Map same pages | shm_open() + mmap() | Inter-process communication |
| Stack Expansion | Automatic on access | OS trap handler | Deep recursion, large local arrays |
```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>
#include <string.h>

void demonstrate_mmap(void) {
    printf("=== Address Space Manipulation with mmap ===\n\n");

    // Allocate a new region in our logical address space
    size_t size = 4096 * 1000;  // ~4 MB

    void *region = mmap(
        NULL,                         // Let OS choose the address
        size,                         // Size of mapping
        PROT_READ | PROT_WRITE,       // Readable and writable
        MAP_PRIVATE | MAP_ANONYMOUS,  // Private, not file-backed
        -1,                           // No file descriptor
        0                             // No offset
    );

    if (region == MAP_FAILED) {
        perror("mmap failed");
        return;
    }

    printf("mmap allocated region at: %p\n", region);
    printf("Region size: %zu bytes\n", size);

    // The region is now part of our logical address space
    // But physical memory is allocated lazily (on first access)

    // Write to the region
    memset(region, 0xAB, 4096);  // The first page is now backed by physical memory
    printf("Wrote to first page\n");

    // Change protection to read-only
    if (mprotect(region, size, PROT_READ) == 0) {
        printf("Changed region to read-only\n");
    }

    // Attempting to write now would cause SIGSEGV:
    // memset(region, 0, 4096);  // CRASH!

    // Clean up - remove from logical address space
    if (munmap(region, size) == 0) {
        printf("Unmapped region - address space shrunk\n");
    }

    // region is now an invalid address - accessing it would crash
}

void show_address_space_info(void) {
    printf("\n=== Current Address Space Info ===\n\n");

    // On Linux, we can read /proc/self/maps
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps) {
        char line[256];
        printf("%-20s %-5s %-8s %s\n", "Address Range", "Perm", "Offset", "Mapping");
        printf("%-20s %-5s %-8s %s\n", "-------------", "----", "------", "-------");

        int count = 0;
        while (fgets(line, sizeof(line), maps) && count < 15) {
            // Each line already ends with a newline; print it unchanged
            printf("%s", line);
            count++;
        }

        if (count == 15) {
            printf("... (truncated)\n");
        }

        fclose(maps);
    }
}

int main(void) {
    demonstrate_mmap();
    show_address_space_info();
    return 0;
}

/*
 * Sample Output:
 *
 * === Address Space Manipulation with mmap ===
 *
 * mmap allocated region at: 0x7f5a3c000000
 * Region size: 4096000 bytes
 * Wrote to first page
 * Changed region to read-only
 * Unmapped region - address space shrunk
 *
 * === Current Address Space Info ===
 *
 * Address Range        Perm  Offset   Mapping
 * -------------        ----  ------   -------
 * 55e8a5c00000-55e8a5c01000 r--p 00000000 /home/user/a.out
 * 55e8a5c01000-55e8a5c02000 r-xp 00001000 /home/user/a.out
 * ...
 */
```

While logical address spaces can be vast, operating systems impose practical limits. On Linux, /proc/sys/vm/overcommit_memory selects the kernel's overcommit policy, and ulimit -v caps the size of a process's address space. Excessive address space allocation (even without physical memory use) can be denied, and the kernel's commit accounting may limit the total memory promised across all processes.
We've established a comprehensive understanding of logical address space—the foundational abstraction that enables modern multiprogramming, memory protection, and virtual memory. Let's consolidate the key insights:
What's Next:
Now that we understand logical address space—the abstraction processes use—we must examine its counterpart: physical address space. Physical addresses represent actual hardware locations. Understanding the physical layer is essential for understanding how translation bridges these two worlds and why memory management is necessary at all.
You now understand logical address space as a fundamental operating system abstraction. This knowledge is a prerequisite for understanding address translation, the MMU, page tables, and virtual memory, all of which build upon the logical/physical distinction we've established. The journey into memory management continues with the physical side of this duality.