When you write arr[42], something remarkable happens: the computer retrieves the 43rd element instantly, without examining the preceding 42 elements. This isn't magic—it's elegant mathematics.
This page dissects the index-to-address conversion in complete detail. By the end, you'll understand exactly how a simple integer index becomes a memory address, why this conversion is so efficient, and how this mechanism forms the foundation of modern computing.
By the end of this page, you will master the address calculation formula, understand pointer arithmetic as an abstraction over this formula, see how different programming languages implement indexing, and appreciate why this simple mechanism is foundational to virtually all data processing.
Every array access ultimately reduces to a single arithmetic operation:
Address(arr[i]) = BaseAddress + (i × ElementSize)
This formula is the engine that powers array access across every programming language, every CPU architecture, and every operating system. Let's break it down component by component.
BaseAddress:
The memory address where the array begins—specifically, where the first element (arr[0]) is stored. This address is established when the array is allocated and remains constant throughout the array's lifetime.
i (Index):
The position of the desired element, conventionally numbered from 0. Index i means "the element that is i positions after the first element."
ElementSize:
The number of bytes each element occupies. This is determined by the element type and must be uniform across all elements (a defining characteristic of arrays).
Zero-based indexing isn't arbitrary. It directly reflects the physical model: index 0 means "0 elements away from the base," so Address = Base + 0×Size = Base. The first element is at the base address with zero offset. This mathematical naturalness is why most languages use zero-based indexing.
Working through a concrete example:
Given:
Array base address: 0x4000
Element type: 32-bit integer (4 bytes)
Target index: 7
Calculation:
Address(arr[7]) = 0x4000 + (7 × 4)
= 0x4000 + 28
= 0x4000 + 0x1C
= 0x401C
To access arr[7], the processor reads the 4 bytes starting at memory address 0x401C. This calculation takes one multiplication and one addition—operations that execute in a single CPU cycle.
Let's trace address calculations for various element types and indices to build intuition.
| Scenario | Base | Index | Element Size | Calculation | Result |
|---|---|---|---|---|---|
| char array | 0x1000 | 5 | 1 byte | 0x1000 + 5×1 | 0x1005 |
| short array | 0x2000 | 10 | 2 bytes | 0x2000 + 10×2 | 0x2014 |
| int array | 0x3000 | 100 | 4 bytes | 0x3000 + 100×4 | 0x3190 |
| double array | 0x4000 | 50 | 8 bytes | 0x4000 + 50×8 | 0x4190 |
| struct (24 bytes) | 0x5000 | 3 | 24 bytes | 0x5000 + 3×24 | 0x5048 |
| First element | 0x6000 | 0 | any | 0x6000 + 0×any | 0x6000 |
Key observations:
Linear relationship — Addresses increase linearly with index. Each increment of i adds exactly ElementSize to the address.
Element size matters — The "stride" between elements equals ElementSize. For doubles, elements are 8 bytes apart; for chars, just 1 byte.
Index 0 is always at base — No matter the element size, arr[0] is at the base address.
Large indices work identically — arr[1000000] uses the same formula as arr[1]. The calculation complexity doesn't increase with the index value.
```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int arr[10] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};

    // Get base address
    uintptr_t base = (uintptr_t)arr;
    size_t element_size = sizeof(int);

    printf("Base address: 0x%lX\n", base);
    printf("Element size: %zu bytes\n\n", element_size);

    for (int i = 0; i < 10; i++) {
        uintptr_t calculated_addr = base + (i * element_size);
        uintptr_t actual_addr = (uintptr_t)&arr[i];
        printf("arr[%d]: calculated=0x%lX, actual=0x%lX, value=%d %s\n",
               i, calculated_addr, actual_addr, arr[i],
               (calculated_addr == actual_addr) ? "✓" : "✗");
    }
    return 0;
}

/* Sample output:
Base address: 0x7FFD5678A420
Element size: 4 bytes

arr[0]: calculated=0x7FFD5678A420, actual=0x7FFD5678A420, value=0 ✓
arr[1]: calculated=0x7FFD5678A424, actual=0x7FFD5678A424, value=10 ✓
arr[2]: calculated=0x7FFD5678A428, actual=0x7FFD5678A428, value=20 ✓
...all match...
*/
```

In C and C++, pointer arithmetic provides a convenient abstraction over the address formula. When you add an integer to a pointer, the compiler automatically scales by the element size.
The key insight:
If ptr is a pointer to a type of size S:
(ptr + n) = (address stored in ptr) + (n × S)
This means arr + 5 gives the address of arr[5]—the compiler handles the size multiplication automatically.
Equivalences:
&arr[i] ≡ arr + i
arr[i] ≡ *(arr + i)
The bracket notation arr[i] is actually syntactic sugar for pointer arithmetic followed by dereference.
```c
#include <stdio.h>

int main(void) {
    int arr[5] = {100, 200, 300, 400, 500};
    int *ptr = arr;  // ptr points to arr[0]

    // These all access the same element (arr[2])
    printf("Using arr[2]: %d\n", arr[2]);
    printf("Using *(arr + 2): %d\n", *(arr + 2));
    printf("Using ptr[2]: %d\n", ptr[2]);
    printf("Using *(ptr + 2): %d\n", *(ptr + 2));

    // Address comparison
    printf("\n&arr[2] = %p\n", (void*)&arr[2]);
    printf("arr + 2 = %p\n", (void*)(arr + 2));
    printf("ptr + 2 = %p\n", (void*)(ptr + 2));
    printf("All equal? %s\n",
           (&arr[2] == arr + 2 && arr + 2 == ptr + 2) ? "Yes" : "No");

    // Pointer increment automatically scales by element size
    printf("\nPointer increment demonstration:\n");
    printf("ptr points to:   %p (value: %d)\n", (void*)ptr, *ptr);
    ptr++;  // Adds sizeof(int) = 4 bytes, not 1
    printf("ptr++ points to: %p (value: %d)\n", (void*)ptr, *ptr);
    ptr++;
    printf("ptr++ points to: %p (value: %d)\n", (void*)ptr, *ptr);
    return 0;
}

/* Output:
Using arr[2]: 300
Using *(arr + 2): 300
Using ptr[2]: 300
Using *(ptr + 2): 300

&arr[2] = 0x7ffd5678a428
arr + 2 = 0x7ffd5678a428
ptr + 2 = 0x7ffd5678a428
All equal? Yes

Pointer increment demonstration:
ptr points to:   0x7ffd5678a420 (value: 100)
ptr++ points to: 0x7ffd5678a424 (value: 200)
ptr++ points to: 0x7ffd5678a428 (value: 300)
*/
```

When you increment a pointer (ptr++), it moves by the size of the pointed-to type, not by 1 byte. An int* advances 4 bytes per increment; a double* advances 8 bytes. This automatic scaling is convenient but can be confusing when comparing with raw byte arithmetic.
Pointer subtraction:
You can also subtract pointers to determine the distance (in elements) between them:
int *p1 = &arr[5];
int *p2 = &arr[2];
ptrdiff_t distance = p1 - p2; // Result: 3 (elements, not bytes)
The result is automatically scaled by element size—you get the number of elements between the pointers, not the number of bytes.
At the CPU level, the address formula is implemented with remarkable efficiency. Modern processors have dedicated addressing modes that perform the calculation in a single instruction.
x86/x64 addressing modes:
The x86 architecture supports "scaled-index" addressing that directly encodes the array access formula:
; Access arr[i] where arr is at address in RBX, i is in RCX
; Element size is 4 bytes
mov eax, [rbx + rcx*4] ; Load arr[i] into EAX
This single instruction performs the scaling (rcx × 4), the addition (rbx + scaled offset), and the memory load—all in essentially one clock cycle (memory latency aside).
x86 supports scale factors of 1, 2, 4, and 8 directly in the instruction encoding. These cover the most common element sizes (char, short, int/float, long/double). Non-standard sizes require explicit multiplication instructions, slightly increasing cost.
Comparison with other architectures:
ARM:
; Load arr[i], base in X0, index in X1, 4-byte elements
ldr w2, [x0, x1, lsl #2] ; x1 << 2 = x1 * 4
RISC-V:
; Less specialized - explicit multiply may be needed
slli t1, a1, 2 ; t1 = a1 * 4 (shift left 2 = multiply by 4)
add t2, a0, t1 ; t2 = base + offset
lw a2, 0(t2) ; load from calculated address
The bottom line:
Regardless of architecture, array access compiles to a small, fixed number of instructions. The address calculation doesn't depend on array size or index value—accessing arr[0] and arr[1000000] take essentially the same time.
| Architecture | Instructions for arr[i] Load | Notes |
|---|---|---|
| x86/x64 | 1 (scaled-index addressing) | Scales 1,2,4,8 built into instruction |
| ARM (AArch64) | 1 (shifted register offset) | Shifts for power-of-2 scales |
| RISC-V | 2-3 (shift/mul + add + load) | Simpler encoding, more instructions |
| MIPS | 2-3 (sll + addu + lw) | Similar to RISC-V |
The O(1) random access property of arrays is one of the most profound results in data structure design. Let's appreciate why this is remarkable and what makes it possible.
What O(1) means for arrays:
The time to access arr[i] is constant—it doesn't grow with the array's length, the value of the index, or the element's position in memory. Whether the array has 10 elements or 10 billion, accessing any element takes the same time.
What enables O(1) access:
Contiguous storage — every element's position is computable from the base.
Uniform element size — the offset is a simple multiple of the index.
Random-access memory — the hardware reaches any address in the same time.
Arithmetic addressing — one multiplication and one addition, regardless of index.
Contrast with sequential structures:
In a linked list, to access the i-th element, you must traverse from the head through i-1 predecessor nodes. Each step requires following a pointer. Access time is O(i), or O(n) in the worst case.
Linked list access to element 1000:
head → node1 → node2 → node3 → ... → node1000
(1000 pointer dereferences)
Array access to element 1000:
base + 1000 × size = address
1 multiplication + 1 addition
The array's O(1) vs. linked list's O(n) is one of the most significant differences between data structures.
The term "RAM" reflects that any memory location can be accessed with the same latency—you don't have to scan from the beginning like a tape drive. Arrays leverage this property fully. Without random-access hardware, arrays as we know them wouldn't exist.
The address formula works for any index—including invalid ones. This creates both flexibility and danger.
The formula doesn't know bounds:
int arr[5]; // Valid indices: 0, 1, 2, 3, 4
&arr[5] = base + 5×4 = base + 20 // Just past the array (UB if dereferenced)
&arr[-1] = base + (-1)×4 = base - 4 // Before the array (definitely UB)
&arr[100] = base + 100×4 = base + 400 // Far past the array (definitely UB)
The formula happily computes these addresses. The hardware will attempt to access them. What happens depends on what's at those addresses.
Undefined behavior in C/C++:
Accessing out-of-bounds memory is undefined behavior—the language makes no guarantees: the program may crash, read garbage, silently corrupt adjacent data, or appear to work correctly until it doesn't.
Out-of-bounds array writes are the source of countless security vulnerabilities. Buffer overflow exploits overwrite adjacent memory (return addresses, function pointers, security flags) to hijack program execution. This remains one of the most dangerous bug classes in systems programming.
Bounds checking in safe languages:
Languages like Java, Python, Go, and Rust check array bounds on every access:
arr = [1, 2, 3, 4, 5]
arr[10] # Raises IndexError: list index out of range
This prevents undefined behavior but adds overhead: every access performs a comparison against the array's length and a conditional branch before the address calculation.
Some optimizations (bounds check elimination) can remove redundant checks, but the baseline overhead remains.
The tradeoff:
Modern systems increasingly favor safety, but performance-critical code often uses manual bounds management.
| Language | Bounds Checking | Out-of-Bounds Result | Overhead |
|---|---|---|---|
| C/C++ | None by default | Undefined behavior | Zero |
| Java | Always | ArrayIndexOutOfBoundsException | ~1-2 cycles |
| Python | Always | IndexError | Part of interpreted overhead |
| Go | Always | panic | ~1-2 cycles |
| Rust | Yes (with opt-out) | panic (or unchecked) | ~1-2 cycles (unless unsafe) |
| JavaScript | Implicit | Returns undefined | Part of dynamic typing |
Some languages support negative indices as a convenient way to access elements from the end of an array. Understanding how this works reveals the difference between language semantics and physical reality.
Python's negative indexing:
arr = [10, 20, 30, 40, 50]
arr[-1] # 50 (last element)
arr[-2] # 40 (second-to-last)
arr[-5] # 10 (first element; same as arr[0])
How it works:
Python translates negative indices before applying the address formula:
if index < 0:
    index = length + index
then apply: base + index × size
So arr[-1] becomes arr[5 + (-1)] = arr[4], which is the last element.
C doesn't have this:
In C, negative indices compute a memory address before the array starts:
int arr[5];
arr[-1]; // Computes (base - 4), accessing memory before arr
// This is undefined behavior (unless deliberately engineered)
There's no automatic translation—the address formula works with the raw negative number.
Sometimes negative indexing in C is intentional. If ptr points to the middle of a buffer, ptr[-1] accesses the previous element. This is valid only if the accessed address is within allocated memory. It's an advanced technique requiring careful reasoning about ownership and bounds.
```python
# Python: Negative indices are translated
arr = [10, 20, 30, 40, 50]
print(f"arr[-1] = {arr[-1]}")  # 50
print(f"arr[-5] = {arr[-5]}")  # 10

# Internally: arr[-1] → arr[len(arr) + (-1)] → arr[4]

# Still bounds-checked:
try:
    print(arr[-6])  # IndexError: too negative
except IndexError as e:
    print(f"Error: {e}")
```

Multi-dimensional arrays (matrices, 3D grids, etc.) extend the address formula. Though they appear multi-dimensional to programmers, they're still stored in contiguous 1D memory.
Row-major vs. column-major order:
Consider a 3×4 matrix (3 rows, 4 columns):
Logical view:
Col 0 Col 1 Col 2 Col 3
Row 0: [a00] [a01] [a02] [a03]
Row 1: [a10] [a11] [a12] [a13]
Row 2: [a20] [a21] [a22] [a23]
Row-major layout (C, C++, Python numpy default):
Elements in the same row are contiguous:
[a00][a01][a02][a03][a10][a11][a12][a13][a20][a21][a22][a23]
Column-major layout (Fortran, MATLAB, Julia):
Elements in the same column are contiguous:
[a00][a10][a20][a01][a11][a21][a02][a12][a22][a03][a13][a23]
Address formula for 2D arrays (row-major):
Address(arr[row][col]) = Base + ((row × columns) + col) × ElementSize
Example:
int arr[3][4]; // 3 rows, 4 columns
Base = 0x5000
ElementSize = 4 bytes
Address(arr[1][2]) = 0x5000 + ((1 × 4) + 2) × 4
= 0x5000 + (4 + 2) × 4
= 0x5000 + 24
= 0x5018
Generalizing to N dimensions:
For an array with dimensions D₀ × D₁ × D₂ × ... × Dₙ₋₁ (row-major):
Address(arr[i₀][i₁][i₂]...[iₙ₋₁]) = Base + Offset × ElementSize
Where Offset = i₀×(D₁×D₂×...×Dₙ₋₁) + i₁×(D₂×...×Dₙ₋₁) + ... + iₙ₋₁
The innermost index varies fastest; each outer index adds a larger stride.
Traversing a row-major 2D array row-by-row accesses sequential memory addresses (cache-friendly). Traversing column-by-column jumps by the row stride (cache-unfriendly). In large matrices, choosing the right traversal order can mean 10× or greater performance differences.
```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int arr[3][4] = {
        {0, 1, 2, 3},
        {10, 11, 12, 13},
        {20, 21, 22, 23}
    };

    uintptr_t base = (uintptr_t)arr;
    int cols = 4;
    size_t elem_size = sizeof(int);

    printf("Base: 0x%lX, Columns: %d, Element size: %zu\n\n",
           base, cols, elem_size);

    // Verify formula for arr[1][2]
    int row = 1, col = 2;
    uintptr_t calculated = base + ((row * cols) + col) * elem_size;
    uintptr_t actual = (uintptr_t)&arr[row][col];

    printf("arr[%d][%d]:\n", row, col);
    printf("  Formula: base + ((%d × %d) + %d) × %zu\n", row, cols, col, elem_size);
    printf("  Calculated: 0x%lX\n", calculated);
    printf("  Actual:     0x%lX\n", actual);
    printf("  Match: %s\n", (calculated == actual) ? "Yes ✓" : "No ✗");
    printf("  Value: %d\n", arr[row][col]);
    return 0;
}
```

Index expressions can be complex, and compilers are remarkably clever at optimizing them.
Common patterns:
arr[i * stride + offset] // Strided access with offset
arr[arr[i]] // Indirect indexing
arr[i & (size - 1)] // Modular indexing (power of 2)
arr[(i + j) % n] // Circular buffer access
Each computes an address using the formula, but the index calculation may itself be complex.
Compiler optimizations:
Strength reduction:
// Original: multiplies in every iteration
for (int i = 0; i < n; i++) {
sum += arr[i * 4]; // i*4 computed each time
}
// Optimized: increments a scaled index
// Compiler transforms to:
int idx = 0;
for (int i = 0; i < n; i++) {
sum += arr[idx]; // No multiplication
idx += 4;
}
Common subexpression elimination:
arr[i*10 + 5] = arr[i*10 + 5] + 1; // i*10 computed once, reused
Loop invariant code motion:
for (int i = 0; i < n; i++) {
arr[x * y + i] = ...; // x*y is constant; computed once outside loop
}
Modern compilers are extremely good at optimizing array access patterns. Hand-optimizing index calculations often produces code that's harder to read without being faster. Write clear code first; profile and optimize only when necessary.
When compilers can't help:
Indirect indexing: arr[indices[i]] — The indices come from another array at runtime; no pattern to exploit.
Data-dependent access: arr[arr[i]] — Each access depends on the previous result; can't parallelize or prefetch.
Aliasing concerns: If the compiler can't prove arrays don't overlap, it may miss optimizations.
Very complex expressions: Extremely convoluted index calculations may defeat optimization passes.
In these cases, understanding the address formula helps you reason about performance manually.
We've thoroughly examined the mechanism that makes arrays one of the most efficient data structures ever devised.
What's next:
Armed with a complete understanding of array memory layout, allocation, and indexing, we'll now explore why this all matters—the O(1) random access that makes arrays the foundation of efficient computing. We'll examine the formal analysis and practical implications of constant-time access.
You now understand exactly how array[index] becomes a memory address—from the mathematical formula to hardware implementation to compiler optimization. This knowledge is fundamental to computer science and applies throughout your career. Next, we'll formalize why random access is O(1) and explore its profound implications.