Why do we have int8, int16, int32, and int64—but not int17 or int53? Why can't an integer simply grow as needed, dynamically allocating more bits when values get larger?
The answer lies at the intersection of hardware architecture, performance optimization, and memory efficiency. Understanding fixed-size storage isn't just about knowing "32 bits means 4 bytes"—it's about understanding how processors, memory systems, and compilers work together to execute your code efficiently.
This knowledge transforms how you think about data representation and helps you make informed decisions about which integer types to use in your programs.
By the end of this page, you will understand: (1) Why hardware enforces fixed-width integers, (2) How different bit widths map to memory and CPU registers, (3) The performance implications of choosing different integer sizes, (4) Alignment and padding in memory layout, and (5) Best practices for selecting integer types.
Computer processors are designed around specific word sizes—the fundamental unit of data that the CPU processes in a single operation. Modern processors are built around a 64-bit word, with hardware support for 32-, 16-, and 8-bit operations as well.
Why powers of two?
The use of 8, 16, 32, 64 (all powers of two) isn't arbitrary:
Binary addressing: With n address lines, you can address exactly $2^n$ locations. Powers of two map cleanly.
Bit manipulation efficiency: Operations like shifting, masking, and extracting fields are simple and fast when widths are powers of two (see the sketch after this list).
Memory alignment: Memory is physically organized in power-of-two chunks. Aligning data to these boundaries enables single-cycle access.
Historical evolution: 8-bit → 16-bit → 32-bit → 64-bit evolution happened by doubling, each generation addressing the limitations of the previous.
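To make the bit-manipulation point concrete, here is a minimal C sketch showing how a power-of-two width lets you slice a value into bytes with nothing but shifts and masks, while a hypothetical int17 would leave awkward, non-byte-aligned boundaries:

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t value = 0x12345678;

    // Because 8 divides 32 evenly, every byte is reachable with a shift
    // by a multiple of 8 and a 0xFF mask - no division or special cases.
    for (int i = 0; i < 4; i++) {
        uint8_t byte = (value >> (8 * i)) & 0xFF;
        printf("byte %d: 0x%02X\n", i, byte);
    }

    // A hypothetical 17-bit field has no clean byte boundary; isolating
    // it requires an irregular mask (2^17 - 1) that aligns with nothing.
    uint32_t low17 = value & 0x1FFFF;
    printf("low 17 bits: 0x%05X\n", low17);
    return 0;
}
```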
| Era | Word Size | Example CPUs | Max RAM Addressable |
|---|---|---|---|
| 1970s | 8-bit | Intel 8080, Zilog Z80 | 64 KB |
| 1980s | 16-bit | Intel 8086, Motorola 68000 | 1 MB - 16 MB |
| 1990s | 32-bit | Intel 386/486/Pentium | 4 GB |
| 2000s+ | 64-bit | AMD64, ARM64, Apple M1/M2 | 16 EB (exabytes) |
With 32-bit addresses, you can only address 2^32 = 4,294,967,296 bytes = 4 GB of memory. This became a severe limitation in the 2000s as applications grew. The transition to 64-bit extended the addressable range to 2^64 bytes—vastly more than any current system uses. In practice, current processors implement 48 to 57 bits of virtual address space and up to 52 bits of physical address space, which already reaches into the petabytes.
Modern programming languages provide integer types at specific fixed widths. Understanding these helps you choose the right type for any situation.
| Width | C/C++ (stdint.h) | Rust | Java | Go | Bytes |
|---|---|---|---|---|---|
| 8-bit signed | int8_t | i8 | byte | int8 | 1 |
| 8-bit unsigned | uint8_t | u8 | — | uint8/byte | 1 |
| 16-bit signed | int16_t | i16 | short | int16 | 2 |
| 16-bit unsigned | uint16_t | u16 | char (special) | uint16 | 2 |
| 32-bit signed | int32_t | i32 | int | int32 | 4 |
| 32-bit unsigned | uint32_t | u32 | — | uint32 | 4 |
| 64-bit signed | int64_t | i64 | long | int64 | 8 |
| 64-bit unsigned | uint64_t | u64 | — | uint64 | 8 |
The "native" integer type:
Most languages also provide a "native" or "platform" integer type that matches the processor's word size:
| Language | Native Signed | Native Unsigned | Typical Size |
|---|---|---|---|
| C/C++ | int | unsigned int | 32-bit (usually) |
| C/C++ | ptrdiff_t | size_t | 32 or 64-bit |
| Rust | isize | usize | 32 or 64-bit |
| Go | int | uint | 32 or 64-bit |
Why size_t and usize exist:
These types are specifically designed to hold the size of any object or index into any array. On a 32-bit system, size_t is 32 bits; on a 64-bit system, it's 64 bits. This allows code to work correctly regardless of platform:
```c
size_t len = strlen(str);  // Works correctly on both 32-bit and 64-bit
for (size_t i = 0; i < len; i++) {
    process(str[i]);
}
```
In C, int is only guaranteed to be at least 16 bits. On most modern desktop systems it's 32 bits, but embedded systems may have 16-bit int. Always use <stdint.h> types (int32_t, etc.) when you need guaranteed sizes. The standard says: sizeof(short) ≤ sizeof(int) ≤ sizeof(long) ≤ sizeof(long long).
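As a quick check of those guarantees, here is a small self-contained sketch that asserts the fixed widths at compile time (using C11 _Static_assert) and prints the platform-dependent ones at run time:

```c
#include <stdio.h>
#include <stdint.h>
#include <limits.h>

// Compile-time guarantees: these hold on every conforming C11 compiler.
_Static_assert(sizeof(int32_t) == 4, "int32_t is exactly 4 bytes");
_Static_assert(sizeof(int) * CHAR_BIT >= 16, "int is at least 16 bits");

int main(void) {
    // Platform-dependent: int is usually 4 bytes on desktops,
    // but may be 2 bytes on small embedded targets.
    printf("sizeof(short)     = %zu\n", sizeof(short));
    printf("sizeof(int)       = %zu\n", sizeof(int));
    printf("sizeof(long)      = %zu\n", sizeof(long));
    printf("sizeof(long long) = %zu\n", sizeof(long long));
    return 0;
}
```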
When a multi-byte integer is stored in memory, its bytes must be arranged in some order. This byte order (or endianness) affects how data is interpreted, especially when transferring between systems.
Little-endian vs Big-endian:
Consider storing the 32-bit integer 0x12345678 at memory address 0x1000:
| Address | Little-Endian | Big-Endian |
|---|---|---|
| 0x1000 | 0x78 (LSB) | 0x12 (MSB) |
| 0x1001 | 0x56 | 0x34 |
| 0x1002 | 0x34 | 0x56 |
| 0x1003 | 0x12 (MSB) | 0x78 (LSB) |
Why endianness matters:
Network communication: Network protocols (TCP/IP) use big-endian ("network byte order"). Converting between host and network order is required.
File format interoperability: Binary files must specify their endianness to be portable.
Cross-platform serialization: Sending data between ARM and x86 systems requires awareness of byte order.
```c
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h> // For htonl, ntohl

// Detect system endianness at runtime
void detect_endianness() {
    uint32_t test = 0x01020304;
    uint8_t* bytes = (uint8_t*)&test;
    if (bytes[0] == 0x04) {
        printf("System is LITTLE-ENDIAN\n");
    } else if (bytes[0] == 0x01) {
        printf("System is BIG-ENDIAN\n");
    }
}

// Visualize how a 32-bit integer is stored in memory
void visualize_storage(uint32_t value) {
    uint8_t* bytes = (uint8_t*)&value;
    printf("Value 0x%08X is stored as bytes:\n", value);
    for (int i = 0; i < 4; i++) {
        printf("  Address +%d: 0x%02X\n", i, bytes[i]);
    }
}

// Network byte order conversion
void network_order_demo() {
    uint32_t host_value = 0x12345678;
    uint32_t network_value = htonl(host_value); // Host to network long
    uint32_t restored = ntohl(network_value);   // Network to host long
    printf("Host value:    0x%08X\n", host_value);
    printf("Network order: 0x%08X\n", network_value);
    printf("Restored:      0x%08X\n", restored);
}

int main() {
    detect_endianness();
    printf("\n");
    visualize_storage(0x12345678);
    printf("\n");
    network_order_demo();
    return 0;
}
```

The terms "little-endian" and "big-endian" come from Jonathan Swift's Gulliver's Travels, where factions warred over which end of an egg to crack first. The point: the choice is largely arbitrary, but once made, consistency matters enormously.
Memory alignment refers to placing data at memory addresses that are multiples of the data's size. A 4-byte integer should ideally start at an address divisible by 4.
Why alignment matters:
Hardware efficiency: Many processors can only access memory at aligned addresses in a single operation. Unaligned access requires multiple memory reads and bit-shifting.
Atomicity: Aligned accesses are often atomic (indivisible), which is critical for multithreaded programming.
Some architectures disallow unaligned access: On certain ARM modes and older MIPS, unaligned access causes a hardware fault.
Cache efficiency: Aligned data is less likely to span cache lines, reducing memory traffic.
Alignment requirements by type:
| Type Size | Natural Alignment | Address Divisible By |
|---|---|---|
| 1 byte | 1-byte aligned | Any address |
| 2 bytes | 2-byte aligned | 2 (0x...0, 0x...2, 0x...4, etc.) |
| 4 bytes | 4-byte aligned | 4 (0x...0, 0x...4, 0x...8, 0x...C) |
| 8 bytes | 8-byte aligned | 8 (0x...0, 0x...8) |
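These requirements can be queried directly in C11 via the alignof macro; a minimal sketch:

```c
#include <stdio.h>
#include <stdint.h>
#include <stdalign.h> // C11: alignof macro for _Alignof

int main(void) {
    // Each fixed-width type's natural alignment on this platform.
    printf("alignof(int8_t)  = %zu\n", alignof(int8_t));
    printf("alignof(int16_t) = %zu\n", alignof(int16_t));
    printf("alignof(int32_t) = %zu\n", alignof(int32_t));
    printf("alignof(int64_t) = %zu\n", alignof(int64_t));
    return 0;
}
```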
Struct padding example:
Consider this C struct:
```c
struct Example {
    char a;  // 1 byte
    int b;   // 4 bytes
    char c;  // 1 byte
};
```
You might expect sizeof(struct Example) = 1 + 4 + 1 = 6. But it's actually 12 bytes on most systems! The compiler inserts padding to maintain alignment:
```
Offset 0:    char a  (1 byte)
Offset 1-3:  PADDING (3 bytes to align int to offset 4)
Offset 4-7:  int b   (4 bytes)
Offset 8:    char c  (1 byte)
Offset 9-11: PADDING (3 bytes for struct alignment)
Total: 12 bytes
```
```c
#include <stdio.h>
#include <stddef.h> // for offsetof

// Poorly ordered struct (wastes memory)
struct Wasteful {
    char a;  // 1 byte + 3 padding
    int b;   // 4 bytes
    char c;  // 1 byte + 3 padding
}; // Total: 12 bytes

// Well-ordered struct (minimal padding)
struct Efficient {
    int b;   // 4 bytes
    char a;  // 1 byte
    char c;  // 1 byte + 2 padding (for struct alignment)
}; // Total: 8 bytes

// Even more compact with forced packing
struct Packed {
    int b;   // 4 bytes
    char a;  // 1 byte
    char c;  // 1 byte - packed, so no trailing padding at all
} __attribute__((packed)); // GCC: disable padding (6 bytes), may hurt perf

int main() {
    printf("sizeof(Wasteful): %zu bytes\n", sizeof(struct Wasteful));
    printf("  offsetof(a): %zu\n", offsetof(struct Wasteful, a));
    printf("  offsetof(b): %zu\n", offsetof(struct Wasteful, b));
    printf("  offsetof(c): %zu\n", offsetof(struct Wasteful, c));

    printf("\nsizeof(Efficient): %zu bytes\n", sizeof(struct Efficient));
    printf("  offsetof(b): %zu\n", offsetof(struct Efficient, b));
    printf("  offsetof(a): %zu\n", offsetof(struct Efficient, a));
    printf("  offsetof(c): %zu\n", offsetof(struct Efficient, c));

    printf("\nsizeof(Packed): %zu bytes\n", sizeof(struct Packed));
    return 0;
}
```

Order struct members from largest to smallest. This minimizes padding by ensuring that smaller members can fill gaps left by alignment requirements. In performance-critical code with millions of objects, this can save significant memory.
Modern CPUs contain registers—small, extremely fast storage locations directly on the processor chip. Understanding registers illuminates why native-width integers are efficient.
x86-64 General Purpose Registers:
| Full Register | Lower 32 bits | Lower 16 bits | Lower 8 bits |
|---|---|---|---|
| RAX (64-bit) | EAX (32-bit) | AX (16-bit) | AL (8-bit) |
| RBX | EBX | BX | BL |
| RCX | ECX | CX | CL |
| RDX | EDX | DX | DL |
Modern x86-64 has 16 general-purpose 64-bit registers. Each operation (add, multiply, compare) typically operates on entire registers.
Performance implications of integer width:
1. Native width is fastest for arithmetic:
On a 64-bit system, 64-bit operations are the "native" width. However, 32-bit operations are often equally fast because:
- x86-64 executes 32-bit instructions at full speed and automatically zero-extends 32-bit results to fill the 64-bit register.
- 32-bit instruction encodings are often shorter than their 64-bit counterparts, improving code density.
2. Smaller types may require extra instructions:
Using int8_t or int16_t can sometimes be slower than int32_t because:
- Loading a narrow value into a register typically requires an explicit sign- or zero-extension instruction (movsx/movzx on x86).
- C's integer promotion rules widen them to int for arithmetic anyway, and the result may need to be truncated back.
- On some x86 microarchitectures, writes to partial registers create false dependencies that stall the pipeline.
3. Memory bandwidth often dominates:
Despite the above, smaller types can improve performance when:
- Data lives in large arrays, so more elements fit in each cache line.
- The workload is bound by memory bandwidth rather than arithmetic.
- The loop is vectorizable with SIMD, where narrower elements mean more operations per instruction (see the SIMD section below).
Smaller types don't always mean faster code. Using int (32-bit) for loop counters is often faster than short (16-bit) because no additional extension instructions are needed. However, for large arrays of data, smaller types mean better cache utilization and potentially significant speedups. Profile, don't guess!
```c
#include <stdint.h>
#include <stdio.h>

// Functions using different integer widths.
// Compile with: gcc -O2 -S to see the generated assembly.

// Using int8_t - may require sign extension
int8_t add_bytes(int8_t a, int8_t b) {
    return a + b; // Result must be truncated to 8 bits
}

// Using int32_t - natural register width on most systems
int32_t add_ints(int32_t a, int32_t b) {
    return a + b; // Direct addition, no extension needed
}

// Using int64_t - also natural on 64-bit systems
int64_t add_longs(int64_t a, int64_t b) {
    return a + b; // Direct addition
}

// Array processing - smaller types can help cache utilization
long sum_int8_array(const int8_t* arr, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += arr[i]; // Extension happens, but more elements fit in cache
    }
    return sum;
}

long sum_int32_array(const int32_t* arr, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += arr[i]; // Fewer elements fit in cache
    }
    return sum;
}
```

Selecting the appropriate integer width involves balancing range requirements, memory usage, and performance. Here's a systematic approach:
| Use Case | Recommended Type | Rationale |
|---|---|---|
| Small enum/status values | int8_t or int32_t | int8 saves memory in arrays; int32 may be faster standalone |
| Loop counters | int or size_t | Native register width; no extension overhead |
| Array indices | size_t | Guaranteed to hold any valid array index |
| Counts up to billions | int64_t | 32-bit overflows at ~2.1 billion |
| Timestamps (Unix epoch) | int64_t | 32-bit overflows in 2038 |
| File sizes | int64_t or off_t | Files can exceed 4 GB |
| Memory sizes | size_t | Can represent any object size on the platform |
| Network protocols | uint8_t, uint16_t, uint32_t | Fixed-width for portability |
| Bit flags | uint32_t or uint64_t | Unsigned for clean bit operations |
| RGB color values | uint8_t | Range 0-255 fits perfectly |
Decision flowchart:
1. Do you need negative values? → If not, consider unsigned.
2. What's the maximum possible value?
   - Under 128 → int8_t might work (but int32_t often faster)
   - Under 32,768 → int16_t might work
   - Under ~2.1 billion → int32_t is sufficient
   - Larger → int64_t
3. Is this for array storage or single values?
   - Array storage → smaller widths save memory and cache space
   - Single values → int for simplicity
4. Is precise width required for interoperability (file formats, network protocols)?
   - Yes → use exact-width types (uint8_t, uint16_t, uint32_t, ...)
5. Is this for indexing or sizes?
   - Yes → size_t for sizes and array indices to ensure portability

Many experienced programmers use int for most local variables and only switch to specific widths when there's a reason. The reasoning: int is the natural/fast type, and premature optimization of integer widths rarely matters. Only specify exact widths when portability, range, or memory layout requirements demand it.
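As a sketch of how the recommendations above combine in practice (the Record type and all its fields are purely illustrative), consider:

```c
#include <stdint.h>
#include <stddef.h>

// Hypothetical record applying the table's recommendations; members are
// also ordered largest-to-smallest to minimize padding (see earlier tip).
typedef struct {
    int64_t  created_at; // Unix timestamp: 64-bit sidesteps the 2038 problem
    int64_t  file_size;  // Files can exceed 4 GB, so 32 bits is not enough
    uint32_t flags;      // Bit flags: unsigned for clean bit operations
    uint8_t  r, g, b;    // RGB color channels: 0-255 fits uint8_t exactly
    int8_t   status;     // Small status code: 8 bits saves memory in arrays
} Record;

// size_t indexing works unchanged on 32- and 64-bit platforms.
int64_t total_size(const Record* records, size_t count) {
    int64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += records[i].file_size;
    }
    return total;
}
```

Note that because the members run from 8-byte fields down to 1-byte fields, this layout needs no internal padding at all: sizeof(Record) is exactly 24 bytes.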
The Year 2038 Problem (Y2K38 or the Unix Millennium Bug) is a perfect case study in the consequences of integer width choices made decades ago.
The setup:
Unix systems historically represented time as a time_t value—the number of seconds since January 1, 1970 (the Unix epoch). On many systems, time_t was a 32-bit signed integer.
The problem:
A 32-bit signed integer can represent values up to 2,147,483,647. Starting from 1970:
$$1970 + \frac{2,147,483,647 \text{ seconds}}{60 \times 60 \times 24 \times 365.25} \approx 2038$$
Specifically, at 03:14:07 UTC on January 19, 2038, the timestamp reaches 2,147,483,647. One second later, overflow occurs, and the value wraps to -2,147,483,648, which represents a date in 1901.
| Time | 32-bit Signed Value | Interpretation |
|---|---|---|
| Jan 1, 1970 00:00:00 | 0 | Unix epoch |
| Jan 19, 2038 03:14:07 | 2,147,483,647 | Maximum positive |
| Jan 19, 2038 03:14:08 | -2,147,483,648 | Wraps to ~1901! |
| Feb 7, 2106 06:28:15 | 4,294,967,295 | Maximum unsigned 32-bit |
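A minimal sketch of the wraparound, assuming a two's-complement platform with a 64-bit time_t whose gmtime handles pre-1970 dates (true of glibc):

```c
#include <stdio.h>
#include <stdint.h>
#include <time.h>

int main(void) {
    // The last second representable by a signed 32-bit timestamp.
    int32_t last = INT32_MAX; // Jan 19, 2038 03:14:07 UTC

    // Add one second in 64-bit arithmetic, then truncate back to 32 bits.
    // On two's-complement hardware this wraps to INT32_MIN - exactly what
    // a real 32-bit time_t does.
    int32_t wrapped = (int32_t)((int64_t)last + 1);

    char buf[32];
    time_t t = last;
    strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", gmtime(&t));
    printf("%11ld -> %s UTC\n", (long)last, buf);

    t = wrapped; // Negative value: a date back in 1901
    strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", gmtime(&t));
    printf("%11ld -> %s UTC\n", (long)wrapped, buf);

    printf("sizeof(time_t) on this system: %zu bytes\n", sizeof(time_t));
    return 0;
}
```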
Real-world impact:
This isn't theoretical:
Embedded systems, industrial controllers, and binary file formats with decades-long lifetimes still store time_t in 32-bit fields, and many will remain in service past 2038.

The solution:
Modern systems have transitioned to 64-bit time_t:
$$1970 + \frac{2^{63} - 1 \text{ seconds}}{60 \times 60 \times 24 \times 365.25} \approx 292 \text{ billion years}$$
That's roughly 20 times the age of the universe. We're safe for a while.
When the Unix time system was designed in the early 1970s, 2038 seemed impossibly far away, and memory was precious. The lesson: integer width decisions made today may have consequences decades later. When in doubt, use 64-bit for anything that might grow or persist long-term.
Single Instruction, Multiple Data (SIMD) is a technique where one instruction operates on multiple data elements simultaneously. Integer width directly affects SIMD efficiency.
The core concept:
Modern CPUs have wide SIMD registers (128, 256, or 512 bits). By using smaller integer types, you can process more elements per instruction:
| SIMD Register Width | int8 elements | int16 elements | int32 elements | int64 elements |
|---|---|---|---|---|
| 128 bits (SSE) | 16 | 8 | 4 | 2 |
| 256 bits (AVX2) | 32 | 16 | 8 | 4 |
| 512 bits (AVX-512) | 64 | 32 | 16 | 8 |
Practical implication:
Processing an array of int8_t can be 8x faster than int64_t using AVX2—if the algorithm is amenable to vectorization.
```c
#include <immintrin.h> // Intel intrinsics
#include <stdint.h>
#include <stddef.h>    // size_t

// Process 32 bytes (32 int8_t) in parallel with AVX2
void add_arrays_simd(int8_t* a, const int8_t* b, size_t n) {
    size_t i = 0;
    // Process 32 elements at a time with SIMD
    for (; i + 32 <= n; i += 32) {
        __m256i va = _mm256_loadu_si256((const __m256i*)&a[i]);
        __m256i vb = _mm256_loadu_si256((const __m256i*)&b[i]);
        __m256i vsum = _mm256_add_epi8(va, vb); // 32 additions in 1 instruction!
        _mm256_storeu_si256((__m256i*)&a[i], vsum);
    }
    // Handle remaining elements
    for (; i < n; i++) {
        a[i] += b[i];
    }
}

// Compare: same operation with int32_t processes only 8 elements per instruction
void add_arrays_simd_int32(int32_t* a, const int32_t* b, size_t n) {
    size_t i = 0;
    // Process 8 elements at a time with SIMD (vs 32 for int8!)
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256((const __m256i*)&a[i]);
        __m256i vb = _mm256_loadu_si256((const __m256i*)&b[i]);
        __m256i vsum = _mm256_add_epi32(va, vb); // Only 8 additions
        _mm256_storeu_si256((__m256i*)&a[i], vsum);
    }
    for (; i < n; i++) {
        a[i] += b[i];
    }
}
```

Modern compilers can automatically vectorize simple loops. Smaller integer types give the compiler more opportunities for aggressive vectorization. Even without writing SIMD intrinsics, choosing appropriate integer widths can significantly impact performance.
Fixed-size integers are a fundamental aspect of how computers represent and process data. Here's what we've covered:
- Hardware enforces fixed-width integers because CPUs are built around power-of-two word sizes and registers.
- Endianness determines how multi-byte values are laid out in memory; network protocols standardize on big-endian.
- Alignment and padding shape struct layout; ordering members from largest to smallest minimizes waste.
- Width choices have real performance effects: native widths are fast for standalone values, while narrower widths win for large arrays and SIMD.
- Use int32_t, int64_t for portability; int for general computation.
- Width decisions carry long-term consequences, as the Year 2038 problem demonstrates.

What's next:
With a solid understanding of how integers are stored, we'll examine the operations performed on integers and their computational cost model. Understanding operation costs connects integer types to algorithm analysis.
You now understand why integers come in fixed sizes, how they're stored in memory, and the performance implications of choosing different widths. This knowledge forms the bridge between abstract integers and physical computer systems.