Digital Forensics & Reverse Engineering – Lecture 0x0A Comprehensive Notes
Digital Forensics & Incident Response (DFIR)
Definition: Field within cybersecurity dedicated to the identification, investigation, containment, eradication and remediation of cyber-attacks.
Integrates both Digital Forensics (evidence acquisition & interpretation) and Incident Response (operational reaction to an active threat).
Drivers for growth:
Escalating volume, sophistication and automation of cyber-attacks.
Proliferation of heterogeneous endpoints (servers, workstations, IoT, cloud VMs, mobile, etc.).
Organisational structure:
CIRT / CSIRT (Cyber Incident Response Team / Computer Security Incident Response Team) → multidisciplinary team that triages, responds and coordinates recovery.
Relies heavily on forensic artefacts to make evidence-based decisions.
Typical stakeholders: SOC analysts, threat hunters, legal/HR, PR, executive management.
NIST SP 800-86 Forensic Process Phases
Collection – acquire the data (live memory, disk, logs, network traffic) while maintaining integrity.
Examination – forensically process raw data (filtering, decrypting, carving, normalising formats).
Analysis – draw conclusions, correlate artefacts, recreate timelines, attribute activity.
Reporting – document methods, findings, chain-of-custody and recommended actions.
(Phases repeat iteratively as new leads emerge.)
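Integrity during Collection is commonly demonstrated by hashing the evidence at acquisition and re-hashing it before later phases. A minimal sketch with Python's standard hashlib follows; the function names and the chunk size are illustrative choices, not something SP 800-86 prescribes.

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_unchanged(path, recorded_digest):
    """True if the file still matches the digest recorded at acquisition time."""
    return sha256_of_file(path) == recorded_digest.lower()
```

Recording the digest in the chain-of-custody documentation lets any later examiner repeat the check on their working copy.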
Forensic Areas of Practice
File-System Forensics
Memory Forensics
Malware Analysis
Network Forensics
Mobile & IoT Forensics
Cloud Forensics (multi-tenant, API-driven evidence)
Log Analysis (system, application, authentication, audit, security devices)
Digital forensics is much more than just hard-drive analysis; any digital substrate that produces artefacts can be a target.
Logs Commonly Leveraged
System logs – kernel/OS events, crashes, reboots.
Application logs – user interactions, error traces.
Security-device logs – firewalls, IDS/IPS, EDR sensors.
Authentication logs – login attempts (successes and failures), MFA status.
Network-device logs – router/switch flow data, configuration events.
Audit logs – privileged actions, policy changes, compliance checkpoints.
File-Level Forensic Toolkit Quick-Reference
libmagic / file → signature-based format identification (“magic bytes”).
file screenshot.png ⇒ returns PNG image data, 1920×1080, 8-bit/color …
Carving with dd – extract embedded objects
dd if=container.xxx of=payload.xxx bs=1 skip=<offset> count=<len>
strings – pulls ASCII/Unicode sequences:
strings -o screenshot.png
hexdump / xxd – binary & hex inspection; useful for manual header checks.
exiftool – rich metadata (EXIF, IPTC, XMP) for images, docs, video, etc.
Ethical note: ensure you respect privacy/SOC policies; metadata may reveal PII.
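The signature check and the dd-style carve from this quick-reference can be sketched in a few lines of Python. This is an illustration only: the signature table is a tiny hand-picked subset (libmagic knows far more), the function names are hypothetical, and searching for the IEND trailer is a simple heuristic that can misfire if those bytes happen to occur inside other data.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
PNG_TRAILER = b"IEND\xaeB`\x82"  # IEND chunk type followed by its fixed CRC

SIGNATURES = {
    PNG_SIGNATURE: "PNG image",
    b"\xff\xd8\xff": "JPEG image",
    b"%PDF-": "PDF document",
    b"PK\x03\x04": "ZIP archive",
}

def identify(data):
    """Return a format name if the buffer starts with a known signature, else None."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None

def carve(data, offset, length):
    """Same effect as: dd bs=1 skip=<offset> count=<length>."""
    return data[offset:offset + length]

def carve_first_png(data):
    """Locate the first embedded PNG; return (offset, bytes) or None if absent."""
    start = data.find(PNG_SIGNATURE)
    if start == -1:
        return None
    end = data.find(PNG_TRAILER, start)
    if end == -1:
        return None
    length = end + len(PNG_TRAILER) - start
    return start, carve(data, start, length)
```

The returned offset and length are the values that would go into skip= and count= when repeating the carve with dd.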
Network Forensics
Packet Trace (PCAP)
Captured via tcpdump, Wireshark, or hardware taps.
Contains full payloads → can reconstruct sessions, files, VoIP calls.
Network Logs
High-level events (src/dst, port, proto) but no payload.
Complementary to PCAP for long-term retention.
Packet Capture Techniques
Network Tap – passive inline optical/electrical splitter; zero packet loss, transparent.
Port Mirroring / SPAN – switch clones selected traffic to a monitor port.
Wireless Sniffing – monitor mode interface captures 802.11 frames.
Analysis Workflow
Import PCAP into Wireshark; apply protocol & display filters.
Triage (e.g., find HTTP POSTs, unusual DNS, malformed TLS handshakes).
Reassemble streams, export objects, carve malicious binaries.
Correlate timestamps with host logs for attribution.
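For the timestamp-correlation step it helps to see what a capture file holds. The sketch below walks the classic pcap layout (24-byte global header, then a 16-byte header per record) using only the standard library. It assumes the microsecond-resolution classic format; pcapng and nanosecond captures are not handled, and the function name is hypothetical.

```python
import struct

def iter_pcap_records(data):
    """Yield (timestamp_seconds, captured_bytes) for each record in a classic pcap buffer."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"   # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"   # file written on a big-endian host
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    offset = 24  # skip the fixed-size global header
    while offset + 16 <= len(data):
        # record header: ts_sec, ts_usec, captured length, original length
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(endian + "IIII", data, offset)
        offset += 16
        packet = data[offset:offset + incl_len]
        offset += incl_len
        yield ts_sec + ts_usec / 1_000_000, packet
```

The yielded timestamps are Unix epoch seconds, so they can be lined up directly against host log entries once those are converted to the same time base.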
Steganography & Steganalysis
Steganography = art/science of hiding data within innocuous carriers so that the existence of the message is concealed.
Not equivalent to encryption (crypto scrambles content but admits its presence).
Not equivalent to watermarking (fingerprinting) where an external index describes the file.
Motivations & Threat Landscape
Intellectual-property protection (embed author ID, anti-piracy codes).
Covert malware transport (payload inside a harmless JPEG in a spear-phish).
Data exfiltration from locked-down networks (post images w/ hidden corporate IP).
Techniques
Text Carriers
Line-shift coding: each text line is moved up or down slightly; the direction of the shift encodes the hidden bits.
Word-spacing: alter inter-word gaps to encode bits.
Character micro-changes: tiny glyph perturbations (PDF, PostScript) invisible to reader.
Image Carriers
Spatial-domain (LSB) – overwrite the least-significant bit of pixel channel values with message bits; the change looks like imperceptible noise.
Colour-plane separation – embed across RGB components.
Frequency-domain (DCT/FFT) – inject bits into high-frequency coefficients (robust against cropping, but beware JPEG compression attacking same coefficients).
Network Steganography
Embed payloads in sequence numbers, timing gaps, or uncommon header fields.
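The spatial-domain LSB technique listed under Image Carriers can be sketched on a flat buffer of channel values. This is a minimal illustration that assumes the pixels are already decoded to one byte per channel; real tools also handle image formats, a length header and often encryption. The function names are hypothetical.

```python
def embed_lsb(pixels, message):
    """Return a copy of `pixels` with the bits of `message` written into the LSBs."""
    # Expand the message into bits, most-significant bit of each byte first
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("carrier too small for message")
    stego = bytearray(pixels)
    for index, bit in enumerate(bits):
        stego[index] = (stego[index] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return stego

def extract_lsb(pixels, length):
    """Read `length` bytes back out of the least-significant bits."""
    out = bytearray()
    for i in range(length):
        value = 0
        for sample in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)
```

Each carrier byte changes by at most 1, which is why the modification is invisible to the eye yet detectable statistically.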
Steganalysis
Goal: detect & extract covert data.
Methods:
Compare suspected file vs. known-good baseline (size, entropy, colour histograms).
Statistical tests (chi-square on LSB, RS analysis).
Visual inspection for artefacts (misaligned blocks, resolution loss).
Machine-learning classifiers on large corpora of “clean” vs “stego” samples.
Ethical/practical implication: false positives can be high → corroborate with additional evidence before attribution.
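The chi-square test on LSBs mentioned under Methods rests on one observation: LSB replacement tends to equalise the counts of each value pair (2k, 2k+1). The sketch below computes only the statistic, in a simplified form of the pairs-of-values test; converting it to a probability needs the chi-square distribution, and any decision threshold is an assumption the analyst must calibrate.

```python
from collections import Counter

def lsb_pair_chi_square(samples):
    """Chi-square statistic over value pairs (2k, 2k+1); small values suggest LSB replacement."""
    counts = Counter(samples)
    statistic = 0.0
    for even in range(0, 256, 2):
        pair_total = counts[even] + counts[even + 1]
        if pair_total == 0:
            continue  # pair never occurs, contributes nothing
        expected = pair_total / 2  # under full LSB replacement both values are equally likely
        statistic += (counts[even] - expected) ** 2 / expected
    return statistic
```

A low statistic on its own is weak evidence, which matches the false-positive warning above: some clean images have naturally balanced pairs.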
Reverse Engineering (RE) & Malware Analysis
Definition: Deconstruct a physical/software artefact to learn its design, behaviour, vulnerabilities, or to enable interoperability.
Core use-cases: vulnerability discovery, malware triage, patch diffing, legacy system maintenance, audit of closed-source products.
Legal Considerations (Australia)
Permitted when performed for:
Interoperability
Error correction
Security testing / research (malware, vuln analysis)
Prohibited when intent is:
Selling a competing clone, cracking copy protection, distributing licence bypasses.
Obtaining unauthorised access to computers.
(Always consult counsel; laws vary by jurisdiction.)
Compilation Pipeline (C/C++)
Stages: preprocessing → compilation → assembly → linking.
Static libraries may be merged during linking.
Understanding each stage helps map binary artefacts back to source constructs.
Learning RE ≈ Learning a New Language
Vocabulary – mnemonics (mov, cmp, jmp).
Grammar – addressing modes, calling conventions (ABI).
Idioms/patterns – compiler optimisations, prologue/epilogue forms.
Toolchain dialects – GCC vs MSVC, optimisation levels (e.g. GCC -O0 to -O3), inline-function merging.
x86 (32-bit) Architecture Refresher
Special registers:
EIP – instruction pointer.
ESP – stack pointer (top of current stack frame).
EBP – base pointer (frame pointer; usually set from ESP at function entry).
Stack layout (high addresses ↓ low): arguments → return address → old EBP → locals.
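The stack layout above can be made concrete with a small simulation. The sketch assumes a cdecl-style call (caller pushes arguments right to left) and a conventional frame-pointer prologue; the starting stack address and the function name are hypothetical.

```python
def build_frame(args, return_address, old_ebp, local_count, esp=0x1000):
    """Simulate a 32-bit call on a downward-growing stack; returns (memory, ebp, esp)."""
    memory = {}

    def push(value):
        nonlocal esp
        esp -= 4              # the stack grows towards lower addresses
        memory[esp] = value

    for arg in reversed(args):   # caller pushes arguments right to left
        push(arg)
    push(return_address)         # pushed by `call`
    push(old_ebp)                # `push ebp` in the callee prologue
    ebp = esp                    # `mov ebp, esp`
    esp -= 4 * local_count       # `sub esp, N` reserves space for locals
    return memory, ebp, esp
```

With this frame, [ebp+8] is the first argument, [ebp+4] the return address, [ebp] the saved frame pointer, and locals sit at negative offsets such as [ebp-4].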
Intel vs AT&T Syntax
Intel: mov eax, 0xCA (dest, src); [ebp+0x8] dereferences.
AT&T: movl $0xCA, %eax (src, dest); -0x8(%ebp) dereferences.
Course uses Intel.
Common Instruction Categories
Arithmetic: add, sub, mul, div (quotient in EAX, remainder in EDX).
Data movement: mov, lea (load effective address).
Control flow: call, ret, conditional jumps je, jl, etc.
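Two of these instructions regularly confuse newcomers: div, which takes its dividend from the EDX:EAX register pair, and lea, which does address arithmetic without touching memory. A hedged emulation of the 32-bit unsigned forms follows; the Python function names are illustrative only.

```python
MASK32 = 0xFFFFFFFF

def div32(edx, eax, divisor):
    """Emulate unsigned `div r/m32`: divide EDX:EAX, return (quotient -> EAX, remainder -> EDX)."""
    if divisor == 0:
        raise ZeroDivisionError("x86 raises a divide error on division by zero")
    dividend = (edx << 32) | eax           # 64-bit dividend spans both registers
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK32:
        raise OverflowError("x86 raises a divide error when the quotient does not fit in EAX")
    return quotient, remainder

def lea(base, index=0, scale=1, displacement=0):
    """Emulate `lea reg, [base + index*scale + disp]`: computes the address, reads no memory."""
    return (base + index * scale + displacement) & MASK32
```

For example, div32(0, 17, 5) returns (3, 2), and lea(0x1000, 3, 4, 8) returns 0x1014, which is why compilers often use lea as a cheap multiply-and-add.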
Example Walk-Through
C snippet:
#include <stdio.h>

int main(void) {
    int year = 2019;
    printf("hello csf %d\n", year);
    return 0;
}
→ Compiled assembly includes:
Prologue aligning stack to 16-byte boundary (and esp, 0xfffffff0).
Local variable allocation (sub esp, 0x14).
Value assignments via mov/add.
Epilogue (leave, ret).
Demonstrated in slides with additional conditional if(eax < a) & printf.
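The arithmetic behind the prologue is easy to verify by hand. The sketch below applies the same mask and subtraction to a made-up starting stack pointer; the value 0xBFFFF6BC is a hypothetical example, not taken from the slides.

```python
def align_down_16(esp):
    """Effect of `and esp, 0xfffffff0`: clear the low four bits, rounding down to a multiple of 16."""
    return esp & 0xFFFFFFF0

def prologue(esp, local_bytes=0x14):
    """Apply the alignment, then `sub esp, 0x14`; returns the new stack pointer."""
    return (align_down_16(esp) - local_bytes) & 0xFFFFFFFF

# Hypothetical stack pointer on entry to main
entry_esp = 0xBFFFF6BC
print(hex(align_down_16(entry_esp)))  # 0xbffff6b0
print(hex(prologue(entry_esp)))       # 0xbffff69c
```

Masking can only lower the address, so alignment never overwrites data the caller already placed on the stack.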
RE Without Source
Use disassemblers and decompilers to transform binaries back into human-readable forms.
Disassembler ⇒ assembly.
Decompiler ⇒ high-level C-like pseudo-code.
Popular tools: IDA Pro, Binary Ninja, Ghidra, HIEW, HT-Editor.
Static vs Dynamic Analysis
Static: inspect code without execution.
Pros: safer; covers all paths; works offline.
Techniques: string extraction, control-flow graph mapping, signature matching.
Dynamic: run program (sandbox, emulator, debugger) and monitor behaviour.
Pros: reveals unpacking, runtime decryption, environment checks; quick IOC extraction.
Must handle anti-debugging, VM detection.
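String extraction, the first static technique listed, is simple enough to sketch directly. The version below finds runs of printable ASCII in a byte buffer, roughly in the spirit of the strings utility. It is an approximation: it ignores Unicode encodings, and the default minimum length of 4 is a choice made for this example.

```python
import re

def extract_strings(data, min_length=4):
    """Return (offset, text) for each run of printable ASCII at least `min_length` bytes long."""
    # 0x20-0x7e covers space through tilde, the printable ASCII range
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]
```

Run against a malware sample, the output often surfaces URLs, file paths and format strings that make quick indicators of compromise without executing anything.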
Ghidra Highlights
Open-source NSA tool (written mainly in Java).
Runs on Windows, macOS, Linux.
Static-only when first released; Ghidra 10.0 and later also include a debugger.
Key UI components (slides):
Program Tree, Symbol Tree, Data Type Manager, Listing (disassembly), Decompiler View, CFG (Control Flow Graph).
Supports scripting (Jython/Java) for automation.
Ethical note: Always analyse malware in isolated labs; respect software EULAs.
CTF vs Real-World Forensics Perspective
CTF tasks emphasise focused puzzles:
File-format quirks, stego, memory images, single PCAP, sharp flags.
Production forensics emphasises contextual evidence:
Chain-of-custody, timeline reconstruction, insider-threat behaviour, metadata correlation.
Must handle volume, incomplete data, legal admissibility.