Tor Network and Encryption Tools Lecture

Introduction to the Tor Network and Onion Routing

The Tor Network Definition: A service designed to provide anonymous web browsing by masking online identities and activities from attackers.
Onion Routing Premise: The core mechanism of the Tor network, which layers encryption around data to ensure privacy.
The Research Challenge: Determining if the service is truly $100\%$ secure or if its fundamental design contains inherent, unpatchable weaknesses.
Network Architecture: * Relays: Machines operated by volunteers who contribute bandwidth. * Circuit Construction: A path through the network for encrypted data to travel from a user to a server. * Encryption Layers: Data is encrypted in layers; the number of layers typically corresponds to the number of relays the data passes through, which is usually $3$. * Onion Client: Local software required to access the network.
Types of Relays in a Standard Circuit: 1. Entry Relay: Knows the user’s IP address but not the destination. 2. Middle Relay: Knows neither the data nor the final destination; it only knows the preceding and succeeding relays in the circuit. 3. Exit Relay: Knows the destination address of the server but not the user’s IP address.
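Layered Encryption Sketch: The layering described above can be illustrated with a minimal Python sketch. It is a toy, not Tor's actual cell format or cipher: the keystream built from SHA-256 and the names `keystream`, `xor_layer`, `wrap`, and `relay_peel` are assumptions made for illustration. Because each layer here is a plain XOR, the layers happen to commute; the ordering in `wrap` only mirrors the conceptual picture of the entry layer being outermost.

```python
import hashlib
import os


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer; XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(payload: bytes, hop_keys: list) -> bytes:
    """Client side: apply the exit layer first and the entry layer last."""
    cell = payload
    for key in reversed(hop_keys):
        cell = xor_layer(key, cell)
    return cell


def relay_peel(cell: bytes, key: bytes) -> bytes:
    """Relay side: remove exactly one layer with this relay's key."""
    return xor_layer(key, cell)


# One key per hop: entry, middle, exit (hypothetical session keys).
hop_keys = [os.urandom(32) for _ in range(3)]
cell = wrap(b"GET / HTTP/1.1", hop_keys)
for key in hop_keys:  # the cell passes entry, then middle, then exit
    cell = relay_peel(cell, key)
print(cell)  # the payload is readable only after all three layers are removed
```

Each relay holds only its own key, so no single relay can remove all three layers on its own.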
The Directory Server: Publishes the list of available relays; the client uses this list to select the relays for its path.

Vulnerabilities and Attacks on Tor Infrastructure

Traffic Correlation Attacks: * Mechanism: An attacker observes both ends of a circuit simultaneously, matching traffic flows based on timing and volume (e.g., matching a specific burst of data entering and exiting). * Evolution: While historically deemed impractical due to millions of daily users, modern research employs statistical tools and deep learning to optimize matches.
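Correlation Sketch: The timing-and-volume matching described above can be sketched as follows. This is a minimal statistical illustration, not the method of any specific study; the flow data, the one-second window, and the names `bin_counts`, `pearson`, and `best_match` are hypothetical.

```python
import math


def bin_counts(timestamps, window, duration):
    """Count packets per time window of `window` seconds."""
    bins = [0] * math.ceil(duration / window)
    for t in timestamps:
        if 0 <= t < duration:
            bins[int(t // window)] += 1
    return bins


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def best_match(entry_flow, exit_flows, window=1.0, duration=10.0):
    """Return the exit flow whose volume pattern best matches the entry flow."""
    entry_bins = bin_counts(entry_flow, window, duration)
    scores = {
        name: pearson(entry_bins, bin_counts(flow, window, duration))
        for name, flow in exit_flows.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical packet timestamps (seconds) seen at the two ends.
entry = [1.0, 1.1, 1.2, 1.3, 6.0, 6.1, 6.2]
exits = {
    "A": [t + 0.2 for t in entry],              # same bursts, slightly delayed
    "B": [0.5, 2.5, 3.5, 4.5, 7.5, 8.5, 9.5],   # unrelated steady traffic
}
match, scores = best_match(entry, exits)
print(match, scores)
```

Real traffic is far noisier than this example, which is why the later studies in these notes move from simple statistics to learned models.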
Infrastructure Manipulation (2015 Study): * Attackers can manipulate the underlying physical infrastructure of the internet to insert themselves into the data path. * Allows observation of traffic without touching a single Tor relay. * Detection: Tor cannot detect or prevent this because it assumes the routing infrastructure is secure.
DeepCorr and Deep Learning (2018 Study): * DeepCorr: A deep learning model trained on real Tor traffic to identify matched flows. * Capabilities: Recognizes specific behaviors, congestion patterns, and packet timing signatures. * Performance: Achieved $96\%$ accuracy compared to only $4\%$ accuracy using earlier statistical methods.
Active Relay Blocking (2022 Study): * Method: An attacker actively forces victims into controlled relays by selectively blocking legitimate ones. * Results: $90\%$ of all circuits were compromised in tests; users were identified with $100\%$ accuracy in under $40\,\text{seconds}$. * Bypassing Guarantees: These attacks can bypass "guard rotation," Tor's protection mechanism that cycles entry relays every few months.

Malicious Actors and the Sybil Attack Threat

Definition of Sybil Attack: A scenario where one entity operates multiple compromised or malicious relays to gain a larger share of network traffic.
Network Analysis (2016 Study): Analyzed $9\,\text{years}$ of archive data (more than $100\,\text{gigabytes}$) to document malicious relay behavior.
The Sybilhunter Tool: A detection tool built to find coordinated relay groups. * Case Study: Found a group of exit relays silently replacing Bitcoin addresses in web traffic to redirect transactions to attacker-controlled wallets.
The KAX17 Case Study (2021): * A single entity operated undetected for $4\,\text{years}$. * At its peak, it controlled over $900$ relays across $50$ networks. * Impact: A user had a $10\%$ chance of passing through its guard relays and a $24\%$ chance of passing through its middle relays.
Low Economic Barriers (2025 Study): * The top $10$ bandwidth contributors control over half of all exit traffic. * Operating just $300$ relays across two perceived identities could compromise nearly $1\%$ of all circuits (tens of thousands of circuits). * Cost: Approximately $\$400$ per month in cloud hosting allows an attacker to observe over $13\%$ of exit traffic.
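Traffic Share Arithmetic: The link between relay bandwidth and the chance of being selected can be sketched with simple arithmetic. This is a simplified model under stated assumptions: relays are picked in proportion to bandwidth, and guard and exit choices are independent. The bandwidth figures and the exit share below are hypothetical inputs, not results from the studies above; only the $10\%$ guard figure comes from these notes.

```python
def selection_share(adversary_bandwidth: float, honest_bandwidth: float) -> float:
    """Fraction of selections that land on the adversary, assuming
    relays are chosen in proportion to their bandwidth."""
    total = adversary_bandwidth + honest_bandwidth
    return adversary_bandwidth / total if total else 0.0


def both_ends_probability(guard_share: float, exit_share: float) -> float:
    """Chance that one circuit has an adversary relay at both the entry
    and the exit, assuming the two choices are independent."""
    return guard_share * exit_share


# Hypothetical bandwidth units: adversary adds 50 to an honest pool of 950.
print(selection_share(50, 950))

# 0.10 is the guard figure quoted in these notes; 0.05 is a made-up exit share.
print(both_ends_probability(0.10, 0.05))
```

Controlling both ends of a circuit is the position needed for the traffic correlation attacks discussed earlier, which is why even single-digit shares matter.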
Current Countermeasures: * Limiting relays per IP address. * Minimum uptime requirements. * "Trusted flags" for relays. * Avoiding relays from the same network block in a single path.
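Network Block Check Sketch: The last countermeasure, avoiding relays from the same network block in a single path, can be sketched with Python's standard `ipaddress` module. The /16 prefix length is an assumption for IPv4, and the function names and addresses are hypothetical.

```python
import ipaddress


def same_block(ip_a: str, ip_b: str, prefix: int = 16) -> bool:
    """True if both addresses fall inside the same /prefix network."""
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network


def path_is_diverse(path_ips, prefix: int = 16) -> bool:
    """True if no two relays in the path share a network block."""
    for i, first in enumerate(path_ips):
        for second in path_ips[i + 1:]:
            if same_block(first, second, prefix):
                return False
    return True


# Documentation-range addresses used as stand-ins for relay IPs.
print(path_is_diverse(["192.0.2.10", "198.51.100.7", "203.0.113.5"]))
print(path_is_diverse(["192.0.2.10", "192.0.77.1", "203.0.113.5"]))
```

A check like this raises the cost of a Sybil attack only slightly, since an attacker can rent servers across many different networks.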

Onion Services and Server-Side Anonymity

Onion Services: Websites and services existing entirely within Tor with no public IP address.
Predictability Vulnerabilities (2013 Study): * Onion services register with specific relays based on publicly known information (such as the current time period). * Attackers can mathematically calculate and place their own relays in those positions in advance.
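Predictable Position Sketch: The idea that relay positions can be calculated in advance can be illustrated with a simplified hash ring. This is not the formula from the Tor specification: the hash input, the ring layout, and the names `descriptor_position` and `responsible_relays` are assumptions chosen to show the principle that public inputs yield a predictable position.

```python
import hashlib
from bisect import bisect_right


def descriptor_position(service_id: str, time_period: int) -> int:
    """Ring position derived only from public inputs (simplified)."""
    digest = hashlib.sha256(f"{service_id}|{time_period}".encode()).digest()
    return int.from_bytes(digest, "big")


def responsible_relays(position: int, ring, count: int = 3):
    """Return the `count` relays that follow `position` on the sorted ring.
    `ring` is a sorted list of (fingerprint_as_int, relay_name) pairs."""
    keys = [fingerprint for fingerprint, _ in ring]
    start = bisect_right(keys, position)
    return [ring[(start + i) % len(ring)][1] for i in range(min(count, len(ring)))]


def fingerprint_of(name: str) -> int:
    """Stand-in for a relay identity fingerprint (hypothetical)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")


honest = [(fingerprint_of(f"relay{i}"), f"relay{i}") for i in range(20)]

# The attacker computes tomorrow's position today, because every input is public.
future_position = descriptor_position("exampleservice", time_period=20001)

# Here the attacker simply picks the next value on the ring; a real attacker
# would have to generate keys until a fingerprint lands just after the position.
attacker = ((future_position + 1) % 2**256, "attacker")
ring = sorted(honest + [attacker])

print(responsible_relays(future_position, ring))
```

In this sketch the attacker's relay is always first in the returned list, so it would receive the service's registration for that period.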
De-anonymization Categories: 1. Traffic Analysis. 2. Exploiting Implementation Vulnerabilities. 3. Bad Operational Security (OpSec).
Carnegie Mellon University (CMU) Attack (2014): * Researchers controlled both directory and entry positions. * Used non-standard signals to link users to the specific services they visited; ran undetected for months.
The Human Factor (Operational Security): * Most legal/illegal marketplaces were compromised due to human error (e.g., using personal emails in forum posts or exposing info in routine site communications). * Response Leakage: Details in server responses can be matched against public internet scanning tools to find the physical server IP.

Fundamentals of Data Encryption Tools

Data Breach Statistics: Mention of $21,000,000$ stolen passwords, $17,000,000$ leaked emails, and $1\,\text{TB}$ of data lost in single attacks.
Encryption Types: 1. File Encryption: Locking individual documents (e.g., financial or legal files). 2. Container Encryption: Creating an encrypted vault or folder within a standard system (e.g., VeraCrypt). 3. Full Disk Encryption (FDE): Locking the entire hard drive; requires a password at boot (e.g., LUKS, DiskCryptor).

Deep Dive into Encryption Software: LUKS, DiskCryptor, VeraCrypt, and AxCrypt

LUKS (Linux Unified Key Setup): * Platform: Standard for Linux distributions (Debian, Ubuntu, Fedora). * Scope: Full disk encryption. * Interface: No Graphical User Interface (GUI); managed entirely via command line. * Target User: System administrators and technical users.
DiskCryptor: * Platform: Windows. * Scope: Full disk or specific partitions. * Algorithms: Supports $\text{AES}$, $\text{Serpent}$, and $\text{Twofish}$. * Interface: Includes a GUI for easier use. * Performance Note: Slightly slower than VeraCrypt in benchmark testing.
VeraCrypt: * Platform: Windows, Linux, and macOS. * Scope: Both full disk and container-based encryption. * Hidden Volumes: Allows for a secret volume inside an encrypted volume. A user can provide the "outer" password if coerced, while the highly sensitive data remains safe in the secret "inner" volume. * Algorithms: $\text{AES}$, $\text{Serpent}$, and $\text{Twofish}$; these can also be layered (cascaded).
AxCrypt: * Platform: Windows focus. * Scope: Individual file encryption. * Features: Auto-encryption (files re-encrypt automatically after editing) and easy right-click integration in the Windows shell. * Standards: Uses $\text{AES-128}$ or $\text{AES-256}$.

Critical Security Risks and Human Factors

Weak Passwords: Encryption strength is irrelevant if the password is easy to guess. Recommendations include at least $15$ characters with symbols and numbers.
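Password Search Space Sketch: The effect of length and character variety can be estimated with a short calculation. This is a rough upper bound that assumes every character is chosen uniformly at random; human-chosen passwords are usually much weaker than the figure suggests. The function name `search_space_bits` is hypothetical.

```python
import math
import string


def search_space_bits(length: int, alphabet_size: int) -> float:
    """Bits of search space for a random password: length * log2(alphabet)."""
    return length * math.log2(alphabet_size)


lowercase = len(string.ascii_lowercase)
full_set = len(string.ascii_letters + string.digits + string.punctuation)

# Short lowercase-only password versus the 15-character recommendation.
print(round(search_space_bits(8, lowercase), 1))
print(round(search_space_bits(15, full_set), 1))
```

Each additional bit doubles the number of guesses a brute-force attacker needs, so the gap between the two results is far larger than the raw numbers suggest.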
Physical Access/Power Status: Encryption keys are held in memory while the machine is on. An unlocked, powered-on machine is vulnerable to physical bypass.
Human Error: Writing passwords on sticky notes, storing them in unencrypted files, or sharing them over email.
Key Size Selection: Choosing $\text{AES-256}$ for sensitive business/legal data over the lower $\text{AES-128}$ standard.

Questions & Discussion

Balance of Problems and Solutions: The moderator advised that research presentations should balance the identification of vulnerabilities with proposed solutions, noting that the Tor presentation was heavy on flaws but light on remediation.
How DeepCorr Works: * Prompt: Can you explain how it works? * Response: It is a deep learning model trained on real Tor traffic from the live network to identify patterns and match both ends of a circuit, effectively de-anonymizing the user and the service they access.
De-anonymization Mechanism in DeepCorr: * Prompt: How does de-anonymization work in DeepCorr? * Response: It matches characteristics of incoming and outgoing data by counting packet volume and tracking the timing of data transmission and receipt.
Relay Node Configuration: * Prompt: Does each node have a specific start or end point? * Response: While users can configure their relays for specific roles, generally any relay can function as an entry, middle, or exit node.
Research Depth on Encryption Tools: * Feedback: The moderator suggested that instead of an overview of many tools, a research presentation should focus on one tool in depth, examining its implementation and identifying specific vulnerabilities an attacker could exploit.
Layered Encryption Complexity: * Prompt: How can you combine different encryptions like $\text{Serpent}$ and $\text{AES}$? * Response: The algorithms are stacked in layers within the tool itself: the output of one cipher is encrypted again by the next, which is intended to provide higher security than a single algorithm.