Meeting Notes on Password Cracking Benchmarks
Discussion on Statistical Terminology and Graph Analysis
- Professor Vareen agreed that limited conclusions can be drawn from the graphs.
- The graphs show the data falling within certain intervals.
- With filters applied, the values lie between 300 and 400.
- With 1500 Shari, the values reach up to 4000.
- Hardware acceleration on the CPU might reduce the increase compared to the GPU.
- The CPU nonetheless performs strongly.
Pseudo Time Index and Experiment Independence
- The pseudo-time index refers to the order in which the experiments were run.
- Each experiment is independent.
- Ideally, with consistent parameters, a flat line is expected.
- Variations exist despite the expectation of equal parameters.
- Password changes are a separate consideration.
- In an ideal scenario with proper setup and uniform password extraction, the result should be flat.
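The "flat line" expectation can be checked numerically: with fixed parameters, repeated benchmark readings should show low relative spread. A minimal sketch using the coefficient of variation (the speed values below are hypothetical, not measured results):

```python
import statistics

def coefficient_of_variation(speeds):
    """Relative spread of repeated benchmark readings (stddev / mean).
    Near zero means the 'flat line' expectation holds."""
    return statistics.stdev(speeds) / statistics.mean(speeds)

# Hypothetical readings (kH/s) from repeated runs with identical parameters.
stable = [1000, 1005, 998, 1002, 1001]    # roughly flat
unstable = [1000, 1500, 700, 1300, 900]   # large oscillations, like the observed peaks

print(coefficient_of_variation(stable))
print(coefficient_of_variation(unstable))
```

A threshold on this ratio gives an objective way to say whether a run was "flat" or oscillating.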
Analysis of MD5 and Core Usage
- MD5 shows little oscillation.
- Core usage on CPU was examined.
- Inconsistencies in behavior observed.
- GPU usage is divided between IPE and OpenCL.
- Differences may stem from the code's optimization by the individual who wrote it.
- Profile 4 on MD5 seems better with IPE, possibly due to its native GPU integration, but results are inconsistent.
Variance Calculation and Graph Stability
- Varying core numbers previously showed unusual behavior.
- The increase reaches up to 1500, with other runs nearing 1000, indicating rough alignment between them.
- I fusch appears more stable even beyond interpolation.
Consultation with Cristiano and Parameter Control
- Cristiano was shown some graphs initially to assess the approach's validity.
- Without precise knowledge of benchmark parameters and passwords, interpreting peaks is challenging.
- Peaks may result from password variations affecting memory loading and cache times.
- Experiments need tighter controls for clearer explanations of peak occurrences.
Password Handling and Brute Force Approaches
- Ensure consistent passwords, possibly by brute-forcing against a fixed password list.
- Two approaches: masks and rules.
- Alternative: benchmark with 100 MD5 passwords over a set time to observe the same behavior; control password characteristics (e.g., length between 8 and 12 characters).
- Hashcat allows selecting a random subset from a dictionary, but masks must still be defined.
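One way to enforce consistent passwords, as discussed, is to generate the test set ourselves: a fixed-seed random subset of a wordlist, filtered to 8-12 characters, hashed with MD5. A sketch (the wordlist contents are placeholders; in practice they would come from a dictionary file):

```python
import hashlib
import random

def build_test_set(wordlist, n=100, min_len=8, max_len=12, seed=42):
    """Pick a reproducible random subset of candidate passwords with
    controlled length, and return (password, MD5 hex digest) pairs."""
    rng = random.Random(seed)  # fixed seed -> the same subset on every run
    eligible = [w for w in wordlist if min_len <= len(w) <= max_len]
    chosen = rng.sample(eligible, min(n, len(eligible)))
    return [(p, hashlib.md5(p.encode()).hexdigest()) for p in chosen]

# Placeholder wordlist for illustration.
words = ["password", "letmein1", "correcthorse", "hunter22", "qwertyuiop", "dragonfly99"]
for pw, h in build_test_set(words, n=3):
    print(len(pw), h)
```

Because the seed is fixed, every benchmark run sees an identical target set, removing password variation as a confounder.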
Defining Research Questions and Variables
- Determine what the measurements aim to demonstrate.
- Consider altering length, complexity, or specific profiles.
- Current variables: algorithm, benchmark speed, and benchmark order.
- A model with only speed as a variable is limiting.
- Define research questions precisely.
- Example: Estimate the advantage of a GPU attacker over a CPU attacker for various algorithms.
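The example research question can be made concrete as a ratio of benchmark speeds per algorithm. A sketch (the speed numbers are placeholders, not measured results; real values would come from the Hashcat/John benchmarks):

```python
def gpu_advantage(cpu_speed, gpu_speed):
    """Speedup factor a GPU attacker enjoys over a CPU attacker
    for one algorithm, given benchmark speeds in the same unit."""
    return gpu_speed / cpu_speed

# Hypothetical speeds in MH/s, per algorithm.
benchmarks = {
    "MD5":         {"cpu": 500.0, "gpu": 25000.0},
    "sha512crypt": {"cpu": 0.05,  "gpu": 0.5},
}
for algo, s in benchmarks.items():
    print(algo, gpu_advantage(s["cpu"], s["gpu"]))
```

Reporting this one number per algorithm answers the question directly, instead of leaving speed as a free-floating variable.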
Parameter Impact and Experimental Design
- Password length may not impact all algorithms, since hash functions process input in constant-size blocks and short passwords fit within a single block.
- Hypothesis: Increasing parameters should reduce GPU advantage.
- Systematically design experiments.
Splitting Research Questions
- Divide the research into two parts:
  - CPU vs. GPU comparison with the same algorithm.
  - Identifying the most secure algorithm for different situations.
- Compare algorithms with standard parameters on Linux.
Linux Algorithms and Attacker Perspective
- MD5 and SHA are used as primitives within more complex algorithms such as md5crypt or the other shadow-file hashes, which involve multiple iterations.
- These iterations likely slow down both CPU and GPU performance similarly.
- Focus on the attacker's perspective: which tools and hardware are most beneficial.
- If John the Ripper performs better on CPU than Hashcat, an attacker might opt for multiple CPUs.
- Implementation specifics are less relevant from this viewpoint.
- Comparing tools like John the Ripper and Hashcat may be more relevant from an attacker's perspective than comparing benchmark implementations.
- Try cracking the same password list with different tools to compare their effectiveness.
- Ethical question: Should all research material be made available, considering potential misuse?
- In previous work, passwords were not made fully available for ethical reasons.
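The iterated-hashing ("stretching") mechanism noted above for the shadow-file schemes can be sketched as repeated hash applications. This is a deliberately simplified, hypothetical construction for illustrating how cost scales with the round count, not the real md5crypt algorithm:

```python
import hashlib

def stretched_md5(password: bytes, salt: bytes, rounds: int = 1000) -> str:
    """Simplified key stretching: feed the digest back through MD5
    `rounds` times. Real schemes (md5crypt and relatives) are more
    involved, but the cost likewise scales linearly with the round
    count on CPU and GPU alike."""
    digest = hashlib.md5(salt + password).digest()
    for _ in range(rounds):
        digest = hashlib.md5(digest + password).digest()
    return digest.hex()

print(stretched_md5(b"secret", b"salt", rounds=1000))
```

Since every round depends on the previous digest, the work cannot be parallelized within one guess, which is why iteration slows both CPU and GPU attackers by roughly the same factor.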
Core Scaling and Memory Usage
- Performance scales from the baseline to nearly double (200%) with Hashcat and MD5.
- Reaching nearly 12 times the performance with 16 cores, aligning with expectations.
- 16 cores consistently perform worse than 15 cores, suggesting an impact from memory sharing.
- Memory usage: 64 KB of memory is allocated; accesses within this space are password-dependent and hard for attackers to predict.
- True memory-hard algorithms use significantly more memory.
- Tools may be optimized for specific hardware (CPU vs. GPU).
- Consider whether Perez and SELLET are needed for cracking, given that custom scripts might be required.
- Comparing memory-hard vs. non-memory-hard algorithms within the same tool (e.g., John the Ripper) gives a fairer comparison.
- Caution against comparing results obtained with fundamentally different tools and techniques.
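The memory-hardness point can be illustrated with `hashlib.scrypt` from the Python standard library: scrypt's working set is roughly 128 * n * r bytes per guess, so the parameters directly control how much memory each evaluation costs an attacker. A sketch with illustrative parameters (not a recommendation):

```python
import hashlib

def scrypt_hash(password: bytes, salt: bytes, n: int = 2**14, r: int = 8, p: int = 1) -> str:
    """Memory-hard hash: each evaluation needs roughly 128 * n * r bytes,
    which is what limits how many guesses fit on a GPU at once.
    maxmem is raised above the working set so OpenSSL does not reject it."""
    return hashlib.scrypt(password, salt=salt, n=n, r=r, p=p,
                          maxmem=256 * n * r, dklen=32).hex()

n, r = 2**14, 8
print("approx memory per guess:", 128 * n * r, "bytes")  # 16 MiB
print(scrypt_hash(b"secret", b"salt", n=n, r=r))
```

Contrast this with MD5's 64-byte block state: a few megabytes per guess, rather than a few dozen bytes, is what makes an algorithm genuinely memory-hard.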
John the Ripper Peculiarities and Parameter Randomization
- John the Ripper shows more oscillation.
- Unclear why some core counts perform better than higher core counts.
- Doubts persist regarding password selection and parameter changes during benchmarking.
- Potential parameter randomization might explain certain peaks.
- Testing is needed to fix parameters and observe consistency.
Experiment Stabilization and Controlled Conditions
- Establish controlled experimental conditions.
- Prioritize understanding and stabilizing tool behavior.
- Aim for similar conditions across tests for coherent results.
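A minimal harness for the controlled conditions described above: pin every parameter, repeat the identical workload several times, and record per-run timings, so that any remaining oscillation can be attributed to the environment rather than to changing inputs. A pure-Python stand-in for a real benchmark run:

```python
import hashlib
import time

def timed_runs(passwords, repeats=5):
    """Hash the identical password list `repeats` times and return the
    wall-clock duration of each run; with fixed inputs, the timings
    should come out close to flat."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        for pw in passwords:
            hashlib.md5(pw.encode()).hexdigest()
        durations.append(time.perf_counter() - start)
    return durations

fixed_list = [f"pass{i:04d}word" for i in range(1000)]  # same list every run
print(timed_runs(fixed_list))
```

The same pattern applies to the real tools: fix the password set, the algorithm, and the core count, then repeat and compare the per-run numbers before varying anything.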