Meeting Notes on Password Cracking Benchmarks

Discussion on Statistical Terminology and Graph Analysis

  • Professor Vareen agreed that limited conclusions can be drawn from the graphs.
  • The graphs show data within certain intervals.
    • With filters applied, values fall between 300 and 400.
    • With 1500 Shari, values reach up to 4000.
  • Hardware acceleration on the CPU might reduce the GPU's relative advantage.
  • The CPU performs strongly.

Pseudo Time Index and Experiment Independence

  • Pseudo time index refers to the order of the experiment.
  • Each experiment is independent.
  • Ideally, with consistent parameters, a flat line is expected.
  • Variations exist despite the expectation of equal parameters.
  • Password changes are a separate consideration.
  • In an ideal scenario with proper setup and uniform password extraction, the result should be flat.
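The "flat line" expectation above can be quantified. A minimal sketch (the hash rates are hypothetical, not measured data) that expresses run-to-run spread as a coefficient of variation, where a perfectly flat benchmark series would score 0:

```python
import statistics

def relative_spread(speeds):
    """Coefficient of variation (stdev / mean) of a benchmark series.

    A perfectly flat line across the pseudo time index gives 0.0;
    larger values mean run-to-run variation despite equal parameters.
    """
    return statistics.stdev(speeds) / statistics.mean(speeds)

# Hypothetical hash rates (MH/s) from five repeated, independent runs;
# the outlier at 398 stands in for the unexplained peaks discussed here.
runs = [310.0, 305.0, 398.0, 312.0, 308.0]
print(f"relative spread: {relative_spread(runs):.3f}")
```

Comparing this number across algorithms and tools would make "more stable" or "more oscillation" claims concrete.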

Analysis of MD5 and Core Usage

  • MD5 shows little oscillation.
  • Core usage on CPU was examined.
  • Inconsistencies in behavior observed.
  • GPU usage is divided between the HIP and OpenCL backends.
  • Differences may stem from how the code was optimized by whoever wrote it.
  • Workload profile 4 on MD5 seems better with HIP, possibly due to its more native GPU integration, but results are inconsistent.

Variance Calculation and Graph Stability

  • Varying core numbers previously showed unusual behavior.
  • The increase reaches up to 1500, with others nearing 1000, indicating rough alignment.
  • Hashcat appears more stable, even beyond the interpolated range.

Consultation with Cristiano and Parameter Control

  • Cristiano was shown some graphs initially to assess the approach's validity.
  • Without precise knowledge of benchmark parameters and passwords, interpreting peaks is challenging.
  • Peaks may result from password variations affecting memory loading and cache times.
  • Experiments need tighter controls for clearer explanations of peak occurrences.

Password Handling and Brute Force Approaches

  • Ensure consistent passwords, possibly through brute force using a password list.
  • Two approaches: using masks and using rules.
  • Alternative: benchmark with 100 MD5 passwords over a set time to observe the behavior; control password characteristics (e.g., length between 8 and 12 characters).
  • Hashcat allows selecting a random subset from a dictionary, but masks still need to be defined.
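As a sketch of what a mask commits you to, the snippet below computes the keyspace of a hashcat-style mask (using hashcat's built-in charset sizes: ?l/?u 26, ?d 10, ?s 33, ?a 95) and shows one way to keep the sampled password subset identical across benchmark repetitions. The fixed-seed helper is a hypothetical control script, not a hashcat flag:

```python
import math
import random

# Hashcat built-in mask character classes and their charset sizes
# (?l lower, ?u upper, ?d digit, ?s specials, ?a = all printable ASCII).
CHARSET_SIZE = {"l": 26, "u": 26, "d": 10, "s": 33, "a": 95}

def mask_keyspace(mask: str) -> int:
    """Number of candidates a mask such as '?l?l?l?l?l?l?l?l' generates."""
    return math.prod(CHARSET_SIZE[c] for c in mask.split("?")[1:])

def fixed_sample(wordlist, n, seed=0):
    """Fixed-seed random subset, so every benchmark repetition sees the
    same candidate passwords (hypothetical control, not a hashcat feature)."""
    return random.Random(seed).sample(wordlist, n)

print(mask_keyspace("?l" * 8))  # 26**8 candidates for 8 lowercase characters
```

Controlling both the mask keyspace and the sampled subset would remove two of the uncontrolled variables discussed above.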

Defining Research Questions and Variables

  • Determine what the measurements aim to demonstrate.
  • Consider altering length, complexity, or specific profiles.
  • Current variables: algorithm, benchmark speed, and benchmark order.
  • A model with only speed as a variable is limiting.
  • Define research questions precisely.
  • Example: Estimate the advantage of a GPU attacker over a CPU attacker for various algorithms.
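The example question above reduces to ratios of benchmark speeds per algorithm. A minimal sketch with entirely hypothetical numbers (the real figures would come from the hashcat/John runs under discussion):

```python
# Hypothetical per-algorithm benchmark speeds; units may differ per
# algorithm, but the GPU/CPU ratio is unit-free.
cpu_speed = {"MD5": 1200.0, "SHA-256": 450.0, "bcrypt": 100.0}
gpu_speed = {"MD5": 60000.0, "SHA-256": 9000.0, "bcrypt": 800.0}

# GPU attacker's advantage over a CPU attacker, per algorithm.
gpu_advantage = {alg: gpu_speed[alg] / cpu_speed[alg] for alg in cpu_speed}

for alg, ratio in sorted(gpu_advantage.items(), key=lambda kv: -kv[1]):
    print(f"{alg:8s} GPU advantage: {ratio:.0f}x")
```

Framing the research question as "estimate this ratio for each algorithm" turns the loose variable list into a single measurable quantity.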

Parameter Impact and Experimental Design

  • Password length may not necessarily impact all algorithms: hash functions process fixed-size blocks, so short passwords of different lengths still fit within a single block.
  • Hypothesis: Increasing parameters should reduce GPU advantage.
  • Systematically design experiments.

Splitting Research Questions

  • Divide research into two parts:
    • CPU vs. GPU comparison with the same algorithm.
    • Identifying the most secure algorithm for different situations.
  • Compare algorithms with standard parameters on Linux.

Linux Algorithms and Attacker Perspective

  • MD5 and SHA are used within more complex constructions such as md5crypt or sha512crypt, which apply many iterations.
  • These iterations likely slow down CPU and GPU performance by a similar factor.
  • Focus on the attacker's perspective: which tools and hardware are most beneficial.
  • If John the Ripper performs better on CPU than Hashcat, an attacker might opt for multiple CPUs.
  • Implementation specifics are less relevant from this viewpoint.
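The iteration argument above can be sketched numerically: if md5crypt applies on the order of 1000 MD5 rounds per guess, both platforms' raw rates divide by the same factor, leaving the GPU-vs-CPU ratio untouched. The rates below are hypothetical:

```python
# md5crypt applies roughly this many MD5 rounds per password guess.
ITERATIONS = 1000

raw_cpu = 1_200_000_000    # plain MD5, H/s (hypothetical)
raw_gpu = 60_000_000_000   # plain MD5, H/s (hypothetical)

# Iterated construction: both raw rates shrink by the same factor.
crypt_cpu = raw_cpu / ITERATIONS
crypt_gpu = raw_gpu / ITERATIONS

print(raw_gpu / raw_cpu)      # GPU advantage on plain MD5
print(crypt_gpu / crypt_cpu)  # identical advantage on the iterated hash
```

This is why iteration counts alone do not change which hardware an attacker prefers; memory hardness, discussed later, is what breaks the symmetry.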

Tool Comparison and Ethical Considerations

  • Comparing tools like John the Ripper and Hashcat may be more relevant from an attacker's perspective than comparing benchmark implementations.
  • Try cracking the same password list with different tools to compare their effectiveness.
  • Ethical question: Should all research material be made available, considering potential misuse?
  • In previous work, passwords were not made fully available for ethical reasons.

Core Scaling and Memory Usage

  • Moving from one core to two yields nearly double (200%) the performance with Hashcat and MD5.
  • Performance reaches nearly 12 times the baseline with 16 cores, in line with expectations.
  • 16 cores consistently perform below 15 cores, suggesting memory-sharing effects.
  • Memory usage: 64K of memory is allocated; accesses within this space depend on the password and are hard for an attacker to predict.
  • True memory-hard algorithms use significantly more memory.
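The core-scaling observations above can be summarized as parallel efficiency. A minimal sketch with hypothetical hash rates shaped like the measurements described (near-perfect doubling at 2 cores, roughly 12x at 16, and 16 cores slightly below 15):

```python
def parallel_efficiency(base_speed, speed, cores):
    """Per-core speedup relative to the single-core baseline."""
    return (speed / base_speed) / cores

# Hypothetical MD5 hash rates (MH/s) at increasing core counts.
rates = {1: 100.0, 2: 198.0, 8: 760.0, 15: 1230.0, 16: 1200.0}

for cores, speed in rates.items():
    eff = parallel_efficiency(rates[1], speed, cores)
    print(f"{cores:2d} cores: {speed / rates[1]:5.2f}x speedup, {eff:.0%} efficiency")
```

A drop in efficiency from 15 to 16 cores, as in these made-up figures, would be consistent with the memory-sharing explanation rather than a measurement fluke.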

Tool Optimization and Comparative Analysis

  • Tools may be optimized for specific hardware (CPU vs. GPU).
  • Consider the need for Perl and shell scripting for cracking, given that custom scripts might be required.
  • Comparing memory-hard vs. non-memory-hard algorithms within the same tool (e.g., John the Ripper) gives a fairer comparison.
  • Caution against comparing results obtained with fundamentally different tools and techniques.

John the Ripper Peculiarities and Parameter Randomization

  • John the Ripper shows more oscillation.
  • Unclear why some core counts perform better than higher core counts.
  • Doubts persist regarding password selection and parameter changes during benchmarking.
  • Potential parameter randomization might explain certain peaks.
  • Tests with fixed parameters are needed to check for consistency.
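Once parameters are pinned down, the suspicious peaks can be flagged mechanically rather than by eye. A minimal sketch, assuming peaks appear as outliers from the series median:

```python
import statistics

def flag_peaks(series, threshold=2.0):
    """Indices whose value deviates from the median by more than
    `threshold` sample standard deviations: candidate peaks to
    investigate once benchmark parameters are fixed."""
    med = statistics.median(series)
    sd = statistics.stdev(series)
    return [i for i, v in enumerate(series) if abs(v - med) > threshold * sd]

# Hypothetical benchmark series with one suspicious spike at index 4.
print(flag_peaks([100.0, 101.0, 99.0, 100.0, 150.0, 100.0]))
```

Applying the same detector before and after fixing passwords and parameters would show directly whether the peaks come from the experimental setup or from the tools themselves.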

Experiment Stabilization and Controlled Conditions

  • Establish controlled experimental conditions.
  • Prioritize understanding and stabilizing tool behavior.
  • Aim for similar conditions across tests for coherent results.