Vocabulary-style flashcards covering the technical specifications, metrics, and components of the custom DistilBERT-based email phishing detection system.
DistilBERT
A distilled (compressed) version of BERT that retains 97% of BERT's language understanding performance while being 40% smaller and 60% faster.
Anti-Phishing Working Group (APWG)
An organization that reported 165,772 phishing attacks in the first quarter of 2020, up from 162,155 in the previous quarter (Q4 2019).
PhishKiller
A tool utilizing featureless machine learning techniques that achieves 98.30% accuracy and can block malicious websites in 81.68 milliseconds.
Deep Neural Network (DNN) Approach [2]
A method for phishing URL detection that achieves accuracy rates of 90% on Ham, 92% on Phishing Corpus, and 89% on Phishload.
Kaggle Email Dataset
A dataset consisting of 82,486 entries with 43,057 (52.20%) phishing emails and 39,429 (47.80%) non-phishing emails.
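The class split on this card can be checked with a few lines of arithmetic. This is only a sanity check on the stated counts; the variable names are illustrative.

```python
# Counts taken from the flashcard above.
phishing = 43_057
legitimate = 39_429
total = phishing + legitimate

# Share of each class as a percentage, rounded to two decimals.
phishing_share = round(100 * phishing / total, 2)
legitimate_share = round(100 * legitimate / total, 2)

print(total, phishing_share, legitimate_share)
```

The two counts sum to the stated total, and the shares round to the percentages on the card, so the dataset is close to balanced.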
Custom Classifier Head
A modification to the standard DistilBERT architecture consisting of a two-layer feedforward network with ReLU activation.
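To make "two-layer feedforward network with ReLU activation" concrete, here is a minimal framework-free sketch of the forward pass. The layer sizes, the weight values, and the function names (`linear`, `relu`, `classifier_head`) are assumptions for illustration; the card does not specify them, and a real system would likely build this with a deep learning framework on top of the DistilBERT embedding.

```python
def relu(values):
    # ReLU keeps positive values and zeroes out negative ones.
    return [max(0.0, v) for v in values]

def linear(x, weights, bias):
    # One dense layer: each output is a weighted sum of the inputs plus a bias.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def classifier_head(x, w1, b1, w2, b2):
    # Layer 1 followed by ReLU, then layer 2 producing one logit per class.
    hidden = relu(linear(x, w1, b1))
    return linear(hidden, w2, b2)

# Toy example: a 2-dimensional "embedding" and hand-picked weights (hypothetical).
logits = classifier_head(
    [1.0, -2.0],
    w1=[[1.0, 0.0], [0.0, 1.0]], b1=[0.0, 0.0],
    w2=[[1.0, 1.0], [2.0, -1.0]], b2=[0.5, 0.0],
)
print(logits)
```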
Dynamic Threshold Adjustment
A custom modification replacing the fixed classification threshold with a learnable α parameter.
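The card does not say how α is learned, so the following is only one plausible sketch. It assumes the hard rule "phishing if p > α" is relaxed to a sigmoid so that α receives a gradient, and that α is updated by gradient descent on binary cross-entropy. The `sharpness` and `lr` parameters and both function names are hypothetical.

```python
import math

def soft_decision(p, alpha, sharpness=10.0):
    # Smooth stand-in for the hard rule "phishing if p > alpha" (assumption).
    return 1.0 / (1.0 + math.exp(-sharpness * (p - alpha)))

def update_alpha(alpha, probs, labels, lr=0.05, sharpness=10.0):
    # One gradient-descent step on mean binary cross-entropy with respect to alpha.
    grad = 0.0
    for p, y in zip(probs, labels):
        s = soft_decision(p, alpha, sharpness)
        # dL/dz = s - y and dz/dalpha = -sharpness, where z = sharpness * (p - alpha).
        grad += (s - y) * (-sharpness)
    grad /= len(probs)
    return alpha - lr * grad

# A legitimate email (label 0) scored above the threshold pushes alpha upward.
raised = update_alpha(0.5, probs=[0.6], labels=[0])
# A phishing email (label 1) scored below the threshold pulls alpha downward.
lowered = update_alpha(0.5, probs=[0.4], labels=[1])
print(raised, lowered)
```

Under these assumptions the threshold moves in the direction that would have classified the offending example correctly, which is the behavior a fixed 0.5 cutoff cannot provide.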
Enhanced Loss Function
A custom function combining standard cross-entropy with a False Positive Rate (FPR) penalty term.
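A combined loss of this kind can be sketched as cross-entropy plus a weighted penalty. The exact penalty form and its weight are not given on the card; this sketch assumes a "soft" FPR (the mean predicted phishing probability over legitimate emails) and a hypothetical `fpr_weight` parameter.

```python
import math

def cross_entropy(probs, labels, eps=1e-12):
    # Mean binary cross-entropy; probs are predicted phishing probabilities.
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def soft_fpr(probs, labels):
    # Differentiable stand-in for FPR (assumption): average phishing
    # probability assigned to legitimate emails (label 0).
    negatives = [p for p, y in zip(probs, labels) if y == 0]
    return sum(negatives) / len(negatives) if negatives else 0.0

def enhanced_loss(probs, labels, fpr_weight=0.5):
    # Cross-entropy plus a penalty that grows as legitimate mail is flagged.
    return cross_entropy(probs, labels) + fpr_weight * soft_fpr(probs, labels)

probs, labels = [0.9, 0.2], [1, 0]
print(enhanced_loss(probs, labels))
```

With a weight of zero the function reduces to plain cross-entropy; raising the weight trades some recall for fewer false alarms.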
Daily Retraining Mechanism
A process that aggregates new detection results every 24 hours to fine-tune the model against evolving phishing patterns.
Controlled Environment Accuracy
The highest performance metric achieved by the custom DistilBERT model, recorded at 99.29%.
Real-World Accuracy
The system performance metric of 95.45% achieved during monitoring of incoming Gmail messages.
Average Response Time
The system's real-world detection speed, which averaged 1.88 s, meeting the sub-2-second target.
AUC-ROC (Controlled)
An evaluation metric for the model's discriminatory capability, which reached a value of 0.9994 in controlled tests.
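For intuition, AUC-ROC equals the probability that a randomly chosen phishing email receives a higher score than a randomly chosen legitimate one. The sketch below computes it by direct pairwise comparison on toy data; the scores and labels are made up for illustration and are unrelated to the reported 0.9994.

```python
def auc_roc(scores, labels):
    # Fraction of (positive, negative) pairs ranked correctly; ties count as half.
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy data: one phishing email (score 0.3) is ranked below a legitimate one (0.8).
toy_auc = auc_roc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(toy_auc)
```

A value of 1.0 means every phishing email outranks every legitimate one, and 0.5 is chance level. The pairwise loop is quadratic, so it suits small examples rather than a full evaluation set.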
False Positive Rate (FPR)
The rate of legitimate emails misclassified as phishing, which was 0.69% in controlled tests and 6.67% in real-world scenarios.
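FPR is computed as FP / (FP + TN), that is, over legitimate emails only. A minimal sketch with made-up labels (1 = phishing, 0 = legitimate); the data below is illustrative and does not come from the system's evaluation.

```python
def false_positive_rate(labels, predictions):
    # FP: legitimate emails flagged as phishing; TN: legitimate emails passed through.
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0

# Three legitimate emails, one of which is wrongly flagged, plus one phishing email.
toy_fpr = false_positive_rate(labels=[0, 0, 0, 1], predictions=[1, 0, 0, 1])
print(toy_fpr)
```

Phishing emails do not enter the denominator, so FPR can rise in deployment even when overall accuracy stays high.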
OAuth 2.0
The secure authentication protocol utilized for integrating the detection system with the Gmail API.