Real-Time Email Phishing Detection Practice Flashcards

Description and Tags

Vocabulary-style flashcards covering the technical specifications, metrics, and components of the custom DistilBERT-based email phishing detection system.

Last updated 11:53 AM on 6/19/26

15 Terms

1

DistilBERT

A distilled (compressed) version of BERT that retains $97\%$ of BERT's language understanding capability while being $40\%$ smaller and $60\%$ faster.

2

Anti-Phishing Working Group (APWG)

An organization that reported $165,772$ phishing attacks in the first quarter of $2020$, up from $162,155$ in the previous quarter.

3

PhishKiller

A tool utilizing featureless machine learning techniques that achieves $98.30\%$ accuracy and can block malicious websites in $81.68$ milliseconds.

4

Deep Neural Network (DNN) Approach [2]

A method for phishing URL detection achieving accuracy rates of $90\%$ for Ham, $92\%$ for Phishing Corpus, and $89\%$ for Phishload.

5

Kaggle Email Dataset

A dataset consisting of $82,486$ entries with $43,057$ ($52.20\%$) phishing emails and $39,429$ ($47.80\%$) non-phishing emails.
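
The class split on this card can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the constant and function names are illustrative and not part of the original system.

```python
# Counts quoted on the card.
TOTAL = 82_486
PHISHING = 43_057
LEGITIMATE = 39_429


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# The two classes should add up to the full dataset.
assert PHISHING + LEGITIMATE == TOTAL

phishing_share = share(PHISHING, TOTAL)
legitimate_share = share(LEGITIMATE, TOTAL)
print(f"phishing: {phishing_share:.2f}%  legitimate: {legitimate_share:.2f}%")
```

The near-even split means plain accuracy is a reasonable headline metric for this dataset, although the false positive rate is still worth tracking separately.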

6

Custom Classifier Head

A modification to the standard DistilBERT architecture consisting of a two-layer feedforward network with ReLU activation.
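
The card gives only the shape of the head: two feedforward layers with a ReLU between them. The plain-Python sketch below shows that shape on a toy input. The layer sizes, weights, and function names are all hypothetical; in the real system the input would be the DistilBERT sentence representation rather than a three-element list.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Dense layer: one output per weight row, y_i = w_i . x + b_i."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]


def classifier_head(x: Vector, w1: Matrix, b1: Vector, w2: Matrix, b2: Vector) -> Vector:
    """Two-layer feedforward head: linear -> ReLU -> linear, returning logits."""
    return linear(relu(linear(x, w1, b1)), w2, b2)


# Toy example (hypothetical sizes: 3 input features -> 2 hidden units -> 2 logits).
features = [1.0, -2.0, 0.5]
w1 = [[0.5, 0.0, 1.0], [-1.0, 1.0, 0.0]]
b1 = [0.0, 0.0]
w2 = [[1.0, -1.0], [-1.0, 1.0]]
b2 = [0.0, 0.0]

logits = classifier_head(features, w1, b1, w2, b2)
print(logits)
```

The second hidden unit is negative before the ReLU and is zeroed out, which is the non-linearity that distinguishes this head from a single linear layer.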

7

Dynamic Threshold Adjustment

A custom modification replacing the fixed classification threshold with a learnable $\alpha$ parameter.
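
The card does not say how $\alpha$ is trained, so the following is one possible sketch rather than the system's actual method. It assumes the hard rule "flag as phishing when $p \ge \alpha$" is relaxed to a sigmoid so that $\alpha$ receives a gradient; the `sharpness` factor, the learning rate, and all function names are assumptions.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def soft_decision(p: float, alpha: float, sharpness: float = 10.0) -> float:
    """Differentiable stand-in for the hard rule `p >= alpha` (assumed relaxation)."""
    return sigmoid(sharpness * (p - alpha))


def update_alpha(alpha: float, probs: Sequence[float], labels: Sequence[int],
                 lr: float = 0.01, sharpness: float = 10.0) -> float:
    """One gradient-descent step on alpha using cross-entropy of the soft decision."""
    # For s = sigmoid(z), dL/dz = s - y; with z = sharpness * (p - alpha),
    # dz/dalpha = -sharpness, so dL/dalpha = -(s - y) * sharpness.
    grads = [-(soft_decision(p, alpha, sharpness) - y) * sharpness
             for p, y in zip(probs, labels)]
    new_alpha = alpha - lr * sum(grads) / len(grads)
    return min(max(new_alpha, 0.0), 1.0)  # keep the threshold a valid probability


def classify(p: float, alpha: float) -> bool:
    """Hard decision used at inference time."""
    return p >= alpha


# A legitimate email (label 0) scored above the threshold pushes alpha up.
raised = update_alpha(0.5, [0.6], [0])
# A phishing email (label 1) scored below the threshold pushes alpha down.
lowered = update_alpha(0.5, [0.4], [1])
```

Under these assumptions, false alarms raise the threshold and missed phishing emails lower it, which is the behaviour a fixed threshold cannot provide.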

8

Enhanced Loss Function

A custom function combining standard cross-entropy with a False Positive Rate (FPR) penalty term.
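
The exact form of the penalty is not given on the card. As a hedged illustration, the sketch below assumes the loss is $L = L_{CE} + \lambda \cdot \text{FPR}_{soft}$, where the soft FPR is the mean phishing probability assigned to legitimate emails. The weight `fpr_weight` (standing in for $\lambda$) and the function names are hypothetical.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], labels: Sequence[int], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy; probs are predicted phishing probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def soft_fpr(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean phishing probability assigned to legitimate emails (label 0)."""
    negatives = [p for p, y in zip(probs, labels) if y == 0]
    return sum(negatives) / len(negatives) if negatives else 0.0


def enhanced_loss(probs: Sequence[float], labels: Sequence[int], fpr_weight: float = 0.5) -> float:
    """Cross-entropy plus a weighted false-positive penalty (assumed form)."""
    return cross_entropy(probs, labels) + fpr_weight * soft_fpr(probs, labels)


# Toy batch: one phishing email and two legitimate ones, one of them scored high.
probs = [0.9, 0.2, 0.7]
labels = [1, 0, 0]
print(enhanced_loss(probs, labels))
```

Using probabilities rather than hard counts keeps the penalty differentiable, which is why a soft FPR is assumed here.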

9

Daily Retraining Mechanism

A process that aggregates new detection results every $24$ hours to fine-tune the model against evolving phishing patterns.
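
The 24-hour aggregation window can be sketched with a small buffer class. This is an illustration only: the class name, its methods, and the `(email_text, label)` record format are hypothetical, and the fine-tuning step itself is left out.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

RETRAIN_INTERVAL = timedelta(hours=24)  # interval stated on the card


class RetrainingBuffer:
    """Collects (email_text, label) detection results between retraining runs."""

    def __init__(self, started_at: datetime) -> None:
        self.last_retrain = started_at
        self.pending: List[Tuple[str, int]] = []

    def record(self, email_text: str, label: int) -> None:
        """Store one detection result for the next fine-tuning batch."""
        self.pending.append((email_text, label))

    def is_due(self, now: datetime) -> bool:
        """True once a full interval has passed since the last retraining."""
        return now - self.last_retrain >= RETRAIN_INTERVAL

    def drain(self, now: datetime) -> List[Tuple[str, int]]:
        """Return the aggregated batch and start a new window."""
        batch, self.pending = self.pending, []
        self.last_retrain = now
        return batch


start = datetime(2026, 1, 1)
buffer = RetrainingBuffer(start)
buffer.record("Verify your account now", 1)
if buffer.is_due(start + timedelta(hours=24)):
    batch = buffer.drain(start + timedelta(hours=24))  # would be passed to fine-tuning
```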

10

Controlled Environment Accuracy

The accuracy achieved by the custom DistilBERT model in controlled testing, recorded at $99.29\%$.

11

Real-World Accuracy

The system's accuracy of $95.45\%$, achieved while monitoring incoming Gmail messages.

12

Average Response Time

The system's real-world detection latency, which averaged $1.88\,\text{s}$, meeting the sub-$2$-second target.

13

AUC-ROC (Controlled)

An evaluation metric for the model's discriminatory capability, which reached a value of $0.9994$ in controlled tests.
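
AUC-ROC can be read as the probability that a randomly chosen phishing email receives a higher score than a randomly chosen legitimate one. The sketch below computes it directly from that pairwise definition; the function name and the toy scores are illustrative, and a library routine would normally be used on real data.

```python
from typing import Sequence


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (phishing, legitimate) pairs ranked correctly.

    Ties between a phishing score and a legitimate score count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy scores: three of the four phishing/legitimate pairs are ordered correctly.
print(auc_roc([0.9, 0.3, 0.6, 0.1], [1, 1, 0, 0]))
```

A value of $1.0$ means every phishing email outranks every legitimate one, and $0.5$ is what random scoring would give.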

14

False Positive Rate (FPR)

The rate of legitimate emails misclassified as phishing, which was $0.69\%$ in controlled tests and $6.67\%$ in real-world scenarios.
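
The definition corresponds to $\text{FPR} = \frac{FP}{FP + TN}$. The sketch below applies it to a made-up batch; the counts are hypothetical, chosen only to illustrate the formula, and are not the study's data.

```python
from typing import Sequence


def false_positive_rate(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """FPR = FP / (FP + TN), with 1 = phishing and 0 = legitimate."""
    fp = sum(1 for y, pred in zip(labels, predictions) if y == 0 and pred == 1)
    tn = sum(1 for y, pred in zip(labels, predictions) if y == 0 and pred == 0)
    if fp + tn == 0:
        raise ValueError("FPR is undefined without legitimate emails")
    return fp / (fp + tn)


# Hypothetical batch: 60 legitimate emails, 3 of them wrongly flagged,
# plus 10 phishing emails that are all caught.
labels = [0] * 60 + [1] * 10
predictions = [1] * 3 + [0] * 57 + [1] * 10

fpr_percent = round(100 * false_positive_rate(labels, predictions), 2)
print(f"FPR: {fpr_percent}%")
```

Only legitimate emails enter the denominator, so the phishing emails in the batch do not affect the result.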

15

OAuth 2.0

The secure authentication protocol utilized for integrating the detection system with the Gmail API.
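
As a rough illustration of the first step of the OAuth 2.0 authorization-code flow, the sketch below builds the URL a user would visit to grant Gmail access. The card does not state which scope or flow the system uses, so the read-only Gmail scope, the offline access setting, and the placeholder client ID and redirect URI are all assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Google's OAuth 2.0 authorization endpoint and the read-only Gmail scope.
AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
GMAIL_READONLY_SCOPE = "https://www.googleapis.com/auth/gmail.readonly"  # assumed scope


def build_authorization_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Return the consent-screen URL for the authorization-code flow."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",   # ask for an authorization code
        "scope": GMAIL_READONLY_SCOPE,
        "access_type": "offline",  # request a refresh token for background polling
        "state": state,            # opaque value echoed back to guard against CSRF
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"


# Placeholder credentials for illustration only.
url = build_authorization_url("demo-client-id", "http://localhost:8080/callback", "xyz")
query = parse_qs(urlparse(url).query)
print(query["scope"])
```

After the user consents, the returned code would be exchanged for access and refresh tokens, which the detection system could then use when calling the Gmail API.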