Flashcards covering key concepts in evaluating malware detection systems, including accuracy, precision, recall, F-score, ROC curves, and practical considerations.
What are common metrics used to evaluate malware detection systems?
Accuracy, Precision, Recall, F-score, and ROC Curves
What is a potential problem when evaluating malware detection algorithms on skewed datasets?
High accuracy can be misleading if the detector simply predicts 'clean' for all files, as malware files are typically a small fraction of all files.
What is a skewed dataset?
A dataset in which the proportions of positive and negative examples are far from equal (e.g., 95% clean, 5% malware).
Why is percentage classification accuracy not sufficient to evaluate classifier performance on a skewed dataset?
Because a classifier that predicts 'clean' for every file will almost always be correct on such data, despite being useless: it never identifies any malware.
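For illustration, a minimal Python sketch of this failure mode on a hypothetical dataset that is 95% clean: a trivial detector that answers 'clean' for every file scores 95% accuracy while catching no malware.

```python
# Labels: 1 = malware, 0 = clean (hypothetical data, 95% clean).
y_true = [1] * 5 + [0] * 95          # 5 malware files, 95 clean files
y_pred = [0] * 100                   # trivial detector: everything is "clean"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / sum(y_true)

print(f"accuracy: {accuracy:.2f}")   # 0.95 -- looks great
print(f"recall:   {recall:.2f}")     # 0.00 -- detects no malware at all
```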
Define True Positive in the context of malware detection.
A file is malware and the detector correctly predicts malware.
Define False Positive in the context of malware detection.
A file is clean, but the detector incorrectly predicts malware.
Define False Negative in the context of malware detection.
A file is malware, but the detector incorrectly predicts clean.
Define True Negative in the context of malware detection.
A file is clean, and the detector correctly predicts clean.
What is a Confusion Matrix?
A 2×2 table whose cells count the four combinations of each file's true label and the classifier's prediction: true positives, false positives, false negatives, and true negatives.
Why is a Confusion Matrix useful?
It separates the four outcome types, exposing aspects of the classifier's performance (such as false positives and false negatives) that a single accuracy number hides.
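A minimal sketch of computing the four confusion-matrix cells by counting label/prediction combinations (the labels and predictions here are hypothetical):

```python
# Build the 2x2 confusion matrix by counting the four
# label/prediction combinations (1 = malware, 0 = clean).
def confusion_matrix(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")   # TP=2 FP=1 FN=1 TN=2
```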
What is the 'positive case' when doing malware analysis?
Identifying malware.
In a multi-class confusion matrix, what must be designated to make sense of the terms?
A particular class as positive.
What is the formula for Accuracy?
Accuracy = (True Positives + True Negatives) / (True Positives + True Negatives + False Positives + False Negatives)
When is Accuracy a useful way to measure system performance?
When the numbers of positive and negative examples in the test set are roughly equal.
What is the formula for Precision?
Precision = True Positives / (True Positives + False Positives)
What does Precision measure?
Of all the files the classifier predicts to be malware, what fraction actually is malware.
What is the formula for Recall?
Recall = True Positives / (True Positives + False Negatives)
What does Recall measure?
Of all the files that are malware, what fraction did the detector correctly identify?
What is the formula for F1-Score?
F1-score = 2PR / (P + R) where P=Precision and R=Recall
Why is F1-Score useful?
It combines precision and recall into a single number (their harmonic mean), which is high only when both precision and recall are high.
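A minimal sketch applying the three formulas above to hypothetical confusion-matrix counts:

```python
# Compute precision, recall, and F1 from confusion-matrix counts.
tp, fp, fn = 80, 10, 20              # hypothetical counts

precision = tp / (tp + fp)           # 80 / 90  ~= 0.889
recall = tp / (tp + fn)              # 80 / 100 =  0.800
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```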
What do high precision and high recall indicate?
High precision: when the detector flags a file as malware, it is usually right. High recall: the detector finds most of the malware that exists.
In practice, why should the number of false-positives be kept very low in a malware detector?
Because false alarms on clean files annoy users, who may then disable the malware detector entirely.
What is the purpose of calibrating classifier sensitivity?
To trade false positives against false negatives by adjusting the threshold at which the classifier generates an alert.
What is plotted in a classifier output distribution?
A histogram of malware probability scores for all samples.
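A minimal sketch of such a plot, assuming matplotlib is available and using hypothetical probability scores; splitting the histogram by true label makes the overlap between clean and malware scores visible:

```python
import matplotlib.pyplot as plt

# Hypothetical predicted malware probabilities and true labels.
scores = [0.05, 0.10, 0.20, 0.30, 0.55, 0.60, 0.70, 0.85, 0.90, 0.95]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

clean = [s for s, l in zip(scores, labels) if l == 0]
malware = [s for s, l in zip(scores, labels) if l == 1]

plt.hist([clean, malware], bins=10, range=(0, 1), label=["clean", "malware"])
plt.xlabel("predicted malware probability")
plt.ylabel("number of samples")
plt.legend()
plt.show()
```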
What does a Precision-Recall (PR) curve show?
The classifier’s performance in terms of precision and recall as the detection threshold is varied.
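A minimal sketch of computing PR-curve points, assuming scikit-learn is available (its precision_recall_curve sweeps the detection threshold over the scores):

```python
from sklearn.metrics import precision_recall_curve

# Hypothetical true labels and predicted malware probabilities.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.55, 0.60, 0.70, 0.85, 0.90, 0.95]

precision, recall, thresholds = precision_recall_curve(y_true, scores)
for p, r, t in zip(precision, recall, thresholds):
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```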
What does an ROC curve show?
The classifier’s performance as the detection threshold is varied, plotting the true positive rate (tpr) against the false positive rate (fpr).
What are the desired properties of a good classifier's ROC curve?
Low false-positive rate and a high true-positive rate; the ROC curve should bend towards the top-left corner of the graph.
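A minimal sketch of computing ROC-curve points with scikit-learn's roc_curve, again on hypothetical scores; here the two classes separate perfectly, so the area under the curve is 1.0:

```python
from sklearn.metrics import roc_curve, auc

# Hypothetical true labels and predicted malware probabilities.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.55, 0.60, 0.70, 0.85, 0.90, 0.95]

fpr, tpr, thresholds = roc_curve(y_true, scores)
print(f"area under ROC curve: {auc(fpr, tpr):.2f}")  # 1.00: perfectly separable
for f, t in zip(fpr, tpr):
    print(f"fpr={f:.2f}  tpr={t:.2f}")
```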
What is 'base rate' in the context of malware detection?
The fraction of files a system encounters that are actually malware.
What is the formula for Expected Precision?
Expected Precision = (True Positive Rate * Base Rate) / (True Positive Rate * Base Rate + False Positive Rate * (1 - Base Rate))
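A minimal worked example of this formula: even a detector with a true positive rate of 0.99 and a false positive rate of 0.01 has poor expected precision once the base rate drops far below 50% (the base-rate fallacy).

```python
# Expected precision as a function of tpr, fpr, and base rate.
def expected_precision(tpr, fpr, base_rate):
    return (tpr * base_rate) / (tpr * base_rate + fpr * (1 - base_rate))

for base_rate in (0.5, 0.05, 0.001):
    p = expected_precision(tpr=0.99, fpr=0.01, base_rate=base_rate)
    print(f"base rate {base_rate}: expected precision {p:.3f}")
# base rate 0.5:   0.990
# base rate 0.05:  0.839
# base rate 0.001: 0.090  -- most alerts are false alarms
```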
What key questions should be asked when evaluating a new malware detection method?
How do we know if the new system is better than the old system? How much better is the new system? What are the conditions where the system fails?