lecture 8

Flashcards covering key vocabulary and concepts related to data mining and model evaluation.

26 Terms

What are the two main types of data mining tasks?

Descriptive tasks and predictive tasks.

What do predictive data mining tasks involve?

They make predictions about unknown future events based on known past information.

What do descriptive data mining tasks involve?

They find patterns that describe or summarize the data without making predictions.

What is feature construction?

Creating new features from existing ones to improve model performance.
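
As a minimal sketch of feature construction, the snippet below derives a body mass index from weight and height. The records, the field names (`weight_kg`, `height_m`, `bmi`), and the helper `add_bmi` are illustrative assumptions, not part of the lecture.

```python
# Hypothetical patient records; the field names are assumptions for illustration.
patients = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": 95.0, "height_m": 1.60},
]


def add_bmi(record):
    """Return a copy of the record with a constructed 'bmi' feature."""
    enriched = dict(record)  # copy so the original record is left unchanged
    enriched["bmi"] = record["weight_kg"] / record["height_m"] ** 2
    return enriched


enriched_patients = [add_bmi(p) for p in patients]
```

The new `bmi` column combines two existing features into one that a model may find easier to use than either input alone.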

How is accuracy calculated in a classification model?

Accuracy = (TP + TN) / (TP + TN + FP + FN)
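
The formula can be sketched directly in code. The function name `accuracy` and the example counts are assumptions chosen for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all cases that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no cases to evaluate")
    return (tp + tn) / total


# Made-up counts: 85 of 100 cases are classified correctly.
example_accuracy = accuracy(tp=40, tn=45, fp=5, fn=10)
```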

Why is accuracy alone often insufficient for healthcare applications?

Because it doesn't distinguish between types of errors, which can have different implications in healthcare contexts.
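
A small worked example may make this concrete. Assume a made-up population of 1000 patients, 50 of whom are diseased, and a trivial model that labels everyone as healthy. The numbers and variable names are hypothetical.

```python
# Hypothetical screening population: 50 diseased, 950 healthy.
# The "model" predicts negative for everyone.
tp, fn = 0, 50    # every diseased patient is missed
tn, fp = 950, 0   # every healthy patient is correctly cleared

always_negative_accuracy = (tp + tn) / (tp + tn + fp + fn)
always_negative_sensitivity = tp / (tp + fn)
```

Here accuracy is 0.95 even though sensitivity is 0: the model misses every diseased patient, which accuracy alone does not reveal.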

What is a confusion matrix?

A table used to evaluate classification model performance by comparing predicted and actual results.

What is a True Positive (TP)?

Cases predicted as positive that are indeed positive.

What is a True Negative (TN)?

Cases predicted as negative that are indeed negative.

What is a False Positive (FP)?

Cases predicted as positive but actually negative.

What is a False Negative (FN)?

Cases predicted as negative but actually positive.
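
The four confusion-matrix cells (TP, TN, FP, FN) can be counted from paired labels. This is a minimal sketch that assumes labels are encoded as 1 (positive) and 0 (negative); the function name `confusion_counts` and the example data are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP and FN for binary labels encoded as 1/0."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["TP"] += 1   # predicted positive, actually positive
        elif p == 0 and a == 0:
            counts["TN"] += 1   # predicted negative, actually negative
        elif p == 1 and a == 0:
            counts["FP"] += 1   # predicted positive, actually negative
        else:
            counts["FN"] += 1   # predicted negative, actually positive
    return counts


# Made-up labels for eight cases.
actual_labels = [1, 1, 1, 0, 0, 0, 0, 1]
predicted_labels = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual_labels, predicted_labels)
```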

What is sensitivity?

True-Positive Rate (TPR): the likelihood that a diseased patient has a positive test; TP / (TP + FN)

What characterizes a desirable diagnostic test?

It has high sensitivity (TPR) and high specificity (TNR).

What is specificity?

True-Negative Rate (TNR): the likelihood that a healthy patient has a negative test; TN / (TN + FP)

Why are thresholds needed in most prediction models?

Because most tests produce continuous outputs that must be converted into a positive or negative result.

How does changing the threshold affect sensitivity and specificity?

Lowering the threshold typically increases sensitivity (catches more true positives) but decreases specificity (more false positives).
Raising the threshold typically increases specificity (fewer false positives) but decreases sensitivity (more false negatives).
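
The trade-off can be illustrated with a small sketch. The scores, labels, and the helper `sens_spec_at` are made up for illustration, and the sketch assumes a case is called positive when its score is at or above the threshold.

```python
# Hypothetical model scores with their true labels (1 = diseased, 0 = healthy).
scores = [0.95, 0.80, 0.60, 0.40, 0.70, 0.30, 0.20, 0.10]
actual = [1, 1, 1, 1, 0, 0, 0, 0]


def sens_spec_at(threshold, scores, actual):
    """Return (sensitivity, specificity) when score >= threshold means positive."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


low_threshold = sens_spec_at(0.25, scores, actual)    # (1.0, 0.5)
high_threshold = sens_spec_at(0.75, scores, actual)   # (0.5, 1.0)
```

On this toy data, the low threshold catches every diseased case at the cost of two false positives, while the high threshold removes the false positives but misses two diseased cases.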

When would you prioritize sensitivity over specificity?

When the disease is serious and life-saving therapy is available (minimizing false negatives).

When would you prioritize specificity over sensitivity?

When the disease is not serious and the therapy has risks (minimizing false positives).

What are "black box" models?

Models that are not easily interpretable by humans, such as Artificial Neural Networks and Support Vector Machines.

What are "white box" models?

Models that provide clear reasoning for predictions, such as Decision Trees.

Give examples of predictive data mining tasks.

Classification (e.g., SVM) and regression.

Give an example of a descriptive data mining task.

Clustering.

Is TPR (True Positive Rate) specificity or sensitivity?

Sensitivity

Is TNR (True Negative Rate) specificity or sensitivity?

Specificity

How do you calculate Sensitivity?

TPR = TP / (TP + FN)

How do you calculate Specificity?

TNR = TN / (TN + FP)
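
Both rates can be sketched as small functions. The function names and the example counts are assumptions for illustration. Note that the denominator needs parentheses: under normal operator precedence, `tp / tp + fn` would evaluate to `1 + fn` rather than the intended rate.

```python
def sensitivity(tp, fn):
    """True Positive Rate: share of actual positives predicted as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True Negative Rate: share of actual negatives predicted as negative."""
    return tn / (tn + fp)


# Made-up counts: 50 diseased patients (40 detected), 50 healthy (45 cleared).
example_tpr = sensitivity(tp=40, fn=10)
example_tnr = specificity(tn=45, fp=5)
```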