Naive Bayes
Data-driven method that makes no assumptions about the form of the relationship; based on Bayes' theorem, named after Thomas Bayes
Naive Bayes Usage
Requires categorical variables and can be used with large data sets
Exact Bayes Classifier
Relies on finding other records that share the same predictor values and computing the probability of class membership among them; requires an exact match on all predictors
Naive Bayes - solution to exact bayes
Assume the predictor variables are independent and apply the multiplication rule to combine their conditional probabilities, arriving at approximately the same probability estimate without needing exact matches
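The multiplication rule above can be sketched in plain Python. The toy records, category names, and class labels below are hypothetical, chosen only to echo the fraud example on this deck:

```python
from collections import Counter

# Hypothetical training records: (predictor values, class label).
# Predictors: firm size, charges filed.
records = [
    (("small", "yes"), "fraud"),
    (("small", "no"), "truthful"),
    (("large", "no"), "truthful"),
    (("small", "yes"), "fraud"),
    (("large", "yes"), "truthful"),
]

def naive_bayes_score(x, records):
    """Unnormalized Naive Bayes score per class:
    P(class) * product over predictors of P(x_j | class)."""
    class_counts = Counter(y for _, y in records)
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / len(records)              # prior P(class = c)
        for j, value in enumerate(x):
            match = sum(1 for xs, y in records if y == c and xs[j] == value)
            score *= match / n_c                # conditional P(x_j = value | c)
        scores[c] = score
    return scores

scores = naive_bayes_score(("small", "yes"), records)
```

Dividing each score by their sum would give the posterior probabilities; for classification only the ranking matters.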
Example of Naive Bayes
Financial Fraud
Exact Bayes Calculations Example
Classify a small firm with charges filed
Naive bayes calculations
Classify a small firm with charges filed, using the conditional probability quantities
Other notes about Naive Bayes
Probability estimates do not differ greatly from the exact Bayes estimates, all records are used in the calculations, and it is more practical
Independence Assumption
Not strictly justified, but often good enough in practice
Advantages of Naive Bayes
Handles purely categorical data well, works with large data sets, and is simple and efficient
Disadvantages of Naive Bayes
Requires a large number of records; problematic when a predictor category is not present in the training data (its conditional probability becomes zero)
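A common remedy for the missing-category problem (not named on this deck) is Laplace (add-one) smoothing of the conditional probability estimates. A minimal sketch with made-up counts:

```python
def smoothed_conditional(value_class_count, class_count, n_categories, alpha=1.0):
    """Laplace-smoothed estimate of P(predictor = value | class).

    alpha = 1 (add-one smoothing) keeps the estimate strictly positive even
    when the category was never observed with this class in training."""
    return (value_class_count + alpha) / (class_count + alpha * n_categories)

# Hypothetical counts: category seen 0 times among 10 records of this class,
# with 3 possible categories for the predictor.
unsmoothed = 0 / 10                        # zeroes out the whole product
smoothed = smoothed_conditional(0, 10, 3)  # small but positive: 1/13
```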
Assumptions of MLR
Linearity, Independence, Homoscedasticity, Normality, No multicollinearity, No autocorrelation
Linearity
Relationship between predictors and outcome is linear
Independence
Observations are independent of each other
Homoscedasticity
Constant variance of the residuals
Normality
Residuals are normally distributed
No multicollinearity
Predictors aren’t highly correlated
No auto-correlation
Residuals are independent
When does multicollinearity occur
When two or more independent variables are highly correlated, which can inflate the standard errors of the coefficients
Variance Inflation Factor
Quantifies how much the variance of a regression coefficient is inflated due to multicollinearity
VIF = 1
No multicollinearity; the ideal value
VIF < 5
Low to moderate multicollinearity and generally acceptable
VIF > 5
High multicollinearity and is problematic
VIF > 10
Severe multicollinearity; strong evidence to remove or combine variables
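The VIF can be computed directly from its definition: regress each predictor on the others and take 1 / (1 − R²). A NumPy sketch with simulated data (the variable names and data are illustrative, not from the deck):

```python
import numpy as np

def vif(X):
    """VIF for each column of X: regress it on the other columns,
    then VIF_j = 1 / (1 - R_j^2)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return out

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)  # nearly collinear with x1 -> large VIF
x3 = rng.normal(size=200)             # independent -> VIF near 1
vifs = vif(np.column_stack([x1, x2, x3]))
```

Here `vifs[0]` and `vifs[1]` land well above 10 (severe, per the thresholds above) while `vifs[2]` stays near 1.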
How to detect violations
Linearity, Independence, Homoscedasticity, Normality, Multicollinearity, Auto-correlation
Linearity
Residual plots, scatterplots
Independence
Study design, Durbin-Watson test
Homoscedasticity
Residual vs fitted plot
Normality
Q-Q plot
Multicollinearity
Variance Inflation Factor
Autocorrelation
Durbin-Watson Test
Durbin-Watson test
A statistic ranging from 0 to 4 used to test for autocorrelation in the residuals
Durbin-Watson test result of 2.0
No autocorrelation; the ideal case
Durbin-Watson test result between 0 and 2
Positive autocorrelation
Durbin-Watson test result between 2.0 and 4
Negative autocorrelation
Durbin-Watson test result between 1.5 and 2.5
Generally acceptable
Durbin-Watson test result below 1.5 or above 2.5
Suggests potential autocorrelation problems
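The statistic itself is simple to compute: the sum of squared successive differences of the residuals divided by the sum of squared residuals. A sketch with simulated residuals (the data here are illustrative):

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2); near 2 means no autocorrelation."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(1)
white = rng.normal(size=500)            # independent residuals -> DW near 2

rho = 0.8
ar = np.empty(500)
ar[0] = white[0]
for t in range(1, 500):
    ar[t] = rho * ar[t - 1] + rng.normal()  # positive autocorrelation -> DW well below 2
```

For an AR(1) process the statistic is approximately 2(1 − ρ), which is why values near 2 indicate no autocorrelation.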
How to fix autocorrelation
Add lagged variables or omitted time-dependent predictors, or use time-series estimation methods such as the Cochrane-Orcutt procedure or generalized least squares
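A standard remedy for autocorrelated residuals (not spelled out on the deck) is the Cochrane-Orcutt procedure: estimate the lag-1 autocorrelation ρ of the residuals, quasi-difference y and X by ρ, and re-fit. A sketch with simulated AR(1) residuals; in practice the residuals would come from an initial OLS fit, but the true errors are used here for illustration:

```python
import numpy as np

def cochrane_orcutt_step(y, X, resid):
    """One Cochrane-Orcutt step: estimate rho from the lag-1 residual
    autocorrelation, then quasi-difference y and X by rho."""
    rho = np.sum(resid[1:] * resid[:-1]) / np.sum(resid[:-1] ** 2)
    return rho, y[1:] - rho * y[:-1], X[1:] - rho * X[:-1]

rng = np.random.default_rng(2)
n, true_rho = 2000, 0.8
e = np.empty(n)
e[0] = rng.normal()
for t in range(1, n):
    e[t] = true_rho * e[t - 1] + rng.normal()   # AR(1) errors

X = rng.normal(size=(n, 1))
y = 1.0 + 2.0 * X[:, 0] + e
rho, y_star, X_star = cochrane_orcutt_step(y, X, e)  # rho should be near 0.8
```

Re-fitting OLS on the quasi-differenced `y_star` and `X_star` then yields residuals with much weaker autocorrelation; the step can be iterated until ρ stabilizes.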