BAN 402 Exam

37 Terms

1

Tidymodels framework ordering

Recipe, model, workflow

2

K-fold cross-validation is used for

parameter tuning

3

A logistic regression response variable is

binary (yes/no)

4

A linear regression response variable is

quantitative (e.g., test score, miles per gallon)

5

For a logistic regression model, you look at (blank) on the lefthand side

The log odds (logit) of a particular class
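
As a rough illustration of this card, the sketch below shows how the left-hand side of a logistic regression (the log odds) maps back to a probability. The coefficients `beta0` and `beta1` and the input `x` are made-up values for illustration only, not output from any fitted model.

```python
import math

def log_odds(p):
    """Log odds (logit) of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def probability(z):
    """Inverse of the logit: map log odds z back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and predictor value, chosen only for illustration.
beta0, beta1 = -3.0, 0.5
x = 8.0

z = beta0 + beta1 * x  # left-hand side of the model: log odds of the "yes" class
p = probability(z)     # predicted probability of the "yes" class
print(f"log odds = {z:.2f}, probability = {p:.3f}")
```

Converting back with `log_odds(p)` recovers `z`, which is why the model can be linear on the log odds scale while still predicting a probability between 0 and 1.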

6

In a logistic regression model you are trying to predict

probability

7

The process by which a second sample group is given a test to ensure it is applicable to more than one group

Cross validation

8

What are common values of k in k-fold cross-validation?

3, 5, 10
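
To make the idea of k folds concrete, here is a minimal sketch of how row indices might be divided into k folds. The helper `k_fold_indices` is a hypothetical name, and it deliberately skips the random shuffling that real resampling tools normally apply.

```python
def k_fold_indices(n, k):
    """Split row indices 0..n-1 into k folds of nearly equal size.

    Returns a list of (train, assess) index lists, one pair per fold.
    No shuffling is done here; that is a simplifying assumption.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        assess = folds[i]
        # Training rows are every row that is not in the held-out fold.
        train = sorted(j for fold in folds[:i] + folds[i + 1:] for j in fold)
        splits.append((train, assess))
    return splits

splits = k_fold_indices(n=10, k=5)
for train, assess in splits:
    print("train:", train, "assess:", assess)
```

Each row is held out exactly once, so a tuning parameter can be scored on k different held-out sets and the scores averaged.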

9

What does cross validation help with

tuning

10

Response variable is categorical (qualitative); predicts a binary variable; beta is a rate; fit with "glm"

logistic regression model

11

In classification, you want AIC to be

low

12

In classification, you want R-squared to be

high

13

Response variable is numerical (quantitative); fit with "lm"

linear regression model

14

confusion matrix

Predictions vs. Actual
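
A confusion matrix is a count of each (predicted, actual) pair. The sketch below tallies one by hand; the `actual` and `predicted` labels are invented for illustration.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (predicted, actual) label pair occurs."""
    return Counter(zip(predicted, actual))

# Made-up labels, for illustration only.
actual    = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no",  "no", "yes", "yes", "no"]

cm = confusion_matrix(actual, predicted)
tp = cm[("yes", "yes")]  # predicted yes, actually yes
fn = cm[("no", "yes")]   # predicted no, actually yes
fp = cm[("yes", "no")]   # predicted yes, actually no
tn = cm[("no", "no")]    # predicted no, actually no
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
```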

15

developing probabilities/predictions

50% is default threshold, balancing sensitivity and specificity (important in healthcare, credit card fraud, insurance)
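
The trade-off between sensitivity and specificity can be sketched by classifying the same predicted probabilities at two thresholds. The probabilities and labels below are invented for illustration, and the helper names are hypothetical.

```python
def classify(probs, threshold=0.5):
    """Label an observation 1 when its probability meets the threshold."""
    return [1 if p >= threshold else 0 for p in probs]

def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up predicted probabilities and true classes, for illustration only.
probs  = [0.9, 0.6, 0.4, 0.3, 0.7, 0.2]
actual = [1, 1, 1, 0, 0, 0]

sens_default, spec_default = sensitivity_specificity(actual, classify(probs, 0.5))
sens_strict, spec_strict = sensitivity_specificity(actual, classify(probs, 0.8))
print("threshold 0.5:", sens_default, spec_default)
print("threshold 0.8:", sens_strict, spec_strict)
```

In this toy data, raising the threshold from 0.5 to 0.8 lowers sensitivity and raises specificity, which is the balance the card refers to.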

16

splitting data helps with

overfitting

17

naive accuracy

From the "Confusion Matrix and Statistics" output: compare the "No Information Rate" with the accuracy (usually with all variables)
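
As a rough sketch, the no information rate can be computed as the accuracy of always predicting the most common class, then compared with the model's accuracy. The labels below are invented and the helper names are hypothetical.

```python
from collections import Counter

def no_information_rate(actual):
    """Accuracy obtained by always predicting the most common class."""
    counts = Counter(actual)
    return max(counts.values()) / len(actual)

def accuracy(actual, predicted):
    """Share of predictions that match the actual class."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

# Made-up labels, for illustration only: 7 "no" and 3 "yes".
actual    = ["no"] * 7 + ["yes"] * 3
predicted = ["no"] * 6 + ["yes"] + ["yes"] * 2 + ["no"]

nir = no_information_rate(actual)
acc = accuracy(actual, predicted)
print(f"no information rate = {nir:.2f}, accuracy = {acc:.2f}")
```

A model is only adding information if its accuracy is meaningfully above the no information rate.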

18

The counterpart of the R-squared value for a classification model is

AIC

19

In ROCR, you are looking for

the curve closest to the top left corner

20

Classification trees use the

rpart package

21

The parameter to change complexity is

cp

22

Lower cp means

tree is big

23

higher cp means

tree is small

24

A tree with no splits

terminal node

25

random forest uses

min_n and mtry

26

in clustering, you do not know

dependent variable

27

in clustering, who specifies the number of clusters

the user

28

in clustering, you use what function

mbclust

29

in clustering, what algorithm do you use

kmeans
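
For intuition, here is a minimal one-dimensional sketch of the k-means idea: assign each point to its nearest center, then move each center to the mean of its points. The function `kmeans_1d`, the data, and the starting centers are all made up for illustration; the user sets the number of clusters through the number of starting centers.

```python
def kmeans_1d(values, centers, iterations=10):
    """Very small k-means for one-dimensional data.

    `centers` holds the starting centers; its length is the number of
    clusters chosen by the user. Runs a fixed number of iterations.
    """
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Made-up data with two visible groups, for illustration only.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print("centers:", centers)
print("clusters:", clusters)
```

No dependent variable appears anywhere in the sketch: the groups come only from distances between the observations.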

30

What does the "CP" parameter control in rpart()?

The complexity of the classification tree

31

Which function is used to develop predictions from a classification tree in caret?

predict()

32

What is the term used to describe a tree without any splits?

leaf node

33

What does the term "min.node.size" control in the ranger function?

The minimum number of observations in a terminal node

34

Which measure indicates the proportion of actual negative instances that are correctly identified by the model?

specificity

35

What is the key difference between clustering and previous subjects?

dependent variable is unknown

36

Which package is used to determine the optimal threshold for classification?

ROCR

37

Which package is used to determine probabilities from logistic regression?

ROCR