Predictive Risk Modeling

Last updated 6:54 PM on 4/26/26
47 Terms

1. What is high variance in a model?
Small changes in the training data cause large changes in the fitted model.

2. What is bias in a model?
Error caused by overly simplistic or incorrect assumptions about the true relationship.

3. Do more flexible models always predict better?
No. They may overfit and increase variance.
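Cards 1–3 can be illustrated numerically. The sketch below (a minimal illustration; the dataset, noise level, and degree choices are my own assumptions, not from the deck) refits a flexible degree-9 polynomial and a simple linear model on bootstrap resamples and compares how much their predictions vary:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
y_true = np.sin(2 * np.pi * x)

def predictions_over_resamples(degree, n_resamples=200):
    """Refit a polynomial of the given degree on bootstrap resamples
    and return each fit's prediction at x = 0.5."""
    preds = []
    for _ in range(n_resamples):
        idx = rng.integers(0, len(x), len(x))
        y = y_true[idx] + rng.normal(0, 0.3, len(x))
        coefs = np.polyfit(x[idx], y, degree)
        preds.append(np.polyval(coefs, 0.5))
    return np.array(preds)

var_linear = predictions_over_resamples(degree=1).var()
var_flexible = predictions_over_resamples(degree=9).var()
print(var_linear, var_flexible)  # the flexible fit varies far more
```

The flexible model tracks each resample's noise, so its predictions swing widely from sample to sample: exactly the "high variance" of card 1.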

4. In linear regression, what does β₁ mean?
The average change in Y for a one-unit increase in X₁, holding the other predictors constant.

5. What does β₀ represent?
The intercept: the expected value of Y when all predictors equal 0.
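Cards 4–5 can be checked with a small simulation (a sketch; the true coefficients 2.0 and 3.0 are made up for illustration): generate data from a known line and recover β₀ and β₁ by ordinary least squares:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
# Assumed true model: Y = 2.0 + 3.0 * X1 + noise
y = 2.0 + 3.0 * x1 + rng.normal(0, 0.5, n)

# Design matrix with an intercept column; lstsq returns (beta0, beta1)
X = np.column_stack([np.ones(n), x1])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # close to [2.0, 3.0]
```

Here β₁ estimates the average change in Y per one-unit increase in X₁, and β₀ is the expected Y when X₁ = 0.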

6. What does BLUE mean?
Best Linear Unbiased Estimator.

7. What estimator is BLUE in OLS regression?
The least squares estimator (under the Gauss–Markov assumptions).

8. If residual spread increases with fitted values, what may help?
A log transform of Y.

9. What is another transformation for variance that increases with the mean?
A square root transform.

10. When is the logit transformation commonly used?
When the response is a proportion between 0 and 1.
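Cards 8–10 name three common transformations. A brief numpy sketch (the sample values are illustrative assumptions):

```python
import numpy as np

# Log transform: compresses large values, which often stabilizes
# variance that grows with the mean.
y = np.array([1.0, 10.0, 100.0, 1000.0])
print(np.log(y))   # equally spaced on the log scale

# Square root transform: a milder variance-stabilizing option.
print(np.sqrt(y))

# Logit transform for proportions p in (0, 1): log(p / (1 - p)).
p = np.array([0.1, 0.5, 0.9])
logit = np.log(p / (1 - p))
print(logit)       # symmetric about 0; logit(0.5) = 0
```

The logit maps the bounded interval (0, 1) onto the whole real line, which is why it suits proportion responses.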

11. What does the t-test test in regression?
Whether an individual coefficient equals zero.

12. What does the F-test test?
Whether all slope coefficients are zero (overall model significance).

13. If all individual t-tests reject, will the F-test reject?
Usually yes; overall significance generally follows, though the two tests answer different questions, so agreement is not strictly guaranteed.

14. What does R² measure?
The proportion of variance in Y explained by the predictors.

15. What does a higher R² mean?
A better fit to the training data (not necessarily better prediction on new data).
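Cards 12–15 connect: the overall F statistic can be computed directly from R². A numpy sketch on assumed simulated data (the coefficients and sizes are my own choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

# Fit OLS with an intercept
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta

# R^2: proportion of variance in y explained by the predictors
r2 = 1 - resid.var() / y.var()

# Overall F statistic for H0: all slope coefficients are zero
f_stat = (r2 / p) / ((1 - r2) / (n - p - 1))
print(r2, f_stat)
```

With strong true signal, R² is high and the F statistic is large, so the overall test rejects, consistent with card 13.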

16. What assumption is made when predicting new data?
That new observations follow the same population/model as the training data.

17. Which is more informative: a narrow or a wide prediction interval?
The narrower interval.

18. Should the point prediction lie inside the prediction interval?
Yes.

19. What does best subset selection do?
Evaluates all subsets of predictors and chooses the best one by a criterion.

20. Does stepwise selection always find the best subset?
No.

21. What does the lasso do?
Shrinks coefficients and can set some exactly to zero.

22. If λ = 0 in the lasso, the result equals what?
Ordinary least squares.

23. Why is the lasso useful?
It performs variable selection and reduces overfitting.
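The behavior in cards 21–23 is easiest to see in the simplest case: with a single standardized predictor, the lasso solution is the soft-thresholded OLS coefficient. This sketch (an illustrative special case, not the general coordinate-descent algorithm) shows shrinkage, exact zeros, and that λ = 0 recovers OLS:

```python
import numpy as np

def lasso_1d(beta_ols, lam):
    """Lasso solution for one standardized predictor:
    soft-threshold the OLS coefficient by lambda."""
    return np.sign(beta_ols) * max(abs(beta_ols) - lam, 0.0)

beta_ols = 1.5
print(lasso_1d(beta_ols, 0.0))  # 1.5 -> lambda = 0 gives OLS (card 22)
print(lasso_1d(beta_ols, 0.5))  # 1.0 -> shrunk toward zero
print(lasso_1d(beta_ols, 2.0))  # 0.0 -> set exactly to zero (card 21)
```

A coefficient driven exactly to zero drops its predictor from the model, which is the variable-selection property of card 23.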

24. What type of splits do decision trees use?
Recursive binary splits.

25. What is pruning a tree for?
To reduce overfitting and improve test accuracy.

26. What is bagging?
Fitting many trees to bootstrap samples and averaging their predictions.

27. What is a random forest?
Bagging plus a random subset of predictors considered at each split.

28. If m = p in a random forest, what is it?
Bagging.

29. Why does a random forest usually beat a single tree?
Lower variance, hence better prediction.

30. What is the cost of bagging/random forests?
Less interpretability.

31. What does a low Gini index mean?
The node is mostly one class (pure).

32. What is the lowest possible Gini index?
0.
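Cards 31–32 can be made concrete with the Gini index formula G = 1 − Σₖ pₖ², sketched here in pure Python:

```python
from collections import Counter

def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class
    proportions. 0 for a pure node; larger when classes are mixed."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["a", "a", "a", "a"]))  # 0.0 (pure node, the minimum)
print(gini(["a", "a", "b", "b"]))  # 0.5 (maximally mixed, two classes)
```

Tree-growing algorithms pick the split that most reduces this impurity, which is why low Gini signals a nearly pure node.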

33. What is the purpose of PCA?
To reduce dimensionality while preserving as much variance as possible.

34. What is the first principal component?
The direction capturing the maximum variance.

35. Do cumulative explained variances increase or decrease?
They increase.

36. What is a scree plot used for?
Deciding how many principal components to keep.

37. Using all PCs gives what?
Full variance explained but no dimensionality reduction.

38. The variance explained by all PCs sums to what?
100%.
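Cards 33–38 can be verified with a small eigendecomposition-based PCA (a numpy sketch on assumed random correlated data):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated features
Xc = X - X.mean(axis=0)  # center before PCA

# Eigenvalues of the covariance matrix = variances along the PCs
eigvals = np.linalg.eigvalsh(np.cov(Xc.T))[::-1]  # sorted descending
explained = eigvals / eigvals.sum()
cumulative = np.cumsum(explained)
print(cumulative)  # increasing, ending at 1.0 (100% with all PCs)
```

The first entry of `explained` is the largest (the first PC captures maximum variance, card 34), the cumulative series only increases (card 35), and it ends at 100% when all PCs are used (card 38). Plotting `explained` against component number gives the scree plot of card 36.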

39. Is clustering supervised or unsupervised?
Unsupervised.

40. What is the purpose of clustering?
To find homogeneous groups in the data.

41. What must be chosen before running K-means?
The number of clusters, K.

42. Does K-means always give the same result?
No; the result depends on the starting centroids.
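Cards 41–42 can be sketched with a compact K-means implementation (my own minimal version, for illustration only): K must be supplied up front, and the random seed controls the starting centroids, so different seeds can converge to different local optima:

```python
import numpy as np

def kmeans(X, k, seed, n_iter=50):
    """Minimal K-means: random initial centroids, then alternate
    assignment and centroid-update steps."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid
        labels = np.argmin(
            np.linalg.norm(X[:, None] - centroids[None, :], axis=2), axis=1
        )
        # Move each centroid to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated blobs; K = 2 is chosen before running (card 41)
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
labels, _ = kmeans(X, k=2, seed=0)
```

On well-separated data like this, most seeds recover the two blobs; on harder data, rerunning with several seeds and keeping the best within-cluster sum of squares is the usual remedy for the seed dependence in card 42.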

43. What does hierarchical clustering produce?
A dendrogram.

44. How do you choose the number of clusters in hierarchical clustering?
Cut the dendrogram at a chosen height.

45. Does a better training fit always mean a better test fit?
No.

46. Is more complexity always better?
No.

47. Does correlation imply causation?
No.