DTSC MIDTERM FINAL REVIEW HELP SHEET

38 Terms

1

Two numerical variables

Scatterplot used to examine the relationship between two numeric variables

2

Numerical and categorical variables

Side-by-side boxplot used to compare a numeric variable across categories

3

Two categorical variables

Stacked bar chart or ribbon plot used to show joint distribution

4

One categorical variable

Bar chart used to display counts or proportions

5

Stacked boxplot

Not a valid or standard plot and typically a trap answer

6

Scatterplot

Used to look for patterns or trends between two numeric variables

7

Correlation

Measures the strength and direction of a linear relationship between two numeric variables
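
As a rough illustration of this card, the sketch below computes the Pearson correlation coefficient by hand. The function name `pearson_r` and the `hours`/`score` data are made up for the example and do not come from the course.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cross-products and sums of squares, all measured from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and exam score
hours = [1, 2, 3, 4, 5]
score = [52, 58, 61, 70, 74]
r = pearson_r(hours, score)
print(round(r, 3))
```

A value near +1 or -1 suggests a strong linear relationship, and the sign gives the direction. A value near 0 only rules out a linear pattern, not a curved one.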

8

Linear regression

Models and predicts a numeric response using one or more predictors
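
A minimal sketch of the idea with a single predictor, using the standard least-squares formulas. The helper `fit_line` and the data are assumptions for illustration; the card also covers models with several predictors, which this sketch does not handle.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for one numeric predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

intercept, slope = fit_line(x, y)
# Predict the response for a new predictor value
prediction = intercept + slope * 6
print(round(intercept, 2), round(slope, 2), round(prediction, 2))
```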

9

Indicator variable

A binary variable coded as 0 or 1 representing group membership

10

Interaction variable

A variable created by multiplying predictors to test whether effects differ across groups
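
To make the indicator and interaction cards concrete, here is a small sketch of how the two columns might be built before fitting a model. The car data and the column names are hypothetical.

```python
# Hypothetical rows: (mileage in thousands of miles, make)
cars = [(20, "Porsche"), (35, "Jaguar"), (50, "Porsche")]

rows = []
for mileage, make in cars:
    porsche = 1 if make == "Porsche" else 0   # indicator: 1 = in the group, 0 = baseline
    mileage_porsche = mileage * porsche       # interaction: product of the two predictors
    rows.append({
        "mileage": mileage,
        "porsche": porsche,
        "mileage_porsche": mileage_porsche,
    })

for row in rows:
    print(row)
```

The interaction column equals the mileage for cars in the group and 0 for baseline cars, which is what lets the model give the group its own slope.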

11

Supervised classification

Modeling with a known categorical response variable

12

Supervised regression

Modeling with a known numeric response variable

13

Unsupervised classification

Grouping or clustering data without a known response variable

14

Unsupervised regression

Finding structure or patterns in numeric data without labeled outcomes

15

Consequentialist theories

Evaluate actions based on outcomes and consequences

16

Deontological theories

Focus on duties, rights, fairness, and justice

17

Utilitarianism

A form of consequentialism focused on maximizing overall benefit

18

Virtue theories

Focus on integrity, character, and moral responsibility

19

R squared

The proportion of variability in the response explained by the predictors

20

R squared interpretation

Measures explanatory power, not prediction accuracy
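
One common way to compute R squared is 1 - SSE/SST, where SSE is the sum of squared residuals and SST is the total variability of the response around its mean. The sketch below assumes that definition; the function name and the numbers are made up.

```python
def r_squared(observed, predicted):
    """R squared computed as 1 - SSE/SST."""
    mean_obs = sum(observed) / len(observed)
    # Variability left unexplained by the model
    sse = sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted))
    # Total variability of the response around its mean
    sst = sum((y - mean_obs) ** 2 for y in observed)
    return 1 - sse / sst

# Hypothetical observed responses and model predictions
observed = [10.0, 12.0, 15.0, 19.0]
predicted = [9.5, 12.5, 15.5, 18.5]
r2 = r_squared(observed, predicted)
print(round(r2, 3))
```

Because R squared is a proportion, it has no units and does not say how large a typical prediction error is.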

21

Residual

Observed value minus predicted value

22

Residual interpretation

Positive residual means underprediction; negative residual means overprediction
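
A short sketch of the sign convention, using made-up observed and predicted values. The helper `describe` is hypothetical.

```python
# Hypothetical observed responses and model predictions
observed = [30.0, 22.0, 18.0]
predicted = [27.0, 22.0, 21.0]

# Residual = observed value minus predicted value
residuals = [y - yhat for y, yhat in zip(observed, predicted)]

def describe(residual):
    """Label a residual by its sign."""
    if residual > 0:
        return "underprediction"   # the model guessed too low
    if residual < 0:
        return "overprediction"    # the model guessed too high
    return "exact"

labels = [describe(r) for r in residuals]
print(residuals)
print(labels)
```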

23

P-value

Probability of observing results at least as extreme as the data, assuming the null hypothesis is true

24

P-value interpretation

Small p-value indicates statistical significance; large p-value indicates insufficient evidence against the null hypothesis

25

Mileage_Porsche coefficient interpretation

Estimates how much the effect of mileage on price differs for Porsches compared to the baseline group; its test checks whether that difference is zero
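
Written out as a model, the interpretation looks roughly like this. The variable names Price, Mileage, and Porsche and the coefficient labels are assumptions; only the term Mileage_Porsche comes from the card.

```latex
\[
\widehat{\text{Price}} = \beta_0 + \beta_1\,\text{Mileage} + \beta_2\,\text{Porsche}
  + \beta_3\,(\text{Mileage} \times \text{Porsche})
\]
\[
\text{Baseline group (Porsche = 0): slope on Mileage} = \beta_1
\qquad
\text{Porsches (Porsche = 1): slope on Mileage} = \beta_1 + \beta_3
\]
```

Under this setup, the Mileage_Porsche coefficient is the difference between the two slopes, so testing whether it equals zero is testing whether the slopes are the same.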

26

Non-significant interaction term

Indicates no statistical evidence that effects differ across groups

27

Modeling as a socio-technical loop

Models influence society and societal decisions influence future data and models

28

RMSE

Measures the typical size of prediction errors in the same units as the response

29

RMSE interpretation

Lower RMSE indicates better predictive accuracy
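
A minimal sketch of the calculation, assuming the version that divides by the number of observations n. Some courses divide by the residual degrees of freedom instead. The function name and the data are made up.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error: square root of the average squared residual."""
    n = len(observed)
    mse = sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted)) / n
    return math.sqrt(mse)

# Hypothetical observed responses and predictions from two competing models
observed = [3.0, 5.0, 7.0, 9.0]
model_a = [2.0, 5.0, 8.0, 9.0]
model_b = [1.0, 5.0, 9.0, 9.0]

rmse_a = rmse(observed, model_a)
rmse_b = rmse(observed, model_b)
print(round(rmse_a, 3), round(rmse_b, 3))
```

Here model A has the lower RMSE, so its predictions are typically closer to the observed values, measured in the units of the response.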

30

F-test

Evaluates whether a regression model is useful overall

31

F-test large value interpretation

Large F-statistic with small p-value indicates at least one predictor has a non-zero effect

32

F-test small value interpretation

Small F-statistic with large p-value indicates insufficient evidence that the model is useful overall
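
For reference, the overall F-statistic in a regression with p predictors and n observations is usually written as below. The notation (SST for total sum of squares, SSE for the sum of squared residuals) is an assumption and may differ from the course's.

```latex
\[
H_0:\ \beta_1 = \beta_2 = \dots = \beta_p = 0
\qquad
F = \frac{(\mathrm{SST} - \mathrm{SSE}) / p}{\mathrm{SSE} / (n - p - 1)}
\]
```

The numerator is the variability explained per predictor and the denominator is the leftover variability per residual degree of freedom, so F is large when the model explains much more than noise alone would.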

33

Residuals vs fitted plot

A plot used to check whether prediction errors behave randomly

34

Good residuals vs fitted pattern

Errors appear random and unrelated to predictions, indicating assumptions are reasonable

35

Bad residuals vs fitted pattern

Errors show systematic behavior, indicating the model missed structure or violated assumptions

36

QQ plot

Used to assess whether residuals are approximately normally distributed

37

Bootstrapping

Resampling with replacement from observed data

38

Purpose of bootstrapping

Used to estimate uncertainty when distributional assumptions may not hold
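
The two bootstrapping cards can be sketched as follows: resample the data with replacement many times, recompute the statistic each time, and use the spread of the results as a measure of uncertainty. This is one simple variant (a percentile interval for the mean); the function name, the data, the number of resamples, and the seed are all assumptions.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=2000, seed=0):
    """Means of resamples drawn with replacement from the observed data."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=n)  # same size as the data, with replacement
        means.append(statistics.mean(resample))
    return means

# Hypothetical sample
data = [4.1, 5.3, 2.8, 6.0, 4.7, 5.1, 3.9, 4.4]

means = sorted(bootstrap_means(data))
# Approximate 95% percentile interval: cut off 2.5% of resampled means in each tail
lower = means[len(means) * 25 // 1000]
upper = means[len(means) * 975 // 1000 - 1]
print(round(lower, 2), round(upper, 2))
```

No normality assumption is used anywhere in this calculation, which is the point of the second card.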