Two numerical variables
Scatterplot used to examine the relationship between two numeric variables
Numerical and categorical variables
Side-by-side boxplot used to compare a numeric variable across categories
Two categorical variables
Stacked bar chart or ribbon plot used to show joint distribution
One categorical variable
Bar chart used to display counts or proportions
Stacked boxplot
Not a valid or standard plot and typically a trap answer
Scatterplot
Used to look for patterns or trends between two numeric variables
Correlation
Measures the strength and direction of a linear relationship between two numeric variables
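A minimal sketch of the Pearson correlation coefficient, to make "strength and direction" concrete. The `mileage` and `price` values are made up for illustration, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both spreads, so -1 <= r <= 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Illustrative (invented) data: price falls as mileage rises.
mileage = [10, 20, 30, 40, 50]
price = [50, 45, 38, 33, 25]
r = pearson_r(mileage, price)
print(round(r, 3))  # negative sign = downward trend; magnitude near 1 = strong
```

The sign gives the direction and the magnitude gives the strength. A value near 0 only rules out a linear pattern, not a curved one.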
Linear regression
Models and predicts a numeric response using one or more predictors
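As a sketch of the single-predictor case, the least-squares slope and intercept can be computed directly. `fit_line` and the data are hypothetical; a course would normally use statistical software for this.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented data lying exactly on y = 1 + 2x.
intercept, slope = fit_line([1, 2, 3], [3, 5, 7])
predicted = intercept + slope * 4  # prediction for a new x value
print(intercept, slope, predicted)
```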
Indicator variable
A binary variable coded as 0 or 1 representing group membership
Interaction variable
A variable created by multiplying predictors to test whether effects differ across groups
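A small sketch of how an indicator and an interaction column might be built from raw data. The rows, the `porsche` and `mileage_porsche` column names, and the comparison make are assumptions chosen for illustration.

```python
# Invented rows; "Jaguar" stands in for whatever the baseline group is.
cars = [
    {"make": "Porsche", "mileage": 20.0},
    {"make": "Jaguar", "mileage": 35.0},
    {"make": "Porsche", "mileage": 50.0},
]

for car in cars:
    # Indicator: 1 if the car belongs to the group, 0 otherwise.
    car["porsche"] = 1 if car["make"] == "Porsche" else 0
    # Interaction: mileage multiplied by the indicator, so it equals
    # mileage for Porsches and 0 for the baseline group.
    car["mileage_porsche"] = car["mileage"] * car["porsche"]

print([car["mileage_porsche"] for car in cars])
```

In a regression with both `mileage` and `mileage_porsche` as predictors, the interaction coefficient is the difference in the mileage slope between the two groups.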
Supervised classification
Modeling with a known categorical response variable
Supervised regression
Modeling with a known numeric response variable
Unsupervised classification
Grouping or clustering data without a known response variable
Unsupervised regression
Finding structure or patterns in numeric data without labeled outcomes
Consequentialist theories
Evaluate actions based on outcomes and consequences
Deontological theories
Focus on duties, rights, fairness, and justice
Utilitarianism
A form of consequentialism focused on maximizing overall benefit
Virtue theories
Focus on integrity, character, and moral responsibility
R squared
The proportion of variability in the response explained by the predictors
R squared interpretation
Measures explanatory power, not prediction accuracy
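One common way to compute R squared is one minus the ratio of unexplained to total variability. The sketch below assumes predictions from a least-squares fit with an intercept; `r_squared` is a hypothetical helper name.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SSE / SST."""
    mean_obs = sum(observed) / len(observed)
    # SSE: variability left over after the model's predictions.
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # SST: total variability of the response around its mean.
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - sse / sst

perfect = r_squared([1, 2, 3], [1, 2, 3])    # predictions match exactly
mean_only = r_squared([1, 2, 3], [2, 2, 2])  # always predicting the mean
print(perfect, mean_only)
```

Predicting the mean for every case explains none of the variability, which is why that case sits at the bottom of the scale.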
Residual
Observed value minus predicted value
Residual interpretation
Positive residual means underprediction; negative residual means overprediction
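A short sketch of the sign convention, using invented observed and predicted values. `describe` is a hypothetical helper.

```python
observed = [30.0, 22.0, 15.0]
predicted = [27.5, 22.0, 18.0]

# Residual = observed value minus predicted value.
residuals = [o - p for o, p in zip(observed, predicted)]

def describe(residual):
    """Label a residual by whether the model guessed too low or too high."""
    if residual > 0:
        return "underprediction"  # observed is above the prediction
    if residual < 0:
        return "overprediction"   # observed is below the prediction
    return "exact"

labels = [describe(r) for r in residuals]
print(list(zip(residuals, labels)))
```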
P-value
Probability of observing results at least as extreme as the data, assuming the null hypothesis is true
P-value interpretation
Small p-value indicates statistical significance; large p-value indicates insufficient evidence to reject the null hypothesis
Mileage_Porsche coefficient interpretation
Tests whether mileage affects price differently for Porsches compared to the baseline group
Non-significant interaction term
Indicates no statistical evidence that effects differ across groups
Modeling as a socio-technical loop
Models influence society, and societal decisions influence future data and models
RMSE
Measures the typical size of prediction errors in the same units as the response
RMSE interpretation
Lower RMSE indicates better predictive accuracy
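A minimal sketch of RMSE as the square root of the mean squared residual. This assumes the divide-by-n version; some courses and software divide by the residual degrees of freedom instead. `rmse` is a hypothetical helper name.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error, in the same units as the response."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Invented values: two candidate models predicting the same responses.
observed = [10.0, 12.0, 14.0]
model_a = [10.5, 11.5, 14.5]
model_b = [8.0, 14.0, 16.0]

rmse_a = rmse(observed, model_a)
rmse_b = rmse(observed, model_b)
print(rmse_a, rmse_b)  # the smaller value marks the more accurate predictions
```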
F-test
Evaluates whether a regression model is useful overall
F-test large value interpretation
Large F-statistic with small p-value gives evidence that at least one predictor has a non-zero effect
F-test small value interpretation
Small F-statistic with large p-value indicates the model is not useful overall
Residuals vs fitted plot
A plot used to check whether prediction errors behave randomly
Good residuals vs fitted pattern
Errors appear random and unrelated to predictions, indicating assumptions are reasonable
Bad residuals vs fitted pattern
Errors show systematic behavior (for example a curve or a funnel shape), indicating the model missed structure or violated assumptions
QQ plot
Used to assess whether residuals are approximately normally distributed
Bootstrapping
Resampling with replacement from observed data
Purpose of bootstrapping
Used to estimate uncertainty when distributional assumptions may not hold
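A sketch of bootstrapping the sample mean with a percentile interval, one of several possible bootstrap intervals. The data, seed, number of resamples, and the helper names `bootstrap_means` and `percentile_interval` are all assumptions for illustration.

```python
import random
import statistics

def bootstrap_means(data, n_boot=2000, seed=0):
    """Resample with replacement n_boot times and record each resample's mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot)]

def percentile_interval(values, level=0.95):
    """Take the middle `level` share of the bootstrap statistics."""
    ordered = sorted(values)
    alpha = (1 - level) / 2
    lower = ordered[int(alpha * len(ordered))]
    upper = ordered[int((1 - alpha) * len(ordered)) - 1]
    return lower, upper

# Invented sample.
sample = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0, 18.0, 13.0]
means = bootstrap_means(sample)
lower, upper = percentile_interval(means)
print(lower, upper)
```

The spread of the resampled means stands in for the sampling variability, so no normality assumption about the data is needed to get the interval.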