Flashcards covering key concepts from STAT 252 Week 8 lecture notes, focusing on categorical variables, indicator variables, interaction terms, and model building in regression analysis.
Indicator Variables in Regression
Categorical variables (factors) must be converted into numeric form using indicator (dummy) variables so they can be included in a regression model to estimate separate intercepts for each group.
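As a rough illustration of the conversion, the sketch below builds 0/1 indicator columns by hand. The helper name `indicator_columns`, the choice of Adelie as the reference level, and the five sample labels are all assumptions made for this example; statistical software normally does this step automatically.

```python
def indicator_columns(values, reference):
    """Return one 0/1 column per level, skipping the reference level."""
    levels = sorted(set(values) - {reference})
    return {
        level: [1 if v == level else 0 for v in values]
        for level in levels
    }

# Made-up species labels, used only to show the encoding.
species = ["Adelie", "Chinstrap", "Gentoo", "Adelie", "Gentoo"]
dummies = indicator_columns(species, reference="Adelie")

for level, column in dummies.items():
    print(level, column)
```

The reference level gets no column of its own: a row with zeros in every indicator column belongs to the reference group, which is why the other coefficients are read as differences from that group.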
Effect of Indicator Variables
Indicator variables shift the intercept, enabling the model to capture group-level differences while maintaining the same slope across categories (unless interaction terms are added).
Interactions in Regression
Interactions allow slopes to vary across groups, capturing cases where the effect of one variable depends on the level of another.
Types of Interaction Terms
Quantitative × categorical interactions model different slopes for different groups, while quantitative × quantitative interactions capture how one continuous predictor moderates the effect of another.
Centering Variables
Centering variables (subtracting the mean) helps interpret the intercept meaningfully and reduces multicollinearity in models with interaction or polynomial terms.
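A small numeric check can make the multicollinearity point concrete. The sketch below compares the correlation between a predictor and its square before and after centering. The values in `x` are made up for illustration, and `pearson` is a hand-written helper, not a library function.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

# Made-up bill lengths in mm, evenly spaced for simplicity.
x = [35.0, 38.0, 41.0, 44.0, 47.0, 50.0]
x_mean = sum(x) / len(x)
x_centered = [v - x_mean for v in x]

# Correlation between a predictor and its square, raw vs. centered.
r_raw = pearson(x, [v ** 2 for v in x])
r_centered = pearson(x_centered, [v ** 2 for v in x_centered])

print(f"raw: {r_raw:.3f}, centered: {r_centered:.3f}")
```

With these symmetric values the centered correlation drops to zero, while the raw correlation is close to 1. Real data are rarely perfectly symmetric, so centering typically reduces the correlation rather than removing it.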
Indicator Predictors
Converting a categorical variable into numeric indicator (dummy) variables allows the regression model to estimate separate effects for each group, capturing how group membership influences the response.
What is the same in the equations for each species when using indicator variables?
They all share the same slope.
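Written out, a parallel-lines model for the penguin example might look like the following. The symbols $b_0, \dots, b_3$ and the indicator notation are chosen here for illustration, and Adelie is assumed to be the reference species.

```latex
\begin{align*}
\widehat{\text{mass}} &= b_0 + b_1\,\text{bill} + b_2\,I_{\text{Chinstrap}} + b_3\,I_{\text{Gentoo}} \\
\text{Adelie:}\quad \widehat{\text{mass}} &= b_0 + b_1\,\text{bill} \\
\text{Chinstrap:}\quad \widehat{\text{mass}} &= (b_0 + b_2) + b_1\,\text{bill} \\
\text{Gentoo:}\quad \widehat{\text{mass}} &= (b_0 + b_3) + b_1\,\text{bill}
\end{align*}
```

Only the intercept changes from species to species; the bill length slope $b_1$ appears unchanged in every equation.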
What is a potential model improvement when examining the relationship between bill length and body mass across different penguin species?
It may be useful to build a new model that allows for differing slopes for each species.
What does the coefficient for speciesChinstrap represent in the model output?
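The estimated difference in intercept (baseline body mass) between Chinstrap and Adelie penguins, where Adelie is the reference species. With no interaction terms in the model, it is the estimated difference in mean body mass between the two species at any fixed bill length.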
What question can be tested by incorporating interaction terms between a quantitative predictor and a categorical predictor?
Whether the relationship between bill length and body mass is the same across all penguin species. Interaction terms allow that relationship (the slope) to differ between species.
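One way to write such a model is sketched below. The coefficient symbols $b_0, \dots, b_5$ and the shorthand $I_C$, $I_G$ for the Chinstrap and Gentoo indicators are notation assumed for this example, with Adelie as the reference species.

```latex
\begin{align*}
\widehat{\text{mass}} &= b_0 + b_1\,\text{bill} + b_2\,I_C + b_3\,I_G
  + b_4\,(\text{bill} \times I_C) + b_5\,(\text{bill} \times I_G) \\
\text{Adelie slope} &= b_1, \qquad
\text{Chinstrap slope} = b_1 + b_4, \qquad
\text{Gentoo slope} = b_1 + b_5
\end{align*}
```

In this notation, asking whether the relationship is the same across species amounts to asking whether $b_4$ and $b_5$ are both zero.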
What information can be gathered from visualizing the data before building a model with interaction terms?
The approximate range of bill length values observed for Adelie penguins, read from the plot of body mass vs. bill length.
How to write the regression equation for Adelie penguins
Because Adelie is the reference species, its indicator terms are zero, so the equation contains only the intercept and the bill length slope. The intercept is the predicted body mass when bill length is zero, and the slope is the predicted change in body mass for each 1 mm increase in bill length.
What is used to determine whether there is a statistically significant difference in slope?
Whether the interaction term between speciesGentoo and billlengthmm is statistically significant. A significant interaction term indicates that the slope for Gentoo differs from the slope for the reference species.
What assumptions should be verified when predicting body mass for a Chinstrap penguin with a bill length of 48 mm?
Assumptions such as linearity, independence, homoscedasticity, and normality of residuals should be checked. The predicted value represents the estimated body mass for a Chinstrap penguin with a bill length of 48 mm, given the model.
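The arithmetic behind such a prediction can be sketched as follows, assuming the model includes species indicators and bill length × species interactions with Adelie as the reference. The function `predict_mass`, the dictionary keys, and every coefficient value are hypothetical placeholders, not estimates from the lecture or from any fitted model.

```python
def predict_mass(bill_mm, species, coef):
    """Predicted body mass from an interaction model (Adelie = reference)."""
    is_chinstrap = 1 if species == "Chinstrap" else 0
    is_gentoo = 1 if species == "Gentoo" else 0
    return (
        coef["intercept"]
        + coef["bill"] * bill_mm
        + coef["chinstrap"] * is_chinstrap
        + coef["gentoo"] * is_gentoo
        + coef["bill_x_chinstrap"] * bill_mm * is_chinstrap
        + coef["bill_x_gentoo"] * bill_mm * is_gentoo
    )

# Placeholder coefficients, invented purely to show the calculation.
coef = {
    "intercept": 100.0,
    "bill": 90.0,
    "chinstrap": 800.0,
    "gentoo": -150.0,
    "bill_x_chinstrap": -30.0,
    "bill_x_gentoo": 20.0,
}

pred = predict_mass(48, "Chinstrap", coef)
print(pred)  # (100 + 800) + (90 - 30) * 48 = 3780.0
```

For a Chinstrap penguin, the Chinstrap indicator adjusts the intercept and the Chinstrap interaction adjusts the slope, while both Gentoo terms drop out.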