STA Class 4: Interactions Between Variables

  • An interaction is an additional term in a regression model that allows the slope of one variable to depend on the value of another

  • An interaction is modeled as the product of two variables

  • Non-interacted coefficients are called the main effects

  • In a model with an interaction term X1 * X2, you must also keep the main effects: the variables that are being interacted together

  • In a model with an interaction term X1 * X2, the main effect of X1 represents the predicted change in Y for a 1-unit increase in X1 when X2 is held constant at 0.

  • Interactions make a model harder to interpret, so include one only when it substantially improves the fit, e.g. a clear increase in R-squared (adjusted R-squared is the safer measure, since plain R-squared never decreases when a term is added)

  • Choose interactions by thinking about what you are trying to model: if you suspect the impact of one variable depends on the value of another, try an interaction term between them

  • Interactions are not the same as correlations between predictors

    • They are not about one X variable affecting another X variable. Instead, they model a situation where the relationship between one predictor and Y differs depending on the value of another X variable

  • If the effect of one variable (its slope) depends on the value of another, then we need an interaction.

  • Interactions serve as slope-modifiers

  • By the hierarchical principle, if we choose to include an interaction, we must also include the main effects that make up that interaction, even if they are not statistically significant on their own
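
Worked example (a sketch, not from the class materials): the points above can be illustrated by fitting Y = b0 + b1*X1 + b2*X2 + b3*(X1*X2) to simulated data. In this model the slope of X1 is b1 + b3*X2, so b1 alone is the slope only when X2 = 0, and b3 is the "slope-modifier". The data, the true coefficient values, and the helper names fit_interaction and slope_of_x1 are all made up for illustration; only NumPy is assumed.

```python
import numpy as np


def fit_interaction(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + b3*(x1*x2).

    Both main effects are kept alongside the product term,
    following the hierarchical principle.
    """
    X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # array [b0, b1, b2, b3]


def slope_of_x1(coef, x2_value):
    """Predicted change in y per 1-unit increase in x1 at a given x2."""
    return coef[1] + coef[3] * x2_value


# Simulated data with made-up true coefficients (illustration only).
rng = np.random.default_rng(0)
n = 500
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 5, n)
y = 2.0 + 1.5 * x1 - 0.5 * x2 + 0.8 * x1 * x2 + rng.normal(0, 1, n)

coef = fit_interaction(x1, x2, y)
print("b0, b1, b2, b3 =", np.round(coef, 2))

# The slope of x1 is not a single number: it changes with x2.
for v in (0.0, 2.0, 4.0):
    print(f"slope of x1 when x2 = {v}: {slope_of_x1(coef, v):.2f}")
```

At x2 = 0 the printed slope equals the main-effect coefficient b1; at other values of x2 it shifts by b3 times x2, which is what "the slope of one variable depends on the value of another" means in practice.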