An interaction is an additional term in a regression model that allows the slope of one variable to depend on the value of another
An interaction is modeled as the product of two variables
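Written out for two predictors, the product form might look like the following. The coefficient symbols \beta_0 through \beta_3 and the error term \varepsilon are notation assumed here for illustration, not taken from these notes.

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \times X_2) + \varepsilon
  = \beta_0 + (\beta_1 + \beta_3 X_2)\, X_1 + \beta_2 X_2 + \varepsilon
```

Grouping the X_1 terms in the second line shows that the slope of X_1 is \beta_1 + \beta_3 X_2, so it changes with the value of X_2.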
Non-interacted coefficients are called the main effects
In a model with an interaction term X1 * X2, you must also keep the main effects: the variables that are being interacted together
In a model with an interaction term X1 * X2, the main effect of X1 represents the predicted change in Y for a 1-unit increase in X1 when X2 is held constant at 0.
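A small sketch of this interpretation, assuming made-up coefficients (B0 through B3 are hypothetical values, not estimates from any data set). It checks that a 1-unit increase in X1 changes the prediction by exactly the main effect only when X2 is 0.

```python
# Hypothetical fitted coefficients, chosen only for illustration.
B0, B1, B2, B3 = 10.0, 2.0, 3.0, 0.5


def predict(x1: float, x2: float) -> float:
    """Prediction from a model with both main effects and an X1 * X2 interaction."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2


def effect_of_x1(x2: float) -> float:
    """Change in predicted Y for a 1-unit increase in X1 at a fixed value of X2."""
    return predict(1.0, x2) - predict(0.0, x2)


print(effect_of_x1(0.0))  # equals B1, the main effect of X1
print(effect_of_x1(4.0))  # equals B1 + B3 * 4, a different slope
```

At any other value of X2 the effect of X1 is B1 + B3 * X2, so the main effect alone describes X1 only at X2 = 0.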
Interactions make a model more complex to analyze, so an interaction is only worth including when it gives a substantial bump in R-squared
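One way to check the size of that bump is to fit the model with and without the interaction and compare R-squared. The sketch below assumes NumPy and uses synthetic data generated only for illustration; the variable names and true coefficients are hypothetical.

```python
import numpy as np

# Synthetic data for illustration only: the true slope of x1 changes with x2,
# so an interaction is present by construction.
rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + 0.5 * x1 * x2 + rng.normal(0, 1, n)


def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """Fit ordinary least squares and return the in-sample R-squared."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


intercept = np.ones(n)
X_main = np.column_stack([intercept, x1, x2])            # main effects only
X_inter = np.column_stack([intercept, x1, x2, x1 * x2])  # main effects plus interaction

r2_main = r_squared(X_main, y)
r2_inter = r_squared(X_inter, y)
print(f"main effects only: {r2_main:.3f}")
print(f"with interaction:  {r2_inter:.3f}")
```

In-sample R-squared cannot go down when a term is added, so the question is whether the gain is large enough to justify the extra complexity.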
Choose interactions by thinking about what you are trying to model: if you suspect the impact of one variable depends on the value of another, try an interaction term between them
Interactions are not the same as correlations between predictors
They are not about one X variable affecting another X variable. Instead, they model a situation where the relationship between one predictor variable and Y differs depending on the value of another X variable
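The difference can be seen in a toy data set, built here purely for illustration: the two predictors are fully crossed, so their correlation is zero, yet the slope of Y on X1 still depends on X2. The formula used to generate Y is a hypothetical choice.

```python
from itertools import product
from statistics import mean

# Fully crossed design: every X1 level appears with every X2 level,
# so X1 and X2 are uncorrelated by construction.
x1_levels = [0, 1, 2, 3]
x2_levels = [0, 1]
rows = [(x1, x2, 1 + 2 * x1 + 3 * x1 * x2) for x1, x2 in product(x1_levels, x2_levels)]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)


r_x1_x2 = pearson([r[0] for r in rows], [r[1] for r in rows])

# Slope of Y on X1, computed separately within each level of X2.
slope_when_x2_0 = slope([r[0] for r in rows if r[1] == 0], [r[2] for r in rows if r[1] == 0])
slope_when_x2_1 = slope([r[0] for r in rows if r[1] == 1], [r[2] for r in rows if r[1] == 1])

print("correlation of X1 and X2:", r_x1_x2)
print("slope of Y on X1 when X2 = 0:", slope_when_x2_0)
print("slope of Y on X1 when X2 = 1:", slope_when_x2_1)
```

The predictors carry no information about each other, but the X1 slope differs between the two X2 groups, which is what an interaction term captures.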
If the effect of one variable (its slope) depends on the value of another, then we need an interaction.
Interactions serve as slope-modifiers
By the hierarchical principle, if we choose to include an interaction, we must also include the main effects that make up that interaction, even if they are not statistically significant on their own