Visually inspect the data to see if the relationship appears linear.
Use Pearson's correlation coefficient r to quantify the linear relationship.
Formula for Pearson's correlation coefficient (will be provided in exams):
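The standard sample form, for n paired observations (x_i, y_i) with sample means \bar{x} and \bar{y}, is written out here for reference. The formula sheet provided in exams may use an algebraically equivalent form.

```latex
\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]
```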
r ranges from -1 to +1.
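Caveat: Pearson's correlation coefficient measures only linear association, so very different patterns (for example a curved relationship or a trend driven by a single outlier) can produce the same r.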
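A small sketch of this caveat, using R's built-in anscombe dataset (not part of these notes; chosen here only for illustration). Its four x/y pairs have nearly identical Pearson correlations, all close to 0.82, yet look very different when plotted.

```r
# Anscombe's quartet ships with base R (datasets package).
data(anscombe)

# Pearson's r for each of the four x/y pairs (cor() defaults to Pearson).
r_values <- sapply(1:4, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
})
print(round(r_values, 3))

# Plot the four pairs in a 2 x 2 grid to compare their shapes.
par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i))
}
par(mfrow = c(1, 1))  # restore the default single-panel layout
```

This is why the visual inspection step comes before computing r.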
lm function for simple and multiple linear regression.
lm(y ~ x, data = galton_data)
y is the response variable (child height).
x is the predictor variable (average parent height).
galton_data is the dataset.
par(mfrow = c(2, 2)) to arrange plots.
plot(model) function to generate diagnostic plots.
ggfortify package: Provides prettier plots but lacks Cook's distance lines.
performance package: Provides instructions for each assumption plot.
performance package.
anova(model)
summary(model) to get detailed regression output.
predict(model, newdata = data.frame(parent = 70))
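The calls above can be strung together into one minimal workflow sketch. The real galton_data is not reproduced in these notes, so the data below is simulated as a stand-in. The column name parent is taken from the predict call, while child is an assumed name for the response.

```r
# Hypothetical stand-in for galton_data: simulated heights in inches.
# The real dataset and its column names may differ.
set.seed(1)
parent <- rnorm(200, mean = 68, sd = 2)
child  <- 24 + 0.65 * parent + rnorm(200, sd = 2)
galton_data <- data.frame(parent = parent, child = child)

# Fit a simple linear regression of child height on parent height.
model <- lm(child ~ parent, data = galton_data)

# Show the four base-R diagnostic plots in a 2 x 2 grid.
par(mfrow = c(2, 2))
plot(model)
par(mfrow = c(1, 1))  # restore the default layout

# Coefficient table, R-squared, and residual standard error.
print(summary(model))

# Analysis-of-variance table for the fitted model.
print(anova(model))

# Predicted child height for an average parent height of 70.
pred <- predict(model, newdata = data.frame(parent = 70))
print(pred)
```

The newdata data frame must use the same predictor column name as the model formula (here parent), otherwise predict cannot match the variable.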