Multiple Linear Regression

  • Examines the relationship between one dependent variable and multiple independent variables.

Benefits Over Simple Regression

  • Incorporates multiple predictors for combined effects.

  • Improved predictive accuracy through simultaneous consideration of predictors.

  • Controls for confounding variables, isolating unique contributions.

  • Assesses relative importance of each predictor variable.

Assumptions Pre and Post-Test

  • Pre-Test: 1) Linear relationship between each predictor and the outcome; 2) No multicollinearity (pairwise correlations between predictors below .80).

  • Post-Test: Multivariate normality of residuals; homoscedasticity of residuals.

Example Scenario: Real Estate

  • Dependent Variable: House prices

  • Independent Variables: Size, number of bedrooms, bathrooms, distance from city centre, violent crime rate.
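The scenario above can be sketched in code. This is a minimal illustration, not a real dataset: the house prices are simulated from made-up coefficients for size, bedrooms, and distance, and ordinary least squares is then used to recover them.

```python
import numpy as np

# Hypothetical illustration: simulate house prices from three predictors,
# then recover the coefficients with ordinary least squares (np.linalg.lstsq).
rng = np.random.default_rng(0)
n = 500
size = rng.uniform(50, 250, n)        # square metres
bedrooms = rng.integers(1, 6, n)      # count
distance = rng.uniform(0, 30, n)      # km from city centre

# Assumed "true" model (invented coefficients, for illustration only)
price = (50_000 + 1_200 * size + 15_000 * bedrooms - 2_000 * distance
         + rng.normal(0, 10_000, n))

# Design matrix with an intercept column, then fit by least squares
X = np.column_stack([np.ones(n), size, bedrooms, distance])
coef, *_ = np.linalg.lstsq(X, price, rcond=None)
print(coef)  # estimated intercept, size, bedrooms, distance effects
```

With enough data, the estimated coefficients land close to the values used to simulate the prices; a statistics package such as statsmodels would additionally report standard errors and p-values for each estimate.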

Regression Statistics Overview

  • Sample Data: Shows different predictors and their influence on house prices.

Key Assumptions of Regression

  • Variables should be continuous; each predictor-outcome pair should show a linear, homoscedastic relationship.

Multiple Regression Model Creation

  • Focused on factors like house size, number of bedrooms, etc.

  • Importance of checking for multicollinearity via correlation matrices.
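The multicollinearity pre-test can be sketched as a correlation-matrix scan. The data here are simulated (bedrooms is deliberately constructed to track size) and the .80 cut-off is the rule of thumb from the notes, not a universal threshold.

```python
import numpy as np

# Sketch of the pre-test: flag predictor pairs whose absolute pairwise
# correlation reaches the .80 rule of thumb.
rng = np.random.default_rng(1)
n = 300
size = rng.uniform(50, 250, n)
bedrooms = size / 40 + rng.normal(0, 0.5, n)  # deliberately tied to size
distance = rng.uniform(0, 30, n)

names = ["size", "bedrooms", "distance"]
X = np.column_stack([size, bedrooms, distance])
corr = np.corrcoef(X, rowvar=False)  # predictors-by-predictors matrix

for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if abs(corr[i, j]) >= 0.80:
            print(f"possible multicollinearity: {names[i]} vs {names[j]} "
                  f"(r = {corr[i, j]:.2f})")
```

Here only the size/bedrooms pair is flagged; in practice one of the two would be dropped, or a variance inflation factor (VIF) check would be used as a follow-up.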

Important Metrics

  • Adjusted R-squared: Proportion of variance in the dependent variable explained by the predictors, adjusted for the number of predictors.

  • Coefficients: Estimated change in the dependent variable per one-unit change in a predictor, holding the other predictors constant.

  • Significance Levels (p-values): Determine statistical significance of coefficients.
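The adjustment behind adjusted R-squared is a short formula and can be shown directly. The inputs below (R² = 0.75, n = 100, k = 4) are arbitrary example values.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared: penalises R-squared for the number of
    predictors, so adding a useless variable does not inflate the score.
    n = sample size, k = number of predictors (excluding the intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Example values: raw R-squared of 0.75 with 100 cases and 4 predictors
print(adjusted_r2(0.75, n=100, k=4))  # slightly below the raw 0.75
```

Note that the penalty grows with k relative to n, which is why adjusted R-squared is preferred over raw R-squared when comparing models with different numbers of predictors.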

Homoscedasticity and Normality

  • Residuals should show consistent variance (homoscedasticity) and normal distribution.
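The two post-test checks can be sketched numerically on a toy fit. The split-by-fitted-values spread comparison and the skew/kurtosis summary below are informal stand-ins for the usual residual plots and formal tests (e.g. Breusch-Pagan, Shapiro-Wilk).

```python
import numpy as np

# Toy model with well-behaved errors, fit by ordinary least squares
rng = np.random.default_rng(2)
n = 400
x = rng.uniform(0, 10, n)
y = 3 + 2 * x + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
resid = y - fitted

# Homoscedasticity: residual spread in the lower vs upper half of the
# fitted values; similar values suggest roughly constant variance.
order = np.argsort(fitted)
lo, hi = resid[order[: n // 2]], resid[order[n // 2 :]]
print(f"spread low/high: {lo.std():.2f} / {hi.std():.2f}")

# Normality: skewness near 0 and excess kurtosis near 0 are consistent
# with approximately normally distributed residuals.
z = (resid - resid.mean()) / resid.std()
print(f"skew: {np.mean(z**3):.2f}, excess kurtosis: {np.mean(z**4) - 3:.2f}")
```

If the two spreads diverged markedly, or the skew/kurtosis were far from zero, the model's standard errors and p-values would be suspect.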

Key Findings from Model

  • Significant predictors: Number of bedrooms, distance to city.

  • Non-significant predictors: House size, number of bathrooms, violent crime rate.

Takeaways

  • Multiple regression assesses effects of multiple variables on one dependent variable.

  • Useful for prediction, but does not confirm causality due to possible omitted variables.