In-Depth Notes on Regression Analysis and Model Evaluation
Understanding Regression and Model Evaluation
Key Concepts of Regression
- Regression: A method for predicting an outcome from one or more predictors, with the line of best fit estimated by the method of least squares.
- General Equation: ŷ = b0 + b1X
- ŷ = Predicted score on the outcome
- X = Score on the predictor variable
- b0 = Intercept
- b1 = Slope
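As a minimal sketch of how least squares produces the intercept and slope, the following computes b0 and b1 by hand for a single predictor. The data (`hours`, `score`) and the function name `fit_simple_regression` are hypothetical and chosen only for illustration.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squared deviations of the predictor
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

# Hypothetical data: hours studied and exam score
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_regression(hours, score)
predicted = [b0 + b1 * x for x in hours]  # y_hat for each case
# For this data, b0 is about 2.2 and b1 is about 0.6
```

Each predicted value is the intercept plus the slope times that case's predictor score.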
Evaluating the Model
- To assess model effectiveness:
- Goodness of Fit
- R²: Proportion of variance in the outcome explained by the regression model.
- Statistical Significance
- F-test: Tests whether the model predicts the outcome significantly better than a model with no predictors (the null hypothesis is that it does not).
Goodness of Fit Metrics
- Understand how well the model fits by examining:
- Correlation and Determination Coefficients (R and R²):
- For bivariate regression:
- R: Correlation between the predictor and the outcome
- R²: Proportion of variability in the outcome accounted for by the model
- Variability is partitioned into sums of squares, with SST = SSM + SSR and R² = SSM / SST:
- Total Sum of Squares (SST): Variability of Y around its mean
- Model Sum of Squares (SSM): Variability explained by the model
- Residual Sum of Squares (SSR): Unexplained variability
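The partition of variability can be checked numerically. The sketch below, using the same hypothetical data as an illustration, computes the total, explained, and unexplained sums of squares (named `ss_total`, `ss_model`, and `ss_residual` here to avoid ambiguity) and derives R² from them.

```python
from statistics import mean

# Hypothetical data: hours studied and exam score
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(hours), mean(score)
sxx = sum((x - x_bar) ** 2 for x in hours)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, score))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
predicted = [b0 + b1 * x for x in hours]

# Total variability of the outcome around its mean
ss_total = sum((y - y_bar) ** 2 for y in score)
# Variability explained by the model (predictions around the mean)
ss_model = sum((p - y_bar) ** 2 for p in predicted)
# Unexplained variability (observed scores around the predictions)
ss_residual = sum((y - p) ** 2 for y, p in zip(score, predicted))

r_squared = ss_model / ss_total
# For this data: ss_total = 6, ss_model ≈ 3.6, ss_residual ≈ 2.4, r_squared ≈ 0.6
```

The explained and unexplained parts add up to the total, so R² can equally be written as 1 minus the unexplained share.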
Evaluating Statistical Significance
- The significance of the contribution of predictors to the model is determined using:
- F statistic
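- F is the ratio of explained to unexplained variance per degree of freedom (mean square for the model divided by mean square for the residuals). A larger F indicates a model that explains more than would be expected by chance, while an F close to 1 suggests a lack of predictive power.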
- t-tests for individual predictors help determine their contribution:
- Each t statistic tests whether that predictor's coefficient (b) differs significantly from zero.
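A rough sketch of how the F and t statistics are computed for a one-predictor model, again on hypothetical data. It stops at the test statistics; the p-values reported by software come from comparing them against the F and t distributions with the degrees of freedom shown.

```python
from math import sqrt
from statistics import mean

# Hypothetical data: hours studied and exam score
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
n, k = len(hours), 1  # sample size and number of predictors

x_bar, y_bar = mean(hours), mean(score)
sxx = sum((x - x_bar) ** 2 for x in hours)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, score))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
predicted = [b0 + b1 * x for x in hours]

ss_model = sum((p - y_bar) ** 2 for p in predicted)
ss_residual = sum((y - p) ** 2 for y, p in zip(score, predicted))

# Mean squares: sums of squares divided by their degrees of freedom
ms_model = ss_model / k
ms_residual = ss_residual / (n - k - 1)

f_stat = ms_model / ms_residual  # overall model test, df = (k, n - k - 1)

# t test for the slope: coefficient divided by its standard error
se_b1 = sqrt(ms_residual / sxx)
t_b1 = b1 / se_b1                # df = n - k - 1
# For this data: f_stat ≈ 4.5 and t_b1 ≈ 2.12
```

With a single predictor the two tests agree exactly: the squared t statistic for the slope equals F.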
Regression Analysis Using Software (jamovi)
- Model Fit Measures:
- Overall Model Test: Assess overall fit with R² and adjusted R², and the significance of the model with the F-test, as reported in the jamovi output.
- Significance and Coefficients: Each predictor will show both its coefficient and the associated statistical significance (p-value).
Multiple Regression Concepts
- Multiple Regression involves predicting an outcome from two or more predictors. It allows:
- Assessing Total Variability: How much total variability in Y is accounted for by the predictors.
- Comparing Models: Assess how adding predictors improves model fit (change in R²).
- Unique Contribution Assessment: Through standardized (beta) coefficients and a significance test for each predictor.
Model Comparison and Variable Inclusion
- Comparing successive regression models to evaluate how additional variables contribute:
- Model A: Performance predicted by one predictor.
- Model B: Performance predicted by adding another variable.
- Use the change in R² (alongside adjusted R², which penalizes extra predictors) and the F-change test to judge whether the added variables improve the model.
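The comparison of successive models can be sketched from the R² values that the software reports. The function names and the example values (n = 100, R² of .30 for Model A and .36 for Model B) are hypothetical; the formulas are the standard ones for adjusted R² and the F-change test for nested models.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R² for a model with k predictors fitted to n cases."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_change(r2_a, k_a, r2_b, k_b, n):
    """F-change when Model B adds predictors to the nested Model A.

    Degrees of freedom are (k_b - k_a, n - k_b - 1).
    """
    explained_gain = (r2_b - r2_a) / (k_b - k_a)
    unexplained = (1 - r2_b) / (n - k_b - 1)
    return explained_gain / unexplained

# Hypothetical results: Model A has one predictor, Model B adds a second
n = 100
r2_a, k_a = 0.30, 1
r2_b, k_b = 0.36, 2

r2_change = r2_b - r2_a
adj_a = adjusted_r_squared(r2_a, n, k_a)
adj_b = adjusted_r_squared(r2_b, n, k_b)
f_ch = f_change(r2_a, k_a, r2_b, k_b, n)
# For these values: f_ch ≈ 9.09 on (1, 97) degrees of freedom
```

A large F-change relative to its degrees of freedom suggests the added variable explains variance beyond what Model A already captured.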
Important Considerations in Multiple Regression
- Regression helps identify relationships but does not establish causality. Keep in mind:
- Variable Selection: Variables entered should be based on evidence and theory.
- Sample Size Considerations: More predictors generally require a larger sample; a common target is statistical power of .80.
- Researchers must balance comprehensive models against the principle of parsimony (simplicity).
Partial and Semi-Partial Correlations
- Partial Correlations evaluate the relationship between two variables while controlling for the influence of one or more other variables.
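- Semi-Partial (Part) Correlations evaluate the relationship between a predictor and the outcome after removing the influence of the other predictors from that predictor only, not from the outcome.
- Useful for determining the unique variance a predictor explains when predictors are correlated with one another: the squared semi-partial correlation is the proportion of total outcome variance uniquely explained by that predictor.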
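For the two-predictor case, both correlations can be computed directly from the three pairwise correlations. The sketch below uses the standard first-order formulas; the function names and the correlation values are hypothetical.

```python
from math import sqrt

def partial_corr(r_y1, r_y2, r_12):
    """Correlation of Y with X1, with X2 removed from both Y and X1."""
    return (r_y1 - r_y2 * r_12) / sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))

def semipartial_corr(r_y1, r_y2, r_12):
    """Correlation of Y with X1, with X2 removed from X1 only."""
    return (r_y1 - r_y2 * r_12) / sqrt(1 - r_12 ** 2)

# Hypothetical pairwise correlations
r_y1, r_y2, r_12 = 0.5, 0.4, 0.3

pr = partial_corr(r_y1, r_y2, r_12)       # about 0.43
sr = semipartial_corr(r_y1, r_y2, r_12)   # about 0.40

# The squared semi-partial equals the gain in R² from adding X1
# to a model that already contains X2.
r2_both = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
r2_gain = r2_both - r_y2 ** 2
```

The semi-partial is never larger in absolute value than the partial, because its denominator keeps all of the outcome's variance rather than only the part left after controlling for the other predictor.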
Application in Analysis
- When using statistical software (like jamovi) for regression analyses:
- Start with correlation matrices for initial insights.
- Assess overall model fit using the ANOVA (F-test) table together with R².
- Interpret coefficients to understand the impact of each independent variable on the dependent variable after accounting for others.