Regression

Key Concepts of Regression Analysis

  • Regression Analysis: A statistical technique that finds the best-fitting straight line for a set of data and uses that line to predict one variable from another.

  • Linear Relationship Equation: The line is written Y = bX + a, where Y is the dependent variable, X is the independent variable, a is the Y-intercept (the value of Y when X = 0), and b is the slope (the change in Y for each one-unit change in X).

  • Correlation: Measures the strength and direction of the linear relationship between two variables.

  • Regression: Focuses on predicting one variable based on another.

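  • Regression Line: The best-fitting line through the data points; it minimizes the sum of the squared vertical distances (residuals) between the observed Y values and the values predicted by the line.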

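  • Standard Error of Estimate (SEE): Measures prediction accuracy as the typical distance between the observed Y values and the values predicted by the regression line; it plays the same role for the regression line that the standard deviation plays for the mean.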

  • Coefficient of Determination (r²): The proportion of the variability in Y that is explained by its relationship with X; it ranges from 0 to 1, and a higher value indicates better predictability.

  • Simple Linear Regression: Predicts the dependent variable from one independent variable.

  • Multiple Linear Regression: Predicts the dependent variable from two or more independent variables.

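  • Independence of Residuals: Residuals should be uncorrelated with one another; this is commonly checked with the Durbin-Watson test, whose statistic ranges from 0 to 4, with values near 2 suggesting no first-order autocorrelation.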

  • No Multicollinearity: Independent variables should not be highly correlated with one another; this is checked with Tolerance and the Variance Inflation Factor (VIF), where VIF = 1 / Tolerance.

  • Homoscedasticity: Residual variance should be constant across levels of the independent variable(s), or equivalently across the predicted values.

  • Normal Distribution of Residuals: The residuals should follow a normal distribution, assessed with a formal test such as Shapiro-Wilk or visually with a Q-Q plot.
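
To make the definitions above concrete, here is a minimal sketch in plain Python of how the slope, intercept, r², SEE, and Durbin-Watson statistic could be computed for a simple linear regression. The function names and the five data points are illustrative assumptions, not part of these notes, and the SEE uses the usual n - 2 degrees of freedom for a one-predictor model.

```python
import math


def fit_simple_regression(x, y):
    """Least-squares fit of Y = bX + a (hypothetical helper for illustration).

    Returns (b, a, r_squared, see, residuals).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and the sum of cross-products around the means.
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if ss_x == 0 or ss_y == 0:
        raise ValueError("X and Y must each vary")

    b = sp / ss_x              # slope
    a = mean_y - b * mean_x    # Y-intercept

    # Residual = observed Y minus the Y predicted by the line.
    residuals = [yi - (b * xi + a) for xi, yi in zip(x, y)]
    ss_res = sum(e ** 2 for e in residuals)

    r_squared = 1 - ss_res / ss_y        # share of Y variability explained
    see = math.sqrt(ss_res / (n - 2))    # standard error of estimate
    return b, a, r_squared, see, residuals


def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Made-up example data, chosen only so the arithmetic is easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b, a, r2, see, res = fit_simple_regression(x, y)
dw = durbin_watson(res)
print(f"b={b:.3f} a={a:.3f} r2={r2:.3f} SEE={see:.3f} DW={dw:.3f}")
# prints: b=0.600 a=2.200 r2=0.600 SEE=0.894 DW=2.017
```

With these values the fitted line is Y = 0.6X + 2.2, the relationship with X explains 60% of the variability in Y, and a typical prediction misses by about 0.89 units. The Durbin-Watson value is close to 2, though five observations are far too few for the test to be meaningful in practice.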