The Ordinary Least Squares (OLS) estimator is a fundamental tool in econometrics for estimating the relationship between variables.
Fitted Values: The predicted values from the regression model, $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$.
Residuals: The difference between observed values and fitted values.
The residual for observation $i$ is defined as: $$\hat{u}_i = y_i - \hat{y}_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)$$
The goal of OLS is to choose the estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize the sum of squared residuals, $\sum_{i=1}^{n} \hat{u}_i^2$ (see the numerical sketch after this list).
Graphical Interpretation: Minimizing the sum of squared residuals corresponds geometrically to minimizing the squared vertical distances between the data points and the regression line.
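To make the minimization concrete, here is a minimal Python sketch, assuming simulated data (all variable names and parameter values are illustrative, not from the notes), that computes the closed-form OLS estimates together with the fitted values and residuals:

```python
import numpy as np

# Simulate a sample from y = 2.0 + 0.5*x + u (illustrative values).
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1, n)

# Closed-form OLS estimates that minimize the sum of squared residuals:
# slope = sample covariance / sample variance, intercept from the means.
beta1_hat = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
beta0_hat = y.mean() - beta1_hat * x.mean()

y_hat = beta0_hat + beta1_hat * x  # fitted values
u_hat = y - y_hat                  # residuals

print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
print(f"sum of squared residuals = {np.sum(u_hat**2):.3f}")
```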
The estimated regression line or sample regression function (SRF) is the relationship fitted from the sample data: $$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$
where $\hat{y}$ signifies the sample estimate of $\mathbb{E}(y \mid x)$.
Serves as the sample equivalent to the population regression function (PRF).
Important for evaluating how explanatory variables influence the dependent variable.
It is essential to consider non-linear relationships to obtain more accurate and meaningful models.
The course discusses different ways of incorporating non-linearities into multiple linear regression (MLR), including quadratic terms and interaction terms.
Regression of Log Wages on Years of Education:
This formulation allows interpretation in terms of percentage changes.
Example: A 1-year increase in education is associated with approximately an 8.3% increase in wages.
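The percentage interpretation follows directly from the log-level functional form; the 0.083 slope below is the coefficient implied by the 8.3% figure in the example:

$$\log(\text{wage}) = \beta_0 + \beta_1 \, \text{educ} + u \quad \Rightarrow \quad \%\Delta \text{wage} \approx (100 \cdot \beta_1) \, \Delta \text{educ}$$

With $\hat{\beta}_1 = 0.083$, one additional year of education raises the predicted wage by roughly $100 \times 0.083 = 8.3\%$.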
Log-Log Model: Used when analyzing relationships such as CEO salary and firm sales.
It implies a constant elasticity model, where each coefficient measures the percentage change in the dependent variable for a 1% change in the corresponding regressor.
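In symbols, using the CEO salary example (the variable names are illustrative):

$$\log(\text{salary}) = \beta_0 + \beta_1 \log(\text{sales}) + u$$

Here $\beta_1$ is the elasticity of salary with respect to sales: a 1% increase in sales is associated with a $\beta_1$% change in salary.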
The OLS estimators exhibit several key algebraic properties applicable across both simple and multiple regression contexts.
The sample average of the OLS residuals is exactly zero (provided an intercept is included), so the fitted values average to $\bar{y}$ and predictions center on the actual values.
Another important property is that the sample covariance between each independent variable and the OLS residuals is zero; both properties are verified numerically in the sketch below.
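A minimal sketch, assuming simulated data as before (names and values are illustrative), that checks these two algebraic properties numerically:

```python
import numpy as np

# Fit simple OLS on simulated data, then verify the algebraic properties.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.7 * x + rng.normal(0, 1, 200)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)

# Residuals average to zero and are uncorrelated with the regressor
# (both hold exactly, up to floating-point error).
print(np.isclose(u_hat.mean(), 0.0))                    # True
print(np.isclose(np.cov(x, u_hat, ddof=1)[0, 1], 0.0))  # True
```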
Goodness-of-Fit testing is crucial to determine how well the model explains the variance in the dependent variable.
Metrics include the Total Sum of Squares (SST), the Explained Sum of Squares (SSE), and the Residual Sum of Squares (SSR).
R-squared: Measures the fraction of the sample variation in the dependent variable explained by the predictors; caution is required, as a high R-squared does not establish causality.
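The standard definitions behind these metrics, in the notation above:

$$\text{SST} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad \text{SSE} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad \text{SSR} = \sum_{i=1}^{n} \hat{u}_i^2$$

$$\text{SST} = \text{SSE} + \text{SSR}, \qquad R^2 = \frac{\text{SSE}}{\text{SST}} = 1 - \frac{\text{SSR}}{\text{SST}}$$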
OLS coefficients are random variables because they are computed from a random sample: a different sample would produce different estimates.
Unbiasedness: An estimator is unbiased if its expected value equals the true population value.
Efficiency: Under the Gauss-Markov theorem, OLS has the smallest variance among all linear unbiased estimators, making it the Best Linear Unbiased Estimator (BLUE) when assumptions MLR.1 through MLR.5 hold.
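A minimal Monte Carlo sketch, assuming a simulated design (all names and values are illustrative), showing that the OLS slope estimates center on the true parameter across repeated random samples:

```python
import numpy as np

# Draw many random samples from the same population and re-estimate
# the slope each time; unbiasedness means the estimates average out
# to the true value.
rng = np.random.default_rng(42)
true_b1, n, reps = 0.5, 50, 5000
estimates = np.empty(reps)
for r in range(reps):
    x = rng.uniform(0, 10, n)
    u = rng.normal(0, 1, n)        # homoskedastic error with E(u|x) = 0
    y = 2.0 + true_b1 * x + u
    estimates[r] = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

print(f"mean of slope estimates: {estimates.mean():.4f} (true value {true_b1})")
print(f"sampling std. dev. of estimates: {estimates.std(ddof=1):.4f}")
```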
Assumption MLR.1 (Linear in Parameters): The relationship must be linear in parameters, not necessarily in independent variables.
Assumption MLR.2 (Random Sampling): Data must be randomly drawn to meet the necessary conditions for inference.
Assumption MLR.3 (No Perfect Collinearity): Independent variables must vary and should not have perfect linear relationships among them.
Assumption MLR.4 (Zero Conditional Mean): Explanatory variables must not convey information about the mean of the error term.
Assumption MLR.5 (Homoskedasticity): The variance of the unobserved error, conditional on the explanatory variables, must be constant.
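In symbols, the key assumptions can be summarized compactly (standard notation, with $u$ denoting the error term):

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u \quad (\text{MLR.1})$$

$$\mathbb{E}(u \mid x_1, \ldots, x_k) = 0 \quad (\text{MLR.4}), \qquad \operatorname{Var}(u \mid x_1, \ldots, x_k) = \sigma^2 \quad (\text{MLR.5})$$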
In sum, adherence to these assumptions underpins the statistical validity of the model and improves the reliability of the estimates.
A practical implication is the necessity of understanding sampling variability and its impact on inference in econometric analysis.