ECMT2150 Lecture 2

Deriving the OLS Estimator

Introduction to the OLS Estimator

  • The Ordinary Least Squares (OLS) estimator is a fundamental tool in econometrics for estimating the relationship between variables.

  • Fitted Values: The values predicted by the estimated regression, $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$.

  • Residuals: The difference between observed values and fitted values.

Mathematical Notation
  • The residual for observation $i$ is written as: $\hat{u}_i = y_i - \hat{y}_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)$

Minimization Process

  • The goal of OLS is to choose the estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize the sum of squared residuals, $\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$.

  • Graphical Interpretation: OLS picks the line that minimizes the sum of squared vertical distances between the data points and the regression line.
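
As a minimal sketch of the minimization described above: setting the derivatives of the sum of squared residuals with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$ to zero gives the standard closed-form solution $\hat{\beta}_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$. The function names and the data below are hypothetical, chosen only to illustrate the calculation, and do not come from the lecture.

```python
def ols_simple(x, y):
    """Closed-form OLS estimates (beta0_hat, beta1_hat) for y on a single x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must vary in the sample")
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat


def sum_squared_residuals(x, y, b0, b1):
    """Sum of squared residuals for the line b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, not taken from the lecture.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_simple(x, y)
ssr_at_ols = sum_squared_residuals(x, y, b0, b1)
# Any other line should give a larger sum of squared residuals.
ssr_nearby = sum_squared_residuals(x, y, b0 + 0.1, b1 - 0.05)

print(f"beta0_hat = {b0:.4f}, beta1_hat = {b1:.4f}")
print(f"SSR at OLS = {ssr_at_ols:.4f}, SSR at a nearby line = {ssr_nearby:.4f}")
```

Comparing the two sums of squared residuals is only a spot check: it shows that one alternative line fits worse, which is consistent with (but does not prove) the OLS line being the minimizer.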

Sample Regression Function

Definition

  • The estimated regression line, or sample regression function (SRF), is the relationship estimated from the sample data: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$

  • Here $\hat{y}$ is the sample estimate of $E(y \mid x)$.

Importance of SRF

  • Serves as the sample equivalent to the population regression function (PRF).

  • Important for evaluating how explanatory variables influence the dependent variable.

Incorporating Non-Linearities

Introduction to Non-Linearities

  • It is essential to consider non-linear relationships to obtain more accurate and meaningful models.

  • The course discusses several ways of incorporating non-linearities into multiple linear regression (MLR), including quadratic terms and interaction terms.
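
To make the role of quadratic and interaction terms concrete, here is a generic illustration. The variable names $x$, $x_1$, $x_2$ are placeholders rather than a model from the lecture, and the derivatives assume the error has zero conditional mean. In both cases the model stays linear in the parameters, but the effect of a regressor on $y$ is no longer constant:

```latex
% Quadratic term: the marginal effect of x depends on the level of x.
\[
y = \beta_0 + \beta_1 x + \beta_2 x^2 + u
\quad\Longrightarrow\quad
\frac{\partial E(y \mid x)}{\partial x} = \beta_1 + 2\beta_2 x
\]

% Interaction term: the marginal effect of x_1 depends on the level of x_2.
\[
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + u
\quad\Longrightarrow\quad
\frac{\partial E(y \mid x_1, x_2)}{\partial x_1} = \beta_1 + \beta_3 x_2
\]
```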

Semi-Logarithmic Forms

  • Regression of Log Wages on Years of Education:

    • This formulation allows interpretation in terms of percentage changes.

    • Example: A 1-year increase in education is associated with an 8.3% increase in wages.
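
The percentage interpretation in the example above is an approximation that works well for small coefficients. The sketch below assumes the model is $\log(wage) = \beta_0 + \beta_1 educ + u$ and that the 8.3% figure corresponds to an estimated coefficient of 0.083. Both the coefficient value and the function name are assumptions made for illustration.

```python
import math


def exact_pct_change(beta, delta_x=1.0):
    """Exact percentage change in y implied by a log-level coefficient."""
    return 100.0 * (math.exp(beta * delta_x) - 1.0)


# Assumed coefficient, inferred from the 8.3% example in the notes.
beta_educ = 0.083

approx_pct = 100.0 * beta_educ           # usual approximation: 100 * beta
exact_pct = exact_pct_change(beta_educ)  # exact: 100 * (exp(beta) - 1)

print(f"approximate change: {approx_pct:.2f}%")
print(f"exact change:       {exact_pct:.2f}%")
```

The gap between the two numbers grows with the size of the coefficient, so the exact formula matters more for large estimated effects.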

Log-Log Forms

  • Used when analyzing relationships such as CEO salary and firm sales.

  • Implies a constant elasticity model: the slope coefficient is the elasticity, that is, the percentage change in the dependent variable associated with a 1% change in the explanatory variable.
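
As a sketch of why the log-log form gives a constant elasticity, take the CEO salary and firm sales case above and assume the model is written with both variables in logs. The variable names are placeholders for illustration:

```latex
\[
\log(\text{salary}) = \beta_0 + \beta_1 \log(\text{sales}) + u,
\qquad
\beta_1 = \frac{\partial \log(\text{salary})}{\partial \log(\text{sales})}
        \approx \frac{\%\Delta\,\text{salary}}{\%\Delta\,\text{sales}}
\]
```

Because $\beta_1$ does not depend on the level of sales, the elasticity is the same at every point on the curve.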

Properties of the OLS Estimator

Algebraic Properties

  • The OLS estimators exhibit several key algebraic properties applicable across both simple and multiple regression contexts.

  • The sample average of the OLS residuals is exactly zero whenever the model includes an intercept, so the fitted values have the same sample average as the observed values.

  • Another important property is that the sample covariance between each independent variable and the OLS residuals is also zero.
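
These algebraic properties can be checked numerically. The sketch below fits a simple regression with an intercept on hypothetical data (not from the lecture) and computes the sample mean of the residuals and their sample covariance with $x$. Both should be zero up to floating-point rounding.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 4.0, 7.0, 8.0]
y = [3.0, 4.5, 6.0, 11.0, 12.5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for a simple regression with an intercept.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Property 1: the residuals average to zero.
mean_resid = sum(residuals) / n
# Property 2: the sample covariance between x and the residuals is zero.
cov_x_resid = sum((xi - x_bar) * ui for xi, ui in zip(x, residuals)) / (n - 1)
# Consequence of property 1: fitted values and observed values share a mean.
mean_fitted = sum(fitted) / n

print(f"mean residual:       {mean_resid:.2e}")
print(f"cov(x, residual):    {cov_x_resid:.2e}")
print(f"mean fitted, mean y: {mean_fitted:.4f}, {y_bar:.4f}")
```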

Fit of the Model

  • Goodness-of-fit measures describe how well the model explains the variation in the dependent variable.

  • Metrics include the Total Sum of Squares (SST), Explained Sum of Squares (SSE), and Residual Sum of Squares (SSR), which satisfy SST = SSE + SSR.

  • R-squared: Measures the fraction of the sample variation in the dependent variable explained by the predictors, $R^2 = SSE/SST = 1 - SSR/SST$. Caution is required, as a high R-squared does not imply causality.
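
The decomposition behind R-squared can be sketched as follows. SST is the total variation of $y$ around its mean, SSE the variation of the fitted values around that mean, and SSR the sum of squared residuals. The data are hypothetical and the variable names are illustrative only.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Simple-regression OLS estimates (with an intercept).
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
sse = sum((fi - y_bar) ** 2 for fi in fitted)          # explained sum of squares
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) # residual sum of squares

r_squared = sse / sst

print(f"SST = {sst:.4f}, SSE = {sse:.4f}, SSR = {ssr:.4f}")
print(f"R-squared = {r_squared:.4f}")
```

The identity SST = SSE + SSR relies on the regression including an intercept. Without one, the two expressions for R-squared need not agree.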

Statistical Properties of OLS

Key Concepts

  • OLS estimators are random variables because they are computed from random samples: a different sample yields different estimates.

  • Unbiasedness: An estimator is unbiased if its expected value equals the true population value.

  • Efficiency: Under the Gauss-Markov assumptions, OLS is the Best Linear Unbiased Estimator (BLUE): it has the smallest variance among all linear unbiased estimators.

Key Assumptions

  • Assumption MLR.1 (Linear in Parameters): The relationship must be linear in parameters, not necessarily in independent variables.

  • Assumption MLR.2 (Random Sampling): The data are a random sample drawn from the population.

  • Assumption MLR.3 (No Perfect Collinearity): Each independent variable must vary in the sample, and none may be an exact linear combination of the others.

  • Assumption MLR.4 (Zero Conditional Mean): The explanatory variables must not convey information about the mean of the error term, $E(u \mid x_1, \dots, x_k) = 0$.

  • Assumption MLR.5 (Homoskedasticity): The variance of the unobserved factors must be constant across all values of the explanatory variables, $Var(u \mid x_1, \dots, x_k) = \sigma^2$.
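
Unbiasedness is a statement about repeated sampling, which a small simulation can illustrate. The sketch below draws many random samples from a population that satisfies the assumptions above by construction (linear model, errors independent of $x$ with zero mean and constant variance) and averages the OLS slope estimates. The true parameter values, sample size, and number of replications are arbitrary choices made for illustration.

```python
import random


def ols_slope(x, y):
    """OLS slope estimate from a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


random.seed(0)  # fixed seed so the run is reproducible

# Assumed population parameters (arbitrary illustrative values).
TRUE_BETA0, TRUE_BETA1 = 1.0, 2.0
N_OBS, N_REPS = 50, 2000

slopes = []
for _ in range(N_REPS):
    # Each replication is a fresh random sample from the same population.
    x = [random.uniform(0.0, 10.0) for _ in range(N_OBS)]
    u = [random.gauss(0.0, 1.0) for _ in range(N_OBS)]  # homoskedastic errors
    y = [TRUE_BETA0 + TRUE_BETA1 * xi + ui for xi, ui in zip(x, u)]
    slopes.append(ols_slope(x, y))

mean_slope = sum(slopes) / N_REPS

print(f"true slope:                {TRUE_BETA1}")
print(f"average of slope estimates: {mean_slope:.4f}")
print(f"range of slope estimates:   {min(slopes):.4f} to {max(slopes):.4f}")
```

Individual estimates scatter around the true slope, which is the sampling variability of the estimator. It is their average across samples that should sit close to the true value.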

Conclusion

  • The statistical properties of OLS, unbiasedness and efficiency, hold only when these assumptions are satisfied, so the assumptions underpin the validity and reliability of the estimates.

  • A practical implication is the necessity of understanding sampling variability and its impact on inference in econometric analysis.
