Week 2: Ordinary Least Squares (OLS) in Applied Econometrics
Overview
Course: ECO440/ECO640
Institution: Niagara University
Lecture Outline
Goal of OLS Regression: Estimating the empirical regression equation.
Mechanics of OLS Estimator: Parameter estimates in both univariate and multivariate regression.
OLS Regression: Intuition and interpretation of results.
Importance of Fit: Understanding the decomposition of variance and R².
From Theory to Empirics
Purpose of Regression Analysis: Transition from a theoretical equation to an estimated empirical regression equation.
Key Questions:
What empirical regression line should be used?
How to choose among several alternative versions of the regression equation?
Selecting an Empirical Regression Equation
Aim for a regression line that resembles the theoretical model, e.g., a linear function connecting schooling to wages.
The chosen equation should provide the best fit to the data, with parameter estimates that can be interpreted meaningfully.
Example: How does an additional year of schooling affect wage levels?
Possible Empirical Regression Candidates
Analyze and assess the fit of candidate regression equations.
Evaluate prediction errors (residuals) using actual vs predicted wage values.
Consider slope implications regarding education's impact on wages.
Preferred Empirical Regression Equation
The best-fitting regression line is derived using OLS methods.
Discuss reasons why this regression equation is favored over others.
Ordinary Least Squares Regression
OLS as the primary method for obtaining regression estimates.
Goal: Minimize the squared errors (residuals).
Rationale: Squaring the errors keeps positive and negative errors from canceling each other out and penalizes large errors more heavily than small ones.
Distinguishing Estimators vs. Estimates
Estimator: A technique applied to sample data to estimate population regression coefficients.
Estimate: The specific value of a regression coefficient obtained by applying the estimator to a particular sample.
Key components: Parameters (unknown betas, β) vs variables (Y, X).
Why Use OLS?
Advantages of OLS:
Ease of use.
Conceptual appeal of minimizing squared errors.
Properties:
Residuals sum to zero.
Under the classical (Gauss–Markov) assumptions, OLS is the BLUE (Best Linear Unbiased Estimator).
Ordinary Least Squares Mechanics
OLS minimizes the sum of squared residuals:
Mathematically, OLS chooses the coefficient estimates β̂₀ and β̂₁ that minimize ∑(Yᵢ − Ŷᵢ)² = ∑(Yᵢ − β̂₀ − β̂₁Xᵢ)².
OLS Estimate Solutions for Single Variable Regression
The OLS estimates are derived by minimizing the sum of squared residuals with respect to the coefficients.
The resulting computational formulas appear in textbook equations (2.1), (2.4), and (2.5).
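For reference, the standard single-variable OLS solutions (the usual textbook formulas; which equation number corresponds to which formula is not restated here):
β̂₁ = ∑(Xᵢ − X̄)(Yᵢ − Ȳ) / ∑(Xᵢ − X̄)²
β̂₀ = Ȳ − β̂₁X̄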
Intuition of OLS Slope Coefficient Estimate
The numerator is the sum of cross-products of the X and Y deviations, which is proportional to the sample covariance between X and Y (the direction and strength of their co-movement).
The denominator is the sum of squared deviations of X, which is proportional to the sample variance of X (how dispersed the data are).
Interpretation: β̂₁ = Cov(X, Y) / Var(X), the estimated change in Y associated with a one-unit change in X.
Mechanically Calculating OLS Estimates
Steps:
Calculate the sample means of X and Y.
Compute each observation's deviations from those means, then the sum of cross-products ∑(Xᵢ − X̄)(Yᵢ − Ȳ) and the sum of squared deviations ∑(Xᵢ − X̄)².
Plug these sums into the slope and intercept formulas, by hand or with software.
Examples of Mechanically Calculating OLS Estimates
The data and the intermediate calculations (means, deviations, sums of cross-products) feed directly into the coefficient formulas.
Illustration: using height/weight data to compute the regression coefficients step by step.
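A minimal sketch of the mechanical calculation in Python, using made-up height/weight numbers rather than the textbook data:

import numpy as np

# Hypothetical observations (illustrative only, not the slide/textbook data)
height = np.array([60, 62, 65, 68, 70, 72])        # X: height in inches
weight = np.array([115, 125, 140, 155, 165, 180])  # Y: weight in pounds

x_bar, y_bar = height.mean(), weight.mean()         # step 1: sample means

# Step 2: sums of cross-products and squared deviations from the means
sxy = np.sum((height - x_bar) * (weight - y_bar))
sxx = np.sum((height - x_bar) ** 2)

# Step 3: plug into the slope and intercept formulas
beta1_hat = sxy / sxx                  # estimated pounds per additional inch
beta0_hat = y_bar - beta1_hat * x_bar  # estimated intercept
print(f"beta0_hat = {beta0_hat:.2f}, beta1_hat = {beta1_hat:.2f}")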
Estimating Multivariate Regression Models with OLS
Multivariate models are needed because most dependent variables cannot be adequately explained by a single independent variable.
General form of multivariate regression with K independent variables:
Y = β₀ + β₁X₁ + β₂X₂ + ... + βₖXₖ + ε.
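A minimal sketch of estimating a multivariate regression by OLS in Python, using simulated data (the variable names and coefficient values are illustrative assumptions, not the course example):

import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)                      # first independent variable
x2 = rng.normal(size=n)                      # second independent variable
eps = rng.normal(size=n)                     # error term
y = 2.0 + 1.5 * x1 - 0.8 * x2 + eps          # simulated "true" relationship

X = np.column_stack([np.ones(n), x1, x2])    # design matrix with a constant
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS estimates
print("beta_hat (const, x1, x2):", np.round(beta_hat, 3))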
Interpreting OLS Estimates from Multivariate Regression Models
Each coefficient in a multivariate model indicates the estimated change in Y associated with a one-unit increase in that X, holding the other independent variables constant.
Interpreting OLS Examples
Example 1: The demand for beef in the U.S.
Beef consumption modeled as a function of income, controlling for the price of beef.
Example 2: Financial aid awards modeled as a function of parents' ability to contribute and student GPA.
Insights are drawn from the signs and magnitudes of the coefficients and their implications for aid calculations.
The Fit of a Regression
Importance of understanding fit, focusing on how well the model predicts Y based on X.
Total sum of squares (TSS) measures the total variation of Y around its sample mean: TSS = ∑(Yᵢ − Ȳ)².
Decomposing Total Sum of Squares
TSS decomposes into two components, TSS = ESS + RSS:
Explained sum of squares (ESS): the variation in Y explained by the regression, ∑(Ŷᵢ − Ȳ)².
Residual sum of squares (RSS): the variation left unexplained, ∑(Yᵢ − Ŷᵢ)².
Measuring the Overall Fit of an Estimated OLS Regression
For a good fit, the ESS should account for a large share of the TSS.
Introduces the coefficient of determination, R².
R² and Overall Fit of Estimated OLS Regression
R² is the ratio of ESS to TSS (R² = ESS / TSS = 1 − RSS / TSS), indicating the share of the total variation in Y that the regression explains.
Values range from 0 to 1, with higher values indicating a better fit.
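A minimal sketch of the variance decomposition and R² in Python, using a small made-up sample:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # hypothetical X values
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])      # hypothetical Y values

# Fit the single-variable OLS line
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

tss = np.sum((y - y.mean()) ** 2)       # total variation in Y
ess = np.sum((y_hat - y.mean()) ** 2)   # variation explained by the regression
rss = np.sum((y - y_hat) ** 2)          # unexplained (residual) variation

print(f"TSS = {tss:.3f}, ESS + RSS = {ess + rss:.3f}")  # the identity TSS = ESS + RSS
print(f"R^2 = {ess / tss:.3f}")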
Adding Variables and R² Limitations
Adding variables can never lower R² and typically raises it, even when the new variables add no meaningful explanatory power (see the sketch below).
Each added variable requires estimating another coefficient and therefore uses up a degree of freedom, which should be taken into account.
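A minimal sketch in Python of how R² rises when an irrelevant variable is added (all data simulated; the "junk" regressor has no real relationship to Y):

import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)   # Y actually depends only on x
junk = rng.normal(size=n)                # an unrelated, nonsensical variable

def r_squared(X, y):
    """R^2 from an OLS fit of y on the columns of X (including a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

X1 = np.column_stack([np.ones(n), x])
X2 = np.column_stack([np.ones(n), x, junk])
print("R^2 with x only:     ", round(r_squared(X1, y), 4))
print("R^2 with x and junk: ", round(r_squared(X2, y), 4))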
Using R² to Describe Overall Fit
Adjusted R² accounts for degrees of freedom, allowing for more meaningful comparisons when adding variables.
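For reference, a standard definition of adjusted R² with n observations and K independent variables (the common textbook form, not quoted from the slides):
R̄² = 1 − [RSS / (n − K − 1)] / [TSS / (n − 1)]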
Appropriate and Inappropriate Uses of R²
R² is useful for comparing equations with the same dependent variable but not for different ones.
Warning against optimizing R² at the expense of meaningful theory behind model choice.
Example of Misusing Fit and R²
An example using mozzarella cheese consumption illustrates how adding nonsensical variables to the model can raise R² while producing a meaningless specification, leading to misleading conclusions about the quality of the fit.