CONCEPTS PODCAST (1)

Core Concepts

Econometrics

  • Definition: The application of economic theory, mathematics, and statistical inference to analyze economic phenomena.

Theoretical Econometrics

  • Focus: Developing methods to measure economic relationships in econometric models.

Economic Statistics

  • Focus: Collecting, processing, and presenting economic data using charts and tables.

Population Regression Function (PRF)

  • Definition: The set of conditional means of the dependent variable for fixed explanatory variable values.

  • Also called the Conditional Expectation Function (CEF).
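
  • In the two-variable case the PRF is commonly written as shown below; more generally E(Y \mid X_i) = f(X_i):

    E(Y \mid X_i) = \beta_1 + \beta_2 X_i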

Adjusted R-squared

  • Purpose: Adjusts the goodness of fit for the number of variables in the regression model to help prevent overfitting.
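
  • A common textbook form (n = number of observations, k = number of parameters including the intercept):

    \bar{R}^2 = 1 - (1 - R^2)\frac{n - 1}{n - k}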

Specification Bias

  • Definition: Error that occurs when a key variable is omitted from the regression model.
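
  • A standard illustration of omitted-variable bias: if the true model is Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i but Y is regressed on X_2 alone, the estimated slope \hat{\alpha}_2 satisfies

    E(\hat{\alpha}_2) = \beta_2 + \beta_3 b_{32},

    where b_{32} is the slope from regressing the omitted X_3 on X_2.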

Coefficient of Partial Determination

  • Definition: Measures the proportion of the variation in the dependent variable not explained by the regressors already in the model that is explained by adding a further independent variable.
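
  • One common expression, for adding X_3 to a model that already contains X_2 (R^2 is from the full model, r^2_{Y2} from the regression of Y on X_2 alone):

    r^2_{Y3 \cdot 2} = \frac{R^2 - r^2_{Y2}}{1 - r^2_{Y2}}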

F-Test (ANOVA)

  • Purpose: Evaluates the overall significance of a regression model.
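
  • A standard form of the test statistic, with k parameters (including the intercept) and n observations, under H_0\!: \beta_2 = \cdots = \beta_k = 0:

    F = \frac{R^2/(k - 1)}{(1 - R^2)/(n - k)} \sim F_{k-1,\; n-k}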

Covariate

  • Definition: In analysis-of-covariance (ANCOVA) models that mix quantitative and qualitative regressors, the quantitative control variables are called covariates.

Chow Test

  • Purpose: Tests the structural stability of a regression across sub-samples (e.g., time periods or groups); it indicates whether two regressions differ, but not whether the difference lies in the intercept, the slope, or both.
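
  • The usual test statistic, where RSS_R comes from the pooled regression, RSS_{UR} = RSS_1 + RSS_2 from the two sub-sample regressions, and k is the number of parameters:

    F = \frac{(RSS_R - RSS_{UR})/k}{RSS_{UR}/(n_1 + n_2 - 2k)}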

Data Types

Panel Data

  • Definition: Tracks the same units (e.g., a family or firm) over time.

Pooled Data

  • Definition: Combines cross-sectional and time-series data.

Regression Models

Linear Regression Models

  • Definition: Models linear in parameters, which may not always be linear in regressand or regressors.
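
  • For example, the log-linear model \ln Y_i = \beta_1 + \beta_2 \ln X_i + u_i is linear in the parameters even though it is nonlinear in the variables, whereas Y_i = \beta_1 + \beta_2^2 X_i + u_i is not a linear regression model.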

Dummy Variables in Regression

  • Purpose: Represent qualitative attributes or structural changes; the coefficient on a dummy–regressor interaction, the "slope drifter" (differential slope coefficient), shows how the slope differs for a specific group (see the sketch below).
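
  • A minimal sketch with one dummy D_i (1 for the group of interest, 0 otherwise), where \beta_3 is the differential intercept and \beta_4 the differential slope:

    Y_i = \beta_1 + \beta_2 X_i + \beta_3 D_i + \beta_4 (D_i X_i) + u_i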

Assumptions of Classical Linear Regression Model (CLRM)

  1. Linear in Parameters

    • Relationship must be linear in the model's coefficients.

  2. Explanatory Variables Independent of the Error Term

    • The explanatory variables must be uncorrelated with the error term.

  3. Zero Mean Value of the Error Term

    • Given any value of the explanatory variables, the expected value of the error term is zero.

  4. Homoscedasticity

    • Variance of the error term is constant across observations.

  5. No Autocorrelation

    • Error terms should not be correlated across observations.

  6. Sufficient Sample Size (n > k)

    • The number of observations must exceed the number of explanatory variables.

  7. No Exact Collinearity Between Variables

    • There should be no perfect linear relationship among explanatory variables.

  8. Correct Functional Form (No Specification Bias)

    • Model must be correctly specified without missing variables or incorrect functional forms.
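
  • In compact notation, assumptions 2–5 above amount to:

    E(u_i \mid X_i) = 0, \quad \operatorname{Var}(u_i) = \sigma^2 \text{ for all } i, \quad \operatorname{Cov}(u_i, u_j) = 0 \text{ for } i \neq j, \quad \operatorname{Cov}(u_i, X_i) = 0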

Hypothesis Testing

  1. Test of Significance and Confidence Interval

    • Two complementary approaches for testing hypotheses.
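
    • For a single coefficient, the test of significance uses the t statistic below (\beta_2^* is the hypothesized value) and the confidence-interval approach checks whether \beta_2^* falls inside the interval; both lead to the same decision:

      t = \frac{\hat{\beta}_2 - \beta_2^*}{\operatorname{se}(\hat{\beta}_2)}, \qquad \hat{\beta}_2 \pm t_{\alpha/2}\, \operatorname{se}(\hat{\beta}_2)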

Methods of Estimation

Ordinary Least Squares (OLS)

  • Definition: Chooses the parameter estimates that minimize the sum of squared residuals; under the CLRM assumptions the resulting estimators are BLUE (Best Linear Unbiased Estimators).
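
  • A minimal numerical sketch in Python/numpy (the data values are made up for illustration), reproducing the OLS estimates by minimizing the residual sum of squares:

    import numpy as np

    # Made-up data: y regressed on a constant and a single regressor x
    x = np.array([80.0, 100.0, 120.0, 140.0, 160.0])
    y = np.array([70.0, 85.0, 90.0, 110.0, 125.0])
    X = np.column_stack([np.ones_like(x), x])   # design matrix with an intercept column

    # OLS: choose beta to minimize the sum of squared residuals
    # (closed form: beta_hat = (X'X)^{-1} X'y; lstsq computes it numerically)
    beta_hat, rss, rank, sv = np.linalg.lstsq(X, y, rcond=None)

    print("intercept and slope:", beta_hat)
    print("residual sum of squares:", rss[0])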

Maximum Likelihood Method

  • Estimates parameters by maximizing the likelihood function; in the regression context this assumes the error term is normally distributed.
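
  • In the classical linear model with normal errors, the ML estimators of the intercept and slopes coincide with the OLS estimators; the two methods differ in estimating the error variance:

    \hat{\sigma}^2_{ML} = \frac{\sum \hat{u}_i^2}{n} \ \text{(biased in small samples)}, \qquad \hat{\sigma}^2_{OLS} = \frac{\sum \hat{u}_i^2}{n - k} \ \text{(unbiased)}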

Types of Regression Comparisons (Chow Test)

  1. Coincident Regression

    • Same slope, same intercept.

  2. Parallel Regression

    • Same slope, different intercept.

  3. Concurrent Regression

    • Different slope, same intercept.

  4. Dissimilar Regression

    • Different slope and intercept.
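
  • These four cases map onto the dummy-variable model Y_i = \beta_1 + \beta_2 X_i + \beta_3 D_i + \beta_4 (D_i X_i) + u_i sketched earlier: coincident (\beta_3 = 0, \beta_4 = 0), parallel (\beta_3 \neq 0, \beta_4 = 0), concurrent (\beta_3 = 0, \beta_4 \neq 0), dissimilar (\beta_3 \neq 0, \beta_4 \neq 0).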

True or False Clarifications

  1. r squared in two-variable regression equals the coefficient of determination. — True

  • It indicates the goodness of fit of the model (see the formulas after this list).

  2. r squared works the same way in multiple regression models. — False

  • In multiple regression the adjusted R squared is used to account for the additional regressors.

  3. Zero correlation implies independence. — False

  • Independence is the stronger condition; zero correlation rules out only a linear relationship, so the variables may still be related nonlinearly.

  4. Correlation is symmetrical (r xy = r yx). — True

  5. Correlation implies cause-effect relationships in strong linear associations. — False

  • Correlation measures association, not causation.

  6. Linear PRFs may not always be linear in variables. — True

  7. Linear in parameters means equations can include nonlinear variables. — True

  8. Because of sampling fluctuations, the sample regression function (SRF) may over- or underestimate the true PRF. — True

  9. Least-squares estimators are BLUE. — True

  10. Zero covariance between the explanatory variables and the error term is essential. — True

  11. Partial regression coefficients are for explanatory variables, not dummy variables. — True

  12. Adjusted R squared increases less than R squared as variables are added. — True
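
  • For items 1, 2, and 12 above: r^2 = ESS/TSS = 1 - RSS/TSS, and \bar{R}^2 = 1 - (1 - R^2)\frac{n - 1}{n - k} \le R^2, so the adjusted measure penalizes additional regressors.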

Reasons for Excluding Variables in Regression Models

  1. Vagueness of Theory

    • When the theoretical basis for including a variable is unclear.

  2. Unavailability of Data

    • Data for the variable may not be accessible or reliable.

  3. Core vs. Peripheral Variables

    • Focus on the most relevant variables and exclude less critical ones.

  4. Intrinsic Randomness in Human Behavior

    • Human actions may introduce randomness, making some variables unmeasurable.

  5. Poor Proxy Variable

    • Inability to find a good substitute for an unobservable variable.

  6. Principle of Parsimony

    • Keep the model as simple as possible to avoid overfitting.

  7. Wrong Functional Form

    • Errors in specifying the relationship between variables.
