Flashcards reviewing types of biases in Ordinary Least Squares (OLS) regression, including omitted variable bias, error in variable bias, sample selection bias, and simultaneous causality bias.
What is the fundamental requirement of OLS regarding the error term (e) and the independent variable (X)?
E(e|X) = 0, meaning the expected value of the error term, given X, must be zero.
What does it mean when E(e|education) ≠ 0 in a regression of Income on Education?
It means education is correlated with the error term, so the OLS estimate is biased and does not recover the true marginal effect of education on income.
Name four types of biases discussed in the lecture.
Omitted variable bias, error in variable bias, self-selection bias, and simultaneous causality bias.
What is omitted variable bias (OVB)?
Bias that occurs when a variable that affects Y and is correlated with an included regressor (X) is omitted from the regression model.
How is OVB calculated?
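OVB = [Cov(education, skill) / Var(education)] * coefficient of skill (effect of the omitted variable on Y). Because Var(education) > 0, the sign of OVB is the sign of Cov(education, skill) * coefficient of skill.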
Under what conditions is OVB equal to zero?
When either the omitted variable and the included variable (X) are uncorrelated (Cov(X, omitted) = 0), OR the omitted variable has no effect on the dependent variable (Y), i.e. its coefficient is zero.
If an omitted variable is related to BOTH X and Y, what does this imply about OVB?
OVB is present (OVB ≠ 0).
If Cov(X, omitted) > 0 and Cov(Y, omitted) > 0, is OVB positive or negative?
OVB > 0, which means the marginal effect of X is overestimated.
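The sign and size of OVB can be checked with a small simulation. The sketch below is illustrative only: the coefficients (1.0 for education, 2.0 for skill), the 0.5 link between skill and education, and the helper name ols_slope are all assumptions, not values from the lecture. It uses the standard textbook form of the bias, in which the covariance is scaled by Var(X).

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(0)
n = 200_000
beta_edu, beta_skill = 1.0, 2.0  # hypothetical true coefficients

skill = rng.normal(size=n)
education = 0.5 * skill + rng.normal(size=n)  # Cov(education, skill) > 0
income = beta_edu * education + beta_skill * skill + rng.normal(size=n)

# Regress income on education only, so skill is the omitted variable.
short_slope = ols_slope(education, income)
observed_bias = short_slope - beta_edu

# Textbook OVB: coefficient of skill * Cov(education, skill) / Var(education).
predicted_bias = (
    beta_skill * np.cov(education, skill)[0, 1] / np.var(education, ddof=1)
)

print(f"short-regression slope: {short_slope:.3f}")
print(f"observed bias:          {observed_bias:.3f}")
print(f"predicted OVB:          {predicted_bias:.3f}")
```

With both the covariance and the skill coefficient positive, the short-regression slope lands above the true value of 1.0, which is the overestimation described in the card above.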
What is 'error in variable' bias?
Bias caused by errors in measuring the independent variable (X).
How does error in variable bias arise when using contaminated data X(obs) in place of X(true)?
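It introduces a bias because the measurement error in X becomes part of the regression error term, so the error term is now correlated with X(obs) and E(e|X(obs)) ≠ 0. With purely random (classical) measurement error, the estimated coefficient is pulled toward zero.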
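A minimal simulation sketch of this effect, assuming classical measurement error (noise independent of the true X and of the regression error). The true slope of 2.0 and the unit noise variance are hypothetical choices made for illustration.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(1)
n = 200_000
beta = 2.0  # hypothetical true slope

x_true = rng.normal(size=n)              # Var(x_true) = 1
measurement_error = rng.normal(size=n)   # independent noise, variance 1
x_obs = x_true + measurement_error       # contaminated regressor
y = beta * x_true + rng.normal(size=n)

slope_true = ols_slope(x_true, y)  # uses the correctly measured X
slope_obs = ols_slope(x_obs, y)    # uses the contaminated X

# Under these assumptions the slope shrinks by the factor
# Var(x_true) / (Var(x_true) + Var(measurement_error)) = 1 / 2.
print(f"slope with X(true): {slope_true:.3f}")
print(f"slope with X(obs):  {slope_obs:.3f}")
```

In this setup the slope on X(obs) comes out near half of the true slope, because half of the variance in X(obs) is noise.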
What is sample selection bias?
Bias resulting from non-randomly selecting the sample, where a sub-population is omitted.
How does sample selection bias differ from omitted variable bias?
OVB involves omitting variables (in X), while sample selection bias involves omitting samples (sub-populations).
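One way to see sample selection bias is to simulate a population and then keep only a non-random part of it. The sketch below assumes one particular selection rule, keeping only observations whose outcome Y is above zero. That rule and the true slope of 1.0 are hypothetical, chosen only to make the effect visible.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(2)
n = 200_000
beta = 1.0  # hypothetical true slope

x = rng.normal(size=n)
y = beta * x + rng.normal(size=n)

# Full (random) sample: OLS recovers the true slope.
slope_full = ols_slope(x, y)

# Non-random sample: the sub-population with y <= 0 is omitted.
keep = y > 0
slope_selected = ols_slope(x[keep], y[keep])

print(f"slope, full sample:     {slope_full:.3f}")
print(f"slope, selected sample: {slope_selected:.3f}")
```

Every variable is included in both regressions. Only the set of observations changes, and that alone is enough to move the estimate away from the true slope.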
What is simultaneous causality bias?
Bias that occurs when Y also causes X, creating a feedback loop.
In a regression of income on education, if higher skill levels (omitted variable) are correlated with both higher education and higher income, is the OLS estimate of the effect of education on income biased? If so, in which direction?
Yes, the OLS estimate is biased. Since higher skill levels lead to both higher education and higher income, the OVB is positive, overestimating the effect of education.
Suppose you are studying the effect of exercise on health outcomes, but your sample only includes individuals who regularly visit the gym. What type of bias might you encounter, and how could it affect your results?
You might encounter sample selection bias. Since the sample is not representative of the general population, the effect of exercise on health outcomes might be overestimated due to healthier individuals being more likely to visit the gym.
If a study aims to evaluate the impact of a job training program on employment rates, but individuals who are more motivated are more likely to enroll in the program, what type of bias is this, and how does it affect the estimation?
This is an example of self-selection bias. The effect of the job training program may be overestimated because more motivated individuals are more likely to find employment regardless of the program.
Consider a scenario where you're analyzing the effect of police presence on crime rates. However, higher crime rates might also lead to increased police presence. What type of bias does this create, and what does it imply for your OLS estimates?
This creates simultaneous causality bias. It means that the OLS estimates may not accurately reflect the causal effect of police presence on crime rates because the relationship is bidirectional. Crime rates and police presence are jointly determined.
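The police and crime example can be sketched as a two-equation system solved jointly. All numbers here are hypothetical: police is assumed to reduce crime (coefficient -0.5) while crime is assumed to raise police presence (coefficient 0.8), and both shocks have unit variance.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(3)
n = 200_000
beta = -0.5   # hypothetical causal effect of police on crime
gamma = 0.8   # hypothetical feedback effect of crime on police

u = rng.normal(size=n)  # shock to crime
v = rng.normal(size=n)  # shock to police presence

# Structural equations:
#   crime  = beta  * police + u
#   police = gamma * crime  + v
# Solving the two equations jointly gives the reduced form:
denom = 1.0 - beta * gamma
police = (gamma * u + v) / denom
crime = (beta * v + u) / denom

ols_estimate = ols_slope(police, crime)

print(f"true effect of police on crime: {beta:.3f}")
print(f"OLS slope of crime on police:   {ols_estimate:.3f}")
```

Under these assumptions the OLS slope is positive even though the causal effect is negative, because police presence is correlated with the crime shock u through the feedback equation.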
In a regression analysis, an important variable 'motivation' is not included. The correlation between 'motivation' and the included independent variable 'effort' is positive, and 'motivation' also positively affects the dependent variable 'performance'. What type of bias is present, and how does it impact the results?
Omitted variable bias (OVB) is present. Since 'effort' and 'motivation' are positively correlated, and 'motivation' positively affects 'performance', the effect of 'effort' on 'performance' will be overestimated.