A statistical analysis is internally valid if
the statistical inferences about causal effects are valid for the population studied
Threats to internal validity lead to:
failures of one or more of the least squares assumptions.
Comparing the California test scores to test scores in Massachusetts is appropriate for external validity if
the institutional settings in California and Massachusetts, such as organization in classroom instruction and curriculum, were similar in the two states.
The question of reliability/unreliability of a multiple regression depends on:
internal and external validity.
Internal validity requires that:
the estimator of the causal effect should be unbiased and consistent.
The true causal effect might not be the same in the population studied and the population of interest because
of differences in characteristics of the population, of geographical differences, or because the study is out of date.
What is the trade-off when including an extra variable in a regression?
An extra variable could control for omitted variable bias, but it also increases the variance of other estimated coefficients.
Suppose that a state offered voluntary standardized tests to all its third graders and that these data were used in a study of class size on student performance. Which of the following would generate selection bias?
Schools with higher-achieving students could be more likely to volunteer to take the test.
A researcher estimates the effect on crime rates of spending on police by using city-level data. Which of the following represents simultaneous causality?
Cities with high crime rates may need a larger police force, and thus more spending. More police spending, in turn, reduces crime.
A researcher estimates a regression using two different software packages. The first uses the homoskedasticity-only formula for standard errors. The second uses the heteroskedasticity-robust formula. The standard errors are very different. Which should the researcher use?
The heteroskedasticity-robust standard errors should be used
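To see why the two formulas can disagree, here is a minimal sketch for a regression with a single regressor. The function names, the simulated data, and the choice of error variance are illustrative assumptions, not part of the original question. The robust formula used is the single-regressor version with an n/(n-2) degrees-of-freedom adjustment.

```python
import random

def slope_with_standard_errors(x, y):
    """OLS slope for one regressor, with both kinds of standard error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    b1 = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # Homoskedasticity-only: one common error variance for every observation.
    s2 = sum(u * u for u in resid) / (n - 2)
    se_homo = (s2 / sxx) ** 0.5

    # Heteroskedasticity-robust: each squared residual is weighted by its own
    # squared deviation of x from the mean.
    weighted = sum(d * d * u * u for d, u in zip(dx, resid))
    se_robust = ((n / (n - 2)) * weighted / sxx ** 2) ** 0.5
    return b1, se_homo, se_robust

def simulate_heteroskedastic_sample(n=2000, seed=0):
    """Hypothetical data: the error spread grows as x moves away from 5."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.2 + (xi - 5.0) ** 2) for xi in x]
    return x, y

x, y = simulate_heteroskedastic_sample()
b1, se_homo, se_robust = slope_with_standard_errors(x, y)
print(f"slope={b1:.3f}  homoskedasticity-only SE={se_homo:.3f}  robust SE={se_robust:.3f}")
```

In this simulated sample the errors are heteroskedastic by construction, so the homoskedasticity-only standard error is not reliable, while the robust one is valid in either case.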
Labor economists studying the determinants of women's earnings discovered a puzzling empirical result. Using randomly selected employed women, they regressed earnings on the women's number of children and a set of control variables (age, education, occupation, and so forth). They found that women with more children had higher wages, controlling for these other factors. What is most likely causing this result?
Sample selection bias
A survey of earnings contains an unusually high fraction of individuals who state their weekly earnings in 100s, such as 300, 400, 500, etc. This is an example of:
errors-in-variables bias.
In the case of errors-in-variables bias
the OLS estimator is inconsistent, but the inconsistency is small if the variance of the unobservable variable is large relative to the variance of the measurement error.
In the case of errors-in-variables bias, the precise size and direction of the bias depend on
the correlation between the measured variable and the measurement error.
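A small simulation can make the role of the two variances concrete. This is a sketch under the classical measurement error assumption (the error is independent of the true regressor and of the regression error), in which case the OLS slope converges to beta times sigma_X^2 / (sigma_X^2 + sigma_w^2). The function names and parameter values are illustrative.

```python
import random

def ols_slope(x, y):
    """Slope from a regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def simulate_attenuation(sigma_x, sigma_w, beta=1.0, n=20000, seed=0):
    """Regress y on a noisy measurement of x (classical measurement error)."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, sigma_x) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x_true]
    x_measured = [xi + rng.gauss(0.0, sigma_w) for xi in x_true]
    return ols_slope(x_measured, y)

# Equal variances: the slope is pulled roughly halfway toward zero.
print(simulate_attenuation(sigma_x=1.0, sigma_w=1.0))
# Measurement error small relative to the variation in x: little bias.
print(simulate_attenuation(sigma_x=3.0, sigma_w=0.3))
```

If the measurement error were correlated with the measured variable in some other way, the size and direction of the bias would differ from this classical case.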
Suppose that the linear probability model yields a predicted value of Y that is equal to 1.3. Explain why this is nonsensical.
In the linear probability model the predicted value of Y is a predicted probability, so it must lie between 0 and 1.
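A minimal sketch of how this happens: fitting a straight line to a binary outcome places no bound on the fitted values. The toy data below are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: a regressor and a binary (0/1) outcome.
X_TOY = [1, 2, 3, 4, 5, 6, 7, 8]
Y_TOY = [0, 0, 0, 1, 0, 1, 1, 1]

def fit_lpm(x, y):
    """Linear probability model: plain OLS of a binary y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def predict_lpm(b0, b1, x):
    """Fitted value of the line; nothing restricts it to [0, 1]."""
    return b0 + b1 * x

b0, b1 = fit_lpm(X_TOY, Y_TOY)
for x in (1, 4, 8, 12):
    print(x, round(predict_lpm(b0, b1, x), 3))
```

For large enough or small enough values of the regressor, the fitted line leaves the unit interval, which cannot be interpreted as a probability.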
One of your friends is using data on individuals to study the determinants of smoking at your university. She is particularly concerned with estimating marginal effects on the probability of smoking at the extremes. She asks you whether she should use a probit, logit, or linear probability model. What advice do you give her?
She should use the logit or probit, but not the linear probability model.
The linear probability model is:
the application of the linear multiple regression model to a binary dependent variable.
F-statistics computed using maximum likelihood estimators
can be used to test joint hypotheses
The probit model
forces the predicted values to lie between 0 and 1.
In the probit regression, the coefficient beta 1 indicates:
the change in the z-value associated with a unit change in X.
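The two probit cards above can be illustrated with a short sketch. The coefficients B0 and B1 are hypothetical values, not estimates from any data set. The z-value moves by B1 for every unit change in X, but the implied change in probability depends on where X starts, and the standard normal CDF keeps every prediction between 0 and 1.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_probability(b0, b1, x):
    """Pr(Y = 1 | X = x) in a probit model with a single regressor."""
    return std_normal_cdf(b0 + b1 * x)

B0, B1 = -2.0, 0.5  # hypothetical coefficients

for x in (0, 4):
    z_now = B0 + B1 * x
    z_next = B0 + B1 * (x + 1)
    change = probit_probability(B0, B1, x + 1) - probit_probability(B0, B1, x)
    print(f"x={x}: z goes {z_now:+.1f} -> {z_next:+.1f}, probability changes by {change:.3f}")
```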
Probit coefficients are typically estimated using:
the method of maximum likelihood
Why are the coefficients of probit and logit models estimated by maximum likelihood instead of OLS?
OLS cannot be used because the regression function is not a linear function of the regression coefficients.
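As an illustration of what maximum likelihood estimation involves, the sketch below fits a logit model with one regressor by Newton's method. The toy data and function names are assumptions made for the example. Real software uses more careful numerical routines, and a probit would replace the logistic function with the standard normal CDF.

```python
import math

# Hypothetical toy data (not perfectly separable, so the MLE exists).
X_TOY = [1, 2, 3, 4, 5, 6, 7, 8]
Y_TOY = [0, 0, 0, 1, 0, 1, 1, 1]

def logit_prob(b0, b1, x):
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def log_likelihood(b0, b1, xs, ys):
    """Sum of log probabilities of the observed 0/1 outcomes."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logit_prob(b0, b1, x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logit(xs, ys, steps=25):
    """Maximize the log likelihood with Newton steps, starting from zero."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = logit_prob(b0, b1, x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient with respect to b0
            g1 += (y - p) * x      # gradient with respect to b1
            h00 += w               # entries of the negative Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton update: add (negative Hessian)^-1 times the gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0_hat, b1_hat = fit_logit(X_TOY, Y_TOY)
print(b0_hat, b1_hat, log_likelihood(b0_hat, b1_hat, X_TOY, Y_TOY))
```

The coefficients enter through a nonlinear function, so there is no closed-form solution like the OLS formula; the estimates come from an iterative search for the values that make the observed data most likely.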
To measure the fit of the probit model, you should:
use the "fraction correctly predicted" or the "pseudo R squared."
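Both measures can be computed directly from the outcomes and the predicted probabilities. The sketch below assumes the usual 50% cutoff for "correctly predicted" and defines the pseudo R squared as one minus the ratio of the model log likelihood to the log likelihood of a model with no regressors. The function names and the four example observations are hypothetical.

```python
import math

def fraction_correctly_predicted(ys, probs):
    """Share of observations where (prob > 0.5) matches the observed outcome."""
    hits = sum(1 for y, p in zip(ys, probs) if (p > 0.5) == (y == 1))
    return hits / len(ys)

def bernoulli_log_likelihood(ys, probs):
    return sum(y * math.log(p) + (1 - y) * math.log(1.0 - p)
               for y, p in zip(ys, probs))

def pseudo_r_squared(ys, probs):
    """1 - lnL(model) / lnL(no regressors, constant probability = mean of y)."""
    p_bar = sum(ys) / len(ys)
    ll_model = bernoulli_log_likelihood(ys, probs)
    ll_null = bernoulli_log_likelihood(ys, [p_bar] * len(ys))
    return 1.0 - ll_model / ll_null

# Hypothetical outcomes and fitted probabilities.
ys = [1, 0, 1, 0]
probs = [0.8, 0.3, 0.4, 0.1]
print(fraction_correctly_predicted(ys, probs))
print(pseudo_r_squared(ys, probs))
```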
Nonlinear least squares
solves the minimization of the sum of squared prediction mistakes through sophisticated mathematical routines, essentially by trial-and-error methods.
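The "trial-and-error" idea can be sketched with a deliberately crude search. The model y = exp(b * x), the grid-refinement routine, and all names below are illustrative assumptions; real software uses far more efficient algorithms.

```python
import math

def sum_squared_residuals(b, xs, ys):
    """Sum of squared prediction mistakes for the model y = exp(b * x)."""
    return sum((y - math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

def nls_grid_search(xs, ys, lo=-5.0, hi=5.0, rounds=12, points=21):
    """Try a grid of b values, keep the best, then search a finer grid around it."""
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        grid = [lo + i * step for i in range(points)]
        best = min(grid, key=lambda b: sum_squared_residuals(b, xs, ys))
        lo, hi = best - step, best + step
    return 0.5 * (lo + hi)

# Hypothetical noiseless data generated with b = 0.7.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.7 * x) for x in xs]
print(nls_grid_search(xs, ys))
```

Because the parameter enters nonlinearly, there is no formula for the minimizer, so the routine repeatedly tries candidate values and narrows in on the one with the smallest sum of squares.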
When testing joint hypotheses, you can use
either the F-statistic or the chi-squared statistic.
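The two statistics carry the same information. For a joint hypothesis with q restrictions, the relationship in large samples can be written as follows, where W denotes the chi-squared statistic (the symbol W is a labeling choice made here):

```latex
W = q \cdot F,
\qquad
F \xrightarrow{\;d\;} \frac{\chi^2_q}{q}
\quad \text{under the null hypothesis.}
```

So comparing F to the critical value of the chi-squared distribution divided by q gives the same decision as comparing q times F to the chi-squared critical value itself.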