BUSN 5000 FINAL

96 Terms

New cards

If we say E(y|x) = Bo + B1x, where Bo and B1 solve the population least-squares problem, then the CEF is the population regression BLANK and Bo and B1 are population regression BLANK

function, coefficients

New cards

the population regression function provides the best BLANK to the CEF

linear approximation

New cards

(simple regression model) the coefficient B1 measures the BLANK in y BLANK with a BLANK in x1, holding all of the unobservables constant

change, associated, unit change

New cards

(simple regression model) if Bo and B1 solve the population least-squares problem, their values BLANK the expected value of the BLANK difference between the dependent variable and the CEF

minimize, squared

New cards

the value of B1 that solves the population least-squares problem is:

cov(xi, yi) / var(xi)
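
A minimal sketch of the sample analogue of this formula, assuming NumPy is available. The function name `ols_slope` and the data are hypothetical, made up for illustration only.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of y on x as sample cov(x, y) / sample var(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # The same divisor (n) appears in both averages, so it cancels in the ratio.
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    var_x = np.mean((x - x.mean()) ** 2)
    return cov_xy / var_x

# Hypothetical data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

b1 = ols_slope(x, y)           # slope estimate
b0 = y.mean() - b1 * x.mean()  # intercept implied by the sample means
```

The slope should agree with what a least-squares routine such as `np.polyfit(x, y, 1)` returns for the same data.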

New cards

the OLS estimator for B1 can be obtained by plugging in the BLANK of xi and yi in BLANK and plugging in another BLANK for each outer expectation

sample averages, population average, sample average

New cards

if there were more than one x in (1), then the formula for B1 would be the BLANK, except xi1 would be replaced with the BLANK from a regression of xi1 on the other xs

same, residual

New cards

the BLANK theorem says you can control for other explanatory variables in estimating the effect of an x on y by either including the other variables directly or regressing y on the BLANK from a regression of x on the other variables

FWL, residuals
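
The FWL result can be checked numerically. The sketch below assumes NumPy and uses simulated data; the helper `ols` and all variable names are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)
ones = np.ones(n)

# Route 1: include x2 directly in the regression.
b_long = ols(np.column_stack([ones, x1, x2]), y)

# Route 2: residualize x1 on a constant and x2, then regress y on that residual.
Z = np.column_stack([ones, x2])
x1_resid = x1 - Z @ ols(Z, x1)
b_fwl = (x1_resid @ y) / (x1_resid @ x1_resid)
```

Both routes should give the same coefficient on x1, up to floating-point error.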

New cards

when the PRF includes more than one x, we say that B1 measures the BLANK effect of x1 (without necessarily giving it a causal interpretation)

partial

New cards

if E(ui | xi1) = 0 in (1), xi1 is BLANK of ui and the sampling error of B1 hat equals BLANK on average, which implies that B1 hat is BLANK

mean independent, 0, unbiased

New cards

(if E(ui | xi1,xi2) = 0) if you omit xi2 from (2), B1 hat will be biased BLANK if B2 and cov(xi1,xi2) have the same BLANK

upward, sign
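
A sketch of the omitted-variable logic, assuming NumPy and simulated data with B2 > 0 and cov(x1, x2) > 0. All names are hypothetical. In any sample, the short-regression slope equals the long-regression slope plus B2 hat times the slope from regressing x2 on x1.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Simulated data, for illustration only.
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)                    # positive covariance with x1
y = 1.0 + 2.0 * x1 + 1.5 * x2 + rng.normal(size=n)    # B2 is positive
ones = np.ones(n)

b_long = ols(np.column_stack([ones, x1, x2]), y)   # includes x2
b_short = ols(np.column_stack([ones, x1]), y)      # omits x2
delta = ols(np.column_stack([ones, x1]), x2)[1]    # slope of x2 on x1

# Same signs for B2 hat and delta give an upward shift in the short slope.
bias = b_long[2] * delta
```

Here `b_short[1]` should equal `b_long[1] + bias`, with `bias` positive because both factors are positive.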

New cards

if yi is log wage, xi1 is education and xi2 is labor market experience, and you omit xi2 from (2), then B1 hat will be biased BLANK because B2 is BLANK and xi1 and xi2 are BLANK correlated

downward, positive, negatively

New cards

let’s say you don’t omit xi2, but it is measured with error. Then B2 hat will be

biased down

New cards

R² measures how much of the variance of the BLANK variable is accounted for by the BLANK variables

dependent, explanatory

New cards

true or false: R² is centrally important for doing causal inference

false

New cards

basic OLS inference is grounded in the application of the BLANK, which says that the BLANK of the OLS estimator can be regarded as approximately BLANK for large samples

CLT, sampling distribution, normal

New cards

the modern approach to regression inference allows the variance of the errors to depend on the BLANK variables

explanatory

New cards

the modern approach means we should always report BLANK standard errors and test statistics

robust

New cards

the R function lm gives the wrong standard errors, test statistics, and confidence intervals because it ignores

heteroscedasticity
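
A sketch of the difference between classical and heteroscedasticity-robust standard errors, assuming NumPy. It uses the simplest sandwich form (often labeled HC0); other robust variants apply small-sample corrections. The function name `ols_with_se` and the simulated data are hypothetical.

```python
import numpy as np

def ols_with_se(X, y):
    """OLS coefficients with classical and HC0 robust standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Classical: one error variance shared by all observations.
    sigma2 = resid @ resid / (n - k)
    se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

    # Robust (HC0): each observation contributes its own squared residual.
    meat = (X * resid[:, None] ** 2).T @ X
    se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, se_classical, se_robust

# Simulated data whose error spread grows with x, for illustration only.
rng = np.random.default_rng(2)
n = 2000
x = rng.uniform(0.0, 4.0, size=n)
y = 1.0 + 2.0 * x + x * rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

beta, se_classical, se_robust = ols_with_se(X, y)
```

The coefficient estimates are identical under both approaches; only the standard errors, and therefore the test statistics and confidence intervals, differ.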

New cards

if E(ui|xi1)=0 in (1), the sampling error of B1 hat converges to 0 and B1 hat is BLANK

consistent

New cards

the test statistic for whether an explanatory variable has a statistically significant association with the dependent variable is the ratio of the explanatory variable’s BLANK to its BLANK

estimated coefficient, standard error

New cards

in (2), the test statistic for the null hypothesis that B2=1 is BLANK

(B2 hat - 1) / se(B2 hat)

New cards

larger BLANK statistics and smaller BLANK values indicate stronger evidence BLANK the null hypothesis

test, p, against

New cards

suppose yi= Bo + B1xi1+ B2xi2 + B3xi3 + B4xi4 + ui. To test the null that B3=B4=0, you can use a BLANK test, which compares the fit of a short regression that BLANK x3 and x4 with the fit of a longer regression that BLANK them

F, excludes, includes
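
A sketch of the fit comparison behind this test, assuming NumPy. It computes the classical F statistic from the sums of squared residuals of the short and long regressions, which is not the robust version. The helper names and the simulated data are hypothetical.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from regressing y on the columns of X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return resid @ resid

def f_stat(X_short, X_long, y):
    """Classical F statistic for the restrictions that turn long into short."""
    n, k_long = X_long.shape
    q = k_long - X_short.shape[1]          # number of restrictions tested
    ssr_short, ssr_long = ssr(X_short, y), ssr(X_long, y)
    return ((ssr_short - ssr_long) / q) / (ssr_long / (n - k_long))

# Simulated data in which x3 and x4 matter, for illustration only.
rng = np.random.default_rng(3)
n = 300
x1, x2, x3, x4 = rng.normal(size=(4, n))
y = 1.0 + x1 + x2 + 2.0 * x3 + 2.0 * x4 + rng.normal(size=n)
ones = np.ones(n)

X_short = np.column_stack([ones, x1, x2])          # excludes x3 and x4
X_long = np.column_stack([ones, x1, x2, x3, x4])   # includes them
F = f_stat(X_short, X_long, y)
```

A large F means the long regression fits much better than the short one, which is evidence against the null that B3 = B4 = 0.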

New cards

true or false. if corr(x, y) = 0, y does not depend on x

false

New cards

true or false. if x causes y, the conditional distribution of y given x must depend on x

true

New cards

in the above DAG, z is a BLANK

confounder

New cards

you can’t observe the effect of a treatment on an individual bc you can’t observe their BLANK outcome. In this sense, causal inference is fundamentally a BLANK data problem

counterfactual, missing

New cards

while individual treatment effects are not observable, you may be able to identify the average treatment effect (ATE), which is the difference in average BLANK outcomes

potential

New cards

Using the difference in sample average outcomes for treated and untreated individuals generally won’t work for estimating the ATE because potential outcomes are not independent of treatment assignment, which results in what kind of bias?

selection

New cards

term 1 in (1) is E(y1i|Di = 1) − E(y0i|Di = 1)

the average treatment effect on the treated

New cards

term 2 in (1) is E(y0i|Di = 1) − E(y0i|Di = 0)

selection bias

New cards

if treatment assignment is randomized, then term 2, E(y0i|Di = 1) − E(y0i|Di = 0), equals BLANK and term 1, E(y1i|Di = 1) − E(y0i|Di = 1), equals BLANK

0, ATE
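
Equation (1), which these cards refer to, is not reproduced in this set. The decomposition below is the standard one that matches term 1 and term 2 as the cards describe them, and it is offered on that assumption: the observed difference in average outcomes splits into the effect on the treated plus selection bias.

```latex
\begin{aligned}
E(y_i \mid D_i = 1) - E(y_i \mid D_i = 0)
  &= \underbrace{E(y_{1i} \mid D_i = 1) - E(y_{0i} \mid D_i = 1)}_{\text{term 1: effect on the treated}} \\
  &\quad + \underbrace{E(y_{0i} \mid D_i = 1) - E(y_{0i} \mid D_i = 0)}_{\text{term 2: selection bias}}
\end{aligned}
```

The identity follows from adding and subtracting E(y0i | Di = 1), the unobserved counterfactual average for the treated.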

New cards

if the potential outcomes are BLANK of treatment assignment, the assignment mechanism is BLANK and the difference in sample average outcomes for treated and untreated individuals will identify the ATE

independent, ignorable

New cards

potential outcomes will be BLANK of treatment assignment if individuals are BLANK assigned to treated and untreated groups

independent, randomly

New cards

the conditional independence assumption (CIA) is a claim that there is a set of covariates such that, once you control for them, you can consider the potential outcomes to be BLANK of treatment assignment. The CIA is a claim of unBLANK and is untestable

independent, confoundedness

New cards

to estimate the ATE under a CIA, you also need overlap, which is the ability to observe BLANK and BLANK units for any set of covariate values

untreated, treated

New cards

if you have a set of control variables for which a CIA holds, you can identify the average effect of the treatment on the outcome by running a regression of the outcome on the BLANK from a regression of the treatment dummy on the controls

residuals

New cards

unlike in standard regression analysis, in RD designs there is no BLANK in treated and control units because individuals with different values of D, the treatment, have different values of the covariate by construction

overlap

New cards

in a sharp RD design, the conditional BLANK assumption holds automatically because treatment assignment is determined solely by the cutoff value of the BLANK variable

independence, running

New cards

in a fuzzy RD design, the cutoff value of the running variable determines the BLANK of treatment

probability

New cards

the key identifying assumption of an RD design is that the average BLANK outcomes are BLANK through the cutoff

potential, continuous

New cards

under the assumptions of a sharp RD design, you identify an

average treatment effect on the treated

New cards

the black lines are linear regression approximations to the CEFs for the BLANK outcomes

potential

New cards

select the regression specification that is consistent with the black lines

yi = Bo + B1xi + tDi + ui

New cards

under the key identifying assumption of a sharp RD design, the model in question 7 identifies

t = E(y1i - y0i | xi = c)

New cards

the basis for an RD analysis should be apparent in a binned BLANK plot of the outcome and BLANK variable

scatter, running

New cards

in general, the RD specification should include a low-order BLANK in the running variable and an interaction of the running variable with the BLANK indicator

polynomial, treatment

New cards

the distribution of the running variable should show

no evidence of manipulation because it is smooth through the cutoff

New cards

an RD analysis of baseline BLANK should show no evidence of BLANK among them

covariates, discontinuities

New cards

including the baseline BLANK in the regression model BLANK affect the estimated treatment effect

covariates, should not

New cards

the ldurat difference in differences is

0.20

New cards

the benefit difference in differences is

New cards

the high-earner group is BLANK male and BLANK married, but the male and married shares BLANK change over time for either group

more, more, do not

New cards

the high-earner group is BLANK likely to work in manufacturing and BLANK likely to work in construction, and the share of high earners in construction BLANK by BLANK points after the WBA increase

less, more, falls, 4

New cards

based on table 1, average time out of work rose BLANK % because of the WBA increase

New cards

column (1) indicates that time out of work (BLANK) rise for low earners

did not

New cards

column (1) indicates that average time out of work was BLANK % BLANK for high earners

25.6, higher

New cards

the results in column (1) suggest that time out of work rose BLANK % in Kentucky because of the WBA increase

19.1

New cards

the standard error for the estimated DD coefficient is BLANK, which implies that the result is significant at the BLANK % level

.069, 1

New cards

controlling for gender, industry affiliation and injury type BLANK the DD coefficient estimate for KY by BLANK percentage points

increases, 4

New cards

controlling for gender, industry affiliation and injury type BLANK the overall fit of the regression by BLANK percentage points

increases, 2

New cards

still, the overall fit reported in column (2) is too low for the regression results to be trustworthy

false

New cards

the results in column (3) suggest that time out of work rose BLANK % in Michigan because of the WBA increase

19.2

New cards

the t statistic for the estimated DD coefficient in column (3) is BLANK, which implies you BLANK reject the null at the 5% level

1.25, cannot

New cards

the value of the BLANK test for the null that the coefficients of the controls are jointly zero is BLANK, so the null is BLANK

F, 9.8, rejected

New cards

the metric that we use to compare prediction models is BLANK or MSPE

mean squared prediction error

New cards

mean squared error of (uhat)=

New cards

mean squared error of (uhat) =

0.75

New cards

E(uhat) - u = BLANK, which implies uhat is BLANK

0, unbiased

New cards

although uhat is BLANK, it has a lower mean squared error

biased

New cards

adjusted R² penalizes the inclusion of an additional explanatory variable if the absolute value of its associated t-statistic is less than

New cards

machine learning that involves predicting an outcome with a set of explanatory variables is called BLANK learning

supervised

New cards

choosing the best-performing ML model involves empirically tuning model complexity through

cross-validation

New cards

cross-validation begins by dividing the data into BLANK and BLANK samples

training, testing

New cards

the training sample is divided into BLANK, one of which is held out for BLANK while the others are used to BLANK the model

folds, validation, estimate

New cards

cross-validation involves computing the BLANK for each fold and BLANK them over all folds

MSPE, averaging

New cards

cross-validation is repeated for different values of the BLANK parameter, which determines the strength of the BLANK imposed by the regularizer

tuning, penalty
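
The cross-validation steps in the cards above can be sketched as follows, assuming NumPy. Ridge regression is used here only because it has a closed-form solution; it stands in for whatever regularized model is being tuned. All function names, the penalty grid and the simulated training sample are hypothetical, and the separate testing sample is left out of the sketch.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients (no intercept; data assumed centered)."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

def cv_mspe(X, y, lam, n_folds=5, seed=0):
    """Average held-out MSPE over the folds for one value of the penalty."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    mspe = []
    for held_out in folds:
        # Estimate on every fold except the one held out for validation.
        train = np.setdiff1d(np.arange(len(y)), held_out)
        beta = ridge_fit(X[train], y[train], lam)
        err = y[held_out] - X[held_out] @ beta
        mspe.append(np.mean(err ** 2))
    return float(np.mean(mspe))

# Simulated training sample, for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 10))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=120)

# Repeat the procedure over a grid of tuning-parameter values.
lambdas = [0.01, 1.0, 10.0, 100.0, 1000.0]
scores = {lam: cv_mspe(X, y, lam) for lam in lambdas}
best_lam = min(scores, key=scores.get)
```

The penalty with the lowest average MSPE is the one carried forward; a very large penalty shrinks every coefficient toward zero and predicts poorly in this example.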

New cards

LASSO is a BLANK estimator that also performs variable BLANK by forcing the coefficients of the least relevant variables to be equal to BLANK

shrinkage, selection, zero

New cards

the 2×2 DD analysis compares the difference in average outcomes for the BLANK observations before and BLANK treatment with the difference in mean outcomes for the control observations BLANK and BLANK treatment

treated, after, before, after

New cards

a DD analysis targets the average treatment effect on the BLANK

treated

New cards

the target estimand cannot be estimated directly because E(y0 | g=1, t=1) is BLANK

unobserved

New cards

the key identifying assumption in a DD analysis is that the treated and untreated outcomes would follow BLANK trends in the BLANK of the treatment

parallel, absence

New cards

a simple before vs after comparison of treated observations misses the BLANK in the outcome not associated with treatment

trends

New cards

a simple comparison of treated vs control observations after treatment misses factors that cause non-random BLANK into treatment

selection

New cards

the parameter Y reflects the average difference between BLANK and BLANK outcomes before treatment

treated, untreated

New cards

the parameter N reflects the average differences in outcomes BLANK and BLANK treatment for the untreated group

before, after

New cards

the parameter N also reflects the BLANK average difference in outcomes between periods 0 and 1 for the BLANK group

counterfactual, treated

New cards

if N varied by group, the BLANK assumption would not hold

parallel trends

New cards

the parameter S represents the BLANK

difference in differences

New cards

the standard 2×2 DD analysis can be carried out by regressing the outcome on a BLANK dummy, a period BLANK, and their BLANK

group, dummy, interaction

New cards

a formal expression of the DD regression consistent with table 2 is:

y = u + Y treat + N after + S treat*after + ua
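
A sketch of the 2×2 DD regression, assuming NumPy. The eight observations are made up for illustration. With a group dummy, a period dummy and their interaction, the four coefficients correspond to the intercept, Y, N and S in the cards above.

```python
import numpy as np

# Hypothetical 2x2 data, for illustration only.
treat = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # group dummy
after = np.array([0, 0, 1, 1, 0, 0, 1, 1])   # period dummy
y = np.array([1.0, 1.2, 1.5, 1.7, 2.0, 2.2, 3.0, 3.2])

def cell_mean(g, t):
    """Average outcome for group g in period t."""
    return y[(treat == g) & (after == t)].mean()

# Difference in differences computed from the four cell means.
dd_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# The same number as the interaction coefficient in the regression.
X = np.column_stack([np.ones_like(y), treat, after, treat * after])
coef = np.linalg.lstsq(X, y, rcond=None)[0]
intercept, Y_hat, N_hat, S_hat = coef
```

Because the regression is saturated, `S_hat` matches `dd_means` exactly, `Y_hat` is the pre-treatment gap between groups, and `N_hat` is the before-and-after change for the untreated group.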

New cards

a regression formulation of DD design is appealing because it

all of the above

New cards

we described a TWFE model as a regression model for data with both a BLANK and time dimension

group

New cards

estimating a TWFE model with data on multiple groups and variation in treatment timing can identify the ATE if the treatment effect is

homogeneous

New cards

computing the correct standard errors for TWFE estimates usually requires BLANK at the group level to account for BLANK and BLANK correlation

clustering, heteroscedasticity, serial