BUSN 5000 FINAL


1

If we say E(y|x) = Bo + B1x, where Bo and B1 solve the population least-squares problem, then the CEF is the population regression BLANK and Bo and B1 are population regression BLANK

function, coefficients

2

the population regression function provides the best BLANK to the CEF

linear approximation

3

(simple regression model) the coefficient B1 measures the BLANK in y BLANK with a BLANK in x1, holding all of the unobservables constant

change, associated, unit change

4

(simple regression model) if Bo and B1 solve the population least-squares problem, their values BLANK the expected value of the BLANK difference between the dependent variable and the CEF

minimize, squared

5

the value of B1 that solves the population least-squares problem is:

cov(xi, yi) / var(xi)
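
A minimal sketch of the sample analogue of this formula, using only the Python standard library. The data and the helper names (mean, covariance, ols_simple) are hypothetical and chosen for illustration; the points lie exactly on y = 1 + 2x so the result is easy to check by hand.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance (dividing by n; the divisor cancels in the slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

def ols_simple(xs, ys):
    """Return (b0_hat, b1_hat) for a regression of y on a constant and x."""
    b1 = covariance(xs, ys) / covariance(xs, xs)  # cov(x, y) / var(x)
    b0 = mean(ys) - b1 * mean(xs)
    return b0, b1

# Hypothetical data lying exactly on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
b0_hat, b1_hat = ols_simple(x, y)
print(b0_hat, b1_hat)
```

Replacing the population covariance and variance with sample averages in this way is the plug-in step described in the next card.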

6

the OLS estimator for B1 can be obtained by plugging in the BLANK of xi and yi in place of each BLANK and plugging in another BLANK for each outer expectation

sample averages, population average, sample average

7

if there were more than one x in (1), then the formula for B1 would be the BLANK, except xi1 would be replaced with the BLANK from a regression of xi1 on the other xs

same, residual

8

the BLANK theorem says you can control for other explanatory variables in estimating the effect of an x on y by either including the other variables directly or regressing y on the BLANK from a regression of x on the other variables

FWL, residuals
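
The equivalence stated in this card can be checked numerically. The sketch below assumes a model with a constant and two regressors, and uses made-up data and hypothetical helper names (long_regression, fwl_coefficient). It computes B1 hat once from the full regression and once by regressing y on the residuals from a regression of x1 on x2.

```python
def mean(v):
    return sum(v) / len(v)

def demean(v):
    m = mean(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def long_regression(y, x1, x2):
    """Coefficients (b1, b2) from regressing y on a constant, x1 and x2."""
    yd, d1, d2 = demean(y), demean(x1), demean(x2)
    s11, s22, s12 = dot(d1, d1), dot(d2, d2), dot(d1, d2)
    s1y, s2y = dot(d1, yd), dot(d2, yd)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def fwl_coefficient(y, x1, x2):
    """b1 via FWL: regress x1 on x2, then regress y on the residuals."""
    d1, d2 = demean(x1), demean(x2)
    slope = dot(d1, d2) / dot(d2, d2)
    resid = [a - slope * b for a, b in zip(d1, d2)]
    return dot(resid, demean(y)) / dot(resid, resid)

# Hypothetical data.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 8.2, 8.8, 13.1, 13.7]

b1_long, b2_long = long_regression(y, x1, x2)
b1_fwl = fwl_coefficient(y, x1, x2)
print(b1_long, b1_fwl)  # the two numbers agree up to rounding error
```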

9

when the PRF includes more than one x, we say that B1 measures the BLANK effect of x1 (without necessarily giving it a causal interpretation)

partial

10

if E(ui | xi1) = 0 in (1), xi1 is BLANK of ui and the sampling error of B1 hat equals BLANK on average, which implies that B1 hat is BLANK

mean independent, 0, unbiased

11

(if E(ui | xi1,xi2) = 0) if you omit xi2 from (2), B1 hat will be biased BLANK if B2 and cov(xi1,xi2) have the same BLANK

upward, sign

12

if yi is log wage, xi1 is education and xi2 is labor market experience, and you omit xi2 from (2), then B1 hat will be biased BLANK because B2 is BLANK and xi1 and xi2 are BLANK correlated

downward, positive, negatively
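
The sign rule in the last two cards follows from an identity that holds exactly in any sample: the short-regression coefficient equals the long-regression coefficient plus B2 hat times the slope from a regression of x2 on x1. The sketch below checks this on hypothetical education, experience and log-wage numbers. The data are invented and generated without noise from y = 0.5 + 0.10*x1 + 0.03*x2, so that the long regression recovers those coefficients.

```python
def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def slope(y, x):
    """Slope from a simple regression of y on a constant and x."""
    yd, xd = demean(y), demean(x)
    return dot(xd, yd) / dot(xd, xd)

def long_regression(y, x1, x2):
    """Coefficients (b1, b2) from regressing y on a constant, x1 and x2."""
    yd, d1, d2 = demean(y), demean(x1), demean(x2)
    s11, s22, s12 = dot(d1, d1), dot(d2, d2), dot(d1, d2)
    s1y, s2y = dot(d1, yd), dot(d2, yd)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Hypothetical data: more education goes with less experience.
educ = [10.0, 12.0, 12.0, 14.0, 16.0, 18.0]
exper = [20.0, 15.0, 18.0, 10.0, 8.0, 3.0]
log_wage = [0.5 + 0.10 * a + 0.03 * b for a, b in zip(educ, exper)]

b1_long, b2_long = long_regression(log_wage, educ, exper)
b1_short = slope(log_wage, educ)   # experience omitted
delta = slope(exper, educ)         # negative: educ and exper move in opposite directions
omitted_variable_bias = b2_long * delta

print(b1_long, b1_short, omitted_variable_bias)
```

Because B2 hat is positive and delta is negative here, the bias term is negative and the short regression understates the coefficient on education.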

13

let’s say you don’t omit xi2, but it is measured with error. Then B2 hat will be

biased toward zero (attenuation bias)

14

R² measures how much the variance of the BLANK variable is accounted for by the BLANK variables

dependent, explanatory

15

true or false: R² is centrally important for doing causal inference

false

16

basic OLS inference is grounded in the application of the BLANK, which says that the BLANK of the OLS estimator can be regarded as approximately BLANK for large samples

CLT, sampling distribution, normal

17

the modern approach to regression inference allows the variance of the errors to depend on the BLANK variables

explanatory

18

the modern approach means we should always report BLANK standard errors and test statistics

robust

19

the R function lm gives the wrong standard errors, test statistics, and confidence intervals because it ignores

heteroscedasticity
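
To make the contrast concrete, here is a minimal sketch for a simple regression that computes the classical standard error (which assumes one common error variance) next to a heteroscedasticity-robust one. The robust version uses the HC0 formula, which is one of several robust variants and is an assumption of this example; the data and the function name are hypothetical.

```python
import math

def slope_with_standard_errors(x, y):
    """Return (b1_hat, classical_se, robust_se) for a regression of y on a constant and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    b1 = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [b - b0 - b1 * a for a, b in zip(x, y)]
    # Classical formula: one common error variance for every observation.
    classical = math.sqrt(sum(e ** 2 for e in resid) / (n - 2) / sxx)
    # HC0 formula: each squared residual is weighted by its own (x - x_bar)^2.
    robust = math.sqrt(sum((a - x_bar) ** 2 * e ** 2 for a, e in zip(x, resid))) / sxx
    return b1, classical, robust

# Hypothetical data in which the spread of y grows with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.3, 7.7, 10.6, 11.4, 15.0, 15.0]
b1_hat, classical_se, robust_se = slope_with_standard_errors(x, y)
print(b1_hat, classical_se, robust_se)
```

The coefficient estimate is the same under both approaches; only the standard error, and therefore the test statistics and confidence intervals, change.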

20

if E(ui|xi1)=0 in (1), the sampling error of B1 hat converges to 0 and B1 hat is BLANK

consistent

21

the test statistic for whether an explanatory variable has a statistically significant association with the dependent variable is the ratio of the explanatory variable’s BLANK to its BLANK

estimated coefficient, standard error

22

in (2), the test statistic for the null hypothesis that B2=1 is BLANK

(B2 hat - 1)/se(B2 hat)

23

larger BLANK statistics and smaller BLANK values indicate stronger evidence BLANK the null hypothesis

test, p, against
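
The last three cards can be sketched in a few lines. The functions below are illustrative helpers with hypothetical names, and the p-value relies on the large-sample normal approximation from the CLT card rather than an exact t distribution; the estimate and standard error are made up.

```python
import math

def t_statistic(estimate, std_error, null_value=0.0):
    """(estimate - value under the null) / standard error."""
    return (estimate - null_value) / std_error

def two_sided_p_value(t):
    """Two-sided p-value using the standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))

# Hypothetical numbers: B2 hat = 1.3 with standard error 0.2.
t_zero = t_statistic(1.3, 0.2)       # null: B2 = 0
t_one = t_statistic(1.3, 0.2, 1.0)   # null: B2 = 1
print(t_zero, two_sided_p_value(t_zero))
print(t_one, two_sided_p_value(t_one))
```

The same estimate gives a large statistic and a tiny p-value against B2 = 0 but a much weaker result against B2 = 1, which is why the null value belongs in the numerator.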

24

suppose yi= Bo + B1xi1+ B2xi2 + B3xi3 + B4xi4 + ui. To test the null that B3=B4=0, you can use a BLANK test, which compares the fit of a short regression that BLANK x3 and x4 with the fit of a longer regression that BLANK them

F, excludes, includes

25

true or false. if corr(x, y) = 0, y does not depend on x

false

26

true or false. if x causes y, the conditional distribution of y given x must depend on x

true

27

in the above DAG, z is a BLANK

confounder

28

you can’t observe the effect of a treatment on an individual because you can’t observe their BLANK outcome. In this sense, causal inference is fundamentally a BLANK data problem

counterfactual, missing

29

while individual treatment effects are not observable, you may be able to identify the average treatment effect (ATE), which is the difference in average BLANK outcomes

potential

30

Using the difference in sample average outcomes for treated and untreated individuals generally won’t work for estimating the ATE because potential outcomes are not independent of treatment assignment, which results in what kind of bias?

selection

31

term 1 in (1) is E(y1i|Di = 1) − E(y0i|Di = 1)

the average treatment effect on the treated (ATT)

32

term 2 in (1) is E(y0i|Di = 1) − E(y0i|Di = 0)

selection bias

33

if treatment assignment is randomized, then term 2, E(y0i|Di = 1) − E(y0i|Di = 0), equals BLANK and term 1, E(y1i|Di = 1) − E(y0i|Di = 1), equals BLANK

0, ATE
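
The decomposition behind the last three cards can be verified on a toy data set in which both potential outcomes are written down for every unit, something that is never possible with real data. All numbers below are hypothetical.

```python
def mean(v):
    return sum(v) / len(v)

# Hypothetical units as (y0, y1, d): both potential outcomes plus treatment status.
units = [
    (10.0, 13.0, 1), (12.0, 15.0, 1), (14.0, 16.0, 1),
    (5.0, 6.0, 0), (7.0, 9.0, 0), (6.0, 8.0, 0),
]
treated = [u for u in units if u[2] == 1]
untreated = [u for u in units if u[2] == 0]

# What the data actually show: y1 for the treated, y0 for the untreated.
naive_difference = mean([y1 for _, y1, _ in treated]) - mean([y0 for y0, _, _ in untreated])

# Term 1: E(y1 | D = 1) - E(y0 | D = 1), the effect on the treated.
att = mean([y1 - y0 for y0, y1, _ in treated])

# Term 2: E(y0 | D = 1) - E(y0 | D = 0), the selection bias.
selection_bias = mean([y0 for y0, _, _ in treated]) - mean([y0 for y0, _, _ in untreated])

print(naive_difference, att, selection_bias)
```

Here the treated units would have had higher outcomes even without treatment, so the naive comparison overstates the effect by exactly the selection-bias term.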

34

if the potential outcomes are BLANK of treatment assignment, the assignment mechanism is BLANK and the difference in sample average outcomes for treated and untreated individuals will identify the ATE

independent, ignorable

35

potential outcomes will be BLANK of treatment assignment if individuals are BLANK assigned to treated and untreated groups

independent, randomly

36

the conditional independence assumption (CIA) is a claim that there is a set of covariates such that, once you control for them, you can consider the potential outcomes to be BLANK of treatment assignment. The CIA is a claim of unBLANK and is untestable

independent, confoundedness

37

to estimate the ATE under a CIA, you also need overlap, which is the ability to observe BLANK and BLANK units for any set of covariate values

untreated, treated

38

if you have a set of control variables for which a CIA holds, you can identify the average effect of the treatment on the outcome by running a regression of the outcome on the BLANK from a regression of the treatment dummy on the controls

residuals

39

unlike in standard regression analysis, in RD designs there is no BLANK in treated and control units because individuals with different values of D, the treatment, have different values of the covariate by construction

overlap

40

in a sharp RD design, the conditional BLANK assumption holds automatically because treatment assignment is determined solely by the cutoff value of the BLANK variable

independence, running

41

in a fuzzy RD design, the cutoff value of the running variable determines the BLANK of treatment

probability

42

the key identifying assumption of an RD design is that the average BLANK outcomes are BLANK through the cutoff

potential, continuous

43

under the assumptions of a sharp RD design, you identify an

average treatment effect on the treated

44

the black lines are linear regression approximations to the CEFs for the BLANK outcomes

potential

45

select the regression specification that is consistent with the black lines

yi = Bo + B1xi + tDi + ui

46

under the key identifying assumption of a sharp RD design, the model in question 7 identifies

t = E(y1i - y0i | xi = c)

47

the basis for an RD analysis should be apparent in a binned BLANK plot of the outcome and BLANK variable

scatter, running

48

in general, the RD specification should include a low-order BLANK in the running variable and an interaction of the running variable with the BLANK indicator

polynomial, treatment
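
One way to write such a specification, assuming a first-order polynomial and a running variable centered at the cutoff c (both are choices made for this illustration, not taken from the cards):

```latex
y_i = \beta_0 + \beta_1 (x_i - c) + \tau D_i + \beta_2 D_i (x_i - c) + u_i,
\qquad D_i = \mathbf{1}\{x_i \ge c\}
```

With this centering, \tau is the jump in the regression line at x_i = c, and \beta_2 lets the slope differ on the two sides of the cutoff. Dropping the interaction term gives the common-slope specification from the earlier card.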

49

the distribution of the running variable should show

no evidence of manipulation because it is smooth through the cutoff

50

an RD analysis of baseline BLANK should show no evidence of BLANK among them

covariates, discontinuities

51

including the baseline BLANK in the regression model BLANK affect the estimated treatment effect

covariates, should not

52

the ldurat difference in differences is

0.20

53

the benefit difference in differences is

88

54

the high-earner group is BLANK male and BLANK married, but the male and married shares BLANK change over time for either group

more, more, do not

55

the high-earner group is BLANK likely to work in manufacturing and BLANK likely to work in construction, and the share of high earners in construction BLANK by BLANK points after the WBA increase

less, more, falls, 4

56

based on table 1, average time out of work rose BLANK % because of the WBA increase

20

57

column (1) indicates that time out of work (BLANK) rise for low earners

did not

58

column (1) indicates that average time out of work was BLANK % BLANK for high earners

25.6, higher

59

the results in column (1) suggest that time out of work rose BLANK % in Kentucky because of the WBA increase

19.1

60

the standard error for the estimated DD coefficient is BLANK, which implies that the result is significant at the BLANK % level

.069, 1

61

controlling for gender, industry affiliation and injury type BLANK the DD coefficient estimate for KY by BLANK percentage points

increases, 4

62

controlling for gender, industry affiliation and injury type BLANK the overall fit of the regression by BLANK percentage points

increases, 2

63

still, the overall fit reported in column (2) is too low for the regression results to be trustworthy

false

64

the results in column (3) suggest that time out of work rose BLANK % in Michigan because of the WBA increase

19.2

65

the t statistic for the estimated DD coefficient in column (3) is BLANK, which implies you BLANK reject the null at the 5% level

1.25, cannot

66

the value of the BLANK test for the null that the coefficients of the control are jointly zero is BLANK, so the null is BLANK

F, 9.8, rejected

67

the metric that we use to compare prediction models is BLANK or MSPE

mean squared prediction error

68

mean squared error of (uhat)=

1

69

mean squared error of (uhat)=

0.75

70

E(uhat) - u = BLANK, which implies uhat is BLANK

0, unbiased

71

although uhat is BLANK, it has a lower mean squared error

biased
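
The trade-off in the last two cards follows from the standard decomposition of mean squared error into variance and squared bias. It is written here on the assumption that "u" in these cards stands for the parameter \mu being estimated:

```latex
\operatorname{MSE}(\hat{\mu})
  = E\big[(\hat{\mu} - \mu)^2\big]
  = \operatorname{Var}(\hat{\mu}) + \big(E[\hat{\mu}] - \mu\big)^2
```

A biased estimator has a nonzero second term, but it can still have the lower mean squared error if its variance is small enough.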

72

adjusted R² penalizes the inclusion of an additional explanatory variable if the absolute value of its associated t-statistic is less than

1

73

machine learning that involves predicting an outcome with a set of explanatory variables is called BLANK learning

supervised

74

choosing the best-performing ML model involves empirically tuning model complexity through

cross-validation

75

cross-validation begins by dividing the data into BLANK and BLANK samples

training, testing

76

the training sample is divided into BLANK, one of which is held out for BLANK while the others are used to BLANK the model

folds, validation, estimate

77

cross-validation involves computing the BLANK for each fold and BLANK them over all folds

MSPE, averaging

78

cross-validation is repeated for different values of the BLANK parameter, which determines the strength of the BLANK imposed by the regularizer

tuning, penalty
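
The cross-validation steps in the preceding cards can be sketched as follows. This is a simplified illustration with several assumptions: a ridge-style penalty on a single slope, variables that are already demeaned (so there is no intercept), folds assigned by position rather than at random, and a testing sample that has already been set aside. The data and function names are hypothetical.

```python
def ridge_slope(x, y, penalty):
    """No-intercept ridge slope: sum(x*y) / (sum(x*x) + penalty)."""
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + penalty)

def cv_mspe(x, y, penalty, n_folds):
    """Average over folds of the mean squared prediction error on the held-out fold."""
    n = len(x)
    fold_errors = []
    for k in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == k]
        kept = [i for i in range(n) if i % n_folds != k]
        # Estimate the model on the other folds only.
        slope = ridge_slope([x[i] for i in kept], [y[i] for i in kept], penalty)
        # Validate on the fold that was held out.
        errors = [(y[i] - slope * x[i]) ** 2 for i in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / n_folds

def choose_penalty(x, y, grid, n_folds=3):
    """Repeat cross-validation for each tuning value and keep the best one."""
    return min(grid, key=lambda penalty: cv_mspe(x, y, penalty, n_folds))

# Hypothetical training sample (demeaned).
x_train = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
y_train = [-4.1, -2.9, -2.2, -0.8, -0.1, 1.1, 1.9, 3.2, 3.9]
grid = [0.0, 0.1, 1.0, 10.0]

best_penalty = choose_penalty(x_train, y_train, grid)
print(best_penalty, cv_mspe(x_train, y_train, best_penalty, 3))
```

After the tuning value is chosen, the model would be re-estimated on the full training sample and evaluated once on the testing sample.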

79

LASSO is a BLANK estimator that also performs variable BLANK by forcing the coefficients of the least relevant variables to be equal to BLANK

shrinkage, selection, zero
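
Both properties are easiest to see in a special case. Assuming orthonormal regressors and an objective of the form (1/2) * SSR + penalty * sum(|b|), the LASSO coefficient is the OLS coefficient passed through a soft-thresholding rule. The OLS values and the penalty below are hypothetical.

```python
def soft_threshold(ols_coef, penalty):
    """Shrink a coefficient toward zero by `penalty`, setting it to zero if it is smaller than that."""
    if ols_coef > penalty:
        return ols_coef - penalty
    if ols_coef < -penalty:
        return ols_coef + penalty
    return 0.0

# Hypothetical OLS coefficients on four orthonormal regressors.
ols_coefs = {"x1": 2.5, "x2": -0.3, "x3": 0.1, "x4": -1.2}
lasso_coefs = {name: soft_threshold(b, 0.5) for name, b in ols_coefs.items()}
print(lasso_coefs)
```

The large coefficients are shrunk toward zero and the small ones are set to zero, which removes x2 and x3 from the model. With correlated regressors there is no closed form like this and the solution is computed numerically.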

80

the 2×2 DD analysis compares the difference in average outcomes for the BLANK observations before and BLANK treatment with the difference in mean outcomes for the control observations BLANK and BLANK treatment

treated, after, before, after

81

a DD analysis targets the average treatment effect on the BLANK

treated

82

the target estimand cannot be estimated directly because E(y0|g=1,t=1) is BLANK

unobserved

83

the key identifying assumption in a DD analysis is that the treated and untreated outcomes would follow BLANK trends in the BLANK of the treatment

parallel, absence

84

a simple before vs after comparison of treated observations misses the BLANK in the outcome not associated with treatment

trends

85

a simple comparison of treated vs control observations after treatment misses factors that cause non-random BLANK into treatment

selection

86

the parameter Y reflects the average difference between BLANK and BLANK outcomes before treatment

treated, untreated

87

the parameter N reflects the average differences in outcomes BLANK and BLANK treatment for the untreated group

before, after

88

the parameter N also reflects the BLANK average difference in outcomes between periods 0 and 1 for the BLANK group

counterfactual, treated

89

if N varied by group, the BLANK assumption would not hold

parallel trends

90

the parameter S represents the BLANK

difference in differences

91

the standard 2×2 DD analysis can be carried out by regressing the outcome on a BLANK dummy, a period BLANK, and their BLANK

group, dummy, interaction

92

a formal expression of the DD regression consistent with table 2 is:

y = u + Y*treat + N*after + S*treat*after + ua
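
Because this regression has one parameter for each of the four group-by-period cells, its coefficients can be read directly off the cell means. The sketch below uses made-up observations and variable names that mirror the card's symbols (u, Y, N, S); it does not use the numbers from table 2.

```python
def mean(v):
    return sum(v) / len(v)

# Hypothetical observations as (treat, after, outcome).
data = [
    (0, 0, 10.0), (0, 0, 12.0),
    (0, 1, 13.0), (0, 1, 15.0),
    (1, 0, 14.0), (1, 0, 16.0),
    (1, 1, 20.0), (1, 1, 22.0),
]

def cell_mean(treat, after):
    return mean([y for g, t, y in data if g == treat and t == after])

coef_u = cell_mean(0, 0)            # control group, before
coef_Y = cell_mean(1, 0) - coef_u   # treated minus control, before treatment
coef_N = cell_mean(0, 1) - coef_u   # after minus before, control group
coef_S = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

print(coef_u, coef_Y, coef_N, coef_S)
```

Adding the four coefficients reproduces the mean of the treated group after treatment, so S is the part of that mean not explained by the pre-existing group gap or by the common trend.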

93

a regression formulation of DD design is appealing because it

all of the above

94

we described a TWFE model as a regression model for data with both a BLANK and time dimension

group

95

estimating a TWFE model with data on multiple groups and variation in treatment timing can identify the ATE if the treatment effect is

homogeneous

96

computing the correct standard errors for TWFE estimates usually requires BLANK at the group level to account for BLANK and BLANK correlation

clustering, heteroscedasticity, serial
