gov 300 final

130 Terms

1
What is Causality
Causality is the connection of phenomena through which one thing or event (the cause) under certain conditions gives rise to, causes, something else (the effect).
2
Factual:
Something that we have observed in reality
3
Counterfactual:
What we would have observed if some key condition were different, but everything else remained the same
4
Causality & Counterfactuals
We can say X causes Y when the factual and the counterfactual differ
5
Ceteris Paribus
We can only establish causality when the only difference between the factual and the counterfactual is the key condition X, and nothing else.
6
Fundamental Problem of Causal Inference
We observe what happens (what is),
but not what would have happened (what would have been).
We never observe the counterfactual outcome.
7
Potential Outcomes Framework
It is a formal way of thinking about counterfactuals and causality. We imagine that people have multiple potential outcomes, only one of which (the factual) is observed.
8
Causal Effect (or Treatment Effect)
The effect for individual i is the difference between the two potential outcomes:
Yi(1) − Yi(0)
9
SATE
It is the Sample Average Treatment Effect: the average of the individual treatment effects across the units in the sample.
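A minimal sketch of the definition, assuming made-up potential outcomes for five units. The numbers are purely illustrative; in real data we never observe both potential outcomes for the same unit.

```python
# Hypothetical potential outcomes for five units (illustrative numbers only).
y1 = [7, 5, 6, 9, 4]  # Y_i(1): outcome if unit i is treated
y0 = [5, 5, 3, 6, 5]  # Y_i(0): outcome if unit i is not treated

# Individual causal effects: Y_i(1) - Y_i(0)
effects = [treated - untreated for treated, untreated in zip(y1, y0)]

# SATE: the average of the individual effects in the sample
sate = sum(effects) / len(effects)

print("individual effects:", effects)
print("SATE:", sate)
```

Here some individual effects are positive, one is zero, and one is negative, yet the SATE is positive: it only describes the effect on average.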
10
A positive SATE tells us that
on average, our treatment has a positive effect on the outcome.
11
A negative SATE tells us that
on average, our treatment has a negative effect on the outcome.
12
When the SATE is zero that means that
our treatment has no causal effect, on average, on the outcome
13
How do we make the two groups very similar to each other (ideally identical)?
Randomization
14
How do we make sure that the treatment and control groups are very similar to each other in every respect other than the treatment?
We randomly assign the treatment
15
Why randomize?
Randomization makes the two groups very similar to each other.
Provided the sample is not too small, there will be no tendency for one group to have too many or too few of one 'type' of unit.
It makes the treatment and control groups, on average, very similar.
16
DiM is the
average Y of the treated − average Y of the control
17
With randomization of treatment
We can interpret DiM as an estimate of the SATE
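A rough simulation of this idea, assuming invented potential outcomes and a fixed random seed. Because this is a simulation we can compute the true SATE, which is never possible with real data, and compare it to the DiM from a randomized assignment.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 10_000

# Hypothetical potential outcomes for every unit (assumed distributions)
y0 = [rng.gauss(10, 2) for _ in range(n)]        # outcome without treatment
y1 = [base + rng.gauss(2, 1) for base in y0]     # outcome with treatment
sate = sum(a - b for a, b in zip(y1, y0)) / n    # true SATE, known only in a simulation

# Randomize: shuffle the units and treat the first half
units = list(range(n))
rng.shuffle(units)
treated = set(units[: n // 2])

# We only observe Y(1) for treated units and Y(0) for control units
treated_y = [y1[i] for i in range(n) if i in treated]
control_y = [y0[i] for i in range(n) if i not in treated]

# DiM: average Y of the treated minus average Y of the control
dim = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)

print("SATE:", round(sate, 3), "DiM:", round(dim, 3))
```

With a sample this large, the DiM should land close to the SATE, though not exactly on it.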
18
Intuition
When we randomize the treatment, we make the treatment and control groups very similar to each other. Therefore, each group, as a whole, serves as an inferred counterfactual for the other group.
19
Experimental Compliance
Compliance in an experiment occurs when the subjects receive the treatment condition to which they were assigned by the researcher.
20
Placebo Effect
An effect produced by a placebo intervention that cannot be attributed to the intervention itself
21
Placebo treatment:
give those in the control group a "fake" treatment that is not designed to produce any effect itself
22
Hawthorne Effect
A phenomenon where the subjects of a study behave differently just because they know they're being studied
23
Solutions to Hawthorne Effect
Do not let people know they're being studied (controversial)
24
Create a baseline comparison in which people know they're under study, but don't receive any treatment (effectiveness debatable)
25
Internal Validity
The extent to which causal assumptions are satisfied in the study
26
Can we really say that our treatment (or independent variable) is the cause of our outcome?
internal validity
27
External Validity
The extent to which the conclusions can be generalized beyond a particular study
28
Is the study realistic? Would the conclusions apply in real life? Do they apply beyond the sample in the study?
external validity
29
Things that reduce internal validity:
Experimental non-compliance
Lack of randomization (or failed randomization)
Placebo effect
Hawthorne effect
30
Things that reduce external validity:
Using a sample that contains very specific types of people (representativeness)
Hawthorne effect
Unrealistic experimental environment
Unrealistic treatment
31
Three types of experiments:
Lab experiments: high internal validity, low external validity. The sample is usually not representative, and the environment or intervention is unrealistic.
32
Field experiments:
Medium-high internal validity, medium-high external validity. Less control, but a more realistic environment and intervention.
33
Natural experiments:
Medium internal validity, medium-high external validity. No control over the experiment; realistic environment and realistic intervention, but could be too particular.
34
Human Subject
A living individual about whom an investigator conducting research obtains:
- Data through intervention or interaction with the individual.
- Identifiable private information about the subject, including a subject's opinions.
35
Research
Systematic investigation designed to develop or contribute to generalizable knowledge.
36
Human subjects research ethics is governed by:
Informed Consent
Avoidance of harm
Avoidance of deception
Privacy protection
Anonymity and confidentiality
37
Informed Consent
Procedure for ensuring that research participants understand what is being done to them and the limits to their participation, and that they are aware of any potential risks they incur.
38
Privacy
Individual's right to control the disclosure of what they deem personal or non-public information about themselves.
39
Anonymity
It refers to the elimination or strong protection of any identifiable information about the individual, organization, or place in which research took place.
40
Confidentiality
It is concerned with making sure that the information subjects provide is not shared with third parties.
41
Forced response
In this question, I want you to answer yes or no. But I want you to consider the number of your dice throw. If 1 shows on the dice, tell me no. If 6 shows on the dice, tell me yes. But if another number, like 2, 3, 4, or 5 shows, tell me your opinion about the question that I will ask you after you throw the dice. [Respondent throws dice (researcher cannot see). Researcher asks question. Respondent answers.]
42
Researcher does not know whether respondent said 'yes' because they were forced (by dice), or because they meant it.
forced response
43
This protects respondents, as nobody knows the 'real' answer.
forced response
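Although no individual answer can be trusted, the dice rule still lets us back out the share of honest 'yes' answers in the whole sample. A minimal sketch, assuming a fair six-sided die, respondents who follow the instructions, and made-up survey counts:

```python
# Assumptions: fair die, full compliance with the dice rule, invented counts.
n_respondents = 300
n_yes = 130  # hypothetical number of observed 'yes' answers

p_forced_yes = 1 / 6  # die shows 6: respondent must say yes
p_truthful = 4 / 6    # die shows 2, 3, 4, or 5: respondent gives real opinion

# Observed share of yes = forced yes + (truthful share * true share of yes).
# Solving that equation for the true share of yes:
observed_yes_share = n_yes / n_respondents
true_yes_share = (observed_yes_share - p_forced_yes) / p_truthful

print("estimated true share of 'yes':", round(true_yes_share, 3))
```

The estimate applies to the sample as a whole; it still says nothing about any particular respondent.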
44
Item Count Design
It is based on an experimental design. A control group is shown a list of items and asked how many they agree with. A treatment group is shown the same list but with one additional item (the sensitive one) and asked how many they agree with.
45
For the people in the treatment group, we don't know who agrees with the sensitive item (that's the point!)
item count design
46
But we can figure out what proportion of our sample agrees with it.
item count design
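The proportion comes from a simple difference in means between the two groups. A sketch with invented item counts, assuming random assignment to the two lists and a sensitive item that does not change answers to the other items:

```python
# Hypothetical answers: how many items each respondent agrees with.
control_counts = [2, 1, 3, 2, 2, 1, 3, 2]    # list WITHOUT the sensitive item
treatment_counts = [2, 2, 3, 3, 2, 1, 4, 3]  # same list PLUS the sensitive item

mean_control = sum(control_counts) / len(control_counts)
mean_treatment = sum(treatment_counts) / len(treatment_counts)

# The extra item is the only difference between the lists, so the difference
# in average counts estimates the share agreeing with the sensitive item.
share_sensitive = mean_treatment - mean_control

print("estimated share agreeing with the sensitive item:", share_sensitive)
```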
47
Observational Study
Uses data generated in an environment not controlled by researchers. Observational studies are distinguished from experimental studies by the non-randomization of the treatment and are sometimes referred to as non-experimental studies.
48
Confounder
Is a variable that is associated with both the treatment and the outcome
49
Confounder Bias
Is the incorrect adjudication of causality because of the presence of a confounder.
50
Upward bias:
Confounder incorrectly leads us to conclude that treatment has a larger effect than it actually does
51
Downward bias:
Confounder incorrectly leads us to conclude that treatment has a smaller effect than it actually does
52
Statistical controls
Statistical procedures by which we adjust our estimators to account for the presence of confounders
53
Subclassification
we compare treatment and control groups among subgroups of the sample that are identical (or very similar) in the confounder variable
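A small sketch of subclassification with made-up data, where age group is the assumed confounder. Weighting each subgroup's DiM by its size is one common way to combine the subgroup comparisons; the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (confounder value, treated? 1/0, outcome)
rows = [
    ("young", 1, 8), ("young", 1, 6), ("young", 0, 5), ("young", 0, 5),
    ("old", 1, 4), ("old", 1, 4), ("old", 0, 1), ("old", 0, 2),
    ("old", 0, 3), ("old", 0, 2),
]

def mean(values):
    return sum(values) / len(values)

def diff_in_means(data):
    """Average outcome of the treated minus average outcome of the control."""
    treated = [y for _, t, y in data if t == 1]
    control = [y for _, t, y in data if t == 0]
    return mean(treated) - mean(control)

# Naive comparison that ignores the confounder
naive_dim = diff_in_means(rows)

# Subclassification: compare treated and control within each subgroup
groups = defaultdict(list)
for row in rows:
    groups[row[0]].append(row)

# Combine subgroup estimates, weighting each by its share of the sample
adjusted_dim = sum(len(g) / len(rows) * diff_in_means(g) for g in groups.values())

print("naive DiM:", naive_dim, "subclassified DiM:", adjusted_dim)
```

In this invented data the young are both more likely to be treated and have higher outcomes, so the naive DiM overstates the effect (an upward bias).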
54
Before and After design
Compares the outcome in the same unit(s) before and after the treatment was assigned.
55
Difference in Differences Design (DiD)
Compare the "before and after" of a treatment group to the "before and after" of a control group
56
Parallel Trends Assumption
Absent the treatment, the outcome in the treatment group would have changed by the same amount as it actually changed in the control group.
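A worked sketch of the arithmetic with invented group averages. It contrasts the before-and-after estimate with the difference-in-differences estimate, which is only valid if the parallel trends assumption holds.

```python
# Hypothetical average outcomes (illustrative numbers only)
treat_before, treat_after = 10.0, 16.0
control_before, control_after = 8.0, 11.0

# Before-and-after estimate: change in the treatment group alone
before_after = treat_after - treat_before

# Change in the control group: under parallel trends, this is what would
# have happened to the treatment group without the treatment
control_change = control_after - control_before

# Difference in differences: treatment-group change minus control-group change
did = before_after - control_change

print("before-and-after:", before_after, "DiD:", did)
```

The before-and-after estimate attributes the whole change to the treatment, while DiD subtracts the change that the control group experienced anyway.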
57
Census
Data with information about the entire population of interest
58
Sample
It is a subset of units (or cases), often a small proportion of the population
59
Representative Sample
It is a sample that accurately reflects the overall population. That is, it is similar to the population (in both observed and unobserved characteristics)
60
Simple Random Sampling
Randomly select units from a population, where each unit has the same probability of being chosen.
61
sampling frame
a list of everyone in the population
62
Stratified Sampling
Randomly select units from different strata, where each unit within a stratum has the same probability of being chosen. But units across strata may have different probabilities of being chosen.
63
how to do Stratified Sampling
1. Divide population into strata (e.g., states)
2. Randomly sample from each stratum
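The two steps can be sketched as follows, assuming a made-up population split into two strata and a fixed number of units drawn per stratum (the state names, unit labels, and sizes are all hypothetical).

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Step 1: divide the (hypothetical) population into strata, here by state
population = {
    "Texas": [f"TX-{i}" for i in range(1000)],
    "Vermont": [f"VT-{i}" for i in range(50)],
}
sample_sizes = {"Texas": 20, "Vermont": 20}  # assumed design choice

# Step 2: simple random sample (without replacement) within each stratum
sample = {
    stratum: rng.sample(units, sample_sizes[stratum])
    for stratum, units in population.items()
}

# Within a stratum every unit has the same chance of selection,
# but the chance differs across strata.
selection_prob = {
    stratum: sample_sizes[stratum] / len(units)
    for stratum, units in population.items()
}

print(selection_prob)
```

Drawing the same number from a small and a large stratum oversamples the small one, which is why stratified samples are usually analyzed with weights.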
64
Cluster Sampling
Randomly choose clusters, then sample all the units within those clusters
65
Multilevel Sampling
Randomly choose clusters, then randomly choose units within those clusters.
66
Quota Sampling
Establish fixed _______ respondents with some characteristics such that the sample characteristics are similar to those of the population
67
Convenience Sampling
Simply choosing individuals who are nearby or easy to access in some way. Online surveys are usually convenience samples
68
Sample Selection Bias
Making incorrect inferences about a population from a sample because the sample was not representative of the population
69
Unit non-response
The people we randomly selected don't pick up the phone
70
Item non-response
The people we randomly selected answer the phone but refuse to answer a particular question
71
Survey Weighting
Survey weighting allows us to correct the representativeness of a sample when we know exactly how and why it is not representative
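One simple version of this idea, sketched with invented data: if we know each group's share of the population, we can weight every respondent by population share divided by sample share. The groups, shares, and answers below are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical respondents: (group, answer), where answer 1 = supports policy
respondents = [
    ("woman", 1), ("woman", 1),
    ("man", 0), ("man", 0), ("man", 1), ("man", 0), ("man", 0), ("man", 1),
]
population_share = {"woman": 0.5, "man": 0.5}  # assumed to be known exactly

n = len(respondents)
group_counts = Counter(group for group, _ in respondents)
sample_share = {group: count / n for group, count in group_counts.items()}

# Under-represented groups get weights above 1, over-represented below 1
weight = {g: population_share[g] / sample_share[g] for g in sample_share}

unweighted_mean = sum(answer for _, answer in respondents) / n
weighted_mean = (
    sum(weight[group] * answer for group, answer in respondents)
    / sum(weight[group] for group, _ in respondents)
)

print("unweighted:", unweighted_mean, "weighted:", round(weighted_mean, 3))
```

Women make up a quarter of this sample but half of the assumed population, so their answers count double and the estimate of support rises.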
72
Location or Central Tendency
It refers to a "central" or "typical" value of a variable.
73
The median is less "sensitive" to
extreme values than the mean.
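A quick illustration with made-up incomes (in thousands): adding a single extreme value moves the mean a lot but barely moves the median.

```python
import statistics

incomes = [30, 35, 40, 45, 50]       # hypothetical values
with_outlier = incomes + [1000]      # add one extreme value

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("mean:", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```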
74
Quantiles
divide a set of observations into groups based on the magnitude of the variable
75
Spread
measures (or dispersion, scatter, or variability) capture how similar the values of a variable tend to be. When they are all rather similar, then the ___ is small; when they are rather dissimilar, the spread is large
76
Interquartile range (IQR):
the difference between the 3rd and 1st quartiles
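A short sketch using Python's standard library on made-up data. Several conventions exist for computing quartiles; the "inclusive" method is an arbitrary choice here, and other methods can give slightly different cut points.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical observations

# Quartiles split the sorted data into four groups of similar size
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# IQR: the 3rd quartile minus the 1st quartile
iqr = q3 - q1

print("Q1:", q1, "median:", q2, "Q3:", q3, "IQR:", iqr)
```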
77
Scatter Plot
A plot of two variables measured for the same set of units (e.g., voters), made by plotting the value of one variable against that of the other for each unit
78
Correlation
Measures the degree to which two variables are linearly associated with each other
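The word "linearly" matters. A minimal sketch that computes the Pearson correlation by hand on made-up data (the `correlation` helper is written here for illustration) shows a perfectly related but non-linear pair with a correlation of zero.

```python
import math

def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

x = [-2, -1, 0, 1, 2]
r_line_up = correlation(x, [2 * v + 1 for v in x])    # perfect upward line
r_line_down = correlation(x, [-3 * v for v in x])     # perfect downward line
r_curve = correlation(x, [v ** 2 for v in x])         # perfect U-shaped curve

print(r_line_up, r_line_down, r_curve)
```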
79
Coefficient of Determination: R²
It is a measure of model fit. It represents the proportion of the variation in the outcome variable that is explained by our regression
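A hand-computed sketch on five invented data points, assuming a simple one-variable least-squares regression. It computes the coefficient of determination as one minus the ratio of unexplained variation to total variation.

```python
x = [1, 2, 3, 4, 5]  # hypothetical independent variable
y = [2, 4, 5, 4, 5]  # hypothetical outcome

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares slope and intercept for y = intercept + slope * x
slope = (
    sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    / sum((a - mean_x) ** 2 for a in x)
)
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * a for a in x]

# Total variation in the outcome, and the part the regression leaves unexplained
total_variation = sum((b - mean_y) ** 2 for b in y)
unexplained = sum((b - p) ** 2 for b, p in zip(y, predicted))

r_squared = 1 - unexplained / total_variation

print("slope:", slope, "intercept:", intercept, "R^2:", r_squared)
```

In this example the regression explains 60% of the variation in the outcome.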
80
Regression Discontinuity Designs (RDD)
allow us to approximate experimental randomization of the treatment in observational studies
81
The Idea of RDD
Compare candidates that barely got elected to candidates that barely lost the election
82
R² tells us the
percentage of the variation in the outcome that our regression predicted
83
Overfitting:
By adding more variables to the regression, we can always make R² at least a little bit larger. Taken to an extreme, we might start 'predicting' even random things
84
The median is less "sensitive" to
extreme values
85
Confounders lead us to
incorrect conclusions about causality
86
Holding all other independent variables constant:
this means that we don't need to worry about those confounding variables that we have included in the regression, because we are controlling for those confounders
87
FWL Theorem (simplified)
The coefficient of any variable in a multivariate regression is the effect of that variable on the outcome, net of the effects that other independent variables may have on the outcome
88
Dummy Variables Definition
Also known as indicator variables, dichotomous variables, or binary variables, these are variables that take two values: zero and one.
89
Categorical Variables
Also known as multinomial variables, take on a finite number of values denoting that an observation belongs to one of several mutually exclusive categories.
90
Wide Format:
the information for each period belongs in a different column for each variable.
91
Long Format:
all time periods for a variable are in the same column, and there is a column indicating the time period.
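A small sketch of reshaping from wide to long in plain Python, assuming a made-up table with one `gdp_<year>` column per period (the column names and values are hypothetical).

```python
# Wide format: one row per unit, one column per period
wide_rows = [
    {"country": "A", "gdp_2000": 1.0, "gdp_2010": 1.5},
    {"country": "B", "gdp_2000": 2.0, "gdp_2010": 2.4},
]

# Long format: one row per unit-period, with a column for the period
long_rows = []
for row in wide_rows:
    for column, value in row.items():
        if column.startswith("gdp_"):
            year = int(column.split("_")[1])  # "gdp_2000" -> 2000
            long_rows.append({"country": row["country"], "year": year, "gdp": value})

for row in long_rows:
    print(row)
```

Two wide rows with two periods each become four long rows.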
92
We Can Use Regression. So What?
Using regression can make the code for DiM, DiD, and BA a little easier (although it is largely a matter of taste).
But mainly, it allows us to control for more confounders!
How? By simply adding them into the regression as additional control variables.
This means that we can improve upon what the basic estimators do in a noticeable way.
93
Sampling randomness:
the results would change if we had selected other observations into the sample
94
Unmeasured randomness:
there is inherent randomness in the data generated, even when the data comes from the entire population
95
Non-Sampling Errors
These types of errors occur when we do something wrong.
96
Law of Large Numbers (LLN)
It establishes that as our sample grows, the average from a sample will converge to the expected value (the average for the population).
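A simulation sketch, assuming rolls of a fair six-sided die (expected value 3.5) and a fixed random seed. The sample sizes are arbitrary choices for illustration.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible
expected_value = 3.5    # average of the faces 1 through 6

sample_means = {}
for n in (10, 1_000, 100_000):
    rolls = [rng.randint(1, 6) for _ in range(n)]
    sample_means[n] = sum(rolls) / n

for n, average in sample_means.items():
    print(f"n = {n:>7}: sample average = {average:.3f}")
```

The small sample can land far from 3.5, while the largest sample should be very close to it.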
97
Central Limit Theorem
It establishes that, as long as our sample is large enough, the probability distribution of estimates from the sample will follow a normal distribution centered around the true population value.
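A simulation sketch of this claim, assuming a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1) and arbitrary choices for the sample size and number of repeated samples. If the sample averages are roughly normal, about 95% of them should fall within 1.96 standard errors of the true mean.

```python
import math
import random
import statistics

rng = random.Random(7)    # fixed seed so the sketch is reproducible
n, repetitions = 100, 2000
true_mean, true_sd = 1.0, 1.0  # properties of the exponential(1) population

# Draw many samples and keep the average of each one
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(repetitions)
]

center = statistics.fmean(sample_means)
standard_error = true_sd / math.sqrt(n)

# Share of sample averages within 1.96 standard errors of the true mean
share_within = sum(
    abs(m - true_mean) <= 1.96 * standard_error for m in sample_means
) / repetitions

print("center of estimates:", round(center, 3))
print("share within 1.96 SE:", round(share_within, 3))
```

Even though individual draws are far from normal, the distribution of the sample averages is centered near the true value and behaves approximately like a normal distribution.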
98
Parameter
A quantity of interest from the population that we would like to know
99
Estimator
A quantity we compute from a sample to estimate the parameter of interest. Estimators are random variables: if we change the sample, the value of the estimator will be different.
100
What is the difference between PATE and SATE?
PATE is the average difference between the potential outcome under treatment and the potential outcome under control across the entire population. SATE is the same average, but computed only over the units in the sample.