gov 300 final

Description and Tags

Government

2nd

Last updated 6:12 PM on 12/4/22

130 Terms

New cards

What is Causality

Causality is the connection of phenomena through which one thing or event (the cause) under certain conditions gives rise to, causes, something else (the effect).

New cards

Factual:

Something that we have observed in reality

New cards

Counterfactual:

What we would have observed if some key condition were different, but everything else remained the same

New cards

Causality & Counterfactuals

We can say X causes Y when the factual and the counterfactual differ

New cards

Ceteris Paribus

We can only establish causality when the only difference between the factual and the counterfactual is the key condition X, and nothing else.

New cards

Fundamental Problem of Causal Inference

We observe what happens or what is,
but not what would have happened or would have been.
We do not know the counterfactual outcome.

New cards

Potential Outcomes Framework

It is a formal way of thinking about counterfactuals and causality. We imagine that people have multiple potential outcomes, only one of which (the factual) is observed.

New cards

Causal Effect (or Treatment Effect)

The effect for individual i is the difference between the two potential outcomes:
Yi(1) − Yi(0)

New cards

SATE

The Sample Average Treatment Effect: the average of the individual treatment effects across the units in the sample
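A minimal sketch of the individual effect and the SATE. The numbers are made up for illustration, and it assumes we could see both potential outcomes for every unit, which real data never allows.

```python
# Hypothetical potential outcomes for five units: (Y_i(1), Y_i(0)).
# In real data only one of the two is ever observed per unit.
potential_outcomes = [(5, 3), (4, 4), (7, 2), (3, 4), (6, 2)]

# Individual treatment effect: Y_i(1) - Y_i(0)
individual_effects = [y1 - y0 for y1, y0 in potential_outcomes]

# SATE: the average of the individual effects in the sample
sate = sum(individual_effects) / len(individual_effects)

print(individual_effects)
print(sate)
```

Here one unit has a negative effect and one has no effect, yet the SATE is positive: the SATE describes the average, not every individual.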

New cards

A positive SATE tells us that

on average, our treatment has a positive effect on the outcome.

New cards

A negative SATE tells us that

on average, our treatment has a negative effect on the outcome.

New cards

When the SATE is zero that means that

our treatment has no causal effect, on average, on the outcome

New cards

How do we make the two groups very similar to each other (ideally identical)?

randomization

New cards

How do we make sure that the treatment group and the control group are very similar to each other in every respect other than the treatment?

We randomly assign the treatment

New cards

Why randomize?

Randomization makes the two groups very similar to each other.
Provided the sample is not too small, there will be no tendency for one group to have too many or too few of any one 'type' of unit.
It makes the treatment and control groups, on average, very similar.

New cards

DiM is the

average Y of the treated − average Y of the control

New cards

With randomization of treatment

We can interpret DiM as an estimate of the SATE
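A rough simulation of why this works. Everything here is assumed for illustration: the units are simulated, the true effect is set to exactly 2 for everyone, and the seed is fixed only so the sketch is reproducible.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical units: y0 is the outcome under control,
# y1 is the outcome under treatment (true effect = 2 for everyone).
units = [{"y0": random.gauss(10, 1)} for _ in range(1000)]
for u in units:
    u["y1"] = u["y0"] + 2.0

# Randomly assign half of the units to treatment.
treated_ids = set(random.sample(range(len(units)), k=len(units) // 2))

# We only observe the potential outcome that matches the assignment.
treated_y = [u["y1"] for i, u in enumerate(units) if i in treated_ids]
control_y = [u["y0"] for i, u in enumerate(units) if i not in treated_ids]

# DiM: average Y of the treated minus average Y of the control
dim = mean(treated_y) - mean(control_y)
print(round(dim, 2))
```

The DiM should land close to the true effect of 2, even though no unit reveals both of its potential outcomes.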

New cards

Intuition

When we randomize the treatment, we make the treatment and control groups very similar to each other. Therefore, each group, as a whole, serves as an inferred counterfactual for the other group

New cards

Experimental Compliance

Compliance in an experiment occurs when the subjects receive the treatment condition to which they are assigned by the researcher.

New cards

Placebo Effect

An effect produced by a placebo intervention that cannot be attributed to the intervention itself

New cards

Placebo treatment:

give those in the control group a "fake" treatment that is not designed to produce any effect itself

New cards

Hawthorne Effect

A phenomenon where the subjects of a study behave differently just because they know they're being studied

New cards

Solutions to Hawthorne Effect

Not let people know they're being studied (controversial)

New cards

Create a baseline comparison in which people know they're under study, but don't receive any treatment (effectiveness debatable)

New cards

Internal Validity

The extent to which causal assumptions are satisfied in the study

New cards

Can we really say that our treatment (or independent variable) is the cause of our outcome?

New cards

External Validity

The extent to which the conclusions can be generalized beyond a particular study

New cards

Is the study realistic? Would the conclusions apply in real life? Do they apply beyond the sample in the study?

external validity

New cards

Things that reduce internal validity:

Experimental non-compliance
Lack of randomization (or failed randomization)
Placebo effect
Hawthorne effect

New cards

Things that reduce external validity:

Using a sample that contains very specific types of people (representativeness)
Hawthorne effect
Unrealistic experimental environment
Unrealistic treatment

New cards

Three types of experiments:

Lab experiments: high internal validity, low external validity. Sample usually not representative; unrealistic environment or intervention

New cards

Field experiments:

Medium-high internal validity, medium-high external validity. Less control, but more realistic environment and intervention

New cards

Natural experiments:

Medium internal validity, medium-high external validity. No control over the experiment; realistic environment and realistic intervention, but could be too particular

New cards

Human Subject

A living individual about whom an investigator conducting research obtains:
- Data through intervention or interaction with the individual.
- Identifiable private information about the subject, including a subject's opinions.

New cards

Research

Systematic investigation designed to develop or contribute to generalizable knowledge.

New cards

Human subjects research ethics is governed by:

Informed Consent
Avoidance of harm
Avoidance of deception
Privacy protection
Anonymity and confidentiality

New cards

Informed Consent

Procedure for ensuring that research participants understand what is being done to them and the limits to their participation, and are aware of any potential risks they incur.

New cards

Privacy

Individual's right to control the disclosure of what they deem personal or non-public information about themselves.

New cards

Anonymity

It refers to the elimination or strong protection of any identifiable information about the individual, organization, or place in which research took place.

New cards

Confidentiality

Concerned with making sure that the information subjects provide is not shared with third parties.

New cards

Forced response

In this question, I want you to answer yes or no. But I want you to consider the number on your die throw. If 1 shows on the die, tell me no. If 6 shows on the die, tell me yes. But if another number, like 2, 3, 4, or 5, shows, tell me your opinion about the question that I will ask you after you throw the die. [Respondent throws die (researcher cannot see). Researcher asks question. Respondent answers.]

New cards

Researcher does not know whether respondent said 'yes' because they were forced (by dice), or because they meant it.

forced response

New cards

This protects respondents, as nobody knows the 'real' answer.

forced response
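Although no single answer can be trusted, the share of honest 'yes' answers can still be backed out in aggregate. The sketch below assumes a fair six-sided die and the rules from the script above (1 forces 'no', 6 forces 'yes', 2 to 5 give the honest answer); the function name and the survey counts are hypothetical.

```python
def forced_response_estimate(yes_share, p_forced_yes=1 / 6, p_truthful=4 / 6):
    """Back out the share of honest 'yes' answers from the observed 'yes' share.

    Observed share = p_forced_yes + p_truthful * true_share,
    so true_share = (observed share - p_forced_yes) / p_truthful.
    """
    return (yes_share - p_forced_yes) / p_truthful

# Hypothetical survey: 300 of 600 respondents said 'yes'.
estimate = forced_response_estimate(300 / 600)
print(round(estimate, 3))
```

With half of all respondents saying 'yes', the estimated honest share is also one half, because the forced 'yes' and forced 'no' answers are equally likely and cancel out.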

New cards

Item Count Design

It is based on an experimental design. A control group is shown a list of items and asked how many they agree with. A treatment group is shown the same list but with one additional item (the sensitive one) and asked how many they agree with.

New cards

For the people in the treatment group, we don't know who agrees with the sensitive item (that's the point!)

item count design

New cards

But we can figure out what proportion of our sample agrees with it.

item count design
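Because the two lists differ only by the sensitive item, the difference in average counts estimates the proportion agreeing with it. A small sketch with made-up counts (a 3-item control list and a 4-item treatment list are assumed):

```python
from statistics import mean

# Hypothetical answers: how many items each respondent agrees with.
control_counts = [1, 2, 2, 3, 1, 2, 3, 2]    # list without the sensitive item
treatment_counts = [2, 2, 3, 3, 1, 3, 3, 3]  # same list plus the sensitive item

# Difference in means = estimated share agreeing with the sensitive item
estimated_share = mean(treatment_counts) - mean(control_counts)
print(estimated_share)
```

This is just the DiM estimator applied to the item counts, so it relies on random assignment to the two lists.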

New cards

Observational Study

Use data generated in an environment not controlled by researchers. They are distinguished from experimental studies by the non-randomization of the treatment and are sometimes referred to as non-experimental studies.

New cards

Confounder

Is a variable that is associated with both the treatment and the outcome

New cards

Confounder Bias

Is the incorrect adjudication of causality because of the presence of a confounder.

New cards

Upward bias:

Confounder incorrectly leads us to conclude that treatment has a larger effect than it actually does

New cards

Downward bias:

Confounder incorrectly leads us to conclude that treatment has a smaller effect than it actually does

New cards

Statistical controls

Statistical procedures by which we adjust our estimators to account for the presence of confounders

New cards

Subclassification

we compare treatment and control groups among subgroups of the sample that are identical (or very similar) in the confounder variable
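A sketch of subclassification on invented data. The confounder here is an assumed age group: young people are both more likely to be treated and have higher outcomes, which biases the naive comparison upward.

```python
from statistics import mean

# Hypothetical observational data: (confounder_group, treated, outcome)
data = [
    ("young", 1, 8), ("young", 1, 10), ("young", 0, 7), ("young", 0, 7),
    ("old", 1, 5), ("old", 0, 3), ("old", 0, 4), ("old", 0, 2),
]

def dim(rows):
    """Difference in means between treated and control rows."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)

# Naive comparison that ignores the confounder
naive = dim(data)

# Split the sample by the confounder and compare within each subgroup
by_group = {}
for row in data:
    by_group.setdefault(row[0], []).append(row)
within = {group: dim(rows) for group, rows in by_group.items()}

# Combine the subgroup estimates, weighting by subgroup size
adjusted = sum(dim(rows) * len(rows) for rows in by_group.values()) / len(data)

print(round(naive, 2), within, adjusted)
```

The naive DiM is about 3.07, while the effect within each subgroup is 2, so the confounder produced an upward bias in this made-up example.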

New cards

Before and After design

Compares the outcome in the same unit(s) before and after the treatment was assigned.

New cards

Difference in Differences Design (DiD)

Compare the "before and after" of a treatment group to the "before and after" of a control group

New cards

Parallel Trends Assumption

Absent the treatment, the outcome in the treatment group would have changed by the same amount as the outcome in the control group actually changed.
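The arithmetic behind the before-and-after and DiD estimators, using made-up group averages. The control group's change stands in for what would have happened to the treatment group without treatment, which is exactly the parallel trends assumption.

```python
# Hypothetical average outcomes for each group and period
treat_before, treat_after = 10.0, 16.0
ctrl_before, ctrl_after = 9.0, 12.0

# Before-and-after estimate: change in the treatment group only
before_after = treat_after - treat_before

# Change in the control group (the assumed counterfactual trend)
control_trend = ctrl_after - ctrl_before

# DiD: the treatment group's change minus the control group's change
did = before_after - control_trend

print(before_after, control_trend, did)
```

In this example the before-and-after estimate (6) overstates the effect, because half of that change (3) also happened in the control group.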

New cards

Census

Data with information about the entire population of interest

New cards

Sample

It is a subset of units (or cases), often a small proportion of the population

New cards

Representative Sample

It is a sample that accurately reflects the overall population. That is, it is similar to the population (in both observed and unobserved characteristics)

New cards

Simple Random Sampling

Randomly select units from a population, where each unit has the same probability of being chosen.

New cards

sampling frame

a list of everyone in the population

New cards

Stratified Sampling

Randomly select units from different strata, where each unit within a stratum has the same probability of being chosen. But units across strata may have different probabilities of being chosen.

New cards

how to do Stratified Sampling

1. Divide the population into strata (e.g., states)
2. Randomly sample from each stratum
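The two steps can be sketched as follows. The strata, unit labels, and sample size per stratum are all invented; the point is that units in the same stratum share a selection probability, while probabilities differ across strata.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Step 1: a hypothetical population already divided into strata (states)
population = {
    "TX": [f"TX-{i}" for i in range(100)],
    "VT": [f"VT-{i}" for i in range(20)],
}

# Step 2: draw a simple random sample of the same size from each stratum
per_stratum = 5
sample = {
    state: random.sample(units, k=per_stratum)
    for state, units in population.items()
}

# Probability of being chosen, which differs across strata
inclusion_prob = {
    state: per_stratum / len(units) for state, units in population.items()
}
print(inclusion_prob)
```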

New cards

Cluster Sampling

Randomly choose clusters, then sample all the units within those clusters

New cards

Multilevel Sampling

Randomly choose clusters, then randomly choose units within those clusters.

New cards

Quota Sampling

Establish fixed quotas of respondents with some characteristics such that the sample characteristics are similar to those of the population

New cards

Convenience Sampling

Simply choosing individuals who are nearby or easy to access in some way. Online surveys are usually convenience samples

New cards

Sample Selection Bias

Making incorrect inferences about a population from a sample because the sample was not representative of the population

New cards

Unit non-response

The people we randomly selected don't pick up the phone

New cards

Item non-response

The people we randomly selected answer the phone but refuse to answer a particular question

New cards

Survey Weighting

Survey weighting allows us to correct the representativeness of a sample when we know exactly how and why it is not representative
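One simple form of weighting, sketched with invented numbers: each respondent is weighted by the ratio of their group's population share to its sample share. It assumes we know the population is half women while the sample is only a quarter women.

```python
# Hypothetical sample: (group, supports the policy: 1 = yes, 0 = no)
sample = [("woman", 1), ("man", 0), ("man", 0), ("man", 1)]

# Known population shares (assumed for this example)
population_share = {"woman": 0.5, "man": 0.5}

n = len(sample)
sample_share = {
    group: sum(1 for g, _ in sample if g == group) / n
    for group in population_share
}

# Weight = population share / sample share for the respondent's group
weights = [population_share[g] / sample_share[g] for g, _ in sample]

unweighted = sum(y for _, y in sample) / n
weighted = sum(w * y for w, (_, y) in zip(weights, sample)) / sum(weights)

print(unweighted, round(weighted, 3))
```

Women are under-represented and more supportive in this example, so the weighted estimate (about 0.667) is higher than the unweighted one (0.5).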

New cards

Location or Central Tendency

It refers to a "central" or "typical" value of a variable.

New cards

The median is less "sensitive" to

extreme values than the mean
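A quick illustration with made-up incomes: adding one extreme value moves the mean a lot and the median very little.

```python
from statistics import mean, median

# Hypothetical incomes (in thousands)
incomes = [30, 35, 40, 45, 50]
with_outlier = incomes + [1000]  # add one extreme value

print(mean(incomes), median(incomes))
print(mean(with_outlier), median(with_outlier))
```

The mean jumps from 40 to 200, while the median only moves from 40 to 42.5.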

New cards

Quantiles

divide a set of observations into groups based on the magnitude of the variable

New cards

Spread

Measures of spread (or dispersion, scatter, or variability) capture how similar the values of a variable tend to be. When they are all rather similar, the spread is small; when they are rather dissimilar, the spread is large

New cards

Interquartile range (IQR):

the difference between the 3rd and 1st quartiles
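A small sketch using Python's `statistics.quantiles` with its default method. Note that software packages use slightly different conventions for computing quartiles, so other tools may give slightly different cut points for the same data.

```python
from statistics import quantiles

values = [1, 2, 3, 4, 5, 6, 7]  # made-up data

# n=4 splits the data into four groups and returns the three quartiles
q1, q2, q3 = quantiles(values, n=4)

# IQR: 3rd quartile minus 1st quartile
iqr = q3 - q1
print(q1, q2, q3, iqr)
```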

New cards

Scatter Plot

A plot of two variables measured for the same set of units (e.g., voters), made by plotting the value of one variable against that of the other for each unit

New cards

Correlation

Measures the degree to which two variables are linearly associated with each other
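A hand-rolled sketch of the Pearson correlation coefficient on made-up data. The helper name `pearson` is hypothetical. The last example stresses the word "linearly": the two variables are perfectly related, but not along a line, so the correlation is zero.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

r_up = pearson([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])        # perfect positive line
r_down = pearson([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])      # perfect negative line
r_curved = pearson([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])   # y = x squared

print(r_up, r_down, r_curved)
```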

New cards

Coefficient of Determination: R²

It is a measure of model fit. It represents the proportion of the variation in the outcome variable that is explained by our regression
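A sketch of the calculation for a one-variable regression fitted by hand on made-up data. It computes the share of variation explained as 1 minus the ratio of unexplained (residual) variation to total variation.

```python
# Made-up data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)

# Least-squares slope and intercept for y = intercept + slope * x
slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
    (a - mean_x) ** 2 for a in x
)
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * a for a in x]

# Total variation in y and the part the regression leaves unexplained
total_ss = sum((b - mean_y) ** 2 for b in y)
residual_ss = sum((b - p) ** 2 for b, p in zip(y, predicted))

r_squared = 1 - residual_ss / total_ss
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))
```

Here the regression explains about 60% of the variation in the outcome.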

New cards

Regression Discontinuity Designs (RDD)

allow us to approximate experimental randomization of the treatment in observational studies

New cards

The Idea of RDD

Compare candidates that barely got elected to candidates that barely lost the election

New cards

R² tells us the

percentage of the variation in the outcome that our regression predicted

New cards

Overfitting:

By adding more variables to the regression, we can always make R² at least a little bit larger. Taken to an extreme, we might start 'predicting' even random things

New cards

The median is less "sensitive" to

extreme values

New cards

Confounders lead us to

incorrect conclusions about causality

New cards

Holding all other independent variables constant:

this means that we don't need to worry about those confounding variables that we have included in the regression, because we are controlling for those confounders

New cards

FWL Theorem (simplified)

The coefficient of any variable in a multivariate regression is the effect of that variable on the outcome, net of the effects that other independent variables may have on the outcome

New cards

Dummy Variables Definition

Also known as indicator variables, dichotomous variables, or binary variables, these are variables that take two values: zero and one.

New cards

Categorical Variables

Also known as multinomial variables, take on a finite number of values denoting that an observation belongs to one of several mutually exclusive categories.

New cards

Wide Format:

the information for each period belongs in a different column for each variable.

New cards

Long Format:

all time periods for a variable are in the same column, and there is a column indicating the time period.
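A sketch of reshaping from wide to long in plain Python. The data and the column names (`gdp_2000`, `gdp_2001`) are invented for illustration.

```python
# Wide format: one row per country, one column per period
wide = [
    {"country": "A", "gdp_2000": 10, "gdp_2001": 12},
    {"country": "B", "gdp_2000": 20, "gdp_2001": 21},
]

# Long format: one row per country and period, with a column for the period
long = []
for row in wide:
    for column, value in row.items():
        if column.startswith("gdp_"):
            year = int(column.split("_")[1])
            long.append({"country": row["country"], "year": year, "gdp": value})

for record in long:
    print(record)
```

Two rows with two period columns each become four rows, one per country and year.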

New cards

We Can Use Regression. So What?

Using regression can make the code for DiM, DiD, and BA a little easier (although it is largely a matter of taste).
But mainly, it allows us to control for more confounders!
How? By simply adding them into the regression as additional control variables.
This means that we can improve upon what the basic estimators do in a noticeable way
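A sketch of the first point on made-up data: regressing the outcome on a treatment dummy gives a slope equal to the DiM and an intercept equal to the control group's average. The regression is fitted by hand here to keep the example self-contained; adding control variables would need a multivariate regression routine, which this sketch does not cover.

```python
# Made-up data: treatment dummy (1 = treated, 0 = control) and outcome
treated = [1, 1, 1, 0, 0, 0]
outcome = [7, 9, 8, 5, 6, 4]

# DiM computed directly
treated_y = [y for t, y in zip(treated, outcome) if t == 1]
control_y = [y for t, y in zip(treated, outcome) if t == 0]
dim = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)

# Least-squares fit of outcome = intercept + slope * treated
mean_t = sum(treated) / len(treated)
mean_y = sum(outcome) / len(outcome)
slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(treated, outcome)) / sum(
    (t - mean_t) ** 2 for t in treated
)
intercept = mean_y - slope * mean_t

print(dim, slope, intercept)
```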

New cards

Sampling randomness:

the results would change if we selected other observations for the sample

New cards

Unmeasured randomness:

there is inherent randomness in the data generated, even when the data comes from the entire population

New cards

Non-Sampling Errors

These types of errors occur when we do something wrong.

New cards

Law of Large Numbers (LLN)

It establishes that as our sample grows, the average from a sample will converge to the expected value (the average for the population).
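A simulation sketch of the LLN using simulated rolls of a fair die, whose expected value is 3.5. The seed and sample sizes are arbitrary choices for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Simulate many rolls of a fair six-sided die (expected value 3.5)
rolls = [rng.randint(1, 6) for _ in range(100_000)]

# Average of the first n rolls, for increasingly large n
running = {n: sum(rolls[:n]) / n for n in (10, 1_000, 100_000)}

for n, average in running.items():
    print(n, round(average, 3))
```

The average from the largest sample should sit very close to 3.5, while the average of only 10 rolls can easily be off by a noticeable amount.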

New cards

Central Limit Theorem

It establishes that, as long as our sample is large enough, the probability distribution of estimates from the sample will follow a normal distribution centered around the true population value.
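A simulation sketch of the CLT. It draws many samples from a clearly non-normal (exponential, mean 1) distribution and looks at how the sample means are distributed. The sample size of 50, the 2,000 repetitions, and the seed are arbitrary assumptions.

```python
import random
from statistics import mean, stdev

rng = random.Random(7)  # fixed seed so the sketch is reproducible

n = 50       # size of each sample
reps = 2000  # number of samples drawn

# Each estimate is the mean of one sample from a skewed distribution
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)
]

center = mean(sample_means)   # should be near the true mean, 1.0
spread = stdev(sample_means)  # should be near 1 / sqrt(n)

# For a normal distribution, about 68% of values fall within one
# standard deviation of the center
share_within_1sd = mean(abs(m - center) <= spread for m in sample_means)

print(round(center, 3), round(spread, 3), round(share_within_1sd, 3))
```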

New cards

Parameter

A quantity of interest from the population that we would like to know

New cards

Estimator

A quantity we compute from a sample to estimate the parameter of interest. Estimators are random variables: if we change the sample, the value of the estimator will be different

New cards

what is the difference between PATE and SATE

PATE is the average difference between the potential outcome under treatment and the potential outcome under control across the entire population, whereas SATE is the same average computed only over the units in the sample