what are causal claims
appear in support of prescriptive claims (what you should do) → consequences of actions
causes of effects
an attempt to explain what has happened
effects of causes
what happens if we do something — focused on consequences
counterfactuals
how the world would be if events transpired differently
potential outcome
the outcome y for a specific case (i); x is the variable for the suspected cause
potential outcome notation
Yi(x) — the value of y that case i would take if exposed to value x of the cause
causal variable x, affected variable y
what is the key takeaway for counterfactuals
x causes y if, in an otherwise identical universe, a different x would produce a different y. if we cannot make that comparison, confounding is possible
what are the two ways of making causal claims
causes of effects
effects of causes
what are causes of effects also called
deterministic
what are effects of causes also called
probabilistic
deterministic causal claims
what happened with certainty under specific conditions
when the cause is present, the effect ALWAYS happens; when the cause is absent, the effect never happens
probabilistic causal claims
the cause increases or decreases the effect ON AVERAGE
the effect can happen when the cause is absent, and the effect may not even happen when the cause is present
effects of causes
what type of causal claims are we more focused upon
effects of causes; easier to address than causes of effects
necessary conditions
a cause must happen for an effect. does not mean that if the cause is present, the effect MUST happen.
sufficient
cause always produces the effect when present. every time the cause is present, the effect WILL happen
complex causality
multiple factors may be necessary — conjunctural causality — or different causes produce the same effect — multiple causality.
what do causal claims imply
a relationship between potential outcomes
we can never know the counterfactuals for certain; we are relying on our imagination
connection to fundamental problem of causal inference
we say x causes y only if y would differ when x changes.
but in reality, we only observe the outcome that actually happened (the y, or potential outcome) with the x attached to it.
we cannot see what would have happened under a different x, because that different x did NOT happen.
we only ever observe one realized outcome per case, which is the fundamental problem of causal inference
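A minimal sketch of the fundamental problem, using made-up potential outcomes (the y1/y0 values here are hypothetical; in real data the counterfactual column simply doesn't exist):

```python
# Fundamental problem of causal inference: for each case i we only ever
# observe Yi(x) for the value of x that actually occurred.
# Hypothetical potential outcomes (y1 = outcome if treated, y0 = if untreated).
cases = [
    {"i": 1, "x": 1, "y1": 8, "y0": 5},  # treated: we see y1 = 8, never y0
    {"i": 2, "x": 0, "y1": 7, "y0": 6},  # untreated: we see y0 = 6, never y1
    {"i": 3, "x": 1, "y1": 4, "y0": 4},
]

for c in cases:
    observed = c["y1"] if c["x"] == 1 else c["y0"]
    missing = "y0" if c["x"] == 1 else "y1"
    print(f"case {c['i']}: observed y = {observed}, counterfactual {missing} unobservable")
```

Only one of the two columns is ever visible per case, so the individual causal effect y1 − y0 can never be computed directly.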
independent variable
alleged cause in causal claim - x
dependent variable
alleged outcome in causal claim - y
potential outcome
values of dependent variable (y) a case would take if exposed to a different independent variable (x)
selection on dependent variable (y)
only look at cases based on the outcome (y) they had.
you cannot compare outcomes to non-outcomes, so you cannot determine what actually caused y.
this fails even weak severity.
texas sharpshooter fallacy (type of selection on dependent variable)
seeing patterns in random data and pretending they’re meaningful
causality = counterfactual
outcome changes when independent (x) variable changes. when you select on dependent variable, you do not observe the outcome under different exposure to cause
observing the outcome is dependent on
different levels (values) of the independent variable (x).
what causes the fundamental problem of causal inference
because we cannot see the counterfactual…
how does one solve this FPCI
replace the missing counterfactual
compare observed outcomes of y for cases that factually have different values of the cause x
assume the actual potential outcome of one case equals the counterfactual potential outcome of another case
the observed association between x and y ends up being the correlation
correlation factors to look at
direction, strength, magnitude
direction
+, -
strength
move together a lot vs. little
magnitude
how much y changes per change in x (the slope)
what’s the scale for correlation
-1 → 1
values closer to -1 or 1 indicate a stronger linear association
what does 0 mean for linear association for correlation
little to no linear association
key note about value of correlation (-1,1)
value of correlation doesn’t tell us about the magnitude (slope)
negative correlation
< 0, values of x and y move in the opposite direction
positive correlation
> 0, values of x and y move in the same direction
weak correlation
values for x and y don’t cluster strongly along a line
strong correlation
values for x and y cluster strongly along a line
what are the two types of causal problems
chance (random association) and bias (confounding)
random association
the correlation occurs by chance; there is no systematic relationship
bias (aka confounding)
x and y are correlated, but the correlation does not result from a causal relationship
how can we address random association
for one, we cannot rule it out entirely
we can figure out how likely the correlation occurred by chance
if it’s unlikely, we can set aside this concern
solving random association (steps)
compute the correlation of x and y
assess its strength
note the number of cases
compute the probability that a correlation this strong would occur by chance
this process for solving random association only works IF
we correctly describe chance processes
don’t misuse statistics
what is statistical significance
how likely the correlation we observed could have happened purely by chance
what is the relationship between statistical significance and likelihood of happening by chance?
higher statistical significance means a lower likelihood that the correlation happened by chance
p value
the probability of observing a correlation this strong, assuming the true correlation between x and y is zero
scale of p value
0-1
relationship between p value and ss
decreased p value, increased ss
what is the threshold for ss/p value
when p is below 0.05, the threshold for statistical significance has been reached
how do we know this p value threshold can be trusted
only if we don't abuse the tests. if we use the correlation as evidence for a claim, we must consider the probability of accepting the claim in error
p hacking
testing many correlations and reporting only the ones that come out significant (p below 0.05). with enough tests, some will cross the threshold purely by chance, so the reported p values are misleading
confounding
when the observed correlation between x and y is systematic, not chance: if we looked at more data, the relationship would persist, yet it does not result from a causal relationship
why does confounding exist?
when cases with different levels of x have systematic differences (factual vs. counterfactual) in the potential outcomes of y. for example, cases may have different baseline outcomes (selection bias) or respond differently to the cause x (heterogeneity bias)
other differences between cases that causally affect x and y…
causal graphs
we never truly know the true causal graphs.
they are used to map out the causal relationships between variables.
dots (nodes) represent the variables; arrows represent the direction of causation.
what’s one thing to note about causal graphs
doesn’t indicate if x increases or decreases y
how do we know there is confounding
if some variable w has a causal path of ANY length to both x and y
equivalently, if there is a backdoor (non-causal) path connecting x and y
when correlation suffers from the two sources of error (random and confounding), what do we usually do
plug in the missing counterfactual
which variables DON’T produce confounding
antecedent
intervening
reverse causality
antecedent
affects x; the causal path from w to y passes through x. antecedent variables create confounding only if there is another path from w to y that does NOT pass through x.
intervening variable
affects y and is affected by x. these variables do not produce confounding b/c they are on a causal path from x to y
reverse causality
the alleged cause x is actually caused by the alleged outcome y (the causal arrow runs from y to x, not x to y)
special case of bias/confounding
what happens when the true causal effect direction is unknown…
if the bias is upward and the correlation is positive, the correlation may overstate, or entirely manufacture, the true effect. this fails weak severity
if the bias is downwards and the correlation is positive, what happens to the true causal effect?
the observed correlation understates the true causal effect. severity (weak or strong) depends on the size of the bias, not its direction.
measurement bias + weak severity
small measurement bias → minor violation of the assumption → assumption fails weakly (weak severity).
what is that product of signs thing
the product of the signs on the causal paths w → x and w → y gives the direction of the confounding bias
sources of confounding
cases select themselves into being exposed to a cause; these cases are already different from those that do not.
solution to confounding step by step (brief)
make comparisons, state assumptions, and evaluate the trade-offs
prime solution to confounding
experimentation
tell me about experimentation as a solution to confounding
it estimates what would happen on average if we made everyone do ____.
random sampling allows us to use the sample to make inferences about the population.
evaluate the correlation between x and y for cases where the level of x is assigned AT RANDOM.
random assignment to treatment
all cases have an equal probability of being assigned to each condition or exposure to X.
exclusion restriction
the assumption that only x is changing between conditions
it forces us to consider experimental design carefully
why are experiments such a good idea
they let us calculate the probability of chance correlations and obtain an unbiased estimate of the average causal effect of x on y.
randomness ensures cases in treatment and control have similar potential outcomes on average, and the averages in both groups are observable.
randomness also balances cases with similar values of any confounding variable w across treatment and control.
this breaks the backdoor path between w and x (removing all confounding)
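A simulation of why random assignment works (effect sizes and the self-selection rule are invented for illustration): when cases with high w self-select into treatment, the naive comparison is badly biased; a coin-flip assignment breaks the w → x link and recovers the true effect of 2:

```python
# Random assignment breaks the backdoor path from confounder w to x.
import random
import statistics

random.seed(4)
n = 5000
TRUE_EFFECT = 2.0
w = [random.gauss(0, 1) for _ in range(n)]

# Self-selection: cases with higher w are more likely to take the treatment
x_selected = [1 if wi + random.gauss(0, 1) > 0 else 0 for wi in w]
# Random assignment: a coin flip, independent of w
x_random = [random.randint(0, 1) for _ in range(n)]

def estimate(xs):
    # w also raises y, so it confounds any comparison where w predicts x
    ys = [TRUE_EFFECT * xi + 3 * wi + random.gauss(0, 1) for xi, wi in zip(xs, w)]
    treated = [yi for yi, xi in zip(ys, xs) if xi == 1]
    control = [yi for yi, xi in zip(ys, xs) if xi == 0]
    return statistics.mean(treated) - statistics.mean(control)

est_selected = estimate(x_selected)
est_random = estimate(x_random)
print(f"self-selected estimate: {est_selected:.2f}")  # biased well above 2
print(f"randomized estimate:    {est_random:.2f}")    # close to the true 2
```

The self-selected comparison confuses the effect of x with the effect of w; randomization balances w across the two groups, so the difference in means is unbiased.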
downside to experiments
we cannot always just whip up an experiment.
all solutions to confounding are basically a trade off between internal and external validity.
internal validity
extent to which the correlation of x and y in a research design actually shows the true causal effect.
external validity
the degree to which the causal relationship found in a study generalizes to the cases the causal claim is about.
what does random sampling allow us to do
draw an unbiased inference about the population from the sample
what do experiments do
examines the correlation between x and y for cases where level of x is assigned at random
then compare outcomes for cases with higher vs. lower values of x, but only when x is assigned at random
experiments in detail
unbiased correlation
random assignment to treatment (x = yes) vs. control (x = no); all cases have an equal probability of being exposed to x
exclusion restriction
the only thing changing is x; this is a matter of experimental design
how do experiments solve confounding
randomization ensures cases in treatment and control have similar potential outcomes on average
randomization balances cases with similar values of confounding w in treatment and control. it breaks the link between w and x
removes ALL confounding
how does the randomness solve the missing counterfactual
cases in treatment and control have the same potential outcomes on average, so the control group's average stands in for the treated group's missing counterfactual and vice versa
how are experiments the best solution for confounding and the fpci
strong severity: the evidence is convincing to the extent that the random assignment can be checked
external validity concern
external validity is good only if the causal variable in the experiment maps onto the cause defined in the causal claim.
relationship between internal validity and external validity
increased IV, decreased EV
experiments have limited ___ validity
external validity
conditioning (and when is it possible)
isolates the true relationship between treatment and outcome by blocking the backdoor path (confounding).
conditioning is possible for any case and any possible cause, which is why it has higher external validity than experiments.
how does conditioning solve confounding
compare cases with the same value of the confounding variable w, holding w constant. w can no longer affect x or y because it no longer varies, so the backdoor path between x and y is blocked
under what circumstances is conditioning possible
for any case and any possible cause x
it has greater external validity than experiments because there is more freedom in which cases and causes can be studied
characteristics of conditioning
solves confounding by holding the confounding constant
has an increased external validity in comparison to experiments
assumptions of conditioning
there are no other confounding variables that are NOT conditioned upon. this can never be fully proven, because we never know the full causal graph.
assumes NO MEASUREMENT ERROR in x and y; otherwise you are not comparing like with like.
requires finding cases that are the same on the confounder w but different in x
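A sketch of conditioning with simulated data (the binary confounder and all coefficients are invented): compare y across x only within each level of w, then average the within-stratum differences:

```python
# Conditioning sketch: holding the measured confounder w constant blocks
# the backdoor path, recovering the true effect (here: 2).
import random
import statistics

random.seed(5)
n = 6000
TRUE_EFFECT = 2.0

data = []
for _ in range(n):
    w = random.randint(0, 1)                               # binary confounder
    x = 1 if random.random() < (0.7 if w else 0.3) else 0  # w raises P(x = 1)
    y = TRUE_EFFECT * x + 4 * w + random.gauss(0, 1)       # w also raises y
    data.append((w, x, y))

# Naive comparison ignores w, so it is biased upward
naive = (statistics.mean(y for w, x, y in data if x == 1)
         - statistics.mean(y for w, x, y in data if x == 0))

# Conditioning: compare y across x WITHIN each level of w, then average
within = []
for level in (0, 1):
    t = [y for w, x, y in data if w == level and x == 1]
    c = [y for w, x, y in data if w == level and x == 0]
    within.append(statistics.mean(t) - statistics.mean(c))
conditioned = statistics.mean(within)

print(f"naive: {naive:.2f}, conditioned on w: {conditioned:.2f}")
```

Within a stratum, w is literally constant, so it cannot drive the x–y comparison; that is the "holding w constant" the cards describe. The key caveat from the assumptions card still applies: this only removes confounding from w itself, not from unmeasured confounders.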
design based solutions
remove confounding as well!
how do design based solutions like before and after solve confounding
selects cases for comparison so as to eliminate confounding, known or unknown, measurable or unmeasurable
the comparison holds constant whole CLASSES (shared properties) of confounders
examine the correlation of x and y within cases where x changes over time
how does before and after work
all confounding variables that are unchanging over time are held constant
how does conditioning ACTUALLY solve confounding
holds measured confounding variables constant
examines correlation of x and y for cases that are the same on confounding w
before and after
examines changes in y in a single case or group where x changes over time
how does before and after solve confounding
by selecting cases for comparison to eliminate known and unknown, measurable and unmeasurable, confounding
the comparison holds constant whole classes of confounders, not specific variables
assumptions for b and a
y would have remained the same over time if x had not occurred
no variables w that affect y change over time along with x
(these two statements basically say the same thing; the slides use the second one)
difference in difference
design based
careful comparison rules out groups of confounding
compare before-and-after changes in treated cases to before-and-after changes in untreated cases
how does DID work
holds constant unchanging attributes of cases
holds constant variables that change over time in the same way for treated and untreated cases
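The DID arithmetic can be sketched with four hypothetical group means (all numbers invented: baseline levels differ by 2, both groups drift +1 per period, and treatment adds 3):

```python
# Difference-in-differences with made-up 2x2 group means.
# Groups start at different levels (unchanging attributes) and share a
# common +1 trend (time-varying factor common to both); treatment adds 3.
before = {"treated": 10.0, "untreated": 8.0}
after = {"treated": 14.0, "untreated": 9.0}

change_treated = after["treated"] - before["treated"]        # 4.0 (trend + effect)
change_untreated = after["untreated"] - before["untreated"]  # 1.0 (trend only)

did = change_treated - change_untreated
print(f"difference-in-differences estimate: {did}")  # 3.0
```

The first differencing removes each group's fixed baseline; subtracting the untreated group's change removes the shared time trend, leaving the treatment effect.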