causal inference
inference about which factor(s) may be responsible for causing a particular effect on something
Causal Inference: Observed Outcome
what ACTUALLY happened in the real world
Causal inference: Controlling
act of taking steps to ensure the baseline characteristics for some factors are the SAME between the CONTROL and TREATMENT observations
Causal inference: POOR Internal Validity --> ???
there is a CAUSAL BIAS --> prevents us from making any causal inferences
Case-Control Study
study in which individual control observations are each matched to individual "treatment" observations
- MANY PAIRS OF MATCHED OBSERVATIONS
Confounding bias prevents us from _____
prevents us from MAKING VALID INFERENCES based on the study's results
Confounding Factors
factors that themselves have a causal effect on both the MAIN CAUSAL factor we are trying to study AND the OUTCOME factor we are trying to study
Uncontrolled confounders can ________
can make it LOOK like there is a cause and effect relationship between two factors EVEN WHEN THERE ISN'T
2 steps to the process of identifying a possible confounder
1. Establish a difference between the treatment and control observations based on the possible confounding factor
2. Establish that the confounding factor is related to the outcome
Confounding Bias: What do we need to remember with case-control studies?
CASE-CONTROL STUDIES: we DO know that the treatment observations won't be EXACTLY like the control observations + that we're counting on having lots of matches so that all of the little differences will average out
Ch. 14 - Moderators
factors that change the effect a cause has on the outcome
Ch. 14 - Moderators/Interaction Effects: Another way to say that there IS an interaction effect
The ATE is different DEPENDING on who we're talking about
Ch. 14 - 2 steps to figure out VISUALLY whether you have a moderating factor creating an interaction effect:
1. Create an interaction plot
2. Visually determine if the slopes are OBVIOUSLY different
Ch. 14 - Moderators/Interaction Effects: Interaction Plots --- MORE overlap of the error bars suggests WHAT?
a LOWER likelihood of there being an interaction, EVEN IF the slopes appear to be different
Ch. 15 - Random Assignment: Is it an implicit OR explicit control? AND WHY?
Implicit Control - because random assignment CANNOT guarantee similarity between groups
Ch. 15 - Block Randomization Study Design (2 steps)
1. an entire group of SIMILAR people is MATCHED and BLOCKED together
2. within EACH BLOCK, some are RANDOMLY assigned to the treatment group, while the others are RANDOMLY assigned to the control group
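A minimal sketch of the two block-randomization steps above, in Python. The function name, the block labels, and the even half/half split are assumptions for illustration (the card only says "some" go to each group).

```python
import random

def block_randomize(blocks, seed=0):
    """Randomly split each block into treatment and control.

    blocks: dict mapping a block label to a list of similar participants.
    Assumption: half of each block (rounded down) goes to treatment.
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    assignment = {}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)  # the random process, applied within the block
        half = len(shuffled) // 2
        for person in shuffled[:half]:
            assignment[person] = "treatment"
        for person in shuffled[half:]:
            assignment[person] = "control"
    return assignment

# Hypothetical blocks of already-matched, similar people
blocks = {"A": ["a1", "a2", "a3", "a4"], "B": ["b1", "b2"]}
print(block_randomize(blocks))
```

Because the shuffle happens inside each block, every block contributes to both groups, which is what keeps the blocked characteristic balanced.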
Ch. 15 - Which study design should be used when...
Group of observations + NO random assignment
Cohort Control
Representativeness: Sampling (Ch. 16)
act of selecting observations from which to collect data
Sample Statistic (ch. 16 - pg. 163)
computing averages for a sample
Generalization Inference (ch. 16 - pg. 163)
the act of using a sample statistic to determine the value of a parameter
Generalization: Estimate (ch. 16 - pg. 164)
value we use as a stand-in for the parameter
External Validity: What does having a DIFFERENCE in the distribution of important underlying baseline characteristics between the sample and the population of interest mean? (ch. 16 - pg. 164)
IT MEANS THAT THE SAMPLE IS NOT REPRESENTATIVE of THE WHOLE POPULATION.
Sampling: In what cases do we say that there is a sample bias preventing us from making a generalization inference? (ch. 16 - pg. 166)
when a sample IS NOT REPRESENTATIVE of a population of interest --> not appropriate to use sample data to make generalizations about the population of interest
Sampling: Inclusion Criteria (ch. 16 - pg. 166)
The way in which statisticians and data scientists define the population of interest
WHO'S NOT HERE???????
Volunteer Sampling (Ch. 17 - pg. 173)
sampling strategy in which you make a survey or study available to some or all respondents who meet the inclusion criteria
Probability Sampling (ch. 17 - pg. 173)
sampling strategy in which all eligible observations are assigned a non-zero probability to be included in the sample + then some observations are selected using a random process
- random process = distinguishing feature of probability samples
Quota Sampling
sampling strategy in which you have specific quotas (limits) for specific strata of respondents within the population of interest
Quota Sampling: Quota is USUALLY based on ???? (ch. 17 - pg. 174)
characteristics of the population
Population-based Quota Sampling: How does it improve the external validity? (ch. 17 - pg. 175)
it forces the sample characteristics to match the population's aggregate characteristics
importance of survey incentive (2) (ch. 17 - pg. 178)
1. constitute an ethical return for the time the respondents take to complete the survey
2. help reduce non-response
Quota Sampling Strategies:
1. Did everyone in the population of interest have a chance of being in the sample?
2. Was some component of random sampling utilized?
3. Were quotas or blocks utilized?
1. Sometimes
2. No random sampling
3. Yes Quotas/Blocks
Probability Sampling Strategies
1. Did everyone in the population of interest have a chance of being in the sample?
2. Was some component of random sampling utilized?
3. Were quotas or blocks utilized?
1. Yes
2. YES random sampling
3. No quotas/blocks
Stratified Sampling Strategies:
1. Did everyone in the population of interest have a chance of being in the sample?
2. Was some component of random sampling utilized?
3. Were quotas or blocks utilized?
1. Yes
2. Yes random sampling
3. Yes quotas/blocks
KEY GIVEAWAYS: "Population-based" = ???
quotas
KEY GIVEAWAYS: "Adjusted" = ???
quota
KEY GIVEAWAYS: Randomly Selected = ????
random sampling
KEY GIVEAWAYS: Representative = ????
Random Sampling
Multi-Stage probability Samples help with what + examples
provide external validity for samples that are trying to generalize to very broad populations
EX: All Americans or all human adults
What is stratification?
dividing population into groups FIRST, then random sampling inside each group
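A rough Python sketch of "groups first, then random sampling inside each group". The function name, the `per_stratum` parameter, and the urban/rural population are hypothetical, and drawing the same number from every stratum is an assumption.

```python
import random

def stratified_sample(population, strata_key, per_stratum, seed=0):
    """Divide the population into strata, then randomly sample within each."""
    rng = random.Random(seed)
    groups = {}
    for unit in population:
        # Step 1: put each unit into its stratum
        groups.setdefault(strata_key(unit), []).append(unit)
    sample = []
    for members in groups.values():
        # Step 2: random sampling inside the stratum
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: (stratum, id) pairs
population = [("urban", i) for i in range(6)] + [("rural", i) for i in range(4)]
print(stratified_sample(population, lambda unit: unit[0], per_stratum=2))
```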
causal inference: Causal Factor + Outcome Factor
Causal: something that we will do that may have an effect
Outcome: the thing that will be affected
Causal inference: causal graph
a graphic depiction of the relationship between a causal factor and an outcome factor
Causal inference: Counterfactual question
what WOULD HAVE happened in the PARALLEL universe
- WHAT IF????
Causal inference: Treatment Observation
observations that were EXPOSED to some causal factor
Causal inference: Baseline Characteristics
attributes that we think might be RELATED to the OUTCOME we are studying
Causal inference: Matching
process of finding suitable CONTROL observations to COMPARE to TREATMENT observation and VICE VERSA
Causal inference: Internal Validity
depends on the extent to which the CONTROL observation are LIKE the TREATMENT observation
- depends on the SIMILARITY in the BASELINE characteristics between the CONTROL observation and TREATMENT observation
Cohort-Control study: ____ ____ is NOT a baseline characteristic that we have to control for
Sample Size
Causal Inference: When comparing PERCENTAGES, what should you compute?
AD (absolute difference) and RD (relative difference)
Causal Inference: When comparing MEANS, what should you compute?
Effect size
Template for Counterfactual Questions
What if (TREATMENT observations) didn't use (TREATMENT)?
PG. 83 - What should you use to compare company size and profit? AND WHY?
Correlation
When comparing baseline characteristics, what effect size would give us CONCERN?
0.20ish
- We want the effect size to be less than 0.1 if possible, to make sure the baseline characteristics are AS SIMILAR AS POSSIBLE
Ch. 14 - Moderators/Interaction Effects: True or False - The impact that a treatment has on an outcome (ATE) DOES NOT have to be the same for everyone.
True - The impact DOES NOT have to be the same for everyone.
Ch. 15 - How would you know if a study used random assignment?
it would DIRECTLY say it in the problem
Ch. 15 - Random Assignment
A strategy that utilizes a random process to determine which observations receive a TREATMENT, and which serve as a CONTROL
Ch. 15 - Matching: Is it an implicit OR explicit control? AND WHY?
Explicit Control - because Matching DOES guarantee that the control observations and treatment observations are VERY ALIKE on the baseline characteristics used to conduct the matching process (pg. 144)
Ch. 15 - Does block randomization tend to have HIGH or LOW internal validity?
HIGH internal validity -- especially when the size of each block is relatively large
- combines the benefits of explicit controls via matching on important baseline characteristics AND incorporates the benefits of implicit controls via including random assignment
Column Percents: Relative Difference - Calculation and Benchmark
DIVIDING one column percent by the other
- usually used when percentages are below 50%
DANGER ZONE: LESS than 0.8 or GREATER than 1.25
if in danger zone --> interpret that difference to indicate a real-world difference between the two groups
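A small Python sketch of the relative difference and its danger zone, using the 0.8 and 1.25 cutoffs from this card. The function names and the two percentages are made up for illustration.

```python
def relative_difference(pct_a, pct_b):
    """Divide one column percent by the other."""
    return pct_a / pct_b

def in_danger_zone(rd, low=0.8, high=1.25):
    """True when the ratio is less than 0.8 or greater than 1.25."""
    return rd < low or rd > high

# Hypothetical column percents: 12% in one group, 8% in the other
rd = relative_difference(12.0, 8.0)
print(rd)                  # 1.5
print(in_danger_zone(rd))  # True
```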
Ratio of SD: Calculation + Benchmark
LARGER SD DIVIDED by SMALLER SD
- DANGER ZONE: if the ratio is approximately 3 OR HIGHER, we say that one distribution is more spread out than the other
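A quick Python sketch of the ratio of SDs. It assumes the sample standard deviation (`statistics.stdev`) and that neither group has an SD of zero; the function name and the two data sets are hypothetical.

```python
from statistics import stdev

def sd_ratio(group_a, group_b):
    """Larger SD divided by smaller SD (assumes both SDs are non-zero)."""
    sd_a, sd_b = stdev(group_a), stdev(group_b)
    return max(sd_a, sd_b) / min(sd_a, sd_b)

# Made-up values for two groups
group_1 = [10, 12, 14, 16, 18]
group_2 = [4, 9, 14, 19, 24]

ratio = sd_ratio(group_1, group_2)
print(round(ratio, 2))  # 2.5, below the "approximately 3" benchmark
```

Putting the larger SD on top means the ratio is always 1 or more, so the order of the groups does not matter.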
When do you compute EFFECT SIZE?
only when the RATIO OF SD shows SIMILAR SPREAD (LESS THAN 3)
Representativeness: Sample (Ch. 16)
observations selected
Representativeness: Population of Interest (Ch. 16)
All observations that we are interested in studying
Representativeness: INSTEAD OF collecting data from ALL of the observations in the population of interest.... we ____ (ch. 16 - pg. 163)
select a SAMPLE of observations from the population of interest + collect data from the sample of observations
External Validity (ch. 16 - pg. 164)
degree to which a sampling strategy supports making a generalization inference
External Validity: Sample's representativeness (ch. 16 - pg. 164)
extent to which the observations in the sample are SIMILAR to the observations in the population of interest
Sampling: A Sample is representative when __________ (ch. 16 - pg. 165)
when everyone in the population of interest is represented by someone in the sample
What's another term for Volunteer Sampling? (ch. 17 - pg. 173)
Convenience Sampling
Do probability samples require that EVERY observation has the SAME probability of being selected for the sample?
NO - it DOESN'T require that every observation has the same probability of being selected for the sample
Adding strata to a sampling strategy is an _________ control. (ch. 17 - pg. 175)
EXPLICIT control --> GUARANTEES that the sample's characteristics will be similar to the population's aggregate characteristics
Random Sampling is an ______ control. (ch. 17 - pg. 175)
IMPLICIT control --> DOES NOT GUARANTEE anything + adds some measure of representativeness for all the other non-important characteristics that might impede our ability to generalize the sample statistic to the entire population of interest
survey incentive (ch. 17 - pg. 178)
the act of offering something in return for a respondent providing their data
Volunteer Sampling Strategies:
1. Did everyone in the population of interest have a chance of being in the sample?
2. Was some component of random sampling utilized?
3. Were quotas or blocks utilized?
1. Sometimes
2. No random sampling
3. No quotas/blocks
Quota Sampling: TRUE OR FALSE - Sample Size IS NOT the same thing as a quota.
TRUE
Causal inference: Control Observation
observations that are VERY SIMILAR to treatment observations, BUT were NOT EXPOSED to the causal factor
Causal Inference: Treatment EFFECT (+ how is it represented on the causal graph)
best guess as to HOW MUCH of an effect a causal factor has on an outcome factor
- represented as the arrow between the causal factor and outcome factor
Causal inference: 2 things we need to pay attention to when examining a study's results?
1. Study Design
2. INTERNAL VALIDITY (what was compared?)
Case-Control Study: Average Treatment Effect (ATE) represents WHAT?
The typical amount by which we expect the outcome factor to change if an observation is given a specific treatment
Cohort-Control study
study in which an ENTIRE group of control observations is matched to the group of "treatment" observations based on their aggregate characteristics
Case-Control Study: How should you compute the average treatment effect?
(ADD all the differences within each pair) AND then divide the total by the number of matched pairs
ALWAYS TREATMENT - CONTROL
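A minimal Python sketch of the case-control ATE. It reads the subtraction as treatment minus control within each pair and divides by the number of matched pairs; both readings are assumptions about the card's intent, and the outcome values are made up.

```python
def paired_ate(pairs):
    """Average of (treatment - control) across matched pairs.

    pairs: list of (treatment_outcome, control_outcome) tuples.
    """
    differences = [treatment - control for treatment, control in pairs]
    return sum(differences) / len(differences)

# Hypothetical outcomes for three matched pairs
pairs = [(8, 5), (6, 6), (9, 4)]   # differences: 3, 0, 5
print(round(paired_ate(pairs), 2))  # 2.67
```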
PG. 74 - What should you use to compare sales in Spring and Fall? AND WHY?
Ratio of SD and effect size
8 STEPS when working with Baseline Differences and Confounding Effects
1. Identify the causal factor and outcome factor
2. Draw a causal graph
3. Write the counterfactual question
4. Identify the treatment observations
5. Identify the study design
6. Compare baseline characteristics
7. Determine if any of the baseline characteristics with differences are related to the outcome factor
8. Identify confounding factors and biases
Ch. 14 - What causes interactions to occur?
Moderating Factors
Ch. 14 - Moderators/Interaction Effects: What is an indication of a moderating factor?
If the treatment impact on the outcome works BETTER OR WORSE for SOME people --> indication of a moderating factor
(S26 Ch14 Slides)
Ch. 14 - 3 Steps to figure out whether you have a moderating factor creating an interaction effect
1. Split both the treatment and control groups into groups based on the moderating factor
2. Compute SEPARATE ATE for the groups
3. See if the ATEs are different
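The three steps above can be sketched in Python as follows. This assumes each subgroup's ATE is computed as treatment mean minus control mean; the function name, the "younger"/"older" moderator, and the outcome values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

def ate_by_subgroup(records):
    """records: (subgroup, arm, outcome) tuples, arm is 'treatment' or 'control'."""
    buckets = defaultdict(lambda: {"treatment": [], "control": []})
    for subgroup, arm, outcome in records:
        # Step 1: split both arms by the moderating factor
        buckets[subgroup][arm].append(outcome)
    # Step 2: compute a separate ATE for each subgroup
    return {
        subgroup: mean(arms["treatment"]) - mean(arms["control"])
        for subgroup, arms in buckets.items()
    }

# Made-up outcomes split by a hypothetical moderator
records = [
    ("younger", "treatment", 10), ("younger", "treatment", 12),
    ("younger", "control", 6), ("younger", "control", 8),
    ("older", "treatment", 7), ("older", "treatment", 9),
    ("older", "control", 7), ("older", "control", 8),
]
# Step 3: compare the ATEs across subgroups
print(ate_by_subgroup(records))
```

With these numbers the treatment raises the outcome by 4 for the younger group and by only 0.5 for the older group, which is the kind of gap that points to a moderating factor.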
Ch. 15 - Matched Pairs Design (2 steps)
1. two observations are MATCHED based on important baseline characteristics
2. one observation in the pair is RANDOMLY assigned to the treatment group, while the other is RANDOMLY assigned to the control group
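A short Python sketch of step 2, assuming the pairs have already been matched in step 1. The coin-flip rule, function name, and participant labels are illustrative assumptions.

```python
import random

def assign_matched_pairs(pairs, seed=0):
    """For each matched pair, randomly pick which member gets the treatment."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    assignments = []
    for first, second in pairs:
        if rng.random() < 0.5:  # a fair coin flip per pair
            assignments.append({"treatment": first, "control": second})
        else:
            assignments.append({"treatment": second, "control": first})
    return assignments

# Hypothetical pairs already matched on baseline characteristics
pairs = [("p1", "p2"), ("p3", "p4"), ("p5", "p6")]
for row in assign_matched_pairs(pairs):
    print(row)
```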
Ch. 15 - Difference between Matched Pairs Studies and Case Controls
Matched Pairs: Researchers DECIDE which observations get exposed to the treatment and which do not
Case Controls: participants based on their OWN LIFE CHOICES have either already been exposed to the treatment or not
Ch. 15 - Which study design should be used when...
individual observations + random assignment
Matched Pairs
Ch. 15 - Which study design should be used when...
group of observations + random assignment
Block randomization
Ch. 15 - Which study design should be used when...
individual observations + NO random assignment
Case-Control
Column Percents: Absolute Difference - Calculation and Benchmark
SUBTRACTING one column percent from the other and taking the absolute value
DANGER ZONE: the absolute difference is GREATER THAN 10%
-- if greater than 10%, interpret the difference as indicating a real-world difference between the two groups
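A small Python sketch of the absolute difference and its 10% benchmark. The function names and the example percentages are hypothetical; percents are written in percentage points (42.0 means 42%).

```python
def absolute_difference(pct_a, pct_b):
    """Subtract one column percent from the other and take the absolute value."""
    return abs(pct_a - pct_b)

def is_real_world_difference(ad, threshold=10.0):
    """True when the absolute difference is greater than 10 percentage points."""
    return ad > threshold

# Made-up column percents: 42% vs. 30%
ad = absolute_difference(42.0, 30.0)
print(ad)                            # 12.0
print(is_real_world_difference(ad))  # True
```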
Effect Size: Calculation + Benchmark
(Group 1 Mean - Group 2 Mean) DIVIDED by LARGER SD
General Benchmark:
0.10 or LESS - NO difference in the averages between the groups
0.25ish - SMALL difference in the averages between the groups
0.50 - MODERATE difference in the averages between the groups
0.75ish - LARGE difference in the averages between the groups
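A minimal Python sketch of this effect size formula, dividing the difference in means by the larger SD as the card states. It assumes the sample standard deviation (`statistics.stdev`); the function name and data are made up.

```python
from statistics import mean, stdev

def effect_size(group_1, group_2):
    """(Group 1 mean - Group 2 mean) divided by the larger of the two SDs."""
    larger_sd = max(stdev(group_1), stdev(group_2))
    return (mean(group_1) - mean(group_2)) / larger_sd

# Hypothetical values for two groups
group_1 = [10, 12, 14, 16, 18]
group_2 = [9, 11, 13, 15, 17]

print(round(effect_size(group_1, group_2), 2))  # 0.32
```

The sign shows which group's mean is higher; the benchmarks above are usually compared against the size of the value, so 0.32 here falls between the "small" and "moderate" benchmarks.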
Parameters (ch. 16 - pg. 163)
Characteristics of an entire population of interest
Sampling: A sample MUST have a VERY _______ (high or low) external validity to support generalization (ch. 16 - pg. 166)
VERY HIGH EXTERNAL VALIDITY
Probability Sampling - What do you need to determine first before randomly selecting participants? (ch. 17 - 173)
MUST determine the desired sample size FIRST, and then randomly select participants
Stratified Random Sample (ch. 17 - pg. 175)
Adding strata to a probability sample
Internal Validity: Key questions to ask (2)
What was compared?
Was it apples-to-apples?
External Validity: Key questions to ask (2)
Who’s not here?
Who should be here?