reverse causation
A situation where the effect is mistaken for the cause: a change in the outcome influences the independent variable rather than the other way around.
Belief: A → B; reality: B → A
omitted variable bias
A form of bias in statistical analysis that occurs when a model fails to include important variables that influence both the independent and dependent variables, leading to incorrect conclusions about relationships.
Third variable (C) → both A and B. It can distort the estimated effect of the independent variable on the dependent variable.
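A minimal simulation sketch of this card (all numbers hypothetical): a confounder C drives both A and B, so A and B correlate even though neither causes the other.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
C = rng.normal(size=n)           # omitted confounder
A = 2 * C + rng.normal(size=n)   # A is driven by C, not by B
B = 3 * C + rng.normal(size=n)   # B is driven by C, not by A

# A and B correlate strongly despite no causal link between them,
# so a naive estimate of A's effect on B would be biased
print(np.corrcoef(A, B)[0, 1])   # ~0.85
```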
selection bias
A form of bias that occurs when the sample used in a study is not representative of the population being studied, often leading to skewed results and invalid conclusions. Example: self-selection, where individuals who volunteer for a study may not reflect the wider population.
potential outcomes
A framework for understanding causal inference that describes the possible results for each individual under different treatment conditions. It allows researchers to estimate the causal effect of a treatment by comparing the outcomes that would occur under treatment versus no treatment.
average treatment effect
The average treatment effect (ATE) is a measure used in causal inference that quantifies the difference in average outcomes between individuals who receive a treatment and those who do not. It provides insight into the effectiveness of an intervention across a population.
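A sketch of the ATE idea with made-up outcome arrays (the values below are purely illustrative):

```python
import numpy as np

# hypothetical outcomes for treated and control groups
treated = np.array([8.1, 7.4, 9.0, 8.6, 7.9])
control = np.array([6.2, 5.9, 7.1, 6.5, 6.8])

# in a randomized experiment, the difference in group means estimates the ATE
ate_hat = treated.mean() - control.mean()
print(f"estimated ATE: {ate_hat:.2f}")  # 8.2 - 6.5 = 1.70
```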
experiments
A statistical procedure used to evaluate the effects of treatments or interventions by randomly assigning subjects to different conditions.
randomization
A process in experiments where participants are assigned to different treatment groups in a way that each individual has an equal chance of being assigned, minimizing bias and allowing for causal inferences.
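One way to sketch random assignment (hypothetical subject IDs; a shuffle gives every subject an equal chance of landing in either group):

```python
import numpy as np

rng = np.random.default_rng(42)
subjects = np.arange(20)              # hypothetical subject IDs
shuffled = rng.permutation(subjects)  # random order; no subject favored

treatment_group = shuffled[:10]  # first half receives the treatment
control_group = shuffled[10:]    # second half serves as the control
```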
standard error
A statistical term that measures the accuracy with which a sample represents a population, calculated as the standard deviation divided by the square root of the sample size.
Always positive: because it measures distance or spread, it cannot be negative.
large SE = lots of uncertainty; small SE = precise; never negative
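A sketch of the formula on this card, standard error = standard deviation / sqrt(n), with made-up sample values:

```python
import numpy as np

sample = np.array([12.0, 15.0, 11.0, 14.0, 13.0, 16.0])  # hypothetical data

# standard error of the mean: sample SD divided by sqrt(n); never negative
se = sample.std(ddof=1) / np.sqrt(len(sample))
print(f"standard error: {se:.3f}")
```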
t-statistic
A value used in hypothesis testing to determine whether there is a significant difference between the means of two groups. For a one-sample test, it is the ratio of the difference between the sample mean and the hypothesized population mean to the standard error; for two groups, it is the difference in group means divided by the standard error of that difference.
p-value
Measures the probability of obtaining results at least as extreme as those observed, assuming the treatment has no effect (the null hypothesis is true). If the p-value is below 0.05, you can reject the null hypothesis and conclude that there is a statistically significant effect.
Always positive: a p-value is a probability, and probabilities can never be negative.
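A sketch tying the t-statistic and p-value cards together, using scipy's two-sample t-test on the same hypothetical treated/control outcomes as above:

```python
import numpy as np
from scipy import stats

treated = np.array([8.1, 7.4, 9.0, 8.6, 7.9])
control = np.array([6.2, 5.9, 7.1, 6.5, 6.8])

# t is the mean difference scaled by its standard error;
# p is the probability of a result at least this extreme under the null
t_stat, p_value = stats.ttest_ind(treated, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("reject the null hypothesis at the 5% level")
```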
sample size
The number of observations or participants included in a study or experiment. A larger sample size increases the accuracy and reliability of results by reducing random error and making statistical estimates more precise.
confidence intervals
A range of values derived from a data set that is likely to contain the true population parameter at a specified level of confidence, often expressed as a percentage (e.g., a 95% confidence interval). If we say we're 95% confident the true average is between 10 and 15, we think the real number is probably in that range. For example, if you surveyed 100 UC Berkeley students and found that 60% support a new policy, the sample proportion is 0.6; the true population proportion is the actual percentage of all UC Berkeley students who support it, which you don't know exactly, but your confidence interval estimates a range where that real number probably lies.
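A sketch of this card's survey example: a normal-approximation 95% confidence interval for a proportion, using the card's hypothetical 0.6 support rate and n = 100.

```python
import math

p_hat = 0.6  # sample proportion: 60 of 100 students support the policy
n = 100
z = 1.96     # z-value for 95% confidence

margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin
print(f"95% CI for the true proportion: ({lower:.3f}, {upper:.3f})")
# -> roughly (0.504, 0.696)
```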