These flashcards cover key concepts in regression analysis, the validity of research studies, and fundamental statistical principles important for understanding and analyzing empirical data.
What is R-squared and what does it measure?
R-squared, also known as the coefficient of determination, ranges from 0 to 1 and measures the proportion of the variation of the outcome variable explained by the model.
What are observed outcomes in regression analysis?
Observed outcomes are the actual values of Y in the data, as opposed to predicted values, which are estimated from the fitted model.
What are predicted outcomes?
Predicted outcomes are values of Y estimated from the fitted model based on observed values of X.
What are prediction errors in regression?
Prediction errors, also called residuals and denoted ε̂ (pronounced "epsilon-hat"), are the differences between observed outcomes and predicted outcomes.
What does the estimated slope in a regression equation represent?
The estimated slope indicates the change in the predicted outcome associated with a one-unit increase in the independent variable.
How is the intercept in a regression equation interpreted?
The intercept represents the expected value of the outcome variable when the predictor variable is zero, but this may not always be meaningful.
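The cards on observed outcomes, predicted outcomes, prediction errors, slope, and intercept can be tied together with a small least-squares fit. This is only a sketch: the midterm and final scores are made-up numbers, and `fit_line` is a hypothetical helper written for this illustration.

```python
from statistics import fmean

# Hypothetical data: midterm scores (X) and final exam scores (Y, observed outcomes).
midterm = [60.0, 70.0, 80.0, 90.0]
final = [65.0, 70.0, 85.0, 88.0]

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

intercept, slope = fit_line(midterm, final)

# Predicted outcomes: values of Y estimated from the fitted line.
predicted = [intercept + slope * xi for xi in midterm]

# Prediction errors (residuals): observed minus predicted.
residuals = [yi - yhat for yi, yhat in zip(final, predicted)]

print(round(slope, 2), round(intercept, 2))
```

With these numbers the slope is about 0.84, so each extra midterm point is associated with a predicted final grade about 0.84 points higher. The intercept (about 14) is the predicted final grade for a midterm score of zero, which lies far outside the data and is not very meaningful here.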
What does an R-squared value of 0.41 indicate in the context of predicting grades?
It means that 41% of the variation in final exam grades is explained by midterm grades.
Explain the relationship between correlation and R-squared.
In simple linear regression (one predictor), R-squared equals the square of the correlation coefficient between X and Y, so a stronger correlation, whether positive or negative, yields a higher R-squared.
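The link between correlation and R-squared can be checked numerically. The sketch below assumes a single predictor and uses made-up scores. It computes R-squared as one minus the ratio of the residual sum of squares to the total sum of squares, then compares it with the squared correlation.

```python
from math import sqrt
from statistics import fmean

# Hypothetical data for illustration only.
x = [60.0, 70.0, 80.0, 90.0]
y = [65.0, 70.0, 85.0, 88.0]

x_bar, y_bar = fmean(x), fmean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares

# Least-squares line.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual sum of squares: variation in Y left unexplained by the line.
ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - ssr / syy       # proportion of variation explained
r = sxy / sqrt(sxx * syy)       # correlation coefficient

print(round(r_squared, 4), round(r ** 2, 4))
```

The two printed values agree, which is the one-predictor identity the card describes. With several predictors, R-squared is no longer the square of any single pairwise correlation.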
What is a confounding variable?
A confounding variable is a variable that affects both the treatment and the outcome, complicating causal inference.
What is the purpose of the difference-in-means estimator (DIME)?
DIME estimates the average causal effect of treatment on the outcome when treatment and control groups are comparable.
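A minimal sketch of the difference-in-means estimator, assuming a binary treatment indicator (1 = treated, 0 = control). The outcome values and the function name `difference_in_means` are hypothetical.

```python
from statistics import fmean

def difference_in_means(outcomes, treated):
    """Average outcome among treated units minus average outcome among control units."""
    treated_y = [y for y, t in zip(outcomes, treated) if t == 1]
    control_y = [y for y, t in zip(outcomes, treated) if t == 0]
    return fmean(treated_y) - fmean(control_y)

# Hypothetical outcomes and treatment assignment.
outcomes = [12.0, 15.0, 11.0, 9.0, 10.0, 8.0]
treated = [1, 1, 1, 0, 0, 0]

estimate = difference_in_means(outcomes, treated)
print(round(estimate, 3))
```

The arithmetic is the same in an observational study, but the result can be read as an average causal effect only when the two groups are comparable, for example because treatment was randomly assigned.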
How are observational studies different from experimental studies?
Observational studies collect data on naturally occurring events, while experimental studies involve random assignment of treatment.
What is internal validity in research?
Internal validity refers to the extent to which causal conclusions drawn from a study are valid for the sampled observations.
What is external validity in research?
External validity concerns the generalizability of the causal conclusions drawn from the research to other settings or populations.
Why do confounders pose a problem in observational studies?
Confounders can bias estimates by affecting both treatment and outcome, making it difficult to attribute effects accurately.
What are the advantages of randomized experiments compared to observational studies?
Random assignment makes treatment independent of confounders, whether observed or unobserved, so treatment and control groups are comparable on average, which strengthens internal validity.
What does beta_hat represent in predictive models versus causal inference?
In predictive models, it reflects the change in the predicted outcome associated with a one-unit increase in X; in causal inference, it estimates the average causal effect of treatment X on Y, provided there are no confounders.
What is the Central Limit Theorem?
The Central Limit Theorem states that the distribution of the sample mean of independent draws approaches a normal distribution as the sample size increases, regardless of the shape of the original distribution, provided that distribution has finite variance.
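The Central Limit Theorem can be illustrated with a short simulation. This sketch draws from an exponential distribution with rate 1 (a skewed distribution chosen here only as an example) and looks at how the sample means behave. The sample sizes, number of draws, and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(0)  # fixed seed so the simulation is reproducible

def sample_means(n, draws=5000):
    """Draw `draws` samples of size n from Exponential(1) and return their means."""
    return [fmean([random.expovariate(1.0) for _ in range(n)]) for _ in range(draws)]

means_small = sample_means(5)
means_large = sample_means(50)

# Exponential(1) has mean 1 and standard deviation 1, so the sample mean
# should be centered near 1 with standard deviation close to 1 / sqrt(n).
predicted_sd = 1 / sqrt(50)

# For a normal distribution, about 95% of values fall within 1.96 standard
# deviations of the center; count how many sample means do.
inside = sum(1 for m in means_large if abs(m - 1.0) <= 1.96 * predicted_sd)
share_within = inside / len(means_large)

print(round(stdev(means_small), 3), round(stdev(means_large), 3), round(share_within, 3))
```

The spread of the sample means shrinks as the sample size grows, and for samples of size 50 roughly 95% of the means land inside the normal-based band even though the underlying data are strongly skewed.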
What is a Bernoulli distribution?
A Bernoulli distribution is the probability distribution of a binary variable, characterized by one parameter, p, the probability that the variable equals 1.
What is a normal distribution?
A normal distribution is a symmetric, bell-shaped continuous distribution that is fully characterized by two parameters, its mean and its variance.
How should a researcher interpret a 95% confidence interval for a treatment effect?
It indicates the range of plausible values for the true treatment effect; if it does not include zero, the effect is statistically significant at the 5% level.
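A 95% confidence interval for a treatment effect can be sketched from a difference in means and its standard error. The data below are made up, and the critical value 1.96 assumes the large-sample normal approximation; with samples this small, a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import fmean, variance

# Hypothetical outcomes for treated and control units.
treated = [12.0, 15.0, 11.0, 14.0, 13.0]
control = [9.0, 10.0, 8.0, 11.0, 10.0]

# Point estimate: difference in group means.
estimate = fmean(treated) - fmean(control)

# Standard error of the difference (sample variances, unequal-variance form).
se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))

# 95% interval under the normal approximation (assumption: 1.96 critical value).
z = 1.96
ci = (estimate - z * se, estimate + z * se)

# The effect is significant at the 5% level if the interval excludes zero.
significant = not (ci[0] <= 0.0 <= ci[1])

print(round(ci[0], 2), round(ci[1], 2), significant)
```

Here the interval lies entirely above zero, so under these assumptions the estimated effect would be called statistically significant at the 5% level.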
What does a p-value represent in hypothesis testing?
A p-value is the probability of observing a test statistic at least as extreme as what was observed if the null hypothesis is true.
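One way to see the definition in action is an exact permutation test, which is a technique chosen here for illustration and not necessarily the test these cards have in mind. Under the null hypothesis that treatment has no effect, the group labels are interchangeable, so the p-value is the share of all possible relabelings whose difference in means is at least as extreme as the one observed. The data are hypothetical.

```python
from itertools import combinations
from statistics import fmean

# Hypothetical outcomes for treated and control units.
treated = [12.0, 15.0, 11.0, 14.0, 13.0]
control = [9.0, 10.0, 8.0, 11.0, 10.0]

pooled = treated + control
observed = fmean(treated) - fmean(control)

extreme = 0
total = 0
# Enumerate every way to label 5 of the 10 units as "treated".
for idx in combinations(range(len(pooled)), len(treated)):
    chosen = set(idx)
    group1 = [pooled[i] for i in range(len(pooled)) if i in chosen]
    group0 = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    diff = fmean(group1) - fmean(group0)
    total += 1
    # Two-sided: count relabelings at least as extreme as the observed difference.
    if abs(diff) >= abs(observed) - 1e-12:
        extreme += 1

p_value = extreme / total
print(total, extreme, round(p_value, 4))
```

Only a handful of the 252 possible relabelings produce a gap as large as the observed one, so the p-value is small: data like these would be unusual if the null hypothesis were true.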