These flashcards cover key concepts and definitions from the chapter on inference for categorical data.
Sampling distribution
The probability distribution of a statistic, such as a sample proportion, over all possible random samples of a fixed size drawn from a given population.
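A small simulation can make this definition concrete. The sketch below assumes a hypothetical population with true proportion p = 0.3 and samples of size n = 100 (illustrative values, not taken from the chapter). It draws many samples and summarizes the resulting sample proportions; the function name is made up for this example.

```python
import random
import statistics

def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` samples of size `n`; return each sample's proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    proportions = []
    for _ in range(reps):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

# Illustrative values only: p = 0.3, n = 100, 2000 repeated samples.
p_hats = simulate_sample_proportions(p=0.3, n=100, reps=2000)
center = statistics.mean(p_hats)    # should land near p
spread = statistics.pstdev(p_hats)  # should land near sqrt(p * (1 - p) / n)
print(center, spread)
```

The collection of simulated proportions approximates the sampling distribution of the sample proportion for this population and sample size.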
Normal approximation
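A method for approximating the binomial distribution with a normal distribution that has the same mean, np, and standard deviation, sqrt(np(1 - p)); the approximation works well when the success-failure condition holds.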
Success-failure condition
For the sampling distribution of the sample proportion to be approximately normal, the sample must be large enough to expect at least 10 successes and at least 10 failures, that is, np ≥ 10 and n(1 - p) ≥ 10.
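The check itself is simple arithmetic. A minimal sketch, assuming the threshold of 10 stated on the card and a hypothetical helper name:

```python
def success_failure_ok(n, p, threshold=10):
    """Return True if expected successes and expected failures both reach the threshold."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold

# Illustrative inputs: the first passes, the second expects only about 5 successes.
print(success_failure_ok(100, 0.5))
print(success_failure_ok(50, 0.1))
```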
Binomial distribution
A probability distribution that describes the number of successes in a fixed number of independent Bernoulli trials.
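As an illustration, the probability of exactly k successes in n trials is C(n, k) p^k (1 - p)^(n - k). A minimal sketch using only the standard library (the function name is an assumption for this example):

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Two successes in four fair-coin trials: 6 * 0.5**4
print(binomial_pmf(2, 4, 0.5))  # 0.375
```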
P-value
The probability of observing a test statistic at least as extreme as the one computed from the sample, in the direction of the alternative hypothesis, assuming the null hypothesis is true.
Confidence interval
A range of plausible values for an unknown population parameter, computed from sample data; the confidence level (for example, 95%) is the long-run proportion of such intervals that capture the parameter.
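For a single proportion, one common form is point estimate ± z* × standard error. The sketch below assumes that form with z* = 1.96 for roughly 95% confidence, and uses made-up counts (60 successes out of 100) purely for illustration.

```python
from math import sqrt

def proportion_confidence_interval(successes, n, z_star=1.96):
    """Interval for a proportion: p_hat +/- z_star * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin

# Hypothetical data: 60 successes in 100 trials.
lower, upper = proportion_confidence_interval(60, 100)
print(round(lower, 3), round(upper, 3))
```

With these counts the interval runs from about 0.504 to about 0.696.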
Inference for a single proportion
Methods to make conclusions about a single population proportion from sample data.
Null Hypothesis (H_0)
A statement about a population parameter that is assumed to be true until evidence suggests otherwise.
Alternative Hypothesis (H_A)
A statement about a population parameter that contradicts the null hypothesis, representing what the researcher is trying to find evidence for.
Test statistic
A value calculated from sample data during a hypothesis test that describes how far the sample estimate is from the null hypothesis value, typically in terms of standard errors.
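For a one-proportion test, this is usually a z-score: (sample proportion - null value) / standard error, with the standard error computed from the null value. A minimal sketch with a two-sided p-value follows; the function names and the counts (60 successes out of 100, null value 0.5) are assumptions for illustration.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def one_proportion_z_test(successes, n, p0):
    """Return (z, two-sided p-value) for H_0: p = p0."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # uses the null value, not p_hat
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value

# Hypothetical data: 60 successes in 100 trials, testing H_0: p = 0.5.
z, p_value = one_proportion_z_test(60, 100, 0.5)
print(round(z, 2), round(p_value, 4))
```

Here the sample proportion sits about 2 standard errors above the null value, giving a two-sided p-value of roughly 0.046.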
Conditions for inference for proportions
To perform valid inference for proportions, the following conditions typically need to be met: random sample/assignment, independence (often checked by the 10% condition for sampling without replacement), and the success-failure condition.
Inference for two proportions
Methods to compare two population proportions using sample data from two independent groups.
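One common method is the two-proportion z-test, which pools the two samples to estimate the standard error under the null hypothesis of equal proportions. The sketch below assumes that pooled approach; the function names and counts are illustrative only.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H_0: p1 = p2 using the pooled proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / standard_error
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value

# Hypothetical data: 60/100 successes in group 1, 45/100 in group 2.
z2, p_value2 = two_proportion_z_test(60, 100, 45, 100)
print(round(z2, 3), round(p_value2, 3))
```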
Chi-square tests
Statistical tests used to determine if there is a significant association between categorical variables or if a categorical distribution matches an expected distribution.
Expected count (for Chi-square tests)
The number of observations that would be expected in a cell of a contingency table if the null hypothesis (no association, or homogeneity across groups) were true, computed as (row total × column total) / grand total.
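For a two-way table, each expected count is (row total × column total) / grand total. A minimal sketch, with a hypothetical helper name and a made-up 2 × 2 table:

```python
def expected_counts(table):
    """Expected cell counts under no association: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]

# Hypothetical observed counts; every margin is 50, so each expected count is 25.
observed = [[20, 30],
            [30, 20]]
print(expected_counts(observed))  # [[25.0, 25.0], [25.0, 25.0]]
```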
Degrees of freedom (for Chi-square tests)
A parameter that determines the shape of the chi-square distribution, calculated as (rows - 1) × (columns - 1) for tests of independence/homogeneity, or (categories - 1) for goodness-of-fit tests.
Chi-square Goodness-of-Fit test
A chi-square test used to determine if the observed distribution of a single categorical variable matches an expected hypothesized distribution.
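The test statistic sums (observed - expected)^2 / expected over the categories. The sketch below computes that statistic and its degrees of freedom for made-up counts and hypothesized proportions; it stops short of a p-value, which would come from the chi-square distribution (for example via a statistics library).

```python
def goodness_of_fit_statistic(observed, hypothesized_proportions):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    total = sum(observed)
    expected = [total * p for p in hypothesized_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    degrees_of_freedom = len(observed) - 1
    return statistic, degrees_of_freedom

# Hypothetical data: 100 observations across 4 categories, each hypothesized at 25%.
gof_statistic, gof_df = goodness_of_fit_statistic([18, 22, 20, 40], [0.25, 0.25, 0.25, 0.25])
print(gof_statistic, gof_df)
```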
Chi-square Test of Independence
A chi-square test used to determine if there is an association between two categorical variables in a single population.
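For a two-way table, the same kind of statistic is summed over every cell, with expected counts built from the table margins. A minimal sketch on a made-up 2 × 3 table (names are assumptions for this example):

```python
def independence_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom

# Hypothetical 2 x 3 table of counts.
ind_statistic, ind_df = independence_statistic([[10, 20, 30],
                                                [20, 20, 20]])
print(round(ind_statistic, 3), ind_df)
```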
Chi-square Test of Homogeneity
A chi-square test used to compare the distribution of a single categorical variable across two or more independent populations or groups, such as the groups defined by an explanatory variable.
Odds ratio
A measure of association between an exposure and an outcome: the odds of the outcome in the exposed group divided by the odds of the outcome in the unexposed group. A value of 1 indicates no association.
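As a worked example, the odds in each group are (count with outcome) / (count without outcome), and the odds ratio divides one by the other. The counts and parameter names below are hypothetical.

```python
def odds_ratio(exposed_with, exposed_without, unexposed_with, unexposed_without):
    """Odds of the outcome among the exposed divided by the odds among the unexposed."""
    odds_exposed = exposed_with / exposed_without
    odds_unexposed = unexposed_with / unexposed_without
    return odds_exposed / odds_unexposed

# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed have the outcome.
ratio = odds_ratio(30, 70, 10, 90)
print(round(ratio, 2))
```

With these counts the odds of the outcome are a little under four times as high in the exposed group.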
Outcome-based sampling
A sampling method in which subjects are selected according to their outcome status (for example, cases and controls) rather than by random sampling from the whole population.