8. Inference for Categorical Data

Description and Tags

These flashcards cover key concepts and definitions from the chapter on inference for categorical data.

20 Terms

1

Sampling distribution

The probability distribution of a statistic (such as a sample proportion) across all possible samples of a fixed size drawn from a specific population.

2

Normal approximation

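A method of approximating the binomial distribution with a normal distribution that has mean np and standard deviation sqrt(np(1 - p)). The approximation is reasonable when the sample size is large enough that np and n(1 - p) are both at least 10.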
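As a rough illustration of how close the approximation can be, the sketch below compares an exact binomial probability with its normal approximation. The function names, the 0.5 continuity correction, and the example values (n = 100, p = 0.3, k = 25) are assumptions chosen for illustration, not part of the flashcard set.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), using a 0.5 continuity correction."""
    mu = n * p                      # mean of the binomial
    sigma = sqrt(n * p * (1 - p))   # standard deviation of the binomial
    return NormalDist(mu, sigma).cdf(k + 0.5)


# Hypothetical example values: np = 30 and n(1 - p) = 70, both at least 10.
n, p, k = 100, 0.3, 25
exact = binom_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact={exact:.4f}  approx={approx:.4f}")
```

With these values the two probabilities should agree to about two decimal places; the agreement typically worsens as np or n(1 - p) drops below 10.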

3

Success-failure condition

For the sampling distribution of the sample proportion to be approximately normal, the sample must be expected to contain at least 10 successes and at least 10 failures, that is, np ≥ 10 and n(1 - p) ≥ 10.
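The condition can be checked mechanically. The helper below is a minimal sketch; the name `success_failure_ok` and the `threshold` parameter are hypothetical, and the cutoff of 10 follows the card above.

```python
def success_failure_ok(n, p, threshold=10):
    """Return True when both expected successes and expected failures
    reach the threshold (10 by default, as in the card above)."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold


print(success_failure_ok(100, 0.3))  # 30 expected successes, 70 expected failures
print(success_failure_ok(50, 0.1))   # only 5 expected successes
```

For a hypothesis test, p would be the null value; for a confidence interval, the sample proportion is commonly used in its place.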

4

Binomial distribution

A probability distribution that describes the number of successes in a fixed number of independent Bernoulli trials.

5

P-value

The probability of observing a test statistic at least as extreme as the one computed from the sample, in the direction of the alternative hypothesis, assuming the null hypothesis is true.

6

Confidence interval

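A range of plausible values for an unknown population parameter, computed from sample data as a point estimate plus or minus a margin of error, using a method that captures the parameter in a stated percentage (the confidence level) of repeated samples.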
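For a single proportion, one common construction is the normal-based (Wald) interval. The sketch below assumes that method and assumes the success-failure condition holds; the function name and example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-based (Wald) confidence interval for a population proportion."""
    p_hat = successes / n
    # Critical value z* that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error based on p_hat
    margin = z_star * se
    return p_hat - margin, p_hat + margin


# Hypothetical sample: 60 successes out of 100 observations.
low, high = proportion_ci(60, 100)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

For this sample the interval should come out to roughly 0.50 to 0.70.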

7

Inference for a single proportion

Methods to make conclusions about a single population proportion from sample data.

8

Null Hypothesis (H_0)

A statement about a population parameter that is assumed to be true until evidence suggests otherwise.

9

Alternative Hypothesis (H_A)

A statement about a population parameter that contradicts the null hypothesis, representing what the researcher is trying to find evidence for.

10

Test statistic

A value calculated from sample data during a hypothesis test that describes how far the sample estimate is from the null hypothesis value, typically in terms of standard errors.
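The hypotheses, test statistic, and p-value from the preceding cards fit together in a one-proportion z-test. The sketch below assumes a two-sided alternative and a normal sampling distribution; the function name and the example (60 successes in 100 trials, null value 0.5) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H_0: p = p0 against H_A: p != p0."""
    p_hat = successes / n
    # The standard error uses the null value p0, since H_0 is assumed true.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se  # distance from the null value in standard errors
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z={z:.2f}  p-value={p_value:.4f}")
```

Here the sample proportion sits about two standard errors above the null value, which gives a two-sided p-value a little under 0.05.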

11

Conditions for inference for proportions

To perform valid inference for proportions, the following conditions typically need to be met: random sample/assignment, independence (often checked by the 10% condition for sampling without replacement), and the success-failure condition.

12

Inference for two proportions

Methods to compare two population proportions using sample data from two independent groups.
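One such method is the pooled two-proportion z-test of H_0: p1 = p2. The sketch below assumes a two-sided alternative and independent groups; the function name and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test of H_0: p1 = p2 for two independent groups."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H_0 both groups share one proportion, so pool the successes.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 45 of 100 successes in group 1, 30 of 100 in group 2.
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z={z:.2f}  p-value={p_value:.4f}")
```

The pooled proportion is used only for the test; a confidence interval for p1 - p2 would usually use the unpooled standard error instead.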

13

Chi-square tests

Statistical tests used to determine if there is a significant association between categorical variables or if a categorical distribution matches an expected distribution.

14

Expected count (for Chi-square tests)

The number of observations that would be expected in a cell of a contingency table if the null hypothesis (no association, or homogeneity across groups) were true. For a two-way table it is computed as (row total × column total) / table total.

15

Degrees of freedom (for Chi-square tests)

A parameter that determines the shape of the chi-square distribution, calculated as (rows - 1) × (columns - 1) for tests of independence/homogeneity, or (categories - 1) for goodness-of-fit tests.
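Expected counts and degrees of freedom for a two-way table can be computed together with the chi-square statistic, the sum of (observed - expected)² / expected over all cells. The sketch below is a plain-Python illustration; the function name and the 2 × 2 example table are hypothetical, and no p-value is computed because the standard library has no chi-square distribution.

```python
def chi_square_two_way(table):
    """Expected counts, chi-square statistic, and degrees of freedom
    for a two-way table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count for each cell: (row total * column total) / table total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, stat, df


# Hypothetical 2 x 2 table of observed counts.
observed = [[30, 20], [20, 30]]
expected, stat, df = chi_square_two_way(observed)
print(expected, stat, df)
```

In this example every expected count is 25, each cell contributes 1 to the statistic, and the table has 1 degree of freedom.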

16

Chi-square Goodness-of-Fit test

A chi-square test used to determine if the observed distribution of a single categorical variable matches an expected hypothesized distribution.
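For a goodness-of-fit test, the expected counts come from the hypothesized proportions rather than from table margins. The sketch below shows the statistic and degrees of freedom only; the function name, the four equal hypothesized proportions, and the observed counts are hypothetical.

```python
def chi_square_gof(observed, hypothesized_props):
    """Expected counts, chi-square statistic, and degrees of freedom
    for a goodness-of-fit test on one categorical variable."""
    n = sum(observed)
    # Expected count per category: sample size times hypothesized proportion.
    expected = [n * p for p in hypothesized_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return expected, stat, df


# Hypothetical data: 100 observations across four equally likely categories.
expected, stat, df = chi_square_gof([30, 20, 25, 25], [0.25, 0.25, 0.25, 0.25])
print(expected, stat, df)
```

The statistic would then be compared against a chi-square distribution with (categories - 1) degrees of freedom, here 3.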

17

Chi-square Test of Independence

A chi-square test used to determine if there is an association between two categorical variables in a single population.

18

Chi-square Test of Homogeneity

A chi-square test used to compare the distribution of a single categorical variable across two or more independent populations, or across groups defined by an explanatory variable.

19

Odds ratio

A measure of association between an exposure and an outcome: the ratio of the odds of the outcome in the exposed group to the odds of the outcome in the unexposed group. A value of 1 indicates no association.
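The calculation can be sketched from the four cells of a 2 × 2 table. The function name, argument names, and example counts below are hypothetical.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the outcome among the exposed divided by the odds
    of the outcome among the unexposed."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed have the outcome.
ratio = odds_ratio(30, 70, 10, 90)
print(f"odds ratio = {ratio:.2f}")
```

With these counts the odds of the outcome are nearly four times as high in the exposed group. The function assumes no cell is zero; a zero count would need separate handling.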

20

Outcome-based sampling

A sampling method in which subjects are selected according to whether they have a certain outcome (as in a case-control study), rather than by sampling at random from the whole population.