Unit 8 Inference for Categorical Data: Understanding and Using Chi-Square Procedures

Last updated 3:08 PM on 3/12/26

25 Terms

1
Chi-square goodness of fit test (GOF test)

A hypothesis test for one categorical variable that compares observed counts to expected counts from a claimed distribution.

2
Observed count (O)

The number of sample observations that fall in a particular category (or cell) of a table.

3
Expected count in GOF (Eᵢ = n·pᵢ,₀)

The count predicted by the null model for category i, found by multiplying the sample size n by the claimed proportion pᵢ,₀.

4
Chi-square test statistic (χ²)

A measure of overall discrepancy between observed and expected counts: χ² = Σ (O − E)² / E.

5
Chi-square contribution

A single term (O − E)² / E showing how much one category/cell contributes to the total χ²; larger values indicate bigger disagreement with the null model.
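
The expected counts, the individual contributions, and the total statistic for a goodness of fit test can be traced in a few lines. This is a minimal sketch with made-up data (250 observations, three categories, claimed proportions 0.5, 0.3, 0.2); the variable names are illustrative and do not come from any library.

```python
# Hypothetical data: 250 observations in three categories.
observed = [100, 90, 60]
claimed = [0.5, 0.3, 0.2]   # proportions stated in H0 (assumed for illustration)

n = sum(observed)
expected = [n * p for p in claimed]                      # E_i = n * p_i0
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)                          # sum of (O - E)^2 / E

for o, e, c in zip(observed, expected, contributions):
    print(f"O = {o}, E = {e:.1f}, contribution = {c:.2f}")
print(f"chi-square = {chi_square:.2f}")
```

With these numbers the expected counts are 125, 75, and 50. The first category contributes 5 of the total of 10, so it disagrees most with the claimed distribution.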

6
Degrees of freedom for GOF (df = k − 1)

For a goodness of fit test with k categories, df equals the number of categories minus 1.

7
Right-tailed chi-square test

A chi-square test where only large χ² values support the alternative, because large χ² indicates large overall mismatch between O and E.

8
Random condition (chi-square tests)

The data should come from a random sample (or random assignment in an experiment) to justify inference.

9
Independence condition (observations)

Sample observations must be independent of one another; this is about people/outcomes not influencing each other, not about variables being independent.

10
10% condition

When sampling without replacement, the sample size should be no more than about 10% of the population to support independence.

11
Large expected counts condition

All expected counts should be at least 5 (E ≥ 5 in every category/cell) so the chi-square approximation is valid.

12
Combine categories

A remedy when some expected counts are too small (E < 5): merge categories in a contextually sensible way to increase expected counts.
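
Checking the large expected counts condition and merging categories when it fails might look like the sketch below. The data, the category labels, and the helper name `large_counts_ok` are all hypothetical; which categories make sense to merge depends on the context of the problem.

```python
def large_counts_ok(expected, minimum=5):
    """Return True when every expected count is at least `minimum`."""
    return all(e >= minimum for e in expected)

# Hypothetical goodness of fit setup with four categories.
n = 60
claimed = {"A": 0.50, "B": 0.40, "C": 0.06, "D": 0.04}
observed = {"A": 31, "B": 22, "C": 4, "D": 3}

expected = {cat: n * p for cat, p in claimed.items()}
print(large_counts_ok(expected.values()))         # C and D fall below 5

# Merge C and D: add their claimed proportions and their observed counts.
claimed_merged = {"A": 0.50, "B": 0.40, "C or D": claimed["C"] + claimed["D"]}
observed_merged = {"A": 31, "B": 22, "C or D": observed["C"] + observed["D"]}
expected_merged = {cat: n * p for cat, p in claimed_merged.items()}
print(large_counts_ok(expected_merged.values()))  # merged category is about 6
```

Merging changes the number of categories, so the degrees of freedom change too: here k drops from 4 to 3 and df drops from 3 to 2.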

13
Fail to reject H₀

A decision meaning the sample data do not provide convincing evidence against the null; it does not prove the null model is true.

14
Chi-square test for homogeneity

A chi-square procedure comparing the distribution of one categorical response across two or more populations or treatments using separate samples (or treatment groups).

15
Two-way table (contingency table)

A table of counts classified by two categorical variables (or by group and response categories), used in homogeneity and independence tests.

16
Expected count in a two-way table (E = row total·column total / grand total)

The count predicted under the null (same distribution across groups or independence), computed from marginal totals.

17
Degrees of freedom for homogeneity/independence (df = (r − 1)(c − 1))

For an r×c contingency table, df equals (number of rows − 1) times (number of columns − 1).
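
For a two-way table, the expected counts and the degrees of freedom both come straight from the table's shape and marginal totals. The sketch below uses an invented 2×3 table of counts; it is an illustration of the formulas E = row total · column total / grand total and df = (r − 1)(c − 1), not output from any real study.

```python
# Hypothetical 2x3 table: rows are groups, columns are response categories.
observed = [
    [20, 30, 50],   # group 1
    [30, 30, 40],   # group 2
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# E = (row total * column total) / grand total, cell by cell.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(expected)
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

Both rows have the same total here, so both rows get the same expected counts (25, 30, 45), which is what "same distribution across groups" predicts.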

18
Chi-square test for independence

A chi-square test using one sample where two categorical variables are measured on each individual to assess whether the variables are associated.

19
Association (categorical variables)

A relationship where the distribution of one categorical variable differs across the categories of another (i.e., variables are not independent).

20
Standardized residual ((O − E)/√E)

A cell-by-cell measure of deviation whose square is that cell's chi-square contribution; a large positive value means more observed than expected, a large negative value means fewer observed than expected.
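
The residuals (O − E)/√E keep the sign that the squared contributions lose, which is why they are useful for follow-up analysis. A minimal sketch, assuming made-up observed counts and expected counts that have already been computed:

```python
import math

# Hypothetical observed and expected counts for three categories.
observed = [100, 90, 60]
expected = [125.0, 75.0, 50.0]

# (O - E) / sqrt(E) for each category; the sign shows the direction.
residuals = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]

# Squaring each residual gives that category's chi-square contribution.
chi_square = sum(r ** 2 for r in residuals)

for o, e, r in zip(observed, expected, residuals):
    direction = "more" if r > 0 else "fewer"
    print(f"O = {o}, E = {e:.0f}, residual = {r:+.2f} ({direction} than expected)")
print(f"chi-square = {chi_square:.2f}")
```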

21
Marginal totals (row/column totals)

Totals across rows and columns in a two-way table; used to compute expected counts under homogeneity/independence.

22
Chi-square vs two-proportion z link (χ² = z² in 2×2)

For a 2×2 table, the chi-square statistic equals the square of the two-proportion z statistic (computed with the pooled proportion), so the chi-square test and the two-sided z test give the same p-value and the same conclusion.
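
The χ² = z² link can be checked numerically. The sketch below uses invented counts (30 of 100 successes in one group, 45 of 100 in the other) and assumes the z statistic is computed with the pooled proportion, which is the version that matches the chi-square calculation.

```python
import math

# Hypothetical 2x2 data: successes in two independent groups.
successes = [30, 45]
sizes = [100, 100]

# Two-proportion z statistic with the pooled proportion.
p1, p2 = successes[0] / sizes[0], successes[1] / sizes[1]
pooled = sum(successes) / sum(sizes)
z = (p1 - p2) / math.sqrt(pooled * (1 - pooled) * (1 / sizes[0] + 1 / sizes[1]))

# Chi-square statistic from the same counts arranged as a 2x2 table.
observed = [[s, n - s] for s, n in zip(successes, sizes)]   # [success, failure]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(sizes)
chi_square = sum(
    (o - n * c / grand_total) ** 2 / (n * c / grand_total)
    for row, n in zip(observed, sizes)     # n is the row total
    for o, c in zip(row, col_totals)       # c is the column total
)

print(f"z = {z:.4f}, z^2 = {z ** 2:.4f}, chi-square = {chi_square:.4f}")
```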

23
Procedure selection: homogeneity vs independence

Homogeneity uses multiple samples/populations (one response variable); independence uses one sample with two variables measured on each subject.

24
Counts vs percentages (common mistake)

The χ² formula must use counts for O and E; using proportions/percentages directly in the statistic is incorrect.
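
A quick comparison shows why proportions cannot be plugged into the formula. With the hypothetical data below, the statistic computed from proportions comes out smaller than the correct one by a factor of the sample size n.

```python
# Hypothetical data: 250 observations, claimed proportions from H0.
observed = [100, 90, 60]
claimed = [0.5, 0.3, 0.2]
n = sum(observed)

# Correct: counts for both O and E.
expected = [n * p for p in claimed]
correct = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Incorrect: sample proportions in place of O, claimed proportions in place of E.
sample_props = [o / n for o in observed]
wrong = sum((ph - p) ** 2 / p for ph, p in zip(sample_props, claimed))

print(f"with counts:      {correct:.2f}")
print(f"with proportions: {wrong:.4f}")
print(f"ratio:            {correct / wrong:.0f}")   # equals n
```

The proportion version ignores how much data was collected, so the same sample proportions would give the same value for n = 25 and n = 25,000, which cannot be right for a test statistic.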

25
Chi-square p-value

The probability, under the null model, of getting a chi-square statistic at least as large as the observed χ² (area to the right under the chi-square curve).
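
In practice the right-tail area comes from a calculator or a statistics library. To show what is being computed, the sketch below uses a hand-written helper, `chi2_sf` (a hypothetical name, not a library function), that evaluates the area for whole-number degrees of freedom using only Python's standard `math` module. The statistic 10 with df = 2 is an invented example.

```python
import math

def chi2_sf(x, df):
    """Right-tail area P(chi-square >= x) for a positive integer df.

    Starts from the closed forms for df = 1 (erfc) or df = 2 (exp) and
    steps up two degrees of freedom at a time using the recurrence
    Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / gamma(a + 1), with t = x / 2.
    """
    if x < 0 or df < 1 or df != int(df):
        raise ValueError("x must be >= 0 and df a positive integer")
    t = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)               # df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))    # df = 1
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1)
        a += 1.0
    return q

# Hypothetical goodness of fit result: chi-square = 10 with k = 3 categories.
chi_square = 10.0
df = 3 - 1
p_value = chi2_sf(chi_square, df)
print(f"p-value = {p_value:.4f}")
```

Because the test is right-tailed, only the area to the right of the observed statistic is used; there is no doubling as in a two-sided z or t test.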