Chi-square goodness of fit test (GOF test)
A hypothesis test for one categorical variable that compares observed counts to expected counts from a claimed distribution.
Observed count (O)
The number of sample observations that fall in a particular category (or cell) of a table.
Expected count in GOF (Eᵢ = n·pᵢ,₀)
The count predicted by the null model for category i, found by multiplying the sample size n by the claimed proportion pᵢ,₀.
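A minimal sketch of the expected-count formula Eᵢ = n·pᵢ,₀. The function name `gof_expected_counts` and the sample numbers are made up for illustration.

```python
def gof_expected_counts(n, claimed_props):
    """Return the expected counts E_i = n * p_i0, one per category."""
    # The claimed proportions describe a full distribution, so they must sum to 1.
    if abs(sum(claimed_props) - 1.0) > 1e-9:
        raise ValueError("claimed proportions must sum to 1")
    return [n * p for p in claimed_props]


# Hypothetical example: 200 observations, claimed distribution 50% / 30% / 20%.
expected = gof_expected_counts(200, [0.5, 0.3, 0.2])
print(expected)
```

Expected counts do not need to be whole numbers; leave them unrounded.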
Chi-square test statistic (χ²)
A measure of overall discrepancy between observed and expected counts: χ² = Σ [(O − E)² / E], summed over all categories (GOF) or all cells (two-way table).
Chi-square contribution
A single term (O − E)² / E showing how much one category/cell contributes to the total χ²; larger values indicate bigger disagreement with the null model.
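A short sketch of how the contributions add up to χ². The helper name and the counts are hypothetical; the expected counts are assumed to be already computed.

```python
def chi_square_contributions(observed, expected):
    """Return one (O - E)^2 / E term per category."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Hypothetical GOF data: three categories, n = 200.
observed = [90, 70, 40]
expected = [100.0, 60.0, 40.0]

contributions = chi_square_contributions(observed, expected)
chi_sq = sum(contributions)  # the test statistic is the sum of all contributions

print(contributions)
print(chi_sq)
```

Here the second category has the largest contribution, so it is the one that disagrees most with the null model.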
Degrees of freedom for GOF (df = k − 1)
For a goodness of fit test with k categories, df equals the number of categories minus 1.
Right-tailed chi-square test
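Chi-square tests are right-tailed: only large χ² values support the alternative, because χ² grows whenever observed counts move away from expected counts in either direction, while values near 0 indicate close agreement with the null model.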
Random condition (chi-square tests)
The data should come from a random sample (or random assignment in an experiment) to justify inference.
Independence condition (observations)
Sample observations must be independent of one another. This condition is about individuals/outcomes not influencing each other; it is separate from the question of whether two variables are independent.
10% condition
When sampling without replacement, the sample size should be no more than about 10% of the population to support independence.
Large expected counts condition
All expected counts should be at least 5 (E ≥ 5 in every category/cell) so the chi-square approximation is valid.
Combine categories
A remedy when some expected counts are too small (E < 5): merge categories in a contextually sensible way to increase expected counts.
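A sketch of checking the large-counts condition and then merging two categories. The helper names and counts are hypothetical, and which categories to merge is a judgment call made from context, not something code can decide.

```python
def small_expected_indices(expected, threshold=5):
    """Indices of categories whose expected count is below the threshold."""
    return [i for i, e in enumerate(expected) if e < threshold]


def merge_two(counts, i, j):
    """Return a new list where categories i and j are replaced by their sum (placed last)."""
    kept = [c for k, c in enumerate(counts) if k not in (i, j)]
    return kept + [counts[i] + counts[j]]


# Hypothetical counts for four categories.
observed = [18, 22, 7, 3]
expected = [20.0, 20.0, 6.0, 4.0]

too_small = small_expected_indices(expected)  # the last category has E < 5

# Merge the last two categories in BOTH lists so O and E still line up.
observed_merged = merge_two(observed, 2, 3)
expected_merged = merge_two(expected, 2, 3)

print(too_small, observed_merged, expected_merged)
```

After merging, the number of categories k drops from 4 to 3, so df = k − 1 drops from 3 to 2.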
Fail to reject H₀
A decision meaning the sample data do not provide convincing evidence against the null; it does not prove the null model is true.
Chi-square test for homogeneity
A chi-square procedure comparing the distribution of one categorical response across two or more populations or treatments using separate samples (or treatment groups).
Two-way table (contingency table)
A table of counts classified by two categorical variables (or by group and response categories), used in homogeneity and independence tests.
Expected count in a two-way table (E = (row total × column total) / grand total)
The count predicted under the null (same distribution across groups or independence), computed from marginal totals.
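A minimal sketch of computing every expected count from the marginal totals. The function name and the 2×3 table are hypothetical; the table is assumed to hold only the cell counts, with no total row or column.

```python
def expected_table(observed):
    """Return E = (row total * column total) / grand total for each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2x3 table: two groups (rows), three response categories (columns).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]

expected = expected_table(observed)
for row in expected:
    print(row)
```

Both rows have the same total here, so both get the same expected counts; with unequal row totals the expected rows would be proportional rather than identical.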
Degrees of freedom for homogeneity/independence (df = (r − 1)(c − 1))
For an r×c contingency table, df equals (number of rows − 1) times (number of columns − 1).
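A small sketch of the df rule. The function name is hypothetical, and it assumes the table passed in contains only the cell counts: a common slip is to count the "Total" row or column as a category.

```python
def two_way_df(table):
    """Return (rows - 1) * (columns - 1) for a table of cell counts (no totals)."""
    n_rows = len(table)
    n_cols = len(table[0])
    if any(len(row) != n_cols for row in table):
        raise ValueError("every row must have the same number of columns")
    return (n_rows - 1) * (n_cols - 1)


# Hypothetical 2x3 table of counts, totals excluded.
observed = [
    [20, 30, 50],
    [30, 30, 40],
]
df = two_way_df(observed)
print(df)
```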
Chi-square test for independence
A chi-square test using one sample where two categorical variables are measured on each individual to assess whether the variables are associated.
Association (categorical variables)
A relationship where the distribution of one categorical variable differs across the categories of another (i.e., variables are not independent).
Standardized residual ((O − E)/√E)
A cell-by-cell measure of deviation whose square is that cell's chi-square contribution; a large positive value means more observed than expected, and a large negative value means fewer observed than expected.
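A sketch of the residual formula on this card, (O − E)/√E, using hypothetical counts. Unlike the contributions, the residuals keep their sign, which shows the direction of each deviation.

```python
import math


def standardized_residuals(observed, expected):
    """Return (O - E) / sqrt(E) for each category, as defined on this card."""
    return [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]


# Hypothetical GOF data: three categories.
observed = [90, 70, 40]
expected = [100.0, 60.0, 40.0]

residuals = standardized_residuals(observed, expected)
print(residuals)

# Squaring each residual recovers the chi-square contributions.
chi_sq = sum(r ** 2 for r in residuals)
print(chi_sq)
```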
Marginal totals (row/column totals)
Totals across rows and columns in a two-way table; used to compute expected counts under homogeneity/independence.
Chi-square vs two-proportion z link (χ² = z² in 2×2)
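For a 2×2 table, the chi-square statistic equals the square of the pooled two-proportion z statistic (χ² = z²), so the chi-square test and the two-sided z test give the same p-value. A one-sided z test has no chi-square equivalent, because χ² ignores the direction of the difference.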
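A numerical check of χ² = z² on made-up data. The sketch assumes the z statistic uses the pooled proportion and that no continuity correction is applied to χ²; the function names are hypothetical.

```python
import math


def pooled_two_prop_z(x1, n1, x2, n2):
    """Two-proportion z statistic with the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic (no continuity correction) for the same data as a 2x2 table."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    grand_total = n1 + n2
    total = 0.0
    for i in range(2):
        for j in range(2):
            e = row_totals[i] * col_totals[j] / grand_total
            total += (observed[i][j] - e) ** 2 / e
    return total


# Hypothetical data: 30 successes out of 50 in group 1, 20 out of 50 in group 2.
z = pooled_two_prop_z(30, 50, 20, 50)
chi_sq = chi_square_2x2(30, 50, 20, 50)
print(z ** 2, chi_sq)
```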
Procedure selection: homogeneity vs independence
Homogeneity uses multiple samples/populations (one response variable); independence uses one sample with two variables measured on each subject.
Counts vs percentages (common mistake)
The χ² formula must use counts for O and E; using proportions/percentages directly in the statistic is incorrect.
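A quick illustration of why proportions give the wrong statistic, using hypothetical data. Plugging proportions into the formula produces a value that is too small by a factor of n, so the test would almost never reject.

```python
# Hypothetical GOF data: three categories and a claimed distribution.
observed_counts = [90, 70, 40]
claimed_props = [0.5, 0.3, 0.2]
n = sum(observed_counts)

# Correct: observed and expected are both counts.
expected_counts = [n * p for p in claimed_props]
chi_sq_counts = sum(
    (o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts)
)

# Incorrect: observed and expected are both proportions.
observed_props = [o / n for o in observed_counts]
chi_sq_props = sum(
    (op - p) ** 2 / p for op, p in zip(observed_props, claimed_props)
)

print(chi_sq_counts)  # correct statistic
print(chi_sq_props)   # too small by a factor of n
```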
Chi-square p-value
The probability, under the null model, of getting a chi-square statistic at least as large as the observed χ² (area to the right under the chi-square curve).
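A standard-library sketch of the right-tail area, offered as an illustration rather than a replacement for a calculator or statistics package. It assumes the usual identity that the chi-square upper tail equals the regularized upper incomplete gamma function Q(df/2, χ²/2), evaluated here with a simple power series; the function name is hypothetical and the series is only practical for moderate χ² values.

```python
import math


def chi_square_p_value(chi_sq, df):
    """Right-tail area P(X >= chi_sq) for a chi-square distribution with df degrees of freedom."""
    if chi_sq <= 0:
        return 1.0
    a, x = df / 2.0, chi_sq / 2.0
    # Series for the lower incomplete gamma function:
    # gamma(a, x) = x^a * e^(-x) * sum_{k>=0} x^k / (a (a+1) ... (a+k))
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > 1e-15 * abs(total):
        k += 1
        term *= x / (a + k)
        total += term
    # Divide by Gamma(a) to get the regularized lower tail (the area to the left).
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    # The p-value is the area to the right.
    return max(0.0, 1.0 - lower)


# Hypothetical result: chi-square = 4.0 with df = 2.
p_value = chi_square_p_value(4.0, 2)
print(p_value)
```

For df = 2 the right-tail area has the closed form e^(−χ²/2), which gives a handy check on the sketch.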