Chapter 15: The Chi-Square Statistic: Tests for Goodness of Fit and Independence
Chapter Overview
- This chapter covers the chi-square statistic and its applications in analyzing data through tests for goodness of fit and tests of independence.
Learning Outcomes
- Explain when a chi-square test is appropriate.
- Test hypotheses about the shape of a distribution using the chi-square goodness of fit.
- Test hypotheses about the relationship between variables using the chi-square test of independence.
- Evaluate the effect size using the phi coefficient or Cramér’s V.
- Prerequisite: proportions (for a math review, see Appendix A).
- Prerequisite: frequency distributions (see Chapter 2).
15-1: Introduction to Chi-Square: The Test for Goodness of Fit
- Statistical tests previously discussed aim to test hypotheses regarding population parameters; these tests are categorized as parametric tests.
Characteristics of Parametric Tests
- Share several fundamental assumptions:
- Normal distribution in the population.
- Homogeneity of variance across the population.
- Requirement of a numerical score for each individual.
- Data must stem from an interval or ratio scale.
Nonparametric Tests
- If research data does not meet the requirements for parametric tests, nonparametric tests serve as an alternative:
- Do not specify hypotheses based on a specific population parameter.
- Fewer assumptions about population distribution (termed as "distribution-free" tests).
- The two chi-square tests in this chapter (goodness of fit and independence) are examples of nonparametric tests.
Classification and Measurement Scale
- Nonparametric tests typically classify participants into categories using nominal or ordinal scales.
- Data for nonparametric tests may include frequency counts (e.g., number of individuals from different political affiliations).
- Nonparametric methods are useful when interval or ratio measures cannot be achieved.
- Situations arise where it is impossible to obtain precise scores; categorical classification is then employed.
Selection of Statistical Procedure
- Choice of statistical analyses rests primarily on the measurement level:
- Chi-square tests, t-tests, or ANOVA may be selected for hypothesis testing concerning relationships.
- Evaluation of relationship strength may also involve the use of $r^2$.
The Chi-Square Test for Goodness of Fit
- The chi-square test evaluates research questions regarding proportions or relative frequencies within a distribution.
- It utilizes sample data to test hypotheses regarding the shape or proportions of the population’s distribution.
- The test assesses how well the sample data proportions align with the expected proportions postulated for the population.
Null Hypothesis for the Goodness-of-Fit Test
- The null hypothesis (H₀) defines the proportion or percentage distribution of the population across each category.
- Main rationale for the null hypothesis includes:
- No preference among categories (implying equal proportions).
- No difference from a known population (the proportions for the population under study match those of another, known population).
- The alternative hypothesis (H₁) posits that the population distribution deviates from the specified proportions in H₀, often concisely presented as "Not H₀".
Data Requirements for Goodness-of-Fit Test
- Sample mean or SS calculation is not necessary:
- Individuals classified based on categories (e.g., grades, frequency of exercise).
- Each measurement category has documented observed frequencies.
- Observations must be mutually exclusive (an individual belongs to one category).
Expected Frequencies
- The goodness-of-fit test contrasts observed frequencies against expected frequencies based on the null hypothesis.
- For each category, the expected frequency is $f_e = pn$, where $p$ is the proportion specified in the null hypothesis and $n$ is the sample size.
- Expected frequencies represent an idealized sample distribution that matches the null hypothesis exactly.
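The computation of expected frequencies can be sketched in a few lines of Python. This is a minimal illustration, not part of the chapter: the function name `expected_frequencies` and the sample values (four categories, n = 50, no-preference hypothesis) are assumptions chosen for the example.

```python
def expected_frequencies(proportions, n):
    """Return f_e = p * n for each category, given the H0 proportions."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("H0 proportions must sum to 1")
    return [p * n for p in proportions]


# Hypothetical example: H0 states no preference among 4 categories, n = 50.
f_e = expected_frequencies([0.25, 0.25, 0.25, 0.25], 50)
print(f_e)  # [12.5, 12.5, 12.5, 12.5]
```

Note that the expected frequencies here are not whole numbers, even though observed frequencies always are.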
The Chi-Square Statistic
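- Denoted by $\chi^2$, the chi-square statistic is computed as:
- $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
- $f_o$ is the observed frequency for a category.
- $f_e$ is the expected frequency for that category.
- The sum is taken over all categories.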
Chi-Square Value Implications
- The chi-square value signifies the discrepancy magnitude between observed data and expected frequencies:
- A smaller chi-square value suggests a closer alignment to the null hypothesis, indicating a good fit.
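To see how the size of chi-square tracks the discrepancy between observed and expected frequencies, here is a small sketch. The function name `chi_square` and both sets of observed counts are hypothetical, and the statistic is assumed to be the usual sum of $(f_o - f_e)^2 / f_e$ over categories.

```python
def chi_square(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical data: n = 50, four categories, H0 predicts 12.5 per category.
expected = [12.5, 12.5, 12.5, 12.5]
close_fit = chi_square([12, 13, 13, 12], expected)  # sample close to H0
poor_fit = chi_square([20, 5, 15, 10], expected)    # sample far from H0

print(round(close_fit, 2))  # 0.08
print(round(poor_fit, 2))   # 10.0
```

The first sample barely departs from the null hypothesis and yields a value near zero; the second departs substantially and yields a much larger value.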
Chi-Square Distribution and Degrees of Freedom
Null Hypothesis Discrimination Criteria
- The null hypothesis is retained (we fail to reject it) when the discrepancy between observed and expected frequencies is small.
- Rejection occurs with substantial discrepancies.
Characteristics of Chi-Square Distribution
- The chi-square distribution is the set of chi-square values for all possible random samples when H₀ is true.
- All chi-square values are $\geq 0$, and when H₀ holds, chi-square values should remain low.
- The distribution exhibits positive skewness and is a family of distributions influenced by degrees of freedom (df).
Degrees of Freedom for Goodness-of-Fit
- Formula for degrees of freedom:
- $df = C - 1$
- Where C represents the number of categories.
- df is unaffected by sample size (n).
Locating the Critical Region for a Chi-Square Test
- Steps to find the critical region include:
- Setting significance level (alpha).
- Using a chi-square distribution table listed in Appendix B, identifying critical chi-square values based on:
- Degrees of freedom (df) in the first column.
- Significance level (alpha) in the top row.
Critical Value Representation
- The critical values of chi-square are found within the table body, facilitating decision making regarding the null hypothesis.
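The table lookup and decision rule can be sketched as follows. This is only an illustration: the dictionary holds a handful of standard chi-square critical values rounded to two decimals (it is not a copy of Appendix B), and the names `CRITICAL_VALUES`, `goodness_of_fit_df`, and `decide` are assumptions made for the example.

```python
# Standard chi-square critical values, rounded to two decimals.
# Outer key: df (table row); inner key: alpha (table column).
CRITICAL_VALUES = {
    1: {0.05: 3.84, 0.01: 6.63},
    2: {0.05: 5.99, 0.01: 9.21},
    3: {0.05: 7.81, 0.01: 11.34},
    4: {0.05: 9.49, 0.01: 13.28},
}


def goodness_of_fit_df(num_categories):
    """df = C - 1; note that sample size plays no role."""
    return num_categories - 1


def decide(chi2, df, alpha=0.05):
    """Reject H0 when the statistic falls in the critical region."""
    critical = CRITICAL_VALUES[df][alpha]
    return "reject H0" if chi2 > critical else "fail to reject H0"


df = goodness_of_fit_df(4)             # four categories -> df = 3
print(decide(9.36, df))                # reject H0
print(decide(9.36, df, alpha=0.01))    # fail to reject H0
```

The same statistic can be significant at one alpha level and not at a stricter one, which is why the alpha level is set before the critical region is located.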
Reporting Results of Chi-Square
- The results should detail significant differences among categories and incorporate summary elements:
- Presenting results as $\chi^2$ with degrees of freedom, sample size (n), and the test statistic.
- Example: $\chi^2(3, n = 50) = 9.36, p < .05$.
- Include observed frequencies for each category when applicable.
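As a small convenience, the reporting format shown above can be generated from the pieces of the test. The helper below is a hypothetical sketch (the name `report_chi_square` and its parameters are not from the chapter), and it assumes results are reported only as above or below the alpha level rather than with an exact p-value.

```python
def report_chi_square(df, n, chi2, significant, alpha=0.05):
    """Format a result such as: $\\chi^2(3, n = 50) = 9.36, p < .05$"""
    sign = "<" if significant else ">"
    alpha_text = f"{alpha:.2f}".lstrip("0")  # 0.05 -> ".05"
    return rf"$\chi^2({df}, n = {n}) = {chi2:.2f}, p {sign} {alpha_text}$"


print(report_chi_square(3, 50, 9.36, significant=True))
# $\chi^2(3, n = 50) = 9.36, p < .05$
```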
Goodness of Fit versus Single-Sample t Test
- Comparison of nonparametric chi-square tests against parametric t tests:
- The chi-square test for goodness of fit operates independently of population distribution assumptions.
- Conversely, the t test presumes a normal population, necessitating numerical scores, and evaluates hypotheses concerning population means.
Test Similarities
- Both tests use a single sample to infer conditions regarding the single population.
- The measurement level dictates the choice of testing method:
- Numerical scores (interval/ratio) imply usage of t test.
- Non-numerical classifications (ordinal or nominal) necessitate use of chi-square tests based on proportions or percentages.
Learning Check 1
- Question: Expected frequencies in a chi-square test…
- are always whole numbers.
- can contain fractions or decimal values.
- can contain both positive and negative values.
- can contain fractions and negative numbers.
Learning Check 1 – Answer
- Correct Option: can contain fractions or decimal values.
Learning Check 2
- Determine True or False:
- In a chi-square test, the observed frequencies are always whole numbers. (T/F)
- A large value for chi-square will tend to retain the null hypothesis. (T/F)
Learning Check 2 – Answers
- True: Observed frequencies are counts, hence no fractions.
- False: Large chi-square values suggest considerable disparity from null hypothesis predictions.
15-3: The Chi-Square Test for Independence
- Chi-square can also be used to test whether a relationship exists between two variables:
- Each participant is classified on both variables, and the resulting counts are arranged in a matrix.
- This design may be derived from either experimental or non-experimental frameworks.
- Frequency data from the sample tests the evidence for the relationship between two variables across the population using a two-dimensional frequency distribution matrix.
Null Hypothesis for Test of Independence
- States that the two variables are independent, indicating that no inherent relationship exists.
- Two interpretations of this hypothesis:
- Single sample, with each individual measured on two variables: H₀ states that there is no relationship between the two variables.
- Separate samples, each representing a different population: H₀ states that the proportions in the distribution are the same for all populations.
Observed and Expected Frequencies
- Observed frequencies are the counts of sample individuals falling in each cell of the matrix.
- Expected frequencies are derived from the null hypothesis and computed from the row and column totals, so that each row shows the same proportions as the column totals.
Computing Expected Frequencies
- Utilizes a straightforward method for any cell in the frequency distribution matrix:
- $f_e = \frac{f_c f_r}{n}$
- Where:
- $f_c$ = column frequency total.
- $f_r$ = row frequency total.
- $n$ = total number of individuals in the sample.
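The expected-frequency formula can be applied to every cell at once. The sketch below assumes the observed counts are stored as a list of rows; the function name `expected_matrix` and the 2 × 2 data are hypothetical.

```python
def expected_matrix(observed):
    """Return f_e = (f_r * f_c) / n for every cell of the matrix."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[fr * fc / n for fc in col_totals] for fr in row_totals]


# Hypothetical 2 x 2 data: row totals 50 and 50, column totals 40 and 60.
observed = [[30, 20],
            [10, 40]]
print(expected_matrix(observed))  # [[20.0, 30.0], [20.0, 30.0]]
```

Both rows of the expected matrix show the same 40:60 split as the column totals, which is what independence predicts.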
Chi-Square Statistic and Degrees of Freedom for Independence
- The chi-square statistic is computed exactly as in the goodness-of-fit test:
- $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
- Degrees of freedom for the test of independence are determined by:
- $df = (R - 1)(C - 1)$
- Where R is the count of rows and C the number of columns.
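Putting the statistic and the degrees of freedom together for a test of independence might look like the sketch below. The function name `chi_square_independence` and the data are hypothetical; expected frequencies are assumed to come from row total × column total ÷ n.

```python
def chi_square_independence(observed):
    """Return (chi-square, df) for a matrix of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for r, row in enumerate(observed):
        for c, fo in enumerate(row):
            fe = row_totals[r] * col_totals[c] / n  # expected frequency
            chi2 += (fo - fe) ** 2 / fe

    df = (len(row_totals) - 1) * (len(col_totals) - 1)  # (R - 1)(C - 1)
    return chi2, df


# Hypothetical 2 x 2 data.
chi2, df = chi_square_independence([[30, 20],
                                    [10, 40]])
print(round(chi2, 2), df)  # 16.67 1
```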
Summary Steps for Chi-Square Test for Independence
- Standard procedure encompasses four steps:
- State hypotheses and establish alpha level.
- Identify the critical region.
- Compute the test statistic.
- Reach a conclusion based on results.
15-4: Effect Size and Assumptions for Chi-Square Tests
- A significant chi-square test indicates that the discrepancy between the data and the null hypothesis is unlikely to be due to chance alone.
- Significance tests hinge on both treatment effect magnitude and sample size.
- Each hypothesis test result should be accompanied by an appropriate effect size measure.
Cohen’s w
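- Cohen’s w measures effect size for both chi-square tests:
- $w = \sqrt{\sum \frac{(P_o - P_e)^2}{P_e}}$
- The $P_o$ values are the observed proportions in the sample.
- The $P_e$ values are the expected proportions specified by the null hypothesis.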
- As per Cohen, effect size standards are as follows:
- 0.10 indicates a small effect.
- 0.30 indicates a medium effect.
- 0.50 indicates a large effect.
- Cohen’s w does not depend on sample size; it is computed only from the sample proportions and the proportions specified by the null hypothesis.
- w and chi-square are algebraically related: $\chi^2 = n w^2$, so $w = \sqrt{\chi^2 / n}$.
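A short sketch can make the link between w and chi-square concrete. It assumes the standard definition of Cohen's w, the square root of the sum of $(P_o - P_e)^2 / P_e$ over categories; the function name `cohens_w` and the data are hypothetical.

```python
import math


def cohens_w(observed_props, expected_props):
    """Cohen's w from observed and H0 proportions (standard definition)."""
    return math.sqrt(
        sum((po - pe) ** 2 / pe for po, pe in zip(observed_props, expected_props))
    )


# Hypothetical data: observed counts 20, 5, 15, 10 (n = 50), H0 = no preference.
n = 50
observed_props = [20 / n, 5 / n, 15 / n, 10 / n]
expected_props = [0.25, 0.25, 0.25, 0.25]

w = cohens_w(observed_props, expected_props)
print(round(w, 3))           # 0.447
print(round(n * w ** 2, 2))  # 10.0, the chi-square value for these counts
```

Doubling every count would leave w unchanged but double chi-square, which illustrates why w is useful as a measure that does not grow with sample size.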
Phi-Coefficient and Cramér’s V
- The phi coefficient (Φ) measures the strength of the relationship for 2 × 2 matrices:
- $\Phi = \sqrt{\frac{\chi^2}{n}}$
- $\Phi^2$ is the proportion of variance accounted for.
- For larger matrices, Cramér’s V, a modification of the phi coefficient, measures effect size:
- $V = \sqrt{\frac{\chi^2}{n \cdot df^*}}$
- $df^*$ is the smaller of $(R - 1)$ or $(C - 1)$.
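Both effect-size measures follow directly from a chi-square value. The sketch below assumes the standard formulas, phi as the square root of $\chi^2 / n$ and Cramér's V as the square root of $\chi^2 / (n \cdot df^*)$; the function names and the numeric inputs are hypothetical.

```python
import math


def phi_coefficient(chi2, n):
    """Phi for a 2 x 2 matrix: sqrt(chi-square / n)."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, rows, cols):
    """Cramer's V: sqrt(chi-square / (n * df*)), df* = min(R - 1, C - 1)."""
    df_star = min(rows - 1, cols - 1)
    return math.sqrt(chi2 / (n * df_star))


# Hypothetical 2 x 2 result: chi-square = 16.67 with n = 100.
print(round(phi_coefficient(16.67, 100), 3))  # 0.408

# Hypothetical 3 x 4 result: chi-square = 20.0 with n = 100, so df* = 2.
print(round(cramers_v(20.0, 100, rows=3, cols=4), 3))  # 0.316
```

For a 2 × 2 matrix, $df^* = 1$, so Cramér's V and phi give the same value.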
Standards for Interpreting Cramér’s V
- Interpretive standards for Cramér’s V, by value of df*:
- For df* = 1: small effect 0.10, medium effect 0.30, large effect 0.50
- For df* = 2: small effect 0.07, medium effect 0.21, large effect 0.35
- For df* = 3: small effect 0.06, medium effect 0.17, large effect 0.29
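The standards above can be encoded as a small lookup. This sketch treats each listed value as the lower bound of its label, which is an assumption about how the benchmarks are applied rather than something the chapter states; the names `V_STANDARDS` and `interpret_cramers_v` are hypothetical.

```python
# df* -> (small, medium, large) benchmarks for Cramer's V.
V_STANDARDS = {
    1: (0.10, 0.30, 0.50),
    2: (0.07, 0.21, 0.35),
    3: (0.06, 0.17, 0.29),
}


def interpret_cramers_v(v, df_star):
    """Label V using the benchmarks as lower bounds (an assumption)."""
    small, medium, large = V_STANDARDS[df_star]
    if v >= large:
        return "large"
    if v >= medium:
        return "medium"
    if v >= small:
        return "small"
    return "below small"


print(interpret_cramers_v(0.408, 1))  # medium
print(interpret_cramers_v(0.36, 2))   # large
```

The same numeric value of V can earn a different label depending on df*, so the matrix dimensions need to be known before interpreting it.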
Assumptions and Restrictions for Chi-Square Tests
- Specific conditions must be adhered to in order to apply a chi-square test for goodness of fit or independence:
- Independence of Observations: Observed frequencies must derive from different individuals.
- Expected Frequencies' Size: Performing chi-square tests is inadvisable if any cell's expected frequency is below 5.
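The expected-frequency restriction is easy to check before running a test. The helper below is a hypothetical sketch (the name `small_expected_cells` and the example matrix are assumptions); it lists every cell whose expected frequency falls below 5.

```python
def small_expected_cells(expected, minimum=5):
    """Return (row, column, f_e) for each cell with f_e below the minimum."""
    return [
        (r, c, fe)
        for r, row in enumerate(expected)
        for c, fe in enumerate(row)
        if fe < minimum
    ]


# Hypothetical expected frequencies: the second row is too sparse.
expected = [[20.0, 30.0],
            [2.0, 3.0]]
print(small_expected_cells(expected))  # [(1, 0, 2.0), (1, 1, 3.0)]
```

An empty list means the restriction is satisfied. The check cannot verify independence of observations, which depends on how the data were collected.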
Learning Check 3
- Question: A basic assumption for a chi-square hypothesis test is ______.
- The population distributions must be normal.
- The scores necessitate an interval or ratio scale.
- The observations must remain independent.
- None of the above options are assumptions for chi-square.
Learning Check 3 – Answer
- Correct Option: The observations must remain independent.
Learning Check 4
- Determine True or False Statements:
- The value of df for a chi-square test does not rely on sample size (n). (T/F)
- A positive chi-square statistic indicates a positive correlation between the two variables. (T/F)
Learning Check 4 – Answers
- True: The df value depends only on the number of categories (goodness of fit) or on the number of rows and columns (independence), not on n.
- False: Chi-square is never negative, so its sign carries no information about the direction of the relationship between the variables.
Example of SPSS Output for Chi-Square Test for Independence
- Crosstabulation Example:
- VAR00002 * VAR00003 counts presented in a 2x2 matrix format, detailing observed values across categories.
- Chi-Square Test Values:
- The results indicate statistical significance relative to the null hypothesis, with p-values reported, and note whether every cell has an expected count of at least 5.
Conclusion
- The chapter covers the chi-square tests for goodness of fit and for independence, including how to state the hypotheses, compute the statistic, measure effect size, and report and interpret the results.