Chi-Square Test Notes

Session Overview

  • Focus on chi-square tests and their applications within quantitative research.

  • Topics covered in this workshop:

    • Goodness-of-fit test (one-way)

    • Test for independence (two-way)

    • Workshop tasks involving chi-square tests.

Associations in Research

  • Types of Designs:

    • Correlational Designs: Test relationships between two variables.

    • Parametric Tests:

      • Pearson correlation (for scale data)

    • Non-parametric Tests:

      • Spearman or Kendall-Tau correlation (for ordinal data)

    • Experimental Designs: Test for differences.

    • Within Subjects IV:

      • Paired t-test (parametric)

      • Wilcoxon test (non-parametric)

    • Between Subjects IV:

      • Independent t-test (parametric)

      • Mann-Whitney test (non-parametric)

Levels of Measurement

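  • Scale (Interval/Ratio): Numbers indicate the size of the difference between observations; equal intervals represent equal differences.

    • Example: The difference between scores of 57 and 50 is the same as the difference between 45 and 38.

  • Ordinal: Numbers indicate rank order (more or less of a measure), but the intervals between them are not necessarily equal.

    • Example: A happiness rating of 7 is happier than 4, which is happier than 2, but not necessarily by equal amounts.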

  • Nominal (Categorical): Numbers serve as labels for categories without numerical value.

    • Example: Gender represented as labels (1 = male, 2 = female).

    • Example: Eye color (1 = brown, 2 = blue, 3 = green).

Introduction to Chi-Square Tests (χ²)

  • Used to examine relationships between nominal (categorical) variables.

  • Purpose: To determine if observed frequencies significantly differ from expected frequencies.

  • Commonly applied in:

    • Categorical survey responses

    • Experimental conditions

    • Behavioral studies involving count data.

  • Question Tested: Could the observed frequencies plausibly have arisen by chance, or do they indicate an effect?

Types of Chi-Square Tests (χ²)

  1. Goodness-of-Fit Test (One-way):

    • Compares observed data against an expected distribution.

    • Example: Analyze racial proportions of students at Cambridge University vs. general population proportions.

  2. Test for Independence (Two-way):

    • Examines the relationship between two categorical variables.

    • Example: Test whether the proportion of smokers differs between genders (male/female).

    • Can handle variables with more than two levels (e.g., age categories against marital status).

    • Can also be framed as a test of difference: does a categorical DV (dependent variable) differ across the levels of a categorical IV (independent variable)?

Assumptions of Chi-Square Tests (χ²)

  • Data must be frequency counts of categories (nominal, or ordinal treated as categories), not percentages or means.

  • Data should be independent: Each participant contributes to only one cell in the contingency table.

  • Expected Frequency Requirements:

    • At least 80% of cells should have an expected frequency of 5 or more.

    • In larger tables, up to 20% of cells may have an expected frequency below 5, but no cell should have an expected frequency below 1.

    • If these requirements are not met, use Fisher's exact test (only applicable to 2x2 designs) or collapse some cells together.
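The expected-frequency rules above can be checked mechanically. A minimal Python sketch, assuming the expected counts are already available as a flat list; the function name check_expected_counts and the example counts are hypothetical and not taken from the workshop materials:

```python
def check_expected_counts(expected):
    """Apply the two rules of thumb for expected cell frequencies."""
    cells = list(expected)
    # Share of cells whose expected frequency is at least 5.
    share_at_least_5 = sum(1 for e in cells if e >= 5) / len(cells)
    return {
        "share_at_least_5": share_at_least_5,
        "min_expected": min(cells),
        # Both rules must hold: >= 80% of cells at 5 or more, no cell below 1.
        "ok": share_at_least_5 >= 0.8 and min(cells) >= 1,
    }

# Hypothetical expected counts, invented for illustration only.
print(check_expected_counts([12.0, 8.5, 6.1, 5.4, 3.0]))  # 4 of 5 cells >= 5
print(check_expected_counts([12.0, 8.5, 0.7, 5.4, 3.0]))  # one cell below 1
```

The first list satisfies both rules; the second fails because one expected count is below 1, which is the situation where collapsing cells or Fisher's exact test would be considered.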

Limitations of Chi-Square Tests (χ²)

  • Limited to two categorical variables (one usually treated as the response variable).

  • More than two categorical variables require log-linear analysis.

  • Cannot analyze continuous (scale) data unless it is first grouped into categories.

  • Do not measure the strength or direction of relationships.

  • Sensitive to sample size: with large samples, even very small effects can reach statistical significance.

  • Generally low statistical power, making it hard to detect true effects.
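The sensitivity to sample size can be illustrated with a small sketch. The counts below are hypothetical (the same 52% / 48% split at two sample sizes, tested against a 50:50 expectation), and chi_square is an illustrative helper name:

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: identical proportions, N = 100 versus N = 10,000.
small = chi_square([52, 48], [50, 50])
large = chi_square([5200, 4800], [5000, 5000])

print(round(small, 2))  # 0.16
print(round(large, 2))  # 16.0
```

With df = 1 the critical value at the .05 level is 3.84, so the identical 4-point split is non-significant at N = 100 but significant at N = 10,000. The statistic grows in proportion to the sample size even though the proportions are unchanged.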

Goodness-of-Fit Test

  • Usage: Compares observed data distribution to an expected theoretical distribution of one variable.

  • Examines the number of cases in each level of a variable against expected frequencies under the null hypothesis.

Goodness-of-Fit Test Process
  1. Hypotheses:

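    • Null (H0): Observed frequencies match the expected frequencies (with a uniform expectation, the number of cases in each category is equal).

    • Alternative (H1): Observed frequencies differ significantly from the expected frequencies.

  2. Data Collection: Gather observed values and determine expected frequencies from the theoretical prediction or, if there is none, assume a uniform distribution.

  3. Statistical Calculation: Compute the chi-square statistic, then find the probability (p-value) of obtaining a value at least that large under the null hypothesis, with df = k - 1 for k categories. The statistic is: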
    \chi^2 = \sum \frac{(O - E)^2}{E}
    where O = observed frequency, E = expected frequency.

Worked Example – Goodness-of-Fit Test
  • Scenario: A poll ahead of a presidential election asks whether voters support Dale or Beck.

  • Poll Results: Of 100 respondents, 58 supported Beck and 42 supported Dale.

  • Question raised: Is this difference large enough to predict Beck's victory?

  • Predictions: Null hypothesis assumes equal preference (50:50) in the population.

  • Calculate chi-square statistic for observed votes vs. expectations (50 for each).

  • Contingency Table:

    • Votes for Beck: Observed 58 (Expected 50)

    • Votes for Dale: Observed 42 (Expected 50)

Table 1: Contingency table showing expected and observed voting preferences.
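The calculation for this poll can be sketched in plain Python. The helper names chi_square and p_value_df1 are illustrative; the p-value uses the identity that, for df = 1 only, the upper-tail probability equals erfc(sqrt(chi2 / 2)):

```python
import math

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(chi2):
    """Upper-tail probability of a chi-square value (valid for df = 1 only)."""
    return math.erfc(math.sqrt(chi2 / 2))

observed = [58, 42]  # Beck, Dale
expected = [50, 50]  # equal preference under the null hypothesis

chi2 = chi_square(observed, expected)  # 64/50 + 64/50
p = p_value_df1(chi2)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # chi2 = 2.56, p = 0.110
```

At the conventional .05 level this p-value is not significant, so on this calculation the poll alone does not justify predicting a Beck victory.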

Conducting the Test in JASP
  • In JASP, run the test by selecting Frequencies > Multinomial test.

  • Conclusion Interpretation:

    • Compare the p-value reported for the chi-square statistic against the significance level (e.g., p < 0.05 is considered significant).

More than Two Levels in Goodness-of-Fit
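  • Scenario expanded: The ballot now has 5 response options (four candidates plus "No vote"), with the following votes recorded:

    • Votes: Dale 35, Beck 47, Palmer 5, Bartlet 2, No vote 11.

  • Null hypothesis: No voter preference (votes are spread evenly across the 5 options).

  • Degrees of freedom: df = k - 1 = 4 for k = 5 categories. Expected frequencies follow a uniform distribution (100 / 5 = 20 per category).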
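A sketch of the five-category calculation, again in plain Python. The helper names are illustrative, and p_value_df4 relies on the closed form exp(-x/2) * (1 + x/2), which holds for df = 4 only:

```python
import math

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df4(chi2):
    """Upper-tail probability of a chi-square value (valid for df = 4 only)."""
    half = chi2 / 2
    return math.exp(-half) * (1 + half)

votes = {"Dale": 35, "Beck": 47, "Palmer": 5, "Bartlet": 2, "No vote": 11}
observed = list(votes.values())
n = sum(observed)                               # 100
expected = [n / len(observed)] * len(observed)  # 20 per category

chi2 = chi_square(observed, expected)
df = len(observed) - 1
p = p_value_df4(chi2)
print(f"chi2({df}) = {chi2:.1f}")  # chi2(4) = 79.2
print(p < 0.001)                   # True
```

The statistic is far beyond the .05 critical value for df = 4 (9.49), so the hypothesis of no voter preference would be rejected.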

Test for Independence

  • Usage: Explore relationships between two categorical variables or the effect of a categorical independent variable (IV) on a categorical dependent variable (DV).

  • Each variable can possess two or more categories.

  • Objective: Compare observed frequencies against expected values when there’s no association between the two variables.

Test for Independence Process
  1. Hypotheses:

    • Null (H0): No association between variables.

    • Alternative (H1): Variables are related.

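  2. Data Collection: Record observed frequencies in a contingency table and derive the expected frequency for each cell from the marginal totals (row total × column total / N).

  3. Calculation: Compute chi-square statistic using:
    \chi^2 = \sum \frac{(O - E)^2}{E}
    and determine the probability of obtaining a value at least this extreme by chance if the null hypothesis is true, with df = (rows - 1) × (columns - 1).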

Worked Example – Test for Independence
  • Scenario: Police line-ups where victims may incorrectly identify suspects based on clothing color.

  • Variables: Outcome (correct ID/wrong ID) and clothing color (same/different).

Table of Results:

  • Observed Frequencies: 24 correct ID and 61 wrong ID for same clothes; 23 correct ID and 37 wrong ID for different clothes.
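The line-up data can be worked through in a short Python sketch. It assumes expected counts are derived from the marginal totals (row total × column total / N) and applies no continuity correction; the closed-form p-value via erfc is valid only because df = 1 here:

```python
import math

# Observed counts from the line-up example.
# Rows: same clothes, different clothes; columns: correct ID, wrong ID.
observed = [[24, 61],
            [23, 37]]

row_totals = [sum(row) for row in observed]        # [85, 60]
col_totals = [sum(col) for col in zip(*observed)]  # [47, 98]
n = sum(row_totals)                                # 145

# Expected count under independence: row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(row_totals) - 1) * (len(col_totals) - 1)  # 1

# Upper-tail p-value for df = 1 (no continuity correction applied).
p = math.erfc(math.sqrt(chi2 / 2))

print(f"chi2({df}, N = {n}) = {chi2:.2f}, p = {p:.3f}")
# chi2(1, N = 145) = 1.64, p = 0.201
```

On this calculation the association between clothing color and identification accuracy is not significant at the .05 level.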

Reporting Results
  • Report final results in APA style: present observed and expected counts in a contingency table, and give the chi-square value, degrees of freedom, sample size, and p-value in the text.

  • Example non-significant report: "There was no significant effect of line-up type…"

  • Example significant report: "There was a significant effect…"
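The statistical part of an APA-style sentence can be assembled with a small helper. This is a sketch under assumptions: format_chi_square_apa is a hypothetical name, the input values are invented for illustration, and the layout follows the common APA convention of two decimals for the statistic and no leading zero on p:

```python
def format_chi_square_apa(chi2, df, n, p):
    """Format a result as chi-square(df, N = n) = value, p = value."""
    if p < 0.001:
        p_text = "p < .001"
    else:
        # APA drops the leading zero for values that cannot exceed 1.
        p_text = "p = " + f"{p:.3f}".lstrip("0")
    return f"χ²({df}, N = {n}) = {chi2:.2f}, {p_text}"

# Hypothetical values, for illustration only.
print(format_chi_square_apa(5.99, 2, 120, 0.05))
# χ²(2, N = 120) = 5.99, p = .050
print(format_chi_square_apa(21.3, 3, 200, 0.00009))
# χ²(3, N = 200) = 21.30, p < .001
```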

Summary of Chi-Square Tests

Goodness-of-Fit Test Approach:
  • Define hypotheses.

  • Determine expected frequencies.

  • Compute chi-square value and p-value.

  • Interpret the findings in the context of the research question.

Test for Independence Approach:
  • Formulate null and alternative hypotheses.

  • Create a contingency table with observed data.

  • Compute the chi-square statistic and p-value.

  • Interpret results and assess effect size.
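These notes do not name a specific effect-size measure. One common choice for a 2x2 table is the phi coefficient, which equals sqrt(chi2 / N) and coincides with Cramér's V in the 2x2 case. A minimal sketch, assuming phi is an acceptable measure here and using the line-up counts; phi_coefficient is an illustrative helper name:

```python
import math

def phi_coefficient(table):
    """Phi for a 2x2 table [[a, b], [c, d]]; equals sqrt(chi2 / N)."""
    (a, b), (c, d) = table
    # Product of the two row totals and the two column totals.
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return abs(a * d - b * c) / denom

# Observed counts from the line-up example
# (same / different clothes by correct / wrong ID).
phi = phi_coefficient([[24, 61], [23, 37]])
print(round(phi, 2))  # 0.11
```

Unlike the chi-square statistic, phi does not grow with sample size, which is why it is reported alongside the significance test.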

Practical Exercise

  • Conduct and interpret chi-square tests using statistical software JASP.

  • Write up results accurately following APA style guidelines.

Final Objectives

  • Conduct chi-square tests proficiently and interpret their statistical significance.

  • Report results in a format consistent with academic standards (APA format).