Statistics and Chi-Square Tests

Nonparametric Statistics

  • Definition: Nonparametric statistics do not assume a specific distribution for the data.
    • Commonly used when parametric assumptions cannot be met.

Parametric vs Nonparametric Tests

  • Parametric Tests:

    • Examples: t-tests, ANOVA
    • Assumed underlying data distribution: normal
    • Defined by parameters like means, standard deviations
  • Nonparametric Tests:

    • Examples: chi-square tests, sign test
    • Used for nominal data or when parametric assumptions are violated
    • Generally have less statistical power than parametric tests when the parametric assumptions hold

Chi-Square Goodness-of-Fit Test

  • Function: Analyzes frequency data to see if observed counts differ from expected counts.
  • Common use case: Determine preferences (e.g., preferred drinks like Coke vs Pepsi).
  • Measures how well observed frequencies match expected frequencies.

Assumptions of Chi-Square Test:

  1. One categorical independent variable with two or more levels.
  2. Dependent variable is a frequency or count.
    • If the dependent variable is continuous, prefer a parametric test.
  3. Independence of scores (no score dependency across cells).
  4. Groups are mutually exclusive.
  5. Minimum of 5 subjects/events expected per group; with smaller expected counts the chi-square approximation becomes unreliable.
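
Assumption 5 can be checked mechanically before running the test. The following is a minimal sketch; the function name `meets_minimum_expected` and the second set of expected counts are hypothetical, chosen only to illustrate a violation.

```python
def meets_minimum_expected(expected_counts, minimum=5):
    """Return True if every expected count is at least `minimum`."""
    return all(count >= minimum for count in expected_counts)


# Two groups with equal expected counts: assumption satisfied
print(meets_minimum_expected([12.5, 12.5]))  # True

# Hypothetical case: one group has an expected count below 5
print(meets_minimum_expected([22.0, 3.0]))   # False
```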

Choosing the Correct Analysis

  • Dependent Variable is Numerical:

    • Independent Variable Numerical: Linear regression
    • Independent Variable Categorical: t-test or ANOVA (based on levels in categorical variable)
  • Dependent Variable is Categorical:

    • Independent Variable Numerical: Logistic Regression
    • Independent Variable Categorical: Chi-square tests, sign test, binomial test.
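
The decision rules above amount to a two-way lookup on variable type. A small sketch of that lookup follows; the function name `choose_analysis` and the string labels are assumptions made for this illustration only.

```python
def choose_analysis(dependent, independent):
    """Map (dependent, independent) variable types to the analyses listed above."""
    table = {
        ("numerical", "numerical"): "linear regression",
        ("numerical", "categorical"): "t-test or ANOVA",
        ("categorical", "numerical"): "logistic regression",
        ("categorical", "categorical"): "chi-square, sign, or binomial test",
    }
    key = (dependent.lower(), independent.lower())
    if key not in table:
        raise ValueError("variable types must be 'numerical' or 'categorical'")
    return table[key]


print(choose_analysis("categorical", "categorical"))
print(choose_analysis("numerical", "categorical"))
```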

Example 1: Chi-Square Test with Divers

  • Question: Among male deep-sea divers, does the number of first-born boys differ from the number of first-born girls?

  • Hypotheses:

    • Null (H0): P(boy) = P(girl)
    • Alternative (HA): P(boy) ≠ P(girl)
  • Observed Data for 25 Divers:

    • Boys: 10, Girls: 15, Total: 25

Expected Values Calculation:

  • Calculate expected counts by multiplying probability by total participants:
    • Boys: $12.5$, Girls: $12.5$ (based on ratio of 0.5)
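
As a minimal sketch, the expected counts can be computed by multiplying each null-hypothesis probability by the sample size. The function name `expected_counts` is an assumption for this illustration.

```python
def expected_counts(total, probabilities):
    """Expected count per category = H0 probability * total sample size."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("H0 probabilities must sum to 1")
    return [p * total for p in probabilities]


# 25 divers, H0: P(boy) = P(girl) = 0.5
print(expected_counts(25, [0.5, 0.5]))  # [12.5, 12.5]
```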

Step 2: Calculate Chi-Square ($\chi^2$)

  • Formula: $\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$
  • For the divers:
    • $\chi^2 = \frac{(10-12.5)^2}{12.5} + \frac{(15-12.5)^2}{12.5} = 1.00$
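
The formula translates directly into a sum over categories. The sketch below assumes the observed and expected counts are given as equal-length lists; the function name `chi_square_statistic` is hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Divers example: 10 boys and 15 girls observed, 12.5 expected in each group
chi2 = chi_square_statistic([10, 15], [12.5, 12.5])
print(round(chi2, 2))  # 1.0
```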

Step 3: Degrees of Freedom

  • $df = \text{number of categories} - 1$
  • For this test, $df = 2 - 1 = 1$

Step 4: Assess Hypothesis

  • Critical value: With df = 1 and α = .05, the critical value of the chi-square distribution is 3.841.

  • If $\chi^2_{\text{calculated}} < \chi^2_{\text{critical}}$, fail to reject H0.

  • Compare: $1.00 < 3.841$, therefore fail to reject H0.
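
The same decision can be reached with a p-value instead of a table lookup. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so it applies only when df = 1. The function name and the α = .05 threshold implied by the 3.841 critical value are assumptions of this illustration.

```python
import math


def chi_square_p_value_df1(statistic):
    """Upper-tail p-value for a chi-square statistic with df = 1.

    For df = 1, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2))


CRITICAL_VALUE = 3.841  # chi-square critical value for df = 1, alpha = .05

statistic = 1.00  # divers example
p_value = chi_square_p_value_df1(statistic)
decision = "reject H0" if statistic > CRITICAL_VALUE else "fail to reject H0"
print(f"p = {p_value:.3f}, {decision}")  # p = 0.317, fail to reject H0
```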


Example 2: Ratios in Epilepsy Patients

  • Question: Does the ratio of right-handed (RH) to left-handed (LH) individuals among 100 patients with epilepsy differ from the ratio in the general population?

  • Hypotheses:

    • Null (H0): P(RH) = 0.95, P(LH) = 0.05
    • Alternative (HA): H0 is not true.
  • Observed Counts:

    • RH: 85; LH: 15
    • Total: 100

Step 2: Calculate Expected Values:

  • Expected counts (theoretical): RH = 0.95 × 100 = 95, LH = 0.05 × 100 = 5, based on the ratios in H0.

Step 3: Calculate Chi-Square ($\chi^2$)

  • $\chi^2 = \frac{(85-95)^2}{95} + \frac{(15-5)^2}{5} = 21.05$
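
Working through the two terms separately shows where the statistic comes from: nearly all of it is contributed by the left-handed cell, whose expected count is small.

```latex
\chi^2 = \frac{(-10)^2}{95} + \frac{(10)^2}{5}
       = \frac{100}{95} + \frac{100}{5}
       \approx 1.05 + 20.00
       = 21.05
```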

Step 4: Degrees of Freedom and Decision

  • $df = 2 - 1 = 1$

  • Critical value: With df = 1 and α = .05, the critical value is 3.841.

  • Since $21.05 > 3.841$, reject H0.
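
The whole of Example 2 can be reproduced in a few lines. This is a sketch under stated assumptions: the function name `goodness_of_fit` and the returned dictionary keys are hypothetical, and the p-value line is valid only because df = 1 here.

```python
import math


def goodness_of_fit(observed, h0_probabilities):
    """Return expected counts, chi-square statistic, and df for a goodness-of-fit test."""
    total = sum(observed)
    expected = [p * total for p in h0_probabilities]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return {"expected": expected, "statistic": statistic, "df": df}


# Epilepsy example: 85 RH and 15 LH observed; H0: P(RH) = 0.95, P(LH) = 0.05
result = goodness_of_fit([85, 15], [0.95, 0.05])

# Upper-tail p-value; this closed form holds only for df = 1
p_value = math.erfc(math.sqrt(result["statistic"] / 2))

print(round(result["statistic"], 2), result["df"])  # 21.05 1
print(result["statistic"] > 3.841)                   # True, so reject H0
```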

APA Style Results

  • Report: The ratio of right-handed to left-handed patients was significantly lower than the ratio observed in the general population: $\chi^2(1, N = 100) = 21.05$, $p < .01$.
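
A results string in this style can be assembled programmatically. The helper below is a hypothetical sketch, not an official APA tool. It assumes the common conventions of dropping the leading zero from p-values and reporting very small values as p < .001, which is a stricter bound than, and consistent with, the p < .01 reported above.

```python
import math


def format_apa_chi_square(statistic, df, n, p_value):
    """Format a chi-square result in an APA-like style (assumed conventions)."""
    if p_value < 0.001:
        p_text = "p < .001"
    else:
        # Drop the leading zero, e.g. 0.317 -> .317
        p_text = "p = " + f"{p_value:.3f}".lstrip("0")
    return f"χ²({df}, N = {n}) = {statistic:.2f}, {p_text}"


# Epilepsy example; the p-value formula is valid for df = 1 only
p_value = math.erfc(math.sqrt(21.05 / 2))
print(format_apa_chi_square(21.05, 1, 100, p_value))
```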

Example 3: Rats in a Maze

  • Question: Do rats prefer one door over the others?
  • The hypotheses and approach are similar to the previous examples, with the observed counts laid out in a grid.

Chi-Square Test of Independence

  • Used to determine if two categorical variables are independent (e.g., handedness and gender).

Example Setup

  1. Observed counts are provided; calculate each cell's expected count from the marginal totals: (row total × column total) / grand total.
  2. Follow the steps outlined in the previous examples for hypothesis testing.

Steps 1-3 Proceed Similarly

  • Compute $\chi^2$ and its degrees of freedom, $df = (\text{rows} - 1)(\text{columns} - 1)$, then compare against the critical values table to determine whether the results support H0 or HA.
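
The sketch below illustrates the test of independence on a 2 × 2 table. The counts are invented purely for illustration and do not come from these notes; the function name `independence_test` is likewise hypothetical.

```python
def independence_test(table):
    """Return expected counts, chi-square statistic, and df for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count for each cell = row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df


# Hypothetical counts: rows are one variable's levels, columns the other's
observed = [[30, 10], [20, 40]]
expected, statistic, df = independence_test(observed)
print(expected)                 # [[20.0, 20.0], [30.0, 30.0]]
print(round(statistic, 2), df)  # 16.67 1
```

With df = 1, the statistic for these made-up counts exceeds the 3.841 critical value, so H0 (independence) would be rejected.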