Lecture 10: Chi Square Test

Overview

  • In Lecture 9, we discussed hypothesis testing for mean and proportion in quantitative data.

  • This lecture introduces the Chi Square test for qualitative data.

Applications of Chi Square Test

  • Testing for a relationship between two categorical variables, for example:

    • Sex (male or female) and smoking habit (smoker or non-smoker).

    • Education level (university, secondary, primary) and religion (Christian, Catholic, Buddhist).

    • Sex (male, female) and vegetarian tendency.

Goodness of Fit Test

  • A statistical test of the null hypothesis that the observed data follow a specified probability distribution.

  • It compares the observed frequency in each category with the frequency expected under the null hypothesis.

  • Large differences between observed and expected frequencies are evidence against the null hypothesis.

  • Example: Testing the ratio of boys to girls among 2650 students in 2012 at a 1% significance level.

    • Observed Frequencies: Boys - 1600, Girls - 1050.

    • Null Hypothesis (H0): Boys and girls are equally represented (ratio 1:1).

    • Expected Frequencies under H0: Boys = 2650 / 2 = 1325, Girls = 2650 / 2 = 1325.

Chi Square Statistic Calculation

  • The formula: \[ \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} \]

  • Where:

    • \(O_i\) = observed frequency in group i.

    • \(E_i\) = expected frequency in group i.

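  • Under the null hypothesis, the test statistic approximately follows the Chi Square distribution with degrees of freedom equal to the number of categories minus 1.

  • Conclusion: If the calculated \(\chi^2\) exceeds the critical value at the chosen significance level, reject the null hypothesis.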
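
The calculation can be sketched in a few lines of Python for the boys and girls example above. This is a minimal illustration, not a full statistical routine: the function name chi_square_statistic is an assumption made here, and the critical value 6.635 (1% level, 1 degree of freedom) is read from a standard Chi Square table rather than computed.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [1600, 1050]              # boys, girls (from the example)
total = sum(observed)                # 2650 students
expected = [total / 2, total / 2]    # H0: equal proportions, 1325 each

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1               # number of categories - 1

# Critical value for df = 1 at the 1% level, from a standard chi-square table.
CRITICAL_1PCT_DF1 = 6.635

reject_h0 = statistic > CRITICAL_1PCT_DF1
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {reject_h0}")
```

With these figures the statistic is about 114.15, far above 6.635, so the null hypothesis of equal proportions is rejected at the 1% level.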

Chi Square Test for Independence

  • Used to test whether two categorical variables are related; the observed counts are arranged in a contingency table.

  • Example: Examining performance differences between boys and girls.

    • Contingency table data: counts of boys and girls obtaining each test result (A, B, C).

Expected Frequency Calculation

  • Consider total boys and girls:

    • Total Boys = 1098, Total Girls = 1552, Total = 2650.

  • Calculate the expected frequency of each cell from the marginal totals: expected = (row total × column total) / grand total.

    • Boys getting A: Expected = 600; Girls getting A: Expected = 600.

    • Repeat the calculation for every cell of the table.

  • Hypotheses for the test of independence:

    • H0: Test result and sex are independent (no association).

    • H1: Test result and sex are associated.
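
As a rough sketch of the expected-frequency step, each cell's expected count under independence is its row total times its column total, divided by the grand total. The cell counts below are HYPOTHETICAL and chosen only so that the row totals match the 1098 boys and 1552 girls quoted above; the lecture's actual table is not reproduced here, and the function name expected_frequencies is likewise an assumption.

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# HYPOTHETICAL observed counts for results A, B, C (not the lecture's data);
# only the row totals 1098 and 1552 are taken from the notes.
observed = [
    [300, 500, 298],   # boys, total 1098
    [400, 700, 452],   # girls, total 1552
]

expected = expected_frequencies(observed)
for label, row in zip(["Boys", "Girls"], expected):
    print(label, [round(e, 1) for e in row])
```

The expected counts in each row add up to that row's observed total, which is a quick check that the table was filled in correctly.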

Test Result Interpretation

  • The final Chi Square calculation shows that the null hypothesis is not rejected at the 5% significance level:

    • The calculated Chi Square value is less than the critical Chi Square value at 2 degrees of freedom, where degrees of freedom = (rows - 1) × (columns - 1) = (2 - 1) × (3 - 1).
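
The whole decision procedure for a contingency table can be sketched as follows. This is an illustration under stated assumptions: the counts are HYPOTHETICAL (only the row totals 1098 and 1552 come from the notes), the function name independence_test is invented here, and the critical values are copied from a standard Chi Square table for a few common cases rather than computed.

```python
# Upper-tail critical values from a standard chi-square table,
# keyed by (degrees of freedom, significance level).
CRITICAL_VALUES = {
    (1, 0.05): 3.841, (1, 0.01): 6.635,
    (2, 0.05): 5.991, (2, 0.01): 9.210,
}


def independence_test(table, alpha):
    """Return (statistic, df, reject_h0) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected_count = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed_count - expected_count) ** 2 / expected_count

    # Degrees of freedom: (rows - 1) * (columns - 1).
    df = (len(table) - 1) * (len(table[0]) - 1)
    critical = CRITICAL_VALUES[(df, alpha)]  # KeyError if not in the small table
    return statistic, df, statistic > critical


# HYPOTHETICAL counts for results A, B, C (not the lecture's data).
observed = [
    [300, 500, 298],   # boys
    [400, 700, 452],   # girls
]

statistic, df, reject_h0 = independence_test(observed, alpha=0.05)
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {reject_h0}")
```

For these illustrative counts the statistic is about 1.50, below the 5% critical value of 5.991 at 2 degrees of freedom, so the null hypothesis of independence would not be rejected.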

Activities

  1. Test Example: Test the relationship between smoking and sex using the following data:

    • Observed results: Boys Smoker - 125, Girls Smoker - 56, Non-smoker Boys - 300, Non-smoker Girls - 519.

    • Perform a Chi-square test at a 5% significance level.

  2. A-level Results Analysis: Is there a significant difference in pass rates among the following subjects?

    • Mathematics: Passed 324, Failed 447

    • Economics: Passed 81, Failed 211

    • Applied Mathematics: Passed 65, Failed 66

    • Test at 1% significance level.

  3. Genetic Theory Ratio Analysis: Colour-strain ratio of plants:

    • Expected ratio: Yellow : Black : Blue = 1 : 4 : 5, with 200 plants in total.

    • Observed counts: Yellow 23, Black 86, Blue 91.

    • Test whether the observed counts differ significantly from the theoretical ratio at the 5% level.

  4. Smoking Habit and Age Group Test: 3200 interviews with results indicating smokers and non-smokers within specified age ranges.

    • Test for a relationship between smoking habit and age group at the 1% significance level.

  5. Salary Relation with Education: Interviews show:

    • Master Degree and No Master Degree salary distributions and results.

    • Test relationship between education level and salary at 1% significance.
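
For activities that state a theoretical ratio, such as Activity 3, the expected frequencies come from scaling the ratio to the sample size. A minimal sketch of that first step is shown below; the function name expected_from_ratio is an assumption made for illustration, and the Chi Square calculation itself is left to the reader.

```python
def expected_from_ratio(ratio, total):
    """Scale a theoretical ratio to expected counts that sum to `total`."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


# Activity 3: theoretical ratio Yellow : Black : Blue = 1 : 4 : 5, 200 plants.
expected = expected_from_ratio([1, 4, 5], 200)
print(expected)  # [20.0, 80.0, 100.0]
```

These expected counts are then compared with the observed counts using the Chi Square formula, with degrees of freedom equal to the number of colours minus 1.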