Lecture 13: Chi-Square

26 Terms

1

Chi-square test

Statistical test used to analyze categorical data (counts) by comparing what you observe to what you would expect under some assumption

  • GOF test: does one variable follow a specific distribution?

  • Independence test: are two categorical variables related?

2

What is the difference between a chi-square test and a t-test?

Chi-square test

  • Uses categorical data (counts/frequencies)

  • Looks at patterns in proportions

T-test

  • Uses continuous data (means)

  • Compares average values between groups

3

What is the chi-square distribution and how does it differ from the normal distribution?

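  • Shape is right-skewed, whereas the normal distribution is symmetric

  • Values are always ≥ 0, whereas a normal variable can be negative

  • Shape depends on the degrees of freedom (df), not on a mean and standard deviation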

4

How does the chi-square distribution change as a function of degrees of freedom / sample size?

As df increases:

  • Distribution becomes less skewed

  • Starts to look more symmetrical

  • The peak shifts to the right
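
The trend above can be checked numerically. This is a minimal sketch using standard properties of the chi-square distribution (mean = df, mode = max(df − 2, 0), skewness = √(8/df)); the function name `chi_square_shape` is made up for illustration.

```python
import math

def chi_square_shape(df):
    """Return (mean, mode, skewness) of a chi-square distribution with df degrees of freedom."""
    mean = df                      # the center moves right as df grows
    mode = max(df - 2, 0)          # location of the peak
    skewness = math.sqrt(8 / df)   # shrinks toward 0 (symmetry) as df grows
    return mean, mode, skewness

for df in (1, 2, 5, 10, 30):
    mean, mode, skew = chi_square_shape(df)
    print(f"df={df:>2}  mean={mean:>2}  mode={mode:>2}  skewness={skew:.2f}")
```

As df increases, the printed mode moves to the right and the skewness falls, which matches the three bullets on this card.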

5

Can a chi-square test be one-sided and/or two-sided?

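Always one-sided (right-tailed): deviations are squared, so a departure from Ho in either direction makes chi-square larger, and only large values count as evidence against Ho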

6

What kind of data must we have in order to conduct a chi-square test?

  • Categorical variables

  • Data in the form of frequencies/counts 

  • Independent observations 

  • Setup is usually a contingency table (ex: 2x3)

7

What is a chi-square goodness-of-fit test? What type of question does it test?

Examines whether the observed distribution of a single categorical variable matches a theoretical or expected distribution

8

What do the null and alternative hypotheses look like for GOF?

Ho: The observed frequencies match the expected frequencies

Ha: The observed frequencies do NOT match the expected frequencies

9

How do we compute a chi-square test statistic?

Numerator: squared difference between Observed (O) and Expected (E) → (O - E)²

Denominator: expected frequency (E), which scales the difference

Chi-square is the sum, over all categories, of these standardized squared deviations: χ² = Σ (O − E)² / E
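
A minimal sketch of this computation for a goodness-of-fit test. The die-roll counts are hypothetical data, and `gof_statistic` is an illustrative name, not a library function.

```python
def gof_statistic(observed, expected_props):
    """Sum of (O - E)^2 / E over all categories."""
    n = sum(observed)
    # Expected count in each category = total sample size * proportion under Ho
    expected = [n * p for p in expected_props]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 60 rolls of a die; Ho says every face has probability 1/6
observed = [8, 12, 9, 11, 6, 14]
chi_sq = gof_statistic(observed, [1 / 6] * 6)
print(round(chi_sq, 3))
```

Each expected count here is 10, so the squared deviations 4, 4, 1, 1, 16, 16 are each divided by 10 before being added up.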

10

What happens to chi-square when the difference between the observed frequencies and the expected frequencies (as specified under the null) increases (assuming all other things stay equal)?

If the size of (O - E) gets larger → the numerator increases → chi-square increases → more evidence against Ho

11

What happens to the chi-square value when the sample size increases (assuming all other things stay equal)?

  • Expected counts (E) increase, so the same proportional difference between observed and expected produces a larger chi-square value

  • Larger samples make it easier to detect significant differences
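
A small sketch of the sample-size effect, using hypothetical counts: the observed split is 55% / 45% in both samples, and only n changes. The helper name `chi_square` is illustrative.

```python
def chi_square(observed, expected_props):
    """Chi-square statistic: sum of (O - E)^2 / E, with E = n * proportion under Ho."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_props))

props = [0.5, 0.5]        # Ho: both categories equally likely
small = [55, 45]          # n = 100
large = [550, 450]        # n = 1000, same observed proportions

chi_small = chi_square(small, props)
chi_large = chi_square(large, props)
print(chi_small, chi_large)
```

With the proportions held fixed, multiplying the sample size by 10 multiplies chi-square by 10. The smaller sample falls below the df = 1 critical value of 3.841 (α = .05) and the larger one exceeds it.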

12

What do we compare the chi-square statistic to?

A critical value from the chi-square distribution OR use a p-value
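
Both comparisons lead to the same decision. The sketch below assumes an integer df and uses the standard closed-form series for the right-tail probability of a chi-square distribution, so it needs only the standard library. `chi2_sf` is an illustrative name, and the statistic of 10.0 is a hypothetical value.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(chi-square >= x) for a positive integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^i / i! for i = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal-tail term plus a finite series in sqrt(x)
    phi = math.exp(-half) / math.sqrt(2 * math.pi)  # standard normal density at sqrt(x)
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + 2 * phi * total

alpha = 0.05
statistic, df = 10.0, 1          # hypothetical test result
critical_value = 3.841           # table value for df = 1, alpha = .05
p_value = chi2_sf(statistic, df)

# Reject Ho when the statistic exceeds the critical value, or equivalently when p < alpha
print(statistic > critical_value, p_value < alpha)
```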

13

What are the assumptions of a GOF test?

  • Categorical data 

  • Independent observations

  • Expected frequencies are sufficiently large

14

How do we compute degrees of freedom for GOF?

df = k - 1

  • k: number of categories

15

R output for a GOF test

  • Refers to one variable only

  • No mention of rows/columns

  • Hypothesis is about distribution matching expected proportions

16

Chi-square test of independence

Examine whether two categorical variables are related or if they’re independent of each other

  • Tests questions like: “is gender related to political preference?”

17

Null and alternative hypotheses (chi-square test of independence)

Ho: The two variables are independent (no relationship exists)

Ha: The two variables are not independent (there’s an association)

18

How do we compute a chi-square test statistic?

Numerator: squared difference between observed and expected counts → (O - E)²

Denominator: expected counts (which standardizes the difference) → E
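
For a test of independence, the expected count in each cell is usually computed as (row total × column total) / n, which is what the counts would be if the two variables were unrelated. A minimal sketch with a hypothetical 2x2 table; the function names are illustrative.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def independence_statistic(table):
    """Sum of (O - E)^2 / E over every cell of the contingency table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Hypothetical 2x2 contingency table of counts
table = [[30, 20],
         [20, 30]]
expected = expected_counts(table)
chi_sq = independence_statistic(table)
print(expected, chi_sq)
```

Every expected count in this table is 25, and each cell is 5 away from it, so each of the four cells contributes 25 / 25 = 1 to the total.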

19

What happens to chi-square when the difference between the observed frequencies and the expected frequencies (as specified under the null) increases (assuming all other things stay equal)?

If (O - E) increases:

  • The numerator increases

  • So chi-square increases

20

What happens to the chi-square value when the sample size increases (assuming all other things stay equal)?

  • Expected frequencies increase

  • Even small proportional differences can produce larger chi-square values

21

What do we compare the chi-square statistic to?

A critical value from the chi-square distribution OR a p-value

22

What are the assumptions of a chi-sq test of independence?

  • Categorical variables 

  • Independent observations (each subject is counted in exactly one cell; no repeated measures)

  • Expected cell frequencies are sufficiently large

    • Rule of thumb: each expected count ≥ 5

  • Data are in a contingency table 
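
The expected-count rule of thumb can be checked before running the test. A minimal sketch, assuming expected counts are computed as (row total × column total) / n; the table is hypothetical and `small_expected_cells` is an illustrative name.

```python
def small_expected_cells(table, minimum=5):
    """Return (row index, column index, expected count) for cells whose expected count is below minimum."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    flagged = []
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            e = r * c / n
            if e < minimum:
                flagged.append((i, j, e))
    return flagged

# Hypothetical 2x2 table with a sparse first row
table = [[3, 7],
         [12, 18]]
flagged = small_expected_cells(table)
print(flagged)
```

An empty list means every expected count meets the threshold. Here the first cell is flagged, so the chi-square approximation may be unreliable for this table.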

23

How do we compute degrees of freedom for this test?

df = (r - 1)(c - 1)

  • r: number of rows in the contingency table

  • c: number of columns in the contingency table
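
A short sketch of the formula applied to tables stored as lists of rows. The function name and the tables are hypothetical.

```python
def independence_df(table):
    """Degrees of freedom for a test of independence: (rows - 1) * (columns - 1)."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)

print(independence_df([[30, 20], [20, 30]]))          # 2x2 table
print(independence_df([[10, 15, 5], [20, 10, 10]]))   # 2x3 table
```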

24

What is Yates’ correction? When is it used?

An adjustment applied to a chi-square test to make it more accurate when working with small samples and discrete data

  • Chi-square test uses a continuous distribution to approximate results from discrete count data

  • Yates’ correction compensates for this mismatch by subtracting 0.5 from the absolute difference between each observed and expected value before squaring it

  • It reduces the chi-square value, increases p-value, and makes the test more conservative (harder to reject Ho)

  • Primarily used for a test of independence (2x2 tables)
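
A minimal sketch comparing the corrected and uncorrected statistic on a hypothetical 2x2 table. The function name is illustrative, and the correction is applied as |O − E| − 0.5, floored at 0.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square statistic for a contingency table, optionally with Yates' continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Shrink the deviation by 0.5, never below zero
                diff = max(diff - 0.5, 0.0)
            total += diff ** 2 / expected
    return total

# Hypothetical 2x2 table of counts
table = [[30, 20],
         [20, 30]]
uncorrected = chi_square_2x2(table)
corrected = chi_square_2x2(table, yates=True)
print(uncorrected, round(corrected, 2))
```

For this table the uncorrected statistic exceeds the df = 1 critical value of 3.841 (α = .05) and the corrected one does not, which shows how the correction makes the test more conservative.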

25

R output for chi-square test of independence with Yates’ continuity correction

  • Label becomes: "Pearson's Chi-squared test with Yates' continuity correction"

  • Chi-squared value is smaller

  • P-value is larger (more conservative)

  • Mainly used for 2x2 tables

  • Yates’ correction: adjusts for the fact that the chi-square distribution is continuous but the data are discrete counts

26

R output for chi-square test of independence w/o Yates’ continuity correction

  • Based on a contingency table (two categorical variables)

  • You can extract expected counts ($expected) and residuals ($residuals)

  • df = (r-1)(c-1)

  • Tests association
