BRM Chapter 24 - Two-Way Tables and the Chi-Square Test

35 Terms

1

Two-way table - definition

A table that classifies individuals according to two categorical variables, with rows for one variable's categories and columns for the other's categories.​

2

Row variable - definition

The categorical variable whose categories form the rows of a two-way table (for example, graduation status: graduated vs did not graduate).​

3

Column variable - definition

The categorical variable whose categories form the columns of a two-way table (for example, race/ethnicity).​

4

Marginal totals - definition

The "Total" row and "Total" column that show the overall distributions of each variable separately, combining over the other variable's categories.​

5

Using percents in two-way tables

It is often clearer to convert cell counts to percentages when comparing groups so that patterns in the relationship between variables are easier to see.​

6

Describing relationships in two-way tables

To describe an association, compute the relevant percents (such as the percent graduating within each racial or ethnic group) and compare them across the rows or columns of the table.
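
The percent comparison described in this card can be sketched in a few lines of Python. This is only an illustration: the counts and the labels `Group A`, `Group B`, `yes`, and `no` are hypothetical and do not come from the chapter, and `row_percents` is a helper name invented here.

```python
# Hypothetical counts, invented for illustration only (not from the chapter).
table = {
    "Group A": {"yes": 30, "no": 20},
    "Group B": {"yes": 15, "no": 35},
}

def row_percents(table):
    """Return, for each row, the percent of that row's total in each column."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: 100 * count / row_total for col, count in cells.items()}
    return result

percents = row_percents(table)
for row, cells in percents.items():
    print(row, {col: round(p, 1) for col, p in cells.items()})
```

Comparing the "yes" percent across the two rows (60% versus 30% here) is what reveals an association; comparing the raw counts alone would not be fair if the row totals were very different.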

7

Graduation and race example - pattern

In the graduation-by-race table, over 60% of white students and more than 70% of Asian students graduated within 6 years, but fewer than 40% of Black students and of American Indian/Alaska Native students did.

8

Question of inference for two-way tables

When a sample table shows an association, we ask whether this reflects a real association in the population or could be due to random sampling variation.​

9

Cocaine treatment example - setup

In a randomized study, 72 cocaine addicts were assigned equally to three treatments (desipramine, lithium, placebo), and success was defined as not using cocaine.​

10

Cocaine treatment example - observed pattern

The proportion of subjects who did not use cocaine was much higher in the desipramine group than in the lithium or placebo groups.​

11

Null hypothesis for a two-way table

The null hypothesis states there is no association between the row and column variables; any differences in sample counts are due to chance alone.​

12

Null hypothesis - cocaine study

H0: In the population of all cocaine addicts, there is no association between the treatment an addict receives and whether the addict succeeds in not using cocaine.

13

Alternative hypothesis for a two-way table

The alternative hypothesis states there is an association between the row and column variables; the distribution of one variable differs across levels of the other.​

14

Alternative hypothesis - cocaine study

Ha: In the population of all cocaine addicts, there is an association between the treatment an addict receives and whether the addict succeeds in not using cocaine.

15

Expected counts - definition

The counts we would expect in each cell of the two-way table, apart from random variation, if the null hypothesis of no association were true.

16

Expected counts - equal-group cocaine example

If the overall success rate is 24/72 = 1/3 and each treatment group has 24 subjects, we expect 8 successes and 16 failures in each treatment group under H0.​
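
The arithmetic in this card is a special case of the usual rule that a cell's expected count is its row total times its column total, divided by the table total. A minimal sketch, assuming only the margins stated in the card (24 subjects per treatment, 24 successes and therefore 48 failures among 72 subjects); the function name `expected_counts` is invented for this example.

```python
def expected_counts(row_totals, col_totals):
    """Expected count for each cell = row total * column total / table total."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must add to the same table total")
    return [[r * c / n for c in col_totals] for r in row_totals]

# Margins from the cocaine example.
row_totals = [24, 24, 24]  # desipramine, lithium, placebo
col_totals = [24, 48]      # success, failure
expected = expected_counts(row_totals, col_totals)
print(expected)  # [[8.0, 16.0], [8.0, 16.0], [8.0, 16.0]]
```

Because the three groups are the same size, every row gets the same expected counts; with unequal groups the same function would give different expected counts in each row.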

17

General idea of the chi-square test

To test H0, compare observed cell counts with expected counts; large overall discrepancies provide evidence against "no association."​

18

Chi-square statistic - concept

A single number that measures how far the observed counts are from their expected counts: for each cell, square the difference between the observed and expected count, divide by the expected count, then add the results over all cells.
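
Written out, the standard form of the statistic is shown below. The words "observed" and "expected" stand for the observed and expected counts in a single cell, and the sum runs over every cell of the table (the total row and column are not included).

```latex
\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}}
```

Dividing by the expected count puts each cell's discrepancy on a relative scale, so a difference of 6 counts matters more in a cell where 8 were expected than in a cell where 800 were expected.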

19

Chi-square distribution - definition

The approximate sampling distribution of the chi-square statistic when H0 is true; it takes only nonnegative values, is skewed to the right, and has a shape that depends on the degrees of freedom.

20

Degrees of freedom for chi-square

For a two-way table with r rows and c columns, the chi-square test uses a chi-square distribution with (r − 1)(c − 1) degrees of freedom.​

21

Degrees of freedom - cocaine example

The cocaine table has 3 treatments and 2 outcomes, so df = (3 − 1)(2 − 1) = 2.​

22

Using chi-square critical values

Tables give critical values showing how large the chi-square statistic must be (for a given df) to be significant at levels such as 0.05 or 0.01.​

23

Cocaine study - chi-square result

With df = 2 and χ² = 10.5, the statistic exceeds the 0.01 critical value (9.21), so the association between treatment and success is significant at the 0.01 level (P < 0.01).
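
The numbers in this card can be reproduced with a short calculation. The cards do not list the observed counts, so the counts below are an assumption: they were chosen to agree with the stated margins (24 subjects per group, 24 successes overall) and with the stated statistic of 10.5. The sketch also relies on a shortcut that holds only for df = 2, where the upper-tail probability of the chi-square distribution is exp(−x/2).

```python
import math

# ASSUMED observed counts (success, failure); not listed in the cards.
# Chosen to match the stated margins and the stated chi-square of 10.5.
observed = {
    "desipramine": (14, 10),
    "lithium": (6, 18),
    "placebo": (4, 20),
}
expected = (8.0, 16.0)  # expected (success, failure) in every group under H0

def chi_square(observed, expected):
    """Sum (observed - expected)^2 / expected over every cell."""
    return sum(
        (obs - exp) ** 2 / exp
        for counts in observed.values()
        for obs, exp in zip(counts, expected)
    )

stat = chi_square(observed, expected)
df = (3 - 1) * (2 - 1)  # (rows - 1) * (columns - 1)

# Valid for df = 2 only: upper-tail probability is exp(-x / 2).
p_value = math.exp(-stat / 2)
critical_01 = -2 * math.log(0.01)  # 0.01 critical value for df = 2

print(round(stat, 2), df, round(critical_01, 2), round(p_value, 4))
# 10.5 2 9.21 0.0052
```

For tables with other degrees of freedom the closed form does not apply, and a printed table of critical values or statistical software would be needed instead.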

24

Interpreting significant chi-square

The test shows strong evidence of some association; to see the nature of the relationship, look back at the table (desipramine performs better than the other treatments).​

25

Conditions for using the chi-square test

You can safely use the chi-square test when no more than 20% of expected counts are less than 5 and all expected counts are at least 1.​
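
This rule of thumb is easy to check mechanically once the expected counts are known. A minimal sketch: the function name `chi_square_conditions_ok` is invented here, the first table uses the expected counts of 8 and 16 from the cocaine example, and the second table is hypothetical.

```python
def chi_square_conditions_ok(expected_counts):
    """Check the rule of thumb: all expected counts >= 1 and at most 20% below 5."""
    cells = [e for row in expected_counts for e in row]
    all_at_least_one = all(e >= 1 for e in cells)
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return all_at_least_one and share_below_five <= 0.20

# Expected counts from the cocaine example: 8 successes, 16 failures per group.
cocaine_expected = [[8.0, 16.0]] * 3
print(chi_square_conditions_ok(cocaine_expected))  # True

# Hypothetical expected counts: 2 of 4 cells (50%) are below 5, so the rule fails.
small_expected = [[3.0, 12.0], [4.0, 16.0]]
print(chi_square_conditions_ok(small_expected))  # False
```

Note that the check is applied to the expected counts, not the observed counts.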

26

Chi-square test - what it tells us

The chi-square test tells whether an observed association is statistically significant, not whether it is large or practically important.​

27

Simpson's paradox - idea

An association that holds within each of several groups can disappear or reverse when the data from all groups are combined into a single table.​

28

Medical helicopter example - overall pattern

Overall, 32% of helicopter patients died versus 24% of road-transport patients, suggesting helicopters are worse when seriousness of accidents is ignored.​

29

Medical helicopter example - within groups

When data are broken down by seriousness of accident, the death rate is lower for helicopter patients in both serious and non-serious accidents.​

30

Lurking variable in helicopter example

Seriousness of the accident is a lurking variable; helicopters are used more often for serious accidents, so combining all patients without this variable is misleading.​
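
The reversal in the helicopter example can be checked numerically. The cards give only the overall percents (32% and 24%), so the counts below are assumed for illustration: they were chosen to reproduce those overall percents and to have helicopters carry a much larger share of the serious cases.

```python
# ASSUMED counts (died, total) per transport type; not listed in the cards.
# Chosen to reproduce the overall 32% vs about 24% death rates stated there.
data = {
    "serious": {"helicopter": (48, 100), "road": (60, 100)},
    "less serious": {"helicopter": (16, 100), "road": (200, 1000)},
}

def death_rate(died, total):
    """Percent of patients who died."""
    return 100 * died / total

# Within each level of seriousness, helicopter patients do better.
within = {
    severity: {t: death_rate(*counts) for t, counts in groups.items()}
    for severity, groups in data.items()
}

# Combining the two levels ignores seriousness and reverses the comparison.
overall = {}
for transport in ("helicopter", "road"):
    died = sum(data[s][transport][0] for s in data)
    total = sum(data[s][transport][1] for s in data)
    overall[transport] = death_rate(died, total)

print(within)
print({t: round(rate, 1) for t, rate in overall.items()})
```

Under these assumed counts, half of the helicopter patients come from serious accidents but only 100 of the 1100 road patients do, which is why the combined helicopter death rate looks worse even though it is lower within each group.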

31

Simpson's paradox - definition

When an association or comparison that holds within each of several groups reverses or disappears when the groups are combined, this is called Simpson's paradox.​

32

Lurking variables and categorical data

As with quantitative data, lurking variables can change or reverse observed associations between categorical variables in two-way tables.​

33

Statistics in summary - two-way tables

Categorical variables group individuals into classes; to display the relationship between two categorical variables, use a two-way table and compare appropriate percentages.​

34

Statistics in summary - Simpson's paradox

Lurking variables can make an observed association misleading; Simpson's paradox is an extreme case where combining groups reverses the association.​

35

Statistics in summary - chi-square test

The chi-square test compares observed and expected counts in a two-way table and uses the chi-square distribution to decide whether an observed association is statistically significant.