Analysis of Categorical Data and Contingency Tables

Description and Tags

Vocabulary flashcards covering key terms from the lecture notes on categorical data analysis, contingency tables, chi-square tests, Fisher’s exact test, and odds ratios.

26 Terms

Categorical Variable

A variable whose values are categories rather than quantities (e.g., eye color).

Numerical Variable

A variable that takes numeric values (e.g., height, age).

Ordinal

Categorical with a meaningful order (e.g., first, second, third).

Nominal

Categorical without an inherent order.

Contingency Table

A table showing frequencies for two categorical variables to assess association.

Observed Frequencies

Frequencies that are actually observed in the data for each cell.

Expected Frequencies

Frequencies expected under the null hypothesis, calculated from margins.

Null Hypothesis (H0) in Contingency Table

There is no association between the variables; they are independent.

Alternative Hypothesis (Ha)

There is an association between the variables.

Pearson Chi-Square Test

Tests for an association between two categorical variables by comparing observed and expected frequencies.

Chi-Square Statistic

Sum over all cells of (O − E)² / E, where O is observed and E is expected.
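
The sum on this card can be sketched in a few lines of Python. This is an illustration only: the function name `chi_square_statistic` and the counts in `table` are made up for the example, not taken from the lecture notes.

```python
def chi_square_statistic(observed):
    """Pearson chi-square: sum over all cells of (O - E)^2 / E."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence, from the margins.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic


table = [[20, 10], [10, 20]]  # hypothetical counts, for illustration only
print(round(chi_square_statistic(table), 3))  # 6.667
```

Every expected count in this table is 15, so each cell contributes 25/15 and the four cells sum to about 6.667.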

Degrees of Freedom (df)

For an r × c table, df = (r − 1) × (c − 1).
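
As a quick check of the formula, a minimal Python sketch (the helper name `degrees_of_freedom` and both tables are hypothetical):

```python
def degrees_of_freedom(table):
    """df = (rows - 1) * (columns - 1) for an r x c contingency table."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)


print(degrees_of_freedom([[20, 10], [10, 20]]))  # 1  (2 x 2 table)
print(degrees_of_freedom([[1, 2, 3, 4],
                          [5, 6, 7, 8],
                          [9, 10, 11, 12]]))     # 6  (3 x 4 table)
```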

Yates' Continuity Correction

Adjustment to the chi-square statistic for 2×2 tables with small samples: subtract 0.5 from each |O − E| before squaring, which lowers the statistic and makes the test more conservative.
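
A rough Python sketch of the corrected statistic follows. The function name and counts are hypothetical, and clamping the corrected difference at zero is an assumption of this sketch (so the correction cannot overshoot when |O − E| is below 0.5), not something stated on the card.

```python
def yates_chi_square(observed):
    """Chi-square for a 2x2 table with Yates' continuity correction."""
    if len(observed) != 2 or any(len(row) != 2 for row in observed):
        raise ValueError("Yates' correction applies to 2x2 tables only")
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            # Subtract 0.5 from |O - E| before squaring; clamp at zero
            # (assumption of this sketch).
            corrected = max(abs(o - e) - 0.5, 0.0)
            statistic += corrected ** 2 / e
    return statistic


table = [[20, 10], [10, 20]]  # hypothetical counts, for illustration only
print(round(yates_chi_square(table), 3))  # 5.4
```

For the same table the uncorrected statistic is about 6.667, so the correction shrinks it, as expected.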

Fisher’s Exact Test

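An exact test for count data that computes the p-value directly from the hypergeometric distribution rather than from a chi-square approximation; preferred when any expected count is < 5; more conservative than chi-square.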
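
To make "exact" concrete, here is a minimal Python sketch of a two-sided Fisher test for a 2×2 table. It assumes the common convention of summing the probabilities of all tables (with the same margins) that are no more likely than the observed one; the function name, the small tolerance, and the counts are choices made for this illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is at most as probable as the observed one;
    # the tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Hypothetical small table, for illustration only.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```

With these margins there are only five possible tables, with probabilities 1/70, 16/70, 36/70, 16/70 and 1/70; the observed table has probability 16/70, so the p-value is 34/70.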

Odds Ratio (OR)

A measure of effect size for two binary variables; for a 2×2 table with cells a, b (first row) and c, d (second row), OR = (ad)/(bc).

Interpreting OR > 1 or OR < 1

OR > 1 indicates higher odds of the outcome with the first condition; OR < 1 indicates lower odds; OR = 1 indicates no association.
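
A small Python sketch of computing and reading an odds ratio, assuming the 2×2 cells are labelled a, b in the first row and c, d in the second. The function names and counts are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """OR = (a * d) / (b * c) for the 2x2 table [[a, b], [c, d]]."""
    if b * c == 0:
        raise ZeroDivisionError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


def interpret(or_value):
    """Plain-language reading of an odds ratio."""
    if or_value > 1:
        return "higher odds of the outcome with the first condition"
    if or_value < 1:
        return "lower odds of the outcome with the first condition"
    return "no association"


value = odds_ratio(20, 10, 10, 20)  # hypothetical counts
print(value, "->", interpret(value))
# 4.0 -> higher odds of the outcome with the first condition
```

Here the odds in the first row are 20/10 = 2 and in the second row 10/20 = 0.5, so the ratio is 4.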

Independence in Contingency Table

Null hypothesis that row and column variables are independent (no association).

Standardized Residuals

Residuals divided by their standard deviation; help identify cells contributing to the chi-square.
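
One way to compute these in Python is sketched below. It assumes "standard deviation" means the estimated standard error of O − E under independence, sqrt(E × (1 − row total/n) × (1 − column total/n)); the lecture notes may use a different scaling, such as the Pearson residual (O − E)/√E. The function name and counts are hypothetical.

```python
from math import sqrt


def standardized_residuals(observed):
    """(O - E) divided by its estimated standard error under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        out_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            # Assumed variance of (O - E) under independence.
            variance = e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out_row.append((o - e) / sqrt(variance))
        residuals.append(out_row)
    return residuals


table = [[20, 10], [10, 20]]  # hypothetical counts, for illustration only
for row in standardized_residuals(table):
    print([round(r, 3) for r in row])
# [2.582, -2.582]
# [-2.582, 2.582]
```

Cells with residuals beyond roughly ±2 are the ones contributing most to the chi-square statistic.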

Observed vs Expected in R Output

O = observed frequencies; E = expected frequencies; used to compute chi-square and residuals.

CrossTable (gmodels)

R function from the gmodels package that displays a contingency table, with options such as fisher, chisq, expected, and sresid (standardized residuals).

Chi-Square Test in R (chisq.test)

R function to perform Pearson's chi-square test; applies Yates' correction to 2×2 tables by default, which can be disabled with correct=FALSE.

Fisher's Exact Test in R (fisher.test)

R function to perform Fisher's Exact Test for count data.

Expected Counts in a 2×2 Table

Calculated as (row total × column total) / grand total.
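
The margin formula can be sketched in Python as follows; the function name `expected_counts` and the counts are made up for illustration.

```python
def expected_counts(observed):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


table = [[10, 20], [30, 40]]  # hypothetical counts, for illustration only
print(expected_counts(table))  # [[12.0, 18.0], [28.0, 42.0]]
```

The row totals are 30 and 70, the column totals 40 and 60, and the grand total 100, so the top-left expected count is 30 × 40 / 100 = 12.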

Contingency Test Summary

Use chi-square when all expected counts are at least 5; use Fisher's exact test when they are not; if the two tests disagree, prefer Fisher's.

Example Contingency Table (Training vs Dancing in Cats)

A case study with variables like Reward type and whether cats danced, used to illustrate testing and odds ratios.

Sample Size Assumption Met

When all expected cell counts are ≥ 5, Pearson's chi-square test can be used; otherwise use Fisher's exact test.
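
The rule of thumb on this card can be written as a small Python check. This is a sketch under the card's threshold of 5; the function name, the return strings, and both tables are hypothetical.

```python
def recommended_test(observed, threshold=5):
    """Pick a test from the smallest expected cell count."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    smallest_expected = min(
        r * c / grand_total for r in row_totals for c in col_totals
    )
    if smallest_expected >= threshold:
        return "Pearson chi-square"
    return "Fisher exact"


print(recommended_test([[20, 10], [10, 20]]))  # Pearson chi-square
print(recommended_test([[3, 1], [1, 3]]))      # Fisher exact
```

In the first table every expected count is 15; in the second every expected count is 2, which falls below the threshold.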