Vocabulary flashcards covering key terms from the lecture notes on categorical data analysis, contingency tables, chi-square tests, Fisher’s exact test, and odds ratios.
Categorical Variable
A variable whose values are categories or group labels rather than quantities (e.g., eye color).
Numerical Variable
A variable that takes numeric values (e.g., height, age).
Ordinal
Categorical with a meaningful order (e.g., first, second, third).
Nominal
Categorical without an inherent order.
Contingency Table
A table showing frequencies for two categorical variables to assess association.
Observed Frequencies
Frequencies that are actually observed in the data for each cell.
Expected Frequencies
Cell counts expected if the null hypothesis of independence were true, calculated from the row and column totals (the margins).
Null Hypothesis (H0) in Contingency Table
There is no association between the variables; they are independent.
Alternative Hypothesis (Ha)
There is an association between the variables.
Pearson Chi-Square Test
Tests for an association between two categorical variables by comparing observed and expected frequencies.
Chi-Square Statistic
Sum over all cells of (O − E)² / E, where O is observed and E is expected.
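As a minimal sketch of the sum described above, the following computes the statistic for a small table in plain Python. The counts in `observed` are hypothetical and chosen only for illustration, and the helper names `expected_counts` and `chi_square` are assumptions rather than anything from the lecture notes.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]


def chi_square(table):
    """Pearson chi-square statistic: sum over all cells of (O - E)^2 / E."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2x2 counts (not taken from the lecture notes).
observed = [[12, 8], [5, 15]]
stat = chi_square(observed)
print(round(stat, 4))  # 5.0128
```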
Degrees of Freedom (df)
For an r × c table, df = (r − 1) × (c − 1).
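The formula can be checked with a one-line helper. This is only a sketch; the function name `degrees_of_freedom` is an illustrative assumption.

```python
def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for an r x c contingency table: (r - 1) * (c - 1)."""
    return (n_rows - 1) * (n_cols - 1)


print(degrees_of_freedom(2, 2))  # 1
print(degrees_of_freedom(3, 4))  # 6
```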
Yates' Continuity Correction
Adjustment to the chi-square statistic for 2×2 tables in small samples: 0.5 is subtracted from each |O − E| before squaring, which makes the statistic smaller and the test more conservative.
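A rough sketch of the correction for a 2×2 table follows. The counts are hypothetical, the function name `yates_chi_square` is an assumption, and the code applies the textbook formula directly without claiming to reproduce any particular software's edge-case handling.

```python
def yates_chi_square(table):
    """Chi-square for a 2x2 table with 0.5 subtracted from |O - E| before squaring."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            stat += (abs(o - e) - 0.5) ** 2 / e
    return stat


# Hypothetical 2x2 counts (not taken from the lecture notes).
observed = [[12, 8], [5, 15]]
corrected = yates_chi_square(observed)
print(round(corrected, 4))  # 3.6829
```

For these counts the uncorrected statistic is about 5.01, so the correction lowers it noticeably.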
Fisher’s Exact Test
An exact test for count data; preferred when any expected cell count is < 5; more conservative than chi-square.
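To show what "exact" means here, the sketch below enumerates every 2×2 table that shares the observed margins and sums the hypergeometric probabilities of those no more likely than the observed table. That is one common two-sided convention, assumed here for illustration; the counts and the function name `fisher_exact_two_sided` are hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided exact p-value for a 2x2 table with fixed margins.

    Sums the probabilities of all tables whose probability does not
    exceed that of the observed table (an assumed convention).
    """
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), comb(n, col1))

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    return sum(prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_observed)


# Hypothetical 2x2 counts with small expected values (2 in every cell).
p_value = fisher_exact_two_sided([[3, 1], [1, 3]])
print(p_value, float(p_value))
```

Exact fractions are used so the comparison `prob(x) <= p_observed` is not affected by floating-point rounding.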
Odds Ratio (OR)
A measure of effect size for binary variables; OR = (ad)/(bc) for a 2×2 table whose first row holds the counts a and b and whose second row holds c and d.
Interpreting OR > 1 or OR < 1
OR > 1 indicates higher odds of the outcome with the first condition; OR < 1 indicates lower odds.
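A small sketch of the calculation and its interpretation, assuming the cells are labelled a, b in the first row and c, d in the second. The counts are hypothetical and the function name `odds_ratio` is illustrative.

```python
def odds_ratio(table):
    """Odds ratio (a * d) / (b * c) for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Hypothetical counts: rows are conditions, columns are outcome yes / no.
observed = [[12, 8], [5, 15]]
ratio = odds_ratio(observed)
print(ratio)  # 4.5

# The same value is the odds in row 1 divided by the odds in row 2.
odds_row1 = 12 / 8   # 1.5
odds_row2 = 5 / 15   # about 0.33
print(odds_row1 / odds_row2)
```

Here the ratio is above 1, so the odds of the outcome are higher under the first condition. The sketch does not handle a zero in cell b or c, which would cause a division by zero.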
Independence in Contingency Table
Null hypothesis that row and column variables are independent (no association).
Standardized Residuals
Residuals (O − E) divided by their standard deviation; large absolute values identify the cells contributing most to the chi-square.
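The sketch below assumes the Pearson form, in which the standard deviation of each residual is estimated by the square root of the expected count. Other definitions (such as adjusted residuals) divide by a different quantity, so treat this as one possible reading of the card. The counts and the function name `standardized_residuals` are hypothetical.

```python
from math import sqrt


def standardized_residuals(table):
    """Pearson-type residuals (O - E) / sqrt(E) for each cell (assumed definition)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            res_row.append((o - e) / sqrt(e))
        residuals.append(res_row)
    return residuals


# Hypothetical 2x2 counts (not taken from the lecture notes).
observed = [[12, 8], [5, 15]]
resid = standardized_residuals(observed)
for res_row in resid:
    print([round(r, 3) for r in res_row])

# Squaring and summing these residuals gives back the chi-square statistic.
total = sum(r ** 2 for res_row in resid for r in res_row)
print(round(total, 4))  # 5.0128
```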
Observed vs Expected in R Output
O = observed frequencies; E = expected frequencies; used to compute chi-square and residuals.
CrossTable (gmodels)
R function from the gmodels package that displays a contingency table, with arguments such as fisher, chisq, expected, and sresid (standardized residuals).
Chi-Square Test in R (chisq.test)
R function to perform Pearson's chi-square test; Yates' correction is applied to 2×2 tables by default and can be disabled with correct=FALSE.
Fisher's Exact Test in R (fisher.test)
R function to perform Fisher's Exact Test for count data.
Expected Counts in a 2×2 Table
For each cell, calculated as (row total × column total) / grand total.
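The formula can be worked through on a small table. This is only an illustrative sketch: the counts are hypothetical and the function name `expected_counts` is an assumption.

```python
def expected_counts(table):
    """Expected count for each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]


# Hypothetical 2x2 counts: row totals 20 and 20, column totals 17 and 23.
observed = [[12, 8], [5, 15]]
expected = expected_counts(observed)
print(expected)  # [[8.5, 11.5], [8.5, 11.5]]
```

For the top-left cell, for instance, the calculation is 20 × 17 / 40 = 8.5.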
Contingency Test Summary
Use the chi-square test when all expected counts are at least 5; use Fisher's exact test when they are not; if the two tests disagree, prefer Fisher's.
Example Contingency Table (Training vs Dancing in Cats)
A case study with variables like Reward type and whether cats danced, used to illustrate testing and odds ratios.
Sample Size Assumption Met
When all expected cell counts are ≥ 5, Pearson's chi-square test may be used; otherwise use Fisher's exact test.
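The rule can be expressed as a simple check on the expected counts. The sketch below is a hedged illustration: the tables are hypothetical, and the function names `expected_counts` and `choose_test` are assumptions, not part of any library.

```python
def expected_counts(table):
    """Expected count for each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]


def choose_test(table, threshold=5):
    """Suggest a test based on the smallest expected cell count."""
    smallest = min(min(row) for row in expected_counts(table))
    return "chi-square" if smallest >= threshold else "fisher"


# Hypothetical tables (not taken from the lecture notes).
print(choose_test([[12, 8], [5, 15]]))  # chi-square (smallest expected count is 8.5)
print(choose_test([[3, 1], [1, 3]]))    # fisher (every expected count is 2.0)
```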