Notes on Analyzing Categorical Data – Lesson 1.1
QuickNotes: Key definitions and concepts from LT# material
Quantitative variable: takes numerical values (counts, measures)
Categorical variable: takes category names or labels
Bar graphs display the distribution of a categorical variable as either frequencies (counts) or relative frequencies (percents or proportions).
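A minimal sketch of the difference between frequency and relative frequency. The list of electives is made-up data for illustration, and the function names are hypothetical, not from the lesson.

```python
from collections import Counter

# Hypothetical survey responses (illustrative data, not from the lesson).
electives = ["Art", "Music", "Art", "Drama", "Music", "Art", "Art", "Drama"]


def frequency_table(values):
    """Return {category: count} for a list of category labels."""
    return dict(Counter(values))


def relative_frequency_table(values):
    """Return {category: percent of all values} for a list of category labels."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * count / total for category, count in counts.items()}


freq = frequency_table(electives)               # heights for a frequency bar graph
rel_freq = relative_frequency_table(electives)  # heights for a relative frequency bar graph
```

Either dictionary could supply the bar heights; the shape of the graph is the same, only the vertical scale changes.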
Two-way table concepts:
Joint distribution: the frequencies or percents for every combination of values of the two categorical variables (the interior cells of the table); a single cell gives one joint frequency.
Marginal distribution: the distribution of one variable alone, ignoring the other (read from the row totals or from the column totals).
Conditional distribution: the distribution of one variable given a specific value of the other (e.g., Elective given Core=Math).
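The three definitions above can be sketched in code. The counts below are invented for illustration, and all function names are hypothetical; the only part taken from the notes is the core class / elective setup.

```python
# Hypothetical two-way table of counts (illustrative numbers only).
# Rows: core class. Columns: elective.
table = {
    "Math": {"Art": 10, "Music": 30},
    "English": {"Art": 40, "Music": 20},
}


def grand_total(table):
    """Total number of individuals in the table."""
    return sum(sum(row.values()) for row in table.values())


def joint_distribution(table):
    """Proportion of all individuals in each (core, elective) cell."""
    total = grand_total(table)
    return {
        (core, elective): count / total
        for core, row in table.items()
        for elective, count in row.items()
    }


def marginal_distribution_core(table):
    """Distribution of core class alone, from the row totals."""
    total = grand_total(table)
    return {core: sum(row.values()) / total for core, row in table.items()}


def marginal_distribution_elective(table):
    """Distribution of elective alone, from the column totals."""
    total = grand_total(table)
    column_totals = {}
    for row in table.values():
        for elective, count in row.items():
            column_totals[elective] = column_totals.get(elective, 0) + count
    return {elective: count / total for elective, count in column_totals.items()}


def conditional_distribution(table, core):
    """Distribution of elective among individuals with the given core class."""
    row = table[core]
    row_total = sum(row.values())
    return {elective: count / row_total for elective, count in row.items()}


joint = joint_distribution(table)
marginal_core = marginal_distribution_core(table)
marginal_elective = marginal_distribution_elective(table)
elective_given_math = conditional_distribution(table, "Math")
```

Note the different denominators: joint and marginal distributions divide by the grand total, while the conditional distribution divides by the total for the given group (here, the Math row).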
Misleading graphs cautions:
The vertical axis should start at 0; otherwise, small differences look exaggerated.
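Be wary of graphs that use images in place of bars: scaling a picture in both height and width makes its area grow faster than the value it represents, so differences look larger than they are.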
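A rough sketch of why both cautions matter, using made-up percents (52 vs 48). The function names are hypothetical and exist only for this illustration.

```python
def apparent_height_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of drawn bar heights when the vertical axis starts at axis_start."""
    if axis_start >= min(value_a, value_b):
        raise ValueError("axis_start must be below both values")
    return (value_a - axis_start) / (value_b - axis_start)


def apparent_area_ratio(scale):
    """Area ratio when a picture is scaled by `scale` in both height and width."""
    return scale ** 2


# Hypothetical values: 52% vs 48%.
honest = apparent_height_ratio(52, 48, axis_start=0)      # bars look nearly equal
truncated = apparent_height_ratio(52, 48, axis_start=46)  # one bar looks 3 times as tall

# A picture drawn twice as tall and twice as wide covers 4 times the area.
doubled_picture = apparent_area_ratio(2)
```

With the axis starting at 0 the taller bar is only about 8% taller; starting the axis at 46 makes the same data look like a threefold difference.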
Relationship insight:
Two variables are associated if knowing the value of one helps predict the value of the other (e.g., knowing a student's core class helps predict their elective).
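Side-by-side and segmented bar graphs are two different displays of the same two-way table: side-by-side graphs place each group's bars next to one another, while segmented graphs stack each group's categories into a single bar.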
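One way to check for an association in a two-way table is to compare the conditional distributions across groups, which is what the two graph types show visually. This is a descriptive sketch only, with invented counts and hypothetical function names; it assumes every row lists the same elective categories.

```python
def conditional_distributions(table):
    """Conditional distribution of the column variable within each row."""
    return {
        group: {category: count / sum(row.values()) for category, count in row.items()}
        for group, row in table.items()
    }


def appears_associated(table, tolerance=1e-9):
    """True if the conditional distributions differ from one row to another."""
    distributions = list(conditional_distributions(table).values())
    first = distributions[0]
    return any(
        abs(other[category] - first[category]) > tolerance
        for other in distributions[1:]
        for category in first
    )


# Hypothetical counts: elective choice differs by core class.
associated_table = {
    "Math": {"Art": 10, "Music": 30},
    "English": {"Art": 40, "Music": 20},
}

# Hypothetical counts: both core classes split 25% Art, 75% Music.
unassociated_table = {
    "Math": {"Art": 10, "Music": 30},
    "English": {"Art": 20, "Music": 60},
}

has_association = appears_associated(associated_table)
no_association = appears_associated(unassociated_table)
```

In the second table the segments of a segmented bar graph would be the same size in every bar, so knowing the core class gives no help in predicting the elective.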