Notes on Analyzing Categorical Data – Lesson 1.1

QuickNotes: Key definitions and concepts from LT# material

  • Quantitative variable: takes numerical values (counts, measures).

  • Categorical variable: takes category names or labels.

  • Bar graphs display a categorical variable using either frequency (counts) or relative frequency (percent or proportion of the total).

  • Two-way table concepts:

    • Joint distribution: the frequency or percent (out of the grand total) for each combination of values of two categorical variables.

    • Marginal distribution: the distribution of one variable alone, ignoring the other (read from the row totals or the column totals).

    • Conditional distribution: the distribution of one variable given a specific value of the other (e.g., Elective given Core=Math).
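The three two-way table distributions can be sketched in a few lines of Python. The counts here are hypothetical, made up only for illustration, and so are the category names "English", "Art", and "Music". Only the idea of Elective given Core=Math comes from the notes.

```python
# Hypothetical two-way table: rows are core classes, columns are electives.
# These counts are invented for illustration, not taken from the lesson.
counts = {
    "Math":    {"Art": 10, "Music": 30},
    "English": {"Art": 40, "Music": 20},
}
electives = ["Art", "Music"]

# Grand total of all cells.
total = sum(sum(row.values()) for row in counts.values())

# Joint distribution: each cell divided by the grand total.
joint = {
    core: {elec: n / total for elec, n in row.items()}
    for core, row in counts.items()
}

# Marginal distributions: row totals and column totals divided by the grand total.
marginal_core = {core: sum(row.values()) / total for core, row in counts.items()}
marginal_elective = {
    elec: sum(counts[core][elec] for core in counts) / total for elec in electives
}

# Conditional distribution of Elective given Core = Math:
# each Math cell divided by the Math row total (not the grand total).
math_total = sum(counts["Math"].values())
elective_given_math = {elec: n / math_total for elec, n in counts["Math"].items()}

print("joint:", joint)
print("marginal (core):", marginal_core)
print("marginal (elective):", marginal_elective)
print("elective | Core=Math:", elective_given_math)
```

The denominator is what separates the three: the grand total for joint and marginal distributions, and a single row (or column) total for a conditional distribution.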

  • Misleading graphs cautions:

    • The vertical axis should start at 0; otherwise, small differences look exaggerated.

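    • Be wary of graphs that use images in place of bars: when a picture is scaled in both width and height, its area grows faster than the value it represents, so differences look larger than they are.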
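A small arithmetic sketch shows how much each distortion can exaggerate a difference. The values 48 and 52 and the axis start of 45 are hypothetical, and `drawn_height_ratio` is an illustrative helper, not something from the lesson.

```python
def drawn_height_ratio(a, b, axis_start=0.0):
    """Ratio of drawn bar heights for values a and b when the axis starts at axis_start."""
    return (b - axis_start) / (a - axis_start)

# Hypothetical values for two categories.
a, b = 48, 52

# Axis starts at 0: bar heights are proportional to the values themselves.
honest = drawn_height_ratio(a, b)

# Axis starts at 45: the drawn bars have heights 3 and 7.
truncated = drawn_height_ratio(a, b, axis_start=45)

# Picture scaled in both width and height: area grows with the square of the height ratio.
area_ratio = honest ** 2

print(f"true ratio:           {honest:.2f}")
print(f"truncated-axis ratio: {truncated:.2f}")
print(f"scaled-image area:    {area_ratio:.2f}")
```

With these numbers the second value is about 8% larger than the first, but the truncated axis draws its bar more than twice as tall.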

  • Relationship insight:

    • Two variables are associated if knowing the value of one helps predict the value of the other (e.g., knowing a student's core class helps predict their elective).

    • Side-by-side and segmented bar graphs are two different visual representations of the same two-way table data.
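One way to look for an association is to compare the conditional distributions of elective across core classes, which is what the segments of a segmented bar graph show when each bar is scaled to 100%. The sketch below uses hypothetical counts and category names, and the `conditional` helper is illustrative only.

```python
# Hypothetical two-way table: rows are core classes, columns are electives.
counts = {
    "Math":    {"Art": 10, "Music": 30},
    "English": {"Art": 40, "Music": 20},
}

def conditional(row):
    """Turn one row of counts into proportions of that row's total."""
    row_total = sum(row.values())
    return {category: n / row_total for category, n in row.items()}

# Conditional distribution of elective within each core class.
elective_given_core = {core: conditional(row) for core, row in counts.items()}

# Largest gap between the two conditional distributions across electives.
largest_gap = max(
    abs(elective_given_core["Math"][e] - elective_given_core["English"][e])
    for e in counts["Math"]
)

for core, dist in elective_given_core.items():
    print(core, {e: round(p, 3) for e, p in dist.items()})
print("largest gap:", round(largest_gap, 3))
```

If the conditional distributions were nearly identical, knowing the core class would not help predict the elective. A large gap like the one in these made-up counts suggests an association, though with real data small gaps can arise by chance.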