Constructing Graphical and Tabular Displays of Data

Variable Classification

  • Categorical Variable: Consists of names or labels identifying groups of individuals.

    • Examples: Academic major, ZIP code of a home in Arkansas, and variables where numbers represent categories (e.g., 00 for man, 11 for woman).

  • Numerical Variable: Consists of measurable quantities.

    • Examples: Age of a hip-hop artist (e.g., Travis Scott at 2828 years or Kendrick Lamar at 3232 years) and maximum speed of a car (e.g., Toyota Camry at 114 mph114 \text{ mph} or Porsche 911 at 201 mph201 \text{ mph}).

Frequency and Distributions

  • Frequency: The number of observations in a specific category.

  • Frequency Table: Lists all categories and their respective frequencies.

  • Relative Frequency: The proportion of total observations belonging to a category, calculated as:

Relative Frequency=Frequency of CategoryTotal Number of Observations\text{Relative Frequency} = \frac{\text{Frequency of Category}}{\text{Total Number of Observations}}

  • Sum of Relative Frequencies: For any categorical variable, the sum of relative frequencies across all categories always equals 11.

  • Example Calculations (Movie Genres):

    • Action: Frequency of 66 out of 4242 observations leads to a relative frequency of 6/42=3/210.1436/42 = 3/21 \approx 0.143 (14.3%14.3\%).

    • Comedy: Largest relative frequency at 16/42=8/210.38116/42 = 8/21 \approx 0.381 (38.1%38.1\%).

Logical Grouping (AND and OR)

  • AND: Refers to the intersection of criteria. For November 2020, dates that are in the third week AND are Thursdays includes only the date 1919.

  • OR: Refers to the union of criteria. Dates in the third week OR Thursdays in November 2020 include: 55, 1212, 1515, 1616, 1717, 1818, 1919, 2020, 2121, and 2626.

Graphical Data Interpretation

  • Frequency Bar Graph: Displays categories on the horizontal axis and frequencies on the vertical axis with bars representing the count for each category.

  • Multiple Bar Graph: Used to compare proportions between different groups (e.g., men vs. women).

  • Political Survey Analysis (1960 adults):

    • Women identifying as Democrats: 0.370.37 proportion.

    • Men identifying as Independents: 0.430.43 (the greatest proportion for men).

    • Proportion vs. Count: A smaller proportion can represent a larger count if the total group size is larger. For example, a 0.380.38 proportion of 10811081 women (411411) is more individuals than a 0.430.43 proportion of 879879 men (378378).

  • Sampling Error: Findings from a sample (e.g., 24%24\% of men surveyed are Republicans) cannot be generalized as an exact percentage for the entire population (e.g., all American men) due to potential fluctuations in data.