Notes on Levels of Measurement, Frequency, and Graphical Representations

Range and Data Types

  • Range definition: range = maximum value − minimum value in a dataset.
    • Example: max = 100, min = 20 → range = 100 − 20 = 80.
    • Represents the overall spread of the data (in units of the variable).
  • Range applicability:
    • Applies to numerical data where arithmetic makes sense.
    • Not meaningful for data like addresses or identification numbers (e.g., ZIP codes, house numbers, phone numbers, Social Security numbers) because adding/subtracting these values has no real interpretation.
    • Example where range makes sense: model year of a vehicle (e.g., 02/2010 to 02/2015 gives a 5-year range).
  • Key takeaway: range depends on the type of data; some data are numerical but still not meaningful to perform arithmetic on (e.g., ZIP codes, phone numbers).
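
The range definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `data_range` and the sample scores are assumptions, chosen so the minimum and maximum match the 20 and 100 used in the example.

```python
def data_range(values):
    """Return maximum minus minimum for a non-empty collection of numbers."""
    if not values:
        raise ValueError("range is undefined for an empty dataset")
    return max(values) - min(values)


# Hypothetical dataset whose minimum is 20 and maximum is 100.
scores = [20, 55, 73, 100]
print(data_range(scores))  # 80
```

The function will happily subtract ZIP codes too; deciding whether the result means anything is still up to the analyst.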

Levels of Measurement (four levels)

  • Purpose: classify each variable to determine which statistics and operations are appropriate.
  • Four levels: nominal, ordinal, interval, ratio.
  • Statistics applicability depends on level of measurement.

Nominal level (categorical, qualitative)

  • Characteristics: categories with no inherent order.
  • Examples: names, colors, favorite foods/bands, church affiliation, political affiliation.
  • Also includes numbers that serve as labels rather than quantities, so math operations on them are not meaningful (e.g., ZIP codes, phone numbers, Social Security numbers).
  • Important: no meaningful arithmetic or ranking among categories.

Ordinal level (categorical with order, but not equal intervals)

  • Characteristics: categories with a meaningful order but not necessarily equal spacing.
  • Examples with implied order:
    • Never, Sometimes, Always (driving over the speed limit).
    • Class year: Freshman, Sophomore, Junior, Senior.
    • Rating scales: 1–5 stars.
  • Caution: the categories have a definite order, but the intervals between them may not be equal, so do not assume equal spacing when comparing categories.

Interval level (numerical with meaningful differences, no true zero)

  • Characteristics: numeric values with a consistent unit; differences between values are meaningful.
  • Zero point is arbitrary (not a true zero).
  • Examples: temperature scales in Fahrenheit or Celsius (differences like 20° vs 10° have meaning; 0° does not mean 'no temperature').
  • Implication: you can subtract values, but you cannot meaningfully form ratios (e.g., 100° is not twice as warm as 50°).
  • Other example discussed: calendar years can be treated as interval (differences in years are meaningful, but ratios like 2015/2005 are not).

Ratio level (numerical with meaningful differences and a true zero)

  • Characteristics: numeric values with a true zero; both differences and ratios are meaningful.
  • Examples: tire pressure in pounds per square inch, counts of vehicles, height, weight, Kelvin temperature (absolute scale).
  • Key concept: zero represents a true absence (e.g., 0 pounds of pressure means no air in the tire).
  • Quotients make sense: if P1 = 30 psi and P2 = 15 psi, then P1 is twice P2. By contrast, interval scales such as Fahrenheit and Celsius do not permit meaningful ratios.
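
One way to see why temperature ratios fail on interval scales is to convert the same two temperatures between scales and compare the quotients. The sketch below is illustrative only; the function names and the two sample temperatures are assumptions, and the conversion formulas are the standard ones.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


def fahrenheit_to_kelvin(f):
    """Convert degrees Fahrenheit to kelvins (absolute scale)."""
    return fahrenheit_to_celsius(f) + 273.15


hot_f, cool_f = 100.0, 50.0  # sample temperatures in Fahrenheit

ratio_f = hot_f / cool_f
ratio_c = fahrenheit_to_celsius(hot_f) / fahrenheit_to_celsius(cool_f)
ratio_k = fahrenheit_to_kelvin(hot_f) / fahrenheit_to_kelvin(cool_f)

print(ratio_f)            # 2.0
print(round(ratio_c, 2))  # about 3.78
print(round(ratio_k, 2))  # about 1.1
```

The same pair of temperatures gives a different "ratio" on each scale with an arbitrary zero, which is why only the Kelvin quotient has a physical interpretation.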

How to classify a variable (quick guidance)

  • If math operations don’t make sense (and no meaningful order): nominal.
  • If there is order but not necessarily equal intervals: ordinal.
  • If numbers have meaningful differences but no true zero: interval.
  • If numbers have meaningful differences and a true zero: ratio.
  • Important note from the lecture: even when a variable is numeric, you must check whether the quotient (division) or zero has a meaningful interpretation before assigning ratio vs. interval.
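
The quick guidance above amounts to a short chain of yes/no questions. A minimal sketch, assuming the analyst has already answered each question for the variable; the function name and its three boolean parameters are hypothetical, not from the lecture.

```python
def level_of_measurement(has_order, meaningful_differences, true_zero):
    """Classify a variable from three yes/no answers about its values."""
    if not has_order:
        return "nominal"
    if not meaningful_differences:
        return "ordinal"
    if not true_zero:
        return "interval"
    return "ratio"


print(level_of_measurement(False, False, False))  # ZIP code -> nominal
print(level_of_measurement(True, False, False))   # class year -> ordinal
print(level_of_measurement(True, True, False))    # degrees Celsius -> interval
print(level_of_measurement(True, True, True))     # tire pressure -> ratio
```

The code only encodes the decision order; the judgment calls (does zero mean "none"? do differences mean anything?) still have to be made by hand.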

Frequency and Frequency Tables

  • Frequency (count): the number of individuals in the sample with a specific value of a variable.
    • Example: number of juniors in a class.
  • Frequency table (frequency distribution): lists each variable value and its frequency next to it.
    • Example: favorite colors with counts: Red — 10, Green — 12, etc.
  • Bar graphs vs histograms:
    • Bar graph: used for categorical data; bars are separated by gaps to emphasize discreteness.
    • Histogram: used for numerical (quantitative) data; bars touch each other to indicate contiguous intervals; bar width represents the interval size.
    • Scale: the choice of axis scale affects readability, and an inappropriate scale distorts the visual. Software may auto-scale to fill the graph; check that the resulting scale is meaningful.
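
A frequency table can be built by counting how often each value occurs. The sketch below uses `collections.Counter` from the Python standard library; the function name `frequency_table` and the survey responses are invented for illustration and are not the counts from the lecture.

```python
from collections import Counter


def frequency_table(observations):
    """Map each distinct value to the number of times it occurs."""
    return dict(Counter(observations))


# Hypothetical survey responses, invented for illustration.
responses = ["Red", "Green", "Red", "Blue", "Green", "Red"]
table = frequency_table(responses)

for value, count in table.items():
    print(f"{value:<6} {count}")
```

Each row of the printed table pairs a value with its frequency, which is the same information a bar graph would show as bar heights.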

Examples of Frequency Tables and Graphs

  • Class standing example (categorical): Freshmen, Sophomores, Juniors, Seniors with frequencies 16, 15, 8, 6 (total 45).
  • Relative frequency (proportion): for each category, f_i / N.
    • Example: Freshmen relative frequency = 16/45 ≈ 0.3556 (≈ 35.56%).
  • Cumulative frequency (ordinal/interval/ratio; not nominal): sum of frequencies up to and including a given value.
    • Example: for earned credits, the cumulative frequency at 15 is the number of students with 15 credits or fewer.
    • Ogive: the graph of the cumulative frequency curve.
  • Relative frequency vs percent:
    • Relative frequency can be expressed as a decimal or as a percent: % = 100 × (f_i / N).
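
Cumulative frequencies are running totals taken in category order. A minimal sketch using the class standing counts from this section and `itertools.accumulate` from the standard library; the variable names are assumptions.

```python
from itertools import accumulate

# Class standing counts from the example, listed in their natural order.
categories = ["Freshman", "Sophomore", "Junior", "Senior"]
frequencies = [16, 15, 8, 6]

# Running total: each entry is the count up to and including that category.
cumulative = list(accumulate(frequencies))

for name, f, cf in zip(categories, frequencies, cumulative):
    print(f"{name:<10} {f:>3} {cf:>3}")
```

The last cumulative value always equals the total N (here 45), which is a quick check on the arithmetic. Plotting the cumulative values against the ordered categories gives the points of an ogive.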

Relative Frequency and Percentages (Worked Example)

  • Given N = 45 with class frequencies: Freshmen 16, Sophomores 15, Juniors 8, Seniors 6.
    • Relative frequencies:
      • Freshmen: \frac{16}{45} \approx 0.3556
      • Sophomores: \frac{15}{45} = 0.3333…
      • Juniors: \frac{8}{45} \approx 0.1778
      • Seniors: \frac{6}{45} = 0.1333…
    • Percentages: Freshmen ≈ 35.56%, Sophomores ≈ 33.33%, Juniors ≈ 17.78%, Seniors ≈ 13.33%.
  • Note on precision: some systems require exact fractions (e.g., 16/45) rather than a rounded decimal (0.3556). Some interfaces allow decimals; check instructions for rounding or whether fractions are accepted.
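
The difference between an exact fraction and a rounded decimal can be made concrete with Python's `fractions.Fraction`. This is a sketch under the assumption that two-decimal percentages are wanted; the variable names are hypothetical. Note that `Fraction` reduces automatically, so 15/45 is displayed as 1/3.

```python
from fractions import Fraction

# Class standing counts from the worked example.
frequencies = {"Freshmen": 16, "Sophomores": 15, "Juniors": 8, "Seniors": 6}
n = sum(frequencies.values())  # total N = 45

# Exact relative frequencies (reduced to lowest terms by Fraction).
exact = {name: Fraction(f, n) for name, f in frequencies.items()}

# Rounded percentages, assuming two decimal places are requested.
percent = {name: round(100 * f / n, 2) for name, f in frequencies.items()}

for name in frequencies:
    print(f"{name:<10} {str(exact[name]):>5} {percent[name]:>6}%")
```

The exact fractions sum to exactly 1, while the rounded percentages may not sum to exactly 100; that is one reason to keep the fraction form when a platform asks for it.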

Practical Visualization and Interpretation Notes

  • For numerical data that supports math, use histograms for frequency visualization (bars touching).
  • For categorical data, use bar graphs (bars separated, with consistent y-axis scale).
  • When constructing frequency distributions for wide numerical ranges (e.g., total earned credits), aggregate into intervals (e.g., 10–19, 20–29, etc.) to keep the graph readable.
  • If there are gaps in data (no observations in an interval), there will be zero-height bars in the histogram.
  • The width of the bars in a histogram carries meaning (the interval size), not just the height (frequency).
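
Grouping a wide numerical range into equal-width intervals can be sketched as follows. The function name `bin_counts`, the width of 10, and the credit totals are assumptions for illustration; the code also assumes integer values. Intervals with no observations are kept with a count of zero, matching the zero-height bars described above.

```python
from collections import Counter

WIDTH = 10  # assumed interval size, as in 10-19, 20-29, ...


def bin_counts(values, width=WIDTH):
    """Count integer values per interval [lo, lo + width), keeping empty intervals."""
    if not values:
        raise ValueError("cannot bin an empty dataset")
    # Map each value to the lower edge of its interval, then count.
    counts = Counter((v // width) * width for v in values)
    first, last = min(counts), max(counts)
    return {lo: counts.get(lo, 0) for lo in range(first, last + width, width)}


# Hypothetical earned-credit totals, invented for illustration.
credits = [12, 15, 18, 23, 27, 45, 48, 49]
table = bin_counts(credits)

for lo, count in table.items():
    print(f"{lo}-{lo + WIDTH - 1}: {'#' * count}")
```

With this sample the 30-39 interval prints with no marks, which is the text equivalent of a gap in the histogram.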

Homework and Data Interpretation Tips (as discussed)

  • You will be asked to determine the level of measurement for different scenarios.
  • You may be asked to construct or interpret frequency tables, relative frequencies, and cumulative frequencies.
  • When converting to relative frequencies, you may convert to decimals or percentages; follow the instruction about precision and rounding.
  • In Lumen or similar platforms, answers may require either exact fractions or decimal representations; know your platform’s rules and provide fractions when required.
  • If a problem asks for a precise fraction like \frac{f}{N}, providing a decimal approximation may be marked incorrect if the platform expects the exact fraction.
  • Always check whether a task calls for an ogive (cumulative frequency graph) or a simple histogram/bar graph.

Quick Reference Formulas (LaTeX)

  • Range: \text{range} = x_{\text{max}} - x_{\text{min}}
  • Relative frequency for value i: f_i^{(\text{rel})} = \frac{f_i}{N}
  • Percentage for value i: \text{Percent}_i = 100 \times \frac{f_i}{N}
  • Cumulative frequency up to value j: F_j = \sum_{i \,:\, \text{value}_i \le \text{value}_j} f_i
  • Note on intervals (histogram): if data are numerical, group into intervals [a, b) or similar; the width is (b - a).

Summary of Key Points

  • Data can be nominal, ordinal, interval, or ratio; this classification affects what statistics and operations are appropriate.
  • The range is a simple spread measure applicable to numerical data with meaningful subtraction.
  • Frequency concepts (count, relative frequency, percent, cumulative frequency) apply across levels, but cumulative frequency is not defined for nominal data.
  • Visualizations (bar graphs vs histograms) depend on whether data are categorical or numerical; histogram bars touch to indicate continuous data.
  • Exactness matters in data reporting (fractions vs decimals) and depends on assignment instructions in platforms like Lumen.