Notes on Levels of Measurement, Frequency, and Graphical Representations
Range and Data Types
- Range definition: range = maximum value − minimum value in a dataset.
- Example: max = 100, min = 20 → range = 100 − 20 = 80.
- Represents the overall spread of the data (in units of the variable).
- Range applicability:
- Applies to numerical data where arithmetic makes sense.
- Not meaningful for data like addresses or identification numbers (e.g., ZIP codes, house numbers, phone numbers, Social Security numbers) because adding/subtracting these values has no real interpretation.
- Example where range makes sense: model year of a vehicle (e.g., 02/2010 to 02/2015 gives a 5-year range).
- Key takeaway: range depends on the type of data; some data are numerical but still not meaningful to perform arithmetic on (e.g., ZIP codes, phone numbers).
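The range calculation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `value_range` and the sample lists are assumptions, chosen so that the maximum is 100 and the minimum is 20 as in the example.

```python
def value_range(values):
    """Return max - min for a non-empty collection of numbers."""
    values = list(values)
    if not values:
        raise ValueError("range is undefined for an empty dataset")
    return max(values) - min(values)


# Hypothetical scores with max = 100 and min = 20, as in the example above.
scores = [20, 45, 67, 88, 100]
print(value_range(scores))  # 80

# Hypothetical ZIP codes: the subtraction runs, but the result has no
# real interpretation, which is the point made in the notes.
zip_codes = [90210, 10001, 60614]
meaningless = value_range(zip_codes)
```

The code will happily subtract ZIP codes; deciding whether the result means anything is still up to the analyst.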
Levels of Measurement (four levels)
- Purpose: classify each variable to determine which statistics and operations are appropriate.
- Four levels: nominal, ordinal, interval, ratio.
- Statistics applicability depends on level of measurement.
Nominal level (categorical, qualitative)
- Characteristics: categories with no inherent order.
- Examples: names, colors, favorite foods/bands, church affiliation, political affiliation.
- Also includes numbers that serve only as labels, so arithmetic on them is meaningless (e.g., ZIP codes, phone numbers, Social Security numbers).
- Important: no meaningful arithmetic or ranking among categories.
Ordinal level (categorical with order, but not equal intervals)
- Characteristics: categories with a meaningful order but not necessarily equal spacing.
- Examples with implied order:
- Never, Sometimes, Always (driving over the speed limit).
- Class year: Freshman, Sophomore, Junior, Senior.
- Rating scales: 1–5 stars.
- Caution: ordinal categories have a definite order, but the intervals between them may not be equal. Do not assume equal spacing (e.g., the gap between 1 and 2 stars need not match the gap between 4 and 5 stars).
Interval level (numerical with meaningful differences, no true zero)
- Characteristics: numeric values with a consistent unit; differences between values are meaningful.
- Zero point is arbitrary (not a true zero).
- Examples: temperature scales in Fahrenheit or Celsius (differences like 20° vs 10° have meaning; 0° does not mean 'no temperature').
- Implication: you can subtract values, but you cannot meaningfully form ratios (e.g., 100° is not twice as warm as 50°).
- Other example discussed: calendar years can be treated as interval (differences in years are meaningful, but ratios like 2015/2005 are not).
Ratio level (numerical with meaningful differences and a true zero)
- Characteristics: numeric values with a true zero; both differences and ratios are meaningful.
- Examples: tire pressure in pounds per square inch, counts of vehicles, height, weight, Kelvin temperature (absolute scale).
- Key concept: zero represents a true absence (e.g., 0 pounds of pressure means no air in the tire).
- Quotients make sense: if P1 = 30 and P2 = 15, then P1 is twice P2; but note some contexts (like Fahrenheit/Celsius) do not permit meaningful temperature ratios.
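To see why ratios fail on an interval scale, one can compare the naive Fahrenheit ratio with the ratio of the same two temperatures on the Kelvin scale, which has a true zero. The sketch below is an illustration only; the helper name `fahrenheit_to_kelvin` is an assumption, and the conversion is the standard formula.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to kelvins (standard conversion)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15


# Naive ratio on the interval scale: suggests "twice as warm".
naive_ratio = 100.0 / 50.0

# Ratio of the same temperatures on the ratio (Kelvin) scale.
kelvin_ratio = fahrenheit_to_kelvin(100.0) / fahrenheit_to_kelvin(50.0)

print(naive_ratio)             # 2.0
print(round(kelvin_ratio, 3))  # 1.098
```

On the absolute scale, 100°F is only about 10% warmer than 50°F, so the factor of 2 comes from where Fahrenheit places its zero.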
How to classify a variable (quick guidance)
- If math operations don’t make sense (and no meaningful order): nominal.
- If there is order but not necessarily equal intervals: ordinal.
- If numbers have meaningful differences but no true zero: interval.
- If numbers have meaningful differences and a true zero: ratio.
- Important note from the lecture: even when a variable is numeric, you must check whether the quotient (division) or zero has a meaningful interpretation before assigning ratio vs. interval.
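The quick guidance above amounts to a short chain of yes/no questions. A minimal sketch, assuming the analyst has already answered those questions for the variable; the function name and the three boolean parameters are hypothetical and do not come from the lecture.

```python
def classify_level(has_order, equal_intervals, true_zero):
    """Map three yes/no judgments about a variable to a level of measurement."""
    if not has_order:
        return "nominal"
    if not equal_intervals:
        return "ordinal"
    if not true_zero:
        return "interval"
    return "ratio"


# Examples taken from the notes, with the judgments filled in by hand.
print(classify_level(False, False, False))  # ZIP code -> nominal
print(classify_level(True, False, False))   # class year -> ordinal
print(classify_level(True, True, False))    # Celsius temperature -> interval
print(classify_level(True, True, True))     # tire pressure -> ratio
```

The function only encodes the order of the questions. Answering them correctly for a given variable is the part that takes judgment.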
Frequency and Frequency Tables
- Frequency (count): the number of individuals in the sample with a specific value of a variable.
- Example: number of juniors in a class.
- Frequency table (frequency distribution): lists each variable value and its frequency next to it.
- Example: favorite colors with counts: Red — 10, Green — 12, etc.
- Bar graphs vs histograms:
- Bar graph: used for categorical data; bars are separated by gaps to emphasize discreteness.
- Histogram: used for numerical (quantitative) data; bars touch each other to indicate continuous intervals; width represents the interval size.
- Bar graph scale: the choice of axis scale affects readability, and an inappropriate scale distorts the visual. Software may auto-scale to fill the graph; ensure the scale is meaningful.
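A frequency table like the favorite-colors example can be built from raw responses with `collections.Counter` from the Python standard library. The list of responses below is hypothetical, constructed only so that the counts match the two values given in the notes (Red 10, Green 12).

```python
from collections import Counter

# Hypothetical raw survey responses matching the counts in the example.
responses = ["Red"] * 10 + ["Green"] * 12

# Counter tallies how many times each value occurs.
frequency_table = Counter(responses)

for color, count in frequency_table.items():
    print(f"{color:<6} {count}")
```

Each printed row pairs a value of the variable with its frequency, which is the definition of a frequency table given above.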
Examples of Frequency Tables and Graphs
- Class standing example (categorical): Freshmen, Sophomores, Juniors, Seniors with frequencies 16, 15, 8, 6 (total 45).
- Relative frequency (proportion): for each category, f_i / N.
- Example: Freshmen relative frequency = 16/45 ≈ 0.3556 (≈ 35.56%).
- Cumulative frequency (ordinal/interval/ratio; not nominal): sum of frequencies up to and including a given value.
- Example: the cumulative frequency at 15 credits is the number of students with 15 credits or fewer.
- Ogive: the graph of the cumulative frequency curve.
- Relative frequency vs percent:
- Relative frequency can be expressed as a decimal or as a percent: % = 100 × (f_i / N).
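Cumulative frequencies are running totals, so they can be sketched with `itertools.accumulate`. The example reuses the class-standing counts from this section, which is reasonable because class standing is ordinal; the variable names are illustrative.

```python
from itertools import accumulate

categories = ["Freshmen", "Sophomores", "Juniors", "Seniors"]
frequencies = [16, 15, 8, 6]

# Running total: each entry is the sum of frequencies up to and including it.
cumulative = list(accumulate(frequencies))

for name, running_total in zip(categories, cumulative):
    print(f"{name:<11} {running_total}")

print(cumulative)  # [16, 31, 39, 45]
```

The last cumulative value equals the total N = 45, which is a quick check on the arithmetic. Plotting these running totals against the categories gives the ogive.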
Relative Frequency and Percentages (Worked Example)
- Given N = 45 with class frequencies: Freshmen 16, Sophomores 15, Juniors 8, Seniors 6.
- Relative frequencies:
- Freshmen: $\frac{16}{45} \approx 0.3556$
- Sophomores: $\frac{15}{45} = 0.3333\ldots$
- Juniors: $\frac{8}{45} \approx 0.1778$
- Seniors: $\frac{6}{45} = 0.1333\ldots$
- Percentages: Freshmen ≈ 35.56%, Sophomores ≈ 33.33%, Juniors ≈ 17.78%, Seniors ≈ 13.33%.
- Note on precision: some systems require exact fractions (e.g., 16/45) rather than a rounded decimal (0.3556). Some interfaces allow decimals; check instructions for rounding or whether fractions are accepted.
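The worked example can be reproduced with exact fractions and rounded percentages side by side. A minimal sketch using `fractions.Fraction` from the standard library; the dictionary names are illustrative. Note that `Fraction` reduces automatically (15/45 becomes 1/3), which may or may not be the form a given platform expects.

```python
from fractions import Fraction

frequencies = {"Freshmen": 16, "Sophomores": 15, "Juniors": 8, "Seniors": 6}
total = sum(frequencies.values())  # N = 45

# Exact relative frequencies f_i / N, kept as fractions.
exact = {name: Fraction(count, total) for name, count in frequencies.items()}

# Percentages 100 * f_i / N, rounded to two decimal places.
percent = {name: round(100 * count / total, 2) for name, count in frequencies.items()}

for name in frequencies:
    print(f"{name:<11} {exact[name]}  {float(exact[name]):.4f}  {percent[name]}%")
```

The exact fractions sum to 1, while the rounded percentages are approximations and may not sum to exactly 100.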
Practical Visualization and Interpretation Notes
- For numerical data on which arithmetic is meaningful, use histograms for frequency visualization (bars touching).
- For categorical data, use bar graphs (bars separated, with consistent y-axis scale).
- When constructing frequency distributions for wide numerical ranges (e.g., total earned credits), aggregate into intervals (e.g., 10–19, 20–29, etc.) to keep the graph readable.
- If there are gaps in data (no observations in an interval), there will be zero-height bars in the histogram.
- The width of the bars in a histogram carries meaning (the interval size), not just the height (frequency).
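Grouping a wide numerical range into equal-width intervals can be sketched without any plotting library. The credit values below are hypothetical, and the helper `bin_counts` with its parameters is an assumption made for illustration; the bins follow the 10–19, 20–29 pattern mentioned above.

```python
def bin_counts(values, start, width, n_bins):
    """Count integer values falling in consecutive bins of equal width."""
    counts = [0] * n_bins
    for v in values:
        index = (v - start) // width
        if 0 <= index < n_bins:  # values outside the bins are ignored
            counts[index] += 1
    return counts


# Hypothetical total earned credits for eight students.
credits = [12, 15, 18, 24, 27, 45, 47, 48]
width = 10
counts = bin_counts(credits, start=10, width=width, n_bins=4)

for i, count in enumerate(counts):
    low = 10 + i * width
    print(f"{low}-{low + width - 1}: {count}")

print(counts)  # [3, 2, 0, 3]
```

The 30–39 bin has a count of zero, which is the zero-height bar described above. It still occupies a full interval width on the axis.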
Homework and Data Interpretation Tips (as discussed)
- You will be asked to determine the level of measurement for different scenarios.
- You may be asked to construct or interpret frequency tables, relative frequencies, and cumulative frequencies.
- When converting to relative frequencies, you may convert to decimals or percentages; follow the instruction about precision and rounding.
- In Lumen or similar platforms, answers may require either exact fractions or decimal representations; know your platform’s rules and provide fractions when required.
- If a problem asks for a precise fraction like $\frac{f}{N}$, providing a decimal approximation may be marked incorrect if the platform expects the exact fraction.
- Always check whether a task calls for an ogive (cumulative frequency graph) or a simple histogram/bar graph.
- Range: $\text{range} = x_{\text{max}} - x_{\text{min}}$
- Relative frequency for value i: $f_i^{(\text{rel})} = \frac{f_i}{N}$
- Percentage for value i: $\text{Percent}_i = 100 \times \frac{f_i}{N}$
- Cumulative frequency up to value j: $F_j = \sum_{i \,:\, \text{value}_i \le \text{value}_j} f_i$
- Note on intervals (histogram): if data are numerical, group into intervals [a, b) or similar; the width is (b - a).
Summary of Key Points
- Data can be nominal, ordinal, interval, or ratio; this classification affects what statistics and operations are appropriate.
- The range is a simple spread measure applicable to numerical data with meaningful subtraction.
- Frequency concepts (count, relative frequency, percent, cumulative frequency) apply across levels, but cumulative frequency is not defined for nominal data.
- Visualizations (bar graphs vs histograms) depend on whether data are categorical or numerical; histogram bars touch to indicate continuous data.
- Exactness matters in data reporting (fractions vs decimals) and depends on assignment instructions in platforms like Lumen.