UQ Extend Module 4&5 - Measurement and Data Presentation

Levels of Measurement

Research uses four levels of measurement: nominal, ordinal, interval, and ratio. Each level is more precise than the one before it, and the level determines which statistical analyses are appropriate.
The way you measure a variable constrains the analyses you can perform, and therefore how effectively you can test hypotheses.

Qualitative vs Quantitative; Discrete vs Continuous; Dichotomous

Qualitative measurement captures attributes that don’t have meaningful numerical values (e.g., eye colour, political leanings, countries visited).
Quantitative measurement records numeric values where the numbers have meaning (e.g., height in cm).
When numbers are assigned to qualitative categories they act as labels only and do not imply magnitude (e.g., 1 = brown, 2 = blue).
Discrete variables can take only whole values (no intermediate values); e.g., number of chess pieces lost.
Dichotomous variables are a subset of discrete variables with exactly two possible values (e.g., coin toss: heads or tails).
Continuous variables can take an infinite number of values between any two points (e.g., millilitres of milk, grams of flour).

The four levels of measurement with the swimming race example

Nominal (qualitative): record whether each swimmer finished the race (Yes/No). This is a dichotomous nominal variable.
Ordinal: record placings (1st, 2nd, 3rd, …). There is order, but the intervals between places are not known.
Interval: equal, meaningful intervals between scale points but no true zero (e.g., a swimmer's time in seconds relative to the club record, where a negative value is possible and zero means equalling the record, not an absence of time).
Ratio: has a meaningful absolute zero, allowing meaningful ratios between values (e.g., actual swim times in seconds). Zero means absence of the quantity.
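
As a minimal sketch of how one set of ratio-level data can be re-expressed at each lower level, the Python below uses hypothetical swim times and a hypothetical club record (neither comes from the notes). A swimmer who did not finish is recorded as None.

```python
# Hypothetical data: swim times in seconds (ratio level); None = did not finish.
times = {"Ana": 58.2, "Ben": 61.5, "Cai": 59.9, "Dee": None}
CLUB_RECORD = 59.0  # hypothetical club record in seconds

# Nominal (dichotomous): did each swimmer finish?
finished = {name: t is not None for name, t in times.items()}

# Ordinal: placings among finishers (order only; the gaps between places are lost).
finishers = sorted((n for n in times if times[n] is not None), key=lambda n: times[n])
placings = {name: place for place, name in enumerate(finishers, start=1)}

# Interval: seconds relative to the club record (negative = beat the record).
vs_record = {n: round(times[n] - CLUB_RECORD, 1) for n in finishers}

print(finished)
print(placings)
print(vs_record)
```

Each step throws information away: the times give the placings, but the placings alone could not be turned back into times.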

What counts as a measurement in practice

Measurement can be self-reported, behavioural, or physiological.
Self-report: what people say about themselves (surveys, questionnaires, interviews).
Behavioural: what participants actually do (e.g., counts of aggressive acts, reaction time).
Physiological: changes in physiological activity (heart rate, hormone levels, brain blood flow).

Key issues in measurement: reliability and validity

Reliability: consistency or stability of a measurement across time or raters.
- Example: a ruler giving consistent length measurements (test-retest reliability).
- Poor reliability: measurements with high random variability (e.g., using a rubbery ruler).
- Types include: test-retest reliability, internal consistency, and interrater reliability.
Validity: whether a measure actually measures what it is intended to measure (face validity, predictive validity, construct validity).
- Face validity: the measure appears to measure what it should (expert judgment).
- Predictive validity: how well a measurement predicts a criterion (e.g., ATAR predicting university performance).
- Construct validity: the measure aligns with theoretical concepts (e.g., IQ tests and intelligence).
All measures can be evaluated for reliability and validity.
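
The notes do not give a formula for reliability. One common way to quantify test-retest reliability is the Pearson correlation between two administrations of the same measure to the same people. The sketch below assumes that approach; the function name pearson_r and the scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and the two sums of squares.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical scores for the same five people tested on two occasions.
time_1 = [10, 12, 14, 16, 18]
time_2 = [11, 12, 15, 15, 19]

r = pearson_r(time_1, time_2)
print(f"test-retest r = {r:.2f}")
```

A value close to 1 suggests that people keep roughly the same relative standing across the two occasions; a value near 0 suggests the scores are dominated by random variability, as with the rubbery ruler.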

How data are organized and displayed

Tables and graphs help summarize data visually.
Frequency table: rows for each possible value; tallies show how many times each score occurs; final frequencies show counts.
Grouped frequency table: collapse data into equal-width intervals to manage wide ranges; aim for roughly 10–20 intervals.
Stem-and-leaf plot: a compromise between tables and graphs showing stems (higher units) and leaves (lower units).
Box-and-whisker plot: shows range (min to max), interquartile range (25th to 75th percentile), and median (50th percentile).
Bar graph: qualitative data (nominal) with non-touching bars; X axis lists categories, Y axis shows frequencies.
Histogram: quantitative data (interval/ratio) with touching bars; grouped histograms extend to grouped data.
Frequency polygon: line graph version of a histogram, useful for overlaying distributions (e.g., actual vs ideal weights by gender).
Choice of graph depends on data type and whether grouping is used; grouping can simplify wide ranges but reduces precision.
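
A simple and a grouped frequency table can be sketched in a few lines of Python. The scores, the interval width of 5, and the starting value of 10 are all hypothetical and chosen for brevity (with so few scores the 10–20 interval guideline does not apply); the helper grouped_frequencies is illustrative and assumes whole-number scores.

```python
from collections import Counter

# Hypothetical whole-number scores (not from the notes).
scores = [12, 15, 15, 18, 21, 21, 21, 24, 27, 33]

# Simple frequency table: value | tally | frequency.
# Only observed values are listed here; unobserved values would have frequency 0.
freq = Counter(scores)
for value in sorted(freq):
    print(f"{value:>3} | {'/' * freq[value]:<5} | {freq[value]}")

def grouped_frequencies(data, width, start):
    """Count scores in equal-width intervals start..start+width-1, and so on."""
    groups = Counter((x - start) // width for x in data)
    table = {}
    for k in range(max(groups) + 1):
        low = start + k * width
        table[(low, low + width - 1)] = groups.get(k, 0)
    return table

grouped = grouped_frequencies(scores, width=5, start=10)
for (low, high), count in grouped.items():
    print(f"{low}-{high}: {count}")
```

The grouped table is shorter, but it no longer shows which exact scores fell inside each interval.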

Percentiles and grouped distributions

Percentile: the percent of scores at or below a given score in the dataset.
Notation: n = number of observations; SF = simple frequency; CF = cumulative frequency.
Formula (for a specific score): \text{Percentile} = \frac{CF}{n} \times 100.
To compute percentiles for a single score, rank data, compute SF and CF, then apply the formula.
For grouped distributions, percentile refers to the percentage of scores at or below the highest score in a group.
Example steps: rank data; compute SF and CF for each group; compute each group's percentile as \frac{CF}{n} \times 100, interpreted as the percentage of scores at or below the group's highest score.
Example values and steps are provided in the notes (e.g., computing percentiles for a dataset with n = 20 to obtain a percentile of 65 for a score of 14).
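
The single-score procedure (rank, compute SF and CF, apply the formula) can be sketched as follows. The dataset is hypothetical, with n = 10; it is not the n = 20 example from the notes, and percentile_table is an illustrative name.

```python
from collections import Counter

def percentile_table(data):
    """Return {score: (SF, CF, percentile)} with percentile = CF / n * 100."""
    n = len(data)
    sf = Counter(data)          # simple frequency of each score
    table, cf = {}, 0
    for score in sorted(sf):    # rank scores from lowest to highest
        cf += sf[score]         # cumulative frequency: scores at or below this one
        table[score] = (sf[score], cf, cf * 100 / n)
    return table

# Hypothetical dataset, n = 10.
scores = [8, 10, 10, 11, 12, 12, 12, 14, 15, 17]
table = percentile_table(scores)
for score, (sf, cf, pct) in table.items():
    print(f"score={score:>2}  SF={sf}  CF={cf:>2}  percentile={pct:.0f}")
```

Here a score of 12 has CF = 7, so it sits at the 70th percentile: 70% of the scores are at or below 12.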

Percentiles in grouped data: practical interpretation

When using grouped data, the percentile for a group (e.g., 25–29) is the percentage of scores at or below the highest score in that group (here, 29).
Total n remains the number of observations; CF/n gives the proportion, multiplied by 100 gives the percentile.
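
For grouped data the same calculation is applied to each interval. The sketch below assumes whole-number scores and equal-width groups; the data (n = 20), the width of 5, and the function name grouped_percentiles are hypothetical.

```python
def grouped_percentiles(data, width, start):
    """Return {(low, high): (SF, CF, percentile)} for equal-width groups."""
    n = len(data)
    n_groups = (max(data) - start) // width + 1
    sf = [0] * n_groups
    for x in data:
        sf[(x - start) // width] += 1      # simple frequency per group
    table, cf = {}, 0
    for k, count in enumerate(sf):
        cf += count                        # scores at or below this group's top score
        low = start + k * width
        table[(low, low + width - 1)] = (count, cf, cf * 100 / n)
    return table

# Hypothetical whole-number scores, n = 20.
scores = [16, 17, 18, 20, 21, 22, 22, 23, 24, 25,
          25, 26, 27, 28, 29, 30, 31, 32, 33, 34]
table = grouped_percentiles(scores, width=5, start=15)
for (low, high), (sf, cf, pct) in table.items():
    print(f"{low}-{high}: SF={sf}  CF={cf:>2}  percentile={pct:.0f}")
```

In this made-up dataset the 25–29 group has CF = 15, so 75% of scores are at or below 29.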

Measures of central tendency

Mode: most frequently occurring score; in a distribution, the tallest bar in a histogram or the peak of a frequency polygon.
Median: middle value when data are ordered; for odd n, the middle value; for even n, the average of the two middle values.
Mean (average): \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.
Examples:
- A small set {3, 4, 4, 5} has mean 4 and median 4.
- In a skewed dataset, mean and median are pulled toward the tail; they can diverge from the mode.
Mean as balancing point: if you imagine a pile of numbers, the mean is the point where the distribution balances; extreme values pull the mean toward them.
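
A short Python check of the three measures, using the small set {3, 4, 4, 5} from the examples above. The second list adds one hypothetical extreme score (44) to illustrate how the mean is pulled toward it while the median and mode stay put.

```python
from statistics import mean, median, mode

small = [3, 4, 4, 5]           # the small set from the notes
with_extreme = small + [44]    # hypothetical extreme score added for illustration

for label, data in [("small", small), ("with extreme", with_extreme)]:
    print(f"{label}: mode={mode(data)}, median={median(data)}, mean={mean(data)}")
```

With the extreme score included the mean rises from 4 to 12, but the median and mode are both still 4.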

Measures of variability

Range: difference between the highest and lowest score; simplest measure but sensitive to extreme values.
- Example: Range = max − min.
Variance: average of squared deviations from the mean; measures how spread out the data are.
- Deviation score: d_i = x_i - \bar{x}. The sum of deviations is zero, so we square them to compute variability.
- Sum of squares: SS = \sum_{i=1}^{n} (x_i - \bar{x})^2.
- Population variance: \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2.
- Sample variance: s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2.
Standard deviation: square root of variance; puts variability in the same units as the data.
- Population SD: \sigma = \sqrt{\sigma^2}.
- Sample SD: s = \sqrt{s^2}.
Notes:
- Variance is in squared units, which can be unintuitive; SD provides a more interpretable measure in original units.
- Examples in the notes show how variance can differ across groups (e.g., Class A variance = 81 vs Class B variance = 225).
Key concept: range is simple but sensitive to extremes; variance uses all scores; standard deviation is the intuitive spread in original units.
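
The formulas in this section can be traced step by step in Python. The scores below are hypothetical, and describe_spread is an illustrative helper that treats the same list first as a whole population (divide by n) and then as a sample (divide by n - 1).

```python
from math import sqrt

def describe_spread(scores):
    """Compute range, SS, variance, and SD from the definitions above."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]   # d_i = x_i - mean
    ss = sum(d ** 2 for d in deviations)      # sum of squares
    return {
        "range": max(scores) - min(scores),
        "sum_of_deviations": sum(deviations), # zero, which is why we square
        "SS": ss,
        "population_variance": ss / n,
        "sample_variance": ss / (n - 1),
        "population_sd": sqrt(ss / n),
        "sample_sd": sqrt(ss / (n - 1)),
    }

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5
result = describe_spread(scores)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

For these scores SS = 32, so the population variance is 4 and the population SD is 2, while the sample variance (32 / 7) is slightly larger because of the n - 1 divisor.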

Shape of distributions: skewness and kurtosis

Normal curve: bell-shaped, symmetric, with tails extending to extremes; peak is the mode when symmetric.
Skewness: measure of asymmetry, that is, how far a distribution departs from symmetry and in which direction.
- Positive skew: right tail longer; mean and median are dragged toward the right tail; example: income distributions often positively skewed due to a few very high incomes.
- Negative skew: left tail longer; mean and median are dragged toward the left tail; example: exam scores with a cluster at the high end and a few very low scores.
Kurtosis: measure of how peaked or flat a distribution is, and how heavy its tails are.
- Leptokurtic: tall, narrow peak; scores concentrated in a narrow range, tails may extend far.
- Platykurtic: flat or plateau-like distribution; more spread-out around the center.
Relationship with measures of central tendency:
- In symmetric distributions, mean = median = mode.
- For skewed distributions, the mean and median move toward the tail while the mode stays at the highest point. If the mode is the smallest of the three measures (mode < median < mean), positive skew is indicated; if the mode is the largest (mean < median < mode), negative skew is indicated.
These shape characteristics influence the appropriateness of statistical techniques and interpretations.
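
The relationship between skew and the measures of central tendency can be checked on small made-up datasets. The sketch below is only a rule of thumb based on comparing the mean with the median; it is not a formal skewness statistic, and the function name skew_direction and all three datasets are hypothetical.

```python
from statistics import mean, median, mode

def skew_direction(scores):
    """Rule of thumb: the mean is pulled further toward the tail than the median."""
    m, md = mean(scores), median(scores)
    if m > md:
        return "positive"
    if m < md:
        return "negative"
    return "approximately symmetric"

right_tailed = [1, 1, 1, 2, 2, 3, 4, 8, 14]       # a few very high scores
left_tailed = [15 - x for x in right_tailed]      # mirror image: a few very low scores
symmetric = [1, 2, 3, 3, 3, 4, 5]

for name, data in [("right_tailed", right_tailed),
                   ("left_tailed", left_tailed),
                   ("symmetric", symmetric)]:
    print(f"{name}: mode={mode(data)}, median={median(data)}, "
          f"mean={mean(data)}, skew={skew_direction(data)}")
```

In the right-tailed data the order is mode (1) < median (2) < mean (4); in the mirrored data the order reverses to mean (11) < median (13) < mode (14).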

Putting it together: practical aspects for research design

Use as high a level of measurement as possible because you can convert downwards but not upwards (e.g., you can convert times to placings, but you cannot recover precise times from placings).
The level of measurement affects which analyses are permissible and meaningful.
In psychology, measurements include self-report, behavioural, and physiological, each with reliability and validity considerations.

Quick reference to formulas and notation

Mean: \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
Range: \text{Range} = \max(x_i) - \min(x_i)
Deviation score: d_i = x_i - \bar{x}
Sum of squares: SS = \sum_{i=1}^{n} (x_i - \bar{x})^2
Variance (population): \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
Variance (sample): s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
Standard deviation (population): \sigma = \sqrt{\sigma^2}
Standard deviation (sample): s = \sqrt{s^2}
Percentile: \text{Percentile} = \frac{CF}{n} \times 100 where CF is the cumulative frequency and n is the sample size.
Box-and-whisker: key components are minimum, first quartile (Q1, 25th percentile), median (Q2, 50th), third quartile (Q3, 75th), maximum; interquartile range is IQR = Q3 - Q1
Note: all formulas use standard notation: n for sample size, N for population size, x_i for individual scores, \mu for population mean, and \bar{x} for sample mean.
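
The box-and-whisker components listed above can be computed with Python's statistics module. This is a sketch under one assumption worth flagging: quartile conventions differ between textbooks and software, and the "inclusive" method used here may give slightly different Q1 and Q3 values from a hand method taught in class. The scores and the name five_number_summary are hypothetical.

```python
from statistics import quantiles

def five_number_summary(scores):
    """Minimum, Q1, median, Q3, maximum, plus IQR = Q3 - Q1."""
    # n=4 asks for the three cut points that split the data into quarters.
    q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
    return {
        "min": min(scores),
        "Q1": q1,
        "median": q2,
        "Q3": q3,
        "max": max(scores),
        "IQR": q3 - q1,
    }

scores = [2, 4, 4, 5, 7, 9, 10, 12, 15]  # hypothetical scores
summary = five_number_summary(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For these nine scores the box would run from Q1 = 4 to Q3 = 10 (IQR = 6), with the median line at 7 and whiskers reaching 2 and 15.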

Connections to examples from the transcript

Nominal example: finish vs not finish in a swimming race (dichotomous nominal).
Ordinal example: placing in a swimming race (1st, 2nd, 3rd) with no information about time gaps.
Interval example: seconds slower relative to club record; negative values possible when beating the record.
Ratio example: actual swim times in seconds; zero represents no time elapsed, so ratios are meaningful (e.g., 120 s is twice as long as 60 s).
Reliability examples: test-retest reliability, internal consistency, and interrater reliability; validity examples include face validity, predictive validity (ATAR), and construct validity (IQ tests).
Data organization examples: standard and grouped frequency tables, stem-and-leaf plots, box-and-whisker plots, bar graphs, histograms, and frequency polygons; use grouping to handle wide data ranges but accept loss of precise scores within groups.
Percentile example: computation steps and interpretation for a dataset, illustrating both individual-score and grouped-percentile computations.
Central tendency and variability: discussion of when to use mode, median, or mean, and how variability (range, variance, standard deviation) describes how spread out the data are, with emphasis on how extreme scores affect the mean but not the median.

Note: These notes summarize the key ideas, definitions, examples, and formulas from the transcript to support study and exam preparation. Be sure to understand how to apply each concept to data sets and to select appropriate analyses based on the level of measurement and distribution characteristics.