Ch 4 definitions: Measurement Scales, Distributions, and Standard Scores

Level of Measurement and Related Concepts (Ch 4)

  • Level of measurement describes the relationship among the numbers assigned to information; it governs what mathematical operations and statistics are appropriate.

Nominal level

  • Numbers or codes are used to label or categorize data only; they do not represent magnitude or distance.

  • Examples: labels such as color categories, gender codes, or other category labels (e.g., 1 = red, 2 = blue, 3 = green).

  • Arithmetic operations are not meaningful beyond counting frequencies; cannot meaningfully compute sums or averages.

  • Common statistics/tests used: frequency distributions, mode, chi-square tests.
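
The nominal-level points above can be illustrated with a minimal sketch. The color codes and the data are hypothetical, chosen only to match the 1 = red, 2 = blue, 3 = green example; counting and the mode are meaningful, while the mean of the codes is not.

```python
from collections import Counter

# Hypothetical nominal codes (assumed for illustration): 1 = red, 2 = blue, 3 = green
labels = {1: "red", 2: "blue", 3: "green"}
color_codes = [1, 2, 2, 3, 2, 1, 3, 2, 3]

# Frequency distribution: how many observations fall in each category
frequencies = Counter(labels[code] for code in color_codes)

# Mode: the most frequent category and its count
mode_label, mode_count = frequencies.most_common(1)[0]

print(dict(frequencies))       # counts per category
print(mode_label, mode_count)  # blue 4

# The mean of the codes can be computed, but it names no category:
# 19 / 9 is about 2.11, which is neither red, blue, nor green.
mean_of_codes = sum(color_codes) / len(color_codes)
```

Recoding the categories (say, 1 = green, 3 = red) would change the mean of the codes but not the frequencies or the mode, which is why only the latter are appropriate here.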

Ordinal level

  • Data are ranked or ordered; the order conveys relative position (which is higher or lower).

  • Distances between adjacent ranks are not necessarily equal; the gap between ranks 1 and 2 may differ from the gap between ranks 2 and 3.

  • Example: rank-order preferences (e.g., 1 = most preferred, 5 = least preferred) or color orderings (e.g., 1–6 in rank order).

  • Suitable statistics: mode, median, percentiles; non-parametric tests are typical. Because equal intervals cannot be assumed, means are not meaningful.
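
As a rough sketch of why the median suits ordinal data, the snippet below uses hypothetical preference ranks. With an even number of observations the ordinary median averages the two middle ranks, which presumes equal spacing; `statistics.median_low` and `statistics.median_high` return actual ranks instead.

```python
import statistics

# Hypothetical preference ranks from eight respondents
# (1 = most preferred, 5 = least preferred)
ranks = [3, 1, 5, 2, 4, 4, 2, 5]

# Sorted: 1, 2, 2, 3, 4, 4, 5, 5 -> the two middle values are 3 and 4
low_median = statistics.median_low(ranks)    # 3, an observed rank
high_median = statistics.median_high(ranks)  # 4, an observed rank
averaged_median = statistics.median(ranks)   # 3.5, assumes ranks 3 and 4 are equally spaced

print(low_median, high_median, averaged_median)
```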

Interval level

  • Data are rank-ordered with equal intervals between adjacent values; the difference between values is meaningful and constant.

  • However, there is no true zero point; zero does not indicate the absence of the attribute.

  • Example: temperature in Celsius or Fahrenheit; health scales like 1 = very poor, 5 = excellent (assuming equal intervals).

  • Statistical operations: mean, median, mode are meaningful; measures of variability such as standard deviation are meaningful; parametric tests (e.g., t-tests) are applicable.

Ratio level

  • Has all the properties of interval data plus a true zero point (absence of the attribute).

  • Allows meaningful comparisons of absolute magnitudes and ratios.

  • Examples: height, weight, reaction time, counts.

  • Enables all arithmetic operations, including proportions and ratios.

  • In statistics, often used with proportion calculations and the product-moment correlation.
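
One way to see the interval/ratio distinction is to compare Celsius (interval, arbitrary zero) with Kelvin (ratio, true zero). The sketch below is illustrative only; the two temperatures are made-up values.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius to Kelvin (Kelvin has a true zero point)."""
    return celsius + 273.15

cool_c, warm_c = 10.0, 20.0

# Differences are meaningful on both scales and agree with each other
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are only meaningful on the ratio scale
ratio_c = warm_c / cool_c                                        # 2.0, but 20 °C is not "twice as hot"
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)  # about 1.035

print(diff_c, round(diff_k, 2))
print(ratio_c, round(ratio_k, 3))
```

The difference of 10 degrees survives the change of scale; the "ratio" of 2 does not, because the Celsius zero is arbitrary.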

Categorical vs. Raw data

  • Categorical data: data grouped by a common property; nominal and ordinal data fall here.

  • Raw scores: the most basic data obtained directly from a psychological test or measurement instrument (before any transformation).

  • Raw data can be summarized with frequency distributions and, for continuous scores, grouped into class intervals for display.

Frequency distribution and class intervals

  • Frequency distribution: summarizes the actual number of observations falling into each category or score range.

  • Provides an overview of how data are spread across the possible values.

  • Class interval: a grouping of scores into ranges to display data (e.g., 0–9, 10–19, 20–29).

  • Used to build histograms and to display distributions for both discrete and continuous data.
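
A grouped frequency distribution can be sketched as follows, using the 0–9, 10–19, 20–29 intervals from the example. The scores are hypothetical, and the helper `class_interval` is a name introduced here; it assumes non-negative integer scores.

```python
from collections import Counter

# Hypothetical raw scores (non-negative integers assumed)
scores = [3, 7, 12, 15, 18, 21, 24, 25, 29, 9]
width = 10  # each class interval spans 10 score points

def class_interval(score, width):
    """Return the (lower, upper) bounds of the interval containing score."""
    lower = (score // width) * width
    return (lower, lower + width - 1)

# Count how many scores fall into each class interval
freq = Counter(class_interval(s, width) for s in scores)

# Print the grouped distribution with a simple text histogram
for lo, hi in sorted(freq):
    count = freq[(lo, hi)]
    print(f"{lo:>2}-{hi:<2} {count} {'*' * count}")
```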

Normal distribution (Normal curve)

  • The normal distribution is a theoretical distribution, often used as a model of many natural phenomena when sample sizes are large.

  • Characteristics: perfectly symmetrical bell-shaped curve; many scores cluster around the middle; tails extend indefinitely.

  • Notation: often denoted as

    • Population: X \sim N(μ, σ^2)

    • Probability density function: f(x) = \frac{1}{σ \sqrt{2π}} \, \exp\bigg(-\frac{(x-μ)^2}{2σ^2}\bigg)

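  • Real-world data are rarely exactly normal. Some variables are approximately normal, some become closer to normal after an appropriate transformation, and sample means tend toward a normal distribution as sample size increases.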
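
The density formula can be checked numerically with a short sketch. The function name `normal_pdf` and the example parameters (mean 100, standard deviation 15) are assumptions made for illustration; `statistics.NormalDist` from the Python standard library is used only as a cross-check.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x, written directly from the formula."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Peak of the standard normal curve: 1 / sqrt(2*pi), about 0.3989
peak = normal_pdf(0.0)

# Symmetry: equal distances above and below the mean have equal density
left = normal_pdf(85.0, mu=100.0, sigma=15.0)
right = normal_pdf(115.0, mu=100.0, sigma=15.0)

# Cross-check against the standard library implementation
reference = NormalDist(mu=100.0, sigma=15.0).pdf(115.0)

print(round(peak, 4), math.isclose(left, right), math.isclose(right, reference))
```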

Central tendency and variability

  • Central tendency: describes the center of a distribution.

    • Mean: \bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i

    • Median: middle score in an ordered list (or average of the two middle scores if N is even).

    • Mode: most frequent value.

  • Variability (spread): describes how spread out the scores are.

    • Range: difference between maximum and minimum values.

    • Variance is the average squared deviation from the mean; the standard deviation is its square root, expressed in the original units.

    • Population variance: \text{Var}(X) = σ^2 = \frac{1}{N} \sum_{i=1}^{N} (X_i - μ)^2

    • Sample variance: s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})^2

    • Standard deviation: σ = \sqrt{σ^2} or, for samples, s = \sqrt{s^2}
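
The measures above can be computed with Python's standard `statistics` module. The scores below are hypothetical; the sketch mainly shows how the population (divide by N) and sample (divide by N - 1) formulas differ on the same data.

```python
import statistics

# Hypothetical raw scores, N = 8
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

# Central tendency
mean = statistics.mean(scores)      # 40 / 8 = 5
median = statistics.median(scores)  # average of the middle pair (4, 5) = 4.5
mode = statistics.mode(scores)      # 4 occurs most often

# Variability
score_range = max(scores) - min(scores)             # 9 - 2 = 7
sum_sq_dev = sum((x - mean) ** 2 for x in scores)   # 32
pop_variance = sum_sq_dev / n                       # 32 / 8 = 4.0
sample_variance = sum_sq_dev / (n - 1)              # 32 / 7, about 4.571

# The library functions apply the same two formulas
assert pop_variance == statistics.pvariance(scores)
assert abs(sample_variance - statistics.variance(scores)) < 1e-12

pop_sd = statistics.pstdev(scores)    # sqrt(4) = 2.0
sample_sd = statistics.stdev(scores)  # sqrt(32 / 7), about 2.138

print(mean, median, mode, score_range)
print(pop_variance, round(sample_variance, 3), pop_sd, round(sample_sd, 3))
```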

Correlation and relationship measures

  • Correlation coefficient: a statistic that describes the strength and direction of the relationship between two variables; denoted by r.

  • Pearson product-moment correlation coefficient (most common):
    r = \frac{ \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y}) }{ \sqrt{ \sum_{i=1}^{N} (X_i - \bar{X})^2 } \; \sqrt{ \sum_{i=1}^{N} (Y_i - \bar{Y})^2 } }

  • Interpretation: values close to +1 or -1 indicate strong linear relationships; values near 0 indicate weak or no linear relationship.
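
A direct, unoptimized translation of the Pearson formula might look like the sketch below. The function name `pearson_r` and the paired data are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation, computed term by term from the formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of cross-products of deviations
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: product of the square roots of the sums of squared deviations
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: raises ZeroDivisionError if either variable has no variability
    return cross / (math.sqrt(ss_x) * math.sqrt(ss_y))

# Hypothetical paired observations
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

r = pearson_r(hours, marks)  # 6 / sqrt(10 * 6), about 0.775
r_perfect = pearson_r(hours, [2 * h + 1 for h in hours])   # exact increasing line
r_inverse = pearson_r(hours, [10 - 3 * h for h in hours])  # exact decreasing line

print(round(r, 3), round(r_perfect, 3), round(r_inverse, 3))
```

The last two calls show the limiting cases from the interpretation bullet: an exact increasing line gives r = +1 and an exact decreasing line gives r = -1, up to floating-point rounding.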

Standard scores and transformed scales

  • Standard score (z-score): converts a raw score into standard units based on population parameters.

    • Formula: z = \frac{X - μ}{σ}

    • Interpretation: number of standard deviations a score is from the mean.

  • Linear transformation: changes the unit or scale using a linear equation; with b > 0 it preserves the relative ordering of scores and the shape of the distribution. General form: X' = a + bX where a and b are constants.

  • Other transformation notes:

    • Transformations may rely on the normal curve (e.g., standard scores).

    • Linear transformations change only the units of measurement; nonlinear transformations also change the shape of the distribution. Which to use depends on the goal and the level of measurement.
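
The z-score is itself a linear transformation of the raw score, with a = -μ/σ and b = 1/σ. The sketch below illustrates this under assumed population values (μ = 100, σ = 15); the function names are introduced here for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

def linear_transform(x, a, b):
    """General linear transformation X' = a + bX."""
    return a + b * x

# Assumed population parameters, for illustration only
mu, sigma = 100.0, 15.0
raw_scores = [85.0, 100.0, 130.0]

z_scores = [z_score(x, mu, sigma) for x in raw_scores]  # [-1.0, 0.0, 2.0]

# The same result, written as X' = a + bX with a = -mu/sigma and b = 1/sigma
a, b = -mu / sigma, 1.0 / sigma
via_linear = [linear_transform(x, a, b) for x in raw_scores]

# With b > 0 the ordering of scores is unchanged
same_order = sorted(raw_scores) == raw_scores and sorted(z_scores) == z_scores

print(z_scores, same_order)
```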

T-scores and stanines

  • T-score: a standardized score with mean 50 and standard deviation 10; negative values are avoided in practice, since T < 0 would require z < -5.

    • Formula: T = 50 + 10z where z = \frac{X - μ}{σ}

  • Stanines: a nine-point standardized score scale (1–9) with a mean of 5 and a standard deviation of about 2, designed to describe a distribution with simple verbal descriptors (e.g., above/below average).
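
Both scales can be derived from z, as in the sketch below. The T-score follows the formula given above. The stanine rule used here (round 2z + 5 to the nearest whole number, then limit to 1–9) is a common approximation and an assumption of this sketch, not a formula stated in these notes.

```python
import math

def t_score(z):
    """T = 50 + 10z (mean 50, standard deviation 10)."""
    return 50 + 10 * z

def stanine(z):
    """Approximate stanine: 2z + 5 rounded half up, limited to 1..9.

    Assumption: this rounding rule is a common approximation, used here
    only for illustration.
    """
    rounded = math.floor(2 * z + 5.5)  # round half up, avoiding banker's rounding
    return max(1, min(9, rounded))

for z in (-3.0, -1.5, 0.0, 0.5, 2.0):
    print(z, t_score(z), stanine(z))
```

For example, z = -1.5 corresponds to T = 35 and stanine 2, while z = 2.0 corresponds to T = 70 and stanine 9.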

Percentiles and related concepts

  • Percentile: the value below which a given percentage of observations fall; used with ordinal, interval, and ratio data to describe relative standing.
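
A percentile rank can be sketched as the percentage of scores falling strictly below a given score, matching the definition above. Other conventions exist (for example, counting half of the tied scores), so the rule below is one assumption among several; the data are hypothetical.

```python
def percentile_rank(scores, value):
    """Percentage of scores that fall strictly below value."""
    below = sum(1 for s in scores if s < value)
    return 100.0 * below / len(scores)

# Hypothetical test scores, N = 10
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank_of_80 = percentile_rank(scores, 80)    # 5 of 10 scores are below 80
rank_of_100 = percentile_rank(scores, 100)  # 9 of 10 scores are below 100

print(rank_of_80, rank_of_100)  # 50.0 90.0
```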

Proportion and categorical statistics

  • Proportion: part-to-whole ratio for a subset of a population.

    • Formula: p = \frac{k}{N} where k is the number in the subset and N is the total.

  • Correlation and proportional reasoning often rely on ratio-level information when interpreting relationships.

Summary of practical implications

  • Choose level of measurement first to determine permissible analyses (e.g., mean vs. median vs. mode; parametric vs. non-parametric tests).

  • For nominal data, use frequencies, mode, and chi-square tests; avoid means.

  • For ordinal data, use median, mode, and percentiles; be cautious about means.

  • For interval data, you can use means, standard deviations, and parametric tests; because there is no true zero, ratios of values are not meaningful.

  • For ratio data, you can perform all arithmetic operations, compute proportions, and use all standard statistical methods.

  • Use frequency distributions and class intervals to summarize and visualize distributions, especially with large data sets.

  • Standard scores (z, T) and stanines facilitate comparison across different scales and populations.

  • Always report the sample size (N) and ensure alignment of the statistics with the data type.