Ch 4 definitions: Measurement Scales, Distributions, and Standard Scores

Level of measurement describes the relationship among the numbers assigned to information; it governs what mathematical operations and statistics are appropriate.

Nominal scale: numbers or codes are used only to label or categorize data; they do not represent magnitude or distance.
Examples: labels such as color categories, gender codes, or other category labels (e.g., 1 = red, 2 = blue, 3 = green).
Arithmetic operations are not meaningful beyond counting frequencies; cannot meaningfully compute sums or averages.
Common statistics/tests used: frequency distributions, mode, chi-square tests.
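As an illustration of what counting looks like for nominal data, here is a minimal Python sketch. The list `color_codes` is made-up example data, and the code-to-label mapping follows the 1 = red, 2 = blue, 3 = green example above.

```python
from collections import Counter

# Hypothetical nominal codes: 1 = red, 2 = blue, 3 = green
color_codes = [1, 2, 2, 3, 2, 1, 3, 2]
labels = {1: "red", 2: "blue", 3: "green"}

# Frequency distribution: how often each category occurs
frequencies = Counter(labels[code] for code in color_codes)

# Mode: the most frequent category and its count
mode_label, mode_count = frequencies.most_common(1)[0]

print(dict(frequencies))
print("mode:", mode_label, "with", mode_count, "observations")
```

Averaging the codes themselves would produce a number, but it would not correspond to any "average color", which is why only counts and the mode are reported here.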

Ordinal scale: data are ranked or ordered; the order conveys relative position (which is higher or lower).
Distances between adjacent ranks are not necessarily equal; the gap between ranks 1 and 2 may differ from the gap between ranks 2 and 3.
Example: rank-order preferences (e.g., 1 = most preferred, 5 = least preferred) or color orderings (e.g., 1–6 in rank order).
Suitable statistics: mode, median, percentile; sometimes non-parametric tests; not appropriate to assume equal intervals or compute meaningful means.
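A short Python sketch of order-based summaries for ranked data. The `ranks` list is invented for illustration, and the cumulative percentage shown is just one simple way to express relative standing.

```python
import statistics

# Hypothetical preference ranks from 7 respondents
# (1 = most preferred, 5 = least preferred)
ranks = [1, 2, 2, 3, 4, 5, 5]

# Median: the middle value once the ranks are ordered
median_rank = statistics.median(ranks)

# Mode(s): every rank that occurs most often
modes = statistics.multimode(ranks)

# Percentage of respondents who gave a rank of 2 or better
pct_at_or_below_2 = 100 * sum(r <= 2 for r in ranks) / len(ranks)

print(median_rank, modes, round(pct_at_or_below_2, 1))
```

All three summaries depend only on order and counts, so they stay valid even if the distances between ranks are unequal.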

Interval scale: data are ordered with equal intervals between adjacent values; the difference between values is meaningful and constant.
However, there is no true zero point; zero does not indicate the absence of the attribute.
Example: temperature in Celsius or Fahrenheit; health scales like 1 = very poor, 5 = excellent (assuming equal intervals).
Statistical operations: mean, median, mode are meaningful; measures of variability such as standard deviation are meaningful; parametric tests (e.g., t-tests) are applicable.
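The lack of a true zero can be seen by converting the same temperatures between Celsius and Fahrenheit. The sketch below uses made-up temperatures; differences stay equal after conversion, but the ratio between two temperatures does not survive the change of scale.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32

# Hypothetical temperatures, equally spaced in Celsius
temps_c = [10.0, 20.0, 30.0]
temps_f = [celsius_to_fahrenheit(c) for c in temps_c]  # [50.0, 68.0, 86.0]

# Equal intervals remain equal after conversion
diffs_f = [temps_f[1] - temps_f[0], temps_f[2] - temps_f[1]]  # [18.0, 18.0]

# Ratios are not preserved, because zero is arbitrary on both scales
ratio_c = temps_c[1] / temps_c[0]  # 2.0
ratio_f = temps_f[1] / temps_f[0]  # about 1.36

print(temps_f, diffs_f, ratio_c, ratio_f)
```

Since "twice as hot" depends on which scale is used, ratio statements are not meaningful for interval data, while differences and means are.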

Ratio scale: has all the properties of interval data plus a true zero point (absence of the attribute).
Allows meaningful comparisons of absolute magnitudes and ratios.
Examples: height, weight, reaction time, counts.
Enables all arithmetic operations, including proportions and ratios.
In statistics, often used with proportion calculations and the product-moment correlation.
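A brief Python sketch of why ratios are meaningful on a ratio scale: changing the unit rescales every value but leaves the ratio between two values unchanged. The weights and the rounded conversion factor are illustrative assumptions.

```python
import math

KG_TO_LB = 2.20462  # approximate conversion factor, used for illustration only

# Hypothetical weights in kilograms
heavy_kg, light_kg = 80.0, 40.0

ratio_kg = heavy_kg / light_kg                              # 2.0
ratio_lb = (heavy_kg * KG_TO_LB) / (light_kg * KG_TO_LB)    # still about 2.0

print(ratio_kg, ratio_lb, math.isclose(ratio_kg, ratio_lb))
```

Because zero means "no weight" in both units, "twice as heavy" is true regardless of the unit chosen.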

Categorical data: data grouped by a common property; nominal and ordinal data fall here.
Raw scores: the most basic data obtained directly from a psychological test or measurement instrument (before any transformation).
Raw scores can be summarized with frequency distributions and, for continuous scores, grouped into class intervals for display.

Frequency distribution: summarizes the actual number of observations falling into each category or score range.
Provides an overview of how data are spread across the possible values.
Class interval: a grouping of scores into ranges to display data (e.g., 0–9, 10–19, 20–29).
Used to build histograms and to display distributions for both discrete and continuous data.
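The grouping of scores into class intervals can be sketched in Python as follows. The `scores` list is invented, the helper `class_interval` is a hypothetical name, and the sketch assumes nonnegative integer scores with an interval width of 10, matching the 0–9, 10–19, 20–29 example.

```python
from collections import Counter

# Hypothetical raw scores
scores = [3, 7, 12, 15, 18, 21, 24, 25, 29, 9, 14, 27]

def class_interval(score, width=10):
    """Return the (lower, upper) limits of the interval containing score."""
    lower = (score // width) * width
    return (lower, lower + width - 1)

# Grouped frequency distribution: number of scores in each interval
freq = Counter(class_interval(s) for s in scores)

# Text histogram, one row per class interval in ascending order
for (low, high), n in sorted(freq.items()):
    print(f"{low:>2}-{high:<2} {'#' * n} ({n})")
```

The interval width is a choice: wider intervals give a smoother but coarser picture, narrower intervals show more detail but may leave sparse rows.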

The normal distribution is a theoretical distribution, often used as a model of many natural phenomena when sample sizes are large.
Characteristics: perfectly symmetrical bell-shaped curve; many scores cluster around the middle; tails extend indefinitely.
Notation: often denoted as
- Population: X \sim N(μ, σ^2)
- Probability density function: f(x) = \frac{1}{σ \sqrt{2π}} \exp\bigg(-\frac{(x-μ)^2}{2σ^2}\bigg)
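Real-world data are rarely exactly normal; many variables are approximately normal, some become closer to normal after an appropriate transformation, and sample means tend toward a normal distribution as sample size increases (central limit theorem).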
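As a rough numerical check of the density function, the sketch below evaluates it directly in Python and compares it with the standard library's `statistics.NormalDist`. The values μ = 100 and σ = 15 are arbitrary assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density evaluated directly from the formula."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

mu, sigma = 100.0, 15.0  # hypothetical population parameters

# Symmetry: the density is the same one SD below and one SD above the mean
left = normal_pdf(mu - sigma, mu, sigma)
right = normal_pdf(mu + sigma, mu, sigma)

# The peak is at the mean
peak = normal_pdf(mu, mu, sigma)

# Proportion of the distribution within one SD of the mean (about 0.68)
dist = NormalDist(mu, sigma)
within_one_sd = dist.cdf(mu + sigma) - dist.cdf(mu - sigma)

print(left, right, peak, round(within_one_sd, 4))
```

The symmetry check and the roughly 68% within one standard deviation correspond to the bell-shape characteristics listed above.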

Central tendency: describes the center of a distribution.
- Mean: \bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i
- Median: middle score in an ordered list (or average of the two middle scores if N is even).
- Mode: most frequent value.
Variability (spread): describes how spread out the scores are.
- Range: difference between maximum and minimum values.
- Variance is the average squared deviation from the mean; the standard deviation is its square root, which expresses spread in the original units.
- Population variance: \text{Var}(X) = σ^2 = \frac{1}{N} \sum_{i=1}^{N} (X_i - μ)^2
- Sample variance: s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})^2
- Standard deviation: σ = \sqrt{σ^2}, or for samples s = \sqrt{s^2}
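The measures above can be computed with Python's standard `statistics` module. The `scores` list is made-up data; `pvariance`/`pstdev` divide by N (population formulas), while `variance`/`stdev` divide by N - 1 (sample formulas).

```python
import statistics

# Hypothetical raw scores
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Central tendency
mean = statistics.mean(scores)        # 40 / 8 = 5
median = statistics.median(scores)    # average of the two middle scores: 4.5
mode = statistics.mode(scores)        # 4

# Variability
score_range = max(scores) - min(scores)       # 9 - 2 = 7
pop_variance = statistics.pvariance(scores)   # 32 / 8 = 4
sample_variance = statistics.variance(scores) # 32 / 7, about 4.571
pop_sd = statistics.pstdev(scores)            # 2.0
sample_sd = statistics.stdev(scores)          # about 2.138

print(mean, median, mode, score_range)
print(pop_variance, sample_variance, pop_sd, sample_sd)
```

The sample versions are slightly larger than the population versions because dividing by N - 1 compensates for estimating the mean from the same data.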

Correlation coefficient: statistic used to describe the strength and direction of the relationship between two variables; denoted by r.
Pearson product-moment correlation coefficient (most common):
r = \frac{ \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y}) }{ \sqrt{ \sum_{i=1}^{N} (X_i - \bar{X})^2 } \; \sqrt{ \sum_{i=1}^{N} (Y_i - \bar{Y})^2 } }
Interpretation: values close to +1 or -1 indicate strong linear relationships; values near 0 indicate weak or no linear relationship.
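A minimal Python sketch of the Pearson formula, written out term by term rather than calling a library routine. The function name `pearson_r` and the paired `hours`/`scores` data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of cross-products of deviations from the means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square roots of the two sums of squared deviations
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return cross / (math.sqrt(ss_x) * math.sqrt(ss_y))

# Hypothetical paired observations
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)   # 6 / sqrt(10 * 6), about 0.775
print(round(r, 3))
```

For these data the cross-product sum is 6 and the two sums of squares are 10 and 6, giving a fairly strong positive linear relationship.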

Standard score (z-score): converts a raw score into standard units based on population parameters.
- Formula: z = \frac{X - μ}{σ}
- Interpretation: number of standard deviations a score is from the mean.
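To show how z-scores put scores from different tests on a common footing, here is a small Python sketch. The two tests and their means and standard deviations are invented for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical raw scores on two tests with different scales
z_test_a = z_score(115, mu=100, sigma=15)  # 1.0
z_test_b = z_score(65, mu=50, sigma=10)    # 1.5

print(z_test_a, z_test_b)
```

Although 115 is the larger raw score, the score of 65 is further above its own mean in standard-deviation units, so it represents the stronger relative performance.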
Linear transformation: changes the unit or scale using a linear equation but does not change the relative ordering or the shape of the distribution. General form: X' = a + bX, where a and b are constants and b > 0 (a negative b would reverse the ordering).
Other transformation notes:
- Transformations may rely on the normal curve (e.g., standard scores).
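- Linear transformations change only the unit or origin of measurement; nonlinear transformations (e.g., converting scores to percentile ranks) also change the shape of the distribution. Which to use depends on the goal and the level of measurement.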

T-score: a standardized score with mean 50 and standard deviation 10; in practice it is almost always positive, since it would be negative only for z below -5.
- Formula: T = 50 + 10z where z = \frac{X - μ}{σ}
Stanines: a nine-point standardized score scale (1–9) with a mean of 5 and a standard deviation of about 2, designed to describe a distribution with simple verbal descriptors (e.g., above/below average).
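Both scales can be derived from z-scores, as this Python sketch suggests. The T-score follows the formula above. The stanine cut points at z = ±0.25, ±0.75, ±1.25, ±1.75 are a commonly used convention and are an assumption here, not something given in these notes; the handling of scores falling exactly on a cut point is also an arbitrary choice.

```python
from bisect import bisect_right

def t_score(z):
    """T = 50 + 10z: a linear transformation of z with a = 50, b = 10."""
    return 50 + 10 * z

# Assumed z-score boundaries between adjacent stanines (half-SD-wide bands)
STANINE_CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]

def stanine(z):
    """Map a z-score to a stanine from 1 to 9 using the assumed cut points."""
    return bisect_right(STANINE_CUTS, z) + 1

for z in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(z, t_score(z), stanine(z))
```

Stanines trade precision for simplicity: every z between -0.25 and 0.25 lands in stanine 5, whereas the T-score keeps the full detail of z.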

Percentile: the value below which a given percentage of observations fall; used with ordinal, interval, and ratio data to describe relative standing.
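The reverse lookup, finding the percentage of observations below a given score (its percentile rank), can be sketched in Python. The function `percentile_rank` and the `scores` list are hypothetical, and the sketch uses the strict "below" definition; other conventions count half of any tied scores.

```python
def percentile_rank(scores, x):
    """Percentage of scores that fall strictly below x."""
    below = sum(1 for s in scores if s < x)
    return 100.0 * below / len(scores)

# Hypothetical set of 10 test scores
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank_of_85 = percentile_rank(scores, 85)   # 6 of 10 scores are below 85
print(rank_of_85)
```

Here a score of 85 sits at the 60th percentile, meaning 60% of the observations fall below it.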

Proportion: part-to-whole ratio for a subset of a population.
- Formula: p = \frac{k}{N} where k is the number in the subset and N is the total.
Proportional reasoning (e.g., "twice as much") requires ratio-level data; the Pearson correlation requires at least interval-level data.

Identify the level of measurement first to determine permissible analyses (e.g., mean vs. median vs. mode; parametric vs. non-parametric tests).
For nominal data, use frequencies, mode, and chi-square tests; avoid means.
For ordinal data, use median, mode, and percentiles; be cautious about means.
For interval data, you can use means, standard deviations, and parametric tests, but ratios are not meaningful because there is no true zero.
For ratio data, you can perform all arithmetic operations, compute proportions, and use all standard statistical methods.
Use frequency distributions and class intervals to summarize and visualize distributions, especially with large data sets.
Standard scores (z, T) and stanines facilitate comparison across different scales and populations.
Always report the sample size (N) and ensure alignment of the statistics with the data type.