Vocabulary flashcards covering central tendency, variability, shape, distribution, and basic descriptive statistics concepts from Chapter 3 notes.
Mode
The most frequently occurring value in a data set; may not exist or may be bimodal or multimodal; applicable to all levels of data.
Median
The middle value in an ordered data set; for an even number of observations, the average of the two middle terms; not affected by extreme values.
Mean (Arithmetic Mean)
The sum of all values divided by the number of values; uses every data point in the set, so it is affected by extreme values; a common measure of central tendency.
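The three measures of central tendency above can be compared on a small data set. This is a minimal sketch using Python's standard `statistics` module; the values are hypothetical and not taken from the notes.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical values, not from the notes

mean = statistics.mean(data)        # 30 / 6 = 5
median = statistics.median(data)    # even count: (3 + 5) / 2 = 4.0
modes = statistics.multimode(data)  # [3]; returns every most-common value

# Replace the largest value with an extreme one.
skewed = [2, 3, 3, 5, 7, 100]
skewed_mean = statistics.mean(skewed)      # 120 / 6 = 20
skewed_median = statistics.median(skewed)  # still 4.0

# A data set with two modes (bimodal).
bimodal_modes = statistics.multimode([1, 1, 2, 2, 3])  # [1, 2]
```

The extreme value moves the mean from 5 to 20 while the median stays at 4, which is what "not affected by extreme values" means in practice.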
Population Mean (μ)
The average of a population; a parameter that describes the entire group.
Sample Mean (x̄)
The average of a sample; a statistic used to estimate the population mean.
Percentile
A value at or below which a given percentage of the data falls; e.g., at least n% of the data lie at or below the nth percentile.
Pth Percentile Calculation
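To find the Pth percentile: (1) order the N values from smallest to largest; (2) compute the location i = (P/100)·N; (3) if i is a whole number, the percentile is the average of the values at positions i and i + 1; if i is not a whole number, round i up and take the value at that position.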
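The location method on this card can be sketched in a few lines. The function name `percentile` and the sample values are illustrative assumptions. Statistical libraries often use different interpolation rules, so their results may not match this textbook method.

```python
import math

def percentile(values, p):
    """Pth percentile by the location method, i = (p / 100) * N.

    Illustrative helper; assumes 0 < p < 100.
    """
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    i = p * n / 100  # multiply first to limit floating-point error
    if float(i).is_integer():
        k = int(i)
        # Whole-number location: average positions i and i + 1 (1-based).
        return (ordered[k - 1] + ordered[k]) / 2
    # Otherwise round the location up; 1-based position ceil(i)
    # is 0-based index floor(i).
    return ordered[math.floor(i)]

data = [14, 12, 19, 23, 5, 13, 28, 17]  # hypothetical; N = 8
p30 = percentile(data, 30)  # i = 2.4 -> 3rd ordered value = 13
q1 = percentile(data, 25)   # i = 2   -> (12 + 13) / 2 = 12.5
q2 = percentile(data, 50)   # i = 4   -> (14 + 17) / 2 = 15.5
q3 = percentile(data, 75)   # i = 6   -> (19 + 23) / 2 = 21.0
```

Here the 50th percentile, 15.5, matches the median of the eight values, as expected.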
Quartiles
The three values that divide an ordered data set into four equal parts: Q1 (25th percentile), Q2 (50th percentile, equal to the median), and Q3 (75th percentile).
First Quartile (Q1)
The 25th percentile of the data.
Second Quartile (Q2) / Median
The 50th percentile; the middle value of the ordered data.
Third Quartile (Q3)
The 75th percentile of the data.
Interquartile Range (IQR)
The spread of the middle 50% of the data; IQR = Q3 − Q1.
Range
The difference between the largest and smallest values in a data set; easy to compute but affected by outliers.
Variance (Population)
The average of the squared deviations from the population mean; measured in squared units.
Variance (Sample)
An unbiased estimator of the population variance, computed with denominator n−1.
Standard Deviation (Population)
The square root of the population variance; measures spread in the same units as the data.
Standard Deviation (Sample)
The square root of the sample variance; used to estimate population variability.
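The population and sample versions of variance and standard deviation differ only in the denominator (N versus n − 1). A minimal sketch with the standard `statistics` module, using hypothetical values:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values; mean = 5

# Sum of squared deviations from the mean: 9+1+1+1+0+0+4+16 = 32
squared_devs = sum((x - 5) ** 2 for x in data)

pop_var = statistics.pvariance(data)  # 32 / 8 = 4   (denominator N)
pop_sd = statistics.pstdev(data)      # sqrt(4) = 2.0
samp_var = statistics.variance(data)  # 32 / 7       (denominator n - 1)
samp_sd = statistics.stdev(data)      # sqrt(32 / 7)
```

If the same eight values are treated as a sample, the variance estimate (32/7, about 4.57) is larger than the population figure because of the smaller denominator.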
Mean Absolute Deviation (MAD)
The average distance between each data value and the mean; a measure of spread using absolute deviations.
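As a quick illustration, MAD can be computed directly from its definition. The helper name `mean_absolute_deviation` and the values are assumptions made for this sketch.

```python
def mean_absolute_deviation(values):
    """Average of |x - mean| over all values (illustrative helper)."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values; mean = 5
# Absolute deviations: 3, 1, 1, 1, 0, 0, 2, 4 -> total 12
mad = mean_absolute_deviation(data)  # 12 / 8 = 1.5
```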
Empirical Rule (68-95-99.7)
For normally distributed data: about 68% within 1 SD, 95% within 2 SD, and 99.7% within 3 SD of the mean.
Chebyshev’s Theorem
For any distribution, at least 1 − 1/k² of the data lie within k standard deviations of the mean, for any k > 1.
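Chebyshev's bound can be compared with the empirical rule from the previous card. The function name below is an illustrative assumption.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

within_2 = chebyshev_lower_bound(2)  # 1 - 1/4 = 0.75
within_3 = chebyshev_lower_bound(3)  # 1 - 1/9, about 0.889
```

For k = 2 the guarantee is at least 75% for any distribution, whereas the empirical rule gives about 95% when the data are normally distributed. Chebyshev's bound is weaker because it makes no assumption about shape.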
Skewness
A measure of asymmetry in a distribution; skewness can be left (negative) or right (positive).
Kurtosis
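A measure of the peakedness or flatness of a distribution, usually described relative to the normal curve: leptokurtic (tall and thin), platykurtic (flat and spread out), or mesokurtic (in between).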
Box-and-Whisker Plot
A graphical representation of a distribution using the min, Q1, median, Q3, and max; includes whiskers and outlier detection via fences.
Five-Number Summary
The minimum, Q1, median, Q3, and maximum used to construct a box plot.
Outlier
A value that falls outside the inner fences of a box plot; can be mild (within outer fences) or extreme (beyond outer fences).
Inner Fences
Boundaries at Q1 − 1.5·IQR and Q3 + 1.5·IQR used to identify mild outliers.
Outer Fences
Boundaries at Q1 − 3·IQR and Q3 + 3·IQR used to identify extreme outliers.
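The inner and outer fences can be turned into a simple classification rule. This is a sketch only: the function name `classify` and the quartile values are hypothetical, and Q1 and Q3 are passed in directly so the result does not depend on any particular quartile method.

```python
def classify(value, q1, q3):
    """Label a value using inner (1.5 * IQR) and outer (3 * IQR) fences."""
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr
    if value < outer_low or value > outer_high:
        return "extreme outlier"
    if value < inner_low or value > inner_high:
        return "mild outlier"
    return "not an outlier"

# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10.
# Inner fences: -5 and 35. Outer fences: -20 and 50.
labels = [classify(v, 10, 20) for v in (30, 40, 60, -10)]
# ['not an outlier', 'mild outlier', 'extreme outlier', 'mild outlier']
```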
Hinges
The endpoints of the box in a box plot, corresponding to Q1 and Q3.
Z-Score
The number of standard deviations a value is from the mean; z = (x − μ)/σ for a population, or (x − x̄)/s for a sample.
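A short worked example of the z-score formula, treating a hypothetical data set as a population (so the population standard deviation is used). The helper name `z_score` is an assumption of this sketch.

```python
import statistics

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical population
mu = statistics.mean(data)       # 5
sigma = statistics.pstdev(data)  # 2.0

z_high = z_score(9, mu, sigma)   # (9 - 5) / 2 = 2.0
z_low = z_score(2, mu, sigma)    # (2 - 5) / 2 = -1.5
```

For sample data, the same function applies with x̄ and `statistics.stdev(data)` in place of μ and σ.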
Coefficient of Variation (CV)
The ratio of the standard deviation to the mean, expressed as a percentage; used to compare variability across datasets with different scales.
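The sketch below illustrates why CV is useful across different scales: rescaling every value changes the standard deviation but not the CV. The helper name and both data sets are hypothetical, and the population standard deviation is used here as an assumption.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean (population form)."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

small_scale = [2, 4, 4, 4, 5, 5, 7, 9]        # mean 5, SD 2
large_scale = [x * 100 for x in small_scale]  # mean 500, SD 200

cv_small = coefficient_of_variation(small_scale)  # 2 / 5 * 100 = 40%
cv_large = coefficient_of_variation(large_scale)  # 200 / 500 * 100 = 40%
```

The standard deviations (2 and 200) suggest very different spreads, but relative to their means both data sets vary by the same 40%.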
Symmetry of Mean, Median, and Mode
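In a symmetric, unimodal distribution, mean = median = mode. Skew pulls the mean toward the longer tail: with negative (left) skew, typically mean < median < mode; with positive (right) skew, typically mode < median < mean.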
Normal Distribution
Bell-shaped, symmetric distribution to which the empirical rule and other standard-deviation-based inferences apply.
Box Plot Utility in Shape Assessment
Box plots help assess skewness and identify outliers by examining the median position and whisker lengths.