Variation
The natural differences in data values from individual to individual; variation is the reason graphs and numerical summaries are useful.
Variable
A characteristic recorded for each individual in a dataset (e.g., height, eye color, minutes exercised).
Categorical variable
A variable that places individuals into groups; values are labels (e.g., political party, yes/no, brand).
Quantitative variable
A variable that takes numerical values for which arithmetic makes sense in context (e.g., height, reaction time, test score).
Distribution
What values a variable takes and how often it takes them.
Count (frequency)
The number of observations in a category (categorical data) or in a bin/interval (quantitative data).
Proportion (relative frequency)
A count divided by the total number of observations; often expressed as a percent.
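As a small illustration of counts and proportions, the sketch below tallies a made-up list of eye colors (the data and variable names are assumptions for this example only) and divides each count by the total.

```python
from collections import Counter

# Hypothetical categorical data: eye color recorded for 8 individuals.
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "blue"]

counts = Counter(eye_colors)      # count (frequency) for each category
total = sum(counts.values())      # total number of observations

# Proportion (relative frequency) = count / total
proportions = {category: count / total for category, count in counts.items()}

for category, count in counts.items():
    print(f"{category}: count = {count}, proportion = {proportions[category]:.1%}")
```

The proportions always add up to 1 (100%), which makes a quick check on the arithmetic.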
Bar chart
A graph for categorical variables showing categories on one axis and counts/proportions on the other; bars are separated.
Dotplot
A graph for quantitative data that places a dot for each data value above a number line; repeated values stack vertically.
Stemplot (stem-and-leaf plot)
A display for quantitative data that splits each value into a stem and a leaf, preserving the original data values.
Stem (in a stemplot)
The leading digit(s) of each number used to group data values in a stem-and-leaf plot.
Leaf (in a stemplot)
The final digit of each number, written next to its stem in a stem-and-leaf plot.
Key (stemplot key)
A note that explains the scale of a stemplot (e.g., 4|7 means 47).
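The stem, leaf, and key ideas can be sketched in a few lines. This is a minimal example, assuming a non-empty list of non-negative whole numbers with the tens as stems and the ones digit as leaves; the function name `build_stemplot` and the scores are made up for illustration.

```python
from collections import defaultdict

def build_stemplot(values):
    """Return stemplot rows like '4|177' for non-negative whole numbers.

    Assumes a non-empty list; stem = tens (and above), leaf = ones digit.
    """
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)   # e.g. 47 -> stem 4, leaf 7
        leaves_by_stem[stem].append(leaf)

    rows = []
    # Include every stem between the smallest and largest, even empty ones,
    # so gaps in the data stay visible.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        rows.append(f"{stem}|{leaves}")
    return rows

scores = [47, 52, 41, 68, 55, 47, 50]   # hypothetical test scores
stemplot_rows = build_stemplot(scores)

print("Key: 4|7 means 47")
for row in stemplot_rows:
    print(row)
```

Because each leaf is an actual digit from the data, every original value can be read back from the plot.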
Histogram
A graph for quantitative data that groups values into bins (intervals) and shows the frequency in each bin; bars touch because the x-axis is a continuous number line.
Bin (class interval)
A range of values used to group quantitative data in a histogram.
Relative frequency histogram
A histogram that uses proportions/percents (not counts) on the y-axis; useful for comparing groups with different sample sizes.
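To make bins and relative frequencies concrete, here is a rough sketch that sorts values into equal-width bins and converts the counts to proportions. The helper `bin_counts`, the bin boundaries, and the data are all assumptions chosen for the example; each bin includes its left endpoint and excludes its right endpoint.

```python
def bin_counts(values, start, width, num_bins):
    """Count values falling in equal-width bins [start + i*width, start + (i+1)*width).

    Values outside the covered range are ignored.
    """
    counts = [0] * num_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < num_bins:
            counts[index] += 1
    return counts

times = [12, 15, 21, 22, 25, 29, 33, 38]   # hypothetical minutes exercised
counts = bin_counts(times, start=10, width=10, num_bins=3)

# Relative frequency = bin count / total number of observations
relative = [count / len(times) for count in counts]

for i, (count, share) in enumerate(zip(counts, relative)):
    low = 10 + 10 * i
    print(f"{low} to <{low + 10}: count = {count}, relative frequency = {share:.0%}")
```

Counts depend on sample size, but relative frequencies do not, which is why they suit comparisons between groups of different sizes.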
SOCS
A framework for describing quantitative distributions: Shape, Outliers (and other unusual features), Center, Spread.
Shape
The overall form of a quantitative distribution (e.g., symmetric or skewed; unimodal or bimodal).
Outlier
An observation unusually far from the rest of the data; can strongly affect mean and standard deviation.
Mean (x̄)
The arithmetic average of quantitative data: x̄ = (1/n)∑xᵢ; not resistant to outliers.
Median
The middle value of ordered data (or the average of the two middle values if n is even); resistant to outliers.
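The mean and median definitions can be checked by hand with a short sketch. The function names and the small datasets are made up for illustration.

```python
def mean(values):
    """Arithmetic average: add the values, then divide by how many there are."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the ordered data; average the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_data = [7, 1, 3]        # ordered: 1, 3, 7
even_data = [4, 8, 6, 2]    # ordered: 2, 4, 6, 8

print(mean(even_data), median(even_data))
print(mean(odd_data), median(odd_data))
```

Note that the data must be ordered before the middle value is picked; the mean does not depend on order.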
Range
A simple measure of spread: max − min; very sensitive to outliers.
Interquartile range (IQR)
A resistant measure of spread: the range of the middle 50% of the data, IQR = Q3 − Q1.
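Quartiles can be computed under several conventions, and software packages do not all agree. The sketch below assumes one common classroom convention: Q1 is the median of the lower half of the ordered data and Q3 is the median of the upper half, leaving out the overall median when n is odd. The function names and data are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    Assumes at least two values; the overall median is excluded from
    both halves when the number of values is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]     # values below the median position
    upper = ordered[-half:]    # values above the median position
    return median(lower), median(upper)

def iqr(values):
    """Interquartile range: IQR = Q3 - Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1

data = [1, 3, 5, 7, 9, 11, 13]
q1, q3 = quartiles(data)
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr(data)}")
```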
Standard deviation (s)
Measures a typical distance of values from the mean: s = sqrt[(1/(n−1))∑(xᵢ − x̄)²]; not resistant to outliers.
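The formula for s can be followed step by step in code. This sketch uses a made-up dataset and compares the hand calculation with `statistics.stdev` from the Python standard library, which also divides by n − 1.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical quantitative data
n = len(data)

x_bar = sum(data) / n                                    # mean: 5.0
squared_deviations = [(x - x_bar) ** 2 for x in data]    # (x_i - x_bar)^2
s = math.sqrt(sum(squared_deviations) / (n - 1))         # divide by n - 1, then take the root

print(f"mean = {x_bar}, s = {s:.3f}")
print(f"statistics.stdev gives {statistics.stdev(data):.3f}")
```

Here the squared deviations add up to 32, so s = sqrt(32/7), roughly 2.14.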
Resistant (statistic)
A statistic that is not strongly affected by extreme values (e.g., median and IQR are resistant; mean and standard deviation are not).
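Resistance is easiest to see by changing one value into an extreme one and recomputing. The datasets below are invented for this comparison; only the largest value differs between them.

```python
import statistics

original = [10, 12, 14, 16, 18]
with_outlier = [10, 12, 14, 16, 180]   # same data, but the largest value is now extreme

for label, data in [("original", original), ("with outlier", with_outlier)]:
    print(
        f"{label}: mean = {statistics.mean(data)}, "
        f"median = {statistics.median(data)}, "
        f"s = {statistics.stdev(data):.1f}"
    )

mean_shift = statistics.mean(with_outlier) - statistics.mean(original)
median_shift = statistics.median(with_outlier) - statistics.median(original)
```

The median stays at 14 in both datasets, while the mean and the standard deviation are pulled upward by the single extreme value.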