Vocabulary terms and definitions from introductory statistics covering data types, study designs, and sampling methods.
Summary statistic
A single number that condenses a lot of data, such as a proportion.
Treatment group
The group in a study that receives the intervention being tested.
Control group
The comparison group that does not receive the intervention; it serves as the reference point.
Data matrix
Data organized as a table with one row per case and one column per variable.
Case / observational unit
One row in a data matrix representing one entity measured, such as a patient or a county.
Variable
One column in a data matrix representing a characteristic measured on every case.
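As a rough illustration of how cases and variables fit together in a data matrix, here is a minimal Python sketch. The county names and numbers are invented for the example and do not come from any real data set.

```python
# Hypothetical data matrix: each dict is one case (row),
# and each key is one variable (column). All values are invented.
data_matrix = [
    {"county": "Adams", "population": 1200, "poverty_rate": 10.5},
    {"county": "Baker", "population": 5400, "poverty_rate": 7.2},
    {"county": "Clark", "population": 830, "poverty_rate": 14.1},
]

# Number of cases = number of rows.
n_cases = len(data_matrix)

# Variables = column names shared by every case.
variables = list(data_matrix[0].keys())

# One variable is one column: the same characteristic measured on every case.
poverty_column = [row["poverty_rate"] for row in data_matrix]

print(n_cases, variables, poverty_column)
```

Here "county" would be a nominal categorical variable, "population" a discrete numerical variable, and "poverty_rate" a continuous numerical variable.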
Numerical variable
Values where arithmetic is meaningful.
Continuous numerical variable
Numerical values that can take any value within a range, such as height or rate.
Discrete numerical variable
Numerical values that take only separate, countable values, such as population or number of siblings.
Categorical variable
Values that represent categories.
Level
A possible value or category of a categorical variable.
Nominal categorical variable
Categorical values with no natural order, such as state names.
Ordinal categorical variable
Categorical values with a natural order, such as education levels ranging from "below hs" to "bachelors".
Scatterplot
A graph of two numerical variables, with one dot per case.
Associated (dependent) variables
Variables that show a discernible pattern together.
Positive association
A relationship where both variables move in the same direction.
Negative association
A relationship where one variable rises as the other falls.
Independent variables
Variables with no evident relationship; a pair of variables is either associated or independent, never both.
Explanatory variable
The variable suspected of affecting the other.
Response variable
The variable suspected of being affected by the explanatory variable.
Observational study
A study in which data are collected without interfering with how they arise; it can show association but not causation.
Experiment
A study where treatments are actively assigned to subjects.
Randomized experiment
An experiment where subjects are assigned to groups at random, which licenses causal claims.
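Random assignment can be sketched in a few lines of Python. This is only one possible approach (shuffle, then split in half); the function name `randomize` and the subject IDs are hypothetical.

```python
import random

def randomize(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)      # seeded only so the example is repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # chance alone decides the order
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject IDs 1..20.
treatment, control = randomize(range(1, 21), seed=0)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Because chance decides the groups, any confounding variable should be balanced between them on average, which is what supports a causal reading of the results.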
Population (target population)
The full set of cases the question is about.
Sample
A subset of the population that is actually measured.
Anecdotal evidence
Haphazard data from a few striking cases that is usually unrepresentative.
Bias
Systematic skew that makes a sample unrepresentative.
Simple random sample
A sample where every case has an equal chance of being selected and selecting one case does not affect whether another is selected, like a raffle.
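A simple random sample can be drawn with Python's standard `random.sample`, which picks without replacement so that every case is equally likely to be chosen. The population of numbered cases below is made up for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n cases without replacement, each equally likely to be picked."""
    rng = random.Random(seed)      # seeded only so the example is repeatable
    return rng.sample(list(population), n)

# Hypothetical population: cases numbered 1..100.
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=1)
print(srs)
```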
Non-response bias
Skew that occurs when sampled people do not respond.
Convenience sample
A sample that includes only easily reached cases and is therefore potentially unrepresentative.
Observational data
Data with no treatment applied or withheld.
Confounding variable (lurking variable)
A variable correlated with both the explanatory and the response variable; it is the reason observational data cannot prove causation.
Prospective study
A study that follows cases forward as events unfold.
Retrospective study
A study that looks backward through records after events have occurred.
Stratified sampling
Splitting the population into groups of similar cases (strata), then taking a random sample within each stratum.
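One way to sketch stratified sampling in Python is to group cases by stratum and then draw a simple random sample from each group. The helper `stratified_sample`, the class-year strata, and the equal sample size per stratum are all assumptions made for this example.

```python
import random
from collections import defaultdict

def stratified_sample(cases, stratum_of, n_per_stratum, seed=None):
    """Group cases into strata, then randomly sample within every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[stratum_of(case)].append(case)
    sample = []
    for members in strata.values():
        # Every stratum contributes, so no group is left out by chance.
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical population: 30 freshmen and 20 seniors, stratified by class year.
students = [("freshman", i) for i in range(30)] + [("senior", i) for i in range(20)]
stratified = stratified_sample(students, lambda s: s[0], 5, seed=2)
print(stratified)
```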
Cluster sampling
Splitting the population into clusters, picking a few whole clusters, and taking all cases within them.
Multistage sampling
Like cluster sampling, except that a random sample is taken within each chosen cluster instead of including every case in it.
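The difference between cluster sampling and multistage sampling is easiest to see side by side. In the sketch below, a single hypothetical function covers both: leaving `n_per_cluster` unset keeps every case in the chosen clusters (cluster sampling), while setting it draws a random sample inside each chosen cluster (multistage sampling). The village data are invented.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Pick whole clusters at random; optionally subsample within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: pick clusters
    sample = []
    for name in chosen:
        members = clusters[name]
        if n_per_cluster is None:
            sample.extend(members)                     # cluster sampling: keep all
        else:
            sample.extend(rng.sample(members, n_per_cluster))  # stage 2: subsample
    return sample

# Hypothetical population: 6 villages with 8 people each.
villages = {f"village_{k}": [f"v{k}_person{i}" for i in range(8)] for k in range(6)}

cluster = cluster_sample(villages, 2, seed=3)                      # 2 whole villages
multistage = cluster_sample(villages, 2, n_per_cluster=3, seed=3)  # 3 people from each
print(len(cluster), len(multistage))
```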
Sample statistic
A number computed from the sample used as an estimate.
Population parameter
The true value for the whole population, which is usually unknown.
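The relationship between a sample statistic and a population parameter can be illustrated with a small simulation. Everything here is assumed for the example: the population is simulated (10,000 invented heights), which is the only reason the parameter can be computed directly. In practice the parameter is usually unknown.

```python
import random
from statistics import mean

rng = random.Random(4)  # seeded only so the example is repeatable

# Simulated population of 10,000 heights in cm (invented values).
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Population parameter: the true mean, known here only because we simulated it.
parameter = mean(population)

# Sample statistic: the mean of a simple random sample, used as an estimate.
sample = rng.sample(population, 100)
statistic = mean(sample)

print(f"parameter = {parameter:.2f}, statistic = {statistic:.2f}")
```

The statistic will typically land close to the parameter without matching it exactly, and a different sample would give a slightly different estimate.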