This covers basic and sampling vocabulary from the Sampling portion of Chapter 5 in AP Statistics.
Statistics
values calculated for sample data
Population
the entire group of individuals or instances about whom we hope to learn
Sample
a (representative) subset of a population, examined in hope of learning about the population
Parameter
a numerically valued attribute of a model for a population; we hope to estimate the true value from sample data
Sample Statistic
a value calculated from sample data; statistics that correspond to, and thus estimate, a population parameter are of particular interest
Census
a sample that consists of the entire population
Sample Survey
a survey that asks questions of a sample drawn from some population in the hope of learning something about the entire population
Representative
a sample is said to be representative if the statistics computed from it accurately reflect the corresponding population parameters
Bias
any systematic failure of a sampling method to represent its population
Undercoverage/Selection Bias
a sampling scheme that biases the sample in a way that gives part of the population less representation than it has in the population
Response/Measurement Bias
anything in a survey design that influences responses; one typical response bias arises from the wording of a question which may favor certain responses
Nonresponse Bias
bias introduced to a sample when a large fraction of those sampled fail to respond; those that do respond likely do not represent the total population
Voluntary Response Bias
bias introduced to a survey when individuals can choose on their own whether or not to participate in the sample; samples based on voluntary responses are always invalid and cannot be recovered
Randomization
the best defense against bias is randomization, in which each individual is given a fair, random chance at being selected
Sampling Frame
a list of individuals from whom the sample is drawn; individuals may be in the population of interest, but those not in the sampling frame cannot be included in any sample
Sampling Variability
the natural tendency of randomly drawn samples to differ from one another; sometimes called sampling error, though it is not actually an error, just a natural result of random sampling
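A minimal sketch of sampling variability, assuming a made-up population of 1,000 test scores (the scores, the seed, and the sample size of 30 are illustrative choices, not from the course). Repeated random samples from the same population give different sample means, and none of them is a mistake:

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 1,000 made-up test scores between 50 and 100.
population = [rng.randint(50, 100) for _ in range(1000)]
parameter = statistics.mean(population)  # the population mean (a parameter)

# Draw five random samples of size 30 and compute each sample mean (a statistic).
sample_means = [statistics.mean(rng.sample(population, 30)) for _ in range(5)]

print("population mean:", round(parameter, 2))
print("sample means:   ", [round(m, 2) for m in sample_means])
```

The sample means scatter around the population mean; that scatter is the sampling variability.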
SRS (Simple Random Sampling)
a simple random sample of size n is one in which each set of n elements in the population has an equal chance of selection
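A short sketch of drawing an SRS with Python's standard library. The roster of 20 students and the helper name `simple_random_sample` are hypothetical; `random.sample` selects without replacement, so every set of n individuals is equally likely:

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct individuals chosen so every set of n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical sampling frame: a class roster of 20 students.
students = [f"student_{i}" for i in range(1, 21)]

srs = simple_random_sample(students, 5, seed=1)
print(srs)
```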
Sampling With Replacement (Independent Events)
picking an individual and then placing it back into the population, so that it can be selected again
Sampling Without Replacement (Dependent Events)
picking an individual and then not putting it back into the population, so that it cannot be selected again
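The two definitions above can be sketched as follows. The 10-card population is made up for illustration, and the two-aces calculation uses a standard 52-card deck as an assumed example of why replacement makes draws independent while no replacement makes them dependent:

```python
import random
from fractions import Fraction

rng = random.Random(2)
cards = list(range(1, 11))  # hypothetical population of 10 numbered cards

# With replacement: each pick goes back, so the same card can appear twice.
with_replacement = rng.choices(cards, k=10)

# Without replacement: each card can be picked at most once.
without_replacement = rng.sample(cards, k=10)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)

# Probability of drawing two aces from a standard 52-card deck.
p_with = Fraction(4, 52) * Fraction(4, 52)     # second draw unaffected by the first
p_without = Fraction(4, 52) * Fraction(3, 51)  # second draw depends on the first
print(p_with, p_without)  # 1/169 1/221
```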
Stratified Random Sample
the population is divided into several strata, and random samples are then drawn from each stratum; when the strata are homogeneous but different from each other, a stratified sample may yield more consistent results
Strata
subpopulations that are internally homogeneous but distinctly different from one another
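A minimal sketch of a stratified random sample, assuming a hypothetical school stratified by grade level (the grade sizes and the helper name `stratified_sample` are invented for illustration). A separate random sample is drawn inside every stratum:

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of n_per_stratum from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

# Hypothetical school, stratified by grade level.
strata = {
    "freshmen": [f"fr_{i}" for i in range(120)],
    "sophomores": [f"so_{i}" for i in range(110)],
    "juniors": [f"jr_{i}" for i in range(100)],
    "seniors": [f"sr_{i}" for i in range(90)],
}

stratified = stratified_sample(strata, 10, seed=3)
for grade, chosen in stratified.items():
    print(grade, len(chosen))
```

Every stratum is guaranteed to appear in the sample, which is what distinguishes this from an SRS of the whole school.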
Cluster Sample
entire groups, “clusters”, are chosen at random; usually selected as a matter of convenience, practicality, or cost; each cluster should be heterogeneous (and representative of the population) so all the clusters should be similar to each other
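A sketch of a cluster sample, assuming hypothetical homerooms as the clusters (names and sizes are made up). Unlike stratified sampling, only some groups are chosen, and everyone in a chosen group is included:

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: 8 homerooms of 25 students each.
homerooms = {
    f"room_{r}": [f"room_{r}_student_{s}" for s in range(25)]
    for r in range(1, 9)
}

picked = cluster_sample(homerooms, 2, seed=6)
print(sorted(picked))
```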
Multistage Sample
combines several sampling methods.
example: national polling service may stratify the country by geographical regions, select a random sample of cities from each region, and then interview a cluster of residents in each city
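The polling example above can be sketched roughly as follows. The country layout (4 regions, 6 cities per region, 4 neighborhood blocks per city, 5 residents per block) is entirely hypothetical, and treating a neighborhood block as the "cluster of residents" is an assumption made for illustration:

```python
import random

rng = random.Random(4)

# Hypothetical country: region -> city -> block -> residents.
country = {
    region: {
        f"{region}_city{c}": {
            f"{region}_city{c}_block{b}": [
                f"{region}_c{c}_b{b}_r{r}" for r in range(5)
            ]
            for b in range(4)
        }
        for c in range(6)
    }
    for region in ["north", "south", "east", "west"]
}

interviewed = []
for region, cities in country.items():            # stage 1: every region is a stratum
    for city in rng.sample(sorted(cities), 2):     # stage 2: random sample of cities
        block = rng.choice(sorted(cities[city]))   # stage 3: one random cluster per city
        interviewed.extend(cities[city][block])    # interview everyone in that cluster

print(len(interviewed))  # 4 regions x 2 cities x 5 residents = 40
```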
Systematic Sample
drawn by selecting individuals systematically from a sampling frame; when there is no relationship between the order of the sampling frame and the variables of interest, a systematic sample can be representative
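A minimal sketch of this kind of sample: take every k-th individual from the sampling frame. The random starting point is an assumption (a common practice, not stated in the definition above), and the frame of 100 numbered individuals is made up:

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th individual from the frame, starting at a random offset."""
    rng = random.Random(seed)
    k = len(frame) // n        # sampling interval
    start = rng.randrange(k)   # assumed: random start within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical sampling frame: individuals numbered 1 to 100.
frame = list(range(1, 101))

every_kth = systematic_sample(frame, 10, seed=5)
print(every_kth)
```

If the frame were ordered in a way related to the variable of interest (for example, sorted by score), every k-th pick could be systematically unrepresentative.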
Convenience Sample
consists of the individuals who are conveniently available; often fails to be representative because not every individual in the population is equally convenient to sample