population
The entire group of individuals about which we want information.
sample
A subset of the population from which we collect information.
census
Collects data from the entire population.
convenience sample
Choosing individuals who are easiest to reach, not randomized.
voluntary response sampling
Allows people to choose to be in the sample by responding to a general invitation, often biased.
simple random sample (SRS)
Every group of n individuals in the population has an equal chance to be selected as the sample.
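A minimal sketch of drawing an SRS, assuming a hypothetical population of ID numbers 1 to 100. The function name `simple_random_sample` is illustrative; `random.sample` draws without replacement, so every group of n individuals is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct individuals; every group of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population: individuals labeled 1..100.
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=1)
print(srs)
```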
stratified random sample
Selects a sample by choosing an SRS from each stratum and combining the SRSs into one overall sample.
cluster sample
Obtained by selecting all individuals within a randomly selected collection or group of individuals.
strata
Groups within a population that are homogeneous based on a relevant characteristic.
clusters
Groups within diverse populations that ideally represent the population on a smaller scale.
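The difference between the two designs can be sketched in a few lines. The grade levels and homerooms below are made-up data, and both function names are illustrative: a stratified sample takes an SRS from every stratum, while a cluster sample keeps everyone in a few randomly chosen clusters.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Take an SRS of n_per_stratum from every stratum and combine them."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose n_clusters whole clusters and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

rng = random.Random(0)

# Hypothetical strata: students grouped by grade level (50 per grade).
strata = {g: [f"{g}-{i}" for i in range(50)] for g in ("9th", "10th", "11th", "12th")}
# Hypothetical clusters: students grouped by homeroom (25 per room).
clusters = {f"room{r}": [f"room{r}-{i}" for i in range(25)] for r in range(8)}

stratified = stratified_sample(strata, 5, rng)   # 5 students from every grade
clustered = cluster_sample(clusters, 2, rng)     # all students in 2 homerooms
print(len(stratified), len(clustered))
```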
systematic random sample
Selects a sample from an ordered arrangement of the population by randomly selecting one of the first k individuals and choosing every kth individual thereafter.
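As a rough illustration of the definition above, assuming a hypothetical ordered list of 100 individuals and k = 10: pick a random starting point among the first k, then take every kth individual.

```python
import random

def systematic_sample(ordered_population, k, seed=None):
    """Randomly pick one of the first k individuals, then every kth after that."""
    rng = random.Random(seed)
    start = rng.randrange(k)          # index 0..k-1, i.e. one of the first k
    return ordered_population[start::k]

# Hypothetical ordered population: individuals numbered 1..100.
ordered = list(range(1, 101))
sys_sample = systematic_sample(ordered, k=10, seed=3)
print(sys_sample)
```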
bias
The systematic favoring of certain outcomes due to the method of collecting data.
sampling frame
The list of individuals from which the sample is actually drawn; ideally it includes every individual in the population.
nonresponse bias
Occurs when an individual chosen for the sample can't be contacted or refuses to respond.
undercoverage
Occurs when some groups in the population are left out of the sampling process.
observational study
Observes individuals and measures variables of interest without attempting to influence responses.
response bias
When participants provide inaccurate, false, or misleading answers due to various influences.
A relationship between two or more variables, not implying causation.
confounding variable
An outside factor that influences both the independent and dependent variable.
experiment
Imposes treatments on individuals to measure their responses.
treatment
A specific condition applied to the individuals in a study.
factors
The explanatory variables being manipulated that may cause a change in the response variable.
levels
Different values of the factors applied in an experiment.
placebo
A treatment that has no active ingredient but otherwise resembles the other treatments.
single-blind study
Either the subjects or the researchers are unaware of who receives active treatment or placebo.
double-blind study
Neither the participant nor the researcher knows who received the treatment or placebo.
placebo effect
Experimental results caused by expectations alone.
control group
The group that does not receive the experimental treatment for comparison.
random assignment
Creates groups that are roughly equivalent at the beginning of an experiment.
statistical significance
A result is statistically significant when it would be very unlikely to occur by chance alone if there were no real effect.
replication
Giving each treatment to enough experimental units to distinguish differences in treatment effects.
completely randomized design
Treatments are assigned to experimental units completely by chance.
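One way to picture a completely randomized design, assuming 12 hypothetical subjects and three made-up treatment labels: shuffle all the units, then deal them out to the treatments in turn. The function name is illustrative.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle all units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experimental units and treatment labels.
units = [f"subject{i}" for i in range(12)]
crd_groups = completely_randomized(units, ["A", "B", "C"], seed=7)
for treatment, members in crd_groups.items():
    print(treatment, members)
```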
randomized block design
Random assignment of experimental units to treatments carried out within each block.
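A sketch of the same idea with blocking, assuming two hypothetical blocks of six subjects each. The only change from a completely randomized design is that the shuffle happens separately inside each block, so every block ends up with a balanced set of treatments.

```python
import random
from collections import Counter

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for units in blocks.values():
        shuffled = list(units)
        rng.shuffle(shuffled)
        for i, unit in enumerate(shuffled):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical blocks formed on a characteristic expected to affect the response.
blocks = {
    "block1": [f"b1-{i}" for i in range(6)],
    "block2": [f"b2-{i}" for i in range(6)],
}
block_assignment = randomized_block(blocks, ["drug", "placebo"], seed=5)
for name, block_units in blocks.items():
    print(name, Counter(block_assignment[u] for u in block_units))
```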
matched pairs design
Pairs of subjects are matched on a characteristic and randomly assigned to groups.
inference
Using information from a sample to draw conclusions about the population.
sampling variability
The natural tendency of randomly drawn samples to differ from one another.
scope of inference
The extent to which conclusions can be made about the population.
frequency table
Summarizes one categorical variable using counts.
relative frequency table
Summarizes one categorical variable using percentages or proportions.
two-way table
A table containing counts for two categorical variables.
marginal relative frequency
The percent of individuals with a specific value for one categorical variable.
joint relative frequency
The percent of individuals that have specific values for two categorical variables.
conditional relative frequency
The percent of individuals with a specific value for one categorical variable among those who share a value of another variable.
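The three kinds of relative frequency differ only in what goes in the denominator. The small two-way table below is invented for illustration (class year by whether a student has a job).

```python
# Hypothetical two-way table of counts: (class year, has a job).
counts = {
    ("junior", "yes"): 30, ("junior", "no"): 20,
    ("senior", "yes"): 10, ("senior", "no"): 40,
}

total = sum(counts.values())
juniors = sum(v for (year, _), v in counts.items() if year == "junior")

# Marginal: one variable only, out of everyone.
marginal_junior = juniors / total
# Joint: specific values of both variables, out of everyone.
joint_junior_yes = counts[("junior", "yes")] / total
# Conditional: "yes" among juniors only, so the denominator is the junior row.
conditional_yes_given_junior = counts[("junior", "yes")] / juniors

print(marginal_junior, joint_junior_yes, conditional_yes_given_junior)
```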
mosaic plot
A modified segmented bar graph where the width of each rectangle is proportional to the number in that category.
For describing distribution: Shape, Unusual values, Center, Spread.
A graphical representation of a dataset that organizes and displays data while preserving original values.
5 number summary
Includes min, Q1, median, Q3, and max.
density curve
A model that describes the overall pattern of a distribution of a random variable.
For describing correlation: Direction, Unusual values, Form, Strength.
high leverage points
Points with much larger or smaller x-values than other points in the dataset.
outliers (regression)
Points that do not follow the data pattern and have large residuals.
influential point
An extreme value whose removal drastically changes the slope, y-intercept, or correlation.
regression line (LSRL)
The line predicted y = a + bx that minimizes the sum of the squared residuals.
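For reference, the slope and intercept can be computed directly from the data. This sketch uses the standard least-squares formulas b = Sxy / Sxx and a = ȳ − b·x̄; the data are made up so that the points fall exactly on y = 1 + 2x.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line predicted y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx             # slope
    a = y_bar - b * x_bar     # intercept: the line passes through (x_bar, y_bar)
    return a, b

# Made-up data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(a, b, residuals)
```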
power model
A model of the form y = a·x^p; it fits when taking the logarithm of both variables linearizes the data.
exponential model
A model of the form y = a·b^x; it fits when taking the logarithm of only the y-variable linearizes the data.
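The two transformations can be checked numerically. In this sketch the data are generated from known curves (y = 3·2^x and y = 5·x^1.5, both invented for illustration), a straight line is fit to the transformed values, and the original constants are recovered from the slope and intercept. The helper `fit_line` is a plain least-squares fit.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept and slope for ys against xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

xs = [1, 2, 3, 4, 5]

# Exponential model y = c * r**x: ln y = ln c + x * ln r (log y only).
exp_ys = [3 * 2 ** x for x in xs]
intercept, slope = fit_line(xs, [math.log(y) for y in exp_ys])
exp_c, exp_r = math.exp(intercept), math.exp(slope)

# Power model y = c * x**p: ln y = ln c + p * ln x (log both variables).
pow_ys = [5 * x ** 1.5 for x in xs]
intercept, slope = fit_line([math.log(x) for x in xs], [math.log(y) for y in pow_ys])
pow_c, pow_p = math.exp(intercept), slope

print(exp_c, exp_r, pow_c, pow_p)
```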
large counts condition
The normal approximation is appropriate when np ≥ 10 and n(1 − p) ≥ 10.
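The check is simple enough to write as a one-line function. The name `large_counts_ok` and the example values are illustrative.

```python
def large_counts_ok(n, p, threshold=10):
    """True when both the expected successes and failures are at least threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(large_counts_ok(200, 0.5))   # expected counts 100 and 100
print(large_counts_ok(50, 0.1))    # expected successes only 5
print(large_counts_ok(40, 0.95))   # expected failures only about 2
```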
central limit theorem
When the sample size is large (n ≥ 30 is a common rule of thumb), the sampling distribution of the sample mean is approximately normal, regardless of the shape of the population distribution.
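A small simulation can make this concrete. The sketch below assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1, chosen only for illustration), draws 5000 samples of size 30, and looks at the distribution of the sample means. Their center should be close to the population mean and their spread close to σ/√n.

```python
import math
import random
import statistics

rng = random.Random(42)
n = 30            # size of each sample
repeats = 5000    # how many samples to draw

def sample_mean(size):
    """Mean of one random sample from a skewed (exponential) population."""
    return statistics.fmean([rng.expovariate(1.0) for _ in range(size)])

means = [sample_mean(n) for _ in range(repeats)]
center = statistics.fmean(means)
spread = statistics.stdev(means)

print(round(center, 3), round(spread, 3), round(1 / math.sqrt(n), 3))
```

Plotting a histogram of `means` would show a roughly symmetric, bell-shaped pattern even though the population itself is skewed.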