AP Statistics | 2024-2025
What is a population?
the entire group of individuals we want information about
What is a sample?
a subset of individuals in the population from which we actually collect data
What is a census?
data from every individual in the population
What is a sample survey?
a study that collects data from a sample that is chosen to represent a specific population
What are the steps for planning a sample survey?
Decide what population you want to describe
Decide what you want to measure
Decide how to choose a sample from the population
What does poor sampling lead to in your results?
bias
What is bias?
using a method that will consistently overestimate or underestimate the value you want to know
What is convenience sampling?
choosing individuals who are easy to reach
What is voluntary response sampling?
allowing people to choose to be in the sample by responding to a general invitation
Why might voluntary response sampling show bias?
because people with strong feelings (often in the same direction) are most likely to respond
How do you keep the conclusion of your study from being invalidated?
by making sure the sample is chosen by a truly random process
What is random sampling?
a chance process to determine which members of a population are included in the sample
What is a simple random sample (SRS)?
a sample chosen in such a way that every group of n individuals in the population has an equal chance to be selected as the sample
Why might you choose a sample by chance?
to avoid bias affecting the results
How can you choose an SRS?
using technology or Table D
What are the 3 steps to choosing an SRS?
Label
Randomize
Select
What is N in regard to SRS?
the number of individuals in the population
What is n in regard to SRS?
sample size
What is the Label step of choosing an SRS with technology?
Give each individual in the population a distinct numerical label from 1 to N
What is the Randomize step of choosing an SRS with technology?
Use a random number generator to obtain n different integers from 1 to N
What is the Select step of choosing an SRS with technology?
Choose the individuals that correspond to the randomly selected integers
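The Label, Randomize, and Select steps can be sketched in Python. This is only an illustration: the function name choose_srs, the seed parameter, and the list of student names are made up for the example, and Python's random module stands in for the random number generator.

```python
import random

def choose_srs(population, n, seed=None):
    """Return a simple random sample of n individuals from population."""
    rng = random.Random(seed)
    N = len(population)
    # Label: the individual at position i gets label i + 1, so labels run 1..N.
    labels = range(1, N + 1)
    # Randomize: draw n different labels (sample() never repeats a label).
    chosen = rng.sample(labels, n)
    # Select: take the individuals that match the chosen labels.
    return [population[label - 1] for label in chosen]

# Hypothetical population of N = 8 students.
students = ["Ana", "Ben", "Cho", "Dev", "Eli", "Fay", "Gus", "Hao"]
srs = choose_srs(students, n=3, seed=1)
print(srs)
```

Because the labels are drawn without replacement, every group of 3 students has the same chance of being the sample.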
How do you choose an SRS using a calculator?
Math → PRB → 5: randInt(1, N)
What is the Label step of choosing an SRS with Table D?
Give each member of the population a numerical label with the same number of digits. Use as few digits as possible
What is the Randomize step of choosing an SRS with Table D?
Read consecutive groups of digits of the appropriate length from left to right across a line in Table D. Ignore any group of digits that wasn’t used as a label or that duplicates a label already in the sample. Stop when you have chosen n different labels
What is the Select step of choosing an SRS with Table D?
Choose the individuals that correspond to the randomly selected integers
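The Table D procedure can also be sketched in Python. The assumptions here are mine: labels run from 01 to N (no individual is labeled 00), the function name srs_from_digits is made up, and the line of digits is typed in for illustration rather than copied from a particular line of Table D.

```python
def srs_from_digits(digit_line, N, n):
    """Read fixed-width groups of digits left to right and keep n valid labels."""
    digits = [c for c in digit_line if c.isdigit()]  # drop the spaces
    width = len(str(N))  # fewest digits that can label 1..N
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int("".join(digits[start:start + width]))
        # Ignore groups that are not labels or that repeat an earlier label.
        if 1 <= label <= N and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break  # stop once n different labels are chosen
    return chosen

# Illustrative digits, population labeled 01 to 30, sample size 4.
line = "71487 09984 29077 14863 61683 47052 62224 51025"
labels = srs_from_digits(line, N=30, n=4)
print(labels)  # -> [29, 7, 22, 10]
```

Reading in pairs gives 71, 48, 70, 99, 84 (all skipped), then 29 and 07 (kept), and so on until four labels are found.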
What is a table of random digits?
a long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with these two properties:
each entry in the table is equally likely to be any of the 10 digits (0-9)
the entries are independent of each other, and knowledge of one part of the table gives no information about any other part
What are strata?
groups of individuals in a population who are similar in some way that might affect their responses
What is a stratified random sample?
a sample that takes an SRS within each stratum and combines the SRSs into one overall sample
Why is it beneficial to use a stratified random sample?
it provides a more precise estimate with less variability
How do you choose a variable to stratify by?
pick the variable that is the best predictor of what you’re measuring
When is it preferred to use cluster sampling instead of SRS or stratified random sampling?
when the populations are large and spread over a wide area
What is a cluster?
a group of individuals that are located near each other
What is cluster sampling?
randomly choosing clusters and including each member of the selected clusters in the sample
Why are cluster samples used?
for practical reasons like saving time and money
When do cluster samples work best?
when the cluster looks like the population, just on a smaller scale
How do you describe stratified random sampling?
Define the strata
Obtain an SRS of [n / number of strata] from each [stratum]
Result – a stratified random sample of n individuals
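A minimal Python sketch of this description, assuming equal-size SRSs from every stratum. The grade-level strata, the student IDs, and the name stratified_sample are hypothetical.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take an SRS of n_per_stratum from each stratum and combine them."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical school: 4 strata (grade levels) of 100 students each.
strata = {
    grade: [f"{grade}-{i}" for i in range(1, 101)]
    for grade in ["9th", "10th", "11th", "12th"]
}
n = 20
sample = stratified_sample(strata, n // len(strata), seed=7)  # 5 per stratum
print(len(sample))  # -> 20
```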
How do you describe cluster sampling?
Use […] as clusters, assuming x individuals per [cluster]
Randomly select [n / number of individuals per cluster] clusters
Result – the n individuals will be our sample
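A minimal Python sketch of this description, assuming every cluster has the same size. The homerooms, the student IDs, and the name cluster_sample are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and include every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [person for name in chosen for person in clusters[name]]
    return chosen, sample

# Hypothetical school: 40 homerooms (clusters) with x = 25 students each.
clusters = {
    f"homeroom {k:02d}": [f"H{k:02d}-S{i}" for i in range(1, 26)]
    for k in range(1, 41)
}
n, x = 100, 25
chosen, sample = cluster_sample(clusters, n // x, seed=3)  # 4 homerooms
print(chosen, len(sample))
```

Only the clusters are chosen by chance; once a homeroom is picked, all 25 of its students are in the sample.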
What is the drawback of SRS?
there is a large amount of variability, and it is time-consuming
What is the drawback of stratified random sampling?
there might not be many individuals for some strata, which can influence the result
What is the drawback of cluster sampling?
the clusters used may not be good representations of the entire population
What is systematic random sampling?
selecting a sample from an ordered arrangement of the population by randomly selecting one of the first k individuals and then choosing every kth individual thereafter
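A short Python sketch of systematic random sampling, assuming the population is already in order. The function name systematic_sample and the numbered population are made up for the example.

```python
import random

def systematic_sample(ordered_population, k, seed=None):
    """Pick a random start among the first k, then take every kth individual."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # 0-based position of one of the first k individuals
    return ordered_population[start::k]

# Hypothetical ordered list of 100 individuals, numbered 1 to 100.
population = list(range(1, 101))
sample = systematic_sample(population, k=10, seed=4)
print(sample)
```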
What can affect sample surveys in addition to sampling variability?
errors
What do good sampling techniques include?
the art of reducing all sources of error
When does undercoverage occur?
when some members of the population are less likely to be chosen or cannot be chosen in a sample
When does nonresponse occur?
when an individual chosen for the sample can’t be contacted or refuses to participate
When does response bias occur?
when there is a systematic pattern of inaccurate answers to a survey question
What is the most important influence on the answers given to a sample survey?
the wording of questions
Why should you rely on random sampling?
to avoid bias in selecting samples from the lists of available individuals
the laws of probability allow trustworthy inference about the population
What is a margin of error?
how far, at most, we expect the sample estimate to be from the true population value
What is the benefit of increasing the sample size?
increased precision (but not accuracy)
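A small simulation can illustrate the precision gain. This sketch assumes a made-up population in which the true proportion is 0.6 and compares many samples of size 50 with many samples of size 800. The function name and all numbers are chosen only for illustration.

```python
import random
import statistics

def sample_proportions(p, n, reps, rng):
    """Simulate reps random samples of size n and return each sample proportion."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(2024)
small = sample_proportions(0.6, 50, 500, rng)   # 500 samples of size 50
large = sample_proportions(0.6, 800, 500, rng)  # 500 samples of size 800

spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)
print(round(spread_small, 3), round(spread_large, 3))
```

The sample proportions from the larger samples vary less from sample to sample, which is what greater precision means. A larger sample does not fix bias from a poor sampling method.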
What are errors in design methods (designer flaw)?
convenience sampling
voluntary response sampling
What are errors causing response bias (response flaw)?
undercoverage
nonresponse
wording of questions
What is an observational study?
a study that observes individuals and measures variables of interest but does not attempt to influence the response
What is a retrospective observational study?
one that examines existing data
What is a prospective observational study?
one that tracks individuals into the future
When does confounding occur?
when two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other
What does an experiment do?
deliberately imposes some treatment on individuals to measure their responses
What is a placebo?
a treatment that has no active ingredient, but is otherwise like other treatments
What is the only source of full convincing data when our goal is to understand cause and effect?
experiments
What is a treatment?
a specific condition applied to the individuals in an experiment
What is an experimental unit?
the object to which a treatment is randomly assigned
What are experimental units called when they are human beings?
subjects
How do experiments differ from observational studies?
observational studies observe individuals and ask them questions, while experiments impose some treatment in order to measure the response
Why do observational studies of the effect of an explanatory variable on a response variable often fail?
because of confounding between the explanatory variable and one or more other variables
What do well-designed experiments take steps to do?
prevent confounding
What is a factor in an experiment?
an explanatory variable that is manipulated and may cause a change in the response variable
What are levels in an experiment?
the different values of a factor
Why is a control group used?
to provide a baseline for comparing effects of other treatments
What is the placebo effect?
the effect that some subjects in an experiment will respond favorably to any treatment, even an inactive treatment
What is a double-blind experiment?
an experiment in which neither the subjects nor those who interact with them and measure the response variable know which treatment a subject received
What is a single-blind experiment?
an experiment in which either the subjects don’t know which treatment they are receiving or the people who interact with them and measure the response variable don’t know which subjects are receiving which treatment
What is random assignment in an experiment?
using chance to assign experimental units to treatments
What is the purpose of random assignment?
to help create roughly equivalent groups of experimental units by balancing the effects of other variables among the treatments
What does control mean in an experiment?
keeping other variables constant for all experimental units
What does random assignment ensure?
that the effects of uncontrolled variables are balanced among treatment groups
What is replication in an experiment?
using enough experimental units to distinguish a difference in the effects of the treatments from chance variation due to the random assignment
What can replication also refer to?
repeating the experiment with different subjects
How does an experiment benefit from replication?
the effects of chance variation are reduced, so real differences between treatments are easier to detect
What are the 4 principles of experimental design?
comparison
random assignment
control
replication
What is comparison?
using a design that compares two or more treatments
What is a completely randomized design?
a design in which the experimental units are assigned to the treatments completely by chance
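A minimal Python sketch of a completely randomized design, assuming equal group sizes. The subject names, the treatment names, and the function name completely_randomized are hypothetical.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle all units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment: 20 subjects, two treatments.
subjects = [f"subject {i}" for i in range(1, 21)]
groups = completely_randomized(subjects, ["new drug", "placebo"], seed=11)
print({t: len(g) for t, g in groups.items()})  # -> {'new drug': 10, 'placebo': 10}
```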
What is a block?
a group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments
What is a randomized block design?
a design in which the random assignment of experimental units to treatments is carried out separately within each block
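A minimal Python sketch of a randomized block design. The blocks (by age group), the unit names, and the function name randomized_block are hypothetical, and the sketch assumes the number of units in each block is a multiple of the number of treatments.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign units to treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # the shuffle happens inside one block only
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks formed before the experiment.
blocks = {
    "under 40": [f"U{i}" for i in range(1, 9)],    # 8 subjects
    "40 and over": [f"O{i}" for i in range(1, 7)],  # 6 subjects
}
assignment = randomized_block(blocks, ["A", "B"], seed=5)
print(assignment)
```

Each block ends up split evenly between the treatments, so the blocking variable cannot favor one treatment group.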
What does blocking account for?
a source of variability
What are the best variables to use for blocking?
ones that best predict the response variable
What is a matched pairs design?
a design for comparing two treatments that uses blocks of size 2
How are pairs used in matched pairs designs?
either two very similar experimental units are paired and the two treatments are randomly assigned within each pair, or each experimental unit receives both treatments in a random order
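A short Python sketch of the first kind of matched pairs design, in which two similar units are paired and chance decides which unit in each pair gets which treatment. The pairs, the treatment names, and the function name matched_pairs are hypothetical.

```python
import random

def matched_pairs(pairs, treatments=("A", "B"), seed=None):
    """Within each pair, randomly decide which unit gets which treatment."""
    rng = random.Random(seed)
    result = []
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # acts like a coin flip for this pair
        result.append({first: order[0], second: order[1]})
    return result

# Hypothetical pairs of very similar subjects.
pairs = [("twin 1a", "twin 1b"), ("twin 2a", "twin 2b"), ("twin 3a", "twin 3b")]
design = matched_pairs(pairs, seed=8)
print(design)
```

For the second kind of design, where each unit receives both treatments, the same shuffle could decide the order of treatments for each unit instead.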
What do researchers usually hope to see in an experiment?
a difference in the responses that is so large that it is unlikely to have happened just by chance variation
How can we learn whether the treatment effects are larger than we would expect to see if only chance were operating?
by using the laws of probability
When is an observed effect statistically significant?
when it is so large that it would rarely occur by chance alone
True or false: A statistically significant association in data from a well-designed experiment does not imply causation.
false
What do we need to do when we do an experiment and find a difference between two groups?
we need to determine whether this difference can be attributed to the chance variation in random assignment or to a real difference in the effects of the treatments
How can we determine if the results of our experiment are statistically significant?
by conducting a simulation that models truly random outcomes and using the results to assess statistical significance
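One way such a simulation might look in Python: pool the responses, re-do the random assignment many times, and see how often chance alone produces a difference at least as large as the one observed. The data, the function name rerandomize, and the 0.05 cutoff mentioned afterward are assumptions for illustration, not results from a real study.

```python
import random

def rerandomize(group_a, group_b, reps=5000, seed=None):
    """Estimate how often chance alone gives a difference >= the observed one."""
    rng = random.Random(seed)
    size_a = len(group_a)
    observed = sum(group_a) / size_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # simulate a new random assignment
        sim_a, sim_b = pooled[:size_a], pooled[size_a:]
        if sum(sim_a) / len(sim_a) - sum(sim_b) / len(sim_b) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / reps

# Hypothetical responses from 8 treated and 8 control subjects.
treatment = [12, 15, 14, 16, 13, 17, 15, 14]
control = [10, 11, 12, 9, 13, 10, 11, 12]
observed, proportion = rerandomize(treatment, control, seed=42)
print(observed, proportion)
```

If the proportion is small (below 0.05 is a common cutoff), a difference this large would rarely occur by chance alone, so the result is statistically significant.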
What does the scope of inferences refer to?
the types of inferences (conclusions) that can be drawn from a study
When can we make inferences about the population?
when the individuals are randomly selected from a population
When can we make inferences about cause and effect?
when the individuals are randomly assigned to groups
What do well-designed experiments do?
randomly assign individuals to treatment groups
Why can’t inferences about cause and effect be made in regard to observational studies?
they don’t randomly assign individuals to groups
What type of observational studies can make inferences about population?
ones that use random sampling
What does a well-designed experiment tell us?
that changes in the explanatory variable cause changes in the response variable