Untitled Flashcards Set
population - entire group we want info from
sample - subset of population we actually collect data from
census - survey that collects info from everyone in the population
convenience sample - sample made up of the individuals who are easiest for the surveyor to reach
problem with convenience samples - they lead to bias because individuals are chosen for the sampler's convenience (not at random), so the sample is unlikely to be a good representation of the population
bias - a study shows bias if it is likely to consistently overestimate/underestimate the value you are trying to identify
voluntary response sample - sample from those who volunteer
problem with voluntary response samples - people who volunteer their info often have stronger opinions (often negative ones) than the rest of the population
simple random sample - chosen in a way that every group of individuals of the chosen sample size has an equal chance of being the sample (names in a hat, random number generator, table of random digits)
How to choose a simple random sample - Label, randomize, select
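A minimal Python sketch of the label, randomize, select steps. The function name, the student names, and the seed are made up for illustration; `random.Random.sample` draws without repeats.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Label, randomize, select: return sample_size distinct individuals."""
    rng = random.Random(seed)
    # Label: give each individual a number from 1 to the population size.
    labels = {i + 1: person for i, person in enumerate(population)}
    # Randomize: draw distinct labels (sampling without replacement).
    chosen = rng.sample(sorted(labels), sample_size)
    # Select: take the individuals whose labels were drawn.
    return [labels[k] for k in chosen]

# Hypothetical population of 8 students.
students = ["Ana", "Ben", "Cy", "Dee", "Eli", "Fay", "Gus", "Hal"]
sample = simple_random_sample(students, 3, seed=1)
print(sample)
```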
strata - groups formed by splitting a population on a variable/aspect in order to capture data from more diverse perspectives; individuals within a stratum share a characteristic (best strata for TS concert is rows so fans from all distances away from TS can respond)
how do you know which method will produce the best estimate for a whole population based on the dotplot - the estimates are clustered closely together around the true value rather than spread out
stratified random sample - split population into strata, then take a simple random sample from each stratum and combine them into one overall sample
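A minimal Python sketch of a stratified random sample, using the concert-rows idea as the grouping. The fan data, the function name, and the choice of 2 fans per group are assumptions for illustration only.

```python
import random

def stratified_random_sample(population, stratum_of, n_per_stratum, seed=None):
    """Take a simple random sample of n_per_stratum from every stratum."""
    rng = random.Random(seed)
    # Split the population into strata using the shared characteristic.
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    # Simple random sample within each stratum, combined into one sample.
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

# Hypothetical fans stored as (row section, seat number).
fans = [(row, seat) for row in ("front", "middle", "back") for seat in range(1, 6)]
sample = stratified_random_sample(fans, lambda fan: fan[0], 2, seed=1)
print(sample)
```

Every row section is guaranteed to appear in the sample, which a simple random sample cannot promise.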
how do you know if a sampling method produces the best estimates - low bias and low variability
how do you know if there is low bias - the estimates are centered on the true population value rather than clustered somewhere else (high bias)
why use a stratified random sample - when individuals in a population vary a lot, it gives less variable estimates and prevents randomly selecting only similar individuals by accident
how to choose strata - Choose the variable with the strongest association with the response you are measuring
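A small simulation sketch of bias, assuming a made-up population and a made-up "convenient" group (the first 200 individuals in a sorted list). It compares where the estimates from each method are centered relative to the true mean.

```python
import random
import statistics

rng = random.Random(0)
# Hypothetical population of 1000 values, sorted so that the individuals
# who are "easiest to reach" (the first ones) all have small values.
population = sorted(rng.gauss(50, 10) for _ in range(1000))
true_mean = statistics.mean(population)

def srs_estimate():
    # Sample mean from a simple random sample of 20.
    return statistics.mean(rng.sample(population, 20))

def convenience_estimate():
    # Sample mean from 20 individuals taken only from the first 200.
    return statistics.mean(rng.sample(population[:200], 20))

srs_estimates = [srs_estimate() for _ in range(500)]
convenience_estimates = [convenience_estimate() for _ in range(500)]

print("true mean:", round(true_mean, 2))
print("center of SRS estimates:", round(statistics.mean(srs_estimates), 2))
print("center of convenience estimates:", round(statistics.mean(convenience_estimates), 2))
```

The simple random sample estimates center near the true mean (low bias), while the convenience estimates center well below it (an underestimate).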
in an explanation of bias make sure to _______ - Include the direction: whether the estimate would be higher or lower than the true value
replacement - an individual being put back into the population with a chance of re-selection
n - number of individuals in a population/stratum (many textbooks write N for the population size and n for the sample size)
When describing a random number generator, make sure to say whether or not to allow ______ - Repeats
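A minimal Python sketch of using a random number generator while ignoring repeats, which is the same as sampling without replacement. The function name and the numbers 30 and 5 are assumptions for illustration.

```python
import random

def draw_labels_ignoring_repeats(population_size, sample_size, seed=None):
    """Generate random labels from 1 to population_size, skipping repeats."""
    if sample_size > population_size:
        raise ValueError("cannot pick more distinct labels than exist")
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < sample_size:
        label = rng.randint(1, population_size)  # inclusive on both ends
        if label not in chosen:  # ignore repeats
            chosen.append(label)
    return chosen

# Hypothetical class of 30 students; pick 5 different labels.
labels = draw_labels_ignoring_repeats(30, 5, seed=2)
print(labels)
```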
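Cluster sample - Split the population into groups (clusters), randomly select some clusters, and collect data from all of the individuals in the chosen clusters (a stratified sample instead takes a sample from every stratum)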
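A minimal Python sketch of a cluster sample, where the groups are called clusters. The section names and people are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep everyone in them."""
    rng = random.Random(seed)
    # Randomly choose which clusters are used.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Collect data from every individual in the chosen clusters.
    sample = []
    for name in chosen:
        sample.extend(clusters[name])
    return chosen, sample

# Hypothetical seating sections, each listing the people in it.
sections = {
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3"],
    "C": ["c1", "c2", "c3"],
    "D": ["d1", "d2", "d3"],
}
chosen, sample = cluster_sample(sections, 2, seed=3)
print(chosen, sample)
```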
Systematic random sample - Pick a random starting point, then select individuals at equal intervals (every kth individual)
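A minimal Python sketch of a systematic random sample. The interval `k = 10` and the list of 100 labels are assumptions for illustration.

```python
import random

def systematic_random_sample(frame, k, seed=None):
    """Random start among the first k individuals, then every kth one."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index from 0 to k - 1
    return frame[start::k]

# Hypothetical list of 100 individuals labeled 1 to 100.
frame = list(range(1, 101))
sample = systematic_random_sample(frame, 10, seed=4)
print(sample)
```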
Sampling frame - List of individuals that form the population from which a sample is taken
Undercoverage - Type of (inherent) bias where members of a population are excluded because of how a sample is collected
Nonresponse - Type of bias where individuals chosen for a sample are unwilling or unable to participate; it only happens after the sample is already selected and can be reduced with incentives, reminders, or keeping the survey short
Response bias - Pattern of inaccurate responses
Potential causes for response bias - Embarrassment of real answer, confusion on question wording, characteristics of the interviewer
observational study - no treatment imposed
experiment - treatment imposed
Why can’t clear cause/effect relationships be given? - There might be other variables/factors influencing the response variable, you need an experiment specifically to show such a relationship
Control group - Experiment case with no treatment (for comparison)
Single blind - subjects don't know which group they're in
Double blind - neither the subjects nor the people running the experiment know which group each subject is in
To estimate the proportion of families that oppose budget cuts to the athletic department, the principal surveys families as they enter the football stadium on Friday night. Explain how this plan will result in bias and how the bias will affect the estimated proportion - Convenience sample: Since these families are attending a football game, they probably support the athletic department and are more likely to oppose the cuts, so the estimate would overestimate the proportion of families in the population that oppose the budget cuts.
Difference between voluntary response and nonresponse bias - Nonresponse can only happen AFTER the sample is chosen (25 people hang up on the survey call while 75 people answer) while people CHOOSE to be in the voluntary response sample (75 people calling in)
Observational studies can’t show cause and effect because - there might be confounding variables influencing the response; you need an experiment WITH random assignment to show cause and effect (a random sample is what lets you generalize to the population)
Experimental units - The individuals or objects that treatments are assigned to (called subjects when they are people)
Experiment or observational? - Assigned/forced (+ treatment) or voluntary/natural
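Factors - The explanatory variables in an experiment that the experimenter controls
Levels of factors - The specific values/options each factor can take
Replication - Using enough experimental units in each treatment group so that differences in the response can be told apart from chance variation (doing the experiment multiple times also counts)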
The track coach decided to offer the track athletes two options for daily workouts: a rigorous strength building workout, or a relaxed workout of mostly stretching. After two months, the athletes that chose the strength building workout were 11% faster, while the relaxed workout group was only 3% faster. What is wrong with this experiment? - Athletes got to choose their treatment (no random assignment), so the difference could come from the kind of athlete who picks each workout
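A minimal Python sketch of what random assignment could look like in place of letting athletes choose. The athlete labels and the function name are hypothetical.

```python
import random

def randomly_assign(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical roster of 10 athletes.
athletes = [f"athlete_{i}" for i in range(1, 11)]
groups = randomly_assign(athletes, ["strength", "stretching"], seed=5)
print(groups)
```

Chance, not the athlete, decides who gets each workout, so the two groups should be roughly similar before the treatments start.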
On a recent test day, an AP Stats teacher wanted to determine if smart pills would improve grades. As students walked into class, they were randomly assigned a pill that said either “smart pill” or “sugar pill” on it. The test scores were significantly higher for the group of students that took the smart pill. What is wrong with this experiment? - Students knew which treatment they were getting (the experiment was not blind)
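Stratified random sampling vs block design - Both split individuals into similar groups first; stratifying is for choosing a sample (observational study/survey), blocking is for assigning treatments (experiment)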
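A minimal Python sketch of a randomized block design, to contrast with picking a sample: here everyone takes part, and treatments are randomly assigned separately inside each block. The athletes, the blocks, and the treatment names are hypothetical.

```python
import random

def randomized_block_design(units, block_of, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    # Group the experimental units into blocks of similar units.
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    # Within each block, shuffle and deal out the treatments in turn.
    assignment = {}
    for name in sorted(blocks):
        members = list(blocks[name])
        rng.shuffle(members)
        for i, unit in enumerate(members):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical athletes stored as (name, event type); event type is the block.
athletes = [(f"s{i}", "sprinter") for i in range(1, 5)]
athletes += [(f"d{i}", "distance") for i in range(1, 5)]
assignment = randomized_block_design(
    athletes, lambda a: a[1], ["strength", "stretching"], seed=6
)
print(assignment)
```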