random
describes an outcome when we know the possible values it can have but not which particular value will occur, free of human influence
simulation
models a real-world situation by using random-digit outcomes to mimic the uncertainty of a response variable of interest
trial
sequence of several components representing events that we are pretending will take place
simulation component
uses equally likely random digits to model simple random occurrences whose outcomes may not be equally likely
response variable
values record the results of each trial with respect to what we were interested in
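The four simulation terms above fit together as in the sketch below. It is only an illustration under assumed conditions: the scenario (a prize found in 20% of cereal boxes) and the names `run_trial` and `simulate` are hypothetical, not part of the definitions.

```python
import random

def run_trial(rng):
    """One trial: open boxes until a prize appears.

    Returns the number of boxes opened, which is the response variable.
    """
    boxes = 0
    while True:
        digit = rng.randint(0, 9)  # component: one equally likely random digit
        boxes += 1
        if digit <= 1:             # assumption: digits 0-1 stand for "prize" (20%)
            return boxes

def simulate(n_trials, seed=0):
    """Run many trials and average the response variable."""
    rng = random.Random(seed)
    results = [run_trial(rng) for _ in range(n_trials)]
    return sum(results) / n_trials

average_boxes = simulate(10_000)
print(average_boxes)
```

Each digit is equally likely, yet the modeled outcome (prize or no prize) is not, which is the point of a simulation component.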
population
the entire group of individuals or items that we want to study or draw conclusions about
sample
a subset of individuals selected from a larger population, used to estimate characteristics of the whole population
sample survey
a method of collecting data by asking questions of a subset of individuals from a population in order to draw conclusions about the entire group
bias
any systematic failure of a sampling method to represent its population (voluntary response, undercoverage, response bias, or nonresponse bias)
randomization
the best defense against bias, each individual is given a fair, random chance of selection
sample size
the number of individuals in a sample; it is the sample size, not the fraction of the population sampled, that determines how well the sample represents the population
census
a complete count of a population, typically conducted at regular intervals, to gather demographic and statistical information
population parameter
a value that summarizes a characteristic of the entire population, such as a mean or proportion; we hope to estimate it from sampled data
statistic/sample statistic
a numerical value calculated from a sample, used to estimate a population parameter
representative
what a sample is said to be if the statistics computed from it accurately reflect the corresponding population parameters
simple random sample
sample in which each set of n elements in the population has an equal chance of selection
sampling frame
a list of individuals from whom the sample is drawn, individuals not in it cannot be in any sample
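A simple random sample drawn from a sampling frame might look like the sketch below. The frame of 100 student labels and the helper name `simple_random_sample` are made up for illustration; `random.sample` from the Python standard library gives every set of n elements the same chance of selection.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct individuals from the frame, every n-subset equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: anyone not on this list can never be sampled.
frame = [f"student_{i}" for i in range(1, 101)]
srs = simple_random_sample(frame, 10, seed=1)
print(srs)
```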
sampling variability
the natural tendency of randomly drawn samples to differ from one another, sometimes called sampling error though it is not an error
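Sampling variability can be seen by drawing several random samples from one population and comparing a statistic across them. The sketch below assumes an invented population of 10,000 heights; the numbers are illustrative, not real data.

```python
import random
import statistics

rng = random.Random(42)

# Invented population; its mean plays the role of the population parameter.
population = [rng.gauss(170, 10) for _ in range(10_000)]
parameter = statistics.mean(population)

# Five samples of size 50; each sample mean is a statistic.
sample_means = [statistics.mean(rng.sample(population, 50)) for _ in range(5)]

print(round(parameter, 2))
print([round(m, 2) for m in sample_means])
```

The sample means differ from one another and from the parameter even though nothing went wrong in the sampling.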
stratified random sample
a sampling design in which the population is divided into several subpopulations (strata) and random samples are then drawn from each stratum; strata should be homogeneous to reduce variability
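One way to sketch a stratified random sample: group the individuals by stratum, then draw a simple random sample inside each group. The helper `stratified_sample`, the class-year strata, and the equal per-stratum size are assumptions for illustration only.

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, n_per_stratum, seed=None):
    """Draw a simple random sample of n_per_stratum from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in individuals:
        strata[stratum_of(person)].append(person)
    sample = []
    for name in sorted(strata):  # sorted only to make the order reproducible
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

# Hypothetical population: 40 freshmen and 60 seniors.
students = [("freshman", i) for i in range(40)] + [("senior", i) for i in range(60)]
chosen = stratified_sample(students, lambda s: s[0], 5, seed=3)
print(chosen)
```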
cluster sample
entire groups (clusters) chosen at random; if each cluster is heterogeneous and resembles the population, a random sample of clusters should be representative; usually chosen for convenience or cost
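In contrast to stratifying, a cluster sample picks whole groups at random and keeps everyone in them. A minimal sketch, assuming hypothetical homerooms as clusters and a made-up helper `cluster_sample`:

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters whole clusters at random and return all their members."""
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), n_clusters)
    return [member for name in picked for member in clusters[name]]

# Hypothetical clusters: 8 homerooms of 25 students each.
homerooms = {
    f"room_{r}": [f"room_{r}_student_{i}" for i in range(25)] for r in range(8)
}
sample = cluster_sample(homerooms, 2, seed=5)
print(len(sample))
```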
multistage sample
sampling schemes that combine several sampling methods
systematic sample
a sample drawn by selecting individuals at regular intervals from a sampling frame (for example, every 10th individual after a random start); when there is no relationship between the order of the sampling frame and the variables of interest, this can be representative
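A systematic sample can be sketched as a random starting point followed by a fixed step through the frame. The helper name `systematic_sample` and the numbered frame are assumptions for illustration.

```python
import random

def systematic_sample(frame, step, seed=None):
    """Take every step-th individual from the frame after a random start."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random start among the first `step` positions
    return frame[start::step]

# Hypothetical frame: individuals numbered 1 to 100.
frame = list(range(1, 101))
every_tenth = systematic_sample(frame, 10, seed=7)
print(every_tenth)
```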
pilot survey
a small trial run of a survey to check whether questions are clear
voluntary response bias
bias introduced to a sample when individuals can choose on their own whether to participate in the sample
convenience sample
consists of the individuals who are readily available, often fails to be representative because not every individual in the population is equally easy to sample
undercoverage
a sampling scheme that biases the sample by giving a part of the population less representation in the sample than it has in the population
nonresponse bias
bias introduced when a large fraction of those sampled fails to respond, those who do respond are likely to not represent the entire population, voluntary response bias is a form of this
response bias
anything in a survey design that influences responses, one typical source arises from the wording of questions, which may suggest a favored response
observational study
a study based on data in which no manipulation of factors has been employed
retrospective study
an observational study in which subjects are selected and then their previous conditions or behaviors are determined
prospective study
an observational study in which subjects are followed to observe future outcomes; not an experiment because no treatments are applied; typically focuses on estimating differences among groups that might appear as the groups are followed during the course of the study
experiment
manipulates factor levels to create treatments, randomly assigns subjects to these treatment levels, and then compares the responses of the subject groups across treatment levels
random assignment
to be valid, an experiment must assign experimental units to treatment groups at random
factor
a variable whose levels are manipulated by the experimenter, experiments attempt to discover the effects that differences in these levels may have on the responses of the experimental units
response variable
a variable whose values are compared across different treatments
experimental units
individuals on whom an experiment is performed, usually called subjects or participants when they are human
level
the specific values that the experimenter chooses for a factor
treatment
the process, intervention, or other controlled circumstance applied to randomly assigned experimental units; treatments are the different levels of a single factor or are made up of combinations of levels of two or more factors (level → factor → ___)
principles of experimental design
control, randomize, replicate, block
completely randomized design
all experimental units have an equal chance of receiving any treatment
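Random assignment in a completely randomized design can be sketched by shuffling the experimental units and dealing them out to treatments in turn. The helper `completely_randomized`, the subject labels, and the two treatment names are hypothetical.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle all units, then deal them out to treatments in rotation."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment: 12 subjects, two treatments.
subjects = [f"subject_{i}" for i in range(12)]
groups = completely_randomized(subjects, ["drug", "placebo"], seed=11)
print(groups)
```

Dealing in rotation keeps the group sizes equal, which is a design choice of this sketch rather than a requirement of the definition.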
statistically significant
when an observed difference is too large for us to believe that it is likely to have occurred by chance
control group
the experimental units assigned to a baseline treatment level, typically either the default treatment, which is well understood, or a null, placebo treatment, provides basis for comparison
blinding
keeping any individual associated with an experiment unaware of how subjects have been allocated to treatment groups; such an individual is said to be blinded
single blind
when every individual in one of the two classes (those who could influence results or those who evaluate results) is blinded
double blind
when everyone in both classes (influencing results and evaluating results) is blinded
placebo
a treatment known to have no effect, administered to one group so that all groups experience the same conditions
placebo effect
the tendency of many human subjects (often 20% or more of experiment subjects) to show a response even when administered a placebo
blocking
when subgroups of the experimental units differ in ways that may affect their responses to treatments, gather similar units into blocks; this isolates the variability attributable to the differences between the blocks so that we can see the differences caused by the treatments more clearly
randomized block design
the subjects are randomly assigned to treatments only within blocks
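A randomized block design repeats the random assignment separately inside each block. The sketch below assumes hypothetical blocks and a made-up helper `randomized_block`; it is one possible way to carry out the idea, not a prescribed procedure.

```python
import random

def randomized_block(units, block_of, treatments, seed=None):
    """Randomly assign treatments to units separately within each block."""
    rng = random.Random(seed)
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for name in sorted(blocks):      # sorted only for reproducible order
        members = blocks[name][:]
        rng.shuffle(members)         # randomization happens inside the block
        for i, unit in enumerate(members):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical units: 6 in block "A" and 4 in block "B".
units = [("A", i) for i in range(6)] + [("B", i) for i in range(4)]
assignment = randomized_block(units, lambda u: u[0], ["drug", "placebo"], seed=2)
print(assignment)
```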
matching
in a retrospective or prospective study, subjects who are similar in ways not under study may be paired and compared with each other on the variables of interest; this reduces unwanted variation in much the same way as blocking
confounding
when the levels of one factor are associated with the levels of another factor in such a way that their effects cannot be separated