Population
In a statistical study, the entire group of individuals we want information about.
Census
Collects data from every individual in the population.
Sample
is the subset of individuals in the population from which we actually collect data.
Sample Survey
A study that uses an organized plan to choose a sample that represents some specific population.
Convenience Sample
is a sampling method that involves choosing individuals from the population who are easy to reach. It is generally biased because the individuals who are easy to reach usually do not represent the entire population.
Bias
is present in a statistical study when its design consistently underestimates or overestimates the value you want to know. Put another way, bias is the tendency for a sample to differ from the corresponding population in some systematic way: some part of the population is systematically favored over another part.
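To make "consistently overestimates" concrete, here is a minimal simulation sketch. The population (the values 1 to 100) and the rule that only values above 50 are "easy to reach" are assumptions made up for illustration. It compares the long-run average of sample means from a convenience sample with that of random samples.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: the values 1..100 (assumed for illustration).
population = list(range(1, 101))
true_mean = sum(population) / len(population)  # 50.5

# Assumption: only individuals with values above 50 are "easy to reach".
easy_to_reach = [x for x in population if x > 50]

def average_sample_mean(pool, n, repeats):
    """Average the sample mean over many samples of size n drawn from pool."""
    total = 0.0
    for _ in range(repeats):
        sample = rng.sample(pool, n)
        total += sum(sample) / n
    return total / repeats

srs_average = average_sample_mean(population, 10, 1000)
convenience_average = average_sample_mean(easy_to_reach, 10, 1000)

print("true mean:", true_mean)
print("average of random-sample means:", round(srs_average, 2))
print("average of convenience-sample means:", round(convenience_average, 2))
```

The random samples miss the true mean in both directions and average out close to it, while the convenience samples miss in the same direction every time. That one-sided miss is the bias.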
Voluntary Response Sample
Consists of people who choose themselves by responding to a general invitation. Such samples are generally biased because they overrepresent people with strong opinions.
Random Sampling
Involves using a chance process to determine which members of a population are included in the sample. Larger random samples typically give better information about the population than smaller samples.
Simple Random Sample (SRS)
of size n is chosen in such a way that every group of n individuals in the population has an equal chance to be selected as the sample. An SRS also gives each member of the population an equal chance to be included in the sample.
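A minimal sketch of drawing an SRS in software, assuming a hypothetical population of 30 individuals labeled 1 to 30. Python's `random.sample` draws without replacement, so every group of n labels is equally likely to be chosen.

```python
import random

# Hypothetical population: 30 individuals labeled 1..30 (assumed for illustration).
population = list(range(1, 31))

rng = random.Random(2024)  # fixed seed so the sketch is reproducible
n = 5

# Draw n distinct individuals; each group of n labels has the same chance.
srs = rng.sample(population, n)
print(sorted(srs))
```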
Random Number Table
is a table of random digits that can be used to choose individuals at random.
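One common way to use such a table is to give every individual a two-digit label, read the digits in pairs, and skip any pair that is out of range or already chosen. The sketch below follows that procedure. The line of digits, the population size of 30, and the helper name `sample_from_digits` are all assumptions for illustration.

```python
def sample_from_digits(digits, population_size, n):
    """Read two-digit labels left to right, skipping out-of-range and repeated labels."""
    clean = digits.replace(" ", "")
    chosen = []
    for i in range(0, len(clean) - 1, 2):
        label = int(clean[i:i + 2])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

# Illustrative line of random digits (assumed, not taken from the notes).
table_line = "19223 95034 05756 28713 96409 12531 42544 82853"

# Individuals are labeled 01..30; choose 4 of them.
print(sample_from_digits(table_line, population_size=30, n=4))  # [19, 22, 5, 13]
```

Reading the pairs 19, 22, 39, 50, 34, 05, 75, 62, 87, 13 in order, the labels 39, 50, 34, 75, 62, and 87 are skipped because no individual has them, which leaves 19, 22, 05, and 13.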
Stratified Random Sample
starts by classifying the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the sample.
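A minimal sketch of the two steps: choose a separate SRS in each stratum, then combine them. It assumes hypothetical strata (students grouped by grade level) and an arbitrary choice of 5 individuals per stratum.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical strata: students grouped by grade level (assumed for illustration).
strata = {
    "grade 9": [f"9-{i}" for i in range(1, 41)],
    "grade 10": [f"10-{i}" for i in range(1, 31)],
    "grade 11": [f"11-{i}" for i in range(1, 21)],
}
per_stratum = 5  # assumed sample size within each stratum

stratified_sample = []
for name, members in strata.items():
    # Separate SRS inside each stratum, then combine the results.
    stratified_sample.extend(rng.sample(members, per_stratum))

print(stratified_sample)
```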
Cluster Sample
starts by classifying the population into groups of individuals that are located near each other, called clusters. Then choose an SRS of the clusters. All individuals in the chosen clusters are included in the sample.
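For contrast with a stratified sample, here is a minimal sketch of a cluster sample. It assumes eight hypothetical homerooms of five students each. The SRS is taken of the clusters themselves, and every individual in a chosen cluster enters the sample.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is reproducible

# Hypothetical clusters: 8 homerooms with 5 students each (assumed for illustration).
clusters = {
    f"homeroom {k}": [f"H{k}-student{i}" for i in range(1, 6)]
    for k in range(1, 9)
}

# Step 1: choose an SRS of clusters, not of individuals.
chosen_clusters = rng.sample(sorted(clusters), 2)

# Step 2: include every individual in each chosen cluster.
cluster_sample = [student for c in chosen_clusters for student in clusters[c]]

print(chosen_clusters)
print(cluster_sample)
```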
Inference
Is the process of drawing conclusions about a population on the basis of sample data.
Undercoverage (selection bias)
Occurs when some members of the population cannot be chosen in a sample.
Nonresponse bias
Occurs when an individual chosen for the sample cannot be contacted or refuses to participate.
Response Bias
Occurs when inaccurate answers are given to survey questions. Common causes include the wording of the question, the appearance of the interviewer, questions about illegal behavior, unpopular beliefs, or past events, and errors in taking measurements.
Observational study
Observes individuals and measures variables of interest but does not attempt to influence the responses. It looks for patterns or associations and cannot establish cause and effect.
Experiment
Deliberately imposes some treatment on individuals to measure their responses. A treatment must actually be given. Used to figure out cause and effect.
Confounding
Occurs when two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other. In other words, something else is also contributing to the response variable.
Treatment
Is a specific condition applied to the individuals in an experiment. If an experiment has several explanatory variables (also called factors), a treatment is a combination of specific values of these variables.
Experimental Units
Are the smallest collection of individuals to which treatments are applied. When the units are human beings, they are often called subjects. They are who (or what) is being assigned to the treatments.
Random Assignment
In an experiment means that experimental units are assigned to treatments using a chance process. Doing so helps create roughly equivalent groups of experimental units by balancing the effects of other variables among the treatment groups.
Control
Is the attempt to keep other variables that might affect the response the same for all groups. A control group, by contrast, is a group that receives none of the treatment.
Replication
Use enough experimental units in each group so that any differences in the effects of the treatments can be distinguished from chance differences between the groups. With enough subjects, each treatment is in effect repeated many times. In short: "have enough experimental units."
Comparison
Is using an experimental design that compares two or more treatments. There must be at least two treatments.
Control Group
Is used to provide a baseline for comparing the effects of the other treatments.
Completely Randomized Design
Has the experimental units assigned to the treatments completely by chance. Such a design is often outlined with a flow chart.
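A minimal sketch of assigning units completely by chance, assuming 12 hypothetical experimental units and three made-up treatment labels. The units are shuffled and then cut into equal-sized groups.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical experimental units and treatments (assumed for illustration).
units = [f"unit{i}" for i in range(1, 13)]
treatments = ["treatment A", "treatment B", "control"]

# Shuffle a copy of the units, then cut the shuffled list into equal groups.
shuffled = units[:]
rng.shuffle(shuffled)
group_size = len(shuffled) // len(treatments)

groups = {
    t: shuffled[k * group_size:(k + 1) * group_size]
    for k, t in enumerate(treatments)
}

for t, members in groups.items():
    print(t, members)
```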
Statistically Significant
A result is statistically significant when the observed effect of an experiment is so large that it would rarely occur by chance. The results of the groups are very different.
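One way to check "would rarely occur by chance" is to re-do the random assignment in every possible way and see how often a difference at least as large as the observed one appears. The sketch below does this for made-up response values (5 units per group). The data and the one-sided comparison are assumptions for illustration, not part of the notes.

```python
from itertools import combinations

# Hypothetical responses (assumed for illustration).
treatment = [12, 15, 14, 16, 13]
control = [8, 9, 7, 10, 9]

def mean(values):
    return sum(values) / len(values)

observed = mean(treatment) - mean(control)

pooled = treatment + control
n_treatment = len(treatment)

count = 0  # re-assignments with a difference at least as large as observed
total = 0  # all possible re-assignments
for idx in combinations(range(len(pooled)), n_treatment):
    group_t = [pooled[i] for i in idx]
    group_c = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    # Small tolerance guards against floating-point rounding.
    if mean(group_t) - mean(group_c) >= observed - 1e-9:
        count += 1

proportion = count / total
print(count, total, round(proportion, 4))  # 1 252 0.004
```

Here only 1 of the 252 possible assignments produces a difference as large as the observed one, so chance alone would rarely give this result.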
Double-blind
Neither the subjects nor those who interact with them and measure the response variable know which treatment a subject received.
Single-blind
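An experiment in which either the subjects or those who interact with them and measure the response variable know which treatment a subject received, but not both. One party is blind and the other isn't.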
Placebo
Is a fake treatment given to the control group in an experiment.
Placebo Effect
Occurs when some patients taking the placebo improve even though the placebo is a fake treatment.
Randomized Block Design
A block is a group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments. In a randomized block design, the random assignment of experimental units to the treatments is carried out separately within each block. In short: split the units into blocks so that the units within each block are similar, then randomize within each block.
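A minimal sketch of randomizing separately within each block. The two blocks and the treatment labels "A" and "B" are hypothetical. Within each block, half of the units get one treatment and the rest get the other.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is reproducible

# Hypothetical blocks of similar units (assumed for illustration).
blocks = {
    "men": [f"M{i}" for i in range(1, 7)],
    "women": [f"W{i}" for i in range(1, 9)],
}

assignment = {}
for block_name, units in blocks.items():
    # Random assignment is carried out separately inside each block.
    shuffled = units[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    for unit in shuffled[:half]:
        assignment[unit] = "A"
    for unit in shuffled[half:]:
        assignment[unit] = "B"

for block_name, units in blocks.items():
    print(block_name, {u: assignment[u] for u in units})
```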
Matched Pairs Design
Is a common form of blocking for comparing just two treatments. In some matched pairs designs, each subject receives both treatments in a random order. In others, two very similar subjects are paired, and the two treatments are randomly assigned within each pair. Either way, the data are collected in pairs.
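A minimal sketch of the second kind of matched pairs design, where two similar subjects are paired and a chance process decides which member of each pair gets which treatment. The subject labels and treatment names are assumptions for illustration.

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is reproducible

# Hypothetical pairs of very similar subjects (assumed for illustration).
pairs = [("P1a", "P1b"), ("P2a", "P2b"), ("P3a", "P3b"), ("P4a", "P4b")]

assignment = []
for first, second in pairs:
    # A "coin flip" decides which member of the pair gets treatment 1.
    if rng.random() < 0.5:
        assignment.append({first: "treatment 1", second: "treatment 2"})
    else:
        assignment.append({first: "treatment 2", second: "treatment 1"})

for result in assignment:
    print(result)
```

For the other kind of design, where each subject receives both treatments, the same coin flip would decide the order of the two treatments for that subject instead.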
Principles of Experimental Design
Comparison
Random assignment
Control
Replication