Unit 4 Stat
Population: The entire group of individuals we want information about.
Census: A collection of data from every individual in the population.
Sample: A subset of the population from which we want to collect data.
Sample on the Calculator - [MATH]→(PRB)→5:randInt. randInt(lower, upper, n(optional)) returns n random integers from lower to upper, inclusive. Repeats are possible.
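A rough Python analogue of the calculator command, as a sketch only. The function name rand_int and its rng parameter are made up for this example. Like the calculator command, it can return the same integer more than once, so repeats still have to be ignored when selecting an SRS.

```python
import random

def rand_int(lower, upper, n=1, rng=random):
    """Return a list of n random integers from lower to upper, inclusive."""
    # Draws are independent, so the same integer can appear more than once.
    return [rng.randint(lower, upper) for _ in range(n)]

# Example: 5 random labels for a population numbered 1 to 50.
print(rand_int(1, 50, 5))
```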
Simple Random Sample (SRS): A sample of size n chosen so that every possible group of n individuals in the population has an equal chance of being selected.
Paragraph for Writing SRS - the paragraph must include the following
the range of numbers being used
how many digits will be read each time
what each number will represent
what line you are starting with
what direction you will read if using a digits table
which numbers you will ignore (out-of-range numbers, repeats)
how you will know when you are finished
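The checklist above can be traced in code. This is a minimal sketch, assuming a population of 30 individuals labeled 01 to 30 and an illustrative string of digits standing in for one line of a digits table; the function name srs_from_digits is made up for the example.

```python
def srs_from_digits(digits, population_size, n, width=2):
    """Read digits left to right in groups of `width` until n labels are chosen."""
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # Ignore labels outside 1..population_size and ignore repeats.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        # Finished as soon as n distinct labels have been selected.
        if len(chosen) == n:
            break
    return chosen

line = "1922395034057562871396409125314254482853"  # illustrative digits only
print(srs_from_digits(line, population_size=30, n=5))  # [19, 22, 5, 13, 25]
```

Reading two digits at a time gives 19, 22, 39, 50, 34, 05, and so on; 39, 50, and 34 are skipped because they are out of range.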
Stratified Random Sample: Divide the population into groups of similar individuals, called strata. Always define the strata first. Within each stratum, number each individual, then take an SRS within each stratum.
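A minimal sketch of a stratified random sample, assuming a made-up population of students with grade level as the stratum. The function name stratified_sample and the choice of an equal sample size per stratum are assumptions for illustration.

```python
import random

def stratified_sample(population, stratum_of, n_per_stratum, rng=random):
    """Group individuals into strata, then take an SRS from each stratum."""
    strata = {}
    for individual in population:
        strata.setdefault(stratum_of(individual), []).append(individual)
    # rng.sample draws without replacement, which is an SRS within the stratum.
    return {name: rng.sample(members, n_per_stratum) for name, members in strata.items()}

# Hypothetical population: 40 students, 10 in each of grades 9 to 12.
students = [(f"student{i}", 9 + i % 4) for i in range(40)]
sample = stratified_sample(students, stratum_of=lambda s: s[1], n_per_stratum=3)
for grade in sorted(sample):
    print(grade, sample[grade])
```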
Clusters: Subsets of a population that each roughly resemble the population as a whole and whose members are physically close to each other.
Advantages of using Clusters - an SRS or a stratified random sample can be impractical when the population is large and spread out over a wide area; sampling whole clusters saves time and travel.
Collecting a Cluster - group individuals that are near each other into clusters. Always define clusters first. Number each cluster. Perform an SRS to select the clusters, then include every individual in the selected clusters. Explain how the clusters were selected and why this is better than a simple random sample.
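A minimal sketch of cluster sampling, assuming made-up homerooms as the clusters. The function name cluster_sample is hypothetical. Whole clusters are chosen by SRS, and everyone in a chosen cluster is kept.

```python
import random

def cluster_sample(clusters, n_clusters, rng=random):
    """Take an SRS of whole clusters and keep every individual in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: 6 homerooms of 4 students each.
homerooms = {f"room{r}": [f"r{r}s{s}" for s in range(1, 5)] for r in range(1, 7)}
print(cluster_sample(homerooms, n_clusters=2))
```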
Systematic Sample: Choose a random starting point, then select every kth member of the population.
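A minimal sketch of a systematic sample, assuming the population is an ordered list and the random start falls among the first k members. The function name systematic_sample is made up for the example.

```python
import random

def systematic_sample(population, k, rng=random):
    """Pick a random start among the first k members, then take every kth member."""
    start = rng.randrange(k)  # index 0 to k-1
    return population[start::k]

# Hypothetical roster of 100 individuals numbered 1 to 100.
roster = list(range(1, 101))
print(systematic_sample(roster, k=10))
```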
Multi-stage Sampling: Select the sample in stages, dividing the population into smaller and smaller groups at each stage. Often involves a combination of SRS, stratified random sampling, and cluster sampling.
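One possible two-stage version, as a sketch: an SRS of clusters first, then an SRS of individuals inside each chosen cluster. The schools, student labels, and the function name two_stage_sample are all made up; real multi-stage designs may combine the methods differently.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, rng=random):
    """Stage 1: SRS of clusters. Stage 2: SRS of individuals within each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical population: 8 schools with 20 students each.
schools = {f"school{i}": [f"s{i}_student{j}" for j in range(1, 21)] for i in range(1, 9)}
print(two_stage_sample(schools, n_clusters=3, n_per_cluster=5))
```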
Sample Survey: A study that uses an organized plan to choose a sample that represents some specific population. Randomness helps avoid bias. Laws of probability allow for trustworthy inferences.
Choosing a Sample for a Sample Survey -
Define a population you want to describe.
Say exactly what you are measuring.
Decide how to choose a sample from the population.
Bias: Inference is drawing conclusions about a population on the basis of a sample. We need to ensure our samples are representative of the populations we intend to study. A study has bias if it would consistently overestimate or underestimate the value we want to know.
Biased Sampling Methods:
Convenience Sampling - choosing the individuals who are easiest to reach.
Voluntary Response Sample - people decide for themselves whether to participate in response to an open invitation. People with strong opinions are more likely to participate.
Selection Bias: Occurs when we intentionally or unintentionally favor one segment of a population over another. This also occurs when underrepresenting a segment of a population.
Undercoverage Bias - occurs when certain groups are left out of a sample.
Voluntary Response Bias - occurs when people choose themselves to be in a sample, resulting in too much emphasis on those with strong opinions.
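A small simulation of voluntary response bias, as a sketch only. The population size, the 20% share with a strong opinion, and the response rates (50% versus 5%) are invented numbers chosen to make the effect visible.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up population: 10,000 people, 20% of whom hold a strong opinion.
population = [True] * 2000 + [False] * 8000  # True = strong opinion

# Assumed response rates: strong opinions reply 50% of the time, others 5%.
def responds(strong):
    return rng.random() < (0.50 if strong else 0.05)

volunteers = [p for p in population if responds(p)]
srs = rng.sample(population, 400)

true_share = sum(population) / len(population)
print("true share:        ", true_share)
print("voluntary response:", sum(volunteers) / len(volunteers))
print("SRS of 400:        ", sum(srs) / len(srs))
```

Under these assumptions the voluntary response estimate lands far above the true share, while the SRS estimate stays close to it.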
Response Bias: Occurs when responses differ in some systematic way from the truth. Common sources:
wording of the question
ordering of questions
appearance and behavior of the interviewer
respondents' answers (for example, an untruthful or misremembered answer)
Nonresponse Bias: Occurs when people refuse to respond or are too difficult to reach.
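Sampling Error: The difference between a sample result and the true population value that arises because only part of the population is observed. A sample will rarely give exactly the same result as the population, so some sampling error is expected even with a well-chosen random sample.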
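A minimal sketch of sampling error, assuming a made-up population of 500 test scores. The function name sampling_error is hypothetical. The error is computed as the sample mean minus the population mean for one SRS.

```python
import random
from statistics import mean

def sampling_error(population, n, rng=random):
    """Sample mean minus population mean for one SRS of size n."""
    return mean(rng.sample(population, n)) - mean(population)

# Made-up population: test scores for 500 students.
rng = random.Random(4)
scores = [rng.randint(50, 100) for _ in range(500)]

for n in (10, 50, 250, 500):
    print(n, round(sampling_error(scores, n, rng), 2))
```

The last line uses n = 500, which is a census, so its error is zero. For smaller n the error varies from sample to sample and tends to be larger when n is small.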
Confounding Variables: Observational studies cannot determine cause-and-effect relationships because other variables are in play. Two variables are said to be confounded when their effects on a response variable cannot be distinguished from each other.
Observational Study: Observes individuals and measures variables of interest. Doesn’t attempt to influence responses. Doesn’t provide evidence of causation.
Experiment: Deliberately imposes some treatment on individuals in order to observe responses. A well-designed experiment can provide evidence of causation.
Experiment Vocab:
experimental units - individuals on which an experiment is done
treatment - specific experimental condition applied to each experimental unit
explanatory variables - variables that help explain or predict the observed outcomes of an experiment
response variables - variables that measure the outcome(s) of a study
Advantages of Experiments -
allows us to evaluate cause-and-effect relationships
can control the environment
can study the combined effects of several factors simultaneously
Disadvantages of Experiments -
time consuming
expensive
difficult to get volunteers
difficult to control for variables
sometimes unethical
Principles of Experiments -
Comparison - compares two or more treatments
Random Assignment - creates roughly equivalent groups by balancing the effects of other variables
Control - keep other variables the same across the board, which helps prevent confounding
Replication - use enough experimental units that chance differences between groups tend to average out; 30+ is a common rule of thumb
Randomization: Attempts to create equivalent experimental groups, ensuring that the experimenter doesn't favor one group over another.
Completely Randomized Design - treatments are assigned to all the experimental units completely by chance. Some experiments include a control group.
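A minimal sketch of a completely randomized design, assuming 20 made-up plants and two treatments, one of which is a control. The function name completely_randomized_design is hypothetical. Units are shuffled and then dealt out so the groups are as equal in size as possible.

```python
import random

def completely_randomized_design(units, treatments, rng=random):
    """Shuffle all units, then deal them to the treatments in turn."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experimental units: 20 plants.
plants = [f"plant{i}" for i in range(1, 21)]
groups = completely_randomized_design(plants, ["fertilizer", "control"])
for treatment, members in groups.items():
    print(treatment, members)
```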
Block Design - the random assignment of units to treatments is carried out separately within each block. Ex. splitting the subjects into a male block and a female block, then randomly assigning the treatments within the male block and, separately, within the female block.
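A minimal sketch of a block design, following the male/female example with made-up subjects. The function name randomized_block_design and the treatment names are assumptions. Random assignment happens inside each block, never across blocks.

```python
import random

def randomized_block_design(units, block_of, treatments, rng=random):
    """Randomly assign treatments separately within each block."""
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)
        # Deal the shuffled block members to the treatments in turn.
        for i, unit in enumerate(shuffled):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects: 6 males and 6 females.
subjects = [(f"M{i}", "male") for i in range(1, 7)] + [(f"F{i}", "female") for i in range(1, 7)]
plan = randomized_block_design(subjects, block_of=lambda s: s[1], treatments=["drug", "placebo"])
for subject, treatment in plan.items():
    print(subject, treatment)
```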
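Statistically Significant: An observed effect is statistically significant when it is too large to be plausibly explained by chance alone. In a well-designed randomized experiment, this gives evidence that the treatment caused the change in the response variable.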
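One way to see what "unlikely to happen by chance" means is a simulation that reshuffles the responses many times, as if the treatment had no effect. This is a sketch of one such approach (a permutation test), not necessarily the method used in this unit; the data and the function name permutation_p_value are made up.

```python
import random

def permutation_p_value(group_a, group_b, reps=5000, rng=random):
    """Share of random reshuffles with a mean difference at least as large as observed."""
    def mean_diff(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = mean_diff(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # re-deal responses as if treatment had no effect
        if mean_diff(pooled[:n_a], pooled[n_a:]) >= observed:
            extreme += 1
    return extreme / reps

# Made-up responses for two treatment groups.
treatment = [24, 27, 31, 29, 30, 26]
control = [20, 22, 19, 25, 21, 23]
print(permutation_p_value(treatment, control))
```

A small result means a difference this large rarely shows up from chance alone.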
Matched Pairs Design: A specific kind of randomized block design in which each block has either 1 or 2 experimental units. Used when the experiment has only two treatment conditions and the subjects can be grouped into pairs. Tends to be better than a completely randomized design because matching makes the units being compared more alike, which better controls confounding variables.
2 Subjects in Each Block - each block has a matched pair. The two subjects in the matched pair get different treatments.
1 Subject in Each Block - each subject receives both treatments at different times, ideally in a random order. The researcher then compares the two results. Each person serves as their own “control” group, which reduces confounding variables.
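A minimal sketch of random assignment for the two-subject version of a matched pairs design. The names, the treatment labels A and B, and the function name assign_matched_pairs are made up. Within each pair, chance decides which member gets which treatment.

```python
import random

def assign_matched_pairs(pairs, treatments=("A", "B"), rng=random):
    """For each matched pair, randomly decide which member gets which treatment."""
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # acts like a coin flip for the pair
        assignment[first], assignment[second] = order
    return assignment

# Hypothetical matched pairs of subjects.
pairs = [("Ana", "Ben"), ("Cal", "Dee"), ("Eli", "Fay")]
print(assign_matched_pairs(pairs))
```

For the one-subject version, the same shuffle could instead decide the order in which a single subject receives the two treatments.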