Cluster Sampling / Clusters
A cluster is a group of individuals in the population that are located near each other.
Cluster sampling selects a sample by randomly choosing clusters and including each member of the selected clusters in the sample.
For example, instead of picking 200 random high school students out of a hat, a teacher randomly selects 10 homerooms with 20 students each and only surveys them.
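The homeroom example can be sketched in a few lines of Python. This is only an illustration: the school with 40 homerooms, the label format, and the function name `cluster_sample` are made up for the sketch.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical school: 40 homerooms with 20 students each
homerooms = {
    f"HR{h:02d}": [f"HR{h:02d}-S{s:02d}" for s in range(1, 21)]
    for h in range(1, 41)
}

picked = cluster_sample(homerooms, 10, seed=1)
surveyed = [student for members in picked.values() for student in members]
print(len(picked), len(surveyed))  # 10 200
```

The randomness is applied to the homerooms, not to individual students: once a homeroom is chosen, everyone in it is surveyed.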
Systematic Random Sampling
Systematic Random Sampling selects a sample from an ordered arrangement of the population by randomly selecting one of the first k individuals and choosing every kth individual thereafter.
For example, to survey 80 students at an assembly of 800, use k = 800/80 = 10: randomly choose one of the first 10 students to enter the auditorium, then interview every 10th student after that.
(ex: with k = 2, randomly start at person #1 or #2 and then take every second person; with k = 3, a random start at person #3 gives persons #3, #6, #9, and so on)
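A minimal Python sketch of the assembly example, assuming students are numbered 1 to 800 in arrival order (the numbering and the function name `systematic_sample` are assumptions made for illustration):

```python
import random

def systematic_sample(ordered_population, k, seed=None):
    """Pick a random start among the first k individuals, then every kth after."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0..k-1, i.e. one of the first k individuals
    return ordered_population[start::k]

arrivals = list(range(1, 801))  # students numbered in order of arrival
sample = systematic_sample(arrivals, 10, seed=3)
print(len(sample))  # 80
```

Only the starting point is random; every later pick is fixed by the start and by k.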
Undercoverage
Undercoverage occurs when some members of the population are less likely to be chosen, or cannot be chosen at all, in a sample.
For example, a survey of registered voters cannot accurately represent the entire American population, because many Americans are not registered to vote; the survey would exclude people younger than 18 and people applying for citizenship who are not yet legally permitted to vote.
So, minors and non-citizens would be undercovered.
Response Bias
Response bias occurs when there is a systematic pattern of inaccurate answers to a survey question.
For example, an anonymous survey about hygiene habits may produce different results from the same group of people than a spontaneous, face-to-face, or public interview would.
Response bias usually cannot be measured from the data alone; it has to be reasoned out by considering how the question wording or the survey setting could push answers in one direction.
Nonresponse
Nonresponse occurs when an individual chosen for a sample can’t be contacted or refuses to participate.
For example, a mayor calls every landline phone in the district he represents, but some of the landlines are out of service, and some people who do pick up assume it is a scam call and hang up without answering the survey.
Prospective vs. Retrospective Observational Study
Experiment vs. Observational Study
Observational study: observes individuals and measures variables of interest, but does not attempt to influence the responses.
Experiment: deliberately imposes treatments/conditions on individuals to measure their responses (influences results artificially/intentionally).
Population
The population is the entire group of individuals or instances about whom we hope to learn.
Census
A census collects data from every individual in the population.
Sample
A sample is a subset of individuals in the population from which we collect data.
Sample Survey
A sample survey selects a smaller sample of individuals from a larger population about which we need information.
The goal of a sample survey is to draw conclusions about the population based on the data gathered from the sample.
Convenience Sampling
Convenience sampling selects individuals from the population who are easy to reach. It often produces unrepresentative data.
For example, surveying the first 30 people who step into a room, instead of reaching people from different areas or at different times, produces a sample with little variation among the subjects.
Bias
The design of a statistical study shows bias if it is very likely to underestimate, or very likely to overestimate, the value you want to know.
Voluntary Response Sampling
Voluntary Response Sampling allows people to choose to be in the sample by responding to a general invitation.
Random Sampling
Random Sampling involves using a chance process to determine which members of a population are included in the sample.
This can be done by writing the name of every student in a class on a slip of paper and drawing slips completely at random (with no repeats) until the sample size is reached.
Simple Random Sample (SRS) & Larger Sample SRS w/ RNG, Table D
An SRS of size n is chosen in such a way that every group of n individuals in the population has an equal chance to be selected as the sample (in other words, no rigging is involved).
To choose an SRS with technology (for larger samples/in place of paper slips):
1) Give a unique label to every individual in the population.
2) Generate random integers via RNG (Math → Prob → randInt(1, N), where N is the population size) & keep unique integers until your sample size is reached.
3) Choose the individuals that correspond to said integers.
4) Conduct study/experiment/etc.
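The technology steps above can be mirrored in Python. This is a sketch, not the calculator procedure itself: the roster of 30 students and the function name `simple_random_sample` are hypothetical, and `random.randint` stands in for the calculator's random integer command.

```python
import random

def simple_random_sample(individuals, n, seed=None):
    """Label individuals 1..N, draw unique random labels, return those individuals."""
    rng = random.Random(seed)
    population_size = len(individuals)  # step 1: labels are 1..N
    chosen_labels = []
    while len(chosen_labels) < n:       # step 2: draw until n unique labels
        label = rng.randint(1, population_size)
        if label not in chosen_labels:  # ignore repeats
            chosen_labels.append(label)
    return [individuals[label - 1] for label in chosen_labels]  # step 3

roster = [f"Student {i}" for i in range(1, 31)]  # hypothetical class of 30
sample = simple_random_sample(roster, 5, seed=7)
print(sample)
```

In everyday Python code, `random.sample(roster, 5)` does the same job in one call; the loop is written out here only to match the steps.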
To choose an SRS using Table D (random digit lines):
1) Give a unique label to every individual in the population.
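2) Read the given line of digits left-to-right in consecutive groups as long as the labels (e.g. two digits for labels 01 to 99), skipping any group that is outside the label range or repeats an earlier pick, until the sample size is reached.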
3) Choose the individuals that correspond to said integers.
4) Conduct study/experiment etc.
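The table-reading steps can be sketched as follows. The digit line below is invented for illustration and is not taken from any printed table; the function name `sample_from_digit_line` is also hypothetical.

```python
def sample_from_digit_line(digit_line, population_size, n):
    """Read fixed-width groups of digits, skipping out-of-range and repeated labels."""
    digits = "".join(ch for ch in digit_line if ch.isdigit())
    width = len(str(population_size))  # e.g. 2 digits for labels 01..30
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

# Made-up digit line; population labeled 01..30, sample size 4
line = "17409 41703 62250 8"
print(sample_from_digit_line(line, 30, 4))  # [17, 3, 25, 8]
```

Reading in pairs gives 17, 40, 94, 17, 03, 62, 25, 08: 40, 94, and 62 are out of range and the second 17 is a repeat, so the sample is labels 17, 03, 25, and 08.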
Strata & Stratified Random Sampling
Strata are groups of individuals in a population who share characteristics thought to be associated with the variables being measured in a study.
Stratified Random Sampling selects a sample by choosing a Simple Random Sample (even chance) from each stratum and combining every subject chosen via SRS into one overall sample.
For example, a study on sleep habits of high schoolers may split students into separate strata by grade level, such as freshmen and juniors, instead of sampling from the whole school at once (especially if the school is large).
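A minimal sketch of the grade-level example in Python. The class sizes, the choice of 25 students per grade, and the function name `stratified_sample` are assumptions made for illustration.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take an SRS of the same size from every stratum and combine the results."""
    rng = random.Random(seed)
    combined = []
    for name in sorted(strata):
        combined.extend(rng.sample(strata[name], n_per_stratum))
    return combined

# Hypothetical school with four grade-level strata of different sizes
grades = {
    "freshmen": [f"FR-{i}" for i in range(300)],
    "sophomores": [f"SO-{i}" for i in range(280)],
    "juniors": [f"JR-{i}" for i in range(260)],
    "seniors": [f"SR-{i}" for i in range(240)],
}

sample = stratified_sample(grades, 25, seed=2)
print(len(sample))  # 100
```

Unlike cluster sampling, every stratum contributes to the sample, and only some members of each stratum are chosen.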
Response Variable vs. Explanatory Variable
An explanatory variable (x) may help explain/predict changes in a response variable.
A response variable (y) measures an outcome of a study. It is dependent on the explanatory variable.
For example, sunlight exposure would be an explanatory variable (x) for the growth (response variable, y) of the same plants in different environments. Less growth could be explained by less sunlight exposure.
Confounding + Confounding Variables
Confounding occurs when two variables are associated in such a way that their effects on a response variable cannot be distinguished from one another.
Placebo
A placebo is a treatment that has no active ingredient, but is otherwise like other treatments.
For example, in a medical study, 500 patients may be given an actual pill/medication, while the other 500 may be given a sugar pill (placebo: fake/no actual influence). Placebos are often given to control groups.
Treatment
A treatment is a specific condition applied to the individuals in an experiment.
If an experiment has several explanatory variables, a treatment is a combination of specific values of these variables.
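To make "a combination of specific values" concrete, here is a small sketch that lists every treatment for two explanatory variables. The variable names and their values are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical explanatory variables and the values each can take
factors = {
    "medicine": ["Medicine A", "Medicine B", "Medicine C"],
    "dose": ["low", "high"],
}

# Each treatment is one combination: one value from every variable
treatments = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for treatment in treatments:
    print(treatment)
print(len(treatments))  # 6
```

Three values of one variable crossed with two values of the other give 3 × 2 = 6 distinct treatments.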
Experimental Unit
An experimental unit is an experimental subject/object to which a treatment is (randomly) assigned.
When the experimental units happen to be humans, they are also often called subjects.
Factors & Levels
In an experiment, a factor is an explanatory variable (x) that is manipulated and may cause a change in the response variable.
The different values of a factor are called levels.
For example: in a study of whether a medical condition improves or declines, the type of medication is the factor (explanatory variable), and the specific medications provided (ex: Medicine A, Medicine B, Medicine C) are its levels.
Replication
Replication: “use enough subjects”
In an experiment, replication means giving each treatment to enough experimental units so that a difference in the effects of the treatments can be distinguished from chance variation due to the random assignment. In short, replication is what helps statisticians determine whether the results of their experiment were a chance fluke, or if they are actually characteristic of a larger population.
This can be reaffirmed by other statisticians conducting the same experiments in similar environments to “double-check” results.
Experimental Design
1) Use a design that compares two or more treatments.
2) Use chance to assign experimental units to treatments, or vice versa, via random assignment. This will help to reduce bias and create roughly equivalent groups of units by balancing effects of other variables.
3) Keep a control group.
4) Replicate until a consistent pattern can be confirmed in the results.
Completely Randomized Designs
In a completely randomized design, experimental units are assigned to treatments completely at random.
The design itself goes:
1) Determine group size.
2) Random Assignment.
3) Branch out into even groups & indicate the size of each.
4) Give each group a different treatment.
5) Compare the changes.
For example:
20 volunteer students → Random assignment →
→ Group 1 (10 students) → Treatment 1 (caffeine) → Compare
→ Group 2 (10 students) → Treatment 2 (no caffeine/control) → Compare
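The random assignment step of the caffeine example can be sketched in Python. The student labels and the function name `completely_randomized` are hypothetical, and the sketch assumes the number of units divides evenly among the treatments.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle all units, then split them into equal-sized treatment groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(treatments)  # assumes an even split
    return {
        treatment: shuffled[i * size:(i + 1) * size]
        for i, treatment in enumerate(treatments)
    }

students = [f"Student {i}" for i in range(1, 21)]  # 20 volunteers
groups = completely_randomized(students, ["caffeine", "no caffeine"], seed=5)
print({t: len(members) for t, members in groups.items()})
# {'caffeine': 10, 'no caffeine': 10}
```

Shuffling the whole list once and cutting it into pieces gives every student the same chance of landing in either group.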
Blocks & Randomized Block Design
A block is a group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments.
In a randomized block design, the random assignment of experimental units to treatments is carried out separately within each block. However, block assignment is not random.
For example, subjects are first separated into blocks and then, within each block, randomly assigned to the 2 treatments. The blocks run as separate but parallel, identically structured experiments; results are compared within each block first and then combined at the very end for an overall comparison.
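A sketch of the idea in Python: the blocks are formed from a known characteristic, and only the assignment inside each block is random. The blocking variable (regular coffee drinkers vs. non-drinkers), the labels, and the function name `randomized_block` are hypothetical.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign units to treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block_name in sorted(blocks):
        units = list(blocks[block_name])
        rng.shuffle(units)  # randomness happens inside the block only
        size = len(units) // len(treatments)  # assumes an even split
        assignment[block_name] = {
            treatment: units[i * size:(i + 1) * size]
            for i, treatment in enumerate(treatments)
        }
    return assignment

# Hypothetical blocks: membership is known in advance, not random
blocks = {
    "coffee drinkers": [f"D{i}" for i in range(1, 13)],
    "non-drinkers": [f"N{i}" for i in range(1, 9)],
}
design = randomized_block(blocks, ["caffeine", "no caffeine"], seed=4)
for name, groups in design.items():
    print(name, {t: len(members) for t, members in groups.items()})
```

Each block ends up with both treatments represented, so the blocking variable cannot be mixed up with the treatment effect.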
Sampling Variability
Sampling variability refers to the fact that different random samples of the same size from the same population produce different estimates.
Estimates from larger samples are more precise.
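A small simulation can illustrate both statements. Everything here is made up for the sketch: the population of 5000 hypothetical "hours of sleep" values, the sample sizes, and the function name `spread_of_sample_means`.

```python
import random
import statistics

def spread_of_sample_means(population, n, repeats, seed=None):
    """Draw many random samples of size n and measure how much their means vary."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

# Hypothetical population: hours of sleep for 5000 students
pop_rng = random.Random(0)
population = [pop_rng.randint(4, 10) for _ in range(5000)]

small = spread_of_sample_means(population, n=10, repeats=500, seed=1)
large = spread_of_sample_means(population, n=100, repeats=500, seed=2)
print(f"spread of means, n=10:  {small:.3f}")
print(f"spread of means, n=100: {large:.3f}")
```

The sample means differ from sample to sample in both cases, but the spread for samples of 100 should come out clearly smaller than the spread for samples of 10.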