Population
everyone or everything you would need to study or survey to answer the question you are asking
purpose
answer questions about populations
sample
subset of your population
descriptive statistics
collecting, summarizing and presenting sample data in a graph/chart
inferential statistics
using a sample to “go beyond” and tell about a population
categorical data
data in the form of categories
numerical data
data in the form of numbers
continuous numerical data
can be measured at whatever decimal level you want (time, lengths, weights)
discrete numerical data
data that isn’t continuous; it can only take separate, countable values (counts, and usually money)
what chart is used for categorical data?
bar chart
what chart is used for numerical data?
dot plot
where do the actual data values go on an axis of a bar chart?
along the bottom (the horizontal axis)
To describe a dot plot (or any plot of numerical data) what do you need to provide?
center, shape, spread, context
what shapes are there?
skewed left, skewed right, bimodal, mound-shaped, roughly symmetric
What do you do for center?
Estimate the center by eye (later, use the mean or median!) and state it in context.
Spread?
Give the range (high minus low)
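As a quick illustration of "high minus low," here is a minimal sketch. The quiz scores and the helper name `data_range` are made up for this example.

```python
# Hypothetical data: quiz scores for one class (invented for illustration).
scores = [72, 85, 90, 66, 85, 78, 94]

def data_range(values):
    """Return the spread as the range: highest value minus lowest value."""
    return max(values) - min(values)

spread = data_range(scores)
print(f"Range: {spread} points")  # include the units so the spread has context
```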
What should you do with two data sets?
COMPARE THEM (centers, shapes and spreads)
How to compare?
lower than, more spread out than, etc.
what are the two types of studies?
observational studies and experiments
what do you do in an observational study?
go out and collect data
what is the goal in an experiment?
to determine if there’s a cause and effect relationship of an explanatory variable on a response variable
How do you do an experiment?
By RANDOMLY ASSIGNING the experimental units to different treatment groups, giving each group a different value of the explanatory variable, and then checking whether the response variable differs among the treatments.
Explanatory variable
the variable that creates the difference between treatments. Its possible values are the treatments (it explains the variation between the groups)
Response variable
the variable you measure at the end to see if there is a difference. Its possible values are the results you record
Experimental units
the things or people that are randomly assigned to treatment groups
Random selection (random sampling)
when you select subjects from the population “at random.” Everyone has an equal chance of being picked
How to create a random sample of 50 people from a list of 2000?
Use a hat! Put all 2000 names in a hat, shake the hat, and pull out 50. They are your subjects
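The hat can be imitated in software. This is a minimal sketch, assuming the 2000 names are already in a list; the placeholder names and the function `draw_from_hat` are invented for illustration. `random.sample` draws without replacement, so nobody is picked twice.

```python
import random

# Hypothetical roster: 2000 placeholder names standing in for the real list.
population = [f"person_{i}" for i in range(1, 2001)]

def draw_from_hat(names, n, seed=None):
    """Pick n names at random, without replacement, like pulling slips from a hat."""
    rng = random.Random(seed)  # a seed is optional; it only makes the draw repeatable
    return rng.sample(names, n)

subjects = draw_from_hat(population, 50, seed=1)
print(len(subjects), "subjects selected")
```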
Random assignment (randomization)
Done in experiments only. It’s when we randomly put our already-selected subjects into our different treatments.
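One way to picture random assignment is to shuffle the already-selected subjects and deal them out to the treatments in turn. The sketch below assumes 20 subjects and two treatments; the subject labels, treatment names, and the function `randomly_assign` are all hypothetical.

```python
import random

def randomly_assign(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments one at a time."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the original list is left untouched
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical subjects who have already been selected for the experiment.
subjects = [f"subject_{i}" for i in range(1, 21)]
groups = randomly_assign(subjects, ["new drug", "placebo"], seed=7)
for treatment, members in groups.items():
    print(treatment, len(members))
```

Dealing after a shuffle keeps the group sizes as equal as possible while leaving chance to decide who lands in which group.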
What is the purpose of random selection? (Magic phrase)
To create a sample that is representative of the population.
What is the purpose of random assignment? (Magic phrase)
To create treatment groups that are roughly equal on extraneous variables, so that the only difference between them is the explanatory variable (or treatment).
Simple random sampling (SRS)
In an SRS, you start with a list of every member of your population and choose n of them at random, so that every possible group of n has the same chance of being the sample.
Stratified random sampling
If the population is naturally split into subgroups and you want to make sure every subgroup is represented, sample from each subgroup separately. These subgroups are called strata.
Ideally you would sample from each proportionally, but it’s not required.
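Here is a minimal sketch of stratified sampling with proportional allocation, which is one common choice rather than a requirement. The strata (four grade levels), their sizes, and the function `stratified_sample` are assumptions made up for this example.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Take a separate random sample from each stratum, sized by its share of the population."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding can shift the overall total slightly.
        k = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical school of 2000 students split by grade level.
sizes = {"freshmen": 600, "sophomores": 500, "juniors": 500, "seniors": 400}
strata = {grade: [f"{grade}_{i}" for i in range(n)] for grade, n in sizes.items()}

sample = stratified_sample(strata, 100, seed=2)
for grade, chosen in sample.items():
    print(grade, len(chosen))
```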
Cluster sampling
When you pick an entire group at random rather than individuals. You can only do this if you are sure the group will be representative of your population.
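For contrast with sampling individuals, the sketch below picks whole groups at random and keeps everyone in them. The classrooms, their sizes, and the function `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names, not people
    return {name: clusters[name] for name in chosen}

# Hypothetical school: 40 classrooms of 25 students each.
classrooms = {
    f"room_{r}": [f"room_{r}_student_{s}" for s in range(1, 26)]
    for r in range(1, 41)
}

sample = cluster_sample(classrooms, 2, seed=5)
for room, students in sample.items():
    print(room, len(students))
```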
Systematic sampling
When you pick the first person from a list at random, then choose every kth person after that. (Ex. every 4th person on a list of 200)
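The "every 4th person on a list of 200" example can be sketched as follows, assuming the random start falls among the first k people on the list. The numbered roster and the function `systematic_sample` are invented for illustration.

```python
import random

def systematic_sample(items, k, seed=None):
    """Pick a random starting point among the first k items, then take every kth item."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random index from 0 to k - 1
    return items[start::k]

# Hypothetical list of 200 people, numbered 1 to 200.
roster = list(range(1, 201))
picked = systematic_sample(roster, 4, seed=3)
print(len(picked), "people selected, starting with person", picked[0])
```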
Convenience sampling
When you pick whoever is nearby and easy. Or pick your friend. Or take only those who call in, or only those who log into your website. It doesn’t make a representative sample, and it makes me unhappy.
Bias
Even if you do random selection (random sample) correctly, bad things can still happen to mess up your sample. Sometimes it’s your fault, sometimes not. In general, bias is a tendency for samples to differ from the population in some consistent way.
Selection bias
When you systematically exclude a part of the population.
Example: Phone surveys exclude people without phones, such as many homeless people.
Measurement bias (or response bias)
When the way you collect the data distorts the answers (for example, a leading question).
Example: Asking “knowing that smoking causes lung cancer, how do you feel about smoking?”
Nonresponse bias
When you don’t get responses from the sample you select.
Example: When you mail out surveys, and you only get a 40% response rate.
Treatments
The different experimental conditions being compared; they are the values of the explanatory variable.
Replication
Using more than one subject or observation for each treatment group.
Direct control
When the experimenter directly holds certain variables constant, so they are the same for every treatment group.
What is the goal of direct control? (Magic phrase)
The goal is to reduce variability so that differences can be more easily seen.
Control group
When one experimental group receives no treatment. Not every experiment needs a control group! You only do this if your experiment is comparing one treatment (like a new drug) to nothing (like no drug).
Confounding variable
A variable related to both the explanatory variable and the response variable. If you have one, you can’t tell whether the treatment effects are due to the treatment or to this other factor. Rare in a well-designed experiment.