Statistics
The study of how to collect, organize, analyze, and interpret data collected from a group
Individual
A person or object that you are interested in finding out information about
Variable (AKA random variable)
The measurement or observation of the individual
Population
The entire group of individuals we are interested in
Sample
A subset of the population. Ideally it resembles the population, but it contains less data
- Does not perfectly represent the population, because each sample will provide slightly different data
Parameter
A numerical result summarizing a population (e.g., mean, proportion).
Remember:
P-arameter
P-opulation
Statistic
A number calculated from the sample. Since you can collect samples, the statistic is readily known, though it changes depending on the sample taken.
- It is used to estimate the parameter value
Remember:
S-tatistic
S-ample
μ
Population mean
p
Population proportion
x̅
Sample mean
p̂
Sample proportion
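ex (a minimal Python sketch, not from the original notes): the four symbols computed on made-up data. The population of heights and left-handedness, the sample size of 50, and all variable names are assumptions chosen only for illustration.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 individuals (made-up data)
population_heights = [rng.gauss(170, 10) for _ in range(1000)]  # heights in cm
population_lefty = [rng.random() < 0.1 for _ in range(1000)]    # True = left-handed

# Parameters: computed from the whole population
mu = mean(population_heights)
p = sum(population_lefty) / len(population_lefty)

# Statistics: computed from one sample of 50 individuals
sample_ids = rng.sample(range(1000), 50)
x_bar = mean(population_heights[i] for i in sample_ids)
p_hat = sum(population_lefty[i] for i in sample_ids) / len(sample_ids)

print(f"mu = {mu:.1f}, x_bar = {x_bar:.1f}")
print(f"p = {p:.3f}, p_hat = {p_hat:.3f}")
```

x̅ and p̂ will usually land near μ and p without matching them exactly, and a different sample would give different values.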
Qualitative (categorical) variable
Answer is a word or name that describes a quality of the individual
QUALItative --> QUALIty
Quantitative (numerical) variable
Answer is a number, something that can be counted or measured from the individual
2 types:
- Discrete
- Continuous
Note: if it's a number, it's probably quantitative unless it makes no sense to do arithmetic on it (ex: zip codes)
Discrete quantitative
Data can only take on certain values (usually integers; typically things you can count)
ex: Number of siblings, number of free throws made in a season
Continuous quantitative
Data can take on any value (usually something you measure)
ex: Height, weight, time to complete a task
Simple random sample
Every different possible sample of size "n" has the same chance of being selected
- Pros: A lot of variety throughout sample (No pattern/predictability)
- Cons: Hard to conduct a truly random sample
ex: Pulling names out of a hat, random number generator
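ex (a minimal Python sketch, assuming a made-up roster of 30 names): the "names out of a hat" idea using the standard library's random number generator. The roster and sample size are hypothetical.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: a class roster of 30 students
names = [f"student_{i}" for i in range(1, 31)]

# Draw 5 names without replacement; every group of 5 is equally likely
srs = rng.sample(names, 5)
print(srs)
```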
Stratified sample
Divide into "strata", randomly select individuals from each
- Pros: Obtain data specific to each group
- Cons: Data may be too specific, so each stratum ends up being treated as its own subgroup
ex: divide into male and female, then randomly choose from each group
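ex (a minimal Python sketch, assuming a made-up roster of 40 people and 4 picks per stratum): split into strata first, then take a random sample inside each one.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Hypothetical roster of (name, group) pairs
roster = [(f"person_{i}", "male" if i % 2 else "female") for i in range(40)]

# Step 1: divide the individuals into strata
strata = {}
for name, group in roster:
    strata.setdefault(group, []).append(name)

# Step 2: randomly select individuals from each stratum
stratified = {group: rng.sample(members, 4) for group, members in strata.items()}
print(stratified)
```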
Systematic sample
After ordering individuals, take every nth to be in your sample
- Pros: Data is very consistent
- Cons: Can be inaccurate if the ordering has a repeating pattern that lines up with n
ex: choose every 10th item off a production line to inspect
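ex (a minimal Python sketch, assuming 100 numbered items and n = 10): taking every 10th item. The random starting point is an added assumption, not something stated in the card.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical production line: items numbered 1 to 100, already in order
items = list(range(1, 101))
n = 10

# Assumption: start at a random position within the first n items
start = rng.randrange(n)

# Take every nth item from that starting point
systematic = items[start::n]
print(systematic)
```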
Cluster sample
Divide into clusters, randomly select clusters, sample ALL individuals in selected clusters
- Pros: Easiest sample to conduct when collecting polling data based on location
- Cons: People who reside in these regions may have similar opinions
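ex (a minimal Python sketch, assuming 8 made-up neighborhoods of 6 residents each): randomly choose whole clusters, then include everyone in the chosen clusters.

```python
import random

rng = random.Random(4)  # fixed seed so the sketch is repeatable

# Hypothetical clusters: 8 neighborhoods with 6 residents each
clusters = {
    f"neighborhood_{k}": [f"resident_{k}_{i}" for i in range(6)]
    for k in range(8)
}

# Step 1: randomly select clusters (here, 2 of the 8)
chosen = rng.sample(sorted(clusters), 2)

# Step 2: sample ALL individuals in the selected clusters
cluster_sample = [person for name in chosen for person in clusters[name]]
print(chosen, len(cluster_sample))
```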
Census
Every individual of interest is measured
- Pros: Gives us the parameter - best result possible (very accurate)
- Cons: Usually hard to do and very time-consuming
ex: US census
Convenience sample
Selecting a sample based on what is convenient
- Pros: Easy, convenient
- Cons: Not statistically valid (inherent biases)
ex: selecting people you know, choosing the people in the front row, choosing rats by grabbing them out of a cage.
Observational Study
When investigators collect data merely by watching or asking questions. They don't change anything.
(very hard to establish cause and effect)
ex: Survey
Experiment
When the investigator changes a variable or imposes a treatment to determine its effect. Intended to establish cause-and-effect.
ex: seeing if a test prep class improves test scores
Why not always just do an experiment?
- It is unethical to give someone a condition in order to study it (ex: giving them a disease)
- Can't randomly assign certain explanatory variables, such as gender or handedness
Extrapolation
Assuming something that is outside the scope of the study (assuming a pattern continues beyond what was studied)
Guidelines for planning a statistical study
1. Identify individuals
2. Specify the variable
3. Specify the population
4. Specify data collection methods
5. Census vs. sample
6. Collecting data
7. Describe data and run inferential stats
8. Concerns for future investigations
1. Identify individuals (things to make note of)
Can only make conclusions about the individuals.
ex: Can't assume humans will react the same way lab rats do
2. Specify the variable (things to make note of)
Try to control as many other variables as possible
3. Specify the population (things to make note of)
Can't extrapolate beyond population
Randomized two-treatment experiment
Control is the key concept. There are two treatments, and individuals are randomly assigned to one of the two groups
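ex (a minimal Python sketch, assuming 20 made-up subjects split evenly): random placement into two groups by shuffling and cutting the list in half. The group names are hypothetical.

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is repeatable

# Hypothetical subjects
subjects = [f"subject_{i}" for i in range(20)]

# Shuffle a copy, then split it in half: chance alone decides each group
shuffled = subjects[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:10], shuffled[10:]

print("treatment:", treatment)
print("control:  ", control)
```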
Placebo
Harmless medicine with no effect; dummy medicine.
Often given to the control group, since a person may get better just because they are taking something (the placebo effect)
Randomized Block Design
A block is a group of subjects that are similar to each other, while the blocks differ from one another. Treatments are then randomly assigned to subjects within each block.
- Provides more control for more accurate results
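ex (a minimal Python sketch, assuming three made-up age-group blocks of 6 subjects each): the random assignment happens separately inside every block, so each block gets both treatments.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical blocks: subjects are similar within a block, blocks differ by age
blocks = {
    "under_30": [f"u30_{i}" for i in range(6)],
    "30_to_60": [f"mid_{i}" for i in range(6)],
    "over_60": [f"o60_{i}" for i in range(6)],
}

assignment = {}
for block_name, members in blocks.items():
    # Randomize within the block only
    shuffled = members[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    for subject in shuffled[:half]:
        assignment[subject] = "treatment"
    for subject in shuffled[half:]:
        assignment[subject] = "placebo"

for subject, group in sorted(assignment.items()):
    print(subject, group)
```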
Rigorously Controlled Design
Assign subjects to different treatments to control as many variables as possible
ex: Twin studies
- Difficult to carry out because it's hard to find subjects who match each other exactly
Matched Pairs Design
The treatments are given to two groups whose members can be paired up with each other in some way.
ex: Have the same person complete a driving simulator while sober and while drunk.
Other aspects of experimentation
Sample size should be large and the study should be repeated for best results.
Blinding (2 types)
1. Single blinded
2. Double blinded
Single blinded
Subjects do not know if what they are receiving is an actual treatment or a placebo
Double blinded
Neither the experimenters doing the measuring nor the subjects being tested know who is getting the treatment and who is getting the placebo.
- This method eliminates bias from both parties.
ex: An experimenter who knows who got the actual treatment may ask leading questions, such as "Are you sure you don't feel any side effects or feel different?"
Overgeneralization (things not to do in stats)
When you do a study on one group and then claim that the results will apply to all groups.
ex: Just because something works on a rat doesn't mean it will work on a human
Cause and effect (things not to do in stats)
When people decide that one variable causes the other just because the variables are related or correlated.
ex: As jacket sales increase, ice cream sales decrease.
- Temperature is a lurking variable
Lurking or confounding variables
When you cannot rule out the possibility that the observed effect is due to some other variable rather than the factor being studied.
Remember: Correlation IS NOT causation
Sampling error (Issues throughout stats)
The difference between the sample results and the true population results
- This can happen by chance. This is why replication of studies is important!
- This type of error is expected, because it is unavoidable
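ex (a minimal Python sketch, assuming a made-up population of 5,000 values and samples of size 40): each sample mean misses the population mean by a different amount, purely by chance.

```python
import random
from statistics import mean

rng = random.Random(6)  # fixed seed so the sketch is repeatable

# Hypothetical population of 5,000 values between 0 and 100
population = [rng.uniform(0, 100) for _ in range(5000)]
mu = mean(population)  # the parameter

# Take 5 separate samples and record how far each sample mean is from mu
errors = []
for _ in range(5):
    sample = rng.sample(population, 40)
    errors.append(mean(sample) - mu)  # sampling error for this sample

for e in errors:
    print(f"sampling error: {e:+.2f}")
```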
Non-sampling error
The error we get when we don't do studies appropriately (want to AVOID)
ex: Thermometer always 1 degree higher than actual temperature.
Statistical significance
The results are probably not due to chance
ex: A result that would be very unlikely to occur by chance alone
Practical or clinical significance
Is the effect large enough to really matter? An effect can be statistically significant and still be too small to care about.
ex: An expensive 12-week weight loss program helps people lose 3 pounds on average. Is that enough to make one pay and participate in the program?
ex: Coffee helps you live longer. Stats show coffee drinkers live 3 days longer on average. Is that significant?
Hidden bias
When the questions are asked in a way that makes a person respond a certain way.
ex: How SATISFIED were you with your service at the restaurant tonight?
Bias due to order of questions
The order in which the questions are arranged can alter people's responses. This is because earlier questions may provoke a mood or thought process that implies a relation to the questions that follow.
ex:
"How many dates did you have last month?"
"How happy are you with life in general?"
- Implies there is a connection between dates and happiness.
CONVERSELY: reverse the order
"How happy are you with life in general?"
"How many dates did you have last month?"
- The implied relationship between the questions is much weaker
Non-response bias
When you send out a survey, but not everyone returns the survey.
If the response rate is under 30%, the results are very poor
Voluntary response bias
When people are asked to respond via phone, email, or online. The problem is that only people who really care about the topic are likely to return the call or email.
- People are most likely to respond only if they had a great or terrible experience. Someone with a mediocre experience does not feel strongly either way and usually won't take the time to fill out the survey, which leads to biased results.