Anecdotal evidence
one or a few specific cases/stories
Non-response bias
only a small fraction of the randomly sampled people choose to respond, the survey may no longer be representative of the population
Voluntary response
sample consists of people who volunteer to respond because they have strong opinions on the subject
Convenience sample
individuals who are easily accessible are more likely to be included in the sample
Prospective study
identifies individuals and collects information as events unfold
Retrospective studies
collect data after events have taken place
Simple random sample
randomly select cases from the population, with no implied connection between the points selected
Stratified sample
strata are made up of similar observations; we take a simple random sample from each stratum
Cluster sample
clusters are usually not made up of homogeneous observations; we take a simple random sample of clusters and then sample all observations in each sampled cluster
Multistage sample
clusters are usually not made up of homogeneous observations; we take a random sample of clusters and then take a random sample of observations from each sampled cluster
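A minimal sketch of how stratified, cluster, and multistage sampling differ in practice. The population, the group labels, and the function names are hypothetical, made up for illustration only; the same dictionary of groups is treated as strata in one case and as clusters in the others.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 4 groups ("A".."D") of 25 labelled individuals each.
groups = {g: [f"{g}{i}" for i in range(25)] for g in "ABCD"}

def stratified_sample(strata, per_stratum):
    """Simple random sample taken from EVERY stratum."""
    return [x for members in strata.values()
            for x in rng.sample(members, per_stratum)]

def cluster_sample(clusters, n_clusters):
    """Randomly choose some clusters, then keep ALL observations in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [x for c in chosen for x in clusters[c]]

def multistage_sample(clusters, n_clusters, per_cluster):
    """Randomly choose some clusters, then randomly sample WITHIN each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [x for c in chosen for x in rng.sample(clusters[c], per_cluster)]

stratified = stratified_sample(groups, per_stratum=5)
clustered = cluster_sample(groups, n_clusters=2)
multistage = multistage_sample(groups, n_clusters=2, per_cluster=5)
print(len(stratified), len(clustered), len(multistage))
```

The stratified sample touches all four groups, while the cluster and multistage samples only touch the two groups that were drawn.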
Principles of experimental design
control, randomize, replicate, block
Block
if variables are known or suspected to affect the response variable, first group subjects into blocks based on these variables, and then randomize cases within each block to treatment groups
Double-blind
neither the subjects nor the researchers who interact with them know which group each subject is in
P-value
probability value. The probability of observing results at least as extreme as the ones we got, assuming the null hypothesis is true
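As a rough illustration of a p-value, the sketch below computes an exact one-sided p-value for a coin-flip experiment. The data (9 heads in 10 flips), the fair-coin null hypothesis, and the function name `one_sided_p_value` are assumptions chosen for the example.

```python
from math import comb

def one_sided_p_value(observed, n, p_null):
    """P(X >= observed) when X ~ Binomial(n, p_null)."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(observed, n + 1)
    )

# Hypothetical data: 9 heads in 10 flips; null hypothesis: the coin is fair.
p_value = one_sided_p_value(9, 10, 0.5)
print(f"p-value = {p_value:.4f}")  # p-value = 0.0107
```

Only 11 of the 1024 equally likely flip sequences have 9 or more heads, so a result this extreme would be rare if the coin were fair.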
One-sided hypothesis
looking for extreme results in only one direction (greater than or less than the null value)
p (hat)
sample proportion
Null hypothesis
“There is nothing going on.” The variables are independent, observed differences are due to chance
Alternative hypothesis
“There is something going on.” Variables are dependent, not due to chance.
Either reject or ____ the null hypothesis
fail to reject
Convincing evidence
allows us to reject the null hypothesis
Type 1 error
rejecting the null hypothesis when it is true
Type 2 error
failing to reject the null hypothesis when the alternative hypothesis is true
Two-sided hypothesis
looking for extreme results on either side of the distribution
Confirmation bias
we only look for data that supports our own idea
Random process
a situation in which we know what outcomes could happen, but we don’t know which particular outcome we will get
Frequentist interpretation
the probability of an outcome is the proportion of times the outcome would occur if we observed the random process an infinite number of times
Bayesian interpretation
interprets probability as a subjective degree of belief. for the same events, two separate people could have different viewpoints
Law of large numbers
states that as more observations are collected, the proportion of occurrences with a particular outcome after n observations, p̂_n (p-hat), converges to the probability of that outcome, p (the population proportion)
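A small simulation can make the law of large numbers concrete. This is only a sketch: the fair-coin setting, the seed, the checkpoint sizes, and the function name `running_proportion` are all assumptions made for the example.

```python
import random

def running_proportion(n_flips, p=0.5, seed=0):
    """Simulate coin flips and record the proportion of heads at checkpoints."""
    rng = random.Random(seed)
    heads = 0
    checkpoints = {}
    for i in range(1, n_flips + 1):
        if rng.random() < p:  # counts as "heads" with probability p
            heads += 1
        if i in (10, 100, 1_000, 10_000, 100_000):
            checkpoints[i] = heads / i
    return checkpoints

props = running_proportion(100_000)
for n, p_hat in props.items():
    print(f"n = {n:>6}: p-hat = {p_hat:.4f}")
```

The early proportions can wander noticeably, while the later ones should settle close to 0.5.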
Gambler’s fallacy
the mistaken belief that random processes compensate for whatever happened in the past (e.g., after 10 heads, the next flip “should” be tails)
Disjoint (mutually exclusive) outcomes
cannot happen at the same time (can’t get heads and tails on a single coin toss)
Non-disjoint outcomes
can happen at the same time (a single card drawn from a deck can be both a heart and a king)
General addition rule
P(A or B) = P(A) + P(B) - P(A and B); for disjoint events, P(A and B) = 0, so P(A or B) = P(A) + P(B)
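The general addition rule can be checked by direct counting. The sketch below uses a standard 52-card deck with A = "the card is a heart" and B = "the card is a king"; the events and the helper `prob` are chosen for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))  # 52 equally likely outcomes

def prob(event):
    """Probability of an event (a set of cards) under a uniform draw."""
    return Fraction(len(event), len(deck))

hearts = {card for card in deck if card[1] == "hearts"}
kings = {card for card in deck if card[0] == "K"}

p_union_direct = prob(hearts | kings)                               # count the union
p_union_rule = prob(hearts) + prob(kings) - prob(hearts & kings)    # addition rule
print(p_union_direct, p_union_rule)  # 4/13 4/13
```

Subtracting P(A and B) removes the king of hearts, which would otherwise be counted twice.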
Probability distribution
lists all possible events and the probabilities with which they occur
Rules for probability distributions:
the events listed must be disjoint
each probability must be between 0 and 1
the probabilities must total 1
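The two numeric rules can be checked mechanically. In this sketch a distribution is represented as a dictionary from outcome to probability, so distinct keys stand in for disjoint events; that representation and the function name `is_valid_distribution` are assumptions made for the example.

```python
import math

def is_valid_distribution(probs):
    """Check that every probability is in [0, 1] and that they total 1."""
    values = list(probs.values())
    in_range = all(0 <= v <= 1 for v in values)
    totals_one = math.isclose(sum(values), 1.0)
    return in_range and totals_one

fair_die = {face: 1 / 6 for face in range(1, 7)}
broken = {"heads": 0.5, "tails": 0.6}  # totals 1.1, so not a distribution

print(is_valid_distribution(fair_die))  # True
print(is_valid_distribution(broken))    # False
```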
Sample space
all the possible outcomes of a trial
Complementary events
two mutually exclusive events whose probabilities add up to 1
Independent
knowing the outcome of one provides no useful info about the outcome of the other
Associated or dependent
knowing the outcome of one provides useful information about the outcome of the other
Product rule for independent events
if A and B are independent events, then P(A and B) = P(A) × P(B)
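A quick check of the product rule by enumeration, as a sketch. It rolls two fair dice, with A = "the first die shows 6" and B = "the second die shows 6"; these events and the helper `prob` are chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of the outcomes for which event(outcome) is true."""
    favourable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favourable, len(sample_space))

p_a = prob(lambda o: o[0] == 6)                    # first die shows 6
p_b = prob(lambda o: o[1] == 6)                    # second die shows 6
p_both = prob(lambda o: o[0] == 6 and o[1] == 6)   # both show 6

print(p_both, p_a * p_b)  # 1/36 1/36
```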
Complementary events
are opposite, but always add up to 1 (like heads and tails)
Conditional probability
the probability of A given that B has occurred: P(A|B) = P(A and B) / P(B), defined when P(B) > 0
General multiplication rule
If A and B represent two outcomes or events, then P(A and B) = P(A|B) × P(B)
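Conditional probability and the general multiplication rule can both be verified by counting. In this sketch two cards are drawn without replacement from a 52-card deck, with B = "the first card is an ace" and A = "the second card is an ace"; the setup and variable names are assumptions for the example.

```python
from fractions import Fraction
from itertools import permutations

# Cards 0..3 are the aces; cards 4..51 are everything else.
is_ace = [i < 4 for i in range(52)]

# Every ordered pair of distinct cards is an equally likely two-card draw.
pairs = list(permutations(range(52), 2))

p_b = Fraction(sum(1 for first, _ in pairs if is_ace[first]), len(pairs))
p_a_and_b = Fraction(
    sum(1 for first, second in pairs if is_ace[first] and is_ace[second]),
    len(pairs),
)

p_a_given_b = p_a_and_b / p_b          # definition of conditional probability
print(p_a_given_b)                     # 1/17, the same as 3/51
print(p_a_and_b, p_a_given_b * p_b)    # 1/221 1/221
```

Once one ace is gone, 3 aces remain among 51 cards, which is why P(A|B) differs from the unconditional 4/52.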
Rule of thumb
when sampling without replacement, if we sample less than 10% of the population, the effect on conditional probabilities is small, so we can treat the observations as independent
The probability of one specific value among infinitely many possible values (a continuous distribution) is ______
zero
Normal distribution
Unimodal and symmetric, bell shaped curve
Measures of spread
variance, standard deviation, and range
z score
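how many SDs an observation is from the mean: z = (observation - mean) / SD. Can be computed for any distribution, but converting z to a percentile with the normal table works only for nearly normal data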
percentile
percentage of observations that fall below a given data point
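A short sketch tying z scores to percentiles, assuming the data are nearly normal. The numbers (mean 100, SD 15, observation 130) are hypothetical, and `z_score` is a helper written for the example; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / sd

# Hypothetical test scores: mean 100, SD 15, one observation at 130.
z = z_score(130, mean=100, sd=15)

# Under a normal model, the percentile is the area to the left of z.
percentile = NormalDist().cdf(z)

print(f"z = {z:.1f}, percentile = {percentile:.1%}")  # z = 2.0, percentile = 97.7%
```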
Unusual
observations that are more than 2 SDs away from the mean
Statistic
a number computed from data that summarizes some feature of it (e.g., a mean or a proportion)
One version of scientific methodology:
observe data, form hypothesis, gather more data, test/evaluate/adjust hypothesis, gather more data, etc.
Explanatory variable
the variable suspected of affecting the other; in an experiment, the one that is manipulated or changed
Response variable
the outcome that is measured (e.g., yes or no); the variable the explanatory variable is suspected of affecting
Population
the entire group we want to learn about (all those who could be tested)
Sample
subgroup or subset of the population
Numerical variable
continuous (any value on the number line) or discrete (only separated values, with gaps between them, such as counts)
Categorical variable
nominal (no natural order) or ordinal (natural order, e.g., T-shirt size)
68-95-99.7 Rule
for nearly normally distributed data:
about 68% falls within 1 SD of mean
about 95% falls within 2 SD of the mean
about 99.7% falls within 3 SD of the mean
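The three percentages can be reproduced from the standard normal distribution. This sketch uses `NormalDist` from Python's standard `statistics` module; the variable names are chosen for the example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

# Share of a normal distribution within k SDs of the mean, for k = 1, 2, 3.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within {k} SD: {share:.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```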
Population proportion
percentage of a population that meets some specific criterion
Sample proportion
the number in the sample that meet the criterion divided by the sample size (p hat)
Plurality
the largest share among the options, but not a majority (more than 50%)
Population parameter of interest
the number we want to know
Sample statistic
usually a point estimate, the number we get from a sample
Central Limit Theorem (proportion version)
sample proportions will be nearly normally distributed with mean equal to the population proportion, p, and standard error equal to sqrt(p(1 - p)/n), where n is the sample size
CLT conditions
Independence - observations are independent; when sampling without replacement, n < 10% of the population
Sample size - there should be at least 10 expected successes and 10 expected failures (np ≥ 10 and n(1 - p) ≥ 10)
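A minimal sketch of the sample-size condition together with the standard error of a sample proportion, assuming the usual formula SE = sqrt(p(1 - p)/n). The function names and the example values (p = 0.2, n = 100 or 30) are hypothetical.

```python
from math import sqrt

def standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return sqrt(p * (1 - p) / n)

def success_failure_ok(p, n, minimum=10):
    """True if there are at least `minimum` expected successes and failures."""
    return n * p >= minimum and n * (1 - p) >= minimum

se = standard_error(0.2, 100)
ok_large = success_failure_ok(0.2, 100)  # 20 expected successes, 80 failures
ok_small = success_failure_ok(0.2, 30)   # only 6 expected successes

print(f"SE = {se:.3f}")    # SE = 0.040
print(ok_large, ok_small)  # True False
```

The independence condition depends on how the data were collected, so it cannot be checked from p and n alone.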
Confidence interval
a plausible range of values for the population parameter
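One common way to build such an interval for a proportion is point estimate ± z* × SE. The sketch below assumes the CLT conditions hold and uses made-up data (560 successes in a sample of 1000); `proportion_ci` is a helper written for the example, not a library function.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Critical value z*, about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical survey: 560 of 1000 respondents say yes.
low, high = proportion_ci(560, 1000)
print(f"95% CI: ({low:.3f}, {high:.3f})")  # 95% CI: (0.529, 0.591)
```

Because the whole interval sits above 0.5, these made-up data would be consistent with a majority in the population.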
Significance level (alpha)
the probability of a Type 1 error that we are willing to accept, usually 0.05