1-10: unit one, 11/12: unit two, 13-21: unit three, 22-29: unit four, 30-33: unit five, 34-42: unit six, 43-45: unit seven, 46-48: unit eight, 49-51: unit nine
categorical variable
classifies by a category/group (countries, colors, zip codes)
quantitative variable
takes a numerical, measurable value (height, number of something, etc.); can be discrete (countable values) or continuous (any value in a range)
frequency table vs relative frequency table
frequency table has the count for each value, relative frequency table has the proportion (count / total)
how to describe a distribution of a quantitative variable
center: median/mean
unusual features: outliers
shape: symmetrical, skewed, uniform
spread: range, iqr, sd
statistic vs parameter
statistic = sample
parameter = population
how to find outliers
below q1 - 1.5(IQR) = low outliers
above q3 + 1.5(IQR) = high outliers
or more than 2 sd from the mean
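A minimal sketch of the 1.5(IQR) rule in Python. The function names and the `scores` data are made up for illustration, and `statistics.quantiles` uses one particular quartile convention, so the fences can differ slightly from a calculator or textbook.

```python
import statistics

def iqr_fences(data):
    """Return (lower_fence, upper_fence) from the 1.5 * IQR rule."""
    # quantiles(n=4) returns [q1, median, q3]; other quartile conventions exist
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data):
    """Return the values that fall outside the fences."""
    low, high = iqr_fences(data)
    return [x for x in data if x < low or x > high]

scores = [2, 4, 5, 6, 7, 8, 9, 10, 30]  # hypothetical data
print(find_outliers(scores))  # [30]
```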
resistant/robust statistics
median and iqr (not affected by outliers)
nonresistant/nonrobust statistics
mean, sd, range (affected by outliers)
five number summary plot
boxplot (min, q1, median, q3, max)
empirical rule
68-95-99.7 = for a roughly normal distribution, 68% of the data is within 1 sd of the mean, 95% within 2 sd, and 99.7% within 3 sd
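A quick check of the 68-95-99.7 numbers using the standard normal model from Python's `statistics` module. The helper name `within_k_sd` is just for illustration.

```python
from statistics import NormalDist

def within_k_sd(k):
    """Proportion of a normal distribution within k sd of the mean."""
    z = NormalDist()  # standard normal: mean 0, sd 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```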
how to describe a scatterplot
direction: positive/negative
unusual features: outliers
form: linear/nonlinear association
strength: weak or strong based on how closely the points follow the line of best fit
extrapolation
BAD!!! DON'T PREDICT Y USING AN X FROM OUTSIDE THE GIVEN RANGE
experiment
participants are randomly assigned treatments. CAN DETERMINE CAUSATION!
observational study
no treatments assigned. CANNOT DETERMINE CAUSATION!!!
simple random sample
every individual (and every possible sample of the same size) has an equal chance of being chosen
stratified random sample
population is divided into groups (strata) and then individuals are selected randomly from each group
cluster sample
population is divided into groups (clusters) and entire groups are randomly selected
systematic random sample
first individual is chosen at random, then the rest are chosen at a fixed interval (every 5th person)
biases
voluntary response bias, undercoverage bias, nonresponse bias, and question wording bias
blinding
single blind: subjects don't know what treatment they're getting
double blind: neither the subjects nor the researchers know which treatment each subject is getting
experimental designs
completely randomized, randomized block design (sorted into groups then assigned treatments), and matched-pairs
conditional probability
probability that a will occur, given that b has occurred
p(a|b)= p(anb)/p(b)
multiplication rule for joint probabilities
p(anb)= p(a) x p(b|a)
how to tell if a and b are independent
p(a|b)= p(a) or p(b|a)= p(b)
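A small sketch of the conditional probability, multiplication, and independence rules. The function names and the probabilities (0.5, 0.4, 0.2) are made up for illustration.

```python
def conditional(p_a_and_b, p_b):
    """p(a|b) = p(anb) / p(b)"""
    return p_a_and_b / p_b

def joint(p_a, p_b_given_a):
    """p(anb) = p(a) x p(b|a)"""
    return p_a * p_b_given_a

def is_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """a and b are independent when p(a|b) equals p(a)."""
    return abs(conditional(p_a_and_b, p_b) - p_a) < tol

# hypothetical values: p(a) = 0.5, p(b) = 0.4, p(anb) = 0.2
print(conditional(0.2, 0.4))          # p(a|b)
print(is_independent(0.5, 0.4, 0.2))  # True, since p(a|b) matches p(a)
print(is_independent(0.5, 0.4, 0.3))  # False
```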
addition rule / union
probability that either a or b will occur
p(aub)= p(a)+p(b)-p(anb)
mutually exclusive
p(anb)= 0
p(aub)= p(a)+p(b)
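The addition rule as a one-line sketch. The name `union` and the example numbers are made up; leaving the overlap at 0 gives the mutually exclusive case.

```python
def union(p_a, p_b, p_a_and_b=0.0):
    """p(aub) = p(a) + p(b) - p(anb); p(anb) is 0 when mutually exclusive."""
    return p_a + p_b - p_a_and_b

print(union(0.5, 0.4, 0.2))  # general case, about 0.7
print(union(0.5, 0.4))       # mutually exclusive case, about 0.9
```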
bernoulli trial
only two possible outcomes (success/fail) and probability of success is the same every time the experiment is conducted
binomial distribution
probability of a specific number of successes in a fixed number of trials
geometric distribution
probability of first success on a specific trial number
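A sketch of the two probability formulas behind these cards, using the standard binomial and geometric formulas (they are not written out on the cards). The function names and the coin-flip numbers are just for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials), success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def geometric_pmf(k, p):
    """P(first success happens on trial k), success probability p."""
    return (1 - p)**(k - 1) * p

print(binomial_pmf(2, 4, 0.5))  # 0.375 (exactly 2 heads in 4 fair flips)
print(geometric_pmf(3, 0.5))    # 0.125 (first head on the 3rd flip)
```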
standard normal distribution
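bell curve shape (tails get close to the axis but never touch it!!!), mu = 0 and sd = 1. written N(0,1); a general normal model is named N(mu,sd)
central limit theorem
sampling distribution of the sample mean is approximately normal when the original population is normally distributed or SAMPLE SIZE IS >=30!!! observations must also be independent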
assumptions and conditions for proportions
randomization, sample size (sample is <10% of pop.), independence
assumptions and conditions for means
randomization, independence, sample size (sample is <10% of pop. and n>=30)
confidence interval rules
sample size increases, width of confidence interval decreases
confidence level increases, width of confidence interval increases
conditions for confidence interval for one proportion
independence and normality (at least 10 successes and 10 failures)
conditions for confidence interval for two proportions
independence and normality for each proportion (at least 10 successes and 10 failures in each sample)
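A rough sketch of a one-proportion z interval, phat ± z* x sqrt(phat x qhat / n), to show the confidence interval rules in action. The function name and the 60-out-of-100 data are made up, and the check inside only covers the >=10 condition, not independence.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_ci(successes, n, confidence=0.95):
    """Return (low, high) for a one-proportion z interval."""
    failures = n - successes
    if successes < 10 or failures < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    p_hat = successes / n
    # critical value z* for the chosen confidence level
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

print(one_prop_ci(60, 100))          # hypothetical sample, 95% level
print(one_prop_ci(600, 1000))        # bigger sample, narrower interval
print(one_prop_ci(60, 100, 0.99))    # higher confidence, wider interval
```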
hypothesis test
procedure for testing a claim about a population based on a sample
null - h(o): parameter = hypothesized value (often 0)
alternative - h(a): parameter >,<,=/= hypothesized value
p is more than alpha - fail to reject the null hypothesis
p is less than alpha - reject the null hypothesis
type 1 error
null hypothesis is rejected, but is actually true / false positive / probability of this error is alpha
type 2 error
null hypothesis is not rejected when it is actually false / false negative / probability of this error is beta
power of the test
probability that the null hypothesis will be rejected when it actually is false / 1 - beta
alpha and beta rules
inverse relationship - beta increases, alpha decreases, etc.
sample size increases (standard error decreases), beta decreases
z score
z = (phat - p)/sqrt(pq/n)
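A sketch of the z score for one proportion, plus a two-sided p-value from the normal model. The function name and the numbers (60 successes out of 100, null p = 0.5) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(successes, n, p0):
    """Return (z, two_sided_p_value) for h(o): p = p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)  # uses the null value p0, with q = 1 - p0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = one_prop_z_test(60, 100, 0.5)
alpha = 0.05
print(round(z, 2), round(p_value, 4))
print("reject h(o)" if p_value < alpha else "fail to reject h(o)")
```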
df
degrees of freedom = n-1
df increases, tails get thinner (looks more like normal model)
hypothesis test for mean
h(o) - mu = hypothesized value (0 for a mean difference)
h(a) - mu <,>,=/= hypothesized value
t score
one sample: t = (sample mean - mu)/(sd/sqrt(n))
two samples: t = (mean1 - mean2)/(SE)
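A sketch of the one-sample t score with df = n-1. The function name and the sample are made up; the p-value is left out because Python's standard library has no t distribution, so that part would come from a table or calculator.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t, df) for h(o): mu = mu0."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)  # stdev is the sample sd (divides by n-1)
    t = (mean(sample) - mu0) / se
    return t, n - 1

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
t, df = one_sample_t(sample, mu0=4)
print(round(t, 4), df)  # 1.3229 7
```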
chi square goodness of fit
whether a population fits a certain distribution
h(o) = the population fits the claimed distribution (no difference)
h(a) = the population does not fit the claimed distribution (there is a difference)
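A sketch of the goodness of fit statistic, using the standard formula chi-square = sum of (observed - expected)^2 / expected with df = number of categories - 1 (the formula is not written on the card). The function name and the counts are made up.

```python
def chi_square_gof(observed, expected_props):
    """Return (chi_square, df) for observed counts vs. claimed proportions."""
    n = sum(observed)
    expected = [n * p for p in expected_props]  # expected counts under h(o)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

# hypothetical counts across 4 categories, h(o): all categories equally likely
stat, df = chi_square_gof([18, 22, 20, 40], [0.25, 0.25, 0.25, 0.25])
print(round(stat, 2), df)  # 12.32 3
```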
chi square homogeneity
test if two or more populations follow the same categorical distribution
h(o) = there is no difference in proportions
h(a) = there is a difference in proportions
chi square independence
like homogeneity, but for two variables for a single population
h(o)= there is no association
h(a)= there is an association
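For homogeneity and independence the statistic is computed the same way from a two-way table, with expected count = (row total x column total) / grand total and df = (rows - 1)(columns - 1). A minimal sketch with a made-up 2x2 table:

```python
def chi_square_two_way(table):
    """Return (chi_square, df) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

table = [[10, 20],
         [20, 10]]  # hypothetical counts
stat, df = chi_square_two_way(table)
print(round(stat, 4), df)  # 6.6667 1
```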
least squares regression line
yhat = a+bx
df = n-2
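A sketch of fitting yhat = a + bx by least squares, using b = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2) and a = ybar - b(xbar). The function name and the points are made up for illustration.

```python
from statistics import mean

def lsrl(xs, ys):
    """Return (a, b) for the least squares line yhat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept
    return a, b

xs = [1, 2, 3, 4]  # hypothetical data that falls exactly on y = 1 + 2x
ys = [3, 5, 7, 9]
a, b = lsrl(xs, ys)
print(a, b)             # 1.0 2.0
print(len(xs) - 2)      # df = n-2
```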
assumptions and conditions for lsrl
linearity, independence (randomization), equal spread of residuals, nearly normal residuals
hypothesis test for slope
h(o) - b = 0
h(a) - b >,<,=/= 0