properties of a normal distribution
fully described by its mean and standard deviation
symmetric around its mean
mean=median=mode
2/3 of random draws are within one SD of the mean
~95% of random draws are within 2 SD of the mean
standard normal distribution
mean is zero
standard deviation is 1
standard normal table
gives the probability of getting a random draw from a standard normal distribution greater than a given value
standard normal is symmetric so…
Pr[Z>x] = Pr[Z<-x]
Pr[Z<x]=1-Pr[Z>x]
what about other normal distributions
all normal distributions are shaped alike, just with different means and variances
any normal distribution can be converted to a standard normal distribution by Z = (Y − μ)/σ
What does Z tell us
how many standard deviations Y is from the mean
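A minimal sketch (made-up numbers, assuming SciPy is installed) of standardizing a value and reading tail probabilities in place of a standard normal table:

```python
from scipy.stats import norm

# Hypothetical example: Y comes from a normal distribution with mu = 50, sigma = 10
y, mu, sigma = 65.0, 50.0, 10.0

z = (y - mu) / sigma          # Z = (Y - mu) / sigma: how many SDs Y is from the mean
upper_tail = norm.sf(z)       # Pr[Z > z], what a standard normal table reports
lower_tail = norm.cdf(-z)     # Pr[Z < -z]; equals the upper tail by symmetry

print(z, upper_tail, lower_tail)
```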
sample means are normally distributed
the mean of the sample means is μ
the standard deviation of the sample means is σ/√n (the population SD divided by the square root of the sample size)
standard error
standard deviation of the distribution of sample means
= s/√n
central limit theorem
the sum or mean of a large number of measurements randomly sampled from any population is approximately normally distributed
inference about means
because y bar is normally distributed, we can convert its distribution to a standard normal distribution
this gives a probability distribution of the difference between a sample mean and the population mean
what can s be used for
an estimate of the population standard deviation σ
student’s t test
when s is used in place of σ, the standardized sample mean no longer follows the standard normal exactly; it has a t distribution, which is a good approximation to the standard normal for large n
degrees of freedom for t-test
n-1
what can we use the t-distribution for
calculate confidence interval of the mean
one-sample t-test
compares the mean of a random sample from a normal population with the population mean proposed in a null hypothesis
test statistic for one sample t-test
t = (ȳ − μ0) / (s/√n), i.e. the sample mean minus the mean proposed by the null, divided by the standard error
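A minimal sketch (made-up data, assuming SciPy is installed) computing the one-sample t statistic by hand and via scipy.stats.ttest_1samp:

```python
import numpy as np
from scipy import stats

# Made-up sample; null hypothesis: population mean = 10
y = np.array([9.1, 10.4, 8.7, 11.2, 10.9, 9.8, 10.1, 9.5])
mu0 = 10.0

# Test statistic from the formula on this card
t_manual = (y.mean() - mu0) / (y.std(ddof=1) / np.sqrt(len(y)))

# Same test via SciPy (df = n - 1)
t_scipy, p = stats.ttest_1samp(y, popmean=mu0)
print(t_manual, t_scipy, p)
```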
one sample t-test assumptions
variable is normally distributed
the sample is a random sample
comparing two means
tests with one categorical and one numeric variable
goal: to compare the mean of a numerical variable for different groups
paired design examples
before and after treatment
upstream and downstream of a power plant
identical twins: one with a treatment and one without
earwigs in each ear: how to get them out? compare tweezers to hot oil
paired t-test
compares the mean of the differences to a value given in the null hypothesis
for each pair, calculate the difference. the paired t-test is simply a one-sample t-test on the differences
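A sketch (hypothetical before/after measurements, assuming SciPy) showing that a paired t-test is a one-sample t-test on the pairwise differences:

```python
import numpy as np
from scipy import stats

# Hypothetical before/after measurements on the same individuals
before = np.array([12.1, 14.3, 11.8, 13.5, 12.9, 14.0])
after  = np.array([11.4, 13.9, 11.1, 12.8, 12.5, 13.2])

# One-sample t-test on the differences (null: mean difference = 0) ...
d = after - before
t1, p1 = stats.ttest_1samp(d, popmean=0.0)

# ... matches SciPy's paired t-test (df = number of pairs - 1)
t2, p2 = stats.ttest_rel(after, before)
print(t1, p1, t2, p2)
```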
degrees of freedom for paired t-test
number of pairs-1
assumptions of paired t-test
pairs are chosen at random
differences have a normal distribution
2 sample t-test
compares the means of a numerical variable between two populations
assumptions of two-sample t-test
both samples are random samples
both populations have normal distributions
the variances of both populations are equal
Welch’s t-test
compares the means of two normally distributed populations that have unequal variances
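A minimal sketch (made-up samples, assuming SciPy): passing equal_var=False to scipy.stats.ttest_ind requests Welch's t-test, which drops the equal-variance assumption:

```python
from scipy import stats

# Hypothetical samples with visibly different spreads
group1 = [4.9, 5.3, 5.1, 4.8, 5.0, 5.2]
group2 = [6.0, 7.9, 4.1, 8.3, 5.6, 9.0]

# equal_var=False requests Welch's t-test (no equal-variance assumption)
t, p = stats.ttest_ind(group1, group2, equal_var=False)
print(t, p)
```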
how to compare variance between groups
the f-test
the F statistic
ratio of the two sample variances; has two different degrees of freedom, one for the numerator and one for the denominator
the F-test is very sensitive to the assumption that both distributions are normal
levene’s test
more robust test to compare variances (between 2 or more groups)
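A sketch (hypothetical data, assuming SciPy) comparing variances two ways: a variance-ratio F-test built from the F distribution, and the more robust Levene's test via scipy.stats.levene:

```python
import numpy as np
from scipy import stats

group1 = np.array([4.9, 5.3, 5.1, 4.8, 5.0, 5.2])
group2 = np.array([6.0, 7.9, 4.1, 8.3, 5.6, 9.0])

# F-test: ratio of the sample variances, df = (n2 - 1, n1 - 1)
f = group2.var(ddof=1) / group1.var(ddof=1)
dfn, dfd = len(group2) - 1, len(group1) - 1
p_f = 2 * min(stats.f.sf(f, dfn, dfd), stats.f.cdf(f, dfn, dfd))   # two-sided

# Levene's test: more robust to non-normality, works for 2 or more groups
w, p_levene = stats.levene(group1, group2, center='median')
print(f, p_f, w, p_levene)
```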
how to detect deviations from normality
previous data/theory
histograms
quantile plots
shapiro-wilk test
shapiro-wilk test
used to test statistically whether a set of data comes from a normal distribution
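A minimal sketch (simulated, deliberately skewed data, assuming SciPy) of the Shapiro-Wilk test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.lognormal(mean=0.0, sigma=0.8, size=40)   # deliberately non-normal data

w, p = stats.shapiro(data)
print(w, p)   # a small P suggests the data do not come from a normal distribution
```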
what to do when assumptions aren’t true
transformations
non-parametric tests
randomization and resampling
the normal approximation
means of large samples are normally distributed
the parametric tests on large samples work relatively well, even for non-normal data
rule of thumb, if n>~50, the normal approximations may work
parametric tests - unequal variance
welch’s t-test would work
if sample sizes are equal and large, then even a ten-fold difference in variance is approximately acceptable
data transformations
changes each data point by some simple mathematical formula
log-transformation
Y' = ln[Y]
when is the log transformation useful
the variable is likely to be the result of multiplication of various components
the frequency distribution of the data is skewed to the right
the variance seems to increase as the mean gets larger (in comparisons across groups)
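A small sketch (simulated right-skewed data, assuming SciPy) showing the log transformation pulling a skewed variable toward symmetry:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
y = rng.lognormal(mean=1.0, sigma=0.6, size=50)   # right-skewed, multiplicative-style data

y_log = np.log(y)                                  # Y' = ln[Y]
print(stats.skew(y), stats.skew(y_log))            # skewness should move toward 0 after the transform
```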
other transformations
arcsine, square-root, square, reciprocal, antilog
valid transformations
require the same transformation be applied to each individual
have one-to-one correspondence to original values
have a monotonic relationship with the original values
choosing transformations
must transform each individual in the same way
you CAN try different transformations until you find one that makes the data fit the assumptions
you CANNOT keep trying transformations until P<0.05
non-parametric methods
assume less about the underlying distributions
also called "distribution-free"
“parametric” methods assume a distribution or a parameter
non-parametric test
sign test
compares data from one sample to a constant
simple: for each data point, record whether it is above (+) or below (−) the hypothesized constant
use a binomial test to compare result to 1/2
the sign test has very low power
it is quite likely to not reject a false null hypothesis
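A minimal sketch (made-up data, assuming SciPy ≥ 1.7 for scipy.stats.binomtest) of the sign test: count the "+" signs and compare the proportion to 1/2 with a binomial test:

```python
import numpy as np
from scipy import stats

# Hypothetical data; null hypothesis: the median equals 10
y = np.array([11.2, 9.8, 12.5, 10.6, 13.1, 10.9, 8.7, 11.8, 12.2, 10.4])
hypothesized = 10.0

n_above = int(np.sum(y > hypothesized))
n_total = int(np.sum(y != hypothesized))   # ties with the constant are dropped

# Binomial test of the observed proportion of '+' signs against 1/2
result = stats.binomtest(n_above, n_total, p=0.5)
print(result.pvalue)
```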
most non-parametric methods use ranked order of data points
rank each data point in all samples from lowest to highest
lowest data point gets rank 1, next lowest gets rank 2
mann-whitney U test
compares the central tendencies of two groups using ranks
non-parametric method
Performing a mann-whitney U test
rank all individuals from both groups together in order
sum the ranks for all individuals in each group —> R1 and R2
assumptions of mann-whitney U test
both samples are random samples
both populations have the same shape of distribution
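A minimal sketch (hypothetical two-group data, assuming SciPy) of the rank-based Mann-Whitney U test:

```python
from scipy import stats

group1 = [3.1, 4.5, 2.8, 5.0, 3.9, 4.2]
group2 = [5.8, 6.4, 4.9, 7.1, 6.0, 5.5]

# Rank-based comparison of the two groups' central tendencies
u, p = stats.mannwhitneyu(group1, group2, alternative='two-sided')
print(u, p)
```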
permutation tests
also known as randomization tests
used for hypothesis testing on measures of association
mixes the real data randomly
variable 1 from an individual is paired with variable 2 data from a randomly chosen individual. this is done for all individuals
the estimate is made on the randomized data
this is repeated numerous times
without replacement
permutation tests are done without replacement
all data points are used exactly once in each permuted data set
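A sketch (made-up data, assuming NumPy) of a permutation test on the difference between two group means: group labels are shuffled without replacement, the statistic is recomputed many times, and the observed value is compared to that permutation distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-group data; test statistic: difference in group means
group1 = np.array([3.1, 4.5, 2.8, 5.0, 3.9, 4.2])
group2 = np.array([5.8, 6.4, 4.9, 7.1, 6.0, 5.5])
observed = group2.mean() - group1.mean()

pooled = np.concatenate([group1, group2])
n1 = len(group1)

n_perm = 10_000
count = 0
for _ in range(n_perm):
    shuffled = rng.permutation(pooled)          # without replacement: each point used exactly once
    diff = shuffled[n1:].mean() - shuffled[:n1].mean()
    if abs(diff) >= abs(observed):
        count += 1

p_value = count / n_perm
print(observed, p_value)
```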
goals of experiments
eliminate bias
reduce sampling error (increase precision and power)
what is the question
what kind of data do you need?
how much time/space/money/other resources do you have?
factor
the independent or experimental variable
level
one version of the experimental variable
treatment
the total experimental manipulation applied to a “unit” or “sample”
features that reduce bias
controls, random assignment to treatments, blinding
controls
a group which is identical to the experimental treatment in all respects aside from the treatment itself
establish a baseline
compare to the status quo
placebo-procedural control
example of placebo
some illnesses, e.g. pain and depression, respond to the fact of being treated, even with no pharmaceutically active ingredient
control: “sugar pills”
independent recovery
patients tend to seek treatment when they feel very bad
as a result, they often visit the doctor when they are at their worst. improvement may be inevitable, even without treatment
random assignment averages out the effects of confounding variables
allocation of treatments at random to avoid unknown bias
use a random number table, coin flip, deck of cards, etc.
blinding
preventing knowledge of experimenter (or patient) of which treatment is given to whom
unblinded studies usually find much larger effects (sometimes threefold higher), showing bias that results from lack of blinding
error and variation
experimental error
natural differences in experimental units
variation in measurement
environmental conditions
variance of experimental error is used to conduct statistical comparisons
replication
carry out study on multiple independent objects
balance
nearly equal sample sizes in each treatment
blocking
grouping of experimental units; within each group, different experimental treatments are applied to different units
extreme treatments
stronger treatments can increase the signal-to-noise ratio
blocking
controls for known bias or variation
age
sex
weight
nutrient level
size
location
replication
used to minimize unknown bias or error
indication of variation of results
experimental unit
in field biology, known as “plot”
physical entity to which a treatment is randomly assigned or a subject that is randomly selected from a treatment population
avoid pseudoreplication
analysis of variance (ANOVA)
like a t-test, but can compare more than two groups
asks whether the mean of any of two or more groups differs from any other
in other words, is the variance among groups greater than 0?
ANOVA assumptions
all samples are random samples
all populations are normally distributed
the variances of all groups are equal
kruskal-wallis test
non-parametric alternative to ANOVA
uses the ranks of the data points
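A minimal sketch (made-up three-group data, assuming SciPy) running a single-factor ANOVA and its non-parametric alternative, the Kruskal-Wallis test:

```python
from scipy import stats

# Hypothetical measurements from three groups
a = [5.1, 4.8, 5.5, 5.0, 4.9]
b = [5.9, 6.2, 5.7, 6.4, 6.0]
c = [5.3, 5.6, 5.2, 5.8, 5.4]

f, p_anova = stats.f_oneway(a, b, c)      # single-factor ANOVA
h, p_kw = stats.kruskal(a, b, c)          # rank-based, non-parametric alternative
print(p_anova, p_kw)
```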
correlation:r
describes the relationship between two numerical variables
correlation assumes…
random sample
X is normally distributed with equal variance for all values of Y
Y is normally distributed with equal variance for all values of X
regression
predicts Y from X
linear regression assumes that the relationship between X and Y can be described by a line
regression assumes…
random sample
Y is normally distributed with equal variance for all values of X
multiple-factor (factorial) ANOVA
a factor is a categorical variable
ANOVAs can be generalized to look at more than one categorical variable at a time
not only can we ask whether each categorical variable affects a numerical variable, but also do they interact in affecting the numerical variable
fixed effects
treatments are chosen by the experimenter; they are not a random subset of all possible treatments
random effects
the treatments are a random sample from all possible treatments
method for multiple comparisons
tukey-kramer test
tukey-kramer test
done after finding variation among groups with single-factor ANOVA
compares all group means to all other group means
why not use a series of two-sample t-tests
multiple comparisons would cause the t-tests to reject too many true null hypotheses
tukey-kramer adjusts for the number of tests
tukey-kramer also uses info about the variance within groups from all the data, so it has more power than a t-test with a bonferroni correction
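A sketch (hypothetical stacked data, assuming statsmodels is installed) of Tukey-Kramer pairwise comparisons after an ANOVA, using statsmodels' pairwise_tukeyhsd:

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical data from three groups, stacked into one response vector with group labels
values = np.array([5.1, 4.8, 5.5, 5.0, 4.9,
                   5.9, 6.2, 5.7, 6.4, 6.0,
                   5.3, 5.6, 5.2, 5.8, 5.4])
groups = np.repeat(['a', 'b', 'c'], 5)

# All pairwise comparisons of group means, adjusted for the number of tests
result = pairwise_tukeyhsd(values, groups, alpha=0.05)
print(result)
```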
estimate correlation coefficient
r = sum of cross products / square root of (sum of squares of X × sum of squares of Y)
spearman’s rank correlation
alternative to correlation that does not make so many assumptions
attenuation
the estimated correlation will be lower if X or Y are estimated with error
parameter of linear regression
Y = α + β X
estimating a regression line
Y=a+bX
best estimate of the slope
b = sum of cross products / sum of squares of X
coefficient of determination
r², square of the correlation coefficient r
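A minimal sketch (made-up paired data, assuming NumPy) computing r, the regression slope b, the intercept a, and r² directly from the sums of squares and cross products used on these cards:

```python
import numpy as np

# Hypothetical paired data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8])

ssx = np.sum((x - x.mean()) ** 2)                 # sum of squares of X
ssy = np.sum((y - y.mean()) ** 2)                 # sum of squares of Y
sxy = np.sum((x - x.mean()) * (y - y.mean()))     # sum of cross products

r = sxy / np.sqrt(ssx * ssy)    # correlation coefficient
b = sxy / ssx                   # estimated slope of Y = a + bX
a = y.mean() - b * x.mean()     # estimated intercept
r2 = r ** 2                     # coefficient of determination

print(r, a, b, r2)
```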
non-linear relationships
transformations, quadratic regression, splines
AIC vs. inferential statistics
power of p-value
multiple models as alternative hypotheses
statistically significant versus biologically significant
AIC
estimate relative fit of a set of competing statistical models → model selection
model fit is never exact, some fit better than others
AIC balances goodness of fit with number of parameters in the model (more parameterized models are penalized)
AIC calculations
models get scored
AIC = 2k − 2 ln(L-hat)
k = number of parameters
L-hat = maximum value of the likelihood function
measures how much information (fit) is gained/lost by adding a predictor (parameter)
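A sketch (made-up data, assuming NumPy) comparing the AIC of two competing least-squares models; the maximized Gaussian log-likelihood is computed from the residual sum of squares, and AIC = 2k − 2 ln(L-hat):

```python
import numpy as np

# Hypothetical data with a roughly linear trend
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.3, 2.8, 4.1, 4.6, 5.9, 6.2, 7.4, 8.1])
n = len(y)

def gaussian_loglik(residuals):
    """Maximized log-likelihood for a least-squares fit with normal errors."""
    rss = np.sum(residuals ** 2)
    return -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1)

# Model 1: intercept only (k = 2: intercept + error variance)
ll1 = gaussian_loglik(y - y.mean())
aic1 = 2 * 2 - 2 * ll1

# Model 2: simple linear regression (k = 3: intercept, slope, error variance)
b, a = np.polyfit(x, y, 1)        # returns slope, then intercept
ll2 = gaussian_loglik(y - (a + b * x))
aic2 = 2 * 3 - 2 * ll2

print(aic1, aic2)   # lower AIC = better balance of fit and parameter count
```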
relative quality of the model
the AIC score describes the relative quality of the model
changes with changes in the model set
how to choose the model set?
how to make an inference?
summary of AIC approach
not leaning on p-values
multiple models as alternative hypotheses
comparing relative model fit
strength of evidence approach
information theory statistics