inferential statistics
determine how close our sample estimates are to our population parameters
null hypothesis
a specific statement about a population made for the sake of argument
typically states that there is no difference between the true value of a population parameter and the hypothesized value, or no difference among 2+ samples drawn from a population
alternative hypothesis
states that the population parameter differs from the value specified by the null hypothesis
the hypothesis that the parameter is influenced by a non-random cause
two-tailed test
the alternative hypothesis has no specified direction; a deviation in either direction is tested/possible
one-tailed test
the direction of the alternative hypothesis is specified in advance (you need a good reason to use a one-tailed test)
binomial exact test
used to determine whether the observed proportion of successes in a sample differs significantly from a hypothesized population proportion
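a minimal R sketch, assuming you supply your own counts (numSuccesses and sampleSize are placeholder names, not from the notes):
binom.test(x = numSuccesses, n = sampleSize, p = 0.5)  # exact test of an observed proportion against a hypothesized proportion (here 0.5)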
test-statistic
single, standardized value calculated from the sample
allows us to evaluate how comparable our sample is to what we would expect under the null hypothesis
null distribution
sampling distribution of the test statistic if the null hypothesis were true
why is the null distribution important
tells you whether something interesting is going on or not!!
key to understanding all statistical tests and p-values
if the test statistic falls in the middle of the null distribution, you can assume nothing special is happening (fail to reject null hypothesis)
p-value
probability of obtaining test-statistic or more extreme values if the null hypothesis is true
interpretation if p-value is 0.031
if p = 0.5 in the toad population, only 3.1% of the time would we expect to obtain 14 or more righties or lefties in a sample of 18 toads -> therefore, our sample seems unlikely if Ho is true
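the toad example above can be checked with the exact binomial test (assuming 14 of the 18 toads fell in one handedness category):
binom.test(x = 14, n = 18, p = 0.5)  # two-sided p ≈ 0.031, matching the interpretation above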
significance level
probability used as the threshold for rejecting the null hypothesis
if the p-value is ≤ the threshold value, we find support for Ha and can say the result is “statistically significant” -> reject Ho
critical value
value of the test statistic required to achieve P ≤ the threshold value
type I error
mistakenly rejecting a true null, false positive
occurs with probability equal to the threshold (significance) value
why can’t we just lower our threshold value to prevent type I errors?
with a very low threshold, almost no sample could lead to rejection of the null, so we would routinely fail to reject false null hypotheses
type II error
false negative; mistakenly failing to reject a false null hypothesis
keeps us from setting threshold value too low
depends on sample size, effect size, and precision —> more difficult to estimate
ecological/biological hypothesis
what is the mechanism we think underlies the difference we expect to see
exploratory analysis
goal is to find the story of the data
explanatory analysis
goal is to share the story of the data
goal of two variable comparisons
visualize the association between variables
numeric vs numeric
scatterplot
categorical vs numeric
mean and error bars or boxplots
contingency table
frequency of occurrence of all combinations of 2 or more variables
categorical vs categorical
contingency table
grouped bar graph
mosaic plot
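minimal base-R sketches of these two-variable visualizations (df, x, y, a, b are placeholder names):
plot(df$x, df$y)              # numeric vs numeric: scatterplot
boxplot(y ~ a, data = df)     # categorical vs numeric: boxplots by group
table(df$a, df$b)             # categorical vs categorical: contingency table
mosaicplot(table(df$a, df$b)) # categorical vs categorical: mosaic plot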
correlation
correlation coefficient (r) measures the strength and direction of the association between 2 numerical variables
no implications of causality (no explanatory or response variables)
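a minimal R sketch (df, x, y are placeholder names):
cor.test(df$x, df$y)  # Pearson correlation: returns r, t, df, and a p-value for r ≠ 0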
regression
measures the functional/causal relationship between two variables
gives us a line of best fit (incorporates both correlation and slope)
regression - big slope but small correlation
when x changes, y changes a lot BUT pattern isn’t reliable (very noisy)
finding line of best fit for regression
found by minimizing the sum of squared deviations from the line (think: variance)
regression - confidence bands
measure the precision of the predicted mean Y for each value of X
regression - prediction intervals
measures the precision of a single predicted y-value for each X
what we report for regression
r², F-statistic, and p-value
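a minimal R sketch of fitting and reporting a regression (df, x, y are placeholder names):
fit <- lm(y ~ x, data = df)            # least-squares line of best fit
summary(fit)                           # reports r², F-statistic, df, and p-value
predict(fit, interval = "confidence")  # confidence band: precision of the predicted mean y
predict(fit, interval = "prediction")  # prediction interval: precision of a single predicted y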
t-tests and ANOVAs
both compare difference between groups, so predictor is categorical and each category is a different group
t-test
is our sample mean more than ~1.96 SE away from the parameter mean of the null distribution? if so, we would expect to see data like ours only about 5% of the time under the null
ANOVA
is the variability among the group means greater than the variability within each group?
F-distribution
null distribution of F-statistic, skewed right, all non-negative values (because variances can’t be negative)
ANOVA and F-distribution
if F-statistic from ANOVA falls in tail of F-distribution - reject null hypothesis
the F-statistic comes from comparing among-group to within-group variance to detect differences in group means
F > 1 = more variation among treatment groups than within groups
F = 1 means equal among and within group variance
what we report for t-test
t-statistic, p-value, and df
what we report for ANOVA
F-statistic, df, and p-value
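a minimal R sketch (df, response, group are placeholder names):
fit <- aov(response ~ group, data = df)  # one-way ANOVA
summary(fit)                             # F-statistic, df, and p-value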
when to use t-test or ANOVA
categorical predictor and numerical response
when to use χ²
categorical predictor and categorical response
goal of χ²
what is the observed ratio of members of each group? does it differ from expected ratio?
are the two categorical variables independent of each other?
χ² distribution
frequency distribution of χ² under a null hypothesis predicting expected categorical counts
not symmetric and right skewed
all non-negative values
what we report for χ²
χ², df, and p-value
parametric assumptions
random, independent sampling and sufficient sample size
normality
homoscedasticity
no outliers
requirements for χ² test
all expected counts greater than 1 and at least 80% greater than 5
testing for normality
visualization of residuals (histogram) or shapiro-wilk test
shapiro-wilk
a p-value greater than 0.05 means we fail to reject normality (the data are consistent with a normal distribution)
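a minimal R sketch, assuming a fitted model named fit:
hist(residuals(fit))          # visual check of residual normality
shapiro.test(residuals(fit))  # p > 0.05: no evidence against normality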
testing for homoscedasticity
look at the actual values of the raw data, or look at residuals vs fitted plots (the trend line should be approximately horizontal)
use levene’s test
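a minimal R sketch using the car package (df, response, group are placeholder names):
library(car)
leveneTest(response ~ group, data = df)  # p > 0.05: no evidence of unequal variances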
homoscedasticity
one group should not be much more variable than the others
power of log transformations
helps with right skew, outliers, and heteroscedasticity
why are transformations like log permitted
the transformation preserves the relationships in the data; it just rescales values to make things behave better
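a minimal R sketch of refitting a model on a log scale (placeholder names; assumes a strictly positive response):
fit_log <- lm(log(y) ~ x, data = df)  # same relationship, rescaled response
plot(fit_log)                         # recheck residual diagnostics after transforming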
non parametric tests
less powerful so you’re more likely to miss a real difference
much harder to interpret biologically
why it’s important to start with a thorough visualization of data
if you start out by only testing one hypothesis, you could get tunnel vision and miss finding something really important
data visualization opens your mind so you don’t miss important patterns, weird shapes, outliers, etc.
QQ plot
visualize normality
residuals vs fitted values
visualize heteroscedasticity
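a minimal R sketch, assuming a fitted lm/aov model named fit:
plot(fit, which = 2)  # Q-Q plot of residuals: normality check
plot(fit, which = 1)  # residuals vs fitted: heteroscedasticity check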
Post-ANOVA analysis for fixed effects
planned comparisons and unplanned comparisons
syntax for t-test
t.test(y ~ x, data = dataName)               # two-sample t-test (formula interface)
t.test(Data$Var1, Data$Var2, paired = TRUE)  # paired t-test
calculating expected for chi squared test
chisq.test(fish_table)$expected
syntax for chi squared test
chisq.test(fish_table, correct = FALSE)
planned comparisons
focus on a few scientifically sensible comparisons. You can't decide which comparisons to do after looking at the data; the choice must be based on the scientific questions you are asking and made when you design the experiment
unplanned comparisons
post hoc tests
testing differences between all pairs of group means while protecting against the inflated Type I error rate that would result from multiple comparisons
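a minimal R sketch of an unplanned (post hoc) comparison, assuming an aov fit named fit:
TukeyHSD(fit)  # all pairwise differences in group means, adjusted to control the family-wise Type I error rate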
Q-Q plot
plots the residuals against the theoretical quantiles of a normal distribution (normality check)
scale-location homoscedasticity
checks the assumption of equal variances
histogram of residuals
checks the assumption of normality by looking at the residuals from the fitted (least-squares) line
linearity
checks the relationship between fitted values and residuals
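these checks correspond to R's default lm/aov diagnostic plots (fit is a placeholder model name):
plot(fit)  # residuals vs fitted, Q-Q, scale-location, and residuals vs leverage, in turn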