biostats review

Last updated 12:33 PM on 5/5/26

85 Terms

New cards

R²

SSM/SST for bivariate regression

New cards

how to find p value for bivariate regression

1 - pf(f value, df model, df error)
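
As a rough check of the p-value card above, here is a small base R sketch. The data set is made up for illustration, and the variable names (f_value, df_model, df_error) are my own. The p-value is the upper-tail area of the F distribution, so it should match what anova() reports.

```r
# Small made-up data set (illustrative only)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

fit <- lm(y ~ x)
tab <- anova(fit)

f_value  <- tab$`F value`[1]  # F statistic for the model
df_model <- tab$Df[1]         # 1 for bivariate regression
df_error <- tab$Df[2]         # n - 2

# Upper-tail area: chance of an F at least this large if H0 is true
p_value <- pf(f_value, df_model, df_error, lower.tail = FALSE)

# Same number as 1 - pf(...) and as the p-value in the ANOVA table
all.equal(p_value, 1 - pf(f_value, df_model, df_error))
all.equal(p_value, tab$`Pr(>F)`[1])
```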

New cards

if R² is close to 1 then

model explains the data well

New cards

R² for bivariate regression measures

the proportion of variance in Y that is explained by the model

SSM/SST

New cards

correlation r for bivariate regression

measures the direct linear relationship between X and Y

found by taking sqrt(R²) and giving it the same sign as the slope

New cards

σ̂ₑ is

estimated standard deviation of the residuals from the line of best fit

sqrt(MSE)
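
The regression cards above (R², r, and the residual standard deviation) can be checked by hand in base R. This is only a sketch on a made-up data set. The names sst, ssm, sse, and sigma_hat are my own labels, not output from any particular package.

```r
# Small made-up data set (illustrative only)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
fit <- lm(y ~ x)

sst <- sum((y - mean(y))^2)            # total variation in y
ssm <- sum((fitted(fit) - mean(y))^2)  # variation explained by the model
sse <- sum(resid(fit)^2)               # leftover (residual) variation

r_squared <- ssm / sst                            # 0.6 for this data
r <- sign(coef(fit)[["x"]]) * sqrt(r_squared)     # sign follows the slope
sigma_hat <- sqrt(sse / df.residual(fit))         # sqrt(MSE)

# Compare with R's built-in values
all.equal(r_squared, summary(fit)$r.squared)
all.equal(r, cor(x, y))
all.equal(sigma_hat, summary(fit)$sigma)
```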

New cards

calculate z score without the sample size

(observed - null) / SD

New cards

when is a z score considered unusual

when the value lies more than 2 standard deviations above or below the mean (z greater than 2 or less than -2)

New cards

what is a pooled proportion and when to use it

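combines the counts of successes (x) and the sample sizes (n) from both groups: (x1 + x2) / (n1 + n2). use it when finding the SE for a hypothesis test on a difference in proportions with a normal distribution, because the null hypothesis says both groups share one proportion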
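
A minimal base R sketch of a pooled proportion and the SE it feeds into. The counts are hypothetical, and the SE formula used is the standard one for a two-proportion z test, which I am assuming is the version meant here.

```r
# Hypothetical counts: successes and sample sizes in two groups
x1 <- 30; n1 <- 100
x2 <- 45; n2 <- 100

p1_hat <- x1 / n1
p2_hat <- x2 / n2

# Pooled proportion: all successes over all observations
p_pooled <- (x1 + x2) / (n1 + n2)   # 0.375

# SE of the difference in proportions under H0: p1 = p2
se <- sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))  # about 0.068

z <- (p1_hat - p2_hat) / se         # about -2.19
p_value <- 2 * pnorm(-abs(z))       # two-sided, about 0.028
```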

New cards

Standard error

measures how much a sample statistic (like the mean) would vary from sample to sample.

MANY SAMPLES

New cards

SE gets smaller when

sample sizes are large

data is less variable

New cards

A standard error of 0.050 means

a difference in proportions of about ± 0.05 is what to expect from random chance variation alone

New cards

Standard deviation (SD)

measures spread of individual data points in a sample or population.

ONE SAMPLE

New cards

z score is used for

normal distributions

New cards

what does z score tell us

How unusual is this result if the null hypothesis were true

if close to 0, that means the result is very typical under the null

New cards

the p value is

how likely is it to see a test statistic this extreme (or more extreme) if the null hypothesis is true?

New cards

is bootstrap or null distributions used for SE

bootstrap! it estimates variability. it captures the uncertainty in our estimate

New cards

what are randomization distributions also called and what are they for

null distributions and hypothesis testing

New cards

the middle 95% of values fall within

mean ± 2*SE

New cards

what type of distribution is used to get p value

randomization or null distributions

New cards

a randomization spread tells us

what we would expect if there were no real difference in whatever is being measured

New cards

what to look for when comparing bootstrap and randomization distributions

whether the graphs have a similar shape and spread

bulk of data lies in the middle 95%

New cards

when p value is greater than the significance level

fail to reject the null hypothesis

New cards

when p value is smaller than the significance level

reject H0

New cards

when looking at residual plots you want to have

no patterns present in spacing and an even spread to the dots

New cards

in a residual plot if the dot is above the line it means

the model underestimated the observed value

New cards

in a residual plot if the dot is below the line it means

the model overestimated the observed value

New cards

residual is

observed value - predicted value

New cards

what can we learn from our residual plots

we can infer if the model is good or not

the model is not a good fit if the residual plot shows clear patterns or curves

New cards

what are the 4 steps to hypothesis testing

state the null and alternative hypothesis
calculate the test statistic
find the p value
draw a conclusion

New cards

middle line in a boxplot is the

median

New cards

the box length spreads from

Q1 to Q3

it represents the middle 50% of the data

New cards

if the whisker is longer on the right the distribution is

right-skewed

New cards

when multiple boxplots are side-by-side they are used to compare a

quantitative variable

New cards

a larger IQR means

The group is more spread out and is less consistent

New cards

mean compared to median

the mean is less resistant to outliers, so it is pulled in the direction of the skew

New cards

50% of the values in a boxplot fall in

the IQR

New cards

a curved distribution will have a boxplot with

a LARGE IQR because the middle 50% spreads across a sparse middle region

New cards

how would i estimate the p value given a dotplot

locate the observed test statistic, then count how many dots are at least as extreme as it

divide that number by the total number of dots

New cards

r command for a randomization distribution

do(1000) * diffmean(response variable ~ shuffle(explanatory variable), data = YOUR DATA)

diffmean could also be diffprop
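
For intuition, here is roughly what the shuffle command above is doing, sketched in base R so it runs without extra packages. The data and the helper diff_in_means() are hypothetical, and this is an approximation of the idea, not the package's actual implementation.

```r
set.seed(1)

# Hypothetical data: a quantitative response in two groups
dat <- data.frame(
  group = rep(c("A", "B"), each = 10),
  score = c(48, 52, 50, 47, 55, 51, 49, 53, 50, 46,
            54, 58, 53, 57, 52, 56, 59, 55, 51, 60)
)

# Difference in group means (B minus A)
diff_in_means <- function(response, group) {
  mean(response[group == "B"]) - mean(response[group == "A"])
}

observed <- diff_in_means(dat$score, dat$group)

# Shuffle the group labels 1000 times; each shuffle acts as if H0 were true
null_diffs <- replicate(1000, diff_in_means(dat$score, sample(dat$group)))

# Two-sided p-value: share of shuffled differences at least as extreme
p_value <- mean(abs(null_diffs) >= abs(observed))
```

The vector null_diffs plays the role of the randomization distribution, so it should be centered near 0.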

New cards

diffmean is used when the response variable for a randomization distribution is

quantitative

New cards

diffprop is used when the response variable for a randomization distribution is

categorical

New cards

to make a histogram of a randomization/null distribution use

gf_histogram(~ diffmean, data = YOUR RANDOMIZATION DATA)

diffmean could also be diffprop

New cards

How do you use a confidence interval to estimate the p-value for a hypothesis test?

look to see if the 95% confidence interval includes 0 (the null value)

if it does not include 0, the result is statistically significant at 0.05 (p-value less than 0.05)

New cards

what significance level does a 95% CI correspond to

0.05 (two-sided)

New cards

if the confidence interval includes 0 then

The p-value is greater than 0.05

New cards

how is critical value found with a t distribution

qt(1 - alpha/2, df = n-1), for example qt(0.975, df = n-1) for a 95% confidence interval
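
A short base R sketch of the critical value for a 95% confidence interval for one mean. The sample size, mean, and standard deviation are hypothetical. Note that qt() takes a cumulative probability, so a 95% interval uses 0.975, not 0.95.

```r
n <- 25             # hypothetical sample size
conf_level <- 0.95
alpha <- 1 - conf_level

# Critical value: leaves alpha/2 in each tail of the t distribution
t_star <- qt(1 - alpha / 2, df = n - 1)   # about 2.064

# Hypothetical sample summary
x_bar <- 72
s <- 10
se <- s / sqrt(n)                         # 2

ci <- x_bar + c(-1, 1) * t_star * se      # about 67.87 to 76.13
```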

New cards

degrees freedom for critical value computation for one mean

sample size -1

New cards

empirical rule

68 (1 sd), 95 (2 sd), 99.7 (3 sd)
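
The empirical rule percentages can be recovered from pnorm(). A minimal sketch, where area_within() is a helper name I made up:

```r
# Area within k standard deviations of the mean on a standard normal curve
area_within <- function(k) pnorm(k) - pnorm(-k)

pct <- round(100 * sapply(1:3, area_within), 1)
pct   # 68.3 95.4 99.7
```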

New cards

how to find percentages in a normal distribution when given specific values

pnorm(#, mean, sd)

New cards

how to find a specific value in a normal distribution when given percentile

qnorm(percentile, mean, sd)
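
A quick sketch of pnorm() and qnorm() working as inverses. The mean of 100 and sd of 15 are assumed values picked for illustration.

```r
mu <- 100     # assumed mean
sigma <- 15   # assumed standard deviation

# Value -> percentage: proportion at or below 115
below <- pnorm(115, mean = mu, sd = sigma)     # about 0.841
above <- 1 - below                             # about 0.159

# Percentile -> value: the 90th percentile
cutoff <- qnorm(0.90, mean = mu, sd = sigma)   # about 119.2

# qnorm undoes pnorm
pnorm(cutoff, mean = mu, sd = sigma)           # 0.9
```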

New cards

how to find p value when given SE

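(p̂ - p0) divided by the SE

gives the z score

then the z score is plugged into R

pnorm(z score) for a lower-tail test, 1 - pnorm(z score) for an upper-tail test, or 2 * pnorm(-abs(z score)) for a two-sided test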
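
A worked sketch of going from an SE to a p-value. The sample proportion, null value, and SE are hypothetical numbers. The tail you use depends on the direction of the alternative hypothesis.

```r
p_hat <- 0.60   # hypothetical sample proportion
p0    <- 0.50   # null value
se    <- 0.05   # given standard error

z <- (p_hat - p0) / se            # 2

p_lower <- pnorm(z)               # P(Z <= z), about 0.977
p_upper <- 1 - pnorm(z)           # P(Z >= z), about 0.023
p_two   <- 2 * pnorm(-abs(z))     # two-sided, about 0.046
```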

New cards

when to use one sided vs two sided

if the question points in a certain direction (greater than or less than), use one sided; otherwise use two sided

New cards

where is the randomization distribution centered?

at the value of the parameter specified in the null hypothesis

New cards

where is the bootstrap distribution centered?

at the observed sample statistic

New cards

how do i see if two events are independent?

P(A)P(B) = P(A and B)

New cards

when matching the boxplot with the ANOVA table look for

sum of squares residual

New cards

what does the sum of squares residual tell you for ANOVA

difference WITHIN groups

if large, it means the data points are more spread out within groups

larger IQR for a boxplot

New cards

A big F for ANOVA means

the F statistic compares the variance between groups to the variance within groups to see if the differences between group means are significant

a big F means at least one group mean is significantly different from the others

New cards

what does a large F value result in?

A low p-value

New cards

IQR is

Q3-Q1

qnorm(0.75, mean, sd) - qnorm(0.25, mean, sd)
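
A brief sketch of the IQR for a normal distribution. It uses qnorm() because Q1 and Q3 are values on the x-axis, not percentages. The mean and sd are assumed for illustration, and the last lines show the built-in IQR() on a made-up sample.

```r
mu <- 100     # assumed mean
sigma <- 15   # assumed standard deviation

q1 <- qnorm(0.25, mean = mu, sd = sigma)   # 25th percentile
q3 <- qnorm(0.75, mean = mu, sd = sigma)   # 75th percentile
iqr_normal <- q3 - q1                      # about 20.2

# For raw data, IQR() gives Q3 - Q1 directly
scores <- c(2, 4, 4, 5, 7, 9, 10, 12)      # made-up sample
iqr_sample <- IQR(scores)
```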

New cards

df for model for ANOVA

#groups - 1

New cards

df for residuals for ANOVA

#observations - # of groups

New cards

mean sq for groups for ANOVA

SSM/DFM

New cards

mean square for residuals ANOVA

SSE/DFE

New cards

f value for ANOVA

MSM/MSE (mean square for groups divided by mean square for residuals)

New cards

how is pvalue found for anova

1-pf(f value, df1 = ___, df2 = ___)
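
The ANOVA cards above (df, mean squares, F, and p-value) fit together as in this base R sketch. The three-group data set is made up so the arithmetic is easy to follow, and the variable names are my own.

```r
# Made-up data: 3 groups with 3 observations each
dat <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 3)),
  value = c(1, 2, 3,  2, 3, 4,  5, 6, 7)
)

grand_mean  <- mean(dat$value)
group_means <- ave(dat$value, dat$group)      # each row's own group mean

ssm <- sum((group_means - grand_mean)^2)      # between groups: 26
sse <- sum((dat$value - group_means)^2)       # within groups: 6

dfm <- nlevels(dat$group) - 1                 # groups - 1 = 2
dfe <- nrow(dat) - nlevels(dat$group)         # observations - groups = 6

msm <- ssm / dfm                              # 13
mse <- sse / dfe                              # 1
f_value <- msm / mse                          # 13

p_value <- 1 - pf(f_value, df1 = dfm, df2 = dfe)   # about 0.0066

# R's own table, for comparison
anova(lm(value ~ group, data = dat))
```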

New cards

bayes theorem

P(A|B)= P(B|A) * P(A) / P(B)

New cards

sensitivity means

positive given they have it

New cards

specificity means

negative given they dont have it

New cards

how to find the denominator for bayes theorem

P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
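
A worked sketch of Bayes' theorem for a diagnostic test, tying in sensitivity and specificity. The prevalence, sensitivity, and specificity values are hypothetical.

```r
prevalence  <- 0.01   # P(disease), hypothetical
sensitivity <- 0.95   # P(positive | disease), hypothetical
specificity <- 0.90   # P(negative | no disease), hypothetical

# Denominator: overall chance of a positive test
p_positive <- sensitivity * prevalence +
  (1 - specificity) * (1 - prevalence)         # 0.1085

# Bayes: P(disease | positive)
p_disease_given_pos <- sensitivity * prevalence / p_positive   # about 0.088
```

Even with a fairly accurate test, a positive result here means under a 9% chance of disease, because the condition is rare.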

New cards

what does one dot on a bootstrap distribution represent

the statistic from 1 bootstrap sample

New cards

to approximate a confidence interval using a dot plot of a bootstrap you

count the total number of dots

find how many dots fall outside the CI (for a 95% CI, 2.5% of the dots in each tail) and count that many in from each end; the values left in the middle are the interval

New cards

how to find p value for a hypothesis test

find observed difference

Calculate the test statistic, either t (means) or z (proportions)

then pt() or pnorm()

New cards

how to make a bootstrap distribution with R

do(1000) * diffmean(response variable ~ explanatory variable, data = resample(____))
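
For comparison, here is a base R sketch of a bootstrap distribution for a difference in means. The data are made up. As a simplifying assumption, it resamples with replacement within each group so both groups always keep their sample size, which may differ from how a particular package resamples.

```r
set.seed(42)

# Made-up scores for two groups
a <- c(48, 52, 50, 47, 55, 51, 49, 53, 50, 46)
b <- c(54, 58, 53, 57, 52, 56, 59, 55, 51, 60)

observed <- mean(b) - mean(a)   # 5.4

# Each bootstrap sample: resample WITH replacement, then recompute the statistic
boot_diffs <- replicate(1000, {
  mean(sample(b, replace = TRUE)) - mean(sample(a, replace = TRUE))
})

se_boot <- sd(boot_diffs)                         # bootstrap standard error
ci_95 <- quantile(boot_diffs, c(0.025, 0.975))    # middle 95% of the dots
```

The distribution should be centered near the observed statistic, and the SE comes from its spread.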

New cards

Q₁

(the first quartile) corresponds to the 25th percentile, or the value at which 25% of the data lies at or below this value.

New cards

Median in a boxplot

corresponds to the 50th percentile or the middle value, or the value at which 50% of the data lies at or below this value.

New cards

Q₃

(the third quartile) corresponds to the 75th percentile, or the value at which 75% of the data lies at or below this value.

New cards

for a normal distribution the mean and median are

the same

New cards

z score for a normal distribution

z = (x - mean) / sd

New cards

central limit theorem tells us

the sampling distribution of the sample mean will be approximately normal when the sample size is large.
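
A small simulation sketch of the central limit theorem in base R. The choice of an exponential population (which is strongly right-skewed), the sample size of 50, and the 2000 repetitions are all assumptions made for illustration.

```r
set.seed(7)

n <- 50   # size of each sample

# Take 2000 samples from a skewed population and keep each sample mean
sample_means <- replicate(2000, mean(rexp(n, rate = 1)))

center <- mean(sample_means)   # close to the population mean, 1
spread <- sd(sample_means)     # close to 1 / sqrt(50), about 0.141
```

Plotting a histogram of sample_means should show a roughly bell-shaped curve even though the population itself is skewed.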

New cards

if you know a population sd use

normal (Z) distribution

New cards

type one error

rejecting a true null hypothesis

New cards

level of significance is

probability of a Type 1 error. It is the probability of rejecting the null hypothesis when the null hypothesis is true.