OPT 323 Biostatistics

80 Terms

New cards

What is a population?

collection of people that we want to generalize a set of findings to

New cards

What is a sample?

small part of population that we study to determine the generalities we are interested in = should be representative of the population

New cards

What is a discrete numeric variable?

numeric variable that can only take on certain discrete values with gaps or interruptions in the values that the variable can assume (usually integers)

ex) count data of patients

New cards

What is a continuous numeric variable?

numeric variable that can technically be measured with unlimited precision with NO gaps in values that the variable could assume

ex) BP, IOP

New cards

What is an ordered categorical variable?

categorical variable with a “value” that can be organized in a logical order, sequence or rank

ex) level of physical fitness

New cards

What is an unordered categorical variable?

categorical variable with a “value” that is NOT able to be organized in a logical order, sequence or rank

ex) eye colour

New cards

What is a dichotomous variable?

variable that consists of only 2 categories

New cards

Ex) Defining a cataract as either nuclear, cortical, or posterior subcapsular is an example of what type of variable?

categorical, unordered

New cards

Ex) Defining a cataract as either 1+, 2+, 3+, or 4+ is an example of what type of variable?

categorical, ordered (ordinal) = the grades are ranked labels, not true numeric measurements

New cards

What is an independent variable?

variable that is manipulated by the experimenter and that does not depend on any other variables = “predictor variable” = X axis

New cards

What is a dependent variable?

variable that is not manipulated by the experimenter and that does depend on the other variables = "outcome variable" = Y axis

New cards

What is the mean?

average of all values in a data set

New cards

What is the median?

middle value in a data set once the values are sorted (average of the 2 middle values if the count is even)

New cards

What is the mode?

most commonly occurring value in a data set

New cards

What is variance?

reflects how different each data point is from the mean = average of the squared differences from the mean

New cards

What is standard deviation?

square root of the variance
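
The summary statistics on the cards above can be checked with a short sketch. It uses Python's standard `statistics` module; the IOP readings are hypothetical values invented for illustration, and `statistics.variance` computes the sample variance (dividing by n - 1), which may differ from the formula used in the course.

```python
import statistics

# Hypothetical IOP readings in mmHg, invented for illustration only.
iop = [14, 16, 16, 18, 21]

mean = statistics.mean(iop)          # sum of values / number of values
median = statistics.median(iop)      # middle value after sorting
mode = statistics.mode(iop)          # most commonly occurring value
variance = statistics.variance(iop)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(iop)      # square root of the sample variance

print(mean, median, mode, variance, std_dev)
```

Squaring `std_dev` gives back `variance`, which is the relationship the standard deviation card describes.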

New cards

What % of data falls within 1 standard deviation of the mean?

about 68% (for normally distributed data)

New cards

What % of data falls within 2 standard deviations of the mean?

about 95% (for normally distributed data)

New cards

What % of data falls within 3 standard deviations of the mean?

about 99.7% (for normally distributed data)
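
These three percentages come from the normal distribution, and a minimal sketch can reproduce them. It uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```

The printed values land close to 68%, 95% and 99.7%; the rule does not hold for data that is not approximately normal.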

New cards

What is a normal distribution? What 2 things do we need to define it?

symmetric bell curve distribution of data defined solely by the mean and standard deviation

New cards

When might we use a t distribution?

useful when the sample size is small (esp < 30) and the population standard deviation is unknown, so the normal distribution is a poor fit = shape of the distribution changes based on the degrees of freedom

New cards

The smaller the degrees of freedom, the __________ the peak and the ___________ the tails of a t distribution.

smaller degrees of freedom:

lower peak

heavier (higher) tails

New cards

What is a null hypothesis?

H0 = there is no statistical difference between the 2 groups

New cards

What is an alternative hypothesis?

Ha = there is a statistical difference between the 2 groups

New cards

What is the p-value?

probability of observing data at least as extreme as the data we collected, given that the null hypothesis is true

New cards

What does it mean if we have a large p-value that is larger than our pre-set alpha value?

we do not have enough evidence to reject the null hypothesis = we fail to reject the null (this does not prove the null is true)

New cards

What does it mean if we have a small p-value that is smaller than our pre-set alpha value?

we do have evidence to reject the null hypothesis = supports the alternative

New cards

Essentially, a smaller p-value indicates that there is ___________ support for our alternative hypothesis.

stronger

New cards

What is an independent t-test?

determines whether the 2 means collected from 2 independent sample groups are significantly different

ex) 1 group receives drug, 1 group receives placebo

New cards

What is a dependent t-test?

determines whether the 2 means collected from 1 dependent sample group are significantly different

ex) 2 measurements conducted on the same person at different times
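
The difference between the two tests shows up in how the t statistic is built. The sketch below computes both by hand with the standard library. The function names and all measurements are hypothetical, and the independent version assumes equal variances in the two groups (the pooled-variance form), which may not be the variant used in the course. Turning a t statistic into a p-value requires the t distribution with the returned degrees of freedom, which is not done here.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for 2 independent groups."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / df
    std_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / std_error, df

def dependent_t(first, second):
    """Paired t statistic and degrees of freedom for 2 measurements on the same subjects."""
    diffs = [x - y for x, y in zip(first, second)]
    std_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_error, len(diffs) - 1

# Hypothetical IOP values (mmHg), invented for illustration only.
drug, placebo = [15, 16, 14, 17, 18], [19, 21, 20, 22, 18]
before, after = [22, 24, 21, 25, 23], [19, 22, 20, 21, 21]

t_ind, df_ind = independent_t(drug, placebo)
t_dep, df_dep = dependent_t(before, after)
print(t_ind, df_ind, t_dep, df_dep)
```

The dependent test works on the per-person differences, so it has fewer degrees of freedom (n - 1) than the independent test on the same number of readings.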

New cards

How do we determine cumulative incidence from a contingency table?

= exposed people with disease / total exposed people with and without disease

can also do with unexposed

New cards

How do we determine relative risk from a contingency table?

cohort study:

= cumulative incidence in exposed / cumulative incidence in unexposed

New cards

How do we determine odds from a contingency table?

= exposed people with disease / exposed people without disease

can also do with unexposed

New cards

How do we determine odds ratio from a contingency table?

case control study:

= odds in exposed / odds in unexposed

New cards

Does the relative risk or odds ratio always overestimate the risk?

odds ratio = it lies farther from 1 than the relative risk, and the gap is largest when the outcome/disease is common (the two are close when the outcome is rare)

New cards

Ex) from this data of Yellow Fever in Memphis, what is the cumulative incidence in the exposed?

= exposed people with disease / total exposed people with and without disease

= 4204 / 6000

= 70%

New cards

Ex) from this data of Yellow Fever in Memphis, what is the cumulative incidence in the unexposed?

= unexposed people with disease / total unexposed people with and without disease

= 946 / 14,000

= 6.8%

New cards

Ex) from this data of Yellow Fever in Memphis, what is the relative risk for the outcome?

= cumulative incidence in exposed / cumulative incidence in unexposed

= 70% / 6.8%

= 10.3x

New cards

Ex) from this data of Yellow Fever in Memphis, what is the odds in the exposed?

= exposed people with disease / exposed people without disease

= 4204 / 1796

= 2.34

New cards

Ex) from this data of Yellow Fever in Memphis, what is the odds in the unexposed?

= unexposed people with disease / unexposed people without disease

= 946 / 13,054

= 0.072

New cards

Ex) from this data of Yellow Fever in Memphis, what is the odds ratio?

= odds in exposed / odds in unexposed

= (4204 / 1796) / (946 / 13,054)

= 32.30
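
The Yellow Fever calculations can be checked with a short sketch. The counts come from the cards (4204 of 6000 exposed and 946 of 14,000 unexposed with disease); the function name and dictionary keys are assumptions made for this example.

```python
def two_by_two_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Cumulative incidence, relative risk, odds and odds ratio from a 2x2 table."""
    exposed_healthy = exposed_total - exposed_cases        # 6000 - 4204 = 1796
    unexposed_healthy = unexposed_total - unexposed_cases  # 14000 - 946 = 13054
    ci_exposed = exposed_cases / exposed_total
    ci_unexposed = unexposed_cases / unexposed_total
    odds_exposed = exposed_cases / exposed_healthy
    odds_unexposed = unexposed_cases / unexposed_healthy
    return {
        "ci_exposed": ci_exposed,
        "ci_unexposed": ci_unexposed,
        "relative_risk": ci_exposed / ci_unexposed,
        "odds_exposed": odds_exposed,
        "odds_unexposed": odds_unexposed,
        "odds_ratio": odds_exposed / odds_unexposed,
    }

yellow_fever = two_by_two_measures(4204, 6000, 946, 14000)
for name, value in yellow_fever.items():
    print(name, round(value, 3))
```

Because the code divides unrounded proportions, the relative risk comes out slightly above the 10.3 obtained from the rounded 70% and 6.8%. The odds ratio is about three times larger than the relative risk here, which is what to expect when the outcome is this common among the exposed.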

New cards

Ex) from this data of SCO Honors students, what is the cumulative incidence in the exposed?

41.9%

New cards

Ex) from this data of SCO Honors students, what is the cumulative incidence in the unexposed?

29.9%

New cards

Ex) from this data of SCO Honors students, what is the relative risk?

1.40

New cards

Ex) from this data of SCO Honors students, what is the odds in the exposed?

0.720

New cards

Ex) from this data of SCO Honors students, what is the odds in the unexposed?

0.426

New cards

Ex) from this data of SCO Honors students, what is the odds ratio?

1.69

New cards

What is the Chi-squared test for independence?

tests the association between 2 categorical variables using a p-value to assess H0 (no association) and Ha (association)
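
For a 2x2 table the test can be sketched by hand. The example below compares observed counts with the counts expected if the two variables were independent. The function name and the counts are hypothetical, no continuity correction is applied, and the p-value shortcut with `math.erfc` is valid only for 1 degree of freedom (a 2x2 table).

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Chi-squared statistic and p-value for a 2x2 table [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if H0 (no association) were true.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-squared > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows = exposed / unexposed, columns = disease / no disease.
chi2, p_value = chi_squared_2x2(30, 20, 15, 35)
print(round(chi2, 2), round(p_value, 4))
```

A p-value below the pre-set alpha leads us to reject H0 and conclude that the two variables are associated.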

New cards

What is a type I error?

rejecting the null hypothesis when it is actually true = false positive = worst kind of error!

ex) convicting someone of a crime they did not commit

ex) approving an IOP drop as working "better" than timolol when it is not

New cards

What is a type II error?

failing to reject the null hypothesis when it is actually false (i.e. the alternative is true) = false negative

ex) a guilty person is set free

ex) not approving an IOP drop even though it actually is "better" than timolol

New cards

Ex) what is the number of false positives for this data of SCO Honors students?

New cards

Ex) what is the number of false negatives for this data of SCO Honors students?

New cards

What is sensitivity?

proportion of subjects with the disease who have a positive test result = how good the test is at detecting true positives out of all people with disease

New cards

How do we calculate sensitivity?

= # true positives / all people with disease

New cards

What is specificity?

proportion of subjects without the disease who have a negative test result = how good the test is at detecting true negatives out of all people without disease

New cards

How do we calculate specificity?

= # true negatives / all people without disease

New cards

What is positive predictive value?

proportion of subjects who test positive that actually have the condition = how good the test is at detecting people with disease out of all people who test positive

New cards

How do we calculate positive predictive value?

= # true positives / all people who test positive
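
The three formulas differ only in their denominators, which a small sketch makes explicit. The function names and the counts are hypothetical and are not taken from the course data set.

```python
def sensitivity(true_pos, false_neg):
    """True positives out of all people with disease."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """True negatives out of all people without disease."""
    return true_neg / (true_neg + false_pos)

def positive_predictive_value(true_pos, false_pos):
    """True positives out of all people who test positive."""
    return true_pos / (true_pos + false_pos)

# Hypothetical screening results, invented for illustration only.
tp, fn, tn, fp = 90, 10, 160, 40

print(sensitivity(tp, fn))                # 90 / 100
print(specificity(tn, fp))                # 160 / 200
print(positive_predictive_value(tp, fp))  # 90 / 130
```

Sensitivity and specificity are calculated down the disease columns, while positive predictive value is calculated across the test-positive row.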

New cards

Ex) from this data set, what is the sensitivity?

64.3%

New cards

Ex) from this data set, what is the specificity?

83.3%

New cards

Ex) from this data set, what is the positive predictive value?

90%

New cards

What is a correlation coefficient (r)?

used to assess the strength and direction of the linear relationship between 2 continuous variables

New cards

What does the + or - mean for correlation coefficient (r)?

+ means positive correlation

- means negative correlation

THINK: reflects the slope of the line

New cards

What is the range of values for correlation coefficient (r)?

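r ranges from -1.0 to +1.0; the cutoffs below apply to the size of r regardless of sign (|r|)

|r| = 1.0 perfectly correlated

|r| ≥ 0.8 strong correlation

|r| < 0.8 but ≥ 0.5 fairly strong correlation

|r| < 0.5 weak correlation

r = 0.0 no correlation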
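
A minimal sketch of how r can be computed and read against these cutoffs, assuming the Pearson correlation coefficient is the one intended. The function names are assumptions made for this example, and the cutoffs are applied to the size of r while the sign gives the direction.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)

def describe(r):
    """Return (strength, direction) using the cutoffs from the card."""
    size = abs(r)
    if size >= 0.8:
        strength = "strong"
    elif size >= 0.5:
        strength = "fairly strong"
    elif size > 0:
        strength = "weak"
    else:
        strength = "none"
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return strength, direction

print(describe(pearson_r([1, 2, 3, 4], [8, 6, 4, 2])))
```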

New cards

What is simple linear regression?

linear model where one outcome is predicted from one predictor variable with a best-fit line

New cards

What is the formula for simple linear regression?

y = mx + b

where y is the dependent variable, m is the slope, x is the independent variable, and b is the y intercept
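
The slope m and intercept b of the best-fit line can be estimated from data. The sketch below uses ordinary least squares, which is the usual meaning of "best-fit" but is an assumption here; the function name and data points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = mx + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Slope: how x and y vary together, relative to how much x varies.
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: forces the line through the point (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b

# Hypothetical data that lie exactly on y = 2x + 1.
m, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, b)
```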

New cards

What is multiple regression?

linear model where one outcome is predicted from two or more predictor variables

New cards

What is the formula for multiple regression?

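y = b + β1x1 + β2x2 + ... + βnxn

where y is the dependent variable, b is the constant (y intercept), and each β is the beta coefficient for its own independent variable x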

New cards

What is the constant in multiple regression?

y-intercept = value of the dependent variable in a regression equation when all of its independent variable(s) equal 0

New cards

What is the beta coefficient in multiple regression?

degree of change in the dependent variable for every 1-unit change in a certain independent variable

ex) if beta is 0.2, then for every one unit increase in x there is a 0.2 increase in y
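
The meaning of the constant and the beta coefficients can be seen by evaluating a multiple regression equation directly. This is a sketch only: the function name, the constant and the coefficients are hypothetical values, not fitted results.

```python
def predict(constant, betas, xs):
    """y = constant + sum of (beta coefficient x independent variable)."""
    return constant + sum(beta * x for beta, x in zip(betas, xs))

constant = 1.0      # hypothetical y-intercept
betas = [0.2, 0.5]  # hypothetical beta coefficients for x1 and x2

baseline = predict(constant, betas, [0, 0])      # all predictors at 0
one_more_x1 = predict(constant, betas, [1, 0])   # x1 raised by 1 unit, x2 held fixed

print(baseline, one_more_x1 - baseline)
```

With every predictor at 0 the prediction equals the constant, and raising x1 by one unit while holding x2 fixed changes y by the beta of x1 (0.2 here, up to floating-point rounding).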

New cards

What is the coefficient p-value in multiple regression?

tells us whether or not an independent variable is statistically significant

ex) if the p-value is less than the cutoff (alpha), the independent variable is statistically significant

New cards

What is the standard error in multiple regression?

another way to tell us how well the linear regression line fits the data = average distance that the observed values fall from the regression line

New cards

Does a smaller or larger standard error indicate that the model is better able to fit the data?

smaller SE

New cards

What is the R-squared value in multiple regression?

tells us how well the linear regression line "fits" the data = the proportion of the variance in the dependent variable that can be explained by the independent variables

ex) an R² value of 0.16 tells us that only 16% of the variance in the dependent variable can be explained by the independent variables

New cards

What is the possible range of values of the R-squared value in multiple regression?

0 to 1

0 indicates that the response variable is not explained by the predictor variable at all

1 indicates that the response variable is completely explained by the predictor variable w/o error
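
The two ends of this range can be reproduced with a short sketch. It assumes the common definition of R-squared as 1 minus the ratio of unexplained to total variation; the function name and data are hypothetical.

```python
def r_squared(observed, predicted):
    """Proportion of variance in the observed values explained by the predictions."""
    mean_obs = sum(observed) / len(observed)
    unexplained = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - unexplained / total

observed = [2, 4, 6, 8]  # hypothetical outcome values

print(r_squared(observed, [2, 4, 6, 8]))  # predictions match the data exactly
print(r_squared(observed, [5, 5, 5, 5]))  # predictions are just the mean
print(r_squared(observed, [3, 4, 6, 7]))  # predictions are close but imperfect
```

A model that predicts only the mean explains none of the variance, while a model whose predictions match every observation explains all of it.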

New cards

What is logistic regression?

a type of regression used when the outcome is a categorical variable, usually dichotomous (e.g. outcome of seeing an optometrist or not, having disease or not) = since the relationship between x and the outcome is not linear, we model the log of the odds of the outcome as a linear function of the predictors

New cards

How do we typically use logistic regression?

use the fitted formula to calculate the probability that an observation takes on a value of 1 = then use a predetermined probability threshold (e.g. 0.5) to classify the observation as 1 or 0
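
The two steps, probability first and classification second, can be sketched as follows. The function names, the constant and the coefficient are hypothetical rather than fitted from data, and the 0.5 threshold is only a common default.

```python
import math

def predicted_probability(constant, betas, xs):
    """Convert the linear predictor (log odds) into a probability between 0 and 1."""
    log_odds = constant + sum(beta * x for beta, x in zip(betas, xs))
    return 1 / (1 + math.exp(-log_odds))

def classify(probability, threshold=0.5):
    """Label the observation 1 if its probability reaches the threshold, else 0."""
    return 1 if probability >= threshold else 0

# Hypothetical model with one predictor: log odds = -3.0 + 0.1 * x.
p = predicted_probability(-3.0, [0.1], [40])
print(round(p, 3), classify(p))
```

Raising the threshold makes the model more cautious about labelling an observation as 1, and lowering it does the opposite.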