Biostats (Cheng)

Descriptive Statistics

small population
can collect all data
cannot use to make conclusions beyond the data

Inferential Statistics

population is large (cannot collect all population data)
can only collect sample of population
can use sample data to make inferences about a population
focus on making predictions/generalizations about a larger dataset based on a sample

Estimation

inference about 1 group
can be point estimate or interval estimate
can estimate a proportion or mean

Comparison

inference about 2 or more groups

Correlation

relationship between 2 variables

Point estimate

single value
ex. the sample mean is used as the single best estimate of the population mean

Interval Estimate

defined by two numbers between which a population parameter is said to lie

Confidence interval

range of values that expresses how sure one can be about an estimate
expressed as a percentage (most commonly 95%)
as confidence level (percentage) increases, the confidence interval widens
as sample size decreases, the confidence interval widens
represents confidence that the population parameter lies within the interval
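
A minimal sketch of the two widening rules above, assuming a normal-approximation (Wald) interval for a proportion. The function names and the counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def width(interval):
    low, high = interval
    return high - low


ci_95 = proportion_ci(40, 100, level=0.95)       # baseline
ci_99 = proportion_ci(40, 100, level=0.99)       # higher confidence level
ci_small_n = proportion_ci(10, 25, level=0.95)   # same proportion, smaller sample

print(f"95%, n=100: width {width(ci_95):.3f}")
print(f"99%, n=100: width {width(ci_99):.3f}")
print(f"95%, n=25:  width {width(ci_small_n):.3f}")
```

Both the 99% interval and the n=25 interval come out wider than the baseline.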

Prevalence

aka prevalence proportion
proportion of a population found to have a condition
- includes ALL cases (new and pre-existing)
(number of subjects with disease)/(total population)
usually expressed as fraction, percentage, or number of cases per 10,000 or 100,000 people

Incidence rate

rate of new cases of a disease occurring in a specific population over a particular period of time
- limited to NEW cases only
(number of NEW cases during a specified time period)/(person years at risk during the same time period)
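
The prevalence and incidence rate formulas can be sketched as below. The counts are invented for illustration and the function names are hypothetical.

```python
def prevalence(existing_cases, population):
    """All cases (new and pre-existing) divided by the total population."""
    return existing_cases / population


def incidence_rate(new_cases, person_years_at_risk):
    """New cases only, divided by person-years at risk in the same period."""
    return new_cases / person_years_at_risk


prev = prevalence(existing_cases=150, population=10_000)
rate = incidence_rate(new_cases=12, person_years_at_risk=4_000)

print(f"prevalence: {prev:.1%} ({prev * 100_000:.0f} per 100,000)")
print(f"incidence rate: {rate * 1_000:.1f} per 1,000 person-years")
```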

Nominal data

categorical, unranked data
ex. gender, eye color, surgical outcome, blood type
when only 2 possible categories: dichotomous, binary, binomial

Ordinal Data

variables with an inherent order to the relationship among the different categories
implied ordering of the categories with unknown quantitative distance
distances between the levels may not be the same
meaning of different levels may not be the same for different individuals
utilizes numbers to indicate rank/order, but numerical values do not hold mathematical significance
ex. stages of cancer, education level, pain level, satisfaction level, agreement level

Unpaired samples

two independent groups (different subjects in each group)

sample size may be different

Paired samples

Same subjects measured under each condition or treatment

Same sample size

Can be same people measured at different times or asked about same products

Steps for group comparison

Check data type
Check dependence (paired or unpaired)

Unpaired Nominal Data

Chi-squared test
- all expected cell counts at least 5 (large sample size)
Fisher’s exact test
- any expected cell count below 5 (small sample size)
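
For small counts, a two-sided Fisher's exact p-value for a 2x2 table can be sketched with the standard library alone. This assumes the usual convention of summing every table that is no more probable than the observed one. The function name and the example counts are illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the tables that are as likely as, or less likely than, the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```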

Paired Nominal Data

McNemar’s Test
Kappa statistics
- measure of agreement
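
A rough sketch of McNemar's test, assuming the standard form without continuity correction, which uses only the discordant pairs. The names `b` and `c` and the counts are illustrative.

```python
from math import erfc, sqrt


def mcnemar(b, c):
    """McNemar's chi-squared statistic and p-value (1 degree of freedom).

    b and c are the two discordant counts of the paired 2x2 table,
    i.e. pairs where the two measurements disagree.
    """
    statistic = (b - c) ** 2 / (b + c)
    # Upper-tail probability of a chi-squared variable with 1 df.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


stat, p = mcnemar(b=15, c=5)
print(f"chi-squared = {stat:.2f}, p = {p:.4f}")
```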

Unpaired ordinal data

Mann-Whitney U Test

aka Wilcoxon rank-sum test (Wilcoxon two-sample test)

Paired ordinal data

Wilcoxon signed-rank test

Unpaired continuous data

unpaired t-test

Paired continuous data

paired t-test
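
The two-step choice (data type, then paired or unpaired) across the six cards above can be summarized as a lookup table. This is only a study aid; `TESTS` and `pick_test` are hypothetical names.

```python
# (data type, paired?) -> test named on the corresponding card
TESTS = {
    ("nominal", False): "Chi-squared test (Fisher's exact test for small counts)",
    ("nominal", True): "McNemar's test",
    ("ordinal", False): "Mann-Whitney U test",
    ("ordinal", True): "Wilcoxon signed-rank test",
    ("continuous", False): "unpaired t-test",
    ("continuous", True): "paired t-test",
}


def pick_test(data_type, paired):
    """Step 1: check data type. Step 2: check dependence (paired or unpaired)."""
    return TESTS[(data_type.lower(), paired)]


print(pick_test("ordinal", paired=False))
print(pick_test("continuous", paired=True))
```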

Contingency table

table of observed data for categorical data

Expected table

table of expected data for categorical data (if no difference between groups)
for first row, first column = (first row margin)(first column margin)/(total)
has same margin and grand total values as contingency table
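
A sketch of building the expected table from the margins and then computing the chi-squared statistic from observed and expected counts. It assumes no continuity correction, and the p-value shortcut holds only for a 2x2 table (1 degree of freedom). The counts are made up.

```python
from math import erfc, sqrt


def expected_table(observed):
    """Expected count per cell = (row margin)(column margin)/(total)."""
    row_margins = [sum(row) for row in observed]
    col_margins = [sum(col) for col in zip(*observed)]
    total = sum(row_margins)
    return [[r * c / total for c in col_margins] for r in row_margins]


def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_table(observed)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))


observed = [[30, 20], [15, 35]]
expected = expected_table(observed)
chi2 = chi_squared_statistic(observed)
p_value = erfc(sqrt(chi2 / 2))  # valid for 1 degree of freedom only

# Rule from the cards: chi-squared is appropriate when expected counts are >= 5.
all_expected_at_least_5 = min(min(row) for row in expected) >= 5

print(expected)
print(f"chi-squared = {chi2:.3f}, p = {p_value:.4f}")
```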

Odds ratio

odds: P/(1-P)
odds ratio: odds/odds
cross-product method: ad/bc

	Yes (disease)	No (disease)
Yes (risk factor)	a	b
No (risk factor)	c	d

OR = 1 means no association between outcome and exposure
OR >1 means exposure associated with increased risk for outcome
- harmful effect
OR <1 means exposure is associated with reduced risk for outcome
- protective effect
consider confidence interval (if it contains 1, not statistically significant)
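
A minimal sketch of the cross-product odds ratio with a 95% confidence interval. The interval uses the common log (Woolf) method, which is an assumption here since the card does not say how the interval is computed. The counts a, b, c, d are invented.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with a confidence interval built on the log scale."""
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = exp(log(odds_ratio) - z * se_log)
    high = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, low, high


odds_ratio, low, high = odds_ratio_ci(a=30, b=20, c=15, d=35)
# Not statistically significant if the interval contains 1.
significant = not (low <= 1 <= high)

print(f"OR = {odds_ratio:.2f}, 95% CI {low:.2f} to {high:.2f}, significant: {significant}")
```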

Accuracy

number of correct diagnoses (true positives + true negatives) divided by total number of subjects tested

Sensitivity

used for paired nominal data
measures performance of binary classification test
true positive rate
measures proportion of actual positives which are correctly identified
how good a test is at finding actual positives
- complementary to false negative rate
- used for diagnosis
(actual positives identified)/(actual positives)

Specificity

true negative rate
measures performance of binary classification test
proportion of negatives which are correctly identified
- complementary to false positive rate
- used for diagnosis
(actual negatives identified)/(actual negatives)

Positive predictive value (PPV)

(number of true positives)/(number of positive calls)
- number of positive calls = number of true positives + number of false positives
the chance that a person with a positive test truly has the disease
- used for patient knowledge

Negative predictive value (NPV)

probability that a subject with a negative screening test really does not have the disease
- used for patient knowledge
(number of true negatives)/(number of negative calls)
- number of negative calls = number of true negatives + number of false negatives
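
Accuracy, sensitivity, specificity, PPV, and NPV all come from the same four counts, which the sketch below makes explicit. The counts are invented, and `diagnostic_metrics` is an illustrative name.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Performance measures of a binary test against the true disease status."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # true positives / positive calls
        "npv": tn / (tn + fn),           # true negatives / negative calls
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=170)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```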

Kappa statistics

statistical measure of inter-rater agreement
- agreement: both raters have same outcome
for paired nominal data

Kappa statistic strengths of agreement

Poor ≤0.20
Fair 0.21-0.40
Moderate 0.41-0.60
Good 0.61-0.80
Very Good 0.81-1.00
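
A sketch of Cohen's kappa for two raters, mapped to the strength bands above. It assumes the usual definition (observed agreement corrected for chance agreement). The ratings table is made up.

```python
def cohens_kappa(table):
    """Cohen's kappa from a square table: rows = rater 1, columns = rater 2."""
    n = sum(sum(row) for row in table)
    # Observed agreement: both raters gave the same outcome (diagonal cells).
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance, from the margins.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


def strength(kappa):
    """Strength-of-agreement label using the bands on the card."""
    if kappa <= 0.20:
        return "Poor"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Good"
    return "Very Good"


kappa = cohens_kappa([[20, 5], [10, 15]])
print(f"kappa = {kappa:.2f} ({strength(kappa)})")
```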

T-tests

assess whether the means of two groups are statistically different from each other

Non-parametric tests

distribution free test
does not assume anything about the underlying distribution
ex. Chi-squared test, Fisher’s exact test, McNemar’s test, Mann-Whitney U test, and Wilcoxon signed-rank test

Parametric test

makes assumptions about a population’s parameters
usually means tests like t-test or ANOVA
- assume the population data has a normal distribution

Tests that check normal distribution

QQ plot
Shapiro Wilk Test

QQ plot

quantile-quantile plot
plots the quantiles of the data against the quantiles expected under a normal distribution
for normally distributed data, observations should lie approximately on a straight line
possible outliers are points at the ends that fall far from the line

Shapiro Wilk Test

test of normality in frequentist statistics
null hypothesis: population is normally distributed
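if P < 0.05, reject the null hypothesis: data are not normally distributed
- nonparametric test should be used
if P > 0.05, no evidence against normality: data can be treated as normally distributed
- t test can be used
reported to have the best power for a given significance level among common normality tests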

Unpaired t-test

two sample t test
applied to 2 independent groups (different people in 2 different groups)
sample size may be unequal in each group

Paired t-test

equivalent to a one sample t test on the within-pair differences
measures whether means from a within subjects test group vary over 2 test conditions (same people in same group)
equal sample size
takes into account the fact that pairs of subjects go together
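
A sketch of how pairing enters the calculation: the paired t statistic is a one-sample t statistic computed on the within-pair differences. The measurements are invented. Only the statistic and degrees of freedom are computed, because the p-value needs the t distribution, which is not in the Python standard library.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """t statistic and degrees of freedom for paired measurements."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal sample size")
    # Each subject contributes one difference, so pairs stay together.
    diffs = [s - f for f, s in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


before = [120, 132, 118, 140, 125]
after = [115, 128, 119, 131, 120]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.2f}, df = {df}")
```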

One-tailed t-test

first mean expected to be larger than the second or first mean expected to be smaller than the second
expect the effect to be in a certain direction

2-tailed t-test

first mean expected to be different from the second in EITHER direction
used when looking for any difference between samples

Test of equal variance (F-Test)

used to test if the variances of 2 populations are equal
- F = ratio of the sample variances of the two groups
if variances are equal, F ≈ 1
- P>0.05
- use unpaired t test
the more ratio deviates from 1, the stronger the evidence for unequal population variances
- P<0.05
- use Welch’s unpaired t-test
used for unpaired data
Excel: FTEST(array1, array2)
- returns 2 tailed probability that the variances in array1 and array2 are not significantly different
should check normality before using
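
A sketch of the two quantities involved: the F ratio of sample variances, and Welch's t statistic with its Welch-Satterthwaite degrees of freedom for the unequal-variance case. The data are made up. P-values are omitted because the F and t distributions are not in the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def f_ratio(x, y):
    """Ratio of the two sample variances; close to 1 when variances are equal."""
    return variance(x) / variance(y)


def welch_t(x, y):
    """Welch's unpaired t statistic and approximate degrees of freedom."""
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


group_a = [10, 12, 14, 16, 18]
group_b = [13, 14, 15, 16, 17]

f_value = f_ratio(group_a, group_b)
t_stat, df = welch_t(group_a, group_b)
print(f"F = {f_value:.2f}")
print(f"Welch t = {t_stat:.3f}, df = {df:.2f}")
```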