Biostats (Cheng)


43 Terms

1

Descriptive Statistics

  • small population

  • can collect all data

  • cannot use to make conclusions beyond the data

2

Inferential Statistics

  • population is large (cannot collect all population data)

  • can only collect sample of population

  • can use sample data to make inferences about a population

  • focus on making predictions/generalizations about a larger dataset based on a sample

3

Estimation

  • inference about 1 group

  • can be point estimate or interval estimate

  • can estimate a proportion or mean

4

Comparison

  • inference about 2 or more groups

5

Correlation

  • relationship between 2 variables

6

Point estimate

  • single value

  • e.g. the sample mean, used as the best single estimate of the population mean

7

Interval Estimate

  • defined by two numbers between which a population parameter is said to lie

8

Confidence interval

  • measure of how sure one can be

  • expressed as a percentage (most commonly 95%)

  • as confidence level (percentage) increases, the confidence interval widens

  • as sample size decreases, the confidence interval widens

  • represents confidence that the population parameter lies within the interval (e.g. about 95% of intervals built this way would contain the true value)
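
The two "widens" rules above can be checked numerically. This is a minimal sketch, assuming a normal (z) approximation for the interval around a mean; small samples would normally use a t-distribution instead. The function name `ci_half_width` and the numbers are illustrative, not from the course.

```python
from math import sqrt
from statistics import NormalDist

def ci_half_width(sd, n, confidence=0.95):
    """Half-width of a z-based confidence interval for a mean (assumed approximation)."""
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / sqrt(n)

# Illustrative values: sample standard deviation 10
w95_n25 = ci_half_width(10, 25, 0.95)
w99_n25 = ci_half_width(10, 25, 0.99)   # higher confidence level -> wider interval
w95_n100 = ci_half_width(10, 100, 0.95) # larger sample -> narrower interval
print(w95_n25, w99_n25, w95_n100)
```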

9

Prevalence

  • aka prevalence proportion

  • proportion of a population found to have a condition

    • includes ALL cases (new and pre-existing)

  • (number of subjects with disease)/(total population)

  • usually expressed as fraction, percentage, or number of cases per 10,000 or 100,000 people

10

Incidence rate

  • rate of new cases of a disease occurring in a specific population over a particular period of time

    • limited to NEW cases only

  • (number of NEW cases during a specified time period)/(person years at risk during the same time period)
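
As a worked example of the prevalence and incidence rate formulas, here is a small sketch. The function names and all counts are made up for illustration.

```python
def prevalence(cases, population):
    """All existing cases (new and pre-existing) divided by the total population."""
    return cases / population

def incidence_rate(new_cases, person_years):
    """New cases only, divided by person-years at risk in the same period."""
    return new_cases / person_years

# Hypothetical numbers: 150 people with the disease in a population of 10,000
prev = prevalence(150, 10_000)
prev_per_100k = prev * 100_000      # same quantity expressed per 100,000 people

# Hypothetical numbers: 12 new cases over 4,000 person-years at risk
inc = incidence_rate(12, 4_000)
inc_per_1000_py = inc * 1_000       # new cases per 1,000 person-years
print(prev, prev_per_100k, inc_per_1000_py)
```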

11

Nominal data

  • categorical, unranked data

  • ex. gender, eye color, surgical outcome, blood type

  • when only 2 possible categories: dichotomous, binary, binomial

12

Ordinal Data

  • variables with an inherent order to the relationship among the different categories

  • implied ordering of the categories with unknown quantitative distance

  • distances between the levels may not be the same

  • meaning of different levels may not be the same for different individuals

  • utilizes numbers to indicate rank/order, but numerical values do not hold mathematical significance

  • ex. stages of cancer, education level, pain level, satisfaction level, agreement level

13

Unpaired samples

  • two groups from different populations

  • sample sizes may differ

14

Paired samples

  • the same subjects undergo both treatments (or measurements), so observations come in pairs

  • same sample size in both sets of measurements

  • can be the same people measured at different times or asked about the same products

15

Steps for group comparison

  1. Check data type

  2. Check dependence (paired or unpaired)

16

Unpaired Nominal Data

  • Chi-squared test

    • all expected cell counts are 5 or more (large sample size)

  • Fisher’s exact test

    • any expected cell count is below 5 (small sample size)

17

Paired Nominal Data

  • McNemar’s Test

  • Kappa statistics

    • measure of agreement

18

Unpaired ordinal data

  • Mann-Whitney U test

  • aka Wilcoxon rank-sum test (Wilcoxon two-sample test)

19

Paired ordinal data

  • Wilcoxon signed-rank test (for paired samples)

20

Unpaired continuous data

  • unpaired t-test

21

Paired continuous data

  • paired t-test
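
The two-step choice (data type, then paired or unpaired) across the preceding cards can be summarized as a lookup table. This is only a sketch of the deck's decision rule; the names `TESTS` and `choose_test` are hypothetical.

```python
# Maps (data type, paired?) to the test named on the cards above
TESTS = {
    ("nominal", False): "Chi-squared test or Fisher's exact test",
    ("nominal", True): "McNemar's test",
    ("ordinal", False): "Mann-Whitney U test",
    ("ordinal", True): "Wilcoxon signed-rank test",
    ("continuous", False): "unpaired t-test",
    ("continuous", True): "paired t-test",
}

def choose_test(data_type, paired):
    """Step 1: check the data type. Step 2: check dependence (paired or unpaired)."""
    return TESTS[(data_type.lower(), paired)]

print(choose_test("ordinal", paired=True))
print(choose_test("Continuous", paired=False))
```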

22

Contingency table

  • table of observed data for categorical data

23

Expected table

  • table of expected data for categorical data (if no difference between groups)

  • for first row, first column = (first row margin)(first column margin)/(total)

  • has same margin and grand total values as contingency table
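
The expected-count rule (row margin times column margin, divided by the total) can be applied to every cell at once. The sketch below uses made-up counts and also computes the usual chi-squared statistic, the sum of (observed − expected)² / expected, which the card itself does not spell out.

```python
def expected_table(observed):
    """Expected counts under no difference between groups."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_squared_statistic(observed):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_table(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Hypothetical 2x2 contingency table
observed = [[10, 20],
            [30, 40]]
expected = expected_table(observed)   # margins 30/70 and 40/60, total 100
chi2 = chi_squared_statistic(observed)
# Rule of thumb for choosing chi-squared over Fisher's exact test
large_enough = all(e >= 5 for row in expected for e in row)
print(expected, chi2, large_enough)
```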

24

Odds ratio

  • odds: P/(1-P)

  • odds ratio: (odds of disease with the risk factor)/(odds of disease without the risk factor)

  • cross-product method: ad/bc

                      Yes (disease)    No (disease)
 Yes (risk factor)         a                b
 No (risk factor)          c                d

  • OR = 1 means no association between outcome and exposure

  • OR >1 means exposure associated with increased risk for outcome

    • harmful effect

  • OR <1 means exposure is associated with reduced risk for outcome

    • protective effect

  • consider confidence interval (if it contains 1, not statistically significant)
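
A short sketch of the cross-product method and the "does the interval contain 1" check. The cell labels follow the usual 2x2 layout (a, b = exposed with/without disease; c, d = unexposed with/without disease). The confidence interval uses the common log-based (Woolf) approximation, which is an assumption here since the card does not say how the interval is computed. All counts are invented.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def odds_ratio(a, b, c, d, confidence=0.95):
    """Cross-product odds ratio ad/bc with an approximate (Woolf) confidence interval."""
    or_value = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = exp(log(or_value) - z * se_log_or)
    upper = exp(log(or_value) + z * se_log_or)
    return or_value, lower, upper

# Hypothetical counts: exposed 20 diseased / 80 not, unexposed 10 diseased / 90 not
or_value, lower, upper = odds_ratio(20, 80, 10, 90)
significant = not (lower <= 1 <= upper)  # interval containing 1 -> not significant
print(or_value, lower, upper, significant)
```

In this made-up example the point estimate is above 1, but the interval still includes 1, so the association would not be called statistically significant.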

25

Accuracy

  • number of correct diagnoses (true positives + true negatives) divided by the entire population tested

26

Sensitivity

  • used for paired nominal data

  • measures performance of a binary classification test

  • true positive rate

  • measures proportion of actual positives which are correctly identified

  • how good a test is at finding actual positives

    • complementary to false negative rate

    • used for diagnosis

  • (actual positives identified)/(actual positives), i.e. TP/(TP + FN)

27

Specificity

  • true negative rate

  • measures performance of binary classification test

  • proportion of negatives which are correctly identified

    • complementary to false positive rate

    • used for diagnosis

  • (actual negatives identified)/(actual negatives)

28

Positive predictive value (PPV)

  • (number of true positives)/(number of positive calls)

    • number of positive calls = number of true positives + number of false positives

  • the chance that a person with a positive test truly has the disease

    • used for patient knowledge

29

Negative predictive value (NPV)

  • probability that a subject with a negative screening test really does not have the disease

    • used for patient knowledge

  • (number of true negatives)/(number of negative calls)

    • number of negative calls = number of true negatives + number of false negatives
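
Accuracy, sensitivity, specificity, PPV, and NPV all come from the same four counts, so they can be computed together. A minimal sketch with invented counts; the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Performance measures of a binary test from true/false positives and negatives."""
    return {
        "sensitivity": tp / (tp + fn),   # actual positives correctly identified
        "specificity": tn / (tn + fp),   # actual negatives correctly identified
        "ppv": tp / (tp + fp),           # true positives / all positive calls
        "npv": tn / (tn + fn),           # true negatives / all negative calls
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical screening result for 1,000 people
m = diagnostic_metrics(tp=90, fp=50, fn=10, tn=850)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```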

30

Kappa statistics

  • statistical measure of inter-rater agreement

    • agreement: both raters have same outcome

  • for paired nominal data

31

Kappa statistic strengths of agreement

  • Poor ≤0.20

  • Fair 0.21-0.40

  • Moderate 0.41-0.60

  • Good 0.61-0.80

  • Very Good 0.81-1.00
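
To connect the scale above to numbers, the sketch below computes kappa for two raters and maps it to a strength label. It assumes the statistic meant is Cohen's kappa, (observed agreement − chance agreement)/(1 − chance agreement), which the cards do not state explicitly; the counts are made up.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table.

    a = both raters say yes, d = both say no,
    b = rater 1 yes / rater 2 no, c = rater 1 no / rater 2 yes.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance, from the row and column margins
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)

def agreement_strength(kappa):
    """Label kappa using the cut-offs listed on the card."""
    if kappa <= 0.20:
        return "Poor"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Good"
    return "Very Good"

kappa = cohen_kappa(40, 10, 5, 45)  # hypothetical ratings of 100 subjects
print(kappa, agreement_strength(kappa))
```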

32

T-tests

  • assess whether the means of two groups are statistically different from each other

33

Non-parametric tests

  • distribution free test

  • does not assume anything about the underlying distribution

  • ex. Chi-squared test, Fisher’s exact test, McNemar’s test, Mann-Whitney U test, and Wilcoxon signed-rank test

34

Parametric test

  • makes assumptions about a population’s parameters

  • usually means tests like t-test or ANOVA

    • assume the population data has a normal distribution

35

Tests that check normal distribution

  • QQ plot

  • Shapiro Wilk Test

36

QQ plot

  • quantile-quantile plot

  • shows distribution of the data against the expected normal distribution

  • for normally distributed data, observations should lie approximately on a straight line

  • possible outliers show up as points that fall well off the line, typically at its ends
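
The points of a normal QQ plot can be computed without a plotting library. This is a rough sketch: it pairs each sorted observation with the quantile expected from a normal distribution that has the sample's mean and standard deviation. The plotting positions (i + 0.5)/n are one common convention and an assumption here; the data are invented.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Return (theoretical normal quantile, observed value) pairs for a QQ plot."""
    n = len(sample)
    reference = NormalDist(mean(sample), stdev(sample))
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(sample)))

sample = [5.1, 4.9, 5.0, 5.3, 4.7]  # made-up measurements
points = qq_points(sample)
for expected_q, observed_q in points:
    # For roughly normal data the two columns should be close (points near a straight line)
    print(f"{expected_q:.3f}  {observed_q:.3f}")
```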

37

Shapiro Wilk Test

  • test of normality in frequentist statistics

  • null hypothesis: population is normally distributed

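  • if P < 0.05, reject the null hypothesis (data are unlikely to be normally distributed)

    • nonparametric test should be used

  • if P > 0.05, no evidence against a normal distribution

    • t test can be used

  • among common normality tests, generally has the best power for a given significance level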

38

Unpaired t-test

  • two sample t test

  • applied to 2 independent groups (different people in 2 different groups)

  • sample size may be unequal in each group

39

Paired t-test

  • equivalent to a one-sample t test on the within-pair differences

  • measures whether means from a within subjects test group vary over 2 test conditions (same people in same group)

  • equal sample size

  • takes into account the fact that pairs of subjects go together
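
One way to see how the pairing is used: the paired t statistic is the mean of the within-pair differences divided by its standard error. The sketch below computes only the statistic and degrees of freedom (a p-value needs the t-distribution, which is not in the Python standard library). The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t statistic and degrees of freedom for paired samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal size")
    diffs = [a - b for b, a in zip(before, after)]  # one difference per subject
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Made-up measurements on the same five people under two conditions
before = [120, 130, 125, 140, 135]
after = [115, 128, 120, 133, 131]
t, df = paired_t_statistic(before, after)
print(t, df)
```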

40

One-tailed t-test

  • first mean expected to be larger than the second or first mean expected to be smaller than the second

  • expect the effect to be in a certain direction

41

2-tailed t-test

  • first mean expected to be different from the second in EITHER direction

  • used when looking for any difference between samples

42

Test of equal variance (F-Test)

  • used to test if the variances of 2 populations are equal

    • ratio of the sample variances of the two groups

  • if variances are equal, F is close to 1

    • P>0.05

    • use unpaired t test

  • the more the ratio deviates from 1, the stronger the evidence for unequal population variances

    • P<0.05

    • use Welch’s unpaired t-test

  • used for unpaired data

  • Excel: FTEST (array1,array2)

    • returns 2 tailed probability that the variances in array1 and array2 are not significantly different

  • should check normality before using
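
The F statistic itself is easy to compute by hand. This sketch returns only the variance ratio and the two degrees of freedom; turning it into a p-value requires the F distribution (Excel's FTEST does this), which is not part of the Python standard library. The function name and data are illustrative.

```python
from statistics import variance

def variance_ratio(group1, group2):
    """F statistic (ratio of sample variances) with its degrees of freedom."""
    f = variance(group1) / variance(group2)
    return f, len(group1) - 1, len(group2) - 1

# Made-up unpaired groups; the second is visibly more spread out
group1 = [1.0, 2.0, 3.0, 4.0, 5.0]
group2 = [2.0, 4.0, 6.0, 8.0, 10.0]
f, df1, df2 = variance_ratio(group1, group2)
print(f, df1, df2)  # a ratio far from 1 points toward unequal variances
```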
