Biostats final

67 Terms

1

What makes a good sample

precise (low sampling error)

accurate (unbiased)

random

2

What constitutes a random sample

each individual is selected independently (selecting one doesn’t influence the selection of any other) and every individual has an equal chance of being selected

3

what makes a bad sample

sample of convenience - a collection of individuals that happen to be available

4

population parameters

are constant (fixed values that do not change from sample to sample)

5

sample estimates ____ with the sample

vary

6

2 ways to describe uncertainty

95% CI and standard error (the standard deviation of the estimate's sampling distribution)

7

Pseudoreplication

samples that are not independent but are treated like they are

8

How a CI changes when the confidence level goes from 95% to 99%

interval must widen

9

how increasing sample size affects a CI at the same confidence level

narrower
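
A minimal sketch of both CI effects (confidence level and sample size). It assumes a normal-approximation interval for a mean, half-width = z × sd / √n; a t-based interval would be somewhat wider for small samples. The function name and the sd value are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def ci_half_width(sd, n, level=0.95):
    """Half-width of a normal-approximation confidence interval for a mean."""
    # Critical z value: about 1.96 for 95%, about 2.58 for 99%
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sd / sqrt(n)

sd = 10.0  # hypothetical sample standard deviation

w95_n25 = ci_half_width(sd, 25, 0.95)    # baseline
w99_n25 = ci_half_width(sd, 25, 0.99)    # higher confidence level -> wider
w95_n100 = ci_half_width(sd, 100, 0.95)  # larger sample -> narrower

print(f"95%, n=25:  +/- {w95_n25:.2f}")
print(f"99%, n=25:  +/- {w99_n25:.2f}")
print(f"95%, n=100: +/- {w95_n100:.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the interval width.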

10

Graphing 1 numerical

histogram

11

Graphing 1 categorical

bar graph

12

Graphing 2 categorical

mosaic or grouped bar graph

13

Graphing 1 categorical 1 numerical

box plot, violin plot, strip chart

14

Graphing 2 numerical

scatter plot

15

common problems with figures

do axes start at zero?

do axes titles have units?

are x-axis group names in a legend?

16

2 general types of statistics, with examples

descriptive - characterizing aspects of a numerical data set (mean and sd)

inferential - evaluate strength of evidence about a hypothesis (t-value, F-ratio)

17

Type 1 error and how to decrease

false positive: rejecting Ho when it is actually true

lower the significance level α (a stricter decision threshold, e.g. 0.01 instead of 0.05)

18

Type 2 error and how to decrease

false negative: fail to reject Ho when it is actually false

increase sample size = increased power

19

power

probability that a random sample results in the rejection of a false Ho
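
A rough sketch of how power responds to sample size and to the decision threshold. To stay within the standard library it assumes a two-sided one-sample z-test with a known population sd, not a t-test; the effect size, sd, and function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (population sd assumed known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance between the true mean and the Ho mean, in standard errors
    shift = effect * sqrt(n) / sd
    # Probability that the test statistic lands in either rejection region
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

small_n = z_test_power(effect=2.0, sd=5.0, n=10)
large_n = z_test_power(effect=2.0, sd=5.0, n=50)
strict = z_test_power(effect=2.0, sd=5.0, n=50, alpha=0.01)

print(f"n=10, alpha=0.05: power = {small_n:.2f}")
print(f"n=50, alpha=0.05: power = {large_n:.2f}")
print(f"n=50, alpha=0.01: power = {strict:.2f}")
```

The third call shows the trade-off: a stricter threshold lowers the Type 1 error rate but also lowers power, which raises the Type 2 error rate unless the sample size increases.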

20

p value

probability of getting data at least as extreme as the observed data set if Ho is true

21

signal to noise ratios

t-value

F-ratio

r-value

22

1 sample t-test

comparing the mean of one sample to a known or hypothesized value
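
A small sketch of the one-sample t statistic written as a signal to noise ratio: signal = sample mean minus the hypothesized value, noise = standard error of the mean. The data and the hypothesized value are invented for illustration, and the p-value lookup (t distribution with n - 1 degrees of freedom) is left out.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t = (sample mean - hypothesized mean) / standard error of the mean."""
    n = len(sample)
    signal = mean(sample) - mu0
    noise = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    return signal / noise

temps = [98.4, 98.6, 97.8, 98.8, 97.9, 98.2, 98.0]  # hypothetical measurements
t = one_sample_t(temps, mu0=98.6)
print(f"t = {t:.2f} with {len(temps) - 1} degrees of freedom")
```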

23

proportions

when you have a number of successes out of a sample size (a proportion) and want to test it against a known value
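
One way to test a proportion is an exact binomial test. The sketch below is an illustration, not necessarily the method used in the course: it sums the probabilities of every outcome that is no more likely under Ho than the observed count. The counts are made up.

```python
from math import comb

def binomial_two_sided_p(successes, n, p0):
    """Exact two-sided binomial p-value for Ho: true proportion = p0."""
    # Probability of each possible number of successes if Ho is true
    probs = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = probs[successes]
    # Add up all outcomes at least as unlikely as the observed one
    # (small tolerance guards against floating-point ties)
    return sum(p for p in probs if p <= observed * (1 + 1e-9))

# Hypothetical data: 14 successes out of 18 trials, tested against p0 = 0.5
p_value = binomial_two_sided_p(14, 18, 0.5)
print(f"p = {p_value:.4f}")
```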

24

paired t-test

2 sets of measurements that are connected (e.g. before and after on the same individuals), test if the mean difference is zero

25

2-sample t-test

2 independent groups that are not connected, see if their means are different
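
A sketch contrasting the paired and 2-sample t statistics on the same invented numbers. The 2-sample version assumes equal variances and uses the pooled variance; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

def paired_t(before, after):
    """Paired t: a one-sample t-test on the within-pair differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def two_sample_t(x, y):
    """2-sample t with pooled variance (assumes equal population variances)."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

before = [10, 12, 14, 16, 18]  # hypothetical measurements
after = [11, 14, 15, 18, 20]   # same individuals, measured again

t_paired = paired_t(before, after)
t_unpaired = two_sample_t(after, before)
print(f"paired t = {t_paired:.2f}, 2-sample t = {t_unpaired:.2f}")
```

Pairing removes the large variation between individuals from the noise term, so the same shift gives a much larger t. Treating truly paired data as two independent samples throws that advantage away.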

26

paired, 2-sample, and ANOVA assumptions

random

normal distributions

27

2-sample and anova assumptions

variances of all populations being compared are equal

28

correlation

describes the linear association between 2 numerical variables

29

correlation coefficient

quantifies linear association through

direction and

magnitude/strength
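
A minimal sketch of the Pearson correlation coefficient r, computed from sums of products and squares. The sign gives the direction and the absolute value (0 to 1) gives the strength. The function name and data are illustrative.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their spreads."""
    mx, my = mean(x), mean(y)
    sum_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sum_xx = sum((a - mx) ** 2 for a in x)
    sum_yy = sum((b - my) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)

print(pearson_r([1, 2, 3], [2, 4, 6]))        # perfect positive line
print(pearson_r([1, 2, 3], [6, 4, 2]))        # perfect negative line
print(pearson_r([1, 2, 3, 4], [1, 3, 2, 4]))  # positive but weaker
```

Swapping x and y gives the same r, since correlation treats the two variables equally.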

30

why variables might be correlated

chance

a causes b

c may cause/influence a and b

a may lead to an increase in c which increases b

31

correlation misconception

association/correlation does not imply causation

32

point of linear regression

predict the value of one variable from another

33

difference between regression and correlation

regression does not treat the two variables equally

34

residuals

the difference between the actual value and the predicted value of the response variable

minimize these squared for best fit line

35

R²

proportion of the variance in y (response variable) explained by x (explanatory variable)
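
A short sketch tying together the least-squares line, the residuals, and the fraction of variance in y explained by x (often written R²). The data are invented and the helper name is hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept (minimizes the sum of squared residuals)."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

x = [1, 2, 3, 4, 5]  # explanatory variable (hypothetical)
y = [2, 4, 5, 4, 5]  # response variable (hypothetical)

slope, intercept = fit_line(x, y)

# Residual = actual y minus predicted y
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

ss_residual = sum(r**2 for r in residuals)      # variation left unexplained
ss_total = sum((b - mean(y)) ** 2 for b in y)   # total variation in y
r_squared = 1 - ss_residual / ss_total

print(f"y = {intercept:.1f} + {slope:.1f}x, R^2 = {r_squared:.2f}")
```

Unlike correlation, swapping x and y here would give a different line, because only the vertical (y) distances are minimized.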

36

predicting mean y for x

higher precision

use confidence bands = curved, narrowest at the mean of x and wider toward the extremes

37

predicting specific y for x

lower precision

use prediction intervals = wider than confidence bands, run roughly parallel to the line of best fit

38

extrapolation

predicting y for x out of range

you don’t know if the trend continues

39

assumptions for correlation and regression

relationship between x and y is linear

frequency distributions are normal (no gaps, no outliers)

variance of y does not change with x (no funnel)

(each y is chosen at random for each x - regression)

40

three things to do when assumptions fail

ignore them

transform the data

non-parametric tests

41

non-parametric tests

ranks the data
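
A minimal sketch of the ranking step that rank-based tests rely on, with tied values sharing their average rank. The function name and data are illustrative; this is only the ranking, not a full test.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

data = [3.1, 1.2, 250.0, 1.2]  # hypothetical values with one extreme outlier
print(average_ranks(data))     # [3.0, 1.5, 4.0, 1.5]
```

The outlier 250.0 simply gets the top rank, the same rank it would get if it were 3.2, which is why ranking removes the influence of extreme values and of the distribution's shape.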

42

Wilcoxon test assumptions

both samples are random

both have same distribution shape

43

plots for assessing normality

histogram

qq plot

44

plots for assessing equal variance

box plots - look for equal IQRs

45

why normal distributions are important

they occur commonly in nature

symmetric about mean - bell shaped

fully described by mean and sd, mean = median = mode

about 95% of data are within 2 sd of the mean
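
A quick check of the "2 sd" rule using the standard normal distribution from Python's standard library. The variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, sd 1

within_1sd = z.cdf(1) - z.cdf(-1)
within_2sd = z.cdf(2) - z.cdf(-2)
exact_95 = z.inv_cdf(0.975)  # number of sd that captures exactly 95%

print(f"within 1 sd: {within_1sd:.4f}")
print(f"within 2 sd: {within_2sd:.4f}")
print(f"exactly 95% lies within {exact_95:.2f} sd")
```

So "2 sd" is a rounded version of 1.96 sd, and the true coverage of 2 sd is slightly above 95%.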

46

2 goals of experimental design

eliminate bias (increase accuracy)

decrease sampling error (increase precision)

47

ways to eliminate bias

controls

random assignment of samples to treatments

blinding

48

random assignment can only happen for

experimental studies not observational

49

Placebo

special kind of control: an inactive treatment, needed because the expectation of getting better is powerful

50

independent recovery

people often get better on their own, because they tend to seek treatment when they feel worst, so you need a control group to compare to in order to see whether the treatment actually works

51

goal of reducing sampling error

increase signal to noise ratio

52

how to increase signal to noise ratio

increase sample size

decrease sd

53

4 ways to reduce sampling error

replication

balance

blocking

extreme treatments

54

balance

each group has equal number of samples

n1=n2 means smallest error so smallest noise
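
A small sketch of why balance minimizes noise. It assumes the usual standard error of a difference between two means with a common sd, SE = sd × √(1/n1 + 1/n2), and splits a fixed total of 40 units in different ways. The numbers are hypothetical.

```python
from math import sqrt

def se_difference(sd, n1, n2):
    """Standard error of the difference between two group means (common sd)."""
    return sd * sqrt(1 / n1 + 1 / n2)

total = 40  # hypothetical number of experimental units available
for n1 in (5, 10, 20, 30):
    n2 = total - n1
    print(f"n1={n1:2d}, n2={n2:2d}: SE = {se_difference(1.0, n1, n2):.3f}")
```

For a fixed total, the standard error is smallest at the even split (20 and 20) and grows as the design becomes more lopsided.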

55

blocking

grouping similar experimental units into blocks, with each treatment applied within every block

creates mini experiments in each block

accounts for variation between blocks

56

extreme treatments

stronger treatments can increase signal to noise ratio

57

threats to reproducible science

bias - no controls, randomization, or blinding

low statistical power - small sample size

poor quality control (higher sampling error) - no replication, balance, blocking, or extreme treatments

P-hacking - keep adding data till significance is seen

publication bias - only publishing significant results

HARKing - hypothesizing after the results are known

58

planning sample size and problems with that

want a sample size with sufficient power and precision

calculate sample size assuming 2 sample experiment comparing means with normal data sets and equal sd

n ≈ 8 × (sd / margin of error)² in each group

problem = can be hard to estimate the population sd and margin of error without literature or short trial experiment

focus on the beginning of the curve because that’s where you can add the most precision
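
A sketch of the rule of thumb above, n = 8 × (sd / margin of error)², plus the same formula rearranged to show the precision bought by each sample size. The sd and margin values are invented, and the function names are hypothetical.

```python
from math import ceil, sqrt

def n_per_group(sd, margin_of_error):
    """Planned sample size from n = 8 * (sd / margin of error)^2, rounded up."""
    return ceil(8 * (sd / margin_of_error) ** 2)

def margin_for_n(sd, n):
    """The same rule rearranged: margin of error expected for a given n."""
    return sd * sqrt(8 / n)

print(n_per_group(sd=10, margin_of_error=5))    # 32
print(n_per_group(sd=10, margin_of_error=2.5))  # 128: half the margin, 4x the n

# Diminishing returns: each added batch of samples buys less precision
for n in (5, 10, 20, 40, 80, 160):
    print(f"n={n:3d}: margin = {margin_for_n(10, n):.2f}")
```

The loop shows the shape of the curve: going from 5 to 20 samples halves the margin of error, while going from 80 to 160 shrinks it by much less in absolute terms.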

59

frequentist stats

defines probability of some event in terms of the relative frequency with which it tends to occur

deductive

the true population looks like this so my sample should look like this

60

Bayesian stats

more subjective, defines probability as a measure of strength of your belief regarding true situation

inductive

my sample came out like this so the true population might be this

61

Bayes' theorem for a particular parameter

posterior ∝ likelihood × prior (dividing by the overall probability of the data turns this into an equality)

62

posterior

updated probability of the parameter given the prior and the new data

63

likelihood

probability of data given parameter

64

prior

probability of parameter
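
A minimal sketch of posterior = likelihood × prior using a grid approximation. It assumes the parameter is a proportion with binomial data and a flat prior; the counts, grid size, and function name are all hypothetical.

```python
def grid_posterior(successes, n, prior):
    """Multiply likelihood by prior at each grid value, then normalize to sum to 1."""
    grid = [i / (len(prior) - 1) for i in range(len(prior))]  # candidate proportions 0..1
    # Probability of the data given each candidate value
    # (the binomial coefficient is constant and cancels when normalizing)
    likelihood = [p**successes * (1 - p) ** (n - successes) for p in grid]
    unnormalized = [like * pr for like, pr in zip(likelihood, prior)]
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]

flat_prior = [1.0] * 101  # every candidate value equally believable beforehand
grid, posterior = grid_posterior(successes=7, n=10, prior=flat_prior)

most_probable = grid[posterior.index(max(posterior))]
print(f"most probable proportion: {most_probable:.2f}")
```

With a flat prior the posterior peaks at the observed proportion, since the likelihood alone decides the shape. An informative prior would pull the peak toward the prior's own peak.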

65
66
67

2 factors that affect where the posterior is on the graph

precision - the posterior sits closer to whichever curve (prior or likelihood) is skinnier

distance between prior and likelihood - further apart = more shifted posterior
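
A sketch of both factors, assuming the prior and the likelihood are both normal curves (the standard normal-normal update, chosen here for illustration). Each curve is weighted by its precision, 1/sd², so the skinnier curve pulls the posterior harder. All numbers are invented.

```python
def normal_posterior(prior_mean, prior_sd, data_mean, data_se):
    """Precision-weighted combination of a normal prior and a normal likelihood."""
    prior_precision = 1 / prior_sd**2
    data_precision = 1 / data_se**2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + data_precision * data_mean) / post_precision
    return post_mean, (1 / post_precision) ** 0.5

# Wide prior, skinny likelihood: posterior lands near the data
mean_a, sd_a = normal_posterior(prior_mean=0, prior_sd=2, data_mean=10, data_se=1)

# Skinny prior, wide likelihood: posterior stays near the prior
mean_b, sd_b = normal_posterior(prior_mean=0, prior_sd=1, data_mean=10, data_se=2)

print(f"wide prior:   posterior mean = {mean_a:.1f}, sd = {sd_a:.2f}")
print(f"skinny prior: posterior mean = {mean_b:.1f}, sd = {sd_b:.2f}")
```

In both cases the posterior is skinnier than either input curve, and the further apart the prior and likelihood are, the further the posterior sits from each of them.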