Biostats final

76 Terms

1
New cards

Truth

Something we can never know with certainty

2
New cards

Elements of a good scientific research question

  • Clarifies your primary goal

  • Identifies the scope of inference

  • Interesting to others

  • Can be answered with data

  • Grounded in theory

  • Feasible

3
New cards

Goals of scientific research

  • Description: numerically summarize a quantity of interest

  • Prediction: forecast some future event based on historical or current data

  • Explanation: understand whether a particular variable is a cause of some other variable

4
New cards

Explanatory variable

variable inducing a causal effect

5
New cards

Response variable

variable receiving causal effects

6
New cards

Quantitative

numeric: continuous or discrete

7
New cards

Qualitative

categorical, factor: nominal or ordinal

8
New cards

Continuous variable

can be measured to any number of decimal places (ex. height, weight, temperature)

9
New cards

Discrete variable

can only take on whole numbers (ex. number of students in the classroom)

10
New cards

Nominal variable

categories do not have an order

11
New cards

Ordinal

categories have an order (ex. life stage: juvenile, subadult, adult)

12
New cards

Sample size

  • affects the shape of the distribution

  • can lead to misleading patterns or conclusions if it is too small

  • can increase uncertainty if too small

13
New cards

DAGs

  • used to visualize the causal assumptions of a hypothesis

  • communicate your causal assumptions to others

14
New cards

Cause

refers to a variable or event that produces a change in another variable

15
New cards

Counterfactual

refers to a hypothetical scenario: what would have happened to the response if the cause had been different

16
New cards

The fork

  • Confounder

  • Can lead to spurious relationships

  • Can mask real causal relationships between explanatory and response

  • directed, open

17
New cards

The pipe

  • Mediator

  • Conditioning on the mediator can cause post-treatment bias when estimating the total effect

  • directed, open (conditioning on the mediator closes the path)

18
New cards

The inverted fork

  • Collider

  • directed, closed

19
New cards

Total effect

indirect effects + direct effects

20
New cards

Observational Study

  • sensitive to confounding variables

  • inferring causation requires formal causal inference approaches for statistical analysis

21
New cards

Experimental Study

  • randomization breaks association between treatments and other variables

  • causal inference approaches can be necessary to avoid post-treatment bias

22
New cards

Randomization

The defining feature of an experiment

23
New cards

Populations

entire group of individuals or units you are interested in studying

24
New cards

Samples

a subset of the population that is observed or measured

25
New cards

Parameter

quantities that describe a population and whose true values are unknown

26
New cards

Statistic

a numerical summary calculated from a sample

27
New cards

estimate

the value of a statistic used as an approximation of a parameter

28
New cards

estimand

parameter of interest

29
New cards

Sampling distribution

distribution of possible outcomes of an estimate based on our sampling process

30
New cards

Precision

  • measure of how consistent samples should be when we repeatedly sample from a population

  • describes sampling error

31
New cards

Accuracy

  • describes bias

32
New cards

How to maximize precision

replication

33
New cards

How to maximize accuracy

random sampling

34
New cards

Random trial

each test is a random process

35
New cards

outcome

end result of each trial

36
New cards

sample space

set of all possible outcomes

37
New cards

Frequentist

probability is the proportion of trials n in which we observe the event of interest, X

38
New cards

Bayesian

probability is a strength of belief

39
New cards

Kolmogorov’s Axioms of probability

  • Rule 1: the probability of any event X is non-negative

    • P(X) ≥ 0

  • Rule 2: the probability that some outcome in the sample space Ω occurs is certain (1)

    • P(Ω)=1

  • Rule 3: Addition rule for mutually exclusive events A and B

    • P(A or B)=P(A)+P(B)

40
New cards

Joint probability

probability that both the first and the second event occur, P(A and B)

41
New cards

Marginal probability

the total probability of one event, found by summing its joint probabilities across all outcomes of the other event

42
New cards

Independent events

  • two events are independent if the occurrence of one does not affect the probability of the other

  • P(A and B)=P(A)*P(B)

43
New cards

Mutually exclusive

events that cannot occur at the same time (ex. an individual cannot test positive and negative for the infection at the same time)

44
New cards

Conditional probability

  • defined as the probability of event A given that we know event B is true

  • P(A|B)= P(A and B)/P(B)
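
A minimal sketch of the formula above, using the infection-test setting from the mutually exclusive card. The joint probabilities are made-up numbers for illustration only, and the helper names `marginal_test` and `conditional` are hypothetical.

```python
# Hypothetical joint probabilities P(status and test result); they sum to 1.
joint = {
    ("infected", "positive"): 0.08,
    ("infected", "negative"): 0.02,
    ("healthy", "positive"): 0.09,
    ("healthy", "negative"): 0.81,
}

def marginal_test(result):
    """Marginal probability of a test result: sum the joint over infection status."""
    return sum(p for (_, r), p in joint.items() if r == result)

def conditional(status, result):
    """P(status | result) = P(status and result) / P(result)."""
    return joint[(status, result)] / marginal_test(result)

p_positive = marginal_test("positive")                 # 0.08 + 0.09 = 0.17
p_inf_given_pos = conditional("infected", "positive")  # 0.08 / 0.17
print(round(p_positive, 2), round(p_inf_given_pos, 4))
```

With these assumed numbers, fewer than half of the positive tests come from infected individuals, even though most infected individuals test positive.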

45
New cards

Random variable

a variable whose observed value depends on chance; the term random implies an element of chance in how we observe the variable

46
New cards

Probability distribution

probability of observing each mutually exclusive outcome

47
New cards

Probability density

how probability is distributed across the possible values of a continuous random variable

48
New cards

Probability mass

can be quantified as area under the probability density function for any interval of interest

49
New cards

Empirical Rule

  • 68% of observations are within 1 standard deviation of the mean

  • 95% of the observations are within 2 standard deviations of the mean
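
The two percentages can be checked directly, assuming the observations follow a normal distribution (the setting where the rule applies). This sketch uses the standard identity P(|Z| < k) = erf(k / √2); the function name `prob_within` is hypothetical.

```python
import math

def prob_within(k):
    """Probability that a normal observation lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

print(round(prob_within(1), 4))  # 0.6827
print(round(prob_within(2), 4))  # 0.9545
```

The second value shows that "95% within 2 standard deviations" is a rounded figure.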

50
New cards

Parameter estimation

assumes a single true parameter value

51
New cards

point estimates

represents the single best estimate of the parameter of interest

52
New cards

sampling distribution

  • the probability distribution used to describe a sample estimate

  • an illustration of uncertainty about the estimates taken from samples

53
New cards

The mean in a sampling distribution

the true parameter value (when the estimate is unbiased)

54
New cards

Standard error

standard deviation of the sampling distribution
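
A rough simulation of this idea, assuming a normal population with made-up mean and standard deviation. It draws many samples, stores each sample mean, and compares the standard deviation of those means with the textbook value σ/√N for a sample mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 10.0, 2.0  # hypothetical population values
N = 25                        # size of each sample
REPLICATES = 5000             # number of repeated samples

sample_means = []
for _ in range(REPLICATES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(statistics.mean(sample))

simulated_se = statistics.stdev(sample_means)  # SD of the sampling distribution
theoretical_se = POP_SD / N ** 0.5             # 2 / 5 = 0.4
mean_of_means = statistics.mean(sample_means)  # should sit near POP_MEAN

print(simulated_se, theoretical_se, mean_of_means)
```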

55
New cards

Law of large numbers

as the sample size N increases, the point estimate ultimately converges on the true parameter value
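
One way to see this is a simulated binary trial with a known success probability. The value 0.3 is an arbitrary assumption; the sketch records the running proportion at a few sample sizes.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

TRUE_P = 0.3  # hypothetical true parameter value
CHECKPOINTS = (10, 100, 1_000, 10_000, 100_000)

successes = 0
estimates = {}
for n in range(1, max(CHECKPOINTS) + 1):
    successes += random.random() < TRUE_P  # True counts as 1
    if n in CHECKPOINTS:
        estimates[n] = successes / n       # point estimate after n trials

for n, estimate in estimates.items():
    print(n, estimate)
```

The estimates at small N can land far from 0.3, while the estimate at the largest N should be very close to it.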

56
New cards

Central Limit Theorem

the distribution of a sample estimate will be approximately normal at large sample sizes regardless of the shape of the underlying probability distribution
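
A small simulation sketch, assuming a strongly skewed population (exponential with rate 1, so its mean and standard deviation are both 1). If the sample means are roughly normal, about 68% of them should fall within one standard deviation of their center.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

N = 50             # size of each sample
REPLICATES = 4000  # number of repeated samples

means = []
for _ in range(REPLICATES):
    sample = [random.expovariate(1.0) for _ in range(N)]
    means.append(statistics.mean(sample))

center = statistics.mean(means)   # expected near 1
spread = statistics.stdev(means)  # expected near 1 / sqrt(50)
within_one = sum(abs(m - center) <= spread for m in means) / REPLICATES

print(center, spread, within_one)
```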

57
New cards

Confidence Interval

uses information from the standard error to quantify a range of possible values for the true parameter (ex. a proportion) at a given level of confidence
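
A minimal sketch for a proportion, assuming the normal approximation (estimate ± z × standard error) with z = 1.96 for 95% confidence. The counts are made up and the function name `proportion_ci` is hypothetical; other interval methods exist and may behave better at small sample sizes.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Approximate confidence interval for a proportion (normal approximation)."""
    p_hat = successes / n                    # point estimate
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    return p_hat - z * se, p_hat + z * se

low, high = proportion_ci(40, 100)  # hypothetical data: 40 successes in 100 trials
print(round(low, 3), round(high, 3))  # 0.304 0.496
```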

58
New cards

Steps of null hypothesis significance testing

1) Specify the null and alternative hypotheses

2) Determine the test statistic and significance value

3) Compute the sampling distribution for the null hypothesis

4) Compute p-value

5) Make a decision
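
The five steps can be sketched with an exact one-tailed binomial test. The data (15 successes in 20 trials), the null value of 0.5, and the 0.05 significance value are all assumptions chosen for illustration.

```python
from math import comb

# 1) Hypotheses: null p = 0.5, alternative p > 0.5 (directional)
P_NULL = 0.5
# 2) Test statistic: number of successes; significance value alpha
ALPHA = 0.05
N_TRIALS, OBSERVED = 20, 15  # hypothetical data

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# 3) Sampling distribution of the statistic under the null
null_distribution = [binom_pmf(k, N_TRIALS, P_NULL) for k in range(N_TRIALS + 1)]

# 4) p-value: probability of a result as or more extreme than observed
p_value = sum(null_distribution[OBSERVED:])

# 5) Decision
decision = "reject the null" if p_value <= ALPHA else "fail to reject the null"
print(round(p_value, 4), decision)  # 0.0207 reject the null
```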

59
New cards

Null hypothesis

H0, represents the hypothesis of “no effect”

60
New cards

Alternative hypothesis

Ha, represents the opposite of the null hypothesis

61
New cards

Null distribution

sampling distribution for the null hypothesis

62
New cards

p-value

probability of getting an estimate as or more extreme than our sample estimate, assuming the null is true

63
New cards

Type I error

We reject a null hypothesis that is actually true

64
New cards

Type II error

  • Failing to reject a null hypothesis that is actually false

65
New cards

Factors that affect Type II errors

  • low sample size and high variability

  • effect size is small

  • low significance value
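
These factors can be explored with a sketch that computes the Type II error rate for a one-tailed exact binomial test. The null value of 0.5, the true proportions, and the sample sizes are assumptions; `type_ii_error` is a hypothetical helper, and variability is not varied here because a binary outcome's variance is fixed by its proportion.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def upper_tail(k, n, p):
    """Probability of k or more successes in n trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def type_ii_error(n, p_true, alpha=0.05, p_null=0.5):
    """Probability of failing to reject the null when the true proportion is p_true."""
    # Smallest count whose tail probability under the null is at most alpha
    critical = next(k for k in range(n + 2) if upper_tail(k, n, p_null) <= alpha)
    power = upper_tail(critical, n, p_true)  # chance of landing in the rejection region
    return 1 - power

print(type_ii_error(20, 0.7))              # baseline
print(type_ii_error(100, 0.7))             # larger sample size
print(type_ii_error(20, 0.6))              # smaller effect size
print(type_ii_error(20, 0.7, alpha=0.01))  # lower significance value
```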

66
New cards

One-tailed test

the alternative hypothesis is directional

67
New cards

significance value (α)

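the probability of rejecting a true null hypothesis; refers to the probability of observations in the tail(s) of the null distribution that make up the rejection region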

68
New cards

Two-tailed test

does not specify directionality of a potential effect

69
New cards

prior probability

P(Hypothesis), quantitative statement of your degree of belief about the hypothesis prior to collecting new data

70
New cards

Likelihood

P(Data|Hypothesis), probability of the new data you observe given that the hypothesis is true

71
New cards

Marginal Likelihood

P(Data), overall probability of the data integrated across all possible hypotheses

72
New cards

Posterior Probability

P(Hypothesis|Data), our updated probability given the new data

73
New cards

Point estimates

represent the single best estimate of the parameters of interest

74
New cards

Steps of Bayesian inference

1) Specify the prior distribution

2) Quantify the likelihood

3) Quantify the marginal likelihood

4) Quantify the posterior distribution
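
The four steps can be sketched for a small set of candidate hypotheses. Everything here is an assumption for illustration: three candidate values for a success probability, a uniform prior, binomial data of 6 successes in 8 trials, and a sum in place of the integral for the marginal likelihood.

```python
from math import comb

hypotheses = [0.25, 0.50, 0.75]  # hypothetical candidate success probabilities
successes, trials = 6, 8         # hypothetical new data

# 1) Prior: equal belief in each hypothesis before seeing the data
prior = {h: 1 / len(hypotheses) for h in hypotheses}

# 2) Likelihood: P(Data | Hypothesis) from the binomial distribution
likelihood = {
    h: comb(trials, successes) * h**successes * (1 - h) ** (trials - successes)
    for h in hypotheses
}

# 3) Marginal likelihood: P(Data), summed across all hypotheses
marginal = sum(prior[h] * likelihood[h] for h in hypotheses)

# 4) Posterior: P(Hypothesis | Data) = prior * likelihood / marginal likelihood
posterior = {h: prior[h] * likelihood[h] / marginal for h in hypotheses}

for h in hypotheses:
    print(h, round(posterior[h], 3))
```

With these assumed data, most of the posterior probability moves to the hypothesis 0.75, which is closest to the observed proportion of 6/8.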

75
New cards

Binomial distribution

probability distribution for a binary variable where the outcome of that binary variable is examined across n trials
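
A short sketch of the standard binomial probability formula, P(k successes) = C(n, k) p^k (1 - p)^(n - k). The symbols k and p and the example values n = 4, p = 0.5 are assumptions not stated on the card.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 4, 0.5  # hypothetical example: 4 fair coin flips
distribution = [binom_pmf(k, n, p) for k in range(n + 1)]
print(distribution)  # [0.0625, 0.25, 0.375, 0.25, 0.0625]
```

The outcomes 0 through n successes are mutually exclusive and together cover the whole sample space, so the probabilities sum to 1.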

76
New cards

Statistical models

  • quantitative representations of our scientific models

  • consist of an equation, or a set of equations, that describe single variables, and more often in causal inference, the relationship between variables.