Advanced Epi Methods

Linear, Logistic, Poisson, Survival Analysis

38 Terms

1

Regression Analysis

Used to determine whether one or more independent variables are associated with a dependent variable

2

Independent variable

Explanatory variable

Predictor variable

X

3

Dependent variable

Response variable

Outcome variable

Y

4

What is a statistical model?

The equation that describes the putative relationship among variables.

5

Multivariable analysis

Inferences based on the parameter for any independent variable are conditional on the other independent variables in the model.

Avoid omitting potential confounders while not including variables of minimal consequence.

6

Linear Regression

Outcome is measured on a CONTINUOUS scale, e.g. body weight

Predictors can be measured on a continuous or categorical (dichotomous) scale

7

Linear Regression Example

Is chest girth (cm) significantly associated with body weight (kg) among heifers?

8

How do we determine the line that best fits our data?

The method of least squares is used to estimate the parameters (in this case, β0 and β1) in such a way as to minimize the sum of the squared residuals
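
A minimal sketch of least squares for one predictor, assuming the model Y = β0 + β1X. The girth and weight values are hypothetical illustration data, and the function name is not from the course material.

```python
from statistics import mean

def least_squares(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals for y = b0 + b1*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares of X around its mean
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

# Hypothetical heifer data (not from the source)
girth_cm = [150, 160, 170, 180, 190]
weight_kg = [310, 345, 405, 445, 495]

b0, b1 = least_squares(girth_cm, weight_kg)
print(f"weight = {b0:.1f} + {b1:.2f} * girth")
```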

9

What is a residual?

Used to estimate error in the model

The difference between an observed value of Y & its predicted value for a given value of X
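
A small sketch of the definition above, residual = observed Y minus predicted Y. The data and the fitted values of β0 and β1 are hypothetical and only serve to illustrate the arithmetic.

```python
def residuals(x, y, b0, b1):
    """Observed Y minus predicted Y (b0 + b1*x) for each observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Hypothetical data and hypothetical fitted parameters
girth_cm = [150, 160, 170, 180, 190]
weight_kg = [310, 345, 405, 445, 495]
b0, b1 = -399.0, 4.7

res = residuals(girth_cm, weight_kg, b0, b1)
sse = sum(r ** 2 for r in res)  # sum of squared residuals, the quantity least squares minimizes
print([round(r, 1) for r in res], round(sse, 1))
```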

10

t-test (LR)

t = β/SE(β)

Used to evaluate whether the predictor is significantly associated with the outcome

H0: β=0

HA: β≠0

A significant t value denotes that the predictor explains some of the variation in the outcome
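
A hedged sketch of the t-test for the slope in simple linear regression, assuming SE(β1) = sqrt(MSE / Sxx) with n - 2 degrees of freedom. The data are hypothetical, and the critical value 3.182 is the tabulated two-sided 5% value for 3 degrees of freedom.

```python
import math
from statistics import mean

def slope_t_statistic(x, y):
    """Return (b1, se_b1, t, df) for the slope of y = b0 + b1*x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2                      # two parameters estimated (b0, b1)
    mse = sse / df                  # residual variance estimate
    se_b1 = math.sqrt(mse / sxx)    # standard error of the slope
    return b1, se_b1, b1 / se_b1, df

# Hypothetical heifer data (not from the source)
girth_cm = [150, 160, 170, 180, 190]
weight_kg = [310, 345, 405, 445, 495]

b1, se_b1, t, df = slope_t_statistic(girth_cm, weight_kg)
T_CRIT_DF3 = 3.182                  # two-sided 5% critical value from a t table, df = 3
reject_h0 = abs(t) > T_CRIT_DF3     # True means H0: beta = 0 is rejected
print(round(t, 2), df, reject_h0)
```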

11

R²

Describes the proportion of variance in the outcome variable that is explained by the predictor(s)

It always ↑ (or at least never ↓) as predictors are added to the model, so it can’t be used for variable selection

12

Adjusted R²

Its value is adjusted for the number of predictor variables (k) in the model

Will ↓ if added predictors have minimal additional impact on the outcome
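
A short sketch contrasting the two statistics, assuming the usual formulas R² = 1 - SSE/SST and adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1). The sums of squares are hypothetical numbers chosen to mimic adding a predictor that barely reduces the residual error.

```python
def r_squared(sse, sst):
    """Proportion of outcome variance explained by the model."""
    return 1 - sse / sst

def adjusted_r_squared(sse, sst, n, k):
    """R-squared penalized for the number of predictors k, given n observations."""
    return 1 - (1 - r_squared(sse, sst)) * (n - 1) / (n - k - 1)

n, sst = 5, 22200.0  # hypothetical sample size and total sum of squares

# Model A: one predictor. Model B: a second predictor that barely lowers SSE.
r2_a, adj_a = r_squared(110.0, sst), adjusted_r_squared(110.0, sst, n, k=1)
r2_b, adj_b = r_squared(109.0, sst), adjusted_r_squared(109.0, sst, n, k=2)

print(round(r2_a, 4), round(r2_b, 4))    # R-squared goes up slightly
print(round(adj_a, 4), round(adj_b, 4))  # adjusted R-squared goes down
```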

13

Model Assumptions (LR)

Independence: the values of the outcome variable are independent from one another, i.e. no clustered data

Linearity: the relationship between the outcome and any continuous predictor variables is linear

Normal distribution: the residuals are normally distributed

Homoscedasticity: the variance of the residuals is the same across the range of predicted values of y

14

What if underlying assumptions are not met?

Independence & linearity assumptions are the most important

Can do data transformation, e.g. logarithmic

Can proceed as planned if there are moderate departures from normality & homoscedasticity

15

Cook’s Distance (Di)

Assesses the influence of each observation

Standardized measure of the change in regression parameters if the particular observation were omitted
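
For reference, one common way to write Cook's distance. The symbols are assumptions not defined in the cards: e_i is the residual of observation i, h_ii its leverage, p the number of estimated regression parameters (including the intercept), and MSE the residual mean square.

```latex
D_i = \frac{e_i^{2}}{p \,\mathrm{MSE}} \cdot \frac{h_{ii}}{(1 - h_{ii})^{2}}
```

A large residual combined with high leverage gives a large D_i, which matches the idea that omitting the observation would change the fitted parameters noticeably.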

16

Collinearity

Presence of highly correlated predictor variables in the model

Results in large standard errors of regression parameters

Leads to t-test statistics that are spuriously small and thus p-values that are misleading

Assessed using variance inflation factor (VIF)

17

Variance inflation factor (VIF)

Measures how much the variance of regression coefficients in the model is inflated by addition of a predictor variable that contains very similar information

Values of VIF > 10 indicate serious collinearity

The SE of a regression parameter will ↑ by a factor of about the square root of VIF when a collinear predictor variable is added to the model

18

VIF = 1/(1 – R²_X)

where R²_X is the coefficient of determination describing the amount of variance in the incoming X that is explained by the predictors already in the model
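
A minimal sketch of the VIF formula and the square-root rule for the standard error. The value 0.9 is a hypothetical coefficient of determination for the incoming predictor, and the function name is illustrative.

```python
import math

def vif(r2_x):
    """Variance inflation factor given the R-squared of X regressed on the other predictors."""
    if not 0 <= r2_x < 1:
        raise ValueError("r2_x must be in [0, 1)")
    return 1 / (1 - r2_x)

r2_x = 0.9                        # hypothetical: 90% of the variance in X is explained by other predictors
v = vif(r2_x)                     # about 10, the threshold for serious collinearity
se_inflation = math.sqrt(v)       # SE increases by a factor of about sqrt(VIF)
print(round(v, 2), round(se_inflation, 2))
```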

19

Logistic Regression

Outcome of interest is measured on a categorical scale

Usually dichotomous: yes/no, negative/positive, 0/1

Predictors can be measured on a continuous or categorical scale

20

Can we use a linear regression model when the outcome is dichotomous?

No, because we would be unable to interpret any predicted values of Y other than 0 or 1.

21

Generalized linear models (GLM)

Random component: identifies the outcome variable Y & selects a probability distribution for it, e.g. normal, binomial, Poisson, negative binomial

Systematic component: specifies the linear combination of predictor variables, e.g. β0 + β1X1

Link function: specifies a function that relates the expected value of Y to the linear combination of predictor variables, i.e. it connects the random & systematic components

Gives us a linear relationship between our outcome variable & predictor(s)
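
As an illustration, the three models covered in this set can be written as GLMs with one predictor. The pairing of distribution and link shown here is the conventional one and is assumed rather than stated in the cards; μ is the expected value of Y and p the probability of the outcome.

```latex
\begin{aligned}
\text{Linear (normal, identity link):}   \quad & \mu = \beta_0 + \beta_1 X_1 \\
\text{Logistic (binomial, logit link):}  \quad & \ln\!\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 X_1 \\
\text{Poisson (Poisson, log link):}      \quad & \ln(\mu) = \beta_0 + \beta_1 X_1
\end{aligned}
```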

22

Interpreting OR for continuous predictors

The factor by which the odds are ↑ (or ↓) for each unit change in the predictor
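
A short sketch of the interpretation, assuming OR = exp(β) for a one-unit change and exp(c·β) for a c-unit change. The coefficient 0.2 is a hypothetical logistic regression estimate.

```python
import math

def odds_ratio(beta, units=1.0):
    """Odds ratio for a change of `units` in a continuous predictor with coefficient beta."""
    return math.exp(beta * units)

beta = 0.2                            # hypothetical coefficient on the log-odds scale
or_per_unit = odds_ratio(beta)        # odds multiply by this for each 1-unit increase
or_per_5_units = odds_ratio(beta, 5)  # odds multiply by this for a 5-unit increase
print(round(or_per_unit, 3), round(or_per_5_units, 3))
```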

23

Maximum likelihood estimation

used to estimate the regression parameters

24

Wald chi-squared test

used to evaluate the significance of individual parameters
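
A hedged sketch of the Wald test for a single parameter, assuming the statistic is (β/SE)² and is compared to a chi-squared distribution with 1 degree of freedom. The estimate and standard error are hypothetical.

```python
import math

def wald_chi2(beta, se):
    """Wald chi-squared statistic (1 df) and its p-value for H0: beta = 0."""
    w = (beta / se) ** 2
    # For 1 df, P(chi2 > w) equals the two-sided normal tail P(|Z| > sqrt(w))
    p_value = math.erfc(math.sqrt(w / 2))
    return w, p_value

beta, se = 0.2, 0.08   # hypothetical estimate and standard error
w, p_value = wald_chi2(beta, se)
print(round(w, 2), round(p_value, 4))
```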

25

Model assumptions logistic regression

Independence: the observations are independent from one another

Linearity: the relationship between the log odds of the outcome, i.e. ln{p/(1 – p)}, and any continuous predictor variables is linear

26

Goodness-of-fit statistics address the differences between observed & predicted values or their ratio

Pearson χ²

Deviance χ²

Hosmer-Lemeshow test

27

Pearson & deviance χ²

Based on dividing the data into covariate patterns

Within each pattern, the predicted # of outcomes is computed & compared to the observed # of outcomes to yield the Pearson & deviance residuals

The Pearson & deviance chi-squared statistics represent the sums of the respective squared residuals
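
The steps above can be sketched as follows, assuming each covariate pattern is summarized by its size n, observed number of outcomes y, and model-predicted probability p. The three patterns are hypothetical, and the residual formulas are the standard binomial ones rather than anything specific to the course.

```python
import math

def pearson_residual(n, y, p):
    """(observed - expected) / sqrt(binomial variance) for one covariate pattern."""
    return (y - n * p) / math.sqrt(n * p * (1 - p))

def _obs_log_ratio(obs, exp):
    """obs * ln(obs / exp), taken as 0 when obs is 0."""
    return 0.0 if obs == 0 else obs * math.log(obs / exp)

def deviance_residual(n, y, p):
    """Signed square root of the pattern's contribution to the deviance."""
    d2 = 2 * (_obs_log_ratio(y, n * p) + _obs_log_ratio(n - y, n * (1 - p)))
    return math.copysign(math.sqrt(max(d2, 0.0)), y - n * p)

# Hypothetical covariate patterns: (n, observed outcomes, predicted probability)
patterns = [(20, 4, 0.25), (30, 12, 0.35), (25, 15, 0.55)]

pearson_chi2 = sum(pearson_residual(n, y, p) ** 2 for n, y, p in patterns)
deviance_chi2 = sum(deviance_residual(n, y, p) ** 2 for n, y, p in patterns)
print(round(pearson_chi2, 3), round(deviance_chi2, 3))
```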

28

Hosmer-Lemeshow test

Based on dividing the data in a more arbitrary fashion, e.g. percentiles of estimated probability

Predicted & observed outcome probabilities within each group are compared as before

More reliable if the # of covariate patterns is high relative to the # of observations

29

Poisson Regression

Outcome of interest is measured on a discrete (count) scale, e.g. # of cases of disease, # of deaths

Predictors can be measured on a continuous or categorical (including dichotomous) scale

30

Model assumptions (Poisson regression)

Independence: the observations are independent from one another

Linearity: the relationship between the outcome, i.e. ln(μ/N), & any continuous predictor variables is linear

Mean = variance
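
A crude first look at the mean = variance assumption, assuming a simple list of counts with no predictors. The counts are hypothetical. In a fitted model the assumption concerns the variance conditional on the predictors, so this is only a rough screen and not a formal test.

```python
from statistics import mean, variance

# Hypothetical counts, e.g. # of cases per herd (not from the source)
cases = [0, 1, 2, 1, 3, 0, 2, 1, 4, 1]

m = mean(cases)
v = variance(cases)          # sample variance (n - 1 denominator)
dispersion_ratio = v / m     # near 1 is consistent with Poisson; well above 1 suggests overdispersion
print(m, round(v, 3), round(dispersion_ratio, 3))
```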
