Advanced Epi Methods

Description and Tags

Linear, Logistic, Poisson, Survival Analysis

38 Terms

New cards

Regression Analysis

Used to determine whether one or more independent variables are associated with a dependent variable

New cards

Independent variable

Explanatory variable

Predictor variable

New cards

Dependent variable

Response variable

Outcome variable

New cards

What is a statistical model?

The equation that describes the putative relationship among variables.

New cards

Multivariable analysis

Inferences based on the parameter for any independent variable are conditional on the other independent variables in the model.

Avoid omitting potential confounders while not including variables of minimal consequence.

New cards

Linear Regression

Outcome is measured on a CONTINUOUS scale, e.g. body weight

Predictors can be measured on a continuous or categorical (dichotomous) scale

New cards

Linear Regression Example

Is chest girth (cm) significantly associated with body weight (kg) among heifers?

New cards

How do we determine the line that best fits our data?

The method of least squares is used to estimate the parameters (in this case, β0 and β1) in such a way as to minimize the sum of the squared residuals
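
A minimal sketch of the least-squares estimates for one predictor, using only the Python standard library. The girth and weight values are hypothetical illustration data, not from the course, and the function names are made up for this card.

```python
# Hypothetical data: chest girth (cm) and body weight (kg) for five heifers.
girth = [150.0, 160.0, 170.0, 180.0, 190.0]
weight = [300.0, 345.0, 405.0, 450.0, 500.0]


def least_squares(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx             # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1


def sum_squared_residuals(x, y, b0, b1):
    """Sum of squared differences between observed and predicted y."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


b0, b1 = least_squares(girth, weight)
ssr = sum_squared_residuals(girth, weight, b0, b1)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SSR = {ssr:.2f}")
```

Any other pair of values for β0 and β1 gives a sum of squared residuals at least as large as the one returned here.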

New cards

What is a residual?

Used to estimate error in the model

The difference between an observed value of Y & its predicted value for a given value of X

New cards

T-test LR

t = β / SE(β)

Used to evaluate whether the predictor is significantly associated with the outcome

H0: β=0

HA: β≠0

A significant t value denotes that the predictor explains some of the variation in the outcome
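
A rough sketch of how the t statistic for the slope could be computed by hand in simple linear regression. The data are hypothetical, and the formula SE(β1) = sqrt(MSE / Sxx) applies to the one-predictor case only. The resulting t would be compared with a t distribution on n − 2 degrees of freedom; the p-value lookup is left out here.

```python
import math

# Hypothetical data: chest girth (cm) and body weight (kg) for five heifers.
girth = [150.0, 160.0, 170.0, 180.0, 190.0]
weight = [300.0, 345.0, 405.0, 450.0, 500.0]


def slope_t_statistic(x, y):
    """Return (b1, se_b1, t) for the slope of y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = ssr / (n - 2)            # residual variance on n - 2 degrees of freedom
    se_b1 = math.sqrt(mse / sxx)   # standard error of the slope
    return b1, se_b1, b1 / se_b1


b1, se_b1, t = slope_t_statistic(girth, weight)
print(f"b1 = {b1:.3f}, SE = {se_b1:.4f}, t = {t:.2f}")
```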

New cards

R²

Describes the proportion of variance in the outcome variable that is explained by the predictor(s)

It never ↓ as predictors are added to the model (thus can’t be used for variable selection)

New cards

Adjusted R²

Its value is adjusted for the number of predictor variables (k) in the model

Will ↓ if added predictors have minimal additional impact on the outcome
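
A small sketch of both statistics, assuming the usual definitions R² = 1 − SS_residual / SS_total and adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The observed and predicted values are hypothetical.

```python
def r_squared(observed, predicted):
    """Proportion of variance in the outcome explained by the model."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Penalize R² for the number of predictors k, given n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical observed body weights (kg) and model predictions.
observed = [300.0, 345.0, 405.0, 450.0, 500.0]
predicted = [299.0, 349.5, 400.0, 450.5, 501.0]

r2 = r_squared(observed, predicted)
adj_r2 = adjusted_r_squared(r2, n=len(observed), k=1)
print(f"R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
```

With R² held fixed, raising k lowers the adjusted value, which is why a predictor that adds almost nothing can make adjusted R² fall.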

New cards

Model Assumptions (LR)

Independence: the values of the outcome variable are independent from one another, i.e. no clustered data

Linearity: the relationship between the outcome and any continuous predictor variables is linear

Normal distribution: the residuals are normally distributed

Homoscedasticity: the variance of the residuals is the same across the range of predicted values of y

New cards

What if underlying assumptions are not met?

Independence & linearity assumptions are the most important

Can do data transformation, e.g. logarithmic

Can proceed as planned if there are moderate departures from normality & homoscedasticity

New cards

Cook’s Distance (Di)

assesses the influence of each observation

Standardized measure of the change in regression parameters if the particular observation was omitted
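
A sketch of Cook's distance for simple linear regression, assuming the common closed form Di = ei² / (p · MSE) · hi / (1 − hi)², where hi is the leverage and p = 2 parameters (intercept and slope). The data are hypothetical; the last point is placed far from the others on purpose.

```python
def cooks_distances(x, y):
    """Return Cook's distance for each observation in a one-predictor model."""
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for xi, ei in zip(x, residuals):
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        distances.append(ei ** 2 / (p * mse) * leverage / (1.0 - leverage) ** 2)
    return distances


# Hypothetical data: the sixth observation has an extreme x and an unusual y.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 5.0]

distances = cooks_distances(x, y)
most_influential = distances.index(max(distances))
print(f"most influential observation: index {most_influential}")
```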

New cards

Collinearity

presence of highly correlated predictor variables in the model

Results in large standard errors of regression parameters

Leads to t-test statistics that are spuriously small and thus p-values that are misleading

Assessed using variance inflation factor (VIF)

New cards

Variance inflation factor (VIF)

Measures how much the variance of regression coefficients in the model is inflated by addition of a predictor variable that contains very similar information

Values of VIF > 10 indicate serious collinearity

The SE of a regression parameter will ↑ by a factor of about the square root of VIF when a collinear predictor variable is added to the model

New cards

VIF = 1 / (1 − R²_X)

where R²_X is the coefficient of determination describing the amount of variance in the incoming X that is explained by the predictors already in the model
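
A minimal sketch of the VIF formula for the case of one predictor already in the model and one incoming predictor, where the R² of the incoming X is just the squared correlation between the two. The girth and height values are hypothetical.

```python
import math


def r_squared_between(existing, incoming):
    """R² from regressing the incoming predictor on one existing predictor."""
    n = len(existing)
    e_bar = sum(existing) / n
    i_bar = sum(incoming) / n
    sxy = sum((e - e_bar) * (i - i_bar) for e, i in zip(existing, incoming))
    sxx = sum((e - e_bar) ** 2 for e in existing)
    syy = sum((i - i_bar) ** 2 for i in incoming)
    return sxy ** 2 / (sxx * syy)


def vif(r2_x):
    """Variance inflation factor for a predictor with the given R²."""
    return 1.0 / (1.0 - r2_x)


# Hypothetical data: chest girth (cm) already in the model, height (cm) incoming.
girth = [150.0, 160.0, 170.0, 180.0, 190.0]
height = [118.0, 121.0, 127.0, 129.0, 135.0]

r2_x = r_squared_between(girth, height)
vif_height = vif(r2_x)
se_inflation = math.sqrt(vif_height)  # approximate factor by which the SE grows
print(f"R2_X = {r2_x:.2f}, VIF = {vif_height:.1f}, SE factor = {se_inflation:.1f}")
```

Here the VIF is well above 10, so these two hypothetical predictors would count as seriously collinear.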

New cards

Logistic Regression

Outcome of interest is measured on a categorical scale

Usually dichotomous: yes/no, negative/positive, 0/1

Predictors can be measured on a continuous or categorical scale

New cards

Can we use a linear regression model for a dichotomous outcome?

No, as we would be unable to interpret any predicted values of Y other than 0 or 1

New cards

Generalized linear models (GLM)

Random component: identifies the outcome variable Y & selects a probability distribution for it, e.g. normal, binomial, Poisson, negative binomial

Systematic component: specifies the linear combination of predictor variables, e.g. β0 + β1X1

Link function: specifies a function that relates the expected value of Y to the linear combination of predictor variables, i.e. it connects the random & systematic components

▪ Gives us a linear relationship between our outcome variable & predictor(s)
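
A short sketch of the three components for a logistic model, assuming the logit link that is standard for a binomial outcome. The coefficient values are hypothetical.

```python
import math


def linear_predictor(b0, b1, x):
    """Systematic component: b0 + b1*x."""
    return b0 + b1 * x


def logit(p):
    """Link function: maps a probability in (0, 1) to the log odds."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Inverse link: maps the linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients for illustration only.
b0, b1 = -3.0, 0.5
eta = linear_predictor(b0, b1, x=4.0)   # systematic component
prob = inverse_logit(eta)               # expected value of Y (random component: binomial)
print(f"log odds = {eta:.2f}, probability = {prob:.3f}")
```

The log odds are linear in x even though the probability is not. Other outcome distributions pair with other links, e.g. the identity link for a normal outcome and the log link for a Poisson outcome.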

New cards

Interpreting OR for continuous predictors

The factor by which the odds are ↑ (or ↓) for each unit change in the predictor
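
A quick sketch, assuming a logistic model where the OR for a continuous predictor is exp(β) per unit and exp(c · β) for a change of c units. The coefficient is hypothetical.

```python
import math


def odds_ratio(beta, units=1.0):
    """Factor by which the odds change for a given change in the predictor."""
    return math.exp(beta * units)


beta = math.log(2.0)  # hypothetical coefficient chosen so the per-unit OR is 2
per_unit = odds_ratio(beta)
per_three_units = odds_ratio(beta, units=3.0)
print(f"OR per unit = {per_unit:.1f}, OR per 3 units = {per_three_units:.1f}")
```

The effect is multiplicative: a 3-unit change multiplies the odds by the per-unit OR three times over, not by three times the per-unit OR.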

New cards

Maximum likelihood estimation

used to estimate the regression parameters
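
A sketch of maximum likelihood estimation for a logistic regression with one predictor, using Newton-Raphson steps on the log likelihood. This is one common way to maximize it, not necessarily what any particular software does, and the data are hypothetical.

```python
import math


def log_likelihood(b0, b1, x, y):
    """Binomial log likelihood of the data under logit(p) = b0 + b1*x."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logistic(x, y, iterations=25):
    """Return (b0, b1) that maximize the log likelihood via Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector (first derivatives of the log likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative second derivatives).
        w = [pi * (1.0 - pi) for pi in p]
        i00 = sum(w)
        i01 = sum(wi * xi for wi, xi in zip(w, x))
        i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = i00 * i11 - i01 ** 2
        # Newton step: solve the 2x2 system information * step = score.
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    return b0, b1


# Hypothetical data: a continuous predictor and a 0/1 outcome (not perfectly separable).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0, 0, 1, 0, 1, 1, 0, 1]

b0, b1 = fit_logistic(x, y)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```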

New cards

Wald chi-squared test

used to evaluate the significance of individual parameters
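
A sketch of the Wald test for a single parameter, assuming the usual form W = (β / SE)² compared with a chi-squared distribution on 1 degree of freedom. The coefficient and standard error are hypothetical. The p-value uses the fact that a chi-squared variable with 1 df is the square of a standard normal.

```python
import math


def wald_chi_squared(beta, se):
    """Wald statistic for H0: beta = 0."""
    return (beta / se) ** 2


def chi_squared_1df_p_value(statistic):
    """Upper-tail probability of a chi-squared distribution with 1 df."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical estimate and standard error.
beta, se = 1.96, 1.0
w = wald_chi_squared(beta, se)
p_value = chi_squared_1df_p_value(w)
print(f"Wald chi-squared = {w:.4f}, p = {p_value:.4f}")
```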

New cards

Model assumptions logistic regression