W5 - Poisson regression

0.0(0)
studied byStudied by 0 people
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
Card Sorting

1/18

encourage image

There's no tags or description

Looks like no tags are added yet.

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

19 Terms

1
New cards

Normal distribution parameters

  1. mean

  2. standard deviation

2
New cards

Poisson distribution parameters

one parameter: lambda (the mean and variance of the poisson distribution)

3
New cards

Outcome variable types - Poisson distribution

  • count outcome variables

  • discrete positive integers

4
New cards

as lambda increases the distribution …

begins to look more similar to a normal distribution

5
New cards

Why linear regression cannot be used for count data

  • The normality distribution would be violated (unless high avg counts)'

  • Can't have negative numbers

  • Certainty at extreme values is impossible when the outcome can only be 0 or positive whole numbers

  • when an outcome can only be 0 or positive whole numbers, a straight line is a bad fit, and is impossible for extreme values (negative)

6
New cards

Solution: link function

  • resolves the issue of count data not having negative values

  • the link function transforms the linear predicted value η so that it never goes below zero

7
New cards

Group comparisons

poisson and logistic regression used z-scores rather than t-values

8
New cards

Link function for poisson regression

η = g(λ) = ln(λ)

Takes the average number of counts (lambda), which can only fall > 0, and transforms it to fall between negative infinity and positive infinity. This gives us a continuous and unbounded outcome we can apply a linear model to.

  • natural log transformation unbounds lambda on the left side of the graph so that it never goes below zero

  • log of zero = negative infinity (cannot go under zero)

9
New cards

ln()

the natural logarithm (or log with base e, euler's number)

10
New cards

Poisson regression

  • Used when the outcome is a count variable (discrete, whole and positive)

  • Poisson distribution used when the counts are relatively rare

  • Poisson distribution: discrete, density curve not plotted

  • Only has one distribution parameter - Lambda (mean + variance)

  • As lambda increases may start to look more like a normal distribution

11
New cards

Inverse link function

reverts predictions made from linear model back to original count scale so predictions can be interpreted

  • transforms eta back to fall > 0

12
New cards

Poisson regression assumptions

  • assumes a poisson distribution

  • errors are independent

  • assumes that on the link scale (natural log scale) there is a linear relationship

  • assumptions of equal variance and normality not required

  • Requires a large sample size. There are no degrees of freedom so sample must be large enough that parameters are distributed about normally relying on the central limit theorem

*no agreed upon method for evaluating whether residual meet assumptions

13
New cards

Poisson regression in R

  • Use glm() function

  • When using anything other than a linear model need to specify family argument e.g. family = poisson()

  • m <- glm(num_awards ~ math, data = dcount, family = poisson())

14
New cards

Poisson output

  • Deviance residuals: how much each data point contributed to the residual deviance. Measure of overall model fit. Not calculated as the raw difference between each data point and predicted value.

  • Coefficients: z-values instead of t-values as we cannot calculate degrees of freedom

  • No R2 - only works for linear regression. Instead get the null and residual deviance values and an AIC (relative measure of model fit taking into account model complexity)

  • Number of Fisher scoring iterations

15
New cards

Poisson regression interpretation

A poisson regression predicting the number of awards each student received from their math scores showed that students who had a 0 math score were expected to have -5.33 log awards, p < .001. Each one unit higher math score was associated with 0.09 higher log awards, p < .001. A graph of these results follows.

 - by specifying that we are working on a log scale can interpret the results as we did before

  • hard to interpret results on log scale

16
New cards

Incident rate ratios

  • Obtained by exponentiating the regression coefficients (slopes)

    *can exp coefficients & confint, but not p-values or z-values

  • IRRs indicate how many more times the outcome will be for a one unit change in the predictor

  • e.g. if the IRR=2  and the base rate was 1, then one unit higher would be 1∗2=2. If the base rate was 2, then a one unit higher would be 2∗2=4

  • IRRs are interpreted as for each one unit higher predictor score, there are IRR times as many events of the outcome.

  • How much your outcome is expected to increase in count numbers

  • As IRRs are multiplicative, an IRR of 1 means that a one unit change in the predictor is associated with 1 times as many events on the outcome, that is no change.

  • On the IRR scale, a coefficient of 1 is equivalent to a coefficient of 0 on the link (log) scale

17
New cards

Interpretation using IRRs

A poisson regression predicting the number of awards each student received from their math scores showed that students who had a 0 math score were expected to have 0.00 awards, [95% CI 0.00 - 0.01]. Each one unit higher math score was associated with having 1.09 times the number of awards, [95% CI 1.07 - 1.11], p < .001. A graph of these results follows.

18
New cards

eta

expected part of the model without error term

19
New cards

Poisson testDistribution()

discrete distribution

  • solid grey bars show the observed density of different values

  • blue bars show the density under a poisson distribution

  • deviates plot show the deviations from the assumed poisson distribution

<p>discrete distribution</p><ul><li><p>solid grey bars show the observed density of different values </p></li><li><p>blue bars show the density under a poisson distribution </p></li><li><p>deviates plot show the deviations from the assumed poisson distribution </p></li></ul><p></p>