Lecture 19 Poisson regression and Generalised Linear Models

MS4215/MS6061 Lecture 19: Poisson Regression

Page 1: Introduction to Poisson Regression

Page 2: Overview

Poisson Regression: A regression method for count data.
Likelihood Function & Maximum Likelihood Estimation: Technique for estimating the parameters of a statistical model.
Generalised Linear Models (GLMs): Framework covering a family of regression models, including Poisson regression.
Link Functions: Functions that connect the mean of the response distribution to the linear predictor.

Page 3: Poisson Regression Model for Count Data

Example: Number of flying-bomb hits per district in London.
Assumption: The response variable follows a Poisson distribution, Y ~ Poisson(λ), with E[Y] = Var[Y] = λ.
Probability Mass Function (PMF):
- $P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$ for x = 0, 1, 2, ...
Observed and Expected Hits (number of districts with each count):
Hits        0      1      2     3     4    5+
Observed    229    211    93    35    7    1
Expected    226.7  211.4  98.6  30.6  7.1  1.6
Calculation: Mean hits per district = 0.9288, Number of districts = 576; expected count = 576 × P(X = x).
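The expected counts can be reproduced approximately from the stated mean and number of districts. The sketch below assumes the last cell pools five or more hits; the function names are illustrative. The results land close to the expected counts listed above, and small differences are plausible if the listed values were computed from a slightly different estimate of the mean.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def expected_counts(lam, n_units, max_count):
    """Expected number of units with 0, 1, ..., max_count - 1 events,
    plus a final pooled cell for max_count or more events."""
    probs = [poisson_pmf(x, lam) for x in range(max_count)]
    probs.append(1.0 - sum(probs))  # tail cell: max_count or more
    return [n_units * p for p in probs]

observed = [229, 211, 93, 35, 7, 1]          # cells 0, 1, 2, 3, 4, 5+
expected = expected_counts(0.9288, 576, 5)   # mean and districts from the notes

for label, obs, exp_ in zip(["0", "1", "2", "3", "4", "5+"], observed, expected):
    print(f"{label:>2} hits: observed {obs:3d}, expected {exp_:6.1f}")
```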

Page 4: Poisson Distribution Visualization

Graphical representation of the Poisson distribution for rate parameter λ = 1, 4, 10.
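As a numerical companion to the plot, the following sketch checks the property E[Y] = Var[Y] = λ by summing the PMF directly. It takes the plot's parameter values 1, 4 and 10 as Poisson rates (an assumption about the figure), and the truncation point of 200 is an arbitrary choice that is ample for these rates.

```python
import math

def poisson_pmf(x, lam):
    # Work on the log scale so that large x does not overflow.
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def mean_and_variance(lam, upper=200):
    """Mean and variance obtained by summing the PMF over 0..upper-1."""
    xs = range(upper)
    mean = sum(x * poisson_pmf(x, lam) for x in xs)
    second_moment = sum(x * x * poisson_pmf(x, lam) for x in xs)
    return mean, second_moment - mean ** 2

results = {lam: mean_and_variance(lam) for lam in (1, 4, 10)}
for lam, (m, v) in results.items():
    print(f"lambda = {lam:2d}: mean = {m:.4f}, variance = {v:.4f}")
```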

Page 5: Applications of Poisson Regression

Common Examples:
- Number of credit cards owned per individual.
- Number of customers in line at a shop, influenced by items on discount and special events.
- Number of doctor visits by patients in a month.

Page 6: Modelling the Mean of a Poisson Response Variable

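Starting Point: A linear model for the mean,
$\lambda_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki}$,
can produce negative values, whereas a Poisson mean must be positive.
- Transformation (log link): $\ln(\lambda_i) = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki}$
- Mean: $\lambda_i = \exp(\beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki})$
Likelihood Function:
- $L(\beta; y) = \prod_{i=1}^{n} \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}$, where $\lambda_i = \exp(\beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki})$.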
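In practice the logarithm of the likelihood is what gets maximised. A minimal sketch of that log-likelihood follows; the data are hypothetical and the function name is illustrative. For the intercept-only model the maximiser is known in closed form, β0 = ln(ȳ), which gives a simple check.

```python
import math

def poisson_loglik(beta, X, y):
    """Log-likelihood of a Poisson regression with log link.

    beta : coefficients [b0, b1, ..., bk]
    X    : rows [x1, ..., xk] (no intercept column)
    y    : observed counts
    """
    total = 0.0
    for row, yi in zip(X, y):
        eta = beta[0] + sum(b * x for b, x in zip(beta[1:], row))  # linear predictor
        lam = math.exp(eta)                                        # mean, always positive
        total += -lam + yi * eta - math.lgamma(yi + 1)             # ln(e^-lam * lam^y / y!)
    return total

# Hypothetical data, for illustration only.
X = [[1.0], [2.0], [3.0], [4.0], [5.0]]
y = [1, 1, 3, 4, 8]

# Intercept-only model: the MLE of b0 is ln(mean of y).
X0 = [[] for _ in y]
b0_hat = math.log(sum(y) / len(y))
ll_intercept = poisson_loglik([b0_hat], X0, y)

# A model with a slope, evaluated at one trial value of the coefficients.
ll_slope = poisson_loglik([-0.5, 0.5], X, y)

print(f"intercept-only at b0 = ln(ybar): {ll_intercept:.3f}")
print(f"with slope at (b0, b1) = (-0.5, 0.5): {ll_slope:.3f}")
```

A full fit would search over all coefficients at once, typically by iteratively reweighted least squares, which is what standard GLM software does.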

Page 7: Parameter Estimates Interpretation

Single Predictor Model: $\ln(\lambda_i) = \beta_0 + \beta_1 x_i$
Interpretation of Coefficients:
- $\exp(\beta_0)$ is the mean of Y when X = 0.
- $\exp(\beta_1)$ is the multiplicative change in the mean of Y for each one-unit increase in X.
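The multiplicative interpretation follows directly from the log link. Writing λ(x) for the mean at predictor value x (notation introduced here for the derivation):

```latex
\begin{align*}
\lambda(x) &= \exp(\beta_0 + \beta_1 x) \\
\frac{\lambda(x+1)}{\lambda(x)}
  &= \frac{\exp\bigl(\beta_0 + \beta_1 (x+1)\bigr)}{\exp(\beta_0 + \beta_1 x)}
   = \exp(\beta_1)
\end{align*}
```

The ratio does not depend on x, so each one-unit increase multiplies the mean by the same factor. A coefficient of 0.07, for example, corresponds to a factor of about 1.073, that is, roughly a 7% increase in the mean per unit.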

Page 8: Comparing Doctor Visits Data

Data Overview: Lect19DrVisits.xlsx includes patient visit data, illness types, and age.
Research Interest: Compare doctor visits among different illnesses (1, 2, 3) controlling for age.

Page 9: Regression Analysis in R

R Code:
- Convert illness to a factor and fit the Poisson regression model with glm(), specifying family = poisson.
- Use summary() for the coefficient estimates and anova() for the analysis of deviance.

Page 10: Regression Coefficients Example

Output Coefficients:
Term              Estimate
(Intercept)       -5.24712
Age                0.07015
Illness Type 2     1.08386
Illness Type 3     0.36981
Illness type 1 has no coefficient of its own, which indicates that it is the reference level.
Statistical Significance: The summary output marks each coefficient with significance codes; the AIC is reported for comparing candidate models.
- Null deviance: 287.67, Residual deviance: 189.45, AIC: 373.5
Questions: Use the results to answer questions such as how the illness types compare and what the predicted mean number of visits is.
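Questions of this kind can be answered by exponentiating the reported coefficients. The sketch below assumes illness type 1 is the reference level (it has no coefficient in the output) and uses a hypothetical age of 50 for the predicted means; the function and variable names are illustrative.

```python
import math

# Coefficients as reported in the output above.
coef = {
    "intercept": -5.24712,
    "age": 0.07015,
    "illness2": 1.08386,
    "illness3": 0.36981,
}

def predicted_mean(age, illness):
    """Predicted mean number of visits for a given age and illness type (1, 2 or 3)."""
    eta = coef["intercept"] + coef["age"] * age
    if illness == 2:
        eta += coef["illness2"]
    elif illness == 3:
        eta += coef["illness3"]
    return math.exp(eta)  # inverse of the log link

# Multiplicative effects on the mean (rate ratios).
rate_ratios = {name: math.exp(b) for name, b in coef.items() if name != "intercept"}
ratio_2_vs_3 = math.exp(coef["illness2"] - coef["illness3"])

for name, rr in rate_ratios.items():
    print(f"{name}: mean multiplied by {rr:.3f}")
print(f"illness 2 versus illness 3: {ratio_2_vs_3:.3f}")

AGE = 50  # hypothetical age, chosen only for illustration
for illness in (1, 2, 3):
    print(f"age {AGE}, illness {illness}: predicted mean {predicted_mean(AGE, illness):.3f}")
```

Because age enters the linear predictor additively, the ratio between two illness types is the same at every age.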

Page 11: Analysis of Deviance

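Deviance Table: Breaks the model fit down sequentially, showing the reduction in deviance as each term is added.
Deviance Calculation:
- The deviance of a GLM is twice the difference between the log-likelihood of the saturated model and that of the fitted model. A smaller deviance indicates a closer fit, and differences in deviance are used to compare nested models.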
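For a Poisson model the deviance has the closed form 2 Σ [y ln(y/μ) − (y − μ)], with the first term taken as zero when y = 0. The sketch below implements it on hypothetical counts, then applies the deviance-difference idea to the null and residual deviances reported on Page 10. The three degrees of freedom assume the fitted model adds age and two illness indicators to the intercept; 7.815 is the 95% point of the chi-square distribution with 3 degrees of freedom.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu)), with y * ln(y / mu) = 0 when y = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

# Hypothetical counts, for illustration only.
y = [0, 1, 3, 4, 8]
ybar = sum(y) / len(y)

null_deviance = poisson_deviance(y, [ybar] * len(y))  # intercept-only fit
saturated_deviance = poisson_deviance(y, y)           # perfect fit, so zero

# Drop in deviance using the values reported on Page 10.
drop = 287.67 - 189.45
CHI2_3DF_95 = 7.815  # 95% point of chi-square with 3 degrees of freedom

print(f"null deviance (hypothetical data): {null_deviance:.3f}")
print(f"saturated deviance: {saturated_deviance:.3f}")
print(f"drop in deviance: {drop:.2f}, exceeds critical value: {drop > CHI2_3DF_95}")
```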

Page 12: Characteristics of Poisson Regression

Model Type: GLM with a logarithmic link function.
Distribution Properties: The Poisson distribution has mean equal to variance. Overdispersion (variance greater than the mean) is a sign that the Poisson model is inadequate.

Page 13: Overdispersion Example

Data Analysis Example: In the visits data the variance slightly exceeds the mean, indicating mild overdispersion.
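A first check for overdispersion is simply to compare the sample variance of the counts with their mean. A minimal sketch, using hypothetical counts rather than the lecture's data:

```python
import statistics

def variance_to_mean_ratio(counts):
    """Sample variance divided by sample mean; values near 1 are consistent with a Poisson model."""
    return statistics.variance(counts) / statistics.mean(counts)

# Hypothetical visit counts, for illustration only.
counts = [0, 2, 1, 0, 3, 1, 0, 5, 2, 1, 0, 4]

mean = statistics.mean(counts)
variance = statistics.variance(counts)
ratio = variance_to_mean_ratio(counts)

print(f"mean = {mean:.3f}, variance = {variance:.3f}, ratio = {ratio:.3f}")
```

This raw comparison ignores the predictors, so it is only a rough guide; once a model is fitted, the comparison is better made against the fitted means.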

Page 14: Absenteeism Dataset Overview

Dataset: Lect19DaysAbsent.xlsx records student absenteeism together with demographic and academic variables.

Page 15: Regression Model for School Absences

Fitting Model: Fit a Poisson regression with the student's days absent as the response variable.
- Record the coefficient estimates and note which variables are significant.
Observed Results: Comparing the mean of days absent with its variance shows strong overdispersion.
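After a model has been fitted, overdispersion is often summarised by the Pearson dispersion statistic: the Pearson chi-square divided by the residual degrees of freedom, with values well above 1 pointing to overdispersion. The sketch below uses hypothetical days-absent values and an intercept-only fit (every fitted mean equals the sample mean) to keep the example self-contained.

```python
def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square divided by residual degrees of freedom (n - n_params)."""
    chi_square = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi_square / (len(y) - n_params)

# Hypothetical days absent, for illustration only.
days_absent = [0, 0, 1, 2, 2, 3, 5, 8, 14, 25]

# Intercept-only Poisson fit: the fitted mean is the sample mean for every student.
fitted = [sum(days_absent) / len(days_absent)] * len(days_absent)

dispersion = pearson_dispersion(days_absent, fitted, n_params=1)
print(f"Pearson dispersion statistic: {dispersion:.2f}")
```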

Page 16: Comparing Regression Models

Negative Binomial Regression: More appropriate than Poisson regression for overdispersed data.
Model Fitting Comparison: AIC values indicate that the negative binomial model is the better fit.
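The negative binomial model accommodates overdispersion because its variance grows faster than its mean. The sketch below assumes the common mean/dispersion parameterisation, in which the variance is μ + μ²/θ, and checks this numerically by summing the PMF; the values μ = 5 and θ = 2 are hypothetical.

```python
import math

def negbin_pmf(y, mu, theta):
    """Negative binomial PMF with mean mu and dispersion theta (variance mu + mu^2 / theta)."""
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)

def mean_and_variance(pmf, upper=2000):
    """Mean and variance obtained by summing a PMF over 0..upper-1."""
    mean = sum(y * pmf(y) for y in range(upper))
    second_moment = sum(y * y * pmf(y) for y in range(upper))
    return mean, second_moment - mean ** 2

mu, theta = 5.0, 2.0  # hypothetical values
nb_mean, nb_var = mean_and_variance(lambda y: negbin_pmf(y, mu, theta))

print(f"negative binomial: mean = {nb_mean:.3f}, variance = {nb_var:.3f}")
print(f"Poisson with the same mean would have variance {mu:.3f}")
```

As θ grows large the extra term μ²/θ vanishes and the negative binomial approaches the Poisson.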

Page 17: Generalised Linear Models Summary

GLM Framework: Relates the predictors to the mean of the response through a link function, extending traditional linear regression to non-normal responses.
Functionality: Allows the variance to depend on the predicted mean, which makes the model more flexible.
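The two ingredients named above, a link function and a variance that depends on the mean, can be laid out side by side for a few standard families. The table below is an illustrative sketch showing the canonical link for each family; the dictionary layout and names are not taken from any particular library.

```python
import math

# Canonical link, its inverse, and the variance function for three GLM families.
FAMILIES = {
    "gaussian": {
        "link": lambda mu: mu,                       # identity
        "inverse": lambda eta: eta,
        "variance": lambda mu: 1.0,                  # constant variance
    },
    "poisson": {
        "link": lambda mu: math.log(mu),             # log
        "inverse": lambda eta: math.exp(eta),
        "variance": lambda mu: mu,                   # variance equals the mean
    },
    "binomial": {
        "link": lambda mu: math.log(mu / (1 - mu)),  # logit
        "inverse": lambda eta: 1 / (1 + math.exp(-eta)),
        "variance": lambda mu: mu * (1 - mu),
    },
}

mu = 0.3  # a mean value that is valid for all three families
for name, fam in FAMILIES.items():
    eta = fam["link"](mu)
    print(f"{name:8s} link(mu) = {eta:7.4f}, inverse = {fam['inverse'](eta):.4f}, "
          f"variance = {fam['variance'](mu):.4f}")
```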
