GB307 Generalized Linear Model

45 Terms

1
New cards

What does standard linear regression assume?

  • the relationship between X’s and Y is linear

  • the errors are normally distributed

2
New cards

Standard Linear Regression Equation

Y = β0 + β1x1 + ⋯ + ε, where ε is a normally distributed error term

3
New cards

What are Generalized Linear Models (GLMs)?

An extension of linear regression that allows for non-normal response distributions and a flexible link between the predictors and the mean of the outcome.

4
New cards

What are the two key features that make GLMs different from linear regression?

  • The response variable Y can have a non-normal distribution.

  • The mean of Y, E(Y|X), is linked to a linear combination of predictors through a function.

5
New cards

What does this equation mean in GLMs?

the mean function

6
New cards

What is the link function in a GLM?

The link function G(⋅) connects the expected value of Y to the linear combination of predictors: G(E(Y|X)) = β0 + β1x1 + ⋯

7
New cards

Why use GLMs instead of linear regression?

Linear regression assumes normally distributed errors with constant variance. GLMs allow for other response distributions (such as binomial or Poisson) and more flexible relationships between the predictors and the mean.

8
New cards

What is the identity link function in GLMs used for?

  • Used for linear relationships.

  • Link: β0 + β1x1 = E(Y|X)

  • Mean: E(Y∣X) = β0 + β1x1

9
New cards

When should you use the log link function in a GLM?

  • When the mean must be positive, like with count data.

  • Link: β0 + β1x1 = ln(E(Y|X))

  • Mean: E(Y|X) = e^(β0 + β1x1)

10
New cards

What kind of relationships does the power link handle in GLMs?

  • Used for curved (non-linear) relationships.

  • Link: β0 + β1x1 = E(Y|X)^a

  • Mean: E(Y|X) = (β0 + β1x1)^(1/a)

11
New cards

How are the link and mean functions related in GLMs?

The link function transforms the mean so the model can be expressed as a linear combination of predictors. The mean function is the inverse of the link function.
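
The relationship between link and mean functions can be checked numerically. The sketch below is only an illustration: the coefficients, the predictor value, and the power exponent a = 2 are made-up assumptions, and the helper names are hypothetical.

```python
import math

# Power-link exponent; a = 2 is an arbitrary choice for illustration.
A = 2.0

# Each entry pairs a link function G with its inverse, the mean function.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "power": (lambda mu: mu ** A, lambda eta: eta ** (1.0 / A)),
}

def linear_predictor(x1, b0, b1):
    """Systematic component: b0 + b1 * x1."""
    return b0 + b1 * x1

# Hypothetical coefficients and predictor value: 0.5 + 0.25 * 2.0 = 1.0
eta = linear_predictor(x1=2.0, b0=0.5, b1=0.25)

for name, (link, mean) in LINKS.items():
    mu = mean(eta)                      # E(Y|X) implied by the linear predictor
    assert math.isclose(link(mu), eta)  # applying the link recovers the linear predictor
    print(f"{name:8s} E(Y|X) = {mu:.4f}")
```

With the log link, the same linear predictor of 1.0 maps to a mean of e, which is always positive regardless of the sign of the linear predictor.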

12
New cards

When should you use the Normal distribution for P(Y|X)?

Use it when the outcome is bell-shaped, can be positive or negative, and you're modeling averages (e.g., sales, stock changes)

13
New cards

What kind of data is appropriate for the Gamma distribution?

Use it when the response is always positive and may be skewed, like wait times, durations, or time between events.

14
New cards

How is the Normal distribution shaped and what can it model?

The Normal distribution is symmetric and bell-shaped, and it can model values that are negative or positive. Great for modeling things like sales or returns

15
New cards

Why would you choose the Gamma distribution over Normal?

Gamma is used when the data is strictly positive and may be skewed (e.g., time, rates). Normal can’t model skew or enforce positive-only outcomes.

16
New cards

When should you use the Bernoulli distribution for P(Y|X)?

Use it when the outcome is binary (0 or 1), like yes/no or success/failure questions

17
New cards

What kind of data does the Bernoulli distribution model?

Binary outcomes — data with only two possible values (0 or 1). Great for modeling the probability of an event happening.

18
New cards

When should you use the Poisson or Negative Binomial distribution?

Use them for count data:

  • Poisson: assumes the variance equals the mean

  • Negative Binomial: handles extra variation (overdispersion), where the variance exceeds the mean

Examples:

  • Number of customers per hour

  • Number of defects per product
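
One informal way to choose between the two is to compare the sample variance of the counts with their mean. The sketch below uses invented hourly counts and a hypothetical helper name; it is a rough diagnostic, not a formal test.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; close to 1 suggests Poisson is plausible."""
    return variance(counts) / mean(counts)

# Hypothetical hourly customer counts (both series have mean 5).
poisson_like = [3, 5, 7, 4, 6, 2, 8, 5]
bursty = [0, 1, 0, 14, 2, 0, 19, 4]

for label, counts in [("poisson_like", poisson_like), ("bursty", bursty)]:
    print(label, round(dispersion_ratio(counts), 2))
```

A ratio far above 1, as in the second series, points toward the Negative Binomial.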

19
New cards

What types of data are modeled with Poisson or Negative Binomial?

Non-negative integers (0, 1, 2, …), such as how many times something happens. These models are used when the outcome is a count.

20
New cards

Normal Distribution

bell shaped, continuous

21
New cards

Gamma Distribution

positive, continuous, skewed

22
New cards

Bernoulli Distribution

Binary

23
New cards

Poisson/Negative Binomial Distribution

non-negative integers (counts)

24
New cards

What is the goal when fitting a GLM?

To estimate the coefficients β̂ and use them to predict the expected value or other characteristics of Y through the fitted mean function: E(Y|X) = G^(-1)(β̂0 + β̂1x1 + ⋯)

25
New cards

How do we estimate the coefficients β̂ in a GLM?

We use maximum likelihood estimation: we choose the values of β̂ that maximize the probability of observing our data given the model.

26
New cards

What does maximizing the likelihood mean in GLMs?

It means finding the β̂ that gives the highest joint probability of all observed Yi values given their predictors Xi: β̂ = argmax over β of ∏ P(Yi|Xi; β), with the product taken over i = 1, …, n

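
As a rough illustration of maximizing a likelihood, the sketch below fits an intercept-only Bernoulli model by brute-force grid search. The data are invented, and the grid search is a stand-in chosen for readability rather than the iterative algorithms statistical software typically uses.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, ys):
    """Sum of ln P(Yi) for an intercept-only Bernoulli model with P(Y=1) = sigmoid(b0)."""
    p = sigmoid(b0)
    return sum(math.log(p) if y == 1 else math.log(1.0 - p) for y in ys)

# Hypothetical binary outcomes: 7 successes out of 10.
ys = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]

# Candidate intercepts from -5 to 5 in steps of 0.001.
grid = [i / 1000.0 for i in range(-5000, 5001)]
b0_hat = max(grid, key=lambda b0: log_likelihood(b0, ys))

print(b0_hat, sigmoid(b0_hat))
```

The maximizing intercept corresponds to a fitted probability near 0.7, the observed share of successes, which is what maximum likelihood gives for this simple model.
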
27
New cards

What is the average likelihood in a GLM?

  • It is the geometric mean of the individual likelihoods: (∏ P(Yi|Xi))^(1/n), with the product taken over all n observations

  • It represents the average probability of observing each outcome and is mostly used in discrete cases

28
New cards

Why is the average log-likelihood often used in continuous cases?

Because the average likelihood can become very small, especially with many observations. Taking the log makes the value more interpretable and numerically stable.

29
New cards

What is the formula for the average log-likelihood?

It’s the mean of the log-probabilities: (1/n) Σ ln P(Yi|Xi), with the sum taken over all n observations

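
The two averages are closely related, which a few lines of Python can show. The probabilities below are invented for illustration.

```python
import math

# Hypothetical probabilities a model assigned to the outcomes actually observed.
probs = [0.9, 0.8, 0.6, 0.7]
n = len(probs)

avg_likelihood = math.prod(probs) ** (1.0 / n)             # geometric mean
avg_log_likelihood = sum(math.log(p) for p in probs) / n   # mean log-probability

# Exponentiating the average log-likelihood recovers the average likelihood.
assert math.isclose(math.exp(avg_log_likelihood), avg_likelihood)

# With many observations the raw product underflows, while the log sum stays finite.
many = [0.5] * 5000
raw_product = math.prod(many)
log_sum = sum(math.log(p) for p in many)
print(avg_likelihood, avg_log_likelihood, raw_product, log_sum)
```

Working on the log scale avoids the underflow seen in `raw_product`, which is one practical reason logs are preferred.
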
30
New cards

What does the log-likelihood measure in a model?

It measures how well the model explains the data. Higher values mean the model assigns higher probability to the observed outcomes.

31
New cards

What does the Average Likelihood Ratio (ALR) compare?

It compares how well two models (A and B) explain the data by taking the geometric average of the per-observation likelihood ratios: ALR = (∏ P_A(Yi|Xi) / P_B(Yi|Xi))^(1/n)

32
New cards

What does an Average Likelihood Ratio of 3 mean?

It means that, on average, each observation is 3 times more likely under model A than model B

33
New cards

Why use a geometric average for likelihood ratios?

Because individual likelihoods are multiplied together (not added), using the geometric mean gives a more balanced comparison across all observations
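
A minimal sketch of the Average Likelihood Ratio, assuming it is the geometric mean of the per-observation ratios of model A to model B. The function name and the probabilities are hypothetical.

```python
import math

def average_likelihood_ratio(probs_a, probs_b):
    """Geometric mean of per-observation likelihood ratios, model A over model B."""
    n = len(probs_a)
    log_ratio = sum(math.log(a) - math.log(b) for a, b in zip(probs_a, probs_b))
    return math.exp(log_ratio / n)

# Hypothetical probabilities each model assigned to the same observed outcomes.
model_a = [0.8, 0.6, 0.9, 0.6]
model_b = [0.4, 0.3, 0.3, 0.6]

alr = average_likelihood_ratio(model_a, model_b)
print(round(alr, 3))
```

The per-observation ratios here are 2, 2, 3, and 1, so the result is the fourth root of 12, about 1.86. Computing it through logs mirrors the fact that likelihoods multiply.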

34
New cards

What is the Akaike Information Criterion (AIC) used for?

To compare models that may have different numbers of predictors. It adjusts for complexity so models with more variables don’t get an unfair advantage

35
New cards

What is the formula for AIC?

AIC = 2K − 2 ln P(Y|X)

  • K = number of estimated parameters (predictors)

  • P(Y|X) = maximized likelihood of the model

36
New cards

How should AIC be interpreted when comparing models?

Lower AIC is better. It means a better balance between fit and simplicity.
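
As a small worked example, the sketch below applies the standard definition AIC = 2K − 2 ln(likelihood) to two hypothetical models. The log-likelihoods and parameter counts are invented.

```python
def aic(log_likelihood, k):
    """AIC = 2K - 2 ln(likelihood); lower values indicate a better trade-off."""
    return 2 * k - 2 * log_likelihood

# Hypothetical maximized log-likelihoods and parameter counts.
small = aic(log_likelihood=-120.0, k=3)  # 2*3 + 240 = 246.0
large = aic(log_likelihood=-119.5, k=5)  # 2*5 + 239 = 249.0

best = "small" if small < large else "large"
print(small, large, best)
```

The larger model fits slightly better, but not by enough to offset the penalty for its two extra parameters.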

37
New cards

What does it mean if one model has an AIC 2 units lower than another?

By the rule of thumb used here, that model is considered meaningfully better. This is a convention rather than a formal significance test.

38
New cards

What kind of distribution does a Logit Model use?

A Bernoulli distribution, because the outcome is binary (either 0 or 1). It models the probability that Y = 1 given X

39
New cards

What does the Logit Model predict?

It predicts the probability that the outcome is 1, using the inverse of the logit link function

40
New cards

What is the mean function of a Logit Model?

E(Y|X) = 1 / (1 + e^(−(β0 + β1x1 + ⋯)))

This ensures the predicted value (probability) is always between 0 and 1.

41
New cards

Why does the Logit model use a logistic curve?

Because it models probability smoothly between 0 and 1 and captures the S-shaped (sigmoid) relationship often seen in binary outcomes
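
The S-shape can be seen by evaluating the logistic function at a few points. The coefficients below are made up, and the function names are hypothetical.

```python
import math

def logistic(eta):
    """Mean function of the logit model: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def predict_probability(x1, b0, b1):
    """P(Y = 1 | x1) for a one-predictor logit model."""
    return logistic(b0 + b1 * x1)

# Hypothetical coefficients.
b0, b1 = -2.0, 0.5

for x1 in [0, 2, 4, 6, 8]:
    print(x1, round(predict_probability(x1, b0, b1), 3))
```

At x1 = 4 the linear part is zero, so the predicted probability is exactly 0.5. Values on either side move toward 0 or 1 without reaching them.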

42
New cards

What is the systematic component in a Logit Model?

  • It’s the linear combination of the predictors: β0 + β1x1 + ⋯

  • This part models the log-odds of success

43
New cards

What is the log-odds formula in logistic regression?

ln(P(Y=1|X) / (1 − P(Y=1|X))) = β0 + β1x1 + ⋯

This shows how the predictors influence the log of the odds of the outcome being 1

44
New cards

When is the predicted probability E(Y|X) = 0.5 in a logit model?

When the linear part (systematic component) equals zero: β0 + β1x1 = 0

45
New cards

What do odds of 7:1 mean?

It means there are 7 successes for every 1 failure, so the probability of success is 7 / (7 + 1) = 0.875

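
The conversion between odds, probability, and log-odds can be sketched in a few lines. The function names are hypothetical.

```python
import math

def odds_to_probability(successes, failures):
    """Convert odds of successes:failures into a probability of success."""
    return successes / (successes + failures)

def probability_to_log_odds(p):
    """Log-odds (logit) of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

p = odds_to_probability(7, 1)
log_odds = probability_to_log_odds(p)
print(p)         # 0.875
print(log_odds)  # ln(7)
```

Odds of 1:1 give a probability of 0.5 and log-odds of 0, which matches the point where the linear part of a logit model equals zero.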