Chapter 11: Binary Dependent Variables and Probit/Logit Models

Description and Tags

Flashcards covering binary dependent variables, linear probability model limitations, probit and logit models, estimation and inference (including robust SEs), and the HMDA loan data example with interpretation of key variables such as pirat and black.

17 Terms

1
What does LPM stand for and what is the nature of the dependent variable Y in LPM?

LPM stands for Linear Probability Model. The dependent variable Y is binary, taking values 0 or 1.

2
In the Linear Probability Model, how is E[Y|X] defined and what does it represent?

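Because Y is binary, E[Y|X] = 0·Pr(Y=0|X) + 1·Pr(Y=1|X) = Pr(Y=1|X). The LPM models this probability as a linear function of the regressors: Pr(Y=1|X) = β0 + β1X1 + β2X2 + …, so each coefficient is the change in Pr(Y=1|X) for a one-unit change in its regressor, holding the others fixed.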

3
Why is Var(u|X) a concern in the LPM, and what is its typical form?

Var(u|X) = p(X)[1 − p(X)], where p(X) = Pr(Y=1|X). Because this variance depends on X, the errors are heteroskedastic and homoskedasticity-only standard errors are invalid; heteroskedasticity-robust standard errors should be used instead.
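
The variance formula follows from the two values the LPM error can take: u = 1 − p(X) with probability p(X), and u = −p(X) with probability 1 − p(X). A minimal Python sketch of that calculation (the function name and the probabilities 0.1, 0.5, 0.9 are illustrative choices, not taken from the notes):

```python
def lpm_error_variance(p):
    """Variance of the LPM error u = Y - p when Pr(Y=1|X) = p."""
    # u equals (1 - p) with probability p and (-p) with probability (1 - p).
    mean_u = p * (1 - p) + (1 - p) * (-p)  # the error has mean zero
    return p * ((1 - p) - mean_u) ** 2 + (1 - p) * ((-p) - mean_u) ** 2

for p in (0.1, 0.5, 0.9):
    # The two-point variance matches p * (1 - p) and changes with p,
    # which is exactly the heteroskedasticity described on the card.
    print(p, lpm_error_variance(p), p * (1 - p))
```

Since p(X) moves with X, so does the error variance, which is why the LPM is heteroskedastic by construction.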

4
What are probit and logit models used for in binary outcome analysis?

They are nonlinear models that keep Pr(Y=1|X) between 0 and 1 by passing a linear index through a CDF: Pr(Y=1|X) = Φ(β0+β'X) for probit and F(β0+β'X) for logit.

5
What functions are Φ and F in probit and logit models?

Φ is the standard normal CDF (probit); F is the standard logistic CDF, F(z) = 1/(1 + e^(−z)) (logit). Both map the latent index to a probability between 0 and 1.
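
The two link functions can be written out directly. This is a small Python sketch using only the standard library; the function names `probit_cdf` and `logit_cdf` are made up for illustration, and the notes themselves work in R:

```python
import math

def probit_cdf(z):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit_cdf(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

# Both functions rise from near 0 to near 1 as the index z increases.
for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(z, round(probit_cdf(z), 3), round(logit_cdf(z), 3))
```

Whatever value the index takes, both functions return a number strictly between 0 and 1, which is the property the linear probability model lacks.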

6
How do you estimate probit/logit models in R, and how do you obtain robust standard errors?

Use glm() with family = binomial(link = 'probit') or family = binomial(link = 'logit'). To obtain robust SEs, use coeftest() from the lmtest package with vcov. = vcovHC(model, type = 'HC1') from the sandwich package.

7
Why are coefficients in probit/logit not directly interpreted as changes in probability, and what should you use instead?

Because the relationship is non-linear; coefficients reflect changes in the latent index, not direct probability changes. Use marginal effects or predicted probabilities to interpret probability changes.
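
To see why a single coefficient does not translate into a single probability change, the sketch below evaluates the probit marginal effect, φ(β0 + β1X)·β1, at several values of X. It is a Python illustration rather than the notes' R workflow; the coefficients are the ones from the manual probit example in this deck, while the function names and the chosen X values are assumptions made for the example:

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def probit_marginal_effect(b0, b1, x):
    """Derivative of Phi(b0 + b1 * x) with respect to x."""
    return normal_pdf(b0 + b1 * x) * b1

b0, b1 = -2.19, 2.97  # coefficients from the manual probit example
for x in (0.1, 0.4, 0.7):
    # Same coefficient b1, but a different effect on Pr(Y=1|X) at each x.
    print(x, round(probit_marginal_effect(b0, b1, x), 3))
```

The effect depends on where X starts, so any statement about probability changes has to name the values of X at which it is evaluated.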

8
How do you compute a predicted probability for a given X in a probit or logit model?

Use predict(model, newdata = data.frame(…), type = 'response') to obtain Pr(Y=1|X).

9
How do you compute the difference in predicted probabilities when a covariate changes from x1 to x2?

Compute p̂ at x1 and p̂ at x2 using predict with newdata, then take the difference p̂(x2) − p̂(x1).

10
In the probit manual example, with β0 = -2.19, β1 = 2.97 and X = 0.4, what is Pr(Y=1|X)?

Pr(Y=1|X) = Φ(-2.19 + 2.97*0.4) = Φ(-1.002) ≈ Φ(-1) ≈ 0.159.

11
Continuing the probit example, what is Pr(Y=1|X) for X = 0.5 and what is the difference when X goes from 0.4 to 0.5?

Pr(Y=1|X=0.5) = Φ(-2.19 + 2.97*0.5) = Φ(-0.705) ≈ 0.24. The difference from X = 0.4 is about 0.08 (roughly 8 percentage points).
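
The hand calculation can be reproduced in a few lines. This is a Python sketch with the standard library only (the notes use R); `probit_cdf` is a helper name invented here, and the coefficients are those given on the cards. Because the code keeps the unrounded index, its results can differ in the third decimal from figures obtained by rounding the index first:

```python
import math

def probit_cdf(z):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

b0, b1 = -2.19, 2.97  # coefficients from the manual probit example

p_04 = probit_cdf(b0 + b1 * 0.4)  # predicted probability at X = 0.4
p_05 = probit_cdf(b0 + b1 * 0.5)  # predicted probability at X = 0.5
diff = p_05 - p_04                # change in predicted probability

print(round(p_04, 3), round(p_05, 3), round(diff, 3))
```

This follows the same two-step recipe as the difference-in-predicted-probabilities card: evaluate the fitted probability at each value of X, then subtract.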

12
What is the purpose of adding the 'black' variable in the HMDA probit model, as shown in the notes?

To test whether race (black vs. non-black) affects loan denial probabilities, controlling for other covariates.

13
What is the practical takeaway regarding probit vs. logit curves in these notes?

Both produce S-shaped curves; they yield similar predicted probabilities; choice of link rarely changes conclusions, and results should be interpreted via predicted probabilities or marginal effects.
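
One way to see how close the two S-shaped curves are is to compare them on a common scale. The sketch below is a Python illustration, not part of the notes; it assumes the commonly used rescaling factor of about 1.702 for the logistic index, which is an assumption introduced here. Fitted probit and logit coefficients differ in scale for the same reason, even when their predicted probabilities are close:

```python
import math

def probit_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit_cdf(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

SCALE = 1.702  # assumed rescaling factor, not taken from the notes

# Largest gap between the two curves on a grid of index values from -4 to 4.
grid = [i / 100.0 for i in range(-400, 401)]
max_gap = max(abs(probit_cdf(z) - logit_cdf(SCALE * z)) for z in grid)
print(round(max_gap, 4))
```

After rescaling, the curves differ by less than one percentage point everywhere on this grid, which is consistent with the card's point that the choice of link rarely changes conclusions.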

14
What estimation method is used for probit/logit models instead of OLS?

Probit and logit models are estimated by Maximum Likelihood Estimation (MLE) rather than OLS, because Pr(Y=1|X) is a nonlinear function of the coefficients.
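
As a rough illustration of what maximum likelihood does, the sketch below fits a one-regressor logit by Newton's method in plain Python. Logit is used because its score has a simple form; the toy data, function names, and iteration count are all made up for this example and are not from the notes or the HMDA sample. In practice the notes rely on R's glm for this step:

```python
import math

def logistic(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

def score(b0, b1, x, y):
    """Gradient of the logit log-likelihood with respect to (b0, b1)."""
    resid = [yi - logistic(b0 + b1 * xi) for xi, yi in zip(x, y)]
    return sum(resid), sum(r * xi for r, xi in zip(resid, x))

def fit_logit(x, y, steps=25):
    """Maximize the logit log-likelihood with Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        g0, g1 = score(b0, b1, x, y)
        # Weights p(1 - p) build the information matrix (negative Hessian).
        w = [p * (1.0 - p) for p in (logistic(b0 + b1 * xi) for xi in x)]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^(-1) times the score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up toy data: x is a single regressor, y is a 0/1 outcome.
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0_hat, b1_hat = fit_logit(x, y)
print(b0_hat, b1_hat)
```

At the maximum the score is zero, which is the first-order condition that replaces the OLS normal equations in these models.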

15
What is a key limitation of the LPM when heteroskedasticity is present?

Hypothesis tests based on homoskedasticity-only standard errors are invalid because Var(u|X) is not constant; heteroskedasticity-robust standard errors are needed.

16
How can probability outcomes be extended beyond binary in these models, as hinted in the notes?

Y can be extended to multiple categories (e.g., Y ∈ {0,1,2}) which would require multinomial or ordered probit/logit models.

17
Which key threats to internal validity are listed in the notes (Page 12)?

Omitted variables, misspecified functional form, measurement error, sample selection bias, and simultaneity.