Week 9 Lecture Recording
Introduction to Logistic Regression
Focus on binary logistic regression and its application to categorical dependent variables.
Objectives:
Define binary logistic regression.
Compute and interpret odds ratios.
Assess logistic regression models using chi-square tests, Wald statistics, −2 log likelihood, and confidence intervals.
Differences Between Multiple Regression and Logistic Regression
Multiple Regression:
Predicts the value of a continuous outcome from known predictor values.
Dependent Variable (DV) is continuous.
Assumes linear relationships using least squares estimation.
Requires normally distributed variables and equal variance.
Logistic Regression:
Determines probability of a particular categorical outcome.
DV is categorical (nominal).
Explanatory variables (EVs) can be continuous or categorical.
Uses logit transformation (non-linear relationship).
Employs maximum likelihood estimation instead of least squares.
More flexible assumptions than multiple regression.
Understanding Logistic Regression
Logistic regression is a statistical technique to examine the relationship between independent variables (predictors) and a dependent variable (criterion).
DV outcomes can represent group membership (e.g., signing up for a swimming lesson: yes or no).
The goal is to predict the probability of an event occurring based on known values.
Logistic regression transforms the DV into the natural log of the odds (logit).
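The logit transformation can be sketched in a few lines of Python; the function name is illustrative, but the arithmetic (odds = p / (1 − p), logit = natural log of the odds) follows directly from the definition above.

```python
import math

def logit(p):
    """Natural log of the odds for a probability p (0 < p < 1)."""
    return math.log(p / (1 - p))

# A probability of 0.8 corresponds to odds of 4 to 1.
p = 0.8
odds = p / (1 - p)       # 4.0
log_odds = logit(p)      # about 1.386
print(odds, round(log_odds, 3))
```

The logit maps probabilities (bounded between 0 and 1) onto the whole real line, which is what lets the model relate them linearly to the predictors.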
Types of Logistic Regression
Binary Logistic Regression:
Dependent variable has two possible outcomes (e.g., success/failure).
Multinomial Logistic Regression:
Dependent variable has more than two outcome categories.
Example of Binary Logistic Regression
Hypothetical study predicting successful enrollment into the master's psychology program.
Three predictor variables: interest in the program, previous degree score, and possession of a psychology degree (categorical).
Outcome variable is enrollment status (yes/no).
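The lecture runs this example in SPSS, but the fitting idea can be sketched in plain Python. The data below are invented for illustration (they are not the lecture's data), and only one predictor (previous degree score) is used; the coefficients are estimated by maximum likelihood via simple gradient ascent, standing in for the iterative estimation SPSS performs.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data (invented for illustration): previous degree score (x)
# and enrollment outcome (y: 1 = enrolled, 0 = not enrolled).
x = [52, 55, 58, 60, 63, 65, 68, 70, 73, 78]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

# Center the predictor so the optimisation is well conditioned.
mean_x = sum(x) / len(x)
xc = [xi - mean_x for xi in x]

# Estimate intercept b0 and slope b1 by maximising the log-likelihood
# with plain gradient ascent.
b0, b1 = 0.0, 0.0
lr = 0.005
for _ in range(20000):
    g0 = sum(yi - sigmoid(b0 + b1 * xi) for xi, yi in zip(xc, y))
    g1 = sum((yi - sigmoid(b0 + b1 * xi)) * xi for xi, yi in zip(xc, y))
    b0 += lr * g0
    b1 += lr * g1

# A positive slope means higher previous scores raise the odds of enrollment.
print(round(b1, 3))
```

With this toy data the slope comes out positive, so the model predicts a higher enrollment probability for higher previous scores, mirroring the direction of the effect discussed in the lecture.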
Running Logistic Regression in SPSS
Initial Model (Block 0):
The constant-only model classifies every case into the larger group; with equal group sizes, accuracy is 50% (no better than chance).
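The Block 0 baseline can be reproduced by hand: since the constant-only model predicts the modal outcome for every case, its classification accuracy is just the proportion of the larger group. The outcome counts below are hypothetical.

```python
# Hypothetical sample: 30 enrolled (1), 30 not enrolled (0).
outcomes = [1] * 30 + [0] * 30

# Block 0 predicts the modal category for everyone, so its accuracy
# equals the proportion of the larger group.
baseline = max(outcomes.count(0), outcomes.count(1)) / len(outcomes)
print(baseline)  # 0.5 when the groups are equal in size
```

Any fitted model is judged against this baseline: predictors are useful only if they classify cases better than always guessing the larger group.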
Model Results:
Omnibus test assesses overall model fit.
If p-value < 0.05, predictors significantly improve predictive accuracy.
−2 log likelihood provides information on model fit (lower values indicate better fit).
Goodness-of-fit tests (e.g., the Hosmer and Lemeshow test) indicate how well predicted probabilities match the observed data; a non-significant result suggests adequate fit.
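The −2 log likelihood and the omnibus chi-square are related in a simple way: the chi-square is the drop in −2 log likelihood when the predictors are added to the constant-only model. A minimal sketch, with hypothetical outcomes and fitted probabilities:

```python
import math

def neg2_log_likelihood(y, p):
    """-2 times the log-likelihood of outcomes y given predicted probabilities p."""
    ll = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
             for yi, pi in zip(y, p))
    return -2 * ll

y = [1, 1, 1, 0, 0, 0]                     # hypothetical observed outcomes
p_null = [0.5] * 6                         # constant-only (Block 0) model
p_model = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # hypothetical fitted probabilities

chi_square = neg2_log_likelihood(y, p_null) - neg2_log_likelihood(y, p_model)
print(round(chi_square, 3))
```

A positive chi-square (compared against the chi-square distribution with degrees of freedom equal to the number of added predictors) is what the omnibus test reports: the predictors lowered −2 log likelihood, i.e. improved fit.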
Effect Size and Predictors:
Nagelkerke pseudo R-square indicates the explanatory power of the model.
Individual predictors evaluated for significance (p-value).
Odds ratio (OR) interpretation, e.g., an OR of 1.174 means a 1-unit increase in previous score is associated with a 17.4% increase in the odds of enrollment.
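The odds ratio is the exponential of the logistic coefficient, and the percentage change in the odds is (OR − 1) × 100. In the sketch below the coefficient is back-derived from the lecture's 17.4% figure purely for illustration:

```python
import math

# Coefficient derived from the lecture's example odds ratio of 1.174
# (a 17.4% increase in the odds per 1-unit increase in previous score).
b = math.log(1.174)            # log-odds change per unit of the predictor
odds_ratio = math.exp(b)       # exponentiating recovers the odds ratio
pct_change = (odds_ratio - 1) * 100
print(round(pct_change, 1))    # 17.4
```

An OR above 1 means the odds of the outcome rise with the predictor; an OR below 1 means they fall (e.g., OR = 0.80 would be a 20% decrease in the odds per unit).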
Steps in Logistic Regression Analysis
Model Building Approaches:
Forced Entry: All predictors entered simultaneously.
Hierarchical Entry: Variables entered in blocks based on prior knowledge or theory.
Stepwise Entry: Exploratory approach in which variables are added or removed based on statistical criteria.
Parsimony Principle:
Include predictors only if they contribute significantly to the model.
Complexities of Running Logistic Regression
A full analysis requires testing further assumptions, including checking residuals and identifying influential cases; these steps may not be covered in basic analysis courses.
Alternative Example Using Interventions
An example illustrates treatment outcomes (cured/not cured) with a continuous predictor (days before treatment).
Hierarchical modeling showed no additional variance explained by adding predictors sequentially.
Interpretation of Results in SPSS
Output shows classification frequencies, the improvement in likelihood as predictors are added, and a chi-square significance value indicating model fit.
Odds ratios provide insights into increased likelihood of desired outcomes based on predictor metrics.
Conclusion
Key components for reporting logistic regression include:
Model fit statistics (Chi-Square results, log likelihood).
Significance of predictors via Wald statistics and confidence intervals.
Summary tables should include the coefficients (betas) and effect sizes (e.g., odds ratios) for clearer interpretation.
Next Steps:
Students will learn to perform and report binary logistic regression in computer labs.