Week 9 Lecture Recording

Introduction to Logistic Regression

  • Focus on binary logistic regression and its application to categorical dependent variables.

  • Objectives:

    • Define binary logistic regression.

    • Compute and interpret odds ratios.

    • Assess logistic regression model fit using the Chi-Square statistic, Wald statistics, −2 log likelihood, and confidence intervals.
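The Chi-Square and −2 log likelihood statistics listed above are related: the model Chi-Square is the drop in −2 log likelihood when predictors are added to the baseline model. A minimal sketch, using hypothetical −2LL values (not figures from the lecture):

```python
# Likelihood-ratio Chi-Square from two -2 log likelihood values.
# Both -2LL values below are hypothetical, for illustration only.
null_m2ll = 110.5   # baseline (intercept-only) model
full_m2ll = 96.2    # model with three predictors added

# The improvement in fit; compare against a Chi-Square distribution
# with df = number of predictors added (here, df = 3).
chi_square = null_m2ll - full_m2ll
print(chi_square)   # ~14.3
```

If this Chi-Square exceeds the critical value for its degrees of freedom (p < 0.05), the predictors significantly improve the model.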

Differences Between Multiple Regression and Logistic Regression

  • Multiple Regression:

    • Used to predict the value of a continuous outcome from known predictor values.

    • Dependent Variable (DV) is continuous.

    • Assumes linear relationships using least squares estimation.

    • Requires normally distributed variables and equal variance.

  • Logistic Regression:

    • Determines probability of a particular categorical outcome.

    • DV is categorical (nominal).

    • Explanatory variables (EVs) can be continuous or categorical.

    • Uses logit transformation (non-linear relationship).

    • Employs maximum likelihood estimation instead of least squares.

    • More flexible assumptions than multiple regression.
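The maximum likelihood estimation mentioned above can be sketched in a few lines: instead of minimising squared residuals, the coefficients are chosen to maximise the log-likelihood of the observed 0/1 outcomes. A minimal gradient-ascent sketch with toy, hypothetical data (real software such as SPSS uses faster iterative algorithms):

```python
import math

def neg2_log_likelihood(b0, b1, xs, ys):
    """-2 log likelihood of a one-predictor logistic model."""
    ll = 0.0
    for x, y in zip(xs, ys):
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))
        ll += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -2 * ll

# Toy, hypothetical data: a continuous predictor x and a 0/1 outcome y.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

# Maximise the log-likelihood by gradient ascent on (b0, b1).
b0, b1, lr = 0.0, 0.0, 0.01
for _ in range(20000):
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))
        g0 += y - p          # gradient w.r.t. the intercept b0
        g1 += (y - p) * x    # gradient w.r.t. the slope b1
    b0 += lr * g0
    b1 += lr * g1

print(b0, b1, neg2_log_likelihood(b0, b1, xs, ys))
```

The fitted model's −2 log likelihood is lower than that of the null model (b0 = b1 = 0), which is exactly the improvement the Chi-Square test evaluates.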

Understanding Logistic Regression

  • Logistic regression is a statistical technique to examine the relationship between independent variables (predictors) and a dependent variable (criterion).

  • DV outcomes can represent group membership (e.g., signing up for a swimming lesson: yes or no).

  • The goal is to predict the probability of an event occurring based on known values.

  • Logistic regression transforms the DV into the natural log of the odds (logit).
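The logit transformation above is simply the natural log of the odds. A short sketch of the probability → odds → logit mapping (the probability value 0.8 is an arbitrary illustration):

```python
import math

def logit(p):
    """Natural log of the odds for probability p."""
    return math.log(p / (1 - p))

def inverse_logit(z):
    """Map a logit back onto the 0-1 probability scale."""
    return 1 / (1 + math.exp(-z))

p = 0.8                 # hypothetical probability of signing up
odds = p / (1 - p)      # 4.0: the event is 4x as likely as not
z = logit(p)            # ~1.386
print(odds, z, inverse_logit(z))
```

Because the logit is unbounded, it can be modelled as a linear function of the predictors even though the probability itself is confined between 0 and 1.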

Types of Logistic Regression

  • Binary Logistic Regression:

    • Dependent variable has two possible outcomes (e.g., success/failure).

  • Multinomial Logistic Regression:

    • Dependent variable has more than two outcomes (greater than two levels).

Example of Binary Logistic Regression

  • Hypothetical study predicting successful enrollment into the master's psychology program.

  • Three predictor variables: interest in the program, previous degree score, and possession of a psychology degree (categorical).

  • Outcome variable is enrollment status (yes/no).

Running Logistic Regression in SPSS

  • Initial Model (Block 0):

    • Shows classification accuracy at 50% (random prediction).

  • Model Results:

    • Omnibus test assesses overall model fit.

    • If p-value < 0.05, predictors significantly improve predictive accuracy.

    • −2 log likelihood (−2LL) provides information on model fit (lower values indicate better fit).

    • Goodness-of-fit tests (e.g., the Hosmer and Lemeshow test) indicate how well the predicted probabilities match the observed outcomes.

  • Effect Size and Predictors:

    • Nagelkerke pseudo R-square indicates the explanatory power of the model.

    • Individual predictors evaluated for significance (p-value).

    • Odds Ratio (OR) interpretation, e.g., a 1-unit increase in previous degree score is associated with a 17.4% increase in the odds of enrollment (OR ≈ 1.174).
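The odds ratio SPSS reports as Exp(B) is just the exponential of the logit coefficient B. A minimal sketch of the 17.4% interpretation above (the coefficient value is back-derived for illustration, not taken from the lecture output):

```python
import math

# SPSS reports B (the logit coefficient) and Exp(B), the odds ratio.
b = 0.160                      # hypothetical B for previous degree score
odds_ratio = math.exp(b)       # Exp(B) ~ 1.174

# Each 1-unit increase in the predictor multiplies the odds by Exp(B).
pct_change = (odds_ratio - 1) * 100
print(f"OR = {odds_ratio:.3f} -> a 1-unit increase raises the odds "
      f"by {pct_change:.1f}%")
```

An OR above 1 means the predictor increases the odds of the outcome; an OR below 1 means it decreases them.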

Steps in Logistic Regression Analysis

  • Model Building Approaches:

    • Forced Entry (Enter): All variables entered simultaneously.

    • Hierarchical Entry: Variables entered in blocks based on prior knowledge.

    • Stepwise Entry: Exploratory approach that adds or removes variables based on statistical criteria.

  • Parsimony Principle:

    • Include predictors only if they contribute significantly to the model.

Complexities of Running Logistic Regression

  • A full analysis requires testing further assumptions, including checking residuals and identifying influential cases; these steps may not be covered in basic analysis courses.

Alternative Example Using Interventions

  • An example illustrates treatment outcomes (cured/not cured) with a continuous predictor (days before treatment).

  • Hierarchical modeling showed no additional variance explained by adding predictors sequentially.

Interpretation of Results in SPSS

  • Output shows classification frequencies, the improvement in likelihood as predictors are added, and the Chi-square significance value, which indicates model fit.

  • Odds ratios provide insights into increased likelihood of desired outcomes based on predictor metrics.

Conclusion

  • Key components for reporting logistic regression include:

    • Model fit statistics (Chi-Square results, log likelihood).

    • Significance of predictors via Wald statistics and confidence intervals.

    • Summary tables should include coefficients (B), odds ratios (Exp(B)), and effect sizes for clearer interpretation.

  • Next Steps:

    • Students will learn to perform and report binary logistic regression in computer labs.