Data-Processing Multiverse Analysis of the Regnerus Study and Its Critics

Study Background and Context

  • Focus of chapter: comprehensive multiverse re-analysis of Mark Regnerus’s 2012 study on outcomes of children raised by gay/lesbian parents (LGBT parents).
  • Original article appeared in Social Science Research and became pivotal in legal and public debates over same-sex marriage and adoption (e.g., US Supreme Court cases).
  • Chapter’s purpose: illustrate how data-processing “researcher degrees of freedom” can rival (or exceed) the impact of control-variable choices.
  • Two multiverse analyses are constructed:
    • Control-Variable Multiverse (11 possible controls → \(2^{11} = 2{,}048\) models)
    • Data-Processing Multiverse (combined with controls → \(\approx 2.65\) million models)
  • Central questions examined:
    • How much result variability comes from selecting control variables?
    • How much comes from decisions about cleaning, coding, weighting, or defining key variables?

Original Regnerus Study (2012a)

  • Dataset: New Family Structures Study (NFSS)
    • 15,000 initial screen; 2,988 full surveys; 236 identified a parental same-sex relationship.
  • Key screening question: “Did either of your parents ever have a romantic relationship with someone of the same sex?” (Mother/Father/No).
  • Baseline regression (Eq. 11.1): \(y_{ij}=\beta_0+\beta_1\,\text{lesbian\_mother}_i+\beta_2\,\text{gay\_father}_i+\sum_{k=3}^{4}\beta_k\,\text{other\_family\_type}_{ik}+\gamma'\,\text{controls}_{ij}+\epsilon_{ij}\)
    • 40 separate dependent variables (education, mental health, civic engagement, etc.) → 80 LGBT coefficients (one lesbian-mother and one gay-father coefficient per outcome).
  • Reported findings: children of LGBT parents worse on many outcomes (e.g., unemployment, public assistance, drug abuse, suicidal ideation).
  • Control set: age, mother’s education, origin-family income, female, white, bullied-as-child (+ state gay-friendly score unavailable in public data).
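A minimal sketch of how a regression shaped like Eq. 11.1 could be fitted for one outcome. Everything here is an assumption for illustration: the data are synthetic (not NFSS), the coefficient values are invented, only two controls are included, and the other family-type dummies are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for the regressors in Eq. 11.1 (hypothetical, not NFSS data).
lesbian_mother = rng.binomial(1, 0.06, n)
gay_father = rng.binomial(1, 0.03, n) * (1 - lesbian_mother)  # mutually exclusive groups
age = rng.uniform(18, 39, n)
female = rng.binomial(1, 0.5, n)

# Outcome generated with made-up coefficients plus noise.
y = (1.0 - 0.8 * lesbian_mother - 0.6 * gay_father
     + 0.02 * age + 0.1 * female + rng.normal(0, 1, n))

# Design matrix: intercept, the two LGBT indicators, then controls.
X = np.column_stack([np.ones(n), lesbian_mother, gay_father, age, female])

# Ordinary least squares via least-squares solve.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Classical (homoskedastic) standard errors.
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

for name, b, s in zip(["const", "lesbian_mother", "gay_father", "age", "female"], beta, se):
    print(f"{name:>15}: {b: .3f} (SE {s:.3f})")
```

Repeating this fit for each of the 40 outcomes and reading off the two LGBT coefficients each time would give the 80 coefficients mentioned above.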

Immediate Scholarly Critiques

  • Cheng & Powell (2015)
    • Added 5 controls (parents’ age at birth, region, metro status, childhood welfare receipt).
    • Flagged extensive misclassification + “prankster” responses; argued 44 % of LGBT cases unreliable.
  • Rosenfeld (2015)
    • Combined 19 outcomes into a signed index; emphasized family transitions instead of Regnerus categories; rejected “bullied” as endogenous.
  • Sherkat (2012) external review considered retraction; highlighted absence of data cleaning.

Multiverse Analysis Framework

Control-Variable Multiverse (2,048 models)

  • 11 potential covariates (6 Regnerus + 5 critics) toggled on/off.
  • Empirical pattern:
    • All 2,048 coefficients negative and significant.
    • Mean coefficient \(\bar{\beta} = -1.0\); modeling SE \(\approx 0.09\) (smaller than sampling SE \(0.16\)).
    • Regnerus six-control model sits on low-magnitude side of distribution.
    • Conclusion: control choice alone barely affects inference.
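The 2,048 control specifications can be enumerated as all subsets of the 11 candidate covariates. The sketch below only builds the list of specifications; the variable names are hypothetical labels (in particular, splitting parents' age at birth into two variables is an assumption made to reach five critic controls).

```python
from itertools import combinations

# Hypothetical labels for the six Regnerus controls and five critic controls.
REGNERUS = ["age", "mother_education", "family_income", "female", "white", "bullied"]
CRITICS = ["mother_age_at_birth", "father_age_at_birth", "region", "metro",
           "childhood_welfare"]
CONTROLS = REGNERUS + CRITICS


def control_sets(controls):
    """Yield every subset of the candidate controls, from empty to full."""
    for k in range(len(controls) + 1):
        for subset in combinations(controls, k):
            yield subset


specs = list(control_sets(CONTROLS))
print(len(specs))                # 2048
print(tuple(REGNERUS) in specs)  # True: the original six-control model is one point
```

Each tuple in `specs` would then be passed to the same regression routine, so that the distribution of the LGBT coefficient across all subsets can be summarised.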

Data-Processing Multiverse (>2.6 M models)

  • Dimensions added to the 2,048 control choices:
    1. Treatment of anomalous/outlier cases (4 options).
    2. Family-type operationalisation (6 options).
    3. Outcome-index construction (2 options).
    4. Weighting strategy (weighted vs. unweighted).
    5. Winsorising outcomes (yes/no).
    6. Alternative codings of income, age, race, mother’s education (4,3,3,3 options).
  • Total ≈ \(2{,}654{,}208\) unique specifications.

Key Data-Processing Decisions & Rationale

Anomalous Observations

  • Evidence of prank responses: e.g., 7’8” tall, 88 lb respondent; 10 pregnancies; 100+ sexual partners.
  • Options:
    • M.1 Use all data (Regnerus baseline).
    • M.2 Drop 29 misclassified cases.
    • M.3 Drop misclassified + 6 borderline + 15 low-co-residence cases.
    • M.4 Drop cases that never lived with parent’s same-sex partner.
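The four sample rules can be read as filters applied before estimation. A minimal sketch, assuming each respondent carries hypothetical boolean flags (`misclassified`, `borderline`, `low_coresidence`, `lived_with_partner`); the notes do not say whether M.4 is applied on top of M.2 or M.3, so it is implemented here exactly as worded.

```python
def make_row(lgbt_parent, misclassified=False, borderline=False,
             low_coresidence=False, lived_with_partner=False):
    """Build a toy respondent record with hypothetical quality flags."""
    return {"lgbt_parent": lgbt_parent, "misclassified": misclassified,
            "borderline": borderline, "low_coresidence": low_coresidence,
            "lived_with_partner": lived_with_partner}


def keep(row, rule):
    """Return True if the respondent stays in the sample under the given rule."""
    if rule == "M1":   # use all data
        return True
    if rule == "M2":   # drop misclassified cases
        return not row["misclassified"]
    if rule == "M3":   # drop misclassified, borderline, and low-co-residence cases
        return not (row["misclassified"] or row["borderline"] or row["low_coresidence"])
    if rule == "M4":   # drop LGBT-parent cases that never lived with the partner
        return not (row["lgbt_parent"] and not row["lived_with_partner"])
    raise ValueError(f"unknown rule: {rule}")


rows = [
    make_row(True, lived_with_partner=True),
    make_row(True, misclassified=True),
    make_row(True, borderline=True, lived_with_partner=True),
    make_row(True, low_coresidence=True),
    make_row(False),
]

sizes = {rule: sum(keep(r, rule) for r in rows) for rule in ["M1", "M2", "M3", "M4"]}
print(sizes)  # {'M1': 5, 'M2': 4, 'M3': 2, 'M4': 3}
```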

Variable Construction: Family Type vs. Family Transitions

  • Regnerus categorical types (IBF, single, step, divorced, adoptive, other, LGBT).
  • Rosenfeld transition count:
    • Broad: any adult entering/leaving household.
    • Parent-only transitions.
  • Hybrid model (Regnerus types + transition count) also feasible (contrary to Rosenfeld’s collinearity worry).
  • Philosophical split:
    • Regnerus: instability is part of LGBT causal pathway → transitions a “bad control.”
    • Rosenfeld: instability largely exogenous (legal discrimination) → must control for transitions.
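The "bad control" argument can be illustrated with a small simulation. This is a sketch under the Regnerus-style assumption that transitions lie on the causal pathway; all numbers are invented (a direct effect of -0.3, 1.5 extra transitions for the treated group, and -0.2 per transition), so the implied total effect is -0.3 + 1.5 × (-0.2) = -0.6.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical data-generating process where transitions are a mediator.
treated = rng.binomial(1, 0.1, n)                  # stand-in for the LGBT-parent indicator
transitions = rng.poisson(1.0 + 1.5 * treated)     # treatment raises the transition count
y = -0.3 * treated - 0.2 * transitions + rng.normal(0, 1, n)


def ols(y, *columns):
    """Return OLS coefficients (intercept first) for y on the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


b_total = ols(y, treated)[1]                 # transitions left out
b_direct = ols(y, treated, transitions)[1]   # transitions controlled

print(f"without transition control: {b_total:.2f}")
print(f"with transition control:    {b_direct:.2f}")
```

Under these assumptions the coefficient without the control recovers roughly the total effect, while adding the transition count leaves only the direct effect. If instability were instead an exogenous confounder, as in Rosenfeld's reading, the controlled coefficient would be the one of interest, which is why the two camps disagree about the same regression.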

Outcome Index Construction

  • Rosenfeld (P.1): 19 variables; listwise deletion (17.5 % loss).
  • Revised (P.2): 29 variables; allow ≤19 missing; only 0.77 % dropped; positive sign = better outcomes.
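The difference between the two index rules is mainly how missing items are handled. A minimal sketch, assuming the items are already standardised and signed so that higher means better, and assuming the index is the mean of the observed items (the notes do not state the exact aggregation).

```python
def outcome_index(items, max_missing=0):
    """Mean of the observed items, or None if too many items are missing.

    max_missing=0 reproduces listwise deletion (the P.1 style rule);
    a larger value tolerates partial non-response (the P.2 style rule).
    """
    observed = [v for v in items if v is not None]
    n_missing = len(items) - len(observed)
    if n_missing > max_missing or not observed:
        return None
    return sum(observed) / len(observed)


# Toy respondents with three items each; None marks a missing answer.
respondents = [
    [1.0, 2.0, 3.0],
    [1.0, None, 3.0],
    [None, None, 0.5],
]

listwise = [outcome_index(r) for r in respondents]
tolerant = [outcome_index(r, max_missing=1) for r in respondents]
print(listwise)  # [2.0, None, None]
print(tolerant)  # [2.0, 2.0, None]
```

The tolerant rule keeps respondents that listwise deletion discards, which is the mechanism behind the much smaller share of dropped cases under P.2.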

Weights & Outliers

  • NFSS sampling weights correct demographic imbalance but inflate SEs.
  • Winsorising long left tail of outcome index included as option.
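Both choices can be sketched in a few lines. The winsorising rule below (raise the lowest share `p` of values to the next-smallest value) is one simple convention, not necessarily the one used in the chapter. Kish's effective sample size is added as an assumption-labelled illustration of why unequal weights inflate standard errors; it is not taken from the source.

```python
def winsorise_left(values, p=0.05):
    """Raise the lowest int(p * n) values to the next-smallest value."""
    ordered = sorted(values)
    k = int(p * len(ordered))
    floor = ordered[k]
    return [max(v, floor) for v in values]


def kish_effective_n(weights):
    """Kish's approximation: (sum of weights)^2 / sum of squared weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


scores = [-10, -1, 0, 1, 2, 3, 4, 5, 6, 7]
print(winsorise_left(scores, p=0.1))   # [-1, -1, 0, 1, 2, 3, 4, 5, 6, 7]

print(kish_effective_n([1, 1, 1, 1]))  # 4.0 (equal weights lose nothing)
print(round(kish_effective_n([1, 1, 1, 5]), 2))  # 2.29 (one heavy weight shrinks effective n)
```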

Results of Full Multiverse

  • Distribution centered closer to zero than control-only multiverse; modeling SE more than doubles (0.09 → 0.22).
  • 95 % of estimates are smaller in magnitude (closer to zero) than the Regnerus coefficient, which sits at the 5th percentile of the distribution.
  • Sign pattern:
    • 0.04 % positive (none significant).
    • 76 % negative & significant.
    • Robustness ratio \(\frac{|\text{mean}|}{\text{total SE}} = 3.14\) (robust).
  • Central substantive takeaway: negative effect persists but magnitude smaller than Regnerus original (≈ –0.48 SD on average vs. –0.87).
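The summary statistics above can be computed from any list of (estimate, standard error) pairs. The sketch below uses toy numbers, and its definitions are assumptions: modeling SE is taken as the standard deviation of estimates across models, sampling SE as the root mean squared standard error, and total SE as the square root of the sum of their squares. The chapter's exact definitions may differ.

```python
import math
from statistics import mean, stdev


def summarise(results, z=1.96):
    """Summarise a multiverse given (estimate, standard_error) pairs."""
    betas = [b for b, _ in results]
    mean_beta = mean(betas)
    modeling_se = stdev(betas)                                 # spread across models
    sampling_se = math.sqrt(mean([se ** 2 for _, se in results]))
    total_se = math.sqrt(modeling_se ** 2 + sampling_se ** 2)  # assumed combination rule
    n = len(results)
    return {
        "mean": mean_beta,
        "modeling_se": modeling_se,
        "sampling_se": sampling_se,
        "robustness_ratio": abs(mean_beta) / total_se,
        "share_positive": sum(b > 0 for b in betas) / n,
        "share_negative_significant": sum(b < 0 and abs(b / se) > z
                                          for b, se in results) / n,
    }


# Toy multiverse of four specifications (invented numbers).
toy = [(-0.9, 0.2), (-0.5, 0.2), (-0.1, 0.2), (0.1, 0.2)]
summary = summarise(toy)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```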

Influence Analysis (Table 11.3 Highlights)

  • Largest shifts:
    • Changing comparison group (C.1–C.6) shifts the coefficient by between –44 % and +59 %.
    • Dropping misclassified & borderline cases shifts the coefficient by between –20 % and +16 %.
    • Weights double SEs; significance rate falls from 95 % (unweighted) to 58 % (weighted).
  • Minor / negligible factors:
    • Winsorising, race coding, quadratic age, outlier handling have ≤1 % effect.
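One simple way to attribute influence to a single decision is to compare the mean estimate across its options. The sketch below is an assumption about method, not a reproduction of Table 11.3: it reports the percent change in the mean coefficient relative to a reference option, using invented results from a balanced toy grid.

```python
from collections import defaultdict


def influence(results, decision, reference):
    """Percent change in the mean estimate for each option of one decision,
    relative to the mean under the reference option."""
    groups = defaultdict(list)
    for spec, beta in results:
        groups[spec[decision]].append(beta)
    means = {option: sum(v) / len(v) for option, v in groups.items()}
    base = means[reference]
    return {option: 100 * (m - base) / base
            for option, m in means.items() if option != reference}


# Toy multiverse: two decisions, fully crossed (hypothetical numbers).
results = [
    ({"sample": "M1", "weighted": False}, -0.8),
    ({"sample": "M1", "weighted": True}, -0.8),
    ({"sample": "M2", "weighted": False}, -0.6),
    ({"sample": "M2", "weighted": True}, -0.6),
]

sample_effect = influence(results, "sample", reference="M1")
weight_effect = influence(results, "weighted", reference=False)
print(sample_effect)  # M2 shrinks the coefficient by about 25 %
print(weight_effect)  # weighting leaves the mean unchanged in this toy grid
```

In a fully crossed grid like this one, the difference in group means equals the coefficient on the option's dummy in a regression of estimates on decision indicators, so the two approaches agree.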

Regnerus vs. Rosenfeld Multiverse Subsets

  • Regnerus subset (no transition control):
    • Mean \(\beta = -0.74\); 99.9 % negative & significant.
  • Rosenfeld subset (transition controls mandatory):
    • Mean \(\beta = -0.35\); only 64 % significant; heavily dependent on weighting.
    • Weighted Rosenfeld models: 37 % significant (SE inflation).
  • Choice of conceptual framework + weighting is decisive for “significance” narrative.

Statistical Formulas & Metrics

  • Number of control combinations: \(2^{11} = 2{,}048\).
  • Total multiverse: \(2{,}048 \times 1{,}296 = 2{,}654{,}208\).
  • Robustness ratio: \(RR = \frac{|\bar{\beta}|}{SE_{\text{total}}}\).
  • Family transition effect example (Table 11.1): each additional transition → \(\beta \approx -0.07\) on outcome index.

Ethical, Philosophical, Practical Implications

  • Data quality concerns:
    • Pre-screening on LGBT question may invite mischievous respondents.
    • Small sub-population (1.7 %) magnifies misclassification error.
  • Multiverse transparency exposes how plausible yet subjective decisions shape socially sensitive conclusions.
  • Core lesson: publication debate should shift from binary significance to distribution of effect sizes under defensible assumptions.
  • Policy stakes (adoption rights, marriage equality) warrant rigorous open science standards.

Connections to Broader Literature

  • Echoes Jasso (1985) debate on handling implausible survey responses.
  • Demonstrates general principle (Gelman & Stern 2006) that significance ≠ importance; sample size manipulations can mask effect-size stability.
  • Administrative data studies (e.g., Mazrekaj, De Witte & Cabus 2020, Netherlands) provide contrasting positive findings, highlighting need for better measurement.

Exam Tips & Takeaways

  • Memorise the two main multiverse dimensions: control choices vs. data-processing choices.
  • Understand why family transitions may be a "bad control" (causal pathway) or an essential control (confounding).
  • Be able to reproduce the logic behind Eq. 11.1 and simplified index model (Eq. 11.2):
    \(\text{Index}_i = \beta_0 + \beta_1\,\text{LGBTparent}_i + \beta_2\,\text{OtherFamilyTypes}_i + \gamma'\,\text{Controls}_i + \epsilon_i\)
  • Know key numbers:
    • 40 outcomes → 80 coefficients in original.
    • 2,048 control models; ~2.65 M total models.
    • Mean full-multiverse effect ≈ –0.48 SD; only 0.04 % positive.
  • Distinguish between sampling SE vs. modeling SE; understand impact of weights on both.
  • Recall that inclusion/exclusion of prank data, misclassified cases, and weighting determine magnitude and precision more than classic covariate debate.