Data-Processing Multiverse Analysis of the Regnerus Study and Its Critics

Study Background and Context

Focus of chapter: comprehensive multiverse re-analysis of Mark Regnerus’s 2012 study on outcomes of children raised by gay/lesbian parents (LGBT parents).
Original article appeared in Social Science Research and became pivotal in legal and public debates (e.g., US Supreme Court cases on same-sex adoption).
Chapter’s purpose: illustrate how data-processing “researcher degrees of freedom” can rival (or exceed) the impact of control-variable choices.
Two multiverse analyses are constructed:
- Control-Variable Multiverse (11 possible controls → $2^{11}=2{,}048$ models)
- Data-Processing Multiverse (combined with controls → $\approx 2.65$ million models)
Central questions examined:
- How much result variability comes from selecting control variables?
- How much comes from decisions about cleaning, coding, weighting, or defining key variables?

Original Regnerus Study (2012a)

Dataset: New Family Structures Study (NFSS)
- 15,000 initial screen; 2,988 full surveys; 236 identified a parental same-sex relationship.
Key screening question: “Did either of your parents ever have a romantic relationship with someone of the same sex?” (Mother/Father/No).
Baseline regression (Eq. 11.1): $y_{ij}=\beta_0+\beta_1\,\text{lesbian\_mother}_i+\beta_2\,\text{gay\_father}_i+\sum_{k=3}^{4}\beta_k\,\text{other\_family\_type}_{ik}+\gamma'\,\text{controls}_{ij}+\epsilon_{ij}$
- 40 separate dependent variables (education, mental health, civic engagement, etc.) × 2 LGBT indicators (lesbian mother, gay father) → 80 LGBT coefficients.
Reported findings: children of LGBT parents worse on many outcomes (e.g., unemployment, public assistance, drug abuse, suicidal ideation).
Control set: age, mother’s education, origin-family income, female, white, bullied-as-child (+ state gay-friendly score unavailable in public data).

Immediate Scholarly Critiques

Cheng & Powell (2015)
- Added 5 controls (parents’ age at birth, region, metro status, childhood welfare receipt).
- Flagged extensive misclassification + “prankster” responses; argued 44 % of LGBT cases unreliable.
Rosenfeld (2015)
- Combined 19 outcomes into a signed index; emphasized family transitions instead of Regnerus categories; rejected “bullied” as endogenous.
Sherkat (2012) external review considered retraction; highlighted absence of data cleaning.

Multiverse Analysis Framework

Control-Variable Multiverse (2,048 models)

11 potential covariates (6 Regnerus + 5 critics) toggled on/off.
Empirical pattern:
- All 2,048 coefficients negative and significant.
- Mean coefficient $\bar{\beta}=-1.0$; modeling SE $\approx 0.09$ (smaller than sampling SE $0.16$).
- Regnerus six-control model sits on low-magnitude side of distribution.
- Conclusion: control choice alone barely affects inference.
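
The control-variable multiverse can be enumerated mechanically. A minimal sketch, assuming illustrative names for the six Regnerus controls and placeholder names for the five critic-proposed controls (none of these identifiers come from the NFSS codebook):

```python
from itertools import combinations

# Six controls from the Regnerus baseline model (names are illustrative).
regnerus_controls = ["age", "mother_education", "family_income",
                     "female", "white", "bullied"]
# Five controls proposed by critics; placeholder names, not NFSS variable names.
critic_controls = [f"critic_control_{i}" for i in range(1, 6)]

all_controls = regnerus_controls + critic_controls


def control_subsets(controls):
    """Yield every subset of the controls, from the empty set to the full set."""
    for size in range(len(controls) + 1):
        for subset in combinations(controls, size):
            yield subset


specifications = list(control_subsets(all_controls))
print(len(specifications))  # 2048, i.e. 2 ** 11
```

Each tuple in `specifications` is one candidate set of covariates; fitting the same regression once per tuple gives the distribution of coefficients summarised above.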

Data-Processing Multiverse (>2.6 M models)

Dimensions added to the 2,048 control choices:
1. Treatment of anomalous/outlier cases (4 options).
2. Family-type operationalisation (6 options).
3. Outcome-index construction (2 options).
4. Weighting strategy (weighted vs. unweighted).
5. Winsorising outcomes (yes/no).
6. Alternative codings of income, age, race, mother’s education (4,3,3,3 options).
Total ≈ $2{,}654{,}208$ unique specifications.
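
How such a grid is assembled can be sketched with a Cartesian product. The example covers only dimensions 1 to 5, using the option labels that appear in these notes (M.1–M.4, P.1–P.2, and C.1–C.6, the last assumed here to label the six family-type options). The four recodings in item 6 and the control subsets are left out, so the printed count describes this partial grid only. The dictionary keys are hypothetical names.

```python
from itertools import product

# Option labels per data-processing decision (keys are hypothetical names).
dimensions = {
    "anomalous_cases": ["M.1", "M.2", "M.3", "M.4"],
    "family_type": ["C.1", "C.2", "C.3", "C.4", "C.5", "C.6"],
    "outcome_index": ["P.1", "P.2"],
    "weighted": [True, False],
    "winsorised": [True, False],
}

names = list(dimensions)
# One dict per specification: every combination of one option per decision.
grid = [dict(zip(names, combo)) for combo in product(*dimensions.values())]

print(len(grid))  # 4 * 6 * 2 * 2 * 2 = 192
print(grid[0])
```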

Key Data-Processing Decisions & Rationale

Anomalous Observations

Evidence of prank responses: e.g., 7’8” tall, 88 lb respondent; 10 pregnancies; 100+ sexual partners.
Options:
- M.1 Use all data (Regnerus baseline).
- M.2 Drop 29 misclassified cases.
- M.3 Drop misclassified + 6 borderline + 15 low-co-residence cases.
- M.4 Drop cases that never lived with parent’s same-sex partner.
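
The four sample rules can be read as filters over respondent-level flags. A rough sketch, assuming each rule is applied independently to the full sample and that the flags already exist; the field names and the toy sample are hypothetical, not NFSS variables:

```python
def keep(respondent, rule):
    """Return True if the respondent stays in the sample under the given rule."""
    if rule == "M.1":  # use all data
        return True
    if rule == "M.2":  # drop misclassified cases
        return not respondent["misclassified"]
    if rule == "M.3":  # drop misclassified, borderline and low-co-residence cases
        return not (respondent["misclassified"]
                    or respondent["borderline"]
                    or respondent["low_coresidence"])
    if rule == "M.4":  # drop cases that never lived with the parent's partner
        return not respondent["never_lived_with_partner"]
    raise ValueError(f"unknown rule: {rule}")


# Toy sample: flags are invented for illustration.
sample = [
    {"id": 1, "misclassified": False, "borderline": False,
     "low_coresidence": False, "never_lived_with_partner": False},
    {"id": 2, "misclassified": True, "borderline": False,
     "low_coresidence": False, "never_lived_with_partner": True},
    {"id": 3, "misclassified": False, "borderline": True,
     "low_coresidence": False, "never_lived_with_partner": True},
    {"id": 4, "misclassified": False, "borderline": False,
     "low_coresidence": True, "never_lived_with_partner": True},
    {"id": 5, "misclassified": False, "borderline": False,
     "low_coresidence": False, "never_lived_with_partner": True},
]

sizes = {rule: sum(keep(r, rule) for r in sample)
         for rule in ["M.1", "M.2", "M.3", "M.4"]}
print(sizes)  # {'M.1': 5, 'M.2': 4, 'M.3': 2, 'M.4': 1}
```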

Variable Construction: Family Type vs. Family Transitions

Regnerus categorical types (IBF, single, step, divorced, adoptive, other, LGBT).
Rosenfeld transition count:
- Broad: any adult entering/leaving household.
- Parent-only transitions.
Hybrid model (Regnerus categories + transition count) also feasible (contrary to Rosenfeld's collinearity worry).
Philosophical split:
- Regnerus: instability is part of LGBT causal pathway → transitions a “bad control.”
- Rosenfeld: instability largely exogenous (legal discrimination) → must control for transitions.

Outcome Index Construction

Rosenfeld (P.1): 19 variables; listwise deletion (17.5 % loss).
Revised (P.2): 29 variables; allow ≤19 missing; only 0.77 % dropped; positive sign = better outcomes.
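
The difference between the two missing-data rules can be sketched as follows. The notes do not say how items are aggregated, so averaging the observed items is an assumption, as is the premise that items are already standardised and signed so that higher means better. The function name and toy values are illustrative.

```python
def outcome_index(items, max_missing):
    """Mean of the non-missing items, or None if too many items are missing."""
    missing = sum(1 for value in items if value is None)
    if missing > max_missing or missing == len(items):
        return None  # respondent is dropped from the analysis
    observed = [value for value in items if value is not None]
    return sum(observed) / len(observed)


respondent = [1.0, None, -0.5]  # toy case: three items, one missing

strict = outcome_index(respondent, max_missing=0)   # P.1-style listwise rule
lenient = outcome_index(respondent, max_missing=1)  # P.2-style tolerance
print(strict, lenient)  # None 0.25
```

Under the strict rule a single missing item removes the respondent, which is how listwise deletion produces the larger sample loss.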

Weights & Outliers

NFSS sampling weights correct demographic imbalance but inflate SEs.
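
One general way to see why unequal weights inflate standard errors is Kish's approximate effective sample size, $n_{\text{eff}} = (\sum w)^2 / \sum w^2$. This is a standard survey-statistics approximation, not the chapter's own calculation, and the weights below are made up:

```python
def kish_effective_n(weights):
    """Kish's approximate effective sample size for a set of weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


equal = [1.0] * 8
unequal = [0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 2.0, 2.0]  # invented weights

n_equal = kish_effective_n(equal)      # 8.0: equal weights lose nothing
n_unequal = kish_effective_n(unequal)  # 64 / 11, roughly 5.8
print(n_equal, round(n_unequal, 2))
```

The more variable the weights, the smaller the effective sample, and the wider the standard errors for the same nominal number of respondents.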
Winsorising long left tail of outcome index included as option.
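
A minimal sketch of left-tail winsorising, assuming a cut-off expressed as a fraction of the sample. The notes do not state the threshold, so the 10 % used here and the toy scores are purely illustrative:

```python
def winsorise_left(values, lower_fraction):
    """Raise the lowest `lower_fraction` of values to the next-lowest value."""
    ordered = sorted(values)
    k = int(lower_fraction * len(ordered))  # number of observations to pull up
    floor = ordered[k]
    return [max(v, floor) for v in values]


scores = [-4.0, -1.0, -0.5, 0.0, 0.2, 0.3, 0.5, 0.6, 0.8, 1.0]  # toy index values
trimmed = winsorise_left(scores, lower_fraction=0.1)
print(trimmed[:3])  # [-1.0, -1.0, -0.5]
```

Only the single extreme value of -4.0 changes; all other observations and the sample size are untouched, which distinguishes winsorising from dropping cases.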

Results of Full Multiverse

Distribution centered closer to zero than in the control-only multiverse; modeling SE more than doubles (0.09 → 0.22).
95 % of estimates are closer to zero than the Regnerus coefficient (his estimate sits at the 5th percentile of the distribution).
Sign pattern:
- 0.04 % positive (none significant).
- 76 % negative & significant.
- Robustness ratio $\frac{|\text{mean}|}{\text{total SE}}=3.14$ (robust).
Central substantive takeaway: negative effect persists but magnitude smaller than Regnerus original (≈ –0.48 SD on average vs. –0.87).
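
The sign-and-significance shares are simple tallies over the model estimates. A sketch, assuming each model yields a coefficient and a standard error and that significance means $|\beta/SE| > 1.96$ (a two-sided 5 % normal approximation, which is an assumption here); the four estimates are invented:

```python
def sign_summary(estimates, z_crit=1.96):
    """Share of estimates by sign and significance; estimates are (beta, se) pairs."""
    n = len(estimates)
    positive = sum(1 for beta, se in estimates if beta > 0)
    positive_sig = sum(1 for beta, se in estimates
                       if beta > 0 and abs(beta / se) > z_crit)
    negative_sig = sum(1 for beta, se in estimates
                       if beta < 0 and abs(beta / se) > z_crit)
    return {
        "share_positive": positive / n,
        "share_positive_significant": positive_sig / n,
        "share_negative_significant": negative_sig / n,
    }


toy_estimates = [(-0.9, 0.2), (-0.5, 0.2), (-0.3, 0.2), (0.05, 0.2)]
summary = sign_summary(toy_estimates)
print(summary)
```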

Influence Analysis (Table 11.3 Highlights)

Largest shifts:
- Changing comparison group (C.1–C.6) shifts the coefficient by anywhere from –44 % to +59 %.
- Dropping misclassified & borderline cases moves coefficient +16 % to –20 %.
- Weights double SEs; significance rate falls from 95 % (unweighted) to 58 % (weighted).
Minor / negligible factors:
- Winsorising, race coding, quadratic age, outlier handling have ≤1 % effect.
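
One simple way to express the influence of a single decision is the percent change in the mean coefficient magnitude when an option replaces the baseline option. The sketch below uses that definition as an assumption; the chapter's Table 11.3 may compute influence differently, and the four results are invented. Negative values mean the estimate shrinks toward zero.

```python
from collections import defaultdict
from statistics import mean


def influence(results, decision, baseline):
    """Percent change in mean |beta| for each option relative to the baseline option."""
    groups = defaultdict(list)
    for spec, beta in results:
        groups[spec[decision]].append(beta)
    base = abs(mean(groups[baseline]))
    return {
        option: 100 * (abs(mean(betas)) - base) / base
        for option, betas in groups.items()
        if option != baseline
    }


# Toy results: (specification, estimated coefficient) pairs.
results = [
    ({"sample": "M.1", "weighted": False}, -1.0),
    ({"sample": "M.1", "weighted": True}, -0.6),
    ({"sample": "M.2", "weighted": False}, -0.8),
    ({"sample": "M.2", "weighted": True}, -0.4),
]

shift = influence(results, decision="sample", baseline="M.1")
print(shift)  # M.2 shrinks the mean magnitude by about 25 %
```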

Regnerus vs. Rosenfeld Multiverse Subsets

Regnerus subset (no transition control):
- Mean $\beta=-0.74$; 99.9 % negative & significant.
Rosenfeld subset (transition controls mandatory):
- Mean $\beta=-0.35$; only 64 % significant; heavily dependent on weighting.
- Weighted Rosenfeld models: 37 % significant (SE inflation).
Choice of conceptual framework + weighting is decisive for “significance” narrative.

Statistical Formulas & Metrics

Number of control combinations: $2^{11}=2{,}048$.
Total multiverse: $2{,}048 \times 1{,}296 = 2{,}654{,}208$.
Robustness ratio: $RR=\frac{|\bar{\beta}|}{SE_{total}}$.
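
A small sketch of the ratio, assuming the total SE combines sampling and modeling uncertainty in quadrature, $SE_{total}=\sqrt{SE_{sampling}^2+SE_{modeling}^2}$. The notes give only the ratio itself, so this combination rule is an assumption, and the input values are invented rather than taken from the chapter:

```python
import math


def total_se(sampling_se, modeling_se):
    """Combine sampling and modeling SE in quadrature (assumed rule)."""
    return math.sqrt(sampling_se ** 2 + modeling_se ** 2)


def robustness_ratio(mean_beta, sampling_se, modeling_se):
    """Absolute mean estimate divided by the total standard error."""
    return abs(mean_beta) / total_se(sampling_se, modeling_se)


# Illustrative inputs only.
se = total_se(sampling_se=0.12, modeling_se=0.09)  # about 0.15
rr = robustness_ratio(mean_beta=-0.5, sampling_se=0.12, modeling_se=0.09)
print(round(se, 3), round(rr, 2))
```

Because the modeling SE enters the denominator, a wider spread of estimates across specifications lowers the ratio even when the mean estimate is unchanged.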
Family transition effect example (Table 11.1): each additional transition → $\beta\approx -0.07$ on outcome index.

Ethical, Philosophical, Practical Implications

Data quality concerns:
- Pre-screening on LGBT question may invite mischievous respondents.
- Small sub-population (1.7 %) magnifies misclassification error.
Multiverse transparency exposes how plausible yet subjective decisions shape socially sensitive conclusions.
Core lesson: publication debate should shift from binary significance to distribution of effect sizes under defensible assumptions.
Policy stakes (adoption rights, marriage equality) warrant rigorous open science standards.

Connections to Broader Literature

Echoes Jasso (1985) debate on handling implausible survey responses.
Demonstrates general principle (Gelman & Stern 2006) that significance ≠ importance; sample size manipulations can mask effect-size stability.
Administrative data studies (e.g., Mazrekaj, De Witte & Cabus 2020, Netherlands) provide contrasting positive findings, highlighting need for better measurement.

Exam Tips & Takeaways

Memorise the two main multiverse dimensions: control choices vs. data-processing choices.
Understand why family transitions may be a "bad control" (causal pathway) or an essential control (confounding).
Be able to reproduce the logic behind Eq. 11.1 and simplified index model (Eq. 11.2):
$\text{Index}_i = \beta_0 + \beta_1\,\text{LGBTparent}_i + \beta_2\,\text{OtherFamilyTypes}_i + \gamma'\,\text{Controls}_i + \epsilon_i$
Know key numbers:
- 40 outcomes → 80 coefficients in original.
- 2,048 control models; ~2.65 M total models.
- Mean full-multiverse effect ≈ –0.48 SD; only 0.04 % positive.
Distinguish between sampling SE vs. modeling SE; understand impact of weights on both.
Recall that the inclusion or exclusion of prank data and misclassified cases, together with the weighting choice, determines magnitude and precision more than the classic covariate debate does.