Context #3 - Evans

What is computational sociology?

  • Definition: uses new tools and data sources to expand the scope and scale of sociological inquiry; expands the sociological imagination.
  • Data windfall: internet as a massive digital observatory; sensors in cell phones and social media; born-digital observations plus digitization (e.g., Google Books) enable unprecedented scales.
  • Big data for free: computation provides lenses to see and analyze large-scale social data; can reveal patterns that challenge or require new theories.
  • Platform for active inquiry: web-scale surveys and experiments to test theories and causal relationships; online investigations surpass offline in scale, duration, and complexity.
  • Not a passing fad: computational sociology is a reactor that can transform the study of many problems (e.g., polarization, segregation, diffusion of ideas).
  • Methods and training: extends familiar methods and demands new skills; social researchers will need to write code, wrangle data, and think computationally as well as sociologically.

From social theory to formal worlds

  • Traditional social theory uses natural language and rich nuance, which can invite ambiguity and overgeneralization.
  • Computational sociologists often express theory in mathematics or algorithms; formal models can be empirically validated or explored via simulation.
  • Example: a simple formal model of how scientists choose phenomena to study; fitting it to data from millions of articles and patents showed that new knowledge is often generated by a conservative strategy that links popular phenomena to nearby, less prominent ones.
  • Simulation on a Cray supercomputer revealed that more efficient strategies could have accelerated discovery; verbal theory is essential but not sufficient for this kind of exploration.
  • Takeaway: verbal theory inspires formalizations, but combining both enables deeper exploration.
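
A toy sketch of the formal-model idea above, in Python. It is not the model from the study: the ring layout, the Pareto-distributed popularity, the `decay` parameter, and all function names are illustrative assumptions. It only contrasts a conservative rule (anchor on a popular phenomenon, then link it to a nearby one) with a uniformly random rule, measured by how far apart the linked phenomena are.

```python
import random


def make_popularity(n, rng):
    # Assumption: heavy-tailed popularity, so a few phenomena attract most attention.
    return [rng.paretovariate(1.5) for _ in range(n)]


def ring_distance(i, j, n):
    # Phenomena sit on a ring; distance is the shorter way around.
    d = abs(i - j)
    return min(d, n - d)


def pick_pair(popularity, rng, conservative, decay=0.5):
    n = len(popularity)
    indices = range(n)
    if conservative:
        # Anchor on a phenomenon with probability proportional to its popularity...
        a = rng.choices(indices, weights=popularity, k=1)[0]
        # ...then choose a partner whose probability shrinks with distance from the anchor.
        weights = [0.0 if j == a else decay ** ring_distance(a, j, n) for j in indices]
        b = rng.choices(indices, weights=weights, k=1)[0]
    else:
        # Exploratory baseline: any two distinct phenomena, uniformly at random.
        a, b = rng.sample(indices, 2)
    return a, b


def mean_distance(conservative, n=200, draws=2000, seed=0):
    # Average ring distance between the two phenomena linked in each draw.
    rng = random.Random(seed)
    popularity = make_popularity(n, rng)
    total = 0
    for _ in range(draws):
        a, b = pick_pair(popularity, rng, conservative)
        total += ring_distance(a, b, n)
    return total / draws


if __name__ == "__main__":
    print("conservative:", mean_distance(conservative=True))
    print("exploratory: ", mean_distance(conservative=False))
```

Under these assumptions the conservative rule keeps links close to the anchor, while the random rule ranges across the whole ring. Once a strategy is written down this way, alternative strategies can be simulated and compared, which is the kind of exploration that verbal theory alone does not support.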

Ethnography to digital observatories

  • Ethnography surfaces novel practices and rich, nuanced detail for theory building; tests ideas against lived experience.
  • Digital traces offer massive-scale data but are often low-dimensional “cartoons” of social life (categories, clusters, scores).
  • Digital cartoons can reveal novel behaviors (e.g., online dating biases, cross-race messaging effects) and allow testing ethnographic insight at scale (e.g., police officers’ respectful behavior across stops).
  • Limitations: they cannot capture full multimodal nuance or sequence/tone; collaboration with ethnographers remains essential.
  • Ethnographers should guide trace collection and construct development; qualitative data can justify assumptions used by computational methods.

From surveys to online interactions

  • Traditional surveys underpin quantitative sociology: designed sampling, randomization, and generalizable findings.
  • Computational approaches predict attitudes/behaviors from digital breadcrumbs; machine learning imputes signals (e.g., demographics from images/text) and online traces can help select respondents.
  • Simple surveys mimic real-world platforms (e.g., rating/selection tasks) via crowdsourcing (Mechanical Turk, CrowdFlower) or marketplaces; online responses can reach larger samples with fewer questions.
  • Advantages: broader reach, faster data collection, and responses that reflect actual online behavior rather than stated beliefs.
  • Caveats: many online traces are collected for commercial purposes; platform algorithms can confound observed behavior (algorithmic confounding); consent, ethics, and privacy concerns remain; interpreting the data requires understanding the predictive models that shaped the recorded actions.

From experiments to virtual laboratories

  • Laboratory/field experiments test causal claims and theory in controlled settings; online experiments extend reach to scales and designs impractical offline.
  • Classic field experiments (e.g., Pager) demonstrate causal effects (e.g., racial discrimination in hiring).
  • Online experiments offer scalability and complexity (e.g., Salganik, Dodds, Watts): multiple “worlds” show how social information affects market outcomes.
  • Limitations: online modalities constrain sensory input; participant environments are uncontrolled; sample representativeness and ecological validity are concerns; experiments run on pre-existing networks require more advanced randomization schemes.
  • Future strength: higher bandwidth devices, augmented/virtual reality, and AI agents enabling more sophisticated, ecologically valid experiments; ethical considerations must scale with capability.
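
A minimal simulation sketch of the multiple-"worlds" design mentioned above, in Python. It is not the original experiment: the cumulative-advantage choice rule (weight = 1 + downloads), the equal-quality songs, the parameter values, and the function names are all assumptions made for illustration. Participants are randomly assigned either to an independent condition or to one of several separate social worlds where earlier downloads are visible.

```python
import random


def gini(counts):
    # Gini coefficient of nonnegative counts: 0 means equal shares,
    # values near 1 mean downloads are concentrated on a few items.
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def run_experiment(n_social_worlds=4, n_songs=20, n_participants=5000, seed=1):
    rng = random.Random(seed)
    # Arm 0 is the independent condition; arms 1..n are separate social worlds.
    arms = [[0] * n_songs for _ in range(n_social_worlds + 1)]
    for _ in range(n_participants):
        arm = rng.randrange(len(arms))  # random assignment to a condition
        downloads = arms[arm]
        if arm == 0:
            # No social signal: every song is equally likely to be chosen.
            weights = [1] * n_songs
        else:
            # Assumed social-influence rule: songs with more downloads
            # in this world are more likely to be chosen next.
            weights = [1 + d for d in downloads]
        song = rng.choices(range(n_songs), weights=weights, k=1)[0]
        downloads[song] += 1
    return arms


if __name__ == "__main__":
    arms = run_experiment()
    print("independent Gini:", round(gini(arms[0]), 3))
    for i, world in enumerate(arms[1:], start=1):
        top_song = max(range(len(world)), key=world.__getitem__)
        print(f"social world {i}: Gini={gini(world):.3f}, top song={top_song}")
```

Because every song has the same underlying appeal in this sketch, any extra inequality in the social worlds comes from the visibility of earlier choices alone. Comparing the top song across worlds shows how the same starting conditions can produce different winners.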

From statistical inference to machine discovery

  • Statistical inference underpins traditional social science: estimating causal/associational relationships.
  • Machine learning emphasizes prediction over parametric estimation; supervised learning maps features to labels; unsupervised learning discovers structure (clusters, dimensionality reduction).
  • ML benefits: handles diverse data types and can yield surprising findings (e.g., migration clustering, “fuzzy boundaries” of ethnic conflict). Limits: even very large feature spaces (e.g., 12,000+ features) may offer limited gains for certain predictions, highlighting the need for theory.
  • Deep neural networks offer strong predictive power but can be hard to interpret and may lack causal clarity for policy relevance.
  • Sociology has a special vocation: to integrate ML insights with causal explanation and theory testing.
  • Curriculum implications: teach languages like Python, R, Julia alongside Stata; boot camps for data gathering/cleaning; introduce discrete math and linear algebra early; prioritize collaborative work with ML experts.
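
The supervised/unsupervised distinction in this section can be made concrete with a small standard-library Python sketch. Everything here is an assumption for illustration: the synthetic two-group data, the nearest-centroid classifier standing in for supervised learning, and a basic k-means standing in for unsupervised learning. None of it comes from the article.

```python
import random


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def make_data(n_per_group=50, seed=0):
    # Synthetic data: two well-separated groups in two dimensions.
    rng = random.Random(seed)
    points, labels = [], []
    for label, center in (("A", (0.0, 0.0)), ("B", (8.0, 8.0))):
        for _ in range(n_per_group):
            points.append((rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)))
            labels.append(label)
    return points, labels


def fit_supervised(points, labels):
    # Supervised: labels are given; learn one centroid per label.
    by_label = {}
    for p, y in zip(points, labels):
        by_label.setdefault(y, []).append(p)
    return {y: mean_point(ps) for y, ps in by_label.items()}


def predict(model, point):
    # Predict the label whose centroid is closest to the point.
    return min(model, key=lambda y: sq_dist(point, model[y]))


def kmeans(points, k, iters=20):
    # Unsupervised: no labels; discover k groups from the features alone.
    # Deterministic start: first point, then the point farthest from chosen centers.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Keep the old center if a group ends up empty.
        centers = [mean_point(g) if g else c for g, c in zip(groups, centers)]
    return [min(range(k), key=lambda i: sq_dist(p, centers[i])) for p in points]


def agreement(clusters, labels):
    # Share of points where the two-cluster split matches the A/B labels,
    # allowing for the cluster ids being swapped.
    matches = sum(c == (0 if y == "A" else 1) for c, y in zip(clusters, labels))
    return max(matches, len(labels) - matches) / len(labels)


if __name__ == "__main__":
    points, labels = make_data()
    model = fit_supervised(points, labels)
    print("predicted label:", predict(model, (0.2, -0.1)))
    print("cluster/label agreement:", agreement(kmeans(points, k=2), labels))
```

The supervised model needs labeled examples and answers "which known category is this?", while k-means receives no labels and only proposes a grouping. Neither step says why the groups differ, which is where theory and causal explanation come in.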

Embracing our special vocation

  • Sociology should maintain its core mission while embracing computational methods.
  • A broadened imagination links micro and macro scales, momentary and historical perspectives, and new digital experiences.
  • Departments should adapt training and culture to include data science fluency, collaboration, and access to modern computational tools.
  • Computation enables modeling, measuring, and modifying social structures and experiences across scales.
  • The sociological imagination has never been more expansive; the digital age demands new imagination and applied ethics.

Imagining the future of computational sociology

  • Mills’ idea of connecting individual experience to large-scale social currents remains central; computation expands that capacity across scales and contexts.
  • Sociologists have a unique role in predicting, explaining, and interpreting social data, with a blend of qualitative and quantitative habits.
  • Collaboration with machine learning and data-science experts will be essential; avoid treating ML as a black box.
  • Training should frontload practical skills (coding, data handling) and conceptual grounding in theory and methods.
  • The field envisions formal worlds, digital observatories, intelligent surveys, virtual laboratories, and machine-driven discovery, all of which require an empowered sociological imagination.