IDIS100 Week 02 Notes — Idiographic vs Nomothetic; Causality; Unit of Analysis; Time; Two Logical Systems
Research Purposes: Exploration, Description, Explanation
Exploration: develop initial understanding of a phenomenon, break new ground, gain new insights, pave the way for future research
Example: focus groups or in-depth interviews before designing and running a national survey
Description: provide precise measurement and description of population or phenomenon
Examples: Official statistics (e.g., median household income), political polls, descriptive statistics of a sample in journal articles
Explanation: answer the question of why; what causes some outcome or phenomenon
Two kinds of explanations:
Idiographic explanation
Nomothetic explanation
Relationship among purposes: not mutually exclusive; a study may have multiple purposes
For what purpose? Exploration, Description, Explanation (and their connections)
Time, causality, unit of analysis, and logic will interact with these purposes
Idiographic vs Nomothetic
Nomothetic causality aims at general laws or patterns across cases
Idiographic causality focuses on a comprehensive explanation of a single case or a few cases
Unit of Analysis, Time, and Logical Systems align with the two approaches
Recap: Idiographic vs Nomothetic are two logical approaches to understanding phenomena and they can be complementary in mixed designs
Causality (Nomothetic) and Criteria
Three Criteria for Nomothetic Causality:
Correlation
Time Order
Nonspuriousness
Definition of correlation: IV (independent variable) and DV (dependent variable) must be related
Time order: IV should precede DV in time
Nonspuriousness: the IV-DV relationship is not due to a third variable (confounder)
Spurious relationships: correlation does not imply causation; third variables may explain the association
Confounder: a variable that is a common cause of both IV and DV; accounting for confounders can alter or eliminate the IV-DV relationship
If, after controlling for a confounder, the IV-DV relationship disappears, the relationship is spurious; if it remains, it may be nonspurious or partially confounded
Correlation: Examples
Positive correlation example: income increases with education
Represented as $\rho_{IV,DV} > 0$, where IV = education, DV = income
Negative correlation example: health deteriorates with age
Represented as $\rho_{IV,DV} < 0$, where IV = age, DV = health outcomes
Clinical trial illustration (not causal by itself):
In a randomized, placebo-controlled drug trial, treatment assignment and gender may appear related in the sample; an observed association like this is not evidence of causation by itself, which highlights the need to separate correlation from causal interpretation
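A minimal Python sketch of how the sign of a correlation is read off data, for the positive and negative examples in this section. The numbers are made-up illustrative values, not data from the slides, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data (assumption, for illustration only)
education_years = [10, 12, 14, 16, 18]
income_thousands = [28, 35, 41, 55, 62]
age_years = [25, 35, 45, 55, 65]
health_score = [90, 84, 80, 71, 65]

r_edu_income = pearson_r(education_years, income_thousands)  # IV = education
r_age_health = pearson_r(age_years, health_score)            # IV = age

print("education-income correlation is positive:", r_edu_income > 0)
print("age-health correlation is negative:", r_age_health < 0)
```

A nonzero correlation only satisfies the first criterion; time order and nonspuriousness still have to be argued separately.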
Time Order and Longitudinal Strategies
Time order criterion: IV must occur before DV in time
Example: education → income (time order satisfied if education precedes income)
Example of reverse causality: marriage and happiness
Does marriage cause happiness, or are happier people more likely to marry?
Solution: use longitudinal data to establish time order
Longitudinal design examples in slides:
IV = marital status at time 1, DV = happiness at time 2
Alternatively: IV = happiness at time 1, DV = marital status at time 2
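The two longitudinal designs can be sketched in Python on a tiny two-wave panel. The records, the happiness scale, and the cutoff of 7 for "high happiness" are all assumptions made for illustration; none of them come from the slides.

```python
# Hypothetical two-wave panel (assumption): one record per person
panel = [
    {"id": 1, "married_t1": True,  "happy_t1": 7, "married_t2": True,  "happy_t2": 8},
    {"id": 2, "married_t1": True,  "happy_t1": 6, "married_t2": True,  "happy_t2": 7},
    {"id": 3, "married_t1": False, "happy_t1": 8, "married_t2": True,  "happy_t2": 8},
    {"id": 4, "married_t1": False, "happy_t1": 7, "married_t2": True,  "happy_t2": 7},
    {"id": 5, "married_t1": False, "happy_t1": 4, "married_t2": False, "happy_t2": 5},
    {"id": 6, "married_t1": False, "happy_t1": 5, "married_t2": False, "happy_t2": 4},
    {"id": 7, "married_t1": False, "happy_t1": 8, "married_t2": False, "happy_t2": 7},
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Design 1: IV = marital status at time 1, DV = happiness at time 2
happy_t2_married = mean(p["happy_t2"] for p in panel if p["married_t1"])
happy_t2_unmarried = mean(p["happy_t2"] for p in panel if not p["married_t1"])

# Design 2: IV = happiness at time 1, DV = marital status at time 2,
# restricted to people who were not yet married at time 1
unmarried_t1 = [p for p in panel if not p["married_t1"]]
HIGH_HAPPINESS = 7  # hypothetical cutoff
share_married_high = mean(p["married_t2"] for p in unmarried_t1
                          if p["happy_t1"] >= HIGH_HAPPINESS)
share_married_low = mean(p["married_t2"] for p in unmarried_t1
                         if p["happy_t1"] < HIGH_HAPPINESS)

print(f"Design 1: mean happiness at t2 = {happy_t2_married:.1f} (married at t1) "
      f"vs {happy_t2_unmarried:.1f} (unmarried at t1)")
print(f"Design 2: share married by t2 = {share_married_high:.2f} (happier at t1) "
      f"vs {share_married_low:.2f} (less happy at t1)")
```

In both designs the IV is measured at time 1 and the DV at time 2, so the time-order criterion holds by construction; confounding is still not addressed.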
Nonspuriousness and Confounding (Detailed)
Nonspuriousness: a genuine association between IV and DV not explained by other variables
Confounder: a third variable that is related to both IV and DV
How confounders arise: e.g., age can influence both smoking and mortality risk, creating a spurious association if not controlled
Common ways to address confounding:
Experiments (randomization) [Week 4]
Regressions: controlling for confounders in statistical models
Stratification: analysing subgroups where the confounder is held roughly constant
Confounders: Exercise and Interpretation
Example exercise setup:
Variables: own education (IV), own income (DV), parental education (confounder)
Question: Which is IV? which is DV? which is confounder?
Illustration: If parental education is not accounted for, observed association between own education and income may be confounded
After accounting for parental education:
The association between own education and income may weaken but still exist
The relationship is then partially confounded but not entirely spurious
In regression analyses or other controls, confounding bias is removed for the controlled variable; if no unaccounted confounders remain, the association is nonspurious
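A small Python sketch of the exercise, using stratification rather than regression to account for parental education. The records are made-up values chosen so that the association weakens but does not vanish; they are an assumption, not data from the slides.

```python
# Hypothetical records (assumption): (parental_education, own_education, income)
people = [
    ("low", "low", 30), ("low", "low", 30), ("low", "low", 30),
    ("low", "high", 40),
    ("high", "low", 50),
    ("high", "high", 60), ("high", "high", 60), ("high", "high", 60),
]

def mean_income(rows, own_education):
    incomes = [inc for _, own, inc in rows if own == own_education]
    return sum(incomes) / len(incomes)

def income_gap(rows):
    """Mean income of the highly educated minus mean income of the less educated."""
    return mean_income(rows, "high") - mean_income(rows, "low")

# Crude comparison, ignoring the confounder
crude_gap = income_gap(people)

# Comparison within each level of parental education (confounder held constant)
gap_by_parent = {
    level: income_gap([row for row in people if row[0] == level])
    for level in ("low", "high")
}

print("crude gap:", crude_gap)
print("gap within parental-education strata:", gap_by_parent)
```

With these assumed numbers the crude gap is 20 while the gap within each stratum is 10: part of the crude association was due to parental education, and part remains.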
Example: Smoking and Mortality (Multiple Age Groups)
Table: Risk of death in a 20-year period among women in Whickham, England, by smoking status at start (1972-74)
Vital status: Dead / Alive / Total
Smoker: Dead 139, Alive 443, Total 582; Non-smoker: Dead 230, Alive 502, Total 732
Risk (dead/total): Smokers = 139/582 ≈ 23.9%, Non-smokers = 230/732 ≈ 31.4%
Note: In this crude table, smokers appear to have a lower risk of death than non-smokers; the comparison should not be read causally, because interpretation depends on context and potential confounders such as age
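The crude risks can be computed directly from the counts in the table above. A minimal Python sketch; `risk` and `crude_counts` are illustrative names, not from the slides.

```python
# Counts from the Whickham table above: deaths and totals over 20 years
crude_counts = {
    "smoker": {"dead": 139, "total": 582},
    "non_smoker": {"dead": 230, "total": 732},
}

def risk(dead, total):
    """Risk = number who died / number at risk at the start."""
    return dead / total

crude_risk = {
    status: risk(c["dead"], c["total"]) for status, c in crude_counts.items()
}

for status, value in crude_risk.items():
    print(f"{status}: {value:.1%}")
```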
Table: Ages 18-44 (Smoker vs Non-smoker)
Dead: 15 vs 12; Alive: 270 vs 327; Total: 285 vs 339
Risk: Smokers = 15/285 ≈ 5.3%, Non-smokers = 12/339 ≈ 3.5%
Interpretation: relatively small absolute differences in this age band; other factors may contribute
Table: Ages 45-64
Dead: 80 vs 53; Alive: 167 vs 147; Total: 247 vs 200
Risk: Smokers ≈ 32.4% (80/247), Non-smokers ≈ 26.5% (53/200)
Table: Ages 65+
Dead: 44 vs 165; Alive: 6 vs 28; Total: 50 vs 193
Risk: Smokers = 44/50 = 88.0%, Non-smokers = 165/193 ≈ 85.5%
Age as a potential confounder can be seen in multi-way tables
Example: Mortality risk by smoking status and age category (age as a confounder)
For each age group, compare risk for smokers vs non-smokers within that same age band
After stratifying by age, the confounding effect of age on the smoking-mortality association is mitigated
Age as a Confounder: Stratification by Age
Table layout shows mortality risk within each age group by smoking status
Key takeaway: stratifying by age can block the confounding path Age → Smoking and Age → Mortality
How stratification helps:
Within each age group, smoking status is compared at the same age, removing age as a confounder
Pr(Death|Smoker, Age group) vs. Pr(Death|Non-Smokers, Age group)
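The within-age-group comparison can be sketched in Python using the counts from the age-specific tables in these notes. Variable names are illustrative. The sketch also rebuilds the crude comparison by summing over age bands, to show how the two can disagree.

```python
# (dead, total) by age band and smoking status, from the Whickham tables
counts = {
    "18-44": {"smoker": (15, 285), "non_smoker": (12, 339)},
    "45-64": {"smoker": (80, 247), "non_smoker": (53, 200)},
    "65+":   {"smoker": (44, 50),  "non_smoker": (165, 193)},
}

def risk(dead, total):
    return dead / total

# Pr(Death | Smoker, Age group) - Pr(Death | Non-smoker, Age group)
stratum_diff = {}
for band, groups in counts.items():
    risk_smoker = risk(*groups["smoker"])
    risk_non_smoker = risk(*groups["non_smoker"])
    stratum_diff[band] = risk_smoker - risk_non_smoker
    print(f"{band}: smokers {risk_smoker:.1%} vs non-smokers {risk_non_smoker:.1%}")

# Crude comparison: pool all age bands, ignoring age
crude = {}
for status in ("smoker", "non_smoker"):
    dead = sum(counts[band][status][0] for band in counts)
    total = sum(counts[band][status][1] for band in counts)
    crude[status] = risk(dead, total)
crude_diff = crude["smoker"] - crude["non_smoker"]
print(f"crude: smokers {crude['smoker']:.1%} vs non-smokers {crude['non_smoker']:.1%}")
```

Within every age band smokers have the higher risk, while the pooled comparison points the other way; the difference comes from the age composition of the two groups.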
What Is a Confounder? Clarifications
A confounder is related to both IV and DV
If a variable is only related to IV or only to DV, it is not a confounder in causal inference
Notation example: If C is related to both IV and DV, but C1 and C2 are not, then only C is a confounder
Practical implication: identify potential confounders through prior research and theory
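The "related to both IV and DV" condition can be checked on the Whickham counts from these notes. A rough Python sketch with illustrative variable names: it looks at whether age band is related to smoking (the IV) and to death (the DV).

```python
# (dead, total) by age band and smoking status, from the Whickham tables
counts = {
    "18-44": {"smoker": (15, 285), "non_smoker": (12, 339)},
    "45-64": {"smoker": (80, 247), "non_smoker": (53, 200)},
    "65+":   {"smoker": (44, 50),  "non_smoker": (165, 193)},
}

share_smokers = {}  # relation between age (C) and smoking (IV)
death_risk = {}     # relation between age (C) and death (DV)
for band, groups in counts.items():
    smoker_dead, smoker_total = groups["smoker"]
    non_dead, non_total = groups["non_smoker"]
    band_total = smoker_total + non_total
    share_smokers[band] = smoker_total / band_total
    death_risk[band] = (smoker_dead + non_dead) / band_total
    print(f"{band}: {share_smokers[band]:.1%} smokers, "
          f"{death_risk[band]:.1%} died within 20 years")
```

The share of smokers is much lower in the oldest band and the risk of death rises steeply with age, so age is related to both variables and qualifies as a potential confounder.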
Dealing with Confounders: How to Identify and Address
How to know potential confounders:
Look to prior research and theory
Methods to address confounding:
Experiments (randomization)
Regression controls: “Controlling for …”
Stratification by confounders (e.g., stratify by age)
Subgroup analyses where confounding is minimized
Why Stratify by Age? A Worked Example
Pr(Death|Smoker, Age 65+) vs. Pr(Death|Non-Smokers, Age 65+)
Pr(Death|Smoker, Age 45-64) vs. Pr(Death|Non-Smokers, Age 45-64)
Pr(Death|Smoker, Age 18-44) vs. Pr(Death|Non-Smokers, Age 18-44)
By comparing within the same age groups, age is held constant and cannot confound the smoking-mortality relationship
False Criteria for Nomothetic Causality
“A nomothetic explanation is probabilistic and usually incomplete.” (true concept)
False criteria include:
Complete Causation (not required): X can be a cause of Y without being the only cause
Example: Education is one of several causes of income
No Exceptional Cases (not required): exceptions do not disprove that X causes Y
Example: Some highly educated people may have low income
Apply in the Majority of Cases (not required): a causal relationship can hold even if it applies to only a minority of cases
Example: Smoking is a major cause of lung cancer, though only a minority of smokers develop the disease
What is the Unit of Analysis? (UoA)
Definition: Who/what are you studying?
Examples: Individuals (students, men, women, children)
Groups: families, households, couples
Organizations: corporations, universities
Social interactions: text messages
Social Artifacts: books, paintings, music
Geographical areas: cities, countries
Multi-level UoA: e.g., students and school; children and families
Visual: imagine a spreadsheet where each row is an observation (the UoA)
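The spreadsheet picture can be sketched in Python: the same data organised with two different units of analysis. The students, schools, and scores are hypothetical values invented for illustration.

```python
from collections import defaultdict

# Unit of analysis = student: each row is one student (hypothetical data)
student_rows = [
    {"student": "S1", "school": "A", "score": 70},
    {"student": "S2", "school": "A", "score": 80},
    {"student": "S3", "school": "B", "score": 60},
    {"student": "S4", "school": "B", "score": 90},
    {"student": "S5", "school": "B", "score": 90},
]

# Unit of analysis = school: each row is one school, built by aggregation
scores_by_school = defaultdict(list)
for row in student_rows:
    scores_by_school[row["school"]].append(row["score"])

school_rows = [
    {"school": school, "n_students": len(scores), "mean_score": sum(scores) / len(scores)}
    for school, scores in sorted(scores_by_school.items())
]

print(len(student_rows), "student-level rows")
for row in school_rows:
    print(row)
```

Counting rows is a quick check on the UoA: five rows when the unit is the student, two when the unit is the school.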
Unit of Analysis: Concrete Examples
Example: In a pre-course survey, the unit of analysis is a student
Data example: StudentName, major, year
Proportion of students by year: 2? 87%, 3+? 13%
Exercise: For three examples, discuss what the unit of analysis is and how to tell
What Is the Unit of Analysis? (Revisited with Data Tables)
Revisit risk of death tables (Smoker vs Non-smoker across age groups) to identify UoA in each table
Always align the UoA with the research question and data collection method
Be mindful of inconsistencies: from whom data are gathered vs. from whom conclusions are drawn
Ecological Fallacy and Reductionist Fallacy
Ecological Fallacy (group-level to individual-level):
Example 1: New York City has a high crime rate; inferring that Joan, an individual living in New York, stole a watch
Example 2: Regions with higher average income have better health outcomes; inferring an individual with high income is healthier
Reductionist Fallacy (individual-level to group-level):
Example 1: Individuals with low SES are more likely to divorce; conclude that more developed countries have lower divorce rates
Example 2: If an individual from Wisconsin is liberal, conclude Wisconsin as a whole is liberal
Reductionism: reducing complex social phenomena to too few explanations (economic, psychological, etc.)
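A toy Python illustration of why a group-level pattern need not hold for individuals. The regions, incomes, and health scores are invented so that the two levels disagree; `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical individuals per region (assumption): (income, health score)
regions = {
    "X": [(20, 60), (30, 50)],
    "Y": [(40, 80), (50, 70)],
    "Z": [(60, 100), (70, 90)],
}

# Group level: one row per region (unit of analysis = region)
mean_income = [sum(i for i, _ in people) / len(people) for people in regions.values()]
mean_health = [sum(h for _, h in people) / len(people) for people in regions.values()]
group_r = pearson_r(mean_income, mean_health)

# Individual level within each region (unit of analysis = person)
within_r = {
    name: pearson_r([i for i, _ in people], [h for _, h in people])
    for name, people in regions.items()
}

print("correlation across region averages:", round(group_r, 2))
print("correlation among individuals within each region:",
      {name: round(r, 2) for name, r in within_r.items()})
```

In this constructed example richer regions are healthier on average, yet within every region the higher-income person has the lower health score, so the regional pattern says nothing reliable about any one individual.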
Exercise: Ecological Fallacy or Reductionist Fallacy? (wooclap activity)
For How Long? Time Dimensions in Research
Cross-Sectional vs Longitudinal studies
Cross-Sectional: data collected at one point in time
Longitudinal: data collected at multiple points in time
Some definitions of longitudinal include only panel studies; broader definitions also include trend and cohort studies
Key distinction: whether data are collected on the same individuals over time or not
Cross-Sectional vs Trend Studies (Repeated Cross-Sectional)
Cross-Sectional Study: collect data from a population one time
Trend Study: collect data from a population multiple times, not necessarily the same respondents
Repeated Cross-Sections: same population or same sampling frame across time (e.g., Census years 2000, 2010, 2020)
Each census is a cross-sectional study; comparing censuses is a trend study
Example: European Values Study (EVS) and World Values Survey (WVS): cross-national, repeated cross-sectional survey programs
Cohort Studies vs Panel Studies
Cohort study: collect data from the same cohort over time; may draw different samples from the same cohort at different waves
Birth cohorts: e.g., people born in 1997–2012 (Gen Z)
Marriage cohorts: e.g., people married in 1997–2012
Panel study: data from the same sample across waves; may follow one or multiple cohorts
Example: Wisconsin Longitudinal Study (WLS) – follows graduates from Wisconsin high schools in 1957 across years
Health and Retirement Study (HRS) – follows multiple cohorts across time
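The key distinction between these designs, whether the same individuals are observed at each wave, can be sketched as a simple Python check on respondent IDs. This is a simplification under stated assumptions: `classify_time_design` is a hypothetical helper, real panels lose respondents to attrition, and telling a trend study from a cohort study requires knowing how the population was defined, which IDs alone cannot show.

```python
def classify_time_design(waves):
    """Classify a design from a list of sets of respondent IDs, one set per wave.

    Simplified rule of thumb (assumption): identical respondents at every
    wave means panel; otherwise repeated samples mean a repeated cross-section.
    """
    if not waves:
        raise ValueError("need at least one wave of data")
    if len(waves) == 1:
        return "cross-sectional"
    first_wave = waves[0]
    if all(wave == first_wave for wave in waves[1:]):
        return "panel"
    return "repeated cross-section"

# Hypothetical respondent IDs
one_survey = [{1, 2, 3}]
same_people_twice = [{1, 2, 3}, {1, 2, 3}]
new_sample_each_time = [{1, 2, 3}, {4, 5, 6}]

print(classify_time_design(one_survey))
print(classify_time_design(same_people_twice))
print(classify_time_design(new_sample_each_time))
```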
Exercise: Studying Election (Matching Design to Question)
Task: For each question, identify the best match: cross-sectional, trend, cohort, or panel
Example questions (as given in slides):
1) What is the party a person would vote for if the election were held today? (Cross-sectional snapshot)
2) How has each party’s chance of winning changed over time? (Trend over time)
3) Which party did Baby Boomers traditionally support, and which party are they more likely to vote for now? (Cohort analysis by birth cohort)
4) For those who watched the campaign events, did their views change afterward? (Panel/longitudinal follow-up)
Two Logical Systems: Theory Testing vs Building
Distinction between building theory (inductive) and testing theory (deductive)
Deduction: From Theory to Hypothesis to Observations (Quantitative emphasis)
Hypothesis: a clear statement of expected relationships between two or more variables
Common in quantitative research
Induction: From Observations to Theory (Qualitative emphasis)
Field research to develop theories through observations
In some quantitative methods, techniques like Latent Class Analysis are inductive
Wheel of Science (integrates deduction and induction in research practice)
Deduction: From Theory to Hypothesis
Process: Theory → Hypothesis → Observations
Hypothesis: explicit, testable statement about relationships between variables
Emphasis: testing theoretical predictions with data
Induction: From Observations to Theory
Process: Observe patterns in social life → identify regularities → develop generalized explanations
Emphasis: theory-building from empirical observation
Latent Class Analysis cited as an inductive method in quantitative work
Wheel of Science
Concept that research often involves a cycle of theory and data collection, with both deductive and inductive steps
Today’s Summary and Takeaways
Research may have multiple purposes: exploration, description, explanation
Different approaches to causality: nomothetic (general laws) vs idiographic (case-focused explanations)
Different units of analysis: individuals, groups, organizations, social artifacts, geographical areas, etc.
Time dimensions: cross-sectional vs longitudinal (including trend, cohort, and panel variants)
Two logical systems: induction and deduction (and their use in theory-building and theory-testing)
Next Week Preview
Conceptualization and Operationalization (Chapter 5)
Prepare to define and operationalize key concepts for measurement in research
Equations, Concepts, and Notation References (LaTeX)
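Correlation concept: IV and DV are related when $\rho_{IV,DV} \neq 0$; $\rho_{IV,DV} > 0$ is a positive correlation and $\rho_{IV,DV} < 0$ is a negative correlation.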
Time order (temporal precedence):
If a variable $IV$ occurs before $DV$ in time, we have $t_{IV} < t_{DV}$.
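Risk calculation (example from tables): $\text{Risk} = \dfrac{\text{Dead}}{\text{Total}}$
For smoking vs. mortality in Whickham 20-year table: $\text{Risk}_{\text{smoker}} = \dfrac{139}{582} \approx 0.239$, $\text{Risk}_{\text{non-smoker}} = \dfrac{230}{732} \approx 0.314$
Age-group risks (example 18-44): $\text{Risk}_{\text{smoker}} = \dfrac{15}{285} \approx 0.053$, $\text{Risk}_{\text{non-smoker}} = \dfrac{12}{339} \approx 0.035$
Age-stratified risks (65+): $\text{Risk}_{\text{smoker}} = \dfrac{44}{50} = 0.880$, $\text{Risk}_{\text{non-smoker}} = \dfrac{165}{193} \approx 0.855$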
False criteria for nomothetic causality (conceptual): these are not accepted criteria, as nomothetic explanations are probabilistic and usually incomplete