Research Methods: Internal & External Validity

Acknowledgement of Country

Speaker Art Stukas begins by acknowledging the traditional custodians of Country throughout Australia and their ongoing connection to land, sea, and community.
Respect is paid to Elders past and present, and extended to all Aboriginal and Torres Strait Islander peoples.

Psychology 2SOC is currently focused on research methods and critical consumption of findings.
Core goal: design investigations that test, validate, or refine theories by examining when and why they work.
Two key evaluative lenses introduced:
- Internal Validity (confidence in cause–effect inside the study)
- External Validity (confidence in generalizing the findings outside the study)

Observational Research
- Naturalistic observation: watch behaviour in real-world contexts.
- Participant observation: researcher joins the group (e.g., becoming a volunteer to study volunteering from the inside).
Archival Analysis
- Uses historical records, artifacts, or existing data (e.g., records of post-disaster cooperation between groups, to see whether contact increases liking).
Surveys / Correlational Studies
- Measure variables without manipulating them; compute associations.
- Example: Is age related to political beliefs? Cannot claim causation.
- Useful first step that may inspire experiments.
Experiments (Gold Standard)
- Researcher manipulates an independent variable (IV), compares the manipulated condition(s) with a control, and measures a dependent variable (DV).
- Allows causal inference if well-designed.
- Requires ethics approval; manipulations can be mundane (hot vs. cold rooms) or social (confederates acting).
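The "compute associations" step in a correlational study can be illustrated with a minimal Python sketch of the Pearson correlation. The helper name `pearson_r` and the age and belief scores are invented for illustration only; they are not data from the lecture. A non-zero r describes an association and says nothing about which variable, if either, is the cause.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical survey responses (made up for this sketch, not real data).
ages = [19, 23, 31, 38, 45, 52, 60, 67]
conservatism = [2.1, 2.8, 2.5, 3.6, 3.2, 4.1, 3.9, 4.6]  # invented 1-5 scale

r = pearson_r(ages, conservatism)
print(f"r = {r:.2f}")
```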

Theory refresher
- Cognitive dissonance: inconsistency between beliefs and behaviour creates tension that motivates change.
- Effort-justification hypothesis: the more effort invested in a goal, the more positively one evaluates the outcome.
- Logical chain: “I worked hard $\Rightarrow$ I must really value this.”

Ethnographic / Observational
- Many cultures have rites of passage; possible function = bonding people to the group via effort.
- Hard to rule out other explanations.
Survey Examples
- U.S. fraternities/sororities: severity of initiation vs. reported liking.
- University study time vs. liking for La Trobe University.
- Problems: self-selection, reverse causality, and third variables (an unmeasured C that drives both measures).

Experiment: Aronson & Mills (1959)
Participants: N = 63 university women (Mills College, CA).
Setting: “Psychology of Sex” study—provocative in 1959.
Random Assignment (3 conditions):
1. Control: no initiation.
2. Mild initiation: read sex-related dictionary words aloud.
3. Severe initiation: say the “7 words you can’t say on television” into a microphone before male experimenters.
Common Experience: All listened (via headphones) to an intentionally dull discussion—“Sex habits of the whooping crane.”
DV: Desire to join the discussion group (liking for the group).
Results (qualitative summary):
- Severe > Mild > Control in reported liking; supports effort-justification.
Ethical Reflection
- Power and gender imbalance: male experimenters, female participants, sexual language.
- 2019 critique by J.C. Young & P. Hegarty questions the ethics through a contemporary lens (#MeToo era).

Definition (internal validity): Degree to which observed DV changes can be confidently attributed to the manipulated IV.
Key Design Features Enhancing Internal Validity
- Control group(s) for baseline comparison.
- Random assignment: with $k$ conditions, each participant has an equal chance, $P = \tfrac{1}{k}$, of landing in any one condition, which removes self-selection bias.
- Blinding / masking: experimenter unaware of participant condition to prevent experimenter-expectancy effects.
- Standardized procedures: all other aspects held constant.
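Random assignment can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in any particular study: the function name `assign_conditions`, the participant IDs, and the seed are all hypothetical. It also makes one extra design choice, balanced groups, by shuffling the participant list and dealing it out round-robin.

```python
import random


def assign_conditions(participant_ids, conditions, seed=None):
    """Randomly assign participants to conditions with near-equal group sizes."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled participants into conditions round-robin, so group
    # sizes differ by at most one.
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(shuffled)}


conditions = ["control", "mild", "severe"]
participants = [f"P{i:02d}" for i in range(1, 64)]  # 63 hypothetical IDs

assignment = assign_conditions(participants, conditions, seed=2024)
counts = {c: sum(1 for v in assignment.values() if v == c) for c in conditions}
print(counts)  # 63 participants over 3 conditions: 21 per group
```

Because the shuffle is uniform, no participant characteristic (motivation, personality, prior attitudes) can systematically determine who ends up in which condition.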
Threats & Examples
- Confound: extraneous variable varying systematically with the IV (e.g., testing Control in the morning and Severe at night $\Rightarrow$ time-of-day confound).
- Experimenter expectancy: differential questioning tone if the experimenter knows condition.
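The time-of-day confound can be made concrete with a small simulation. In this sketch the true effect of the initiation condition is deliberately set to zero, and liking is assumed to be higher at night by a made-up amount (`night_bonus`); every number here is hypothetical. When session time is tied to condition, a spurious "effect" appears; when session time is decided independently of condition, it disappears.

```python
import random


def mean(values):
    return sum(values) / len(values)


def severe_minus_control(confounded, n_per_group=2000, night_bonus=1.0, seed=0):
    """Difference in mean liking (severe - control) when the true IV effect is zero."""
    rng = random.Random(seed)
    scores = {"control": [], "severe": []}
    for condition, group_scores in scores.items():
        for _ in range(n_per_group):
            if confounded:
                # Schedule tied to condition: control in the morning, severe at night.
                at_night = condition == "severe"
            else:
                # Session time decided by coin flip, independent of condition.
                at_night = rng.random() < 0.5
            # Liking depends only on time of day and noise, never on condition.
            liking = rng.gauss(5.0, 1.0) + (night_bonus if at_night else 0.0)
            group_scores.append(liking)
    return mean(scores["severe"]) - mean(scores["control"])


print(f"confounded schedule: {severe_minus_control(True):+.2f}")
print(f"randomized schedule: {severe_minus_control(False):+.2f}")
```

Under the confounded schedule the group difference is close to `night_bonus` even though the manipulation does nothing, which is exactly why a confound undermines internal validity.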

Definition (external validity): Extent to which results generalize across people, settings, manipulations, and time.
Sample Considerations
- Representative vs. convenience samples.
- Historical reliance on WEIRD participants (Western, Educated, Industrialized, Rich, Democratic).
- Question: do findings replicate in non-WEIRD contexts (e.g., Japan, China, Bangladesh, African nations)?
Setting Considerations
- Laboratory: high control $\Rightarrow$ high internal validity, possibly lower ecological realism.
- Field: natural environment $\Rightarrow$ higher realism, greater external validity, but less control.
Operationalization Diversity
- Different measures of “effort” (time, money, physical pain).
- Different forms of “liking” (attitude scales, behavioural choices).
Modern Advances
- Online platforms enable multi-country data collection and more diverse samples.
- Translation and cultural adaptation of measures.

Three causal models when variables A and B are correlated:
1. $A \rightarrow B$ (e.g., violent media $\rightarrow$ aggression).
2. $B \rightarrow A$ (already-aggressive individuals seek violent media).
3. A third variable $C$ (e.g., chaotic home life) influences both.
Correlational designs alone cannot adjudicate among these models.
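The third-variable model can be illustrated with a short simulation. In this sketch, A and B have no causal link at all; both are generated from a shared variable C plus independent noise. All values are simulated, and the helper names `corr` and `partial_corr` are chosen for this example. The partial correlation uses the standard first-order formula for removing C from both variables.

```python
import random
from math import sqrt


def corr(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_corr(r_ab, r_ac, r_bc):
    """Correlation of A and B after statistically removing C from both."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


rng = random.Random(42)
n = 5000
c = [rng.gauss(0, 1) for _ in range(n)]   # third variable (e.g., home environment)
a = [ci + rng.gauss(0, 1) for ci in c]    # A depends on C only
b = [ci + rng.gauss(0, 1) for ci in c]    # B depends on C only

r_ab = corr(a, b)
r_ab_given_c = partial_corr(r_ab, corr(a, c), corr(b, c))
print(f"r(A, B)            = {r_ab:.2f}")
print(f"r(A, B | C removed) = {r_ab_given_c:.2f}")
```

A and B correlate substantially even though neither causes the other, and the association largely vanishes once C is accounted for. In real data this only helps for third variables that were actually measured, and it does not settle the direction question between models 1 and 2.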

Replication: repeating studies in new samples or settings to verify robustness.
Moderator Testing: systematically vary potential moderators (e.g., culture, age, initiation type) to map boundary conditions.
Meta-Analysis & Systematic Review
- Aggregate effect sizes across studies; quantify the average effect $\bar{d}$ and heterogeneity $I^2$.
- Identify overall support, gaps, and future research directions.
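One common way to obtain an average effect and a heterogeneity estimate is an inverse-variance (fixed-effect) summary: weight each study's effect size by 1/variance, compute Cochran's Q, and derive I^2 = max(0, (Q - df) / Q) as a percentage. The sketch below assumes that approach; the lecture does not specify a model, and the three effect sizes and variances are invented purely for illustration.

```python
def fixed_effect_summary(effects, variances):
    """Inverse-variance weighted mean effect, Cochran's Q, and I^2 (in percent)."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies, each with an effect and a variance")
    weights = [1.0 / v for v in variances]  # more precise studies get more weight
    d_bar = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations of study effects from the mean effect.
    q = sum(w * (d - d_bar) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return d_bar, q, i2


# Hypothetical standardized mean differences (d) and their sampling variances.
effects = [0.2, 0.5, 0.8]
variances = [0.04, 0.04, 0.04]

d_bar, q, i2 = fixed_effect_summary(effects, variances)
print(f"mean effect = {d_bar:.2f}, Q = {q:.2f}, I^2 = {i2:.1f}%")
```

With these made-up inputs the weights are equal, so the mean effect is simply 0.5, Q is 4.5 on 2 degrees of freedom, and I^2 is about 56%. A random-effects model would be the usual alternative when heterogeneity of this size is expected.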
Open Science Movement (teased for next lecture)
- Preregistration, data sharing, and transparency to enhance credibility.

Historical studies (Aronson & Mills, Milgram) yielded insights but raised ethical concerns: participant stress, deception, power dynamics.
Modern ethics committees require:
- Informed consent, right to withdraw.
- Risk–benefit analysis.
- Debriefing.
Researchers must balance knowledge gain with participant welfare and societal values.

Probability of random assignment to one of $k$ conditions: $P = \tfrac{1}{k}$.
Correlation coefficient symbol: $r_{XY}$.
Internal validity goal: isolate a single causal path $IV \rightarrow DV$.
Confound definition (informal): $\exists\, Z : (Z \text{ covaries with } IV) \wedge (Z \text{ affects } DV)$.

No single method is perfect; each offers different strengths.
Strong internal validity demands tight control and randomization; strong external validity demands representativeness and realistic settings.
Science progresses cumulatively: diverse methods, continual replication, and ethical vigilance.
Students should critically evaluate both the how (method/validity) and the why (theoretical significance) of every study they read.