Construct Validity & Operationalisation – Research Methods (Psychology 2SOC)

Lecturer begins by paying respect to the traditional custodians of country throughout Australia.
- Recognises their ongoing connection to land, sea and community.
- Extends respect to Elders past and present, and to all Aboriginal and Torres Strait Islander peoples.

Constructs = abstract, general ideas that underpin theories.
- Examples mentioned: “positive contact”, “liking”, “aggression”.
Importance:
- They organise observations and predict future findings.
- Cannot be observed directly; require operationalisation.

Definition of construct validity (CV): the degree to which an operationalisation (measure or manipulation) truly reflects the intended construct.
Central Questions:
- Does the manipulation create the state we claim?
  • E.g., Does doing a jigsaw puzzle with an older person truly represent “positive contact”?
- Does the measure capture the construct?
  • E.g., Is asking “On a scale of $1$ – $10$ how much do you like older people?” sufficient to capture “liking”?
Significance: Without good CV, findings cannot properly test or refine theory.

Two broad tasks:
1. Manipulation of Independent Variables (IVs).
  • Must induce the conceptual state (e.g., create positive vs. no contact).
2. Measurement of Dependent Variables (DVs).
  • Must reflect the conceptual outcome (e.g., behavioural, attitudinal, physiological indices of liking).
Choices are non-trivial; involve creativity, cost, ethics, and practicality.

IV manipulation: Completing a jigsaw puzzle with an older person vs. simply being in the same room.
DV options discussed:
- Self-report liking scale $1$ – $10$.
- Behavioural indices: seating distance, willingness to help, time spent.
CV Issues:
- Is cooperative puzzle-doing the “best” instantiation of positive contact?
- Which DV best generalises to real-world outcomes?

Broad definition of aggression: behaviour intended to harm or cause pain.
Key Distinctions:
- Hostile (angry) aggression: harm is the primary goal.
- Instrumental aggression: harm is a means to another end (e.g., nurse giving an injection to protect public health).
Implications for CV:
- A theory about temperature increasing aggression may only apply to hostile aggression.
- Measurement choice (e.g., noise-blast paradigm vs. willingness to inflict pain for money) must align with the subtype under investigation.

Careful, theory-driven decision making.
Trial-and-error across multiple studies; continuous refinement.
Use multiple measures or multiple operationalisations within a single study to triangulate the construct.
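
One simple way to combine several operationalisations of the same construct is to standardise each measure and average the z-scores per participant. The sketch below is illustrative only: the measure names, the data, and the `composite` helper are hypothetical and were not part of the lecture. It assumes every measure is recorded for every participant, and that seating distance is reverse-keyed (sitting closer is taken to indicate more liking).

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardise raw scores to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def composite(measures, reverse=()):
    """Average z-scores across measures, one composite per participant.

    measures: dict mapping a measure name to a list with one score per participant.
    reverse:  names of measures where a LOWER raw score means MORE of the construct.
    """
    standardised = []
    for name, scores in measures.items():
        z = z_scores(scores)
        if name in reverse:
            z = [-v for v in z]  # flip sign so higher always means more liking
        standardised.append(z)
    # zip(*...) regroups the per-measure lists into per-participant tuples
    return [mean(person) for person in zip(*standardised)]

# Hypothetical data for five participants (assumed values, for illustration)
liking_measures = {
    "self_report_1_to_10": [8, 3, 6, 9, 4],
    "seating_distance_cm": [60, 150, 90, 50, 130],
    "minutes_spent_helping": [12, 2, 7, 15, 4],
}
liking_index = composite(liking_measures, reverse={"seating_distance_cm"})
```

A composite like this is only meaningful if the individual measures actually converge; if they disagree, that disagreement is itself information about construct validity.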

Formats: questionnaires, interviews, Likert items, semantic differentials, free response.
Strengths:
- Efficient, inexpensive, can tap unobservable states (thoughts, plans, attitudes toward AI, etc.).
Weaknesses:
- Social desirability bias – people present an overly positive self-image.
- Limited self-knowledge or poor recall (e.g., estimating hours volunteered in last $12$ months).
- Question-order & wording effects.
- Psychometric quality (reliability, validity) has to be demonstrated; reviewers routinely scrutinise it.

Data: counts, latencies, durations, choice patterns (e.g., lever presses by rats when light is on).
Procedures:
- Create situations where target behaviour can occur (e.g., allow volunteering time; record conversations for warmth).
- Use trained raters; assess inter-rater reliability (agreement among observers before combining or averaging ratings).
Strengths:
- Less reliant on introspection; can capture implicit processes.
Weaknesses:
- Reactivity (participants alter behaviour when watched).
  • Hidden cameras raise major ethical issues; two-way mirrors require disclosure.
- Observer biases – mitigated via multiple raters and reliability statistics.

Reaction-time paradigms: how quickly a key is pressed after a stimulus.
Eye-tracking: gaze location reveals attention patterns.
Experience Sampling (ESM):
- Smartphones “ping” participants multiple times per day with brief surveys (e.g., $3$ items on happiness, busyness).
- Can embed manipulations (e.g., instruct one group to perform $10$ helpful acts in a week).
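
The random "pings" can be sketched as a small scheduling routine. The code below is a minimal illustration, not a description of any particular ESM app: the function name `schedule_pings`, the 09:00 to 21:00 window, the five prompts per day, and the 30-minute minimum gap are all assumptions chosen for the example.

```python
import random
from datetime import datetime, timedelta

def schedule_pings(day_start, day_end, n_pings, min_gap_minutes=30, seed=None):
    """Draw n_pings random prompt times between day_start and day_end,
    keeping consecutive prompts at least min_gap_minutes apart."""
    rng = random.Random(seed)
    window = int((day_end - day_start).total_seconds() // 60)
    # Minutes left over once the mandatory gaps have been reserved
    slack = window - (n_pings - 1) * min_gap_minutes
    if slack < 0:
        raise ValueError("window too short for the requested pings and gap")
    # Sample sorted offsets within the slack, then push each later prompt
    # forward by one more minimum gap so the spacing constraint always holds
    offsets = sorted(rng.randint(0, slack) for _ in range(n_pings))
    return [
        day_start + timedelta(minutes=offset + i * min_gap_minutes)
        for i, offset in enumerate(offsets)
    ]

# Hypothetical study day: five prompts between 09:00 and 21:00
start = datetime(2024, 1, 1, 9, 0)
end = datetime(2024, 1, 1, 21, 0)
pings = schedule_pings(start, end, n_pings=5, seed=1)
```

Randomising the times within a waking-hours window means participants cannot anticipate the prompts, while the minimum gap is one way of limiting participant burden.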
Classic & emerging biosignals: heart rate, galvanic skin response, neuro-imaging (EEG, fMRI).
- Portability & cost remain challenges (e.g., portable fMRI not yet feasible).
Trade-offs:
- Higher ecological validity in ESM; higher cost/complexity in neuro-tech.

For observational data:
- Multiple raters quantify the same behaviour (e.g., warmth on $1$ – $7$ scale).
- Combine scores only when inter-rater consistency is acceptable.
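
The "check consistency first, then combine" rule can be sketched as follows. This is an illustration under stated assumptions: two raters, Pearson's $r$ as the consistency index, and a cut-off of $0.70$. The lecture did not specify a statistic or threshold, and the ratings below are made up.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def combine_if_consistent(rater_a, rater_b, threshold=0.70):
    """Return (r, averaged ratings), or (r, None) if consistency is too low.

    threshold is an assumed cut-off for illustration, not a fixed standard.
    """
    r = pearson_r(rater_a, rater_b)
    if r < threshold:
        return r, None  # do not average; retrain raters or refine the coding scheme
    return r, [mean(pair) for pair in zip(rater_a, rater_b)]

# Hypothetical warmth ratings (1-7 scale) for six recorded conversations
rater_a = [6, 2, 5, 7, 3, 4]
rater_b = [5, 3, 5, 6, 2, 4]
r, warmth = combine_if_consistent(rater_a, rater_b)
```

Pearson's $r$ indexes consistency (do the raters rank the conversations similarly?) rather than absolute agreement, so a rater who is systematically one point harsher can still correlate highly with a colleague. Agreement-based indices such as the intraclass correlation are a common alternative when absolute scores matter.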
For self-report scales: internal consistency, test–retest reliability, factor structure all scrutinised to support validity claims.
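
Internal consistency is commonly summarised with Cronbach's alpha, $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$, where $k$ is the number of items, $s_i^2$ the variance of item $i$, and $s_T^2$ the variance of the total score. The lecture did not name a specific statistic, so treat this as one standard option. The three-item scale and the responses below are hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a multi-item scale.

    item_scores: list of items, each a list with one score per respondent
                 (all items scored in the same direction).
    """
    k = len(item_scores)
    sum_item_variances = sum(variance(item) for item in item_scores)
    # Total score per respondent: sum across items
    totals = [sum(respondent) for respondent in zip(*item_scores)]
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# Hypothetical responses: 3 items x 5 respondents
items = [
    [4, 2, 5, 3, 1],
    [5, 2, 4, 3, 1],
    [4, 1, 5, 2, 2],
]
alpha = cronbach_alpha(items)
```

Higher values indicate that the items covary strongly. A high alpha supports reliability but does not by itself show that the scale measures the intended construct.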

Informed consent limits covert recording.
Hidden cameras/two-way mirrors rarely approved by ethics committees.
Technology (beepers, phones) must respect privacy and minimise participant burden.

Progress via competing theories and competing operationalisations.
Single studies rarely decisive; body of evidence (multiple methods, replications) builds confidence.
Construct validity is re-evaluated each time a new measure/manipulation is introduced or an unexpected result emerges.

Good research requires a tight link between abstract constructs and concrete operations.
No single operationalisation is perfect; convergence across methods strengthens conclusions.
Awareness of strengths, weaknesses, ethical issues, and cost guides methodological choices.
Construct validity sits at the heart of theory testing, demanding critical scrutiny at every stage of the scientific process.