
T3 Notes

Exam logistics

  • Mid-semester exam date and format
    • Saturday, September 6 at 2 PM.
    • Fully invigilated and in person.
    • Covers content from weeks 1–4 (all material covered thus far, including lectures and textbook readings assigned in weeks 1–4).
    • Tutorial content will not be assessed; only lecture content and assigned textbook readings matter.
    • Worth 20% of the overall course grade.
    • Exam format: 40 multiple-choice questions (MCQs); each MCQ contributes 0.5% to the final grade.

Case study overview (research design exercise)

  • Study design overview (as described in the session):
    • Participants divided into two relationship-status groups:
      • Newly coupled participants (in a new relationship).
      • Continually coupled participants (in medium-to-long-term relationships).
    • Measures collected:
      • Participants rated their actual partner on four traits: physical attractiveness, vitality, status/resources, and warmth/trustworthiness.
      • Participants also rated their ideal partner (their preference) on the same four traits.
    • Discrepancy score for each participant:
      • Reflects the difference between how they rated their actual partner and how they rated their ideal partner.
      • Commonly framed as D = Rating_actual − Rating_ideal, where the magnitude |D| reflects the discrepancy.
    • Main finding: discrepancy scores were on average lower in continually coupled participants than in newly coupled participants.
    • Interpretation given: people in longer relationships calibrate their ideal partner to be closer to their actual partner; relationship status affects ideal partner preferences.
    • Baseline ratings of actual partners were similar between the two groups, meaning the difference lay in the ideals, not the actuals.
    • Implication: over time, people’s preferences may recalibrate to match their current partner’s traits.
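The discrepancy score described above can be sketched in code. This is a minimal illustration with invented ratings; the aggregation rule (mean absolute actual-minus-ideal difference across the four traits) is one plausible choice, since the notes do not specify how the four trait differences were combined.

```python
# Sketch of the discrepancy-score computation; all ratings are hypothetical.
# The four traits follow the study's list; the aggregation rule is assumed.

TRAITS = ["attractiveness", "vitality", "status_resources", "warmth_trust"]

def discrepancy(actual, ideal):
    """Mean absolute actual-minus-ideal difference across the four traits."""
    return sum(abs(actual[t] - ideal[t]) for t in TRAITS) / len(TRAITS)

# One hypothetical participant: 1-7 ratings of the actual and ideal partner.
actual = {"attractiveness": 5, "vitality": 6, "status_resources": 4, "warmth_trust": 7}
ideal  = {"attractiveness": 6, "vitality": 6, "status_resources": 6, "warmth_trust": 7}

print(discrepancy(actual, ideal))  # → 0.75
```

On the study's main finding, continually coupled participants would show smaller values of this score, on average, than newly coupled participants.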

Critical-thinking prompts (group discussion and critique)

  • Task given to students:
    • For about 10 minutes at the table (or individually): identify sources of systematic and nonsystematic variability in this study.
    • Propose alternative explanations for the results.
    • If flaws are identified, discuss possible redesigns to obtain more robust conclusions.
  • Example points raised in discussion (from the transcript):
    • Systematic variability:
      • Group assignment (continually vs newly coupled) is not time-defined; variability in group composition beyond simple status (e.g., years together) could confound results.
    • Nonsystematic variability:
      • What participants actually value may be influenced by momentary mood, context, or recent experiences rather than stable trait preferences.
    • Alternative explanations discussed:
      • Survivorship bias: those whose partners already closely match their ideals may be more likely to stay together and thus be in the continually coupled group.
      • Cognitive dissonance or justification effects: long-term partners might adjust their reported ideals to maintain consistency with their relationship, or to rationalize staying in the relationship.
      • Age effects: older participants in longer relationships might have different, potentially more realistic, preferences.
      • Regression toward the mean or measurement noise in trait ratings.
    • Redesign ideas proposed:
      • Add a third group (e.g., long-term but not extremely long-term) or measure duration as a continuous variable (years together) and analyze with regression.
      • A longitudinal design: recruit new couples, obtain baseline discrepancy (actual vs ideal) at dating onset, and follow over time to observe how discrepancy changes and whether initial discrepancy predicts relationship continuity.
      • A longitudinal design would help address survivorship bias and test whether lower baseline discrepancy predicts relationship persistence, while also examining whether preferences recalibrate over time.
  • Instructor’s reflections (summarized):
    • Reiterated main alternative explanations (e.g., survivorship bias, cognitive dissonance, age effects).
    • Emphasized that the study’s design cannot conclusively determine whether preferences shift due to relationship status or due to preexisting alignment between partner and self.
    • Suggested longitudinal study as the best approach to disentangle these factors, acknowledging time and cost barriers in practice.

Constructs, variables, and operationalization (conceptual groundwork)

  • Core idea: constructs vs variables

    • Construct: abstract concept (e.g., cognitive flexibility).
    • Operationalization: concrete measurement of the construct as a variable (e.g., reaction times in a task-switching paradigm).
    • Example from lecture: cognitive flexibility in older adults measured via a task-switching paradigm; construct = cognitive flexibility; operationalized as reaction times on switch vs non-switch trials.
    • Importance of clarity in operationalization: enables replication and ensures readers understand exactly what was measured and how.
    • There can be multiple valid ways to operationalize a construct; no single “right” method.
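The task-switching example can be made concrete. The sketch below operationalizes cognitive flexibility as a "switch cost" (mean reaction time on switch trials minus mean reaction time on non-switch trials); the RT values are hypothetical.

```python
import statistics

# Operationalizing the construct "cognitive flexibility" as a switch cost
# in a task-switching paradigm. All reaction times (ms) are hypothetical.

switch_rts    = [720, 810, 765, 790, 700]   # trials where the task rule changed
nonswitch_rts = [540, 610, 580, 605, 565]   # trials where the rule repeated

# Switch cost: mean RT on switch trials minus mean RT on non-switch trials;
# a larger cost indicates less flexible switching.
switch_cost = statistics.fmean(switch_rts) - statistics.fmean(nonswitch_rts)
print(switch_cost)  # → 177.0
```

This illustrates the construct-vs-variable distinction: the construct is cognitive flexibility; the variable actually recorded and analyzed is the switch cost in milliseconds.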
  • Scales of measurement (classification of variables)

  • Four primary scales with key features and examples:

    • Nominal scale:
      • Definition: Categories without any mathematical ordering or relationship.
      • Example: Type of drug in a trial (paracetamol vs placebo).
      • Other examples: Ethnicity categories; eye color (brown, blue, green).
      • Mathematical implication: no meaningful arithmetic operations between categories.
    • Ordinal scale:
      • Definition: Ordered categories, but intervals between adjacent categories are not guaranteed to be equal.
      • Example: Finishing position in a race (1st, 2nd, 3rd, …) or Likert-type scales (strongly disagree to strongly agree).
      • Important nuance: treated as ordinal in theory, but often analyzed as interval in practice due to numeric labeling.
    • Interval scale:
      • Definition: Ordered with equal intervals between categories, but no true zero point.
      • Example: Dates on a calendar (differences are meaningful; e.g., 2020 vs 2021 is a 1-year interval, but there is no true zero year).
    • Ratio scale:
      • Definition: Ordered with equal intervals and a true zero point, allowing meaningful ratios.
      • Examples: Dosage of a drug (0 mg means no drug), reaction times (0 ms means no time taken).
  • Special notes and nuanced examples from the lecture:

    • Time-of-day as a variable can be represented across different scales depending on measurement choice:
      • Nominal: day vs night.
      • Ordinal: order of times (dawn, noon, afternoon, evening).
      • Interval: standard 12-hour clock without an absolute zero.
      • Ratio: 24-hour time with midnight as a true zero point, enabling meaningful ratios (e.g., 14:00 is twice as far from midnight as 07:00).
    • Eye color could theoretically be treated as a ratio measure if one used a continuous brightness metric, though typically it is treated as nominal.
    • Socioeconomic status can be nominal or ordinal depending on how it’s measured (income vs a ranked status ladder).
    • Self-reported happiness and other Likert-type scales are typically treated as ordinal in theory, though often analyzed as interval in practice.
    • IQ scores and age are nuanced. IQ is often treated as at least ordinal and sometimes interval, but intervals may not be uniform across the scale and the meaning of a zero score is unclear. Age is generally treated as ratio but can be binned to yield ordinal or interval representations.
    • Temperature example: Celsius is interval; Kelvin is ratio; both measure temperature but differ in zero-point interpretation.
    • Final note on measurement: the same construct can be represented differently across scales; responsible researchers choose the scale that best fits the study design and analysis.
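The lecture's time-of-day example can be written out as code. The sketch below shows the same observation (14:00) represented at each of the four scales; the variable names and encodings are illustrative choices, not the lecture's.

```python
# One "time of day" observation (14:00) represented at each scale.
# Which operations are valid depends on the scale chosen.

t_nominal  = "day"          # nominal: category only; no order, no arithmetic
t_ordinal  = 3              # ordinal: rank among dawn=1, noon=2, afternoon=3, evening=4
t_interval = (2, "PM")      # interval: 12-hour clock; differences meaningful, no true zero
t_ratio    = 14.0           # ratio: hours since midnight; true zero, ratios meaningful

# Only the ratio representation supports a meaningful ratio:
print(t_ratio / 7.0)  # → 2.0 (14:00 is twice as far from midnight as 07:00)
```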

Validity and reliability (quality of measurements)

  • Validity: the extent to which a measurement actually measures what it claims to measure.

    • Internal validity: how well conclusions about relationships can be drawn from the study design (focus on confounds and nuisance variance).
    • External (ecological) validity: generalizability of results beyond the current setting or sample.
    • Construct validity: whether the test actually measures the intended construct (strongly related to the adequacy of the construct measurement).
    • Predictive validity: whether scores on a measurement co-vary with a future criterion it should predict.
    • Content/face validity: whether the test appears to measure the intended construct; often considered the least critical form of validity.
  • Concrete examples from meditation/anxiety study to illustrate validity categories:

    • Internal validity (good): Random assignment to a meditation vs control group would reduce systematic differences between groups; absence of confounds would strengthen causal inferences.
    • Internal validity (poor): Self-selection into the meditation group (volunteering based on motivation) could confound results.
    • External validity (good): Include participants from multiple schools, ages, prior meditation experiences.
    • External validity (poor): Include only first-year psychology students from a single university.
    • Construct validity (good): Use multiple indicators of anxiety (self-report, behavioral observations, physiological measures like heart rate or skin conductance).
    • Construct validity (poor): Rely on a single self-report item to assess anxiety.
    • Predictive validity (good): Reductions in anxiety scores predict fewer real-life anxiety symptoms later (panic attacks, etc.).
    • Predictive validity (poor): Short-term anxiety reductions do not translate into real-world improvements over time.
    • Face validity (good): A measurement that clearly asks about anxiety (e.g., “How anxious do you feel right now?”) is easy for participants to interpret.
    • Face validity (poor): Using obscure physiological measures that participants don’t associate with anxiety may reduce perceived relevance, though it could be useful in some deception-prone contexts.
    • Practical note: Face validity is often less important than construct validity; it can be strategically manipulated in some contexts (e.g., to prevent participants from altering responses).
  • Reliability (stability/consistency of a measure):

    • Test-retest reliability: administer the same measure more than once and assess whether results correlate across administrations.
    • Inter-rater reliability: different raters’ scores correlate; essential when observations are involved.
    • Good reliability examples:
      • Using a well-validated, standardized measure (e.g., Beck Depression Inventory) with known test-retest reliability.
      • Trained, standardized observers rating aggressive behaviors with detailed criteria.
    • Poor reliability examples:
      • A newly developed depression inventory with highly variable scores from day to day due to mood, fatigue, or time of day.
      • Untrained, unstandardized raters giving divergent scores because of personal bias.
    • Conceptual relationship between validity and reliability:
      • It is possible to have good reliability but poor validity (consistent but not measuring the intended construct).
      • It is uncommon to have good validity with poor reliability; if reliability is near zero, there is essentially no meaningful information to rely on (like measuring with a spaghetti tape measure).
  • Methodological notes about reporting and interpretation:

    • The method section should be written in past tense because the study has been completed.
    • Distinguish between validity concepts and reliability concepts when evaluating a study.

Method section structure (how to write up a study)

  • Four subsections, with suggested order and core content:
    • Participants (must be included and described first)
      • Include the total number of participants (n), who they were, and how they were selected.
      • Describe participation incentives (voluntary, paid, or other) and relevant demographic variables (age, gender, etc.).
      • Example template (one paragraph):
        • "Participants (n = 57; 40 female, 14 male, 3 non-binary) were undergraduates enrolled in three small sections of a third-year biology course at the University of Western Australia. They participated voluntarily; ages ranged from 18 to 32, with the mean and standard deviation reported. Gender and other demographics were reported as counts or percentages."
      • Practical formatting note: spell out numbers that begin a sentence; write numbers 10 and above as numerals (with some exceptions); report means and standard deviations as numerals.
    • Design (structure of the study)
      • State the overall design: experimental vs observational; independent groups vs repeated measures; cross-sectional vs longitudinal; etc.
      • For the example in the transcript: observational, correlational; not experimental; not cross-sectional; no grouping by preexisting criteria beyond natural group status.
      • Name the key constructs (e.g., conscientiousness, anxiety) and describe how you operationalize them (e.g., Big Five conscientiousness subscale; summed item scores).
      • A concise example template: "Participants were assessed in a two-condition setup with repeated measures; construct X measured via Y; Z measured via W."
    • Procedure
      • Detail step-by-step how the study was carried out, including who administered it, participant instructions, and data collection method.
      • Example from the transcript: online questionnaire completed in tutoring class; instructions provided by tutor; written task instructions; response sheets submitted; debriefing provided.
      • Emphasize replication: the write-up should allow exact replication from the description.
    • Materials
      • Describe each scale used in the study (even those not central to the hypotheses); include the number of items per scale and per subscale, and provide at least one example item per scale (not per subscale).
      • Note reverse scoring where applicable and why it matters.
      • Direct readers to the appendix for full scale details (scales used, items, etc.).
      • Practical tip: for this course's assignment, describe every scale used, even those not analyzed; this is more exhaustive than a typical published paper.
  • General guidance and style notes:
    • Use past tense consistently.
    • Keep each subsection concise (often a single paragraph per subsection).
    • The four subsections together should provide enough detail for a reader to replicate the study exactly, including the specific measures and procedures used.
    • The Materials subsection will typically be longer than the other sections because it includes descriptions of multiple scales and example items.
    • An appendix section may include full scales, item-level details, and scoring rules; refer readers there from the Materials subsection.

Practical examples and templates (quick-reference for writing)

  • Participant example template (one paragraph):
    • "Participants (n = 57; 40 female, 14 male, 3 non-binary) were undergraduates enrolled in three sections of a third-year biology course at the University of Western Australia. Participation was voluntary. Ages ranged from 18 to 32 (M = 25.3, SD = 3.7). Gender breakdown reported as counts."
  • Design example (one-paragraph):
    • "Design: observational, correlational. No experimental manipulation. The study examined relationships between variables X and Y within a single sample. Constructs: X was measured by [scale or items], Y by [scale or items]."
  • Procedure example (one-paragraph):
    • "Procedures: participants completed online questionnaires in their first tutorial class. The tutor provided instructions; written task prompts were supplied; responses were collected via a digital response sheet; debriefing followed the session."
  • Materials example (one-page overview, longer for real study):
    • "Scales administered included Scale A (XX items; Cronbach’s α = .XX), Scale B (YY items; Cronbach’s α = .YY), and Scale C (ZZ items; Cronbach’s α = .ZZ). One sample item per scale: A: 'I feel [construct-related item]'; B: '[item]'; C: '[item]'. Half of the items on Scale A and Scale B were reverse-scored. Scores were summed to yield a total for each scale. Appendix A lists all items and subscales."

Closing notes and next steps

  • The instructor’s closing remark: next week will cover data analysis in Excel and calculating the standard deviation.
  • Key takeaway: the upcoming session will build practical data analysis skills to complement the conceptual foundations reviewed here.