Comprehensive Notes on Research Methods in Relationship Science
Data collection in Relationship Science
- Focus on observable behavior and interaction as data sources
- Examples of observable data include:
- Facial expressions
- Who is speaking and what they’re saying
- How they’re saying it (tone, prosody)
- Observational coding schemes (highly structured coding of behavior) used to quantify interactions
- Example study approach described:
- Bring couples into a lab or natural setting
- Prime them to discuss a typical topic
- Record the interaction for later coding
- Code movement of face, tone of voice, and interaction dynamics
- Analyze how participants respond to each other in real time (e.g., face changes, snapping back, calming the conflict)
- Observational methods provide a rich, contextual picture but are labor-intensive and costly
- Two broad data collection approaches discussed:
- Single-instance observational data
- Repeated observations or longitudinal designs
- Example: measure a couple before and after a relationship-education program using the same observational method
- Contrast with experience sampling (ESM):
- Participants carry a device that records short verbal samples when prompted
- Records snippets of lived experiences throughout the day, then data are coded
- The purpose of ESM is to capture data that match the research question (e.g., college students’ conversations after a big football game)
- Preschool observational work described:
- Some early pilot studies described as watching many children for short periods (e.g., 10 seconds per child) and collecting thousands of data points to measure sociability
- Observational data are typically collected in naturalistic settings and in short bursts, but they can be collected in lab settings too
- Common questions about observational data:
- Is a single observation enough, or should we collect multiple sessions?
- How does study design (lab vs natural setting) influence findings?
Potential downsides and limitations of observational data
- Reactivity effect (Hawthorne effect): participants may behave differently because they know they’re being observed
- Narrow window: a single observation may not generalize to typical behavior
- Limited access to internal states: data are limited to observable behaviors; private states can only be inferred indirectly
- High cost and effort: training, reliability checks, and coding time are substantial
- Reliability challenges:
- Raters need consistent coding across observers
- Requires rigorous training and calibration to reach acceptable reliability
- Time and cost escalate with larger samples
- Mitigation strategies:
- Mixed methods: combine observational data with self-report data
- Use standardized coding schemes and training materials
- Use multiple observational instances to increase reliability
- Technological considerations:
- Video/audio recording equipment, secure storage, and data management
- Synchronization of multiple data streams (e.g., video plus audio transcripts)
- Ethical considerations in observational studies are discussed within the Belmont Report framework (see Ethics section)
Experience Sampling (ESM) in relationship science
- ESM samples real-lived experiences across time rather than relying on a single observation
- Typical setup:
- Portable recording device (e.g., handheld recorder or smartphone app)
- Automatic prompts or user-initiated recordings
- The device records only while the participant is talking, then stops
- Advantages:
- Captures variability across moods, contexts, conflicts, and interactions
- Reduces reliance on retrospective reporting
- Limitations:
- Participant burden and potential nonresponse bias
- Data are momentary and require careful interpretation
- Examples:
- After a significant social event (e.g., football game), what topics arise and how do partners interact?
- Across a school day, what interactions occur with peers or partners?
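The automatic prompts in the typical setup can be scheduled in several ways. The following is a minimal sketch, assuming a signal-contingent design in which the sampling window is split into equal blocks and one prompt is drawn at random within each block. The function name `prompt_schedule`, the 09:00 to 21:00 window, and the choice of six prompts are illustrative assumptions, not details from the studies described in these notes.

```python
import random

def prompt_schedule(n_prompts, start_hour=9, end_hour=21, seed=None):
    """Return n_prompts clock times (HH:MM), one per equal block of the window.

    Hypothetical helper: the window and block design are assumptions.
    """
    rng = random.Random(seed)  # seeded generator so a schedule can be reproduced
    total_minutes = (end_hour - start_hour) * 60
    block = total_minutes / n_prompts
    times = []
    for k in range(n_prompts):
        # Draw one random offset inside the k-th block of the window.
        offset = rng.uniform(k * block, (k + 1) * block)
        minute_of_day = start_hour * 60 + int(offset)
        times.append(f"{minute_of_day // 60:02d}:{minute_of_day % 60:02d}")
    return times

schedule = prompt_schedule(6, seed=1)
print(schedule)
```

Splitting the window into blocks keeps prompts spread across the day while leaving the exact moment unpredictable to the participant.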
Other data collection modalities
- Self-report data (surveys, questionnaires): prompts participants to report thoughts, feelings, behaviors
- Physiological data: body-based indicators of relationship processes
- Heart rate, respiration, cortisol (stress hormone), skin conductance (sweat)
- Brain activity (e.g., EEG/other measures) in some studies
- Hormone levels can be measured non-invasively (e.g., salivary cortisol)
- Archival and secondary data: using existing data sources
- Public records (marriage/divorce decrees, birth records)
- Health records, governmental datasets
- Online data (social media posts, forums, Reddit) to infer relationship processes
- Example of archival/social-media data use:
- Analyzing why people call off marriages by coding Reddit posts without directly asking participants
Key distinctions among data types
- Natural environments vs lab settings: data can be collected in either; ecological validity often higher in natural settings
- Single-instance vs repeated measures: longitudinal data allow observing change over time
- Depth vs breadth: observational coding yields rich, detailed data; self-report can cover broader constructs
- Inter-method fit: researchers often combine data types (mixed methods) to address limitations of any single approach
Reliability in relationship science data
- Reliability refers to consistency of measurement
- Three forms highlighted:
- Internal reliability (internal consistency): consistency of responses within a measure
- External reliability (test-retest reliability): stability of scores across time or conditions
- Inter-rater reliability: consistency between different coders or raters
Internal reliability
- Focused on how consistently multiple items measure the same construct
- Example: a depression scale with 10 items
- Internal consistency means respondents answer related items in a coherent way
- Common statistic: Cronbach’s alpha, \alpha = \frac{N}{N-1}\left(1 - \frac{\sum_{i=1}^{N}\sigma_i^2}{\sigma_T^2}\right)
- N = number of items
- \sigma_i^2 = variance of item i
- \sigma_T^2 = variance of the total score
- High internal reliability suggests items cohere well; not sufficient alone for validity
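As a worked illustration of Cronbach’s alpha, here is a minimal sketch that computes it from a respondents-by-items table. The function name `cronbach_alpha` and the small response table are hypothetical and exist only to show the arithmetic; a real analysis would normally use an established statistics package.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: list of respondents, each a list of N item scores."""
    n_items = len(item_scores[0])
    # Variance of each item across respondents (sample variance).
    item_vars = [variance([resp[i] for resp in item_scores])
                 for i in range(n_items)]
    # Variance of respondents' total scores (same variance estimator).
    total_var = variance([sum(resp) for resp in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 respondents answering a 3-item scale.
responses = [
    [3, 4, 3],
    [2, 2, 1],
    [4, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

When respondents who score high on one item also score high on the others, the total-score variance is large relative to the summed item variances, and alpha approaches 1.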
External reliability (test-retest)
- Measures stability of scores over time
- Approach: administer the same measure to the same participants at two or more time points
- Ideal outcome: scores remain reasonably stable when the underlying construct is stable
- Note: some constructs (e.g., mood) may legitimately change over time; interpretation depends on construct
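Test-retest reliability is commonly summarized as the correlation between scores at two time points. The sketch below assumes Pearson’s r as that summary and uses made-up scores for five participants; `pearson_r`, `time1`, and `time2` are illustrative names.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations (proportional to the covariance).
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of summed squared deviations (proportional to the SDs).
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (dev_x * dev_y)

# Hypothetical scores for the same five participants, two weeks apart.
time1 = [10, 14, 8, 20, 16]
time2 = [11, 13, 9, 19, 17]
print(f"test-retest r = {pearson_r(time1, time2):.3f}")
```

A high r says participants kept roughly the same rank order across sessions. For a construct expected to change, such as mood, a lower r does not by itself indicate a poor measure.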
Inter-rater reliability
- Important for observational and coding data
- Gauges whether different raters code the same behavior similarly
- Common statistics:
- Cohen’s kappa: \kappa = \frac{P_o - P_e}{1 - P_e}, where P_o is observed agreement and P_e is chance agreement
- Intraclass correlation (ICC) for continuous ratings
- Achieving high inter-rater reliability requires training, calibration, and clear coding manuals
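To make the kappa formula concrete, the following minimal sketch computes it for two coders who each assigned one category to the same ten interaction segments. The codes ("pos", "neg", "neu") and the function name `cohens_kappa` are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same segments."""
    n = len(rater_a)
    # Observed agreement: share of segments given the same code.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions,
    # summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    # Undefined (division by zero) if both raters use a single identical code.
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes from two trained coders for ten segments.
coder_1 = ["pos", "pos", "neg", "neu", "pos", "neg", "neg", "neu", "pos", "neg"]
coder_2 = ["pos", "pos", "neg", "neu", "neg", "neg", "neg", "pos", "pos", "neg"]
print(f"kappa = {cohens_kappa(coder_1, coder_2):.3f}")
```

Here the coders agree on 8 of 10 segments, but kappa is lower than 0.80 because some of that agreement would be expected by chance alone.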
Validity in relationship science data
- Validity concerns whether a measure accurately captures what it is intended to measure
- Three forms highlighted:
- Convergent validity: different measures of the same construct yield similar results
- Divergent (discriminant) validity: measures of different constructs do not highly correlate
- Face validity: the measure appears to assess the intended construct at face value
Convergent validity
- If two separate depression scales yield high correlation for the same individuals, they demonstrate convergent validity
- Conceptual idea: different methods converge on the same underlying construct
Divergent validity
- Measures of related but distinct constructs should not be perfectly correlated
- Example: depression vs. relationship satisfaction should not perfectly track together, though some overlap may exist
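Convergent and divergent validity are usually checked by comparing correlations. The sketch below assumes three hypothetical score lists for the same six participants: two depression scales and one relationship-satisfaction scale. All names and numbers are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (dev_x * dev_y)

# Hypothetical scores for the same six participants.
depression_a = [5, 12, 9, 20, 15, 7]     # first depression scale
depression_b = [6, 11, 10, 19, 16, 8]    # second depression scale
satisfaction = [24, 21, 26, 20, 23, 19]  # relationship satisfaction

r_convergent = pearson_r(depression_a, depression_b)
r_divergent = pearson_r(depression_a, satisfaction)
print(f"depression A vs depression B: r = {r_convergent:.2f}")
print(f"depression A vs satisfaction: r = {r_divergent:.2f}")
```

Evidence for validity comes from the pattern: a strong correlation between the two depression scales, and a clearly weaker (though not necessarily zero) correlation with the distinct construct.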
Face validity
- The items or tasks should seem appropriate for the construct being measured
- Poor face validity example: measuring depression with only questions about eating habits
- Face validity matters for participant understanding and engagement, but it is not sufficient for overall validity
Reliability vs validity relationship
- Reliability is a prerequisite for validity: a measure cannot be valid if it is not reliable
- A measure can be reliable but not valid (consistently wrong)
- Example visuals (described conceptually):
- Highly reliable but far off target (consistent but biased)
- On-target but inconsistent (accurate on average but noisy)
- Neither reliable nor valid (scattered and off-target)
- Practical takeaway: ensure both reliability and validity when interpreting data
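The three target patterns can be expressed numerically as bias (how far the average reading sits from the true value) and spread (how much readings vary). This is a minimal sketch assuming a known true score of 50 and three invented sets of repeated readings; in real research the true score is not directly observable.

```python
from statistics import mean, stdev

TRUE_SCORE = 50  # hypothetical known value of the construct

def bias_and_spread(readings, true_score=TRUE_SCORE):
    """Return (mean error, standard deviation) for repeated readings."""
    return mean(readings) - true_score, stdev(readings)

# Hypothetical repeated readings from three instruments.
measures = {
    "reliable but off target": [60, 61, 59, 60, 60],
    "on target but inconsistent": [40, 60, 45, 55, 50],
    "neither": [70, 45, 85, 60, 90],
}

for name, readings in measures.items():
    bias, spread = bias_and_spread(readings)
    print(f"{name}: bias = {bias:+.1f}, spread = {spread:.1f}")
```

Small spread corresponds to reliability, and small bias corresponds to being on target. Only a measure with both is trustworthy for any single reading.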
Ethics and responsible research with human participants
- Research with humans requires ethical consideration and oversight
- Belmont Report (1979): foundational document outlining ethical principles for human subjects research
- Respect for persons: acknowledge autonomy; protect those with diminished autonomy; informed consent
- Beneficence: do not harm; maximize benefits; minimize risks
- Justice: fairness in distribution of research burdens and benefits
- Institutional Review Board (IRB): independent group that evaluates proposed research to protect participants
- If a study poses risk or ethical concerns, IRB can require modifications or reject the study
- Historical caution: Tuskegee syphilis study highlighted the dangers of unethical research and the need for IRB oversight
- Informed consent: written and/or verbal documentation detailing a study’s purpose, procedures, risks, benefits, duration, compensation, data handling, and right to withdraw
- Participant rights: voluntary participation, right to withdraw at any time without penalty, ability to ask questions and receive answers
- Ethical conduct in practice:
- Be transparent about study goals and potential harm
- Ensure fair treatment and respect for participants
- Avoid coercion and undue influence
- Protect privacy and data confidentiality
- Provide debriefing and resources if discussing sensitive topics
Data quality, limitations, and responsible interpretation
- Data quality matters: poor data lead to incorrect conclusions
- Data cleaning: process of identifying and correcting or removing noisy or erroneous data
- Examples: misreporting (e.g., “three kids” but only two checked), duplicated records, missing values
- Limitations and transparency:
- All studies have limitations; good papers acknowledge them and outline their impact
- Be cautious about overgeneralizing from single studies or small samples
- Being a good consumer of research:
- Check measurement definitions (what was measured and how)
- Assess reliability and validity evidence
- Look for explicit limitations and potential biases
- Consider the appropriateness of the data type for the research question
- Examine ethical considerations (IRB approval, consent, data handling)
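The data-cleaning problems listed earlier in this section (inconsistent reports, duplicated records, missing values) can be screened for with simple checks. The sketch below is a minimal illustration; the field names `id`, `num_children`, `child_ages`, and `satisfaction` are hypothetical, and flagged records would still need a human decision about whether to correct or remove them.

```python
def flag_issues(records):
    """Return (participant id, issue) pairs for records needing review."""
    issues = []
    seen_ids = set()
    for rec in records:
        pid = rec["id"]
        # Duplicated record: the same participant id appears again.
        if pid in seen_ids:
            issues.append((pid, "duplicate record"))
            continue
        seen_ids.add(pid)
        # Missing value on a key outcome.
        if rec.get("satisfaction") is None:
            issues.append((pid, "missing satisfaction score"))
        # Misreporting: stated number of children differs from those listed.
        if rec["num_children"] != len(rec["child_ages"]):
            issues.append((pid, "child count mismatch"))
    return issues

# Hypothetical survey records.
survey = [
    {"id": 1, "num_children": 2, "child_ages": [4, 7], "satisfaction": 5},
    {"id": 2, "num_children": 3, "child_ages": [2, 9], "satisfaction": 4},
    {"id": 3, "num_children": 0, "child_ages": [], "satisfaction": None},
    {"id": 1, "num_children": 2, "child_ages": [4, 7], "satisfaction": 5},
]
for pid, issue in flag_issues(survey):
    print(f"participant {pid}: {issue}")
```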
Mixed methods and real-world applications
- Mixed methods combine observational, self-report, physiological, and archival data to provide a fuller picture
- The choice of method depends on the research question and practical constraints
- Real-world example from the module:
- An assignment where students discuss whether a self-report survey on satisfaction is appropriate for a longitudinal study
- Emphasis on thinking through the data type, reliability, and interpretation
- Practical research example discussed in class:
- Studying toddler-satisfaction or partner-report correlations; identifying X and Y variables and their directional relationship
- Using correlation concepts to interpret whether higher scores on one measure align with higher scores on another
- Broader relevance:
- These methods inform relationship education programs, clinical practice, and understanding of relationship dynamics in everyday life
Quick glossary and key terms (with notes)
- Experience Sampling (ESM): real-time data collection through momentary reports
- Internal reliability: consistency of items within a measure; often quantified by Cronbach’s alpha α
- External reliability: stability of scores over time (test-retest)
- Inter-rater reliability: agreement among coders or raters (e.g., Cohen’s kappa κ, ICC)
- Convergent validity: different measures of the same construct yield similar results
- Divergent validity: measures of different constructs do not correlate strongly
- Face validity: items appear to measure the intended construct at face value
- Belmont Report: ethical principles for human subjects research (Respect for Persons, Beneficence, Justice)
- IRB: institutional review board that reviews research proposals to protect participants
- Hawthorne effect: behavioral changes due to awareness of being observed
- Data cleaning: process of correcting or removing inaccurate data
- Correlation: statistical association between two variables; often summarized by r = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y}
- Causality and longitudinal design: longitudinal data help establish temporal order and can support causal inferences, though establishing causality still requires careful design and analysis
- Archival data: using existing records or datasets rather than collecting new data
- Mixed methods: integrating qualitative and quantitative approaches to address research questions
Example prompts and reflection (to practice exam-style thinking)
- If you observe a high inter-rater reliability (ICC or Cohen’s kappa), what does that imply about your observational coding scheme?
- How would you determine whether a depression scale used in a relationship study has adequate convergent validity with another established depression measure?
- What ethical steps would you take to study a topic involving intimate partner violence or high-stress conflict, and how would you communicate potential risks to participants?
- Given a short-term cortisol measure during a conflict task, how would you interpret elevated cortisol with respect to the participant’s subjective report of stress?
- When would experience sampling be preferred over a single in-lab observational session, and why?
Final takeaway
- High-quality relationship science relies on a thoughtful combination of data sources, rigorous reliability and validity checks, and strict ethical conduct
- Use transparent reporting of methods, acknowledge limitations, and integrate multiple data types to form robust conclusions
- Being a critical consumer of research includes scrutinizing measurement quality, data handling, and the broader implications for individuals and communities
Additional note from the instructor's example
- There is an ongoing study in the department on attachments and adult brain responses to images; students were invited to consider this study as an example of ethical research and data interpretation in practice