Notes on The Scientific Method, Correlational Research, and Causation

Goals and Rationale of the Lesson

Learn how scientific research underpins psychology and how to think critically about information presented in everyday life.
By the end of today: be able to (1) name and define each stage of the scientific method, (2) identify the strength and direction of correlations, and (3) explain two reasons why correlation does not equal causation.
Emphasis on developing critical thinking about information, recognizing bias, and understanding how research is conducted to assess validity.

The Scientific Method: An Overview

The scientific method is a systematic approach for answering questions about the nature of reality.
It reduces human biases in observation and judgment by following structured steps.
Core loop: theory → hypotheses → research → data/outcomes → conclusions → inform future theories.
If results support the theory, they strengthen it; if not, theories may be revised or abandoned and the cycle restarts.
This course focuses on concepts, phenomena, and how results inform theory rather than on heavy data analysis.

What is a Theory?

In science, a theory is an explanation for observed patterns grounded in evidence, research, and prior observations.
A good theory is:
- Simple and clear
- Integrates information from disparate areas
- Predictive
Example: Cholesterol Theory
- Proposes that excess cholesterol builds up in arterial walls, narrowing arterial space for oxygenated blood, increasing risk of heart disease and heart attack.
- Important clarifications:
  - Not a proven fact; science uses evidence that supports or does not support a theory, not proof.
  - The absence of proof does not disprove a theory; future evidence could support or challenge it.
Semantic point: In science, we avoid the word proof/proven; we say there is evidence to support or not support a theory.
Why this matters: misuse of “proof” historically has led to harmful biased conclusions (e.g., discredited theories about race and intelligence).

From Theory to Hypothesis

Hypothesis: a testable prediction derived from a theory; typically an if-then statement.
It specifies which variables should be related if the theory is correct and guides the research design.
Examples (continuing from cholesterol theory):
- If high cholesterol leads to heart disease, then people with higher cholesterol should be more likely to have heart disease.
- If cholesterol buildup leads to heart disease, then decreasing cholesterol should reduce the likelihood of heart disease.
Hypotheses bridge theory and empirical testing.

Research and Variables

Research: a systematic investigation into a problem to test hypotheses and gather data.
Variable: any measurable trait or characteristic that can vary across people, objects, or events.
In psychology, many variables are abstract (e.g., stress, self-esteem, discrimination).
It is crucial to operationalize variables to measure them reliably and validly.

Operationalization: Turning Abstract Concepts into Measurable Things

Operationalization = defining a variable so it can be measured in a study.
Operational definition answers: how will we measure this variable in this study?
Example: Stress
- Abstract concept: stress varies between individuals.
- Operational approaches:
  - Self-report: group participants by a stress scale score (e.g., a questionnaire with a 0–25 scale; above 15 = high stress).
  - Manipulation: induce stress in some participants (e.g., timed writing task) and compare to a non-stressed group.
  - Physiological markers: heart rate, blood pressure, cortisol levels (cortisol in blood or saliva above a threshold indicates high stress).
- Rationale: operational definitions ensure that different researchers can understand and replicate how stress was measured.
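The self-report definition above can be made concrete with a short sketch. This is only an illustration, not a validated instrument: the participant IDs and scores are made up, the function name classify_stress is hypothetical, and the label "not high" for scores of 15 or below is an assumption, since the notes only define the high-stress group.

```python
# Operational definition (from the example above): a questionnaire score
# on a 0-25 scale, where a score above 15 counts as "high stress".
SCALE_MIN, SCALE_MAX = 0, 25
HIGH_STRESS_CUTOFF = 15


def classify_stress(score: int) -> str:
    """Return the stress group for one questionnaire score."""
    if not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError(f"score {score} is outside the {SCALE_MIN}-{SCALE_MAX} scale")
    # "not high" is an assumed label; the notes only define the high group.
    return "high" if score > HIGH_STRESS_CUTOFF else "not high"


# Made-up scores for four hypothetical participants.
scores = {"P01": 8, "P02": 16, "P03": 15, "P04": 22}
groups = {pid: classify_stress(score) for pid, score in scores.items()}

for pid, group in groups.items():
    print(pid, scores[pid], group)
```

Writing the cutoff down as an explicit rule is what lets another researcher apply the same definition to their own data.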
Why operationalization matters:
- Different studies may measure the same construct in different ways; reporting definitions allows proper interpretation and replication.
Student exercise: discuss options for measuring stress and justify whether self-report, behavioral tasks, or physiological measures are appropriate for a given study.

Types of Research in Health Psychology (Overview)

Correlational research
Experimental research
Quasi-experimental research
Genetic research (often uses twin designs)
Note: Today's class focuses on correlational research; genetic research will be mentioned but not covered in depth.

Correlational Research: What It Is and What It Is Not

Purpose: estimate the strength and direction of the relationship between variables.
Key limitation: correlation does not imply causation.
- A relationship between two variables does not tell you whether one causes the other.
- Possible interpretations include a third variable causing both, or reverse causality (see below).
The outcome of a correlational study is a correlation coefficient, denoted by a lowercase r (Pearson’s r).
Correlation coefficient properties:
- Range: -1 ≤ r ≤ 1
- Can be positive, negative, or zero.
- The magnitude (absolute value) indicates strength; the sign indicates direction.
- Absolute value interpretation (psychology norms):
  - |r| below about 0.3 = weak (values right around 0.3 are weak to moderate)
  - |r| from about 0.3 to 0.7 = moderate
  - |r| above 0.7 = strong
Example interpretations:
- Negative 0.72 is a strong negative relationship.
- -0.65 is a moderate-to-strong negative relationship (its magnitude is just under the 0.7 cutoff for strong).
- 0.30 is a weak to moderate positive relationship.
- 0.00 indicates no association.
Visual intuition: scatterplots; stronger relationships align more closely to a straight line; weaker relationships show more scatter.
Common illustrative examples:
- Positive: sunlight and plant growth (more sunlight is associated with more growth; a causal link is plausible, but the correlation alone does not establish it).
- Negative: exercise and risk of heart disease (more exercise associated with lower risk).
Important reminder: correlation does not establish causation; even strong correlations require careful interpretation and further testing to determine causality.
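To show where a value of r comes from, here is a minimal sketch that computes Pearson's r from paired scores using the standard formula (the sum of cross-products of deviations divided by the square root of the product of the summed squared deviations). The function name pearson_r and the exercise/risk numbers are made up for illustration; they are not data from the lecture.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: weekly hours of exercise and a made-up heart-risk score.
exercise_hours = [0, 1, 2, 3, 4, 5]
risk_score = [9, 8, 8, 6, 5, 3]

r = pearson_r(exercise_hours, risk_score)
print(f"r = {r:.2f}")  # negative sign: more exercise goes with lower risk
```

A negative r here describes only the pattern in these made-up numbers; it says nothing about whether exercise causes the lower risk.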

Interpreting the Strength and Direction of Correlations (Concept Checks)

Practice items reviewed in class (illustrative):
- A negative correlation of -0.9 is strong and negative.
- A correlation around -0.4 is a moderate negative association.
- A correlation around 0.0 indicates no association.
- A correlation around +0.6 indicates a moderate positive association.
These interpretations align with the absolute value for strength and the sign for direction.
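The reading rule used in these practice items can be sketched as a small function. It assumes the rough cutoffs from these notes (about 0.3 and 0.7); the function name describe_correlation is hypothetical, and treating the cutoffs as hard boundaries is a simplification, since values near a boundary are really borderline.

```python
def describe_correlation(r: float) -> str:
    """Describe the strength and direction of a correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    if r == 0:
        return "no association"
    size = abs(r)  # strength comes from the absolute value
    if size < 0.3:
        strength = "weak"
    elif size <= 0.7:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "positive" if r > 0 else "negative"  # direction comes from the sign
    return f"{strength} {direction}"


# The four practice items from class.
for value in (-0.9, -0.4, 0.0, 0.6):
    print(value, "->", describe_correlation(value))
```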

Why Correlation Does Not Equal Causation: Two Main Explanations

Reverse Causality Problem:
- When two variables are correlated, it is possible that Variable Y causes Variable X, or vice versa, or both.
- Example in public discourse: media violence and aggression. A correlational study might show children who watch more violent media also display more aggression, but the direction could be that more aggressive children seek out more violent media, or shared third factors influence both.
- The key point: correlation does not specify the direction of effect.
Third Variable Problem (Confounds):
- A third, unmeasured variable Z could influence both X and Y, creating a spurious association between them.
- Classic illustrative example: polio, ice cream, and temperature.
  - Historical misinterpretation: ice cream consumption caused polio because both peak in summer.
  - Actual causal chain: temperature increases ice cream sales and polio cases (seasonal confound).
- In practice, observed correlations can be due to an unmeasured third variable such as SES, stress, health behaviors, etc.
The Captain Kangaroo example was used to illustrate the idea of a lurking third variable (temperature) driving both ice cream sales and polio incidence.
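A small simulation can make the third-variable problem concrete. In the sketch below, temperature (Z) feeds into both ice cream sales (X) and polio cases (Y), while neither X nor Y appears in the other's formula. All numbers, coefficients, and names are invented for illustration and are not real epidemiological data.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Z: simulated daily temperatures (arbitrary units and range).
temperature = [rng.uniform(10, 35) for _ in range(500)]

# X and Y each depend on Z plus their own random noise.
# Neither one is computed from the other.
ice_cream = [2.0 * t + rng.gauss(0, 3) for t in temperature]
polio = [0.5 * t + rng.gauss(0, 2) for t in temperature]

r_xy = pearson_r(ice_cream, polio)
print(f"ice cream vs. polio: r = {r_xy:.2f}")
```

The two simulated variables come out strongly and positively correlated even though the only link between them is their shared dependence on temperature.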

Third Variable Activity: Practical Exploration

Purpose: practice identifying plausible third variables that could explain observed correlations without claiming causation.
Setup: groups analyze pairs of variables and propose candidate third variables that could be related to both.
Example prompts discussed:
- Women who have a baby after age 40 are more likely to live to 100.
  - Possible third variables: overall health, SES, access to healthcare, healthier lifestyle, lower stress, health advancements.
- Number of air conditioning units sold and number of people who drown.
  - Likely third variable: summer weather/temperature, which leads to both higher AC sales and more swimming.
- Class participation and GPA (positive or negative correlation).
  - Possible third variables: student motivation, teaching style, class engagement, external life factors affecting attendance and study time.
- Number of fire hydrants and number of dogs in a city.
  - Likely third variable: city size/population (more people means more hydrants and more dogs).
- Organic food consumption and autism diagnosis.
  - Important caution: this correlation is often reported in the media; proposed third variables include health awareness, access to health information, socioeconomic status, and health-related practices.
  - Critical takeaway: organic food does not cause autism; vaccines do not cause autism; the correlation can reflect confounds or media influence rather than causation.
Student-led reasoning emphasized:
- Focus on identifying variables that could influence both X and Y, not on proving a causal link.
- Consider broader factors like health information access, socioeconomic status, life stress, health advancements, or external factors.
Takeaway from the activity:
- Even strong correlations can be explained by third variables; always ask what else could explain the association.
- Media reports often phrase correlations as causations; apply critical thinking to assess alternative explanations.

Key Takeaways for Interpreting Correlations

Correlation quantifies association: strength (|r|) and direction (sign of r).
Range constraint: -1 ≤ r ≤ 1; absolute value indicates strength, sign indicates direction.
Correlation does not imply causation due to:
- Reverse causality (the direction of influence could be opposite to what is assumed).
- Third-variable confounds (an unmeasured variable drives both observed variables).
Theories vs evidence:
- Theories are supported by evidence but are not proven; ongoing research can strengthen or challenge them.
Operationalization matters for replicability and interpretation:
- Different studies may measure constructs differently; clear definitions are essential for comparing results.
Real-world relevance:
- In health psychology and in media reporting, critically evaluate statements that imply causation from correlation.
- Use the scientific method as a rigorous framework to test claims and avoid overgeneralization.

Final Thoughts for Exam Preparation

Be able to articulate each stage of the scientific method and provide examples.
Be able to explain how to operationalize a variable with concrete definitions and measurement approaches.
Understand how to read a correlation coefficient and interpret both strength and direction.
Be able to distinguish correlation from causation and articulate two (or more) reasons why they are not equivalent, including reverse causality and the third-variable problem.
Practice generating plausible third-variable explanations for given correlations, without asserting causal direction.
Connect these concepts to health psychology contexts and to how findings may be communicated in the media.