EBP Lec 13: Correlations and Comparisons

Opening Remarks

Good Morning Greeting: The speaker welcomes everyone to the lecture, expressing enthusiasm and encouragement for the day's content.
Structure of the Day: Mention of a four-hour lecture stint, indicating that future lectures will involve more interactive elements, including presentations and revisions.
Previous Topics Recap: Review of the chocolate experiment and fundamentals of inferential statistics including:
- Taking a random sample from a population
- Making inferences about the population based on sample data
- Introduction to the Central Limit Theorem for statistical analysis.

Key Concept: Correlation

Definition of Correlation: A relationship between two variables.
Types of Correlation:
- Positive Correlation: As one variable increases, the other also increases; similarly, if one decreases, the other decreases.
- Example: Higher chocolate consumption corresponds with a higher number of Nobel Prize laureates.
Example Comic Reference:
- The comic emphasizes that "correlation does not imply causation."

Illustrative Examples of Correlation vs. Causation

Ice Cream and Drowning Example:
- An increase in ice cream sales correlates with an increase in drownings, but sunny weather is the true underlying cause.
- The speaker emphasizes the common logical mistake of assuming causation from correlation.
Married vs. Single Men:
- Married men tend to live longer; however, healthier men are more likely to get married in the first place.
- This instance demonstrates how statistical relationships can be misinterpreted.
Short-Sighted Children:
- A study found a correlation between sleeping with the lights on and short-sightedness; the link was later attributed to genetics rather than lighting.
Self-Esteem and Academic Performance:
- Initial conclusions that self-esteem causes good grades were reversed; in reality, good grades increase self-esteem.

Important Distinction: Statistical Significance

Understanding Correlation Mistakes: The speaker emphasizes the dangers of erroneously inferring causation from correlation.
Limitations: Correlations should be interpreted with caution; they may hint at deeper issues but require additional analysis.

Types of Correlation

Positive Correlation:
- Example provided: Scores in anatomy and physiology increasing together.
Negative Correlation:
- The opposite pattern: an increase in one variable is accompanied by a decrease in the other. Example given: hearing threshold levels and speech discrimination scores.
No Correlation:
- Random data with no direct relationship, exemplified using unrelated variables such as house numbers and age.
Curvilinear Relationships:
- Explanation that some data may show relationships that are not linear (like health metrics against weight).

Measuring Correlation

Pearson’s Correlation Coefficient (r):
- Represents the strength of the correlation:
- Interpretation of r:
  - $r = 1$: Perfect positive correlation
  - $r = -1$: Perfect negative correlation
  - $r = 0$: No linear correlation
Strength of Correlation
- Value threshold interpretations:
- 0.00-0.30: Weak
- 0.30-0.70: Moderate
- 0.70-1.00: Strong

Reporting Correlation Statistics

Required statistics: direction, strength, and significance (p-value).
Example of Reporting: "There is a significant strong positive correlation between anatomy and physiology scores."

Apparent Counterpoints to Correlation

Examples of situations in which correlation does not indicate causation, given for clarity.

Regression Analysis Overview

Correlation establishes relationships; regression is used for predictions based on correlated data.
Regression Equation: The example provided: physiology score = 4.5 + (0.961 * anatomy score).
Important distinction between independent variable (anatomy score) and dependent variable (physiology score).

Assumptions for Linear Regression Analysis

Need to work with interval or ratio data.
Assumption of independence of observations.
Assumption of linear relationships and normal distribution.
Homoscedasticity: Equal variance among data points.

Conducting Regression Analysis using Software (Minitab)

Steps outlined for executing regression in Minitab.
Interpretation includes:
- R² value: the proportion of variance accounted for.
- All reported outputs including significance of regression statistics.

Conclusion and Transition to T-tests

Transition into t-tests context:
- The speaker explains the necessity for comparing two groups with categorical independent variables that can't be handled by correlation or regression methods.
Introduction to different types of t-tests: one-sample, two-sample, and paired t-tests, while discussing methods of executing them in practical scenarios.

Summary Guidance for a Research Assignment

Group assignments announced: exploration of a provided research question, consultation of literature, data analysis, and integration of findings, culminating in a report.
Reminder of tools and community resources available for student use.

Correlation – Concept and Purpose

Definition

Correlation describes the relationship between two variables — how one changes with respect to the other.
Can be:
- Positive correlation: both increase or decrease together.
- Negative correlation: one increases while the other decreases.
- No correlation: no consistent relationship.
- Curvilinear: relationship exists but not linear.

Key Reminder

“Correlation does not imply causation.”

Example (video):
- Ice cream sales vs drownings — both increase in summer, but caused by weather, not each other.
- Marriage and lifespan — healthier men more likely to marry, not vice versa.
- Children’s night lights and myopia — short-sighted parents leave lights on; genetics, not lighting, causes short-sightedness.
- Self-esteem and grades — good grades → high self-esteem, not the reverse.

Be cautious: correlations can be real but misleading if confounding variables exist.

3. Types of Correlation

Type	Pattern	Example
Positive	As one increases, the other increases.	Anatomy vs Physiology scores — students who perform well in one tend to perform well in the other.
Negative	As one increases, the other decreases.	PTA thresholds vs Speech discrimination — poorer hearing → lower speech scores.
No Correlation	No pattern.	House number vs Age.
Curvilinear	Non-linear pattern.	Weight vs Health — best at optimal weight, declines if too low/high.

In this course, focus on linear relationships (straight-line trends).

4. Pearson’s Correlation Coefficient (r)

Definition

Quantifies strength and direction of linear relationship.
Denoted by r (ranges from –1 to +1).
- r = +1 → perfect positive linear relationship.
- r = –1 → perfect negative linear relationship.
- r = 0 → no linear relationship.
Works with interval or ratio data only.
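
The definition above can be made concrete with a short sketch. It computes r directly from the standard formula (co-variation of the two variables divided by the product of their spreads). The function name `pearson_r` and the five pairs of scores are illustrative assumptions, not the class dataset.

```python
import math

def pearson_r(x, y):
    """Pearson's r: co-variation of x and y scaled by both spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2 points)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical scores for five students (made up for illustration).
anatomy = [55, 62, 70, 78, 85]
physiology = [58, 60, 73, 80, 83]

r = pearson_r(anatomy, physiology)
print(f"r = {r:.2f}")
```

Data that fall exactly on a rising straight line give r = +1, and data on a falling line give r = –1, matching the list above.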

Comparison

Spearman’s correlation (ρ) → used for ordinal data or non-parametric cases.
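
One common way to picture Spearman's ρ is as Pearson's r applied to the ranks of the data rather than the raw values. The sketch below follows that idea; the helper names (`average_ranks`, `spearman_rho`) and the data are assumptions for illustration, and tied values are given the mean of the ranks they occupy.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def spearman_rho(x, y):
    """Spearman's rho as Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up data: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(f"Spearman rho = {rho:.2f}, Pearson r = {r:.2f}")
```

Because the ordering of x and y agrees perfectly, ρ reaches 1 even though the points do not lie on a straight line, so Pearson's r stays below 1.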

Strength Guidelines

r value range	Strength
0.00–0.19	Very weak
0.20–0.39	Weak
0.40–0.59	Moderate
0.60–0.79	Strong
≥ 0.80	Very strong

Real-world data rarely exceeds r = 0.8.
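
The strength table can be read as a simple lookup on the size of r, ignoring its sign. The sketch below is one possible encoding; the function name `strength_label` is hypothetical, and treating each band as running up to (but not including) the next lower bound is an assumption, since the table leaves small gaps such as 0.19 to 0.20.

```python
def strength_label(r):
    """Map a correlation coefficient to the five-band strength guideline."""
    size = abs(r)  # strength ignores direction
    if size > 1:
        raise ValueError("r must lie between -1 and +1")
    if size < 0.20:
        return "very weak"
    if size < 0.40:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong"
    return "very strong"

for value in (0.10, -0.45, 0.87):
    direction = "positive" if value > 0 else "negative"
    print(f"r = {value:+.2f}: {strength_label(value)} {direction} correlation")
```

Note that the coarser three-band scale given earlier in these notes (weak, moderate, strong) would simply call r = .87 "strong"; the labels depend on which guideline is adopted.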

5. Three Aspects to Report

Direction – Positive, Negative, or None.
Strength – Weak/Moderate/Strong.
Significance – p-value (< 0.05).

Important Note

A strong correlation is not necessarily significant if the sample size is too small.
- e.g., r = 0.98 but n = 3 → p = 0.127 (not significant).
Always report r and p together.
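
The n = 3 example can be checked by hand. A correlation is usually tested by converting r to a t statistic with n – 2 degrees of freedom; with n = 3 there is only 1 degree of freedom, and in that special case the two-tailed p-value has a closed form. The sketch below relies on that special case only (it is not a general p-value routine), and the function names are illustrative.

```python
import math

def t_from_r(r, n):
    """Convert a correlation from n pairs into a t statistic (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def two_tailed_p_df1(t):
    """Two-tailed p-value for a t statistic with exactly 1 degree of freedom."""
    return 1 - (2 / math.pi) * math.atan(abs(t))

t = t_from_r(0.98, 3)
p = two_tailed_p_df1(t)
print(f"t(1) = {t:.2f}, p = {p:.3f}")
```

The result is a p-value of roughly 0.13, in line with the figure quoted above: even a near-perfect r is not significant with only three data points.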

6. Pearson’s Correlation in Minitab

Procedure

Open dataset (e.g., Anatomy vs Physiology).
Go to Stat → Basic Statistics → Correlation.
Select both variables.
In “Graphs” options, tick “Show correlation and P value”.

Output Example

r = 0.87, p = .001
→ Significant strong positive correlation between anatomy and physiology scores.

APA Reporting Style

Omit leading zero (APA rule).
Report as:
r = .87, p < .001
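
The reporting rules can be sketched as a small formatter: drop the leading zero, use two decimals for r, and report very small p-values as p < .001. The function names below are hypothetical, and using three decimals for p is an assumption based on the examples in these notes.

```python
def apa_number(value, decimals=2):
    """Format a statistic bounded by 1 in size, without the leading zero."""
    text = f"{abs(value):.{decimals}f}"
    if text.startswith("0."):
        text = text[1:]  # "0.87" becomes ".87"
    return ("-" if value < 0 else "") + text

def apa_p(p):
    """Report p to three decimals, or as 'p < .001' when it is smaller."""
    if p < 0.001:
        return "p < .001"
    return "p = " + apa_number(p, 3)

def apa_correlation(r, p):
    return f"r = {apa_number(r)}, {apa_p(p)}"

print(apa_correlation(0.87, 0.0002))
print(apa_correlation(-0.45, 0.044))
```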

7. Exercise: Height and Shoe Size

Task: Run correlation for height and shoe size using class data.
Result example:
r = .87, p < .001
→ Strong, positive, significant correlation between height and shoe size.
Report p < .001 (never p = 0).
Use 2 decimal places (3 if needed for clarity).

8. Regression Analysis

Definition

Explores prediction between two variables once correlation exists.
Regression line (line of best fit):
$Y = A + BX$
- Y = dependent (predicted variable).
- X = independent (predictor variable).
- A = intercept.
- B = slope (gradient).

Example:
Physiology score = 4.5 + 0.961 × Anatomy score
→ For every 1-point increase in anatomy, physiology rises by 0.961.
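
As a rough illustration of where A and B come from, the sketch below fits the line of best fit by ordinary least squares, which is the usual method for a single predictor. The function name `fit_line` and the scores are made up for illustration; they will not reproduce the 4.5 and 0.961 from the lecture example.

```python
def fit_line(x, y):
    """Least-squares line for one predictor: returns (intercept A, slope B)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx                    # B: change in Y per unit of X
    intercept = mean_y - slope * mean_x    # A: predicted Y when X = 0
    return intercept, slope

# Hypothetical scores for five students (not the class dataset).
anatomy = [55, 62, 70, 78, 85]
physiology = [58, 60, 73, 80, 83]

a, b = fit_line(anatomy, physiology)
print(f"Physiology = {a:.2f} + {b:.3f} x Anatomy")
```

The slope is read exactly as in the example above: each extra anatomy point raises the predicted physiology score by B points.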

Assumptions of Linear Regression

Assumption	Meaning
Data level	Both X and Y are interval/ratio.
Independence	Observations are independent and randomly sampled.
Linearity	Relationship must be linear.
Normality	Data approximately normal.
Homoscedasticity	Equal variance of Y across all X values.

Violations → interpret with caution.

9. Output Interpretation

Key Metrics

Statistic	Interpretation
R²	% of variance in Y explained by X (r²).
p-value	Whether regression model is significant.
F-statistic	From ANOVA table (used for APA reporting).

Example: R² = 0.75 → 75% of variance in Physiology explained by Anatomy.
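
The link between R² and r can be verified numerically. The sketch below computes R² as the share of the variation in Y that the fitted line accounts for, and compares it with the square of Pearson's r. The function names and data are illustrative assumptions.

```python
import math

def r_squared(x, y):
    """R-squared for a one-predictor least-squares line."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot  # unexplained share subtracted from 1

def pearson_r(x, y):
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical scores (made up for illustration).
anatomy = [55, 62, 70, 78, 85]
physiology = [58, 60, 73, 80, 83]

r2 = r_squared(anatomy, physiology)
r = pearson_r(anatomy, physiology)
print(f"R^2 = {r2:.3f}, r^2 = {r * r:.3f}")
```

For a single predictor the two numbers agree, which is why R² can be read as r².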

Confidence vs Prediction Intervals

Interval Type	Meaning
Confidence interval (green lines)	Range where true mean response likely lies (±2 SE).
Prediction interval (purple lines)	Range where 95% of all individual values likely fall (±2 SD).

Prediction interval always wider than confidence interval.
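
The reason the prediction interval is always wider can be seen from the usual textbook formulas: the prediction interval includes the scatter of individual points around the line in addition to the uncertainty in the line itself. The sketch below uses the "±2" multiplier from the table as an approximation (an exact 95% interval would use a t critical value instead); the function name and data are assumptions.

```python
import math

def interval_half_widths(x, y, x0, multiplier=2.0):
    """Approximate half-widths of the confidence and prediction intervals at x0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    s = math.sqrt(ss_res / (n - 2))              # residual standard deviation
    line_part = 1 / n + (x0 - mean_x) ** 2 / s_xx  # uncertainty in the line
    ci = multiplier * s * math.sqrt(line_part)       # for the mean response
    pi = multiplier * s * math.sqrt(1 + line_part)   # for a single new value
    return ci, pi

# Hypothetical scores (made up for illustration).
anatomy = [55, 62, 70, 78, 85]
physiology = [58, 60, 73, 80, 83]

ci, pi = interval_half_widths(anatomy, physiology, x0=80)
print(f"confidence: +/- {ci:.2f}, prediction: +/- {pi:.2f}")
```

The extra "1 +" under the square root is the only difference between the two formulas, so the prediction interval can never be narrower.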

10. Regression in Minitab

Steps

Stat → Regression → Fitted Line Plot
Set:
- Response (Y): Dependent variable (e.g., Physiology)
- Predictor (X): Independent variable (e.g., Anatomy)
Under “Options,” tick:
- Display confidence and prediction intervals.

Interpretation Example

R² = .75, F(1,18) = 53.16, p < .001
→ Anatomy scores significantly predict Physiology scores.

APA Reporting Format

Anatomy scores predicted physiology scores, R² = .75, F(1,18) = 53.16, p < .001.

Example: Prediction

Equation: Physiology = 4.46 + 0.9608 × Anatomy
For Anatomy = 80 → Predicted Physiology = 81.32
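
The prediction is plain substitution into the fitted equation. A minimal sketch, using the intercept and slope quoted above as defaults (the function name is hypothetical):

```python
def predict_physiology(anatomy_score, intercept=4.46, slope=0.9608):
    """Plug an anatomy score into the fitted line from the lecture example."""
    return intercept + slope * anatomy_score

predicted = predict_physiology(80)
print(round(predicted, 2))  # 81.32
```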

11. Practice Example: Shoe Size Predicting Height

Regression equation: Height = A + B × Shoe size
R² = .76, F(1,55) = 176.09, p < .001
→ Shoe size strongly predicts height.

Used as forensic example (estimate height from footprint size).

12. T-Tests Overview

Purpose

Compare means between two groups or a group vs known value.
Requires:
- Categorical independent variable
- Interval/ratio dependent variable

“T for Two” — T-tests compare two things.

13. Types of T-Tests

Type	When to Use	Example
One-sample	Compare sample mean to known value.	Mean Mars Bar weight vs 16 g standard.
Independent (Two-sample)	Compare two independent groups.	Mars vs Cadbury chocolate weights.
Paired (Dependent)	Compare two measures from same group.	Student’s Anatomy vs Physiology scores, or Summer vs Winter hours outdoors.

14. One-Sample T-Test

Example

Does the average Mars Bar weight differ from 16 g?
- t-test result: t = 8.41, p < .001
  → Mean weight (17 g) significantly higher than standard.

Interpretation Steps:

Check significance (p < .05).
Compare means to determine direction.
Report in APA:
t(19) = 8.41, p < .001
“Average Mars Bar (M = 17.0 g, SD = 1.2) heavier than recommended 16 g.”
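
The t statistic behind a one-sample test is the gap between the sample mean and the reference value, measured in standard errors. The sketch below shows that calculation on five made-up weights (not the lecture data, so it will not reproduce t = 8.41); the function name is illustrative.

```python
import math
import statistics

def one_sample_t(sample, reference):
    """Return (t, df) for comparing a sample mean with a known value."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample SD (n - 1 in the denominator)
    standard_error = sd / math.sqrt(n)
    return (mean - reference) / standard_error, n - 1

# Hypothetical bar weights in grams.
weights = [16.5, 17.0, 17.5, 16.0, 18.0]

t, df = one_sample_t(weights, 16.0)
print(f"t({df}) = {t:.2f}")
```

A positive t means the sample mean sits above the reference value; the p-value then comes from the t distribution with N – 1 degrees of freedom, as Minitab reports.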

15. Two-Sample (Independent) T-Test

Example

Do Cadbury chocolates differ in weight from Mars chocolates?
- Cadbury (M = 14.4 g), Mars (M = 15.1 g)
- t = –3.10, p = .002
  → Significant difference; Mars heavier.

APA Reporting

t(174) = –3.10, p = .002
“Mars chocolates (M = 15.14 g, SD = 1.2) were significantly heavier than Cadbury (M = 14.41 g, SD = 0.9).”

Notes:

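df = N – 2, where N is the total number of observations across both samples (equal variances assumed).
The sign of t shows only the direction of the difference (it depends on which group is entered first); it does not affect the conclusion about significance.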
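
A sketch of the equal-variance (pooled) two-sample calculation, which is the version that gives df = N – 2. The weights are made up for illustration and the function name is hypothetical. Swapping the order of the groups flips the sign of t and nothing else.

```python
import math
import statistics

def pooled_two_sample_t(group_a, group_b):
    """Return (t, df) for two independent groups, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    return diff / standard_error, df

# Hypothetical chocolate weights in grams.
cadbury = [14.0, 15.0, 16.0]
mars = [15.0, 16.0, 17.0, 18.0]

t, df = pooled_two_sample_t(cadbury, mars)
t_swapped, _ = pooled_two_sample_t(mars, cadbury)
print(f"Cadbury minus Mars: t({df}) = {t:.2f}")
print(f"Mars minus Cadbury: t({df}) = {t_swapped:.2f}")
```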

16. Paired (Dependent) T-Test

Concept

Compares two related measures:
- Same group tested twice (e.g., before/after intervention).
- Two comparable measures from same participants.

Example

Audiology students’ outdoor hours:
- Summer (M = 5.3 h), Winter (M = 3.2 h)
- p = .044 → significant difference.

“Students spend significantly more time outdoors in summer than winter.”

APA Reporting

t(9) = 2.26, p = .044

Notes

Removes inter-subject variability.
df = N – 1, where N is the number of pairs (one sample measured twice).
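
A paired t-test can be thought of as a one-sample t-test on the within-person differences, which is how inter-subject variability drops out. The sketch below uses made-up outdoor hours for five students (not the class data); the function name is illustrative.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t, df) for two related measures on the same participants."""
    if len(first) != len(second):
        raise ValueError("paired data need one value per participant in each list")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

# Hypothetical hours outdoors per day for the same five students.
summer = [6.0, 5.0, 7.0, 4.0, 5.0]
winter = [4.0, 4.0, 5.0, 3.0, 2.0]

t, df = paired_t(summer, winter)
print(f"t({df}) = {t:.2f}")
```

Only the differences enter the calculation, so a student who is outdoors a lot in both seasons contributes no extra noise.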

17. One-Tailed vs Two-Tailed Tests

Test Type	Purpose	Example
Two-tailed	Tests for difference in either direction.	“Does Mars Bar weight differ from 16 g?”
One-tailed	Tests for difference in specific direction.	“Is Mars Bar weight less than 16 g?”

Two-tailed = more conservative, standard approach.
One-tailed = easier to reach significance but less rigorous.

Use one-tailed only with clear directional hypothesis.
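
The claim that a one-tailed test reaches significance more easily can be checked numerically: when the effect lies in the hypothesised direction, the one-tailed p-value is half the two-tailed one. The sketch below approximates the tail area of the t distribution by numerical integration (Simpson's rule) so that it needs only the standard library; it is a rough illustration rather than a replacement for Minitab's output, and all names and the t value are assumptions.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return (math.exp(log_norm) / math.sqrt(df * math.pi)
            * (1 + x * x / df) ** (-(df + 1) / 2))

def upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] (steps is even)."""
    limit = abs(t)
    h = limit / steps
    total = t_pdf(0.0, df) + t_pdf(limit, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_from_zero = total * h / 3
    tail = 0.5 - area_from_zero        # area beyond |t| on one side
    return tail if t >= 0 else 1 - tail

def p_two_tailed(t, df):
    return 2 * upper_tail(abs(t), df)

def p_one_tailed_greater(t, df):
    """One-tailed p for the hypothesis that the mean is greater."""
    return upper_tail(t, df)

# Hypothetical test result.
t_value, df = 2.0, 19
print(f"two-tailed p = {p_two_tailed(t_value, df):.3f}")
print(f"one-tailed p = {p_one_tailed_greater(t_value, df):.3f}")
```

If the observed difference falls in the opposite direction to the hypothesis, the one-tailed p-value becomes large instead, which is why the direction has to be fixed before looking at the data.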

18. Degrees of Freedom Summary

T-Test Type	df Formula
One-sample	N – 1
Two-sample	N – 2
Paired	N – 1

19. Minitab Procedures

Test	Menu Path	Key Inputs	Tips
One-sample	Stat → Basic Statistics → 1-sample t	Enter test mean (e.g., 16)	Use “Perform hypothesis test.”
Two-sample	Stat → Basic Statistics → 2-sample t	Select dependent var (e.g., Weight) + grouping var (e.g., Brand)	Tick “Assume equal variances.”
Paired	Stat → Basic Statistics → Paired t	Select paired columns	Works only with paired data.

20. Practical Activity Summary

One-sample: Sangas R Us rating vs city average (75).
Two-sample: Compare Sangas R Us vs Best Bagels ratings.
Paired: Sangas R Us ratings in January vs June (improvement check).

Each output should include:

Mean, SD
t-value, df, p-value
Interpretation of direction
APA-style report.

21. Final Key Points

✅ Correlation = Relationship
✅ Regression = Prediction
✅ T-tests = Comparison

Always Report:

Direction (positive/negative or group mean)
Strength (r or R²)
Significance (p-value)
Interpretation (what it means in real context)

“Statistics are tools for meaning — numbers only matter when you can interpret them responsibly.”