EBP Lec 13: Correlations and Comparisons
Opening Remarks
Good Morning Greeting: The speaker welcomes everyone to the lecture, expressing enthusiasm and encouragement for the day's content.
Structure of the Day: Mention of a four-hour lecture stint, indicating that future lectures will involve more interactive elements, including presentations and revisions.
Previous Topics Recap: Review of the chocolate experiment and fundamentals of inferential statistics including:
Taking a random sample from a population
Making inferences about the population based on sample data
Introduction to the Central Limit Theorem for statistical analysis.
Key Concept: Correlation
Definition of Correlation: A relationship between two variables.
Types of Correlation:
Positive Correlation:
As one variable increases, the other also increases; similarly, if one decreases, the other decreases.
Example: Countries with higher chocolate consumption tend to have more Nobel Prize laureates.
Example Comic Reference:
The comic emphasizes that "correlation does not imply causation."
Illustrative Examples of Correlation vs. Causation
Ice Cream and Drowning Example:
An increase in ice cream sales correlates with an increase in drownings, but sunny weather is the true underlying cause.
The speaker emphasizes the common logical mistake of assuming causation from correlation.
Married vs. Single Men:
Married men tend to live longer, but this may be because healthier men are more likely to get married in the first place.
This instance demonstrates how statistical relationships can be misinterpreted.
Short-Sighted Children:
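A study found a correlation between children sleeping with the lights on and short-sightedness; the link was later attributed to genetics, since short-sighted parents are more likely both to leave lights on and to have short-sighted children.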
Self-Esteem and Academic Performance:
Initial conclusions that self-esteem causes good grades were reversed; in reality, good grades increase self-esteem.
Important Distinction: Statistical Significance
Understanding Correlation Mistakes: The speaker emphasizes the dangers of erroneously inferring causation from correlation.
Limitations: Correlations should be interpreted with caution; they may hint at a deeper issue, but working out what is actually going on requires additional analysis.
Types of Correlation
Positive Correlation:
Example provided: Scores in anatomy and physiology increasing together.
Negative Correlation:
The opposite pattern: an increase in one variable is associated with a decrease in the other. Example given: hearing threshold levels and speech discrimination scores.
No Correlation:
Random data with no direct relationship, exemplified using unrelated variables such as house numbers and age.
Curvilinear Relationships:
Explanation that some data may show relationships that are not linear (like health metrics against weight).
Measuring Correlation
Pearson’s Correlation Coefficient (r):
Represents the strength and direction of a linear correlation.
Interpretation of r:
$r = 1$: Perfect positive correlation
$r = -1$: Perfect negative correlation
$r = 0$: No linear correlation
Strength of Correlation
Value threshold interpretations:
0.00-0.30: Weak
0.30-0.70: Moderate
0.70-1.00: Strong
Reporting Correlation Statistics
Report three things: direction, strength, and significance (p-value).
Example of Reporting: "There is a significant strong positive correlation between anatomy and physiology scores."
Apparent Counterpoints to Correlation
Further examples of situations where a correlation does not indicate causation.
Regression Analysis Overview
Correlation establishes relationships; regression is used for predictions based on correlated data.
Regression Equation: The example provided: physiology score = 4.5 + (0.961 * anatomy score).
Important distinction between independent variable (anatomy score) and dependent variable (physiology score).
Assumptions for Linear Regression Analysis
Need to work with interval or ratio data.
Assumption of independence of observations.
Assumption of linear relationships and normal distribution.
Homoscedasticity: Equal variance among data points.
Conducting Regression Analysis using Software (Minitab)
Steps outlined for executing regression in Minitab.
Interpretation includes:
R² value: the proportion of variance in the dependent variable accounted for by the predictor.
All reported outputs including significance of regression statistics.
Conclusion and Transition to T-tests
Transition into t-tests context:
The speaker explains the necessity for comparing two groups with categorical independent variables that can't be handled by correlation or regression methods.
Introduction to different types of t-tests: one-sample, two-sample, and paired t-tests, while discussing methods of executing them in practical scenarios.
Summary Guidance for a Research Assignment
Group assignments announced: exploration of a provided research question, consultation of literature, data analysis, and integration of findings, culminating in a report.
Reminder of tools and community resources available for student use.
Correlation – Concept and Purpose
Definition
Correlation describes the relationship between two variables — how one changes with respect to the other.
Can be:
Positive correlation: both increase or decrease together.
Negative correlation: one increases while the other decreases.
No correlation: no consistent relationship.
Curvilinear: relationship exists but not linear.
Key Reminder
“Correlation does not imply causation.”
Example (video):
Ice cream sales vs drownings — both increase in summer, but caused by weather, not each other.
Marriage and lifespan — healthier men more likely to marry, not vice versa.
Children’s night lights and myopia — short-sighted parents leave lights on; genetics, not lighting, causes short-sightedness.
Self-esteem and grades — good grades → high self-esteem, not the reverse.
Be cautious: correlations can be real but misleading if confounding variables exist.
3. Types of Correlation
| Type | Pattern | Example |
|---|---|---|
| Positive | As one increases, the other increases. | Anatomy vs Physiology scores: students who perform well in one tend to perform well in the other. |
| Negative | As one increases, the other decreases. | PTA thresholds vs speech discrimination: higher thresholds (poorer hearing) go with lower speech scores. |
| No Correlation | No pattern. | House number vs Age. |
| Curvilinear | Non-linear pattern. | Weight vs Health: best at optimal weight, declines if too low/high. |
In this course, focus on linear relationships (straight-line trends).
4. Pearson’s Correlation Coefficient (r)
Definition
Quantifies strength and direction of linear relationship.
Denoted by r (ranges from –1 to +1).
r = +1 → perfect positive linear relationship.
r = –1 → perfect negative linear relationship.
r = 0 → no linear relationship (a curvilinear relationship may still exist).
Works with interval or ratio data only.
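To make the definition concrete, here is a minimal sketch of how r can be computed by hand. The function name `pearson_r` and the scores are invented for illustration; they are not the class dataset. The second example shows why r = 0 only rules out a linear relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of interval/ratio data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores, invented for illustration only
anatomy = [55, 62, 70, 74, 81, 90]
physiology = [58, 60, 73, 71, 85, 88]
r_scores = pearson_r(anatomy, physiology)

# A perfect curvilinear relationship (y = x squared) still gives r = 0
xs = [-2, -1, 0, 1, 2]
r_curve = pearson_r(xs, [x ** 2 for x in xs])

print(f"scores: r = {r_scores:.2f}; curve: r = {r_curve:.2f}")
```

In practice the same number comes straight out of Minitab; the point of the sketch is only to show what the coefficient measures.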
Comparison
Spearman’s correlation (ρ) → used for ordinal data or non-parametric cases.
Strength Guidelines
| r value range | Strength |
|---|---|
| 0.00–0.19 | Very weak |
| 0.20–0.39 | Weak |
| 0.40–0.59 | Moderate |
| 0.60–0.79 | Strong |
| ≥ 0.80 | Very strong |
Real-world data rarely exceeds r = 0.8.
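The strength bands in the table above can be written as a small lookup. This is only a sketch: the function names are hypothetical, and the cut-offs are the guideline values from this table, which are a convention rather than a fixed rule.

```python
def strength_label(r):
    """Map a correlation coefficient to the strength bands in the table."""
    size = abs(r)  # strength ignores the sign; direction is reported separately
    if size < 0.20:
        return "Very weak"
    if size < 0.40:
        return "Weak"
    if size < 0.60:
        return "Moderate"
    if size < 0.80:
        return "Strong"
    return "Very strong"

def direction_label(r):
    """Direction of the relationship, taken from the sign of r."""
    if r > 0:
        return "Positive"
    if r < 0:
        return "Negative"
    return "None"

print(strength_label(-0.45), direction_label(-0.45))
```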
5. Three Aspects to Report
Direction – Positive, Negative, or None.
Strength – Weak/Moderate/Strong.
Significance – p-value (< 0.05).
Important Note
A strong correlation is not necessarily significant if the sample size is too small.
e.g., r = 0.98 but n = 3 → p = 0.127 (not significant).
Always report r and p together.
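The n = 3 example can be checked by hand. A correlation is tested with t = r√(n − 2) / √(1 − r²) on n − 2 degrees of freedom. With n = 3 there is only 1 degree of freedom, and in that special case the t distribution is the Cauchy distribution, so the p-value has a closed form. The sketch below relies on that shortcut (the function names are hypothetical); for any other sample size, use statistical software.

```python
import math

def t_from_r(r, n):
    """t statistic for testing whether a correlation differs from zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def two_tailed_p_df1(t):
    """Two-tailed p-value for a t statistic with exactly 1 degree of freedom.

    With df = 1 the t distribution is the Cauchy distribution, which gives
    this closed form. The shortcut is NOT valid for any other df.
    """
    return 1 - (2 / math.pi) * math.atan(abs(t))

t_stat = t_from_r(0.98, 3)
p_value = two_tailed_p_df1(t_stat)
print(f"t(1) = {t_stat:.2f}, p = {p_value:.3f}")
```

The result is a p-value of roughly .13, in line with the figure quoted above: even r = 0.98 is not significant with only three data points.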
6. Pearson’s Correlation in Minitab
Procedure
Open dataset (e.g., Anatomy vs Physiology).
Go to Stat → Basic Statistics → Correlation.
Select both variables.
In “Graphs” options, tick “Show correlation and P value”.
Output Example
r = 0.87, p = .001
→ Significant strong positive correlation between anatomy and physiology scores.
APA Reporting Style
Omit leading zero (APA rule).
Report as:
r = .87, p < .001
7. Exercise: Height and Shoe Size
Task: Run correlation for height and shoe size using class data.
Result example:
r = .87, p < .001
→ Strong, positive, significant correlation between height and shoe size.
Report p < .001 (never p = 0).
Use 2 decimal places (3 if needed for clarity).
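The formatting rules above (no leading zero, p < .001 instead of p = 0) are easy to get wrong when copying from software output. A small helper can apply them consistently. This is a sketch with hypothetical function names; it assumes r is shown to 2 decimal places and p to 3.

```python
def drop_leading_zero(value, decimals=2):
    """Format a number that cannot exceed 1 in size (r, p) without its leading zero."""
    text = f"{abs(value):.{decimals}f}"
    if text.startswith("0."):
        text = text[1:]  # "0.87" becomes ".87"
    sign = "-" if value < 0 else ""
    return sign + text

def apa_p(p):
    """Report very small p-values as 'p < .001', never as 'p = 0'."""
    if p < 0.001:
        return "p < .001"
    return "p = " + drop_leading_zero(p, decimals=3)

def apa_correlation(r, p):
    """Combine r and p, which should always be reported together."""
    return f"r = {drop_leading_zero(r)}, {apa_p(p)}"

print(apa_correlation(0.87, 0.0002))
```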
8. Regression Analysis
Definition
Uses one variable to predict another once a correlation between them has been established.
Regression line (line of best fit):
$Y = A + BX$
Y = dependent (predicted variable).
X = independent (predictor variable).
A = intercept.
B = slope (gradient).
Example:
Physiology score = 4.5 + 0.961 × Anatomy score
→ For every 1-point increase in the anatomy score, the predicted physiology score rises by 0.961 points.
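As a rough sketch of where A and B come from, the line of best fit can be computed with the standard least-squares formulas: B is the sum of cross-products divided by the sum of squares of X, and A is chosen so the line passes through the two means. The function name and the data are invented for illustration and will not reproduce the coefficients in the lecture example.

```python
def fit_line(xs, ys):
    """Least-squares intercept (A) and slope (B) for Y = A + B * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return intercept, slope

# Hypothetical scores, invented for illustration only
anatomy = [50, 60, 70, 80, 90]
physiology = [53, 61, 72, 80, 92]
a, b = fit_line(anatomy, physiology)
print(f"Physiology = {a:.2f} + {b:.3f} x Anatomy")
```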
Assumptions of Linear Regression
| Assumption | Meaning |
|---|---|
| Data level | Both X and Y are interval/ratio. |
| Independence | Observations are independent and randomly sampled. |
| Linearity | Relationship must be linear. |
| Normality | Data approximately normal. |
| Homoscedasticity | Equal variance of Y across all X values. |
Violations → interpret with caution.
9. Output Interpretation
Key Metrics
| Statistic | Interpretation |
|---|---|
| R² | % of variance in Y explained by X (r²). |
| p-value | Whether regression model is significant. |
| F-statistic | From ANOVA table (used for APA reporting). |
Example: R² = 0.75 → 75% of variance in Physiology explained by Anatomy.
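The sketch below shows one way R² and F relate to the fitted line. R² is one minus the ratio of unexplained (residual) variation to total variation, and for a regression with a single predictor F = R² / (1 − R²) × (n − 2) on (1, n − 2) degrees of freedom. The function names and the data are hypothetical.

```python
def r_squared(xs, ys):
    """Proportion of variance in Y explained by the least-squares line on X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def f_statistic(r2, n):
    """F for a one-predictor regression, with df = (1, n - 2)."""
    return r2 / (1 - r2) * (n - 2)

# Hypothetical scores, invented for illustration only
anatomy = [50, 60, 70, 80, 90]
physiology = [53, 61, 72, 80, 92]
r2 = r_squared(anatomy, physiology)
print(f"R squared = {r2:.3f}, F(1, {len(anatomy) - 2}) = {f_statistic(r2, len(anatomy)):.1f}")
```

As a sanity check on reported output, `f_statistic(0.75, 20)` gives 54, so an R² of about .75 from 20 students should come with an F of roughly that size.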
Confidence vs Prediction Intervals
| Interval Type | Meaning |
|---|---|
| Confidence interval (green lines) | Range where true mean response likely lies (±2 SE). |
| Prediction interval (purple lines) | Range where 95% of all individual values likely fall (±2 SD). |
Prediction interval always wider than confidence interval.
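A rough sketch of why the prediction interval is wider: both intervals are built from the residual standard error s, but the prediction interval adds the scatter of an individual value around the line on top of the uncertainty in the line itself. The code uses the "±2" rule of thumb from the table as its multiplier (software uses an exact t critical value instead), and the function name and data are hypothetical.

```python
import math

def interval_half_widths(xs, ys, x0, multiplier=2.0):
    """Approximate half-widths of the confidence and prediction intervals at x0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(ss_res / (n - 2))  # residual standard error
    # Uncertainty in the fitted line grows as x0 moves away from the mean of X
    line_term = 1 / n + (x0 - mean_x) ** 2 / sxx
    conf = multiplier * s * math.sqrt(line_term)
    pred = multiplier * s * math.sqrt(1 + line_term)  # extra 1 = individual scatter
    return conf, pred

# Hypothetical scores, invented for illustration only
anatomy = [50, 60, 70, 80, 90]
physiology = [53, 61, 72, 80, 92]
conf_mid, pred_mid = interval_half_widths(anatomy, physiology, 70)
conf_edge, pred_edge = interval_half_widths(anatomy, physiology, 90)
print(f"at 70: +/-{conf_mid:.2f} (confidence), +/-{pred_mid:.2f} (prediction)")
```

Both intervals also widen towards the ends of the data, which is why the bands on a fitted line plot curve outwards.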
10. Regression in Minitab
Steps
Stat → Regression → Fitted Line Plot
Set:
Response (Y): Dependent variable (e.g., Physiology)
Predictor (X): Independent variable (e.g., Anatomy)
Under “Options,” tick:
Display confidence and prediction intervals.
Interpretation Example
R² = .75, F(1, 18) = 53.16, p < .001
→ Anatomy scores significantly predict Physiology scores.
APA Reporting Format
Anatomy scores predicted physiology scores, R² = .75, F(1, 18) = 53.16, p < .001.
Example: Prediction
Equation: Physiology = 4.46 + 0.9608 × Anatomy
For Anatomy = 80 → Predicted Physiology = 81.32
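The prediction above is just the regression equation evaluated at a chosen anatomy score. A minimal sketch, using the coefficients from the equation above (the function name is hypothetical):

```python
def predict_physiology(anatomy_score, intercept=4.46, slope=0.9608):
    """Predicted physiology score from the fitted line Y = A + B * X."""
    return intercept + slope * anatomy_score

predicted = predict_physiology(80)
print(f"Predicted physiology score: {predicted:.2f}")  # 81.32
```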
11. Practice Example: Shoe Size Predicting Height
Regression equation: Height = A + B × Shoe size
R² = .76, F(1,55) = 176.09, p < .001
→ Shoe size strongly predicts height.
Used as forensic example (estimate height from footprint size).
12. T-Tests Overview
Purpose
Compare means between two groups or a group vs known value.
Requires:
Categorical independent variable
Interval/ratio dependent variable
“T for Two” — T-tests compare two things.
13. Types of T-Tests
| Type | When to Use | Example |
|---|---|---|
| One-sample | Compare sample mean to known value. | Mean Mars Bar weight vs 16 g standard. |
| Independent (Two-sample) | Compare two independent groups. | Mars vs Cadbury chocolate weights. |
| Paired (Dependent) | Compare two measures from same group. | Student’s Anatomy vs Physiology scores, or Summer vs Winter hours outdoors. |
14. One-Sample T-Test
Example
Does the average Mars Bar weight differ from 16 g?
t-test result: t = 8.41, p < .001
→ Mean weight (17 g) significantly higher than standard.
Interpretation Steps:
Check significance (p < .05).
Compare means to determine direction.
Report in APA:
t(19) = 8.41, p < .001
“Average Mars Bar (M = 17.0 g, SD = 1.2) heavier than recommended 16 g.”
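For reference, the one-sample t statistic is the distance between the sample mean and the test value, measured in standard errors: t = (M − μ) / (SD / √n), with df = n − 1. The sketch below uses a handful of invented weights, not the class data, and a hypothetical function name; the p-value would still come from Minitab or a t table.

```python
import math
import statistics

def one_sample_t(sample, test_value):
    """t statistic and df for comparing a sample mean with a known value."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)
    standard_error = sd / math.sqrt(n)
    t = (mean - test_value) / standard_error
    return t, n - 1, mean, sd

# Hypothetical weights in grams, invented for illustration only
weights = [17, 18, 16, 19, 17]
t_stat, df, mean, sd = one_sample_t(weights, 16)
print(f"M = {mean:.1f}, SD = {sd:.2f}, t({df}) = {t_stat:.2f}")
```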
15. Two-Sample (Independent) T-Test
Example
Do Cadbury chocolates differ in weight from Mars chocolates?
Cadbury (M = 14.4 g), Mars (M = 15.1 g)
t = –3.10, p = .002
→ Significant difference; Mars heavier.
APA Reporting
t(174) = –3.10, p = .002
“Mars chocolates (M = 15.14 g, SD = 1.2) were significantly heavier than Cadbury (M = 14.41 g, SD = 0.9).”
Notes:
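df = N – 2, where N is the total number of observations across both groups.
The sign of t only reflects which group mean was subtracted from which; it shows the direction of the difference and does not change whether the result is significant.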
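A sketch of the equal-variances (pooled) version of the two-sample test, which is the version that gives df = N − 2. The function name and the weights are invented for illustration. Swapping the order of the groups flips the sign of t and nothing else.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Independent-samples t statistic and df, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: each group's variance weighted by its own df
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

# Hypothetical weights in grams, invented for illustration only
brand_a = [14, 15, 14, 15]
brand_b = [15, 16, 16, 17]
t_ab, df = pooled_t(brand_a, brand_b)
t_ba, _ = pooled_t(brand_b, brand_a)
print(f"t({df}) = {t_ab:.2f} one way round, {t_ba:.2f} the other")
```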
16. Paired (Dependent) T-Test
Concept
Compares two related measures:
Same group tested twice (e.g., before/after intervention).
Two comparable measures from same participants.
Example
Audiology students’ outdoor hours:
Summer (M = 5.3 h), Winter (M = 3.2 h)
p = .044 → significant difference.
“Students spend significantly more time outdoors in summer than winter.”
APA Reporting
t(9) = 2.26, p = .044
Notes
Removes inter-subject variability.
df = N – 1, where N is the number of pairs (one sample measured twice).
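A paired t-test is a one-sample t-test on the differences within each pair, which is how it removes inter-subject variability. A minimal sketch with invented hours (not the class data) and a hypothetical function name:

```python
import math
import statistics

def paired_t(first, second):
    """Paired t statistic and df, computed from the within-pair differences."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)  # number of pairs
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical hours outdoors for the same four students, invented for illustration
summer = [6, 5, 7, 4]
winter = [4, 4, 5, 3]
t_stat, df = paired_t(summer, winter)
print(f"t({df}) = {t_stat:.2f}")
```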
17. One-Tailed vs Two-Tailed Tests
| Test Type | Purpose | Example |
|---|---|---|
| Two-tailed | Tests for difference in either direction. | “Does Mars Bar weight differ from 16 g?” |
| One-tailed | Tests for difference in specific direction. | “Is Mars Bar weight less than 16 g?” |
Two-tailed = more conservative, standard approach.
One-tailed = easier to reach significance but less rigorous.
Use one-tailed only with clear directional hypothesis.
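To see why a one-tailed test reaches significance more easily, compare the p-values for the same test statistic. The sketch below uses the standard normal distribution as a large-sample stand-in for the t distribution (an assumption made here so the example needs no extra libraries); the function name and the statistic of −1.8 are hypothetical.

```python
from statistics import NormalDist

def tail_p_values(z):
    """One- and two-tailed p-values for a test statistic, normal approximation."""
    lower = NormalDist().cdf(z)   # one-tailed, hypothesis "less than"
    upper = 1 - lower             # one-tailed, hypothesis "greater than"
    two_tailed = 2 * min(lower, upper)
    return lower, upper, two_tailed

# Hypothetical statistic: the sample mean sits 1.8 standard errors below the test value
p_less, p_greater, p_two = tail_p_values(-1.8)
print(f"less than: {p_less:.3f}, greater than: {p_greater:.3f}, two-tailed: {p_two:.3f}")
```

Here the "less than" test comes in under .05 while the two-tailed test does not, and the "greater than" test is nowhere near. This is why the direction has to be chosen before looking at the data.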
18. Degrees of Freedom Summary
| T-Test Type | df Formula |
|---|---|
| One-sample | N – 1 |
| Two-sample | N – 2 |
| Paired | N – 1 |
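The table can be expressed as a small lookup, which is handy for checking the df in reported results. The function name is hypothetical, and the sample sizes in the example calls are inferred from the df values reported earlier (t(19), t(174), t(9)), not stated in the notes. For the two-sample test N is the total across both groups; for the paired test N is the number of pairs.

```python
def degrees_of_freedom(test_type, n):
    """Degrees of freedom for the three t-tests covered in these notes."""
    if test_type in ("one-sample", "paired"):
        return n - 1
    if test_type == "two-sample":
        return n - 2  # n is the total across both groups (equal variances assumed)
    raise ValueError(f"unknown test type: {test_type}")

print(degrees_of_freedom("one-sample", 20))   # 19
print(degrees_of_freedom("two-sample", 176))  # 174
print(degrees_of_freedom("paired", 10))       # 9
```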
19. Minitab Procedures
| Test | Menu Path | Key Inputs | Tips |
|---|---|---|---|
| One-sample | Stat → Basic Statistics → 1-sample t | Enter test mean (e.g., 16) | Use “Perform hypothesis test.” |
| Two-sample | Stat → Basic Statistics → 2-sample t | Select dependent var (e.g., Weight) + grouping var (e.g., Brand) | Tick “Assume equal variances.” |
| Paired | Stat → Basic Statistics → Paired t | Select paired columns | Works only with paired data. |
20. Practical Activity Summary
One-sample: Sangas R Us rating vs city average (75).
Two-sample: Compare Sangas R Us vs Best Bagels ratings.
Paired: Sangas R Us ratings in January vs June (improvement check).
Each output should include:
Mean, SD
t-value, df, p-value
Interpretation of direction
APA-style report.
21. Final Key Points
✅ Correlation = Relationship
✅ Regression = Prediction
✅ T-tests = Comparison
Always Report:
Direction (positive/negative or group mean)
Strength (r or R²)
Significance (p-value)
Interpretation (what it means in real context)
“Statistics are tools for meaning — numbers only matter when you can interpret them responsibly.”