Correlation
General Concept
Definition: Correlation refers to the statistical relationship between two variables where no control or manipulation is exerted on the variables. It assesses whether there is a relationship between two variables and is typically represented through two scores per individual, referred to as X and Y.
Scatterplot
Visual Representation: Scatterplots visually depict the correlation between two variables, where each axis represents one variable.
Example shown:
X-axis: Hours studying
Y-axis: Exam grade
Types of Correlation
Positive Correlation
Definition: A positive correlation occurs when an increase in one variable (X) corresponds with an increase in another variable (Y).
Example: As sleep increases, performance on cognitive tests increases.
Negative Correlation
Definition: A negative correlation occurs when an increase in one variable (X) corresponds with a decrease in another variable (Y).
Example: As temperature decreases, the time cats spend in laps increases.
Further Examples of Correlation Types
A) The more time spent texting in class, the lower a student’s exam scores:
Type: Negative Correlation
B) Dr. Evil found that he could run farther the more he practiced running:
Type: Positive Correlation
Understanding the Sign of Correlation
The + or - sign indicates the direction of the relationship:
Positive: As one variable increases, the other increases.
Negative: As one variable increases, the other decreases.
Important Note: The sign does not convey anything about the strength of the relationship!
Strength of Correlation
Definition: The strength of correlation signifies how closely the data points cluster around a line of best fit on a scatterplot.
Correlation Scale: Correlation can range from -1 to +1:
Perfect Positive Correlation: $r = 1.00$
Perfect Negative Correlation: $r = -1.00$
No Correlation: $r = 0.0$
More Information: The strength is determined by the absolute value of $r$, not by its sign. It reflects how closely the data points cluster around the line: the further $r$ is from 0 in either direction, the stronger the correlation.
Comparison Examples:
For $r = 0.99$ (strong positive) vs $r = 0.50$ (weaker positive), $r = 0.99$ has a stronger correlation.
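The distinction between sign (direction) and magnitude (strength) can be sketched in a few lines of Python. The function names `direction` and `stronger` are illustrative only and do not come from these notes.

```python
def direction(r: float) -> str:
    """Read the direction of a correlation from the sign of r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


def stronger(r1: float, r2: float) -> float:
    """Return whichever coefficient is further from 0 (ties return r1)."""
    return r1 if abs(r1) >= abs(r2) else r2


print(direction(0.99), direction(-0.50))  # positive negative
print(stronger(0.99, 0.50))               # 0.99
print(stronger(-0.80, 0.50))              # -0.8 (stronger, despite the minus sign)
```

The last line shows why the sign says nothing about strength: $r = -0.80$ is a stronger correlation than $r = 0.50$.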
Correlation vs. Causation
Key Concept: Correlation does not imply causation. One cannot deduce which variable causes the other, or if an external factor influences both.
Illustration Example:
Correlation of the number of lawyers in Texas and the number of deaths from falling out of bed: $r = 0.95$.
Sources: CDC and ABA, visualized at tylervigen.com.
Statement: Presenting a correlation as if it demonstrated causation is often misleading and leads to misconceptions about the relationship.
Hypothesis-Testing Procedure for Correlation
State Hypotheses
Null Hypothesis ($H_0$): ρ = 0 (no population correlation, where ρ = population correlation coefficient)
Alternative Hypothesis ($H_1$): ρ ≠ 0 (there is a population correlation)
Set Decision Criteria
Use the t distribution to find the critical value. For example, with $n = 5$, the degrees of freedom are $df = n - 2 = 3$, and for a two-tailed test at $\alpha = 0.05$, $t_{\text{critical}} = 3.182$.
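The lookup described above can be sketched as a small table. The values are copied from a standard t table for a two-tailed test at $\alpha = 0.05$. Covering only df 1 to 10 is an assumption made to keep the example short, and the name `t_critical` is illustrative.

```python
# Two-tailed critical t values at alpha = 0.05, keyed by degrees of freedom.
T_CRITICAL_05_TWO_TAILED = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}


def t_critical(n: int) -> float:
    """Critical t for a correlation test with n pairs of scores (df = n - 2)."""
    df = n - 2
    if df not in T_CRITICAL_05_TWO_TAILED:
        raise ValueError(f"df = {df} is outside this small table")
    return T_CRITICAL_05_TWO_TAILED[df]


print(t_critical(5))  # 3.182, matching the n = 5 example
```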
Compute Sample Statistics
Compute the statistics for each pair of variables being related, for example studying (X), money spent (Y), and grades (Z).
Calculate the Sum of Products of Deviations (SP)
Making Decisions Based on Results
Reject $H_0$ if the computed statistic falls beyond the critical values (for a two-tailed test, $|t| > t_{\text{critical}}$); otherwise, fail to reject $H_0$.
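A minimal sketch of the decision step. These notes do not spell out how $r$ is converted to a t statistic, so the sketch assumes the usual conversion $t = r\sqrt{n-2}/\sqrt{1-r^2}$. It is applied to two of the $r$ values reported in the data summary of these notes ($n = 5$, so $t_{\text{critical}} = 3.182$). The function names are illustrative.

```python
from math import sqrt


def t_from_r(r: float, n: int) -> float:
    """Convert a sample correlation to a t statistic with df = n - 2.

    Assumes |r| < 1; a perfect correlation would divide by zero.
    """
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


def reject_null(r: float, n: int, t_crit: float) -> bool:
    """Two-tailed decision: reject H0 when |t| exceeds the critical value."""
    return abs(t_from_r(r, n)) > t_crit


T_CRIT = 3.182  # df = 3, two-tailed, alpha = 0.05

for r in (0.97, -0.34):
    t = t_from_r(r, 5)
    decision = "reject H0" if reject_null(r, 5, T_CRIT) else "fail to reject H0"
    print(f"r = {r:+.2f}, t = {t:.2f}: {decision}")
```

Under this assumption, $r = 0.97$ gives a t of roughly 6.9 (beyond 3.182, so $H_0$ is rejected), while $r = -0.34$ gives a t of roughly $-0.6$ (inside the critical limits).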
Example of Computing Sample Statistics
Data Example:
Studying (X): {8, 2, 3, 2, 4, 9}
Money Spent (Y): {6, 3, 4, 4, 4, 9}
SP Calculation:
SP is calculated from deviations:
For each X and Y, calculate the deviation from its mean: $(X - M_X)$ and $(Y - M_Y)$.
Multiply each pair of deviations and sum the products: $SP = \sum (X - M_X)(Y - M_Y)$.
SS Calculation:
SS (Sum of Squares) for each variable is calculated as:
$SS_X = \sum (X - M_X)^2$
$SS_Y = \sum (Y - M_Y)^2$
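The SP and SS calculations can be sketched in Python using the studying and money-spent scores exactly as listed in the data example. The last step, $r = SP / \sqrt{SS_X \cdot SS_Y}$, is the standard Pearson formula and is an addition here rather than something stated in these notes. Because the six listed pairs are used as given, the result need not match the $n = 5$ values reported in the data summary.

```python
from math import sqrt

studying = [8, 2, 3, 2, 4, 9]     # X, as listed in the data example
money_spent = [6, 3, 4, 4, 4, 9]  # Y, as listed in the data example

n = len(studying)
m_x = sum(studying) / n     # mean of X
m_y = sum(money_spent) / n  # mean of Y

# Sum of products of deviations.
sp = sum((x - m_x) * (y - m_y) for x, y in zip(studying, money_spent))

# Sum of squared deviations for each variable.
ss_x = sum((x - m_x) ** 2 for x in studying)
ss_y = sum((y - m_y) ** 2 for y in money_spent)

# Pearson correlation (standard formula; assumed, not given in the notes).
r = sp / sqrt(ss_x * ss_y)

print(round(sp, 2), round(ss_x, 2), round(ss_y, 2), round(r, 2))
# 31.0 47.33 24.0 0.92
```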
Final Interpretations
Interpretation of Results:
If the result is non-significant, fail to reject $H_0$ and conclude that there is no evidence of a relationship.
If positive and significant:
The interpretation is that as one variable increases, so does the other.
If negative and significant:
The interpretation is that as one variable increases, the other variable decreases.
Data Summary Example
Test Used: Pearson correlations
Studying vs. Money Spent: $r = -0.34$, $n = 5$, $p > .05$ – No significant relationship.
Studying vs. Grades: $r = 0.97$, $n = 5$, $p < .05$ – As studying increases, grades increase.
Money Spent vs. Grades: $r = -0.38$, $n = 5$, $p > .05$ – No significant relationship.
SPSS Correlation Conclusion
Data provided shows:
Significant positive correlation between studying and grades (r = 0.97), suggesting that effort in studying corresponds strongly with improved academic performance.
Correlation does not imply a direct cause-effect relationship, reinforcing the complexity of interpreting correlational data.