Correlation

General Concept

  • Definition: Correlation refers to the statistical relationship between two variables where no control or manipulation is exerted on the variables. It assesses whether there is a relationship between two variables and is typically represented through two scores per individual, referred to as X and Y.

Scatterplot

  • Visual Representation: Scatterplots visually depict the correlation between two variables, where each axis represents one variable.

    • Example:

      • X-axis: Hours studying

      • Y-axis: Exam grade

Types of Correlation

Positive Correlation
  • Definition: A positive correlation occurs when an increase in one variable (X) corresponds with an increase in another variable (Y).

  • Example: As sleep increases, performance on cognitive tests increases.

Negative Correlation
  • Definition: A negative correlation occurs when an increase in one variable (X) corresponds with a decrease in another variable (Y).

  • Example: As temperature decreases, the time cats spend in laps increases.

Further Examples of Correlation Types
  • A) The more time spent texting in class, the lower a student’s exam scores:

    • Type: Negative Correlation

  • B) Dr. Evil found that he could run farther the more he practiced running:

    • Type: Positive Correlation

Understanding the Sign of Correlation

  • The + or - sign indicates the direction of the relationship:

    • Positive: As one variable increases, the other increases.

    • Negative: As one variable increases, the other decreases.

  • Important Note: The sign does not convey anything about the strength of the relationship!

Strength of Correlation

  • Definition: The strength of correlation signifies how closely the data points cluster around a line of best fit on a scatterplot.

  • Correlation Scale: Correlation can range from -1 to +1:

    • Perfect Positive Correlation: $r = 1.00$

    • Perfect Negative Correlation: $r = -1.00$

    • No Correlation: $r = 0.0$

  • More Information: Strength is determined by the magnitude (absolute value) of $r$, not by its sign; the further $r$ is from 0, the more closely the data points cluster around the line and the stronger the correlation.

  • Comparison Examples:

    • For $r = 0.99$ (strong positive) vs $r = 0.50$ (weaker positive), $r = 0.99$ has a stronger correlation.
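
The rule that strength depends on distance from 0 and direction depends on the sign can be sketched in a few lines. This is only an illustration: the helper names `stronger` and `direction` are hypothetical, and the value $r = -0.80$ is made up to show that a negative coefficient can be the stronger one.

```python
def stronger(r1: float, r2: float) -> float:
    """Return whichever coefficient indicates the stronger relationship."""
    for r in (r1, r2):
        if not -1.0 <= r <= 1.0:
            raise ValueError(f"r must lie between -1 and +1, got {r}")
    # Strength is the distance from 0, so compare absolute values.
    return r1 if abs(r1) >= abs(r2) else r2


def direction(r: float) -> str:
    """Return the direction of the relationship, which depends only on the sign."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


print(stronger(0.99, 0.50))    # the comparison from the notes
print(stronger(-0.80, 0.50))   # hypothetical: the negative r is stronger
print(direction(-0.80))
```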

Correlation vs. Causation

  • Key Concept: Correlation does not imply causation. One cannot deduce which variable causes the other, or if an external factor influences both.

  • Illustration Example:

    • Correlation of the number of lawyers in Texas and the number of deaths from falling out of bed: $r = 0.95$.

    • Sources: CDC and ABA, visualized at tylervigen.com.

  • Statement: Presenting a correlation as if it demonstrated causation is misleading and leads to misconceptions about the relationship.
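
One way to see how an external factor can produce a correlation is a small simulation. The sketch below is not from the notes: it invents a hidden variable Z, makes X and Y each depend only on Z plus random noise, and computes Pearson's r with a hand-written helper (`pearson_r` is a hypothetical name). All numbers, including the seed and the noise level, are arbitrary assumptions.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation computed from SP and the two sums of squares."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ssx * ssy)


rng = random.Random(0)                          # fixed seed for repeatability
z = [rng.gauss(0, 1) for _ in range(500)]       # hidden third variable
x = [zi + rng.gauss(0, 0.3) for zi in z]        # X depends only on Z
y = [zi + rng.gauss(0, 0.3) for zi in z]        # Y depends only on Z

# X and Y never influence each other, yet they correlate strongly.
r_xy = pearson_r(x, y)

# Remove Z's contribution and what remains is unrelated noise.
r_resid = pearson_r([xi - zi for xi, zi in zip(x, z)],
                    [yi - zi for yi, zi in zip(y, z)])

print(f"r(X, Y)               = {r_xy:.2f}")
print(f"r(X, Y) with Z removed = {r_resid:.2f}")
```

In this setup neither variable causes the other; the correlation comes entirely from the shared dependence on Z.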

Hypothesis-Testing Procedure for Correlation

  1. State Hypotheses

    • Null Hypothesis ($H_0$): ρ = 0 (no population correlation, where ρ = population correlation coefficient)

    • Alternative Hypothesis ($H_1$): ρ ≠ 0 (there is a population correlation)

  2. Set Decision Criteria

    • Use the t distribution to find the critical value. For example, with $n = 5$, degrees of freedom (df) = $n - 2 = 3$, and for a two-tailed test at $\alpha = 0.05$, $t_{\text{critical}} = 3.182$.

  3. Compute Sample Statistics

    • The example analyzes the relationships among three variables: studying (X), money spent (Y), and grades (Z).

    • For each pair of variables, calculate the sum of products of deviations (SP) and the sum of squares (SS) for each variable, then compute $r$.

  4. Make a Decision

    • Reject $H_0$ if the computed statistic falls beyond the critical value; otherwise, fail to reject $H_0$.
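
The four steps can be sketched as a short function. The notes do not spell out how $r$ is converted to a t statistic, so this assumes the standard conversion $t = r\sqrt{n - 2} / \sqrt{1 - r^2}$ with df = $n - 2$. The critical value 3.182 is the one given in step 2; the function names and the two sample values of $r$ (0.90 and 0.50) are hypothetical.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """Convert a sample correlation to a t statistic with df = n - 2.

    Assumes |r| < 1; a perfect correlation would divide by zero.
    """
    df = n - 2
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)


def reject_null(r: float, n: int, t_critical: float) -> bool:
    """Two-tailed decision: reject H0 (rho = 0) when |t| exceeds the cutoff."""
    return abs(t_from_r(r, n)) > t_critical


T_CRITICAL = 3.182   # two-tailed, alpha = .05, df = 3 (from step 2)

for r in (0.90, 0.50):   # hypothetical sample correlations with n = 5
    t = t_from_r(r, 5)
    decision = "reject H0" if reject_null(r, 5, T_CRITICAL) else "fail to reject H0"
    print(f"r = {r:.2f}, t = {t:.2f}: {decision}")
```

With only five pairs of scores, even $r = 0.50$ falls well short of the cutoff, which shows how much the decision depends on sample size.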

Example of Computing Sample Statistics
  1. Data Example:

    • Studying (X): {8, 2, 3, 2, 4, 9}

    • Money Spent (Y): {6, 3, 4, 4, 4, 9}

  2. SP Calculation:

    • SP is the sum of products of deviations:

      • For each X and Y, calculate the deviation from its mean: $(X - M_X)$ and $(Y - M_Y)$.

      • Multiply each pair of deviations and sum the products: $SP = \sum (X - M_X)(Y - M_Y)$.

  3. SS Calculation:

    • SS (sum of squares) for each variable is calculated as:

      • $SS_X = \sum (X - M_X)^2$

      • $SS_Y = \sum (Y - M_Y)^2$
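
The SP and SS calculations can be sketched as follows. The final step, $r = SP / \sqrt{SS_X \cdot SS_Y}$, is the standard Pearson formula and is assumed here because the notes stop at SS. The five pairs of scores are a made-up illustration chosen so the arithmetic is easy to check by hand; they are not the data listed above, and `correlation_parts` is a hypothetical name.

```python
import math


def correlation_parts(xs, ys):
    """Return (SP, SS_X, SS_Y, r) for paired scores."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of scores")
    n = len(xs)
    mx = sum(xs) / n                      # M_X
    my = sum(ys) / n                      # M_Y
    dev_x = [x - mx for x in xs]          # X - M_X
    dev_y = [y - my for y in ys]          # Y - M_Y
    sp = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    ss_x = sum(dx ** 2 for dx in dev_x)
    ss_y = sum(dy ** 2 for dy in dev_y)
    r = sp / math.sqrt(ss_x * ss_y)       # assumed standard Pearson formula
    return sp, ss_x, ss_y, r


# Hypothetical scores: M_X = 3, M_Y = 4
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

sp, ss_x, ss_y, r = correlation_parts(x, y)
print(f"SP = {sp}, SS_X = {ss_x}, SS_Y = {ss_y}, r = {r:.4f}")
# SP = 6.0, SS_X = 10.0, SS_Y = 6.0, r = 0.7746
```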

Final Interpretations

  • Interpretation of Results:

  • If the result is non-significant, fail to reject $H_0$: the data do not provide sufficient evidence of a relationship (reported in these notes as "no relationship").

  • If positive and significant:

    • The interpretation is that as one variable increases, so does the other.

  • If negative and significant:

    • The interpretation is that as one variable increases, the other variable decreases.

Data Summary Example

  • Test Used: Pearson correlations

  1. Studying vs. Money Spent: $r = -0.34$, $n = 5$, $p > .05$ – No relationship.

  2. Studying vs. Grades: $r = 0.97$, $n = 5$, $p < .05$ – As studying increases, grades increase.

  3. Money Spent vs. Grades: $r = -0.38$, $n = 5$, $p > .05$ – No relationship.
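
The three decisions above can be cross-checked against the critical value from the hypothesis-testing steps. The sketch below converts $t_{\text{critical}} = 3.182$ (df = 3) into an equivalent cutoff on $r$ using $r_{\text{critical}} = t / \sqrt{t^2 + df}$, which follows from the standard r-to-t conversion and is an assumption here rather than something the notes state. The `interpret` helper is hypothetical.

```python
import math

T_CRITICAL = 3.182    # two-tailed, alpha = .05, df = 3 (value given in the notes)
N = 5
DF = N - 2

# Equivalent cutoff on r itself: |r| must exceed this to be significant.
r_critical = T_CRITICAL / math.sqrt(T_CRITICAL ** 2 + DF)

# Correlations as reported in the data summary.
reported = {
    "Studying vs. Money Spent": -0.34,
    "Studying vs. Grades": 0.97,
    "Money Spent vs. Grades": -0.38,
}


def interpret(r: float, r_crit: float) -> str:
    """Map a correlation to the wording used for interpreting results."""
    if abs(r) <= r_crit:
        return "no significant relationship"
    return "positive, significant" if r > 0 else "negative, significant"


decisions = {pair: interpret(r, r_critical) for pair, r in reported.items()}

print(f"critical r = {r_critical:.3f}")
for pair, decision in decisions.items():
    print(f"{pair}: r = {reported[pair]:+.2f} -> {decision}")
```

Under these assumptions the cutoff is roughly 0.878, so only the studying and grades correlation is significant, matching the reported p-values.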

SPSS Correlation Conclusion

  • The data show:

  • A significant positive correlation between studying and grades (r = 0.97), indicating that more time spent studying is strongly associated with higher grades.

  • Correlation does not imply a direct cause-effect relationship, reinforcing the complexity of interpreting correlational data.