Correlation Coefficients and Their Interpretation

Correlation Coefficients

  • Definition: A numerical index reflecting the relationship between two variables.
  • Range: Values range from $-1$ to $+1$.
  • Bivariate Correlation: Relationship between two variables.
  • Pearson Product-Moment Correlation: Examines the relationship between two continuous variables (e.g., height, age).
Types of Correlations
  • Direct (Positive) Correlation: Variables change in the same direction (e.g., as X increases, Y increases). Value is positive ($.00$ to $+1.00$).
  • Indirect (Negative) Correlation: Variables change in opposite directions (e.g., as X increases, Y decreases). Value is negative ($-1.00$ to $.00$).
Key Principles of Correlation
  • Strength: The absolute value of the coefficient reflects strength; $-.70$ is stronger than $+.50$.
  • Data Points: Each case must have a score on both variables (two data points per case).
  • Variability: Both variables must vary; if one variable is constant, the formula's denominator is zero, so the coefficient is undefined and no relationship can be detected.
  • Constrained Range: Restricting the range of a variable typically reduces the observed correlation.
  • Notation: Represented by $r_{xy}$ for variables $X$ and $Y$.
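The constrained-range principle can be illustrated with a small sketch. The scores below are hypothetical, chosen only so the effect is easy to see, and the helper `pearson_r` is an illustrative name that uses the deviation-score form of the coefficient (algebraically equivalent to the raw-score formula).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from deviation scores (equivalent to the raw-score formula)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: Y tracks X with a small alternating disturbance.
x = list(range(1, 11))
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

r_full = pearson_r(x, y)

# Keep only the cases whose X score falls in the middle of the range (4 to 7).
restricted = [(xi, yi) for xi, yi in zip(x, y) if 4 <= xi <= 7]
x_mid = [xi for xi, _ in restricted]
y_mid = [yi for _, yi in restricted]
r_restricted = pearson_r(x_mid, y_mid)

print(f"full range:       r = {r_full:.2f}")
print(f"restricted range: r = {r_restricted:.2f}")
```

With these particular scores the coefficient drops from about .94 over the full range to about .87 in the restricted range. Other data sets will show larger or smaller drops.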
Computational Formula for Pearson Correlation Coefficient ($r_{xy}$)
  • $r_{xy} = \frac{n\Sigma XY - (\Sigma X)(\Sigma Y)}{\sqrt{[n\Sigma X^2 - (\Sigma X)^2][n\Sigma Y^2 - (\Sigma Y)^2]}}$
    • $n$ = sample size
    • $X$, $Y$ = individual scores
    • $\Sigma XY$ = sum of products of X and Y
    • $\Sigma X^2, \Sigma Y^2$ = sum of squared individual X and Y scores
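As a minimal sketch, the computational formula can be translated term by term into Python. The function name `pearson_r` and the five pairs of scores are illustrative assumptions, not part of the notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r via the computational (raw-score) formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)                                   # sample size
    sum_x = sum(xs)                               # ΣX
    sum_y = sum(ys)                               # ΣY
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # ΣXY
    sum_x2 = sum(x * x for x in xs)               # ΣX²
    sum_y2 = sum(y * y for y in ys)               # ΣY²
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        # A constant variable makes the denominator zero.
        raise ValueError("r cannot be computed when a variable has no variability")
    return numerator / denominator

# Hypothetical scores for five cases.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.2f}")
```

For these scores, $n = 5$, $\Sigma X = 15$, $\Sigma Y = 20$, $\Sigma XY = 66$, $\Sigma X^2 = 55$, and $\Sigma Y^2 = 86$, so $r_{xy} = 30 / \sqrt{50 \times 30} \approx .77$.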
Visual Representation: Scatterplots
  • Definition: A plot of each set of scores on separate axes ($X$ on horizontal, $Y$ on vertical).
  • Interpretation: The general shape indicates direction and strength.
    • Positive Slope: Data points cluster from lower-left to upper-right (direct/positive correlation).
    • Negative Slope: Data points cluster from upper-left to lower-right (indirect/negative correlation).
    • Perfect Correlation ($\pm 1.00$): Data points align along a straight line.
Correlation Matrix
  • A table showing correlation coefficients for all pairs of multiple variables.
  • Diagonal values are $1.00$ (variable correlated with itself).
  • Symmetrical: $r_{AB}$ is the same as $r_{BA}$.
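A correlation matrix can be sketched by applying the same coefficient to every pair of variables. The variable names A, B, and C, their scores, and the helpers `pearson_r` and `correlation_matrix` are hypothetical, used only to show the diagonal and the symmetry.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return {name: {name: r}} for every pair of variables in `columns`."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical scores on three variables for the same five cases.
data = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 5, 4, 5],
    "C": [5, 3, 4, 2, 1],
}
matrix = correlation_matrix(data)

# Print one row per variable.
for a in data:
    print(a, "  ".join(f"{matrix[a][b]:+.2f}" for b in data))
```

Each row and column corresponds to one variable, so the entry for A with B equals the entry for B with A, and every variable correlates 1.00 with itself.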
Interpreting Significance of Correlation Coefficient
  • General Interpretation (Rule of Thumb, applied to the absolute value of $r$):
    • $.8$ to $1.0$: Very strong
    • $.6$ to $.8$: Strong
    • $.4$ to $.6$: Moderate
    • $.2$ to $.4$: Weak
    • $.0$ to $.2$: Weak or no relationship
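The rule of thumb can be written as a small lookup. This is a sketch: the function name `describe_strength` is illustrative, and because the listed bands share their endpoints, assigning a boundary value such as .6 to the higher band is an assumption made here.

```python
def describe_strength(r):
    """Label the size of r using the rule-of-thumb bands."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # strength ignores the sign
    # Assumption: a value on a shared boundary goes to the higher band.
    if size >= 0.8:
        return "very strong"
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "weak or no relationship"

for value in (-0.70, 0.50, 0.10):
    print(f"r = {value:+.2f}: {describe_strength(value)}")
```

Because the label depends only on the absolute value, $-.70$ is labeled strong while $+.50$ is labeled moderate.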
Coefficient of Determination ($r^2$)
  • Definition: The percentage of variance in one variable accounted for by the variance in the other variable.
  • Computation: Square the correlation coefficient ($r^2$).
  • Example: If $r = .70$, then $r^2 = .49$, meaning $49\%$ of variance is explained.
  • Coefficient of Alienation (or Nondetermination): The percentage of unexplained variance ($1 - r^2$).
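The two coefficients can be computed together, as in this short sketch. The function name `variance_explained` is illustrative; the input reuses the $r = .70$ example.

```python
def variance_explained(r):
    """Return (coefficient of determination, coefficient of alienation)."""
    r_squared = r ** 2
    return r_squared, 1 - r_squared

determination, alienation = variance_explained(0.70)
print(f"r^2     = {determination:.2f}")   # share of variance explained
print(f"1 - r^2 = {alienation:.2f}")      # share of variance unexplained
```

For $r = .70$ this gives .49 explained and .51 unexplained. Since squaring removes the sign, $r = -.70$ yields the same two values.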
Correlation vs. Causality
  • Association: Correlations express an association between variables.
  • No Causality: Correlation does NOT imply causation. A third variable (confounder) might explain the relationship (e.g., ice cream sales and crime rates are both influenced by temperature).
Other Correlation Coefficients
  • Different techniques exist for variables at different levels of measurement (e.g., point-biserial for nominal-interval, Spearman rank for ordinal-ordinal, Phi coefficient for nominal-nominal).