Correlation Coefficient Notes

Correlation Coefficient

Pearson Correlation Coefficient (r)

  • The correlation coefficient is symbolized as r.
  • Also known as the Pearson correlation coefficient; it applies to interval-scale data.
  • Pearson was a student of Francis Galton, who introduced eugenics and was interested in the heights of parents and their children.
  • r describes the general trend of the relationship between two variables and quantifies the strength of their linear association.
  • A positive sign means the correlation is positive, and a negative sign means it's negative (related to the slope).
  • r is bounded by -1 and 1.
    • 1 or -1 indicates a perfect correlation, meaning all data points fall on a straight line.
    • 0 indicates no linear correlation.
  • The magnitude of the correlation defines the strength; ignore the sign when determining strength.
    • For example, -0.99 is a stronger correlation than 0.75.

Calculation of r

  • Formula: r = \frac{\sum (Z_X Z_Y)}{N}
    • Where Z_X and Z_Y are the z-scores for the X and Y values, respectively.
    • N is the sample size (not degrees of freedom).
  • Steps:
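    1. Convert all scores to z-scores (using the standard deviation computed with N in the denominator, so that it matches the division by N in step 4).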
    2. Calculate the cross-product of the z-scores for each person.
    3. Sum the cross-products.
    4. Divide by the number of people in the study (N).
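
The four steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names (z_scores, pearson_r) and the height data are hypothetical, and the z-scores are assumed to use the standard deviation with N in the denominator so that dividing the summed cross-products by N yields r.

```python
import math


def z_scores(values):
    """Convert raw scores to z-scores using the SD with N in the denominator."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson_r(xs, ys):
    """Compute r as the mean cross-product of paired z-scores."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    zx = z_scores(xs)                          # step 1: z-scores for X
    zy = z_scores(ys)                          # step 1: z-scores for Y
    cross = [a * b for a, b in zip(zx, zy)]    # step 2: cross-product per person
    return sum(cross) / len(xs)                # steps 3 and 4: sum, divide by N


# Hypothetical data: heights (in inches) of five parents and their children.
parent_heights = [64, 66, 68, 70, 72]
child_heights = [65, 66, 69, 70, 71]

r = pearson_r(parent_heights, child_heights)
print(f"r = {r:.3f}")
```

If either variable has no variability (every score identical), its standard deviation is 0 and the z-scores are undefined, so this sketch would raise a ZeroDivisionError.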

Hypothesis Testing for Correlation Coefficient

  • Null Hypothesis: There is no association between two variables (r = 0).
  • Alternative Hypotheses:
    • Two-tailed: There is an association between two variables (direction not predicted).
    • One-tailed: Predicts a positive or negative association (directional hypothesis).
  • Test Statistic: t statistic
    • Convert r to a t-score using the formula: t = r \sqrt{\frac{N-2}{1-r^2}}
    • Degrees of freedom: df = N - 2
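
The conversion from r to t can be sketched as below. The function name r_to_t and the example values (r = 0.5, N = 27) are hypothetical and chosen only for illustration. The sketch computes the test statistic and degrees of freedom; the resulting t would then be compared against a critical value from a t table for the chosen alpha and number of tails.

```python
import math


def r_to_t(r, n):
    """Return (t, df) for testing whether a correlation r differs from zero."""
    if n < 3:
        raise ValueError("need at least 3 pairs so that df = N - 2 is positive")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1 because 1 - r^2 = 0")
    df = n - 2
    t = r * math.sqrt(df / (1 - r ** 2))
    return t, df


# Hypothetical example: r = 0.5 observed in a sample of 27 people.
t, df = r_to_t(0.5, 27)
print(f"t({df}) = {t:.3f}")
```

For a fixed r, t grows with N, which is why even a modest correlation can be statistically significant in a large sample.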

R-squared (r^2)

  • The square of the correlation coefficient.
  • Represents the proportion of total variance in one variable that can be explained by the other variable (proportion of variance accounted for).
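
One way to see why r^2 is a proportion of variance is to fit a least-squares line and compare the leftover (residual) variation with the total variation. The sketch below assumes a simple straight-line fit of Y on X, which these notes do not cover in detail; the function name variance_explained and the data are hypothetical.

```python
import math


def variance_explained(xs, ys):
    """Return (r, proportion of Y's variation explained by a line fit on X)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)   # total variation in Y

    # Least-squares line: predicted y = intercept + slope * x
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Variation in Y left over after using X to predict it
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

    r = sxy / math.sqrt(sxx * syy)
    explained = 1 - ss_residual / syy
    return r, explained


# Hypothetical data: heights (in inches) of five parents and their children.
r, explained = variance_explained([64, 66, 68, 70, 72], [65, 66, 69, 70, 71])
print(f"r^2 = {r ** 2:.3f}, proportion explained = {explained:.3f}")
```

The two printed values agree (up to floating-point rounding), which is the sense in which r^2 is the proportion of variance accounted for. Note that r = 0.5 corresponds to only 25% of the variance, so r^2 shrinks faster than r as correlations weaken.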