Study Notes on Pearson Correlation Coefficient

Lecture Overview

  • This is lecture five of module eight.

  • Topic: Pearson Correlation Coefficient.

Pearson Correlation Coefficient

  • Definition: A statistical measure that describes the strength and direction of the linear relationship between two variables.

  • Notation:

    • Sample: represented by r

    • Population: represented by the Greek letter rho (ρ)

Range of Values

  • Both r and ρ range from -1 to 1.

    • Positive correlation: When r > 0 (e.g., 0.2, 0.8).

    • Negative correlation: When r < 0 (e.g., -0.2, -0.8).

    • No linear correlation: When r = 0.

Strength and Direction of Correlation

  • As the absolute value of r moves away from zero, the strength of the correlation increases:

    • Example: r = 0.9 and r = -0.9 are equally strong (|r| = 0.9), and both are stronger than r = 0.1 and r = -0.1 (|r| = 0.1).

  • Interpretation of r:

    • Tells you both the strength and the direction of the correlation.
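
A minimal Python sketch of how r captures both strength and direction. The notes do not give a formula for r, so this assumes the standard sample definition (sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name pearson_r and the hours/score data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of cross-products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator pieces: sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: the same scores, once rising and once falling with hours.
hours = [1, 2, 3, 4, 5]
score_up = [52, 55, 61, 64, 70]
score_down = [70, 64, 61, 55, 52]

r_up = pearson_r(hours, score_up)      # positive: scores rise with hours
r_down = pearson_r(hours, score_down)  # negative: scores fall with hours
print(r_up, r_down)
```

The two results have the same absolute value and opposite signs: the sign carries the direction, and the absolute value carries the strength.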

Correlation and Regression Line

  • The equation of the line of best fit (regression line) is: y = bx + a, where:

    • b = slope

    • a = y-intercept

  • Slope Interpretation:

    • Positive slope: If the line slopes upwards, then r > 0.

    • Negative slope: If the line slopes downwards, then r < 0.

    • Flat line: If the slope is zero, then r = 0.

Relationship between r and Slope

  • If r > 0, then the slope b > 0.

  • If r < 0, then the slope b < 0.

  • If r = 0, then the slope b = 0.

  • Key Difference:

    • r is restricted to the range [-1, 1].

    • Slope (b) can take any real value: -∞ < b < +∞

    • Examples: b = 4391 or b = -1382467.
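
A short Python sketch of the key difference above, assuming the usual least-squares formulas (the notes do not state them). Changing the units of y rescales the slope b but leaves r unchanged, which is why b is unbounded while r stays within [-1, 1]. The helper name slope_and_r and the data are hypothetical.

```python
import math

def slope_and_r(xs, ys):
    """Return (least-squares slope b, Pearson r) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx                    # slope of the line of best fit
    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation
    return b, r

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_scaled = [1000 * v for v in y]     # same data, y in units 1000x smaller

b1, r1 = slope_and_r(x, y)
b2, r2 = slope_and_r(x, y_scaled)
print(b1, r1)
print(b2, r2)
```

Both formulas share the numerator sxy, so b and r always have the same sign. Only the denominators differ, and that difference is what lets b grow without limit.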

Calculating r

  • r can be calculated when both the predictor (x) and the criterion (y) are continuous variables.

  • It is possible to calculate r if the predictor is categorical, but it may not be useful.

Handling Categorical Predictor with Continuous Criterion

  • If y (criterion) is continuous and x (predictor) is categorical:

    • Recommended to use a bar graph.

    • Use a statistical measure called Cohen's d.
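
A rough Python sketch of Cohen's d for a two-category predictor. The notes leave the formula to a later lecture, so this assumes one common definition: the difference between the two group means divided by the pooled standard deviation. The function name cohens_d and the group scores are hypothetical.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation (an assumed definition)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical continuous criterion (scores) split by a two-level category.
group_a = [78, 82, 85, 90, 75]
group_b = [70, 72, 68, 80, 75]

d = cohens_d(group_a, group_b)
print(d)
```

A positive d means the first group's mean is higher. Swapping the groups flips the sign but not the size.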

Definitions

  • Continuous Variable:

    • A variable that can take any value within a range, so the number of possible values is infinite.

    • Examples:

      • Temperature

      • Height

      • Survey scores

      • Percentages

  • Categorical Variable:

    • A variable that can only take certain distinct values.

    • Examples:

      • Gender identity

      • Race

      • Country of origin

      • Favorite sports

Future Lectures

  • In lectures six and seven, the focus will shift to:

    • Bar graphs

    • Cohen's d statistics