
Unit 7-Measures of Association: Correlation and Regression

Measures of Association

Correlation & Regression

  • Focus on understanding the relationships between variables through correlation and regression analyses.

Correlational Study Key Points

  • Criterion Variable (Dependent): The outcome you are trying to predict.
  • Predictor Variable (Independent): The variable you measure and use to make the prediction; in a correlational study it is observed rather than manipulated.
  • Scatterplot: Visual representation to examine the relationship between the two variables.
  • Intra-ocular Test: Evaluate your subjective impression of the scatterplot; does the relationship make sense psychologically, physiologically, or sociologically?

Correlation Analysis Overview

  • Describes:
    • Direction and strength of the relationship between two variables.
    • Consistency in paired (X, Y) scores.
    • Variability of Y scores with respect to X scores.
  • Regression Analysis:
    • Fit of scatterplot to the regression line (line of best fit).
    • Predictive accuracy from the analysis.

Relationship Types

Positive Relationship
  • Example: Grade Point Average (Y) vs. Lectures Attended (X)
  • Attending more lectures is associated with a higher GPA: as X increases, Y tends to increase.
Negative Relationship
  • Example: Time Spent in Gym (X) vs. Body Mass (Y)
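  • More time in the gym tends to go with lower body mass: as X increases, Y tends to decrease (an inverse relationship). This describes the pattern only; it does not show that gym time causes the reduction.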

Correlation Coefficients

  • r (Correlation Coefficient): Measures the strength and direction of linear relationships.
  • Ranges from -1 (perfect negative) to +1 (perfect positive).
  • Values Interpretation (applies to the absolute value of r, so r = -0.80 is as strong as r = +0.80):
    • 0.00 - 0.25: Very weak
    • 0.25 - 0.50: Weak
    • 0.50 - 0.75: Moderate
    • >0.75: Strong
  • A high r-value suggests a stronger predictive relationship but does not imply causation.
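
The interpretation bands above can be sketched as a small lookup. This is only an illustration: the function name `describe_r` is hypothetical, and because the listed ranges share their endpoints, the choice to put 0.25 in "weak", 0.50 in "moderate", and 0.75 in "moderate" is an assumption.

```python
def describe_r(r: float) -> tuple[str, str]:
    """Return (strength, direction) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)  # strength depends on magnitude, not sign
    if size < 0.25:
        strength = "very weak"
    elif size < 0.50:
        strength = "weak"
    elif size <= 0.75:  # assumption: 0.75 itself counts as moderate
        strength = "moderate"
    else:
        strength = "strong"

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction


print(describe_r(-0.80))
print(describe_r(0.30))
```

Note that a negative r is classified by its magnitude: -0.80 lands in the same band as +0.80.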

Pearson Product Moment Correlation (PPMC)

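  • Quantifies the strength of a linear relationship: the closer r is to -1 or +1, the more accurately Y can be predicted from X.
  • Relies on the notion of covariance: r is the covariance of X and Y divided by the product of their standard deviations.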
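
To make the link between covariance and r concrete, here is a minimal sketch that computes r as the sample covariance divided by the product of the two sample standard deviations. The function name `pearson_r` and the lecture/GPA numbers are made up for illustration and are not data from these notes.

```python
import math


def pearson_r(xs, ys):
    """Pearson r = cov(X, Y) / (s_X * s_Y), using n - 1 in each term."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sample covariance: how much X and Y vary together.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Sample standard deviations: how much each varies on its own.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable has no variability")

    return cov_xy / (sd_x * sd_y)


# Hypothetical scores: lectures attended (X) and GPA (Y).
lectures = [2, 4, 6, 8, 10]
gpa = [2.0, 2.4, 3.1, 3.0, 3.8]
r = pearson_r(lectures, gpa)
print(f"r = {r:.2f}")
```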

Coefficient of Determination

  • Measure of the proportion of variability in one variable explained by another (expressed as (r_{XY})^2).
  • Evaluates how much variation can be ascribed to the predictor variable.
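
A quick numeric illustration of why r and r squared should not be read the same way: squaring shrinks every value short of a perfect correlation, so a seemingly respectable r can account for a modest share of variability. The helper name `variance_explained` is hypothetical.

```python
def variance_explained(r: float) -> float:
    """Coefficient of determination: the proportion of variability in Y
    accounted for by X, computed as r squared."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    return r ** 2


for r in (0.25, 0.50, 0.75):
    print(f"r = {r:.2f} -> r^2 = {variance_explained(r):.4f}")
```

The sign of r drops out, so r = -0.50 and r = +0.50 both correspond to 25% of variability explained.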

Linear Regression Equation

  • General form: Y = a + bX
    • b is the slope of the line: the predicted change in Y for a one-unit increase in X. It equals the covariance of X and Y divided by the variance of X, or equivalently b = r × (s_Y / s_X).
    • a is the y-intercept, representing the value of Y when X = 0.
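
As a sketch of how a and b can be obtained from summary statistics, the snippet below uses the standard least-squares relations b = r × (s_Y / s_X) and a = mean(Y) - b × mean(X). The function name `regression_line` and the input values are hypothetical.

```python
def regression_line(r, mean_x, sd_x, mean_y, sd_y):
    """Return (a, b) for the line Y = a + bX from summary statistics."""
    if sd_x <= 0:
        raise ValueError("sd_x must be positive")
    b = r * sd_y / sd_x        # slope: change in predicted Y per unit of X
    a = mean_y - b * mean_x    # intercept: the line passes through the two means
    return a, b


# Hypothetical summary statistics, chosen only to make the arithmetic easy.
a, b = regression_line(r=0.8, mean_x=50, sd_x=10, mean_y=60, sd_y=5)
print(f"Y = {a:.2f} + {b:.2f}X")
```

With these inputs the slope is 0.8 × 5 / 10 = 0.4 and the intercept is 60 - 0.4 × 50 = 40.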

Regression Analysis Steps

  1. Calculate descriptive statistics (mean, standard deviation) for X and Y.
  2. Calculate Pearson's r to find the correlation coefficient.
  3. Determine line of best fit (regression equation): Calculate slope b and intercept a.
  4. Calculate the Standard Error of Estimate (SEE): SEE = s_Y \times \sqrt{1 - r^2}
  5. With a hypothetical X value, predict Y using the regression equation; express predicted Y with the SEE.
  6. Calculate the range of predicted values considering error.
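
The six steps above can be sketched end to end as follows. This is a minimal illustration, not a definitive procedure: the function name `regression_summary` and the five data pairs are made up, and reporting the range as predicted Y plus or minus one SEE is an assumption about how step 6 is meant.

```python
import math
from statistics import mean, stdev


def regression_summary(xs, ys, new_x):
    """Walk through the regression steps for paired scores and one new X."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need equal-length samples with at least 3 pairs")

    # Step 1: descriptive statistics.
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = stdev(xs), stdev(ys)

    # Step 2: Pearson's r from the sample covariance.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    r = cov_xy / (sd_x * sd_y)

    # Step 3: line of best fit, Y = a + bX.
    b = r * sd_y / sd_x
    a = mean_y - b * mean_x

    # Step 4: standard error of estimate (max() guards against rounding below 0).
    see = sd_y * math.sqrt(max(0.0, 1 - r ** 2))

    # Steps 5 and 6: prediction and its range (assumed here to be +/- 1 SEE).
    predicted = a + b * new_x
    return {
        "r": r, "a": a, "b": b, "see": see,
        "predicted": predicted,
        "low": predicted - see, "high": predicted + see,
    }


# Made-up paired scores for illustration only.
result = regression_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], new_x=6)
print(f"Y = {result['a']:.2f} + {result['b']:.2f}X")
print(f"Predicted Y = {result['predicted']:.2f} +/- {result['see']:.2f}")
```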

Example: Correlation in Student Grades

  • Correlation between first-year (X) and second-year (Y) psychology course grades.
  • Based on sample scores: (e.g., X means: 50,56,52… Y means: 80,85,83…)
  • Derived regression equation can help predict second-year grades based on first-year grades.
    • Example linear regression output: Y = 36.08 + 0.88X

Limitations of PPMC

  1. Data range influence: Only captures relationships in the studied range.
  2. Nonlinear relationships may go undetected.
  3. Extreme data points can skew results significantly.
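
Limitations 2 and 3 are easy to demonstrate with made-up numbers. In the sketch below, a perfect U-shaped relationship yields r = 0, and a single extreme point turns a very weak correlation into a strong one. The helper `pearson_r` and all data values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson r computed from sums of deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Nonlinear case: Y is completely determined by X (Y = X squared), yet r is 0.
xs_curve = [-3, -2, -1, 0, 1, 2, 3]
ys_curve = [x ** 2 for x in xs_curve]
r_curve = pearson_r(xs_curve, ys_curve)

# Outlier case: five scattered points, then the same points plus one extreme pair.
xs_base, ys_base = [1, 2, 3, 4, 5], [3, 1, 4, 2, 3]
r_without = pearson_r(xs_base, ys_base)
r_with = pearson_r(xs_base + [20], ys_base + [20])

print(f"U-shaped data:   r = {r_curve:.2f}")
print(f"Without outlier: r = {r_without:.2f}")
print(f"With outlier:    r = {r_with:.2f}")
```

This is one reason to inspect the scatterplot before trusting r.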

Hypothesis Testing in Correlation

  • Statistical null hypothesis (H0): There is no linear correlation between X and Y in the population (ρ = 0).
  • Alternative hypothesis (H1): A non-zero correlation exists in the population (ρ ≠ 0).
  • Degrees of freedom: N - 2 for correlation significance tests, where N is the number of (X, Y) pairs.
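
One common way to carry out this test is to convert r to a t statistic, t = r × sqrt(N - 2) / sqrt(1 - r^2), and compare it with a critical t value at N - 2 degrees of freedom. The notes do not state which procedure is used, so treat the following as a sketch under that assumption; the function name `correlation_t_test` and the values r = 0.50 and N = 27 are hypothetical, and the critical value is read from a standard t table.

```python
import math


def correlation_t_test(r: float, n: int) -> tuple[float, int]:
    """Return (t, df) for testing H0: rho = 0 from a sample r and N pairs."""
    if n < 3:
        raise ValueError("need at least 3 pairs so that df = N - 2 is positive")
    if not -1.0 < r < 1.0:
        raise ValueError("t is undefined when r is exactly -1 or +1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df


t_stat, df = correlation_t_test(r=0.50, n=27)

# Two-tailed critical value for alpha = .05 and df = 25, from a standard t table.
critical_t = 2.060
reject_h0 = abs(t_stat) > critical_t

print(f"t({df}) = {t_stat:.3f}, reject H0: {reject_h0}")
```

Because t grows with the square root of N - 2, even a small r can reach significance in a large sample, which is why significance and strength should be reported separately.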

Covid-19 Case Study Example

  • Examining correlation between country vaccination rates (predictor variable X) and 28-day new case counts (criterion variable Y).
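  • Found a statistically significant positive correlation (r = 0.214): higher vaccination rates were associated with higher case counts in the studied countries. By the scale above this is a very weak relationship (r^2 ≈ 0.046, so under 5% of the variability in case counts is accounted for), and it does not imply causation.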