Correlation Notes

Dependent vs. Independent Variables

  • Dependent Variable: The factor that is measured in an experiment.

  • Independent Variable: The factor that is controlled or manipulated in an experiment.

Definition of Correlation

  • Correlation: A statistical method used to describe and measure the relationship between two variables.

    • Relationship: Changes in one variable are consistently accompanied by changes in another variable.

    • Key Characteristic: Observes variables in their natural state without manipulation.

Characteristics of Correlation

  • Typical Magnitude: Coefficients above 0.50 (in absolute value) do occur, but correlations that strong are less common.

  • Direction of Correlation:

    • Positive Correlation (+): As one variable increases, the other also increases.

    • Negative Correlation (-): As one variable increases, the other decreases.

Types of Correlation

  • Linear Correlation: A straight-line relationship.

  • Non-linear Correlation: A relationship that follows a consistent pattern but not a straight line, so the rate of change is not constant (e.g., curvilinear).

Magnitude of Correlation

  • Measured on a scale from -1 to +1:

    • -1: Perfect negative correlation.

    • 0: No linear relationship.

    • +1: Perfect positive correlation.

  • Examples of correlation strength:

    • r = 0.80: Strong correlation.

    • r = 0.32: Moderate correlation.

Parametric Tests for Correlation

  • Pearson’s r (Product-Moment Correlation): Measures the degree and direction of the linear relationship between two variables.

    • Assumptions: Both variables are measured on an interval or ratio scale, the data are bivariate normal, and the sample is randomly selected.

    • Applications: Used for descriptive and inferential statistics in both bivariate and multivariate designs.
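To make the definition of Pearson's r concrete, here is a minimal sketch that computes it from deviation scores using only the standard library. The function name `pearson_r` and the sample data (`hours_studied`, `exam_score`) are illustrative assumptions, not part of these notes.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations: how the variables vary together.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations: how each variable varies on its own.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data, made up for illustration only.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 61, 70, 77]

r = pearson_r(hours_studied, exam_score)
print(f"r = {r:.2f}")  # positive sign: both variables rise together
```

The sign of the result gives the direction of the relationship and its absolute value gives the magnitude on the -1 to +1 scale.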

Reporting Correlation Results

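  • Format: r(df) = r statistic, p = p value, where df = N - 2 and N is the number of paired observations.

  • r: Strength and direction of the relationship.

  • p: Probability of obtaining a correlation at least this strong if there were no true relationship; results are commonly called significant when p < 0.05.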

  • Example: IQ and GPA correlation report -- r(38) = 0.34, p = 0.03.
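The reporting format can be sketched as a small helper. This assumes the usual degrees of freedom for Pearson's r, df = N - 2, so the example's df of 38 would correspond to N = 40 pairs (an inference, not stated in the notes). The function names `format_report` and `t_from_r` are hypothetical, and the p value is passed in as given, not computed.

```python
import math

def format_report(r, n, p):
    """Build an 'r(df) = ..., p = ...' string, assuming df = n - 2."""
    df = n - 2
    return f"r({df}) = {r:.2f}, p = {p:.2f}"

def t_from_r(r, n):
    """t statistic for testing whether the true correlation is zero."""
    df = n - 2
    return r * math.sqrt(df / (1 - r ** 2))

# Values from the IQ and GPA example; n = 40 is inferred from df = 38.
report = format_report(0.34, 40, 0.03)
t_stat = t_from_r(0.34, 40)

print(report)  # r(38) = 0.34, p = 0.03
print(f"t = {t_stat:.2f}")
```

The t statistic is what links r and the sample size to the p value: the same r yields a larger t, and so a smaller p, as N grows.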

Cohen's d

  • Definition: A measure of effect size that quantifies the difference between two group means in standard deviation units.

    • Range: Unbounded. The magnitude runs from 0 upward, and the sign is negative when the difference is in the reverse direction.

    • Effect Sizes:

      • Small effect: d = 0.2

      • Medium effect: d = 0.5

      • Large effect: d = 0.8

    • Use: Applied in t-tests and comparisons between groups.
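As a rough illustration of "difference between two group means in standard deviation units", the sketch below computes Cohen's d with a pooled standard deviation. The pooled-SD variant is one common choice and is an assumption here; the function name `cohens_d` and the two groups are made up for the example.

```python
import math

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation of two groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Sample variances (n - 1 in the denominator).
    var_a = sum((v - mean_a) ** 2 for v in group_a) / (n_a - 1)
    var_b = sum((v - mean_b) ** 2 for v in group_b) / (n_b - 1)
    # Pool the variances, weighting each by its degrees of freedom.
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (mean_a - mean_b) / pooled_sd

# Hypothetical scores, for illustration only.
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]

d = cohens_d(treatment, control)
print(f"d = {d:.2f}")  # above 0.8, so "large" by the benchmarks listed
```

Swapping the argument order flips the sign of d, which is the "reverse direction" case noted under Range.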

Other Correlation Tests

  • Spearman’s rho: For monotonic relationships, which need not be linear; computed on ranks, so it suits ordinal data.

  • Point Biserial: For one binary (dichotomous) and one interval/ratio variable.

  • Cramer’s V: For two nominal variables.

  • Kendall’s Tau: A rank-based measure for ordinal data, or for interval/ratio data treated as ranks.

  • Phi Coefficient: For two dichotomous variables.
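To show why a rank-based test suits monotonic but non-linear data, the sketch below computes Spearman's rho as Pearson's r applied to ranks, with tied values given the average of their ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the cubic example data are assumptions for illustration.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation from deviation scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but non-linear relationship: y always rises with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r_linear = pearson_r(x, y)
print(f"Spearman rho = {rho:.2f}, Pearson r = {r_linear:.2f}")
```

Because the ordering of y matches the ordering of x exactly, rho reaches 1 while Pearson's r falls short of 1 on the same data.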

Important Notes on Correlation

  1. Correlation ≠ Causation: A correlation does not show that one variable causes the other (e.g., a correlation between income and grades does not show that higher income directly leads to better grades).

  2. Directionality Problem: A correlation alone does not show which variable influences which:

    • x → y

    • y → x

    • both can be true (x ↔ y).

  3. Restricted Range of Scores: A narrow range of scores can distort the magnitude of a correlation (often lowering it); be cautious about generalizing beyond the range observed.

  4. Outliers: Outliers can substantially change correlation values; always examine scatter plots before running analyses.

  5. Proportion of Shared Variability: The correlation coefficient r reflects the degree of relationship, but it is not itself the proportion of variability explained; that is given by r².

    • Coefficient of Determination (r²): Represents the proportion of variability in one variable explained by another (e.g., r = 0.875 implies r² = 0.766, or 76.6% shared variability).
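Two of the notes above lend themselves to a quick numeric check: the r² arithmetic in note 5 and the effect of a single outlier in note 4. The sketch below is illustrative only; the helper `pearson_r` and the small data set are assumptions, not taken from these notes.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from deviation scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Coefficient of determination for the example in note 5.
r = 0.875
r_squared = r ** 2
print(f"r^2 = {r_squared:.3f} ({r_squared:.1%} shared variability)")

# Outlier effect: five made-up points with only a weak relationship ...
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]
r_base = pearson_r(x, y)

# ... and the same points plus one extreme observation.
r_with_outlier = pearson_r(x + [15], y + [15])

print(f"without outlier: r = {r_base:.2f}")
print(f"with outlier:    r = {r_with_outlier:.2f}")
```

A single extreme point is enough to turn a weak correlation into an apparently very strong one, which is why inspecting the scatter plot first matters.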