Correlation vs. Experiment: Key Concepts and the Correlation Coefficient

Correlation vs. Experiments

  • Core idea: a score on variable A is used to predict a score on variable B. The goal is to find out how well A predicts B.

  • Example scenario from transcript:

    • Hypothesis: people who use social media more (A) will recognize more celebrities (B).

    • Prediction: knowing how much someone uses social media should allow us to predict how many celebrities they can recognize on a test.

  • Major difference between correlation designs and experiments:

    • Experiments focus on causation.

    • Correlation designs focus on prediction.

  • In an experiment:

    • The independent variable (IV) is manipulated.

    • The dependent variable (DV) is measured.

    • Participants are usually randomly assigned to groups, and the groups are then compared.

  • In a correlational design:

    • No manipulation occurs.

    • Both variables are measured, not controlled by the experimenter.

    • There are no groups and no random assignment; each participant provides data for both variables.

  • Measurement example in the transcript:

    • Variable A (social media usage): measured by hours per day used.

    • Variable B (celebrities recognized): measured by a test of celebrity faces.

    • Both variables are obtained from the participants; neither is controlled by the experimenter.

  • Predictive hypothesis in correlation:

    • Hypothesize that Variable A provides an indication or prediction of Variable B.

    • After measurement, test whether A predicts B via a correlational analysis.

  • Testing in correlational design differs from testing in an experiment:

    • In correlation, there are no groups to compare (no experimental vs control groups).

    • The test focuses on the relationship between A and B rather than group differences.

  • Statistical procedure used: correlation analysis.

    • Output is the correlation coefficient, denoted by $r$.

    • Range: $r \in [-1, 1]$.

  • What the correlation coefficient tells us (two main aspects):

    • Strength of the relationship: how well A predicts B. The farther $r$ is from zero, the stronger the relationship.

    • Direction of the relationship: whether the association is positive or negative.

  • Interpretation of strength relative to zero:

    • The stronger the correlation, the closer the data points lie to a line of best fit.

    • The closer data points cluster to a straight line, the stronger the prediction.

  • Interpretation of the range:

    • Strongest possible correlations: $r = -1$ or $r = 1$.

    • If $r = 1$, A perfectly predicts B with a positive relationship.

    • If $r = -1$, A perfectly predicts B with a negative relationship.

    • Zero correlation: $r = 0$, indicating no linear relationship between A and B.

  • Practical note about typical data:

    • In psychology, perfect correlations ($r = -1$ or $r = 1$) are essentially never observed, and $r$ is rarely exactly 0; real data usually show weaker, intermediate values.

  • Graphical interpretation (scatter plots):

    • A strong correlation yields data points that lie close to the line of best fit.

    • A weaker correlation yields data points that spread more around the line.

    • In the examples given, a positive correlation is shown when increases in A correspond to increases in B.

  • Direction of relationship details:

    • Positive correlation: As Variable A increases, Variable B increases as well.

    • Negative correlation: As Variable A increases, Variable B decreases.

  • Examples from the transcript:

    • Positive correlation example: more social media usage (A) associated with recognizing more celebrities (B).

    • Negative correlation example (hypothetical in the transcript): more social media usage (A) associated with recognizing fewer celebrities (B).

  • Summary of the approach:

    • In correlational studies, we measure A and B for each participant.

    • We test whether A predicts B using the correlation coefficient $r$.

    • The results inform us about prediction strength and direction, not causation.

  • Connecting to broader principles:

    • Correlation designs are used for prediction and association testing before or alongside causal investigations.

    • They rely on observational data rather than manipulated conditions.

  • Key formulas to remember:

    • Pearson correlation coefficient:
      $r = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$

    • Interpretation guide:

      • If $|r| \approx 1$, strong predictive relationship.

      • If $|r| \approx 0$, little to no predictive relationship.
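The Pearson formula above can be written out step by step. The function below is an illustrative sketch, not a reference implementation: the name `pearson_r` is hypothetical, and it assumes the sample versions of the covariance and standard deviations (dividing by n - 1). Because the same divisor appears in the numerator and the denominator, it cancels, so using n instead gives the same r.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance of X and Y divided by the
    product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Covariance: average product of paired deviations from the means.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

    # Standard deviations of each variable on its own.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable does not vary")

    return cov_xy / (sd_x * sd_y)

# Made-up scores for five participants.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]
r = pearson_r(x_scores, y_scores)
print(round(r, 3))  # 0.775, i.e. 6 / sqrt(60)
```

For these five pairs the deviation products sum to 6 and the squared deviations sum to 10 and 6, so r = 6 / sqrt(10 × 6) ≈ 0.775: a fairly strong positive relationship by the interpretation guide.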

  • Terminology recap:

    • Variable A: the predictor, playing the role of the independent variable in the correlation (though it is measured, not manipulated).

    • Variable B: the outcome, playing the role of the dependent variable in the correlation (measured, as it would be in an experiment).

    • Line of best fit: the best straight-line approximation through the data points in a scatter plot.

  • Final takeaway:

    • Correlation helps us understand whether there is a predictable relationship between two measured variables and in which direction that relationship goes, but it does not establish causation or imply that changing A will cause B to change.