Chapter 12: The correlational research strategy

Correlational research


  • Goal: To establish that a relation/association exists between variables and to describe the nature of the relationship (form, direction & strength).

  • There is no attempt to manipulate, control, or interfere with the variables. 

  • E.g., whether a relation exists between how often people have meaningful conversations and their level of happiness.


Correlational Data

  • For each individual, there are two scores (one for each variable).

  • Data can be presented in a table showing the two scores for each individual in separate columns.

  • Each pair of scores can also be plotted as a single point on a scatter plot.


Form of the relation:

  • Form: Look for a pattern in the data that suggests a consistent and predictable relationship. The relationship may be linear or non-linear.


Direction of the relation:

  • Direction: How are changes in one variable related to changes in the other variable?

  • Positive correlation (+): Increases in 𝑥 are paired with increases in 𝑦 

  • Negative correlation (-): Increases in 𝑥 are paired with decreases in 𝑦
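The direction of a relation can be read from the sign of the summed cross-products of deviations from the means. The sketch below is illustrative only: the function name `direction` and the three small data sets are hypothetical and do not come from the chapter.

```python
def direction(xs, ys):
    """Classify the direction of a relation from the sign of the
    summed cross-products of deviations from the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Positive when above-average x tends to pair with above-average y.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sp > 0:
        return "positive"
    if sp < 0:
        return "negative"
    return "none"


# Hypothetical scores for five individuals.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 61, 70, 75]
errors_made = [9, 7, 6, 4, 2]

print(direction(hours_studied, exam_score))   # positive
print(direction(hours_studied, errors_made))  # negative
```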

Strength of the relation:

  • Strength: degree of association between two variables.

  • Expressed mathematically as the correlation coefficient (r)

  • The correlation coefficient can vary from -1.00 to +1.00

  • When both variables are measured on an interval or ratio scale, we use the Pearson correlation (Pearson r)

  • If at least one variable is ordinal, we use the Spearman correlation, which is computed on ranks
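As a minimal sketch of how the two coefficients are computed, the code below implements Pearson r from deviation scores and obtains the Spearman correlation by applying the same formula to ranks. The helper names (`pearson_r`, `average_ranks`, `spearman_r`) and the example scores are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson r: cross-products of deviations divided by the
    square root of the product of the two sums of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(xs, ys):
    """Spearman correlation: Pearson r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical scores: y rises consistently with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson_r(x, y), 3))   # below 1 because the points curve
print(round(spearman_r(x, y), 3))  # 1.0 because the ordering is perfectly consistent
```

The contrast in the example shows why the choice matters: Pearson r measures how closely the points follow a straight line, while the rank-based coefficient only asks whether the ordering of the scores is consistent.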

Strength of correlation coefficient:

  • 𝑟 near 0 indicates a weak relation

  • 𝑟 close to -1 or 1 indicates that the points lie close to a straight line

  • 𝑟 equal to -1 or 1 indicates that the points lie exactly on a straight line

Regression line:

  • The best-fitting straight line through the points of the scatter plot. It describes the linear relationship between the dependent variable on the y-axis and the independent variable on the x-axis.

  • The closer the points are to the line, the greater the association (relation) between the variables
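One common way to obtain such a line is least squares, which chooses the slope and intercept that minimize the squared vertical distances between the points and the line. The notes do not say which fitting method is intended, so least squares is an assumption here, and the function name and data are hypothetical.

```python
def regression_line(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sp / ss_x
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scores for five individuals.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = regression_line(xs, ys)
print(round(slope, 2), round(intercept, 2))  # 0.6 2.2

# Vertical distance of each point from the line; smaller distances
# mean a stronger association.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
```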

Significance of the correlation:

  • Statistical significance is determined by consulting a table of critical values of r. The table takes into account sample size and alpha level.

  • Small samples are prone to producing large correlations by chance, so the criterion for statistical significance is stricter when the sample is small.

  • To be significant, the absolute value of r must be equal to or larger than the critical value corresponding to the appropriate df and alpha level.

  • For correlation, df is always sample size minus 2
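The critical values in such a table can be derived from critical t values, because the test statistic for a correlation is t = r·√df / √(1 − r²), which rearranges to r = t / √(t² + df). The sketch below assumes the critical t is looked up by the reader; 2.228 is the standard two-tailed value for df = 10 at alpha = .05. The function names and the sample size of 12 are hypothetical.

```python
import math


def critical_r(t_crit, df):
    """Smallest |r| that reaches significance, given the critical t for this df."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


def is_significant(r, n, t_crit):
    """Compare |r| with the critical value; df is sample size minus 2."""
    df = n - 2
    return abs(r) >= critical_r(t_crit, df)


# Assumed inputs: n = 12 individuals, two-tailed alpha = .05,
# so df = 10 and the tabled critical t is 2.228.
T_CRIT_DF10 = 2.228
print(round(critical_r(T_CRIT_DF10, 10), 3))   # 0.576
print(is_significant(0.60, 12, T_CRIT_DF10))   # True
print(is_significant(-0.50, 12, T_CRIT_DF10))  # False
```

Because df sits under the square root in the denominator, the critical value shrinks as the sample grows, which is why small samples need a larger r to reach significance.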

Shared variance:

  • Shared variance: the proportion of variability that variables A and B have in common.

  • To determine what percentage of the variability in one variable (A) can be accounted for by variability in the other (B), calculate the shared variance.

  • r² = coefficient of determination = shared variance.
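A short worked example of the arithmetic, using hypothetical r values: squaring r and multiplying by 100 gives the percentage of shared variance.

```python
def shared_variance_percent(r):
    """Coefficient of determination (r squared) expressed as a percentage."""
    return (r ** 2) * 100


# Hypothetical correlations. The sign of r does not matter once it is squared.
for r in (0.25, 0.50, -0.80):
    print(r, round(shared_variance_percent(r), 2))
```

Doubling r from 0.25 to 0.50 raises the shared variance from 6.25% to 25%, so shared variance grows much faster than r itself.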


  • Correlation: describes the relationship between two variables.

  • Very useful for predictions

  • If A is known, and the correlation between A and B is known, then B can be predicted with an accuracy that depends on the strength of the correlation.

  • Can also be used to establish validity and reliability, such as test-retest reliability and concurrent validity
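The prediction idea can be sketched with standardized scores: the predicted z-score on B is r times the z-score on A. This assumes a linear relation, and the means, standard deviations, and r below are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def predict_b(a, mean_a, sd_a, mean_b, sd_b, r):
    """Predict B from A using the standardized form of the regression line."""
    z_a = (a - mean_a) / sd_a   # how far A is from its mean, in SD units
    z_b_hat = r * z_a           # the prediction is pulled toward the mean of B
    return mean_b + z_b_hat * sd_b


# Hypothetical summary statistics: A has mean 100 and SD 15, B has mean 50 and SD 10.
print(predict_b(115, 100, 15, 50, 10, r=0.6))  # about 56: one SD above on A, 0.6 SD above on B
print(predict_b(115, 100, 15, 50, 10, r=0.0))  # 50.0: with no correlation, the best guess is the mean
```

The weaker the correlation, the closer every prediction sits to the mean of B, which is one way to see why accuracy depends on the strength of the correlation.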

The fact that two variables are related does not mean that there must be a direct relation between them. A third (unidentified) variable may be responsible for producing the observed relation, e.g., (𝑍 causes 𝑋) & (𝑍 causes 𝑌) ➔ 𝑋 & 𝑌 are correlated.
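A small simulation can illustrate the third-variable pattern. In this sketch, which uses made-up random data with a fixed seed, 𝑍 influences both 𝑋 and 𝑌 while 𝑋 and 𝑌 never influence each other, yet the two end up correlated. The variable names and noise levels are assumptions for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Z causes X and Z causes Y; there is no direct link between X and Y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)
print(round(r_xy, 2))  # clearly positive, although X and Y are not directly linked

# Removing Z's contribution leaves only the independent noise,
# and the correlation all but disappears.
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_after = pearson_r(x_without_z, y_without_z)
print(round(r_after, 2))  # close to 0
```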

  • Spurious correlation: two variables appear to be directly related, but the association arises by coincidence or because of a third variable.

  • Directionality problem: the situation in which two variables are known to be related, but it is not known which is the cause and which is the effect.

  • Regression analysis: a way of using associations between variables as a method of prediction.

 

Potential problems:

  • The Pearson correlation results will be misleading under two particular circumstances:

1. Nonlinear relationships: Pearson r measures only linear association, so inspect the scatter plot to check that the shape of the relationship is linear.

2. Restricted range: When the range of values measured for one of the variables is restricted for some reason, the correlation found in the restricted sample can differ substantially from the correlation over the full range of values.
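Both problems can be demonstrated with made-up data. In the sketch below, the first part uses a perfectly predictable but curved relation, for which Pearson r is 0. The second part simulates a linear relation with a fixed seed and compares r over the full range of 𝑥 with r for a narrow slice of that range. All values and cut-offs are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


# 1. Nonlinear relationship: y is completely determined by x (y = x squared),
#    yet the U-shaped pattern has no linear trend.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [v ** 2 for v in curve_x]
r_curve = pearson_r(curve_x, curve_y)
print(r_curve)  # 0.0

# 2. Restricted range: simulate a linear relation over x from 0 to 10,
#    then keep only the individuals whose x falls between 4 and 6.
rng = random.Random(1)  # fixed seed so the run is repeatable
full_x = [rng.uniform(0, 10) for _ in range(2000)]
full_y = [xi + rng.gauss(0, 2) for xi in full_x]
r_full = pearson_r(full_x, full_y)

kept = [(xi, yi) for xi, yi in zip(full_x, full_y) if 4 <= xi <= 6]
kept_x = [p[0] for p in kept]
kept_y = [p[1] for p in kept]
r_restricted = pearson_r(kept_x, kept_y)

print(round(r_full, 2), round(r_restricted, 2))  # the restricted r is much smaller
```

In this simulation the restriction weakens the correlation, but the general point is only that the restricted value need not match the full-range value.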

Strengths and weaknesses of correlational research:

  • Record what exists naturally

  • Cannot assess causality 

  • Helps identify where to look for causes

  • Can investigate what is otherwise unethical to examine experimentally

  • High external validity

  • Low internal validity