Chapter 12: The correlational research strategy
Correlational research
Goal: To establish that a relation/association exists between variables and to describe the nature of the relationship (form, direction & strength).
There is no attempt to manipulate, control, or interfere with the variables.
E.g., whether a relation exists between the amount of meaningful conversation people have and their level of happiness.
Correlational Data
For each individual, there will be two scores (one for each variable).
Data can be presented in a table showing the two scores for each individual in separate columns.
The pairs of scores can also be presented on a scatter plot, with each individual shown as a single point.
Form of the relation:
Relation: Looking for a pattern in the data that suggests a consistent and predictable relationship. The relationship can be linear or non-linear.
Direction of the relation:
Direction: How are changes in one variable related to changes in the other variable?
Positive correlation (+): Increases in 𝑥 are paired with increases in 𝑦
Negative correlation (-): Increases in 𝑥 are paired with decreases in 𝑦
Strength of the relation:
Strength: degree of association between two variables.
Expressed mathematically as the correlation coefficient (r)
The correlation coefficient can vary from -1.00 to +1.00
When both variables are measured on an interval or ratio scale, we use the Pearson r
If at least one variable is ordinal (ranks), we use the Spearman correlation
Strength of correlation coefficient:
𝑟 near 0 indicates a weak relation
𝑟 close to -1 or 1 indicates that the points lie close to a straight line
𝑟 equal to -1 or 1 indicates that the points lie exactly along a straight line
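As an illustration of how direction and strength show up in the coefficient, here is a minimal sketch that computes a Pearson r by hand and a Spearman correlation as the Pearson r of the ranks. The scores and the function names (pearson_r, ranks, spearman_r) are hypothetical and are not taken from the chapter.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation for two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations, and the two sums of squared deviations
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

def ranks(values):
    """Rank scores from 1 (lowest); tied scores share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the score at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman_r(x, y):
    """Spearman correlation: the Pearson r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical scores for five individuals (made up for illustration)
conversations = [1, 2, 3, 4, 5]
happiness = [2, 4, 5, 4, 5]

print(round(pearson_r(conversations, happiness), 2))   # 0.77 (positive, fairly strong)
print(round(spearman_r(conversations, happiness), 2))  # 0.74
print(pearson_r([1, 2, 3], [6, 4, 2]))                 # -1.0 (points exactly on a line)
```

The sign of the result gives the direction of the relation and its distance from 0 gives the strength.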
Regression line:
The straight line that best summarizes the linear relationship between the dependent variable on the y-axis and the independent variable on the x-axis.
The closer the points are to the line, the greater the association (relation) between the variables
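One common way to obtain such a line is a least-squares fit. The sketch below assumes that method (the notes do not say how the line is computed) and uses made-up scores; fit_line is a hypothetical helper name.

```python
def fit_line(x, y):
    """Least-squares regression line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Slope = sum of products of deviations / sum of squared deviations of x
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = sp / ss_x
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical scores (made up for illustration)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
print(f"y = {slope:.1f}x + {intercept:.1f}")    # y = 0.6x + 2.2

# Predicted y for a new x value of 6
predicted = slope * 6 + intercept
print(f"{predicted:.1f}")                        # 5.8
```

How far the actual points fall from the fitted line is what the correlation coefficient summarizes: the tighter the points cluster around the line, the closer r is to -1 or 1.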
Significance of the Correlation:
Statistical significance is determined by consulting a table. The table takes into account sample size and alpha level.
Small samples can produce large correlations by chance alone, so the criterion for statistical significance is stricter for small samples.
To be significant, the absolute value of r must be equal to or larger than the critical value corresponding to the appropriate df and alpha level.
For a correlation, df is always the sample size minus 2 (df = n - 2).
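The table lookup can be sketched in code. The chapter's table is not reproduced here, so as an assumption this example derives the critical r from a critical t value using the standard relation r = t / sqrt(t^2 + df). The value 2.306 is the two-tailed critical t for df = 8 at alpha = .05 from a standard t table; the function names are hypothetical.

```python
from math import sqrt

def critical_r(t_crit, df):
    """Critical value of r implied by a critical t value with the same df."""
    return t_crit / sqrt(t_crit ** 2 + df)

def is_significant(r, n, t_crit):
    """True if |r| meets or exceeds the critical r for df = n - 2."""
    df = n - 2
    return abs(r) >= critical_r(t_crit, df)

# Hypothetical study: n = 10 individuals, so df = 10 - 2 = 8
T_CRIT_DF8 = 2.306  # two-tailed, alpha = .05, df = 8 (standard t table)

print(round(critical_r(T_CRIT_DF8, 8), 3))     # 0.632
print(is_significant(0.70, 10, T_CRIT_DF8))    # True
print(is_significant(0.60, 10, T_CRIT_DF8))    # False
print(is_significant(-0.70, 10, T_CRIT_DF8))   # True (the sign does not matter)
```

With only 10 individuals, even a correlation of .60 falls short of significance, which illustrates why small samples face a stricter criterion.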
Shared variance:
Shared variance: the common ground shared between variables A and B.
To determine what percentage of the changes in one variable (A) can be accounted for by changes in the other (B), one must calculate the shared variance.
r² = coefficient of determination = shared variance.
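A short worked example, using hypothetical r values, shows how quickly shared variance shrinks as r moves away from -1 or 1:

```python
def shared_variance(r):
    """Coefficient of determination: the squared correlation coefficient."""
    return r ** 2

# Hypothetical correlation coefficients (made up for illustration)
for r in (0.30, 0.50, -0.80):
    print(f"r = {r:+.2f}  ->  r^2 = {shared_variance(r):.2f} "
          f"({shared_variance(r):.0%} shared variance)")
# r = +0.30  ->  r^2 = 0.09 (9% shared variance)
# r = +0.50  ->  r^2 = 0.25 (25% shared variance)
# r = -0.80  ->  r^2 = 0.64 (64% shared variance)
```

Because r is squared, the sign drops out: a correlation of -.80 accounts for as much variance as a correlation of +.80.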
Correlation: describes the relationship between two variables.
Very useful for predictions
If A is known, and the correlation between A and B is known, then B can be predicted with an accuracy that depends on the strength of the correlation.
Can also be used to establish reliability and validity, such as test-retest reliability and concurrent validity.
Third-variable problem: The fact that two variables are related does not mean that there is a direct relation between them. A third (unidentified) variable may be responsible for producing the observed relation, e.g., (𝑍 causes 𝑋) & (𝑍 causes 𝑌) ➔ 𝑋 & 𝑌 are correlated.
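A small simulation can make this concrete. The sketch below is built on assumptions that are not from the chapter: a made-up Z drives both X and Y, and X has no effect on Y at all. The variable names, sample size, and noise levels are all hypothetical.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation for two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Z causes X and Z causes Y; X and Y never influence each other
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

# X and Y are clearly correlated (roughly .5 in expectation for this setup)
r_xy = pearson_r(x, y)
print(round(r_xy, 2))

# Subtracting Z's known contribution leaves only the independent noise,
# and the correlation between what remains is close to 0
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_without_z = pearson_r(x_without_z, y_without_z)
print(round(r_without_z, 2))
```

In real correlational data the third variable is usually unmeasured, so its contribution cannot simply be subtracted as it is in this simulation.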
Spurious correlation: two variables appear to be directly related but are not; the correlation arises by chance or through another variable.
Directionality problem: the situation in which it is known that two variables are related although it is not known which is the cause and which is the effect.
Regression analysis: a way of using associations between variables to make predictions.
Potential Problems:
The Pearson correlation results will be misleading under two particular circumstances:
1. Nonlinear relationships: The Pearson r measures only the linear relationship, so the scatter plot must be inspected to confirm that the shape of the relationship is linear.
2. Restricted range: When the range of values measured for one of the variables is restricted for some reason, the correlation obtained may not represent the relation across the full range of scores.
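Both problems can be demonstrated with a short sketch. All data here are made up for illustration: a perfectly predictable but curved relation for the first case, and simulated scores with an arbitrary cut-off (x between 4 and 6) for the second.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation for two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# 1. Nonlinear relationship: y is completely determined by x (y = x squared),
#    yet the Pearson r is 0 because the relation is not a straight line
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [v ** 2 for v in x_curve]
r_curve = pearson_r(x_curve, y_curve)
print(r_curve)  # 0.0

# 2. Restricted range: a strong linear relation over the full range of x
rng = random.Random(1)  # fixed seed so the run is repeatable
x_full = [rng.uniform(0, 10) for _ in range(2000)]
y_full = [v + rng.gauss(0, 2) for v in x_full]
r_full = pearson_r(x_full, y_full)

# Keep only individuals whose x score falls in a narrow band (hypothetical cut-off)
pairs = [(a, b) for a, b in zip(x_full, y_full) if 4 <= a <= 6]
x_restricted = [a for a, _ in pairs]
y_restricted = [b for _, b in pairs]
r_restricted = pearson_r(x_restricted, y_restricted)

# The correlation in the restricted sample is noticeably weaker
print(round(r_full, 2), round(r_restricted, 2))
```

In this simulation the restriction weakens the correlation; the general point is that a correlation computed over a narrow band of scores should not be generalized to the full range.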
Strengths and weaknesses of correlational research:
Record what exists naturally
Cannot assess causality
Helps identify where to look for causes
Can investigate what is otherwise unethical to examine experimentally
High external validity
Low internal validity