The Evolution of Pearson's Correlation Coefficient
Introduction to Correlation in Statistics
Definition of Correlation:
Correlation is a measure of the direction (positive or negative) and strength of the linear relationship between two quantitative variables.
Pearson's Correlation Coefficient (r):
Typically taught in high school and introductory college-level statistics.
The article explores an activity that aids in understanding its formula and interpretation through scatter plots.
Activity Overview
Introduces Quadrant Count Ratio (QCR) as an intermediate measure of association.
Discusses how Pearson's r builds on QCR and overcomes the shortcomings of QCR.
Aligns with GAISE Report (Franklin et al., 2007):
Promotes statistical literacy and reasoning skills for high school graduates.
Describes a statistical education framework with three developmental levels (A, B, and C).
Developmental Levels
Level A: Introduction to statistical concepts through simple activities.
Level B: Building upon foundational concepts and introducing more complex ideas.
Level C: Advanced statistical thinking and the ability to understand deeper statistical methods.
Understanding Association Between Variables
Definition:
Two variables are associated if the values of one variable tend to occur more frequently with certain values of the other variable (Moore and McCabe, 2003).
Important for making predictions about one variable based on another.
Example of Association
Anthropometric Question:
Is there a relationship between arm span and height?
Height is treated as the independent variable (x) and arm span as the dependent variable (y).
Scatter Plots
Most effective method for exploring the association between two quantitative variables.
Example measurement data: Height and arm span of 25 students (in centimeters).
Types of Relationships Identified in Scatter Plots
Direction: Ascending or descending pattern.
Strength: How closely the points cluster around a line or curve.
Form: Linear or non-linear trend.
Quadrant Count Ratio (QCR)
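QCR Definition:
QCR = (QI + QIII - QII - QIV) / n, where the four quadrants are formed by the vertical line at the mean of x and the horizontal line at the mean of y.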
Components:
QI, QII, QIII, and QIV represent the number of points in each quadrant.
n = total number of observations.
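The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own code: the function name quadrant_count_ratio and the sample values are made up, and the choice to count points lying exactly on a mean line toward n but toward no quadrant is an assumption.

```python
from statistics import mean


def quadrant_count_ratio(xs, ys):
    """Return (QI + QIII - QII - QIV) / n, with quadrants cut by the mean lines."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    x_bar, y_bar = mean(xs), mean(ys)
    agree = 0     # points in quadrants I and III
    disagree = 0  # points in quadrants II and IV
    for x, y in zip(xs, ys):
        product = (x - x_bar) * (y - y_bar)
        if product > 0:
            agree += 1
        elif product < 0:
            disagree += 1
        # Assumption: a point on a mean line is in no quadrant; it only adds to n.
    return (agree - disagree) / len(xs)


# Made-up illustrative values, not the article's 25-student data set.
heights = [150, 160, 170, 180, 190]
arm_spans = [148, 163, 169, 182, 188]
print(quadrant_count_ratio(heights, arm_spans))  # 0.8
```

Four of the five made-up points fall in quadrants I and III, and one sits on the vertical mean line, which gives (4 - 0) / 5 = 0.8.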
Example Calculation for Arm Span and Height
Using the provided formula with data:
Interpretation: Indicates a strong positive association between arm span and height.
Properties of the QCR
Range:
QCR is guaranteed to be between -1 and 1.
Units Independence:
QCR is independent of the units of measurement because it depends only on counts of points; e.g., converting height and arm span from centimeters to inches would not change it.
Explored through Scatter Plots:
Figures 2-7 demonstrate various properties of QCR.
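As a rough check of the range and units properties, the sketch below computes QCR for the same measurements in centimeters and in inches. The helper qcr and all data values are hypothetical and chosen only for illustration; no point lies on a mean line, so rounding in the unit conversion cannot move a point between quadrants.

```python
from statistics import mean


def qcr(xs, ys):
    """Quadrant count ratio relative to the mean lines of xs and ys."""
    x_bar, y_bar = mean(xs), mean(ys)
    products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]
    in_i_or_iii = sum(p > 0 for p in products)
    in_ii_or_iv = sum(p < 0 for p in products)
    return (in_i_or_iii - in_ii_or_iv) / len(xs)


# Hypothetical measurements in centimeters.
heights_cm = [152, 158, 165, 171, 180, 186]
arm_spans_cm = [150, 161, 172, 174, 178, 189]

# The same measurements converted to inches.
CM_PER_INCH = 2.54
heights_in = [h / CM_PER_INCH for h in heights_cm]
arm_spans_in = [a / CM_PER_INCH for a in arm_spans_cm]

qcr_cm = qcr(heights_cm, arm_spans_cm)
qcr_in = qcr(heights_in, arm_spans_in)
print(qcr_cm, qcr_in)  # the two values are identical
```

Rescaling both axes moves the mean lines along with the points, so every point stays in its quadrant and the counts do not change.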
Properties Questions
Is the general trend positive or negative?
How are points distributed across quadrants?
Does the relationship appear linear?
What strength does the QCR suggest?
Specific Properties Explained
Property 1:
QCR is positive when most points fall in quadrants I and III, and negative when most fall in quadrants II and IV.
Property 2:
QCR is close to zero when the association is weak, because the points are then spread roughly evenly across the four quadrants.
Property 3:
QCR equals 1 when all points lie in quadrants I and III, and -1 when all points lie in quadrants II and IV.
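The three properties can be explored by counting the points in each quadrant directly. The sketch below is illustrative only: the function names quadrant_counts and qcr_from_counts and the four tiny data sets are invented for this purpose, and points on a mean line are assumed to belong to no quadrant.

```python
from statistics import mean


def quadrant_counts(xs, ys):
    """Count points in each quadrant formed by the mean lines."""
    x_bar, y_bar = mean(xs), mean(ys)
    counts = {"QI": 0, "QII": 0, "QIII": 0, "QIV": 0}
    for x, y in zip(xs, ys):
        dx, dy = x - x_bar, y - y_bar
        if dx > 0 and dy > 0:
            counts["QI"] += 1
        elif dx < 0 and dy > 0:
            counts["QII"] += 1
        elif dx < 0 and dy < 0:
            counts["QIII"] += 1
        elif dx > 0 and dy < 0:
            counts["QIV"] += 1
    return counts


def qcr_from_counts(counts, n):
    """QCR = (QI + QIII - QII - QIV) / n."""
    return (counts["QI"] + counts["QIII"] - counts["QII"] - counts["QIV"]) / n


# Property 1: most points in quadrants I and III, so QCR is positive.
mostly_up = quadrant_counts([1, 2, 2, 4, 5, 4], [2, 1, 4, 5, 4, 2])
# Property 2: points spread evenly over the quadrants, so QCR is zero.
balanced = quadrant_counts([1, 1, 3, 3], [1, 3, 1, 3])
# Property 3: all points in I and III give 1; all in II and IV give -1.
all_up = quadrant_counts([1, 2, 4, 5], [2, 1, 5, 4])
all_down = quadrant_counts([1, 2, 4, 5], [4, 5, 1, 2])

for name, counts, n in [("mostly_up", mostly_up, 6), ("balanced", balanced, 4),
                        ("all_up", all_up, 4), ("all_down", all_down, 4)]:
    print(name, counts, qcr_from_counts(counts, n))
```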
Transitioning to Pearson's Correlation Coefficient
Distance Calculation:
QCR only counts which quadrant each point falls in, so the measure of association strength is refined by using the signed distances from each point to the mean lines, (x - x̄) and (y - ȳ).
Calculate Pearson's r:
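r = (1 / (n - 1)) * Σ [(x_i - x̄) / s_x] * [(y_i - ȳ) / s_y], where s_x and s_y are the sample standard deviations. In words, r is the sum of the products of the standardized signed distances, divided by n - 1.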
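A minimal sketch of this calculation, assuming the usual sample form of Pearson's r with n - 1 in the denominator and sample standard deviations. The function name pearson_r and the three-point data set are illustrative, not taken from the article.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Sum of products of standardized signed distances, divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with n >= 2")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)


# Tiny made-up example: signed distances are (-1, 0, 1) and (-1, 1, 0).
print(pearson_r([1, 2, 3], [1, 3, 2]))  # 0.5
```

Here s_x = s_y = 1 and the products of the signed distances sum to 1, so r = 1 / (3 - 1) = 0.5. Dividing each distance by its standard deviation is what makes r independent of the units of measurement.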
Properties of Pearson's r
Trend Correlation:
r is positive when the trend is ascending and negative when it is descending.
Weak Correlation:
r is close to zero in weak associations.
Perfect Correlation:
r equals 1 or -1 only when all points fall exactly on a line (ascending for 1, descending for -1).
Comparisons Between QCR and Pearson's r
Use of Scatter Plots:
Visual representation aids in understanding the nuances of association.
Directions and Forms:
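Both measures indicate the direction of the association, but QCR only counts points per quadrant, whereas Pearson's r weights each point by its distances from the mean lines and therefore reflects how closely the points follow a line.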
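The difference shows up on two small data sets whose points all lie in quadrants I and III. The sketch below is illustrative only: the data are invented, and r is computed with the algebraically equivalent form Sxy / sqrt(Sxx * Syy) rather than through standardized distances.

```python
from math import sqrt
from statistics import mean


def qcr(xs, ys):
    """Quadrant count ratio relative to the mean lines."""
    x_bar, y_bar = mean(xs), mean(ys)
    products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]
    return (sum(p > 0 for p in products) - sum(p < 0 for p in products)) / len(xs)


def pearson_r(xs, ys):
    """Pearson's r as Sxy / sqrt(Sxx * Syy)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


xs = [1, 2, 4, 5]
scattered = [2, 1, 5, 4]   # ascending trend, but not on a line
on_a_line = [3, 5, 9, 11]  # exactly y = 2x + 1

print(qcr(xs, scattered), pearson_r(xs, scattered))  # 1.0 0.8
print(qcr(xs, on_a_line), pearson_r(xs, on_a_line))  # 1.0 1.0
```

QCR reports 1 for both data sets because every point sits in quadrant I or III, while r reaches 1 only for the points that fall exactly on a line.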
Summary
The article emphasizes an understanding of how Pearson's correlation coefficient represents the direction and strength of relationships in quantitative data.
Implementation in classrooms includes practical data collection and exploration through scatter plots.
Suggested resources include online platforms that further enhance learning.