BIOS 7021 Scatterplot and Correlation Summary
Course Overview
Transition from Density Curves and Normal Distributions to Scatterplots and Correlation.
Focuses on visualizing relationships between two quantitative variables.
Key Concepts
Two Quantitative Variables
Scatterplot: Displays form, direction, and strength of relationship.
Correlation Coefficient (r): Measures direction and strength of linear relationship.
Regression: Summarizes a linear relationship with an equation that predicts the response from the explanatory variable.
Categorical Variables
Two-way Frequency Tables (Contingency Tables) for analyzing relationships between two categorical variables.
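A two-way table can be built by counting each combination of categories. The sketch below is only an illustration: the records (smoking status and exercise level) are hypothetical data invented for this example, not course data.

```python
from collections import Counter

# Hypothetical records: (smoking status, exercise level) for 8 subjects.
records = [
    ("smoker", "low"), ("smoker", "low"), ("smoker", "high"),
    ("nonsmoker", "low"), ("nonsmoker", "high"), ("nonsmoker", "high"),
    ("nonsmoker", "high"), ("smoker", "low"),
]

# Count how often each (row category, column category) pair occurs.
counts = Counter(records)
rows = sorted({row for row, _ in records})
cols = sorted({col for _, col in records})

# table[row][col] holds the cell count; Counter returns 0 for unseen pairs.
table = {row: {col: counts[(row, col)] for col in cols} for row in rows}

# Marginal totals for each row and each column.
row_totals = {row: sum(table[row].values()) for row in rows}
col_totals = {col: sum(table[row][col] for row in rows) for col in cols}

# Print the table with its margins.
print(f"{'':<10}" + "".join(f"{col:>6}" for col in cols) + f"{'total':>7}")
for row in rows:
    cells = "".join(f"{table[row][col]:>6}" for col in cols)
    print(f"{row:<10}" + cells + f"{row_totals[row]:>7}")
totals = "".join(f"{col_totals[col]:>6}" for col in cols)
print(f"{'total':<10}" + totals + f"{len(records):>7}")
```

Comparing the counts within each row (for example, smokers by exercise level) is one way to see whether the two categorical variables are associated.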
Mixed Variables
Side-by-side Boxplots: Compare the distribution of a quantitative variable across the groups defined by a categorical variable.
Association
Association exists if knowing the value of one variable provides information about the other.
Positive Association: Larger values of one variable tend to occur with larger values of the other.
Negative Association: Larger values of one variable tend to occur with smaller values of the other.
Variables Defined
Response Variable (y): Measures the outcome of a study.
Explanatory Variable (x): Explains or may influence changes in the response; it does not necessarily cause them.
Scatterplots
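Plots the relationship between two quantitative variables measured on the same individuals: explanatory variable on the horizontal axis, response variable on the vertical axis.
Choose axis scales carefully; stretched or compressed scales can exaggerate or hide the apparent strength of a relationship.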
Identifying Relationships in Scatterplots
Look for the overall pattern (form, direction, and strength; less scatter indicates a stronger relationship) and for outliers that deviate from that pattern.
Correlation Coefficient (r)
Ranges from -1 to 1; the sign gives the direction and the magnitude gives the strength of the linear relationship (values near -1 or 1 are strong, values near 0 are weak).
Properties:
r > 0: Positive association
r < 0: Negative association
r = 0: No linear association (a curved relationship may still exist)
r is unaffected by changes in the units of measurement
r is sensitive to outliers
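The properties above can be checked numerically. This is a minimal sketch, assuming the standard sample formula r = (1/(n-1)) * sum of (z-score of x) * (z-score of y); the height and weight values are hypothetical and were chosen only to illustrate the point.

```python
import math

def correlation(xs, ys):
    """Sample correlation: average product of standardized x and y values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    products = (((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Hypothetical data: height in inches, weight in pounds.
height_in = [60, 62, 65, 67, 70, 72]
weight_lb = [115, 120, 140, 150, 165, 180]
r_original = correlation(height_in, weight_lb)

# Changing units (inches to cm, pounds to kg) rescales the data but not r.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.4536 for w in weight_lb]
r_metric = correlation(height_cm, weight_kg)

# A single point far from the pattern (tall but light) pulls r toward 0.
r_with_outlier = correlation(height_in + [75], weight_lb + [100])

print(f"r in inches/pounds: {r_original:.3f}")
print(f"r in cm/kg:         {r_metric:.3f}")
print(f"r with one outlier: {r_with_outlier:.3f}")
```

The first two values agree because standardizing removes the units. The third drops sharply even though only one of seven points was added, which is why a scatterplot should be checked before r is reported.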
Cautions
Correlation does not imply causation.
Consider lurking variables that may explain observed associations.
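A small simulation can illustrate how a lurking variable produces correlation without causation. In this sketch, the variable z, the noise levels, the sample size, and the seed are all assumptions made for illustration: x and y are each generated from z plus independent noise, and neither has any effect on the other.

```python
import math
import random

def correlation(xs, ys):
    """Sample correlation coefficient r for two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 500

# Lurking variable z drives both x and y; x and y never influence each other.
z = [rng.uniform(0, 10) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

# x and y are strongly correlated because both follow z.
r_xy = correlation(x, y)

# After subtracting z's contribution, only independent noise remains.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
r_resid = correlation(x_resid, y_resid)

print(f"r between x and y:         {r_xy:.2f}")
print(f"r after accounting for z:  {r_resid:.2f}")
```

The strong association between x and y comes entirely from z. Once z is accounted for, the association largely disappears, so the original correlation says nothing about x causing y.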