BIOS 7021 Scatterplot and Correlation Summary

Course Overview

  • Moves on from Density Curves and Normal Distributions to Scatterplots and Correlation.

  • Focuses on visualizing relationships between two quantitative variables.

Key Concepts

Two Quantitative Variables

  • Scatterplot: Displays form, direction, and strength of relationship.

  • Correlation Coefficient (r): Measures direction and strength of linear relationship.

  • Regression: Describes a linear relationship with a fitted line that predicts the response from the explanatory variable.
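
As a rough illustration of the regression idea above, the sketch below fits a least-squares line using only the standard library. The function name least_squares_line and the data values are hypothetical, chosen for illustration only; they are not from the course.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squared deviations in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: x is the explanatory variable, y is the response.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = least_squares_line(xs, ys)
print(f"fitted line: y = {intercept:.2f} + {slope:.2f} * x")
```

The slope estimates how much the response changes, on average, for a one-unit increase in the explanatory variable.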

Two Categorical Variables

  • Two-way Frequency Tables (Contingency Tables) for analyzing relationships between two categorical variables.
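
A two-way table can be tallied directly from paired categorical observations. The sketch below is a minimal example using the standard library; the group and outcome labels and the records themselves are made up for illustration.

```python
from collections import Counter

# Hypothetical paired observations: (group, outcome) for each individual.
records = [
    ("treatment", "improved"),
    ("treatment", "improved"),
    ("treatment", "not improved"),
    ("placebo", "improved"),
    ("placebo", "not improved"),
    ("placebo", "not improved"),
]

# Count how often each (row, column) combination occurs.
counts = Counter(records)
rows = sorted({group for group, _ in records})
cols = sorted({outcome for _, outcome in records})

# Build the table plus its marginal totals.
table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
row_totals = {r: sum(table[r].values()) for r in rows}
col_totals = {c: sum(table[r][c] for r in rows) for c in cols}

for r in rows:
    print(r, table[r], "row total:", row_totals[r])
print("column totals:", col_totals)
```

Comparing the row proportions (for example, the share improved within each group) is one way to look for an association between the two variables.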

Mixed Variables

  • Side-by-side Boxplots: Compare the distribution of a quantitative variable across the groups defined by a categorical variable.
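
Each box in a side-by-side boxplot is drawn from a five-number summary of one group. The sketch below computes those summaries per group with the standard library. The group names and measurements are hypothetical, and the "inclusive" quartile method is only one of several conventions, so textbook or software values may differ slightly.

```python
import statistics
from collections import defaultdict


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Hypothetical observations: (group label, quantitative measurement).
observations = [
    ("drug", 118), ("drug", 120), ("drug", 122), ("drug", 124), ("drug", 126),
    ("placebo", 128), ("placebo", 130), ("placebo", 134), ("placebo", 136), ("placebo", 140),
]

# Collect the quantitative values under each category.
by_group = defaultdict(list)
for group, value in observations:
    by_group[group].append(value)

summaries = {group: five_number_summary(vals) for group, vals in by_group.items()}
for group, summary in summaries.items():
    print(group, summary)
```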

Association

  • Association exists if knowing the value of one variable tells you something about the value of the other.

  • Positive Association: Both variables increase together.

  • Negative Association: One variable increases while the other decreases.

Variables Defined

  • Response Variable (Y): Measures the outcome of a study.

  • Explanatory Variable (X): May explain or influence changes in the response.

Scatterplots

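  • Plots the relationship between two quantitative variables measured on the same individuals, with the explanatory variable on the horizontal axis and the response variable on the vertical axis.

  • Choose axis scales carefully: inappropriate scaling can exaggerate or hide the pattern.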

Identifying Relationships in Scatterplots

  • Look for overall patterns, strength (less scatter indicates stronger relationships), and outliers.

Correlation Coefficient (r)

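  • Ranges from -1 to 1. The sign gives the direction of the linear relationship and the magnitude gives its strength: values near -1 or 1 indicate a strong linear relationship, values near 0 a weak one.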

  • Properties:

    • r > 0: Positive association

    • r < 0: Negative association

    • r = 0: No linear association

    • r is unaffected by units

    • Sensitive to outliers.
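
The properties above can be checked numerically. The sketch below computes the Pearson correlation from its definition (sum of cross-products divided by the square root of the product of the sums of squared deviations) and then tries a unit change and an added outlier. The function name pearson_r and all data values are hypothetical, for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical heights (cm) and weights (kg) for six individuals.
heights_cm = [150, 160, 165, 170, 180, 185]
weights_kg = [52, 58, 63, 66, 77, 80]

r_cm = pearson_r(heights_cm, weights_kg)

# Changing units (cm to inches) rescales x but should leave r unchanged.
heights_in = [h / 2.54 for h in heights_cm]
r_in = pearson_r(heights_in, weights_kg)

# Adding a single unusual point can change r substantially.
r_outlier = pearson_r(heights_cm + [175], weights_kg + [120])

print(f"r (cm):           {r_cm:.3f}")
print(f"r (inches):       {r_in:.3f}")
print(f"r with outlier:   {r_outlier:.3f}")
```

Because r depends on means and squared deviations, one extreme observation can pull it strongly, which is why it helps to inspect the scatterplot before interpreting r.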

Cautions

  • Correlation does not imply causation.

  • Consider lurking variables that may explain observed associations.
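
A small simulation can illustrate how a lurking variable produces an association between two variables that do not affect each other. Everything in the sketch below is an assumption made for illustration: the variable z plays the role of the lurking variable, x and y are each built from z plus independent noise, and the data are randomly generated rather than real.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000

# z is the lurking variable; x and y each depend on z but not on each other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + 0.3 * rng.gauss(0, 1) for zi in z]
y = [zi + 0.3 * rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)

# Remove z's contribution (known exactly here because the data are simulated).
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
r_resid = pearson_r(x_resid, y_resid)

print(f"r between x and y:              {r_xy:.3f}")
print(f"r after removing z's influence: {r_resid:.3f}")
```

In this simulation x and y are strongly correlated, yet the association largely disappears once z is accounted for. With real data the lurking variable's effect is not known exactly and would have to be estimated.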