SECTION 2.5: Two Quantitative Variables - Scatterplot and Correlation

Key Concepts

  • Two Quantitative Variables: Two numerical variables recorded as paired values, which can be analyzed for a relationship.

  • Visualization: A scatterplot, which shows each pair of values as a single point.

  • Summary Statistic: The correlation coefficient quantifies the strength and direction of the linear relationship between the variables.

Associations Between Variables

  • Positive Association:

    • As one variable increases, the other tends to increase.

  • Negative Association:

    • As one variable increases, the other tends to decrease.

  • No Association:

    • Knowing the value of one variable gives no information about the value of the other.

Examples of Expected Associations

  1. Key Examples:

    • Fertilizer and Crop Yield: Expect a positive association.

    • Cigarettes and Lung Capacity: Expect a negative association.

    • Tire Tread and Miles Driven: Expect a negative association.

Scatterplots

  • Definition: A graph illustrating the relationship between two quantitative variables, plotting paired data as points.

  • Axes:

    • Explanatory variable on the horizontal axis (X-axis).

    • Response variable on the vertical axis (Y-axis).

Creating a Scatterplot (Example 2)

  1. Data Table:

    • X: 5, 1, 7, 3, 8

    • Y: 0, 1, 4, 2, 7

  2. Procedure:

    • Use StatKey to produce a scatterplot by inputting paired data.
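As a rough cross-check on the StatKey output, the same paired data can be drawn as a character grid. This is only a sketch: `text_scatter` is a hypothetical helper (not part of StatKey), and it assumes every value is a whole number, which holds for this data table.

```python
# Paired data from the data table above.
x = [5, 1, 7, 3, 8]
y = [0, 1, 4, 2, 7]

def text_scatter(xs, ys):
    """Return a character-grid scatterplot for integer data.

    X runs left to right (horizontal axis); Y runs bottom to top (vertical axis).
    """
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of values")
    points = set(zip(xs, ys))  # each (x, y) pair is one point on the plot
    rows = []
    for row_y in range(max(ys), min(ys) - 1, -1):  # top row holds the largest y
        cells = [
            "*" if (col_x, row_y) in points else "."
            for col_x in range(min(xs), max(xs) + 1)
        ]
        rows.append(f"{row_y:>2} | " + " ".join(cells))
    return "\n".join(rows)

plot = text_scatter(x, y)
print(plot)
```

Each `*` marks one (X, Y) pair, so five stars should appear, with the point (8, 7) in the top-right corner.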

Correlation Measurement

  • Correlation: Measures the strength and direction of linear relationships between two quantitative variables.

  • Notation:

    • Sample correlation is denoted by r.

    • Population correlation is denoted by ρ (rho).

  • Range of Correlation:

    • The value of r always satisfies -1 ≤ r ≤ 1.

    • Interpreting the Sign:

      • Positive Correlation: r > 0

      • Negative Correlation: r < 0

      • No Linear Association: r ≈ 0

    • A value closer to ±1 denotes a stronger linear association.
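For reference, one standard way to write the sample correlation is shown below. The symbols are assumptions that do not appear in these notes: n paired observations (x_i, y_i), sample means \bar{x} and \bar{y}, and sample standard deviations s_x and s_y.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each term is a product of two standardized values, so pairs that sit on the same side of both means push r toward +1, and pairs on opposite sides push it toward -1.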

Examples of Calculating Correlation

  1. Example 3:

    • Given variables from Example 2, calculate correlation using StatKey.

  2. Example 4:

    • View the scatterplot for the Cars99 data and decide which of the given options is closest to the correlation.

  3. Example 5:

    • Analyze the scatterplot involving the Fuel Capacity data and estimate which value the correlation is closest to.
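The StatKey result for Example 3 can be checked with a short calculation. The sketch below assumes the usual sample correlation formula; `sample_correlation` is a hypothetical helper name, not a StatKey function.

```python
from math import sqrt

def sample_correlation(xs, ys):
    """Sample correlation r between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)

# Data from Example 2.
x = [5, 1, 7, 3, 8]
y = [0, 1, 4, 2, 7]

r = sample_correlation(x, y)
print(f"r = {r:.3f}")  # r = 0.749
```

The result is positive and fairly close to 1, which matches the upward trend seen in the scatterplot of these five points.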

Understanding Correlation in Context (Example 6)

  1. Context: Investigate the relationship between rebounds and free throw percentage in the NBA.

  2. Direction of Association: Assess whether this relationship is positive or negative.

  3. Correlation Value Estimate: Estimate a plausible correlation value from the scatterplot.

  4. Impossible Values for Correlation: Recognize that any value greater than 1 or less than -1 cannot be a correlation.

Impact of Outliers on Correlation

  • Effect of Outliers: Outliers can significantly distort the calculated correlation.

  • Visualization: Always plot the data, and compare the scatterplot and the correlation with and without the outlier.
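To see how much a single point can matter, the sketch below computes the correlation with and without one outlier. The data are made up purely for illustration (they are not from the course examples), and `sample_correlation` is a hypothetical helper name.

```python
from math import sqrt

def sample_correlation(xs, ys):
    """Sample correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: five points that fall exactly on a rising line.
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 5]

# The same data plus one hypothetical outlier at (10, 0).
x_out = x + [10]
y_out = y + [0]

r_without = sample_correlation(x, y)
r_with = sample_correlation(x_out, y_out)

print(f"without outlier: r = {r_without:.2f}")
print(f"with outlier:    r = {r_with:.2f}")
```

With these made-up values, one added point moves the correlation from a perfect positive value to a weak negative one, which is why the scatterplot should be inspected before trusting r.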