SECTION 2.5: Two Quantitative Variables - Scatterplot and Correlation
Key Concepts
Two Quantitative Variables: Two numerical variables measured on the same cases, analyzed as paired values to look for a relationship.
Visualization: A scatterplot, which plots each case as a point using its values on the two variables.
Summary Statistic: The correlation coefficient quantifies the strength and direction of the linear relationship between the variables.
Associations Between Variables
Positive Association:
As one variable increases, the other tends to increase.
Negative Association:
As one variable increases, the other tends to decrease.
No Association:
Knowing the value of one variable gives no information about the value of the other.
Examples of Expected Associations
Key Examples:
Fertilizer and Crop Yield: Expect a positive association.
Cigarettes Smoked and Lung Capacity: Expect a negative association.
Tire Tread and Miles Driven: Expect a negative association (tread wears down as miles accumulate).
Scatterplots
Definition: A graph illustrating the relationship between two quantitative variables, plotting paired data as points.
Axes:
Explanatory variable on the horizontal axis (X-axis).
Response variable on the vertical axis (Y-axis).
Example: Creating a Scatterplot (Example 2)
Data Table:
X: 5, 1, 7, 3, 8
Y: 0, 1, 4, 2, 7
Procedure:
Enter the paired (X, Y) data into StatKey to produce a scatterplot.
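For readers without StatKey at hand, the same picture can be sketched in plain Python. This is only an illustrative stand-in, not StatKey's method: the function name ascii_scatter is hypothetical, and it assumes the data are small non-negative integers, as in Example 2. Each pair (x, y) becomes one "*" on a text grid, with X on the horizontal axis and Y on the vertical axis.

```python
def ascii_scatter(xs, ys):
    """Return a text grid with '*' at each (x, y) pair; y increases upward.

    Assumes small non-negative integer data (an assumption for this sketch).
    """
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of values")
    points = set(zip(xs, ys))            # pair each x with its y
    max_x, max_y = max(xs), max(ys)
    rows = []
    for y in range(max_y, -1, -1):       # top row is the largest y
        cells = ["*" if (x, y) in points else "." for x in range(max_x + 1)]
        rows.append(f"{y:>2} | " + " ".join(cells))
    rows.append("   +-" + "--" * (max_x + 1))
    rows.append("     " + " ".join(str(x) for x in range(max_x + 1)))
    return "\n".join(rows)


# Data from Example 2
x = [5, 1, 7, 3, 8]
y = [0, 1, 4, 2, 7]

plot = ascii_scatter(x, y)
print(plot)
```

The point in the top-right corner is the pair (8, 7); the point on the bottom row is (5, 0).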
Correlation Measurement
Correlation: Measures the strength and direction of the linear relationship between two quantitative variables.
Notation:
Sample correlation is denoted by r.
Population correlation is denoted by ρ (rho).
Range of Correlation:
The value of r always falls between -1 and 1, inclusive: -1 ≤ r ≤ 1.
Interpretation:
Positive Correlation: r > 0
Negative Correlation: r < 0
No Linear Association: r ≈ 0
The closer r is to ±1, the stronger the linear association.
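These notes rely on StatKey and do not give a formula for r. For reference, the standard textbook definition of the sample correlation is shown below; the symbols n (number of pairs), x̄ and ȳ (sample means), and s_x and s_y (sample standard deviations) are introduced here and do not appear in the original notes.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right),
\qquad -1 \le r \le 1
```

Each term is a product of two standardized values, so pairs where x and y are both above (or both below) their means push r upward, and pairs on opposite sides of their means push r downward.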
Examples of Calculating Correlation
Example 3:
Using the X and Y data from Example 2, calculate the correlation with StatKey.
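As a check on the StatKey result, the correlation for the Example 2 data can also be computed directly. The sketch below is not part of the original example; the function name correlation is hypothetical, and it uses the standard definition of the sample correlation written as a ratio of sums of deviations from the means.

```python
from math import sqrt


def correlation(xs, ys):
    """Sample correlation r of paired data, from deviations about the means."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


# Data from Example 2
x = [5, 1, 7, 3, 8]
y = [0, 1, 4, 2, 7]

r = correlation(x, y)
print(round(r, 3))  # about 0.749: a fairly strong positive linear association
```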
Example 4:
View the scatterplot for the Cars99 data and determine which of the given options is closest to the correlation.
Example 5:
Examine the scatterplot involving the Fuel Capacity data and determine which value is closest to the correlation.
Understanding Correlation in Context (Example 6)
Context: Investigate the relationship between rebounds and free throw percentage in the NBA.
Direction of Association: Assess whether this relationship is positive or negative.
Correlation Value Estimate: Estimate the correlation from the scatterplot.
Impossible Values for Correlation: Any value greater than 1 or less than -1 cannot be a correlation.
Impact of Outliers on Correlation
Effect of Outliers: A single outlier can substantially change the correlation, strengthening it, weakening it, or even reversing its sign.
Visualization: Always plot the data, and compare the scatterplot and the correlation with and without the outlier.
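The effect of an outlier can be seen in a small made-up example. The data below are hypothetical and chosen only for illustration (they are not from the notes or from StatKey): five points lie exactly on a line, and one added point sits far from the pattern. The helper name correlation is also an assumption of this sketch.

```python
from math import sqrt


def correlation(xs, ys):
    """Sample correlation r of paired data, from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: five points on a perfect line, plus one made-up outlier
x_line = [1, 2, 3, 4, 5]
y_line = [2, 4, 6, 8, 10]
outlier = (20, 0)

r_without = correlation(x_line, y_line)
r_with = correlation(x_line + [outlier[0]], y_line + [outlier[1]])

print(f"r without outlier: {r_without:.3f}")
print(f"r with outlier:    {r_with:.3f}")
```

Without the outlier the points are perfectly linear, so r is 1. Adding the single point (20, 0) is enough to make r negative, which is why the scatterplot should be checked before trusting the number.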