Exploring Two-Variable Quantitative Data
Introduction
Investigating relationships between variables is fundamental in statistics. The relationship between two quantitative variables can be used to predict the value of one variable from the value of the other. For instance, we might explore how the price of a used car relates to the number of miles it has been driven, or how the length of a fish relates to its weight. These relationships can be examined through classroom activities, such as measuring students' hand spans and relating them to the amount of candy each student can grab.
Candy Grab Activity
Objective
To investigate whether students with larger hand spans can grab more candy than those with smaller hand spans.
Procedure
- Measure the span of your dominant hand to the nearest half-centimeter (cm). This is defined as the distance from the tip of the thumb to the tip of the pinkie finger on a fully stretched hand.
- One student at a time will use their dominant hand to grab as many candies as possible from a container, with their fingers pointing down (no scooping allowed) and holding the candies for 2 seconds before counting. After counting, the candies must be put back.
- Record each student’s hand span and number of candies in a table.
- Graph the results by creating a scatterplot with the x-axis labeled as 'Hand Span (cm)' and the y-axis as 'Number of Candies'.
- Analyze the graph and summarize observations about the relationship between hand span and the number of candies.
Section 3.1: Scatterplots and Correlation
Learning Targets
By the end of this section, students should be able to:
- Distinguish between explanatory and response variables for quantitative data.
- Create scatterplots to display relationships between two quantitative variables.
- Describe the direction, form, and strength of a relationship displayed in a scatterplot, and identify any unusual features.
- Interpret correlation and understand its basic properties, including how it is influenced by unusual points.
- Differentiate correlation from causation.
Bivariate vs Univariate Data
- Univariate Data: A data set that describes a single variable.
- Bivariate Data: A data set that records two variables for each individual, so the relationship between them can be studied.
Explanatory and Response Variables
In the Candy Grab activity:
- Response Variable: Number of candies (what is being measured).
- Explanatory Variable: Hand span (used to predict the outcome).
- Definition: A response variable measures an outcome of a study, while an explanatory variable may help predict or explain changes in a response variable.
Example Problems
Diamonds and Pricing:
- Explanatory Variable: Weight of diamonds (in carats).
- Response Variable: Price of diamonds (in dollars).
- Reasoning: The weight helps explain the price.
SAT Scores Examined:
- Either variable could be the explanatory variable since SAT math score and SAT reading score might be used to predict each other.
Scatterplots
A scatterplot is a graphical representation of two quantitative variables and is essential for displaying their relationship visually.
- Scatterplot Definition: A scatterplot shows the relationship between two quantitative variables measured on the same individuals, with one variable's values on the horizontal axis and the other on the vertical axis.
- Example: In a scatterplot of hand span vs. number of candies, each point represents one student, and the overall pattern of points shows how the two variables are related.
Making a Scatterplot
- Label Axes: Indicate the explanatory variable on the horizontal axis and the response variable on the vertical axis.
- Scale Axes: Equally spaced tick marks should begin just below the minimum value and extend beyond the maximum value.
- Plot Data Values: For each individual, plot a point in the scatterplot according to their values for both variables.
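The axis-scaling step above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `axis_ticks`, the tick spacings, and the hand span and candy values are all hypothetical and chosen only for the example.

```python
import math

def axis_ticks(values, step):
    """Return equally spaced tick marks that start at or below the
    minimum value and end at or above the maximum value."""
    start = math.floor(min(values) / step) * step
    stop = math.ceil(max(values) / step) * step
    count = round((stop - start) / step)
    return [start + i * step for i in range(count + 1)]

# Hypothetical class data (not taken from the activity).
hand_span_cm = [17.5, 19.0, 20.5, 21.0, 22.5, 24.0]   # explanatory variable (x)
candies = [15, 18, 21, 20, 25, 28]                    # response variable (y)

x_ticks = axis_ticks(hand_span_cm, 2)   # horizontal axis: Hand Span (cm)
y_ticks = axis_ticks(candies, 5)        # vertical axis: Number of Candies

# Each individual becomes one (x, y) point on the scatterplot.
points = list(zip(hand_span_cm, candies))

print("x ticks:", x_ticks)
print("y ticks:", y_ticks)
print("points:", points)
```

The explanatory variable is paired first in each point so that it lands on the horizontal axis, matching the labeling rule above.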
Describing a Scatterplot
- Direction: Positive, negative, or no association.
- Form: Linear (straight line) or nonlinear (curved).
- Strength: Strong, moderate, or weak (based on how closely the scatterplot data points cluster around a line).
- Unusual Features: Points that deviate significantly from the main pattern or distinct clusters of points.
Analyzing Relationships
- A positive association exists when values of one variable tend to increase as values of the other variable increase.
- A negative association exists when values of one variable tend to decrease as values of the other variable increase.
- If knowing one variable does not help predict the other, there is no association.
Important Cues (AP® Exam Tips)
When describing a scatterplot, address the direction, form, strength, and any unusual features, and state each in the context of the variables in the problem.
Example Analysis:
Baseball Team Payroll vs. Wins
- A scatterplot comparing MLB payrolls and wins displays a general positive association; as payroll increases, wins tend to increase.
- However, the association is not perfect (correlation r = 0.613): a relationship exists, but some teams win more or fewer games than their payroll alone would suggest.
Analysis of Correlation
- Correlation r Definition: The correlation r measures the direction and strength of the linear relationship between two quantitative variables. It always takes a value between -1 and 1, and its sign indicates the direction (positive or negative).
- Key properties of correlation:
- Correlation ranges from -1 to 1.
- The extreme values -1 and 1 occur only for a perfect linear relationship, when all points lie exactly on a straight line.
- Values closer to 0 suggest a weak linear relationship, while those nearer to 1 or -1 indicate strong linear relationships.
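To make these properties concrete, here is a minimal Python sketch that computes r as the average product of standardized values, using the standard formula r = (1 / (n - 1)) * sum(z_x * z_y). The notes above do not state a formula, so this formula, the function name `correlation`, and all data values are supplied here only for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r computed from z-scores: sum(z_x * z_y) / (n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)   # sample standard deviations
    n = len(xs)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data chosen to illustrate the properties.
xs = [1, 2, 3, 4, 5]
perfect_positive = [2 * x + 1 for x in xs]    # points exactly on a rising line
perfect_negative = [10 - 3 * x for x in xs]   # points exactly on a falling line
scattered = [2, 1, 4, 3, 5]                   # rising overall, but not on a line

r_pos = correlation(xs, perfect_positive)     # 1, up to rounding error
r_neg = correlation(xs, perfect_negative)     # -1, up to rounding error
r_mid = correlation(xs, scattered)            # about 0.8

print(round(r_pos, 3), round(r_neg, 3), round(r_mid, 3))
```

Because r is built from standardized values, it has no units and does not change if either variable is rescaled, for example from centimeters to inches.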
Cautions About Correlation
- Correlation does not imply causation; a strong correlation may exist without one variable affecting the other directly.
- Correlations are sensitive to outliers; unusual data points can significantly influence the correlation value, leading to misleading interpretations.
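The effect of a single unusual point can be checked numerically. The sketch below uses made-up data and a hypothetical helper `correlation` (an equivalent sums-of-squares form of r); it is an illustration under those assumptions, not data from the section.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r as S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data with a strong positive linear pattern.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 3, 5, 4, 6, 7]
r_without = correlation(xs, ys)

# Add one point far from the pattern: large x, very small y.
xs_outlier = xs + [12]
ys_outlier = ys + [1]
r_with = correlation(xs_outlier, ys_outlier)

print("r without the unusual point:", round(r_without, 3))
print("r with the unusual point:   ", round(r_with, 3))
```

In this example one added point is enough to turn a strong positive correlation into a weak negative one, which is why a scatterplot should always be examined alongside r.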
Conclusion:
Scatterplots and correlation together describe how two quantitative variables are related, which helps in predicting outcomes and identifying trends in real data. Describing the direction, form, strength, and unusual features of a relationship, while remembering that correlation measures only linear association, is sensitive to unusual points, and does not imply causation, supports valid conclusions.
Exercises
A list of exercises is provided for practice, focusing on identifying explanatory and response variables, creating scatterplots, and analyzing relationships between various datasets.
- Coral Reefs and Cell Phones - Identify variables.
- Teenagers and Corn Yield - Identify variables.
- Assessing Backpacks - Make a scatterplot for body and backpack weight.
- Golf Putting - Make scatterplots of putting distances related to performance.
- Analyze Height and Weight of Olympic Athletes - Create scatterplots and summarize relationships.
- Continue with further data-focused inquiries.
These activities provide hands-on opportunities to establish practical understanding of relationships in statistics, affirming theoretical knowledge through real data.