Describing relationships - goal
To understand one variable (like SAT scores), we often need to see how it relates to other variables, not just look at it alone.
Lurking variable - idea
A variable not included in the study that may influence the relationship between the explanatory and response variables.
Explanatory variable - definition
A variable that may help explain or influence changes in the response variable; it is usually plotted on the horizontal (x) axis.
Response variable - definition
The variable that measures an outcome or result of a study; it is usually plotted on the vertical (y) axis.
Scatterplot - definition
A graph that shows the relationship between two quantitative variables measured on the same individuals; each point represents one individual's pair of values.
Axes of a scatterplot
Plot the explanatory variable on the horizontal (x) axis and the response variable on the vertical (y) axis; if there is no clear explanatory variable, either can go on x.
Overall pattern in a scatterplot
Described by the direction, form, and strength of the relationship between the two variables.
Deviation / outlier in a scatterplot
An outlier is an individual value that falls outside the overall pattern of the relationship in the scatterplot.
Positive association - definition
Two variables are positively associated when above‑average values of one tend to accompany above‑average values of the other (and below‑average with below‑average); the scatterplot slopes upward left to right.
Negative association - definition
Two variables are negatively associated when above‑average values of one tend to accompany below‑average values of the other (and vice versa); the scatterplot slopes downward left to right.
Direction of a relationship
Direction tells whether the association between variables is positive (upward trend) or negative (downward trend) in the scatterplot.
Form of a relationship
Form describes whether the pattern of points is roughly linear (straight line), curved, or shows clusters or other shapes.
Strength of a relationship
Strength indicates how closely the points follow a clear form; strong when points tightly follow a pattern, weak when they are widely scattered.
Example of positive association (nutrition)
As the percent of nutritionists saying a food is healthy increases, the percent of American voters saying it is healthy also tends to increase, giving a positive association.
Example of curved relationship (GDP and life expectancy)
In the health and wealth example, life expectancy rises quickly as GDP increases and then levels off, showing a curved (nonlinear) positive relationship.
Outliers in health and wealth example
Equatorial Guinea and Liechtenstein are outliers in the GDP vs. life expectancy scatterplot because their combinations of GDP and life expectancy fall outside the overall pattern of the other nations.
Multiple variables in scatterplots
Additional variables (such as number of cylinders for cars) can be incorporated by using different symbols or colors to see how they affect the relationship.
Correlation - definition
Correlation r describes the direction and strength of a straight‑line (linear) relationship between two quantitative variables.
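A minimal sketch of how r can be computed, assuming the usual textbook formula: standardize each x and y value, multiply the paired standardized values, and divide the sum by n − 1. The function name `correlation` and the sample data are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson correlation r: sum of products of standardized values over n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data: points lying exactly on an upward-sloping line.
r_line = correlation([1, 2, 3], [2, 4, 6])
```

Because the points in the hypothetical data lie exactly on a line with positive slope, `r_line` comes out as 1 up to floating-point rounding.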
Correlation - symbol and range
Correlation is usually written as r and always lies between −1 and 1, inclusive (−1 ≤ r ≤ 1).
Interpretation of r close to 0
Values of r near 0 indicate a very weak straight‑line relationship between the variables.
Interpretation of r near ±1
Values of r close to 1 or −1 indicate that the points lie close to a straight line (strong linear relationship); r = 1 or r = −1 only when the points lie exactly on a straight line.
Sign of correlation
A positive r indicates a positive association; a negative r indicates a negative association between the variables.
Unitless property of correlation
Correlation has no units; changing the units of x, y, or both does not change the value of r.
Symmetry of correlation
Correlation does not change if we switch explanatory and response variables; r is the same whether we label a variable x or y.
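The unitless and symmetry properties can be checked numerically. The sketch below uses made-up heights and weights (not data from the source) and a hypothetical helper `correlation`; it converts inches to centimeters and pounds to kilograms, then swaps the roles of x and y.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson correlation r from standardized values (n - 1 divisor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Hypothetical measurements, for illustration only.
height_in = [60, 62, 65, 68, 70]
weight_lb = [115, 120, 140, 155, 160]

# Change units: inches -> centimeters, pounds -> kilograms.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]

r_original = correlation(height_in, weight_lb)
r_converted = correlation(height_cm, weight_kg)  # same r, different units
r_swapped = correlation(weight_lb, height_in)    # same r, x and y exchanged
```

All three values agree up to floating-point rounding, because standardizing removes the units and the product of standardized values does not depend on which variable is called x.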
Correlation and linear form only
Correlation measures the strength of straight‑line association only; it does not describe curved relationships even if they are strong.
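As an illustration of this limitation, the sketch below uses hypothetical points that lie exactly on the curve y = x², a perfect but nonlinear relationship. The helper `correlation` is an assumed name, not part of the source.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson correlation r from standardized values (n - 1 divisor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Hypothetical data: y is completely determined by x, but the form is curved.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # [4, 1, 0, 1, 4]

r_curved = correlation(xs, ys)
```

Here `r_curved` is 0 up to floating-point rounding, even though y depends perfectly on x. This is why a scatterplot should be examined before relying on r.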
Effect of outliers on correlation
Correlation is strongly affected by outliers; a few unusual points can greatly change the value of r.
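A small numerical sketch of this sensitivity, using made-up data: five points on a perfect line, then the same points plus a single unusual point. The helper name `correlation` and all values are hypothetical.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson correlation r from standardized values (n - 1 divisor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Hypothetical data: five points exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r_without = correlation(xs, ys)

# Add one point far from the pattern: (6, 0).
r_with = correlation(xs + [6], ys + [0])
```

With these numbers, r falls from 1 to about 0.14 (exactly 1/7) after adding the single outlying point, which shows how much one observation can move the correlation in a small data set.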
Requirements for using correlation
Correlation is appropriate only for two quantitative variables and only for describing linear relationships.
Strategy for two‑variable data
First make a scatterplot to see direction, form, strength, and outliers; then use correlation and numerical summaries to supplement the graph.
College Board's warning about SAT rankings
The College Board strongly discourages ranking states on SAT scores alone, stating that such comparisons are invalid when based only on SAT averages.
Statistics in summary - scatterplots
When examining a scatterplot, always comment on direction (positive/negative), form (linear/curved), strength (strong/weak), and possible outliers.
Statistics in summary - correlation
Correlation r measures the direction and strength of a linear relationship but is not a complete description; means and standard deviations of x and y should also be given.