What is a response variable?
Measures the outcome of a study (the dependent variable)
What is an explanatory variable?
Helps explain or predict changes in a response variable (the independent variable)
What is a scatterplot?
Plot that shows the relationship between two quantitative variables measured on the same individuals.
When describing a scatterplot, what format should you use?
Direction: Positive or negative association
Form: Approximately linear or curved
Strength: Weak, moderate, or strong
Unusual Features: Outliers or other departures from the overall pattern
Describe the association IN CONTEXT
Example: Students who took longer to run 40 yards had shorter jump lengths (association in context). The relationship between the two quantitative variables is negative, approximately linear, and strong. There is one unusual feature at the 25 yard mark, which deviates slightly from the overall pattern.
What does Correlation (r) measure?
The direction and strength of a linear relationship between two quantitative variables. Correlation is always between -1 and 1.
If r>0, the association is positive.
If r<0, the association is negative.
If r=0, there is no linear association.
Correlation is how you quantify the direction and strength of a linear association.
How would you interpret the r value in context?
Mention the direction and strength!
EX: The correlation r = _____ indicates that the linear relationship between the number of points scored and the number of turnovers is strong and positive.
Facts about correlation
Correlation does not imply causation!
Correlation measures the strength and direction of a linear relationship only.
Correlation is not a resistant measure (one unusual point can change r drastically).
Correlation makes NO distinction between the explanatory and response variables.
Correlation is not affected by changing the units of either variable. Nonlinear transformations (such as taking logarithms) can change it.
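The facts above can be checked numerically. The following is a minimal sketch, assuming small made-up data; the helper name correlation is hypothetical and uses the standard formula for Pearson's r.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

r = correlation(x, y)
r_swapped = correlation(y, x)                       # swap explanatory and response
r_rescaled = correlation([100 * v for v in x], y)   # change the units of x
r_outlier = correlation(x + [7], y + [0.0])         # add one unusual point

print(f"r = {r:.3f}, swapped = {r_swapped:.3f}, rescaled = {r_rescaled:.3f}")
print(f"r after adding one outlier = {r_outlier:.3f}")
```

Swapping the variables or changing units leaves r the same, while the single added point pulls r far below its original value.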
What is a regression line?
A line that describes how a response variable (y) changes as an explanatory variable (x) changes. Often used to predict the value of y for a given value of x.
What is the equation of a regression line?
ŷ = a + bx (ŷ is the predicted value of y, a is the y-intercept, b is the slope)
What is extrapolation?
The use of a regression line for prediction far outside the interval of values of the explanatory variable used to obtain the line. These predictions are often not accurate.
What is a residual? How do you find it?
The prediction error from a regression line. It is the difference between the observed value of y and the predicted value of y.
EQUATION: Residual = Observed y - Predicted y = y - ŷ
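As a small worked example of the residual formula, here is a sketch that assumes a hypothetical regression line; the coefficients a and b and the observed point are made up for illustration.

```python
# Hypothetical regression line y-hat = a + b*x; these coefficients are made up.
a, b = 2.2, 0.6

def predict(x):
    """Predicted response for a given explanatory value."""
    return a + b * x

def residual(x, y_observed):
    """Residual = observed y - predicted y."""
    return y_observed - predict(x)

res = residual(4, 4.0)  # observed point (4, 4.0)
position = "above" if res > 0 else "below"
print(f"residual = {res:.2f}, so the observed value is {position} the prediction")
```

A negative residual means the line overpredicted; a positive residual means it underpredicted.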
What is the Least Squares Regression Line?
A line that makes the sum of the squared residuals as small as possible.
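To see the "smallest sum of squared residuals" idea in action, here is a minimal sketch using the standard closed-form slope and intercept formulas (not given in these notes) on made-up data. The function names lsrl and sum_squared_residuals are hypothetical.

```python
def lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Sum of (observed - predicted)^2 for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = lsrl(x, y)
best = sum_squared_residuals(a, b, x, y)
nudged = sum_squared_residuals(a + 0.1, b - 0.05, x, y)  # any other line

print(f"LSRL: y-hat = {a:.2f} + {b:.2f}x")
print(f"sum of squared residuals: LSRL = {best:.3f}, nudged line = {nudged:.3f}")
```

Any other choice of intercept and slope gives a larger sum of squared residuals than the LSRL.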
What is a high-leverage point?
A point that has a substantially larger or smaller x-value than the other observations have.
EX: A point whose x-value is far above (or far below) all the other x-values, whatever its y-value.
What is an influential point?
Any point that, if removed, substantially changes the relationship (for example, the slope or y-intercept of the LSRL, or the correlation).
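One way to check whether a point is influential is to fit the line with and without it. The sketch below assumes made-up data in which the last point has a much larger x-value than the rest; the helper lsrl is hypothetical.

```python
def lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Made-up data: the last point (20, 2) is high-leverage (x far from the others).
x = [1, 2, 3, 4, 5, 20]
y = [2, 4, 5, 4, 5, 2]

a_with, b_with = lsrl(x, y)
a_without, b_without = lsrl(x[:-1], y[:-1])  # refit after removing the point

print(f"slope with the point:    {b_with:.3f}")
print(f"slope without the point: {b_without:.3f}")
```

Here removing the one point flips the slope from negative to positive, so the point is influential as well as high-leverage.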
What is the Standard Deviation of the Residuals, s?
The approximate size of a typical prediction error (residual)
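A minimal sketch of computing s, assuming made-up data and the usual divisor n - 2 for a least-squares line (the divisor is not stated in these notes). The helper lsrl is hypothetical.

```python
from math import sqrt

def lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = lsrl(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# s = sqrt(sum of squared residuals / (n - 2))
s = sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
print(f"s = {s:.3f}")
```

The value of s is in the units of the response variable, which is why it reads as a typical prediction error.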
How would you interpret a residual?
State the residual (show how you got it), then say whether the observed value is above or below the predicted value of the response variable.
EX: The residual for the point (8,2) is -14.47, so given 8 seconds of tapping time, the actual amount of soda left is 14.47 mL BELOW the predicted amount.
Squared deviations determine the _____
LSRL.
Why do the deviations (residuals) have to be squared?
If we don’t square the residuals, the positive and negative ones cancel out and add up to 0.
What happens when you find the mean of the explanatory variable and the mean of the response variable?
The LSRL always passes through the point (x̄, ȳ), where the two means meet.
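Two properties of the LSRL can be verified directly: the line passes through the point made of the two means, and the unsquared residuals add up to 0. This is a sketch on made-up data; the helper lsrl is hypothetical.

```python
def lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = lsrl(x, y)
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

prediction_at_mean_x = a + b * mean_x  # should equal mean_y
residual_sum = sum(yi - (a + b * xi) for xi, yi in zip(x, y))

print(f"prediction at mean x = {prediction_at_mean_x:.3f}, mean y = {mean_y:.3f}")
print(f"sum of residuals = {residual_sum:.6f}")
```

Because the plain residuals always cancel like this, the LSRL is defined using squared residuals instead.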
What is a residual plot? How do you know which model to choose?
A scatterplot of the residuals against the explanatory variable. Residual plots help determine whether a linear model is appropriate.
A linear model is appropriate if the residual plot shows no leftover pattern (random scatter around 0).
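To illustrate what a "pattern" in the residuals looks like, the sketch below fits a line to made-up curved data (y = x squared) and lists the residuals in order of x. The helper lsrl is hypothetical.

```python
def lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Made-up curved data: y = x^2, so a straight line is the wrong model.
x = [0, 1, 2, 3, 4, 5, 6]
y = [xi ** 2 for xi in x]

a, b = lsrl(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Residuals are positive at both ends and negative in the middle (a U shape).
for xi, e in zip(x, residuals):
    print(f"x = {xi}: residual = {e:+.2f}")
```

A clear U shape like this in the residuals suggests a curved model would fit better than a line.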
What is the coefficient of determination (r²)? What is the format?
The fraction of the variation in the values of y that is accounted for by the LSRL of y on x. FORMAT:
______% of the variation in [response variable] is accounted for by the linear (exponential/power) model relating [response variable] to [explanatory variable].
Which tools describe only linear relationships?
Correlation and regression lines
What does the power model equation look like?
ln(ŷ) = a + b*ln(x)
What does the exponential model look like?
ln(ŷ) = a + b*x
To find the predicted y value for the power model, use the equation:
ŷ = e^(a + b*ln(x))
To find the predicted y value for the exponential model, use the equation:
ŷ = e^(a + b*x)
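The back-transformation for both models can be sketched as follows, assuming made-up data that follow an exponential curve and a power curve exactly. The helper lsrl and all variable names are hypothetical. Note that the whole expression a + b*x (or a + b*ln(x)) goes in the exponent.

```python
from math import exp, log

def lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Exponential model: made-up data following y = 3 * 2^x, fit ln(y) against x.
x_exp = [1, 2, 3, 4]
y_exp = [3 * 2 ** x for x in x_exp]
a_e, b_e = lsrl(x_exp, [log(y) for y in y_exp])
pred_exp = exp(a_e + b_e * 5)  # predicted y at x = 5

# Power model: made-up data following y = 2 * x^3, fit ln(y) against ln(x).
x_pow = [1, 2, 3, 5]
y_pow = [2 * x ** 3 for x in x_pow]
a_p, b_p = lsrl([log(x) for x in x_pow], [log(y) for y in y_pow])
pred_pow = exp(a_p + b_p * log(4))  # predicted y at x = 4

print(f"exponential prediction at x = 5: {pred_exp:.2f}")
print(f"power prediction at x = 4:       {pred_pow:.2f}")
```

Since the made-up data lie exactly on the curves, the predictions should land very close to 3 * 2^5 = 96 and 2 * 4^3 = 128.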
To find the r-squared value, you do what?
Square the correlation!
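For a linear model, squaring the correlation gives the same number as the fraction of variation in y accounted for by the LSRL. The sketch below checks this on made-up data; the variable names are hypothetical.

```python
from math import sqrt

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)  # total variation in y

# Method 1: square the correlation.
r = sxy / sqrt(sxx * syy)
r_squared = r ** 2

# Method 2: 1 - (sum of squared residuals from the LSRL) / (total variation in y).
b = sxy / sxx
a = mean_y - b * mean_x
ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
fraction_explained = 1 - ssr / syy

print(f"r^2 = {r_squared:.3f}, fraction explained = {fraction_explained:.3f}")
```

With these numbers, about 60% of the variation in y is accounted for by the linear model relating y to x.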
How do you interpret the r-squared value?
______% of the variation in [response variable] is accounted for by the linear (exponential/power) model relating [response variable] to [explanatory variable].