describing scatter plots
direction (positive/negative), form (linear/curved), strength, unusual features
scatterplot on calculator
Stat -> Edit to enter the data in L1 and L2, then 2nd -> Statplot to turn on the scatterplot
r
correlation coefficient
r measures
the strength and direction of the linear association between two quantitative variables
conditions of r
both variables quantitative, relationship straight enough, no outliers
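As a rough illustration of what r is, here is a minimal Python sketch that computes it as the sum of z-score products divided by n - 1. The function name `correlation` and the data are made up for this example; the calculator does this for you.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = stdev(xs), stdev(ys)  # sample SDs (n - 1 denominator)
    z_products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Hypothetical data: hours studied vs. test score
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = correlation(hours, scores)
print(round(r, 3))  # close to 1: strong, positive, roughly linear
```

The value only means something if the conditions above hold, so check the scatterplot first.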
r²
coefficient of determination
r² measures
how much of the variation in the y-variable is explained by variation in the x-variable
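One way to see "variation explained" is to compare the leftover variation around the line with the total variation in y. The sketch below assumes a least-squares line and made-up data; `r_squared` is a hypothetical helper name.

```python
from statistics import mean

def r_squared(xs, ys):
    """1 - (variation left over around the line) / (total variation in y)."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals: variation the line does NOT explain
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of y about its mean
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical data: hours studied vs. test score
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r2 = r_squared(hours, scores)
print(round(r2, 3))  # about 96% of the variation in scores is explained
```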
lurking variable
an unmeasured variable that could cause changes in both x and y
residual
the vertical distance between the point and the line of best fit. Actual – Predicted
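A small sketch of Actual – Predicted, assuming a fitted line y = 2.3 + 1.1x and a made-up data point; the function name `residual` is hypothetical.

```python
def residual(x, actual_y, intercept, slope):
    """Actual minus predicted: positive means the point sits above the line."""
    predicted_y = intercept + slope * x
    return actual_y - predicted_y

# Assumed line y = 2.3 + 1.1x and a hypothetical point (4, 7.0)
res = residual(4, 7.0, intercept=2.3, slope=1.1)
print(round(res, 2))  # predicted is 6.7, so the residual is about 0.3
```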
line of best fit on calculator
Type x’s into L1, y’s into L2. Then, Stat -> CALC -> 8:LinReg(a+bx). If the output says y = a + bx; a = 2.3; b = 1.1, then the line of best fit is y = 2.3 + 1.1x
line of best fit slope =
r * (SD of y/SD of x)
line of best fit y-intercept =
mean of y - (mean of x * slope)
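The two formulas above can be checked with a short sketch that builds the line from summary statistics only. The data and the helper name `line_from_summary` are assumptions for illustration.

```python
from statistics import mean, stdev

def line_from_summary(r, mean_x, sd_x, mean_y, sd_y):
    """Return (intercept, slope) of the line of best fit from summary stats."""
    slope = r * (sd_y / sd_x)
    intercept = mean_y - mean_x * slope
    return intercept, slope

# Hypothetical data: hours studied vs. test score
xs = [1, 2, 3, 4, 5]
ys = [52, 60, 61, 70, 77]
mean_x, mean_y = mean(xs), mean(ys)
sd_x, sd_y = stdev(xs), stdev(ys)

# r as the sum of z-score products divided by n - 1
r = sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)) / (len(xs) - 1)

a, b = line_from_summary(r, mean_x, sd_x, mean_y, sd_y)
print(f"y = {a:.1f} + {b:.1f}x")
```

For this data the result should match what LinReg(a+bx) reports for a and b.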
regression to the mean
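(number of SDs x is from the mean of x) * r = number of SDs the predicted y is from the mean of y. Since r is between -1 and 1, the prediction is closer to the mean than x was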
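A worked sketch of regression to the mean, with made-up means, SDs, and r; `predicted_y` is a hypothetical helper.

```python
def predicted_y(x, mean_x, sd_x, mean_y, sd_y, r):
    """Predict y by shrinking x's z-score by a factor of r."""
    z_x = (x - mean_x) / sd_x   # how many SDs x is from its mean
    z_y_hat = r * z_x           # predicted y is only r times as many SDs out
    return mean_y + z_y_hat * sd_y

# Hypothetical: x is 2 SDs above its mean and r = 0.5,
# so the predicted y is only 1 SD above the mean of y
y_hat = predicted_y(x=90, mean_x=70, sd_x=10, mean_y=75, sd_y=8, r=0.5)
print(y_hat)
```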
residual plot on calculator
After data is entered in L1 and L2, do Stat -> CALC -> 8:LinReg(a+bx).
Then, 2nd -> Statplot, select Scatterplot. Change the Y-List to RESID by highlighting it and clicking 2nd -> STAT (LIST) -> RESID. Then, ZOOM -> 9 (ZoomStat).
residual plot is random
a linear model is appropriate; a pattern in the residuals (such as a curve) means the relationship is probably curved
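To see what a non-random residual plot looks like in numbers, this sketch fits a line to deliberately curved data (y = x², chosen for illustration) and lists the residuals. The helper name `fit_line` is hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept and slope."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Curved data on purpose: y = x^2
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 4, 9, 16]
a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(residuals)  # positive, negative, negative, negative, positive: a U shape
```

Residuals that run positive, then negative, then positive again are a pattern, not random scatter, so a straight line is the wrong model here.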
extrapolation
predicting y for x-values beyond the range of the data (risky, since the pattern may not continue)
influential outlier
a point that noticeably changes r or the line of best fit when it is removed
high leverage
point is far from the others horizontally (x-value far from the mean of x)
high residual
point is far from the line of best fit vertically
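One way to test whether a point is influential is to fit the line with and without it. The sketch below uses made-up data with one high-leverage point; `slope_and_r` is a hypothetical helper.

```python
from statistics import mean

def slope_and_r(xs, ys):
    """Least-squares slope and correlation r."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Four points on a perfect line, plus one hypothetical point far to the right
xs = [1, 2, 3, 4, 10]
ys = [1, 2, 3, 4, 2]

slope_with, r_with = slope_and_r(xs, ys)
slope_without, r_without = slope_and_r(xs[:-1], ys[:-1])

print(f"with point:    slope = {slope_with:.2f}, r = {r_with:.2f}")
print(f"without point: slope = {slope_without:.2f}, r = {r_without:.2f}")
```

The slope and r change drastically when the far-right point is dropped, so in this example the point is both high leverage and influential.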
re-expression if
scatter plot is not straight enough
re-expression on calculator
1) Sketch a scatterplot of data. (Straight line = good)
2) Do linear regression on it. Record r and r² (r close to 1 or -1 = good. r² close to 1 = good)
3) Sketch the residual plot. (Random scatter = good)
4) Take the square root of the y’s and store it as a new list: √(L2) STO L3
5) Now, redo steps 1-3, using L1 and L3
6) Continue transforming L2, in this order, repeating steps 1-3 each time: Log(y) stored in L4, 1/√y stored in L5, and 1/y stored in L6. If those all fail, try Log(x) and y, or Log(x) and Log(y)
o On calc: Log(L2) STO L4
o On calc: 1/√(L2) STO L5
o On calc: 1/L2 STO L6
7) Choose which one works best by checking the scatterplot, r, r², and the residual plot
8) Make predictions using your original x's and the re-expressed y's, then undo the re-expression to get the prediction back in the original y units
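The steps above can be sketched in Python by trying each re-expression of y and comparing r². This is only a rough aid under stated assumptions: the data is made up (y = x² exactly), the names are hypothetical, and r² alone does not replace checking the scatterplot and residual plot.

```python
import math
from statistics import mean

def r_squared(xs, ys):
    """Square of the correlation between xs and ys."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Hypothetical curved data: y = x^2
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 4, 9, 16, 25, 36]

# Re-expressions in the order the steps try them
re_expressions = {
    "y": lambda y: y,
    "sqrt(y)": math.sqrt,
    "log(y)": math.log10,
    "1/sqrt(y)": lambda y: 1 / math.sqrt(y),
    "1/y": lambda y: 1 / y,
}

results = {name: r_squared(xs, [f(y) for y in ys])
           for name, f in re_expressions.items()}
best = max(results, key=results.get)

for name, value in results.items():
    print(f"{name:10s} r^2 = {value:.4f}")
print("highest r^2:", best)
```

Here sqrt(y) straightens the data, so a prediction from that line has to be squared to get back to the original y units.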
can’t re-express if
graph oscillates