What information is provided by the correlation coefficient?
- The closer to -1, the stronger the negative relationship
- The closer to 1, the stronger the positive relationship
- The closer to 0, the weaker the linear relationship
Hypothesis testing for the correlation coefficient
H0: ρ = 0 (no correlation)
H1: ρ ≠ 0 (correlation)
Calculate the test statistic: t = r × sqrt(n - 2) / sqrt(1 - r^2)
df = n - 2
If you reject the null hypothesis, there is evidence of a linear correlation.
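The test above can be sketched in a few lines of Python. This is a minimal illustration, not output from any real study: the data values are made up, and the critical value 3.182 is the two-tailed t table entry for alpha = 0.05 with 3 degrees of freedom.

```python
import math

# Hypothetical sample data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products around the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Sample correlation coefficient r.
r = s_xy / math.sqrt(s_xx * s_yy)

# Test statistic for H0: rho = 0, with df = n - 2.
df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Two-tailed critical value for alpha = 0.05 and df = 3 (assumed, from a t table).
t_crit = 3.182

reject_h0 = abs(t_stat) > t_crit
print(f"r = {r:.4f}, t = {t_stat:.4f}, df = {df}, reject H0: {reject_h0}")
```

With only five made-up points, |t| stays below the critical value, so this sketch does not reject H0 even though r looks fairly large.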
How do the points lie relative to a line when the correlation coefficient between two variables is -1? 1? 0?
-1: all points lie on a straight line sloping downward
1: all points lie on a straight line sloping upward
0: the points show no linear pattern, so no straight line describes them
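The three cases can be checked numerically. The sketch below uses a hypothetical helper, `pearson_r`, and made-up points. For the r = 0 case it uses a symmetric curve rather than random scatter, as a reminder that r = 0 rules out only a linear trend.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient r for two equal-length lists."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [1, 2, 3, 4, 5]

# All points on a rising line: y = 2x + 1.
r_up = pearson_r(x, [2 * a + 1 for a in x])

# All points on a falling line: y = -2x + 1.
r_down = pearson_r(x, [-2 * a + 1 for a in x])

# Points on the curve y = x^2, symmetric about x = 0: no linear trend.
r_none = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(f"{r_up:.2f} {r_down:.2f} {r_none:.2f}")
```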
Population correlation coefficient
ρ (Rho)
Sample correlation coefficient
r
Correlation
Assesses the degree of association or relationship between two quantitative variables
Regression
A mathematical equation that describes the relationship between a variable of interest (the dependent variable) and one or more related variables (the independent or explanatory variables)
Least squares regression
b0 and b1 are obtained by fitting a line to the data in a way that minimizes the sum of the squared errors
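A minimal sketch of the least squares fit for one predictor, using the standard closed-form formulas b1 = Sxy / Sxx and b0 = ȳ - b1·x̄. The data and the helper name `sum_squared_errors` are assumptions made for illustration. The last lines check that nudging either coefficient makes the sum of squared errors larger.

```python
# Hypothetical sample data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form least squares estimates for a single predictor.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx           # slope
b0 = y_bar - b1 * x_bar    # intercept

def sum_squared_errors(intercept, slope):
    """Sum of squared vertical distances from the points to the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

best = sum_squared_errors(b0, b1)
shifted_intercept = sum_squared_errors(b0 + 0.1, b1)
shifted_slope = sum_squared_errors(b0, b1 + 0.1)

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SSE = {best:.2f}")
print(f"SSE after nudging b0: {shifted_intercept:.2f}, after nudging b1: {shifted_slope:.2f}")
```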
What does the slope of the line represent?
Measures the change in the average value of y as a result of a one-unit change in x
What does the Y variable represent?
The dependent (response) variable
What is the X variable?
The independent (explanatory) variable
Coefficient of determination
r^2
We can state that ____% of the variation in y is explained by the variation in x.
Coefficient of alienation
1 - r^2
We can state that ____% of the variation in y is not explained by the variation in x.
Standard error of estimate
S (given by Minitab)
The average amount of error in predicting y from x is approximately ____.
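The three summary measures above (r^2, 1 - r^2, and S) can be computed by hand as a check on the Minitab values. This is a sketch on made-up data, assuming the usual formulas r^2 = 1 - SSE/SST and S = sqrt(SSE / (n - 2)).

```python
import math

# Hypothetical sample data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Total variation in y, and the part left unexplained by the line.
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / sst       # coefficient of determination
alienation = 1 - r_squared      # coefficient of alienation
s = math.sqrt(sse / (n - 2))    # standard error of estimate

print(f"{r_squared:.1%} of the variation in y is explained by x")
print(f"{alienation:.1%} of the variation in y is not explained by x")
print(f"Average prediction error is approximately {s:.4f}")
```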
What is the main difference between the regression methodology and the correlation coefficient methodology?
Correlation only measures the strength and direction of the linear relationship between two variables; it does not explain causality or variation.
Regression is used primarily for prediction and to explain how variation in y relates to x. A fitted regression does not by itself establish causality.
How do you interpret the coefficients b0 and b1 in the regression equation?
b0 is the average value of y when the value of x is zero
b1 is the change in the average value of y as a result of a one-unit increase in x
When shouldn't you interpret b0?
When it doesn't make sense in the context of the problem, or when x = 0 falls outside the range of x values in the data
How do you test whether the equation is relevant to the population?
Do a hypothesis test
H0: β1 = 0 (no linear relationship)
H1: β1 ≠ 0 (linear relationship)
t = (b1 - β1) / Sb1, where β1 = 0 under H0
df = n - 2
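The slope test can be worked through by hand as follows. This sketch uses made-up data and assumes the standard formula Sb1 = S / sqrt(Sxx) for the standard error of the slope; the critical value 3.182 is the two-tailed t table entry for alpha = 0.05 with 3 degrees of freedom.

```python
import math

# Hypothetical sample data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of estimate, then standard error of the slope.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
s_b1 = s / math.sqrt(s_xx)

# Test statistic for H0: beta1 = 0.
beta1_null = 0
t_stat = (b1 - beta1_null) / s_b1
df = n - 2

# Two-tailed critical value for alpha = 0.05 and df = 3 (assumed, from a t table).
t_crit = 3.182
reject_h0 = abs(t_stat) > t_crit

print(f"Sb1 = {s_b1:.4f}, t = {t_stat:.4f}, df = {df}, reject H0: {reject_h0}")
```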
Calculate the confidence interval for beta1. What is your conclusion?
b1 - t(n-2) × Sb1 ≤ β1 ≤ b1 + t(n-2) × Sb1, where t(n-2) is the critical t value with n - 2 degrees of freedom
We can be 95% confident that for an increase of one unit of x, y will change by a minimum of ____ and a maximum of ____ on average.
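A sketch of the interval calculation on made-up data, again assuming Sb1 = S / sqrt(Sxx) and a t table value of 3.182 (95% confidence, 3 degrees of freedom). The quantity `margin` is the critical t value times Sb1.

```python
import math

# Hypothetical sample data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares fit and the standard error of the slope.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(s_xx)

# Critical t value for 95% confidence and df = n - 2 = 3 (assumed, from a t table).
t_crit = 3.182

margin = t_crit * s_b1
lower = b1 - margin
upper = b1 + margin

print(f"{lower:.3f} <= beta1 <= {upper:.3f}")
```

In this made-up example the interval contains 0, which agrees with failing to reject H0: β1 = 0 at the 5% level.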
What is the difference between a confidence interval (CI) and a prediction interval (PI)?
A confidence interval estimates the average value of y for a given x, so its conclusion uses the term "average". A prediction interval estimates a single, specific value of y for a given x, so it is wider.
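The difference in width can be seen by computing both intervals at the same x. This sketch uses made-up data, a hypothetical new value x0 = 4, and the standard textbook formulas: the prediction interval adds a 1 under the square root to account for the scatter of an individual observation.

```python
import math

# Hypothetical sample data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
x0 = 4  # hypothetical x value at which to estimate y

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares fit and standard error of estimate.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

y_hat = b0 + b1 * x0
t_crit = 3.182  # 95% confidence, df = 3 (assumed, from a t table)

# Margin for the average y at x0 (CI) and for one individual y at x0 (PI).
leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx
ci_margin = t_crit * s * math.sqrt(leverage)
pi_margin = t_crit * s * math.sqrt(1 + leverage)

print(f"CI: {y_hat - ci_margin:.3f} to {y_hat + ci_margin:.3f}")
print(f"PI: {y_hat - pi_margin:.3f} to {y_hat + pi_margin:.3f}")
```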
How do you interpret CI and PI? Conclusion?
"We can be 95% confident that the average units of y for the given x are between ____ and ____"
vs.
"We can be 95% confident that the units of y for the given x are between ____ and ____"
What is the margin of error?
t(n-2) × Sb1 (the critical t value with n - 2 degrees of freedom times the standard error of b1)
What is extrapolation?
Extending a trend line beyond the given data to make a prediction
What is a Residual? What information is provided by the residuals?
It is the difference between the actual Y value and the Y value predicted from the regression equation.
A positive residual indicates a value of Y that is larger than what would be expected based on the value of x, and a negative residual indicates a value of Y that is smaller than what would be expected based on the value of x. The largest residuals (either positive or negative) provide information about "unusual" data points.
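A short sketch of how residuals are computed and how the most unusual point could be flagged. The data are made up, and picking the largest absolute residual is just one simple way to spot an unusual observation.

```python
# Hypothetical sample data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual = actual y minus the y predicted by the regression equation.
predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Index of the point with the largest residual in absolute value.
most_unusual = max(range(n), key=lambda i: abs(residuals[i]))

for xi, yi, ei in zip(x, y, residuals):
    print(f"x = {xi}, y = {yi}, residual = {ei:+.2f}")
print(f"Largest residual is at x = {x[most_unusual]}")
```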
How do you "spot" the predicted values on the Minitab printout when predicting for a single value?
Under the sections "Values of Predictors for New Observations" and "Predicted Values for New Observations"
What parameter(s) on the Minitab printout do you use to demonstrate that there is a linear relationship between Y and X?
Residual Error DF = df = n - 2, so n = Residual Error DF + 2
The t statistic is in the T column, in the row for the x predictor
What is the relationship between R_square and the correlation coefficient r?
R-square is r^2. The correlation coefficient r is the square root of r^2, taken with the same sign as the slope b1.
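The sign matters when going from R-square back to r. The sketch below uses made-up data with a downward trend: it computes r directly, then recovers it from R-square using the sign of the slope, and the two agree.

```python
import math

# Hypothetical sample data with a downward trend, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [5, 4, 5, 4, 2]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Correlation coefficient computed directly.
r_direct = s_xy / math.sqrt(s_xx * s_yy)

# R-square computed from the regression fit.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - sse / s_yy

# Square root of R-square, carrying the sign of the slope b1.
r_from_r_squared = math.copysign(math.sqrt(r_squared), b1)

print(f"r = {r_direct:.4f}, R-square = {r_squared:.4f}, recovered r = {r_from_r_squared:.4f}")
```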