Chapter 12: Linear Regression and Correlation

45 Terms

1
Degrees of freedom
df = n - 2, where n is the number of data pairs (x, y)
2
Outliers
Observed data points that are far from the least-squares line, so they have large residuals.
3
Influential points
observed data points that are far from the other observed data points in the horizontal direction. These points may have a big effect on the slope of the regression line.
4
p-value is less than the significance level
We reject the null hypothesis. There is sufficient evidence to conclude that there is a significant linear relationship between x and y because the correlation coefficient is significantly different from zero.
5
p-value is NOT less than the significance level
DO NOT REJECT the null hypothesis. There is insufficient evidence to conclude that there is a significant linear relationship between x and y because the correlation coefficient is NOT significantly different from zero.
6
Null Hypothesis
H0: ρ = 0
7
Alternate Hypothesis
Ha: ρ ≠ 0
8
Interpreting Null Hypothesis
The population correlation coefficient IS NOT significantly different from zero. There IS NOT a significant linear relationship (correlation) between x and y in the population.
9
Interpreting Alternate Hypothesis
The population correlation coefficient IS significantly DIFFERENT FROM zero. There IS A SIGNIFICANT LINEAR RELATIONSHIP (correlation) between x and y in the population.
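The decision rule from the p-value cards can be sketched in a few lines of Python. This is a minimal illustration, not part of the original card set: the function names `t_statistic` and `decide`, the default significance level of 0.05, and the example values of r, n, and the p-value are all assumptions. The test statistic t = r√(n - 2) / √(1 - r²) with n - 2 degrees of freedom is the standard one for testing H0: ρ = 0; the p-value itself would normally come from a t-table or statistical software.

```python
from math import sqrt

def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

def decide(p_value, alpha=0.05):
    """Apply the card's rule: reject H0 only when p-value < alpha."""
    if p_value < alpha:
        return "Reject H0: significant linear relationship between x and y."
    return "Do not reject H0: no significant linear relationship between x and y."

# Hypothetical sample: r = 0.6 from n = 18 data pairs.
r, n = 0.6, 18
t = t_statistic(r, n)
df = n - 2
print(f"t = {t:.2f} with df = {df}")

# Hypothetical p-values, as if read from a table or software.
print(decide(0.008))
print(decide(0.20))
```

Note that "do not reject" is not the same as accepting H0; it only says the sample did not provide enough evidence of a linear relationship.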
10
ρ
population correlation coefficient
11
r
sample correlation coefficient
12
Conclusion for Significant
There is sufficient evidence to conclude that there is a significant linear relationship between x and y because the correlation coefficient is significantly different from zero.
13
Conclusion for Not Significant
"There is insufficient evidence to conclude that there is a significant linear relationship between x and y because the correlation coefficient is not significantly different from zero."
14
Significance of the correlation coefficient
A hypothesis test used to decide whether the linear relationship in the sample data is strong enough to use to model the relationship in the population.
15
Coefficient of determination
r^2, the square of the correlation coefficient: a number between 0 and 1 that measures how well the regression line predicts the outcome y.
16
r^2 interpretation
when expressed as a percent, represents the percent of variation in the dependent (predicted) variable y that can be explained by variation in the independent (explanatory) variable x using the regression (best-fit) line.
17
1 - r^2 Interpretation
when expressed as a percentage, represents the percent of the variation in y that is NOT explained by variation in x using the regression line.
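As a rough illustration of the two interpretations above, the sketch below computes r^2 for a small made-up data set and reports the explained and unexplained percentages. The data values and the function name `r_squared` are assumptions chosen only for the example.

```python
def r_squared(xs, ys):
    """Square of the sample correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # r = sxy / sqrt(sxx * syy), so r^2 = sxy^2 / (sxx * syy)
    return sxy ** 2 / (sxx * syy)

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2 = r_squared(xs, ys)
print(f"{100 * r2:.1f}% of the variation in y is explained by x")        # 60.0%
print(f"{100 * (1 - r2):.1f}% of the variation in y is not explained")   # 40.0%
```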
18
Positive correlation
A positive value of r means that when x increases, y tends to increase and when x decreases, y tends to decrease.
19
Negative correlation
A negative value of r means that when x increases, y tends to decrease and when x decreases, y tends to increase.
20
Correlation coefficient (r)
A numerical measure, always between -1 and +1, of the strength and direction of the linear association between the independent variable x and the dependent variable y.
21
Slope equation
b = r (sy / sx)
22
sx
the standard deviation of the x values
23
sy
the standard deviation of the y values
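The slope equation and the two standard deviations can be tied together in a short sketch. The helper names and the data are assumptions for illustration; the intercept formula a = ȳ - b·x̄ is not on these cards but is the standard companion to the slope equation, since the least-squares line passes through the point (x̄, ȳ).

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """Sample correlation coefficient r."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def fit_line(xs, ys):
    """Return (a, b) for the line y-hat = a + b x."""
    b = correlation(xs, ys) * sample_std(ys) / sample_std(xs)  # b = r (sy / sx)
    a = mean(ys) - b * mean(xs)                                # line passes through the means
    return a, b

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x")  # y-hat = 2.20 + 0.60x
```

Here the slope of 0.60 would be read as: for every one-unit increase in x, y increases by 0.60 on average.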
24
Interpretation of the Slope
“The slope of the best-fit line tells us how the dependent variable (y) changes for every one unit increase in the independent (x) variable, on average.”
25
Least-Squares Line
When a set of data has a scatter plot that appears to "fit" a straight line, that line is called the line of best fit, or least-squares line.
26
Least-squares regression line
The line of best fit, obtained by making the sum of the squared errors (SSE) as small as possible.
27
ŷ (y hat)
The estimated value of y, obtained from the regression line.
28
y0 - ŷ0 = ε0
error or residual
29
Absolute value of a residual
measures the vertical distance between the actual value of y and the estimated value of y
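A small sketch of residuals follows, assuming a made-up data set and an example line ŷ = 2.2 + 0.6x (both hypothetical, chosen only for illustration). Each residual is the actual y minus the estimated ŷ, and its absolute value is the vertical distance from the point to the line.

```python
def residuals(xs, ys, a, b):
    """Residual for each point: actual y minus estimated y-hat = a + b x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical data and line, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

res = residuals(xs, ys, a, b)
distances = [abs(e) for e in res]   # vertical distances to the line
sse = sum(e ** 2 for e in res)      # sum of squared errors

print([round(e, 1) for e in res])   # [-0.8, 0.6, 1.0, -0.6, -0.2]
print(round(sse, 2))                # 2.4
```

A positive residual means the point lies above the line, and a negative residual means it lies below.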
30
ε
the Greek letter epsilon, used to denote an error or residual
31
Scatterplot Direction
Positive: high values of one variable occur with high values of the other, or low values with low values. Negative: high values of one variable occur with low values of the other.
32
Strength
Judged by how close the points are to the line: the closer the points, the stronger the relationship.
33
Linear regression
models the linear relationship between a dependent variable and one or more independent variables
34
Scatterplot
uses dots to represent values for two different numeric variables.
35
y = a + bx
linear regression for two variables is based on a linear equation with one independent variable.
36
Independent variable
x
37
Dependent variable
y
38
Slope
b
39
y-intercept
a
40
Graph form
a straight line or linear
41
b > 0
slopes upward to the right
42
b = 0
horizontal line
43
b < 0
slopes downward to the right
44
Bivariate data
two variable data
45
Multivariate data
more than two variables