What is R-squared?
the ratio of explained variation to total variation and is equivalent to the proportional reduction of error
How do we interpret R squared?
we interpret r squared as the amount of variability in y that is explained by x.
R squared summarises how well x can predict y. It's a measure of the proportional reduction in prediction error: in other words, how much better x explains variation in y compared to when y-bar is used to predict y.
R squared tells us that there is ___% less prediction error when using x as an explanatory variable for y compared to using y-bar to predict y
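A minimal sketch of this proportional-reduction-in-error idea in Python, using numpy and made-up data (the numbers and variable names are purely illustrative):

```python
import numpy as np

# made-up example data, purely for illustration
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# least-squares line: y_hat = a + b*x
b, a = np.polyfit(x, y, 1)
y_hat = a + b * x

# total squared error when every y is predicted with y-bar
ss_total = np.sum((y - y.mean()) ** 2)
# remaining squared error when the regression line is used instead
ss_error = np.sum((y - y_hat) ** 2)

# R squared = proportional reduction in prediction error
r_squared = (ss_total - ss_error) / ss_total
print(r_squared)  # near 1 here because the fake data are almost perfectly linear
```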
What is a conditional distribution?
refers to the spread of y values around the regression line at a given x. Think of the predicted value as a predicted mean, with a normal distribution of y values spread around it.
What is RMSE? (root mean square error)
Root mean square error is an estimate of the variability in y values at each value of x. It estimates the spread around the regression line and gives us the estimated standard deviation of the conditional distributions of y at each value of x.
The estimated amount of variability in y at each x is assumed to be identical
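A small sketch of how that estimate could be computed from the residuals; in introductory regression the divisor is usually n − 2 because two parameters (slope and intercept) are estimated. Data and names are made up:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

b, a = np.polyfit(x, y, 1)
residuals = y - (a + b * x)

n = len(y)
# estimated standard deviation of the conditional distributions of y at each x
rmse = np.sqrt(np.sum(residuals ** 2) / (n - 2))
print(rmse)
```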
What do we expect when things are normally distributed?
When things are normally distributed, we can expect 95% of our y values (for a given x value) to be within 2 standard deviations of the mean
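For example (made-up numbers): if the predicted value at some x is ŷ = 50 and the RMSE is 5, we would expect roughly 95% of the observed y values at that x to fall between 50 − 2(5) = 40 and 50 + 2(5) = 60.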
what is the marginal distribution?
it describes the distribution of a single variable on its own, ignoring the other variables in the data set. It helps us understand the distribution of individual variables without needing to consider the relationships between them
How does the strength of the association between x and y affect conditional variability compared to overall variability?
The stronger the association between x and y, the smaller the conditional variability will be compared to the overall variability (there is less prediction error)
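As a rough illustration (ignoring the small degrees-of-freedom correction), the conditional standard deviation shrinks by a factor of about √(1 − r²): with made-up numbers, if the overall standard deviation of y is 10 and r = 0.8, the spread around the regression line is roughly 10 × √(1 − 0.64) = 6, so there is noticeably less prediction error than when y-bar is used.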
How do we interpret the RMSE?
if the RMSE is relatively small, this tells us that our residuals are typically small and that the regression line provides a good fit to the actual data (so you want a small RMSE but a large R squared value)
Why do we use a two tailed test in regression?
because we want to test whether there is an association, not predict the direction of the association
When we test for significance in regression, what are we trying to test?
we're trying to see whether the distribution of y is identical at each x value, which for our linear regression function occurs when the slope, b, = 0.
In the null, therefore, b = 0
And our alternative is that b is not equal to 0
how do we calculate our test statistic?
t = b/SE(b) where b is the slope and SE(b) is the standard error of the slope
How do we calculate standard error of the slope/SE(b)?
we calculate SE(b) = b × √(1 − r²) / (r × √(n − 2)), i.e. b times the square root of (1 − r²), divided by r times the square root of (n − 2)
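A sketch tying the slope, SE(b), and the t statistic together on made-up data; the results can be cross-checked against scipy.stats.linregress, which reports the same slope, standard error, and two-tailed p-value:

```python
import numpy as np
from scipy import stats

# made-up example data
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.3, 3.1, 4.8, 4.2, 6.1, 6.0, 7.9, 8.4])

n = len(y)
b, a = np.polyfit(x, y, 1)       # slope b and intercept a
r = np.corrcoef(x, y)[0, 1]      # correlation coefficient

# SE(b) = b * sqrt(1 - r^2) / (r * sqrt(n - 2))
se_b = b * np.sqrt(1 - r**2) / (r * np.sqrt(n - 2))

# test statistic for H0: b = 0
t = b / se_b

# two-tailed p-value with n - 2 degrees of freedom
p = 2 * stats.t.sf(abs(t), df=n - 2)
print(b, se_b, t, p)
```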
Assumptions of regression?
-the relationship between x and y is linear
-the conditional distribution of y at each x is normal
-the variability of y is the same at each value of x
-observations are independent of one another
What will a full interpretation of a regression model include?
-statistical significance
-assumptions assessed
Why may regression not be reliable?
What does standard error of the slope estimate?
it estimates the variability of our slope if we took repeated samples from the population. We can also use SE(b) to construct confidence intervals for the slope (see the sketch below)
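To fill in the confidence-interval part: a common approach is b ± t* × SE(b), where t* is the critical value from a t distribution with n − 2 degrees of freedom. A minimal sketch with made-up numbers:

```python
from scipy import stats

# made-up values, purely illustrative
b = 1.25      # estimated slope
se_b = 0.30   # standard error of the slope
n = 30        # sample size

# critical t value for a 95% confidence interval, n - 2 degrees of freedom
t_star = stats.t.ppf(0.975, df=n - 2)

lower = b - t_star * se_b
upper = b + t_star * se_b
print(lower, upper)  # 95% confidence interval for the population slope
```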
How is R squared similar to r?