Flashcards covering key vocabulary and concepts from the lecture notes on Regression Analysis and Linear Regression.
Response Variable (Y)
The variable you are trying to predict or explain.
Predictive Variables (Xs)
Variables used to predict the response variable. There can be one or more.
Generalizability
Evaluating if a model is applicable to a new dataset, not just the one it was trained on.
Linear Relationship
A straight-line relationship between variables.
Non-linear Relationship
A relationship between variables that is not a straight line.
Simple Linear Regression
A regression with one predictive variable and one response variable.
Multiple Linear Regression
A regression with multiple predictive variables and one response variable.
Reasons to do Regression
To determine whether there is a relationship between the predictor(s) and the response, to estimate (predict) new values of the response, and to test hypotheses about that relationship.
Line of Best Fit
A line that best represents the data points in a scatter plot.
Residual
The vertical distance between an observed data point and the line of best fit.
Sum of Squared Residuals
The sum of the squared distances between each data point and its predicted value on the regression line. Regression aims to minimize this.
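For reference, the quantity being minimized can be written in standard notation (notation assumed, not copied from the notes):

```latex
\mathrm{SSR} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```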
Galton and Pearson
Credited with inventing regression and putting together the theory of least squares.
Pearson's Correlation Coefficient
A numerical method to check for linearity between variables, ranging from -1 to +1.
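A minimal R sketch of computing this coefficient; the built-in mtcars data and variable choices are illustrative assumptions:

```r
# Pearson's correlation between car weight and fuel economy (illustrative data)
cor(mtcars$wt, mtcars$mpg, method = "pearson")
```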
Intercept
The point where the regression line crosses the y-axis; the predicted value of the response when all predictors equal zero.
Slope
The rate of change in the dependent variable for every one-unit increase in the independent variable.
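A small R sketch showing how the intercept and slope can be read from a fitted model; the mtcars example is an assumption for illustration:

```r
# Fit a simple linear regression and inspect its coefficients
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)  # first element is the intercept, second is the slope for wt
```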
Residual (Error)
The difference between the actual value and the predicted value in a regression model.
Method of Least Squares
A method of estimating the parameters in a statistical model by minimizing the sum of the squares of the residuals.
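For simple linear regression, the least-squares estimates have the standard closed form (a textbook result, not transcribed from the notes):

```latex
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```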
Y-hat (ŷ)
Used to denote a predicted value in regression analysis.
lm() Function
R function used to fit linear models, including both simple and multiple linear regression.
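A minimal sketch of fitting simple and multiple linear regressions with lm(); the formulas and mtcars variables are illustrative assumptions:

```r
# Simple linear regression: one predictor
simple_fit <- lm(mpg ~ wt, data = mtcars)

# Multiple linear regression: several predictors
multi_fit <- lm(mpg ~ wt + hp + disp, data = mtcars)

summary(multi_fit)  # coefficients, R-squared, F-test, and more
```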
Assumptions of Linear Regression (LINE)
Linearity, Independence, Normality, Equal Variance.
Linearity (Assumption)
The assumption that there is a straight-line relationship between the predictor and response variables.
Independence (Assumption)
The assumption that observations are independent of one another.
Normality (Assumption)
The assumption that the residuals are normally distributed.
Equal Variance (Homoscedasticity) (Assumption)
The assumption that the variance of the residuals is constant across all levels of the predictor variables.
par(mfrow = c(rows, columns))
R graphics setting that arranges multiple plots in a grid with the specified numbers of rows and columns.
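A short sketch of how this is often combined with plot() on a fitted lm object to show the four standard diagnostic plots together; the example model is an assumption:

```r
fit <- lm(mpg ~ wt, data = mtcars)
par(mfrow = c(2, 2))  # 2 rows x 2 columns of plots
plot(fit)             # residuals vs fitted, Q-Q, scale-location, residuals vs leverage
par(mfrow = c(1, 1))  # reset to a single plot per page
```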
Cook's Distance
A measure of how much an individual observation influences the fitted regression line; large values flag influential points, and a transformation (or closer inspection of the point) may be needed.
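A hedged R sketch of computing Cook's distance for each observation; the example model and the 4/n cutoff are illustrative assumptions:

```r
fit <- lm(mpg ~ wt, data = mtcars)
cd <- cooks.distance(fit)
which(cd > 4 / nrow(mtcars))  # informal rule of thumb for flagging influential points
```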
Hat PC
Used for Linear Regression models.
Collinearity
The degree to which the independent variables in a multiple regression model are correlated.
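Collinearity is often quantified with variance inflation factors; a sketch assuming the car package is available (the package choice is an assumption, not from the notes):

```r
library(car)  # assumed installed; provides vif()
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)
vif(fit)      # values well above roughly 5-10 suggest problematic collinearity
```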
ANOVA (Analysis of Variance)
A statistical test used to assess how well the model fits the data; it partitions the total variance into sums of squares for the regression (model) and for the error (residuals).
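A minimal sketch of viewing the ANOVA table for a fitted regression in R; the example model is an assumption:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
anova(fit)  # sums of squares for each model term and for the residuals
```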
R-squared
A measure of the proportion of variance in the dependent variable that can be predicted from the independent variable(s).
Adjusted R-squared
A modified version of R-squared that adjusts for the number of predictors in the model.
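Both quantities can be read from a model summary in R; a sketch with an assumed example model:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
summary(fit)$r.squared      # proportion of variance explained
summary(fit)$adj.r.squared  # penalized for the number of predictors
```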
Predict Function
A function in R used to generate predictions from a fitted model.
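A short sketch of generating predictions from a fitted model; the new data values are illustrative assumptions:

```r
fit <- lm(mpg ~ wt, data = mtcars)
new_cars <- data.frame(wt = c(2.5, 3.0, 3.5))
predict(fit, newdata = new_cars)                           # point predictions
predict(fit, newdata = new_cars, interval = "prediction")  # with prediction intervals
```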
Transformation
Altering the mathematical scale of a variable to better meet the assumptions of a statistical model or improve the relationship between variables.
Back Transformation
Converting data back to its original scale after a transformation has been applied.
Natural Log Transformation
A type of transformation often used to address issues of non-linearity or non-constant variance.
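A sketch of fitting on the natural log scale and back-transforming predictions with exp(); the model is an illustrative assumption:

```r
# Fit on the log scale to address non-linearity or non-constant variance
log_fit <- lm(log(mpg) ~ wt, data = mtcars)

# Predictions come back on the log scale, so exponentiate to return to original units
pred_log <- predict(log_fit, newdata = data.frame(wt = 3.0))
exp(pred_log)
```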
Multiple Linear Regression Equation
A linear regression equation with multiple independent variables.
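In standard notation (assumed, not transcribed from the notes), the model can be written as:

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon
```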
Partial Regression Coefficient
The change in the response variable associated with a one-unit increase in a specific predictor, holding all other predictors constant.