Independent Variables in Linear Regression
Inputs (features) used to make the prediction; they can be categorical or continuous
Dependent Variables in Linear Regression
Target you’re predicting - must be continuous
What kinds of Machine Learning problems can be solved using Linear Regression?
CO2 emissions from engine specs
House Price from features
Life satisfaction from GDP
Simple Linear Regression
One independent variable, e.g. predicting CO2 emissions using only engine size
Multiple Linear Regression
More than one independent variable, e.g. predicting CO2 emissions using engine size, cylinders, and fuel consumption
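The two model forms can be written side by side. This is a sketch using the deck's theta notation; the feature symbols x_1 .. x_n and the hat on y (for the prediction) are notation assumed here, not taken from the cards.

```latex
% Simple linear regression: one feature
\hat{y} = \theta_0 + \theta_1 x_1

% Multiple linear regression: n features
\hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n
```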
What does it mean to ‘fit a line’ in Linear Regression?
Finding the theta_0 and theta_1 that best represent the relationship between the independent and dependent variables across your scattered data points, i.e. the line that minimizes the overall error
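A minimal sketch of fitting a line to one feature, assuming "error" means the sum of squared errors (ordinary least squares). The data values and variable names are invented for illustration.

```python
import numpy as np

# Hypothetical data, invented for illustration: x could be engine size,
# y could be CO2 emissions. The points lie exactly on y = 1 + 2x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

# Least-squares slope (theta_1) and intercept (theta_0) for a single feature.
theta_1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
theta_0 = y.mean() - theta_1 * x.mean()

# Predictions from the fitted line.
y_hat = theta_0 + theta_1 * x
print(theta_0, theta_1)
```

Because the example points sit exactly on a line, the fitted parameters recover that line and every error is zero. Real data would leave nonzero errors.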
Error in Linear Regression
Actual y minus predicted y for an instance (the residual); it is the vertical distance between a data point and the fitted regression line
Mean Absolute Error (MAE)
Average of the absolute errors, simplest to interpret
Mean Squared Error (MSE)
Average of squared errors, penalizes large errors more, often used as the cost function
Root Mean Squared Error (RMSE)
Square root of MSE; it has the same units as the target, which makes it a widely used and easily interpreted metric
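The three metrics can be computed directly from the per-instance errors. This is a small sketch with made-up targets and predictions; the array names are assumptions for illustration.

```python
import numpy as np

# Hypothetical actual and predicted targets, invented for illustration.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 12.0])

errors = y_true - y_pred          # per-instance error: actual - predicted

mae = np.mean(np.abs(errors))     # Mean Absolute Error
mse = np.mean(errors ** 2)        # Mean Squared Error
rmse = np.sqrt(mse)               # Root Mean Squared Error, in the target's units

print(mae, mse, rmse)
```

Here the single error of -3 contributes 9 of the 11 total squared error, which shows how MSE and RMSE weight large errors more heavily than MAE does.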
Y
Vector of the actual targets
Theta
The parameter vector holding the bias theta_0 and feature weights theta_1.. theta_n
X
Feature matrix/vector for the instances
Why transpose theta?
It lets theta^T * x become a valid matrix multiplication, producing a single scalar prediction
Why is a column of all 1s (x0) added during linear regression?
So the bias term theta_0 can be folded into the same matrix multiplication form as the other parameters
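A short sketch of why x0 = 1 is convenient: with the 1 prepended, a single dot product yields the bias plus the weighted features. The parameter and feature values are invented for illustration.

```python
import numpy as np

# Hypothetical parameter vector: theta_0 (bias) followed by two feature weights.
theta = np.array([1.0, 2.0, 3.0])

# One instance with two features; values invented for illustration.
features = np.array([4.0, 5.0])

# Prepend x0 = 1 so the bias is handled by the same dot product.
x = np.concatenate(([1.0], features))

# theta^T x: for 1-D NumPy arrays, @ computes the dot product (a scalar).
y_hat = theta @ x

# The same prediction written out term by term.
explicit = theta[0] + theta[1] * features[0] + theta[2] * features[1]
print(y_hat, explicit)
```

With 1-D NumPy arrays no explicit transpose is needed. The transpose matters when theta and x are treated as column vectors, as in the written formula.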
Deriving Normal Equation
Start from the MSE cost function written in vector/matrix form
Compute the squared residual by multiplying the residual vector by its own transpose
Take the derivative of this cost function with respect to theta
Set it to zero to find the minimum
Solve for theta (gives you the closed form result)
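The steps above can be sketched as a worked derivation. Two assumptions that are not in the cards: m denotes the number of instances, and X^T X is invertible.

```latex
% 1-2. MSE in matrix form: the residual vector times its own transpose
J(\theta) = \frac{1}{m} (X\theta - y)^{T} (X\theta - y)
          = \frac{1}{m} \left( \theta^{T} X^{T} X \theta - 2\,\theta^{T} X^{T} y + y^{T} y \right)

% 3. Derivative (gradient) with respect to theta
\nabla_{\theta} J(\theta) = \frac{2}{m} \left( X^{T} X \theta - X^{T} y \right)

% 4. Set it to zero
X^{T} X \theta = X^{T} y

% 5. Solve for theta (closed-form result)
\hat{\theta} = (X^{T} X)^{-1} X^{T} y
```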
How to convert a table into a design matrix and use it in the Normal Equation
Take the table and add a new column x0 filled entirely with 1s for the intercept
Arrange all the feature columns into a matrix X where each row is one instance’s full feature vector
Target column becomes the vector y
Plug X and y into the formula theta_hat = (X^T X)^-1 X^T y to solve for the parameter vector
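The steps above can be sketched in NumPy. The table values and column meanings are invented for illustration (targets generated as 2 + 2·engine_size + 2·cylinders), and the sketch assumes X^T X is invertible.

```python
import numpy as np

# Hypothetical table: columns are engine size, cylinders, CO2 emissions.
# Values are invented; the last column equals 2 + 2*size + 2*cylinders.
table = np.array([
    [1.0, 4.0, 12.0],
    [2.0, 4.0, 14.0],
    [2.0, 6.0, 18.0],
    [3.0, 6.0, 20.0],
    [4.0, 8.0, 26.0],
])

features = table[:, :-1]   # every column except the target
y = table[:, -1]           # target column becomes the vector y

# Add the x0 column of 1s for the intercept; each row is one instance.
X = np.column_stack([np.ones(len(features)), features])

# Normal Equation: theta_hat = (X^T X)^-1 X^T y
theta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
print(theta_hat)
```

The explicit inverse mirrors the formula for readability. In practice `np.linalg.solve(X.T @ X, X.T @ y)` or `np.linalg.lstsq(X, y, rcond=None)` is usually preferred for numerical stability.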
Computational Complexity of the Normal Equation for Linear Regression
O(n³), where n is the number of features (the cost of inverting X^T X)
Weakness of using Normal Equation for Linear Regression
It scales poorly for large datasets, especially those with many features, because of the matrix inversion