Regression & Cross Validation Code

Last updated 6:55 PM on 2/8/26

11 Terms

1

Write the code for the import blocks for linear_model, mean_squared_error, r2_score, mean_absolute_error, GridSearchCV, train_test_split, and cross_val_score

from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score

2

Write the code for coding a train test split for x and y

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20) # Splits into 80% training, 20% testing

3

Write the 2 lines of code for creating an unfitted linear regression model and fitting the best straight line to the training data

regr = linear_model.LinearRegression()
regr.fit(X_train, y_train)
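
The cards assume that X and y already exist. A minimal sketch with synthetic data shows the shapes scikit-learn expects: X has to be 2-D, with one column per feature. The x range of 22 to 28, the slope of 3.0, the noise level, and the random_state values are made up for illustration and do not come from the original dataset.

```python
import numpy as np
from sklearn import linear_model
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical data: one feature in the range 22..28, y roughly linear in x
X = rng.uniform(22.0, 28.0, size=(200, 1))  # 2-D: (n_samples, n_features)
y = 3.0 * X[:, 0] - 10.0 + rng.normal(0.0, 0.5, size=200)

# random_state is an added assumption so the split is reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)

regr = linear_model.LinearRegression()
regr.fit(X_train, y_train)

print(X_train.shape, X_test.shape)   # 80% / 20% of the 200 rows
print(regr.coef_, regr.intercept_)   # fitted slope (array) and intercept
```

If X starts out as a 1-D array, X.reshape(-1, 1) turns it into the single-column 2-D form that fit() requires.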

4

Write the 2 lines of code for predicting the y values for the test data and the train data.

y_pred = regr.predict(X_test) # Predicted y values for the test data
y_pred_train = regr.predict(X_train) # Predicted y values for the train data

5

Write the code for printing the R2 score for the train & test data

print("R^2 Score (Train):", r2_score(y_train, y_pred_train))
print("R^2 Score (Test):", r2_score(y_test, y_pred))

6

Write the code for printing the MSE and MAE for the train and test data

print("Mean Squared Error (Train):", mean_squared_error(y_train, y_pred_train))
print("Mean Squared Error (Test):", mean_squared_error(y_test, y_pred))
print("Mean Absolute Error (Train):", mean_absolute_error(y_train, y_pred_train))
print("Mean Absolute Error (Test):", mean_absolute_error(y_test, y_pred))
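
To see what these three metrics measure, it can help to compute them by hand with NumPy and compare against scikit-learn. The sketch below uses a tiny set of made-up values (y_true and y_hat are hypothetical, not from the original data).

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Hypothetical true and predicted values
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 7.0, 10.0])

residuals = y_true - y_hat

mse = np.mean(residuals ** 2)        # mean of squared errors
mae = np.mean(np.abs(residuals))     # mean of absolute errors

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print("MSE:", mse, mean_squared_error(y_true, y_hat))
print("MAE:", mae, mean_absolute_error(y_true, y_hat))
print("R^2:", r2, r2_score(y_true, y_hat))
```

MSE penalizes large errors more heavily than MAE because the residuals are squared, while R^2 compares the model against always predicting the mean of y.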

7

Write the code for plotting a line of best fit for the regression data

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(1,2, figsize=(10,5)) # Set up 2 side-by-side plots

Xplot = np.array([22.0,28.0]) # Start and end x values for drawing the red line (should span the range of X)

ax[0].scatter(X_train, y_train)
ax[0].plot(Xplot, regr.coef_[0]*Xplot+regr.intercept_, color='red', linewidth=1)
ax[0].set_title("Training set with $R^2$=%.2f" % r2_score(y_train, y_pred_train))

ax[1].scatter(X_test, y_test)
ax[1].plot(Xplot, regr.coef_[0]*Xplot+regr.intercept_, color='red', linewidth=1) 
ax[1].set_title("Testing set with $R^2$=%.2f" % r2_score(y_test, y_pred))

8

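Write the code for creating a new dataset where the inputs are x^2 and x.

# Refit the plain linear model on the original single feature
regr = linear_model.LinearRegression()

regr.fit(X_train, y_train)

# Stack x^2 and x side by side as two feature columns: [x^2, x]
X2_train = np.c_[X_train**2,X_train]
X2_test = np.c_[X_test**2,X_test]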
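
One detail worth checking is column order. np.c_[X**2, X] puts x^2 first, while PolynomialFeatures(degree=2, include_bias=False) puts x first. This matters when reading coef_: in the manual version coef_[0] belongs to x^2, but with PolynomialFeatures coef_[0] belongs to x. A small sketch with a hypothetical three-row input:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[1.0], [2.0], [3.0]])  # hypothetical single-feature input

# Manual stacking: columns are [x^2, x]
X2_manual = np.c_[X**2, X]

# PolynomialFeatures: columns are [x, x^2]
poly = PolynomialFeatures(degree=2, include_bias=False)
X2_poly = poly.fit_transform(X)

print(X2_manual)
print(X2_poly)
print(np.allclose(X2_manual[:, ::-1], X2_poly))  # same columns, reversed order
```

Both matrices give the same fitted curve; only the position of each coefficient in coef_ changes.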

9

Write the code for training the 2nd model on squared data

regr2 = linear_model.LinearRegression()
regr2.fit(X2_train, y_train)

y_pred2_train = regr2.predict(X2_train)
y_pred2_test = regr2.predict(X2_test)

print("R2 (Train): %.2f" % r2_score(y_train, y_pred2_train))
print("R2 (Test): %.2f" % r2_score(y_test, y_pred2_test))

10

Write the code for plotting a quadratic regression with a quadratic line of best fit

fig, ax = plt.subplots(1,2, figsize=(10,5))

Xplot = np.linspace(22,28,100)

ax[0].scatter(X_train, y_train)
ax[0].plot(Xplot, regr2.coef_[0]*Xplot**2+regr2.coef_[1]*Xplot+regr2.intercept_, color='red', linewidth=1)
ax[0].set_title("Training set with $R^2$=%.2f" % r2_score(y_train, y_pred2_train))

ax[1].scatter(X_test, y_test)
ax[1].plot(Xplot, regr2.coef_[0]*Xplot**2+regr2.coef_[1]*Xplot+regr2.intercept_, color='red', linewidth=1)
ax[1].set_title("Testing set with $R^2$=%.2f" % r2_score(y_test, y_pred2_test))

11

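Write the code for comparing the performance of your linear and quadratic models using K-fold cross-validation and PolynomialFeatures

import numpy as np
from sklearn import linear_model
from sklearn.model_selection import KFold, cross_val_score
from sklearn.preprocessing import PolynomialFeatures

# Fixed random_state (an added assumption) so both models are scored on the same 5 folds
kf = KFold(n_splits=5, shuffle=True, random_state=0)

regr = linear_model.LinearRegression()
R2_lin = cross_val_score(regr, X_train, y_train, cv=kf) # R^2 per fold, linear model

poly = PolynomialFeatures(degree=2, include_bias=False)
X_train2 = poly.fit_transform(X_train) # Columns: x, x^2
regr2 = linear_model.LinearRegression()
R2_quad = cross_val_score(regr2, X_train2, y_train, cv=kf) # R^2 per fold, quadratic model

print(np.mean(R2_lin), np.mean(R2_quad))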
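
An alternative way to run the same comparison is to wrap PolynomialFeatures and LinearRegression in a scikit-learn pipeline, so the feature expansion happens inside each fold. The sketch below is self-contained and uses synthetic quadratic data; the data, the names lin_model and quad_model, and the random_state values are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Hypothetical data: y is quadratic in x, centred at x = 25, plus noise
X = rng.uniform(22.0, 28.0, size=(200, 1))
y = 0.5 * (X[:, 0] - 25.0) ** 2 + rng.normal(0.0, 0.3, size=200)

# Fixed random_state so both models see the same 5 folds
kf = KFold(n_splits=5, shuffle=True, random_state=0)

lin_model = LinearRegression()
quad_model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)

# For regressors, cross_val_score defaults to the estimator's R^2 score
r2_lin = cross_val_score(lin_model, X, y, cv=kf)
r2_quad = cross_val_score(quad_model, X, y, cv=kf)

print("Linear    mean R^2:", r2_lin.mean())
print("Quadratic mean R^2:", r2_quad.mean())
```

On data that really is quadratic, the quadratic pipeline should score a clearly higher mean R^2 than the straight-line model; on data that is close to linear, the two scores would be expected to be similar.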