Vocabulary flashcards covering key concepts from the lecture on data preparation, linear regression, and basic ML workflow.
y (ground truth / true label)
The actual target value for a sample used during training; the model aims for y_hat to be as close as possible to y.
y_hat (predicted value)
The model's predicted value for a sample; used to compute the error with the true label y.
w (weights)
Model parameters learned during training; the coefficients multiplied with the features x to produce predictions (the intercept is handled separately by the bias b).
b (bias / intercept)
A constant term added to the linear combination to shift the prediction.
i (sample index)
Index of a data sample (0-based in the lecture; runs from 0 to n − 1).
n (number of samples)
The total number of samples in the dataset.
d (number of features)
The number of feature dimensions (input size for each sample).
y_hat = w^T x + b (linear model equation)
The linear predictor: a dot product between the weights and the features plus the bias, giving the predicted value.
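A minimal sketch of the linear predictor with NumPy; the weight, bias, and feature values are illustrative, not from the lecture:

```python
import numpy as np

# Hypothetical weights, bias, and one feature vector (d = 3)
w = np.array([2.0, -1.0, 0.5])
b = 1.0
x = np.array([1.0, 2.0, 4.0])

# Linear predictor: dot product of weights and features, plus the bias
y_hat = w @ x + b  # 2*1 + (-1)*2 + 0.5*4 + 1 = 3.0
```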
Mean Squared Error (MSE)
Average of the squared differences between predicted and true values: (1/n) Σ (y_hat_i − y_i)^2.
Root Mean Squared Error (RMSE)
Square root of MSE; same units as y and easier to interpret.
Mean Absolute Error (MAE)
Average of the absolute differences between predicted and true values; less sensitive to outliers.
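The three error metrics above can be sketched in a few NumPy lines; the label and prediction values are made up for illustration:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.0])   # ground-truth labels y
y_pred = np.array([2.0, 5.0, 4.0])   # model predictions y_hat

errors = y_pred - y_true             # per-sample errors: [-1, 0, 2]
mse = np.mean(errors ** 2)           # mean of squared errors
rmse = np.sqrt(mse)                  # same units as y
mae = np.mean(np.abs(errors))        # mean of absolute errors; robust to outliers
```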
Classification vs Regression
Classification predicts discrete categories; regression predicts continuous numeric values.
One-hot encoding
Converts a categorical feature with k categories into k binary features to avoid implying ordinal relationships.
Drop first in one-hot encoding
Option (drop_first=True) to remove one dummy column and avoid redundancy/collinearity.
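A short sketch of both one-hot variants with pandas `get_dummies` (the `color` column is a hypothetical example):

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# k = 3 categories -> 3 binary columns
full = pd.get_dummies(df, columns=["color"])

# drop_first=True removes one dummy column to avoid redundancy/collinearity
reduced = pd.get_dummies(df, columns=["color"], drop_first=True)
```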
Label encoding
Assigns integers to categories; can introduce artificial order and is not ideal for nominal categories.
Structured data
Tabular data with rows and columns (like an Excel file) where features and labels are clearly defined.
Unstructured data
Data without a fixed schema (e.g., text, images); images are matrices of pixel values.
Train-test split
Partition data into training and testing subsets; often 80/20; random_state for reproducibility; stratify to preserve class distribution.
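The split described above can be sketched with scikit-learn's `train_test_split`; the toy arrays are illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)      # 10 samples, 2 features
y = np.array([0] * 5 + [1] * 5)       # balanced binary labels

# 80/20 split; random_state makes it reproducible,
# stratify=y preserves the class distribution in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
```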
Random state (seed)
A seed for the random number generator to ensure reproducible splits and results.
Imputation
Filling in missing values (e.g., with column mean); training data used to compute imputation values to avoid leaking test information.
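A minimal sketch of mean imputation with NumPy, using the training mean for both splits to avoid leakage; the columns are made-up examples:

```python
import numpy as np

train_col = np.array([1.0, np.nan, 3.0])
test_col = np.array([np.nan, 5.0])

# Compute the fill value on the TRAINING data only
train_mean = np.nanmean(train_col)   # 2.0

# Apply the same training mean to both splits (no test-set leakage)
train_filled = np.where(np.isnan(train_col), train_mean, train_col)
test_filled = np.where(np.isnan(test_col), train_mean, test_col)
```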
Standardization
Scaling features to zero mean and unit variance (z-scores) using fit_transform on training data and transform on test data.
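The fit-on-train, transform-on-test pattern can be sketched with scikit-learn's `StandardScaler`; the data is illustrative:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0], [3.0], [5.0]])
X_test = np.array([[3.0]])

scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)  # learn mean/std from training data
X_test_std = scaler.transform(X_test)        # reuse the training mean/std
```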
Min-max scaling
Scaling features to [0, 1] by subtracting the min and dividing by the range.
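Min-max scaling is a one-liner in NumPy; the feature values are made up:

```python
import numpy as np

x = np.array([2.0, 4.0, 10.0])

# Subtract the min, divide by the range -> values land in [0, 1]
x_scaled = (x - x.min()) / (x.max() - x.min())
```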
Pseudoinverse
Generalized inverse used when X^T X is not invertible; enables least-squares solutions for non-square matrices.
Normal equations
Closed-form solution for linear regression: w* = (X^T X)^{-1} X^T y; can be expensive for large datasets, hence iterative methods are common.
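Both closed-form routes above can be sketched with NumPy on a tiny dataset that exactly satisfies y = 1 + 2x (the data is illustrative):

```python
import numpy as np

# Design matrix with a leading column of ones for the intercept w_0
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])   # generated by y = 1 + 2*x

# Normal equations: w* = (X^T X)^{-1} X^T y
w_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Pseudoinverse: also works when X^T X is not invertible
w_pinv = np.linalg.pinv(X) @ y
```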
Simple vs. Multiple Linear Regression
Simple: one independent variable; Multiple: more than one independent variable.
Intercept (w_0)
The predicted value when all features are zero; the base level of the regression line.
Slope (w_1, etc.)
The change in the predicted value for a one-unit change in the corresponding feature.