DAT566: Module 4 Linear and Logistic Regression

Last updated 4:10 PM on 6/7/26

24 Terms

1

What is the difference between regression, classification, and clustering?

Regression: Predicting a numerical quantity

Classification: Assigning a label from a discrete set of possibilities

Clustering: Grouping items by similarity

2

Which evaluation metrics are commonly used for classification tasks?

Precision, recall, and accuracy
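
As a rough illustration of how these three metrics are computed for a binary task, here is a minimal sketch. The function name `classification_metrics` and the example labels are made up for this card; the zero-division fallback to 0.0 is one common convention, not the only one.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, accuracy) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Precision: of everything predicted positive, how much was right?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much was found?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: fraction of all predictions that were right.
    accuracy = correct / len(pairs)
    return precision, recall, accuracy


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall, accuracy = classification_metrics(y_true, y_pred)
```

With these labels there are 3 true positives, 1 false positive, and 1 false negative, so all three metrics come out to 0.75.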

3

What is linear regression?

Predicts the value of a dependent variable (Y) from known values of one or more independent variables.

4

What are the main goals of linear regression?

Fit a line of the form f(x) = k⋅x + m to the data

Select k and m so that the total error (the sum of squared residuals) is minimal
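
For a single feature, the least-squares k and m have a closed form. The sketch below assumes that setting; `fit_line` and the data points are hypothetical names and values chosen for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope k and intercept m for f(x) = k*x + m."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    k = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    m = mean_y - k * mean_x
    return k, m


# Hypothetical data, roughly following y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
k, m = fit_line(xs, ys)

# Residual: actual value minus predicted value at each point.
residuals = [y - (k * x + m) for x, y in zip(xs, ys)]
sum_squared_residuals = sum(r ** 2 for r in residuals)
```

For this data the fit is approximately k = 1.94 and m = 1.15. The signed residuals of a least-squares line with an intercept sum to zero, which is why the squared residuals, not the raw ones, are what gets minimized.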

5

What is residual error?

The vertical distance between the actual value and the predicted value for a given point.

6

What model of residual error do we use and why?

Ordinary Least Squares (OLS): Squaring the residuals treats positive and negative errors equally, penalizes large deviations more heavily, and gives a convex objective.

7

What conditions are required to fit a linear regression model?

The relationship between predictors and response is linear

Errors have constant variance

Errors are independent

8

What is gradient descent?

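An iterative optimization method that repeatedly steps in the direction of the negative gradient of the loss

On non-convex surfaces it finds a local minimum and does not guarantee a globally optimal solution

The gradient gives the direction that reduces the loss but not the step size, which is set separately by the learning rate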
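
To make the update rule concrete, here is a minimal sketch of gradient descent on the mean squared error of a one-feature linear model. The learning rate, step count, and data are assumptions chosen so that this toy example converges; they are not recommended defaults.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Minimize mean squared error of f(x) = k*x + m by gradient descent."""
    k, m = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(k * x + m) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(error^2).
        grad_k = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_m = (2.0 / n) * sum(errors)
        # The gradient supplies the direction; learning_rate supplies the step size.
        k -= learning_rate * grad_k
        m -= learning_rate * grad_m
    return k, m


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
k, m = gradient_descent(xs, ys)
```

Because the mean squared error of a linear model is convex, the iterates here approach the global minimum at k = 2 and m = 1. A learning rate that is too large would make the same loop diverge.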

9

What is true about optimization in linear regression?

Gradient descent iteratively updates parameters in the direction that reduces the value of a cost function.

The mean squared error cost function for linear regression has a single global minimum.

10

In linear regression, what does coefficient of determination (R²) represent?

The proportion of variance in the dependent variable explained by the independent variable(s)
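
A small sketch of how R² can be computed from actual and predicted values, using the usual definition 1 − SS_res / SS_tot. The function name and the numbers are illustrative assumptions.

```python
def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    # Variation left unexplained by the model.
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    # Total variation of y around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


ys = [2.0, 4.0, 6.0, 8.0]
perfect = r_squared(ys, [2.0, 4.0, 6.0, 8.0])    # predictions match exactly
mean_only = r_squared(ys, [5.0, 5.0, 5.0, 5.0])  # always predict the mean
```

Perfect predictions give R² = 1, and always predicting the mean of y gives R² = 0, which are the two reference points for reading the value.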

11

What is correlation?

Correlation measures the linear relationship between two continuous variables

It tells us how much one variable tends to change when the other one does

12

What does the Pearson correlation coefficient (r) measure?

The strength and direction of the linear relationship between two variables

Pearson is used for continuous, interval/ratio data
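
As a sketch, Pearson's r is the covariance of the two variables divided by the product of their standard deviations. The function name and sample values below are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n factors cancel, so sums are used directly.
    return cov / math.sqrt(var_x * var_y)


r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])    # perfectly increasing line
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # perfectly decreasing line
```

The sign gives the direction (here +1 and −1) and the magnitude gives the strength of the linear relationship.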

13

What does Spearman’s correlation (ρ) represent?

Quantifies the strength and direction of a monotonic relationship between two variables, computed from their ranks

Spearman is used for ordinal or ranked data
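
One way to see what Spearman's ρ measures is to compute it as the Pearson correlation of the ranks. The sketch below does that, giving tied values their average rank; the helper names and the cubic example data are assumptions for illustration.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Monotonic but non-linear relationship: y = x^3.
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
rho = spearman_rho(xs, ys)
r_linear = pearson_r(xs, ys)
```

Here ρ is 1 because y always increases with x, while Pearson's r is below 1 because the relationship is not a straight line.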

14

What is regularization?

A technique used in linear regression to prevent overfitting by penalizing large coefficient weights to force the model to be simpler and rely less on any single feature.

15

What is a strategy to avoid overfitting?

Adding regularization (e.g., L1 or L2)

16

What is Ridge regression (L2)?

A regularization technique that reduces model complexity and prevents overfitting by shrinking high-weight coefficients closer to zero.

Penalizes the squared Euclidean (L2) norm of the coefficients.
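
The shrinking effect of ridge (the L2 penalty) is easiest to see with a single feature and no intercept, where minimizing Σ(y − k·x)² + λ·k² gives k = Σxy / (Σx² + λ). This is a simplified sketch under that assumption; `ridge_slope`, the penalty values, and the data are hypothetical.

```python
def ridge_slope(xs, ys, penalty):
    """Slope minimizing sum((y - k*x)^2) + penalty * k^2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # The penalty inflates the denominator, pulling k toward zero.
    return sxy / (sxx + penalty)


# Hypothetical centered data, roughly y = 2x.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.1, -1.9, 0.2, 2.1, 3.9]

k_ols = ridge_slope(xs, ys, penalty=0.0)     # no penalty: ordinary least squares
k_ridge = ridge_slope(xs, ys, penalty=10.0)  # penalized: smaller slope
```

With this data the unpenalized slope is about 2.0 and the penalized slope is about 1.0. The coefficient shrinks but does not reach exactly zero for any finite penalty.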

17

What is LASSO regression (L1)?

A regularization technique that prevents overfitting by adding a penalty proportional to the absolute value of coefficient magnitudes to the loss function.

It forces less important feature weights to exactly zero.

Penalizes the Manhattan (L1) norm of the coefficients.
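
Why an absolute-value (L1) penalty can set a weight to exactly zero shows up in the one-feature, no-intercept case: minimizing ½·Σ(y − k·x)² + λ·|k| gives a soft-thresholded slope. This is a simplified sketch under that assumption; the function names, penalty values, and data are hypothetical.

```python
def soft_threshold(z, penalty):
    """Shrink z toward zero by penalty; values within the band become 0."""
    if z > penalty:
        return z - penalty
    if z < -penalty:
        return z + penalty
    return 0.0


def lasso_slope(xs, ys, penalty):
    """Slope minimizing 0.5 * sum((y - k*x)^2) + penalty * |k| (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, penalty) / sxx


# Hypothetical centered data, roughly y = 2x.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.1, -1.9, 0.2, 2.1, 3.9]

k_small = lasso_slope(xs, ys, penalty=5.0)   # shrunk but non-zero
k_large = lasso_slope(xs, ys, penalty=25.0)  # penalty exceeds sum(x*y): exactly zero
```

With this data the slope is about 1.5 at penalty 5 and exactly 0.0 at penalty 25, which is the feature-selection behavior the card describes.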

18

Why might ridge regression produce better predictive performance than ordinary least squares on new data?

It shrinks coefficient estimates using an L2 penalty, reducing variance and improving generalization.

19

What is logistic regression?

A technique used primarily for classification problems.

Involves predicting probabilities of belonging to each class; outputs are probabilities

Models the log-odds linearly

Predicts whether something is true or false, instead of predicting something continuous like size
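
The pieces of this card fit together as follows: a linear function of the features gives the log-odds, the sigmoid turns that into a probability, and a threshold turns the probability into a true/false prediction. The weights, bias, and 0.5 threshold in this sketch are hypothetical values, not fitted ones.

```python
import math


def sigmoid(z):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    # The log-odds are a linear function of the features.
    log_odds = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(log_odds)


# Hypothetical parameters, not learned from data.
weights = [1.5, -0.5]
bias = -1.0

p = predict_probability([2.0, 1.0], weights, bias)  # log-odds = -1 + 3 - 0.5 = 1.5
label = 1 if p >= 0.5 else 0                        # threshold at 0.5
```

Taking log(p / (1 − p)) of the output recovers the linear log-odds of 1.5, and a probability of exactly 0.5 corresponds to log-odds of 0.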

20

Is logistic regression still considered a linear model?

Yes. Although the sigmoid curve is non-linear, the log-odds are a linear function of the features, so the decision boundary is linear (a straight line in two dimensions, a hyperplane in general).

21

What is the difference between probability and likelihood?

Probability predicts future outcomes based on fixed parameters, while likelihood evaluates the plausibility of parameter values based on observed data.
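
A coin-flip example can make the distinction concrete: the same binomial formula is read as a probability when the parameter is fixed and the outcome varies, and as a likelihood when the observed data is fixed and the parameter varies. The numbers (7 heads in 10 flips, a grid of candidate values) are assumptions for illustration.

```python
from math import comb


def binomial_probability(heads, flips, p):
    """P(exactly `heads` heads in `flips` flips) for head-probability p."""
    return comb(flips, heads) * p ** heads * (1 - p) ** (flips - heads)


# Probability: the parameter is fixed (fair coin), the outcome is what we ask about.
prob_7_heads = binomial_probability(7, 10, 0.5)

# Likelihood: the data is fixed (7 heads in 10 flips), the parameter varies.
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9
likelihoods = {p: binomial_probability(7, 10, p) for p in candidates}
most_plausible_p = max(likelihoods, key=likelihoods.get)
```

For a fair coin the probability of 7 heads is 120/1024, about 0.117. Among the candidate parameter values, 0.7 makes the observed data most plausible, which is the idea behind maximum likelihood estimation.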

22

What is Multi-class classification?

A machine learning technique for assigning data points to one of three or more mutually exclusive classes

23

Give an example of Multi-class classification

Identifying a vehicle in an image as a car, truck, or motorcycle

24

What can be used to balance classes?

Oversample the minority class

Undersample the majority class

Assign higher weights to data points in rare classes
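
Two of these strategies can be sketched in a few lines: weighting each class inversely to its frequency, and randomly oversampling the minority class until it matches the majority. The weight formula n_samples / (n_classes × class_count) is one common heuristic, and the function names, seed, and labels below are hypothetical.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


def oversample_minority(labels, seed=0):
    """Return row indices with smaller classes resampled up to the majority size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    indices = list(range(len(labels)))
    for label, count in counts.items():
        pool = [i for i, current in enumerate(labels) if current == label]
        # Draw with replacement until this class reaches the majority count.
        indices.extend(rng.choice(pool) for _ in range(target - count))
    return indices


# Hypothetical imbalanced labels: 2 "spam" versus 8 "ham".
labels = ["spam"] * 2 + ["ham"] * 8
weights = balanced_class_weights(labels)
resampled = [labels[i] for i in oversample_minority(labels)]
resampled_counts = Counter(resampled)
```

Here the rare class gets weight 2.5 and the common class 0.625, and after oversampling both classes have 8 examples. Undersampling the majority class would follow the same pattern with sampling down to the minority count instead.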