Ch4 Methods: Linear regression and logistic regression


17 Terms

1

golden rule

test set must stay a representative sample for out-of-sample evaluation

meaning: lock away the test set until all modeling decisions have been made
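A minimal sketch, assuming scikit-learn and a toy dataset: split once, make every modeling decision on the training portion only, and touch the test set just for the final out-of-sample evaluation.

```python
# Sketch of the golden rule with scikit-learn (assumed library) on toy data.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                                  # toy feature matrix
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Lock away the test set: it is only used for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```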

2

objective / loss / cost function

a function that represents our goal and can be calculated for a particular set of predictions and their corresponding output values
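A minimal sketch using mean squared error as one concrete example of such a function: it maps a set of predictions and the corresponding true output values to a single number.

```python
# Example loss function: mean squared error over all observations.
import numpy as np

def mse(y_true, y_pred):
    """Average squared difference between true outputs and predictions."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

print(mse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))   # ~0.02
```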

3

statistics vs machine learning

  • in statistics, the point of the model is to characterize the relationship between the data and the outcome variable, not to make predictions about future data = statistical inference (there is no test set)

  • in machine learning, we train the model on a subset of the data, and we do not know how well it will perform until we 'test' it on additional data that was not seen during training = test set → the goal is to obtain the best performance on new data (the test set)

4

can linear regression overfit?

yes

5

what adds complexity to linear regression

  • more features

  • higher coefficient / slope → higher impact of variable 

  • polynomial / interaction features (see the sketch below)
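A hedged sketch of the last point, assuming scikit-learn's PolynomialFeatures on a tiny toy matrix: the expanded feature set gives the linear model more capacity.

```python
# Polynomial / interaction features add capacity to a linear regression model.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# degree=2 adds x1^2, x2^2 and the interaction x1*x2 to the original features.
poly = PolynomialFeatures(degree=2, include_bias=False)
print(poly.fit_transform(X))
# columns: x1, x2, x1^2, x1*x2, x2^2
```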

6

regularization 

strategies that reduce test error, possibly at the cost of higher training error 

→ necessary with high-capacity / complex models (e.g. ANNs)

7

logistic regression

probability values are bounded between 0 and 1, while a linear regression line usually runs from minus infinity to infinity

estimates probabilities

cut-off point (usually 0.5) 

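A minimal sketch with made-up coefficients: the linear score x^T θ is squashed by the sigmoid into a probability in (0, 1), then the cut-off is applied. Note that the probability is at least 0.5 exactly when the score is non-negative, which is what the next card states.

```python
# Logistic regression: sigmoid(x^T theta) gives a probability, then a cut-off.
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

theta = np.array([0.5, -1.0])          # illustrative coefficients
x = np.array([2.0, 0.3])               # one instance

score = x @ theta                      # linear score x^T theta
p = sigmoid(score)                     # estimated probability, in (0, 1)
prediction = int(p >= 0.5)             # equivalently: int(score >= 0)
print(score, p, prediction)            # 0.7, ~0.668, 1
```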
8

a logistic regression model predicts ...

1 if x^T θ (the linear score) is positive and 0 if it is negative

9

minimizing the binary cross entropy is equivalent to

maximizing the (Bernoulli) (log) likelihood of observing the data
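A minimal numerical sketch with made-up labels and probabilities: the binary cross-entropy equals the negative mean Bernoulli log-likelihood, so minimizing one maximizes the other.

```python
# Binary cross-entropy == negative (mean) Bernoulli log-likelihood.
import numpy as np

y = np.array([1, 0, 1, 1])             # observed labels
p = np.array([0.9, 0.2, 0.7, 0.6])     # predicted probabilities

bce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
log_likelihood = np.sum(np.log(np.where(y == 1, p, 1 - p)))

print(bce, -log_likelihood / len(y))   # the two values coincide
```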

10

why is logistic regression considered a linear classification method?

for a given cut-off c, the decision boundary / surface is linear: σ(x^T θ) = c is equivalent to x^T θ = log(c / (1 − c)), which is a linear equation in x (a hyperplane)

11

does logistic regression have a way to model interaction effects?

no, not unless interaction terms are added explicitly as features

12

multinomial logistic (softmax) regression

for more than two (k) classes

→ estimates the probability that instance x belongs to class k, given the scores of each class

→ probabilities across classes sum to 1

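A minimal NumPy sketch of the softmax itself, with illustrative scores for three classes: the scores become positive probabilities that sum to 1.

```python
# Softmax: class scores s_k(x) -> probabilities that sum to 1.
import numpy as np

def softmax(scores):
    e = np.exp(scores - np.max(scores))   # shift for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])        # illustrative scores for 3 classes
probs = softmax(scores)
print(probs, probs.sum())                 # ~[0.659 0.242 0.099], 1.0
```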
13

multinomial logistic (softmax) regression : for classification

the argmax operator returns the value of a variable that maximizes a function. In this equation, it returns the value of k that maximizes the estimated probability σ(s(x))_k

loss: cross-entropy (the general form of binary cross-entropy)

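A minimal sketch, reusing illustrative softmax probabilities, of the argmax prediction and the cross-entropy loss for a single instance.

```python
# Softmax regression: predict the class with the highest estimated probability,
# train with the cross-entropy loss.
import numpy as np

probs = np.array([0.659, 0.242, 0.099])   # softmax output for one instance
y_true = 0                                 # true class index

prediction = int(np.argmax(probs))         # argmax_k sigma(s(x))_k
cross_entropy = -np.log(probs[y_true])     # general form of binary cross-entropy
print(prediction, cross_entropy)           # 0, ~0.417
```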
14

ridge regression

MSE cost function + ridge regularization term
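A minimal sketch of this cost on toy data; the exact scaling of the penalty term varies between textbooks and libraries, so the alpha weight here is illustrative.

```python
# Ridge cost: MSE plus a penalty on the squared coefficients.
import numpy as np

def ridge_cost(theta, X, y, alpha):
    residuals = X @ theta - y
    mse = np.mean(residuals ** 2)
    penalty = alpha * np.sum(theta ** 2)   # ridge regularization term
    return mse + penalty

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
print(ridge_cost(np.array([1.0, 2.0]), X, y, alpha=0.1))  # 0.0 + 0.1*5 = 0.5
```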

15

advantages of logistic regression

  • works well; the workhorse of data mining

  • computationally not demanding

  • provides a comprehensible linear model

  • provides probabilities

16

disadvantages of logistic regression

non-linear models can improve performance, as standard logistic regression is unable to capture non-linearities in the data

17

non-linear effects in logistic regression

  • interaction effects

  • polynomial features

  • → can be added as extra features, but this quickly becomes impractical and reduces interpretability (see the sketch below)
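A hedged sketch, assuming scikit-learn and synthetic data: polynomial / interaction features can be generated in front of logistic regression, but even 5 original features already expand to 20 coefficients, which illustrates the loss of interpretability.

```python
# Adding polynomial / interaction features in front of logistic regression.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),  # squares + interactions
    LogisticRegression(max_iter=1000),
)
model.fit(X, y)
print(model[-1].coef_.shape)   # (1, 20): 5 original + 15 derived features
```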