golden rule
test set must stay a representative sample for out-of-sample evaluation
meaning : lock away the test set until all modeling decisions have been made
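A minimal sketch of the golden rule in code (the function name and 80/20 split are illustrative assumptions, not from the cards): split once, then touch the test set only for the final evaluation.

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle once, then lock away the last test_fraction as the test set."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

train, test = train_test_split(list(range(100)))
# All modeling decisions (features, hyperparameters) use `train` only;
# `test` stays locked away until the single final evaluation.
```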
objective / loss / cost function
function that represents our goal, and can be calculated for a particular set of predictions and their corresponding output values
statistics vs machine learning
in statistics the point of our model is to characterize the relationship between the data and our outcome variable, not to make predictions about future data = statistical inference (there is no test set)
in machine learning we train the model on a subset of our data, and we do not know how well the model will perform until we “test” it on additional data that was not present during training = test set → goal is to obtain the best performance on new data (the test set)
can linear regression overfit ?
yes
what adds complexity to linear regression
more features
higher coefficient / slope → higher impact of variable
polynomial / interaction features
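A small sketch of how polynomial and interaction features add capacity (the helper name is an assumption for illustration): a degree-2 expansion of two features already produces squared and interaction terms, each with its own coefficient.

```python
from itertools import combinations_with_replacement

def poly_features(x, degree=2):
    """Expand a feature vector with all monomials up to `degree`
    (interaction terms included); each new term adds model capacity."""
    feats = list(x)
    for d in range(2, degree + 1):
        for combo in combinations_with_replacement(range(len(x)), d):
            term = 1.0
            for i in combo:
                term *= x[i]
            feats.append(term)
    return feats

poly_features([2.0, 3.0])  # -> [x1, x2, x1^2, x1*x2, x2^2] = [2.0, 3.0, 4.0, 6.0, 9.0]
```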
regularization
strategies that reduce test error, possibly at the cost of higher training error
→ necessary with high capacity / complex models (ex ANN’s)
logistic regression
probability values are bounded between 0 and 1, while a linear regression line usually ranges from minus infinity to infinity
estimates probabilities
cut-off point (usually 0.5)

a logistic regression model predicts ..
1 if x^T θ is positive and 0 if it is negative
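The prediction rule above can be sketched in a few lines (function names are illustrative): the sigmoid maps x^T θ into (0, 1), and with the usual 0.5 cut-off, predicting 1 is equivalent to x^T θ being non-negative.

```python
import math

def sigmoid(z):
    """Squash a real score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x, cutoff=0.5):
    """Logistic regression: p = sigmoid(theta^T x); predict 1 iff p >= cutoff.
    With cutoff 0.5 this equals predicting 1 iff theta^T x >= 0."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if sigmoid(z) >= cutoff else 0

predict([1.0, -2.0], [3.0, 1.0])  # z = 3 - 2 = 1 > 0, so predicts 1
```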
minimizing the binary cross entropy is equivalent to
maximizing the (Bernoulli) (log) likelihood of observing the data
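The equivalence can be checked numerically (the labels and probabilities below are made-up examples): summed binary cross-entropy equals the negative Bernoulli log-likelihood, so minimizing one maximizes the other.

```python
import math

def bce(y, p):
    """Binary cross-entropy for one example with label y and predicted prob p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def neg_log_likelihood(ys, ps):
    """Negative Bernoulli log-likelihood of observing labels ys given probs ps."""
    return -sum(math.log(p if y == 1 else 1 - p) for y, p in zip(ys, ps))

ys, ps = [1, 0, 1], [0.9, 0.2, 0.7]
total_bce = sum(bce(y, p) for y, p in zip(ys, ps))
# total_bce equals neg_log_likelihood(ys, ps)
```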
why is logistic regression considered a linear classification method ?
for a given cut-off, decision boundary / surface is linear

does a logistic regression have a way to predict interaction effects
no
multinomial logistic (softmax) regression
for more than two (k) classes
→ estimates the probability that instance x belongs to class k, given the scores of each class
→ probabilities across classes sum to 1

multinomial logistic (softmax) regression : for classification
the argmax operator returns the value of a variable that maximizes a function. In this equation, it returns the value of k that maximizes the estimated probability sigma(s(x))_k
loss : cross entropy (general form of binary cross-entropy)
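A minimal sketch of softmax classification (function names are illustrative assumptions): softmax turns per-class scores into probabilities that sum to 1, and argmax picks the class with the highest probability.

```python
import math

def softmax(scores):
    """sigma(s(x))_k = exp(s_k) / sum_j exp(s_j); probabilities sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(scores):
    """argmax_k sigma(s(x))_k: the class with the highest estimated probability."""
    probs = softmax(scores)
    return max(range(len(probs)), key=lambda k: probs[k])

probs = softmax([2.0, 1.0, 0.1])  # three class scores
predict_class([2.0, 1.0, 0.1])    # -> 0 (highest score wins)
```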

ridge regression
MSE function + ridge regularization term
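A sketch of the ridge objective (the helper name and the convention of excluding the intercept from the penalty are assumptions): the loss is the MSE plus alpha times the sum of squared coefficients, which shrinks coefficients and reduces test error at the cost of some training error.

```python
def ridge_loss(theta, X, y, alpha=1.0):
    """MSE plus the ridge penalty alpha * sum(theta_j^2).
    theta[0] is the intercept and is left out of the penalty."""
    n = len(y)
    mse = sum(
        (sum(t * xi for t, xi in zip(theta, [1.0] + list(x))) - yi) ** 2
        for x, yi in zip(X, y)
    ) / n
    penalty = alpha * sum(t * t for t in theta[1:])
    return mse + penalty

# A perfect fit (y = x) still pays the penalty on the slope:
ridge_loss([0.0, 1.0], [[1.0], [2.0]], [1.0, 2.0], alpha=0.5)  # MSE 0 + 0.5*1^2
```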
advantages of logistic regression
works well; a workhorse of data mining
computationally not demanding
provides comprehensible linear model
provides probabilities
disadvantages of logistic regression
non-linear models can improve performance, as standard logistic regression is unable to capture non-linearities in the data
non-linear effects in logistic regression
interaction effects
polynomial features
→ can be added, but this is impractical and reduces interpretability