Nonlinear
Means that you can't accurately predict a label with a linear model (a simple weighted sum of the features)
Neural Networks
A family of model architectures designed to find nonlinear patterns in data; the model automatically learns which feature crosses to perform
Hidden Layers
Additional layers between the input and output layer
Neurons
The nodes in the hidden layers
Nonlinear mathematical operations
Adding these to a neural network lets it learn nonlinear relationships between values
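A minimal Keras sketch of these ideas (assuming TensorFlow is installed; the layer sizes and activations are illustrative): a network with hidden layers of neurons whose nonlinear activations let it learn nonlinear relationships.

```python
import tensorflow as tf

# Illustrative network: two hidden layers of neurons with nonlinear
# (ReLU) activations between a 10-feature input and a single output.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(16, activation="relu"),   # hidden layer 1
    tf.keras.layers.Dense(8, activation="relu"),    # hidden layer 2
    tf.keras.layers.Dense(1, activation="sigmoid")  # output layer
])
model.summary()
```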
Activation Function
A function that enables neural networks to learn nonlinear (complex) relationships between features and label
Sigmoid
TanH
ReLU
Sigmoid Function
A mathematical function that squishes an input value into a constrained range, typically 0 to 1
It converts the raw output of a logistic regression model to a probability
Acting as an activation function in some neural networks
The term generally refers to any S-shaped function
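A minimal plain-Python sketch of the sigmoid formula, sigma(x) = 1 / (1 + e^-x):

```python
import math

def sigmoid(x):
    """Squash x into the range (0, 1): sigma(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(-4.0), sigmoid(0.0), sigmoid(4.0))  # ~0.018, 0.5, ~0.982
```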
TanH function
It's a mathematical function that squishes an input value into the range -1 to +1
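A tiny sketch using Python's math module, showing the -1 to +1 range:

```python
import math

# tanh(x) = (e^x - e^-x) / (e^x + e^-x), bounded between -1 and +1
print(math.tanh(-3.0), math.tanh(0.0), math.tanh(3.0))  # ~-0.995, 0.0, ~0.995
```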
ReLU
It is an activation function that transforms its input using the following rule (see the sketch below):
If the input value x is less than 0, return 0
If the input value x is greater than or equal to 0, return the input value
It is less susceptible to the vanishing gradient problem during NN training
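A direct translation of that rule into plain Python (a sketch, not tied to any library):

```python
def relu(x):
    """Return 0 for negative inputs, otherwise return the input unchanged."""
    return 0.0 if x < 0 else x

print(relu(-2.5), relu(0.0), relu(3.7))  # 0.0 0.0 3.7
```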
Vanishing Gradient Problem
The tendency for the gradients of early hidden layers of some deep NN to become low or flat. Very low gradients result in smaller changes to weights on nodes, leading to little or no learning.
This is when gradient values approach 0 for the lower layers
ReLU activation functions can help prevent this
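A toy numeric illustration (pure Python; the numbers are made up): backpropagation multiplies local derivatives layer by layer, so many small factors shrink the gradient reaching the early layers toward 0.

```python
# Hypothetical per-layer derivative of a saturating activation (sigmoid's
# derivative is at most 0.25). Chaining many layers multiplies these factors.
local_derivative = 0.25
gradient = 1.0
for layer in range(10):          # 10 hidden layers, illustrative
    gradient *= local_derivative
print(gradient)                  # 0.25**10 ~= 9.5e-07 -> almost no learning
```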
Backpropagation
This is the algorithm that implements gradient descent in NNs
Keras now implements this for you
Intuitively, it adjusts the weights on each input so the network arrives at a better output
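A hedged Keras sketch (assuming TensorFlow is installed; the data is synthetic): compile() and fit() run gradient descent with backpropagation internally, so the weight adjustments happen for you.

```python
import numpy as np
import tensorflow as tf

# Toy data: 200 examples with 10 features each, binary labels.
features = np.random.rand(200, 10).astype("float32")
labels = np.random.randint(0, 2, size=(200, 1))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# fit() computes gradients via backpropagation and updates the weights;
# you never code the weight updates by hand.
model.compile(optimizer="sgd", loss="binary_crossentropy")
model.fit(features, labels, epochs=5, batch_size=32, verbose=0)
```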
Exploding Gradients
If the weights in a network are very large, then the gradients for the lower layers involve products of many large terms.
This is when gradients get too large for training to converge
Batch normalization and lowering the learning rate can help prevent this
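A hedged Keras sketch of both mitigations (layer sizes and the learning rate are illustrative): BatchNormalization layers between Dense layers, plus a smaller learning rate on the optimizer.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.BatchNormalization(),   # keeps activations in a stable range
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# A smaller learning rate also keeps gradient updates from blowing up.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
              loss="binary_crossentropy")
```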
Dead ReLU Units
Once the weighted sum for a ReLU unit falls below 0, the ReLU unit can get stuck. It outputs 0, contributing nothing to the output, and gradients can no longer flow through it during backpropagation.
Lowering the learning rate can keep ReLU units from falling into this problem
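A tiny sketch of why the unit gets stuck: when the weighted sum is negative, ReLU's gradient is 0, so no gradient flows back and the weights feeding the unit stop updating.

```python
def relu_grad(weighted_sum):
    """Derivative of ReLU with respect to its input: 1 if positive, else 0."""
    return 1.0 if weighted_sum > 0 else 0.0

print(relu_grad(-0.3))  # 0.0 -> no gradient flows back, the unit stays stuck
print(relu_grad(0.7))   # 1.0 -> gradient flows normally
```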
Dropout Regularization
It is another form of regularization that works by randomly dropping out unit activations in a network for a single gradient step. The more you drop out, the stronger the regularization
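A hedged Keras sketch: a Dropout layer with rate 0.3 randomly zeroes 30% of the previous layer's activations on each gradient step (the rate is illustrative; higher rates regularize more strongly).

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),   # drop 30% of activations per gradient step
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
```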
Multi-class classification model
A classification model that can pick from more than two possible classes
One vs all
It uses binary classification for a series of yes-or-no predictions (sketched below); given a picture of fruit, training asks: is it an image of fruit 1? Is it an image of fruit 2? And so on
Increasingly inefficient as the number of classes rises
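A hedged scikit-learn sketch of one-vs-all (assuming scikit-learn is installed; the data is synthetic): one binary yes/no classifier is trained per class.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic data: 300 examples, 4 features, 3 fruit classes (0, 1, 2).
X = np.random.rand(300, 4)
y = np.random.randint(0, 3, size=300)

# Trains one yes/no (binary) classifier per class under the hood.
clf = OneVsRestClassifier(LogisticRegression()).fit(X, y)
print(clf.predict(X[:5]))
```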
One vs one (softmax)
It uses the same architecture as one-vs-all, but the difference is that it applies a softmax activation as the output transform, so the class probabilities sum to 1
Full softmax
The variant of softmax that calculates a probability for every class
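A minimal NumPy sketch of full softmax: it turns raw scores (logits) for every class into probabilities that sum to 1.

```python
import numpy as np

def softmax(logits):
    """Probability for every class: exp(z_i) / sum_j exp(z_j)."""
    exps = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return exps / exps.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # e.g. ~[0.66, 0.24, 0.10]
```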
Candidate sampling
This means that softmax calculates a probability for all the positive labels but only for a random sample of negative labels.
This can improve efficiency in problems having a large number of classes
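A rough NumPy illustration of the idea (not a real training loss; the class count and sample size are made up): score the positive class plus only a random sample of negative classes instead of all of them.

```python
import numpy as np

num_classes = 100_000          # full softmax would score every one of these
positive_class = 42
num_sampled_negatives = 10     # score only a handful of negatives instead

logits = np.random.randn(num_classes)   # pretend model outputs
negatives = np.random.choice(
    [c for c in range(num_classes) if c != positive_class],
    size=num_sampled_negatives, replace=False)

sampled = np.concatenate(([positive_class], negatives))
exps = np.exp(logits[sampled] - logits[sampled].max())
probs = exps / exps.sum()      # softmax over positive + sampled negatives only
print(probs[0])                # approximate probability of the positive class
```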
Embedding
Is a vector representation of data in embedding space.
A model finds potential ___ by projecting the high-dimensional space of initial data vectors into a lower-dimensional space.
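A hedged Keras sketch: an Embedding layer maps integer IDs from a 10,000-item vocabulary into learned 8-dimensional vectors (both sizes are illustrative).

```python
import numpy as np
import tensorflow as tf

# Map integer word IDs from a 10,000-word vocabulary into 8-dimensional vectors.
embedding = tf.keras.layers.Embedding(input_dim=10_000, output_dim=8)

word_ids = np.array([[3, 74, 912]])   # a "sentence" of three word IDs
vectors = embedding(word_ids)
print(vectors.shape)                  # (1, 3, 8): one 8-d vector per word
```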
Word2vec
Technique for creating vector representations of words
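A hedged gensim sketch (assuming gensim 4.x is installed; the corpus is a toy): training word2vec on tokenized sentences yields one static vector per word.

```python
from gensim.models import Word2Vec

sentences = [["the", "dog", "chased", "the", "ball"],
             ["the", "cat", "chased", "the", "mouse"]]

# vector_size, window, and min_count are illustrative hyperparameters.
model = Word2Vec(sentences, vector_size=32, window=2, min_count=1)
print(model.wv["dog"].shape)   # (32,): the learned vector for "dog"
```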
Dimensionality Reduction Techniques
This is a common way to get embeddings: start from bag-of-words vectors, then find the most important patterns and keep those
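A hedged scikit-learn sketch of that recipe: build bag-of-words vectors, then keep the strongest patterns with truncated SVD (the number of output dimensions is illustrative).

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD

docs = ["the dog chased the ball",
        "the cat chased the mouse",
        "stocks fell sharply today"]

bow = CountVectorizer().fit_transform(docs)        # sparse bag-of-words vectors
embeddings = TruncatedSVD(n_components=2).fit_transform(bow)
print(embeddings.shape)                            # (3, 2): one low-d vector per doc
```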
Static embedding
This embedding gives one vector per word, no matter the context, even though many words are ambiguous
Contextual embeddings
This embedding solves the static embedding problem by generating a different vector depending on the sentence the word appears in
Transformers
They produce contextual embeddings using attention and positional information
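A hedged Hugging Face sketch (assuming the transformers library, PyTorch, and the bert-base-uncased checkpoint are available): the same word gets a different vector depending on the sentence it appears in.

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

for sentence in ["I sat by the river bank.", "I deposited cash at the bank."]:
    inputs = tokenizer(sentence, return_tensors="pt")
    outputs = model(**inputs)
    # last_hidden_state holds one contextual vector per token;
    # the vector for "bank" differs between the two sentences.
    print(outputs.last_hidden_state.shape)
```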