Week 1 - Introduction to Deep Learning

29 Terms

1

Artificial Neural Network (ANN)

  • Consists of many computational units called neurons

  • Learns a complex mapping function that maps any input X to any output Y by training on large amounts of data

2

Deep Learning

An ANN with many layers, which allows it to learn increasingly complex concepts out of simpler concepts

3

Artificial Neuron

  • Multiplies each input feature by a corresponding weight and then adds these values together with a bias term

  • This value is then passed through a function called the activation function
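
A minimal NumPy sketch of this computation, assuming a sigmoid activation (the names `neuron`, `w`, and `b` are illustrative, not from the cards):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # Weighted sum of the input features plus the bias term ...
    z = np.dot(w, x) + b
    # ... passed through the activation function
    return sigmoid(z)

x = np.array([0.5, -1.2, 3.0])   # input features
w = np.array([0.4, 0.1, -0.7])   # one weight per feature
b = 0.2
print(neuron(x, w, b))           # a single value in (0, 1)
```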

4

Recurrent Network (RNN)

  • Neurons get feedback from their own outputs

  • Used for processing data with time or sequence information

  • Successfully used for machine translation, language modelling and time series prediction

5

Convolutional Neural Network

  • Neurons are connected to small sets of inputs only

  • Efficient for computer vision tasks such as object detection and image classification

6

Purpose of Activation Functions

  1. Firing Decision

    1. Helps to decide whether the neurons should fire or not - they fire only if they are relevant to the prediction (a mathematical gate)

  2. Bounded Values

    1. Some activation functions provide a bound to the output values. This provides more stability during training

  3. Non-linearity

    1. Introduce non-linearity to the network

    2. Most of the interesting problems in real life are non-linear in nature, which requires a non-linear neural network to handle them

7

Types of Activation Functions

  • Linear Function

  • Sigmoid/Logistic

  • Tanh (Hyperbolic Tangent)

  • Softmax

  • ReLU (Rectified Linear Unit)

  • Leaky ReLU

  • SELU (Scaled Exponential Linear Unit)

  • PReLU (Parametric Rectified Linear Unit)

  • Softplus

8

Linear Function

  • g(x) = x

  • Output is proportional to the input multiplied by the weight

  • Generally used for regression at the output layer

9

Sigmoid Function

  • g(x) = 1/(1+e^-x)

  • Output values bounded between 0 and 1

  • Generally used for binary prediction at the output layer

10

TanH Function

  • g(x) = tanh(x)

  • Output values bounded between -1 and 1

  • Default activation for RNN layers

11

Softmax Function

  • One output for each class

  • Values are normalized to between 0 and 1 such that the values for all classes sum to 1

  • This allows comparison and thus supports multi-class classification

  • Commonly used in output layer for a multi-class classifier with the cross-entropy loss function
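
A small NumPy demonstration of this normalization (the scores below are made-up raw outputs for three classes):

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability;
    # the result is mathematically unchanged
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])  # one raw score per class
probs = softmax(scores)
print(probs)        # approximately [0.659 0.242 0.099] - each in (0, 1)
print(probs.sum())  # 1.0 - values for all classes sum to 1
```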

12

ReLU (Rectified Linear Unit) Function

  • g(x) = max(0, x)

  • Looks like a linear function but is non-linear

  • It is the default activation function for hidden layers in many neural networks
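
A quick NumPy check of both claims, using made-up inputs:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Piecewise behaviour: zero for negative inputs, identity for positive ones
print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0. 0. 0. 1.5]

# Non-linearity: relu(a + b) != relu(a) + relu(b) in general
a, b = -1.0, 2.0
print(relu(a + b), relu(a) + relu(b))  # 1.0 vs 2.0
```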

13

Activation function to use for a regression problem at the output layer

Linear activation function

14

Activation function to use for binary classification at the output layer

Use sigmoid for the single output neuron in the output layer

15

Activation function to use for multiclass classification at the output layer

Use Softmax activation function, one output neuron per class

16

Activation function to use at the hidden layer

It is common to start with the ReLU activation function and try others to improve performance

17

Activation function to use at the input layer

No activation function

18

Backpropagation

  • A highly efficient algorithm that derives the optimal weight values in all the layers

  • It uses gradient descent and the chain rule to determine how to adjust the weights of each neuron in the network.

  • The weight adjustment starts from the output layer (where the error is calculated) and works back towards the input layer
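
A minimal sketch of one backward pass and weight update, assuming a single linear neuron (y_hat = w * x) with a squared-error loss; all numbers are made up:

```python
x, y_true = 2.0, 10.0   # one training sample
w, lr = 1.0, 0.1        # initial weight and learning rate

y_hat = w * x                    # forward pass
loss = (y_hat - y_true) ** 2     # error calculated at the output

# Chain rule: dloss/dw = dloss/dy_hat * dy_hat/dw = 2*(y_hat - y_true) * x
grad = 2 * (y_hat - y_true) * x

w = w - lr * grad                # gradient-descent update
print(loss, grad, w)             # 64.0 -32.0 4.2
```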

19

Loss Function

  • Measures the difference between the correct (ground truth) output and the predicted output

  • A loss function should return a high value for a bad prediction and a low value for a good prediction

20

Common types of loss functions

  • Binary Cross-entropy - for Binary Classification

  • Categorical Cross-entropy - for Multi-class Classification

  • Mean-Squared Error - for Regression

21

Cross Entropy Loss Function

  • Also called Log loss function

  • Depending on how you encode your target label, you will use either categorical_crossentropy or sparse_categorical_crossentropy in Keras

  • Assuming we have 3 different classes (0, 1, 2) and our target labels for two samples are [1, 2], we have two ways of representing the target labels:

    • One-hot-encoded target labels: y_true = [[0, 1, 0], [0, 0, 1]]

    • Integer target labels: y_true = [1, 2]
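
A sketch of both encodings with the Keras losses, assuming TensorFlow 2.x; the predicted probabilities below are made up:

```python
import numpy as np
import tensorflow as tf

# Made-up predicted probabilities for two samples over 3 classes
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6]])

# One-hot-encoded targets -> categorical_crossentropy
y_onehot = np.array([[0., 1., 0.],
                     [0., 0., 1.]])
cce = tf.keras.losses.CategoricalCrossentropy()
print(cce(y_onehot, y_pred).numpy())   # ~0.367

# Integer targets -> sparse_categorical_crossentropy
y_int = np.array([1, 2])
scce = tf.keras.losses.SparseCategoricalCrossentropy()
print(scce(y_int, y_pred).numpy())     # same value - only the label encoding differs
```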

22

Optimizer

Adjusts the weights based on the errors in the prediction (as measured by the loss function), using gradient descent

23

Training Epochs and Training Steps

When training a neural network, we usually feed the network a batch of samples, instead of a single sample at a time

24

Training Epoch

Refers to one iteration (forward pass + backward pass) over ALL training samples

25

Training step

Refers to one iteration (forward + backward pass) over a single batch of samples. Involves a gradient update of the weights
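
As a small worked example of how steps relate to epochs (the sample and batch counts are made up):

```python
import math

n_samples, batch_size = 1000, 32

# One step processes one batch; one epoch processes all samples,
# so an epoch here consists of ceil(1000 / 32) = 32 steps
steps_per_epoch = math.ceil(n_samples / batch_size)
print(steps_per_epoch)  # 32 (31 full batches plus one partial batch of 8)
```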

26

Learning Rate

Determines how fast the weights are adjusted by an optimizer during gradient descent

27

Width of a neural network

The number of units in a layer of a neural network

28

Depth of a neural network

The number of layers in a neural network

29

Mean-Squared Error loss function

Used for regression problems, where the output is a single continuous value. The output layer will have a single unit (for single-output prediction)
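
A minimal NumPy version of this loss, with made-up targets and predictions:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean of the squared differences between targets and predictions
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.5, 5.0, 4.0])
print(mse(y_true, y_pred))  # (0.25 + 0.0 + 2.25) / 3 = 0.8333...
```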