Correct parameter-tuning protocol
Split data into train, validation, and test sets.
Tune on validation, and use the test set only once at the very end to report the final score.
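A minimal sketch of this protocol in NumPy (the 70/15/15 split and the dataset shapes are made up for illustration):

```python
import numpy as np

# Hypothetical dataset: 1000 examples, 32 features, 10 classes (made-up numbers).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))
y = rng.integers(0, 10, size=1000)

# Shuffle, then carve out 70% train / 15% validation / 15% test.
idx = rng.permutation(len(X))
n_train, n_val = int(0.7 * len(X)), int(0.15 * len(X))
train_idx = idx[:n_train]
val_idx = idx[n_train:n_train + n_val]
test_idx = idx[n_train + n_val:]

# Tune hyperparameters by evaluating on the validation split;
# evaluate on the test split exactly once, for the final reported score.
```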
Softmax loss
A function that measures how bad a model’s predictions are in a multi-class classification problem by comparing predicted probabilities to the actual class
Purpose of regularization in deep learning
To prevent the model from memorizing the training data (overfitting), helping it perform better on new, unseen data
Difference between L1 and L2 regularization
L1 regularization pushes some model weights to become exactly zero, effectively selecting important features
L2 regularization forces weights to be small but rarely exactly zero
Common regularization techniques (besides L1/L2)
Dropout (randomly turns off neurons during training)
Batch Normalization (stabilizes and speeds up training)
k-NN vs a linear classifier
Compared with a linear classifier, k-NN is slower at prediction time, requires no training, and uses more memory because it stores the entire training dataset
Numeric vs Analytic gradients
The numeric gradient is easy to implement but slow and only an approximation
The analytic gradient is fast and exact, but it is easier to make coding mistakes when deriving it
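A small sketch of a gradient check comparing the two, using the toy loss f(w) = Σ w², whose analytic gradient is 2w:

```python
import numpy as np

def f(w):
    return np.sum(w ** 2)

def numeric_gradient(f, w, h=1e-5):
    # Centered-difference approximation: simple but slow (two evaluations per weight).
    grad = np.zeros_like(w)
    for i in range(w.size):
        old = w.flat[i]
        w.flat[i] = old + h
        fp = f(w)
        w.flat[i] = old - h
        fm = f(w)
        w.flat[i] = old
        grad.flat[i] = (fp - fm) / (2 * h)
    return grad

w = np.random.randn(5)
analytic = 2 * w                            # exact gradient derived by hand
numeric = numeric_gradient(f, w)
print(np.max(np.abs(analytic - numeric)))   # tiny difference -> the analytic code is likely correct
```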
Learning-rate schedules
A pre-defined plan for how the learning rate changes during training, such as gradually decreasing it over time to improve convergence
Normalization Layers (BatchNorm, LayerNorm, InstanceNorm)
Techniques to standardize the inputs to a layer, which helps to speed up and stabilize the training of deep neural networks
Residual connections (ResNet)
A “shortcut” that skips some layers, allowing the model to easily learn to do nothing if a layer is not useful, which helps in training very deep networks
Problem with naive weight initialization
Initializing all weights to very small or very large random values can cause the signals and gradients to shrink or explode as they pass through the layers, making the network very difficult to train
Limitation of Xavier initialization
It assumes roughly linear (symmetric) activations, so with ReLU, which zeroes half of its inputs, the activation variance shrinks layer by layer and signals can effectively die out (outputs collapse toward 0) in deep networks
Best initialization for ReLU-based networks
Kaiming initialization
Because it is specifically designed to work with the properties of the ReLU activation function
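A sketch of Kaiming initialization for a fully connected ReLU layer (the layer sizes are made up); weights are scaled by √(2 / fan_in) to preserve activation variance under ReLU:

```python
import numpy as np

fan_in, fan_out = 512, 256                                    # hypothetical layer sizes
W = np.random.randn(fan_in, fan_out) * np.sqrt(2.0 / fan_in)  # Kaiming (He) scaling
print(W.std())   # roughly sqrt(2 / 512) ≈ 0.0625, the ReLU-aware standard deviation
```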
Hinge loss
A loss function used for training classifiers that aims to ensure correct predictions are made with a confident margin
Effective receptive field of three 3×3 conv layers
The same as a single 7×7 convolution layer
Why stack small 3×3 convolutions?
It uses fewer parameters than a single large kernel (like 7×7) and allows for more non-linear activation functions, making the network more powerful and efficient
Role of a loss function
To measure how far off the model’s predictions are from the correct answers, guiding the model on how to adjust its weights
Purpose of a non-linear activation function
It allows the neural network to learn complex patterns and relationships in the data that a simple linear model cannot
Benefit of pooling layer in CNNs
It reduces the size of the feature maps, which makes the computation faster and helps the network become more robust to the exact position of objects in an image
Two common types of pooling
Max Pooling (takes maximum value in a window)
Average Pooling (takes average value)
Final layer in a classification CNN
A fully connected layer is typically added at the end to take the high-level features learned by the CNN and use them to make the final classification
Weight update rule
An algorithm, like gradient descent, that adjusts the weights of the network in the direction that reduces the loss function
Effect of a large learning rate
It can cause the optimization to overshoot the ideal solution and bounce around, possibly preventing the model from converging
Vanishing gradients
A problem in very deep networks where the gradient becomes extremely small, causing the weights in the early layers to stop updating, effectively halting learning
How to lessen vanishing gradients
Use architectures like ResNet with residual connections, employ proper weight initialization (like Kaiming), and use normalization layers (like BatchNorm)
Softmax Loss / Cross-Entropy Loss
L_i = -log( e^{s_{y_i}} / Σ_j e^{s_j} )
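A NumPy sketch of this loss for one example (the max-subtraction is only for numerical stability; the scores are toy values):

```python
import numpy as np

def softmax_loss(scores, y):
    # scores: raw class scores for one example; y: index of the correct class
    shifted = scores - np.max(scores)                       # stability trick
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))   # log of the softmax probabilities
    return -log_probs[y]                                    # L_i = -log p(correct class)

print(softmax_loss(np.array([3.2, 5.1, -1.7]), y=0))
```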
Hinge Loss (Multiclass SVM), single example
L_i = Σ_{j≠y_i} max(0, s_j - s_{y_i} + 1)
Hinge Loss Average
L = (1/N) Σ_{i=1}^{N} L_i
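A vectorized NumPy sketch of the averaged multiclass hinge loss (the score matrix below is a toy example):

```python
import numpy as np

def hinge_loss(scores, y):
    # scores: (N, C) class scores; y: (N,) indices of the correct classes
    N = scores.shape[0]
    correct = scores[np.arange(N), y][:, None]           # s_{y_i} for each example
    margins = np.maximum(0, scores - correct + 1.0)      # margin of 1
    margins[np.arange(N), y] = 0                         # do not count the correct class
    return margins.sum() / N                             # L = (1/N) Σ_i L_i

scores = np.array([[3.2, 5.1, -1.7],
                   [1.3, 4.9,  2.0]])
print(hinge_loss(scores, y=np.array([0, 1])))            # 1.45 for these toy scores
```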
Total Squared Error
E_total = Σ ½ (target - output)²
Convolution Layer Parameters (weights)
K × K × C_in × C_out
K: kernel size
C_in: number of input channels
C_out: number of output channels (number of filters)
FC Layer Params
Input size × output size (weights) + output size (biases)
Sequential Stacking of Convs (e.g., three 3×3 layers)
If every layer maps C → C channels, the total weight count is:
3 × (3 × 3 × C × C) = 27C²
ResNet (Residual Block Relation)
y = F(x) + x
output = a transformation of the input + the original input
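A minimal sketch of the residual relation, with F(x) standing in for a small ReLU layer (the weights here are made up); when F(x) ≈ 0 the block reduces to the identity:

```python
import numpy as np

def residual_block(x, W):
    F = np.maximum(0, x @ W)   # stand-in transformation F(x): ReLU of a linear map
    return F + x               # y = F(x) + x

x = np.random.randn(4, 8)
W = np.zeros((8, 8))                           # F(x) = 0 everywhere
print(np.allclose(residual_block(x, W), x))    # True: the block simply passes x through
```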
ReLU Activation
a = max(0, z)
z: pre-activation input
Weight Update Rule (Gradient Descent)
W_new = W_old - η ∂E/∂W
η: learning rate
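A sketch of this update on the toy loss E(W) = ||W||², whose gradient is 2W (the learning rate and starting point are arbitrary):

```python
import numpy as np

eta = 0.1
W = np.array([3.0, -2.0])
for step in range(100):
    grad = 2 * W            # ∂E/∂W for E(W) = ||W||^2
    W = W - eta * grad      # step in the direction that reduces E
print(W)                    # approaches [0, 0], the minimizer of this toy loss
```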
Effective Receptive Field
For a stack of three 3×3 convolutions with stride 1, the effective receptive field is 7×7
(for L layers: 1 + L(K - 1), where K = 3 for a 3×3 kernel)
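A quick check of the formula; the two calls below match the two cases covered in this set:

```python
def receptive_field(L, K=3):
    # effective receptive field of L stacked stride-1 convolutions with kernel size K
    return 1 + L * (K - 1)

print(receptive_field(L=3, K=3))   # 7 -> three 3x3 layers see a 7x7 region
print(receptive_field(L=1, K=7))   # 7 -> one 7x7 layer sees the same region
```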
L1 vs L2 regularization specifics
L1: loss + λ Σ_{i=1}^{n} |W_i|
L2: loss + λ Σ_{i=1}^{n} W_i²
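A sketch of adding these penalties to a data loss (λ is written as lam; the weight matrix and the loss value are placeholders):

```python
import numpy as np

def l1_penalty(W, lam):
    return lam * np.sum(np.abs(W))   # pushes some weights to exactly zero

def l2_penalty(W, lam):
    return lam * np.sum(W ** 2)      # shrinks weights toward zero, rarely exactly zero

W = np.random.randn(4, 3)
data_loss = 1.0                      # placeholder for the unregularized loss
total_l1 = data_loss + l1_penalty(W, lam=1e-3)
total_l2 = data_loss + l2_penalty(W, lam=1e-3)
```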
Correct statements about learning-rate schedules
Warm-up increases LR linearly from a small value at the start
Exponential decay multiplies the LR by a fixed factor every epoch
Time-based decay reduces LR gradually as num epochs increases
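A sketch of the three schedules as plain functions of the epoch (the base LR, warm-up length, decay factor, and k are made-up constants):

```python
base_lr = 0.1

def warmup(epoch, warmup_epochs=5):
    # linear warm-up from a small value up to base_lr over the first few epochs
    return base_lr * min(1.0, (epoch + 1) / warmup_epochs)

def exponential_decay(epoch, factor=0.95):
    # multiply the LR by a fixed factor every epoch
    return base_lr * (factor ** epoch)

def time_based_decay(epoch, k=0.1):
    # LR shrinks gradually as the number of epochs grows
    return base_lr / (1.0 + k * epoch)

for e in range(3):
    print(warmup(e), exponential_decay(e), time_based_decay(e))
```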
Why normalization layers are needed
They reduce internal covariate shift (the changing distribution of layer inputs during training) and allow the use of higher learning rates
BatchNorm2D Axes
Computes statistics over
N (batch), H (height), and W (width) axes
Normalizes
each channel C independently, using statistics from the entire batch
LayerNorm Axes
Computes statistics over
C (channel), H (height), W (width)
Normalizes
within a single sample
InstanceNorm Axes
Computes statistics over
H (height) and W (width) axes
Normalizes
within a single sample and single channel
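A NumPy sketch of the reduction axes for an NCHW tensor, matching the three cards above (the tensor shape is made up):

```python
import numpy as np

x = np.random.randn(8, 3, 16, 16)   # (N, C, H, W)

# BatchNorm2d: statistics over N, H, W -> one mean/var per channel
bn_mean = x.mean(axis=(0, 2, 3), keepdims=True)   # shape (1, C, 1, 1)

# LayerNorm: statistics over C, H, W -> one mean/var per sample
ln_mean = x.mean(axis=(1, 2, 3), keepdims=True)   # shape (N, 1, 1, 1)

# InstanceNorm: statistics over H, W -> one mean/var per sample and channel
in_mean = x.mean(axis=(2, 3), keepdims=True)      # shape (N, C, 1, 1)
```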
Effective receptive field of single 7×7 conv
7×7
Parameter Count: Three 3×3 stack (C to C)
3 × (3 × 3 × C × C) = 27C²
Parameter Count: Single 7×7 conv (C to C)
1 × (7 × 7 × C × C) = 49C²
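The arithmetic for a hypothetical C = 64:

```python
C = 64
three_3x3 = 3 * (3 * 3 * C * C)    # 27 * C^2 = 110,592 weights
single_7x7 = 1 * (7 * 7 * C * C)   # 49 * C^2 = 200,704 weights
print(three_3x3, single_7x7)       # the 3x3 stack needs far fewer parameters for the same 7x7 receptive field
```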
Usefulness of a pooling layer
Reduces feature map size
Lowers computational cost and memory
Provides a degree of translation invariance
Cause of Vanishing Gradients
Repeated multiplication of small gradients through many layers during backpropagation