VL IX - CNN Architecture

Last updated 2:53 PM on 6/22/26

31 Terms

New cards

LeNet Activation function

Tanh/sigmoid

New cards

Common performance metrics

Top-1 score
Top-5 score
Top-5 error (1 minus the top-5 score)
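
A minimal sketch of how these metrics can be computed, assuming class scores are given as a NumPy array with one row per sample. The function name `top_k_score` and the example scores are made up for illustration.

```python
import numpy as np

def top_k_score(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scored classes."""
    # Indices of the k largest scores per row (order inside the k is irrelevant).
    top_k = np.argsort(scores, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return hits.mean()

# Hypothetical scores for 3 samples and 6 classes.
scores = np.array([
    [0.10, 0.50, 0.20, 0.05, 0.12, 0.03],
    [0.30, 0.10, 0.25, 0.20, 0.11, 0.04],
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],
])
labels = np.array([1, 2, 5])

top1 = top_k_score(scores, labels, k=1)
top5 = top_k_score(scores, labels, k=5)
top5_error = 1.0 - top5
```

Only the first sample is a top-1 hit, while the first two samples are top-5 hits, so the top-5 score is higher than the top-1 score.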

New cards

Where are most parameters in a CNN

In the fully connected (FC) layers
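
A rough parameter count shows why. The layer sizes below are an assumption chosen in the style of VGG-16 (a 3×3 convolution with 512 input and output channels, and a first FC layer mapping a 7×7×512 feature map to 4096 units); the helper names are illustrative.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus one bias per filter for a kernel x kernel convolution."""
    return (kernel * kernel * c_in + 1) * c_out

def fc_params(n_in, n_out):
    """Weights plus one bias per output unit for a fully connected layer."""
    return (n_in + 1) * n_out

conv = conv_params(3, 512, 512)       # 2,359,808 parameters
fc = fc_params(7 * 7 * 512, 4096)     # 102,764,544 parameters
print(conv, fc, round(fc / conv, 1))
```

Convolution weights are shared across all positions, while an FC layer needs one weight per input-output pair, so a single FC layer here holds over 40 times as many parameters as the convolution.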

New cards

Activation Function AlexNet

ReLU

New cards

What does VGGNet change

Striving for simplicity
3×3 filters with stride 1
Pooling: 2×2 with stride 2

New cards

Structure VGGNet

Conv block → Pool → Conv block → Pool → … → FC (each block stacks several 3×3 convolutions)

New cards

Which VGGNet is best and why

VGG-16: as more layers are added, gradients vanish because of floating-point inaccuracy, so deeper variants bring little further gain

New cards

Why use Skip Connections

Improving gradient flow
Preserving information
Making optimization easier

New cards

Which convolution is used in skip connection

Same convolution, so the input and output have equal dimensions and can be added; otherwise the shortcut is converted with zero padding or a matrix of learned weights

New cards

Why do ResNets Work

Identity is easy for the residual block to learn

So adding a residual block should not hurt performance (at worst it learns the identity)
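
A minimal sketch of this argument, assuming a residual block built from two dense layers with ReLU (a simplification of the convolutional case). With all weights and biases at zero the block passes its non-negative input through unchanged, which is why the identity is easy to learn.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(a, w1, b1, w2, b2):
    """Two layers plus a skip connection: relu(F(a) + a)."""
    z = w2 @ relu(w1 @ a + b1) + b2
    return relu(z + a)

# Input is itself the output of an earlier ReLU, so it is non-negative.
a = np.array([0.5, 0.0, 2.0])
zero_w = np.zeros((3, 3))
zero_b = np.zeros(3)

out = residual_block(a, zero_w, zero_b, zero_w, zero_b)
```

Here `out` equals `a`: pushing the weights towards zero (for example through weight decay) turns the block into the identity.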

New cards

Why use 1×1 Convolutions

Shrinks the number of channels
Also adds a non-linearity → the network can learn more complex functions
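
A 1×1 convolution can be sketched as the same small dense layer applied to the channel vector of every pixel. The shapes (28×28×192 input, 32 output channels) and the name `conv1x1` are assumptions for illustration.

```python
import numpy as np

def conv1x1(x, w, b):
    """x: (H, W, C_in), w: (C_in, C_out), b: (C_out,). Mixes channels per pixel, then ReLU."""
    return np.maximum(np.einsum("hwc,cd->hwd", x, w) + b, 0.0)

rng = np.random.default_rng(0)
x = rng.normal(size=(28, 28, 192))
w = rng.normal(size=(192, 32))
b = np.zeros(32)

y = conv1x1(x, w, b)
print(x.shape, "->", y.shape)
```

Height and width stay the same; only the number of channels shrinks from 192 to 32.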

New cards

What's an Inception layer

Layer that applies several different convolutions in parallel and combines their results into one output

New cards

Why use inception layers

Lets the network decide which convolution works best

New cards

Why use 1×1 convolution in Inception layers

Reduces the number of multiplications: the 1×1 convolution shrinks the channels before the more expensive larger convolution
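
The saving can be checked with a small calculation. The sizes are assumed for illustration: a 28×28×192 input, a 5×5 same convolution with 32 filters, and a 1×1 bottleneck that first reduces the input to 16 channels.

```python
def conv_multiplications(h, w, c_in, c_out, kernel):
    """Multiplications for a same convolution producing an h x w x c_out output."""
    return h * w * c_out * kernel * kernel * c_in

# Direct 5x5 convolution: 192 -> 32 channels.
direct = conv_multiplications(28, 28, 192, 32, kernel=5)

# With a bottleneck: 1x1 convolution 192 -> 16, then 5x5 convolution 16 -> 32.
reduce_step = conv_multiplications(28, 28, 192, 16, kernel=1)
conv_step = conv_multiplications(28, 28, 16, 32, kernel=5)
bottleneck = reduce_step + conv_step

print(direct, bottleneck, round(direct / bottleneck, 1))
```

Under these assumptions the direct convolution needs about 120 million multiplications and the bottleneck version about 12.4 million, roughly a tenth.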

New cards

What's Xception Net

Applies depthwise separable convolutions

Conv layers are structured into several modules with skip connections
Each depthwise filter is applied to a single channel (one depth slice) only

New cards

Why use Xception nets

Reduction of multiplications → fewer computations
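
A sketch of the multiplication count for a depthwise separable convolution (one k×k filter per input channel, followed by a 1×1 convolution that mixes channels) against a standard convolution. The sizes (28×28 output, 3×3 kernel, 64 → 128 channels) are assumed for illustration.

```python
def standard_mults(h, w, c_in, c_out, k):
    """Every output channel looks at every input channel with a k x k filter."""
    return h * w * c_out * k * k * c_in

def separable_mults(h, w, c_in, c_out, k):
    """Depthwise step (one k x k filter per channel) plus pointwise 1x1 step."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

standard = standard_mults(28, 28, 64, 128, k=3)
separable = separable_mults(28, 28, 64, 128, k=3)
ratio = separable / standard   # equals 1/c_out + 1/k**2

print(standard, separable, round(ratio, 3))
```

For these sizes the separable version needs about 12% of the multiplications of the standard convolution.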

New cards

Types of Upsampling

Interpolation
Transposed Conv
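
Both types can be sketched in a few lines of NumPy. The example assumes nearest-neighbour interpolation as the interpolation variant and a 1D transposed convolution without padding; the function names are illustrative.

```python
import numpy as np

def nearest_upsample(x, factor):
    """Interpolation (nearest neighbour): repeat every pixel factor times per axis."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def transposed_conv1d(x, w, stride):
    """Each input value scales the filter; overlapping copies are summed."""
    out = np.zeros((len(x) - 1) * stride + len(w))
    for i, value in enumerate(x):
        out[i * stride : i * stride + len(w)] += value * w
    return out

up = nearest_upsample(np.array([[1, 2], [3, 4]]), factor=2)
tc = transposed_conv1d(np.array([1.0, 2.0]), np.array([1.0, 1.0, 1.0]), stride=2)
```

Interpolation has no parameters, whereas the filter `w` of the transposed convolution would be learned during training.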

New cards

Left side of U-Net

Contraction path (encoder)

New cards

What does the contraction path do

Captures context of the image
Follows architecture of CNN

New cards

Right side of U-net

Expansion path (decoder)

New cards

What does the expansion path do

Upsampling to recover spatial locations for assigning class labels to each pixel
Width (spatial size) goes up, depth (number of channels) goes down

New cards

How does the feature map change the deeper we go in a CNN

The width (spatial size) goes down, but the depth (number of channels) goes up
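
A small shape trace illustrates this, using the standard output-size formula floor((n + 2p - f) / s) + 1. The setup is an assumption in the style of VGGNet: a 224×224 input, 3×3 same convolutions with stride 1, 2×2 pooling with stride 2, and channel counts 64, 128, 256, 512, 512.

```python
def conv_out(n, f, stride=1, pad=0):
    """Output width for input width n, filter size f, given stride and padding."""
    return (n + 2 * pad - f) // stride + 1

size = 224
shapes = []
for channels in (64, 128, 256, 512, 512):
    size = conv_out(size, f=3, stride=1, pad=1)   # same convolution keeps the width
    size = conv_out(size, f=2, stride=2)          # pooling halves the width
    shapes.append((size, size, channels))

for shape in shapes:
    print(shape)
```

The trace runs from (112, 112, 64) down to (7, 7, 512): the width shrinks at every pooling step while the depth grows.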

New cards

Challenges in detection

Convolutions are translation-equivariant
- the same filter slides across all positions
- shifting the input image → the output shifts by the same amount

Multiple objects need to be detected

Identifying the "what" makes locating the "where" difficult
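
The translation equivariance mentioned in this card can be checked numerically. This sketch assumes a 1D signal and a circular (wrap-around) cross-correlation so that the equality holds exactly at the borders; the signal and filter values are made up.

```python
import numpy as np

def circular_correlate(x, w):
    """Slide the same filter w over every position of x, wrapping around the ends."""
    n = len(x)
    return np.array([
        sum(w[k] * x[(i + k) % n] for k in range(len(w)))
        for i in range(n)
    ])

x = np.array([0.0, 1.0, 3.0, 0.0, 0.0, 2.0, 0.0, 0.0])
w = np.array([1.0, -1.0, 0.5])
shift = 3

shift_then_filter = circular_correlate(np.roll(x, shift), w)
filter_then_shift = np.roll(circular_correlate(x, w), shift)
```

Both orders give the same result, so the feature map alone does not fix where an object is; the detector has to recover the position separately.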

New cards

What does R-CNN stand for

Region-Based CNNs

New cards

What is R-CNN

A fixed number of candidate boxes (region proposals) is placed over the image
The boxes hopefully cover promising features
Each box has to be run through the CNN separately

New cards

Difference between Fast R-CNN and R-CNN

Runs the CNN once on the entire image
Uses region of interest (RoI) pooling to extract fixed-size feature maps for each region

New cards

What's region of interest pooling

Rescales feature maps of variable size to a fixed-size output using max pooling
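
A minimal sketch of the idea for a single-channel feature map, assuming integer region coordinates and a region that is at least as large as the output grid. The function name `roi_max_pool` and the bin-splitting rule are illustrative choices, not a specific library's implementation.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_size=(2, 2)):
    """feature_map: (H, W); roi: (y0, x0, y1, x1) with exclusive end coordinates."""
    y0, x0, y1, x1 = roi
    region = feature_map[y0:y1, x0:x1]
    out_h, out_w = out_size
    # Split the region into an out_h x out_w grid of roughly equal bins.
    ys = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    xs = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    out = np.empty(out_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Max pooling inside each bin.
            out[i, j] = region[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max()
    return out

feature_map = np.arange(36).reshape(6, 6)
pooled_a = roi_max_pool(feature_map, (0, 0, 4, 6))   # 4 x 6 region
pooled_b = roi_max_pool(feature_map, (1, 1, 6, 4))   # 5 x 3 region
```

Both regions differ in size, yet both outputs are 2×2, so they can be fed to the same FC layers.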

New cards

How does Faster R-CNN work

A region proposal network (RPN) slides over the feature map and decides whether an interesting feature is present

New cards

What's an anchor (detection)

Potential bounding box where an object can be detected

New cards

Difference between Faster R-CNN and Mask R-CNN

After the bounding boxes are found, it is additionally determined for each pixel in a bounding box whether it belongs to the object

New cards

What's instance segmentation

Pixel-wise mask inside a bounding box