VL IX - CNN Architecture

Last updated 2:53 PM on 6/22/26

31 Terms

1

LeNet Activation function

Tanh/sigmoid

2

Common performance metrics

  • Top 1 Score

  • Top 5 score

  • Top 5 error (complement of the top 5 score: error = 1 − score)
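
A minimal sketch of how these metrics could be computed, in plain Python. The scores and labels are made-up illustration data, and `top_k_accuracy` is a hypothetical helper, not a library function.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Hypothetical class scores for 4 samples over 6 classes (made-up numbers).
scores = [
    [0.50, 0.20, 0.10, 0.10, 0.05, 0.05],  # true class 0: top 1 hit
    [0.30, 0.40, 0.10, 0.10, 0.05, 0.05],  # true class 0: top 1 miss, top 5 hit
    [0.10, 0.10, 0.60, 0.10, 0.05, 0.05],  # true class 2: top 1 hit
    [0.30, 0.25, 0.20, 0.15, 0.09, 0.01],  # true class 5: missed even in the top 5
]
labels = [0, 0, 2, 5]

top1 = top_k_accuracy(scores, labels, k=1)  # 0.5
top5 = top_k_accuracy(scores, labels, k=5)  # 0.75
top5_error = 1 - top5                       # 0.25
```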

3

Where are most parameters in a CNN

  • In the fully connected (FC) layers
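
A rough parameter count illustrates why. The layer sizes below are assumed, VGG-16-like example values, and both helper functions are hypothetical names introduced for this sketch.

```python
def conv_params(k, c_in, c_out):
    """Weights plus biases of a k x k convolution layer."""
    return k * k * c_in * c_out + c_out


def fc_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# Assumed sizes: a late 3x3 conv layer (512 -> 512 channels) and the first
# FC layer on a flattened 7 x 7 x 512 feature map with 4096 outputs.
late_conv = conv_params(k=3, c_in=512, c_out=512)   # 2,359,808
first_fc = fc_params(n_in=7 * 7 * 512, n_out=4096)  # 102,764,544
```

The convolution shares its small filter across all positions, while the FC layer needs one weight per input-output pair.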

4

Activation Function AlexNet

ReLU

5

What does VGGNet change

  • Striving for simplicity

  • 3×3 filters with stride 1

  • Pooling: 2×2 with stride 2
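
The effect of these settings on the spatial size can be checked with the standard output-size formula. This is only a sketch: the padding of 1 for the 3×3 convolutions and the 224 input size are assumptions not stated on the card.

```python
def out_size(n, k, stride, pad):
    """Spatial output size of a conv or pooling layer: floor((n + 2*pad - k) / stride) + 1."""
    return (n + 2 * pad - k) // stride + 1


# Assumed input size 224 and padding 1 for the 3x3 convolutions.
after_conv = out_size(224, k=3, stride=1, pad=1)         # 224: size is preserved
after_pool = out_size(after_conv, k=2, stride=2, pad=0)  # 112: size is halved

# Five conv + pool stages in a row: 224 -> 112 -> 56 -> 28 -> 14 -> 7
size = 224
for _ in range(5):
    size = out_size(size, k=3, stride=1, pad=1)
    size = out_size(size, k=2, stride=2, pad=0)
```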

6

Structure VGGNet

Conv block (2 to 3 stacked Conv layers) → Pool → Conv block → Pool → … → FC

7

Which VGGNet is best and why

VGG-16: adding more layers brings no further gain, because the gradients vanish (they become too small to be represented accurately in floating point)

8

Why use Skip Connections

  • Improving gradient flow

  • Preserving information

  • Making optimization easier

9

Which convolution is used in skip connection

"Same" convolution, so the input and output of the block have matching shapes; otherwise the shortcut is converted by zero padding or by a matrix of learned weights

10

Why do ResNets Work

  • Identity is easy for the residual block to learn

  • An added block should therefore not hurt performance (at worst it learns the identity)
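
A small sketch of the first point, using a toy fully connected residual block in plain Python (all names here are hypothetical). With all weights at zero the block outputs its input unchanged, provided the input is non-negative, as it would be after a previous ReLU.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def linear(weights, bias, v):
    """Matrix-vector product plus bias."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """y = relu(F(x) + x) with F(x) = W2 * relu(W1 * x + b1) + b2."""
    f = linear(w2, b2, relu(linear(w1, b1, x)))
    return relu([fi + xi for fi, xi in zip(f, x)])


x = [0.5, 2.0, 0.0]  # non-negative, as after a previous ReLU
zero_w = [[0.0] * 3 for _ in range(3)]
zero_b = [0.0] * 3

# With zero weights F(x) = 0, so the skip connection passes x through as-is.
y = residual_block(x, zero_w, zero_b, zero_w, zero_b)
```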

11

Why use 1×1 Convolutions

  • Use it to shrink the number of channels

  • Further adds a non-linearity → learn more complex functions

12

What is an Inception layer

  • Layer that applies several different convolutions (e.g. 1×1, 3×3, 5×5) in parallel and concatenates their results into 1 output

13

Why use inception layers

  • Let the network learn which convolution (filter size) works best instead of fixing it by hand

14

Why use 1×1 convolution in Inception layers

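  • Reduces the number of multiplications: the 1×1 convolution shrinks the number of channels before the expensive larger convolutions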
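
A back-of-the-envelope count shows the size of the saving. The input size (28×28×192), the 32 output filters and the 16-channel bottleneck are assumed example values, and `conv_mults` is a hypothetical helper.

```python
def conv_mults(h, w, c_in, c_out, k):
    """Multiplications of a k x k 'same' convolution producing an h x w x c_out output."""
    return h * w * c_out * k * k * c_in


# Direct 5x5 convolution: 192 -> 32 channels.
direct = conv_mults(28, 28, c_in=192, c_out=32, k=5)      # 120,422,400

# 1x1 bottleneck to 16 channels first, then the 5x5 convolution.
bottleneck = (conv_mults(28, 28, c_in=192, c_out=16, k=1)
              + conv_mults(28, 28, c_in=16, c_out=32, k=5))  # 12,443,648
```

In this example the bottleneck needs roughly a tenth of the multiplications for an output of the same shape.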

15

What is Xception Net

  • Applies depthwise separable convolutions

  • Conv layers are structured into several modules with skip connections

  • Each depthwise filter is applied to a single channel (one depth slice) only; a 1×1 convolution then combines the channels

16

Why use Xception nets

Fewer multiplications → less computation
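
The saving can be estimated by counting multiplications for a standard versus a depthwise separable convolution. The layer sizes are assumed example values and the function names are hypothetical.

```python
def standard_mults(h, w, c_in, c_out, k):
    """Every output channel looks at all input channels through a k x k window."""
    return h * w * c_out * k * k * c_in


def separable_mults(h, w, c_in, c_out, k):
    """Depthwise step (one k x k filter per input channel) plus 1x1 pointwise step."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise


# Assumed example: 32 x 32 feature map, 64 -> 128 channels, 3 x 3 filters.
standard = standard_mults(32, 32, 64, 128, 3)    # 75,497,472
separable = separable_mults(32, 32, 64, 128, 3)  # 8,978,432
ratio = separable / standard                     # equals 1/c_out + 1/k**2
```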

17

Types of Upsampling

  • Interpolation

  • Transposed Conv
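
Both types can be sketched in one dimension with plain Python. Nearest-neighbour repetition stands in for interpolation here, and the kernel of the transposed convolution is hand-picked for illustration; in a network it would be learned. The function names are hypothetical.

```python
def upsample_nearest(x, factor):
    """Nearest-neighbour interpolation: repeat every sample `factor` times."""
    return [v for v in x for _ in range(factor)]


def transposed_conv1d(x, kernel, stride):
    """Each input value scatters a scaled copy of the kernel into the output."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for j, w in enumerate(kernel):
            out[i * stride + j] += v * w
    return out


x = [1.0, 2.0, 3.0]
nearest = upsample_nearest(x, 2)                             # [1, 1, 2, 2, 3, 3]
learned = transposed_conv1d(x, kernel=[1.0, 0.5], stride=2)  # [1, 0.5, 2, 1, 3, 1.5]
```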

18

Left side of U-Net

Contraction path (encoder)

19

What does the contraction path do

  • Captures the context of the image

  • Follows the typical architecture of a CNN (repeated convolution and pooling)

20

Right side of U-Net

Expansion path (decoder)

21

What does the expansion path do

  • Upsampling to recover the spatial locations, so that a class label can be assigned to each pixel

  • Width (spatial size) goes up, depth (number of channels) goes down

22

How does the feature map change the deeper we go in a CNN

The width (spatial size) goes down, but the depth (number of channels) goes up

23

Challenges in detection

  • Convolutions are translation-equivariant

    • The same filter slides across all positions

    • Shifting the input image → the output shifts by the same amount

  • Multiple objects need to be detected

  • Identifying what is present makes it difficult to determine where it is
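
The equivariance property can be checked on a one-dimensional toy example. The signal and kernel are made-up values and `conv1d_valid` is a hypothetical helper.

```python
def conv1d_valid(signal, kernel):
    """Sliding-window filter (cross-correlation) without padding."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(n)]


kernel = [1, 0, -1]
signal = [0, 0, 1, 2, 1, 0, 0, 0]
shifted = [0, 0, 0, 1, 2, 1, 0, 0]  # same pattern, moved one step to the right

out = conv1d_valid(signal, kernel)           # [-1, -2, 0, 2, 1, 0]
out_shifted = conv1d_valid(shifted, kernel)  # [0, -1, -2, 0, 2, 1]

# Away from the border, the shifted output is the original output moved by one step.
same_up_to_shift = out_shifted[1:] == out[:-1]
```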

24

What does R-CNN stand for

Region-Based CNNs

25

What is R-CNN

  • A fixed number of boxes (region proposals) is placed over the image

  • The boxes should cover promising regions that may contain an object

  • Each box has to be run through the CNN separately

26

Difference between Fast R-CNN and R-CNN

  • Runs the CNN once for the entire image

  • Uses region of interest (RoI) pooling to extract fixed-size feature maps for each region

27

What is region of interest (RoI) pooling

  • Rescales feature-map regions of variable size (dimension) to a fixed-size output using max pooling
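
A minimal sketch of the idea on a single-channel feature map in plain Python. The floor/ceil binning rule is one possible convention (real implementations differ), it assumes the region is at least as large as the output, and all names and numbers are hypothetical.

```python
def bin_edges(start, length, n_bins):
    """Split [start, start + length) into n_bins bins, returned as (lo, hi) pairs."""
    return [(start + (i * length) // n_bins,
             start + -((-(i + 1) * length) // n_bins))  # ceil((i + 1) * length / n_bins)
            for i in range(n_bins)]


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the region roi = (top, left, bottom, right), end-exclusive, to out_h x out_w."""
    top, left, bottom, right = roi
    rows = bin_edges(top, bottom - top, out_h)
    cols = bin_edges(left, right - left, out_w)
    return [[max(feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1))
             for c0, c1 in cols]
            for r0, r1 in rows]


feature_map = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 1, 2, 3],
    [4, 5, 6, 7, 8, 9],
    [1, 3, 5, 7, 9, 2],
]

# Two regions of different size both end up as 2 x 2 outputs.
pooled_large = roi_max_pool(feature_map, (0, 0, 4, 6), 2, 2)  # [[9, 6], [6, 9]]
pooled_small = roi_max_pool(feature_map, (1, 1, 4, 4), 2, 2)  # [[9, 9], [6, 7]]
```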

28

How does Faster R-CNN work

A region proposal network (RPN) slides over the feature map and decides whether an interesting feature (a possible object) is present

29

What is an anchor (detection)

Predefined potential bounding box, placed at a position of the feature map, in which an object can be detected
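
Anchors are commonly generated at several scales and aspect ratios per position. The sketch below shows one way to do that; the scales, ratios and the centre point are assumed example values, and `make_anchors` is a hypothetical helper.

```python
import math


def make_anchors(cx, cy, scales, ratios):
    """Boxes (x1, y1, x2, y2) centred at (cx, cy); ratio = width / height, area = scale ** 2."""
    boxes = []
    for s in scales:
        for r in ratios:
            w = s * math.sqrt(r)
            h = s / math.sqrt(r)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


# Assumed values: 2 scales x 3 aspect ratios = 6 anchors at one position.
anchors = make_anchors(cx=100, cy=100, scales=[64, 128], ratios=[0.5, 1.0, 2.0])
square_small = anchors[1]  # scale 64, ratio 1.0 -> (68.0, 68.0, 132.0, 132.0)
```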

30

Difference between Faster R-CNN and Mask R-CNN

After the bounding boxes are found, Mask R-CNN additionally determines for each pixel in a bounding box whether it belongs to the object

31

What is instance segmentation

Pixel-wise mask for each individual object, predicted inside its bounding box