LeNet Activation function
Tanh/sigmoid
Common performance metrics
Top-1 score
Top-5 score
Top-5 error (1 - top-5 score)
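The metrics above can be sketched in a few lines of plain Python. The scores and labels below are made-up illustration data, and `top_k_score` is a hypothetical helper name, not a library function.

```python
def top_k_score(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for class_scores, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(class_scores)),
                        key=lambda i: class_scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Made-up scores for 3 samples over 6 classes.
scores = [
    [0.10, 0.50, 0.20, 0.10, 0.05, 0.05],  # true class 1: top-1 hit
    [0.30, 0.10, 0.25, 0.20, 0.10, 0.05],  # true class 2: top-1 miss, top-5 hit
    [0.30, 0.20, 0.20, 0.15, 0.10, 0.05],  # true class 5: missed even in the top 5
]
labels = [1, 2, 5]

top1 = top_k_score(scores, labels, 1)   # 1 of 3 samples
top5 = top_k_score(scores, labels, 5)   # 2 of 3 samples
top5_error = 1 - top5                   # 1 of 3 samples
```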
Where are most parameters in a CNN
In the fully connected (FC) layers
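A rough parameter count shows why. The layer sizes below are illustrative assumptions (loosely modelled on a late VGG-style conv layer and the first FC layer after a 7×7×512 feature map); the helper names are hypothetical.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus one bias per output channel for a kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + c_out

def fc_params(n_in, n_out):
    """Weights plus one bias per output unit for a fully connected layer."""
    return n_in * n_out + n_out

conv = conv_params(kernel=3, c_in=256, c_out=512)  # 1,180,160
fc = fc_params(n_in=7 * 7 * 512, n_out=4096)       # 102,764,544

print(f"conv: {conv:,}  fc: {fc:,}  ratio: {fc / conv:.0f}x")
```

The convolution shares its small kernel across all spatial positions, while the FC layer needs one weight per input-output pair.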
Activation Function AlexNet
ReLU
What does VGGNet change
Striving for simplicity
3×3 filters with stride 1
Pooling: 2×2 with stride 2
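The effect of these two choices on the feature-map size can be traced with the standard output-size formula. This is a sketch under assumptions: a 224×224 input, padding of 1 on each 3×3 convolution (the card does not state the padding), and one convolution per block for brevity.

```python
def out_size(n, f, stride, pad=0):
    """Spatial output size of a conv or pooling layer: floor((n + 2*pad - f) / stride) + 1."""
    return (n + 2 * pad - f) // stride + 1

size = 224  # assumed input width/height
trace = [size]
for _ in range(5):
    size = out_size(size, f=3, stride=1, pad=1)  # 3x3 conv, stride 1: size unchanged
    size = out_size(size, f=2, stride=2)         # 2x2 pool, stride 2: size halved
    trace.append(size)

print(trace)  # [224, 112, 56, 28, 14, 7]
```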
Structure VGGNet
Conv → Pool → Conv → Pool → … → FC
Which VGGNet is best and why
VGG-16: with even more layers we get vanishing gradients due to floating-point inaccuracy
Why use Skip Connections
Improving gradient flow
Preserving information
Making optimization easier
Which convolution is used in skip connection
"Same" convolution, so input and output dimensions match; otherwise the shortcut is converted with zero padding or a matrix of learned weights
Why do ResNets Work
Identity is easy for the residual block to learn
Guaranteed it will not hurt performance
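A minimal sketch of why the identity is easy to learn, assuming a block of the form y = ReLU(F(x) + x). For brevity, F uses two small fully connected layers instead of convolutions, which is a simplification; all names are hypothetical.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def residual_block(x, W1, W2):
    """y = ReLU(W2 @ ReLU(W1 @ x) + x): the skip connection adds the input back."""
    hidden = relu(matvec(W1, x))
    f = matvec(W2, hidden)
    return relu([fi + xi for fi, xi in zip(f, x)])

# Non-negative input, as it would be after a previous ReLU.
x = [1.0, 2.0, 0.5]
zeros = [[0.0] * 3 for _ in range(3)]

# With all weights at zero, F(x) = 0 and the block passes x through unchanged.
y = residual_block(x, zeros, zeros)
print(y)  # [1.0, 2.0, 0.5]
```

Driving the weights toward zero is enough to recover the identity, so the block can fall back to doing nothing.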
Why use 1×1 Convolutions
Use it to shrink the number of channels
Further adds a non-linearity → learn more complex functions
What's an Inception layer
A layer that applies several different convolutions in parallel and concatenates their results into one output
Why use inception layers
let the network decide which convolution works best
Why use 1×1 convolution in Inception layers
Reduction of multiplications
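The saving can be checked with simple arithmetic. The sizes below (a 28×28×192 input, a 5×5 convolution to 32 channels, and a 1×1 bottleneck of 16 channels) are illustrative assumptions, and "same" padding is assumed so the spatial size stays 28×28.

```python
def conv_mults(h, w, c_in, c_out, k):
    """Multiplications for a k x k convolution producing an h x w x c_out output."""
    return h * w * c_out * k * k * c_in

# Direct 5x5 convolution: 192 -> 32 channels.
direct = conv_mults(28, 28, c_in=192, c_out=32, k=5)

# 1x1 bottleneck (192 -> 16) followed by the 5x5 convolution (16 -> 32).
bottleneck = (conv_mults(28, 28, c_in=192, c_out=16, k=1)
              + conv_mults(28, 28, c_in=16, c_out=32, k=5))

print(f"{direct:,}")      # 120,422,400
print(f"{bottleneck:,}")  # 12,443,648
```

Under these assumed sizes the bottleneck version needs roughly a tenth of the multiplications.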
What's Xception Net
Applies depthwise separable convolutions
Conv layers are structured into several modules with skip connections
Each depthwise filter is applied to a single channel (one depth slice) only
Why use xception nets
Fewer multiplications → less computation
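As a rough check of that saving, the sketch below compares a standard convolution with a depthwise separable one (a per-channel k×k depthwise step followed by a 1×1 pointwise step). The layer sizes are assumptions chosen for illustration.

```python
def standard_mults(h, w, c_in, c_out, k):
    """Every output channel looks at every input channel with a k x k filter."""
    return h * w * c_out * k * k * c_in

def separable_mults(h, w, c_in, c_out, k):
    """Depthwise step (one k x k filter per input channel) plus 1x1 pointwise step."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

std = standard_mults(32, 32, c_in=64, c_out=128, k=3)   # 75,497,472
sep = separable_mults(32, 32, c_in=64, c_out=128, k=3)  # 8,978,432

# The ratio simplifies to 1/c_out + 1/k**2.
print(sep / std)
```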
Types of Upsampling
Interpolation
Transposed Conv
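Both types can be illustrated in one dimension. This is a minimal sketch: nearest-neighbour repetition stands in for interpolation, and the transposed-convolution kernel is a fixed made-up example, whereas in a network it would be learned.

```python
def nearest_upsample(x, factor):
    """Interpolation-style upsampling: repeat each value `factor` times."""
    return [v for v in x for _ in range(factor)]

def transposed_conv1d(x, kernel, stride):
    """Each input value stamps a scaled copy of the kernel into the output."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for j, w in enumerate(kernel):
            out[i * stride + j] += v * w
    return out

up_nn = nearest_upsample([1, 2, 3], factor=2)
up_tc = transposed_conv1d([1.0, 2.0, 3.0], kernel=[1.0, 0.5], stride=2)

print(up_nn)  # [1, 1, 2, 2, 3, 3]
print(up_tc)  # [1.0, 0.5, 2.0, 1.0, 3.0, 1.5]
```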
Left side of U-Net
Contraction path (encoder)
What does the contraction path do
Captures context of the image
Follows the architecture of a typical CNN
Right side of U-Net
Expansion path (decoder)
What does the expansion path do
Upsampling to recover spatial locations for assigning class labels to each pixel
Width and height go up, depth (number of channels) goes down
How does the feature map change the deeper we go in a CNN
The width and height go down, but the depth (number of channels) goes up
Challenges in detection
Convolutions are translationally equivariant
The same filter slides across all positions
Shifting the input image shifts the output by the same amount
Multiple objects need to be detected
Identifying what is in the image makes locating where it is difficult
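The equivariance property can be checked on a small one-dimensional example. The signal and kernel are made up, and `correlate1d` is a hypothetical helper implementing a "valid" sliding-window filter.

```python
def correlate1d(signal, kernel):
    """Slide the same kernel across all positions (no padding)."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(n)]

signal = [0, 0, 1, 2, 1, 0, 0, 0]
shifted = [0] + signal[:-1]  # same pattern, moved one step to the right
kernel = [1, 0, -1]

out = correlate1d(signal, kernel)
out_shifted = correlate1d(shifted, kernel)

print(out)          # [-1, -2, 0, 2, 1, 0]
print(out_shifted)  # [0, -1, -2, 0, 2, 1]
```

The response is identical apart from its position, so the filter output alone tells us what was found but only indirectly where.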
What does R-CNN stand for
Region-Based CNNs
What is R-CNN
A fixed number of boxes is run over an image
The boxes hopefully extract promising features
Each box has to run its features through a CNN separately
Difference between Fast R-CNN and R-CNN
Runs the CNN once over the entire image
Uses region of interest (RoI) pooling to extract fixed-size feature maps for each region
What's region of interest pooling
Rescales variable-sized feature maps to a fixed-size output using max pooling
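A minimal sketch of the idea on a single-channel feature map, assuming integer RoI coordinates and a simple floor/ceil binning scheme; real implementations differ in how they place bin boundaries. The function name and the data are hypothetical.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the region roi = (top, left, bottom, right), bottom/right exclusive,
    into a fixed out_h x out_w grid."""
    top, left, bottom, right = roi
    h, w = bottom - top, right - left
    pooled = []
    for i in range(out_h):
        # Row range of bin i: floor of the start, ceil of the end.
        r0 = top + (i * h) // out_h
        r1 = top + -(-((i + 1) * h) // out_h)
        row = []
        for j in range(out_w):
            c0 = left + (j * w) // out_w
            c1 = left + -(-((j + 1) * w) // out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled

fm = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]

# Two regions of different size both come out as 2x2.
big = roi_max_pool(fm, (0, 0, 4, 6), 2, 2)    # [[9, 12], [21, 24]]
small = roi_max_pool(fm, (1, 1, 4, 4), 2, 2)  # [[15, 16], [21, 22]]
```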
How does Faster R-CNN work
A region proposal network (RPN) slides over the feature map and decides whether an interesting feature is present
What's an anchor (detection)
Potential bounding box where an object can be detected
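One common way to lay out such candidate boxes is to place several scales and aspect ratios at every feature-map position. The sketch below assumes a stride of 16 pixels per feature-map cell and made-up scale and ratio values; `make_anchors` is a hypothetical helper.

```python
import math

def make_anchors(feat_h, feat_w, stride, scales, ratios):
    """Return (x0, y0, x1, y1) boxes centred on every feature-map cell."""
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Centre of this cell in image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for s in scales:
                for r in ratios:
                    # r is width / height; the area stays s * s for every ratio.
                    w = s * math.sqrt(r)
                    h = s / math.sqrt(r)
                    anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

anchors = make_anchors(feat_h=2, feat_w=3, stride=16,
                       scales=[32, 64], ratios=[0.5, 1.0, 2.0])
print(len(anchors))  # 36 = 2 * 3 positions x 2 scales x 3 ratios
```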
Difference between Faster R-CNN and Mask R-CNN
After the bounding boxes are found, it is additionally determined whether each pixel in a bounding box belongs to the object
What's instance segmentation
A pixel-wise mask inside a bounding box