Human Vision
The retina captures an image and turns it into an electrical signal. The signal then travels along the optic nerve and is processed in the visual cortex. The frontal lobe adds other information.
Computer Vision and how it captures Image
A camera or sensor captures an image and turns it into pixels. The image is processed into features, and an algorithm processes that signal. The algorithm's output determines the decisions.
What are Decisions based on for CV
Based on algorithms, pattern matching and trained models
Human Vision Decision making based on
Based on past experiences, context, and instinct
What was used after Multilayer/Feed Forward Models
Modern Convolutional Neural Network
What are challenges in CV?
Image quality, variability (different angles, lighting conditions, scale, and occlusions), motion, scale, bias
Feedforward old method of CV
Flattens images into a 1D vector, so it loses spatial awareness
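A small sketch of why flattening loses spatial information, assuming NumPy and a made-up 4x4 image: two pixels that are vertical neighbors in the image end up several positions apart in the flattened vector.

```python
import numpy as np

# Hypothetical 4x4 "image" whose pixel values are just 0..15.
image = np.arange(16).reshape(4, 4)
flat = image.reshape(-1)  # what a feed-forward network receives

# Pixels (1, 1) and (2, 1) touch each other vertically in the image.
idx_a = int(np.ravel_multi_index((1, 1), image.shape))
idx_b = int(np.ravel_multi_index((2, 1), image.shape))

# In the flat vector they are a full row width apart.
gap = idx_b - idx_a
print(idx_a, idx_b, gap)  # 5 9 4
```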
Train a NN
1. Forward pass
2. Calculate the error
3. Propagate the error backwards
4. Update the weights
5. Repeat until some stopping criterion
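The five steps above can be sketched with NumPy on a tiny made-up dataset. This is a minimal illustration, not a full neural network: the model is a single linear layer, so "propagating the error backwards" reduces to one gradient calculation. The function name `train`, the data, and the hyperparameters are all assumptions chosen for the example.

```python
import numpy as np

def train(X, y, learning_rate=0.1, max_epochs=500, tol=1e-6):
    """Fit y ~ X @ w + b by gradient descent on mean squared error."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=X.shape[1])
    b = 0.0
    losses = []
    for epoch in range(max_epochs):
        pred = X @ w + b                    # 1. forward pass
        err = pred - y                      # 2. calculate the error
        loss = float(np.mean(err ** 2))
        losses.append(loss)
        grad_w = 2 * X.T @ err / len(y)     # 3. propagate the error backwards
        grad_b = 2 * float(np.mean(err))
        w -= learning_rate * grad_w         # 4. update the weights
        b -= learning_rate * grad_b
        if loss < tol:                      # 5. repeat until a stopping criterion
            break
    return w, b, losses

# Toy data generated from y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w, b, losses = train(X, y)
```

Here the stopping criterion is either a loss below `tol` or reaching `max_epochs`; other criteria (such as validation loss no longer improving) are common too.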
Batches
A smaller subset of the data that is passed through the model at one time. An epoch is still only complete once all of the data has been through the model.
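A sketch of how one epoch can be split into batches, assuming NumPy; `iterate_batches` is a hypothetical helper written for this example, not a library function.

```python
import numpy as np

def iterate_batches(X, y, batch_size, rng):
    """Yield shuffled mini-batches that together cover the data exactly once."""
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

rng = np.random.default_rng(0)
X = np.arange(10).reshape(10, 1)
y = np.arange(10)

# One pass over all batches is one epoch.
batch_sizes = [len(xb) for xb, _ in iterate_batches(X, y, batch_size=4, rng=rng)]
print(batch_sizes)  # [4, 4, 2]
```

With 10 samples and a batch size of 4, the last batch is smaller, and the epoch ends only after all three batches have been seen.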
Learning Rate
It dictates how much we adjust the weights according to the error signal
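To make the effect concrete, here is a minimal sketch of a single weight update with two different learning rates. The error function E(w) = w**2 and the helper `gradient_step` are assumptions chosen for illustration.

```python
def gradient_step(w, grad, learning_rate):
    """Move the weight against the gradient, scaled by the learning rate."""
    return w - learning_rate * grad

w = 1.0
grad = 2 * w  # gradient of the assumed error E(w) = w**2 at w = 1.0

small = gradient_step(w, grad, learning_rate=0.01)  # small adjustment
large = gradient_step(w, grad, learning_rate=0.5)   # much bigger adjustment
```

The same error signal produces a tiny change with the small learning rate and a large jump with the big one.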
Convolutional Neural Networks
• Input and output layers (just like in MLPs)
• Convolutional layers
• Pooling layers
• Dense (fully-connected) layers (these are what we saw in MLPs)
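Filters/Kernels
Detect features, building up from simple features to more complex ones across layers. The network learns the best features by itself. Filters are 3D for RGB images (one slice per color channel).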
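A sketch of what applying one filter looks like, assuming NumPy and a grayscale image. The kernel here is hand-written purely for illustration; in a real CNN the kernel values are learned during training. `convolve2d` is a hypothetical helper, and like most CNN layers it computes a cross-correlation (the kernel is not flipped).

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the window and the kernel, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Dark left half, bright right half: one vertical edge in the middle.
image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

edge_kernel = np.array([[-1.0, 1.0]])  # responds to dark-to-bright steps
feature_map = convolve2d(image, edge_kernel)
print(feature_map)  # every row is [0. 1. 0.]
```

The feature map is non-zero only where the edge sits, which is the sense in which a filter "detects" a feature.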
Padding
Extra pixels added around the edge of the image, usually zeros, so that edge pixels are treated equally to middle pixels.
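A minimal sketch of zero padding using NumPy's `np.pad`, on a made-up 2x2 image:

```python
import numpy as np

image = np.array([[1, 2],
                  [3, 4]])

# Add a one-pixel border of zeros on every side.
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)
print(padded)
# [[0 0 0 0]
#  [0 1 2 0]
#  [0 3 4 0]
#  [0 0 0 0]]
```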
Max Pooling
Take the maximum value within a given window
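A sketch of 2x2 max pooling with NumPy. The helper `max_pool2d` is hypothetical and assumes non-overlapping windows and an input whose height and width divide evenly by the window size.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max pooling over size x size windows."""
    h, w = x.shape
    # Split rows and columns into blocks, then take the max of each block.
    blocks = x.reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

x = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
])
pooled = max_pool2d(x)
print(pooled)
# [[4 2]
#  [2 8]]
```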
Stride length
The number of pixels the filter moves at each step. A smaller stride preserves more detail but is more computationally expensive.
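The cost trade-off shows up in the output size. A sketch using the standard output-size formula for one dimension; the function name `conv_output_size` and the 28-pixel input are assumptions for the example.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Number of filter positions along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

stride_1 = conv_output_size(28, kernel_size=3, stride=1)
stride_2 = conv_output_size(28, kernel_size=3, stride=2)
print(stride_1, stride_2)  # 26 13
```

Halving the number of positions per dimension means roughly a quarter of the filter applications on a 2D image, at the price of a coarser feature map.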
Pooling layers meaning
Reduce dimensionality, help prevent overfitting, and provide some translation invariance
Dense/Fully-Connected Layers
Connect every neuron in one layer to every neuron in the next
• The same as our standard MLPs/feed-forward neural networks
• Flatten the feature maps from the last pooling layer, then feed them through dense layers
• Connect all neurons from the previous layer to the output node(s) to produce a final decision/classification.
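A sketch of the flatten-then-dense step with NumPy. The shapes (8 feature maps of 4x4, 3 output classes) and the random weights are assumptions for illustration; a trained network would have learned weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical output of the last pooling layer: 8 feature maps, each 4x4.
feature_maps = rng.normal(size=(8, 4, 4))

# Flatten into one vector so a dense layer can consume it.
flat = feature_maps.reshape(-1)

# Dense layer: every flattened value connects to every output node.
num_classes = 3
W = rng.normal(scale=0.1, size=(num_classes, flat.size))
b = np.zeros(num_classes)
logits = W @ flat + b

# Softmax turns the output nodes into class probabilities.
exp = np.exp(logits - logits.max())
probs = exp / exp.sum()
predicted_class = int(np.argmax(probs))
```

The final decision is simply the output node with the highest probability.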