  • Concept of Translation Invariance

    • Translation invariance refers to the property where the outcome remains unchanged regardless of where the input appears: the same feature is detected wherever it sits in the image.
    • This principle motivates architectures that are robust to positional shifts; related ideas target rotational and other variations that arise under different transformations in image processing (see the sketch below).
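A minimal sketch of the idea (the kernel and signals here are illustrative values of my own, not from the source): the same hand-built 1-D edge kernel produces the same peak response wherever the pattern appears, only shifted along with the input.

```python
import numpy as np

# Toy 1-D "edge" detector: responds to a dark-to-bright step.
kernel = np.array([-1.0, 1.0])

def convolve1d(signal, kernel):
    k = len(kernel)
    return np.array([np.sum(signal[i:i + k] * kernel)
                     for i in range(len(signal) - k + 1)])

# The same step pattern placed at two different positions.
a = np.array([0, 0, 1, 1, 0, 0, 0, 0], dtype=float)
b = np.array([0, 0, 0, 0, 0, 1, 1, 0], dtype=float)

print(convolve1d(a, kernel))  # [ 0.  1.  0. -1.  0.  0.  0.]
print(convolve1d(b, kernel))  # [ 0.  0.  0.  0.  1.  0. -1.]
```

The response pattern is identical in both cases; it simply moves with the input, which is the behavior the notes describe.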
  • Introduction to Convolutions

    • Convolutions are fundamental in digital image processing, used to analyze images and extract features such as edges and textures.
    • Early approaches relied on manual feature engineering, where expert knowledge determined which features to extract from an image.
  • Shift from Manual to Automated Feature Engineering

    • With neural networks and convolution operations, feature engineering has moved towards automation, allowing neural networks to learn feature extraction automatically.
    • Backpropagation plays a crucial role here: it adjusts the filter (kernel) weights from the training data so that the features the network extracts optimize its outcomes.
  • Convolution Operations Overview

    • When performing convolution, a filter (kernel) is applied to an image.
    • Image and filter example: let \(W\) denote the matrix of kernel values.
    • The filter is slid over the image; at each position the overlapping values are multiplied elementwise and summed, producing a neuron activation that reflects that segment of the image (a single-position sketch follows this list).
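As a concrete single-position example (the patch and the \(W\) values are made up for illustration), one activation is just an elementwise multiply followed by a sum:

```python
import numpy as np

# One 3x3 image patch and a 3x3 filter W (values chosen for illustration).
patch = np.array([[1, 2, 0],
                  [0, 1, 3],
                  [2, 1, 0]], dtype=float)
W = np.array([[1, 0, -1],
              [1, 0, -1],
              [1, 0, -1]], dtype=float)

# Elementwise multiply, then sum: one neuron's activation for this patch.
activation = np.sum(patch * W)
print(activation)  # nonzero products: (1) + (-3) + (2) = 0.0
```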
  • Sliding Operation in Convolutions

    • The term "sliding" refers to moving the filter across the image one step at a time, generating one output value per position in the next layer of neurons.
    • This sliding produces feature maps that reflect various aspects of the input image (a full sliding pass is sketched below).
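A minimal sliding implementation, assuming stride 1 and no padding (the `conv2d` name and the test values are mine, not from the source; as in most deep-learning frameworks, the kernel is applied without flipping, i.e. this is technically cross-correlation):

```python
import numpy as np

def conv2d(image, kernel):
    """Slide `kernel` over `image` one pixel at a time (stride 1, no padding)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output cell is the multiply-and-sum over one window.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1, 0], [0, -1]], dtype=float)
print(conv2d(image, kernel).shape)  # (4, 4): a 5x5 image shrinks to 4x4
```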
  • Manual Feature Engineering Example

    • Demonstrates how multiple kernels can be applied to the same image to extract different features, each producing its own feature map (see the two-kernel sketch below).
    • Each kernel yields different neuron activations because each one detects a specific feature.
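For instance, the two classic Sobel kernels, a standard hand-engineered pair (the toy image is my own), respond to different edge orientations:

```python
import numpy as np

def conv2d(image, kernel):  # same stride-1 helper as in the sliding sketch
    kh, kw = kernel.shape
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(image.shape[1] - kw + 1)]
                     for i in range(image.shape[0] - kh + 1)])

# Two classic hand-designed kernels (Sobel), one per feature of interest.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)  # responds to vertical edges
sobel_y = sobel_x.T                            # responds to horizontal edges

# Toy image: a single bright vertical stripe.
image = np.zeros((5, 5))
image[:, 2] = 1.0

print(conv2d(image, sobel_x))  # strong +/- responses at the stripe's edges
print(conv2d(image, sobel_y))  # all zeros: no horizontal edges present
```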
  • Automated Feature Extraction through Backpropagation

    • In automated feature extraction, the initial weights for kernels are set randomly.
    • The backpropagation algorithm then adjusts these weights based on the output error, refining the extracted features until the error is minimized (a minimal gradient-descent sketch follows).
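A bare-bones sketch of that idea, under simplifying assumptions I am adding (a squared-error loss against a known target map, plain gradient descent, and the stride-1 helper from above); real networks use far larger inputs, many kernels, and labels rather than a known target kernel:

```python
import numpy as np

def conv2d(image, kernel):  # same stride-1 helper as in the sliding sketch
    kh, kw = kernel.shape
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(image.shape[1] - kw + 1)]
                     for i in range(image.shape[0] - kh + 1)])

rng = np.random.default_rng(0)

# Toy task: recover a 2x2 kernel from input/output pairs alone.
image = rng.normal(size=(5, 5))
true_kernel = np.array([[1.0, -1.0], [0.0, 0.5]])
target = conv2d(image, true_kernel)      # stand-in for training labels

W = rng.normal(size=(2, 2))              # random initial weights
lr = 0.01
for step in range(500):
    out = conv2d(image, W)
    err = out - target                   # dL/d(out) for L = 0.5 * sum(err**2)
    # Gradient w.r.t. each kernel weight: correlate the input patches
    # with the output error (the backprop rule for a conv layer).
    grad = np.zeros_like(W)
    for i in range(err.shape[0]):
        for j in range(err.shape[1]):
            grad += err[i, j] * image[i:i + 2, j:j + 2]
    W -= lr * grad

print(np.round(W, 2))  # should land close to true_kernel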
  • Padding in Convolutions

    • Padding involves adding extra rows and columns (often zeros) around the input image so that edge features are preserved during convolution.
    • It ensures that corner pixels, which would otherwise carry little weight, are captured adequately in the feature maps.
    • Padding mitigates the boundary problem: without it, interior pixels fall inside many filter windows while edge and corner pixels fall inside only a few, so features at the edges are under-represented (see the sketch below).
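A small illustration (sizes chosen by me) using NumPy's `np.pad` and the stride-1 helper from above: with padding 1, a 3x3 kernel returns an output the same size as the input, and a corner pixel appears in 4 windows instead of 1.

```python
import numpy as np

def conv2d(image, kernel):  # same stride-1 helper as in the sliding sketch
    kh, kw = kernel.shape
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(image.shape[1] - kw + 1)]
                     for i in range(image.shape[0] - kh + 1)])

image = np.arange(16, dtype=float).reshape(4, 4)

# Zero padding of width 1 on every side keeps edge/corner pixels in play.
padded = np.pad(image, pad_width=1, mode='constant', constant_values=0)
print(padded.shape)                    # (6, 6)

kernel = np.ones((3, 3)) / 9.0         # simple averaging filter
# Without padding, corner pixel (0, 0) falls inside exactly 1 window;
# with p = 1 it falls inside 4, and the output is 4x4 again ("same" size).
print(conv2d(image, kernel).shape)     # (2, 2)
print(conv2d(padded, kernel).shape)    # (4, 4), same as the input
```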
  • Stride in Convolutions

    • The stride is the number of pixels by which the filter moves across the input image.
    • A stride of 1 moves the filter one pixel at a time, while a stride of 2 moves it two pixels at a time, skipping every other position.
    • The stride value therefore affects the dimensions of the resulting feature map (compare the two output shapes in the sketch below).
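Extending the earlier helper with a stride parameter (again my own sketch, stride only, no padding) makes the size effect visible:

```python
import numpy as np

def conv2d_stride(image, kernel, stride=1):
    """Convolution with a configurable step size between filter positions."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

image = np.ones((7, 7))
kernel = np.ones((3, 3))
print(conv2d_stride(image, kernel, stride=1).shape)  # (5, 5)
print(conv2d_stride(image, kernel, stride=2).shape)  # (3, 3)
```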
  • Calculating Feature Map Size

    • The general formula for the feature-map dimensions involves the input size \(n\), the filter size \(f\), the padding \(p\), and the stride \(s\).
    • For the height of the feature map: [ h_{\text{out}} = \left\lfloor \frac{n_h + 2p - f_h}{s} \right\rfloor + 1 ]
    • The width follows the same pattern with \(n_w\) and \(f_w\).
    • Understanding this formula makes it possible to predict how each convolution changes the image dimensions across operations and transformations (a small calculator is sketched below).
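A one-line helper applying this formula, with example numbers of my own choosing (a 28x28 input and a 5x5 filter):

```python
import math

def feature_map_size(n, f, p, s):
    """Output size along one dimension: floor((n + 2p - f) / s) + 1."""
    return math.floor((n + 2 * p - f) / s) + 1

# 28x28 input, 5x5 filter, padding 2, stride 1 -> 28 ("same" size).
print(feature_map_size(28, 5, 2, 1))  # 28
# Same setup with stride 2 roughly halves the spatial size.
print(feature_map_size(28, 5, 2, 2))  # 14
```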