
P2: Notes on Regression Techniques and Convolutional Neural Networks

Regression Techniques in Machine Learning

  • Concept of Regularization:
    • Regularization is used to penalize larger weights to prevent overfitting in machine learning models.
    • Two common techniques are Ridge regression (L2 regularization) and Lasso regression (L1 regularization).
  • Understanding Weights:
    • The weight vector $w = [23, 5, 4, 3, 2, 68]$ contains some notably large weights (23 and 68).
    • The aim of regularization is to squeeze these larger weights towards smaller values.
  • Effect of Ridge Regression:
    • Ridge regression adds a squared-weight penalty $\lambda \sum_j w_j^2$ to the loss, so the largest weights are penalized most heavily and their effect on the model decreases.
    • Ridge shrinks weights towards zero, but because the squared penalty loses force as weights become small, they rarely reach exactly zero.
  • Effect of Lasso Regression:
    • Lasso regression instead adds an absolute-value penalty $\lambda \sum_j |w_j|$ to the loss.
    • This can shrink some weights exactly to zero, effectively performing feature selection by eliminating less important features (a short sketch follows this list).
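As a minimal sketch of this shrinkage behaviour, the snippet below fits scikit-learn's Ridge and Lasso estimators on made-up data; the library choice, the dataset, and the alpha values are illustrative assumptions, not part of the notes:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# Synthetic data: only the first 2 of 6 features actually matter.
X = rng.normal(size=(200, 6))
true_w = np.array([23.0, 68.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_w + rng.normal(scale=1.0, size=200)

# Ridge (L2 penalty): shrinks all weights toward zero, rarely exactly to zero.
ridge = Ridge(alpha=10.0).fit(X, y)

# Lasso (L1 penalty): can drive irrelevant weights exactly to zero.
lasso = Lasso(alpha=1.0).fit(X, y)

print("ridge coefs:", np.round(ridge.coef_, 2))
print("lasso coefs:", np.round(lasso.coef_, 2))
```

On data like this, the Lasso coefficients for the four irrelevant features typically come out exactly zero, while the corresponding Ridge coefficients are merely small.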

Convolutional Operations in Neural Networks

  • Convolution Concept:
    • Convolution in neural networks focuses on specific regions of the input rather than the complete input, enabling local feature extraction.
  • Difference between Convolution and Fully Connected Networks:
    • Fully connected networks connect each neuron in one layer to all neurons in the next layer, providing a holistic overview.
    • Convolutional layers focus only on localized patches of the input, allowing neurons to specialize in specific features of the image.
  • Why Convolution?
    • Specializing in features allows better extraction of necessary attributes in data (like images).
    • Each neuron is responsible for a distinct patch of the input instead of absorbing the entire input at once (see the sketch after this list).
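A minimal NumPy sketch of this localized-patch computation ("valid" cross-correlation, which is what convolutional layers actually compute); the function name and the example filter are illustrative:

```python
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """'Valid' cross-correlation: each output value depends only on
    one local patch of the input, never on the whole image."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]   # localized patch of the input
            out[i, j] = np.sum(patch * kernel)  # sum of elementwise products
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
edge_filter = np.array([[1.0, 0.0, -1.0]] * 3)  # a simple vertical-edge detector
print(conv2d(image, edge_filter))  # 3x3 feature map
```

Contrast this with a fully connected layer, where every output neuron would receive all 25 input values instead of a single 3x3 patch.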

Neural Network Layers and Features Extraction

  • Creating Feature Maps:
    • Convolutional layers generate multiple feature maps based on different weights (filters) which detect unique features in the input data.
    • Filters act like lenses focusing on particular aspects of an input, generating different neuron activations based on the learned weights.
  • Pooling Operations:
    • Following convolution, pooling operations downsample the feature maps, reducing dimensionality and helping build a hierarchy of features.
    • Common pooling types are max pooling and average pooling, which keep the key features while ignoring insignificant variations (noise); a max-pooling sketch follows this list.
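A minimal sketch of non-overlapping max pooling in NumPy; the helper name max_pool2d and the example feature map are illustrative:

```python
import numpy as np

def max_pool2d(fmap: np.ndarray, size: int = 2) -> np.ndarray:
    """Non-overlapping max pooling: keep the strongest activation per region."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size           # trim edges that don't fit
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

fmap = np.array([[1., 3., 2., 0.],
                 [4., 6., 1., 1.],
                 [0., 2., 9., 8.],
                 [1., 1., 7., 5.]])
print(max_pool2d(fmap))  # [[6. 2.]
                         #  [2. 9.]]
```

Each 2x2 region collapses to its largest activation, so the 4x4 feature map becomes a 2x2 summary that preserves the strongest responses.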

Visualization and Calculation Example

  • Example of Convolution:
    • Convolving a specific filter over an image produces an activation value that indicates how strongly the corresponding feature is present at that location.
    • Example calculation:
    • For a $3\times3$ image patch $I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and a filter $F$ of the same shape, the activation value is the sum of the products of the overlapping values, $\sum_{i,j} I_{ij} F_{ij}$; a filter matching the diagonal pattern (e.g. $F = I$) gives the maximal activation $1 + 1 + 1 = 3$.
  • Pooling operation illustration:
    • Pooling takes the maximum value from regions of the feature map so that the critical features remain while noise is discarded.
    • Max pooling yields the maximum value from each distinct region of the feature map, summarizing its important features (a worked example follows below).
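A short worked version of both calculations; the patch is the one above, while the filter values and the $4\times4$ feature map are illustrative assumptions:

```python
import numpy as np

# 3x3 image patch from the notes (a diagonal pattern).
I = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 1.]])

# Illustrative filter matching the same diagonal pattern.
F = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 1.]])

# Convolution activation: sum of elementwise products.
activation = np.sum(I * F)
print(activation)  # 3.0 -- the filter's pattern is present in the patch

# Max pooling over 2x2 regions of an illustrative 4x4 feature map.
fmap = np.array([[3., 1., 0., 2.],
                 [0., 5., 1., 0.],
                 [2., 0., 8., 1.],
                 [0., 1., 0., 4.]])
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[5. 2.]
               #  [2. 8.]]
```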