What are genetic algorithms?
Adaptive heuristic search algorithms inspired by biological models. They are part of evolutionary computation
What two works do genetic algorithms rely on?
Mendel’s principles of genetics, which define how offspring receive genetic material from parent generations, and Darwin’s theory of evolution by natural selection, which describes the ability of individuals to survive, adapt, and reproduce
What problems are genetic algorithms recommended for?
Complex problems for which there is no known efficient algorithm
What are some common areas of application for genetic algorithms?
Optimization, search, planning, classification. Also used to tune learning parameters for other algorithms such as artificial neural networks
How does GA generate successor hypotheses?
By repeatedly recombining and mutating parts of the best currently known hypotheses, so that variants of these hypotheses are the most likely to be considered next
In GA what do genes represent?
A representation of some parameter (variable), encoded according to an alphabet. Genes are combined to form chromosomes
In GA, what are individuals?
Each individual corresponds to a chromosome, used to represent a candidate solution (hypothesis)
In GA, what is a population?
Set of individuals that compete for survival and reproduction (a collection of hypotheses)
In GA, what is a generation?
Population of a given period (iteration)
In GA, what is a fitness function?
Function used to evaluate an individual’s ability to survive and reproduce. Provides a measure of the quality of each candidate solution
How is the choice of individuals who will form the initial population done?
Randomly or heuristically by using relevant information for the search process
What must individuals satisfy in GA?
Problem constraints (feasible)
What do larger populations allow for in GA?
Greater diversity of solutions, at the cost of more computation time
How should the initial population be generated when the population is small?
Not randomly, but with a well-defined approach that gives wider coverage of the search space
In GA, what is the population size?
The number of individuals in the population, which remains constant over time
How are hypotheses represented in GA?
Hypotheses expressed as if-then rules are represented by bit strings (binary)
How are multiple attributes represented in GA?
By concatenating the bit strings of the individual attributes
How can an entire rule be described in GA?
By concatenating the bit strings describing the rule preconditions, together with the bit string describing the rule postcondition
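The encoding described in the cards above can be sketched as follows. This is a minimal illustration, not a fixed scheme: the attribute names, their values, and the helper functions `encode_values` and `encode_rule` are hypothetical, and the sketch assumes one bit per possible attribute value.

```python
# Hypothetical attributes and values, chosen only for illustration.
ATTRIBUTES = {
    "Outlook": ["Sunny", "Overcast", "Rain"],
    "Wind": ["Strong", "Weak"],
}
TARGET_VALUES = ["Yes", "No"]  # hypothetical postcondition: PlayTennis


def encode_values(allowed, all_values):
    """One bit per possible value: 1 if the rule allows that value."""
    return "".join("1" if v in allowed else "0" for v in all_values)


def encode_rule(preconditions, postcondition):
    """Concatenate the precondition substrings, then the postcondition substring."""
    bits = ""
    for name, values in ATTRIBUTES.items():
        # An attribute the rule does not mention is unconstrained: all bits set.
        allowed = preconditions.get(name, values)
        bits += encode_values(allowed, values)
    bits += encode_values([postcondition], TARGET_VALUES)
    return bits


# IF Wind = Strong THEN PlayTennis = Yes
rule = encode_rule({"Wind": ["Strong"]}, "Yes")
print(rule)  # 1111010
```

Read left to right, the string is `111` (any Outlook), `10` (Wind = Strong), `10` (PlayTennis = Yes).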
What does the choice of the fitness function for ranking hypotheses depend on?
The problem to be solved
When minimizing or maximizing mathematical functions, what fitness function should be used?
The function to be maximized/minimized
For a classification problem, what fitness function should be used?
Could be defined as the accuracy of the hypothesis over the training data
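A minimal sketch of the two kinds of fitness function named above. The objective function, the 5-bit integer decoding, and the tiny training set are assumptions made up for the example.

```python
def decode(bits):
    """Interpret a bit string as an unsigned integer (assumed encoding)."""
    return int(bits, 2)


def objective(x):
    """Hypothetical function to maximize; its peak is at x = 10."""
    return -(x - 10) ** 2 + 100


def optimization_fitness(bits):
    """For function optimization, the fitness is the objective itself."""
    return objective(decode(bits))


def accuracy_fitness(hypothesis, training_data):
    """For classification, the fraction of training examples classified correctly."""
    correct = sum(1 for x, label in training_data if hypothesis(x) == label)
    return correct / len(training_data)


print(optimization_fitness("01010"))  # 100, since 01010 decodes to 10

# Hypothetical hypothesis and labeled examples.
data = [(1, False), (3, True), (5, False), (0, False)]
print(accuracy_fitness(lambda x: x > 2, data))  # 0.75
```

For a minimization problem, the same idea applies with the sign of the objective flipped.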
What happens to unfeasible individuals produced during the genetic operations?
Penalized or discarded
What is it called if all individuals in the previous population can be replaced by new solutions?
The process is called generational
What is it called if only one group of individuals can be replaced (usually the worst)?
Steady state, which prevents high-quality individuals from being lost
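The difference between the two replacement schemes can be sketched like this. The function names are hypothetical, and the steady-state version assumes there are no more offspring than current individuals.

```python
def generational_replace(population, offspring):
    """Generational: the whole previous population is replaced."""
    return list(offspring)


def steady_state_replace(population, offspring, fitness):
    """Steady state: only the worst individuals give way to the offspring."""
    keep = len(population) - len(offspring)
    survivors = sorted(population, key=fitness, reverse=True)[:keep]
    return survivors + list(offspring)


# Individuals are plain numbers here, and fitness is the number itself.
population = [5, 1, 9, 3]
offspring = [7, 8]
print(generational_replace(population, offspring))               # [7, 8]
print(steady_state_replace(population, offspring, lambda x: x))  # [9, 5, 7, 8]
```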
What are the two main strategies for selection?
Roulette wheel and tournament
In GA what is roulette wheel?
The probability that a hypothesis will be selected is given by the ratio of its fitness to the total fitness of all members of the current population (the probabilities correspond to the sectors of a roulette wheel)
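A minimal sketch of roulette wheel selection, assuming all fitness values are non-negative and at least one is positive. The function names and the example population are hypothetical.

```python
import random


def selection_probabilities(fitnesses):
    """Each individual's share of the wheel: fitness / total fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


def roulette_select(population, fitnesses, rng):
    """Spin the wheel once and return the individual whose sector is hit."""
    spin = rng.uniform(0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin <= running:
            return individual
    return population[-1]  # guard against floating-point round-off


population = ["A", "B", "C", "D"]
fitnesses = [1.0, 2.0, 3.0, 4.0]
print(selection_probabilities(fitnesses))  # [0.1, 0.2, 0.3, 0.4]

rng = random.Random(0)  # seeded only to make the example repeatable
print(roulette_select(population, fitnesses, rng))
```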
In GA what is tournament?
Two hypotheses are first chosen at random from the current population. With some predefined probability p, the more fit of these two is then selected, and with probability (1-p) the less fit hypothesis is selected.
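Tournament selection as described above can be sketched in a few lines. The function name and the example population are hypothetical, and the sketch assumes the two contestants are drawn without replacement.

```python
import random


def tournament_select(population, fitness, p, rng):
    """Pick two individuals at random; return the fitter one with probability p."""
    a, b = rng.sample(population, 2)
    better, worse = (a, b) if fitness(a) >= fitness(b) else (b, a)
    return better if rng.random() < p else worse


rng = random.Random(1)  # seeded only to make the example repeatable
population = [1, 2, 3, 4]  # fitness is the number itself in this toy example
print(tournament_select(population, lambda x: x, 0.8, rng))
```

Lowering p gives weaker individuals a better chance of being selected.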
Between the two main strategies for selection, which yields a more diverse population?
Tournament selection
In GA, what is mutation?
Produces small random changes to the bit string by choosing a single bit at random then changing its value
Why is mutation done during GA?
Prevents the algorithm from stagnating at a local maximum, allowing it to explore other regions that may be more promising
When is mutation done?
Often after crossover has been applied, and with a low probability (for example, 0.1%)
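A minimal sketch of single-bit mutation on a bit string. The function names are hypothetical, and treating the mutation rate as a per-individual probability is an assumption; the cards do not say whether the rate applies per individual or per bit.

```python
import random


def flip_random_bit(bits, rng):
    """Choose a single position at random and flip its value."""
    i = rng.randrange(len(bits))
    flipped = "1" if bits[i] == "0" else "0"
    return bits[:i] + flipped + bits[i + 1:]


def maybe_mutate(bits, rate, rng):
    """Apply the single-bit mutation with probability `rate` (assumed per individual)."""
    if rng.random() < rate:
        return flip_random_bit(bits, rng)
    return bits


rng = random.Random(42)  # seeded only to make the example repeatable
child = "1111010"
mutant = flip_random_bit(child, rng)
print(child, mutant)  # the two strings differ in exactly one position

# With a rate of 0.001 (0.1%), most offspring pass through unchanged.
print(maybe_mutate(child, 0.001, rng))
```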
What termination criteria can be used?
A fixed number of generations; an individual reaches the expected result; an individual comes within a given distance of the expected result; convergence (the solution does not improve significantly over several generations)
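These criteria can be combined into a single check run once per generation. The sketch below assumes a maximization problem; the function name, parameter names, and default thresholds are hypothetical.

```python
def should_stop(best_history, max_generations, target=None, tolerance=0.0,
                patience=10, min_improvement=1e-6):
    """Return the reason to stop, or None to keep evolving.

    best_history[i] is the best fitness found in generation i.
    """
    generation = len(best_history)
    # Criterion 1: fixed number of generations.
    if generation >= max_generations:
        return "max generations"
    # Criteria 2 and 3: expected result reached (tolerance = 0)
    # or within a given distance of it (tolerance > 0).
    if target is not None and best_history:
        if abs(target - best_history[-1]) <= tolerance:
            return "target reached"
    # Criterion 4: convergence, no significant gain over `patience` generations.
    if generation > patience:
        gain = best_history[-1] - best_history[-1 - patience]
        if gain < min_improvement:
            return "converged"
    return None


print(should_stop([1, 2, 3], max_generations=3))                    # max generations
print(should_stop([1, 9.95], 100, target=10, tolerance=0.1))        # target reached
print(should_stop([5] * 12, 100, patience=10))                      # converged
print(should_stop([1, 2], 100))                                     # None
```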
Why is GA less likely to fall into the same kind of local minima that can plague gradient descent methods, such as backpropagation in neural networks?
The gradient descent search in backpropagation moves smoothly from one hypothesis to a new hypothesis that is very similar. GA search can move much more abruptly, replacing a parent hypothesis with an offspring that may be radically different from the parent
What are the characteristics of GA?
Multiple candidate solutions produced in parallel, analysis of only parts of the search space, handle highly complex problems with a large space to search, not necessary to have specialized knowledge about the domain to find a function, the fitness function guides the search
What is a Convolutional Neural Network (CNN)?
Type of deep learning algorithm mainly used to process data with a grid-like topology, such as images, for tasks like image recognition and classification
What are the characteristics of CNN?
Automatic feature extraction, translation invariance, pre-trained architectures, and versatility
In CNN, what is Automatic feature Extraction?
CNNs can autonomously learn and extract hierarchical features from data, eliminating the need for manual feature engineering
In CNN, what is translation invariance?
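CNNs can recognize a pattern regardless of where it appears in the input, making them robust to shifts in position. Robustness to changes in orientation or scale is not built in and usually has to be learned, for example through data augmentation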
In CNNs, what are pre-trained architectures?
Models like InceptionV3 and ResNet50 achieve state-of-the-art results and can be fine-tuned for new tasks with relatively small datasets
In CNNs, what is versatility?
Though commonly used for image classification, CNNs also excel in natural language processing, time series analysis, and speech recognition
What do CNNs consist of?
Multiple layers such as the input, convolutional, pooling, flattening, fully connected, and output layers
What is the convolutional layer?
Applies filters to the input image to extract features
What is the pooling layer?
Downsamples the feature maps to reduce computation
What is the flattening layer?
Reshapes the data into a 1D vector
What is the fully connected layer?
Processes the learned features
What is the output layer?
Generates the final prediction
What is the input layer?
The layer through which input is given to the model
In the input layer, what are image channels?
They represent the image in a numerical format
For the input layer, what happens to the image channels if the image is in color?
A 3D array is used, in which each pixel of the image is represented by its values in three different channels: red, green, and blue (RGB). The three values are blended to form a single color
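As a small illustration of the 3D layout, here is a hypothetical 2x2 color image stored as nested lists in height x width x channel order. The pixel values and the channel-last ordering are assumptions for the example; some libraries put the channel axis first.

```python
# image[row][column] is one pixel: [red, green, blue], each 0-255.
image = [
    [[255, 0, 0], [0, 255, 0]],      # red pixel,  green pixel
    [[0, 0, 255], [255, 255, 0]],    # blue pixel, yellow pixel (red + green)
]

height, width, channels = len(image), len(image[0]), len(image[0][0])
print((height, width, channels))  # (2, 2, 3)

# Slicing out a single channel gives a 2D array again.
red_channel = [[pixel[0] for pixel in row] for row in image]
print(red_channel)  # [[255, 0], [0, 255]]
```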
In the convolutional layers, how are prominent features extracted?
Using filters (kernels)
What are filters in CNN?
Small grids (matrices) that slide across (convolve) the input image and compute a spatial correlation between the filter values and local pixel patterns. The result is a new array called a feature map
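The sliding operation can be sketched in plain Python. This is a minimal illustration assuming a single-channel image, a square kernel, no padding, and a stride of 1; the function name, the image, and the vertical-edge kernel are made up for the example.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and return the feature map."""
    k = len(kernel)  # assumes a square k x k kernel
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Dot product between the kernel and the patch it currently covers.
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Dark left half, bright right half: a vertical edge.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
print(convolve2d(image, vertical_edge))  # [[3, 3], [3, 3]]
```

The large positive values show that the filter responds wherever the brightness rises from left to right.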
In CNN, what are filters?
Work as a mini magnifying glass that looks for specific patterns in the photo, like lines, curves or shapes
Why is the convolutional layer powerful for CNNs in image classification?
It gives CNNs the ability to detect local patterns, allowing them to learn visual features much more effectively than traditional fully connected networks
What activation function is commonly used for CNNs?
ReLU
When applying a kernel, what operation is computed between the receptive field and the kernel?
Dot product (multiply element by element, then sum)
In CNN, what is padding?
Because the kernel can only scan a limited area of the input at a time, data points on the border cannot be learned properly. Padding solves this by adding extra data points with a value of 0 outside the border of the input, increasing its size.
In CNN, what is stride?
Controls how many pixels the filter moves across the input each time it slides. With a stride of n, the filter moves n pixels per step
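Padding and stride together determine the size of the feature map. The sketch below shows zero padding on a small 2D input and the standard output-size formula, floor((n + 2p - k) / s) + 1, for input size n, kernel size k, padding p, and stride s. The function names are hypothetical.

```python
def zero_pad(image, p):
    """Surround a 2D input with p rows and columns of zeros on every side."""
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)
    padded.extend([0] * width for _ in range(p))
    return padded


def output_size(n, k, padding=0, stride=1):
    """Feature-map size along one dimension."""
    return (n + 2 * padding - k) // stride + 1


print(zero_pad([[1, 2], [3, 4]], 1))
# [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]

print(output_size(5, 3))             # 3: without padding the map shrinks
print(output_size(5, 3, padding=1))  # 5: padding of 1 keeps the size
print(output_size(7, 3, stride=2))   # 3: a larger stride shrinks it faster
```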
What is the goal of the pooling layer?
Pull the most significant features from the feature maps, making the model less sensitive to small shifts or distortions in the input. Also relevant for mitigating overfitting, as the network will have fewer parameters and activations in the later layers
How is pooling done in CNN?
By applying an aggregation operation that reduces the dimensions of the feature map and the memory used during training. There are no weights to learn
What are the common types of pooling layers?
Max, min, sum, and average
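A minimal sketch of pooling over non-overlapping windows. The function names, the 2x2 window size, and the example feature map are assumptions; the aggregation operation is passed in, so the same code covers max, min, sum, and average pooling.

```python
def pool2d(feature_map, size=2, op=max):
    """Aggregate each non-overlapping size x size window with `op`."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(op(window))
        pooled.append(row)
    return pooled


def average(values):
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
print(pool2d(feature_map, 2, max))      # [[6, 4], [7, 9]]
print(pool2d(feature_map, 2, min))      # [[1, 0], [0, 3]]
print(pool2d(feature_map, 2, sum))      # [[15, 7], [10, 24]]
print(pool2d(feature_map, 2, average))  # [[3.75, 1.75], [2.5, 6.0]]
```

The 4x4 map becomes 2x2 in every case, and no weights are involved.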
Why is the flattening layer used?
To make the output of the convolutional and pooling layers compatible with an ANN, which expects a vector as input. It reshapes the 2D or 3D array into a 1D vector
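Flattening is only a reshape, as this minimal sketch on nested lists shows. The function name and the example values are hypothetical.

```python
def flatten(array):
    """Reshape a nested 2D or 3D list into a 1D list, keeping element order."""
    if not isinstance(array, list):
        return [array]
    flat = []
    for item in array:
        flat.extend(flatten(item))
    return flat


# Two 2x2 feature maps (a 3D array of shape 2 x 2 x 2).
feature_maps = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
print(flatten(feature_maps))  # [1, 2, 3, 4, 5, 6, 7, 8]
```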
What is the fully connected dense layer?
These layers receive the one-dimensional vector generated by the flattening layer. The main purpose of the fully connected layer is to learn high-level, abstract patterns by considering the entire input.
In CNN, what happens in a dense layer?
Every neuron is connected to every neuron in the previous layer, and the weights are trainable. ReLU is commonly used as the activation function
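The forward pass of one dense layer with ReLU can be sketched as follows. The function names, weights, and biases are made up for the example; in a real network the weights and biases would be learned during training.

```python
def relu(z):
    """ReLU passes positive values through and clips negatives to zero."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation=relu):
    """Forward pass: weights[j] holds the weights of neuron j, one per input."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


flattened = [1.0, 2.0, 3.0]   # e.g. the vector from the flattening layer
weights = [
    [0.5, -1.0, 0.0],         # neuron 0 sees every input
    [1.0, 1.0, 1.0],          # neuron 1 sees every input
]
biases = [0.0, -1.0]
print(dense(flattened, weights, biases))  # [0.0, 5.0]
```

Neuron 0 computes -1.5 before activation, which ReLU clips to 0.0.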
In CNN, what is the output (dense) layer used for?
Responsible for the prediction. It is usually a dense layer with an activation function suited to the task
In CNN, in the output layer what activation function is used for classification?
Softmax for multiclass classification or sigmoid for binary classification
In CNN, in the output layer what activation function is used for regression?
Linear activation function
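The three output activations mentioned above, sketched with the standard library only. The example logits are made up; subtracting the maximum inside `softmax` is a common numerical-stability trick and does not change the result.

```python
import math


def softmax(logits):
    """Multiclass: turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Binary: squash one raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(z):
    """Regression: the raw score is the prediction."""
    return z


probabilities = softmax([2.0, 1.0, 0.1])
print(probabilities)       # largest probability goes to the first class
print(sum(probabilities))  # approximately 1.0
print(sigmoid(0.0))        # 0.5
print(linear(3.7))         # 3.7
```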
In CNN, what should match in the output dense layer?
The number of output neurons should match the number of classes or be 1 (for binary classification or regression)
What are the characteristics of multilayer ANNs as function approximators?
Multilayer neural networks with at least one hidden layer are universal approximators, meaning they can approximate any target function. However, because they are so expressive, choosing the right network topology is crucial to prevent overfitting.
How do multilayer ANNs handle redundant features?
They can handle redundant features because the weights are learned automatically during training, so redundant features end up with smaller weights
How do multilayer ANNs respond to noise?
Neural networks are sensitive to noise in training data. A common way to address this is by using a validation set to estimate generalization error
What is a characteristic of gradient descent in multilayer ANNs?
Gradient descent, used for optimizing ANN weights, often converges to a local optimum
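The effect is easy to see on a one-dimensional toy problem. The function below is a hypothetical stand-in for a loss surface, not an actual ANN loss; it has a shallow minimum near x = 1.13 and a deeper one near x = -1.30. The learning rate and step count are also assumptions.

```python
def f(x):
    """Hypothetical loss with two minima."""
    return x**4 - 3 * x**2 + x


def grad_f(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x, learning_rate=0.01, steps=1000):
    """Repeatedly step against the gradient from starting point x."""
    for _ in range(steps):
        x -= learning_rate * grad_f(x)
    return x


x_right = gradient_descent(2.0)   # settles in the shallow (local) minimum
x_left = gradient_descent(-2.0)   # settles in the deeper minimum
print(round(x_right, 2), round(f(x_right), 2))
print(round(x_left, 2), round(f(x_left), 2))
```

Which minimum is reached depends only on the starting point, which is why weight initialization matters in practice.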
What are the characteristics of training multilayer ANNs?
Training ANNs is computationally expensive, especially with many hidden nodes, but once trained, they classify new examples efficiently