Cross-Validation
Used when datasets contain data that vary over the time of collection. If an algorithm was trained on earlier data but the trends and patterns are different in later data, it wouldn't be valid. Essentially, we use the same dataset to repeatedly train and test the ML model.
At each iteration, a different subset of the dataset is reserved for testing while the rest is used for training. There are many iterations, and the model is trained and evaluated on each one (very time consuming).
Results in much more reliable estimates of model performance
Reserve some portion of the dataset for testing
Use the rest of the dataset to train the model
Test the model using the reserved portion
Rinse and repeat
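
A rough sketch of the steps above as a k-fold cross-validation loop in Python. It assumes scikit-learn is available; the dataset, the logistic-regression model, and the fold count are just placeholders for illustration:

```python
# Minimal k-fold cross-validation sketch (illustrative choices throughout).
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)              # example dataset
kf = KFold(n_splits=5, shuffle=True, random_state=0)

scores = []
for train_idx, test_idx in kf.split(X):
    # Reserve one fold for testing, train on the rest
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    preds = model.predict(X[test_idx])
    scores.append(accuracy_score(y[test_idx], preds))

print("per-fold accuracy:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```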
Genetic Programming
An algorithm inspired by natural selection that uses operations modelled on biological genetics.
Essentially, the algorithm generates an initial random set of solutions. This population undergoes "genetic operations" to hopefully bring forth a new population of better solutions, eventually producing the best solution (see the code sketch after the crossover description below).
Genetic Operations
Includes:
GP mutation
GP crossover
All serve the goal of GP survival: the persistence of the best solution(s)
GP Mutation
Used to ensure diversity in the solutions and to stop the search from stagnating
GP Crossover
Information from two separate solutions is crossed over and swapped
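
A simplified sketch of the mutation, crossover, and survival ideas above. Real genetic programming evolves program trees; to keep the example short, each "solution" here is just a bit string and the target string is a made-up example, so this is closer to a toy genetic algorithm than full GP:

```python
import random

TARGET = [1, 0, 1, 1, 0, 0, 1, 0]          # hypothetical "best" solution

def fitness(solution):
    # Higher is better: number of bits matching the target
    return sum(1 for a, b in zip(solution, TARGET) if a == b)

def mutate(solution, rate=0.1):
    # GP mutation: randomly flip bits to keep the population diverse
    return [1 - bit if random.random() < rate else bit for bit in solution]

def crossover(parent_a, parent_b):
    # GP crossover: swap information between two solutions at a random point
    point = random.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]

# Initial random population of candidate solutions
population = [[random.randint(0, 1) for _ in TARGET] for _ in range(20)]

for generation in range(50):
    # GP survival: keep the fittest half of the population
    population.sort(key=fitness, reverse=True)
    survivors = population[: len(population) // 2]
    # Breed the rest of the new population via crossover + mutation
    children = [
        mutate(crossover(random.choice(survivors), random.choice(survivors)))
        for _ in range(len(population) - len(survivors))
    ]
    population = survivors + children

best = max(population, key=fitness)
print("best solution:", best, "fitness:", fitness(best))
```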
Pros & Cons of Genetic Programming
Pros:
It can aid understanding (solutions are easier to understand and interpret; e.g. if every solution contains a particular variable, that variable is probably important to the output)
Produces multiple solutions
Can be “stochastic”
Cons:
Very computationally heavy
We might never get the best solution (because of the random generation, the search can get stuck and find a local best instead of the global best)
It is slow and not the most powerful algorithm