Vocabulary flashcards for key concepts in machine learning, focusing on ensemble methods, evaluation metrics, and techniques for handling imbalanced datasets.
Ensemble Learning
A method that combines the predictions of multiple models to improve overall prediction accuracy.
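A minimal sketch of the idea using scikit-learn's VotingClassifier; the dataset, base models, and settings are illustrative, and scikit-learn is assumed to be installed.

```python
# Ensemble learning sketch: combine three different models by majority vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier()),
        ("nb", GaussianNB()),
    ],
    voting="hard",  # hard voting = majority vote over predicted labels
)
ensemble.fit(X_train, y_train)
print("Ensemble accuracy:", ensemble.score(X_test, y_test))
```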
Bagging
A technique that reduces variance by training multiple models independently using random subsets of data.
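A minimal bagging sketch with scikit-learn's BaggingClassifier (the default base learner is a decision tree); the data and hyperparameters are illustrative.

```python
# Bagging sketch: many independent models, each trained on a bootstrap sample.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=500, random_state=0)

# 50 base learners, each fit on a random sample of the data drawn with replacement.
bagger = BaggingClassifier(n_estimators=50, bootstrap=True, random_state=0)
bagger.fit(X, y)
print("Training accuracy:", bagger.score(X, y))
```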
Boosting
A sequential technique that reduces bias by training models one after another, focusing on examples that previous models misclassified.
Bootstrapping
A sampling technique used to create multiple subsets of data from a single dataset, with replacement.
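A small NumPy sketch of bootstrapping; the dataset and number of samples are illustrative.

```python
# Bootstrapping sketch: draw subsets with replacement from a single dataset.
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(10)  # original dataset

# Each bootstrap sample has the same size as the original; values can repeat
# within a sample because sampling is done with replacement.
bootstrap_samples = [rng.choice(data, size=len(data), replace=True) for _ in range(3)]
for i, sample in enumerate(bootstrap_samples):
    print(f"bootstrap sample {i}: {sample}")
```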
Bias
The error due to overly simplistic assumptions in the learning algorithm.
Variance
The error due to excessive sensitivity to small fluctuations in the training set.
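For reference, these two error terms come from the standard bias-variance decomposition of expected squared error; a sketch of that decomposition:

```latex
% For a target y = f(x) + \varepsilon with noise variance \sigma^2
% and a learned predictor \hat{f}(x):
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```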
Random Forest
An ensemble method that trains many decision trees on bootstrap samples, using random feature subsets at each split, and aggregates their predictions for classification and regression.
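A minimal random forest sketch with scikit-learn; the data and hyperparameters are illustrative.

```python
# Random forest sketch: 100 trees on bootstrap samples, random feature subsets per split.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)

forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X, y)
print("Training accuracy:", forest.score(X, y))
```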
AdaBoost
An ensemble method that adjusts the weights of instances based on previous classifiers’ errors.
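A minimal AdaBoost sketch with scikit-learn; parameters are illustrative.

```python
# AdaBoost sketch: each new weak learner (a shallow tree by default) is trained
# on instance weights increased for examples the previous learners got wrong.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=500, random_state=0)

ada = AdaBoostClassifier(n_estimators=100, learning_rate=0.5, random_state=0)
ada.fit(X, y)
print("Training accuracy:", ada.score(X, y))
```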
Gradient Boosting
A method where new models are added to correct errors made by existing models.
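A minimal gradient boosting sketch with scikit-learn; parameters are illustrative.

```python
# Gradient boosting sketch: each new tree is fit to the errors (loss gradient)
# left by the current ensemble, then added to it.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, random_state=0)

gbm = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0
)
gbm.fit(X, y)
print("Training accuracy:", gbm.score(X, y))
```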
XGBoost
An optimized gradient boosting framework that is widely used for its performance.
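A minimal sketch using the xgboost package's scikit-learn-style estimator; it assumes the xgboost package is installed, and the parameters are illustrative.

```python
# XGBoost sketch via its scikit-learn-compatible classifier.
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, random_state=0)

model = XGBClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
model.fit(X, y)
print("Training accuracy:", model.score(X, y))
```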
Confusion Matrix
A table that summarizes the performance of a classification algorithm by comparing predicted vs actual classifications.
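A minimal confusion matrix sketch with scikit-learn; the toy labels are made up for illustration.

```python
# Confusion matrix sketch: rows are actual classes, columns are predicted classes.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels [0, 1] the layout is:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))  # -> [[3 1] [1 3]]
```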
True Positive (TP)
Instances correctly predicted as the positive class.
False Positive (FP)
Instances incorrectly predicted as the positive class.
True Negative (TN)
Instances correctly predicted as the negative class.
False Negative (FN)
Instances incorrectly predicted as the negative class.
Precision
The ratio of true positive predictions to the total predicted positive cases.
Recall
The ratio of true positive predictions to the total actual positive cases.
F1 Score
The harmonic mean of precision and recall, giving a single score that balances the two.
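A minimal sketch computing precision, recall, and F1 with scikit-learn, reusing the toy labels from the confusion-matrix example above.

```python
# Precision, recall, and F1 on the toy labels.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# precision = TP / (TP + FP), recall = TP / (TP + FN),
# F1 = 2 * precision * recall / (precision + recall)
print("precision:", precision_score(y_true, y_pred))  # 3 / (3 + 1) = 0.75
print("recall:   ", recall_score(y_true, y_pred))      # 3 / (3 + 1) = 0.75
print("f1:       ", f1_score(y_true, y_pred))          # 0.75
```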
Imbalanced Dataset
A dataset where the distribution of classes is not uniform, affecting model performance.
Oversampling
The process of increasing the number of instances in the minority class in an imbalanced dataset.
Undersampling
The process of reducing the number of instances in the majority class in an imbalanced dataset.
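A minimal sketch of random oversampling and undersampling using sklearn.utils.resample; the toy imbalanced data is made up, and dedicated libraries such as imbalanced-learn offer richer strategies.

```python
# Over- and undersampling sketch on a toy imbalanced dataset.
import numpy as np
from sklearn.utils import resample

X = np.arange(20).reshape(-1, 1)
y = np.array([0] * 15 + [1] * 5)  # majority class 0, minority class 1

X_min, y_min = X[y == 1], y[y == 1]
X_maj, y_maj = X[y == 0], y[y == 0]

# Oversampling: duplicate minority instances (with replacement) up to the majority size.
X_min_up, y_min_up = resample(
    X_min, y_min, replace=True, n_samples=len(y_maj), random_state=0
)

# Undersampling: drop majority instances (without replacement) down to the minority size.
X_maj_down, y_maj_down = resample(
    X_maj, y_maj, replace=False, n_samples=len(y_min), random_state=0
)

print("oversampled class counts:", np.bincount(np.concatenate([y_maj, y_min_up])))    # [15 15]
print("undersampled class counts:", np.bincount(np.concatenate([y_maj_down, y_min])))  # [5 5]
```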