Logistic Regression
Logistic Regression is a statistical and machine learning method used for classification problems, especially when the outcome has two possible results.
Logistic Regression (purpose)
It predicts the probability that an observation belongs to a particular category by passing a weighted sum of the input features through the logistic (sigmoid) function, which maps any value to a probability between 0 and 1.
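A minimal sketch of this idea, assuming scikit-learn and a tiny hypothetical dataset (hours studied vs. pass/fail):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: hours studied -> pass (1) or fail (0)
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

# predict_proba returns [P(class 0), P(class 1)] for each sample,
# so the model outputs a probability rather than just a hard label
print(model.predict_proba([[3.5]]))
```

The key point of the card: the model's raw output is a probability, and a class label is obtained only by thresholding it (0.5 by default in `predict`).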
Decision Tree
A decision tree is a supervised machine learning algorithm that can be used for both classification (sorting data into categories) and regression (predicting continuous values) tasks.
Decision Tree (description)
It works by creating a model that resembles a flowchart or an upside-down tree, where each internal node represents a “test” on a data feature, each branch represents the outcome of the test, and each leaf node represents the final predicted outcome or class label.
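The flowchart structure described above can be inspected directly; a short sketch assuming scikit-learn and its bundled Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# export_text prints the tree as indented text: each line is a node's
# feature test, indentation marks the branches, and "class:" lines
# are the leaf nodes with the final predicted label
print(export_text(tree, feature_names=iris.feature_names))
```

Reading the printed output top to bottom is exactly the "upside-down tree" walk the card describes: start at the root test and follow branches until a leaf.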
Overfitting
A decision tree can grow very deep and complex, essentially memorizing the noise and small fluctuations in the training data rather than learning the true underlying patterns. This results in a model with very high accuracy on the training set but low accuracy on a separate test set.
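This gap between training and test accuracy is easy to reproduce; a sketch assuming scikit-learn and synthetic data with deliberately noisy labels (`flip_y` randomizes a fraction of them):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with ~20% label noise so there is noise to memorize
X, y = make_classification(n_samples=300, n_features=20,
                           flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# No depth limit: the tree grows until it fits the training set exactly
deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

print("train accuracy:", deep.score(X_tr, y_tr))
print("test accuracy: ", deep.score(X_te, y_te))
```

The unrestricted tree scores perfectly on the training set while doing noticeably worse on held-out data, which is the overfitting pattern the card describes. Constraining `max_depth` or `min_samples_leaf` narrows the gap.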
Instability
Decision trees are very sensitive to small changes in the training data. A minor change, like adding or removing a few data points, can lead to a completely different tree structure, making the model unstable and unreliable.
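The sensitivity to small data changes can be probed by fitting the same tree twice on nearly identical data; a sketch assuming scikit-learn and synthetic noisy data:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, flip_y=0.1, random_state=0)

# Identical algorithm and settings - the only difference is 5 dropped rows
t1 = DecisionTreeClassifier(random_state=0).fit(X, y)
t2 = DecisionTreeClassifier(random_state=0).fit(X[5:], y[5:])

# The learned structures frequently diverge: compare the number of
# nodes and which feature each tree chose for its root split
print(t1.tree_.node_count, t2.tree_.node_count)
print(t1.tree_.feature[0], t2.tree_.feature[0])
```

Because splits are chosen greedily, a slightly different split score near the root can cascade into an entirely different tree below it, which is the instability the card refers to.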
Bias toward dominant classes
If the dataset is imbalanced (one class has significantly more data points than others), the tree may become biased towards the majority class and fail to generalize well for the minority classes.
Random Forest
Random Forests are an ensemble learning method that addresses the weaknesses of a single decision tree. The algorithm works by creating a “forest” of many trees and then aggregating their results.
Reduced Overfitting
Uses bagging (bootstrap aggregating) to build many unique trees, each trained on a random sample of the data. The final prediction is the average of the trees' outputs (regression) or a majority vote (classification), which lowers variance and prevents overfitting.
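The variance reduction from bagging can be seen by comparing a single tree against a forest on the same noisy data; a sketch assuming scikit-learn:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with ~20% label noise, as in the overfitting example
X, y = make_classification(n_samples=300, n_features=20,
                           flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# 200 trees, each fit on a bootstrap sample; predictions are combined
# by majority vote across the forest
forest = RandomForestClassifier(n_estimators=200,
                                random_state=0).fit(X_tr, y_tr)

print("single tree test accuracy:", tree.score(X_te, y_te))
print("forest test accuracy:     ", forest.score(X_te, y_te))
```

Each individual tree still overfits its bootstrap sample, but their memorized noise is largely uncorrelated, so voting averages it away and the ensemble generalizes better.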
Increased Stability
Aggregating many trees cancels out individual errors or biases, making the model more reliable and less sensitive to noise/outliers.
Better with Imbalanced Data
Each tree sees a slightly different class distribution, so the ensemble handles imbalance more effectively than a single tree (though additional methods may still be needed).
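One of the "additional methods" mentioned above is class weighting; a sketch assuming scikit-learn's `class_weight="balanced"` option and a synthetic 90/10 imbalanced dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic imbalance: ~90% class 0, ~10% class 1, with some label noise
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1],
                           flip_y=0.05, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y,
                                          random_state=0)

plain = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# "balanced" reweights samples inversely to class frequency, so errors
# on the rare class cost more during tree construction
weighted = RandomForestClassifier(class_weight="balanced",
                                  random_state=0).fit(X_tr, y_tr)

# Recall on the minority class (label 1) is the metric imbalance hurts most
print("minority recall, plain:   ", recall_score(y_te, plain.predict(X_te)))
print("minority recall, weighted:", recall_score(y_te, weighted.predict(X_te)))
```

Overall accuracy can look fine on imbalanced data even when the minority class is mostly missed, so per-class recall is the more honest comparison here.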