lecture 7
What is a decision tree?
A flowchart-like model that makes decisions by following a path of if/else questions about features, leading to a prediction at the end.
What are the main parts of a decision tree?
Root node (starting point), internal nodes (questions), branches (possible answers), and leaf nodes (final predictions).
What's the difference between classification and regression trees?
Classification trees predict categories (like "apple" or "orange"), while regression trees predict continuous values (like price or temperature).
What makes decision trees easy to understand?
Their visual, flowchart structure shows exactly which features matter and how decisions are made, unlike "black box" models.
How does a decision tree make predictions?
By following a path from root to leaf, answering questions about features at each node until reaching a final prediction.
What algorithm is commonly used to build decision trees?
CART (Classification And Regression Trees), which finds the best feature and threshold to split the data at each step.
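A minimal sketch of CART in practice, using scikit-learn's `DecisionTreeClassifier` (the toy fruit data below is made up for illustration):

```python
from sklearn.tree import DecisionTreeClassifier

# Toy data: [weight_g, is_bumpy] -> fruit label (illustrative values)
X = [[150, 0], [170, 0], [130, 1], [120, 1]]
y = ["apple", "apple", "orange", "orange"]

tree = DecisionTreeClassifier(criterion="gini", random_state=0)
tree.fit(X, y)
print(tree.predict([[160, 0]]))
```

Either feature separates this toy set perfectly, so a heavy, smooth fruit is predicted as "apple" whichever split CART picks first.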
What does a decision tree try to achieve when splitting data?
Creating groups that are as "pure" as possible, meaning they contain mostly the same class or similar values.
How does a decision tree choose the best feature to split on?
By finding the feature that creates the biggest reduction in impurity (like Gini or entropy) after splitting.
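The split search above can be sketched in plain Python: try every candidate threshold and keep the one with the largest weighted impurity reduction (the helper names and data are made up):

```python
def gini(labels):
    """Gini impurity: 1 - sum of squared class probabilities."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return 1 - sum(p ** 2 for p in probs)

def best_split(values, labels):
    """Return (threshold, gain) with the biggest impurity reduction."""
    parent = gini(labels)
    n = len(labels)
    best = (None, 0.0)
    for t in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        if not left or not right:
            continue  # skip splits that put everything on one side
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        if parent - child > best[1]:
            best = (t, parent - child)
    return best

print(best_split([1, 2, 8, 9], ["a", "a", "b", "b"]))  # -> (2, 0.5)
```

Splitting at 2 separates the classes perfectly, dropping impurity from 0.5 to 0, so that threshold wins.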
What is the concept of impurity?
A node is pure when it contains only one class and impure when it contains a mix of classes; the degree of impurity depends on how mixed the node is.
What is Gini impurity?
Gini impurity measures the probability of misclassification at a node. It equals zero when a node contains only one class and increases as the node becomes more mixed, guiding the tree to create purer splits.
How is Gini impurity calculated?
1 minus the sum of squared probabilities of each class; Gini = 1 - Σ(probability of class)².
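The formula above as a small Python function (labels are illustrative):

```python
def gini(labels):
    """Gini = 1 - sum(p_i ** 2) over the classes present in the node."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return 1 - sum(p ** 2 for p in probs)

print(gini(["apple"] * 4))                              # pure node -> 0.0
print(gini(["apple", "orange", "apple", "orange"]))     # 50/50 mix -> 0.5
```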
What is entropy in decision trees?
Entropy measures how uncertain or mixed up a set of data is. Formula: Entropy = -Σ(pᵢ × log₂(pᵢ)).
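The entropy formula as a small Python function (labels are illustrative):

```python
import math

def entropy(labels):
    """Entropy = -sum(p_i * log2(p_i)) over the classes present."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

print(entropy(["apple"] * 3))         # pure node -> 0.0
print(entropy(["apple", "orange"]))   # 50/50 mix -> 1.0 bit
```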
Which is faster to calculate: Gini impurity or entropy?
Gini impurity, which is why it's often the default choice even though both give similar results.
Why do decision trees tend to overfit?
They can grow very deep and create complex rules that fit training data perfectly but don't generalize well to new data.
What is the most common way to control tree complexity?
Setting a maximum depth to limit how many questions the tree can ask before making a prediction.
What does min_samples_split control?
The minimum number of samples needed in a node before it can be split, preventing splits with too little data.
What does min_samples_leaf control?
The minimum number of samples required in a leaf node, ensuring predictions are based on enough data.
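The complexity controls above map directly onto scikit-learn parameters; a sketch with illustrative values, fit on the built-in iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(
    max_depth=3,           # at most 3 questions from root to leaf
    min_samples_split=10,  # a node needs >= 10 samples to be split
    min_samples_leaf=5,    # every leaf must keep >= 5 samples
    random_state=0,
)
tree.fit(X, y)
print(tree.get_depth())  # never exceeds 3 by construction
```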
What is pruning in decision trees?
Removing branches that provide little predictive power to reduce complexity and prevent overfitting.
How can you find the best hyperparameters for a decision tree?
Using Grid Search or Random Search with cross-validation to test different combinations of settings.
What's the difference between Grid Search and Random Search?
Grid Search tests all combinations of specified parameters, while Random Search tests random combinations, often saving time.
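A sketch of Grid Search with cross-validation using scikit-learn's `GridSearchCV` (the grid values are illustrative, not recommended settings):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
param_grid = {"max_depth": [2, 3, 4], "min_samples_leaf": [1, 5]}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,  # 5-fold cross-validation for each of the 6 combinations
)
search.fit(X, y)
print(search.best_params_)
```

Swapping `GridSearchCV` for `RandomizedSearchCV` with an `n_iter` budget samples random combinations instead of testing all of them.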
Why is cross-validation important when tuning decision trees?
It ensures the model performs well across different subsets of data, not just one specific test set.
What are the main advantages of decision trees?
Easy to understand, no need for feature scaling, handle mixed data types, and can capture non-linear patterns.
What are the main limitations of decision trees?
Tend to overfit, can be unstable (small data changes create different trees), and create boxy decision boundaries.
Why don't decision trees need feature scaling?
Because they use thresholds rather than distances, so the scale of features doesn't affect their performance.
Why might a small change in training data create a completely different tree?
If the best split changes even slightly, all subsequent splits will be different, causing the whole tree structure to change.
When would you choose a decision tree over other models?
When interpretability is important, when you have mixed data types, or when you need a baseline model quickly.
What types of decision boundaries can decision trees create?
Rectangular, axis-aligned boundaries built by splitting features one at a time.
How do decision trees handle categorical features?
By creating binary splits for each category or group of categories, testing if data belongs to a category or not.
Why are single decision trees often outperformed by ensemble methods?
Because ensembles like Random Forests combine multiple trees to overcome the limitations of any single tree.