Classification
A method that predicts which category a new observation belongs to using a labeled training dataset.
When To Use Classification
Use when the outcome variable is categorical (e.g., Default vs. No Default, Spam vs. Not Spam).
Outcome Variable (Target)
The categorical variable you want to predict.
In a decision tree, each leaf node holds the final predicted value of this variable.
Predictor Variables (Inputs / Features)
The attributes used to split the data and make decisions at each node.
Maximum Depth (max_depth)
The maximum number of splits allowed from the root node to any leaf node.
Maximum Depth (max_depth): Small Value
Produces a less complex (shallower) tree.
Maximum Depth (max_depth): Large Value
Produces a more complex (deeper) tree.
Minimum Split (min_split)
The minimum number of observations required in a node to attempt an additional split.
Minimum Split (min_split): Small Value
Produces a more complex tree (allows splitting smaller groups).
Minimum Split (min_split): Large Value
Produces a less complex tree (prevents splitting small groups).
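Example: max_depth and min_split in action
A minimal sketch of how the two settings above limit tree growth, written in plain Python rather than with any particular library. The helper names (gini, best_split, build, depth_of, count_leaves), the use of Gini impurity to pick splits, and the eight-row dataset are all assumptions made for illustration; real tools may choose splits differently and may name the parameters differently.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers impurity most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build(rows, labels, max_depth, min_split, depth=0):
    """Grow a tree. A leaf is a class label; an internal node is a dict."""
    majority = Counter(labels).most_common(1)[0][0]
    # Stop if the depth limit is reached or the node is too small to split.
    if depth >= max_depth or len(labels) < min_split:
        return majority
    split = best_split(rows, labels)
    if split is None:  # no split improves purity
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]

    def grow(part):
        return build([r for r, _ in part], [y for _, y in part],
                     max_depth, min_split, depth + 1)

    return {"feature": f, "threshold": t, "left": grow(left), "right": grow(right)}

def depth_of(tree):
    """Number of splits on the longest path from the root to a leaf."""
    if not isinstance(tree, dict):
        return 0
    return 1 + max(depth_of(tree["left"]), depth_of(tree["right"]))

def count_leaves(tree):
    """Number of leaf nodes in the tree."""
    if not isinstance(tree, dict):
        return 1
    return count_leaves(tree["left"]) + count_leaves(tree["right"])

# Hypothetical data: one predictor, a two-class outcome.
X = [[1], [2], [3], [4], [5], [6], [7], [8]]
y = ["No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes"]

shallow = build(X, y, max_depth=1, min_split=2)   # small max_depth
deep = build(X, y, max_depth=10, min_split=2)     # large max_depth
coarse = build(X, y, max_depth=10, min_split=5)   # large min_split

print(depth_of(shallow), count_leaves(shallow))  # 1 2
print(depth_of(deep), count_leaves(deep))        # 3 4
print(depth_of(coarse), count_leaves(coarse))    # 2 3
```

On this toy data, lowering max_depth or raising min_split both yield a tree with fewer leaves, which matches the "less complex tree" cards above.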
Confusion Matrix
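A table used to check the accuracy of the model by comparing the predicted outcomes to the observed (actual) outcomes in the validation set. Each cell counts the observations with a given actual category and a given predicted category.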
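Example: building a confusion matrix
A small sketch of how the counts can be tallied in plain Python. The function name confusion_matrix, the row/column layout (rows are actual, columns are predicted), and the six validation observations are assumptions for illustration only, not output from a real model.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual categories, columns are predicted categories."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Hypothetical validation-set outcomes.
labels = ["Default", "No Default"]
actual = ["Default", "Default", "No Default", "No Default", "No Default", "Default"]
predicted = ["Default", "No Default", "No Default", "No Default", "Default", "Default"]

matrix = confusion_matrix(actual, predicted, labels)

# Accuracy is the share of observations on the diagonal (correct predictions).
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / len(actual)

print(matrix)              # [[2, 1], [1, 2]]
print(round(accuracy, 3))  # 0.667
```

The diagonal cells hold the correct predictions; the off-diagonal cells show which categories the model confuses with each other.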