Decision Tree
can be used to visually and explicitly represent decisions and decision making
Decision Tree
Build a _______ for classifying
Decision Tree
It utilizes supervised learning with batch processing of training examples
Preference Bias
Define a metric for comparing candidate functions (hypotheses) f so as to determine whether one is better than another
upside down
A decision tree is drawn ______ with its root at the top
condition or internal node
In the decision tree diagram, the bold black text represents a ____
branches or edges
tree splits into _____
decision or leaf
The end of the branch that doesn’t split anymore is the _____
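The parts named above (condition or internal node, branches or edges, decision or leaf) can be sketched as a nested structure. This is only an illustration: the attribute names, values, and the `classify` helper are hypothetical and do not come from the cards.

```python
# Hypothetical toy tree: internal nodes test an attribute, edges carry
# attribute values, and leaves carry the final decision.
tree = {
    "attribute": "outlook",            # root condition (internal node)
    "branches": {                      # one edge per attribute value
        "sunny": {
            "attribute": "windy",      # another internal node
            "branches": {"yes": "stay in", "no": "play"},
        },
        "rainy": "stay in",            # leaf: this branch does not split again
    },
}


def classify(node, example):
    """Walk from the root down to a leaf and return its decision."""
    while isinstance(node, dict):      # still at an internal node
        value = example[node["attribute"]]
        node = node["branches"][value]
    return node
```

The root sits at the top of the structure, matching the "upside down" drawing convention.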
Random
Select any attribute at random
Least-Values
Choose the attribute with the smallest number of possible values
Most-Values
Choose the attribute with the largest number of possible values
Max-Gain
Choose the attribute that has the largest expected information gain
Max-Gain
try to select the attribute that will result in the smallest expected size of the subtrees rooted at its children
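A minimal sketch of the four attribute-selection strategies above (Random, Least-Values, Most-Values, Max-Gain). It assumes examples are dicts, and it counts the values observed in the data as a stand-in for "possible values". The function names and the toy rows are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Entropy of a list of labels, in bits."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy before the split minus expected entropy after it."""
    before = entropy([r[target] for r in rows])
    after = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after


def num_values(rows, attr):
    """Number of distinct values observed for attr (assumption: observed = possible)."""
    return len({r[attr] for r in rows})


def choose(rows, attrs, target, strategy):
    if strategy == "random":
        return random.choice(attrs)
    if strategy == "least-values":
        return min(attrs, key=lambda a: num_values(rows, a))
    if strategy == "most-values":
        return max(attrs, key=lambda a: num_values(rows, a))
    if strategy == "max-gain":
        return max(attrs, key=lambda a: information_gain(rows, a, target))
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical toy data: "size" predicts the label perfectly, "color" does not.
rows = [
    {"color": "red", "size": "small", "label": "no"},
    {"color": "green", "size": "small", "label": "no"},
    {"color": "blue", "size": "large", "label": "yes"},
    {"color": "red", "size": "large", "label": "yes"},
]
```

On this toy data Max-Gain picks `size`, because splitting on it leaves subtrees that need no further questions.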
H
measures the information content or entropy in bits
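As a rough illustration, H can be computed from class frequencies with the standard formula H = -sum(p * log2(p)). The function name `entropy_bits` is an assumption made for this sketch.

```python
import math
from collections import Counter


def entropy_bits(labels):
    """Information content (entropy) of a list of class labels, in bits."""
    total = len(labels)
    counts = Counter(labels)
    # Each class contributes -p * log2(p), where p is its relative frequency.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())
```

A two-class set split evenly gives 1 bit, and a set where every example has the same class gives 0 bits.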
Low information content
is desirable in order to make the smallest tree, because low information content means that most of the examples are classified the same, so we would expect the rest of the tree rooted at this node to need few further tests to differentiate between the two classifications.
Conditional entropy
is the entropy of a class, Y, computed from the conditional probability Pr(Y|X=v) of Y given a value, v, for an attribute (i.e., question), X.
question
In Pr(Y|X=v), what is X?
label
In Pr(Y|X=v), what is Y?
answer to the question
In Pr(Y|X=v), what is v?
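A small sketch of conditional entropy built on Pr(Y|X=v): first the entropy of the labels among examples whose answer to question X is v, then the average over all answers weighted by Pr(X=v). The function names and the toy (answer, label) pairs are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of labels, in bits."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def entropy_given_value(pairs, v):
    """H(Y | X=v): entropy of labels Y among examples whose answer to X is v."""
    return entropy([y for x, y in pairs if x == v])


def conditional_entropy(pairs):
    """H(Y | X) = sum over v of Pr(X=v) * H(Y | X=v)."""
    n = len(pairs)
    total = 0.0
    for v in {x for x, _ in pairs}:
        weight = sum(1 for x, _ in pairs if x == v) / n   # Pr(X=v)
        total += weight * entropy_given_value(pairs, v)
    return total


# Hypothetical data: X is the question "outlook?", Y is the label.
pairs = [("sunny", "play"), ("sunny", "play"), ("rainy", "play"), ("rainy", "stay")]
```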
symmetric
information gain is ________
mutual information
other term for information gain
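The symmetry can be checked numerically: the gain about Y from asking X equals the gain about X from knowing Y. This is a sketch on hypothetical toy pairs, and `gain` is a name chosen for illustration.

```python
import math
from collections import Counter


def entropy(items):
    """Entropy of a list of items, in bits."""
    n = len(items)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(items).values())


def gain(pairs):
    """Information gain about the second element from knowing the first:
    H(second) - H(second | first)."""
    n = len(pairs)
    conditional = 0.0
    for v in {a for a, _ in pairs}:
        subset = [b for a, b in pairs if a == v]
        conditional += len(subset) / n * entropy(subset)
    return entropy([b for _, b in pairs]) - conditional


# Hypothetical (X, Y) observations.
xy = [("sunny", "play"), ("sunny", "play"), ("rainy", "play"), ("rainy", "stay")]
yx = [(y, x) for x, y in xy]   # the same data with the roles swapped
```

Both `gain(xy)` and `gain(yx)` come out at roughly 0.311 bits on this data, which is why the quantity is also called mutual information.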
entropy
measurement of uncertainty
Machine Learning
Is described as a subset of artificial intelligence
data and past experiences
Machine learning is the development of algorithms which allow a computer to learn from the ______ and _______ on its own
Arthur Samuel
Machine Learning was introduced by
1959
In what year was Machine Learning introduced?
patterns
Machine learning uses data to detect various ______ in a given dataset.
automatically
It can learn from past data and improve ____________.
data-driven
It is a _________ technology.
data mining
Machine learning is very similar to _______, as it also deals with huge amounts of data.
increment
Need for Machine Learning: Rapid _______ in the production of data
complex
Need for Machine Learning: Solving ______ problems, which are difficult for a human
Decision
Need for Machine Learning: ______-making in various sectors, including finance
hidden
Need for Machine Learning: Finding ______ patterns and extracting useful information from data
Supervised Learning
Classification of Machine Learning: Classification/Regression/Estimation
Unsupervised Learning
Classification of Machine Learning: Clustering/Prediction/Association
Reinforcement Learning
Classification of Machine Learning: Classification/Control/Decision-Making
Data Exploration
The step where we understand the nature of the data we have to work with. Here we look for correlations, general trends, and outliers.
Data pre-processing
The step where the data is preprocessed for analysis
Train
_____ model: to improve its performance for a better outcome on the problem
Test
_____ model: check the accuracy of our model by providing a test dataset to it.
Deployment
deploy the model in the real-world system
Herbert Simon
“Learning is any process by which a system improves performance from experience.” Who said this?
Tom Mitchell
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." Who said this?
Learning
is essential for unknown environments
system construction
Learning is useful as a _________ method
omniscience
Learning is useful when the designer lacks _______
reality
expose the agent to ____ rather than trying to write it down
decision mechanisms
Learning modifies the agent's _______ to improve performance
Machine learning
how to acquire a model on the basis of data / experience
probabilities
example of Learning parameters (plural)
Bayesian network graph
example of Learning structure (singular)
clustering
example of Learning hidden concepts
Supervised Learning
Machine Learning Areas: Data and corresponding labels are given
Unsupervised Learning
Machine Learning Areas: Only data is given, no labels provided
Semi-Supervised Learning
Machine Learning Areas: Some (but not all) labels are present
Reinforcement Learning
Machine Learning Areas: An agent interacting with the world makes observations, takes actions, and is rewarded or punished; it should learn to choose actions in such a way as to obtain a lot of reward
past experiences (the data fed in)
A machine is said to be learning from ____________ with respect to some class of tasks if its performance on a given task improves with experience
previous knowledge or past experiences
At a basic conceptual level, the machine works by looking at the ________
Data
Features
attribute-value pairs which characterize each x
Experimentation Cycle
(1) Learn parameters (e.g. model probabilities) on the training set; (2) tune hyper-parameters on the held-out set (optional); (3) compute accuracy on the test set. Very important: never “peek” at the test set.
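The cycle above can be sketched in plain Python. Everything here is an assumption made for illustration: the 60/20/20 split, the fixed seed, and a 1-D nearest-neighbour classifier standing in for the model (its "learning" is just storing the training set, and k is its hyper-parameter).

```python
import random


def split(data, seed=0, train_frac=0.6, held_out_frac=0.2):
    """Shuffle once, then cut into training, held-out, and test sets."""
    rows = list(data)
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train_frac)
    n_held = int(len(rows) * held_out_frac)
    return rows[:n_train], rows[n_train:n_train + n_held], rows[n_train + n_held:]


def accuracy(predict, rows):
    """Fraction of instances predicted correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def fit_knn(train, k):
    """Return a predictor that votes among the k training points nearest to x."""
    def predict(x):
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        votes = [y for _, y in nearest]
        return max(set(votes), key=votes.count)
    return predict


def run_cycle(data, candidate_ks=(1, 3, 5)):
    train, held_out, test = split(data)
    # Tune the hyper-parameter k on the held-out set only.
    best_k = max(candidate_ks, key=lambda k: accuracy(fit_knn(train, k), held_out))
    model = fit_knn(train, best_k)
    # The test set is touched exactly once, at the very end.
    return best_k, accuracy(model, test)


# Hypothetical data: the label is 1 when x is at least 10.
data = [(x, int(x >= 10)) for x in range(20)]
```

Keeping the test set out of both fitting and tuning is what makes the final accuracy an honest estimate.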
accuracy
fraction of instances predicted correctly
overfitting
fitting the training data very closely, but not generalizing well
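A toy illustration of the idea, on hypothetical data with two deliberately noisy training labels. A model that memorizes every training point scores perfectly on the training set but worse on unseen points than a simple threshold rule.

```python
def accuracy(predict, rows):
    """Fraction of instances predicted correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def make_memorizer(train):
    """Predict the label of the single closest training point (fits noise too)."""
    def predict(x):
        _, nearest_label = min(train, key=lambda row: abs(row[0] - x))
        return nearest_label
    return predict


def threshold_rule(x):
    """A much simpler model: label 1 when x is at least 5."""
    return int(x >= 5)


# Hypothetical data: the true rule is x >= 5; the points at 3 and 8 are mislabelled.
train_rows = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 1), (8, 0), (9, 1)]
test_rows = [(1.5, 0), (2.6, 0), (3.4, 0), (7.6, 1), (8.4, 1), (9.5, 1)]

memorizer = make_memorizer(train_rows)
```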
Classification
Learning a discrete function: ________
Regression
Learning a continuous function: _________
discrete
Learning a ______ function: Classification (a supervised learning task whose output is one of a set of defined labels)
continuous
Learning a ______ function: Regression (a supervised learning task whose output is a continuous value)
Data Cleaning
Issues: Data Preparation. Preprocess data in order to reduce noise and handle missing values
Relevance Analysis
Issues: Data Preparation. Remove the irrelevant or redundant attributes
Data Transformation
Issues: Data Preparation. Generalize data (to higher-level concepts, discretization); normalize attribute values
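A rough sketch of three of the preparation steps above: handling missing values, normalizing attribute values, and discretizing. The specific choices (mean imputation, min-max scaling, hand-picked cut points) are assumptions for illustration, not methods the cards prescribe.

```python
def fill_missing(values):
    """Data cleaning: replace None with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Data transformation: rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant column: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(value, cut_points, labels):
    """Data transformation: map a number to a higher-level concept.

    cut_points must be sorted, and len(labels) == len(cut_points) + 1.
    """
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    return labels[-1]
```

For example, `discretize(25, [18, 65], ["child", "adult", "senior"])` generalizes an age to the concept `"adult"`.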
Model construction
describing a set of predetermined classes
class label
Each tuple/sample is assumed to belong to a predefined class, as determined by the ___________
training set
The set of tuples used for model construction is _______
Model Usage
for classifying future or unknown objects
independent
The test set is _______ of the training set; otherwise over-fitting will occur
classify
If the accuracy is acceptable, use the model to ______ data tuples whose class labels are not known
Inductive learning Task
Use particular facts to make more generalized conclusions
predictive
A _____ model based on a branching series of Boolean tests
one-stage
These smaller Boolean tests are less complex than a _____ classifier
measure
We first make a list of attributes that we can _______
discrete
These attributes of the decision tree (for now) must be ______
target attribute
We then choose a _______ that we want to predict
experience table
Then create an ____________ that lists what we have seen in the past
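The setup steps above (list measurable discrete attributes, pick a target attribute, build an experience table) might look like this in code. The attribute names, values, and rows are hypothetical.

```python
# Discrete attributes we can measure, with their allowed values.
ATTRIBUTES = {
    "outlook": {"sunny", "overcast", "rainy"},
    "windy": {"yes", "no"},
}

# The target attribute we want to predict, with its allowed values.
TARGET = ("play", {"yes", "no"})

# Experience table: one row per situation seen in the past.
EXPERIENCE_TABLE = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]


def is_valid(table):
    """Check that every row uses only the allowed discrete values."""
    target_name, target_values = TARGET
    for row in table:
        for name, allowed in ATTRIBUTES.items():
            if row.get(name) not in allowed:
                return False
        if row.get(target_name) not in target_values:
            return False
    return True
```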
Ross Quinlan
Who developed the ID3 algorithm in 1975?
entropy
ID3 splits attributes based on their ______
entropy
________ is the measure of disorder (uncertainty)
minimized
Entropy is ______ when all values of the target attribute are the same
maximized
Entropy is _______ when there is an equal chance of all values for the target attribute (i.e. the result is random)
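Both extremes follow from the standard entropy formula. In this sketch, k is the number of target values and p_i is the fraction of examples with value i (with the usual convention that 0 log 0 = 0); the symbols are chosen here for illustration.

```latex
H = -\sum_{i=1}^{k} p_i \log_2 p_i

% All examples share one target value: p_j = 1 and every other p_i = 0
H_{\min} = -1 \cdot \log_2 1 = 0

% Every target value is equally likely: p_i = 1/k
H_{\max} = -\sum_{i=1}^{k} \frac{1}{k} \log_2 \frac{1}{k} = \log_2 k
```

For a two-valued target the maximum is therefore 1 bit.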
lowest
ID3 splits on attributes with the ______ entropy
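A compact sketch in the spirit of ID3, assuming discrete attributes and examples stored as dicts. It is not Quinlan's original implementation: the tie-breaking, the majority-vote fallback, and the toy rows are assumptions made here.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of labels, in bits."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def expected_entropy(rows, attr, target):
    """Weighted average entropy of the target after splitting on attr."""
    total = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        total += len(subset) / len(rows) * entropy(subset)
    return total


def id3(rows, attrs, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:          # entropy is 0: make a leaf
        return labels[0]
    if not attrs:                      # nothing left to ask: majority vote
        return Counter(labels).most_common(1)[0][0]
    # Split on the attribute with the lowest expected entropy.
    best = min(attrs, key=lambda a: expected_entropy(rows, a, target))
    remaining = [a for a in attrs if a != best]
    return {
        "attribute": best,
        "branches": {
            value: id3([r for r in rows if r[best] == value], remaining, target)
            for value in sorted({r[best] for r in rows})
        },
    }


# Hypothetical experience table.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = id3(rows, ["outlook", "windy"], "play")
```

On this table the root splits on `outlook` (expected entropy 0.4 bits, against about 0.55 for `windy`), and only the `rainy` branch needs a second question.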
pruning
There is another technique for reducing the number of attributes used in a tree
prepruning
we decide during the building process when to stop adding attributes
postpruning
waits until the full decision tree has been built and then prunes the attributes
expected entropy
ID3 is not optimal because it uses ________ reduction, not actual reduction
errors propagating
Decision trees suffer from a problem of ________ throughout a tree
discretization
A technique in which we choose cut points for splitting continuous attributes
boundary point
where two adjacent instances in a sorted list have different target attribute values
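A minimal sketch of finding boundary points for a continuous attribute. Placing each candidate cut at the midpoint between the two adjacent values is an assumption made here, and the function name and toy data are hypothetical.

```python
def boundary_cut_points(instances):
    """instances: list of (attribute_value, target_value) pairs.

    Returns candidate cut points, one between each pair of adjacent
    instances (in sorted order) whose target values differ.
    """
    ordered = sorted(instances, key=lambda pair: pair[0])
    cuts = []
    for (v1, y1), (v2, y2) in zip(ordered, ordered[1:]):
        if y1 != y2 and v1 != v2:
            cuts.append((v1 + v2) / 2)   # assumption: cut at the midpoint
    return cuts


# Hypothetical continuous attribute with a yes/no target.
readings = [(72, "yes"), (40, "no"), (90, "no"), (60, "yes"), (48, "no"), (80, "yes")]
```

Here `boundary_cut_points(readings)` returns `[54.0, 85.0]`, so only two cut points need to be evaluated instead of one between every adjacent pair of values.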
Lionhead Studios
Black & White, which used ID3, was developed by _____
Black & White
Used to predict a player’s reaction to a certain creature’s action