Artificial Intelligence

99 Terms

1. Decision Tree
Can be used to visually and explicitly represent decisions and decision-making.

2. Decision Tree
Build a _______ for classifying.

3. Decision Tree
It utilizes supervised learning and batch processing of training examples.

4. Preference Bias
Define a metric for comparing candidate functions f so as to determine whether one is better than another.

5. upside down
A decision tree is drawn ______, with its root at the top.

6. condition or internal node
Bold black text in a decision tree represents a ____.

7. branches or edges
The tree splits into _____.

8. decision or leaf
The end of a branch that doesn’t split any further is the _____.
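The tree anatomy in the cards above (root at the top, internal nodes as conditions, branches as edges, leaves as decisions) can be sketched as a nested dictionary; the weather attributes below are invented purely for illustration:

```python
# Internal nodes are dicts keyed by a condition; leaves are plain strings.
tree = {
    "outlook?": {                     # root / internal node (a condition)
        "sunny": {                    # branch (edge) labeled by an answer
            "humidity high?": {       # another internal node
                "yes": "don't play",  # leaf (a decision)
                "no": "play",
            }
        },
        "overcast": "play",           # branch leading straight to a leaf
        "rainy": "don't play",
    }
}

def classify(node, answers):
    """Follow branches from the root until a leaf (decision) is reached."""
    while isinstance(node, dict):
        question = next(iter(node))            # the condition at this node
        node = node[question][answers[question]]
    return node

print(classify(tree, {"outlook?": "sunny", "humidity high?": "no"}))  # play
```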
9. Random
Select any attribute at random.

10. Least-Values
Choose the attribute with the smallest number of possible values.

11. Most-Values
Choose the attribute with the largest number of possible values.

12. Max-Gain
Choose the attribute that has the largest expected information gain.

13. Max-Gain
Try to select the attribute that will result in the smallest expected size of the subtrees rooted at its children.

14. H
Measures the information content, or entropy, in bits.

15. Low information content
Is desirable in order to make the smallest tree, because low information content means that most of the examples are classified the SAME, and therefore we would expect the rest of the tree rooted at this node to be quite small, needing little to differentiate between the two classifications.
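H from the cards above can be computed directly from class frequencies as H = -Σ p·log2(p); a minimal sketch, with made-up labels:

```python
import math
from collections import Counter

def entropy(labels):
    """Information content H in bits: -sum(p * log2(p)) over class frequencies."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Mostly one class -> low information content (a small subtree is expected here)
print(entropy(["yes"] * 9 + ["no"] * 1))   # ~0.469 bits
# 50/50 split -> maximal uncertainty for two classes
print(entropy(["yes", "no"]))              # 1.0 bit
```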
16. Conditional entropy
The entropy of a class, Y, given a value, v, for an attribute (i.e., question), X; it is computed from the conditional probabilities Pr(Y|X=v).

17. question
In Pr(Y|X=v), what is X?

18. label
In Pr(Y|X=v), what is Y?

19. answer to the question
In Pr(Y|X=v), what is v?

20. symmetric
Information gain is ________.

21. mutual information
Another term for information gain.

22. entropy
A measurement of uncertainty.
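Conditional entropy and information gain from the cards above can be sketched as follows; the toy X/Y data are invented, and the last line illustrates the symmetry that earns information gain the name mutual information:

```python
import math
from collections import Counter

def entropy(xs):
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def cond_entropy(ys, xs):
    """H(Y|X): expected entropy of the label Y after learning the answer X=v."""
    n = len(ys)
    h = 0.0
    for v in set(xs):
        ys_v = [y for y, x in zip(ys, xs) if x == v]   # the Pr(Y|X=v) slice
        h += (len(ys_v) / n) * entropy(ys_v)
    return h

def info_gain(ys, xs):
    """Information gain (mutual information): H(Y) - H(Y|X)."""
    return entropy(ys) - cond_entropy(ys, xs)

# Toy data: X is the answer to a question, Y is the label.
X = ["a", "a", "b", "b"]
Y = ["yes", "yes", "no", "no"]
# Gain is symmetric: I(Y; X) == I(X; Y), hence "mutual information".
print(info_gain(Y, X), info_gain(X, Y))  # 1.0 1.0
```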
23. Machine Learning
Is said to be a subset of artificial intelligence.

24. data and past experiences
Machine learning is the development of algorithms which allow a computer to learn from ______ and _______ on its own.

25. Arthur Samuel
Machine Learning was introduced by ________.

26. 1959
In what year was Machine Learning introduced?

27. patterns
Machine learning uses data to detect various ______ in a given dataset.

28. automatically
It can learn from past data and improve ____________.

29. data-driven
It is a _________ technology.

30. data mining
Machine learning is very similar to _______, as it also deals with huge amounts of data.

31. increment
Need for Machine Learning
Rapid _______ in the production of data.

32. complex
Need for Machine Learning
Solving ______ problems, which are difficult for a human.

33. Decision
Need for Machine Learning
______-making in various sectors, including finance.

34. hidden
Need for Machine Learning
Finding ______ patterns and extracting useful information from data.
35. Supervised Learning
Classification of Machine Learning
Classification/Regression/Estimation.

36. Unsupervised Learning
Classification of Machine Learning
Clustering/Prediction/Association.

37. Reinforcement Learning
Classification of Machine Learning
Classification/Control/Decision-Making.

38. Data Exploration
The step where we come to understand the nature of the data we have to work with; here we find correlations, general trends, and outliers.

39. Data pre-processing
The step where preprocessing of the data for its analysis takes place.

40. Train
_____ model: to improve its performance for a better outcome of the problem.

41. Test
_____ model: check the accuracy of our model by providing a test dataset to it.

42. Deployment
Deploy the model in the real-world system.
43. Herbert Simon
“Learning is any process by which a system improves performance from experience.” Who said this?

44. Tom Mitchell
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." Who said this?

45. Learning
Is essential for unknown environments.

46. system construction
Learning is useful as a _________ method.

47. omniscience
When the designer lacks _______.

48. reality
Expose the agent to ____ rather than trying to write it down.

49. decision mechanisms
Learning modifies the agent's _______ to improve performance.
50. Machine learning
How to acquire a model on the basis of data/experience.

51. probabilities
Example of learning parameters (plural).

52. Bayesian network graph
Example of learning structure (singular).

53. clustering
Example of learning hidden concepts.

54. Supervised Learning
Machine Learning Areas
Data and corresponding labels are given.

55. Unsupervised Learning
Machine Learning Areas
Only data is given; no labels are provided.

56. Semi-Supervised Learning
Machine Learning Areas
Some (if not all) labels are present.

57. Reinforcement Learning
Machine Learning Areas
An agent interacting with the world makes observations, takes actions, and is rewarded or punished; it should learn to choose actions in such a way as to obtain a lot of reward.
58. past experiences or data fed in
A machine is said to be learning from ____________ with respect to some class of tasks if its performance in a given task improves with experience.

59. previous knowledge or past experiences
The machine works at a basic conceptual level by looking at the ________.

60. Data
Labeled instances.

61. Features
Attribute-value pairs which characterize each x.

62. Experimentation Cycle
- Learn parameters (e.g. model probabilities) on the training set
- (Tune hyper-parameters on a held-out set)
- Compute accuracy on the test set
- Very important: never “peek” at the test set

63. accuracy
Fraction of instances predicted correctly.

64. overfitting
Fitting the training data very closely, but not generalizing well.
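The experimentation cycle and accuracy cards above can be made concrete with a tiny sketch; the 1-nearest-neighbour model and the numeric data here are invented stand-ins, not anything from the card set:

```python
# Learn on a training set, then measure accuracy (fraction predicted
# correctly) on a separate test set that was never "peeked" at.
train = [(1.0, "small"), (2.0, "small"), (8.0, "big"), (9.0, "big")]
test  = [(1.5, "small"), (8.5, "big"), (4.0, "big")]

def predict(x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

correct = sum(predict(x) == y for x, y in test)
accuracy = correct / len(test)   # fraction of instances predicted correctly
print(round(accuracy, 3))        # 2 of 3 test instances are right here
```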
65. Classification
Learning a discrete function: ________.

66. Regression
Learning a continuous function: _________.

67. discrete
Learning a ______ function: Classification (an SL task where the output has defined labels).

68. continuous
Learning a ______ function: Regression (an SL task where the output has a continuous value).

69. Data Cleaning
Issues: Data Preparation
Preprocess data in order to reduce noise and handle missing values.

70. Relevance Analysis
Issues: Data Preparation
Remove irrelevant or redundant attributes.

71. Data Transformation
Issues: Data Preparation
- Generalize data (to higher-level concepts, discretization)
- Normalize attribute values
72. Model construction
Describing a set of predetermined classes.

73. class label
Each tuple/sample is assumed to belong to a predefined class, as determined by the ___________.

74. training set
The set of tuples used for model construction is the _______.

75. Model Usage
For classifying future or unknown objects.

76. independent
The test set is _______ of the training set; otherwise over-fitting will occur.

77. classify
If the accuracy is acceptable, use the model to ______ data tuples whose class labels are not known.

78. Inductive Learning Task
Use particular facts to make more generalized conclusions.
79. predictive
A _____ model based on a branching series of Boolean tests.

80. one-stage
These smaller Boolean tests are less complex than a _____ classifier.

81. measure
We first make a list of attributes that we can _______.

82. discrete
These attributes of the decision tree (for now) must be ______.

83. target attribute
We then choose a _______ that we want to predict.

84. experience table
Then create an ____________ that lists what we have seen in the past.

85. Ross Quinlan
Who developed the ID3 algorithm in 1975?

86. entropy
ID3 splits attributes based on their ______.

87. entropy
________ is the measure of disinformation.

88. minimized
Entropy is ______ when all values of the target attribute are the same.

89. maximized
Entropy is _______ when there is an equal chance of all values for the target attribute (i.e. the result is random).

90. lowest
ID3 splits on attributes with the ______ entropy.
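ID3’s split criterion from the last few cards can be sketched as follows, assuming a made-up experience table; the attribute with the lowest expected entropy of the target (equivalently, the highest information gain) wins:

```python
import math
from collections import Counter

def entropy(xs):
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def expected_entropy(rows, attr, target):
    """Expected entropy of the target after splitting on attr (ID3's criterion)."""
    n = len(rows)
    h = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == v]
        h += (len(subset) / n) * entropy(subset)
    return h

# Tiny invented experience table; "play" is the target attribute.
rows = [
    {"outlook": "sunny", "windy": "no",  "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no",  "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
# ID3 splits on the attribute with the lowest expected entropy.
best = min(["outlook", "windy"], key=lambda a: expected_entropy(rows, a, "play"))
print(best)  # outlook
```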
91. pruning
________ is another technique for reducing the number of attributes used in a tree.

92. prepruning
We decide during the building process when to stop adding attributes.

93. postpruning
Waits until the full decision tree has been built and then prunes the attributes.

94. expected entropy
ID3 is not optimal because it uses ________ reduction, not actual reduction.

95. errors propagating
Decision trees suffer from a problem of ________ throughout the tree.

96. discretization
We can use a technique known as ______, where we choose cut points for splitting continuous attributes.

97. boundary point
Where two adjacent instances in a sorted list have different target attribute values.
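Boundary-point discretization from the two cards above can be sketched like this; the sorted instances are invented, and candidate cut points are placed midway between adjacent instances whose target values differ:

```python
# Each instance pairs a continuous attribute value with a target label.
data = [(48, "no"), (60, "no"), (72, "yes"), (80, "yes"), (90, "no")]

def boundary_cut_points(instances):
    """Midpoints between adjacent sorted instances whose labels differ."""
    inst = sorted(instances)
    cuts = []
    for (x1, y1), (x2, y2) in zip(inst, inst[1:]):
        if y1 != y2:                     # a boundary point
            cuts.append((x1 + x2) / 2)   # candidate cut for the split
    return cuts

print(boundary_cut_points(data))  # [66.0, 85.0]
```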
98. Lionhead Studios
Black & White, which used ID3, was developed by _____.

99. Black & White
Used to predict a player’s reaction to a certain creature’s action.