Artificial Intelligence

Computer Science

99 Terms

New cards

Decision Tree

can be used to visually and explicitly represent decisions and decision making

New cards

Decision Tree

Build a _______ for classifying

New cards

Decision Tree

It uses supervised learning with batch processing of the training examples

New cards

Preference Bias

Define a metric for comparing candidate functions f, so as to determine whether one is better than another

New cards

upside down

A decision tree is drawn ______ with its root at the top

New cards

condition or internal node

In a decision tree diagram, the bold black text represents a ____

New cards

branches or edges

The tree splits into _____

New cards

decision or leaf

The end of the branch that doesn’t split anymore is the _____

New cards

Random

Select any attribute at random

New cards

Least-Values

Choose the attribute with the smallest number of possible values

New cards

Most-Values

Choose the attribute with the largest number of possible values
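
The Random, Least-Values and Most-Values heuristics above can be sketched in a few lines. This is only an illustration: the function names and the sample table are made up, and "possible values" is approximated by the values actually observed in the rows.

```python
import random

def distinct_values(rows, attribute):
    """Number of distinct values observed for an attribute (an approximation
    of 'possible values', assumed for this sketch)."""
    return len({row[attribute] for row in rows})

def random_attribute(attributes, rng=random):
    # Random: select any attribute at random.
    return rng.choice(attributes)

def least_values(rows, attributes):
    # Least-Values: the attribute with the fewest distinct values.
    return min(attributes, key=lambda a: distinct_values(rows, a))

def most_values(rows, attributes):
    # Most-Values: the attribute with the most distinct values.
    return max(attributes, key=lambda a: distinct_values(rows, a))

# Hypothetical table: "size" has three values, "color" has two.
rows = [
    {"size": "small", "color": "red"},
    {"size": "medium", "color": "red"},
    {"size": "large", "color": "blue"},
]
print(least_values(rows, ["size", "color"]))  # color
print(most_values(rows, ["size", "color"]))   # size
```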

New cards

Max-Gain

Choose the attribute that has the largest expected information gain

New cards

Max-Gain

try to select the attribute that will result in the smallest expected size of the subtrees rooted at its children
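
One possible way to compute Max-Gain is sketched below: information gain is taken as the entropy of the target minus the expected entropy after splitting on an attribute. The function names, the dictionary-per-row layout and the sample table are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy, in bits, of a list of target values."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the expected entropy after the split."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    expected = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - expected

def max_gain(rows, attributes, target):
    """Pick the attribute with the largest expected information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Hypothetical experience table: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
print(max_gain(rows, ["outlook", "windy"], "play"))  # outlook
```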

New cards

measures the information content or entropy in bits

New cards

Low information content

is desirable in order to make the smallest tree, because low information content means that most of the examples are classified the SAME, and therefore we would expect the rest of the tree rooted at this node to be quite small, since little remains to differentiate between the two classifications.

New cards

Conditional entropy

is defined using the conditional probability of a class, Y, given a value, v, for an attribute (i.e., question), X.
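
For reference, the standard information-theoretic form of this quantity can be written as follows. The notation H(·) and the base-2 logarithm (entropy in bits) are assumptions chosen to match the other cards; the deck itself does not give the formula.

```latex
H(Y \mid X = v) = -\sum_{y} \Pr(Y = y \mid X = v)\,\log_2 \Pr(Y = y \mid X = v)

H(Y \mid X) = \sum_{v} \Pr(X = v)\, H(Y \mid X = v)
```

The second line averages the per-answer entropy over all answers v to the question X, weighted by how likely each answer is.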

New cards

question

Pr(Y|X=v)
what is X?

New cards

label

Pr(Y|X=v)
what is Y?

New cards

answer to the question

Pr(Y|X=v)
what is v?

New cards

symmetric

information gain is ________

New cards

mutual information

other term for information gain
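
The symmetry of information gain can be checked numerically. The sketch below computes H(second) - H(second | first) for a list of pairs and then repeats the calculation with the pairs swapped; the function names and sample data are invented for this illustration.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy, in bits, of a list of values."""
    total = len(values)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(values).values())

def gain(pairs):
    """H(second) - H(second | first) for a list of (first, second) pairs."""
    firsts = Counter(a for a, _ in pairs)
    conditional = sum(
        n / len(pairs) * entropy([b for a, b in pairs if a == v])
        for v, n in firsts.items()
    )
    return entropy([b for _, b in pairs]) - conditional

# Hypothetical (attribute value, label) observations.
pairs = [("sunny", "yes"), ("sunny", "yes"), ("sunny", "no"), ("rainy", "no")]
swapped = [(b, a) for a, b in pairs]

# Both directions give the same number, up to floating-point rounding.
print(gain(pairs), gain(swapped))
```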

New cards

entropy

measurement of uncertainty

New cards

Machine Learning

Is described as a subset of artificial intelligence

New cards

data and past experiences

Machine learning is the development of algorithms which allow a computer to learn from the ______ and _______ on its own

New cards

Arthur Samuel

The term "machine learning" was introduced by _____

New cards

1959

In what year was the term "machine learning" introduced?

New cards

patterns

Machine learning uses data to detect various ______ in a given dataset.

New cards

automatically

It can learn from past data and improve ____________.

New cards

data-driven

It is a _________ technology.

New cards

data mining

Machine learning is very similar to _______, as it also deals with huge amounts of data.

New cards

increment

Need for Machine Learning
Rapid _______ in the production of data

New cards

complex

Need for Machine Learning
Solving ______ problems, which are difficult for a human

New cards

Decision

Need for Machine Learning
______-making in various sectors, including finance

New cards

hidden

Need for Machine Learning
Finding ______ patterns and extracting useful information from data

New cards

Supervised Learning

Classification of Machine learning
Classification/Regression/Estimation

New cards

Unsupervised Learning

Classification of Machine learning
Clustering/Prediction/Association

New cards

Reinforcement Learning

Classification of Machine learning
Classification/Control/Decision-Making

New cards

Data Exploration

The step where we understand the nature of the data that we have to work with. In this step, we find correlations, general trends, and outliers.

New cards

Data pre-processing

The step where the data is preprocessed for analysis

New cards

Train

_____ model: improve its performance for a better outcome on the problem

New cards

Test

_____ model: check the accuracy of our model by providing a test dataset to it.

New cards

Deployment

Deploy the model in a real-world system

New cards

Herbert Simon

“Learning is any process by which a system improves performance from experience.” Who said this?

New cards

Tom Mitchell

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." Who said this?

New cards

Learning

is essential for unknown environments

New cards

system construction

Learning is useful as a _________ method

New cards

omniscience

Learning is useful when the designer lacks _______

New cards

reality

expose the agent to ____ rather than trying to write it down

New cards

decision mechanisms

Learning modifies the agent's _______ to improve performance

New cards

Machine learning

how to acquire a model on the basis of data / experience

New cards

probabilities

example of Learning parameters (plural)

New cards

Bayesian network graph

example of Learning structure (singular)

New cards

clustering

example of Learning hidden concepts

New cards

Supervised Learning

Machine Learning Areas
Data and corresponding labels are given

New cards

Unsupervised Learning

Machine Learning Areas
Only data is given, no labels provided

New cards

Semi-Supervised Learning

Machine Learning Areas
Some (but not all) labels are present

New cards

Reinforcement Learning

Machine Learning Areas
An agent interacting with the world makes observations, takes actions, and is rewarded or punished; it should learn to choose actions in such a way as to obtain a lot of reward

New cards

past experiences (the data fed in)

A machine is said to be learning from ____________ with respect to some class of tasks if its performance in a given task improves with experience

New cards

previous knowledge or past experiences

At a basic conceptual level, the machine works by looking at the ________

New cards

Data

labeled instances

New cards

Features

attribute-value pairs which characterize each x

New cards

Experimentation Cycle

-Learn parameters (e.g. model probabilities) on training set
-(Tune hyper-parameters on held-out set)
-Compute accuracy on the test set
-Very important: never “peek” at the test set
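
The cycle above relies on three disjoint sets of instances. A minimal sketch of such a split follows; the function name, the 60/20/20 proportions and the fixed seed are assumptions for illustration, not something the deck prescribes.

```python
import random

def three_way_split(instances, held_out_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then carve out test, held-out and training sets."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_held = int(len(shuffled) * held_out_fraction)
    test = shuffled[:n_test]                      # only touched for the final accuracy
    held_out = shuffled[n_test:n_test + n_held]   # used to tune hyper-parameters
    train = shuffled[n_test + n_held:]            # used to learn parameters
    return train, held_out, test

train, held_out, test = three_way_split(range(10))
print(len(train), len(held_out), len(test))  # 6 2 2
```

Keeping the test slice out of every tuning decision is what "never peek" means in practice: hyper-parameters are chosen on the held-out set only.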

New cards

accuracy

fraction of instances predicted correctly
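
As a small illustration, accuracy can be computed by comparing predictions with true labels one by one. The function name and the sample labels are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of instances whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Three of the four predictions match.
print(accuracy(["spam", "ham", "spam", "ham"], ["spam", "ham", "ham", "ham"]))  # 0.75
```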

New cards

overfitting

fitting the training data very closely, but not generalizing well

New cards

Classification

Learning a discrete function: ________

New cards

Regression

Learning a continuous function: _________

New cards

discrete

Learning a ______ function: Classification (a supervised learning task whose output is one of a set of defined labels)

New cards

continuous

Learning a ______ function: Regression (a supervised learning task whose output is a continuous value)

New cards

Data Cleaning

Issues: Data Preparation
Preprocess data in order to reduce noise and handle missing values

New cards

Relevance Analysis

Issues: Data Preparation
Remove the irrelevant or redundant attributes

New cards

Data Transformation

Issues: Data Preparation
-Generalize data (to higher-level concepts, discretization)
-Normalize attribute values
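
Normalizing attribute values can take several forms. The sketch below uses min-max scaling to the range [0, 1], which is one common choice and an assumption here, since the card does not name a specific method.

```python
def min_max_normalize(values):
    """Rescale numeric values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_normalize([10, 20, 40]))
```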

New cards

Model construction

describing a set of predetermined classes

New cards

class label

Each tuple/sample is assumed to belong to a predefined class, as determined by the ___________

New cards

training set

The set of tuples used for model construction is _______

New cards

Model Usage

for classifying future or unknown objects

New cards

independent

The test set is _______ of the training set; otherwise over-fitting will occur

New cards

classify

If the accuracy is acceptable, use the model to ______ data tuples whose class labels are not known

New cards

Inductive learning Task

Use particular facts to make more generalized conclusions

New cards

predictive

A _____ model based on a branching series of Boolean tests

New cards

one-stage

These smaller Boolean tests are less complex than a _____ classifier

New cards

measure

We first make a list of attributes that we can _______

New cards

discrete

These attributes of the decision tree (for now) must be ______

New cards

target attribute

We then choose a _______ that we want to predict

New cards

experience table

Then create an ____________ that lists what we have seen in the past

New cards

Ross Quinlan

Who developed the ID3 algorithm in 1975?

New cards

entropy

ID3 splits attributes based on their ______

New cards

entropy

________ is the measure of disorder (uncertainty)

New cards

minimized

Entropy is ______ when all values of the target attribute are the same

New cards

maximized

Entropy is _______ when there is an equal chance of all values for the target attribute (i.e. the result is random)
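
The two extremes described in the cards above can be seen with a short calculation. This sketch computes Shannon entropy in bits from a list of target values; the function name and the sample labels are made up for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of target values."""
    total = len(labels)
    counts = Counter(labels)
    # Sum -p * log2(p) over the relative frequency p of each distinct value.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

print(entropy(["yes", "yes", "yes", "yes"]))  # all values the same: 0.0
print(entropy(["yes", "no", "yes", "no"]))    # equal chance of two values: 1.0
```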

New cards

lowest

ID3 splits on attributes with the ______ entropy
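
A simplified sketch in the spirit of ID3 is shown below: at each node it splits on the attribute with the lowest expected entropy and recurses on each branch. It is not Quinlan's exact algorithm; the function names, the nested-dictionary tree, the sample table and the tie-breaking by attribute order are all assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())

def expected_entropy(rows, attribute, target):
    """Weighted average entropy of the target after splitting on attribute."""
    result = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        result += len(subset) / len(rows) * entropy(subset)
    return result

def id3(rows, attributes, target):
    labels = [row[target] for row in rows]
    # Leaf: every example agrees, or there is nothing left to split on.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Split on the attribute with the lowest expected entropy (first one wins ties).
    best = min(attributes, key=lambda a: expected_entropy(rows, a, target))
    remaining = [a for a in attributes if a != best]
    return {
        best: {
            value: id3([r for r in rows if r[best] == value], remaining, target)
            for value in sorted({row[best] for row in rows})
        }
    }

# Hypothetical experience table with discrete attributes.
rows = [
    {"weather": "sunny", "weekend": "yes", "go_out": "yes"},
    {"weather": "sunny", "weekend": "no", "go_out": "yes"},
    {"weather": "rainy", "weekend": "yes", "go_out": "yes"},
    {"weather": "rainy", "weekend": "no", "go_out": "no"},
]
print(id3(rows, ["weather", "weekend"], "go_out"))
```

Inner dictionaries are conditions (internal nodes) and plain strings are decisions (leaves).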

New cards

pruning

A technique for reducing the number of attributes used in a tree

New cards

prepruning

we decide during the building process when to stop adding attributes

New cards

postpruning

waits until the full decision tree has been built and then prunes the attributes

New cards

expected entropy

ID3 is not optimal because it uses ________ reduction, not actual reduction

New cards

errors propagating

Decision trees suffer from a problem of ________ throughout a tree

New cards

discretization

A technique in which we choose cut points for splitting continuous attributes

New cards

boundary point

A point where two adjacent instances in a sorted list have different target attribute values
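
Candidate cut points for a continuous attribute can be found by sorting the instances and looking for adjacent pairs whose target values differ. In this sketch the function name and the sample data are hypothetical, and placing the cut at the midpoint between the two values is one common convention rather than something stated in the deck.

```python
def boundary_cut_points(instances):
    """instances: list of (numeric value, target value) pairs.
    Returns one candidate cut point per boundary in the sorted list."""
    ordered = sorted(instances, key=lambda pair: pair[0])
    cuts = []
    for (v1, t1), (v2, t2) in zip(ordered, ordered[1:]):
        # A boundary: neighbours in sorted order with different target values.
        if t1 != t2 and v1 != v2:
            cuts.append((v1 + v2) / 2)  # midpoint convention (assumption)
    return cuts

data = [(64, "yes"), (65, "no"), (68, "yes"), (69, "yes"), (70, "yes"), (71, "no")]
print(boundary_cut_points(data))  # [64.5, 66.5, 70.5]
```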

New cards

Lionhead Studios

Black & White, a game that used ID3, was developed by _____

New cards

Black & White

A game that used ID3 to predict a player’s reaction to a certain creature’s action