Machine Learning
Agents building models and systems based on observed data.
Artificial Intelligence
AI ≠ Machine Learning (ML); Machine Learning is a subset of AI.
Deep Learning
A part of ML involving artificial neural networks.
Learning Agent
An agent is learning if it improves performance after making observations about the world.
Model
A hypothesis about the world and software that can solve problems.
Supervised Learning
Learn a function from labeled data.
Unsupervised Learning
Learn patterns from unlabeled data.
Reinforcement Learning
Learn best actions from experience of rewards and punishments.
Labeled Data
Input-output pairs where the label is the output.
Classification
Output: finite set of values called classes or labels (e.g. true/false, sunny/rainy/cloudy).
Regression
Output: A number (e.g. temperature, which can be an integer or a real number).
Clustering
Output: Sets of similar data (based on a defined criterion).
Association Rule Mining
Output: Correlations and associations; Example: which items shoppers tend to purchase together.
Nearest Neighbors, Decision Trees, Neural Networks, Support Vector Machines, Linear Regression
Supervised learning algorithms
K Means Clustering, Hierarchical Clustering, Gaussian Mixture Models, Apriori Algorithm (association rule mining)
Unsupervised learning algorithms
Q-Learning
A reinforcement learning algorithm.
SARSA
State-Action-Reward-State-Action; a reinforcement learning algorithm.
Deep Q Network
A reinforcement learning algorithm.
Exploration
Try other options to get additional information.
Exploitation
Stay with what has given most reward.
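The trade-off between exploration and exploitation can be sketched with an epsilon-greedy rule. This is a minimal illustration, not something the cards prescribe: the function name `choose_action`, the `epsilon` parameter, and the list of average rewards are all assumptions made for the example.

```python
import random

def choose_action(avg_rewards, epsilon, rng=random):
    """Pick an option index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # Exploration: try any option to get additional information.
        return rng.randrange(len(avg_rewards))
    # Exploitation: stay with the option that has given the most reward so far.
    return max(range(len(avg_rewards)), key=lambda i: avg_rewards[i])

# Hypothetical average rewards observed for three options.
avg_rewards = [0.1, 0.9, 0.3]
greedy_choice = choose_action(avg_rewards, epsilon=0.0)  # never explores
```

With `epsilon=0.0` the agent only exploits; raising `epsilon` makes it explore more often.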
Restaurant Waiting
Will we wait for a table in a restaurant?
Labeled Dataset
A dataset where each instance has a corresponding label.
Instances
Examples in a dataset, which can be rows in a table.
Features
Attributes or characteristics of an instance, typically represented as columns in a dataset.
Labels
The output or target variable in a supervised learning task, usually found in the last column of a dataset.
Decision Tree
A model used in machine learning for classification and regression tasks.
Attributes
The features that describe instances in a dataset.
Goal of Classification
To accurately predict the label of instances based on their features.
Rows in a Dataset
Represent individual instances in a dataset.
Columns in a Dataset
Represent features or attributes of instances in a dataset.
Last Column in Dataset
Typically contains the labels in a labeled dataset.
Example of a Labeled Dataset
A dataset containing instances with both features and corresponding labels.
Russell & Norvig
Authors of published material referenced in the slides.
Past Examples
Twelve examples with decisions made, each having ten attributes/features.
Decision
The outcome of a classification task, either 'Will Wait' or 'Will Not'.
Function (Model)
A function derived based on a dataset to predict the label of an instance with an unknown label.
Model Training
The process of training a model using features and labels from a dataset.
Model Testing
The process of testing a trained model using test features to predict labels.
Untrained Model
A model that has not yet been trained on any dataset.
Trained Model
A model that has been trained using a dataset and can make predictions.
Prediction
The output label predicted by a trained model for an instance.
Classification Models
Models used to categorize data into predefined classes, including Nearest Neighbors, Decision Trees, Random Forest, Support Vector Machines, and Neural Networks.
K in K Nearest Neighbors
The number of nearest neighbors to consider when classifying an unlabeled instance.
Instance as Datapoint
Each instance in a dataset represented as a point in a graph based on its features.
Label Representation
The label held by the majority of the K nearest points, which K Nearest Neighbors assigns to the unlabeled instance.
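The K Nearest Neighbors cards above can be sketched as a short function. This is a minimal sketch that assumes Euclidean distance and a simple majority vote; the name `knn_predict` and the sample points are hypothetical, with "blue" and "red" standing in for the two labels.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Return the majority label among the k training points nearest to query."""
    # Order training indices by Euclidean distance to the unlabeled instance.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    nearest_labels = [train_labels[i] for i in order[:k]]
    # The label that most of the k nearest points carry wins.
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical datapoints: each instance is a point with two features.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["blue", "blue", "blue", "red", "red", "red"]

near_blue = knn_predict(points, labels, (2, 2), k=3)
near_red = knn_predict(points, labels, (6, 5), k=3)
```

An odd K is a common choice for two classes because it avoids tied votes.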
Feature Table
A table where each row represents a person with various features and their corresponding labels.
Patrons
Individuals represented in the feature table.
Hungry
A feature indicating whether a patron is hungry or not.
Type
A feature indicating the type of food a patron prefers.
Will Wait
A label indicating whether a patron is willing to wait for food.
Blue Label
Indicates a patron who waited for food.
Red Label
Indicates a patron who did not wait for food.
Test
Each test is based on a single feature.
Predicted Label
The label that a sequence of tests in the decision tree eventually leads to.
Goal of Decision Tree
A tree that most consistently leads to the correct labels (of the dataset).
Feature Selection
The feature that can best distinguish examples by their labels.
Branch via new feature
The process of splitting the decision tree based on a new feature.
Same number of 'Yes' and 'No'
Indicates a bad split in the decision tree.
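One way to make "best distinguishes examples by their labels" concrete is to measure how mixed the labels remain after splitting on a feature. The sketch below uses entropy for that, which is a common choice but an assumption here, since the cards do not name a measure. The function names and the four examples are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Entropy in bits: 1.0 for an even Yes/No mix, 0 for a pure group."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def remaining_entropy(examples, feature):
    """Weighted entropy of the label groups produced by splitting on feature."""
    groups = defaultdict(list)
    for features, label in examples:
        groups[features[feature]].append(label)
    total = len(examples)
    return sum(len(g) / total * entropy(g) for g in groups.values())

def best_feature(examples, features):
    """Pick the feature whose split leaves the labels least mixed."""
    return min(features, key=lambda f: remaining_entropy(examples, f))

# Hypothetical examples: (features, label) pairs.
examples = [
    ({"Patrons": "Some", "Type": "Thai"}, "Yes"),
    ({"Patrons": "Some", "Type": "Burger"}, "Yes"),
    ({"Patrons": "None", "Type": "Thai"}, "No"),
    ({"Patrons": "None", "Type": "Burger"}, "No"),
]
chosen = best_feature(examples, ["Type", "Patrons"])
```

In this toy data, splitting on Type leaves each branch with the same number of 'Yes' and 'No' (a bad split), while splitting on Patrons separates the labels completely.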
Examples
Instances that are used to build the decision tree.
Dataset
A collection of examples used for training the decision tree.
Full
Indicates a complete set of data or features.
Some
Indicates a partial or incomplete set of data or features.
None
Indicates the absence of data or features.
Types of Cuisine
Examples include Thai, French, Italian, and Burger.
Decision Outcomes
The results of the tests leading to labels.
Feature Distinction
The ability of a feature to separate examples effectively.
Random Forest
An ensemble method that predicts labels based on multiple decision trees, each from a random sample of the main dataset.
Overfitting
A problem with Decision Trees where the model fits well with the training dataset but does not perform well with new instances.
Support Vector Machines (SVM)
A method that treats instances as datapoints and features as dimensions of a space, and aims to divide the labeled datapoints linearly with a hyperplane.
Support Vectors
Points closest to the boundary in Support Vector Machines.
Artificial Neural Networks (ANN)
A model inspired by neurons and synapses in the human brain, consisting of layers of neurons connected to each other.
Input Layer
The layer in an ANN that takes in input signals, such as features.
Output Layer
The layer in an ANN that provides the output, such as labels.
Hidden Layers
Layers in an ANN that facilitate computations between the input and output layers.
Ensemble Method
A technique that combines multiple models to improve prediction accuracy.
Dimensions
Features in a dataset, represented as axes of the space in which the datapoints are plotted.
Training Dataset
The dataset used to train a model.
New Instances
Data points that the model has not seen during training.
Popular in the early 2000s
A description of the widespread use of Support Vector Machines during that time.
Layers of Neurons
The structure of an ANN where neurons are organized in layers.
Computations
The processes carried out by hidden layers in an ANN to transform inputs into outputs.
Good in Practice
A phrase describing the effectiveness of Support Vector Machines in real-world applications.
Random Sample
A subset of data selected randomly from the main dataset for training individual decision trees.
Goal of Random Forest
To predict labels based on the aggregation of predictions from multiple decision trees.
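The two ingredients of a Random Forest, random samples and aggregated predictions, can be sketched as follows. This is a minimal sketch under stated assumptions: sampling is done with replacement (the cards only say "random sample"), aggregation is a majority vote, and each tree is stood in for by a plain function. All names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(dataset, rng):
    """Draw a random sample (with replacement) the same size as the dataset."""
    return [rng.choice(dataset) for _ in dataset]

def forest_predict(trees, instance):
    """Aggregate the trees' predictions by majority vote."""
    votes = [tree(instance) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Stand-in "trees": any callable that maps an instance to a label.
trees = [lambda x: "Yes", lambda x: "Yes", lambda x: "No"]
prediction = forest_predict(trees, {"Hungry": True})

# Each real tree would be trained on its own sample like this one.
sample = bootstrap_sample(list(range(12)), random.Random(0))
```

In a full implementation, each stand-in function would be replaced by a decision tree built from its own sample.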
Neuron activation
A neuron is activated based on input signals, weights, thresholds, and activation function.
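A single neuron's activation can be sketched as a weighted sum compared against a threshold. The step activation function used here is an assumption chosen for simplicity; the cards do not say which activation function is meant, and the name `neuron_output` is hypothetical.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of inputs reaches the threshold, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # Step activation: the neuron fires only when the sum reaches the threshold.
    return 1 if weighted_sum >= threshold else 0

# Hypothetical weights and threshold that make the neuron behave like AND.
both_on = neuron_output([1, 1], [0.6, 0.6], threshold=1.0)
one_on = neuron_output([1, 0], [0.6, 0.6], threshold=1.0)
```

Learning adjusts `weights` and `threshold` so that the outputs match the labels in the training data.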
Back propagation
The procedure an ANN uses to learn its weights and thresholds.
Model Evaluation
To evaluate a classification model, we split our dataset into a training set and a test set.
Training set
Used to train the model.
Test set
Used to evaluate the model.
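Splitting a dataset into a training set and a test set can be sketched as a shuffle followed by a cut. The function name, the `test_fraction` parameter, and the fixed seed are assumptions for illustration.

```python
import random

def train_test_split(instances, test_fraction, seed=0):
    """Shuffle a copy of the instances and split off a fraction for testing."""
    shuffled = instances[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled instances form the test set; the rest train.
    return shuffled[n_test:], shuffled[:n_test]

# Twelve hypothetical instances, a quarter of them held out for testing.
train_set, test_set = train_test_split(list(range(12)), test_fraction=0.25)
```

Shuffling before the cut keeps the test set from being dominated by whatever order the dataset happened to be stored in.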
Accuracy
The number of correct predictions divided by the number of test instances.
Confusion Matrix
Shows actual labels against predicted labels for each class (i.e., each possible value of the label).
True Positive (TP)
Positive instances correctly predicted as positive.
False Positive (FP)
Negative instances incorrectly predicted as positive.
True Negative (TN)
Negative instances correctly predicted as negative.
False Negative (FN)
Positive instances incorrectly predicted as negative.
Number of test instances
12
Number of correct predictions
9
Accuracy example
9/12 = 0.75 or 75%
Correct predictions
are along the diagonal in the confusion matrix.
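The accuracy example above (9 correct out of 12 test instances) can be reproduced with a small sketch. The individual actual and predicted labels below are hypothetical, chosen only so that the totals match the cards. The function names are assumptions too.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count each (actual label, predicted label) pair."""
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Fraction of test instances whose predicted label matches the actual one."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical test set of 12 instances with 9 correct predictions.
actual = ["Yes"] * 6 + ["No"] * 6
predicted = ["Yes"] * 5 + ["No"] + ["No"] * 4 + ["Yes"] * 2

matrix = confusion_matrix(actual, predicted)
tp = matrix[("Yes", "Yes")]  # true positives
fn = matrix[("Yes", "No")]   # false negatives
fp = matrix[("No", "Yes")]   # false positives
tn = matrix[("No", "No")]    # true negatives
score = accuracy(actual, predicted)  # 9 / 12 = 0.75
```

The diagonal entries (`tp` and `tn`) add up to the number of correct predictions, which is why accuracy can be read directly off the confusion matrix.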
Example of Unsupervised Learning
Input: Images of animals; Output: Groups of similar images