Machine Learning Foundations Week 5 Glossary

Accuracy

A performance metric for classification models; the number of correct predictions divided by the total number of predictions.
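
As a minimal sketch of this definition, the function below counts matching labels and divides by the total. The name `accuracy` and the sample labels are illustrative, not part of the glossary.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Illustrative labels: 3 of the 5 predictions are correct.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
print(accuracy(y_true, y_pred))  # 0.6
```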

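Area under the receiver operating characteristic curve (AUC)

A commonly used metric for measuring a binary classifier’s performance: the area under the ROC curve. It ranges from 0 to 1, where 0.5 corresponds to random guessing and 1 to a perfect ranking of positives above negatives.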
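
One way to compute AUC without drawing the curve uses its pairwise reading: the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. The sketch below assumes labels of 0 and 1; the function name `auc` and the data are illustrative.

```python
def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(y_true, scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

This pairwise loop is quadratic in the number of examples, so it is only meant for small illustrations.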

Base rate

Pertaining to a model, the percentage of cases in your evaluation data where the label Y equals 1.
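
A short sketch of the calculation, assuming labels are coded as 0 and 1. The name `base_rate` and the sample labels are illustrative.

```python
def base_rate(y):
    """Share of examples whose label equals 1."""
    if not y:
        raise ValueError("cannot compute a base rate on an empty list")
    return sum(1 for label in y if label == 1) / len(y)


# Illustrative evaluation labels: 2 positives out of 10 cases.
y_eval = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
print(base_rate(y_eval))  # 0.2
```

The base rate is a useful reference point: with these labels, a model that always predicts 0 already reaches 80% accuracy.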

Classification

A supervised learning method in which the label is a categorical value.

Conditional expected value

The likely average future value of Y in cases where X is true, written E[Y | X].
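
In practice these quantities are estimated from data by averaging. The sketch below estimates the overall average of Y and the average of Y restricted to cases where X is true; the records and names are made up for illustration.

```python
def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    if not values:
        raise ValueError("cannot average an empty collection")
    return sum(values) / len(values)


# Hypothetical records of (X, Y): X is a condition, Y is an outcome.
records = [(True, 10), (False, 4), (True, 14), (False, 6), (True, 12)]

expected_y = mean(y for _, y in records)               # average over all cases
expected_y_given_x = mean(y for x, y in records if x)  # average where X is true

print(expected_y)          # 46 / 5 = 9.2
print(expected_y_given_x)  # 36 / 3 = 12.0
```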

Empirical risk minimization

Choosing the model that minimizes loss on the training set.
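
A minimal sketch of the idea, assuming a tiny family of candidate models y = w * x and squared-error loss. The data, the candidate weights, and the function name `empirical_risk` are all illustrative.

```python
def empirical_risk(w, xs, ys):
    """Average squared-error loss of the model y = w * x on the training set."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Candidate models: one per slope value.
candidates = [0.5, 1.0, 1.5, 2.0, 2.5]

# Empirical risk minimization: keep the candidate with the lowest training loss.
best_w = min(candidates, key=lambda w: empirical_risk(w, xs, ys))
print(best_w)  # 2.0
```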

Expected value

The likely average future value of Y.

Expected value estimation

The likely average value of an outcome, estimated from known information about an example.

Feature selection

The process of empirically testing different combinations of features to choose an appropriate set.

Generalization

A model’s ability to perform well on new, previously unseen data.

Heuristic selection

A feature selection method that filters out features using heuristic rules prior to modeling.
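
One possible heuristic rule, shown here only as an example, is to drop features that never vary, since a constant column cannot help separate examples. The feature names, the values, and the helper `drop_low_variance` are hypothetical.

```python
from statistics import pvariance


def drop_low_variance(features, min_variance=0.0):
    """Keep only features whose population variance exceeds min_variance."""
    return {
        name: values
        for name, values in features.items()
        if pvariance(values) > min_variance
    }


# Hypothetical feature columns; "constant_flag" has zero variance.
features = {
    "age": [23, 35, 47, 51],
    "constant_flag": [1, 1, 1, 1],
    "income": [40, 52, 61, 75],
}

kept = drop_low_variance(features)
print(list(kept))  # ['age', 'income']
```

The filter runs before any model is trained, which is what distinguishes it from selection based on model performance.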

Hyperparameters

The “knobs” that you tweak during successive runs of training a model. They often control the trade-off between model complexity and simplicity.

Implicit feature selection

Reducing feature count as a byproduct of the model training procedure.

K-fold cross-validation

A resampling method that splits the data into k partitions (folds) and, in each of k rounds, trains the model on k-1 folds and validates it on the remaining fold.
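
The splitting step can be sketched as an index generator. This version assumes contiguous, unshuffled folds; the name `k_fold_indices` is illustrative, and real data would usually be shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


for train, validation in k_fold_indices(6, 3):
    print(train, validation)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```

Each example lands in the validation fold exactly once, so averaging the k validation scores uses every example for evaluation.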

Model deployment

The process of placing a machine learning model in a production environment where it can be used for its intended purpose.

Out-of-sample validation

Computing evaluation metrics on examples that were not part of model training. Helps approximate the expected loss.

Precision

Percentage of positive predictions that were actually positive.

Ranking

Sorting examples by a score and choosing the top K to fulfill some optimization objective.
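
A small sketch of the sort-then-cut step, assuming each example already has a model score where higher is better. The helper `top_k` and the sample items are illustrative.

```python
def top_k(items, scores, k):
    """Return the k items with the highest scores, best first."""
    ranked = sorted(zip(items, scores), key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked[:k]]


# Hypothetical items with model scores.
items = ["a", "b", "c", "d"]
scores = [0.2, 0.9, 0.5, 0.7]
print(top_k(items, scores, 2))  # ['b', 'd']
```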

Recall

Percentage of actual positives that were correctly classified as positive.
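
Precision and recall share a numerator (true positives) and differ in the denominator. The sketch below assumes labels of 0 and 1 and, as a convention chosen here, returns 0.0 when a denominator is zero. The function name and data are illustrative.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Assumed convention: report 0.0 when the metric is undefined.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
print(precision_recall(y_true, y_pred))  # (0.75, 0.75): tp=3, fp=1, fn=1
```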

Receiver operating characteristic (ROC) curve

A curve that represents the performance of your binary classification model at various classification thresholds, plotting the true positive rate against the false positive rate.
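
Each threshold yields one point on the curve. The sketch below computes (false positive rate, true positive rate) pairs for a handful of thresholds, treating a score at or above the threshold as a positive prediction. The function name, data, and thresholds are illustrative.

```python
def roc_points(y_true, scores, thresholds):
    """Return one (false_positive_rate, true_positive_rate) pair per threshold."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    points = []
    for thr in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= thr)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= thr)
        points.append((fp / n_neg, tp / n_pos))
    return points


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_points(y_true, scores, [0.9, 0.5, 0.3, 0.0]))
# [(0.0, 0.0), (0.0, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

Lowering the threshold moves the point up and to the right: more positives are caught, at the cost of more false alarms.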

Regression

A supervised learning method in which the label is a real-valued number.

Regularization

A penalty on a model’s complexity, typically added to the training loss; helps prevent overfitting.
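
As one common example (an L2 penalty, chosen here for illustration), the loss below adds `lam * w ** 2` to the mean squared error of a one-weight model y = w * x. The data, the candidate grid, and the names are hypothetical.

```python
def penalized_loss(w, xs, ys, lam):
    """Mean squared error of y = w * x plus an L2 penalty lam * w**2."""
    mse = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return mse + lam * w ** 2


# Hypothetical data that follows y = 2x exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Candidate weights 0.0, 0.5, ..., 3.0.
candidates = [i * 0.5 for i in range(7)]


def best_weight(lam):
    """Candidate weight with the lowest penalized loss for a given lam."""
    return min(candidates, key=lambda w: penalized_loss(w, xs, ys, lam))


print(best_weight(0.0))  # 2.0: no penalty, the data-fitting weight wins
print(best_weight(2.5))  # 1.5: the penalty pulls the weight toward zero
```

Here `lam` is a hyperparameter: larger values favor simpler (smaller-weight) models at the cost of a worse fit to the training data.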

Stepwise selection

A feature selection method that iteratively adds or removes features based on empirical model performance.
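
The forward (add-only) variant can be sketched as a greedy loop: at each step, add the feature that improves the score most, and stop when nothing helps. The scoring function below is a made-up stand-in; in practice it would train a model on the given features and return a validation metric.

```python
def forward_selection(candidates, score_fn):
    """Greedily add features while the score (higher is better) improves."""
    selected = []
    best_score = score_fn(selected)
    remaining = list(candidates)
    while remaining:
        # Score every one-feature extension of the current set.
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored)
        if top_score <= best_score:
            break  # no candidate improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Hypothetical per-feature gains and a small cost per feature used.
GAINS = {"age": 0.10, "income": 0.05, "noise": 0.0}


def toy_score(features):
    """Stand-in for a validation metric; not a real model evaluation."""
    return 0.5 + sum(GAINS[f] for f in features) - 0.01 * len(features)


selected, score = forward_selection(["age", "income", "noise"], toy_score)
print(selected)  # ['age', 'income']
```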

Supervised learning

A class of machine learning problems in which labeled data are available, enabling an algorithm to learn how to associate data values with data labels so that predictive models for classification or regression on unseen data are possible.

Test set

The subset of the data set that you use as a final test of your model’s performance.

Training set

The subset of the data set used to train a machine learning model to make predictions.

Validation set

The subset of the data set that is used to evaluate models’ performance when performing model selection.
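
The training, validation, and test sets can be produced by shuffling once and cutting the data into three disjoint parts. The sketch below assumes 20% each for validation and test; those fractions, the seed, and the function name are illustrative choices.

```python
import random


def train_val_test_split(examples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the examples and cut them into train, validation, and test lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(range(10))
print(len(train), len(val), len(test))  # 6 2 2
```

The usual workflow is to fit on the training set, compare models on the validation set, and touch the test set only once at the end.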
