Vocabulary flashcards covering key ML concepts, terminology, and evaluation ideas from the lecture notes.
Machine Learning
Field in which computer programs improve their performance at tasks with experience from data; a program learns from experience E with respect to tasks T and performance measure P.
Task
The behavior or task being improved (e.g., classification, acting in an environment).
Data
Experiences used to improve performance in the task; training examples.
Performance Measure
Criterion to evaluate task performance (e.g., accuracy, error rate).
Example (x,y)
An instance x with label y; used to learn mappings from inputs to outputs.
Hypothesis Space
Set of all hypotheses (functions) that a learning algorithm may output.
Hypothesis
A function that approximates the target function mapping inputs to outputs.
Target Function
The true function f that maps each input x to its correct output y.
Training Data
Collection of observed examples used to learn a hypothesis.
Instance Space
The set X of all possible objects that can be described by the features.
Concept
Subset of the instance space X containing the objects that have a given property; the target concept is unknown to the learner.
Feature
A measurable property describing an aspect of an instance.
Feature Vector
An n-dimensional vector of features representing an object.
Decision Tree
Tree-structured model that splits data on feature values to predict outputs.
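A minimal sketch of the splitting idea, reduced to a single threshold split on one numeric feature (a decision stump); the feature values, threshold, and labels below are illustrative, not from the notes:

```python
# Decision-stump sketch: split on one feature at a chosen threshold
# and predict the majority label of each side. Purely illustrative.
from collections import Counter

def majority(labels):
    # Most common label among the examples routed to this leaf.
    return Counter(labels).most_common(1)[0][0]

def stump_predict(X, y, feature, threshold, x_new):
    left  = [label for xi, label in zip(X, y) if xi[feature] <= threshold]
    right = [label for xi, label in zip(X, y) if xi[feature] >  threshold]
    return majority(left) if x_new[feature] <= threshold else majority(right)

# Toy data: feature 0 could be "petal length"; threshold 2.5 is an assumed value.
X = [[1.4, 0.2], [1.3, 0.2], [4.7, 1.4], [4.5, 1.5]]
y = ["setosa", "setosa", "versicolor", "versicolor"]
print(stump_predict(X, y, feature=0, threshold=2.5, x_new=[1.5, 0.3]))  # setosa
```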
Linear Function
Model where output is a weighted sum of inputs (linear relationship).
Perceptron
Single neural unit (a one-layer network) that computes a weighted sum of its inputs and applies a threshold/step activation.
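A minimal sketch of a perceptron with a step activation and the classic error-driven weight update; the learning rate and epoch count are arbitrary illustrative choices:

```python
# Perceptron sketch: weighted sum of inputs plus bias, passed through a step activation.
def predict(weights, bias, x):
    s = sum(w * xi for w, xi in zip(weights, x)) + bias  # weighted sum
    return 1 if s >= 0 else 0                            # step activation

def train(X, y, epochs=10, lr=1.0):
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            error = target - predict(weights, bias, xi)   # 0 when prediction is correct
            weights = [w + lr * error * v for w, v in zip(weights, xi)]
            bias += lr * error
    return weights, bias

# Learning logical AND as a toy example (linearly separable, so it converges).
X, y = [[0, 0], [0, 1], [1, 0], [1, 1]], [0, 0, 0, 1]
w, b = train(X, y)
print([predict(w, b, xi) for xi in X])  # [0, 0, 0, 1]
```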
Multi-Layer Neural Network
Network with multiple hidden layers enabling complex representations.
Supervised Learning
Learning from labeled data (X,y) to predict labels for new inputs.
Unsupervised Learning
Learning from unlabeled data to cluster or summarize data.
Semi-Supervised Learning
Learning from a mix of labeled and unlabeled data.
Reinforcement Learning
Learning to act in an environment to maximize cumulative reward; agent, environment, state, action, reward.
Classification
Predicting a discrete label.
Regression
Predicting a continuous value.
Confusion Matrix
Table of true vs. predicted classes used to compute accuracy, precision, recall.
Accuracy
Proportion of correct predictions: (TP+TN)/(P+N).
Precision
TP/(TP+FP) – correctness of positive predictions.
Recall (TP rate)
TP/P = TP/(TP+FN) – proportion of actual positives correctly predicted.
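A minimal sketch computing all three metrics from confusion-matrix counts of TP, TN, FP, and FN; the label vectors are made-up illustrative data:

```python
# Compute accuracy, precision, and recall for binary labels (1 = positive class).
def metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy  = (tp + tn) / (tp + tn + fp + fn)   # (TP+TN)/(P+N)
    precision = tp / (tp + fp)                    # TP/(TP+FP)
    recall    = tp / (tp + fn)                    # TP/P, where P = TP + FN
    return accuracy, precision, recall

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(metrics(y_true, y_pred))  # (0.666..., 0.666..., 0.666...)
```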
Bias
Systematic error from model assumptions; can cause underfitting.
Variance
Variability of model predictions due to different training data.
Generalization
Ability to perform well on unseen data; low generalization error.
Overfitting
Model fits noise in training data; high variance, poor generalization.
Underfitting
Model too simple; high bias, poor training and test performance.
Evaluation
Process of assessing algorithm performance using metrics and validation.
Cross-Validation
Estimating generalization by repeatedly partitioning the data into training and validation sets; commonly done as k-fold cross-validation.
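A minimal sketch of k-fold splitting; `train_fn` and `eval_fn` are hypothetical caller-supplied functions standing in for whatever learner and metric are used:

```python
# k-fold cross-validation sketch: each fold is held out once for validation
# while the remaining folds are used for training; the k scores are averaged.
def k_fold_cv(X, y, k, train_fn, eval_fn):
    n = len(X)
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    scores, start = [], 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i not in val_idx]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(eval_fn(model, [X[i] for i in val_idx], [y[i] for i in val_idx]))
        start += size
    return sum(scores) / k  # estimate of generalization performance
```

In practice the data is shuffled before splitting so that each fold is representative of the whole set.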
Training Set
Data used to train the model.
Test Set
Independent data used to evaluate model performance.
Occam's Razor
Among consistent hypotheses, the simplest is preferred.
Minimum Description Length (MDL)
Inductive bias favoring shorter descriptions of hypotheses.
Maximum Margin
SVM principle: maximize the margin, i.e., the distance between the decision boundary and the closest training examples of each class.
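As a worked equation, the textbook hard-margin formulation of this principle (stated here as a standard result, not quoted from the notes):

```latex
% Hard-margin SVM: the margin width is 2/\|w\|, so maximizing it
% is equivalent to minimizing \|w\|^2 subject to correct classification.
\min_{w,\,b} \; \tfrac{1}{2}\|w\|^2
\quad \text{subject to} \quad
y_i\,(w \cdot x_i + b) \ge 1 \;\; \text{for all training examples } (x_i, y_i),\; y_i \in \{-1, +1\}.
```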
Inductive Bias
Assumptions guiding the learning process to prefer certain hypotheses.
Hypothesis Language
Formalism used to express hypotheses, influencing bias.
Inductive Learning
Inferring a general function from training examples.
Experimentation Error
Error in experimental performance estimates arising from finite sample size and the bias-variance trade-off.