Comprehensive vocabulary flashcards covering basic machine learning definitions, task types, performance concepts like overfitting/underfitting, regularization techniques, and the common Python package ecosystem used in the course.
Machine Learning
A program is said to learn from experience E with respect to tasks T and performance measure P if its performance on T, as measured by P, improves with E.
Supervised Learning
A type of learning in which a model is trained on labeled data, where each input example is paired with a target value, so that it learns a mapping from inputs to outputs.
Classification
A supervised learning task where machines learn to determine which discrete class or category an input belongs to.
Regression
A supervised learning task where machines learn to predict a continuous numeric value from a set of inputs by fitting a model to data.
Unsupervised Learning
A type of learning where no labels are associated with the input data, and algorithms learn patterns and relationships among samples without explicit guidance.
Association
An unsupervised learning task that identifies patterns that frequently occur together in data, such as diabetes being associated with high blood pressure.
Clustering
An unsupervised learning task that groups data samples into clusters that share similar features, typically using similarity or distance measures.
Anomaly Detection
An unsupervised learning task that detects rare or unusual patterns that differ from typical behavior, such as spam email or credit card fraud.
Accuracy
A common performance measure for classification defined as the proportion of examples for which the model produces the correct output.
Error Rate
The complement of accuracy (1 − accuracy): the proportion of examples that the model misclassifies.
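The two measures above can be computed directly from a list of true labels and a list of predictions. This is a minimal sketch in plain Python; the `accuracy` helper and the label values are made up for illustration and are not taken from the course material.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels, invented for this example only.
y_true = ["spam", "ham", "ham", "spam", "ham"]
y_pred = ["spam", "ham", "spam", "spam", "ham"]

acc = accuracy(y_true, y_pred)   # 4 of 5 predictions are correct
error_rate = 1 - acc             # proportion of misclassified examples

print(f"accuracy = {acc:.2f}, error rate = {error_rate:.2f}")
```

Because the two quantities always sum to 1, reporting either one is enough to recover the other.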
Generalization
The ability of a machine learning approach to perform well on new, previously unseen examples, rather than just the data used for training.
Underfitting
A condition where a model cannot capture the underlying patterns in the data, resulting in high training error and typically high validation or test error.
Overfitting
A condition where a model fits the training data too closely, including noise or outliers, leading to low training error but high validation or test error.
Model Capacity
A model’s ability to represent a wide variety of functions; low capacity leads to underfitting while high capacity can lead to overfitting.
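One way to see underfitting, overfitting, and capacity together is to fit polynomials of increasing degree to the same small data set. The sketch below assumes NumPy is available and uses synthetic data (a noisy sine curve chosen purely for illustration); the polynomial degree stands in for model capacity.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_data(n):
    """Synthetic data (an assumption for this sketch): noisy sine samples."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=n)
    return x, y


def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


x_train, y_train = make_data(15)    # small training set
x_val, y_val = make_data(200)       # held-out validation set

results = {}
for degree in (1, 3, 9):            # degree acts as a capacity knob
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_val, y_val))
    train_err, val_err = results[degree]
    print(f"degree {degree}: train MSE = {train_err:.4f}, val MSE = {val_err:.4f}")
```

The degree-1 fit is too simple for a sine curve, so its training error stays high (underfitting). Training error can only go down as the degree grows, while validation error typically starts to rise again once the model begins fitting noise (overfitting).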
Regularization
A modification to a learning algorithm to encourage better generalization while maintaining acceptable training error by discouraging overly complex solutions.
Weight Decay
A common form of regularization in linear regression where a penalty term is added to the cost function to discourage large parameter values.
Hyperparameter λ (lambda)
A regularization hyperparameter that controls the trade-off between fitting the data and keeping the weights small: larger values penalize large weights more strongly. It is typically tuned using validation data.
L2 Regularization (Ridge)
A type of regularization that adds a penalty term of λ∑w² (λ times the sum of the squared weights) to the cost function.
L1 Regularization (Lasso)
A type of regularization that adds a penalty term of λ∑|w| (λ times the sum of the absolute values of the weights) to the cost function.
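The two penalty terms, and the effect of λ on the fitted weights, can be sketched with NumPy. The helper names (`l2_penalty`, `l1_penalty`, `ridge_fit`) and the synthetic data are assumptions made for this example. The ridge solver uses the standard closed form for a sum-of-squared-errors cost without an intercept, which may differ by a scaling factor from the cost function used in the course.

```python
import numpy as np


def l2_penalty(w, lam):
    """Ridge penalty: lam times the sum of squared weights."""
    return lam * np.sum(w ** 2)


def l1_penalty(w, lam):
    """Lasso penalty: lam times the sum of absolute weights."""
    return lam * np.sum(np.abs(w))


def ridge_fit(X, y, lam):
    """Minimize ||Xw - y||^2 + lam * ||w||^2 (no intercept, for brevity)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


# Synthetic regression problem, invented for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
true_w = np.array([2.0, -1.0, 0.0, 0.5, 3.0])
y = X @ true_w + rng.normal(scale=0.1, size=30)

for lam in (0.0, 1.0, 100.0):
    w = ridge_fit(X, y, lam)
    print(f"lambda = {lam:6.1f}  ||w|| = {np.linalg.norm(w):.3f}")
```

As λ grows, the norm of the fitted weight vector shrinks, which is the "weight decay" behaviour the cards describe. Lasso has no closed-form solution like this one, so only its penalty term is shown here.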
NumPy
A powerful library for numerical computing that provides fast n-dimensional arrays, linear algebra functionality, and vectorized computations.
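A short sketch of what "n-dimensional arrays" and "vectorized computations" mean in practice, assuming NumPy is installed. The array values are arbitrary and chosen only so the results are easy to check by hand.

```python
import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
b = np.array([10.0, 20.0, 30.0])

scaled = a * 2 + 1          # elementwise arithmetic, no explicit Python loop
shifted = a + b             # broadcasting: b is added to every row of a
product = a @ a.T           # linear algebra: (2x3) times (3x2) matrix product
col_means = a.mean(axis=0)  # reduction along the row axis, one mean per column

print(scaled)
print(shifted)
print(product)
print(col_means)
```

Each operation applies to the whole array at once, which is usually both shorter to write and faster than looping over elements in pure Python.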
SciPy
A fundamental library for scientific computing that provides efficient numerical routines for integration, interpolation, optimization, and statistics.
Pandas
A flexible open-source library used for data analysis and manipulation of tabular data, such as spreadsheets and database tables.
Matplotlib
A comprehensive 2D plotting library for Python used to create static, animated, and interactive visualizations.
Scikit-learn
A widely used machine learning library built on top of NumPy and SciPy that provides implementations for classification, regression, clustering, and data preprocessing.
OpenCV
A library that supports image I/O and classic computer vision operations like filtering, thresholding, and feature extraction.
IPython
An enhanced interactive Python console that provides a powerful environment for exploratory programming and serves as the kernel for Jupyter.
SymPy
A symbolic mathematics library for Python used for algebraic manipulation, symbolic differentiation, and integration.
Seaborn
A data visualization library built on Matplotlib that provides a high-level interface for drawing attractive and informative statistical graphics.
Statsmodels
A Python library for estimating and analyzing statistical models, supporting parametric and non-parametric tests and R-style formulas.
Conda Environment
A self-contained directory holding its own Python installation and set of packages, which keeps each project's dependencies isolated and prevents conflicts between them.
Jupyter Kernel
The running Python process that a notebook connects to and that executes its cells; users must select the kernel that belongs to the intended environment so the notebook sees that environment's packages.
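A quick way to check which environment a kernel is actually running in is to ask the interpreter itself from a notebook cell. This is a minimal sketch: the `describe_interpreter` helper is hypothetical, and the `conda-meta` check is only a heuristic based on the directory that conda environments normally contain.

```python
import os
import sys


def describe_interpreter():
    """Report where the running Python interpreter lives."""
    return {
        "executable": sys.executable,   # path to the python binary in use
        "prefix": sys.prefix,           # root directory of the environment
        # Heuristic: conda environments normally contain a conda-meta folder.
        "looks_like_conda_env": os.path.isdir(
            os.path.join(sys.prefix, "conda-meta")
        ),
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the printed prefix is not the directory of the environment you expected, the notebook is probably attached to a different kernel.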
Channel
The location where conda pulls packages from, such as conda-forge.