BE 530 - Machine Learning in Python Lecture 2 Flashcards

Description and Tags

Comprehensive vocabulary flashcards covering basic machine learning definitions, task types, performance concepts like overfitting/underfitting, regularization techniques, and the common Python package ecosystem used in the course.

Last updated 2:16 AM on 5/22/26

32 Terms

Machine Learning

A computer program is said to learn from experience $E$ with respect to tasks $T$ and performance measure $P$ if its performance on $T$, as measured by $P$, improves with $E$.

Supervised Learning

A type of learning in which a model is trained on labeled data, where each input example is paired with a target value, so that it learns a mapping from inputs to output labels.

Classification

A supervised learning task where machines learn to determine which discrete class or category an input belongs to.
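
As a rough illustration of assigning an input to a discrete class, here is a minimal sketch using a 1-nearest-neighbor rule. The choice of algorithm, the function name `nearest_neighbor_predict`, and the toy data are assumptions made for this example, not part of the lecture.

```python
import math

def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query`."""
    distances = [math.dist(point, query) for point in train_points]
    closest = distances.index(min(distances))
    return train_labels[closest]

# Hypothetical toy data: two features per example, two discrete classes.
train_points = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
train_labels = ["benign", "benign", "malignant", "malignant"]

prediction = nearest_neighbor_predict(train_points, train_labels, (5.5, 5.0))
print(prediction)
```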

Regression

A supervised learning task where machines learn to predict a continuous numeric value from a set of inputs by fitting a model to data.
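
One way to picture "fitting a model to data" is ordinary least squares for a straight line. The sketch below assumes a single input feature and uses the closed-form slope and intercept; the helper name `fit_line` and the data are illustrative only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
predicted = slope * 5.0 + intercept  # continuous prediction for a new input
print(slope, intercept, predicted)
```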

Unsupervised Learning

A type of learning where no labels are associated with the input data, and algorithms learn patterns and relationships among samples without explicit guidance.

Association

An unsupervised learning task that identifies patterns that frequently occur together in data, such as diabetes being associated with high blood pressure.
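
Association patterns are commonly quantified with support and confidence. These two measures are standard in association-rule mining but are not named on the card, so treat this as a sketch under that assumption. The patient records are made up for illustration.

```python
def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= record for record in records) / len(records)

def confidence(antecedent, consequent, records):
    """Estimated P(consequent | antecedent) from co-occurrence counts."""
    both = set(antecedent) | set(consequent)
    return support(both, records) / support(antecedent, records)

# Hypothetical records: each set lists conditions observed for one patient.
records = [
    {"diabetes", "high_blood_pressure"},
    {"diabetes", "high_blood_pressure", "obesity"},
    {"diabetes"},
    {"high_blood_pressure"},
    {"obesity"},
]

rule_support = support({"diabetes", "high_blood_pressure"}, records)
rule_confidence = confidence({"diabetes"}, {"high_blood_pressure"}, records)
print(rule_support, rule_confidence)
```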

Clustering

An unsupervised learning task that groups data samples into clusters that share similar features, typically using similarity or distance measures.
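
The card does not name an algorithm; k-means is one common distance-based choice. Below is a minimal one-dimensional sketch, assuming fixed starting centroids and absolute difference as the distance. The function name `kmeans_1d` and the data are hypothetical.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.7]  # two visually obvious groups
centroids, clusters = kmeans_1d(points, centroids=[0.0, 10.0])
print(centroids)
print(clusters)
```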

Anomaly Detection

An unsupervised learning task that detects rare or unusual patterns that differ from typical behavior, such as spam email or credit card fraud.
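
A very simple way to flag values that differ from typical behavior is a z-score rule: mark anything more than a chosen number of standard deviations from the mean. The rule, the threshold of 2.0, and the transaction amounts are assumptions for illustration; practical detectors are usually more robust than this.

```python
import statistics

def find_anomalies(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) > threshold * std]

# Hypothetical card transaction amounts with one unusually large purchase.
amounts = [20.0, 22.0, 19.0, 21.0, 20.0, 23.0, 18.0, 500.0]
anomalies = find_anomalies(amounts)
print(anomalies)
```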

Accuracy

A common performance measure for classification defined as the proportion of examples for which the model produces the correct output.

Error Rate

The complement of accuracy, $1 - \text{accuracy}$: the proportion of examples the model misclassifies.
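
Accuracy and error rate can be computed directly from true and predicted labels. The labels below are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Proportion of examples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical labels: 6 of the 8 predictions match.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
err = 1.0 - acc  # error rate is the complement of accuracy
print(acc, err)
```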

Generalization

The ability of a machine learning approach to perform well on new, previously unseen examples, rather than just the data used for training.

Underfitting

A condition where a model cannot capture the underlying patterns in the data, resulting in high training error and typically high validation or test error.

Overfitting

A condition where a model fits the training data too closely, including noise or outliers, leading to low training error but high validation or test error.
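
Underfitting and overfitting can be explored by fitting polynomials of increasing degree and comparing training and validation error. This sketch assumes synthetic data from a quadratic with Gaussian noise. The degrees 1, 2, and 10, the noise level, and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_function(x):
    # Assumed ground truth for this demo: a quadratic.
    return 1.0 + 2.0 * x - 3.0 * x**2

x_train = np.linspace(-1.0, 1.0, 15)
y_train = true_function(x_train) + rng.normal(scale=0.2, size=x_train.size)
x_val = np.linspace(-0.95, 0.95, 15)
y_val = true_function(x_val) + rng.normal(scale=0.2, size=x_val.size)

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with coefficients `coeffs`."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 2, 10):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_val, y_val))

for degree, (train_err, val_err) in results.items():
    print(f"degree {degree:2d}: train MSE {train_err:.4f}, validation MSE {val_err:.4f}")
```

The straight line (degree 1) cannot represent the curvature, so its training error stays high. Training error can only stay the same or fall as the degree grows, so the validation column is the one to watch for signs of overfitting.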

Model Capacity

A model’s ability to represent a wide variety of functions; low capacity leads to underfitting while high capacity can lead to overfitting.

Regularization

A modification to a learning algorithm to encourage better generalization while maintaining acceptable training error by discouraging overly complex solutions.

Weight Decay

A common form of regularization in linear regression where a penalty term is added to the cost function to discourage large parameter values.

Hyperparameter $\lambda$ (lambda)

A hyperparameter that controls the trade-off between fitting the training data and keeping the weights small; it is typically tuned using validation data.

L2 Regularization (Ridge)

A type of regularization that adds a penalty term of $\lambda \sum_j w_j^2$ to the cost function.
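
For linear regression, the L2-penalized problem has a closed-form solution, which makes the shrinking effect of $\lambda$ easy to see. The sketch below assumes the cost $\lVert Xw - y \rVert^2 + \lambda \lVert w \rVert^2$ with no intercept term. The lecture's exact scaling of the cost may differ, and the data are synthetic.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y, the minimizer of
    ||Xw - y||^2 + lam * ||w||^2 (assumed cost, no intercept)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
true_w = np.array([2.0, -1.0, 0.5])  # hypothetical generating weights
y = X @ true_w + rng.normal(scale=0.1, size=20)

w_unregularized = ridge_fit(X, y, lam=0.0)
w_ridge = ridge_fit(X, y, lam=10.0)

print("lambda = 0 :", w_unregularized, "norm", np.linalg.norm(w_unregularized))
print("lambda = 10:", w_ridge, "norm", np.linalg.norm(w_ridge))
```

Increasing `lam` pulls the weight vector toward zero, trading some training error for smaller weights.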

L1 Regularization (Lasso)

A type of regularization that adds a penalty term of $\lambda \sum_j |w_j|$ to the cost function.
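
To make the penalty terms concrete, the sketch below evaluates the L1 penalty (sum of absolute values) next to the L2 penalty (sum of squares) for the same weights. The weight values and $\lambda = 0.1$ are arbitrary illustrative choices.

```python
def l1_penalty(weights, lam):
    """lam times the sum of absolute weight values (Lasso-style penalty)."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """lam times the sum of squared weights (Ridge-style penalty)."""
    return lam * sum(w * w for w in weights)

weights = [0.5, -2.0, 0.0, 3.0]  # hypothetical model weights
lam = 0.1

l1 = l1_penalty(weights, lam)  # 0.1 * (0.5 + 2.0 + 0.0 + 3.0)
l2 = l2_penalty(weights, lam)  # 0.1 * (0.25 + 4.0 + 0.0 + 9.0)
print(l1, l2)
```

The squared penalty grows much faster for large weights, while the absolute-value penalty charges every unit of weight equally.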

NumPy

A powerful library for numerical computing that provides fast n-dimensional arrays, linear algebra functionality, and vectorized computations.
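
A short sketch of the features named on the card: array creation, vectorized arithmetic with broadcasting, and basic linear algebra. The array values are arbitrary.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([10.0, 20.0])

scaled = a * 2.0            # elementwise, no explicit Python loop
broadcast_sum = a + b       # b is broadcast across each row of a
product = a @ b             # matrix-vector product
x = np.linalg.solve(a, b)   # solve the linear system a @ x = b

print(scaled)
print(broadcast_sum)
print(product)
print(x)
```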

SciPy

A fundamental library for scientific computing that provides efficient numerical routines for integration, interpolation, optimization, and statistics.
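
Two of the routine families mentioned on the card, integration and optimization, can be tried in a few lines. The integrand and objective function below are arbitrary examples chosen because their exact answers are easy to check by hand.

```python
from scipy import integrate, optimize

# Numerically integrate x^2 from 0 to 1 (exact value is 1/3).
area, abs_error_estimate = integrate.quad(lambda x: x**2, 0.0, 1.0)

# Minimize a one-dimensional function whose minimum is at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2 + 1.0)

print(area)
print(result.x, result.fun)
```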

Pandas

A flexible open-source library used for data analysis and manipulation of tabular data, such as spreadsheets and database tables.
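
A minimal sketch of working with tabular data: build a `DataFrame`, filter rows, and aggregate by group. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical table: one row per patient.
df = pd.DataFrame({
    "patient": ["a", "b", "c", "d"],
    "group": ["control", "treated", "control", "treated"],
    "glucose": [90.0, 110.0, 100.0, 130.0],
})

high = df[df["glucose"] > 95.0]                       # boolean-mask row filter
group_means = df.groupby("group")["glucose"].mean()   # per-group average

print(high)
print(group_means)
```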

Matplotlib

A comprehensive plotting library for Python, primarily 2D, used to create static, animated, and interactive visualizations.

Scikit-learn

A widely used machine learning library built on top of NumPy and SciPy that provides implementations for classification, regression, clustering, and data preprocessing.
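
A typical workflow combines preprocessing, a model, and evaluation on held-out data. The sketch below assumes the bundled Iris dataset and a logistic regression classifier. Both are illustrative choices, not something the lecture prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Hold out a test set so the score reflects generalization, not memorization.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing (feature scaling) followed by a classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)  # mean accuracy on unseen examples
print(test_accuracy)
```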

OpenCV

A library that supports image I/O and classic computer vision operations like filtering, thresholding, and feature extraction.

IPython

An enhanced interactive Python console that provides a powerful environment for exploratory programming and serves as the Python kernel for Jupyter.

SymPy

A symbolic mathematics library for Python used for algebraic manipulation, symbolic differentiation, and integration.
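
The three operations named on the card can be sketched on a small polynomial. The expression is an arbitrary example.

```python
import sympy as sp

x = sp.symbols("x")
expr = x**3 + 2 * x

derivative = sp.diff(expr, x)            # symbolic differentiation
antiderivative = sp.integrate(expr, x)   # indefinite integral, constant omitted
expanded = sp.expand((x + 1) ** 2)       # algebraic manipulation

print(derivative)
print(antiderivative)
print(expanded)
```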

Seaborn

A data visualization library built on Matplotlib that provides a high-level interface for drawing attractive and informative statistical graphics.

Statsmodels

A Python library for estimating and analyzing statistical models, supporting parametric and non-parametric tests and R-style formulas.

Conda Environment

A self-contained directory holding its own Python installation and set of packages, which prevents dependency conflicts between projects.

Jupyter Kernel

The running Python process that a notebook connects to; users must select the kernel that belongs to the intended environment so the notebook uses that environment's packages.

Channel

The location where conda pulls packages from, such as conda-forge.