Machine Learning

1. Machine Learning

A field of Artificial Intelligence where systems learn from data to make predictions or decisions without being explicitly programmed. It involves the study of algorithms that improve performance at a task through experience.

2. Supervised Learning

A type of training that uses a series of labeled examples with direct feedback, where the training data includes the desired outputs.

3. Unsupervised Learning

A type of learning with no feedback, where the training data does not include desired outputs. It involves learning "what normally happens" or grouping similar instances.

4. Reinforcement Learning

A type of learning involving indirect feedback after many examples, where rewards are received from a sequence of actions. It focuses on learning a policy (a sequence of outputs).

5. Regression

A task where the goal is to predict a continuous numeric value based on input features (e.g., predicting house prices or temperature).

6. Classification

A task where the goal is to predict categories or nominal outputs.

7. Linear Regression

A regression algorithm that fits data with a hyperplane (or line in 2D). It is the simplest model for function approximation.
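
A minimal sketch of fitting one with scikit-learn, assuming made-up toy data:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    X = np.array([[1.0], [2.0], [3.0], [4.0]])  # one feature per row (toy data)
    y = np.array([2.1, 4.0, 6.2, 7.9])          # roughly y = 2x

    model = LinearRegression().fit(X, y)
    print(model.coef_, model.intercept_)        # learned slope and intercept
    print(model.predict([[5.0]]))               # prediction for a new point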

8. Logistic Regression

A common algorithm used when the dependent variable is binary (e.g., disease vs. no disease). It fits data with a sigmoidal or logistic curve rather than a line to output a probability approximation.

9. Delta Rule (Least Mean Squares Rule)

An update rule used in supervised learning (specifically for neural networks) to minimize error by adjusting weights based on the difference between actual and predicted outputs.
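
A minimal numpy sketch of the idea for a single linear unit, assuming toy data and a hand-picked learning rate eta; each update moves the weights in proportion to (target - prediction) times the input:

    import numpy as np

    X = np.array([[1.0, 0.5], [0.2, 1.0], [1.0, 1.0]])  # toy inputs
    t = np.array([1.0, 0.0, 1.0])                       # desired outputs
    w = np.zeros(2)                                     # weights to learn
    eta = 0.1                                           # learning rate

    for _ in range(100):                  # sweep the training set repeatedly
        for x_i, t_i in zip(X, t):
            y_i = w @ x_i                 # predicted output
            w += eta * (t_i - y_i) * x_i  # delta rule update
    print(w)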

10. Sum of Squared Error (SSE)

An objective function used in simple linear regression that sums the squared differences between predicted and actual values. It creates a parabolic error surface ideal for gradient descent.

11. Mean Absolute Error (MAE)

The average absolute difference between predicted and actual values.

12. Mean Squared Error (MSE)

The average squared difference between predicted and actual values; it penalizes larger errors more than MAE.

13. R² Score (Coefficient of Determination)

A metric measuring the proportion of variance in the data that the model explains; a value of 1 indicates a perfect fit.
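
A short sketch computing all three metrics with sklearn.metrics, assuming toy true and predicted values:

    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

    y_true = [3.0, 5.0, 2.5, 7.0]  # toy targets
    y_pred = [2.8, 5.4, 2.9, 6.6]  # toy predictions

    print(mean_absolute_error(y_true, y_pred))  # MAE
    print(mean_squared_error(y_true, y_pred))   # MSE
    print(r2_score(y_true, y_pred))             # R²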

14. Binary Classification

A type of classification involving exactly two classes (e.g., Pass/Fail, Yes/No).

15. Multiclass Classification

Classification involving more than two classes (e.g., Cat, Dog, Bird).

16. Multilabel Classification

A scenario where each instance can belong to multiple classes simultaneously.

17. Threshold

The decision boundary (e.g., 0.5 or 0.9) that converts a model’s probability output into a specific class label.
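
A minimal sketch, assuming toy data, of turning predict_proba outputs into labels with a stricter threshold than the default 0.5:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    X = np.array([[0.5], [1.5], [2.5], [3.5]])  # toy feature
    y = np.array([0, 0, 1, 1])                  # toy binary labels

    clf = LogisticRegression().fit(X, y)
    probs = clf.predict_proba(X)[:, 1]          # P(class = 1) per instance
    labels = (probs >= 0.9).astype(int)         # apply a 0.9 threshold
    print(probs, labels)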

19. Decision Tree

A hierarchical structure that makes decisions based on feature values, used for classification and regression.

20. Random Forest

An ensemble method that trains many decision trees and combines their predictions by majority vote (classification) or averaging (regression).

21. K-Nearest Neighbors (KNN)

An algorithm that classifies an instance by a majority vote among its k nearest data points.

22. Support Vector Machine (SVM)

An algorithm that finds the boundary (hyperplane) separating classes with the maximum margin.

23. Naïve Bayes

A probabilistic classifier based on Bayes' theorem, with the "naïve" assumption that features are conditionally independent given the class.
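
A hedged sketch of how the five classifiers above share one scikit-learn interface, fitted on the built-in iris data (training accuracy only, purely for illustration):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    for clf in [DecisionTreeClassifier(), RandomForestClassifier(),
                KNeighborsClassifier(), SVC(), GaussianNB()]:
        print(type(clf).__name__, clf.fit(X, y).score(X, y))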

24. Confusion Matrix
A table comparing the actual (true) labels with the labels predicted by the classification model.

25. Precision
The number of correctly classified positive examples divided by the total number of examples classified as positive.

26. Recall (Sensitivity)

The number of correctly classified positive examples divided by the total number of actual positive examples in the test set.

27. Specificity
The number of correctly classified negative examples divided by the total number of actual negative examples in the test set; also known as the True Negative Rate (TNR).
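
A sketch with toy labels; scikit-learn has no direct specificity function, so it is read off the confusion matrix:

    from sklearn.metrics import confusion_matrix, precision_score, recall_score

    y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual labels (toy)
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # predicted labels (toy)

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(precision_score(y_true, y_pred))  # TP / (TP + FP)
    print(recall_score(y_true, y_pred))     # TP / (TP + FN)
    print(tn / (tn + fp))                   # specificity = TN / (TN + FP)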

28. Cross-validation
A method where data is partitioned into n subsets to train and test the model n times to estimate accuracy.

29. Leave-one-out Cross-validation
A special case of cross-validation for small datasets where each fold has only a single test example.
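
A sketch of both variants on the built-in iris data; LeaveOneOut is just n-fold cross-validation with a single test example per fold:

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    X, y = load_iris(return_X_y=True)
    clf = LogisticRegression(max_iter=1000)

    print(cross_val_score(clf, X, y, cv=5).mean())             # 5-fold accuracy
    print(cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()) # leave-one-out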

30. Scoring
Assigning a probability estimate (PE) to an instance rather than a definite class label.

31. ROC Curve (Receiver Operating Characteristic)
A plot of the True Positive Rate (TPR) against the False Positive Rate (FPR).

32. AUC (Area Under the Curve)
A performance measure where a value of 1 represents a perfect classifier and 0.5 represents random guessing.
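
A sketch computing both from toy scores:

    from sklearn.metrics import roc_auc_score, roc_curve

    y_true = [0, 0, 1, 1]            # actual labels (toy)
    y_score = [0.1, 0.4, 0.35, 0.8]  # probability estimates (toy)

    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    print(fpr, tpr)                        # points along the ROC curve
    print(roc_auc_score(y_true, y_score))  # 1.0 = perfect, 0.5 = random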

33. Lift Analysis
An analysis performed by ranking examples by their score and dividing them into bins to observe the distribution of positive examples.

34. Overfitting
A situation where a model fits the training data well but performs poorly on test data; in decision trees, often characterized by a tree that is too deep or has too many branches.

35. Pre-pruning
A method to avoid overfitting by halting the construction of the tree early.

36. Post-pruning
A method to avoid overfitting by removing branches or sub-trees from a "fully grown" tree.

37. Association Rule Mining
A task focused on finding relationships between items, such as identifying that customers who buy coffee likely buy bread.

38. Dimensionality Reduction
The process of simplifying large datasets into fewer variables while retaining most of the important information.

39. Clustering
The process of grouping data based on similarity or distance (e.g., Euclidean distance) without knowing labels ahead of time.

40. K-Means
An algorithm that partitions data into k clusters by minimizing the distance between data points and cluster centers (centroids).

41. Hierarchical Clustering
An algorithm that builds a tree-like structure of clusters, often visualized as a dendrogram.

42. DBSCAN
A density-based algorithm that groups points lying close together and marks outliers as noise.

43. Elbow Method
A technique used to select the optimal number of clusters (k) by plotting the Sum of Squared Errors (SSE) and finding the "elbow point" where the error curve starts to flatten.
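
A sketch of the elbow method on toy 2-D points; inertia_ is scikit-learn's name for the SSE to the nearest centroid:

    import numpy as np
    from sklearn.cluster import KMeans

    X = np.random.RandomState(0).rand(100, 2)  # toy 2-D points

    for k in range(1, 8):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        print(k, km.inertia_)  # plot SSE vs. k and look for the elbow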

44. PCA (Principal Component Analysis)
A dimensionality reduction technique that transforms data into a new coordinate system using principal components to simplify the dataset while keeping the maximum variance.

45. Principal Components
New variables created by PCA that are linear combinations of original features, designed to capture the most variance in the data.

46. Eigenvalues and Eigenvectors
Mathematical properties used in PCA to determine the directions (components) with the most variance.
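
A sketch tying the last three cards together: projecting the iris data onto two principal components and checking how much variance each keeps:

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)
    pca = PCA(n_components=2)
    X_2d = pca.fit_transform(X)           # project onto 2 components

    print(X_2d.shape)                     # (150, 2)
    print(pca.explained_variance_ratio_)  # variance captured per component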

47. Silhouette Score
A metric (ranging from –1 to +1) that measures how well data points match their own cluster compared to others; values closer to +1 indicate good clustering.

48. Davies–Bouldin Index (DBI)
A metric measuring cluster separation and compactness, where values closer to 0 indicate better clustering.

49. Calinski–Harabasz Index (CH Index)
A metric where higher values indicate better clustering, signifying that between-cluster variance is much greater than within-cluster variance.
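
A sketch computing all three scores for one K-Means clustering of toy blobs:

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import (calinski_harabasz_score, davies_bouldin_score,
                                 silhouette_score)

    X, _ = make_blobs(n_samples=200, centers=3, random_state=0)  # toy clusters
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

    print(silhouette_score(X, labels))         # closer to +1 is better
    print(davies_bouldin_score(X, labels))     # closer to 0 is better
    print(calinski_harabasz_score(X, labels))  # higher is better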

50. fit()
A method used to learn or train parameters from the data.

51. transform()
A method used to apply learned parameters to data, typically used for test or new data.

52. fit_transform()
A method that learns parameters and applies the transformation in a single step, commonly used with preprocessing transformers.
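
The usual pattern, sketched with StandardScaler: fit (or fit_transform) on training data only, then transform test data using the training statistics:

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    X_train = np.array([[1.0], [2.0], [3.0]])  # toy training data
    X_test = np.array([[2.5]])                 # toy test data

    scaler = StandardScaler()
    X_train_s = scaler.fit_transform(X_train)  # learn mean/std, then scale
    X_test_s = scaler.transform(X_test)        # reuse the training mean/std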

53. Scalers
Tools used to normalize features, which is essential for distance-based algorithms.

54. StandardScaler
Standardizes features to have a mean of 0 and a standard deviation of 1.

55. MinMaxScaler
Scales data to a specific range, usually [0, 1].

56. RobustScaler
Scales data using the Interquartile Range (IQR), making it resistant to outliers.

57. MaxAbsScaler
Scales data to the range [-1, 1], often used for sparse data.

58. QuantileTransformer
Transforms data to follow a uniform or Gaussian distribution.

59. Imputers
Tools for handling missing values.

60. SimpleImputer
Fills missing values with basic statistics like mean, median, or mode.

61. KNNImputer
Fills values based on the similarity of nearest neighbors.

62. IterativeImputer
A regression-based method that models each feature with missing values as a function of the other features, refining the imputations iteratively.

63. MissingIndicator
Adds a binary indicator to denote where values were missing.
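
A sketch of two imputers filling the same gap (IterativeImputer also exists but needs an explicit experimental import in scikit-learn):

    import numpy as np
    from sklearn.impute import KNNImputer, SimpleImputer

    X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, 6.0]])  # toy data with a gap

    print(SimpleImputer(strategy="mean").fit_transform(X))  # column mean
    print(KNNImputer(n_neighbors=2).fit_transform(X))       # nearest neighbors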

64. Encoders
Techniques to convert categorical data into numbers.

65. One-Hot Encoding
Converts categories into binary columns.

66. Label Encoding
Assigns each category a unique integer label.

67. Ordinal Encoding
Converts categories into integers based on a specific order.

68. Binary Encoding
Converts categories into binary digits.

69. Target Encoding
Replaces categories with the mean of the target variable.

70. Hashing Encoding
Maps categories to fixed-length hash values.
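
A sketch of one-hot versus ordinal encoding on a toy category column; sparse_output=False assumes a recent scikit-learn (older versions use sparse=False):

    from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

    X = [["red"], ["green"], ["blue"], ["green"]]  # toy categorical column

    print(OneHotEncoder(sparse_output=False).fit_transform(X))  # binary columns
    print(OrdinalEncoder().fit_transform(X))                    # integer codes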

71. Text Vectorizers
Tools that convert text into numeric vectors, such as CountVectorizer and TfidfVectorizer.
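
A sketch of both vectorizers on two toy documents:

    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

    docs = ["the cat sat", "the dog sat"]  # toy documents

    print(CountVectorizer().fit_transform(docs).toarray())  # raw word counts
    print(TfidfVectorizer().fit_transform(docs).toarray())  # tf-idf weights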

72. Regularization
Techniques to prevent overfitting in linear models by penalizing complexity.

73. Ridge
Shrinks coefficients but keeps all features.

74. Lasso
Shrinks coefficients and sets some to zero, performing feature selection.

75. Elastic Net
Combines Ridge and Lasso penalties.
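
A sketch comparing the three penalties on the same toy data, where only the first feature actually matters; Lasso typically zeroes the irrelevant coefficients:

    import numpy as np
    from sklearn.linear_model import ElasticNet, Lasso, Ridge

    rng = np.random.RandomState(0)
    X = rng.rand(50, 5)                   # toy features
    y = 3 * X[:, 0] + 0.1 * rng.rand(50)  # only feature 0 matters

    print(Ridge(alpha=1.0).fit(X, y).coef_)       # shrunk, all nonzero
    print(Lasso(alpha=0.1).fit(X, y).coef_)       # some exactly zero
    print(ElasticNet(alpha=0.1).fit(X, y).coef_)  # mix of both penalties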

76. Gini Impurity
Measures the impurity of a class distribution (used to select splits).

77. Entropy
A measure of uncertainty or disorder used in splitting.

78. Gaussian NB
Assumes continuous features are normally distributed.

79. Multinomial NB
Used for discrete counts (e.g., text classification).

80. Bernoulli NB
Used for binary features.

81. Kernel Trick
A method to transform data into higher dimensions without calculating the transformation explicitly.

82. Kernel Types
Linear, Polynomial, RBF (Radial Basis Function), and Sigmoid.
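
A sketch trying each kernel on toy non-linear data (training accuracy only, purely for illustration):

    from sklearn.datasets import make_moons
    from sklearn.svm import SVC

    X, y = make_moons(noise=0.1, random_state=0)  # toy non-linear data

    for kernel in ["linear", "poly", "rbf", "sigmoid"]:
        clf = SVC(kernel=kernel).fit(X, y)
        print(kernel, clf.score(X, y))  # RBF usually fits the moons best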

83. Activation Functions
Functions that determine a neuron's output from its weighted input, introducing non-linearity into the network.

84. ReLU (Rectified Linear Unit)
Common for hidden layers; outputs the input if it is positive, otherwise 0.

85. Softmax
Converts a vector of scores into a probability distribution; used in the output layer for multi-class classification.

86. Optimizers
Algorithms that adjust a model's weights to minimize the loss function (e.g., gradient descent, Adam).

87. MLP Regressor
A Multi-Layer Perceptron specifically for predicting continuous values.

88. Manhattan Distance
The distance measured along grid-like paths: the sum of absolute differences between coordinates.

89. Minkowski Distance
A generalization of the Euclidean and Manhattan distances, parameterized by an order p (p = 1 gives Manhattan, p = 2 gives Euclidean).

90. Hamming Distance
The number of positions at which two binary vectors or strings of equal length differ.

91. Cosine Similarity
Measures the cosine of the angle between two vectors; commonly used for text similarity.
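
A sketch computing the four measures for two toy vectors with scipy.spatial.distance (cosine similarity is 1 minus the cosine distance):

    from scipy.spatial.distance import cityblock, cosine, hamming, minkowski

    u, v = [1, 0, 1, 1], [1, 1, 0, 1]  # toy vectors

    print(cityblock(u, v))       # Manhattan distance
    print(minkowski(u, v, p=3))  # Minkowski with order p = 3
    print(hamming(u, v))         # fraction of differing positions
    print(1 - cosine(u, v))      # cosine similarity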

92. ARIMA
AutoRegressive Integrated Moving Average, a model for time-series forecasting.

93. SARIMA
Seasonal ARIMA, which includes seasonal components for data with cyclic patterns.
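
A sketch with statsmodels, assuming it is installed and using made-up data; the same ARIMA class takes a seasonal_order term for SARIMA:

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    y = np.sin(np.arange(60) / 5) + 0.1 * np.random.RandomState(0).rand(60)

    res = ARIMA(y, order=(1, 1, 1)).fit()  # (p, d, q)
    print(res.forecast(steps=5))           # next 5 values

    # SARIMA: add a seasonal (P, D, Q, s) component
    sar = ARIMA(y, order=(1, 1, 1), seasonal_order=(1, 0, 1, 12)).fit()
    print(sar.forecast(steps=5))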