Subquery
A SELECT query that is enclosed inside another query, used to filter or manipulate data dynamically.
Execution Flow
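The order in which a database evaluates nested queries within SQL statements such as SELECT, INSERT, UPDATE, and DELETE; a non-correlated subquery is typically evaluated before the outer query that uses its result.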
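As a minimal sketch of a subquery, the snippet below uses Python's built-in sqlite3 module. The `employees` table, its columns, and its rows are hypothetical and invented for illustration. The inner SELECT computes the average salary, and the outer query filters on that result.

```python
import sqlite3

# In-memory database with a hypothetical employees table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", 50000), ("Ben", 70000), ("Cy", 90000)],
)

# The subquery in parentheses yields one value (the average salary);
# the outer query keeps only rows whose salary exceeds it.
rows = conn.execute(
    """
    SELECT name FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name
    """
).fetchall()
above_average = [name for (name,) in rows]
print(above_average)
conn.close()
```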
Supervised Learning
A type of machine learning that uses labeled data to predict outcomes, primarily through classification and regression.
Unsupervised Learning
A type of machine learning that analyzes unlabeled data to identify patterns, such as clustering or dimensionality reduction.
Reinforcement Learning
A learning process where an agent learns to behave in an environment by receiving rewards or penalties.
Ranking
The task of ordering items by predicted relevance so that the most relevant appear first, commonly used in recommendation systems.
Recommendation Systems
Systems designed to suggest products, music, or content to users based on preferences.
Features
Properties or characteristics of data, which can be quantitative or categorical.
Labels
Target outcomes in supervised learning, indicating what the model is attempting to predict.
Feature Vector
A single sample's data represented as a row of features.
Feature Matrix
A complete set of feature vectors from all samples, typically organized in a table.
Target Vector
A column of labels or target outcomes in supervised learning.
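To make the relationship between a feature vector, a feature matrix, and a target vector concrete, here is a small sketch using plain Python lists. The housing data, the feature names, and the variable names `X` and `y` are assumptions chosen for illustration only.

```python
# Hypothetical data: two features per sample.
feature_names = ["area_m2", "bedrooms"]

# Feature matrix: one row (feature vector) per sample.
X = [
    [50.0, 1],
    [80.0, 2],
    [120.0, 3],
]

# Target vector: one label per sample, in the same order as the rows of X.
y = [150_000, 230_000, 340_000]

n_samples = len(X)
n_features = len(X[0])
first_feature_vector = X[0]
print(n_samples, n_features, first_feature_vector)
```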
Derivative
A mathematical measure of how a function changes as its input changes, crucial for optimization.
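One way to see what a derivative measures is to approximate it numerically. The sketch below uses a central difference; the helper name `numerical_derivative` and the step size `h` are assumptions for illustration, not part of any library.

```python
def numerical_derivative(f, x, h=1e-6):
    """Approximate f'(x) with a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def square(x):
    return x * x


# The exact derivative of x^2 is 2x, so the slope at x = 3 should be close to 6.
slope_at_3 = numerical_derivative(square, 3.0)
print(slope_at_3)
```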
Probability
A measure of the likelihood that a certain outcome will occur.
Probability Distribution
A function that describes how probabilities are distributed over a range of outcomes.
Gaussian Distribution
A bell-shaped continuous probability distribution, also known as the normal distribution, defined by its mean and standard deviation.
Uniform Distribution
A type of probability distribution where all outcomes are equally likely.
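The two distributions above can be compared by drawing samples with Python's standard library. This is only a sketch: the seed, the sample size, and the chosen parameters (mean 0 and standard deviation 1 for the Gaussian, the interval 0 to 1 for the uniform) are arbitrary assumptions.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so repeated runs draw the same samples

# Gaussian samples cluster around the mean; uniform samples spread evenly.
gaussian_samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
uniform_samples = [rng.uniform(0.0, 1.0) for _ in range(10_000)]

gaussian_mean = statistics.fmean(gaussian_samples)
uniform_mean = statistics.fmean(uniform_samples)
print(gaussian_mean, uniform_mean)
```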
train_test_split()
A function in scikit-learn used to split datasets into training and testing sets.
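A short usage sketch of the scikit-learn function, assuming scikit-learn is installed. The toy data, the 80/20 split, and the `random_state` value are arbitrary choices for illustration.

```python
from sklearn.model_selection import train_test_split

# Toy data: 10 samples with one feature each, and alternating labels.
X = [[i] for i in range(10)]
y = [i % 2 for i in range(10)]

# Hold out 20% of the samples for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), len(X_test))
```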
NaNs
Short for 'Not a Number'; a special floating-point value used to represent missing or undefined data in datasets.
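NaN values behave unusually in comparisons, which is why they are normally detected with a dedicated check rather than `==`. The sketch below uses only the standard library; the list `values` is made-up data.

```python
import math

values = [1.0, math.nan, 3.0]

# NaN is not equal to anything, including itself.
nan_equals_itself = math.nan == math.nan

# Use math.isnan to detect and filter out missing values.
cleaned = [v for v in values if not math.isnan(v)]
print(nan_equals_itself, cleaned)
```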
Abstract Class
A class that cannot be instantiated on its own and must be subclassed; it typically declares at least one abstract method.
Abstract Method
A method defined in an abstract class that has no implementation and must be implemented by subclasses.
Static Method
A method that does not operate on an instance of the class or require class context.
Instance Method
A method that works on the instance of a class (typically uses 'self').
Concrete Method
A method in a class that has a complete implementation.
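The five class-related terms above can be seen together in one small Python sketch built on the standard `abc` module. The `Shape` and `Square` classes and their method names are hypothetical examples.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract class: cannot be instantiated directly."""

    @abstractmethod
    def area(self):
        """Abstract method: subclasses must provide the implementation."""

    def describe(self):
        # Concrete instance method: fully implemented and uses self.
        return f"{type(self).__name__} with area {self.area():.2f}"

    @staticmethod
    def unit():
        # Static method: needs neither an instance nor the class.
        return "m^2"


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


square = Square(2.0)
description = square.describe()

# Instantiating the abstract class itself raises TypeError.
try:
    Shape()
    abstract_instantiated = True
except TypeError:
    abstract_instantiated = False

print(description, Shape.unit(), abstract_instantiated)
```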
.set_index()
A Pandas method that sets a specified column as the index for a DataFrame.
.reset_index()
A Pandas method that resets a DataFrame's index to the default integer index; by default the old index is kept as a column, and drop=True discards it instead.
.loc[]
A Pandas accessor for label-based subsetting, allowing selection of rows by index labels and of columns by column names.
df.groupby()
A method in Pandas that groups data by a column's values and is often used with aggregation functions.
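The four Pandas operations above can be tried on one small DataFrame. This sketch assumes pandas is installed; the `city`, `year`, and `sales` columns and their values are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Oslo", "Rome", "Oslo", "Rome"],
    "year": [2020, 2020, 2021, 2021],
    "sales": [10, 20, 30, 40],
})

# set_index: use the city column as the row labels.
indexed = df.set_index("city")

# .loc: label-based selection of rows ("Oslo") and a column ("sales").
oslo_sales = indexed.loc["Oslo", "sales"].tolist()

# reset_index: move the index back into a column, or discard it with drop=True.
restored = indexed.reset_index()
dropped = indexed.reset_index(drop=True)

# groupby + aggregation: total sales per city.
totals = df.groupby("city")["sales"].sum()
print(oslo_sales, totals.to_dict())
```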
Subset
A filtered portion of a dataset, containing specific rows and/or columns.
Pivot Table
A data processing tool that summarizes and reshapes data in a way similar to Excel pivot tables.
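In Pandas, this kind of reshaping is available through `DataFrame.pivot_table`. The sketch below assumes pandas is installed and uses a made-up sales table; it puts cities on the rows, years on the columns, and summed sales in the cells.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Oslo", "Rome", "Oslo", "Rome"],
    "year": [2020, 2020, 2021, 2021],
    "sales": [10, 20, 30, 40],
})

# One row per city, one column per year, cells hold the summed sales.
pivot = df.pivot_table(index="city", columns="year", values="sales", aggfunc="sum")
print(pivot)
```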
One-Hot Encoding
A method of converting a categorical variable into a set of binary (0/1) columns, one per category, so that it can be used by machine learning algorithms.
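A minimal pure-Python sketch of one-hot encoding follows. The helper `one_hot_encode` is a hypothetical name written for this example (libraries such as pandas and scikit-learn provide their own encoders), and the colour values are made up.

```python
def one_hot_encode(values):
    """Return the sorted categories and one 0/1 row per input value."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


categories, encoded = one_hot_encode(["red", "green", "red", "blue"])
print(categories)
print(encoded)
```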
Training Data
Data used to train a machine learning model, containing examples and known outcomes.
Testing Data
Data used to evaluate the performance of a trained machine learning model.
Model Fitting
The process of training a machine learning model on a dataset to learn patterns.
Feature Engineering
The process of selecting, modifying, or creating features from raw data to improve model performance.
Overfitting
A modeling error that occurs when a model learns noise and details in the training data to an extent that negatively impacts its performance on new data.
Underfitting
A situation where a model is too simple to capture the underlying trend of the data.
Cross-Validation
A method to evaluate model performance by splitting the data into subsets (folds), repeatedly training on some folds and validating on the rest, to estimate how well the model generalizes.
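The splitting step of k-fold cross-validation can be sketched in plain Python. The helper `k_fold_indices` is a hypothetical name for this example; it does no shuffling and only produces index lists, leaving model training out.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, validation_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds


folds = k_fold_indices(6, 3)
for train, validation in folds:
    print(train, validation)
```

Each sample lands in exactly one validation fold, so every data point is used for evaluation once and for training in the remaining rounds.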
Hyperparameter Tuning
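The process of searching for the best values of hyperparameters, the settings chosen before training (such as learning rate or tree depth) that govern how a model is trained rather than being learned from the data.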
Bagging
A machine learning ensemble method (bootstrap aggregating) that helps reduce variance by training multiple models on random bootstrap samples of the data and averaging or voting on their predictions.
Boosting
An ensemble technique that trains weak learners sequentially, with each new learner concentrating on the errors of the previous ones, and combines them into a stronger model.
Confusion Matrix
A table used to describe the performance of a classification model, showing true positive, false positive, true negative, and false negative predictions.
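The four cells of a binary confusion matrix can be counted directly. The helper `confusion_counts` and the label lists below are hypothetical and written for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)
```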
AUC-ROC Curve
A performance measurement for classification problems across threshold settings; the ROC curve plots the true positive rate against the false positive rate, and AUC is the area under that curve.
Gradient Descent
An optimization algorithm that minimizes the cost function in machine learning by repeatedly updating parameters in the direction of the negative gradient.
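A minimal one-dimensional sketch of gradient descent follows. The cost function (x - 3)^2, the learning rate, and the number of steps are assumptions picked so the example converges quickly; real models update many parameters at once.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


def cost_gradient(x):
    # Derivative of the hypothetical cost (x - 3)^2, whose minimum is at x = 3.
    return 2 * (x - 3)


minimum = gradient_descent(cost_gradient, start=0.0)
print(minimum)
```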
Regularization
A technique used to prevent overfitting by adding a penalty to the loss function.
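As one illustration of adding a penalty to the loss, the sketch below adds an L2 (ridge-style) term to a mean squared error. The function name `l2_regularized_loss`, the strength `lam`, and the sample numbers are assumptions; other penalties such as L1 follow the same pattern.

```python
def l2_regularized_loss(weights, errors, lam):
    """Mean squared error plus an L2 penalty on the weights."""
    mse = sum(e * e for e in errors) / len(errors)
    penalty = lam * sum(w * w for w in weights)
    return mse + penalty


# Larger weights raise the loss even when the prediction errors are unchanged.
small_weights_loss = l2_regularized_loss([1.0, -2.0], [0.5, -0.5], lam=0.1)
large_weights_loss = l2_regularized_loss([10.0, -20.0], [0.5, -0.5], lam=0.1)
print(small_weights_loss, large_weights_loss)
```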
Decision Tree
A machine learning model that splits data into branches to make predictions based on feature values.
Random Forest
An ensemble machine learning model that constructs multiple decision trees at training time and outputs the mode of their predictions for classification or the mean for regression.
Support Vector Machine (SVM)
A supervised machine learning model that classifies data points by finding the hyperplane that separates the classes with the largest margin.
Neural Network
A model inspired by the human brain, consisting of interconnected 'neurons' for processing information.
Deep Learning
A subfield of machine learning that uses neural networks with many layers to learn complex patterns.
Transfer Learning
A technique where a pre-trained model is reused on a new problem to improve efficiency or performance.
Feature Importance
A technique to determine which features in a dataset are most influential for making predictions.