ORDER BY
A SQL clause that allows you to sort query results, defaulting to ascending order (ASC).
ASC
Ascending order for sorting results in SQL.
DESC
Descending order for sorting results in SQL.
WHERE
SQL clause that filters individual rows before grouping.
HAVING
SQL clause that filters results after grouping, often used with aggregate functions.
GROUP BY
SQL clause used to group rows by unique values in one or more columns.
Logical execution order of SQL
The order in which SQL processes the clauses: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT.
Aggregate functions
Functions like AVG() or COUNT() used to summarize data.
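The SQL cards above can be tied together in one query. The following is a minimal sketch using Python's built-in `sqlite3` module; the `orders` table, its columns, and its values are made up for illustration. The comments mark which clause does what, and the clauses are evaluated in the logical order FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT even though SELECT is written first.

```python
import sqlite3

# Hypothetical in-memory table of orders; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("ana", 50.0), ("ana", 70.0),
        ("ben", 20.0), ("ben", 5.0),
        ("cy", 200.0), ("cy", 2.0),
    ],
)

query = """
SELECT customer, AVG(amount) AS avg_amount, COUNT(*) AS n_orders
FROM orders
WHERE amount >= 10          -- filters individual rows before grouping
GROUP BY customer           -- one group per unique customer
HAVING AVG(amount) > 30     -- filters groups after aggregation
ORDER BY avg_amount DESC    -- sort descending; ASC is the default
LIMIT 2
"""
rows = conn.execute(query).fetchall()
conn.close()
print(rows)  # [('cy', 200.0, 1), ('ana', 60.0, 2)]
```

Note that the condition on `AVG(amount)` has to live in HAVING: WHERE runs before the groups exist, so it cannot see aggregate values.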
Machine Learning (ML)
A field of AI focused on teaching computers to recognize patterns and make predictions using data.
Artificial Intelligence (AI)
The broader field of building systems that perform tasks normally associated with human intelligence; Machine Learning is one of its subfields.
Data Science
The practice of analyzing data to extract insights, utilizing tools such as Machine Learning.
Supervised Learning
A type of Machine Learning that uses labeled data to train models.
Feature Matrix (X)
Input data consisting of all columns of features for every row.
Feature Vector
A single row from the feature matrix representing an individual data point.
Target Vector (y)
The known outputs (labels), one per row of the feature matrix, that a supervised model is trained to predict.
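To make the X / y vocabulary concrete, here is a small sketch with plain Python lists. The dataset (house size and room count, with prices as targets) is invented for illustration.

```python
# Hypothetical toy dataset: each row is one house (size in m^2, number of rooms).
X = [
    [50.0, 2],
    [80.0, 3],
    [120.0, 4],
]
# Target vector: one known price per row of X (made-up values).
y = [150_000, 240_000, 360_000]

feature_vector = X[1]      # a single row of X: one data point
n_samples = len(X)         # number of rows
n_features = len(X[0])     # number of feature columns

# X and y must line up row by row.
assert len(y) == n_samples
```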
Classification
A supervised learning task that involves predicting discrete labels.
Regression
A supervised learning task that involves predicting continuous numeric values.
Quantitative data
Numeric data; it can be discrete or continuous.
Qualitative data
Categorical data that can be ordinal or nominal.
Discrete data
Countable quantitative data, e.g., number of pets.
Continuous data
Measurable quantitative data that can take any value in a range.
Ordinal data
Qualitative data with a natural order, e.g., small/medium/large.
Nominal data
Qualitative data without a natural order, e.g., eye color.
One-Hot Encoding
A technique that converts each category into a binary vector with a single 1 in the position for that category and 0s elsewhere.
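A minimal hand-rolled sketch of one-hot encoding, assuming the categories are simply sorted alphabetically to fix the column order. The helper name `one_hot` is hypothetical; libraries such as pandas and scikit-learn provide their own encoders.

```python
def one_hot(values):
    """Map each category to a binary vector containing a single 1."""
    categories = sorted(set(values))                  # fixed column order
    index = {c: i for i, c in enumerate(categories)}  # category -> column
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        encoded.append(row)
    return categories, encoded


categories, encoded = one_hot(["blue", "brown", "green", "brown"])
print(categories)  # ['blue', 'brown', 'green']
print(encoded)     # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
```

This suits nominal data such as eye color, because it does not impose an order on the categories.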
Training set
The dataset used for training a model.
Validation set
A subset of data held out from training and used to tune hyperparameters and compare candidate models.
Test set
The dataset used to evaluate the final performance of the model.
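One way the three sets can be carved out of a single dataset is sketched below using only the standard library. The function name `three_way_split` and the 60/20/20 proportions are assumptions chosen for illustration, not a prescribed recipe.

```python
import random


def three_way_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the rows, then slice them into train / validation / test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)   # seeded so the split is reproducible
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


train, val, test = three_way_split(range(10))
print(len(train), len(val), len(test))  # 6 2 2
```

The model is fit on `train`, tuned against `val`, and scored once on `test` at the end.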
L1 Loss (MAE)
A loss function measuring error as the mean of the absolute errors (Mean Absolute Error).
L2 Loss (MSE)
A loss function measuring error as the mean of the squared errors (Mean Squared Error); it penalizes large errors more heavily than MAE.
Binary Cross-Entropy
A loss function used for binary classification tasks.
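The three losses can be written in a few lines each. This is a sketch in plain Python, assuming MAE and MSE are averaged over the samples and binary cross-entropy uses the natural logarithm; the `eps` clipping is an added safeguard against `log(0)`.

```python
import math


def mae(y_true, y_pred):
    """L1 loss: mean of absolute errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """L2 loss: mean of squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] for labels in {0, 1}."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)   # keep p away from exactly 0 or 1
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


y_true = [3.0, 5.0, 2.0]
y_pred = [2.0, 5.0, 4.0]
print(mae(y_true, y_pred))  # 1.0  (errors 1, 0, 2)
print(mse(y_true, y_pred))  # about 1.667  (squared errors 1, 0, 4)
print(binary_cross_entropy([1, 0], [0.5, 0.5]))  # about 0.693, i.e. ln 2
```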
Accuracy
The percentage of correct predictions made by a model.
Imbalanced datasets
Datasets where one class is significantly more frequent than others, complicating classification.
Unsupervised Learning
A type of Machine Learning that finds patterns without labeled data.
Reinforcement Learning
A type of Machine Learning where an agent learns by interacting with an environment.
train_test_split
A function in Scikit-learn used to divide data into training and test sets.
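A short usage sketch of scikit-learn's `train_test_split`, assuming scikit-learn is installed. The toy `X` and `y` are invented; `test_size`, `random_state`, and `stratify` are real parameters of the function, and `stratify=y` is used here so both classes appear in the test set.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 8 samples, 2 features, balanced binary labels.
X = [[i, i * 2] for i in range(8)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.25,    # 25% of the 8 rows go to the test set
    random_state=0,    # fixed seed for a reproducible shuffle
    stratify=y,        # keep class proportions similar in both parts
)
print(len(X_train), len(X_test))  # 6 2
```

The function returns the pieces in the order train features, test features, train targets, test targets, which is easy to mix up when unpacking.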
NaNs
Missing values in data that need to be cleaned before model training.
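One simple cleaning strategy is to drop every row that contains a NaN. The sketch below uses only the standard library on made-up rows; filling in missing values (imputation) is another option and is not shown.

```python
import math

# Hypothetical rows; the second one has a missing value.
rows = [
    [1.0, 2.0],
    [float("nan"), 3.0],
    [4.0, 5.0],
]

# NaN is not equal to itself, so test with math.isnan rather than ==.
clean_rows = [r for r in rows if not any(math.isnan(v) for v in r)]
print(clean_rows)  # [[1.0, 2.0], [4.0, 5.0]]
```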
Overfitting
A modeling error that occurs when a model fits the training data too closely, including its noise, and therefore performs poorly on new data.
Critical Edge Cases
Special scenarios in ML that can lead to errors or suboptimal performance.
Common Mistakes
Errors such as misspelling `train_test_split` (e.g., as `tran_test_split`) in scikit-learn, or using WHERE instead of HAVING to filter on aggregates in SQL.
Feature Vector real-world example
One patient's measurements (a single row of features) used to predict whether that patient has a disease.
Regression real-world example
Predicting house prices based on features such as size, location, etc.
Metrics to use instead of accuracy
Precision, recall, or F1 score for evaluating model performance in imbalanced datasets.
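The sketch below shows why accuracy can mislead on an imbalanced dataset. The labels are synthetic (5 positives out of 100), and the helper `evaluate` is hypothetical; when a ratio would divide by zero it returns 0.0, which is one common convention.

```python
def evaluate(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Synthetic imbalanced labels: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95

always_negative = [0] * 100                   # never predicts the rare class
some_hits = [1, 1, 1, 1, 0] + [1] * 4 + [0] * 91  # finds 4 of 5 positives, 4 false alarms

a = evaluate(y_true, always_negative)
b = evaluate(y_true, some_hits)
print(a["accuracy"], a["recall"], a["f1"])  # 0.95 0.0 0.0
print(b["accuracy"], b["precision"], b["recall"])  # 0.95 0.5 0.8
```

Both predictors score 95% accuracy, but only the second one ever detects the rare class, which precision, recall, and F1 make visible.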