SQL, ML, OOP, Pandas

52 Terms

1

Subquery

A SELECT query that is enclosed inside another query, used to filter or manipulate data dynamically.
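
A minimal sketch of a subquery, using Python's built-in sqlite3 module. The employees table and its rows are made up for illustration; the inner SELECT computes a value that the outer query then filters on.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", 50000), ("Ben", 70000), ("Cho", 90000)],
)

# The inner SELECT computes the average salary; the outer query
# keeps only the rows whose salary is above that value.
rows = conn.execute(
    """
    SELECT name
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name
    """
).fetchall()

above_average = [name for (name,) in rows]
conn.close()
```

Here the average is 70000, so only "Cho" ends up in above_average.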

2

Execution Flow

The sequence in which subqueries are executed within SQL commands like SELECT, INSERT, UPDATE, and DELETE.
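
As a rough illustration of a subquery inside a command other than SELECT, the sketch below uses one in a DELETE. The customers and orders tables are hypothetical. Because the inner query does not depend on the outer rows, it can be thought of as running first, with its result feeding the outer statement.

```python
import sqlite3

# Hypothetical tables, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, active INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, 1), (2, 0)])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 2), (12, 2)])

# Inner query: ids of inactive customers.
# Outer command: delete every order that belongs to one of those ids.
conn.execute(
    "DELETE FROM orders "
    "WHERE customer_id IN (SELECT id FROM customers WHERE active = 0)"
)

remaining_order_ids = [
    row[0] for row in conn.execute("SELECT id FROM orders ORDER BY id")
]
conn.close()
```

Only order 10 remains, since customer 2 is the only inactive customer.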

3

Supervised Learning

A type of machine learning that uses labeled data to predict outcomes, primarily through classification and regression.

4

Unsupervised Learning

A type of machine learning that analyzes unlabeled data to identify patterns, such as clustering or dimensionality reduction.

5

Reinforcement Learning

A learning process where an agent learns to behave in an environment by receiving rewards or penalties.

6

Ranking

The task of ordering items by predicted relevance so that the most relevant appear first, commonly used in recommendation systems.

7

Recommendation Systems

Systems designed to suggest products, music, or content to users based on preferences.

8

Features

Properties or characteristics of data, which can be quantitative or categorical.

9

Labels

Target outcomes in supervised learning, indicating what the model is attempting to predict.

10

Feature Vector

A single sample's data represented as a row of features.

11

Feature Matrix

A complete set of feature vectors from all samples, typically organized in a table.

12

Target Vector

A column of labels or target outcomes in supervised learning.
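
To make the shapes concrete, here is a small sketch with plain Python lists. The dataset (height in cm, weight in kg, and a size label) is invented for illustration.

```python
# Hypothetical toy dataset: each row is one sample (height_cm, weight_kg).
feature_matrix = [
    [170.0, 65.0],
    [160.0, 55.0],
    [180.0, 80.0],
]

# One label per sample, in the same order as the rows above.
target_vector = ["medium", "small", "large"]

# A feature vector is a single row of the feature matrix.
feature_vector = feature_matrix[0]

n_samples = len(feature_matrix)
n_features = len(feature_vector)
```

The feature matrix has one row per sample and one column per feature, and the target vector has exactly one entry per row.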

13

Derivative

A mathematical measure of how a function changes as its input changes, crucial for optimization.
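
One way to build intuition is to approximate a derivative numerically. The sketch below uses a central difference; the helper name numerical_derivative and the step size h are illustrative choices, not part of any library.

```python
def numerical_derivative(f, x, h=1e-5):
    """Approximate f'(x) with a central difference (illustrative helper)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def square(x):
    return x * x


# The exact derivative of x**2 is 2*x, so the slope at x = 3 should be close to 6.
slope_at_3 = numerical_derivative(square, 3.0)
```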

14

Probability

A measure of the likelihood that a certain outcome will occur.

15

Probability Distribution

A function that describes how probabilities are distributed over a range of outcomes.

16

Gaussian Distribution

A bell-shaped probability distribution, also known as the normal distribution, defined by its mean and standard deviation.

17

Uniform Distribution

A type of probability distribution where all outcomes are equally likely.
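
A quick sketch that draws samples from a uniform and a Gaussian distribution with Python's standard random module. The seed, sample size, and parameters are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Uniform on [0, 1]: every value in the range is equally likely.
uniform_samples = [rng.uniform(0.0, 1.0) for _ in range(10_000)]

# Gaussian (normal) with mean 0 and standard deviation 1.
gaussian_samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

uniform_mean = statistics.mean(uniform_samples)
gaussian_mean = statistics.mean(gaussian_samples)
gaussian_stdev = statistics.stdev(gaussian_samples)
```

With this many samples the uniform mean should land near 0.5, and the Gaussian mean and standard deviation near 0 and 1.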

18

train_test_split()

A function in scikit-learn used to split datasets into training and testing sets.
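
A minimal usage sketch, assuming scikit-learn is installed. The toy X and y are invented; test_size=0.2 and random_state=42 are arbitrary choices.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with 2 features each, plus binary labels.
X = [[i, i * 2] for i in range(10)]
y = [i % 2 for i in range(10)]

# Hold out 20% of the samples for testing; random_state makes the
# shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```

With 10 samples and test_size=0.2, 8 samples go to the training set and 2 to the testing set.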

19

NaNs

Stands for 'Not a Number', representing missing or undefined data in datasets.

20

Abstract Class

A class that cannot be instantiated on its own and must be subclassed; contains at least one abstract method.

21

Abstract Method

A method defined in an abstract class that has no implementation and must be implemented by subclasses.

22

Static Method

A method that belongs to a class but receives neither the instance ('self') nor the class ('cls'), so it behaves like a plain function kept in the class's namespace.

23

Instance Method

A method that works on the instance of a class (typically uses 'self').

24

Concrete Method

A method in a class that has a complete implementation.
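
The method kinds above can be sketched together in one small Python example using the standard abc module. The Shape and Rectangle classes are hypothetical, chosen only for illustration.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract class: cannot be instantiated while area() is abstract."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def area(self):
        """Abstract method: subclasses must provide the implementation."""

    def describe(self):
        # Concrete instance method: fully implemented and uses self.
        return f"{self.name} with area {self.area():.2f}"

    @staticmethod
    def square_units(value):
        # Static method: needs neither self nor cls.
        return f"{value} sq. units"


class Rectangle(Shape):
    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self):
        # Implements the abstract method, which makes Rectangle instantiable.
        return self.width * self.height


rect = Rectangle(2, 3)
description = rect.describe()

# Instantiating the abstract class directly raises TypeError.
try:
    Shape("generic")
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False
```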

25

.set_index()

A Pandas method that sets a specified column as the index for a DataFrame.

26

.reset_index()

A Pandas method that replaces a DataFrame's index with the default integer index; the old index is kept as a column unless drop=True discards it.
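
A short sketch of .set_index() and .reset_index() side by side, assuming pandas is installed. The city data is made up for illustration.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({"city": ["Oslo", "Lima"], "population_m": [0.7, 10.0]})

indexed = df.set_index("city")            # "city" becomes the index
restored = indexed.reset_index()          # "city" comes back as a column
dropped = indexed.reset_index(drop=True)  # old index is discarded entirely
```

Both methods return a new DataFrame by default, so df itself is left unchanged.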

27

.loc[]

A Pandas accessor for label-based subsetting, allowing selection by index or column names.
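
A brief sketch of label-based selection with .loc[], assuming pandas is installed. The DataFrame contents are hypothetical.

```python
import pandas as pd

# Hypothetical data with city names as the index labels.
df = pd.DataFrame(
    {"population_m": [0.7, 10.0, 2.1], "country": ["Norway", "Peru", "France"]},
    index=["Oslo", "Lima", "Paris"],
)

# Single value: row label first, then column label.
lima_population = df.loc["Lima", "population_m"]

# Boolean mask for the rows, list of labels for the columns.
large_cities = df.loc[df["population_m"] > 1.0, ["country"]]
```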

28

df.groupby()

A Pandas method that groups rows by the values of one or more columns, usually followed by an aggregation function such as sum or mean.
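
A minimal sketch of grouping followed by aggregation, assuming pandas is installed. The sales data is invented for illustration.

```python
import pandas as pd

# Hypothetical sales records.
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south"],
        "revenue": [100, 200, 150, 75],
    }
)

# Group rows by region, then sum the revenue within each group.
revenue_by_region = sales.groupby("region")["revenue"].sum()
```

The result is a Series indexed by region, here 250 for "north" and 275 for "south".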

29

Subset

A filtered portion of a dataset, containing specific rows and/or columns.

30

Pivot Table

A data processing tool that summarizes and reshapes data in a way similar to Excel pivot tables.
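
In Pandas this is typically done with DataFrame.pivot_table. The sketch below assumes pandas is installed and uses invented sales data.

```python
import pandas as pd

# Hypothetical long-format sales records.
sales = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south"],
        "quarter": ["Q1", "Q2", "Q1", "Q2"],
        "revenue": [100, 150, 200, 75],
    }
)

# Rows become regions, columns become quarters, cells hold summed revenue.
pivot = sales.pivot_table(
    index="region", columns="quarter", values="revenue", aggfunc="sum"
)
```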

31

One-Hot Encoding

A method of converting a categorical variable into binary indicator columns, one per category, so that machine learning algorithms can use it as numeric input.
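
A hand-rolled sketch of the idea in plain Python, where each category gets its own 0/1 column. The color values are hypothetical; in practice a library helper would normally be used instead.

```python
# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]

# Fix a column order: one column per distinct category.
categories = sorted(set(colors))  # ['blue', 'green', 'red']

# Each row has a 1 in the column of its category and 0 elsewhere.
one_hot = [
    [1 if value == category else 0 for category in categories]
    for value in colors
]
```

For example, "red" becomes [0, 0, 1] under the column order blue, green, red.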

32

Training Data

Data used to train a machine learning model, containing examples and known outcomes.

33

Testing Data

Data used to evaluate the performance of a trained machine learning model.

34

Model Fitting

The process of training a machine learning model on a dataset to learn patterns.

35

Feature Engineering

The process of selecting, modifying, or creating features from raw data to improve model performance.

36

Overfitting

A modeling error that occurs when a model learns noise and details in the training data to an extent that negatively impacts its performance on new data.

37

Underfitting

A situation where a model is too simple to capture the underlying trend of the data.

38

Cross-Validation

A method of evaluating a model by repeatedly splitting the data into training and validation subsets (folds) and averaging the scores, to estimate how well the model generalizes.
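
One common variant is k-fold cross-validation. The sketch below only generates the index splits, without shuffling or model training; k_fold_indices is an illustrative helper, not a library function.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(n_samples=6, k=3))
```

Each sample lands in the validation set exactly once; a model would be trained and scored once per fold, and the scores averaged.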

39

Hyperparameter Tuning

The process of searching for the best hyperparameters, the settings chosen before training (such as learning rate or tree depth) rather than learned from the data.

40

Bagging

A machine learning ensemble method (bootstrap aggregating) that reduces variance by training multiple models on bootstrap samples of the data and averaging or voting on their predictions.

41

Boosting

An ensemble technique that trains weak learners sequentially, with each new learner focusing on the errors made by the previous ones.

42

Confusion Matrix

A table used to describe the performance of a classification model, showing true positive, false positive, true negative, and false negative predictions.
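
The four counts can be tallied by hand for a binary classifier, as in this sketch. The label lists are invented for illustration, with 1 as the positive class.

```python
# Hypothetical true labels and model predictions (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

# Rows are actual classes, columns are predicted classes.
confusion_matrix = [
    [tn, fp],
    [fn, tp],
]

accuracy = (tp + tn) / len(pairs)
```

Here the matrix is [[3, 1], [1, 3]], giving an accuracy of 0.75.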

43

AUC-ROC Curve

A performance measurement for classification problems across threshold settings; the ROC curve plots true positive rate against false positive rate, and AUC is the area under that curve.

44

Gradient Descent

An optimization algorithm that minimizes the cost function in machine learning by repeatedly stepping in the direction of the negative gradient.
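
A one-dimensional sketch of the update rule: move against the gradient, scaled by a learning rate. The cost function (x - 3)^2, the learning rate, and the step count are arbitrary choices for illustration.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient (illustrative helper)."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


def cost(x):
    # Toy cost function with its minimum at x = 3.
    return (x - 3.0) ** 2


def cost_gradient(x):
    # Derivative of the cost above.
    return 2.0 * (x - 3.0)


minimum = gradient_descent(cost_gradient, start=0.0)
```

With these settings each step shrinks the distance to the minimum by a factor of 0.8, so the result ends up very close to 3.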

45

Regularization

A technique used to prevent overfitting by adding a penalty to the loss function.
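
As a rough sketch, an L2 (ridge-style) penalty adds the sum of squared weights, scaled by a strength alpha, to the ordinary loss. The function names and the choice of mean squared error are assumptions made for illustration.

```python
def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def l2_regularized_loss(y_true, y_pred, weights, alpha):
    """Data loss plus an L2 penalty on the weights (illustrative helper)."""
    penalty = alpha * sum(w * w for w in weights)
    return mean_squared_error(y_true, y_pred) + penalty


# Perfect predictions, so the whole loss comes from the penalty:
# 0.1 * (3**2 + 4**2) = 2.5
loss = l2_regularized_loss([1.0, 2.0], [1.0, 2.0], weights=[3.0, 4.0], alpha=0.1)
```

Larger weights cost more under this penalty, which pushes the model toward simpler fits.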

46

Decision Tree

A machine learning model that splits data into branches to make predictions based on feature values.

47

Random Forest

An ensemble machine learning model that constructs multiple decision trees at training time and outputs the mode of their predictions for classification, or the mean for regression.

48

Support Vector Machine (SVM)

A supervised machine learning model that classifies data points by finding the hyperplane that separates the classes with the largest margin.

49

Neural Network

A model inspired by the human brain, consisting of interconnected 'neurons' for processing information.

50

Deep Learning

A subfield of machine learning that uses neural networks with many layers to learn complex patterns.

51

Transfer Learning

A technique where a pre-trained model is reused on a new problem to improve efficiency or performance.

52

Feature Importance

A technique to determine which features in a dataset are most influential for making predictions.