EVERYTHING TO REVIEW FOR SIR DENVER, WITHOUT THE TRUE OR FALSE

Last updated 12:42 AM on 4/14/26

43 Terms

Classification

is a supervised machine learning technique that sorts data into predefined categories or classes.

classification

The goal is to predict a categorical outcome (label/class) based on input data.

Training Phase

A classification model is built using a dataset where each data point is already labeled with its correct class.

Prediction Phase

Once the model is trained, it can be used to classify new, unlabeled data.
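
A minimal sketch of the two phases in plain Python. The nearest-centroid rule, the `train` and `predict` names, and the study-hours data are all assumptions made for illustration; they are not from the reviewer.

```python
from collections import defaultdict


def train(points, labels):
    """Training phase: compute one mean point (centroid) per labeled class."""
    grouped = defaultdict(list)
    for point, label in zip(points, labels):
        grouped[label].append(point)
    return {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in grouped.items()
    }


def predict(centroids, point):
    """Prediction phase: assign the class whose centroid is closest."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], point))


# Made-up labeled data: (hours studied, hours slept) -> outcome
X_train = [(1.0, 4.0), (2.0, 5.0), (8.0, 7.0), (9.0, 8.0)]
y_train = ["fail", "fail", "pass", "pass"]

model = train(X_train, y_train)          # uses labeled data
new_label = predict(model, (7.5, 7.0))   # classifies an unlabeled point
```

The model only sees labels during `train`; `predict` receives a point with no label attached.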

Binary Classification

This is the simplest type, where the model predicts between two possible classes.

Multiclass Classification

The model predicts among more than two classes.

Accuracy

(TP + TN) / (TP + TN + FP + FN)

Precision

TP / (TP + FP)

Recall

TP / (TP + FN)

F1-Score

2 x (Precision x Recall) / (Precision + Recall)

Specificity

TN / (TN + FP)
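
The five metric formulas (accuracy, precision, recall, F1-score, specificity) can be checked with a short Python sketch that groups every numerator and denominator explicitly. The function name and the confusion-matrix counts are made up for illustration, and the sketch assumes no denominator is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the five metrics from confusion-matrix counts.

    tp/tn = true positives/negatives, fp/fn = false positives/negatives.
    Assumes every denominator is non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * (precision * recall) / (precision + recall),
        "specificity": tn / (tn + fp),
    }


# Made-up counts: 100 predictions in total
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Without the parentheses, normal operator precedence would divide only the middle terms, which gives a different (wrong) number.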

Logistic Regression

is a statistical and machine learning method used for classification problems, especially when the outcome has two possible results.
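
A rough sketch of how a logistic regression model turns inputs into a two-class decision: a weighted sum is passed through the sigmoid function to get a probability, which is then compared with a threshold. The weights, bias, and 0.5 threshold below are hypothetical values, not fitted from any data.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability of the positive class for one data point."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_class(weights, bias, features, threshold=0.5):
    """Return 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0


# Hypothetical coefficients (a real model would learn these from data)
weights, bias = [0.8, -0.4], -1.0

label_a = predict_class(weights, bias, [3.0, 1.0])  # weighted sum is positive
label_b = predict_class(weights, bias, [0.0, 2.0])  # weighted sum is negative
```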

Decision tree

is a supervised machine learning algorithm that can be used for both classification (sorting data into categories) and regression (predicting continuous values) tasks.

Overfitting

A decision tree can grow very deep and complex, essentially memorizing the noise and small fluctuations in the training data rather than learning the true underlying patterns.

Instability

Decision trees are very sensitive to small changes in the training data.

Bias toward dominant classes

If the dataset is imbalanced (one class has significantly more data points than others), the tree may become biased towards the majority class and fail to generalize well for the minority classes.

Random Forests

are an ensemble learning method that addresses the weaknesses of a single decision tree by training many trees on random samples of the data and combining their predictions.

K-Nearest Neighbors (KNN)

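The idea behind K-Nearest Neighbors (KNN) is very simple: a new data point is assigned the class that is most common among the K training points closest to it.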
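
A small sketch of KNN classification in plain Python: measure the distance from the new point to every training point, keep the K closest, and take a majority vote. Euclidean distance, K = 3, and the sample points are assumptions chosen for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up 2D points forming two clusters
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]

near_a = knn_predict(points, labels, (2, 2))
near_b = knn_predict(points, labels, (6, 5))
```

KNN has no real training step; it stores the labeled data and does all the work at prediction time.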

Regression

is a statistical method used to predict or explain the relationship between independent variables (IVs) and a dependent variable (DV).

Regression

The objective is to find the best-fitting curve for a dependent variable in a multidimensional space, with each independent variable being a dimension.

Simple Linear Regression

A statistical method used to model the relationship between one independent variable (X) and one dependent variable (Y) by fitting a straight line.
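
One common way to fit that straight line is ordinary least squares, sketched below in plain Python. The card does not name a fitting method, so least squares is an assumption here, and the data points are made up.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (not divided by n; the n cancels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predict Y for a new X = 5
```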

Multiple Linear Regression

predicts a dependent variable based on multiple predictors.

The coefficient of determination or R²

measures the proportion of variance in the dependent variable explained by the independent variable(s).
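
A short sketch of the usual R² calculation, 1 minus the ratio of unexplained variation to total variation. The function name and the numbers are made up for illustration.

```python
def r_squared(actual, predicted):
    """R² = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up actual values and model predictions
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.0]

score = r_squared(actual, predicted)
```

A value close to 1 means the predictions account for most of the variation in the dependent variable.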

Mean Absolute Error (MAE)

measures the average absolute difference between actual and predicted values.

Mean Squared Error (MSE)

measures the average of squared differences between actual and predicted values.

Root Mean Squared Error (RMSE)

is the square root of MSE and is one of the most commonly used regression metrics.
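
MAE, MSE, and RMSE can be computed side by side, as in this plain Python sketch. The function names and the sample values are made up for illustration.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Squared Error: average of (actual - predicted)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of MSE, in the same units as Y."""
    return math.sqrt(mse(actual, predicted))


# Made-up values: the errors are 1, 0, -2, 0
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 9.0, 9.0]

mae_value = mae(actual, predicted)
mse_value = mse(actual, predicted)
rmse_value = rmse(actual, predicted)
```

Because MSE squares each error, the single error of 2 contributes four times as much as the error of 1, so MSE and RMSE penalize large errors more heavily than MAE does.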

Model Lifecycle

describes the complete journey of a machine learning model, starting from identifying and defining the problem, collecting and preparing data, building and training the model, evaluating its performance, deploying it into real-world use, and continuously monitoring and maintaining it to ensure accuracy and effectiveness over time.

Problem Definition

In this stage, the goal is to clearly understand what problem the model will solve and why it is needed.

Data Collection

involves gathering all relevant information needed to train the model.

Data Preparation

is a crucial phase in the model lifecycle that involves transforming raw data into a clean and usable format for modeling.

Data cleaning

removing duplicate records, correcting errors, and handling missing values to improve data quality.
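
A minimal sketch of two of these cleaning steps, removing exact duplicates and filling missing values, in plain Python. The `age` field, the sample rows, and the choice to fill gaps with the mean are assumptions for illustration; correcting errors depends on the dataset and is not shown.

```python
def clean(rows):
    """Drop exact duplicate rows, then fill missing ages with the mean age."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))      # copy so the input is not modified
    known_ages = [r["age"] for r in unique if r["age"] is not None]
    mean_age = sum(known_ages) / len(known_ages)
    for r in unique:
        if r["age"] is None:
            r["age"] = mean_age
    return unique


# Made-up records: one duplicate, one missing value
rows = [
    {"name": "Ana", "age": 20},
    {"name": "Ana", "age": 20},
    {"name": "Ben", "age": None},
    {"name": "Cara", "age": 30},
]

cleaned = clean(rows)
```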

Data transformation

converting data into appropriate formats, such as scaling numerical values or encoding categorical variables.
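
The two transformations named here can be sketched in plain Python. Min-max scaling and one-hot encoding are used as representative choices (other scaling and encoding schemes exist), and the sample values are made up.

```python
def min_max_scale(values):
    """Rescale numbers to the range [0, 1]. Assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode each category as a 0/1 vector, one column per sorted category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


scaled = min_max_scale([10, 20, 40])
encoded = one_hot(["red", "blue", "red"])  # columns: blue, red
```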

Exploratory Data Analysis

analyzing data using statistics and visualizations to understand patterns, distributions, and relationships.

Feature engineering

creating new features from existing data to improve the model’s predictive capability.

Feature selection

selecting relevant features and removing unnecessary or irrelevant variables.

Data splitting

dividing the dataset into training and testing sets to properly train and evaluate the model.
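
A small sketch of a train/test split in plain Python. The helper name `split_train_test`, the 80/20 ratio, and the fixed seed are assumptions for illustration, not values from the reviewer.

```python
import random


def split_train_test(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows, then cut off the first part as the test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(10))  # stand-in for 10 data points
train_set, test_set = split_train_test(data)
```

Shuffling before cutting keeps the test set from being just the last rows of the file, which may be ordered in some way.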

Model Selection

This stage involves selecting the most appropriate algorithm or model to solve a specific problem based on the type of data and the desired outcome.

Cross-Validation

Divide the data into several subsets (folds) to train and test the model multiple times, giving a more reliable performance estimate and helping to detect overfitting.
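
One common form of this is k-fold cross-validation, sketched below in plain Python: each subset takes one turn as the test set while the rest are used for training. The function name and the 10-sample, 3-fold setup are assumptions for illustration.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering every sample once."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `remainder` folds get one extra sample
        stop = start + fold_size + (1 if i < remainder else 0)
        test_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        folds.append((train_idx, test_idx))
        start = stop
    return folds


folds = k_fold_indices(10, 3)
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

In practice the data is usually shuffled first; this sketch keeps the original order so the folds are easy to read. The model's score is then averaged over the k rounds.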

Hyperparameter Tuning

Adjust the model’s settings (hyperparameters) to improve its accuracy and overall performance.
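
A simple way to do this is a grid search: try every combination of candidate settings and keep the best-scoring one. In the sketch below, `toy_score` is a hypothetical stand-in for "train the model with these settings and return its validation score", and the hyperparameter names and values are made up.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every combination in param_grid; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(max_depth, min_samples):
    """Hypothetical score that peaks at max_depth=4, min_samples=5."""
    return 1.0 - 0.02 * abs(max_depth - 4) - 0.01 * abs(min_samples - 5)


grid = {"max_depth": [2, 4, 8], "min_samples": [1, 5, 10]}
best_params, best_score = grid_search(grid, toy_score)
```

With a real model, the score would come from a validation set or cross-validation, never from the final test set.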

Model Evaluation

The stage at which the trained model is evaluated on previously unseen data.

Conduct Error Analysis

Analyze incorrect predictions to identify weaknesses and improve model accuracy.

Perform Sensitivity Analysis

Determine how changes in input features affect predictions to understand feature importance and model stability.
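
A rough sketch of a one-at-a-time sensitivity check in plain Python: bump each input feature by a fixed amount and record how much the prediction moves. The `toy_model` function, its coefficients, and the feature names are hypothetical; any trained model that maps features to a number could take its place.

```python
def sensitivity(model, features, delta=1.0):
    """Change in prediction when each feature is increased by `delta` alone."""
    baseline = model(features)
    changes = {}
    for name in features:
        bumped = dict(features)   # copy so only one feature changes at a time
        bumped[name] += delta
        changes[name] = model(bumped) - baseline
    return changes


def toy_model(f):
    """Hypothetical exam-score model, not fitted to any data."""
    return 50.0 + 3.0 * f["hours_studied"] + 0.5 * f["hours_slept"]


result = sensitivity(toy_model, {"hours_studied": 4.0, "hours_slept": 7.0})
```

Here the prediction moves more when `hours_studied` changes than when `hours_slept` does, which suggests the first feature matters more to this particular model.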

Model Deployment

A late phase of the machine learning lifecycle, where the trained model is integrated into a production environment so it can make predictions on new data; monitoring and maintenance continue after it.