1/12
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced | Call with Kai |
|---|
No analytics yet
Send a link to your students to track their progress
What is Data Mining?
Non-trivial extraction of implicit, previously unknown, and potentially useful information from data.
Data Mining Process
Involves data selection, pre-processing, exploration, and analysis to discover meaningful patterns.
Supervised Algorithms
Algorithms that use training data with known results to make predictions on new data.
Unsupervised Algorithms
Algorithms that do not use training data, allowing for the discovery of unknown classes.
Classification
The process of predicting class labels for new records based on model training with existing data.
k-Nearest Neighbor
A classification approach that identifies the class of an unknown record by examining the 'k' closest training examples.
Decision Trees
A supervised classification method that uses a tree-like model of decisions for classification.
Distance Metric in k-NN
A method used to compute the distance between stored records in k-Nearest Neighbor classification.
Classification Description
Given a collection of records, where each record has a set of attributes with one being the dependent variable/class, the goal is to build a model that predicts the class based on other attribute values.
Goal of Classification
The objective is to assign previously unseen records to a class as accurately as possible.
Division of Data Sets
In classification, the data set is typically divided into training and test sets, with the training set used to build the model and the test set used to validate it.
Role of Test Set in Classification
A test set is utilized to determine the accuracy of the classification model after it has been trained.
Dependent Variable in Classification
The dependent variable, also known as the class attribute, is the outcome that the classification model aims to predict based on the input attributes.