Data Mining Notes

Last updated 4:17 AM on 12/5/24

13 Terms

What is Data Mining?

Non-trivial extraction of implicit, previously unknown, and potentially useful information from data.

Data Mining Process

Involves data selection, pre-processing, exploration, and analysis to discover meaningful patterns.

Supervised Algorithms

Algorithms that use training data with known results to make predictions on new data.

Unsupervised Algorithms

Algorithms that learn from data without known results (labels), allowing for the discovery of previously unknown classes or groupings.

Classification

The process of predicting class labels for new records based on model training with existing data.

k-Nearest Neighbor

A classification approach that identifies the class of an unknown record by examining the 'k' closest training examples, typically assigning the class that is most common among them.
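
As a rough illustration of the idea, here is a minimal sketch of k-nearest-neighbor classification. It assumes Euclidean distance and a simple majority vote, neither of which is specified by the card; the function name `knn_predict` and the toy data are made up for this example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label for `query` from `train`, a list of (features, label) pairs."""
    # Rank the stored training records by Euclidean distance to the query record
    # (assumed metric) and keep the k closest.
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]
    # Majority vote among the labels of the k closest records (assumed rule).
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy training data: two small clusters labeled "A" and "B".
train = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.8), "A"),
    ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"),
    ((5.5, 4.5), "B"),
]

print(knn_predict(train, (1.1, 1.0), k=3))  # A
print(knn_predict(train, (5.2, 4.8), k=3))  # B
```

An odd value of k is a common choice for two-class problems because it avoids tied votes.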

Decision Trees

A supervised classification method that uses a tree-like model of decisions for classification.
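
To make the "tree-like model of decisions" concrete, the sketch below classifies a record by walking a small hand-written tree. The tree, its attributes (`outlook`, `windy`), and its class labels are hypothetical, and the tree is written by hand here rather than learned from training data as a real decision tree algorithm would do.

```python
# A hand-written (not learned) decision tree. Internal nodes are dicts that
# test one attribute; leaves are plain class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {True: "stay in", False: "play"},
        },
        "rainy": "stay in",
    },
}

def classify(node, record):
    """Follow the branch matching the record's attribute value until a leaf is reached."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node

print(classify(tree, {"outlook": "sunny", "windy": False}))  # play
print(classify(tree, {"outlook": "rainy", "windy": False}))  # stay in
```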

Distance Metric in k-NN

A function used in k-Nearest Neighbor classification to compute the distance between the unknown record and each stored training record.
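
The card does not name a particular metric. As an illustration, the sketch below shows two commonly used choices, Euclidean and Manhattan distance; the function names and sample points are chosen for this example only.

```python
import math

def euclidean(a, b):
    """Straight-line distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block distance: sum of absolute differences per attribute."""
    return sum(abs(x - y) for x, y in zip(a, b))

p, q = (0.0, 0.0), (3.0, 4.0)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7.0
```

Because both metrics add up per-attribute differences, attributes on larger scales can dominate the result, so attributes are often rescaled before distances are computed.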

Classification Description

Given a collection of records, where each record has a set of attributes with one being the dependent variable/class, the goal is to build a model that predicts the class based on other attribute values.

Goal of Classification

The objective is to assign previously unseen records to a class as accurately as possible.

Division of Data Sets

In classification, the data set is typically divided into training and test sets, with the training set used to build the model and the test set used to validate it.
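
A minimal sketch of such a division, assuming a random shuffle and a 70/30 split. The split ratio, the fixed seed, and the function name `split_records` are illustrative assumptions, not something the card prescribes.

```python
import random

def split_records(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split it into (training set, test set)."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(records)       # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

records = list(range(10))          # stand-in for ten labeled records
train_set, test_set = split_records(records)
print(len(train_set), len(test_set))  # 7 3
```

Every record lands in exactly one of the two sets, so the model is never tested on records it was trained on.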

Role of Test Set in Classification

A test set is utilized to determine the accuracy of the classification model after it has been trained.
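
One simple way to measure this, sketched below, is the fraction of test records whose predicted class matches the known class. The labels here are made-up values, and `accuracy` is a name chosen for this example.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of records whose predicted class equals the known class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

y_true = ["A", "B", "A", "B"]   # known classes of the test records
y_pred = ["A", "B", "B", "B"]   # classes predicted by some trained model
print(accuracy(y_true, y_pred))  # 0.75
```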

Dependent Variable in Classification

The dependent variable, also known as the class attribute, is the outcome that the classification model aims to predict based on the input attributes.