Data Mining Flashcards

Description and Tags

Vocabulary flashcards for data mining concepts.

18 Terms

1

Data Mining

Methods that attempt to discover patterns, trends, and relationships among data, especially non-obvious and unexpected patterns.

2

Data Lake

A repository that holds raw data, often unstructured, in its original format.

3

Data Warehouse

A database structured to study patterns in data from multiple sources.

4

Data Mart

A scaled-down data warehouse that is specific to one part of an organization.

5

Classification

Predicting the class (or category) to which a record belongs.

6

Prediction

Predicting the value of a continuous variable.

7

Cluster Analysis

Separating data into groups such that records are similar within a group and are different across groups.
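
Example (a minimal sketch, not part of the original card): the card does not name an algorithm, so this uses k-means on one-dimensional data as one common way to form such groups. The points and the starting centers are made up for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """Very small k-means for 1-D data with caller-supplied starting centers."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centers


# Hypothetical data: two visibly separate groups of values.
points = [1, 2, 3, 10, 11, 12]
clusters, centers = kmeans_1d(points, centers=[1, 10])
print(clusters)  # [[1, 2, 3], [10, 11, 12]]
print(centers)   # [2.0, 11.0]
```

Records inside each group sit close to their own center and far from the other one, which is the "similar within, different across" idea on the card.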

8

Market Basket Analysis

Identifying items that are often purchased together.
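
Example (a minimal sketch, not part of the original card): one simple way to find items purchased together is to count how many baskets contain each pair of items. The transactions below are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set is the items in one purchase.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives every pair a single canonical ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(transactions)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)  # ('bread', 'milk') 3
```

Real market basket tools usually go further and report measures such as support and confidence; raw pair counts are only the starting point.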

9

Overfitting

Creating a model that fits the training data TOO well, such that it will not match new data well; the model re-creates noise and patterns specific to the training data that are not generalizable to new data.

10

Training Data

Data used to build the model (typically 70-80% of the original data).

11

Testing Data

Data used to evaluate the model (typically 20-30% of the original data).
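
Example (a minimal sketch, not part of the original card): a random 70/30 split into training and testing data. The function name, the `test_fraction` parameter, and the fixed seed are assumptions chosen for illustration.

```python
import random


def train_test_split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split off a testing portion."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data set of 100 record IDs.
records = list(range(100))
train, test = train_test_split(records, test_fraction=0.3)
print(len(train), len(test))  # 70 30
```

Shuffling before splitting keeps any ordering in the original data from leaking into one partition.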

12

Confusion Matrix

Organizes the counts of records by predicted class and actual class.

13

Overall Accuracy

The number of correct predictions (true positives plus true negatives) divided by the total number of records.

14

Sensitivity

How well a classifier correctly detects the important class members: true positives divided by all actual members of the important class.

15

Specificity

How well a classifier correctly rules out the less important class members: true negatives divided by all actual members of the less important class.
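
Example (a minimal sketch, not part of the original cards): the four cells of a two-class confusion matrix, and the overall accuracy, sensitivity, and specificity computed from them. The labels are hypothetical, with "fraud" treated as the important class.

```python
def confusion_counts(actual, predicted, positive):
    """Return (tp, fp, tn, fn) for a two-class problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted important, actually important
            else:
                fp += 1  # predicted important, actually not
        else:
            if a == positive:
                fn += 1  # missed an important-class member
            else:
                tn += 1  # correctly ruled out
    return tp, fp, tn, fn


# Hypothetical actual and predicted classes for eight records.
actual = ["fraud", "fraud", "ok", "ok", "ok", "fraud", "ok", "ok"]
predicted = ["fraud", "ok", "ok", "fraud", "ok", "fraud", "ok", "ok"]

tp, fp, tn, fn = confusion_counts(actual, predicted, positive="fraud")
accuracy = (tp + tn) / (tp + fp + tn + fn)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

print(tp, fp, tn, fn)         # 2 1 4 1
print(accuracy)               # 0.75
print(round(sensitivity, 3))  # 0.667
print(specificity)            # 0.8
```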

16

Classification Trees

Separate records into subgroups by creating splits on predictor variables, forming logical if/then rules.

17

Decision (splitting) node

Splits data into subgroups. Has successors (nodes below it).

18

Terminal node

A node with no successors. Contains the total count of records and the count of each class.
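
Example (a minimal sketch, not part of the original cards): a hand-written classification tree with two decision nodes, expressed as if/then rules, with the total count and per-class counts reported at each terminal node. The records, the predictors, and the split values are hypothetical; a real tree would learn its splits from training data.

```python
from collections import Counter

# Hypothetical records: (income in thousands, owns_home, class label).
records = [
    (30, False, "no"), (45, True, "no"), (80, False, "yes"),
    (95, True, "yes"), (60, True, "yes"), (55, False, "no"),
]


def terminal_node(record):
    """Route one record through the decision nodes to a terminal node."""
    income, owns_home, _label = record
    if income >= 70:   # decision node 1: split on income
        return "income >= 70"
    if owns_home:      # decision node 2: split on home ownership
        return "income < 70 and owns home"
    return "income < 70 and does not own home"


# Tally the class counts of the records that land in each terminal node.
leaves = {}
for record in records:
    leaves.setdefault(terminal_node(record), Counter())[record[2]] += 1

for rule, class_counts in leaves.items():
    print(rule, "| total:", sum(class_counts.values()), "|", dict(class_counts))
```

Each printed line corresponds to one terminal node; the majority class in that node would normally be the prediction for any new record that follows the same rule.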