Vocabulary flashcards covering the key data mining tasks, methods, applications, and foundational concepts from Lecture 1.
Data Mining
The process of discovering patterns and knowledge from large data sets.
Prediction Methods
Use some variables to predict unknown or future values of other variables.
Description Methods
Find human-interpretable patterns that describe the data.
Classification
Predictive task: builds a model from labeled training data to assign class labels to new, unseen records based on their attributes.
Clustering
Descriptive task: groups data points into clusters so that members of the same cluster are similar to one another and dissimilar to members of other clusters.
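A minimal sketch of the clustering idea, assuming a k-means-style procedure on one-dimensional points. The cards do not name a specific algorithm, so `kmeans_1d`, the starting centers, and the sample points are all illustrative assumptions.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group 1-D points around the nearest center (k-means-style sketch)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        # Assign each point to the cluster whose center is closest.
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its members (keep it if the cluster is empty).
        centers = [
            sum(members) / len(members) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return clusters, centers


# Hypothetical data: two visibly separate groups of values.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
clusters, centers = kmeans_1d(points, centers=[0.0, 5.0])
print(clusters)
print(centers)
```

Points within each resulting cluster lie close together, while the two clusters are far apart, which is the "similar within, different across" property the definition describes.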
Association Rule Discovery
Process of discovering rules that predict the occurrence of an item based on occurrences of other items (e.g., Milk → Soda).
Deviation/Anomaly Detection
Detect significant deviations from normal behavior (e.g., fraud detection, intrusions).
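One simple way to make "significant deviation" concrete is to flag values that lie many standard deviations from the mean. This is only a sketch under that assumption. The function name `find_anomalies`, the threshold of 2.0, and the transaction amounts are hypothetical and do not come from the lecture.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return values whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        # All values are identical, so nothing deviates from normal behavior.
        return []
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical card transaction amounts with one unusually large purchase.
amounts = [20, 22, 19, 21, 20, 23, 18, 500]
print(find_anomalies(amounts))
```

Real fraud or intrusion detection systems typically use richer models, but the principle is the same: describe normal behavior, then flag records that fall far outside it.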
Regression
Predict a continuous-valued variable based on other variables using linear or nonlinear models.
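As a rough illustration of the linear case, the sketch below fits a straight line to one input variable by ordinary least squares. The helper `fit_line` and the sample data are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that follows y = 2x + 1 exactly.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)

# Predict the continuous value for a new, unseen x.
prediction = slope * 5 + intercept
print(slope, intercept, prediction)
```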
Training Set
Subset of data used to build the model.
Test Set
Subset of data held out from training and used to estimate the model's accuracy on unseen records.
Model
The learned classifier or predictive model produced from training.
Classifier
A model that assigns class labels to unseen records.
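The relationship between training set, test set, model, and classifier can be sketched with a 1-nearest-neighbor classifier. Nearest-neighbor is a stand-in chosen for brevity, not a method the cards specify. The feature values are made up, and the "star"/"galaxy" labels are borrowed from the sky survey card.

```python
def predict(training_set, attributes):
    """Return the label of the training record closest to `attributes`."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, nearest_label = min(
        training_set, key=lambda record: squared_distance(record[0], attributes)
    )
    return nearest_label


# Labeled records: (attributes, class label). Values are hypothetical.
training_set = [
    ((1.0, 1.0), "star"),
    ((1.2, 0.8), "star"),
    ((5.0, 5.5), "galaxy"),
    ((5.5, 5.0), "galaxy"),
]
# Held-out records, never used when building the model.
test_set = [
    ((0.9, 1.1), "star"),
    ((5.2, 5.1), "galaxy"),
]

# Accuracy is the fraction of test records whose predicted label matches the true one.
correct = sum(predict(training_set, attrs) == label for attrs, label in test_set)
accuracy = correct / len(test_set)
print(accuracy)
```

Here the "model" is simply the stored training set plus the nearest-neighbor rule. Other classifiers would learn a more compact representation during training.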
Direct Marketing (Classification Application)
Use classification to target customers likely to buy a product, reducing mailing costs.
Fraud Detection (Classification Application)
Predict fraudulent credit card transactions using transaction and account information.
Sky Survey Cataloging (Classification Application)
Predict whether a sky object is a star or galaxy from image features.
Market Segmentation (Clustering Application)
Subdivision of a market into distinct customer groups for targeted marketing.
Document Clustering (Clustering Application)
Group documents that are similar to one another, based on the important terms they contain.
Milk → Soda
An example association rule showing that if Milk occurs, Soda is likely to occur.
Diaper, Milk → Beer
An association rule whose antecedent contains multiple items (Diaper and Milk), which together predict Beer.
Antecedent
In an association rule (e.g., A→B), the 'if-part' (A) represents the condition or set of items that are observed. For example, in the rule 'Milk → Soda', 'Milk' is the antecedent, indicating that if Milk is purchased, Soda is predicted to be bought as well.
Consequent
In an association rule (e.g., A→B), the 'then-part' (B) represents the item(s) that are predicted to occur if the antecedent is present. For example, in the rule 'Diaper, Milk → Beer', 'Beer' is the consequent: it is predicted to be bought if 'Diaper' and 'Milk' are purchased.
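To see how antecedent and consequent fit together, the sketch below measures how often a rule holds in a set of transactions: among the transactions that contain the antecedent, it computes the fraction that also contain the consequent. This fraction is commonly called the rule's confidence, a term the cards themselves do not use. The function name and the transactions are hypothetical.

```python
def rule_confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing the antecedent that also contain the consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        # The 'if-part' never occurs, so the rule has no supporting evidence.
        return 0.0
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)


# Hypothetical market-basket transactions.
transactions = [
    {"Diaper", "Milk", "Beer"},
    {"Diaper", "Milk", "Beer", "Bread"},
    {"Diaper", "Milk"},
    {"Milk", "Soda"},
    {"Bread", "Soda"},
]

diaper_milk_to_beer = rule_confidence(transactions, {"Diaper", "Milk"}, {"Beer"})
milk_to_soda = rule_confidence(transactions, {"Milk"}, {"Soda"})
print(diaper_milk_to_beer, milk_to_soda)
```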