Vocabulary flashcards for reviewing key data mining concepts and definitions
Data Generation Advances
Enormous data growth in both commercial and scientific databases due to advances in data generation and collection technologies.
Data Gathering
Gather whatever data you can whenever and wherever possible.
Commercial Data Mining Viewpoint
Data mining helps businesses provide better, customized services for an edge in Customer Relationship Management.
Scientific Data Mining Viewpoint
Data mining assists scientists in automated analysis of massive datasets and in hypothesis formation.
Data
The 'Facts'.
Information
Interpretation of Data.
Knowledge
Information That Has Been Given Meaning.
Explicit Knowledge
Identified and Codified Information, Documents, Records, and Files.
Tacit Knowledge
Lives in people and their practices; Experiences, Competence, Commitment, Deeds, and Thoughts.
Data Mining
Non-trivial extraction of implicit, previously unknown and potentially useful information from data; Exploration & analysis, by automatic or semi-automatic means, of large quantities of data to discover meaningful patterns.
Data Cleaning
Process that removes or transforms noise and inconsistent data.
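A minimal sketch of data cleaning, assuming records are plain dictionaries. The function name `clean_records` and the `city` field are hypothetical, chosen only for illustration: records with missing values are dropped (removing noise) and inconsistent spellings are normalized (transforming inconsistent data).

```python
def clean_records(records):
    """Drop records with missing values and normalize the 'city' spelling."""
    cleaned = []
    for rec in records:
        # Treat None or an empty string as a missing value and skip the record.
        if any(v is None or v == "" for v in rec.values()):
            continue
        fixed = dict(rec)
        # Normalize inconsistent whitespace and capitalization.
        fixed["city"] = fixed["city"].strip().title()
        cleaned.append(fixed)
    return cleaned


raw = [
    {"name": "Ann", "city": " boston "},
    {"name": "Bob", "city": None},
    {"name": "Cy", "city": "BOSTON"},
]
cleaned = clean_records(raw)
```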
Data Integration
Process where multiple data sources may be combined.
Data Selection
Data relevant to the analysis task are retrieved from the database.
Data Transformation
Data transformed/consolidated into appropriate forms for mining.
Data Mining (Process)
Essential process where intelligent and efficient methods are applied in order to extract patterns.
Pattern Evaluation
Process that identifies the truly interesting patterns representing knowledge based on some interestingness measures.
Knowledge Presentation
Visualization and knowledge representation techniques are used to present the mined knowledge to the user.
Prediction Methods
Use some variables to predict unknown or future values of other variables.
Description Methods
Find human-interpretable patterns that describe the data.
Classification
Find a model for the class attribute as a function of the values of the other attributes; a form of predictive modeling.
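One possible illustration of classification is a 1-nearest-neighbor classifier, which predicts the class attribute of a new record from the labeled record closest to it. This is only one of many classification techniques; the function name `predict_1nn` and the toy data are assumptions made for the sketch.

```python
import math


def predict_1nn(train, query):
    """Return the class label of the training record closest to `query`."""
    # Each training row is (attribute_tuple, class_label).
    nearest = min(train, key=lambda row: math.dist(row[0], query))
    return nearest[1]


train = [
    ((1.0, 1.0), "low"),
    ((1.2, 0.8), "low"),
    ((5.0, 5.2), "high"),
    ((4.8, 5.1), "high"),
]
label = predict_1nn(train, (4.5, 4.9))
```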
Data Mining Steps (abridged)
Understanding the purpose, obtaining data, cleaning and preprocessing, reducing dimension, determining the task, partitioning data, choosing a technique, using algorithms, interpreting results, deploying the model.
Regression
Predict a value of a given continuous valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency.
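As a sketch of the linear case, the following fits a straight line by ordinary least squares with a single predictor variable. The function name `fit_line` and the sample values are illustrative assumptions, not part of the original cards.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
predicted = slope * 5 + intercept  # predict the continuous value at x = 5
```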
Clustering
Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups.
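A small k-means sketch shows one common way to form such groups, assuming Euclidean distance as the similarity measure and hand-picked starting centroids. The function name `kmeans` and the sample points are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Assign points to the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            # Index of the centroid closest to this point.
            idx = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
            groups[idx].append(p)
        # New centroid is the coordinate-wise mean; keep the old one if empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids, groups


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0),
          (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, groups = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```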
Association Rule Discovery
Produce dependency rules which will predict occurrence of an item based on occurrences of other items, given a set of records.
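Dependency rules of this kind are commonly scored by support and confidence. Those two measures are not defined in the cards above, so treat them as an assumption of this sketch, along with the function name `rule_stats` and the toy transactions.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Records containing the antecedent items.
    n_ante = sum(1 for t in transactions if antecedent <= t)
    # Records containing both antecedent and consequent items.
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]
support, confidence = rule_stats(transactions, {"diapers"}, {"beer"})
```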
Deviation/Anomaly/Change Detection
Detect significant deviations from normal behavior.
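One simple way to flag deviations, sketched here under the assumption that "normal behavior" can be summarized by a mean and standard deviation, is a z-score test. The function name `find_anomalies`, the threshold of 2.0, and the readings are illustrative choices.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing deviates
    return [v for v in values if abs(v - mean) / stdev > threshold]


readings = [10, 11, 9, 10, 12, 10, 11, 50]
anomalies = find_anomalies(readings)
```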
Scalability
Ability of a system or algorithm to scale up or down in response to changes in workload, such as growing data volume.
High Dimensionality
The number of dimensions (attributes) is so high that calculations become difficult.
Heterogeneous and Complex Data
Any data with high variability of data types and formats.