A collection of key terms and their definitions related to data analytics, useful for exam preparation.
Data Analytics
The science of analyzing raw data to extract useful knowledge, involving pattern recognition and data-driven decision making.
Three Vs of Big Data
Volume, Variety, Velocity - dimensions that define the characteristics and challenges of big data.
Hyperparameters
Values set by the user before a method is run, rather than learned from the data, such as the number of clusters in k-means clustering.
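As an illustration of the k-means example, here is a minimal sketch of a one-dimensional k-means in plain Python. The function name `kmeans_1d`, the sample points, and the `n_iter` and `seed` parameters are assumptions made for this example; the point is only that `k` is supplied by the user and not estimated from the data.

```python
import random

def kmeans_1d(points, k, n_iter=20, seed=0):
    """Tiny 1-D k-means. k and n_iter are hyperparameters chosen by the user."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct starting centers
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest center
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # move each center to the mean of its cluster (keep it if the cluster is empty)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Hypothetical data with three visible groups
points = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9, 9.0, 9.2, 8.8]
centers_k2 = kmeans_1d(points, k=2)  # the user decides k=2
centers_k3 = kmeans_1d(points, k=3)  # or k=3; the algorithm does not choose
print(centers_k2, centers_k3)
```

Running the same data with different values of `k` gives a different number of centers, which is why the choice is usually tuned or justified separately from the fitting step.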
CRISP-DM
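Cross-Industry Standard Process for Data Mining; a flexible, iterative process model for data mining projects with six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.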
Data Mining
The process of discovering patterns in large datasets to derive essential insights.
Descriptive Analytics
The process of summarizing and interpreting historical data to describe what has occurred.
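A minimal sketch of descriptive analytics using Python's standard `statistics` module. The `monthly_sales` figures are made up for illustration; the summary only describes what has already happened and makes no forecast.

```python
import statistics

# Hypothetical historical data (assumed for this example)
monthly_sales = [120, 135, 128, 150, 142, 160]

summary = {
    "count": len(monthly_sales),
    "mean": statistics.mean(monthly_sales),
    "median": statistics.median(monthly_sales),
    "stdev": statistics.stdev(monthly_sales),  # sample standard deviation
    "min": min(monthly_sales),
    "max": max(monthly_sales),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```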
Predictive Analytics
The use of data and statistical algorithms to identify the likelihood of future outcomes based on historical data.
Clustering
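An unsupervised learning technique that groups data objects using only information found in the data, so that objects in the same group are more similar to each other than to objects in other groups.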
Attribute
A characteristic of an instance in a dataset, often synonymous with variable or feature.
Data Visualization
The graphical representation of information and data, helping to identify trends, outliers, and patterns.
Support Count
The number of transactions that contain a particular itemset in the context of association rules.
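The definition above can be sketched in a few lines of Python. The market-basket `transactions` and the helper name `support_count` are assumptions made for this example.

```python
# Hypothetical market-basket data: each transaction is a set of items
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support_count(itemset, transactions):
    """Count the transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)  # <= is the subset test

count_pair = support_count({"milk", "diapers"}, transactions)
count_bread = support_count({"bread"}, transactions)
print(count_pair, count_bread)
```

Dividing a support count by the total number of transactions gives the itemset's support as a fraction.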
Lift (Interest Factor)
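The ratio of the confidence of a rule to the support of the itemset in its consequent. A lift of 1 indicates that antecedent and consequent are independent, a value above 1 indicates positive correlation, and a value below 1 indicates negative correlation.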
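A minimal sketch of the lift calculation for a rule X -> Y, computed as confidence(X -> Y) divided by support(Y). The `transactions` data and the helper names `support`, `confidence`, and `lift` are assumptions made for this example.

```python
# Hypothetical market-basket data (assumed for this example)
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(X and Y together) / support(X) for the rule X -> Y."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """confidence(X -> Y) / support(Y)."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)

rule_lift = lift({"diapers"}, {"beer"}, transactions)
print(round(rule_lift, 2))  # 1.25 for this data
```

Here the confidence of {diapers} -> {beer} is 0.75 and the support of {beer} is 0.6, so the lift is above 1 and the two items appear together more often than independence would predict.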
Outlier
An anomaly or unusual value in a dataset that deviates significantly from the majority of the data.
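One common way to flag outliers, though not the only one, is the 1.5 x IQR rule. The sketch below assumes that rule, made-up `readings`, and a hypothetical helper name `iqr_outliers`; it uses `statistics.quantiles` from the standard library (Python 3.8+).

```python
import statistics

def iqr_outliers(values, factor=1.5):
    """Return values lying more than factor * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # three quartile cut points
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sensor readings with one suspicious value
readings = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(readings)
print(outliers)  # [95]
```

The threshold factor of 1.5 is a convention; whether a flagged value is an error or a genuine extreme still has to be judged in context.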
Data Quality Dimensions
Factors such as accuracy, completeness, consistency, timeliness, validity, and uniqueness that determine the quality of data.
Feature Selection
The process of selecting a subset of relevant features for use in model construction.
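As one simple illustration, the sketch below drops features whose values never vary, since a constant column cannot help a model distinguish instances. This variance filter is only one of many feature selection approaches; the `rows` data and the function name `select_by_variance` are assumptions made for this example.

```python
import statistics

# Hypothetical dataset: each row is one instance, each key one feature
rows = [
    {"age": 25, "country_code": 1, "income": 40},
    {"age": 32, "country_code": 1, "income": 52},
    {"age": 47, "country_code": 1, "income": 75},
    {"age": 51, "country_code": 1, "income": 61},
]

def select_by_variance(rows, threshold=0.0):
    """Keep features whose population variance exceeds threshold."""
    features = rows[0].keys()
    return [
        f for f in features
        if statistics.pvariance([r[f] for r in rows]) > threshold
    ]

selected = select_by_variance(rows)
print(selected)  # ['age', 'income']; country_code is constant and is dropped
```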
Sampling
The process of selecting a representative subset from a population for analysis to reduce the cost and time involved.
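A minimal sketch of simple random sampling without replacement, using the standard `random` module. The population of 1,000 numbered records, the sample size of 50, and the seed are assumptions made for this example; other schemes such as stratified sampling are not shown.

```python
import random

# Hypothetical population: record IDs 1..1000
population = list(range(1, 1001))

rng = random.Random(42)                 # fixed seed so the draw is repeatable
sample = rng.sample(population, k=50)   # 50 distinct records, no replacement

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)
print(f"population mean: {population_mean}, sample mean: {sample_mean}")
```

Analyzing 50 records instead of 1,000 is cheaper, at the price of some sampling error in estimates such as the mean.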
Algorithm
A self-contained, step-by-step set of instructions for solving a problem or performing a task.
A Priori Principle
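The principle that if an itemset is frequent, then all of its subsets must also be frequent. Equivalently, any superset of an infrequent itemset is itself infrequent, which lets frequent itemset generation prune candidates without counting them.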
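The pruning use of this principle can be sketched as follows: a candidate k-itemset is kept only if every one of its (k-1)-item subsets is already known to be frequent. The function name `prune_candidates` and the example itemsets are assumptions made for this illustration, and the sketch covers only the pruning step, not the full Apriori algorithm.

```python
from itertools import combinations

def prune_candidates(candidates, frequent_smaller):
    """Keep candidates whose every (k-1)-subset appears in frequent_smaller."""
    kept = []
    for cand in candidates:
        k = len(cand)
        subsets = (frozenset(s) for s in combinations(cand, k - 1))
        if all(s in frequent_smaller for s in subsets):
            kept.append(cand)
    return kept

# Hypothetical frequent 2-itemsets found in an earlier pass
frequent_pairs = {
    frozenset(p) for p in [
        ("bread", "milk"),
        ("bread", "diapers"),
        ("milk", "diapers"),
        ("diapers", "beer"),
    ]
}

# Candidate 3-itemsets to check
candidates = [
    frozenset({"bread", "milk", "diapers"}),  # all three pairs are frequent
    frozenset({"bread", "diapers", "beer"}),  # {bread, beer} is not frequent
]

survivors = prune_candidates(candidates, frequent_pairs)
print(survivors)
```

Only the first candidate survives, so only its support needs to be counted against the transactions.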