This set of flashcards covers key vocabulary and concepts related to data management, quality, and analytics as presented in the lecture notes.
Data Profiling
The process of investigating the quality and structure of data and deciding how to handle any issues found.
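As an illustration of data profiling, the sketch below summarises a single column by counting rows, missing entries, and distinct values. The function name `profile_column`, the choice to treat `None` and empty strings as missing, and the sample data are assumptions made for this example, not part of the lecture notes.

```python
from collections import Counter

def profile_column(values):
    """Summarise one column: row count, missing count, distinct count, most common value."""
    # Assumption: None and empty strings both count as missing
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }

# Hypothetical sample column containing one blank and one None
states = ["CA", "NY", "", "CA", None]
summary = profile_column(states)
print(summary)
```

A summary like this is usually the starting point for deciding which columns need cleaning.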
Data Quality Characteristics
Key criteria for evaluating data, including correctness (accuracy of facts), validity (adherence to defined rules), and consistency (uniform representation).
Single-Valued Column
A column in a database that contains a single value for each row, describing one characteristic.
Composite Column
A column that combines values from two or more characteristics into a single field.
Multi-Valued Column
A column where a cell contains multiple values of the same characteristic.
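The difference between composite and multi-valued columns can be sketched with one hypothetical row. The column names, the comma and semicolon delimiters, and the data are all assumptions chosen for illustration.

```python
# Hypothetical row: "location" is composite (city + state),
# "phones" is multi-valued (several values of the same characteristic)
row = {"customer_id": 1, "location": "Austin, TX", "phones": "555-0100;555-0199"}

# Composite column: split into one single-valued column per characteristic
city, state = [part.strip() for part in row["location"].split(",")]

# Multi-valued column: emit one row per value of the same characteristic
phone_rows = [
    {"customer_id": row["customer_id"], "phone": p.strip()}
    for p in row["phones"].split(";")
]

print(city, state)
print(phone_rows)
```

After the split, every resulting column is single-valued.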
Flat Tables
Tables that structure data for analysis using single-valued columns; preferred over crosstabulation tables.
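A minimal sketch of turning a crosstabulation into a flat table follows. The helper `to_flat` and the sample sales figures are hypothetical; the idea is simply one output row per combination of row key and value column.

```python
# Hypothetical crosstab: one row per region, one column per quarter
crosstab = [
    {"region": "East", "Q1": 100, "Q2": 120},
    {"region": "West", "Q1": 90, "Q2": 95},
]

def to_flat(rows, key, value_columns, var_name, value_name):
    """Unpivot: produce one row per (key, value column) pair."""
    return [
        {key: r[key], var_name: col, value_name: r[col]}
        for r in rows
        for col in value_columns
    ]

flat = to_flat(crosstab, "region", ["Q1", "Q2"], "quarter", "sales")
for record in flat:
    print(record)
```

In the flat form, "quarter" becomes an ordinary column that can be filtered or grouped like any other.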
Star Schema
A type of data model consisting of fact tables and dimension tables to support efficient querying and analysis.
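To make the fact/dimension split concrete, the sketch below joins a small fact table to one dimension table and totals a measure by a dimension attribute. The table names (`fact_sales`, `dim_product`) and all values are hypothetical.

```python
from collections import defaultdict

# Dimension table keyed by its primary key (hypothetical data)
dim_product = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Books"},
}

# Fact table: one row per sale, with a foreign key into the dimension
fact_sales = [
    {"product_id": 1, "amount": 20.0},
    {"product_id": 2, "amount": 35.0},
    {"product_id": 3, "amount": 10.0},
    {"product_id": 1, "amount": 5.0},
]

# Join each fact to its dimension row and total the measure by category
sales_by_category = defaultdict(float)
for sale in fact_sales:
    category = dim_product[sale["product_id"]]["category"]
    sales_by_category[category] += sale["amount"]

print(dict(sales_by_category))
```

A real star schema would typically have several dimension tables (date, customer, store) around the same fact table.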
Data Cleaning
The process of correcting, validating, and modifying data to ensure its quality for analysis.
Imputation
A strategy for handling incomplete data by substituting estimated values for missing entries.
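One simple form of imputation is replacing missing entries with the mean of the observed values. The sketch below assumes missing entries are represented as `None`; mean imputation is only one of several possible strategies and is used here purely as an example.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to estimate from, so leave as-is
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical column with two missing ages
ages = [30, None, 40, 50, None]
imputed = impute_mean(ages)
print(imputed)
```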
Referential Integrity
A validation rule in databases ensuring that every value in a foreign key column corresponds to an existing value in the primary key column of the referenced table.
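A referential integrity check can be sketched as a search for "orphaned" child rows whose foreign key has no matching primary key. The helper `orphaned_rows` and the customers/orders data are hypothetical.

```python
# Hypothetical parent and child tables
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},  # no customer 3 exists
]

def orphaned_rows(child_rows, fk, parent_rows, pk):
    """Return child rows whose foreign key value is absent from the parent's primary key."""
    parent_keys = {p[pk] for p in parent_rows}
    return [c for c in child_rows if c[fk] not in parent_keys]

orphans = orphaned_rows(orders, "customer_id", customers, "customer_id")
print(orphans)
```

An empty result means the foreign key column satisfies the rule.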
Continuous Auditing
An approach in auditing that continuously evaluates transactions and data as they occur to ensure accuracy and compliance.
Cognitive Technologies
Artificial intelligence tools that replicate human judgment in decision-making and analysis.
Data Mining
The process of examining large datasets to identify patterns, correlations, and anomalies.
Robotic Process Automation (RPA)
Technology that automates routine tasks by mimicking human actions across digital platforms.
Smart Contracts
Self-executing contracts with the terms of the agreement directly written into code, utilizing blockchain technology.
Textual Analysis
An analytical method that examines text data for insights regarding sentiments, trends, and interpretations.
What-If Analysis
A technique used to predict the outcome of a situation based on varying input assumptions.
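A what-if analysis can be as small as one model evaluated under several input assumptions. The profit formula, the scenario names, and all figures below are hypothetical and chosen only to show the pattern.

```python
def profit(units, price, unit_cost, fixed_cost):
    """Simple profit model: contribution margin times volume, minus fixed cost."""
    return units * (price - unit_cost) - fixed_cost

# Vary one input assumption (units sold) while holding the others constant
scenarios = {"low": 800, "base": 1000, "high": 1200}
results = {
    name: profit(units, price=25, unit_cost=15, fixed_cost=8000)
    for name, units in scenarios.items()
}

for name, value in results.items():
    print(name, value)
```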
Data Visualization
The graphical representation of data to communicate insights clearly and effectively.
Confirmation Bias
The tendency to search for or interpret information in a way that confirms one’s pre-existing beliefs.
Selection Bias
A distortion in statistical analysis caused by non-random selection of participants or data.
ETL (Extract, Transform, Load)
A three-step process to move data from source systems, clean and consolidate it, and load it into a data warehouse or data lake.
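The three ETL steps can be sketched end to end with the Python standard library. Here an in-memory CSV stands in for a source system and an in-memory SQLite database stands in for the warehouse; the table name `sales`, the cleaning rules, and the data are assumptions for illustration.

```python
import csv
import io
import sqlite3

# Extract: read rows from the source (a hypothetical in-memory CSV)
source = io.StringIO("id,name,amount\n1, alice ,10.5\n2,BOB,\n3,carol,7\n")
raw_rows = list(csv.DictReader(source))

# Transform: trim and normalise names, cast types, drop rows with no amount
clean_rows = [
    (int(r["id"]), r["name"].strip().title(), float(r["amount"]))
    for r in raw_rows
    if r["amount"].strip()
]

# Load: write the cleaned rows into the target table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, name TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", clean_rows)
conn.commit()

loaded = conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
print(loaded)
```

Dropping the incomplete row is just one possible transform rule; an imputation step could be used instead.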
Metadata
Data that provides information about other data, describing its characteristics, structure, content, and context.
Data Governance
The overall management of data availability, usability, integrity, and security within an organization to ensure data quality and compliance.
Anomaly Detection
The process of identifying rare items, events, or observations that deviate significantly from the majority of the data.
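One simple way to flag observations that deviate from the majority is a z-score rule: mark values that lie more than a chosen number of standard deviations from the mean. The threshold of 2.0, the function name, and the sample readings are assumptions; other techniques (for example, interquartile-range rules) are equally valid.

```python
from statistics import mean, pstdev

def zscore_outliers(values, threshold=2.0):
    """Return values whose distance from the mean exceeds `threshold` standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical readings with one obvious outlier
readings = [10, 11, 9, 10, 12, 10, 50]
outliers = zscore_outliers(readings)
print(outliers)
```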