The 4 Primary Traits of the Value of Data
Type, Timeliness, Quality and Governance
Transactional Data
Encompasses all of the data contained within a single business process or unit of work; its primary purpose is to support daily operational tasks
Analytical Data
Encompasses all organizational data; its primary purpose is to support managerial analysis tasks
Data Timeliness
An aspect of data that depends on the situation: some decisions require real-time data, while others can rely on data that is hours, days, or weeks old. Related concepts: real-time data and real-time systems.
Data Quality
The five common characteristics of high-quality data: Accurate, Complete, Consistent, Timely, and Unique. Related concepts: data inconsistency, data integrity, and the data steward role.
The 4 Primary Sources of Low-Quality Data Include:
1. Customers intentionally enter inaccurate data to protect their privacy
2. Different entry standards and formats
3. Operators enter abbreviated or erroneous data by accident or to save time
4. Third party and external data contains inconsistencies, inaccuracies, and errors
Potential business effects resulting from low-quality data include:
• Inability to accurately track customers
• Difficulty identifying valuable customers
• Inability to identify selling opportunities
• Marketing to nonexistent customers
• Difficulty tracking revenue
• Inability to build strong customer relationships
Understanding the Benefits of Good Data
• High quality data can significantly improve the chances of making a good decision
• Good decisions can directly impact an organization's bottom line
• A data steward is responsible for ensuring data policies and procedures are implemented across an organization
Data Governance
Refers to the overall management of the availability, usability, integrity, and security of company data.
Related concepts: Master Data Management (MDM) and data validation.
Database
Maintains data about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)
Entities
A person, place, thing, transaction, or event about which information is stored
Data Element
The smallest or most basic unit of data
Data Model
Logical data structures that detail the relationships among data elements using graphics or pictures
Metadata
Details about data (data that describes other data)
Data Dictionary
Compiles all of the metadata about the data elements in the data model
Primary Key
A field (or group of fields) that uniquely identifies a given entity in a table
Foreign Key
A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
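A minimal sketch of the idea, using SQLite from Python and hypothetical Customer/CustomerOrder tables: customer_id is the primary key of Customer and reappears as a foreign key in CustomerOrder to relate the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        customer_id INTEGER PRIMARY KEY,   -- primary key: uniquely identifies each customer
        name        TEXT NOT NULL
    );
    CREATE TABLE CustomerOrder (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES Customer(customer_id),  -- foreign key linking back to Customer
        total       REAL
    );
""")
```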
Database Management System (DBMS)
Creates, reads, updates, and deletes data in a database while controlling access and security
Structured Query Language (SQL)
Asks users to write lines of code to answer questions against a database
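A minimal sketch, again with a hypothetical Customer table, showing SQL statements issued through Python's sqlite3 module; the four statements also illustrate the create, read, update, and delete operations of a DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, name TEXT)")

# The same SQL "questions" a user would write, issued through Python.
conn.execute("INSERT INTO Customer (customer_id, name) VALUES (?, ?)", (1, "Ada"))   # create
rows = conn.execute("SELECT customer_id, name FROM Customer").fetchall()             # read
conn.execute("UPDATE Customer SET name = ? WHERE customer_id = ?", ("Ada L.", 1))    # update
conn.execute("DELETE FROM Customer WHERE customer_id = ?", (1,))                     # delete
print(rows)  # [(1, 'Ada')]
```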
Data Analysis Cycle
Collect (identify data sources), Analyze (ask the right questions), Visualize (craft your data story) and Communicate (influence and persuade).
Business Intelligence Dashboards
Track corporate metrics such as critical success factors and key performance indicators and include advanced capabilities such as interactive controls, allowing users to manipulate data for analysis.
Consolidation
The aggregation of data from simple roll-ups to complex groupings of interrelated information.
Drill down
Enables users to view details, and details of details, of information.
Slice-and-dice
The ability to look at information from different perspectives.
Pivot
Rotates data to display alternative presentations of the data.
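The four capabilities above (consolidation, drill down, slice-and-dice, and pivot) can be sketched with pandas on made-up sales data; the column names and figures are purely illustrative.

```python
import pandas as pd

# Hypothetical sales data used only for illustration.
sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "revenue": [100, 150, 120, 90],
})

# Consolidation: roll detailed rows up into totals per region.
by_region = sales.groupby("region")["revenue"].sum()

# Drill down: view the detail behind a single roll-up value.
east_detail = sales[sales["region"] == "East"]

# Slice-and-dice: look at the same data from another perspective (by product).
by_product = sales.groupby("product")["revenue"].sum()

# Pivot: rotate the data into a region x product presentation.
pivoted = sales.pivot_table(index="region", columns="product", values="revenue")
print(pivoted)
```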
Estimation Analysis
Determines values for an unknown continuous variable's behavior or its estimated future value.
Affinity Grouping Analysis
Reveals the relationship between variables, along with the nature and frequency of the relationships.
Cluster Analysis
A technique used to divide an information set into mutually exclusive groups such that the members of each group are as close together as possible to one another and the different groups are as far apart as possible.
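A minimal clustering sketch using scikit-learn's k-means on made-up two-dimensional points; the data and the choice of two clusters are assumptions for illustration only.

```python
from sklearn.cluster import KMeans

# Hypothetical points (e.g., customer age vs. annual spend).
points = [[25, 300], [27, 320], [60, 900], [62, 880], [24, 310], [61, 910]]

# K-means partitions the points into mutually exclusive groups whose members
# are close to their own group's center and far from the other groups' centers.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(model.labels_)  # cluster assignment for each point
```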
Classification Analysis
The process of organizing data into categories or groups for its most effective and efficient use; for example, groups of political affiliation and charity donors.
Collaborative Filtering
A technique used in recommendation systems to provide personalized recommendations to users based on their similarity to other users. It is based on the idea that people who have similar preferences or behaviors in the past are likely to have similar preferences in the future.
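A toy user-based collaborative filtering sketch: the rating matrix is invented, and cosine similarity stands in for whatever similarity measure a real system would use.

```python
import numpy as np

# Hypothetical user-item rating matrix (rows = users, columns = items, 0 = unrated).
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Find the user most similar to user 0, then recommend items that neighbor
# rated but user 0 has not rated yet.
target = 0
others = [u for u in range(len(ratings)) if u != target]
neighbor = max(others, key=lambda u: cosine(ratings[target], ratings[u]))
recommendations = [item for item in range(ratings.shape[1])
                   if ratings[target, item] == 0 and ratings[neighbor, item] > 0]
print(neighbor, recommendations)
```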
Recommendation Engine
A data mining algorithm that analyzes a customer’s purchases and actions on a website and then uses the data to recommend complementary products.
Market Basket Analysis
Evaluates such items as websites and checkout scanner information to detect customers’ buying behavior and predict future behavior by identifying affinities among customers’ choices of products and services.
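A simplified market basket sketch that just counts how often product pairs co-occur in invented checkout baskets; a fuller analysis would also compute measures such as support and confidence.

```python
from itertools import combinations
from collections import Counter

# Hypothetical checkout-scanner baskets used only for illustration.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

# Count how often pairs of products are purchased together to surface affinities.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

print(pair_counts.most_common(3))  # ('bread', 'butter') appears in 3 of 4 baskets
```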
Data warehouse
A logical collection of data—gathered from many different operational databases—that supports business analysis activities and decision-making tasks
Data Aggregation
The collection of data from various sources for the purpose of data processing
Extraction, Transformation and Loading (ETL)
A process that extracts data from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a data warehouse
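A minimal ETL sketch under assumed names (a sales.csv export and a fact_sales warehouse table): extract the raw rows, transform them to a common definition, and load them into the warehouse.

```python
import csv
import sqlite3

def extract(path):
    # Extract: pull raw rows from an operational export.
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    # Transform: apply common enterprise definitions (uppercase region codes, numeric revenue).
    return [(row["region"].strip().upper(), float(row["revenue"])) for row in rows]

def load(rows, conn):
    # Load: write the cleaned rows into the data warehouse table.
    conn.execute("CREATE TABLE IF NOT EXISTS fact_sales (region TEXT, revenue REAL)")
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", rows)

warehouse = sqlite3.connect("warehouse.db")
load(transform(extract("sales.csv")), warehouse)
warehouse.commit()
```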
Data Mart
Contains a subset of data warehouse data
Information Cube
The common term for the representation of multidimensional data
Data Lake
A storage repository that holds a vast amount of raw data in its original format until the business needs it
Data Scrubbing
A process that weeds out and fixes or discards inconsistent, incorrect, or incomplete data
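A small scrubbing sketch with pandas on invented customer records: duplicates are dropped, rows missing a required field are discarded, and phone numbers are normalized to one format.

```python
import pandas as pd

# Hypothetical records containing the problems scrubbing fixes:
# duplicates, missing values, and inconsistent formats.
customers = pd.DataFrame({
    "name":  ["Ann Lee", "Ann Lee", "Bob Ray", None],
    "phone": ["555-0100", "555-0100", "(555) 0199", "555-0142"],
})

scrubbed = (
    customers
    .drop_duplicates()         # discard exact duplicate rows
    .dropna(subset=["name"])   # discard rows missing a required field
    .assign(phone=lambda df: df["phone"].str.replace(r"\D", "", regex=True))  # one phone format
)
print(scrubbed)
```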
Poor Data Quality
The data, if available, were often incorrect or incomplete. Therefore, users could not rely on the data to make decisions.
Inconsistent Data Definitions
Every department had its own method for recording data, so when departments tried to share information, the data did not match and users did not get the data they really needed.
Ineffective Direct Data Access
Most data stored in operational databases did not allow users direct access; users had to wait to have their queries or questions answered by MIS professionals who could code SQL.
Lack of Data Standards
Managers needed to perform cross-functional analysis using data from all departments, which differed in granularities, formats, and levels.
Inadequate Data Usefulness
Users could not get the data they needed; what was collected was not always useful for intended purposes.
Distributed Computing
Processes and manages algorithms across many machines in a computing environment
Ledger
Records classified and summarized transactional data
Blockchain
A type of distributed ledger, consisting of blocks of data that maintain a permanent and tamper-proof record of transactional data
Proof-of-work
A requirement to define an expensive computer calculation, also called mining, that needs to be performed in order to create a new group of trustless transactions (blocks) on the distributed ledger or blockchain
Hash
A function that converts an input of letters and numbers into an encrypted output of a fixed length
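A quick illustration with Python's hashlib: SHA-256 always returns a fixed-length output, and a one-character change to the input yields a completely different hash.

```python
import hashlib

# Both digests are 64 hex characters (256 bits), no matter the input length.
print(hashlib.sha256(b"send 5 coins to Bob").hexdigest())
print(hashlib.sha256(b"send 6 coins to Bob").hexdigest())
```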
Blocks
Data structures containing the block's own hash, the previous block's hash, and the transaction data
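A toy sketch (not a real blockchain implementation) tying the last three entries together: each block stores its data, the previous block's hash, and its own hash, and the proof-of-work "mining" loop searches for a nonce that satisfies the difficulty condition.

```python
import hashlib

def mine_block(data, previous_hash, difficulty=4):
    # Proof-of-work: try nonces until the hash starts with `difficulty` zeros.
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{previous_hash}{data}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return {"data": data, "previous_hash": previous_hash, "hash": digest, "nonce": nonce}
        nonce += 1

genesis = mine_block("genesis", previous_hash="0")
block_1 = mine_block("Alice pays Bob 5", previous_hash=genesis["hash"])
print(block_1["hash"])  # tampering with genesis would change this hash and break the chain
```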