A comprehensive set of vocabulary flashcards covering the core concepts of data management, big data, storage structures, and analytical tools as presented in the lecture transcript.
Data Management (Gartner Group)
The practices, architectural techniques, and tools for achieving consistent access to and delivery of data across the spectrum of data subject areas and data structure types in the enterprise.
Data
Raw, unorganized facts that might not mean much until they are processed, organized, and structured into a context.
Information
Data that has context and meaning after being processed and organized, such as the average training score of an entire department.
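The step from data to information can be illustrated with a small sketch. The scores below are made-up values, not taken from the lecture; the point is only that a list of raw numbers becomes meaningful once it is summarized in context.

```python
from statistics import mean

# Hypothetical raw data: individual training scores with no context attached.
raw_scores = [72, 88, 95, 64, 81]

# Processing the raw data yields information: one figure describing the department.
department_average = mean(raw_scores)

print(f"Department average training score: {department_average}")
```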
Database Management Systems (DBMS)
Software systems used to create and manage databases where data are stored in computer files called tables.
Tables
The most important part of a database, where the data is held; each table consists of records (rows) and fields (columns).
Records
The rows in a database table.
Fields
The columns in a database table.
Primary Key
A field in a database table that uniquely identifies a record in that table, such as a unique student ID number.
Foreign Key
A field in a database table that links two tables in a relational database by referencing the primary key of the other table.
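A minimal sketch of how primary and foreign keys work together, using Python's built-in `sqlite3` module. The table names, column names, and sample rows (`students`, `enrollments`, and so on) are assumptions chosen for illustration, not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# student_id is the primary key: it uniquely identifies each student record.
conn.execute(
    "CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# enrollments.student_id is a foreign key pointing back to students.student_id.
conn.execute(
    """CREATE TABLE enrollments (
           enrollment_id INTEGER PRIMARY KEY,
           student_id INTEGER NOT NULL REFERENCES students(student_id),
           course TEXT NOT NULL)"""
)

conn.execute("INSERT INTO students VALUES (1, 'Ada')")
conn.execute("INSERT INTO enrollments VALUES (10, 1, 'Databases')")

# The foreign key lets the two tables be joined into one result.
rows = conn.execute(
    """SELECT s.name, e.course
       FROM students AS s
       JOIN enrollments AS e ON e.student_id = s.student_id"""
).fetchall()

# An enrollment that points at a nonexistent student is rejected.
try:
    conn.execute("INSERT INTO enrollments VALUES (11, 99, 'Statistics')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The `CREATE TABLE` statements together are also a small example of a schema: they define the tables, fields, keys, and constraints before any data is stored.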
Schema
The organization or layout of a database that defines tables, fields, constraints, keys, and integrity; it serves as the database blueprint.
Big Data
Large and expansive collected data sets from disparate sources including smartphone metadata, internet usage records, and social media activity.
Volume
One of the 4 Vs of big data referring to the sheer amount of data that requires significant resources to hold and manage.
Variety
One of the 4 Vs of big data referring to data coming from both structured and unstructured areas and in various forms.
Veracity
One of the 4 Vs of big data referring to the quality and trustworthiness of the data, and whether it accurately represents what it is believed to represent.
Velocity
One of the 4 Vs of big data referring to the accelerating speed of data production over a given time period.
Data Mining
The examination of huge sets of data to find patterns, connections, outliers, and hidden relationships to make informed business decisions; also called data discovery.
Structured Data
Data residing in fixed formats that are typically well-labeled with traditional fields and records, making them easily searchable and queryable.
Unstructured Data
Unorganized data that cannot be easily read by a computer because it is not in rows and columns, such as video, audio, and social media posts; it is commonly estimated to account for about 80% of all data.
Semi-structured Data
Data that falls between structured and unstructured, containing both consistent structured elements (like tags or headers) and unstructured content, such as email and HTML.
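As a rough illustration of semi-structured data, the sketch below parses an email with Python's standard `email` module. The message itself is invented for the example. The headers behave like labeled fields, while the body is free text with no fixed format.

```python
from email import message_from_string

# Hypothetical email: structured headers, a blank line, then an unstructured body.
raw_email = """From: ana@example.com
To: ben@example.com
Subject: Q3 training results

Hi Ben, scores look good overall. A few people still need the safety module.
"""

msg = message_from_string(raw_email)

# Structured part: headers can be looked up by name, like fields in a record.
sender = msg["From"]
subject = msg["Subject"]

# Unstructured part: the body is plain prose that needs further processing.
body = msg.get_payload()

print(sender, "|", subject)
print(body)
```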
Data Warehouse
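A digital repository that consolidates disparate data in a central location for enterprise-wide needs; the lecture describes warehouses as holding up to yottabytes of data, though real deployments are typically measured in terabytes to petabytes.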
Data Mart
A smaller data storage system designed to support the specific needs of a single department, such as sales or human resources.
Yottabytes
A unit of data storage equivalent to 1 trillion terabytes; each terabyte is 1,000 gigabytes.
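The unit arithmetic can be checked directly. This sketch assumes decimal (SI) units, where each prefix step is a power of 1,000.

```python
# Decimal (SI) storage units expressed in bytes.
GIGABYTE = 10**9
TERABYTE = 10**12
YOTTABYTE = 10**24

# How many of the smaller unit fit in the larger one.
gigabytes_per_terabyte = TERABYTE // GIGABYTE
terabytes_per_yottabyte = YOTTABYTE // TERABYTE

print(f"{gigabytes_per_terabyte:,} gigabytes per terabyte")
print(f"{terabytes_per_yottabyte:,} terabytes per yottabyte")
```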
ETL
An acronym for extract, transform, and load; it describes tools used to standardize data across systems and prepare them for querying in a warehouse or mart.
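A toy sketch of the three ETL stages, using only the Python standard library. Real ETL tools are far more elaborate; the CSV content, the function names (`extract`, `transform`, `load`), and the `training` table are all hypothetical and serve only to show how inconsistent source data gets standardized before it is queried.

```python
import csv
import io
import sqlite3

# Hypothetical source data with inconsistent spacing and capitalization.
RAW_CSV = """employee,department,score
 alice ,Sales,88
BOB,sales,92
carol,HR,75
"""

def extract(text):
    """Extract: read the raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace, standardize case, convert scores to integers."""
    return [
        (r["employee"].strip().title(), r["department"].strip().upper(), int(r["score"]))
        for r in rows
    ]

def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS training (employee TEXT, department TEXT, score INTEGER)"
    )
    conn.executemany("INSERT INTO training VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stands in for a warehouse or data mart
load(transform(extract(RAW_CSV)), conn)

# Once standardized, "Sales" and "sales" are counted as the same department.
result = conn.execute(
    "SELECT department, AVG(score) FROM training GROUP BY department ORDER BY department"
).fetchall()
print(result)
```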
Hadoop
An infrastructure for storing and processing large sets of data across multiple servers using a distributed file system, designed for unstructured and semi-structured data.
Structured Query Language (SQL)
The most widely used standard computer language for relational databases, allowing programmers to manipulate and query data.
Tableau
A business analytics platform that produces interactive data visualizations, such as graphs and charts, to simplify raw data into information.
Business Intelligence (BI)
A broad range of tools and practices used to extract, analyze, and report information to assist in making critical strategic business decisions and predictions.
Data Governance
The management of data policies, procedures, access control, backup and recovery, and classification standards to ensure data remains confidential and safe.