ALL QUIZ BI


72 Terms

1

data unit

In a database, a ___ is called a record (also known as a row or tuple): a single, structured data item that is stored in a table.

2

dataset

A ____ is a collection of related data units and information that is composed of separate elements but can be manipulated as a unit by a computer. A ____ is normally presented in tabular form.

3

Data item

It is the equivalent of a column in a spreadsheet, while a dataset is the equivalent of a worksheet.

4

dataset

A ____ is a collection of related data units and information that is composed of separate elements but can be manipulated as a unit by a computer. This set is normally presented in tabular form. It is also known as a table in a database management system.

5

Data Warehousing

It integrates data and information collected from various sources into one comprehensive database.

6

ETL (Extract, Transform, Load)

It is the process of extracting data from various sources, transforming it into a consistent format, and loading it into a data warehouse or another storage system for analysis. ___ is a process used in data integration and data warehousing.
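The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CSV text, the `orders` table, and the column names are made up, and an in-memory SQLite database stands in for a data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (names and values are made up).
RAW_CSV = """order_id,amount,country
1, 19.99 ,us
2,5.00,PH
3, 12.50 ,ph
"""

def extract(text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: coerce types and standardize formats."""
    return [
        (int(r["order_id"]), float(r["amount"].strip()), r["country"].strip().upper())
        for r in rows
    ]

def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stands in for a data warehouse
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT order_id, amount, country FROM orders ORDER BY order_id"
).fetchall()
```

After the run, `loaded` holds the three orders with numeric amounts and upper-case country codes.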

7

Data Lake

It is a centralized repository that allows organizations to store all their structured and unstructured data at any scale.

8

Data visualization

It is the graphical representation of data to facilitate understanding, analysis, and interpretation. It involves presenting data in visual formats such as charts, graphs, maps, and dashboards to communicate complex information clearly and effectively.

9

Data mining

It is the process of searching and analyzing a large batch of raw data in order to identify patterns and extract useful information.

10

Machine learning

It is a branch of artificial intelligence (AI) and computer science which focuses on the development of algorithms and statistical models that enable computers to learn and improve their performance on a specific task without being explicitly programmed.

11

Big data

It is a combination of structured, semi-structured and unstructured data collected by organizations that can be mined for information and used in machine learning projects, predictive modeling and other advanced analytics applications.

12

Data Analytics

____ is a wider concept of which data analysis is just a part. ____ is the broad field of using data and tools to make business decisions. Data analysis, a subset of ____, refers to specific actions.

13

Data science

It combines mathematics and statistics, scientific methods, algorithms, specialized programming (information technology), advanced analytics, artificial intelligence (AI), and machine learning (ML) with specific subject matter expertise to uncover actionable insights hidden in an organization's data.

14

Deep learning

It is a subset of machine learning, which itself is a subset of artificial intelligence (AI). It involves using neural networks with many layers (hence "deep") to model and understand complex patterns in data.

15

Artificial intelligence (AI)

It is the simulation of human intelligence processes by machines, especially computer systems.

16

Internet of Things (IoT)

The ____ describes the network of physical objects ("things") that are embedded with sensors, software, and other technologies for the purpose of connecting and exchanging data with other devices and systems over the internet.

17

ChatGPT

It is an artificial intelligence (AI) chatbot that uses natural language processing to create humanlike conversational dialogue.

18

Spreadsheet

It is a computer application or program used to organize, display, analyze, compute, manipulate, and store data in a tabular format, typically presented in rows and columns. The most popular spreadsheet applications today are Microsoft Excel, Google Sheets, Apple Numbers, and LibreOffice Calc.

19

Database Management System

Often abbreviated as DBMS, this is a software system that enables users to define, create, maintain, manipulate, and manage databases. Some of the most popular large-scale _____(s) are Microsoft SQL Server, MySQL, Oracle Database, Teradata, IBM Db2 (mainframe), and Adabas (mainframe).

20

Databases

These store structured data in a format optimized for efficient storage, retrieval, and manipulation. Structured data is typically stored in tabular form and managed by a relational database management system (RDBMS).

21

Data Visualization Tools

These are software applications or platforms that allow users to create visual representations of data. Some popular data visualization tools are Microsoft Power BI, Tableau, Google Data Studio (now Looker Studio), and QlikView.

22

Various approaches to data analytics include

23

a. looking at what happened (descriptive analytics)

24

b. why something happened (diagnostic analytics)

25

c. what is going to happen (predictive analytics), or

26

d. what should be done next (prescriptive analytics).

Various approaches to data analytics include

27

a. looking at what happened (?)

28

b. why something happened (?)

29

c. what is going to happen (?), or

30

d. what should be done next (?).

31

Descriptive analytics

It involves analyzing historical data to understand and examine what happened in the past. It focuses on summarizing and visualizing data to provide insights into trends, patterns, and relationships.
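As a small illustration of summarizing historical data, the sketch below computes a few descriptive statistics with Python's standard `statistics` module. The monthly sales figures are hypothetical.

```python
import statistics

# Hypothetical monthly sales history.
monthly_sales = [120, 135, 128, 150, 170, 165]

summary = {
    "total": sum(monthly_sales),
    "mean": statistics.mean(monthly_sales),
    "median": statistics.median(monthly_sales),
    # Position (0-based) of the month with the highest sales.
    "best_month_index": monthly_sales.index(max(monthly_sales)),
}
```

Each entry in `summary` answers a "what happened?" question about the past period; none of them explains causes or forecasts the future.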

32

Diagnostic analytics

It helps explain why things happened the way they did. It's a more complex version of descriptive analytics, extending beyond what happened to why it happened.

33

Diagnostic analytics

It involves digging deeper into historical data to understand why certain events occurred.

34

Predictive analytics

It aims to predict likely outcomes and make educated forecasts using historical data. Simply put, it seeks to answer the question, "What will happen?" _____ uses probabilities instead of simply interpreting existing facts.

35

Prescriptive analytics

It is the use of advanced processes and tools to analyze data and content to recommend the optimal course of action or strategy moving forward. _____ is the most advanced of the four types of data analytics.

36

Predicting future trends

37

Optimizing business operations

38

Enhancing decision-making

39

Transforming raw data

The primary goals of data analytics are:

40

Cloud Computing

It is the delivery of different services through the Internet. These resources include tools and applications like data storage, servers, databases, networking, and software. Database as a Service (DBaaS) is a _____ service model that provides users with access to managed database services over the internet. Some of the leading cloud service providers are Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).

41

Business Intelligence

It is about descriptive and diagnostic analytics, while Business Analytics is about predictive and prescriptive analytics. ____ and Business Analytics (BA) are both subsets of Data Analytics.

42

Structured Query Language (SQL)

It is a standard language for managing and manipulating relational databases. Mastering ___ empowers data analysts to derive insights from large datasets and optimize the performance of data-related operations.
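A minimal sketch of the kind of query an analyst might run, using Python's built-in `sqlite3` module. The `sales` table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100.0), ("South", 250.0), ("North", 50.0)],
)

# Aggregate the detail rows into one total per region.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```

Here `totals` contains one row per region, which is the typical shape of a summary table that feeds a report or chart.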

43

Data quality

determines the usability and trustworthiness of data.

44

quality data

The characteristics of ____ are validity, accuracy, completeness, consistency, timeliness, relevance, and reliability.

45

Data quality

____ issues include incomplete, duplicate, outdated, insecure, inaccurate, incorrect, inconsistent, and outlier data.

46

Data Validity

refers to the degree to which your data conforms to defined business rules or constraints.
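Business rules can be written as small checks and applied to each record. The sketch below assumes two hypothetical rules (an age range and a very loose email test); real rules would come from the organization's own requirements.

```python
# Hypothetical business rules: age must be 0-120 and email must contain "@".
RULES = {
    "age": lambda v: isinstance(v, int) and 0 <= v <= 120,
    "email": lambda v: isinstance(v, str) and "@" in v,
}

def invalid_fields(record):
    """Return the names of fields that break a rule, in rule order."""
    return [name for name, check in RULES.items() if not check(record.get(name))]

records = [
    {"age": 34, "email": "ana@example.com"},
    {"age": -5, "email": "no-at-sign"},
]
report = [invalid_fields(r) for r in records]
```

The first record passes both rules, while the second is flagged for both `age` and `email`.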

47

Data Accuracy

ensures that your data is close to the true values.

48

Data consistency

Data completeness refers to the degree to which all required data is supplied and known. _____, by contrast, ensures your data is stable within the same data set and/or across multiple data sets. ____ occurs when aggregated data is reconciled with detailed data at lower levels of granularity.
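Reconciling aggregated data with its detail rows can be sketched as follows. The periods, amounts, and the `reconcile` helper are hypothetical.

```python
# Hypothetical detail rows and a separately stored monthly summary.
detail = [("2024-01", 120), ("2024-01", 80), ("2024-02", 50)]
summary = {"2024-01": 200, "2024-02": 55}

def reconcile(detail_rows, summary_totals):
    """Return the periods whose summary total disagrees with the detail rows."""
    recomputed = {}
    for period, amount in detail_rows:
        recomputed[period] = recomputed.get(period, 0) + amount
    return sorted(
        period
        for period in set(recomputed) | set(summary_totals)
        if recomputed.get(period) != summary_totals.get(period)
    )

mismatches = reconcile(detail, summary)
```

Only `2024-02` is reported, because its stored total (55) does not match the sum of its detail rows (50).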

49

Data Uniformity

refers to the degree to which data is specified using the same unit of measure.
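One way to enforce a single unit of measure is to convert every value on entry. This sketch assumes weights recorded in kilograms, grams, or pounds; the unit labels and sample values are illustrative.

```python
# Conversion factors to kilograms; the unit labels are illustrative assumptions.
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}

def to_kilograms(value, unit):
    """Convert a weight to kilograms so every record uses one unit of measure."""
    return value * TO_KG[unit.lower()]

weights = [(2.0, "kg"), (500, "g"), (1, "LB")]
uniform = [round(to_kilograms(v, u), 3) for v, u in weights]
```

After conversion, all three weights are expressed in kilograms and can be compared or summed directly.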

50

Data duplicate

also known as data redundancy, occurs when the same information is entered multiple times, sometimes in different formats. It can be avoided by implementing record validation checks within a program to ensure that a record does not already exist before it is added to a dataset or database.
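A record validation check of this kind might look like the sketch below. It assumes, purely for illustration, that an email address identifies a customer and that differences in case and surrounding spaces should be ignored.

```python
def normalize(email):
    """Reduce formatting differences so the same address always compares equal."""
    return email.strip().lower()

def add_record(dataset, record):
    """Add the record only if its normalized email is not already present."""
    key = normalize(record["email"])
    if key in dataset:
        return False  # duplicate: rejected
    dataset[key] = record
    return True

customers = {}
results = [
    add_record(customers, {"name": "Ana", "email": "ana@example.com"}),
    add_record(customers, {"name": "Ana R.", "email": " ANA@example.com "}),
]
```

The second call is rejected even though the address is typed in a different format, so `customers` ends up with a single record.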

51

Outlier data

refers to observations or data points that differ significantly from the other values in your data set.
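The card does not name a detection method. One common rule of thumb, used here as an assumption, flags values that lie more than 1.5 times the interquartile range (IQR) below the first quartile or above the third. The readings are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

readings = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(readings)
```

With these readings the quartiles are 11 and 13, so the accepted range is 8 to 16 and only `95` is flagged.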

52

Insecure data

refers to sensitive data that is not encrypted or access-controlled.

53

Incomplete data

occurs when you don't have data stored for certain variables or data items.
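A quick way to measure incompleteness is to count missing values per data item. In this sketch a value counts as missing when it is absent, `None`, or an empty string; that definition and the sample rows are assumptions.

```python
def missing_counts(records, fields):
    """Count, per field, how many records have no value (None or empty string)."""
    return {
        field: sum(1 for r in records if r.get(field) in (None, ""))
        for field in fields
    }

rows = [
    {"name": "Ana", "phone": "555-0100"},
    {"name": "Ben", "phone": ""},
    {"name": None},
]
gaps = missing_counts(rows, ["name", "phone"])
```

Here `gaps` shows one missing name and two missing phone numbers.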

54

Incorrect data

can easily be prevented when data validation is in place.

55

Inconsistent data

occurs when there are multiple tables within a database that deal with the same data but may receive it from different inputs.

56

Inaccurate data

refers to data that contains errors and discrepancies that deviate from the true or expected values.

57

Constructive Transformation process

where a data item is added, copied, or replicated.

58

Destructive Transformation process

where data items or records are trimmed or deleted.

59

Aesthetic Transformation process

where certain values are standardized to meet requirements or parameters.

60

Structural transformation process

which includes columns being renamed, moved, and combined.
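The four transformation types (constructive, destructive, aesthetic, and structural) can be shown side by side on one small, made-up data set. The field names and the order of the steps are illustrative only.

```python
rows = [
    {"id": 1, "first": "ana", "last": "cruz", "country": "ph"},
    {"id": 2, "first": "ben", "last": "lim", "country": "PH"},
    {"id": None, "first": "", "last": "", "country": ""},
]

# Destructive: delete records that have no id.
rows = [r for r in rows if r["id"] is not None]

# Aesthetic: standardize values to one agreed format.
for r in rows:
    r["country"] = r["country"].upper()

# Structural: combine two columns into one and rename another.
rows = [
    {
        "customer_id": r["id"],
        "name": f'{r["first"]} {r["last"]}'.title(),
        "country": r["country"],
    }
    for r in rows
]

# Constructive: add a new data item derived from existing ones.
for r in rows:
    r["name_length"] = len(r["name"])
```

The result has two records, each with a renamed `customer_id`, a combined `name`, an upper-case `country`, and the added `name_length`.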

61

Data Cleaning

is also known as data cleansing and data scrubbing.

62

Garbage-In Garbage-Out or GIGO

simply means the quality of output is determined by the quality of the input.

63

data completeness

The ____ is likely to be achieved when you make the important fields mandatory in the data entry and data model.
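In a relational data model, one way to make a field mandatory is a `NOT NULL` constraint. The sketch below uses Python's built-in `sqlite3` module; the `customers` table and the `try_insert` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL marks the important fields as mandatory in the data model.
conn.execute(
    "CREATE TABLE customers (name TEXT NOT NULL, email TEXT NOT NULL, notes TEXT)"
)

def try_insert(name, email, notes=None):
    """Return True if the row was stored, False if a mandatory field was missing."""
    try:
        conn.execute("INSERT INTO customers VALUES (?, ?, ?)", (name, email, notes))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Ana", "ana@example.com")
rejected = try_insert("Ben", None)
stored = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

The row without an email is refused by the database itself, so incomplete records never reach the table, while the optional `notes` field may stay empty.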

64

Data timeliness

refers to data that is available when it is required.

65

discovery stage

At the ________, data teams work to understand, identify, and find all applicable raw data and data types that need to be transformed. Data discovery includes identifying and understanding data in its original source format with the help of data profiling tools.

66

data mapping stage

At the ________ , data teams determine how individual fields are matched, filtered, joined, modified, and aggregated.

67

extraction stage

At the _______, data teams move data from its source system into the staging areas.

68

code generation and execution stage

At the _____ and _____, data teams generate and execute programs or code, written in a programming language, based on the mapping process.

69

review output stage

At the _______, the transformed data is evaluated by the data teams to ensure the conversion has had the desired results in terms of the format of the data.

70

target stage

The send to _________ involves sending the transformed data to its target destination.

71

Data profiling

involves identifying patterns and inconsistencies in data. _______ helps identify data quality issues and assess the overall quality of the data.
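A very small profile of a single field might report how many values are missing, how many are distinct, and which data types appear. The `profile` helper and the sample rows below are hypothetical; dedicated profiling tools report far more.

```python
from collections import Counter

def profile(records, field):
    """Summarize one field: row count, missing values, distinct values, type mix."""
    values = [r.get(field) for r in records]
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "types": dict(Counter(type(v).__name__ for v in present)),
    }

rows = [{"age": 34}, {"age": "34"}, {"age": None}, {"age": 51}]
age_profile = profile(rows, "age")
```

The mixed `int` and `str` types in `age_profile["types"]` point to an inconsistency worth fixing before analysis.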

72

Data set

The _________ must be updated or refreshed to replace the obsolete data with the newer data.
