ISDS 351 Ch 6


39 Terms

1

Big data

Data collections that are so enormous (terabytes or more) and complex (from sensor data to social media data) that traditional data management software, hardware, and analysis processes are incapable of dealing with them.

2

Characteristics of Big Data

  • Volume: size of data

  • Velocity: the rate at which new data is being generated

  • Value: the worth of the data in decision making

  • Variety: the range of data types, structured vs. unstructured

  • Veracity: a measure of the quality of the data

3

Sources of Big Data

[Image: sources of big data]
4

Challenges of Big Data

With so much data readily available, business users can have a hard time:

  • Finding the information they need to make decisions.

  • Trusting the validity of the data they can access.

5

Data warehouse

A large database that holds business information from many sources in the enterprise, covering all aspects of the company’s processes, products, and customers.

6

Characteristics of a data warehouse

  • Large: holds billions of records and petabytes of data

  • Multiple sources: data comes from many sources

  • Historical: typically 5+ years of data

  • Cross organizational access and analysis: data accessed, used, and analyzed by users across the organization to support multiple business processes and decision making

  • Supports various types of analyses and reporting: drill down analysis, development of metrics, identification of trends

7

Extract transform load (ETL) process

A data handling process that takes data from a variety of sources, edits and transforms it into the format used in the data warehouse, and then loads this data into the warehouse.
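
The three ETL steps can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the two source lists, their field names, and the `customer_spend` table are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source records from two systems with inconsistent formats.
crm_rows = [{"name": " Ada Lovelace ", "spend": "1,200.50"}]
web_rows = [{"customer": "alan turing", "total_usd": 300}]


def extract():
    """Extract: gather raw records from each source."""
    return crm_rows, web_rows


def transform(crm, web):
    """Transform: edit both sources into one warehouse format (name, spend)."""
    cleaned = []
    for row in crm:
        spend = float(row["spend"].replace(",", ""))
        cleaned.append((row["name"].strip().title(), spend))
    for row in web:
        cleaned.append((row["customer"].strip().title(), float(row["total_usd"])))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS customer_spend (name TEXT, spend REAL)")
    conn.executemany("INSERT INTO customer_spend VALUES (?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
load(transform(*extract()), warehouse)
result = warehouse.execute(
    "SELECT name, spend FROM customer_spend ORDER BY name"
).fetchall()
```

After the run, `result` holds both customers in one consistent format, regardless of which source they came from.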

8

Data mart

A subset of a data warehouse that is used by small and medium-sized businesses and departments within large companies to support decision making.

9

Data lake

A “store everything” approach to big data that saves all the data in its raw and unaltered form.

10

NoSQL database

A way to store and retrieve data that is modeled using some means other than the simple two-dimensional tabular relations used in relational databases.

  • More flexible than relational database tables

  • Provide improved access speed and redundancy

11

Categories of NoSQL Databases

  1. Key-value: two columns (“key” and “value”)

  2. Document: Store, retrieve, and manage document-oriented information

  3. Graph: Well-suited for analyzing interconnections

  4. Column: store data in columns
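
To make the four categories concrete, the sketch below models each one with plain Python data structures. It only illustrates the shape of the data in each model; the records and the `friends_of_friends` helper are made up for illustration and do not reflect any particular NoSQL product's API.

```python
# Key-value: a key maps to an opaque value.
key_value_store = {"session:42": "user=ada;cart=3"}

# Document: each record is a self-describing document; fields may differ.
document_store = [
    {"_id": 1, "name": "Ada", "tags": ["vip"]},
    {"_id": 2, "name": "Alan", "address": {"city": "London"}},
]

# Graph: nodes plus edges, suited to questions about interconnections.
graph_store = {"Ada": ["Alan"], "Alan": ["Ada", "Grace"], "Grace": ["Alan"]}

# Column: values are grouped by column rather than by row.
column_store = {"name": ["Ada", "Alan"], "spend": [1200.5, 300.0]}


def friends_of_friends(graph, person):
    """Return people two hops from `person`, excluding direct links."""
    direct = set(graph[person])
    two_hops = {fof for friend in direct for fof in graph[friend]}
    return two_hops - direct - {person}


# A column layout makes whole-column aggregates a single pass over one list.
total_spend = sum(column_store["spend"])
```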

12

Hadoop

An open-source software framework including several software modules that provide a means for storing and processing extremely large data sets.

  • Limitation: can only perform batch processing

13

Hadoop Distributed File System

A system used for data storage that divides the data into subsets and distributes the subsets onto different servers for processing.
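
The divide-and-distribute idea can be sketched as follows. This is a toy model only: the function names, the tiny block size, and the round-robin placement are assumptions for illustration, and real HDFS also replicates each block across servers, which is omitted here.

```python
def split_into_blocks(data, block_size):
    """Divide the data into fixed-size subsets (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def distribute(blocks, servers):
    """Assign blocks to servers in round-robin order (assumed placement rule)."""
    placement = {server: [] for server in servers}
    for index, block in enumerate(blocks):
        placement[servers[index % len(servers)]].append(block)
    return placement


blocks = split_into_blocks("abcdefghij", block_size=4)
placement = distribute(blocks, ["node1", "node2"])
```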

14

MapReduce program

A composite program that consists of a Map procedure that performs filtering and sorting and a Reduce method that performs a summary operation.
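
A word count is the usual way to illustrate the Map and Reduce steps. The sketch below runs in a single process, so it shows only the programming pattern, not the distributed execution Hadoop provides; the function names are illustrative.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in the document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group the emitted values by key and sort the keys."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Reduce: summarize each key's values, here by summing the counts."""
    return {key: sum(values) for key, values in grouped.items()}


documents = ["big data big insight", "data lake"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle(pairs))
```

In a real cluster, each Map call would run on the server that holds its share of the data, and only the grouped pairs would travel across the network to the Reduce step.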

15

In-memory database

A database management system that stores the entire database in random access memory (RAM).

  • Faster access to data.

  • Enable the analysis of big data and other challenging data-processing applications

  • Feasible
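
As a small-scale illustration, Python's built-in `sqlite3` module can hold a whole database in RAM when it is opened with the special name `":memory:"`. The `sales` table and its rows are made up for this sketch, and SQLite is only a stand-in for the commercial in-memory systems the card has in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the database lives entirely in RAM
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("west", 120.0), ("east", 80.0), ("west", 30.0)],
)

# Each result row is a (region, total) pair, so it converts directly to a dict.
totals = dict(conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"))
conn.close()  # closing the connection discards the in-memory data
```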

16

Business Intelligence (BI)

A wide range of applications, practices, and technologies for the extraction, transformation, integration, visualization, analysis, interpretation, and presentation of data to support improved decision making.

17

Analytics

The extensive use of data and quantitative analysis to support fact-based decision making within organizations.

18

Benefits of BI and Analytics

  • Detect fraud

  • Improve forecasting

  • Increase sales

  • Optimize operations

  • Reduce costs

19

Data scientist

An individual who combines strong business acumen, a deep understanding of analytics, and a healthy appreciation of the limitations of data, tools, and techniques to deliver real improvements in decision making.

20

Components required for effective BI and Analytics

  • Existence of a solid data management program

  • Creative data scientists

  • Strong commitment to data-driven decision making

21

BI and Analytics Tools

[Image: BI and analytics tools]
22

Descriptive analysis

A preliminary data processing stage used to identify patterns in the data and answer questions about who, what, where, when, and to what extent.

Two types:

  1. Visual analytics

  2. Regression analysis

23

Visual analytics

The presentation of data in a pictorial or graphical format.

24

Word cloud

A visual depiction of a set of words that have been grouped together because of the frequency of their occurrence.
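
The data behind a word cloud is a frequency count that is scaled to a display size. The sketch below computes that part only; drawing the cloud is left out, and the sample text, stop-word list, and point sizes are assumptions.

```python
from collections import Counter

text = "data drives decisions and data drives value"
stop_words = {"and"}  # assumed list of words to ignore

counts = Counter(w for w in text.lower().split() if w not in stop_words)


def font_size(count, max_count, smallest=10, largest=40):
    """Scale a word's frequency linearly to a font size in points."""
    return smallest + (largest - smallest) * count // max_count


max_count = max(counts.values())
sizes = {word: font_size(n, max_count) for word, n in counts.items()}
```

Words that occur more often get a larger size, which is what makes them stand out in the rendered cloud.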

25

Conversion funnel

A graphical representation that summarizes the steps a consumer takes in making the decision to buy your product and become a customer.
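
The numbers behind a funnel chart are the share of consumers who move from each step to the next. The stage names and counts below are invented for illustration.

```python
# Hypothetical counts of consumers reaching each step.
funnel = [
    ("visited site", 1000),
    ("viewed product", 400),
    ("added to cart", 100),
    ("purchased", 25),
]


def step_rates(stages):
    """Return (stage name, share of the previous stage that reached it)."""
    return [
        (name, count / prev_count)
        for (_, prev_count), (name, count) in zip(stages, stages[1:])
    ]


rates = step_rates(funnel)
overall = funnel[-1][1] / funnel[0][1]  # share of visitors who became customers
```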

26

Regression analysis

A method for determining the relationship between a dependent variable and one or more independent variables.
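
For one independent variable, the relationship can be estimated with ordinary least squares. The sketch below fits a straight line to made-up advertising and sales figures; the variable names and data are assumptions, and a statistics library would normally be used in practice.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


ad_spend = [1, 2, 3, 4]  # independent variable (hypothetical)
sales = [3, 5, 7, 9]     # dependent variable (hypothetical)
slope, intercept = fit_line(ad_spend, sales)
```

Here the data lie exactly on a line, so the fit recovers a slope of 2 and an intercept of 1.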

27

Predictive analysis

A set of techniques used to analyze current data to identify future probabilities and trends, as well as make predictions about the future.

28

Time series analysis

The use of statistical methods to analyze time series data and determine useful statistics and characteristics about the data.
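
One of the simplest such statistics is a moving average, which smooths short-term fluctuations so the trend is easier to see. The monthly figures below are invented, and the window length of 3 is an arbitrary choice.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive observations."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


monthly_sales = [10, 12, 14, 13, 15, 17]  # hypothetical time series
smoothed = moving_average(monthly_sales, 3)
```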

29

Data mining

A BI analytics tool used to explore large amounts of data for hidden patterns to predict future trends and behaviors for use in decision making.

30

Cross-Industry Standard Process for Data Mining (CRISP-DM)

A six-phase structured approach for the planning and execution of a data mining project.

31

Genetic algorithm

A technique that employs a natural selection-like process to find approximate solutions to optimization and search problems
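
A toy version of the selection, crossover, and mutation cycle is sketched below. The objective (count the 1 bits in a bit string), the population size, and the mutation rate are all assumptions chosen to keep the example small; real applications use problem-specific encodings and fitness functions.

```python
import random


def fitness(bits):
    """Toy objective: the number of 1 bits (higher is better)."""
    return sum(bits)


def evolve(pop_size=20, length=12, generations=40, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)
    ]
    initial_best = max(map(fitness, population))
    for _ in range(generations):
        # Selection: the fitter half survives unchanged and becomes the parents.
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # single-point crossover
            child = mom[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness), initial_best


best, initial_best = evolve()
```

Because the fittest individuals are carried over unchanged, the best solution never gets worse from one generation to the next, although the result is still only an approximation.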

32

Linear programming

A technique for finding the optimum value (largest or smallest, depending on the problem) of a linear expression (called the objective function) that is calculated based on the value of a set of decision variables that are subject to a set of constraints.
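
A small two-variable problem shows how the pieces fit together. The sketch below maximizes a made-up objective, 3x + 5y, subject to made-up constraints, by checking every corner of the feasible region. This brute-force approach relies on the fact that a bounded linear program attains its optimum at a corner point; it is only practical for tiny problems, and real solvers use methods such as simplex.

```python
from itertools import combinations

# Constraints in the form a*x + b*y <= c (non-negativity included).
constraints = [(1, 0, 4), (0, 2, 12), (3, 2, 18), (-1, 0, 0), (0, -1, 0)]


def objective(x, y):
    """Objective function to maximize (hypothetical profit)."""
    return 3 * x + 5 * y


def feasible(x, y, tol=1e-9):
    """True when the decision variables satisfy every constraint."""
    return all(a * x + b * y <= c + tol for a, b, c in constraints)


def corner_points():
    """Intersect every pair of constraint boundaries; keep feasible points."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:  # parallel boundaries never intersect
            continue
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            points.append((x, y))
    return points


best_point = max(corner_points(), key=lambda p: objective(*p))
best_value = objective(*best_point)
```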

33

Computer simulation

Using a model expressed in the form of a computer program to emulate the dynamic responses of a real-world system to various inputs.

34

Scenario analysis

A process for predicting future values based on certain potential events.
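
A common form is to evaluate the same calculation under a few named sets of assumptions. The three scenarios and their figures below are invented for illustration.

```python
# Hypothetical assumptions for each potential course of events.
scenarios = {
    "worst": {"units": 800, "price": 9.0},
    "base": {"units": 1000, "price": 10.0},
    "best": {"units": 1200, "price": 11.0},
}

# Predicted revenue under each scenario.
revenue = {name: s["units"] * s["price"] for name, s in scenarios.items()}
```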

35

Monte Carlo simulation

A simulation that enables you to see a spectrum of thousands of possible outcomes, considering not only the many variables involved, but also the range of potential values for each of those variables.
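
The idea can be sketched by drawing each uncertain variable from a range of values and repeating the calculation many times. The profit model, the ranges, and the choice of uniform distributions below are all assumptions made for illustration.

```python
import random


def simulate_profit(rng):
    """One trial: draw each uncertain input, then compute the profit."""
    units = rng.randint(800, 1200)     # units sold
    price = rng.uniform(9.0, 11.0)     # selling price per unit
    unit_cost = rng.uniform(6.0, 8.0)  # cost per unit
    return units * (price - unit_cost)


rng = random.Random(42)  # fixed seed so the run is repeatable
outcomes = sorted(simulate_profit(rng) for _ in range(10_000))

mean_profit = sum(outcomes) / len(outcomes)
p5, p95 = outcomes[500], outcomes[9500]  # approximate 5th and 95th percentiles
```

The sorted list of outcomes is the spectrum the card describes: instead of a single forecast, it shows how likely profits near the low and high ends are.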

36

Text analysis

A process for extracting value from large quantities of unstructured text data

37

Video analysis

The process of obtaining information or insights from video footage.

38

Self-service analytics

Training, techniques, and processes that empower end users to work independently to access data from approved sources to perform their own analyses using an endorsed set of tools.

39

Advantages of self-service analytics

  • Gets valuable data into the hands of end users

  • Encourages fact-based decision making

  • Accelerates decision making

  • Provides a solution to the shortage of data scientists
