Business Analytics: Chapter 13

20 Terms

1

big data

describes the explosion of data generation, storage and usage in recent years

2

Seven V's of big data

1. value

2. volume

3. variety

4. velocity

5. veracity

6. variability

7. volatility

3

volume

the amount of data collected and measured in increasing orders of magnitude (gigabytes, terabytes, petabytes)

4

variety

refers to both the source and the form of data

5

velocity

the speed at which data are generated by and collected from source systems

- how quickly data can be processed so as to provide a feedback loop

6

variability

the changes in the meaning of data over time or in context

7

veracity

the reliability or truthfulness of data

8

volatility

describes the lifespan of data

9

value

providing insights or support for decisions

- the final and driving force of big data

10

Drivers of big data

1. the world is becoming increasingly digital

2. the world is becoming more connected

3. electronics around the world are becoming more economical

4. the digital world has revolutionized communication and community

11

Apache Hadoop

an open source software framework that supports distributed computing for very large datasets

12

MapReduce

a software programming model for processing large datasets

13

Map and reduce

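the input dataset is split into independent chunks that are processed by the "map tasks"; the map outputs are then sorted and passed to the "reduce tasks", which combine them into the final result

- also known as map-shuffle-reduce or map-sort-reduce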

14

map tasks

assign the data chunks to computer nodes
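
The map, shuffle/sort, and reduce steps described in the cards above can be sketched as a word count. This is a minimal single-process illustration, not Hadoop itself: the function names (`map_task`, `shuffle`, `reduce_task`, `map_reduce`) and the sample chunks are hypothetical, and a real cluster would run the map tasks on separate nodes.

```python
from collections import defaultdict

def map_task(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(mapped_pairs):
    """Shuffle/sort: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_task(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

def map_reduce(chunks):
    # Each chunk is processed independently by its own map task.
    mapped = [pair for chunk in chunks for pair in map_task(chunk)]
    grouped = shuffle(mapped)
    return dict(reduce_task(k, v) for k, v in grouped.items())

# Hypothetical input, already split into independent chunks.
chunks = ["big data big value", "data velocity", "big volume"]
word_counts = map_reduce(chunks)
print(word_counts)
```

Because each map task only sees its own chunk, the map step can in principle run in parallel; only the shuffle step needs to bring values with the same key together.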

15

SAP HANA

a leading-edge technology that stores all relevant data in random access memory (RAM) rather than on a hard drive

16

In-memory databases

utilize several innovations in conjunction with RAM to achieve incredible improvements in database operations

- abbreviated IMDB

17

IMDB Innovations

1. Data are stored in memory

2. Columnar data storage

3. Indexing

4. Data compression

5. Parallel data processing

6. Partitioning data
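
Two of the innovations listed above, columnar storage and data compression, can be illustrated with a small sketch. It assumes a hypothetical `rows` table and uses run-length encoding as one example of a compression technique. It does not show how SAP HANA or any particular in-memory database is actually implemented.

```python
# Row-oriented layout: all fields of a record are stored together.
# The table contents are made up for illustration.
rows = [
    {"order_id": 1, "region": "East", "amount": 120.0},
    {"order_id": 2, "region": "East", "amount": 80.0},
    {"order_id": 3, "region": "West", "amount": 200.0},
    {"order_id": 4, "region": "West", "amount": 50.0},
]

def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    return {name: [r[name] for r in records] for name in records[0]}

def run_length_encode(values):
    """Compress consecutive repeated values into [value, count] pairs."""
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1
        else:
            encoded.append([v, 1])
    return encoded

columns = to_columns(rows)

# An aggregate over one attribute only has to scan that column.
total_amount = sum(columns["amount"])

# A column holds values of the same kind, so repeats compress well.
region_rle = run_length_encode(columns["region"])

print(total_amount)
print(region_rle)
```

Storing each column contiguously means an aggregate reads only the values it needs, and values of the same type sitting side by side tend to compress better than mixed row data.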

18

Real-time Analytics

involves processing big data almost instantaneously to provide feedback as quickly as possible

19
20

Complete syntax for SELECT statement

SELECT columnlist

FROM tablelist

[WHERE conditionlist]

[GROUP BY columnlist]

[HAVING conditionlist]

[ORDER BY columnlist [ASC | DESC]];
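
As a worked instance of the syntax above, the following sketch runs one query that uses every clause. It relies on Python's built-in `sqlite3` module with an in-memory database. The `sales` table, its columns, and its rows are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory SQLite database with a made-up sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "A", 100.0),
        ("East", "B", 250.0),
        ("West", "A", 40.0),
        ("West", "B", 30.0),
        ("North", "A", 500.0),
        ("North", "B", 5.0),
    ],
)

query = """
SELECT region, SUM(amount) AS total   -- columns to return
FROM sales                            -- source table
WHERE amount >= 10                    -- filter rows before grouping
GROUP BY region                       -- one group per region
HAVING SUM(amount) > 100              -- filter groups after aggregation
ORDER BY total DESC                   -- largest total first
"""
result = conn.execute(query).fetchall()
conn.close()

print(result)  # [('North', 500.0), ('East', 350.0)]
```

WHERE removes individual rows before grouping (the 5.0 sale), while HAVING removes whole groups afterwards (West, whose total is 70).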