Chapter 8 Cloud Computing: Big Data Analytics in the Cloud

Fall 2025

270 Terms

1

What is the primary goal of analytics?

To extract meaningful insights from data.

2

Why does raw data lack meaning on its own?

It is not yet processed or contextualized.

3

Which of the following best describes analytics?

A process of transforming raw data into useful information.

4

What is typically done to raw data during analytics?

Filtered, processed, and categorized

5

In analytics, why is contextualization important?

It provides meaning by relating data to its environment

6

What does organizing and structuring processed information allow a system to do?

Infer knowledge and improve efficiency

7

What makes systems “smarter” through analytics?

Learning from processed information to guide decisions

8

What organization identified the seven “giants” of massive data analysis?

National Research Council (NRC)

9

What is the main goal of the NRC’s “seven giants” characterization?

To provide a taxonomy of key computational tasks in data analysis

10

Which of the following is not one of the NRC’s seven “giants”?

Data Encryption

11

What do the seven “giants” represent?

Core computational tasks useful for massive data analysis

12

Which “giant” deals with operations like regression or mean and variance calculations?

Basic Statistics

13

Which “giant” focuses on computations involving distances or interactions between many elements?

Generalized N-Body Problems
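
As a rough illustration of why this "giant" is about interactions between many elements, the sketch below computes the distance between every pair of points. The function name `pairwise_distances` and the sample points are assumptions made for this example; the cost grows with the square of the number of points, which is what makes the task hard at massive scale.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Return {(i, j): Euclidean distance} for every unordered pair of points."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }

# Made-up 2-D points for illustration only.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
distances = pairwise_distances(points)

for pair, distance in distances.items():
    print(pair, distance)
```

With n points there are n(n-1)/2 pairs, so three points give three distances.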

14

Which “giant” involves solving systems of equations and performing matrix operations?

Linear Algebraic Computations

15

Which “giant” centers on finding best or most efficient solutions under constraints?

Optimization

16

Which “giant” deals with relationships and connections between nodes or entities?

Graph-Theoretic Computations

17

What do the seven “giants” share in common?

They are grouped by mathematical structure and computational strategy.

18

Which type of analytics answers the question “What happened?”

Descriptive Analytics

19

Which type of analytics focuses on “Why did it happen?”

Diagnostic Analytics

20

Which type of analytics predicts “What is likely to happen?”

Predictive Analytics

21

Which type of analytics addresses “What can we do to make it happen?”

Prescriptive Analytics

22

Basic statistics (mean, median, variance) are primarily used in which type of analytics?

Descriptive Analytics

23

Generalized N-Body problems, like clustering and similarity, are often used in:

Diagnostic and Predictive Analytics

24

Linear algebraic computations (e.g., PCA, regression) are most closely tied to which analytics types?

Predictive and Diagnostic Analytics

25

Graph-theoretic computations, such as shortest path or centrality, are often applied in:

Predictive and Diagnostic Analytics
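
The shortest-path computation mentioned in this card can be sketched with a breadth-first search on an unweighted graph. This is only one possible approach; the function name `shortest_path` and the toy graph are assumptions made for this example.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    previous = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

# Hypothetical directed graph stored as adjacency lists.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
path = shortest_path(graph, "A", "E")
print(path)
```

Breadth-first search finds a path with the fewest edges; weighted graphs would need a different algorithm, such as Dijkstra's.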

26

Optimization techniques (minimization, linear programming) are most relevant to:

Prescriptive Analytics

27

Integration methods like Bayesian inference and Monte Carlo simulations are mainly used for:

Predictive Analytics
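
To make the Monte Carlo idea concrete, here is a minimal sketch that estimates a definite integral by averaging the function at random sample points. The function name `monte_carlo_integral`, the sample count, and the fixed seed are assumptions chosen for this example.

```python
import random

def monte_carlo_integral(f, a, b, samples=100_000, seed=0):
    """Estimate the integral of f over [a, b] from uniformly sampled points."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(f(rng.uniform(a, b)) for _ in range(samples))
    return (b - a) * total / samples

# The exact value of the integral of x^2 over [0, 1] is 1/3.
estimate = monte_carlo_integral(lambda x: x * x, 0.0, 1.0)
print(estimate)
```

The estimate approaches 1/3 as the number of samples grows; the error shrinks roughly with the square root of the sample count.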

28

Alignment problems, such as matching text or image datasets, are associated with:

Predictive and Prescriptive Analytics

29

Which of the following correctly matches type to example?

Predictive → Simulation

30

What question does Descriptive Analytics aim to answer?

What has happened?

31

What question does Diagnostic Analytics aim to answer?

Why did it happen?

32

Which type of analytics summarizes past data for easier interpretation?

Descriptive Analytics

33

Which type of analytics seeks to find reasons behind past events?

Diagnostic Analytics

34

Computing counts, means, and percentages is an example of which analytics type?

Descriptive Analytics

35

Identifying the cause of a machine fault by analyzing past sensor data is an example of which analytics type?

Diagnostic Analytics

36

Which analytics type primarily uses statistical functions like maximum, minimum, and top-N?

Descriptive Analytics
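
The descriptive functions named in the cards above (counts, means, maximum, minimum, top-N) can be sketched with Python's standard library. The sensor readings below are made-up values for illustration.

```python
import statistics

# Hypothetical sensor readings.
readings = [12, 15, 11, 15, 19, 18]

summary = {
    "count": len(readings),
    "mean": statistics.mean(readings),
    "median": statistics.median(readings),
    "variance": statistics.pvariance(readings),  # population variance
    "max": max(readings),
    "min": min(readings),
    "top_3": sorted(readings, reverse=True)[:3],
}

for name, value in summary.items():
    print(name, value)
```

Each entry summarizes what has already happened in the data; none of them explains or predicts anything.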

37

Which analytics type uses pattern recognition from historical data to explain anomalies or failures?

Diagnostic Analytics

38

What question does Predictive Analytics aim to answer?

What is likely to happen?

39

What question does Prescriptive Analytics aim to answer?

What can we do to make it happen?

40

Which analytics type focuses on forecasting future events or outcomes?

Predictive Analytics

41

Which analytics type recommends the best actions to achieve desired outcomes?

Prescriptive Analytics

42

Predictive models learn from existing data to forecast outcomes using which types of models?

Classification and regression models

43

Which analytics type uses multiple prediction models to determine the best course of action?

Prescriptive Analytics

44

Training a model on historical data to forecast sales next quarter is an example of which analytics type?

Predictive Analytics
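
A minimal sketch of this idea, assuming a simple least-squares line is an adequate regression model: fit the line to past quarterly sales, then extrapolate one quarter ahead. The helper `fit_line` and the sales figures are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up historical data: quarter index and sales.
quarters = [1, 2, 3, 4]
sales = [100.0, 110.0, 120.0, 130.0]

slope, intercept = fit_line(quarters, sales)
forecast = slope * 5 + intercept  # predicted sales for quarter 5
print(forecast)
```

Real forecasting models are usually more elaborate, but the pattern is the same: learn parameters from existing data, then apply them to an unseen input.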

45
46

Evaluating different strategies to maximize profit based on predicted outcomes is an example of which analytics type?

Prescriptive Analytics
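
One way to picture this: score each candidate strategy with a prediction model, then recommend the strategy with the best predicted outcome. The linear demand model, unit cost, and candidate prices below are all hypothetical and exist only to make the sketch runnable.

```python
def predicted_profit(price, unit_cost=5.0):
    """Hypothetical prediction model: demand falls linearly as price rises."""
    demand = max(0.0, 1000.0 - 50.0 * price)
    return (price - unit_cost) * demand

# Candidate strategies: different prices to charge.
candidate_prices = [8.0, 10.0, 12.0, 14.0, 16.0]

# Prescriptive step: pick the action with the highest predicted profit.
best_price = max(candidate_prices, key=predicted_profit)
print(best_price, predicted_profit(best_price))
```

The prediction answers "what is likely to happen at each price"; the final `max` turns those predictions into a recommended action.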

47

What is Big Data primarily defined by?

Large volume, velocity, and variety of data that traditional tools cannot easily handle

48

Which company estimated that 2.5 quintillion bytes of data are created every day?

IBM

49

According to DOMO, about how many pieces of content are shared on Facebook every minute?

4.16 million

50

Approximately how many tweets are sent on Twitter every minute, according to DOMO?

300,000

51

How many photos are liked on Instagram every minute?

1.73 million

52

How much video content is uploaded to YouTube every minute?

300 hours

53

How many apps are downloaded by Apple users every minute?

51,000

54

How many new Skype calls are made every minute?

110,000

55

How many new visitors does Amazon receive every minute?

4,300

56

How many Uber rides are taken every minute?

694

57

How many hours of video are streamed by Netflix users every minute?

77,000

58

What does Big Data Analytics primarily deal with?

The collection, storage, processing, and analysis of massive-scale data

59

Which of the following is the correct sequence of steps in Big Data Analytics?

Data cleansing → Data munging (wrangling) → Data processing → Visualization

60

Which of the following describes data munging (or wrangling)?

Transforming and cleaning raw data into a usable format.

61

Why are specialized tools and frameworks required for Big Data Analytics?

Because traditional systems cannot efficiently handle large volume, high velocity, and diverse data types.

62

When is Big Data Analytics especially needed?

When data volume, velocity, or variety exceeds the limits of single-machine processing.

63

What is an example of velocity in Big Data?

Data that must be analyzed in real time

64

Which of the following best describes variety in Big Data?

Data that can be structured, unstructured, or semi-structured from multiple sources.

65

What does Volume in Big Data refer to?

The extremely large amount of data that cannot fit on a single machine.

66

Why are specialized tools and frameworks needed for high data Volume?

To store, process, and analyze data that exceeds single-machine capacity.

67

What does Velocity describe in the context of Big Data?

The speed at which data is generated and arrives for processing.

68

Which of the following is an example of high-velocity data?

Social media posts or sensor readings generated in real time.

69

What does Variety refer to in Big Data?

The different forms and formats of data, such as structured, unstructured, and semi-structured.

70

Which of the following best represents Variety in Big Data?

Text, images, audio, video, and sensor data.

71

Which of the 3Vs focuses on how fast data is produced?

Velocity

72

Which of the 3Vs focuses on the different types of data formats?

Variety

73

Which of the 3Vs focuses on the scale or size of data?

Volume

74

What does Veracity refer to in Big Data?

The accuracy and reliability of the data

75

Why is data cleaning important for Veracity?

It removes noise and errors to ensure accurate analysis.

76

What does Value represent in Big Data?

The usefulness of the data for its intended purpose.

77

What is the ultimate goal of Big Data Analytics?

To extract value from the data

78

Which of the following ensures that the insights derived from Big Data are trustworthy?

Veracity

79

Which of the following ensures that data contributes meaningfully to business or research objectives?

Value

80

If a dataset contains duplicate or incorrect records, which “V” does it affect most?

Veracity

81

If data provides actionable insights that improve decision-making, which “V” does it demonstrate?

Value

82

What is the first step in any analytics application?

Data Collection

83

What must happen before data can be analyzed?

It must be collected and ingested into a big data stack.

84

The choice of tools for data collection depends primarily on what factors?

The source and type of data being ingested.

85

What is the main goal of Data Preparation?

To clean and organize data before processing.

86

Which of the following is a common issue addressed in data preparation?

Missing values or corrupt records.

87

What process removes duplicate entries from a dataset?

De-duplication

88

What term refers to transforming raw data into a usable format?

Data wrangling (or munging)

89

What does normalization in data preparation help ensure?

Consistent data formats and units across the dataset

90

What is the purpose of filtering during data preparation?

To remove irrelevant or unnecessary data points
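
The data preparation steps from the last few cards (handling missing values, de-duplication, normalization, filtering) can be sketched in one small function. The record layout, the `prepare` helper, and the plausible temperature range are assumptions made for this example.

```python
# Hypothetical raw records with typical quality problems.
raw_records = [
    {"id": 1, "temp": "21.5", "unit": "C"},
    {"id": 1, "temp": "21.5", "unit": "C"},   # duplicate entry
    {"id": 2, "temp": None, "unit": "C"},     # missing value
    {"id": 3, "temp": "71.6", "unit": "F"},   # different unit
    {"id": 4, "temp": "500.0", "unit": "C"},  # implausible reading
]

def prepare(records, low=-50.0, high=60.0):
    """Drop missing values and duplicates, normalize units, filter outliers."""
    seen_ids = set()
    cleaned = []
    for record in records:
        if record["temp"] is None:            # missing value
            continue
        if record["id"] in seen_ids:          # de-duplication
            continue
        seen_ids.add(record["id"])
        celsius = float(record["temp"])
        if record["unit"] == "F":             # normalization to Celsius
            celsius = (celsius - 32.0) * 5.0 / 9.0
        if not low <= celsius <= high:        # filtering
            continue
        cleaned.append({"id": record["id"], "temp_c": round(celsius, 1)})
    return cleaned

prepared = prepare(raw_records)
print(prepared)
```

Only records 1 and 3 survive, both expressed in the same unit, which is the state the data should be in before analysis starts.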

91

What is the next step after data preparation in the analytics flow?

Determine the analysis type

92

Which of the following are the four main types of data analysis?

Descriptive, Diagnostic, Predictive, and Prescriptive

93

What comes after selecting the analysis type for an application?

Determine the analysis mode

94

Which of the following are common analysis modes?

Batch, Real-time, and Interactive

95

What does the choice of analysis mode depend on?

The requirements of the application

96

In which analysis mode is data processed periodically in large groups?

Batch mode

97

In which analysis mode is data processed instantly as it arrives?

Real-time mode
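
The difference between batch and real-time modes can be illustrated with a mean computed two ways: once over the whole collected group, and once updated incrementally as each value arrives. The class name `RunningMean` and the event values are assumptions for this sketch, not part of any particular framework.

```python
def batch_mean(values):
    """Batch mode: process the whole collected group at once."""
    return sum(values) / len(values)

class RunningMean:
    """Real-time mode: update the result as each value arrives."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean

# Made-up event values.
events = [4.0, 8.0, 6.0, 2.0]

stream = RunningMean()
running = [stream.update(value) for value in events]

print(running)             # one up-to-date answer per arriving event
print(batch_mean(events))  # one answer after all events are collected
```

Both approaches end at the same value; the real-time version simply has an answer available after every event instead of only at the end of the period.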

98

What determines the choice of visualization tools and frameworks?

The requirements of the application

99

Which of the following are types of data visualizations?

Static, Dynamic, and Interactive

100

What is the main purpose of visualizations in analytics?

To present data and insights in an understandable and meaningful way