Chapter 5 PP notes

Last updated 5:42 PM on 6/22/26

53 Terms

1

What are the four Vs of big data?

Data volume

Data velocity

Data variety

Data veracity

2

Who uses the term "big data," and why?

Companies use this term to describe the massive amounts of data they now capture, store, and analyze

3

What is data volume?

amount of data created and stored by an organization

4

What is data velocity?

speed at which data is created and stored

5

What is data variety?

different forms data can take

6

What is data veracity?

quality or trustworthiness of data

7

What is an analytics mindset?

a way of thinking that centers on the correct use of data and analysis for decision making

8

According to EY, an analytics mindset includes the ability to do what?

Ask the right questions

Extract, transform, and load relevant data

Apply appropriate data analytic techniques 

Interpret and share the results with stakeholders

9

Who said this quote: “the significant problems we face today cannot be solved at the same level of thinking we were at when we created them”?

Albert Einstein

10

A good data analytics question helps establish objectives that are…

SMART:

Specific

Measurable

Achievable

Relevant

Timely

11

What does specific mean in SMART?

needs to be direct and focused to produce a meaningful answer

12

In order to ask the right questions, what framework should be followed?

SMART

13

What does measurable mean in SMART?

must be amenable to data analysis; the inputs to answering the question must be measurable with data

14

What does achievable mean in SMART?

should be answerable, and the answer should cause a decision maker to take an action

15

What does relevant mean in SMART?

should relate to the objectives of the organization or the situation under consideration

16

What does timely mean in SMART?

must have a defined time horizon for answering

17

What is the ETL process and what does it stand for?

ETL stands for extract, transform, and load relevant data; it is often the most time-consuming part of the analytics mindset process

18

What are the three steps of the data extraction process?

Understand the data needs and the data available

perform the data extraction

verify the data extraction quality and document what you have done
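
The third step (verifying extraction quality) can be sketched in Python. This is only an illustration under stated assumptions: the records, the field names, and the `verify_extraction` helper are hypothetical, and comparing row counts and a control total is one common check, not necessarily the one the course has in mind.

```python
# Hypothetical source records and the result of an extract step.
source_records = [
    {"invoice_id": 1, "amount": 100.0},
    {"invoice_id": 2, "amount": 250.5},
    {"invoice_id": 3, "amount": 75.25},
]
extracted_records = [dict(r) for r in source_records]  # stand-in for a real extract


def verify_extraction(source, extracted, amount_field="amount"):
    """Compare row counts and a control total between source and extract."""
    source_total = sum(r[amount_field] for r in source)
    extracted_total = sum(r[amount_field] for r in extracted)
    return {
        "row_count_matches": len(source) == len(extracted),
        # Allow for tiny floating-point differences in the totals.
        "control_total_matches": abs(source_total - extracted_total) < 0.005,
    }


complete_check = verify_extraction(source_records, extracted_records)
# Simulate an extract that silently dropped the last record.
incomplete_check = verify_extraction(source_records, extracted_records[:-1])
```

Saving the check results alongside the extract is one simple way to cover the "document what you have done" part of the step.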

19

What are the different types of data organization?

Structured data

semi-structured data

unstructured data

20

What is structured data?

highly organized (e.g. accounting data)

21

What is semi-structured data?

has some organization but is not sufficiently structured to be inserted directly into a relational database (e.g., CSV file)

22

What is unstructured data?

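has no uniform structure; most publicly available data is of this type (e.g., images, tweets, text files)

cannot be meaningfully analyzed until it has been given some structure

often ends up as dark data: data that is collected and stored but never analyzed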

23

What are three alternative data storage structures?

Data warehouse

Data Mart

Data Lake

24

What is a data warehouse?

large database containing detailed and summarized data for a number of years; used for analysis rather than transaction processing

25

What is a data mart?

smaller data repository that holds structured data for a subset of an organization

  • Ex: an international company with data from different regions may want to organize each region's data separately

26

What is a data lake?

collection of structured, semi-structured, and unstructured data stored in a single location

typically the largest of the three structures

27

What are the four steps to transforming data?

  1. understand the data and the desired outcome (cleaning the data is extremely important)

  2. standardize, structure, and clean the data

  3. validate data quality and verify data meets data requirements

  4. document the transformation process
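
The standardize, clean, and validate steps above can be sketched in Python. This is a minimal illustration, not a prescribed method: the sample rows, the two date formats, and the rule that a missing amount rejects a row are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical raw rows, as they might arrive from an extract step.
raw_rows = [
    {"customer": " Acme Corp ", "amount": "1,200.50", "date": "03/15/2024"},
    {"customer": "acme corp", "amount": "300", "date": "2024-03-16"},
    {"customer": "Beta LLC", "amount": "", "date": "03/17/2024"},
]


def parse_date(text):
    """Standardize two assumed date formats to ISO 8601 (YYYY-MM-DD)."""
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def transform(rows):
    """Return (clean_rows, rejected_rows) after standardizing and validating."""
    clean, rejected = [], []
    for row in rows:
        # Validate: a row without an amount fails the (assumed) data requirement.
        if not row["amount"].strip():
            rejected.append(row)
            continue
        clean.append({
            # Standardize names: trim whitespace and use one letter case.
            "customer": row["customer"].strip().upper(),
            # Clean amounts: drop thousands separators, convert to a number.
            "amount": float(row["amount"].replace(",", "")),
            "date": parse_date(row["date"]),
        })
    return clean, rejected


clean_rows, rejected_rows = transform(raw_rows)
```

Keeping the rejected rows instead of silently dropping them supports step 4, since the documentation can state exactly what was excluded and why.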

28

What are important considerations when loading data?

The transformed data must be stored in a format and structure acceptable to the receiving software

Programs used for analysis may treat some data formats differently than expected. It is important to understand how the new program will interpret data formats.
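
A small Python sketch of the second point, using made-up data: a ZIP code that the receiving program treats as a number loses its leading zero, while the same value kept as text survives intact. The column names and values are hypothetical.

```python
import csv
import io

# Hypothetical transformed data, ready to be loaded.
csv_text = "zip_code,invoice_total\n02134,150.00\n84602,75.25\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Treating the ZIP code as a number silently drops the leading zero.
as_number = [int(r["zip_code"]) for r in rows]

# Keeping it as text preserves the original value.
as_text = [r["zip_code"] for r in rows]
```

Here `as_number` holds 2134 for the first row while `as_text` still holds "02134", which is the kind of unexpected format interpretation the card warns about.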

29

What is important to do once data is successfully loaded into the new program?

update or create a new data dictionary

30

What are four categories of data analytics?

Descriptive analytics

Diagnostic Analytics

Predictive Analytics

Prescriptive analytics

31

What is descriptive analytics?

info that results from the examination of data to understand the past; answers the question “what happened?”

32

What is diagnostic analytics?

builds on descriptive analytics and tries to answer the question “why did this happen?”

33

What is predictive analytics?

info that results from analyses that focus on predicting the future; answers the question “what might happen in the future?”

34

What is prescriptive analytics?

info that results from analyses to provide a recommendation of what should happen; answers the question “what should be done?”

35

What is a common way people interpret results incorrectly?

confusing correlation with causation

36

What is a second common misinterpretation of results?

systematic biases in how people interpret results, as documented in psychology research

37

What is correlation?

tells if two things happen at the same time

  • Ex: if you wear a purple shirt and it rains, does that mean it will rain every time you wear a purple shirt? No.
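
As a rough illustration, correlation between two measured quantities is often summarized with the Pearson correlation coefficient. The sketch below computes it by hand in Python; the two data series are invented for the example, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length number lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Invented data: weekly ice cream sales and weekly sunburn cases.
ice_cream_sales = [10, 20, 30, 40, 50]
sunburn_cases = [1, 2, 3, 4, 5]

r = pearson_r(ice_cream_sales, sunburn_cases)
```

A value of `r` near 1 says only that the two series move together. It does not show that ice cream causes sunburn; in this made-up scenario both would plausibly be driven by a third factor, hot weather.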

38

What is causation?

tells that the occurrence of one thing will cause the occurrence of a second thing

  • Ex: if you light something on fire in your house, it will cause the house to fill with smoke

39

Why can two people disagree when looking at the same data?

They interpret it differently

  • Example: Hertz rental car co. deciding between keeping an older stock of cars with higher maintenance costs and buying a newer fleet with lower maintenance costs

40

What do good principles of visualization design include?

Selecting the right type of visualization 

Presenting the data in a simplified manner 

Emphasizing important aspects of the data 

Representing the data in an ethical manner

41

What is automation?

the application of machines to automatically perform a task once performed by humans 

42

What is Robotic process automation (RPA)?

computer software that can be programmed to automatically perform tasks across applications just as human workers do

43

How can RPA be used?

to automate ETL tasks

44

Data analytics is NOT always the right tool to reach the best outcome. T/F?

True

45

Data can help us make better decisions, but we need to remember the importance of

Intuition

Expertise

Ethics

Other sources of knowledge that are not easy to quantify but can have a significant impact on performance

46

An accounting firm is trying to understand if its external audit fees are appropriate. They compute a regression using public data from all companies in their industry to understand the factors associated with higher audit fees. What type of analytics is this an example of?

Diagnostic analytics

47

A self-driving car company uses artificial intelligence to help clean its historic social media data so they can analyze trends

Descriptive (the mention of AI is a foil, i.e., a distractor)

48

An airline downloads weather data for the past 10 years to help build a model that will estimate future fuel usage for flights.

Predictive

49

A shipyard company runs a computer simulation of how a tsunami would damage its shipyards, computing damages in terms of destruction and lost production time

Predictive

50

An online retail company tracks past customer purchases. Based on the amount customers previously spent, the program automatically computes purchase discounts for current customer purchases to build loyalty.

Prescriptive

51

An all-you-can-eat restaurant uses automated conveyer belts to bring cold food to the chefs for preparation. The conveyer belts bring the food to the chefs based on algorithms that monitor the number of people entering and leaving the restaurant

Prescriptive

52

A large manufacturer of farm equipment continuously analyzes data sent from engine sensors to understand how load, temperature, and other factors influence engine failure.

Diagnostic

53

A small tax services business provides its financial statements to a bank to get a loan so it can buy a new building to grow its business.

Descriptive