Google Data Analytics Certification

103 Terms

1
New cards

Database

A collection of data stored in a computer system

2
New cards

Data life cycle

Plan → capture → manage → analyze → archive → destroy

3
New cards

Plan

What data do we need? How will it be managed? Who’s responsible for it? What are the optimal outcomes?

4
New cards

Capture

Collecting data from a variety of sources and bringing it into the organization

5
New cards

Manage

Where to store data? What tools to keep it secure? Actions needed for proper maintenance?

6
New cards

Analyze

Data is used to solve problems, make decisions, and support business goals

7
New cards

Archive

Storing data in a place where it’s available, but may not be used again

8
New cards

Destroy

Important for protecting a company’s private information and private data about its customers

9
New cards

Steps of data analysis

Ask → Prepare → Process → Analyze → Share → Act

10
New cards

Ask

Define problem and make sure we understand stakeholder expectations.

Defining the problem involves looking at the current state and identifying how it differs from the ideal state.

Who are the stakeholders? Maintain strong communication with stakeholders.

11
New cards

Stakeholder

People who help make decisions, influence actions and strategies, and have specific goals they want to meet.

12
New cards

Prepare

Collect and store data that will be used for analysis process.

13
New cards

Process

Find and eliminate errors/inaccuracies that can get in the way of results

Cleaning data, transforming it into more useful format, combining datasets, removing outliers

Fix typos, inconsistencies, or missing/inaccurate data

Verifying the data cleaning and sharing the results with stakeholders
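
A minimal sketch of a few of the cleaning steps listed on this card (trimming stray whitespace, standardizing inconsistent labels, making missing values explicit, dropping exact duplicates). The rows, the `COUNTRY_ALIASES` mapping, and the `clean` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical raw survey rows (illustrative data only).
raw = [
    {"name": " ana ", "country": "usa", "age": "34"},
    {"name": "Ben", "country": "U.S.A.", "age": ""},
    {"name": "Ben", "country": "U.S.A.", "age": ""},
    {"name": "Chloe", "country": "Portugal", "age": "29"},
]

# Assumed mapping from inconsistent spellings to one standard label.
COUNTRY_ALIASES = {"usa": "United States", "u.s.a.": "United States"}


def clean(rows):
    """Return cleaned rows with duplicates removed, preserving order."""
    seen, cleaned = set(), []
    for row in rows:
        name = row["name"].strip().title()  # fix whitespace and casing
        country_raw = row["country"].strip()
        country = COUNTRY_ALIASES.get(country_raw.lower(), country_raw)
        age = int(row["age"]) if row["age"].strip() else None  # keep missing values explicit
        key = (name, country, age)
        if key not in seen:  # drop exact duplicates
            seen.add(key)
            cleaned.append({"name": name, "country": country, "age": age})
    return cleaned


cleaned_rows = clean(raw)
for row in cleaned_rows:
    print(row)
```

Keeping the missing age as `None` rather than guessing a value leaves the gap visible when the cleaning is verified with stakeholders.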

14
New cards

Analyze

Using tools to transform and organize info in order to draw useful conclusions, make predictions, and drive informed decision-making

15
New cards

Share

Interpreting results and sharing them with others to help stakeholders make effective data-driven decisions.

Data visualization is key

16
New cards

Act

The business takes the insights you have provided and uses them to solve the original business problem.

17
New cards

Formula

Set of instructions that performs a specific calculation using the data in a spreadsheet.

18
New cards

Function

Preset command that automatically performs a specific process or task using the data in a spreadsheet.

19
New cards

Query language

Programming language that allows you to retrieve and manipulate data from a database.
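
As a hedged illustration, SQL is one such query language. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `customers` table and its rows are made up for the example.

```python
import sqlite3

# In-memory database with a hypothetical "customers" table (illustrative data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ana", "Lisbon"), ("Ben", "Austin"), ("Chloe", "Lisbon")],
)

# Retrieve: request the names of customers in one city.
lisbon_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Lisbon",)
    )
]

# Manipulate: the same language can also change stored data.
conn.execute("UPDATE customers SET city = ? WHERE name = ?", ("Porto", "Ana"))
remaining = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE city = ?", ("Lisbon",)
).fetchone()[0]

print(lisbon_names, remaining)
```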

20
New cards

Database

A collection of data stored in a computer system.

21
New cards

Query

Request for data/info from a database

22
New cards

Issue

Topic/subject to investigate

23
New cards

Business task

Question/problem data analysis answers for a business

24
New cards

Fairness

Ensuring that your analysis doesn’t create or reinforce bias

25
New cards

Structured thinking

Process of recognizing the current problem or situation, organizing available info, revealing gaps/opportunities, and identifying options

26
New cards

Making predictions problem type

Using data to make an informed decision about how things may be in the future

27
New cards

Categorizing things problem type

Assigning info to different groups or clusters based on common features

28
New cards

Spotting something unusual problem type

Identifying data that’s different from the norm
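
One possible way to spot unusual values, sketched under assumptions: flag any point that lies more than a chosen number of standard deviations from the mean. The `daily_orders` numbers and the `THRESHOLD` of 2 are illustrative choices, not part of the card.

```python
from statistics import mean, stdev

# Hypothetical daily order counts (illustrative data only).
daily_orders = [10, 12, 11, 13, 12, 11, 40]
THRESHOLD = 2.0  # assumed cutoff, in standard deviations

average = mean(daily_orders)
spread = stdev(daily_orders)  # sample standard deviation

# Keep the values whose distance from the mean exceeds the cutoff.
unusual = [v for v in daily_orders if abs(v - average) / spread > THRESHOLD]
print(unusual)
```

A single extreme value also inflates the mean and standard deviation, so this rule is only a rough first pass.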

29
New cards

Identifying themes problem type

Grouping categorized info into broader concepts

30
New cards

Discovering connections problem type

Finding similar challenges faced by different entities and combining data and insights to address them

31
New cards

Finding patterns problem type

Using historical data to understand what happened in the past and is therefore likely to happen again

32
New cards

Closed-ended questions

Can only be answered with yes or no; rarely provide useful insights

33
New cards

SMART questions

Specific - simple, significant, focused on single topic or a few closely related ideas

Measurable - can be quantified and assessed

Action-oriented - encourage change

Relevant - matter, important, have significance to the problem you’re solving

Time-bound - specify the time to be studied

34
New cards

Data-inspired decision-making

Explores different data sources to find out what they have in common

35
New cards

Report

Static collection of data given to stakeholders periodically

Pros:

  • High-level historical data

  • Easy to design/use

  • Pre-cleaned and sorted data

Cons:

  • Continual maintenance

  • Less visually appealing

  • Static

36
New cards

Dashboard

Monitors live incoming data

Pros:

  • Dynamic, automatic, interactive

  • More stakeholder access

  • Low maintenance

  • More visually appealing

Cons:

  • Labor-intensive design

  • Can be confusing

  • Long time to fix bugs

  • Potentially uncleaned data

37
New cards

Pivot table

Data summarization tool used in data processing to sort, reorganize, group, count, total, or average data stored in a database
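
The grouping and totaling that a pivot table performs can be sketched in plain Python. This is not how any spreadsheet implements it; the `sales` records are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records (illustrative data only).
sales = [
    {"region": "East", "product": "Tea", "amount": 10},
    {"region": "East", "product": "Coffee", "amount": 25},
    {"region": "West", "product": "Tea", "amount": 15},
    {"region": "East", "product": "Tea", "amount": 5},
]

# Group by region (rows) and product (columns), totaling the amount.
pivot = defaultdict(lambda: defaultdict(int))
for record in sales:
    pivot[record["region"]][record["product"]] += record["amount"]

for region in sorted(pivot):
    print(region, dict(pivot[region]))
```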

38
New cards

Metric

Single, quantifiable type of data that can be used for measurement

Can help calculate customer retention rates

39
New cards

Metric goal

Measurable goal set by company and evaluated using metrics

40
New cards

Mathematical thinking

Looking at problem and logically breaking it down step-by-step so you can see the relationship of patterns in data, using that to analyze the problem

41
New cards

Small data

  • Specific

  • Short time period

  • Day-to-day decisions

  • Ex:) How much water you drink a day

42
New cards

Big data

  • Large and less specific

  • Long time period

  • Usually needs to be broken down

  • Big decisions

43
New cards

Operator

Symbol that names the type of operation or calculation to be performed

44
New cards

Cell reference

A cell or range of cells in a worksheet that can be used in a formula

Written as the column letter followed by the row number, like A1
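
A1-style references put the column letters first and the row number second. The sketch below splits a reference into those two parts; `parse_cell_reference` and `column_index` are hypothetical helper names, not spreadsheet functions.

```python
import re


def parse_cell_reference(ref):
    """Split an A1-style reference into (column_letters, row_number)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not an A1-style cell reference: {ref!r}")
    return match.group(1).upper(), int(match.group(2))


def column_index(letters):
    """Convert uppercase column letters to a 1-based index (A=1, Z=26, AA=27)."""
    index = 0
    for letter in letters:
        index = index * 26 + (ord(letter) - ord("A") + 1)
    return index


column, row = parse_cell_reference("c12")
print(column, row, column_index(column))
```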

45
New cards

Common errors

#ERROR! - Formula can’t be interpreted as input (parsing error)

#N/A - data in formula can’t be found

#NAME? - formula/function name isn’t understood

#NUM! - formula/function can’t be performed as specified

#VALUE! - general error that could indicate problem with formula or referenced cells

#REF! - formula is referencing a cell that is no longer valid or has been deleted

46
New cards

Problem domain

Specific area of analysis that encompasses every activity affecting or affected by the problem

47
New cards

Scope of work (SOW)

An agreed-upon outline of the work you’re going to perform on a project

48
New cards

Before communicating…

  1. Who is my audience?

  2. What do they already know?

  3. What do they need to know?

  4. How can I communicate that effectively to them?

49
New cards

First-party data

Data collected by an individual or group using their own resources (preferred)

50
New cards

Second-party data

Data collected by a group directly from its audience and then sold

51
New cards

Third-party data

Data collected from outside sources who did not collect it directly (less reliable)

52
New cards

Nominal data

Qualitative data that’s categorized without a set order

53
New cards

Ordinal data

Qualitative data with a set order or scale

54
New cards

Internal data

Data that lives within a company’s own systems

  • More reliable

  • Easier to collect

55
New cards

External data

Data that lives and is generated outside of an organization

56
New cards

Structured data

Data that’s organized in a certain format such as rows and columns

  • Easily searchable

  • Analysis-ready

  • Good for databases

  • Easily visualized

Ex:) Relational databases, spreadsheets

57
New cards

Unstructured data

Data that’s not organized in any easily identifiable manner

Ex:) Audio and video files

58
New cards

Data model

Used for organizing data elements and how they relate to one another. Works well for structured data.

  • Keeps data consistent

  • Maps out how data is organized

59
New cards

Data elements

Pieces of info

Ex:) names, account numbers, addresses

60
New cards

Wide data

Every data subject has a single row with multiple columns to hold the values of various attributes of the subject

  • Easily identify and compare different data between columns

61
New cards

Long data

Each row is one time point per subject so each subject will have data in multiple rows

  • Good for storing and organizing data when there are multiple variables for each subject at each time point

  • Fewer columns, only need to add one more column for a new variable
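
To make the contrast with wide data concrete, here is a small sketch that reshapes a wide table into a long one. The country and population figures are invented for illustration.

```python
# Wide format: one row per subject, one column per year (hypothetical data).
wide = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]

# Long format: one row per subject per time point.
long_rows = [
    {"country": row["country"], "year": year, "population": row[year]}
    for row in wide
    for year in ("2019", "2020")
]

for row in long_rows:
    print(row)
```

Two wide rows become four long rows, because each subject now has one row per year.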

62
New cards

Observer (experimenter/research) bias

Different people observe things differently

63
New cards

Interpretation bias

Interpreting ambiguous situations in a positive or negative way

64
New cards

Confirmation bias

Searching for, or interpreting info in a way that confirms preexisting beliefs

65
New cards

Identifying good data

Reliable

Original - validate with original source

Comprehensive - contains all info needed

Current

Cited - makes info more credible

66
New cards

Data ethics

Well-founded standards of right and wrong that dictate how data is collected, shared, and used

67
New cards

GDPR

General Data Protection Regulation of the EU

68
New cards

Ownership

Individuals own the raw data they provide and they have primary control over its usage, how it’s processed, and how it’s shared.

69
New cards

Transaction transparency

All data-processing activities and algorithms should be completely explainable and understood by the individual who provides their data.

70
New cards

Consent

An individual’s right to know explicit details about how and why their data will be used before agreeing to provide it.

71
New cards

Currency

Individuals should be aware of financial transactions resulting from the use of their personal data and the scale of these transactions.

72
New cards

Privacy

Preserving a data subject’s info and activity any time a data transaction occurs.

People should have…

  • Protection from unauthorized access to our private data

  • Freedom from inappropriate use of our data

  • The right to inspect, update, or correct our data

  • Ability to give consent to use our data

  • Legal right to access our data

73
New cards

Openness

Free access, usage, and sharing of data

Open data standards:

  • Availability and access

  • Reuse and redistribution

  • Universal participation

74
New cards

Data anonymization

Process of protecting people’s private or sensitive data by eliminating personally identifiable info.
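
A rough sketch of three techniques often used for this: blanking (dropping a field), hashing, and masking. These techniques are an assumption on top of the card, and the record, the `SALT` value, and the `anonymize` helper are hypothetical.

```python
import hashlib

# Hypothetical customer record (illustrative data only).
record = {
    "name": "Ana Silva",
    "email": "ana@example.com",
    "phone": "555-0100",
    "city": "Lisbon",
    "purchases": 3,
}

SALT = b"replace-with-a-secret-value"  # assumption: a secret kept outside the dataset


def anonymize(row):
    """Return a copy of the row with personally identifiable fields removed or obscured."""
    cleaned = dict(row)
    cleaned.pop("name")  # blanking: drop the field entirely
    cleaned["email"] = hashlib.sha256(SALT + row["email"].encode()).hexdigest()  # hashing
    cleaned["phone"] = "*" * (len(row["phone"]) - 4) + row["phone"][-4:]  # masking
    return cleaned


anonymized = anonymize(record)
print(anonymized)
```

Hashing alone may not count as full anonymization in every setting, since the same input always produces the same output and records can still be linked.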

75
New cards

Data interoperability

Ability of data systems and services to openly connect and share data.

76
New cards

Relational database

Database that contains a series of related tables that can be connected via their relationships.

77
New cards

Primary key

An identifier that references a column in which each value is unique.

  • Used to ensure data in a specific column is unique

  • Uniquely identifies a record in a relational database table

  • Only one allowed in a table

  • No null/blank values

78
New cards

Foreign key

A field within a table that’s a primary key in another table (how one table can be connected to another)

  • Column or group of columns in a relational database table that provides a link between the data in two tables

  • Refers to field in a table that’s the primary key of another table

  • More than one allowed in a table
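
The relationship between a primary key and a foreign key can be sketched with Python's built-in `sqlite3` module. The `customers` and `orders` tables are hypothetical. SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE orders (
           order_id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
           total REAL
       )"""
)
conn.execute("INSERT INTO customers VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders VALUES (100, 1, 19.99)")


def try_insert(sql):
    """Run an INSERT and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as error:
        return f"rejected: {error}"


# A second customer with the same primary key value is refused.
duplicate_key = try_insert("INSERT INTO customers VALUES (1, 'Ben')")
# An order pointing at a customer that does not exist is refused.
orphan_order = try_insert("INSERT INTO orders VALUES (101, 42, 5.00)")

# The foreign key is what lets the two tables be connected in a query.
joined = conn.execute(
    "SELECT o.order_id, c.name FROM orders AS o JOIN customers AS c USING (customer_id)"
).fetchall()
print(duplicate_key, orphan_order, joined, sep="\n")
```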

79
New cards

Descriptive metadata

Describes a piece of data and can be used to identify it at a later time.

80
New cards

Structural metadata

Indicates how a piece of data is organized and whether it is part of one, or more than one, data collection

81
New cards

Administrative metadata

Indicates the technical source of a digital asset

82
New cards

Metadata repository

Database specifically created to store metadata. Makes it easier and faster to bring together multiple sources for data analysis

83
New cards

Data governance

A process to ensure the formal management of a company’s data assets

84
New cards

External data

Data that lives and is generated outside an organization

85
New cards

Naming conventions

Consistent guidelines that describe the content, date, or version of a file in its name.
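
One possible convention, shown only as an example: project name, creation date as YYYYMMDD, and a two-digit version number. The `build_file_name` helper and the pattern itself are assumptions, not a required standard.

```python
from datetime import date


def build_file_name(project, created, version, extension="csv"):
    """Build a name like Project_YYYYMMDD_vNN.ext (one assumed convention)."""
    return f"{project}_{created:%Y%m%d}_v{version:02d}.{extension}"


file_name = build_file_name("SalesReport", date(2024, 3, 5), 2)
print(file_name)
```

Putting the date in year-month-day order makes files sort chronologically by name.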

86
New cards

Data security

Protecting data from unauthorized access or corruption by adopting safety measures.

87
New cards

Mentor

Professional who shares their knowledge, skills, and experience to help you develop and grow.

88
New cards

Sponsor

Professional advocate who’s committed to moving a sponsee’s career forward within an organization.

89
New cards

Data integrity

Accuracy, completeness, consistency, and trustworthiness of data throughout its lifecycle

90
New cards

Types of insufficient data

  • Data from only one source

  • Data that keeps updating

  • Outdated data

  • Geographically-limited data

91
New cards

Ways to address insufficient data

  • Identify trends with available data

  • Wait for more data if time allows

92
New cards

Statistical power

Probability of getting meaningful results from a test.

93
New cards

Hypothesis testing

A way to see if a survey or experiment has meaningful results.

94
New cards

Statistically significant

Results are real and not an error caused by random chance (usually requires a statistical power of at least 0.8, or 80%)

95
New cards

Confidence level

Probability that your sample accurately reflects the greater population. Set independently of the margin of error (the two don’t need to add up to 100%)

96
New cards

Margin of error

Max amount that the sample results are expected to differ from those of the actual population.
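
As a hedged worked example, the margin of error for a sample proportion is commonly approximated as z × sqrt(p(1 − p) / n). The card does not give this formula, so it is an assumption here, along with the large-population normal approximation and the `margin_of_error` helper name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sample_size, confidence_level=0.95, proportion=0.5):
    """Margin of error for a sample proportion (normal approximation, large population)."""
    # z-score that leaves (1 - confidence_level) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z * sqrt(proportion * (1 - proportion) / sample_size)


moe_1000 = margin_of_error(1000)
moe_4000 = margin_of_error(4000)
print(f"n=1000: ±{moe_1000:.1%}, n=4000: ±{moe_4000:.1%}")
```

Using `proportion=0.5` gives the widest margin, which is why it is a common conservative default. Quadrupling the sample size halves the margin of error.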

97
New cards

Dirty data

Data that’s incomplete, incorrect, or irrelevant to the problem you’re trying to solve.

98
New cards

Clean data

Data that’s complete, correct, and relevant to the problem you’re trying to solve.

99
New cards

Data engineers

Transform data into a useful format for analysis and give it a reliable infrastructure.

100
New cards

Data warehousing specialists

Develop procedures and processes to effectively store and organize data.