This deck contains terms and definitions from the glossary of the Google Data Analytics certification program, which includes 8 courses and is available on Coursera.
World Health Organization
An organization whose primary role is to direct and coordinate international health within the United Nations system
X-axis
The horizontal line of a graph usually placed at the bottom, which is often used to represent time scales and discrete categories
Y-axis
The vertical line of a graph usually placed to the left, which is often used to represent frequencies and other numerical variables
YAML
A human-readable language for representing structured data, often used for configuration files
A/B testing
The process of testing two variations of the same web page to determine which page is more successful at attracting user traffic and generating revenue
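As a minimal sketch of the comparison step in an A/B test, the two variations can be compared by conversion rate. The visitor and conversion counts below are made up for illustration, and `conversion_rate` is a hypothetical helper. A real test would also check whether the difference is statistically significant.

```python
def conversion_rate(conversions, visitors):
    """Share of visitors who converted (hypothetical helper)."""
    return conversions / visitors

# Made-up traffic numbers for two variations of the same page.
variant_a = {"visitors": 1000, "conversions": 50}
variant_b = {"visitors": 1000, "conversions": 65}

rate_a = conversion_rate(variant_a["conversions"], variant_a["visitors"])
rate_b = conversion_rate(variant_b["conversions"], variant_b["visitors"])

# Pick the variation with the higher rate; a tie is reported as "tie".
if rate_a > rate_b:
    winner = "A"
elif rate_b > rate_a:
    winner = "B"
else:
    winner = "tie"

print(f"A: {rate_a:.1%}, B: {rate_b:.1%}, winner: {winner}")
```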
Absolute reference
A reference within a function that is locked so that rows and columns won’t change if the function is copied
Access control
Features such as password protection, user permissions, and encryption that are used to protect a spreadsheet
Accuracy
The degree to which data conforms to the actual entity being measured or described
Action-oriented question
A question whose answers lead to change
Administrative metadata
Metadata that indicates the technical source of a digital asset
Aesthetic (R)
A visual property of an object in a plot
Agenda
A list of scheduled appointments
Aggregation
The process of collecting or gathering many separate pieces into a whole
Algorithm
A process or set of rules followed for a specific task
Aliasing
Temporarily naming a table or column in a query to make it easier to read and write
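A short sketch of aliasing, run through Python's built-in `sqlite3` module so it is self-contained. The `employees` and `departments` tables and their rows are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, dept_id INTEGER)")
conn.execute("CREATE TABLE departments (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [(1, "Ana", 10), (2, "Ben", 20)])
conn.executemany("INSERT INTO departments VALUES (?, ?)",
                 [(10, "Sales"), (20, "Finance")])

# "e" and "d" are table aliases; "employee" and "department" are column aliases.
query = """
    SELECT e.name AS employee, d.name AS department
    FROM employees AS e
    JOIN departments AS d ON e.dept_id = d.id
    ORDER BY e.id
"""
cursor = conn.execute(query)
column_names = [col[0] for col in cursor.description]
alias_rows = cursor.fetchall()
print(column_names, alias_rows)
```

The aliases exist only for the duration of the query; the underlying tables and columns keep their original names.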
Alternative text
Text that provides an alternative to non-text content, such as images and videos
Analytical skills
Qualities and characteristics associated with using facts to solve problems
Analytical thinking
The process of identifying and defining a problem, then solving it by using data in an organized, step-by-step manner
Annotation
Text that briefly explains data or helps focus the audience on a particular aspect of the data in a visualization
Anscombe’s quartet
Four datasets that have nearly identical summary statistics but contain different plotted values
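To illustrate the idea, the sketch below compares the first two of the four datasets. `mean` and `pearson_r` are hypothetical helpers written for this example; the point is only that the summary statistics nearly match while the y-values themselves differ.

```python
import math

# First two datasets of Anscombe's quartet; both share the same x values.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

def mean(values):
    return sum(values) / len(values)

def pearson_r(xs, ys):
    """Pearson correlation coefficient (hypothetical helper for this sketch)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)

mean_y1, mean_y2 = mean(y1), mean(y2)
r1, r2 = pearson_r(x, y1), pearson_r(x, y2)

print("mean of y:", round(mean_y1, 2), round(mean_y2, 2))
print("correlation with x:", round(r1, 2), round(r2, 2))
```

Plotting the two datasets would show a roughly linear scatter for the first and a curve for the second, which is why the quartet is used to argue for visualizing data before trusting summary statistics.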
Area chart
A data visualization that uses individual data points for a changing variable, connected by a continuous line with a filled-in area underneath
Argument (R)
Information needed by a function in R in order to run
Arithmetic operator
An operator used to perform basic math operations such as addition, subtraction, multiplication, and division
Array
A collection of values in spreadsheet cells
Assignment operator
An operator used to assign values to variables and vectors
Attribute
A characteristic or quality of data used to label a column in a table
Audio file
Digitized audio storage, usually in MP3, AAC, or another compressed format
AVERAGE
A spreadsheet function that returns an average of the values from a selected range
AVERAGEIF
A spreadsheet function that returns the average of all cell values from a given range that meet a specified condition
Bad data source
A data source that is not reliable, original, comprehensive, current, and cited (ROCCC)
Balance
The design principle of creating aesthetic appeal and clarity in a data visualization by evenly distributing visual elements
Bar graph
A data visualization that uses size to contrast and compare two or more values
Bias
A conscious or subconscious preference in favor of or against a person, group of people, or thing
Big data
Large, complex datasets typically involving long periods of time, which enable data analysts to address far-reaching business problems
Boolean data
A data type with only two possible values, usually true or false
Borders
Lines that can be added around two or more cells on a spreadsheet
Box plot
A data visualization that displays the distribution of values along an x-axis
Bubble chart
A data visualization that displays individual data points as bubbles, comparing numeric values by their relative size
Bullet graph
A data visualization that displays data as a horizontal bar chart moving toward a desired value
Business metric
A standard of measurement used to solve a business task
Business task
The question or problem data analysis resolves for a business
C#
An object-oriented programming language used to create games and mobile apps in the .NET open source developer platform
C++
An extension of the C programming language that is used to create console games, such as those for Xbox
Calculated field
A new field within a pivot table that carries out certain calculations based on the values of other fields
Calculus
A branch of mathematics that involves the study of rates of change and the changes between values that are related by a function
CASE
A SQL statement that returns records that meet conditions by including an if/then statement in a query
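A minimal sketch of a CASE expression, run in SQLite through Python's `sqlite3` module. The `orders` table, its values, and the size thresholds are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 20.0), (2, 150.0), (3, 75.0)])

# CASE checks each WHEN condition in order and returns the first match,
# falling back to the ELSE value when nothing matches.
case_rows = conn.execute("""
    SELECT id,
           CASE
               WHEN total >= 100 THEN 'large'
               WHEN total >= 50 THEN 'medium'
               ELSE 'small'
           END AS size
    FROM orders
    ORDER BY id
""").fetchall()
print(case_rows)
```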
Case study
A common way for employers to assess job skills and gain insight into how a candidate approaches common data-related challenges
CAST
A SQL function that converts data from one datatype to another
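A brief sketch of CAST, run in SQLite through Python's `sqlite3` module. The type names and rounding behavior shown are SQLite's; other SQL dialects (BigQuery, for example) use different type names and may round rather than truncate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Text to integer, real to integer (SQLite truncates toward zero),
# and integer to text.
cast_row = conn.execute(
    "SELECT CAST('42' AS INTEGER), CAST(3.9 AS INTEGER), CAST(7 AS TEXT)"
).fetchone()
print(cast_row)
```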
Causation
When an action directly leads to an outcome, such as a cause-effect relationship
Cell reference
A cell or a range of cells in a worksheet typically used in formulas and functions
Changelog
A file containing a chronologically ordered list of modifications made to a project
Channel
A visual aspect or variable that represents characteristics of the data in a visualization
Chart
A graphical representation of data from a worksheet
Circle view
A data visualization that shows comparative strength in data
Clean data
Data that is complete, correct, and relevant to the problem being solved
Cloud
A place to keep data online, rather than a computer hard drive
Cluster
A collection of data points on a data visualization with similar values
COALESCE
A SQL function that returns the first non-null value in a list
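A minimal sketch of COALESCE, run in SQLite through Python's `sqlite3` module: the function scans its arguments left to right and returns the first one that is not null. The `contacts` table and the `'no phone'` fallback are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, mobile TEXT, landline TEXT)")
conn.executemany("INSERT INTO contacts VALUES (?, ?, ?)", [
    ("Ana", "555-0100", None),   # has a mobile number
    ("Ben", None, "555-0199"),   # landline only
    ("Cy", None, None),          # no number at all
])

# Prefer mobile, then landline, then a literal fallback.
coalesce_rows = conn.execute("""
    SELECT name, COALESCE(mobile, landline, 'no phone') AS best_phone
    FROM contacts
    ORDER BY name
""").fetchall()
print(coalesce_rows)
```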
Code chunk
A piece of code added in an R Markdown file that is used to process, visualize or analyze data
Coding
The process of writing instructions to a computer in the syntax of a specific programming language
Column chart
A data visualization that uses individual data points for a changing variable, represented as vertical columns
Combo chart
A data visualization that combines more than one visualization type
Compatibility
How well two or more datasets are able to work together
Completeness
The degree to which data contains all desired components or measures
Computer programming
The process of giving instructions to a computer in order to perform an action or set of actions
CONCAT
A SQL function that adds strings together to create new text strings that can be used as unique keys
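A sketch of building a key by joining strings. This example runs in SQLite through Python's `sqlite3` module and, as an assumption for portability, uses SQLite's `||` concatenation operator in place of a `CONCAT` function, which older SQLite versions lack. The `sales` table is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, store_id INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("west", 7), ("east", 12)])

# Join region and store number into one text key per row.
# In dialects with CONCAT, this would read CONCAT(region, '-', store_id).
key_rows = conn.execute("""
    SELECT region || '-' || store_id AS store_key
    FROM sales
    ORDER BY store_id
""").fetchall()
store_keys = [row[0] for row in key_rows]
print(store_keys)
```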
CONCATENATE
A spreadsheet function that joins together two or more text strings
Conditional formatting
A spreadsheet tool that changes how cells appear when values meet specific conditions
Conditional statement
A declaration that if a certain condition holds, then a certain event must take place
Confidence interval
A range of values that conveys how likely a statistical estimate reflects the population
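As a rough sketch, a 95% confidence interval for a sample mean can be computed with a normal approximation. The sample values are made up, and for a sample this small a t-distribution would usually be the more appropriate choice; the normal approximation is used here only to keep the example within the standard library.

```python
import math
import statistics

# Made-up sample measurements.
sample = [48, 52, 50, 47, 53, 51, 49, 50]

sample_mean = statistics.mean(sample)
# Standard error of the mean: sample standard deviation / sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# z-score that leaves 2.5% in each tail of the standard normal distribution.
z = statistics.NormalDist().inv_cdf(0.975)

ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error
print(f"mean = {sample_mean}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```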
Confidence level
The probability that a sample size accurately reflects the greater population
Confirmation bias
The tendency to search for or interpret information in a way that confirms pre-existing beliefs
Consent
The aspect of data ethics that presumes an individual’s right to know how and why their personal data will be used before agreeing to provide it
Consistency
The degree to which data is repeatable from different points of entry or collection
Context
The condition in which something exists or happens
Continuous data
Data that is measured and can have almost any numeric value
CONVERT
A SQL function that changes the unit of measurement of a value in data
Cookie
A small file stored on a computer that contains information about its users
Correlation
The measure of the degree to which two variables change in relationship to each other
COUNT
A spreadsheet function that counts the number of cells within a range that contain numeric values
COUNTA
A spreadsheet function that counts the total number of non-empty values within a specified range
COUNTIF
A spreadsheet function that returns the number of cells within a range that match a specified value
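The three counting functions are easy to confuse, so here is a rough Python analogue of each. It assumes the usual spreadsheet behavior: COUNT tallies numeric cells, COUNTA tallies non-empty cells, and COUNTIF tallies cells matching a criterion. The `cells` list stands in for a spreadsheet range, with `None` as an empty cell.

```python
# A made-up "range" of cells; None represents an empty cell.
cells = [12, "n/a", None, 7, 30, None]

def is_number(value):
    """True for ints and floats, but not booleans (hypothetical helper)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

# COUNT analogue: numeric cells only.
count_result = sum(1 for c in cells if is_number(c))

# COUNTA analogue: every non-empty cell, whatever its type.
counta_result = sum(1 for c in cells if c is not None)

# COUNTIF analogue with the criterion ">10".
countif_result = sum(1 for c in cells if is_number(c) and c > 10)

print(count_result, counta_result, countif_result)
```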
COUNT DISTINCT
A SQL function that counts only the distinct values in a specified range
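A minimal sketch contrasting a plain count with a distinct count, run in SQLite through Python's `sqlite3` module. The `purchases` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT)")
conn.executemany("INSERT INTO purchases VALUES (?)",
                 [("Ana",), ("Ben",), ("Ana",), ("Cy",)])

# COUNT(customer) counts every non-null row; COUNT(DISTINCT customer)
# counts each customer once, however many purchases they made.
total_rows, distinct_customers = conn.execute(
    "SELECT COUNT(customer), COUNT(DISTINCT customer) FROM purchases"
).fetchone()
print(total_rows, distinct_customers)
```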
CRAN (Comprehensive R Archive Network) (R)
An online archive with R packages, source code, manuals, and documentation
CREATE TABLE
A SQL clause that adds a temporary table to a database that can be used by multiple people
Cross-field validation
A process that ensures certain conditions for multiple data fields are satisfied
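A small sketch of cross-field validation: each rule checks a condition that spans more than one field of a record. The field names, the two rules, and the `validate_record` helper are all hypothetical.

```python
from datetime import date

def validate_record(record):
    """Return a list of cross-field rule violations (hypothetical helper)."""
    errors = []
    # Rule 1: the parts must add up to the reported total.
    if record["online_sales"] + record["store_sales"] != record["total_sales"]:
        errors.append("sales parts do not sum to total")
    # Rule 2: a period cannot end before it starts.
    if record["start_date"] > record["end_date"]:
        errors.append("start_date is after end_date")
    return errors

good_record = {
    "online_sales": 40, "store_sales": 60, "total_sales": 100,
    "start_date": date(2023, 1, 1), "end_date": date(2023, 1, 31),
}
bad_record = {
    "online_sales": 40, "store_sales": 60, "total_sales": 90,
    "start_date": date(2023, 2, 1), "end_date": date(2023, 1, 31),
}

good_errors = validate_record(good_record)
bad_errors = validate_record(bad_record)
print(good_errors, bad_errors)
```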
CSS (Cascading Style Sheets)
A programming language used for web page design that controls graphic elements and page presentation
CSV (comma-separated values) file
A delimited text file that uses a comma to separate values
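A short sketch of reading comma-separated values with Python's built-in `csv` module. The sample text is made up; note that a field containing a comma has to be quoted so the comma is not treated as a delimiter.

```python
import csv
import io

# Made-up CSV text; the first data row has a quoted field containing a comma.
raw_text = 'name,city,visits\n"Lee, Ana",Austin,3\nBen,Boston,5\n'

rows = list(csv.reader(io.StringIO(raw_text)))
header, records = rows[0], rows[1:]

print(header)
for record in records:
    print(record)
```

Every value comes back as text, so numeric columns such as `visits` still need converting before analysis.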
Currency
The aspect of data ethics that presumes individuals should be aware of financial transactions resulting from the use of their personal data and the scale of those transactions
Dashboard
A tool that monitors live, incoming data
Data
A collection of facts
Data aggregation
The process of gathering data from multiple sources and combining it into a single, summarized collection
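A minimal sketch of data aggregation: rows from two made-up sources are combined and summarized into one total per region. The source names and amounts are assumptions for illustration.

```python
from collections import defaultdict

# Two hypothetical sources, each a list of (region, amount) rows.
online_orders = [("west", 120), ("east", 80)]
store_orders = [("west", 200), ("east", 50), ("north", 30)]

# Combine both sources and sum the amounts per region.
totals = defaultdict(int)
for region, amount in online_orders + store_orders:
    totals[region] += amount

summary = dict(totals)
print(summary)
```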
Data analysis
The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making
Data analysis process
The six phases of ask, prepare, process, analyze, share, and act whose purpose is to gain insights that drive informed decision-making
Data analyst
Someone who collects, transforms, and organizes data in order to draw conclusions, make predictions, and drive informed decision-making
Data analytics
The science of data
Data anonymization
The process of protecting people’s private or sensitive data by eliminating identifying information
Data bias
When a preference in favor of or against a person, group of people, or thing systematically skews data analysis results in a certain direction
Data blending
A Tableau method that combines data from multiple data sources
Data composition
The process of combining the individual parts in a visualization and displaying them together as a whole