Lookup Questions
can be answered simply by searching for a specific value in the data
Compute Questions
requires you to use math to find the answer, using more than one value found in the data
Relate Questions
requires you to find a possible relationship between two or more values
Question (cannot be answered)
a question that requires additional information or data before it can be answered
Lookup Question Example
What was the gender of the shopper who purchased an item on January 5, 2019?
Compute Question Example
What was the purchase total of the shopper on January 5, 2019?
Relate Question Example
Do females or males spend more at the store?
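The three question types can be sketched in Python on a tiny table. The purchase records below are invented for illustration only; the field names (`date`, `gender`, `price`, `quantity`) are assumptions, not part of the original data set.

```python
# Hypothetical purchase records; every value here is made up for illustration.
purchases = [
    {"date": "2019-01-05", "gender": "F", "price": 4.00, "quantity": 3},
    {"date": "2019-01-06", "gender": "M", "price": 2.50, "quantity": 2},
    {"date": "2019-01-07", "gender": "F", "price": 1.00, "quantity": 5},
]

# Lookup question: read one stored value directly from the data.
jan5 = next(p for p in purchases if p["date"] == "2019-01-05")
lookup_answer = jan5["gender"]

# Compute question: combine more than one value using math.
compute_answer = jan5["price"] * jan5["quantity"]

# Relate question: compare groups to look for a possible relationship.
def total_spent(gender):
    """Sum price * quantity over all purchases made by the given gender."""
    return sum(p["price"] * p["quantity"] for p in purchases if p["gender"] == gender)

relate_answer = {"F": total_spent("F"), "M": total_spent("M")}
```

With only three invented rows, the comparison in `relate_answer` suggests a possible relationship at most; it does not establish one.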
Intersection of fields in data science
statistics, coding, and business knowledge
Data Science
the process of learning about the world using data and computation.
Statistical Questions
questions that can have a variety of different answers and involve looking at more than one piece of data
Use of data science
make predictions, draw reliable conclusions about the world.
Data Science Life Cycle
a sequence of steps taken to process and use data.
Ask Questions
formulate statistical questions that can be answered with data
Consider Data
collect or record data, or find an existing data set
Analyze Data
run calculations and/or create data displays to identify patterns and relationships
Interpret Data
answer questions and determine results
Qualitative Data
descriptive data that can be divided into different categories
Quantitative Data
numeric data that can be counted or measured
Data Table
used to organize data in data science, each row represents a case and each column represents a variable
Column
A vertical stack of cells in a table.
Row
A horizontal line of cells in a table.
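As a minimal sketch of a data table in Python, each dictionary below stands for one row (a case) and each key stands for one column (a variable). The names and values are hypothetical.

```python
# Hypothetical data table: one dictionary per row (case),
# one key per column (variable).
table = [
    {"name": "Ana", "age": 16, "grade": "A"},
    {"name": "Ben", "age": 17, "grade": "B"},
]

# A row holds every variable for a single case.
first_row = table[0]

# A column holds one variable across all cases.
age_column = [case["age"] for case in table]
```

Here `age` is quantitative data (it can be counted or measured), while `grade` is qualitative data (it sorts cases into categories).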
Structured Data
Data that (1) are typically numeric or categorical; (2) can be organized in a way that is easy for computers to read, organize, and understand; and (3) can be inserted into a database.
Outlier
an observation that lies outside the overall pattern of a distribution
Data Cleaning
The process of fixing or removing incorrect, corrupted, incorrectly formatted, or duplicated data.
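A small sketch of data cleaning in Python, assuming a hypothetical list of names with stray spaces, inconsistent capitalization, a blank entry, and duplicates:

```python
# Hypothetical raw entries with formatting problems and duplicates.
raw_names = [" Ana ", "ben", "Ana", "", "BEN", "Cara"]

cleaned = []
seen = set()
for name in raw_names:
    fixed = name.strip().title()  # fix incorrect formatting
    if not fixed:                 # remove empty (unusable) entries
        continue
    if fixed in seen:             # remove duplicated entries
        continue
    seen.add(fixed)
    cleaned.append(fixed)
```

After the loop, `cleaned` holds each name once, in a consistent format.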
Sorting
Arranging data in a specified order.
Filtering
Displaying only the data that are relevant, hiding everything that does not meet the chosen criteria.
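Sorting and filtering can be sketched in Python as follows. The scores and the passing threshold of 70 are assumptions chosen for illustration.

```python
# Hypothetical test scores.
scores = [
    {"name": "Ana", "score": 82},
    {"name": "Ben", "score": 95},
    {"name": "Cara", "score": 67},
]

# Sorting: arrange the rows in a specified order (highest score first).
sorted_scores = sorted(scores, key=lambda row: row["score"], reverse=True)
sorted_names = [row["name"] for row in sorted_scores]

# Filtering: keep only the rows that meet a condition (score of 70 or more).
passing = [row["name"] for row in scores if row["score"] >= 70]
```

Sorting keeps every row and changes only the order; filtering keeps the order and changes which rows are shown.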
Line Chart
Chart used to illustrate changes in data over time
Pie Chart
used to display distribution, shows the relationship of a part to a whole
Bar Chart
effective for comparing data across different categories and displaying relationships
Data Visualization
the presentation of data in a pictorial or graphical format
Average
returns the average (arithmetic mean) of its arguments
Min
returns the smallest number in a set of values
Max
returns the highest value in a set of data
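The spreadsheet functions Average, Min, and Max have close counterparts in Python. This sketch uses a hypothetical list of values.

```python
# Hypothetical set of values.
values = [4, 8, 6, 2]

# Average: the sum of the values divided by how many values there are.
average = sum(values) / len(values)

# Min and Max: the smallest and the highest value in the set.
smallest = min(values)
largest = max(values)
```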
Revenue
the amount found by multiplying the quantity of goods sold by the price
If formula
performs a logical comparison and returns different results depending on the outcome
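A short Python sketch of the revenue calculation followed by an IF-style comparison. The quantity, price, goal of 25, and the two labels are hypothetical values chosen for illustration.

```python
# Hypothetical inputs.
quantity = 12
price = 2.5

# Revenue: quantity of goods multiplied by price.
revenue = quantity * price

# IF-style logic: test a condition, then return one of two results.
label = "Goal met" if revenue >= 25 else "Below goal"
```

The conditional expression plays the same role as a spreadsheet IF formula: one comparison, one result when it is true, another when it is false.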
Auto Sum
A function that automatically identifies and adds ranges of cells in your worksheet.
Absolute Reference
A cell reference that does not change when a formula is copied to a new location.
=
symbol used at the beginning of every formula in an Excel spreadsheet
Data labels
text that identifies data points or categories, labeling each value in a data series
Raw data
The original data as it was collected, not yet processed, formatted, or analyzed.
Data encryption
The process of encoding or translating data into another form so that only the intended recipient can decrypt and read the data
Data minimization
Limiting the collection of personal information to that which is directly relevant and necessary to accomplish a specific task.
Data anonymization
The process of protecting people's private or sensitive data by eliminating identifying information
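One simple form of anonymization can be sketched in Python by dropping fields that directly identify a person. The record and the choice of which fields count as identifying are assumptions for illustration; removing direct identifiers alone may not be enough to protect privacy in real data.

```python
# Hypothetical customer record.
record = {
    "name": "Ana Diaz",
    "email": "ana@example.com",
    "age": 16,
    "purchase_total": 12.0,
}

# Fields assumed to identify a person directly.
IDENTIFYING_FIELDS = {"name", "email"}

# Keep only the fields that are not identifying.
anonymized = {key: value for key, value in record.items()
              if key not in IDENTIFYING_FIELDS}
```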
Data aggregation
The process of collecting and organizing large amounts of information
Statistical questions
Questions that account for variability in the responses, or many different answers