Descriptive analytics
uncovers trends in data sets by asking questions about what happened.
Predictive analytics
uses probability analysis, data mining, statistical modeling, machine learning, and deep learning to predict future outcomes given certain conditions.
Prescriptive analytics
predicts what, when, and why a given scenario might occur, and recommends actions to take in response.
sensing, collection, wrangling, analysis, storage
what are the 5 steps of the data project life cycle (in order)?
sensing
first step in the data project life cycle, involves identifying the meaningful data to be collected.
collection
second step in the data project life cycle, involves gathering data.
wrangling
third step in the data project life cycle, involves converting raw data into a user-friendly format.
analysis
fourth step in the data project life cycle, involves examining and analyzing the data.
storage
fifth and last step in the data project life cycle, involves securing and maintaining data for access.
Ask, Prepare, Process, Analyze, Share, Act
what are the 6 steps of the DDDM (data-driven decision-making) process, in order?
SQL stands for Structured Query Language. It allows users to query data contained in a database, filter for specific data, and track correlated pieces of data.
what does SQL stand for, and what does it allow users to do?
syntax
refers to a set of rules and guidelines that define a specific computer language.
SELECT, FROM, WHERE
re components of SQL: what are the most common SQL terms used in queries?
SELECT, FROM, WHERE
re extracting data from multiple fields and adding comments to a SQL query: to extract information from multiple fields, the _______________ SQL command is used to list the fields, in conjunction with ______________ and ___________________ to narrow the search and get specific results.
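A minimal sketch of the card above, run through Python's built-in sqlite3 module. The customers table, its fields, and its rows are hypothetical and exist only for illustration. Lines starting with -- inside the query are SQL comments.

```python
import sqlite3

# Hypothetical in-memory table; the names and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, total_spent REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ana", "Austin", 120.0), ("Ben", "Boston", 45.5), ("Cy", "Austin", 80.0)],
)

query = """
-- SELECT lists the fields to return (two of them here)
SELECT name, total_spent
-- FROM names the table that holds those fields
FROM customers
-- WHERE filters the rows to narrow the results
WHERE city = 'Austin';
"""

rows = conn.execute(query).fetchall()
print(rows)  # only the Austin customers, with just the two selected fields
```

Each row comes back as a tuple holding only the fields named after SELECT, and the WHERE clause drops the Boston row.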
Explore data sets for patterns, plan for visuals, create the visuals
what are the three steps used in the data visualization process?
Data anonymization is the process of removing or altering personal information so that the data cannot be traced back to an individual.
PII (Personally Identifiable Information) is any data that can identify a specific person.
Examples: name, address, phone number, email, Social Security number, date of birth.
Common examples of anonymized PII:
Replacing names with ID numbers
Removing addresses or changing them to general regions
Masking phone numbers (e.g., ***-***-1234)
Generalizing dates of birth (e.g., “1990s” instead of exact date)
Aggregating data (e.g., “age group 20–30” instead of age 22)
what is data anonymization, and what is PII? what are common examples of anonymized PII data?
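The anonymization techniques listed on this card can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the anonymize function, the record fields, and the sample person are all hypothetical, and the age bands here run 20-29, 30-39, and so on.

```python
def anonymize(record, id_number):
    """Return a new record with the PII fields removed or generalized."""
    birth_year = int(record["date_of_birth"][:4])   # expects YYYY-MM-DD
    decade = birth_year // 10 * 10
    age_low = record["age"] // 10 * 10
    return {
        "id": id_number,                             # name replaced with an ID number
        "region": record["region"],                  # street address dropped, region kept
        "phone": "***-***-" + record["phone"][-4:],  # all but the last 4 digits masked
        "birth_decade": f"{decade}s",                # exact date generalized to a decade
        "age_group": f"{age_low}-{age_low + 9}",     # exact age aggregated into a band
    }

# Hypothetical sample record, made up for illustration.
person = {
    "name": "Jane Doe", "address": "12 Elm St", "region": "Midwest",
    "phone": "555-867-1234", "date_of_birth": "1994-03-07", "age": 22,
}
anon = anonymize(person, 1001)
print(anon)
```

The output record has no name or address field at all, so those values cannot leak through it. Real anonymization usually needs more than this, since combinations of generalized fields can still single out a person.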
For data to be considered open, it must:
Be freely available for anyone to use
Be accessible in a usable, machine-readable format
Allow reuse and redistribution
Have no restrictive licenses, or use an open license (like Creative Commons)
re open data: in order to be considered open, what criteria must data meet?
the process of changing data from one format or structure into another so it can be analyzed, stored, or used properly.
what is data transformation?
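As a small illustration of data transformation, the sketch below converts CSV text into typed records with ISO-style dates. The column names, the MM/DD/YYYY input format, and the transform function are assumptions made up for this example.

```python
import csv
import io
import json

# Hypothetical raw data: every value is text and dates are MM/DD/YYYY.
raw_csv = "order_id,amount,order_date\n1,19.99,03/07/2024\n2,5.00,12/25/2024\n"

def transform(csv_text):
    """Change CSV rows into dictionaries with numeric types and YYYY-MM-DD dates."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        month, day, year = row["order_date"].split("/")
        records.append({
            "order_id": int(row["order_id"]),        # text -> integer
            "amount": float(row["amount"]),          # text -> decimal number
            "order_date": f"{year}-{month}-{day}",   # MM/DD/YYYY -> YYYY-MM-DD
        })
    return records

records = transform(raw_csv)
print(json.dumps(records, indent=2))  # same data, now structured as JSON
```

Both the format (CSV to JSON) and the structure (text fields to numbers and sortable dates) change, while the underlying information stays the same.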