Data Analytics
Define the abbreviation SQL in data analytics
Structured Query Language
Name the most common method of accessing & retrieving data in data analytics
SQL
What is SQL used for in data analytics?
SQL is used for managing and manipulating relational data in databases
Define the word ‘Query’ in SQL
A request for information from a database, such as retrieving data from a table
Define the word ‘Output’ in SQL
The data returned by an SQL query
Define the word ‘Table’ in SQL
A structure inside a database that stores the data in rows and columns
Define the word ‘Database’ in SQL
A structured system that stores data in tables and lets you retrieve or manage that data using queries
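The four terms above (database, table, query, output) can be seen together in a minimal sketch. It assumes Python's built-in `sqlite3` module as the database engine; the `employees` table and its rows are made up for illustration.

```python
import sqlite3

# Database: an in-memory relational database (nothing is written to disk).
conn = sqlite3.connect(":memory:")

# Table: stores the data in rows and columns (hypothetical example table).
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ana", "Sales", 52000), ("Ben", "IT", 61000), ("Cara", "Sales", 58000)],
)

# Query: a request for information from the database.
query = "SELECT name, salary FROM employees WHERE department = 'Sales' ORDER BY name"

# Output: the data returned by the query (a list of rows).
output = conn.execute(query).fetchall()
print(output)
```

Only the two Sales rows come back, because the `WHERE` clause filters the table before the output is returned.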
Name the 4 different types of Databases
Object-Oriented
Hierarchical
Network
Relational
What is the most common type of database used with SQL?
Relational
What are the 3 main ways SQL is used in data analytics?
ETL
Data Cleaning
EDA
Define the abbreviation ETL in SQL
Extract Transform Load
Explain the ETL process
Data is extracted from source systems, transformed, and loaded into a data warehouse, where it is then analyzed.
Define the ‘Staging Area’ in the ETL process
The staging area is where data engineers drop extracted data before it is analyzed
Define ‘Data Warehouse’ in the ETL process
A central storage system where data engineers load and manage cleaned and transformed data for analysis
Define ‘Analytics’ in the ETL process
After data is cleaned and transformed, analysts explore and use it to find insights, answer questions, and support business decisions
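The ETL cards above can be sketched end to end. This is a simplified illustration, not a production pipeline: the CSV text stands in for a source system, a Python list stands in for the staging area, and an in-memory SQLite database stands in for the warehouse. All names (`raw_csv`, `orders`, the column names) are hypothetical.

```python
import csv
import io
import sqlite3

# Extract: read raw rows from a source system (here, a made-up CSV string).
raw_csv = "order_id,amount,country\n1, 19.99 ,us\n2, 5.00 ,GB\n3, 42.50 ,us\n"
staged_rows = list(csv.DictReader(io.StringIO(raw_csv)))  # plays the staging area

# Transform: fix types and standardize values before loading.
clean_rows = [
    (int(r["order_id"]), float(r["amount"].strip()), r["country"].strip().upper())
    for r in staged_rows
]

# Load: write the transformed rows into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE orders (order_id INTEGER, amount REAL, country TEXT)")
warehouse.executemany("INSERT INTO orders VALUES (?, ?, ?)", clean_rows)

# Analytics: query the loaded data to answer a question (total sales per country).
totals = warehouse.execute(
    "SELECT country, ROUND(SUM(amount), 2) FROM orders GROUP BY country ORDER BY country"
).fetchall()
print(totals)
```

Transforming before loading means the final query can group on `country` without worrying about mixed casing such as `us` and `US`.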
Define Data Cleaning
Data cleaning identifies and fixes errors in the data so that the analysis is based on accurate and reliable data
Name the 8 steps in the Data Cleaning Cycle
Importing Data
Merging Data Sets
Rebuilding Missing Data
Standardization
Normalization
De-Duplication
Verification & Enrichment
Exporting Data
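Three of the steps listed above (standardization, rebuilding missing data, de-duplication) can be sketched in SQL. The example assumes SQLite through Python's `sqlite3` module and a made-up `customers` table; filling gaps with the placeholder `'Unknown'` is a simplification of what rebuilding missing data usually involves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (email TEXT, city TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        (" Ana@Example.com ", "Lisbon"),
        ("ana@example.com", "Lisbon"),
        ("ben@example.com", None),
    ],
)

# Standardization: trim whitespace and lower-case the e-mail addresses.
conn.execute("UPDATE customers SET email = LOWER(TRIM(email))")

# Rebuilding missing data: fill missing cities with a placeholder value.
conn.execute("UPDATE customers SET city = 'Unknown' WHERE city IS NULL")

# De-duplication: keep only the first row for each (email, city) pair.
conn.execute(
    "DELETE FROM customers WHERE rowid NOT IN "
    "(SELECT MIN(rowid) FROM customers GROUP BY email, city)"
)

cleaned = conn.execute("SELECT email, city FROM customers ORDER BY email").fetchall()
print(cleaned)
```

The order matters here: the two `ana@example.com` rows only become detectable duplicates after standardization has made them identical.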
Define the acronym EDA in SQL
Exploratory Data Analysis
Describe the EDA process
EDA is used by data analysts to investigate data for errors, summarize the main characteristics of a data set, and discover trends and patterns.
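A few typical EDA queries, as a rough sketch: one checks for missing values and summarizes the data, the other compares groups to look for a pattern. The `sales` table and its values are invented for the example, and SQLite via Python's `sqlite3` module is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100.0), ("North", 300.0), ("South", 50.0), ("South", None)],
)

# Investigate and summarize: row count, missing values, range, and average.
# COUNT(amount) skips NULLs, so COUNT(*) - COUNT(amount) counts missing values.
summary = conn.execute(
    "SELECT COUNT(*), COUNT(*) - COUNT(amount), MIN(amount), MAX(amount), AVG(amount) "
    "FROM sales"
).fetchone()

# Discover patterns: compare the average amount per region.
by_region = conn.execute(
    "SELECT region, AVG(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(summary)
print(by_region)
```

Note that `AVG` ignores `NULL` values, which is why checking for missing data first is a useful habit during EDA.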