MS SQL
Relational DB
MySQL
Relational DB
Oracle
Relational DB
Greenplum
Relational DB
Postgres
Relational DB
IBM DB2
Relational DB
Cassandra
NoSQL (We don’t integrate)
MongoDB
NoSQL
Elasticsearch
NoSQL
Data Lake
A centralized repository that stores, processes, and secures large amounts of data in its original form. It can store structured, semi-structured, and unstructured data from any source without sacrificing fidelity.
Hadoop
Data Lake (on-prem/open-source)
Amazon S3
Data Lake (cloud)
Azure Data Lake Storage
Data Lake (cloud)
Google Cloud Storage
Data Lake (cloud)
MinIO
On-prem object storage
Dell Isilon
On-prem object storage
Dell ECS
On-prem object storage
Pure Storage
On-prem object storage
Qumulo
On-prem object storage
NetApp
On-prem object storage
Data Warehouse
A proprietary system for storing and managing data, built for analytics. Costs are high, and getting data in takes time.
Teradata
Data warehouse (on-prem)
Vertica
Data warehouse (on-prem)
Oracle
Data warehouse
Netezza
Data warehouse (on-prem)
Snowflake
Data warehouse (cloud)
Amazon Redshift
Data warehouse (cloud)
Azure Synapse Analytics
Data warehouse (cloud)
Firebolt
Data warehouse (cloud)
SQL Engine
Engine that queries data
Impala
Cloudera SQL Engine (assume Hadoop)
Drill
SQL Engine (MapR)(Hadoop)
Hive
SQL Engine (Hortonworks)
Presto
SQL Engine (built by Facebook; open-source, free) (connects to all data lakes like we do)
Amazon Athena
SQL Engine (only on S3)
Starburst
SQL Engine (most like Dremio); connects to all data lakes and to SQL and NoSQL sources.
Trino
SQL Engine
Databricks SQL
SQL Engine
ETL (Extract, Transform, Load)
Extracting data from one repository, transforming it, and loading it into another
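As a rough illustration of the three ETL steps, here is a minimal sketch using Python's built-in `sqlite3` module, with two in-memory databases standing in for the source and target repositories. The table names (`raw_orders`, `orders`), the columns, and the sample rows are all made up for this example; real ETL tools such as the ones listed in this deck work at a much larger scale.

```python
import sqlite3


def run_etl(source, target):
    """Extract rows from `source`, transform them, and load them into `target`."""
    # Extract: read the raw rows from the source repository.
    rows = source.execute("SELECT name, amount_cents FROM raw_orders").fetchall()
    # Transform: tidy up the names and convert cents to dollars.
    cleaned = [(name.strip().title(), cents / 100) for name, cents in rows]
    # Load: write the transformed rows into the target repository.
    target.execute("CREATE TABLE IF NOT EXISTS orders (name TEXT, amount REAL)")
    target.executemany("INSERT INTO orders VALUES (?, ?)", cleaned)
    target.commit()
    return len(cleaned)


# Hypothetical source repository with two messy rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE raw_orders (name TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [("  alice ", 1250), ("BOB", 300)],
)

# Hypothetical target repository, empty before the run.
target = sqlite3.connect(":memory:")
loaded = run_etl(source, target)
result = target.execute("SELECT name, amount FROM orders ORDER BY name").fetchall()
print(loaded, result)
```

The transform step is deliberately trivial here; the point is only the order of operations: read from one place, reshape, write to another.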
Informatica
ETL
Microsoft SSIS
ETL
IBM DataStage
ETL
Data Prep
Collect, clean, transform and organize raw and incomplete data into a suitable and consistent format for further data processing.
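To make the definition concrete, here is a small sketch of data prep in plain Python. The record layout (`name`, `age`), the cleaning rules, and the sample data are assumptions chosen for illustration, not how any of the tools in this deck work internally.

```python
def prepare(records):
    """Clean raw records and return them in one consistent format."""
    prepared = []
    for rec in records:
        name = (rec.get("name") or "").strip()
        if not name:
            continue  # drop incomplete records that have no name
        raw_age = str(rec.get("age", "")).strip()
        # Keep the age only if it is a plain whole number; otherwise mark it missing.
        age = int(raw_age) if raw_age.isdigit() else None
        prepared.append({"name": name.title(), "age": age})
    return prepared


# Made-up raw input: inconsistent casing, stray spaces, a missing name, a bad age.
raw = [
    {"name": " alice smith ", "age": "36"},
    {"name": "", "age": "41"},
    {"name": "BOB JONES", "age": "n/a"},
]
clean = prepare(raw)
print(clean)
```

After this step every surviving record has the same keys, a title-cased name, and an age that is either an integer or `None`, which is what "suitable and consistent format" amounts to in this toy case.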
Alteryx
Data prep
Paxata
Data prep
Trifacta
Data prep
BI/BA
Software applications that collect, process and analyze data to help businesses make sense of it.
Tableau
BI
Power BI
BI
Cognos
BI
MicroStrategy
BI
Looker
BI
Qlik
BI
Machine Learning
Used to analyze data and identify patterns, which are then used to create a data model that can make predictions.
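As a toy illustration of "identify a pattern, then predict", the sketch below fits a straight line to a few points with ordinary least squares and uses it to predict a new value. This is only one very simple kind of model, the data points are invented, and the function names (`fit_line`, `predict`) are hypothetical; real ML work would normally use a library rather than hand-written arithmetic.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use a fitted (slope, intercept) pair to predict y for a new x."""
    slope, intercept = model
    return slope * x + intercept


# Invented training data that follows the pattern y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
model = fit_line(xs, ys)
prediction = predict(model, 5)
print(model, prediction)
```

The fitted pair `(slope, intercept)` is the "data model" in the definition above: it summarizes the pattern in the training data and can be applied to inputs it has not seen.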
Jupyter
ML
Spark
ML (Databricks)
SAS
ML
Anaconda
ML
Virtualization
Data virtualization is an approach to data management that allows an application to retrieve and manipulate data without requiring technical details about the data, such as how it is formatted at source, or where it is physically located, and can provide a single customer view of the overall data.
Denodo
Virtualization
Tibco
Virtualization