Flashcards covering key vocabulary and concepts related to data engineering.
ETL (Extract, Transform, Load)
A process in data engineering that extracts data from various sources, transforms it, and loads it into a target system.
Data Warehouse
A central hub designed to store cleaned, structured, and processed data optimized for reporting and analysis.
Data Lake
A large-scale repository that stores structured, semi-structured, and unstructured data in its raw, original form.
Data Maturity
A measure of how well an organization uses, integrates, and manages data for competitive advantage.
DataOps
An approach to data management that improves data quality and speeds up data development and analysis through automation and collaboration.
Apache Spark
An in-memory, high-speed data processing engine that supports both batch and streaming data processing.
NoSQL Databases
Non-relational databases with flexible schemas, such as MongoDB and Cassandra, designed to handle unstructured and semi-structured data.
ETL Process
The three-step process of extracting, transforming, and loading data into target systems for analysis.
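The three steps can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the sample CSV text, the `payments` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all hypothetical, and an in-memory SQLite database stands in for the target system.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real job would read from files, APIs, or databases.
RAW_CSV = "name,amount\nalice, 10.5 \nbob,\ncarol,7\n"


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, tidy names, cast amounts to float."""
    cleaned = []
    for row in rows:
        amount = (row["amount"] or "").strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append((row["name"].strip().title(), float(amount)))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()


def run_etl(conn):
    """Run the three steps in order and return the loaded rows."""
    load(transform(extract(RAW_CSV)), conn)
    return conn.execute(
        "SELECT name, amount FROM payments ORDER BY name"
    ).fetchall()
```

Calling `run_etl(sqlite3.connect(":memory:"))` runs the whole flow against a throwaway database; the row with the missing amount is dropped during the transform step.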
Data Pipeline
An automated workflow that manages the flow of data through a series of processing steps from source to destination.
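One way to picture a pipeline is as an ordered list of stages, each taking the previous stage's output. The sketch below assumes that simplified model; `run_pipeline` and the three stage functions are hypothetical names chosen for illustration, and real pipelines typically add scheduling, retries, and monitoring on top.

```python
from functools import reduce


def run_pipeline(records, stages):
    """Pass the records through each stage in order, source to destination."""
    return reduce(lambda data, stage: stage(data), stages, records)


def drop_empty(records):
    """Stage 1: remove empty values."""
    return [r for r in records if r]


def to_upper(records):
    """Stage 2: normalize values to upper case."""
    return [r.upper() for r in records]


def deduplicate(records):
    """Stage 3: remove duplicates while keeping first-seen order."""
    return list(dict.fromkeys(records))


result = run_pipeline(["a", "", "b", "a"], [drop_empty, to_upper, deduplicate])
```

Because each stage is a plain function, stages can be reordered, tested in isolation, or swapped out without touching the rest of the flow.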
Big Data
Datasets too large or complex for traditional tools, which require specialized technologies such as Hadoop and Spark for processing, storage, and analysis.