Fundamentals Of Data Engineering Masterclass

Description and Tags

Concepts and notes taken from the following video: https://youtu.be/hf2go3E2m8g?si=SR_wCm6Blra8IO1R

14 Terms

Data Engineering Pipeline

The process from data generation to data delivery that defines the core function of a Data Engineer (DE). DEs regularly build these for various business use cases.

Data Integration

The process of taking data from multiple data generation sources and combining it into a single record prior to data delivery.
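
A minimal sketch of the idea, assuming two hypothetical sources (`crm_rows` and `billing_rows`) that share a `customer_id` key. The names and data are made up for illustration and are not from the video.

```python
# Hypothetical rows from two separate data generation sources.
crm_rows = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
billing_rows = [
    {"customer_id": 1, "plan": "pro"},
    {"customer_id": 2, "plan": "free"},
]


def integrate(*sources, key="customer_id"):
    """Merge rows that share the same key into one record per key."""
    merged = {}
    for rows in sources:
        for row in rows:
            # Create the record on first sight, then fold in new fields.
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


records = integrate(crm_rows, billing_rows)
print(records)
```

Each customer ends up as one combined record holding fields from both sources. Real integration work usually also has to handle conflicting values and missing keys, which this sketch ignores.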

Data Engineering Life Cycle

The overall process Data Engineers follow when building out data pipelines. Involves taking data from data generation, ingesting it into a pipeline, transforming it, storing it, and finally delivering it to users.
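
The stages above can be sketched as one small function per stage. This is only an illustration under assumptions: the sensor-style log lines, the function names, and the in-memory SQLite table `readings` are all hypothetical.

```python
import json
import sqlite3


def generate():
    # Data generation: hypothetical raw JSON lines from a source system.
    return [
        '{"device": "a", "temp_c": "21.5"}',
        '{"device": "b", "temp_c": "19.0"}',
    ]


def ingest(lines):
    # Ingestion: bring the raw lines into the pipeline as Python dicts.
    return [json.loads(line) for line in lines]


def transform(rows):
    # Transformation: cast the temperature from text to a number.
    return [(row["device"], float(row["temp_c"])) for row in rows]


def store(rows, conn):
    # Storage: persist the cleaned rows in a table.
    conn.execute("CREATE TABLE IF NOT EXISTS readings (device TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()


def deliver(conn):
    # Delivery: expose the stored data to users through a query.
    query = "SELECT device, temp_c FROM readings ORDER BY device"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
store(transform(ingest(generate())), conn)
result = deliver(conn)
print(result)
```

In practice each stage is usually a separate tool or service, but the order of the hand-offs is the same.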

Data Generation

The first stage of the Data Engineering Life Cycle and the origin of the data that DEs work with. Sources include, but are not limited to:

  • RDBMS
  • IoT devices
  • APIs or 3rd-party data
  • Machine logs

Data Storage

Not to be confused with the storage layer in the Software Engineering sense. This is the layer that underlies the ingestion, transformation, and delivery stages of the DE life cycle. Common forms include, but are not limited to:

  • RDBMS
  • NoSQL DB
  • Data Warehouse(s)
  • Object Storage / Data Lake (e.g. Amazon S3)

Data Modeling

A visual representation or blueprint of how a database is organized. Often indicates how tables are tied together via their relationships.
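
As a rough illustration, a two-table model can be written down as a schema in which one table points at the other. The `customers` and `orders` tables below are hypothetical examples, built with Python's standard `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Ask SQLite to enforce foreign key relationships on this connection.
conn.execute("PRAGMA foreign_keys = ON")

# The model: each order belongs to exactly one customer.
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers (customer_id),
        total       REAL NOT NULL
    );
    """
)

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (10, 1, 25.0)")

# The relationship lets the two tables be joined back together.
rows = conn.execute(
    "SELECT c.name, o.total "
    "FROM orders AS o JOIN customers AS c ON c.customer_id = o.customer_id"
).fetchall()
print(rows)
```

The `REFERENCES` clause is the relationship a data model diagram would draw as a line between the two tables.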