L5-L7 Data Lakes and Warehouse Concepts

Description and Tags

These flashcards cover key vocabulary and concepts related to Data Lakes, Data Warehouses, and the principles of managing large datasets.

10 Terms

1

Big Data

Data characterized by large volume, variety, and velocity, at a scale that drives transformative changes in how data is processed.

2

Schema-on-read

A method where schema is applied at the time of reading data, allowing for flexible data storage without strict structure upfront.
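
As a minimal sketch of this idea, the Python below stores raw JSON lines untouched and applies a schema only when the data is read. The sample records, the `SCHEMA` mapping, and the `read_with_schema` helper are hypothetical names chosen for illustration, not part of any particular tool.

```python
import json

# Raw records are stored as-is; no structure is enforced at write time.
raw_lines = [
    '{"user": "ana", "amount": "19.90", "ts": "2024-01-05"}',
    '{"user": "ben", "amount": "5", "note": "gift"}',
]

# Hypothetical schema chosen by the reader: field name -> type converter.
SCHEMA = {"user": str, "amount": float}


def read_with_schema(lines, schema):
    """Parse each raw line and apply the schema only now, at read time."""
    rows = []
    for line in lines:
        record = json.loads(line)
        # Keep only the fields the schema asks for, cast to the requested type.
        rows.append({field: cast(record[field]) for field, cast in schema.items()})
    return rows


rows = read_with_schema(raw_lines, SCHEMA)
```

Fields the schema does not mention (`ts`, `note`) stay in the raw store, so a different reader could apply a different schema to the same lines later.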

3

Data Lake

A storage repository that holds vast amounts of raw data in its native format until it is needed for analysis.

4

Data Warehouse

A centralized repository of structured data that has been cleaned and processed for strategic analysis.

5

ETL

Stands for Extract, Transform, Load; a process used to collect data from various sources, transform it, and load it into a data warehouse.
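
The three steps can be sketched in a few lines of Python. This is an illustrative toy, not a production pipeline: the CSV text, the `orders` table, and the cleaning rules are all assumptions, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export standing in for a source system.
SOURCE_CSV = "order_id,customer,amount\n1, Ana ,19.90\n2,ben,\n3,Cy,5\n"


def extract(text):
    """Extract: read rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean names, cast types, drop rows with a missing amount."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip incomplete rows
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().title(), float(row["amount"]))
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database standing in for the warehouse
load(transform(extract(SOURCE_CSV)), conn)
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

Because the transform runs before the load, only cleaned, typed rows ever reach the `orders` table.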

6

Data Silos

Isolated pockets of data that are not accessible or shared across different departments or systems.

7

Raw Data

Unprocessed data that has not been subjected to any transformation, cleaning, or structuring.

8

Data Quality

The measure of data's fitness for its intended purpose, which encompasses accuracy, completeness, consistency, and timeliness.
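
Two of these dimensions, completeness and consistency, can be estimated with simple ratios. The sketch below is one possible way to do so; the sample records, the helper names, and the rule that country codes should be upper-case are assumptions made for illustration.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "email": "ana@example.com", "country": "PT"},
    {"id": 2, "email": None, "country": "pt"},
    {"id": 3, "email": "cy@example.com", "country": "DE"},
    {"id": 4, "email": "dee@example.com", "country": None},
]


def completeness(rows, field):
    """Fraction of rows where the field is present (not None)."""
    return sum(row[field] is not None for row in rows) / len(rows)


def consistency(rows, field, is_valid):
    """Fraction of non-missing values that satisfy a format rule."""
    values = [row[field] for row in rows if row[field] is not None]
    return sum(is_valid(value) for value in values) / len(values)


email_completeness = completeness(records, "email")  # 3 of 4 emails present
country_consistency = consistency(records, "country", str.isupper)  # 2 of 3 upper-case
```

Accuracy and timeliness usually need an external reference (a trusted source or a timestamp), so they are left out of this sketch.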

9

Data Governance

The overall management of data availability, usability, integrity, and security in an organization.

10

Machine Learning

A branch of artificial intelligence that uses statistical techniques to let computer systems learn from data without being explicitly programmed.