ingest
to bring data into a system from external or internal sources
transform
to modify data to meet specific requirements or formats
extract
to pull raw data from various sources such as databases
load
to write transformed or extracted data into a storage system such as a data warehouse
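The extract, transform, and load steps defined above can be sketched end to end. This is a minimal illustration, not a production pipeline: the CSV text, the `sales` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw source: CSV text standing in for an external system.
RAW_CSV = "id,name,amount\n1, alice ,10.5\n2,BOB,3\n"


def extract(text):
    """Pull raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Trim and normalize names, and convert fields to the target types."""
    return [
        (int(r["id"]), r["name"].strip().title(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Write the transformed rows into the storage system."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


# In-memory SQLite is an assumption made so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
stored = conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
```

Keeping each step as its own function lets any one of them be swapped out, for example reading from an API instead of CSV, without touching the others.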
clean
to identify and correct or remove errors in data
validate
to ensure data meets predefined quality criteria
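As a rough illustration of validation, the sketch below checks a record against a few quality criteria and returns the list of violations. The fields (`id`, `email`, `age`) and the rules themselves are assumptions chosen for the example.

```python
def validate(record):
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    # Rule 1 (assumed): id must be an integer.
    if not isinstance(record.get("id"), int):
        errors.append("id must be an integer")
    # Rule 2 (assumed): email must be a string containing '@'.
    email = record.get("email")
    if not isinstance(email, str) or "@" not in email:
        errors.append("email must contain '@'")
    # Rule 3 (assumed): age must be an integer between 0 and 130.
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 130:
        errors.append("age must be an integer between 0 and 130")
    return errors


good = {"id": 1, "email": "a@example.com", "age": 30}
bad = {"id": "x", "email": "nope", "age": 200}

good_errors = validate(good)
bad_errors = validate(bad)
```

Returning every violation, rather than stopping at the first, makes it easier to report all problems in a record at once.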
aggregate
to combine data from multiple records or datasets to provide summarized information
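Aggregation can be sketched as grouping records by a key and summarizing each group. The `orders` data and the `aggregate_by` helper below are hypothetical and use only the standard library.

```python
from collections import defaultdict

# Hypothetical input records.
orders = [
    {"region": "north", "total": 120.0},
    {"region": "south", "total": 80.0},
    {"region": "north", "total": 30.0},
]


def aggregate_by(records, key, value):
    """Group records by `key` and compute the count and sum of `value`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        sums[rec[key]] += rec[value]
        counts[rec[key]] += 1
    return {k: {"count": counts[k], "sum": sums[k]} for k in sums}


summary = aggregate_by(orders, "region", "total")
```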
integrate
to merge data from multiple systems or sources into a unified system for a more comprehensive view or analysis.
parse
to break down raw data into a structured form
optimize
to improve the performance or efficiency of data processing tasks
schedule
to set up automated tasks to run at specified intervals
monitor
to observe systems
stream
to process data in real-time as it is being produced
query
to retrieve specific data from databases or systems using SQL or a NoSQL query language.
index
to create data structures to speed up retrieval operations by mapping specific fields to database records
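The idea behind an index can be shown with a plain dictionary that maps a field value to the positions of matching records, so a lookup avoids scanning every row. Real databases typically use structures such as B-trees. The `records` data and the `build_index` helper here are hypothetical.

```python
from collections import defaultdict

# Hypothetical table of records.
records = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Lima"},
    {"id": 3, "city": "Oslo"},
]


def build_index(rows, field):
    """Map each value of `field` to the list positions of the matching rows."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[field]].append(pos)
    return index


city_index = build_index(records, "city")

# Lookup goes through the index instead of scanning all records.
oslo_rows = [records[i] for i in city_index["Oslo"]]
```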
partition
to divide data into distinct sections or chunks based on specific criteria (like date ranges) to improve performance and organization.
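Partitioning by date range can be sketched by bucketing records under a year-month key. The `events` data and the month-level granularity are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical events, each tagged with the day it occurred.
events = [
    {"day": date(2024, 1, 5), "value": 1},
    {"day": date(2024, 1, 20), "value": 2},
    {"day": date(2024, 2, 3), "value": 3},
]


def partition_by_month(rows):
    """Split rows into partitions keyed by 'YYYY-MM'."""
    parts = defaultdict(list)
    for row in rows:
        parts[row["day"].strftime("%Y-%m")].append(row)
    return dict(parts)


partitions = partition_by_month(events)
```

A query that only needs January can then read `partitions["2024-01"]` and skip everything else.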
cluster
to organize data into groups that have similar characteristics, or to group servers so that they process data in parallel
scale
to increase or decrease the capacity of a system to handle more or less data efficiently
replicate
to create copies of data across multiple systems or nodes to ensure high availability
archive
to store older or less frequently used data in long-term storage solutions
migrate
to move data from one system to another
profile
to analyze the content of a dataset
secure
to implement safeguards such as encryption
version
to manage different versions of data or schema to ensure traceability and compatibility
catalog
to create an organized inventory of available data assets
visualize
to represent data graphically (for example, as charts)
deploy
to implement data pipelines
automate
to reduce manual intervention by configuring systems to carry out repetitive data-related tasks
backfill
to process and load historical data that was previously missing or unavailable, so that the dataset is complete.
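A backfill can be sketched as finding the gaps in what has already been loaded and processing only those. The date range, the `loaded_days` set, and the `process_day` callback below are hypothetical stand-ins for a real pipeline's state and job.

```python
from datetime import date, timedelta

# Hypothetical record of which days have already been loaded.
loaded_days = {date(2024, 3, 1), date(2024, 3, 2), date(2024, 3, 5)}


def missing_days(start, end, present):
    """List every day in [start, end] that is not in `present`."""
    gaps = []
    day = start
    while day <= end:
        if day not in present:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


def backfill(start, end, present, process_day):
    """Run `process_day` for each missing day and mark it as loaded."""
    filled = []
    for day in missing_days(start, end, present):
        process_day(day)  # stand-in for the real historical load
        present.add(day)
        filled.append(day)
    return filled


processed = []
filled = backfill(date(2024, 3, 1), date(2024, 3, 5), loaded_days, processed.append)
```

Because already-loaded days are skipped, running the same backfill again does no further work.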
log
to record events or actions taken by data systems