Flashcards about data storage considerations in an analytics pipeline using AWS services.
Modern Data Architecture
An architecture built around a central data lake, surrounded by purpose-built data stores for specific workloads. It makes data easy to move between stores while controlling access and enabling efficient analysis.
AWS Lake Formation
Manages the data lake within AWS.
AWS Glue
Provides the data catalog within AWS.
Amazon Athena
Offers a SQL query engine for directly analyzing data from the data lake.
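To make the card concrete, here is a minimal sketch of how a query might be prepared for Athena from Python. The keys `QueryString`, `QueryExecutionContext`, and `ResultConfiguration` are the parameters of boto3's `start_query_execution`; the helper name `build_athena_request`, the table `web_logs`, the database `analytics_lake`, and the bucket `example-athena-results` are all hypothetical.

```python
def build_athena_request(sql, database, output_s3_uri):
    """Build keyword arguments for boto3's Athena start_query_execution call.

    Hypothetical helper: it only assembles the request and does not contact AWS.
    """
    if not output_s3_uri.startswith("s3://"):
        raise ValueError("output location must be an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


# Assumed names: the table, database, and results bucket are illustrative only.
request = build_athena_request(
    "SELECT status, COUNT(*) AS hits FROM web_logs GROUP BY status",
    database="analytics_lake",
    output_s3_uri="s3://example-athena-results/queries/",
)

# With AWS credentials configured, the request could be submitted like this:
#   import boto3
#   boto3.client("athena").start_query_execution(**request)
```

The data stays in the data lake; Athena reads it where it is and writes the query results to the output location.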
Data Warehouse Use Case
Highly structured, curated data for complex queries and business analytics, justifying a higher storage cost.
Data Lake Use Case
Raw data, often unstructured or semi-structured, kept available for exploration at a lower storage cost.
Amazon Redshift Spectrum
A feature of Amazon Redshift that queries data in Amazon S3 in place, without loading it into Redshift. Keeping that data in S3 costs less than warehouse storage.
Pipeline Storage Selection
Optimizes cost and business value by using a combination of storage types as data moves through the pipeline.
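One way to picture this trade-off is a small decision rule. The sketch below is illustrative only: the `Dataset` fields, the `choose_storage` function, and the rule itself are assumptions made for this card, not AWS guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    curated: bool             # cleaned and modeled for business analytics
    queried_frequently: bool  # hit by complex queries on a regular basis


def choose_storage(ds):
    """Pick a storage tier for a dataset (hypothetical rule of thumb)."""
    if ds.curated and ds.queried_frequently:
        # High business value justifies the higher warehouse storage cost.
        return "Amazon Redshift (warehouse storage)"
    if ds.curated:
        # Structured but rarely queried: keep it in S3 and query it in place.
        return "Amazon S3, queried through Redshift Spectrum"
    # Raw data stays in the lake for low-cost exploration.
    return "Amazon S3 data lake (raw zone)"


for ds in (
    Dataset("clickstream_raw", curated=False, queried_frequently=False),
    Dataset("sales_history_2015", curated=True, queried_frequently=False),
    Dataset("daily_revenue", curated=True, queried_frequently=True),
):
    print(f"{ds.name}: {choose_storage(ds)}")
```

The same dataset can move between tiers as it is cleaned and as demand for it changes.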
Major Components of AWS Data Architecture
Includes an Amazon S3 data lake and an Amazon Redshift data warehouse.