Flashcards on Data Extraction Best Practices for ML
What are the methods to transfer/extract data for ML?
The AWS CLI, the AWS SDKs, S3 Transfer Acceleration, Database Migration Service (DMS), Lambda, Glue, DataSync, and Snowball.
What is S3 Transfer Acceleration?
A service that accelerates large data transfers to and from S3.
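As a rough illustration of how a client opts in to Transfer Acceleration, the sketch below builds the accelerate hostname for a bucket and, optionally, a boto3 client that uses it. The helper names `accelerate_endpoint` and `make_accelerated_client` and the bucket name are hypothetical; the sketch assumes acceleration has already been enabled on the bucket and that boto3 is installed if the second helper is called.

```python
def accelerate_endpoint(bucket: str, dualstack: bool = False) -> str:
    """Return the Transfer Acceleration hostname for a bucket (hypothetical helper)."""
    # Acceleration requires a DNS-compliant bucket name without periods.
    if "." in bucket:
        raise ValueError("Transfer Acceleration does not support bucket names containing periods")
    suffix = "s3-accelerate.dualstack.amazonaws.com" if dualstack else "s3-accelerate.amazonaws.com"
    return f"{bucket}.{suffix}"


def make_accelerated_client():
    """Build an S3 client that routes requests through the accelerate endpoint."""
    # Imported lazily so the rest of the sketch runs without boto3 installed.
    import boto3
    from botocore.config import Config

    return boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))


if __name__ == "__main__":
    # "my-ml-data" is a placeholder bucket name.
    print(accelerate_endpoint("my-ml-data"))
```

Uploads and downloads made through a client configured this way use the same calls as a regular S3 client; only the endpoint changes.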
What is Database Migration Service (DMS)?
Facilitates data migration between cloud databases or into S3. It allows the data to be extracted in SQL, XML, JSON, and CSV formats.
How can Lambda functions be used to extract/transfer data for ML?
They can be run on a schedule or triggered by an event to extract data from AWS storage sources.
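To make the event-triggered case concrete, here is a minimal sketch of a handler that reads the bucket and key from an S3 event notification. The helper name `extract_s3_objects` and the sample key are assumptions for illustration, and the actual extraction step (copying or parsing the object) is left out.

```python
from typing import List, Tuple
from urllib.parse import unquote_plus


def extract_s3_objects(event: dict) -> List[Tuple[str, str]]:
    """Collect (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if not s3:
            continue  # skip records that did not come from S3
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications, so decode them.
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    """Entry point; a real function would read or copy each object here."""
    objects = extract_s3_objects(event)
    return {"objects": [{"bucket": b, "key": k} for b, k in objects]}
```

The same handler could be invoked on a schedule instead; in that case the event carries no `Records` and the function would need its own list of locations to read.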
What storage locations does AWS Lambda support?
S3, EFS, FSx, RDS, and DynamoDB.
What is AWS Glue?
A managed ETL (extract, transform, load) service that integrates with S3, RDS, and DynamoDB and supports workflows for orchestrating ETL jobs.
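One way to kick off a Glue job from code is the `start_job_run` call on the boto3 Glue client, sketched below. The function name `start_extraction_job`, the job name, and the `--source_path` argument are hypothetical; the job itself is assumed to exist already and to read that argument.

```python
def start_extraction_job(glue_client, job_name: str, source_path: str) -> str:
    """Start a run of an existing Glue job and return its run ID."""
    response = glue_client.start_job_run(
        JobName=job_name,
        # Job arguments are passed as strings; "--source_path" is an assumed
        # parameter that the job script would have to read.
        Arguments={"--source_path": source_path},
    )
    return response["JobRunId"]


if __name__ == "__main__":
    # Requires boto3 and AWS credentials; names below are placeholders.
    import boto3

    run_id = start_extraction_job(
        boto3.client("glue"), "ml-extract-job", "s3://raw-data/input/"
    )
    print(run_id)
```

Passing the client in as a parameter keeps the function easy to exercise with a stand-in object when no AWS account is available.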
What is DataSync?
A service that enables you to transfer data between on-premises storage and the cloud, or between AWS storage systems.
What data sources does DataSync support?
Network file systems or network-attached storage (NAS).
What is Snowball?
A physical device service that enables large data transfers into and out of AWS. It is used for transferring terabytes or petabytes of data into S3.
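A quick back-of-the-envelope calculation shows why a shipped device can beat the network at this scale. The function below is an illustrative sketch, not an AWS tool; the name `network_transfer_days`, the decimal-terabyte convention, and the default 80% link utilization are all assumptions.

```python
def network_transfer_days(data_terabytes: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Estimate the days needed to move a dataset over a network link."""
    if link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("link_mbps must be positive and utilization in (0, 1]")
    total_bits = data_terabytes * 1e12 * 8          # decimal terabytes to bits
    effective_bps = link_mbps * 1e6 * utilization   # usable bits per second
    return total_bits / effective_bps / 86_400      # seconds to days


if __name__ == "__main__":
    # 100 TB over a fully used 1 Gbps link takes a little over nine days.
    print(round(network_transfer_days(100, 1000, utilization=1.0), 2))
```

If the estimate runs to weeks or months for the available link, a physical transfer is worth considering.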