Machine Learning Engineer - ML Data Storage Flashcards

Description and Tags

AWS ML storage options and best practices

12 Terms

1

What data access pattern does Amazon FSx support?

random access

2

What data access pattern does Amazon EFS support?

random access

3

What data access pattern does Amazon EBS support?

random access (EBS is block storage attached to an EC2 instance, so it also handles sequential reads)

4

What storage type is best for dataset storage?

S3

5

What storage type is best for real-time and streaming workloads?

Amazon EFS, because it supports low-latency, concurrent access from multiple instances.

6

What storage type is best for training large ML workloads?

Amazon EBS, because it gives EC2 instances high-throughput random I/O access.

7

What storage type is best for ML inference workloads?

Amazon EBS or Amazon EFS, depending on the latency and concurrency requirements.

8

What is FSx?

Amazon FSx is a fully managed file storage service that provides file systems optimized for specific workloads, such as high-performance computing or machine learning.

9

When is FSx used?

FSx is used when high-performance file storage is required for workloads like machine learning, big data analytics, or high-performance computing.

10

What are the storage options for ML?

The storage options for machine learning include Amazon S3 for object storage, Amazon EBS for block storage, and Amazon EFS or FSx for file storage, each catering to different performance and scalability needs.

11

What are the object-notation data formats for ML, and how are they used?

JSON (JavaScript Object Notation) and JSONL (JSON Lines). They are used for non-tabular, hierarchical data.
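
To make the difference between the two formats concrete, here is a minimal Python sketch using only the standard library. The records and their field names (`id`, `label`, `bbox`) are hypothetical and chosen just to show nesting; they do not come from any particular dataset or AWS service.

```python
import json

# Hypothetical nested records; the field names are illustrative only.
records = [
    {"id": 1, "label": "cat", "bbox": {"x": 10, "y": 20, "w": 50, "h": 40}},
    {"id": 2, "label": "dog", "bbox": {"x": 5, "y": 8, "w": 30, "h": 25}},
]

# JSON: the whole dataset is one document, parsed in a single call.
json_text = json.dumps(records, indent=2)
parsed_json = json.loads(json_text)

# JSONL: one complete JSON object per line, so records can be read one at a time.
jsonl_text = "\n".join(json.dumps(r) for r in records)
parsed_jsonl = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

# Both encodings carry the same hierarchical data.
print(parsed_json == parsed_jsonl)
print(parsed_jsonl[1]["bbox"]["w"])
```

Because each JSONL line is a self-contained object, a reader can process a large file record by record instead of parsing it all at once.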

12

What data access pattern does Amazon S3 support?

copy and load
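
As a rough local-file analogy for the access-pattern terms used in this set (random access, sequential streaming, copy and load), the Python sketch below performs each pattern on a temporary file. It is only an illustration under stated assumptions: it makes no AWS API calls, and the file name, record size, and record layout are hypothetical.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "dataset.bin")  # hypothetical file name

# Write 10 fixed-size records of 4 bytes each (record i is the byte i repeated).
RECORD_SIZE = 4
with open(source, "wb") as f:
    for i in range(10):
        f.write(bytes([i]) * RECORD_SIZE)

# Random access: jump straight to record 7 without reading records 0-6.
with open(source, "rb") as f:
    f.seek(7 * RECORD_SIZE)
    random_record = f.read(RECORD_SIZE)

# Sequential streaming: consume records in order, one chunk at a time.
streamed = []
with open(source, "rb") as f:
    while chunk := f.read(RECORD_SIZE):
        streamed.append(chunk[0])

# Copy and load: copy the whole file to local scratch space, then read it all.
local_copy = os.path.join(workdir, "local_copy.bin")
shutil.copyfile(source, local_copy)
with open(local_copy, "rb") as f:
    loaded = f.read()

print(random_record)
print(streamed)
print(len(loaded))

# Clean up the temporary directory.
shutil.rmtree(workdir)
```

The trade-off the sketch hints at: copy and load pays the full transfer cost up front, streaming avoids that but only moves forward, and random access lets a training job sample records in any order.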