AWS Machine Learning - Specialty Official Practice Question Set (MLS-C01) v2

51 Terms

New cards

Pre-splitting Your Data

Used when you need explicit control over the data in your training and evaluation datasources.

New cards

Sequentially Splitting Your Data

This approach is useful if you want to evaluate your ML models on data from a certain date or within a certain time range.

New cards

Randomly Splitting Your Data

This approach is useful to ensure that the distribution of the data is similar in the training and evaluation datasources.

New cards

True

When randomly splitting data, it is important to use the same seed string value for both datasources and to set the complement flag for one datasource.
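
As a rough illustration of the card above, the sketch below builds a matching pair of split definitions in the style of Amazon ML's DataRearrangement JSON. The key names (`percentBegin`, `percentEnd`, `strategy`, `randomSeed`, `complement`) are assumed from that format, and the helper name and the seed value are hypothetical placeholders.

```python
import json


def split_rearrangements(seed: str, train_percent: int = 70):
    """Return (training, evaluation) split definitions as JSON strings.

    Both definitions share the same seed and percent range; only the
    evaluation one carries the complement flag, so it selects the rows
    that the training definition leaves out.
    """
    splitting = {
        "percentBegin": 0,
        "percentEnd": train_percent,
        "strategy": "random",
        "randomSeed": seed,  # must be identical for both datasources
    }
    training = {"splitting": dict(splitting)}
    # Assumption: the flag is written as the string "true".
    evaluation = {"splitting": dict(splitting, complement="true")}
    return json.dumps(training), json.dumps(evaluation)


# Hypothetical seed string, used only for illustration.
train_json, eval_json = split_rearrangements("s3://example-bucket/data.csv")
print(train_json)
print(eval_json)
```

Reusing one seed keeps the two selections disjoint; with different seeds the evaluation rows could overlap the training rows.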

New cards

True

A common pitfall in developing a high-quality ML model is evaluating the ML model on data that is not similar to the data used for training.

New cards

True

The model and evaluation are too dissimilar (have extremely different descriptive statistics) to be useful.

New cards

This can happen when input data is sorted by one of the columns in the dataset and then split sequentially.
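
The pitfall described above can be reproduced with a small, self-contained sketch. The dataset here is synthetic (the integers 0 to 999, already sorted), and the 70/30 fraction and the seed are arbitrary assumptions chosen for illustration.

```python
import random
from statistics import mean

# Synthetic "input data sorted by one of the columns".
rows = sorted(range(1000))


def sequential_split(data, train_fraction=0.7):
    """Take the first part for training and the rest for evaluation."""
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]


def random_split(data, train_fraction=0.7, seed=42):
    """Shuffle a copy with a fixed seed, then cut it the same way."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    return sequential_split(shuffled, train_fraction)


seq_train, seq_eval = sequential_split(rows)
rnd_train, rnd_eval = random_split(rows)

# Compare one descriptive statistic (the mean) across the two datasources.
seq_gap = abs(mean(seq_train) - mean(seq_eval))
rnd_gap = abs(mean(rnd_train) - mean(rnd_eval))
print("sequential split mean gap:", seq_gap)
print("random split mean gap:", rnd_gap)
```

On sorted input the sequential split puts all small values in training and all large values in evaluation, so the means differ widely; the random split keeps them close.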

New cards

False

You need to use random splitting in Amazon ML if you have already randomized your input data

New cards

groupFiles

Set _ to inPartition to enable the grouping of files within an Amazon S3 data partition.

New cards

groupSize

Set _ to the target size of groups in bytes.

New cards

False

The groupSize property is required

New cards

recurse

Set _ to True to recursively read files in all subdirectories when specifying paths as an array of paths.

New cards

False

You need to set recurse if paths is an array of object keys in Amazon S3
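
The three grouping properties above can be sketched as a plain options dictionary. The helper name, the bucket path, and the choice to pass `groupSize` as a string are assumptions for illustration; the commented call at the end shows how such options are typically handed to an AWS Glue job and is not executed here.

```python
def s3_connection_options(paths, group_size_bytes=None, recurse=False):
    """Build S3 connection options that enable file grouping.

    groupSize is optional, so it is only added when a value is given.
    recurse is only added when requested, since it is not needed when
    paths is an array of object keys.
    """
    options = {"paths": list(paths), "groupFiles": "inPartition"}
    if group_size_bytes is not None:
        # Assumption: the target group size in bytes is passed as a string.
        options["groupSize"] = str(group_size_bytes)
    if recurse:
        options["recurse"] = True
    return options


# Hypothetical bucket path, used only for illustration.
options = s3_connection_options(
    ["s3://example-bucket/logs/"], group_size_bytes=1048576, recurse=True
)
print(options)

# Inside a Glue job, the options would be passed along these lines:
# glue_context.create_dynamic_frame.from_options("s3", options, format="json")
```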

New cards

Sequence-to-Sequence Algorithm

A supervised learning algorithm where the input is a sequence of tokens and the generated output is another sequence of tokens.

New cards

Sequence-to-Sequence Algorithm

Algorithm to use for machine translation.

New cards

Sequence-to-Sequence Algorithm

Algorithm to use for text summarization.

New cards

True

Amazon SageMaker AI seq2seq uses Recurrent Neural Network (RNN) models.

New cards

False

Amazon SageMaker AI seq2seq does not use Convolutional Neural Network (CNN) models

New cards

Data Flow

Create a _ to define a series of ML data prep steps. It is used to combine datasets from different data sources and to identify the number and types of transformations you want to apply to datasets.

New cards

Transform

Clean and transform your dataset using standard transforms, such as string formatting. Examples in usage: text and date/time embedding, categorical encoding.

New cards

Generate Data Insights

Automatically verify data quality and detect abnormalities in your data with Data Wrangler Data Quality and Insights Report.

New cards

True

Amazon SageMaker Canvas supports training a range of model types

New cards

Amazon SageMaker Canvas

Service in which you can build a custom model on the following types of datasets: tabular (including numeric, categorical, timeseries, and text data) and image.

New cards

Numeric prediction

Example: predicting house prices based on features like square footage. Data type: numeric. Data sources: local upload, Amazon S3.

New cards

2 category prediction

Example: predicting whether or not a customer is likely to churn. Data type: binary or categorical. Data sources: local upload, Amazon S3.

New cards

3+ category prediction

Example: predicting patient outcomes after being discharged from the hospital. Data type: categorical. Data sources: local upload, Amazon S3.

New cards

Time series forecasting

Example: predicting your inventory for the next quarter. Data type: timeseries. Data sources: local upload, Amazon S3.

New cards

Single-label image prediction

Example: predicting types of manufacturing defects in images. Data type: image (JPG, PNG). Data sources: local upload, Amazon S3.

New cards

Multi-category text prediction

Example: predicting categories of products. Data type: target column is binary or categorical. Data sources: local upload, Amazon S3.

New cards

True

In Amazon ML

New cards

Area Under the (Receiver Operating Characteristic) Curve (AUC)

Amazon ML provides an industry-standard accuracy metric for binary classification models called _

New cards

True

AUC values near 1 indicate an ML model that is highly accurate.

New cards

True

AUC values near 0.5 indicate an ML model that is no better than guessing at random.

New cards

True

AUC values near 0 are unusual to see
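
To make the three AUC ranges above concrete, here is a minimal sketch that computes AUC directly from its pairwise definition: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name and the toy labels and scores are assumptions for illustration, not part of Amazon ML.

```python
def auc(labels, scores):
    """Pairwise AUC: P(score of a positive > score of a negative)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]
print(auc(labels, [0.1, 0.2, 0.8, 0.9]))  # positives ranked above negatives
print(auc(labels, [0.5, 0.5, 0.5, 0.5]))  # constant scores carry no signal
print(auc(labels, [0.9, 0.8, 0.2, 0.1]))  # ranking is fully inverted
```

The three calls correspond to the near-1, near-0.5, and near-0 cases on the cards. The double loop is quadratic, which is fine for a study example but not for large datasets.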