question sets for AWS certificate
what type of ML models use image data?
object recognition, autonomous driving, image classifications
What ML model processes time series data?
predictive
What ML model processes tabular data?
Linear Regression
what data type do predictive ML models use?
time series
what data type do NLP models use?
Text
What data type do linear regression models use?
Tabular
What techniques do you use to prevent model overfitting for an ML model?
Early Stopping - stop the model training once validation performance stops improving, before the model starts learning the noise in the dataset.
Pruning - identify the most important features in the dataset and eliminate those that are irrelevant to the model (feature selection).
Regularization - rank the dataset features based on their importance to the predictive outcome and apply a penalty to those that are less important.
Ensembling - combine the predictions of multiple models, often weak learners, to get a more accurate result than any single model.
Data Augmentation - slightly change the sample data each time the model processes it, so the training set appears more varied.
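Early stopping from the list above can be sketched in a few lines. This is a minimal illustration, not a SageMaker API: the function name, the `patience` parameter, and the precomputed validation losses are all assumptions made so the example runs without a real model.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch index at which training would stop.

    val_losses is a hypothetical list of validation losses, one per epoch.
    Training stops after `patience` consecutive epochs without improvement.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # Never triggered: training ran through every epoch.
    return len(val_losses) - 1
```

A rising validation loss while training loss keeps falling is the usual sign that the model has started fitting noise, which is why the check watches validation loss only.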
How to prevent overfitting of a neural network model?
add dropout layers to the network, which randomly set some neuron outputs to zero during training and push the model to learn more generalized patterns
What algorithm is designed to handle class imbalance?
XGBoost (Extreme Gradient Boosting) - a supervised learning algorithm, used for structured data problems, that tries to predict a target value by combining estimates from multiple simpler models. It has a tuning parameter (scale_pos_weight) that handles class imbalance.
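In XGBoost, the class-imbalance parameter is `scale_pos_weight`, and a commonly suggested starting value is the ratio of negative to positive examples. The helper below is a hypothetical sketch that only computes that ratio from a list of 0/1 labels; it does not call XGBoost itself.

```python
def suggested_scale_pos_weight(labels):
    """Ratio of negative to positive examples in a list of 0/1 labels."""
    negatives = sum(1 for y in labels if y == 0)
    positives = sum(1 for y in labels if y == 1)
    if positives == 0:
        raise ValueError("no positive examples in labels")
    return negatives / positives
```

With 90 negatives and 10 positives the ratio is 9.0, meaning each positive example would be weighted roughly nine times as heavily as a negative one.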
How do you transfer a model from one account to another?
Use the model API to copy the model. The model must be copied to the same Region in which it was created. Give the owner of the model's account a role with access to the role assigned to the new account, which grants that account permission to import the model.
What are the techniques for hyperparameter tuning?
bayesian optimization, random search, grid search
what is random search hyperparameter tuning?
Uses random samples of hyperparameters drawn from the ranges you configure. It lets you tune the model by running training jobs in parallel, because the parameters chosen do not depend on the results of previous training jobs.
what is grid search hyperparameter tuning?
It tries every combination of the categorical values you set. The number of training jobs created equals the total number of distinct combinations of those values.
what is Bayesian optimization hyperparameter tuning?
Optimizes the model for the chosen metric by treating tuning as a regression problem to select the hyperparameters to test. Uses the results of previous training jobs to choose the next set.
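The difference between grid search and random search can be sketched with the standard library. The search space below is hypothetical, and the functions only generate candidate configurations; they do not launch any tuning jobs.

```python
import itertools
import random

# Hypothetical categorical search space, for illustration only.
space = {"optimizer": ["sgd", "adam"], "activation": ["relu", "tanh", "gelu"]}

def grid_search_jobs(space):
    """Enumerate every combination of the categorical values."""
    keys = sorted(space)
    combos = itertools.product(*(space[k] for k in keys))
    return [dict(zip(keys, combo)) for combo in combos]

def random_search_jobs(space, n_jobs, seed=0):
    """Sample each configuration independently of the others."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in space.items()} for _ in range(n_jobs)]
```

Here grid search yields 2 x 3 = 6 configurations, while random search yields however many you ask for. Because no sample depends on an earlier result, the random jobs can all run in parallel.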
What tool is used for automated ML model deployment as IaC?
AWS CloudFormation - nested stacks let you create an automation workflow to deploy components. Feeding the output of one nested stack as input to another ensures the stacks can communicate with each other.
What does SageMaker Pipe mode do?
streams data from S3 with high throughput directly to ML training instances
What does SageMaker File mode do?
copies all the data from S3 to the ML training instances before processing.
What does SageMaker FastFile mode do?
It streams the data from S3 on demand, so training can start before the full dataset has been downloaded. Best for sequential reads. Startup is faster with fewer, larger files, so it is less suited to large numbers of small files.
What is S3 Express One Zone input mode for Sagemaker?
Stores data in a high-throughput, single-AZ storage class in the same AZ as the compute resources to speed up processing. Can be used with the SageMaker File, Pipe and FastFile input modes.
What does SageMaker FSx for Lustre mode do?
Provides high-throughput, low-latency file access that scales to hundreds of gigabytes per second of throughput and millions of IOPS. SageMaker mounts the FSx file system to send data to the training instance.
How do you avoid data transfer costs when using FSx for Lustre input mode with SageMaker model training?
The training job must connect to a VPC. The FSx file system must be in a single AZ, and the training job's VPC subnet must map to the same AZ ID as the file system.
What does EFS input mode do?
SageMaker mounts the EFS file system to feed data to the training instance. The training job must connect to a VPC to reach EFS.
What are the types of boosting?
Extreme Gradient (XGBoost), Gradient and Adaptive
What is Boosting?
Boosting is a method used in machine learning to reduce errors in predictive data analysis. It improves machine models' predictive accuracy and performance by converting multiple weak learners into a single strong learning model. It is often used with decision tree analysis.
What is adaptive boosting?
Adaptive boosting (AdaBoost) creates an ensemble model by combining several weak decision trees sequentially. It assigns weights to the output of individual trees, then gives incorrect classifications from one tree a higher weight before passing them as input to the next tree. After numerous cycles, the boosting method combines these weak rules into a single powerful prediction rule.
What type of problems is adaptive boosting best for?
classification problems
When should you not use adaptive boosting?
where there is high correlation between the features or high data dimensionality.
What is Gradient boosting?
A sequential training technique similar to adaptive boosting, but it does not give more weight to incorrectly classified items. Instead, it optimizes a loss function by adding base learners one at a time, each correcting the errors of the ensemble so far, so the predictions improve over each iteration.
what types of problems is gradient boost best for?
regression and classification
What is extreme gradient boosting?
Gradient boosting that uses multiple CPU cores so that learning can run in parallel.
when is it best to use extreme gradient boosting?
big data applications
What are the cons of boosting?
challenging to use real-time
sensitive to outliers or data values that differ from the rest of the dataset.
What are the benefits of boosting?
ease of use, reduces bias, efficient for large datasets
What are the metrics used to evaluate ML model performance?
Precision
Accuracy
Area Under the Curve
F1
logLoss
Mean Absolute Error
Mean Square Error
Inference Latency
Balanced Accuracy
What is precision?
measure how well a model predicts true positives out of all the positives predicted. TP/(TP + FP)
What type of models use precision as a metric?
binary classification
when should precision be used as a metric?
binary classification when the cost of a false positive is high
What is Mean Square Error?
the average of the squared differences between the actual and predicted values.
What kinds of models use Mean Square Error as a metric?
regression
What is Mean Absolute Error?
the average of the absolute differences between the predicted and actual values. Used to determine model fit to data.
What types of models use Mean Absolute Error as a metric?
regression
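The two regression metrics above can be computed directly from their definitions. This is a small sketch with hypothetical function names, assuming two equal-length lists of numbers.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences; large errors are penalized more."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mean_absolute_error(actual, predicted):
    """Average of absolute differences; every error counts linearly."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

For actual values [3, 5, 2] and predictions [2, 5, 4], the errors are 1, 0 and -2, so MAE is 3/3 = 1.0 and MSE is 5/3. The single larger error dominates MSE more than it does MAE.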
What is LogLoss?
metric used to measure the quality of the model probabilities. It is used to evaluate when a model makes incorrect predictions at high probabilities.
What model types use LogLoss as a metric?
binary and multiclass classification and neural nets
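For the binary case, LogLoss can be sketched as below. The function name and the clipping constant `eps` are assumptions for this example; clipping only keeps `log(0)` from occurring.

```python
import math

def log_loss(labels, probabilities, eps=1e-15):
    """Binary log loss: labels are 0/1, probabilities are P(label == 1)."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)
```

A confident wrong prediction (probability 0.01 for a true positive) costs far more than a hesitant one (probability 0.4), which is the behavior the card describes.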
What is inference latency?
the time it takes to receive a model prediction from a real-time model inference
what is F1?
measure of how well a model predicts data using precision and recall.
What is precision?
a measure of the quality of positive predictions. = TP/(TP + FP)
what is recall?
measures how completely a model finds the actual positives in the dataset. TP/(TP + FN)
What type of model uses F1?
binary classification
When is it best to use F1 metric?
binary classification when there is class imbalance or outliers skew results
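Precision, recall and F1 follow directly from the confusion-matrix counts. The sketch below uses hypothetical function names and assumes the denominators are non-zero; F1 is the harmonic mean of precision and recall.

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of actual positives that the model found."""
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)
```

With TP = 8, FP = 2 and FN = 8, precision is 0.8 and recall is 0.5. F1 is 8/13, which sits closer to the weaker of the two scores.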
What is Balanced Accuracy?
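measures the ratio of accurate predictions to all predictions after normalizing true positives and true negatives by the total number of actual positives (P) and negatives (N). = 0.5*((TP/P) +(TN/N))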
What model types use balanced accuracy?
binary and multiclass classification
When is it best to use balanced accuracy metric?
classification models where the number of positives and negatives differ greatly
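A small sketch shows why balanced accuracy is preferred when positives and negatives differ greatly. The function names are hypothetical, and P = TP + FN and N = TN + FP are taken from the formula above.

```python
def accuracy(tp, fn, tn, fp):
    """Correct predictions over all predictions."""
    return (tp + tn) / (tp + fn + tn + fp)

def balanced_accuracy(tp, fn, tn, fp):
    """Average of the true positive rate and the true negative rate."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))
```

With 10 positives and 90 negatives, a model that finds only half the positives (TP = 5, FN = 5, TN = 90, FP = 0) still scores 0.95 on accuracy but only 0.75 on balanced accuracy.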
What is area under the curve?
measure used to compare binary classification models that return probabilities.
what model type uses area under the curve?
binary classification, logistic regression
When is it best to use the area under the curve metric?
binary classification models where the dataset is balanced.
What is accuracy?
ratio of correctly classified vs all classified items.
What model types use accuracy as a measure?
binary and multiclass classification.
What is ensemble learning?
use of multiple models/algorithms to gain more accurate predictions
What are the kinds of ensemble learning algorithms?
boosting, bagging and stacking
What is bagging?
used to reduce the variance of a single model.
what are the types of bagging?
random forest and extra trees
How does bagging work?
involves training multiple models independently and combining their predictions through averaging or voting.
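The train-independently-then-vote idea can be sketched with a toy base model. Everything here is an assumption made for illustration: the one-dimensional threshold classifier (a decision stump), the toy dataset, and the function names. Real bagging methods such as random forest use full decision trees.

```python
import random
from collections import Counter

def fit_stump(points):
    """Fit a 1-D threshold classifier: predict 1 when x >= threshold."""
    best_threshold, best_correct = None, -1
    for threshold, _ in points:
        correct = sum((x >= threshold) == bool(y) for x, y in points)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

def bagged_stumps(points, n_models=15, seed=0):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(points) for _ in points]) for _ in range(n_models)]

def predict(thresholds, x):
    """Combine the independent models by majority vote."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: the label is 1 when x >= 0.5.
points = [(i / 10, int(i >= 5)) for i in range(10)]
thresholds = bagged_stumps(points)
```

Each stump sees a slightly different sample, so individual thresholds vary. The vote averages out that variance.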
What are the benefits of bagging?
helps prevent overfitting
improves accuracy
reduces variance
What are the cons of bagging?
loss of interpretability due to combining predictions from multiple models
increased computational cost from training multiple models
limited effectiveness on stable models where variance reduction is minimal
potential for overfitting if base models are too complex
increased memory usage due to storing multiple models in memory, making it less suitable for large datasets or real-time applications
When is bagging used?
Classification, Regression, anomaly detection, forecasting, clustering, imbalanced data, feature selection
What is stacking?
entails training numerous base models on the same training dataset, then feeding their predictions into a higher-level model, also known as a meta-model or second-level model, to make the final prediction
what are the advantages of stacking?
improved predictive performance
Supports multiple types of base models
reveals the significance of the base model in the final prediction
What are the disadvantages of stacking?
increased complexity
the need for more computational resources
How is stacking different from bagging and boosting?
In stacking, the base models are different types. In bagging and boosting the base models are the same type.
What is Random Cut Forest?
an unsupervised algorithm for detecting anomalous data points within a data set
How is Random Cut Forest used with time series data?
Anomalies can manifest as unexpected spikes in time series data, breaks in periodicity, or unclassifiable data points. It’s particularly well-suited for scenarios where you need to react to unusual behaviors in near-real-time
What is DeepAR?
a supervised learning model for time series forecasting, generating predictions about future values
what is linear learner?
a supervised learning algorithm for classification and regression tasks, such as binary classification. It can handle class imbalance by adjusting class weights so that the minority class is adequately represented during training.
What is Neural Topic Model (NTM)?
NTM is an unsupervised learning algorithm that is used to organize a corpus of documents into topics that contain word groupings based on their statistical distribution
What is Embedding?
a numerical representation of text (or data) in a high-dimensional vector space. Embeddings allow models to measure the similarity between words, phrases, or sentences, which is essential for tasks like semantic search or clustering
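Similarity between embeddings is often measured with cosine similarity. The sketch below assumes embeddings are plain lists of floats with non-zero length; the function name is hypothetical and any vectors passed to it here are toy values, not output from a real embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Vectors pointing the same way score 1.0 and orthogonal vectors score 0.0. Semantic search ranks documents by this kind of score against the query embedding.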
What is a Token?
units of text (like words, subwords, or characters) processed by an LLM
What is retrieval augmented generation?
a technique that integrates external data retrieval (e.g., from a knowledge base) with generative AI to provide more contextually accurate and relevant outputs. It is particularly useful when the model lacks specific information or domain knowledge
What is Prompt?
an input provided to a model to guide it to generate an appropriate response or output for the input
What is temperature?
adjusts the randomness of LLM outputs. A lower temperature makes outputs more deterministic, while a higher temperature increases variability and creativity.
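One common way temperature is applied is to divide the model's logits by it before the softmax. The sketch below illustrates that, with a hypothetical function name and made-up logits; it is not the sampling code of any particular LLM.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

For logits [2, 1, 0], a temperature of 0.5 concentrates more probability on the top token than a temperature of 2.0 does, which is why low temperatures read as more deterministic.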
What is Stochastic Gradient Descent?
an optimization algorithm used in machine learning that updates a model's parameters by calculating the gradient based on a single randomly selected data point from the training set at each iteration
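The single-sample update can be sketched for a one-feature linear model. The function name, toy data, learning rate and step count are assumptions chosen so the example converges; the loss is squared error.

```python
import random

def sgd_fit(data, learning_rate=0.05, steps=2000, seed=0):
    """Fit y = w * x + b with one randomly chosen sample per update."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        x, y = rng.choice(data)          # a single random data point
        error = (w * x + b) - y
        w -= learning_rate * 2 * error * x   # gradient of error**2 w.r.t. w
        b -= learning_rate * 2 * error       # gradient of error**2 w.r.t. b
    return w, b

# Hypothetical noise-free data generated from y = 2x + 1.
data = [(float(x), 2.0 * x + 1.0) for x in range(4)]
```

On this data the parameters settle near w = 2 and b = 1. Raising `learning_rate` well above 0.1 makes the updates overshoot for the largest x value, which shows up as an oscillating loss.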
How do you stabilize the training process for models trained using stochastic gradient descent?
decrease the learning rate so that the gradient updates become smaller, allowing the model to converge more smoothly and stabilize the training process. With mini-batches, a larger batch size also makes the updates less noisy.
What are the symptoms when the learning rate is too high for a model trained with stochastic gradient descent?
an oscillating pattern of the loss values during training and validation
What is data Drift?
changes in the distribution of the input data over time, which can lead to the model receiving data that is different from what it was trained on
What is model drift?
occurs when the model’s performance degrades because its assumptions or parameters no longer align with the real-world data
What should you do to fix data drift?
Use SageMaker Model Monitor to detect data drift by tracking changes in data distribution, then retrain the model on data that reflects the new distribution
What should you do to fix model drift?
retrain the model with updated data
what types of drift does SageMaker Model Monitor detect?
data quality
model quality
bias in predictions
feature attribution
what is feature attribution?
Feature Attributions is a family of methods for explaining a model’s predictions on a given input by attributing it to features of the individual inputs. The attributions are proportional to the contribution of the feature to the prediction.
What is the AUC metric and what does it measure?
measures the ability of the model to predict a higher score for positive examples as compared to negative examples. It is a measure of model accuracy where values near 1 indicate an ML model that is highly accurate. Values near 0.5 indicate an ML model that is no better than guessing at random. Values near 0 are unusual to see, and typically indicate a problem with the data
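The "higher score for positive examples" reading of AUC can be computed by comparing every positive with every negative. This is a minimal sketch with a hypothetical function name; it assumes 0/1 labels with at least one example of each class, and it is quadratic in the dataset size.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

A model that gives every example the same score gets 0.5, the random-guessing value. A model that ranks every positive above every negative gets 1.0.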
What is an ROC curve and when is it used?
a graphical plot used to show the diagnostic ability of binary classifiers.
How do you integrate SageMaker pipelines and AWS Glue?
Use SageMaker Pipelines callback steps to wait for the AWS Glue jobs to complete and retrieve the outputs directly from Amazon S3
what is concept drift?
Concept drift happens when the relationship between input features (e.g., customer transaction patterns, demographics, or usage behaviors) and the target variable (e.g., fraud detection or churn likelihood) changes over time
how do you configure an on-demand workflow to detect bias drift for models deployed to real-time endpoints?
By enabling data capture in SageMaker endpoints and integrating Clarify, you can run on-demand bias analysis workflows to identify and measure bias drift
what is the difference between SageMaker Clarify and SageMaker Model Monitor?
Model Monitor is a service that performs recurring monitoring on data captured from an endpoint or batch transform job. It runs Clarify jobs for bias and feature attribution monitoring. You can also use Clarify independently to run one-off jobs that do not use Model Monitor.