Foundation Model
a large, pre-trained AI model trained on massive datasets, offering broad capabilities for text, image, or multimodal tasks
Large Language Model (LLM)
AI designed to generate coherent human-like text
Non-deterministic
generated text may be different for every user that uses the same prompt
Amazon Bedrock
fully managed AWS service that makes it easy to build & scale generative AI apps by providing access to Foundation Models (FMs)
None of your data is used to train the FM
Amazon Titan
High-performing FM from AWS
Amazon Bedrock Fine-Tuning
Adapt a copy of an FM with your own data
Training Data:
Must adhere to a specific format
Must be stored in S3
To use a fine-tuned model, you must provision throughput by purchasing capacity
Not all models can be fine-tuned
Instruction-based fine-tuning
The model is trained to follow explicit instructions by aligning outputs with human-provided examples of desired behavior
Continued Pre-training
The model is further trained on domain-specific or custom data to adapt its knowledge base beyond the original pre-training
Single-turn Messaging
The model is optimized to handle one-off prompts where each query & response are independent of prior context
Multi-turn messaging
The model is fine-tuned to maintain context across a conversation, enabling coherent back-and-forth interactions over multiple exchanges
Which of the fine-tuning options is usually cheaper?
Instruction-based fine-tuning because computations are less intense & the amount of data required is usually less
Transfer Learning
the broader concept of reusing a pre-trained model to adapt it to a new related task
For evaluating a model, what do some benchmark datasets allow you to do?
very quickly detect any kind of bias & potential discrimination against a group of people
Automated Metrics to Evaluate an FM
ROUGE; Recall-Oriented Understudy for Gisting Evaluation
BLEU; Bilingual Evaluation Understudy
BERTScore
Perplexity (how well the model predicts the next token (lower is better))
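To make the perplexity card concrete, here is a minimal sketch that computes it from the probabilities a model assigned to each actual next token. The function name and the sample probabilities are made up for illustration.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Hypothetical probabilities assigned to the correct next token at each step.
confident = perplexity([0.9, 0.8, 0.95])  # model was rarely surprised
unsure = perplexity([0.2, 0.1, 0.25])     # model was often surprised
```

The confident model gets the lower score, which matches "lower is better".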
Business Metrics to Evaluate a Model on
User Satisfaction
Average Revenue Per User (ARPU)
Cross-Domain Performance (model’s ability to perform tasks across different domains)
Conversion Rate
Efficiency
Retrieval-Augmented Generation (RAG)
Allows an FM to reference a data source outside of its training data
RAG Use Cases
Customer Service Chatbot
Legal Research & Analysis
Healthcare Question-Answering
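A toy sketch of the RAG flow: retrieve the most relevant document from an external source, then place it in the prompt. The documents, the function names, and the word-overlap scoring are all assumptions for illustration. A real system would typically retrieve with embeddings from a vector database.

```python
# Made-up documents standing in for an external data source.
knowledge_base = [
    "Refunds are issued within 14 days of purchase.",
    "Standard shipping takes 3 to 5 business days.",
    "Passwords can be reset from the account settings page.",
]

def retrieve(question, documents, top_n=1):
    """Rank documents by shared words with the question (a stand-in for vector search)."""
    question_words = set(question.lower().split())

    def overlap(doc):
        return len(question_words & set(doc.lower().rstrip(".").split()))

    return sorted(documents, key=overlap, reverse=True)[:top_n]

def build_augmented_prompt(question, documents):
    """Put the retrieved text into the prompt so the model can reference it."""
    context = "\n".join(retrieve(question, documents))
    return f"Use the context to answer.\nContext:\n{context}\nQuestion: {question}\nAnswer:"

prompt = build_augmented_prompt("How long does shipping take?", knowledge_base)
```

The model never has to be retrained: only the prompt changes when the data source changes.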
Tokenization
converting raw text into a sequence of tokens
Context Window
The number of tokens an LLM can consider when generating text
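A rough sketch of tokenization and a context window working together. The whitespace tokenizer is a simplifying assumption (real LLM tokenizers split text into subword units), and both function names are hypothetical.

```python
def tokenize(text):
    """Naive whitespace tokenizer; real tokenizers produce subword tokens."""
    return text.split()

def fit_to_context_window(tokens, window_size):
    """Keep only the most recent tokens that fit in the window."""
    return tokens[-window_size:] if window_size > 0 else []

tokens = tokenize("the quick brown fox jumps over the lazy dog")
visible = fit_to_context_window(tokens, 4)  # the model would only "see" these tokens
```

Anything that falls outside the window is simply not available to the model when it generates text.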
Embeddings
create vectors (array of numerical values) out of text, images, or audio
Since vectors have a high dimensionality, what can they do?
they can capture many features for one token, such as semantic meaning, syntactic role, sentiment
What can embedding models power?
search applications
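A small sketch of how embeddings can power search: each document is a vector, and the closest vector to the query wins. The 3-dimensional vectors here are invented by hand. A real embedding model would produce them, usually with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two vectors by angle; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, hand-written for illustration only.
documents = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "password reset": [0.0, 0.2, 0.9],
}

def search(query_vector, docs, top_n=1):
    """Return the names of the documents most similar to the query vector."""
    ranked = sorted(
        docs,
        key=lambda name: cosine_similarity(query_vector, docs[name]),
        reverse=True,
    )
    return ranked[:top_n]

best = search([0.8, 0.2, 0.1], documents)
```

The query vector points in nearly the same direction as "refund policy", so that document ranks first.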
Bedrock-Agent
Manage & carry out various multi-step tasks related to infrastructure provisioning, application deployment, & operational activities
Model Improvement Techniques Cost Order (most cost effective to least)
1.) Prompt Engineering
2.) RAG
3.) Instruction-based Fine-tuning
4.) Domain Adaptation Fine-tuning
What can Amazon OpenSearch Serverless help store?
embeddings within vector databases
Prompt Engineering
developing, designing, & optimizing prompts to enhance the output
Improved prompting techniques consist of:
Instructions
Context
Input data
Output Indicator
Negative Prompting
explicitly instruct the model on what not to include or do in its response
Temperature (0 to 1)
creativity of the model’s output
Low (ex: .2); more conservative, repetitive responses
High (ex: 1); more diverse, creative, and possibly less coherent responses
Top P (0 to 1)
Limits word choice by cumulative probability
Low P (ex: .25); considers only the most likely words whose combined probability reaches 25%, more coherent responses
High P (ex: .99); considers a broad range of possible words, more creative & diverse responses
Top K
Limits the number of candidate words considered
Low K (ex: 10); only the 10 most likely words are considered, more coherent responses
High K (ex: 500); up to 500 candidate words are considered, more diverse & creative responses
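The three sampling settings can be sketched as filters over a next-word probability distribution. This is a simplified illustration, not how any particular service implements them. The logits are made-up scores, the function names are hypothetical, and temperature must be greater than 0 in this sketch.

```python
import math

def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature; lower values sharpen the distribution."""
    scaled = [value / temperature for value in logits.values()]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return {token: e / total for token, e in zip(logits, exps)}

def top_k_filter(probs, k):
    """Keep only the k most likely tokens."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])

def top_p_filter(probs, p):
    """Keep the most likely tokens until their cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    return kept

logits = {"cat": 2.0, "dog": 1.5, "car": 0.5, "sky": -1.0}  # made-up scores
cold = apply_temperature(logits, 0.2)  # conservative: "cat" dominates
hot = apply_temperature(logits, 1.0)   # more spread out
```

With a low temperature the top word takes almost all the probability, while low Top K or Top P values shrink the set of words the model may pick from.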
Stop Sequences
Tokens that signal the model to stop generating output
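A minimal sketch of the effect of stop sequences, applied here to already-generated text. In a real service the model stops generating when it hits one. The function name and sample strings are assumptions.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut the text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]

raw = "Answer: Paris.\nHuman: and Germany?"
clean = truncate_at_stop(raw, ["\nHuman:", "###"])
```

Here the output ends before the model starts writing the next speaker's turn.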
What isn’t latency impacted by?
Top P, Top K, or Temperature
Zero-Shot Prompting
Present a task to the model without providing examples or explicit training for that specific task
Few-Shot Prompting
Provide examples of a task to the model to guide its output
Chain of Thought Prompting
Divide the task into a sequence of reasoning steps, leading to more structure & coherence (first, then, next)
Prompt Templates
predefined structures/patterns to format inputs for AI models. They help standardize prompts so the model produces more reliable & repeatable outputs
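One possible prompt template, sketched with Python's standard `string.Template`. The layout, field names, and `build_prompt` helper are assumptions for illustration. Passing no examples gives a zero-shot prompt, passing a few gives a few-shot prompt.

```python
from string import Template

# Hypothetical template: instructions, context, optional examples, input data, output indicator.
PROMPT_TEMPLATE = Template(
    "Instructions: $instructions\n"
    "Context: $context\n"
    "${examples}"
    "Input: $input_data\n"
    "Output ($output_indicator):"
)

def build_prompt(instructions, context, input_data, output_indicator, examples=()):
    """Fill the template; examples is a sequence of (input, output) pairs."""
    example_text = "".join(f"Input: {q}\nOutput: {a}\n" for q, a in examples)
    return PROMPT_TEMPLATE.substitute(
        instructions=instructions,
        context=context,
        examples=example_text,
        input_data=input_data,
        output_indicator=output_indicator,
    )

zero_shot = build_prompt(
    "Classify the sentiment.", "Product reviews.",
    "Great battery life", "positive or negative",
)
few_shot = build_prompt(
    "Classify the sentiment.", "Product reviews.",
    "Great battery life", "positive or negative",
    examples=[("Broke after a day", "negative"), ("Love it", "positive")],
)
```

Because every prompt goes through the same structure, outputs tend to be easier to compare and repeat.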
“Ignoring the prompt template” attack
users enter malicious inputs to hijack the prompt & provide info on a prohibited or harmful topic
Amazon Q Business
Fully managed Gen-AI assistant for your employees. Based on your company’s knowledge & data. Built on Amazon Bedrock
Amazon Q Apps
Create Gen-AI powered apps without coding by using natural language
Amazon Q Developer
Answer questions about the AWS documentation & AWS service selection
Answer questions about resources in your AWS account
AI Code Companion to help you code new apps
Amazon SageMaker AI
Fully managed service for developers/data scientists to build ML models
SageMaker Automatic Model Tuning (AMT)
a managed service that finds the best hyperparameters for your ML model by automatically running multiple training jobs with different configurations
SageMaker Deployments; Real-time
One prediction at a time; Configure CPU & GPU
SageMaker Deployments; Serverless
Configure RAM; Idle period between traffic spikes; cold starts
SageMaker Deployments; Asynchronous
For Large payload sizes up to 1 GB
Long processing times
Near-real time latency requirements
Request & responses are in S3
SageMaker Deployments; Batch
Prediction for an entire data set (multiple predictions)
Request & responses are in S3
SageMaker Model Deployment Use Cases; Real-time
Fast, near-instant predictions for web/mobile apps
SageMaker Model Deployment Use Cases; Serverless
sporadic, short-term inference without infrastructure, can tolerate cold starts
SageMaker Model Deployment Use Cases; Asynchronous
Large payloads & workloads requiring longer processing times
SageMaker Model Deployment Use Cases; Batch
Bulk processing for large datasets. Concurrent processing
SageMaker Studio
End-to-end ML development from a unified interface
SageMaker Data Wrangler
Prepare tabular data & image data for ML (transform)
ML Features
inputs to ML models used during training & used for inference
Feature Engineering
process of transforming raw data into meaningful input features; e.g. birthday to age
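The birthday-to-age example can be sketched in a few lines. The record and the function name are made up for illustration.

```python
from datetime import date

def age_from_birthday(birthday, today):
    """Whole years between birthday and today."""
    had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
    return today.year - birthday.year - (0 if had_birthday else 1)

raw_record = {"name": "example user", "birthday": date(1990, 6, 15)}  # made-up record
features = {"age": age_from_birthday(raw_record["birthday"], date(2024, 6, 14))}
```

The raw date is hard for a model to use directly, while the derived age is a plain number it can learn from.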
SageMaker Feature Store
a managed repository within AWS for storing, sharing, & serving features
SageMaker Clarify
AWS service that helps data scientists detect bias & understand the explainability of their ML models
RLHF
Reinforcement Learning from Human Feedback
SageMaker GroundTruth
help create high-quality training datasets; human-in-the-loop
SageMaker Model Cards
Essential model info
SageMaker Model Dashboard
Centralized repo; Information & insights for all models
SageMaker Role Manager
Define roles for people
SageMaker Model Monitor
monitor the quality of your model; alerts for deviations
SageMaker JumpStart
ML hub that provides pre-trained models, built in algorithms, & end-to-end solutions
is for developers needing deep customization & control over a model
SageMaker Canvas
Build ML models using a visual interface (no coding required)
MLflow
open-source tool which helps ML teams manage the entire ML lifecycle
SageMaker Network Isolation mode
Run SageMaker job containers without any outbound internet access
SageMaker DeepAR
Used to forecast time series data
Deep Learning
subset of ML
uses neurons & synapses (like our brain) to train a model
GPT (Generative Pre-trained Transformer)
generate human text or computer code based on input prompts
BERT (Bidirectional Encoder Representations from Transformers)
similar in intent to GPT, but reads the text in both directions
RNN (Recurrent Neural Network)
meant for sequential data such as time-series or text, useful in speech recognition, time series prediction
ResNet (Residual Network)
Deep Convolutional Neural Network (CNN) used for image recognition tasks, object detection, facial recognition
SVM (Support Vector Machine)
ML algorithm for classification & regression
WaveNet
model to generate raw audio waveform used in speech synthesis
GAN (Generative Adversarial Network)
models used to generate synthetic data such as images, videos, or sounds that resemble the training data. Helpful for data augmentation
XGBoost (Extreme Gradient Boosting)
an implementation of gradient boosting
RLHF Steps
Data collection
Supervised fine-tuning of a language model
Build a separate reward model
Optimize the language model with the reward-based model
Overfitting
Performs well on the training data, but doesn’t perform well on evaluation data (high variance)
Underfitting
Model performs poorly on training data; the model could be too simple or there could be poor data features (high bias)
Bias
Difference or error between predicted and actual values
Variance
How much the performance of a model changes if trained on a different dataset which has a similar distribution (data is all over the place if the variance is high)
Binary Classification Evaluation Metrics
Precision, Recall, F1, & Accuracy
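A short sketch of how these four metrics come out of the counts of true/false positives and negatives. The labels are made-up sample data (1 = positive, 0 = negative) and the function name is hypothetical.

```python
def binary_metrics(actual, predicted):
    """Compute precision, recall, F1, and accuracy from 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 1, 0, 1]
metrics = binary_metrics(actual, predicted)
```

Precision asks "of the predicted positives, how many were right?", recall asks "of the actual positives, how many were found?", and F1 balances the two.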
AUC-ROC
plots the true positive rate against the false positive rate at every classification threshold (binary classification); the area under the curve (AUC) ranges from 0 to 1, higher is better
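A sketch of how an ROC curve and its area can be computed by hand: sweep the decision threshold over the model's scores, record the false positive rate and true positive rate at each step, then measure the area with the trapezoidal rule. The labels, scores, and function names are made up for illustration.

```python
def roc_points(actual, scores):
    """(false positive rate, true positive rate) per threshold, strictest first."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(points, points[1:]))

actual = [1, 1, 0, 1, 0, 0]                  # made-up labels
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]      # made-up model scores
curve = roc_points(actual, scores)
area = auc(curve)
```

An area of 0.5 corresponds to random guessing and 1.0 to a perfect ranking of positives above negatives.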
Regression Evaluation Metrics
Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Root Mean Squared Error (RMSE), R Squared
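The four regression metrics, sketched from their standard definitions on made-up values. The function name is hypothetical, and MAPE is undefined when any actual value is 0.

```python
import math

def regression_metrics(actual, predicted):
    """Compute MAE, MAPE (in percent), RMSE, and R squared."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n  # needs nonzero actuals
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                      # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)     # total variation
    r2 = 1 - ss_res / ss_tot
    return {"mae": mae, "mape": mape, "rmse": rmse, "r2": r2}

actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 380.0]
scores = regression_metrics(actual, predicted)
```

MAE, MAPE, and RMSE measure error (lower is better), while R squared measures how much of the variation the model explains (closer to 1 is better).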
ML Project Phases
1.) Business Goal Identification
2.) ML Problem Framing
3.) Data Processing
4.) Model Development
5.) Model Deployment
6.) Model Monitoring
Hyperparameter tuning
the process of finding the best settings (hyperparameters) for a ML model before training, to maximize its performance & accuracy
Important Hyperparameters
Learning rate, Batch size, Number of Epochs, Regularization
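A toy grid search over two of these hyperparameters (learning rate and number of epochs). The data, the one-weight model, and the search space are all assumptions chosen to keep the sketch small. Managed tuning services typically explore the space with smarter strategies than trying every combination.

```python
from itertools import product

train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # made-up data following y = 2x
validation = [(4.0, 8.0), (5.0, 10.0)]

def train_model(learning_rate, epochs):
    """Fit y = w * x with plain gradient descent on mean squared error."""
    w = 0.0
    for _ in range(epochs):
        gradient = sum(2 * (w * x - y) * x for x, y in train) / len(train)
        w -= learning_rate * gradient
    return w

def mse(w, data):
    """Mean squared error of the one-weight model on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

search_space = {"learning_rate": [0.001, 0.01, 0.1], "epochs": [5, 50]}
results = []
for learning_rate, epochs in product(search_space["learning_rate"], search_space["epochs"]):
    w = train_model(learning_rate, epochs)
    results.append((mse(w, validation), learning_rate, epochs))

# The configuration with the lowest validation loss wins.
best_loss, best_lr, best_epochs = min(results)
```

Each combination is one training job, and the winner is chosen on validation data rather than training data.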
Amazon Comprehend
For Natural Language Processing (NLP); extract insights; sentiment analysis; Named Entity Recognition (NER)
Amazon Transcribe
convert speech to text
Amazon Polly
Turn text into lifelike speech using deep learning
Amazon Rekognition
Find objects, people, text, scenes in images & videos; Facial Analysis and Facial Search
Amazon Lex
Same tech that powers Alexa
ASR to convert speech to text
Natural Language Understanding
Helps build chatbots, call center bots
Amazon Personalize
Fully managed ML-service to build apps with real-time personalized recommendations
Amazon Textract
Automatically extracts text, handwriting, & data from scanned documents using AI & ML
Amazon Kendra
Fully managed ML document search service
Amazon Comprehend Medical
Uses NLP to detect PHI in a document
Amazon Polly Lexicons
Define how to read certain specific pieces of text; AWS → Amazon Web Services
Amazon Polly Speech Synthesis Markup Language (SSML)
Markup for your text to indicate how you pronounce it
Amazon Mechanical Turk
Crowdsourcing marketplace to perform simple human tasks; Distributed virtual workforce