A comprehensive set of vocabulary flashcards covering key concepts, services, techniques, and best practices from the AWS AI & Machine Learning lecture.
AWS Cloud Adoption Framework for AI (CAF-AI)
A structured roadmap that outlines best practices and organizational capabilities for accelerating AI and ML adoption and generating business value.
Envision Stage (CAF-AI)
Initial phase focused on identifying and prioritizing AI opportunities that align with desired business outcomes.
Align Stage (CAF-AI)
Phase for building stakeholder buy-in, identifying dependencies, addressing concerns, and creating readiness strategies.
Launch Stage (CAF-AI)
Hands-on phase where pilot projects and proofs of concept (POCs) are delivered into production to demonstrate value.
Scale Stage (CAF-AI)
Expands successful pilots to broader, sustained business adoption and value creation.
Business Perspective (CAF-AI)
Ensures AI investments deliver measurable business value through alignment with outcomes, portfolio management, and innovation.
People Perspective (CAF-AI)
Addresses the human element: building AI fluency, attracting talent, and fostering an AI-first culture.
Governance Perspective (CAF-AI)
Manages AI initiatives to maximize benefits and minimize risks, emphasizing Responsible AI principles.
Platform Perspective (CAF-AI)
Covers the technical foundation for AI workloads, including scalable platforms, modernization, and MLOps.
Security Perspective (CAF-AI)
Protects data and AI workloads by ensuring confidentiality, integrity, availability, and addressing AI-specific attack vectors.
Operations Perspective (CAF-AI)
Ensures AI services run reliably, with monitoring, performance management, and continuous value delivery.
Artificial Intelligence (AI)
Broad field of computer science focused on creating machines that can sense, reason, act, and adapt like humans.
Machine Learning (ML)
Subset of AI that enables computers to learn patterns from data without explicit programming.
Deep Learning (DL)
Specialized subfield of ML that uses multi-layered neural networks to perform complex tasks such as image or speech recognition.
Generative AI
Class of AI systems capable of creating new, original content such as text, images, audio, or video.
Training (ML)
Phase where a model learns patterns by processing a large, high-quality dataset (labeled, in the case of supervised learning).
Inference (ML)
Phase where a trained model makes predictions on new, unseen data.
Supervised Learning
ML approach where models learn from labeled data to predict continuous values (regression) or categories (classification).
Regression
Supervised learning task that predicts a continuous numerical value.
Classification
Supervised learning task that predicts discrete categories or classes.
Unsupervised Learning
ML approach using unlabeled data to discover hidden patterns, structures, or relationships.
Clustering
Unsupervised technique that groups similar data points together.
Association
Unsupervised technique that discovers relationships or co-occurrences in data.
Reinforcement Learning (RL)
Learning method where an agent interacts with an environment and learns through rewards and penalties to maximize cumulative reward.
Self-Supervised Learning
Technique where a model generates its own labels from input data, bridging supervised and unsupervised learning and powering modern foundation models.
Bias (ML)
Error from overly simplistic assumptions; high bias leads to underfitting.
Variance (ML)
Error from excessive model sensitivity to training data; high variance leads to overfitting.
Hyperparameters
Configuration settings chosen before training that control how a model learns (e.g., learning rate, batch size).
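A minimal sketch of how hyperparameters are set before training, here using the SageMaker Python SDK; the container image URI, IAM role, and S3 path are hypothetical placeholders.

```python
# Hyperparameters are fixed before training starts and control how the model learns.
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="<training-image-uri>",    # hypothetical training container
    role="<execution-role-arn>",         # hypothetical IAM execution role
    instance_count=1,
    instance_type="ml.m5.xlarge",
    hyperparameters={                    # chosen ahead of training, not learned from data
        "learning_rate": "0.1",
        "batch_size": "256",
        "epochs": "20",
    },
)
# estimator.fit({"train": "s3://<bucket>/train/"})  # hypothetical dataset location
```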
RLHF (Reinforcement Learning from Human Feedback)
Technique for fine-tuning language models to align outputs with human preferences for helpfulness and safety.
Machine Learning Lifecycle
Series of phases: problem formulation, data collection/preparation, feature engineering, training, evaluation, deployment, and monitoring.
Structured Data
Data with a predefined schema (rows & columns) that is easy to query, e.g., relational tables.
Unstructured Data
Data without a predefined model or structure, such as free-form text, images, or audio.
Semi-Structured Data
Data that lacks a rigid table structure but uses tags or keys (e.g., JSON, XML) to impose hierarchy.
Time-Series Data
Data recorded sequentially over time with timestamps as a critical element, used for forecasting.
Real-Time (Online) Inference
Immediate, low-latency predictions on individual data points as they arrive.
Batch (Offline) Inference
Processing large collections of data at once when low latency is not required.
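A minimal sketch of real-time (online) inference against an already-deployed SageMaker endpoint; the endpoint name and CSV payload format are assumptions that depend on how the model was deployed.

```python
# Score a single record immediately (low-latency, online inference).
import boto3

runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="my-endpoint",      # hypothetical endpoint name
    ContentType="text/csv",
    Body="34,0,2,1",                 # one record, scored as it arrives
)
print(response["Body"].read().decode())
```

Batch (offline) inference would instead point a batch transform job at a whole S3 prefix of records and write results back to S3.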
Amazon Bedrock
Fully managed service that provides access to multiple high-performing foundation models to build and scale generative AI applications.
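A minimal sketch of invoking a foundation model through Bedrock; the model ID is only an example, and which models are available depends on your account and Region.

```python
# Send a single user message to a Bedrock foundation model via the Converse API.
import boto3

bedrock = boto3.client("bedrock-runtime")
response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",   # example model ID; verify in your Region
    messages=[{"role": "user", "content": [{"text": "Summarize the CAF-AI perspectives."}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```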
Foundation Model (FM)
Very large, pre-trained AI model that can be adapted to a wide variety of downstream tasks.
Fine-Tuning (FMs)
Further training a base foundation model on a smaller, labeled dataset to update its internal weights for specialized behavior.
Retrieval-Augmented Generation (RAG)
Technique that enriches prompts with real-time information from an external knowledge base without changing model weights.
Knowledge Bases for Amazon Bedrock
Bedrock feature that provides built-in RAG to connect FMs to enterprise data sources.
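A minimal RAG sketch using a Bedrock knowledge base, assuming a knowledge base and a chosen model already exist; the IDs and ARN below are hypothetical placeholders.

```python
# Retrieve relevant documents and generate a grounded answer in one call.
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")
response = agent_runtime.retrieve_and_generate(
    input={"text": "What is our parental leave policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "<kb-id>",            # hypothetical knowledge base ID
            "modelArn": "<foundation-model-arn>",    # hypothetical model ARN
        },
    },
)
print(response["output"]["text"])  # answer grounded in the retrieved documents
```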
Guardrails for Amazon Bedrock
Configuration layer that enforces responsible AI policies by filtering content, denying topics, and redacting PII.
Hallucination (AI)
Phenomenon where a model confidently produces plausible-sounding but nonsensical or factually incorrect outputs.
Embedding
Numerical vector representation capturing the meaning and context of text, enabling semantic search.
Vector Database
Database optimized to store and search embeddings efficiently (e.g., Amazon OpenSearch Service vector engine).
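A minimal sketch of generating embeddings and comparing them, the core operation behind semantic search in a vector database. The Titan embeddings model ID is an example; availability varies by Region.

```python
# Embed two texts and measure how semantically close they are with cosine similarity.
import json
import math
import boto3

bedrock = boto3.client("bedrock-runtime")

def embed(text):
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",          # example embeddings model
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Similar meaning despite different wording should score close to 1.0.
print(cosine_similarity(embed("return policy"), embed("how do I send an item back?")))
```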
Agents for Amazon Bedrock
Managed feature that enables generative AI applications to execute multi-step tasks by calling APIs or AWS Lambda functions.
Prompt Engineering
Designing and refining input prompts to obtain accurate, relevant, and useful outputs from a foundation model.
Zero-Shot Prompting
Asking a model to perform a task without providing examples in the prompt.
Few-Shot Prompting
Including a few examples within the prompt to demonstrate the desired task, improving results.
Chain-of-Thought (CoT) Prompting
Technique that encourages the model to reason step-by-step before answering, enhancing logical tasks.
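A short illustration of the three prompting styles above, written as plain Python strings; the review and arithmetic examples are invented for illustration and are not tied to any particular model.

```python
# Zero-shot: the task is stated with no examples.
zero_shot = (
    "Classify the sentiment of this review as positive or negative: "
    "'The battery died in a day.'"
)

# Few-shot: a couple of worked examples demonstrate the expected format.
few_shot = (
    "Review: 'Arrived broken.' Sentiment: negative\n"
    "Review: 'Works perfectly, great value.' Sentiment: positive\n"
    "Review: 'The battery died in a day.' Sentiment:"
)

# Chain-of-thought: the model is asked to reason step by step before answering.
chain_of_thought = (
    "A store sold 12 laptops on Monday and twice as many on Tuesday. "
    "How many laptops were sold in total? Think step by step before giving the final answer."
)
```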
Prompt Template
Reusable prompt structure containing placeholders for dynamic content, enabling scalable AI applications.
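A minimal sketch of a reusable prompt template whose placeholders are filled at request time; the product name and question are hypothetical.

```python
# One template, many prompts: only the placeholder values change per request.
TEMPLATE = (
    "You are a support assistant for {product}.\n"
    "Answer the customer question below in a {tone} tone.\n"
    "Question: {question}"
)

prompt = TEMPLATE.format(
    product="Acme Router X200",      # hypothetical product
    tone="friendly",
    question="How do I reset the device?",
)
```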
Amazon Q
Generative AI-powered assistant for work, built with enterprise-grade security and privacy.
Amazon Q Business
Variant of Amazon Q that connects securely to enterprise data sources to answer questions, summarize, and generate content for business users.
Amazon Q Apps
No-code experience within Q Business that lets non-technical users build custom generative AI tools.
Amazon Q Developer
Assistant for developers that accelerates the software development lifecycle with code generation, troubleshooting, and AWS guidance.
PartyRock
Free, no-code generative AI playground powered by Bedrock for learning and experimentation (not production-grade).
Amazon Comprehend
NLP service that extracts entities, sentiment, key phrases, and PII from text.
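A minimal sketch of Comprehend calls for sentiment and entity detection; the sample sentence is invented for illustration.

```python
# Detect overall sentiment and named entities in a short piece of text.
import boto3

comprehend = boto3.client("comprehend")
text = "Jane Doe visited Seattle last week and loved the new office."

sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
entities = comprehend.detect_entities(Text=text, LanguageCode="en")

print(sentiment["Sentiment"])                       # e.g. POSITIVE
print([e["Text"] for e in entities["Entities"]])    # e.g. ['Jane Doe', 'Seattle', 'last week']
```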
Amazon Translate
Neural machine translation service converting text between languages.
Amazon Transcribe
Automatic speech recognition service that converts speech to text and identifies multiple speakers.
Amazon Polly
Text-to-speech service that turns text into lifelike speech.
Amazon Rekognition
Computer vision service that analyzes images and videos to detect objects, faces, text, and activities.
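A minimal sketch of image label detection with Rekognition; the S3 bucket and object key are hypothetical placeholders.

```python
# List the objects and scenes detected in an image stored in S3.
import boto3

rekognition = boto3.client("rekognition")
response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "<my-bucket>", "Name": "photos/warehouse.jpg"}},
    MaxLabels=10,
    MinConfidence=80,
)
for label in response["Labels"]:
    print(label["Name"], round(label["Confidence"], 1))
```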
Amazon Lex
Conversational AI service for building chatbots and voice bots using the same technology as Amazon Alexa.
Amazon Personalize
Service providing real-time personalized recommendations using Amazon.com’s recommendation technology.
Amazon Textract
Intelligent document processing service that extracts text, handwriting, and structured data from scanned documents.
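A minimal sketch of synchronous document analysis with Textract; the S3 location is a hypothetical placeholder, and the synchronous AnalyzeDocument call is suited to single-page documents.

```python
# Extract text lines (plus form and table structure) from a scanned document in S3.
import boto3

textract = boto3.client("textract")
response = textract.analyze_document(
    Document={"S3Object": {"Bucket": "<my-bucket>", "Name": "invoices/inv-001.png"}},
    FeatureTypes=["FORMS", "TABLES"],
)
lines = [b["Text"] for b in response["Blocks"] if b["BlockType"] == "LINE"]
print(lines[:5])
```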
Amazon Kendra
Intelligent enterprise search service that provides accurate answers to natural language queries across internal documents.
Amazon Mechanical Turk (MTurk)
Crowdsourcing marketplace for on-demand human workers, commonly used for data labeling tasks.
Amazon Augmented AI (A2I)
Managed service that orchestrates human reviews of low-confidence ML predictions via customizable workflows.
AWS Trainium
Custom AWS chip optimized for cost-effective training of deep learning models.
AWS Inferentia
Custom AWS chip optimized for high-performance, low-latency model inference.
Amazon SageMaker
Fully managed ML service providing tools to prepare data, build, train, deploy, and monitor models across the entire ML lifecycle.
SageMaker Data Wrangler
Visual, low-code tool within SageMaker for data preparation and feature engineering.
SageMaker Feature Store
Central repository for storing, sharing, and reusing ML features.
SageMaker Studio Notebooks
Managed Jupyter notebook environment integrated with SageMaker resources.
SageMaker Ground Truth
Data labeling service combining automated tools and human workforces to create labeled datasets.
SageMaker Model Cards
Documentation that captures model purpose, performance, and responsible AI details (a model 'nutrition label').
SageMaker Model Dashboard
Central interface for monitoring deployed models’ performance, drift, and responsible AI metrics.
MLOps
Set of practices applying DevOps principles to ML workflows to automate and streamline the end-to-end lifecycle.
Amazon SageMaker Pipelines
CI/CD service for building, automating, and managing ML workflows as part of MLOps.
Responsible AI (AWS)
Framework ensuring AI systems are developed and used safely, ethically, and legally across fairness, explainability, robustness, privacy, governance, and transparency.
Bias in AI
Systematic error resulting from biased training data, leading models to perpetuate unfair outcomes.
Toxicity (GenAI)
Potential of generative models to produce harmful or offensive content.
Prompt Injection
Attack where malicious instructions are inserted into prompts to manipulate or hijack an AI system’s behavior.
Poisoning Attack
Security threat where training data is corrupted to manipulate a model’s future behavior.
AWS Identity and Access Management (IAM)
Service for securely controlling access to AWS resources using users, groups, roles, and policies.
Amazon S3
Highly durable object storage service that serves as the foundational data store for ML datasets, artifacts, and knowledge bases.
Amazon EC2
Scalable compute service; SageMaker uses EC2 instances for training and hosting models.
AWS Lambda
Serverless, event-driven compute service often used to glue AI services together via triggers and integrations.
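A minimal sketch of Lambda as glue between services: a handler triggered by an S3 upload that sends the object's text to Comprehend. The event shape is the standard S3 notification; error handling and pagination are omitted, and the text is truncated to stay within Comprehend's size limits.

```python
# S3 upload event -> read the object -> run sentiment analysis on its contents.
import boto3

s3 = boto3.client("s3")
comprehend = boto3.client("comprehend")

def lambda_handler(event, context):
    record = event["Records"][0]["s3"]
    body = s3.get_object(
        Bucket=record["bucket"]["name"],
        Key=record["object"]["key"],
    )["Body"].read()
    result = comprehend.detect_sentiment(Text=body.decode("utf-8")[:4500], LanguageCode="en")
    return {"sentiment": result["Sentiment"]}
```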
Amazon Macie
ML-powered service that discovers and classifies sensitive data such as PII in S3 buckets.
AWS CloudTrail
Service that records all API calls made in an AWS account for auditing and governance.
AWS Artifact
Self-service portal providing on-demand access to AWS compliance reports and certifications.
AWS Trusted Advisor
Tool that delivers real-time best-practice recommendations across cost, performance, security, fault tolerance, and service limits.
Root Mean Square Error (RMSE)
Regression metric measuring the square root of the average squared differences between predicted and actual values.
Accuracy (ML)
Classification metric measuring the proportion of correct predictions over total predictions.
Precision
Classification metric indicating the proportion of true positives among all positive predictions.
Recall
Classification metric indicating the proportion of true positives captured among all actual positives.
F1-Score
Harmonic mean of precision and recall, balancing both metrics for classification tasks.
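A worked numeric sketch of the evaluation metrics above (including RMSE from the regression card), using a tiny made-up set of predictions.

```python
# Classification metrics on six made-up predictions (1 = positive class).
actual    = [1, 0, 1, 1, 0, 1]
predicted = [1, 0, 0, 1, 1, 1]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)   # 3 true positives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)   # 1 false positive
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)   # 1 false negative

accuracy  = sum(a == p for a, p in zip(actual, predicted)) / len(actual)   # 4/6 ≈ 0.67
precision = tp / (tp + fp)                                                  # 3/4 = 0.75
recall    = tp / (tp + fn)                                                  # 3/4 = 0.75
f1        = 2 * precision * recall / (precision + recall)                   # 0.75

# RMSE on a made-up regression example.
y_true = [3.0, 5.0, 2.0]
y_pred = [2.5, 5.5, 2.0]
rmse = (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5   # ≈ 0.41
```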
Automatic Evaluation
A fast, scalable evaluation method in Bedrock that uses algorithms to score a model's performance. It is best used for objective, fact-based tasks like question-answering and summarization.
Human Evaluation
An evaluation method where people review a model's outputs to judge subjective qualities that algorithms can't easily measure, such as creativity, helpfulness, brand alignment, or style.
Custom Dataset
A private dataset that you create using your own data. It contains prompts that reflect your specific, real-world use case and is the most accurate way to predict how a model will perform for your specific application.
Benchmark Dataset
A standardized, public dataset used as a common yardstick to measure and compare the general capabilities of different models on a well-defined task (e.g., testing for toxicity or factual accuracy).