Chunking
The process of breaking large documents or datasets into smaller, manageable pieces (chunks) before feeding them into a model or retrieval system. Critical in RAG (Retrieval-Augmented Generation) pipelines to improve the relevance of retrieved context passed to a model.
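A minimal sketch of fixed-size chunking with overlap (chunk sizes and the overlap value are illustrative choices, not a standard): overlapping windows reduce the chance that a relevant fact is split across a chunk boundary and lost at retrieval time.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    Each chunk starts (chunk_size - overlap) characters after the
    previous one, so adjacent chunks share `overlap` characters.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Production pipelines usually chunk on semantic boundaries (sentences, paragraphs, headings) rather than raw character counts, but the overlap idea carries over.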
Unimodal Model
A model that processes only one type of data modality (e.g., text only). Standard LLMs are unimodal.
Multimodal Model
A model capable of processing and generating content across multiple data types simultaneously — for example, text, images, and audio. Multimodal LLMs can generate image captions, create product designs from text descriptions, or analyze customer queries that include multimedia content.
Prompt Engineering
The practice of developing, designing, and optimizing the text instructions (prompts) given to a foundation model to guide its outputs toward desired results. It is the fastest and lowest-cost method to optimize FM behavior — no model retraining required.
Zero-Shot Prompting
A prompting technique where the model is given a task instruction with no examples. The model relies entirely on its pre-trained knowledge to produce the output.
Few-Shot Prompting
A prompting technique where one or more input-output examples are included in the prompt to demonstrate the desired pattern or behavior before asking the model to complete the task on new input.
Chain-of-Thought Prompting
A prompting technique that instructs the model to reason through a problem step-by-step before producing a final answer, improving performance on complex reasoning tasks.
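The three prompting styles above differ only in what text the prompt contains. A minimal sketch (the task, examples, and wording are invented for illustration, not a library API):

```python
def build_prompt(task: str,
                 examples: list[tuple[str, str]] = (),
                 chain_of_thought: bool = False) -> str:
    """Assemble a prompt string in one of three styles.

    Zero-shot: no examples, just the task.
    Few-shot: one or more input/output demonstrations first.
    Chain-of-thought: an explicit step-by-step reasoning instruction.
    """
    parts = []
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}")
    if chain_of_thought:
        parts.append("Think through the problem step by step before answering.")
    parts.append(f"Input: {task}\nOutput:")
    return "\n\n".join(parts)

zero_shot = build_prompt("Classify the sentiment of: 'Great service!'")
few_shot = build_prompt(
    "Classify the sentiment of: 'Great service!'",
    examples=[("'Terrible food.'", "negative"), ("'Loved it!'", "positive")],
)
cot = build_prompt("If I buy 3 items at $4 each, what is the total?",
                   chain_of_thought=True)
```

Whichever style is used, the assembled string is simply the text sent to the model; no retraining is involved, which is why prompt engineering is the cheapest optimization lever.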
Foundation Model (FM) Lifecycle (7)
The end-to-end framework for developing and deploying a foundation model, comprising seven stages: (1) Data Selection, (2) Model Selection, (3) Pre-training, (4) Fine-tuning/Adaptation, (5) Evaluation, (6) Deployment, and (7) Feedback and Iteration. Distinct from the traditional ML Pipeline in that it centers on adapting pre-trained large models rather than building from scratch.
Retrieval-Augmented Generation (RAG)
Enhances FM outputs at inference time by retrieving relevant information from an external knowledge base and injecting it as context into the prompt. RAG improves factual accuracy without requiring full model fine-tuning. Relies on chunking and embeddings for effective retrieval.
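A toy end-to-end sketch of the retrieve-then-inject flow: here the "embedding" is just a bag-of-words vector and retrieval is cosine similarity, standing in for a real embedding model and vector store.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a term-frequency vector. A real RAG
    pipeline would call a learned embedding model instead."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rag_prompt(question: str, chunks: list[str], top_k: int = 1) -> str:
    """Retrieve the most similar chunks and inject them as context."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

chunks = [
    "Amazon Bedrock offers on-demand and provisioned throughput pricing.",
    "Amazon EFS provides shared file storage for training data.",
]
prompt = rag_prompt("How is Bedrock priced?", chunks)
```

The model never sees the whole knowledge base, only the top-ranked chunks, which is why chunk quality and embedding quality dominate RAG accuracy.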
Intrinsic Interpretability
A form of model transparency that refers to the ability to understand the internal workings and decision-making processes of a model during its operation. Generative AI (especially large neural networks) has very low intrinsic interpretability — its internal logic is largely opaque.
Post-Hoc Interpretability
A complementary approach to model transparency that involves analyzing and explaining a model's outputs after they are generated (e.g., using explanation or attribution tools), rather than understanding the internal mechanics. Also challenging for large generative models.
Cross-Domain Performance (Gen AI Metric)
A performance metric evaluating the versatility of a generative AI model across different domains and task types beyond its primary training domain — important for systems deployed across diverse use cases.
Efficiency (Gen AI Metric)
A performance metric assessing the computational efficiency of a generative AI system, including inference time and resource utilization. Efficient models are essential for practical, scalable, and cost-effective deployments.
Conversion Rate (Gen AI Metric)
A business performance metric measuring the ability of a generative AI system to convert users or drive desired commercial actions, directly impacting business value and ROI.
Average Revenue Per User — ARPU (Gen AI Metric)
A business performance metric quantifying the average revenue generated per user interacting with a generative AI-powered system, providing a direct measure of financial impact in commercial applications.
Customer Lifetime Value — CLV (Gen AI Metric)
A business performance metric estimating the total revenue a customer generates over their entire relationship with a generative AI-powered product or service, evaluating long-term sustainability of the AI solution.
Amazon Bedrock — On-Demand Pricing
The default pricing model for Amazon Bedrock text generation FMs: users are charged for the input tokens processed plus the output tokens generated, with no long-term commitment required. Note: embedding models are charged only on input tokens processed.
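The on-demand formula is simple arithmetic over the two token counts; a sketch with placeholder per-1K-token rates (the rates below are invented, not real Bedrock prices):

```python
def on_demand_cost(input_tokens: int, output_tokens: int,
                   price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Bedrock-style on-demand cost: input and output tokens are
    billed at separate per-1K-token rates."""
    return (input_tokens / 1000) * price_in_per_1k \
         + (output_tokens / 1000) * price_out_per_1k

# 2,000 input tokens and 500 output tokens at hypothetical rates:
# 2 * 0.003 + 0.5 * 0.015 = 0.0135
cost = on_demand_cost(2000, 500, price_in_per_1k=0.003, price_out_per_1k=0.015)

# Embedding models: only input tokens are billed (output rate 0).
embed_cost = on_demand_cost(10_000, 0, price_in_per_1k=0.0001, price_out_per_1k=0.0)
```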
Amazon Bedrock — Provisioned Throughput
An Amazon Bedrock pricing option where users purchase a committed level of model processing capacity (measured in tokens per minute) in advance. Provisioned capacity guarantees predictable performance for time-sensitive, high-throughput workloads, at a higher cost than on-demand.
PartyRock (Amazon Bedrock Playground)
A hands-on, no-code sandbox environment built on Amazon Bedrock that allows users to experiment with and build AI-powered applications using foundation models without writing code. Designed for exploration, learning, and rapid prototyping.
Amazon SageMaker JumpStart
An AWS service that provides a curated set of pre-built ML solutions, pre-trained foundation models, and example notebooks for the most common generative AI and ML use cases. Enables faster experimentation by providing a proven starting point for model development and deployment.
Amazon Q
An AWS generative AI-powered assistant designed for enterprise work contexts. Amazon Q can be tailored to a specific organization's data, systems, and workflows to answer questions, generate content, summarize documents, and complete tasks — drawing on the company's own enterprise data.
Amazon Q Developer
An AWS ML-powered coding assistant that provides code recommendations, generates code, explains existing code, and identifies security vulnerabilities across a variety of programming languages. Accelerates software development workflows through AI-powered code generation.
Amazon Fraud Detector
A fully managed AWS AI service that uses ML to automatically identify potentially fraudulent activities — such as online payment fraud and fake account creation — without requiring ML expertise from the user. Categorized under the AI Services layer of the AWS AI/ML stack.
Amazon Elastic File System (Amazon EFS)
An AWS fully managed, scalable shared file storage service. In generative AI workloads, EFS stores training data and model artifacts with shared access across multiple compute instances simultaneously.
Amazon Elastic Kubernetes Service (Amazon EKS)
An AWS managed Kubernetes container orchestration service used to deploy and scale containerized generative AI workloads. Provides infrastructure for hosting and auto-scaling generative AI model serving at enterprise scale.
AWS CloudFormation
An AWS infrastructure-as-code (IaC) service that allows users to define and provision AWS cloud infrastructure through machine-readable templates. Used to automate and standardize the provisioning of generative AI application infrastructure.
Token-Based Pricing
A pricing model used by AWS generative AI services (notably Amazon Bedrock and Amazon Q Developer) where users pay based on the volume of tokens — units of text or code — that are processed as inputs and/or generated as outputs by the service.
Spot Instances (Gen AI Cost Optimization)
A cost-optimization AWS EC2 feature that provides access to spare AWS compute capacity at significantly reduced prices. Applicable to generative AI training workloads to reduce infrastructure costs, though they can be interrupted when AWS reclaims the capacity.