Modality (FM Selection Criterion)
The type(s) of data a foundation model can process or generate – for example, text only, images only, or multiple types simultaneously (multimodal). A primary selection criterion because it determines whether a model is architecturally capable of handling the application's input and output requirements.
Latency (FM Selection Criterion)
The time delay between sending a request to a foundation model and receiving its response. A critical selection criterion for real-time or interactive applications; higher-latency models may be acceptable for batch or asynchronous workloads.
Model Complexity (FM Selection Criterion)
A characteristic of a foundation model reflecting the number of parameters, depth of architecture, and computational demands. More complex models generally achieve higher accuracy on difficult tasks but require more compute, increase latency, and cost more to run.
Inference Parameters
Configuration settings that control the behavior of a foundation model at inference time (i.e., when generating a response). Key parameters include Temperature, Top-p, Top-k, Maximum Length, Stop Sequences, and Response Length. These parameters are adjusted to balance diversity, coherence, and output length.
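As a concrete illustration, these parameters might appear together in the JSON body of a Bedrock InvokeModel call. The key names below follow the Anthropic Claude Messages API on Bedrock (other model providers use different key names for the same concepts), and the values are arbitrary examples, not recommendations:

```python
import json

# Inference parameters for a single Bedrock InvokeModel call.
# Key names follow the Anthropic Claude Messages API on Bedrock;
# other providers expose the same concepts under different names.
body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,                 # Maximum Length / Response Length
    "temperature": 0.2,                # low = focused, high = creative
    "top_p": 0.9,                      # nucleus-sampling threshold
    "top_k": 50,                       # consider only the 50 most likely tokens
    "stop_sequences": ["\n\nHuman:"],  # stop generating when this string appears
    "messages": [{"role": "user", "content": "Summarize our refund policy."}],
}
payload = json.dumps(body)  # serialized request body for the API call
```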
Temperature (Inference Parameter)
A randomness and diversity parameter (range 0–1) that controls how deterministic or creative a model's output is. Low values (e.g., 0.2) produce more focused, consistent responses; high values (e.g., 1.0) produce more diverse, creative responses. Modifies the probability distribution over the model's token choices.
Top-p (Nucleus Sampling)
An inference parameter (range 0–1) that limits the model's token choices to the smallest set of tokens whose cumulative probability meets the threshold p. Low values (e.g., 0.25) restrict word choices; high values (e.g., 0.99) allow a wide range of word choices, increasing diversity.
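A minimal sketch of how temperature and top-p interact during sampling, using toy logits for four candidate tokens (real models operate over vocabularies of tens of thousands of tokens):

```python
import math

def apply_temperature(logits, temperature):
    """Scale logits by 1/temperature, then softmax into probabilities."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}  # renormalized distribution

logits = [2.0, 1.0, 0.5, -1.0]  # toy scores for 4 candidate tokens

# Low temperature sharpens the distribution; high temperature flattens it.
sharp = apply_temperature(logits, 0.2)
flat = apply_temperature(logits, 1.0)

# A low top-p keeps only the most probable token(s) before sampling.
nucleus = top_p_filter(flat, 0.5)
```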
Amazon Bedrock Knowledge Bases
An Amazon Bedrock feature that implements a fully managed RAG workflow. It connects to structured data sources via SQL and ingests unstructured data from sources such as Amazon S3, Confluence, Microsoft SharePoint, Salesforce, and Web Crawler. It automatically creates embeddings and stores them in a supported vector search database (Aurora, OpenSearch Serverless, Neptune Analytics, MongoDB, Pinecone, Redis Enterprise Cloud).
Vector Database (Vector Store)
A specialized database designed to store, index, and query high-dimensional vector embeddings efficiently. Used in RAG pipelines and semantic search applications. AWS vector-capable services include Amazon OpenSearch Service, Amazon OpenSearch Serverless, Amazon Aurora PostgreSQL-Compatible Edition, Amazon RDS for PostgreSQL, Amazon Neptune ML, Vector search for Amazon MemoryDB, and Amazon DocumentDB.
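The core query a vector store answers can be sketched as brute-force cosine-similarity search. Production stores use approximate-nearest-neighbor indexes (e.g., HNSW) rather than this linear scan, and the document ids and 3-dimensional vectors here are made up for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def knn(query, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(store, key=lambda doc_id: cosine(query, store[doc_id]), reverse=True)
    return ranked[:k]

# Toy 3-dimensional "embeddings"; real embeddings have hundreds
# or thousands of dimensions.
store = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-faq":  [0.1, 0.9, 0.2],
    "returns-guide": [0.8, 0.2, 0.1],
}
neighbors = knn([1.0, 0.0, 0.0], store, k=2)
```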
Amazon OpenSearch Service
A fully managed AWS search and analytics service based on OpenSearch. Supports vector search via built-in k-nearest neighbor (k-NN) and semantic search capabilities, making it suitable as a vector store for RAG implementations and ML-augmented search experiences.
Amazon OpenSearch Serverless
A deployment option for Amazon OpenSearch Service. Provides the same vector search and indexing capabilities as Amazon OpenSearch Service without requiring infrastructure management – suited for variable or unpredictable workloads.
Amazon Aurora PostgreSQL-Compatible Edition
A fully managed, high-performance relational database service compatible with PostgreSQL. Supports vector search functionality through the pgvector extension, enabling vector-similarity search combined with traditional relational data operations in the same database.
Amazon RDS for PostgreSQL
Amazon's managed relational database service running the PostgreSQL engine. Supports the pgvector extension, which provides vector-similarity search capabilities, making it a suitable vector store for RAG implementations.
Amazon Bedrock Agents
An Amazon Bedrock feature that enables foundation models to perform multi-step, autonomous tasks by orchestrating a sequence of actions – such as calling APIs, querying knowledge bases, and executing code – to complete complex business workflows. Can understand the user's goal, break it into steps, and execute them with minimal human intervention.
FM Customization – Tradeoffs (4)
The cost-capability tradeoffs between the four main approaches to customizing a foundation model's behavior, in increasing order of cost and control: (1) Prompt Engineering – fastest, cheapest, no retraining; (2) RAG – adds external knowledge at inference time, no weight changes; (3) Fine-tuning – modifies model weights on custom data, higher cost; (4) Pre-training – trains from scratch or continues pre-training, highest cost and control.
Single-Shot Prompting
A prompting technique where exactly one input-output example is included in the prompt to demonstrate the desired pattern before asking the model to perform the task. Sits between zero-shot (no examples) and few-shot (multiple examples) in terms of guidance provided.
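A sketch of how the same task prompt grows from zero-shot to single-shot; the task wording and the example review are invented for illustration:

```python
def build_prompt(task, examples, query):
    """Assemble a prompt with 0 (zero-shot), 1 (single-shot),
    or N (few-shot) worked input-output examples."""
    parts = [task]
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

task = "Classify the sentiment of each review as positive or negative."
one_example = [("Great battery life, totally worth it.", "positive")]

zero_shot = build_prompt(task, [], "The screen cracked within a week.")
single_shot = build_prompt(task, one_example, "The screen cracked within a week.")
```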
Negative Prompting
A prompt engineering technique that explicitly instructs the model about what NOT to include, generate, or do – constraining the output by specifying undesired content, formats, or behaviors. Helps prevent hallucinations, off-topic content, and undesired styles.
Guardrails (Prompt Engineering)
Safeguards or constraints built into prompts or the model deployment layer to prevent the generation of undesirable, harmful, or biased content. On AWS, Amazon Bedrock supports configurable guardrails as a managed capability.
Adversarial Prompting
A category of security risks associated with prompt engineering where malicious actors craft inputs designed to manipulate, deceive, or exploit a foundation model. Includes exposure, prompt injection, jailbreaking, hijacking, and poisoning attacks.
Exposure (Adversarial Prompting Risk)
An adversarial prompt engineering risk where a crafted prompt causes the model to reveal sensitive information, trade secrets, system instructions, or confidential data it should not disclose – resulting in data breaches or intellectual property theft.
Prompt Injection
An adversarial attack where a malicious actor embeds harmful or manipulative instructions within user-provided input, causing the model to execute unintended commands or generate harmful content by overriding the original prompt's intent.
Jailbreaking (Prompt Attack)
An adversarial technique that exploits vulnerabilities in a model's safety constraints to bypass its ethical safeguards or content policies, causing it to produce content it was designed to refuse.
Prompt Hijacking
An adversarial prompt technique where a malicious actor crafts inputs that steer the model's responses in a desired (often harmful or off-brand) direction, effectively taking control of the model's output for malicious purposes.
Poisoning (Prompt / Training Attack)
An attack that introduces corrupted, biased, or adversarial data into the model's training dataset, manipulating the model's learned behavior and outputs – leading to systematically biased or unreliable results in production.
Amazon Nova
A family of AWS foundation models available through Amazon Bedrock that provide pre-trained generative capabilities and can be customized and controlled through prompt engineering. Used alongside other Amazon Bedrock features for building generative AI solutions.
Continuous Pre-training
An ongoing FM training approach that further pre-trains an already pre-trained model on additional diverse data to continually expand its knowledge base and adaptability – while retaining existing knowledge. Produces progressively more knowledgeable and versatile models over time. Mitigates the risk of catastrophic forgetting associated with fine-tuning.
Catastrophic Forgetting
A phenomenon that can occur during fine-tuning where the model loses (forgets) knowledge it acquired during pre-training as its weights are adjusted to optimize for the new, narrower task. Mitigated by continuous pre-training and careful fine-tuning data design.
Instruction Fine-tuning
A fine-tuning method that trains a pre-trained FM using examples of instructions paired with the desired model responses – teaching the model how to follow a specific type of instruction. Prompt tuning is a variant of instruction fine-tuning.
Reinforcement Learning from Human Feedback (RLHF)
A fine-tuning method that incorporates human feedback data to align a foundation model's behavior with human preferences – making outputs more helpful, accurate, honest, and safe. A human evaluator rates or ranks model outputs, and the model is trained to maximize the behaviors humans prefer.
Holdout Validation (Fine-tuning)
A model evaluation approach used during fine-tuning where a separate validation dataset (not used during training) is used to assess model performance on unseen data during or after the fine-tuning process â informing decisions about further training or deployment.
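A minimal holdout split might look like this; the record contents, the 80/20 ratio, and the fixed seed are illustrative choices:

```python
import random

def holdout_split(records, val_fraction=0.2, seed=7):
    """Shuffle records, then split into (training, holdout validation) sets.
    The validation set is never used for weight updates, only for evaluation."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

records = list(range(100))  # stand-ins for fine-tuning examples
train, val = holdout_split(records)
```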
Amazon SageMaker Canvas
A SageMaker feature that provides a visual, low-code/no-code interface for creating ML data preprocessing flows and building ML models. Enables feature engineering workflows and model development with minimal coding, accessible to non-developers.
Amazon SageMaker Clarify
A SageMaker tool that analyzes training data and ML model outputs to detect and measure potential bias across multiple dimensions (such as gender, race, or age). Also provides model explainability capabilities to help developers understand and address fairness and transparency issues in their ML models.
Amazon SageMaker Ground Truth
A SageMaker data labeling service that manages human-in-the-loop data labeling workflows for training datasets. Provides a comprehensive set of capabilities to help users build and manage labeled datasets at scale, ensuring high-quality labeled data for model training across the ML lifecycle.
Amazon EMR (Elastic MapReduce)
An AWS managed big data processing service that runs open-source frameworks such as Apache Spark, Apache Hive, and Presto at scale. Amazon SageMaker Studio provides built-in integration with Amazon EMR for scalable data preparation tasks.
Human Evaluation (FM)
A foundation model evaluation method where human raters assess the quality, coherence, accuracy, and usefulness of the model's outputs – providing a gold-standard signal that automated metrics may miss, especially for open-ended or creative tasks.
Probing Tasks (FM Evaluation)
Diagnostic evaluation tasks designed to systematically analyze a foundation model's capabilities and limitations in specific areas – for example, arithmetic reasoning, factual recall, or logical inference – by testing targeted sub-skills.
Robustness Testing (FM Evaluation)
An FM evaluation approach that assesses the model's ability to handle edge cases, adversarial inputs, or distribution shifts without significant performance degradation. Tests whether the model generalizes reliably beyond its training distribution.
ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
A set of evaluation metrics used to assess the quality of automatically generated summaries and machine translations by comparing them to one or more reference texts. Measures overlap (recall-oriented) between generated and reference content.
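A toy implementation of the recall-oriented idea behind ROUGE-1. Real ROUGE also covers bigrams (ROUGE-2), longest common subsequence (ROUGE-L), and stemming; this sketch only counts clipped unigram overlap:

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams also present in the candidate,
    with counts clipped so a repeated word cannot be over-credited."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / sum(ref.values())

score = rouge1_recall(
    "the cat sat on the mat",   # generated summary
    "the cat lay on the mat",   # reference summary
)
```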
BLEU (Bilingual Evaluation Understudy)
A metric used to evaluate the quality of machine-generated text – particularly machine translation – by measuring similarity between the generated text and one or more reference translations. Considers both precision and brevity (brevity penalty).
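A simplified, single-sentence version of the BLEU idea. Real BLEU geometrically averages 1- to 4-gram precisions over a whole corpus; this sketch uses clipped unigram precision plus the brevity penalty:

```python
import math
from collections import Counter

def bleu1(candidate, reference):
    """Clipped unigram precision times the brevity penalty.
    (A simplification: real BLEU combines 1- to 4-gram precisions.)"""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    clipped = sum(min(cand_counts[w], ref_counts[w]) for w in cand_counts)
    precision = clipped / len(cand)
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

score = bleu1("the cat sat on the mat", "the cat lay on the mat")
```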
BERTScore
A semantic similarity metric for evaluating generated text that uses pre-trained bidirectional encoder representations from transformers (BERT) models to compute contextualized embeddings for both the generated and reference texts, then calculates cosine similarity between them. More robust to paraphrasing than n-gram metrics like BLEU and ROUGE.
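The greedy-matching step behind BERTScore recall can be sketched with made-up 2-dimensional vectors standing in for real contextual BERT embeddings (which have hundreds of dimensions and come from a pre-trained encoder):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def bertscore_recall(cand_embs, ref_embs):
    """For each reference token embedding, take its best cosine match
    among candidate token embeddings, then average (BERTScore recall)."""
    return sum(max(cosine(r, c) for c in cand_embs) for r in ref_embs) / len(ref_embs)

# Hypothetical per-token "embeddings" for a 2-token reference and a
# paraphrase-like candidate: close in direction but not identical.
reference = [[1.0, 0.0], [0.0, 1.0]]
candidate = [[0.9, 0.1], [0.1, 0.9]]
score = bertscore_recall(candidate, reference)
```

Because matching happens in embedding space rather than on surface n-grams, a paraphrase scores high here even when BLEU/ROUGE overlap would be low.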