Vocabulary flashcards covering key Domain 2 Generative AI concepts, AWS services, model architectures, security, RAG, evaluation metrics, and CAF-AI perspectives.
Generative AI
A subset of deep learning in which models create new, original content (text, images, audio, code) by learning patterns from large datasets.
Foundation Model (FM)
An extremely large, pre-trained neural network with billions of parameters that acts as a base for many downstream tasks.
Parameters
The internal variables a model learns during training; more parameters generally mean greater capacity and capability.
Prompt
User-supplied input (instructions, context, questions, examples) that tells a generative model what to do.
Completion
The output text (or image, etc.) that a generative AI model returns in response to a prompt.
Inference
The run-time process where a trained model uses its knowledge to generate a completion from a prompt.
Prompt Engineering
The skill of designing, structuring, and refining prompts to obtain the desired model output.
In-Context Learning
Technique of providing task examples inside the prompt so the model can mimic them without retraining.
Zero-Shot Learning
Asking the model to perform a task with no examples in the prompt.
One-Shot Learning
Supplying exactly one example of the task inside the prompt.
Few-Shot Learning
Including multiple examples of the task inside the prompt to guide the model.
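For illustration, a few-shot prompt might look like the sketch below; the sentiment-classification task and its wording are assumptions, not part of the deck:

```python
# A minimal few-shot prompt sketch: two labeled examples guide the model,
# then the final line asks it to complete the pattern (assumed task).
few_shot_prompt = """Classify the sentiment of each review as Positive or Negative.

Review: "The battery lasts all day." -> Positive
Review: "The screen cracked after a week." -> Negative
Review: "Setup took five minutes and everything just worked." ->"""
```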
Transformer Architecture
State-of-the-art neural network design (introduced in “Attention Is All You Need,” 2017) that processes sequences in parallel using self-attention.
Tokenizer
Component that splits human text into tokens and converts them to numeric IDs the model can process.
Token
Basic data unit for an LLM (roughly a word or sub-word) used to measure context window size and pricing.
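A toy sketch of the tokenizer/token idea above; the tiny vocabulary is made up, and real LLM tokenizers use sub-word schemes such as BPE rather than whitespace splitting:

```python
# Toy whitespace tokenizer -- maps words to numeric token IDs.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3}

def tokenize(text: str) -> list[int]:
    """Split text on whitespace and map each piece to a token ID."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

print(tokenize("The cat sat"))  # [1, 2, 3]
```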
Vector
Ordered list of numbers representing features of a concept; enables mathematical comparison of similarity.
Embedding
Dense vector representation of a token or item that captures its semantic meaning and context.
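A minimal sketch of how embeddings enable similarity math; the three-dimensional vectors are invented stand-ins for real high-dimensional embeddings:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Values near 1 mean the underlying items are semantically close."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

king  = np.array([0.8, 0.3, 0.1])   # made-up embedding
queen = np.array([0.7, 0.4, 0.1])   # made-up embedding
print(cosine_similarity(king, queen))  # close to 1.0
```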
Transformer Network
Neural network built from stacked encoder/decoder blocks using self-attention and positional embeddings.
Self-Attention Mechanism
Process that lets a Transformer weigh the importance of every token relative to every other token when generating output.
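A minimal NumPy sketch of scaled dot-product self-attention; the random matrix stands in for learned query/key/value projections:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Q, K, V):
    """Each token's output is a weighted sum of all value vectors,
    with weights derived from query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # token-to-token relevance
    weights = softmax(scores, axis=-1)   # normalize per query token
    return weights @ V

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))              # 3 tokens, embedding dim 4
print(self_attention(x, x, x).shape)     # (3, 4)
```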
Positional Embeddings
Extra vectors added to token embeddings to convey each token’s position in the sequence so order is preserved.
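A sketch of the sinusoidal positional encodings described in the original Transformer paper; the dimensions below are arbitrary:

```python
import numpy as np

def sinusoidal_positions(seq_len: int, d_model: int) -> np.ndarray:
    """Each position gets a unique vector, added to its token embedding
    so the model can distinguish token order."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

print(sinusoidal_positions(seq_len=4, d_model=8).shape)  # (4, 8)
```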
Context Window
Maximum number of tokens (prompt + completion) an LLM can handle in a single request.
Encoder (Transformer)
Half of a Transformer that reads the entire input and produces a contextual representation of it.
Decoder (Transformer)
Half of a Transformer that takes encoder context (or previous outputs) and generates output tokens one by one.
Softmax Output Layer
Final function that converts raw model scores into a probability distribution over possible next tokens.
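A tiny worked example of softmax turning raw next-token scores into a probability distribution; the logits are made up:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])   # raw next-token scores
probs = np.exp(logits - logits.max()) / np.exp(logits - logits.max()).sum()
print(probs, probs.sum())            # ~[0.659 0.242 0.099], sums to 1.0
```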
Pre-Training
Compute-intensive initial training phase where the model learns statistical patterns from large, unlabeled data.
Self-Supervised Learning
Training method in which the model generates its own labels (e.g., predicting the next word) instead of using human-labeled data.
Unimodal Model
Generative model that accepts and outputs only one data type (e.g., text-to-text).
Multimodal Model
Model capable of processing and/or generating multiple data types, such as text, images, or audio.
Diffusion Model
Generative model that creates content by reversing a stepwise noising process, refining random noise into coherent output.
Stable Diffusion
Efficient diffusion architecture that performs denoising in a low-dimensional latent space to generate images from text.
Forward Diffusion
Training process of progressively adding noise to data so the model learns to predict, and later remove, that noise.
Reverse Diffusion
Generative process of starting with noise and iteratively removing it to create new content.
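A simplified sketch of forward diffusion with an assumed linear noise schedule (real models use carefully tuned schedules); reverse diffusion runs this process backwards:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(x0: np.ndarray, t: float) -> np.ndarray:
    """Blend clean data with Gaussian noise at noise level t in [0, 1]."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(1 - t) * x0 + np.sqrt(t) * noise  # simplified schedule

clean = np.ones((4, 4))                  # stand-in "image"
print(add_noise(clean, t=0.9).round(2))  # mostly noise near t = 1
```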
Latent Space
Compressed, abstract feature space where models operate to represent data more efficiently than raw pixels or text.
Retrieval-Augmented Generation (RAG)
Technique that enriches a prompt with retrieved, authoritative data before generation to reduce hallucinations and add freshness.
Knowledge Bases for Amazon Bedrock
Fully managed AWS feature that automates RAG: ingesting data, creating embeddings, storing them, and retrieving context for prompts.
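A minimal sketch of querying a Bedrock knowledge base via boto3; the knowledge base ID and model ARN are placeholders:

```python
import boto3

client = boto3.client("bedrock-agent-runtime")

response = client.retrieve_and_generate(
    input={"text": "What is our refund policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB123EXAMPLE",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2",  # placeholder
        },
    },
)
print(response["output"]["text"])  # answer grounded in retrieved context
```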
Vector Database
Specialized store that indexes embeddings and returns semantically similar vectors for a query vector.
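What a vector database does conceptually, shown as a brute-force sketch; production stores use approximate indexes such as HNSW, and all vectors here are made up:

```python
import numpy as np

stored = np.array([[0.9, 0.1], [0.1, 0.9], [0.7, 0.3]])  # document embeddings
query = np.array([0.8, 0.2])                              # query embedding

# Cosine similarity of the query against every stored embedding.
sims = stored @ query / (np.linalg.norm(stored, axis=1) * np.linalg.norm(query))
print(np.argsort(sims)[::-1])  # indices of the nearest documents first
```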
Ingestion (Knowledge Base)
Process of chunking source documents, generating embeddings, and loading them into a vector database.
Amazon OpenSearch Serverless
Fully managed, pay-per-use AWS vector database option ideal for quick, low-overhead RAG setups.
Pinecone
Purpose-built, high-performance vector database suited for large-scale, low-latency semantic search workloads.
Redis Enterprise Cloud
In-memory database choice for real-time, ultra-low-latency vector search, often used when Redis is already in use.
Amazon Aurora (pgvector)
Relational database (PostgreSQL) with vector search extension, ideal when structured data already resides in Aurora.
MongoDB Atlas
Document database offering vector search; chosen when data is stored in MongoDB JSON documents.
Amazon S3 (RAG Data Source)
Primary storage location where Bedrock Knowledge Bases ingest supported text-centric documents.
Hallucination (LLM)
Model output that is plausible-sounding but factually incorrect or fabricated.
Prompt Injection
Attack where a malicious input causes the model to ignore original instructions and perform unintended actions.
Data Poisoning
Attack that corrupts training data to bias or compromise a model’s behavior.
Model Inversion
Attack attempting to reconstruct private training data by repeatedly querying a model.
ROUGE
Metric that evaluates automatic text summarization quality by comparing model output to reference summaries.
BLEU
Metric that measures machine-translation quality by comparing model output to human reference translations.
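A toy unigram-overlap function to convey the flavor of these metrics; real ROUGE and BLEU use n-grams, count clipping, and (for BLEU) a brevity penalty:

```python
def rouge1_recall(candidate: str, reference: str) -> float:
    """Fraction of unique reference words that appear in the candidate."""
    cand, ref = candidate.lower().split(), set(reference.lower().split())
    return sum(1 for w in ref if w in cand) / len(ref)

print(rouge1_recall("the cat sat on the mat", "the cat lay on the mat"))  # 0.8
```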
Generative Adversarial Network (GAN)
Model consisting of competing generator and discriminator networks that produce high-fidelity synthetic data, especially images.
Variational Autoencoder (VAE)
Encoder–decoder model that learns a latent space to generate new data and allows controlled attribute manipulation.
Reinforcement Learning from Human Feedback (RLHF)
Fine-tuning approach where human-ranked outputs create a reward model to align an LLM with human preferences.
Amazon Bedrock
AWS fully managed service giving API access to multiple foundation models with usage-based pricing.
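A minimal Bedrock inference sketch via boto3; the model ID and request schema follow the Titan text format and should be treated as assumptions that vary by model and region:

```python
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.invoke_model(
    modelId="amazon.titan-text-express-v1",  # assumed model ID
    body=json.dumps({
        "inputText": "Summarize what a foundation model is in one sentence.",
        "textGenerationConfig": {"maxTokenCount": 100, "temperature": 0.5},
    }),
)
print(json.loads(response["body"].read())["results"][0]["outputText"])
```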
Amazon SageMaker JumpStart
SageMaker hub offering pre-trained models, notebooks, and one-click deployments to accelerate ML and generative AI projects.
Amazon Titan
AWS family of foundation models (text and embeddings) available exclusively through Bedrock.
Amazon Q Developer (CodeWhisperer)
Generative AI coding assistant that produces code suggestions directly in an IDE from natural-language comments.
PartyRock
Playground built on Bedrock that lets users experiment with prompt engineering by rapidly creating small AI apps.
AWS Nitro System
Hardware foundation of modern EC2 instances providing isolated, hardware-enforced security for customer workloads.
AWS Trainium
AWS-designed chip optimized for high-performance, cost-efficient training of large ML models.
AWS Inferentia
AWS-designed chip optimized for high-throughput, low-cost inference of ML models.
Transfer Learning
Method of starting with a pre-trained model and fine-tuning it on a smaller, domain-specific dataset.
CAF-AI (Cloud Adoption Framework for AI)
AWS strategic framework guiding organizations across six perspectives to scale AI responsibly and effectively.
CAF-AI Business Perspective
Focuses on aligning AI initiatives with measurable business outcomes and ROI.
CAF-AI People Perspective
Addresses workforce skills, culture, and change management for AI adoption.
CAF-AI Governance Perspective
Ensures responsible, ethical, and compliant AI through policies and risk management.
CAF-AI Platform Perspective
Covers technology architecture, MLOps pipelines, and scalable infrastructure for AI workloads.
CAF-AI Security Perspective
Protects data, models, and intellectual property against threats unique to AI systems.
CAF-AI Operations Perspective
Defines processes for running, monitoring, and continuously improving AI systems in production.
High Availability (HA)
Design goal of minimizing downtime so a system stays accessible (e.g., 99.99% uptime).
Fault Tolerance (FT)
Capability of a system to keep operating without interruption even when components fail.
AWS Region
Geographically isolated AWS area containing multiple Availability Zones; key to disaster-recovery strategies.
Availability Zone (AZ)
Physically separate data-center cluster within a Region; applications spanning multiple AZs gain HA and FT.
Edge Location
AWS Point of Presence used by CloudFront and Global Accelerator to cache or route traffic closer to users.
Vector Embeddings Model
Model (e.g., Amazon Titan Text Embeddings) that converts text into high-dimensional numeric vectors for similarity search.
Token-Based Pricing
Pay-per-use cost model where charges depend on the number of tokens processed (input + output).
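Illustrative cost arithmetic; the per-token rates below are hypothetical:

```python
input_tokens, output_tokens = 1_200, 300
price_per_1k_in, price_per_1k_out = 0.0005, 0.0015  # hypothetical USD rates

cost = (input_tokens / 1000 * price_per_1k_in
        + output_tokens / 1000 * price_per_1k_out)
print(f"${cost:.6f}")  # $0.001050
```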
Self-Hosting (LLM)
Running a model on your own EC2/GPU infrastructure, incurring 24/7 compute costs and operational overhead.
Temperature (LLM Parameter)
Inference setting controlling randomness of output; higher values yield more creative but less deterministic text.
Top-p (Nucleus) Sampling
Decoding method where the model samples from the smallest set of top probable tokens whose cumulative probability exceeds p.
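A sketch combining both decoding parameters above (temperature scaling, then nucleus filtering); the tokens and logits are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
tokens = np.array(["cat", "dog", "car", "sky"])
logits = np.array([2.0, 1.5, 0.3, 0.1])

def sample(logits, temperature=1.0, top_p=0.9):
    scaled = logits / temperature                 # higher T flattens the distribution
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]               # most probable first
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, top_p) + 1      # smallest set whose mass exceeds p
    nucleus = order[:cutoff]
    p = probs[nucleus] / probs[nucleus].sum()     # renormalize within the nucleus
    return rng.choice(nucleus, p=p)

print(tokens[sample(logits, temperature=0.7, top_p=0.9)])
```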
Embedding Layer
First neural-network layer that maps discrete token IDs to learned dense vectors.
Statelessness (LLM)
Property that the model does not retain conversational memory between separate calls unless explicitly provided.
Grounding (RAG)
Supplying external, authoritative data to an LLM so it can generate fact-based, context-relevant answers.
Chunking (RAG)
Splitting large documents into smaller text pieces before embedding and storing them for retrieval.
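A fixed-size character chunker sketch with overlap so context is not cut mid-thought; real pipelines often chunk by tokens and tune sizes empirically:

```python
def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Slice text into overlapping windows of `size` characters."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

doc = "word " * 200
print(len(chunk(doc)), "chunks")  # 7 chunks
```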
Soft Prompt Tuning
Lightweight fine-tuning approach that learns a small set of prompt tokens instead of updating the entire model.