Chunking
The process of breaking large documents or datasets into smaller, manageable pieces (chunks) before feeding them into a model or retrieval system. Critical in RAG (Retrieval-Augmented Generation) pipelines to improve the relevance of retrieved context passed to a model.
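A minimal sketch of fixed-size chunking with overlap (chunk sizes and the overlap value are illustrative choices, not a standard): overlapping windows reduce the chance that a relevant fact is split across a chunk boundary and lost at retrieval time.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    Each chunk starts (chunk_size - overlap) characters after the
    previous one, so adjacent chunks share `overlap` characters.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Production pipelines usually chunk on semantic boundaries (sentences, paragraphs, headings) rather than raw character counts, but the overlap idea carries over.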
Unimodal Model
A model that processes only one type of data modality (e.g., text only). Standard LLMs are unimodal.
Multimodal Model
A model capable of processing and generating content across multiple data types simultaneously — for example, text, images, and audio. Multimodal LLMs can generate image captions, create product designs from text descriptions, or analyze customer queries that include multimedia content.
Prompt Engineering
The practice of developing, designing, and optimizing the text instructions (prompts) given to a foundation model to guide its outputs toward desired results. It is the fastest and lowest-cost method to optimize FM behavior — no model retraining required.
Zero-Shot Prompting
A prompting technique where the model is given a task instruction with no examples. The model relies entirely on its pre-trained knowledge to produce the output.
Few-Shot Prompting
A prompting technique where one or more input-output examples are included in the prompt to demonstrate the desired pattern or behavior before asking the model to complete the task on new input.
Chain-of-Thought Prompting
A prompting technique that instructs the model to reason through a problem step-by-step before producing a final answer, improving performance on complex reasoning tasks.
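The three prompting styles above differ only in what text the prompt contains. A minimal sketch (the task, examples, and wording are invented for illustration, not a library API):

```python
def build_prompt(task: str,
                 examples: list[tuple[str, str]] = (),
                 chain_of_thought: bool = False) -> str:
    """Assemble a prompt string in one of three styles.

    Zero-shot: no examples, just the task.
    Few-shot: one or more input/output demonstrations first.
    Chain-of-thought: an explicit step-by-step reasoning instruction.
    """
    parts = []
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}")
    if chain_of_thought:
        parts.append("Think through the problem step by step before answering.")
    parts.append(f"Input: {task}\nOutput:")
    return "\n\n".join(parts)

zero_shot = build_prompt("Classify the sentiment of: 'Great service!'")
few_shot = build_prompt(
    "Classify the sentiment of: 'Great service!'",
    examples=[("'Terrible food.'", "negative"), ("'Loved it!'", "positive")],
)
cot = build_prompt("If I buy 3 items at $4 each, what is the total?",
                   chain_of_thought=True)
```

Whichever style is used, the assembled string is simply the text sent to the model; no retraining is involved, which is why prompt engineering is the cheapest optimization lever.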
Foundation Model (FM) Lifecycle (7)
The end-to-end framework for developing and deploying a foundation model, comprising seven stages: (1) Data Selection, (2) Model Selection, (3) Pre-training, (4) Fine-tuning/Adaptation, (5) Evaluation, (6) Deployment, and (7) Feedback and Iteration. Distinct from the traditional ML Pipeline in that it centers on adapting pre-trained large models rather than building from scratch.
Retrieval-Augmented Generation (RAG)
Enhances FM outputs at inference time by retrieving relevant information from an external knowledge base and injecting it as context into the prompt. RAG improves factual accuracy without requiring full model fine-tuning. Relies on chunking and embeddings for effective retrieval.
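A toy end-to-end sketch of the retrieve-then-inject flow: here the "embedding" is just a bag-of-words vector and retrieval is cosine similarity, standing in for a real embedding model and vector store.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a term-frequency vector. A real RAG
    pipeline would call a learned embedding model instead."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rag_prompt(question: str, chunks: list[str], top_k: int = 1) -> str:
    """Retrieve the most similar chunks and inject them as context."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

chunks = [
    "Amazon Bedrock offers on-demand and provisioned throughput pricing.",
    "Amazon EFS provides shared file storage for training data.",
]
prompt = rag_prompt("How is Bedrock priced?", chunks)
```

The model never sees the whole knowledge base, only the top-ranked chunks, which is why chunk quality and embedding quality dominate RAG accuracy.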
Intrinsic Interpretability
A form of model transparency that refers to the ability to understand the internal workings and decision-making processes of a model during its operation. Generative AI (especially large neural networks) has very low intrinsic interpretability — its internal logic is largely opaque.
Post-Hoc Interpretability
A complementary approach to model transparency that involves analyzing and explaining a model's outputs after they are generated (e.g., using explanation or attribution tools), rather than understanding the internal mechanics. Also challenging for large generative models.
Cross-Domain Performance (Gen AI Metric)
A performance metric evaluating the versatility of a generative AI model across different domains and task types beyond its primary training domain — important for systems deployed across diverse use cases.
Efficiency (Gen AI Metric)
A performance metric assessing the computational efficiency of a generative AI system, including inference time and resource utilization. Efficient models are essential for practical, scalable, and cost-effective deployments.
Conversion Rate (Gen AI Metric)
A business performance metric measuring the ability of a generative AI system to convert users or drive desired commercial actions, directly impacting business value and ROI.
Average Revenue Per User — ARPU (Gen AI Metric)
A business performance metric quantifying the average revenue generated per user interacting with a generative AI-powered system, providing a direct measure of financial impact in commercial applications.
Customer Lifetime Value — CLV (Gen AI Metric)
A business performance metric estimating the total revenue a customer generates over their entire relationship with a generative AI-powered product or service, evaluating long-term sustainability of the AI solution.
Amazon Bedrock — On-Demand Pricing
The default pricing model for Amazon Bedrock text generation FMs: users are charged for the input tokens processed plus the output tokens generated, with no long-term commitment required. Note: embedding models are charged only on input tokens processed.
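The on-demand formula is simple arithmetic over the two token counts; a sketch with placeholder per-1K-token rates (the rates below are invented, not real Bedrock prices):

```python
def on_demand_cost(input_tokens: int, output_tokens: int,
                   price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Bedrock-style on-demand cost: input and output tokens are
    billed at separate per-1K-token rates."""
    return (input_tokens / 1000) * price_in_per_1k \
         + (output_tokens / 1000) * price_out_per_1k

# 2,000 input tokens and 500 output tokens at hypothetical rates:
# 2 * 0.003 + 0.5 * 0.015 = 0.0135
cost = on_demand_cost(2000, 500, price_in_per_1k=0.003, price_out_per_1k=0.015)

# Embedding models: only input tokens are billed (output rate 0).
embed_cost = on_demand_cost(10_000, 0, price_in_per_1k=0.0001, price_out_per_1k=0.0)
```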
Amazon Bedrock — Provisioned Throughput
An Amazon Bedrock pricing option where users purchase a committed level of model processing capacity (measured in tokens per minute) in advance. Provisioned capacity guarantees predictable performance for time-sensitive, high-throughput workloads, at a higher cost than on-demand.
PartyRock (Amazon Bedrock Playground)
A hands-on, no-code sandbox environment built on Amazon Bedrock that allows users to experiment with and build AI-powered applications using foundation models without writing code. Designed for exploration, learning, and rapid prototyping.
Amazon SageMaker JumpStart
An AWS service that provides a curated set of pre-built ML solutions, pre-trained foundation models, and example notebooks for the most common generative AI and ML use cases. Enables faster experimentation by providing a proven starting point for model development and deployment.
Amazon Q
An AWS generative AI-powered assistant designed for enterprise work contexts. Amazon Q can be tailored to a specific organization's data, systems, and workflows to answer questions, generate content, summarize documents, and complete tasks — drawing on the company's own enterprise data.
Amazon Q Developer
An AWS ML-powered coding assistant that provides code recommendations, generates code, explains existing code, and identifies security vulnerabilities across a variety of programming languages. Accelerates software development workflows through AI-powered code generation.
Amazon Fraud Detector
A fully managed AWS AI service that uses ML to automatically identify potentially fraudulent activities — such as online payment fraud and fake account creation — without requiring ML expertise from the user. Categorized under the AI Services layer of the AWS AI/ML stack.
Amazon Elastic File System (Amazon EFS)
An AWS fully managed, scalable shared file storage service. In generative AI workloads, EFS stores training data and model artifacts with shared access across multiple compute instances simultaneously.
Amazon Elastic Kubernetes Service (Amazon EKS)
An AWS managed Kubernetes container orchestration service used to deploy and scale containerized generative AI workloads. Provides infrastructure for hosting and auto-scaling generative AI model serving at enterprise scale.
AWS CloudFormation
An AWS infrastructure-as-code (IaC) service that allows users to define and provision AWS cloud infrastructure through machine-readable templates. Used to automate and standardize the provisioning of generative AI application infrastructure.
Token-Based Pricing
A pricing model used by AWS generative AI services (notably Amazon Bedrock and Amazon Q Developer) where users pay based on the volume of tokens — units of text or code — that are processed as inputs and/or generated as outputs by the service.
Spot Instances (Gen AI Cost Optimization)
A cost-optimization AWS EC2 feature that provides access to spare AWS compute capacity at significantly reduced prices. Applicable to generative AI training workloads to reduce infrastructure costs, though they can be interrupted when AWS reclaims the capacity.