What is an Agentforce Data Library? How are they associated with Agentforce? How do they help?
A library of content that can be used by an agent to answer questions.
Data Libraries can be assigned to Agentforce features to improve accuracy, add personalization, and build trust in generative AI responses.
What are the 2 types of sources for an Agentforce Data Library?
Salesforce Knowledge
Uploaded files (text, HTML, and PDFs)
Describe grounding in terms of Agentforce Data Libraries. How does grounding help in general?
Agentforce uses the information in the assigned data library at run time to ground LLM prompts and produce better, more accurate, and relevant LLM responses.
Grounding adds domain-specific knowledge or customer information to the prompt and provides context to the LLM
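A minimal sketch of what grounding means at the prompt level, assuming the relevant chunks have already been retrieved. The function name `build_grounded_prompt` and the prompt wording are hypothetical and only illustrate the idea; this is not how Agentforce assembles its prompts internally.

```python
def build_grounded_prompt(question: str, chunks: list[str]) -> str:
    """Place retrieved library chunks ahead of the question as context."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Hypothetical chunk standing in for content pulled from a data library.
prompt = build_grounded_prompt(
    "What is the return window?",
    ["Returns are accepted within 30 days of delivery."],
)
print(prompt)
```

The LLM sees the domain-specific text alongside the question, which is the extra context that grounding provides.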
Describe chunking in relation to Agentforce libraries. How does it help? What types of files are supported?
Data sources are broken down into smaller parts called chunks to make the search more efficient and improve relevance.
Chunking supports many types of data, including text, images, and audio files.
Describe indexing in relation to chunking. How does it help? How does a search run using an index?
Data that is split into chunks is organized and categorized, which is called indexing.
This process simplifies the search and retrieval of data chunks.
When information is required or a user asks a question, an agent searches through the chunks in the index.
Describe how a similarity score relates to a search run against chunks of a library?
When a search is performed by an agent, each chunk is given a similarity score, which indicates how closely its text matches the search.
Chunks with high similarity scores are returned and added into the prompt since they point to relevant articles or sources.
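A rough sketch of similarity scoring, assuming plain word counts and cosine similarity as a stand-in for the embeddings a real search index would use. The helper names (`vectorize`, `cosine_similarity`, `top_chunks`) and the sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words count vector (stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two count vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_chunks(query: str, chunks: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Score every chunk against the query and keep the k highest scores."""
    query_vec = vectorize(query)
    scored = [(cosine_similarity(query_vec, vectorize(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


chunks = [
    "Reset your password from the login page.",
    "Shipping takes five business days.",
    "Password rules require twelve characters.",
]
results = top_chunks("how do I reset my password", chunks, k=2)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

The shipping chunk shares no words with the query, so it scores zero and is left out of the prompt.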
Describe a retriever. How does it help and how can it be used? What does it help determine?
A retriever is a resource that can be embedded in a prompt template to search for, and return, relevant information from the data library.
The retriever assigned to a data library determines which datasets in Data Cloud are available to AI agents.
What standard agent action does an AI agent use to answer questions based on the data in the corresponding Agentforce Data Library?
Answer Questions with Knowledge
What is required to set up data libraries? What needs to be enabled and what permissions are required?
Data Cloud must be enabled
Data Cloud admin permissions are required to set up data libraries.
Where can data libraries be created and assigned to AI features? What types of features can they be assigned to?
Libraries can be assigned to AI features like Agentforce agents from the Agentforce Data Library page in Setup or when configuring an agent in the Agent Builder.
What configuration steps does creating a data library automate?
Steps like pushing data streams or files to Data Cloud, creating a search index and retriever, and linking the agents to that data.
Describe a data space related to creating a data library. What is it used to define?
A data space can be selected when creating a data library
It defines the Data Cloud data source that is used by the data library.
What 2 things are automatically created after setting a data source and what is it used by?
A search index and retriever are automatically created for the selected data source and used by the knowledge action.
How can a data library be configured to use the Knowledge base as its data source?
By accessing the Knowledge tab of the data library and selecting the Knowledge fields for the library to index
How do Identifying fields and Content fields help agents?
Identifying fields help agents locate the correct Knowledge articles.
Content fields help agents enrich responses with relevant details.
Where do we specify the Knowledge articles that should be included in the data library?
In Knowledge settings
How can indexed articles be restricted and filtered?
Indexed articles can be restricted to use only publicly available articles in the Knowledge base. They can also be filtered by specific data categories.
How can a data library be configured to use specific files as its data source?
By selecting the File Upload tab and uploading the files.
What is the limit for text or HTML files and PDF files to be uploaded?
Up to 4 MB of text or HTML files and 100 MB of PDF files can be uploaded
What types of AI features can data libraries be assigned to? How many data libraries can each feature use?
AI features including Agentforce Agents, Agentforce Service Agent, and Einstein Service Replies.
Each feature can use only one data library at a time
Where can the data library that an agent uses to generate responses be configured?
What types of libraries can be used and how?
What 2 aspects can be configured related to a library?
In the Knowledge tab of the Agent Builder.
An existing library can be selected, or a new library can be created.
The Knowledge fields or file uploads for the library to index can be configured
How can agent responses be based on all Knowledge articles and fields?
By selecting All Knowledge records and fields
How can indexed articles be restricted and how are filters handled?
Indexed articles can be restricted to public articles and filtered by specific data categories.
What can the Knowledge tab of the Agent Builder be used for?
Can be used to configure a data library for the responses.
What is unstructured data? Provide 5 examples. What is a benefit of bringing it into Data Cloud, and which AI features does it help?
Information without a specific, consistent format that can’t be easily stored in a relational database.
Examples:
PDFs
Knowledge articles
Sales call transcripts
Audio and video files
Previous emails
A benefit is to provide more context so agents can answer with richer, real-world context instead of generic responses in Einstein generative AI (Prompt Builder and Agentforce).
What 3 formats of unstructured data can Data 360 accept? (Not an exhaustive list)
HTML, TXT, and PDF
What is an external blob store?
A storage provider for unstructured data
What 3 types of external blob store connections does Data 360 support for unstructured data?
Amazon S3
Azure Blob Storage
Google Cloud Storage
What is a UDLO?
Unstructured Data lake object
What is a UDMO?
Unstructured Data Model Object
What can be created between an external blob store and Data 360? After creating it, how can unstructured data be referenced in Data 360?
A connection can be created between an external blob store and Data 360.
After creating the connection, the unstructured data can be referenced in Data 360 by creating an unstructured data lake object (UDLO) and mapping it to an unstructured data model object (UDMO)
Describe the relationship between UDLOs and UDMOs and provide an example
The relationship between UDLOs and UDMOs can be 1:1 or N:1, which means that each UDLO can be mapped to at most one UDMO, while multiple UDLOs can be mapped to a single UDMO.
Consider that you’re referencing case-recording data from multiple external blob stores. Three different UDLOs reference data from these three sources: CaseRecordingsFromAWSBucket1, CaseRecordingsFromAWSBucket2, and CaseRecordingsfromGCS. Because these sources are logically the same object, the individual UDLOs are mapped to one UDMO: CaseRecordings
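The N:1 rule can be pictured as a plain lookup table. This is only an illustrative sketch, not Data 360 configuration: a Python dict has one value per key, so each UDLO name maps to at most one UDMO, while several UDLOs can share the same UDMO. The helper `group_by_udmo` is hypothetical.

```python
from collections import defaultdict

# One entry per UDLO; dict keys are unique, so a UDLO cannot map to two UDMOs.
udlo_to_udmo = {
    "CaseRecordingsFromAWSBucket1": "CaseRecordings",
    "CaseRecordingsFromAWSBucket2": "CaseRecordings",
    "CaseRecordingsfromGCS": "CaseRecordings",
}


def group_by_udmo(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Invert the mapping to list every UDLO that feeds each UDMO."""
    grouped = defaultdict(list)
    for udlo, udmo in mapping.items():
        grouped[udmo].append(udlo)
    return dict(grouped)


grouped = group_by_udmo(udlo_to_udmo)
print(grouped)
```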
Describe how field level mappings are created between UDLOs and UDMOs and why.
Field-level mappings are automatically created between the two due to identical schemas.
What types of data do knowledge base articles contain?
Both structured and unstructured data in custom text fields.
What can the Knowledge article bundle be used for in Data 360? How does it relate to DMOs?
The Knowledge article bundle can be used to ingest Knowledge article data from the Salesforce org.
It includes default mappings to the relevant data model objects (DMOs).
What can Search Index Configurations be used for?
To ground search on unstructured and structured data and enhance the use of generative AI.
What is chunking and how does it help?
Chunking breaks long text fields into smaller, semantically meaningful chunks stored in chunk DMOs (CDMOs), so retrievers can pull just the information that answers the user’s question.
When can a chunking strategy be set?
When creating a search index configuration.
What 4 chunking strategies are supported?
Section-Aware Chunking
Semantic-Based Passage Extraction
Conversation-Based Chunking
Prepend Field Chunking
What is section-aware chunking? How can overly small chunks be avoided?
Where the title and heading elements are used to chunk documents.
When creating a search index, Max Token and Overlap Tokens settings can be used to avoid misidentifying short paragraphs or list items as standalone sections, leading to overly small chunks.
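A generic sketch of what a maximum chunk size and an overlap usually mean in chunking, assuming whitespace-separated words count as tokens. The parameter names `max_tokens` and `overlap_tokens` only mirror the setting names above; how Data 360 applies these settings to sections is not covered here.

```python
def chunk_tokens(text: str, max_tokens: int, overlap_tokens: int) -> list[str]:
    """Split text into windows of max_tokens words that share overlap_tokens words."""
    if max_tokens <= 0 or not 0 <= overlap_tokens < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap_tokens < max_tokens")
    tokens = text.split()
    step = max_tokens - overlap_tokens  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


chunks = chunk_tokens("a b c d e f g h", max_tokens=4, overlap_tokens=1)
print(chunks)
```

Each chunk repeats the last word of the previous one, so context that straddles a boundary is not lost.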
What is semantic-based passage extraction and what are considered dividers for chunks?
Where the semantic meaning inherent in HTML tags is used to chunk a document into passages.
The HTML elements (e.g., heading levels 1-6 <h1-h6>, thematic breaks <hr>, bold <b>, etc.) are considered logical boundaries for chunks.
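A simplified sketch of splitting HTML at logical boundaries, using Python's standard `html.parser`. It assumes only headings and `<hr>` start a new passage (the bold boundary mentioned above is left out for brevity). The class name `PassageSplitter` and the sample HTML are hypothetical, and this is not the actual Data 360 implementation.

```python
from html.parser import HTMLParser

# Tags treated as passage boundaries in this sketch.
BOUNDARY_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "hr"}


class PassageSplitter(HTMLParser):
    """Collect text and start a new passage whenever a boundary tag opens."""

    def __init__(self) -> None:
        super().__init__()
        self.passages: list[str] = []
        self._current: list[str] = []

    def flush(self) -> None:
        """Store the text gathered so far as one passage, if there is any."""
        text = " ".join(" ".join(self._current).split())
        if text:
            self.passages.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag in BOUNDARY_TAGS:
            self.flush()

    def handle_data(self, data):
        self._current.append(data)


def split_passages(html: str) -> list[str]:
    parser = PassageSplitter()
    parser.feed(html)
    parser.close()  # make sure any buffered text is delivered
    parser.flush()  # store the final passage
    return parser.passages


html = (
    "<h1>Returns</h1><p>Send items back within 30 days.</p>"
    "<hr><p>Refunds take 5 days.</p>"
    "<h2>Shipping</h2><p>Free over $50.</p>"
)
passages = split_passages(html)
print(passages)
```

Each heading stays with the text that follows it, so a passage carries its own topic label.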
What is conversation-based chunking? What is used to divide chunks?
Where the transcribed data from audio and video files are segmented into chunks, typically separated when the voice changes. If there are multiple speakers, each chunk represents the speech of an individual speaker
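A small sketch of chunking on speaker changes, assuming the transcript is already a list of (speaker, text) pairs. The function name `chunk_by_speaker` and the sample transcript are hypothetical.

```python
from itertools import groupby


def chunk_by_speaker(turns: list[tuple[str, str]]) -> list[dict]:
    """Merge consecutive lines from the same speaker into one chunk."""
    chunks = []
    for speaker, group in groupby(turns, key=lambda turn: turn[0]):
        chunks.append(
            {"speaker": speaker, "text": " ".join(text for _, text in group)}
        )
    return chunks


transcript = [
    ("Agent", "Thanks for calling."),
    ("Agent", "How can I help?"),
    ("Customer", "My order is late."),
    ("Agent", "Let me check."),
]
chunks = chunk_by_speaker(transcript)
for chunk in chunks:
    print(chunk)
```

A new chunk starts only when the speaker changes, so the agent's two opening lines end up in one chunk.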
What is prepend field chunking?
When creating a search index using the advanced builder, prepend fields can be configured when there’s a need to add additional fields or metadata to provide context for a chunk.
For example, the Title field can be prepended to provide more context for the Description field in a Knowledge article, which could exceed the optimal chunk size for prompt-based retrieval.
(So it doesn't define where chunks are split; it adds extra fields, like the Title, to the front of each chunk so the chunk carries more context.)
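A tiny sketch of the prepend idea, assuming the Description has already been split into chunks. The function `prepend_fields`, the "Name: value" prefix format, and the sample article are hypothetical, not the format Data 360 stores.

```python
def prepend_fields(chunks: list[str], record: dict[str, str], fields: list[str]) -> list[str]:
    """Put the chosen record fields in front of every chunk as context."""
    prefix = " | ".join(f"{name}: {record[name]}" for name in fields)
    return [f"{prefix}\n{chunk}" for chunk in chunks]


# Hypothetical Knowledge article whose Description was split into two chunks.
article = {"Title": "Resetting a password"}
description_chunks = [
    "Open the login page and choose Forgot Password.",
    "Follow the link in the email to set a new one.",
]
enriched = prepend_fields(description_chunks, article, ["Title"])
for chunk in enriched:
    print(chunk)
    print("---")
```

The second chunk never mentions passwords in its own text, but with the Title prepended it can still match a password question.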
What are 2 aspects we can edit on a search index configuration? How can the embeddings be regenerated?
Can be edited to add fields or file extensions for chunking.
The search index can then be rebuilt to re-chunk and regenerate the embeddings
How can an attachment be included? What are the requirements to include it and what are the steps?
If a search index is on a DMO of a Salesforce object with file attachments, the attachments can be included by clicking Include Attachments and selecting the ContentDocumentVersion UDMO.
What are 3 providers that models can be built, trained, and deployed with?
After the model is registered how can we get prediction criteria and how can we set it up with Data 360?
Amazon SageMaker, Google Cloud Vertex AI, and Databricks.
After model registration, the prediction criteria can be defined and the model can be connected with Data 360 to get predictions and insights.
When creating a search index configuration, at what level is the chunking strategy set?
The strategy can be set for each field.
What should be done to a search index configuration if the associated data changes often and we want to keep it up to date?
The search index should be rebuilt on a regular schedule to re-chunk changed HTML/PDF (or any unstructured) content and regenerate embeddings, ensuring the agent always works from the latest content.
What settings should be configured on a search index configuration to include PDFs attached to the indexed emails?
Include Attachments should be enabled in the search index configuration, and the ContentDocumentVersion UDMO should be selected so solutions stored in attached PDF files (for example) are searchable.
What is a good chunking strategy to use on HTML files to get small chunks and why?
Semantic-based passage extraction. This chunking strategy treats HTML structure (H1–H6, paragraphs, separators) as logical boundaries and creates coherent answer-sized passages, which improves retrieval precision compared to large, section-only chunks.
How should a search index configuration be set up to properly handle PII information?
The specialist should create an Unstructured Data Model Object (UDMO) for the patient-history notes and build a Data Cloud search index that’s populated only with PII-safe content. Sensitive fields, such as names, MRNs, phone numbers, and addresses, should be masked or removed so resulting chunks contain only non-PII text.
The index should use the section-aware chunking strategy to produce answer-sized passages that the retriever uses to ground the agent’s responses. This approach would give the Agentforce agent detailed historical context while enforcing privacy-by-design, instead of relying on prompt wording or encrypted vectors that could still expose PII at query time.
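A rough sketch of masking before indexing, assuming simple regular expressions for phone numbers and MRNs. The patterns, placeholder labels, and sample note are hypothetical. Names and addresses need more than regex, so treat this as an illustration of the "mask first, then chunk" order rather than a complete PII filter.

```python
import re

# Hypothetical patterns; real PII detection needs far broader coverage.
PII_PATTERNS = {
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d+\b"),
}


def mask_pii(text: str) -> str:
    """Replace each matched pattern with a bracketed placeholder label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


note = "Patient MRN: 483920 called from 555-123-4567 about knee pain."
masked = mask_pii(note)
print(masked)
```

Only the masked text would be passed on to chunking and indexing, so the resulting chunks never contain the raw values.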
What can we do about Agentforce Session Tracing data containing sensitive data like PII? What are 2 ways to configure the protection?
Use Data Cloud Data Governance to protect sensitive data by setting up dynamic data masking and field-specific data exclusion policies.
You need to grant access to the impacted Data 360 objects according to an access policy.
In the access policy, you can select from either attribute-based access control (ABAC) or role-based access control (RBAC).
What is a Data Kit? What are 2 ways it can be used?
Add Data 360 metadata components to Data Kit.
Then in Setup you can create a DMO from a Data Kit using the “Create from a data kit” option