Adding vectorized documents for grounding foundation model prompts
Last updated: Jan 29, 2025
Add grounding documents to a vector index that can be used to add contextual information to foundation model prompts for retrieval-augmented generation tasks.
Required permissions
To create vector index assets and associate them with a prompt, you must have the Admin or Editor role in a project.
When you use foundation models for question-answering tasks, you can help the foundation model generate factual and up-to-date answers by adding contextual information to the foundation model prompt. When a foundation model is given factual information
as input, it is more likely to incorporate that factual information in its output.
To make contextual information available to a prompt, first add grounding documents to a vector index asset, and then associate the vector index with a foundation model prompt.
The task of adding grounding documents to an index is depicted in the retrieval-augmented generation diagram by the preprocessing step, where company documents are vectorized.
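As a rough sketch of that flow, the following Python example uses the open-source Chroma client (the same database that backs the in-memory vector index) to vectorize a few documents and then retrieve context for a prompt. The collection name, document contents, and the commented-out generate() call are illustrative assumptions, not part of any watsonx API.

```python
# Minimal RAG grounding sketch with the open-source Chroma client.
# Collection name, documents, and the generate() call are illustrative.
import chromadb

client = chromadb.Client()  # in-memory instance; data is not persisted
collection = client.create_collection(name="grounding_docs")

# Preprocessing step: vectorize and store the grounding documents.
collection.add(
    ids=["doc-1", "doc-2"],
    documents=[
        "Our refund policy allows returns within 30 days of purchase.",
        "Support is available Monday through Friday, 9 AM to 5 PM EST.",
    ],
)

# At prompt time: retrieve the passages most similar to the question
# and prepend them to the foundation model prompt as context.
question = "How long do customers have to request a refund?"
results = collection.query(query_texts=[question], n_results=2)
context = "\n".join(results["documents"][0])

prompt = (
    f"Answer using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)
# response = model.generate(prompt)  # send to a foundation model of your choice
```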
Supported vector stores
You can use one of the following vector stores to store your grounding documents:
In memory: A Chroma database vector index that is associated with your project and provides temporary vector storage.
Note: The in-memory vector index asset is created for you automatically; you don't need to set up the vector store.
Elasticsearch: A third-party vector index that you set up and connect to your project.
watsonx.data Milvus: A third-party vector index that you can set up in watsonx.data, and then connect to your project.
Choosing a vector store
When you create a vector index for your documents, you can choose the vector store to use. To determine the right vector store for your use case, consider the following factors:
What embedding models can be used with the vector store?
The embedding models that you can use to vectorize documents that you add to the index differ by vector store. For details, see Embedding models and vectorization settings.
How many grounding documents do you want to be able to search from your foundation model prompts?
When you connect to a third-party vector store, you can choose to do one of the following things:
Add files to vectorize and store in a new vector index or collection in the vector store.
Use vectorized data from an existing index or collection in the vector store.
The number of files that you can add at the time that you create the vector index is limited. If you want to vectorize a larger set of documents, such as PDF files that total more than 50 MB, use a third-party vector store. With a third-party vector store, you can create a collection or index with more documents directly in the data store first, and then connect to that existing collection or index when you create the vector index asset to associate with your prompt, as in the sketch that follows.
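For example, with a watsonx.data Milvus instance you might build the larger collection directly by using the pymilvus client, and then connect to it as an existing collection from your project. The connection URI, token, collection name, and embedding dimension below are placeholder assumptions.

```python
# Sketch: build a larger collection directly in Milvus with pymilvus,
# then connect to it as an existing collection when you create the
# vector index asset. URI, token, and dimension are placeholders.
from pymilvus import MilvusClient

client = MilvusClient(
    uri="https://<your-milvus-host>:<port>",  # watsonx.data Milvus endpoint
    token="<user>:<password>",
)

# 384 matches the output size of all-MiniLM-L6-v2; use your model's dimension.
client.create_collection(collection_name="company_docs", dimension=384)

# Insert pre-computed vectors in batches; building the collection directly
# avoids the file limits that apply when you upload through the vector index.
client.insert(
    collection_name="company_docs",
    data=[
        {"id": 1, "vector": [0.0] * 384, "text": "chunk 1 text"},
        {"id": 2, "vector": [0.0] * 384, "text": "chunk 2 text"},
    ],
)
```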
Supported grounding document file types
When you add grounding documents to create a new vector index, you can upload files or connect to a data asset that contains files.
The following table lists the supported file types and maximum file sizes that you can add when you create a new vector index. The supported file types differ by vector store.
In the following table, file types are listed in the first column and the maximum total file size that is allowed for each file type in the second. A checkmark (✓) indicates that the vector store named in the column header supports that file type.
Table 1. Supported file types for grounding documents that you add

| File type | Maximum total file size | In-memory | Elasticsearch | Milvus |
|-----------|-------------------------|-----------|---------------|--------|
| CSV       | 5 MB                    |           | ✓             | ✓      |
| DOCX      | 10 MB                   | ✓         | ✓             | ✓      |
| HTML      | 5 MB                    |           | ✓             | ✓      |
| JSON      | 5 MB                    |           | ✓             | ✓      |
| PDF       | 50 MB                   | ✓         | ✓             | ✓      |
| PPTX      | 300 MB                  | ✓         | ✓             | ✓      |
| TXT       | 5 MB                    | ✓         | ✓             | ✓      |
| XLSX      | 5 MB                    |           | ✓             | ✓      |
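If you script your uploads, you can pre-check files against these limits before you create the index. The helper below is a hypothetical illustration that encodes Table 1; it is not part of any watsonx API.

```python
# Hypothetical pre-upload check that encodes Table 1 above;
# not part of any watsonx API.
from pathlib import Path

# Maximum total size (MB) and supporting stores per file type, from Table 1.
LIMITS = {
    ".csv":  (5,   {"elasticsearch", "milvus"}),
    ".docx": (10,  {"in-memory", "elasticsearch", "milvus"}),
    ".html": (5,   {"elasticsearch", "milvus"}),
    ".json": (5,   {"elasticsearch", "milvus"}),
    ".pdf":  (50,  {"in-memory", "elasticsearch", "milvus"}),
    ".pptx": (300, {"in-memory", "elasticsearch", "milvus"}),
    ".txt":  (5,   {"in-memory", "elasticsearch", "milvus"}),
    ".xlsx": (5,   {"elasticsearch", "milvus"}),
}

def check_files(paths: list[str], store: str) -> list[str]:
    """Return a list of problems; empty means the upload should be accepted."""
    problems = []
    totals: dict[str, int] = {}
    for p in paths:
        ext = Path(p).suffix.lower()
        if ext not in LIMITS:
            problems.append(f"{p}: unsupported file type {ext}")
            continue
        _, stores = LIMITS[ext]
        if store not in stores:
            problems.append(f"{p}: {ext} is not supported by the {store} store")
        totals[ext] = totals.get(ext, 0) + Path(p).stat().st_size
    for ext, total in totals.items():
        max_mb, _ = LIMITS[ext]
        if total > max_mb * 1024 * 1024:
            problems.append(f"{ext}: total size exceeds the {max_mb} MB limit")
    return problems

# Example usage (paths are placeholders):
# print(check_files(["policies.pdf", "faq.csv"], store="in-memory"))
```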
Supported embedding models
When you upload grounding documents, an embedding model is used to calculate vectors that represent the document text. You can choose the embedding model to use.
For in-memory and Milvus data stores, the following embedding models are supported:
all-MiniLM-L6-v2
Requires a smaller chunk size than the IBM Slate embedding models.
all-MiniLM-L12-v2
Requires a smaller chunk size than the IBM Slate embedding models.
granite-embedding-107m-multilingual
Standard sentence transformer model based on bi-encoders and part of the IBM Granite Embeddings suite.
granite-embedding-278m-multilingual
Standard sentence transformer model based on bi-encoders and part of the IBM Granite Embeddings suite.
slate-30m-english-rtrvr
IBM model that is faster than the 125m version.
slate-125m-english-rtrvr
IBM model that is more precise than the 30m version.
slate-30m-english-rtrvr-v2
Latest version of the IBM model that is faster than the 125m version.
slate-125m-english-rtrvr-v2
Latest version of the IBM model that is more precise than the 30m version.
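Several of the models in this list are published on Hugging Face, so you can preview how they chunk and embed text locally with the sentence-transformers library. The chunk size below is an illustrative value, not the exact default that the vectorization settings use.

```python
# Preview embedding behavior locally with sentence-transformers.
# The model IDs are public Hugging Face names; the chunking value is
# illustrative, not the exact default of the vectorization settings.
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 has a short input window, which is why it needs a
# smaller chunk size than the IBM Slate models.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
# Multilingual alternative from the IBM Granite Embeddings suite:
# model = SentenceTransformer("ibm-granite/granite-embedding-107m-multilingual")

text = "Our refund policy allows returns within 30 days of purchase. " * 20
chunk_size = 200  # characters; keep chunks small for the MiniLM models
chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

vectors = model.encode(chunks)  # one 384-dimensional vector per chunk
print(vectors.shape)            # e.g. (n_chunks, 384)
```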
For the Elasticsearch data store, ELSER (Elastic Learned Sparse EncodeR) embedding models are supported. For more information, see ELSER – Elastic Learned Sparse EncodeR.
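As a rough sketch of what ELSER-based grounding looks like on the Elasticsearch side, the example below assumes an Elasticsearch 8.11+ cluster with the .elser_model_2 model already deployed; the index, pipeline, and field names are illustrative placeholders.

```python
# Sketch: ELSER ingestion and query in Elasticsearch 8.11+, assuming the
# .elser_model_2 model is already deployed. Index, pipeline, and field
# names are illustrative placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://<your-cluster>:9243", api_key="<api-key>")

# Index with a sparse_vector field to hold ELSER's weighted tokens.
es.indices.create(
    index="grounding-docs",
    mappings={"properties": {
        "text": {"type": "text"},
        "content_embedding": {"type": "sparse_vector"},
    }},
)

# Ingest pipeline that expands document text with ELSER at index time.
es.ingest.put_pipeline(
    id="elser-pipeline",
    processors=[{
        "inference": {
            "model_id": ".elser_model_2",
            "input_output": [
                {"input_field": "text", "output_field": "content_embedding"}
            ],
        }
    }],
)

# Query: expand the question with the same model and match sparse vectors.
resp = es.search(
    index="grounding-docs",
    query={"text_expansion": {"content_embedding": {
        "model_id": ".elser_model_2",
        "model_text": "How long do customers have to request a refund?",
    }}},
)
```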