You can use the following types of vector stores to index your grounding documents:
- In memory
- Elasticsearch
- watsonx.data Milvus
When you choose an in-memory vector store, the index is created for you automatically; you don't need to set up the vector store.
For factors to consider when you choose a vector store, see Adding vectorized documents for grounding foundation model prompts.
If you choose to use a third-party vector store, you must set up a connection to the data store before you create the vector index. For more information, see the setup procedure for the vector store that you want to use: Setting up an Elasticsearch vector store or Setting up a watsonx.data Milvus vector store.
Procedure
To create a vector index for your grounding documents, complete the following steps. The order of the steps might differ slightly based on the vector store you choose to use.
- From the project overview, click the Assets tab, and then choose New asset > Ground gen AI with vectorized documents.
  Alternatively, you can start from the Prompt Lab in chat mode by clicking the Grounding with documents icon at the start of the page, and then clicking Select or create vector index.
- Choose the vector store that you want to use.
- Name the vector index asset.
- Add grounding documents in one of the following ways:
  - Add files from a data asset that is associated with your project
  - Browse to upload files from your file system

  The following options are available for third-party vector stores only:

  - Add existing content from a connected vector store: Select the connected data source, choose a database if applicable, and then click Next. Choose the index or collection that you want to use.
  - Add new content to a connected vector index: Select the connected data source, choose a database if applicable, and then click Next. Click New index or New collection, specify a name, and then add documents by uploading files or connecting to a data asset.
The supported file types differ by vector store. For more information, see Supported grounding document file types.
- Optional: If applicable, choose the embedding model or vectorization settings that you want to use to vectorize your documents. For more information, see Embedding model and vectorization settings.
- For connected data stores only: Map fields from your existing index or collection to new fields that are defined in the vector index asset in watsonx.ai.
  These field mappings are important because watsonx.ai needs a consistent way to extract text, and to capture details such as the original file name and page number, from the various supported vector stores. For an illustration of a typical mapping, see the sketch after this procedure.
  Table 1. Vector store schema fields

  | New vector index field name | Field from connected vector store |
  | --- | --- |
  | Vector query | Required for Elasticsearch indexes only. Field where the query text that is used to search the Elasticsearch index is specified, such as `ml` or `vector`. |
  | Document name | Field that identifies the source file. You can choose a field that captures the file name, such as `metadata.source`, or the document title, such as `metadata.title`. |
  | Text | Field that contains the bulk of the page content, such as `body` or `text`. |
  | Page number | Field that identifies the page number, such as `metadata.page_number`. |
  | Document url | Field that contains the URL for the document, such as `metadata.document_url`. |

- Click Create.
The text in the file is vectorized and the vectors are indexed and stored in a new vector index asset.
When you add new content to a connected third-party data store, the following things happen:
- A notebook asset is generated that runs in a job to vectorize the documents and build the index or collection in the third-party data store.
- A vector index asset is created in watsonx.ai that can pass submitted queries to the index or collection in the third-party data store and get search results.
After the vector index asset is created, test how well the vectorized documents can answer questions, and make any necessary adjustments. See Managing a vector index.
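To make the field mapping and the generated notebook's job concrete, the following Python sketch shows the same idea in miniature: vectorize document chunks with a watsonx.ai embedding model, then index them into Elasticsearch with fields that line up with the Table 1 schema. This is a minimal sketch, not the generated notebook's exact code; the model ID, endpoint, index name, and field layout (vector, text, metadata.source, metadata.page_number) are illustrative assumptions.

```python
from elasticsearch import Elasticsearch
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import Embeddings

# One-time setup: watsonx.ai embedding client and Elasticsearch client.
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="YOUR_API_KEY")
embedder = Embeddings(
    model_id="ibm/slate-30m-english-rtrvr",  # assumed choice of embedding model
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
)
es = Elasticsearch("https://your-elasticsearch-host:9200", api_key="YOUR_ES_API_KEY")

# Document chunks with the metadata that the Table 1 mapping expects.
chunks = [
    {"text": "First chunk of the grounding document ...", "source": "guide.pdf", "page": 1},
    {"text": "Second chunk of the grounding document ...", "source": "guide.pdf", "page": 2},
]

# Vectorize all chunk texts in one call.
vectors = embedder.embed_documents(texts=[c["text"] for c in chunks])

for chunk, vector in zip(chunks, vectors):
    es.index(
        index="grounding-docs",  # assumed index name
        document={
            "vector": vector,  # maps to the Vector query field
            "text": chunk["text"],  # maps to the Text field
            "metadata": {
                "source": chunk["source"],  # maps to the Document name field
                "page_number": chunk["page"],  # maps to the Page number field
            },
        },
    )
```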
Embedding model and vectorization settings
The following settings control how documents are broken into smaller segments, or chunks, before they are sent to the embedding model:
- Text chunk size: Number of characters to include per document segment.
  Define a segment size that is smaller than the maximum number of input tokens that the embedding model allows. If you break the document into larger segments, some document text might be omitted: after the maximum token limit is reached, the embedding model ignores any extra characters in the segment.
  The chunk size is specified in characters. The number of characters per token varies by embedding model, but one token is equal to approximately 2-3 characters.

  Table 2. Embedding model chunk sizes

  | Embedding model | Maximum input tokens | Approximate chunk size |
  | --- | --- | --- |
  | all-MiniLM-L6-v2 | 256 | 700 |
  | ELSER | 512 | 1400 |
  | slate-30m-english-rtrvr | 512 | 1400 |
  | slate-125m-english-rtrvr | 512 | 1400 |
- Text chunk overlap: Number of characters to repeat in each of two consecutive document segments.
  Repeating text creates a buffer between document segments that helps to capture complete sentences and prevents text from being missed altogether. For a minimal illustration of chunk size and overlap, see the sketch after this list.
- Split PDF pages: When enabled, breaks a PDF file into one segment per page and includes the source page number in the answer. The page numbers that are shown are PDF viewer page numbers.
Note: This option is available only when you add a PDF file.
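The following Python sketch shows what character-based chunking with overlap looks like in principle. The 1400-character default assumes a 512-token model at roughly 2.7 characters per token, in line with Table 2; the function name and parameters are illustrative, not the product's internal implementation.

```python
def chunk_text(text: str, chunk_size: int = 1400, overlap: int = 200) -> list[str]:
    """Split text into segments of chunk_size characters; consecutive
    segments repeat the last `overlap` characters of the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # advance by chunk_size minus the shared buffer
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


document = "Lorem ipsum dolor sit amet. " * 500  # stand-in for extracted document text
chunks = chunk_text(document, chunk_size=1400, overlap=200)
print(len(chunks), "chunks;", len(chunks[0]), "characters in the first chunk")
```

A larger overlap reduces the chance that a sentence is cut in half at a segment boundary, at the cost of indexing some text twice.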
Use the vector index with your foundation model prompt
When the vector index is ready for use, associate this vector index asset with a foundation model prompt in one of the following ways:
- From the vector index asset page, click the View vector index info icon at the start of the page to open the About this asset panel, and then click Open in Prompt Lab.
- From the Prompt Lab in chat mode, click the Grounding with documents icon at the start of the page, and then click Select or create vector index.
Use this prompt pattern in your application
After you experiment with retrieval-augmented generation (RAG) patterns that use your document set, save the prompt logic in a notebook so that you can use it in your generative AI application.
When you save the prompt as a notebook, select the Deployable gen AI flow option. The notebook that is generated provides Python code for the prompt template and a deployable Python function that can be consumed by REST APIs.
For more information, see Saving your work.
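As a rough illustration, deployable Python functions in watsonx.ai follow a closure pattern: an outer function performs one-time setup and returns an inner score() function that the REST scoring endpoint calls for each request. The sketch below is an assumption-laden outline with placeholder retrieval and generation steps, not the generated notebook's actual code.

```python
def deployable_rag_function():
    # One-time setup (clients, credentials, vector index handles) belongs
    # here, inside the closure, so it runs once at deployment time.

    def score(payload: dict) -> dict:
        # Scoring payloads follow the input_data convention, for example:
        # {"input_data": [{"values": [["What is the refund policy?"]]}]}
        question = payload["input_data"][0]["values"][0][0]

        # Placeholder steps (assumptions): search the vector index for
        # relevant chunks, then prompt the foundation model with them.
        context = "...chunks retrieved from the vector index..."
        answer = f"Answer to {question!r}, grounded in: {context}"

        return {"predictions": [{"fields": ["answer"], "values": [[answer]]}]}

    return score


# Local smoke test before deploying:
scorer = deployable_rag_function()
print(scorer({"input_data": [{"values": [["What is the refund policy?"]]}]}))
```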
Learn more
- Setting up an Elasticsearch vector store
- Setting up a watsonx.data Milvus vector store
- Prompt Lab
- Chatting with documents and images
- Retrieval-augmented generation (RAG)
- Vectorizing text by using the API
Parent topic: Adding vectorized documents for grounding foundation model prompts