You can use the following types of vector stores to index your grounding documents:
- In memory
- Elasticsearch
- watsonx.data Milvus
When you choose an in-memory vector store, the index is created for you automatically; you don't need to set up the vector store.
For factors to consider when you choose a vector store, see Adding vectorized documents for grounding foundation model prompts.
If you choose to use a third-party vector store, you must set up a connection to the data store before you create the vector index. For more information, see the setup procedure for the vector store that you want to use: Setting up an Elasticsearch vector store or Setting up a watsonx.data Milvus vector store.
Procedure
To create a vector index for your grounding documents, complete the following steps. The order of the steps might differ slightly based on the vector store you choose to use.
- From the project overview, click the Assets tab, and then choose New asset > Ground gen AI with vectorized documents.
  Alternatively, you can start from the Prompt Lab in chat mode by clicking the Grounding with documents icon at the start of the page, and then clicking Select or create vector index.
- Choose the vector store that you want to use.
- Name the vector index asset.
- Add grounding documents in one of the following ways:
  - Add files from a data asset that is associated with your project
  - Browse to upload files from your file system

  The following options are available for third-party vector stores only:

  - Add existing content from a connected vector store: Select the connected data source, choose a database if applicable, and then click Next. Choose the index or collection that you want to use.
  - Add new content to a connected vector index: Select the connected data source, choose a database if applicable, and then click Next. Click New index or New collection, specify a name, and then add documents by uploading files or connecting to a data asset.
The supported file types differ by vector store. For more information, see Supported grounding document file types.
- Optional: If applicable, choose the embedding model or vectorization settings that you want to use to vectorize your documents. For more information, see Embedding model and vectorization settings.
- For connected data stores only: Map fields from your existing index or collection to new fields that are defined in the vector index asset in watsonx.ai.
  These field mappings are important because watsonx.ai needs a consistent way to extract text, and to capture details such as the original file name and page number, from the various supported vector stores. For an illustration of a typical mapping, see the sketch after this procedure.
  Table 1. Vector store schema fields

  | New vector index field name | Field from connected vector store |
  | --- | --- |
  | Vector query | Required for Elasticsearch indexes only. Field where the query text that is used to search the Elasticsearch index is specified, such as `ml` or `vector`. |
  | Document name | Field that identifies the source file. You can choose a field that captures the file name, such as `metadata.source`, or the document title, such as `metadata.title`. |
  | Text | Field that contains the bulk of the page content, such as `body` or `text`. |
  | Page number | Field that identifies the page number, such as `metadata.page_number`. |
  | Document url | Field that contains the URL for the document, such as `metadata.document_url`. |

- Click Create.
The text in the file is vectorized and the vectors are indexed and stored in a new vector index asset.
When you add new content to a connected third-party data store, the following things happen:
- A notebook asset is generated that runs in a job to vectorize the documents and build the index or collection in the third-party data store.
- A vector index asset is created in watsonx.ai that can pass submitted queries to the index or collection in the third-party data store and get search results.
After the vector index asset is created, test how well the vectorized documents can answer questions, and make any necessary adjustments. See Managing a vector index.
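To make the field mapping and the generated notebook's job concrete, the following Python sketch shows the same idea in miniature: vectorize document chunks with a watsonx.ai embedding model, then index them into Elasticsearch with fields that line up with the Table 1 schema. This is a minimal sketch, not the generated notebook's exact code; the model ID, endpoint, index name, and field layout (vector, text, metadata.source, metadata.page_number) are illustrative assumptions.

```python
from elasticsearch import Elasticsearch
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import Embeddings

# One-time setup: watsonx.ai embedding client and Elasticsearch client.
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="YOUR_API_KEY")
embedder = Embeddings(
    model_id="ibm/slate-30m-english-rtrvr",  # assumed choice of embedding model
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
)
es = Elasticsearch("https://your-elasticsearch-host:9200", api_key="YOUR_ES_API_KEY")

# Document chunks with the metadata that the Table 1 mapping expects.
chunks = [
    {"text": "First chunk of the grounding document ...", "source": "guide.pdf", "page": 1},
    {"text": "Second chunk of the grounding document ...", "source": "guide.pdf", "page": 2},
]

# Vectorize all chunk texts in one call.
vectors = embedder.embed_documents(texts=[c["text"] for c in chunks])

for chunk, vector in zip(chunks, vectors):
    es.index(
        index="grounding-docs",  # assumed index name
        document={
            "vector": vector,  # maps to the Vector query field
            "text": chunk["text"],  # maps to the Text field
            "metadata": {
                "source": chunk["source"],  # maps to the Document name field
                "page_number": chunk["page"],  # maps to the Page number field
            },
        },
    )
```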
Embedding model and vectorization settings
The following settings control how documents are broken into smaller segments, or chunks, before they are sent to the embedding model:
- Text chunk size: Number of characters to include per document segment.
  Define a segment size that is smaller than the maximum number of input tokens that the embedding model allows. If you break the document into larger segments, some document text might be omitted: after the maximum token limit is reached, the embedding model ignores any extra characters in the segment.
  The chunk size is specified in characters. The number of characters per token varies by embedding model, but one token is equal to approximately 2-3 characters.

  Table 2. Embedding model chunk sizes

  | Embedding model | Maximum input tokens | Approximate chunk size |
  | --- | --- | --- |
  | all-MiniLM-L6-v2 | 256 | 700 |
  | ELSER | 512 | 1400 |
  | slate-30m-english-rtrvr | 512 | 1400 |
  | slate-125m-english-rtrvr | 512 | 1400 |
- Text chunk overlap: Number of characters to repeat in each of two consecutive document segments.
  Repeating text creates a buffer between document segments that helps to capture complete sentences and prevents text from being missed altogether. For a minimal illustration of chunk size and overlap, see the sketch after this list.
- Split PDF pages: When enabled, breaks a PDF file into one segment per page and includes the source page number in the answer. The page numbers that are shown are PDF viewer page numbers.
Note: This option is available only when you add a PDF file.
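The following Python sketch shows what character-based chunking with overlap looks like in principle. The 1400-character default assumes a 512-token model at roughly 2.7 characters per token, in line with Table 2; the function name and parameters are illustrative, not the product's internal implementation.

```python
def chunk_text(text: str, chunk_size: int = 1400, overlap: int = 200) -> list[str]:
    """Split text into segments of chunk_size characters; consecutive
    segments repeat the last `overlap` characters of the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # advance by chunk_size minus the shared buffer
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


document = "Lorem ipsum dolor sit amet. " * 500  # stand-in for extracted document text
chunks = chunk_text(document, chunk_size=1400, overlap=200)
print(len(chunks), "chunks;", len(chunks[0]), "characters in the first chunk")
```

A larger overlap reduces the chance that a sentence is cut in half at a segment boundary, at the cost of indexing some text twice.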
Use the vector index with your foundation model prompt
When the vector index is ready for use, associate this vector index asset with a foundation model prompt in one of the following ways:
- From the vector index asset page, click the View vector index info icon at the start of the page to open the About this asset panel, and then click Open in Prompt Lab.
- From the Prompt Lab in chat mode, click the Grounding with documents icon at the start of the page, and then click Select or create vector index.
Use this prompt pattern in your application
After you experiment with retrieval-augmented generation (RAG) patterns that use your document set, save the prompt logic in a notebook so that you can use it in your generative AI application.
When you save the prompt as a notebook, select the Deployable gen AI flow option. The notebook that is generated provides Python code for the prompt template and a deployable Python function that can be consumed by REST APIs.
For more information, see Saving your work.
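As a rough illustration, deployable Python functions in watsonx.ai follow a closure pattern: an outer function performs one-time setup and returns an inner score() function that the REST scoring endpoint calls for each request. The sketch below is an assumption-laden outline with placeholder retrieval and generation steps, not the generated notebook's actual code.

```python
def deployable_rag_function():
    # One-time setup (clients, credentials, vector index handles) belongs
    # here, inside the closure, so it runs once at deployment time.

    def score(payload: dict) -> dict:
        # Scoring payloads follow the input_data convention, for example:
        # {"input_data": [{"values": [["What is the refund policy?"]]}]}
        question = payload["input_data"][0]["values"][0][0]

        # Placeholder steps (assumptions): search the vector index for
        # relevant chunks, then prompt the foundation model with them.
        context = "...chunks retrieved from the vector index..."
        answer = f"Answer to {question!r}, grounded in: {context}"

        return {"predictions": [{"fields": ["answer"], "values": [[answer]]}]}

    return score


# Local smoke test before deploying:
scorer = deployable_rag_function()
print(scorer({"input_data": [{"values": [["What is the refund policy?"]]}]}))
```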
Learn more
- Setting up an Elasticsearch vector store
- Setting up a watsonx.data Milvus vector store
- Prompt Lab
- Chatting with documents and images
- Retrieval-augmented generation (RAG)
- Vectorizing text by using the API
Parent topic: Adding vectorized documents for grounding foundation model prompts