Use the embedding models and embeddings API that are available from watsonx.ai to create text embeddings that capture the meaning of sentences or passages for use in your generative AI applications.
Ways to develop
You can vectorize text, that is, convert text into numerical representations called embeddings, programmatically. The examples in this topic use the watsonx.ai REST API.
Alternatively, you can use graphical tools from the watsonx.ai UI to vectorize documents as part of a chat workflow or to create vector indexes.
Converting text into text embeddings, or vectorizing text, helps with document comparison, question answering, and retrieval-augmented generation (RAG) tasks, where you need to retrieve relevant content quickly.
To find out which embedding models are available for use programmatically, use the List the available foundation models method in the watsonx.ai as a service API. Specify the filters=function_embedding parameter to return only the available embedding models.
curl -X GET \
'https://{region}.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2024-07-25&filters=function_embedding'
The following code snippet uses the slate-30m-english-rtrvr model to convert these two lines of text into text embeddings:
A foundation model is a large-scale generative AI model that can be adapted to a wide range of downstream tasks.
Generative AI is a class of AI algorithms that can produce various types of content including text, source code, imagery, audio, and synthetic data.
In this example, two lines of text are submitted for conversion. You can specify up to 1,000 lines. Each line that you submit must conform to the maximum input token limit that is defined by the embedding model. To address cases where a line might be longer, the truncate_input_tokens parameter is specified to force the line to be truncated. Otherwise, the request might fail. The input_text return option is included so that the original text is added to the response, making it easier to pair the original text with each set of embedding values.
You specify the embedding model that you want to use as the model_id in the payload for the embedding method.
REST API request example
curl -X POST \
  'https://{region}.ml.cloud.ibm.com/ml/v1/text/embeddings?version=2024-05-02' \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer eyJraWQiOi...' \
  --data-raw '{
    "inputs": [
      "A foundation model is a large-scale generative AI model that can be adapted to a wide range of downstream tasks.",
      "Generative AI is a class of AI algorithms that can produce various types of content including text, source code, imagery, audio, and synthetic data."
    ],
    "parameters": {
      "truncate_input_tokens": 128,
      "return_options": {
        "input_text": true
      }
    },
    "model_id": "ibm/slate-30m-english-rtrvr",
    "project_id": "81966e98-c691-48a2-9bcc-e637a84db410"
  }'
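You can make the same request from Python. The following is a minimal sketch that assumes the ibm_watsonx_ai Python library and its Embeddings class and embed_documents method; check the class and method signatures, the endpoint URL, and the credential values against the library version and region that you use.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import Embeddings

# Replace the URL with your region's endpoint and supply your own
# API key and project ID.
credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="YOUR_IBM_CLOUD_API_KEY",
)

embedding = Embeddings(
    model_id="ibm/slate-30m-english-rtrvr",
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
    params={
        "truncate_input_tokens": 128,            # truncate long lines instead of failing
        "return_options": {"input_text": True},  # echo the original text in the response
    },
)

texts = [
    "A foundation model is a large-scale generative AI model that can be adapted to a wide range of downstream tasks.",
    "Generative AI is a class of AI algorithms that can produce various types of content including text, source code, imagery, audio, and synthetic data.",
]

# embed_documents returns one embedding (a list of floats) per input line
vectors = embedding.embed_documents(texts=texts)
print(len(vectors), len(vectors[0]))  # 2 vectors of 384 values each for slate-30m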
REST API response example
The response looks something like this, where the 384 values in each embedding are reduced to 6 values to improve the readability of the example:
{
  "model_id": "ibm/slate-30m-english-rtrvr",
  "created_at": "2024-05-02T16:21:56.771Z",
  "results": [
    {
      "embedding": [
        -0.023104044,
        0.05364946,
        0.062400896,
        ...
        0.008527246,
        -0.08910927,
        0.048190728
      ],
      "input": "A foundation model is a large-scale generative AI model that can be adapted to a wide range of downstream tasks."
    },
    {
      "embedding": [
        -0.024285838,
        0.03582272,
        0.008893765,
        ...
        0.0148864435,
        -0.051656704,
        0.012944954
      ],
      "input": "Generative AI is a class of AI algorithms that can produce various types of content including text, source code, imagery, audio, and synthetic data."
    }
  ],
  "input_token_count": 57
}
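To illustrate how the returned embeddings can support document comparison, the following Python sketch parses a response like the one shown above and computes the cosine similarity between the two vectors. The response variable is an assumption here; it stands for the parsed JSON body of the embeddings request, however you obtain it.
import math

# `response` is assumed to hold the parsed JSON body shown above,
# for example: response = requests.post(...).json()
vectors = [result["embedding"] for result in response["results"]]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Higher values mean the two passages are more semantically similar.
print(cosine_similarity(vectors[0], vectors[1]))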