Converting text to text embeddings
Last updated: Nov 27, 2024

Use the watsonx.ai text embeddings API and the available embedding models to generate text embeddings.

Use functions that are available in the watsonx.ai Python library from a notebook in watsonx.ai to convert text to text embeddings.

The following code snippet illustrates how to use the slate-30m-english-rtrvr model to convert the following two lines of text into text embeddings:

  • A foundation model is a large-scale generative AI model that can be adapted to a wide range of downstream tasks.
  • Generative AI is a class of AI algorithms that can produce various types of content including text, source code, imagery, audio, and synthetic data.

from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models import Embeddings
from ibm_watsonx_ai.metanames import EmbedTextParamsMetaNames as EmbedParams

my_credentials = {
  "url": "https://{region}.ml.cloud.ibm.com",
  "apikey": "{my-IBM-Cloud-API-key}",
}

client = APIClient(my_credentials)

model_id = client.foundation_models.EmbeddingModels.SLATE_30M_ENGLISH_RTRVR
gen_parms = None
project_id = "{my-project-ID}"
space_id = None
verify = False

# Set truncate_input_tokens to a value that is equal to or less than the
# maximum number of tokens allowed by the embedding model that you are using.
# If you don't specify this value and the input has more tokens than the
# model can process, an error is generated.

embed_params = {
  EmbedParams.TRUNCATE_INPUT_TOKENS: 128,
  EmbedParams.RETURN_OPTIONS: {
    'input_text': True
  }
}

embedding = Embeddings(
  model_id=model_id,
  credentials=my_credentials,
  params=embed_params,
  project_id=project_id,
  space_id=space_id,
  verify=verify
)

q = [
  "A foundation model is a large-scale generative AI model that can be adapted to a wide range of downstream tasks.",
  "Generative AI is a class of AI algorithms that can produce various types of content including text, source code, imagery, audio, and synthetic data."
]

embedding_vectors = embedding.embed_documents(texts=q)

print(embedding_vectors)

Replace {region}, {my-IBM-Cloud-API-key}, and {my-project-ID} with valid values for your environment.
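
Rather than pasting the placeholder values directly into a notebook, you might read them from environment variables. The following is a minimal sketch of that approach, not part of the watsonx.ai library: the load_settings helper and the variable names WATSONX_REGION, WATSONX_APIKEY, and WATSONX_PROJECT_ID are hypothetical and chosen only for illustration.

```python
import os

# Hypothetical variable names; use whatever names fit your environment.
REQUIRED_VARS = ("WATSONX_REGION", "WATSONX_APIKEY", "WATSONX_PROJECT_ID")


def load_settings(env=None):
    """Return (credentials, project_id) built from environment variables.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    credentials = {
        # Same URL pattern as the snippet above, with {region} filled in.
        "url": f"https://{env['WATSONX_REGION']}.ml.cloud.ibm.com",
        "apikey": env["WATSONX_APIKEY"],
    }
    return credentials, env["WATSONX_PROJECT_ID"]
```

Under these assumptions, the returned dictionary has the same shape as my_credentials, so the two values can be unpacked as my_credentials, project_id = load_settings() before the Embeddings object is created.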

Sample output

[
   [-0.0053823674,-0.018807093,0.009131943, ...-0.010469643,0.0010533642,0.020114796], 
   [-0.04075534,-0.041552857,0.04326911, ...0.017616473,-0.010064489,0.020788372]
]
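
The output is a list with one vector per input text, and each vector is a list of floating-point numbers. A common next step is to compare two vectors to estimate how similar their texts are. The following is a minimal sketch that uses cosine similarity for the comparison; the cosine_similarity helper is hypothetical and is not provided by the watsonx.ai library, and the short vectors are toy values rather than real model output.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors for illustration only; real embeddings have many more dimensions.
toy_vectors = [
    [0.1, 0.3, -0.2],
    [0.2, 0.6, -0.4],
]
score = cosine_similarity(toy_vectors[0], toy_vectors[1])
print(score)
```

With real output, the same call would be cosine_similarity(embedding_vectors[0], embedding_vectors[1]). Values closer to 1 indicate vectors that point in more similar directions.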
