Supported embedding models available with watsonx.ai

Use embedding models that are deployed in IBM watsonx.ai to help with semantic search and document comparison tasks.

Embedding models are encoder-only foundation models that create text embeddings. A text embedding encodes the meaning of a sentence or passage in an array of numbers known as a vector. For more information, see Text embedding generation.

The following embedding models are available in watsonx.ai:

  • all-minilm-l12-v2
  • multilingual-e5-large
  • slate-125m-english-rtrvr
  • slate-30m-english-rtrvr

For more information about generative foundation models, see Supported foundation models.

IBM embedding models

The following table lists the supported embedding models that IBM provides.

Table 1. IBM embedding models in watsonx.ai
Model name | API model_id | Billing class | Maximum input tokens | Number of dimensions | More information
slate-125m-english-rtrvr | ibm/slate-125m-english-rtrvr | Class C1 | 512 | 768 | Model card
slate-30m-english-rtrvr | ibm/slate-30m-english-rtrvr | Class C1 | 512 | 384 | Model card

Third-party embedding models

The following table lists the supported third-party embedding models.

Table 2. Supported third-party embedding models in watsonx.ai
Model name | API model_id | Provider | Billing class | Maximum input tokens | Number of dimensions | More information
all-minilm-l12-v2 | sentence-transformers/all-minilm-l12-v2 | Open source natural language processing (NLP) and computer vision (CV) community | Class C1 | 256 | 384 | Model card
multilingual-e5-large | intfloat/multilingual-e5-large | Microsoft | Class C1 | 512 | 1024 | Model card, Research paper
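
The limits listed in Table 1 and Table 2 can be kept in a small lookup for client-side checks before you submit text. The following is a minimal sketch, not part of any watsonx.ai library: the names EMBEDDING_MODELS and fits_model are hypothetical, and the token count is assumed to come from whatever tokenizer you already use.

```python
# Values are copied from Table 1 and Table 2 of this topic.
# The dictionary layout and the helper name are illustrative assumptions.
EMBEDDING_MODELS = {
    "ibm/slate-125m-english-rtrvr": {"max_input_tokens": 512, "dimensions": 768},
    "ibm/slate-30m-english-rtrvr": {"max_input_tokens": 512, "dimensions": 384},
    "sentence-transformers/all-minilm-l12-v2": {"max_input_tokens": 256, "dimensions": 384},
    "intfloat/multilingual-e5-large": {"max_input_tokens": 512, "dimensions": 1024},
}


def fits_model(model_id: str, token_count: int) -> bool:
    """Return True if an input of token_count tokens is within the model's limit."""
    return token_count <= EMBEDDING_MODELS[model_id]["max_input_tokens"]


if __name__ == "__main__":
    # A 300-token passage exceeds the 256-token limit of all-minilm-l12-v2.
    print(fits_model("sentence-transformers/all-minilm-l12-v2", 300))
    print(fits_model("ibm/slate-30m-english-rtrvr", 300))
```

A check like this only tells you whether an input fits. How to split longer text is a separate design decision.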

 

Embedding model details

You can use the watsonx.ai Python library or REST API to submit sentences or passages to one of the supported embedding models.
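
As a rough illustration of the REST route, the following sketch builds an embeddings request without sending it. Everything in it is an assumption rather than a quotation from the API reference: the /ml/v1/text/embeddings path, the model_id, project_id, and inputs body fields, the bearer-token header, and the helper name build_embedding_request. Check the watsonx.ai API reference for the exact endpoint and the current version date before you rely on it.

```python
import json
import urllib.request


def build_embedding_request(base_url, token, project_id, model_id, inputs, version):
    """Build (but do not send) a POST request for text embeddings.

    The URL path and body field names are assumptions for illustration.
    version is the API version date string required by the service.
    """
    url = f"{base_url}/ml/v1/text/embeddings?version={version}"
    body = {
        "model_id": model_id,      # for example, an API model_id from the tables
        "project_id": project_id,  # the project that is billed for the call
        "inputs": list(inputs),    # sentences or passages to embed
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values only; nothing is sent over the network here.
    req = build_embedding_request(
        base_url="https://example.invalid",
        token="YOUR_TOKEN",
        project_id="YOUR_PROJECT_ID",
        model_id="ibm/slate-30m-english-rtrvr",
        inputs=["What is a text embedding?"],
        version="YYYY-MM-DD",
    )
    print(req.get_method(), req.full_url)
```

To send the request, you would pass the returned object to urllib.request.urlopen with real credentials. The sketch stops short of that so it can be run safely as written.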

all-minilm-l12-v2

The all-minilm-l12-v2 embedding model is built by the open source natural language processing (NLP) and computer vision (CV) community and provided by Hugging Face. Use the model as a sentence and short paragraph encoder. Given an input text, it outputs a vector which captures the semantic information in the text.

Usage: Use the sentence vectors that are generated by the all-minilm-l12-v2 embedding model for tasks such as information retrieval, clustering, and sentence similarity detection.

Cost: Class C1. For pricing details, see Watson Machine Learning plans.

Number of dimensions: 384

Input token limits: 256

Supported natural languages: English

Fine-tuning information: This embedding model is a version of the pretrained MiniLM-L12-H384-uncased model from Microsoft that is fine-tuned on a dataset of more than 1 billion sentence pairs.

Model architecture: Encoder-only

License: Apache 2.0 license

multilingual-e5-large

The multilingual-e5-large embedding model is built by Microsoft and provided by Hugging Face.

The embedding model architecture has 24 layers that are used sequentially to process data.

Usage: Use this model to generate text embeddings for text in a language other than English. When you submit input to the model, follow these guidelines:

  • Prefix each query with query: and each passage with passage: for tasks such as passage or information retrieval.
  • Prefix the input text with query: for tasks such as semantic similarity, bitext mining, and paraphrase retrieval.
  • Prefix the input text with query: if you want to use embeddings as features, such as in linear probing classification or for clustering.
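
The prefix guidelines above can be applied with a small helper before text is submitted. This is a minimal sketch: the function name add_e5_prefix is hypothetical, and the single space after the colon is assumed from the convention shown on the model card.

```python
def add_e5_prefix(text: str, role: str = "query") -> str:
    """Prepend the role prefix that multilingual-e5-large expects.

    role is "query" for search queries and for symmetric tasks such as
    semantic similarity or clustering, and "passage" for documents that
    are retrieved in passage or information retrieval tasks.
    """
    if role not in ("query", "passage"):
        raise ValueError("role must be 'query' or 'passage'")
    return f"{role}: {text.strip()}"


if __name__ == "__main__":
    # Retrieval: the query and the passages get different prefixes.
    print(add_e5_prefix("¿Qué es un vector de texto?"))
    print(add_e5_prefix("Un vector de texto codifica el significado.", role="passage"))
    # Similarity or clustering: every input gets the query prefix.
    print(add_e5_prefix("Bonjour le monde"))
```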

Cost: Class C1. For pricing details, see Watson Machine Learning plans.

Number of dimensions: 1024

Input token limits: 512

Supported natural languages: Up to 100 languages. See the model card for details.

Fine-tuning information: This embedding model is a version of the XLM-RoBERTa model, which is a multilingual version of RoBERTa that is pretrained on 2.5TB of filtered CommonCrawl data. This embedding model was continually trained on a mixture of multilingual datasets.

Model architecture: Encoder-only

License: Microsoft Open Source Code of Conduct

slate-125m-english-rtrvr

The slate-125m-english-rtrvr foundation model is provided by IBM. It generates embeddings for various inputs such as queries, passages, or documents. The training objective is to maximize the cosine similarity between a query and a passage. This process yields two sentence embeddings, one that represents the query and one that represents the passage, which can then be compared by using cosine similarity.
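
To make the comparison concrete, the following sketch computes cosine similarity between a query vector and several passage vectors and ranks the passages by score. The three-dimensional vectors are made-up toy values, not model output (real embeddings from this model have 768 dimensions), and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs, most similar passage first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy vectors standing in for embeddings; values are invented.
    query = [0.9, 0.1, 0.0]
    passages = [
        [0.0, 1.0, 0.0],  # points in a different direction from the query
        [0.8, 0.2, 0.1],  # points in nearly the same direction as the query
    ]
    for index, score in rank_passages(query, passages):
        print(index, round(score, 3))
```

Because cosine similarity depends only on the angle between vectors, the ranking is unaffected by vector length.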

Usage: This model is two to three times slower than the slate-30m-english-rtrvr model, but performs slightly better.

Cost: Class C1. For pricing details, see Watson Machine Learning plans.

Number of dimensions: 768

Input token limits: 512

Supported natural languages: English

Fine-tuning information: This version of the model was fine-tuned to be better at sentence retrieval-based tasks.

Model architecture: Encoder-only

License: Terms of use

slate-30m-english-rtrvr

The slate-30m-english-rtrvr foundation model is a distilled version of the slate-125m-english-rtrvr model. Both models are provided by IBM. The slate-30m-english-rtrvr embedding model is trained to maximize the cosine similarity between two text inputs so that their embeddings can later be compared by similarity.

The embedding model architecture has 6 layers that are used sequentially to process data.

Usage: This model is two to three times faster than the slate-125m-english-rtrvr model, but has slightly lower performance scores.

Cost: Class C1. For pricing details, see Watson Machine Learning plans.

Try it out: Using vectorized text with retrieval-augmented generation tasks

Number of dimensions: 384

Input token limits: 512

Supported natural languages: English

Fine-tuning information: This version of the model was fine-tuned to be better at sentence retrieval-based tasks.

Model architecture: Encoder-only

License: Terms of use
