Use embedding models to create text embeddings that capture the meaning of a sentence or a passage. You can use these models with classifiers such as support vector machines. Embedding models can also help you with retrieval-augmented generation tasks.
The following diagram illustrates the retrieval-augmented generation pattern with embedding support.
The retrieval-augmented generation pattern with embedding support involves the following steps:
- Convert your content into text embeddings and store them in a vector data store.
- Use the same embedding model to convert the user input into text embeddings.
- Run a similarity or semantic search in your knowledge base for content that is related to a user's question.
- Pull the most relevant search results into your prompt as context and add an instruction, such as “Answer the following question by using only information from the following passages.”
- Send the combined prompt text (instruction + search results + question) to the foundation model.
- The foundation model uses contextual information from the prompt to generate a factual answer.
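The retrieval steps above can be sketched in a few lines of Python. This is a minimal illustration only: `toy_embed` is a hypothetical stand-in for a real embedding model (it counts words from a small vocabulary), the in-memory `store` list stands in for a vector data store, and the final call to the foundation model is left out.

```python
import math

passages = [
    "Embedding models convert text into numeric vectors.",
    "Support vector machines are one kind of classifier.",
    "A vector data store indexes embeddings for similarity search.",
]

def tokenize(text):
    # Lowercase, split on whitespace, and strip simple punctuation.
    return [w.strip(".,?!") for w in text.lower().split() if w.strip(".,?!")]

# Hypothetical vocabulary built from the stored content.
vocab = sorted({tok for p in passages for tok in tokenize(p)})
index = {tok: i for i, tok in enumerate(vocab)}

def toy_embed(text):
    """Stand-in for an embedding model: normalized word counts over `vocab`."""
    vec = [0.0] * len(vocab)
    for tok in tokenize(text):
        if tok in index:
            vec[index[tok]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def similarity(a, b):
    # Dot product of unit-length vectors equals their cosine similarity.
    return sum(x * y for x, y in zip(a, b))

# Step 1: embed the content and keep it in a (toy) vector store.
store = [(p, toy_embed(p)) for p in passages]

def retrieve(question, k=2):
    # Steps 2 and 3: embed the question with the same model, then rank by similarity.
    q = toy_embed(question)
    ranked = sorted(store, key=lambda item: similarity(q, item[1]), reverse=True)
    return [p for p, _ in ranked[:k]]

def build_prompt(question, k=2):
    # Steps 4 and 5: instruction + search results + question.
    instruction = ("Answer the following question by using only information "
                   "from the following passages.")
    context = "\n".join(f"- {p}" for p in retrieve(question, k))
    return f"{instruction}\n{context}\nQuestion: {question}"

print(build_prompt("What does a vector data store index?"))
```

In a real pipeline, `toy_embed` would be replaced by calls to an embedding model, and the returned prompt text would be sent to the foundation model.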
USE embeddings
USE embeddings are wrappers around Google Universal Sentence Encoder embeddings that are available in TFHub. These embeddings are used in the document classification SVM algorithm. For a list of pretrained USE embeddings and their supported languages, see the table of pretrained USE embeddings that follows the code sample.
When using USE embeddings, consider the following:
- Choose embedding_use_en_stock if your task involves English text.
- Choose one of the multilingual USE embeddings if your task involves text in a non-English language, or if you want to train multilingual models.
- The USE embeddings exhibit different trade-offs between the quality of the trained model and throughput at inference time. Try different embeddings to find the trade-off between result quality and inference throughput that is appropriate for your use case:
  - embedding_use_multi_small has reasonable quality and is fast at inference time.
  - embedding_use_en_stock is an English-only version of embedding_use_multi_small, so it is smaller and has higher inference throughput.
  - embedding_use_multi_large is based on the Transformer architecture, so it provides higher-quality results with lower throughput at inference time.
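One way to compare inference throughput across embeddings is to time the same batch of texts through each model. The helper below is a rough sketch, not part of the library: `measure_throughput` and `dummy_embed` are hypothetical names, and `dummy_embed` only stands in for a real model call so that the example runs on its own.

```python
import time

def measure_throughput(embed_fn, texts, repeats=3):
    """Return texts processed per second, using the fastest of `repeats` passes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for text in texts:
            embed_fn(text)
        best = min(best, time.perf_counter() - start)
    return len(texts) / best if best > 0 else float("inf")

def dummy_embed(text):
    # Hypothetical placeholder for a real embedding call.
    return [float(len(text))]

texts = ["python", "embedding models", "inference throughput"] * 100
rate = measure_throughput(dummy_embed, texts)
print(f"{rate:.0f} texts/second")
```

To benchmark a real model, pass a function that wraps the two calls from the code sample, for example `lambda text: embeddings_model.run(syntax_model.run(text))`, and repeat the measurement for each embedding that you are considering.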
Code sample
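import watson_nlp
# Load the English syntax model and the English USE embedding model
syntax_model = watson_nlp.load("syntax_izumo_en_stock")
embeddings_model = watson_nlp.load("embedding_use_en_stock")
text = "python"
# Run syntax analysis first; the embedding model takes the syntax result as input
syntax_doc = syntax_model.run(text)
embedding = embeddings_model.run(syntax_doc)
print(embedding)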
Output of the code sample:
{
"data": {
"data": [
-0.01909315399825573,
-0.009827353060245514,
...
0.008978910744190216,
-0.0702751949429512
],
"rows": 1,
"cols": 512,
"dtype": "float32"
},
"offsets": null,
"producer_id": null
}
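The printed result stores the embedding values as one flat list together with `rows` and `cols`. The sketch below shows one way to split such a structure into per-row vectors. It assumes that the output is available as JSON text and that the values are laid out row by row; neither assumption is confirmed here, and `to_rows` and the small `sample_output` are hypothetical, used only for illustration.

```python
import json

# Hypothetical, shortened output with the same shape as the printed result.
sample_output = """
{
  "data": {
    "data": [0.1, -0.2, 0.3, 0.4, 0.5, -0.6],
    "rows": 2,
    "cols": 3,
    "dtype": "float32"
  },
  "offsets": null,
  "producer_id": null
}
"""

def to_rows(output):
    """Split the flat value list into `rows` vectors of length `cols`."""
    block = output["data"]
    flat, rows, cols = block["data"], block["rows"], block["cols"]
    if len(flat) != rows * cols:
        raise ValueError(f"expected {rows * cols} values, got {len(flat)}")
    # Assumption: values are stored row by row (row-major order).
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

vectors = to_rows(json.loads(sample_output))
print(len(vectors), "vector(s) of length", len(vectors[0]))
```

For the output shown above, the same function would return a single vector of length 512.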
The following table lists the pretrained blocks for USE embeddings that are available and the languages that are supported. For a list of the language codes and the corresponding language, see Language codes.
Block name | Model name | Supported languages |
---|---|---|
use | embedding_use_en_stock | English only |
use | embedding_use_multi_small | ar, de, el, en, es, fr, it, ja, ko, nb, nl, pl, pt, ru, th, tr, zh_tw, zh |
use | embedding_use_multi_large | ar, de, el, en, es, fr, it, ja, ko, nb, nl, pl, pt, ru, th, tr, zh_tw, zh |
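The selection guidance and the table can be combined into a small lookup helper. This is an illustrative sketch: `choose_use_embedding` and its `prefer_quality` flag are hypothetical names, and the rule it encodes is only the guidance from this section (English-only tasks use the English model; anything else uses a multilingual model, with the large variant when quality matters more than throughput).

```python
# Language codes supported by the multilingual USE embeddings, from the table.
MULTI_LANGUAGES = {
    "ar", "de", "el", "en", "es", "fr", "it", "ja", "ko",
    "nb", "nl", "pl", "pt", "ru", "th", "tr", "zh_tw", "zh",
}

def choose_use_embedding(languages, prefer_quality=False):
    """Pick a USE embedding model name for the given language codes."""
    langs = set(languages)
    unsupported = langs - MULTI_LANGUAGES
    if not langs or unsupported:
        raise ValueError(f"unsupported or missing languages: {sorted(unsupported)}")
    if langs == {"en"}:
        # English-only task: use the smaller, faster English model.
        return "embedding_use_en_stock"
    # Non-English or multilingual task: trade throughput for quality on request.
    if prefer_quality:
        return "embedding_use_multi_large"
    return "embedding_use_multi_small"

print(choose_use_embedding(["en"]))
print(choose_use_embedding(["ja"]))
print(choose_use_embedding(["en", "de"], prefer_quality=True))
```

The returned name can then be passed to `watson_nlp.load`, as in the code sample.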
GloVe embeddings
GloVe embeddings are used by the CNN classifier.
Block name: embedding_glove_<language>_stock
Supported languages: ar, de, en, es, fr, it, ja, ko, nl, pt, zh-cn
Code sample
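import watson_nlp
# Load the English syntax model and the English GloVe embedding model
syntax_model = watson_nlp.load("syntax_izumo_en_stock")
embeddings_model = watson_nlp.load("embedding_glove_en_stock")
text = "python"
# Run syntax analysis first; the embedding model takes the syntax result as input
syntax_doc = syntax_model.run(text)
embedding = embeddings_model.run(syntax_doc)
print(embedding)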
Output of the code sample:
{
"data": {
"data": [
-0.01909315399825573,
-0.009827353060245514,
...
0.008978910744190216,
-0.0702751949429512
],
"rows": 1,
"cols": 512,
"dtype": "float32"
},
"offsets": null,
"producer_id": null
}
Transformer embeddings
Block names: embedding_transformer_en_slate.125m, embedding_transformer_en_slate.30m
Supported languages: English only
Code sample
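import watson_nlp
# Alternative model: uncomment this line (and comment out the next one) to load the 125m variant
# embeddings_model = watson_nlp.load("embedding_transformer_en_slate.125m")
embeddings_model = watson_nlp.load("embedding_transformer_en_slate.30m")
text = "python"
# The transformer embedding model takes raw text directly; no syntax model is needed
embedding = embeddings_model.run(text)
print(embedding)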
Output of the code sample:
{
"data": {
"data": [
-0.055536773055791855,
0.008286023512482643,
...
-0.3202415108680725,
5.000295277568512e-05
],
"rows": 1,
"cols": 384,
"dtype": "float32"
},
"offsets": null,
"producer_id": {
"name": "Transformer Embeddings",
"version": "0.0.1"
}
}
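Embedding vectors are typically compared with cosine similarity. The sketch below is a plain-Python illustration with short made-up vectors; `cosine_similarity` is a hypothetical helper, not a library function. It rejects vectors of different lengths, which matters here because the sample outputs in this topic have different widths (512 columns for the USE sample, 384 for the transformer sample), so vectors from different models should not be compared with each other.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Convention chosen for this sketch: zero vectors match nothing.
    return dot / (norm_a * norm_b)

# Made-up three-dimensional vectors, used only to exercise the function.
v1 = [1.0, 0.0, 1.0]
v2 = [2.0, 0.0, 2.0]   # same direction as v1
v3 = [0.0, 1.0, 0.0]   # orthogonal to v1

print(cosine_similarity(v1, v2))
print(cosine_similarity(v1, v3))
```

Values close to 1 indicate vectors that point in nearly the same direction, and values close to 0 indicate unrelated vectors.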