Converting text to text embeddings

Last updated: November 28, 2024

Use the watsonx.ai text embeddings API and the available embedding models to generate text embeddings.

In a watsonx.ai notebook, use the functions that are available in the watsonx.ai Python library to convert text into text embeddings.

The following code snippet shows how to use the slate-30m-english-rtrvr model to convert the following two lines of text into text embeddings.

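A foundation model is a large-scale generative AI model that can be adapted to a wide range of downstream tasks.
Generative AI is a class of AI algorithms that can produce various types of content including text, source code, imagery, audio, and synthetic data.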

from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models import Embeddings
from ibm_watsonx_ai.metanames import EmbedTextParamsMetaNames as EmbedParams

my_credentials = {
  "url": "https://{region}.ml.cloud.ibm.com",
  "apikey": "{my-IBM-Cloud-API-key}",
}

client = APIClient(my_credentials)

model_id = client.foundation_models.EmbeddingModels.SLATE_30M_ENGLISH_RTRVR
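gen_parms = None
project_id = "{my-project-ID}"
space_id = None
# False skips TLS certificate verification; set to True or to a CA bundle path where verification is required.
verify = False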

# Set truncate_input_tokens to a value that is equal to or less than the
# maximum number of tokens allowed by the embedding model that you are using.
# If you don't specify this value and the input has more tokens than the
# model can process, an error is generated.

embed_params = {
  EmbedParams.TRUNCATE_INPUT_TOKENS: 128,
  EmbedParams.RETURN_OPTIONS: {
    'input_text': True
  }
}

embedding = Embeddings(
  model_id=model_id,
  credentials=my_credentials,
  params=embed_params,
  project_id=project_id,
  space_id=space_id,
  verify=verify
)

q = [
  "A foundation model is a large-scale generative AI model that can be adapted to a wide range of downstream tasks.",
  "Generative AI is a class of AI algorithms that can produce various types of content including text, source code, imagery, audio, and synthetic data."
]

embedding_vectors = embedding.embed_documents(texts=q)

print(embedding_vectors)

Replace {region}, {my-IBM-Cloud-API-key}, and {my-project-ID} with values that are valid for your environment.
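
The {region} and {my-IBM-Cloud-API-key} placeholders do not have to be hardcoded in the notebook. The following is a minimal sketch of one way to build the credentials dictionary from environment-style settings. The helper name `build_credentials` and the variable names `WATSONX_REGION` and `WATSONX_APIKEY` are assumptions made for this illustration; they are not part of the watsonx.ai Python library.

```python
import os


def build_credentials(env=None):
    """Build the credentials dictionary from environment-style settings.

    WATSONX_REGION and WATSONX_APIKEY are hypothetical variable names
    chosen for this sketch; use whatever names suit your environment.
    """
    env = os.environ if env is None else env
    required = ("WATSONX_REGION", "WATSONX_APIKEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing required settings: {', '.join(missing)}")
    return {
        "url": f"https://{env['WATSONX_REGION']}.ml.cloud.ibm.com",
        "apikey": env["WATSONX_APIKEY"],
    }


# Example with an explicit mapping instead of the real environment.
example = build_credentials(
    {"WATSONX_REGION": "us-south", "WATSONX_APIKEY": "example-key"}
)
print(example["url"])  # https://us-south.ml.cloud.ibm.com
```

The returned dictionary has the same shape as `my_credentials` in the snippet, so it could be passed wherever `my_credentials` is used.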

Sample output

[
   [-0.0053823674,-0.018807093,0.009131943, ...-0.010469643,0.0010533642,0.020114796], 
   [-0.04075534,-0.041552857,0.04326911, ...0.017616473,-0.010064489,0.020788372]
]
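
Embedding vectors such as these are commonly compared with a similarity measure, for example cosine similarity. The following is a minimal sketch of that comparison in plain Python. The `cosine_similarity` helper is not part of the watsonx.ai Python library, and the short toy vectors are assumptions that stand in for real embeddings, which have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings.
vec_a = [1.0, 0.0, 0.0]
vec_b = [0.0, 1.0, 0.0]
print(cosine_similarity(vec_a, vec_a))  # 1.0 (identical direction)
print(cosine_similarity(vec_a, vec_b))  # 0.0 (orthogonal)
```

Applied to the result of the snippet, `cosine_similarity(embedding_vectors[0], embedding_vectors[1])` would give a score for how close the two sentences are in the embedding space, with values nearer to 1 indicating greater similarity.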
