Coding an AutoAI RAG experiment with text extraction
Last updated: Feb 21, 2025
Review the guidelines and code samples to learn how to code an AutoAI RAG experiment by using watsonx text extraction to process input documents.
You can use text extraction to process input documents for an AutoAI RAG experiment. Text extraction transforms high-quality business documents with tables, images, and diagrams into markdown format. The resulting markdown files can then be used
in an AutoAI RAG experiment to enhance the quality of generated patterns.
The text extraction service uses the watsonx.ai Python client library (version 1.1.11 or later). For more information about using text extraction with the watsonx.ai Python SDK, see Text Extractions.
Follow these steps to use text extraction in your AutoAI RAG experiment.
Data must be in JSON format with a fixed schema that contains these fields: question, correct_answer, and correct_answer_document_ids. The correct_answer_document_ids field must refer to the output file from the text extraction service.
benchmarking_data = [
    {
        "question": "What are the two main variants of Granite Code models?",
        "correct_answer": "The two main variants are Granite Code Base and Granite Code Instruct.",
        "correct_answer_document_ids": <TEXT EXTRACTION OUTPUT FILENAME>
    },
    {
        "question": "What is the purpose of Granite Code Instruct models?",
        "correct_answer": "Granite Code Instruct models are finetuned for instruction-following tasks using datasets like CommitPack, OASST, HelpSteer, and synthetic code instruction datasets, aiming to improve reasoning and instruction-following capabilities.",
        "correct_answer_document_ids": <TEXT EXTRACTION OUTPUT FILENAME>
    },
    {
        "question": "What is the licensing model for Granite Code models?",
        "correct_answer": "Granite Code models are released under the Apache 2.0 license, ensuring permissive and enterprise-friendly usage.",
        "correct_answer_document_ids": <TEXT EXTRACTION OUTPUT FILENAME>
    },
]
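Before you run the experiment, it can help to sanity-check that every benchmark record carries the required fields. The helper below is not part of the SDK; it is a minimal sketch, and the sample filename is hypothetical:

```python
# Required fields for each AutoAI RAG benchmark record.
REQUIRED_FIELDS = {"question", "correct_answer", "correct_answer_document_ids"}

def validate_benchmark(records):
    """Return True only if every record contains all required fields."""
    return all(REQUIRED_FIELDS <= set(record) for record in records)

sample = [
    {
        "question": "What are the two main variants of Granite Code models?",
        "correct_answer": "The two main variants are Granite Code Base and Granite Code Instruct.",
        # Hypothetical filename; use your own text extraction output file here.
        "correct_answer_document_ids": ["granite_extraction_output.md"],
    }
]

print(validate_benchmark(sample))  # → True
```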
When the status is completed, move to the next step.
Step 3: Configure the RAG optimizer
The rag_optimizer object provides a set of methods for working with the AutoAI RAG experiment. In this step, enter the details that define the experiment by using the available configuration options.
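The exact option names depend on your SDK release. As a rough sketch (every name below is an assumption drawn from typical AutoAI RAG settings, not an authoritative list), the experiment definition includes a name, a description, a cap on the number of patterns to compute, and the metric to optimize:

```python
# Illustrative only: assumed configuration options for defining the experiment.
# Check the ibm-watsonx-ai SDK reference for the authoritative option names.
experiment_settings = {
    "name": "AutoAI RAG - text extraction demo",
    "description": "RAG patterns grounded in text extraction output",
    "max_number_of_rag_patterns": 5,                 # upper bound on patterns to compute
    "optimization_metrics": ["answer_correctness"],  # metric used to rank patterns
}

# With an authenticated experiment object, the definition would look like:
# rag_optimizer = experiment.rag_optimizer(**experiment_settings)
```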
Step 4: Run the optimizer
Run the optimizer to create the RAG patterns by using the specified configuration options. Use the output from text extraction as input in your AutoAI RAG experiment.
When you run a Chroma experiment in interactive mode, the call blocks until the experiment completes. You can run the task in the background instead by setting background_mode to True.
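A minimal sketch of launching the run follows; the DataConnection setup and the parameter names are assumptions, so check your SDK version before using them:

```python
background_mode = False  # switch to True to run the task in the background

# With an authenticated rag_optimizer and a DataConnection pointing at the
# text extraction output (both assumed to exist already), the launch would
# look like:
# run_details = rag_optimizer.run(
#     input_data_references=[input_data_connection],
#     background_mode=background_mode,
# )
print(background_mode)  # → False
```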
Step 5: Review the patterns and select the best one
After the AutoAI RAG experiment completes successfully, you can review the patterns. Use the summary method to list the completed patterns and their evaluation metrics as a Pandas DataFrame, ranked by performance against the optimized metric.
summary = rag_optimizer.summary()
summary
For example, pattern results display like this:
| Pattern | mean_answer_correctness | mean_faithfulness | mean_context_correctness | chunking.chunk_size | embeddings.model_id | vector_store.distance_metric | retrieval.method | retrieval.number_of_chunks | generation.model_id |
|---|---|---|---|---|---|---|---|---|---|
| Pattern1 | 0.6802 | 0.5407 | 1.0000 | 512 | ibm/slate-125m-english-rtrvr | euclidean | window | 5 | meta-llama/llama-3-70b-instruct |
| Pattern2 | 0.7172 | 0.5950 | 1.0000 | 1024 | intfloat/multilingual-e5-large | euclidean | window | 5 | ibm/granite-13b-chat-v2 |
| Pattern3 | 0.6543 | 0.5144 | 1.0000 | 1024 | intfloat/multilingual-e5-large | euclidean | simple | 5 | ibm/granite-13b-chat-v2 |
| Pattern4 | 0.6216 | 0.5030 | 1.0000 | 1024 | intfloat/multilingual-e5-large | cosine | window | 5 | meta-llama/llama-3-70b-instruct |
| Pattern5 | 0.7369 | 0.5630 | 1.0000 | 1024 | intfloat/multilingual-e5-large | cosine | window | 3 | mistralai/mixtral-8x7b-instruct-v01 |
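As a quick illustration of how such a ranking resolves, the records below restate the mean_answer_correctness values from the pattern results and pick the top pattern. This uses plain Python for clarity; the real summary object is a Pandas DataFrame:

```python
# mean_answer_correctness values copied from the pattern results above.
patterns = [
    {"pattern": "Pattern1", "mean_answer_correctness": 0.6802},
    {"pattern": "Pattern2", "mean_answer_correctness": 0.7172},
    {"pattern": "Pattern3", "mean_answer_correctness": 0.6543},
    {"pattern": "Pattern4", "mean_answer_correctness": 0.6216},
    {"pattern": "Pattern5", "mean_answer_correctness": 0.7369},
]

# Rank by the optimized metric and take the best-performing pattern.
best = max(patterns, key=lambda p: p["mean_answer_correctness"])
print(best["pattern"])  # → Pattern5
```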
Select a pattern to test locally
The next step is to select a pattern and test it locally. Because Chroma is in-memory, you must re-create the document index.
Tip:
In the following code sample, the index is built with the documents core_api.html and fm_embeddings.html.
# best_pattern is the pattern object that you retrieved from rag_optimizer.
payload = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [
        {
            "values": ["How to use new approach of providing credentials to APIClient?"],
        }
    ]
}

best_pattern.query(payload)
The model's response looks like this:
According to the document, the new approach to provide credentials to APIClient is by using the Credentials class. Here's an example:
from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai import Credentials
credentials = Credentials(
    url = "https://us-south.ml.cloud.ibm.com",
    token = "***********",
)
client = APIClient(credentials)
This replaces the old approach of passing a dictionary with credentials to the APIClient constructor.
Tip:
To retrieve a specific pattern, pass the pattern number to rag_optimizer.get_pattern().
Reviewing experiment results in Cloud Object Storage
If the final status of the experiment is failed or error, use rag_optimizer.get_logs() or refer to experiment results to understand what went wrong. Experiment results and logs are stored in the default Cloud Object Storage instance
that is linked to your account. By default, results are saved in the default_autoai_rag_out directory.
The evaluation_results.json file contains evaluation results for each benchmark question.
The indexing_inference_notebook.ipynb file contains the Python code for building a vector database index and for building the retrieval and generation functions. The notebook includes commands for retrieving data, chunking, and creating embeddings, as well as for retrieving chunks, building prompts, and generating answers.
Note:
The results notebook indexing_notebook.ipynb contains the code for embedding and indexing the documents. You can accelerate the document indexing task by changing vector_store.add_documents() to vector_store.add_documents_async().
Get an inference and indexing notebook
To download a specified inference notebook, use the get_inference_notebook() method. If you leave pattern_name empty, the method downloads the notebook of the best computed pattern.