Quick start: Evaluate and track a prompt template


Take this tutorial to learn how to evaluate and track a prompt template. You can evaluate prompt templates in projects or deployment spaces to measure the performance of foundation model tasks and understand how your model generates responses. Then, you can track the prompt template in an AI use case to capture and share facts about the asset to help you meet governance and compliance goals.

Required services
watsonx.governance

Your basic workflow includes these tasks:

  1. Open a project that contains the prompt template to evaluate. Projects are where you can collaborate with others to work with assets.
  2. Evaluate a prompt template using test data.
  3. Review the results on the AI Factsheet.
  4. Track the evaluated prompt template in an AI use case.
  5. Deploy and test your evaluated prompt template.

Read about prompt templates

With watsonx.governance, you can evaluate prompt templates in projects to measure how effectively your foundation models generate responses for the following task types:

  • Classification
  • Summarization
  • Generation
  • Question answering
  • Entity extraction

Read more about evaluating prompt templates in projects

Read more about evaluating prompt templates in deployment spaces

Watch a video about evaluating and tracking a prompt template

Watch this video to preview the steps in this tutorial. There might be slight differences in the user interface shown in the video. The video is intended to be a companion to the written tutorial.

This video provides a visual method to learn the concepts and tasks in this documentation.


Try a tutorial about evaluating and tracking a prompt template

In this tutorial, you will complete these tasks:

  • Task 1: Create a model inventory and AI use case
  • Task 2: Create a project
  • Task 3: Evaluate the sample prompt template
  • Task 4: Start tracking the prompt template
  • Task 5: Create a new project for validation
  • Task 6: Validate the prompt template
  • Task 7: Deploy the prompt template

Tips for completing this tutorial Here are some tips for successfully completing this tutorial.

Use the video picture-in-picture

Tip: Start the video, then as you scroll through the tutorial, the video moves to picture-in-picture mode. Close the video table of contents for the best experience with picture-in-picture. Picture-in-picture mode lets you follow the video as you complete the tasks in this tutorial. Click the timestamps for each task to follow along.

The following animated image shows how to use the video picture-in-picture and table of contents features:

How to use picture-in-picture and chapters

Get help in the community

If you need help with this tutorial, you can ask a question or find an answer in the watsonx Community discussion forum.

Set up your browser windows

For the optimal experience completing this tutorial, open Cloud Pak for Data in one browser window, and keep this tutorial page open in another browser window to switch easily between the two applications. Consider arranging the two browser windows side-by-side to make it easier to follow along.

Side-by-side tutorial and UI

Tip: If you encounter a guided tour while completing this tutorial in the user interface, click Maybe later.



Task 1: Create a model inventory and AI use case

To preview this task, watch the video beginning at 00:09.

A model inventory stores and organizes AI use cases, which collect governance facts for the AI assets that your organization tracks. You can view all the AI use cases in an inventory.

Task 1a: Create a model inventory

Follow these steps to create a model inventory:

  1. From the Navigation menu, choose AI governance > AI use cases.

  2. Manage your inventories:

    • If you have an existing inventory, then you can skip to Task 1b: Create an AI use case and use that inventory.
    • If you don't have any inventories, then click Manage inventories.
    1. Click New inventory.

    2. For the name, copy and paste the following text:

      Golden Bank Insurance Inventory
      
    3. For the description, copy and paste the following text:

      Model inventory for insurance related processing
      
    4. Clear the Add collaborators after creation option. You can restrict access at the inventory and AI use case level.

    5. Select your Cloud Object Storage instance from the list.

    6. Click Create.

  3. Close the Manage inventories page.

Check your progress

The following image shows the model inventory. You are now ready to create an AI use case.

Model inventory

Task 1b: Create an AI use case

An AI use case is a defined business problem that you can solve with the help of AI. Usually, use cases are defined before any AI assets are developed. Follow these steps to create an AI use case:

  1. Click New AI use case.

  2. For the Name, copy and paste the following text:

    Insurance claims processing AI use case
    
  3. Select an existing model inventory.

  4. Click Create to accept the default values for the rest of the fields.

Check your progress

The following image shows the AI use case. You are now ready to track the prompt template.

AI use case




Task 2: Create a project

To preview this task, watch the video beginning at 00:51.

You need a project to store the prompt template and the evaluation. Follow these steps to create a project based on a sample:

  1. Access the Getting started with watsonx governance project in the Resource hub.

    1. Click Create project.

    2. Accept the default values for the project name, and click Create.

    3. Click View new project when the project is successfully created.

  2. Associate a Watson Machine Learning service with the project. For more information, see Watson Machine Learning.

    1. When the project opens, click the Manage tab, and select the Services and integrations page.

    2. On the IBM services tab, click Associate service.

    3. Select your Watson Machine Learning instance. If you don't have a Watson Machine Learning service instance provisioned yet, follow these steps:

      1. Click New service.

      2. Select Watson Machine Learning.

      3. Click Create.

      4. Select the new service instance from the list.

    4. Click Associate service.

    5. If necessary, click Cancel to return to the Services and integrations page.

  3. Click the Assets tab in the project to see the sample assets.

For more information or to watch a video, see Creating a project.

For more information on associated services, see Adding associated services.

Check your progress

The following image shows the project Assets tab. You are now ready to evaluate the sample prompt template in the project.

Sample project assets




Task 3: Evaluate the sample prompt template

To preview this task, watch the video beginning at 01:28.

The sample project contains a few prompt templates and CSV files used as test data. Follow these steps to download the test data and evaluate one of the sample prompt templates:

  1. Download the test data from the sample project. You need to provide a local file for the test data during evaluation.

    1. Click the Assets tab.
    2. For the Insurance claim summarization test data.csv file, click the overflow menu, and choose Download.
    3. Save the CSV file locally.
  2. Click Insurance claim summarization to open the prompt template in Prompt Lab, and then click Edit.

  3. Click the Prompt variables icon.

    Note: To run evaluations, you must create at least one prompt variable.
  4. Scroll to the Try section. Notice the {input} variable in the Input field. You must include the prompt variable as input for testing your prompt. A prompt variable is a placeholder keyword that you include in the static text of your prompt at creation time and replace with text dynamically at run time.

  5. Click the Evaluate icon.

  6. Expand the Generative AI Quality section to see a list of dimensions. The available metrics depend on the task type of the prompt. For example, summarization has different metrics than classification.

  7. Click Next.

  8. Select the test data:

    1. Click Browse.
    2. Select the Insurance claim summarization test data.csv file that you previously downloaded.
    3. Click Open.
    4. For the Input column, select Insurance_Claim.
    5. For the Reference output column, select Summary.
    6. Click Next.
  9. Click Evaluate. Evaluations can take a few minutes to complete. When the evaluation completes, you see the test results on the Evaluate tab. This page shows detailed information about this evaluation run so you can gain insights about your model performance. The summary provides an overview of metric scores and violations of default score thresholds for your prompt template evaluations.

  10. Click the AI Factsheet tab.

    1. View the information on each of the sections on the tab.
    2. Click Evaluation > Develop > Test to see the test results again.
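Step 4 above describes how a prompt variable such as {input} is replaced with text at run time. Conceptually, the substitution works like simple string templating. The following Python sketch illustrates the idea; the template wording and sample claim are made up for this example, and this is not how Prompt Lab is implemented internally:

```python
# Illustrative only: the {input} variable name matches the sample template,
# but the template text and claim below are invented for this sketch.
template = (
    "Summarize the following insurance claim in one or two sentences.\n\n"
    "Claim: {input}\n\nSummary:"
)

def fill_prompt(template: str, **variables: str) -> str:
    """Replace each {name} placeholder with the supplied text."""
    return template.format(**variables)

claim = "Policyholder reports hail damage to the roof on 2024-05-01."
prompt = fill_prompt(template, input=claim)
print(prompt)
```

At evaluation time, each row of the Input column plays the role of `claim` here: the evaluation substitutes the row's text into the {input} variable, runs the prompt, and compares the result to the Reference output column.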

Check your progress

The following image shows the results of the evaluation. Now you can start tracking the prompt template in an AI use case.

Prompt template evaluation test results
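The generative AI quality metrics for summarization compare each generated summary against the Reference output column of your test data. As an illustration only, the toy function below sketches a unigram-overlap F1 score, similar in spirit to ROUGE-1; it is not the exact metric that watsonx.governance computes:

```python
def unigram_f1(generated: str, reference: str) -> float:
    """Token-overlap F1 between a generated summary and a reference.
    A toy stand-in for reference-based metrics such as ROUGE-1."""
    gen = set(generated.lower().split())
    ref = set(reference.lower().split())
    if not gen or not ref:
        return 0.0
    overlap = len(gen & ref)
    precision = overlap / len(gen)
    recall = overlap / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical generated and reference summaries for one test row.
score = unigram_f1(
    "Hail damaged the roof and a claim was filed",
    "The policyholder filed a claim for hail damage to the roof",
)
print(round(score, 2))
```

A score near 1.0 means the generated summary shares most of its wording with the reference; the default score thresholds mentioned above flag evaluations where such metrics fall too low.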




Task 4: Start tracking the prompt template

To preview this task, watch the video beginning at 02:37.

You can track your prompt template in an AI use case to report the development and test process to your peers. Follow these steps to start tracking the prompt template:

  1. From the Navigation menu, choose Projects > View all projects.
  2. Select the Getting started with watsonx governance project.
  3. Click the Assets tab.
  4. From the overflow menu for the Insurance claim summarization prompt template, select View AI Factsheet. Every AI asset has an AI factsheet, which includes detailed information about how the asset was built, its evaluation results across the AI lifecycle, and additional attachments.
  5. On the AI Factsheet tab, click the Governance page.
  6. Click Track an AI use case.
  7. Select the Insurance claims processing AI use case.
  8. Click Next.
  9. Select an approach. An approach is one facet of the solution to the business problem represented by the AI use case. For example, you might create approaches to track several prompt templates in a use case.
  10. Click Next.
  11. For the model version, select Experimental.
  12. Accept the default value for the version number.
  13. Click Next.
  14. Review the information, and then click Track asset.
  15. When model tracking successfully begins, click the View details icon to open the AI use case.
  16. Click the Lifecycle tab to see the prompt template is in the Develop phase. As the prompt template moves through the AI lifecycle, it will move through these phases:
    • Develop phase: AI assets that have been developed in a project environment.
    • Test phase: ML models that have been deployed in a space for testing.
    • Validate phase: AI assets that have been deployed in a space or project for validation.
    • Operate phase: AI assets deployed in a space for operation.

Check your progress

The following image shows the Lifecycle tab in the AI use case with the prompt template in the Develop phase. You are now ready to continue to the Validate phase.

The Lifecycle tab in the AI use case




Task 5: Create a new project for validation

To preview this task, watch the video beginning at 03:27.

Typically, a prompt engineer evaluates the prompt with test data, and a validation engineer then validates the prompt with validation data that the prompt engineer might not have access to. In this tutorial, validation takes place in a separate project. Follow these steps to export the development project and import it as a new validation project, moving the asset into the Validate phase of the AI lifecycle:

  1. From the Navigation menu, choose Projects > View all projects.

  2. Select the Getting started with watsonx governance project.

  3. Click the Import/Export icon, and select Export project.

  4. Check the box to select all assets.

  5. Click Export.

  6. For the export file name, copy and paste the following text, and then click Save.

    validation project.zip
    
  7. When the project export completes, click Back to project.

  8. From the Navigation menu, choose Projects > View all projects.

  9. Click New project.

    1. Select Create a project from a sample or file.

    2. Click Browse.

    3. Select the validation project.zip, and click Open.

    4. For the project name, copy and paste the following text:

      Validation project
      
    5. Click Create.

  10. When the project is created, click View new project.

  11. Follow the same steps as in Task 2 to associate your Watson Machine Learning service with this project.

Check your progress

The following image shows the validation project Assets tab. You are now ready to evaluate the sample prompt template in the validation project.

Validation project assets




Task 6: Validate the prompt template

To preview this task, watch the video beginning at 04:22.

Now you are ready to evaluate the prompt template in this validation project using the same evaluation process as before: use the same test data set, and select the same Input and Reference output columns. Follow these steps to validate the prompt template:

  1. Click the Assets tab in the Validation project.
  2. Repeat the steps in Task 3 to evaluate the Insurance claim summarization prompt template.
  3. Click the AI Factsheet tab when the evaluation is complete.
  4. View both sets of test results:
    1. Click Evaluation > Develop > Test.
    2. Click Evaluation > Validate > Test.

Check your progress

The following image shows the validation test results. You are now ready to promote the prompt template to a deployment space, and then deploy the prompt template.

Prompt template evaluation test results




Task 7: Deploy the prompt template

To preview this task, watch the video beginning at 05:04.

Task 7a: Promote the prompt template to a deployment space

You promote the prompt template to a deployment space in preparation for deploying it. Follow these steps to promote the prompt template:

  1. Click Validation project in the projects navigation trail.
  2. From the overflow menu for the Insurance claim summarization prompt template, select Promote to space.
  3. For the Target space, select Create a new deployment space.
    1. For the Space name, copy and paste the following text:

      Insurance claims deployment space
      
    2. For the Deployment stage, select Production.

      Important: You must select Production for the Deployment stage if you wish to move the deployment from the Evaluate stage to the Operate stage.
    3. Select your machine learning service from the list.

    4. Click Create.

    5. Click Close.

  4. Select the Insurance claims deployment space from the list.
  5. Check the option to Go to the space after promoting the prompt template.
  6. Click Promote.

Check your progress

The following image shows the prompt template in the deployment space. You are now ready to create a deployment.

Prompt template in deployment space

Task 7b: Deploy the prompt template

Now you can create an online deployment of the prompt template from inside the deployment space. Follow these steps to create a deployment:

  1. From the Insurance claim summarization asset page in the deployment space, select New deployment.

  2. For the deployment name, copy and paste the following text:

    Insurance claims summarization deployment
    
  3. Click Create.

Check your progress

The following image shows the deployed prompt template.

Deployed prompt template

Task 7c: View the deployed prompt template

Follow these steps to view the deployed prompt template in its current phase of the lifecycle:

  1. View the deployment when it is ready. The API reference tab provides information for you to use the prompt template deployment in your application.
  2. Click the Test tab. The Test tab allows you to submit an instruction and Input to test the deployment.
  3. Click Generate. Close the results window.
  4. Click the AI Factsheet tab. The AI Factsheet shows that the prompt template is now in the Operate phase.
  5. Scroll down to the bottom of the AI Factsheet page, and click the arrow for more details.
  6. Select the Evaluation > Operate > Deployment 1 page.
  7. Click the View details icon at the top of the factsheet to open the AI use case.
  8. Click the Lifecycle tab.
  9. Click the Insurance claim summarization prompt template in the Operate phase. When you are done, click Cancel.
  10. Click the Insurance claims summarization deployment prompt template deployment in the Operate phase.
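The API reference tab in step 1 shows the exact endpoint and request body for your deployment. As a hypothetical sketch only, calling the deployed prompt template from an application might look like the following; the URL, version date, payload shape, and token are placeholders, so copy the real values from your deployment's API reference tab:

```python
import json
import urllib.request

# Placeholders: copy the real endpoint from the deployment's API reference
# tab, and obtain a bearer token from IBM Cloud IAM. The request body shape
# below is an assumption for this sketch.
API_URL = (
    "https://<region>.ml.cloud.ibm.com/ml/v1/deployments/"
    "<deployment_id>/text/generation?version=<date>"
)
IAM_TOKEN = "<bearer token from IBM Cloud IAM>"

def build_payload(claim_text: str) -> dict:
    """Assumed body: the prompt variable name matches the {input} variable
    in the Insurance claim summarization template."""
    return {"parameters": {"prompt_variables": {"input": claim_text}}}

def summarize_claim(claim_text: str) -> str:
    """Send one claim to the deployed prompt template and return the text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(claim_text)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {IAM_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["results"][0]["generated_text"]
```

This mirrors what the Test tab does interactively: you supply a value for the prompt variable, and the deployment returns the generated summary.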

Check your progress

The following image shows the prompt template in the Operate phase of the lifecycle.

Prompt template in the Operate phase




Next steps

Try one of the other tutorials:

Additional resources

  • View more videos.

  • Find sample data sets, projects, models, prompts, and notebooks in the Resource hub to gain hands-on experience:

    Notebooks that you can add to your project to get started analyzing data and building models.

    Projects that you can import containing notebooks, data sets, prompts, and other assets.

    Data sets that you can add to your project to refine, analyze, and build models.

    Prompts that you can use in the Prompt Lab to prompt a foundation model.

    Foundation models that you can use in the Prompt Lab.

Parent topic: Quick start tutorials
