Quick start: Evaluate and track a prompt template

Take this tutorial to learn how to evaluate and track a prompt template. You can evaluate prompt templates in projects or deployment spaces to measure the performance of foundation model tasks and understand how your model generates responses. Then, you can track the prompt template in an AI use case to capture and share facts about the asset to help you meet governance and compliance goals.

Required services
watsonx.governance

Your basic workflow includes these tasks:

  1. Open a project that contains the prompt template to evaluate. Projects are where you can collaborate with others to work with assets.
  2. Evaluate a prompt template using test data.
  3. Review the results on the AI Factsheet.
  4. Track the evaluated prompt template in an AI use case.
  5. Deploy and test your evaluated prompt template.

Read about prompt templates

With watsonx.governance, you can evaluate prompt templates in projects to measure how effectively your foundation models generate responses for the following task types:

  • Classification
  • Summarization
  • Generation
  • Question answering
  • Entity extraction
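Each task type is scored with metrics suited to it; for example, summarization is typically scored against reference summaries, while classification is scored on label accuracy. The mapping below is an illustrative sketch only, not the product's exact metric list:

```python
# Illustrative pairing of task types with typical evaluation metrics.
# These pairings are generic examples; the exact dimensions that
# watsonx computes appear in the Generative AI Quality section
# when you run an evaluation.
TYPICAL_METRICS = {
    "classification": ["accuracy", "F1"],
    "summarization": ["ROUGE", "faithfulness"],
    "generation": ["readability", "faithfulness"],
    "question answering": ["exact match", "F1"],
    "entity extraction": ["precision", "recall"],
}

def metrics_for(task_type: str) -> list:
    """Look up illustrative metrics for a task type (case-insensitive)."""
    return TYPICAL_METRICS.get(task_type.lower(), [])
```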

Read more about evaluating prompt templates in projects

Read more about evaluating prompt templates in deployment spaces

Watch a video about evaluating and tracking a prompt template

Watch this video to preview the steps in this tutorial. There might be slight differences in the user interface shown in the video. The video is intended to be a companion to the written tutorial.

This video provides a visual method to learn the concepts and tasks in this documentation.

Try a tutorial to evaluate and track a prompt template

In this tutorial, you will complete these tasks:




  • Use the video picture-in-picture

    Tip: Start the video, then as you scroll through the tutorial, the video moves to picture-in-picture mode. Close the video table of contents for the best experience with picture-in-picture. You can use picture-in-picture mode so you can follow the video as you complete the tasks in this tutorial. Click the timestamps for each task to follow along.

    The following animated image shows how to use the video picture-in-picture and table of contents features:

    How to use picture-in-picture and chapters

    Get help in the community

    If you need help with this tutorial, you can ask a question or find an answer in the Cloud Pak for Data Community discussion forum.

    Set up your browser windows

    For the optimal experience completing this tutorial, open Cloud Pak for Data in one browser window, and keep this tutorial page open in another browser window to switch easily between the two applications. Consider arranging the two browser windows side-by-side to make it easier to follow along.

    Side-by-side tutorial and UI

    Tip: If you encounter a guided tour while completing this tutorial in the user interface, click Maybe later.



  • Task 1: Create a project

    To preview this task, watch the video beginning at 00:08.

    You need a project to store the prompt template and the evaluation. Follow these steps to create a project based on a sample:

    1. Access the Getting started with watsonx governance project in the Resource hub.

      1. Click Create project.

      2. Accept the default values for the project name, and click Create.

      3. Click View new project when the project is successfully created.

    2. Associate a Watson Machine Learning service with the project:

      1. When the project opens, click the Manage tab, and select the Services and integrations page.

      2. On the IBM services tab, click Associate service.

      3. Select your Watson Machine Learning instance. If you don't have a Watson Machine Learning service instance provisioned yet, follow these steps:

        1. Click New service.

        2. Select Watson Machine Learning.

        3. Click Create.

        4. Select the new service instance from the list.

      4. Click Associate service.

      5. If necessary, click Cancel to return to the Services & Integrations page.

    3. Click the Assets tab in the project to see the sample assets.

    For more information or to watch a video, see Creating a project.

    For more information on associated services, see Adding associated services.

    Check your progress

    The following image shows the project Assets tab. You are now ready to evaluate the sample prompt template in the project.

    Sample project assets




  • Task 2: Evaluate the prompt template

    To preview this task, watch the video beginning at 00:36.

    The sample project contains a few prompt templates and CSV files used as test data. Follow these steps to download the test data and evaluate one of the sample prompt templates:

    1. Download the test data from the sample project.

      1. Click the Assets tab.
      2. For the Insurance claim summarization test data.csv file, click the overflow menu, and choose Download.
      3. Save the CSV file locally.
    2. Click Insurance claim summarization to open the prompt template in Prompt Lab, and then click Edit.

    3. Click the Prompt variables icon.

      Note: To run evaluations, you must create at least one prompt variable.
    4. Scroll to the Try section. Notice the {input} variable in the Input field. You must include the prompt variable as input for testing your prompt. A prompt variable is a placeholder keyword that you include in the static text of your prompt at creation time and replace with text dynamically at run time.

    5. Click the Evaluate icon.

    6. Expand the Generative AI Quality section to see a list of dimensions. The available metrics depend on the task type of the prompt. For example, summarization has different metrics than classification.

    7. Click Next.

    8. Select the test data:

      1. Click Browse.
      2. Select the Insurance claim summarization test data.csv file.
      3. Click Open.
      4. For the Input column, select Insurance_Claim.
      5. For the Reference output column, select Summary.
      6. Click Next.
    9. Click Evaluate. When the evaluation completes, you see the test results on the Evaluate tab.

    10. Click the AI Factsheet tab.

      1. View the information on each of the sections on the tab.
      2. Click Evaluation > Develop > Test to see the test results again.
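Conceptually, the evaluation in the steps above loops over the test CSV, fills the {input} prompt variable with each Insurance_Claim value, and scores the model's output against the Summary reference. The sketch below imitates that flow with invented rows and a trivial stand-in template; none of it is the watsonx implementation:

```python
import csv
import io

# Invented stand-in for "Insurance claim summarization test data.csv".
csv_text = """Insurance_Claim,Summary
Hail damaged the roof on May 3.,Roof hail damage.
Rear-end collision at a stop light.,Minor rear-end collision.
"""

# A prompt with the {input} prompt variable, as in step 4 above.
TEMPLATE = "Summarize the following insurance claim:\n\n{input}"

rows = list(csv.DictReader(io.StringIO(csv_text)))
# Mirror the UI choices in step 8: Input column -> Insurance_Claim,
# Reference output column -> Summary.
prompts = [TEMPLATE.format(input=r["Insurance_Claim"]) for r in rows]
references = [r["Summary"] for r in rows]
# Each prompt would be sent to the foundation model, and the model's
# response scored against the matching reference summary.
```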

    Check your progress

    The following image shows the results of the evaluation. Now you can start tracking the prompt template in an AI use case.

    Prompt template evaluation test results




  • Task 3: Create a model inventory and AI use case

    To preview this task, watch the video beginning at 01:54.

    You use a model inventory for storing and reviewing AI use cases. AI use cases collect governance facts for AI assets that your organization tracks. You can view all the AI use cases in an inventory. Follow these steps to create a model inventory and AI use case:

    Create a model inventory

    1. From the navigation menu, choose AI governance > AI use cases.

    2. Manage your inventories:

      • If you have an existing inventory, you can skip to Create an AI use case to use that inventory.
      • If you don't have any inventories, click Manage inventories:
      1. Click New inventory.

      2. For the name, copy and paste the following text:

        Golden Bank Insurance Inventory
        
      3. For the description, copy and paste the following text:

        Model inventory for insurance related processing
        
      4. Clear the Add collaborators after creation option.

      5. Select your Cloud Object Storage instance from the list.

      6. Click Create.

    3. Close the Manage inventories page.

    Check your progress

    The following image shows the model inventory. You are now ready to create an AI use case.

    Model inventory

    Create an AI use case

    1. Click New AI use case.

    2. For the Name, copy and paste the following text:

      Insurance claims processing AI use case
      
    3. Select an existing model inventory.

    4. Click Create to accept the default values for the rest of the fields.

    Check your progress

    The following image shows the AI use case. You are now ready to track the prompt template.

    AI use case




  • Task 4: Track the prompt template in an AI use case

    To preview this task, watch the video beginning at 02:33.

    You can track your prompt template in an AI use case to report the development and test process to your peers. Follow these steps to start tracking the prompt template:

    1. From the navigation menu, choose Projects > View all projects.
    2. Select the Getting started with watsonx governance project.
    3. Click the Assets tab.
    4. From the overflow menu for the Insurance claim summarization prompt template, select View AI Factsheet.
    5. On the AI Factsheet tab, click the Governance page.
    6. Click Track an AI use case.
    7. Select the Insurance claims processing AI use case.
    8. Click Next.
    9. Select an approach.
    10. Click Next.
    11. For the model version, select Experimental.
    12. Accept the default value for the version number.
    13. Click Next.
    14. Review the information, and then click Track asset.
    15. When model tracking successfully begins, click the View details icon to open the AI use case.
    16. Click the Lifecycle tab to see the prompt template in the Develop phase.

    Check your progress

    The following image shows the Lifecycle tab in the AI use case with the prompt template in the Develop phase. You are now ready to continue to the Validate phase.

    The Lifecycle tab in the AI use case




  • Task 5: Create a validation project

    To preview this task, watch the video beginning at 03:22.

    Typically, the prompt engineer evaluates the prompt with test data, and a validation engineer validates the prompt. The validation engineer has access to validation data that the prompt engineer might not have. In this case, the validation data resides in a different project. Follow these steps to export the development project and import it as a new validation project, moving the asset into the validation phase of the AI lifecycle:

    1. From the navigation menu, choose Projects > View all projects.

    2. Select the Getting started with watsonx governance project.

    3. Click the Import/Export icon > Export project.

    4. Check the box to select all assets.

    5. Click Export.

    6. For the export file name, copy and paste the following text, and then click Save.

      validation project.zip
      
    7. When the project export completes, click Back to project.

    8. From the navigation menu, choose Projects > View all projects.

    9. Click New project.

      1. Select Create a project from a sample or file.

      2. Click Browse.

      3. Select the validation project.zip, and click Open.

      4. For the project name, copy and paste the following text:

        Validation project
        
      5. Click Create.

    10. When the project is created, click View new project.

    11. Follow the same steps as in Task 1 to associate your Watson Machine Learning service with this project.

    Check your progress

    The following image shows the validation project Assets tab. You are now ready to evaluate the sample prompt template in the validation project.

    Validation project assets




  • Task 6: Validate the prompt template

    To preview this task, watch the video beginning at 04:18.

    Now you are ready to evaluate the prompt template in this validation project by using the same evaluation process as before: use the same test data set, and select the same Input and Reference output columns. Follow these steps to validate the prompt template:

    1. Click the Assets tab in the Validation project.
    2. Repeat the steps in Task 2 to evaluate the Insurance claim summarization prompt template.
    3. Click the AI Factsheet tab when the evaluation is complete.
    4. View both sets of test results:
      1. Click Evaluation > Develop > Test.
      2. Click Evaluation > Validate > Test.
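Comparing the Develop and Validate test results is essentially a diff over the same metric names across the two phases. A hypothetical sketch of such a comparison (the metric names and values are invented, not taken from the factsheet):

```python
# Hypothetical metric snapshots from the Develop and Validate phases.
develop = {"rouge1": 0.42, "faithfulness": 0.88}
validate = {"rouge1": 0.40, "faithfulness": 0.90}

def regressions(dev: dict, val: dict, tolerance: float = 0.05) -> list:
    """Return metric names whose validation score dropped by more
    than the tolerance compared to the development score."""
    return [m for m in dev if m in val and dev[m] - val[m] > tolerance]

# No metric here drops by more than 0.05, so nothing is flagged.
flagged = regressions(develop, validate)
```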

    Check your progress

    The following image shows the validation test results. You are now ready to promote the prompt template to a deployment space, and then deploy the prompt template.

    Prompt template evaluation test results




  • Task 7: Promote, deploy, and test the prompt template

    To preview this task, watch the video beginning at 05:00.

    Promote the prompt template to a deployment space

    You promote the prompt template to a deployment space in preparation for deploying it. Follow these steps to promote the prompt template:

    1. Click Validation project in the projects navigation trail.
    2. From the overflow menu for the Insurance claim summarization prompt template, select Promote to space.
    3. For the Target space, select Create a new deployment space.
      1. For the Space name, copy and paste the following text:

        Insurance claims deployment space
        
      2. For the Deployment stage, select Production.

        Important: You must select Production for the Deployment stage if you wish to move the deployment from the Evaluate stage to the Operate stage.
      3. Select your machine learning service from the list.

      4. Click Create.

      5. Click Close.

    4. Select the Insurance claims deployment space from the list.
    5. Check the option to Go to the space after promoting the prompt template.
    6. Click Promote.

    Check your progress

    The following image shows the prompt template in the deployment space. You are now ready to create a deployment.

    Prompt template in deployment space

    Deploy the prompt template

    Now you can create an online deployment of the prompt template from inside the deployment space. Follow these steps to create a deployment:

    1. From the Insurance claims summarization asset page in the deployment space, select New deployment.

    2. For the deployment name, copy and paste the following text:

      Insurance claims summarization deployment
      
    3. Click Create.

    Check your progress

    The following image shows the deployed prompt template.

    Deployed prompt template

    View the deployed prompt template

    Follow these steps to view the deployed prompt template in its current phase of the lifecycle:

    1. View the deployment when it is ready. The API reference tab provides information for you to use the prompt template deployment in your application.
    2. Click the Test tab. The Test tab allows you to submit an instruction and Input to test the deployment.
    3. Click Generate. Close the results window.
    4. Click the AI Factsheet tab. The AI Factsheet shows that the prompt template is now in the operate phase.
    5. Scroll down to the bottom of the AI Factsheet page, and click the arrow for more details.
    6. Select the Evaluation > Operate > Deployment 1 page.
    7. Click the View details icon at the top of the factsheet to open the AI use case.
    8. Click the Lifecycle tab.
    9. Click the Insurance claim summarization prompt template in the Operate phase. When you are done, click Cancel.
    10. Click the Insurance claims summarization deployment prompt template deployment in the Operate phase.
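The API reference tab mentioned in step 1 above documents how your application can call the deployment. The sketch below builds a request for the watsonx.ai deployment text-generation endpoint; treat the exact endpoint path, payload fields, host, and version date as assumptions to verify against your deployment's own API reference tab, and the deployment ID as a placeholder:

```python
import json

# Hedged sketch of preparing a REST call to a prompt template
# deployment. Verify the path, payload shape, host, and version
# against the deployment's API reference tab before using.
HOST = "https://us-south.ml.cloud.ibm.com"  # region-specific host
DEPLOYMENT_ID = "your-deployment-id"        # placeholder value
VERSION = "2024-05-01"                      # API version date

def build_generation_request(claim_text: str):
    """Return the (url, json_body) for a text-generation call that
    fills the {input} prompt variable with the claim text."""
    url = (f"{HOST}/ml/v1/deployments/{DEPLOYMENT_ID}"
           f"/text/generation?version={VERSION}")
    body = json.dumps(
        {"parameters": {"prompt_variables": {"input": claim_text}}}
    )
    return url, body

url, body = build_generation_request("Hail damaged the roof on May 3.")
# Send with any HTTP client, adding the headers
#   Authorization: Bearer <IAM token>
#   Content-Type: application/json
```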

    Check your progress

    The following image shows the prompt template in the Operate phase of the lifecycle.

    Prompt template in the Operate phase



Next steps

Try one of the other tutorials:

Additional resources

  • View more videos.

  • Find sample data sets, projects, models, prompts, and notebooks in the Resource hub to gain hands-on experience:

    Notebook icon Notebooks that you can add to your project to get started analyzing data and building models.

    Project icon Projects that you can import containing notebooks, data sets, prompts, and other assets.

    Data set icon Data sets that you can add to your project to refine, analyze, and build models.

    Prompt icon Prompts that you can use in the Prompt Lab to prompt a foundation model.

    Model icon Foundation models that you can use in the Prompt Lab.

Parent topic: Quick start tutorials
