To drive responsible, transparent, and explainable AI workflows, your enterprise needs an integrated system for tracking, monitoring, and retraining AI models. Watsonx.governance provides the processes and technologies to enable your enterprise to monitor, maintain, automate, and govern machine learning and generative AI models in production.
Challenges
Watsonx.governance helps you to solve the following challenges for your enterprise:
- Ensuring governance and compliance for machine learning models and generative AI assets
- Organizations need to evaluate, track, and document the detailed history of machine learning models and generative AI assets to ensure compliance and to provide visibility to all stakeholders.
- Managing risk and ensuring responsible AI
- Organizations need to monitor models in production to ensure that the models are valid and accurate, and that they are not introducing bias or drifting away from the intended goals.
- Monitoring and retraining machine learning models
- Organizations need to automate the monitoring and retraining of machine learning models based on production feedback.
Example: Golden Bank's challenges
Follow the story of Golden Bank as it uses watsonx.governance to govern its AI assets while implementing a process to analyze how marketing promotions boost sales of its investment products. The team needs to:
- Track their machine learning models and generative AI assets throughout the lifecycle, capture and share facts about the assets, and help meet governance and compliance goals.
- Monitor their deployed models for fairness, accuracy, explainability, and drift.
- Evaluate their summarization and question-answering prompt templates to measure how effectively the foundation models generate responses.
- Create a pipeline to simplify the retraining process.
Process
To implement watsonx.governance for your enterprise, your organization can follow this process:
- Track machine learning models and prompt templates
- Evaluate machine learning models and prompt templates
- Monitor deployed machine learning models and prompt templates
Watsonx.ai and watsonx.governance provide the tools and processes that your organization needs to govern AI assets.
1. Track machine learning models and prompt templates
Your team can track your machine learning models and prompt templates from request to production and evaluate whether they comply with your organization's regulations and requirements.
What you can use | What you can do | Best to use when |
---|---|---|
Factsheets | Create an AI use case to track and govern AI assets from request through production. View lifecycle status for all of the registered assets and drill down to detailed factsheets for models, deployments, or prompt templates that are registered to the model use case. Review the details that are captured for each tracked asset in a factsheet associated with an AI use case. View evaluation details, quality metrics, fairness details, and drift details. | You need to request a new model or prompt template from your data science team. You want to make sure that your model or prompt template is compliant and performing as expected. You want to determine whether you need to update a model or prompt template based on tracking data. You want to run reports on tracked assets to share or preserve details.
Example: Golden Bank's model tracking
Business analysts at Golden Bank request a predictive model and then track it through all stages of the AI lifecycle as data scientists and ML engineers build and train the model and ModelOps engineers deploy and evaluate it. Factsheets document the model's history and capture metrics that show its performance.
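In practice, much of this tracking is driven from a notebook with the AI Factsheets Python client (`ibm_aigov_facts_client`). The following is a minimal sketch, assuming placeholder credentials, project ID, run ID, and experiment name; verify the exact calls against the client documentation for your release.

```python
# Minimal sketch of capturing training facts with the AI Factsheets
# Python client. The API key, project ID, and run ID are placeholders,
# and method names should be checked against your client release.
from ibm_aigov_facts_client import AIGovFactsClient

facts_client = AIGovFactsClient(
    api_key="YOUR_IBM_CLOUD_API_KEY",        # placeholder credential
    experiment_name="golden-bank-promotions",
    container_type="project",                # store facts with this project
    container_id="YOUR_PROJECT_ID",          # placeholder project ID
)

# Train the model as usual. With autologging (the client's default),
# training runs are recorded so that their parameters and metrics can
# be exported to the model's factsheet.
# ... model training code ...

run_id = "YOUR_TRAINING_RUN_ID"              # placeholder: the logged run to publish
facts_client.export_facts.export_payload(run_id)
```

Once the facts are exported, the model can be registered to an AI use case so that the factsheet appears alongside the asset's lifecycle status.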
2. Evaluate machine learning models and prompt templates
You can evaluate machine learning models and prompt templates in projects or deployment spaces to measure their performance. Evaluate machine learning models for quality, fairness, and accuracy. Evaluate prompt templates on foundation model tasks to understand how the model generates responses.
What you can use | What you can do | Best to use when |
---|---|---|
Projects | Use a project as a collaborative workspace to build machine learning models, prompt foundation models, save machine learning models and prompt templates, and evaluate machine learning models and prompt templates. By default, your sandbox project is created automatically when you sign up for watsonx. | You want to collaborate on machine learning models and prompt templates. |
Spaces user interface | Use the Spaces UI to deploy and evaluate machine learning models, prompt templates, and other assets from projects to spaces. | You want to deploy and evaluate machine learning models and prompt templates and view deployment information in a collaborative workspace. |
Example: Golden Bank's prompt evaluation
Golden Bank's data scientist and prompt engineering teams work together to evaluate the summarization and question-answering prompt templates using test data. They want to measure the performance of the foundation model and understand how the model generates responses. The evaluations are tracked in AI Factsheets, so the entire team can monitor the prompt templates throughout the lifecycle, from the development phase all the way through to the production phase.
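A notebook-based version of this kind of evaluation can be sketched with the watsonx.ai Python SDK (`ibm-watsonx-ai`): generate answers from the prompt template over a small test set and score them against reference answers. This is an illustration, not the built-in evaluation tool; the model choice, credentials, test data, and the unigram-F1 scorer (a simple stand-in for the ROUGE-style metrics the tool reports) are assumptions.

```python
# Illustrative sketch: score a question-answering prompt template against
# test data with the ibm-watsonx-ai SDK. Credentials, project ID, model
# ID, and test data are placeholders; unigram F1 stands in for the
# evaluation tool's generation metrics.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

model = ModelInference(
    model_id="ibm/granite-13b-instruct-v2",            # assumed model choice
    credentials=Credentials(url="https://us-south.ml.cloud.ibm.com",
                            api_key="YOUR_API_KEY"),    # placeholder
    project_id="YOUR_PROJECT_ID",                       # placeholder
)

PROMPT = ("Answer the question using the context.\n"
          "Context: {context}\nQuestion: {question}\nAnswer:")

def unigram_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1: a simple proxy for answer quality."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    common = sum(min(pred.count(t), ref.count(t)) for t in set(pred))
    if not common:
        return 0.0
    precision, recall = common / len(pred), common / len(ref)
    return 2 * precision * recall / (precision + recall)

test_data = [  # tiny made-up test set
    {"context": "Golden Bank offers a 5% bonus on new investment accounts.",
     "question": "What bonus does Golden Bank offer?",
     "reference": "a 5% bonus on new investment accounts"},
]

scores = [unigram_f1(model.generate_text(PROMPT.format(**row)),
                     row["reference"]) for row in test_data]
print(f"mean unigram F1: {sum(scores) / len(scores):.2f}")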
3. Monitor deployed machine learning models and prompt templates
After deploying models, it is important to govern and monitor them to make sure that they are explainable and transparent. Data scientists must be able to explain how the models arrive at certain predictions so that they can determine whether the predictions have any implicit or explicit bias. You can configure drift evaluations to measure changes in your data over time to ensure consistent outcomes for your model. Use drift evaluations to identify changes in your model output, the accuracy of your predictions, and the distribution of your input data.
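To make drift concrete, the sketch below computes a population stability index (PSI) between a training-time baseline and recent production values of one feature. PSI is a common drift statistic, used here purely as an illustration; it is not necessarily the exact calculation that the drift monitor performs.

```python
# Illustrative drift check: population stability index (PSI) between a
# training-time baseline and recent production values of one feature.
# A generic drift statistic, not the drift monitor's exact algorithm.
import numpy as np

def psi(baseline: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """PSI over shared bins; > 0.2 is a common 'significant drift' rule of thumb."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    b_frac = np.histogram(baseline, bins=edges)[0] / len(baseline)
    c_frac = np.histogram(current, bins=edges)[0] / len(current)
    # Floor empty bins at a small value to avoid log(0).
    b_frac = np.clip(b_frac, 1e-6, None)
    c_frac = np.clip(c_frac, 1e-6, None)
    return float(np.sum((c_frac - b_frac) * np.log(c_frac / b_frac)))

rng = np.random.default_rng(0)
baseline = rng.normal(50_000, 10_000, 5_000)  # e.g., applicant income at training time
current = rng.normal(55_000, 12_000, 5_000)   # production data has shifted
print(f"PSI: {psi(baseline, current):.3f}")   # flag for review when PSI > 0.2
```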
What you can use | What you can do | Best to use when |
---|---|---|
Watson OpenScale | Monitor model fairness issues across multiple features. Monitor model performance and data consistency over time. Explain how the model arrived at certain predictions with weighted factors. Maintain and report on model governance and lifecycle across your organization. | You have features that are protected or that might contribute to prediction fairness. You want to trace model performance and data consistency over time. You want to know why the model gives certain predictions.
Example: Golden Bank's model monitoring
Data scientists at Golden Bank use Watson OpenScale to monitor the deployed predictive model to ensure that it is accurate, fair, and explainable. They run a notebook to set up monitors for the model and then tweak the configuration by using the Watson OpenScale user interface. Using metrics from the Watson OpenScale quality monitor and fairness monitor, the data scientists determine how well the model predicts outcomes and whether it produces biased outcomes. They also gain insight into how the model reaches its decisions so that those decisions can be explained to the data analysts.
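A core statistic behind fairness checks like Golden Bank's is the disparate impact ratio: the favorable-outcome rate for a monitored group divided by the rate for a reference group. The sketch below computes it over made-up records; in production, Watson OpenScale derives such fairness metrics automatically from the deployment's scoring payloads once the fairness monitor is configured.

```python
# Illustrative fairness check: disparate impact ratio of favorable
# prediction rates between a monitored group and a reference group.
# The records below are made up for the example.
from dataclasses import dataclass

@dataclass
class Record:
    gender: str      # protected feature
    prediction: str  # "approved" is the favorable outcome

records = [
    Record("female", "approved"), Record("female", "denied"),
    Record("female", "approved"), Record("female", "denied"),
    Record("male", "approved"), Record("male", "approved"),
    Record("male", "approved"), Record("male", "denied"),
]

def favorable_rate(group: str) -> float:
    members = [r for r in records if r.gender == group]
    return sum(r.prediction == "approved" for r in members) / len(members)

ratio = favorable_rate("female") / favorable_rate("male")
print(f"disparate impact ratio: {ratio:.2f}")  # < 0.8 breaches the common 'four-fifths' threshold
```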
Tutorials for watsonx.governance
Tutorial | Description | Expertise for tutorial |
---|---|---|
Evaluate and track a prompt template | Evaluate a prompt template to measure the performance of a foundation model and track the prompt template through its lifecycle. | Use the evaluation tool and an AI use case to track the prompt template.
Evaluate a machine learning model | Deploy a model, configure monitors for the deployed model, and evaluate the model. | Run a notebook to configure the monitors and use Watson OpenScale to evaluate the model.
Evaluate a deployment in spaces | Deploy a model, configure monitors for the deployed model, and evaluate the model in a deployment space. | Configure the monitors and evaluate a model in a deployment space. |
Next Steps
- Build machine learning and generative AI models with watsonx.ai
- Scale AI workloads, for all your data, anywhere, with watsonx.data Presto
Learn more
- IBM watsonx overview
- watsonx.ai Studio overview
- watsonx.ai Runtime overview
- Watson OpenScale overview
- Videos
- Try out different use cases on a self-service site. Select a use case to experience a live application built with watsonx. Developers can access prompt selection and construction guidance, along with sample application code, to accelerate their projects.
Parent topic: Use cases