Managing the AI Lifecycle with MLOps
Watson Studio, Watson Machine Learning, Watson OpenScale, and IBM Knowledge Catalog provide the integrated tools that you need to manage MLOps for your organization. Use the tools to:
- Share data assets in a feature store (catalog).
- Track and govern models in a model inventory.
- Review model details in the associated factsheets.
- Monitor deployments for fairness and accuracy.
- Automate processes in a pipeline.
MLOps explained
MLOps synchronizes cadences between the application and model pipelines. It builds on these practices:
- DevOps for moving a machine learning model from creation through training and deployment to production.
- ModelOps for managing the lifecycle of a traditional machine learning model, including evaluation and retraining.
MLOps includes not just the routine deployment of machine learning models but also the continuous retraining, automated updating, and synchronized development and deployment of more complex machine learning models. Explore these resources for more details on developing an MLOps strategy:
- Data science and MLOps use case describes how to manage data, build and deploy models, and evaluate models for fairness and performance.
- AI Governance use case provides context for how ModelOps can mesh with AI Governance to provide a comprehensive plan for tracking machine learning assets in your organization.
ModelOps tools
Cloud Pak for Data provides these tools to help you with ModelOps:
- IBM Knowledge Catalog for storing and sharing data assets in a feature store for use in machine learning models.
- Pipelines for automating the end-to-end flow of a machine learning model through the AI lifecycle.
- AI Factsheets for creating a centralized repository of model factsheets to track the lifecycle of a model, including request, building, deployment, and evaluation of a machine learning model. Use AI Factsheets as part of your AI Governance strategy.
- Watson OpenScale for evaluating deployed models to make sure that deployed models meet thresholds set for fairness and accuracy.
- Cpdctl command-line interface tool for managing and automating your machine learning assets that are hosted on Cloud Pak for Data as a Service. Use automatic configuration from IBM Cloud to easily connect with the cpdctl API commands.
Managing access with deployment spaces
Use deployment spaces to organize and manage access to assets as they move through the AI lifecycle. For example, you can manage access with deployment spaces in the following ways:
- Create a deployment space and assign it to Development as the deployment stage. If you are governing assets, deployments in this type of space display in the Develop stage of a use case. Assign access to the data scientists to create the assets or DevOps users to create deployments.
- Create a deployment space and assign it to Testing as the deployment stage. If you are governing assets, deployments in this type of space display in the Validate stage of a use case. Assign access to the model validators to test the deployments.
- Create a deployment space and assign it to Production as the deployment stage. If you are governing assets, deployments in this type of space display in the Operate stage of a use case. Limit access to this space to ModelOps users who manage the assets that are deployed to a production environment.
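The stage-to-use-case mapping that the list describes can be captured in a few lines of Python. This is a plain illustration of the mapping, not a product API:

```python
# How deployment-space stages map to the use-case stage where their
# deployments appear (illustration only, not a Cloud Pak for Data API).
STAGE_MAP = {
    "Development": "Develop",
    "Testing": "Validate",
    "Production": "Operate",
}

def use_case_stage(space_stage):
    # look up the use-case stage for a deployment-space stage
    return STAGE_MAP[space_stage]

print(use_case_stage("Testing"))  # Validate
```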
Sharing data assets in a feature store
If your organization uses IBM Knowledge Catalog, a catalog can serve as a feature store. Data assets that are stored in a feature store contain features that you can use in machine learning models and share across your organization. Data assets include metadata about where they are used in models. You can control access to catalogs at the catalog level and at the data asset level. For more information, see Governing and curating data.
Automating ModelOps by using Pipelines
The IBM Watson Pipelines editor provides a graphical interface for orchestrating an end-to-end flow of assets from creation through deployment. Assemble and configure a pipeline to create, train, deploy, and update machine learning models and Python scripts. Make your ModelOps process simpler and repeatable. For more information, see IBM Watson Pipelines.
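The orchestration idea behind a pipeline can be sketched in plain Python: each stage consumes the previous stage's output, so the whole flow is repeatable from a single entry point. The stage functions below are hypothetical placeholders, not Watson Pipelines nodes:

```python
# Minimal sketch of pipeline-style orchestration. Each stage is a function
# that takes the previous stage's output; the stages are hypothetical
# placeholders, not Watson Pipelines APIs.

def prepare_data(source):
    # stand-in for a data-preparation node: drop missing rows
    return [row for row in source if row is not None]

def train_model(rows):
    # stand-in for a training node: "model" here is just a record
    return {"model": "demo", "trained_on": len(rows)}

def deploy(model):
    # stand-in for promoting the trained model to a deployment space
    return {"deployment": f"{model['model']}-online", "status": "ready"}

def run_pipeline(source):
    # chain the stages so the whole flow runs from one entry point
    return deploy(train_model(prepare_data(source)))

result = run_pipeline([1, 2, None, 3])
print(result)
```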
Tracking models with AI Factsheets
AI Factsheets provides the capabilities for you to track data science models across the organization and store the details in a catalog. View at a glance which models are in production and which need development or validation. Use the governance features to establish processes to manage the communication flow from data scientists to ModelOps administrators. For more information, see Model inventory and AI Factsheets.
Evaluating model deployments
Use IBM Watson OpenScale to analyze your AI with trust and transparency and understand how your AI models are involved in decision making. Detect and mitigate bias and drift. Increase the quality and accuracy of your predictions. Explain transactions and perform what-if analysis.
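To illustrate the kind of fairness check such an evaluation involves, the sketch below computes the disparate impact ratio (the favorable-outcome rate of a monitored group divided by that of a reference group) in plain Python. This is a simplified stand-in for the concept, not the Watson OpenScale API:

```python
# Simplified illustration of a fairness metric: disparate impact ratio.
# A ratio below a threshold (commonly 0.8) can flag potential bias.
# Plain Python only; this is not the Watson OpenScale API.

def favorable_rate(outcomes):
    # fraction of predictions that are the favorable outcome (True)
    return sum(outcomes) / len(outcomes)

def disparate_impact(monitored, reference):
    # ratio of favorable rates: monitored group vs. reference group
    return favorable_rate(monitored) / favorable_rate(reference)

monitored = [True, False, False, False]   # 25% favorable outcomes
reference = [True, True, False, False]    # 50% favorable outcomes
ratio = disparate_impact(monitored, reference)  # 0.25 / 0.50 = 0.50
print(f"disparate impact: {ratio:.2f}")
assert ratio < 0.8  # below the common threshold: flag for review
```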
Automating asset and lifecycle management
You can automate the AI Lifecycle in a notebook by using the Watson Machine Learning Python client. This sample notebook demonstrates how to:
- Download an externally trained scikit-learn model with its data set
- Persist an external model in the Watson Machine Learning repository
- Deploy a model for online scoring by using the client library
- Score sample records by using the client library
- Update a previously persisted model
- Redeploy a model in-place
- Scale a deployment
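A condensed sketch of the persist-deploy-score part of that flow might look as follows. The credentials, space ID, model object, and software specification are placeholders, and the exact meta-property names should be verified against your installed client version:

```python
# Sketch of persisting, deploying, and scoring a model with the Watson
# Machine Learning Python client. Credentials, space_id, model, and
# software_spec_uid are placeholders; verify meta-property names against
# your installed client version.

def deploy_and_score(credentials, space_id, model, software_spec_uid):
    from ibm_watson_machine_learning import APIClient  # external package

    client = APIClient(credentials)
    client.set.default_space(space_id)  # work inside a deployment space

    # persist the externally trained model in the repository
    model_details = client.repository.store_model(
        model=model,
        meta_props={
            client.repository.ModelMetaNames.NAME: "demo model",
            client.repository.ModelMetaNames.TYPE: "scikit-learn_1.1",
            client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: software_spec_uid,
        },
    )
    model_id = client.repository.get_model_id(model_details)

    # create an online deployment, then score one sample record
    deployment = client.deployments.create(
        model_id,
        meta_props={
            client.deployments.ConfigurationMetaNames.NAME: "demo deployment",
            client.deployments.ConfigurationMetaNames.ONLINE: {},
        },
    )
    deployment_id = client.deployments.get_id(deployment)
    return client.deployments.score(
        deployment_id, build_payload([[5.1, 3.5, 1.4, 0.2]])
    )

def build_payload(rows):
    # shape of the scoring payload that an online endpoint expects
    return {"input_data": [{"values": rows}]}
```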
Alternatively, you can use IBM Cloud Pak for Data Command-Line Interface (IBM cpdctl) to manage configuration settings and automate an end-to-end flow. This end-to-end flow includes training a model, saving it, creating a deployment space, and deploying the model. For more information, see IBM Cloud Pak for Data Command-Line Interface documentation. For an example of using cpdctl for exporting assets from a space, see Exporting space assets.
Typical ModelOps scenario
A typical ModelOps scenario in Cloud Pak for Data might be:
- Organize and curate data assets in a feature store
- Train a model by using AutoAI
- Save and deploy the model
- Track the model in a model inventory so that all collaborators can follow its progress through the lifecycle and make sure that it complies with organizational standards
- Evaluate the deployment for bias
- Update the deployment with a better-performing model
- Monitor deployments and jobs across the organization
More resources
- ModelOps Wikipedia article
- ModelOps blog post
- IBM blog post on using ModelOps to drive value from your AI investment
- See how IBM is addressing ModelOps.
Parent topic: Deploying and managing assets