Managing the AI lifecycle with ModelOps
Watson Studio, Watson Machine Learning, Watson OpenScale, and Watson Knowledge Catalog provide the integrated tools that you need to manage ModelOps for your organization. Use the tools to share data assets in a feature store (catalog), track and govern models in a model inventory, review model details in associated factsheets, monitor deployments for fairness and accuracy, and automate processes in a pipeline.
ModelOps explained
ModelOps synchronizes cadences between the application and model pipelines. It builds on these practices:
- DevOps for bringing an application from development, through testing, to production.
- MLOps for managing the lifecycle of a traditional machine learning model, including evaluation and retraining.
ModelOps extends MLOps to include not just the routine deployment of machine learning models but also the continuous retraining, automated updating, and synchronized development and deployment of more complex machine learning models. Explore these resources for additional details on developing a ModelOps strategy:
- Data science and MLOps use case describes how to manage data, operationalize model building and deployment, and evaluate model fairness and performance.
- AI governance use case provides context for how ModelOps can mesh with AI Governance to provide a comprehensive plan for tracking machine learning assets in your organization.
ModelOps tools
Cloud Pak for Data provides these tools to help you with ModelOps:
- Watson Knowledge Catalog for storing and sharing data assets in a feature store for use in machine learning models.
- Pipelines for automating the end-to-end flow of a machine learning model through the AI lifecycle.
- AI Factsheets for creating a centralized repository of model factsheets to track the lifecycle of a model, including request, building, deployment, and evaluation of a machine learning model. Use AI Factsheets as part of your AI Governance strategy.
- Watson OpenScale for evaluating deployed models to make sure that deployed models meet thresholds set for fairness and accuracy.
- cpdctl, a command-line interface tool for managing and automating your machine learning assets hosted on IBM Cloud Pak for Data as a Service (CPDaaS). Use automatic configuration from IBM Cloud to connect easily with the cpdctl API commands.
You can extend your model operations with multicloud offerings that optimize the use of machine learning models across clouds and enable integration through continuous integration and continuous deployment. For details, refer to Data Science for Multicloud ModelOps.
Sharing data assets in a feature store
If your organization uses Watson Knowledge Catalog, a catalog can serve as a feature store. Data assets that are stored in a feature store contain features that you can use in machine learning models and share across your organization. Data assets include metadata about where they are used in models. Catalogs have controlled access at the catalog and the data asset level. For details, refer to Governing and curating data.
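The feature-store pattern can be illustrated with a minimal sketch. This is a conceptual illustration in plain Python, not the Watson Knowledge Catalog API; all class and field names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class FeatureAsset:
    """A data asset in the feature store, with usage metadata."""
    name: str
    columns: list
    used_in_models: list = field(default_factory=list)

class FeatureStore:
    """A catalog of feature assets with per-asset access control."""
    def __init__(self):
        self._assets = {}
        self._acl = {}  # asset name -> set of users allowed to read it

    def publish(self, asset, allowed_users):
        self._assets[asset.name] = asset
        self._acl[asset.name] = set(allowed_users)

    def get(self, name, user):
        if user not in self._acl.get(name, set()):
            raise PermissionError(f"{user} cannot access {name}")
        return self._assets[name]

    def record_usage(self, name, model_id):
        # Track where the asset is used, so lineage is visible in the catalog
        self._assets[name].used_in_models.append(model_id)

store = FeatureStore()
store.publish(FeatureAsset("customer_churn_features", ["tenure", "monthly_charges"]),
              allowed_users={"data_scientist"})
store.record_usage("customer_churn_features", "churn_model_v1")
asset = store.get("customer_churn_features", "data_scientist")
print(asset.used_in_models)  # ['churn_model_v1']
```

The usage metadata recorded here corresponds to the lineage information a governed catalog maintains: given a data asset, you can see which models depend on it.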
Automating ModelOps by using Pipelines
The IBM Watson Pipelines editor provides a graphical interface for orchestrating an end-to-end flow of assets from creation through deployment. Assemble and configure a pipeline to create, train, deploy, and update machine learning models and Python scripts. Make your ModelOps process simpler and repeatable. For details, refer to IBM Watson Pipelines.
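An end-to-end flow of this kind can be sketched as a chain of steps that each pass their result downstream. The sketch below is plain Python, not the Watson Pipelines editor, and every stage name is hypothetical:

```python
def run_pipeline(steps, payload):
    """Run each pipeline step in order, passing results downstream."""
    for step in steps:
        payload = step(payload)
    return payload

# Hypothetical stages of an end-to-end model flow
def ingest(data):
    data["rows"] = [(x, 2 * x) for x in range(10)]
    return data

def train(data):
    # "Train" a trivial model: learn the slope from the rows
    xs, ys = zip(*data["rows"])
    data["model"] = {"slope": ys[1] / xs[1]}
    return data

def evaluate(data):
    slope = data["model"]["slope"]
    data["error"] = sum(abs(y - slope * x) for x, y in data["rows"])
    return data

def deploy(data):
    # Only promote the model if it met the quality threshold
    data["deployed"] = data["error"] == 0
    return data

result = run_pipeline([ingest, train, evaluate, deploy], {})
print(result["deployed"])  # True
```

A graphical pipeline editor adds scheduling, branching, and retries on top of this basic idea, but the repeatability comes from the same principle: every stage is an explicit, re-runnable step.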
Tracking models with AI Factsheets
AI Factsheets provides the capabilities for you to track data science models across the organization and store the details in a catalog. View at a glance which models are in production and which need development or validation. Use the governance features to establish processes to manage the communication flow from data scientists to ModelOps administrators. For details, refer to Model inventory and AI Factsheets.
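The model-inventory idea can be sketched in a few lines. This is a conceptual illustration, not the AI Factsheets API; the lifecycle stage names and class names are hypothetical:

```python
from dataclasses import dataclass, field

# Hypothetical lifecycle stages a tracked model moves through
STAGES = ["requested", "in development", "validated", "in production"]

@dataclass
class Factsheet:
    """A factsheet records lifecycle facts about one model."""
    model_name: str
    stage: str = "requested"
    history: list = field(default_factory=list)

    def advance(self, stage, note=""):
        if STAGES.index(stage) <= STAGES.index(self.stage):
            raise ValueError("lifecycle stages only move forward")
        self.history.append((self.stage, stage, note))
        self.stage = stage

class ModelInventory:
    """A catalog of factsheets, so stakeholders see all models at a glance."""
    def __init__(self):
        self.factsheets = {}

    def track(self, model_name):
        self.factsheets[model_name] = Factsheet(model_name)
        return self.factsheets[model_name]

    def in_production(self):
        return [f.model_name for f in self.factsheets.values()
                if f.stage == "in production"]

inventory = ModelInventory()
fs = inventory.track("churn_model")
fs.advance("in development", "AutoAI experiment started")
fs.advance("validated", "passed fairness review")
fs.advance("in production", "deployed to online scoring")
print(inventory.in_production())  # ['churn_model']
```

The `history` list plays the role of the audit trail: each stage transition is recorded with a note, which is the kind of communication flow between data scientists and ModelOps administrators that a governed inventory formalizes.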
Evaluating deployed models
Use IBM Watson OpenScale to analyze your AI with trust and transparency and understand how your AI models make decisions. Detect and mitigate bias and drift. Increase the quality and accuracy of your predictions. Explain transactions and perform what-if analysis.
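One common fairness check is the disparate impact ratio: the rate of favorable outcomes for a monitored group divided by the rate for the reference group. The sketch below computes it on toy data in plain Python; it is not the Watson OpenScale API, and the 80% threshold is a widely used rule of thumb rather than a fixed standard:

```python
def favorable_rate(outcomes):
    """Fraction of predictions that are the favorable outcome (1)."""
    return sum(outcomes) / len(outcomes)

def disparate_impact(monitored, reference):
    """Ratio of favorable-outcome rates between two groups."""
    return favorable_rate(monitored) / favorable_rate(reference)

# Toy predictions: 1 = favorable outcome (e.g. loan approved)
reference_group = [1, 1, 1, 0, 1, 1, 0, 1]   # 75% favorable
monitored_group = [1, 0, 0, 1, 0, 1, 0, 0]   # 37.5% favorable

ratio = disparate_impact(monitored_group, reference_group)
print(round(ratio, 2))   # 0.5
print(ratio >= 0.8)      # False -> flag possible bias for review
```

A monitoring service evaluates metrics like this continuously against live scoring traffic and alerts when a deployment drops below its configured thresholds.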
Automating asset and lifecycle management
You can automate the AI lifecycle in a notebook by using the Watson Machine Learning Python client. This sample notebook demonstrates how to:
- Download an externally trained scikit-learn model with dataset
- Persist an external model in Watson Machine Learning repository
- Deploy model for online scoring using the client library
- Score sample records using the client library
- Update a previously persisted model
- Redeploy a model in-place
- Scale a deployment
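The flow in the notebook steps above can be sketched with a toy repository and deployment object. This is plain Python to show the shape of the lifecycle, not the Watson Machine Learning Python client; all class and method names are hypothetical:

```python
class Repository:
    """Persist model versions, as a model repository would."""
    def __init__(self):
        self.models = {}

    def store(self, model_id, model):
        # Each store call appends a new version of the model
        self.models.setdefault(model_id, []).append(model)

    def latest(self, model_id):
        return self.models[model_id][-1]

class Deployment:
    """An online scoring endpoint backed by a stored model."""
    def __init__(self, model, replicas=1):
        self.model = model
        self.replicas = replicas

    def score(self, records):
        return [self.model(r) for r in records]

repo = Repository()
repo.store("m1", lambda x: x + 1)        # persist the initial model
dep = Deployment(repo.latest("m1"))      # deploy it for online scoring
print(dep.score([1, 2, 3]))              # [2, 3, 4]

repo.store("m1", lambda x: x * 2)        # update the persisted model
dep.model = repo.latest("m1")            # redeploy in place
dep.replicas = 3                         # scale the deployment
print(dep.score([1, 2, 3]))              # [2, 4, 6]
```

The point of the in-place redeploy is that the endpoint (`dep`) keeps serving while the model behind it is swapped, which is how scoring clients avoid changing their integration when a better model ships.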
Alternatively, using the IBM Cloud Pak for Data Command Line Interface (IBM cpdctl), you can manage configuration settings and automate an end-to-end flow that includes training a model, saving it, creating a deployment space, and deploying the model. For details, see the IBM Cloud Pak for Data Command Line Interface documentation. For an example of using cpdctl for exporting assets from a space, see Exporting space assets.
Typical ModelOps scenario
A typical ModelOps scenario in Cloud Pak for Data might be:
- Organize and curate data assets in a feature store
- Train a model by using AutoAI
- Save and deploy the model
- Track the model in a model inventory so that all collaborators can track the progress of the model through the lifecycle and ensure compliance with organizational standards
- Evaluate the deployment for bias
- Update the deployment with a better-performing model
- Monitor deployments and jobs across the organization
Additional resources
- ModelOps Wikipedia article
- Read the ModelOps blog post.
- IBM blog post on using ModelOps to drive value from your AI investment.
- See how IBM is addressing ModelOps.
Parent topic: Deploying and managing models