ModelOps use cases

By setting up a ModelOps process for your data, your company can benefit from a full end-to-end AI lifecycle that optimizes your data and AI investments.

Overview

Your company needs to ensure that data is collected and explored efficiently and that the AI models that use the data are properly built and governed. You need integrated systems and processes to manage data and model assets across the AI lifecycle.

With Cloud Pak for Data, your company can manage the full AI lifecycle from a single platform with integrated services that support the entire flow from collecting data all the way to monitoring your models in production.

Use Watson Studio services that support ModelOps to improve, simplify, and automate AI lifecycle operations and management. You can streamline and accelerate data collection and management, model development, model validation, and model deployment. You can operate trusted AI through ongoing model monitoring and retraining on an end-to-end unified data and AI platform, and then use the resulting predictions to decide on actions that address your company's needs.

Process

Watson Studio provides the tools you need to implement your ModelOps use case in Cloud Pak for Data.

Figure: The ModelOps process lets you collect, build, deploy, and monitor assets

Collect the data

Collecting and organizing data is an important step in building your automated AI pipeline. Data scientists create projects, and data engineers collect data and add it to the projects so it can be organized and refined. You can collect data from multiple sources and ensure that it is secure and accessible for use by the Cloud Pak for Data tools and services that support your ModelOps AI lifecycle. You can address policy, security, and compliance issues to help you govern the data that is collected before you analyze the data and use it in your AI models.

Watson Knowledge Catalog
What you can do: Create and catalog connections to diverse data sources from IBM Cloud, on-premises data services, and third-party data services. Create and catalog data assets, point to data sets that are accessible through connections, and upload files, such as CSV files, as data assets.
Best to use when: You need an inventory of data connections and data sets at the organizational level so that data scientists and analysts can work with the data for various projects. (A sketch of reading a cataloged asset in a notebook follows this table.)

Watson Query
What you can do: Create virtual data tables that combine, join, or filter data from various relational data sources. Make the resulting combined live data available as data assets in Watson Knowledge Catalog.
Best to use when: You need to query many data sources as one, combining live data from multiple sources to generate views as input for projects. For example, you can use the combined live data to feed dashboards, notebooks, and flows so that the data can be explored.

Data Refinery
What you can do: Access and refine data from diverse data source connections. Materialize the resulting data sets as snapshots in time that might combine, join, filter, mask, or anonymize data to make it usable for data scientists to analyze and explore. Make the resulting data sets available to the project in Watson Knowledge Catalog.
Best to use when: You need to simplify the preparation of large amounts of raw data for analysis: access, join, or filter data, and materialize the results as data assets that represent a point in time, for use as input for analysis or model training.

DataStage
What you can do: Deploy ready-to-use, built-in business operations for your data flows. Handle large volumes of data and complex data.
Best to use when: You need to quickly design and run accurate data flows by using an intuitive interface that connects to a wide range of data sources. You can integrate and transform data, and deliver it to your target system in batch or real time.
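
For example, after a data engineer catalogs a data asset in a project, a data scientist can read it from a notebook. The following is a minimal sketch that assumes the project_lib library that is available inside Watson Studio notebooks; the project ID, access token, and the asset name customers.csv are placeholders for your own project details.

import pandas as pd
from project_lib import Project  # available inside Watson Studio notebooks

# Placeholder credentials; generate a real access token from the
# project's Settings page.
project = Project(project_id="<project-id>",
                  project_access_token="<access-token>")

# get_file returns a file-like object for a data asset stored in the project
df = pd.read_csv(project.get_file("customers.csv"))
print(df.head())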

Build the models

To get predictive insights from the data that you collected, refined, and analyzed, the next step is to build and train models. Data scientists use Watson Studio tools to build the AI models, ensuring that the right algorithms and optimizations are used to make predictions that help solve business problems.

AutoAI
What you can do: Automatically select algorithms, engineer features, generate candidate pipelines, and train models on those pipelines, and then evaluate and rank the resulting models and pipelines. Deploy the models that you like to a space, or export the model training pipeline that you like from AutoAI into a notebook to refine it.
Best to use when: You want an advanced and automated way to build a good set of training pipelines and models quickly, and you want to be able to export the generated pipelines to refine them.

Notebooks
What you can do: Write your own feature engineering, model training, and evaluation code in Python, based on training data sets that are available in the project or on connections to data sources such as databases, data lakes, or object storage. Use your favorite algorithms and libraries.
Best to use when: You want to use coding skills to have full control over the code that creates, trains, and evaluates the models. (See the sketch after this table.)

SPSS Modeler flows
What you can do: Create your own model training, evaluation, and scoring flows based on training data sets that are available in the project or on connections to data sources such as databases, data lakes, or object storage. Build flows to prepare and blend data, build and manage models, and visualize the results.
Best to use when: You want a simple way to explore data and define model training, evaluation, and scoring flows.
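
As an illustration of the notebook approach, the following sketch trains and evaluates a scikit-learn classifier. The synthetic data set stands in for training data collected in your project, and the algorithm choice is an example, not a recommendation.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data as a stand-in for a project data asset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train a candidate model and evaluate it on the holdout split
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print("holdout accuracy:", accuracy_score(y_test, model.predict(X_test)))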

Deploy the models

When operations team members deploy your AI models, the models become available to applications for scoring and predictions that help drive actions.

Spaces
What you can do: Deploy models and other assets from projects to spaces.
Best to use when: You want to deploy models and other assets to test or production environments by using a simple user interface.

Watson Machine Learning Python client
What you can do: Create, deploy, and view the results of detailed models. Test and deploy your models as APIs.
Best to use when: You want to use the client for development and automation: configure data, add your machine learning engine, and select and monitor deployments. (See the sketch after this table.)
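
The following sketch shows the general shape of deploying and scoring a model with the Watson Machine Learning Python client (the ibm_watson_machine_learning package). The credentials, space ID, model type, and software specification strings are placeholders that depend on your account and runtime; check the client documentation for the values that match your environment.

from ibm_watson_machine_learning import APIClient

# Placeholder credentials and deployment space
client = APIClient({"url": "https://us-south.ml.cloud.ibm.com",
                    "apikey": "<api-key>"})
client.set.default_space("<space-id>")

# Store a trained model (for example, the scikit-learn model from the
# previous sketch); TYPE and SOFTWARE_SPEC_UID depend on your runtime.
model_details = client.repository.store_model(
    model=model,
    meta_props={
        client.repository.ModelMetaNames.NAME: "sample model",
        client.repository.ModelMetaNames.TYPE: "scikit-learn_1.1",
        client.repository.ModelMetaNames.SOFTWARE_SPEC_UID:
            client.software_specifications.get_uid_by_name(
                "runtime-22.2-py3.10"),
    },
)
model_uid = client.repository.get_model_uid(model_details)

# Create an online deployment so applications can score through a REST API
deployment = client.deployments.create(
    model_uid,
    meta_props={
        client.deployments.ConfigurationMetaNames.NAME: "sample deployment",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
    },
)
deployment_uid = client.deployments.get_uid(deployment)

# Score a payload against the deployed model
predictions = client.deployments.score(
    deployment_uid,
    {"input_data": [{"values": [[0.1] * 10]}]},
)
print(predictions)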

Monitor the models

After models are deployed, it is important to govern and monitor them to make sure that they are explainable and transparent. Data scientists need to be able to explain how the models arrive at certain predictions so that they can determine whether the predictions have any implicit or explicit bias. In addition, it's a best practice to watch for model performance and data consistency issues during the lifecycle of the model.

Watson OpenScale
What you can do: Monitor model fairness issues across multiple features. Monitor model performance and data consistency over time. Explain how the model arrived at certain predictions with weight factors. Maintain and report on model governance and lifecycle across your organization.
Best to use when: You have features that are protected or that might contribute to prediction fairness, you need to trace model performance and data consistency over time, or you need to know why the model gives certain predictions. (See the illustration after this table.)
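
To make the fairness idea concrete, the following sketch computes a disparate impact ratio over scored records with plain pandas. It illustrates the kind of metric that Watson OpenScale automates; it is not the OpenScale API, and the data, the monitored group, and the 0.8 threshold (the common four-fifths rule) are all examples.

import pandas as pd

# Example scored records; in practice these come from the deployed model
scored = pd.DataFrame({
    "gender": ["F", "M", "F", "M", "F", "M", "F", "M"],
    "prediction": ["approved", "approved", "denied", "approved",
                   "denied", "approved", "approved", "approved"],
})

# Rate of favorable outcomes for the monitored group vs. the reference group
favorable = scored["prediction"] == "approved"
rate_monitored = favorable[scored["gender"] == "F"].mean()
rate_reference = favorable[scored["gender"] == "M"].mean()

disparate_impact = rate_monitored / rate_reference
print(f"disparate impact ratio: {disparate_impact:.2f}")
if disparate_impact < 0.8:  # four-fifths rule threshold, as an example
    print("Potential fairness issue: the monitored group receives "
          "favorable outcomes less often")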

Learn more

For a full ModelOps scenario, read the article Analyzing credit risk with IBM Cloud Pak for Data on Red Hat OpenShift about putting ModelOps into practice. Note that the article is written for training and deploying on Cloud Pak for Data, so Step 1: Set up IBM Cloud Pak for Data on Red Hat OpenShift does not apply to Cloud Pak for Data as a Service. The steps to manage the data and then train, deploy, and monitor the model are all similar on Cloud Pak for Data as a Service.

Parent topic: Using ModelOps to manage the AI lifecycle