For proper deployment, you must set up a deployment space and then select and configure a specific deployment type. After you deploy assets, you can manage and update them to make sure they perform well and to monitor their accuracy.
To be able to deploy assets from a space, you must have a machine learning service instance that is provisioned and associated with that space.
Online and batch deployments provide simple ways to create an online scoring endpoint or do batch scoring with your models.
If you want to implement custom logic:
Create a Python function to use for creating your online endpoint
Write a notebook or script for batch scoring
Note: If you create a notebook or a script to perform batch scoring, the asset runs as a platform job, not as a batch deployment.
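As an illustration of the Python function option, the following is a minimal sketch of a deployable function. It assumes the common convention of an outer function that returns an inner `score` function, and a payload shaped as `input_data` entries with `fields` and `values`. The names `deployable_function`, `params`, and `multiplier` are hypothetical, and the arithmetic stands in for real custom logic such as calling a model.

```python
def deployable_function(params=None):
    """Outer function: assumed to run once, when the deployment starts."""
    # Hypothetical setting; real code might load a model or open a connection here.
    settings = {"multiplier": 2}
    if params:
        settings.update(params)

    def score(payload):
        """Inner function: assumed to run once per scoring request."""
        predictions = []
        for batch in payload["input_data"]:
            # Placeholder custom logic: scale every numeric value in every row.
            values = [
                [value * settings["multiplier"] for value in row]
                for row in batch["values"]
            ]
            predictions.append(
                {"fields": batch.get("fields", []), "values": values}
            )
        return {"predictions": predictions}

    return score
```

You can call the returned `score` function locally with a sample payload to check the logic before you store and deploy the function.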
Deployable assets
Following is the list of assets that you can deploy from a watsonx.ai Runtime space, with information on applicable deployment types:
List of assets that you can deploy

| Asset type | Batch deployment | Online deployment |
|------------|------------------|-------------------|
| Functions  | Yes              | Yes               |
| Models     | Yes              | Yes               |
| Scripts    | Yes              | No                |
Notes:
A deployment job is a way of running a batch deployment or a self-contained asset, such as a flow, in watsonx.ai Runtime. You can select the input and output for your job and choose to run it manually or on a schedule. For more information, see Creating a deployment job.
You can deploy a Natural Language Processing model by using Python functions or Python scripts. Both online and batch deployments are supported.
Notebooks and flows use notebook environments. You can run them in a deployment space, but they are not deployable.
After you deploy assets, you can manage and update them to make sure they perform well and to monitor their accuracy. Some ways to manage or update a deployment are as follows:
Manage deployment jobs. After you create one or more jobs, you can view and manage them from the Jobs tab of your deployment space.
Update a deployment. For example, you can replace a model with a better-performing version without having to create a new deployment.
Scale a deployment to increase availability and throughput by creating replicas of the deployment.
Configuring API gateways to provide stable endpoints
watsonx.ai Runtime provides stable endpoints to prevent downtime. However, you might experience downtime if you move to a new Cloud Pak for Data instance or add an instance.
API gateways provide a stable URL that can be used with your Watson Machine Learning API endpoint. You can use an API gateway (available in Cloud Pak for Integration) with your deployment endpoints to handle downtime if it happens in the following cases:
If you have more than one instance of Cloud Pak for Data in a high-availability configuration, and one of the available instances fails. In this case, you can use an API gateway to switch automatically to another instance, which prevents complete failure.
If you have more than one application that uses the same endpoint, and the deployment endpoint is not available (for example, because you accidentally deleted the deployment). In this case, you can update the endpoint in the API gateway so that the applications can continue to use the same gateway URL.
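The switching behavior in the first case can be sketched in a few lines of Python. This is only a client-side illustration of the failover idea, not how an API gateway in Cloud Pak for Integration is configured. The function `score_with_failover`, the `send` transport callable, and the example URLs are all hypothetical.

```python
from typing import Callable, Sequence


def score_with_failover(
    endpoints: Sequence[str],
    payload: dict,
    send: Callable[[str, dict], dict],
) -> dict:
    """Try each endpoint in order and return the first successful response."""
    errors = []
    for url in endpoints:
        try:
            return send(url, payload)
        except Exception as exc:  # a real gateway would distinguish error types
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all endpoints failed: " + "; ".join(errors))


def fake_send(url: str, payload: dict) -> dict:
    """Stand-in transport that simulates an unavailable primary instance."""
    if "primary" in url:
        raise ConnectionError("instance unavailable")
    return {"served_by": url, "predictions": []}


# Hypothetical endpoint URLs for two instances.
ENDPOINTS = [
    "https://primary.example.com/score",
    "https://secondary.example.com/score",
]

result = score_with_failover(ENDPOINTS, {"input_data": []}, fake_send)
print(result["served_by"])
```

With a gateway in place, applications call one stable URL and this kind of routing decision moves out of the application code.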
Enabling GPU and MIG support for deployment runtimes
If you are deploying a predictive machine learning model that requires significant processing power for inferencing, you can optionally configure a GPU for deployment runtimes.
You can also enable MIG support for GPUs when you want to deploy an application that does not require the full power of an entire GPU. If you are configuring MIG for GPU-accelerated workloads, all GPU-enabled nodes should adhere to a single strategy determined in the prior configuration steps. This ensures consistent behaviour across all GPU-enabled nodes in the cluster. To configure MIG support, see Nvidia Guide for configuring MIG support.