Decision Optimization uses watsonx.ai Runtime asynchronous APIs to enable jobs to be run in parallel.
To solve a problem, you can create a new job from the model deployment and associate data with it.
See Deployment steps and the REST API example. You are not charged for deploying a model;
only the solving of a model with data is charged, based on the running time.
To solve more than one job at a time, specify more than one node when you create your
deployment. For example, in the REST API example, increase the number of nodes by changing the value of the
nodes property: "nodes" : 1.
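A deployment payload for the REST API might then look like the following sketch. Apart from the nodes property quoted above, the field names, placeholder values, and the placement of nodes under hardware_spec are assumptions; check the deployments API reference for the exact schema:

```json
{
  "name": "my-do-deployment",
  "space_id": "<space-id>",
  "asset": { "id": "<model-id>" },
  "hardware_spec": {
    "name": "S",
    "nodes": 2
  },
  "batch": {}
}
```

With "nodes" : 2, up to two solve jobs can run in parallel on this deployment.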
PODs (nodes)
When a job is created and
submitted, how it is handled depends on the current configuration and jobs that are running for the
watsonx.ai Runtime instance. This process is shown in the
following diagram.
1. The new job is sent to the queue.
2. If a POD is started but idle (not running a job), it immediately begins processing the job.
3. Otherwise, if the maximum number of nodes is not reached, a new POD is started (starting a POD can take a few seconds). The job is then assigned to this new POD for processing.
4. Otherwise, the job waits in the queue until one of the running PODs has finished and can pick up the waiting job.
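The dispatch rules above can be sketched as a small simulation. This is illustrative code, not IBM's implementation; it only mirrors the three rules (reuse an idle POD, start a new POD if the maximum is not reached, otherwise queue the job):

```python
# Illustrative simulation of the dispatch rules described above.
from collections import deque

class Dispatcher:
    def __init__(self, max_nodes):
        self.max_nodes = max_nodes
        self.idle = 0         # started PODs with no job
        self.busy = 0         # PODs currently solving
        self.queue = deque()  # jobs waiting for a POD

    def submit(self, job):
        if self.idle > 0:                           # rule 2: reuse an idle POD
            self.idle -= 1
            self.busy += 1
            return "started on idle POD"
        if self.idle + self.busy < self.max_nodes:  # rule 3: start a new POD
            self.busy += 1
            return "started on new POD"
        self.queue.append(job)                      # rule 4: wait in the queue
        return "queued"

    def finish_one(self):
        # A running POD completes; it picks up a waiting job if one exists.
        self.busy -= 1
        if self.queue:
            self.queue.popleft()
            self.busy += 1
        else:
            self.idle += 1

d = Dispatcher(max_nodes=2)
print(d.submit("j1"))  # started on new POD
print(d.submit("j2"))  # started on new POD
print(d.submit("j3"))  # queued
d.finish_one()         # j1 finishes; j3 leaves the queue
print(len(d.queue))    # 0
```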
The configuration of PODs of each size is as follows:

Table 1. T-shirt sizes for Decision Optimization

Definition          Name   Description
2 vCPU and 8 GB     S      Small
4 vCPU and 16 GB    M      Medium
8 vCPU and 32 GB    L      Large
16 vCPU and 64 GB   XL     Extra Large
For all configurations, 1 vCPU and 512 MB are reserved for internal use.
In addition to the solve time, the pricing depends on the selected size through a multiplier.
In the deployment configuration, you can also set the maximum number of nodes to be used.
Idle PODs are automatically stopped after some timeout. If a new job is submitted when
no PODs are up, it takes some time (approximately 30 seconds) for the POD to restart.
Run-time-based pricing (CUH)
Only the job solve time is charged: the idle time for PODs is not charged.
Depending on the size of the POD used, a different multiplier is applied to compute the
number of Capacity Unit Hours (CUH) consumed.
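The arithmetic is: CUH consumed = solve time (in hours) × size multiplier. The multiplier values below are placeholders for illustration, not IBM's published rates; check the pricing documentation for the actual figures:

```python
# Illustrative CUH arithmetic. Only the solve time is billed;
# POD idle time is not charged.
MULTIPLIER = {"S": 1, "M": 2, "L": 4, "XL": 8}  # assumed values, for illustration

def cuh_used(solve_seconds, size):
    """CUH = solve time in hours x the multiplier for the POD size."""
    return (solve_seconds / 3600.0) * MULTIPLIER[size]

# A 30-minute solve on a medium (M) POD:
print(cuh_used(1800, "M"))  # -> 1.0
```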
REST API example
For the full procedure of deploying a model and links to the Swagger documentation, see REST API example.
Python API example
In addition to the REST APIs, a Python API is provided with watsonx.ai Runtime so that you can easily create, deploy, and use a Decision Optimization model from a Python notebook.
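A solve job submitted through the Python client might follow the shape sketched below. The meta-prop names and client calls shown in the comments follow the general pattern of the watsonx.ai Runtime Python client but are assumptions here; verify them against the current client reference before use:

```python
# Hedged sketch of a Decision Optimization solve job payload.
# The inline data layout below (id/values pairs) is an assumption
# modeled on the client's documented pattern.

def build_solve_payload(input_rows, output_name="solution.csv"):
    """Build illustrative inline input/output data for a DO solve job."""
    return {
        "input_data": [{"id": "diet_food.csv", "values": input_rows}],
        "output_data": [{"id": output_name}],
    }

# With a configured client (credentials and space setup omitted), the flow
# would look roughly like this -- method and meta-prop names are assumptions:
#
#   from ibm_watsonx_ai import APIClient
#   client = APIClient(credentials, space_id=space_id)
#   payload = build_solve_payload(rows)
#   job = client.deployments.create_job(deployment_id, meta_props={
#       client.deployments.DecisionOptimizationMetaNames.INPUT_DATA:
#           payload["input_data"],
#       client.deployments.DecisionOptimizationMetaNames.OUTPUT_DATA:
#           payload["output_data"],
#   })

payload = build_solve_payload([["food", "qty"], ["bread", 2]])
print(sorted(payload))  # -> ['input_data', 'output_data']
```

The job runs asynchronously, so the client polls the job state until the solve completes.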