Learn how usage for machine learning assets is measured in capacity unit hours (CUH).
watsonx.ai Runtime compute usage and pricing
watsonx.ai Runtime compute usage is calculated by the number of capacity unit hours (CUH) consumed by an active machine learning instance. The rate of capacity units per hour consumed is determined by the computing requirements of your machine
learning assets and models. For example, a model with a large, complex data set will consume more training resources than a model with a smaller, simpler data set. Note that scaling a deployment to support more concurrent users and requests
also increases CUH consumption.
Tip: Because there are so many variables that affect resource consumption for a deployment, the recommended practice is to run tests on your models and deployments to analyze CUH consumption.
For all plans:
Capacity-unit-hour (CUH) rate consumption for training is based on training tool, hardware specification, and runtime environment.
Capacity-unit-hour (CUH) rate consumption for deployment is based on deployment type, hardware specification, and software specification.
watsonx.ai Runtime limits the number of deployment jobs retained for each deployment space. If you exceed your limit,
you cannot create new deployment jobs until you delete existing jobs or upgrade your plan. By default, job metadata is automatically deleted after 30 days. You can override this value when you create a job. See Managing jobs.
Time to idle is the length of time that a deployment is considered active between scoring requests. If a deployment receives no scoring requests for that duration, it is treated as inactive, or idle, and billing stops for all frameworks
other than SPSS.
A plan allows for at least the stated rate limit, and the actual rate limit can be higher than the stated limit. For example, the Lite plan might process more than 2 requests per second without issuing an error. If you have a paid plan and
believe you are reaching the rate limit in error, contact IBM Support for assistance.
Compute time is calculated to the millisecond. However, there is a one-minute minimum for each distinct operation. That is, a training run that takes 12 seconds is billed as one minute toward the capacity unit hour quota, while a training
run that takes 83.555 seconds is billed exactly as calculated.
The way that online deployments consume capacity units is based on framework. For some frameworks, CUH is charged for the number of hours the deployment asset is active in a deployment space. For example, SPSS models in online deployment mode
that run 24 hours a day for seven days a week consume CUH and are charged for that period. There is no idle time for an active online deployment. For other frameworks, CUH is charged according to scoring duration. See the CUH consumption
table for details on how CUH is calculated.
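The one-minute minimum described above can be sketched in a few lines. This is a minimal illustration, not IBM's billing implementation: the function names and the `cuh_rate` parameter are hypothetical, and the sketch assumes the minimum applies once per operation on a single node.

```python
# Hypothetical helpers illustrating the one-minute billing minimum.
MIN_BILLED_SECONDS = 60.0  # one-minute minimum for each distinct operation


def billed_seconds(duration_seconds: float) -> float:
    """Return the billed duration: the measured time, but never less than one minute."""
    return max(duration_seconds, MIN_BILLED_SECONDS)


def billed_cuh(duration_seconds: float, cuh_rate: float) -> float:
    """Convert the billed duration to CUH for a given rate (capacity units per hour)."""
    return billed_seconds(duration_seconds) / 3600.0 * cuh_rate


print(billed_seconds(12))      # 60.0 (rounded up to the minimum)
print(billed_seconds(83.555))  # 83.555 (billed as measured)
```

Under these assumptions, a 12-second training run on a capacity type with a rate of 1 would count as 1/60 of a CUH, the same as a run that takes a full minute.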
CUH consumption rates by asset type
Table 3. CUH consumption rates by asset type
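| Asset type | Capacity type | Capacity units per hour |
| --- | --- | --- |
| AutoAI experiment | 8 vCPU and 32 GB RAM | 20 |
| Decision Optimization training | 2 vCPU and 8 GB RAM | 6 |
| Decision Optimization training | 4 vCPU and 16 GB RAM | 7 |
| Decision Optimization training | 8 vCPU and 32 GB RAM | 9 |
| Decision Optimization training | 16 vCPU and 64 GB RAM | 13 |
| Decision Optimization deployments | 2 vCPU and 8 GB RAM | 30 |
| Decision Optimization deployments | 4 vCPU and 16 GB RAM | 40 |
| Decision Optimization deployments | 8 vCPU and 32 GB RAM | 50 |
| Decision Optimization deployments | 16 vCPU and 64 GB RAM | 60 |
| Machine Learning models (training, evaluating, or scoring) | 1 vCPU and 4 GB RAM | 0.5 |
| Machine Learning models (training, evaluating, or scoring) | 2 vCPU and 8 GB RAM | 1 |
| Machine Learning models (training, evaluating, or scoring) | 4 vCPU and 16 GB RAM | 2 |
| Machine Learning models (training, evaluating, or scoring) | 8 vCPU and 32 GB RAM | 4 |
| Machine Learning models (training, evaluating, or scoring) | 16 vCPU and 64 GB RAM | 8 |
| Tuning Studio (watsonx only) | NVIDIA A100 80GB GPU | 43 |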
CUH consumption by deployment and framework type
CUH consumption is calculated using these formulas:
Deployment type
Framework
CUH calculation
Online
AutoAI, AI function, AI service, SPSS, Scikit-Learn custom libraries, TensorFlow, RShiny
For example, consider a Decision Optimization batch deployment job that runs for 15 minutes (0.25 hours) on 2 nodes, each with 2 vCPU and 8 GB RAM. This capacity type has a CUH rate of 30, so every time the job runs it consumes
0.25 * 2 * 30 = 15 CUH.
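The arithmetic in this example can be reproduced with a short sketch. The function name `job_cuh` and its parameters are hypothetical; the calculation mirrors the worked example (duration in hours, multiplied by the node count and the CUH rate) and is not an official formula.

```python
def job_cuh(duration_minutes: float, nodes: int, cuh_rate: float) -> float:
    """Estimate CUH for one job run: hours * number of nodes * capacity units per hour."""
    hours = duration_minutes / 60
    return hours * nodes * cuh_rate


# Decision Optimization batch job: 15 minutes, 2 nodes, 2 vCPU and 8 GB RAM (rate 30)
print(job_cuh(15, 2, 30))  # 15.0
```

Swapping in a different rate from the tables, such as 40 for 4 vCPU and 16 GB RAM, gives a quick estimate for other configurations before you run tests on the real deployment.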
These tables show the capacity units per hour calculation for predefined machine learning environments, by usage type.
Capacity units per hour for training, evaluating, or scoring models
| Capacity type | Capacity units per hour |
| --- | --- |
| Extra small: 1 vCPU and 4 GB RAM | 0.5 |
| Small: 2 vCPU and 8 GB RAM | 1 |
| Medium: 4 vCPU and 16 GB RAM | 2 |
| Large: 8 vCPU and 32 GB RAM | 4 |
| Extra large: 16 vCPU and 64 GB RAM | 8 |
Capacity units per hour for AutoAI experiments
| Capacity type | Capacity units per hour |
| --- | --- |
| 8 vCPU and 32 GB RAM | 20 |
Note:
AutoAI experiments for retrieval-augmented generation (RAG) solutions also consume tokens for embedding documents and inferencing the foundation models. For generative AI billing details, see Billing details for generative AI assets.
Capacity units per hour for Decision Optimization experiments
These plans apply to Decision Optimization experiments run in watsonx.ai Studio.
| Capacity type | Capacity units per hour |
| --- | --- |
| Decision Optimization: 2 vCPU and 8 GB RAM | 6 |
| Decision Optimization: 4 vCPU and 16 GB RAM | 7 |
| Decision Optimization: 8 vCPU and 32 GB RAM | 9 |
| Decision Optimization: 16 vCPU and 64 GB RAM | 13 |
Capacity units per hour for Decision Optimization in watsonx.ai Runtime
These plans apply to Decision Optimization deployed and run from watsonx.ai Runtime.
| Capacity type | Capacity units per hour |
| --- | --- |
| Decision Optimization: 2 vCPU and 8 GB RAM | 30 |
| Decision Optimization: 4 vCPU and 16 GB RAM | 40 |
| Decision Optimization: 8 vCPU and 32 GB RAM | 50 |
| Decision Optimization: 16 vCPU and 64 GB RAM | 60 |
Monitoring resource usage
You can track resource usage for assets you own or collaborate on in a project or space. If you are an account owner or administrator, you can track CUH for an entire account. For more information, see Monitoring account resource usage.
You can track the runtime usage for an account on the Environment Runtimes page if you are the IBM Cloud account owner or administrator or the watsonx.ai Runtime service owner. For more information, see Monitoring resources.
Tracking CUH consumption for machine learning in a notebook
To calculate capacity unit hours in a notebook, use:
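The following is a minimal sketch of such a calculation, under stated assumptions. It assumes that the service instance details (for example, the dictionary returned by the watsonx.ai Python client's `client.service_instance.get_details()`) report current usage in capacity-unit-milliseconds under `entity.usage.capacity_units.current`. The field path, the unit, and the helper name `current_cuh` are assumptions to verify against the output of your own client.

```python
MS_PER_HOUR = 3600 * 1000  # assumption: usage is reported in capacity-unit-milliseconds


def current_cuh(details: dict) -> float:
    """Convert the current capacity-unit usage in a details dictionary to CUH.

    The key path below is an assumption about the shape of the service
    instance details; inspect your own details dictionary to confirm it.
    """
    current = details["entity"]["usage"]["capacity_units"]["current"]
    return current / MS_PER_HOUR


# Stand-in for real service instance details, shaped as assumed above.
sample_details = {"entity": {"usage": {"capacity_units": {"current": 7_200_000}}}}
print(current_cuh(sample_details))  # 2.0
```

In a notebook, you would pass the details dictionary from an authenticated client instead of `sample_details`.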