Compute resource options for the notebook editor in projects
Last updated: Nov 27, 2024

When you run a notebook in the notebook editor in a project, you choose an environment template, which defines the compute resources for the runtime environment. The environment template specifies the type, size, and power of the hardware configuration, plus the software configuration. For notebooks, each environment template includes a supported language, either Python or R.

Types of environments

You can use these types of environments for running notebooks:

  • Anaconda CPU environments
  • Spark clusters
  • GPU environments

Most environment types for notebooks have default environment templates so you can get started quickly. Otherwise, you can create custom environment templates.

Environment types for notebooks
Environment type Default templates Custom templates
Anaconda CPU
Spark clusters
GPU

Runtime releases

The default environments for notebooks are associated with a runtime release. Their names are prefixed with Runtime, followed by the release year and release version, for example Runtime 24.1.

A runtime release specifies a list of key data science libraries and a language version, for example Python 3.10. All environments of a runtime release are built based on the library versions defined in the release, thus ensuring the consistent use of data science libraries across all data science applications.

A Runtime 24.1 release exists for Python and for R, each with its own language version.

Note:

IBM Runtime 23.1 is constricted, which means that its use is restricted. Beginning November 21, 2024, you cannot create new notebooks or custom environments by using 23.1 runtimes. Also, you cannot create new deployments with software specifications that are based on the 23.1 runtime. To ensure a seamless experience and to use the latest features and improvements, switch to IBM Runtime 24.1. This change applies to watsonx.ai Studio on Cloud Pak for Data as a Service and IBM watsonx as a Service.

While a runtime release is supported, IBM will update the library versions to address security requirements. Note that these updates will not change the <Major>.<Minor> versions of the libraries, but only the <Patch> versions. This ensures that your notebook assets will continue to run.
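
As a rough illustration of the versioning rule above, the following sketch checks whether two library versions differ only in their patch component. The helper name `same_major_minor` is hypothetical and not part of any IBM library, and it assumes plain `<Major>.<Minor>.<Patch>` version strings.

```python
def same_major_minor(old_version: str, new_version: str) -> bool:
    """Return True when both versions share the same <Major>.<Minor> prefix."""
    old_parts = old_version.split(".")
    new_parts = new_version.split(".")
    # Compare only the first two components; the patch component may differ.
    return old_parts[:2] == new_parts[:2]


print(same_major_minor("1.26.4", "1.26.5"))  # True: patch-only change
print(same_major_minor("1.26.4", "2.0.0"))   # False: major version changed
```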

Library packages included in Runtimes

For specific versions of popular data science library packages included in watsonx.ai Studio runtimes, refer to these tables:

Table 1. Packages and their versions in the various Runtime releases for Python
Library Runtime 24.1 on Python 3.11
Keras 2.14.0
Lale 0.8.x
LightGBM 4.2.0
NumPy 1.26.4
ONNX 1.16
ONNX Runtime 1.16.3
OpenCV 4.8.1
pandas 2.1.4
PyArrow 15.0.1
PyTorch 2.1.2
scikit-learn 1.3.0
SciPy 1.11.4
SnapML 1.14.6
TensorFlow 2.14.1
XGBoost 2.0.3
Table 2. Packages and their versions in the various Runtime releases for R
Library Runtime 24.1 on R 4.3
arrow 15.0
car 3.1
caret 6.0
catools 1.18
forecast 8.21
ggplot2 3.4
glmnet 4.1
hmisc 5.1
keras 2.13
lme4 1.1
mvtnorm 1.2
pandoc 2.12
psych 2.3
python 3.11
randomforest 4.7
reticulate 1.34
sandwich 3.0
scikit-learn 1.3
spatial 7.3
tensorflow 2.15
tidyr 1.3
xgboost 1.7

In addition to the libraries listed in the tables, runtimes include many other useful libraries. To see the full list, select the Manage tab in your project, then click Templates, select the Environments tab, and then click on one of the listed environments.
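
You can also check versions from inside a running notebook. The following is a minimal sketch that uses the standard `importlib.metadata` module; the helper name `installed_versions` is hypothetical, and the distribution names passed to it are assumptions that can differ from the display names in the tables.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names below are assumptions chosen for illustration.
for name, ver in installed_versions(["numpy", "pandas", "scikit-learn"]).items():
    print(f"{name}: {ver if ver else 'not installed'}")
```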

CPU environment templates

You can select any of the following default CPU environment templates for notebooks. The default environment templates are listed under Templates on the Environments page on the Manage tab of your project.

DO Indicates that the environment template includes the CPLEX and DOcplex libraries to model and solve decision optimization problems that exceed the complexity that is supported by the Community Edition of the libraries in the other default Python environments. See Decision Optimization notebooks.

NLP Indicates that the environment template includes the Watson Natural Language Processing library with pre-trained models for language processing tasks that you can run on unstructured data. See Using the Watson Natural Language Processing library. This default environment should be large enough to run the pre-trained models.

Default CPU environment templates for notebooks
Name | Hardware configuration | CUH rate per hour
Runtime 24.1 on Python 3.11 XXS | 1 vCPU and 4 GB RAM | 0.5
Runtime 24.1 on Python 3.11 XS | 2 vCPU and 8 GB RAM | 1
Runtime 24.1 on Python 3.11 S | 4 vCPU and 16 GB RAM | 2
NLP + DO Runtime 24.1 on Python 3.11 XS | 2 vCPU and 8 GB RAM | 6
Runtime 24.1 on R 4.3 S | 4 vCPU and 16 GB RAM | 2

Stop all active CPU runtimes when you don't need them anymore, to prevent consuming extra capacity unit hours (CUHs). See CPU idle timeout.
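
To see why stopping idle runtimes matters, the following sketch multiplies a template's CUH rate by the hours that the runtime stays active. The rates are copied from the table above for the plain Python templates (without NLP or DO); the function name `cpu_cuh_consumed` is hypothetical, and actual billing is determined by the service.

```python
# CUH rates per hour for the plain default Python templates, keyed by size label.
CPU_CUH_RATES = {"XXS": 0.5, "XS": 1.0, "S": 2.0}


def cpu_cuh_consumed(size: str, hours_active: float) -> float:
    """Estimate CUH consumed by a runtime of the given size while it is active."""
    return CPU_CUH_RATES[size] * hours_active


# An XS runtime left running for a full 8-hour day, versus stopped after 3 hours.
print(cpu_cuh_consumed("XS", 8))  # 8.0
print(cpu_cuh_consumed("XS", 3))  # 3.0
```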

Notebooks and CPU environments

When you open a notebook in edit mode in a CPU runtime environment, exactly one interactive session connects to a Jupyter kernel for the notebook language and the environment runtime that you select. The runtime is started per user and not per notebook. This means that if you open a second notebook with the same environment template in the same project, a second kernel is started in the same runtime. Runtime resources are shared by the Jupyter kernels that you start in the runtime. For more information, see Runtime scope.

If necessary, you can restart or reconnect to the kernel. When you restart a kernel, the kernel is stopped and then started in the same session again, but all execution results are lost. When you reconnect to a kernel after losing a connection, the notebook is connected to the same kernel session, and all previous execution results that were saved are available.

Spark environment templates

You can select any of the following default Spark environment templates for notebooks. The default environment templates are listed under Templates on the Environments page on the Manage tab of your project.

Default Spark environment templates for notebooks
Name | Hardware configuration | CUH rate per hour
Default Spark 3.4 & Python 3.10 | 2 Executors each: 1 vCPU and 4 GB RAM; Driver: 1 vCPU and 4 GB RAM | 1
Default Spark 3.4 & R 4.2 | 2 Executors each: 1 vCPU and 4 GB RAM; Driver: 1 vCPU and 4 GB RAM | 1

Stop all active Spark runtimes when you don't need them anymore, to prevent consuming extra capacity unit hours (CUHs). See Spark idle timeout.

Large Spark environments

If you have the watsonx.ai Studio Professional plan, you can create custom environment templates for larger Spark environments.

Professional plan users can have up to 35 executors and can choose from the following options for both driver and executor:

Hardware configurations for Spark environments
Hardware configuration
1 vCPU and 4 GB RAM
2 vCPU and 8 GB RAM
3 vCPU and 12 GB RAM

The CUH rate per hour increases by 0.5 for every vCPU that is added. For example, 1x Driver: 3vCPU with 12GB of RAM and 4x Executors: 2vCPU with 8GB of RAM amounts to (3 + (4 * 2)) = 11 vCPUs and 5.5 CUH.
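
The worked example above can be reproduced with a short calculation. This is only a sketch of the arithmetic in the text: the function name `spark_cuh_rate` is hypothetical, the limits (1 to 3 vCPU per node, up to 35 executors) are taken from this section, and the rate that is actually charged is determined by the service.

```python
CUH_PER_VCPU = 0.5          # rate per vCPU stated in the text above
ALLOWED_VCPU = (1, 2, 3)    # sizes listed in the hardware configuration table
MAX_EXECUTORS = 35          # Professional plan limit stated above


def spark_cuh_rate(driver_vcpu: int, executor_count: int, executor_vcpu: int):
    """Return (total vCPUs, CUH rate per hour) for a custom Spark template."""
    if driver_vcpu not in ALLOWED_VCPU or executor_vcpu not in ALLOWED_VCPU:
        raise ValueError("vCPU size must be 1, 2, or 3")
    if not 1 <= executor_count <= MAX_EXECUTORS:
        raise ValueError("executor count must be between 1 and 35")
    total_vcpu = driver_vcpu + executor_count * executor_vcpu
    return total_vcpu, total_vcpu * CUH_PER_VCPU


# 1 driver with 3 vCPU and 4 executors with 2 vCPU each.
print(spark_cuh_rate(3, 4, 2))  # (11, 5.5)
```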

Notebooks and Spark environments

You can select the same Spark environment template for more than one notebook. Every notebook associated with that environment has its own dedicated Spark cluster and no resources are shared.

When you start a Spark environment, extra resources are needed for the Jupyter Enterprise Gateway, Spark Master, and the Spark worker daemons. These extra resources amount to 1 vCPU and 2 GB of RAM for the driver and 1 GB RAM for each executor. You need to take these extra resources into account when selecting the hardware size of a Spark environment. For example: if you create a notebook and select Default Spark 3.4 & Python 3.10, the Spark cluster consumes 3 vCPU and 12 GB RAM but, as 1 vCPU and 4 GB RAM are required for the extra resources, the resources remaining for the notebook are 2 vCPU and 8 GB RAM.
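
The overhead calculation in the example can be sketched as follows. The function name `usable_spark_resources` is hypothetical; the overhead figures (1 vCPU and 2 GB RAM for the driver, 1 GB RAM per executor) are the ones stated above.

```python
def usable_spark_resources(driver_vcpu, driver_ram_gb,
                           executor_count, executor_vcpu, executor_ram_gb):
    """Return total, overhead, and usable (vCPU, GB RAM) for a Spark cluster."""
    total_vcpu = driver_vcpu + executor_count * executor_vcpu
    total_ram = driver_ram_gb + executor_count * executor_ram_gb
    # Overhead: 1 vCPU and 2 GB RAM on the driver, plus 1 GB RAM per executor.
    overhead_vcpu = 1
    overhead_ram = 2 + 1 * executor_count
    return {
        "total": (total_vcpu, total_ram),
        "overhead": (overhead_vcpu, overhead_ram),
        "usable": (total_vcpu - overhead_vcpu, total_ram - overhead_ram),
    }


# Default template: driver 1 vCPU / 4 GB, 2 executors with 1 vCPU / 4 GB each.
print(usable_spark_resources(1, 4, 2, 1, 4))
```

For the default template, the result shows 3 vCPU and 12 GB RAM in total, 1 vCPU and 4 GB RAM of overhead, and 2 vCPU and 8 GB RAM left for the notebook, which matches the example.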

File system on a Spark cluster

If you want to share files across executors and the driver or kernel of a Spark cluster, you can use the shared file system at /home/spark/shared.

If you want to use your own custom libraries, you can store them under /home/spark/shared/user-libs/. Three subdirectories under /home/spark/shared/user-libs/ are pre-configured to be made available to Python, R, or Java runtimes.

The following table lists the pre-configured subdirectories where you can add your custom libraries.

Table 5. Pre-configured subdirectories for custom libraries
Directory Type of library
/home/spark/shared/user-libs/python3/ Python 3 libraries
/home/spark/shared/user-libs/R/ R packages
/home/spark/shared/user-libs/spark2/ Java JAR files

To share libraries across a Spark driver and executors:

  1. Download your custom libraries or JAR files to the appropriate pre-configured directory.
  2. Restart the kernel from the notebook menu by clicking Kernel > Restart Kernel. This loads your custom libraries or JAR files in Spark.

Note that these libraries are not persisted. When you stop the environment runtime and restart it again later, you need to load the libraries again.
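
One possible way to carry out step 1 for a Python package is to point pip's `--target` option at the shared Python directory. This is a sketch under assumptions, not a documented procedure: the helper name `pip_target_command` and the package name `my-package` are hypothetical, and whether pip is the appropriate installation method depends on your library.

```python
import sys

USER_LIBS_ROOT = "/home/spark/shared/user-libs"

# Pre-configured subdirectories from the table above.
USER_LIB_DIRS = {
    "python": f"{USER_LIBS_ROOT}/python3/",
    "r": f"{USER_LIBS_ROOT}/R/",
    "jar": f"{USER_LIBS_ROOT}/spark2/",
}


def pip_target_command(package: str) -> list:
    """Build a pip command that installs a package into the shared Python directory."""
    return [sys.executable, "-m", "pip", "install",
            "--target", USER_LIB_DIRS["python"], package]


# In a notebook on a Spark runtime, the command could be run with, for example:
#   import subprocess
#   subprocess.run(pip_target_command("my-package"), check=True)
print(" ".join(pip_target_command("my-package")))
```

After the files are in place, restart the kernel as described in step 2. Because the libraries are not persisted, a cell like this needs to be run again after the runtime is stopped and restarted.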

GPU environment templates

You can select any of the following GPU environment templates for notebooks. The environment templates are listed under Templates on the Environments page on the Manage tab of your project.

The GPU environment template names indicate the accelerator power. The GPU environment templates include the Watson Natural Language Processing library with pre-trained models for language processing tasks that you can run on unstructured data. See Using the Watson Natural Language Processing library.

~ Indicates that the environment template requires the watsonx.ai Studio Professional plan. See Offering plans.

Default GPU environment templates for notebooks
Name | Hardware configuration | CUH rate per hour
GPU V100 Runtime 24.1 on Python 3.11 ~ | 40 vCPU + 172 GB RAM + 1 NVIDIA TESLA V100 (1 GPU) | 68
GPU 2xV100 Runtime 24.1 on Python 3.11 ~ | 80 vCPU + 344 GB RAM + 2 NVIDIA TESLA V100 (2 GPU) | 136

Stop all active GPU runtimes when you don't need them anymore, to prevent consuming extra capacity unit hours (CUHs). See GPU idle timeout.

Notebooks and GPU environments

GPU environments for notebooks are available only in the Dallas IBM Cloud service region.

You can select the same Python and GPU environment template for more than one notebook in a project. In this case, every notebook kernel runs in the same runtime instance and the resources are shared. To avoid sharing runtime resources, create multiple custom environment templates with the same specifications and associate each notebook with its own template.

Default hardware specifications for scoring models with watsonx.ai Runtime

When you invoke the watsonx.ai Runtime API within a notebook, you consume compute resources from the watsonx.ai Runtime service as well as the compute resources for the notebook kernel.

You can select any of the following hardware specifications when you connect to watsonx.ai Runtime and create a deployment.

Hardware specifications available when invoking the watsonx.ai Runtime service in a notebook
Capacity size | Hardware configuration | CUH rate per hour
Extra small | 1x4 = 1 vCPU and 4 GB RAM | 0.5
Small | 2x8 = 2 vCPU and 8 GB RAM | 1
Medium | 4x16 = 4 vCPU and 16 GB RAM | 2
Large | 8x32 = 8 vCPU and 32 GB RAM | 4
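
When you choose a hardware specification, it can help to pick the smallest size that meets your needs, because the CUH rate doubles with each step. The following sketch encodes the table above; the function name `smallest_spec` is hypothetical and is not part of the watsonx.ai Runtime API.

```python
# (capacity size, vCPU, GB RAM, CUH rate per hour), smallest first.
HARDWARE_SPECS = [
    ("Extra small", 1, 4, 0.5),
    ("Small", 2, 8, 1.0),
    ("Medium", 4, 16, 2.0),
    ("Large", 8, 32, 4.0),
]


def smallest_spec(min_vcpu: int, min_ram_gb: int):
    """Return (name, CUH rate) of the smallest spec that fits, or None."""
    for name, vcpu, ram_gb, rate in HARDWARE_SPECS:
        if vcpu >= min_vcpu and ram_gb >= min_ram_gb:
            return name, rate
    return None


print(smallest_spec(2, 6))    # ('Small', 1.0)
print(smallest_spec(1, 10))   # ('Medium', 2.0)
```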

Data files in notebook environments

If you are working with large data sets, store the data sets in smaller chunks in the IBM Cloud Object Storage associated with your project and process the data in chunks in the notebook. Alternatively, you can run the notebook in a Spark environment.
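
The idea of processing data in chunks can be sketched with the Python standard library alone. The example below reads CSV rows in fixed-size batches so that only one batch is held in memory at a time; the helper names and the in-memory sample data are hypothetical, and in a real notebook the file object would come from your storage of choice.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size items from any iterable."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


def column_total(file_obj, column, chunk_size=1000):
    """Sum a numeric CSV column while holding only one chunk in memory."""
    reader = csv.DictReader(file_obj)
    total = 0.0
    for chunk in iter_chunks(reader, chunk_size):
        total += sum(float(row[column]) for row in chunk)
    return total


# Small in-memory stand-in for a large data file.
sample = io.StringIO("id,amount\n1,10.5\n2,4.5\n3,5.0\n")
print(column_total(sample, "amount", chunk_size=2))  # 20.0
```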

Be aware that the file system of each runtime is non-persistent and cannot be shared across environments. To persist files in watsonx.ai Studio, you should use IBM Cloud Object Storage. The easiest way to use IBM Cloud Object Storage in notebooks in projects is to leverage the project-lib package for Python or the project-lib package for R.

Compute usage by service

Notebook runtimes consume compute resources as CUH from watsonx.ai Studio while running default or custom environments. You can monitor the watsonx.ai Studio CUH consumption in the project on the Resource usage page on the Manage tab of the project.

Notebooks can also consume CUH from the watsonx.ai Runtime service when the notebook invokes the watsonx.ai Runtime to score a model. You can monitor the total monthly amount of CUH consumption for the watsonx.ai Runtime service on the Resource usage page on the Manage tab of the project.

Track CUH consumption for watsonx.ai Runtime in a notebook

To calculate capacity unit hours consumed by a notebook, run this code in the notebook:

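# Assumes `client` is an authenticated watsonx.ai Runtime API client created earlier in the notebook.
details = client.service_instance.get_details()
# Convert the raw capacity_units counter to capacity unit hours.
CUH = details["entity"]["usage"]["capacity_units"]["current"]/(3600*1000)
print(CUH)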

For example:

'capacity_units': {'current': 19773430}

19773430/(3600*1000)

returns 5.49 CUH

For details, see the Service Instances section of the IBM watsonx.ai Runtime API documentation.
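
The conversion can also be isolated in a small helper that works on the dictionary structure shown in the example, which makes it possible to check the arithmetic without a service connection. The helper name `capacity_units_to_cuh` and the hand-built sample dictionary are hypothetical; only the nested keys and the divisor come from the text above.

```python
RAW_UNITS_PER_CUH = 3600 * 1000  # divisor used in the text above


def capacity_units_to_cuh(details: dict) -> float:
    """Convert the 'current' capacity_units counter to capacity unit hours."""
    current = details["entity"]["usage"]["capacity_units"]["current"]
    return current / RAW_UNITS_PER_CUH


# Hand-built stand-in for a service instance details response.
sample_details = {"entity": {"usage": {"capacity_units": {"current": 19773430}}}}
print(round(capacity_units_to_cuh(sample_details), 2))  # 5.49
```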

Runtime scope

Environment runtimes are always scoped to an environment template and a user within a project. If different users in a project work with the same environment, each user will get a separate runtime.

If you choose to run a version of a notebook as a scheduled job, each scheduled job always starts in a dedicated runtime. The runtime is stopped when the job finishes.
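
The scoping rules can be modeled as a lookup key, as in the following sketch. It is a simplified illustration, not how the service is implemented: the function name `runtime_key` and all project, user, and template names are hypothetical. It covers the shared-runtime case only, because each notebook in a Spark environment gets its own dedicated cluster.

```python
from itertools import count

_job_ids = count(1)  # gives every scheduled job a unique suffix


def runtime_key(project: str, user: str, template: str, scheduled_job: bool = False):
    """Return a key that identifies the runtime a notebook session would use."""
    if scheduled_job:
        # Each scheduled job starts in its own dedicated runtime.
        return (project, user, template, f"job-{next(_job_ids)}")
    # Interactive sessions share one runtime per project, user, and template.
    return (project, user, template)


sessions = [
    ("alice", "notebook-a"),
    ("alice", "notebook-b"),  # same user and template: shares alice's runtime
    ("bob", "notebook-c"),    # different user: separate runtime
]
runtimes = {runtime_key("demo-project", user, "XS template") for user, _ in sessions}
print(len(runtimes))  # 2
```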

Changing the environment of a notebook

You can switch environments for different reasons, for example, you can:

  • Select an environment with more processing power or more RAM
  • Change from using an environment without Spark to a Spark environment

You can only change the environment of a notebook if the notebook is unlocked. You can change the environment:

  • From the notebook opened in edit mode:

    1. Save your notebook changes.
    2. Click the Notebook Info icon from the notebook toolbar and then click Environment.
    3. Select another template with the compute power and memory capacity that you need from the list.
    4. Select Change environment. This stops the active runtime and starts the newly selected environment.
  • From the Assets page of your project:

    1. Select the notebook in the Notebooks section, click Actions > Change Environment and select another environment. The kernel must be stopped before you can change the environment. This new runtime environment will be instantiated the next time the notebook is opened for editing.
  • In the notebook job by editing the job template. See Editing job settings.

