Specifying a model type and configuration

The environment for a machine learning model or Python function is made up of the hardware and software specifications.

For details on using and customizing environments, see Environments.

Software specifications and runtimes

Software specifications have replaced runtimes as the way to define the language and version you use for a model or function. Previously, you could choose only among predefined, non-configurable runtimes. With software specifications, you can define precisely which software version to use and include additional extensions (such as conda .yml files or custom libraries).

Defining a software specification

Use this format for specifying the language and version used for a model or function:

meta_props={
     client.repository.ModelMetaNames.NAME: "skl_pipeline_heart_problem_prediction",
     client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: software_spec_uid,
     client.repository.ModelMetaNames.TYPE: "scikit-learn_0.23"
}

To get the list of predefined software specifications, use:

   client.software_specifications.list()

which returns a list of software specifications.

Save the ID of a specification to a variable:

   software_spec_uid = client.software_specifications.get_uid_by_name("scikit-learn_0.23-py3.7")
   software_spec_uid
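
The metadata refers to `software_spec_uid`, so the lookup has to run before the metadata is built. A minimal sketch of that order follows. The helper name `build_model_meta_props` is hypothetical, and `client` is assumed to be an already authenticated Watson Machine Learning client. Only the calls shown in the snippets above are used.

```python
def build_model_meta_props(client, model_name, software_spec_name, model_type):
    """Look up a software specification by name and build model metadata.

    Hypothetical helper: `client` is assumed to be an authenticated
    Watson Machine Learning client, as in the snippets above.
    """
    # Resolve the specification ID first, because the metadata refers to it.
    software_spec_uid = client.software_specifications.get_uid_by_name(software_spec_name)

    names = client.repository.ModelMetaNames
    return {
        names.NAME: model_name,
        names.SOFTWARE_SPEC_UID: software_spec_uid,
        names.TYPE: model_type,
    }
```

For example, `build_model_meta_props(client, "skl_pipeline_heart_problem_prediction", "scikit-learn_0.23-py3.7", "scikit-learn_0.23")` would produce the same dictionary as the format shown earlier.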

Choosing a predefined software specification

This table lists the predefined, or base, model types and software specifications that map to the supported frameworks for deployments.

Framework | Versions | Model type | Default software specification
Spark | 2.4 | mllib_2.4 | spark-mllib_2.4
Spark | 2.3 | mllib_2.3 (deprecated) | spark-mllib_2.3 (deprecated)
PMML | 3.0 to 4.3 | pmml_. (or) pmml_..*3.0 - 4.3 | spark-mllib_2.4
Hybrid/AutoML | 0.1 | wml-hybrid_0.1 | hybrid_0.1
SPSS | 17.1 | spss-modeler_17.1 | spss-modeler_17.1
SPSS | 18.1 | spss-modeler_18.1 | spss-modeler_18.1
SPSS | 18.2 | spss-modeler_18.2 | spss-modeler_18.2
Scikit-learn | 0.23 | scikit-learn_0.23 | default_py3.7
Scikit-learn | 0.22 | scikit-learn_0.22 | scikit-learn_0.22-py3.6 (deprecated); default_py3.6 (deprecated)
Scikit-learn | 0.20 | scikit-learn_0.20 | scikit-learn_0.20-py3.6 (deprecated)
XGBoost | 0.90 | xgboost_0.90 with Python 3.7. If the model is trained with the sklearn wrapper (XGBClassifier or XGBRegressor), use scikit-learn_0.23 | default_py3.7
XGBoost | 0.90 | xgboost_0.90 with Python 3.6. If the model is trained with the sklearn wrapper (XGBClassifier or XGBRegressor), use scikit-learn_0.22 | scikit-learn_0.22-py3.6 (deprecated); default_py3.6 (deprecated); xgboost_0.90-py3.6 (deprecated)
XGBoost | 0.82 | xgboost_0.82. If the model is trained with the sklearn wrapper (XGBClassifier or XGBRegressor), use scikit-learn_0.20 | scikit-learn_0.20-py3.6 (deprecated)
Tensorflow | 1.15 | tensorflow_1.15 | tensorflow_1.15-py3.6 (deprecated); default_py3.6 (deprecated)
Tensorflow | 2.1 | tensorflow_2.1 | default_py3.7; tensorflow_2.1-py3.7
Keras | 2.2.5 | keras_2.2.5 | tensorflow_1.15-py3.6 (deprecated); default_py3.6 (deprecated)
PyTorch | 1.1 | pytorch-onnx_1.1 | pytorch-onnx_1.1-py3.6 (deprecated)
PyTorch | 1.2 | pytorch-onnx_1.2 | pytorch-onnx_1.2-py3.6 (deprecated); pytorch-onnx_1.2-py3.6-edt (deprecated); default_py3.6 (deprecated)
PyTorch | 1.3.1 | pytorch-onnx_1.3.1 | default_py3.7; pytorch-onnx_1.3-py3.7; pytorch-onnx_1.3-py3.7-edt
Decision Optimization | 12.10 | do-docplex_12.10; do-opl_12.10; do-cplex_12.10; do-cpo_12.10 | do_12.10
Python Functions | 0.1 | NA | default_py3.7; ai-function_0.2-py3.6 (deprecated); default_py3.6 (deprecated)
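
If you choose specifications programmatically, the table above can be kept as a small lookup. The sketch below copies only a few non-deprecated rows. The names `DEFAULT_SOFTWARE_SPEC` and `default_software_spec` are hypothetical and not part of the Python client.

```python
# A few non-deprecated rows copied from the table above:
# model type -> first listed default software specification.
DEFAULT_SOFTWARE_SPEC = {
    "mllib_2.4": "spark-mllib_2.4",
    "wml-hybrid_0.1": "hybrid_0.1",
    "spss-modeler_18.2": "spss-modeler_18.2",
    "scikit-learn_0.23": "default_py3.7",
    "tensorflow_2.1": "default_py3.7",
    "pytorch-onnx_1.3.1": "default_py3.7",
    "do-docplex_12.10": "do_12.10",
}


def default_software_spec(model_type):
    """Return the default software specification name for a model type."""
    try:
        return DEFAULT_SOFTWARE_SPEC[model_type]
    except KeyError:
        raise ValueError(f"No default software specification recorded for {model_type!r}") from None
```

The returned name can then be passed to `client.software_specifications.get_uid_by_name` to obtain the ID.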

Managing assets that refer to discontinued software or frameworks

Note these guidelines for updating assets that rely on discontinued software specifications or frameworks. In some cases, the update is seamless. In other cases, your action is required to retrain or redeploy assets.

Managing assets that refer to discontinued software specifications

  • During migration, assets that refer to the discontinued software specification will be mapped to a comparable supported default software specification. This will be done in cases where the model type is not discontinued.
  • When you create new deployments of the migrated assets, the updated software specification in the asset’s metadata will be used.
  • Existing deployments of the migrated assets will be updated to use the new software specification. If there is a failure during deployment or scoring due to framework or library version incompatibilities, follow the manual steps for recreating the asset with the supported framework versions.

Migrating assets that refer to discontinued framework versions

  • During migration, the model type will not be updated.
  • After migration, existing deployments are removed, and new deployments for this framework are not allowed.

Notes about upgrading from a deprecated framework

Update models or functions built with deprecated frameworks.

Upgrading a machine learning model

Follow these steps to update a model built with a deprecated framework.

Option 1: Save the model with a compatible framework

  1. Download the model from the Watson Machine Learning repository.
  2. Save the model back to the Watson Machine Learning repository with a model type and version supported in the current release.
  3. Deploy the model.
  4. Score the model to generate predictions.

If there is a failure when you deploy or score the model, it means that the model is not compatible with the new version used for saving the model. In this case, use Option 2.

Option 2: Retrain the model with a compatible framework

  1. Retrain the model with a model type and version supported in the current release.
  2. Save the model to the Watson Machine Learning repository with the supported model type and version.
  3. Deploy and score the model.
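
The relationship between the two options can be sketched as a simple fallback. This is only an illustration: `resave_and_check` and `retrain_and_check` are hypothetical callables that you would write yourself, each one saving, deploying, and scoring the model and raising an exception on failure.

```python
def upgrade_with_fallback(resave_and_check, retrain_and_check):
    """Try Option 1 first and fall back to Option 2 if it fails.

    Both arguments are hypothetical callables supplied by the caller.
    Returns a tuple of the option that succeeded and its result.
    """
    try:
        # Option 1: save the existing model with a supported type and version.
        return "option 1", resave_and_check()
    except Exception as error:
        # A deployment or scoring failure means the saved model is incompatible.
        print(f"Option 1 failed ({error}); retraining with a supported framework.")
        # Option 2: retrain, save, deploy, and score.
        return "option 2", retrain_and_check()
```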

Upgrading a Python function

Follow these steps to update a function built with a deprecated framework.

Option 1: Save the Python function with a compatible runtime or software specification

  1. Download the Python function from the Watson Machine Learning repository.
  2. Save the Python function back to the Watson Machine Learning repository with a supported runtime or software specification version.
  3. Deploy the Python function.
  4. Score the Python function to generate predictions.

If there is a failure when you score the Python function, it means that the function is not compatible with the new runtime or software specification version used for saving the Python function. In this case, use Option 2.

Option 2: Modify the function code and save it with a compatible runtime or software specification

  • Modify the Python function code to make it compatible with the new runtime or software specification version. This could involve updating the dependent libraries installed within the Python function code.
  • Save the Python function to the Watson Machine Learning repository with the new runtime or software specification version.
  • Deploy and score the Python function.

Creating a custom software specification in a project

If your Scikit-learn, XGBoost, or Tensorflow model requires custom components such as user-defined transformers, estimators, or user-defined tensors, you can create a custom software specification derived from a base, or predefined, specification. Python functions and Python scripts also support custom software specifications.

You can use a custom software specification to reference any third-party libraries, user-created Python packages, or both. Third-party libraries or user-created Python packages must be specified as package extensions, which can then be referenced in a custom software specification.

Custom software specifications are promoted with the assets that use them. Python functions and scripts, and models created with the Scikit-learn, XGBoost and Tensorflow frameworks can use the software specifications for deployments.

Note that software specifications are also included when you import a project or space that includes one.

For details on creating environments with software specifications, see Environments. For details on adding custom components, see Custom components.

Overview of creating a custom software specification

These are the high-level steps to create a custom software specification that uses third-party libraries and user-created Python packages.

  1. Create a custom software specification
  2. Create a package extension to save a conda YAML file that contains a list of third-party libraries. Note: This step is not required if the model does not have any dependency on a third-party library.
  3. Create a package extension to save a user-created Python package. Note: This step is not required if the model does not require any user-defined transformers, estimators, or user-defined tensors.
  4. Add a reference of the package extensions to the custom software specification you created.
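
The conda YAML file for step 2 is a short text file with an environment name and a list of dependencies. A minimal sketch for generating one is shown below. The function names are hypothetical, and the sketch covers only a name and a flat dependency list, not the full conda environment file format.

```python
from pathlib import Path


def render_conda_yaml(env_name, dependencies):
    """Render a minimal conda environment file as text."""
    lines = [f"name: {env_name}", "dependencies:"]
    lines += [f"  - {dependency}" for dependency in dependencies]
    return "\n".join(lines) + "\n"


def write_conda_yaml(path, env_name, dependencies):
    """Write the rendered file to `path`, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_conda_yaml(env_name, dependencies))
    return path
```

For example, `write_conda_yaml("resources/customlibrary.yaml", "testing1", ["regex"])` writes a file that a package extension of type `conda_yml` could then store.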

Creating a custom software specification in a notebook

You can use the Watson Machine Learning APIs or Python client to define a custom software specification that is derived from a base specification.

These notes apply when you specify a software specification for Scikit-Learn, XGBoost, Tensorflow, Keras, PyTorch, or Caffe models trained in Watson Studio notebooks, or for Python functions developed in a notebook.

  • If you created the models or Python functions in a Watson Studio notebook with the “Default Python 3.7” environment, save the model or function with a reference to the software specification named “default_py3.7”. To get the corresponding software specification ID, use:

    softwareSpecId = wml_client.software_specifications.get_uid_by_name('default_py3.7')

  • You can create custom environments in Watson Studio to install third-party libraries required to execute your Python scripts. If you trained a model or developed a Python function in a Watson Studio notebook with a customized environment, specify the reference to the custom software specification when you save the model or function. The name of the custom software specification is the same as the name of the custom environment. For example, if the custom environment is called “my_cust_model1_env”:

    softwareSpecId = wml_client.software_specifications.get_uid_by_name('my_cust_model1_env')

To create a custom software specification

This code template illustrates how to create a custom software specification using the Python client.

  Before you begin, authenticate and create the client.

    from ibm_watson_machine_learning import APIClient

    # replace the placeholder values with your own credentials
    wml_credentials = {
        "apikey": "abcdHYJUoUj0-MEpQ24u559nzSdBDmO7o4Q12UZg9ojn",
        "url": "https://server.example.com"
    }
    wml_client = APIClient(wml_credentials)

  1. Create and set the default deployment space, then list available software specifications.

    metadata = {
                wml_client.spaces.ConfigurationMetaNames.NAME: 'examples-create-software-spec',
                wml_client.spaces.ConfigurationMetaNames.DESCRIPTION: 'For my models'
                }
    space_details = wml_client.spaces.store(meta_props=metadata)
    space_uid = wml_client.spaces.get_uid(space_details)
       
    # set the default space
    wml_client.set.default_space(space_uid)
    
    # see available meta names for software specs
    print('Available software specs configuration:', wml_client.software_specifications.ConfigurationMetaNames.get())
    wml_client.software_specifications.list()
    
    asset_id = 'undefined'
    pe_asset_id = 'undefined'
    
  2. Create the metadata for package extensions to add to the base specification.

     # create the metadata for package extensions
     pe_metadata = {
         wml_client.package_extensions.ConfigurationMetaNames.NAME: 'My custom library',
         # wml_client.package_extensions.ConfigurationMetaNames.DESCRIPTION: '...', # optional
         wml_client.package_extensions.ConfigurationMetaNames.TYPE: 'conda_yml'
     }
    
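  3. Create the package extension from a conda YAML file, then look up the ID of the base software specification.

     # Example contents of resources/customlibrary.yaml:
     #   name: testing1
     #   dependencies:
     #     - regex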
     pe_asset_details = wml_client.package_extensions.store(meta_props=pe_metadata, file_path='resources/customlibrary.yaml')
     pe_asset_id = wml_client.package_extensions.get_uid(pe_asset_details)
    
     base_id = wml_client.software_specifications.get_uid_by_name('default_py3.7')
    
  4. Create the metadata for the software specification and store the software specification.

     # create the metadata for software specs
     ss_metadata = {
         wml_client.software_specifications.ConfigurationMetaNames.NAME: 'Python 3.7 with pre-installed ML package',
         wml_client.software_specifications.ConfigurationMetaNames.DESCRIPTION: 'Adding some custom libraries like regex', # optional
         wml_client.software_specifications.ConfigurationMetaNames.BASE_SOFTWARE_SPECIFICATION: {'guid': base_id}
     }
     # store the software spec
     ss_asset_details = wml_client.software_specifications.store(meta_props=ss_metadata)
     # get the id of the new asset
     asset_id = wml_client.software_specifications.get_uid(ss_asset_details)
    
  5. Associate the package extension with the software specification.

     wml_client.software_specifications.add_package_extension(asset_id, pe_asset_id)
    
     import pprint as pp  # used to format the nested details below

     ss_asset_details = wml_client.software_specifications.get_details(asset_id)
     print('Package extensions', pp.pformat(ss_asset_details['entity']['software_specification']['package_extensions']))
    

Propagating software specification and package extension from projects to deployment spaces

Custom software specifications and package extensions created in a Watson Studio project can be exported to a deployment space using the “Promote” option in the Watson Studio interface. If you later change a promoted software specification or package extension in the project, follow these steps to transfer the changes to the deployment space.

  1. In the space, delete the software specification, the package extensions, and (optionally) the associated models by using the Watson Machine Learning Python client.
  2. In the Watson Studio project, promote the model, function, or script that is associated with the changed custom software specification and package extension to the space.
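
Step 1 can be scripted with the Python client. The sketch below is an assumption-laden outline, not a definitive procedure: the helper name `remove_stale_assets` is hypothetical, `client` is assumed to be an authenticated client, and the `delete` calls on `repository`, `software_specifications`, and `package_extensions` should be checked against the client version you use.

```python
def remove_stale_assets(client, space_uid, software_spec_uid,
                        package_extension_uids, model_uids=()):
    """Delete a promoted software specification and its extensions from a space.

    Hypothetical helper. `client` is assumed to be an authenticated
    Watson Machine Learning client; the delete calls are assumptions
    to verify against your client version.
    """
    # Work in the deployment space that holds the promoted assets.
    client.set.default_space(space_uid)

    # Optionally remove models that still refer to the old specification.
    for model_uid in model_uids:
        client.repository.delete(model_uid)

    # Remove the specification before the extensions it refers to.
    client.software_specifications.delete(software_spec_uid)
    for package_extension_uid in package_extension_uids:
        client.package_extensions.delete(package_extension_uid)
```

After the stale assets are removed, promoting the asset again from the project (step 2) brings the updated specification and extensions into the space.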

Using a hardware specification

You can include an existing hardware specification using these steps.

  1. To get details of the available hardware specifications, use GET v2/hardware_specifications. For example:

    curl -s -k -X GET "https://example.com/v2/hardware_specifications?name=S" -H "content-type: application/json" -H "Authorization: Bearer $token"
    
    {"total_results":1,"resources":[{"metadata":{"created_at":"2020-02-25T09:41:57.779Z","updated_at":"2020-02-25T09:41:57.779Z","name":"S","description":"A hardware specification providing 2 CPU cores and 8 GiB of memory.","asset_type":"hardware_specification","asset_id":"bb69c3be-e441-4b59-a193-846491de7b9a","href":"/v2/hardware_specifications/bb69c3be-e441-4b59-a193-846491de7b9a"},"entity":{"hardware_specification":{"nodes":{"cpu":{"units":"2","model":""},"mem":{"size":"8Gi"},"num_nodes":1}}}}]}
    
  2. Deploy the batch job with the hardware specification in the payload. For example, to specify hardware by name:

    {
        "asset": {
            "href": "/v4/models/2d0c85e9-a593-46aa-89a0-241bcdbe45ba?space_id=d21487e9-9f6f-4be7-9017-73e6e5eb2011"
        },
        "space": {
            "href": "/v4/spaces/$spaceId"
        },
        "batch": {},
        "hardware_spec": {
            "name": "S",
            "num_nodes": 1
        }
    }
    

    To specify hardware by ID:

    {
        "asset": {
            "href": "/v4/models/2d0c85e9-a593-46aa-89a0-241bcdbe45ba?space_id=d21487e9-9f6f-4be7-9017-73e6e5eb2011"
        },
        "space": {
            "href": "/v4/spaces/d21487e9-9f6f-4be7-9017-73e6e5eb2011"
        },
        "batch": {},
        "hardware_spec": {
            "id": "bb69c3be-e441-4b59-a193-846491de7b9a",
            "num_nodes": 1
        }
    }
    
  3. Create the payload for the job. If you specify a hardware_spec in the job payload, it overrides the hardware_spec of the deployment, and the job runs with the hardware_spec from the job payload.

    For example, specifying a payload with a hardware name:

    {
        "deployment": {
            "href": "/v4/deployments/c6ca18af-1d6a-4c6a-ad78-93606144ec00"
        },
        "scoring": {
            "input_data": [{
                "id": "data1",
                "fields": ["Age", "Sex", "BP", "Cholesterol", "Na", "K"],
                "values": [[23, "F", "HIGH", "HIGH", 0.792535, 0.031258]]
            }]
        },
        "hardware_spec": {
            "name": "S",
            "num_nodes": 1
        }
    }
    

    Specifying hardware by ID:

    {
        "deployment": {
            "href": "/v4/deployments/c6ca18af-1d6a-4c6a-ad78-93606144ec00"
        },
        "scoring": {
            "input_data": [{
                "id": "data1",
                "fields": ["Age", "Sex", "BP", "Cholesterol", "Na", "K"],
                "values": [[23, "F", "HIGH", "HIGH", 0.792535, 0.031258]]
            }]
        },
        "hardware_spec": {
            "id": "bb69c3be-e441-4b59-a193-846491de7b9a",
            "num_nodes": 1
        }
    }
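
If you assemble these payloads in Python rather than by hand, the steps above can be sketched as follows. The function names are hypothetical, and the sketch only builds dictionaries with the same shape as the examples above. It does not send any request.

```python
import json


def find_hardware_spec_id(response_text, name):
    """Return the asset_id of the hardware specification called `name`.

    `response_text` is the JSON body returned by GET v2/hardware_specifications.
    """
    body = json.loads(response_text)
    for resource in body.get("resources", []):
        metadata = resource.get("metadata", {})
        if metadata.get("name") == name:
            return metadata["asset_id"]
    raise LookupError(f"No hardware specification named {name!r}")


def hardware_spec(name=None, spec_id=None, num_nodes=1):
    """Build a hardware_spec block that refers to a spec by name or by ID."""
    if (name is None) == (spec_id is None):
        raise ValueError("Pass exactly one of name or spec_id")
    if name is not None:
        return {"name": name, "num_nodes": num_nodes}
    return {"id": spec_id, "num_nodes": num_nodes}


def job_payload(deployment_href, input_data, hw_spec):
    """Build a job payload with the same shape as the examples above."""
    return {
        "deployment": {"href": deployment_href},
        "scoring": {"input_data": input_data},
        "hardware_spec": hw_spec,
    }
```

Calling `json.dumps(job_payload(...))` on the result gives a request body in the same form as the hand-written payloads.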