Importing models into Watson Machine Learning

If you have a machine learning model or neural network that was trained outside of IBM Watson Machine Learning, you can import that model into your Watson Machine Learning service.

Here, to import a trained model means:

  1. Store the trained model in your Watson Machine Learning repository
  2. [Optional] Deploy the stored model in your Watson Machine Learning service

Supported import formats

Model type PMML Spark MLlib scikit-learn XGBoost TensorFlow PyTorch
Importing a model using UI          
Importing a model object      
Importing a model using path to file  
Importing a model using path to directory    

Refer to these sections for extra information regarding importing models:

Importing a PMML model

  • The only supported type for models imported from PMML is: web service
  • The PMML file must have the .xml file extension
  • The PMML file must not contain a prolog. Depending on the library that you use to save your model, a prolog might be added to the top of the file by default, as in this example:
  ::::::::::::::
  spark-mllib-lr-model-pmml.xml
  ::::::::::::::

You must remove that prolog before you can import the PMML file to Watson Machine Learning.
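As a minimal sketch of removing such a prolog, the following Python snippet drops every line that precedes the first line starting with "<". It assumes the prolog looks like the header in the example above, that is, plain text lines placed before the XML content. The function name strip_leading_header and the sample PMML string are hypothetical and used only for illustration.

```python
def strip_leading_header(pmml_text: str) -> str:
    """Drop any lines that precede the first line starting with '<'."""
    lines = pmml_text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.lstrip().startswith("<"):
            return "".join(lines[index:])
    # No XML content found: return the text unchanged rather than guess.
    return pmml_text


# Made-up sample that mimics the header shown in the example above.
sample = (
    "::::::::::::::\n"
    "spark-mllib-lr-model-pmml.xml\n"
    "::::::::::::::\n"
    '<PMML version="4.2"></PMML>\n'
)
cleaned = strip_leading_header(sample)
```

To apply this to a file, read the file contents, pass them through the function, and write the result back to a file with the .xml extension.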

Importing a Spark MLlib model

  • Only classification and regression models are supported
  • Custom transformers, user-defined functions, and classes are not supported
  • For more information on supported frameworks, refer to: Supported frameworks

Importing a scikit-learn model

  • .pkl file is the supported import format
  • To serialize/pickle the model, use the joblib package
  • Only classification and regression models are supported
  • A pandas DataFrame is not supported as the input type for the predict() API
  • Supported deployment types for scikit-learn models are: web service and virtual deployment
  • For more information on supported frameworks, refer to: Supported frameworks
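
The serialization requirement above can be sketched as follows. This is an illustration only, not the required workflow: the training data is made up, the file name sklearn_model.pkl is hypothetical, and LogisticRegression stands in for whatever classification or regression model you trained. The model is scored with a NumPy array because pandas DataFrame input is not supported.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny illustrative training set (made-up values).
X = np.array([[0.0, 0.1], [0.2, 0.3], [0.9, 1.0], [1.1, 0.8]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X, y)

# Serialize the fitted model to a .pkl file with joblib.
model_path = os.path.join(tempfile.mkdtemp(), "sklearn_model.pkl")
joblib.dump(model, model_path)

# Reload the model and score with a NumPy array, not a pandas DataFrame.
restored = joblib.load(model_path)
predictions = restored.predict(X)
```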

Importing an XGBoost model

  • .pkl file is the supported import format
  • To serialize/pickle the model, use the joblib package
  • Only classification and regression models are supported
  • A pandas DataFrame is not supported as the input type for the predict() API
  • Supported deployment types for XGBoost models are: web service and virtual deployment
  • For more information on supported frameworks, refer to: Supported frameworks

Importing a TensorFlow model

  • To save/serialize a TensorFlow model, use the tf.saved_model.save() method
  • tf.estimator is not supported
  • The only supported deployment types for TensorFlow models are: web service and batch
  • For more information on supported frameworks, refer to: Supported frameworks

Importing a PyTorch model

  • The only supported deployment type for PyTorch models is: web service
  • For more information on supported frameworks, refer to: Supported frameworks
  • Before a PyTorch model can be imported to Watson Machine Learning, it must be exported to the .onnx format. Refer to this code:
  torch.onnx.export(<model object>, <prediction/training input data>, "<serialized model>.onnx", verbose=True, input_names=<input tensor names>, output_names=<output tensor names>)

Importing a model using UI

Follow these steps to import a model using UI:

Step 1: Store the model

  1. From the Assets tab of your project in Watson Studio, in the Models section, click New model
  2. In the page that opens, fill in the basic fields:
    • Specify a name for your new model
    • Confirm that the Watson Machine Learning service instance that you associated with your project is selected in the Machine Learning Service section
  3. Click the radio button labeled From file, and then upload your PMML file
  4. Click Create to store the model in your Watson Machine Learning repository.

Step 2: Deploy the model

After the model is stored from the model builder interface, the model details page for your stored model opens automatically.

Deploy your stored model from the model details page by performing the following steps:

  1. Click the Deployments tab
  2. Click Add deployment
  3. Give the deployment a name and then click Save

Importing a model object

Follow these instructions to import a model object:

  1. If your model is located in a remote location, follow Downloading a model stored in a remote location, and then De-serializing models.

  2. Store the model object in your Watson Machine Learning repository. For details, refer to Storing model in your Watson Machine Learning repository.

Importing a model using path to file

Follow these steps to import a model using a path to a file:

  1. If your model is located in a remote location, follow Downloading a model stored in a remote location to download it.

  2. If your model is located locally, place it in a specific directory:

  !cp <saved model> <target directory>
  %cd <target directory>

  3. For scikit-learn, XGBoost, TensorFlow, and PyTorch models, if the downloaded file is not a .tar.gz archive, make an archive:

  !tar -zcvf <saved model>.tar.gz <saved model>

The model file must be at the top level of the directory, for example:

  assets/
  <saved model>
  variables/
  variables/variables.data-00000-of-00001
  variables/variables.index

  4. Use the path to the saved file to store the model file in your Watson Machine Learning repository. For details, refer to Storing model in your Watson Machine Learning repository.
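
If you prefer to build the archive from Python rather than with the tar command, the following sketch uses the standard tarfile module. The helper name make_model_archive and the placeholder file saved_model.pkl are hypothetical. The arcname argument keeps the model at the top level of the archive, which matches the layout requirement described above.

```python
import os
import tarfile
import tempfile


def make_model_archive(model_path: str) -> str:
    """Create <model_path>.tar.gz with the model at the archive's top level."""
    archive_path = model_path + ".tar.gz"
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname drops parent directories so the model sits at the top level.
        archive.add(model_path, arcname=os.path.basename(model_path))
    return archive_path


# Demonstration with a placeholder file standing in for a saved model.
work_dir = tempfile.mkdtemp()
saved_model = os.path.join(work_dir, "saved_model.pkl")
with open(saved_model, "wb") as handle:
    handle.write(b"placeholder model bytes")

archive_path = make_model_archive(saved_model)
with tarfile.open(archive_path, "r:gz") as archive:
    member_names = archive.getnames()
```

The resulting archive path is what you would then pass when you store the model in the repository.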

Importing a model using path to directory

Follow these steps to import a model using a path to a directory:

  1. If your model is located in a remote location, refer to Downloading a model stored in a remote location.
  2. If your model is located locally, place it in a specific directory:

  !cp <saved model> <target directory>
  %cd <target directory>

For scikit-learn, XGBoost, TensorFlow, and PyTorch models, the model file must be at the top level of the directory, for example:

  assets/
  <saved model>
  variables/
  variables/variables.data-00000-of-00001
  variables/variables.index

  3. Use the directory path to store the model file in your Watson Machine Learning repository. For details, refer to Storing model in your Watson Machine Learning repository.
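
The copy step above can also be done in Python, which avoids relying on notebook shell commands. The following is a sketch under stated assumptions: the helper name place_model and the file and directory names are hypothetical, and the model is assumed to be a single file rather than a directory tree.

```python
import os
import shutil
import tempfile


def place_model(saved_model: str, target_dir: str) -> str:
    """Copy a saved model file into target_dir and return the new path."""
    os.makedirs(target_dir, exist_ok=True)
    # shutil.copy places the file directly inside target_dir (top level).
    return shutil.copy(saved_model, target_dir)


# Demonstration with a placeholder file standing in for a saved model.
source_dir = tempfile.mkdtemp()
saved_model = os.path.join(source_dir, "saved_model")
with open(saved_model, "wb") as handle:
    handle.write(b"placeholder model bytes")

target_dir = os.path.join(tempfile.mkdtemp(), "model_dir")
placed = place_model(saved_model, target_dir)
top_level_entries = os.listdir(target_dir)
```

Listing the target directory afterward is a quick way to confirm that the model file sits at the top level before you pass the directory path to the repository.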

Downloading a model stored in a remote location

Follow this sample code to download your model from a remote location:

import os
from wget import download  # third-party wget package

target_dir = '<target directory name>'
# Create the target directory if it does not exist yet
os.makedirs(target_dir, exist_ok=True)

# Download the model only if it is not already present locally
filename = os.path.join(target_dir, '<model name>')
if not os.path.isfile(filename):
    filename = download('<url to model>', out=target_dir)

Storing model in your Watson Machine Learning repository

Use this code to store your model in your Watson Machine Learning repository:

from ibm_watson_machine_learning import APIClient

client = APIClient(<your credentials>)
sw_spec_uid = client.software_specifications.get_uid_by_name("<software specification name>")

meta_props = {
    client.repository.ModelMetaNames.NAME: "<your model name>",
    client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: sw_spec_uid,
    client.repository.ModelMetaNames.TYPE: "<model type>"}
  
client.repository.store_model(model=<your model>, meta_props=meta_props)

Notes:

  • Depending on the model framework used, <your model> can be the actual model object, the full path to a saved model file, or the path to a directory where the model file is located. For details, see Supported import formats.
  • For a list of available software specifications to use as <software specification name>, use the client.software_specifications.list() method.
  • For a list of available model types to use as <model type>, refer to Specifying a model type and configuration.
  • For information on how to create the <your credentials> dictionary, refer to Watson Machine Learning authentication.
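
Because the same call accepts a model object, a file path, or a directory path, it can be convenient to wrap the code above in a small helper. The following is only a sketch: the function name store_in_repository and its parameters are hypothetical, and it uses no client calls beyond the ones shown above. The client argument is assumed to be an already authenticated APIClient instance.

```python
def store_in_repository(client, model, name, model_type, software_spec_name):
    """Store a model object, file path, or directory path in the repository."""
    # Look up the software specification by name, as in the code above.
    sw_spec_uid = client.software_specifications.get_uid_by_name(software_spec_name)

    meta_props = {
        client.repository.ModelMetaNames.NAME: name,
        client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: sw_spec_uid,
        client.repository.ModelMetaNames.TYPE: model_type,
    }

    # 'model' is passed through unchanged: object, file path, or directory path.
    return client.repository.store_model(model=model, meta_props=meta_props)
```

For example, you might call it once with a de-serialized model object and once with the path to a .tar.gz archive, changing only the model argument.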

De-serializing models

To de-serialize models, follow these sections:

De-serializing scikit-learn and XGBoost models

Use this code to de-serialize your scikit-learn or XGBoost model:

  import joblib

  <your model> = joblib.load("<saved model>")

De-serializing Spark MLlib models

Use this code to de-serialize your Spark MLlib model:

  from pyspark.ml import PipelineModel

  <your model> = PipelineModel.load("<saved model>")