Importing models into Watson Machine Learning
If you have a machine learning model or neural network that was trained outside of IBM Watson Machine Learning, you can import that model into your Watson Machine Learning service.
In this context, to import a trained model means to:
- Store the trained model in your Watson Machine Learning repository
- [Optional] Deploy the stored model in your Watson Machine Learning service
Supported import formats
| Import option | PMML | Spark MLlib | scikit-learn | XGBoost | TensorFlow | PyTorch |
|---|---|---|---|---|---|---|
| Importing a model using UI | ✓ | | | | | |
| Importing a model object | | ✓ | ✓ | ✓ | | |
| Importing a model using path to file | | | ✓ | ✓ | ✓ | ✓ |
| Importing a model using path to directory | | | ✓ | ✓ | ✓ | ✓ |
Refer to these sections for additional information about importing specific model types:
- Importing a PMML model
- Importing a Spark MLlib model
- Importing a scikit-learn model
- Importing an XGBoost model
- Importing a TensorFlow model
- Importing a PyTorch model
Importing a PMML model
- The only deployment type supported for models imported from PMML is: web service
- The PMML file must have the `.xml` file extension
- The PMML file must not contain a prolog. Depending on the library you are using, a prolog might be added to the top of the file by default when you save your model, like this example:

  ```
  ::::::::::::::
  spark-mllib-lr-model-pmml.xml
  ::::::::::::::
  ```

  You must remove that prolog before you can import the PMML file to Watson Machine Learning
- For more information on supported frameworks, refer to: Supported frameworks
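Removing the prolog can be scripted. The following is a minimal sketch (not part of the official documentation; the helper name and the sample prolog are illustrative) that strips any text before the first XML tag in the file:

```python
import os
import tempfile

def strip_prolog(pmml_path):
    """Remove any text (such as a 'more'-style header) before the first XML tag."""
    with open(pmml_path, encoding="utf-8") as f:
        content = f.read()
    start = content.find("<")
    if start > 0:
        with open(pmml_path, "w", encoding="utf-8") as f:
            f.write(content[start:])

# Example: a PMML file that was saved with a prolog at the top
path = os.path.join(tempfile.mkdtemp(), "spark-mllib-lr-model-pmml.xml")
with open(path, "w", encoding="utf-8") as f:
    f.write("::::::::::::::\nspark-mllib-lr-model-pmml.xml\n::::::::::::::\n"
            '<?xml version="1.0"?>\n<PMML version="4.2"></PMML>\n')
strip_prolog(path)
```

After running this, the file starts with the XML declaration and can be uploaded.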
Importing a Spark MLlib model
- Only classification and regression models are supported
- Custom transformers, user-defined functions, and classes are not supported
- For more information on supported frameworks, refer to: Supported frameworks
Importing a scikit-learn model
- A `.pkl` file is the supported import format
- To serialize/pickle the model, use the `joblib` package
- Only classification and regression models are supported
- Pandas DataFrame input type for the `predict()` API is not supported
- Supported deployment types for scikit-learn models are: web service and virtual deployment
- For more information on supported frameworks, refer to: Supported frameworks
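To make the serialization requirement concrete, here is a minimal sketch (the model, the toy data, and the file name are placeholders, not from the original documentation) that pickles a scikit-learn classifier with joblib and reloads it:

```python
import joblib
from sklearn.linear_model import LogisticRegression

# Train a tiny classifier on toy data (stands in for your real model)
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
model = LogisticRegression().fit(X, y)

# Serialize with joblib; .pkl is the supported import format
joblib.dump(model, "scikit-model.pkl")

# Reload and predict; note the input is a plain list of lists,
# since pandas DataFrame input for predict() is not supported
restored = joblib.load("scikit-model.pkl")
preds = restored.predict([[0.5], [2.5]])
```

The resulting `scikit-model.pkl` file can then be stored in your repository as described under Importing a model using path to file.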
Importing an XGBoost model
- A `.pkl` file is the supported import format
- To serialize/pickle the model, use the `joblib` package
- Only classification and regression models are supported
- Pandas DataFrame input type for the `predict()` API is not supported
- Supported deployment types for XGBoost models are: web service and virtual deployment
- For more information on supported frameworks, refer to: Supported frameworks
Importing a TensorFlow model
- To save/serialize a TensorFlow model, use the `tf.saved_model.save()` method
- `tf.estimator` is not supported
- The only supported deployment types for TensorFlow models are: web service and batch
- For more information on supported frameworks, refer to: Supported frameworks
Importing a PyTorch model
- The only supported deployment type for PyTorch models is: web service
- For more information on supported frameworks, refer to: Supported frameworks
- For a PyTorch model to be importable to Watson Machine Learning, it must first be exported to `.onnx` format. Refer to this code:

  ```python
  torch.onnx.export(<model object>, <prediction/training input data>, "<serialized model>.onnx",
                    verbose=True, input_names=<input tensor names>, output_names=<output tensor names>)
  ```
Importing a model using UI
Follow these steps to import a model using UI:
Step 1: Store the model
- From the Assets tab of your project in Watson Studio, in the Models section, click New model
- In the page that opens, fill in the basic fields:
- Specify a name for your new model
- Confirm that the Watson Machine Learning service instance that you associated with your project is selected in the Machine Learning Service section
- Click the radio button labeled From file, and then upload your PMML file
- Click Create to store the model in your Watson Machine Learning repository.
Step 2: Deploy the model
After the model is stored from the model builder interface, the model details page for your stored model opens automatically.
Deploy your stored model from the model details page by performing the following steps:
- Click the Deployments tab
- Click Add deployment
- Give the deployment a name and then click Save
Importing a model object
Follow these instructions to import a model object:
- If your model is located in a remote location, follow Downloading a model stored in a remote location, and then De-serializing models
- Store the model object in your Watson Machine Learning repository. For details, refer to Storing model in Watson Machine Learning repository.
Importing a model using path to file
Follow these steps to import a model using a path to a file:
- If your model is located in a remote location, follow Downloading a model stored in a remote location to download it.
- If your model is located locally, place it in a specific directory:

  ```
  !cp <saved model> <target directory>
  !cd <target directory>
  ```
- For scikit-learn, XGBoost, TensorFlow, and PyTorch models, if the downloaded file is not a `.tar.gz` archive, make an archive:

  ```
  !tar -zcvf <saved model>.tar.gz <saved model>
  ```

  The model file must be at the top level of the directory, for example:

  ```
  assets/
  <saved model>
  variables/
  variables/variables.data-00000-of-00001
  variables/variables.index
  ```
- Use the path to the saved file to store the model file in your Watson Machine Learning repository. For details, refer to Storing model in Watson Machine Learning repository.
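If you prefer to create the archive from Python rather than with the `!tar` shell command, the standard library `tarfile` module can do the same thing. A minimal sketch (the file names are placeholders):

```python
import os
import tarfile
import tempfile

# Stand-in for a saved model file
work_dir = tempfile.mkdtemp()
model_path = os.path.join(work_dir, "model.pkl")
with open(model_path, "wb") as f:
    f.write(b"placeholder model bytes")

# arcname keeps the model file at the top level of the archive,
# which the import requires
archive_path = model_path + ".tar.gz"
with tarfile.open(archive_path, "w:gz") as tar:
    tar.add(model_path, arcname=os.path.basename(model_path))

# Inspect the archive contents
with tarfile.open(archive_path, "r:gz") as tar:
    names = tar.getnames()
```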
Importing a model using path to directory
Follow these steps to import a model using a path to a directory:
- If your model is located in a remote location, refer to Downloading a model stored in a remote location.
- If your model is located locally, place it in a specific directory:

  ```
  !cp <saved model> <target directory>
  !cd <target directory>
  ```

  For scikit-learn, XGBoost, TensorFlow, and PyTorch models, the model file must be at the top level of the directory, for example:

  ```
  assets/
  <saved model>
  variables/
  variables/variables.data-00000-of-00001
  variables/variables.index
  ```
- Use the directory path to store the model file in your Watson Machine Learning repository. For details, refer to Storing model in Watson Machine Learning repository.
Downloading a model stored in a remote location
Follow this sample code to download your model from a remote location:
```python
import os
from wget import download

target_dir = '<target directory name>'
if not os.path.isdir(target_dir):
    os.mkdir(target_dir)
filename = os.path.join(target_dir, '<model name>')
if not os.path.isfile(filename):
    filename = download('<url to model>', out=target_dir)
```
Storing model in your Watson Machine Learning repository
Use this code to store your model in your Watson Machine Learning repository:
```python
from ibm_watson_machine_learning import APIClient

client = APIClient(<your credentials>)
sw_spec_uid = client.software_specifications.get_uid_by_name("<software specification name>")

meta_props = {
    client.repository.ModelMetaNames.NAME: "<your model name>",
    client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: sw_spec_uid,
    client.repository.ModelMetaNames.TYPE: "<model type>"
}

client.repository.store_model(model=<your model>, meta_props=meta_props)
```
Notes:
- Depending on the model framework used, `<your model>` can be the actual model object, the full path to a saved model file, or the path to a directory where the model file is located. For details, see Supported import formats.
- For a list of available software specifications to use as `<software specification name>`, use the `client.software_specifications.list()` method.
- For a list of available model types to use as `<model type>`, refer to Specifying a model type and configuration.
- For information on how to create the `<your credentials>` dictionary, refer to Watson Machine Learning authentication.
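For orientation only: on the IBM Cloud version of the service, the credentials dictionary typically contains a regional endpoint URL and an API key. Treat the keys below as an assumption and confirm them against Watson Machine Learning authentication:

```python
# Illustrative only -- the exact keys depend on your deployment
wml_credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",  # region-specific endpoint
    "apikey": "<your IBM Cloud API key>",
}
```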
De-serializing models
To de-serialize models, follow these sections:
De-serializing scikit-learn and XGBoost models
Use this code to de-serialize your scikit-learn or XGBoost model:

```python
import joblib

<your model> = joblib.load("<saved model>")
```
De-serializing Spark MLlib models
Use this code to de-serialize your Spark MLlib model:
```python
from pyspark.ml import PipelineModel

<your model> = PipelineModel.load("<saved model>")
```