Importing trained scikit-learn models into Watson Machine Learning
If you have a scikit-learn model that you trained outside of IBM Watson Machine Learning, this topic describes how to import that model into your Watson Machine Learning service.
Restrictions and requirements
- If you create the archive file for the model, use the sklearn.externals.joblib package to serialize (pickle) the model.
- If you create the archive file for the model, the saved (serialized) model file must be at the top level of the .tar.gz file that is uploaded to the Watson Machine Learning repository using the client.repository.store_model() API.
- Only classification and regression models are supported.
- The Pandas DataFrame input type for the predict() API is not supported.
- Models can contain references only to those packages (and package versions) that are available by default in the Anaconda distribution used.
- The only supported deployment type for scikit-learn models is: web service
- See also: Supported frameworks
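For example, a minimal sketch of serializing a trained pipeline with joblib might look like the following. The object and file names are illustrative, and sklearn.externals.joblib applies to scikit-learn 0.19, the framework version used in this topic:

# Assumes scikit-learn 0.19, where joblib is bundled under sklearn.externals
from sklearn.externals import joblib

# "pipeline_org" is the trained Pipeline built in Step 0 below;
# the file name is illustrative
joblib.dump( pipeline_org, "tent-prediction-model.pkl" )

# The same file can later be loaded back into memory
pipeline_restored = joblib.load( "tent-prediction-model.pkl" )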
Example
For a notebook that demonstrates importing a scikit-learn model, see the sample notebook referenced later in this topic.
Interface options
There are three options for importing trained scikit-learn models:
- Option 1: If you have saved your model in PMML format, see: Importing models saved in PMML format
- Option 2: Python client
- Option 3: Command line interface (CLI)
Step 0 for interface options 2 and 3: Build, train, and save a model
The following Python code snippet demonstrates:
- Creating a Pipeline called pipeline_org
- Saving the trained model in a pickle file called "tent-prediction-model.pkl"
- Copying the pickle file to a directory called "model-dir"
- Saving the pickle file in a tar.gz file called "tent-prediction-model.tar.gz"
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
import pickle

# Create and train the pipeline
pipeline_org = Pipeline( steps = [ ( "classifier", LogisticRegression() ) ] )
pipeline_org.fit( X_train, y_train )

# Save the trained model in a pickle file
pickle.dump( pipeline_org, open( "tent-prediction-model.pkl", 'wb') )

# Copy the pickle file to a directory, and also package it in a tar.gz file
!mkdir model-dir
!cp tent-prediction-model.pkl model-dir
!tar -zcvf tent-prediction-model.tar.gz tent-prediction-model.pkl
Where:
- X_train is a Pandas DataFrame containing your training data
- y_train is a Pandas Series containing the training data labels
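As an illustration only, training data of that shape might be prepared as follows. The file name, column names, and split parameters here are assumptions, not part of the sample; because this pipeline contains only a LogisticRegression classifier, the feature columns must be numeric:

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical CSV with numeric feature columns and a label column named "label"
df = pd.read_csv( "training-data.csv" )

X = df.drop( columns=[ "label" ] )   # features as a DataFrame
y = df[ "label" ]                    # labels as a Series

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=42 )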
For the full code example, see: the sample notebook
Interface option 2: Watson Machine Learning Python client
Step 1: Store the model in your Watson Machine Learning repository
You can store the Pipeline in your Watson Machine Learning repository using the Watson Machine Learning Python client store_model method.
Format options:
- In-memory Pipeline object:

  from watson_machine_learning_client import WatsonMachineLearningAPIClient

  client = WatsonMachineLearningAPIClient( <your-credentials> )

  pipeline = pickle.load( open( "tent-prediction-model.pkl", 'rb') )
  model_details = client.repository.store_model( pipeline, "My scikit-learn model (in-memory object)" )
- Pipeline saved in a pickle file:

  from watson_machine_learning_client import WatsonMachineLearningAPIClient

  client = WatsonMachineLearningAPIClient( <your-credentials> )

  metadata = {
      client.repository.ModelMetaNames.NAME: "My scikit-learn model (.pkl)",
      client.repository.ModelMetaNames.FRAMEWORK_NAME: "scikit-learn",
      client.repository.ModelMetaNames.FRAMEWORK_VERSION: "0.19"
  }
  model_details = client.repository.store_model( model="model-dir", meta_props=metadata )
- Pipeline saved in a pickle file in a tar.gz file:

  from watson_machine_learning_client import WatsonMachineLearningAPIClient

  client = WatsonMachineLearningAPIClient( <your-credentials> )

  metadata = {
      client.repository.ModelMetaNames.NAME: "My scikit-learn model (tar.gz)",
      client.repository.ModelMetaNames.FRAMEWORK_NAME: "scikit-learn",
      client.repository.ModelMetaNames.FRAMEWORK_VERSION: "0.19"
  }
  model_details = client.repository.store_model( model="tent-prediction-model.tar.gz", meta_props=metadata )
Where:
- <your-credentials> contains credentials for your Watson Machine Learning service (see: Looking up credentials)
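As a minimal sketch, a credentials dictionary might look like the following. The keys shown are illustrative; copy the actual values from the service credentials of your Watson Machine Learning instance in the IBM Cloud console:

# Illustrative only: replace with the real values from your
# Watson Machine Learning service credentials
wml_credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",
    "apikey": "***",
    "instance_id": "***"
}
client = WatsonMachineLearningAPIClient( wml_credentials )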
Step 2: Deploy the stored model in your Watson Machine Learning service
The following example demonstrates deploying the stored model as a web service, which is the default deployment type:
model_id = model_details["metadata"]["guid"]
model_deployment_details = client.deployments.create( artifact_uid=model_id, name="My scikit-learn model deployment" )
See: Deployments.create
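After the deployment succeeds, you can send it a scoring request. The following is a minimal sketch using the Python client; the field names and values in the payload are placeholders for the feature columns your pipeline was trained on:

# Retrieve the scoring endpoint of the new deployment
scoring_url = client.deployments.get_scoring_url( model_deployment_details )

# Placeholder feature names and values; replace with your own training columns
payload = { "fields": [ "feature_1", "feature_2" ], "values": [ [ 0.5, 1.2 ] ] }

predictions = client.deployments.score( scoring_url, payload )
print( predictions )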
Interface option 3: Watson Machine Learning CLI
Prerequisite: Set up the CLI environment.
Step 1: Store the model in your Watson Machine Learning repository
Example command and corresponding output
>ibmcloud ml store <model-filename> <manifest-filename>
Starting to store ...
OK
Model store successful. Model-ID is '145bca56-134f-7e89-3c12-0d3a7859d21f'.
Where:
- <model-filename> is the path and name of the tar.gz file
- <manifest-filename> is the path and name of a manifest file containing metadata about the model being stored
Sample manifest file contents
name: My scikit-learn model
framework:
name: scikit-learn
version: '0.19'
See: store CLI command
Step 2: Deploy the stored model in your Watson Machine Learning service
The following example demonstrates deploying the stored model as a web service, which is the default deployment type:
Example command and corresponding output
>ibmcloud ml deploy <model-id> "My scikit-learn model deployment"
Deploying the model with MODEL-ID '145bca56-134f-7e89-3c12-0d3a7859d21f'...
DeploymentId 316a89e2-1234-6472-1390-c5432d16bf73
Scoring endpoint https://us-south.ml.cloud.ibm.com/v3/wml_instances/5da31...
Name My scikit-learn model deployment
Type scikit-learn-0.19
Runtime None Provided
Status DEPLOY_SUCCESS
Created at 2019-01-14T19:47:51.735Z
OK
Deploy model successful
Where:
- <model-id> was returned in the output from the store command
See: deploy CLI command