Importing trained scikit-learn models into Watson Machine Learning

If you have a scikit-learn model that you trained outside of IBM Watson Machine Learning, this topic describes how to import that model into your Watson Machine Learning service.

 

Restrictions and requirements

  • If you create the archive file for the model, use the sklearn.externals.joblib package to serialize (pickle) the model.
  • If you create the archive file for the model, the serialized model file must be at the top level of the .tar.gz file that you upload to the Watson Machine Learning repository with the client.repository.store_model() API.
  • Only classification and regression models are supported
  • Pandas DataFrame input is not supported for the predict() API
  • Models can reference only those packages (and package versions) that are available by default in the Anaconda distribution used
  • The only supported deployment type for scikit-learn models is: web service
  • See also: Supported frameworks
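
The top-level requirement for the .tar.gz file can be checked before uploading. The following is a minimal sketch that uses only the Python standard library; the helper name top_level_files and the archive names good.tar.gz and bad.tar.gz are hypothetical, and a small dictionary stands in for a real serialized model.

```python
import pickle
import tarfile
import tempfile
from pathlib import Path, PurePosixPath


def top_level_files(archive_path):
    """Return the names of regular files stored at the top level of a .tar.gz archive."""
    with tarfile.open(str(archive_path), "r:gz") as archive:
        return [
            member.name
            for member in archive.getmembers()
            # A top-level entry has exactly one path component
            if member.isfile() and len(PurePosixPath(member.name).parts) == 1
        ]


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a serialized model (assumption: any picklable object works for the demo)
    pkl = Path(tmp) / "tent-prediction-model.pkl"
    pkl.write_bytes(pickle.dumps({"stand-in": "model"}))

    # Archive with the pickle file at the top level
    good = Path(tmp) / "good.tar.gz"
    with tarfile.open(str(good), "w:gz") as archive:
        archive.add(str(pkl), arcname=pkl.name)

    # Archive with the pickle file nested inside a directory
    bad = Path(tmp) / "bad.tar.gz"
    with tarfile.open(str(bad), "w:gz") as archive:
        archive.add(str(pkl), arcname="model-dir/" + pkl.name)

    good_files = top_level_files(good)
    bad_files = top_level_files(bad)
```

An empty result, as for the nested archive here, suggests the archive does not meet the top-level requirement.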

 

Example

The following notebook demonstrates importing a scikit-learn model:

 

Interface options

There are three options for importing trained scikit-learn models:

 

Step 0 for interface options 2 and 3: Build, train, and save a model

The following Python code snippet demonstrates:

  • Creating and training a Pipeline called pipeline_org
  • Saving the trained model in a pickle file called “tent-prediction-model.pkl”
  • Copying the pickle file to a directory called “model-dir”
  • Saving the pickle file in a tar.gz file called “tent-prediction-model.tar.gz”
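    import pickle
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    # Build and train a single-step pipeline
    pipeline_org = Pipeline( steps = [ ( "classifier", LogisticRegression() ) ] )
    pipeline_org.fit( X_train, y_train )

    # Serialize the trained pipeline; the with block closes the file before it is copied
    with open( "tent-prediction-model.pkl", 'wb' ) as f:
        pickle.dump( pipeline_org, f )

    # Notebook shell commands: copy the pickle file into a directory, then archive it
    !mkdir model-dir
    !cp tent-prediction-model.pkl model-dir
    !tar -zcvf tent-prediction-model.tar.gz tent-prediction-model.pkl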
    

Where:

  • X_train is a Pandas DataFrame containing your training data
  • y_train is a Pandas Series with the training data labels

For the full code example, see the sample notebook.
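
Outside a notebook, the same packaging steps can be done without shell commands. The following sketch uses only the Python standard library; the function name package_model is hypothetical, and a placeholder dictionary stands in for the trained pipeline_org so that the example runs without scikit-learn or training data.

```python
import pickle
import shutil
import tarfile
import tempfile
from pathlib import Path


def package_model(model, work_dir, base_name="tent-prediction-model"):
    """Pickle `model`, copy the pickle file into model-dir, and archive it as .tar.gz."""
    work_dir = Path(work_dir)
    pickle_path = work_dir / (base_name + ".pkl")
    with open(str(pickle_path), "wb") as f:
        pickle.dump(model, f)

    # Equivalent of: mkdir model-dir, then cp <pickle file> model-dir
    model_dir = work_dir / "model-dir"
    model_dir.mkdir(exist_ok=True)
    shutil.copy(str(pickle_path), str(model_dir))

    # Equivalent of the tar command; arcname keeps the pickle file at the top level
    archive_path = work_dir / (base_name + ".tar.gz")
    with tarfile.open(str(archive_path), "w:gz") as archive:
        archive.add(str(pickle_path), arcname=pickle_path.name)
    return pickle_path, model_dir, archive_path


# Stand-in object (assumption): replace with the trained pipeline in real use
stand_in_model = {"kind": "placeholder for pipeline_org"}
with tempfile.TemporaryDirectory() as tmp:
    pickle_path, model_dir, archive_path = package_model(stand_in_model, tmp)
    with tarfile.open(str(archive_path), "r:gz") as archive:
        archived_names = archive.getnames()
    copied = (model_dir / pickle_path.name).exists()
```

Passing arcname explicitly avoids accidentally storing the full working-directory path inside the archive.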

 

Interface option 2: Watson Machine Learning Python client

Step 1: Store the model in your Watson Machine Learning repository

You can store the Pipeline in your Watson Machine Learning repository using the Watson Machine Learning Python client store_model method.

Format options:

  • In-memory Pipeline object:

    import pickle
    from watson_machine_learning_client import WatsonMachineLearningAPIClient
    client = WatsonMachineLearningAPIClient( <your-credentials> )
    # Load the pipeline saved in Step 0 back into memory
    with open( "tent-prediction-model.pkl", 'rb' ) as f:
        pipeline = pickle.load( f )
    model_details = client.repository.store_model( pipeline, "My scikit-learn model (in-memory object)" )
    
  • Pipeline saved in a pickle file in a directory (pass the directory path):

    from watson_machine_learning_client import WatsonMachineLearningAPIClient
    client = WatsonMachineLearningAPIClient( <your-credentials> )
    metadata = {
        client.repository.ModelMetaNames.NAME: "My scikit-learn model (.pkl)",
        client.repository.ModelMetaNames.FRAMEWORK_NAME: "scikit-learn",
        client.repository.ModelMetaNames.FRAMEWORK_VERSION: "0.19"
    }
    model_details = client.repository.store_model( model="model-dir", meta_props=metadata )
    
  • Pipeline saved in a pickle file in a tar.gz file:

    from watson_machine_learning_client import WatsonMachineLearningAPIClient
    client = WatsonMachineLearningAPIClient( <your-credentials> )
    metadata = {
        client.repository.ModelMetaNames.NAME: "My scikit-learn model (tar.gz)",
        client.repository.ModelMetaNames.FRAMEWORK_NAME: "scikit-learn",
        client.repository.ModelMetaNames.FRAMEWORK_VERSION: "0.19"
    }
    model_details = client.repository.store_model( model="tent-prediction-model.tar.gz", meta_props=metadata )
    

Where:

  • <your-credentials> contains credentials for your Watson Machine Learning service (see: Looking up credentials)
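
The two file-based format options differ only in the model path and the name, so the shared metadata can be factored into a small helper. This is a sketch, not part of the client library: store_sklearn_model is a hypothetical function, and fake_client is a stand-in object (with made-up metadata key values) that only mimics the attributes used above so the example runs without a service instance.

```python
from types import SimpleNamespace


def store_sklearn_model(client, model, name, framework_version="0.19"):
    """Store a pickle directory or tar.gz path with the metadata keys shown above."""
    names = client.repository.ModelMetaNames
    metadata = {
        names.NAME: name,
        names.FRAMEWORK_NAME: "scikit-learn",
        names.FRAMEWORK_VERSION: framework_version,
    }
    return client.repository.store_model(model=model, meta_props=metadata)


# Stand-in for the real client (assumption): records the call and returns a fake guid
def _fake_store_model(model, meta_props):
    return {"metadata": {"guid": "hypothetical-guid"}, "stored": (model, meta_props)}


fake_client = SimpleNamespace(
    repository=SimpleNamespace(
        ModelMetaNames=SimpleNamespace(
            NAME="name",  # hypothetical key values, for illustration only
            FRAMEWORK_NAME="frameworkName",
            FRAMEWORK_VERSION="frameworkVersion",
        ),
        store_model=_fake_store_model,
    )
)

details = store_sklearn_model(
    fake_client, "tent-prediction-model.tar.gz", "My scikit-learn model (tar.gz)"
)
```

With a real client, pass the WatsonMachineLearningAPIClient instance in place of fake_client.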

Step 2: Deploy the stored model in your Watson Machine Learning service

The following example demonstrates deploying the stored model as a web service, which is the default deployment type:

model_id = model_details["metadata"]["guid"]
model_deployment_details = client.deployments.create( artifact_uid=model_id, name="My scikit-learn model deployment" )

See: Deployments.create
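
If the store step failed or returned an unexpected structure, the guid lookup raises a bare KeyError. The following sketch wraps the lookup with a clearer error; get_model_id is a hypothetical helper, and example_details is a made-up dictionary shaped like the lookup in the deployment example.

```python
def get_model_id(model_details):
    """Return the stored model's guid, or raise a descriptive error if it is missing."""
    try:
        return model_details["metadata"]["guid"]
    except (KeyError, TypeError) as exc:
        raise ValueError(
            "model_details has no metadata.guid; was the model stored?"
        ) from exc


# Hypothetical response (assumption): only the keys used by the lookup are included
example_details = {"metadata": {"guid": "145bca56-134f-7e89-3c12-0d3a7859d21f"}}
model_id = get_model_id(example_details)
```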

 

Interface option 3: Watson Machine Learning CLI

Prerequisite: Set up the CLI environment.

Step 1: Store the model in your Watson Machine Learning repository

Example command and corresponding output

>ibmcloud ml store <model-filename> <manifest-filename>
Starting to store ...
OK
Model store successful. Model-ID is '145bca56-134f-7e89-3c12-0d3a7859d21f'.

Where:

  • <model-filename> is the path and name of the tar.gz file
  • <manifest-filename> is the path and name of a manifest file containing metadata about the model being stored
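
When the store command is run from a script, the model ID can be pulled out of the captured output. This is a sketch that assumes the output keeps the wording shown in the example above; parse_model_id is a hypothetical helper, not part of the CLI.

```python
import re

# Matches the success line shown in the example output
MODEL_ID_PATTERN = re.compile(r"Model-ID is '([0-9a-f-]+)'")


def parse_model_id(store_output):
    """Return the model ID found in the store command output, or None if absent."""
    match = MODEL_ID_PATTERN.search(store_output)
    return match.group(1) if match else None


sample_output = """Starting to store ...
OK
Model store successful. Model-ID is '145bca56-134f-7e89-3c12-0d3a7859d21f'.
"""
model_id = parse_model_id(sample_output)
```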

Sample manifest file contents

name: My scikit-learn model
framework:
  name: scikit-learn
  version: '0.19'

See: store CLI command
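
The manifest file can also be generated from a script. The following sketch writes the same three fields as the sample manifest; write_manifest and the file name manifest.yml are hypothetical, and the template does no escaping, so it assumes a model name without YAML special characters such as a colon.

```python
import tempfile
from pathlib import Path

# Same layout as the sample manifest file contents
MANIFEST_TEMPLATE = """name: {name}
framework:
  name: scikit-learn
  version: '{version}'
"""


def write_manifest(path, name, version="0.19"):
    """Write a manifest file for a scikit-learn model and return its text."""
    text = MANIFEST_TEMPLATE.format(name=name, version=version)
    Path(path).write_text(text, encoding="utf-8")
    return text


with tempfile.TemporaryDirectory() as tmp:
    manifest_path = Path(tmp) / "manifest.yml"  # hypothetical file name
    manifest_text = write_manifest(manifest_path, "My scikit-learn model")
    written = manifest_path.read_text(encoding="utf-8")
```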

Step 2: Deploy the stored model in your Watson Machine Learning service

The following example demonstrates deploying the stored model as a web service, which is the default deployment type:

Example command and corresponding output

>ibmcloud ml deploy <model-id> "My scikit-learn model deployment"
Deploying the model with MODEL-ID '145bca56-134f-7e89-3c12-0d3a7859d21f'...
DeploymentId       316a89e2-1234-6472-1390-c5432d16bf73
Scoring endpoint   https://us-south.ml.cloud.ibm.com/v3/wml_instances/5da31...
Name               My scikit-learn model deployment
Type               scikit-learn-0.19
Runtime            None Provided
Status             DEPLOY_SUCCESS
Created at         2019-01-14T19:47:51.735Z
OK
Deploy model successful

Where:

  • <model-id> was returned in the output from the store command

See: deploy CLI command
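
The deploy output lists its fields as label and value columns, which a script can read into a dictionary. This sketch assumes the columns are separated by two or more spaces, as in the example above; parse_deploy_output is a hypothetical helper, and the truncated scoring endpoint is kept exactly as shown.

```python
import re


def parse_deploy_output(output):
    """Map each 'label   value' row of the deploy output to a dictionary entry."""
    fields = {}
    for line in output.splitlines():
        # Split on the first run of two or more whitespace characters
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) == 2:
            fields[parts[0]] = parts[1]
    return fields


sample_output = """Deploying the model with MODEL-ID '145bca56-134f-7e89-3c12-0d3a7859d21f'...
DeploymentId       316a89e2-1234-6472-1390-c5432d16bf73
Scoring endpoint   https://us-south.ml.cloud.ibm.com/v3/wml_instances/5da31...
Name               My scikit-learn model deployment
Type               scikit-learn-0.19
Runtime            None Provided
Status             DEPLOY_SUCCESS
Created at         2019-01-14T19:47:51.735Z
OK
Deploy model successful
"""
fields = parse_deploy_output(sample_output)
deployment_ok = fields.get("Status") == "DEPLOY_SUCCESS"
```

Lines without a column gap, such as the progress message and the final OK, are skipped.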