Importing trained XGBoost model into Watson Machine Learning

If you have an XGBoost model that you trained outside of IBM Watson Machine Learning, this topic describes how to import that model into your Watson Machine Learning service.

 

Restrictions and requirements

  • If you create the archive file for the model, use the sklearn.externals.joblib package to serialize (pickle) the model.
  • If you create the archive file for the model, the saved (serialized) model file must be at the top level of the .tar.gz file that is uploaded to the Watson Machine Learning repository using the client.repository.store_model() API.
  • Only classification and regression models are supported.
  • The Pandas DataFrame input type for the predict() API is not supported.
  • Models can contain references only to those packages (and package versions) that are available by default in the Anaconda installer of the Anaconda distribution used.
  • The only supported deployment type for XGBoost models is: web service.
  • See also: Supported frameworks
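The archive-layout requirement above is easy to get wrong. The following sketch, using only the Python standard library (plain pickle stands in for sklearn.externals.joblib so the example has no extra dependencies, and the model object is a placeholder), packages a serialized file at the top level of a .tar.gz archive and verifies its placement:

```python
import os
import pickle
import tarfile

# Serialize a stand-in object (a real workflow would pickle the
# trained model; this placeholder keeps the sketch self-contained)
model = {"stub": "trained model"}
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Add the pickle at the TOP LEVEL of the archive -- no enclosing
# directory -- as store_model() requires
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("model.pkl", arcname="model.pkl")

# Verify: no member name contains a directory component
with tarfile.open("model.tar.gz", "r:gz") as tar:
    names = tar.getnames()
assert all("/" not in n and os.sep not in n for n in names)
print(names)  # ['model.pkl']
```

If the pickle were added as, for example, `model-dir/model.pkl`, the member name would contain a directory component and the check would fail.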

 

Example

The following notebook demonstrates importing an XGBoost model:

[XGBOOST MODEL NEEDED]

 

Interface options

There are three options for importing trained XGBoost models:

 

Step 0 for interface options 2 and 3: Build, train, and save a model

The following Python code snippet demonstrates:

  • Creating a Pipeline called pipeline_org with an XGBoost classifier
  • Saving the trained model in a pickle file called "tent-prediction-model.pkl"
  • Copying the pickle file to a directory called "model-dir"
  • Saving the pickle file in a tar.gz file called "tent-prediction-model.tar.gz"

    from sklearn.pipeline import Pipeline
    from xgboost import XGBClassifier
    import pickle

    # Build and train a pipeline with an XGBoost classifier as its final step
    pipeline_org = Pipeline( steps = [ ( "classifier", XGBClassifier() ) ] )
    pipeline_org.fit( X_train, y_train )

    # Serialize the trained pipeline to a pickle file
    pickle.dump( pipeline_org, open( "tent-prediction-model.pkl", 'wb') )

    # Copy the pickle file into a directory, and also package it at the
    # top level of a .tar.gz archive (see "Restrictions and requirements")
    !mkdir model-dir
    !cp tent-prediction-model.pkl model-dir
    !tar -zcvf tent-prediction-model.tar.gz tent-prediction-model.pkl
    

Where:

  • X_train is a Pandas DataFrame containing your training data
  • y_train is a Pandas Series with the training data labels

For the full code example, see the sample notebook.
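The X_train and y_train inputs above might be prepared like this (the column names and values here are hypothetical stand-ins, not data from the sample notebook):

```python
import pandas as pd

# Hypothetical training data: three feature columns and a binary label
data = pd.DataFrame({
    "GENDER":      [0, 1, 0, 1],
    "AGE":         [27, 48, 33, 51],
    "MARITAL":     [0, 1, 1, 0],
    "TENT_BOUGHT": [1, 0, 1, 0],
})

# X_train: a Pandas DataFrame containing the training features
X_train = data.drop(columns=["TENT_BOUGHT"])

# y_train: a Pandas Series with the training data labels
y_train = data["TENT_BOUGHT"]

print(X_train.shape, y_train.shape)  # (4, 3) (4,)
```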

 

Interface option 2: Watson Machine Learning Python client

Step 1: Store the model in your Watson Machine Learning repository

You can store the Pipeline in your Watson Machine Learning repository by using the Watson Machine Learning Python client store_model method.

Format options:

  • In-memory Pipeline object:

    from watson_machine_learning_client import WatsonMachineLearningAPIClient
    import pickle

    client = WatsonMachineLearningAPIClient( <your-credentials> )
    pipeline = pickle.load( open( "tent-prediction-model.pkl", 'rb') )
    model_details = client.repository.store_model( pipeline, "My xgboost model (in-memory object)" )
    
  • Pipeline saved in a pickle file:

    from watson_machine_learning_client import WatsonMachineLearningAPIClient
    client = WatsonMachineLearningAPIClient( <your-credentials> )
    metadata = {
        client.repository.ModelMetaNames.NAME: "My xgboost model (.pkl)",
        client.repository.ModelMetaNames.FRAMEWORK_NAME: "xgboost",
        client.repository.ModelMetaNames.FRAMEWORK_VERSION: "0.82"
    }
    model_details = client.repository.store_model( model="model-dir", meta_props=metadata )
    
  • Pipeline saved in a pickle file in a tar.gz file:

    from watson_machine_learning_client import WatsonMachineLearningAPIClient
    client = WatsonMachineLearningAPIClient( <your-credentials> )
    metadata = {
        client.repository.ModelMetaNames.NAME: "My xgboost model (tar.gz)",
        client.repository.ModelMetaNames.FRAMEWORK_NAME: "xgboost",
        client.repository.ModelMetaNames.FRAMEWORK_VERSION: "0.82"
    }
    model_details = client.repository.store_model( model="tent-prediction-model.tar.gz", meta_props=metadata )
    

Where:

  • <your-credentials> contains credentials for your Watson Machine Learning service (see: Looking up credentials)
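The credentials you pass to the client are a Python dictionary. Its exact fields depend on how your service instance was provisioned; the shape below is an assumption typical of an IBM Cloud service-credentials entry, with placeholder values you would replace with the real ones from your service:

```python
# Placeholder credentials -- copy the real values from the
# Credentials section of your Watson Machine Learning service.
# Field names here are illustrative, not authoritative.
wml_credentials = {
    "url":         "https://us-south.ml.cloud.ibm.com",
    "apikey":      "***",
    "instance_id": "***",
}

# The dictionary is passed directly to the client constructor:
#   client = WatsonMachineLearningAPIClient( wml_credentials )
print(sorted(wml_credentials))
```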

Step 2: Deploy the stored model in your Watson Machine Learning service

The following example demonstrates deploying the stored model as a web service, which is the default deployment type:

model_id = model_details["metadata"]["guid"]
model_deployment_details = client.deployments.create( artifact_uid=model_id, name="My xgboost model deployment" )

See: Deployments.create

 

Interface option 3: Watson Machine Learning CLI

Prerequisite: Set up the CLI environment.

Step 1: Store the model in your Watson Machine Learning repository

Example command and corresponding output

>ibmcloud ml store <model-filename> <manifest-filename>
Starting to store ...
OK
Model store successful. Model-ID is '145bca56-134f-7e89-3c12-0d3a7859d21f'.

Where:

  • <model-filename> is the path and name of the tar.gz file
  • <manifest-filename> is the path and name of a manifest file containing metadata about the model being stored

Sample manifest file contents

name: My xgboost model
framework:
  name: xgboost
  version: '0.82'

See: store CLI command
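In a script, you can capture the Model-ID from the store command's output so it can be passed to the deploy command in Step 2. A minimal sketch, run here against the sample output shown above:

```python
import re

# Sample output from `ibmcloud ml store`, taken verbatim from above
output = """Starting to store ...
OK
Model store successful. Model-ID is '145bca56-134f-7e89-3c12-0d3a7859d21f'."""

# Extract the quoted Model-ID for use with `ibmcloud ml deploy`
match = re.search(r"Model-ID is '([0-9a-f-]+)'", output)
model_id = match.group(1)
print(model_id)  # 145bca56-134f-7e89-3c12-0d3a7859d21f
```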

Step 2: Deploy the stored model in your Watson Machine Learning service

The following example demonstrates deploying the stored model as a web service, which is the default deployment type:

Example command and corresponding output

>ibmcloud ml deploy <model-id> "My xgboost model deployment"
Deploying the model with MODEL-ID '145bca56-134f-7e89-3c12-0d3a7859d21f'...
DeploymentId       316a89e2-1234-6472-1390-c5432d16bf73
Scoring endpoint   https://us-south.ml.cloud.ibm.com/v3/wml_instances/5da31...
Name               My xgboost model deployment
Type               xgboost-0.82
Runtime            None Provided
Status             DEPLOY_SUCCESS
Created at         2019-01-14T19:47:51.735Z
OK
Deploy model successful

Where:

  • <model-id> was returned in the output from the store command

See: deploy CLI command