Importing trained XGBoost model into Watson Machine Learning
If you have an XGBoost model that you trained outside of IBM Watson Machine Learning, this topic describes how to import that model into your Watson Machine Learning service.
Restrictions and requirements
- If you create the archive file for the model, use the `sklearn.externals.joblib` package to serialize/pickle the model.
- If you create the archive file for the model, the saved (serialized) model file must be at the top level of the .tar.gz file that is saved/uploaded to the Watson Machine Learning repository using the `client.repository.store_model()` API.
- Only classification and regression models are supported.
- Pandas DataFrame input type for the `predict()` API is not supported.
- Models can contain references only to packages (and package versions) that are available by default in the Anaconda installer of the Anaconda distribution used.
- The only supported deployment type for XGBoost models is: web service.
- See also: Supported frameworks
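The top-level requirement above means the serialized model file must sit at the root of the archive, with no directory prefix. The following stdlib-only sketch illustrates that layout (the model object and file names are placeholders; plain pickle is used here only to keep the sketch self-contained, whereas the restriction above calls for `sklearn.externals.joblib` — the standalone `joblib` package in recent scikit-learn versions):

```python
import pickle
import tarfile

# Placeholder standing in for your trained XGBoost pipeline
model = {"stub": "replace with your trained XGBoost pipeline"}

# Serialize the model to a pickle file
with open("tent-prediction-model.pkl", "wb") as f:
    pickle.dump(model, f)

# Add the pickle with no directory prefix so it lands at the archive root
with tarfile.open("tent-prediction-model.tar.gz", "w:gz") as tar:
    tar.add("tent-prediction-model.pkl", arcname="tent-prediction-model.pkl")

# Sanity check: member names must contain no path separator
with tarfile.open("tent-prediction-model.tar.gz", "r:gz") as tar:
    names = tar.getnames()
print(names)  # ['tent-prediction-model.pkl']
```

If the pickle were added as, say, `model-dir/tent-prediction-model.pkl`, it would not be at the top level and the repository upload would not find it.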
Example
The following notebook demonstrates importing an XGBoost model:
[XGBOOST MODEL NEEDED]
Interface options
There are three options for importing trained XGBoost models:
- Option 1: If you have saved your model in PMML format, see: Importing models saved in PMML format
- Option 2: Python client
- Option 3: Command line interface (CLI)
Step 0 for interface options 2 and 3: Build, train, and save a model
The following Python code snippet demonstrates:
- Creating a Pipeline called `pipeline_org` that contains an XGBoost classifier
- Saving the trained model in a pickle file called "tent-prediction-model.pkl"
- Copying the pickle file to a directory called "model-dir"
- Saving the pickle file in a tar.gz file called "tent-prediction-model.tar.gz"

```python
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier
import pickle

# Build and train a pipeline containing an XGBoost classifier
pipeline_org = Pipeline( steps = [ ( "classifier", XGBClassifier() ) ] )
pipeline_org.fit( X_train, y_train )

# Serialize the trained pipeline to a pickle file
pickle.dump( pipeline_org, open( "tent-prediction-model.pkl", 'wb' ) )

# Copy the pickle file to a directory, and package it at the top
# level of a tar.gz file (notebook shell commands)
!mkdir model-dir
!cp tent-prediction-model.pkl model-dir
!tar -zcvf tent-prediction-model.tar.gz tent-prediction-model.pkl
```
Where:
- `X_train` is a Pandas DataFrame containing your training data
- `y_train` is a Pandas Series containing the training data labels
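As a minimal sketch, `X_train` and `y_train` might look like this (the feature names and values here are hypothetical, invented for illustration only; the real data comes from the sample notebook):

```python
import pandas as pd

# Hypothetical training data: feature names and values are invented
# for illustration, not taken from the sample notebook
X_train = pd.DataFrame(
    {
        "GENDER": [0, 1, 0, 1],
        "AGE": [27, 34, 45, 23],
        "MARITAL_STATUS": [1, 0, 1, 0],
    }
)
y_train = pd.Series([1, 0, 0, 1], name="TENT_PURCHASED")  # training labels

print(type(X_train).__name__, type(y_train).__name__)  # DataFrame Series
```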
For the full code example, see: the sample notebook
Interface option 2: Watson Machine Learning Python client
Step 1: Store the model in your Watson Machine Learning repository
You can store the Pipeline in your Watson Machine Learning repository using the Watson Machine Learning Python client `store_model` method.
Format options:
- In-memory Pipeline object:

  ```python
  from watson_machine_learning_client import WatsonMachineLearningAPIClient
  client = WatsonMachineLearningAPIClient( <your-credentials> )
  pipeline = pickle.load( open( "tent-prediction-model.pkl", 'rb' ) )
  model_details = client.repository.store_model( pipeline, "My xgboost model (in-memory object)" )
  ```

- Pipeline saved in a pickle file:

  ```python
  from watson_machine_learning_client import WatsonMachineLearningAPIClient
  client = WatsonMachineLearningAPIClient( <your-credentials> )
  metadata = {
      client.repository.ModelMetaNames.NAME: "My xgboost model (.pkl)",
      client.repository.ModelMetaNames.FRAMEWORK_NAME: "xgboost",
      client.repository.ModelMetaNames.FRAMEWORK_VERSION: "0.82"
  }
  model_details = client.repository.store_model( model="model-dir", meta_props=metadata )
  ```

- Pipeline saved in a pickle file in a tar.gz file:

  ```python
  from watson_machine_learning_client import WatsonMachineLearningAPIClient
  client = WatsonMachineLearningAPIClient( <your-credentials> )
  metadata = {
      client.repository.ModelMetaNames.NAME: "My xgboost model (tar.gz)",
      client.repository.ModelMetaNames.FRAMEWORK_NAME: "xgboost",
      client.repository.ModelMetaNames.FRAMEWORK_VERSION: "0.82"
  }
  model_details = client.repository.store_model( model="tent-prediction-model.tar.gz", meta_props=metadata )
  ```
Where:
- <your-credentials> contains credentials for your Watson Machine Learning service (see: Looking up credentials)
Step 2: Deploy the stored model in your Watson Machine Learning service
The following example demonstrates deploying the stored model as a web service, which is the default deployment type:
```python
model_id = model_details["metadata"]["guid"]
model_deployment_details = client.deployments.create( artifact_uid=model_id, name="My xgboost model deployment" )
```
See: Deployments.create
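Once deployed, the model can be scored by posting input rows to the deployment's scoring endpoint. The sketch below shows how a scoring payload is shaped; the feature names and values are hypothetical, and the `get_scoring_url`/`score` calls (from the V3 Python client) are commented out because they require live service credentials:

```python
# Build a scoring payload: "fields" lists the feature names and "values"
# holds one or more input rows in the same order.
# NOTE: the feature names and values below are hypothetical examples.
scoring_payload = {
    "fields": ["GENDER", "AGE", "MARITAL_STATUS"],
    "values": [[0, 27, 1]],
}

# With a live client and the deployment details from Step 2, scoring
# would look roughly like this (commented out: needs service credentials):
# scoring_url = client.deployments.get_scoring_url( model_deployment_details )
# predictions = client.deployments.score( scoring_url, scoring_payload )

print(sorted(scoring_payload))  # ['fields', 'values']
```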
Interface option 3: Watson Machine Learning CLI
Prerequisite: Set up the CLI environment.
Step 1: Store the model in your Watson Machine Learning repository
Example command and corresponding output
```
>ibmcloud ml store <model-filename> <manifest-filename>
Starting to store ...
OK
Model store successful. Model-ID is '145bca56-134f-7e89-3c12-0d3a7859d21f'.
```
Where:
- <model-filename> is the path and name of the tar.gz file
- <manifest-filename> is the path and name of a manifest file containing metadata about the model being stored
Sample manifest file contents

```
name: My xgboost model
framework:
  name: xgboost
  version: '0.82'
```
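As a convenience, the manifest can be written from Python; the sketch below produces the sample contents above (the file name "manifest.yml" is an assumption for illustration; the CLI accepts any path as <manifest-filename>):

```python
# Write the sample manifest to a file that can be passed to
# `ibmcloud ml store` as <manifest-filename>.
# The name "manifest.yml" is just an example; any path works.
manifest = (
    "name: My xgboost model\n"
    "framework:\n"
    "  name: xgboost\n"
    "  version: '0.82'\n"
)
with open("manifest.yml", "w") as f:
    f.write(manifest)

# Read the file back to confirm its contents
with open("manifest.yml") as f:
    first_line = f.read().splitlines()[0]
print(first_line)  # name: My xgboost model
```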
See: store CLI command
Step 2: Deploy the stored model in your Watson Machine Learning service
The following example demonstrates deploying the stored model as a web service, which is the default deployment type:
Example command and corresponding output
```
>ibmcloud ml deploy <model-id> "My xgboost model deployment"
Deploying the model with MODEL-ID '145bca56-134f-7e89-3c12-0d3a7859d21f'...
DeploymentId       316a89e2-1234-6472-1390-c5432d16bf73
Scoring endpoint   https://us-south.ml.cloud.ibm.com/v3/wml_instances/5da31...
Name               My xgboost model deployment
Type               xgboost-0.82
Runtime            None Provided
Status             DEPLOY_SUCCESS
Created at         2019-01-14T19:47:51.735Z
OK
Deploy model successful
```
Where:
- <model-id> was returned in the output from the `store` command
See: deploy CLI command