If you have this notebook as a local copy on your platform, it may become outdated. Download the latest version of the project.

Part 1 - WML Federated Learning with XGBoost and Adult Income dataset - Aggregator¶

With IBM Federated Learning, you can combine data from multiple sources to train a model on the collective data without actually sharing the data. This allows enterprises to train models with other companies without dedicating resources to securing shared data. Another advantage is that the remote data does not have to be centralized in one location, which eliminates the need to move potentially large datasets. This notebook demonstrates how to start Federated Learning with the Python client. For more details on setting up Federated Learning, terminology, and running Federated Learning from the UI, see the Federated Learning documentation.

Learning Goals¶

When you complete the Part 1 - WML Federated Learning with XGBoost and Adult Income dataset - Aggregator notebook, you should know how to:

  • Create a Remote Training System
  • Start a training job

Once you complete this notebook, please open Part 2 - WML Federated Learning with XGBoost and Adult Income dataset - Party.

This notebook is intended to be run by the administrator of the Federated Learning experiment.

Table of Contents¶

  • 1. Prerequisites
    • 1.1 Define variables
    • 1.2 Define tags
    • 1.3 Import libraries
  • 2. Obtain Cloud authentication token
  • 3. Create Remote Training System Asset
  • 4. Create FL Training Job
    • 4.1 Get Training Job Status
  • 5. Get Variables And Paste Into Party Notebook
  • 6. Save Trained Model To Project
    • 6.1 Connection to COS
    • 6.2 Install pre-req
    • 6.3 Save model to project
  • 7. Clean Up Project
    • 7.1 List all training jobs
    • 7.2 Delete training jobs
    • 7.3 List all Remote Training Systems
    • 7.4 Delete Remote Training Systems

1. Prerequisites¶

Before you proceed, you need to have:

  • An IAM API Key. To create a new one, go to the IBM Cloud homepage. In your account, go to Manage > IAM > API Keys. Click Create an IBM Cloud API Key.

1.1 Define variables¶

In [ ]:
API_VERSION = "2021-10-01"

WML_SERVICES_HOST = "us-south.ml.cloud.ibm.com" # or "eu-de.ml.cloud.ibm.com", "eu-gb.ml.cloud.ibm.com", "jp-tok.ml.cloud.ibm.com"

WML_SERVICES_URL = "https://" + WML_SERVICES_HOST
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/oidc/token"
 
IAM_APIKEY = "XXX"  

# Get this from Manage > IAM > Users, and check the URL. Your user ID should be in the format IBMid-<xxx>.
CLOUD_USERID = "IBMid-XXX" 

PROJECT_ID = "XXX" # Get this by going into your WS project and checking the URL.

1.2 Define tags¶

These tags are used to identify the assets created by this notebook.

In [ ]:
RTS_TAG = "wmlflxgbsamplerts"
TRAINING_TAG = "wmlflxgbsampletraining"

1.3 Import libraries¶

In [ ]:
import urllib3
import requests
import json
from string import Template

urllib3.disable_warnings()

2. Obtain Cloud authentication token¶

In [ ]:
payload = "grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey=" + IAM_APIKEY
token_resp = requests.post(IAM_TOKEN_URL ,
                          headers={"Content-Type": "application/x-www-form-urlencoded"}, 
                          data = payload,
                          verify=True)

print(token_resp)

token = "Bearer " + json.loads(token_resp.content.decode("utf-8"))["access_token"]
print("WS token: %s " % token)
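The cell above indexes straight into the response body, so a rejected API key shows up as a bare KeyError rather than a readable message. A minimal sketch of a more defensive parse follows. The helper name bearer_from_iam_response is hypothetical, and it assumes only what the cell above already relies on: that a successful response is JSON containing an access_token field.

```python
import json


def bearer_from_iam_response(status_code, body_text):
    """Return an Authorization header value from an IAM token response.

    Hypothetical helper: assumes a successful response is JSON containing
    "access_token", which is the only field the notebook itself reads.
    """
    if status_code != 200:
        raise RuntimeError("IAM token request failed with HTTP %d: %s"
                           % (status_code, body_text))
    body = json.loads(body_text)
    if "access_token" not in body:
        raise RuntimeError("IAM response has no access_token field")
    return "Bearer " + body["access_token"]


# Example with a canned response body; no network call is made.
sample = json.dumps({"access_token": "abc123", "token_type": "Bearer"})
example_token = bearer_from_iam_response(200, sample)
print(example_token)
```

In the notebook, the same helper could be called with token_resp.status_code and token_resp.text in place of the canned values.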

3. Create Remote Training System Asset¶

Now you will create a Remote Training System (RTS). An RTS handles the calls that your parties make to the aggregator to run the training.

  • allowed_identities lists the users permitted to connect to the Federated Learning experiment. In this tutorial, only your user ID is permitted to connect, but you can update the template and add more users as required.
  • remote_admin identifies the Admin. The template for the Admin is the same as for a user. In this tutorial, the Admin uses the same user ID as the allowed user; in general, however, the Admin does not have to be one of the users.

In [ ]:
wml_remote_training_system_asset_one_def = Template("""
{
  "name": "Remote Party 1",
  "project_id": "$projectId",
  "description": "Sample Remote Training System",
  "tags": [ "$tag" ],
  "organization": {
    "name": "IBM",
    "region": "US"
  },
  "allowed_identities": [
    {
      "id": "$userID",
      "type": "user"
    }
  ],
  "remote_admin": {
    "id": "$userID",
    "type": "user"
  }
}
""").substitute(userID = CLOUD_USERID,
                projectId = PROJECT_ID,
                tag = RTS_TAG)


wml_remote_training_system_one_resp = requests.post(WML_SERVICES_URL + "/ml/v4/remote_training_systems", 
                                                    headers={"Content-Type": "application/json",
                                                             "Authorization": token}, 
                                                    params={"version": API_VERSION,
                                                            "project_id": PROJECT_ID}, 
                                                    data=wml_remote_training_system_asset_one_def, 
                                                    verify=False)

print(wml_remote_training_system_one_resp)
status_json = json.loads(wml_remote_training_system_one_resp.content.decode("utf-8"))
print("Create remote training system response : "+ json.dumps(status_json, indent=4))

wml_remote_training_system_one_asset_uid = json.loads(wml_remote_training_system_one_resp.content.decode("utf-8"))["metadata"]["id"]
print("Remote Training System id: %s" % wml_remote_training_system_one_asset_uid)
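If you need to permit more than one user, editing the string template by hand becomes error-prone. The sketch below builds the same request body from a Python dictionary, so that adding users is a change to a list. The field names are copied from the template in the cell above; the function build_rts_definition and the example user IDs are hypothetical and are not part of any IBM client library.

```python
import json


def build_rts_definition(project_id, tag, user_ids, admin_id, name="Remote Party 1"):
    """Build the Remote Training System request body as a JSON string.

    Field names mirror the template used in this notebook; the helper
    itself is a hypothetical convenience.
    """
    if not user_ids:
        raise ValueError("at least one allowed user ID is required")
    definition = {
        "name": name,
        "project_id": project_id,
        "description": "Sample Remote Training System",
        "tags": [tag],
        "organization": {"name": "IBM", "region": "US"},
        # One entry per user permitted to connect as a party.
        "allowed_identities": [{"id": uid, "type": "user"} for uid in user_ids],
        # The admin may or may not also appear in allowed_identities.
        "remote_admin": {"id": admin_id, "type": "user"},
    }
    return json.dumps(definition, indent=2)


# Placeholder IDs for illustration only.
rts_body = build_rts_definition(
    project_id="XXX",
    tag="wmlflxgbsamplerts",
    user_ids=["IBMid-AAA", "IBMid-BBB"],
    admin_id="IBMid-AAA",
)
print(rts_body)
```

The resulting string can be passed as the data argument of the POST request in place of the templated string. Serializing with json.dumps also escapes any quotes in names or descriptions, which plain string substitution does not.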

4. Create FL Training Job¶

In this section, you will launch the Federated Learning experiment.

In [ ]:
training_payload = Template(""" 
{
  "name": "FL Aggregator",
  "tags": [ "$tag" ],
  "federated_learning": {
    "fusion_type": "xgb_classifier",
    "learning_rate": 0.1,
    "loss": "binary_crossentropy",
    "max_bins": 255,
    "rounds": 3,
    "num_classes": 2,
    "metrics": "loss",
    "remote_training" : {
      "quorum": 1.0,
      "remote_training_systems": [ { "id" : "$rts_one", "required" : true  } ]
    },
    "software_spec": {
      "name": "runtime-23.1-py3.10"
    },
    "hardware_spec": {
      "name": "XS"
    }
  },
  "training_data_references": [],
  "results_reference": {
    "type": "container",
    "name": "outputData",
    "connection": {},
    "location": {
      "path": "."
    }
  },
  "project_id": "$projectId"  
}
""").substitute(projectId = PROJECT_ID,
                rts_one = wml_remote_training_system_one_asset_uid,
                tag = TRAINING_TAG)

create_training_resp = requests.post(WML_SERVICES_URL + "/ml/v4/trainings", params={"version": API_VERSION},
                                     headers={"Content-Type": "application/json",
                                              "Authorization": token},
                                     data=training_payload,
                                     verify=False)

print(create_training_resp)
status_json = json.loads(create_training_resp.content.decode("utf-8"))
print("Create training response : "+ json.dumps(status_json, indent=4))

training_id = json.loads(create_training_resp.content.decode("utf-8"))["metadata"]["id"]
print("Training id: %s" % training_id)
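Both create cells in this notebook read ["metadata"]["id"] directly from the response, which raises a KeyError if the request was rejected. The following sketch shows one way to fail with a clearer message. The helper asset_id_from_response is hypothetical; it assumes only that a successful create response carries metadata.id, as the cells above already do, and it makes no assumption about the layout of an error body.

```python
def asset_id_from_response(status_code, body):
    """Return metadata.id from a parsed create response, or raise.

    Hypothetical helper. `body` is the already-parsed JSON (a dict).
    """
    if status_code >= 400:
        raise RuntimeError("Request failed with HTTP %d: %s" % (status_code, body))
    try:
        return body["metadata"]["id"]
    except (KeyError, TypeError):
        raise RuntimeError("Response has no metadata.id: %s" % (body,)) from None


# Canned responses for illustration; the ID value is a placeholder.
ok_body = {"metadata": {"id": "example-id-123"}}
example_id = asset_id_from_response(201, ok_body)
print("Training id: %s" % example_id)
```

In the notebook, status_code would come from create_training_resp.status_code and body from the status_json variable.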

4.1 Get Training Job Status¶

Before you run the following code, make sure that your project is associated with a Watson Machine Learning service. For more details on associating services, see Associating services.

In [ ]:
get_training_resp = requests.get(WML_SERVICES_URL + "/ml/v4/trainings/" + training_id,
                                 headers={"Content-Type": "application/json",
                                          "Authorization": token},
                                  params={"version": API_VERSION,
                                          "project_id": PROJECT_ID},
                                  verify=False)

print(get_training_resp)
status_json = json.loads(get_training_resp.content.decode("utf-8"))
print("Get training response : "+ json.dumps(status_json, indent=4))
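The cell above fetches the status once. Because the saved model is only available once training has completed, and training depends on the party from Part 2 connecting, it can be convenient to poll instead. The sketch below is a minimal polling loop under stated assumptions: the state is assumed to live at entity.status.state in the response, and the set of terminal state names is also an assumption, so compare both against the JSON printed by the cell above before relying on them. The function wait_for_training is hypothetical.

```python
import time

# Assumed terminal states; check them against your own status output.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_training(fetch_status, interval_seconds=30, max_polls=20, sleep=time.sleep):
    """Poll until the training reaches a terminal state or polls run out.

    fetch_status is any zero-argument callable returning the parsed JSON
    of a status request. Returns the last state seen (possibly None).
    """
    state = None
    for attempt in range(max_polls):
        body = fetch_status()
        # Assumed response layout: {"entity": {"status": {"state": ...}}}
        state = body.get("entity", {}).get("status", {}).get("state")
        print("Poll %d: state=%s" % (attempt + 1, state))
        if state in TERMINAL_STATES:
            return state
        sleep(interval_seconds)
    return state


# Canned responses stand in for the real GET request.
canned = iter([
    {"entity": {"status": {"state": "running"}}},
    {"entity": {"status": {"state": "completed"}}},
])
final_state = wait_for_training(lambda: next(canned), sleep=lambda seconds: None)
print(final_state)
```

In the notebook, fetch_status would wrap the requests.get call from the cell above and return its parsed JSON. Passing the fetch function in as an argument keeps the loop testable without a network connection.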

5. Get Variables And Paste Into Party Notebook¶

Run the following cell and copy the output.

In [ ]:
print("WML_SERVICES_HOST = '%s'" % WML_SERVICES_HOST)
print("PROJECT_ID = '%s'" % PROJECT_ID)
print("IAM_APIKEY = '%s'" % IAM_APIKEY)
print("RTS_ID = '%s'" % wml_remote_training_system_one_asset_uid)
print("TRAINING_ID = '%s'" % (training_id))

As the Admin, you have now launched a Federated Learning experiment. Copy the output from the previous cell. Open Part 2 - WML Federated Learning with XGBoost and Adult Income dataset - Party and paste the output into the first code cell. Run the Part 2 - Party notebook to the end.

6. Save Trained Model To Project¶

Once training has completed, run the cells below to save the trained model into your project.

6.1 Connection to COS¶

This information is located in your Watson Studio project, under the Manage tab, on the General page.

  1. The bucket name is listed inside the Storage pane.
  2. To obtain the credentials, click the Manage in IBM Cloud link located inside the Storage pane. From your COS instance, click Service Credentials. You can use an existing credential or create a new one if needed.
    • COS_APIKEY - the "apikey" from your credentials
    • COS_RESOURCE_INSTANCE_ID - the "resource_instance_id" from your credentials
  3. The COS endpoints are listed in your COS instance under Endpoints.

In [ ]:
BUCKET = "" # bucket used by project ex. myproject-donotdelete-pr-tdnvueqivxep8v. Go to your project > Manage and check the bucket name under Cloud storage.

COS_ENDPOINT = "https://s3.us.cloud-object-storage.appdomain.cloud" # Current list available at https://control.cloud-object-storage.cloud.ibm.com/v2/endpoints

# Find these in cloud.ibm.com > Storage > Credentials > <Your COS bucket> 
COS_APIKEY = "" # eg "W00YixxxxxxxxxxMB-odB-2ySfTrFBIQQWanc--P3byk" 
COS_RESOURCE_INSTANCE_ID = "" # eg "crn:v1:bluemix:public:cloud-object-storage:global:a/3bf0d9003xxxxxxxxxx1c3e97696b71c:d6f04d83-6c4f-4a62-a165-696756d63903::"

6.2 Install pre-req¶

In [ ]:
!pip install ibm-cos-sdk

6.3 Save model to project¶

In [ ]:
import ibm_boto3
from ibm_botocore.client import Config, ClientError

cos = ibm_boto3.resource("s3",
    ibm_api_key_id=COS_APIKEY,
    ibm_service_instance_id=COS_RESOURCE_INSTANCE_ID,
    config=Config(signature_version="oauth"),
    endpoint_url=COS_ENDPOINT
)

ITEM_NAME = training_id + "/assets/" + training_id + "/resources/wml_model/request.json"

file = cos.Object(BUCKET, ITEM_NAME).get()
req = json.loads(file["Body"].read())


req["name"] = "Trained Adult Income Model"

model_save_payload = json.dumps(req)
print("Model save payload: %s" % model_save_payload)

In [ ]:
model_save_resp = requests.post(WML_SERVICES_URL + "/ml/v4/models",
                                params={"version": API_VERSION,
                                        "project_id": PROJECT_ID,
                                        "content_format": "native"},
                                headers={"Content-Type": "application/json",
                                         "Authorization": token},
                                data=model_save_payload,
                                verify=False)

print(model_save_resp)
status_json = json.loads(model_save_resp.content.decode("utf-8"))
print("Save model response : "+ json.dumps(status_json, indent=4))

model_id = json.loads(model_save_resp.content.decode("utf-8"))["metadata"]["id"]
print("Saved model id: %s" % model_id)

7. Clean Up Project¶

Use this section to delete the training jobs and assets created by this notebook.

7.1 List all training jobs in project¶

In [ ]:
get_training_resp = requests.get(WML_SERVICES_URL + "/ml/v4/trainings",
                                 headers={"Content-Type": "application/json",
                                          "Authorization": token},
                                 params={"version": API_VERSION,
                                         "project_id": PROJECT_ID},
                                 verify=False)

print(get_training_resp)
status_json = json.loads(get_training_resp.content.decode("utf-8"))
print("Get training response : "+ json.dumps(status_json, indent=4))

7.2 Delete all training jobs in this project created by this notebook¶

This will stop all running aggregators created using this notebook.

In [ ]:
get_training_resp = requests.get(WML_SERVICES_URL + "/ml/v4/trainings",
                                 headers={"Content-Type": "application/json",
                                          "Authorization": token},
                                 params={"version": API_VERSION,
                                         "project_id": PROJECT_ID,
                                         "tag.value": TRAINING_TAG},
                                 verify=False)

training_list_json = json.loads(get_training_resp.content.decode("utf-8"))
training_resources=training_list_json["resources"]

for training in training_resources:
    training_id = training["metadata"]["id"]
    print("Deleting Training ID: " + training_id)
    delete_training_resp = requests.delete(WML_SERVICES_URL + "/ml/v4/trainings/" + training_id,
                                           headers={"Content-Type": "application/json",
                                                    "Authorization": token},
                                           params={"version": API_VERSION,
                                                   "project_id": PROJECT_ID,
                                                   "hard_delete": True},
                                           verify=False)
    print(delete_training_resp)

7.3 List all remote training systems in project¶

In [ ]:
get_rts_resp = requests.get(WML_SERVICES_URL + "/ml/v4/remote_training_systems", 
                            headers={"Content-Type": "application/json",
                                     "Authorization": token}, 
                            params={"version": API_VERSION,
                                    "project_id": PROJECT_ID}, 
                            verify=False)

print(get_rts_resp)
rts_list_json = json.loads(get_rts_resp.content.decode("utf-8"))
print("Remote Training Systems in Project : "+ json.dumps(rts_list_json, indent=4))

7.4 Delete all remote training systems in this project created by this notebook¶

In [ ]:
get_rts_resp = requests.get(WML_SERVICES_URL + "/ml/v4/remote_training_systems", 
                            headers={"Content-Type": "application/json",
                                     "Authorization": token}, 
                            params={"version": API_VERSION,
                                    "project_id": PROJECT_ID,
                                    "tag.value": RTS_TAG}, 
                            verify=False)

rts_list_json = json.loads(get_rts_resp.content.decode("utf-8"))
rts_resources=rts_list_json["resources"]

for rts in rts_resources:
    rts_id = rts["metadata"]["id"]
    print("Deleting RTS ID: " + rts_id)
    delete_rts_resp = requests.delete(WML_SERVICES_URL + "/ml/v4/remote_training_systems/" + rts_id, 
                                      headers={"Content-Type": "application/json",
                                               "Authorization": token}, 
                                      params={"version": API_VERSION,
                                              "project_id": PROJECT_ID}, 
                                      verify=False)
    print(delete_rts_resp)


Copyright © 2020-2024 IBM. This notebook and its source code are released under the terms of the MIT License.


