If you have this notebook as a local copy on your platform, it may become outdated. Download the latest version of the project.

Part 2 - WML Federated Learning with XGBoost and Adult Income dataset - Party¶

Learning Goals¶

When you complete the Part 2 - WML Federated Learning with XGBoost and Adult Income dataset - Party, you should know how to:

  • Load the data that you intend to use in the Federated Learning experiment.
  • Install IBM Federated Learning libraries.
  • Define a data handler. For more details on data handlers, see Customizing the data handler.
  • Configure the party to train on its data with the aggregator.
This notebook is intended to be run by the administrator or connecting party of the Federated Learning experiment.

Table of Contents¶

  1. Input variables
  2. Download Adult Income dataset
  3. Install Federated Learning libraries
  4. Define a Data Handler
  5. Configure the party
  6. Connect and train with Federated Learning
  7. Summary
Before you run this notebook, you must have already run Part 1 - WML Federated Learning with XGBoost and Adult Income dataset - Aggregator. If you have not, open that notebook and run through it first.

1. Input variables¶

Paste Variables From Admin Notebook¶

Paste in the ID credentials you got from the end of the Part 1 notebook. If you have not run through Part 1, open the notebook and run through it first.

In [ ]:
WML_SERVICES_HOST = 'us-south.ml.cloud.ibm.com' # or "eu-de.ml.cloud.ibm.com", "eu-gb.ml.cloud.ibm.com", "jp-tok.ml.cloud.ibm.com"
PROJECT_ID = 'XXX'
IAM_APIKEY = 'XXX'
RTS_ID = 'XXX'
TRAINING_ID = 'XXX'
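
Every later cell depends on these values, so it can help to confirm that none of them still holds the 'XXX' placeholder before continuing. The following is a minimal sketch; the helper name find_unset is hypothetical and not part of the SDK, and the dictionary below uses made-up values for illustration only.

```python
def find_unset(variables, placeholder="XXX"):
    """Return the names whose values are empty or still equal to the placeholder."""
    return [name for name, value in variables.items()
            if not value or value == placeholder]


# Illustrative values only; in the notebook, pass the variables defined above.
example = {
    "PROJECT_ID": "XXX",
    "IAM_APIKEY": "my-api-key",
    "RTS_ID": "",
    "TRAINING_ID": "1234",
}
print(find_unset(example))
```

In the notebook, you could call find_unset({"PROJECT_ID": PROJECT_ID, "IAM_APIKEY": IAM_APIKEY, "RTS_ID": RTS_ID, "TRAINING_ID": TRAINING_ID}) and stop if the returned list is not empty.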

2. Download Adult Income dataset¶

As the party, you must provide the dataset that you will use to train the Federated Learning model. This tutorial uses the Adult Income dataset, which the following cell downloads to the local working directory as adult.csv.

In [ ]:
import requests

dataset_resp = requests.get("https://api.dataplatform.cloud.ibm.com/v2/gallery-assets/entries/5fcc01b02d8f0e50af8972dc8963f98e/data",
                            allow_redirects=True)

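dataset_resp.raise_for_status()  # stop early if the download failed

# Write the downloaded bytes to the local working directory
with open('adult.csv', 'wb') as f:
    f.write(dataset_resp.content)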

!ls -lh
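
The download cell writes whatever bytes the server returned, so it can be worth confirming that the file actually looks like CSV before training on it. The sketch below uses only the Python standard library; the helper name preview_csv is hypothetical, and it assumes the file is UTF-8 encoded text.

```python
import csv
from itertools import islice


def preview_csv(path, n_rows=5):
    """Return the first n_rows rows of a CSV file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(islice(csv.reader(handle), n_rows))
```

In the notebook, preview_csv('adult.csv') should return a handful of rows that each contain several comma-separated fields. A single long field, or text that reads like an error message, suggests the download did not return the dataset.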

3. Install Federated Learning libraries¶

In this section, you install the libraries and packages needed to run Federated Learning with the Python client.

3.1 Install the IBM WML SDK with FL¶

This installs the ibm-watsonx-ai Python SDK together with its Federated Learning extras (fl-rt23.1-py3.10 in the command below).

In [ ]:
import sys
!{sys.executable} -m pip install --upgrade 'ibm-watsonx-ai[fl-rt23.1-py3.10]'

3.2 Import the IBM Watson Machine Learning client¶

The following code imports the API client, creates it with your API key and service URL, and sets the default project.

In [ ]:
from ibm_watsonx_ai import APIClient

wml_credentials = {
    "apikey": IAM_APIKEY,
    "url": "https://" + WML_SERVICES_HOST
}

wml_client = APIClient(wml_credentials)
wml_client.set.default_project(PROJECT_ID)

4. Define a Data Handler¶

The party should run a data handler to ensure that its dataset is in a compatible and consistent format. In this tutorial, an example data handler for the Adult Income dataset is provided.

For more details on data handlers, see Customizing the data handler.

This data handler is written to the local working directory of this notebook.

In [ ]:
datahandler_resp = requests.get("https://raw.githubusercontent.com/IBMDataScience/sample-notebooks/master/Files/adult_sklearn_data_handler.py",
                                allow_redirects=True)

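datahandler_resp.raise_for_status()  # stop early if the download failed

# Save the data handler next to the notebook so the party can load it by path
with open('adult_sklearn_data_handler.py', 'wb') as f:
    f.write(datahandler_resp.content)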



!ls -lh
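
The party configuration in the next section refers to the data handler by file path and by class name (AdultSklearnDataHandler), so it can be useful to confirm that the downloaded file parses as Python and defines that class. The sketch below does this with the standard ast module, without importing or executing the file; the helper name defined_classes is hypothetical, and the sample string is for illustration only.

```python
import ast


def defined_classes(source):
    """Return the names of the top-level classes defined in Python source code."""
    tree = ast.parse(source)  # raises SyntaxError if the text is not valid Python
    return [node.name for node in tree.body if isinstance(node, ast.ClassDef)]


# Illustrative source only; in the notebook, read adult_sklearn_data_handler.py instead.
sample = "class ExampleHandler:\n    pass\n"
print(defined_classes(sample))
```

In the notebook, you could read adult_sklearn_data_handler.py into a string, pass it to defined_classes, and check that "AdultSklearnDataHandler" is in the result.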

5. Configure the party¶

Each party must run their party configuration file to call out to the aggregator. Here is an example of a party configuration.

Because you have already defined the training ID, RTS ID, and data handler in the previous sections of this notebook, and the local training and protocol handler are defined by the SDK, you only need to define the information for the dataset file under wml_client.remote_training_systems.ConfigurationMetaNames.DATA_HANDLER.

In this tutorial, the data path is already defined because the example Adult Income dataset was downloaded in a previous section.

In [ ]:
import json

from pathlib import Path
working_dir = !pwd
pwd = working_dir[0]

party_config = {
    wml_client.remote_training_systems.ConfigurationMetaNames.DATA_HANDLER: {
        "info": {
            "txt_file": "./adult.csv"
        },
        "name": "AdultSklearnDataHandler",
        "path": "./adult_sklearn_data_handler.py"
    }
}

print(json.dumps(party_config, indent=4))
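
Both paths in this configuration are relative to the notebook's working directory, so a quick existence check before creating the party can catch a missing or misnamed file. The following is a minimal sketch; the helper name missing_files is hypothetical, and it assumes the data handler entry has the same "info" / "txt_file" and "path" keys as the configuration above.

```python
from pathlib import Path


def missing_files(handler_config):
    """Return the paths named in a data handler config that are not existing files."""
    paths = [handler_config["path"], handler_config["info"]["txt_file"]]
    return [p for p in paths if not Path(p).is_file()]


# Same shape as the data handler entry above, repeated here so the sketch is self-contained.
example = {
    "info": {"txt_file": "./adult.csv"},
    "name": "AdultSklearnDataHandler",
    "path": "./adult_sklearn_data_handler.py",
}
print(missing_files(example))  # an empty list means both files were found
```

In the notebook, you could pass party_config[wml_client.remote_training_systems.ConfigurationMetaNames.DATA_HANDLER] to missing_files instead of the example dictionary.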

6. Connect and train with Federated Learning¶

Here you can finally connect to the aggregator to begin training.

6.1 Create the party¶

In [ ]:
party = wml_client.remote_training_systems.create_party(RTS_ID, party_config)
party.monitor_logs()

6.2 Connect to the aggregator and start training¶

In [ ]:
party.run(aggregator_id=TRAINING_ID, asynchronous=False)

Summary¶

Congratulations! Across Part 1 and Part 2, you have learned to:

  1. Start a Federated Learning experiment
  2. Load a template model
  3. Create an RTS and launch the experiment job
  4. Load a dataset for training
  5. Define the data handler
  6. Configure the party
  7. Connect to the aggregator
  8. Train your Federated Learning model

Learn more¶

  • For more details about setting up Federated Learning, terminology, and running Federated Learning from the UI, see Federated Learning documentation for Cloud.
  • For more information on a Keras model template, see their documentation here.


Copyright © 2020-2024 IBM. This notebook and its source code are released under the terms of the MIT License.

