When you complete Part 2 - WML Federated Learning with MNIST for Party, you should know how to:
Paste in the ID credentials you got from the end of the Part 1 notebook. If you have not run through Part 1, open the notebook and run through it first.
WML_SERVICES_HOST = 'us-south.ml.cloud.ibm.com' # or "eu-de.ml.cloud.ibm.com", "eu-gb.ml.cloud.ibm.com", "jp-tok.ml.cloud.ibm.com"
PROJECT_ID = 'XXX'
IAM_APIKEY = 'XXX'
RTS_ID = 'XXX'
TRAINING_ID = 'XXX'
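Before moving on, it can help to confirm that no placeholder was left unchanged. The following is a minimal sketch and not part of the original notebook: the helper name `find_unset_credentials` is hypothetical, and it assumes that an unset value is either empty or the literal string 'XXX' used above.

```python
def find_unset_credentials(credentials, placeholder="XXX"):
    """Return the names whose value is empty or still equals the placeholder."""
    return [name for name, value in credentials.items()
            if not value or value == placeholder]


# Hypothetical example values; in the notebook, pass the real variables instead,
# e.g. {"PROJECT_ID": PROJECT_ID, "IAM_APIKEY": IAM_APIKEY, ...}.
example = {
    "PROJECT_ID": "XXX",
    "IAM_APIKEY": "my-api-key",
    "RTS_ID": "",
    "TRAINING_ID": "1234",
}
unset = find_unset_credentials(example)
print(unset)
```

If the returned list is not empty, go back to the Part 1 notebook and copy the missing values before continuing.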
As the party, you must provide the dataset that you will use to train the Federated Learning model. This tutorial provides a default dataset, the MNIST handwritten digits dataset.
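import requests
# Download the zipped MNIST dataset and stop early if the request failed.
dataset_resp = requests.get("https://api.dataplatform.cloud.ibm.com/v2/gallery-assets/entries/903188bb984a30f38bb889102a1baae5/data",
                            allow_redirects=True)
dataset_resp.raise_for_status()
# Write the archive to the working directory; the file is closed automatically.
with open('MNIST-pkl.zip', 'wb') as f:
    f.write(dataset_resp.content)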
import zipfile
import os
with zipfile.ZipFile("MNIST-pkl.zip", "r") as file:
    file.extractall()
!ls -lh
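The listing above can also be checked programmatically. This is a minimal sketch, assuming the archive extracts `mnist-keras-train.pkl` and `mnist-keras-test.pkl` into the working directory (the file names are taken from the party configuration later in this notebook); the helper `missing_files` is hypothetical.

```python
from pathlib import Path


def missing_files(directory, expected_names):
    """Return the expected file names that are not present in directory."""
    base = Path(directory)
    return [name for name in expected_names if not (base / name).is_file()]


# Assumption: these are the files the party configuration refers to later on.
EXPECTED = ["mnist-keras-train.pkl", "mnist-keras-test.pkl"]
missing = missing_files(".", EXPECTED)
if missing:
    print("Missing dataset files:", missing)
else:
    print("All dataset files found.")
```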
In this section, we install the libraries and packages needed to use Federated Learning with the Python client.
This installs the IBM Watson Machine Learning Python client along with the complete Federated Learning software development kit.
import sys
!{sys.executable} -m pip install --upgrade 'ibm-watson-machine-learning[fl-rt22.2-py3.10]'
The following code imports the APIClient for the party, and ensures that it is loaded.
from ibm_watson_machine_learning import APIClient
wml_credentials = {
    "apikey": IAM_APIKEY,
    "url": "https://" + WML_SERVICES_HOST
}
wml_client = APIClient(wml_credentials)
wml_client.set.default_project(PROJECT_ID)
The party should run a data handler to ensure that its datasets are in a compatible and consistent format. In this tutorial, an example data handler for the MNIST dataset is provided.
For more details on data handlers, see Customizing the data handler.
This data handler is written to the local working directory of this notebook.
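import requests
# Download the example data handler and stop early if the request failed.
data_handler_content_resp = requests.get("https://github.com/IBMDataScience/sample-notebooks/raw/master/Files/mnist_keras_data_handler.py",
                                         headers={"Content-Type": "application/octet-stream"},
                                         allow_redirects=True)
data_handler_content_resp.raise_for_status()
# Save the handler next to the notebook; the file is closed automatically.
with open('mnist_keras_data_handler.py', 'wb') as f:
    f.write(data_handler_content_resp.content)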
!ls -lh
Each party must run its party configuration to call out to the aggregator. Here is an example of a party configuration.
Because you have already defined the training ID, RTS ID and data handler in the previous sections of this notebook, and the local training and protocol handler are all defined by the SDK, you only need to define the information for the dataset file under wml_client.remote_training_systems.ConfigurationMetaNames.DATA_HANDLER.
In this tutorial, the data path is already defined because the example MNIST dataset was loaded in a previous section.
import json
from pathlib import Path
working_dir = !pwd
pwd = working_dir[0]
party_config = {
    wml_client.remote_training_systems.ConfigurationMetaNames.DATA_HANDLER: {
        "info": {
            "train_file": pwd + "/mnist-keras-train.pkl",
            "test_file": pwd + "/mnist-keras-test.pkl"
        },
        "name": "MnistTFDataHandler",
        "path": pwd + "/mnist_keras_data_handler.py"
    }
}
print(json.dumps(party_config, indent=4))
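A wrong path in the configuration may only surface once the party starts training, so it can be worth checking the paths first. The sketch below is not part of the SDK: `missing_config_paths` is a hypothetical helper, and it assumes that "path" and every value under "info" are local file paths, as they are in this tutorial.

```python
import os


def missing_config_paths(data_handler_config):
    """Return (label, path) pairs from a data handler section that are not files on disk."""
    candidates = {"path": data_handler_config.get("path")}
    # Assumption: every value under "info" is a file path.
    candidates.update(data_handler_config.get("info", {}))
    return [(label, path) for label, path in candidates.items()
            if not path or not os.path.isfile(path)]


# Hypothetical section with the same shape as the DATA_HANDLER entry above.
example_section = {
    "info": {
        "train_file": "/nonexistent/mnist-keras-train.pkl",
        "test_file": "/nonexistent/mnist-keras-test.pkl",
    },
    "name": "MnistTFDataHandler",
    "path": "/nonexistent/mnist_keras_data_handler.py",
}
for label, path in missing_config_paths(example_section):
    print(f"{label}: {path} not found")
```

In the notebook, you would pass `party_config[wml_client.remote_training_systems.ConfigurationMetaNames.DATA_HANDLER]` instead of the example section; an empty result means all referenced files exist.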
Here you can finally connect to the aggregator to begin training.
party = wml_client.remote_training_systems.create_party(RTS_ID, party_config)
party.monitor_logs()
party.run(aggregator_id=TRAINING_ID, asynchronous=False)
Congratulations! You have learned to:
Copyright © 2020, 2021 IBM. This notebook and its source code are released under the terms of the MIT License.