Additional details for Federated Learning implementation

This document provides more details on aspects of Federated Learning configuration that require extensive code implementation.

Anaconda configuration

When you work with Federated Learning from the Python client, sometimes you have to create a new Conda environment. The following commands set up an independent Conda environment.

# Create env
conda create -n fl_env python=3.7.9

# Install Jupyter in env
conda install -c conda-forge -n fl_env notebook

# Activate env
conda activate fl_env

# Pip install SDK
pip install ibm-watson-machine-learning

# Pip install packages needed by IBMFL
pip install environs parse websockets jsonpickle pandas pytest pyYAML requests pathlib2 psutil setproctitle tabulate lz4 opencv-python gym ray==0.8.0 cloudpickle==1.3.0 image

# Pip install frameworks used by IBMFL
pip install tensorflow==2.1.0 scikit-learn==0.23.1 keras==2.2.4 numpy==1.17.4 scipy==1.4.1

# Launch Jupyter notebooks
jupyter notebook  # <-- make sure you've already activated the env: conda activate fl_env
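
After you activate the environment, you can optionally confirm that the pinned framework versions are in place. This quick check is a convenience, not part of the required setup:

# Run inside fl_env to confirm the pinned versions installed correctly
import numpy
import sklearn
import tensorflow

print(tensorflow.__version__)  # expected: 2.1.0
print(sklearn.__version__)     # expected: 0.23.1
print(numpy.__version__)       # expected: 1.17.4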

Data handler examples

Returning a data generator defined by Keras or TensorFlow 2

The following code example needs to be included as part of get_data to return data in the form of a data generator defined by Keras or TensorFlow 2:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Inside your data handler's get_data method:
train_gen = ImageDataGenerator(rotation_range=8,
                               width_shift_range=0.08,
                               shear_range=0.3,
                               height_shift_range=0.08,
                               zoom_range=0.08)

# x_train and y_train are the local training arrays loaded by the data handler
train_datagenerator = train_gen.flow(
    x_train, y_train, batch_size=64)

return train_datagenerator
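
As a usage sketch only: a compiled Keras model (here assumed to be named model) can consume the returned generator directly in fit; the epochs value is a placeholder:

# Illustrative consumption of the generator; `model` is assumed to be
# a compiled Keras model, and 64 matches the batch size used above.
model.fit(train_datagenerator,
          steps_per_epoch=len(x_train) // 64,
          epochs=5)  # placeholder value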

Returning data as numpy arrays

The following is a code example of the MNIST data handler, which returns the data as numpy arrays.

import numpy as np

# imports from ibmfl
from ibmfl.data.data_handler import DataHandler
from ibmfl.exceptions import FLException



class MnistKerasDataHandler(DataHandler):
    """
    Data handler for MNIST dataset.
    """

    def __init__(self, data_config=None, channels_first=False):
        super().__init__()
        self.file_name = None
        # `data_config` loads anything inside the `info` part of the `data` section. 
        if data_config is not None:
            # this example assumes the local dataset is in .npz format, so it searches for it.
            if 'npz_file' in data_config: 
                self.file_name = data_config['npz_file']
        self.channels_first = channels_first
        
        if self.file_name is None:
            raise FLException('No data file name is provided to load the dataset.')
        else:
            try:
                data_train = np.load(self.file_name)
                self.x_train = data_train['x_train']
                self.y_train = data_train['y_train']
                self.x_test = data_train['x_test']
                self.y_test = data_train['y_test']
            except Exception:
                raise IOError('Unable to load training data from path '
                              'provided in config file: ' +
                              self.file_name)
            self.preprocess_data()

    def get_data(self):
        """
        Gets pre-processed mnist training and testing data. 

        :return: training and testing data
        :rtype: `tuple`
        """
        return (self.x_train, self.y_train), (self.x_test, self.y_test)

    def preprocess_data(self):
        """
        Preprocesses the training and testing dataset.

        :return: None
        """
        num_classes = 10
        img_rows, img_cols = 28, 28
        if self.channels_first:
            self.x_train = self.x_train.reshape(self.x_train.shape[0], 1, img_rows, img_cols)
            self.x_test = self.x_test.reshape(self.x_test.shape[0], 1, img_rows, img_cols)
        else:
            self.x_train = self.x_train.reshape(self.x_train.shape[0], img_rows, img_cols, 1)
            self.x_test = self.x_test.reshape(self.x_test.shape[0], img_rows, img_cols, 1)

        print('x_train shape:', self.x_train.shape)
        print(self.x_train.shape[0], 'train samples')
        print(self.x_test.shape[0], 'test samples')

        # convert class vectors to binary class matrices
        self.y_train = np.eye(num_classes)[self.y_train]
        self.y_test = np.eye(num_classes)[self.y_test]
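
For reference, the handler can be exercised locally as follows. The .npz path is a placeholder for your own MNIST file:

# `npz_file` is the key that the handler looks up in `data_config`
data_config = {'npz_file': '/path/to/mnist.npz'}  # placeholder path
data_handler = MnistKerasDataHandler(data_config=data_config)
(x_train, y_train), (x_test, y_test) = data_handler.get_data()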

Hyperparameter configuration

Based on the fusion method and framework that you select for your Federated Learning model, the available hyperparameter options differ.

The following table shows the hyperparameter options available for the TensorFlow 2 framework:

Framework | Hyperparameter | Definition | Notes
TensorFlow 2 | Epochs | The total number of passes over the local training dataset to train a TensorFlow model. |
TensorFlow 2 | Batch size | When you use batch processing, specifies how many samples to process at a time. | Useful for processing large datasets.
TensorFlow 2 | Steps per epoch | Determines the number of batches to train in a single epoch; it marks the finish of one epoch and the start of the next. | Used for large datasets to improve the accuracy of the model. When you pass an infinitely repeating dataset, you must specify the steps_per_epoch argument.

The following table shows the hyperparameter options available for the fusion methods:

Fusion method | Hyperparameter | Definition | Notes
Iterative average, FedAvg | Rounds | The number of training iterations to complete between the aggregator and the remote systems. |
Scikit-learn (XGBoost Classification) | Learning rate | The learning rate, also known as shrinkage; used as a multiplicative factor for the leaf values. |
Scikit-learn (XGBoost Classification) | Loss | The loss function to use in the boosting process: binary_crossentropy (also known as logistic loss) for binary classification, categorical_crossentropy for multiclass classification, auto to choose either loss function automatically based on the nature of the problem, or least_squares for regression. |
Scikit-learn (XGBoost Classification) | Rounds | The number of training iterations to complete between the aggregator and the remote systems. |
Scikit-learn (XGBoost Classification) | Number of classes | The number of target classes for the classification model. | Required if the Loss hyperparameter is auto, binary_crossentropy, or categorical_crossentropy.
Scikit-learn (XGBoost Regression) | Learning rate | The learning rate, also known as shrinkage; used as a multiplicative factor for the leaf values. |
Scikit-learn (XGBoost Regression) | Loss | The loss function to use in the boosting process: binary_crossentropy (also known as logistic loss) for binary classification, categorical_crossentropy for multiclass classification, auto to choose either loss function automatically based on the nature of the problem, or least_squares for regression. |
Scikit-learn (XGBoost Regression) | Rounds | The number of training iterations to complete between the aggregator and the remote systems. |
Scikit-learn (KMeans/SPAHM) | Max iter | The total number of passes over the local training dataset to train a Scikit-learn model. |
Scikit-learn (KMeans/SPAHM) | N cluster | The number of clusters to form, as well as the number of centroids to generate. |
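
To make the grouping concrete, the following sketch collects these values in a Python dictionary. The names follow the tables above, but the nesting is illustrative and is not a documented schema:

# Illustrative grouping only; consult your configuration for the exact schema
hyperparams = {
    'global': {
        'rounds': 5               # fusion-level setting
    },
    'local': {
        'training': {
            'epochs': 3,          # passes over the local dataset
            'batch_size': 64,
            'steps_per_epoch': 100
        }
    }
}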

Remote server requirements

Federated Learning can connect to various remote servers, such as Db2 or other Cloud Pak for Data clusters. To connect to Federated Learning, a remote server must be able to:

  • Run Python 3.7.9 or higher
  • Use the Python client (see the authentication documentation for the Watson Machine Learning Python client); a minimal sketch follows this list
  • Have web connectivity
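
A minimal authentication sketch for the Python client follows; the URL and API key are placeholders that you must replace with your own values:

from ibm_watson_machine_learning import APIClient

# Placeholder credentials; substitute your own endpoint and API key
wml_credentials = {
    'url': 'https://us-south.ml.cloud.ibm.com',
    'apikey': '<your-api-key>'
}
client = APIClient(wml_credentials)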

Scikit-learn model configuration

If you choose Scikit-learn (SKLearn) as the model framework, you need to configure your settings to save the model that is trained in Federated Learning as a pickle file. The following code examples show the methods to implement as part of your model file, depending on the model type that you select for SKLearn.

XGBoost classification

# XGBoost classification model.
# You can choose your own loss function by changing the value of 'loss'.
# The following example uses 'binary_crossentropy' for a binary
# classification problem. To train a multiclass classification problem,
# choose 'categorical_crossentropy'. You can also choose 'auto' to let
# IBM FL choose the correct loss for you.

spec = {
    'global': {
        'learning_rate': 0.1,
        'loss': 'binary_crossentropy',
        'max_bins': 255,
        'max_depth': None,
        'max_iter': 100,
        'verbose': True,
        'num_classes': 2
    }
}

XGBoost regression

# XGBoost regression model.
# For a regression problem, set 'loss' to 'least_squares'.

spec = {
    'global': {
        'learning_rate': 0.1,
        'loss': 'least_squares',
        'max_bins': 255,
        'max_depth': None,
        'max_iter': 100,
        'verbose': True
    }
}

SKLearn classification

import os

import joblib
import numpy as np
from sklearn.linear_model import SGDClassifier

# SKLearn classification
# Specify your model. You need to provide the classes used in
# classification problems. In this example, there are 10 classes.
model = SGDClassifier(loss='log', penalty='l2')
model.classes_ = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# Define the path and save the model as a pickle file.
# `folder_configs` is the directory where model artifacts are saved.
if not os.path.exists(folder_configs):
    os.makedirs(folder_configs)
fname = os.path.join(folder_configs, 'model_architecture.pickle')
with open(fname, 'wb') as f:
    joblib.dump(model, f)

# Generate the model spec
spec = {'model_definition': fname}
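
As an optional check, you can confirm that the saved file loads back into a model object:

# Optional: verify that the pickle file can be re-loaded
restored_model = joblib.load(fname)
print(type(restored_model))  # SGDClassifier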

SKLearn regression

import os
import pickle

from sklearn.linear_model import SGDRegressor

# SKLearn regression: create a sklearn regression model
model = SGDRegressor(loss='huber', penalty='l2')

# Specify or create a directory where you want to save the model file.
# `folder_configs` is the directory where model artifacts are saved.
if not os.path.exists(folder_configs):
    os.makedirs(folder_configs)
fname = os.path.join(folder_configs, 'model_architecture.pickle')

# Save the model as a pickle file
with open(fname, 'wb') as f:
    pickle.dump(model, f)

SKLearn Kmeans

import os
import pickle

from sklearn.cluster import KMeans

# SKLearn Kmeans

def get_model_config(folder_configs, dataset, is_agg=False, party_id=0):

    model = KMeans()

    # Save the model as a pickle file
    fname = os.path.join(folder_configs, 'kmeans-central-model.pickle')
    with open(fname, 'wb') as f:
        pickle.dump(model, f)

    # Generate the model spec that points IBM FL at the saved file
    spec = {
        'model_name': 'sklearn-kmeans',
        'model_definition': fname
    }

    model = {
        'name': 'SklearnKMeansFLModel',
        'path': 'ibmfl.model.sklearn_kmeans_fl_model',
        'spec': spec
    }

    return model
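
A usage sketch for the helper above; the directory name and dataset label are placeholders:

# The helper expects `folder_configs` to exist before it writes the pickle
os.makedirs('configs', exist_ok=True)  # placeholder directory
model_config = get_model_config('configs', dataset='mnist')
print(model_config['spec']['model_definition'])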

TensorFlow 2 model configuration

Here is an example of a TensorFlow 2 model configuration.

import os

import numpy as np
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Conv2D, Dense, Flatten

img_rows, img_cols = 28, 28
batch_size = 28
input_shape = (batch_size, img_rows, img_cols, 1)
sample_input = np.zeros(shape=input_shape)

class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation='relu')
        self.flatten = Flatten()
        self.d1 = Dense(128, activation='relu')
        self.d2 = Dense(10)

    def call(self, x):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)

# Create an instance of the model
model = MyModel()
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True)
optimizer = tf.keras.optimizers.Adam()
acc = tf.keras.metrics.SparseCategoricalAccuracy(name='accuracy')
model.compile(optimizer=optimizer, loss=loss_object, metrics=[acc])
model._set_inputs(sample_input)

# `folder_configs` is the directory where the model is saved
if not os.path.exists(folder_configs):
    os.makedirs(folder_configs)

model.save(folder_configs)
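
As an optional check, the saved model can be loaded back from the SavedModel directory; this step is illustrative and not required by the configuration:

# Optional: re-load the model from the SavedModel directory
restored_model = tf.keras.models.load_model(folder_configs)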

To save the model as an HDF5 file, you can use the following configuration instead:

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(16, activation='relu', input_shape=(11,)),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation='sigmoid')])
model.compile(optimizer=keras.optimizers.Adam(lr=2e-2),
              loss=keras.losses.binary_crossentropy,
              metrics=['accuracy'])  # 'accuracy' is an example metric
# Save in HDF5 format; the file name is illustrative
model.save('model_architecture.h5')