Get started

Last updated: Apr 10, 2025

Starting in 5.1.2, Federated Learning is deprecated and will be removed in the future.

Federated Learning is appropriate when entities in different geographical locations or on different cloud providers want to train an analytical model together without sharing their data.

To get started with Federated Learning:

Familiarize yourself with Terminology.
Review the Architecture for creating a Federated Learning experiment.
Follow a tutorial for step-by-step instructions for creating a Federated Learning experiment or review samples.

Terminology

The following terms are used in IBM Federated Learning training processes.

Term	Definition
Party	Users that contribute different sources of data to train a model collaboratively. Federated Learning ensures that the training occurs with no data exposure risk across the different parties. A party must have at least Viewer permission in the watsonx.ai Studio Federated Learning project.
Admin	A party member who configures the Federated Learning experiment by specifying how many parties are allowed and which frameworks to use, and who sets up the Remote Training Systems (RTS). The admin starts the Federated Learning experiment and sees it through to the end. An admin must have at least Editor permission in the watsonx.ai Studio Federated Learning project.
Remote Training System	An asset that is used to authenticate a party to the aggregator. Project members register in the Remote Training System (RTS) before training. Only one member can use a given RTS to participate in an experiment as a party. When multiple parties contribute, each party must authenticate with one RTS for the experiment.
Aggregator	The aggregator fuses the model results between the parties to build one model.
Fusion method	The algorithm that is used to combine the results that the parties return to the aggregator.
Data handler	In IBM Federated Learning, a data handler is a class that is used to load and pre-process data. It also helps to ensure that data collected from multiple sources is formatted uniformly for training. For more details, see Data Handler.
Global model	The resulting model that is fused between different parties.
Training round	A training round is the process of local data training, global model fusion, and update. Training is iterative. The admin can choose the number of training rounds.
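
The training round defined in the table can be sketched in a few lines of plain Python. This is an illustrative simulation only, not IBM Federated Learning code: the names `local_update`, `fuse_weighted_average`, and `run_rounds` are hypothetical, the "model" is just a list of weights, and the fusion method is assumed to be a sample-weighted average, which is one common choice among several. Note that only weights and sample counts reach the aggregator; each party's raw data stays local.

```python
from typing import List, Tuple

Weights = List[float]


def local_update(global_weights: Weights, local_data: List[float],
                 lr: float = 0.5) -> Tuple[Weights, int]:
    """Hypothetical party-side step: move each weight toward the local data mean.

    Returns the updated weights and the number of local samples used.
    """
    local_mean = sum(local_data) / len(local_data)
    updated = [w + lr * (local_mean - w) for w in global_weights]
    return updated, len(local_data)


def fuse_weighted_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Assumed fusion method: average party weights, weighted by sample count."""
    total = sum(n for _, n in updates)
    size = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(size)]


def run_rounds(party_data: List[List[float]], initial: Weights,
               rounds: int) -> Weights:
    """Repeat: local training at each party, fusion, global model update."""
    global_weights = list(initial)
    for _ in range(rounds):
        # Each party trains locally, starting from the current global model.
        updates = [local_update(global_weights, data) for data in party_data]
        # The aggregator fuses the returned results into one global model.
        global_weights = fuse_weighted_average(updates)
    return global_weights
```

For example, `run_rounds([[1.0, 1.0], [4.0, 4.0, 4.0, 4.0]], [0.0], rounds=20)` moves the single weight toward the sample-weighted mean of the two parties' data, without either party's values being passed to `fuse_weighted_average`.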

Architecture

IBM Federated Learning has two main components: the aggregator and the remote training parties. 

Aggregator

The aggregator is a model fusion processor. The admin manages the aggregator.

The aggregator:

Runs as a platform service in the Dallas, Frankfurt, London, or Tokyo region.
Starts with a Federated Learning experiment.

Party

A party is a user that provides model input to the Federated Learning experiment aggregator. The party can be:

On any system that can run the watsonx.ai Runtime Python client and that is compatible with watsonx.ai Runtime frameworks.

Note:
The system does not have to be specifically Cloud Pak for Data as a Service. For a list of system requirements, see Set up your system.

Running on a system in any geographical location. It is recommended that you locate each party in the same region as its data, to avoid moving data out of that region.

This illustration shows the architecture of IBM Federated Learning. A Remote Training System is used to authenticate the party's identity to the aggregator during training. Multiple parties with different data sources connect to their remote training system. Data scientists monitor the training. At the end, a model is created based on collectively trained data.

Illustration of the Federated Learning architecture
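
The authentication role of the Remote Training System can be illustrated with a toy model. The sketch below is not the watsonx.ai API: `ToyAggregator`, `connect`, and the RTS identifiers are hypothetical names, and the only behavior modeled is the rule from the terminology table that a party must present a registered RTS and that a given RTS can be used by only one party in an experiment.

```python
from typing import Dict, Iterable, Set


class ToyAggregator:
    """Hypothetical stand-in for the aggregator's party admission check."""

    def __init__(self, registered_rts_ids: Iterable[str]) -> None:
        # RTS assets that the admin set up for this experiment.
        self._registered: Set[str] = set(registered_rts_ids)
        # Maps each RTS in use to the party that connected with it.
        self._connected: Dict[str, str] = {}

    def connect(self, party_name: str, rts_id: str) -> None:
        """Admit a party only if its RTS is registered and not already in use."""
        if rts_id not in self._registered:
            raise PermissionError(f"unknown Remote Training System: {rts_id}")
        if rts_id in self._connected:
            raise PermissionError(
                f"{rts_id} is already in use by {self._connected[rts_id]}"
            )
        self._connected[rts_id] = party_name

    def connected_parties(self) -> Set[str]:
        """Return the names of all parties admitted so far."""
        return set(self._connected.values())
```

In this model, two parties that connect with `"rts-a"` and `"rts-b"` are both admitted, while a third party that tries to reuse `"rts-a"` is rejected with `PermissionError`.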

User workflow

The data scientist:
1. Identifies the data sources.
2. Creates an initial "untrained" model.
3. Creates a data handler file.
  These tasks might overlap with a training party entity.
4. Scores the end model.
A party connects to the aggregator from their own system, which can be remote.
An admin controls the Federated Learning experiment by:
1. Configuring the experiment to accommodate remote parties.
2. Starting the aggregator.
3. Saving and deploying the model.
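
The data handler file that the data scientist creates in this workflow is a class that loads and pre-processes a party's local data. The following is a minimal standalone sketch of that idea, assuming CSV input with a `label` column. The class name `CsvDataHandler`, its parameters, and the `get_data` return shape are illustrative assumptions; a real data handler is written against the interface described in the Data Handler topic, which this sketch deliberately does not import.

```python
import csv
import io
from typing import List, Sequence, Tuple

Dataset = Tuple[List[List[float]], List[int]]


class CsvDataHandler:
    """Hypothetical data handler: parses CSV text into train and test sets."""

    def __init__(self, csv_text: str, feature_columns: Sequence[str],
                 label_column: str = "label",
                 test_fraction: float = 0.25) -> None:
        self.csv_text = csv_text
        # A fixed column order means every party produces the same layout,
        # even if their source files order the columns differently.
        self.feature_columns = list(feature_columns)
        self.label_column = label_column
        self.test_fraction = test_fraction

    def get_data(self) -> Tuple[Dataset, Dataset]:
        """Return ((x_train, y_train), (x_test, y_test)) as plain lists."""
        rows = list(csv.DictReader(io.StringIO(self.csv_text)))
        # Convert every feature to float and every label to int.
        x = [[float(row[col]) for col in self.feature_columns] for row in rows]
        y = [int(row[self.label_column]) for row in rows]
        # Hold out the last rows as the test set.
        n_test = int(len(rows) * self.test_fraction)
        split = len(rows) - n_test
        return (x[:split], y[:split]), (x[split:], y[split:])
```

Passing the same `feature_columns` list to every party's handler is what keeps the data uniformly formatted in this sketch, regardless of how each source file is laid out.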