Evaluating Federated Learning Methods.
Realizing Your Brilliant Ideas.
Having Fun with Federated Learning.
🎉 FL-bench can now run FL training in parallel (with the help of Ray)! 🎉
Traditional FL Methods
- FedAvg -- Communication-Efficient Learning of Deep Networks from Decentralized Data (AISTATS'17)
- FedAvgM -- Measuring the Effects of Non-Identical Data Distribution for Federated Visual Classification (ArXiv'19)
- FedProx -- Federated Optimization in Heterogeneous Networks (MLSys'20)
- SCAFFOLD -- SCAFFOLD: Stochastic Controlled Averaging for Federated Learning (ICML'20)
- MOON -- Model-Contrastive Federated Learning (CVPR'21)
- FedDyn -- Federated Learning Based on Dynamic Regularization (ICLR'21)
- FedLC -- Federated Learning with Label Distribution Skew via Logits Calibration (ICML'22)
- FedGen -- Data-Free Knowledge Distillation for Heterogeneous Federated Learning (ICML'21)
- CCVR -- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data (NIPS'21)
- FedOpt -- Adaptive Federated Optimization (ICLR'21)
- Elastic Aggregation -- Elastic Aggregation for Federated Optimization (CVPR'23)
Personalized FL Methods
- pFedSim (My Work⭐) -- pFedSim: Similarity-Aware Model Aggregation Towards Personalized Federated Learning (ArXiv'23)
- Local-Only -- Local training only (without communication).
- FedMD -- FedMD: Heterogenous Federated Learning via Model Distillation (NIPS'19)
- APFL -- Adaptive Personalized Federated Learning (ArXiv'20)
- LG-FedAvg -- Think Locally, Act Globally: Federated Learning with Local and Global Representations (ArXiv'20)
- FedBN -- FedBN: Federated Learning On Non-IID Features Via Local Batch Normalization (ICLR'21)
- FedPer -- Federated Learning with Personalization Layers (AISTATS'20)
- FedRep -- Exploiting Shared Representations for Personalized Federated Learning (ICML'21)
- Per-FedAvg -- Personalized Federated Learning with Theoretical Guarantees: A Model-Agnostic Meta-Learning Approach (NIPS'20)
- pFedMe -- Personalized Federated Learning with Moreau Envelopes (NIPS'20)
- Ditto -- Ditto: Fair and Robust Federated Learning Through Personalization (ICML'21)
- pFedHN -- Personalized Federated Learning using Hypernetworks (ICML'21)
- pFedLA -- Layer-Wised Model Aggregation for Personalized Federated Learning (CVPR'22)
- CFL -- Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints (ArXiv'19)
- FedFomo -- Personalized Federated Learning with First Order Model Optimization (ICLR'21)
- FedBabu -- FedBabu: Towards Enhanced Representation for Federated Image Classification (ICLR'22)
- FedAP -- Personalized Federated Learning with Adaptive Batchnorm for Healthcare (IEEE'22)
- kNN-Per -- Personalized Federated Learning through Local Memorization (ICML'22)
- MetaFed -- MetaFed: Federated Learning among Federations with Cyclic Knowledge Distillation for Personalized Healthcare (IJCAI'22)
- FedRoD -- On Bridging Generic and Personalized Federated Learning for Image Classification (ICLR'22)
- FedProto -- FedProto: Federated prototype learning across heterogeneous clients (AAAI'22)
FL Domain Generalization Methods
- FedSR -- FedSR: A Simple and Effective Domain Generalization Method for Federated Learning (NIPS'22)
- ADCOL -- Adversarial Collaborative Learning on Non-IID Features (ICML'23)
- FedIIR -- Out-of-Distribution Generalization of Federated Learning via Implicit Invariant Relationships (ICML'23)
Select one of the following ways to set up the environment.
# pip
pip install -r .environment/requirements.txt
# conda
conda env create -f .environment/environment.yml
# poetry, for users in mainland China
cd .environment && poetry install --no-root
# poetry, for users outside mainland China
cd .environment && sed -i "10,14d" pyproject.toml && poetry lock --no-update && poetry install --no-root
# docker, for users in mainland China
docker pull registry.cn-hangzhou.aliyuncs.com/karhoutam/fl-bench:master
# docker, for users outside mainland China
docker pull ghcr.io/karhoutam/fl-bench:master
# or
docker pull docker.io/karhoutam/fl-bench:master
# An example of creating a container from the image
docker run -it --name fl-bench -v path/to/FL-bench:/root/FL-bench --privileged --gpus all ghcr.io/karhoutam/fl-bench:master
All method classes inherit from FedAvgServer and FedAvgClient. To understand the entire workflow and the details of the variable settings, see src/server/fedavg.py and src/client/fedavg.py.
# Partition the MNIST according to Dir(0.1) for 100 clients
python generate_data.py -d mnist -a 0.1 -cn 100
For full details on the methods of generating federated datasets, see data/README.md.
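To make the Dir(0.1) partition above more concrete, here is a minimal, standard-library-only sketch of label-wise Dirichlet partitioning. It is not the code behind generate_data.py; the function name dirichlet_partition and its parameters are hypothetical, chosen only for illustration.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, client_num, alpha, seed=0):
    """Split sample indices among clients, class by class, using Dir(alpha) proportions.

    Hypothetical helper for illustration; not part of FL-bench.
    """
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        indices_by_class[label].append(idx)

    partition = [[] for _ in range(client_num)]
    for label in sorted(indices_by_class):
        indices = indices_by_class[label]
        rng.shuffle(indices)
        # Sample Dir(alpha) by normalizing independent Gamma(alpha, 1) draws.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(client_num)]
        total = sum(draws) or 1.0
        start, cumulative = 0, 0.0
        for client_id, draw in enumerate(draws):
            cumulative += draw / total
            if client_id == client_num - 1:
                # The last client takes the remainder so no sample is dropped.
                end = len(indices)
            else:
                end = min(len(indices), round(cumulative * len(indices)))
            partition[client_id].extend(indices[start:end])
            start = end
    return partition


# Toy labels standing in for a 10-class dataset such as MNIST.
toy_labels = [i % 10 for i in range(1000)]
parts = dirichlet_partition(toy_labels, client_num=100, alpha=0.1)
print(len(parts), sum(len(p) for p in parts))
```

A smaller alpha concentrates each class on fewer clients, which is why Dir(0.1) yields a strongly non-IID split.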
python main.py <method> [your_config_file.yml] [method_args...]
❗ The method name should be identical to the .py file name in src/server.
# Run FedAvg with default settings.
python main.py fedavg
- By modifying the config file.
- By explicitly setting them in the CLI, e.g., python main.py fedprox config/my_cfg.yml --mu 0.01.
- By modifying the default value in src/utils/constants.py/DEFAULT_COMMON_ARGS or src/server/<method>.py/get_<method>_args().
⚠ For the same FL method argument, the priority of argument setting is CLI > Config file > Default value.
For example, the default value of fedprox.mu is 1,
from argparse import ArgumentParser, Namespace

def get_fedprox_args(args_list=None) -> Namespace:
    parser = ArgumentParser()
    parser.add_argument("--mu", type=float, default=1.0)
    return parser.parse_args(args_list)
and you set
# your_config.yml
...
fedprox:
  mu: 0.01
in your config file. The resulting value then depends on how you run the method:
python main.py fedprox # fedprox.mu = 1
python main.py fedprox your_config.yml # fedprox.mu = 0.01
python main.py fedprox your_config.yml --mu 10 # fedprox.mu = 10
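The priority rule (CLI > config file > default value) can be pictured as layered dictionary updates. The sketch below is an assumption about how such a merge could work, not FL-bench's actual parsing code; resolve_method_args is a hypothetical name.

```python
def resolve_method_args(defaults, config_file_args=None, cli_args=None):
    """Merge argument sources so that later (higher-priority) sources win.

    Hypothetical helper for illustration; not part of FL-bench.
    """
    resolved = dict(defaults)                 # lowest priority: default values
    resolved.update(config_file_args or {})   # config file overrides defaults
    resolved.update(cli_args or {})           # CLI overrides everything else
    return resolved


defaults = {"mu": 1.0}
print(resolve_method_args(defaults))                               # {'mu': 1.0}
print(resolve_method_args(defaults, {"mu": 0.01}))                 # {'mu': 0.01}
print(resolve_method_args(defaults, {"mu": 0.01}, {"mu": 10.0}))   # {'mu': 10.0}
```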
- Run python -m visdom.server in a terminal.
- Set visible to true.
- Open localhost:8097 in your browser.
For parallel training, set
# your_config_file.yml
mode: parallel
parallel:
  num_workers: 2 # any integer larger than 1
  ...
...
in your config file. Running clients in parallel can substantially improve training efficiency.
A Ray cluster is created implicitly each time you run python main.py <method> ....
Alternatively, you can launch a cluster manually so that a new one is not created for every experiment.
# your_config_file.yml
mode: parallel
parallel:
  ray_cluster_addr: null
  ...
...
ray start --head [OPTIONS]
All common arguments have default values. See DEFAULT_COMMON_ARGS in src/utils/constants.py for full details of the common arguments.
⚠ Common arguments cannot be set via CLI.
You can also write your own .yml config file. A template is provided in config, and it is recommended that you save your config files there as well.
For example: python main.py fedavg config/template.yaml [cli_method_args...]
For the default values of the arguments specific to an FL method, see the corresponding FL-bench/src/server/<method>.py.
Arguments | Type | Description |
---|---|---|
dataset | str | The name of the dataset the experiment runs on. |
model | str | The model backbone used in the experiment. |
seed | int | Random seed for running the experiment. |
join_ratio | float | Ratio of (clients selected each round) / (total number of clients). |
global_epoch | int | Number of global epochs, also called communication rounds. |
local_epoch | int | Number of local epochs for client local training. |
finetune_epoch | int | Number of epochs for which clients fine-tune their models before testing. |
test_interval | int | Interval (in rounds) between tests on clients. |
eval_test | bool | Set to true to evaluate on joined clients' test sets before and after local training. |
eval_val | bool | Set to true to evaluate on joined clients' validation sets before and after local training. |
eval_train | bool | Set to true to evaluate on joined clients' training sets before and after local training. |
optimizer | dict | Client-side optimizer. Arguments are the same as for the optimizers in torch.optim. |
lr_scheduler | dict | Client-side learning rate scheduler. Arguments are the same as for the schedulers in torch.optim.lr_scheduler. |
verbose_gap | int | Interval (in rounds) between displays of clients' training performance in the terminal. |
batch_size | int | Data batch size for client local training. |
use_cuda | bool | Set to true to place tensors on the GPU. |
visible | bool | Set to true to use Visdom to monitor algorithm performance on localhost:8097. |
straggler_ratio | float | The ratio of stragglers (in [0, 1]). Stragglers do not perform full-epoch local training like normal clients; their local epoch is randomly selected from the range [straggler_min_local_epoch, local_epoch). |
straggler_min_local_epoch | int | The minimum local epoch for stragglers. |
external_model_params_file | str | The relative file path of external model parameters. You are responsible for confirming that the parameter shapes are compatible with the model. ⚠ This feature is enabled only when unique_model=False, which is pre-defined by each FL method. |
save_log | bool | Set to true to save the algorithm's running log in out/<method>/<start_time>. |
save_model | bool | Set to true to save the output model parameters (.pt) in out/<method>/<start_time>. |
save_fig | bool | Set to true to save the accuracy curves shown on Visdom as a .pdf file in out/<method>/<start_time>. |
save_metrics | bool | Set to true to save metric statistics as a .csv file in out/<method>/<start_time>. |
viz_win_name | str | Custom Visdom window name (active when visible is set to true). |
check_convergence | bool | Set to true to check convergence after training. |
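
As a rough illustration of how join_ratio, straggler_ratio, and straggler_min_local_epoch interact, the sketch below plans one communication round. It is written under stated assumptions (a randomly chosen straggler subset, at least one client per round) and is not FL-bench's scheduling code; plan_round is a hypothetical name.

```python
import random


def plan_round(client_num, join_ratio, local_epoch,
               straggler_ratio, straggler_min_local_epoch, seed=0):
    """Return {client_id: local_epochs} for one communication round.

    Hypothetical helper for illustration; not part of FL-bench.
    """
    rng = random.Random(seed)
    client_ids = list(range(client_num))
    # Assumption: a random subset of all clients is marked as stragglers.
    stragglers = set(rng.sample(client_ids, int(client_num * straggler_ratio)))
    # join_ratio = (clients selected this round) / (total clients); keep at least one.
    selected = rng.sample(client_ids, max(1, int(client_num * join_ratio)))
    plan = {}
    for client_id in selected:
        if client_id in stragglers and straggler_min_local_epoch < local_epoch:
            # Half-open range [straggler_min_local_epoch, local_epoch).
            plan[client_id] = rng.randrange(straggler_min_local_epoch, local_epoch)
        else:
            plan[client_id] = local_epoch
    return plan


round_plan = plan_round(client_num=100, join_ratio=0.1, local_epoch=5,
                        straggler_ratio=0.5, straggler_min_local_epoch=1)
print(round_plan)
```
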
Arguments | Type | Description |
---|---|---|
num_workers | int | The number of parallel workers. Must be an integer larger than 1. |
ray_cluster_addr | str | The IP address of the selected Ray cluster. Defaults to null, which means Ray builds a new cluster every time you run an experiment and destroys it at the end. More details can be found in the official docs. |
num_cpus and num_gpus | int | The amount of computational resources to allocate. Defaults to null, which means all. |
This benchmark supports a number of common models integrated in Torchvision:
- ResNet family
- EfficientNet family
- DenseNet family
- MobileNet family
- LeNet5 ...
🤗 You can define your own custom model by filling in the CustomModel class in src/utils/models.py and use it by setting model to custom when running.
Regular Image Datasets
- MNIST (1 x 28 x 28, 10 classes)
- CIFAR-10/100 (3 x 32 x 32, 10/100 classes)
- EMNIST (1 x 28 x 28, 62 classes)
- FashionMNIST (1 x 28 x 28, 10 classes)
- FEMNIST (1 x 28 x 28, 62 classes)
- CelebA (3 x 218 x 178, 2 classes)
- SVHN (3 x 32 x 32, 10 classes)
- USPS (1 x 16 x 16, 10 classes)
- Tiny-ImageNet-200 (3 x 64 x 64, 200 classes)
- CINIC-10 (3 x 32 x 32, 10 classes)
Domain Generalization Image Datasets
- DomainNet (3 x ? x ?, 345 classes)
  - See data/README.md for the full processing guideline 🧾.
Medical Image Datasets
- COVID-19 (3 x 244 x 224, 4 classes)
- Organ-S/A/CMNIST (1 x 28 x 28, 11 classes)
The package() method of the server-side class assembles all the parameters the server needs to send to clients. Similarly, the package() method of the client-side class assembles the parameters clients need to send back to the server. You should always call super().package() in your override implementation.
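The override pattern can be sketched with stand-in classes. ToyServer below is not FL-bench's FedAvgServer, and the package(client_id) signature and dictionary keys are assumptions made for illustration; check src/server/fedavg.py for the real ones.

```python
class ToyServer:
    """Stand-in for a FedAvg-style server (hypothetical, not FedAvgServer)."""

    def __init__(self):
        self.global_params = {"w": 0.0}

    def package(self, client_id):
        # Base payload that every method sends to a client.
        return {"client_id": client_id, "params": dict(self.global_params)}


class ToyProxServer(ToyServer):
    """A derived method that needs to send one extra value to clients."""

    def __init__(self, mu):
        super().__init__()
        self.mu = mu

    def package(self, client_id):
        # Start from the parent's payload so nothing it provides is lost,
        # then add the method-specific entries.
        server_package = super().package(client_id)
        server_package["mu"] = self.mu
        return server_package


print(ToyProxServer(mu=0.01).package(client_id=3))
```

Building on the parent's return value keeps the base workflow's fields intact, which is the reason for always calling super().package().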
- Consider inheriting your method classes from FedAvgServer and FedAvgClient to make the most of FL-bench's workflow.
- To customize your server-side process, consider overriding package() and aggregate().
- To customize your client-side training, consider overriding fit() or package().
You can find all the details in FedAvgClient and FedAvgServer, which are the base classes of all implementations in FL-bench.
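For orientation when overriding aggregate(), the FedAvg-style baseline is a weighted average of client parameters, typically weighted by each client's sample count. The sketch below uses plain Python floats instead of tensors and is not FL-bench's aggregate(); weighted_average is a hypothetical name.

```python
def weighted_average(client_params, client_weights):
    """Average per-client parameter dicts, weighting each client by client_weights.

    Hypothetical helper for illustration; not part of FL-bench.
    """
    total = sum(client_weights)
    keys = client_params[0].keys()
    return {
        key: sum(params[key] * weight / total
                 for params, weight in zip(client_params, client_weights))
        for key in keys
    }


# Two clients holding 1 and 3 samples respectively.
print(weighted_average([{"w": 1.0}, {"w": 3.0}], [1, 3]))  # {'w': 2.5}
```

A custom aggregation rule would replace the weights or the combination step while keeping the same inputs and outputs.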
- Inherit your own dataset class from BaseDataset in data/utils/datasets.py and add your class to the dict DATASETS.
- The CustomModel class is provided in src/utils/models.py; you only need to define your model architecture.
- If you want to use your customized model within FL-bench's workflow, base and classifier must be defined. (Tip: you can define one of them as torch.nn.Identity() to bypass it.)
@software{Tan_FL-bench,
author = {Tan, Jiahao and Wang, Xinpeng},
license = {GPL-2.0},
title = {{FL-bench: A federated learning benchmark for solving image classification tasks}},
url = {https://github.com/KarhouTam/FL-bench}
}
@misc{tan2023pfedsim,
title={pFedSim: Similarity-Aware Model Aggregation Towards Personalized Federated Learning},
author={Jiahao Tan and Yipeng Zhou and Gang Liu and Jessie Hui Wang and Shui Yu},
year={2023},
eprint={2305.15706},
archivePrefix={arXiv},
primaryClass={cs.LG}
}