pfedsd's Introduction

pFedSD

This is the implementation of "Personalized Edge Intelligence via Federated Self-Knowledge Distillation", published in the IEEE TPDS journal.

Requirements

Run the following commands to install the dependencies.

conda create -y -n fed_d python=3.7
conda activate fed_d
conda install -y -c pytorch pytorch=1.7.1 torchvision=0.8.2 matplotlib python-lmdb cudatoolkit=11.3 cudnn
pip install transformers datasets pytreebank opencv-python torchcontrib gpytorch 
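
After installation, it can help to confirm which PyTorch, CUDA, and cuDNN builds the environment actually picked up. The following is a minimal sketch, not part of the repository; the helper name check_environment is hypothetical, and it only uses standard PyTorch attributes.

```python
import importlib.util
import sys


def check_environment():
    """Collect basic facts about the active Python/PyTorch installation."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if info["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        info["torch"] = torch.__version__
        info["cuda_available"] = torch.cuda.is_available()
        info["cudnn"] = torch.backends.cudnn.version()  # None when cuDNN is not available
    return info


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Run it inside the activated fed_d environment. If cuda_available is False while --on_cuda True is passed, the training commands below are unlikely to work as intended.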

Training and evaluation

Details for each argument are included in ./parameters.py.

The setup of FedAvg/FedProx for ResNet-8 on CIFAR-10 with a pathological data partition:

python run_gloo.py \
    --arch resnet8 --complex_arch master=resnet8,worker=resnet8 --experiment demo \
    --data cifar10 --pin_memory True --batch_size 64 --num_workers 2 --prepare_data combine \
    --partition_data pathological --shard_per_user 2 \
    --train_data_ratio 0.8 --val_data_ratio 0.0 --test_data_ratio 0.2 \
    --n_clients 10 --participation_ratio 0.6 --n_comm_rounds 5 --local_n_epochs 5 \
    --world_conf 0,0,1,1,100 --on_cuda True \
    --fl_aggregate scheme=federated_average \
    --optimizer sgd --lr 0.01 --local_prox_term 0 --lr_warmup False --lr_warmup_epochs 5 --lr_warmup_epochs_upper_bound 150 \
    --lr_scheduler MultiStepLR --lr_decay 0.1 \
    --weight_decay 1e-5 --use_nesterov False --momentum_factor 0.9 \
    --track_time True --display_tracked_time True --python_path $HOME/anaconda3/envs/fed_d/bin/python \
    --manual_seed 7 --pn_normalize True --same_seed_process True \
    --algo fedavg \
    --personal_test True \
    --port 20001 --timestamp $(date "+%Y%m%d%H%M%S")
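
To make the --partition_data pathological --shard_per_user 2 options more concrete, here is a minimal sketch of the commonly used shard-based pathological split: samples are sorted by label, cut into equally sized shards, and each client receives a fixed number of shards. This is an illustration under that assumption and may differ from the code in this repository; the function name pathological_partition and the toy labels are hypothetical.

```python
import random


def pathological_partition(labels, n_clients, shard_per_user, seed=0):
    """Give each client `shard_per_user` shards cut from the label-sorted data."""
    rng = random.Random(seed)
    # Sort indices by label so each contiguous shard covers one (or very few) classes.
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    n_shards = n_clients * shard_per_user
    shard_size = len(order) // n_shards  # leftover samples are dropped in this sketch
    shards = [order[s * shard_size:(s + 1) * shard_size] for s in range(n_shards)]
    rng.shuffle(shards)
    return [
        [idx for shard in shards[c * shard_per_user:(c + 1) * shard_per_user] for idx in shard]
        for c in range(n_clients)
    ]


if __name__ == "__main__":
    toy_labels = [i % 10 for i in range(1000)]  # 10 balanced classes standing in for CIFAR-10
    clients = pathological_partition(toy_labels, n_clients=10, shard_per_user=2)
    for cid, idxs in enumerate(clients):
        print(cid, sorted({toy_labels[i] for i in idxs}))
```

With balanced classes and 2 shards per client, each client ends up with samples from at most two classes, which is what makes this split strongly non-IID.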

The setup of pFedSD for a simple CNN on CIFAR-10 with a Dirichlet data partition:

python run_gloo.py \
    --arch simple_cnn --complex_arch master=simple_cnn,worker=simple_cnn --experiment demo \
    --data cifar10 --pin_memory True --batch_size 64 --num_workers 2 --prepare_data combine \
    --partition_data non_iid_dirichlet --non_iid_alpha 0.1 \
    --train_data_ratio 0.8 --val_data_ratio 0.0 --test_data_ratio 0.2 \
    --n_clients 10 --participation_ratio 0.6 --n_comm_rounds 5 --local_n_epochs 5 \
    --world_conf 0,0,1,1,100 --on_cuda True \
    --fl_aggregate scheme=federated_average \
    --optimizer sgd --lr 0.01 --local_prox_term 0 --lr_warmup False --lr_warmup_epochs 5 --lr_warmup_epochs_upper_bound 150 \
    --lr_scheduler MultiStepLR --lr_decay 0.1 \
    --weight_decay 1e-5 --use_nesterov False --momentum_factor 0.9 \
    --track_time True --display_tracked_time True --python_path $HOME/anaconda3/envs/fed_d/bin/python \
    --manual_seed 7 --pn_normalize True --same_seed_process True \
    --algo pFedSD \
    --personal_test True \
    --port 20002 --timestamp $(date "+%Y%m%d%H%M%S")
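
The --partition_data non_iid_dirichlet --non_iid_alpha 0.1 options refer to a Dirichlet-based label split. A common way to build one is sketched below: for every class, client proportions are drawn from a Dirichlet distribution with concentration alpha, and that class's samples are divided accordingly. This is a sketch of the general technique, not necessarily the exact procedure used in this repository; the function name dirichlet_partition is hypothetical.

```python
import random


def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients with per-class Dirichlet(alpha) proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    clients = [[] for _ in range(n_clients)]
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        # A Dirichlet(alpha) draw equals Gamma(alpha, 1) draws normalised to sum to 1.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(n_clients)]
        total = sum(gammas)
        if total <= 0.0:  # numerical safeguard: fall back to a uniform split
            gammas, total = [1.0] * n_clients, float(n_clients)
        start, cumulative = 0, 0.0
        for c in range(n_clients):
            cumulative += gammas[c] / total
            # The last client takes whatever remains so that no sample is lost.
            end = len(idxs) if c == n_clients - 1 else int(round(cumulative * len(idxs)))
            end = max(start, min(end, len(idxs)))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


if __name__ == "__main__":
    toy_labels = [i % 10 for i in range(1000)]  # stand-in for CIFAR-10 labels
    for cid, idxs in enumerate(dirichlet_partition(toy_labels, n_clients=10, alpha=0.1)):
        print(cid, len(idxs))
```

Smaller alpha values concentrate each class on fewer clients; larger values move the split toward IID.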

Citation

TODO

Acknowledgements

The skeleton codebase in this repository was adapted from FedDF [1].

[1] T. Lin, L. Kong, S. U. Stich, and M. Jaggi, “Ensemble distillation for robust model fusion in federated learning,” in NeurIPS, 2020.

pfedsd's People

Contributors

carlbye

pfedsd's Issues

'Unable to find a valid cuDNN algorithm to run convolution'

First, thanks for your work! When I run your example script for pFedSD (simple CNN on CIFAR-10 with the Dirichlet distribution), an error occurs, as the image below shows. I searched Stack Overflow, and someone said that decreasing the batch size may help. However, when I changed the batch size from 64 to 16, the same error occurred.
The code is running on 2 NVIDIA 3090 GPUs.
image

problem

Hello, I would like to ask: how can I plot the data distribution of the pathological non-IID partition?
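
One possible starting point for such a plot is a per-client class-count matrix. The sketch below assumes you already have the label of every sample and the list of sample indices assigned to each client; the names client_label_counts and render_counts are hypothetical and not taken from the repository.

```python
from collections import Counter


def client_label_counts(labels, client_indices, n_classes):
    """Return one row per client holding the number of samples of each class."""
    rows = []
    for idxs in client_indices:
        counts = Counter(labels[i] for i in idxs)
        rows.append([counts.get(k, 0) for k in range(n_classes)])
    return rows


def render_counts(rows):
    """Format the count matrix as plain text (clients as rows, classes as columns)."""
    return "\n".join(
        f"client {cid}: " + " ".join(f"{n:4d}" for n in row)
        for cid, row in enumerate(rows)
    )


if __name__ == "__main__":
    toy_labels = [0, 0, 1, 1, 2, 2]
    toy_clients = [[0, 1, 2], [3, 4, 5]]  # sample indices held by each client
    print(render_counts(client_label_counts(toy_labels, toy_clients, n_classes=3)))
```

The returned matrix can then be passed to a plotting call such as matplotlib.pyplot.imshow to obtain a client-by-class heatmap.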

RuntimeError: [../third_party/gloo/gloo/transport/tcp/pair.cc:598] Connection closed by peer [172.28.0.12]:14004

First of all, thanks for your paper. pFedSD is a great case for FKD. When I run your code for pFedSD, it always shows errors about process communication, such as:

Process Process-4:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/content/pFedSD/run_gloo.py", line 82, in main
    process.run()
  File "/content/pFedSD/pcode/workers/worker_pFedSD.py", line 47, in run
    self._send_model_to_master()
  File "/content/pFedSD/pcode/workers/worker_base.py", line 304, in _send_model_to_master
    dist.send(tensor=flatten_model.buffer, dst=0)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/distributed_c10d.py", line 1295, in send
    default_pg.send([tensor], dst, tag).wait()
RuntimeError: [../third_party/gloo/gloo/transport/tcp/pair.cc:598] Connection closed by peer [172.28.0.12]:43185

I would appreciate it if you could give me some tips about this error. Thanks.
