fedgen's People

Contributors

avivbick, xcrossd, zhuangdizhu

fedgen's Issues

the question about main_plot.py

Hello,
Sorry, I have a problem with main_plot.py.

The problem:
FileNotFoundError: [Errno 2] No such file or directory: 'figs\Mnist/ratio0.5\Mnist-ratio0.5.png'

I hope you can take a look despite your busy schedule. I have only just started working in this area. Thank you!
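
One common cause of this error is that the output directory does not exist yet when the figure is saved. The following is a minimal sketch under that assumption; `ensure_parent_dir` is a hypothetical helper and not part of the repository.

```python
import os


def ensure_parent_dir(fig_path):
    """Create the directory that will contain fig_path, if it is missing.

    Hypothetical helper: call it on the output path before saving a figure.
    """
    parent = os.path.dirname(fig_path)
    if parent:
        # exist_ok=True makes the call safe when the directory already exists.
        os.makedirs(parent, exist_ok=True)
    return fig_path


# Building the path with os.path.join avoids mixing "\" and "/" separators.
example_path = os.path.join("figs", "Mnist", "ratio0.5", "Mnist-ratio0.5.png")
```

Calling `ensure_parent_dir(example_path)` just before the figure is written would create `figs/Mnist/ratio0.5` on first use.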

Unable to perform Mnist experiments

When I run "python main.py --dataset Mnist-alpha0.01-ratio0.05 --algorithm FedAvg --batch_size 32 --num_glob_iters 200 --local_epochs 20 --num_users 10 --lamda 1 --learning_rate 0.01 --model cnn --personal_learning_rate 0.01 --times 3", I get the following problem. How can I solve it?

Average Global Accurancy = 0.0950, Loss = 2.31.
Traceback (most recent call last):
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\FLAlgorithms\users\userbase.py", line 163, in get_next_train_batch
(X, y) = next(self.iter_trainloader)
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 633, in __next__
data = self._next_data()
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 676, in _next_data
index = self._next_index() # may raise StopIteration
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 623, in _next_index
return next(self._sampler_iter) # may raise StopIteration
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\main.py", line 85, in <module>
main(args)
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\main.py", line 42, in main
run_job(args, i)
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\main.py", line 37, in run_job
server.train(args)
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\FLAlgorithms\servers\serveravg.py", line 35, in train
user.train(glob_iter, personalized=self.personalized) #* user.train_samples
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\FLAlgorithms\users\useravg.py", line 23, in train
result =self.get_next_train_batch(count_labels=count_labels)
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\FLAlgorithms\users\userbase.py", line 167, in get_next_train_batch
(X, y) = next(self.iter_trainloader)
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 633, in __next__
data = self._next_data()
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 676, in _next_data
index = self._next_index() # may raise StopIteration
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 623, in _next_index
return next(self._sampler_iter) # may raise StopIteration
StopIteration
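
The second traceback shows `next` failing again right after the iterator was recreated. That would happen if a user's loader yields no batches at all, for example an empty local dataset, or fewer samples than `batch_size` combined with `drop_last=True`. This is an assumption about the run, not something the log confirms. A minimal sketch of a restart that reports this case explicitly, with `next_batch` as a hypothetical helper and any iterable standing in for the train loader:

```python
def next_batch(loader, iterator):
    """Return (batch, iterator), restarting the iterator once if exhausted.

    Hypothetical helper: `loader` is any re-iterable object (a list here,
    a data loader in practice) and `iterator` is the current iter(loader).
    """
    try:
        return next(iterator), iterator
    except StopIteration:
        # End of an epoch: start a fresh pass over the loader.
        iterator = iter(loader)
        try:
            return next(iterator), iterator
        except StopIteration:
            # A fresh iterator that is already exhausted means no batches exist.
            raise ValueError(
                "loader yields no batches; check the user's sample count "
                "against batch_size"
            ) from None
```

With this shape, an empty loader fails with a message that points at the data split instead of a bare `StopIteration`.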

Training with CIFAR-10

Thank you for the great work.

Has anyone tried training with CIFAR-10? I followed the setup for Mnist: I replaced the Mnist data loader with one for CIFAR-10, changed the input dimension from 1 to 3, and kept the same models. However, the result is not good (about 31%) with FedAvg.

Is any special setting needed when running experiments on a new dataset? Thank you.

run the code on cuda device

It seems that the code does not support CUDA?

--device "cuda" can be set, but the code always seems to run on the CPU.

Thanks

Partial Parameter Sharing Not Supported

It seems that the implemented code does not perform partial parameter sharing. As shown in line 103 of serverpFedGen.py, partial parameter sharing is set to False by default, but the pseudo-code in the paper shows that only the classifier layer of the user's model is shared. Is this a bug, or is there something I misunderstand in the code?
self.aggregate_parameters()

Network configs: [6, 16, 'F']

Hi, I'm unable to run any of the files.
This is the output I get. What does "Network configs: [6, 16, 'F']" mean?
python main.py --dataset Mnist-alpha0.1-ratio0.5 --algorithm FedDistll-FL --batch_size 32 --num_glob_iters 200 --local_epochs 20 --num_users 10 --lamda 1 --learning_rate 0.01 --model cnn --personal_learning_rate 0.01 --times 3

Summary of training process:
Algorithm: FedDistll-FL
Batch size: 32
Learing rate : 0.01
Ensemble learing rate : 0.0001
Average Moving : 1.0
Subset of users : 10
Number of global rounds : 200
Number of local rounds : 20
Dataset : Mnist-alpha0.1-ratio0.5
Local Model : cnn
Device : cpu

     [ Start training iteration 0 ]

Creating model for mnist
Network configs: [6, 16, 'F']
Algorithm FedDistll-FL has not been implemented.

Cannot utilize GPU for FedGen

I ran the example FedGen experiment on Mnist from README.md with the option "--device cuda", but found that no process was placed on the GPU. Looking further into the code, it seems that "args.device" is not handled in any of the scripts. I also added "os.environ["CUDA_VISIBLE_DEVICES"] = '0'" to main.py, but the model is still deployed only on the CPU. How can I use the GPU for FedGen? I would really appreciate your help!

Question about the implementation of "FedProx"

Hi.

Does your implementation of FedProx correspond to Algorithm 2 in the original FedProx paper? More specifically, the update formula in lines 53-54 of "fedoptimizer.py" seems a little strange. In particular, what does lambda mean in the FedProx algorithm?

The update formula I understand should be :
p.data = p.data - group['lr'] * (p.grad.data + group['mu'] * (p.data - pstar.data.clone()))

Looking forward to your reply.
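
For reference, the update proposed in this issue can be written out on plain Python floats. This is only a sketch of the formula as the issue states it, with `lr` and `mu` standing in for `group['lr']` and `group['mu']`, and `fedprox_step` as a hypothetical name. It is not the code in fedoptimizer.py.

```python
def fedprox_step(params, grads, global_params, lr, mu):
    """Apply one step of p <- p - lr * (grad + mu * (p - p_star)).

    params        : current local parameters (list of floats)
    grads         : gradients of the local loss at params
    global_params : the global model p_star that the proximal term pulls toward
    """
    return [
        p - lr * (g + mu * (p - p_star))
        for p, g, p_star in zip(params, grads, global_params)
    ]


# Worked example: p = 1.0, grad = 0.5, p_star = 0.0, lr = 0.1, mu = 0.1
# gives 1.0 - 0.1 * (0.5 + 0.1 * 1.0) = 0.94.
updated = fedprox_step([1.0], [0.5], [0.0], lr=0.1, mu=0.1)
```

Setting `mu=0` removes the proximal term and reduces the step to plain gradient descent.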

Trainloader is not shuffled

FedAvg performs worse than FedGen simply because the Trainloader does not shuffle the data. After fixing this, FedGen is not as effective as FedAvg.
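
In PyTorch, shuffling is normally enabled by passing `shuffle=True` to `torch.utils.data.DataLoader`. What that changes can be sketched without PyTorch: the sample order is redrawn for every pass over the data. `epoch_batches` below is a hypothetical stand-in for a loader, not code from the repository.

```python
import random


def epoch_batches(samples, batch_size, shuffle=True, rng=None):
    """Yield the batches of one epoch, optionally in a freshly shuffled order."""
    order = list(range(len(samples)))
    if shuffle:
        # Redraw the order on every call, so each epoch sees different batches.
        (rng or random).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


data = list(range(10))
fixed_order = list(epoch_batches(data, 4, shuffle=False))
shuffled = list(epoch_batches(data, 4, rng=random.Random(0)))
```

Without shuffling, every epoch yields the same batches in the same order, which is the behavior the issue describes.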

Wrong tensor type error

If you hit wrong tensor type errors when running experiments with the FedGen algorithm, see the changes in #3.

plot problem

I think that in plot_utils.py, the variable 'all_curves' used outside the loop only holds the last algorithm's results. As a result, when several algorithms are listed in the config, the range of the plotted figure follows the last algorithm only, and the curves of the other algorithms get cut off.

max_acc = np.max([max_acc, np.max(all_curves) ]) + 4e-2

python main_plot.py --dataset EMnist-alpha0.1-ratio0.1 --algorithms FedAvg,FedGen,FedProx,FedDistill --batch_size 32 --local_epochs 20 --num_users 10 --num_glob_iters 200 --plot_legend 1
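
One way to avoid this, sketched under the assumption that the curves of each algorithm are available as lists of accuracy values, is to update the maximum over all algorithms instead of only the last one. The names `global_max_accuracy` and `curves_by_algorithm` are hypothetical and do not come from plot_utils.py.

```python
def global_max_accuracy(curves_by_algorithm, margin=4e-2):
    """Return the largest accuracy across all algorithms, plus a plot margin.

    curves_by_algorithm maps an algorithm name to a list of runs, where each
    run is a list of per-round accuracies.
    """
    max_acc = 0.0
    for runs in curves_by_algorithm.values():
        for run in runs:
            # Update inside the loop so every algorithm contributes.
            max_acc = max(max_acc, max(run))
    return max_acc + margin


example = {
    "FedAvg": [[0.50, 0.90], [0.55, 0.88]],
    "FedGen": [[0.60, 0.70]],
}
upper = global_max_accuracy(example)
```

In this example the upper limit follows FedAvg's 0.90 even though FedGen is processed last.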

Reproduce "FedDF" baseline

Thank you for open-sourcing your project. I notice that "FedDF" (Ensemble Distillation for Robust Model Fusion in Federated Learning) is one of your baselines in your paper, however, you provide code for only FedAvg, FedProx, FedDistill, and FedGen. Could you please help me reproduce the results of FedDF? I really appreciate your help.

Can't run EMNIST experiment

When I ran the EMNIST experiment after generation of emnist dataset I got:

(pt) wangshu@ubuntu:~/projects/FedGen$ CUDA_VISIBLE_DEVICES=3 python main.py --dataset EMnist-alpha0.1-ratio0.1 --algorithm FedGen --batch_size 32 --local_epochs 20 --num_users 10 --lamda 1 --model cnn --learning_rate 0.01 --personal_learning_rate 0.01 --num_glob_iters 200 --times 3 
================================================================================
Summary of training process:
Algorithm: FedGen
Batch size: 32
Learing rate       : 0.01
Ensemble learing rate       : 0.0001
Average Moving       : 1.0
Subset of users      : 10
Number of global rounds       : 200
Number of local rounds       : 20
Dataset       : EMnist-alpha0.1-ratio0.1
Local Model       : cnn
Device            : cpu
================================================================================


         [ Start training iteration 0 ]           


Creating model for emnist
Network configs: [6, 16, 'F']
Dataset emnist
/home/wangshu/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='none' instead.
  warnings.warn(warning.format(ret))
Build layer 57 X 256
Build last layer 256 X 32
ensemble_lr: 0.0001
ensemble_batch_size: 128
unique_labels: 25
latent_layer_idx: -1
label embedding 0
ensemeble learning rate: 0.0001
ensemeble alpha = 1, beta = 0, eta = 1
generator alpha = 10, beta = 1
Number of Train/Test samples: 12480 8120
Data from 20 users in total.
Finished creating FedAvg server.


-------------Round number:  0  -------------


Traceback (most recent call last):
  File "/home/wangshu/projects/FedGen/main.py", line 85, in <module>
    main(args)
  File "/home/wangshu/projects/FedGen/main.py", line 42, in main
    run_job(args, i)
  File "/home/wangshu/projects/FedGen/main.py", line 37, in run_job
    server.train(args)
  File "/home/wangshu/projects/FedGen/FLAlgorithms/servers/serverpFedGen.py", line 78, in train
    self.evaluate()
  File "/home/wangshu/projects/FedGen/FLAlgorithms/servers/serverbase.py", line 226, in evaluate
    test_ids, test_samples, test_accs, test_losses = self.test(selected=selected)
  File "/home/wangshu/projects/FedGen/FLAlgorithms/servers/serverbase.py", line 165, in test
    ct, c_loss, ns = c.test()
  File "/home/wangshu/projects/FedGen/FLAlgorithms/users/userbase.py", line 137, in test
    loss += self.loss(output, y)
  File "/home/wangshu/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wangshu/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/modules/loss.py", line 216, in forward
    return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/wangshu/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/functional.py", line 2388, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 25 is out of bounds.
(pt) wangshu@ubuntu:~/projects/FedGen$ 

PyTorch 1.8.1, Python 3.9.4.
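
The log reports `unique_labels: 25` while the loss receives a target of 25, which would be out of range if the model expects class indices 0 to 24. Assuming that is the cause (the log does not show how the labels were generated), the sketch below checks for out-of-range labels and builds a contiguous mapping. Both function names are hypothetical.

```python
def out_of_range_labels(labels, num_classes):
    """Return the sorted distinct labels that fall outside 0..num_classes-1."""
    return sorted({y for y in labels if not 0 <= y < num_classes})


def build_label_map(all_labels):
    """Map every distinct label to a contiguous index, in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(all_labels)))}


bad = out_of_range_labels([0, 3, 25, 24, 25], num_classes=25)
label_map = build_label_map([1, 25, 3])
remapped = [label_map[y] for y in [25, 1, 3, 25]]
```

If labels are remapped this way, the same mapping has to be built once from all users' train and test labels and then reused everywhere, otherwise class indices would disagree between users.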
