practicing-federated-learning's People

Contributors

dylan-fan, innovation-cat

practicing-federated-learning's Issues

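About Chapter 15

Hello. The model compression section of Chapter 15 states that "on the one hand, as the amount of transmitted data decreases, network bandwidth consumption is effectively reduced; on the other hand, it can prevent model parameters from being stolen" ... and that it "effectively improves the security of the system."
Is there a way to verify these claims?
I only see the accuracy obtained after compressing part of the data, with no evidence that the method actually protects privacy, and so on.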

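Reference links missing from the printed book

Chapters 3, 4, 7, and 12 of the book repeatedly cite external reference links such as "Link 3-5" and "Link 7-13", but the actual addresses behind these links cannot be found anywhere in the printed book. There is currently no solution for this.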

《联邦学习实战》 (Practicing Federated Learning)

Hello, has the book 《联邦学习实战》 been published? I could not find it online. Thanks!

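About the lambda setting in Chapter 3

Hello, what does lambda mean, and why is it set to 0.1? There are 10 clients in total, but only k of them take part in aggregation. With k=5, shouldn't lambda be 0.2?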
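
The arithmetic behind this question can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes the server adds lambda times the sum of the selected clients' updates to the global weights, and the names `aggregate`, `client_diffs`, and `lam` are hypothetical.

```python
def aggregate(global_weights, client_diffs, lam):
    """Return global_weights + lam * (sum of client updates), per parameter.

    Assumption: this mirrors a FedAvg-style server step in which `lam`
    scales the summed updates; it is not taken from the book's code.
    """
    updated = dict(global_weights)
    for name in global_weights:
        total = sum(diff[name] for diff in client_diffs)
        updated[name] = global_weights[name] + lam * total
    return updated


global_weights = {"w": 1.0}
# Five selected clients (k = 5), each proposing the same change of +0.5.
client_diffs = [{"w": 0.5} for _ in range(5)]

# lam = 1/k reproduces the plain average of the k updates.
averaged = aggregate(global_weights, client_diffs, lam=1 / 5)
# lam = 0.1 with k = 5 applies only half of that averaged step.
damped = aggregate(global_weights, client_diffs, lam=0.1)
```

Under this assumption, lambda = 1/k gives the plain mean of the selected clients' updates, while a smaller value such as 0.1 acts like a damped step size on that mean. The sketch does not say why 0.1 was chosen.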

RuntimeError: CUDA out of memory.

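Hello:
When running the program with PyTorch, I get RuntimeError: CUDA out of memory. Tried to allocate 392.00 MiB (GPU 0; 8.00 GiB total capacity; 5.69 GiB already allocated; 84.25 MiB free; 5.72 GiB reserved in total by PyTorch). According to what I found online, this means GPU memory is exhausted because the epoch and batch_size values are too large for the machine. I therefore set "global_epochs": 5, "local_epochs": 2, "batch_size": 2 in conf.json, but the error still occurs.
Is there a good way to solve this? Thank you very much!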

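Has anyone managed to run Chapter 10?

What is the sample dataset? The book was published without any of it? I spent a whole day debugging and it still does not work, just a pile of errors. So it only runs on your machine and nobody else's?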

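When running the Chapter 5 code, the board_url cannot be opened in a browser after training finishes

The operating system is Ubuntu 22.04.
The downloaded version is docker_standalone-fate-1.4.0.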

{
    "data": {
        "board_url": "http://172.18.0.2:8080/index.html#/dashboard?job_id=202305071656462212955&role=guest&party_id=10000",
        "job_dsl_path": "/fate/jobs/202305071656462212955/job_dsl.json",
        "job_runtime_conf_path": "/fate/jobs/202305071656462212955/job_runtime_conf.json",
        "logs_directory": "/fate/logs/202305071656462212955",
        "model_info": {
            "model_id": "arbiter-10000#guest-10000#host-10000#model",
            "model_version": "202305071656462212955"
        }
    },
    "jobId": "202305071656462212955",
    "retcode": 0,
    "retmsg": "success"
}

After copying http://172.18.0.2:8080/index.html#/dashboard?job_id=202305071656462212955&role=guest&party_id=10000 into the browser, the browser reports an error:
172.18.0.2 is currently unable to handle this request.
HTTP ERROR 502

In the Chapter 5 example

Traceback (most recent call last):
  File "./fate/python/fate_flow/operation/task_executor.py", line 168, in run_task
    run_object.run(component_parameters_on_party, task_run_args)
  File "./fate/python/federatedml/model_base.py", line 98, in run
    this_data_output = func(*real_param)
  File "./fate/python/federatedml/util/data_io.py", line 886, in fit
    data_inst = self.reader.read_data(data_inst, "fit")
  File "./fate/python/federatedml/util/data_io.py", line 139, in read_data
    input_data_labels = input_data.mapValues(lambda value: value.split(self.delimitor, -1)[self.label_idx])
  File "./fate/python/fate_arch/common/profile.py", line 282, in _fn
    rtn = func(*args, **kwargs)
  File "./fate/python/fate_arch/computing/standalone/_table.py", line 93, in mapValues
    return Table(self._table.mapValues(func))
  File "./fate/python/fate_arch/_standalone.py", line 141, in mapValues
    return self._unary(func, _do_map_values)
  File "./fate/python/fate_arch/_standalone.py", line 233, in _unary
    func, do_func, self._partitions, self._name, self._namespace
  File "./fate/python/fate_arch/_standalone.py", line 407, in _submit_unary
    results = [r.result() for r in futures]
  File "./fate/python/fate_arch/_standalone.py", line 407, in <listcomp>
    results = [r.result() for r in futures]
  File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
IndexError: list index out of range
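
The failing frame indexes the result of `value.split(self.delimitor, -1)` with `self.label_idx`, so one possible cause is a row that yields fewer fields than the label index expects, for example when the configured delimiter does not match the file. The sketch below is a hypothetical pre-check written for illustration. `find_short_rows` is not part of FATE, and it assumes a non-negative label index.

```python
def find_short_rows(lines, delimiter=",", label_idx=0):
    """Return (line_number, line) pairs that have no field at label_idx.

    Hypothetical helper: it repeats the split-then-index pattern from the
    traceback and reports the rows on which the indexing would fail.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.split(delimiter, -1)
        if label_idx >= len(fields):
            bad.append((number, line))
    return bad


# Made-up sample rows: the third one is space-separated, not comma-separated.
rows = ["1,0.5,0.7", "0,0.1,0.2", "0.3 0.4 1"]
short_rows = find_short_rows(rows, delimiter=",", label_idx=2)
```

Running a check like this over the uploaded data file, with the delimiter and label position taken from the job configuration, would show whether any row is too short to index.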

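CuPy and runtime problems in Chapter 10

What runtime environment does this code need? cupy-cuda11x does not seem to be supported.
In model/utils/nms/non_maximum_suppression.py, @cp.util.memoize() and the function below it have no supported version under CUDA 11.6.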

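About the code environment for computer vision

Hello, my CUDA version is 10.0. While installing the required environment on Windows, the cupy installation kept failing. I then manually chose cupy-cuda100, which installed successfully, but at runtime I get an error saying that cupy has no util attribute. I would like to know exactly which versions this code environment uses.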

A problem with Chapter 10

After downloading the code, the bash script does not run.
It keeps showing nohup: failed to run command 'python3': Permission denied

Chapter 3 horizontal federated image classification fails with OSError: [Errno 22] Invalid argument

Files already downloaded and verified
Traceback (most recent call last):
  File "main.py", line 25, in <module>
    server = Server(conf, eval_datasets)
  File "/Users/huqiming/Library/CloudStorage/OneDrive-stu.cdut.edu.cn/bs/Practicing-FL/Practicing-Federated-Learning-main/chapter03_Python_image_classification/server.py", line 11, in __init__
    self.global_model = models.get_model(self.conf["model_name"])
  File "/Users/huqiming/Library/CloudStorage/OneDrive-stu.cdut.edu.cn/bs/Practicing-FL/Practicing-Federated-Learning-main/chapter03_Python_image_classification/models.py", line 7, in get_model
    model = models.resnet18(pretrained=pretrained)
  File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torchvision/models/resnet.py", line 277, in resnet18
    **kwargs)
  File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torchvision/models/resnet.py", line 263, in _resnet
    progress=progress)
  File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/hub.py", line 590, in load_state_dict_from_url
    return torch.load(cached_file, map_location=map_location)
  File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/serialization.py", line 600, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/serialization.py", line 242, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
OSError: [Errno 22] Invalid argument

Environment:
Python 3.7;
torchaudio 0.10.2;
torchvision 0.11.3

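About federated testing

When doing federated learning on the CIFAR dataset, the eval_dataset that is used is actually the test set. Does that mean the test set is used repeatedly when evaluating the global model?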

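About the accuracy in the backdoor attack part of Chapter 15

When I use the Chapter 15 code to run a backdoor attack on federated learning, I observe that accuracy drops sharply, to around 10%, whenever a malicious client joins training. The loss also rises steeply each time a malicious client joins, and it climbs further with every such round. After 100 epochs the loss has reached 205517059.328000. Is this normal behavior for a backdoor attack, and what causes it? I would be very grateful if someone could explain.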

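About the simple PyTorch implementation in Chapter 3

When the server model is initialized, the code uses self.global_model = models.get_model(self.conf["model_name"]).
But when participating clients are added, why does it also use self.local_model = models.get_model(self.conf["model_name"])?
The model parameter of the initializer is never used. I think it should be changed to self.local_model=model, and the local_train function would then not need a model parameter. Wouldn't it be better to set each selected participant's self.model to the new global model at the start of each communication round?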

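How to use the Chapter 5 horizontal federated learning code

The README says:

5.4 Building a horizontal federated learning pipeline with FATE

5.4.1 Data conversion and input

...
Finally, in the current directory ($fate_dir/examples/federatedml-1.x-examples), run the following command on the command line to upload the data and convert its format automatically: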

python $fate_dir/fate_flow/fate_flow_client.py -f upload -c upload_data.json

5.4.2 Model training

...

Place the dsl and conf files in any directory, and run the following command in that directory to start training:

python $fate_dir/fate_flow/fate_flow_client.py -f submit_job -d test_homolr_train_job_dsl.json -c test_homolr_train_job_conf.json

I installed a newer version of FATE, where the command is
flow job submit -d test_homolr_train_job_dsl.json -c test_homolr_train_job_conf.json
The error is json.decoder.JSONDecodeError: Expecting ',' delimiter: line 21 column 4 (char 459)
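
A JSONDecodeError of this kind means that one of the JSON files being parsed is syntactically invalid at the reported position. One way to find the exact spot before submitting a job is to parse the file with Python's standard `json` module and read the error's `lineno` and `colno` attributes. The sketch below is illustrative only: `check_json` and the sample text are hypothetical and are not taken from the FATE configuration files.

```python
import json


def check_json(text):
    """Return None if text is valid JSON, else (line, column, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.lineno, err.colno, err.msg)
    return None


# Made-up sample with a missing comma between two members.
broken = '{\n  "a": 1\n  "b": 2\n}'
problem = check_json(broken)
fine = check_json('{"a": 1, "b": 2}')
```

To check a real file, the same helper could be applied to the file's contents, for example `check_json(open("test_homolr_train_job_conf.json").read())`.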

Loss visualization in Chapter 3

In theory, shouldn't the loss being visualized be computed on the training set? Why is the loss here computed on the test set?
