License: Apache License 2.0

Python 96.62% Shell 3.08% Cython 0.30%


practicing-federated-learning's Issues

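About Chapter 15

Hello. The model compression section of Chapter 15 states that "on the one hand, as the volume of transmitted data decreases, network bandwidth consumption is effectively reduced; on the other hand, it can prevent model parameters from being stolen" ... and that this "effectively improves the security of the system".
Is there a way to verify these claims?
I only see the accuracy obtained when part of the data is compressed; nothing demonstrates that the method actually protects privacy.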

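Reference links missing from the print edition

Chapters 3, 4, 7, and 12 of the book repeatedly cite external reference links, such as "Link 3-5" and "Link 7-13", but the addresses these links point to cannot be found anywhere in the print edition. There is currently no solution for this.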

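About the lambda setting in Chapter 3

Hello, what does lambda mean, and why is it set to 0.1? There are 10 clients in total, but only k of them take part in each aggregation; with k=5, shouldn't lambda be 0.2?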
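
One way to make the question concrete is the sketch below. It assumes the server applies the rule w <- w + lambda * (sum of the weight differences from the k sampled clients); that rule, the function name `aggregate`, and the toy numbers are assumptions for illustration and are not confirmed by the issue. Under that assumption, lambda = 1/k reproduces a plain average of the k updates, while lambda = 0.1 with k = 5 applies half of the averaged update, so it behaves like a server-side step size of 0.5.

```python
def aggregate(global_weights, client_diffs, lam):
    """Assumed rule (hypothetical): w <- w + lam * sum_i(diff_i)."""
    summed = [sum(d[j] for d in client_diffs) for j in range(len(global_weights))]
    return [w + lam * s for w, s in zip(global_weights, summed)]


global_weights = [0.0, 0.0]
# k = 5 sampled clients; identical updates keep the arithmetic easy to follow.
client_diffs = [[1.0, 2.0] for _ in range(5)]

mean_step = aggregate(global_weights, client_diffs, lam=1 / 5)  # plain average of the 5 updates
half_step = aggregate(global_weights, client_diffs, lam=0.1)    # half of the averaged update

print(mean_step, half_step)
```

Whether 0.1 was chosen deliberately as a damped step or as 1 divided by the total number of clients is something only the authors can confirm.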

RuntimeError: CUDA out of memory.

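Hello,
When running the program with PyTorch I get RuntimeError: CUDA out of memory. Tried to allocate 392.00 MiB (GPU 0; 8.00 GiB total capacity; 5.69 GiB already allocated; 84.25 MiB free; 5.72 GiB reserved in total by PyTorch). From what I found online, this means GPU memory is exhausted because the epoch and batch_size values are too large for the machine. I therefore set "global_epochs": 5, "local_epochs": 2, "batch_size": 2 in conf.json, but the error still occurs.
Is there a good way to solve this? Thank you very much!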

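Chapter 5: implementing horizontal logistic regression from scratch with FATE

I deployed the standalone version of the FATE framework, version 1.11.3, with fate-flow version 2.0.0b0.
Uploading upload_data with fate_flow_client fails; it looks like a version problem.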

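Has anyone managed to run Chapter 10?

What is the sample dataset? Was the book published without providing one? I spent a whole day debugging without success and got a pile of errors. Does the code only run on your machine and not on anyone else's?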

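When running the Chapter 5 code, the board_url cannot be opened in a browser after training finishes

The operating system is Ubuntu 22.04.
The downloaded version is docker_standalone-fate-1.4.0.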

{
    "data": {
        "board_url": "http://172.18.0.2:8080/index.html#/dashboard?job_id=202305071656462212955&role=guest&party_id=10000",
        "job_dsl_path": "/fate/jobs/202305071656462212955/job_dsl.json",
        "job_runtime_conf_path": "/fate/jobs/202305071656462212955/job_runtime_conf.json",
        "logs_directory": "/fate/logs/202305071656462212955",
        "model_info": {
            "model_id": "arbiter-10000#guest-10000#host-10000#model",
            "model_version": "202305071656462212955"
        }
    },
    "jobId": "202305071656462212955",
    "retcode": 0,
    "retmsg": "success"
}

Copying http://172.18.0.2:8080/index.html#/dashboard?job_id=202305071656462212955&role=guest&party_id=10000 into the browser produces this error:
172.18.0.2 is currently unable to handle this request.
HTTP ERROR 502

In the Chapter 5 example

Traceback (most recent call last):
File "./fate/python/fate_flow/operation/task_executor.py", line 168, in run_task
run_object.run(component_parameters_on_party, task_run_args)
File "./fate/python/federatedml/model_base.py", line 98, in run
this_data_output = func(*real_param)
File "./fate/python/federatedml/util/data_io.py", line 886, in fit
data_inst = self.reader.read_data(data_inst, "fit")
File "./fate/python/federatedml/util/data_io.py", line 139, in read_data
input_data_labels = input_data.mapValues(lambda value: value.split(self.delimitor, -1)[self.label_idx])
File "./fate/python/fate_arch/common/profile.py", line 282, in _fn
rtn = func(*args, **kwargs)
File "./fate/python/fate_arch/computing/standalone/_table.py", line 93, in mapValues
return Table(self._table.mapValues(func))
File "./fate/python/fate_arch/_standalone.py", line 141, in mapValues
return self._unary(func, _do_map_values)
File "./fate/python/fate_arch/_standalone.py", line 233, in _unary
func, do_func, self._partitions, self._name, self._namespace
File "./fate/python/fate_arch/_standalone.py", line 407, in _submit_unary
results = [r.result() for r in futures]
File "./fate/python/fate_arch/_standalone.py", line 407, in <listcomp>
results = [r.result() for r in futures]
File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 425, in result
return self.__get_result()
File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
IndexError: list index out of range
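
The failing frame in the traceback indexes the result of `value.split(self.delimitor, -1)` with `self.label_idx`, so the error appears whenever a row splits into fewer fields than the label index expects. The cause in this particular report is not known; a delimiter that does not match the data file, or a blank row, are two possibilities. The following is a minimal sketch for checking a data file before upload. The helper name `find_short_rows` and the sample rows are hypothetical and are not part of FATE.

```python
def find_short_rows(rows, delimiter, label_idx):
    """Return (row_number, row) pairs for which row.split(delimiter)[label_idx] would fail."""
    bad = []
    for i, row in enumerate(rows):
        # Same split call as in the traceback; too few fields means an IndexError later.
        if len(row.split(delimiter, -1)) <= label_idx:
            bad.append((i, row))
    return bad


# Hypothetical sample: one good row, one row using ';' instead of ',', one blank row.
rows = ["1,0.5,0.2", "0;0.1;0.9", ""]
bad = find_short_rows(rows, ",", 1)
print(bad)
```

Any row reported by such a check is a candidate for the `IndexError` above.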

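CuPy and runtime problems in Chapter 10

What environment is this code meant to run in? cupy-cuda11x does not seem to be supported.
The @cp.util.memoize() decorator in model/utils/nms/non_maximum_suppression.py, and the function below it, have no supported version under CUDA 11.6.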

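About the code environment for computer vision

Hello, my CUDA version is 10.0. While installing the required environment on Windows, installing cupy kept failing. I then manually chose cupy-cuda100, which installed successfully, but running the code fails with an error saying that cupy has no util attribute. I would like to know exactly which versions this code environment uses.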

A problem with Chapter 10

The downloaded code will not run with bash.
It keeps showing nohup: failed to run command 'python3': Permission denied

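About the Chapter 15 code for federated learning with differential privacy

In the Chapter 15 code for federated learning with differential privacy, adding noise after model aggregation makes every model prediction NaN, so the loss is NaN and acc is 10. Why does this happen?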

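Erratum suggestion: P38, 3.2.2 Converting between Tensor and Python data structures

Practicing Federated Learning (《联邦学习实战》)

P38, 3.2.2 Converting between Tensor and Python data structures
Code:

a3 = torch.from_tensor(arr)
should be changed to
a3 = torch.from_numpy(arr)

Edition: 1st edition, May 2021
Printing: 1st printing, May 2021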

Chapter 3 horizontal federated image classification fails with OSError: [Errno 22] Invalid argument

Files already downloaded and verified
Traceback (most recent call last):
File "main.py", line 25, in <module>
server = Server(conf, eval_datasets)
File "/Users/huqiming/Library/CloudStorage/OneDrive-stu.cdut.edu.cn/bs/Practicing-FL/Practicing-Federated-Learning-main/chapter03_Python_image_classification/server.py", line 11, in __init__
self.global_model = models.get_model(self.conf["model_name"])
File "/Users/huqiming/Library/CloudStorage/OneDrive-stu.cdut.edu.cn/bs/Practicing-FL/Practicing-Federated-Learning-main/chapter03_Python_image_classification/models.py", line 7, in get_model
model = models.resnet18(pretrained=pretrained)
File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torchvision/models/resnet.py", line 277, in resnet18
**kwargs)
File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torchvision/models/resnet.py", line 263, in _resnet
progress=progress)
File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/hub.py", line 590, in load_state_dict_from_url
return torch.load(cached_file, map_location=map_location)
File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/serialization.py", line 600, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/serialization.py", line 242, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
OSError: [Errno 22] Invalid argument

Environment:
Python 3.7;
torchaudio 0.10.2;
torchvision 0.11.3

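About federated testing

When running federated learning on the CIFAR dataset, the eval_dataset is actually the test set. Does that mean the test set is used repeatedly when evaluating the global model?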

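Chapter 15: differential privacy combined with federated learning

After three rounds of training the acc stays at 10 and the loss is nan. Why is that? Can anyone help explain?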

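Chapter 10 federated vision case study: dataset cannot be downloaded

As the title says: following the README, I opened https://dataset.fedai.org/#/datasetfed but could not find a registration page, so I cannot download the dataset. I hope the dataset can be uploaded to GitHub or Kaggle.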

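About accuracy in the Chapter 15 backdoor attack section

While running the Chapter 15 code for a backdoor attack on federated learning, I observed that whenever a malicious client joins training, accuracy drops sharply to around 10% and the loss rises steeply. The loss climbs further each time a malicious client joins, and after 100 epochs it had reached 205517059.328000. Is this normal behavior for a backdoor attack, and what causes it? I would be very grateful for any explanation.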

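About the simple PyTorch implementation in Chapter 3

When the server model is initialized, the code uses self.global_model = models.get_model(self.conf["model_name"]).
But why does adding a participating client also use self.local_model = models.get_model(self.conf["model_name"])?
The model parameter of the initializer is never used. I think it should be changed to self.local_model = model, and local_train should not need a model parameter. Would it be better to set each selected participant's self.model to the new global model at the start of each communication round?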

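How to use the Chapter 5 horizontal federated learning code

The README says:

5.4 Building a horizontal federated learning pipeline with FATE

5.4.1 Data conversion and input

...
Finally, in the current directory ($fate_dir/examples/federatedml-1.x-examples), run the following command on the command line to complete the upload and format conversion automatically: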

python $fate_dir/fate_flow/fate_flow_client.py -f upload -c upload_data.json

5.4.2 Model training

...

Place the dsl and conf files in any directory and run the following command in that directory to start training:

python $fate_dir/fate_flow/fate_flow_client.py -f submit_job -d test_homolr_train_job_dsl.json -c test_homolr_train_job_conf.json

I installed a newer version of FATE, where the command is
flow job submit -d test_homolr_train_job_dsl.json -c test_homolr_train_job_conf.json
It fails with json.decoder.JSONDecodeError: Expecting ',' delimiter: line 21 column 4 (char 459)
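
The error message comes from Python's standard `json` module and points at a syntax problem in one of the two JSON files (line 21, column 4), not necessarily at FATE itself. The sketch below shows one way to locate such a problem before submitting the job. The helper name `describe_json_error` and the sample text are hypothetical; the keys in the sample are placeholders and are not taken from the reporter's files.

```python
import json


def describe_json_error(text):
    """Return a short description of the first JSON syntax error, or None if the text parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        lines = text.splitlines()
        # err.lineno is 1-based; guard against an error reported past the last line.
        bad_line = lines[err.lineno - 1] if err.lineno <= len(lines) else ""
        return f"{err.msg} at line {err.lineno} column {err.colno}: {bad_line.strip()}"
    return None


# Hypothetical config fragment with a comma missing after the first entry.
broken = '{\n  "initiator": {"role": "guest"}\n  "job_parameters": {}\n}'
print(describe_json_error(broken))
```

Reading the dsl and conf files from disk and passing their contents to such a helper shows the offending line; a missing comma between two entries is reported at the start of the entry that follows it.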

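Link 10-2 in Chapter 10 does not exist

Loss visualization in Chapter 3

In principle, shouldn't loss visualization use the loss computed on the training set? Why does it use the loss computed on the test set here?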

federatedai / practicing-federated-learning