federatedai / practicing-federated-learning
License: Apache License 2.0
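Hello, regarding model compression in Chapter 15, the text says: "On the one hand, reducing the amount of transmitted data effectively lowers network bandwidth consumption; on the other hand, it can prevent model parameters from being stolen" ... and that it "effectively improves the security of the system."
Is there a way to verify these claims?
The chapter seems to report only the accuracy obtained after compressing part of the data, and does not demonstrate that the method actually protects privacy.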
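Chapters 3, 4, 7, and 12 of the book repeatedly cite external reference links such as "Link 3-5" and "Link 7-13", but the addresses these links refer to cannot be found anywhere in the printed book. There is currently no solution for this.
Hello, has the book 《联邦学习实战》 (Practicing Federated Learning) been published yet? I could not find it online. Thanks!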
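Hello, what does lambda mean, and why is it set to 0.1? There are 10 clients in total, but only k of them take part in aggregation; with k=5, shouldn't lambda be 0.2?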
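The arithmetic behind this question can be checked with a small sketch. It assumes that the server update has the form `global + lam * sum(client_i - global)` over the k selected clients; the question does not quote the formula, so this form, the function name `aggregate`, and the sample weights are all hypothetical.

```python
def aggregate(global_w, client_ws, lam):
    """Return global_w + lam * sum_i (client_w_i - global_w), element-wise.

    Assumed update rule for illustration only; not taken from the book.
    """
    updated = []
    for j, g in enumerate(global_w):
        total_diff = sum(w[j] - g for w in client_ws)
        updated.append(g + lam * total_diff)
    return updated


global_w = [0.0, 0.0]
# k = 5 selected clients, each proposing the same weights for simplicity.
client_ws = [[1.0, 2.0] for _ in range(5)]

plain_average = aggregate(global_w, client_ws, lam=1 / 5)  # lam = 1/k
damped_step = aggregate(global_w, client_ws, lam=0.1)      # lam = 1/10
```

Under this assumed rule, `lam = 1/k` reproduces a plain average of the selected clients, while `lam = 0.1` with k=5 moves the global model only half of the way towards that average.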
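Hello,
When running the program with PyTorch, I got RuntimeError: CUDA out of memory. Tried to allocate 392.00 MiB (GPU 0; 8.00 GiB total capacity; 5.69 GiB already allocated; 84.25 MiB free; 5.72 GiB reserved in total by PyTorch). From what I found online, this means GPU memory is exhausted because the epoch count and batch_size are too large for the machine. So I set "global_epochs" : 5, "local_epochs" : 2, "batch_size" : 2 in conf.json, but the error still occurs.
Is there a good way to solve this? Thank you very much!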
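Where is the sample dataset? Was the book published without one? I spent a whole day debugging and got nothing but a pile of errors. Does it only run on your machine and nobody else's?
The operating system is Ubuntu 22.04.
The downloaded version is docker_standalone-fate-1.4.0.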
{
    "data": {
        "board_url": "http://172.18.0.2:8080/index.html#/dashboard?job_id=202305071656462212955&role=guest&party_id=10000",
        "job_dsl_path": "/fate/jobs/202305071656462212955/job_dsl.json",
        "job_runtime_conf_path": "/fate/jobs/202305071656462212955/job_runtime_conf.json",
        "logs_directory": "/fate/logs/202305071656462212955",
        "model_info": {
            "model_id": "arbiter-10000#guest-10000#host-10000#model",
            "model_version": "202305071656462212955"
        }
    },
    "jobId": "202305071656462212955",
    "retcode": 0,
    "retmsg": "success"
}
Pasting http://172.18.0.2:8080/index.html#/dashboard?job_id=202305071656462212955&role=guest&party_id=10000 into the browser gives the error:
172.18.0.2 is currently unable to handle this request.
HTTP ERROR 502
Traceback (most recent call last):
  File "./fate/python/fate_flow/operation/task_executor.py", line 168, in run_task
    run_object.run(component_parameters_on_party, task_run_args)
  File "./fate/python/federatedml/model_base.py", line 98, in run
    this_data_output = func(*real_param)
  File "./fate/python/federatedml/util/data_io.py", line 886, in fit
    data_inst = self.reader.read_data(data_inst, "fit")
  File "./fate/python/federatedml/util/data_io.py", line 139, in read_data
    input_data_labels = input_data.mapValues(lambda value: value.split(self.delimitor, -1)[self.label_idx])
  File "./fate/python/fate_arch/common/profile.py", line 282, in _fn
    rtn = func(*args, **kwargs)
  File "./fate/python/fate_arch/computing/standalone/_table.py", line 93, in mapValues
    return Table(self._table.mapValues(func))
  File "./fate/python/fate_arch/_standalone.py", line 141, in mapValues
    return self._unary(func, _do_map_values)
  File "./fate/python/fate_arch/_standalone.py", line 233, in _unary
    func, do_func, self._partitions, self._name, self._namespace
  File "./fate/python/fate_arch/_standalone.py", line 407, in _submit_unary
    results = [r.result() for r in futures]
  File "./fate/python/fate_arch/_standalone.py", line 407, in <listcomp>
    results = [r.result() for r in futures]
  File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
IndexError: list index out of range
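The failing expression in this traceback is `value.split(self.delimitor, -1)[self.label_idx]`. The sketch below reproduces that kind of failure in plain Python. It assumes that some row has fewer delimited fields than the label index requires, for example because the file's delimiter differs from the configured one; the log does not confirm this. The helper names and sample rows are hypothetical and are not part of FATE.

```python
def extract_label(value, delimiter, label_idx):
    """Mimic the expression from the traceback on a single row string."""
    return value.split(delimiter, -1)[label_idx]


def find_bad_rows(rows, delimiter, label_idx):
    """Return (row_number, row) pairs with too few fields for label_idx.

    Assumes label_idx is a non-negative index.
    """
    bad = []
    for number, row in enumerate(rows, start=1):
        if len(row.split(delimiter)) <= label_idx:
            bad.append((number, row))
    return bad


# Hypothetical rows: the second uses ';' instead of ',', the third is empty.
rows = ["1,0.5,0.7", "0;0.1;0.2", ""]
bad_rows = find_bad_rows(rows, delimiter=",", label_idx=1)

try:
    extract_label(rows[1], ",", 1)
    raised = False
except IndexError:
    raised = True
```

Running a check like `find_bad_rows` over the uploaded file before submitting the job would show whether the delimiter and label column in the conf match the data.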
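What environment is this code meant to run in? cupy-cuda11x does not seem to be supported.
In model/utils/nms/non_maximum_suppression.py, @cp.util.memoize() and the function below it have no supported version under CUDA 11.6.
Hello, my CUDA version is 10.0. While installing the required environment on Windows, the cupy installation kept failing. I then manually chose cupy-cuda100, which installed successfully, but running the code fails with an error saying that cupy has no util attribute. I would like to know exactly which versions this code's environment uses.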
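The downloaded code will not run with bash.
It keeps showing nohup: failed to run command 'python3': Permission denied
In the Chapter 15 code for federated learning with differential privacy, adding noise after model aggregation makes every model prediction NaN, so the loss is NaN and acc is 10. Why does this happen?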
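《联邦学习实战》 (Practicing Federated Learning)
P38, Section 3.2.2, Converting between Tensors and Python data structures
Code:
a3 = torch.from_tensor(arr)
should be changed to
a3 = torch.from_numpy(arr)
Edition: 1st edition, May 2021
Printing: 1st printing, May 2021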
Files already downloaded and verified
Traceback (most recent call last):
File "main.py", line 25, in <module>
server = Server(conf, eval_datasets)
File "/Users/huqiming/Library/CloudStorage/OneDrive-stu.cdut.edu.cn/bs/Practicing-FL/Practicing-Federated-Learning-main/chapter03_Python_image_classification/server.py", line 11, in __init__
self.global_model = models.get_model(self.conf["model_name"])
File "/Users/huqiming/Library/CloudStorage/OneDrive-stu.cdut.edu.cn/bs/Practicing-FL/Practicing-Federated-Learning-main/chapter03_Python_image_classification/models.py", line 7, in get_model
model = models.resnet18(pretrained=pretrained)
File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torchvision/models/resnet.py", line 277, in resnet18
**kwargs)
File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torchvision/models/resnet.py", line 263, in _resnet
progress=progress)
File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/hub.py", line 590, in load_state_dict_from_url
return torch.load(cached_file, map_location=map_location)
File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/serialization.py", line 600, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/Users/huqiming/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/serialization.py", line 242, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
OSError: [Errno 22] Invalid argument
Environment:
Python 3.7;
torchaudio 0.10.2;
torchvision 0.11.3
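When running federated learning on the CIFAR dataset, the eval_dataset used is actually the test set. Does that mean the test set is used repeatedly when evaluating the global model?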
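One common way to avoid evaluating on the test set every round is to hold out part of the training data as a validation split and keep the test set for a single final evaluation. The sketch below only shows the index split; `split_train_val` is a hypothetical helper and not part of the repository, and 50000 is the size of the CIFAR-10 training set.

```python
import random


def split_train_val(num_samples, val_fraction=0.1, seed=0):
    """Shuffle sample indices and return (train_indices, val_indices)."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split
    n_val = int(num_samples * val_fraction)
    return indices[n_val:], indices[:n_val]


# The 50000 CIFAR-10 training images split into 45000 train / 5000 validation.
train_idx, val_idx = split_train_val(50000, val_fraction=0.1)
```

The validation indices could then drive per-round evaluation of the global model, with the test set touched only once at the end.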
After three rounds of training, acc stays at 10 and the loss is NaN. Why is that? Can anyone help explain?
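When the server model is initialized, the code uses self.global_model = models.get_model(self.conf["model_name"]).
But when a participating client is added, why does it also use self.local_model = models.get_model(self.conf["model_name"])?
The model parameter of the initializer is never used; I think it should be changed to self.local_model = model. The local_train function would then not need a model parameter. Wouldn't it be better to set each selected participant's self.model to the new global model at the start of every communication round?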
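The proposal in this issue can be sketched roughly as follows. This is a hypothetical `Client` class, not the repository's implementation: a plain dict stands in for the model so the example runs without PyTorch, and `sync_with_global` is an invented method name. The client keeps a deep copy so that local changes never alter the server's model object.

```python
import copy


class Client:
    """Hypothetical client that uses the model passed to its constructor."""

    def __init__(self, conf, model, client_id):
        self.conf = conf
        self.client_id = client_id
        # Private copy: local training must not mutate the global model.
        self.local_model = copy.deepcopy(model)

    def sync_with_global(self, global_model):
        """Replace the local model with a fresh copy of the global model."""
        self.local_model = copy.deepcopy(global_model)


global_model = {"w": [0.0, 0.0]}            # stand-in for a real model
client = Client(conf={}, model=global_model, client_id=1)

client.local_model["w"][0] = 5.0            # a local change
global_unchanged = global_model["w"][0]     # the global copy is unaffected

global_model["w"] = [1.0, 1.0]              # server publishes new weights
client.sync_with_global(global_model)       # called when the client is selected
```

With this layout a training method would need no model argument, since the selected client is synchronized before each round.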
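The README says:
...
Finally, in the current directory ($fate_dir/examples/federatedml-1.x-examples), run the following command on the command line to upload the data and convert its format automatically: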
python $fate_dir/fate_flow/fate_flow_client.py -f upload -c upload_data.json
...
Place the dsl and conf files in any directory, and run the following command in that directory to start training:
python $fate_dir/fate_flow/fate_flow_client.py -f submit_job -d test_homolr_train_job_dsl.json -c test_homolr_train_job_conf.json
I installed a newer version of FATE, where the command is
flow job submit -d test_homolr_train_job_dsl.json -c test_homolr_train_job_conf.json
The error is json.decoder.JSONDecodeError: Expecting ',' delimiter: line 21 column 4 (char 459)
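The line and column in this message can be used to locate the problem in the conf or dsl file. The sketch below uses only the standard `json` module; `locate_json_error` is a hypothetical helper, and the broken sample assumes a missing comma between two entries, which is one common cause of this message but is not confirmed for the file in this issue.

```python
import json


def locate_json_error(text):
    """Return None for valid JSON, else (lineno, colno, offending_line)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        lines = text.splitlines()
        # err.lineno is 1-based; guard against errors reported past the end.
        offending = lines[err.lineno - 1] if err.lineno <= len(lines) else ""
        return err.lineno, err.colno, offending
    return None


# Hypothetical sample: the comma after the first entry is missing.
broken = '{\n  "a": 1\n  "b": 2\n}'
fixed = '{\n  "a": 1,\n  "b": 2\n}'

broken_report = locate_json_error(broken)
fixed_report = locate_json_error(fixed)
```

The reported position is where the parser gave up, so for a missing comma the actual mistake is usually at the end of the previous entry.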
In theory, shouldn't the loss visualization use the loss computed on the training set? Why is the loss computed on the test set used here?