Comments (19)

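TomorrowIsAnOtherDay commented on June 19, 2024

Hello, and thanks for your interest in PARL.
1. On parallelism: PARL already provides an interface for parallel communication, and parallel algorithms built on this interface work across systems and platforms. We ourselves run GPU training (on CentOS) together with CPU simulation (on Ubuntu).
2. PARL does not support Windows yet, mainly because Windows manages commands quite differently from Linux. This will be addressed in a future release (#178).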

from parl.

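ShamCondor commented on June 19, 2024

Thanks, that was a lightning-fast reply. One thing I left out of the architecture I described: PARL is not running on the CPU cluster. The CPU cluster only runs the simulation environments, while the reinforcement learning algorithm launches those environments and interacts with them through gRPC. Would this affect PARL's parallel execution?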

TomorrowIsAnOtherDay commented on June 19, 2024

Oh, that won't affect it. Feel free to raise any questions you run into :)

ShamCondor commented on June 19, 2024

Hello, can PARL run neural network models in parallel across multiple GPU cards?

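TomorrowIsAnOtherDay commented on June 19, 2024

Yes, it can!
We noticed that offline training often needs to train on hundreds of millions of samples, so we provide an example that shows how to use it. In practice, the conversion takes just one line of code.
See this PR (#193) for details; it will be merged into the main branch shortly.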

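ShamCondor commented on June 19, 2024

Hello, I see that parl.compile is used for the parallel computation. Could you briefly explain how this call works for parallel computation across multiple machines and multiple GPUs? Thanks.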

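TomorrowIsAnOtherDay commented on June 19, 2024

Paddle's implementation is data-parallel: a large batch of data is split across the cards, each card computes its own gradients, the gradients are merged with an all-reduce operation and pushed back to every card, and each card then updates its model parameters.
Please keep an eye on PARL; more examples are on the way, including work on multi-agent settings.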
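
The data-parallel flow described above can be sketched in plain Python. This is a toy, single-process illustration and not Paddle's or PARL's implementation: the function names (`local_gradient`, `all_reduce_mean`, `data_parallel_step`) and the linear model are assumptions made up for the example, and the "cards" are just list entries.

```python
def local_gradient(w, b, shard):
    """Gradient of the mean squared error of y ~ w*x + b over one shard."""
    n = len(shard)
    gw = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    gb = sum(2 * (w * x + b - y) for x, y in shard) / n
    return gw, gb


def all_reduce_mean(grads):
    """Stand-in for all-reduce: every worker receives the same averaged gradient."""
    k = len(grads)
    mean = tuple(sum(g[i] for g in grads) / k for i in range(len(grads[0])))
    return [mean] * k


def data_parallel_step(params, batch, num_workers, lr):
    """One SGD step; assumes len(batch) is divisible by num_workers."""
    shard_size = len(batch) // num_workers
    # 1. Split the large batch into one shard per (hypothetical) card.
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(num_workers)]
    # 2. Every card holds an identical copy of the parameters.
    replicas = [params] * num_workers
    # 3. Each card computes gradients on its own shard only.
    grads = [local_gradient(w, b, s) for (w, b), s in zip(replicas, shards)]
    # 4. All-reduce merges the gradients and hands the result to every card.
    reduced = all_reduce_mean(grads)
    # 5. Each card applies the same update, so the replicas stay in sync.
    return [(w - lr * gw, b - lr * gb) for (w, b), (gw, gb) in zip(replicas, reduced)]


batch = [(float(x), 3.0 * x + 1.0) for x in range(8)]
replicas = data_parallel_step((0.0, 0.0), batch, num_workers=4, lr=0.01)
```

With equally sized shards, the averaged gradient matches the gradient of the full batch, so the step is equivalent to training on one device with the large batch.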

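ShamCondor commented on June 19, 2024

Oh, so as I understand it this is data parallelism rather than model parallelism, where model parallelism means splitting a large model into several pieces that run on GPUs on different machines. Is my understanding correct?
Yes, I follow PARL closely. I have also looked at other frameworks such as rllib and rlpyt, and multi-agent support is something I really need.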

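TomorrowIsAnOtherDay commented on June 19, 2024

> Oh, so as I understand it this is data parallelism rather than model parallelism, where model parallelism means splitting a large model into several pieces that run on GPUs on different machines. Is my understanding correct?

Yes, it is data parallelism.

Thanks for following PARL!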
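
For contrast with the data parallelism confirmed above, model parallelism places different parts of one model on different devices and passes activations between them. The following is only a toy sketch in plain Python: the `Stage` class, the device labels, and the weights are hypothetical, and nothing here reflects an API of PARL or Paddle.

```python
class Stage:
    """One slice of the model living on a single (hypothetical) device."""

    def __init__(self, device, weight, bias):
        self.device = device  # label only; no real GPU is used
        self.weight = weight
        self.bias = bias

    def forward(self, xs):
        # A scalar "layer" followed by ReLU, applied to every input.
        return [max(0.0, self.weight * x + self.bias) for x in xs]


def model_parallel_forward(stages, batch):
    """Run the whole batch through each model slice in turn."""
    activations = batch
    for stage in stages:
        # In a real system this hand-off is a device-to-device transfer.
        activations = stage.forward(activations)
    return activations


# The model is split into two pieces; each piece sees the full batch.
stages = [Stage("gpu:0", 2.0, 1.0), Stage("gpu:1", 0.5, -1.0)]
out = model_parallel_forward(stages, [1.0, 2.0, 3.0])
```

Here the model is divided and the data is not, which is the opposite of the data-parallel case, where every card holds the full model and only the batch is divided.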

ShamCondor commented on June 19, 2024

Is this data-parallel mechanism built on Paddle under the hood? Can PARL's data parallelism be implemented with PyTorch?

TomorrowIsAnOtherDay commented on June 19, 2024

This kind of performance-critical parallel computation relies on Paddle's underlying C++ and CUDA code, so this part cannot be used with torch.

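ShamCondor commented on June 19, 2024

I looked at the parallel-computation files that are about to be merged. If this parallelism is to run across multiple machines, where do I set the addresses of the machines and cards?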

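TomorrowIsAnOtherDay commented on June 19, 2024

For now we only provide an RL example for a single machine with multiple GPUs, and we have no plans yet to provide a multi-machine, multi-GPU example.
Paddle itself already supports this; see the Paddle documentation for details.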

ShamCondor commented on June 19, 2024

Are there any examples that combine PaddlePaddle EDL with PARL?

TomorrowIsAnOtherDay commented on June 19, 2024

Not at the moment. If you run into problems using them, raise them here and we will work on them :)

ShamCondor commented on June 19, 2024

I looked at the code. PARL's parallel computation currently uses only CPU resources, right? GPUs are not involved?

zenghsh3 commented on June 19, 2024

Yes, it currently targets CPUs.

ShamCondor commented on June 19, 2024

Installing parl under Python 3.6 and Python 3.7 keeps failing with an error related to psutil. How can I fix this?
Building wheels for collected packages: psutil
Building wheel for psutil (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /home/sunfangyi/virtualenv/PARL3.7/bin/python3.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-2vpoancx/psutil/setup.py'"'"'; __file__='"'"'/tmp/pip-install-2vpoancx/psutil/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-0xgcqsrp
cwd: /tmp/pip-install-2vpoancx/psutil/
Complete output (42 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.7
creating build/lib.linux-x86_64-3.7/psutil
copying psutil/_pssunos.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_psposix.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_psaix.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_psbsd.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_common.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/__init__.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_psosx.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_compat.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_pswindows.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_pslinux.py -> build/lib.linux-x86_64-3.7/psutil
creating build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_process.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_aix.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_linux.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/runner.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_osx.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_contracts.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_bsd.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_system.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_posix.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/__init__.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/__main__.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_unicode.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_connections.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_misc.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_memory_leaks.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_windows.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_sunos.py -> build/lib.linux-x86_64-3.7/psutil/tests
running build_ext
building 'psutil._psutil_linux' extension
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/psutil
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_VERSION=567 -DPSUTIL_LINUX=1 -I/usr/include/python3.7m -I/home/sunfangyi/virtualenv/PARL3.7/include/python3.7m -c psutil/_psutil_common.c -o build/temp.linux-x86_64-3.7/psutil/_psutil_common.o
psutil/_psutil_common.c:9:20: fatal error: Python.h: No such file or directory
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

ERROR: Failed building wheel for psutil
Running setup.py clean for psutil
Failed to build psutil
Installing collected packages: psutil, pyzmq, click, protobuf, numpy, tensorboardX, absl-py, werkzeug, grpcio, markdown, tb-nightly, termcolor, cloudpickle, Jinja2, itsdangerous, flask, scipy, pyarrow, parl
Running setup.py install for psutil ... error
ERROR: Command errored out with exit status 1:
command: /home/sunfangyi/virtualenv/PARL3.7/bin/python3.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-2vpoancx/psutil/setup.py'"'"'; __file__='"'"'/tmp/pip-install-2vpoancx/psutil/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-_4trv25m/install-record.txt --single-version-externally-managed --compile --install-headers /home/sunfangyi/virtualenv/PARL3.7/include/site/python3.7/psutil
cwd: /tmp/pip-install-2vpoancx/psutil/
Complete output (42 lines):
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.7
creating build/lib.linux-x86_64-3.7/psutil
copying psutil/_pssunos.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_psposix.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_psaix.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_psbsd.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_common.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/__init__.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_psosx.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_compat.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_pswindows.py -> build/lib.linux-x86_64-3.7/psutil
copying psutil/_pslinux.py -> build/lib.linux-x86_64-3.7/psutil
creating build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_process.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_aix.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_linux.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/runner.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_osx.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_contracts.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_bsd.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_system.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_posix.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/__init__.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/__main__.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_unicode.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_connections.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_misc.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_memory_leaks.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_windows.py -> build/lib.linux-x86_64-3.7/psutil/tests
copying psutil/tests/test_sunos.py -> build/lib.linux-x86_64-3.7/psutil/tests
running build_ext
building 'psutil._psutil_linux' extension
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/psutil
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_VERSION=567 -DPSUTIL_LINUX=1 -I/usr/include/python3.7m -I/home/sunfangyi/virtualenv/PARL3.7/include/python3.7m -c psutil/_psutil_common.c -o build/temp.linux-x86_64-3.7/psutil/_psutil_common.o
psutil/_psutil_common.c:9:20: fatal error: Python.h: No such file or directory
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
----------------------------------------
ERROR: Command errored out with exit status 1: /home/sunfangyi/virtualenv/PARL3.7/bin/python3.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-2vpoancx/psutil/setup.py'"'"'; __file__='"'"'/tmp/pip-install-2vpoancx/psutil/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-_4trv25m/install-record.txt --single-version-externally-managed --compile --install-headers /home/sunfangyi/virtualenv/PARL3.7/include/site/python3.7/psutil Check the logs for full command output.
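
The `fatal error: Python.h: No such file or directory` line in the log means the C compiler could not find the CPython development headers, which psutil needs in order to build its extension module. A minimal diagnostic sketch follows, assuming a standard CPython install; the helper name `find_python_header` is made up for illustration.

```python
import os
import sysconfig


def find_python_header():
    """Return the expected path of Python.h and whether it exists."""
    # Directory where this interpreter expects its C headers to live.
    include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return header, os.path.isfile(header)


header, present = find_python_header()
if present:
    print("Found", header)
else:
    print("Missing", header, "(the Python development headers are not installed)")
```

If the header is missing on a Debian or Ubuntu system (which the `x86_64-linux-gnu-gcc` compiler name suggests, though the thread does not confirm it), installing the development package that matches the interpreter, for example `python3.7-dev`, is the usual remedy before retrying `pip install parl`.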

zenghsh3 commented on June 19, 2024

Which operating system are you installing on? (Linux/macOS/Windows?)
