
torchbeast's Introduction

TorchBeast

A PyTorch implementation of IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures by Espeholt, Soyer, Munos et al.

TorchBeast comes in two variants: MonoBeast and PolyBeast. While PolyBeast is more powerful (e.g. allowing training across machines), it's somewhat harder to install. MonoBeast requires only Python and PyTorch (we suggest using PyTorch version 1.2 or newer).

For further details, see our paper.

BibTeX

@article{torchbeast2019,
  title={{TorchBeast: A PyTorch Platform for Distributed RL}},
  author={Heinrich K\"{u}ttler and Nantas Nardelli and Thibaut Lavril and Marco Selvatici and Viswanath Sivakumar and Tim Rockt\"{a}schel and Edward Grefenstette},
  year={2019},
  journal={arXiv preprint arXiv:1910.03552},
  url={https://github.com/facebookresearch/torchbeast},
}

Getting started: MonoBeast

MonoBeast is a pure Python + PyTorch implementation of IMPALA.

To set it up, create a new conda environment and install MonoBeast's requirements:

$ conda create -n torchbeast
$ conda activate torchbeast
$ conda install pytorch -c pytorch
$ pip install -r requirements.txt

Then run MonoBeast, e.g. on the Pong Atari environment:

$ python -m torchbeast.monobeast --env PongNoFrameskip-v4

By default, MonoBeast uses only a few actors (each with its own instance of the environment). Let's change the default settings (try this on a beefy machine!):

$ python -m torchbeast.monobeast \
     --env PongNoFrameskip-v4 \
     --num_actors 45 \
     --total_steps 30000000 \
     --learning_rate 0.0004 \
     --epsilon 0.01 \
     --entropy_cost 0.01 \
     --batch_size 4 \
     --unroll_length 80 \
     --num_buffers 60 \
     --num_threads 4 \
     --xpid example

Results are logged to ~/logs/torchbeast/latest and a checkpoint file is written to ~/logs/torchbeast/latest/model.tar.
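
To inspect such a checkpoint manually, it can be loaded with PyTorch directly. A minimal sketch, assuming the checkpoint is a plain dict (the exact keys are defined in monobeast.py, so verify them there):

import os
import torch

# Illustrative inspection snippet; adjust the path to match your run.
checkpoint_path = os.path.expanduser("~/logs/torchbeast/latest/model.tar")
checkpoint = torch.load(checkpoint_path, map_location="cpu")
print(checkpoint.keys())  # shows what was saved, e.g. model/optimizer state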

Once training has finished, we can test performance on a few episodes:

$ python -m torchbeast.monobeast \
     --env PongNoFrameskip-v4 \
     --mode test \
     --xpid example

MonoBeast is a simple, single-machine version of IMPALA. Each actor runs in a separate process with its dedicated instance of the environment and runs the PyTorch model on the CPU to create actions. The resulting rollout trajectories (environment-agent interactions) are sent to the learner. In the main process, the learner consumes these rollouts and uses them to update the model's weights.
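
As a rough illustration of that layout (a toy sketch, not the actual MonoBeast code), actors can be spawned with torch.multiprocessing, share the model weights, and feed rollouts to the main process through a queue:

import torch
import torch.multiprocessing as mp


def actor(rollout_queue, model, unroll_length=5):
    # Toy actor: steps a dummy environment with the shared CPU model and
    # ships fixed-length rollouts to the learner.
    obs = torch.zeros(4)
    for _ in range(3):  # a few unrolls, then exit
        frames, actions = [], []
        for _ in range(unroll_length):
            with torch.no_grad():
                logits = model(obs)
            action = torch.distributions.Categorical(logits=logits).sample()
            frames.append(obs.clone())
            actions.append(action)
            obs = torch.randn(4)  # dummy environment transition
        rollout_queue.put((torch.stack(frames), torch.stack(actions)))


if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
    model = torch.nn.Linear(4, 2)
    model.share_memory()  # actors see weight updates made by the learner
    queue = mp.Queue()
    actors = [mp.Process(target=actor, args=(queue, model)) for _ in range(2)]
    for p in actors:
        p.start()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(6):  # 2 actors x 3 rollouts each
        frames, actions = queue.get()
        log_probs = model(frames).log_softmax(-1)  # [T, num_actions]
        chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
        loss = -chosen.mean()  # placeholder loss; the real learner uses V-trace
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    for p in actors:
        p.join()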

Faster version: PolyBeast

PolyBeast provides a faster and more scalable implementation of IMPALA.

The easiest way to build and install all of PolyBeast's dependencies and run it is to use Docker:

$ docker build -t torchbeast .
$ docker run --name torchbeast torchbeast

To run PolyBeast directly on Linux or MacOS, follow this guide.

Installing PolyBeast

Linux

Create a new Conda environment, and install PolyBeast's requirements:

$ conda create -n torchbeast python=3.7
$ conda activate torchbeast
$ pip install -r requirements.txt

Install PyTorch either from source or as per its website (select Conda).

PolyBeast also requires gRPC and other third-party software, which can be installed by running:

$ git submodule update --init --recursive

Finally, let's compile the C++ parts of PolyBeast:

$ pip install nest/
$ python setup.py install

MacOS

Create a new Conda environment, and install PolyBeast's requirements:

$ conda create -n torchbeast
$ conda activate torchbeast
$ pip install -r requirements.txt

PyTorch can be installed as per its website (select Conda).

PolyBeast also requires gRPC and other third-party software, which can be installed by running:

$ git submodule update --init --recursive

Finally, let's compile the C++ parts of PolyBeast:

$ pip install nest/
$ python setup.py install

Running PolyBeast

To start both the environment servers and the learner process, run

$ python -m torchbeast.polybeast

The environment servers and the learner process can also be started separately:

$ python -m torchbeast.polybeast_env --num_servers 10

Start another terminal and run:

$ python -m torchbeast.polybeast_learner

(Very rough) overview of the system

|-----------------|     |-----------------|                  |-----------------|
|     ACTOR 1     |     |     ACTOR 2     |                  |     ACTOR n     |
|-------|         |     |-------|         |                  |-------|         |
|       |  .......|     |       |  .......|     .   .   .    |       |  .......|
|  Env  |<-.Model.|     |  Env  |<-.Model.|                  |  Env  |<-.Model.|
|       |->.......|     |       |->.......|                  |       |->.......|
|-----------------|     |-----------------|                  |-----------------|
   ^     I                 ^     I                              ^     I
   |     I                 |     I                              |     I Actors
   |     I rollout         |     I rollout               weights|     I send
   |     I                 |     I                     /--------/     I rollouts
   |     I          weights|     I                     |              I (frames,
   |     I                 |     I                     |              I  actions
   |     I                 |     v                     |              I  etc)
   |     L=======>|--------------------------------------|<===========J
   |              |.........      LEARNER                |
   \--------------|..Model.. Consumes rollouts, updates  |
     Learner      |.........       model weights         |
      sends       |--------------------------------------|
     weights

The system has two main components, actors and a learner.

Actors generate rollouts (tensors from a number of steps of environment-agent interactions, including environment frames, agent actions and policy logits, and other data).

The learner consumes that experience, computes a loss and updates the weights. The new weights are then propagated to the actors.
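
For intuition, a drastically simplified version of such a learner loss might look as follows. This is a sketch only: it uses a truncated importance weight on a plain policy gradient, whereas TorchBeast computes full V-trace corrected targets (see the vtrace module in the repository):

import torch

def simplified_impala_loss(learner_logits, actor_logits, actions, advantages,
                           entropy_cost=0.01):
    # learner_logits, actor_logits: [T, B, num_actions]; actions: [T, B] int64;
    # advantages: [T, B]. Importance-weighted policy gradient + entropy bonus.
    learner_logp = learner_logits.log_softmax(-1)
    actor_logp = actor_logits.log_softmax(-1)
    chosen_learner = learner_logp.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
    chosen_actor = actor_logp.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
    # rho measures how far the learner policy drifted from the behaviour
    # policy; clamping it is the "truncated importance weight" trick.
    rho = (chosen_learner - chosen_actor).detach().exp().clamp(max=1.0)
    pg_loss = -(rho * advantages.detach() * chosen_learner).mean()
    entropy = -(learner_logp.exp() * learner_logp).sum(-1).mean()
    return pg_loss - entropy_cost * entropy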

Learning curves on Atari

We ran TorchBeast on Atari, using the same hyperparameters and neural network as in the IMPALA paper. For comparison, we also ran the open source TensorFlow implementation of IMPALA, using the same environment preprocessing. The results are equivalent; see our paper for details.

[Figure: Atari learning curves for TorchBeast and the reference TensorFlow IMPALA implementation]

Repository contents

libtorchbeast: C++ library that allows efficient learner-actor communication via queueing and batching mechanisms. Some functions are exported to Python using pybind11. For PolyBeast only.

nest: C++ library that allows to manipulate complex nested structures. Some functions are exported to Python using pybind11.

tests: Collection of python tests.

third_party: Collection of third-party dependencies as Git submodules. Includes gRPC.

torchbeast: Contains monobeast.py, polybeast.py, polybeast_learner.py and polybeast_env.py.

Hyperparameters

Both MonoBeast and PolyBeast have flags and hyperparameters. To describe a few of them:

  • num_actors: The number of actors (and environment instances). The optimal number of actors depends on the capabilities of the machine (e.g. you would not have 100 actors on your laptop). In default PolyBeast this should match the number of servers started.
  • batch_size: Determines the size of the learner inputs.
  • unroll_length: Length of a rollout (i.e., the number of steps an actor performs before sending its experience to the learner). Note that every batch will have dimensions [unroll_length, batch_size, ...], as the snippet below illustrates.
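
For example, with the flags above (unroll_length 80, batch_size 4), the tensors the learner consumes would have shapes along these lines (illustrative; the channel dimension depends on the frame stacking used):

import torch

T, B = 80, 4  # unroll_length, batch_size
frames = torch.zeros(T, B, 4, 84, 84)  # stacked 84x84 Atari frames
actions = torch.zeros(T, B, dtype=torch.int64)
rewards = torch.zeros(T, B)
print(frames.shape)  # torch.Size([80, 4, 4, 84, 84])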

Contributing

We would love to have you contribute to TorchBeast or use it for your research. See the CONTRIBUTING.md file for how to help out.

License

TorchBeast is released under the Apache 2.0 license.


torchbeast's Issues

Should we update the ValueNet and PolicyNet with different losses?

In the original IMPALA paper, the state-value estimate and the action were outputs of the same net, and the net was updated with the sum of three losses, which is unusual for actor-critic algorithms.

The AtariNet in monobeast uses a baseline net and a policy net to estimate the state value and to output the action separately. So should we update the baseline net with the baseline loss and the policy net with the policy-gradient loss separately, in an actor-critic way?

Continuous Action Space

Hello author:
How can I apply vtrace to a continuous action space?
I treat the policy_logits as the parameters of a normal distribution.

import torch.nn as nn
import torch.distributions as tdist

def __init__(self, observation_shape, num_actions, use_lstm=False):
    ...
    self.policy = nn.Linear(core_output_size, 2)
    ...

def forward(self, inputs, core_state=()):
    ...
    policy_logits = self.policy(core_output)
    mu = policy_logits[..., 0]     # mean, taken along the last dimension
    sigma = policy_logits[..., 1]  # scale, taken along the last dimension
    action = tdist.Normal(mu, sigma).sample()
    ...
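
For reference, a self-contained sketch (illustrative only, not code from this repository) of a Gaussian policy head that keeps the scale strictly positive via softplus:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.distributions as tdist


class GaussianPolicyHead(nn.Module):
    # Illustrative continuous-action head: outputs a mean and a positive scale.
    def __init__(self, core_output_size):
        super().__init__()
        self.policy = nn.Linear(core_output_size, 2)

    def forward(self, core_output):
        out = self.policy(core_output)
        mu = out[..., 0]
        sigma = F.softplus(out[..., 1]) + 1e-5  # keep the scale positive
        dist = tdist.Normal(mu, sigma)
        action = dist.sample()
        # V-trace works with log-probabilities of the chosen actions rather
        # than categorical logits, so return those alongside the action.
        return action, dist.log_prob(action)


head = GaussianPolicyHead(core_output_size=64)
action, log_prob = head(torch.randn(3, 64))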

Error on installing PolyBeast

Hi there:

First, I got a permission error on python setup.py install:

error: can't create or remove files in install directory

The following error occurred while trying to add or remove files in the
installation directory:

    [Errno 13] Permission denied: '/home/ubuntu/miniconda3/envs/nle_challenge/lib/python3.8/site-packages/test-easy-install-26145.write-test'

The installation directory you specified (via --install-dir, --prefix, or
the distutils default setting) was:

    /home/ubuntu/miniconda3/envs/nle_challenge/lib/python3.8/site-packages/

After using sudo python3 setup.py install, I got an ABI error:

[  1%] Building CXX object grpc/third_party/abseil-cpp/absl/base/CMakeFiles/absl_exponential_biased.dir/internal/exponential_biased.cc.o
In file included from /usr/include/c++/7/cassert:43:0,
                 from /home/ubuntu/torchbeast/third_party/grpc/third_party/abseil-cpp/absl/numeric/int128.h:25,
                 from /home/ubuntu/torchbeast/third_party/grpc/third_party/abseil-cpp/absl/numeric/int128.cc:15:

/usr/include/x86_64-linux-gnu/c++/7/bits/c++config.h:250:27: error: #if with no expression
 #if _GLIBCXX_USE_CXX11_ABI

I am using an AWS EC2 instance with Ubuntu 18.04 and gcc 7.5.0.

Any idea on this?

Best,
Zhihua

flags.num_servers is fixed to default

flags.num_servers in polybeast_env should be set to flags.num_actors in polybeast.py before passing it to polybeast_env.main().

Adding flags.num_servers = flags.num_actors before line 45 in polybeast.py fixes the hang on actorpool_thread.join() when terminating with Ctrl-C.

I would have opened a pull request, but I have already modified my fork, so I'm not sure how to submit one cleanly.

Cannot reproduce the performance of "SpaceInvaders" game?

Hi,

Thank you for this great codebase.
I ran this code to reproduce the performance of the following 6 Atari tasks: {AirRaid, Carnival, DemonAttack, NameThisGame, Pong, SpaceInvaders}. However, compared to the mean_episode_returns reported in the curves in the README, my experiments show a huge performance drop ONLY on SpaceInvaders (about 10x lower), while the other 5 tasks are reasonably reproduced. This problem is also reported here, in Appendix C. Why is that?

In my experiments, I used the same hyperparameters for all tasks. e.g.
python -m torchbeast.monobeast --env SpaceInvadersNoFrameskip-v4 --num_actors 56 --total_steps 50000000 --learning_rate 0.0006 --epsilon 0.01 --entropy_cost 0.01 --batch_size 32 --unroll_length 20 --num_threads 1 --xpid SpaceInvaders

Issues with Docker

After installing docker (on MacOS), the build failed. I am on the latest commit in master.

I get the following message:

Traceback (most recent call last):
  File "setup.py", line 759, in <module>
    build_deps()
  File "setup.py", line 311, in build_deps
    cmake=cmake)
  File "/src/pytorch/tools/build_pytorch_libs.py", line 59, in build_caffe2
    cmake.build(my_env)
  File "/src/pytorch/tools/setup_helpers/cmake.py", line 334, in build
    self.run(build_args, my_env)
  File "/src/pytorch/tools/setup_helpers/cmake.py", line 142, in run
    check_call(command, cwd=self.build_dir, env=env)
  File "/root/miniconda3/envs/torchbeast/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--target', 'install', '--config', 'Release', '--', '-j', '4']' returned non-zero exit status 1.

Error installing nest on Mac

On my Mac OS 10.15.7 (19H15)

$ conda create -n test python=3.8
$ conda activate test
$ pip install -r requirements.txt
$ brew install grpc
$ pip install nest/                                                                                                                                                                                                                                                           
Processing ./nest
Collecting pybind11>=2.3
  Using cached pybind11-2.6.2-py2.py3-none-any.whl (191 kB)
Building wheels for collected packages: nest
  Building wheel for nest (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /Users/rockt/anaconda3/envs/test/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/l4/dvygh6ns3cx1mmtmpjsrbcdhcjqgsy/T/pip-req-build-m0nc80nn/setup.py'"'"'; __file__='"'"'/private/var/folders/l4/dvygh6ns3cx1mmtmpjsrbcdhcjqgsy/T/pip-req-build-m0nc80nn/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/l4/dvygh6ns3cx1mmtmpjsrbcdhcjqgsy/T/pip-wheel-fdslgohg
       cwd: /private/var/folders/l4/dvygh6ns3cx1mmtmpjsrbcdhcjqgsy/T/pip-req-build-m0nc80nn/
  Complete output (14 lines):
  running bdist_wheel
  running build
  running build_ext
  building 'nest' extension
  creating build
  creating build/temp.macosx-10.9-x86_64-3.8
  creating build/temp.macosx-10.9-x86_64-3.8/nest
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/rockt/anaconda3/envs/test/include -arch x86_64 -I/Users/rockt/anaconda3/envs/test/include -arch x86_64 -I/private/var/folders/l4/dvygh6ns3cx1mmtmpjsrbcdhcjqgsy/T/pip-req-build-m0nc80nn/.eggs/pybind11-2.6.2-py3.8.egg/pybind11/include -I/private/var/folders/l4/dvygh6ns3cx1mmtmpjsrbcdhcjqgsy/T/pip-req-build-m0nc80nn/.eggs/pybind11-2.6.2-py3.8.egg/pybind11/include -I/Users/rockt/anaconda3/envs/test/include/python3.8 -c nest/nest_pybind.cc -o build/temp.macosx-10.9-x86_64-3.8/nest/nest_pybind.o -std=c++17 -stdlib=libc++ -mmacosx-version-min=10.14 -DVERSION_INFO="0.0.3" -std=c++17 -fvisibility=hidden
  creating build/lib.macosx-10.9-x86_64-3.8
  g++ -bundle -undefined dynamic_lookup -L/Users/rockt/anaconda3/envs/test/lib -arch x86_64 -L/Users/rockt/anaconda3/envs/test/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.9-x86_64-3.8/nest/nest_pybind.o -o build/lib.macosx-10.9-x86_64-3.8/nest.cpython-38-darwin.so -stdlib=libc++
  ld: warning: object file (build/temp.macosx-10.9-x86_64-3.8/nest/nest_pybind.o) was built for newer OSX version (10.14) than being linked (10.9)
  ld: unsupported tapi file type '!tapi-tbd' in YAML file '/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/lib/libSystem.tbd' for architecture x86_64
  clang: error: linker command failed with exit code 1 (use -v to see invocation)
  error: command 'g++' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for nest
  Running setup.py clean for nest
Failed to build nest

Using more than 2 GPUs with Polybeast

Hi,
Firstly, thanks for the repository!
As far as our understanding goes, IMPALA can be distributed across more than 2 GPUs. The example in the repo uses up to 2 GPUs. We have access to more GPUs in a single machine and want to utilize all of them to get maximal throughput. What would be the best way to do this (more learners, etc.), and what would we have to add or change in the code?

Multi-agent RL extension?

I'm wondering if it would be straightforward to extend to the multi-agent case without breaking the multi-node, multi-GPU training capability?

Minimum parameter configuration for decent training results

Hello author:
I set the parameters as follows, but my training results are not good.
--env PongNoFrameskip-v4 --mode train --num_actors 4 --total_steps 100000 --use_lstm
test results:

[INFO:17945 monobeast:600 2021-08-29 22:14:16,250] Episode ended after 761 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:17,356] Episode ended after 759 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:18,460] Episode ended after 757 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:19,565] Episode ended after 755 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:20,668] Episode ended after 757 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:21,769] Episode ended after 755 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:22,869] Episode ended after 756 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:23,978] Episode ended after 761 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:25,076] Episode ended after 755 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:26,183] Episode ended after 761 steps. Return: -21.0
[INFO:17945 monobeast:604 2021-08-29 22:14:26,183] Average returns over 10 steps: -21.0

I would like to know the minimum parameter configuration needed to get decent training results.
Thank you!

Distributed Training

Hi, I think your implementation of IMPALA is really well done. The code is concise, clear, and understandable.

I do have a question regarding distributed training. In https://github.com/facebookresearch/torchbeast#running-polybeast, it seems the instructions still assume that the script will be run under a single machine. In the TF implementation, we can configure the multi machine setting using ClusterSpec, as shown here https://github.com/deepmind/scalable_agent/blob/6c0c8a701990fab9053fb338ede9c915c18fa2b1/experiment.py#L479.

I was wondering if there's any way to do the same with torchbeast.

Thanks a lot.

polybeast docker image(gpu version)

Thanks a lot for your team's work; it helps a lot. But we have some problems and hope to get your help.

We found it difficult to deploy polybeast using the Dockerfile or the non-Docker deployment methods on our machines. This is partly due to our network problems :). Therefore, we chose to search for available polybeast images on Docker Hub.
We found the most downloaded one:
https://hub.docker.com/r/torchbeast/ci-polybeast-cpu37/tags
The author appears to be a member of your team, but after downloading it we found its contents incomplete and we don't know how to use it. Judging by the name of this image, it is a CPU version, while we need a GPU one.
The other two images appear to have been uploaded by other users. We are using the second one now (the only GPU version), but we have encountered some problems. The actor threads exit inexplicably after the task has been running for about 3 hours (parameter: timeout_ms=10000). As a result, the learner can't communicate with the server, the inference queue stays empty, and the task can't continue. We haven't found the reason yet, and we are worried that the image itself has some problem. Therefore, we have a request: can your team provide a polybeast image (GPU version) and push it to Docker Hub?
If that is not convenient or cannot be done in the near future, we will try to find another way, but we would still prefer an image provided by you. Thank you very much!

can't transfer the info dictionary {} generated by env.step() to inference queue

Thanks for your PyTorch implementation, but I have an issue and need some help.
In my experiment with a gym-wrapped StarCraft II environment, my env.step() returns five objects:
obs (type: np.array, size: m*n), reward (type: int), done (type: int), 0 (type: int), info (type: dict, len: 2)
but I can only get the first four objects successfully in the inference thread through the inference_queue; the fifth one turns out to be a tensor (size: 1*1).
Is it true that the inference_queue can only transfer "int" objects in its second to fifth positions?

default instructions for monobeast in Pong

Hello! I was wondering how the plots (attached) from this repo were produced. Was it done with polybeast or monobeast?

The reason I ask is that when I followed the default instructions:

python -m torchbeast.monobeast --env PongNoFrameskip-v4

Pong (which should ordinarily be an easy game) doesn't learn anything, and the reward remains at -20 even after millions of timesteps. Do you have any insight into what is going on?

Thank you so much in advance for your kind help!


PolyBeast build fails with Python 3.8

I tried following the NetHack challenge baseline setup instructions using Python 3.8 as suggested, but I could not build PolyBeast.

Steps to reproduce the issue (except for repo and sub-modules cloning):

adyomin@DLW ~/s/torchbeast (master)> conda create --name nle_38 python=3.8                                                                                          (nle_37) 
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/adyomin/miniconda3/envs/nle_38

  added / updated specs:
    - python=3.8


The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-4.5-1_gnu
  ca-certificates    pkgs/main/linux-64::ca-certificates-2021.5.25-h06a4308_1
  certifi            pkgs/main/linux-64::certifi-2021.5.30-py38h06a4308_0
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.35.1-h7274673_9
  libffi             pkgs/main/linux-64::libffi-3.3-he6710b0_2
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-9.3.0-h5101ec6_17
  libgomp            pkgs/main/linux-64::libgomp-9.3.0-h5101ec6_17
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-9.3.0-hd4cf53a_17
  ncurses            pkgs/main/linux-64::ncurses-6.2-he6710b0_1
  openssl            pkgs/main/linux-64::openssl-1.1.1k-h27cfd23_0
  pip                pkgs/main/linux-64::pip-21.1.2-py38h06a4308_0
  python             pkgs/main/linux-64::python-3.8.10-h12debd9_8
  readline           pkgs/main/linux-64::readline-8.1-h27cfd23_0
  setuptools         pkgs/main/linux-64::setuptools-52.0.0-py38h06a4308_0
  sqlite             pkgs/main/linux-64::sqlite-3.35.4-hdfb4753_0
  tk                 pkgs/main/linux-64::tk-8.6.10-hbc83047_0
  wheel              pkgs/main/noarch::wheel-0.36.2-pyhd3eb1b0_0
  xz                 pkgs/main/linux-64::xz-5.2.5-h7b6447c_0
  zlib               pkgs/main/linux-64::zlib-1.2.11-h7b6447c_3


Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate nle_38
#
# To deactivate an active environment, use
#
#     $ conda deactivate

adyomin@DLW ~/s/torchbeast (master)> conda activate nle_38                                                                                                          (nle_37) 
adyomin@DLW ~/s/torchbeast (master)> conda install pytorch cudatoolkit=11.1 -c pytorch -c nvidia                                                                    (nle_38) 
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/adyomin/miniconda3/envs/nle_38

  added / updated specs:
    - cudatoolkit=11.1
    - pytorch


The following NEW packages will be INSTALLED:

  blas               pkgs/main/linux-64::blas-1.0-mkl
  cudatoolkit        nvidia/linux-64::cudatoolkit-11.1.74-h6bb024c_0
  intel-openmp       pkgs/main/linux-64::intel-openmp-2021.2.0-h06a4308_610
  libuv              pkgs/main/linux-64::libuv-1.40.0-h7b6447c_0
  mkl                pkgs/main/linux-64::mkl-2021.2.0-h06a4308_296
  ninja              pkgs/main/linux-64::ninja-1.10.2-hff7bd54_1
  pytorch            pytorch/linux-64::pytorch-1.9.0-py3.8_cuda11.1_cudnn8.0.5_0
  typing_extensions  pkgs/main/noarch::typing_extensions-3.7.4.3-pyha847dfd_0


Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: - By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA): https://docs.nvidia.com/cuda/eula/index.html

done
adyomin@DLW ~/s/torchbeast (master)>  

adyomin@DLW ~/s/torchbeast (master)> pip install -r requirements.txt                                                                                                (nle_38) 
Collecting gym[atari]>=0.14.0
  Using cached gym-0.18.3-py3-none-any.whl
Collecting atari-py==0.2.5
  Using cached atari_py-0.2.5-cp38-cp38-linux_x86_64.whl
Collecting gitpython>=2.1
  Using cached GitPython-3.1.18-py3-none-any.whl (170 kB)
Collecting opencv-python
  Using cached opencv_python-4.5.2.54-cp38-cp38-manylinux2014_x86_64.whl (51.0 MB)
Collecting flake8
  Using cached flake8-3.9.2-py2.py3-none-any.whl (73 kB)
Collecting black
  Using cached black-21.6b0-py3-none-any.whl (140 kB)
Collecting pre-commit
  Using cached pre_commit-2.13.0-py2.py3-none-any.whl (190 kB)
Collecting numpy
  Using cached numpy-1.20.3-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.4 MB)
Collecting six
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting gitdb<5,>=4.0.1
  Using cached gitdb-4.0.7-py3-none-any.whl (63 kB)
Collecting smmap<5,>=3.0.1
  Using cached smmap-4.0.0-py2.py3-none-any.whl (24 kB)
Collecting cloudpickle<1.7.0,>=1.2.0
  Using cached cloudpickle-1.6.0-py3-none-any.whl (23 kB)
Collecting Pillow<=8.2.0
  Using cached Pillow-8.2.0-cp38-cp38-manylinux1_x86_64.whl (3.0 MB)
Collecting pyglet<=1.5.15,>=1.4.0
  Using cached pyglet-1.5.15-py3-none-any.whl (1.1 MB)
Collecting scipy
  Using cached scipy-1.6.3-cp38-cp38-manylinux1_x86_64.whl (27.2 MB)
Collecting pycodestyle<2.8.0,>=2.7.0
  Using cached pycodestyle-2.7.0-py2.py3-none-any.whl (41 kB)
Collecting mccabe<0.7.0,>=0.6.0
  Using cached mccabe-0.6.1-py2.py3-none-any.whl (8.6 kB)
Collecting pyflakes<2.4.0,>=2.3.0
  Using cached pyflakes-2.3.1-py2.py3-none-any.whl (68 kB)
Collecting pathspec<1,>=0.8.1
  Using cached pathspec-0.8.1-py2.py3-none-any.whl (28 kB)
Collecting toml>=0.10.1
  Using cached toml-0.10.2-py2.py3-none-any.whl (16 kB)
Collecting mypy-extensions>=0.4.3
  Using cached mypy_extensions-0.4.3-py2.py3-none-any.whl (4.5 kB)
Collecting click>=7.1.2
  Using cached click-8.0.1-py3-none-any.whl (97 kB)
Collecting appdirs
  Using cached appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)
Collecting regex>=2020.1.8
  Using cached regex-2021.4.4-cp38-cp38-manylinux2014_x86_64.whl (733 kB)
Collecting pyyaml>=5.1
  Using cached PyYAML-5.4.1-cp38-cp38-manylinux1_x86_64.whl (662 kB)
Collecting nodeenv>=0.11.1
  Using cached nodeenv-1.6.0-py2.py3-none-any.whl (21 kB)
Collecting cfgv>=2.0.0
  Using cached cfgv-3.3.0-py2.py3-none-any.whl (7.3 kB)
Collecting virtualenv>=20.0.8
  Using cached virtualenv-20.4.7-py2.py3-none-any.whl (7.2 MB)
Collecting identify>=1.0.0
  Using cached identify-2.2.10-py2.py3-none-any.whl (98 kB)
Collecting distlib<1,>=0.3.1
  Using cached distlib-0.3.2-py2.py3-none-any.whl (338 kB)
Collecting filelock<4,>=3.0.0
  Using cached filelock-3.0.12-py3-none-any.whl (7.6 kB)
Installing collected packages: numpy, smmap, six, scipy, pyglet, Pillow, filelock, distlib, cloudpickle, appdirs, virtualenv, toml, regex, pyyaml, pyflakes, pycodestyle, pathspec, opencv-python, nodeenv, mypy-extensions, mccabe, identify, gym, gitdb, click, cfgv, atari-py, pre-commit, gitpython, flake8, black
Successfully installed Pillow-8.2.0 appdirs-1.4.4 atari-py-0.2.5 black-21.6b0 cfgv-3.3.0 click-8.0.1 cloudpickle-1.6.0 distlib-0.3.2 filelock-3.0.12 flake8-3.9.2 gitdb-4.0.7 gitpython-3.1.18 gym-0.18.3 identify-2.2.10 mccabe-0.6.1 mypy-extensions-0.4.3 nodeenv-1.6.0 numpy-1.20.3 opencv-python-4.5.2.54 pathspec-0.8.1 pre-commit-2.13.0 pycodestyle-2.7.0 pyflakes-2.3.1 pyglet-1.5.15 pyyaml-5.4.1 regex-2021.4.4 scipy-1.6.3 six-1.16.0 smmap-4.0.0 toml-0.10.2 virtualenv-20.4.7

adyomin@DLW ~/s/torchbeast (master)> pip install nest/                                                                                                              (nle_38) 
Processing ./nest
  DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
   pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.
Collecting pybind11>=2.3
  Using cached pybind11-2.6.2-py2.py3-none-any.whl (191 kB)
Building wheels for collected packages: nest
  Building wheel for nest (setup.py) ... done
  Created wheel for nest: filename=nest-0.0.3-cp38-cp38-linux_x86_64.whl size=1359981 sha256=51687f8822780c0a48902e3929ecb001da9095838828c9142f849901d17343d1
  Stored in directory: /tmp/pip-ephem-wheel-cache-isfh3ukc/wheels/3b/d0/ff/525487a4ac7f26e39949ecdbe566c3ac2a9fbc6b7eabd702a9
Successfully built nest
Installing collected packages: pybind11, nest
Successfully installed nest-0.0.3 pybind11-2.6.2


adyomin@DLW ~/s/torchbeast (master)> python setup.py install                                                                                                        (nle_38) 
running install
running bdist_egg
running egg_info
writing libtorchbeast.egg-info/PKG-INFO
writing dependency_links to libtorchbeast.egg-info/dependency_links.txt
writing requirements to libtorchbeast.egg-info/requires.txt
writing top-level names to libtorchbeast.egg-info/top_level.txt
reading manifest file 'libtorchbeast.egg-info/SOURCES.txt'
writing manifest file 'libtorchbeast.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
-- Could NOT find Python3 (missing: Python3_EXECUTABLE Python3_INCLUDE_DIRS Python3_LIBRARIES Python3_NumPy_INCLUDE_DIRS Interpreter Development NumPy Development.Module Development.Embed) (Required is exact version "3.7")
    Reason given by package: 
        Interpreter: Wrong version for the interpreter "/home/adyomin/miniconda3/envs/nle_38/bin/python3"

-- pybind11 v2.6.2 
-- Found PythonInterp: /home/adyomin/miniconda3/envs/nle_38/bin/python (found version "3.8.10") 
-- Found PythonLibs: /home/adyomin/miniconda3/envs/nle_38/lib
-- 
-- 3.14.0.0
-- Caffe2: CUDA detected: 11.3
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 11.3
-- Found cuDNN: v8.2.1  (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- /usr/local/cuda/lib64/libnvrtc.so shorthash is 1ea278b5
-- Added CUDA NVCC flags for: -gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
CMake Warning at /home/adyomin/miniconda3/envs/nle/lib/python3.8/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
  static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
  /home/adyomin/miniconda3/envs/nle/lib/python3.8/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
  CMakeLists.txt:51 (find_package)


-- Configuring done
CMake Error at third_party/pybind11/tools/pybind11Tools.cmake:166 (add_library):
  Target "_C" links to target "Python3::NumPy" but the target was not found.
  Perhaps a find_package() call is missing for an IMPORTED target, or an
  ALIAS target is missing?
Call Stack (most recent call first):
  CMakeLists.txt:96 (pybind11_add_module)


CMake Error at third_party/pybind11/tools/pybind11Tools.cmake:166 (add_library):
  Target "_C" links to target "Python3::NumPy" but the target was not found.
  Perhaps a find_package() call is missing for an IMPORTED target, or an
  ALIAS target is missing?
Call Stack (most recent call first):
  CMakeLists.txt:96 (pybind11_add_module)


CMake Warning at third_party/pybind11/tools/pybind11Tools.cmake:166 (add_library):
  Cannot generate a safe runtime search path for target _C because there is a
  cycle in the constraint graph:

    dir 0 is [/home/adyomin/miniconda3/envs/nle/lib/python3.8/site-packages/torch/lib]
    dir 1 is [/usr/local/cuda/lib64/stubs]
    dir 2 is [/usr/local/cuda/lib64]
      dir 3 must precede it due to runtime library [libnvToolsExt.so.1]
    dir 3 is [/home/adyomin/miniconda3/envs/nle/lib]
      dir 2 must precede it due to runtime library [libcudart.so.11.0]
    dir 4 is [/home/adyomin/miniconda3/envs/nle_38/lib/python3.8/site-packages/torch/lib]

  Some of these libraries may not be found correctly.
Call Stack (most recent call first):
  CMakeLists.txt:96 (pybind11_add_module)


-- Generating done
CMake Generate step failed.  Build files cannot be regenerated correctly.

Output of the python collect_env.py in the same conda environment:

Collecting environment information...
NLE version: N/A
PyTorch version: 1.9.0
Is debug build: No
CUDA used to build PyTorch: 11.1

OS: Ubuntu 21.04
GCC version: (Ubuntu 9.3.0-23ubuntu2) 9.3.0
CMake version: version 3.18.4

Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 2070 SUPER
Nvidia driver version: 465.19.01
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.2.1

Versions of relevant libraries:
[pip3] numpy==1.20.3
[pip3] torch==1.9.0
[conda] blas                      1.0                         mkl  
[conda] mkl                       2021.2.0           h06a4308_296  
[conda] pytorch                   1.9.0           py3.8_cuda11.1_cudnn8.0.5_0    pytorch

Can't install polybeast in WSL2

When trying to install polybeast on WSL2, I encounter an issue with CUDA. In particular, when running python setup.py install, I get an error finding the CUDA libraries:

CUDA error: CMake Error at /home/seb/anaconda3/envs/brint/lib/python3.7/site-packages/cmake/data/share/cmake-3.22/Modules/CMakeDetermineCompilerId.cmake:724 (message): Compiling the CUDA compiler identification source file "CMakeCUDACompilerId.cu" failed.

Compiler: /usr/bin/nvcc

Build flags:

Id flags: -v

The output was:

No such file or directory

I am fairly sure I have installed CUDA and the toolkits at least somewhat correctly, as I was able to install moolib, which also requires the CUDA compiler. I ran pip install moolib within the same virtual environment in which I tried running python setup.py install, and it managed to find the files correctly and install. Do you have any idea what is going wrong? Thanks for the help!

I encountered some problems when I ran the command pip install ".[polybeast]"

I want to run a torchbeast agent on the minihack environment and follow the instructions of minihack.
When I ran the command pip install ".[polybeast]" I got the error:

Obtaining file:///home/yangqiuyu/minihack/torchbeast
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
WARNING: libtorchbeast 0.0.20 does not provide the extra 'polybeast'
Requirement already satisfied: torch>=1.4.0 in /home/yangqiuyu/anaconda3/envs/polybeast/lib/python3.8/site-packages (from libtorchbeast==0.0.20) (1.7.1)
Requirement already satisfied: typing_extensions in /home/yangqiuyu/anaconda3/envs/polybeast/lib/python3.8/site-packages (from torch>=1.4.0->libtorchbeast==0.0.20) (4.1.1)
Requirement already satisfied: numpy in /home/yangqiuyu/anaconda3/envs/polybeast/lib/python3.8/site-packages (from torch>=1.4.0->libtorchbeast==0.0.20) (1.22.3)
Installing collected packages: libtorchbeast
Running setup.py develop for libtorchbeast
error: subprocess-exited-with-error

× python setup.py develop did not run successfully.
│ exit code: 1
╰─> [135 lines of output]
    running develop
    running egg_info
    writing libtorchbeast.egg-info/PKG-INFO
    writing dependency_links to libtorchbeast.egg-info/dependency_links.txt
    writing requirements to libtorchbeast.egg-info/requires.txt
    writing top-level names to libtorchbeast.egg-info/top_level.txt
    reading manifest file 'libtorchbeast.egg-info/SOURCES.txt'
    adding license file 'LICENSE'
    writing manifest file 'libtorchbeast.egg-info/SOURCES.txt'
    running build_ext
    -- Could NOT find Python3 (missing: Python3_NumPy_INCLUDE_DIRS NumPy) (found suitable version "3.8.13", minimum required is "3.7")
    CMake Warning at CMakeLists.txt:19 (find_package):
      By not providing "FindNumPy.cmake" in CMAKE_MODULE_PATH this project has
      asked CMake to find a package configuration file provided by "NumPy", but
      CMake did not find one.

      Could not find a package configuration file provided by "NumPy" with any of
      the following names:

        NumPyConfig.cmake
        numpy-config.cmake

      Add the installation prefix of "NumPy" to CMAKE_PREFIX_PATH or set
      "NumPy_DIR" to a directory containing one of the above files.  If "NumPy"
      provides a separate development package or SDK, be sure it has been
      installed.


    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ModuleNotFoundError: No module named 'torch'
    -- pybind11 v2.6.2
    CMake Warning (dev) at /usr/local/cmake-3.22.3-linux-x86_64/share/cmake-3.22/Modules/CMakeDependentOption.cmake:84 (message):
      Policy CMP0127 is not set: cmake_dependent_option() supports full Condition
      Syntax.  Run "cmake --help-policy CMP0127" for policy details.  Use the
      cmake_policy command to set the policy and suppress this warning.
    Call Stack (most recent call first):
      third_party/pybind11/CMakeLists.txt:98 (cmake_dependent_option)
    This warning is for project developers.  Use -Wno-dev to suppress it.

    CMake Warning at third_party/pybind11/tools/pybind11Tools.cmake:46 (find_package):
      By not providing "FindNumPy.cmake" in CMAKE_MODULE_PATH this project has
      asked CMake to find a package configuration file provided by "NumPy", but
      CMake did not find one.

      Could not find a package configuration file provided by "NumPy" with any of
      the following names:

        NumPyConfig.cmake
        numpy-config.cmake

      Add the installation prefix of "NumPy" to CMAKE_PREFIX_PATH or set
      "NumPy_DIR" to a directory containing one of the above files.  If "NumPy"
      provides a separate development package or SDK, be sure it has been
      installed.
    Call Stack (most recent call first):
      third_party/pybind11/tools/pybind11Common.cmake:201 (include)
      third_party/pybind11/CMakeLists.txt:188 (include)


    --
    -- 3.14.0.0
    CMake Deprecation Warning at third_party/grpc/third_party/zlib/CMakeLists.txt:1 (cmake_minimum_required):
      Compatibility with CMake < 2.8.12 will be removed from a future version of
      CMake.

      Update the VERSION argument <min> value or use a ...<max> suffix to tell
      CMake that the project does not need compatibility with older versions.


    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ModuleNotFoundError: No module named 'torch'
    -- Caffe2: CUDA detected: 10.0
    -- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
    -- Caffe2: CUDA toolkit directory: /usr/local/cuda
    -- Caffe2: Header version is: 10.0
    -- Found cuDNN: v7.6.5  (include: /usr/local/cuda/include, library: /usr/local/cuda/lib64/libcudnn.so)
    -- Added CUDA NVCC flags for: -gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
    CMake Deprecation Warning at third_party/grpc/third_party/googletest/CMakeLists.txt:4 (cmake_minimum_required):
      Compatibility with CMake < 2.8.12 will be removed from a future version of
      CMake.

      Update the VERSION argument <min> value or use a ...<max> suffix to tell
      CMake that the project does not need compatibility with older versions.


    CMake Deprecation Warning at third_party/grpc/third_party/googletest/googlemock/CMakeLists.txt:45 (cmake_minimum_required):
      Compatibility with CMake < 2.8.12 will be removed from a future version of
      CMake.

      Update the VERSION argument <min> value or use a ...<max> suffix to tell
      CMake that the project does not need compatibility with older versions.


    CMake Deprecation Warning at third_party/grpc/third_party/googletest/googletest/CMakeLists.txt:56 (cmake_minimum_required):
      Compatibility with CMake < 2.8.12 will be removed from a future version of
      CMake.

      Update the VERSION argument <min> value or use a ...<max> suffix to tell
      CMake that the project does not need compatibility with older versions.


    -- Configuring done
    CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
    Please set them or make sure they are set and tested correctly in the CMake files:
    /home/yangqiuyu/minihack/torchbeast/_Python3_NumPy_INCLUDE_DIR
       used as include directory in directory /home/yangqiuyu/minihack/torchbeast

    CMake Error at third_party/pybind11/tools/pybind11Tools.cmake:167 (add_library):
      Target "_C" links to target "Python3::NumPy" but the target was not found.
      Perhaps a find_package() call is missing for an IMPORTED target, or an
      ALIAS target is missing?
    Call Stack (most recent call first):
      CMakeLists.txt:97 (pybind11_add_module)


    CMake Warning at third_party/pybind11/tools/pybind11Tools.cmake:167 (add_library):
      Cannot generate a safe runtime search path for target _C because files in
      some directories may conflict with libraries in implicit directories:

        runtime library [libtorch_python.so] in /lib may be hidden by files in:
          /home/yangqiuyu/anaconda3/envs/polybeast/lib/python3.8/site-packages/torch/lib

      Some of these libraries may not be found correctly.
    Call Stack (most recent call first):
      CMakeLists.txt:97 (pybind11_add_module)


    -- Generating done
    CMake Generate step failed.  Build files cannot be regenerated correctly.
    /tmp/pip-build-env-yfrvjlsm/overlay/lib/python3.8/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
      warnings.warn(
    /tmp/pip-build-env-yfrvjlsm/overlay/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
      warnings.warn(
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.


I would really appreciate it if someone could help me find out where the problem is.

Should we keep the same policy in one trajectory ?

The IMPALA paper says:

At the beginning of each trajectory, an actor updates its own local policy µ to the latest learner policy π and runs it for n steps in its environment.

Does this mean that we should keep the same policy within one trajectory?
I think the implementation here may update the policy during sampling, which would result in a changing policy within a single trajectory:

for t in range(flags.unroll_length):
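
One way to guarantee a fixed behaviour policy per trajectory would be to snapshot the learner weights once, before each unroll, rather than reading shared weights while stepping. A hypothetical sketch (actor_model and shared_model are illustrative names, not the repository's actual variables):

# Hypothetical: copy the shared weights once per unroll so the behaviour
# policy mu stays fixed for the whole trajectory.
actor_model.load_state_dict(shared_model.state_dict())
for t in range(flags.unroll_length):
    with torch.no_grad():
        agent_output, agent_state = actor_model(env_output, agent_state)
    env_output = env.step(agent_output["action"])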

error install nest on Ubuntu 16

pip install nest/
Processing ./nest
Requirement already satisfied: pybind11>=2.3 in /anaconda3/envs/torchbeast/lib/python3.7/site-packages (from nest==0.0.3) (2.6.2)
Building wheels for collected packages: nest
Building wheel for nest (setup.py) ... error
ERROR: Command errored out with exit status 1:
command:
/anaconda3/envs/torchbeast/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-xjc52l4k/setup.py'"'"'; file='"'"'/tmp/pip-req-build-xjc52l4k/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-ty9pec67
cwd: /tmp/pip-req-build-xjc52l4k/
Complete output (227 lines):
running bdist_wheel
running build
running build_ext
building 'nest' extension
creating build
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/nest

cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from nest/nest_pybind.cc:21:0:
nest/nest.h:29:22: error: parameter packs not expanded with ‘...’:
using Ts::operator()...;
^
nest/nest.h:29:22: note: ‘Ts’
nest/nest.h:29:23: error: expected ‘;’ before ‘...’ token
using Ts::operator()...;
^
nest/nest.h:29:23: error: expected unqualified-id before ‘...’ token
nest/nest.h:32:37: error: expected constructor, destructor, or type conversion before ‘;’ token
overloaded(Ts...)->overloaded<Ts...>;
^
nest/nest.h:38:12: error: ‘variant’ in namespace ‘std’ does not name a template type
std::variant<T, std::vector<Nest>, std::map<std::string, Nest>>;
^
nest/nest.h:58:16: error: expected ‘)’ before ‘v’
Nest(value_t v) : value(std::move(v)) {}

"Done" default of 1 results in 0 reward episodes

In core/environment.py, done defaults to torch.ones, instead of torch.zeros. This means that in monobeast's act(), the first replay entry each actor creates has a done value of 1. Then when episode returns are reported, those episodes have rewards of 0, though the episodes never really happened at all.
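
If one wanted the first entry to not mark an episode boundary, the change would be along these lines (a sketch; adapt the shapes and dtype to what core/environment.py actually uses):

import torch

# Before: the very first "done" flag defaults to 1, creating a phantom
# zero-reward episode in the logged returns.
initial_done = torch.ones(1, 1, dtype=torch.uint8)

# After: start at 0 so no episode boundary is recorded before any steps.
initial_done = torch.zeros(1, 1, dtype=torch.uint8)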

(By the way, excellent repo! Very useful.)

polybeast is slower than monobeast

I built the CUDA docker container as described, and tested monobeast and polybeast with almost the same parameters:

python -m torchbeast.monobeast \
     --env PongNoFrameskip-v4 \
     --num_actors 64 \
     --total_steps 30000000 \
     --learning_rate 0.0004 \
     --epsilon 0.01 \
     --entropy_cost 0.01 \
     --batch_size 4 \
     --unroll_length 80 \
     --num_buffers 60 \
     --num_threads 4 \
     --xpid example

python -m torchbeast.polybeast \
     --env PongNoFrameskip-v4 \
     --num_actors 64 \
     --total_steps 30000000 \
     --learning_rate 0.0004 \
     --epsilon 0.01 \
     --entropy_cost 0.01 \
     --batch_size 4 \
     --unroll_length 80 \
     --xpid example

I got the result that polybeast is slower than monobeast:
monobeast's speed is about 10000 SPS, while polybeast's speed is about 3000 SPS.
I have checked the GPU and it works fine. monobeast used 100% of every CPU core, but polybeast used only 50% of every CPU core.
How can I speed up the polybeast?

act() function doesn't use model in eval mode

Hey guys,

Thanks again for this amazing library that makes training RL agents extremely easy. I have a quick question about the act() function. This is supposed to be the function responsible for collecting the experiences of the agent in the environment. In this phase, the actor model is used, which is different from the learner model. In PyTorch, as you might know, there are two different modes: 'train' and 'eval'. I was expecting act() to call model.eval() before starting to collect new experiences, but that is not happening here: https://github.com/facebookresearch/torchbeast/blob/master/torchbeast/monobeast.py#L128

I have seen people argue that in an RL setup it is important to disable dropout to reduce the variance of the policy. This would be a side effect of calling eval(). I can see that the default agent doesn't have any dropout, so maybe this wasn't required in your case. What would you recommend?

link openssl library on Mac OS X when using Homebrew

With the latest Homebrew formula of openssl@1.1, it seems necessary to link the ssl library explicitly. After modifying setup.py as below, things work fine again.

...
if sys.platform == "darwin":
    extra_compile_args += ["-stdlib=libc++", "-mmacosx-version-min=10.14", "-I/usr/local/opt/openssl@1.1/include"]
    extra_link_args += ["-stdlib=libc++", "-mmacosx-version-min=10.14", "-L/usr/local/opt/openssl@1.1/lib"]

    # Relevant only when c-cares is not embedded in grpc, e.g. when
    # installing grpc via homebrew.
    libraries.append("cares")
    libraries.append("ssl")
...

learning rate very sensitive to training results?

When training PongNoFrameskip-v4 in the example experiment, if I change the learning rate from 0.0004 to 0.0008, I cannot get a good result within 30000000 frames. Why? Also, when I choose a larger batch size, such as 8, training speed slows down. How can I speed up training?

Excessive memory use in monobeast.py

My collaborator and I both experience this issue.

python -m torchbeast.monobeast --env PongNoFrameskip-v4 --num_actors 1 --num_buffers 2 --total_steps 5000000

The memory in the learner process will increment occasionally until it chokes our machines (8 GB of memory, with 5 GB for the learner process) at around 1 million steps. I spoke to someone on Discord about this, and they did not have this issue. However, I note that the graph of memory usage they posted showed around 300 MB of use, which seems too small (but I suppose it's really net-dependent).

How exactly do monobeast and polybeast differ from a performance perspective?

Hi @heiner,

Thank you for your kind reply to the previous issue (#25).
As I understand it, I need to use polybeast to reproduce the SpaceInvaders results.

But could you please elaborate a little bit more on the following: "The MonoBeast version you are using has the upside of being simpler to install and run, but uses a different design that impacts RL performance in hard to understand ways".

I assume that polybeast is much faster than monobeast, but what exactly is the reason for the score gap between the two? e.g. better exploration at the early stage of training, less policy lag during environment interaction ...
e.g. better exploration at the early stage of training, less policy lag during environment interaction ...

Why doesn't the test code use LSTM?

Hello author!
Why doesn't the test code use LSTM?
The model's forward-propagation code checks:
if self.use_lstm and len(core_state) is not 0:
But the LSTM state is never initialized in the test code via:
agent_state = model.initial_state(batch_size=1)
and no LSTM state is passed in during inference:
agent_outputs = model(observation)
As a result, even if the model was trained with --use_lstm, the LSTM is still not used at test time.
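
A hypothetical sketch of what the test loop would need instead, mirroring the training-time calling convention (variable names are illustrative):

# Hypothetical fix: initialize the LSTM state and thread it through the loop.
agent_state = model.initial_state(batch_size=1)
while not done:
    agent_outputs, agent_state = model(observation, agent_state)
    observation = env.step(agent_outputs["action"])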

Reproducing Pong Training Curve using Monobeast

Hi,

I tried running monobeast.py on different environments, including LunarLander-v2 and PongNoFrameskip-v4, but the model doesn't learn anything. I tried the hyperparameters for Pong listed in the README.md, but still nothing. The mean expected return does not move. I checked the gradients: they are non-zero and the weights are being updated. However, during testing, I noticed that the agent always chooses the same action.

I am also getting A LOT of NaNs in the log file for the mean_expected_return. Is this normal?

Any help would be appreciated, thanks!

Missing PyTorch in the requirements.txt?

Hi there. Congratulations on the release. The platform looks great.

After following the installation guide for MonoBeast

$ conda create -n torchbeast python=3.7
$ conda activate torchbeast
$ pip install -r requirements.txt

and running

python -m torchbeast.monobeast --env PongNoFrameskip-v4

I'm getting the following error ModuleNotFoundError: No module named 'torch'.

Looks like you're missing the torch prerequisite in the requirements.txt for the new torchbeast conda environment.
