d4rl-evaluations's People

Contributors

aviralkumar2907, jkterry1, justinjfu


d4rl-evaluations's Issues

[BEAR] Problems with the installation

Dear all,

I encountered many difficulties during the installation of BEAR, mainly due to conflicting dependencies between packages.
Also, the specified version of pytorch looks to be wrong, since reducing over multiple dimensions (e.g., torch.mean(x, dim=(1, 2))) is not possible with pytorch==0.4.1.
I managed to fix the errors, and I report here how I solved them.
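
For context, a quick check of the multi-dimension reduction mentioned above (a minimal sketch; the exact release that added this to torch.mean is from memory, so treat the version bound as approximate):

import torch

x = torch.randn(4, 3, 2)
# On pytorch==0.4.1 this reduction raises an error, as reported above;
# on recent releases (e.g., the 1.2 installed below) it reduces over
# both dimensions at once.
m = torch.mean(x, dim=(1, 2))
print(m.shape)  # torch.Size([4])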

This is probably one of the many possible ways one can solve the aforementioned problems.

SOLUTION

cd bear
conda create -n rlkit python=3.7.0
conda activate rlkit
conda install -c conda-forge/label/cf202003 glfw
conda install pip
conda install -c conda-forge box2d-py
conda install pytorch==1.2.0 torchvision cudatoolkit=10.0 -c pytorch

conda install cython ipython joblib lockfile mako matplotlib mkl numba path.py python-dateutil scipy patchelf pygame cloudpickle gitpython parso

conda install -c conda-forge gym
conda install -c conda-forge ipdb

pip install gtimer
pip install numpy

pip install -e .

Download d4rl from the official repository and cd d4rl.
Replace the contents of setup.py with the following:

from distutils.core import setup
from setuptools import find_packages

setup(
    name='d4rl',
    version='1.1',
    install_requires=['gym', 
                      'numpy', 
                      'mujoco_py', 
                      'pybullet',
                      'h5py', 
                      'termcolor', # adept_envs dependency
                      'click',  # adept_envs dependency
#                      'dm_control @ git+git://github.com/deepmind/dm_control@master#egg=dm_control',
                      'dm_control',
                      'mjrl @ git+git://github.com/aravindr93/mjrl@master#egg=mjrl'],
    packages=find_packages(),
    package_data={'d4rl': ['locomotion/assets/*',
                           'hand_manipulation_suite/assets/*',
                           'hand_manipulation_suite/Adroit/*',
                           'hand_manipulation_suite/Adroit/gallery/*',
                           'hand_manipulation_suite/Adroit/resources/*',
                           'hand_manipulation_suite/Adroit/resources/meshes/*',
                           'hand_manipulation_suite/Adroit/resources/textures/*',
                           ]},
    include_package_data=True,
)

Install the library with pip install . and pip install dm_control.
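
After installation, a quick sanity check that d4rl loads correctly (a minimal sketch; the environment name is just an example):

import gym
import d4rl  # importing d4rl registers the offline environments with gym

env = gym.make('maze2d-umaze-v1')  # any d4rl task name works here
dataset = env.get_dataset()        # downloads the hdf5 file on first use
print(dataset['observations'].shape)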

P.S. A big thank you to Joao Carvalho (@jacarvalho, https://github.com/jacarvalho) for finding this solution.

Evaluation on Carla

What is the correct way of doing evaluations on the Carla tasks?
For example, when I run python awr/scripts/run_conv.py, it emits an enormous number of warnings, and NaNs also appear in the results (as in the screenshot below). Is that normal?
[Screenshot from 2020-08-05: training output containing NaN values]
Also, after 78 iterations the program halts with "terminate called after throwing an instance of 'std::out_of_range'".

Thank you for making this great dataset! It would be better if more detailed instructions on how to replicate the results in the paper could be provided (even better along with the estimated running time on a single GPU).

D4RL does not support Python 3.5

D4RL does not support Python 3.5, but rlkit is built on Python 3.5.2. How can I successfully install d4rl after installing rlkit?

Issue with BEAR

I ran the SAC algorithm in BEAR and found that the average return is always negative. I don't know where the problem is.

Maze2d tasks don't have a goal location in the state

Hi,

I find it irritating that the observations in the maze2d tasks only contain the 2D positions/velocities. If the agent is not informed about the goal location (which can be found under infos/goal in the dataset), it can't decide whether to go, e.g., left or right, as the goal might be on either side.

How was that dealt with in the experiments from the paper? Is the agent conditioned on the goal in some form?
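
For reference, a hedged sketch of how one could condition on the goal stored in the dataset (the key name follows the dataset layout mentioned above; this is not necessarily what the paper did):

import gym
import numpy as np
import d4rl  # registers the offline environments

env = gym.make('maze2d-umaze-v1')
data = env.get_dataset()
goals = data['infos/goal']  # per-timestep goal locations
# Append the goal to each observation so a policy can be goal-conditioned.
obs_with_goal = np.concatenate([data['observations'], goals], axis=1)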

Thanks,
-Justin

Behavior Cloning in BRAC

I trained a behavior cloning model with the file in BRAC, but the performance is bad. Is that file the one used to get the BC results in the paper?
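
For reference, a minimal behavior cloning sketch on a d4rl dataset (a generic baseline for comparison, not the BRAC file in question; all names and hyperparameters are illustrative):

import gym
import torch
import torch.nn as nn
import d4rl  # registers the offline environments

env = gym.make('halfcheetah-medium-v0')
data = d4rl.qlearning_dataset(env)
obs = torch.as_tensor(data['observations'], dtype=torch.float32)
act = torch.as_tensor(data['actions'], dtype=torch.float32)

# Small MLP policy regressed onto the dataset actions with an MSE loss.
policy = nn.Sequential(nn.Linear(obs.shape[1], 256), nn.ReLU(),
                       nn.Linear(256, act.shape[1]))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(10000):
    idx = torch.randint(0, obs.shape[0], (256,))
    loss = ((policy(obs[idx]) - act[idx]) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()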

Error in d4rl_evaluations/crem/scripts/run_script.py

I tried to run the code and found a small error in d4rl_evaluations/crem/scripts/run_script.py.
The line

replay_buffer.add(obs, action, new_obs, reward, done_bool)

should probably be changed to:

replay_buffer.add((obs, action, new_obs, reward, done_bool))
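
For context, a hedged sketch of why the tuple is needed, assuming a BCQ-style replay buffer like the one this script appears to use (the actual implementation may differ):

class ReplayBuffer:
    def __init__(self):
        self.storage = []

    def add(self, data):
        # add() takes a single tuple (obs, action, next_obs, reward, done),
        # so the caller must wrap the five values in parentheses.
        self.storage.append(data)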

RuntimeError while running Bear

I tried to run the BEAR algorithm with your example command: python examples/bear_hdf5_d4rl.py --env='halfcheetah-medium-v0' --policy_lr=1e-4 --num_samples=100
Already at the first epoch I get this error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [256, 6]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

I tried to fix it, but I couldn't. I did notice that it works with mode != auto, but I need the auto mode turned on, and the results without it are not as good as expected!

Offline version of AWR

Hi, I am trying to experiment with AWR on a static dataset. But I find that the AWR code both in this repo and in the author's repo is the online version. Would you be willing to share the offline version of AWR used in the experiments in the paper?

Socket object in the snapshot?

Hello,

Thank you for the work on this code.
I am trying to train an offline RL agent on flow, but I'm unable to save the model; I get the following error:

Traceback (most recent call last):
  File "examples/bear_hdf5_d4rl.py", line 205, in <module>
    experiment(variant)
  File "examples/bear_hdf5_d4rl.py", line 149, in experiment
    algorithm.train()
  File "/home/tibo/Documents/Prog/Git/d4rl/d4rl_evaluations/bear/rlkit/core/rl_algorithm.py", line 46, in train
    self._train()
  File "/home/tibo/Documents/Prog/Git/d4rl/d4rl_evaluations/bear/rlkit/core/batch_rl_algorithm.py", line 176, in _train
    self._end_epoch(epoch)
  File "/home/tibo/Documents/Prog/Git/d4rl/d4rl_evaluations/bear/rlkit/core/rl_algorithm.py", line 57, in _end_epoch
    logger.save_itr_params(epoch, snapshot)
  File "/home/tibo/Documents/Prog/Git/d4rl/d4rl_evaluations/bear/rlkit/core/logging.py", line 288, in save_itr_params
    torch.save(params, file_name)
  File "/home/tibo/.local/lib/python3.7/site-packages/torch/serialization.py", line 370, in save
    _legacy_save(obj, opened_file, pickle_module, pickle_protocol)
  File "/home/tibo/.local/lib/python3.7/site-packages/torch/serialization.py", line 442, in _legacy_save
    pickler.dump(obj)
  File "/usr/lib/python3.7/socket.py", line 192, in __getstate__
    raise TypeError("Cannot serialize socket object")
TypeError: Cannot serialize socket object

I guess this means there is a socket object in the saved snapshot, but I can't figure out what's wrong.
Also, what would be a good way to visualize the results of the model in the simulation?
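
For context, a hedged sketch of one possible workaround, assuming the snapshot contains the flow environment, which holds an open socket to the traffic simulator (the key filter is hypothetical; adjust it to the actual snapshot contents):

import torch

def save_snapshot(snapshot, file_name):
    # Drop entries whose key refers to the environment before pickling,
    # since an env holding an open socket cannot be serialized.
    safe = {k: v for k, v in snapshot.items() if 'env' not in k}
    torch.save(safe, file_name)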

Thanks for the help,
Best,

Tibo

pytorch implementations

Hi Justin,

Do you have all these implementations in pytorch? For instance, I see BCQ is in pytorch but AWR isn't. Just checking.

best,
Ankur

Clarification about Training and Evaluation Task Split

Hi,

Thanks for sharing this repository. It is great!
I'd like to ask about the "Training and Evaluation Task Split" in Appendix D and how the results in Tables 1 and 3 are reported. I am a bit confused about how those were done.
For simplicity, let's assume BCQ and Maze2D are being used. Which of the following is the correct description of what was done in the paper:

  1. BCQ is trained on "maze2d-umaze-v1". Then the learned model is used to report results on "maze2d-eval-umaze-v1"? In other words, maze2d-eval-umaze-v1 is not used for training and is only used to report results?

  2. BCQ's hyperparameters are tuned on "maze2d-umaze-v1". Then, BCQ is trained with those hyperparameters and evaluated on "maze2d-eval-umaze-v1"? In other words, maze2d-eval-umaze-v1 is used for both training and evaluation?

  3. Or any other scenario?

Thanks for your help.

Pytorch autograd failing when running bear/examples/sac.py

Hi,

I am using PyTorch version 1.6 to run this script: bear/examples/sac.py. The script fails with the following error:

Traceback (most recent call last):
  File "sac.py", line 111, in <module>
    experiment(variant)
  File "sac.py", line 78, in experiment
    algorithm.train()
  File "/home/ashish/d4rl_evaluations/bear/rlkit/core/rl_algorithm.py", line 46, in train
    self._train()
  File "/home/ashish/d4rl_evaluations/bear/rlkit/core/batch_rl_algorithm.py", line 172, in _train
    self.trainer.train(train_data)
  File "/home/ashish/d4rl_evaluations/bear/rlkit/torch/torch_rl_algorithm.py", line 40, in train
    self.train_from_torch(batch)
  File "/home/ashish/d4rl_evaluations/bear/rlkit/torch/sac/sac.py", line 144, in train_from_torch
    policy_loss.backward()
  File "/home/ashish/ve/py36/lib/python3.6/site-packages/torch/tensor.py", line 185, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/ashish/ve/py36/lib/python3.6/site-packages/torch/autograd/__init__.py", line 127, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [256, 1]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Does rlkit support pytorch 1.6 or is this a deeper issue?
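
For context, this in-place error commonly appears when running older rlkit code on PyTorch >= 1.5, where optimizer.step() bumps the in-place version counter of the parameters it updates. Below is a minimal, self-contained sketch of the usual workaround, finishing every backward() before any optimizer step (names are illustrative, not rlkit's actual code):

import torch
import torch.nn as nn

q = nn.Linear(4, 1)    # stand-in for a Q-network
pi = nn.Linear(4, 4)   # stand-in for a policy
opt_q = torch.optim.Adam(q.parameters())
opt_pi = torch.optim.Adam(pi.parameters())

s = torch.randn(8, 4)
q_loss = (q(s) - 1.0).pow(2).mean()
pi_loss = -q(pi(s)).mean()  # the policy loss backprops through q's weights

opt_q.zero_grad()
opt_pi.zero_grad()
q_loss.backward()
pi_loss.backward()
# Stepping opt_q between the two backward() calls would modify q.weight
# in place while pi_loss still needs it, raising the error above.
opt_q.step()
opt_pi.step()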

Issue with running the code

I was trying to run the AWR algorithm on the HalfCheetah environment as given in the README. First of all, there is no run.py script in the AWR folder, so I copied run_script.py to the root of the AWR folder and ran the command

python run.py --env HalfCheetah-v2 --max_iter 20000 --visualize

However, I got the following error:

Traceback (most recent call last):
  File "run.py", line 96, in <module>
    main(sys.argv)
  File "run.py", line 79, in main
    agent = build_agent(env)
  File "run.py", line 57, in build_agent
    agent = awr_agent.AWRAgent(env=env, sess=sess, **agent_configs)
  File "/Users/shs/Desktop/Projects/Offline RL/d4rl_evaluations/awr/learning/awr_agent.py", line 78, in __init__
    visualize=visualize)
  File "/Users/shs/Desktop/Projects/Offline RL/d4rl_evaluations/awr/learning/rl_agent.py", line 61, in __init__
    self._load_demo_data(self._env)
  File "/Users/shs/Desktop/Projects/Offline RL/d4rl_evaluations/awr/learning/rl_agent.py", line 466, in _load_demo_data
    demo_data = d4rl.qlearning_dataset(env)
  File "/Users/shs/Desktop/Projects/Offline RL/d4rl/d4rl/__init__.py", line 87, in qlearning_dataset
    dataset = env.get_dataset(**kwargs)
  File "/Users/shs/.local/lib/python3.6/site-packages/gym/core.py", line 216, in __getattr__
    return getattr(self.env, name)
AttributeError: 'HalfCheetahEnv' object has no attribute 'get_dataset'

How should I resolve this?
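
For context, the traceback indicates that the plain gym HalfCheetah-v2 environment was created, which has no get_dataset method; d4rl.qlearning_dataset expects a d4rl-registered task. A hedged sketch of the distinction (the environment name is just an example):

import gym
import d4rl  # must be imported so the offline environments are registered

# 'HalfCheetah-v2' is the standard gym task with no dataset attached;
# the d4rl variants such as 'halfcheetah-medium-v0' provide get_dataset().
env = gym.make('halfcheetah-medium-v0')
dataset = d4rl.qlearning_dataset(env)
print(dataset['observations'].shape)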
