pokaxpoka / sunrise

SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning

Python 99.91% Shell 0.09%

reinforcement-learning rl deep-learning mujoco dm-control codebase model-free off-policy deep-reinforcement-learning deep-neural-networks deep-q-learning deep-q-network soft-actor-critic sac rainbow

sunrise's Introduction

SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning

Official codebase for SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning.

sunrise's Issues

How to reproduce the Rainbow results of your paper

Hi. I noticed that the Rainbow results in your paper differ from those reported in [1] and [2].
What hyperparameter settings did you use for the Rainbow results in your paper?
I also hope you can add detailed usage instructions to the README, such as the requirements. Thanks!

[1] Van Hasselt H P, Hessel M, Aslanides J. When to use parametric models in reinforcement learning? Advances in Neural Information Processing Systems, 2019, 32.
[2] Kaiser Ł, Babaeizadeh M, Miłos P, et al. Model Based Reinforcement Learning for Atari. International Conference on Learning Representations, 2019.

UnboundLocalError: local variable 'env' referenced before assignment

Error:

Traceback (most recent call last):
  File "/home/mirror/PycharmProjects/sunrise/OpenAIGym_SAC/examples/sunrise.py", line 197, in <module>
    experiment(variant)
  File "/home/mirror/PycharmProjects/sunrise/OpenAIGym_SAC/examples/sunrise.py", line 50, in experiment
    expl_env = NormalizedBoxEnv(get_env(variant['env'], variant['seed']))
  File "/home/mirror/PycharmProjects/sunrise/OpenAIGym_SAC/examples/sunrise.py", line 46, in get_env
    env = env(env_name=env_name, rand_seed=seed, misc_info={'reset_type': 'gym'})
UnboundLocalError: local variable 'env' referenced before assignment

The error seems to come from this function:

def get_env(env_name, seed):
    if env_name in ['gym_walker2d', 'gym_hopper',
                    'gym_cheetah', 'gym_ant']:
        from mbbl.env.gym_env.walker import env
    env = env(env_name=env_name, rand_seed=seed, misc_info={'reset_type': 'gym'})
    return env

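The variable env is used but never defined: the import only runs when env_name is one of the four listed names, so any other name reaches the call with env unassigned.
I don't know how to fix this bug because of the complexity of the codebase. Would you mind fixing it in this repo?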
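
One possible way to make the failure explicit is sketched below. It is not the authors' fix, only a minimal variant of the snippet above that raises a clear error for unknown names instead of falling through. The env_factory parameter and the MBBL_WALKER_ENVS constant are hypothetical additions for illustration; the import path and keyword arguments are copied from the snippet in this issue.

```python
# Environment names handled by the mbbl walker module, copied from the snippet above.
MBBL_WALKER_ENVS = ('gym_walker2d', 'gym_hopper', 'gym_cheetah', 'gym_ant')


def get_env(env_name, seed, env_factory=None):
    """Build an environment, failing loudly for names this function does not know.

    env_factory is a hypothetical hook (not in the original code) that lets the
    constructor be swapped out, for example in a test without mbbl installed.
    """
    if env_name not in MBBL_WALKER_ENVS:
        # In the original snippet this case fell through to the call below
        # and raised UnboundLocalError.
        raise ValueError(
            "Unsupported env_name %r; expected one of: %s"
            % (env_name, ", ".join(MBBL_WALKER_ENVS))
        )
    if env_factory is None:
        # Same import as the original snippet, bound to a name that does not
        # collide with the environment instance being returned.
        from mbbl.env.gym_env.walker import env as env_factory
    return env_factory(env_name=env_name, rand_seed=seed,
                       misc_info={'reset_type': 'gym'})
```

With this variant, passing a name such as 'Walker2d-v3' produces a ValueError that lists the accepted names, which is easier to act on than the UnboundLocalError. Supporting additional environments would still need a separate branch for each one.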

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 1]], which is output 0 of TBackward, is at version 2; expected version 1 instead.

Hi there!

Thanks for open-sourcing this work! Congratulations on SUNRISE, it is great work!

I am trying to run SUNRISE in the Walker2d-v3 environment. I have modified the get_env function to use gym.make("Walker2d-v3").

But I encounter the following error:

Traceback (most recent call last):
  File "sunrise.py", line 203, in <module>
    experiment(variant)
  File "sunrise.py", line 156, in experiment
    algorithm.train()
  File "/home/pengzhenghao/drivingforce/drivingforce/dice/sunrise/rlkit/core/rl_algorithm.py", line 46, in train
    self._train()
  File "/home/pengzhenghao/drivingforce/drivingforce/dice/sunrise/rlkit/core/batch_rl_algorithm.py", line 82, in _train
    self.trainer.train(train_data)
  File "/home/pengzhenghao/drivingforce/drivingforce/dice/sunrise/rlkit/torch/torch_rl_algorithm.py", line 50, in train
    self.train_from_torch(batch)
  File "/home/pengzhenghao/drivingforce/drivingforce/dice/sunrise/rlkit/torch/sac/neurips20_sac_ensemble.py", line 260, in train_from_torch
    policy_loss.backward()
  File "/home/pengzhenghao/anaconda3/envs/pgdrive/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/pengzhenghao/anaconda3/envs/pgdrive/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [256, 1]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Do you have any suggestions? The PyTorch version listed in the requirements is quite old, so I am using the latest PyTorch. Is the error related to the PyTorch version?

Thanks!
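
A common cause of this message in actor-critic code on recent PyTorch versions is calling backward() on a loss after an optimizer step has already updated, in place, weights that the loss graph still depends on. Whether that is what happens at line 260 of neurips20_sac_ensemble.py has not been checked here, so the sketch below only reproduces the general pattern. The networks, sizes, and function names are made up for illustration and are not taken from the SUNRISE code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins for a policy and a Q-function (hypothetical, not the SUNRISE networks).
policy = nn.Linear(4, 1)
qf = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))
policy_opt = torch.optim.SGD(policy.parameters(), lr=0.1)
qf_opt = torch.optim.SGD(qf.parameters(), lr=0.1)

obs = torch.randn(256, 4)
target = torch.randn(256, 1)


def losses():
    action = policy(obs)
    # The policy loss backpropagates through the Q-function weights.
    policy_loss = -qf(action).mean()
    # The Q loss uses a detached action, so its graph does not touch the policy.
    qf_loss = ((qf(action.detach()) - target) ** 2).mean()
    return policy_loss, qf_loss


def update_unsafe():
    policy_loss, qf_loss = losses()
    qf_opt.zero_grad()
    qf_loss.backward()
    qf_opt.step()            # updates the Q weights in place
    policy_opt.zero_grad()
    policy_loss.backward()   # graph was built with the old Q weights


def update_safe():
    policy_loss, qf_loss = losses()
    policy_opt.zero_grad()
    policy_loss.backward()   # backward before any weight in its graph changes
    policy_opt.step()
    qf_opt.zero_grad()       # also clears Q gradients left by the policy loss
    qf_loss.backward()
    qf_opt.step()
```

In this toy setup, update_unsafe() fails with the same "modified by an inplace operation" RuntimeError on recent PyTorch releases, while update_safe() runs. If the trainer follows the unsafe ordering, reordering the updates so that each loss is backpropagated before the weights it depends on are stepped may resolve the error.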

code is not complete

Dear authors, thank you very much for releasing the code of your outstanding work. In the paper ‘SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning’, you tested the performance of your algorithm on the DeepMind Control Suite, but the code for those experiments is not in this repository. Could you share it with me? Thanks!
