tensorlayer / rlzoo

A Comprehensive Reinforcement Learning Zoo for Simple Usage 🚀

License: Apache License 2.0

Python 66.57% Jupyter Notebook 33.35% Shell 0.08%

reinforcement-learning-practices tensorlayer tensorflow reinforcement-learning deep-reinforcement-learning deep-learning mindspore paddepaddle

rlzoo's Issues

RLBench learning speed

Hi, I am checking out this repository and was able to install everything without any apparent problems.

I am testing the run_rlzoo.py script using RLBench with the ReachTarget task, and it is quite slow: one episode takes about 2~3 minutes. Have you seen the same behavior when training, or is it something in my configuration? At first an episode took about 5 minutes; then I realized that TensorFlow was not using my GPU. After fixing that it is about twice as fast, but 3 minutes per episode is still quite slow, especially for a run of 1000 episodes.

Is that the normal speed when using V-REP? Is there anything I can do to train with faster-than-real-time simulation in RLBench?

Does RLzoo support Dict gym env state?

My customized gym env has a dict-type obs_space. Even though I also customized ActorNetwork and CriticNetwork, RLzoo's source code seems to support only a single input and cannot handle a dict state.
Is there any plan to support dict gym env states?
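Until dict states are supported, one workaround is to flatten the dict observation into a single vector before it reaches the networks. The sketch below is only an illustration in plain Python: `flatten_dict_obs` and `_flatten` are hypothetical helpers, not part of RLzoo, and it assumes that every value in the dict is a number or a nested list of numbers. Gym also ships a `gym.wrappers.FlattenObservation` wrapper that may serve the same purpose.

```python
def _flatten(value):
    """Recursively flatten nested lists/tuples of numbers into a flat list of floats."""
    if isinstance(value, (list, tuple)):
        out = []
        for item in value:
            out.extend(_flatten(item))
        return out
    return [float(value)]


def flatten_dict_obs(obs):
    """Concatenate a dict observation into one flat list.

    Keys are visited in sorted order so the layout of the vector is
    the same at every step (hypothetical helper, not an RLzoo API).
    """
    flat = []
    for key in sorted(obs):
        flat.extend(_flatten(obs[key]))
    return flat


example_obs = {"pos": [1, 2], "goal": [[3], [4]], "t": 5}
flat_obs = flatten_dict_obs(example_obs)
```

The fixed key order matters: the network sees an unlabeled vector, so each entry must always mean the same thing.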

_TF_MODULE_IGNORED_PROPERTIES = tf.Module._TF_MODULE_IGNORED_PROPERTIES.union(
    (
        '_graph_parents',
AttributeError: type object 'Module' has no attribute '_TF_MODULE_IGNORED_PROPERTIES'

a little error

https://github.com/tensorlayer/RLzoo/blob/master/rlzoo/common/value_networks.py#L187
I think L187 "raise ValueError("State Shape Not Accepted!")" should be changed to "raise ValueError("Action Shape Not Accepted!")"

dqn.py exploration rate is wrong

https://github.com/tensorlayer/RLzoo/blob/master/rlzoo/algorithms/dqn/dqn.py#L178-L179
should be modified to "eps = 1 - (1 - exploration_final_eps) * min(1, i / (exploration_rate * train_episodes * max_steps))"
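The proposed formula describes a linear decay from 1.0 down to exploration_final_eps over the first exploration_rate fraction of training. A minimal sketch of that schedule follows; the function name `linear_epsilon` and the default values are assumptions for illustration, and `total_steps` stands for train_episodes * max_steps.

```python
def linear_epsilon(step, total_steps, exploration_rate=0.2, exploration_final_eps=0.01):
    """Linearly anneal epsilon from 1.0 to exploration_final_eps.

    The decay finishes after exploration_rate * total_steps steps and
    epsilon stays at exploration_final_eps afterwards.
    Default values here are illustrative assumptions.
    """
    decay_steps = exploration_rate * total_steps
    fraction = min(1.0, step / decay_steps)
    return 1.0 - (1.0 - exploration_final_eps) * fraction


# Epsilon at the start, halfway through the decay, and after the decay ends.
eps_start = linear_epsilon(0, 1000)
eps_mid = linear_epsilon(100, 1000)
eps_end = linear_epsilon(500, 1000)
```

With these assumed defaults, epsilon starts at 1.0, reaches the final value at step 200 of 1000, and stays there.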

How to open CoppeliaSim

one error when running run_rlzoo.py

Traceback (most recent call last):
  File "D:/Anaconda3/Lib/site-packages/rlzoo/run_rlzoo.py", line 30, in <module>
    alg_params, learn_params = call_default_params(env, EnvType, AlgName)
  File "D:\Anaconda3\Lib\site-packages\rlzoo\common\utils.py", line 131, in call_default_params
    default_seed) # need manually set seed in the main script if default_seed = False
  File "D:\Anaconda3\Lib\site-packages\rlzoo\algorithms\sac\default.py", line 43, in classic_control
    soft_q_net1 = QNetwork(env.observation_space, env.action_space,
NameError: name 'QNetwork' is not defined

Results on Box2D environments

I tried to benchmark the following environments ['BipedalWalker-v2', 'BipedalWalkerHardcore-v2', 'CarRacing-v0', 'LunarLander-v2', 'LunarLanderContinuous-v2'] using the ['A3C', 'DDPG', 'TD3', 'SAC', 'PG', 'TRPO', 'PPO', 'DPPO'] algorithms. Most of the combinations failed to learn the task and did not converge. Only (SAC, LunarLanderContinuous-v2) and (TD3, LunarLanderContinuous-v2) learned the task, and only sub-optimally. Can someone address this issue?

What is the default setting (e.g., total training steps, learning rate) of DQN?

Hi,

In your code, the training parameter settings are imported from utils:
from rlzoo.common.utils import call_default_params

Is there any documentation that explains these default settings and how to modify them?

ImportError: cannot import name 'ArmActionMode'

I am getting this issue (in the screenshot) while running RLzoo with RLbench using the following code:

from rlzoo.common.env_wrappers import *
from rlzoo.common.utils import *
from rlzoo.algorithms import *

EnvName = 'ReachTarget'
EnvType = 'rlbench'
env = build_env(EnvName, EnvType, state_type='vision')

AlgName = 'SAC'
alg_params, learn_params = call_default_params(env, EnvType, AlgName)
alg = eval(AlgName+'(**alg_params)')
alg.learn(env=env, mode='train', render=False, **learn_params)
alg.learn(env=env, mode='test', render=True, **learn_params)

env.close()

I have also added export PYTHONPATH="/home/sidharth/RLBench" in .bashrc

Any help would be appreciated! Thanks.

How to install it with pip?

Results using RLBench as the environment

Hi,
First of all, let me say that I really appreciate the work done in this repo.
I would like to know whether you have had success in training any algorithm using RLBench as the environment.
I'm currently trying to train the DDPG algorithm on the ReachTarget task using all the observations available with state_type='vision'. As suggested in issue #6, I modified the default params for DDPG, lowering max_steps and increasing train_episodes, but I can't seem to get any result.
Any feedback is very much appreciated.

Mirko

Edit:
I noticed that RLBench doesn't provide "usable" reward metrics, or am I wrong? All the episode rewards are either 0.000 or 1.000. Any insight on this problem?

run_dqn.py got an unexpected keyword argument 'state_only'

a problem about class QNetwork

Why do you use "assert len(self._action_shape) == 1", i.e., https://github.com/tensorlayer/RLzoo/blob/master/rlzoo/common/value_networks.py#L143 ? As I understand it, the shape of a spaces.Box may have two or more dimensions.

Does RLzoo support user-defined environment?

Does RLzoo support environments written by the user?

cross entropy is wrong

In https://github.com/tensorlayer/RLzoo/blob/master/rlzoo/common/distributions.py#L96, I think it should be modified to "return -tf.reduce_sum(x*self._logits, axis=1)" to return the cross entropy, because self._logits is already the logarithm of the probability.
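The suggestion can be checked numerically. The sketch below uses plain Python rather than TensorFlow and assumes, as the issue states, that the stored values are normalized log-probabilities; `log_softmax` and `cross_entropy` are illustrative helper names, not RLzoo functions.

```python
import math


def log_softmax(scores):
    """Turn raw scores into normalized log-probabilities (numerically stable)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def cross_entropy(target_probs, log_probs):
    """Cross entropy -sum_i x_i * log q_i, where log_probs already holds log q_i."""
    return -sum(t * lp for t, lp in zip(target_probs, log_probs))


log_q = log_softmax([0.0, 0.0])          # uniform over two classes
ce = cross_entropy([1.0, 0.0], log_q)    # one-hot target on class 0
```

For a one-hot target and a uniform two-class prediction, the result equals log 2, which is what the cross-entropy definition gives.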

a bug in calculating td error?

Does RLzoo support tensorflow 2.1.0?

A suggestion about kl divergence and entropy implementation of categorical distribution

In https://github.com/tensorlayer/RLzoo/blob/master/rlzoo/common/distributions.py#L99, I think there is a more concise implementation:

@expand_dims
def kl(self, logits):
    p = tf.exp(self._logits)
    kl = tf.reduce_sum(p * (self._logits-logits), axis=-1)
    return kl

@expand_dims
def entropy(_logits):
    p = tf.exp(_logits)
    return tf.reduce_sum(-p * _logits, axis=-1)

I don't know why you implemented it in a more complicated way in your code. Could you explain the reason?
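The concise formulas above are valid only if the stored values are normalized log-probabilities. Under that assumption, they can be sketched in plain Python as follows; `entropy` and `kl` here are standalone illustrations, not the RLzoo methods. If the values were raw, unnormalized logits, a log-softmax step would be needed first.

```python
import math


def entropy(log_p):
    """Entropy H(p) = -sum_i p_i * log p_i, computed from log-probabilities."""
    return -sum(math.exp(lp) * lp for lp in log_p)


def kl(log_p, log_q):
    """KL(p || q) = sum_i p_i * (log p_i - log q_i), computed from log-probabilities."""
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


log_p = [math.log(0.5), math.log(0.5)]
log_q = [math.log(0.25), math.log(0.75)]

h_p = entropy(log_p)
kl_pq = kl(log_p, log_q)
kl_pp = kl(log_p, log_p)
```

Here the entropy of the uniform distribution is log 2, the divergence from p to q is 0.5 * log(4/3), and the divergence of p from itself is zero.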

ValueError: too many values to unpack

When I run run_dqn.py with QNetwork(...)'s parameter state_only=False, the line "states, actions = inputs" in value_networks.py raises the ValueError in the title. It occurs because of "obv = np.expand_dims(obv, 0).astype('float32')" in dqn.py. I think that if state_only=False, act_inputs should be added to obv for debugging. I hope you can fix this error.

Could PPO solve DM control tasks?

I installed RLzoo and used its PPO to train an agent for the DM Control Suite. I tested the environments CheetahRun-v0 and CartpoleSwingup-v0, but the current PPO could solve neither of them. Could you please help me? I attach the testing reward for CartpoleSwingup-v0 below.

RLBench ActionMode attribute action_size not found

I was trying to use the RLzoo algorithms with the RLBench environment and got an error from line 50 of the build_rlbench_env.py file.

It seems that in this commit they changed where the action_size attribute comes from, so line 50 should now be:

self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(self.env.action_size,), dtype=np.float32)

ImportError: cannot import name 'ArmActionMode'

Hi, I ran rlzoo/interactive/main.ipynb and got this error:
ImportError: cannot import name 'ArmActionMode'
I applied the solution suggested in #43, but the problem isn't solved and I now get a new error:
action_shape() missing 1 required positional argument: 'scene'
What should I do?

The purpose of the init parameter state_conditioned

In https://github.com/tensorlayer/RLzoo/blob/master/rlzoo/common/policy_networks.py#L192, what is the purpose of the init parameter state_conditioned of class StochasticPolicyNetwork?

How to properly introduce a new RLBench task?

Hello!

I want to introduce a new RLBench task (or also override one). How do I accomplish this properly? The only way I can think of now is to rewrite parts of the code in the RLBench package, which I don't think is the proper way to do it. Should there be an argument to indicate where the task is defined?

Thank you!

Divided by None

At https://github.com/tensorlayer/RLzoo/blob/master/rlzoo/common/distributions.py#L137, self.action_scale is initialized to None. At https://github.com/tensorlayer/RLzoo/blob/master/rlzoo/common/distributions.py#L173, I think this causes an error because of a division by None.
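In plain Python, dividing by None raises a TypeError, which is the failure the issue anticipates. The sketch below reproduces that behavior and shows one possible guard; `normalize` is a hypothetical helper, and treating a missing scale as "no scaling" is an assumption, not necessarily what the library intends.

```python
def normalize(value, scale=None):
    """Divide value by scale; a missing scale is treated as no scaling.

    Hypothetical helper: the fallback for scale=None is an assumption.
    """
    if scale is None:
        return value
    return value / scale


# Without a guard, dividing by None fails with a TypeError.
try:
    1.0 / None
except TypeError as exc:
    error_message = str(exc)

unscaled = normalize(4.0)
scaled = normalize(4.0, 2.0)
```

An alternative design would be to raise an explicit error when the scale is unset, so that a missing configuration value is not silently ignored.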