DMControl Generalization Benchmark

License: MIT License

Shell 2.73% Python 97.27%

dmcontrol-generalization-benchmark's Issues

cnn-encoder part of Actor in sac.py would never be updated?

Hi, here

dmcontrol-generalization-benchmark/src/algorithms/sac.py

Line 104 in ee658ce

_, pi, log_pi, log_std = self.actor(obs, detach=True)

Since detach=True is passed here, the CNN encoder of the actor would never be updated by the actor loss. Is that right? Or do you intentionally not update the actor's CNN encoder?

or am I missing something?
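To look at the effect of detach=True in isolation, here is a minimal PyTorch sketch. It is not the repository's code: TinyActor and its layers are hypothetical stand-ins, and the sketch assumes that detach=True calls .detach() on the encoder output, as is common in SAC-from-pixels implementations.

```python
import torch
import torch.nn as nn

class TinyActor(nn.Module):
    """Hypothetical stand-in: an encoder followed by a policy head."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 8)  # stands in for the CNN encoder
        self.head = nn.Linear(8, 2)     # stands in for the policy MLP

    def forward(self, obs, detach=False):
        h = self.encoder(obs)
        if detach:
            # Cut the graph here: no gradient flows back into the encoder.
            h = h.detach()
        return self.head(h)

actor = TinyActor()
obs = torch.randn(3, 4)

# Backward pass with the encoder output detached.
actor(obs, detach=True).sum().backward()
encoder_grad_detached = actor.encoder.weight.grad  # stays None
head_grad_detached = actor.head.weight.grad        # a tensor

# Same loss without detaching: now the encoder receives gradients too.
actor.zero_grad(set_to_none=True)
actor(obs, detach=False).sum().backward()
encoder_grad_attached = actor.encoder.weight.grad  # a tensor
```

In implementations where the actor and critic share encoder weights, the encoder can still be trained through the critic loss even though the actor loss is detached. Whether that applies to this repository should be checked in sac.py.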

RAD implementation

Hi, thank you for this work! I'm interested in the RAD implementation but found that the RAD agent simply inherits from the SAC agent. How does RAD in this implementation differ from the SAC agent?

Any changes to DMControl source code?

Hi, great work!

I've noticed that this benchmark differs from the original implementation of Distracting Control Suite.

How does the DMC source code here compare with the official repo? Are there any changes?

I think more prominent and detailed notes in the README are necessary and would be welcome.

How to calculate the std mentioned in the article?

Hello, how is the standard deviation in the paper calculated? Should I record every episode reward from all seeds and compute the standard deviation over all of them? Or should I record only the mean over 100 episodes for each seed and compute the standard deviation across these mean values?
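The two alternatives in this question can be written out as a short sketch. The returns below are made-up placeholder numbers, not results from the paper, and the use of the sample standard deviation (statistics.stdev, which divides by n - 1) rather than the population one (statistics.pstdev) is an assumption.

```python
import statistics

# Hypothetical episode returns: 3 seeds x 4 evaluation episodes (placeholder numbers).
returns_per_seed = {
    0: [600.0, 620.0, 580.0, 610.0],
    1: [500.0, 520.0, 510.0, 490.0],
    2: [700.0, 690.0, 710.0, 720.0],
}

# Option A: pool every episode return from every seed, then take one std.
pooled = [r for episodes in returns_per_seed.values() for r in episodes]
std_pooled = statistics.stdev(pooled)

# Option B: average within each seed first, then take the std across the seed means.
seed_means = [statistics.mean(episodes) for episodes in returns_per_seed.values()]
std_across_seeds = statistics.stdev(seed_means)

print(f"pooled std: {std_pooled:.2f}, std across seed means: {std_across_seeds:.2f}")
```

Option A mixes episode-to-episode noise with seed-to-seed variation, while Option B measures only how much the per-seed averages differ, so the two numbers are generally not interchangeable.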

Type Error: 'setting_kwargs'

Hello,

I am facing the same issue as the one described here. I ran this command python3 src/train.py --algorithm svea --seed 0 but got this error TypeError: load() got an unexpected keyword argument 'setting_kwargs'.

I did install dm_control, which comes with mujoco 2.2.0, and I don't see any way to install mujoco 2.0.0 from the mujoco download page.

@nicklashansen Any help would be appreciated.

How to implement data-mixing only mentioned in the SVEA paper?

Congratulations on the acceptance at NIPS 2021!
I would like to ask how to implement the "data-mixing only" variant mentioned in the SVEA paper.

Question about video background

When I run the program with the video_easy or video_hard command, the saved video file has a green background instead of the video background.
I want to ask how to solve this problem.

RuntimeError: DataLoader worker (pid 384930) is killed by signal: Segmentation fault.

I ran your implementation of the SODA algorithm for 500k steps. It ran until about 211k steps and then crashed with a segmentation fault.

Evaluating: logs/walker_walk/soda/0
| eval | S: 210000 | ER: 604.9552 | ERTEST: 473.9582
| train | E: 841 | S: 210250 | D: 77.9 s | R: 676.1391 | ALOSS: -200.7976 | CLOSS: 19.3476 | AUXLOSS: 0.0003
| train | E: 842 | S: 210500 | D: 21.4 s | R: 630.4664 | ALOSS: -200.9594 | CLOSS: 19.7981 | AUXLOSS: 0.0003
| train | E: 843 | S: 210750 | D: 21.5 s | R: 575.7474 | ALOSS: -201.1477 | CLOSS: 19.6175 | AUXLOSS: 0.0003
| train | E: 844 | S: 211000 | D: 21.5 s | R: 587.5916 | ALOSS: -201.0205 | CLOSS: 19.8251 | AUXLOSS: 0.0003
| train | E: 845 | S: 211250 | D: 21.7 s | R: 600.5652 | ALOSS: -200.9775 | CLOSS: 19.5227 | AUXLOSS: 0.0003
| train | E: 846 | S: 211500 | D: 21.5 s | R: 617.4011 | ALOSS: -201.0966 | CLOSS: 19.4789 | AUXLOSS: 0.0003
| train | E: 847 | S: 211750 | D: 21.5 s | R: 670.3287 | ALOSS: -200.8488 | CLOSS: 19.7286 | AUXLOSS: 0.0003
ERROR: Unexpected segmentation fault encountered in worker.
Traceback (most recent call last):
File "src/train.py", line 152, in <module>
main(args)
File "src/train.py", line 136, in main
agent.update(replay_buffer, L, step)
File "/home/kumars/Darshita/dmcontrol-generalization-benchmark/src/algorithms/soda.py", line 75, in update
self.update_critic(obs, action, reward, next_obs, not_done, L, step)
File "/home/kumars/Darshita/dmcontrol-generalization-benchmark/src/algorithms/sac.py", line 93, in update_critic
current_Q1, current_Q2 = self.critic(obs, action)
File "/home/kumars/anaconda3/envs/crc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/kumars/Darshita/dmcontrol-generalization-benchmark/src/algorithms/modules.py", line 248, in forward
return self.Q1(x, action), self.Q2(x, action)
File "/home/kumars/anaconda3/envs/crc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/kumars/Darshita/dmcontrol-generalization-benchmark/src/algorithms/modules.py", line 232, in forward
return self.trunk(torch.cat([obs, action], dim=1))
File "/home/kumars/anaconda3/envs/crc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/kumars/anaconda3/envs/crc/lib/python3.6/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/kumars/anaconda3/envs/crc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/kumars/anaconda3/envs/crc/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 103, in forward
return F.linear(input, self.weight, self.bias)
File "/home/kumars/anaconda3/envs/crc/lib/python3.6/site-packages/torch/nn/functional.py", line 1848, in linear
return torch._C._nn.linear(input, weight, bias)
File "/home/kumars/anaconda3/envs/crc/lib/python3.6/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 384930) is killed by signal: Segmentation fault.

@nicklashansen Can you please help in this regard?

Unable to find the robotic manipulation environment.

Thank you for your contribution. I would like to use the robotic manipulation environment (pushing the cube to the location of the red disc) used in the paper "Generalization in Reinforcement Learning by Soft Data Augmentation". Is this environment publicly available for everyone to use?

@nicklashansen Request your help.

Results on other domains

Hi, do we need to tune parameters for domains like humanoid?
I tried running the code, and training on the walker domain comes close to the reported results. However, when I use the same parameters for humanoid, the reward does not even reach double digits. Is this expected, or are there additional factors I should consider?

TypeError: make() got an unexpected keyword argument 'is_distracting_cs'

Hello,

When I ran the command python3 src/train.py --algorithm sac --seed 0, I got this error:

Traceback (most recent call last):
  File "../src/train.py", line 150, in <module>
    main(args)
  File "../src/train.py", line 50, in main
    mode='train'
  File "/home/mgz_21/0_Project/DMConrol-GB/src/env/wrappers.py", line 48, in make_env
    background_dataset_paths=paths
TypeError: make() got an unexpected keyword argument 'is_distracting_cs'

My main package versions are:

cudatoolkit               11.0.221             
dm-control                0.0.318066097
numpy                      1.19.5
python                    3.7.6
torch                     1.7.1

Thanks!

Question about data augmentation on target network

Thank you for your great work.

Could you please clarify whether the target network undergoes any data augmentation, including random shift (i.e., weak augmentation), in the SVEA? I am unsure if the random shifts are applied or not, in the target network.

Thank you.

TypeError: load() got an unexpected keyword argument 'setting_kwargs'

Hi. I just made a fresh install following the instructions on the readme file.
When I run

python3 src/train.py \
  --algorithm svea \
  --seed 0

I get
AttributeError: 'dict' object has no attribute 'env_specs'
This is easily solved by downgrading gym from 0.26.0 to 0.19.0.

Now, instead, I get the following:

/home/antonioricciardi/anaconda3/envs/dmcgb_orig/lib/python3.7/site-packages/glfw/__init__.py:916: GLFWError: (65544) b'X11: The DISPLAY environment variable is missing'
  warnings.warn(message, GLFWError)
Traceback (most recent call last):
  File "src/train.py", line 150, in <module>
    main(args)
  File "src/train.py", line 50, in main
    mode='train'
  File "/home/antonioricciardi/projects/dmcontrol-generalization-benchmark/src/env/wrappers.py", line 48, in make_env
    background_dataset_paths=paths
  File "/home/antonioricciardi/projects/dmcontrol-generalization-benchmark/src/env/dmc2gym/dmc2gym/__init__.py", line 64, in make
    return gym.make(env_id)
  File "/home/antonioricciardi/anaconda3/envs/dmcgb_orig/lib/python3.7/site-packages/gym/envs/registration.py", line 145, in make
    return registry.make(id, **kwargs)
  File "/home/antonioricciardi/anaconda3/envs/dmcgb_orig/lib/python3.7/site-packages/gym/envs/registration.py", line 90, in make
    env = spec.make(**kwargs)
  File "/home/antonioricciardi/anaconda3/envs/dmcgb_orig/lib/python3.7/site-packages/gym/envs/registration.py", line 60, in make
    env = cls(**_kwargs)
  File "/home/antonioricciardi/projects/dmcontrol-generalization-benchmark/src/env/dmc2gym/dmc2gym/wrappers.py", line 90, in __init__
    setting_kwargs=setting_kwargs
TypeError: load() got an unexpected keyword argument 'setting_kwargs'

Do you have any idea how I can solve this? Thank you!
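The traceback shows load() being called with a keyword it does not define, which usually means the load that got imported is not the one the calling code was written against. One generic way to check this is inspect.signature. The sketch below uses two hypothetical stand-in functions rather than the real dm_control API, and accepts_kwarg is an illustrative helper, not part of any library.

```python
import inspect

def accepts_kwarg(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())

# Hypothetical stand-ins for two versions of a loader function.
def load_stock(domain_name, task_name, task_kwargs=None):
    return (domain_name, task_name)

def load_patched(domain_name, task_name, task_kwargs=None, setting_kwargs=None):
    return (domain_name, task_name, setting_kwargs)

stock_ok = accepts_kwarg(load_stock, "setting_kwargs")
patched_ok = accepts_kwarg(load_patched, "setting_kwargs")
print(stock_ok, patched_ok)
```

Running the same check on the load function that is actually imported in the failing environment, and printing the __file__ attribute of the package it comes from, would show which installed copy is being picked up.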

Question about robot-push

Could you give me some help with the implementation of the robotic manipulation task? I cannot find such a task. Looking forward to your help.

Reproducibility SODA and SVEA conv

Hi Nicklas,
Thank you for your high-quality repo.
We have trouble reproducing your results on finger spin with SODA and SVEA (we get returns between 500 and 600).
Even in training, we don't achieve the performance shown.
Are there any special settings or configurations for this environment?

Best regards

Confusion about Capacity of ReplayBuffer

Hello authors, thanks for your great work.
I have one question about the capacity of the replay buffer in the code. According to the original DrQ code, the capacity is set by a hyperparameter whose default value is 100000. Is there any reason to set the capacity by train_steps?
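The difference between the two capacity choices can be illustrated with a small ring buffer. This is a hypothetical sketch, not the repository's or DrQ's ReplayBuffer: RingReplayBuffer and the toy sizes are assumptions made for illustration.

```python
import random

class RingReplayBuffer:
    """Hypothetical fixed-capacity buffer; the oldest transitions are overwritten."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.storage = []
        self.idx = 0  # next write position

    def add(self, transition):
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.idx] = transition
        self.idx = (self.idx + 1) % self.capacity

    def sample(self, batch_size):
        return random.sample(self.storage, batch_size)

    def __len__(self):
        return len(self.storage)

train_steps = 10
full = RingReplayBuffer(capacity=train_steps)  # capacity tied to the number of steps
small = RingReplayBuffer(capacity=4)           # capacity set by a separate hyperparameter

for step in range(train_steps):
    full.add(step)
    small.add(step)

print(sorted(full.storage))   # every transition is still present
print(sorted(small.storage))  # only the 4 most recent transitions remain
```

With capacity equal to the number of training steps nothing is ever evicted, at the cost of memory that grows with the training budget. A smaller fixed capacity bounds memory but discards old experience.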

TypeError: load() got an unexpected keyword argument 'setting_kwargs'

Hi @nicklashansen, when I ran the following command

python3 src/train.py \
  --algorithm svea \
  --seed 0

I got the following error -

 File "src/train.py", line 150, in <module>
    main(args)
  File "src/train.py", line 50, in main
    mode='train'
  File "/home/tejas/github/dmcontrol-generalization-benchmark/src/env/wrappers.py", line 48, in make_env
    background_dataset_paths=paths
  File "/home/tejas/github/dmcontrol-generalization-benchmark/src/env/dmc2gym/dmc2gym/__init__.py", line 64, in make
    return gym.make(env_id)
  File "/home/tejas/anaconda3/envs/dmcgb/lib/python3.7/site-packages/gym/envs/registration.py", line 235, in make
    return registry.make(id, **kwargs)
  File "/home/tejas/anaconda3/envs/dmcgb/lib/python3.7/site-packages/gym/envs/registration.py", line 129, in make
    env = spec.make(**kwargs)
  File "/home/tejas/anaconda3/envs/dmcgb/lib/python3.7/site-packages/gym/envs/registration.py", line 90, in make
    env = cls(**_kwargs)
  File "/home/tejas/github/dmcontrol-generalization-benchmark/src/env/dmc2gym/dmc2gym/wrappers.py", line 90, in __init__
    setting_kwargs=setting_kwargs
TypeError: load() got an unexpected keyword argument 'setting_kwargs'

I'm using the latest version of MuJoCo, i.e. 2.1.0. This looks to me like an error in dmc2gym, but I'm unable to resolve it.

Thanks!

The server does not have a graphics environment, use x11 instead

output:
Working directory: logs/walker_walk/svea/0
Observations: (9, 84, 84)
Cropped observations: (9, 84, 84)
Evaluating: logs/walker_walk/svea/0
......

And then it got stuck here.... Why?
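On a machine without a display, one thing worth trying is selecting a headless rendering backend through the MUJOCO_GL environment variable, which dm_control reads when it is imported. The sketch below assumes EGL is available on the server; it may or may not resolve this particular hang.

```python
import os

# Must run before `dm_control` is imported, because the backend is chosen at import time.
# "egl" assumes a GPU driver with EGL support; "osmesa" is the software fallback.
os.environ.setdefault("MUJOCO_GL", "egl")

chosen_backend = os.environ["MUJOCO_GL"]
print(f"MUJOCO_GL={chosen_backend}")
```

The same effect can be had from the shell by prefixing the training command with MUJOCO_GL=egl.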

Questions about std in SVEA paper

Hi, thanks for the great work!
I've noticed the reply "Hi, we compute the standard deviation over the mean episode returns of each seed" in a previous issue (#4).
However, I'm still a bit confused. Could you please confirm if my understanding is correct?

(Fig.5 Top) Training performance: std of 5 seeds
(Fig.5 Bottom) Test performance: For each seed, run zero-shot evaluation 30 times (args.eval_episode) and calculate the mean from these 30 Return values (resulting in 1 mean value per seed). Then compute std using these 5 mean values.

Thank you!

nicklashansen / dmcontrol-generalization-benchmark
