abaisero / asym-rlpo
Asymmetric methods for partially observable reinforcement learning
License: MIT License
Hi again,
I'm having some issues running experiments with the gridverse envs. I think the problem is in the way the environment is created with the wrapper that adds the latent space on top. In the script asym_rlpo/envs/env_gym.py, the function make_gym_env gets called and EnvironmentType.OTHER gets assigned to the environment. I tried bypassing this by assigning an EnvironmentType.GV when a GV env is used, but now something else is wrong and the latent_space shows None. At this point I'm not sure whether I should continue, or whether there is just a small change somewhere else that I need to make.
Thanks
Amit
Hi,
Again, just trying to reproduce things with the GV envs, I'm trying to run the following:
python main_a2c.py ../gym-gridverse/gym_gridverse/registered_envs/gv_memory_four_rooms.7x7.yaml a2c --gv-state-grid-model-type cnn
But I get an error. My machine has both a GPU and a CPU, and the computations are not all being done on the same device: the input ends up on the GPU while the model weights are still on the CPU. I'm not sure exactly how to fix this.
Loading using gym.make
Environment with id ../gym-gridverse/gym_gridverse/registered_envs/gv_memory_four_rooms.7x7.yaml not found. Trying as a GV YAML environment.
Loading using YAML
/gv_memory_four_rooms.7x7.yaml a2c
Traceback (most recent call last):
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/main_a2c.py", line 684, in <module>
    raise SystemExit(main())
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/main_a2c.py", line 648, in main
    done = run(runstate)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/main_a2c.py", line 471, in run
    episodes = sample_episodes(
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/sampling.py", line 61, in sample_episodes
    return [
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/sampling.py", line 62, in <listcomp>
    sample_episode(env, policy, render=render) for _ in range(num_episodes)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/sampling.py", line 23, in sample_episode
    policy.reset(numpy2torch(observation))
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/policies.py", line 46, in reset
    self.history_integrator.reset(observation)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/features.py", line 164, in reset
    input_features = self.compute_input_features(
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/features.py", line 126, in compute_input_features
    return compute_input_features(
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/features.py", line 21, in compute_input_features
    observation_features = observation_model(gtorch.to(observation, device))
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/representations/gv.py", line 97, in forward
    return self.fc_model(self.cat_representation(inputs))
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/representations/cat.py", line 21, in forward
    [representation(inputs) for representation in self.representations],
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/representations/cat.py", line 21, in <listcomp>
    [representation(inputs) for representation in self.representations],
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/representations/gv.py", line 256, in forward
    cnn_output = self.cnn(cnn_input)
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 457, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 453, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
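The RuntimeError above says the convolution received a CUDA input while its weights were still CPU tensors. The following is a minimal, generic PyTorch sketch of that situation and one common way to avoid it. It is not taken from asym-rlpo: the small cnn model and the module_device helper are hypothetical stand-ins, and where the repository actually moves its models to a device is not shown here.

```python
import torch
import torch.nn as nn


def module_device(module: nn.Module) -> torch.device:
    """Return the device of the module's first parameter (hypothetical helper)."""
    return next(module.parameters()).device


# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for a grid CNN; NOT the model defined in asym_rlpo/representations/gv.py.
cnn = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())

# Moving the module moves its weights. Skipping this step while still moving the
# input to the GPU produces the "Input type ... and weight type ..." error.
cnn.to(device)

# Send the input to wherever the weights live, so both always agree.
grid = torch.zeros(1, 3, 7, 7)
output = cnn(grid.to(module_device(cnn)))
print(tuple(output.shape), output.device)
```

Deriving the input's device from the module, rather than from a separately stored device variable, is one way to keep the two from drifting apart; moving every submodule to the chosen device once after construction is the other half of the fix.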
Hello again!
I've been trying to run experiments on the following shopping environments:
Shopping 5 - POMDP-shopping_5-episodic-v1
Shopping 6 - POMDP-shopping_6-episodic-v1
But I keep running into some strange memory-related issues. I'm not sure why, but even simply loading the env with a gym.make call sometimes causes the system to hang. Did you face similar issues while running experiments on these environments?
Thanks!
Amit
Hello, thanks for providing code for the paper. I'm able to install all the requirements, but I still get an error while trying to run main_a2c.py
python main_a2c.py
Traceback (most recent call last):
  File "/home/fishy/python_ws/aais-baisero/code/asym-rlpo/main_a2c.py", line 16, in <module>
    from asym_rlpo.algorithms import A2C_ABC, make_a2c_algorithm
  File "/home/fishy/python_ws/aais-baisero/code/asym-rlpo/asym_rlpo/algorithms/__init__.py", line 3, in <module>
    from asym_rlpo.envs import Environment
  File "/home/fishy/python_ws/aais-baisero/code/asym-rlpo/asym_rlpo/envs/__init__.py", line 12, in <module>
    from .env_gym import make_gym_env
  File "/home/fishy/python_ws/aais-baisero/code/asym-rlpo/asym_rlpo/envs/env_gym.py", line 8, in <module>
    import gym_pomdps
ModuleNotFoundError: No module named 'gym_pomdps'
Not sure where to get this module exactly! I'm just trying to reproduce some experiments from your paper:
https://arxiv.org/abs/2105.11674
Thanks!
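Since the failure above is an import that is not covered by the installed requirements, a quick check of which modules are missing can save a few failed runs. The sketch below uses only the standard library. The two module names are the ones that appear in these threads (gym_pomdps in the traceback, gym_gridverse in the YAML path); whether the repository needs further packages is not known here, and missing_modules is a hypothetical helper.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from the threads; extend the list as further errors appear.
required = ["gym_pomdps", "gym_gridverse"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```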
Hello!
We're working on a paper in which we would like to compare against the baselines from your paper, particularly the A2C-asym-hs and A2C baselines, for the following envs:
Heaven-Hell-3
Heaven-Hell-4
Shopping-5
Shopping-6
Car-Flag
Cleaner
I would really appreciate it if you could make the numerical data that is plotted available in any format in this repo (or just share it with me if you don't want to make it public). I just need it to compare plots!
Thank you
Amit
Hi,
I'm trying to find all the environments used in the paper. I know the two gridverse envs are in the gridverse package. For the rest, I looked at the output of
from gym import envs; print(envs.registry.all())
I was able to find these 4, which I believe are the same versions used in the paper.
Heaven-Hell-3 - POMDP-heavenhell_3-episodic-v0
Heaven-Hell-4 - POMDP-heavenhell_4-episodic-v0
Shopping 5 - POMDP-shopping_5-episodic-v1
Shopping 6 - POMDP-shopping_6-episodic-v1
However, I'm not sure about the Car-Flag and Cleaner envs. Could you please let me know where I could find them?
Thanks
Amit
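Scanning the full registry printout by eye is error-prone, so a small filter over the registered ids can help. The sketch below is a plain-Python helper (find_env_ids is a hypothetical name) that matches ids loosely, ignoring case, hyphens, and underscores. The four ids are the ones listed in the post; the commented lines show how the full list could be obtained with the registry call used there, which belongs to older gym versions.

```python
def find_env_ids(env_ids, keywords):
    """Return the ids that contain any keyword, ignoring case, '-' and '_'."""

    def normalize(text):
        return text.lower().replace("-", "").replace("_", "")

    wanted = [normalize(keyword) for keyword in keywords]
    return [
        env_id
        for env_id in env_ids
        if any(word in normalize(env_id) for word in wanted)
    ]


# Ids quoted in the post. With the older gym API used there, the full list
# could instead be built as:
#   from gym import envs
#   env_ids = [spec.id for spec in envs.registry.all()]
env_ids = [
    "POMDP-heavenhell_3-episodic-v0",
    "POMDP-heavenhell_4-episodic-v0",
    "POMDP-shopping_5-episodic-v1",
    "POMDP-shopping_6-episodic-v1",
]

print(find_env_ids(env_ids, ["Heaven-Hell", "Shopping"]))
# No match among the four ids above, consistent with the question in the post.
print(find_env_ids(env_ids, ["Car-Flag", "Cleaner"]))
```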