abaisero / asym-rlpo

Asymmetric methods for partially observable reinforcement learning

License: MIT License

Python 98.48% Shell 1.52%

asym-rlpo's Issues

Not able to run GV envs

Hi again,

I'm having trouble running experiments with the gridverse (GV) envs. I think the problem is in how the environment is created with the wrapper that adds the latent space on top. In asym_rlpo/envs/env_gym.py, make_gym_env gets called and the environment is assigned EnvironmentType.OTHER. I tried to bypass this by adding an EnvironmentType.GV when a GV env is used, but now something else is wrong: latent_space is None. At this point I'm not sure whether I should keep digging or whether there is just a small change I need to make somewhere else.
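
The kind of dispatch described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: the enum members, the rule "a YAML path means GV", and the helper names `guess_environment_type` and `require_latent_space` are all hypothetical.

```python
import enum


class EnvironmentType(enum.Enum):
    """Hypothetical environment categories; the repo's real enum may differ."""

    GV = enum.auto()
    OTHER = enum.auto()


def guess_environment_type(env_id: str) -> EnvironmentType:
    # Assumption for illustration: GV environments are passed as YAML paths,
    # everything else is treated as a regular gym id.
    if env_id.endswith((".yaml", ".yml")):
        return EnvironmentType.GV
    return EnvironmentType.OTHER


def require_latent_space(latent_space, env_id: str):
    # Fail early with a readable message instead of passing None downstream.
    if latent_space is None:
        raise ValueError(f"environment {env_id!r} has no latent space")
    return latent_space
```

Checking for a missing latent space right after the environment is built makes it easier to tell a wrapper problem apart from a later model or device problem.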

Thanks
Amit

Device related issue when using GV with --gv-state-grid-model-type=cnn

Hi,

While trying to reproduce results with the GV envs, I'm running the following:
python main_a2c.py ../gym-gridverse/gym_gridverse/registered_envs/gv_memory_four_rooms.7x7.yaml a2c --gv-state-grid-model-type cnn
But I get an error: my machine has both a GPU and a CPU, and the computations are not all being done on the same device. I'm not sure exactly how to fix this.

Loading using gym.make
Environment with id ../gym-gridverse/gym_gridverse/registered_envs/gv_memory_four_rooms.7x7.yaml not found. Trying as a GV YAML environment.
Loading using YAML
/gv_memory_four_rooms.7x7.yaml a2c

Traceback (most recent call last):
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/main_a2c.py", line 684, in <module>
    raise SystemExit(main())
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/main_a2c.py", line 648, in main
    done = run(runstate)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/main_a2c.py", line 471, in run
    episodes = sample_episodes(
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/sampling.py", line 61, in sample_episodes
    return [
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/sampling.py", line 62, in <listcomp>
    sample_episode(env, policy, render=render) for _ in range(num_episodes)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/sampling.py", line 23, in sample_episode
    policy.reset(numpy2torch(observation))
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/policies.py", line 46, in reset
    self.history_integrator.reset(observation)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/features.py", line 164, in reset
    input_features = self.compute_input_features(
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/features.py", line 126, in compute_input_features
    return compute_input_features(
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/features.py", line 21, in compute_input_features
    observation_features = observation_model(gtorch.to(observation, device))
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/representations/gv.py", line 97, in forward
    return self.fc_model(self.cat_representation(inputs))
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/representations/cat.py", line 21, in forward
    [representation(inputs) for representation in self.representations],
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/representations/cat.py", line 21, in <listcomp>
    [representation(inputs) for representation in self.representations],
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fishy/python_ws/aais-baisero3/code/asym-rlpo/asym_rlpo/representations/gv.py", line 256, in forward
    cnn_output = self.cnn(cnn_input)
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 457, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/fishy/python_ws/aais-baisero3/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 453, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
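
The RuntimeError above says the convolution received a CUDA input while its weights were still on the CPU. One common cause of this in PyTorch, not confirmed for this repository, is that sub-models kept in a plain Python list are not registered with the parent module, so calling `.to(device)` on the parent does not move them. A minimal sketch, where `CatSketch` is a hypothetical stand-in and not the repo's class:

```python
import torch
import torch.nn as nn


class CatSketch(nn.Module):
    """Hypothetical stand-in: concatenates the outputs of several sub-models."""

    def __init__(self, representations):
        super().__init__()
        # nn.ModuleList registers the sub-models, so .to(device) moves their
        # weights too; a plain Python list would leave them on the CPU.
        self.representations = nn.ModuleList(representations)

    def forward(self, inputs):
        return torch.cat([r(inputs) for r in self.representations], dim=-1)


# Falls back to the CPU when no GPU is present, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

cnn = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
)
model = CatSketch([cnn]).to(device)

# Input and weights now live on the same device.
grid = torch.zeros(1, 3, 7, 7, device=device)
output = model(grid)

# Collect the device types of all parameters to confirm nothing was left behind.
weight_devices = {p.device.type for p in model.parameters()}
```

If `weight_devices` contains more than one entry after `.to(device)`, some parameters were never registered or never moved, which is a useful first thing to check.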

Memory related issues in Shopping envs

Hello again!

I've been trying to run experiments on the following shopping environments:

Shopping 5    - POMDP-shopping_5-episodic-v1
Shopping 6    - POMDP-shopping_6-episodic-v1

But I keep running into strange memory-related issues. I'm not sure why, but even just loading the env with a gym.make call sometimes causes the system to hang. Did you face similar issues while running experiments on these environments?
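
One way to narrow down an intermittent hang like this is to run the load in a separate Python process with a timeout, so a stuck call cannot freeze the main session. A minimal sketch using only the standard library; the helper name `runs_within` is hypothetical:

```python
import subprocess
import sys


def runs_within(code: str, timeout_s: float) -> bool:
    """Run `code` in a fresh Python process.

    Returns True if it exits successfully before the timeout, and False if it
    times out or exits with an error.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            timeout=timeout_s,
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return False
    return result.returncode == 0
```

For the case in this post, `code` could be a short snippet that imports gym and calls gym.make with one of the shopping ids, assuming whatever import registers those ids is included in the snippet.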

Thanks!
Amit

main_a2c.py doesn't run because of missing gym_pomdps module

Hello, thanks for providing the code for the paper. I was able to install all the requirements, but I still get an error when trying to run main_a2c.py:

python main_a2c.py

Traceback (most recent call last):
  File "/home/fishy/python_ws/aais-baisero/code/asym-rlpo/main_a2c.py", line 16, in <module>
    from asym_rlpo.algorithms import A2C_ABC, make_a2c_algorithm
  File "/home/fishy/python_ws/aais-baisero/code/asym-rlpo/asym_rlpo/algorithms/__init__.py", line 3, in <module>
    from asym_rlpo.envs import Environment
  File "/home/fishy/python_ws/aais-baisero/code/asym-rlpo/asym_rlpo/envs/__init__.py", line 12, in <module>
    from .env_gym import make_gym_env
  File "/home/fishy/python_ws/aais-baisero/code/asym-rlpo/asym_rlpo/envs/env_gym.py", line 8, in <module>
    import gym_pomdps
ModuleNotFoundError: No module named 'gym_pomdps'
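
A quick way to see which dependencies are missing before launching a script is to ask the import system directly. A minimal sketch with the standard library; `missing_modules` is a hypothetical helper, and the module names passed to it are whatever the script is expected to import:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    # find_spec returns None when no importable module with that name exists.
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For example, `missing_modules(["gym", "gym_pomdps"])` would return the subset of those two names that is not installed in the current environment.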

I'm not sure where to get this module. I'm just trying to reproduce some experiments from your paper:
https://arxiv.org/abs/2105.11674

Thanks!

Making the baseline data plotted in the graphs available in this repo

Hello!

We're working on a paper in which we would like to compare against the baselines from your paper, in particular the A2C-asym-hs and A2C baselines for the following envs:

Heaven-Hell-3
Heaven-Hell-4
Shopping-5
Shopping-6
Car-Flag
Cleaner

I would really appreciate it if you could make the numerical data that is plotted available in any format in this repo (or just share it with me if you don't want to make it public). I just need it to compare plots!

Thank you
Amit

Verifying environments used in the paper in this codebase

Hi,

I'm just trying to find all the environments in the paper. I know the two gridverse envs are in the gridverse package.

I ran from gym import envs; print(envs.registry.all()) and looked through the output to find all the envs used in the paper.

I was able to find these 4, which I believe are the same versions used in the paper.

Heaven-Hell-3 - POMDP-heavenhell_3-episodic-v0
Heaven-Hell-4 - POMDP-heavenhell_4-episodic-v0
Shopping 5    - POMDP-shopping_5-episodic-v1
Shopping 6    - POMDP-shopping_6-episodic-v1
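
Scanning a long registry listing by eye is error-prone, so the ids can be filtered and grouped instead. A minimal sketch, assuming the ids follow the POMDP-name-kind-vN pattern seen in the four ids above; the helper `group_pomdp_ids` is hypothetical:

```python
import re
from collections import defaultdict

# Pattern inferred from the ids listed above, e.g. POMDP-shopping_5-episodic-v1.
POMDP_ID = re.compile(r"^POMDP-(?P<name>.+)-(?P<kind>[a-z]+)-v(?P<version>\d+)$")


def group_pomdp_ids(env_ids):
    """Group environment ids matching the POMDP pattern by their base name."""
    groups = defaultdict(list)
    for env_id in env_ids:
        match = POMDP_ID.match(env_id)
        if match:
            groups[match["name"]].append(env_id)
    return dict(groups)
```

The input list would come from the registry call mentioned in the post; in the gym versions that provide envs.registry.all(), each returned spec has an id attribute that can be collected into such a list.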

However, I'm not sure about the Car-Flag and Cleaner envs. Could you please let me know where I could find them?

Thanks
Amit
