
themtank / cups-rl

47 stars · 8 watchers · 7 forks · 56.56 MB

Customisable Unified Physical Simulations (CUPS) for Reinforcement Learning. Experiments run on the ai2thor environment (http://ai2thor.allenai.org/), e.g. using A3C, RainbowDQN, and A3C_GA (Gated Attention multi-modal fusion) for Task-Oriented Language Grounding (tasks specified by natural-language instructions), e.g. "Pick up the Cup or else".

Home Page: http://www.themtank.org

License: MIT License

Python 100.00%
reinforcement-learning robotics cups simulated-environments multi-task-learning cup a3c rainbow model-based transfer-learning

cups-rl's People

Contributors

beduffy, fernandotorch


cups-rl's Issues

Reset agent but not the environment

Hi,
I suppose

state, done = env.reset(), False

in lines 141 and 157 of Rainbow's main.py resets both the agent and the environment. After resetting the environment, will object locations be randomized or kept in their initial positions? And is there a way to reset only the agent's position while keeping the environment as it is?

Thank you.
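
A hedged sketch for the second part: since ai2thor_env.py keeps the raw controller on self.controller (visible in the traceback of the env-start issue below), one can record the agent's initial pose and teleport back to it instead of calling env.reset(), leaving objects untouched. The Teleport action is part of raw ai2thor, though its exact signature varies by version. As for randomization: a plain scene reset restores default object placements unless a randomization action (e.g. ai2thor's InitialRandomSpawn) is explicitly invoked.

    # Record the initial agent pose once, right after the first reset.
    start = env.controller.last_event.metadata['agent']['position']

    # Later, instead of env.reset(), move only the agent back to its start;
    # object placements in the scene are left exactly as they are.
    env.controller.step(action='Teleport', x=start['x'], y=start['y'], z=start['z'])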

Decouple RL algorithm from gym wrapper

Hi, could you provide some guidelines on how to decouple the RL algorithms (A3C, Rainbow) from the gym wrapper? I would like to make ai2thor behave like an ordinary gym sub-environment, so that creating it could be done with env = gym.make(args.env_id), where env_id refers to ai2thor, without the RL algorithm being involved. I tried to just use

env = FrameStackEnv(AI2ThorEnv(config_file=args.config_file), args.history_length, args.device)

but this line asks for history_length and device, which I suppose are arguments for the RL algorithm. Also, the Env class lives under the algorithm's directory, so everything seems tightly coupled. Is there some way to separate them?

I would really appreciate any guidance on this issue.
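
One hedged approach, assuming AI2ThorEnv implements the standard gym.Env interface: register it with gym's registry so gym.make works without touching the RL code, and keep algorithm-specific wrappers such as FrameStackEnv on the algorithm side. The id 'AI2Thor-v0' and the config path below are illustrative, not taken from the repo.

    import gym
    from gym.envs.registration import register

    # Register the environment once (e.g. in gym_ai2thor/__init__.py) so that
    # gym.make('AI2Thor-v0') works anywhere, independent of A3C/Rainbow.
    register(
        id='AI2Thor-v0',
        entry_point='gym_ai2thor.envs.ai2thor_env:AI2ThorEnv',
        kwargs={'config_file': 'config_files/rainbow_example.json'},  # illustrative path
    )

    env = gym.make('AI2Thor-v0')  # no history_length or device needed here

Frame stacking would then be applied only by the algorithm that needs it, e.g. env = FrameStackEnv(gym.make('AI2Thor-v0'), history_length, device).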

Regarding the input of Rainbow

Hi, may I ask whether Rainbow takes the metadata (especially metadata['objects']) as input? I feel that by taking only the frame, a lot of useful information is missed. If the metadata is taken as input, how is it fed in? Is it through some embedding method? Thanks for replying.
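
Judging from the question, Rainbow here consumes only stacked frames. A hedged sketch of one way to also feed metadata, assuming you first flatten metadata['objects'] into a fixed-length numeric vector yourself; the network and every name below are illustrative, not the repo's actual architecture:

    import torch
    import torch.nn as nn

    class FusionNet(nn.Module):
        """Illustrative fusion of frame features with an embedded metadata vector."""
        def __init__(self, meta_dim, num_actions, history_length=4):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(history_length, 32, 8, stride=4), nn.ReLU(),
                nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d((4, 4)),  # fixed-size output for any frame size
                nn.Flatten(),
            )
            self.meta_fc = nn.Linear(meta_dim, 64)               # embed the metadata vector
            self.head = nn.Linear(64 * 4 * 4 + 64, num_actions)

        def forward(self, frames, meta):
            img = self.conv(frames)
            emb = torch.relu(self.meta_fc(meta))
            return self.head(torch.cat([img, emb], dim=1))       # concatenate, then predict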

Regarding the policy/model/weights

Do you mind clarifying whether the policy/model/weights are saved after each epoch/iteration? If not, how should I make that happen? If so, where are they saved? I see that you just call the model 'args.model_path' in agent.py and save(self, path, filename) without specifically assigning a path or name.

Thank you for replying.
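
On saving: a minimal sketch of periodic checkpointing with PyTorch, assuming a model object like the repo's dqn; the interval flag, the online_net attribute, and the filename pattern are all illustrative assumptions:

    import os
    import torch

    def save_checkpoint(model, step, path='results'):
        """Save the model's weights; the caller decides how often to call this."""
        os.makedirs(path, exist_ok=True)
        filename = os.path.join(path, 'model_step_{}.pth'.format(step))
        torch.save(model.state_dict(), filename)
        return filename

    # e.g. inside the training loop (flag and attribute names are hypothetical):
    # if num_steps % args.checkpoint_interval == 0:
    #     save_checkpoint(dqn.online_net, num_steps)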

Multi-agent

Hi, does ai2thor_env allow multiple agents to run in the same room?
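
The cups-rl wrapper does not obviously expose this, but recent raw ai2thor versions support multiple agents in one scene via the Controller's agentCount option; a minimal sketch against raw ai2thor (version-dependent, so treat it as an assumption to verify):

    import ai2thor.controller

    # Two agents in the same scene; each step addresses one agent by agentId.
    controller = ai2thor.controller.Controller(agentCount=2)
    controller.reset('FloorPlan28')

    event = controller.step(action='MoveAhead', agentId=0)
    event = controller.step(action='RotateRight', agentId=1)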

Are there any ways to improve training speed?

I am currently training the Rainbow model on a GPU. I tried to train 2 models simultaneously; however, the training became much slower. Do you have any suggestions for how I can improve my training speed? I am training them on a GeForce RTX 2080 Ti.

Error starting AI2THOR Env

Unable to start an Env. I tried this:
python random_walk.py

/home/kb/anaconda3/lib/python3.6/site-packages/ai2thor/controller.py:1152: UserWarning: start method depreciated. The server started when the Controller was initialized.
"start method depreciated. The server started when the Controller was initialized."
Traceback (most recent call last):
File "random_walk.py", line 15, in
env = AI2ThorEnv(config_dict=config_dict)
File "/home/kb/CUPS_RL/cups-rl/gym_ai2thor/envs/ai2thor_env.py", line 119, in init
self.controller.start()
File "/home/kb/anaconda3/lib/python3.6/site-packages/ai2thor/controller.py", line 1163, in start
self.server.start()
File "/home/kb/anaconda3/lib/python3.6/site-packages/ai2thor/fifo_server.py", line 203, in start
os.mkfifo(self.server_pipe_path)
FileExistsError: [Errno 17] File exists
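
A hedged reading of this traceback: the UserWarning says newer ai2thor already starts the server when the Controller is constructed, so the explicit self.controller.start() in ai2thor_env.py line 119 then trips over the FIFO pipe left behind by the constructor (or by a previous crashed run). Deleting a stale pipe before launching is one workaround; the pipe path below is hypothetical and should be taken from the FileExistsError message:

    import os

    # Delete the stale named pipe that os.mkfifo refuses to overwrite.
    # Replace this placeholder with the actual path from the error message.
    stale_pipe = '/tmp/thor_fifo_pipe'  # hypothetical path
    if os.path.exists(stale_pipe):
        os.remove(stale_pipe)

Alternatively, guarding or removing the deprecated self.controller.start() call in ai2thor_env.py should avoid the collision entirely.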

About task "place cup in microwave"

I'd like to ask how to create a task like "place cup in microwave". I think it's a hard task that consists of two simpler ones (picking up the cup, then placing it in the microwave). Should I create it as two separate tasks, or as one harder task whose target involves two kinds of objects?
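
One hedged way to structure it is a single task with an internal stage, rewarding the pick-up sub-goal and then the place sub-goal. The class below is a sketch only: the transition_reward method and its (reward, done) return are guesses at the tasks.py interface, and the metadata keys (isPickedUp, receptacleObjectIds) are ai2thor conventions that may differ across versions.

    class PlaceCupInMicrowaveTask:
        """Hypothetical two-stage task: pick up a Cup, then place it in a Microwave."""
        def __init__(self):
            self.stage = 'pick_up'  # advances to 'place' once the cup is held

        def transition_reward(self, event):
            objects = event.metadata['objects']
            if self.stage == 'pick_up':
                if any(o['objectType'] == 'Cup' and o.get('isPickedUp') for o in objects):
                    self.stage = 'place'
                    return 1.0, False  # sub-goal reward; episode continues
            elif self.stage == 'place':
                for o in objects:
                    if o['objectType'] == 'Microwave':
                        inside = o.get('receptacleObjectIds') or []
                        if any('Cup' in obj_id for obj_id in inside):
                            return 10.0, True  # task complete
            return 0.0, False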

Calling different tasks for different iterations

Hi,
May I check whether it is possible to define multiple tasks in tasks.py and call just one of them based on a certain condition? If it is possible, how do I do it?
My guess is to modify the rainbow_example.json config with an if-condition (I am using the Rainbow algorithm to train the model), but since the task in that file is defined as a single entry, I'm not sure how exactly to implement that.

Thank you very much.
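
Since JSON cannot express conditions, a hedged alternative is to load the config in Python, swap the task entry programmatically, and pass the resulting dict to the environment. The config_dict keyword is grounded in the traceback of the env-start issue above (AI2ThorEnv(config_dict=config_dict)), while the 'task' keys below mirror the question rather than a verified schema:

    import json
    import random

    with open('rainbow_example.json') as f:
        config = json.load(f)

    # Pick one of several task definitions based on some condition,
    # e.g. at random per run. The task entries here are illustrative.
    if random.random() < 0.5:
        config['task'] = {'task_name': 'PickUpTask', 'target_object': 'Mug'}
    else:
        config['task'] = {'task_name': 'PickUpTask', 'target_object': 'Apple'}

    # env = AI2ThorEnv(config_dict=config)  # pass the modified dict instead of the file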

Error after num_steps = 100000 / 50000000

Hi,
I was just running Rainbow directly with all default settings, and this error popped up after 100000 steps:

Resetting environment and starting new episode
eval step 200
eval step 400
eval step 600
eval step 800
eval step 1000
Reached maximum episode length: 1000
Traceback (most recent call last):

File "", line 1, in
runfile('/home/user/Documents/Zeyu/cups-rl/main.py', wdir='/home/user/Documents/Zeyu/cups-rl')

File "/home/user/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
execfile(filename, namespace)

File "/home/user/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "/home/user/Documents/Zeyu/cups-rl/main.py", line 182, in
avg_reward, avg_Q = test(env, num_steps, args, dqn, val_mem)

File "/home/user/Documents/Zeyu/cups-rl/algorithms/rainbow/test.py", line 76, in test
_plot_line(eval_steps, rewards, 'Reward', path='results')

File "/home/user/Documents/Zeyu/cups-rl/algorithms/rainbow/test.py", line 112, in _plot_line
}, filename=os.path.join(path, title + '.html'), auto_open=False)

File "/home/user/anaconda3/lib/python3.7/site-packages/plotly/offline/offline.py", line 596, in plot
auto_open=auto_open,

File "/home/user/anaconda3/lib/python3.7/site-packages/plotly/io/_html.py", line 527, in write_html
with open(file, "w") as f:

FileNotFoundError: [Errno 2] No such file or directory: 'results/Reward.html'

And indeed there is no results/Reward.html in the repo. Do I need to create it myself?
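
You don't need to create Reward.html itself; plotly writes it, but only if the results directory already exists, which is what the final FileNotFoundError is complaining about. A one-line fix before the plotting call:

    import os

    # Create the output directory (harmless if it already exists); plotly's
    # write_html can then create results/Reward.html inside it.
    os.makedirs('results', exist_ok=True)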

One thing to note is that I copied Rainbow's main.py to directly under cups-rl and ran it from there, because otherwise lines such as:

from algorithms.rainbow.agent import Agent

return an error when main.py is run from cups-rl/algorithms/rainbow.

Thanks for your reply.
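
On the copied main.py: absolute imports like algorithms.rainbow.agent resolve only if the repository root is on sys.path (standard Python behavior, not repo-specific). Instead of moving the file, one can either run it as a module from the repo root (python -m algorithms.rainbow.main) or prepend the root explicitly, as in this sketch:

    import os
    import sys

    # Put the repository root (two directories above this file) on sys.path so
    # 'from algorithms.rainbow.agent import Agent' resolves when main.py is run
    # from cups-rl/algorithms/rainbow.
    repo_root = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..'))
    sys.path.insert(0, repo_root)

    from algorithms.rainbow.agent import Agent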

"open_close_interaction": true gives error

Hi, when I set "open_close_interaction": true, meaning that I want the agent to be able to open/close openable objects, it returns the following error:

Traceback (most recent call last):

File "", line 1, in
runfile('/home/user/Documents/Zeyu/cups-rl/main.py', wdir='/home/user/Documents/Zeyu/cups-rl')

File "/home/user/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
execfile(filename, namespace)

File "/home/user/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "/home/user/Documents/Zeyu/cups-rl/main.py", line 142, in
next_state, _, done, _ = env.step(env.action_space.sample())

File "/home/user/Documents/Zeyu/cups-rl2/algorithms/rainbow/env.py", line 140, in step
state, reward, done, info = self.env.step(action)

File "/home/user/Documents/Zeyu/cups-rl2/gym_ai2thor/envs/ai2thor_env.py", line 175, in step
obj['distance'] < distance and not obj['isopen'] and \

KeyError: 'isopen'

I tried this with a fresh copy of your repo, without changing anything else.
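
A plausible cause, assuming your installed ai2thor is newer than the one the repo was written against: the per-object metadata key changed casing from 'isopen' to 'isOpen' at some point, so the lookup in ai2thor_env.py line 175 raises KeyError. A version-tolerant sketch:

    def is_open(obj):
        """Read the open/closed flag across ai2thor versions (key casing changed)."""
        if 'isOpen' in obj:              # newer metadata casing
            return obj['isOpen']
        return obj.get('isopen', False)  # older metadata casing

    # In ai2thor_env.py's step(), `not obj['isopen']` would become `not is_open(obj)`.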

Plotting the result

Hi,
I have another question. After I run Rainbow's main.py, no plot comes out, even though test.py contains a _plot_line call that should be reached when main.py executes. After checking, I discovered that in main.py, line 149:

avg_reward, avg_Q = test(env, mem_steps, args, dqn, val_mem, evaluate_only=True)

and line 179:

if num_steps % args.evaluation_interval == 0:

(and hence line 182, avg_reward, avg_Q = test(env, num_steps, args, dqn, val_mem))

were never executed, which means that the conditions

if args.evaluate_only:  # line 147

and

if num_steps >= args.learn_start:  # line 172

were never satisfied.

I didn't modify any part of the code except changing --max-num-steps (line 34) to 1000, in order to check results promptly.

Do you have any clue what the cause is?

Thank you.
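
A likely explanation, assuming this port keeps Rainbow's usual defaults: evaluation (and hence _plot_line) only runs once num_steps >= args.learn_start, and standard Rainbow implementations set learn_start in the tens of thousands of steps, far beyond a 1000-step run. A quick sanity check, with illustrative values rather than ones verified against the repo:

    # Evaluation can only trigger if the run survives past learn_start and
    # crosses at least one evaluation interval. Values below are illustrative.
    max_num_steps = 1000
    learn_start = 80000          # typical order of magnitude in Rainbow ports
    evaluation_interval = 10000

    will_evaluate = max_num_steps >= learn_start and max_num_steps >= evaluation_interval
    print(will_evaluate)  # False -> lower --learn-start or raise --max-num-steps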

Make random actions in Rainbow

Hi, may I ask whether it is possible to make a sequence of random actions by adding a few lines of code to Rainbow's main.py?
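
A minimal sketch, assuming the wrapped env exposes the standard gym interface; env.step(env.action_space.sample()) already appears in main.py, per the traceback in the open_close_interaction issue above:

    # Take a fixed number of random actions, e.g. for warm-up or debugging.
    num_random_steps = 50  # illustrative
    state = env.reset()
    for _ in range(num_random_steps):
        state, reward, done, info = env.step(env.action_space.sample())
        if done:
            state = env.reset()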
