airsplay / R2R-EnvDrop
PyTorch code of the NAACL 2019 paper "Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout"
License: MIT License
Hi, thank you for sharing the code. Could you please list all the argument settings needed to reproduce the paper results?
I set up the environment as described in the README.md file. However, I ran into an issue while installing the Matterport3D simulator when executing cmake -DEGL_RENDERING=ON ..:
(envdrop) jinggu@jinggu-MS-7B79:~/Projects/R2R-EnvDrop/build$ cmake -DEGL_RENDERING=ON ..
CMake Deprecation Warning at CMakeLists.txt:2 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
CMake Error at CMakeLists.txt:14 (find_package):
By not providing "FindOpenCV.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "OpenCV", but
CMake did not find one.
Could not find a package configuration file provided by "OpenCV" with any
of the following names:
OpenCVConfig.cmake
opencv-config.cmake
Add the installation prefix of "OpenCV" to CMAKE_PREFIX_PATH or set
"OpenCV_DIR" to a directory containing one of the above files. If "OpenCV"
provides a separate development package or SDK, be sure it has been
installed.
-- Configuring incomplete, errors occurred!
See also "/home/jinggu/Projects/R2R-EnvDrop/build/CMakeFiles/CMakeOutput.log".
Do you have any idea how to solve this issue?
Hi, is there any way to test your model with beam search?
I saw the related code in train.py and agent.py ("beam_valid", "beam_search_test").
Could you please explain how to use it?
Hi,
We are trying to retrain the EnvDrop model based on this repo, but the results do not match those reported in the paper. We have tried different PyTorch versions; our best result with PyTorch 0.4.1 is 0.46 SPL on the val-unseen split, which is lower than the reported 48%. For detailed results, please refer to the attachment below.
Have we missed something important? Or could you specify your working environment?
Our retrained model:
retrained_envdrop_results.xlsx
Hello
I am trying to reproduce the results using Google Colab. When running r2r_src/train.py I get a file-not-found error ('tasks/R2R/data/R2R_train.json'). I found a download script in tasks/R2R/data, but I get 404 errors on some files.
Can you provide those files?
Dear Sir:
Hello! I'm writing to ask whether you could release the code for "Submitted to VLN test server" so that the results can be evaluated. Thank you very much!
Line 265 in c416108
I'm curious why the elevation of the navigable candidates is not updated relative to the base view the way the heading is, e.g.:
loc_heading = normalized_heading - base_heading
loc_elevation = normalized_elevation - base_elevation
R2R-EnvDrop/r2r_src/speaker.py
Line 243 in 4c11585
Hi, I am confused about the aligning operation (described in your comments). It seems that you drop the last element of the predicted logits (the logits for 'EOS' or 'PAD') and the 'BOS' of the target when computing the loss while training the speaker, which, in my view, makes the logits and target unaligned... Can you explain how this operation makes the logits and target aligned?
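For reference, here is a minimal sketch of the standard shift-by-one alignment in teacher-forced sequence training (hypothetical tensor names, not necessarily the repo's exact code): the decoder reads [BOS, w1, ..., wN] and should predict [w1, ..., wN, EOS], so the logit at position t is compared with the target token at position t+1.

```python
import torch.nn.functional as F

# Hypothetical shapes: logits (batch, seq_len, vocab) produced while the
# decoder consumes [BOS, w1, ..., wN]; target (batch, seq_len) holds the
# same token sequence [BOS, w1, ..., wN].
def speaker_loss(logits, target, pad_idx=0):
    shifted_logits = logits[:, :-1, :]   # drop the prediction made after the last input token
    shifted_target = target[:, 1:]       # drop BOS, so position t predicts the token at t+1
    return F.cross_entropy(
        shifted_logits.reshape(-1, shifted_logits.size(-1)),
        shifted_target.reshape(-1),
        ignore_index=pad_idx,
    )
```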
How can I get the results from the paper (64% success rate on unseen)?
Hi,
We tried to replicate the pre-exploration part of the ablation study to verify its performance. However, the only trigger we found is the function utils.add_explorations(paths). In this function, we noticed that a file "tasks/R2R/data/exploration.json", which is not provided, is loaded as a kind of data augmentation, and the function is used nowhere else in the project. Is the correct pre-exploration trigger provided?
Another question: we have replicated the back-translation part, and the bt-agent reaches a 48% success rate on val-unseen, lower than the expected 50% or more. Since we followed README.md, we suspect the problem is the PyTorch version, as you mention. Could you please share the version number you used? Or, if there are any other settings we should change to replicate the project, could you please share them as well?
Hi, thanks for your awesome paper and code.
Recently I have been preparing to follow your work, and I'm wondering what hardware it requires and how long training takes.
Thanks!
The "augmented" paths in the download file (http://www.cs.unc.edu/~airsplay/aug_paths.json) do not seem to be the "ground truth" paths from the original Speaker-Follower paper (http://people.eecs.berkeley.edu/~ronghang/projects/speaker_follower/data_augmentation/R2R_literal_speaker_data_augmentation_paths.json).
Is there a difference? If so, what is it?
Hi, thank you for sharing this nice work.
I followed your script, but I can't train on multiple GPUs.
I used "bash run/bt_envdrop.bash 0, 3", but the workload is not divided evenly between the two GPUs.
Could you please tell me how to do it?
You describe an enhanced version of the Speaker in Section 3.4.3. However, the geographic information and actions are only used to compute the attention weights over the features.
I have difficulty understanding why g, a are not also used to directly compute the context. Could you point to related work that motivates this design?
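For concreteness, here is a hedged sketch of the design being asked about (hypothetical module and variable names, not the repo's exact code): the query is built from the geographic information g and action a, but the returned context is a weighted sum of the visual features only, so g and a influence where the model attends rather than what is mixed into the context vector.

```python
import torch
import torch.nn.functional as F

def attend(visual_feats, g, a, w_q, w_k):
    # visual_feats: (batch, num_views, feat_dim); g: (batch, d_g); a: (batch, d_a)
    # w_q, w_k: linear layers projecting to a shared key dimension d_k
    query = w_q(torch.cat([g, a], dim=-1))                            # (batch, d_k)
    keys = w_k(visual_feats)                                          # (batch, num_views, d_k)
    scores = torch.bmm(keys, query.unsqueeze(-1)).squeeze(-1)         # (batch, num_views)
    alpha = F.softmax(scores, dim=-1)                                 # attention weights use g, a
    context = torch.bmm(alpha.unsqueeze(1), visual_feats).squeeze(1)  # context is visual-only
    return context, alpha
```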
Hi,
Could you please release the snapshot (model weights) of the best model for single run and beam search that led to the numbers reported in the paper?
Thanks!
Thank you for sharing your great work!
I am reading your code now and found some parts that look odd. They are probably bugs, though I am not sure. Please correct me if I am wrong.
https://github.com/airsplay/R2R-EnvDrop/blob/master/r2r_src/agent.py#L184: I cannot find a matching "if" for this "else".
https://github.com/airsplay/R2R-EnvDrop/blob/master/r2r_src/agent.py#L570: there are five return values, which do not match the function definition here.
Hello, thank you for sharing your great work.
I downloaded your segmentation images, which are divided into 36 pieces.
I want to stitch these into a panoramic image.
Is there any code for attaching the 36 pieces into one panoramic image?
Could you please share code?
Thank you.
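In case it helps, here is a minimal sketch (my own assumption about the layout, not code from this repo) that tiles the 36 discretized views into a single grid image, using the usual R2R ordering of 12 headings by 3 elevation rows; note this is a naive tiling rather than a true equirectangular stitch.

```python
from PIL import Image

def stitch_panorama(view_paths, view_w=640, view_h=480):
    # view_paths: 36 image paths ordered so that index = row * 12 + column,
    # with row = elevation level (down/middle/up) and column = heading step.
    assert len(view_paths) == 36
    canvas = Image.new("RGB", (12 * view_w, 3 * view_h))
    for idx, path in enumerate(view_paths):
        row, col = divmod(idx, 12)
        img = Image.open(path).resize((view_w, view_h))
        canvas.paste(img, (col * view_w, row * view_h))
    return canvas
```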
Hey,
When I run bash run/speaker.bash 0, there is an error.
run/speaker.bash: line 6: unbuffer: command not found
Hi @airsplay, I noticed that the angle feature size is 4 in both the paper and the default parameter setting, but 128 in run/agent.bash.
Is this a special design? :)
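My understanding (a hedged sketch; please check utils.py in the repo for the exact implementation) is that the base angle feature is the 4-dimensional vector [sin(heading), cos(heading), sin(elevation), cos(elevation)], and larger sizes such as 128 are simply this block tiled, i.e. 128 = 4 * 32:

```python
import math
import numpy as np

def angle_feature(heading, elevation, feat_size=128):
    # feat_size must be a multiple of the 4-dim base block.
    assert feat_size % 4 == 0
    base = [math.sin(heading), math.cos(heading),
            math.sin(elevation), math.cos(elevation)]
    return np.array(base * (feat_size // 4), dtype=np.float32)
```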
How can I use multiple GPUs for training? Do I need to add multi-GPU support to the model myself?
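As far as I can tell, the run scripts only set CUDA_VISIBLE_DEVICES, so the model itself runs on a single device; a common (hedged, not part of this repo) way to spread the batch across GPUs is to wrap the model in nn.DataParallel:

```python
import torch
import torch.nn as nn

def to_multi_gpu(model):
    # Splits each batch across all visible GPUs; gradients are gathered
    # back on the default device. Recurrent rollout loops may need extra care.
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)
    return model.cuda()
```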
Hi,
Thanks for the code. Could you please explain a bit more why the reward is quantized in agent.py? I did not see this in the paper. Thanks.
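For context, my reading of "quantizing" here (an illustrative sketch with made-up names, not the repo's exact code) is that the raw change in distance-to-goal is replaced by its sign, so each step's reward is +1, -1, or 0:

```python
def quantize_reward(prev_dist, curr_dist):
    delta = prev_dist - curr_dist   # positive if the agent moved closer to the goal
    if delta > 0.0:
        return 1.0
    elif delta < 0.0:
        return -1.0
    return 0.0
```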
One possible bug here:
Line 184 in 1f46091
Hey Hao,
I've been working my way through your repo, and when training the listener via the provided script (agent.bash), it seems that the v0 listener model is trained with a hybrid teacher-forcing + sampling approach with RL.
Specifically, you first seem to be doing a teacher-forcing update (which makes sense):
Line 805 in c416108
However, then you do a sampling update that computes the RL (A2C) loss as well:
Line 807 in c416108
Line 392 in c416108
If I wanted to just train the "best" listener model without RL, do you have any recommendations? Setting "feedback = argmax" seems to trigger student forcing (which is what's used in related work), but should I mix that with teacher forcing as well?
Any intuition you have is much appreciated. What I'm currently thinking is computing the teacher-forcing loss, weighting it by the hyperparameter, and adding the student-forcing loss. Otherwise, I might just do student forcing all the way through...
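To summarize the hybrid update discussed above, here is a hedged sketch (illustrative method names such as agent.rollout, not the repo's exact API): one teacher-forcing pass yields an imitation loss, one sampling pass yields an A2C-style RL loss, and the two are combined with the ml weight before a single optimizer step.

```python
def hybrid_update(agent, optimizer, ml_weight=0.2):
    optimizer.zero_grad()
    ml_loss = agent.rollout(feedback="teacher")   # imitation / teacher-forcing loss
    rl_loss = agent.rollout(feedback="sample")    # sampled rollout with A2C loss
    loss = ml_weight * ml_loss + rl_loss
    loss.backward()
    optimizer.step()
```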
I followed the instructions in the README to install MatterSim, but an error occurred when running bash run/speaker.bash 0.
error:
import MatterSim
ImportError: build/MatterSim.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN9mattersim9Simulator10makeActionEidd
I read in #11 that there is another version of the code that is able to reproduce the reported results. However, is that code repo the same as the one here?
Hi there,
I have a question about how the agent takes actions when the feedback method is 'sample' or 'argmax'. In the function make_equiv_action, it seems that even if the agent previously decided to STOP, it can keep taking new actions and the trajectory is updated accordingly. This is fine during training, since the reward and reward_mask are set to zero when ended is TRUE. But in testing and evaluation, an agent's final viewpoint is wherever it happens to be when all agents have ended, rather than where it was when it first decided to STOP. This could make a huge difference in measuring path length and success rate.
Could you elaborate more on this please?
Thanks a lot!
Yicong
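For reference, a hedged illustration of the masking at issue (hypothetical structure, not the repo's exact code): if trajectory updates are gated on the ended flag, a finished episode is frozen at its own STOP decision rather than being extended until every agent in the batch has ended.

```python
def update_trajectories(traj, new_viewpoints, ended):
    # traj: list of dicts with a "path" list; ended: list of booleans per episode
    for i, (t, vp) in enumerate(zip(traj, new_viewpoints)):
        if not ended[i]:          # freeze episodes that already chose STOP
            t["path"].append(vp)
```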
Hi,
Can you please share the augmented path file for the unseen environments? If I understand correctly, aug_paths.json is only for back translation in the seen environments. If that is not possible, some statistics about the unseen augmented paths would also be very helpful. Thanks.
Hi, when I run the code with the following command: bash run/agent.py, I get the following error. I don't know why; please help me with this! Thank you.
Traceback (most recent call last):
File "r2r_src/train.py", line 470, in
train_val()
File "r2r_src/train.py", line 386, in train_val
train(train_env, tok, args.iters, val_envs=val_envs)
File "r2r_src/train.py", line 132, in train
listner.train(interval, feedback=feedback_method) # Train interval iters
File "/home/jing/selfmonitoring-agent/r2r_src/agent.py", line 820, in train
self.loss.backward()
File "/home/jing/anaconda2/envs/python3.6/lib/python3.6/site-packages/torch/tensor.py", line 195, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/jing/anaconda2/envs/python3.6/lib/python3.6/site-packages/torch/autograd/init.py", line 99, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: Invalid index in scatterAdd at /opt/conda/conda-bld/pytorch_1579022051443/work/aten/src/TH/generic/THTensorEvenMoreMath.cpp:721
Hi, thank you for releasing this nice work.
I'm curious how you generated all the JSON files in tasks/R2R/data.
Could you also release or share the preprocessing code used to generate them?
After training the model, I used the test environment to evaluate it, and the success rate is shown below. I don't understand why the result is so low. Please help me; is there something wrong with how I test?
The test script is:
name=agent
flag="--train validlistener --featdropout 0.3 --angleFeatSize 128
--feedback argmax
--mlWeight 0.2
--subout max --dropout 0.5 --optim rms --lr 1e-4 --iters 80000 --submit"
CUDA_VISIBLE_DEVICES=$1 python r2r_src/train.py $flag --name $name
Hi, how did you select the checkpoint for testing in the pre-exploration setting? Did you still pick the checkpoint according to performance on the validation-unseen set?
Hi @airsplay, I have a question about the beam-search setting.
Section 5 of your EnvDrop paper mentions that "beam search is usable when the environment is explored and saved in the agent's memory but the agent does not have enough computational capacity to fine-tune its navigational model."
Does this mean that the model used for beam search was not fine-tuned in the test unseen environments? And did previous works fine-tune on the test unseen set for the beam-search setting?
Thanks.