airsplay / R2R-EnvDrop
PyTorch code of the NAACL 2019 paper "Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout"
License: MIT License
Hi, thank you for sharing the code. Could you please list all the argument settings needed to reproduce the paper results?
I set up the environment as described in the README.md file. However, I ran into an issue while installing the Matterport3D simulator when executing cmake -DEGL_RENDERING=ON ..:
(envdrop) jinggu@jinggu-MS-7B79:~/Projects/R2R-EnvDrop/build$ cmake -DEGL_RENDERING=ON ..
CMake Deprecation Warning at CMakeLists.txt:2 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
CMake Error at CMakeLists.txt:14 (find_package):
By not providing "FindOpenCV.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "OpenCV", but
CMake did not find one.
Could not find a package configuration file provided by "OpenCV" with any
of the following names:
OpenCVConfig.cmake
opencv-config.cmake
Add the installation prefix of "OpenCV" to CMAKE_PREFIX_PATH or set
"OpenCV_DIR" to a directory containing one of the above files. If "OpenCV"
provides a separate development package or SDK, be sure it has been
installed.
-- Configuring incomplete, errors occurred!
See also "/home/jinggu/Projects/R2R-EnvDrop/build/CMakeFiles/CMakeOutput.log".
Do you have any idea how to solve this issue?
Hi, is there any way to test your model with beam search?
I saw the related code in train.py and agent.py ("beam_valid", "beam_search_test").
Could you please explain how to use it?
Hi,
We are trying to retrain the EnvDrop model based on this repo, but the results do not match those reported in the paper. We have tried different PyTorch versions; our best result with PyTorch 0.4.1 is 0.46 SPL on the val-unseen split, which is lower than the reported 48%. For detailed results, please refer to the attachment below.
Have we missed something important? Or could you specify your working environment?
Our retrained model:
retrained_envdrop_results.xlsx
Hello
I am trying to reproduce the results using Google Colab. When running r2r_src/train.py I get a file-not-found error ('tasks/R2R/data/R2R_train.json'). I found a download script in tasks/R2R/data, but I get 404 errors on some files.
Can you provide those files?
Dear Sir:
Hello! I'm writing to ask whether you could release the code for "Submitted to VLN test server" so that the results can be evaluated. Thank you very much!
Line 265 in c416108
I'm curious why the elevation of the navigable candidates is not updated relative to the base view the way the heading is, e.g.:
loc_heading = normalized_heading - base_heading
loc_elevation = normalized_elevation - base_elevation
R2R-EnvDrop/r2r_src/speaker.py
Line 243 in 4c11585
Hi, I am confused about the aligning operation (described in your comments). It seems that you drop the last element of the predicted logits (the logits for 'EOS' or 'PAD') and the 'BOS' of the target when computing the loss while training the speaker, which, in my view, makes the logits and target unaligned... Can you explain how this operation makes the logits and target aligned?
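For reference, here is a minimal sketch of the standard shift-by-one alignment in teacher-forced sequence training (hypothetical tensor names, not necessarily the repo's exact code): the decoder reads [BOS, w1, ..., wN] and should predict [w1, ..., wN, EOS], so the logit at position t is compared with the target token at position t+1.

```python
import torch.nn.functional as F

# Hypothetical shapes: logits (batch, seq_len, vocab) produced while the
# decoder consumes [BOS, w1, ..., wN]; target (batch, seq_len) holds the
# same token sequence [BOS, w1, ..., wN].
def speaker_loss(logits, target, pad_idx=0):
    shifted_logits = logits[:, :-1, :]   # drop the prediction made after the last input token
    shifted_target = target[:, 1:]       # drop BOS, so position t predicts the token at t+1
    return F.cross_entropy(
        shifted_logits.reshape(-1, shifted_logits.size(-1)),
        shifted_target.reshape(-1),
        ignore_index=pad_idx,
    )
```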
How can I get the results from the paper (64% success rate on unseen)?
Hi,
We tried to replicate the pre-exploration part of the ablation study to verify its performance. However, the only trigger we found is the function utils.add_explorations(paths). In this function, we noticed that a file "tasks/R2R/data/exploration.json", which is not provided, is loaded as a kind of data augmentation, and the function is used nowhere else in the project. Is the correct pre-exploration trigger provided?
Another question: we have replicated the back-translation part, and the bt-agent reaches a 48% success rate on val-unseen, lower than the expected 50% or more. Since we followed README.md, we suspect the problem is the PyTorch version, as you mention. Could you please share the version number you used? Or, if there are any other settings we should change to replicate the project, could you please share them as well?
Hi, thanks for your awesome paper and code.
Recently I have been preparing to follow your work, and I'm wondering what hardware it requires and how long training takes.
Thanks!
The "augmented" paths in the download file (http://www.cs.unc.edu/~airsplay/aug_paths.json) do not seem to be the "ground truth" paths from the original Speaker-Follower paper (http://people.eecs.berkeley.edu/~ronghang/projects/speaker_follower/data_augmentation/R2R_literal_speaker_data_augmentation_paths.json).
Is there a difference? If so, what is it?
Hi, thank you for sharing this nice work.
I followed your script, but I can't train on multiple GPUs.
I used "bash run/bt_envdrop.bash 0, 3", but the workload is not divided evenly between the two GPUs.
Could you please tell me how to do it?
You describe an enhanced version of the Speaker in Section 3.4.3. However, the geographic information and actions are only used to compute the attention weights over the features.
I have difficulty understanding why g, a are not also used to directly compute the context. Could you point to related work that motivates this design?
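For concreteness, here is a hedged sketch of the design being asked about (hypothetical module and variable names, not the repo's exact code): the query is built from the geographic information g and action a, but the returned context is a weighted sum of the visual features only, so g and a influence where the model attends rather than what is mixed into the context vector.

```python
import torch
import torch.nn.functional as F

def attend(visual_feats, g, a, w_q, w_k):
    # visual_feats: (batch, num_views, feat_dim); g: (batch, d_g); a: (batch, d_a)
    # w_q, w_k: linear layers projecting to a shared key dimension d_k
    query = w_q(torch.cat([g, a], dim=-1))                            # (batch, d_k)
    keys = w_k(visual_feats)                                          # (batch, num_views, d_k)
    scores = torch.bmm(keys, query.unsqueeze(-1)).squeeze(-1)         # (batch, num_views)
    alpha = F.softmax(scores, dim=-1)                                 # attention weights use g, a
    context = torch.bmm(alpha.unsqueeze(1), visual_feats).squeeze(1)  # context is visual-only
    return context, alpha
```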
Hi,
Could you please release the snapshot (model weights) of the best model for single run and beam search that led to the numbers reported in the paper?
Thanks!
Thank you for sharing your great work!
I am reading your code now and found some parts that look odd. They are probably bugs, though I am not sure. Please correct me if I am wrong.
https://github.com/airsplay/R2R-EnvDrop/blob/master/r2r_src/agent.py#L184: I cannot find a matching "if" for this "else".
https://github.com/airsplay/R2R-EnvDrop/blob/master/r2r_src/agent.py#L570: there are five return values, which do not match the function definition here.
Hello, thank you for sharing your great work.
I downloaded your segmentation images, which are divided into 36 pieces.
I want to stitch these into a panoramic image.
Is there any code for attaching the 36 pieces into one panoramic image?
Could you please share code?
Thank you.
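In case it helps, here is a minimal sketch (my own assumption about the layout, not code from this repo) that tiles the 36 discretized views into a single grid image, using the usual R2R ordering of 12 headings by 3 elevation rows; note this is a naive tiling rather than a true equirectangular stitch.

```python
from PIL import Image

def stitch_panorama(view_paths, view_w=640, view_h=480):
    # view_paths: 36 image paths ordered so that index = row * 12 + column,
    # with row = elevation level (down/middle/up) and column = heading step.
    assert len(view_paths) == 36
    canvas = Image.new("RGB", (12 * view_w, 3 * view_h))
    for idx, path in enumerate(view_paths):
        row, col = divmod(idx, 12)
        img = Image.open(path).resize((view_w, view_h))
        canvas.paste(img, (col * view_w, row * view_h))
    return canvas
```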
Hey,
When I run bash run/speaker.bash 0, there is an error.
run/speaker.bash: line 6: unbuffer: command not found
Hi @airsplay, I noticed that the angle feature size is 4 in both the paper and the default parameter setting, but 128 in run/agent.bash.
Is this a special design? :)
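My understanding (a hedged sketch; please check utils.py in the repo for the exact implementation) is that the base angle feature is the 4-dimensional vector [sin(heading), cos(heading), sin(elevation), cos(elevation)], and larger sizes such as 128 are simply this block tiled, i.e. 128 = 4 * 32:

```python
import math
import numpy as np

def angle_feature(heading, elevation, feat_size=128):
    # feat_size must be a multiple of the 4-dim base block.
    assert feat_size % 4 == 0
    base = [math.sin(heading), math.cos(heading),
            math.sin(elevation), math.cos(elevation)]
    return np.array(base * (feat_size // 4), dtype=np.float32)
```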
How can I use multiple GPUs for training? Do I need to add multi-GPU support to the model myself?
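As far as I can tell, the run scripts only set CUDA_VISIBLE_DEVICES, so the model itself runs on a single device; a common (hedged, not part of this repo) way to spread the batch across GPUs is to wrap the model in nn.DataParallel:

```python
import torch
import torch.nn as nn

def to_multi_gpu(model):
    # Splits each batch across all visible GPUs; gradients are gathered
    # back on the default device. Recurrent rollout loops may need extra care.
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)
    return model.cuda()
```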
Hi,
Thanks for the code. Could you please explain a bit more why the reward is quantized in agent.py? I did not see this in the paper. Thanks.
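For context, my reading of "quantizing" here (an illustrative sketch with made-up names, not the repo's exact code) is that the raw change in distance-to-goal is replaced by its sign, so each step's reward is +1, -1, or 0:

```python
def quantize_reward(prev_dist, curr_dist):
    delta = prev_dist - curr_dist   # positive if the agent moved closer to the goal
    if delta > 0.0:
        return 1.0
    elif delta < 0.0:
        return -1.0
    return 0.0
```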
One possible bug here:
Line 184 in 1f46091
Hey Hao,
I've been working my way through your repo, and when training the listener via the provided script (agent.bash), it seems that the v0 listener model is trained with a hybrid teacher-forcing + sampling approach with RL.
Specifically, you first seem to be doing a teacher-forcing update (which makes sense):
Line 805 in c416108
However, then you do a sampling update that computes the RL (A2C) loss as well:
Line 807 in c416108
Line 392 in c416108
If I wanted to just train the "best" listener model without RL, do you have any recommendations? Setting "feedback = argmax" seems to trigger student forcing (which is what's used in related work), but should I mix that with teacher forcing as well?
Any intuition you have is much appreciated. What I'm currently thinking is computing the teacher-forcing loss, weighting it by the hyperparameter, and adding the student-forcing loss. Otherwise, I might just do student forcing all the way through...
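To summarize the hybrid update discussed above, here is a hedged sketch (illustrative method names such as agent.rollout, not the repo's exact API): one teacher-forcing pass yields an imitation loss, one sampling pass yields an A2C-style RL loss, and the two are combined with the ml weight before a single optimizer step.

```python
def hybrid_update(agent, optimizer, ml_weight=0.2):
    optimizer.zero_grad()
    ml_loss = agent.rollout(feedback="teacher")   # imitation / teacher-forcing loss
    rl_loss = agent.rollout(feedback="sample")    # sampled rollout with A2C loss
    loss = ml_weight * ml_loss + rl_loss
    loss.backward()
    optimizer.step()
```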
I followed the instructions in the README to install MatterSim, but an error occurred when running bash run/speaker.bash 0.
error:
import MatterSim
ImportError: build/MatterSim.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN9mattersim9Simulator10makeActionEidd
I read in #11 that there is another version of the code that is able to reproduce the reported results. However, is that code repo the same as the one here?
Hi there,
I have a question about how the agent takes actions when the feedback method is 'sample' or 'argmax'. In the function make_equiv_action, it seems that even if the agent previously decided to STOP, it can keep taking new actions and the trajectory is updated accordingly. This is fine during training, since the reward and reward_mask are set to zero when ended is TRUE. But in testing and evaluation, an agent's final viewpoint is wherever it happens to be when all agents have ended, rather than where it was when it first decided to STOP. This could make a huge difference in measuring path length and success rate.
Could you elaborate more on this please?
Thanks a lot!
Yicong
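For reference, a hedged illustration of the masking at issue (hypothetical structure, not the repo's exact code): if trajectory updates are gated on the ended flag, a finished episode is frozen at its own STOP decision rather than being extended until every agent in the batch has ended.

```python
def update_trajectories(traj, new_viewpoints, ended):
    # traj: list of dicts with a "path" list; ended: list of booleans per episode
    for i, (t, vp) in enumerate(zip(traj, new_viewpoints)):
        if not ended[i]:          # freeze episodes that already chose STOP
            t["path"].append(vp)
```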
Hi,
Can you please share the augmented path file for the unseen environments? If I understand correctly, aug_paths.json is only for back translation in the seen environments. If that is not possible, some statistics about the unseen augmented paths would also be very helpful. Thanks.
Hi, when I run the code with the following command: bash run/agent.py, I get the following error. I don't know why; please help me with this! Thank you.
Traceback (most recent call last):
File "r2r_src/train.py", line 470, in
train_val()
File "r2r_src/train.py", line 386, in train_val
train(train_env, tok, args.iters, val_envs=val_envs)
File "r2r_src/train.py", line 132, in train
listner.train(interval, feedback=feedback_method) # Train interval iters
File "/home/jing/selfmonitoring-agent/r2r_src/agent.py", line 820, in train
self.loss.backward()
File "/home/jing/anaconda2/envs/python3.6/lib/python3.6/site-packages/torch/tensor.py", line 195, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/jing/anaconda2/envs/python3.6/lib/python3.6/site-packages/torch/autograd/init.py", line 99, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: Invalid index in scatterAdd at /opt/conda/conda-bld/pytorch_1579022051443/work/aten/src/TH/generic/THTensorEvenMoreMath.cpp:721
Hi, thank you for releasing this nice work.
I'm curious how you generated all the JSON files in tasks/R2R/data.
Could you also release or share the preprocessing code used to generate them?
After training the model, I used the test environment to evaluate it, and the success rate is shown below. I don't understand why the result is so low. Please help me; is there something wrong with how I test?
The test script is:
name=agent
flag="--train validlistener --featdropout 0.3 --angleFeatSize 128
--feedback argmax
--mlWeight 0.2
--subout max --dropout 0.5 --optim rms --lr 1e-4 --iters 80000 --submit"
CUDA_VISIBLE_DEVICES=$1 python r2r_src/train.py $flag --name $name
Hi, how did you select the checkpoint for testing in the pre-exploration setting? Did you still pick the checkpoint according to performance on the validation-unseen set?
Hi @airsplay, I have a question about the beam-search setting.
Section 5 of your EnvDrop paper mentions that "beam search is usable when the environment is explored and saved in the agent's memory but the agent does not have enough computational capacity to fine-tune its navigational model."
Does this mean that the model used for beam search was not fine-tuned in the test unseen environments? And did previous works fine-tune on the test unseen set for the beam-search setting?
Thanks.