
envbiasvln's People

Contributors

zhangybzbo

Forkers

yangsikai wdqin

envbiasvln's Issues

Need a bit of clarification about reproducing the results mentioned in the repository and the env-bias paper

Hi,

Thank you very much for sharing the features used in the paper! I am currently trying to reproduce the results reported in the env-bias paper and in this repository. I used the original R2R-EnvDrop code with ResNet-152-imagenet.tsv to train the ResNet agent, and the features provided in this repository, namely GT-Seg.tsv together with the modify.py code, to train the GT-Seg agent.
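For anyone checking their setup, the sketch below shows roughly how EnvDrop-style code reads such a .tsv feature file. This is a minimal sketch assuming the standard 36-view panorama layout and the 2048-dim ResNet-152 features; the GT-Seg features may use a different dimension, so treat feat_dim as an assumption to adjust:

import base64
import csv
import sys

import numpy as np

csv.field_size_limit(sys.maxsize)

# Standard columns in the R2R image-feature .tsv files (one row per panorama).
TSV_FIELDNAMES = ['scanId', 'viewpointId', 'image_w', 'image_h', 'vfov', 'features']

def load_features(tsv_path, views=36, feat_dim=2048):
    # Returns {scanId_viewpointId: (views, feat_dim) float32 array}.
    # feat_dim=2048 matches ResNet-152; adjust for GT-Seg features.
    features = {}
    with open(tsv_path, 'r') as f:
        reader = csv.DictReader(f, delimiter='\t', fieldnames=TSV_FIELDNAMES)
        for item in reader:
            key = item['scanId'] + '_' + item['viewpointId']
            features[key] = np.frombuffer(
                base64.b64decode(item['features']),
                dtype=np.float32).reshape((views, feat_dim))
    return features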

The training script I used was:

name=agent
flag="--attn soft --train listener
      --featdropout 0.3
      --angleFeatSize 128
      --feedback sample
      --mlWeight 0.2
      --subout max --dropout 0.5 --optim rms --lr 1e-4
      --iters 80000 --maxAction 35"
mkdir -p snap/$name
CUDA_VISIBLE_DEVICES=$1 python3 r2r_src/train.py $flag --name $name

which I believe is the same as the original one provided in R2R-EnvDrop for training the agent module. After the two models were trained, I used the model that performed best on the val-unseen split for evaluation; the evaluation script was:

name=ResNet_result
flag="--attn soft --train validlistener
      --load snap/agent/state_dict/best_val_unseen_ResNet
      --angleFeatSize 128
      --featdropout 0.4
      --subout max --maxAction 35"
mkdir -p snap/$name
CUDA_VISIBLE_DEVICES=$1 python3 r2r_src/train.py $flag --name $name | tee snap/$name/log

(Of course, this is just the ResNet one; the 'name' and '--load' values are changed accordingly for the GT_Seg one.)

I then got the following two evaluation results:

ResNet:
Env name: val_unseen, nav_error: 5.8181, oracle_error: 3.8605, steps: 25.7867, lengths: 9.9428, success_rate: 0.4585, oracle_rate: 0.5338, spl: 0.4219
Env name: val_seen, nav_error: 4.5834, oracle_error: 2.8824, steps: 26.7630, lengths: 10.6334, success_rate: 0.5749, oracle_rate: 0.6552, spl: 0.5432
Env name: train, nav_error: 0.3455, oracle_error: 0.2661, steps: 25.4575, lengths: 9.9808, success_rate: 0.9736, oracle_rate: 0.9830, spl: 0.9609
GT_Seg:
Env name: val_unseen, nav_error: 4.7119, oracle_error: 2.8192, steps: 55.6573, lengths: 20.2547, success_rate: 0.5534, oracle_rate: 0.6705, spl: 0.4717
Env name: val_seen, nav_error: 4.7694, oracle_error: 2.9660, steps: 47.3888, lengths: 17.1864, success_rate: 0.5397, oracle_rate: 0.6494, spl: 0.4792
Env name: train, nav_error: 1.1375, oracle_error: 0.8062, steps: 31.0214, lengths: 11.8337, success_rate: 0.8880, oracle_rate: 0.9252, spl: 0.8499
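For reference: success_rate and spl in these logs presumably follow the standard R2R definitions, where an episode succeeds when nav_error is under 3 m and SPL weights success by the ratio of shortest-path length to the length actually traveled (per Anderson et al., "On Evaluation of Embodied Navigation Agents"). A minimal sketch, assuming per-episode distances in meters:

import numpy as np

def r2r_metrics(nav_errors, path_lengths, shortest_lengths, threshold=3.0):
    # nav_errors: final distance to goal per episode (m)
    # path_lengths: length of the path the agent actually took (m)
    # shortest_lengths: shortest-path distance from start to goal (m)
    nav_errors = np.asarray(nav_errors, dtype=np.float64)
    path_lengths = np.asarray(path_lengths, dtype=np.float64)
    shortest_lengths = np.asarray(shortest_lengths, dtype=np.float64)

    success = nav_errors < threshold   # per-episode success indicator
    sr = success.mean()                # success_rate
    spl = (success * shortest_lengths
           / np.maximum(path_lengths, shortest_lengths)).mean()
    return sr, spl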

Based on these SR results, I would like to ask two questions:

  1. Are these the correct evaluation results, i.e., what we are supposed to see when evaluating the models with the provided features?
  2. The reproduced ResNet results look close to those reported both in your paper (Table 4) and in the README of this repository. However, the GT_Seg results (55% and 53%) seem slightly worse than those reported in the paper (again Table 4: 55% and 56%) and noticeably better than those in the README (48% and 48%). It is also easy to see that the paper's results (55% and 56%) differ from the README's (48% and 48%). What causes the difference? Were they run under different settings?

Your clarification will be highly appreciated, thank you very much!

Illustrating the nav-graph and the R2R env

I've read your paper "Diagnosing the Environment Bias in Vision-and-Language Navigation". Could you give me some advice on illustrating the navigation graph like Figure 1 and Figure 3 in your paper?
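One common way to draw such a graph is to read the Matterport3D connectivity files and plot them top-down with networkx and matplotlib. A minimal sketch, assuming the standard {scan}_connectivity.json format from the Matterport3DSimulator repository (pose is a row-major 4x4 matrix, so x and y sit at indices 3 and 7) and an example scan id; this is only an illustration, not the paper's actual figure code:

import json

import matplotlib.pyplot as plt
import networkx as nx

def load_nav_graph(conn_file):
    # Build a top-down navigation graph from a Matterport3D
    # {scan}_connectivity.json file (Matterport3DSimulator format).
    with open(conn_file) as f:
        data = json.load(f)
    G = nx.Graph()
    for i, item in enumerate(data):
        if not item['included']:
            continue
        # Node position: x, y from the viewpoint's 4x4 pose matrix.
        G.add_node(item['image_id'], pos=(item['pose'][3], item['pose'][7]))
        for j, connected in enumerate(item['unobstructed']):
            if connected and data[j]['included']:
                G.add_edge(item['image_id'], data[j]['image_id'])
    return G

# Example scan id; replace with the scan you want to visualize.
G = load_nav_graph('connectivity/17DRP5sb8fy_connectivity.json')
pos = nx.get_node_attributes(G, 'pos')
nx.draw(G, pos, node_size=30, width=0.5)
plt.savefig('nav_graph.png', dpi=200)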

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.