fawnliu / tris

[ICCV 2023] Official code release of our paper "Referring Image Segmentation Using Text Supervision"

Home Page: https://openaccess.thecvf.com/content/ICCV2023/papers/Liu_Referring_Image_Segmentation_Using_Text_Supervision_ICCV_2023_paper.pdf

License: MIT License

Python 97.05% Shell 2.95%

referring-image-segmentation weakly-supervised-learning


tris's Issues

Positive ResponseMap Selection

Thank you for your great work. I have a question. In the first stage of training, each image is paired with one positive text and with N negative texts taken from other samples. Why, then, does stage 1 still need to select among the response maps generated by all of these pairs? Why not simply use the response map of the positive pair?

`demo.py` img_size hardcoded to incorrect value

Hi,

Thank you for sharing your work!

I noticed in demo.py, the get_transform() method does not respect the size argument. It is hardcoded to resize to (224,224):

TRIS/demo.py

Line 22 in b45f660

transforms.Resize((224, 224)),

I believe this should be (size,size), which will take 320px by default. I can confirm that this change results in a heatmap that is a lot closer to the example in your readme.

(Images: heatmap before the fix and heatmap after the fix.)

I'm still not sure why my heatmap doesn't match your example 100%, but the results are impressive nonetheless. 🙂

Regarding Bilateral Prompt

Thanks to the authors for sharing the code. I have the following question:

When computing the attention map for visual features, is Av in the line below an all-ones tensor? Only one language vector is used as the key, and the softmax is applied over the last dimension, which has size 1.

TRIS/model/attn.py

Line 122 in b45f660

Av = F.softmax(Qv.matmul(Kt.transpose(1, 2)) / math.sqrt(Ci), dim=2)
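A minimal sketch of why the question arises, assuming Qv has shape [B, N, C] and Kt has shape [B, 1, C] (one language vector per sample). These shapes are inferred from the question, not checked against the repository. The sketch uses plain Python lists for a single sample instead of PyTorch tensors, and the numbers are made up. With a single key, every row of scores has length 1, so a softmax over that axis returns 1.0 regardless of the score.

```python
import math


def softmax(row):
    """Numerically stable softmax over a 1-D list."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(Qv, Kt, Ci):
    """Scaled dot-product attention weights for one sample.

    Qv: N query vectors (visual features), each of length Ci.
    Kt: M key vectors (language features), each of length Ci.
    Returns an N x M list. The softmax runs over the key axis,
    the counterpart of dim=2 in the batched PyTorch line.
    """
    scale = math.sqrt(Ci)
    weights = []
    for q in Qv:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in Kt]
        weights.append(softmax(scores))
    return weights


# Illustrative values only: three visual positions, one language vector.
Qv = [[0.5, -1.0], [2.0, 3.0], [-4.0, 0.1]]
Kt = [[1.0, 2.0]]
Av = attention_weights(Qv, Kt, Ci=2)
print(Av)  # every row is [1.0]
```

With two or more keys the rows are no longer constant, so whether Av is really all ones depends on the actual shape of Kt at that line.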

paper and code

When will your group release the paper and code?

I would like to know how long your model was trained on a single RTX 3090?

I would like to know how long your model was trained on a single RTX 3090?

Too long all_eta

Thank you for your outstanding paper! I tried to retrain your model to use it as my baseline model.

This is my current state in stage 1 training.

I just checked that loss.backward() takes about 33 seconds.

Is it right that all_eta is logged as 5 days?
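As a rough sanity check, assuming all_eta is estimated as the number of remaining iterations times the average time per iteration (an assumption about the logger, not confirmed from the code), the arithmetic can be sketched as follows. The helper names are hypothetical, and the 33 seconds covers only the backward pass, so the real per-iteration time would be at least that.

```python
SECONDS_PER_DAY = 86400


def eta_seconds(remaining_iters, seconds_per_iter):
    """Assumed ETA model: remaining iterations times time per iteration."""
    return remaining_iters * seconds_per_iter


def iters_for_eta(eta_days, seconds_per_iter):
    """Number of remaining iterations that would produce the given ETA."""
    return eta_days * SECONDS_PER_DAY / seconds_per_iter


# 33 s is the backward-pass time reported above; 5 days is the logged all_eta.
print(round(iters_for_eta(5, 33)))  # 13091
```

Under that assumption, a 5-day estimate corresponds to roughly 13,000 remaining iterations at 33 seconds each, so the logged value is plausible if the run has about that many iterations left. In that case the thing to investigate is the 33-second backward pass rather than the ETA.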

No such file or directory: '../output/refcocog_umd/refcocog_train_names.json'

I ran the steps in the order you specified, but running the following code fails:

Train IRNet and generate pseudo masks.
cd IRNet

dir=../output
CUDA_VISIBLE_DEVICES=0,1,2,3 python run_sample_refer.py --cam_out_dir $dir/refcocog_umd/cam --ir_label_out_dir $dir/refcocog_umd/ir_label --ins_seg_out_dir $dir/refcocog_umd/ins_seg --train_list $dir/refcocog_umd/refcocog_train_names.json --cam_eval_thres 0.15 --work_space output_refer/refcocog_umd --num_workers 8 --irn_batch_size 96 --cam_to_ir_label_pass True --train_irn_pass True --make_ins_seg_pass True

error: No such file or directory: '../output/refcocog_umd/refcocog_train_names.json'
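The error suggests that the file passed to --train_list was either not produced by an earlier step or lives under a different path. Which step is supposed to write it is not stated here. A small pre-flight check, sketched below with a hypothetical helper name, can confirm which inputs are missing before the multi-GPU job is launched.

```python
from pathlib import Path


def missing_inputs(paths):
    """Return the paths (as strings) that do not exist on disk."""
    return [str(p) for p in map(Path, paths) if not p.exists()]


if __name__ == "__main__":
    # Same relative location as in the command above, resolved from IRNet/.
    out_dir = Path("../output/refcocog_umd")
    required = [out_dir / "refcocog_train_names.json"]
    for p in missing_inputs(required):
        print(f"missing: {p}")
```

Because the path is relative, the result depends on the working directory, so the check should be run from inside IRNet, as the command is.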

