
pseudo-q's People

Contributors

jianghaojun, leaplabthu

pseudo-q's Issues

can't find build_model

Hey, I can't find the code for build_model in the models folder. Which .py file is it in?

How to change the number of iterations (epochs)?

Where is the number of epochs set? The original 13643 iterations are too many; training would take several days on my computer. I would like to reduce the number of epochs. Thanks!

statistics of datasets

Hi, thank you for your excellent work! I found that /data/statistic/ did not have the .txt split files of other datasets. Is there any way to access these files?

Inference API

Hi, this is nice work!
Could you please provide an inference API so that, for example, the user only needs to provide the path to the image and the corresponding description?

Evaluation on RefCOCO

Hi. Thanks for sharing your nice work!

When I run eval.sh on RefCOCO testA, I get the error "No such file unc_testA.pth".
Why is the unc_testA.pth file needed during evaluation?

I also ran generate_pseudo_data_unc.sh before evaluation and got unc_train_pseudo_split.pth files, not unc_pseudo_val.pth or unc_pseudo_testA.pth.

Thanks.
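For reference, the dataset loader quoted in the "Clarification about the loss" issue builds each file name as '{0}_{1}.pth'.format(self.dataset, split), which is presumably why evaluating on testA looks for unc_testA.pth. Here is a minimal sketch, assuming that pattern applies to every split. The helpers split_file_name and missing_split_files are hypothetical and not part of the repository:

```python
import os.path as osp


def split_file_name(dataset, split):
    """Build the file name with the pattern quoted from the dataset loader."""
    return '{0}_{1}.pth'.format(dataset, split)


def missing_split_files(dataset_path, dataset, splits):
    """Return the expected split files that are absent from dataset_path."""
    expected = [split_file_name(dataset, s) for s in splits]
    return [name for name in expected
            if not osp.isfile(osp.join(dataset_path, name))]
```

Calling missing_split_files on the local split directory with ['val', 'testA', 'testB'] lists which files would trigger the "No such file" error before a long evaluation run starts.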

Clarification about the loss

TL;DR

Can you explain what is loaded in the dataset along with image data? I would like to understand especially the content of bbox.

Dear authors,

I'm trying to figure out how the training of your model works.

In particular, from this line

loss_dict = loss_utils.trans_vg_loss(output, target)
I noticed that the target is used to compute the loss. The function trans_vg_loss confirms it:
def trans_vg_loss(batch_pred, batch_target):
    """Compute the losses related to the bounding boxes,
    including the L1 regression loss and the GIoU loss
    """
    batch_size = batch_pred.shape[0]
    # world_size = get_world_size()
    num_boxes = batch_size
    loss_bbox = F.l1_loss(batch_pred, batch_target, reduction='none')
    loss_giou = 1 - torch.diag(generalized_box_iou(
        xywh2xyxy(batch_pred),
        xywh2xyxy(batch_target)
    ))
    losses = {}
    losses['loss_bbox'] = loss_bbox.sum() / num_boxes
    losses['loss_giou'] = loss_giou.sum() / num_boxes
    return losses
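
To make the two loss terms concrete, here is a minimal pure-Python sketch for a single predicted/target box pair. It assumes boxes are stored as (center x, center y, width, height), which the helper name xywh2xyxy suggests but this thread does not confirm. The function names are illustrative stand-ins, not the repository's tensor implementations:

```python
def xywh_to_xyxy(box):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2). Assumes center format."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def giou(a, b):
    """Generalized IoU of two (x1, y1, x2, y2) boxes with positive area."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    # Intersection rectangle (clamped to zero when the boxes do not overlap).
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both inputs.
    enclose = ((max(a[2], b[2]) - min(a[0], b[0]))
               * (max(a[3], b[3]) - min(a[1], b[1])))
    return inter / union - (enclose - union) / enclose


def box_losses(pred, target):
    """Per-box L1 (summed over the 4 coordinates) and 1 - GIoU."""
    l1 = sum(abs(p - t) for p, t in zip(pred, target))
    return {'loss_bbox': l1,
            'loss_giou': 1 - giou(xywh_to_xyxy(pred), xywh_to_xyxy(target))}
```

The quoted trans_vg_loss applies the same two terms to a whole batch and divides the summed losses by the batch size.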

I tried to understand what target is, and from this line

img_data, text_data, target = batch
I checked the collate_fn used in dataloader:

Pseudo-Q/utils/misc.py

Lines 294 to 308 in ce1688f

def collate_fn(raw_batch):
    raw_batch = list(zip(*raw_batch))
    img = torch.stack(raw_batch[0])
    img_mask = torch.tensor(raw_batch[1])
    img_data = NestedTensor(img, img_mask)
    word_id = torch.tensor(raw_batch[2])
    word_mask = torch.tensor(raw_batch[3])
    text_data = NestedTensor(word_id, word_mask)
    bbox = torch.tensor(raw_batch[4])
    if len(raw_batch) == 7:
        batch = [img_data, text_data, bbox, raw_batch[5], raw_batch[6]]
    else:
        batch = [img_data, text_data, bbox]
    return tuple(batch)
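
The zip(*raw_batch) call transposes a list of per-sample tuples into per-field tuples, so index 4 collects whatever each sample returned in its fifth position. Here is a minimal sketch with plain lists instead of tensors. The helper name transpose_batch and the sample values are made up for illustration:

```python
def transpose_batch(raw_batch):
    """Turn per-sample tuples (f0, f1, ...) into per-field tuples."""
    return list(zip(*raw_batch))


# Two fake samples: (img, img_mask, word_id, word_mask, bbox)
samples = [
    ('img_a', 'mask_a', [101, 7], [1, 1], [0.5, 0.5, 0.2, 0.2]),
    ('img_b', 'mask_b', [101, 9], [1, 1], [0.3, 0.4, 0.1, 0.6]),
]
fields = transpose_batch(samples)
bboxes = fields[4]  # one box per sample, in batch order
```

Whether those boxes are ground-truth or pseudo-generated depends on what the loaded .pth file stores, which this sketch does not answer.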

Is this using the ground truth bounding box from the dataset?

I checked the __getitem__ function of the dataset and ended up at these three lines

imgset_file = '{0}_{1}.pth'.format(self.dataset, split)
imgset_path = osp.join(dataset_path, imgset_file)
self.images += torch.load(imgset_path)

Here a .pth file is loaded, and along with image data something else is loaded. Can you explain what the loaded bbox exactly contains?

Thank you,
Luca

Problem about the training.

Recently, several researchers have asked me questions about training. The symptom is that the training loss does not decrease or the validation accuracy is very low.

The reason might be that they adopted a smaller batch size, e.g., 96, but did not change the learning rate.

First of all, I strongly recommend using the same batch size to reproduce our work. Secondly, if you use a smaller batch size, please try a smaller learning rate.

If you have any new problems with training, please post your questions in this issue or open a new one. Please provide as much information as you can, which helps me understand your question more quickly.
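
One common heuristic for choosing "a smaller learning rate" is the linear scaling rule, which shrinks the learning rate in proportion to the batch size. This is a general rule of thumb, not something this thread prescribes, and the numbers below are placeholders rather than the repository's defaults:

```python
def scaled_lr(base_lr, base_batch_size, new_batch_size):
    """Linear scaling rule: keep lr / batch_size constant."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError('batch sizes must be positive')
    return base_lr * new_batch_size / base_batch_size


# Hypothetical values: halving the batch size halves the learning rate.
new_lr = scaled_lr(base_lr=1e-4, base_batch_size=256, new_batch_size=128)
```

Treat the result as a starting point and still check that the training loss decreases during the first iterations.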

Could the authors provide a more detailed description of the dataset folder?

|-- image_data
   |-- data
      |-- flickr
      |-- gref
      |-- gref_umd
      |-- referit
      |-- unc
      |-- unc+
   |-- Flickr30k
      |-- flickr30k-images
   |-- other
      |-- images
      |-- refcoco
      |-- refcoco+
      |-- refcocog
   |-- referit
      |-- images
      |-- mask
      |-- splits

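The dataset directory structure given in the README above is only a bare listing, and many entries differ from what one actually obtains, without explanation. Does the "other" directory refer to the files from the original "refer" repository? And does the "data" directory refer to the downloaded "pseudo_samples"? The "referit" directory is also confusing: the original "refer" repository contains "ReCLEF", but I could not find the "images, mask, splits" sub-directories there. How should this referit directory be built?
Thanks!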
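
Until the layout is documented in more detail, a quick check can at least confirm which of the listed folders exist locally. Here is a minimal sketch, assuming the tree above is relative to an image_data root. EXPECTED_DIRS simply mirrors that listing and missing_dirs is a hypothetical helper:

```python
from pathlib import Path

# Sub-directories copied from the tree listed in this issue.
EXPECTED_DIRS = [
    'data/flickr', 'data/gref', 'data/gref_umd', 'data/referit',
    'data/unc', 'data/unc+',
    'Flickr30k/flickr30k-images',
    'other/images', 'other/refcoco', 'other/refcoco+', 'other/refcocog',
    'referit/images', 'referit/mask', 'referit/splits',
]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

An empty result only means the folders exist; it says nothing about whether their contents match what the training scripts expect.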

RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`

Sorry to bother you. When I run train.py, something goes wrong. Here is the output:

E:\Users\JayLee\anaconda3\envs\myenv\python.exe E:/Pseudo-Q-main/train.py
Not using distributed mode
git:
sha: N/A, status: clean, branch: N/A

INFO ### torch.backends.cudnn.benchmark = False

number of params: 155559940
Missing keys when loading detr model:
[]
Start training
E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at ..\c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ..\aten\src\ATen\native\BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
Traceback (most recent call last):
File "E:\Pseudo-Q-main\train.py", line 310, in <module>
main(args)
File "E:\Pseudo-Q-main\train.py", line 265, in main
train_stats = train_one_epoch(
File "E:\Pseudo-Q-main\engine.py", line 38, in train_one_epoch
output = model(img_data, text_data)
File "E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "E:\Pseudo-Q-main\models\trans_vg_mlcma.py", line 36, in forward
visu_mask, visu_src = self.visumodel(img_data)
File "E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "E:\Pseudo-Q-main\models\visual_model\detr.py", line 72, in forward
out = self.transformer(self.input_proj(src), mask, pos[-1], query_embed=None)
File "E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "E:\Pseudo-Q-main\models\visual_model\transformer.py", line 56, in forward
memory = self.encoder(src, src_key_padding_mask=mask, pos=pos_embed)
File "E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "E:\Pseudo-Q-main\models\visual_model\transformer.py", line 118, in forward
output = layer(output, src_mask=mask,
File "E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "E:\Pseudo-Q-main\models\visual_model\transformer.py", line 225, in forward
return self.forward_post(src, src_mask, src_key_padding_mask, pos)
File "E:\Pseudo-Q-main\models\visual_model\transformer.py", line 196, in forward_post
src2 = self.self_attn(q, k, value=src, attn_mask=src_mask,
File "E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\activation.py", line 1031, in forward
attn_output, attn_output_weights = F.multi_head_attention_forward(
File "E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\functional.py", line 4969, in multi_head_attention_forward
q, k, v = _in_projection_packed(query, key, value, in_proj_weight, in_proj_bias)
File "E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\functional.py", line 4734, in _in_projection_packed
return linear(q, w_q, b_q), linear(k, w_k, b_k), linear(v, w_v, b_v)
File "E:\Users\JayLee\anaconda3\envs\myenv\lib\site-packages\torch\nn\functional.py", line 1847, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)

Process finished with exit code 1

I don't know how to fix it. Could you please help and give me some ideas? Thank you!

Question about generating pseudo-labels

Could you please tell me whether you used different methods or hyperparameters when generating pseudo-labels for different datasets? I noticed that the number of pseudo-labels generated for refcoco and refcoco+ differs significantly, although the two datasets seem to contain a similar number of images.

Unable to download the faster RCNN results

Hi, thank you for your work!

I cannot download the 'detection_results.tar.gz' file following the instructions here. Is that a server issue? Can you please provide other download sources?

Best,
Yunzhong
