
h-deformable-detr's People

Contributors

hardyho, jiadingcn, pkurainbow


h-deformable-detr's Issues

Question about variant of hybrid matching

Hi, I have a question about the performance of the hybrid matching schemes.
As you noted, among the three different hybrid matching schemes, the hybrid-branch variant seems to work well and achieves faster inference, since only 300 queries are used at test time compared to the other variants.
To my knowledge, with two-stage Deformable DETR we can set different numbers of queries at training and test time. So, if you don't mind, I would like to know whether there is any potential performance degradation when training the other variants with a large number of queries and testing with fewer queries in the two-stage architecture.

Thanks!

Is there a bug here?

Traceback (most recent call last):
File "I:/H-Deformable-DETR/main.py", line 537, in
main(args)
File "I:/H-Deformable-DETR/main.py", line 460, in main
train_stats = train_one_epoch(
File "I:\H-Deformable-DETR\engine.py", line 96, in train_one_epoch
loss_dict = train_hybrid(
File "I:\H-Deformable-DETR\engine.py", line 48, in train_hybrid
loss_dict_one2many = criterion(outputs_one2many, multi_targets)
File "H:\miniconda\envs\H-deformable-detr\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "I:\H-Deformable-DETR\models\deformable_detr.py", line 478, in forward
indices = self.matcher(outputs_without_aux, targets)
File "H:\miniconda\envs\H-deformable-detr\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "I:\H-Deformable-DETR\models\matcher.py", line 104, in forward
C = C.view(bs, num_queries, -1).cpu()
RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1] because the unspecified dimension size -1 can be any value and is ambiguous
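This reshape fails whenever the flattened cost matrix is empty: the shape [1, 0, -1] in the message means the matcher saw num_queries == 0, i.e. the one-to-many outputs passed to the criterion contain no queries at all (for example, when the model is built with only the one-to-one queries, so the one-to-many slice is empty). Below is a minimal, hypothetical guard for a DETR-style matcher that makes the degenerate case explicit; it is an illustration of where the reshape breaks, not the repository's fix.

    # Hypothetical guard (not the repository's code): if this branch received
    # zero queries or the whole batch has zero ground-truth boxes, C has zero
    # elements and C.view(bs, num_queries, -1) cannot infer its last dimension.
    import torch
    from scipy.optimize import linear_sum_assignment

    def hungarian_match_with_empty_guard(C, bs, num_queries, sizes):
        # sizes[i] = number of ground-truth boxes in image i of the batch
        if num_queries == 0 or C.numel() == 0:
            empty = torch.empty(0, dtype=torch.int64)
            return [(empty, empty) for _ in range(bs)]
        C = C.view(bs, num_queries, -1).cpu()
        indices = [linear_sum_assignment(c[i]) for i, c in enumerate(C.split(sizes, -1))]
        return [
            (torch.as_tensor(i, dtype=torch.int64), torch.as_tensor(j, dtype=torch.int64))
            for i, j in indices
        ]

If the one-to-many slice of pred_logits really has zero queries, the more likely fix is to configure the model with enough total queries to cover both the one-to-one and one-to-many groups.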

RuntimeError: CUDA error: device-side assert triggered

tensor([70, 81, 32, 1], device='cuda:0')
Traceback (most recent call last):
File "main50.py", line 565, in
main(args)
File "main50.py", line 418, in main
train_stats = train_one_epoch_burnin(
File "/netscratch/shehzadi/Rego-semi/aH-semi/engine.py", line 377, in train_one_epoch_burnin
loss_dict = train_hybrid(
File "/netscratch/shehzadi/Rego-semi/aH-semi/engine.py", line 54, in train_hybrid
loss_dict = criterion(outputs, targets)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/netscratch/shehzadi/Rego-semi/aH-semi/models/deformable_detr.py", line 457, in forward
indices = self.matcher(outputs_without_aux, targets)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/netscratch/shehzadi/Rego-semi/aH-semi/models/matcher.py", line 176, in forward
cost_class = pos_cost_class[:, tgt_ids] - neg_cost_class[:, tgt_ids]
RuntimeError: CUDA error: device-side assert triggered
terminate called after throwing an instance of 'c10::CUDAError'
what(): CUDA error: device-side assert triggered

I am getting this error.
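A device-side assert at pos_cost_class[:, tgt_ids] typically means a target label index is out of range for the classification head (the printed tensor([70, 81, 32, 1]) looks like label ids). Running with CUDA_LAUNCH_BLOCKING=1 pinpoints the failing kernel; a CPU-side check like the hypothetical helper below (not part of the repo) makes the offending value visible before it reaches the GPU:

    import torch

    def check_target_labels(targets, num_classes):
        # Raise a readable error instead of a CUDA device-side assert when a
        # dataset label id falls outside [0, num_classes - 1].
        for i, t in enumerate(targets):
            labels = t["labels"].detach().cpu()
            if labels.numel() and (labels.min() < 0 or labels.max() >= num_classes):
                raise ValueError(
                    f"image {i}: labels {labels.tolist()} outside [0, {num_classes - 1}]; "
                    "check the dataset's category-id mapping against num_classes"
                )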

Will the AP improve if I use fp16?

Will the AP improve if I use fp16?
Will fp16 slow down training and make it take more time?
I have a V100 32G and I set batch=4.
I really don't know...
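For reference, mixed precision is mainly a speed and memory optimization rather than an accuracy one, so AP is not expected to change much either way. Below is a generic torch.cuda.amp training step showing how fp16 autocasting and gradient scaling fit together; it is a minimal sketch, not the repository's engine.py:

    import torch

    scaler = torch.cuda.amp.GradScaler()

    def train_step(model, criterion, optimizer, samples, targets):
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():
            outputs = model(samples)
            loss_dict = criterion(outputs, targets)
            loss = sum(loss_dict.values())
        scaler.scale(loss).backward()   # scale to avoid fp16 gradient underflow
        scaler.step(optimizer)          # unscale gradients and apply the update
        scaler.update()                 # adjust the loss scale for the next step
        return loss.item()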

Things about training

I ran this command: GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8
--coco_path , but I get this error: launch.py: error: the following arguments are required: training_script, training_script_args.
Can you tell me why?

How can I apply this code to other datasets?

The complete COCO dataset is too large for me as I only have a single 8GiB graphics card, and training on it would take too long. How can I modify this code to train or test on my own dataset?
I apologize for asking such a foolish question, and I have great admiration for the work your team has done.
However, I noticed that the dataset path in the code is set to 'coco_path'. Does this mean that if I want to train on a different dataset, I would need to put a lot of effort into adjusting the structure of the existing code?
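Assuming this repository keeps the upstream DETR/Deformable-DETR dataset code, the least invasive route is to export your own data in COCO format and point --coco_path at a directory arranged the way datasets/coco.py expects; typically no structural changes are needed beyond adjusting the number of classes in the model-building code. A small, hypothetical sanity-check sketch of that assumed layout:

    # Illustrative helper only: checks the DETR-style COCO layout that
    # datasets/coco.py expects; the file names follow the COCO convention.
    from pathlib import Path

    def check_coco_layout(coco_path):
        root = Path(coco_path)
        required = [
            root / "annotations" / "instances_train2017.json",
            root / "annotations" / "instances_val2017.json",
            root / "train2017",   # training images
            root / "val2017",     # validation images
        ]
        missing = [str(p) for p in required if not p.exists()]
        if missing:
            raise FileNotFoundError("COCO-style layout incomplete: " + ", ".join(missing))

The category ids in the annotation files should stay within the model's num_classes; otherwise the matcher can hit the out-of-range assert discussed in the issue above.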

Question about proposal generation

Hello @PkuRainBow, thanks for open-sourcing your excellent work!

I have a question about this code snippet (around line 244) in deformable_transformer.py:

...
            topk = self.two_stage_num_proposals
            topk_proposals = torch.topk(enc_outputs_class[..., 0], topk, dim=1)[1]
            topk_coords_unact = torch.gather(
                enc_outputs_coord_unact, 1, topk_proposals.unsqueeze(-1).repeat(1, 1, 4)
            )
...

Does the tensor enc_outputs_class[..., 0] (where enc_outputs_class.shape = (batch_size, len_flattened_encoder_seq, 91)) represent the classification prediction for the first foreground class?

In my understanding, the purpose here is to select the top-k foreground proposals according to the highest foreground scores (across all foreground classes).

So why not use topk_proposals = torch.topk(enc_outputs_class.max(dim=-1)[0], topk, dim=1)[1] instead?

Could you please give some explanation? Thanks!
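For reference, here is the alternative selection written out as a self-contained sketch (same shapes as the snippet above; the flag name is mine). One caveat, stated as my reading of upstream Deformable-DETR rather than this repository's documentation: the encoder classification branch there is supervised class-agnostically (all ground-truth labels are mapped to class 0 before matching), so channel 0 effectively acts as an objectness score, and taking the max over all 91 channels would only matter if that branch were trained with the full label set.

    import torch

    def select_topk_proposals(enc_outputs_class, enc_outputs_coord_unact, topk,
                              use_max_over_classes=False):
        # enc_outputs_class:       (bs, len_seq, num_classes) raw logits
        # enc_outputs_coord_unact: (bs, len_seq, 4) unactivated box parameters
        if use_max_over_classes:
            scores = enc_outputs_class.max(dim=-1)[0]   # best score over all classes
        else:
            scores = enc_outputs_class[..., 0]          # channel 0 only, as in the repo
        topk_proposals = torch.topk(scores, topk, dim=1)[1]
        topk_coords_unact = torch.gather(
            enc_outputs_coord_unact, 1, topk_proposals.unsqueeze(-1).repeat(1, 1, 4)
        )
        return topk_proposals, topk_coords_unact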

Can this code work with Windows?

I configured the environment on Windows and ran "python main.py" to launch the code, but it gave me an error like this:

File "main.py", line 536, in
main(args)
File "main.py", line 470, in main
use_fp16=args.use_fp16,
File "F:\SOTA_DETR\H-Deformable-DETR-master\engine.py", line 97, in train_one_epoch
outputs, targets, k_one2many, criterion, lambda_one2many
File "F:\SOTA_DETR\H-Deformable-DETR-master\engine.py", line 48, in train_hybrid
loss_dict_one2many = criterion(outputs_one2many, multi_targets)
File "D:\Anaconda3\envs\detr\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "F:\SOTA_DETR\H-Deformable-DETR-master\models\deformable_detr.py", line 478, in forward
indices = self.matcher(outputs_without_aux, targets)
File "D:\Anaconda3\envs\detr\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "F:\SOTA_DETR\H-Deformable-DETR-master\models\matcher.py", line 106, in forward
C = C.view(bs, num_queries, -1).cpu()
RuntimeError: cannot reshape tensor of 0 elements into shape [2, 0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

I wonder whether the code can run properly on Windows.
I really appreciate the code you have provided to us. Thank you!

LVIS configs

Hi, could you please provide the configs for the LVIS dataset? I'd appreciate it!

Questions about the DETR's performance

Thank you very much for your work.
Since the experiments were conducted on Deformable DETR, I would like to know whether this training strategy also brings any improvement to the original DETR.
I am also concerned about the growth in training GPU memory usage.
If you have the corresponding data, I would be very grateful.

The total batch size

Thanks for your awesome work!
However, I could not find this key piece of information in the paper or the released code: what is the total batch size of H-DETR, 32 or 16?

Thanks!

Question about the design of the self-attention mask

Hi,

Really nice job on the paper. I was excited to read it.

I was wondering if you could explain the attention masks a bit further. FYI, I am referring to your hybrid-branch scheme, which I believe was used in the rest of the paper. If I understood the paper and your code correctly, the attention masks are only being used to prevent information leakage between the two groups (one-to-one and one-to-many), which makes sense.

However, I don't understand why you did not also want to prevent information leakage between the queries within the one-to-many group. I understand that you repeat the ground truth K times so that multiple queries can match the same object. However, I would think that the self-attention performed in the decoder for the one-to-many group would naturally prevent multiple queries from selecting the same object, since the whole point of self-attention here is to remove duplicates. If you added an attention mask within that group as well, I would think it would resolve this issue.

I think I may be fundamentally misunderstanding something, as this clearly worked for you. Any insight would be appreciated. I have linked below the code that I have been looking at.

Thanks,
Owen

    # make attn mask: attention mask to prevent information leakage between
    # the one-to-one and one-to-many query groups
    self_attn_mask = (
        torch.zeros([self.num_queries, self.num_queries]).bool().to(src.device)
    )
    # one-to-many queries cannot attend to one-to-one queries
    self_attn_mask[self.num_queries_one2one:, 0:self.num_queries_one2one] = True
    # one-to-one queries cannot attend to one-to-many queries
    self_attn_mask[0:self.num_queries_one2one, self.num_queries_one2one:] = True

    # self attention
    q = k = self.with_pos_embed(tgt, query_pos)
    tgt2 = self.self_attn(
        q.transpose(0, 1),
        k.transpose(0, 1),
        tgt.transpose(0, 1),
        attn_mask=self_attn_mask,
    )[0].transpose(0, 1)
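To make the mask semantics concrete, here is a tiny stand-alone check with toy sizes and my own variable names (not the paper's 300/1800 split), showing that a boolean attn_mask with True entries blocks exactly the cross-group attention, which is all the mask above does:

    import torch
    from torch import nn

    num_one2one, num_total, dim = 2, 5, 16   # toy sizes for illustration
    mask = torch.zeros(num_total, num_total, dtype=torch.bool)
    mask[num_one2one:, :num_one2one] = True  # one-to-many queries cannot see one-to-one
    mask[:num_one2one, num_one2one:] = True  # one-to-one queries cannot see one-to-many

    attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
    x = torch.randn(1, num_total, dim)
    out, weights = attn(x, x, x, attn_mask=mask)
    # cross-group attention weights are exactly zero; within-group weights are not
    print(weights[0, :num_one2one, num_one2one:].abs().max())  # tensor(0.)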

Question about exporting the model to TorchScript

I exported the model to TorchScript format. When I use the exported model to run inference on an image, it only works on the image that I used when exporting the model; for any other image it fails, and the error message is:
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py:1051: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
return forward_call(*input, **kwargs)
Traceback (most recent call last):
File "/root/autodl-tmp/project/deploy/export_model.py", line 264, in
out1 = m(data)
File "/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/detectron2/export/flatten.py", line 9, in forward
def forward(self: torch.detectron2.export.flatten.TracingAdapter,
argument_1: Tensor) -> Tuple[Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor]:
_0, _1, _2, _3, _4, _5, _6, = (self.model).forward(argument_1, )
~~~~~~~~~~~~~~~~~~~ <--- HERE
return (_0, _1, _2, _3, _4, _5, _6)
File "code/torch/adet/modeling/text_spotter.py", line 23, in forward
batched_imgs = torch.unsqueeze(_7, 0)
x0 = torch.contiguous(batched_imgs)
_8, _9, _10, _11, = (_0).forward(x0, image_size, )
~~~~~~~~~~~ <--- HERE
_12 = torch.softmax(_9, -1)
prob = torch.sigmoid(torch.mean(_8, [-2]))
File "code/torch/adet/modeling/model/detection_transformer.py", line 50, in forward
_29 = getattr(self.input_proj, "1")
_30 = getattr(self.input_proj, "0")
_31 = (self.backbone).forward(x, image_size, )
~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_32, _33, _34, _35, _36, _37, _38, _39, _40, _41, _42, _43, _44, _45, _46, _47, _48, _49, _50, _51, _52, _53, _54, _55, _56, _57, = _31
_58 = (_30).forward(_32, )
File "code/torch/adet/modeling/text_spotter.py", line 104, in forward
image_size: Tensor) -> Tuple[Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor]:
_61 = getattr(self, "1")
_62 = (getattr(self, "0")).forward(x, image_size, )
~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_63, _64, _65, _66, _67, _68, _69, = _62
pos_embed = torch.to((_61).forward(_63, ), 6)
File "code/torch/adet/modeling/text_spotter.py", line 143, in forward
_92 = torch.slice(torch.slice(_91, 0, 0, 125), 1, 0, 138)
_93 = torch.view(CONSTANTS.c2, annotate(List[int], []))
_94 = torch.copy_(_92, torch.expand(_93, [125, 138]))
~~~~~~~~~~~ <--- HERE
masks_per_feature_level0 = torch.ones([_85, _86, _87], dtype=11, layout=None, device=torch.device("cpu"), pin_memory=False)
_95 = torch.select(masks_per_feature_level0, 0, 0)

Traceback of TorchScript, original code (most recent call last):
/root/autodl-tmp/project/adet/modeling/text_spotter.py(60): mask_out_padding
/root/autodl-tmp/project/adet/modeling/text_spotter.py(43): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/adet/modeling/text_spotter.py(21): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/adet/modeling/model/detection_transformer.py(168): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/adet/modeling/text_spotter.py(220): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/export/flatten.py(259):
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/export/flatten.py(294): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/jit/_trace.py(952): trace_module
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/jit/_trace.py(735): trace
/root/autodl-tmp/project/deploy/export_model.py(125): export_tracing
/root/autodl-tmp/project/deploy/export_model.py(224):
/root/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py(18): execfile
/root/.pycharm_helpers/pydev/pydevd.py(1496): _exec
/root/.pycharm_helpers/pydev/pydevd.py(1489): run
/root/.pycharm_helpers/pydev/pydevd.py(2177): main
/root/.pycharm_helpers/pydev/pydevd.py(2195):
RuntimeError: The size of tensor a (50) must match the size of tensor b (125) at non-singleton dimension 0
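The traceback points at the padding-mask construction: torch.jit.trace records concrete tensor shapes, so the 125x138 mask size of the export image is frozen into the graph, and an image with a different padded size fails (here 50 vs 125). A generic sketch of how to catch this at export time, using torch.jit.trace's check_inputs with a second, differently sized image (illustrative only, not the repository's export script); the longer-term options are padding or resizing inputs to the traced size, or rewriting the shape-dependent code so it derives sizes from the input tensor rather than from traced constants:

    import torch

    def export_traced(model, example_image, other_image, out_path="model.ts"):
        model.eval()
        traced = torch.jit.trace(
            model,
            (example_image,),
            check_inputs=[(other_image,)],  # re-runs the trace check on a second input
        )
        traced.save(out_path)
        return traced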

About the Hybrid-Layer Implementation

Thank you for your great work and well-organized repo.
However, I could not find any code for the variant schemes hybrid epoch or hybrid layer. How did you implement the hybrid layer? Are there two decoders that compute their losses separately?

Implementation details for the one-to-many-only branch

Hi, I have a question about the ablation setting in Table 13.
There seem to be no implementation details for using only one-to-many label assignment.
Did you set K to 6? Do the encoder outputs also use one-to-many label assignment, which is not used in hybrid DETR?

Thanks

MMCV_Custom

Hi, thank you for your work. I would like to ask how MMCV_Custom can be used in an MMDetection project of one's own. I want to use AMP acceleration in MMDetection and wonder whether that is feasible.
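For what it's worth, stock MMDetection 2.x already exposes mixed precision through its config system, independently of the mmcv_custom folder here: adding an fp16 field makes the runner wrap the optimizer in mmcv's Fp16OptimizerHook. A minimal sketch of that standard usage (the base config name is a placeholder):

    # my_fp16_config.py (hypothetical config file)
    _base_ = './your_base_config.py'  # placeholder for an existing mmdet config

    # enables mixed-precision training via mmcv's Fp16OptimizerHook
    fp16 = dict(loss_scale=512.0)     # or dict(loss_scale='dynamic')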

About the generalization of the hybrid matching query setting

Hi, thanks for such great work! I wonder whether you have tested the generalization of the hybrid matching proposed in your paper. I tried to implement the hybrid matching queries on DINO-Deformable-DETR, and the performance degraded from 48.7 mAP to 46.5 under the standard 1x schedule, which suggests the hybrid matching strategy in your paper cannot easily transfer to other DETR-based object detectors. Hope to get your reply.

Question about weight decay

I saw in your table that, compared with the usual weight decay of 1e-4, weight decay = 0.05 gives better performance.
However, that comparison is only shown for the Swin backbone in the table.
Will weight decay = 0.05 also work better with the ResNet backbone?
Hope for your reply!
