fangyi-chen / sqr
License: MIT License
Hi, thank you for the great work. Your paper reports experiments with DETR-like models. Would it be possible to release the source code for the Deformable DETR experiments?
Do you also see very high GPU memory usage when training DN-DETR?
I asked you related questions on Zhihu; thank you very much for your patient answers.
I have fixed the issue with reference point updating when using Iterative Bounding Box Refinement. Reference points need to be stored for each query, since they vary from query to query.
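To illustrate why each collected query needs its own reference point, here is a minimal pure-Python sketch. It is not the repository's implementation: the schedule `START_Q` / `END_Q` reuses the first six `start_q` / `end_q` values quoted in the test snippet on this page, the reading "stage i consumes `collected[start:end]`" is an assumption, and `refine`, `run`, and the scalar reference points are hypothetical simplifications of box refinement in logit space.

```python
import math


def inverse_sigmoid(x, eps=1e-5):
    """Logit of x, clamped away from 0 and 1 for numerical safety."""
    x = min(max(x, eps), 1 - eps)
    return math.log(x / (1 - x))


def sigmoid(x):
    return 1 / (1 + math.exp(-x))


def refine(reference, delta):
    """One refinement step: add the predicted offset in logit space."""
    return sigmoid(inverse_sigmoid(reference) + delta)


# Assumed recollection schedule (hypothetical reading of start_q / end_q).
START_Q = [0, 0, 1, 2, 4, 7]
END_Q = [1, 2, 4, 7, 12, 20]


def run(initial_reference, deltas):
    # Every collected entry stores its own reference point next to the query name.
    collected = [("q0", initial_reference)]
    inputs_per_stage = []
    for stage, (start, end) in enumerate(zip(START_Q, END_Q)):
        selected = collected[start:end]
        inputs_per_stage.append(len(selected))
        for name, ref in selected:
            # The refined reference depends on the reference this query came in with.
            collected.append((f"{name}>s{stage + 1}", refine(ref, deltas[stage])))
    return collected, inputs_per_stage


collected, inputs_per_stage = run(0.5, [0.1] * 6)
print(inputs_per_stage)
print(collected[2], collected[3])
```

Under these assumptions, the two queries produced by the second stage end up with different reference points even though they received the same offset, which is why a single shared reference point per stage would not be enough.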
Hi, I ran the Deformable DETR setting of SQR and got an error when calling the ms_deform_attn_forward function. It seems that the CUDA op of Deformable DETR has not been built or installed.
@Fangyi-Chen Thanks for your great work. I would like to ask how much the inference latency increases compared to the basic pathway, since queries are much heavier in later decoding stages.
Thanks for your work. While reading your paper and the published code, I could not tell which files in the published code apply the Selective Query Recollection method. Could you point them out?
Hi @Fangyi-Chen, could you kindly share the code for computing TP F Rate and FP E Rate? Many thanks!
We found and fixed a previously unnoticed bug in SQR-Deformable-DETR.
The code logic of SQR-Deformable-DETR (and of the original Deformable-DETR) is as follows:
1. Calculate and update the queries stage by stage in SQR/mmdet/models/utils/transformer.py (Line 707 in ceb538f).
2. Collect the calculated queries.
3. Feed the collected queries to the cls branch and box branch.
For the original Deformable-DETR's training and inference, and for SQR's inference, we have 6 queries and 6 levels of corresponding cls branches and box branches, so the alignment is simply one query per level.
For SQR training, however, we have to align the collected queries (32 queries) with the corresponding 6 levels of cls branches and box branches.
This alignment should not be applied to SQR's inference pipeline, but we accidentally applied it there. However, for (SQR-)Deformable-DETR the cls branches and box branches are shared across all levels, so the bug exists only logically and did not actually affect anything.
We have added two lines of code to force SQR-Deformable-DETR to use the original pipeline.
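The difference between the two alignments can be sketched in a few lines of Python. This is an illustration, not the two-line fix itself: it assumes the 32 collected queries are ordered by the stage that produced them, in groups of 1, 2, 3, 5, 8 and 13 (these counts sum to 32 and match `end_q - start_q` for the first six entries of the schedule quoted in the test snippet on this page). The function names are hypothetical.

```python
# Assumed number of queries emitted by each of the 6 decoder stages during
# SQR training; they sum to the 32 collected queries.
QUERIES_PER_STAGE = [1, 2, 3, 5, 8, 13]


def training_branch_index(queries_per_stage):
    """For each collected query, the level of the cls/box branch that
    should process it (the stage that produced the query)."""
    return [stage
            for stage, count in enumerate(queries_per_stage)
            for _ in range(count)]


def inference_branch_index(num_stages):
    """Original pipeline: one query per stage, aligned one-to-one."""
    return list(range(num_stages))


train_alignment = training_branch_index(QUERIES_PER_STAGE)
infer_alignment = inference_branch_index(len(QUERIES_PER_STAGE))

print(len(train_alignment))   # number of collected queries
print(train_alignment[:6])    # branch level for the first six collected queries
print(infer_alignment)        # branch level per query at inference
```

When the branches are shared across levels, both index lists select the same module, which is consistent with the bug having no effect on results.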
I would like to use the SQR decoder from your paper to improve the decoder in my model. To avoid affecting downstream tasks, I need to test the decoder on its own.
I have written some test code, but it still has issues. Is there a recommended way to test the decoder separately?
import torch

# QRDeformableDetrTransformerDecoder is assumed to be defined earlier in this
# same file (the traceback shows its forward methods inside sqrtest.py).

if __name__ == '__main__':
    # Query tensor with shape (num_query, batch_size, embed_dims).
    x = torch.rand((300, 4, 256)).cuda()
    valid_ratios = torch.rand((4, 1, 2))
    reference_points = torch.rand((4, 300, 4))
    reg_branches = None
    model = QRDeformableDetrTransformerDecoder(
        num_layers=6,
        return_intermediate=True,
        start_q=[0, 0, 1, 2, 4, 7, 12],  # 2
        end_q=[1, 2, 4, 7, 12, 20, 33],  # 2
        transformerlayers=dict(
            type='DetrTransformerDecoderLayer',
            attn_cfgs=[
                dict(
                    type='MultiheadAttention',
                    embed_dims=256,
                    num_heads=8,
                    dropout=0.1),
                dict(
                    type='MultiScaleDeformableAttention',
                    embed_dims=256)
            ],
            feedforward_channels=1024,
            ffn_dropout=0.1,
            operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
                             'ffn', 'norm'))).cuda()
    # No value, spatial_shapes or level_start_index is passed here.
    out = model(x, valid_ratios=valid_ratios, reference_points=reference_points, reg_branches=None)
    print(out[0].shape)
    print(out[1].shape)
The above code reported an error:
Traceback (most recent call last):
  File "/media/cheng/dataset4/codeto2022/FairMOT_modifed/src/sqrtest.py", line 381, in <module>
    out = model(x,valid_ratios=valid_ratios,reference_points=reference_points,reg_branches=None)
  File "/media/cheng/dataset4/annaconda3/envs/fashionclip/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/cheng/dataset4/codeto2022/FairMOT_modifed/src/sqrtest.py", line 223, in forward
    **kwargs)
  File "/media/cheng/dataset4/codeto2022/FairMOT_modifed/src/sqrtest.py", line 144, in forward
    **kwargs)
  File "/media/cheng/dataset4/annaconda3/envs/fashionclip/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/cheng/dataset4/annaconda3/envs/fashionclip/lib/python3.7/site-packages/mmcv/cnn/bricks/transformer.py", line 850, in forward
    **kwargs)
  File "/media/cheng/dataset4/annaconda3/envs/fashionclip/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/cheng/dataset4/annaconda3/envs/fashionclip/lib/python3.7/site-packages/mmcv/utils/misc.py", line 340, in new_func
    output = old_func(*args, **kwargs)
  File "/media/cheng/dataset4/annaconda3/envs/fashionclip/lib/python3.7/site-packages/mmcv/ops/multi_scale_deform_attn.py", line 312, in forward
    assert (spatial_shapes[:, 0] * spatial_shapes[:, 1]).sum() == num_value
TypeError: 'NoneType' object is not subscriptable
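The assertion in the last frame checks that the per-level feature-map sizes in spatial_shapes add up to num_value. It fails here because spatial_shapes was never passed and is therefore None. A minimal pure-Python sketch of that bookkeeping follows. The four feature-map sizes and the helper name `level_layout` are hypothetical and do not come from the SQR code.

```python
from itertools import accumulate


def level_layout(spatial_shapes):
    """Return (num_value, level_start_index) for a list of (H, W) levels."""
    sizes = [h * w for h, w in spatial_shapes]
    # Total number of positions in the flattened multi-scale feature memory.
    num_value = sum(sizes)
    # Offset at which each level starts inside that flattened memory.
    level_start_index = [0] + list(accumulate(sizes))[:-1]
    return num_value, level_start_index


# Hypothetical feature-map sizes for four levels.
shapes = [(32, 32), (16, 16), (8, 8), (4, 4)]
num_value, level_start_index = level_layout(shapes)
print(num_value, level_start_index)
```

In an actual test these values would be turned into long tensors and passed as the spatial_shapes and level_start_index keyword arguments, together with a value tensor whose first dimension is num_value. Whether the SQR decoder needs any further inputs is not confirmed here.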
Thanks for your work. While reading your paper, I had a question about page 4 (AP grows from 44.5 AP to 51.7, +7.2 AP). Replacing the last layer's output with better intermediate results gives a large improvement, but the actual experimental gains are smaller than that would suggest. Does this mean that the optimal result may be distributed differently across stages?
In addition, could you explain TP F Rate in more detail?