
Comments (7)

Fangyi-Chen commented on July 22, 2024

We also observe a (predictable) GPU memory increase when training any SQR-based method, because the increased number of queries flows through multiple decoding layers with backward gradients stored for each.

We did consider how to reduce the main negative effect of SQR, i.e., the additional training time. Since we used the A100-80GB version, GPU memory was not a concern for us. Different implementations can lead to very different GPU memory overhead. Our implementation is simple and easy to understand -- basically only a few lines of code -- but it is not the most efficient one. We would be glad to receive any advice on a faster implementation of SQR!
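For readers wondering what those "few lines" do conceptually: in SQR, each decoder stage processes not only the queries from the immediately preceding stage but also queries recollected from the stage before that, so the number of query groups grows Fibonacci-style (1, 2, 3, 5, 8, 13). A minimal, framework-free sketch of that recollection loop (all names are illustrative, not the repo's actual code):

```python
def decoder_with_sqr(layers, queries):
    """Sketch of selective query recollection (SQR).

    `layers` is a list of decoder-stage callables; `queries` is the
    initial query group. Each stage processes the previous stage's
    outputs plus the recollected outputs of the stage before that,
    so the group counts grow 1, 2, 3, 5, 8, 13 over six stages.
    """
    batches = [queries]   # query groups entering the current stage
    prev = [queries]      # outputs of the stage before the previous one
    all_outputs = []      # every group, flattened in stage order
    for layer in layers:
        out = [layer(q) for q in batches]
        all_outputs.extend(out)
        # recollect: the next stage sees the earlier stage's groups
        # followed by the current stage's groups
        batches, prev = prev + out, out
    return all_outputs
```

With six stages this yields 1 + 2 + 3 + 5 + 8 + 13 = 32 supervised query groups, which is exactly the extra set of activations (and stored gradients) that inflates training memory.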

I also noticed that Group DETR and H-DETR perform similar operations when handling groups of queries. I will take a look at their implementations to see whether they are faster; if so, I will update the implementation in this repo.

Finally, do not hesitate to ask if you have any further questions. Thanks!


powermano commented on July 22, 2024

SQR does consume some additional GPU memory, but not too much. I found it was mostly due to problems in my code; I have fixed them, and training GPU memory is now within an acceptable range.

On my own dataset, AP increased by 2.5 points compared to DN-DETR. Very good work! The DINO trick (look forward twice) can be combined with SQR to bring greater improvement.


SYH9905 commented on July 22, 2024


Hello, how can look forward twice be combined with SQR? In methods like DN and DAB the queries carry x, y, w, h parameters, and during the refine operation a box may be influenced jointly by several subsequent layers. How should this problem be solved?


SYH9905 commented on July 22, 2024

Previously, when applying look forward twice to DAB-DETR, each layer was isolated with detach().
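For context on the two schemes being discussed, a schematic sketch (hypothetical helper names, following the `inverse_sigmoid`/`bbox_embed` naming used elsewhere in this thread, not the actual DAB/DINO code): "look forward once" supervises a box built on the detach()-ed reference, isolating each layer, while "look forward twice" supervises a box built on the un-detached reference, so each layer's loss also reaches the previous layer's refinement.

```python
import torch

def inverse_sigmoid(x, eps=1e-5):
    x = x.clamp(eps, 1 - eps)
    return torch.log(x / (1 - x))

def refine_boxes(ref_detached, ref_undetached, hidden, bbox_embed,
                 look_forward_twice):
    """One decoder layer's box refinement step (schematic).

    ref_detached / ref_undetached: the previous layer's refined box,
    with and without detach(); bbox_embed predicts a logit-space offset.
    Returns the box to supervise plus the (detached, undetached) boxes
    handed to the next layer.
    """
    delta = bbox_embed(hidden)
    # the reference chained to the next layer is always the detached
    # one, so the decoder chain itself stays layer-isolated
    new_ref = (inverse_sigmoid(ref_detached) + delta).sigmoid()
    if look_forward_twice:
        # supervise a box built on the UN-detached previous reference:
        # this layer's loss also updates the previous layer's offset
        output = (inverse_sigmoid(ref_undetached) + delta).sigmoid()
    else:
        # look forward once: supervise the layer-isolated box
        output = new_ref
    return output, new_ref.detach(), new_ref
```

The question above is precisely about what happens when, under SQR, one recollected reference feeds several later layers through the un-detached path; the detach()-per-layer scheme avoids that interaction at the cost of shorter gradient paths.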


SYH9905 commented on July 22, 2024


Hello, will the code for SQR-DAB-DETR be released? I would like to know how to integrate SQR for the refine operation when DAB-DETR already uses look forward twice.


powermano commented on July 22, 2024


> Hello, how can look forward twice be combined with SQR? In methods like DN and DAB the queries carry x, y, w, h parameters, and during the refine operation a box may be influenced jointly by several subsequent layers. How should this problem be solved?

Following is my sqr_dn-detr code, based on the https://github.com/IDEA-Research/detrex repo.

    hidden_states, reference_boxes = self.transformer(
        features,
        img_masks,
        input_box_query,
        pos_embed,
        target=input_label_query,
        attn_mask=[attn_mask, None],  # None mask for cross attention
    )

    if self.training:
        outputs_classes, outputs_coords = [], []
        for qid in range(hidden_states.shape[0]):
            # version 2: use the corresponding reference to update the new reference point
            if qid < 1:
                lvl = 0
            elif qid < 3:
                lvl = qid - 1
            elif qid < 6:
                lvl = qid - 2
            elif qid < 11:
                lvl = qid - 4
            elif qid < 19:
                lvl = qid - 7
            elif qid < 32:
                lvl = qid - 12
            else:
                raise ValueError(f"unexpected qid: {qid}")

            reference = reference_boxes[lvl]

            # Calculate output coordinates and classes.
            reference = inverse_sigmoid(reference)
            anchor_box_offsets = self.bbox_embed(hidden_states[qid])
            outputs_coord = (reference + anchor_box_offsets).sigmoid()
            outputs_class = self.class_embed(hidden_states[qid])  # (layers, bs, num_q + dn_group * max_gt_per_img, 1)

            outputs_coords.append(outputs_coord)
            outputs_classes.append(outputs_class)

        outputs_class = torch.stack(outputs_classes)
        outputs_coord = torch.stack(outputs_coords)
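The hard-coded branches above appear to encode the Fibonacci stage layout of SQR: six decoder stages emit 1, 2, 3, 5, 8, 13 query groups (32 in total), and within each stage `lvl` is `qid` minus a per-stage constant. As a sanity check, the branch chain can be replaced by a table-driven lookup (the `SIZES`/`OFFSETS` names are mine, read off the code above, not the repo's):

```python
# per-stage query-group counts under SQR, and the per-stage constants
# subtracted in the branch chain above (assumed, read off the code)
SIZES = [1, 2, 3, 5, 8, 13]
OFFSETS = [0, 1, 2, 4, 7, 12]

def qid_to_lvl(qid):
    """Table-driven equivalent of the if/elif chain that maps a
    flattened query-group index to its reference-box index."""
    start = 0
    for size, offset in zip(SIZES, OFFSETS):
        if qid < start + size:
            return qid - offset
        start += size
    raise ValueError(f"unexpected qid: {qid}")
```

For every qid in range(32) this reproduces the chain exactly; note the offsets grow by 1, 1, 2, 3, 5, mirroring the Fibonacci recollection pattern.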


SYH9905 commented on July 22, 2024

> Following is my sqr_dn-detr code, based on the https://github.com/IDEA-Research/detrex repo. […]

Thank you for your help and answers.

