
Comments (5)

hellbell commented on July 26, 2024

Hi @mrT23

> have you considered doing "relabeled-cutmix", meaning pooling the targets from the relevant area only?
>
> it is definitely more complicated, but I think it can even count as a new interesting type of augmentation, that can be used not only when learning from a teacher

I agree. That is a very good point!
It would be an interesting new data augmentation, but in our experiments it showed similar or slightly worse results compared with plain-cutmix.
I think relabeled-cutmix is conceptually better than plain-cutmix, but we may have to find a proper training setting for it. Also, the cutmix region can be very small relative to the image, so a quantization issue may appear during pooling.
Anyway, for this reason, the results with relabeled-cutmix are omitted from the current version of our paper. If we find a better way to implement relabeled-cutmix, we will update the paper.
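
To make the idea concrete, here is a minimal NumPy sketch of what a relabeled-cutmix target could look like. It is not the code from the paper: the helper names (`pool_region`, `relabeled_cutmix_target`), the assumption that each label map already holds per-cell class probabilities, and the plain cell averaging are all illustrative choices. The box is snapped to whole label-map cells, which is one way the quantization issue can show up.

```python
import numpy as np

def pool_region(label_map, box, img_size):
    """Average a (C, h, w) label map over a pixel-space box (x1, y1, x2, y2)."""
    _, h, w = label_map.shape
    x1, y1, x2, y2 = box
    # Snap the box to whole label-map cells: floor the start, ceil the end.
    cx1 = min(int(np.floor(x1 * w / img_size)), w - 1)
    cy1 = min(int(np.floor(y1 * h / img_size)), h - 1)
    cx2 = min(max(int(np.ceil(x2 * w / img_size)), cx1 + 1), w)
    cy2 = min(max(int(np.ceil(y2 * h / img_size)), cy1 + 1), h)
    return label_map[:, cy1:cy2, cx1:cx2].mean(axis=(1, 2))

def relabeled_cutmix_target(map_a, map_b, box, img_size):
    """Soft target for image A with the `box` region replaced by image B."""
    x1, y1, x2, y2 = box
    lam = 1.0 - (x2 - x1) * (y2 - y1) / float(img_size ** 2)  # area kept from A
    target_a = map_a.mean(axis=(1, 2))            # simplification: pool all of A
    target_b = pool_region(map_b, box, img_size)  # pool B only under the patch
    return lam * target_a + (1.0 - lam) * target_b

# Toy example (hypothetical data): 3 classes, 4x4 label maps, 224x224 images.
map_a = np.zeros((3, 4, 4))
map_a[0] = 1.0               # A is class 0 everywhere
map_b = np.zeros((3, 4, 4))
map_b[1, :, :2] = 1.0        # left half of B is class 1
map_b[2, :, 2:] = 1.0        # right half of B is class 2

# Paste the top-left quarter of B onto A.
target = relabeled_cutmix_target(map_a, map_b, (0, 0, 112, 112), 224)
print(target)
```

In this toy case the pasted patch only covers class 1, so the target puts 0.75 on class 0 and 0.25 on class 1, whereas mixing in B's whole-image label would also assign mass to class 2. With a 4x4 map, a 10-pixel box and a 50-pixel box snap to the same single cell and get the same pooled label, which illustrates the small-region problem.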

from relabel_imagenet.

mrT23 commented on July 26, 2024

thanks for the answer

It happens to me all the time that things that seem logical and more "correct" don't immediately improve the score :-)

I think that label-pooling is interesting, and reflects a deeper understanding of the augmentation process and limitations of "single-label" datasets.
I never really liked self-supervised methods (moco, simclr, mix-and-match, ...), because they always assume a contrastive loss where, under augmentations, the labels (or embeddings) should remain the same, which is not always true, especially for spatial augmentations.

with label-pooling, you can also do some kind of contrastive loss, which is much more logical: take an image and a zoomed-in version; demand that the labels of the zoomed-in image and the pooled labels of the original image be the same.
this also has the advantage of self-teaching spatial capabilities to a detector.


hellbell commented on July 26, 2024

@mrT23

> with label-pooling, you can also do some kind of contrastive loss, which is much more logical: take an image and a zoomed-in version; demand that the labels of the zoomed-in image and the pooled labels of the original image be the same.
> this also has the advantage of self-teaching spatial capabilities to a detector.

Your point is quite interesting and reasonable, but I think the biggest advantage of self-supervised learning is that it can train meaningful representations without any supervision. Since our ReLabel needs explicit label supervision, comparison with self-supervised learning methods will not be easy.
Alternatively, adding an auxiliary loss that minimizes the distance between the pooled label and the cropped image's label would be interesting!
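
As a rough sketch of how such an auxiliary term could be wired up, assuming the label map stores per-cell class probabilities and taking KL divergence as the "distance" (both are assumptions, as are the function names and the weight of 0.5, none of which come from the ReLabel code):

```python
import numpy as np

def pooled_label(label_map, box, img_size):
    """Average a (C, h, w) probability map over a pixel-space crop box."""
    _, h, w = label_map.shape
    x1, y1, x2, y2 = box
    cx1 = min(int(np.floor(x1 * w / img_size)), w - 1)
    cy1 = min(int(np.floor(y1 * h / img_size)), h - 1)
    cx2 = min(max(int(np.ceil(x2 * w / img_size)), cx1 + 1), w)
    cy2 = min(max(int(np.ceil(y2 * h / img_size)), cy1 + 1), h)
    return label_map[:, cy1:cy2, cx1:cx2].mean(axis=(1, 2))

def kl_divergence(target, pred, eps=1e-12):
    """KL(target || pred) between two class-probability vectors."""
    target = np.asarray(target, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sum(target * (np.log(target + eps) - np.log(pred + eps))))

def loss_with_consistency(main_loss, crop_pred, label_map, box, img_size, weight=0.5):
    """Main loss plus a weighted pooled-label consistency term (weight is arbitrary)."""
    aux = kl_divergence(pooled_label(label_map, box, img_size), crop_pred)
    return main_loss + weight * aux

# Toy example (hypothetical data): 2 classes, 4x4 label map, 224x224 image.
label_map = np.zeros((2, 4, 4))
label_map[0, :, :2] = 1.0    # left half: class 0
label_map[1, :, 2:] = 1.0    # right half: class 1
box = (112, 0, 224, 224)     # zoom into the right half

# A crop prediction that agrees with the pooled label is penalized less.
good = loss_with_consistency(1.0, [0.01, 0.99], label_map, box, 224)
bad = loss_with_consistency(1.0, [0.99, 0.01], label_map, box, 224)
print(good < bad)
```

Here `crop_pred` stands in for the model's softmax output on the zoomed-in crop; in a real training loop it would come from the network, and the consistency term would be computed in the framework's own tensors so that gradients flow.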


mrT23 commented on July 26, 2024

in my experience and past tests, contrastive methods fail to improve the pretraining quality of models.
this is in contrast to KD and relabeling methods, which also don't need ground truth.

anyway, nice work
:-)


hellbell commented on July 26, 2024

It was an interesting discussion! Thanks.

