
Comments (6)

mathildecaron31 commented on July 3, 2024

Hi @KyleZheng1997

  1. I used temperature = 0.1
  2. I experimented with different parameters, and the same schedule as used in this repo was optimal (0.996 to 1 with cosine)
  3. 65k, though it gave only a minor improvement over 4k, for example
  4. The learning rate is the same as in this repo. For weight decay, keeping it fixed at 0.05 worked best in my experiments.
  5. no clipping
  6. 3 layers, with a hidden dimension of 2048 and a final dimension of 256
  7. I use synchronous BN in the projection head
  8. batch size is 1024, just like in the DINO experiments
  9. same as in this repo
  10. the symmetric version
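The momentum schedule mentioned in point 2 (0.996 to 1 with cosine) can be sketched as follows. This is a minimal stand-alone version; the function and argument names (`momentum_schedule`, `base_m`, `total_steps`) are illustrative, not taken from the repo:

```python
import math

def momentum_schedule(step, total_steps, base_m=0.996, final_m=1.0):
    """Cosine-increase the teacher EMA momentum from base_m to final_m.

    At step 0 this returns base_m; at total_steps it returns final_m,
    following a half-cosine curve in between.
    """
    cos_factor = (1 + math.cos(math.pi * step / total_steps)) / 2  # 1 -> 0
    return final_m - (final_m - base_m) * cos_factor

# The momentum starts at 0.996 and reaches 1.0 at the end of training.
m_start = momentum_schedule(0, 1000)
m_end = momentum_schedule(1000, 1000)
```

The teacher parameters would then be updated each step as `teacher = m * teacher + (1 - m) * student`, with `m` taken from this schedule.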

Hope that helps :)

from dino.

mathildecaron31 commented on July 3, 2024

In case it helps, here are the 300-epoch checkpoints used in our paper:
BYOL:
https://dl.fbaipublicfiles.com/dino/byol_vitsmall16_300ep_pretrain.pth

SwAV:
https://dl.fbaipublicfiles.com/dino/swav_vitsmall16_300ep_pretrain.pth

MoCo-v2:
https://dl.fbaipublicfiles.com/dino/moco_vitsmall16_300ep_pretrain.pth

DINO:
https://dl.fbaipublicfiles.com/dino/dino_vitsmall16_300ep_pretrain.pth
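A sketch of how these checkpoints could be fetched. `checkpoint_url` and `load_pretrained` are hypothetical helper names, not part of the DINO repo; `torch.hub.load_state_dict_from_url` is the standard PyTorch utility for downloading and caching a state dict:

```python
def checkpoint_url(method):
    """Build the URL of a ViT-S/16 300-epoch checkpoint listed above."""
    assert method in ("byol", "swav", "moco", "dino"), f"unknown method: {method}"
    return f"https://dl.fbaipublicfiles.com/dino/{method}_vitsmall16_300ep_pretrain.pth"

def load_pretrained(method):
    """Download (and cache) the checkpoint, returning its state dict on CPU."""
    import torch  # deferred import so checkpoint_url works without PyTorch
    return torch.hub.load_state_dict_from_url(checkpoint_url(method), map_location="cpu")
```

The resulting state dict would then be loaded into a ViT-S/16 backbone with `model.load_state_dict(...)`.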


mingkai-zheng commented on July 3, 2024

Could you share some implementation details of MoCo-v2 (the transformer version)?

  1. temperature
  2. momentum coefficient (and did you increase m to 1 with a cosine schedule?)
  3. number of negatives in the memory buffer
  4. learning rate and weight decay (and their corresponding schedules)
  5. global clip-norm value
  6. number of layers in the MLP head (and their corresponding dimensions)
  7. Did you use BN in the MLP layers? If so, was it shuffle BN or sync BN?
  8. the batch size
  9. the details of the image augmentations (SimCLR / MoCo-v2 / BYOL style?)
  10. Is the reported MoCo-v2 result from the original version or the symmetric version?

Sorry for so many questions; I would really appreciate it if you could answer them.


mathildecaron31 commented on July 3, 2024

1- AdamW (exactly like the DINO training)
2- GeLU (exactly like the DINO training)
3- No activation after the final layer. The features are L2-normalized (as in the MoCo paper)
4- 72.7% top-1 with linear eval after 800 epochs (71.6% after 300 epochs); please see Tables 2 and 13 of our DINO paper. I have not reported top-5.
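Putting the answers in this thread together (3 layers, hidden dimension 2048, output dimension 256, GeLU, BN, no activation after the final layer, L2-normalized output), the projection head could look roughly like this. This is a sketch, not the authors' code; `in_dim=384` assumes a ViT-S backbone, and plain `BatchNorm1d` stands in for the synchronous BN used in multi-GPU training:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionHead(nn.Module):
    """3-layer MLP head with GeLU and BN, L2-normalized 256-d output.

    For multi-GPU training, the BN layers would be converted with
    nn.SyncBatchNorm.convert_sync_batchnorm to get synchronous BN.
    """

    def __init__(self, in_dim=384, hidden_dim=2048, out_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, out_dim),  # no activation after the final layer
        )

    def forward(self, x):
        # L2-normalize the features, as in the MoCo paper.
        return F.normalize(self.mlp(x), dim=-1)

head = ProjectionHead()
z = head(torch.randn(8, 384))  # batch of 8 ViT-S [CLS] features
```

The normalized outputs can then be compared with a dot product divided by the temperature (0.1, per the answers above) in the contrastive loss.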

Hope that helps!


mathildecaron31 commented on July 3, 2024

Hi @Trent-tangtao

Thanks for your kind words. In order to keep this codebase simple, we are not planning to release our implementations of BYOL, MoCo and SwAV. The repo is meant to be a simple implementation of DINO and the related transfer tasks, not a generic SSL library.

In addition, there are already official releases of these works, so I am not sure of the added value of releasing a re-implementation. I encourage you to take a look at https://github.com/facebookresearch/vissl, https://github.com/facebookresearch/swav or https://github.com/facebookresearch/moco.

I apologize for any inconvenience, and feel free to reach out if you have any questions.


Hiusam commented on July 3, 2024

Hi, thanks for your excellent work! I have the following questions about reproducing the MoCo-v2 results:

  1. Did you use AdamW for the MoCo-v2 training? And what were the optimizer's parameters?
  2. What activation function did you use in the MLP head?
  3. Did you use an activation function after the final layer of the MLP head?
  4. What were the top-1 and top-5 accuracies at the end of MoCo-v2 training for ViT-S?

