Comments (3)
Yes, linear probing is quite sensitive to hyperparameters, which we also observed. One idea we adopted in our more recent MAE work (https://arxiv.org/abs/2111.06377) is to add an extra, parameter-free BN0 (i.e., BN without learnable weights) that normalizes features "on-the-fly" before the linear classifier. Since the BN0 statistics can be absorbed into the weights, it does not violate the linear probing rule, and it can help reduce the parameter search when switching to other architectures. For linear probing, the main parameter to search is the learning rate (likely due to different feature scales).
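To make the "absorbed in the weights" point concrete, here is a minimal pure-Python sketch, not the MAE or moco-v3 code. It assumes the BN0 statistics are fixed (for example, running estimates at evaluation time) and checks that normalizing and then applying a linear classifier gives the same output as a single linear layer with rescaled weights. All names (`batch_stats`, `fold_bn0`, the toy features) are hypothetical and only for illustration. In PyTorch, a parameter-free BN of this kind would typically be written as `torch.nn.BatchNorm1d(dim, affine=False)`.

```python
import math
import random

def batch_stats(features, eps=1e-6):
    """Per-dimension mean and std over a batch (list of equal-length lists)."""
    n, dim = len(features), len(features[0])
    mean = [sum(row[j] for row in features) / n for j in range(dim)]
    var = [sum((row[j] - mean[j]) ** 2 for row in features) / n for j in range(dim)]
    return mean, [math.sqrt(v + eps) for v in var]

def linear(x, weight, bias):
    """y = W x + b for a single feature vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def bn0_then_linear(x, mean, std, weight, bias):
    """Parameter-free normalization (no scale/shift) followed by the classifier."""
    x_norm = [(xi - m) / s for xi, m, s in zip(x, mean, std)]
    return linear(x_norm, weight, bias)

def fold_bn0(mean, std, weight, bias):
    """Absorb fixed BN0 statistics: W' = W / std, b' = b - W' mean."""
    folded_w = [[w / s for w, s in zip(row, std)] for row in weight]
    folded_b = [b - sum(w * m for w, m in zip(row, mean))
                for row, b in zip(folded_w, bias)]
    return folded_w, folded_b

random.seed(0)
dim, num_classes = 4, 3
# Hypothetical frozen-backbone features whose scale differs a lot per dimension.
feats = [[random.gauss(0, 10 ** j) for j in range(dim)] for _ in range(32)]
W = [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(num_classes)]
b = [0.0] * num_classes

mean, std = batch_stats(feats)
W_folded, b_folded = fold_bn0(mean, std, W, b)
out_bn0 = bn0_then_linear(feats[0], mean, std, W, b)
out_folded = linear(feats[0], W_folded, b_folded)
max_diff = max(abs(p - q) for p, q in zip(out_bn0, out_folded))
```

Because the two outputs agree up to floating-point error, the normalized classifier is still a plain linear map of the frozen features. What BN0 changes is the conditioning of the optimization: the features reach the classifier at a comparable scale, which is one plausible reason the learning rate becomes easier to tune.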
from moco-v3.
I read the MAE paper recently; it's really wonderful work!
But I'm very confused about how the class token is handled when pre-training the MAE model.
The paper says: "As ViT has a class token [16], to adapt to this design, in our MAE pre-training we append an auxiliary dummy token to the encoder input. This token will be treated as the class token for training the classifier in linear probing and fine-tuning."
How can this dummy token work in linear probing, since it's not explicitly used in pre-training? Is there any other information I missed?
from moco-v3.
Hello, have you tried other architectures like EfficientNet?
from moco-v3.