Comments (5)
We basically follow A2-MIM's 300-epoch finetuning setting (i.e., the ResNet Strikes Back/RSB A2 recipe), and use 200/400 epochs for smaller/larger models respectively. We exclude the 100-epoch RSB A3 setting since it uses a different resolution (160), but if it is of interest, we can give it a try.
btw, ConvNeXt V2 uses 400 or 600 epochs for its smaller models.
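For concreteness, here is a rough sketch of the RSB A2 recipe written out as a plain hyperparameter dict. The values come from the "ResNet Strikes Back" paper; the dict itself and the 200/400-epoch variants are only illustrative, not SparK's actual config files:

```python
# Illustrative only: the RSB A2 finetuning recipe as a hyperparameter dict
# (values from "ResNet Strikes Back", not copied from SparK's code).
rsb_a2 = dict(
    epochs=300,
    optimizer="lamb",
    base_lr=5e-3,
    weight_decay=0.02,
    batch_size=2048,
    lr_schedule="cosine",
    warmup_epochs=5,
    loss="bce",                      # BCE instead of CE + label smoothing
    mixup=0.1,
    cutmix=1.0,
    rand_augment="rand-m7-mstd0.5",
    repeated_aug=True,
    resolution=224,                  # RSB A3 trains at 160, hence its exclusion above
)

# The 200/400-epoch variants mentioned above for smaller/larger models:
ours_small = {**rsb_a2, "epochs": 200}
ours_large = {**rsb_a2, "epochs": 400}
```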
Thanks for your quick response.
> We exclude the 100-epoch RSB A3 setting since it uses a different resolution (160), but if it is of interest, we can give it a try.
I think the 100-epoch setting is important; otherwise, it is difficult for follow-up works to compare fairly with existing works (SparK, ConvNeXt V2, etc.) because of inconsistent evaluation protocols.
Besides, I think it is more reasonable to finetune pre-trained models for no more than 300 epochs, which is the budget used by the supervised baseline. Otherwise, it is hard to say whether the performance gain comes from longer finetuning or from the better initialization provided by MIM.
I see. But I would suggest not focusing too much on ImageNet finetuning. I feel the best way to justify whether MIM makes sense is to evaluate it on REAL downstream tasks (i.e., not on ImageNet), because doing pretraining and finetuning on the same dataset can be kind of like "data leakage" and doesn't match our eventual goals of self-supervised learning.
On real downstream tasks (COCO object detection & instance segmentation), SparK can outperform Swin+MIM, Swin+Supervised, Conv+Supervised, and Conv+Contrastive Learning, so these results are solid evidence of SparK's effectiveness.
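For readers who want to try a quick downstream transfer themselves, a minimal sketch using torchvision's Mask R-CNN is below. The checkpoint filename and key layout are assumptions, and this is not SparK's official COCO pipeline, just one simple way to reuse a pretrained backbone:

```python
# A rough sketch (not SparK's official COCO pipeline): load a SparK-pretrained
# ResNet-50 backbone into torchvision's Mask R-CNN and finetune on COCO.
# The checkpoint filename and key layout below are assumptions.
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(
    weights=None,           # no COCO-supervised head weights
    weights_backbone=None,  # we supply the backbone ourselves
)

ckpt = torch.load("resnet50_spark_pretrained.pth", map_location="cpu")
state = ckpt.get("module", ckpt)  # pretraining checkpoints often wrap the weights

# torchvision's FPN backbone keeps the plain ResNet under backbone.body, so
# non-strict loading leaves the FPN and heads randomly initialized.
missing, unexpected = model.backbone.body.load_state_dict(state, strict=False)
print(f"missing: {len(missing)}, unexpected: {len(unexpected)}")
# ...then train on COCO with a standard detection loop (e.g., the
# torchvision references/detection scripts).
```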
> I see. But I would suggest not focusing too much on ImageNet finetuning. I feel the best way to justify whether MIM makes sense is to evaluate it on REAL downstream tasks (i.e., not on ImageNet), because doing pretraining and finetuning on the same dataset can be kind of like "data leakage" and doesn't match our eventual goals of self-supervised learning.
This makes sense to me. Thanks for the explanation.
> We basically follow A2-MIM's 300-epoch finetuning setting (i.e., the ResNet Strikes Back/RSB A2 recipe), and use 200/400 epochs for smaller/larger models respectively. We exclude the 100-epoch RSB A3 setting since it uses a different resolution (160), but if it is of interest, we can give it a try.
> btw, ConvNeXt V2 uses 400 or 600 epochs for its smaller models.
But for the B and H/L models, they use the same 50- and 100-epoch fine-tuning schedules (ConvNeXt V2 paper, Appendix A.1, Table 11). It would be nice to compare apples to apples in terms of fine-tuning epochs. What are the results after 50 epochs of SparK fine-tuning for ConvNeXt-B, and 100 epochs for ConvNeXt-H?