hunto / image_classification_sota

Training ImageNet / CIFAR models with sota strategies and fancy techniques such as ViT, KD, Rep, etc.

License: Apache License 2.0

Python 99.63% Shell 0.37%

pytorch imagenet cifar nas kd pruning rep vit transformer image-classification

image_classification_sota's Issues

How to write the transformer decoder?

I want to use LightViT to build a network structure like this:

Can you tell me how to write the decoder, please?

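Training results

Hello, I would like to ask about the provided ResNet and MobileNetV2 architectures: have you trained and tested them to reach a high accuracy, or do the parameters need further tuning by the user? I trained MobileNetV2 directly on my own classification task, and the results were not very good.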

split global token and image token

image_classification_sota/lib/models/lightvit.py

Line 245 in 36539b6

x_glb = x[:, :, :NT]

Hello,
When you split the global tokens and image tokens from the input x, shouldn't x be split into [B, :NT, C] and [B, NT:, C]?
But in the forward_feature function, x_glb is sliced along the channel dimension.

So, assuming x has the shape [1, 3144, 64], the global tokens should have shape [1, 8, 64] and the image tokens should have shape [1, 3136, 64].
Please let me know if I am wrong.
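
The difference between the two slices can be checked on a dummy tensor. This is a minimal sketch, assuming x is a 3D tensor of shape [B, N, C] with the NT global tokens placed first; the sizes below are taken from the example in this issue (8 global tokens, 3136 image tokens, 64 channels), and the variable names are illustrative. Whether the tensor at the referenced line really is 3D is not confirmed here: if it carried an extra leading dimension (for example [B, heads, N, C]), then `x[:, :, :NT]` would slice the token axis after all.

```python
import torch

# Sizes from the issue's example; the layout (global tokens first) is an assumption.
B, NT, N_IMG, C = 1, 8, 3136, 64
x = torch.randn(B, NT + N_IMG, C)  # [B, NT + N_IMG, C]

# Split along the token axis (dim 1): first NT tokens are global, the rest are image tokens.
x_glb = x[:, :NT]   # [B, NT, C]
x_img = x[:, NT:]   # [B, N_IMG, C]

# Slicing dim 2 on the same 3D tensor keeps every token but only NT channels.
x_channel_slice = x[:, :, :NT]  # [B, NT + N_IMG, NT]

print(tuple(x_glb.shape), tuple(x_img.shape), tuple(x_channel_slice.shape))
```

Printing `x.shape` just before the line in question is the quickest way to tell which of the two cases applies.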

Missing edgenn model

Hi, the classification code is missing the edgenn model. Could you upload it?

Hello Author, I found that the edgenn package used in nas_model.py and builder.py is missing. Can you provide a solution?

about the dimension

Hi, thanks for open-sourcing the code. I read the paper and found that you use the logits and the features before pooling to perform diffusion. For the logits, though, I guess the dimension is [B, C], where B is the batch size and C is the number of classes. This will cause a dimension mismatch in the autoencoder. How do you solve it? Thanks for your reply.

Replace the DBB block

Can I replace the DBB module and model with my own reparameterization module and model?

(dataset problem) ImageNet has no "meta/train.txt"

Hello author, the ImageNet dataset I downloaded does not have a meta/train.txt file. Can you provide this file?

[Question] Could I use DIST in RetinaFace?

Could I use DIST in RetinaFace?

RetinaFace has only 2 classes (face, not face), so Pearson's correlation coefficient seems to be ineffective.

In summary, if the number of classes is small, DIST is ineffective, and it looks even more so in the binary case.
I wonder if the above opinion is correct.
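
The binary case can be checked numerically. This is a minimal sketch, assuming the correlation is taken across the class dimension of each sample's predicted probabilities (an assumption about how the relation is computed, not taken from the repository code). For any two 2-element vectors, Pearson's correlation is either +1 or -1 (or undefined when one vector is constant), so it only records whether student and teacher rank the two classes the same way. The function name and the probability values are illustrative.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    # Undefined (division by zero) if either vector is constant, e.g. [0.5, 0.5].
    return cov / math.sqrt(var_a * var_b)


# Hypothetical two-class probabilities (face, not face) for a single sample.
teacher = [0.9, 0.1]
student_close = [0.85, 0.15]    # nearly matches the teacher
student_far = [0.51, 0.49]      # far from the teacher, same ranking
student_flipped = [0.4, 0.6]    # opposite ranking

r_close = pearson(teacher, student_close)
r_far = pearson(teacher, student_far)
r_flipped = pearson(teacher, student_flipped)

# Up to floating-point rounding, the first two are +1 and the last is -1.
print(r_close, r_far, r_flipped)
```

Under this assumption, the per-sample correlation cannot tell a close student from a distant one as long as the ranking agrees. A correlation computed for each class across the samples of a batch has more than two points to work with and is not degenerate in this way.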

DDIM loss

I tried it and found that my DDIM loss does decrease, but only very slowly, and in the later stages it barely decreases at all. Is this normal?