
ft-clip's Issues

About FLOPs

Hello, in Table 11 of the paper you report the following FLOPs:

Model  B/16_224  B/16_384  L/16_384  L/14_224  L/14_336
FLOPs  17.5G     55.4G     190.7G    80.7G     190.6G

Using profile from the thop library, I measured the FLOPs of the first model (B/16_224) as only 11.3G, far below the 17.5G in the paper. I suspect our measurement methods differ, since profile does leave some modules uncounted by default.
Could you share the implementation details of how you measured FLOPs? Many thanks.
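A plausible source of the gap (an assumption, not the authors' confirmed setup): thop's profile only counts modules it has handlers for, and CLIP's transformer blocks use nn.MultiheadAttention, which thop skips by default, so the attention projections and matmuls go uncounted; for B/16 at 224px that is roughly the missing ~6G. A graph-tracing counter such as fvcore does count them. A minimal sketch comparing the two, assuming the OpenAI CLIP package:

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git
from thop import profile
from fvcore.nn import FlopCountAnalysis

# Load the ViT-B/16 image encoder on CPU (clip.load casts weights to fp32 on CPU).
model, _ = clip.load("ViT-B/16", device="cpu")
visual = model.visual.eval()
x = torch.randn(1, 3, 224, 224)

# thop counts only modules it has handlers for; CLIP's blocks use
# nn.MultiheadAttention, for which thop has no default rule, so the
# attention ops go uncounted (hence a total near 11G rather than 17.5G).
macs, params = profile(visual, inputs=(x,))
print(f"thop:   {macs / 1e9:.1f} G")

# fvcore traces the graph and counts the underlying linears/matmuls,
# landing close to the paper's reported 17.5G for B/16 at 224px.
flops = FlopCountAnalysis(visual, x)
print(f"fvcore: {flops.total() / 1e9:.1f} G")
```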

About inference code

@LightDXY
First of all, thank you for sharing the fine-tuning code. On my own dataset I fine-tuned only the image encoder and left the text encoder untouched, using ViT-B/16 as the pre-trained weights. After fine-tuning, however, the .pt file grew to about 5x its original size. Also, how should the .pt checkpoint produced by fine-tuning be used for inference? Looking forward to your guidance, thank you.
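A likely explanation for the size increase (an assumption about a common checkpoint layout, not this repo's verified format): training scripts usually save the optimizer state (AdamW keeps two extra tensors per parameter), the lr scheduler, and sometimes an EMA copy alongside the weights. A minimal sketch of stripping a checkpoint down to inference weights; the key names ("model", "checkpoint.pt") and build_model are hypothetical:

```python
import torch

# Hypothetical layout: many training scripts save a dict such as
# {"model": ..., "optimizer": ..., "lr_scheduler": ..., "epoch": ...}.
# AdamW's state alone stores two extra tensors per parameter, which
# easily multiplies the file size several times over.
ckpt = torch.load("checkpoint.pt", map_location="cpu")
state_dict = ckpt.get("model", ckpt)  # fall back if weights are stored flat

# Save a weights-only file for inference.
torch.save(state_dict, "weights_only.pt")

# At inference time, rebuild the model (build_model is a placeholder for
# however the repo constructs its ViT) and load the weights:
# model = build_model()
# model.load_state_dict(state_dict)
# model.eval()
# with torch.no_grad():
#     logits = model(images)
```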

About Layer decay

A question about the layer-wise lr decay setting in the code: the lr scale of the last transformer block is not 1. Is this intentional, or an oversight? Thanks for any pointers.
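For reference, in the BEiT/MAE-style layer-decay scheme that fine-tuning code like this commonly follows (a sketch of that general pattern, not a verified excerpt from this repo), parameters are bucketed by depth with the head one level above the last block, so only the head's scale is exactly 1:

```python
# Sketch of the common BEiT/MAE-style layer-wise lr decay (assumed pattern).
# With num_layers blocks: embeddings -> depth 0, block i -> depth i + 1,
# head -> depth num_layers + 1.
def lr_scale(depth: int, num_layers: int, decay: float = 0.75) -> float:
    # The head (depth num_layers + 1) gets decay**0 == 1.0;
    # the last block (depth num_layers) gets decay**1 -- hence "not 1".
    return decay ** (num_layers + 1 - depth)

num_layers = 12  # e.g. ViT-B
for name, depth in [("patch_embed", 0), ("blocks.0", 1),
                    ("blocks.11", num_layers), ("head", num_layers + 1)]:
    print(f"{name:12s} scale = {lr_scale(depth, num_layers):.4f}")
```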

Question about the paper

Hello, I have read your paper and found it very inspiring.
The abstract states: "Recent studies have shown that CLIP has achieved remarkable success in performing zero-shot inference while its fine-tuning performance is not satisfactory... These observations challenge the conventional conclusion that CLIP is not suitable for fine-tuning..."
That is, although CLIP works very well for zero-shot inference, especially classification, it has generally been considered unsuitable for fine-tuning. I have used CLIP myself, both the Chinese and English versions, to pair custom token words with images, and the results were indeed good; the features CLIP learns feel very robust. However, I have never fine-tuned CLIP on my own dataset. Intuitively, if the features are good, fine-tuning should make them even better: the activations the network uses are continuous nonlinear functions with no step-function-like discontinuities, so after fine-tuning the learned features should be even more expressive, and the class predictions after activation should not go wrong. This is only my intuition, with no evidence behind it, so the passage above puzzled me. I would greatly appreciate your explanation. Thank you!

About low accuracy on the validation set, overfitting on the train set

Hello! First of all, thank you for the fine-tuning code. However, I ran into some problems with the model's performance.

I ran the code and fine-tuned the CLIP_L14 model on Oxford Pets, Caltech101, and ImageNet with the same fine-tuning config as in the paper, except for the batch size (due to device limitations, I set it to 32). The model performs badly on the validation set, with accuracies around 1-5%, while on the train set accuracies are around 90%. It looks like a typical overfitting problem. I changed the learning rate, regularization config, number of epochs, and other related settings, but could not solve it.

So I wonder whether you have run into the same problem on similar datasets, or whether there are methods to solve it.
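One common culprit when shrinking the batch size from the paper's setting to 32 while keeping the rest of the config is the learning rate: under the linear scaling rule, an lr tuned for a large batch is far too high at batch size 32 and can wreck the pre-trained features. A minimal sketch of the adjustment; the base values below are illustrative assumptions, not the paper's exact config:

```python
# Linear lr scaling rule (Goyal et al., 2017): scale lr with batch size.
# Base values are illustrative assumptions, not the paper's exact config.
base_lr = 3e-5          # lr assumed to be tuned for the large-batch setting
base_batch_size = 1024  # assumed batch size behind the paper's config
batch_size = 32         # what fits on the available device

scaled_lr = base_lr * batch_size / base_batch_size
print(f"scaled lr: {scaled_lr:.2e}")

# Alternatively, keep the paper's lr but recover the effective batch size
# with gradient accumulation: call optimizer.step() once every `accum`
# micro-batches of size 32.
accum = base_batch_size // batch_size  # 32 micro-batches per update
```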

Pre-trained Weights

Thanks for sharing your nice work!

I don't have sufficient computational resources to train the models. May I know if the pre-trained (fine-tuned) weights will be released?
