
mobileformer's Introduction

MobileFormer

An implementation of MobileFormer proposed by Yinpeng Chen, Xiyang Dai et al.

Including

[1] Mobile-Former proposed in: 
                        Yinpeng Chen, Xiyang Dai et al., Mobile-Former: Bridging MobileNet and Transformer. 
                        arxiv.org/abs/2108.05895
[2] Dynamic ReLU proposed in: 
                        Yinpeng Chen, Xiyang Dai et al., Dynamic ReLU. 
                        arxiv.org/abs/2003.10027v2
[3] Lite-BottleNeck proposed in: 
                        Yunsheng Li, Yinpeng Chen et al., MicroNet: Improving Image Recognition with Extremely Low FLOPs. 
                        arxiv.org/abs/2108.05894v1
[4] Adam-W proposed in:
                        Ilya Loshchilov & Frank Hutter, Decoupled Weight Decay Regularization.
                        arxiv.org/abs/1711.05101v3
[5] Mixup proposed in:
                        Hongyi Zhang, Moustapha Cisse et al., Mixup: Beyond Empirical Risk Minimization.
                        arxiv.org/abs/1710.09412
[6] Multi-FocalLoss (not used), focal loss is proposed in:
                        Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár, Focal Loss for Dense Object Detection.
                        arxiv.org/abs/1708.02002

Note

(1) Due to the expanded DW conv used in strided Mobile-Former blocks, 
    the out_channel should be divisible by expand_size of the next block.
(2) Adam-W and Mixup are embedded in train.py.
(3) Use run() in train.py to train('run') or search('search'). There is an example in the train.py.

'###### The '#'s #######'

'##### are aligned #####'

No pre-trained parameters are available for now.

About Training:

Following DeiT, an optional set of learning rates and weight decays is provided for grid search (if you want one):
    LR from [5e-4, 3e-4, 5e-5] * batchsize / 256 (or 512)
    WD from [0.03, 0.04, 0.05]
Training is very long by CNN standards, but for a transformer it's OK (maybe).
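
The grid above can be enumerated with a few lines of Python. This is only a minimal sketch: the names build_grid and reference_batch are made up for illustration and are not part of train.py, and the reference batch size of 256 is simply the first of the two values given above (pass 512 for the other).

```python
from itertools import product

# Base values copied from the grid above.
BASE_LRS = [5e-4, 3e-4, 5e-5]
WEIGHT_DECAYS = [0.03, 0.04, 0.05]


def build_grid(batch_size, reference_batch=256):
    """Return every (lr, weight_decay) pair, with lr scaled linearly by batch size.

    build_grid and reference_batch are hypothetical names used only in this sketch.
    """
    grid = []
    for base_lr, wd in product(BASE_LRS, WEIGHT_DECAYS):
        lr = base_lr * batch_size / reference_batch
        grid.append({"lr": lr, "weight_decay": wd})
    return grid


grid = build_grid(batch_size=128)
for config in grid:
    print(config)
```

Each dictionary could then be passed to whatever training entry point you use; how train.py consumes these values is not shown here.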

mobileformer's People

Contributors

slwang9353


mobileformer's Issues

Code issue

Hi, thanks for your reproduction. I found a small bug in your code, shown in the attached picture. BatchNorm1d raises an error when the batch size of the input is 1. When testing the model with the batch size set to 1, it raises an error saying that BatchNorm1d requires an input batch size greater than 1. I hope you can fix this small bug. Thanks!
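
For context, here is a small pure-Python sketch of why a batch of one is a problem for batch normalization in training mode. It is not the repository's code and not PyTorch's implementation: batch_norm_train and batch_norm_eval are hypothetical helpers written for this illustration. With a single sample, the batch mean equals the sample itself and the batch variance is zero, so the normalized output collapses to zeros. As far as I know, PyTorch raises an error in this situation instead, and calling model.eval() before testing makes BatchNorm layers use their stored running statistics.

```python
import math


def batch_norm_train(batch, eps=1e-5):
    """Normalize each feature with the statistics of the current batch (training mode)."""
    n = len(batch)
    num_features = len(batch[0])
    out = [[0.0] * num_features for _ in range(n)]
    for j in range(num_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        # Biased variance over the batch dimension.
        var = sum((x - mean) ** 2 for x in column) / n
        for i in range(n):
            out[i][j] = (batch[i][j] - mean) / math.sqrt(var + eps)
    return out


def batch_norm_eval(batch, running_mean, running_var, eps=1e-5):
    """Normalize each feature with stored running statistics (evaluation mode)."""
    return [
        [(x - m) / math.sqrt(v + eps) for x, m, v in zip(row, running_mean, running_var)]
        for row in batch
    ]


single = [[0.3, -1.2, 2.5]]  # a "batch" containing one sample with three features

# Training-mode statistics: mean == sample, variance == 0, so all information is lost.
train_out = batch_norm_train(single)

# Evaluation-mode statistics (placeholder values here) keep the sample distinguishable.
eval_out = batch_norm_eval(single, running_mean=[0.0, 0.0, 0.0], running_var=[1.0, 1.0, 1.0])

print(train_out)
print(eval_out)
```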

ReLU6

Hello, why do you use ReLU6 instead of ReLU in the code?

Official Release?

Thanks for the great share.
I wonder if this repo is the official release of the original paper?

Thanks!

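A question about model training

Hello, thank you very much for sharing the code with us. When we trained the model on the Food-101 dataset, we found that every output predicts class 64, the output tensors are all identical, and the loss does not change, so the model is not being optimized. We only plugged in the dataset and did not modify the network. We would like to know what went wrong.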

Model will not train

When the code is run in Google Colab, the validation accuracy does not improve beyond about 0.11 on CIFAR-10 when running the search, and it is often even below 0.1.

FPS fluctuates

Each line of data below is the result of averaging over 100 model inference runs, but the result still fluctuates. Which value should I take?

# {'fps': 63.5, 'time_mean': 15.7, 'time_std': 0.5}
# {'fps': 62.8, 'time_mean': 15.9, 'time_std': 0.1}
# {'fps': 64.9, 'time_mean': 15.4, 'time_std': 0.2}
# {'fps': 63.6, 'time_mean': 15.7, 'time_std': 0.1}
# {'fps': 64.5, 'time_mean': 15.5, 'time_std': 0.1}
# {'fps': 64.1, 'time_mean': 15.6, 'time_std': 0.2}
# {'fps': 61.2, 'time_mean': 16.3, 'time_std': 0.1}
# {'fps': 62.2, 'time_mean': 16.1, 'time_std': 0.1}
# {'fps': 63.7, 'time_mean': 15.7, 'time_std': 0.4}
# {'fps': 65.0, 'time_mean': 15.4, 'time_std': 0.1}
# {'fps': 63.9, 'time_mean': 15.7, 'time_std': 0.1}
# {'fps': 63.9, 'time_mean': 15.7, 'time_std': 0.1}
# {'fps': 61.1, 'time_mean': 16.4, 'time_std': 0.5}
# {'fps': 64.2, 'time_mean': 15.6, 'time_std': 0.7}
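
One possible way to report a single number is to aggregate across the repeated measurements and quote the spread alongside it. The sketch below does this for the 14 lines above using only the standard library. It assumes time_mean is in milliseconds (which is roughly consistent with the reported fps values); the summary keys are made up for this illustration.

```python
import statistics

# The 14 measurements listed above.
runs = [
    {"fps": 63.5, "time_mean": 15.7, "time_std": 0.5},
    {"fps": 62.8, "time_mean": 15.9, "time_std": 0.1},
    {"fps": 64.9, "time_mean": 15.4, "time_std": 0.2},
    {"fps": 63.6, "time_mean": 15.7, "time_std": 0.1},
    {"fps": 64.5, "time_mean": 15.5, "time_std": 0.1},
    {"fps": 64.1, "time_mean": 15.6, "time_std": 0.2},
    {"fps": 61.2, "time_mean": 16.3, "time_std": 0.1},
    {"fps": 62.2, "time_mean": 16.1, "time_std": 0.1},
    {"fps": 63.7, "time_mean": 15.7, "time_std": 0.4},
    {"fps": 65.0, "time_mean": 15.4, "time_std": 0.1},
    {"fps": 63.9, "time_mean": 15.7, "time_std": 0.1},
    {"fps": 63.9, "time_mean": 15.7, "time_std": 0.1},
    {"fps": 61.1, "time_mean": 16.4, "time_std": 0.5},
    {"fps": 64.2, "time_mean": 15.6, "time_std": 0.7},
]

fps_values = [r["fps"] for r in runs]
time_values = [r["time_mean"] for r in runs]

summary = {
    "fps_mean": statistics.mean(fps_values),
    "fps_median": statistics.median(fps_values),
    "fps_stdev": statistics.stdev(fps_values),  # sample standard deviation across runs
    "fps_min": min(fps_values),
    "fps_max": max(fps_values),
    # Assumes time_mean is in milliseconds.
    "fps_from_mean_time": 1000.0 / statistics.mean(time_values),
}

for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

The median is less sensitive to the occasional slow run than the mean, so quoting the median together with the min/max range is a reasonable choice when the machine is shared or thermally throttled.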
