Pruning and distillation for mobilev2-yolov5s, with ncnn and TensorRT deployment support. Ultra-light but better performance!

License: MIT License

Dockerfile 0.05% Shell 0.03% Python 7.13% Jupyter Notebook 92.78%

mobile-yolov5s distillation pruning ncnn yolov5


mobile-yolov5-pruning-distillation's Issues

run it

I cannot run it; it gives me this message:

Traceback (most recent call last):
File "train.py", line 762, in <module>
opt.cfg = check_file(opt.cfg) # check file
File "/home/demon/CLionProjects/mobile-yolov5-pruning-distillation/utils/utils.py", line 101, in check_file
if os.path.isfile(file):
File "/home/demon/anaconda3/envs/pytorch_gpu/lib/python3.8/genericpath.py", line 30, in isfile
st = os.stat(path)

L1 sparsity training details

def compute_pruning_loss(p, prunable_modules, model, loss):
    ft = torch.cuda.FloatTensor if p[0].is_cuda else torch.Tensor
    ll1 = ft([0])
    h = model.hyp  # hyperparameters
    red = 'mean'  # Loss reduction (sum or mean)
    if prunable_modules is not None:
        for m in prunable_modules:
            ll1 += m.weight.norm(1)  # L1 norm of the BN layer's gamma values
        ll1 /= len(prunable_modules)
    ll1 *= h['sl']
    bs = p[0].shape[0]  # batch size
    loss += ll1 * bs
    return loss

Why is the mean norm in this function multiplied by batch_size?
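To make the question concrete, the scaling can be reproduced with plain numbers. This is only a sketch of the arithmetic in compute_pruning_loss, not the repository's code: `pruning_penalty` is a hypothetical helper, and the gamma values and the `sl` coefficient are made up.

```python
def pruning_penalty(bn_gammas, sl, batch_size):
    """Mean L1 norm of the BN gammas, scaled by sl and by the batch size."""
    # L1 norm of each layer's gamma vector
    l1_per_layer = [sum(abs(g) for g in gammas) for gammas in bn_gammas]
    # average over the prunable layers, as in the quoted function
    mean_l1 = sum(l1_per_layer) / len(l1_per_layer)
    return mean_l1 * sl * batch_size


# Two made-up BN layers (illustrative values only)
gammas = [[0.5, -1.5], [2.0, 0.0, -1.0]]
small = pruning_penalty(gammas, sl=0.01, batch_size=8)
large = pruning_penalty(gammas, sl=0.01, batch_size=32)
print(small, large)  # the penalty grows linearly with the batch size
```

One plausible reason for the factor, stated here as an assumption: upstream YOLOv5 of that period returned its detection loss multiplied by the batch size, so if this repository does the same, multiplying the penalty by `bs` keeps the two terms in a fixed ratio regardless of batch size.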

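Can the pruned model also be converted through onnx and deployed with ncnn?

Pruning code

Hello, I cannot quite follow the pruning code. Is there a blog post that explains it?

Why does yolov5s in ultralytics / yolov5 have 7.5M params and 17.5 GFLOPS, while yours has 7.07M and a bit over 8?

As stated above.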

About pretrained weight

Hi
Could you share your pretrained weight?
(mobilenet_v2-b0353104.pth)

When I try to download the weights, it turns out that this folder is in your Google Drive trash.
https://drive.google.com/drive/folders/1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J

Thank you.

1

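After distilling the pruned model, I found that while accuracy went up, the parameter count also went back to its pre-pruning value.

mAP becomes 0

Why did all mAP values drop to 0 after the second epoch when I trained mobilev2-yolo5s on my own data?

After converting to onnx, how do I deploy to a phone?

Is there a corresponding front-end demo? The link you provided returns 404.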

About training

What VOC data format does your training use? After downloading VOC, does it need to be converted to xywh format?
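For reference, the conversion the question refers to can be sketched as follows. It assumes YOLO-style labels (box center, width, and height, each normalized by the image size) and VOC-style corner boxes in pixels; `voc_to_yolo` is a hypothetical helper, and whether this repository expects exactly this layout is not confirmed by the thread.

```python
def voc_to_yolo(box, img_w, img_h):
    """Convert a VOC corner box (xmin, ymin, xmax, ymax) in pixels
    to a normalized (x_center, y_center, width, height) tuple."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Illustrative box inside a 500x500 image
xywh = voc_to_yolo((100, 200, 300, 400), img_w=500, img_h=500)
print(xywh)
```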

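About the pruning and distillation strategy

Could you provide some reference material on the pruning and distillation strategies used here? Many thanks. Also, would changing the dataset affect the pruning and distillation strategy and its results?

With mobilenet as the backbone, convergence becomes very slow when using the mobilenet weight file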

prune and distillation

Hello, excuse me again. I ran your code for pruning and distilling mobilenet. One reduces the parameters and the other improves the accuracy. Why not use the pruned model as the student model for distillation?

RuntimeError: shape '[1, 3, 85, 80, 64]' is invalid for input of size 655360

Hello, running detect.py fails at x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
RuntimeError: shape '[1, 3, 85, 80, 64]' is invalid for input of size 655360
Running train.py for training also fails. Does this mean it cannot be used for training or testing?
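The numbers in the error message can be checked by hand. The sketch below only redoes the arithmetic. Reading the result as "the tensor reaching view() has 128 channels instead of the 255 the head expects" is an interpretation, and the root cause (for example a mismatch between weights and config) is not established in the thread.

```python
from math import prod

target_shape = (1, 3, 85, 80, 64)  # bs, na, no, ny, nx from the error message
actual_numel = 655360              # input size reported by the error

bs, na, no, ny, nx = target_shape
expected_numel = prod(target_shape)               # elements view() needs
expected_channels = na * no                       # channels the head expects
implied_channels = actual_numel // (bs * ny * nx) # channels actually present

print(expected_numel, expected_channels, implied_channels)
```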

RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.cuda.FloatTensor) should be the same

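Hello, I tried to train with yolov5s as the model, using the command
python train.py --type vocs
but after one epoch of training, evaluating on the validation set raises this error: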

Traceback (most recent call last):
File "train.py", line 802, in <module>
train(hyp)
File "train.py", line 453, in train
results, maps, times = test.test(opt.data,
File "/home/cwy/mobile-yolov5-pruning-distillation/test.py", line 112, in test
inf_out, train_out = model(img, augment=augment)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 621, in forward
outputs = self.parallel_apply(self._module_copies[:len(inputs)], inputs, kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 646, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 1 on device 1.
Original Traceback (most recent call last):
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/mobile-yolov5-pruning-distillation/models/yolo.py", line 94, in forward
return self.forward_once(x, profile) # single-scale inference, train
File "/home/cwy/mobile-yolov5-pruning-distillation/models/yolo.py", line 111, in forward_once
x = m(x) # run
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/mobile-yolov5-pruning-distillation/models/common.py", line 86, in forward
return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/mobile-yolov5-pruning-distillation/models/common.py", line 21, in forward
return self.act(self.bn(self.conv(x)))
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
return self._conv_forward(input, self.weight)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.cuda.FloatTensor) should be the same

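I have tried many approaches but still get the same error. How can I solve it?

With mobilenet as the backbone, convergence becomes very slow when using the mobilenet weight file

I used mobilenet as the backbone in the latest yolov5 project files, and with the mobilenet weight file convergence became very slow, yet the author's code converges quickly when I run it directly. May I ask what the likely reason is?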

NotImplementedError: Do not know how to handle these types to promote: {'DoubleTensor', 'HalfTensor'}

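When training the mvocs model, the error above is raised right after the first epoch completes. Does anyone know how to fix it?

Is the detect layer in mobile-yolo5s.yaml empty?

Strategy

If I follow your steps one by one, will the model be lighter than before, with a smaller memory footprint? What pruning ratio is typically used, and which distillation strategy is usually chosen? I have roughly 30,000 to 40,000 images in total.

Is there any principle for choosing the sparsity parameter?

Distillation

Distilling with the v2 weights of v5 raises AttributeError: 'Detect' object has no attribute 'm'. Also, mAP is only a little over 0.4 after pruning; what is going on? I followed the quick start.

Can I ask some questions privately?

I sincerely would like to ask you some questions. Would it be convenient to get in touch? [email protected]

Pruning segmentation fault

I first trained with python train.py --type mvocs, then ran python3 train.py --type smvocs with the previous step's best_mvocs.pt as the input model, and then ran python3 pruning.py -t 0.1 with the previous step's best_smvocs.pt as the input model. Running python3 pruning.py -t 0.1 fails with a segmentation fault.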

all processes

Is your workflow as follows:
sparse learning --> prune --> finetune --> distill?

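About the teacher model's pretrained weights

I noticed that you did not provide weights for yolov5s trained on VOC, so the experiments in the README cannot be reproduced. Could you upload them?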

How to finetune the pruned model?

When I train it with 'train.py', the output model is not pruned; it is as large as the original model.
Do I need a *.yaml for the pruned model, and how do I get it?
Thanks.

This can be extended to 5x, right?

pruning is useless

prunable_modules = []
prunable_module_type = (nn.BatchNorm2d, )
for i, m in enumerate(model.modules()):
    if i in ignore_idx:
        continue
    if isinstance(m, prunable_module_type):
        prunable_modules.append(m)

As shown above, the model structure is not changed!
In addition, the BN of the conv module is not appended to prunable_modules.
Where does the pruning happen?

The pruned yolov5s model is bigger than the original.

There is an error in your pruning code.
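For context, the snippet quoted in this issue only collects BatchNorm layers; in BN-based pruning the channel selection is a separate step. The sketch below shows that step in the style of the classic network-slimming recipe, using plain lists instead of tensors. It is an illustration under stated assumptions, not this repository's implementation: `bn_channel_masks`, the global-threshold rule, and the gamma values are all hypothetical.

```python
def bn_channel_masks(bn_gammas, prune_ratio):
    """Return one keep-mask per BN layer using a global |gamma| threshold."""
    all_abs = sorted(abs(g) for layer in bn_gammas for g in layer)
    n_prune = int(len(all_abs) * prune_ratio)
    if n_prune == 0:
        return [[True] * len(layer) for layer in bn_gammas]
    # channels whose |gamma| is at or below this value are dropped
    threshold = all_abs[n_prune - 1]
    masks = []
    for layer in bn_gammas:
        mask = [abs(g) > threshold for g in layer]
        if not any(mask):
            # never remove a whole layer: keep its strongest channel
            best = max(range(len(layer)), key=lambda i: abs(layer[i]))
            mask[best] = True
        masks.append(mask)
    return masks


gammas = [[0.9, 0.01, 0.5], [0.02, 0.7]]  # made-up gamma values
masks = bn_channel_masks(gammas, prune_ratio=0.4)
print(masks)
```

Masks alone do not shrink anything. The saved model only gets smaller once the conv and BN layers are rebuilt with just the kept channels, which is the step this issue is asking about.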

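Which layers does ignore_idx in pruning refer to?

Hello, and thank you for open-sourcing your work. In the pruning part you set ignore_idx=[230, 260, 290]. Which layers do [230, 260, 290] refer to? Alternatively, do you have weights you could share?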

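Summary of common questions

Q: Why does the training mAP sometimes become 0?
A: It depends on the case. In most cases I have seen, loss -> 0 and val mAP -> 0, which is clearly overfitting. You can refer to the CSDN article I wrote earlier.
Q: torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'
A: This is a pitfall from the original repo. What you might expect model loading to be: read the weights directly, restore the model, and build the gradient relationships. What model loading actually is: build a randomly initialized model from the yaml -> obtain the gradient relationships -> load the saved weights of each layer. The former is used for pure forward inference, the latter when gradient relationships are needed, for example in pruning. Also keep the original project structure, and note export PATH=$PWD.
Q: What if the yolo5s pretrained weights are gone?
A: I no longer have them either. However, this repo mainly uses the lighter mobilev2-yolo5s, so you can try that here. If you need yolov5, use their latest version; adapting the code by following this repo is fairly easy. There is a network-drive link in the README that you can try.
Q: Low recall or low precision
A: These depend on the threshold, so here I only track mAP as the intermediate metric. If mAP drops during pruning or distillation, that is also normal; it depends on your dataset and your strategy. After all, every paper claims to be SOTA, so you need to judge for yourself.
Q: Can the pruned model be saved as a yaml file? Can it be distilled or converted to ncnn?
A: Yes, but there is no need for a yaml file. At that point just load the model directly; see the ft (finetune) mode. If it really matters to you, write a reverse script; it is not hard and is good practice.
Q: Details of model training, references for pruning and distillation, etc.
A: Most of it is written in the README; for the details, just read the code. It is not hard, so treat it as a learning exercise. To add: pruning follows Song Han's BN-layer-based pruning, and distillation uses methods designed for one-stage detectors, two-stage detectors, classification networks, and so on. The details certainly differ from the original papers, but not by much; judge for yourself. Also, do not ask me why I did not use such-and-such method. I cannot run every experiment for you; you can try it yourself. The methods I chose are all representative classics, though not necessarily the best.
Q: There are some bugs when pruning and distilling yolov5 in your repo
A: First, I have not specifically tested yolov5; its authors are still updating it, so there are bound to be bugs. Second, this project mainly aims to produce a lighter model, but I can guarantee that following my steps one by one will reproduce the results; I even gave you the random seed >_<.
Q: mobilev2-yolo5s converges slowly, but yolo5s converges quickly?
A: Theirs was pretrained on COCO and has twice my parameter count and compute, so its capacity is large. Mine only uses ImageNet weights for the backbone, with the head entirely randomly initialized, so slow convergence is normal, and lower accuracy from the smaller capacity is normal too. Lightweighting is approached from four angles: overall architecture design, pruning, distillation, and quantization. If you prefer to train yolov5x first and then make it lightweight, I have no objection; that is also a valid approach.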

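About finetune

A beginner question: if the pruned model is finetuned for enough epochs (say 100, the same as normal training), it reaches a very good mAP. Is there any problem with doing this? Is there a limit on the number of finetuning epochs, for example only 10 or 20?

Some questions about distillation

Thanks to the author for this repo; I have learned a lot. I ran into some questions while looking at the distillation experiments and would like to ask the author about them.

Strategy1 Output Based Distillation: in the repo, the distillation losses for the Teacher part are all computed with MSELoss, whereas the paper still uses the same loss as the original YOLO. Could this limit the tuning of the Student network?
Strategy2 Feature Based + Output Based Distillation: in the repo, the two seem to be trained together under parameter control, whereas the paper first freezes the parameters after the Feature Map for Feature Distillation and then unfreezes them for global Distillation. Could this limit the effect of Strategy2 and lead to the lower final mAP? Also, the Convert in the paper should consist of a convolution only, without the added BN and ReLU operations.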
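As a reference point for Strategy1, an output-based distillation term built on MSE can be written in a few lines. This is a minimal sketch with plain Python lists, assuming the total loss is the ordinary detection loss plus a weighted MSE between student and teacher outputs. The helper names, the weight, and the numbers are illustrative and not taken from the repository or the paper.

```python
def mse(student, teacher):
    """Mean squared error between two equally sized output vectors."""
    if len(student) != len(teacher):
        raise ValueError("student and teacher outputs must have the same size")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def distillation_total(hard_loss, student_out, teacher_out, weight):
    """Detection loss on the labels plus a weighted imitation term."""
    return hard_loss + weight * mse(student_out, teacher_out)


# Illustrative values: hard loss 1.0, two output elements, weight 0.5
total = distillation_total(1.0, [0.0, 2.0], [1.0, 0.0], weight=0.5)
print(total)
```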

Why does the baseline model not use yolov5s-5x from ultralytics / yolov5?

Problem as above.
Your yolo.py code is inconsistent with ultralytics / yolov5.
It raises an error saying that Detect has no 'export'.

When using the pretrained yolov5s for base training on COCO, the validation mAP is very close to 0. Where is the problem likely to be?

mobile-yolo5s.yaml raises an error when nc equals 3

RuntimeError: The size of tensor a (3) must match the size of tensor b (5) at non-singleton dimension 0

After pruning, how can I get the new yaml?

First, thank you very much for your work.
I just used pruning.py, and in the end it only saves a whole new model. How can I get the new yaml? Could you show me?

、

After sparsity training, running pruning raises FileNotFoundError: [Errno 2] No such file or directory: 'render_img/before_pruning.jpg'. Where is this file generated? Thanks.
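One common cause of this kind of FileNotFoundError is that a script tries to write an image into a directory that does not exist yet. The thread does not confirm that this is what happens in pruning.py, so the sketch below is only a guess at a workaround: `ensure_parent_dir` is a hypothetical helper that creates the output directory before anything is saved into it.

```python
import os
import tempfile


def ensure_parent_dir(path):
    """Create the directory that will contain `path`, if it is missing."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return path


# Demonstration in a throwaway directory; a real run would use the project root.
with tempfile.TemporaryDirectory() as root:
    target = ensure_parent_dir(
        os.path.join(root, "render_img", "before_pruning.jpg")
    )
    with open(target, "wb") as f:
        f.write(b"placeholder")  # stands in for the rendered image
    created = os.path.isfile(target)

print(created)
```

If this guess applies, running `mkdir render_img` in the project root before pruning would have the same effect.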

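Statement!

After thinking it over, I am making the following statement:
This repo is for learning only!!! No freeloaders.
The purpose of this repo is to give you a general understanding of some basic engineering workflows. Swapping the backbone, pruning, distillation, and quantization are all routine operations, as is porting to C++ and Android with existing frameworks.
If you want to use it commercially, or for a course project or a graduation project, that is perfectly fine; just comply with the open-source licenses of the original repo. Time was short, so there are bound to be oversights in the code details, but reproducing the whole pipeline by following the README will work. Many potential problems and details are documented in the README and the code comments, and I have answered some implementation questions in the issues.
Good ideas are welcome, as is discussing conceptual problems you run into, but no freeloaders.
Everyone's environment and workflow are different, so I cannot debug for you simply from a copy-pasted (ctrl c+v) error message.
If you have a commercial collaboration in mind, feel free to contact me privately.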

Ran into a problem

RuntimeError: Given groups=1, weight of size [1, 255, 1, 1], expected input[1, 512, 8, 8] to have 255 channels, but got 512 channels instead
I ran that mob-yolov5s.yaml and got this error!

"render_img/before_pruning.jpg"

Thank you for sharing the code. When I run the pruning operation, an image is missing. Can you provide me with this picture?

ImportError: cannot import name 'gsutil_getsize' from 'utils.google_utils' ?

'Got {}, but numpy array, torch tensor, or caffe2 blob name are expected.'.format(type(x))) NotImplementedError: Got <class 'NoneType'>, but numpy array, torch tensor, or caffe2 blob name are expected.

Running train.py to train my own model, as with YOLOv5, raises an error: 'Got {}, but numpy array, torch tensor, or caffe2 blob name are expected.'.format(type(x)))
NotImplementedError: Got <class 'NoneType'>, but numpy array, torch tensor, or caffe2 blob name are expected.

Thanks for your fantastic work! But I encountered a problem: onnx export error...

env :

ubuntu1804
torch: 1.6.0
torchvision: 0.7.0
Training: python3 train.py --type vocs python3 train.py --type dmvocs
export: export PYTHONPATH="$PWD" && python models/onnx_export.py --weights outputs/dmvocs/weights/best_dmvocs.pt --img 640 --batch 1

Error:

Namespace(batch_size=1, img_size=[640], weights='outputs/dmvocs/weights/best_dmvocs.pt')
Fusing layers...
Model Summary: 148 layers, 3.59629e+06 parameters, 3.31818e+06 gradients
Traceback (most recent call last):
  File "models/onnx_export.py", line 35, in <module>
    _ = model(img)  # dry run
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/byronnar/mobile-yolov5-pruning-distillation/models/yolo.py", line 94, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/data/byronnar/mobile-yolov5-pruning-distillation/models/yolo.py", line 111, in forward_once
    x = m(x)  # run
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/byronnar/mobile-yolov5-pruning-distillation/models/common.py", line 24, in fuseforward
    return self.act(self.conv(x))
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 419, in forward
    return self._conv_forward(input, self.weight)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 416, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 3, 3, 3], but got 3-dimensional input of size [1, 3, 640] instead

How can I solve this problem? Looking forward to your reply! Thank you!
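The log shows img_size=[640] and a 3-dimensional dummy input of size [1, 3, 640], which suggests the script builds its input from a one-element size list. A possible fix, sketched under that assumption, is to expand a single value to (height, width) before creating the input. `parse_img_size` is a hypothetical helper and the argument names are illustrative; the real arguments in onnx_export.py may differ.

```python
import argparse


def parse_img_size(argv):
    """Parse --img-size and expand a single value to [height, width]."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--img-size", "--img", nargs="+", type=int,
                        default=[640, 640], help="image size as h w")
    opt = parser.parse_args(argv)
    if len(opt.img_size) == 1:
        opt.img_size = opt.img_size * 2  # [640] -> [640, 640]
    return opt.img_size


img_size = parse_img_size(["--img", "640"])
dummy_shape = (1, 3, *img_size)  # batch, channels, height, width
print(dummy_shape)
```

If the export script already accepts several values for its size argument, passing `--img 640 640` on the command line might be enough, though that is not verified here.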

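About downloading the weight files

When I run the first two training commands, the following error occurs, and the remaining training commands fail similarly. How can I solve this? qaq

voc data question

voc_train.txt
voc_test.txt
Where can I get them?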

torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'

Using the pruned model with test.py raises an error. The error message is:
Traceback (most recent call last):
File "/home/demon/PycharmProjects/mobile-yolov5-pruning-distillation/test.py", line 357, in <module>
test(opt.data,
File "/home/demon/PycharmProjects/mobile-yolov5-pruning-distillation/test.py", line 89, in test
names = {k: v for k, v in enumerate(model.names if hasattr(model, 'names') else model.module.names)}
File "/home/demon/anaconda3/envs/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 771, in __getattr__
raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'
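The failing line assumes that a model without `names` must be a wrapper exposing `.module`; the traceback shows the pruned model has neither attribute. A defensive lookup could look like the sketch below. `resolve_names` is a hypothetical helper, and falling back to numeric class names is an assumption about what is acceptable for evaluation.

```python
def resolve_names(model, num_classes):
    """Find class names on the model or its .module wrapper target,
    falling back to numeric names when neither carries them."""
    for obj in (model, getattr(model, "module", None)):
        names = getattr(obj, "names", None)
        if names is not None:
            return dict(enumerate(names))
    return {i: str(i) for i in range(num_classes)}


class Plain:            # stands in for a model that carries names
    names = ["cat", "dog"]


class Wrapped:          # stands in for a DataParallel-style wrapper
    module = Plain()


class Pruned:           # stands in for a pruned model without names
    pass


print(resolve_names(Plain(), 2), resolve_names(Wrapped(), 2),
      resolve_names(Pruned(), 2))
```

The longer-term fix is probably to copy `names` onto the pruned model when it is saved, as the FAQ's remark about model loading suggests, but that depends on the repository's code.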

pruning problem

training dataset: coco2017
model weights: yolov5s.pt
classes = 80
Why are the values of P and R so low when pruning?

As follows:
Epoch gpu_mem GIoU obj cls total targets img_size
3/49 2.66G 0.06193 0.0939 0.04484 0.2007 141 640: 100%|██████████████████| 7393/7393 [2:41:07<00:00, 1.31s/
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████| 313/313 [02:54<00:00, 1.80it/s]
all 5e+03 3.63e+04 0.000964 0.000638 5.18e-05 1.53e-05

Low training recall rate

I set the test params (conf_thresh=0.3, iou_thresh=0.3, batch=24) when training on the VOC2012 and 2017 data sets, but
the recall after 120 epochs is only 0.3xx. Is this normal?
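Recall depends directly on the confidence threshold, because every detection scored below it is discarded before matching. The toy example below illustrates the effect with made-up detections; `precision_recall` is a hypothetical helper, and each detection is given as a (score, is_true_positive) pair that is assumed to have been matched against ground truth already.

```python
def precision_recall(detections, num_gt, conf_thresh):
    """Precision and recall after dropping detections below conf_thresh."""
    kept = [is_tp for score, is_tp in detections if score >= conf_thresh]
    tp = sum(kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / num_gt if num_gt else 0.0
    return precision, recall


# Made-up detections for an image set with 4 ground-truth boxes
dets = [(0.9, True), (0.6, True), (0.4, False), (0.2, True), (0.1, False)]
strict = precision_recall(dets, num_gt=4, conf_thresh=0.3)
loose = precision_recall(dets, num_gt=4, conf_thresh=0.05)
print(strict, loose)
```

With a threshold as high as 0.3, low recall by itself may therefore say little about the model, which is why mAP is usually the more informative number.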

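Precision drops too much during distillation

Thanks for the project.
During distillation I found that Precision drops quite a lot, and in several of your distillation schemes Precision is also somewhat lower than before distillation. Is there any way to improve this?
For single-class detection, though, there is no large drop.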

syencil / mobile-yolov5-pruning-distillation