mobile-yolov5-pruning-distillation's People

Contributors

syencil

mobile-yolov5-pruning-distillation's Issues

run it

I cannot run it; it gives me this message:

Traceback (most recent call last):
File "train.py", line 762, in <module>
opt.cfg = check_file(opt.cfg) # check file
File "/home/demon/CLionProjects/mobile-yolov5-pruning-distillation/utils/utils.py", line 101, in check_file
if os.path.isfile(file):
File "/home/demon/anaconda3/envs/pytorch_gpu/lib/python3.8/genericpath.py", line 30, in isfile
st = os.stat(path)

Details of L1 sparsity training

import torch

def compute_pruning_loss(p, prunable_modules, model, loss):
    ft = torch.cuda.FloatTensor if p[0].is_cuda else torch.Tensor
    ll1 = ft([0])
    h = model.hyp  # hyperparameters
    red = 'mean'  # Loss reduction (sum or mean)
    if prunable_modules is not None:
        for m in prunable_modules:
            ll1 += m.weight.norm(1)  # L1 norm of the BN layer's gamma values
        ll1 /= len(prunable_modules)
    ll1 *= h['sl']
    bs = p[0].shape[0]  # batch size
    loss += ll1 * bs
    return loss

Could you explain why the mean norm in this function is multiplied by the batch size?
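A minimal, self-contained sketch of the same penalty may help when experimenting with the scaling. `bn_l1_penalty`, `sparsity_weight` and the toy model are illustrative names, not code from the repo. One possible reason for the batch-size factor, assuming this fork follows upstream YOLOv5 of that period (where the detection loss was returned multiplied by the batch size), is that scaling the penalty the same way keeps the two terms on a comparable scale. That is an assumption, not a confirmed answer from the author.

```python
import torch
import torch.nn as nn


def bn_l1_penalty(prunable_modules, sparsity_weight, batch_size):
    """Mean L1 norm of the BN scale factors (gamma), times weight and batch size."""
    if not prunable_modules:
        return torch.zeros(1)
    device = prunable_modules[0].weight.device
    l1 = torch.zeros(1, device=device)
    for m in prunable_modules:
        l1 = l1 + m.weight.norm(1)  # sum of |gamma| for this BN layer
    l1 = l1 / len(prunable_modules)  # average over layers
    return l1 * sparsity_weight * batch_size


# Toy model: BN weights are initialised to 1, so the norms are 8 and 4.
toy = nn.Sequential(
    nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8),
    nn.Conv2d(8, 4, 3), nn.BatchNorm2d(4),
)
bns = [m for m in toy.modules() if isinstance(m, nn.BatchNorm2d)]
penalty = bn_l1_penalty(bns, sparsity_weight=1e-3, batch_size=16)
```

Dropping the `batch_size` factor only rescales the penalty, so it can be absorbed into the sparsity weight if the detection loss is not scaled by batch size.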

Pruning code

Hello, I cannot quite follow the pruning code. Is there a blog post that explains it?

1

After distilling the pruned model, I found that accuracy went up, but the parameter count also returned to its pre-pruning value.

mAP becomes 0

Why did mAP drop to 0 after the second epoch when I trained mobilev2-yolo5s on my own data?

About training

What VOC data format do you use for training? Does the VOC data need to be converted to xywh format after downloading?
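For reference, a small sketch of the usual conversion from a VOC box (pixel `xmin, ymin, xmax, ymax`) to normalised `x_center, y_center, width, height`. It assumes the labels follow the upstream YOLOv5 convention; whether this repo expects exactly that layout should be checked against its data loader. `voc_box_to_yolo` is a hypothetical helper name.

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box to a normalised centre/size box."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# A 200x200 box inside a 500x500 image.
box = voc_box_to_yolo(100, 200, 300, 400, img_w=500, img_h=500)
```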

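About the pruning and distillation strategy

Could you provide some reference material on the pruning and distillation strategies used here? Many thanks. Also, does changing the dataset affect the pruning and distillation strategy and its results?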

prune and distillation

Hello, sorry to bother you again. I ran your code for pruning and distilling mobilenet. Pruning reduces the parameters and distillation improves the accuracy. Why not use the pruned model as the student model for distillation?

RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.cuda.FloatTensor) should be the same

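Hello, I tried training with yolov5s as the model, using the command
python train.py --type vocs
but when the validation set is evaluated after the first epoch, this error is raised: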

Traceback (most recent call last):
File "train.py", line 802, in <module>
train(hyp)
File "train.py", line 453, in train
results, maps, times = test.test(opt.data,
File "/home/cwy/mobile-yolov5-pruning-distillation/test.py", line 112, in test
inf_out, train_out = model(img, augment=augment)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 621, in forward
outputs = self.parallel_apply(self._module_copies[:len(inputs)], inputs, kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 646, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 1 on device 1.
Original Traceback (most recent call last):
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/mobile-yolov5-pruning-distillation/models/yolo.py", line 94, in forward
return self.forward_once(x, profile) # single-scale inference, train
File "/home/cwy/mobile-yolov5-pruning-distillation/models/yolo.py", line 111, in forward_once
x = m(x) # run
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/mobile-yolov5-pruning-distillation/models/common.py", line 86, in forward
return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/mobile-yolov5-pruning-distillation/models/common.py", line 21, in forward
return self.act(self.bn(self.conv(x)))
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
return self._conv_forward(input, self.weight)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.cuda.FloatTensor) should be the same

I have tried many approaches but still get the same error. How can I solve it?
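The final error says the input is half precision while the convolution weights are float32. A generic way to rule out that mismatch is to cast the input to whatever dtype the model's parameters currently have. The sketch below shows that idea only; `match_input_dtype` is a hypothetical helper, and it does not explain why the input and the replica weights diverged under the parallel wrapper in the first place.

```python
import torch
import torch.nn as nn


def match_input_dtype(model, img):
    """Cast img to the device and dtype of the model's first parameter."""
    p = next(model.parameters())
    return img.to(device=p.device, dtype=p.dtype)


model = nn.Conv2d(3, 4, kernel_size=1)              # float32 weights
img = torch.zeros(1, 3, 8, 8, dtype=torch.float16)  # half-precision input
out = model(match_input_dtype(model, img))          # no dtype mismatch
```

The opposite direction (converting every replica's weights to half) is the other option; which one is appropriate depends on how the test code prepares the model.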

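Strategy

If I follow your steps one by one, will the result be lighter than before, with a smaller memory footprint? What pruning ratio is typically used, and which distillation strategy is usually chosen? I have roughly 30,000 to 40,000 images in total.

Distillation

When distilling with the v2 weights of v5, I get the error AttributeError: 'Detect' object has no attribute 'm'. After pruning, mAP is only about 0.4. What is going on? I followed the quick start.

Segmentation fault in pruning

I first trained with python train.py --type mvocs, then ran python3 train.py --type smvocs with best_mvocs.pt from the previous step as the input model, and then ran python3 pruning.py -t 0.1 with best_smvocs.pt from the previous step as the input model. Running python3 pruning.py -t 0.1 reports a segmentation fault.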

all processes

Is your workflow as follows:
sparse learning --> prune --> finetune --> distill?

how to finetune the pruned model?

When I train it with 'train.py', the output model is not pruned; it is as large as the original model.
Do I need a *.yaml for the pruned model, and if so, how do I get it?
Thanks.

pruning is useless

prunable_modules = []
prunable_module_type = (nn.BatchNorm2d, )
for i, m in enumerate(model.modules()):
    if i in ignore_idx:
        continue
    if isinstance(m, prunable_module_type):
        prunable_modules.append(m)

As shown above, the model structure is not changed!
In addition, the BN of the conv module is not appended to prunable_modules.
Where does the pruning happen?
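The loop quoted above only collects BN layers; in BN-based channel pruning the structural change normally comes in a later step that uses the collected gamma values to decide which channels to keep. The sketch below shows that generic selection step with a single global threshold. It illustrates the technique under stated assumptions and is not the repo's `pruning.py`; `bn_channel_masks` and `prune_ratio` are hypothetical names.

```python
import torch
import torch.nn as nn


def bn_channel_masks(bn_layers, prune_ratio):
    """Return one boolean keep-mask per BN layer using a global |gamma| threshold."""
    gammas = torch.cat([bn.weight.detach().abs().flatten() for bn in bn_layers])
    num_pruned = int(gammas.numel() * prune_ratio)
    if num_pruned == 0:
        return [torch.ones_like(bn.weight, dtype=torch.bool) for bn in bn_layers]
    sorted_gammas, _ = torch.sort(gammas)
    threshold = sorted_gammas[num_pruned - 1]  # largest gamma that gets pruned
    # Channels whose |gamma| equals the threshold are pruned as well.
    return [bn.weight.detach().abs() > threshold for bn in bn_layers]


bn_a, bn_b = nn.BatchNorm2d(4), nn.BatchNorm2d(2)
with torch.no_grad():
    bn_a.weight.copy_(torch.tensor([0.9, 0.01, 0.5, 0.02]))
    bn_b.weight.copy_(torch.tensor([0.03, 0.8]))

masks = bn_channel_masks([bn_a, bn_b], prune_ratio=0.5)
kept_per_layer = [int(m.sum()) for m in masks]
```

A real pruning pass would then rebuild each convolution and BN layer with only the kept channels, which is the point where the parameter count actually drops.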

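Which layers does ignore_idx refer to in pruning

Hello, thank you for open-sourcing your work. In the pruning part you set ignore_idx=[230, 260, 290]. Which layers do [230, 260, 290] refer to? Alternatively, do you have weights you could share?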
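Since the pruning loop indexes layers with `enumerate(model.modules())`, one way to see what a given index refers to is to print the same enumeration for the loaded model. The sketch below uses a toy network; `list_module_indices` is a hypothetical helper, and the indices for the real model depend on its architecture.

```python
import torch.nn as nn


def list_module_indices(model):
    """Pair each index from enumerate(model.modules()) with the module's class name."""
    return [(i, type(m).__name__) for i, m in enumerate(model.modules())]


toy = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
indices = list_module_indices(toy)
for idx, name in indices:
    print(idx, name)
```

Note that `modules()` yields the container itself first, so index 0 is the top-level model rather than its first layer.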

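Summary of some common questions

  1. Q: Why does the training mAP sometimes become 0?
    A: It depends on the case. In most of the cases I have seen, the loss goes to 0 and the validation mAP goes to 0, which is clearly overfitting. See the CSDN article I wrote earlier.

  2. Q: torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'
    A: This is a pitfall inherited from the original git repo. What you might expect model loading to be: read the weights directly, restore the model, and build the gradient relationships. What model loading actually is: build a randomly initialized model from the yaml, obtain the gradient relationships, then load the saved weights for each layer. The former is used for pure forward inference; the latter is used when the gradient relationships are needed, for example in pruning. Also, keep the original project structure, and remember export PYTHONPATH=$PWD

  3. Q: What if the yolo5s pretrained weights are no longer available?
    A: I no longer have them either. However, this git repo mainly uses the lighter mobilev2-yolo5s, so you can try that instead. If you need yolov5, use their latest version; adapting the code by following this repo is fairly easy. The Readme also has a network-drive link you can try.

  4. Q: Recall or precision is low
    A: This depends on the threshold, so here I only track mAP as the intermediate metric. A drop in mAP during pruning or distillation is also normal; it depends on your dataset and your strategy. After all, every paper claims to be SOTA, so you need to judge for yourself.

  5. Q: Can the pruned model be saved as a yaml file? Can it be distilled or converted to ncnn?
    A: Yes, but a yaml file is unnecessary. At that point you can just load the model directly; see the ft (finetune) mode. If you really want one, write a script that does the reverse conversion. It is not hard and is good practice.

  6. Q: Model training details, references for pruning and distillation, etc.
    A: Most of this is in the readme. For details, read the code directly; it is not hard, so treat it as a learning exercise. To add: pruning is Song Han's BN-layer-based pruning, and distillation uses methods designed for one-stage detectors, two-stage detectors, classification networks, and so on. The specifics certainly deviate from the original papers, but the differences are small, so judge for yourself. Also, please do not ask me why I did not use some other method. I cannot run every experiment for you; you can try it yourself. The methods I chose are representative and classic, but not necessarily the best.

  7. Q: Pruning and distilling yolov5 in this repo has some bugs
    A: First, I have not specifically tested yolov5; its authors are still updating it, so there are bound to be bugs. Second, this project is mainly about producing a lighter model, but I can guarantee that following my steps one by one will reproduce the results. I even gave you the random seed >_<.

  8. Q: mobilev2-yolo5s converges slowly, but yolo5s converges quickly?
    A: Theirs was pretrained on COCO, with twice the parameters and computation of mine, so its capacity is large. Mine only uses ImageNet weights for the backbone, and the head is entirely randomly initialized, so slow convergence is normal, and lower accuracy from the smaller capacity is normal too. The lightweighting approach itself works from four angles: overall architecture design, pruning, distillation, and quantization. If you want to train yolov5x first and then make it lighter, I have no objection; that is also a valid approach.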
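The loading order described in answer 2 (build the architecture first, then load the saved per-layer weights) can be sketched as follows. `build_model` is a hypothetical stand-in for constructing the network from a yaml file, and an in-memory buffer replaces the checkpoint file; neither is the repo's actual code.

```python
import io

import torch
import torch.nn as nn


def build_model():
    """Stand-in for building a randomly initialised model from a yaml config."""
    return nn.Sequential(nn.Conv2d(3, 4, kernel_size=1), nn.BatchNorm2d(4))


# "Training" side: save only the weights, not the pickled model object.
trained = build_model()
buffer = io.BytesIO()
torch.save(trained.state_dict(), buffer)
buffer.seek(0)

# Loading side: rebuild the architecture, then load the saved weights into it.
restored = build_model()
restored.load_state_dict(torch.load(buffer))
```

This only works when the rebuilt architecture matches the saved weights, which is why a pruned model (whose layer shapes differ from the yaml) has to be loaded as a whole model object instead, as answer 5 suggests.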

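About finetune

As a beginner, I would like to ask a fairly basic question. If the number of finetune epochs for the pruned model is set high enough (for example 100 epochs, the same as normal training), it can reach a very good mAP. Is there any problem with doing this? Is there any limit on the number of finetune epochs, such as only 10 or 20 being allowed?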

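Some questions about distillation

Thanks to the author for the repo; I have learned a lot. While reading the distillation experiments I ran into some doubts that I would like to ask the author about.

  1. Strategy1 Output Based Distillation: in the repo, the Teacher-related distillation losses are all computed with MSELoss, whereas the paper still uses the same loss as the original YOLO. Could this limit how well the Student network can be tuned?
  2. Strategy2 Feature Based + Output Based Distillation: the repo seems to train both together, controlled by parameters, whereas the paper first freezes the parameters after the Feature Map and does Feature Distillation, then unfreezes them for global Distillation. Could this limit the effect of Strategy2 and cause the lower final mAP? Also, the Convert in the paper should consist of a convolution only, without the BN and ReLU operations added.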
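To make the two loss terms under discussion concrete, here is a minimal sketch of an MSE-based output distillation loss and a feature distillation loss with a convolution-only adapter (the "Convert" without BN or ReLU). It illustrates the general technique only; the function names, tensor shapes, and `weight` parameter are assumptions, and the repo's actual losses may be structured differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def output_distill_loss(student_out, teacher_out, weight=1.0):
    """MSE between student and teacher outputs; the teacher is not updated."""
    return weight * F.mse_loss(student_out, teacher_out.detach())


def feature_distill_loss(student_feat, teacher_feat, adapter):
    """MSE between adapted student features and teacher features."""
    return F.mse_loss(adapter(student_feat), teacher_feat.detach())


student_out = torch.zeros(2, 10)
teacher_out = torch.ones(2, 10)
out_loss = output_distill_loss(student_out, teacher_out, weight=0.5)

# Convolution-only adapter mapping 8 student channels to 16 teacher channels.
adapter = nn.Conv2d(8, 16, kernel_size=1)
student_feat = torch.randn(2, 8, 4, 4)
teacher_feat = torch.randn(2, 16, 4, 4)
feat_loss = feature_distill_loss(student_feat, teacher_feat, adapter)
```

A two-stage schedule like the one described for the paper would optimise `feat_loss` alone with the later layers frozen, then add the output term once they are unfrozen.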

After sparsity training, running pruning raises FileNotFoundError: [Errno 2] No such file or directory: 'render_img/before_pruning.jpg'. Where is this file generated? Thanks.
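One common cause of this kind of FileNotFoundError is that a script tries to write an output file into a directory that does not exist yet. Assuming that is what happens here (the source does not confirm whether `before_pruning.jpg` is an output or an expected input), creating the directory before writing avoids the error. `ensure_parent_dir` is a hypothetical helper, and a temporary directory stands in for the project root.

```python
import os
import tempfile


def ensure_parent_dir(path):
    """Create the parent directory of path if it is missing, then return path."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return path


workdir = tempfile.mkdtemp()
target = ensure_parent_dir(os.path.join(workdir, "render_img", "before_pruning.jpg"))
with open(target, "wb") as f:
    f.write(b"")  # placeholder for the real image data
```

In the project itself, the equivalent would be creating a `render_img` directory in the working directory before running the pruning script.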

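Statement!

After much thought, I am making the following statement.
This git repo is for learning only!!! Freeloaders are not welcome.
The purpose of this repo is to give you a general picture of the basic workflow of engineering work. Swapping the backbone, pruning, distillation, and quantization are all routine operations, as is using existing frameworks to port to C++ and Android.
If you want to use it commercially, or for a course project or graduation project, that is perfectly fine; just comply with the relevant open-source license of the original git repo. Time was short, so the code details inevitably contain oversights, but reproducing the whole pipeline by following the readme will work. Many possible problems and details are recorded in the readme and the code comments. I have also explained some implementation questions in various issues.
Good ideas are welcome, and if you run into conceptual problems you are welcome to discuss them, but freeloaders are not welcome.
Everyone's code environment and way of working are different, so I cannot debug for you simply from a copy-pasted error message.
For commercial collaboration, feel free to contact me privately.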

Ran into a problem

RuntimeError: Given groups=1, weight of size [1, 255, 1, 1], expected input[1, 512, 8, 8] to have 255 channels, but got 512 channels instead
I get this error when I run mob-yolov5s.yaml!
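The number 255 in this error is the usual YOLO detection-head width for 80 classes with 3 anchors per scale, so it can help to check which class count the config and the weights each assume. The sketch below only shows that arithmetic; `detect_out_channels` is a hypothetical helper, and it does not by itself identify which layer in the yaml is wired incorrectly.

```python
def detect_out_channels(num_classes, num_anchors=3):
    """Channels per detection scale: anchors * (classes + 4 box coords + 1 objectness)."""
    return num_anchors * (num_classes + 5)


coco_channels = detect_out_channels(80)  # 80-class head
voc_channels = detect_out_channels(20)   # 20-class head
```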

"render_img/before_pruning.jpg"

Thank you for sharing the code. When I run the pruning operation, there is a missing image. Can you provide me with this picture?

Thanks for your fantastic work! But I encountered a problem: onnx export error...

env :

ubuntu1804
torch: 1.6.0
torchvision: 0.7.0
Training: python3 train.py --type vocs, then python3 train.py --type dmvocs
export: export PYTHONPATH="$PWD" && python models/onnx_export.py --weights outputs/dmvocs/weights/best_dmvocs.pt --img 640 --batch 1

Error:

Namespace(batch_size=1, img_size=[640], weights='outputs/dmvocs/weights/best_dmvocs.pt')
Fusing layers...
Model Summary: 148 layers, 3.59629e+06 parameters, 3.31818e+06 gradients
Traceback (most recent call last):
  File "models/onnx_export.py", line 35, in <module>
    _ = model(img)  # dry run
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/byronnar/mobile-yolov5-pruning-distillation/models/yolo.py", line 94, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/data/byronnar/mobile-yolov5-pruning-distillation/models/yolo.py", line 111, in forward_once
    x = m(x)  # run
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/byronnar/mobile-yolov5-pruning-distillation/models/common.py", line 24, in fuseforward
    return self.act(self.conv(x))
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 419, in forward
    return self._conv_forward(input, self.weight)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 416, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 3, 3, 3], but got 3-dimensional input of size [1, 3, 640] instead

How can I solve this problem? Looking forward to your reply! Thank you!
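The log shows `img_size=[640]` and a dummy input of shape `[1, 3, 640]`, which suggests the single `--img` value was used as the whole spatial shape. A minimal sketch of expanding one size into height and width before building the dummy input is below; `expand_img_size` and `dummy_input_shape` are hypothetical helpers, not functions from `onnx_export.py`.

```python
def expand_img_size(img_size):
    """Turn [640] into [640, 640]; leave a two-element size unchanged."""
    sizes = list(img_size)
    if len(sizes) == 1:
        sizes = sizes * 2
    return sizes


def dummy_input_shape(batch_size, img_size):
    """Shape of the 4-D dummy image tensor: (batch, channels, height, width)."""
    return (batch_size, 3, *expand_img_size(img_size))


shape = dummy_input_shape(1, [640])
```

Passing both dimensions on the command line (for example `--img 640 640`) may have the same effect if the script's argument parser accepts more than one value, which should be checked in the script.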

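About downloading the weight files

When I run the first two training commands, the following errors appear, and the remaining training commands fail in a similar way. How can I solve this? qaq
[Screenshot: QQ截图20210306102921]
[Screenshot: QQ截图20210306102941]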

torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'

test.py raises an error when used with the pruned model. The error message is as follows:
Traceback (most recent call last):
File "/home/demon/PycharmProjects/mobile-yolov5-pruning-distillation/test.py", line 357, in <module>
test(opt.data,
File "/home/demon/PycharmProjects/mobile-yolov5-pruning-distillation/test.py", line 89, in test
names = {k: v for k, v in enumerate(model.names if hasattr(model, 'names') else model.module.names)}
File "/home/demon/anaconda3/envs/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 771, in __getattr__
raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'
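The failing line assumes the model either has `names` or is wrapped in an object with a `.module` attribute; the traceback shows the pruned model has neither. A defensive lookup with a fallback to numeric class names can be sketched as follows. `get_class_names`, `Wrapper`, and the fallback naming are assumptions for illustration, not the repo's fix.

```python
import torch.nn as nn


def get_class_names(model, num_classes=None):
    """Return {index: name}, unwrapping .module and falling back to numeric names."""
    core = model.module if hasattr(model, "module") else model
    names = getattr(core, "names", None)
    if names is None:
        if num_classes is None:
            raise AttributeError("model has no 'names'; pass num_classes to generate them")
        names = [str(i) for i in range(num_classes)]
    return dict(enumerate(names))


class Wrapper:
    """Minimal stand-in for a parallel wrapper that exposes .module."""

    def __init__(self, module):
        self.module = module


plain = nn.Linear(2, 2)                      # no 'names', no 'module'
fallback = get_class_names(plain, num_classes=2)

plain.names = ["cat", "dog"]
direct = get_class_names(plain)
wrapped = get_class_names(Wrapper(plain))
```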

pruning problem

training dataset: coco2017
model weights: yolov5s.pt
classes = 80
Why are the P and R values so low when pruning?

As follows:
Epoch gpu_mem GIoU obj cls total targets img_size
3/49 2.66G 0.06193 0.0939 0.04484 0.2007 141 640: 100%|██████████████████| 7393/7393 [2:41:07<00:00, 1.31s/
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████| 313/313 [02:54<00:00, 1.80it/s]
all 5e+03 3.63e+04 0.000964 0.000638 5.18e-05 1.53e-05

Low training recall rate

I set the test params (conf_thresh=0.3, iou_thresh=0.3, batch=24) when training on the VOC2012 and 2017 data sets, but
the recall after 120 epochs is only 0.3xx. Is this normal?
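Recall depends directly on the confidence threshold, because every detection below the threshold is discarded before matching. The toy sketch below shows that effect with made-up detections; `precision_recall` and the sample data are purely illustrative and say nothing about the real model's numbers.

```python
def precision_recall(detections, num_gt, conf_thresh):
    """detections: list of (confidence, is_true_positive) pairs."""
    kept = [is_tp for conf, is_tp in detections if conf >= conf_thresh]
    tp = sum(kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / num_gt if num_gt else 0.0
    return precision, recall


# Illustrative detections for an image set with 5 ground-truth boxes.
detections = [(0.9, True), (0.8, True), (0.6, False),
              (0.4, True), (0.2, True), (0.1, False)]
high_thresh = precision_recall(detections, num_gt=5, conf_thresh=0.3)
low_thresh = precision_recall(detections, num_gt=5, conf_thresh=0.05)
```

With the higher threshold the low-confidence true positive is dropped, so recall falls while precision rises, which is why a threshold-independent metric such as mAP is easier to compare across runs.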

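Precision drops too much during distillation

Thanks for the project.
During distillation I found that Precision drops quite a lot. In your several distillation schemes, Precision is also somewhat lower than before distillation. Is there any way to improve this?
With a single class, there is no big drop~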
