he-y / soft-filter-pruning
Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
Home Page: https://arxiv.org/abs/1808.06866
Hello, I must thank you for sharing your good work with us.
I however have one question about the implementation.
I am trying to perform pruning on a pretrained model. In one of your scripts it has been given as:
python pruning_train_from_pretrain.py $DOME_HOME/datasets/ILSVRC2012 -a resnet18 --save_dir ./snapshots/Pretrain-resnet18-rate-0.7 --rate 0.7 --layer_begin 0 --layer_end 57 --layer_inter 3 --workers 36 --use_pretrain --lr 0.01
However, there are a couple of problems here:
error: unrecognized arguments: --use_pretrain --lr 0.01
Also, I don't understand how to incorporate the "use_pretrain" in my code.
I am also unable to find "pruning_train_from_pretrain.py" in your repository.
I would be much obliged if you could help me with this.
Keep up the good work!
Thanks in advance
The code only uses a mask to zero out the pruned filters, so how do I get the actual pruned model?
Also, could you share how you calculate the FLOPs reported in your work?
I don't understand the meaning of the parameter layer_inter. Does it encode the type of layer (for example, 1 for a conv layer and 2 for another type of layer), or is it the index gap between adjacent conv layers?
Thanks a lot
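One reading of `--layer_begin`, `--layer_end`, and `--layer_inter`, offered as an assumption and not as the author's statement: the values index positions in `model.parameters()`, and a step of 3 jumps from one conv weight to the next because each conv weight is followed by a BatchNorm weight and bias. The sketch below rebuilds the parameter order of a torchvision-style ResNet-18 by hand (the helper name `resnet18_param_names` is hypothetical) and lists what `range(0, 57 + 1, 3)` would select.

```python
def resnet18_param_names():
    """Parameter names in the order a torchvision-style ResNet-18 registers them."""
    def conv_bn(conv, bn):
        return [f"{conv}.weight", f"{bn}.weight", f"{bn}.bias"]

    names = conv_bn("conv1", "bn1")
    for stage in range(1, 5):
        for block in range(2):
            prefix = f"layer{stage}.{block}"
            names += conv_bn(f"{prefix}.conv1", f"{prefix}.bn1")
            names += conv_bn(f"{prefix}.conv2", f"{prefix}.bn2")
            # The first block of stages 2-4 has a 1x1 downsample conv + BN.
            if stage > 1 and block == 0:
                names += conv_bn(f"{prefix}.downsample.0", f"{prefix}.downsample.1")
    names += ["fc.weight", "fc.bias"]
    return names

names = resnet18_param_names()
# Indices chosen by --layer_begin 0 --layer_end 57 --layer_inter 3 under this reading.
selected = [(i, names[i]) for i in range(0, 57 + 1, 3)]
for i, name in selected:
    print(i, name)
```

Under this reading, every selected index is a conv weight, index 57 is the last conv before the classifier, and indices 21, 36, and 51 are the downsample convolutions, which a script may want to treat separately.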
If we cannot obtain N_{i+1}(1-P_{i}) zero filters at layer i in the converged network, how should we deal with it?
In your code, you only zero out the parameters of the conv layers. But the BN layers also have parameters (scaling and bias, especially the latter), so the output of BN will still be non-zero even though its input is zero. Would it be better to zero out the BN parameters as well?
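The concern can be checked with the inference-time BatchNorm formula y = gamma * (x - mean) / sqrt(var + eps) + beta. The sketch below uses made-up statistics (all values are illustrative assumptions) to show that a channel whose input is exactly zero still produces a constant, generally non-zero output.

```python
import math

def batchnorm_eval(x, gamma, beta, running_mean, running_var, eps=1e-5):
    """Per-channel BatchNorm in inference mode for a single activation value."""
    return gamma * (x - running_mean) / math.sqrt(running_var + eps) + beta

# A zeroed filter feeds x = 0 into BN at every spatial position.
out_stale_stats = batchnorm_eval(0.0, gamma=1.2, beta=0.3, running_mean=0.5, running_var=1.0)
# If the running statistics have also settled at zero, only beta remains.
out_zero_stats = batchnorm_eval(0.0, gamma=1.2, beta=0.3, running_mean=0.0, running_var=0.0)
print(out_stale_stats, out_zero_stats)
```

In both cases the channel is constant across positions, so in principle it could be folded into the next layer, but it is not zero unless beta and the mean term cancel or are set to zero.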
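@he-y Thanks for sharing the code. I looked through the Mask class and could not find how the skip connections in residual blocks are handled, as in the figure below:
The channels pruned from the input feature map and the channels pruned from the feature map after the second conv may differ, yet the two must be added together. How does soft filter pruning handle this case? Thanks 🙏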
Hi He Yang,
Thanks for sharing your amazing research work.
Could you give more detailed information about the FLOPs calculation process?
I see that cifar_resnet_flop.py is only for ResNet, so could you provide a VGG version?
Many Thanks
Best Regards
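A common way to estimate convolution FLOPs, shown here as a sketch and not as the method used in `cifar_resnet_flop.py`, counts one multiply-accumulate per kernel element per output position. The function names `conv_flops` and `kept` and the layer sizes are illustrative assumptions; the same formula applies to a VGG-style layer.

```python
def conv_flops(c_in, c_out, kernel, h_out, w_out):
    """Multiply-accumulate count of a bias-free 2D convolution."""
    return c_in * c_out * kernel * kernel * h_out * w_out

def kept(channels, keep_rate):
    """Number of channels left after pruning, truncated to an integer."""
    return int(channels * keep_rate)

# Example: a 3x3 conv with 64 -> 64 channels on a 32x32 feature map.
dense = conv_flops(64, 64, 3, 32, 32)
# Pruning both the producing and the consuming layer to a 0.5 keep rate
# shrinks the input and the output channel count of this layer.
pruned = conv_flops(kept(64, 0.5), kept(64, 0.5), 3, 32, 32)
print(dense, pruned, pruned / dense)
```

Summing this quantity over all conv layers, with the channel counts that remain after pruning, gives a network-level estimate; fully connected layers add in_features * out_features each.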
Hello, I want to use your method on InceptionV3, which I downloaded from https://download.pytorch.org/models/inception_v3_google-1a9a5a14.pth, but I don't know how to get the indices of the first and last conv layers. Could you help me?
Thank you very much!
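For a network without a hand-written index table, one option is to enumerate the parameters and keep the positions of 4-D tensors, since conv weights have shape (out, in, kH, kW). This is a sketch on plain shape tuples; with a real model the shapes could come from something like `[tuple(p.shape) for p in model.parameters()]`. The function name `conv_weight_indices` is hypothetical and the example shapes are made up, not InceptionV3's.

```python
def conv_weight_indices(param_shapes):
    """Positions of 4-D tensors (conv weights) in a list of parameter shapes."""
    return [i for i, shape in enumerate(param_shapes) if len(shape) == 4]

# Illustrative shapes: conv, BN weight, BN bias, conv, BN weight, BN bias, fc weight, fc bias.
shapes = [(32, 3, 3, 3), (32,), (32,), (64, 32, 3, 3), (64,), (64,), (10, 64), (10,)]
indices = conv_weight_indices(shapes)
first_conv, last_conv = indices[0], indices[-1]
print(indices, first_conv, last_conv)
```

In a branched architecture the conv weights need not be evenly spaced, so an explicit index list like this may be safer than a begin/end/step triple.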
Thanks for your wonderful work, but I'm wondering how I can save the final compact model.
Are inference-only scripts also available?
The model only becomes faster and smaller once the zero filters are actually removed. Is there any difference in accuracy between the model with zeroed filters and the model with those filters removed?
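For a plain pair of layers with no bias or BatchNorm offset on the pruned channel, zeroing a filter and physically removing it give the same output, because the zero channel contributes nothing to the next layer's sums. The sketch below shows this with two small matrix products standing in for 1x1 convolutions; all weights are made up.

```python
def matvec(weights, x):
    """Apply a layer whose rows are filters: one dot product per output channel."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

x = [1.0, 2.0]
layer1 = [[0.5, -1.0], [0.0, 0.0], [2.0, 0.25]]   # filter 1 has been zeroed
layer2 = [[1.0, 3.0, -2.0], [0.5, 7.0, 1.0]]      # consumes 3 input channels

masked_out = matvec(layer2, matvec(layer1, x))

# Compact model: drop filter 1 from layer1 and input channel 1 from layer2.
keep = [0, 2]
layer1_small = [layer1[i] for i in keep]
layer2_small = [[row[i] for i in keep] for row in layer2]
compact_out = matvec(layer2_small, matvec(layer1_small, x))

print(masked_out, compact_out)
```

When BatchNorm or a bias follows the zeroed filter, the channel can carry a constant offset, so the two models need not match unless that offset is handled.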
Hi He Yang,
Thanks for your contribution of the paper and code.
I wonder whether you have ever run the following experiment:
multiply the width of every convolution layer of ResNet by the pruning rate (e.g. 0.7), and then train the resulting smaller model from scratch on ImageNet?
Thanks.
Hello, isn't bn_value here just bn3.weight? Why does it have to be added to residual again?
`
residual += self.bn_value.cuda()
residual.index_add_(1, self.index.cuda(), out)
residual = self.relu(residual)
`
Hi, I am trying to reproduce your CIFAR-10 training using pruning_cifar10_resnet.py and cifar10_resnet.sh. It works fine with resnet110, but I get the following error if I change to resnet18. Why are there two families of ResNet, resnet[18,34,50,101] versus resnet[20,32,56,110]?
Error:
self.padding, self.ceil_mode, self.count_include_pad, self.divisor_override)
RuntimeError: Given input size: (512x1x1). Calculated output size: (512x-5x-5). Output size is too small
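The negative size in the error is consistent with an ImageNet-style ResNet applied to 32x32 CIFAR images: such a network downsamples by a factor of 32 and then applies a 7x7 average pool, which fits a 224x224 input but not a 1x1 feature map. The sketch below applies the standard pooling output-size formula; the assumption that resnet18 here ends in a 7x7, stride-1 average pool is inferred from the error message, not confirmed by the source.

```python
def pool_output_size(in_size, kernel, stride=1, padding=0):
    """Output length of a pooling window along one dimension (floor mode)."""
    return (in_size + 2 * padding - kernel) // stride + 1

total_downsampling = 32  # stride-2 stem conv, max pool, and three stride-2 stages

imagenet_map = 224 // total_downsampling   # side of the final feature map for 224x224 input
cifar_map = 32 // total_downsampling       # side of the final feature map for 32x32 input

imagenet_pooled = pool_output_size(imagenet_map, kernel=7)
cifar_pooled = pool_output_size(cifar_map, kernel=7)
print(imagenet_map, imagenet_pooled)
print(cifar_map, cifar_pooled)
```

As for the two families: resnet[20,32,56,110] are the three-stage CIFAR variants from the original ResNet paper, while resnet[18,34,50,101] are the four-stage ImageNet variants, so the latter need either larger inputs or a modified stem and pooling layer to run on CIFAR.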
Hello, I used your method on InceptionV3, which I downloaded from https://download.pytorch.org/models/inception_v3_google-1a9a5a14.pth. After training for 100 epochs, I got best.inception_v3.pth.tar. Now I want to obtain the small model. I saw the issue, but I still don't know how to obtain it.
Can you help me?
Thank you!
I downloaded the pruned ResNet50 you provided, but how do I load it?
import torch
checkpoint = torch.load('/home/fwq/soft-filter-pruningResNet50_pruned/ResNet50_pruned_rate0.3.pt')
It shows an error: SourceChangeWarning: source code of class 'torch.nn.modules.linear.Linear' has changed. you can retrieve the original source code by accessing the object's source attribute or set torch.nn.Module.dump_patches = True
and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
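Hello, first of all, thank you for providing the code. I would like to ask whether the final script that performs the actual pruning, pruning_train_from_pretrain.py, is missing. It would be great if you could provide it. Thanks.
Doesn't the algorithm as implemented in this source code differ from the one in your paper? Each pruning step actually modifies the weights.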
In pruning_train.py, ResNet-50 has 150 layers in total, but in the code the last layer's index is 159, which is out of range.
So what is the rule for choosing the skip layers?
Thanks.
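Hello, the pruning idea in the paper is great.
Is there a model, trained with soft pruning to the end, in which the pruned channels have actually been removed?
I read the code carefully, and I think that pruning ResNet this way cannot in the end remove all the channels that should be removed: many conv layers in the ResNet blocks are coupled, so the pruned channels may be misaligned during training and cannot finally be removed.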
Is this model pruned starting from an ImageNet-pretrained model?
Thank you for your excellent paper.
But I don't understand why setting filters to zero can make it faster, since you still need to do the matrix multiplication. Could you please give me a brief explanation?
Thank you very much!
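1. In the second diagram of the third row of Figure 3 (the ASFP illustration), shouldn't the importance of filter 2 be 0 after pruning? (The figure shows 0.734.)
2. Is the fitted exponential function for the fixed points in Figure 5 written incorrectly? My understanding is that f(x) is the pruning rate at epoch x, computed from equations (6) and (7), Pmax=30, D=1/8, and the three fixed points, but the f(x)=30.000e^(0.055x)+70.000 shown in the figure does not seem to match.
I hope you can answer these questions. Thank you!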
int(16*0.4)=6
int(16*0.1)+int(16*0.3)=5
so
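The mismatch between 6 and 5 comes from truncating each product separately. A quick check in Python, assuming 16 filters and the rates 0.4, 0.1, and 0.3 (the variable names are illustrative):

```python
n_filters = 16

# Truncating once: 16 * 0.4 = 6.4, which int() cuts to 6.
combined = int(n_filters * 0.4)
# Truncating twice: 1.6 -> 1 and 4.8 -> 4, so one filter is lost.
split = int(n_filters * 0.1) + int(n_filters * 0.3)
print(combined, split)
```

If the two counts must agree, one option is to compute the combined count once and derive the second part by subtraction.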
Hi, I notice that your get_small_model.py can only be used for ResNet, but I want to use this code for other networks. Could you give me some advice?
I look forward to your reply. Thanks!
I ran "pip3 install torch==0.3.1" successfully, but "pip3 install torchvision==0.3.0" failed with the following log:
Could not find a version that satisfies the requirement torchvision==0.3.0 (from versions: 0.1.6, 0.1.7, 0.1.8, 0.1.9, 0.2.0, 0.2.1, 0.2.2, 0.2.2.post2, 0.2.2.post3)
No matching distribution found for torchvision==0.3.0
I then checked https://github.com/pytorch/vision, but the latest release I found there is 0.2.2. How should I install torchvision?
Thanks!
Are all weights in a zero filter exactly zero, or do we regard a filter as a zero filter when its weights fall below a threshold?
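The two definitions in the question can be compared directly. The sketch below treats a layer's weights as a list of flattened filters and counts filters that are exactly zero versus filters whose L2 norm falls below a threshold; the function names, the threshold value, and the weights are illustrative assumptions.

```python
import math

def l2_norm(filter_weights):
    """Euclidean norm of one flattened filter."""
    return math.sqrt(sum(w * w for w in filter_weights))

def count_exact_zero(filters):
    """Filters in which every weight equals zero."""
    return sum(1 for f in filters if all(w == 0.0 for w in f))

def count_below_threshold(filters, threshold):
    """Filters whose L2 norm is smaller than the threshold."""
    return sum(1 for f in filters if l2_norm(f) < threshold)

filters = [
    [0.0, 0.0, 0.0, 0.0],      # masked to exactly zero
    [1e-4, -2e-4, 0.0, 1e-4],  # tiny but not zero
    [0.3, -0.7, 0.1, 0.5],
]
n_exact = count_exact_zero(filters)
n_small = count_below_threshold(filters, threshold=1e-3)
print(n_exact, n_small)
```

A mask-based implementation would produce filters of the first kind right after each pruning step; the threshold test is only needed if the weights have been updated since.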
Hi, may I know how to use get_small_model.py? I repeated the ResNet-50 run of pruning_train.py on ImageNet to get a ResNet-50 checkpoint, then loaded the model with get_small_model.py. I received a "False state dict" error because my ResNet-50 state dict has 320 entries instead of 267. I also tried ResNet-18, and my state dict has 122 entries instead of 102. How did you get 102 for ResNet-18 and 267 for ResNet-50? Thanks.
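One possible cause, stated as an assumption: newer PyTorch versions add a `num_batches_tracked` buffer to every BatchNorm layer, and the differences match the BatchNorm counts (320 - 267 = 53 for ResNet-50, 122 - 102 = 20 for ResNet-18). If that is the case, dropping those keys before the size check restores the older layout. The helper name `drop_num_batches_tracked` is hypothetical and the dictionary is a stand-in for a real `state_dict`.

```python
def drop_num_batches_tracked(state_dict):
    """Return a copy of the state dict without BatchNorm step-counter buffers."""
    return {k: v for k, v in state_dict.items() if not k.endswith("num_batches_tracked")}

# Stand-in for a checkpoint saved by a newer PyTorch; real values would be tensors.
state_dict = {
    "conv1.weight": "...",
    "bn1.weight": "...",
    "bn1.bias": "...",
    "bn1.running_mean": "...",
    "bn1.running_var": "...",
    "bn1.num_batches_tracked": "...",
}
old_style = drop_num_batches_tracked(state_dict)
print(len(state_dict), len(old_style))
```

The expected totals are consistent with this: ResNet-50 has 53 conv weights, 53 BatchNorm layers with 4 entries each, and 2 classifier entries, which gives 267.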
Why is layer1.0.conv2.weight not included in bottle_block_flag in get_small_model.py?
Why are there only conv1 and conv3?
Thanks for sharing the codes.
In the scripts, the file pruning_cifar10.sh may contain a typo.
The save name is ratenorm0.7_ratedist0.1, yet the value passed to the argument '--rate_norm' is 1.
Is it a mistake?
If not, I would need help to understand it.
I downloaded the dataset cifar-10-python.tar.gz myself and placed it at /data/cifar.python/cifar-10-python.tar.gz.
Then I ran sh scripts/cifar10_resnet.sh,
but it did not work.
Log:
=> do not use any checkpoint for resnet110 model
----------one epoch begin----------
the compression rate now is 0.700000
pruning_cifar10_resnet.py:305: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
input_var = torch.autograd.Variable(input, volatile=True)
pruning_cifar10_resnet.py:306: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
target_var = torch.autograd.Variable(target, volatile=True)
Traceback (most recent call last):
File "pruning_cifar10_resnet.py", line 467, in <module>
main()
File "pruning_cifar10_resnet.py", line 177, in main
val_acc_1, val_los_1 = validate(test_loader, net, criterion, log)
File "pruning_cifar10_resnet.py", line 315, in validate
top1.update(prec1[0], input.size(0))
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
I hope to hear from you.
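The `IndexError` arises because newer PyTorch returns 0-dim tensors from reductions, and indexing them with `[0]` is no longer allowed; the message itself suggests `tensor.item()`. Changing `prec1[0]` to `prec1.item()` at the failing line is the likely fix. The helper below is a version-tolerant sketch; `to_number` and the `FakeScalar` stand-in class are hypothetical and only make the example runnable without PyTorch.

```python
def to_number(value):
    """Convert a 0-dim tensor, a 1-element sequence, or a plain number to a Python number."""
    if hasattr(value, "item"):      # 0-dim or 1-element tensors in newer PyTorch
        return value.item()
    try:
        return value[0]             # 1-element containers, as older code expected
    except TypeError:
        return value                # already a plain int or float

class FakeScalar:
    """Stand-in for a 0-dim tensor: supports .item() but not indexing."""
    def __init__(self, v):
        self.v = v

    def item(self):
        return self.v

results = (to_number(FakeScalar(93.5)), to_number([88.0]), to_number(70.0))
print(results)
```

In the script this would read `top1.update(to_number(prec1), input.size(0))`. The `volatile` warnings in the same log are separate and point to wrapping the validation loop in `with torch.no_grad():`.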
torch 0.3.1 is too old. Is there a solution for newer versions?
Theoretically speaking, when you prune channels along the output dimension, you should not get any gradient for the corresponding weights during the backward pass. How do you handle this in the code? Could you point me to the corresponding section?
I do notice that BN layers are not masked. If you keep the BN bias, there will indeed be gradients through the BN bias, but this seems like a very hacky workaround.
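As a sketch of the "soft" behaviour under discussion, and not a copy of the repository's code: filters with the smallest L2 norm are set to zero at the end of an epoch, but nothing freezes them, so a later update can move them away from zero before the next pruning step. The update below is a made-up constant step standing in for a gradient update, and `soft_prune` is a hypothetical name.

```python
import math

def soft_prune(filters, prune_rate):
    """Zero the prune_rate fraction of filters with the smallest L2 norm, in place."""
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    n_prune = int(len(filters) * prune_rate)
    for idx in sorted(range(len(filters)), key=lambda i: norms[i])[:n_prune]:
        filters[idx] = [0.0] * len(filters[idx])
    return filters

filters = [[0.9, -0.4], [0.05, 0.02], [0.6, 0.7], [-0.1, 0.03]]
soft_prune(filters, prune_rate=0.5)
zeroed_after_prune = [i for i, f in enumerate(filters) if not any(f)]

# Made-up training step: every weight, including zeroed ones, receives an update.
filters = [[w - 0.01 for w in f] for f in filters]
zeroed_after_update = [i for i, f in enumerate(filters) if not any(f)]

print(zeroed_after_prune, zeroed_after_update)
```

Whether a zeroed filter actually receives a non-zero gradient in a real network depends on what follows it (for example the BN bias and the activation), which is the point the question raises.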