Comments (12)
Yes, I think initialization is important. You can take a look at "Understanding the difficulty of training deep feedforward neural networks".
From your results, it might be safe to conclude that our setting is different from "random initialization".
By the way, our latest result achieves 93.45% with some sort of new initialization.
from filter-pruning-geometric-median.
I believe I misunderstood your code but could not find the right answer
Q1: Applying the mask before training.
A1: This operation keeps the code compatible with both pruning pretrained models and training models from scratch.
Q2: Calling do_mask after the training process.
A2: The pruned filters do not change in this setting. I adopted the code from my previous work (soft filter pruning: https://github.com/he-y/soft-filter-pruning), and this operation is kept for compatibility with soft filter pruning.
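To make the point about unchanged filters concrete, here is a minimal sketch in plain Python. It assumes, as the follow-up question in this thread describes, that masked filters are zeroed and that their gradients are zeroed too. The bodies of `init_mask` and `do_mask` below are hypothetical stand-ins that borrow the names used in this thread (they are not the repository's code), and the smallest-L2-norm rule is only a placeholder criterion.

```python
import math
import random


def l2_norm(f):
    return math.sqrt(sum(w * w for w in f))


def init_mask(filters, prune_rate):
    """Hypothetical stand-in: mask out the prune_rate fraction with the smallest L2 norm."""
    n_prune = int(len(filters) * prune_rate)
    order = sorted(range(len(filters)), key=lambda i: l2_norm(filters[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else 1.0 for i in range(len(filters))]


def do_mask(rows, mask):
    """Multiply every row (filter weights or gradients) by its 0/1 mask entry."""
    return [[w * m for w in row] for row, m in zip(rows, mask)]


random.seed(0)
filters = [[random.gauss(0, 1) for _ in range(9)] for _ in range(8)]

mask = init_mask(filters, 0.5)
filters = do_mask(filters, mask)          # mask before training

for _ in range(5):                        # stand-in for training epochs
    grads = [[random.gauss(0, 1) for _ in range(9)] for _ in range(8)]
    grads = do_mask(grads, mask)          # assumption: masked filters get no gradient
    filters = [[w - 0.1 * g for w, g in zip(f, gr)] for f, gr in zip(filters, grads)]
    filters = do_mask(filters, mask)      # do_mask after the training step

new_mask = init_mask(filters, 0.5)        # re-selecting picks the zero filters again
print(new_mask == mask)
```

Under these assumptions the zeroed filters keep a norm of exactly zero, so any re-selection returns the same mask, which is why the repeated calls have no effect on the result.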
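I'm very sorry, I may not have described my question clearly. What is the difference between training from scratch in your work and directly building a small model? I can't work it out: before training starts, you set some of the filters to zero, and their gradients are never updated, so these filters stay zero forever. Whether you use the "small norm is unimportant" principle or the geometric-median principle you propose, the filters selected again are always these same filters, and they are always zero. Doesn't that make your algorithm meaningless in the train-from-scratch setting? Whatever the criterion, it selects the same batch of filters that were zeroed before training and never change, so isn't this equivalent to building a small model and training it from scratch?
I look forward to your reply. Thank you very much!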
If the selected filters never change, why is init_mask called again after every training epoch? Am I misunderstanding something?
For your first question, they are different.
Put simply, the difference is between random initialization and "biased" random initialization.
The initialization code is in filter-pruning-geometric-median/models/resnet.py, lines 71 to 81 in 44030b7.
If you build a small model, the filters of your small model follow a normal distribution. If you build a large model and use some pruning criterion to remove filters, the remaining filters would NOT follow a normal distribution.
You can take a look at "The Lottery Ticket Hypothesis" (ICLR best paper) to see how important the initialization is.
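The distribution argument above can be checked with a small experiment. The sketch below is only an illustration under stated assumptions: filters are drawn from a standard normal (the repository's actual initializer may differ), the layer sizes are arbitrary, and "drop the half with the smallest L2 norm" stands in for whatever pruning criterion is used.

```python
import math
import random


def l2_norm(f):
    return math.sqrt(sum(w * w for w in f))


def random_filters(n, size, rng):
    """Draw n filters of `size` weights each from a standard normal (assumed init)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(n)]


def mean_norm(filters):
    return sum(l2_norm(f) for f in filters) / len(filters)


rng = random.Random(0)

large = random_filters(1000, 9, rng)
kept = sorted(large, key=l2_norm)[500:]   # large model after dropping the 500 smallest-norm filters
small = random_filters(500, 9, rng)       # small model built directly with 500 filters

kept_mean = mean_norm(kept)
small_mean = mean_norm(small)
print(f"mean norm, pruned large model: {kept_mean:.3f}")
print(f"mean norm, small model:        {small_mean:.3f}")
```

Both layers end up with 500 filters, but the survivors of the large layer are a selected subset, so their norms are shifted relative to a freshly initialized small layer. This is one simple way to see what "biased" initialization means here.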
For your second question, you can delete this code, as it has no influence on the results.
I keep it because my former project (soft filter pruning) needs it.
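Thank you very much for your patient reply. I understand now, but I'm sorry, I still have two questions:
1. If the goal is only to obtain a biased initialization, what is the difference between using "GM center" and "less norm is unimportant"? Both remove filters that were randomly initialized, and at that point the filters do not seem to carry any semantic information, so emphasizing the role of the "GM center" feels very strange.
2. When pruning a pre-trained model, you prune the prune-rate fraction of filters in one shot and never change them. This differs somewhat from Algorithm 1 in your paper, which says to re-select N_{i+1} * P filters at the end of every epoch; if nothing changes, there is no need to search again. Would the results be better if the soft method were used so that the selection changes each time, or if the pruning rate were increased gradually so that only part of the filters is selected each time? Thank you very much for your patience!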
1. "The Lottery Ticket Hypothesis" holds that a small portion of the original random weights could be the winning ticket. The question is how to find them. Randomly removing weights is not a good choice. Maybe you can run a simple experiment to see the difference.
2. Algorithm 1 in the paper is a general framework that covers both your suggested settings and my experimental setting.
2.1. FPGM + SFP: please take a look at this reply.
2.2. Increasing pruning rate: please take a look at my extended journal paper for SFP, "Asymptotic Soft Filter Pruning for Deep Convolutional Neural Networks".
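For the simple experiment suggested in point 1, the three selection rules discussed in this thread could be compared with something like the sketch below. It is a plain-Python illustration, not the repository's implementation: the function names are hypothetical, and the geometric-median rule is approximated by assuming it picks the filters with the smallest summed Euclidean distance to all other filters in the layer.

```python
import math
import random


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_by_norm(filters, n_prune):
    """Indices of the n_prune filters with the smallest L2 norm."""
    zero = [0.0] * len(filters[0])
    return sorted(range(len(filters)), key=lambda i: dist(filters[i], zero))[:n_prune]


def select_by_gm(filters, n_prune):
    """Indices of the n_prune filters with the smallest total distance to all filters
    (assumed approximation of 'closest to the geometric median')."""
    total = [sum(dist(f, g) for g in filters) for f in filters]
    return sorted(range(len(filters)), key=lambda i: total[i])[:n_prune]


def select_random(filters, n_prune, rng):
    """Indices of n_prune filters chosen uniformly at random."""
    return sorted(rng.sample(range(len(filters)), n_prune))


# Toy layer: three nearby filters, one tiny filter, one distant filter.
toy = [[10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [0.1, 0.1], [20.0, 0.0]]

norm_choice = select_by_norm(toy, 1)
gm_choice = select_by_gm(toy, 1)
random_choice = select_random(toy, 1, random.Random(0))
print(norm_choice, gm_choice, random_choice)
```

On this toy layer the norm rule removes the tiny filter, while the distance-based rule removes a filter from the tight cluster, that is, one the other filters can most easily replace. Plugging each rule into the same training run is one way to compare them.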
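Hello, thank you very much for your earlier patient replies; they helped me a lot. I ran your code but replaced the GM part with random selection of 40% of the channels for pruning. When pruning a pre-trained model, the effect is still quite noticeable. For training from scratch, however, I got 93.10 with ResNet-56, which is higher than the mean result in your paper. Do you think biased initialization is really useful, especially given that the "Rethinking" paper argues against the lottery ticket hypothesis?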
I believe I misunderstood your code but could not find the right answer
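Have you solved this problem? I have a similar question: once a filter has been set to 0, will it still be updated afterwards? I look forward to your reply.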
@Yejing-Lai FPGM does not update the filters. SFP will update the filters.
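The contrast in this answer can be sketched as a toggle between two update rules. This is a hedged illustration only: `run` and `smallest_norm_mask` are hypothetical helpers, random numbers stand in for real gradients, and the smallest-norm rule is a placeholder for either method's actual criterion. With `soft=False`, zeroed filters receive no update (the behaviour described here for FPGM); with `soft=True`, they keep training after being zeroed (the behaviour described for SFP).

```python
import math
import random


def smallest_norm_mask(filters, n_prune):
    """True for kept filters, False for the n_prune filters with the smallest L2 norm."""
    order = sorted(range(len(filters)),
                   key=lambda i: math.sqrt(sum(w * w for w in filters[i])))
    pruned = set(order[:n_prune])
    return [i not in pruned for i in range(len(filters))]


def run(soft, epochs=3, seed=0):
    rng = random.Random(seed)
    filters = [[rng.gauss(0, 1) for _ in range(9)] for _ in range(8)]
    keep = smallest_norm_mask(filters, 4)
    history = []
    for _ in range(epochs):
        # Zero the currently selected filters.
        filters = [f if k else [0.0] * len(f) for f, k in zip(filters, keep)]
        # One stand-in training step; pruned filters are updated only in soft mode.
        for i, f in enumerate(filters):
            if soft or keep[i]:
                filters[i] = [w - 0.1 * rng.gauss(0, 1) for w in f]
        revived = any(any(w != 0.0 for w in f)
                      for f, k in zip(filters, keep) if not k)
        history.append((tuple(keep), revived))
        keep = smallest_norm_mask(filters, 4)   # re-select at the end of the epoch
    return history


hard_history = run(soft=False)
soft_history = run(soft=True)
print([revived for _, revived in hard_history])
print([revived for _, revived in soft_history])
```

In the hard variant the zeroed filters never move, so the re-selected mask is identical every epoch. In the soft variant the zeroed filters become non-zero again during training, so a later re-selection is free to pick a different set.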