Comments (12)

he-y commented on June 6, 2024

Yes, I think initialization is important. You can take a look at "Understanding the difficulty of training deep feedforward neural networks".
From your results, it might be safe to conclude that our setting is different from "random initialization".
By the way, our latest result achieves 93.45% with some sort of new initialization.

from filter-pruning-geometric-median.

leoozy commented on June 6, 2024

I believe I have misunderstood your code, but I could not find the right answer.

he-y commented on June 6, 2024

Q1: Masking before training.
A1: This operation keeps the code compatible with both pruning pretrained models and pruning models trained from scratch.

Q2: Calling do_mask after the training process.
A2: The pruned filters will not change in this setting. I adopted the code from my previous work (soft filter pruning: https://github.com/he-y/soft-filter-pruning). This operation is kept for compatibility with soft filter pruning.

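leoozy commented on June 6, 2024

I am very sorry, I may not have described my question clearly. In your work, what is the difference between train_from_scratch and directly building a small model when I construct the network? I cannot quite figure it out: before training starts, you set some of the filters to zero, and their gradients are not updated, so these filters stay zero forever. Whether one uses the "small norm is unimportant" criterion or the geometric median criterion you proposed, the filters that get re-selected are always the same ones, and they are always zero. Doesn't that make your algorithm meaningless in the train_from_scratch setting? Whatever the criterion, it selects the same set of filters that were zeroed before training and never change. Isn't this equivalent to building a small model and training it from scratch?
I look forward to your reply. Thank you very much!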

leoozy commented on June 6, 2024

If the pruned filters never change, why is init_mask called again after every training epoch? Have I misunderstood something?

he-y commented on June 6, 2024

For your first question: they are different.
Put simply, the difference is between random initialization and "biased" random initialization.

The initialization code is as follows:

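# Excerpt from the model's __init__; needs: import math, import torch.nn as nn, from torch.nn import init
for m in self.modules():
    if isinstance(m, nn.Conv2d):
        # Normal init with std sqrt(2 / n), where n = k_h * k_w * out_channels
        n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
        m.weight.data.normal_(0, math.sqrt(2. / n))
        # m.bias.data.zero_()
    elif isinstance(m, nn.BatchNorm2d):
        m.weight.data.fill_(1)
        m.bias.data.zero_()
    elif isinstance(m, nn.Linear):
        # kaiming_normal_ is the current name of the deprecated kaiming_normal
        init.kaiming_normal_(m.weight)
        m.bias.data.zero_()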

If you build a small model, its filters follow the normal distribution used at initialization. If you build a large model and use a pruning criterion to remove filters, the remaining filters would NOT follow that normal distribution.
You can take a look at "The Lottery Ticket Hypothesis" (ICLR best paper) to see how important the initialization is.
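
A minimal sketch of what "biased" means here, in plain Python so it runs without PyTorch. It draws filters with the same std as the conv branch of the initialization code, then keeps a subset. The helper names (`init_filters`, `keep_largest_norm`), the layer sizes, and the choice of a norm-based criterion are assumptions made for illustration only; they are not taken from the repository.

```python
import math
import random


def init_filters(out_channels, in_channels, k, rng):
    """Draw every weight from N(0, 2/n) with n = k * k * out_channels."""
    std = math.sqrt(2.0 / (k * k * out_channels))
    dim = in_channels * k * k  # number of weights in one filter
    return [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(out_channels)]


def l2_norm(f):
    return math.sqrt(sum(w * w for w in f))


def mean_norm(filters):
    return sum(l2_norm(f) for f in filters) / len(filters)


def keep_largest_norm(filters, keep_ratio):
    """Illustrative criterion only: keep the filters with the largest L2 norm."""
    num_keep = int(len(filters) * keep_ratio)
    return sorted(filters, key=l2_norm, reverse=True)[:num_keep]


rng = random.Random(0)
large = init_filters(out_channels=64, in_channels=16, k=3, rng=rng)
kept = keep_largest_norm(large, keep_ratio=0.6)

print(f"all filters : mean norm {mean_norm(large):.4f}")
print(f"kept filters: mean norm {mean_norm(kept):.4f}")
```

Because the kept filters are chosen by a rule rather than drawn afresh, their statistics are shifted relative to the full random draw (here, the mean norm of the kept set can never be lower than that of the full set). A freshly built small model has no such selection step.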

he-y commented on June 6, 2024

For your second question: you can delete that code, as it has no influence on the results.
I keep it because my former project (soft filter pruning) needs it.

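leoozy commented on June 6, 2024

Thank you very much for your patient reply. I understand now, but I am sorry to say I still have two questions:

  1. If the only purpose is to obtain a "biased" initialization, is there any difference between using "GM center" and "less norm is unimportant"? Both criteria prune randomly initialized filters, and at that point the filters carry no semantic information, so emphasizing the role of the "GM center" seems very strange.

  2. When pruning a pre-trained model, you prune the prune-rate fraction of filters in one shot, and the selection never changes. This differs somewhat from Algorithm 1 in your paper, which re-selects N_i+1 * P filters at the end of every epoch; if the selection does not change, there is no need to search again. Would the results be better if the selection changed every time, as in the soft method, or if the pruning rate increased gradually so that only part of the filters is selected each time? Thank you very much for your patience!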

he-y commented on June 6, 2024

  1. "The Lottery Ticket Hypothesis" argues that a small portion of the original random weights could be the winning ticket. The question is how to find them. Removing filters at random is not a good choice. You could run a simple experiment to see the difference.

  2. Algorithm 1 in the paper is a general framework that covers both your suggested settings and my experimental setting.

2.1. FPGM + SFP.
Please take a look at this reply.

2.2. Increasing pruning rate.
Please take a look at my extended journal paper for SFP: Asymptotic Soft Filter Pruning for Deep Convolutional Neural Networks.
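
For the "simple experiment" suggested in point 1, the three selection rules discussed in this thread can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the repository's code: the function names are hypothetical, each filter is flattened to a short list of floats, and the geometric median rule is approximated by picking the filters with the smallest summed Euclidean distance to all other filters, which is my reading of the FPGM criterion.

```python
import math
import random


def l2_norm(f):
    return math.sqrt(sum(w * w for w in f))


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_random(filters, num_prune, rng):
    """Baseline: prune indices chosen uniformly at random."""
    return sorted(rng.sample(range(len(filters)), num_prune))


def select_smallest_norm(filters, num_prune):
    """'Less norm is unimportant': prune the filters with the smallest L2 norm."""
    order = sorted(range(len(filters)), key=lambda i: l2_norm(filters[i]))
    return sorted(order[:num_prune])


def select_near_geometric_median(filters, num_prune):
    """Prune the filters whose summed distance to all other filters is smallest,
    i.e. the ones that the remaining filters can most easily stand in for."""
    totals = [sum(distance(f, g) for g in filters) for f in filters]
    order = sorted(range(len(filters)), key=lambda i: totals[i])
    return sorted(order[:num_prune])


# Five toy "filters": index 0 has a tiny norm, indices 1 to 3 are near-duplicates.
filters = [
    [0.1, 0.0],
    [2.0, 2.0],
    [2.1, 1.9],
    [2.0, 2.1],
    [-3.0, 3.0],
]

print("random:", select_random(filters, 1, random.Random(0)))
print("norm  :", select_smallest_norm(filters, 1))
print("GM    :", select_near_geometric_median(filters, 1))
```

On this toy input the norm rule removes the small filter at index 0, while the geometric median rule removes index 1, the most redundant of the near-duplicates. Whether that difference matters for randomly initialized filters is exactly the open question in this thread; the sketch only shows that the rules can disagree.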

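leoozy commented on June 6, 2024

Hello, thank you very much for your earlier patient replies; they helped me a lot. I ran your code, but replaced the GM part with pruning a random 40% of the channels. When pruning a pre-trained model, the effect is still quite noticeable. When training from scratch, however, my result is 93.10 with ResNet-56, which is higher than the mean result in your paper. Do you think the biased initialization is really useful, especially given that the "Rethinking" paper argues against the lottery ticket hypothesis?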

Yejing-Lai commented on June 6, 2024

"I believe I misunderstood your code but could not find the right answer"

Have you solved this problem? I have a similar question: once a filter has been set to 0, will it still be updated afterwards? Looking forward to your reply.

he-y commented on June 6, 2024

@Yejing-Lai FPGM does not update the filters. SFP will update the filters.
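
The distinction can be illustrated with a toy training loop. This is only a sketch under simplifying assumptions: each "filter" is a single float, the gradient is a fixed made-up value, selection uses smallest magnitude as a stand-in criterion, and the names `sgd_epoch`, `smallest_magnitude`, `zero_out`, and `run` are hypothetical rather than functions from either repository.

```python
def smallest_magnitude(weights, num_prune):
    """Stand-in selection rule: indices of the smallest-magnitude weights."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    return set(order[:num_prune])


def zero_out(weights, pruned):
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def sgd_epoch(weights, grads, pruned, lr=0.1, update_pruned=False):
    """One fake epoch. Pruned weights only receive updates when update_pruned is True."""
    new_weights = []
    for i, (w, g) in enumerate(zip(weights, grads)):
        if i in pruned and not update_pruned:
            new_weights.append(0.0)  # gradient masked: the weight stays zero
        else:
            new_weights.append(w - lr * g)
    return new_weights


def run(weights, grads, epochs, num_prune, soft):
    pruned = smallest_magnitude(weights, num_prune)
    weights = zero_out(weights, pruned)
    for _ in range(epochs):
        weights = sgd_epoch(weights, grads, pruned, update_pruned=soft)
        if soft:
            # Soft style: re-select and re-zero at the end of every epoch.
            pruned = smallest_magnitude(weights, num_prune)
            weights = zero_out(weights, pruned)
    return weights, pruned


start = [0.05, 1.0, -0.8, 0.3]
fake_grads = [-5.0, 0.1, 0.1, 2.0]

fixed_weights, fixed_pruned = run(start, fake_grads, epochs=2, num_prune=1, soft=False)
soft_weights, soft_pruned = run(start, fake_grads, epochs=2, num_prune=1, soft=True)

print("fixed mask:", sorted(fixed_pruned))
print("soft mask :", sorted(soft_pruned))
```

With the fixed mask, index 0 is zeroed before training and never recovers, so re-running the selection would return the same filter every time. With soft updates, index 0 grows back after the first epoch and the selection moves to index 3.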
