Comments (15)
Can you give me a link to the code of the resnet18 model you mentioned?
from network-slimming.
Here: https://github.com/666DZY666/pytorch-cifar/blob/master/models/resnet.py
Model: resnet18, i.e. ResNet(BasicBlock, [2, 2, 2, 2])
Layer:
def forward(self, x):
    out = F.relu(self.bn1(self.conv1(x)))
    out = self.layer1(out)
    out = self.layer2(out)
    out = self.layer3(out)
    out = self.layer4(out)
    out = F.avg_pool2d(out, 4)
    out = out.view(out.size(0), -1)
    out = F.dropout(out, p=0.5, training=self.training)
    out = self.linear(out)
    return out
Block:
def forward(self, x):
    out = F.relu(self.bn1(self.conv1(x)))
    out = F.relu(self.bn2(self.conv2(out)))
    out = self.bn3(self.conv3(out))
    out += self.shortcut(x)
    out = F.relu(out)
    return out
from network-slimming.
And I have another problem: I have pruned resnet164, and when I compute FLOPs and parameters I get this error:
"RuntimeError: Given groups=1, weight of size 16 14 1 1, expected input[1, 16, 44, 44] to have 14 channels, but got 16 channels instead"
My model structure:
    (bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (select): channel_selection()
    (conv1): Conv2d(14, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
It seems that channel_selection does not take effect here, while the unpruned model runs fine. Why?
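For context, the error is consistent with the channel_selection layer not being applied: after pruning, conv1 keeps only 14 input channels and relies on the preceding channel_selection to mask the 16 BN output channels down to those 14. If a tool feeds the tensor straight into conv1 without applying the mask, conv1 sees 16 channels and raises exactly this error. A rough sketch of the layer, paraphrased from the network-slimming repo (details may differ from the exact source):

    import numpy as np
    import torch
    import torch.nn as nn

    class channel_selection(nn.Module):
        def __init__(self, num_channels):
            super(channel_selection, self).__init__()
            # 0/1 mask over channels; pruning zeroes out the dropped entries.
            self.indexes = nn.Parameter(torch.ones(num_channels))

        def forward(self, x):
            # Pass through only the channels whose mask entry is non-zero,
            # e.g. 14 of the 16 BN output channels in the structure above.
            selected = np.squeeze(np.argwhere(self.indexes.data.cpu().numpy()))
            if selected.size == 1:
                selected = np.resize(selected, (1,))
            return x[:, selected, :, :]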
from network-slimming.
The models are different. They have different numbers of residual stages, so the parameter counts are not comparable.
from network-slimming.
Which pytorch version are you using? Is it torch 0.3.1?
from network-slimming.
1. What does this mean? resnet164 uses bottleneck blocks, but it also has 3 residual stages, the same as this resnet18 that uses basicblock. So why are the parameters not comparable?
2. My torch is 1.1.0, and I don't think the version is the reason, because I can prune, test, and refine the model. Isn't that right?
from network-slimming.
https://github.com/666DZY666/pytorch-cifar/blob/master/models/resnet.py#L74
Here, there are four stages in total.
from network-slimming.
"I can prune, test, refine the model"
Then what is the problem, since you can successfully prune the model?
from network-slimming.
Okay, I think I get your second question now. Do you mean you cannot compute FLOPs on resnet164? Yes, that's true: to compute FLOPs on a model with a channel selection layer, you need to write some extra code to adapt to it.
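To make that concrete: a generic counter reads Conv2d(14, 16, ...) from the module but traces a 16-channel input, because it never applies the selection mask. One way to adapt, sketched under the assumption that the layer exposes an indexes mask as above (conv_macs_after_selection is a hypothetical helper, not repo code):

    import torch

    def conv_macs_after_selection(select_layer, conv, out_h, out_w):
        # Effective input width = channels the selection mask lets through.
        c_in = int((select_layer.indexes.data != 0).sum().item())
        c_out = conv.out_channels
        k_h, k_w = conv.kernel_size
        # Standard dense-conv MAC count, using the reduced input width.
        return c_out * c_in * k_h * k_w * out_h * out_w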
from network-slimming.
1. resnet18's first stage doesn't have a shortcut conv (it uses basicblock, not bottleneck), so both models have 3 stages in total.
2. Yeah, you got it. So can you show me how you compute FLOPs and parameters? The code?
from network-slimming.
So as for the first question, I think the reason may be that channel_selection makes it slower.
from network-slimming.
Why doesn't the first stage have a shortcut? This is not clear to me. The first stage must contain a residual block, whether it is a basicblock or a bottleneck, and any residual block must contain a shortcut.
Another reason the two models are not comparable:
you can see that our implementation https://github.com/Eric-mingjie/network-slimming/blob/master/models/preresnet.py#L63 has only 256 channels in the last residual block, while the code you provided has 512 channels in the last residual block: https://github.com/666DZY666/pytorch-cifar/blob/master/models/resnet.py#L77.
I still think the resnet model you provided is not comparable to the resnet we implemented.
from network-slimming.
Sorry, we don't have code for computing FLOPs for models containing the channel selection layer. We just compute them by hand.
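For reference, the by-hand arithmetic is straightforward for a single conv. Using the pruned layer from the error message earlier in this thread as a worked example:

    # Worked example with the numbers from the error message above:
    # a 16x14x1x1 weight applied to a 44x44 feature map at stride 1.
    c_in, c_out, k, h, w = 14, 16, 1, 44, 44
    macs = c_out * c_in * k * k * h * w
    print(macs)  # 433664 MACs, i.e. about 0.43 MMac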
from network-slimming.
1. What I mean is that the first stage's block has the same input and output channels, so there is no 1x1 conv in its shortcut; compare the empty (shortcut) in layer1 with the 1x1 conv shortcut in layer2 below:
(layer1): Sequential(
    0.286 GMac, 26.026% MACs,
    (0): BasicBlock(
        0.143 GMac, 13.013% MACs,
        (conv1): Conv2d(0.071 GMac, 6.484% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(0.0 GMac, 0.023% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(0.071 GMac, 6.484% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(0.0 GMac, 0.023% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (shortcut): Sequential(0.0 GMac, 0.000% MACs, )
    )
)
(layer2): Sequential(
    (0): BasicBlock(
        (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (shortcut): Sequential(
            (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
            (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
    )
)
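The dump illustrates the point: in this pytorch-cifar style code every block has a shortcut attribute, but it is an empty Sequential (an identity, no 1x1 conv) whenever the stride is 1 and the channel counts match, which is exactly the layer1 case. A sketch of that construction logic, paraphrased from the linked resnet.py; make_shortcut is a hypothetical standalone helper, in the original it lives inside BasicBlock.__init__:

    import torch.nn as nn

    def make_shortcut(in_planes, planes, stride, expansion=1):
        shortcut = nn.Sequential()  # identity: layer1's case above
        if stride != 1 or in_planes != expansion * planes:
            # A 1x1 projection conv appears only when the spatial size
            # or channel count changes: layer2's case above.
            shortcut = nn.Sequential(
                nn.Conv2d(in_planes, expansion * planes,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(expansion * planes),
            )
        return shortcut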
Of course the two models' structures are different, but why does that make the FLOPs not comparable? Can VGG's and ResNet's FLOPs be compared?
2. Haha... I'll try it...
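As for tooling: the per-layer GMac / % MACs annotations in the dump above look like the output of a generic complexity counter such as ptflops, though the thread never names the tool, so this is an assumption. Assuming ptflops, usage is roughly:

    import torchvision.models as models
    from ptflops import get_model_complexity_info

    # Any nn.Module works here; resnet18 is just a stand-in for the
    # pruned model discussed in this thread.
    model = models.resnet18()
    macs, params = get_model_complexity_info(
        model, (3, 224, 224), as_strings=True, print_per_layer_stat=True)
    print(macs, params)

Counters like this register hooks per known module type, which is why a custom layer such as channel_selection is invisible to them until a matching hook is added.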
from network-slimming.
OK
from network-slimming.
Related Issues (20)
- Why does the pruned model use more GPU memory at runtime than the unpruned one?
- When predicting a single image, the accuracy is only about 0.0x
- Minor bugs caused by old version
- Sparse confusion
- channel_selection layer in training process
- The weights saved after pruning cannot be loaded into newmodel
- RuntimeError: Given groups=1, weight of size [15, 14, 1, 1], expected input[64, 16, 32, 32] to have 14 channels, but got 16 channels instead
- If the remaining channel count for a layer is zero, it reports a zero-division error
- What does "memory" in the original paper refer to?
- TypeError: item() takes no arguments (1 given)
- Error loading a pruned model with MNN
- After sparse training and pruning the model gets smaller, but after refine fine-tuning it gets larger again
- About L1 regularization
- A question about m.weight.grad.data.add_
- Question: the channel count becomes 0 after pruning
- RuntimeError: CUDA error: device-side assert triggered
- About other versions
- Off topic: how do I get started with model compression, and how should I start writing pruning code for my own network architecture?
- In ResNet and DenseNet, a layer's input also serves as input to multiple subsequent layers, and the BN layer sits before the conv layer; in this case sparsification is obtained at the input end of a layer, and a layer selectively takes a subset of all channels for its next convolution
- When pruning HRNet, the first bn's grad is None