Comments (15)
Can you give me a link to the code of the resnet18 model you mentioned?
from network-slimming.
Here: https://github.com/666DZY666/pytorch-cifar/blob/master/models/resnet.py
Model: resnet18, i.e. ResNet(BasicBlock, [2, 2, 2, 2])
Layer:
def forward(self, x):
    out = F.relu(self.bn1(self.conv1(x)))
    out = self.layer1(out)
    out = self.layer2(out)
    out = self.layer3(out)
    out = self.layer4(out)
    out = F.avg_pool2d(out, 4)
    out = out.view(out.size(0), -1)
    out = F.dropout(out, p=0.5, training=self.training)
    out = self.linear(out)
    return out
Block:
def forward(self, x):
    out = F.relu(self.bn1(self.conv1(x)))
    out = F.relu(self.bn2(self.conv2(out)))
    out = self.bn3(self.conv3(out))
    out += self.shortcut(x)
    out = F.relu(out)
    return out
from network-slimming.
And I have another problem: I have pruned resnet164, and when I compute FLOPs and parameters I get this error:
"RuntimeError: Given groups=1, weight of size 16 14 1 1, expected input[1, 16, 44, 44] to have 14 channels, but got 16 channels instead"
My model structure:
    (bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (select): channel_selection()
    (conv1): Conv2d(14, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
It seems that channel_selection does not take effect here, while the unpruned model runs fine. Why?
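For context, the error is consistent with the channel_selection layer not being applied: after pruning, conv1 keeps only 14 input channels and relies on the preceding channel_selection to mask the 16 BN output channels down to those 14. If a tool feeds the tensor straight into conv1 without applying the mask, conv1 sees 16 channels and raises exactly this error. A rough sketch of the layer, paraphrased from the network-slimming repo (details may differ from the exact source):

    import numpy as np
    import torch
    import torch.nn as nn

    class channel_selection(nn.Module):
        def __init__(self, num_channels):
            super(channel_selection, self).__init__()
            # 0/1 mask over channels; pruning zeroes out the dropped entries.
            self.indexes = nn.Parameter(torch.ones(num_channels))

        def forward(self, x):
            # Pass through only the channels whose mask entry is non-zero,
            # e.g. 14 of the 16 BN output channels in the structure above.
            selected = np.squeeze(np.argwhere(self.indexes.data.cpu().numpy()))
            if selected.size == 1:
                selected = np.resize(selected, (1,))
            return x[:, selected, :, :]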
from network-slimming.
The models are different. They have different numbers of residual stages, so the parameter counts are not comparable.
from network-slimming.
Which pytorch version are you using? Is it torch 0.3.1?
from network-slimming.
1. What does this mean? resnet164 uses bottleneck blocks, but it also has 3 residual stages, the same as this resnet18 that uses basicblock. So why are the parameters not comparable?
2. My torch is 1.1.0, and I don't think the version is the reason, because I can prune, test, and refine the model. Isn't that right?
from network-slimming.
https://github.com/666DZY666/pytorch-cifar/blob/master/models/resnet.py#L74
Here, there are four stages in total.
from network-slimming.
"I can prune, test, refine the model"
Then what is the problem, since you can successfully prune the model?
from network-slimming.
Okay, I think I get your second question now. Do you mean you cannot compute FLOPs on resnet164? Yes, that's true: to compute FLOPs on a model with a channel selection layer, you need to write some extra code to adapt to it.
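To make that concrete: a generic counter reads Conv2d(14, 16, ...) from the module but traces a 16-channel input, because it never applies the selection mask. One way to adapt, sketched under the assumption that the layer exposes an indexes mask as above (conv_macs_after_selection is a hypothetical helper, not repo code):

    import torch

    def conv_macs_after_selection(select_layer, conv, out_h, out_w):
        # Effective input width = channels the selection mask lets through.
        c_in = int((select_layer.indexes.data != 0).sum().item())
        c_out = conv.out_channels
        k_h, k_w = conv.kernel_size
        # Standard dense-conv MAC count, using the reduced input width.
        return c_out * c_in * k_h * k_w * out_h * out_w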
from network-slimming.
1. resnet18's first stage doesn't have a shortcut conv (it uses basicblock, not bottleneck), so both models have 3 stages in total.
2. Yeah, you got it. So can you show me how you compute FLOPs and parameters? The code?
from network-slimming.
So as for the first question, I think the reason may be that channel_selection makes it slower.
from network-slimming.
Why doesn't the first stage have a shortcut? This is not clear to me. The first stage must contain a residual block, whether it is a basicblock or a bottleneck, and any residual block must contain a shortcut.
Another reason the two models are not comparable:
you can see that our implementation https://github.com/Eric-mingjie/network-slimming/blob/master/models/preresnet.py#L63 has only 256 channels in the last residual block, while the code you provided has 512 channels in the last residual block: https://github.com/666DZY666/pytorch-cifar/blob/master/models/resnet.py#L77.
I still think the resnet model you provided is not comparable to the resnet we implemented.
from network-slimming.
Sorry, we don't have code for computing FLOPs for models containing the channel selection layer. We just compute them by hand.
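For reference, the by-hand arithmetic is straightforward for a single conv. Using the pruned layer from the error message earlier in this thread as a worked example:

    # Worked example with the numbers from the error message above:
    # a 16x14x1x1 weight applied to a 44x44 feature map at stride 1.
    c_in, c_out, k, h, w = 14, 16, 1, 44, 44
    macs = c_out * c_in * k * k * h * w
    print(macs)  # 433664 MACs, i.e. about 0.43 MMac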
from network-slimming.
1. What I mean is that the first stage's block has the same input and output channels, so there is no 1x1 conv in its shortcut; compare the empty (shortcut) in layer1 with the 1x1 conv shortcut in layer2 below:
(layer1): Sequential(
    0.286 GMac, 26.026% MACs,
    (0): BasicBlock(
        0.143 GMac, 13.013% MACs,
        (conv1): Conv2d(0.071 GMac, 6.484% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(0.0 GMac, 0.023% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(0.071 GMac, 6.484% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(0.0 GMac, 0.023% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (shortcut): Sequential(0.0 GMac, 0.000% MACs, )
    )
)
(layer2): Sequential(
    (0): BasicBlock(
        (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (shortcut): Sequential(
            (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
            (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
    )
)
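The dump illustrates the point: in this pytorch-cifar style code every block has a shortcut attribute, but it is an empty Sequential (an identity, no 1x1 conv) whenever the stride is 1 and the channel counts match, which is exactly the layer1 case. A sketch of that construction logic, paraphrased from the linked resnet.py; make_shortcut is a hypothetical standalone helper, in the original it lives inside BasicBlock.__init__:

    import torch.nn as nn

    def make_shortcut(in_planes, planes, stride, expansion=1):
        shortcut = nn.Sequential()  # identity: layer1's case above
        if stride != 1 or in_planes != expansion * planes:
            # A 1x1 projection conv appears only when the spatial size
            # or channel count changes: layer2's case above.
            shortcut = nn.Sequential(
                nn.Conv2d(in_planes, expansion * planes,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(expansion * planes),
            )
        return shortcut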
Of course the two models' structures are different, but why does that make the FLOPs not comparable? Can VGG's and ResNet's FLOPs be compared?
2. Haha... I'll try it...
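As for tooling: the per-layer GMac / % MACs annotations in the dump above look like the output of a generic complexity counter such as ptflops, though the thread never names the tool, so this is an assumption. Assuming ptflops, usage is roughly:

    import torchvision.models as models
    from ptflops import get_model_complexity_info

    # Any nn.Module works here; resnet18 is just a stand-in for the
    # pruned model discussed in this thread.
    model = models.resnet18()
    macs, params = get_model_complexity_info(
        model, (3, 224, 224), as_strings=True, print_per_layer_stat=True)
    print(macs, params)

Counters like this register hooks per known module type, which is why a custom layer such as channel_selection is invisible to them until a matching hook is added.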
from network-slimming.
OK
from network-slimming.
Related Issues (20)
- Why does the pruned model use more GPU memory at runtime than the unpruned one?
- When predicting a single image, the accuracy is only about 0.0x
- Minor bugs caused by old version
- Sparse confusion
- channel_selection layer in training process
- The weights saved after pruning cannot be loaded into newmodel
- RuntimeError: Given groups=1, weight of size [15, 14, 1, 1], expected input[64, 16, 32, 32] to have 14 channels, but got 16 channels instead
- If the remaining channel count for a layer is zero, it reports a zero-division error
- What does "memory" in the original paper refer to?
- TypeError: item() takes no arguments (1 given)
- Error loading a pruned model with MNN
- After sparse training and pruning the model gets smaller, but after refine fine-tuning it gets larger again
- About L1 regularization
- A question about m.weight.grad.data.add_
- Question: the channel count becomes 0 after pruning
- RuntimeError: CUDA error: device-side assert triggered
- About other versions
- Off topic: how do I get started with model compression, and how should I start writing pruning code for my own network architecture?
- In ResNet and DenseNet, a layer's input also serves as input to multiple subsequent layers, and the BN layer sits before the conv layer; in this case sparsification is obtained at the input end of a layer, and a layer selectively takes a subset of all channels for its next convolution
- When pruning HRNet, the first bn's grad is None