network-slimming's People

Contributors

eric-mingjie, liuzhuang13

network-slimming's Issues

Question about resnet50

Hi Eric,
Since I need to train a ResNet-50 with fewer parameters, I read the ResNet definition in your code and found that it is quite different from the official PyTorch torchvision model. Could you please tell me how I can change the code so that I can train a pruned ResNet-50?

RuntimeError: CUDA error: device-side assert triggered

/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [10,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
(the same assertion is repeated for many other threads)
Traceback (most recent call last):
  File "resprune.py", line 224, in <module>
    main()
  File "resprune.py", line 203, in main
    w1 = w1[idx1.tolist(), :, :, :].clone()
RuntimeError: CUDA error: device-side assert triggered

ZeroDivisionError: float division by zero

Hey Eric, I got this error during pruning. Can you suggest how I should handle it? Thanks.

Traceback (most recent call last):
  File "vggprune.py", line 128, in <module>
    newmodel = vgg(cfg=cfg)
  File "/home/ustc/akb/network-slimming/models/vgg.py", line 22, in __init__
    self.feature = self.make_layers(cfg, True)
  File "/home/ustc/akb/network-slimming/models/vgg.py", line 39, in make_layers
    conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1, bias=False)
  File "/home/ustc/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 297, in __init__
    False, _pair(0), groups, bias)
  File "/home/ustc/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 38, in __init__
    self.reset_parameters()
  File "/home/ustc/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 44, in reset_parameters
    stdv = 1. / math.sqrt(n)
ZeroDivisionError: float division by zero

mask-impl fine-tune

Hi, the mask-based prune_mask does not actually prune the weights. During fine-tuning, although the masked weights are used, all parameters are updated, so wouldn't the BN weights that were zeroed also get trained?

BN weight not in model.parameters()

def updateBN():

Hello, thank you for sharing the code.

I tried to reimplement your approach but found that the weights of nn.BatchNorm2d are not in model.parameters(), so the optimizer won't update them. Also, the function updateBN() doesn't work, because in m.weight.grad.data.add_(...) the weight.grad is None.

Could you share how you resolved this, or have I missed something? Thanks!
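For reference, here is a minimal sketch of the kind of L1 subgradient update used for sparsity training (the function and hyper-parameter names below are illustrative, not necessarily the repository's). Two details relevant to the question: BN affine weights appear in model.parameters() only when the layers are built with affine=True, and weight.grad stays None until loss.backward() has run, so this must be called between backward() and optimizer.step():

    import torch
    import torch.nn as nn

    def update_bn(model, s=1e-4):
        # Add the subgradient of s * sum(|gamma|) to every BN scale gradient.
        # Call after loss.backward() and before optimizer.step(); otherwise
        # m.weight.grad is still None and .add_() fails.
        for m in model.modules():
            if isinstance(m, nn.BatchNorm2d) and m.weight.grad is not None:
                m.weight.grad.data.add_(s * torch.sign(m.weight.data))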

Gamma sparsity issue

Hi, after training VGG-16 I found that all the gamma values fall between 0.1 and 0.8, which does not seem sparse.

Training with Sparsity

Can this process be done by fine-tuning a normally trained model?

I pruned InceptionV3 by fine-tuning from a pre-trained ImageNet model, and found that it becomes sparse very slowly (base_lr=0.01, 30 epochs). I wonder whether fine-tuning alone is feasible.

Also, what do you think about InceptionV3, which has many branches? I find it troublesome to implement an InceptionV3prune.py.

Sparsity coefficient for 100-class classification

Could I ask: for 10-class classification, the sparsity coefficients for VGG, ResNet, and DenseNet are 0.0001, 0.00001, and 0.00001 respectively. What should the sparsity coefficients be for 100-class classification?

Question about the ordering of BN and conv in ResNet

Hello, thank you very much for the open-source code!
I would like to ask: when designing the ResNet bottleneck, what was the consideration behind placing BN before the conv?
When I applied your ResNet pruning idea to a detection model, I used a bottleneck with BN after the conv; I am not sure how much difference that makes for pruning compared with the former.

Question about resprune

Thanks for sharing!
I have trained a preresnet-164 using python main.py -sr --s 0.00001 --dataset cifar10 --arch resnet --depth 164, and I want to prune the model with python resprune.py --dataset cifar10 --depth 164 --percent 0.4 --model [My model path] --save [My path], but I get some errors and don't know how to solve them.
Test set: Accuracy: 9262/10000 (92.6%)

Cfg:
[10, 15, 16, 24, 15, 16, 7, 9, 11, 17, 15, 15, 23, 15, 15, 8, 12, 15, 19, 15, 16, 20, 15, 15, 13, 12, 16, 21, 15, 16, 14, 15, 15, 18, 13, 16, 17, 12, 15, 13, 13, 15, 1, 4, 7, 9, 12, 16, 18, 12, 15, 25, 15, 16, 35, 32, 32, 38, 31, 32, 47, 32, 32, 44, 31, 32, 51, 30, 32, 43, 31, 32, 38, 31, 32, 43, 32, 32, 54, 32, 32, 70, 32, 32, 51, 32, 32, 52, 32, 32, 51, 31, 32, 52, 32, 32, 52, 30, 32, 57, 32, 32, 51, 30, 32, 61, 31, 32, 125, 64, 64, 59, 60, 64, 68, 60, 64, 66, 63, 64, 78, 62, 64, 89, 64, 64, 106, 63, 64, 119, 64, 64, 123, 64, 64, 149, 64, 64, 135, 64, 64, 144, 64, 63, 127, 64, 63, 139, 64, 63, 144, 64, 64, 135, 64, 64, 141, 63, 64, 141, 64, 63, 85]
Traceback (most recent call last):
  File "resprune.py", line 181, in <module>
    m1.weight.data = m0.weight.data.clone()
  File "/home/ubuntu/anaconda3/envs/YOLACT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 585, in __getattr__
    type(self).__name__, name))
AttributeError: 'Sequential' object has no attribute 'weight'

Can someone give me some suggestions?
Thanks very much!

DenseNet network structure

Could I ask: in the densenet code, why aren't the feature maps output by each denselayer within a denseblock used as input to the other denselayers in the same block?

def _make_denseblock(self, block, blocks, cfg):
        layers = []
        assert blocks == len(cfg), 'Length of the cfg parameter is not right.'
        for i in range(blocks):
            # Currently we fix the expansion ratio as the default value
            layers.append(block(self.inplanes, cfg=cfg[i], growthRate=self.growthRate, dropRate=self.dropRate))
            self.inplanes += self.growthRate

        return nn.Sequential(*layers)

For comparison, this is the denseblock in torchvision:

class _DenseBlock(nn.ModuleDict):
    _version = 2

    def __init__(self, num_layers, num_input_features, bn_size, growth_rate, drop_rate, memory_efficient=False):
        super(_DenseBlock, self).__init__()
        for i in range(num_layers):
            layer = _DenseLayer(
                num_input_features + i * growth_rate,
                growth_rate=growth_rate,
                bn_size=bn_size,
                drop_rate=drop_rate,
                memory_efficient=memory_efficient,
            )
            self.add_module('denselayer%d' % (i + 1), layer)

    def forward(self, init_features):
        features = [init_features]
        for name, layer in self.items():
            new_features = layer(features)
            features.append(new_features)
        return torch.cat(features, 1)

RuntimeError: Given weight of size [9, 3, 3, 3], expected bias to be 1-dimensional with 9 elements, but got bias of size [128] instead

@Eric-mingjie: First of all, thank you very much. When I try to prune my architecture (SimpNet), halfway through the pruning it crashes with this error:
RuntimeError: Given weight of size [9, 3, 3, 3], expected bias to be 1-dimensional with 9 elements, but got bias of size [128] instead

What is wrong here? Can you kindly assist me in resolving this issue?
Here is the whole log:

=> loading checkpoint 'model_best_simpnet8.pth.tar'
=> loaded checkpoint 'model_best_simpnet8.pth.tar' (epoch 101) Prec1: 96.120000
simpnet8m(
  (features): Sequential(
    (0): Conv2d(3, 128, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(128, eps=1e-05, momentum=0.05, affine=True)
    (2): ReLU(inplace)
    (3): Dropout2d(p=0.02)
    (4): Conv2d(128, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (5): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (6): ReLU(inplace)
    (7): Dropout2d(p=0.05)
    (8): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (9): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (10): ReLU(inplace)
    (11): Dropout2d(p=0.05)
    (12): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (13): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (14): ReLU(inplace)
    (15): Dropout2d(p=0.05)
    (16): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (17): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (18): ReLU(inplace)
    (19): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
    (20): Dropout2d(p=0.05)
    (21): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (22): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (23): ReLU(inplace)
    (24): Dropout2d(p=0.05)
    (25): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (26): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (27): ReLU(inplace)
    (28): Dropout2d(p=0.05)
    (29): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (30): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (31): ReLU(inplace)
    (32): Dropout2d(p=0.05)
    (33): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (34): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (35): ReLU(inplace)
    (36): Dropout2d(p=0.05)
    (37): Conv2d(182, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (38): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
    (39): ReLU(inplace)
    (40): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
    (41): Dropout2d(p=0.1)
    (42): Conv2d(430, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (43): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
    (44): ReLU(inplace)
    (45): Dropout2d(p=0.1)
    (46): Conv2d(430, 455, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (47): BatchNorm2d(455, eps=1e-05, momentum=0.05, affine=True)
    (48): ReLU(inplace)
    (49): Dropout2d(p=0.1)
    (50): Conv2d(455, 600, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (51): BatchNorm2d(600, eps=1e-05, momentum=0.05, affine=True)
    (52): ReLU(inplace)
  )
  (classifier): Linear(in_features=600, out_features=10, bias=True)
)
layer index: 3 	 total channel: 128 	 remaining channel: 9
layer index: 7 	 total channel: 182 	 remaining channel: 17
layer index: 11 	 total channel: 182 	 remaining channel: 44
layer index: 15 	 total channel: 182 	 remaining channel: 34
layer index: 19 	 total channel: 182 	 remaining channel: 89
layer index: 24 	 total channel: 182 	 remaining channel: 128
layer index: 28 	 total channel: 182 	 remaining channel: 102
layer index: 32 	 total channel: 182 	 remaining channel: 86
layer index: 36 	 total channel: 182 	 remaining channel: 95
layer index: 40 	 total channel: 430 	 remaining channel: 9
layer index: 45 	 total channel: 430 	 remaining channel: 89
layer index: 49 	 total channel: 455 	 remaining channel: 8
layer index: 53 	 total channel: 600 	 remaining channel: 339
Pre-processing Successful!
Files already downloaded and verified

Test set: Accuracy: 1000/10000 (10.0%)

[9, 17, 44, 34, 89, 'M', 128, 102, 86, 95, 9, 'M', 89, 8, 339]
In shape: 3, Out shape 9.
In shape: 9, Out shape 17.
In shape: 17, Out shape 44.
In shape: 44, Out shape 34.
In shape: 34, Out shape 89.
In shape: 89, Out shape 128.
In shape: 128, Out shape 102.
In shape: 102, Out shape 86.
In shape: 86, Out shape 95.
In shape: 95, Out shape 9.
In shape: 9, Out shape 89.
In shape: 89, Out shape 8.
In shape: 8, Out shape 339.
simpnet8m(
  (features): Sequential(
    (0): Conv2d(3, 128, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(128, eps=1e-05, momentum=0.05, affine=True)
    (2): ReLU(inplace)
    (3): Dropout2d(p=0.02)
    (4): Conv2d(128, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (5): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (6): ReLU(inplace)
    (7): Dropout2d(p=0.05)
    (8): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (9): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (10): ReLU(inplace)
    (11): Dropout2d(p=0.05)
    (12): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (13): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (14): ReLU(inplace)
    (15): Dropout2d(p=0.05)
    (16): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (17): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (18): ReLU(inplace)
    (19): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
    (20): Dropout2d(p=0.05)
    (21): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (22): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (23): ReLU(inplace)
    (24): Dropout2d(p=0.05)
    (25): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (26): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (27): ReLU(inplace)
    (28): Dropout2d(p=0.05)
    (29): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (30): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (31): ReLU(inplace)
    (32): Dropout2d(p=0.05)
    (33): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (34): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (35): ReLU(inplace)
    (36): Dropout2d(p=0.05)
    (37): Conv2d(182, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (38): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
    (39): ReLU(inplace)
    (40): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
    (41): Dropout2d(p=0.1)
    (42): Conv2d(430, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (43): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
    (44): ReLU(inplace)
    (45): Dropout2d(p=0.1)
    (46): Conv2d(430, 455, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (47): BatchNorm2d(455, eps=1e-05, momentum=0.05, affine=True)
    (48): ReLU(inplace)
    (49): Dropout2d(p=0.1)
    (50): Conv2d(455, 600, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (51): BatchNorm2d(600, eps=1e-05, momentum=0.05, affine=True)
    (52): ReLU(inplace)
  )
  (classifier): Linear(in_features=600, out_features=10, bias=True)
)
Files already downloaded and verified
Traceback (most recent call last):
  File "vggprune.py", line 213, in <module>
    test(model)
  File "vggprune.py", line 150, in test
    output = model(data)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/hossein/tmpstore/Network_slimming/network-slimming/models/simpnet8m.py", line 49, in forward
    out = self.features(x)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 282, in forward
    self.padding, self.dilation, self.groups)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/functional.py", line 90, in conv2d
    return f(input, weight, bias)
RuntimeError: Given weight of size [9, 3, 3, 3], expected bias to be 1-dimensional with 9 elements, but got bias of size [128] instead

Thanks a lot

A small question about sparsity training

The paper applies L1 sparsity to the scaling factors in BN. Looking at the code, it seems different from the usual way L1 regularization is added elsewhere; in torch, weight_decay applies L2 regularization to all parameters by default.
[screenshot in original issue]
How is L1 regularization reflected here?

How pruning the last conv layer affects the first linear layer of the classifier

I trained the VGG model and saved it as a .pth file, then loaded it to prune some of its filters.
After pruning, the last conv layer no longer has 512 output channels; some filters are gone.
How does pruning the last conv layer affect the first linear layer of the classifier, which has shape (512 * 7 * 7, 4096)?
How can I prune the input weights of the classifier according to the last conv layer?
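For what it's worth, one way to handle this (a sketch under the assumption of an ImageNet-style VGG whose first Linear expects 512 * 7 * 7 inputs; the names prune_first_fc, old_fc, and kept are mine, not the repository's): after flattening, each surviving channel of the last conv layer owns a contiguous block of 7 * 7 columns in the Linear weight, so those blocks are the ones to keep.

    import torch

    def prune_first_fc(old_fc, kept, spatial=7 * 7):
        # kept: indices of the surviving output channels of the last conv layer.
        # flatten() lays the activations out channel by channel, so each kept
        # channel keeps its whole block of `spatial` input columns.
        cols = torch.cat([torch.arange(c * spatial, (c + 1) * spatial) for c in kept])
        new_fc = torch.nn.Linear(len(kept) * spatial, old_fc.out_features,
                                 bias=old_fc.bias is not None)
        new_fc.weight.data = old_fc.weight.data[:, cols].clone()
        if old_fc.bias is not None:
            new_fc.bias.data = old_fc.bias.data.clone()
        return new_fc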

python vggprune.py

When I run vggprune.py, I get an error: 'unexpected key "module.feature.0.weight" in state_dict'.
I think the model's parameter names are wrong. What should I do?

About adjusting the structure after pruning

Hi, thank you very much for your work. I have a question: for network slimming, is it feasible to readjust the structure after pruning to make the model even more lightweight?

Question about the weight in nn.Linear

elif isinstance(m0, nn.Linear):
    idx0 = np.squeeze(np.argwhere(np.asarray(start_mask.cpu().numpy())))
    if idx0.size == 1:
        idx0 = np.resize(idx0, (1,))
    m1.weight.data = m0.weight.data[:, idx0].clone()
    m1.bias.data = m0.bias.data.clone()

I think there is something wrong with the weights in nn.Linear. nn.Linear is applied after flattening the output of the last conv2d, so I don't think the index should simply be idx0; the shape of m0.weight relates to the flattened output shape of the last conv2d.

Confusion about the sparsity-training loss function in the paper

Hello, I would like to ask a question. I think the loss function given in the paper "Learning Efficient Convolutional Networks Through Network Slimming" applies to the BN layers that need pruning, while the loss at the end of the network is still the classic YOLOv3 loss; can it be understood this way? From the code, the final loss is still the classic YOLOv3 loss value, without the L1 regularization term. If the loss at the end of the network were the formula from the paper, then the backward gradients of every layer should also include the gradient of the L1 regularizer.

Looking forward to your reply. Many thanks.

Why not stop gradient for channel_selection layer's parameters?

class channel_selection(nn.Module):
    def __init__(self, num_channels):
        super(channel_selection, self).__init__()
        self.indexes = nn.Parameter(torch.ones(num_channels))

Should this be:
self.indexes = nn.Parameter(torch.ones(num_channels), requires_grad=False)?
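For context, here is a minimal sketch (not the repository's exact code) of how such a layer can behave when indexes is treated as a 0/1 channel mask. Since the mask is only ever written by the pruning script and is never meant to be learned, building it with requires_grad=False, or simply keeping it out of the optimizer, should give the same result:

    import torch
    import torch.nn as nn

    class ChannelSelection(nn.Module):
        # indexes is a 0/1 mask over channels: all ones before pruning
        # (identity), zeros for the channels the pruning script removes.
        def __init__(self, num_channels):
            super().__init__()
            self.indexes = nn.Parameter(torch.ones(num_channels), requires_grad=False)

        def forward(self, x):
            kept = self.indexes.nonzero(as_tuple=True)[0]
            return x[:, kept, :, :]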

How to do iterative pruning

Is the iterative pruning mentioned in the paper done by running your code once and then repeating the train-with-sparsity and prune steps on the fine-tuned model? When I actually tried this, I found that pruning the once-pruned model again removes the same channels as the first pruning and does not remove any more channels. What could be the reason?

ResNet model

Hello: I would like to ask whether a pruned ResNet model is available. Could you provide one? Thanks a lot!

How can I finetune pruned models?

Hi @Eric-mingjie,

Thanks for your great implementation. I want to finetune my pruned network, but with your main.py the error shows that there is no --refine flag. Could you show me how to use it?

Thanks,
Hai

Training results differ significantly from the reported ones

Hello:
I used your code to train VGG-16/19 and ResNet-56/164 on the CIFAR-10 and CIFAR-100 datasets, but both the baseline and the train-with-sparsity results have a large gap from the results you report. The details are below; could you tell me what might cause this?
[screenshot in original issue]

vgg_prune.py

Hey Eric, I got this error during pruning. Can you suggest how I should handle it? Thanks.

Traceback (most recent call last):
  File "vggprune.py", line 73, in <module>
    mask = weight_copy.abs().gt(thre).float()
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #2 'other'
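The message says the gt comparison mixes a CUDA tensor with a CPU tensor. A hedged fix (variable names follow the traceback; the exact code in vggprune.py may differ) is to put both operands on the same device:

    # If the threshold `thre` was computed on the CPU while weight_copy is on the GPU:
    mask = weight_copy.abs().gt(thre.to(weight_copy.device)).float()
    # or, equivalently, do the comparison on the CPU:
    # mask = weight_copy.abs().cpu().gt(thre).float()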

Why is testing the resnet164 model slow?

Why is testing the resnet164 model slow? Its FLOPs and parameter count are smaller than ResNet-18 (the official PyTorch model, for CIFAR-10), but it runs slower.

Is it because of channel_selection?

How to count the parameters of the pruned network

Hello!
I would like to ask: for the pruned network, what method do you use to count its number of parameters? Because, as I understand it, the code actually just sets some unimportant weights to zero?
Thanks!
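One simple way to count both quantities (my own sketch, not the authors' script) is to compare the total parameter count with the non-zero count: in a mask-based implementation only the non-zero count shrinks, because pruned weights are merely zeroed, while in the rebuilt, truly pruned model both counts shrink.

    import torch

    def count_parameters(model):
        # Total vs. non-zero parameters; the gap is what masking zeroed out.
        total = sum(p.numel() for p in model.parameters())
        nonzero = sum(int((p != 0).sum()) for p in model.parameters())
        return total, nonzero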

A small question about pruning

Hi, thanks for the reimplementation. A small question: in your mask-based implementation, you mask the BN entries below the threshold but never actually cut those connections. Doesn't that mean the final model's weights do not actually get smaller? Looking forward to your answer.

vgg_prune.py

Hey Eric, I got this error during pruning. Can you suggest how I should handle it? Thanks.

Traceback (most recent call last):
  File "vggprune.py", line 123, in <module>
    newmodel = vgg(dataset=args.dataset, cfg=cfg)
  File "/data/hzm/network-slimming-master/models/vgg.py", line 22, in __init__
    self.feature = self.make_layers(cfg, True)
  File "/data/hzm/network-slimming-master/models/vgg.py", line 39, in make_layers
    conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1, bias=False)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 315, in __init__
    False, _pair(0), groups, bias)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 43, in __init__
    self.reset_parameters()
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 47, in reset_parameters
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/init.py", line 288, in kaiming_uniform_
    fan = _calculate_correct_fan(tensor, mode)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/init.py", line 257, in _calculate_correct_fan
    fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/init.py", line 191, in _calculate_fan_in_and_fan_out
    receptive_field_size = tensor[0][0].numel()
IndexError: index 0 is out of bounds for dimension 0 with size 0

Problem encountered when running the pruning script vggprune.py

Traceback (most recent call last):
  File "vggprune.py", line 73, in <module>
    mask = weight_copy.gt(thre).float().cuda()
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.FloatTensor for argument #2 'other'
My environment is Python 3.6 and torch 0.4.1.

How should I measure inference time?

start_time = time.time()
output = model(data)
end_time = time.time()
total_time += (end_time - start_time)
Why, when I measure it this way, is there no difference in inference speed between the pruned and unpruned models, and no obvious improvement?
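One likely explanation: CUDA kernels launch asynchronously, so timing model(data) with time.time() and no synchronization mostly measures launch overhead; also, small CIFAR models at small batch sizes are often launch-bound, so channel pruning may not show up in wall-clock time. A hedged sketch of a more careful measurement:

    import time
    import torch

    def timed_inference(model, data, warmup=10, iters=100):
        model.eval()
        with torch.no_grad():
            for _ in range(warmup):          # warm-up runs are not timed
                model(data)
            torch.cuda.synchronize()         # flush queued kernels before starting the clock
            start = time.time()
            for _ in range(iters):
                model(data)
            torch.cuda.synchronize()         # wait for the timed kernels to finish
        return (time.time() - start) / iters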

A bit confused about the sparsity operation

May I ask why the L1 gradient is applied to the BN scale only in the backward pass, while it never appears in the loss as in the formula given in the paper?
I don't quite understand this; could you clarify? Thanks!

About the resprune.py

if conv_count % 3 != 1:
    w1 = w1[idx1.tolist(), :, :, :].clone()  # ???

I have a question about the code pasted above: why isn't it the following instead?
w1 = w1[:, idx1.tolist(), :, :].clone()

invalid syntax

main.py, line 150:
def test():
SyntaxError: invalid syntax

A question about the channel selection layer

Hello, may I ask why we need the channel selection layer to help prune ResNet and DenseNet?

From reading the code, my understanding is that for ResNet and DenseNet a channel selection layer is added after the BN layer and the model is then trained. At pruning time, the channel selection layer's values are first all set to 0, and the entries to keep are then set to 1.

    # We need to set the channel selection layer.
    m2 = new_modules[layer_id + 1]
    m2.indexes.data.zero_()
    m2.indexes.data[idx1.tolist()] = 1.0

It feels like this just prunes, within the added channel selection layer, the channels that need pruning and keeps the rest. But it seems to me that if I did not add this channel selection layer and pruned the same way as VGG, there would be no problem either.

Maybe I have not understood the code thoroughly enough; I hope the author can give me some pointers. Thanks, looking forward to your reply!

About sparsity regularization

Thank you for sharing. I have a question about sparsity regularization.
As shown in formula (1) of the paper, g(s) is added to the loss function. But in the code I don't find g(s) in the loss function; I only find that an additional gradient for the scaling factors is added to the original gradient.
Could you show me where g(s) is added to the loss function in the code?
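If it helps, here is a small self-contained check (my own illustration, not code from the repository) that adding s * sign(gamma) to the gradient after the backward pass gives the same gradients as writing the penalty g(gamma) = s * sum(|gamma|) into the loss, which would explain why no explicit penalty term appears in the loss:

    import torch

    s = 1e-4
    gamma = torch.randn(8, requires_grad=True)

    # (a) penalty written into the loss, as in formula (1) of the paper;
    #     (gamma ** 2).sum() stands in for the task loss.
    ((gamma ** 2).sum() + s * gamma.abs().sum()).backward()
    grad_a = gamma.grad.clone()

    # (b) task loss only, then the penalty's subgradient added by hand.
    gamma.grad = None
    (gamma ** 2).sum().backward()
    gamma.grad.add_(s * torch.sign(gamma.data))

    print(torch.allclose(grad_a, gamma.grad))  # True (up to the subgradient choice at 0)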
