
deconvolution's Introduction

Network Deconvolution

Convolution is a central operation in Convolutional Neural Networks (CNNs), which applies a kernel to overlapping regions shifted across the image. However, because of the strong correlations in real-world image data, convolutional kernels are in effect re-learning redundant data. In this work, we show that this redundancy has made neural network training challenging, and propose network deconvolution, a procedure which optimally removes pixel-wise and channel-wise correlations before the data is fed into each layer. Network deconvolution can be efficiently calculated at a fraction of the computational cost of a convolution layer. We also show that the deconvolution filters in the first layer of the network resemble the center-surround structure found in biological neurons in the visual regions of the brain. Filtering with such kernels results in a sparse representation, a desired property that has been missing in the training of neural networks. Learning from the sparse representation promotes faster convergence and superior results without the use of batch normalization. We apply our network deconvolution operation to 10 modern neural network models by replacing batch normalization within each. Extensive experiments show that the network deconvolution operation is able to deliver performance improvement in all cases on the CIFAR-10, CIFAR-100, MNIST, Fashion-MNIST, Cityscapes, and ImageNet datasets.

@inproceedings{
Ye2020Network,
title={Network Deconvolution},
author={Chengxi Ye and Matthew Evanusa and Hua He and Anton Mitrokhin and Tom Goldstein and James A. Yorke and Cornelia Fermuller and Yiannis Aloimonos},
booktitle={International Conference on Learning Representations},
year={2020},
url={https://openreview.net/forum?id=rkeu30EtvS}
}

Install Dependencies

This code requires Python 3.5 or greater.

We recommend using pip to install the required dependencies.

pip install scipy numpy tensorboard matplotlib

Install PyTorch:

pip install torch torchvision
 

(optional, for visualization) Install tensorflow:

pip3 install tensorflow

Settings Overview

We have included a few settings you can add into the run command.

The basic run command (for non-ImageNet datasets) is:

python main.py --[keyword1] [argument1] --[keyword2] [argument2]  ...

The major keywords to note are:

  • deconv - set to True to use network deconvolution, or False to use batch normalization (BN)
  • arch - use a given architecture (resnet50, vgg11, vgg13, vgg19, densenet121)
  • wd - sets the weight decay to a given value
  • batch-size - sets the batch size
  • epochs - the number of epochs to run
  • dataset - the dataset to use (cifar10, cifar100); for ImageNet, use main_imagenet.py instead
  • lr - sets the learning rate
  • block - block size in deconvolution
  • block-fc - block size in decorrelating the fully connected layers.
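
The keywords above are passed as command-line flags, and values such as `--deconv True` arrive as strings. The following is a minimal sketch of how such flags could be parsed with `argparse`. It is an illustration only, not the repository's actual `main.py`: the `str2bool` helper, the `build_parser` function, and all default values are assumptions made for this example.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert 'True'/'False'-style strings to bool.

    A plain bool("False") would be True, so an explicit converter is needed.
    """
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: flag names follow the keyword list,
    # defaults are placeholders chosen for this sketch.
    parser = argparse.ArgumentParser(description="Illustrative flag parser")
    parser.add_argument("--deconv", type=str2bool, default=False)
    parser.add_argument("--arch", default="vgg11")
    parser.add_argument("--wd", type=float, default=0.001)
    parser.add_argument("--batch-size", type=int, default=128)
    parser.add_argument("--epochs", type=int, default=20)
    parser.add_argument("--dataset", default="cifar10")
    parser.add_argument("--lr", type=float, default=0.1)
    parser.add_argument("--block", type=int, default=64)
    parser.add_argument("--block-fc", type=int, default=0)
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the sketch runs without a shell.
    args = build_parser().parse_args(
        ["--deconv", "True", "--block-fc", "512", "--wd", ".001"]
    )
    print(args.deconv, args.block_fc, args.wd)
```

Note that `argparse` exposes hyphenated flags with underscores, so `--batch-size` and `--block-fc` are read as `args.batch_size` and `args.block_fc`.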

1. Running the examples from the paper

For example, to reproduce our 20-epoch CIFAR-10 run with a weight decay of .001 and a batch size of 128 on the vgg11 architecture, run:

CUDA_VISIBLE_DEVICES=0 python main.py --lr .1 --optimizer SGD --arch vgg11 --epochs 20 --dataset cifar10  --batch-size 128 --msg True --deconv False --block-fc 0 --wd .001

for batch norm, and

CUDA_VISIBLE_DEVICES=0 python main.py --lr .1 --optimizer SGD --arch vgg11 --epochs 20 --dataset cifar10  --batch-size 128 --msg True --deconv True --block-fc 512 --wd .001

for deconvolution.

2. ImageNet dataset:

  1. original resnet18 (90 epochs, use --epochs xx to change)
python main_imagenet.py -a resnet18 -j 32 imagenet/ILSVRC/Data/CLS-LOC 
  2. deconv resnet18
python main_imagenet.py -a resnet18d -j 32 imagenet/ILSVRC/Data/CLS-LOC --deconv True

3. Semantic segmentation:

Go to the Segmentation folder and follow the instructions in the ReadMe file.

deconvolution's People

Contributors

deconvolutionpaper, okbalefthanded, trellixvulnteam, yechengxi

deconvolution's Issues

Depthwise convolutions

How can a deconv layer be replaced with a combination of depthwise convolution and batchnorm?

License?

Hi,

Thanks for your code, the results are very interesting.
Wondering if you could please update the repository with the LICENSE file?

Deconvolution runtime

Thanks for this paper, I really enjoyed reading it.

I replaced all the batch norm layers in a ResNeXt-50 model with ChannelDeconv(block=64) layers, but I found that training takes much longer doing so, running about 30% slower. Did you notice this too with your experiments? Do you have any suggestions for speeding up the deconvolution layers?

Implementation details

First, let me congratulate you on your paper and also thank you for open-sourcing the code. I was porting the deconv operations/layers to Tensorflow and was wondering about something.

  1. Is the deconv covariance buffer the vast majority of non-trainable parameters in your models? Practically speaking, without groups (which Tensorflow doesn't support easily), the cost for that matrix in terms of parameters = [K1 * K2 * num_blocks] ^ 2. For a 3x3 kernel with 64 blocks, that's roughly 330K parameters right there. Are grouped convolutions the only remedy to this parameter explosion? It might become a network bandwidth issue in multi-node distributed training setups.

  2. Under what circumstances would one prefer the Delinear implementation over the FastDeconv implementation?
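
The estimate in question 1 can be checked with a few lines of arithmetic. This sketch simply evaluates the poster's formula, [K1 * K2 * num_blocks] ^ 2; whether that formula matches the buffer actually allocated by the repository's layers is an assumption of the question, and the function name `covariance_buffer_size` is made up for this example.

```python
def covariance_buffer_size(k1: int, k2: int, num_blocks: int) -> int:
    """Number of entries in a square covariance matrix whose side is k1 * k2 * num_blocks."""
    side = k1 * k2 * num_blocks
    return side * side


if __name__ == "__main__":
    # 3x3 kernel with 64 blocks: side = 576, so 576 * 576 entries.
    print(covariance_buffer_size(3, 3, 64))  # 331776
```

The result, 331,776 entries, agrees with the "roughly 330K" figure quoted in the question.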

Edit: It seems the link to the paper in the readme is broken.

inference time

Great work!
Network deconvolution has many good features. However, your paper does not mention the inference time with DC. I would guess it is almost the same as with the BN method; is that right?

Concerns on the segmentation performance gap based on Sync-BN

Really nice work!

We are interested in your experimental results on semantic segmentation tasks (Cityscapes).

According to Figure-6, it seems that the proposed DeConv outperforms the BN by a really large margin of around 7~8%. However, it seems that you only report the results trained with 30 epochs and we are wondering about the performance gap after more training, e.g., 100 epochs.

Besides, most current state-of-the-art segmentation methods use Sync-BN to improve their results, so I am also wondering whether you have compared your approach with Sync-BN.

Last, we hope you could share with us the ImageNet pre-trained checkpoints of ResNet-101/50 based on Deconvolution and we might help to verify the effectiveness of your approach based on the current state-of-the-art segmentation systems.

It would be great if you could share with us your suggestions.

Thanks,

FastDeconv breaks when no bias is used

When bias=False is passed to the FastDeconv constructor, the forward pass fails at:

b = self.bias - (w @ (X_mean.unsqueeze(1))).view(self.weight.shape[0], -1).sum(1)

This is because self.bias will be None and this line breaks.

I guess this would be possible:

if self.bias is None:
    b = - (w @ (X_mean.unsqueeze(1))).view(self.weight.shape[0], -1).sum(1)
else:
    b = self.bias - (w @ (X_mean.unsqueeze(1))).view(self.weight.shape[0], -1).sum(1)

When using Conv2d with BatchNorm, usually no bias is used in the Conv2d. However, when replacing both, then I guess a bias is needed again. So I'm not sure if there are useful cases for not using a bias when using the FastDeconv?
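
The guard proposed above can be illustrated without PyTorch. The sketch below is a framework-free simplification using plain lists, not the FastDeconv code itself: it assumes each output channel's weights are flattened into one row, and the name `folded_bias` is hypothetical. It shows the idea of treating a missing bias as zero while still subtracting the contribution of the input mean.

```python
from typing import List, Optional, Sequence


def folded_bias(
    weight_rows: Sequence[Sequence[float]],
    x_mean: Sequence[float],
    bias: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return bias - (W @ x_mean) per output channel; a missing bias counts as zero."""
    # Mean contribution for each output channel: dot(row, x_mean).
    shift = [sum(w * m for w, m in zip(row, x_mean)) for row in weight_rows]
    if bias is None:
        # No learnable bias: only the mean-subtraction term remains.
        return [-s for s in shift]
    return [b - s for b, s in zip(bias, shift)]


if __name__ == "__main__":
    weights = [[1, 2], [3, 4]]
    mean = [1, 1]
    print(folded_bias(weights, mean))            # no bias
    print(folded_bias(weights, mean, [10, 10]))  # with bias
```

Both branches share the same mean term, so the two-branch version in the issue could equally be written as one computation followed by an optional addition of the bias.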

Thanks for the paper and the implementation!

1d please

Hi,
I bet this would also work for speech recognition and signal processing, maybe even NLP and algo trading.
I would love to try this out for speech recognition tasks in particular.
Can you please provide a 1D deconvolution class? It seems like it shouldn't be too difficult, especially if you really understand every line of code you wrote...

Thanks
Dan

Accuracy calculation bug

The call to .view() on the correct tensor in the accuracy function (net_util.py line 342) fails when the tensor is not contiguous.
The solution is to make the tensor contiguous before applying the view.
A PR is on the way.
