
Omni-Dimensional Dynamic Convolution

By Chao Li, Aojun Zhou and Anbang Yao.

This repository is an official PyTorch implementation of "Omni-Dimensional Dynamic Convolution" (ODConv for short), published at ICLR 2022 as a spotlight. ODConv is a more generalized yet elegant dynamic convolution design: it leverages a novel multi-dimensional attention mechanism with a parallel strategy to learn complementary attentions for convolutional kernels along all four dimensions of the kernel space at any convolutional layer, namely the spatial size, the input channel number and the output channel number of each convolutional kernel, and the number of convolutional kernels. As a drop-in replacement for regular convolutions, ODConv can be plugged into many CNN architectures. Basic experiments are conducted on the ImageNet benchmark, and downstream experiments are conducted on the MS-COCO benchmark.

A schematic comparison of (a) DyConv (CondConv uses GAP+FC+Sigmoid) and (b) ODConv. Unlike CondConv and DyConv, which compute a single attention scalar $\alpha_{wi}$ for the convolutional kernel $W_{i}$, ODConv leverages a novel multi-dimensional attention mechanism to compute four types of attentions $\alpha_{si}$, $\alpha_{ci}$, $\alpha_{fi}$ and $\alpha_{wi}$ for $W_{i}$ along all four dimensions of the kernel space in a parallel manner.

Illustration of progressively multiplying the four types of attentions in ODConv to the convolutional kernels: (a) location-wise multiplication along the spatial dimension, (b) channel-wise multiplication along the input channel dimension, (c) filter-wise multiplication along the output channel dimension, and (d) kernel-wise multiplication along the kernel dimension of the convolutional kernel space.
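To make the mechanism concrete, below is a minimal PyTorch sketch of an ODConv-style layer. This is an illustrative rewrite, not the code released in this repository: the sigmoid gates for the spatial, input-channel and output-filter attentions and the softmax over the kernel attention follow the paper's description, while details such as temperature annealing, batch normalization and bias terms are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ODConv2d(nn.Module):
    """Sketch of omni-dimensional dynamic convolution (illustrative only)."""
    def __init__(self, in_ch, out_ch, k, stride=1, padding=0,
                 kernel_num=4, reduction=0.0625):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        self.stride, self.padding, self.kernel_num = stride, padding, kernel_num
        # n candidate kernels, each of shape (C_out, C_in, k, k)
        self.weight = nn.Parameter(
            torch.randn(kernel_num, out_ch, in_ch, k, k) * 0.01)
        hidden = max(int(in_ch * reduction), 16)
        # shared squeeze: GAP -> FC (reduction r) -> ReLU
        self.squeeze = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, hidden, 1),
            nn.ReLU(inplace=True))
        # four parallel heads, one per dimension of the kernel space
        self.spatial_fc = nn.Conv2d(hidden, k * k, 1)       # alpha_s
        self.channel_fc = nn.Conv2d(hidden, in_ch, 1)       # alpha_c
        self.filter_fc = nn.Conv2d(hidden, out_ch, 1)       # alpha_f
        self.kernel_fc = nn.Conv2d(hidden, kernel_num, 1)   # alpha_w

    def forward(self, x):
        b = x.size(0)
        ctx = self.squeeze(x)                               # (b, hidden, 1, 1)
        a_s = torch.sigmoid(self.spatial_fc(ctx)).view(b, 1, 1, 1, self.k, self.k)
        a_c = torch.sigmoid(self.channel_fc(ctx)).view(b, 1, 1, self.in_ch, 1, 1)
        a_f = torch.sigmoid(self.filter_fc(ctx)).view(b, 1, self.out_ch, 1, 1, 1)
        a_w = torch.softmax(self.kernel_fc(ctx).view(b, self.kernel_num), dim=1)
        a_w = a_w.view(b, self.kernel_num, 1, 1, 1, 1)
        # multiply the four attentions into the candidate kernels, then sum
        # over the kernel dimension to get one kernel per sample
        w = (a_w * a_f * a_c * a_s * self.weight.unsqueeze(0)).sum(dim=1)
        # grouped-conv trick: fold the batch into groups so each sample is
        # convolved with its own dynamically generated kernel
        x = x.reshape(1, b * self.in_ch, *x.shape[2:])
        w = w.reshape(b * self.out_ch, self.in_ch, self.k, self.k)
        out = F.conv2d(x, w, stride=self.stride, padding=self.padding, groups=b)
        return out.reshape(b, self.out_ch, *out.shape[2:])

# Drop-in usage in place of nn.Conv2d(64, 64, 3, padding=1):
layer = ODConv2d(64, 64, 3, padding=1)
y = layer(torch.randn(2, 64, 32, 32))                       # -> (2, 64, 32, 32)
```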

Dataset

The ImageNet dataset is prepared following this repository.

Requirements

  • Python >= 3.7.0
  • PyTorch >= 1.8.1
  • torchvision >= 0.9.1

Updates

  • 2022/09/16 Code and trained models of ResNet family and MobileNetV2 family with ODConv for classification and detection are released.

Results and Models

Note: The models released here show slightly different (mostly better) accuracies than the original models reported in our paper. Because the original models and source code were used in internal commercial projects, this reimplementation of the training and evaluation code is dedicated to the public release.

Results comparison on the ImageNet validation set with the MobileNetV2 (1.0×, 0.75×, 0.5×) backbones trained for 150 epochs. For our ODConv, we set r = 1/16.

| Models | Params | MAdds | Top-1 Acc (%) | Top-5 Acc (%) | Download |
| --- | --- | --- | --- | --- | --- |
| MobileNetV2 (1.0×) | 3.50M | 300.8M | 71.65 | 90.22 | model |
| + ODConv (1×) | 4.94M | 311.8M | 74.74 | 91.95 | model |
| + ODConv (4×) | 11.51M | 327.1M | 75.29 | 92.18 | model |
| MobileNetV2 (0.75×) | 2.64M | 209.1M | 69.18 | 88.82 | model |
| + ODConv (1×) | 3.51M | 217.1M | 72.71 | 90.85 | model |
| + ODConv (4×) | 7.50M | 226.3M | 74.01 | 91.37 | model |
| MobileNetV2 (0.5×) | 1.97M | 97.1M | 64.30 | 85.21 | model |
| + ODConv (1×) | 2.43M | 101.8M | 68.06 | 87.67 | model |
| + ODConv (4×) | 4.44M | 106.4M | 70.23 | 88.86 | model |

Results comparison on the ImageNet validation set with the ResNet18, ResNet50 and ResNet101 backbones trained for 100 epochs. For our ODConv, we set r = 1/16.

| Models | Params | MAdds | Top-1 Acc (%) | Top-5 Acc (%) | Download |
| --- | --- | --- | --- | --- | --- |
| ResNet18 | 11.69M | 1.814G | 70.25 | 89.38 | model |
| + ODConv (1×) | 11.94M | 1.838G | 73.05 | 91.05 | model |
| + ODConv (4×) | 44.90M | 1.916G | 74.19 | 91.47 | model |
| ResNet50 | 25.56M | 3.858G | 76.23 | 92.97 | model |
| + ODConv (1×) | 28.64M | 3.916G | 77.87 | 93.77 | model |
| + ODConv (4×) | 90.67M | 4.080G | 78.50 | 93.99 | model |
| ResNet101 | 44.55M | 7.570G | 77.44 | 93.68 | model |
| + ODConv (1×) | 50.82M | 7.675G | 78.84 | 94.27 | model |
| + ODConv (2×) | 90.44M | 7.802G | 79.15 | 94.34 | model |

Training

To train ResNet backbones:

python -m torch.distributed.launch --nproc_per_node={ngpus} main.py \
--arch {model name} --epochs 100 --lr 0.1 --wd 1e-4 --dropout {dropout rate} \
--lr-decay schedule --schedule 30 60 90 --kernel_num {number of kernels} --reduction {reduction ratio} \
--data {path to dataset} --checkpoint {path to checkpoint} 

For example, you can use the following command to train ResNet18 with ODConv (4×, r = 1/16):

python -m torch.distributed.launch --nproc_per_node=8 main.py \
--arch od_resnet18 --epochs 100 --lr 0.1 --wd 1e-4 --dropout 0.2 \
--lr-decay schedule --schedule 30 60 90 --kernel_num 4 --reduction 0.0625 \
--data ./datasets/ILSVRC2012 --checkpoint ./checkpoints/odconv4x_resnet18 

To train MobileNetV2 backbones:

python -m torch.distributed.launch --nproc_per_node={ngpus} main.py \
--arch {model name} --epochs 150 --lr 0.05 --wd 4e-5 --dropout {dropout rate} \
--lr-decay cos --kernel_num {number of kernels} --reduction {reduction ratio} \
--data {path to dataset} --checkpoint {path to checkpoint} 

For example, you can use the following command to train MobileNetV2 (1.0×) with ODConv (4×, r = 1/16):

python -m torch.distributed.launch --nproc_per_node=8 main.py \
--arch od_mobilenetv2_100 --epochs 150 --lr 0.05 --wd 4e-5 --dropout 0.2 \
--lr-decay cos --kernel_num 4 --reduction 0.0625 \
--data ./datasets/ILSVRC2012 --checkpoint ./checkpoints/odconv4x_mobilenetv2_100 
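Note that torch.distributed.launch is deprecated in newer PyTorch releases (1.10+). Assuming main.py reads the local rank from the LOCAL_RANK environment variable (this is an assumption about the script, so verify before switching), the same run can be launched with torchrun:

torchrun --nproc_per_node=8 main.py \
--arch od_mobilenetv2_100 --epochs 150 --lr 0.05 --wd 4e-5 --dropout 0.2 \
--lr-decay cos --kernel_num 4 --reduction 0.0625 \
--data ./datasets/ILSVRC2012 --checkpoint ./checkpoints/odconv4x_mobilenetv2_100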

You can add --use_amp to enable Automatic Mixed Precision (AMP), which reduces memory usage and speeds up training.
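As a reference for what --use_amp typically wires up, here is a minimal, self-contained sketch of a mixed-precision training step using the standard torch.cuda.amp API; the stand-in model and data are placeholders, not the repository's actual training loop.

```python
import torch
import torch.nn as nn

# Stand-in model and data for illustration; requires a CUDA device.
model = nn.Conv2d(3, 8, 3).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()      # scales the loss to avoid fp16 underflow

images = torch.randn(4, 3, 32, 32, device="cuda")
optimizer.zero_grad()
with torch.cuda.amp.autocast():           # forward pass runs in mixed precision
    loss = model(images).mean()
scaler.scale(loss).backward()             # backward on the scaled loss
scaler.step(optimizer)                    # unscales gradients, then steps
scaler.update()                           # adapts the scale for the next step
```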

Evaluation

To evaluate a pre-trained model:

python -m torch.distributed.launch --nproc_per_node={ngpus} main.py \
--arch {model name} --kernel_num {number of kernels} \
--reduction {reduction ratio} --data {path to dataset} --evaluate \
--resume {path to model}
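If you want to load a released checkpoint outside of main.py, the sketch below shows the usual PyTorch pattern. The `from models import od_resnet18` import and the 'state_dict'/'module.' handling are assumptions based on common conventions, so adjust them to the repository's actual module layout and checkpoint format.

```python
import torch
# Hypothetical import path; the actual factory name in this repo may differ.
from models import od_resnet18

model = od_resnet18(kernel_num=4, reduction=0.0625)
ckpt = torch.load("checkpoints/odconv4x_resnet18.pth", map_location="cpu")
# Checkpoints saved by distributed training often wrap the weights in a
# 'state_dict' key and prefix parameter names with 'module.'; adjust if not.
state = ckpt.get("state_dict", ckpt)
state = {k.replace("module.", "", 1): v for k, v in state.items()}
model.load_state_dict(state)
model.eval()
```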

Training and evaluation on downstream object detection

Please refer to the README.md in the object_detection folder for details.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{li2022odconv,
  title={Omni-Dimensional Dynamic Convolution},
  author={Chao Li and Aojun Zhou and Anbang Yao},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=DmpCfq6Mg39}
}

License

ODConv is released under the MIT license. We encourage use for both research and commercial purposes, as long as proper attribution is given.

Acknowledgment

This repository is built on the mmdetection and Dynamic-convolution-Pytorch repositories. We thank the authors for releasing their amazing code.

