benn-pytorch's Introduction

BENN

Code for Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?

CVPR 2019 Paper

If you use this code, please cite our paper: BibTeX

If you have any questions about the code or models, please open an issue. If you have general questions about the principles of BENN, or ideas for improving it, please contact us by email: [email protected], [email protected]. Please do not use this code commercially without permission from the authors.

Notice: As mentioned in the paper (Section 7), we are aware of the overfitting problem caused by the ensemble technique. If you retrain the models, your results should roughly match those shown in the paper and here, but they may be slightly higher or lower because of random initialization, epoch selection, overfitting, etc. If you have a good idea for resolving the overfitting issue of ensemble methods, please contact the authors so we can further improve BENN.

Train BENN on CIFAR-10 dataset

A customized Network-In-Network (NIN) model is used. Please see the paper for architecture details.

Ensemble  Model  Train  LR      BNN (start)  BENN (end)  Overfitting from  Best Voting    Models Directory  Logs
Bagging   AB     Seq    0.0001  67.35        81.32       20                Soft Max Vote  models            L
BoostA    AB     Seq    0.01    67.08        81.93       25                Soft Max Vote  models            L
BoostA    AB     Indp   0.01    70.59        82.12       20                Soft Max Vote  models            L
BoostB    AB     Seq    0.01    62.87        82.58       30                Soft Max Vote  models            L
BoostB    AB     Indp   0.01    69.65        82.13       21                Soft Max Vote  models            L
BoostC    AB     Seq    0.0001  67.88        79.40       27                Soft Max Vote  models            L
BoostD    AB     Indp   0.001   68.72        82.04       22                Soft Max Vote  models            L
Bagging   SB     Seq    0.001   77.87        89.12       25                Soft Max Vote  models            L
BoostA    SB     Seq    0.01    80.33        88.12       15                Soft Max Vote  models            L
BoostB    SB     Seq    0.001   84.23        87.9        31                Soft Max Vote  models            L
BoostC    SB     Seq    0.001   83.68        89.00       25                Soft Max Vote  models            L
BoostC    SB     Indp   0.01    80.38        87.72       23                Soft Max Vote  models            L
BoostD    SB     Seq    0.001   84.5         88.83       24                Soft Max Vote  models            L
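To compare configurations at a glance, the gain from the single BNN (start) to the full BENN (end) can be computed directly from the table above. The snippet below is only a reading aid: the accuracy values are copied from the table, while the variable and function names are illustrative and not part of the repository.

```python
# Rows copied from the results table:
# (ensemble, model, train mode, BNN start accuracy %, BENN end accuracy %)
ROWS = [
    ("Bagging", "AB", "Seq", 67.35, 81.32),
    ("BoostA", "AB", "Seq", 67.08, 81.93),
    ("BoostA", "AB", "Indp", 70.59, 82.12),
    ("BoostB", "AB", "Seq", 62.87, 82.58),
    ("BoostB", "AB", "Indp", 69.65, 82.13),
    ("BoostC", "AB", "Seq", 67.88, 79.40),
    ("BoostD", "AB", "Indp", 68.72, 82.04),
    ("Bagging", "SB", "Seq", 77.87, 89.12),
    ("BoostA", "SB", "Seq", 80.33, 88.12),
    ("BoostB", "SB", "Seq", 84.23, 87.9),
    ("BoostC", "SB", "Seq", 83.68, 89.00),
    ("BoostC", "SB", "Indp", 80.38, 87.72),
    ("BoostD", "SB", "Seq", 84.5, 88.83),
]


def gains(rows):
    """Return (ensemble, model, train mode, gain in accuracy points) per row."""
    return [(e, m, t, round(end - start, 2)) for e, m, t, start, end in rows]


# Configuration with the highest final BENN accuracy.
best_final = max(ROWS, key=lambda row: row[4])
# Configuration whose ensemble improves most over its starting BNN.
largest_gain = max(gains(ROWS), key=lambda row: row[3])

for ensemble, model, mode, gain in gains(ROWS):
    print(f"{ensemble:8s} {model} {mode:4s} +{gain:.2f}")
```

A large gain does not imply the best final accuracy: a weak starting BNN leaves more room for the ensemble to help.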

Hints

Generally, we have:

๐Ÿ  2 different models (you can specify with --arch allbinnet/nin), corresponding to AB and SB models in the paper

โณ 2 different training modes (independent training, and sequential training)

โš™๏ธ 5 different ensemble schemes (Bagging, Boost A, Boost B, Boost C, and Boost D)

๐Ÿ“Š 2 voting strategies (hard majority vote, soft max vote)
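The two voting strategies listed above can disagree on the same sample. The sketch below illustrates the difference for one input, assuming that "soft max vote" means summing each member's softmax probabilities and taking the arg-max; that reading, and the function names, are assumptions for illustration and not taken from the repository code.

```python
from collections import Counter


def hard_majority_vote(member_probs):
    """Each member votes for its own top class; the most common vote wins.

    member_probs: one list of per-class probabilities per ensemble member,
    all for the same input sample.
    """
    votes = [max(range(len(p)), key=p.__getitem__) for p in member_probs]
    # On a tie, Counter.most_common returns the class that was seen first.
    return Counter(votes).most_common(1)[0][0]


def soft_vote(member_probs):
    """Sum the per-class probabilities over members, then take the arg-max."""
    n_classes = len(member_probs[0])
    summed = [sum(p[c] for p in member_probs) for c in range(n_classes)]
    return max(range(n_classes), key=summed.__getitem__)


# Two members lean weakly towards class 0; one is very sure about class 1.
probs = [
    [0.55, 0.45, 0.00],
    [0.55, 0.45, 0.00],
    [0.00, 1.00, 0.00],
]
print(hard_majority_vote(probs))  # class chosen by counting votes
print(soft_vote(probs))           # class chosen by summed confidence
```

Hard voting discards each member's confidence, while soft voting lets one confident member outweigh several uncertain ones, which is why the two strategies pick different classes here.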

Retrain models

For example:

$ python main_bagging_SB.py --epochs 0 --retrain_epochs 100 --root_dir PATH/TO/YOUR/models_bagging_SB/

Test pre-trained models

First download the models from the links above, then run the corresponding Python script to test the pre-trained models. You should get exactly the same numbers as in our logs above. For example:

$ python main_bagging_SB.py --epochs 0 --retrain_epochs 0 --root_dir PATH/TO/YOUR/DOWNLOADED/models_bagging_SB/

Notice: For AB models, you should get around 79-82% accuracy with an ensemble of 32 BNNs. For SB models, you should get around 87-89% accuracy with an ensemble of 32 BNNs (15-20 is usually a reasonable choice because of overfitting). A single BNN should reach around 69-73% and 83-84% accuracy for the AB and SB models, respectively.
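Because accuracy can stop improving, or even drop, as more BNNs are added, one way to choose the ensemble size is to score every prefix of the ensemble on held-out data and keep the best. The sketch below assumes hard majority voting and a held-out validation split. Neither the helper names nor the toy predictions come from the repository.

```python
from collections import Counter


def prefix_ensemble_accuracy(member_preds, labels):
    """Accuracy of the first k members under hard majority vote, for k = 1..K.

    member_preds[i][j] is the class predicted by member i for sample j.
    """
    accuracies = []
    for k in range(1, len(member_preds) + 1):
        correct = 0
        for j, label in enumerate(labels):
            votes = Counter(member_preds[i][j] for i in range(k))
            # Ties go to the class voted for by the earliest member.
            if votes.most_common(1)[0][0] == label:
                correct += 1
        accuracies.append(correct / len(labels))
    return accuracies


def best_ensemble_size(accuracies):
    """Smallest k that reaches the highest accuracy."""
    return max(range(len(accuracies)), key=accuracies.__getitem__) + 1


# Toy held-out data: 4 samples, 5 members; the last two members are poor.
labels = [0, 1, 1, 0]
member_preds = [
    [0, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]
accs = prefix_ensemble_accuracy(member_preds, labels)
print(accs)
print(best_ensemble_size(accs))
```

In this toy case accuracy peaks with three members and falls once the weaker members join.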

Train BENN on ImageNet dataset

The code and pre-trained models for AlexNet and ResNet-18 will be released in the near future; please stay tuned. We are currently studying the overfitting issue and testing the stability of the gain.

Notice: For AlexNet, you should get around 50-53% (bagging) and 52-55% (boosting) accuracy with an ensemble of 5-6 BNNs. For ResNet-18, you should get around 56-59% (bagging) and 59-62% (boosting) accuracy with an ensemble of 5-6 BNNs. A single BNN should reach around 44% and 48% accuracy for AlexNet and ResNet-18, respectively. Be sure to use the SB model with independent training, and make sure each BNN has converged well before ensembling. Because of the overfitting and optimization instability observed in Section 6.2 of the paper, you may want to train BENN multiple times and pick the best run.

Train BENN on your own network architecture and dataset

To train BENN for your own application, you can directly reuse the BENN training part of this code. More details will be provided. If you successfully train BENN on a new application with a new architecture and achieve satisfactory performance, please contact the authors and we will add a link here.
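As a starting point for a custom setup, the bagging scheme can be outlined independently of any particular network: each member is trained on a bootstrap resample of the training set. The sketch below is generic textbook bagging, not the repository's implementation. `train_member` is a hypothetical callback that you would replace with your own BNN training routine.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices uniformly with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def train_bagging(n_members, n_samples, train_member, seed=0):
    """Train n_members models, each on its own bootstrap resample.

    train_member (hypothetical) takes a list of sample indices and
    returns a trained model object.
    """
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        indices = bootstrap_indices(n_samples, rng)
        members.append(train_member(indices))
    return members


# Stand-in "training" that just records which distinct samples were drawn.
members = train_bagging(
    n_members=3,
    n_samples=10,
    train_member=lambda indices: sorted(set(indices)),
)
for distinct in members:
    print(len(distinct), "distinct samples out of 10")
```

Each resample typically omits some training samples and repeats others, which is what makes the members differ from one another.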

Acknowledgement

The single BNN training part of this code is largely based on XNOR-Net and Jiecao Yu's implementation. Please consider citing them as well if you use our code. Based on our testing, XNOR-Net is the most stable and reliable open-source BNN training scheme with production-level code.

Check list

  • Release CIFAR-10 Training Code
  • Release CIFAR-10 Pretrained Models
  • Release ImageNet Training Code
  • Release ImageNet Pretrained Models
