LSQ-Net: Learned Step Size Quantization

Introduction

This is an unofficial implementation of LSQ-Net, a deep neural network quantization framework. LSQ-Net was proposed by Steven K. Esser et al. from IBM. The paper is available as arXiv:1902.08153.

There are a few small differences between my implementation and the original paper, which are described in detail below.

If this repository is helpful to you, please star it.

Results and Models

Here are some experimental results. More quantized models with different configurations will be released soon.

All these models can be downloaded from Dropbox.

Network | Config. File | Model | Bitwidth (W/A) | Top-1 Acc. (%) | Top-5 Acc. (%)
ResNet-18 | link | link | 3/2 | 66.9 | 87.2

User Guide

Install Dependencies

First install library dependencies within an Anaconda environment.

# Create an environment with Python 3.8
conda create -n lsq python=3.8
# Activate it so that the packages below are installed into it
conda activate lsq
# PyTorch GPU version >= 1.5
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
# Tensorboard visualization tool
conda install tensorboard
# Miscellaneous
conda install scikit-learn pyyaml munch

Run Scripts with Your Configurations

This program uses YAML files as inputs. A template containing the default configuration is provided as config.yaml.

If you want to change the behaviour of this program, copy config.yaml somewhere else, edit the copy, and then run main.py with your modified configuration file.

python main.py /path/to/your/config/file.yaml

Options set in your YAML file override the default settings; see the comments in config.yaml for details. After every epoch, the program automatically stores the best model parameters as a checkpoint. To resume training, or to evaluate the accuracy of a quantized model, set the option resume.path: /path/to/checkpoint.pth.tar in the YAML file.
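The override behaviour can be pictured as a recursive merge of two nested dictionaries. The following is a minimal sketch under stated assumptions, not the repository's actual loader: the function name merge_config and every option name other than resume.path are hypothetical, and plain dicts stand in for the parsed YAML files.

```python
import copy


def merge_config(default, override):
    """Return a copy of `default` with values from `override` applied recursively."""
    merged = copy.deepcopy(default)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys from the user file replace the default.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults, standing in for the contents of config.yaml.
default_cfg = {
    "epochs": 90,
    "dataloader": {"batch_size": 256, "workers": 4},
    "resume": {"path": None},
}

# Hypothetical user file: only the options that differ from the defaults.
user_cfg = {
    "dataloader": {"batch_size": 64},
    "resume": {"path": "/path/to/checkpoint.pth.tar"},
}

cfg = merge_config(default_cfg, user_cfg)
print(cfg)
```

Options that the user file does not mention, such as dataloader.workers in this sketch, keep their default values, so a user file only needs to list what differs.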

You can find some example configuration files in the example folder.

Implementation Differences From the Original Paper

The LSQ-Net paper initially had two versions, v1 and v2. To improve accuracy, the authors expanded the quantization space in v2. They have since released v3, which fixes some typos in v2.

My implementation generally follows the v2 version, except for the following points.

Initial Values of the Quantization Step Size

The authors use Tensor(v.abs().mean() * 2 / sqrt(Qp)) as initial values of the step sizes in both weight and activation quantization layers, where Qp is the upper bound of the quantization space, and v is the initial weight values or the first batch of activations.

In my implementation, the step sizes in weight quantization layers are initialized in the same way, but in activation quantization layers, the step sizes are initialized as Tensor(1.0).
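The two initialization schemes can be compared with a small worked example. This is a sketch only: plain Python lists stand in for tensors, the helper names are hypothetical, and the formula for Qp assumes the paper's convention of 2^(b-1) - 1 for signed data and 2^b - 1 for unsigned data, which may not match every configuration of this repository.

```python
import math


def quantization_upper_bound(bits, signed):
    """Qp, the largest integer level (assumed convention, see lead-in)."""
    return 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1


def paper_step_size_init(values, qp):
    """Initial step size from the paper: 2 * mean(|v|) / sqrt(Qp)."""
    mean_abs = sum(abs(v) for v in values) / len(values)
    return 2.0 * mean_abs / math.sqrt(qp)


# Toy weight values; their mean absolute value is 0.625.
weights = [0.5, -0.25, 0.75, -1.0]

# Weight quantizer: initialized from the weights, as in the paper.
qp_w = quantization_upper_bound(bits=3, signed=True)
s_w = paper_step_size_init(weights, qp_w)

# Activation quantizer in this implementation: a constant instead of
# a statistic of the first batch of activations.
s_a = 1.0

print(f"Qp = {qp_w}, weight step size = {s_w:.4f}, activation step size = {s_a}")
```

With 3-bit signed weights, Qp is 3, so the weight step size is 1.25 / sqrt(3), roughly 0.72, while the activation step size starts at 1.0 regardless of the data.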

Supported Models

Currently, only ResNet is supported. For the ImageNet dataset, the ResNet-18/34/50/101/152 models are copied from the torchvision model zoo. For the CIFAR10 dataset, the ResNet-20/32/44/56/110/1202 models are adapted from Yerlan Idelbayev's contribution.

Thanks to the non-invasive nature of the framework, it is easy to add architectures other than ResNet. All you need to do is paste your model code into the model folder and add a corresponding entry in model/model.py. The quantization framework will automatically replace the layers specified in quan/func.py with their quantized versions.
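The idea of non-invasive layer replacement can be illustrated with a toy example. This is a sketch of the general technique, not the code in quan/func.py: all class and function names here are hypothetical, and the Module class is a minimal stand-in for a framework module that owns named children.

```python
class Module:
    """Toy stand-in for a framework module that owns named child modules."""

    def __init__(self, **children):
        self.children = dict(children)


class Conv(Module):
    """Toy full-precision layer."""


class Relu(Module):
    """Toy layer that has no quantized counterpart."""


class QuanConv(Module):
    """Toy quantized layer that wraps the full-precision layer it replaces."""

    def __init__(self, original):
        super().__init__()
        self.original = original


def replace_layers(module, mapping):
    """Swap every descendant whose exact type is a key of `mapping`, in place."""
    for name, child in list(module.children.items()):
        quantized_cls = mapping.get(type(child))
        if quantized_cls is not None:
            # Replace the layer; the model definition itself is untouched.
            module.children[name] = quantized_cls(child)
        else:
            # Not a target layer: look for targets among its children.
            replace_layers(child, mapping)
    return module


block = Module(conv=Conv(), act=Relu())
net = Module(stem=Conv(), block=block)
replace_layers(net, {Conv: QuanConv})

print(type(net.children["stem"]).__name__)    # QuanConv
print(type(block.children["conv"]).__name__)  # QuanConv
print(type(block.children["act"]).__name__)   # Relu
```

Because the replacement walks the module tree after the model is built, the model code pasted into the model folder does not need to know anything about quantization.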

Contributing Guide

I am not a professional algorithm researcher, and I have very limited GPU resources, so I may not be able to spend much more time optimizing the accuracy of this implementation.

However, if you find any bugs in my code or have any ideas to improve the quantization results, please feel free to open an issue. I will be glad to join the discussion.

Contributors

zhutmost