
apot_quantization's People

Contributors

hossein1387, xindongol, yhhhli


apot_quantization's Issues

Differences between quant_layer.py

Thanks for your nice work and codes!

I found that the quant_layer.py files in CIFAR10 and ImageNet are different. Can you please tell me why there exist two versions? BTW, I plan to quantize MobileNet V2. Which version is better for migration? Thank you.

difference between paper and code in quan_layer

Hi~ I am a little puzzled about a difference between the paper and the code in quant_layer.py.
In the paper, when B = 4, P0 ∈ {0, 2^0, 2^-2, 2^-4} (shown in the example in Section 2.2).
But in the code, quant_layer.py line 23 is base_b.append(2 ** (-2 * i - 2)), so for i in range(3) we get P0 ∈ {0, 2^-2, 2^-4, 2^-6}.
I am a little puzzled about this, thanks a lot!
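For reference, a minimal sketch (not the repo's Build_power_value, just an illustration of the two base sets being compared) of how the exponent offset changes P0:

# Illustrative only: the base set from the paper's B = 4 example versus the one
# produced by the offset in quant_layer.py line 23.
base_paper = [0.] + [2 ** (-2 * i) for i in range(3)]      # {0, 2^0, 2^-2, 2^-4}
base_code  = [0.] + [2 ** (-2 * i - 2) for i in range(3)]  # {0, 2^-2, 2^-4, 2^-6}
print(base_paper)  # [0.0, 1.0, 0.25, 0.0625]
print(base_code)   # [0.0, 0.25, 0.0625, 0.015625]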

Lightning Integration

Dear @yhhhli,

This work looks awesome!
I was wondering if you would be interested in making a PR contribution to PyTorch Lightning.
It could be a callback that converts the user's model and applies this quantization method while training the model.
It would fit really nicely alongside pruning and fine-tuning.
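A rough sketch of the kind of callback meant here (purely hypothetical, neither an official Lightning API nor code from this repo; convert_to_apot is a placeholder for module-swapping logic built around the repo's QuantConv2d):

import torch.nn as nn
import pytorch_lightning as pl


def convert_to_apot(module: nn.Module, bit: int = 4) -> nn.Module:
    # Placeholder: recursively swap nn.Conv2d children for the repo's QuantConv2d.
    for name, child in module.named_children():
        if isinstance(child, nn.Conv2d):
            pass  # setattr(module, name, QuantConv2d(...))  # constructor args depend on the repo
        else:
            convert_to_apot(child, bit)
    return module


class APoTQuantizationCallback(pl.Callback):
    """Converts the user's model to APoT quantized layers before fitting starts."""

    def __init__(self, bit: int = 4):
        self.bit = bit

    def setup(self, trainer, pl_module, stage=None):
        convert_to_apot(pl_module, self.bit)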

Best,
T.C

Some results about resnet20 on cifar10

Hello, I am working on quantization now and tried to reproduce your results, but unfortunately they do not match. The full-precision accuracy reported in your paper is 91.6, but the result from the code is 92.9, which is considerably higher than the paper's number, while for 4-bit quantization the paper reports 92.3. It seems unfair to compare full precision with quantization under these conditions, so please confirm whether the results reported in the paper are correct.

In addition, is the number reported in the paper the test accuracy at the end of training, or the best test accuracy obtained during training? From the results, the paper appears to differ from the code, so please give an explanation. Also, my repeated runs give different results; with all other parameters kept the same, will training on multiple GPUs versus a single GPU change the results much? Thank you.
[Picture 1]

Can you provide a pre-trained ResNet-18 model?

Hello,
Thanks for your contribution to the network quantization field and for your open-source code. I met some problems when training a ResNet-18 model (quantizing both weights and activations to 4 bits) on the ImageNet dataset: the final best accuracy is about 68.1%. I kept all the hyper-parameters the same as in the code except for batch_size, due to GPU capacity; 3 RTX 2080Ti cards were used for training and the batch_size was set to 196. I wonder if something was wrong in my training, and I would appreciate it if you could provide a pre-trained ResNet-18 model to help find the problem.

The MUL unit of APOT

Hi,

Do you have the specific design of the MUL (Multiplication) unit for APOT quantization?

We know that uniform (INT) quantization and POT quantization are hardware-friendly.

Assume that:
R = real number
S = Scale number
T = quantized number
R1 = S1 * T1
R2 = S2 * T2

Uniform quantization simply adopts the INT MUL unit:

T1 = m
T2 = n

So, we have:

R1 * R2 = (S1 * S2) * (m * n) 

For POT:

T1 = 2^m
T2 = 2^n

So, we have:

R1 * R2 = (S1 * S2) * (2^m * 2^n)
        = (S1 * S2) * 2^(m + n)

POT multiplication is similar to an exponent-only floating-point MUL.

However, for APOT I have two questions about the MUL design, since there are additive elements in the data.
Assume a 4-bit APOT code split into two 2-bit groups.
The decoder table for the first two bits:

00 -> 2^0    01 -> 2^-1    10 -> 2^-3    11 -> 2^-5

And for the last two bits:

00 -> 2^0    01 -> 2^-2    10 -> 2^-4    11 -> 2^-6

For the first two bits, the exponents in the decoder table are not contiguous: 0, -1, -3, -5.

Q1: How do you efficiently decode the binary code to the APOT, especially in the MUL unit?

Assume two numbers in APOT:

0101: T1 = 2^-1 + 2^-2
1010: T2 = 2^-3 + 2^-4

T1 * T2 = (2^-1 + 2^-2) * (2^-3 + 2^-4)
        = (2^-1 * 2^-3) + (2^-1 * 2^-4) + (2^-2 * 2^-3) + (2^-2 * 2^-4)
        = 2^-4 + 2^-5 + 2^-5 + 2^-6

Obviously, this calculation requires roughly 4x (9x) the multiply/add operations of POT at 4 bits (6 bits). The result also violates the definition of APOT, which does not allow the same additive element, such as 2^-5 here, to appear twice in one number.

Q2: How do you deal with the complex computation and the subnormal-like numbers in APOT?

One direct solution is to convert back to float and perform fake quantization, but isn't that a violation of the principle of quantization?
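A quick numeric check of the worked example above (illustrative only): the product of two 2-term APoT values expands into four partial products, two of which collide at 2^-5, so the exact product is no longer a valid APoT number.

t1_terms = [2 ** -1, 2 ** -2]   # T1 = 2^-1 + 2^-2
t2_terms = [2 ** -3, 2 ** -4]   # T2 = 2^-3 + 2^-4

partials = [a * b for a in t1_terms for b in t2_terms]
print(partials)                        # [0.0625, 0.03125, 0.03125, 0.015625] = 2^-4, 2^-5, 2^-5, 2^-6
print(sum(partials))                   # 0.140625
print(sum(t1_terms) * sum(t2_terms))   # 0.140625, same value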

Weight normalization

Hello!
I found that without weight normalization the network ceases to learn and the loss becomes NaN. Could you please explain why this happens and how it can be fixed?
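For context, a minimal sketch of the kind of weight normalization being discussed, assuming it means scaling the weight tensor to zero mean and unit variance before clipping and quantization (the repo's exact implementation may differ):

import torch

def normalize_weight(weight: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Zero-mean, unit-variance scaling keeps the weight distribution in a
    # predictable range, so the clipping threshold and quantization grid stay
    # well matched across layers and training steps.
    return weight.sub(weight.mean()).div(weight.std() + eps)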

quantization bit of apot

Hello, I have read your paper and your code. I found that in the code, Build_power_value matches the description in the paper. However, the weight quantization function passes (bitwidth - 1) as B to Build_power_value, which means 4-bit quantization levels use the 3-bit formula. I wonder whether I missed something in the paper or the code?

calculate MAC

How did you calculate the number of MACs? Please share the code.

The migration of this QAT function?

Thanks for your great work!
Compared with the ResNet series on ImageNet, I care more about small models like the MobileNet or ShuffleNet series. Have you tested the QAT function on any small model? Is it useful?

Size and accuracy

Hi,

Based on the provided pretrained model (res18_2bit), I got 64.690% accuracy, and the quantized model size is 5 MB (gzip) or 3.4 MB (7zip). This is quite different from the results in your paper. Can you please point out why that is? I just ran:

python main.py -a resnet18 --bit 2 --pretrained resnet18_2bit.pth

Thanks

uniform_quantization

Thanks for your work. I have a question about the uniform_quantization function: line 142 is different from line 138. I want to know, is that right?

Need Suggestion

Greetings Authors,

I have just gone through your paper, trying to understand your concept, but I really wonder: can I apply the APoT method to MobileNet V2?
Kindly respond.

about training time

I am using ImageNet to train APoT, and it is really time-consuming: it seems that one epoch needs a day (on 8 V100s). Is there anything wrong?

Technical details

Thank you for the great contribution!
We are now experimenting with your implementation of QuantConv2d and trying to integrate it into an object detector, namely RetinaNet. I would therefore like to ask you some questions to validate my assumptions.

  1. Unfortunately, the size of the model did not change after applying the 4-bit APoT quantization. As far as I can see, you use the same tensor type, namely FP32, but insert the quantized values. Does that mean that when saving the model, PyTorch does not distinguish between the full-precision model and your quantized implementation, because the tensors have the same type (in some sources this is called fake quantization)? Or is there a tool in your code that assigns another type to the quantized weights in order to reduce the model size when saving? (See the sketch after this list.)
  2. Because of the problem above, I wanted to convert your weight tensors to dtype=torch.qint8, but I found that the weights of the quantized layers are not integers, even with binary and uniform quantization. Does that mean that when you say you have, e.g., 4-bit uniform quantization, the weights are not integers in the interval [0, 2^4 - 1] (or another clipping interval), but rather that all weights can be represented with 4 bits without necessarily being integers?
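A small sketch illustrating point 1, using the uniform fake-quantization pattern quoted elsewhere in these issues (illustrative, not the repo's exact code): the quantized weights stay FP32 tensors, so the saved file does not shrink; only the number of distinct values does.

import torch

w = torch.randn(64, 64, 3, 3)            # a full-precision conv weight
bit = 4
alpha = w.abs().max()                    # clipping threshold
levels = 2 ** (bit - 1) - 1
w_q = (w / alpha).clamp(-1, 1).mul(levels).round().div(levels).mul(alpha)

print(w_q.dtype)               # torch.float32 -> same storage size as the original tensor
print(w.unique().numel())      # essentially all 64*64*3*3 values are distinct
print(w_q.unique().numel())    # at most 2 * levels + 1 = 15 distinct values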

Please feel free to point out directly if anything in my assumptions is wrong.

Best regards,

about uniform quantization

Hello, here are the two uniform quantization functions from CIFAR10 and ImageNet:
def uniform_quant(x, b=3):
    xdiv = x.mul(2 ** b - 1)
    xhard = xdiv.round().div(2 ** b - 1)
    return xhard

def uniform_quantization(tensor, alpha, bit, is_weight=True, grad_scale=None):
    if grad_scale:
        alpha = gradient_scale(alpha, grad_scale)
    data = tensor / alpha
    if is_weight:
        data = data.clamp(-1, 1)
        data = data * (2 ** (bit - 1) - 1)
        data_q = (data.round() - data).detach() + data
        data_q = data_q / (2 ** (bit - 1) - 1) * alpha
    else:
        data = data.clamp(0, 1)
        data = data * (2 ** bit - 1)
        data_q = (data.round() - data).detach() + data
        data_q = data_q / (2 ** (bit - 1) - 1) * alpha
    return data_q

I want to know why the input first needs to be multiplied by (2 ** bit - 1). If we multiply by (2 ** bit - 1) and then divide by (2 ** b - 1), isn't the number unchanged?
Thanks
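A small numeric sketch (illustrative, following the uniform_quant form quoted above): the round() between the multiply and the divide is what changes the values, snapping them onto a grid with step 1 / (2^b - 1).

import torch

x = torch.tensor([0.10, 0.37, 0.52, 0.91])
b = 3
xdiv = x.mul(2 ** b - 1)               # stretch [0, 1] onto [0, 7]
xhard = xdiv.round().div(2 ** b - 1)   # round to integers, then map back to [0, 1]
print(xhard)                           # tensor([0.1429, 0.4286, 0.5714, 0.8571]), i.e. multiples of 1/7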

Bottleneck Doubts

This is more of a doubt than an issue. I am new to this field, so I am unable to understand why the QuantConv() layers have not been implemented in the Bottleneck class; only the BasicBlock uses the quantized convolutional layers.
Is there a specific reason for that? Does it affect the accuracy too much?

Accuracy of Implementation

Hi, I have a question about the reported accuracy.

For example, you got 70.75% and 66.46% with 5 bits and 2 bits for ResNet-18 on ImageNet, respectively.

In the paper, however, 70.9% and 67.3% with 5 bits and 2 bits are reported.

Can you explain what has made these differences?

a4w4 Resnet18 is 1.7% lower than that in the paper?

I trained ResNet18 a4w4 with the following command, using the latest repo:

python main.py -a resnet18 --bit 4 --gpu 0 -b 256

The best top-1 accuracy is only 69.01; why is it 1.7% lower than the 70.7 reported in the paper?

Imagenet Accuracy quickly dropping

When I try this command in the ImageNet folder:

python main.py -a resnet18 -b 5 --data <path to imagenet directory>

I get this log. Is that expected?

[image: training log]

NaN loss for 8bit

Hi, I was trying to train ResNet18 on ImageNet with 8-bit quantization. However, the loss is always NaN. I tried smaller learning rates and the results are the same. Note that I only ran with batch sizes of 64 and 128; although that is low, it seemed to work for 5-bit quantization training. Any ideas why this could be?
