
apot_quantization's People

Contributors

hossein1387, xindongol, yhhhli


apot_quantization's Issues

Differences between quant_layer.py

Thanks for your nice work and codes!

I found that the quant_layer.py files in CIFAR10 and ImageNet are different. Can you please tell me why there exist two versions? BTW, I plan to quantize MobileNet V2. Which version is better for migration? Thank you.

difference between paper and code in quan_layer

Hi~ I am a little puzzled about a difference between the paper and the code in quant_layer.py.
In the paper, when B = 4, P0 ∈ {0, 2^0, 2^-2, 2^-4} (shown in the example in Section 2.2).
But in the code, quant_layer.py line 23 is base_b.append(2 ** (-2 * i - 2)), so for i in range(3) we get P0 ∈ {0, 2^-2, 2^-4, 2^-6}.
I am a little puzzled about this, thanks a lot!
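For reference, a minimal sketch (not the repo's Build_power_value, just an illustration of the two base sets being compared) of how the exponent offset changes P0:

# Illustrative only: the base set from the paper's B = 4 example versus the one
# produced by the offset in quant_layer.py line 23.
base_paper = [0.] + [2 ** (-2 * i) for i in range(3)]      # {0, 2^0, 2^-2, 2^-4}
base_code  = [0.] + [2 ** (-2 * i - 2) for i in range(3)]  # {0, 2^-2, 2^-4, 2^-6}
print(base_paper)  # [0.0, 1.0, 0.25, 0.0625]
print(base_code)   # [0.0, 0.25, 0.0625, 0.015625]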

Lightning Integration

Dear @yhhhli,

This work looks awesome!
I was wondering if you would be interested in making a PR contribution to PyTorch Lightning.
It could be a callback that converts the user's model and applies this quantization method while training the model.
It would fit really nicely alongside pruning and fine-tuning.
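A rough sketch of the kind of callback meant here (purely hypothetical, neither an official Lightning API nor code from this repo; convert_to_apot is a placeholder for module-swapping logic built around the repo's QuantConv2d):

import torch.nn as nn
import pytorch_lightning as pl


def convert_to_apot(module: nn.Module, bit: int = 4) -> nn.Module:
    # Placeholder: recursively swap nn.Conv2d children for the repo's QuantConv2d.
    for name, child in module.named_children():
        if isinstance(child, nn.Conv2d):
            pass  # setattr(module, name, QuantConv2d(...))  # constructor args depend on the repo
        else:
            convert_to_apot(child, bit)
    return module


class APoTQuantizationCallback(pl.Callback):
    """Converts the user's model to APoT quantized layers before fitting starts."""

    def __init__(self, bit: int = 4):
        self.bit = bit

    def setup(self, trainer, pl_module, stage=None):
        convert_to_apot(pl_module, self.bit)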

Best,
T.C

Some results about resnet20 on cifar10

Hello, I am working on quantization now and tried to reproduce your results, but unfortunately they do not match. The full-precision accuracy reported in your paper is 91.6, but the result from the code is 92.9, which is considerably higher than the paper's number, while for 4-bit quantization the paper reports 92.3. It seems unfair to compare full precision with quantization under these conditions, so please confirm whether the results reported in the paper are correct.

In addition, is the number reported in the paper the test accuracy at the end of training, or the best test accuracy obtained during training? From the results, the paper appears to differ from the code, so please give an explanation. Also, my repeated runs give different results; with all other parameters kept the same, will training on multiple GPUs versus a single GPU change the results much? Thank you.
[Picture 1]

Can you provide a pre-trained ResNet-18 model?

Hello,
Thanks for your contribution to the network quantization field and for your open-source code. I met some problems when training a ResNet-18 model (quantizing both weights and activations to 4 bits) on the ImageNet dataset: the final best accuracy is about 68.1%. I kept all the hyper-parameters the same as in the code except for batch_size, due to GPU capacity; 3 RTX 2080Ti cards were used for training and the batch_size was set to 196. I wonder if something was wrong in my training, and I would appreciate it if you could provide a pre-trained ResNet-18 model to help find the problem.

The MUL unit of APOT

Hi,

Do you have the specific design of the MUL (Multiplication) unit for APOT quantization?

We know that uniform (INT) quantization and POT quantization are hardware-friendly.

Assume that:
R = real number
S = Scale number
T = quantized number
R1 = S1 * T1
R2 = S2 * T2

Uniform quantization simply adopts the INT MUL unit:

T1 = m
T2 = n

So, we have:

R1 * R2 = (S1 * S2) * (m * n) 

For POT:

T1 = 2^m
T2 = 2^n

So, we have:

R1 * R2 = (S1 * S2) * (2^m * 2^n)
        = (S1 * S2) * 2^(m + n)

POT multiplication is similar to an exponent-only floating-point MUL.

However, for APOT I have two questions about the MUL design, since there are additive elements in the data.
Assume a 4-bit APOT code split into two 2-bit groups.
The decoder table for the first two bits:

00 -> 2^0    01 -> 2^-1    10 -> 2^-3    11 -> 2^-5

And for the last two bits:

00 -> 2^0    01 -> 2^-2    10 -> 2^-4    11 -> 2^-6

For the first two bits, the exponents in the decoder table are not contiguous: 0, -1, -3, -5.

Q1: How do you efficiently decode the binary code to the APOT, especially in the MUL unit?

Assume two numbers in APOT:

0101: T1 = 2^-1 + 2^-2
1010: T2 = 2^-3 + 2^-4

T1 * T2 = (2^-1 + 2^-2) * (2^-3 + 2^-4)
        = (2^-1 * 2^-3) + (2^-1 * 2^-4) + (2^-2 * 2^-3) + (2^-2 * 2^-4)
        = 2^-4 + 2^-5 + 2^-5 + 2^-6

Obviously, this calculation requires roughly 4x (9x) the multiply/add operations of POT at 4 bits (6 bits). The result also violates the definition of APOT, which does not allow the same additive element, such as 2^-5 here, to appear twice in one number.

Q2: How do you deal with the complex computation and the subnormal-like numbers in APOT?

One direct solution is to convert back to float and perform fake quantization, but isn't that a violation of the principle of quantization?
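A quick numeric check of the worked example above (illustrative only): the product of two 2-term APoT values expands into four partial products, two of which collide at 2^-5, so the exact product is no longer a valid APoT number.

t1_terms = [2 ** -1, 2 ** -2]   # T1 = 2^-1 + 2^-2
t2_terms = [2 ** -3, 2 ** -4]   # T2 = 2^-3 + 2^-4

partials = [a * b for a in t1_terms for b in t2_terms]
print(partials)                        # [0.0625, 0.03125, 0.03125, 0.015625] = 2^-4, 2^-5, 2^-5, 2^-6
print(sum(partials))                   # 0.140625
print(sum(t1_terms) * sum(t2_terms))   # 0.140625, same value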

Weight normalization

Hello!
I found that without weight normalization the network ceases to learn and the loss becomes NaN. Could you please explain why this happens and how it can be fixed?
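For context, a minimal sketch of the kind of weight normalization being discussed, assuming it means scaling the weight tensor to zero mean and unit variance before clipping and quantization (the repo's exact implementation may differ):

import torch

def normalize_weight(weight: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Zero-mean, unit-variance scaling keeps the weight distribution in a
    # predictable range, so the clipping threshold and quantization grid stay
    # well matched across layers and training steps.
    return weight.sub(weight.mean()).div(weight.std() + eps)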

quantization bit of apot

Hello, I have read your paper and your code. I found that in the code, Build_power_value matches the description in the paper. However, the weight quantization function passes (bitwidth - 1) as B to Build_power_value, which means 4-bit quantization levels use the 3-bit formula. I wonder whether I missed something in the paper or the code?

calculate MAC

How did you calculate the number of MACs? Please share the code.

The migration of this QAT function?

Thanks for your great work!
Compared with the ResNet series on ImageNet, I care more about small models like the MobileNet or ShuffleNet series. Have you tested the QAT function on any small model? Is it useful?

Size and accuracy

Hi,

Based on the provided pretrained model (res18_2bit), I got 64.690% accuracy, and the quantized model size is 5 MB (gzip) or 3.4 MB (7zip). This is quite different from the results in your paper. Can you please point out why that is? I just ran:

python main.py -a resnet18 --bit 2 --pretrained resnet18_2bit.pth

Thanks

uniform_quantization

Thanks for your work. I have a question about the uniform_quantization function: line 142 is different from line 138. I want to know, is that right?

Need Suggestion

Greetings Authors,

I have just gone through your paper, trying to understand your concept, but I really wonder: can I apply the APoT method to MobileNet V2?
Kindly respond.

about training time

I am using ImageNet to train APoT, and it is really time-consuming: it seems that one epoch needs a day (on 8 V100s). Is there anything wrong?

Technical details

Thank you for the great contribution!
We are now experimenting with your implementation of QuantConv2d and trying to integrate it into an object detector, namely RetinaNet. I would therefore like to ask you some questions to validate my assumptions.

  1. Unfortunately, the size of the model did not change after applying the 4-bit APoT quantization. As far as I can see, you use the same tensor type, namely FP32, but insert the quantized values. Does that mean that when saving the model, PyTorch does not distinguish between the full-precision model and your quantized implementation, because the tensors have the same type (in some sources this is called fake quantization)? Or is there a tool in your code that assigns another type to the quantized weights in order to reduce the model size when saving? (See the sketch after this list.)
  2. Because of the problem above, I wanted to convert your weight tensors to dtype=torch.qint8, but I found that the weights of the quantized layers are not integers, even with binary and uniform quantization. Does that mean that when you say you have, e.g., 4-bit uniform quantization, the weights are not integers in the interval [0, 2^4 - 1] (or another clipping interval), but rather that all weights can be represented with 4 bits without necessarily being integers?
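A small sketch illustrating point 1, using the uniform fake-quantization pattern quoted elsewhere in these issues (illustrative, not the repo's exact code): the quantized weights stay FP32 tensors, so the saved file does not shrink; only the number of distinct values does.

import torch

w = torch.randn(64, 64, 3, 3)            # a full-precision conv weight
bit = 4
alpha = w.abs().max()                    # clipping threshold
levels = 2 ** (bit - 1) - 1
w_q = (w / alpha).clamp(-1, 1).mul(levels).round().div(levels).mul(alpha)

print(w_q.dtype)               # torch.float32 -> same storage size as the original tensor
print(w.unique().numel())      # essentially all 64*64*3*3 values are distinct
print(w_q.unique().numel())    # at most 2 * levels + 1 = 15 distinct values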

Please feel free to point out directly if anything in my assumptions is wrong.

Best regards,

about uniform quantization

Hello, here are the two uniform quantization functions from CIFAR10 and ImageNet:
def uniform_quant(x, b=3):
    xdiv = x.mul(2 ** b - 1)
    xhard = xdiv.round().div(2 ** b - 1)
    return xhard

def uniform_quantization(tensor, alpha, bit, is_weight=True, grad_scale=None):
    if grad_scale:
        alpha = gradient_scale(alpha, grad_scale)
    data = tensor / alpha
    if is_weight:
        data = data.clamp(-1, 1)
        data = data * (2 ** (bit - 1) - 1)
        data_q = (data.round() - data).detach() + data
        data_q = data_q / (2 ** (bit - 1) - 1) * alpha
    else:
        data = data.clamp(0, 1)
        data = data * (2 ** bit - 1)
        data_q = (data.round() - data).detach() + data
        data_q = data_q / (2 ** (bit - 1) - 1) * alpha
    return data_q

I want to know why the input first needs to be multiplied by (2 ** bit - 1). If we multiply by (2 ** bit - 1) and then divide by (2 ** b - 1), isn't the number unchanged?
Thanks
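A small numeric sketch (illustrative, following the uniform_quant form quoted above): the round() between the multiply and the divide is what changes the values, snapping them onto a grid with step 1 / (2^b - 1).

import torch

x = torch.tensor([0.10, 0.37, 0.52, 0.91])
b = 3
xdiv = x.mul(2 ** b - 1)               # stretch [0, 1] onto [0, 7]
xhard = xdiv.round().div(2 ** b - 1)   # round to integers, then map back to [0, 1]
print(xhard)                           # tensor([0.1429, 0.4286, 0.5714, 0.8571]), i.e. multiples of 1/7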

Bottleneck Doubts

This is more of a doubt than an issue. I am new to this field, so I am unable to understand why the QuantConv() layers have not been implemented in the Bottleneck class; only the BasicBlock uses the quantized convolutional layers.
Is there a specific reason for that? Does it affect the accuracy too much?

Accuracy of Implementation

Hi, I have a question about the reported accuracy.

For example, you got 70.75% and 66.46% with 5 bits and 2 bits for ResNet-18 on ImageNet, respectively.

In the paper, however, 70.9% and 67.3% with 5 bits and 2 bits are reported.

Can you explain what has made these differences?

a4w4 Resnet18 is 1.7% lower than that in the paper?

I trained ResNet18 a4w4 with the following command, using the latest repo:

python main.py -a resnet18 --bit 4 --gpu 0 -b 256

The best top-1 accuracy is only 69.01; why is it 1.7% lower than the 70.7 reported in the paper?

Imagenet Accuracy quickly dropping

When I try this command in the ImageNet folder:

python main.py -a resnet18 -b 5 --data <path to imagenet directory>

I get this log. Is that expected?

[image: training log]

NaN loss for 8bit

Hi, I was trying to train ResNet18 on ImageNet with 8-bit quantization. However, the loss is always NaN. I tried smaller learning rates and the results are the same. Note that I only ran with batch sizes of 64 and 128; although that is low, it seemed to work for 5-bit quantization training. Any ideas why this could be?
