sparsebit's Issues

ImportError

My setup:
cuda 10.2
python=3.8

Install Sparsebit with:

git clone https://github.com/megvii-research/Sparsebit.git
cd sparsebit
python3 setup.py develop --user
pip3 install tensorrt-8.2.5.1-cp38-none-linux_x86_64.whl

After installing, I ran /root/Sparsebit/examples/cifar10_ptq/main.ipynb, which raised RuntimeError: Ninja is required to load C++ extensions. So I also ran pip install Ninja and ran main.ipynb again, which produced this error:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/root/Sparsebit/examples/cifar10_ptq/main.ipynb Cell 2 in <cell line: 24>()
     21 import torchvision.datasets as datasets
     22 from model import resnet20
---> 24 from sparsebit.quantization import QuantModel, parse_qconfig

File ~/Sparsebit/sparsebit/quantization/__init__.py:1, in <module>
----> 1 from .quant_model import *
      2 from .quant_config import parse_qconfig

File ~/Sparsebit/sparsebit/quantization/quant_model.py:18, in <module>
     15 import onnx
     17 from sparsebit.utils import update_config
---> 18 from sparsebit.quantization.modules import *
     19 from sparsebit.quantization.observers import Observer
     20 from sparsebit.quantization.quantizers import Quantizer

File ~/Sparsebit/sparsebit/quantization/modules/__init__.py:16, in <module>
     12     return real_register
     15 # add the module files that need to be registered here
---> 16 from .base import QuantOpr, MultipleInputsQuantOpr
     17 from .activations import *
     18 from .conv import *
...
-> 1775     module = importlib.util.module_from_spec(spec)
   1776     assert isinstance(spec.loader, importlib.abc.Loader)
   1777     spec.loader.exec_module(module)

ImportError: /root/Sparsebit/sparsebit/quantization/torch_extensions/build/fake_quant.so: cannot open shared object file: No such file or directory

I examined the build directory and found it empty:
[screenshot: the build directory is empty]
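
For reference, a quick sanity check that may help (nothing here is Sparsebit-specific): the fake_quant extension is built on the fly by torch.utils.cpp_extension.load when sparsebit.quantization is imported, so if PyTorch cannot find a working ninja binary, the build directory stays empty and fake_quant.so never appears. Running something like the following before the notebook shows whether the toolchain is usable:

import torch.utils.cpp_extension as cpp_ext

cpp_ext.verify_ninja_availability()      # raises RuntimeError if ninja is not usable
print("CUDA_HOME:", cpp_ext.CUDA_HOME)   # should point at the CUDA 10.2 install

import sparsebit.quantization            # triggers the JIT build of fake_quant.so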

I am very interested in your 2080 Ti pipeline

This looks like a low-cost setup. For training LLaMA 7B, is a 2080 Ti strictly required because of its 11 GB of memory, or would eight 8 GB GPUs also work? Could you share the full configuration of your machine?

QAT cifar10 example with QADD quant enabled reports an error

Running https://github.com/megvii-research/Sparsebit/blob/main/examples/quantization_aware_training/cifar10/basecase/main.py reports an error.

  • Command
python3 main.py qconfig_lsq.yaml --epochs=0
  • qconfig_lsq.yaml contents:
BACKEND: virtual
W:
  QSCHEME: per-channel-symmetric
  QUANTIZER: 
    TYPE: lsq
    BIT: 4
A:
  QSCHEME: per-tensor-affine
  QUANTIZER:
    TYPE: lsq
    BIT: 4
  QADD:
    ENABLE_QUANT: true
  • Error
Traceback (most recent call last):
  File "main.py", line 439, in <module>
    main()
  File "main.py", line 223, in main
    qmodel.export_onnx(
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 257, in export_onnx
    self.add_extra_info_to_onnx(name)
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 313, in add_extra_info_to_onnx
    weight_dequant = nodes[tensor_inputs[onnx_op.input[1]][0]]
IndexError: list index (1) out of range

A bug may have been resolved.

There is a bug here. I thought of a simple way to fix it, which applies to QAT of ViT.

elif 'input_quantizer.scale' in dict(_module.state_dict()).keys():
    _module.input_quantizer.set_fake_fused()  # buggy: quant_state toggles back and forth
else:
    print("no_set_fake_fused:", _user.name, _module.input_quantizer_generated)

Update quantizer preprocessing when exporting ONNX

  1. Quantizers such as DoReFa apply extra operations to quantization parameters like scale / zero_point during forward, so the weight, scale, zero_point, etc. produced by ONNX export are not the parameters actually used in forward, and the exported ONNX therefore runs incorrectly.
  2. We would like a preprocessing step that makes the two consistent (a rough sketch follows below).
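
A very rough sketch of what such a preprocessing step could look like; all names below (bake_quant_params_for_export, _transform_params) are hypothetical, not existing Sparsebit APIs. The idea is to write the forward-time transformation of the quantization parameters back into the stored buffers before torch.onnx.export, so the serialized scale / zero_point match what forward actually uses:

import torch

def bake_quant_params_for_export(qmodel):
    # hypothetical helper: overwrite each quantizer's stored scale/zero_point
    # with the values it would actually use in forward (e.g. after the extra
    # ops a DoReFa-style quantizer applies), so ONNX export serializes the
    # same numbers
    for module in qmodel.modules():
        for attr in ("weight_quantizer", "input_quantizer"):
            quantizer = getattr(module, attr, None)
            if quantizer is None or not hasattr(quantizer, "scale"):
                continue
            with torch.no_grad():
                # _transform_params stands in for the quantizer-specific math
                scale, zero_point = quantizer._transform_params(
                    quantizer.scale, quantizer.zero_point
                )
                quantizer.scale.copy_(scale)
                quantizer.zero_point.copy_(zero_point)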

How to deal with layers constructed by bool operations?

For example, there is a layer constructed as follows:

while len(emb_out.shape) < len(h.shape):
    emb_out = emb_out[..., None]

When quantizing this layer, we hit this error:

torch.fx.proxy.TraceError: symbolically traced variables cannot be used as inputs to control flow
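
Not a Sparsebit feature, just a common torch.fx workaround that may apply here: move the shape-dependent loop into a free function and register it with torch.fx.wrap, so the tracer records a single call instead of tracing the Python control flow. The helper name expand_dims_like and the Block module are made up for this sketch:

import torch
import torch.fx

def expand_dims_like(emb_out, h):
    # the body is not traced; it only runs on real tensors when the
    # traced GraphModule executes
    while len(emb_out.shape) < len(h.shape):
        emb_out = emb_out[..., None]
    return emb_out

torch.fx.wrap("expand_dims_like")  # must be called at module level

class Block(torch.nn.Module):
    def forward(self, h, emb_out):
        return h + expand_dims_like(emb_out, h)

traced = torch.fx.symbolic_trace(Block())  # no TraceError
print(traced(torch.randn(2, 4, 8, 8), torch.randn(2, 4)).shape)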

How could I reproduce the results of QAT_DeiT on ImageNet?

I rewrote the related code in main.py as follows:

    # set head and tail of model is 8bit
    model.model.patch_embed_proj.weight_quantizer.set_bit(bit=8)
    model.model.head.input_quantizer.set_bit(bit=8)
    model.model.head.weight_quantizer.set_bit(bit=8)
    # model.model.conv1.weight_quantizer.set_bit(bit=8)
    # model.model.fc.input_quantizer.set_bit(bit=8)
    # model.model.fc.weight_quantizer.set_bit(bit=8)

and trained for 90 epochs, but I can't reproduce the result reported in the README.

error in homework Q3 code.

In the homework answer, the Q3 code has:
data.transpose(self.qdesc.ch_axis, 0)
Here, the 0 should be 1: according to the code, the first axis (0) is the calibration size and the second axis (1) is the channel axis.

Also, self.qdesc.ch_axis == 0 means an observer for weights, which should not be handled here.

torchvision densenet121 cannot be converted to a sparsebit QuantModel

Running the following code reports an error:

import torchvision
import torch

from sparsebit.quantization import QuantModel, parse_qconfig


qconfig_path = "./qconfig.yaml"
# BACKEND: virtual
# W:
#   QSCHEME: per-channel-symmetric
#   QUANTIZER: 
#     TYPE: lsq
#     BIT: 4
# A:
#   QSCHEME: per-tensor-affine
#   QUANTIZER:
#     TYPE: lsq
#     BIT: 4
#   QADD:
#     ENABLE_QUANT: true

model = torchvision.models.densenet121(pretrained=True)
qconfig = parse_qconfig(qconfig_path)
model = QuantModel(model, config=qconfig)
inp = torch.randn(2, 3, 224, 224)
out = model(inp)

QAT cifar10 example reports an error

Running https://github.com/megvii-research/Sparsebit/blob/main/examples/quantization_aware_training/cifar10/basecase/main.py reports an error.

  • Command
python3 main.py qconfig_lsq.yaml --epochs=0
  • Error
Traceback (most recent call last):
  File "main.py", line 428, in <module>
    main()
  File "main.py", line 219, in main
    qmodel.export_onnx(
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 254, in export_onnx
    self.add_extra_info_to_onnx(name)
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 298, in add_extra_info_to_onnx
    input_dequant = nodes[tensor_inputs[onnx_op.input[0]][0]]
KeyError: 'conv1.weight_quantizer.scale'

error when running generate.py under alpaca-qlora

It shows the following error:

Traceback (most recent call last):
  File "/home/missa/dev/Sparsebit/large_language_models/alpaca-qlora/generate.py", line 29, in <module>
    model = PeftQModel.from_pretrained(
  File "/home/missa/miniconda3/envs/sparsebitv6/lib/python3.9/site-packages/peft/peft_model.py", line 135, in from_pretrained
    config = PEFT_TYPE_TO_CONFIG_MAPPING[PeftConfig.from_pretrained(model_id).peft_type].from_pretrained(model_id)
  File "/home/missa/miniconda3/envs/sparsebitv6/lib/python3.9/site-packages/peft/utils/config.py", line 95, in from_pretrained
    if os.path.isfile(os.path.join(pretrained_model_name_or_path, CONFIG_NAME)):
  File "/home/missa/miniconda3/envs/sparsebitv6/lib/python3.9/posixpath.py", line 76, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

I see CHECKPOINT_PATH = None in generate.py, is this expected?

PTQ for swin transformer

code

 B = int(windows.shape[0] / (H * W / window_size / window_size))

error

TypeError: int() argument must be a string, a bytes-like object or a number, not 'Proxy'
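
For what it's worth, a small stand-alone sketch of one workaround (assuming H, W and window_size are plain Python ints, as in swin): keep the expression as floor division so it stays traceable, instead of calling int() on a Proxy. The module M and the shapes below are made up for illustration:

import torch
import torch.fx

def window_batch(windows, window_size, H, W):
    # floor division is recorded as a graph op under tracing; int(Proxy) is not allowed
    return windows.shape[0] // (H * W // window_size // window_size)

class M(torch.nn.Module):
    def forward(self, windows):
        B = window_batch(windows, 7, 56, 56)
        return windows.reshape(B, -1)

traced = torch.fx.symbolic_trace(M())  # traces without the TypeError
print(traced(torch.randn(4 * 64, 7, 7, 96)).shape)  # torch.Size([4, 301056])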

English Docs

Thank you for this great share!

Do you have any plans to add English Readme/Docs?

ONNX cannot be loaded by TensorRT 8.2.5 & 8003

For homework Q4 (export an ONNX file and convert it to a TRT engine), I ran trtexec --workspace=4096 --int8 --onnx=./qresnet18.onnx with TensorRT 8.2.5 and 8.0.03.
But I encountered an unsupported-op error, as follows:
[07/30/2022-08:13:33] [I] TensorRT version: 8003
[07/30/2022-08:13:33] [I] [TRT] [MemUsageChange] Init CUDA: CPU +250, GPU +0, now: CPU 257, GPU 482 (MiB)
[07/30/2022-08:13:33] [I] Start parsing network model
[07/30/2022-08:13:33] [I] [TRT] ----------------------------------------------------------------
[07/30/2022-08:13:33] [I] [TRT] Input filename: ./qresnet18.onnx
[07/30/2022-08:13:33] [I] [TRT] ONNX IR version: 0.0.7
[07/30/2022-08:13:33] [I] [TRT] Opset version: 13
[07/30/2022-08:13:33] [I] [TRT] Producer name: pytorch
[07/30/2022-08:13:33] [I] [TRT] Producer version: 1.12.0
[07/30/2022-08:13:33] [I] [TRT] Domain:
[07/30/2022-08:13:33] [I] [TRT] Model version: 0
[07/30/2022-08:13:33] [I] [TRT] Doc string:
[07/30/2022-08:13:33] [I] [TRT] ----------------------------------------------------------------
[07/30/2022-08:13:33] [E] Error[3]: onnx::QuantizeLinear_710: invalid weights type of Int8
[07/30/2022-08:13:33] [E] [TRT] ModelImporter.cpp:720: While parsing node number 0 [Identity -> "onnx::QuantizeLinear_872"]:
[07/30/2022-08:13:33] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[07/30/2022-08:13:33] [E] [TRT] ModelImporter.cpp:722: input: "onnx::QuantizeLinear_710" output: "onnx::QuantizeLinear_872" name: "Identity_0" op_type: "Identity"

Do I need to add some other settings?

torchvision mobilenet_v2 reports an error when exporting 4w4f ONNX

  • Export code
import torchvision
import torch

from sparsebit.quantization import QuantModel, parse_qconfig


qconfig_path = "./qconfig_lsq.yaml"

# BACKEND: virtual
# W:
#   QSCHEME: per-channel-symmetric
#   QUANTIZER: 
#     TYPE: lsq
#     BIT: 4
# A:
#   QSCHEME: per-tensor-affine
#   QUANTIZER:
#     TYPE: lsq
#     BIT: 4
#   QADD:
#     ENABLE_QUANT: true

model = torchvision.models.mobilenet_v2(pretrained=True)
qconfig = parse_qconfig(qconfig_path)
qmodel = QuantModel(model, config=qconfig)
qmodel.eval()
inp = torch.randn(2, 3, 224, 224)
out = qmodel(inp)
print(out.shape)
with torch.no_grad():
    qmodel.export_onnx(
        inp, name="mobilenet_v2_4w4f.onnx", extra_info=True
    )
  • Error message
Traceback (most recent call last):
  File "dump_onnx.py", line 17, in <module>
    qmodel.export_onnx(
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 256, in export_onnx
    self.add_extra_info_to_onnx(name)
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 294, in add_extra_info_to_onnx
    onnx_op = onnx_model.graph.node[op_pos]
IndexError: list index (914) out of range

Dimension mismatch when exporting a QAT model to ONNX

This issue can be easily reproduced with the QAT example, with quant min/max disabled in the PyTorch ONNX operator. The error message comes from quantizers/quant_tensor.py and says that the dimensions of scale and zero_point are inconsistent with the input tensor. I checked the scale and zero_point shapes of the first-layer convolution and got [3136, 1, 1, 1], where 3136 = 64*7*7, instead of [64, 1, 1, 1]...
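
For reference, a small stand-alone illustration (plain PyTorch, not Sparsebit code) of the shape a per-output-channel scale should have for a ResNet-style first convolution: the weight is [64, 3, 7, 7], so per-channel quantization needs 64 scale values (broadcast as [64, 1, 1, 1] in the ONNX form), not 3136 = 64*7*7:

import torch

weight = torch.randn(64, 3, 7, 7)  # [out_ch, in_ch, kH, kW]
scale = weight.abs().amax(dim=(1, 2, 3)) / 127.0  # one scale per output channel -> [64]
zero_point = torch.zeros(64, dtype=torch.int32)

q = torch.fake_quantize_per_channel_affine(
    weight, scale, zero_point, axis=0, quant_min=-128, quant_max=127
)
print(scale.shape, q.shape)  # torch.Size([64]) torch.Size([64, 3, 7, 7])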

KeyError: 'onnx::QuantizeLinear_711'

File "main.py", line 282, in
main()
File "main.py", line 148, in main
qmodel.export_onnx(
File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quant_model.py", line 260, in export_onnx
self.add_extra_info_to_onnx(name)
File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quant_model.py", line 304, in add_extra_info_to_onnx
input_dequant = nodes[tensor_inputs[onnx_op.input[0]][0]]
KeyError: 'onnx::QuantizeLinear_711'

An error about "x_dq = self._forward(x, scale, zero_point)"

An error occurs when I run cifar10_qat_pact/main.py, as follows:

Traceback (most recent call last):
  File "/root/miniconda3/envs/sb/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/sb/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
 ..........
  File "/root/Sparsebit/examples/cifar10_qat_pact/main.py", line 311, in <module>
    train(
  File "/root/Sparsebit/examples/cifar10_qat_pact/main.py", line 149, in train
    output = model(images)
  File "/root/miniconda3/envs/sb/lib/python3.8/site-packages/torch-1.11.0-py3.8-linux-x86_64.egg/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/Sparsebit/sparsebit/quantization/quant_model.py", line 198, in forward
    return self.model.forward(*args)
  File "<eval_with_key>.129", line 8, in forward
  File "/root/miniconda3/envs/sb/lib/python3.8/site-packages/torch-1.11.0-py3.8-linux-x86_64.egg/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/Sparsebit/sparsebit/quantization/modules/conv.py", line 39, in forward
    x_in = self.input_quantizer(x_in)
  File "/root/miniconda3/envs/sb/lib/python3.8/site-packages/torch-1.11.0-py3.8-linux-x86_64.egg/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/Sparsebit/sparsebit/quantization/quantizers/base.py", line 54, in forward
    x_dq = self._forward(x, scale, zero_point)
TypeError: _forward() takes 2 positional arguments but 4 were given

I found that PACT's def _forward(self, x): accepts only two parameters, while DoReFa's def _forward(self, x, scale, zero_point): accepts four; in fact, DoReFa also only needs two. So I think there are two solutions:

  • either: Change def _forward(self, x): of pact to def _forward(self, x, scale, zero_point):

  • or: Change def _forward(self, x, scale, zero_point): of dorefa to def _forward(self, x):, and change x_dq = self._forward(x, scale, zero_point) to:

if self.TYPE == "PACT" or self.TYPE == "DoReFa":
    x_dq = self._forward(x)
else:
    x_dq = self._forward(x, scale, zero_point)

The error will be solved!
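
A minimal sketch of the first option, assuming (as noted above) that PACT does not actually need scale / zero_point inside _forward. The toy class below is not the Sparsebit implementation; it only shows that accepting and ignoring the extra arguments keeps the uniform call x_dq = self._forward(x, scale, zero_point) working:

import torch

class ToyPACTQuantizer(torch.nn.Module):
    def __init__(self, bit=4):
        super().__init__()
        self.alpha = torch.nn.Parameter(torch.tensor(6.0))  # learnable clipping value
        self.levels = 2 ** bit - 1

    def _forward(self, x, scale=None, zero_point=None):
        # scale / zero_point are accepted only to match the base-class call;
        # PACT derives its own step size from alpha
        x = torch.clamp(x, min=0.0)
        x = torch.min(x, self.alpha)
        step = self.alpha / self.levels
        return torch.round(x / step) * step  # no STE here, just a toy

q = ToyPACTQuantizer()
print(q._forward(torch.randn(4), None, None))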

errors using qViT.onnx to do inference

Without any modification, I used your PTQ code to export DeiT ONNX models, but an error occurs when using onnxruntime to run inference on the ONNX model.

onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Error in Node:QuantizeLinear_2 : No Op registered for QuantizeLinear with domain_version of 13
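
A quick diagnostic that may help (assumption: this failure usually means the installed onnxruntime build predates opset-13 QuantizeLinear, since the exported model targets opset 13; the file name below is a placeholder): print the model's opsets and the runtime version before building the session.

import onnx
import onnxruntime as ort

model = onnx.load("qdeit.onnx")  # placeholder path for the exported model
print("model opsets:", [(imp.domain, imp.version) for imp in model.opset_import])
print("onnxruntime version:", ort.__version__)

sess = ort.InferenceSession("qdeit.onnx")  # fails on builds without opset-13 QuantizeLinear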

Import Error

The code hangs at "from sparsebit.quantization import QuantModel, parse_qconfig", as shown in the figure:
[screenshot: 2022-07-31 21:39:38]

RuntimeError: Ninja is required to load C++ extensions

/opt/python3.8.6/bin/python /home/hongyang/codebase/quantization_code/Sparsebit/examples/quantization_aware_training/cifar10/basecase/main.py
Traceback (most recent call last):
  File "/opt/python3.8.6/bin/ninja", line 33, in <module>
    sys.exit(load_entry_point('ninja', 'console_scripts', 'ninja')())
  File "/opt/python3.8.6/lib/python3.8/site-packages/ninja-1.11.1-py3.8-linux-x86_64.egg/ninja/__init__.py", line 51, in ninja
    raise SystemExit(_program('ninja', sys.argv[1:]))
  File "/opt/python3.8.6/lib/python3.8/site-packages/ninja-1.11.1-py3.8-linux-x86_64.egg/ninja/__init__.py", line 47, in _program
    return subprocess.call([os.path.join(BIN_DIR, name)] + args, close_fds=False)
  File "/opt/python3.8.6/lib/python3.8/subprocess.py", line 340, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/opt/python3.8.6/lib/python3.8/subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/python3.8.6/lib/python3.8/subprocess.py", line 1592, in _execute_child
    self._posix_spawn(args, executable, env, restore_signals,
  File "/opt/python3.8.6/lib/python3.8/subprocess.py", line 1543, in _posix_spawn
    self.pid = os.posix_spawn(executable, args, env, **kwargs)
PermissionError: [Errno 13] Permission denied: '/opt/python3.8.6/lib/python3.8/site-packages/ninja-1.11.1-py3.8-linux-x86_64.egg/ninja/data/bin/ninja'
Traceback (most recent call last):
  File "/home/hongyang/codebase/quantization_code/Sparsebit/examples/quantization_aware_training/cifar10/basecase/main.py", line 23, in <module>
    from sparsebit.quantization import QuantModel, parse_qconfig
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/__init__.py", line 1, in <module>
    from .quant_model import *
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quant_model.py", line 18, in <module>
    from sparsebit.quantization.modules import *
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/modules/__init__.py", line 17, in <module>
    from .base import QuantOpr, MultipleInputsQuantOpr
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/modules/base.py", line 4, in <module>
    from sparsebit.quantization.quantizers import build_quantizer
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quantizers/__init__.py", line 9, in <module>
    from .base import Quantizer
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quantizers/base.py", line 4, in <module>
    from sparsebit.quantization.observers import build_observer
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/observers/__init__.py", line 10, in <module>
    from . import minmax, percentile, mse, moving_average, kl_histogram, aciq
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/observers/mse.py", line 6, in <module>
    from sparsebit.quantization.quantizers.quant_tensor import STE
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quantizers/quant_tensor.py", line 13, in <module>
    fake_quant_kernel = load(
  File "/opt/python3.8.6/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/opt/python3.8.6/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/opt/python3.8.6/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1506, in _write_ninja_file_and_build_library
    verify_ninja_availability()
  File "/opt/python3.8.6/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1562, in verify_ninja_availability
    raise RuntimeError("Ninja is required to load C++ extensions")
RuntimeError: Ninja is required to load C++ extensions
