
sparsebit's Introduction

News

  • 2023.04.27: 🔥 Pipeline parallelism is supported for alpaca-qlora, enabling fine-tuning of llama-65b on 8*2080ti within 13 hours.
  • 2023.04.15: 🔥 We release alpaca-qlora, which reduces the model's GPU-memory footprint by about half compared with alpaca-lora. With alpaca-qlora, you can instruction fine-tune llama-7b/13b on a single 2080ti.
  • 2023.03.20: 🔥 We implemented a GPTQ CUDA kernel with a groupsize feature and added --single_device_mode so that all quantized LLaMAs can run on a single GPU (e.g. 2080ti). GPTQ for LLaMA.
  • 2023.03.08: Release a mixed-precision quantization method based on GPTQ for LLaMA.
  • 2023.02.23: Release a PTQ example of GPT2 on WikiText2.
  • 2022.12.26: Release a QAT example of BEVDet4D.
  • 2022.12.14: Release a QAT example of BEVDepth.
  • 2022.12.13: Release some examples of BERT.
  • 2022.11.24: Release a QAT example of BEVDet.

Introduction

Sparsebit is a toolkit with pruning and quantization capabilities. It is designed to help researchers compress and accelerate neural network models by modifying only a few lines of code in an existing PyTorch project.

Quantization

Quantization converts full-precision parameters into low-bit parameters, which can compress and accelerate a model without changing its structure. This toolkit supports the two common quantization paradigms, Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT), with the following features (a minimal usage sketch follows the list):

  • Benefiting from the support of torch.fx, Sparsebit operates on a QuantModel, and each operation becomes a QuantModule.
  • Sparsebit can easily be extended by users to accommodate their own research. Users can register their own implementations of key objects such as QuantModule, Quantizer, and Observer.
  • Exporting QDQ-ONNX is supported, which can be loaded and deployed by backends such as TensorRT and OnnxRuntime.
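
The snippet below is a minimal usage sketch assembled from the example code quoted in the issues later on this page; the qconfig path, model choice, and export arguments are placeholders and may differ between releases.

import torch
import torchvision

from sparsebit.quantization import QuantModel, parse_qconfig

# Minimal sketch (placeholder config path and model): parse a YAML qconfig describing
# weight/activation bit-widths and quantizer types, wrap the model, run it, export QDQ-ONNX.
qconfig = parse_qconfig("qconfig.yaml")
model = torchvision.models.resnet18(pretrained=True)
qmodel = QuantModel(model, config=qconfig)  # traced with torch.fx; operations become QuantModules
qmodel.eval()

dummy_input = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    out = qmodel(dummy_input)               # in practice, run calibration data through the model
    qmodel.export_onnx(dummy_input, name="qresnet18.onnx")  # QDQ-ONNX for TensorRT / OnnxRuntime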

Results

  • PTQ results on ImageNet-1k: link
  • PTQ results of Vision Transformer on ImageNet-1k: link
  • PTQ results of YOLO related works on COCO: link
  • QAT results on ImageNet-1k: link

Sparse

In deep learning, "sparse" usually refers to reducing network parameters or network computation. At present, the sparsity support in this toolkit has the following characteristics (a toy illustration of the L1-norm criterion follows the list):

  • Supports two types of pruning: structured/unstructured;
  • Supports a variety of operation objects including: weights, activations, model-blocks, model-layers, etc.;
  • Supports multiple pruning algorithms: L1-norm/L0-norm/Fisher-pruning/Hrank/Slimming...
  • Users can easily extend a custom pruning algorithm by defining a Sparser;
  • Uses ONNX as the export format for pruned models.
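
As a toy illustration of the L1-norm criterion mentioned above (plain PyTorch, not Sparsebit's actual Sparser API):

import torch.nn as nn

# Score each output filter of a conv layer by the L1 norm of its weights and keep the
# largest half; structured pruning would then remove the unselected filters.
conv = nn.Conv2d(16, 32, kernel_size=3)
l1_per_filter = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # one score per output channel
k = conv.out_channels // 2
keep_idx = l1_per_filter.topk(k).indices  # indices of the filters to keep
print("keeping filters:", sorted(keep_idx.tolist()))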

Resources

Documentations

Detailed usage and development guidance can be found in the documentation. Refer to: docs

CV-Master

  • We maintain a public course on quantization at Bilibili, introducing the basics of quantization and our latest work. Interested users can join the course: video
  • To help users better understand and apply model-compression techniques, we designed homework assignments based on Sparsebit. Interested users can complete them on their own: quantization_homework

Plan to re-implement

Join Us

  • You are welcome to join our team (as a member or an intern) if you are interested in Quantization, Pruning, Distillation, Self-Supervised Learning, or Model Deployment.
  • Submit your resume to: [email protected]

Acknowledgement

Sparsebit was inspired by several open source projects. We are grateful for these excellent projects and list them as follows:

License

Sparsebit is released under the Apache 2.0 license.

sparsebit's People

Contributors

cnbeining, hediw, hych2020, jiang-stan, jixiege, lz02k, peiqinsun, pingguanhua, wesleysanjose, work-zhangzhe, zhber, zhiqwang, zsc

sparsebit's Issues

Dimension mismatch when QAT model export to onnx

This issue can be easily reproduced with the QAT example, with quant min/max disabled in the PyTorch ONNX operator. The error message comes from quantizers/quant_tensor.py, saying the dimensions of scale and zero_point are inconsistent with the input tensor. I checked the scale and zero_point shapes of the first convolution layer and got [3136, 1, 1, 1], where 3136 = 64*7*7, instead of [64, 1, 1, 1]...

QAT cifar10 example error

Running https://github.com/megvii-research/Sparsebit/blob/main/examples/quantization_aware_training/cifar10/basecase/main.py raises an error.

  • Command
python3 main.py qconfig_lsq.yaml --epochs=0
  • Error
Traceback (most recent call last):
  File "main.py", line 428, in <module>
    main()
  File "main.py", line 219, in main
    qmodel.export_onnx(
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 254, in export_onnx
    self.add_extra_info_to_onnx(name)
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 298, in add_extra_info_to_onnx
    input_dequant = nodes[tensor_inputs[onnx_op.input[0]][0]]
KeyError: 'conv1.weight_quantizer.scale'

PTQ for swin transformer

code

 B = int(windows.shape[0] / (H * W / window_size / window_size))

error

TypeError: int() argument must be a string, a bytes-like object or a number, not 'Proxy'
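
A possible workaround sketch (my assumption, not from the repo): drop the int() call and use floor division, which keeps the value symbolic when windows.shape[0] is a torch.fx Proxy, assuming H, W, and window_size are plain Python ints at trace time.

# hypothetical rewrite of the failing line; no int() conversion of the Proxy is needed
B = windows.shape[0] // (H * W // window_size // window_size)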

KeyError: 'onnx::QuantizeLinear_711'

File "main.py", line 282, in
main()
File "main.py", line 148, in main
qmodel.export_onnx(
File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quant_model.py", line 260, in export_onnx
self.add_extra_info_to_onnx(name)
File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quant_model.py", line 304, in add_extra_info_to_onnx
input_dequant = nodes[tensor_inputs[onnx_op.input[0]][0]]
KeyError: 'onnx::QuantizeLinear_711'

I'm very interested in your 2080ti pipeline

This looks like a low-cost setup. For training llama-7b, is a 2080ti strictly required? It has 11GB of memory; would 8 cards with 8GB each also work? Could you share the full configuration of your machine?

Error when running generate.py under alpaca-qlora

It shows the following error:

Traceback (most recent call last):
  File "/home/missa/dev/Sparsebit/large_language_models/alpaca-qlora/generate.py", line 29, in <module>
    model = PeftQModel.from_pretrained(
  File "/home/missa/miniconda3/envs/sparsebitv6/lib/python3.9/site-packages/peft/peft_model.py", line 135, in from_pretrained
    config = PEFT_TYPE_TO_CONFIG_MAPPING[PeftConfig.from_pretrained(model_id).peft_type].from_pretrained(model_id)
  File "/home/missa/miniconda3/envs/sparsebitv6/lib/python3.9/site-packages/peft/utils/config.py", line 95, in from_pretrained
    if os.path.isfile(os.path.join(pretrained_model_name_or_path, CONFIG_NAME)):
  File "/home/missa/miniconda3/envs/sparsebitv6/lib/python3.9/posixpath.py", line 76, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

I see CHECKPOINT_PATH = None in generate.py, is this expected?

torchvision densenet121 cannot be converted to a Sparsebit QuantModel

Running the following code raises an error:

import torchvision
import torch

from sparsebit.quantization import QuantModel, parse_qconfig


qconfig_path = "./qconfig.yaml"
# BACKEND: virtual
# W:
#   QSCHEME: per-channel-symmetric
#   QUANTIZER: 
#     TYPE: lsq
#     BIT: 4
# A:
#   QSCHEME: per-tensor-affine
#   QUANTIZER:
#     TYPE: lsq
#     BIT: 4
#   QADD:
#     ENABLE_QUANT: true

model = torchvision.models.densenet121(pretrained=True)
qconfig = parse_qconfig(qconfig_path)
model = QuantModel(model, config=qconfig)
inp = torch.randn(2, 3, 224, 224)
out = model(inp)

errors using qViT.onnx to do inference

Without any modification, I used your PTQ code to export DeiT ONNX models, but an error occurs when running inference on the exported ONNX model with onnxruntime.

onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Error in Node:QuantizeLinear_2 : No Op registered for QuantizeLinear with domain_version of 13

torchvision mobilenet_v2 4w4f ONNX export error

  • Export code
import torchvision
import torch

from sparsebit.quantization import QuantModel, parse_qconfig


qconfig_path = "./qconfig_lsq.yaml"

# BACKEND: virtual
# W:
#   QSCHEME: per-channel-symmetric
#   QUANTIZER: 
#     TYPE: lsq
#     BIT: 4
# A:
#   QSCHEME: per-tensor-affine
#   QUANTIZER:
#     TYPE: lsq
#     BIT: 4
#   QADD:
#     ENABLE_QUANT: true

model = torchvision.models.mobilenet_v2(pretrained=True)
qconfig = parse_qconfig(qconfig_path)
qmodel = QuantModel(model, config=qconfig)
qmodel.eval()
inp = torch.randn(2, 3, 224, 224)
out = qmodel(inp)
print(out.shape)
with torch.no_grad():
    qmodel.export_onnx(
        inp, name="mobilenet_v2_4w4f.onnx", extra_info=True
    )
  • Error message
Traceback (most recent call last):
  File "dump_onnx.py", line 17, in <module>
    qmodel.export_onnx(
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 256, in export_onnx
    self.add_extra_info_to_onnx(name)
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 294, in add_extra_info_to_onnx
    onnx_op = onnx_model.graph.node[op_pos]
IndexError: list index (914) out of range

English Docs

Thank you for this great share!

Do you have any plans to add English Readme/Docs?

Update quantizer preprocessing when exporting ONNX

  1. Quantizers such as dorefa apply extra operations to the quantization parameters (scale / zero_point) during forward, so the weight, scale, and zero_point written by export onnx are not the parameters actually used during forward, which causes the exported ONNX to run incorrectly.
  2. Please add a preprocessing step so that the two stay consistent.

How could I reproduce the results of QAT_DeiT on ImageNet?

I rewrote the related code in main.py as follows:

    # set head and tail of model is 8bit
    model.model.patch_embed_proj.weight_quantizer.set_bit(bit=8)
    model.model.head.input_quantizer.set_bit(bit=8)
    model.model.head.weight_quantizer.set_bit(bit=8)
    # model.model.conv1.weight_quantizer.set_bit(bit=8)
    # model.model.fc.input_quantizer.set_bit(bit=8)
    # model.model.fc.weight_quantizer.set_bit(bit=8)

and trained for 90 epochs, but I can't get the result reported in the readme.

An error about "x_dq = self._forward(x, scale, zero_point) "

The following error is raised when I run cifar10_qat_pact/main.py:

Traceback (most recent call last):
  File "/root/miniconda3/envs/sb/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/sb/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
 ..........
  File "/root/Sparsebit/examples/cifar10_qat_pact/main.py", line 311, in <module>
    train(
  File "/root/Sparsebit/examples/cifar10_qat_pact/main.py", line 149, in train
    output = model(images)
  File "/root/miniconda3/envs/sb/lib/python3.8/site-packages/torch-1.11.0-py3.8-linux-x86_64.egg/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/Sparsebit/sparsebit/quantization/quant_model.py", line 198, in forward
    return self.model.forward(*args)
  File "<eval_with_key>.129", line 8, in forward
  File "/root/miniconda3/envs/sb/lib/python3.8/site-packages/torch-1.11.0-py3.8-linux-x86_64.egg/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/Sparsebit/sparsebit/quantization/modules/conv.py", line 39, in forward
    x_in = self.input_quantizer(x_in)
  File "/root/miniconda3/envs/sb/lib/python3.8/site-packages/torch-1.11.0-py3.8-linux-x86_64.egg/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/Sparsebit/sparsebit/quantization/quantizers/base.py", line 54, in forward
    x_dq = self._forward(x, scale, zero_point)
TypeError: _forward() takes 2 positional arguments but 4 were given

I found that PACT's def _forward(self, x): accepts only two parameters, while DoReFa's def _forward(self, x, scale, zero_point): accepts four, and in fact DoReFa only needs two of them as well. So I think there are two solutions:

  • either: Change def _forward(self, x): of pact to def _forward(self, x, scale, zero_point):

  • or: Change def _forward(self, x, scale, zero_point): of dorefa to def _forward(self, x):, and change x_dq = self._forward(x, scale, zero_point) to:

if self.TYPE == "PACT" or self.TYPE == "DoReFa":
    x_dq = self._forward(x)
else:
    x_dq = self._forward(x, scale, zero_point)

The error will be solved!
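
A third, signature-tolerant option could look like the sketch below (hypothetical, not code from the repo): dispatch on the number of parameters _forward actually accepts instead of hard-coding quantizer names.

import inspect

# on a bound method the signature excludes self, so PACT-style _forward(self, x) reports 1 parameter
n_params = len(inspect.signature(self._forward).parameters)
x_dq = self._forward(x) if n_params == 1 else self._forward(x, scale, zero_point)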

error in homework Q3 code.

In the homework answer, Q3 code:
data.transpose(self.qdesc.ch_axis, 0)
Here, 0 should be 1: according to the code, the first axis (0) is the calibration batch size and the second axis (1) is the channel axis.

Also, self.qdesc.ch_axis == 0 means the observer is for weights, which should not be handled here.
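
For reference, the fix described above as a single line (same context as the snippet quoted from the answer):

data = data.transpose(self.qdesc.ch_axis, 1)  # keep axis 0 as the calibration batch dimension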

RuntimeError: Ninja is required to load C++ extensions

/opt/python3.8.6/bin/python /home/hongyang/codebase/quantization_code/Sparsebit/examples/quantization_aware_training/cifar10/basecase/main.py
Traceback (most recent call last):
  File "/opt/python3.8.6/bin/ninja", line 33, in <module>
    sys.exit(load_entry_point('ninja', 'console_scripts', 'ninja')())
  File "/opt/python3.8.6/lib/python3.8/site-packages/ninja-1.11.1-py3.8-linux-x86_64.egg/ninja/__init__.py", line 51, in ninja
    raise SystemExit(_program('ninja', sys.argv[1:]))
  File "/opt/python3.8.6/lib/python3.8/site-packages/ninja-1.11.1-py3.8-linux-x86_64.egg/ninja/__init__.py", line 47, in _program
    return subprocess.call([os.path.join(BIN_DIR, name)] + args, close_fds=False)
  File "/opt/python3.8.6/lib/python3.8/subprocess.py", line 340, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/opt/python3.8.6/lib/python3.8/subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/python3.8.6/lib/python3.8/subprocess.py", line 1592, in _execute_child
    self._posix_spawn(args, executable, env, restore_signals,
  File "/opt/python3.8.6/lib/python3.8/subprocess.py", line 1543, in _posix_spawn
    self.pid = os.posix_spawn(executable, args, env, **kwargs)
PermissionError: [Errno 13] Permission denied: '/opt/python3.8.6/lib/python3.8/site-packages/ninja-1.11.1-py3.8-linux-x86_64.egg/ninja/data/bin/ninja'
Traceback (most recent call last):
  File "/home/hongyang/codebase/quantization_code/Sparsebit/examples/quantization_aware_training/cifar10/basecase/main.py", line 23, in <module>
    from sparsebit.quantization import QuantModel, parse_qconfig
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/__init__.py", line 1, in <module>
    from .quant_model import *
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quant_model.py", line 18, in <module>
    from sparsebit.quantization.modules import *
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/modules/__init__.py", line 17, in <module>
    from .base import QuantOpr, MultipleInputsQuantOpr
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/modules/base.py", line 4, in <module>
    from sparsebit.quantization.quantizers import build_quantizer
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quantizers/__init__.py", line 9, in <module>
    from .base import Quantizer
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quantizers/base.py", line 4, in <module>
    from sparsebit.quantization.observers import build_observer
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/observers/__init__.py", line 10, in <module>
    from . import minmax, percentile, mse, moving_average, kl_histogram, aciq
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/observers/mse.py", line 6, in <module>
    from sparsebit.quantization.quantizers.quant_tensor import STE
  File "/home/hongyang/codebase/quantization_code/Sparsebit/sparsebit/quantization/quantizers/quant_tensor.py", line 13, in <module>
    fake_quant_kernel = load(
  File "/opt/python3.8.6/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/opt/python3.8.6/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/opt/python3.8.6/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1506, in _write_ninja_file_and_build_library
    verify_ninja_availability()
  File "/opt/python3.8.6/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1562, in verify_ninja_availability
    raise RuntimeError("Ninja is required to load C++ extensions")
RuntimeError: Ninja is required to load C++ extensions

ONNX cannot be loaded by TensorRT 8.2.5 & 8.0.03

For homework Q4, I exported the ONNX file and tried to convert it to a TRT engine. I used trtexec --workspace=4096 --int8 --onnx=./qresnet18.onnx with TensorRT 8.2.5 and 8.0.03.
But I encountered an unsupported-op error, as follows:
[07/30/2022-08:13:33] [I] TensorRT version: 8003
[07/30/2022-08:13:33] [I] [TRT] [MemUsageChange] Init CUDA: CPU +250, GPU +0, now: CPU 257, GPU 482 (MiB)
[07/30/2022-08:13:33] [I] Start parsing network model
[07/30/2022-08:13:33] [I] [TRT] ----------------------------------------------------------------
[07/30/2022-08:13:33] [I] [TRT] Input filename: ./qresnet18.onnx
[07/30/2022-08:13:33] [I] [TRT] ONNX IR version: 0.0.7
[07/30/2022-08:13:33] [I] [TRT] Opset version: 13
[07/30/2022-08:13:33] [I] [TRT] Producer name: pytorch
[07/30/2022-08:13:33] [I] [TRT] Producer version: 1.12.0
[07/30/2022-08:13:33] [I] [TRT] Domain:
[07/30/2022-08:13:33] [I] [TRT] Model version: 0
[07/30/2022-08:13:33] [I] [TRT] Doc string:
[07/30/2022-08:13:33] [I] [TRT] ----------------------------------------------------------------
[07/30/2022-08:13:33] [E] Error[3]: onnx::QuantizeLinear_710: invalid weights type of Int8
[07/30/2022-08:13:33] [E] [TRT] ModelImporter.cpp:720: While parsing node number 0 [Identity -> "onnx::QuantizeLinear_872"]:
[07/30/2022-08:13:33] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[07/30/2022-08:13:33] [E] [TRT] ModelImporter.cpp:722: input: "onnx::QuantizeLinear_710" output: "onnx::QuantizeLinear_872" name: "Identity_0" op_type: "Identity"

Do I need to add some other settings?

QAT cifar10 example with QADD quant enabled raises an error

Running https://github.com/megvii-research/Sparsebit/blob/main/examples/quantization_aware_training/cifar10/basecase/main.py raises an error.

  • Command
python3 main.py qconfig_lsq.yaml --epochs=0
  • Content of qconfig_lsq.yaml:
BACKEND: virtual
W:
  QSCHEME: per-channel-symmetric
  QUANTIZER: 
    TYPE: lsq
    BIT: 4
A:
  QSCHEME: per-tensor-affine
  QUANTIZER:
    TYPE: lsq
    BIT: 4
  QADD:
    ENABLE_QUANT: true
  • Error
Traceback (most recent call last):
  File "main.py", line 439, in <module>
    main()
  File "main.py", line 223, in main
    qmodel.export_onnx(
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 257, in export_onnx
    self.add_extra_info_to_onnx(name)
  File "/data/Project/Sparsebit/sparsebit/quantization/quant_model.py", line 313, in add_extra_info_to_onnx
    weight_dequant = nodes[tensor_inputs[onnx_op.input[1]][0]]
IndexError: list index (1) out of range

ImportError

My setting:
cuda 10.2
python=3.8

Install Sparsebit with:

git clone https://github.com/megvii-research/Sparsebit.git
cd sparsebit
python3 setup.py develop --user
pip3 install tensorrt-8.2.5.1-cp38-none-linux_x86_64.whl

After installation:
I ran /root/Sparsebit/examples/cifar10_ptq/main.ipynb, which raised RuntimeError: Ninja is required to load C++ extensions, so I ran pip install Ninja and ran main.ipynb again, which then raised the following error:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/root/Sparsebit/examples/cifar10_ptq/main.ipynb Cell 2 in <cell line: 24>()
     21 import torchvision.datasets as datasets
     22 from model import resnet20
---> 24 from sparsebit.quantization import QuantModel, parse_qconfig

File ~/Sparsebit/sparsebit/quantization/__init__.py:1, in <module>
----> 1 from .quant_model import *
      2 from .quant_config import parse_qconfig

File ~/Sparsebit/sparsebit/quantization/quant_model.py:18, in <module>
     15 import onnx
     17 from sparsebit.utils import update_config
---> 18 from sparsebit.quantization.modules import *
     19 from sparsebit.quantization.observers import Observer
     20 from sparsebit.quantization.quantizers import Quantizer

File ~/Sparsebit/sparsebit/quantization/modules/__init__.py:16, in <module>
     12     return real_register
     15 # 将需要注册的module文件填写至此
---> 16 from .base import QuantOpr, MultipleInputsQuantOpr
     17 from .activations import *
     18 from .conv import *
...
-> 1775     module = importlib.util.module_from_spec(spec)
   1776     assert isinstance(spec.loader, importlib.abc.Loader)
   1777     spec.loader.exec_module(module)

ImportError: /root/Sparsebit/sparsebit/quantization/torch_extensions/build/fake_quant.so: cannot open shared object file: No such file or directory

I examined the build directory and found it empty.

A bug may have been resolved.

There is a bug here; I thought of a simple way to fix it, which applies to QAT of ViT.

elif 'input_quantizer.scale' in dict(_module.state_dict()).keys():
    _module.input_quantizer.set_fake_fused()  # bug: quant_state switches back and forth
else:
    print("no_set_fake_fused:", _user.name, _module.input_quantizer_generated)

How to deal with layers constructed by bool operations?

For example, there is a layer constructed as follows:

while len(emb_out.shape) < len(h.shape):
    emb_out = emb_out[..., None]

When quantizing this layer, we get this error:

torch.fx.proxy.TraceError: symbolically traced variables cannot be used as inputs to control flow
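
One common workaround, sketched below under the assumption that Sparsebit traces the model with the standard torch.fx tracer (expand_emb is a hypothetical helper name): move the shape-dependent loop into a module-level function and mark it as a leaf with torch.fx.wrap, so the tracer records a single call node instead of tracing through the control flow.

import torch.fx

def expand_emb(emb_out, h):
    # runs eagerly at execution time; torch.fx records the call as one opaque node
    while len(emb_out.shape) < len(h.shape):
        emb_out = emb_out[..., None]
    return emb_out

torch.fx.wrap("expand_emb")  # must be called at module level, in the module defining expand_emb

# inside the model's forward():
#     emb_out = expand_emb(emb_out, h)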

Import Error

The code hangs at “from sparsebit.quantization import QuantModel, parse_qconfig”, as shown in the attached screenshot.
