Comments (5)
I found that even without the scale file, the quantized static graph can still be exported to an ONNX model file with this command:
paddle2onnx --model_dir ./ --model_filename model.pdmodel --params_filename model.pdiparams --save_file quant_model.onnx --opset_version 13 --enable_dev_version True --deploy_backend onnxruntime --enable_onnx_checker True
But the accuracy loss is far too large!
So I tried switching to this quantization method:
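For intuition about why a missing per-tensor scale causes this kind of distortion, here is a toy sketch of symmetric INT8 quantization in plain Python (no Paddle involved; the activation values and the "uncalibrated" scale of 1.0 are made up for illustration). A scale calibrated to the observed range keeps the round-trip error small; an arbitrary scale does not:

```python
# Toy per-tensor symmetric INT8 quantization (illustration only).
def quantize_int8(values, scale):
    # Map each float onto the int8 grid and clamp to [-128, 127].
    return [max(-128, min(127, round(v / scale))) for v in values]

def dequantize(q, scale):
    return [x * scale for x in q]

activations = [0.02, -0.5, 1.3, 2.7, -3.1]  # made-up calibration batch

scales = {
    "calibrated": max(abs(v) for v in activations) / 127,  # max|x| / 127
    "uncalibrated": 1.0,  # arbitrary default, stands in for a missing scale
}

errs = {}
for name, scale in scales.items():
    deq = dequantize(quantize_int8(activations, scale), scale)
    errs[name] = max(abs(a - b) for a, b in zip(activations, deq))
    print(f"{name}: max abs round-trip error = {errs[name]:.4f}")
```

With the calibrated scale the worst-case error stays around 1% of the range; with the arbitrary scale small values collapse to the nearest integer and the error explodes.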
quant_recon_static(
    executor=exe,
    model_dir=args__.model_path,
    quantize_model_path=args__.save_path + 'quant_recon_static/',
    data_loader=data_loader,
    model_filename=args__.model_filename,
    params_filename=args__.params_filename,
    batch_size=32,
    batch_nums=256,
    region_weights_names=None,
    onnx_format=args__.onnx_format,
    recon_level='region-wise',
    is_full_quantize=args__.is_full_quantize,
    bias_correction=args__.bias_correction,
)
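For context, quant_recon_static (like quant_post_static) pulls its calibration batches from the data_loader argument. Below is a minimal stand-in generator in pure Python with random data; the feed name "inputs" and the [batch, 3, 224, 224] shape are assumptions for illustration, and depending on the PaddleSlim version the loader may need to yield lists instead of dicts — in practice it must match your model's real feed names, shapes, and data:

```python
import random

def make_data_loader(batch_size=32, batch_nums=256, c=3, h=224, w=224):
    # Builds a generator that yields one feed dict per calibration batch.
    # "inputs" is a hypothetical feed name; real calibration data should
    # replace the random values below.
    def data_loader():
        for _ in range(batch_nums):
            batch = [[[[random.random() for _ in range(w)]
                       for _ in range(h)]
                      for _ in range(c)]
                     for _ in range(batch_size)]
            yield {"inputs": batch}
    return data_loader

# Small shapes here just to show the structure of one batch.
loader = make_data_loader(batch_size=2, batch_nums=1, h=4, w=4)
first = next(loader())
print(len(first["inputs"]))  # number of samples in the batch
```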
The docs say this method is time-consuming, but it turned out to be not just slow — it never finishes at all. It has been running for 4 days with the default of 20 epochs; after completing the 20 epochs, the program starts over from 0 and repeats endlessly. On the fifth day it crashed with an error:
At this point I just want to cry!
Hi. First, about the scale file you mentioned: after compressing a static-graph model with PaddleSlim, you only get two files, model.pdmodel and model.pdiparams. The scale file you are describing is the calibration.cache file, which is generated when deploying with TensorRT on GPU via: paddle2onnx --model_dir ./ --model_filename model.pdmodel --params_filename model.pdiparams --save_file float_model.onnx --opset_version 13 --enable_dev_version True --deploy_backend tensorrt --enable_onnx_checker True
As for the accuracy loss: quant_post_static and quant_recon_static are older APIs and may have problems with some models. We recommend switching to PaddleSlim's new auto-compression API; see https://github.com/PaddlePaddle/PaddleSlim/tree/develop/example/auto_compression
It covers both post-training quantization and quantization-aware training, and static-graph models are a good fit for it.
Here is a simple example of the API:
from paddleslim.auto_compression import AutoCompression

ac = AutoCompression(
    model_dir="./MobileNetV1_infer",
    model_filename="inference.pdmodel",
    params_filename="inference.pdiparams",
    save_dir="MobileNetV1_quant",
    config={"QuantPost": {}, "HyperParameterOptimization": {'ptq_algo': ['avg'], 'max_quant_count': 3}},
    train_dataloader=train_loader,
    eval_dataloader=train_loader)
ac.compress()
One more reminder: set the onnx_format parameter to True, so that the exported pdmodel uses Paddle's new format and is easier to convert to an ONNX file later.
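Applied to the AutoCompression example above, that means putting the flag into the QuantPost section of the config (assuming, as in PaddleSlim's ACT examples, that QuantPost accepts an onnx_format key); a minimal sketch:

```python
# ACT config sketch with onnx_format enabled, so the saved pdmodel
# uses Paddle's new quantization format and converts cleanly to ONNX.
config = {
    "QuantPost": {"onnx_format": True},
    "HyperParameterOptimization": {"ptq_algo": ["avg"], "max_quant_count": 3},
}
print(sorted(config["QuantPost"]))  # keys present in the QuantPost section
```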
If the problem is solved, please close this issue. Thanks!
Related Issues (20)
- How should the config file be set when auto-compression uses multiple distillation losses? HOT 2
- In the ACT auto-compression pytorch_yolo example, Python ONNX --> TensorRT INT8 inference results are abnormal HOT 1
- AttributeError: module 'paddleslim' has no attribute 'models' HOT 1
- Configuring the pruner shows 0 collections
- How can I use sensitive to determine the pruning ratio for yolov3 mobilenetv3? HOT 9
- Does the newly updated object-detection post-training quantization example support rotated boxes, e.g. ppyoloe-r? HOT 4
- [Bug] TypeError: 'float' object is not iterable HOT 1
- Any plans to add a compression example for face-detection models? HOT 3
- What could be the cause of this situation? # auto-compression autoCompression HOT 4
- After dynamically pruning a model with paddleslim, how do I save the model?
- The accuracy of the auto-compressed RT-DETR model provided in the docs is very low HOT 5
- How to pin the quantization parameters of the Softmax op HOT 5
- Error when running run.py during auto-compression of yolov8-s HOT 3
- Error: var_tensor.shape[0], tuple index out of range HOT 1
- How to run inference with OpenVINO on a paddleslim-quantized model HOT 1
- Error during auto-compression of rtdetr, as follows HOT 2
- rtdetr nms false HOT 1
- AttributeError HOT 2
- from paddleslim.dygraph.dist has no AdaptorBase to import HOT 12