
Comments (9)

jambayk commented on June 10, 2024

io_config is a "config" option only valid for PyTorchModel (https://microsoft.github.io/Olive/api/models.html#pytorch-model). ONNXModel (https://microsoft.github.io/Olive/api/models.html#onnx-model) doesn't have this option.

Could you please share the full config JSON? Do you know if your .onnx model has dynamic or static input shapes?
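
For reference, one way to check is to inspect the graph inputs with the onnx package (a minimal sketch; the model path is a placeholder):

import onnx

# Symbolic dimension names (strings) indicate dynamic shapes; plain integers are static.
model = onnx.load("./models/model.onnx")
for graph_input in model.graph.input:
    dims = [d.dim_param or d.dim_value for d in graph_input.type.tensor_type.shape.dim]
    print(graph_input.name, dims)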


franciskasara commented on June 10, 2024

That did it! Now the dynamically quantized model can be swapped in for my original model. Thank you!


franciskasara commented on June 10, 2024

Sorry, that was absolutely not clear to me from the tutorial/examples.

The model has static input shapes.
My JSON file looks like this:

{ "description" : "Complete my_model_acceleration_description.json used in this quick tour", "input_model":{ "type": "ONNXModel", "config": { "model_path": "./models/model.onnx", "io_config": { "input_names": ["input"], "input_shapes": [[1, 256, 256, 256]], "output_names": ["output"] } } }, "evaluators": { "my_evaluator":{ "metrics":[ { "name": "my_latency_metric", "type": "latency", "sub_types": [{"name": "avg"}] } ] } }, "passes": { "onnx_conversion": { "type": "OnnxModelOptimizer", "config": { "target_opset": 18 } }, "quantization": { "type": "OnnxDynamicQuantization" } }, "engine": { "log_severity_level": 0, "evaluator": "my_evaluator" } }


jambayk commented on June 10, 2024

Thanks for bringing this up. Almost all of our examples start with PyTorch models, so we missed adding docs for initializing with ONNX models. Since your model has static shapes, the io_config is not needed. Otherwise, the shape information would need to be added to the evaluator config.

Can you try the following config:

{
    "description": "Complete my_model_acceleration_description.json used in this quick tour",
    "input_model": {
        "type": "ONNXModel",
        "config": {
            "model_path": "./models/model.onnx"
        }
    },
    "evaluators": {
        "my_evaluator": {
            "metrics": [
                {
                    "name": "my_latency_metric",
                    "type": "latency",
                    "sub_types": [
                        {
                            "name": "avg"
                        }
                    ]
                }
            ]
        }
    },
    "passes": {
        "onnx_conversion": {
            "type": "OnnxModelOptimizer"
        },
        "quantization": {
            "type": "OnnxDynamicQuantization"
        }
    },
    "engine": {
        "log_severity_level": 0,
        "evaluator": "my_evaluator"
    }
}
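
(Assuming you are following the quick tour, the workflow can then be run with something like: python -m olive.workflows.run --config my_model_acceleration_description.json)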


franciskasara commented on June 10, 2024

This ran successfully and saved two models.

  • The output model of OnnxModelOptimizer is bit-for-bit identical to the input model.
  • The output model of OnnxDynamicQuantization cannot be swapped in; it gives an error. (onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\AbiCustomRegistry.cpp(516)\onnxruntime_pybind11_state.pyd!00007FF8EBCD996A: (caller: 00007FF8EBC51CA8) Exception(3) tid(21d0) 80070057 The parameter is incorrect. )

I think a detailed tutorial on how to correctly optimize ONNX models with Olive would be great in the future. Thank you for your help!


Full output:

[2023-11-03 09:08:39,701] [DEBUG] [engine.py:125:setup_accelerators] Initial execution providers: ['DmlExecutionProvider', 'CPUExecutionProvider']
[2023-11-03 09:08:39,701] [WARNING] [accelerator.py:122:infer_accelerators_from_execution_provider] Execution provider CPUExecutionProvider is mapped to multiple accelerators ['cpu', 'gpu']. Olive cannot infer the device which may cause unexpected behavior. Please specify the accelerator in the accelerator configs
[2023-11-03 09:08:39,701] [WARNING] [accelerator.py:122:infer_accelerators_from_execution_provider] Execution provider CPUExecutionProvider is mapped to multiple accelerators ['cpu', 'gpu', 'npu']. Olive cannot infer the device which may cause unexpected behavior. Please specify the accelerator in the accelerator configs
[2023-11-03 09:08:39,701] [WARNING] [engine.py:135:setup_accelerators] Cannot infer the accelerators from the target system. Use CPU as default.
[2023-11-03 09:08:39,701] [DEBUG] [engine.py:143:setup_accelerators] Initial accelerators: ['CPU']
[2023-11-03 09:08:39,701] [DEBUG] [engine.py:164:setup_accelerators] Supported execution providers for device cpu: ['CPUExecutionProvider']
[2023-11-03 09:08:39,701] [INFO] [engine.py:181:setup_accelerators] Running workflow on accelerator specs: cpu-cpu
[2023-11-03 09:08:39,701] [WARNING] [engine.py:185:setup_accelerators] The following execution provider is not supported: DmlExecutionProvider. Please consider installing an onnxruntime build that contains the relevant execution providers.
[2023-11-03 09:08:39,732] [DEBUG] [engine.py:1064:_evaluate_model] Evaluating model ...
[2023-11-03 09:08:39,742] [DEBUG] [resource_path.py:157:create_resource_path] Resource path ./models/model.onnx is inferred to be of type file.
[2023-11-03 09:08:39,751] [DEBUG] [resource_path.py:157:create_resource_path] Resource path ./models/model.onnx is inferred to be of type file.
[2023-11-03 09:08:39,815] [DEBUG] [config.py:154:fill_in_params] Missing parameter data_dir for component load_dataset
[2023-11-03 09:10:56,878] [DEBUG] [footprint.py:102:resolve_metrics] There is no goal set for metric: my_latency_metric-avg.
[2023-11-03 09:10:56,886] [INFO] [engine.py:401:run_accelerator] Input model evaluation results: {
"my_latency_metric-avg": 4500.57219
}
[2023-11-03 09:10:56,888] [INFO] [engine.py:406:run_accelerator] Saved evaluation results of input model to c:\work\olive\cpu-cpu_input_model_metrics.json
[2023-11-03 09:10:56,888] [DEBUG] [engine.py:482:run_no_search] Running ['onnx_conversion', 'quantization'] with no search ...
[2023-11-03 09:10:56,888] [INFO] [engine.py:924:_run_pass] Running pass onnx_conversion:OnnxModelOptimizer
[2023-11-03 09:10:56,898] [DEBUG] [resource_path.py:157:create_resource_path] Resource path ./models/model.onnx is inferred to be of type file.
[2023-11-03 09:10:56,904] [DEBUG] [resource_path.py:157:create_resource_path] Resource path ./models/model.onnx is inferred to be of type file.
[2023-11-03 09:11:03,400] [DEBUG] [footprint.py:102:resolve_metrics] There is no goal set for metric: my_latency_metric-avg.
[2023-11-03 09:11:03,400] [INFO] [engine.py:924:_run_pass] Running pass quantization:OnnxDynamicQuantization
[2023-11-03 09:11:03,400] [DEBUG] [resource_path.py:157:create_resource_path] Resource path C:\work\olive\.olive-cache\models\0_OnnxModelOptimizer-4dbd8a4fc0ec3c8502479cbc0ff82c14-14a40d70a84be7c318dac9260174ffe5\output_model\model.onnx is inferred to be of type file.
[2023-11-03 09:11:03,426] [DEBUG] [resource_path.py:157:create_resource_path] Resource path C:\work\olive\.olive-cache\models\0_OnnxModelOptimizer-4dbd8a4fc0ec3c8502479cbc0ff82c14-14a40d70a84be7c318dac9260174ffe5\output_model\model.onnx is inferred to be of type file.
[2023-11-03 09:11:04,102] [INFO] [quantization.py:342:_run_for_config] Preprocessing model for quantization
[2023-11-03 09:11:04,183] [WARNING] [quantization.py:452:_quant_preprocess] Failed to run quantization preprocessing with error of object of type 'NoneType' has no len(). Using original model.
Traceback (most recent call last):
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\passes\onnx\quantization.py", line 440, in _quant_preprocess
quant_pre_process(
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\onnxruntime\quantization\shape_inference.py", line 71, in quant_pre_process
model = SymbolicShapeInference.infer_shapes(
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\onnxruntime\tools\symbolic_shape_infer.py", line 2783, in infer_shapes
all_shapes_inferred = symbolic_shape_inference._infer_impl()
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\onnxruntime\tools\symbolic_shape_infer.py", line 2579, in _infer_impl
out_rank = len(get_shape_from_type_proto(vi.type))
TypeError: object of type 'NoneType' has no len()
WARNING:root:Please consider to run pre-processing before quantization. Refer to example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md
[2023-11-03 09:11:07,403] [DEBUG] [footprint.py:102:resolve_metrics] There is no goal set for metric: my_latency_metric-avg.
[2023-11-03 09:11:07,403] [DEBUG] [engine.py:1064:_evaluate_model] Evaluating model ...
[2023-11-03 09:11:07,403] [DEBUG] [resource_path.py:157:create_resource_path] Resource path C:\work\olive\.olive-cache\models\1_OnnxDynamicQuantization-0-cda6dc5c7a898a7b0a13253cb99e2da8\output_model\model.onnx is inferred to be of type file.
[2023-11-03 09:11:07,414] [DEBUG] [resource_path.py:157:create_resource_path] Resource path C:\work\olive\.olive-cache\models\1_OnnxDynamicQuantization-0-cda6dc5c7a898a7b0a13253cb99e2da8\output_model\model.onnx is inferred to be of type file.
[2023-11-03 09:11:07,494] [DEBUG] [config.py:154:fill_in_params] Missing parameter data_dir for component load_dataset
[2023-11-03 09:11:08,450] [WARNING] [engine.py:432:run_accelerator] Failed to run Olive on cpu-cpu: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for ConvInteger(10) node with name 'Conv__687_quant'
Traceback (most recent call last):
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\engine\engine.py", line 412, in run_accelerator
return self.run_no_search(
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\engine\engine.py", line 483, in run_no_search
should_prune, signal, model_ids = self._run_passes(
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\engine\engine.py", line 903, in _run_passes
signal = self._evaluate_model(model_config, model_id, data_root, evaluator_config, accelerator_spec)
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\engine\engine.py", line 1090, in _evaluate_model
signal = self.target.evaluate_model(model_config, data_root, metrics, accelerator_spec)
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\systems\local.py", line 47, in evaluate_model
return evaluator.evaluate(model, data_root, metrics, device=device, execution_providers=execution_providers)
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\evaluator\olive_evaluator.py", line 173, in evaluate
metrics_res[metric.name] = self._evaluate_latency(
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\evaluator\olive_evaluator.py", line 635, in _evaluate_latency
return self._evaluate_onnx_latency(model, metric, dataloader, post_func, device, execution_providers)
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\evaluator\olive_evaluator.py", line 386, in evaluate_onnx_latency
session = model.prepare_session(
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\model_init
.py", line 351, in prepare_session
return get_ort_inference_session(self.model_path, inference_settings, self.use_ort_extensions)
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\olive\common\ort_inference.py", line 66, in get_ort_inference_session
return ort.InferenceSession(
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 419, in init
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "c:\Apps\Miniconda3\v3_8_5_x64\Local\envs\maskrcnn\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 463, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for ConvInteger(10) node with name 'Conv__687_quant'
[2023-11-03 09:11:08,544] [INFO] [engine.py:357:run] Run history for cpu-cpu:
[2023-11-03 09:11:08,544] [INFO] [engine.py:632:dump_run_history] Please install tabulate for better run history output
[2023-11-03 09:11:08,561] [INFO] [engine.py:372:run] No packaging config provided, skip packaging artifacts
{}
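
(Side note: the "Failed to run quantization preprocessing" warning in the log above comes from quant_pre_process failing during symbolic shape inference. A minimal sketch of running the same helper manually with that step skipped; the output path is a placeholder:)

from onnxruntime.quantization.shape_inference import quant_pre_process

# Same helper Olive calls in the traceback above; skipping symbolic shape
# inference sidesteps the NoneType error while keeping ONNX shape inference
# and the graph-optimization step.
quant_pre_process("./models/model.onnx", "./models/model_pre.onnx", skip_symbolic_shape=True)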


trajepl commented on June 10, 2024

Thanks for reporting this issue, @franciskasara!
It is good feedback that helps us refine our tutorial docs. Here is a quick and simple update to the docs: #719

Also, about the quantized model that you mentioned failed to run: could you share the code showing how you load this quantized model with ORT? The error log seems to show The parameter is incorrect.


franciskasara commented on June 10, 2024

Unfortunately my environment has had to change a lot since then, and I can't seem to reproduce my own issue to get the same message. Now running the dynamically quantized model gives onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for ConvInteger(10) node with name 'Conv__687_quant', which seems to be a known onnxruntime issue.


trajepl commented on June 10, 2024

microsoft/onnxruntime#3130
This might be caused by the quantization config; you can try changing weight_type to QUInt8.
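
In the Olive config, that would look something like this (a sketch of the passes section; weight_type is passed through to onnxruntime's quantize_dynamic):

"quantization": {
    "type": "OnnxDynamicQuantization",
    "config": {
        "weight_type": "QUInt8"
    }
}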


franciskasara commented on June 10, 2024

Also, about the quantized model that you mentioned failed to run: could you share the code showing how you load this quantized model with ORT? The error log seems to show The parameter is incorrect.

It turned out that the error I mentioned disappeared because the environment had started to ignore the GPU. Now that I have tried running it on the GPU again, I get the issue back. I run the model like this:


import onnxruntime as ort

session_option_dml = ort.SessionOptions()
session_option_dml.enable_profiling = False
session_option_dml.enable_mem_pattern = False
session_option_dml.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL

session_model = ort.InferenceSession(model_path, providers=["DmlExecutionProvider"],
                                     sess_options=session_option_dml, provider_options=[{"device_id": 1}])

If model_path points to the original model rather than the Olive-converted one, it works.

Full traceback:

Traceback (most recent call last):
File "\main.py", line 67, in
session_model = ort.InferenceSession(model_path, providers=["DmlExecutionProvider"],
File "C:\Apps\Miniconda3\v3_8_5_x64\Local\envs\directml\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 419, in init
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "C:\Apps\Miniconda3\v3_8_5_x64\Local\envs\directml\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 463, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\AbiCustomRegistry.cpp(516)\onnxruntime_pybind11_state.pyd!00007FFF9073996A: (caller: 00007FFF906B1CA8) Exception(3) tid(72d4) 80070057 The parameter is incorrect.
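
For reference, one way to narrow this down is to check whether the same quantized model loads on the CPU EP (a minimal sketch using standard ORT APIs):

import onnxruntime as ort

# If this succeeds, the failure above is specific to the DmlExecutionProvider;
# if it also fails, the quantized graph itself is the problem.
cpu_session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
print([(i.name, i.shape) for i in cpu_session.get_inputs()])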

