Comments (14)

leqiao-1 commented on July 16, 2024

Hi @PasaOpasen,
Q: What does None mean in the tuning combo (None, None, None, 'CPUExecutionProvider', <ExecutionMode.ORT_SEQUENTIAL: 0>, 99)?
A: It means that no valid inference run completed within max_latency_ms. That might be because the inference latency is too long, or because the input data are not valid. You can try to increase max_latency_ms, or share the model so that I can have a look.

Q: Are input_spec and output_names necessary?
A: If you provide sample_input_data_path, or there are no dynamic input shapes, these two arguments are not necessary. If you have inputs with dynamic shapes, like [batches, 3, height, width], you need to provide an input spec; batches, height, and width should be set to ints with values that occur in a real inference scenario. (See the sketch below.)
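
For illustration, here is a minimal sketch of how pinning a dynamic shape could look. The argument spellings inputs_spec / output_names and the names "input" / "output" are assumptions for this example; check them against the OLive version you are using:

from olive.optimization_config import OptimizationConfig

# Hypothetical example: replace "input"/"output" with your model's real
# input/output names, and 1/640 with the batch size and image size you
# actually expect at inference time.
opt_config = OptimizationConfig(
    model_path="./model.onnx",
    # dynamic shape [batches, 3, height, width] pinned to concrete ints
    inputs_spec={"input": [1, 3, 640, 640]},
    output_names=["output"],
)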

leqiao-1 commented on July 16, 2024

Hi @shonigs, I tried with the model in the notebook tutorials and no issues appeared. I am not sure if the issue is related to your ONNX model. Could you please share the model you used? Thanks.

kbraun-axio commented on July 16, 2024

Hi, I am getting the same error: KeyError: 'throughput'.

The complete error log is:

ERROR conda.cli.main_run:execute(41): `conda run olive optimize --model_path onnx-object-detection-model.onnx --throughput_tuning_enabled --max_latency_percentile 0.95 --max_latency_ms 100 --threads_num 1 --dynamic_batching_size 1 --min_duration_sec 10 --providers_list cpu` failed. (See above for error)
2022-08-03 12:54:12,827 - olive.__main__ - WARNING - OLive will call "olive setup" to setup environment first
2022-08-03 12:54:13,474 - olive.optimization_config - INFO - Checking the model file...
2022-08-03 12:54:14,821 - olive.optimization_config - INFO - Providers will be tested for optimization: ['CPUExecutionProvider']
2022-08-03 13:06:48,111 - olive.optimization.tuning_process - ERROR - Optimization failed for tuning combo (None, None, None, 'CPUExecutionProvider', <ExecutionMode.ORT_SEQUENTIAL: 0>, 99)
2022-08-03 13:44:02,303 - olive.optimization.tuning_process - ERROR - Optimization failed for tuning combo (None, None, None, 'CPUExecutionProvider', <ExecutionMode.ORT_PARALLEL: 1>, 99)
Traceback (most recent call last):
  File "/home/axio/miniconda3/envs/oonxoptimizer/bin/olive", line 8, in <module>
    sys.exit(main())
  File "/home/axio/miniconda3/envs/oonxoptimizer/lib/python3.7/site-packages/olive/__main__.py", line 438, in main
    options.func(options)
  File "/home/axio/miniconda3/envs/oonxoptimizer/lib/python3.7/site-packages/olive/__main__.py", line 322, in model_opt
    optimize(opt_config)
  File "/home/axio/miniconda3/envs/oonxoptimizer/lib/python3.7/site-packages/olive/optimize.py", line 36, in optimize
    olive_result = parse_tuning_result(optimization_config, *tuning_results, pretuning_inference_result)
  File "/home/axio/miniconda3/envs/oonxoptimizer/lib/python3.7/site-packages/olive/optimize.py", line 59, in parse_tuning_result
    best_test_name = max(tuning_results, key=lambda x: x["throughput"])["test_name"]
  File "/home/axio/miniconda3/envs/oonxoptimizer/lib/python3.7/site-packages/olive/optimize.py", line 59, in <lambda>
    best_test_name = max(tuning_results, key=lambda x: x["throughput"])["test_name"]
KeyError: 'throughput'

I am executing the optimization with conda run -n onnxoptimizer olive optimize --model_path onnx-object-detection-model.onnx --throughput_tuning_enabled --max_latency_percentile 0.95 --max_latency_ms 100 --threads_num 1 --dynamic_batching_size 1 --min_duration_sec 10 --providers_list cpu >& log.txt

The above error message is the contents of log.txt (see final part of the execution command above).

Please find my ONNX model here: https://get.hidrive.com/2qErePEy (Link valid until August 10, 2022)

leqiao-1 commented on July 16, 2024

Hi @kbraun-axio,
I think this error happened because max_latency_ms is too small for CPU inference.
You can increase max_latency_ms, or change the execution provider from cpu to cuda.
Here is the test result on my local machine with the command olive optimize --model_path onnx-object-detection-model.onnx --throughput_tuning_enabled --max_latency_percentile 0.95 --max_latency_ms 100 --threads_num 1 --dynamic_batching_size 1 --min_duration_sec 10 --providers_list cuda >& log_olive.txt

log_olive.txt

kbraun-axio commented on July 16, 2024

Hi @leqiao-1,
Thanks for your reply and the log output.
I will increase the max_latency_ms and try running the optimization again. I will post the results here.
Unfortunately, our inference machine does not have an Nvidia GPU (we only use one in our training server). Therefore, I cannot set the execution provider to CUDA.

kbraun-axio commented on July 16, 2024

Hi @leqiao-1,

Today, I tried to run the optimization again. This time, I increased max_latency_ms to 10,000. However, I got the same error.
I attached the output log and the olive_opt_results folder (without the optimized model because it is too large) for you.

Do you think a max_latency_ms of 10,000 is still not enough?

Inference with ONNX Runtime and the same ONNX model that I am trying to optimize takes about 7.5 seconds.

log_olive.txt
olive_opt_result.zip

leqiao-1 commented on July 16, 2024

Hi @kbraun-axio,
The latency depends on the machine. On my side, the inference takes about 400 ms on CPU. If you want, you can try increasing max_latency_ms further. However, even if it works, the throughput optimization may take a long time to run, since the latency is so long.
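
If it helps to sanity-check the numbers, here is a minimal sketch (plain ONNX Runtime, independent of OLive) for measuring baseline latency before choosing max_latency_ms. It assumes a single float32 input and substitutes 1 for any dynamic dimension:

import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("onnx-object-detection-model.onnx",
                            providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
# Replace symbolic/dynamic dimensions with concrete values for your use case.
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
x = np.random.rand(*shape).astype(np.float32)

sess.run(None, {inp.name: x})  # warm-up run
start = time.perf_counter()
runs = 10
for _ in range(runs):
    sess.run(None, {inp.name: x})
print(f"average latency: {(time.perf_counter() - start) / runs * 1000:.1f} ms")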

kbraun-axio commented on July 16, 2024

Okay, thank you. The machine on which we want to run the inference has a 6-core AMD CPU with 8 GB RAM, from 2012. It runs in a manufacturing / shop floor environment; they do not have the newest hardware. But maybe it would be better to use a more powerful machine, like an Nvidia Jetson device, which supports CUDA.

Besides that, I realized the optimization uses a lot of RAM. Watching the processes with htop showed a memory consumption of up to 12 GB for the Python process running OLive. But the machine only has 8 GB RAM, so Ubuntu started to use swap memory from the hard disk, which is very slow. Is that intended, or is 8 GB RAM too little for OLive?

leqiao-1 commented on July 16, 2024

Hi @kbraun-axio,
Are you using the onnxruntime GPU package with --providers_list cpu? I can reproduce the memory consumption issue this way.

If so, it is probably because OLive, when checking the model input info with an ORT inference session, tries to create the session with CUDA. I think it's a bug in OLive, and we will fix it. As a workaround, you can uninstall the onnxruntime GPU package and install the CPU version, as shown below.
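
For reference, the swap is (standard PyPI package names):

pip uninstall onnxruntime-gpu
pip install onnxruntime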

If not, please let me know your onnxruntime package version with pip list, and I will check whether I can reproduce the same issue.

kbraun-axio commented on July 16, 2024

Hi @leqiao-1,
Yes, I was running the GPU package with --providers_list cpu. My colleague uninstalled the package and installed the default (CPU) package. Now the memory consumption is in the normal range. Thanks for your hint.

But the other issue, the KeyError: 'throughput', persists even with the CPU package and even if we set max_latency_ms to higher values. Maybe it fails because the system is too old; it is from 2012.

leqiao-1 commented on July 16, 2024

Hi @kbraun-axio,
That might well be the case, since the inference latency is very high on your side.

PasaOpasen commented on July 16, 2024

I have the same issue, with this log:

2022-12-26 23:59:42,091 - olive.optimization_config - INFO - Checking the model file...
2022-12-26 23:59:42,547 - olive.optimization_config - INFO - Providers will be tested for optimization: ['CPUExecutionProvider', 'DnnlExecutionProvider']
2022-12-26 23:59:52,402 - olive.optimization.tuning_process - ERROR - Optimization failed for tuning combo (None, None, None, 'CPUExecutionProvider', <ExecutionMode.ORT_SEQUENTIAL: 0>, 99)
2022-12-26 23:59:56,936 - olive.optimization.tuning_process - ERROR - Optimization failed for tuning combo (None, None, None, 'DnnlExecutionProvider', <ExecutionMode.ORT_SEQUENTIAL: 0>, 99)
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
j:\aprbot\tmp\Optimize_ONNX_Models_Throughput_with_OLive.ipynb Cell 9 in <cell line: 27>()
      [1](vscode-notebook-cell:/j%3A/tmp/Optimize_ONNX_Models_Throughput_with_OLive.ipynb#X12sZmlsZQ%3D%3D?line=0) opt_config = OptimizationConfig(
      [2](vscode-notebook-cell:/j%3A/tmp/Optimize_ONNX_Models_Throughput_with_OLive.ipynb#X12sZmlsZQ%3D%3D?line=1) 
      [3](vscode-notebook-cell:/j%3A/tmp/Optimize_ONNX_Models_Throughput_with_OLive.ipynb#X12sZmlsZQ%3D%3D?line=2)     model_path = "./craft.onnx",
   (...)
     [24](vscode-notebook-cell:/j%3A/tmp/Optimize_ONNX_Models_Throughput_with_OLive.ipynb#X12sZmlsZQ%3D%3D?line=23)     test_num = 200
     [25](vscode-notebook-cell:/j%3A/tmp/Optimize_ONNX_Models_Throughput_with_OLive.ipynb#X12sZmlsZQ%3D%3D?line=24) )
---> [27](vscode-notebook-cell:/j%3A/tmp/Optimize_ONNX_Models_Throughput_with_OLive.ipynb#X12sZmlsZQ%3D%3D?line=26) result = optimize(opt_config)

File c:\Users\qtckp\anaconda3\envs\lib\site-packages\olive\optimize.py:36, in optimize(optimization_config)
     32     quantization_optimize(optimization_config)
     34 tuning_results = tune_onnx_model(optimization_config)
---> 36 olive_result = parse_tuning_result(optimization_config, *tuning_results, pretuning_inference_result)
     38 result_json_path = os.path.join(optimization_config.result_path, "olive_result.json")
     40 with open(result_json_path, 'w') as f:

File c:\Users\qtckp\anaconda3\envs\lib\site-packages\olive\optimize.py:59, in parse_tuning_result(optimization_config, *tuning_results)
     57 def parse_tuning_result(optimization_config, *tuning_results):
     58     if optimization_config.throughput_tuning_enabled:
---> 59         best_test_name = max(tuning_results, key=lambda x: x["throughput"])["test_name"]
     60     else:
     61         best_test_name = min(tuning_results, key=lambda x: x["latency_ms"]["avg"])["test_name"]

File c:\Users\qtckp\anaconda3\envs\lib\site-packages\olive\optimize.py:59, in parse_tuning_result.<locals>.<lambda>(x)
     57 def parse_tuning_result(optimization_config, *tuning_results):
     58     if optimization_config.throughput_tuning_enabled:
---> 59         best_test_name = max(tuning_results, key=lambda x: x["throughput"])["test_name"]
     60     else:
     61         best_test_name = min(tuning_results, key=lambda x: x["latency_ms"]["avg"])["test_name"]

KeyError: 'throughput'

I am running it with:

# Imports inferred from the traceback above; check them against your OLive version.
from olive.optimization_config import OptimizationConfig
from olive.optimize import optimize

opt_config = OptimizationConfig(
    model_path="./model.onnx",
    sample_input_data_path="./input.npz",
    result_path="olive_opt_latency_result",

    throughput_tuning_enabled=True,
    openmp_enabled=False,
    max_latency_percentile=0.95,
    max_latency_ms=1000000,
    threads_num=1,
    min_duration_sec=10000,

    providers_list=["cpu", "dnnl"],
    inter_thread_num_list=[1],
    intra_thread_num_list=[1],
    execution_mode_list=["sequential"],
    ort_opt_level_list=["all"],

    concurrency_num=4,

    warmup_num=20,
    test_num=200,
)

result = optimize(opt_config)

The model is huge and its inference takes over 15 seconds, but what am I doing wrong? What does None mean in the tuning combo (None, None, None, 'CPUExecutionProvider', <ExecutionMode.ORT_SEQUENTIAL: 0>, 99)? What other params should I set?

Are input_spec and output_names really necessary? What shape should I write in the input spec if the model has a dynamic input like [batches, 3, height, width]?

PasaOpasen commented on July 16, 2024

@leqiao-1 Thank you for the fast response!

Can you please try to do anything with this model: https://github.com/PasaOpasen/_olive_craft ?

I tried several configurations but nothing changed. Its inference takes about 15 seconds with 2 cores, and the optimization runs too long with a large test_num or warmup_num and gives almost no output.

Also, the optimization uses 6-8 cores with concurrency_num=1, all my 12 cores with concurrency_num=2, and all my 16 GB of memory with concurrency_num>2.

leqiao-1 commented on July 16, 2024

If you have any further concerns or questions, please reopen this issue.
