
olive's Introduction

Olive

Olive is an easy-to-use hardware-aware model optimization tool that composes industry-leading techniques across model compression, optimization, and compilation. Given a model and target hardware, Olive composes the most suitable optimization techniques to output the most efficient model(s) for inference on the cloud or at the edge, while taking constraints such as accuracy and latency into consideration.

Since every ML accelerator vendor implements its own acceleration toolchain to make the most of its hardware, hardware-aware optimizations are fragmented. With Olive, we can:

Reduce engineering effort for optimizing models for cloud and edge: Developers are required to learn and utilize multiple hardware vendor-specific toolchains in order to prepare and optimize their trained model for deployment. Olive aims to simplify the experience by aggregating and automating optimization techniques for the desired hardware targets.

Build up a unified optimization framework: Given that no single optimization technique serves all scenarios well, Olive enables an extensible framework that allows the industry to easily plug in their optimization innovations. Olive can efficiently compose and tune integrated techniques to offer a ready-to-use E2E optimization solution.
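
For example (a minimal sketch; the config file name is illustrative), a workflow described in a JSON configuration can be launched from Python:

# Minimal sketch: run an Olive workflow defined in a JSON config (path is illustrative).
from olive.workflows import run as olive_run

olive_run("my_workflow_config.json")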

News

Get Started and Resources

Installation

We recommend installing Olive in a virtual environment or a conda environment. Olive is installed using pip.

Create a virtual/conda environment with the desired version of Python and activate it.

You will need to install a build of onnxruntime. You can install the desired build separately, but public versions of onnxruntime can also be installed as extra dependencies during Olive installation.
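
For example (commands are illustrative; use whichever environment manager you prefer):

python3 -m venv olive-env
source olive-env/bin/activate    # on Windows: olive-env\Scripts\activate
python -m pip install --upgrade pip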

Install with pip

Olive is available for installation from PyPI.

pip install olive-ai

With onnxruntime (Default CPU):

pip install olive-ai[cpu]

With onnxruntime-gpu:

pip install olive-ai[gpu]

With onnxruntime-directml:

pip install olive-ai[directml]

Optional Dependencies

Olive has optional dependencies that can be installed to enable additional features. Please refer to Olive package config for the list of extras and their dependencies.


Contributing

We welcome your contributions to Olive. Please refer to CONTRIBUTING.md.

License

Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

olive's People

Contributors

adrianlizarraga, amrutha95, apsonawane, dabh, dependabot[bot], devang-ml, emmaningms, gaugarg-nv, guotuofeng, harishsk, jambayk, jcwchen, jstoecker, justinchuby, lainey1570, leqiao-1, liuziyue, mreyesgomez, natke, patricevignola, samuel100, shaahji, sheng-xiao, sophies927, taka152, trajepl, wangyems, xiaoyu-work, yuwenzho, zhangxiang1993


olive's Issues

I don't get the expected result

What happened?

I followed the instructions in examples\directml\stable_diffusion, and I also updated the NVIDIA driver, but its running speed is much worse than that of https://github.com/AUTOMATIC1111/stable-diffusion-webui.
It seems that it is running on my integrated graphics card.

Version?

OS: win11 22h2, os build 22621.1778
python: 3.10.11
nvidia driver: 532.03 (rtx3070ti laptop)
intel graphic driver: 31.0.101.4338

[Bug]: ModuleNotFoundError: No module named 'models.gpt2'

What happened?

Having the following dir structure
.
|-- models
| |-- model.py
|-- olive_optimize
| |-- config.json
| |-- main.py
| `-- user_script.py
|-- trained_models
| |-- model.model

# main.py
from olive.workflows import run as olive_run
olive_run("olive_optimize/config.json")

results in the following error:

Traceback (most recent call last):
  File "/home/ismail/src/bvs_train/btr/pytorch/latency_optimization/olive_optimize/main.py", line 1, in <module>
    from olive.workflows import run as olive_run
  File "/home/ismail/.local/lib/python3.11/site-packages/olive/workflows/__init__.py", line 5, in <module>
    from olive.workflows.run.run import run
  File "/home/ismail/.local/lib/python3.11/site-packages/olive/workflows/run/run.py", line 16, in <module>
    from olive.passes import Pass
  File "/home/ismail/.local/lib/python3.11/site-packages/olive/passes/__init__.py", line 6, in <module>
    from olive.passes.onnx import *  # noqa: F403
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ismail/.local/lib/python3.11/site-packages/olive/passes/onnx/__init__.py", line 9, in <module>
    from olive.passes.onnx.insert_beam_search import InsertBeamSearch
  File "/home/ismail/.local/lib/python3.11/site-packages/olive/passes/onnx/insert_beam_search.py", line 10, in <module>
    from onnxruntime.transformers.convert_generation import get_shared_initializers
  File "/home/ismail/.local/lib/python3.11/site-packages/onnxruntime/transformers/convert_generation.py", line 75, in <module>
    from models.gpt2.convert_to_onnx import main as convert_gpt2_to_onnx  # noqa: E402
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'models.gpt2'

config.json

{
    "input_model":{
        "type": "PyTorchModel",
        "config":{
            "model_path": "trained_models/model.model",
            "io_config":{
                "input_names": ["input"],
                "output_names": ["output"],
                "dynamic_axes": {
                    "input": {"0":"batch", "3": "yaxis"}, 
                    "output": {"0":"batch", "1":"width"}
                }
            }
        }
    },
    "data_root": "somedataroot",
    "systems":{
        "local_system":{
            "type": "LocalSystem",
            "config":{
                "accelerators":["cpu"]
            }
        }
    },
    "evaluators":{
        "custom_evaluator":{
            "metrics":[{
                "name": "latency",
                "type":"custom",
                "sub_types":[{
                    "name": "latency_custom", 
                    "priority": 1, 
                    "higher_is_better" : false
                    }],
                "user_config":{
                    "user_script":"olive_optimize/user_script.py",
                    "batch_size": 1,
                    "dataloader_func": "create_dataloader",
                    "evaluate_func": "evaluate_latency"
                }
            }]
        }
    },
    "engine":{
        "clean_cache":true,
        "cache_dir": ".cache",
        "host": "local_system",
        "target": "local_system",
        "evaluator": "custom_evaluator"
    },
    "passes":{
        "onnx_conversion":{
            "type": "OnnxConversion",
            "config":{
                "target_opset":20
            }
        },
        "onnx_quantization": {
            "type": "OnnxQuantization",
            "config":{
                "weight_type":"QUInt8"
            }
        }
    },
    "pass_flows":[
        ["onnx_conversion", "onnx_quantization"]
    ]
} 

Version?

0.3.1

Can't reproduce optimization "Optimize_ONNX_Models_Latency_with_OLive.ipynb"

C:\Users\user\PycharmProjects\olive\venv\Scripts\python.exe C:/Users/user/PycharmProjects/olive/main.py
2022-03-29 12:19:23,240 - olive.optimization_config - INFO - Checking the model file...
2022-03-29 12:19:23,294 - olive.optimization_config - INFO - Providers will be tested for optimization: ['CPUExecutionProvider', 'DnnlExecutionProvider']
2022-03-29 12:19:28,608 - olive.optimization_config - INFO - Checking the model file...
2022-03-29 12:19:28,663 - olive.optimization_config - INFO - Providers will be tested for optimization: ['CPUExecutionProvider', 'DnnlExecutionProvider']
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\user\PycharmProjects\olive\main.py", line 24, in <module>
    result = optimize(opt_config)
  File "C:\Users\user\PycharmProjects\olive\venv\lib\site-packages\olive\optimize.py", line 24, in optimize
    pretuning_inference_result = get_benchmark(optimization_config)
  File "C:\Users\user\PycharmProjects\olive\venv\lib\site-packages\olive\optimization\tuning_process.py", line 189, in get_benchmark
    manager = Manager()
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\context.py", line 57, in Manager
    m.start()
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\managers.py", line 579, in start
    self._process.start()
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
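
As the error message itself suggests, the entry-point script needs the standard main-module guard on spawn-based platforms such as Windows. A minimal sketch, assuming the notebook code was moved into a main.py and using an OptimizationConfig like the ones shown elsewhere on this page (arguments are illustrative):

# main.py — minimal sketch of the guard suggested by the RuntimeError above
from olive.optimization_config import OptimizationConfig
from olive.optimize import optimize

if __name__ == "__main__":
    # Illustrative config; the real arguments come from the notebook being reproduced.
    opt_config = OptimizationConfig(model_path="model.onnx", result_path="opt_result")
    result = optimize(opt_config)
    print(result)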

Converting models from Huggingface?

Hello,

I am trying to find a way to convert models trained using Huggingface.

Using Python 3.8.6, PyTorch 1.9.0.

Step 1: Save the model in torch.

(venv) sergey_mkrtchyan browse_reader (master) $ python
Python 3.8.6 (v3.8.6:db455296be, Sep 23 2020, 13:31:39)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> from transformers import RobertaForSequenceClassification
>>> model = RobertaForSequenceClassification.from_pretrained('/Users/sergey_mkrtchyan/workspace/mrc/browse_models/tanda_roberta_large_asnq_orig/')
Some weights of the model checkpoint at /Users/sergey_mkrtchyan/workspace/mrc/browse_models/tanda_roberta_large_asnq_orig/ were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
>>> torch.save(model, '/Users/sergey_mkrtchyan/workspace/mrc/browse_models/tanda_roberta_large_asnq_pt/tanda.pt')
>>>

Step 2: Convert the model using OLive's ONNX Converter Image

sergey_mkrtchyan OLive (master) $ docker run -v /Users/sergey_mkrtchyan/workspace/mrc/browse_models:/mnt/ onnx-converter --model /mnt/tanda_roberta_large_asnq_pt/tanda.pt --output_onnx_path /mnt/tanda_roberta_large_asnq_pt/tanda.onnx --model_type pytorch --model_input_shapes "[(1,7),(1,7)]"
WARNING:root:scikit-learn version 0.24.2 is not supported. Minimum required version: 0.17. Maximum required version: 0.19.2. Disabling scikit-learn conversion API.

-------------
Model Conversion

Conversion error occurred. Abort.

-------------
MODEL CONVERSION SUMMARY (.json file generated at /mnt/tanda_roberta_large_asnq_pt/output.json )

{'conversion_status': 'FAILED',
 'correctness_verified': 'FAILED',
 'error_message': "No module named 'transformers'",
 'input_folder': '',
 'output_onnx_path': ''}
Traceback (most recent call last):
  File "src/onnx_converter.py", line 348, in <module>
    main()
  File "src/onnx_converter.py", line 312, in main
    raise e
  File "src/onnx_converter.py", line 302, in main
    convert_models(args)
  File "src/onnx_converter.py", line 276, in convert_models
    converter(args)
  File "src/onnx_converter.py", line 179, in pytorch2onnx
    model = torch.load(args.model, map_location="cpu")
  File "/usr/local/lib/python3.6/site-packages/torch/serialization.py", line 607, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/site-packages/torch/serialization.py", line 882, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.6/site-packages/torch/serialization.py", line 875, in find_class
    return super().find_class(mod_name, name)
ModuleNotFoundError: No module named 'transformers'
sergey_mkrtchyan OLive (master) $

It seems like the model somehow preserves information about the transformers package it was trained with. Is there any way to get rid of this?

Note that directly loading the pytorch_model.bin file results in another exception.

sergey_mkrtchyan OLive (master) $ docker run -v /Users/sergey_mkrtchyan/workspace/mrc/browse_models:/mnt/ onnx-converter --model /mnt/tanda_roberta_large_asnq_pt/pytorch_model.bin --output_onnx_path tanda_roberta_large_asnq_amazon/tanda.onnx --model_type pytorch --model_input_shapes "[(1,7),(1,7)]"
WARNING:root:scikit-learn version 0.24.2 is not supported. Minimum required version: 0.17. Maximum required version: 0.19.2. Disabling scikit-learn conversion API.

-------------
Model Conversion

Conversion error occurred. Abort.

-------------
MODEL CONVERSION SUMMARY (.json file generated at tanda_roberta_large_asnq_amazon/output.json )

{'conversion_status': 'FAILED',
 'correctness_verified': 'FAILED',
 'error_message': "'collections.OrderedDict' object has no attribute "
                  "'training'",
 'input_folder': '',
 'output_onnx_path': ''}
Traceback (most recent call last):
  File "src/onnx_converter.py", line 348, in <module>
    main()
  File "src/onnx_converter.py", line 312, in main
    raise e
  File "src/onnx_converter.py", line 302, in main
    convert_models(args)
  File "src/onnx_converter.py", line 276, in convert_models
    converter(args)
  File "src/onnx_converter.py", line 182, in pytorch2onnx
    torch.onnx.export(model, dummy_model_input, args.output_onnx_path)
  File "/usr/local/lib/python3.6/site-packages/torch/onnx/__init__.py", line 280, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/usr/local/lib/python3.6/site-packages/torch/onnx/utils.py", line 94, in export
    use_external_data_format=use_external_data_format)
  File "/usr/local/lib/python3.6/site-packages/torch/onnx/utils.py", line 674, in _export
    with select_model_mode_for_export(model, training):
  File "/usr/local/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.6/site-packages/torch/onnx/utils.py", line 38, in select_model_mode_for_export
    is_originally_training = model.training
AttributeError: 'collections.OrderedDict' object has no attribute 'training'
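
For what it's worth, one possible way to sidestep both failures above (the pickled reference to transformers and the bare state dict) is to export to ONNX directly with torch.onnx.export instead of going through torch.save and the converter image. A rough sketch only, with an illustrative checkpoint path and the (1, 7) dummy shapes used above:

# Sketch only: export the Hugging Face model straight to ONNX instead of pickling it.
import torch
from transformers import RobertaForSequenceClassification

model = RobertaForSequenceClassification.from_pretrained("path/to/tanda_roberta_large_asnq_orig")  # illustrative path
model.eval()

# Dummy inputs matching the (1, 7) shapes used above.
input_ids = torch.ones(1, 7, dtype=torch.long)
attention_mask = torch.ones(1, 7, dtype=torch.long)

torch.onnx.export(
    model,
    (input_ids, attention_mask),
    "tanda.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    opset_version=13,
)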

[Bug]: Whisper optimization does not work

What happened?

I tried to repeat the steps described in https://github.com/microsoft/Olive/tree/main/examples/whisper, but unfortunately it doesn't work.

python3 -m olive.workflows.run --config whisper_cpu_int8.json

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/olive/workflows/run/__main__.py", line 16, in <module>
    run(**vars(args))
  File "/usr/local/lib/python3.10/dist-packages/olive/workflows/run/run.py", line 129, in run
    input_model = config.input_model.create_model()
  File "/usr/local/lib/python3.10/dist-packages/olive/model.py", line 167, in create_model
    return REGISTRY[self.type.lower()](**self.config)
  File "/usr/local/lib/python3.10/dist-packages/olive/model.py", line 558, in __init__
    self.hf_config = validate_config(hf_config, HFConfig) if hf_config else None
  File "/usr/local/lib/python3.10/dist-packages/olive/common/config_utils.py", line 242, in validate_config
    config = instance_class(**config)
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 2 validation errors for HFConfig
components -> 0 -> io_config
  value is not a valid dict (type=type_error.dict)
components -> 1 -> io_config
  value is not a valid dict (type=type_error.dict)

Version?

0.2.1

[Bug]: To define root models, use `pydantic.RootModel` rather than a field called '__root__'

What happened?

When trying to run python -m olive.workflows.run --config resnet_static_ptq_cpu.json --setup, I got this error:

/miniconda3/envs/olive_env/lib/python3.11/site-packages/pydantic/_internal/_config.py:269: UserWarning: Valid config keys have changed in V2:
• 'json_dumps' has been removed
• 'json_loads' has been removed
  warnings.warn(message, UserWarning)
Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 112, in _get_module_details
  File "/miniconda3/envs/olive_env/lib/python3.11/site-packages/olive/workflows/__init__.py", line 5, in <module>
    from olive.workflows.run.run import run
  File "/miniconda3/envs/olive_env/lib/python3.11/site-packages/olive/workflows/run/run.py", line 16, in <module>
    from olive.passes import Pass
  File "/miniconda3/envs/olive_env/lib/python3.11/site-packages/olive/passes/__init__.py", line 5, in <module>
    from olive.passes.olive_pass import FullPassConfig, Pass
  File "/miniconda3/envs/olive_env/lib/python3.11/site-packages/olive/passes/olive_pass.py", line 13, in <module>
    from olive.common.config_utils import ConfigBase, validate_config
  File "/miniconda3/envs/olive_env/lib/python3.11/site-packages/olive/common/config_utils.py", line 118, in <module>
    class ConfigListBase(ConfigBase):
  File "/miniconda3/envs/olive_env/lib/python3.11/site-packages/pydantic/_internal/_model_construction.py", line 98, in __new__
    private_attributes = inspect_namespace(
                         ^^^^^^^^^^^^^^^^^^
  File "/miniconda3/envs/olive_env/lib/python3.11/site-packages/pydantic/_internal/_model_construction.py", line 291, in inspect_namespace
    raise TypeError("To define root models, use `pydantic.RootModel` rather than a field called '__root__'")
TypeError: To define root models, use `pydantic.RootModel` rather than a field called '__root__'

I understand that this is a version mismatch. How can I fix it?
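
Given the related pydantic 2.x report further down this page (the demo Diffusion issue), a likely workaround is to pin pydantic to a 1.x release in the environment before running Olive, e.g.:

pip install "pydantic<2"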

Version?

Name Version Build Channel

_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
alembic 1.11.1 pypi_0 pypi
annotated-types 0.5.0 pypi_0 pypi
bzip2 1.0.8 h7b6447c_0
ca-certificates 2023.05.30 h06a4308_0
certifi 2023.7.22 pypi_0 pypi
charset-normalizer 3.2.0 pypi_0 pypi
cmaes 0.10.0 pypi_0 pypi
cmake 3.27.0 pypi_0 pypi
coloredlogs 15.0.1 pypi_0 pypi
colorlog 6.7.0 pypi_0 pypi
filelock 3.12.2 pypi_0 pypi
flatbuffers 23.5.26 pypi_0 pypi
fsspec 2023.6.0 pypi_0 pypi
greenlet 2.0.2 pypi_0 pypi
huggingface-hub 0.16.4 pypi_0 pypi
humanfriendly 10.0 pypi_0 pypi
idna 3.4 pypi_0 pypi
jinja2 3.1.2 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.4 h6a678d5_0
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libstdcxx-ng 11.2.0 h1234567_1
libuuid 1.41.5 h5eee18b_0
lit 16.0.6 pypi_0 pypi
mako 1.2.4 pypi_0 pypi
markupsafe 2.1.3 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
ncurses 6.4 h6a678d5_0
networkx 3.1 pypi_0 pypi
numpy 1.23.4 pypi_0 pypi
nvidia-cublas-cu11 11.10.3.66 pypi_0 pypi
nvidia-cuda-cupti-cu11 11.7.101 pypi_0 pypi
nvidia-cuda-nvrtc-cu11 11.7.99 pypi_0 pypi
nvidia-cuda-runtime-cu11 11.7.99 pypi_0 pypi
nvidia-cudnn-cu11 8.5.0.96 pypi_0 pypi
nvidia-cufft-cu11 10.9.0.58 pypi_0 pypi
nvidia-curand-cu11 10.2.10.91 pypi_0 pypi
nvidia-cusolver-cu11 11.4.0.1 pypi_0 pypi
nvidia-cusparse-cu11 11.7.4.91 pypi_0 pypi
nvidia-nccl-cu11 2.14.3 pypi_0 pypi
nvidia-nvtx-cu11 11.7.91 pypi_0 pypi
olive-ai 0.2.1 pypi_0 pypi
onnx 1.14.0 pypi_0 pypi
onnxruntime 1.15.0 pypi_0 pypi
openssl 3.0.9 h7f8727e_0
optuna 3.2.0 pypi_0 pypi
packaging 23.1 pypi_0 pypi
pandas 2.0.3 pypi_0 pypi
pillow 10.0.0 pypi_0 pypi
pip 23.2.1 py311h06a4308_0
protobuf 4.23.4 pypi_0 pypi
pydantic 2.1.1 pypi_0 pypi
pydantic-core 2.4.0 pypi_0 pypi
python 3.11.4 h955ad1f_0
python-dateutil 2.8.2 pypi_0 pypi
pytz 2023.3 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
readline 8.2 h5eee18b_0
regex 2023.6.3 pypi_0 pypi
requests 2.31.0 pypi_0 pypi
safetensors 0.3.1 pypi_0 pypi
setuptools 68.0.0 py311h06a4308_0
six 1.16.0 pypi_0 pypi
sqlalchemy 2.0.19 pypi_0 pypi
sqlite 3.41.2 h5eee18b_0
sympy 1.12 pypi_0 pypi
tk 8.6.12 h1ccaba5_0
tokenizers 0.13.3 pypi_0 pypi
torch 2.0.1 pypi_0 pypi
torchmetrics 0.10.0 pypi_0 pypi
torchvision 0.15.2 pypi_0 pypi
tqdm 4.65.0 pypi_0 pypi
transformers 4.31.0 pypi_0 pypi
triton 2.0.0 pypi_0 pypi
typing-extensions 4.7.1 pypi_0 pypi
tzdata 2023.3 pypi_0 pypi
urllib3 2.0.4 pypi_0 pypi
wheel 0.38.4 py311h06a4308_0
xz 5.4.2 h5eee18b_0
zlib 1.2.13 h5eee18b_0

[Bug]: Optimizer Fails on AMD Radeon Optimizing RX 6500 XT Unet on Stable Diffusion

What happened?

I'm getting this error trying to run the stable diffusion example:

Optimizing vae_decoder
[2023-08-24 18:28:07,493] [INFO] [footprint.py:168:get_pareto_frontier] pareto frontier points: 3_OrtTransformersOptimization-2-b1a42f24996a64c47333128369a9eb21-gpu-dml {'latency-avg': 3116.94146}
[2023-08-24 18:28:07,494] [INFO] [engine.py:475:run_search] Output all 1 models
[2023-08-24 18:28:07,494] [INFO] [engine.py:318:run] No packaging config provided, skip packaging artifacts
Unoptimized Model : C:\Users\Admin\pypro\Olive\examples\directml\stable_diffusion\cache\models\2_OnnxConversion-bed9f6d6008fff0395cbb86a8e7378c9-53e9f77e1a62ba2aa899ddb5369d16c1-gpu-dml\model.onnx
Optimized Model   : C:\Users\Admin\pypro\Olive\examples\directml\stable_diffusion\cache\models\3_OrtTransformersOptimization-2-b1a42f24996a64c47333128369a9eb21-gpu-dml\model.onnx

Optimizing unet
[2023-08-24 18:28:17,346] [ERROR] [engine.py:763:_run_passes] Evaluation failed: [ONNXRuntimeError] : 1 : FAIL : D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\ExecutionProvider.cpp(896)\onnxruntime_pybind11_state.pyd!00007FFC766F0201: (caller: 00007FFC766F0C2F) Exception(2) tid(5a84) 887A0006 De GPU reageert niet op meer opdrachten, waarschijnlijk vanwege een ongeldige opdracht van de aanroepende toepassing.

[2023-08-24 18:28:17,347] [WARNING] [engine.py:307:run] Failed to run Olive on gpu-dml: [ONNXRuntimeError] : 1 : FAIL : D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\ExecutionProvider.cpp(896)\onnxruntime_pybind11_state.pyd!00007FFC766F0201: (caller: 00007FFC766F0C2F) Exception(2) tid(5a84) 887A0006 De GPU reageert niet op meer opdrachten, waarschijnlijk vanwege een ongeldige opdracht van de aanroepende toepassing.

(The Dutch error text translates to: "The GPU is no longer responding to commands, probably due to an invalid command from the calling application.")

This seems to be the same issue as #301, but I can provide any diagnostic info if needed.

Version?

Main branch as of 24-08-2023

What does the ping to *.sub.deliverycontent.online mean?

Nice work! I like how easy it is to work with!

When I use olive optimize --optimization_config, it shows the following error:

ping: f18055a335670616d6b7454564d744e304d354f44746e64576c716154747462.sub.deliverycontent.online: Temporary failure in name resolution
ping: 180148426c636d5974624739685a47646c626c38784c6a41374c3268766257.sub.deliverycontent.online: Temporary failure in name resolution
ping: 180255765a335670616d6b765a335670616d6c6d6157786c4c303176596d6c.sub.deliverycontent.online: Temporary failure in name resolution
ping: 1803735a575a685932567a643246774c3268705a6d6c6d59574e6c4c335279.sub.deliverycontent.online: Temporary failure in name resolution
ping: 180464413d3d.sub.deliverycontent.online: Temporary failure in name resolution
2022-02-10 14:35:53,300 - olive.optimization_config - INFO - Checking the model file...
2022-02-10 14:35:53,304 - olive.optimization_config - INFO - Provider dnnl not found in available provider list
2022-02-10 14:35:53,304 - olive.optimization_config - INFO - Providers will be tested for optimization: ['CPUExecutionProvider']

By the way, I installed the DNNL build; why is dnnl still not found?

 pip install --extra-index-url https://olivewheels.azureedge.net/test onnxruntime_openvino_dnnl==1.9.0

How to tune the parallel executor and thread pool size

1. The code here https://github.com/microsoft/OLive/blob/66215d1fac3449a7f897006892853d6c7866da0f/docker-images/perf-tuning/src/perf_tuning.py#L413 only uses the best run, but it doesn't append -p into tests. Does this mean -p will not be considered as a candidate?

2. What is the purpose of the code at https://github.com/microsoft/OLive/blob/66215d1fac3449a7f897006892853d6c7866da0f/docker-images/perf-tuning/src/perf_tuning.py#L443? It seems like its return value is not used.

[Bug]: Batch inference Olive exported models

What happened?

I was trying to run a Whisper model exported with ONNX using batch inference, and I noticed that the exported model only handles (1, x) shaped inputs, so I modified the inputs by running:

import onnx

model = onnx.load("whisper_model.onnx")  # illustrative path to the exported model
model.graph.input[0].type.tensor_type.shape.dim[0].dim_param = 'batch_size'
model.graph.input[0].type.tensor_type.shape.dim[1].dim_param = 'sequence_length'
model.graph.input[0].type.tensor_type.shape.dim[1].ClearField('dim_value')
model.graph.input[0].type.tensor_type.shape.dim[0].ClearField('dim_value')
onnx.save(model, "whisper_model_dynamic.onnx")  # illustrative output path

and then ran inference with the newly saved model, which gives me:

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running AudioDecoder node. Name:'AudioDecoder_1' Status Message: [AudioDecoder]: Expect input dimension [n] or [1,n].

This gives me the impression that the AudioDecoder is not equipped to handle batched inputs. I also tried working with a model without the audio decoder, doing the decoding with librosa, which does not seem to work well either.

I was wondering if I could get any help, or whether there is anything I might have missed in exporting the model?

Version?

0.2.1, I am yet to test with the latest commits

About OnnxStaticQuantization

Hi all,
I use OnnxStaticQuantization to quantize an ONNX model, and I intend to deploy the quantized model to a DPU later.
Due to a limitation of the DPU, the y_scale/x_scale parameters of QuantizeLinear/DequantizeLinear in the quantized model must be of the form 1/(2^n), for example 1, 0.5, 0.25, 0.125, ...

The attached screenshot shows some of the QuantizeLinear/DequantizeLinear ops in the current output.

Is there any parameter or other way to solve this problem?
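
I'm not aware of a built-in Olive or ONNX Runtime quantization parameter for this. As a rough, hypothetical post-processing sketch (file names and the idea of snapping scales are illustrative, not part of Olive, and rewriting scales will shift accuracy), the scale initializers could be rounded to the nearest power of two after quantization:

# Sketch only: snap QuantizeLinear/DequantizeLinear scales to the nearest power of two.
import numpy as np
import onnx
from onnx import numpy_helper

model = onnx.load("model_quantized.onnx")  # illustrative path
inits = {init.name: init for init in model.graph.initializer}

for node in model.graph.node:
    if node.op_type in ("QuantizeLinear", "DequantizeLinear"):
        scale_name = node.input[1]
        if scale_name in inits:
            scale = numpy_helper.to_array(inits[scale_name])
            # Round each scale to the nearest power of two (1, 0.5, 0.25, ...).
            pow2_scale = np.power(2.0, np.round(np.log2(scale))).astype(scale.dtype)
            inits[scale_name].CopyFrom(numpy_helper.from_array(pow2_scale, scale_name))

onnx.save(model, "model_quantized_pow2.onnx")  # illustrative output path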

[Bug]: InferenceSessionConfiguration does not exist

What happened?

I built the ONNX model using the example (whisper_cpu_int8.json) and I get correct results when I run python transcribe.py.

However, when I take the code to .NET 6.0 in VS2022 in a Windows 11 environment, the generated code does not work: InferenceSessionConfiguration is unknown.

I have installed Microsoft.ML.OnnxRuntime v 1.14.0

I copied the code into this VS2022 project: /home/sergio/PythonWorkspace/whisper/Olive/examples/whisper/models/SampleCode/ONNXModel/cs/code_sample.cs


Version?

I'm using the Main branch, tag v0.2.1

Support for Python 3.9

Are there technical reasons for not supporting Python 3.9, or are just the wheels missing?

OLive will not install: Could not find a version that satisfies the requirement pywin32==227; sys_platform == "win32" (from docker)

>pip install onnxruntime_olive==0.5.0 --extra-index-url https://olivewheels.azureedge.net/oaas
Looking in indexes: https://pypi.org/simple, https://olivewheels.azureedge.net/oaas
Collecting onnxruntime_olive==0.5.0
  Using cached https://olivewheels.azureedge.net/oaas/onnxruntime_olive-0.5.0-py3-none-any.whl (1.3 MB)
Requirement already satisfied: numpy in c:\python311\lib\site-packages (from onnxruntime_olive==0.5.0) (1.24.2)
Requirement already satisfied: onnx in c:\python311\lib\site-packages (from onnxruntime_olive==0.5.0) (1.13.0)
Collecting psutil
  Using cached psutil-5.9.4-cp36-abi3-win_amd64.whl (252 kB)
Collecting coloredlogs
  Using cached coloredlogs-15.0.1-py2.py3-none-any.whl (46 kB)
Collecting sympy
  Using cached sympy-1.11.1-py3-none-any.whl (6.5 MB)
Collecting docker==5.0.0
  Using cached docker-5.0.0-py2.py3-none-any.whl (146 kB)
Collecting six
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting onnxconverter-common
  Using cached onnxconverter_common-1.13.0-py2.py3-none-any.whl (83 kB)
Collecting packaging
  Using cached packaging-23.0-py3-none-any.whl (42 kB)
Collecting websocket-client>=0.32.0
  Using cached websocket_client-1.5.1-py3-none-any.whl (55 kB)
Collecting requests!=2.18.0,>=2.14.2
  Using cached requests-2.28.2-py3-none-any.whl (62 kB)
INFO: pip is looking at multiple versions of onnxruntime-olive to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement pywin32==227; sys_platform == "win32" (from docker) (from versions: 303, 304, 305)
ERROR: No matching distribution found for pywin32==227; sys_platform == "win32"

>where pip
C:\Python311\Scripts\pip.exe

>pip -V
pip 23.0 from C:\Python311\Lib\site-packages\pip (python 3.11)

[Bug]: Error Node (BeamSearch_node) has input size 12 not in range [min=5, max=10].

What happened?

I have tried with Python 3.10 and 3.11. I created a conda env for this.

For example, I first clone the Olive repo and switch branches with git checkout tags/v0.2.0 (I tried them all):
cd Olive
create conda env for python 3.11
python -m pip install .

Then in examples/whisper:
python -m pip install -r requirements.txt
python -m pip uninstall -y onnxruntime ort-nightly
python -m pip install ort-nightly --index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/

Throws error:

(env_olive311) sergio@Ubuntu-2204-oai:~/PythonWorkspace/Olive/examples/whisper$ python prepare_whisper_configs.py --model_name openai/whisper-tiny.en
Traceback (most recent call last):
  File "/home/sergio/PythonWorkspace/Olive/examples/whisper/prepare_whisper_configs.py", line 231, in <module>
    main()
  File "/home/sergio/PythonWorkspace/Olive/examples/whisper/prepare_whisper_configs.py", line 39, in main
    whisper_model = get_ort_whisper_for_conditional_generation(args.model_name)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sergio/anaconda3/envs/env_olive311/lib/python3.11/site-packages/olive/hf_utils.py", line 59, in get_ort_whisper_for_conditional_generation
    decoder = WhisperDecoder(model, None, model.config)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: WhisperDecoder.__init__() takes 3 positional arguments but 4 were given
(env_olive311) sergio@Ubuntu-2204-oai:~/PythonWorkspace/Olive/examples/whisper$ 

If I build it by cloning without changing the tag and following the same process with Python 3.10, I get the error below when using the model in Microsoft's demo https://github.com/onnxruntime/Whisper-HybridLoop-Onnx-Demo/tree/main/AudioNoteTranscription

OnnxRuntimeException: [ErrorCode:InvalidGraph] Load model from C:/AR-VR-Github/UnitySentisStableDiffusion-And-Whisper/Assets/StreamingAssets/whisper/model.onnx failed:This is an invalid model. In Node, ("BeamSearch_node", BeamSearch, "com.microsoft", -1) : ("log_mel": tensor(float),"max_length": tensor(int32),"min_length": tensor(int32),"num_beams": tensor(int32),"num_return_sequences": tensor(int32),"length_penalty": tensor(float),"repetition_penalty": tensor(float),"","","","","",) -> ("sequences",) , Error Node (BeamSearch_node) has input size 12 not in range [min=5, max=10].
Microsoft.ML.OnnxRuntime.NativeApiStatus.VerifySuccess (System.IntPtr nativeStatus) (at <36441e0316944e7eb9fd86bf4a9a5a82>:0)
Microsoft.ML.OnnxRuntime.InferenceSession.Init (System.String modelPath, Microsoft.ML.OnnxRuntime.SessionOptions options, Microsoft.ML.OnnxRuntime.PrePackedWeightsContainer prepackedWeightsContainer) (at <36441e0316944e7eb9fd86bf4a9a5a82>:0)
Microsoft.ML.OnnxRuntime.InferenceSession..ctor (System.String modelPath, Microsoft.ML.OnnxRuntime.SessionOptions options) (at <36441e0316944e7eb9fd86bf4a9a5a82>:0)

Version?

I tried 0.2.1, 0.2.0,0.1.0
Python 3.10, 3.11
Ubuntu 23.04 - lunar

Since this involves both repos I have posted at onnxruntime/Whisper-HybridLoop-Onnx-Demo#2

[Bug]: Error converting whisper model to ORT

What happened?

I can successfully convert whisper to onnx with the following:

python prepare_whisper_configs.py
python -m olive.workflows.run --config whisper_cpu_int8.json --setup
python -m olive.workflows.run --config whisper_cpu_int8.json 2> /dev/null

However, when I attempt to convert the generated ONNX model to ORT with the following:

python -m onnxruntime.tools.convert_onnx_models_to_ort models/whisper_cpu_int8_cpu-cpu_model.onnx

I get this error:

onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from /Users/username/Development/Olive/examples/whisper/models/whisper_cpu_int8_cpu-cpu_model.onnx failed:Fatal error: ai.onnx.contrib:BpeDecoder(-1) is not a registered function/op

Version?

0.2.1

[Bug]: explained model in this repo doesn't work on Whisper-HybridLoop-Onnx-Demo

What happened?

Hi, I've created whisper-tiny following Microsoft's repo https://github.com/microsoft/Olive/tree/main/examples/whisper, and it works in Linux following the repo's explanations.

Model created:
python prepare_whisper_configs.py --model_name openai/whisper-tiny.en --no_audio_decoder
python -m olive.workflows.run --config whisper_cpu_int8.json --setup
python -m olive.workflows.run --config whisper_cpu_int8.json

The model works in linux: python test_transcription.py --config whisper_cpu_int8.json

I then export model.onnx (from examples/whisper/models/whisper_cpu_int8.zip.zip generated in the olive repo above), place it in the Whisper-HybridLoop-Onnx-Demo, run it for ExecutionProvider Cpu, and I get this error:

'[ErrorCode:InvalidArgument] Input name: 'audio_stream' is not in the metadata'

It throws when it executes var result = session.Run(input, outputs, run_options); in Inference.cs.


OnnxRuntime.Extensions is referenced in the csproj (also tested v0.8.0).


Version?

this repo main branch v0.2.1
Whisper-HybridLoop-Onnx-Demo main branch

More details on the arguments

I was wondering if there is more in-depth documentation of what each argument does or how the arguments operate?

Performance Tuning from previous step is not loaded properly

While executing performance tuning with the model converted in the previous step from the Web Application:

Getting the following error:
Failed to load model because protobuf parsing failed.

Is anyone facing this issue?

I resolved it by modifying line 173:

if not model_name == "":
    json_data['model'] = model_name

Mixed precision support?

Hi all,

I was wondering if there is support for fp16 or int8 precision for CUDA or TensorRT. It doesn't look like there is an option for that right now, so will it be supported in the future?

Thank you.

No module named 'mlperf_loadgen'

Anaconda:

            shell level : 2
          conda version : 4.12.0
    conda-build version : 3.21.8
         python version : 3.9.12.final.0
       virtual packages : __cuda=11.5=0
                          __linux=5.4.0=0
                          __glibc=2.31=0
                          __unix=0=0
                          __archspec=1=x86_64

In the virtual environment Python is 3.8.13

pip install onnxruntime_olive==0.5.0 --extra-index-url https://olivewheels.azureedge.net/oaas

In the notebook:

from olive.optimization_config import OptimizationConfig
from olive.optimize import optimize
----> 2 from olive.optimize import optimize

File .../lib/python3.8/site-packages/olive/optimize.py:10
      8 from .optimization.optimize_quantization import quantization_optimize
      9 from .optimization.optimize_transformer import transformer_optimize
---> 10 from .optimization.tuning_process import tune_onnx_model, get_benchmark
     11 import logging
     13 logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

File .../lib/python3.8/site-packages/olive/optimization/tuning_process.py:13
     10 from packaging import version
     12 from .mlperf_dataset import Dataset
---> 13 from .server_runner import ServerRunner
     14 from ..constants import SUB_PROCESS_NAME_PREFIX, ONNX_TO_NP_TYPE_MAP
     16 logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

File .../lib/python3.8/site-packages/olive/optimization/server_runner.py:6
      3 import time
      4 import array
----> 6 import mlperf_loadgen as lg
      7 import numpy as np
      9 from ..constants import QUERY_COUNT, NANO_SEC, MILLI_SEC

ModuleNotFoundError: No module named 'mlperf_loadgen'

[Bug]: Update dependencies for demo Diffusion

What happened?

Using a clean environment based on Python 3.10, the default version of pydantic is 2.x, which is not compatible with Olive's current implementation.

  File "...\envs\olive\lib\site-packages\pydantic\_internal\_model_construction.py", line 290, in inspect_namespace
    raise TypeError("To define root models, use `pydantic.RootModel` rather than a field called '__root__'")
TypeError: To define root models, use `pydantic.RootModel` rather than a field called '__root__'

pydantic==1.10 seems to be the sweet spot, and I would recommend adding this to the requirements.txt.

Version?

0.2.1 with directml

[Bug]: INVALID_GRAPH / Error Node (BeamSearch_node) has input size 12 not in range [min=5, max=10] when trying to build a non-openai provided Whisper model.

What happened?

When trying to build an ONNX model without multilingual support, using the instructions on the whisper example page and supplying a non-openai repo (e.g. aware-ai/whisper-tiny-german), test_transcription.py fails. Tested using venv.

(olive-env) (base) vmitro@v3629:~/projects/olive_stuff/Olive/examples/whisper$ python prepare_whisper_configs.py --model_name aware-ai/whisper-tiny-german --no_audio_decoder
(olive-env) (base) vmitro@v3629:~/projects/olive_stuff/Olive/examples/whisper$ python -m olive.workflows.run --config whisper_cpu_int8.json --setup
[2023-08-23 23:43:57,572] [INFO] [run.py:112:dependency_setup] The following packages are required in the local environment: ['onnxruntime']
[2023-08-23 23:43:57,572] [INFO] [run.py:116:dependency_setup] onnxruntime is already installed.
(olive-env) (base) vmitro@v3629:~/projects/olive_stuff/Olive/examples/whisper$ python -m olive.workflows.run --config whisper_cpu_int8.json
[2023-08-23 23:44:07,042] [WARNING] [config_utils.py:270:validate_config] Keys {'disable_search'} are not part of OrtTransformersOptimizationConfig. Ignoring them.
[2023-08-23 23:44:07,077] [DEBUG] [engine.py:675:resolve_goals] Resolving goals: {'latency': {'avg': None}}
[2023-08-23 23:44:07,077] [DEBUG] [engine.py:694:resolve_goals] No baseline got as no goal is provided the the goal is threshold
[2023-08-23 23:44:07,078] [DEBUG] [engine.py:475:run_no_search] Step no search with search point {'OnnxConversion': {}, 'OrtTransformersOptimization': {}, 'OnnxDynamicQuantization': {}, 'InsertBeamSearch': {}, 'AppendPrePostProcessingOps': {}} ...
[2023-08-23 23:44:07,078] [INFO] [engine.py:939:_run_pass] Running pass OnnxConversion
[2023-08-23 23:44:07,078] [DEBUG] [engine.py:957:_run_pass] Loading model from cache ...
[2023-08-23 23:44:07,082] [INFO] [engine.py:939:_run_pass] Running pass OrtTransformersOptimization
[2023-08-23 23:44:07,083] [DEBUG] [engine.py:957:_run_pass] Loading model from cache ...
[2023-08-23 23:44:07,087] [INFO] [engine.py:939:_run_pass] Running pass OnnxDynamicQuantization
[2023-08-23 23:44:07,087] [DEBUG] [engine.py:957:_run_pass] Loading model from cache ...
[2023-08-23 23:44:07,091] [INFO] [engine.py:939:_run_pass] Running pass InsertBeamSearch
[2023-08-23 23:44:07,092] [DEBUG] [engine.py:957:_run_pass] Loading model from cache ...
[2023-08-23 23:44:07,093] [INFO] [engine.py:939:_run_pass] Running pass AppendPrePostProcessingOps
[2023-08-23 23:44:07,094] [DEBUG] [engine.py:957:_run_pass] Loading model from cache ...
[2023-08-23 23:44:07,095] [DEBUG] [engine.py:1076:_evaluate_model] Evaluating model ...
[2023-08-23 23:44:07,096] [DEBUG] [engine.py:1087:_evaluate_model] Loading evaluation from cache ...
[2023-08-23 23:44:07,096] [DEBUG] [engine.py:917:_run_passes] Signal: {'latency-avg': 1079.78704}
[2023-08-23 23:44:07,096] [DEBUG] [engine.py:499:run_no_search] Engine output_name is provided. Will ignore output_name for final pass
[2023-08-23 23:44:07,227] [INFO] [engine.py:384:run] Package top ranked 1 models as artifacts
[2023-08-23 23:44:07,227] [INFO] [packaging_generator.py:47:_generate_zipfile_output] Packaging Zipfile output artifacts
[2023-08-23 23:44:07,281] [DEBUG] [resource_path.py:147:create_resource_path] Resource path /tmp/tmpq7o_oxgh/CandidateModels/cpu-cpu/BestCandidateModel_1/olive_tmpkcdwbdnu/model.onnx is inferred to be of type file.
[2023-08-23 23:44:07,288] [DEBUG] [utils.py:21:run_subprocess] Running command: python -m pip download onnxruntime-extensions==0.8.0 --no-deps -d /tmp/tmpq7o_oxgh/ONNXRuntimePackages/Python with env: None
[2023-08-23 23:44:08,206] [DEBUG] [utils.py:21:run_subprocess] Running command: python -m pip download onnxruntime==1.15.1 --no-deps -d /tmp/tmpq7o_oxgh/ONNXRuntimePackages/Python with env: None
(olive-env) (base) vmitro@v3629:~/projects/olive_stuff/Olive/examples/whisper$ python test_transcription.py --config whisper_cpu_int8.json
> /home/vmitro/projects/olive_stuff/Olive/examples/whisper/test_transcription.py(71)main()
-> for model_json in output_model_json_path.glob(f"**/{config['engine']['output_name']}_cpu-cpu_model.json"):
[2023-08-23 23:45:27,367] [WARNING] [__init__.py:212:_is_valid_ep] Error: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /home/vmitro/projects/olive_stuff/Olive/examples/whisper/models/conversion-transformers_optimization-onnx_dynamic_quantization-insert_beam_search-prepost/whisper_cpu_int8_cpu-cpu_model.onnx failed:This is an invalid model. In Node, ("BeamSearch_node", BeamSearch, "com.microsoft", -1) : ("log_mel": tensor(float),"max_length": tensor(int32),"min_length": tensor(int32),"num_beams": tensor(int32),"num_return_sequences": tensor(int32),"length_penalty": tensor(float),"repetition_penalty": tensor(float),"","","","","",) -> ("sequences",) , Error Node (BeamSearch_node) has input size 12 not in range [min=5, max=10].Olive will ignore this CPUExecutionProvider.Please make sure the environment with CPUExecutionProvider has the required dependencies.
Traceback (most recent call last):
  File "/home/vmitro/projects/olive_stuff/Olive/examples/whisper/test_transcription.py", line 108, in <module>
    output_text = main()
  File "/home/vmitro/projects/olive_stuff/Olive/examples/whisper/test_transcription.py", line 98, in main
    session = olive_model.prepare_session(None, "cpu")
  File "/home/vmitro/projects/olive_stuff/olive-env/lib/python3.10/site-packages/olive/model/__init__.py", line 346, in prepare_session
    return get_ort_inference_session(self.model_path, inference_settings, self.use_ort_extensions)
  File "/home/vmitro/projects/olive_stuff/olive-env/lib/python3.10/site-packages/olive/common/ort_inference.py", line 64, in get_ort_inference_session
    sess = ort.InferenceSession(
  File "/home/vmitro/projects/olive_stuff/olive-env/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 383, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/vmitro/projects/olive_stuff/olive-env/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 424, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /home/vmitro/projects/olive_stuff/Olive/examples/whisper/models/conversion-transformers_optimization-onnx_dynamic_quantization-insert_beam_search-prepost/whisper_cpu_int8_cpu-cpu_model.onnx failed:This is an invalid model. In Node, ("BeamSearch_node", BeamSearch, "com.microsoft", -1) : ("log_mel": tensor(float),"max_length": tensor(int32),"min_length": tensor(int32),"num_beams": tensor(int32),"num_return_sequences": tensor(int32),"length_penalty": tensor(float),"repetition_penalty": tensor(float),"","","","","",) -> ("sequences",) , Error Node (BeamSearch_node) has input size 12 not in range [min=5, max=10].

When installing the latest onnxruntime (1.16.0-dev*) through python -m pip install ort-nightly --index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ the transcription seems to work (I haven't tested with an audio file of spoken German, but the supplied file gets transcribed into German-looking text).

When loaded into the Android ONNX Runtime (latest ver 1.15.1) I get exactly the same error. When I compare the two repos' config.json files (openai's and the aforementioned), some decoder_* and encoder_* options differ.

Could this be an ONNX Runtime bug? I'm currently building the Android runtime from source and will report if I get it to work using my build.

Version?

Olive: 0.3.1
ONNX Runtime: 1.15.1

[Bug]: Optimization of Unet fails 6950 XT

What happened?

This appeared to me to be the same issue as #510 and #301, though I may be wrong. I ran the following commands:

  • conda create --name olive python=3.9
  • conda activate olive
  • pip install olive-ai[directml]==0.3.1
  • git clone https://github.com/microsoft/olive --branch v0.3.1
  • cd (to relevant directory)
  • pip install -r requirements.txt
  • python stable_diffusion_xl.py --optimize

I've attached the log, as well as a DXDIAG, but it errors out when optimizing unet saying "failed to run olive on gpu-dml".... "887a0006 the gpu will not respond to more commands".

DxDiag.txt
ErrorLog.txt

Version?

0.3.1

Raise EOFError

When I try Optimize_ONNX_Models_Latency_with_OLive.ipynb, the following occurs

2022-08-08 15:25:49,680 - olive.optimization_config - INFO - Checking the model file...
2022-08-08 15:25:49,695 - olive.optimization_config - INFO - Providers will be tested for optimization: ['CPUExecutionProvider', 'DnnlExecutionProvider']
2022-08-08 15:25:52,705 - olive.optimization_config - INFO - Checking the model file...
2022-08-08 15:25:52,720 - olive.optimization_config - INFO - Providers will be tested for optimization: ['CPUExecutionProvider', 'DnnlExecutionProvider']
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.6/multiprocessing/spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.6/multiprocessing/spawn.py", line 114, in _main
    prepare(preparation_data)
  File "/usr/lib/python3.6/multiprocessing/spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/usr/lib/python3.6/multiprocessing/spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "/usr/lib/python3.6/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/lib/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/hyl/Downloads/low_latency.py", line 24, in <module>
    result = optimize(opt_config)
  File "/home/hyl/.local/bin/.virtualenvs/Olive_cpu_py36/lib/python3.6/site-packages/olive/optimize.py", line 24, in optimize
    pretuning_inference_result = get_benchmark(optimization_config)
  File "/home/hyl/.local/bin/.virtualenvs/Olive_cpu_py36/lib/python3.6/site-packages/olive/optimization/tuning_process.py", line 202, in get_benchmark
    manager = Manager()
  File "/usr/lib/python3.6/multiprocessing/context.py", line 56, in Manager
    m.start()
  File "/usr/lib/python3.6/multiprocessing/managers.py", line 513, in start
    self._process.start()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.6/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/usr/lib/python3.6/multiprocessing/spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "/usr/lib/python3.6/multiprocessing/spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

Traceback (most recent call last):
  File "low_latency.py", line 24, in <module>
    result = optimize(opt_config)
  File "/home/hyl/.local/bin/.virtualenvs/Olive_cpu_py36/lib/python3.6/site-packages/olive/optimize.py", line 24, in optimize
    pretuning_inference_result = get_benchmark(optimization_config)
  File "/home/hyl/.local/bin/.virtualenvs/Olive_cpu_py36/lib/python3.6/site-packages/olive/optimization/tuning_process.py", line 202, in get_benchmark
    manager = Manager()
  File "/usr/lib/python3.6/multiprocessing/context.py", line 56, in Manager
    m.start()
  File "/usr/lib/python3.6/multiprocessing/managers.py", line 517, in start
    self._address = reader.recv()
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

Crashes in the middle of the optimization process (KeyError: 'throughput')

Hi,
The program crashes while optimizing -

Steps to reproduce
installation

wget https://olivewheels.blob.core.windows.net/repo/onnxruntime_olive-0.4.0-py3-none-any.whl
pip install onnxruntime_olive-0.4.0-py3-none-any.whl
pip install --extra-index-url https://olivewheels.azureedge.net/test mlperf_loadgen
pip install --extra-index-url https://olivewheels.azureedge.net/test onnxruntime_gpu_tensorrt==1.11.0

Use

from olive.optimization_config import OptimizationConfig
from olive.optimize import optimize

opt_config = OptimizationConfig(
    model_path="models.onnx",
    result_path="opt_throughput_result",
    throughput_tuning_enabled=True,
    inputs_spec={
        "input": [
            -1,
            3,
            512,
            512,
        ]
    },
    max_latency_percentile=0.95,
    max_latency_ms=1000,
    threads_num=4,
    dynamic_batching_size=32,
    min_duration_sec=10,
)
if __name__ == "__main__":
    result = optimize(opt_config)

This runs for some time, then crashes:

2022-05-19 09:19:09,930 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:19:09,943 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:19:11,625 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:19:11,638 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:19:13,204 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:19:13,224 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:21:07,504 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:21:07,675 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:21:14,154 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:21:14,179 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:24:23,212 - olive.optimization.tuning_process - ERROR - Optimization failed for tuning combo (None, None, None, 'TensorrtExecutionProvider', <ExecutionMode.ORT_SEQUENTIAL: 0>, 99)
2022-05-19 09:24:28,503 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:24:28,809 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:24:34,735 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:24:34,761 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:27:43,921 - olive.optimization.tuning_process - ERROR - Optimization failed for tuning combo (None, None, None, 'TensorrtExecutionProvider', <ExecutionMode.ORT_PARALLEL: 1>, 99)
2022-05-19 09:27:49,552 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:27:49,774 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:27:55,796 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:27:55,822 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:29:40,752 - olive.optimization.tuning_process - ERROR - Optimization failed for tuning combo (None, None, None, 'CUDAExecutionProvider', <ExecutionMode.ORT_SEQUENTIAL: 0>, 99)
2022-05-19 09:29:47,356 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:29:47,603 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:29:52,975 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:29:53,001 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:31:38,742 - olive.optimization.tuning_process - ERROR - Optimization failed for tuning combo (None, None, None, 'CUDAExecutionProvider', <ExecutionMode.ORT_PARALLEL: 1>, 99)
2022-05-19 09:31:44,725 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:31:44,947 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:31:50,856 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:31:50,884 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:34:16,662 - olive.optimization.tuning_process - ERROR - Optimization failed for tuning combo (None, None, None, 'CPUExecutionProvider', <ExecutionMode.ORT_SEQUENTIAL: 0>, 99)
2022-05-19 09:34:22,604 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:34:22,820 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:34:28,909 - olive.optimization_config - INFO - Checking the model file...
2022-05-19 09:34:28,934 - olive.optimization_config - INFO - Providers will be tested for optimization: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
2022-05-19 09:36:22,542 - olive.optimization.tuning_process - ERROR - Optimization failed for tuning combo (None, None, None, 'CPUExecutionProvider', <ExecutionMode.ORT_PARALLEL: 1>, 99)
Traceback (most recent call last):
  File "/cnvrg/onnx_opt/onnx_optimization.py", line 23, in <module>
    result = optimize(opt_config)
  File "/usr/local/lib/python3.8/dist-packages/olive/optimize.py", line 36, in optimize
    olive_result = parse_tuning_result(optimization_config, *tuning_results, pretuning_inference_result)
  File "/usr/local/lib/python3.8/dist-packages/olive/optimize.py", line 59, in parse_tuning_result
    best_test_name = max(tuning_results, key=lambda x: x["throughput"])["test_name"]
  File "/usr/local/lib/python3.8/dist-packages/olive/optimize.py", line 59, in <lambda>
    best_test_name = max(tuning_results, key=lambda x: x["throughput"])["test_name"]
KeyError: 'throughput'

I am not sure about the exact issue, but could this perhaps be wrapped in a try-except so the whole process doesn't fail? (A sketch of the kind of guard I mean is below.)

P.S. Are there any details about the environment that I should add?
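For reference, a minimal sketch of the guard I have in mind (the function shape is hypothetical, not Olive's actual parse_tuning_result): skip tuning combos that never produced a throughput measurement and fall back to the pretuning result instead of raising KeyError.

def pick_best_test_name(tuning_results, pretuning_result):
    # Keep only combos that actually produced a throughput measurement.
    valid = [r for r in tuning_results if "throughput" in r]
    if not valid:
        # Every tuning combo failed; fall back to the pretuning result.
        return pretuning_result["test_name"]
    return max(valid, key=lambda r: r["throughput"])["test_name"]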

Missing Keras in requirements.txt

The onnx-converter requirements.txt should install keras.

{'conversion_status': 'FAILED',
 'correctness_verified': 'FAILED',
 'error_message': "No module named 'keras'",
 'input_folder': '',
 'output_onnx_path': ''}

Traceback (most recent call last):
  File "src/onnx_converter.py", line 343, in <module>
    main()
  File "src/onnx_converter.py", line 309, in main
    raise e
  File "src/onnx_converter.py", line 299, in main
    convert_models(args)
  File "src/onnx_converter.py", line 277, in convert_models
    converter(args)
  File "src/onnx_converter.py", line 150, in keras2onnx
    import keras
ModuleNotFoundError: No module named 'keras'
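Until keras is added to the requirements, here is a minimal sketch of a guarded import (the helper name is hypothetical, not the converter's actual code) that fails with an actionable message instead of a bare ModuleNotFoundError:

def import_keras_or_explain():
    # Import lazily so only Keras conversions need the dependency,
    # and point the user at the fix when it is missing.
    try:
        import keras
        return keras
    except ModuleNotFoundError as exc:
        raise RuntimeError(
            "keras is required for keras2onnx conversion; "
            "install it with `pip install keras` or add it to requirements.txt."
        ) from exc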

Whisper Example on ORT master

The documentation specifies that we still use a specific nightly build of ORT to optimize the Whisper model. However, the dummy-input issue tracked in microsoft/onnxruntime#15936 and microsoft/onnxruntime@e518933 blocks execution of the optimized model through the OpenVINO Execution Provider: OpenVINO model compilation removes any dummy inputs from the graph, but since a dummy input is present in the Olive-optimized model (produced with the ORT 1.15 nightly), the model fails to load. Hence, OVEP requires the fixes for the dummy-input issue in ORT master.

When I use ORT master to optimize Whisper through Olive, I get this error

out = model(inputs.encoder_input_ids, inputs.encoder_attention_mask, inputs.decoder_input_ids)
AttributeError: 'WhisperEncoderDecoderInitInputs' object has no attribute 'encoder_attention_mask'

Mlperf_loadgen

I had to run "pip install --extra-index-url https://olivewheels.azureedge.net/test mlperf_loadgen" before running this tutorial: https://github.com/microsoft/OLive/blob/master/notebook-tutorial/Optimize_ONNX_Models_Throughput_with_OLive.ipynb

Missing module when running
from olive.optimize import optimize

Question: what about Nvidia Ada Fp8 support?

Hi,
Just a question: is FP8 on the new NVIDIA Ada GPUs supported on DirectML and Olive?
If not, are there any plans to support it?
I assume it could bring another 2x speedup to the Stable Diffusion sample?
Thanks.

[Bug]: onnxruntime.capi.onnxruntime_pybind11_state.Fail with save_as_external_data

What happened?

I modified the example config file from this repository because it failed to optimize a large model.

Warning while optimizing large model:

[WARNING] [common.py:88:model_proto_to_file] Model is too large to save as a single file but 'save_as_external_data' is False. Saved tensors as external data regardless.

Modified config:

{
  "input_model": {
    "type": "PyTorchModel",
    "config": {
      "model_path": "...",
      "model_loader": "unet_load",
      "model_script": "...",
      "io_config": {
        "input_names": [
          "sample",
          "timestep",
          "encoder_hidden_states",
          "return_dict"
        ],
        "output_names": ["out_sample"],
        "dynamic_axes": {
          "sample": {
            "0": "unet_sample_batch",
            "1": "unet_sample_channels",
            "2": "unet_sample_height",
            "3": "unet_sample_width"
          },
          "timestep": { "0": "unet_time_batch" },
          "encoder_hidden_states": {
            "0": "unet_hidden_batch",
            "1": "unet_hidden_sequence"
          }
        }
      },
      "dummy_inputs_func": "unet_conversion_inputs"
    }
  },
  "systems": {
    "local_system": {
      "type": "LocalSystem",
      "config": {
        "accelerators": ["gpu"]
      }
    }
  },
  "evaluators": {
    "common_evaluator": {
      "metrics": [
        {
          "name": "latency",
          "type": "latency",
          "sub_types": [{ "name": "avg" }],
          "user_config": {
            "user_script": "modules/sd_olive_scripts.py",
            "dataloader_func": "unet_data_loader",
            "batch_size": 2
          }
        }
      ]
    }
  },
  "passes": {
    "convert": {
      "type": "OnnxConversion",
      "config": {
        "target_opset": 14,
        "save_as_external_data": true,
        "all_tensors_to_one_file": true,
        "external_data_name": "weights.pb"
      }
    },
    "optimize": {
      "type": "OrtTransformersOptimization",
      "config": {
        "model_type": "unet",
        "float16": true,
        "use_gpu": true,
        "keep_io_types": false,
        "save_as_external_data": true, <- here
        "optimization_options": {
          "enable_gelu": true,
          "enable_layer_norm": true,
          "enable_attention": true,
          "use_multi_head_attention": true,
          "enable_skip_layer_norm": false,
          "enable_embed_layer_norm": true,
          "enable_bias_skip_layer_norm": false,
          "enable_bias_gelu": true,
          "enable_gelu_approximation": false,
          "enable_qordered_matmul": false,
          "enable_shape_inference": true,
          "enable_gemm_fast_gelu": false,
          "enable_nhwc_conv": false,
          "enable_group_norm": true,
          "enable_bias_splitgelu": false,
          "enable_packed_qkv": true,
          "enable_packed_kv": true,
          "enable_bias_add": false
        },
        "force_fp32_ops": ["RandomNormalLike"]
      }
    }
  },
  "engine": {
    "search_strategy": {
      "execution_order": "joint",
      "search_algorithm": "exhaustive"
    },
    "evaluator": "common_evaluator",
    "host": "local_system",
    "target": "local_system",
    "cache_dir": "cache",
    "output_name": "unet",
    "output_dir": "footprints",
    "execution_providers": ["DmlExecutionProvider"]
  }
}

When the optimization finished with the modified config, generation terminated with the error below.

      File "D:\miniconda3\envs\olivedml\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 1039, in from_pretrained
        loaded_sub_model = load_sub_model(
      File "D:\miniconda3\envs\olivedml\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 445, in load_sub_model
        loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
      File "D:\miniconda3\envs\olivedml\lib\site-packages\diffusers\pipelines\onnx_utils.py", line 205, in from_pretrained
        return cls._from_pretrained(
      File "D:\miniconda3\envs\olivedml\lib\site-packages\diffusers\pipelines\onnx_utils.py", line 172, in _from_pretrained
        model = OnnxRuntimeModel.load_model(
      File "D:\miniconda3\envs\olivedml\lib\site-packages\diffusers\pipelines\onnx_utils.py", line 77, in load_model
        return ort.InferenceSession(path, providers=[provider], sess_options=sess_options)
      File "D:\miniconda3\envs\olivedml\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 383, in __init__
        self._create_inference_session(providers, provider_options, disabled_optimizers)
      File "D:\miniconda3\envs\olivedml\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 435, in _create_inference_session
        sess.initialize_session(providers, provider_options, disabled_optimizers)
    onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException

What's the problem? The documentation says save_as_external_data is available for OrtTransformersOptimization.
When I tested with and without save_as_external_data on a smaller model, the same symptom still occurred.
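As a side note, here is a small diagnostic sketch (my own assumption, not part of Olive or diffusers) that lists any external-data files the optimized model references but that are missing from its folder; the model path below is a placeholder:

import os
import onnx
from onnx.external_data_helper import uses_external_data

model_path = "footprints/unet/model.onnx"  # placeholder path to the Olive-optimized model
model = onnx.load(model_path, load_external_data=False)  # read the graph without pulling tensor data

missing = set()
for tensor in model.graph.initializer:
    if uses_external_data(tensor):
        # external_data entries store the relative file name under the "location" key
        for entry in tensor.external_data:
            if entry.key == "location":
                data_file = os.path.join(os.path.dirname(model_path), entry.value)
                if not os.path.exists(data_file):
                    missing.add(entry.value)

print("Missing external data files:", sorted(missing) or "none")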

Version?

Python 3.10.11
onnxruntime-directml 1.15.0
olive-ai 0.2.1
torch 1.13.1
torchvision 0.14.1
numpy 1.23.4

[Bug]: Olive downloads a non-existent .nupkg package for Microsoft.ML.ONNXRuntime

What happened?

I'm not sure this will be reproducible, as it's totally dependent on the status of the ort-nightly Azure DevOps builds.

I noticed that VS refused to load up the Microsoft.ML.ONNXRuntime .nupkg from the model zip folder after running the workflow to package up openai/whisper-tiny.

Steps to reproduce:

  1. pip install ort-nightly at some 1.16.0 dev version for which there is no corresponding Microsoft.ML.ONNXRuntime nuget package available at https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ORT-Nightly/NuGet/Microsoft.ML.OnnxRuntime/versions/ (at the time of writing, 8/28/2023's ort-nightly python package had no corresponding Microsoft.ML.ONNXRuntime nuget package)
  2. Run the workflow to generate the openai/whisper-tiny ONNX model
  3. Unzip the resulting folder and either a) unzip the .nupkg with a tool like 7zip or b) set that folder as a local nuget repository in visual studio

After step 3, you should either see a failure to unzip or a failure to show the package in the NuGet package manager.

Hints I think I found:

  1. The file is created no matter the result of pulling the file from the web
  2. I think the Azure DevOps naming convention for the ONNXRuntime NuGet package is different from the naming for the Python ort-nightly package
  3. For Microsoft.ML.ONNXRuntime at some nightly version to work, you'd need (at least) the corresponding Microsoft.ML.ONNXRuntime.Managed nightly version, as well.

This is obviously non-blocking, since the workaround is to go back to an ort-nightly for which a corresponding Microsoft.ML.ONNXRuntime package already exists, but I'm sure it will trip up someone who isn't as lucky as I was when inspecting versions. I wonder whether the user experience should be to fail the build if the ort-nightly in use has no corresponding runtime support, or whether it's even feasible to find the NuGet packages given how different the naming conventions are (a sketch of such a pre-flight check follows the tl;dr).

tl;dr: the downloaded NuGet package is not real if the package does not exist on Azure DevOps and/or if the naming convention does not match.
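To illustrate the pre-flight check suggested above, here is a minimal sketch; the feed URL and version string are placeholders, and the flat-container layout is the standard NuGet v3 protocol rather than anything specific to Olive:

import requests

# Placeholder: substitute the real flat-container base URL of the ORT-Nightly feed.
FEED_FLAT_CONTAINER = "https://example.invalid/nuget/v3/flat2"

def nuget_version_exists(package_id: str, version: str) -> bool:
    # Package ids are lower-cased in the flat container; index.json lists all published versions.
    index_url = f"{FEED_FLAT_CONTAINER}/{package_id.lower()}/index.json"
    resp = requests.get(index_url, timeout=30)
    if resp.status_code != 200:
        return False
    return version in resp.json().get("versions", [])

# Placeholder version string: fail the packaging step instead of shipping a bogus .nupkg.
if not nuget_version_exists("Microsoft.ML.OnnxRuntime", "1.16.0-dev-20230828"):
    raise RuntimeError("No matching Microsoft.ML.OnnxRuntime NuGet package; pin ort-nightly to a published version.")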

Version?

84ac609

[Bug]: Doesn't work with openai/whisper-large-v2

What happened?

To reproduce the error:

In examples/whisper:
python prepare_whisper_configs.py --model_name openai/whisper-large-v2 --multilingual

Then, when running:
python -m olive.workflows.run --config whisper_cpu_int8.json

[2023-07-27 08:17:13,182] [DEBUG] [config.py:122:fill_in_params] Missing parameter data_dir for component load_dataset
[2023-07-27 08:17:13,192] [DEBUG] [engine.py:618:resolve_goals] Resolving goals: {'latency': {'avg': None}}
[2023-07-27 08:17:13,192] [DEBUG] [engine.py:637:resolve_goals] No baseline got as no goal is provided the the goal is threshold
[2023-07-27 08:17:13,192] [DEBUG] [engine.py:436:run_no_search] Step no search with search point {'OnnxConversion': {}, 'OrtTransformersOptimization': {}, 'OnnxDynamicQuantization': {}, 'InsertBeamSearch': {}, 'AppendPrePostProcessingOps': {}} ...
[2023-07-27 08:17:13,192] [INFO] [engine.py:882:_run_pass] Running pass OnnxConversion
[2023-07-27 08:17:13,193] [DEBUG] [engine.py:890:_run_pass] Loading model from cache ...
[2023-07-27 08:17:13,196] [INFO] [engine.py:882:_run_pass] Running pass OrtTransformersOptimization
[2023-07-27 08:17:13,197] [DEBUG] [engine.py:744:_prepare_non_local_model] Model path is None, local or string name. No need to prepare
[2023-07-27 08:17:18,135] [ERROR] [engine.py:942:_run_pass] Pass run failed.
Traceback (most recent call last):
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/engine/engine.py", line 930, in _run_pass
    output_model = host.run_pass(p, input_model, data_root, output_model_path, pass_search_point)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/systems/local.py", line 33, in run_pass
    return the_pass.run(model, data_root, output_model_path, point)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/passes/olive_pass.py", line 391, in run
    components.append(self._run_for_config(child, data_root, config, str(component_output_path)))
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/passes/onnx/transformer_optimization.py", line 104, in _run_for_config
    optimizer = transformers_optimizer.optimize_model(input=model.model_path, **run_config)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/onnxruntime/transformers/optimizer.py", line 323, in optimize_model
    model = load_model(temp_model_path or input)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/onnx/__init__.py", line 176, in load_model
    load_external_data_for_model(model, base_dir)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/onnx/external_data_helper.py", line 65, in load_external_data_for_model
    load_external_data_for_tensor(tensor, base_dir)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/onnx/external_data_helper.py", line 45, in load_external_data_for_tensor
    with open(external_data_file_path, "rb") as data_file:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp9o2iozdo/model.onnx.data'
[2023-07-27 08:17:18,135] [WARNING] [engine.py:362:run] Failed to run Olive on cpu-cpu: [Errno 2] No such file or directory: '/tmp/tmp9o2iozdo/model.onnx.data'
Traceback (most recent call last):
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/engine/engine.py", line 337, in run
    output = self.run_no_search(
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/engine/engine.py", line 443, in run_no_search
    ) = self._run_passes(next_step["passes"], model, model_id, data_root, accelerator_spec)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/engine/engine.py", line 845, in _run_passes
    model, model_id = self._run_pass(pass_id, pass_search_point, model, model_id, data_root, accelerator_spec)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/engine/engine.py", line 930, in _run_pass
    output_model = host.run_pass(p, input_model, data_root, output_model_path, pass_search_point)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/systems/local.py", line 33, in run_pass
    return the_pass.run(model, data_root, output_model_path, point)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/passes/olive_pass.py", line 391, in run
    components.append(self._run_for_config(child, data_root, config, str(component_output_path)))
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/olive/passes/onnx/transformer_optimization.py", line 104, in _run_for_config
    optimizer = transformers_optimizer.optimize_model(input=model.model_path, **run_config)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/onnxruntime/transformers/optimizer.py", line 323, in optimize_model
    model = load_model(temp_model_path or input)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/onnx/__init__.py", line 176, in load_model
    load_external_data_for_model(model, base_dir)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/onnx/external_data_helper.py", line 65, in load_external_data_for_model
    load_external_data_for_tensor(tensor, base_dir)
  File "/home/ykrasilnikov/.local/lib/python3.10/site-packages/onnx/external_data_helper.py", line 45, in load_external_data_for_tensor
    with open(external_data_file_path, "rb") as data_file:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp9o2iozdo/model.onnx.data'
[2023-07-27 08:17:18,144] [INFO] [engine.py:365:run] Package top ranked 0 models as artifacts
[2023-07-27 08:17:18,144] [WARNING] [packaging_generator.py:35:generate_output_artifacts] No model is selected. Skip packaging output artifacts.

Version?

Name: olive-ai
Version: 0.3.0
Summary: Olive is an easy-to-use hardware-aware model optimization tool that composes industry-leading techniques across model compression, optimization, and compilation.
Home-page: https://microsoft.github.io/Olive/
Author: Microsoft Corporation
Author-email: [email protected]
License: MIT License
Location: /home/ykrasilnikov/.local/lib/python3.10/site-packages
Requires: numpy, onnx, optuna, pandas, protobuf, pydantic, pyyaml, torch, torchmetrics, transformers
Required-by:

freeze

alembic==1.11.1
attrs==21.2.0
Automat==20.2.0
Babel==2.8.0
bcrypt==3.2.0
blinker==1.4
build==0.10.0
certifi==2020.6.20
chardet==4.0.0
click==8.0.3
cloud-init==23.2.1
cmaes==0.10.0
cmake==3.27.0
colorama==0.4.4
coloredlogs==15.0.1
colorlog==6.7.0
command-not-found==0.3
configobj==5.0.6
constantly==15.1.0
contextlib2==21.6.0
contourpy==1.1.0
cryptography==3.4.8
cycler==0.11.0
dbus-python==1.2.18
Deprecated==1.2.14
distro==1.7.0
distro-info===1.1build1
filelock==3.12.2
flatbuffers==23.5.26
fonttools==4.41.1
fsspec==2023.6.0
greenlet==2.0.2
httplib2==0.20.2
huggingface-hub==0.16.4
humanfriendly==10.0
hyperlink==21.0.0
idna==3.3
importlib-metadata==4.6.4
incremental==21.3.0
jeepney==0.7.1
Jinja2==3.0.3
joblib==1.3.1
jsonpatch==1.32
jsonpointer==2.0
jsonschema==3.2.0
keyring==23.5.0
kiwisolver==1.4.4
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
lit==16.0.6
Mako==1.2.4
MarkupSafe==2.0.1
matplotlib==3.7.2
more-itertools==8.10.0
mpmath==1.3.0
netifaces==0.11.0
networkx==3.1
neural-compressor==2.2.1
numpy==1.25.1
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.2.10.91
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusparse-cu11==11.7.4.91
nvidia-nccl-cu11==2.14.3
nvidia-nvtx-cu11==11.7.91
oauthlib==3.2.0
olive-ai @ file:///home/ykrasilnikov/Olive
onnx==1.14.0
onnxruntime-extensions==0.8.0
opencv-python-headless==4.8.0.74
optuna==3.2.0
ort-nightly==1.16.0.dev20230725005
packaging==23.1
pandas==2.0.3
pexpect==4.8.0
Pillow==10.0.0
pip-tools==7.1.0
prettytable==3.8.0
protobuf==3.20.3
psutil==5.9.5
ptyprocess==0.7.0
py-cpuinfo==9.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.1
pycocotools==2.0.6
pydantic==1.10.12
PyGObject==3.42.1
PyHamcrest==2.0.2
PyJWT==2.3.0
pyOpenSSL==21.0.0
pyparsing==2.4.7
pyproject_hooks==1.0.0
pyrsistent==0.18.1
pyserial==3.5
python-apt==2.4.0+ubuntu1
python-dateutil==2.8.2
python-debian==0.1.43+ubuntu1.1
python-magic==0.4.24
pytz==2022.1
PyYAML==5.4.1
regex==2023.6.3
requests==2.25.1
safetensors==0.3.1
schema==0.7.5
scikit-learn==1.3.0
scipy==1.11.1
SecretStorage==3.3.1
service-identity==18.1.0
six==1.16.0
sos==4.4
SQLAlchemy==2.0.19
ssh-import-id==5.11
sympy==1.12
systemd-python==234
threadpoolctl==3.2.0
tokenizers==0.13.3
tomli==2.0.1
torch==2.0.1
torchmetrics==0.10.0
tqdm==4.65.0
transformers==4.31.0
triton==2.0.0
Twisted==22.1.0
typing_extensions==4.7.1
tzdata==2023.3
ubuntu-advantage-tools==8001
ubuntu-drivers-common==0.0.0
ufw==0.36.1
unattended-upgrades==0.1
urllib3==1.26.5
wadllib==1.3.6
wcwidth==0.2.6
wrapt==1.15.0
xkit==0.0.0
zipp==1.0.0
zope.interface==5.4.0

Optimizing Fail (AMD CPU)

I get a failure when running the optimization command, along with an AMD Bug Report Tool pop-up (driver timeout).
My laptop has an NVIDIA RTX 3070 Max-Q and an AMD CPU.

Copy of the Error:

[WARNING] [engine.py:307:run] Failed to run Olive on gpu-dml: [ONNXRuntimeError] : 1 : FAIL : D:\a_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\ExecutionProvider.cpp(896)\onnxruntime_pybind11_state.pyd!00007FFB7134FE91: (caller: 00007FFB713508BF) Exception(2) tid(4ac4) 887A0006 The GPU will not respond to more commands, most likely because of an invalid command passed by the calling application.

The output is incorrect given a specific latent

latents.zip
latents.zip includes latent1.npy in fp32 and latent2.npy in fp16.
The output of the Olive-optimized model is totally black when using latent1, while it works with latent2. I did do the data type conversion in the script.
Besides, I also tested latent1_fp32 with a common SD 1.5 ONNX fp16 model, and it works as well.
So it looks like there are some other requirements on the data range when using an Olive-optimized model? (See the conversion sketch after the script below.)

Environment: onnxruntime 1.15, Olive master, torch1.3
Script:

from diffusers import StableDiffusionOnnxPipeline
import numpy as np

pipe = StableDiffusionOnnxPipeline.from_pretrained(
    r".\Olive\examples\directml\stable_diffusion\models\optimized\runwayml\stable-diffusion-v1-5",
    provider="DmlExecutionProvider",
)
prompt = "a photo of an astronaut riding a horse on mars"
latents = np.load(r"latents.npy").astype(np.float16)
h, w = 512, 512
image = pipe(prompt, height=h, width=w, num_inference_steps=20, latents=latents).images[0]
image.save("mansion.png")
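If the issue is indeed the data range, here is a minimal sketch of what I mean by the conversion (my own guess, not a confirmed fix): clamp the fp32 latents to the representable float16 range before casting, since values that overflow to inf/NaN in fp16 can produce an all-black image.

import numpy as np

latents_fp32 = np.load("latent1.npy")        # fp32 latents from the attached zip
fp16_max = float(np.finfo(np.float16).max)   # 65504.0
latents_fp16 = np.clip(latents_fp32, -fp16_max, fp16_max).astype(np.float16)
assert np.isfinite(latents_fp16).all()       # no inf/NaN left after the cast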

Some parameters were removed in the latest version conversion

A few reminders that might be useful.

Reference linking:
https://github.com/microsoft/OLive/blob/0b989430f62ba215132a61815df2cabd43a6f668/olive/conversion/pytorch_converter.py#L58
https://github.com/microsoft/OLive/blob/0b989430f62ba215132a61815df2cabd43a6f668/olive/conversion/tensorflow_converter.py#L39

PyTorch-ONNX removed the example_outputs arg from torch.onnx.export in the PyTorch 1.11 release.
TensorFlow-ONNX removed const_fold/fold_constant in its latest main branch (see the linked PR).
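As a minimal sketch of the PyTorch side (a toy model, not Olive's converter): on PyTorch 1.11+ the outputs are inferred from a forward pass on the dummy inputs, so example_outputs is simply dropped.

import torch

model = torch.nn.Linear(4, 2).eval()
dummy_input = torch.randn(1, 4)

torch.onnx.export(
    model,
    dummy_input,
    "linear.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=13,
    # example_outputs is no longer accepted in PyTorch >= 1.11
)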

[FR]: Make vitis-ai quantization compatible with ORT 1.16.0+

Proposal Summary

The Vitis AI code uses some functions that are not available in ORT 1.16.0, which is currently in development. Refer to #380 for more details.

What component(s) does this request affect?

  • OliveModels
  • OliveSystems
  • OliveEvaluator
  • Metrics
  • Engine
  • Passes
  • Other

I don't see a difference on GTX 1080

I have gone through all the steps described in examples\directml\stable_diffusion.
I also installed the latest NVIDIA drivers.
I ran the optimization (--optimize) on the standard stable-diffusion-v1-5 model. After optimizing the ONNX pipeline, I launched stable_diffusion.py --num_inference_steps 50 --num_images 1; it took 24-26 seconds to generate, but without ONNX I also get 24-26 seconds. Shouldn't there be acceleration?


OrtPerfTuning failed when running the example.

I was trying to run the DirectML SqueezeNet example, but I got an error.

[2023-08-10 14:09:39,485] [DEBUG] [engine.py:539:resolve_goals] Resolving goals: {'latency': {'avg': None}}
[2023-08-10 14:09:39,487] [DEBUG] [engine.py:558:resolve_goals] No baseline got as no goal is provided the the goal is threshold
[2023-08-10 14:09:39,491] [DEBUG] [engine.py:460:run_search] Step 1 with search point {'OnnxConversion': {}, 'OnnxFloatToFloat16': {}, 'OrtPerfTuning': {}} ...
[2023-08-10 14:09:39,493] [DEBUG] [engine.py:725:_run_passes] Running pass OnnxConversion
Using cache found in C:\Users\yzhou/.cache\torch\hub\pytorch_vision_v0.10.0
============== Diagnostic Run torch.onnx.export version 2.0.1+cpu ==============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

[2023-08-10 14:09:41,043] [DEBUG] [engine.py:725:_run_passes] Running pass OnnxFloatToFloat16
e:\Users\yzhou\onnx projects\Olive_test\venv\lib\site-packages\onnxconverter_common\float16.py:43: UserWarning: the float32 number 1.7538578589437748e-08 will be truncated to 1e-07
  warnings.warn("the float32 number {} will be truncated to {}".format(pos_min, min_positive_val))
[2023-08-10 14:09:41,302] [DEBUG] [engine.py:725:_run_passes] Running pass OrtPerfTuning
[2023-08-10 14:09:42,588] [INFO] [perf_tuning.py:72:tune_onnx_model] Run tuning for: [('provider', 'DmlExecutionProvider'), ('execution_mode', 'ORT_SEQUENTIAL'), ('ort_opt_level', 99), ('io_bind', False)]
ERROR:root:Optimization failed for tuning combo ('DmlExecutionProvider', 'ORT_SEQUENTIAL', 99, False)
[2023-08-10 14:11:15,823] [INFO] [perf_tuning.py:81:tune_onnx_model] Best result: {'test_name': 'pretuning', 'latency_ms': 2.4117}
[2023-08-10 14:11:15,844] [DEBUG] [engine.py:898:_evaluate_model] Evaluating model ...
[2023-08-10 14:11:16,349] [DEBUG] [footprint.py:90:resolve_metrics] There is no goal set for metric: {metric_name}.
[2023-08-10 14:11:16,350] [DEBUG] [footprint.py:90:resolve_metrics] There is no goal set for metric: {metric_name}.
[2023-08-10 14:11:16,350] [DEBUG] [engine.py:765:_run_passes] Signal: {'latency-avg': 13.23219, 'latency-max': 14.0571, 'latency-min': 11.7933}
[2023-08-10 14:11:16,355] [INFO] [footprint.py:168:get_pareto_frontier] pareto frontier points: 2_OrtPerfTuning-1-db4d7470ffd292aaa61802d59d85b66a-gpu-cpu {'latency-avg': 13.23219, 'latency-max': 14.0571, 'latency-min': 11.7933}
[2023-08-10 14:11:16,356] [INFO] [engine.py:475:run_search] Output all 1 models
[2023-08-10 14:11:16,501] [DEBUG] [engine.py:539:resolve_goals] Resolving goals: {'latency': {'avg': None}}
[2023-08-10 14:11:16,502] [DEBUG] [engine.py:558:resolve_goals] No baseline got as no goal is provided the the goal is threshold
[2023-08-10 14:11:16,504] [DEBUG] [engine.py:460:run_search] Step 1 with search point {'OnnxConversion': {}, 'OnnxFloatToFloat16': {}, 'OrtPerfTuning': {}} ...
[2023-08-10 14:11:16,504] [DEBUG] [engine.py:725:_run_passes] Running pass OnnxConversion
[2023-08-10 14:11:16,507] [DEBUG] [engine.py:789:_run_pass] Loading model from cache ...
[2023-08-10 14:11:16,510] [DEBUG] [engine.py:725:_run_passes] Running pass OnnxFloatToFloat16
[2023-08-10 14:11:16,513] [DEBUG] [engine.py:789:_run_pass] Loading model from cache ...
[2023-08-10 14:11:16,527] [DEBUG] [engine.py:725:_run_passes] Running pass OrtPerfTuning
[2023-08-10 14:11:16,539] [DEBUG] [engine.py:789:_run_pass] Loading model from cache ...
[2023-08-10 14:11:16,542] [DEBUG] [engine.py:898:_evaluate_model] Evaluating model ...
[2023-08-10 14:11:16,550] [DEBUG] [engine.py:902:_evaluate_model] Loading evaluation from cache ...
[2023-08-10 14:11:16,551] [DEBUG] [footprint.py:90:resolve_metrics] There is no goal set for metric: {metric_name}.
[2023-08-10 14:11:16,552] [DEBUG] [footprint.py:90:resolve_metrics] There is no goal set for metric: {metric_name}.
[2023-08-10 14:11:16,552] [DEBUG] [engine.py:765:_run_passes] Signal: {'latency-avg': 13.23219, 'latency-max': 14.0571, 'latency-min': 11.7933}
[2023-08-10 14:11:16,557] [INFO] [footprint.py:168:get_pareto_frontier] pareto frontier points: 2_OrtPerfTuning-1-db4d7470ffd292aaa61802d59d85b66a-gpu-cpu {'latency-avg': 13.23219, 'latency-max': 14.0571, 'latency-min': 11.7933}
[2023-08-10 14:11:16,557] [INFO] [engine.py:475:run_search] Output all 1 models
[2023-08-10 14:11:16,559] [INFO] [engine.py:318:run] No packaging config provided, skip packaging artifacts

And it only generated two models. I think it should generate three; there should be one more for OrtPerfTuning.

No change to the code. Does anybody have the same issue?
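One thing worth checking (a suggestion on my side, not from the logs): whether the onnxruntime build in the virtual environment actually exposes the DirectML provider, since a missing provider makes every DmlExecutionProvider tuning combo fail.

import onnxruntime as ort

print(ort.__version__)
# Expect 'DmlExecutionProvider' in this list when onnxruntime-directml is installed.
print(ort.get_available_providers())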
