Comments (6)
Hi @Rishit-dagli, what error did you get without using the audio decoder? Also, in the latest commit, we added detailed code for the custom components of the Whisper model. Feel free to update it to meet your requirements.
from olive.
@xiaoyu-work Ah, yes, missed that:
This is what I see without using audio decoder:
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running BeamSearch node. Name:'BeamSearch_node' Status Message: Input 'attention_mask' is expected to have same shape as input_ids
Also, in the latest commit, we added detailed code for the custom components of the Whisper model. Feel free to update it to meet your requirements.
The model exported with the latest commit still expects (1, x)-shaped inputs, as I see from this message. Do you think I might have missed something?
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: audio_stream for the following indices
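Both errors above are shape disagreements between what the exported model declares and what is fed to it. A minimal sketch of how such a mismatch could be spotted before calling the model is shown below. The helper `find_shape_mismatches` and the symbolic dimension name `stream_length` are hypothetical and only illustrate the idea; with onnxruntime, the declared shapes could be collected from `session.get_inputs()`.

```python
import numpy as np

def find_shape_mismatches(expected_shapes, feeds):
    """Return {input_name: (expected, actual)} for every feed whose shape disagrees.

    expected_shapes maps input names to declared shapes. A dimension that is
    None or a string (symbolic, e.g. "batch") matches any size.
    """
    mismatches = {}
    for name, expected in expected_shapes.items():
        if name not in feeds:
            # Input declared by the model but missing from the feed dict.
            mismatches[name] = (tuple(expected), None)
            continue
        actual = np.asarray(feeds[name]).shape
        same_rank = len(actual) == len(expected)
        dims_ok = same_rank and all(
            not isinstance(e, int) or e == a for e, a in zip(expected, actual)
        )
        if not dims_ok:
            mismatches[name] = (tuple(expected), actual)
    return mismatches

# Hypothetical declared shape, mirroring the (1, x) input mentioned above.
# With onnxruntime this dict could be built as
# {i.name: i.shape for i in session.get_inputs()}.
expected = {"audio_stream": [1, "stream_length"]}
feeds = {"audio_stream": np.zeros((4, 16000))}
report = find_shape_mismatches(expected, feeds)
print(report)
```

An empty report means every fed array is compatible with the declared shape; any entry names the input to fix.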
@Rishit-dagli, right now the audio decoder in onnxruntime-extensions only supports 1-D audio input. Can you share more details about how you create the N-dimensional batched audio inputs? We may consider adding support for them in the next release.
Right now, the audio decoder in onnxruntime-extensions only supports 1-D audio input
Ah, I see!
I also tried without the audio decoder, loading the data with:
import librosa
import numpy as np
audio_blob, _ = librosa.load(path)
audio_blob = np.expand_dims(audio_blob, axis=0)
This gets me a (1, x)-sized array, which I then rearrange into an (x//b, b)-sized array and pad. Running the model without the audio decoder now seems to perform inference, but gives outputs of the pattern:
first batch transcribed well
transcribed well !!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
After the first batch it starts returning exclamation marks and I have not been able to understand what causes this.
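For reference, the rearrangement described above can be sketched as follows. This is only an illustration under stated assumptions: the chunk length is set to 30 seconds at 16 kHz, which is the window Whisper's own preprocessing uses, and `chunk_audio` is a hypothetical helper, not part of Olive or onnxruntime-extensions. Note also that `librosa.load` resamples to 22,050 Hz unless `sr` is passed, so `sr=16000` would be needed for the sample counts below to correspond to 30 seconds.

```python
import numpy as np

SAMPLE_RATE = 16_000   # Whisper models expect 16 kHz mono audio
CHUNK_SECONDS = 30     # Whisper's fixed input window length

def chunk_audio(audio, chunk_len=SAMPLE_RATE * CHUNK_SECONDS):
    """Split 1-D audio into rows of chunk_len samples, zero-padding the tail."""
    audio = np.asarray(audio, dtype=np.float32).reshape(-1)
    n_chunks = max(1, -(-audio.size // chunk_len))  # ceiling division
    padded = np.zeros(n_chunks * chunk_len, dtype=np.float32)
    padded[: audio.size] = audio
    return padded.reshape(n_chunks, chunk_len)

# 70 s of synthetic audio becomes three 30 s rows; the last row is mostly padding.
chunks = chunk_audio(np.ones(70 * SAMPLE_RATE, dtype=np.float32))
print(chunks.shape)
```

Padding only the final row keeps every earlier row aligned with a contiguous stretch of the recording, which makes it easier to compare each row's transcription against the source audio.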
In that case, I don't think batched input will solve your problem. Check the OpenAI example (https://github.com/openai/whisper/blob/main/whisper/transcribe.py) to see how to transcribe long audio files.
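The linked script handles long files by walking through the audio one 30-second window at a time rather than batching windows together. A simplified sketch of that control flow is below. It is not the script's implementation: the real code advances its position using predicted timestamps, whereas this sketch uses a fixed stride, and `transcribe_long`, `transcribe_chunk`, and `fake_transcribe` are hypothetical names introduced for illustration.

```python
from typing import Callable, List

import numpy as np

def transcribe_long(audio: np.ndarray,
                    transcribe_chunk: Callable[[np.ndarray], str],
                    window: int = 480_000) -> str:
    """Transcribe 1-D audio window by window and join the pieces."""
    pieces: List[str] = []
    for start in range(0, audio.size, window):
        segment = audio[start:start + window]
        if segment.size < window:
            # Zero-pad the final, shorter window to the full length.
            segment = np.pad(segment, (0, window - segment.size))
        # Feed one window at a time, shaped (1, window).
        text = transcribe_chunk(segment[np.newaxis, :])
        pieces.append(text.strip())
    return " ".join(p for p in pieces if p)

# Stand-in for the real model call, used only to show the control flow.
def fake_transcribe(batch: np.ndarray) -> str:
    return f"[{batch.shape[1]} samples]"

result = transcribe_long(np.zeros(1_000_000, dtype=np.float32), fake_transcribe)
print(result)
```

In practice `transcribe_chunk` would wrap the inference call of the exported model, so each call still sees the (1, x)-shaped input it expects.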
Closing this issue since it is not an Olive issue and has also turned stale.