reriiasu / speech-to-text
Real-time transcription using faster-whisper
License: MIT License
I am running into this error while the program is listening:
Please enter the target DeviceIndex: 13
Listening...
SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats
How may I resolve this issue?
I've used the project to transcribe speech without an internet connection. Even with OpenAI API proofreading turned off, I don't get the full text of my speech after stopping transcription. It returns only the first part, only the last part, or unrelated text. Is there anything I can do about this? I am using the local medium model.
I can run your app easily, but the accuracy of the transcription is quite low. I tested with Vietnamese and English on my Windows laptop (Core i5 CPU, 16 GB RAM).
Processing time was also long, nowhere near real time.
Could my settings be wrong?
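For readers checking their own settings on a CPU-only machine, here is a minimal sketch of calling faster-whisper directly with the CPU configuration shown in its README (`device="cpu"`, `compute_type="int8"`) and an explicit language code. `model_settings` and `transcribe_file` are hypothetical helpers, not part of speech-to-text, and whether this improves accuracy or latency on the laptop described above is untested.

```python
def model_settings(use_gpu: bool) -> dict:
    """Keyword arguments for faster_whisper.WhisperModel (hypothetical helper)."""
    if use_gpu:
        return {"device": "cuda", "compute_type": "float16"}
    # INT8 on CPU is the CPU configuration shown in the faster-whisper README.
    return {"device": "cpu", "compute_type": "int8"}


def transcribe_file(path: str, language: str,
                    model_size: str = "small", use_gpu: bool = False) -> str:
    """Transcribe one audio file and return the concatenated text."""
    # Assumption: the faster-whisper package is installed. It is imported
    # lazily so that model_settings() can be used without it.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, **model_settings(use_gpu))
    # Passing the language explicitly skips automatic language detection.
    segments, _info = model.transcribe(path, language=language, beam_size=5)
    return "".join(segment.text for segment in segments)
```

For example, `transcribe_file("sample.wav", language="vi")` would transcribe a Vietnamese recording; `sample.wav` is a placeholder file name.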
Hello,
I am running into an issue with the second step: selecting the input device.
Is there a way to check my device list without exiting the program? Or, what format is the prompt looking for? I have tried different combinations, and all of them caused an error that exited the program. For reference, I am using Ubuntu.
Thank you
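One way to see candidate device indices before starting the program is to list the input-capable devices from a separate Python session. The sketch below assumes the `sounddevice` package is installed; the project may use a different audio backend, in which case its indices may not match. `input_devices` and `list_system_input_devices` are hypothetical helpers, not part of speech-to-text.

```python
from typing import Iterable, List, Mapping, Tuple


def input_devices(devices: Iterable[Mapping]) -> List[Tuple[int, str]]:
    """Return (index, name) for every device that can record audio."""
    return [
        (index, str(dev["name"]))
        for index, dev in enumerate(devices)
        if dev.get("max_input_channels", 0) > 0
    ]


def list_system_input_devices() -> List[Tuple[int, str]]:
    """Query the local audio devices and keep only the input-capable ones."""
    # Assumption: sounddevice is installed. It is imported lazily so that
    # input_devices() can be used on machines without audio hardware.
    import sounddevice as sd

    return input_devices(sd.query_devices())
```

Calling `list_system_input_devices()` and printing each pair shows the integer index next to the device name; a prompt such as the DeviceIndex one would presumably expect that integer.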
When I run the python -m speech_to_text command, the console shows the following error:
[Errno 13] Permission denied: 'D:\Works\Whisper\Faster_Whisper\models--guillaumekln--faster-whisper-base\refs\main'
I am running on Windows 11.
Your implementation is very nice!
I thought I would post the issue as a thank you.
I found your implementation very helpful and would like to Dockerize it to help me implement it on my robot.
Very cool how VAD and buffering works.
I immediately quoted a large part of your implementation and imported it into the Docker environment in my repository.
https://github.com/PINTO0309/faster-whisper-env
Operation is very good!
This issue is posted only to express my gratitude, so feel free to close it whenever you like.
Again, thank you very much. 😸
Incidentally, at the risk of meddling, the current version of faster-whisper v0.6.0 works fine when weights are run over the network.
Hello,
I'm encountering an error "Could not locate cudnn_ops_infer64_8.dll. Please make sure it is in your library path!"
when running the model with CUDA. Using CPU works fine.
Setup:
• GeForce RTX 2070 + 8G
• CUDA Toolkit 11.4 (downloaded from https://developer.nvidia.com/cuda-11-4-0-download-archive)
• cuDNN 11.4-windows-x64-v8.2.2.26 (downloaded from https://developer.download.nvidia.com/compute/redist/cudnn/v8.2.2/)
Extracted cudnn DLLs placed in: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\bin\
Confirmed that the PATH environment variable includes the DLL location:
where cudnn_ops_infer64_8.dll
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\bin\cudnn_ops_infer64_8.dll
Would you have any suggestions to fix this error? In case of compatibility issues, could you recommend compatible CUDA and cuDNN versions with download links?
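Because the environment seen by the Python process can differ from the one in the shell where `where` was run, it may help to repeat the lookup from inside Python. The sketch below is a diagnostic only, not a fix; `find_on_path` is a hypothetical helper that checks each PATH entry for the file name.

```python
import os
from pathlib import Path
from typing import List, Optional


def find_on_path(filename: str, path_value: Optional[str] = None) -> List[Path]:
    """Return every location of `filename` on PATH, in search order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a trailing separator
        # Some PATH entries are wrapped in double quotes on Windows.
        candidate = Path(entry.strip('"')) / filename
        if candidate.is_file():
            hits.append(candidate)
    return hits
```

Running `find_on_path("cudnn_ops_infer64_8.dll")` in the same interpreter that launches the model shows whether that process can see the DLL at all. An empty list would point to an environment difference rather than a version mismatch.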
Fantastic work. Was able to get it up and running easily.
I'm hoping to increase the speed of transcription.
The transcription speed once it hits Whisper seems fine, but I think the lag comes in before that. Do you have any recommendations on how I might improve the speed?
C:\Users\Administrator\speech-to-text>python -m speech_to_text
Traceback (most recent call last):
  File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Administrator\speech-to-text\speech_to_text\__main__.py", line 7, in <module>
    from .audio_transcriber import AppOptions
  File "C:\Users\Administrator\speech-to-text\speech_to_text\audio_transcriber.py", line 15, in <module>
    from .openai_api import OpenAIAPI
ModuleNotFoundError: No module named 'speech_to_text.openai_api'
Can you tell me how to fix this? Thank you.
At first glance this is a very much-needed program, but I cannot get it to work at all. It does start. But I cannot select the right Audio Device option. It looks like recording is running, but nothing happens. I have tried every possible option; what should I do?
Sorry if the question is too silly, but what else can I try?
As the title says, I hope it can translate into other languages, not just English. But doing that myself is too hard for me. How do I change the code?
Nice to meet you. When I tried using the tool, the following error appeared in the UI. What should I do? Thank you.
[Error number 2] No such file or directory: 'C:\Users\username\.cache\huggingface\hub\models--guillaumekln--faster-whisper-medium\refs\ \main'