yuvraj108c / comfyui-whisper


60.0 60.0 8.0 1.61 MB

Transcribe audio and add subtitles to videos using Whisper in ComfyUI, licensed under CC BY-NC-SA 4.0

License: Other

Python 100.00%

comfyui stable-diffusion whisper-ai

comfyui-whisper's Issues

I am getting the following error

Error:

Traceback (most recent call last):
  File "E:\ComfyUI\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\ComfyUI\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\ComfyUI\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Whisper\apply_whisper.py", line 31, in apply_whisper
    result = model.transcribe(audio_save_path,word_timestamps=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\ComfyUI\python_embeded\Lib\site-packages\whisper\transcribe.py", line 122, in transcribe
    mel = log_mel_spectrogram(audio, model.dims.n_mels, padding=N_SAMPLES)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\ComfyUI\python_embeded\Lib\site-packages\whisper\audio.py", line 140, in log_mel_spectrogram
    audio = load_audio(audio)
            ^^^^^^^^^^^^^^^^^
  File "E:\ComfyUI\python_embeded\Lib\site-packages\whisper\audio.py", line 58, in load_audio
    out = run(cmd, capture_output=True, check=True).stdout
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "subprocess.py", line 548, in run
  File "subprocess.py", line 1026, in __init__
  File "subprocess.py", line 1538, in _execute_child
FileNotFoundError: [WinError 2] The system cannot find the file specified
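
The traceback ends inside `subprocess` while whisper's `load_audio` is launching an external command. In openai-whisper that command is `ffmpeg`, so one likely cause (an assumption, not confirmed in this report) is that `ffmpeg` is not installed or not on `PATH`. A minimal sketch to check this from the same Python environment; the helper name `find_ffmpeg` is hypothetical:

```python
import shutil
from typing import Optional


def find_ffmpeg(executable: str = "ffmpeg") -> Optional[str]:
    """Return the full path of `executable` if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        # A missing executable is what makes subprocess raise FileNotFoundError.
        print("ffmpeg was not found on PATH")
    else:
        print(f"ffmpeg found at {path}")
```

If this reports that `ffmpeg` is missing, installing it and adding its folder to `PATH` before restarting ComfyUI may resolve the error.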

Unable to load audio

Hello! First of all thank you for the node it works great!

Is it possible to add support for audio-only inputs? The node seems to accept only video formats at the moment. For example, the input greys out when I try to connect it directly to audio nodes.

other language

Is there a way to utilize this with other languages?
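
For context, openai-whisper's `model.transcribe` accepts a `language` option and auto-detects the language when it is left unset. Whether this node exposes that option is not stated here, so the sketch below only shows how the call from the traceback above could be parameterized; `transcribe_options` is a hypothetical helper, not part of the node:

```python
from typing import Any, Dict, Optional


def transcribe_options(language: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for whisper's model.transcribe()."""
    # word_timestamps=True mirrors the call shown in the node's traceback.
    options: Dict[str, Any] = {"word_timestamps": True}
    if language is not None:
        # For example "fr"; leaving it unset lets whisper detect the language.
        options["language"] = language
    return options


# Usage (requires openai-whisper and ffmpeg; the file name is a placeholder):
#   import whisper
#   model = whisper.load_model("base")
#   result = model.transcribe("clip.wav", **transcribe_options("fr"))
```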

Proposal: use whisper-ts instead of the original whisper

whisper-ts has some very nice improvements in the precision of timestamp alignment, so you might want to consider replacing the current whisper with it. I tested it, and it requires only a few lines of code to change:

import stable_whisper as whisper

[...]
# old:
# result = model.transcribe(audio_save_path, word_timestamps=True)

# new: transcribe first, then align the result against the audio
result = model.transcribe_minimal(audio_save_path, word_timestamps=True)
result = model.align(audio_save_path, result, language=result.language).to_dict()
