Is it possible to use these facebook/mms-tts models inside tts-generation-webui? about tts-generation-webui HOT 3 OPEN

adfsadfasdfasdf commented on September 24, 2024

Is it possible to use these facebook/mms-tts models inside tts-generation-webui?

from tts-generation-webui.

Comments (3)

rsxdalv commented on September 24, 2024

License: CC-BY-NC 4.0

This is a big problem. Honestly, this license could mean that anything you generate with the model is limited to non-commercial use, and the license draws that line so broadly that even covering your own costs without making a profit may not be allowed.

Realistically, it shouldn't be that harsh. But in the worst-case scenario (i.e., Facebook gets taken over by a lawyer bot), they could probably sue you.

Now on to something more helpful: I think this is already available in audio-webui.
If you have trouble installing audio-webui, you can try installing it on top of tts-generation-webui (this is both advanced and simple, so if it doesn't make sense to you, it's probably not the easier route).

Lastly, I have had feedback (for English) that fairseq was not up to the quality that people wanted.

I believe you can test everything here: https://huggingface.co/spaces/mms-meta/MMS
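
If you would rather test locally than in the Space, here is a minimal sketch. It assumes the Hugging Face `transformers` library (with `torch`) is installed and that the per-language checkpoints follow the `facebook/mms-tts-<code>` naming, e.g. `facebook/mms-tts-fas` for Persian. The helper names `mms_tts_repo_id` and `synthesize` are made up for this example and are not part of tts-generation-webui. Some languages may need extra text preprocessing that this sketch does not handle.

```python
def mms_tts_repo_id(language_code: str) -> str:
    """Build the assumed Hugging Face repo ID for an MMS-TTS checkpoint.

    Hypothetical helper: assumes the 'facebook/mms-tts-<code>' naming scheme.
    """
    code = language_code.strip().lower()
    if not code:
        raise ValueError("language_code must not be empty")
    return f"facebook/mms-tts-{code}"


def synthesize(text: str, language_code: str = "fas"):
    """Generate speech for `text`; returns (waveform as numpy array, sample rate).

    The heavy imports are done lazily so the helper above works without them.
    """
    import torch
    from transformers import AutoTokenizer, VitsModel

    repo_id = mms_tts_repo_id(language_code)
    tokenizer = AutoTokenizer.from_pretrained(repo_id)  # downloads on first use
    model = VitsModel.from_pretrained(repo_id)

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        waveform = model(**inputs).waveform  # shape: (batch, samples)

    return waveform[0].numpy(), model.config.sampling_rate
```

Calling `synthesize("...")` with some Persian text should give you a waveform and its sampling rate, which you can write to a WAV file and listen to before deciding whether the quality is worth it.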

Let me know how it goes, and whether you would still find it useful to have this in tts-generation-webui. As a final comment: I am aware of many of these projects, but licenses and feedback usually make them lower priority, unless someone asks for them.

adfsadfasdfasdf commented on September 24, 2024

Thanks. I tried using the link you provided, and I wanted to generate some Persian voice, but the quality was terrible and there were no emotions in the voice. Using RVC, it was possible for me to easily convert a Persian talking voice to a different person's voice, but when it comes to text to speech, I think it does not provide enough emotion in the speeches that it generates. Is there a way to add a new language like Persian to tts-generation-webui while being able to make speeches that have nice emotions using text input, just like doing it in English? I mean, in tts-generation-webui, if I clone a Persian speaker's voice and try generating Persian speech with that voice using Persian text, I don't think it's going to work.

rsxdalv commented on September 24, 2024

So even fairseq MMS + RVC did not sound good enough? In that case it's a bit tougher. I think StyleTTS2 might be an answer, but I'm not sure.

> I mean, in tts-generation-webui, if I clone a Persian speaker's voice and try generating Persian speech with that voice using Persian text, I don't think it's going to work.

Yes, Bark does not support Persian. It might be able to generate something that sounds like an imitation, but I wouldn't recommend that.
Another alternative is Tortoise, if a Persian model is available for Tortoise TTS. In that case it would be possible to use it with this repo as well.
