Comments (3)

rsxdalv avatar rsxdalv commented on September 24, 2024

License: CC-BY-NC 4.0 license

This is a big problem. Honestly speaking, this license could imply that anything you generate with the model is limited to non-commercial use, and it draws that restriction so broadly that it could cover even recouping your own costs without a profit.

Realistically, it shouldn't be that harsh. But in the worst-case scenario (i.e., Facebook was taken over by a lawyer bot), it could probably sue you.

Now on to something more helpful - I think this is already available within audio-webui.
If you are having trouble installing audio-webui, you can try installing it on top of tts-generation-webui (that route is both advanced and simple, so if it doesn't make sense to you, it's probably not the easier option).

Lastly, I have had feedback (for English) that fairseq was not up to the quality that people wanted.

I believe you can test everything here: https://huggingface.co/spaces/mms-meta/MMS

Let me know how it goes, and whether you would still find it useful to have this in tts-generation-webui. As a final comment: I am aware of many of these projects, but licenses and feedback usually make them lower priority unless someone asks for them.

from tts-generation-webui.

adfsadfasdfasdf avatar adfsadfasdfasdf commented on September 24, 2024

Thanks. I tried the link you provided to generate some Persian speech, but the quality was terrible and there was no emotion in the voice. With RVC, I could easily convert a Persian speaking voice into a different person's voice, but when it comes to text to speech, I don't think it puts enough emotion into the speech it generates. Is there a way to add a new language like Persian to tts-generation-webui and still generate expressive speech from text input, just as in English? I mean, in tts-generation-webui, if I clone a Persian speaker's voice and try generating Persian speech with that voice using Persian text, I don't think it's going to work.

rsxdalv avatar rsxdalv commented on September 24, 2024

So even fairseq MMS + RVC did not sound good enough? In that case it's a bit tougher. I think StyleTTS2 might be an answer, but I'm not sure.

"I mean, in tts-generation-webui, if I clone a Persian speaker's voice and try generating Persian speech with that voice using Persian text, I don't think it's going to work."

Yes, Bark does not support Persian. It might be able to generate something that sounds like an imitation, but I wouldn't recommend that.
Another alternative is Tortoise, if there is a Persian model available for Tortoise TTS. In that case it would be possible to use it with this repo as well.
