
Comments (7)

seantempesta commented on July 18, 2024

Hey @Zarbuvit, I feel your frustration. I ended up trying all of those libraries and none of them worked well with glow-tts. Then I came across a fork of @seungwanpark's melgan written by @rishikksh20, and it worked perfectly!

Multi-band Melgan that works with glow-tts
https://github.com/rishikksh20/melgan

I forked his project and have been re-working it so it can be used as a package for inference:
https://github.com/seantempesta/melgan-1

(Note: I may have totally broken the training aspects as I've only tested the inference parts since I repackaged it)

from glow-tts.

Zarbuvit commented on July 18, 2024

It's working!
@seantempesta thank you for that repo!
I ended up mostly using https://github.com/rishikksh20/melgan, with the denoiser edited according to what @seantempesta did in his repo: https://github.com/seantempesta/melgan-1

As for the garbled words: completely my own problem! I misnamed my models and used different methods of converting text to phonemes in training and in inference.
I am sorry for any time I caused you to waste on my stupidity.

Thank you for your help!
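A mismatch like the one described above (different text-to-phoneme handling in training and in inference) can be caught early by recording the text settings next to the checkpoint and comparing them at inference time. The following is only a minimal sketch under assumptions: the helper names, the sidecar file suffix, the checkpoint name, and the setting keys are all hypothetical and are not part of glow-tts or melgan.

```python
import json
import tempfile
from pathlib import Path


def sidecar_path(checkpoint_path):
    # Hypothetical convention: settings live next to the checkpoint file.
    return Path(str(checkpoint_path) + ".text.json")


def save_text_settings(checkpoint_path, settings):
    """Write the text-processing settings used for training."""
    path = sidecar_path(checkpoint_path)
    path.write_text(json.dumps(settings, sort_keys=True, indent=2))
    return path


def check_text_settings(checkpoint_path, settings):
    """Raise ValueError if inference settings differ from the saved ones."""
    saved = json.loads(sidecar_path(checkpoint_path).read_text())
    mismatched = sorted(
        key for key in set(saved) | set(settings)
        if saved.get(key) != settings.get(key)
    )
    if mismatched:
        raise ValueError(f"text settings differ from training: {mismatched}")


with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "model.pth"  # hypothetical checkpoint name
    train_settings = {"phonemizer": "method_a", "add_blank": True}
    save_text_settings(ckpt, train_settings)

    # Same settings at inference time: passes silently.
    check_text_settings(ckpt, train_settings)

    # Different phoneme conversion at inference time: reported immediately.
    try:
        check_text_settings(ckpt, {"phonemizer": "method_b", "add_blank": True})
    except ValueError as err:
        print(err)
```

Because the check runs before synthesis, a mismatch surfaces as an explicit error rather than as garbled audio.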


echelon commented on July 18, 2024

This is fantastic! Thanks for sharing!


Zarbuvit commented on July 18, 2024

@seantempesta Thank you so much! I will have a look now and hopefully all goes well.


Zarbuvit commented on July 18, 2024

@seantempesta Sadly this isn't working for me. I took the inference code from glow-tts as is, removed the WaveGlow parts and added the multi-band MelGAN generator from rishikksh20 instead, along with all of his inference code. I used his pretrained model.
I'm getting a clean voice, but it is garbled and I can't understand the words. This is similar to what I got using the MozillaTTS multi-band MelGAN after applying the code changes you recommended in a different issue.

Also, a separate issue I am having is that the denoiser isn't working. I get the error:

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

on the line audio = denoiser(audio, 0.1). This is strange to me because I assumed any error like that would happen during inference, but inference was fine; only the denoiser crashes. If I comment out the denoiser it all runs fine, except for the result being garbled as mentioned before.
I saw that you changed the Denoiser in your repo to work on the CPU. Is this just a preference, or is it because it does not work on the GPU for you?

Did you run into any of these issues?
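An IndexError of this form is what PyTorch raises when code asks for dimension 1 of a one-dimensional tensor. One plausible cause (an assumption, not confirmed in this thread) is that the vocoder returns audio shaped (samples,) while the denoiser expects (batch, samples). The sketch below uses a hypothetical helper, ensure_batched, to add the missing batch dimension. The denoiser itself is not reproduced here.

```python
import torch


def ensure_batched(audio: torch.Tensor) -> torch.Tensor:
    """Return audio shaped (batch, samples).

    Hypothetical helper for illustration; not part of glow-tts or melgan.
    """
    if audio.dim() == 1:
        # (samples,) -> (1, samples)
        return audio.unsqueeze(0)
    if audio.dim() == 2:
        # Already (batch, samples).
        return audio
    if audio.dim() == 3 and audio.size(1) == 1:
        # (batch, 1, samples) -> (batch, samples)
        return audio.squeeze(1)
    raise ValueError(f"unexpected audio shape: {tuple(audio.shape)}")


# A 1-D tensor has no dimension 1, so asking for its size fails
# with the same kind of message as in the traceback.
flat = torch.zeros(22050)
try:
    flat.size(1)
except IndexError as err:
    print("1-D input:", err)

batched = ensure_batched(flat)
print(tuple(batched.shape))  # (1, 22050)
```

If this is indeed the cause, the call would become audio = denoiser(ensure_batched(audio), 0.1), assuming the denoiser in use accepts a (batch, samples) tensor.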


v-nhandt21 commented on July 18, 2024

It's working!
@seantempesta thank you for that repo!
I ended up mostly using https://github.com/rishikksh20/melgan, with the denoiser edited according to what @seantempesta did in his repo: https://github.com/seantempesta/melgan-1

As for the garbled words: completely my own problem! I misnamed my models and used different methods of converting text to phonemes in training and in inference.
I am sorry for any time I caused you to waste on my stupidity.

Thank you for your help!

Hello Zarbuvit, can you share some of the phonemes you are using? I am struggling with phoneme representation.


phamkhactu commented on July 18, 2024

@Zarbuvit I have trained a Glow model. The model speaks very naturally, but there is a buzzing noise. Do you have any ideas? I tried another model that does not have the buzzing noise, but it does not have temperature.

Thank you

