
Comments (5)

Joovvhan commented on July 18, 2024

Dear contributors,

I applied G2PK, a grapheme-to-phoneme conversion package, and obtained improved Korean TTS results.

This is the link to the demo page.

Since the original paper used phoneme tokens as inputs, I believe this result is closer to the intention of your original work.

Thanks.
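For readers unfamiliar with this kind of preprocessing, here is a minimal sketch of how Korean text might be turned into phoneme-like tokens. It is not the code behind the demo. The helper names `pronounce` and `to_jamo` are hypothetical, and splitting syllables into jamo is only one assumed way of tokenizing the output. `pronounce` uses the `G2p` class from the `g2pk` package if it is installed and otherwise returns the text unchanged.

```python
# Hypothetical helpers for illustration; not taken from the demo's code.

# Unicode conjoining jamo: 19 initial consonants, 21 vowels, 27 finals.
CHOSEONG = [chr(0x1100 + i) for i in range(19)]
JUNGSEONG = [chr(0x1161 + i) for i in range(21)]
JONGSEONG = [""] + [chr(0x11A8 + i) for i in range(27)]  # index 0 = no final

HANGUL_BASE = 0xAC00        # first precomposed syllable, "가"
HANGUL_COUNT = 19 * 21 * 28  # number of precomposed syllables


def pronounce(text):
    """Rewrite spelling as pronunciation with g2pk, if it is available."""
    try:
        from g2pk import G2p  # optional third-party dependency
    except ImportError:
        return text  # assumption: fall back to the raw spelling
    return G2p()(text)


def to_jamo(text):
    """Split each precomposed Hangul syllable into its jamo."""
    tokens = []
    for ch in text:
        offset = ord(ch) - HANGUL_BASE
        if 0 <= offset < HANGUL_COUNT:
            lead, rest = divmod(offset, 21 * 28)
            vowel, tail = divmod(rest, 28)
            tokens.append(CHOSEONG[lead])
            tokens.append(JUNGSEONG[vowel])
            if tail:
                tokens.append(JONGSEONG[tail])
        else:
            tokens.append(ch)  # spaces, digits, Latin letters pass through
    return tokens
```

A call such as `to_jamo(pronounce(sentence))` would then give a token sequence to feed to the text encoder. Characters outside the Hangul syllable block pass through untouched, so they would need separate handling.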


dathudeptrai commented on July 18, 2024

@Joovvhan Hi, I have also tried training on the Korean KSS dataset and made a Colab here (https://colab.research.google.com/drive/1ybWwOS5tipgPFttNulp77P6DAB5MtiuN?usp=sharing). It uses FastSpeech2 + MB-MelGAN. I'm not a native speaker, so could you compare Glow-TTS + WaveGlow with our FastSpeech2 + MB-MelGAN? Thanks.


Joovvhan commented on July 18, 2024

@dathudeptrai Yes, I would be glad to.
From a quick listen to your samples, the pre-trained model on your Colab is quite impressive.

However, since I am not the original author of Glow-TTS, I have not tuned any hyperparameters or introduced any audio processing techniques to improve the audio quality.

I think it would make more sense to first apply the same techniques used in training and synthesizing your samples (FastSpeech2 + MB-MelGAN) to the Glow-TTS model, and then compare samples.

In addition, your pretrained MB-MelGAN seems to be better at generating Korean speech than the officially provided universal WaveGlow model. I found that WaveGlow produces some screeching sounds when regenerating audio from ground-truth spectrograms. I have not found any in your samples yet.

I will share the link to the audio comparison page with you when I am done.

If we have anything further to discuss, it would be better to continue on your issue page.

Thanks.


Joovvhan commented on July 18, 2024

Dear authors,

I have improved my demo page by replacing the WaveGlow vocoder with the Multi-band MelGAN (MB-MelGAN) vocoder provided by the TensorFlowTTS authors.

I found that the official universal WaveGlow vocoder is not so universal when it comes to Korean.

This is the link to the webpage.

I will leave the earlier, lower-quality sample page unchanged for anyone who would like to compare the effect of the vocoder.


v-nhandt21 commented on July 18, 2024

Hi Joovvhan, I found your demo very good. Regarding G2PK for Korean, how do you handle out-of-vocabulary words, such as English words or the names of unfamiliar places?

