Comments (5)
Dear contributors,
I have applied G2PK, a grapheme-to-phoneme conversion package, and achieved improved Korean TTS results.
This is the link to the demo page.
Since the original paper used phoneme tokens as inputs, I believe this result is closer to the intention of your original work.
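For readers unfamiliar with Korean text front-ends, here is a minimal sketch of one step that often follows grapheme-to-phoneme conversion: splitting each Hangul syllable into jamo so that the model sees phoneme-like tokens. It does not call G2PK itself and is not necessarily what the demo does; the function name `decompose_to_jamo` is hypothetical, and the input is assumed to be a string already rewritten to its pronounced form.

```python
# Hypothetical helper: split precomposed Hangul syllables into conjoining jamo.
# Assumption: the text has already been converted to its pronounced form
# (for example by a G2P package); this sketch only does the Unicode arithmetic.

HANGUL_BASE = 0xAC00        # first precomposed syllable, "가"
NUM_SYLLABLES = 11172       # 19 leads * 21 vowels * 28 tails
NUM_VOWELS = 21
NUM_TAILS = 28              # includes the "no final consonant" case

LEADS = [chr(0x1100 + i) for i in range(19)]            # initial consonants
VOWELS = [chr(0x1161 + i) for i in range(NUM_VOWELS)]   # medial vowels
TAILS = [""] + [chr(0x11A8 + i) for i in range(NUM_TAILS - 1)]  # finals


def decompose_to_jamo(text):
    """Return a list of tokens: jamo for Hangul syllables, other chars as-is."""
    tokens = []
    for ch in text:
        index = ord(ch) - HANGUL_BASE
        if 0 <= index < NUM_SYLLABLES:
            lead, rest = divmod(index, NUM_VOWELS * NUM_TAILS)
            vowel, tail = divmod(rest, NUM_TAILS)
            tokens.append(LEADS[lead])
            tokens.append(VOWELS[vowel])
            if tail:  # index 0 means the syllable has no final consonant
                tokens.append(TAILS[tail])
        else:
            tokens.append(ch)  # spaces, punctuation, Latin letters, digits
    return tokens
```

Calling `decompose_to_jamo` on a syllable with a final consonant yields three tokens, and on one without yields two; non-Hangul characters pass through unchanged.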
Thanks.
from glow-tts.
@Joovvhan Hi, I also tried training on the Korean KSS dataset and made a Colab here (https://colab.research.google.com/drive/1ybWwOS5tipgPFttNulp77P6DAB5MtiuN?usp=sharing). This is FastSpeech2 + MB-MelGAN. I'm not a native speaker; could you compare Glow-TTS + WaveGlow with our FastSpeech2 + MB-MelGAN? Thanks.
@dathudeptrai Yes, I would be glad to.
As I looked through your samples quickly, your pre-trained model on the Colab is quite impressive.
However, since I am not the original author of Glow-TTS, I have not tuned any hyperparameters or introduced any audio processing techniques to improve the audio quality.
I think it would make more sense to first apply the techniques used in training or synthesizing your samples (FastSpeech2 + MB-MelGAN) to the Glow-TTS model, and then compare samples.
In addition, your pretrained MB-MelGAN seems to be better than the officially provided universal WaveGlow model at generating Korean speech. I found that WaveGlow produces some screeching sounds when regenerating audio from ground-truth spectrograms; I have not found any in your samples yet.
I will share the link to the audio comparison page with you when I am done.
If we have any further discussion, it would be better to continue it on your issue page.
Thanks.
Dear authors,
I have improved my demo page by replacing the WaveGlow vocoder with the Multi-band MelGAN (MB-MelGAN) vocoder provided by the TensorFlowTTS authors.
I found that the official universal WaveGlow vocoder does not generalize well to Korean.
This is the link to the webpage.
I will leave the earlier, lower-quality sample page unchanged for anyone who would like to compare the effect of the vocoder.
Hi Joovvhan, your demo sounds very good. Regarding G2PK for Korean, how do you handle out-of-vocabulary words, such as English words or the names of unknown places?
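To make the question concrete, here is a minimal sketch of one possible pre-processing step: separating Latin-script words from the rest of a sentence so they could be routed to a different pronunciation rule or lexicon. This is only an illustration under my own assumptions, not the demo author's method, and it does not call G2PK; the names `LATIN_WORD` and `split_by_script` are hypothetical.

```python
import re

# Runs of ASCII letters, optionally joined by an apostrophe or hyphen.
# Assumption: "foreign" words are written in plain Latin letters.
LATIN_WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def split_by_script(text):
    """Split text into (is_latin, chunk) pairs, preserving the original order."""
    pieces = []
    pos = 0
    for match in LATIN_WORD.finditer(text):
        if match.start() > pos:
            # Everything between Latin words (Hangul, spaces, punctuation).
            pieces.append((False, text[pos:match.start()]))
        pieces.append((True, match.group()))
        pos = match.end()
    if pos < len(text):
        pieces.append((False, text[pos:]))
    return pieces
```

Each chunk flagged as Latin could then be transliterated or looked up separately before the sentence is reassembled, while the remaining chunks go through the usual Korean pipeline.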