Comments (11)
Good to hear that, but I am personally not sure whether the implementation is right compared to this paper: https://arxiv.org/abs/1910.10288
AFAIK this is the most robust Graves attention proposed for TTS so far. It may be wrong.
It'd be nice if you could double check.
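For anyone double checking, here is a minimal pure-Python sketch of the general idea behind GMM (Graves-style) attention: the weights over encoder positions come from a mixture of Gaussians whose means can only move forward. The function names and the softplus/softmax parameterisation are assumptions for illustration only. This is not the repository's code, and it should be compared against the exact variant in the paper before drawing conclusions.

```python
import math


def softplus(x: float) -> float:
    """Smooth, always-positive transform (keeps step sizes and widths > 0)."""
    return math.log1p(math.exp(x))


def softmax(xs):
    """Normalise raw mixture scores so the mixture weights sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gmm_attention_step(raw_w, raw_delta, raw_sigma, mu_prev, num_inputs):
    """One decoder step of a GMM attention sketch (illustrative, hypothetical).

    raw_w, raw_delta, raw_sigma: unconstrained values, one per mixture component.
    mu_prev: component means from the previous decoder step.
    Returns (alpha, mu): weights over encoder positions and the updated means.
    """
    w = softmax(raw_w)
    delta = [softplus(d) for d in raw_delta]   # positive, so means only advance
    sigma = [softplus(s) for s in raw_sigma]   # positive widths
    mu = [m + d for m, d in zip(mu_prev, delta)]

    alpha = []
    for j in range(num_inputs):
        a = 0.0
        for wk, mk, sk in zip(w, mu, sigma):
            norm = math.sqrt(2.0 * math.pi * sk * sk)
            a += wk / norm * math.exp(-((j - mk) ** 2) / (2.0 * sk * sk))
        alpha.append(a)
    return alpha, mu


alpha, mu = gmm_attention_step(
    raw_w=[0.0, 0.0], raw_delta=[0.5, -0.5], raw_sigma=[0.0, 1.0],
    mu_prev=[0.0, 0.0], num_inputs=8,
)
```

Because every step size is positive, the means never move backwards, which is the property that makes this family of attention monotonic.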
from tts.
So, if I comment out the line self.attention.init_win_idx() in tacotron.py, I guess inference won't be complaining and in case OriginalAttention() is used, the self.attention.init_win_idx() method is called inside the OriginalAttention() module (line 226). I'm gonna launch a training to see if the empty alignment plots are still there, reporting back later today.
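A toy sketch of why removing that line should be safe, assuming each attention class resets its own state internally. All class and function names below are hypothetical stand-ins, not the actual classes in the repository:

```python
class OriginalAttentionSketch:
    """Hypothetical stand-in for an attention module that uses a window index."""

    def __init__(self):
        self.win_idx = None

    def init_win_idx(self):
        self.win_idx = -1

    def init_states(self):
        # The module resets its own window index, so callers need not do it.
        self.init_win_idx()


class GravesAttentionSketch:
    """Hypothetical stand-in for an attention module with no window index."""

    def init_states(self):
        self.mu_prev = 0.0


def reset_attention(attention):
    """Decoder-side reset that relies only on the shared init_states() method."""
    attention.init_states()
    return attention


original = reset_attention(OriginalAttentionSketch())
graves = reset_attention(GravesAttentionSketch())

# Calling init_win_idx() directly on the second class would raise
# AttributeError, which is the kind of failure described above.
has_win_idx_method = hasattr(graves, "init_win_idx")
```

The point of the design is that the decoder only calls a method every attention type implements, and anything type-specific stays inside the module.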
I'll take a look at Graves attention, but in the meantime you can try the DDC or DCA models to solve the attention problems.
Thanks, yeah, just checked and the attention plots are still empty, moving on to DDC/DCA now.
Just a side note: DCA is faster and uses less memory, but DDC gives better quality.
@erogol thanks, if I wanted to use DDC, I understand I set "double_decoder_consistency": true, but then which attention_type should I choose?
It should be like this:
// TACOTRON ATTENTION
"attention_type": "original", // 'original' , 'graves', 'dynamic_convolution'
"attention_heads": 4, // number of attention heads (only for 'graves')
"attention_norm": "sigmoid", // softmax or sigmoid.
"windowing": false, // Enables attention windowing. Used only in eval mode.
"use_forward_attn": false, // if it uses forward attention. In general, it aligns faster.
"forward_attn_mask": false, // Additional masking forcing monotonicity only in eval mode.
"transition_agent": false, // enable/disable transition agent of forward attention.
"location_attn": true, // enable/disable location sensitive attention. It is enabled for TACOTRON by default.
"bidirectional_decoder": false, // use https://arxiv.org/abs/1907.09006. Use it, if attention does not work well with your dataset.
"double_decoder_consistency": true, // use DDC explained here https://erogol.com/solving-attention-problems-of-tts-models-with-double-decoder-consistency-draft/
"ddc_r": 6, // reduction rate for coarse decoder.
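A small sketch for sanity-checking a config like the one above before launching training. It assumes the config is JSON with `//` comments, as in the snippet, and strips them with a hypothetical helper (`strip_line_comments`) before handing the text to the standard `json` module. The fragment below is a shortened copy of the keys above, wrapped in braces and without a trailing comma so that it parses on its own:

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that start outside of double-quoted strings."""
    cleaned = []
    for line in text.splitlines():
        in_string = False
        escaped = False
        cut = len(line)
        for i, ch in enumerate(line):
            if escaped:
                escaped = False
                continue
            if ch == "\\":
                escaped = in_string  # a backslash only escapes inside a string
                continue
            if ch == '"':
                in_string = not in_string
            elif ch == "/" and not in_string and line[i:i + 2] == "//":
                cut = i
                break
        cleaned.append(line[:cut].rstrip())
    return "\n".join(cleaned)


CONFIG_FRAGMENT = """
{
    // TACOTRON ATTENTION
    "attention_type": "original",       // 'original', 'graves', 'dynamic_convolution'
    "location_attn": true,              // location sensitive attention
    "bidirectional_decoder": false,     // use https://arxiv.org/abs/1907.09006
    "double_decoder_consistency": true, // use DDC
    "ddc_r": 6                          // reduction rate for coarse decoder
}
"""

config = json.loads(strip_line_comments(CONFIG_FRAGMENT))

if config["double_decoder_consistency"]:
    print("DDC enabled with attention_type =", config["attention_type"],
          "and ddc_r =", config["ddc_r"])
```

The helper tracks whether it is inside a string so that a URL in a string value is not mistaken for the start of a comment.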
Thank you!
@erogol Graves is working. Something was off in my dataset and/or config; that's been solved, and the training is yielding alignments after 5-10K steps. The inference issue is still there, so I'll open a PR that just deletes the one line mentioned above.
After 43K steps the alignments are also still a bit wonky but not empty.
Closing this because the "no attribute" bug was fixed in #479 and GMM (Graves) Attention will be looked at in a separate discussion.