Comments (5)
Add these two hparams:
"n_speakers": 10,
"gin_channels": 16
I'm not sure what the ideal value for gin_channels is to get a rich embedding; I've asked in another thread.
Your training and validation CSVs should be in this format:
filename|numeric_speaker_id|transcript
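For illustration, a line in that format could be split into its three fields as below. This is a minimal sketch, not the repository's loader: Utterance, parse_filelist_line and check_speaker_ids are hypothetical names, and the sample rows are made up. The range check assumes speaker ids index an embedding table with n_speakers rows, so they should lie in [0, n_speakers).

```python
from typing import NamedTuple


class Utterance(NamedTuple):
    audio_path: str
    speaker_id: int
    transcript: str


def parse_filelist_line(line: str) -> Utterance:
    """Split one 'filename|numeric_speaker_id|transcript' line (hypothetical helper)."""
    # maxsplit=2 keeps any '|' that appears inside the transcript itself.
    audio_path, speaker_id, transcript = line.rstrip("\n").split("|", 2)
    return Utterance(audio_path, int(speaker_id), transcript)


def check_speaker_ids(utterances, n_speakers):
    """Raise if any speaker id falls outside [0, n_speakers) (assumed valid range)."""
    for utt in utterances:
        if not 0 <= utt.speaker_id < n_speakers:
            raise ValueError(
                f"speaker id {utt.speaker_id} out of range in {utt.audio_path}"
            )


# Made-up example rows, not taken from any real dataset.
rows = [
    "wavs/a_0001.wav|0|Hello there.",
    "wavs/b_0001.wav|3|A transcript with a | in it.",
]
utterances = [parse_filelist_line(r) for r in rows]
check_speaker_ids(utterances, n_speakers=10)
```

Running a check like this over both CSVs before training may catch a speaker id that is larger than n_speakers allows.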
You'll need to swap out the loader:
-from data_utils import TextMelLoader, TextMelCollate
+from data_utils import TextMelSpeakerLoader, TextMelSpeakerCollate
You'll also need to change the forward function to accept the speaker id parameter g, and unpack the speaker ids from the batches you get when enumerating the loader.
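A rough sketch of that last step, using stand-ins so it runs without the repository or PyTorch. fake_loader and FakeModel are hypothetical placeholders, and the batch layout is an assumption. Only the shape of the change is the point: each batch carries one extra element (the speaker ids), and that element is passed to the forward call as g. The real train.py and model signatures may differ.

```python
# Stand-in for a multi-speaker loader: each batch is ASSUMED to yield
# (x, x_lengths, y, y_lengths, speaker_ids) instead of the 4-tuple
# a single-speaker loader would yield.
def fake_loader():
    yield ([[1, 2, 3]], [3], [[0.1, 0.2]], [2], [7])


class FakeModel:
    """Hypothetical placeholder for the model; records which g it was given."""

    def __call__(self, x, x_lengths, y=None, y_lengths=None, g=None):
        if g is None:
            raise ValueError("multi-speaker model called without speaker ids (g)")
        return {"batch_size": len(x), "g": g}


model = FakeModel()
outputs = []
for batch_idx, (x, x_lengths, y, y_lengths, sid) in enumerate(fake_loader()):
    # The extra `sid` is unpacked from the batch and handed to forward as g.
    outputs.append(model(x, x_lengths, y, y_lengths, g=sid))
```

The same unpacking would presumably be needed wherever the loader is iterated, for example in both the training and the evaluation loop.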
from glow-tts.
I meant not retraining the whole model again, only adding one more voice.
Sorry for jumping in, but could you please elaborate on the last part, about changing the forward function? Thanks in advance!
Hi @echelon,
This information is really useful. I believe I've made the necessary changes you suggested. In my case I've set n_speakers = 24 and gin_channels = 256, and the rest of the parameters in base.json are unchanged. The training set has 9102 samples. I'm getting the runtime error below.
RuntimeError: Given groups=1, weight of size 256 448 3, expected input[1, 192, 89] to have 448 channels, but got 192 channels instead
Can you please advise what is going wrong here?
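One possible reading of the numbers in that error, offered as a guess rather than a diagnosis: 448 equals 192 + 256, which would match a layer built for the hidden state concatenated with a gin_channels-sized speaker embedding, while the input it actually received has only 192 channels. The sketch below only does that arithmetic. hidden_channels = 192 is inferred from the error message, and the idea that the failing layer concatenates the speaker embedding onto its input is an assumption about the model code; conv_in_channels is a hypothetical helper.

```python
def conv_in_channels(hidden_channels: int, gin_channels: int,
                     speaker_embedding_given: bool) -> int:
    """Channels reaching a layer that (by ASSUMPTION) concatenates the
    speaker embedding onto the hidden state along the channel axis."""
    return hidden_channels + (gin_channels if speaker_embedding_given else 0)


hidden_channels = 192  # inferred from "expected input[1, 192, 89]"
gin_channels = 256     # value reported alongside the error

# What the layer's weight was built for, per the error message (448).
expected_by_layer = hidden_channels + gin_channels

with_g = conv_in_channels(hidden_channels, gin_channels, speaker_embedding_given=True)
without_g = conv_in_channels(hidden_channels, gin_channels, speaker_embedding_given=False)

print(expected_by_layer, with_g, without_g)
```

If that reading is right, the first thing to check would be whether the speaker ids are actually reaching the forward call as g, since the 192-channel case corresponds to no speaker embedding being concatenated.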
Hi @marlon-br, @dechubby,
Were you able to run in multi-speaker mode? Did you make any other changes apart from those echelon mentioned?
I'm hitting an issue that I'm not able to debug.
Any help would be really appreciated.
Regards,
Prasanta