Comments (3)
Hello. Not sure if a reply this late will help, but this is simply transfer learning. You take your first model's checkpoint, pass it as the pre-trained model, and warm start from that point. The model then learns the new speaker's voice while still benefiting from the previous training.
Hello, thank you for your response.
So, to fine-tune the model this way, I just need to put the new speaker's audio files and metadata where the old dataset's files were, and continue training, right? I would appreciate your feedback!
Yes. I have done it with a Tacotron 2 model, and it definitely works. Basically, you train just as you did the first time: get your audio dataset ready, and give the model your new audio files and their transcriptions.
- The only difference is that you set the pre-trained model to your previously trained model's checkpoint.
- You use warm starting. It should be something like a parameter in hparams that you set to True.
- One more thing: do not change the original batch size. At least in Tacotron 2, I got errors whenever I changed the batch size.
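To make the warm-start idea concrete, here is a rough sketch of the usual pattern: copy over only those pre-trained weights whose names and shapes still match the new model, and leave the rest (for example a speaker embedding of a different size) freshly initialized. The function name `select_warm_start_weights` and the `ignore_prefixes` parameter are made up for this illustration and are not taken from the VITS or Tacotron 2 code. It only assumes that each weight object has a `.shape` attribute, as PyTorch tensors do.

```python
from typing import Any, Dict, Iterable, List, Tuple


def select_warm_start_weights(
    pretrained_state: Dict[str, Any],
    model_state: Dict[str, Any],
    ignore_prefixes: Iterable[str] = (),
) -> Tuple[Dict[str, Any], List[str]]:
    """Pick the pre-trained weights that can be reused by the new model.

    Hypothetical helper for illustration. A weight is reused only if its
    name exists in the new model, its shape matches, and its name does not
    start with one of `ignore_prefixes`. Everything else is reported back
    in the second return value so you can see what was skipped.
    """
    prefixes = tuple(ignore_prefixes)
    reusable: Dict[str, Any] = {}
    skipped: List[str] = []

    for name, value in pretrained_state.items():
        if prefixes and name.startswith(prefixes):
            skipped.append(name)  # deliberately left at its fresh initialization
        elif name not in model_state:
            skipped.append(name)  # layer no longer exists in the new model
        elif tuple(value.shape) != tuple(model_state[name].shape):
            skipped.append(name)  # e.g. embedding table with a different size
        else:
            reusable[name] = value

    return reusable, skipped


# Possible use with PyTorch (assumption: the checkpoint stores the weights
# under a key such as "model"; check how your training script saves them):
#
#   checkpoint = torch.load(checkpoint_path, map_location="cpu")
#   weights, skipped = select_warm_start_weights(
#       checkpoint["model"], model.state_dict()
#   )
#   model.load_state_dict(weights, strict=False)
#   print("not loaded from the checkpoint:", skipped)
```

Printing the skipped names is worth doing once: if nearly everything is skipped, the checkpoint and the new config probably do not describe the same architecture.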