Comments (47)
try both:
checkpoint/(name of expName)/bestmodel.pth
checkpoint/(name of expName)/lastmodel.pth
from loop.
- What is the total duration of these 140 files? I think you should train with more data; in the VCTK experiments each speaker has 20-25 minutes. Alternatively, you can try fitting the model to the new speaker, as we describe in the new version of the paper (just note that you need to start from a model trained on a large number of speakers).
- Good.
- You can check it in the logger.
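To answer the duration question for a folder of recordings, here is a minimal sketch using only Python's standard library. It assumes the files are uncompressed PCM WAV files sitting directly in one directory; the function name is hypothetical and not part of the loop repository.

```python
import wave
from pathlib import Path


def total_duration_seconds(wav_dir):
    """Sum the duration, in seconds, of every .wav file directly inside wav_dir."""
    total = 0.0
    for path in sorted(Path(wav_dir).glob("*.wav")):
        # wave only reads uncompressed PCM files; other formats raise wave.Error.
        with wave.open(str(path), "rb") as wav:
            total += wav.getnframes() / wav.getframerate()
    return total
```

Dividing the result by 60 gives minutes, which can be compared with the 20-25 minutes per speaker mentioned above.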
from loop.
The output is very different from my orig.wav file.
output.zip
from loop.
Did you use Blizzard 2011 dataset?
from loop.
@enk100 No, I used my own dataset.
from loop.
sj_017.gen_0.wav sounds like Blizzard.
Are you sure you trained it on your data?
Did you change the data path to your own dataset?
from loop.
Did you change the data path to your own dataset?
Which part do you mean? During training?
from loop.
Yes, in train.py.
from loop.
Here's what I did, @enk100:
- Extracted features from my own dataset using extract_feats.py
- Overwrote data/blizzard/* with the features extracted in the first step
- Ran the first stage of training
python train.py --noise 1 --expName blizzard_init --seq-len 1600 --max-seq-len 1600 --data data/blizzard --nspk 1 --lr 1e-5 --epochs 10
- Ran the second stage of training
python train.py --noise 1 --expName blizzard --seq-len 1600 --max-seq-len 1600 --data data/blizzard --nspk 1 --lr 1e-4 --checkpoint checkpoints/blizzard_init/bestmodel.pth --epochs 90
- Then generated
python generate.py --npz data/blizzard/numpy_features_valid/sj_017.npz --checkpoint models/blizzard/bestmodel.pth
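One thing the commands above make easy to check: the second training stage loads its weights from checkpoints/blizzard_init/, while the generate command reads models/blizzard/bestmodel.pth. The sketch below compares the two paths. It assumes (as those paths suggest, but the thread does not confirm) that train.py saves to checkpoints/<expName>/bestmodel.pth; both helper names are hypothetical.

```python
from pathlib import PurePosixPath


def expected_checkpoint(exp_name, root="checkpoints", filename="bestmodel.pth"):
    """Path where a run named exp_name is assumed to store its best model."""
    return PurePosixPath(root) / exp_name / filename


def uses_own_checkpoint(generate_checkpoint, exp_name):
    """True if the path passed to generate.py matches the assumed training output."""
    return PurePosixPath(generate_checkpoint) == expected_checkpoint(exp_name)
```

With the values from this comment, `uses_own_checkpoint("models/blizzard/bestmodel.pth", "blizzard")` is false, which would be consistent with hearing a Blizzard voice if models/blizzard/ holds a pretrained model.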
from loop.
Are you sure you didn't mix your dataset with Blizzard?
Can you look into data/blizzard/ and check that it contains only your dataset?
It is very odd that you hear Blizzard when you didn't train on Blizzard... maybe you started from a checkpoint of the Blizzard model?
from loop.
data/blizzard contains only my dataset. I use model/blizzard for training. Is that okay, or do I need to create a model from my own dataset?
from loop.
You need to train the model from scratch.
Did the '--checkpoint' argument in train.py stay an empty string, or did you insert the Blizzard model checkpoint?
from loop.
In the first stage of training, --checkpoint is empty. In the second stage of training, the --checkpoint I use is checkpoints/blizzard_init/bestmodel.pth
from loop.
Please check the '--checkpoint' argument in train.py. If it contains a Blizzard checkpoint, then the first stage trains on top of a pretrained Blizzard model.
from loop.
I'm sorry, I'm confused by this statement:
if it contains a Blizzard checkpoint, then the first stage trains on top of a pretrained Blizzard model
from loop.
For example, if the argument has this 'default' in train.py:
parser.add_argument('--checkpoint', default='checkpoints/blizzard_init/bestmodel.pth', metavar='C', type=str, help='Checkpoint path')
then your training is initialized with the Blizzard model.
If the 'default' value is empty, then it is OK:
parser.add_argument('--checkpoint', default='', metavar='C', type=str, help='Checkpoint path')
and you start training your model from scratch.
Somehow your model is seeing Blizzard samples, so you should search for a Blizzard data leak.
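To make the point about argparse defaults concrete, here is a small self-contained sketch. It reuses the `--checkpoint` definition quoted above, but the two helper functions are hypothetical, and treating an empty string as "train from scratch" follows the explanation in this comment rather than an inspection of train.py.

```python
import argparse


def build_parser(default_checkpoint=""):
    """Build a parser with the same --checkpoint option as quoted in the thread."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--checkpoint", default=default_checkpoint,
                        metavar="C", type=str, help="Checkpoint path")
    return parser


def starts_from_scratch(parser, argv):
    """True if, for this command line, no checkpoint would be loaded."""
    args = parser.parse_args(argv)
    return args.checkpoint == ""
```

When the flag is omitted on the command line, argparse silently falls back to the default, so a non-empty default means every run starts from that checkpoint even though the command itself shows no `--checkpoint`.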
from loop.
OK, I've done that. But what about the second stage of training? Do I need to run it?
python train.py --noise 1 --expName blizzard --seq-len 1600 --max-seq-len 1600 --data data/blizzard --nspk 1 --lr 1e-4 --checkpoint checkpoints/blizzard_init/bestmodel.pth --epochs 90
from loop.
Yes, you should run it with the checkpoint argument:
--checkpoint checkpoints/blizzard_init/bestmodel.pth
from loop.
So it should give me a generated file with the same voice as my dataset, right?
from loop.
yes, of course
from loop.
Thank you so much for the clarification @enk100
from loop.
Hi @enk100, I trained on the data and generated an output, but the generated wav file has no sound. See the attachment below.
output2.zip
from loop.
1/ How many files do you have in your dataset for each speaker?
2/ Are you sure that you extracted the features correctly? You can check by generating from the npz files.
3/ How long did you train? Did you see convergence? Can you share the learning curve?
from loop.
- How many files do you have in your dataset for each speaker?
A: I have 140 wav files from 1 speaker in my dataset, and 140 txt files.
- Are you sure that you extracted the features correctly? You can check by generating from the npz files.
A: Yes, I extracted them correctly. You can hear the audio generated from the npz in the zip file I attached (the file ending with 'orig.wav').
- How long did you train? Did you see convergence? Can you share the learning curve?
A: The first stage of training ran for 10 epochs and the second stage for 90 epochs. Where can I see the convergence and the learning curve?
from loop.
- The total duration is 23 minutes. What do you mean by "just note that you need to use model that train on large amount of speakers"? Does that mean I don't have to train from scratch and can just use the model from your paper instead?
Thanks.
from loop.
Hi @enk100, I used the data from the VCTK corpus for a single speaker. After generation there is no sound.
from loop.
Hi, you can choose either of these:
- Combine your data with the VCTK data and train the model from scratch
- Take the VCTK model and fine-tune it to your new identity by adding an embedding vector for your new speaker
Good luck.
from loop.
You mean train it as a multi-speaker model?
from loop.
Yes. Train it on VCTK with the 22 speakers plus your data.
from loop.
So I have to run extract_feats.py on the 22 speakers plus my data, right?
from loop.
@enk100 If I train it on VCTK with the 22 speakers plus my data, should I set --nspk to 23 in train.py?
from loop.
@lvenoxi - Yes.
@jaxlinksync - No, run extract_feats.py only on your data and then combine the VCTK 22-speaker data with your data.
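The "combine" step can be as simple as copying your extracted feature files next to the VCTK ones. The sketch below is one hedged way to do that; the function name is hypothetical, and it assumes the features are plain .npz files that can be copied between directories without any re-indexing.

```python
import shutil
from pathlib import Path


def merge_features(src_dir, dst_dir, pattern="*.npz"):
    """Copy feature files from src_dir into dst_dir, refusing to overwrite."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.glob(pattern)):
        target = dst / path.name
        if target.exists():
            # A name clash would silently replace a VCTK utterance, so stop here.
            raise FileExistsError(f"{target} already exists")
        shutil.copy2(path, target)
        copied.append(path.name)
    return copied
```

Refusing to overwrite is a deliberate choice: it keeps the existing VCTK files intact, so any clash shows up as an error instead of a silently mixed dataset.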
from loop.
What about the norm.dat of the extracted data? Do I also have to add it to the norm_info directory, and can I name it anything? In my case I named it sj_norm.dat.
So my norm_info directory contains:
- norm.dat (included when downloading the VoiceLoop data)
- sj_norm.dat (the norm file generated after extracting my dataset)
from loop.
It is only relevant when you generate samples. So when you generate VCTK speakers, use the VCTK norm.dat; when you generate sj, use sj_norm.dat.
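The rule above can be written down as a small lookup. This is only an illustrative sketch: the function name is hypothetical, the `norm_info` location and the `sj` prefix are taken from this thread, and it assumes file names follow a `<speaker>_<utterance>.npz` pattern as in the examples here.

```python
from pathlib import PurePosixPath


def norm_file_for(npz_path, norm_dir="norm_info", own_prefix="sj"):
    """Pick the norm file matching the speaker prefix of an npz file name."""
    speaker_prefix = PurePosixPath(npz_path).name.split("_")[0]
    # The custom speaker gets its own norm file; every other speaker uses VCTK's.
    name = f"{own_prefix}_norm.dat" if speaker_prefix == own_prefix else "norm.dat"
    return str(PurePosixPath(norm_dir) / name)
```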
from loop.
Hi @enk100, thank you so much for your help. One last thing.
By "generate samples", do you mean this command?
python generate.py --npz data/vctk/numpy_features_valid/p318_212.npz --spkr 13 --checkpoint models/vctk/bestmodel.pth
How can I pass sj_norm.dat as a parameter?
from loop.
Modify this line, or add a new argument to the function.
from loop.
Thank you so much @enk100
from loop.
You're welcome!
from loop.
By the way, @enk100, how do I know which speaker ID belongs to my new speaker?
from loop.
Print self.speakers
in https://github.com/facebookresearch/loop/blob/c866e8df9b7afdc58460bcae060a3bc0e11a8987/data.py#L94
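Printing `self.speakers` is the authoritative way to see the mapping. For intuition only, the sketch below shows one common scheme: zero-based IDs assigned to the sorted set of file-name prefixes. Whether data.py uses exactly this scheme is an assumption not confirmed in this thread, and the function name is hypothetical.

```python
def speaker_ids(filenames):
    """Map each speaker prefix ('<speaker>_<utt>.npz') to a zero-based index."""
    prefixes = sorted({name.split("_")[0] for name in filenames})
    return {prefix: index for index, prefix in enumerate(prefixes)}
```

Under any zero-based scheme, N speakers get IDs 0 to N-1, so the highest valid `--spkr` value is one less than the number of speakers.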
from loop.
Hi @enk100, you're awesome 😄 Thanks.
One last thing: when I generate the voice, which checkpoint should I use?
a. models/vctk/bestmodel.pth
b. checkpoint/(name of expName)/bestmodel.pth
Thank you so much for your help.
from loop.
Thank you so much @enk100
from loop.
After I generated the data, this is what I got:
output.zip
The generated output does not match the original wav file.
Here's the command I used to generate:
sudo python generate.py --npz data/vctk/numpy_features_valid/sj_014.npz --spkr 21 --checkpoint checkpoints/vctk_noise_2/bestmodel.pth
The same goes for latestmodel.pth
Did I miss something? The other speakers are OK, but ours is not.
from loop.
Are you sure your speaker is 21? I guess it should be 22, as VCTK has 22 speakers.
Can you get more data for your speaker?
from loop.
Hi @enk100, I tried spkr 22, but it said that the speaker did not exist. So I printed the list of speakers as you suggested above and got this.
As you can see, the speaker for sj is 21.
from loop.
@enk100 Can you please confirm whether our dataset is valid? Please PM me at [email protected] so that I can send you a link to our corpus, if that's OK with you.
from loop.
Hi, @jaxlinksync! Could you please give me some advice: did you succeed in fine-tuning an existing vector to your new identity?
from loop.