facebookarchive / loop
A method to generate speech across multiple speakers
License: Other
Hi There!
In your paper, you mentioned training on LJSpeech and Blizzard 2013. Can you release those as well?
Thanks!
Hi,
Want to confirm if the problem setting for this research is like this:
I get an error upon executing:
python generate.py --npz data/vctk/numpy_features_valid/p318_212.npz --spkr 13 --checkpoint models/vctk/bestmodel.pth
(gpu_13) abhinav@ubuntu11:~/.../loop$ python generate.py --npz data/vctk/numpy_features_valid/p318_212.npz --spkr 13 --checkpoint models/vctk/bestmodel.pth
Traceback (most recent call last):
File "generate.py", line 153, in <module>
main()
File "generate.py", line 132, in main
out, attn = model([txt, spkr], feat)
File "/home/abhinav/tensorflow/gpu_13/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/software/LM_stash/abhinav/projects/tts/loop/model.py", line 247, in forward
context, ident = self.encoder(src[0], src[1])
File "/home/abhinav/tensorflow/gpu_13/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/software/LM_stash/abhinav/projects/tts/loop/model.py", line 66, in forward
outputs = self.lut_p(input)
File "/home/abhinav/tensorflow/gpu_13/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/abhinav/tensorflow/gpu_13/local/lib/python2.7/site-packages/torch/nn/modules/sparse.py", line 94, in forward
self.scale_grad_by_freq, self.sparse
File "/home/abhinav/tensorflow/gpu_13/local/lib/python2.7/site-packages/torch/nn/_functions/thnn/sparse.py", line 48, in forward
cls._renorm(indices, weight, max_norm, norm_type)
TypeError: _renorm() takes exactly 5 arguments (4 given)
I have followed all steps in the Setup segment
This is the stack trace:
Traceback (most recent call last):
File "train.py", line 211, in <module>
main()
File "train.py", line 199, in main
train(model, criterion, optimizer, epoch, train_losses)
File "train.py", line 119, in train
loss = criterion(output, target[0], target[1])
File "/home/michael/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/michael/Desktop/loop/model.py", line 42, in forward
mask_ = mask.expand_as(input)
File "/home/michael/.local/lib/python2.7/site-packages/torch/autograd/variable.py", line 655, in expand_as
return Expand(tensor.size())(self)
File "/home/michael/.local/lib/python2.7/site-packages/torch/autograd/_functions/tensor.py", line 115, in forward
result = i.expand(*self.sizes)
RuntimeError: The expanded size of the tensor (21) must match the existing size (5) at non-singleton dimension 0. at /b/wheel/pytorch-src/torch/lib/TH/THStorage.c:99
Any clue what is going on?
Can't use the Blizzard model without the original training data:
Traceback (most recent call last):
File "generate.py", line 156, in <module>
main()
File "generate.py", line 83, in main
train_dataset = NpzFolder(train_args.data + '/numpy_features')
File "/home/michael/Desktop/loop/data.py", line 84, in __init__
self.NPZ_EXTENSION))
RuntimeError: Found 0 npz in subfolders of: data/blizzard/numpy_features
Supported image extensions are: npz
It looks like generate.py uses parameters from the training data when generating.
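The RuntimeError above is raised because no .npz files were found under data/blizzard/numpy_features. A minimal sketch for checking a feature folder before running generate.py follows; count_npz is a hypothetical helper, not part of the repo, and it only assumes the files end in .npz as the error message states.

```python
import os


def count_npz(root):
    """Count files ending in .npz anywhere under root (0 if root is missing)."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(1 for name in filenames if name.endswith(".npz"))
    return total


if __name__ == "__main__":
    # Path taken from the error message above.
    print(count_npz("data/blizzard/numpy_features"))
```

A result of 0 would reproduce the condition the loader complains about.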
When I run the command provided in the documentation:
python train.py --expName vctk --data data/vctk --noise 4 --seq-len 100 --epochs 90
I encounter the following problem. What is wrong? Could you help me?
RuntimeError: invalid argument 6: expected 3D tensor at /home/a524yangsen/soft/pytorch/torch/lib/THC/generic/THCTensorMathBlas.cu:442
python generate.py --npz data/vctk/numpy_features_valid/p318_212.npz --spkr 13 --checkpoint models/vctk/bestmodel.pth
Traceback (most recent call last):
File "generate.py", line 153, in <module>
main()
File "generate.py", line 142, in main
norm_path)
File "/mnt/sdb1/Learning/pytorch/loop/utils.py", line 257, in generate_merlin_wav
weight=os.path.join(gen_dir, 'weight')), shell=True)
File "/mnt/sdb1/Learning/pytorch/loop/utils.py", line 121, in pe
for line in execute(cmd, shell=shell):
File "/mnt/sdb1/Learning/pytorch/loop/utils.py", line 114, in execute
raise subprocess.CalledProcessError(return_code, cmd)
subprocess.CalledProcessError: Command 'echo 1 1 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 | /mnt/sdb1/Learning/pytorch/loop/tools/SPTK-3.9/x2x +af > /mnt/sdb1/Learning/pytorch/loop/models/vctk/results/weight' returned non-zero exit status 127
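Exit status 127 is what a POSIX shell returns when it cannot find the command, so the x2x path in the failing command is worth checking. A minimal sketch, assuming the SPTK binaries are expected in one directory; missing_tools is a hypothetical helper, and the tool names are the ones that appear in tracebacks on this page.

```python
import os


def missing_tools(tool_dir, names):
    """Return the names that are not executable files inside tool_dir."""
    missing = []
    for name in names:
        path = os.path.join(tool_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Directory name taken from the failing command; adjust to your checkout.
    print(missing_tools("tools/SPTK-3.9", ["x2x", "freqt", "c2acr"]))
```

An empty list would suggest the binaries are in place and the cause lies elsewhere.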
After installing and trying to train or generate as described in the README, I get an "invalid combination of arguments" error as follows:
$ python train.py --expName vctk --data data/vctk --noise 4 --seq-len 100 --epochs 90
INFO - 09/06/17 09:30:25 - 0:00:00 - Namespace(K=10, attention_alignment=0.05, batch_size=64, checkpoint='', clip_grad=0.5, data='data/vctk', epochs=90, expName='checkpoints/vctk', gpu=0, hidden_size=256, ignore_grad=10000.0, lr=0.0001, max_seq_len=1000, mem_size=20, noise=4, nspk=22, output_size=63, seed=1, seq_len=100, vocabulary_size=44)
INFO - 09/06/17 09:30:25 - 0:00:00 - Building dataset.
INFO - 09/06/17 09:30:25 - 0:00:00 - Dataset ready!
Traceback (most recent call last):
File "train.py", line 207, in <module>
main()
File "train.py", line 175, in main
model = Loop(args)
File "/d2/jbaik/loop/model.py", line 217, in __init__
self.decoder = Decoder(opt)
File "/d2/jbaik/loop/model.py", line 137, in __init__
opt.attention_alignment)
File "/d2/jbaik/loop/model.py", line 87, in __init__
self.N_a = getLinear(mem_elem, 3*K)
File "/d2/jbaik/loop/model.py", line 15, in getLinear
return nn.Sequential(nn.Linear(dim_in, dim_in/10),
File "/home/jbaik/.pyenv/versions/3.6.2/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 41, in __init__
self.weight = Parameter(torch.Tensor(out_features, in_features))
TypeError: torch.FloatTensor constructor received an invalid combination of arguments - got (float, int), but expected one of:
* no arguments
* (int ...)
didn't match because some of the arguments have invalid types: (float, int)
* (torch.FloatTensor viewed_tensor)
* (torch.Size size)
* (torch.FloatStorage data)
* (Sequence data)
Could you give me a hint on how to handle this? Thanks!
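Judging from the traceback, nn.Linear receives dim_in/10 as its output size, and the interpreter is Python 3.6, where / always returns a float. A minimal illustration of the difference, with dim_in set to a made-up value; whether switching to // is the right fix for model.py is an assumption.

```python
dim_in = 256  # hypothetical layer width, for illustration only

true_div = dim_in / 10    # Python 3: true division, always a float
floor_div = dim_in // 10  # floor division keeps an int

print(type(true_div).__name__, true_div)    # float 25.6
print(type(floor_div).__name__, floor_div)  # int 25
```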
Hi,
How can you add new datasets (voices) for training? I want to use this dataset: https://linksync-2032.kxcdn.com/wp-content/uploads/2017/06/female-voice-1.zip
The files are all .wav, and I want to add them as a dataset so I can use that voice.
Hi, and thank you for releasing your code. Currently, I'm trying to reproduce the VCTK results on an EC2 instance with a Kepler GPU, and I have a question rather than an issue:
Tqdm report shows iterations are taking around 9 seconds:
Train (loss 50.62) epoch 2: 28%|##7 | 35/126 [05:01<13:02, 8.60s/it]
Train (loss 48.54) epoch 2: 40%|#### | 51/126 [07:29<12:21, 9.89s/it]
And nvidia-smi shows a very low memory usage:
1208MiB / 11439MiB
So, I'm not sure if I'm missing something or if that is the expected performance.
Thanks.
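For a rough sense of scale, the tqdm lines above give 126 iterations per epoch at roughly 9 s each. The arithmetic below assumes that rate holds for the 90 epochs used in the training command quoted elsewhere on this page, and it ignores validation time.

```python
ITERS_PER_EPOCH = 126   # from the tqdm output above
SECONDS_PER_ITER = 9.0  # rough midpoint of 8.60 and 9.89 s/it
EPOCHS = 90             # from the training command quoted on this page

epoch_minutes = ITERS_PER_EPOCH * SECONDS_PER_ITER / 60
total_hours = epoch_minutes * EPOCHS / 60

print(round(epoch_minutes, 1))  # 18.9
print(round(total_hours, 2))    # 28.35
```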
I have tried to rebuild this model based on the details mentioned in the paper, but the result is bad. I used adam optimizer with 0.0002 lr and 0.5/0.9 momentum as well as gradient clip 1, but the gradient still explodes at the first few epochs. Now I can check what is wrong with my implementation.
Is this trained on noise level 4?
First of all thank you for releasing the codes.
I would like to know how difficult it would be to train on speaker data in a new language such as Turkish. As far as I saw, the generation step needs some kind of pronunciation dictionary. But what about the pre-processing steps, Merlin, and the other tools: are they language agnostic? Thank you in advance.
Hi There!
For large datasets where extract_feats.py uses its multifolder feature, like the full VCTK dataset, it's unclear what the norm_info/norm.dat file is. The norm_info_mgc_lf0_vuv_bap_63_MVN.dat file is regenerated for each tmp split of the dataset. How do you create the norm_info/norm.dat for datasets with more than 5000 files?
I believe you had to deal with the same problem with the 22 speaker dataset because it contains around 8000 files.
Thanks for your time, Michael. Happy to contribute back the findings.
P.S. I've been commenting in https://gist.github.com/kastnerkyle/cc0ac48d34860c5bb3f9112f4d9a0300 about changes needed to make the extract_feats.py script work. I can't submit a pull request. I know many people are struggling to get it running.
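On the question of a single norm file for several splits: if per-split frame counts, means, and variances are available, they can be pooled exactly without re-reading the audio. A minimal sketch of that pooling for one feature dimension, assuming population variance; the layout of norm_info_mgc_lf0_vuv_bap_63_MVN.dat is not described here, so reading and writing it is left out, and pool_stats is a hypothetical helper rather than anything in extract_feats.py.

```python
def pool_stats(splits):
    """Combine (count, mean, variance) triples into one global triple.

    Variances are population variances (divide by count, not count - 1).
    """
    total = sum(n for n, _, _ in splits)
    mean = sum(n * m for n, m, _ in splits) / total
    # E[x^2] per split is var + mean^2; average those, then subtract mean^2.
    second_moment = sum(n * (v + m * m) for n, m, v in splits) / total
    return total, mean, second_moment - mean * mean


if __name__ == "__main__":
    # Two made-up splits of one feature dimension.
    print(pool_stats([(3, 2.0, 2.0 / 3.0), (2, 10.0, 1.0)]))
```

Whether the repo's norm file stores variance or standard deviation is not stated on this page, so that would need checking before applying the result.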
Hello loop experts,
If I have a big dataset, say 12 speakers with around 50K samples in total, and I want to train a loop model, do any parameters need to be adjusted?
Hi There!
Did you try training on the full VCTK dataset? Does the quality get better?
How long does it take to train on the 22 speakers VCTK dataset?
Hi, how can I prepare Chinese material as training data? Thank you.
In the paper, it says the phoneme transcription of the text is generated with the CMU lexicon. However, this code uses phonemizer, a toolkit that uses a US phoneset. There is a small difference in the phoneme set and the number of phonemes between them. Besides, the paper also mentions that two phonemes were added for two pauses of different lengths, but I do not know where this is done in the code.
Thanks!
I obtained some samples from public sources, but they are very noisy and the speech is too fast. After training, I found that the generated sound is very vague and individual tones cannot be told apart. How should I train on these noisy, fast samples? Thanks!
The 30 here seems to be a magic number unless I missed something in the paper?
def update_buffer(self, S_tm1, c_t, o_tm1, ident):
    # concat previous output & context
    idt = torch.tanh(self.F_u(ident))
    o_tm1 = o_tm1.squeeze(0)
    z_t = torch.cat([c_t + idt, o_tm1/30], 1)
    z_t = z_t.unsqueeze(2)
    Sp = torch.cat([z_t, S_tm1[:, :, :-1]], 2)
    # update S
    u = self.N_u(Sp.view(Sp.size(0), -1))
    u[:, :idt.size(1)] = u[:, :idt.size(1)] + idt
    u = u.unsqueeze(2)
    S = torch.cat([u, S_tm1[:, :, :-1]], 2)
    return S
Thanks.
Hi, thanks for open sourcing the code!
I am trying to reproduce your results. However, I am running into problems. I have been training:
So the problem is that only some speakers actually produce a speech signal based on the input. The majority of speakers only produce noise. However, which speakers produce speech depends on the actual phoneme input. The problem seems to be that the attention does not work correctly for these samples. The attention basically stays at the beginning of the sequence and does not advance.
Did you have a similar issue when training the model? Or do you have an idea what the problem could be?
good attention with speech output:
p226_009_11.pdf
p225_005_4.pdf
somewhat working:
p226_009_2.pdf
Most examples:
p226_009_9.pdf
p226_009_13.pdf
p226_009_1.pdf
Thanks!
Hi,
I ran the training 4 months ago and there was no issue. But now, when I add a new dataset and train with multiple speakers, I get this error:
cuda runtime error (59) : device-side assert triggered at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMath.cu:226
Can you please help?
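A device-side assert during training is often an out-of-range index into an embedding table, for example a speaker ID that is not below --nspk after new speakers are added. That is only one possible cause here. A minimal sketch for checking IDs on the CPU before training; out_of_range is a hypothetical helper, and 22 is the nspk value shown in the Namespace logs on this page.

```python
def out_of_range(ids, table_size):
    """Return the IDs that cannot index a table with table_size rows."""
    return sorted({i for i in ids if i < 0 or i >= table_size})


if __name__ == "__main__":
    speaker_ids = [0, 5, 21, 22, 30]  # made-up IDs for illustration
    print(out_of_range(speaker_ids, 22))  # [22, 30]
```

The same check could be applied to phoneme codes against vocabulary_size.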
Or use the same dataset of VCTK speakers but extract the features locally?
After the first 90 epochs, I attained a loss of 36. Starting on the next 90 epochs with reduced noise and increased sequence length, it's dropped to 26.
How low should it get for a quality model?
I trained loop with a subset of the VCTK data (American speakers). I found that the audio for those speakers, when I run generate.py using my trained model, is pretty bad. I hear only a couple of words in a sentence and the rest is silence or noise.
My guess is that something went wrong during feature extraction. When I compare the same file, i.e. p294_001.npz, from the given s3 bucket with the one I extracted by running extract_feats.py, I see that vuv_idx from s3 has larger numbers (range: -5 to 5) compared to mine (range: -10e-02 to 5).
I also noticed that text_features and audio_features are of different shape:
(226, 420) - s3
(540, 420) - me
Other features like durations and code2phone also look different.
May I know what changes I have to make to extract_feats.py to get features similar to the ones in s3?
I'm at the training step. It took 10 hours for the first 33 of 90 epochs. Is that normal, or did I miss something? I'm new to this, so I'm no expert in this field.
Thanks.
Hi,
When I try to generate audio using the downloaded model, I get errors for some sentences. The error type is always the same, and I'm not sure why. Could you please help?
Sentence1:
The boy's grandmother is his legal guardian.
Cmd:
python generate.py --text "The boy's grandmother is his legal guardian." --spkr 1 --checkpoint models/vctk/bestmodel.pth
Error:
Traceback (most recent call last):
File "generate.py", line 151, in <module>
main()
File "generate.py", line 140, in main
norm_path)
File "/kaldi/loop/utils.py", line 266, in generate_merlin_wav
base_r0=files['mgc'] + '_r0'), shell=True)
File "/kaldi/loop/utils.py", line 121, in pe
for line in execute(cmd, shell=shell):
File "/kaldi/loop/utils.py", line 114, in execute
raise subprocess.CalledProcessError(return_code, cmd)
subprocess.CalledProcessError: Command '/kaldi/loop/tools/bin/SPTK-3.9/freqt -m 59 -a 0.58 -M 511 -A 0 < The_boy's_grandmother_is_his_legal_guardian..gen_1.mgc | /kaldi/loop/tools/bin/SPTK-3.9/c2acr -m 511 -M 0 -l 1024 > The_boy's_grandmother_is_his_legal_guardian..gen_1.mgc_r0' returned non-zero exit status 2
Sentence2:
When he's able to return to campaigning, Santorum will have to decide whether he wants to.
Cmd:
python generate.py --text "When he's able to return to campaigning, Santorum will have to decide whether he wants to." --spkr 1 --checkpoint models/vctk/bestmodel.pth
Error:
Traceback (most recent call last):
File "generate.py", line 151, in <module>
main()
File "generate.py", line 140, in main
norm_path)
File "/kaldi/loop/utils.py", line 266, in generate_merlin_wav
base_r0=files['mgc'] + '_r0'), shell=True)
File "/kaldi/loop/utils.py", line 121, in pe
for line in execute(cmd, shell=shell):
File "/kaldi/loop/utils.py", line 114, in execute
raise subprocess.CalledProcessError(return_code, cmd)
subprocess.CalledProcessError: Command '/kaldi/loop/tools/bin/SPTK-3.9/freqt -m 59 -a 0.58 -M 511 -A 0 < When_he's_able_to_return_to_campaigning,_Santorum_will_have_to_decide_whether_he_wants_to..gen_1.mgc | /kaldi/loop/tools/bin/SPTK-3.9/c2acr -m 511 -M 0 -l 1024 > When_he's_able_to_return_to_campaigning,_Santorum_will_have_to_decide_whether_he_wants_to..gen_1.mgc_r0' returned non-zero exit status 2
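Both failing sentences contain an apostrophe, and the output file name built from the text is passed unquoted to a shell command (shell=True), so the shell may be treating the apostrophes as quote characters. That is a guess from the traceback, not a confirmed diagnosis. A minimal sketch of two ways around it, assuming Python 3; safe_basename is a hypothetical helper, not a function in utils.py.

```python
import re
import shlex


def safe_basename(text):
    """Reduce text to letters, digits, and underscores for use as a file name."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")


if __name__ == "__main__":
    sentence = "The boy's grandmother is his legal guardian."
    print(safe_basename(sentence))  # The_boy_s_grandmother_is_his_legal_guardian
    # Alternatively, keep the name and quote it before building a shell command.
    print(shlex.quote("The_boy's_grandmother.mgc"))
```

A quick way to test the guess is to run generate.py on the same sentence with the apostrophe removed.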
There is a shape mismatch in the audio_features array of the npz files between the data you uploaded and the npz files generated with the extract_feats script by Kyle.
For example,
in p299_405.npz,
shape of audio_features is (393,60) for uploaded npz file
shape is (829,60) for npz created by the extract_feats script.
This issue could possibly stem from silences not being removed by the extract_feats script, while it has been removed from the uploaded data.
Can you please recommend a solution for this?
I am getting the error below while preprocessing the LJ Speech dataset.
Traceback (most recent call last):
File "extract_feats.py", line 1406, in <module>
save_numpy_features()
File "extract_feats.py", line 853, in save_numpy_features
shutil.copy2(audio_norm_source, audio_norm_dest)
File "/usr/lib/python2.7/shutil.py", line 130, in copy2
copyfile(src, dst)
File "/usr/lib/python2.7/shutil.py", line 82, in copyfile
with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: '/home/jax/latest_features/final_acoustic_data/norm_info_mgc_lf0_vuv_bap_63_MVN.dat'
Hi Loop experts,
Currently, I have reproduced the original loop model with 8K VCTK data. It took around 3 days on my Ubuntu GPU server, which has 2 GPUs.
Can loop support parallel training on multiple GPUs to speed up training?
Thanks.
I tested the demo, but it failed:
python generate.py --text "hello world" --spkr 1 --checkpoint models/vctk/bestmodel.pth
Traceback (most recent call last):
File "generate.py", line 153, in <module>
main()
File "generate.py", line 132, in main
out, attn = model([txt, spkr], feat)
File "/usr/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/xinwang/voiceloop/loop/model.py", line 247, in forward
context, ident = self.encoder(src[0], src[1])
File "/usr/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/xinwang/voiceloop/loop/model.py", line 66, in forward
outputs = self.lut_p(input)
File "/usr/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python2.7/site-packages/torch/nn/modules/sparse.py", line 94, in forward
self.scale_grad_by_freq, self.sparse
File "/usr/local/lib/python2.7/site-packages/torch/nn/_functions/thnn/sparse.py", line 48, in forward
cls._renorm(indices, weight, max_norm, norm_type)
TypeError: _renorm() takes exactly 5 arguments (4 given)
Hello,
After running extract_feats.py all went through except I don't see any numpy_features_valid. Is it still needed? Do I manually create that?
I've tried, and it works well. Thanks for sharing the great project!
I wonder how long a speech clip it can generate from text. It seems limited to under 3 seconds when I tried a slightly longer sentence. Where does the limit come from, and how can I make it longer? Is it related to the --seq-len option in training?
Thank you!
In the paper, decoder input seems to mix previous decoder output and ground truth input (+noise).
But it seems the decoder in the code only uses ground truth input with noise.
Am I missing something?
Hi, I am reading your code and the code is really clean.
I noticed that in the class 'Loop' and 'Decoder' in python file 'model.py', 'self.training' is not defined but used as a condition statement. The inherited class 'torch.nn.Module' doesn't have an attribute named 'training' either.
Hi, everything worked perfectly with your preprocessed VCTK data. Now I want to test with the Nancy dataset. I'm using the script you suggested, but I have 2 questions:
When I run the script I get 2 files in the norm_info folder: label_norm_HTS_420.dat and norm_info_mgc_lf0_vuv_bap_63_MVN.dat. Based on the shape, the correct file is norm_info_mgc_lf0_vuv_bap_63_MVN.dat, but I want to be sure.
In order to combine both datasets, should I run the script for each speaker and then somehow combine the norm files, or should I put all the data in one folder and process it?
Thanks.
Hi, when I try to run
sudo python generate.py --text "hello world" --spkr 1 --checkpoint models/vctk/bestmodel.pth
I always get this error.
Traceback (most recent call last):
File "generate.py", line 153, in <module>
main()
File "generate.py", line 112, in main
txt = text2phone(args.text, char2code)
File "generate.py", line 43, in text2phone
cmudict = nltk.corpus.cmudict.dict()
File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/util.py", line 116, in __getattr__
self.__load()
File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/util.py", line 81, in __load
except LookupError: raise e
LookupError:
**********************************************************************
Resource cmudict not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('cmudict')
Searched in:
- '/home/jax/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- '/usr/nltk_data'
- '/usr/lib/nltk_data'
**********************************************************************
Thanks in advance.
Hello,
I ran into some issues when training the loop model with my own data. Please help.
I am preparing a dataset with 12 speakers and 5000 sentences in total. I am using the parameters from the README to train:
python train.py --expName myexp --data data/mydata --noise 4 --seq-len 100 --epochs 90 --nspk 12
python train.py --expName myexp_final --data data/mydata --checkpoint checkpoints/myexp/bestmodel.pth --noise 2 --seq-len 1000 --epochs 90 --nspk 12
The first training run finished without apparent issues. The last lines of the log are:
INFO - 11/16/17 08:03:41 - 21:26:30 - ====> Train set loss: 31.4378
INFO - 11/16/17 08:04:01 - 21:26:50 - ====> Test set loss: 32.6544
INFO - 11/16/17 08:18:16 - 21:41:05 - ====> Train set loss: 31.4457
INFO - 11/16/17 08:18:37 - 21:41:26 - ====> Test set loss: 32.5302
But when I start training with the second command, in the first epoch it frequently shows 'Not a finite gradient or too big, ignoring.' I printed befgad in utils.py at the line below:
befgad = torch.nn.utils.clip_grad_norm(params, clip_th)
It has some values larger than 10000, such as:
42648.5450444
1599437.41826
167695.944851
I have tried another experiment with 12 speakers and 10000 sentences, and the same issue happened when training the second model.
My questions are:
Thanks.
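The values printed above exceed the ignore_grad=10000.0 shown in the Namespace logs elsewhere on this page, which would explain the 'Not a finite gradient or too big, ignoring.' message. A minimal pure-Python sketch of that kind of guard, assuming it skips the update when the total gradient norm is non-finite or above the threshold; should_skip_update is a hypothetical name and this is not the code in utils.py.

```python
import math


def should_skip_update(grad_norm, ignore_threshold=10000.0):
    """Skip the optimizer step when the gradient norm is NaN, inf, or too large."""
    return not math.isfinite(grad_norm) or grad_norm > ignore_threshold


if __name__ == "__main__":
    for norm in (0.42, 42648.5450444, float("nan")):
        print(norm, should_skip_update(norm))
```

Under this reading, frequent skips mean most batches in the second stage produce very large gradients, so few updates are actually applied.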
Anyone else getting this error?
python train.py --expName vctk --data data/vctk --noise 4 --seq-len 100 --epochs 90
INFO - 09/06/17 08:52:38 - 0:00:00 - Namespace(K=10, attention_alignment=0.05, batch_size=64, checkpoint='', clip_grad=0.5, data='data/vctk', epochs=90, expName='checkpoints/vctk', gpu=0, hidden_size=256, ignore_grad=10000.0, lr=0.0001, max_seq_len=1000, mem_size=20, noise=4, nspk=22, output_size=63, seed=1, seq_len=100, vocabulary_size=44)
INFO - 09/06/17 08:52:38 - 0:00:00 - Building dataset.
INFO - 09/06/17 08:52:38 - 0:00:00 - Dataset ready!
Train (loss 50.63) epoch 1: 100%|█████████████| 126/126 [11:06<00:00, 4.17s/it]
Exception in user code:
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/visdom/__init__.py", line 240, in _send
data=json.dumps(msg),
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 112, in post
return request('post', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 508, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc867a836d0>: Failed to establish a new connection: [Errno 111] Connection refused',))
INFO - 09/06/17 09:03:46 - 0:11:08 - ====> Train set loss: 55.6526
Valid (loss 51.16) epoch 1: 100%|███████████████| 11/11 [00:17<00:00, 1.73s/it]
Exception in user code:
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/visdom/__init__.py", line 240, in _send
data=json.dumps(msg),
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 112, in post
return request('post', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 508, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc858e3abd0>: Failed to establish a new connection: [Errno 111] Connection refused',))
INFO - 09/06/17 09:04:04 - 0:11:26 - ====> Test set loss: 51.7753
Train (loss 42.89) epoch 2: 100%|█████████████| 126/126 [11:18<00:00, 5.10s/it]
Exception in user code:
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/visdom/__init__.py", line 240, in _send
data=json.dumps(msg),
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 112, in post
return request('post', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 508, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc8617b41d0>: Failed to establish a new connection: [Errno 111] Connection refused',))
INFO - 09/06/17 09:15:23 - 0:22:44 - ====> Train set loss: 49.4345
Valid (loss 47.57) epoch 2: 100%|███████████████| 11/11 [00:17<00:00, 1.73s/it]
Exception in user code:
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/visdom/__init__.py", line 240, in _send
data=json.dumps(msg),
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 112, in post
return request('post', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 508, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc85b904c10>: Failed to establish a new connection: [Errno 111] Connection refused',))
INFO - 09/06/17 09:15:40 - 0:23:02 - ====> Test set loss: 48.0748
Train (loss 45.03) epoch 3: 100%|█████████████| 126/126 [10:59<00:00, 4.99s/it]
Exception in user code:
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/visdom/__init__.py", line 240, in _send
data=json.dumps(msg),
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 112, in post
return request('post', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 508, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc862629190>: Failed to establish a new connection: [Errno 111] Connection refused',))
INFO - 09/06/17 09:26:41 - 0:34:02 - ====> Train set loss: 46.5043
Valid (loss 44.79) epoch 3: 100%|███████████████| 11/11 [00:17<00:00, 1.73s/it]
Exception in user code:
------------------------------------------------------------
[REDACTED]
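The ConnectionError blocks in this log come from the visdom client failing to reach localhost:8097; the loss lines show that training itself continues. Starting a visdom server (python -m visdom.server) before training is the usual remedy. A minimal sketch for checking whether anything is listening on that port; port_open is a hypothetical helper.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port taken from the ConnectionError in the log.
    print(port_open("localhost", 8097))
```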
Hi,
Thanks for sharing the project and I am doing some experiment with the tools. I have 2 questions.
The npz files downloaded with download_data.sh differ from the ones generated by extract_feats.py for the same sample wave/text file, for example p294_001. Why does this happen? Other arrays also show some differences.
download one:
phonemes
[28 22 19 41 21 3 22 31 34 11 22 5]
durations
[29 4 25 18 21 27 11 32 7 12 39 3]
extract one:
phonemes
[28 22 19 40 21 3 22 31 33 11 22 5]
durations
[ 9 6 23 33 6 17 24 32 3 14 28 32]
If I want to retrain the model using the data, I need to extract features to prepare the npz files. Do I need to put the training set and validation set together when running extract_feats.py to get the norm.dat, or should I process only the training data to get the norm.dat and then start training?
Thank you in advance for your guidance. :)
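To make the comparison in the first question concrete, the sketch below diffs the two p294_001 listings quoted above. It only uses the numbers in this post; what causes the differences is not established here.

```python
downloaded_phonemes = [28, 22, 19, 41, 21, 3, 22, 31, 34, 11, 22, 5]
extracted_phonemes = [28, 22, 19, 40, 21, 3, 22, 31, 33, 11, 22, 5]
downloaded_durations = [29, 4, 25, 18, 21, 27, 11, 32, 7, 12, 39, 3]
extracted_durations = [9, 6, 23, 33, 6, 17, 24, 32, 3, 14, 28, 32]

# (index, downloaded, extracted) wherever the phoneme codes disagree.
phoneme_diffs = [
    (i, a, b)
    for i, (a, b) in enumerate(zip(downloaded_phonemes, extracted_phonemes))
    if a != b
]
print(phoneme_diffs)  # [(3, 41, 40), (8, 34, 33)]

# Total durations are nearly equal even though individual values differ.
print(sum(downloaded_durations), sum(extracted_durations))  # 228 227
```

The phoneme codes differ by one at two positions and the duration totals differ by one, which may point to a slightly different phoneme inventory and alignment rather than a different utterance length.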
Hi,
The error below occurred when training with a single speaker:
Train (loss 63.31) epoch 1: 3%|████▍ | 1/29 [00:23<11:10, 23.93s/it]THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
File "train.py", line 211, in <module>
main()
File "train.py", line 199, in main
train(model, criterion, optimizer, epoch, train_losses)
File "train.py", line 122, in train
loss.backward()
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 146, in backward
self._execution_engine.run_backward((self,), (gradient,), retain_variables)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/linear.py", line 24, in backward
grad_weight = torch.mm(grad_output.t(), input)
RuntimeError: cuda runtime error (2) : out of memory at /b/wheel/pytorch-src/torch/lib/THC/generic/THCStorage.cu:66
Could you please help?
How many epochs of training are needed in order to start hearing anything meaningful on output of generate.py? tnx!
I find it speaks too fast. How can I slow down the voice?