Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 8.691694759794e-311
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 4.345847379897e-311
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 2.1729236899484e-311
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 1.086461844974e-311
... (40 similar lines omitted: the same message repeats, the loss scale halving on every skipped step) ...
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 5e-324
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 0.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 1.6e-322
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 8e-323
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 4e-323
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 2e-323
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 1e-323
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 5e-324
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 0.0
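The tail of that log is exactly what IEEE-754 float64 arithmetic predicts: the dynamic loss scaler halves the scale after every overflowing step, and once the value passes the smallest subnormal double (5e-324), halving it rounds to exactly 0.0. A minimal pure-Python sketch, starting from the value in the first log line:

```python
# Dynamic loss scaling (as apex does it) halves the scale after each
# overflowing step. Starting from the value in the first log line above,
# repeated halving walks down through the float64 subnormals and lands
# on exactly 0.0.
scale = 8.691694759794e-311  # value from the first "Gradient overflow" line
steps = 0
while scale > 0.0:
    scale /= 2.0
    steps += 1
print(steps, scale)  # 45 halvings later the scale is exactly 0.0
```

That count matches the log: 45 more overflow messages after the first one, ending at 0.0, and 0.0 is what the step in the traceback below then divides by.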
Traceback (most recent call last):
File "train.py", line 189, in <module>
main()
File "train.py", line 34, in main
mp.spawn(train_and_eval, nprocs=n_gpus, args=(n_gpus, hps,))
File "/home/nur-179/anaconda3/envs/gtts/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 167, in spawn
while not spawn_context.join():
File "/home/nur-179/anaconda3/envs/gtts/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 114, in join
raise Exception(msg)
Exception:
-- Process 1 terminated with the following error:
Traceback (most recent call last):
File "/home/nur-179/anaconda3/envs/gtts/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/nur-179/.temp/glow-tts/train.py", line 91, in train_and_eval
train(rank, epoch, hps, generator, optimizer_g, train_loader, None, None)
File "/home/nur-179/.temp/glow-tts/train.py", line 115, in train
scaled_loss.backward()
File "/home/nur-179/anaconda3/envs/gtts/lib/python3.6/contextlib.py", line 88, in __exit__
next(self.gen)
File "/home/nur-179/anaconda3/envs/gtts/lib/python3.6/site-packages/apex/amp/handle.py", line 123, in scale_loss
optimizer._post_amp_backward(loss_scaler)
File "/home/nur-179/anaconda3/envs/gtts/lib/python3.6/site-packages/apex/amp/_process_optimizer.py", line 249, in post_backward_no_master_weights
post_backward_models_are_masters(scaler, params, stashed_grads)
File "/home/nur-179/anaconda3/envs/gtts/lib/python3.6/site-packages/apex/amp/_process_optimizer.py", line 135, in post_backward_models_are_masters
scale_override=(grads_have_scale, stashed_have_scale, out_scale))
File "/home/nur-179/anaconda3/envs/gtts/lib/python3.6/site-packages/apex/amp/scaler.py", line 176, in unscale_with_stashed
out_scale/grads_have_scale, # 1./scale,
ZeroDivisionError: float division by zero
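For what it's worth, the final exception is plain Python float semantics rather than a CUDA problem: dividing a float by a float zero raises exactly this error. A minimal stand-in for apex's `out_scale / grads_have_scale` once the scale has collapsed to 0.0:

```python
# Reproduces the message from apex/amp/scaler.py: when the dynamic loss
# scale has decayed to exactly 0.0, the unscale step divides by it and
# the worker process dies with this exception.
try:
    ratio = 1.0 / 0.0  # stand-in for out_scale / grads_have_scale
except ZeroDivisionError as exc:
    print(exc)  # prints: float division by zero
```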
My base.json file is as follows:
{
  "train": {
    "use_cuda": true,
    "log_interval": 20,
    "seed": 1234,
    "epochs": 10000,
    "learning_rate": 1e0,
    "betas": [0.9, 0.98],
    "eps": 1e-9,
    "warmup_steps": 4000,
    "scheduler": "noam",
    "batch_size": 4,
    "ddi": true,
    "fp16_run": true
  },
  "data": {
    "load_mel_from_disk": false,
    "training_files": "filelists/ljs_audio_text_train_filelist.txt",
    "validation_files": "filelists/ljs_audio_text_val_filelist.txt",
    "text_cleaners": ["transliteration_cleaners"],
    "max_wav_value": 32768.0,
    "sampling_rate": 44100,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
    "n_mel_channels": 80,
    "mel_fmin": 0.0,
    "mel_fmax": 8000.0,
    "add_noise": true,
    "add_space": false,
    "cmudict_path": "data/dict"
  },
  "model": {
    "hidden_channels": 192,
    "filter_channels": 768,
    "filter_channels_dp": 256,
    "kernel_size": 3,
    "p_dropout": 0.1,
    "n_blocks_dec": 12,
    "n_layers_enc": 6,
    "n_heads": 2,
    "p_dropout_dec": 0.05,
    "dilation_rate": 1,
    "kernel_size_dec": 5,
    "n_block_layers": 4,
    "n_sqz": 2,
    "prenet": true,
    "mean_only": true,
    "hidden_channels_enc": 192,
    "hidden_channels_dec": 192,
    "window_size": 4
  }
}
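One quick diagnostic (my suggestion, not part of the original report): to confirm the crash comes from the dynamic loss scaler rather than the model or data, turn off mixed precision in this config and see whether training proceeds:

```json
"train": {
  "fp16_run": false
}
```

If it does, the problem is on the AMP side, and a possible mitigation is giving the scaler a floor so it can never reach 0.0 (apex's `amp.initialize` accepts a `min_loss_scale` argument) rather than changing the model.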