lingjzhu / probing-tts-models
Link to paper: https://www.isca-speech.org/archive_v0/SpeechProsody_2020/pdfs/51.pdf
License: BSD 3-Clause "New" or "Revised" License
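Hello!
I generated some spectrograms (not with Tacotron 2) and fed them into the WaveGlow model, but it fails to synthesize speech: the output is pure noise. A traditional vocoder such as Griffin-Lim, however, synthesizes audio from the same spectrograms. Does WaveGlow have any requirements for its input spectrograms? I have already aligned hop_len. Thanks!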
https://colab.research.google.com/drive/1bTdnNdsmZE-TAae5bSzBlAUx81RKsPcP?usp=sharing
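One possible cause of the noise described in the WaveGlow question above, offered as a sketch rather than a diagnosis: a neural vocoder only works when its input matches the features it was trained on, while Griffin-Lim simply inverts whatever scaling its own pipeline defines. The settings below are assumptions based on the publicly released NVIDIA WaveGlow configuration (80 mel bands, 22050 Hz, natural log of the magnitude clamped at 1e-5) and should be checked against the checkpoint actually used. The helper names `mismatched_settings` and `amplitude_db_to_log_mel` are hypothetical.

```python
import math

# Assumed feature settings of the public NVIDIA WaveGlow checkpoints.
# Verify these against the config shipped with your own checkpoint.
EXPECTED = {
    "sampling_rate": 22050,
    "n_mel_channels": 80,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
    "mel_fmin": 0.0,
    "mel_fmax": 8000.0,
}
CLAMP_MIN = 1e-5  # assumed floor applied before taking the natural log


def mismatched_settings(config):
    """Return {name: (given, expected)} for every setting that differs."""
    return {
        name: (value, EXPECTED[name])
        for name, value in config.items()
        if name in EXPECTED and value != EXPECTED[name]
    }


def amplitude_db_to_log_mel(frames_db):
    """Convert amplitude dB (20 * log10(a)) to natural-log amplitude.

    Values are floored at log(CLAMP_MIN) to mimic log(clamp(a, min=1e-5)).
    This only applies if the spectrogram really is in amplitude dB.
    """
    floor = math.log(CLAMP_MIN)
    scale = math.log(10.0) / 20.0
    return [[max(v * scale, floor) for v in frame] for frame in frames_db]
```

Matching hop_len alone is therefore not necessarily enough: the sampling rate, number of mel bands, frequency range, and log scaling may all need to agree with what the vocoder saw during training.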
Hello, when I synthesize speech with a model I trained myself, I get the following error:
Traceback (most recent call last):
  File "D:/cuiyuhan/PycharmWorkplace/text-to-speech/probing-TTS-models-master/synthesize.py", line 142, in <module>
    model.load_state_dict(torch.load(checkpoint_path))
  File "D:\Anaconda\lib\site-packages\torch\nn\modules\module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Tacotron2:
Missing key(s) in state_dict: "embedding.weight", "encoder.convolutions.0.0.conv.weight", "encoder.convolutions.0.0.conv.bias", "encoder.convolutions.0.1.weight", "encoder.convolutions.0.1.bias", "encoder.convolutions.0.1.running_mean", "encoder.convolutions.0.1.running_var", "encoder.convolutions.1.0.conv.weight", "encoder.convolutions.1.0.conv.bias", "encoder.convolutions.1.1.weight", "encoder.convolutions.1.1.bias", "encoder.convolutions.1.1.running_mean", "encoder.convolutions.1.1.running_var", "encoder.convolutions.2.0.conv.weight", "encoder.convolutions.2.0.conv.bias", "encoder.convolutions.2.1.weight", "encoder.convolutions.2.1.bias", "encoder.convolutions.2.1.running_mean", "encoder.convolutions.2.1.running_var", "encoder.lstm.weight_ih_l0", "encoder.lstm.weight_hh_l0", "encoder.lstm.bias_ih_l0", "encoder.lstm.bias_hh_l0", "encoder.lstm.weight_ih_l0_reverse", "encoder.lstm.weight_hh_l0_reverse", "encoder.lstm.bias_ih_l0_reverse", "encoder.lstm.bias_hh_l0_reverse", "decoder.prenet.layers.0.linear_layer.weight", "decoder.prenet.layers.1.linear_layer.weight", "decoder.attention_rnn.weight_ih", "decoder.attention_rnn.weight_hh", "decoder.attention_rnn.bias_ih", "decoder.attention_rnn.bias_hh", "decoder.attention_layer.query_layer.linear_layer.weight", "decoder.attention_layer.memory_layer.linear_layer.weight", "decoder.attention_layer.v.linear_layer.weight", "decoder.attention_layer.location_layer.location_conv.conv.weight", "decoder.attention_layer.location_layer.location_dense.linear_layer.weight", "decoder.decoder_rnn.weight_ih", "decoder.decoder_rnn.weight_hh", "decoder.decoder_rnn.bias_ih", "decoder.decoder_rnn.bias_hh", "decoder.linear_projection.linear_layer.weight", "decoder.linear_projection.linear_layer.bias", "decoder.gate_layer.linear_layer.weight", "decoder.gate_layer.linear_layer.bias", "postnet.convolutions.0.0.conv.weight", "postnet.convolutions.0.0.conv.bias", "postnet.convolutions.0.1.weight", "postnet.convolutions.0.1.bias", 
"postnet.convolutions.0.1.running_mean", "postnet.convolutions.0.1.running_var", "postnet.convolutions.1.0.conv.weight", "postnet.convolutions.1.0.conv.bias", "postnet.convolutions.1.1.weight", "postnet.convolutions.1.1.bias", "postnet.convolutions.1.1.running_mean", "postnet.convolutions.1.1.running_var", "postnet.convolutions.2.0.conv.weight", "postnet.convolutions.2.0.conv.bias", "postnet.convolutions.2.1.weight", "postnet.convolutions.2.1.bias", "postnet.convolutions.2.1.running_mean", "postnet.convolutions.2.1.running_var", "postnet.convolutions.3.0.conv.weight", "postnet.convolutions.3.0.conv.bias", "postnet.convolutions.3.1.weight", "postnet.convolutions.3.1.bias", "postnet.convolutions.3.1.running_mean", "postnet.convolutions.3.1.running_var", "postnet.convolutions.4.0.conv.weight", "postnet.convolutions.4.0.conv.bias", "postnet.convolutions.4.1.weight", "postnet.convolutions.4.1.bias", "postnet.convolutions.4.1.running_mean", "postnet.convolutions.4.1.running_var".
Unexpected key(s) in state_dict: "iteration", "state_dict", "optimizer", "learning_rate".
Process finished with exit code 1
I am a beginner and do not know how to fix this. Thanks for your help!
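The unexpected keys in the traceback ("iteration", "state_dict", "optimizer", "learning_rate") suggest that the saved file is a training checkpoint that wraps the model weights under a "state_dict" entry, rather than a bare state dict. A minimal sketch of unwrapping it, assuming that layout; the helper name `extract_state_dict` is hypothetical and not part of the repository:

```python
def extract_state_dict(checkpoint):
    """Return the model weights from a wrapped or bare checkpoint.

    A wrapped checkpoint is assumed to be a dict that stores the weights
    under the "state_dict" key next to training metadata; anything else
    is returned unchanged.
    """
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        return checkpoint["state_dict"]
    return checkpoint


# Usage sketch (requires PyTorch; variable names follow the traceback):
# checkpoint = torch.load(checkpoint_path, map_location="cpu")
# model.load_state_dict(extract_state_dict(checkpoint))
```

If the keys still fail to match after unwrapping, the model definition and the checkpoint probably come from different architectures or hyperparameters.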
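Hello, I noticed that the preprocessing code that extracts BERT embeddings before training does not remove the representations of the first and last tokens from the hidden state h, whereas the extraction code in inference_bert.ipynb does remove them with h[:, 1:-1, :]. Did training and synthesis actually differ in this respect, or were they consistent?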
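For readers unfamiliar with the slice in question, here is a small illustration of what h[:, 1:-1, :] does, written with nested lists so it runs without any dependencies. It assumes h has shape (batch, sequence, hidden) and that each sequence starts with [CLS] and ends with [SEP]; the function name `strip_special_tokens` and the optional `lengths` argument are hypothetical additions, not code from the repository.

```python
def strip_special_tokens(hidden, lengths=None):
    """Drop the first and last position of every sequence in a batch.

    hidden: nested list shaped (batch, sequence, hidden_dim).
    lengths: optional true lengths (including [CLS] and [SEP]) for padded
             batches, where the last position is padding rather than [SEP].
    """
    if lengths is None:
        # Equivalent to the tensor slice h[:, 1:-1, :].
        return [seq[1:-1] for seq in hidden]
    return [seq[1:n - 1] for seq, n in zip(hidden, lengths)]


# One sentence, four positions: [CLS], token, token, [SEP].
batch = [[[0.0], [1.0], [2.0], [3.0]]]
trimmed = strip_special_tokens(batch)  # keeps only the two real tokens
```

If one pipeline applies this slice and the other does not, the embedding sequences differ in length by two positions, which is why consistency between training and inference matters here.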
The dataset I obtained contains only three folders: WAVE, ProsodyLabeling, and PhoneLabeling. How can I get the sfs files?