atomicoo / tacotron2-mandarin

TensorFlow implementation of Chinese/Mandarin TTS (Text-to-Speech) based on the Tacotron-2 model.

License: MIT License

Python 99.10% Jupyter Notebook 0.90%

tensorflow tacotron tacotron2 chinese mandarin tts

tacotron2-mandarin's Introduction

See ParallelTTS for my latest work on TTS.

For a PyTorch implementation of Tacotron-2, see Tacotron2-PyTorch.

tacotron-2-mandarin

TensorFlow implementation of Google's Tacotron-2, a deep neural network architecture described in the paper: Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions

Repo Structure

tacotron-2-mandarin-griffin-lim
|--- datasets
|--- logs-Tacotron
     |--- eval-dir
     |--- plots
     |--- taco_pretrained
     |--- wavs
|--- papers
|--- prepare
|--- tacotron
     |--- models
     |--- utils
|--- tacotron_output
     |--- eval
     |--- logs-eval
          |--- plots
          |--- wavs
|--- training_data
     |--- audio
     |--- linear
     |--- mels

Samples

There are some synthesis samples here.

Pretrained

You can get the pretrained model here.

Quick Start

OS: Ubuntu 16.04

Step (0) - Git clone repository

git clone https://github.com/atomicoo/tacotron2-mandarin.git tacotron-2-mandarin-griffin-lim
cd tacotron-2-mandarin-griffin-lim/

Step (1) - Install dependencies

Install Python 3 (I used Python 3.5.5)
Install TensorFlow (I used TensorFlow 1.10.0)
Install other dependencies
```
pip install -r requirements.txt
```
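Before moving on, it can help to confirm which interpreter and TensorFlow build are active. The following is a minimal sketch, not part of the repository; the helper name `describe_environment` is hypothetical, and it only reports versions rather than enforcing the ones listed above.

```python
import importlib
import sys


def describe_environment(module_name="tensorflow"):
    """Return (python_version, module_version), or None if the module is missing."""
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers both "not installed" and native-library load failures
        # that surface as ImportError.
        return python_version, None
    return python_version, getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    py_version, tf_version = describe_environment()
    print("Python:", py_version)
    print("TensorFlow:", tf_version or "not importable")
```

If the reported versions differ from the ones listed above, that is worth noting before reporting errors, since the issues further down mention version-dependent failures.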

Step (2) - Prepare dataset

Download the BIAOBEI or THCHS-30 dataset.

After that, your directory tree should look like this:

tacotron-2-mandarin-griffin-lim
|--- ...
|--- BZNSYP
     |--- ProsodyLabeling
          |--- 000001-010000.txt
     |--- Wave
|--- ...
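The layout above can be checked with a few lines of Python before running the preparation script. This is a sketch under the assumption that the dataset was unpacked into the repository root exactly as drawn in the tree; `missing_biaobei_paths` is a hypothetical helper, not something the repository provides.

```python
from pathlib import Path


def missing_biaobei_paths(repo_root="."):
    """List the BZNSYP paths from the tree above that do not exist yet."""
    root = Path(repo_root) / "BZNSYP"
    expected = [
        root / "ProsodyLabeling" / "000001-010000.txt",
        root / "Wave",
    ]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    missing = missing_biaobei_paths()
    if missing:
        print("Missing paths:")
        for path in missing:
            print("  " + path)
    else:
        print("BZNSYP layout looks complete.")
```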

Prepare dataset (default is BIAOBEI)
```
python prepare_dataset.py
```
To prepare THCHS-30 instead, pass the parameter --dataset=THCHS-30.

After that, you will have a BIAOBEI folder as follows:
```
tacotron-2-mandarin-griffin-lim
|--- ...
|--- BIAOBEI
     |--- biaobei_48000
|--- ...
```

Preprocess dataset (default is BIAOBEI)

python preprocess.py

To preprocess THCHS-30 instead, pass the parameter --dataset=THCHS-30.

After that, you will have a training_data folder as follows:

tacotron-2-mandarin-griffin-lim
|--- ...
|--- training_data
     |--- audio
     |--- linear
     |--- mels
     |--- train.txt
|--- ...
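A quick way to confirm that preprocessing produced output is to count what landed in each folder. The sketch below assumes the features are stored as `.npy` files (one of the issues below shows a `.npy` path under `mels`) and makes no assumption about the column format of train.txt; `summarize_training_data` is a hypothetical helper.

```python
from pathlib import Path


def summarize_training_data(training_dir="training_data"):
    """Count .npy files per feature folder and non-empty lines in train.txt."""
    base = Path(training_dir)
    counts = {}
    for name in ("audio", "linear", "mels"):
        folder = base / name
        counts[name] = len(list(folder.glob("*.npy"))) if folder.is_dir() else 0
    metadata = base / "train.txt"
    if metadata.is_file():
        with metadata.open(encoding="utf-8") as handle:
            counts["train.txt"] = sum(1 for line in handle if line.strip())
    else:
        counts["train.txt"] = 0
    return counts


if __name__ == "__main__":
    for name, count in summarize_training_data().items():
        print("{}: {}".format(name, count))
```

Counts that are all zero usually mean the script was run from a different working directory or that preprocessing failed partway through.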

Step (3) - Train tacotron model

python train.py

For more parameters, see train.py.

After that, you will have a logs-Tacotron folder as follows:

tacotron-2-mandarin-griffin-lim
|--- ...
|--- logs-Tacotron
     |--- eval-dir
     |--- plots
     |--- taco_pretrained
     |--- wavs
|--- ...
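To see how far training has progressed, the checkpoint file names in taco_pretrained can be parsed for their step number. This sketch assumes the naming seen in the issues below (for example `tacotron_model.ckpt-150000.meta`); `latest_checkpoint_step` is a hypothetical helper and does not load the checkpoint itself.

```python
import re
from pathlib import Path

# Matches names such as "tacotron_model.ckpt-150000.meta" and captures the step.
CKPT_PATTERN = re.compile(r"^tacotron_model\.ckpt-(\d+)\.")


def latest_checkpoint_step(ckpt_dir="logs-Tacotron/taco_pretrained"):
    """Return the highest step number found in checkpoint file names, or None."""
    folder = Path(ckpt_dir)
    steps = set()
    if folder.is_dir():
        for entry in folder.iterdir():
            match = CKPT_PATTERN.match(entry.name)
            if match:
                steps.add(int(match.group(1)))
    return max(steps) if steps else None


if __name__ == "__main__":
    print("Latest checkpoint step:", latest_checkpoint_step())
```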

Step (4) - Synthesize audio

python synthesize.py

For more parameters, see synthesize.py.

After that, you will have a tacotron_output folder as follows:

tacotron-2-mandarin-griffin-lim
|--- ...
|--- tacotron_output
     |--- eval
     |--- logs-eval
          |--- plots
          |--- wavs
|--- ...

References & Resources

Rayhane-mamah/Tacotron-2

tacotron2-mandarin's Issues

Pretrained model available?

Can you provide the pretrained model?

FileNotFoundError: [Errno 2] No such file or directory: 'training_data\\mels\\mel-THCHS-30\\thchs30_16000\\A11_0.npy'

I used the THCHS-30 dataset, but this error occurred when I tried to run preprocess.py.

tacotron_model.ckpt-150000.data-00000-of-00001 git lfs pull error

Hello, thanks for your work. Running git lfs pull fails on the file tacotron_model.ckpt-150000.data-00000-of-00001. Could you upload it to Baidu Yun or another host?

ImportError: DLL load failed: 页面文件太小，无法完成操作。 (The paging file is too small for this operation to complete.)

When I run preprocess.py, there is an import error:
ImportError: DLL load failed: 页面文件太小，无法完成操作。
······
Failed to load the native TensorFlow runtime.
······
There are so many errors!

Looking for help!

error.txt

Will it be upgraded to TensorFlow 2.0?

Will this be upgraded to TensorFlow 2.0, or is there an alternative PyTorch implementation for Chinese?

preprocess.py doesn't support the Biaobei dataset

I read preprocess.py and found that the supported_datasets list doesn't contain Biaobei.

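Generated speech contains noise

Hello, and first of all thank you for providing the model. I trained your model for 200k steps on my own speech dataset. When synthesizing, the first few seconds are normal synthesized speech, but they are followed by tens of seconds of noise. Do you know how to remove that noise? Could you give some suggestions? I look forward to your reply. Best wishes!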
样本.zip

Checkpoint file corrupt?

Running synthesize.py was throwing the following error:

RuntimeError: Failed to load checkpoint at logs-Tacotron/taco_pretrained/

I tried to restore the model manually with the following:

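import tensorflow as tf  # TF 1.x API

with tf.Session() as sess:
    # Rebuild the graph from the .meta file, then restore the saved weights.
    saver = tf.train.import_meta_graph('logs-Tacotron/taco_pretrained/tacotron_model.ckpt-150000.meta')
    saver.restore(sess, "logs-Tacotron/taco_pretrained/tacotron_model.ckpt-150000")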

This throws the error:

DataLossError (see above for traceback): Checksum does not match: stored 4061160485 vs. calculated on the restored bytes 4272926173

Can anyone actually successfully load the pretrained model from a fresh checkout of the git repo? I'm wondering if there was some sort of file corruption when it was added.

This was with TF v1.11 and v1.12; v1.10 gave a different error (KeyError: 'DivNoNan').
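One thing that may be worth ruling out, though nothing in this thread confirms it as the cause: another issue above reports git lfs pull failing on the same checkpoint, so the file on disk could be a Git LFS pointer stub or an incomplete download rather than the real weights. The sketch below only checks for the pointer case, using the header that Git LFS pointer files start with; `is_lfs_pointer` is a hypothetical helper.

```python
from pathlib import Path

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with Path(path).open("rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


if __name__ == "__main__":
    folder = Path("logs-Tacotron/taco_pretrained")
    if folder.is_dir():
        for entry in sorted(folder.iterdir()):
            if entry.is_file():
                kind = "LFS pointer stub" if is_lfs_pointer(entry) else "regular file"
                print("{}: {} bytes, {}".format(entry.name, entry.stat().st_size, kind))
    else:
        print("Folder not found:", folder)
```

A pointer stub is only around a hundred bytes, so the printed sizes are also a useful hint. A regular file that still fails the checksum would point to a damaged or truncated download instead.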