
tacotron2-mandarin's Introduction

See ParallelTTS for my latest work on TTS.


For a PyTorch implementation of Tacotron-2, see Tacotron2-PyTorch.


tacotron-2-mandarin

TensorFlow implementation of Google's Tacotron-2, a deep neural network architecture described in the paper Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions.

Repo Structure

tacotron-2-mandarin-griffin-lim
|--- datasets
|--- logs-Tacotron
     |--- eval-dir
     |--- plots
     |--- taco_pretrained
     |--- wavs
|--- papers
|--- prepare
|--- tacotron
     |--- models
     |--- utils
|--- tacotron_output
     |--- eval
     |--- logs-eval
          |--- plots
          |--- wavs
|--- training_data
     |--- audio
     |--- linear
     |--- mels

Samples

There are some synthesis samples here.

Pretrained

You can get the pretrained model here.

Quick Start

OS: Ubuntu 16.04

Step (0) - Git clone repository

git clone https://github.com/atomicoo/tacotron2-mandarin.git tacotron-2-mandarin-griffin-lim
cd tacotron-2-mandarin-griffin-lim/

Step (1) - Install dependencies

  1. Install Python 3 (I used Python 3.5.5)

  2. Install TensorFlow (I used TensorFlow 1.10.0)

  3. Install other dependencies

    pip install -r requirements.txt
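
Because the versions above are fairly old, it can help to confirm what is actually installed before training. The following is a minimal sketch, not part of the repository: the helper names `version_tuple` and `matches` are hypothetical, and the only versions it knows about are the ones quoted in this README (Python 3.5.5, TensorFlow 1.10.0). It avoids f-strings so that it also runs on Python 3.5.

```python
import sys
from importlib import import_module


def version_tuple(version):
    """Turn '1.10.0' into (1, 10, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def matches(version, expected_prefix):
    """True when `version` starts with the components of `expected_prefix`."""
    expected = version_tuple(expected_prefix)
    return version_tuple(version)[:len(expected)] == expected


if __name__ == "__main__":
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    print("Python", python_version, "(this README used 3.5.5)")
    try:
        tf = import_module("tensorflow")
        print("TensorFlow", tf.__version__,
              "matches 1.10:", matches(tf.__version__, "1.10"))
    except ImportError:
        print("TensorFlow is not installed")
```

A mismatch is not necessarily fatal, but the issue reports further down suggest that behaviour differs between TensorFlow 1.10, 1.11 and 1.12.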
    

Step (2) - Prepare dataset

  1. Download the BIAOBEI or THCHS-30 dataset

    After that, your directory tree should look like this:

    tacotron-2-mandarin-griffin-lim
    |--- ...
    |--- BZNSYP
         |--- ProsodyLabeling
              |--- 000001-010000.txt
         |--- Wave
    |--- ...
    
  2. Prepare dataset (default is BIAOBEI)

    python prepare_dataset.py
    

    To prepare THCHS-30 instead, pass --dataset=THCHS-30.

    After that, you get a BIAOBEI folder as follows:

    tacotron-2-mandarin-griffin-lim
    |--- ...
    |--- BIAOBEI
         |--- biaobei_48000
    |--- ...
    
  3. Preprocess dataset (default is BIAOBEI)

    python preprocess.py
    

    To preprocess THCHS-30 instead, pass --dataset=THCHS-30.

    After that, you get a training_data folder as follows:

    tacotron-2-mandarin-griffin-lim
    |--- ...
    |--- training_data
         |--- audio
         |--- linear
         |--- mels
         |--- train.txt
    |--- ...
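
The directory trees in this step can be checked mechanically before moving on to training. The sketch below is an illustration only: `missing_paths` is a hypothetical helper, and the path lists are copied from the trees shown above for the default BIAOBEI dataset (THCHS-30 layouts are not covered). Run it from the repository root.

```python
from pathlib import Path

# Paths copied from the directory trees in this README (default BIAOBEI dataset).
RAW_BIAOBEI = ["BZNSYP/ProsodyLabeling/000001-010000.txt", "BZNSYP/Wave"]
TRAINING_DATA = [
    "training_data/audio",
    "training_data/linear",
    "training_data/mels",
    "training_data/train.txt",
]


def missing_paths(root, expected):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    checks = [("raw BIAOBEI download", RAW_BIAOBEI),
              ("training_data", TRAINING_DATA)]
    for label, expected in checks:
        missing = missing_paths(".", expected)
        status = "OK" if not missing else "missing: " + ", ".join(missing)
        print("{}: {}".format(label, status))
```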
    

Step (3) - Train tacotron model

python train.py

For more parameters, see train.py.

After that, you get a logs-Tacotron folder as follows:

tacotron-2-mandarin-griffin-lim
|--- ...
|--- logs-Tacotron
     |--- eval-dir
     |--- plots
     |--- taco_pretrained
     |--- wavs
|--- ...
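
To see how far training has progressed, you can look for the highest-numbered checkpoint in logs-Tacotron/taco_pretrained. The sketch below is an assumption-laden illustration: it relies on the file naming seen in the issue reports further down (`tacotron_model.ckpt-150000.meta`), and `latest_checkpoint_prefix` is a hypothetical helper, not a function from this repository. It uses only the standard library, so it works without importing TensorFlow.

```python
import re
from pathlib import Path

# Assumed naming, based on the file name quoted in the issue reports:
# tacotron_model.ckpt-150000.meta
META_PATTERN = re.compile(r"^(?P<prefix>.+\.ckpt-(?P<step>\d+))\.meta$")


def latest_checkpoint_prefix(checkpoint_dir):
    """Return (step, prefix) for the highest-numbered .meta file, or None."""
    best = None
    for path in Path(checkpoint_dir).iterdir():
        match = META_PATTERN.match(path.name)
        if match is None:
            continue
        step = int(match.group("step"))
        if best is None or step > best[0]:
            best = (step, str(path.parent / match.group("prefix")))
    return best


if __name__ == "__main__":
    checkpoint_dir = Path("logs-Tacotron/taco_pretrained")
    if checkpoint_dir.is_dir():
        print(latest_checkpoint_prefix(checkpoint_dir))
    else:
        print("No checkpoint directory yet:", checkpoint_dir)
```

In TensorFlow 1.x, `tf.train.latest_checkpoint(checkpoint_dir)` serves a similar purpose by reading the `checkpoint` state file that the saver writes next to the model files.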

Step (4) - Synthesize audio

python synthesize.py

For more parameters, see synthesize.py.

After that, you get a tacotron_output folder as follows:

tacotron-2-mandarin-griffin-lim
|--- ...
|--- tacotron_output
     |--- eval
     |--- logs-eval
          |--- plots
          |--- wavs
|--- ...
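
A quick sanity check on the synthesized audio is to list the duration of each file; an output that is far longer than its text would suggest is worth listening to. This is a minimal sketch under two assumptions: the wavs are written to tacotron_output/logs-eval/wavs as in the tree above, and they are plain PCM WAV files that Python's standard `wave` module can read. The helper names are hypothetical.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Duration of a PCM WAV file in seconds (frames / sample rate)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def list_durations(wav_dir):
    """Map each .wav file name in `wav_dir` to its duration in seconds."""
    return {
        path.name: wav_duration_seconds(path)
        for path in sorted(Path(wav_dir).glob("*.wav"))
    }


if __name__ == "__main__":
    wav_dir = Path("tacotron_output/logs-eval/wavs")
    if wav_dir.is_dir():
        for name, seconds in list_durations(wav_dir).items():
            print("{}: {:.2f} s".format(name, seconds))
    else:
        print("No synthesized wavs found in", wav_dir)
```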

References & Resources

Rayhane-mamah/Tacotron-2

tacotron2-mandarin's People

Contributors

atomicoo, joee1995

tacotron2-mandarin's Issues

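Generated speech contains noise

Hello, and first of all, thank you for providing the model. I trained your model on my own speech dataset for 200k steps. When I synthesize speech, the first few seconds are normal synthesized speech, but they are followed by tens of seconds of noise. Do you know how to remove that noise? Could you offer some suggestions? I look forward to your reply. Best wishes!
Attachment: 样本.zip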

Checkpoint file corrupt?

Running synthesize.py was throwing the following error:

RuntimeError: Failed to load checkpoint at logs-Tacotron/taco_pretrained/

I tried to restore the model manually with the following:

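import tensorflow as tf  # TF 1.x API (tf.Session, tf.train.import_meta_graph)

with tf.Session() as sess:
    # Rebuild the graph from the .meta file, then restore the variable values.
    saver = tf.train.import_meta_graph('logs-Tacotron/taco_pretrained/tacotron_model.ckpt-150000.meta')
    saver.restore(sess, "logs-Tacotron/taco_pretrained/tacotron_model.ckpt-150000")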

This throws the error:

DataLossError (see above for traceback): Checksum does not match: stored 4061160485 vs. calculated on the restored bytes 4272926173

Can anyone actually successfully load the pretrained model from a fresh checkout of the git repo? I'm wondering if there was some sort of file corruption when it was added.

This was with TF v1.11 and v1.12; v1.10 gave a different error (KeyError: 'DivNoNan').
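
One way to test the file-corruption hypothesis is to hash the checkpoint files and compare the values with someone for whom the model loads. The sketch below is only an illustration: `sha256_of` and `checkpoint_hashes` are hypothetical helpers, the prefix is the one quoted in this issue, and no reference hashes are published here, so the output is only useful for comparison between checkouts.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(str(path), "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checkpoint_hashes(checkpoint_dir, prefix="tacotron_model.ckpt-150000"):
    """Hash every file in `checkpoint_dir` whose name starts with `prefix`."""
    return {
        path.name: sha256_of(path)
        for path in sorted(Path(checkpoint_dir).glob(prefix + "*"))
        if path.is_file()
    }


if __name__ == "__main__":
    checkpoint_dir = Path("logs-Tacotron/taco_pretrained")
    if checkpoint_dir.is_dir():
        for name, digest in checkpoint_hashes(checkpoint_dir).items():
            print(digest, name)
    else:
        print("No checkpoint directory found:", checkpoint_dir)
```

If two checkouts report different hashes for the same file, the download or checkout step is the likely culprit rather than the TensorFlow version.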

Tensorflow OutOfRange error

I tried to run inference with the pretrained model you provided and got a TensorFlow OutOfRange error. I wonder whether a faulty pretrained model is causing it.
