SMART-NAR_Fast_TTS

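This is the non-autoregressive TTS model of SMART-TTS, based on FastSpeech2. The code is released as part of the project "Development of high-quality end-to-end deep learning technology for multi-speaker prosody and emotion cloning using only a small amount of data," carried out in 2021 with support from the Institute of Information & Communications Technology Planning & Evaluation (IITP), funded by the Ministry of Science and ICT.

SMART-TTS_NAR_Fast_TTS v2.0.0 is a non-autoregressive TTS model based on FastSpeech2 that models alignment without external duration labels.

Starting from the FastSpeech2 model, the following improvements were made.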

Done

  • Added a reference encoder that encodes acoustic features
  • Added an attention module that learns the alignment between linguistic features and acoustic features
  • Extracted duration labels from the alignment to train the duration predictor
  • Applied Gaussian upsampling based on the predicted durations
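
The last two items can be illustrated with a minimal sketch. It is not taken from this repository: the function names `durations_from_alignment` and `gaussian_upsample`, the per-frame argmax rule, and the fixed `sigma` are all assumptions chosen for illustration, and the actual model may extract durations and set the Gaussian width differently. The sketch counts, for each text token, how many mel frames attend to it most strongly, then expands the token-level hidden vectors to frame level by weighting each token with a Gaussian centered on the middle of its duration span.

```python
import math


def durations_from_alignment(attn):
    """Count frames per token from an attention matrix (hypothetical helper).

    attn[t][i] is the attention weight of mel frame t on text token i.
    Each frame is assigned to the token with the largest weight.
    """
    n_tokens = len(attn[0])
    durations = [0] * n_tokens
    for frame in attn:
        best = max(range(n_tokens), key=lambda i: frame[i])
        durations[best] += 1
    return durations


def gaussian_upsample(hidden, durations, sigma=1.0):
    """Expand token-level vectors to frame level (hypothetical helper).

    hidden[i] is the vector for token i; durations[i] is its length in frames.
    sigma is a fixed Gaussian width here; this is an assumption, and a model
    could equally predict a width per token.
    """
    total_frames = int(round(sum(durations)))

    # Center of each token's span on the frame axis.
    centers = []
    end = 0.0
    for d in durations:
        end += d
        centers.append(end - d / 2.0)

    dim = len(hidden[0])
    frames = []
    for t in range(total_frames):
        pos = t + 0.5  # middle of frame t
        logits = [-((pos - c) ** 2) / (2.0 * sigma ** 2) for c in centers]
        # Softmax over tokens, shifted by the max for numerical stability.
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        z = sum(exps)
        weights = [e / z for e in exps]
        frames.append(
            [sum(w * h[k] for w, h in zip(weights, hidden)) for k in range(dim)]
        )
    return frames


# Toy example: 6 mel frames attending over 3 text tokens.
attn = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.7, 0.2],
    [0.0, 0.2, 0.8],
    [0.0, 0.1, 0.9],
    [0.0, 0.3, 0.7],
]
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

durations = durations_from_alignment(attn)  # [2, 1, 3]
frames = gaussian_upsample(hidden, durations, sigma=1.0)
print(durations, len(frames))
```

Compared with simply repeating each token vector `durations[i]` times, the Gaussian weighting gives smooth transitions at token boundaries and stays differentiable with respect to the durations, which is the usual motivation for this kind of upsampling.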

Environment

Under Python 3.6

Requirements

To install requirements:


pip install -r requirements.txt

Preprocessing

To preprocess:


python3 preprocess.py --conf {preprocess configuration file path}


Training

To train the NAR TTS model, run this command:


python3 train.py -p {preprocess config file path} -m {model config file path} -t {training config file path}

Evaluation

To evaluate, run:


python3 synthesize.py --text  --restore_step {restore step} -p {preprocess config file path} -m {model config file path} -t {training config file path}

Results

Synthesized audio samples can be found in ./output/results

The samples currently stored in ./output/results were synthesized by a model trained on a database held by our lab.

Reference

Technical Document

For the technical document describing the improvements made in this project, please see here.
