
pertts-streamlit's Introduction

pertts (Persian text-to-speech)

This repository contains the implementation of pertts (Persian text-to-speech) and a web interface for it, powered by piper.

A live version of the Persian TTS is available as pertts.

With love from datacula.com

Voice: We are using an AI-based TTS system trained on Amir Sooakhsh's voice from rokhpodcast. Special thanks to Amir :)

Dataset

https://huggingface.co/datasets/SadeghK/datacula-pertts-amir

🛠️ Installation

Docker

Build the image from scratch with Docker and run it:

docker build --no-cache -t pertts:1.0 .
docker container run --name st --rm -it -p 8501:8501 pertts:1.0

Or pull and run the latest version of the Docker image from Docker Hub:

docker image pull sadeghk/pertts
docker container run --name st --rm -it -p 8501:8501 sadeghk/pertts

Python

Install piper-tts using pip and download the model into the pertts-streamlit/model directory:

```bash
pip install piper-tts
```

and then run

```bash
echo 'سلام و درود بر همه فارسی زبانان' | piper \
  --model epoch=5261-step=2455712.onnx \
  --output_file dorood.wav
```

Windows

Download the Windows executables, piper_windows_amd64.zip, from piper and unzip the archive. Go to the piper directory, where piper.exe is located, and create a folder named models. Download the Persian/Farsi model files fa_IR-amir-medium.onnx and fa_IR-amir-medium.onnx.json from Hugging Face into the models directory.

Open PowerShell, cd to the directory where piper.exe is located, and run:

echo "سلام و درود بر شما" | .\piper.exe --model .\models\fa_IR-amir-medium.onnx --output_dir .\outputs
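Both commands above pipe the text into piper on standard input. The same call can be made from Python, which sends the text as UTF-8 bytes regardless of the console's encoding. This is a minimal sketch, not part of the project: the function names `build_piper_command` and `synthesize` are hypothetical, only the `--model` and `--output_file` flags shown above are used, and it is an assumption that the piper build in use decodes its standard input as UTF-8.

```python
import subprocess
from pathlib import Path


def build_piper_command(model, output_file, executable="piper"):
    """Return the argument list for one piper run.

    Hypothetical helper: it only assembles the flags shown in the
    commands above (--model and --output_file).
    """
    return [str(executable), "--model", str(model), "--output_file", str(output_file)]


def synthesize(text, model, output_file, executable="piper"):
    """Pipe `text` to piper on stdin and return the path of the WAV file."""
    command = build_piper_command(model, output_file, executable)
    # Encode explicitly so the Persian text does not depend on the
    # console code page. Assumption: piper reads stdin as UTF-8.
    subprocess.run(command, input=(text + "\n").encode("utf-8"), check=True)
    return Path(output_file)
```

For example, `synthesize("سلام و درود بر شما", "models/fa_IR-amir-medium.onnx", "dorood.wav", executable="./piper.exe")` would mirror the Windows command, with the output path changed to a single file.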


pertts-streamlit's Issues

This was amazing

Sorry to bother you, but I just wanted to share my excitement 😆
For the first time in 15+ years, we finally have access to a very high-quality, open-source Persian text-to-speech. This is a very nice model, and I'm blown away by how small it is considering its accuracy.
Thanks for maintaining this project. I love the Web UI! 👍🏻

Help for creating dataset

Thank you @SadeghKrmi. I wanted to ask a few things about your dataset creation process:

  • How did you get the transcriptions of the rokhpodcast.ir audio tracks?
  • How did you align the voice with the transcriptions?
  • What is the size of the dataset?
  • Which tools did you use to create the dataset?

helppppp!

My problem is that when I use PowerShell to send a Farsi phrase to piper, PowerShell does not handle the Persian Unicode text. I would like to know how you overcame this while building this project, because I cannot find any solution on the internet.

speech sometimes differs from the actual text

Great job creating the first practical open-source AI text-to-speech for the Persian language, with excellent voice quality.
I just realized that the amir model in this initial release sometimes ignores the actual text and reads it as conversational text.
This is not what we expect from a text-to-speech system. I believe this problem can be solved by adjusting the text to match what the voice talent actually read.
This issue is especially problematic if a blind computer user wants to use this text-to-speech while editing a document. Imagine what happens!

Input text:
رنگین‌کمان پدیده‌ای نوری و کمانی است که زمانی که خورشید به قطرات نم و رطوبت جو زمین می‌تابد باعث ایجاد طیفی از نور در آسمان می‌شود.

Audio text:
رنگین‌کمان پدیده‌ای نوری و کمانیه که زمانی که خورشید به قُطرات نم و رطوبت جو زمین می‌تابد باعث ایجاد طیفی از نور در آسمان می‌شه.

I could contribute toward solving this issue but don't know where to find the training dataset!
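One way to locate where the audio departs from the written text is a word-level diff of the two sentences. The following is a minimal sketch using Python's standard `difflib` module. The function name `word_differences` is hypothetical, and it assumes a transcript of what was actually spoken is already available.

```python
from difflib import SequenceMatcher


def word_differences(written, spoken):
    """Return (written_span, spoken_span) pairs wherever the two texts differ.

    Both texts are split on whitespace and compared word by word, so a
    colloquial contraction shows up as one replaced span.
    """
    a, b = written.split(), spoken.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Passing the input text and the audio text from this issue as `written` and `spoken` returns the spans that differ, which could then guide corrections to the dataset transcripts.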
