ViLingo

ViLingo is a complete pipeline for automatically translating videos from Russian into 12 foreign languages (German, English, Italian, French, Turkish, Japanese, Chinese, Spanish, Portuguese, Polish, Czech, Danish).

The following ML models were used:

  • Demucs - for separating voices from other sounds (source separation task)
  • WhisperX - for performing STT and getting timestamps for each phrase
  • NLLB-200 - for translation of the text
  • Wav2Lip - for syncing the speaker's lips with the voice
  • Coqui xtts_v2 - for TTS and voice cloning of the speaker

Note that each phrase is processed separately, which makes it possible to pronounce each phrase in the voice of the corresponding speaker.
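
The per-phrase processing described above can be illustrated with a minimal sketch. It is not the project's actual code: the `Phrase` class, the `dub_phrases` function and the `translate` / `synthesize` callables are hypothetical names standing in for the STT, translation and TTS stages, and the speaker labels are assumed to be available for every phrase.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Phrase:
    """One recognised phrase (hypothetical structure, for illustration only)."""
    start: float   # phrase start time in seconds
    end: float     # phrase end time in seconds
    speaker: str   # assumed speaker label for this phrase
    text: str      # recognised source-language text


def dub_phrases(
    phrases: List[Phrase],
    translate: Callable[[str], str],
    synthesize: Callable[[str, str], bytes],
    reference_clips: Dict[str, str],
) -> List[dict]:
    """Translate and voice each phrase independently of the others."""
    results = []
    for phrase in phrases:
        translated = translate(phrase.text)
        # The reference clip is looked up per phrase, so a change of speaker
        # between two phrases also changes the voice used for synthesis.
        audio = synthesize(translated, reference_clips[phrase.speaker])
        results.append(
            {
                "start": phrase.start,
                "end": phrase.end,
                "speaker": phrase.speaker,
                "text": translated,
                "audio": audio,
            }
        )
    return results
```

Because the translation and synthesis steps are passed in as plain callables, the loop can be tried out with simple stand-in functions before any model is loaded.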


Examples

Original: Original.mp4
Translated: 79_translated.online-video-cutter.com.mp4

Please find other examples of our work there.

Running

Run the following command from the model directory:

python3.10 main.py [path_to_video] [language]

The language is given as a short code, for example en or fr.
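
NLLB-200 identifies languages by FLORES-200 tags such as `fra_Latn` rather than by two-letter codes, so the command-line argument has to be mapped somewhere. The sketch below shows one way this could be done. Only `en` and `fr` are named in this README; the other two-letter codes, the choice of Simplified Chinese for `zh`, and the names `NLLB_CODES`, `SOURCE_LANGUAGE` and `to_nllb_code` are assumptions made for illustration, not the project's actual mapping.

```python
# Assumed mapping from two-letter codes to NLLB-200 (FLORES-200) tags.
# Only "en" and "fr" are confirmed by the README; the rest are guesses
# based on ISO 639-1 codes for the 12 listed target languages.
NLLB_CODES = {
    "de": "deu_Latn",  # German
    "en": "eng_Latn",  # English
    "it": "ita_Latn",  # Italian
    "fr": "fra_Latn",  # French
    "tr": "tur_Latn",  # Turkish
    "ja": "jpn_Jpan",  # Japanese
    "zh": "zho_Hans",  # Chinese (Simplified script assumed)
    "es": "spa_Latn",  # Spanish
    "pt": "por_Latn",  # Portuguese
    "pl": "pol_Latn",  # Polish
    "cs": "ces_Latn",  # Czech
    "da": "dan_Latn",  # Danish
}

# The pipeline translates from Russian.
SOURCE_LANGUAGE = "rus_Cyrl"


def to_nllb_code(language: str) -> str:
    """Return the NLLB-200 tag for a short language code, or raise ValueError."""
    key = language.strip().lower()
    if key not in NLLB_CODES:
        supported = ", ".join(sorted(NLLB_CODES))
        raise ValueError(
            f"Unsupported language {language!r}; expected one of: {supported}"
        )
    return NLLB_CODES[key]
```

Validating the code before any model is loaded means a typo in the language argument fails immediately instead of after the slower separation and transcription steps.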

System

The code was tested on Ubuntu 22.04.01.

Python 3.10

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.10

Libraries

python3.10 -m pip install git+https://github.com/m-bain/whisperx.git
python3.10 -m pip install transformers==4.33.0
python3.10 -m pip install demucs==4.0.1
python3.10 -m pip install TTS==0.20.3
python3.10 -m pip install pydub==0.25.1
sudo apt install ffmpeg
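
Since most of the libraries are pinned to exact versions, it can help to confirm that the environment matches before starting a long run. The following is a minimal sketch using only the standard library; the `PINNED` table and the `check_environment` function are hypothetical helpers, not part of the repository. WhisperX is left out because it is installed from Git without a pinned version.

```python
import shutil
from importlib import metadata

# Versions copied from the pip commands above.
PINNED = {
    "transformers": "4.33.0",
    "demucs": "4.0.1",
    "TTS": "0.20.3",
    "pydub": "0.25.1",
}


def check_environment(pins=None, tools=("ffmpeg",)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    pins = PINNED if pins is None else pins
    problems = []
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (expected {wanted})")
            continue
        if found != wanted:
            problems.append(f"{name} {found} is installed, expected {wanted}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Run with the same interpreter as the pipeline (python3.10), since each interpreter has its own set of installed packages.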

Hardware requirements

The code was tested on a remote server with a single Tesla V100 GPU (32 GB).

Contributors

anvarka, inspired99, sigmadt, vadimshabashov
