kevingele / smartsrt

📄 SmartSRT is a command-line tool for generating high-accuracy subtitles with per-word timestamps. It uses WhisperAI for speech transcription, NVIDIA NeMo for diarization, and OpenCV for face recognition. 🎧💻⚙️

License: MIT License

audio cuda cv2 face-recognition nvidia-nemo python srt subtitles text-summarization timestamps transcribe whisper

smartsrt's Introduction

SmartSRT

SmartSRT is my attempt at a community-run, end-to-end solution for automatically generating subtitles for videos with a maximum subtitle length constraint, along with speaker diarization. It uses machine learning models for speech recognition, text summarization, face recognition, and diarization, and runs on a CUDA GPU for faster performance.
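To illustrate what a maximum subtitle length constraint combined with per-word timestamps could look like, here is a minimal sketch in Python. It is not SmartSRT's actual code: the `Word` class, the `group_words` and `to_srt` functions, and the default of 42 characters are all hypothetical names and values chosen for this example. It greedily packs timestamped words into cues that stay within a character limit and prints them in SRT format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One transcribed word with its timing (hypothetical structure)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def group_words(words: List[Word], max_chars: int) -> List[List[Word]]:
    """Greedily pack words into cues whose joined text fits in max_chars.

    A single word longer than max_chars still gets its own cue.
    """
    cues: List[List[Word]] = []
    current: List[Word] = []
    length = 0
    for word in words:
        # Account for the space that joins this word to the previous one.
        added = len(word.text) + (1 if current else 0)
        if current and length + added > max_chars:
            cues.append(current)
            current, length = [], 0
            added = len(word.text)
        current.append(word)
        length += added
    if current:
        cues.append(current)
    return cues


def to_srt(words: List[Word], max_chars: int = 42) -> str:
    """Render timestamped words as SRT text, one block per cue."""
    blocks = []
    for index, cue in enumerate(group_words(words, max_chars), start=1):
        start = format_timestamp(cue[0].start)
        end = format_timestamp(cue[-1].end)
        text = " ".join(w.text for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [Word("Hello", 0.0, 0.4), Word("world", 0.5, 0.9), Word("again", 1.0, 1.5)]
    print(to_srt(demo, max_chars=11))
```

Each cue takes its start time from its first word and its end time from its last word, which is why per-word timestamps are needed to split long sentences cleanly.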


Getting Started

You need to install the required dependencies and download the necessary models.
⚠️ Remember that you need a CUDA-capable GPU! ⚠️

Clone the repository:
git clone https://github.com/KevinGeLe/SmartSRT.git
Install the required dependencies:
cd SmartSRT
pip install -r requirements.txt
The necessary models are downloaded automatically once you specify which models you want to use.
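Since a CUDA GPU is required, it may help to confirm that one is visible before running anything. The snippet below is a small sketch, not part of SmartSRT: `cuda_available` is a hypothetical helper, and it assumes PyTorch is installed through `requirements.txt` (Whisper and NeMo both build on PyTorch, but the contents of that file are not shown here).

```python
def cuda_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA device."""
    try:
        # Assumption: PyTorch is installed as a dependency of the project.
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA available:", cuda_available())
```

If this prints `False`, check the GPU driver and make sure the installed PyTorch build has CUDA support.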

TODO:

  • Add Face recognition (±1 second)👱
  • Add Max-Subtitle-Length with Per-Word-Timestamps🕰️
  • Add new parsers for Max-Length, Output, Input and Model🔍
  • Refactor code to improve readability🐊
  • Work on improving performance🦈
  • Update README.md📑

License

SmartSRT is released under the MIT license. See the LICENSE file for more information.

smartsrt's People

Contributors

kevingele

Forkers

s-i-t-a

smartsrt's Issues

Any release plans?

Hi!

This looks like an awesome project and I'd love to see something using the new NeMo tech for SRT generation. Any plans on releasing the scripts used any time soon?
