auto-subtitle-tts

Video Processing CLI

This project provides a command-line tool for generating and embedding subtitles into videos, with additional options for audio processing and text-to-speech conversion using IBM Watson or ElevenLabs services.

Features

  • Generate subtitles from the audio within a video or from a separate audio file.
  • Embed subtitles directly into the video.
  • Convert text to speech and overlay or replace the existing audio track in the video.
  • Cut the video to a specified duration using start and end times.
  • Adjust the volume of the original video audio.
  • Control the speed of the generated text-to-speech audio (IBM only for now).

Installation

Clone the repository and install the required Python dependencies:

git clone https://github.com/siinghd/auto-subtitle-tts
cd auto-subtitle-tts
pip install -r requirements.txt

ffmpeg and moviepy must also be installed on your system for video and audio processing.
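Before running the tool, it can help to confirm that the ffmpeg executable is reachable. The following is a minimal sketch, not part of the project; the helper name `missing_tools` is hypothetical, and it only checks that the binary is on PATH, not that its version is suitable.

```python
import shutil


def missing_tools(tools=("ffmpeg",)):
    """Return the names of required executables that are not found on PATH.

    Hypothetical helper for illustration; the project itself may check
    (or not check) its dependencies differently.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```

moviepy is a Python package, so a failed `import moviepy` in the same interpreter is the corresponding check for it.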

Usage

Use the CLI with the following command pattern:

python3 ./src/cli.py --video VIDEO_PATH [options]

Options:

  • --video VIDEO_PATH: Path to the video file.
  • --audio AUDIO_PATH: Path to an alternative audio file (optional).
  • --text TEXT_PATH: Path to a text file for text-to-speech conversion (optional).
  • --start_time START_TIME: Start time to cut the video, format 'hh:mm:ss' or seconds (optional).
  • --end_time END_TIME: End time to cut the video, format 'hh:mm:ss' or seconds (optional).
  • --volume_factor VOLUME: Float value to adjust the volume of the video's original audio (optional, default is 1).
  • --speed SPEED: Integer percentage that adjusts the speed of the text-to-speech audio; negative values are supported (optional, default is 0, IBM only for now).
  • --apikey API_KEY: IBM TTS service API key or ElevenLabs API key, depending on the selected service (optional).
  • --url SERVICE_URL: IBM TTS service URL or ElevenLabs service URL, depending on the selected service (optional).
  • --service_tts SERVICE: Select between 'IBM' or 'ELABS' for the text-to-speech service (optional, default is 'IBM').
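The options above could be declared with `argparse` roughly as follows. This is an illustrative sketch that mirrors the documented flags and defaults; it is an assumption about structure, not the actual contents of `src/cli.py`, and the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Video Processing CLI (sketch)")
    parser.add_argument("--video", help="Path to the video file")
    parser.add_argument("--audio", help="Path to an alternative audio file")
    parser.add_argument("--text", help="Path to a text file for text-to-speech")
    parser.add_argument("--start_time", help="Cut start, 'hh:mm:ss' or seconds")
    parser.add_argument("--end_time", help="Cut end, 'hh:mm:ss' or seconds")
    parser.add_argument("--volume_factor", type=float, default=1.0,
                        help="Volume multiplier for the original audio")
    parser.add_argument("--speed", type=int, default=0,
                        help="TTS speed percentage, may be negative")
    parser.add_argument("--apikey", help="API key for the selected TTS service")
    parser.add_argument("--url", help="Service URL for the selected TTS service")
    parser.add_argument("--service_tts", choices=["IBM", "ELABS"], default="IBM",
                        help="Text-to-speech service to use")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(["--video", "clip.mp4", "--speed", "-10"])
print(args.video, args.speed, args.volume_factor, args.service_tts)
```

Because none of the declared flags look like negative numbers, `argparse` treats `-10` as the value of `--speed` rather than as a separate option.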

Examples:

Generate subtitles from a video's audio:

python3 ./src/cli.py --video path/to/video.mp4

Generate subtitles and cut the video with custom audio:

python3 ./src/cli.py --video path/to/video.mp4 --audio path/to/audio.mp3 --start_time 00:00:30 --end_time 00:02:30
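Since `--start_time` and `--end_time` accept either 'hh:mm:ss' or a number of seconds, both forms have to be normalized before cutting. A minimal sketch of one way to do that is shown below; `parse_time` is a hypothetical helper, and the project may handle these values differently (for example by passing them straight to moviepy).

```python
def parse_time(value):
    """Convert 'hh:mm:ss' or a plain number of seconds to float seconds.

    Hypothetical helper for illustration only.
    """
    parts = str(value).strip().split(":")
    if len(parts) not in (1, 3):
        raise ValueError(f"expected 'hh:mm:ss' or seconds, got {value!r}")
    seconds = 0.0
    for part in parts:
        # Each step shifts the running total up one unit (hours -> minutes -> seconds).
        seconds = seconds * 60 + float(part)
    return seconds


start = parse_time("00:00:30")
end = parse_time("00:02:30")
print(end - start)  # length of the cut in seconds
```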

Overlay text-to-speech audio onto a video with a custom speed and service (per the option list, --speed currently takes effect only with IBM):

python3 ./src/cli.py --video path/to/video.mp4 --text path/to/textfile.txt --speed -10 --apikey yourapikey --url yourserviceurl --service_tts ELABS
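One common way to express a speed percentage for a text-to-speech service that accepts SSML, such as IBM Watson, is the `<prosody rate="...">` element. The sketch below shows that idea only; it is an assumption about how a `--speed` value might be applied, not a description of what this project actually sends, and `with_speed` is a hypothetical name.

```python
from xml.sax.saxutils import escape


def with_speed(text, speed_percent=0):
    """Wrap text in an SSML prosody element for a relative speed change.

    Hypothetical helper: the result is meant to be sent as SSML, so
    special characters are escaped in both branches.
    """
    safe_text = escape(text)
    if speed_percent == 0:
        # No speed change requested, so no prosody wrapper is needed.
        return safe_text
    # The '+d' format keeps an explicit sign, e.g. "+15%" or "-10%".
    return f'<prosody rate="{speed_percent:+d}%">{safe_text}</prosody>'


print(with_speed("Hello & welcome", -10))
```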

Convert a text file to speech without video:

python3 ./src/cli.py --text path/to/textfile.txt --apikey yourapikey --url yourserviceurl --service_tts IBM

Contributing

Contributions are welcome! Feel free to fork the repository, make improvements, and submit pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.
