
pdf2video

Description

pdf2video is a Python script that combines

  • (selected pages of) a PDF presentation, and
  • a text script

into a video narrated by the Amazon Polly text-to-speech engine. It can be used to generate, for instance, educational videos.

Please see this sample video, produced with the tool, for a short introduction. Some browsers do not show the subtitles embedded in MP4 videos; in such a case, please see this sample video with WebVTT subtitles.

Requirements

Using pdf2video requires external tools and services, among them the Amazon Polly text-to-speech service that provides the narration.

Installation

One can use pip to install pdf2video directly from GitHub:

python3 -m pip install git+https://github.com/tjunttila/pdf2video.git

See the PyPA Installing Packages tutorial for information on installing Python packages and on Python virtual environments.

Usage

In the simplest case,

pdf2video presentation.pdf script.txt video.mp4

converts the PDF file presentation.pdf and the UTF-8 encoded script file script.txt into the video video.mp4 narrated by the default voice (Amazon Polly standard voice Joanna in the current version). The video includes SRT subtitles that can be displayed by most video players. In addition, for HTML use, WebVTT subtitles are produced in a separate file as well.
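The two subtitle formats mentioned above differ, among other things, in how cue timestamps are written: SRT separates seconds and milliseconds with a comma, while WebVTT uses a period. The following is a minimal sketch of that difference only. The function name format_timestamp is hypothetical and is not taken from the pdf2video source.

```python
def format_timestamp(ms: int, webvtt: bool = False) -> str:
    """Format a millisecond offset as an SRT or WebVTT cue timestamp."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    # SRT writes HH:MM:SS,mmm whereas WebVTT writes HH:MM:SS.mmm
    separator = "." if webvtt else ","
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{separator}{millis:03d}"


print(format_timestamp(3_723_004))               # SRT style
print(format_timestamp(3_723_004, webvtt=True))  # WebVTT style
```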

The selected PDF pages as well as the narration voice can be changed easily. For instance, the sample video was produced with the command

pdf2video sample.pdf sample.txt --pages "1,2,4-6" --voice Matthew --neural --conversational sample.mp4
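To illustrate what a page selection such as "1,2,4-6" denotes, here is a small sketch that expands it into a list of page numbers. It assumes that the selection is a comma-separated list of single pages and inclusive ranges, which is inferred from the example above only; the function parse_pages is hypothetical and the tool's own parser may accept more (or less) than this.

```python
from typing import List


def parse_pages(spec: str) -> List[int]:
    """Expand a selection like "1,2,4-6" into [1, 2, 4, 5, 6]."""
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # An inclusive range "first-last"
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"invalid page range: {part!r}")
            pages.extend(range(first, last + 1))
        else:
            pages.append(int(part))
    return pages


print(parse_pages("1,2,4-6"))
```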

All the options can be printed with pdf2video --help.

The script file is formatted as follows. The script for each presentation page starts with a line #page [name], and the text that follows is the script for that page. The optional [name] parameter, which can be used in the --only option of the tool, is a string of ASCII letters and underscores, possibly followed by a non-negative number. For instance, defs and example_3 are valid names.

A line starting with % is a comment and thus ignored.
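The format rules above can be sketched as a small splitter that drops comment lines and groups the remaining text by #page line. This is an illustration under assumptions, not the tool's actual parser: the names PAGE_RE and split_script are hypothetical, and text appearing before the first #page line is simply discarded here.

```python
import re
from typing import List, Optional, Tuple

# "#page" optionally followed by a name: ASCII letters/underscores, then digits
PAGE_RE = re.compile(r"^#page(?:\s+([A-Za-z_]+[0-9]*))?\s*$")


def split_script(text: str) -> List[Tuple[Optional[str], str]]:
    """Return a list of (page name or None, page script text) pairs."""
    pages: List[Tuple[Optional[str], List[str]]] = []
    for line in text.splitlines():
        if line.startswith("%"):
            continue  # comment line, ignored
        match = PAGE_RE.match(line)
        if match:
            pages.append((match.group(1), []))  # start a new page
        elif pages:
            pages[-1][1].append(line)  # text belongs to the current page
    return [(name, "\n".join(lines).strip()) for name, lines in pages]


example = "% a comment\n#page intro\nHello *world*.\n#page\nSecond page.\n"
print(split_script(example))
```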

In the script text, one can use the following modifiers:

  • *text* to read text in an emphasized style,
  • @xyz@ to spell xyz as characters,
  • #slow/text/ to read text at a slower rate,
  • #high/text/ to use a higher pitch for text,
  • #low/text/ to use a lower pitch for text,
  • #n, where n is a positive integer, to have a pause of n*100 ms,
  • #ph/word/pronunciation/ to pronounce word with the given X-SAMPA pronunciation, and
  • #sub/text/subtitle/ to use subtitle as the subtitle instead of the spoken text.

Above, the / delimiter can be any other symbol not occurring in the "arguments" of the modifier. This allows one to nest modifiers. For instance, #sub/big-#ph!Theta!Ti:.t@! of n/Θ(n)/ reads as "big-theta of n" but shows as Θ(n) in the subtitles.
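The reason a free choice of delimiter permits nesting can be seen in a small sketch: if the character directly after the modifier name is taken as the delimiter, then an inner modifier that uses a different delimiter passes through untouched as part of an argument. The function parse_sub below is hypothetical and handles only a string consisting of exactly one #sub modifier; it is not how pdf2video itself is implemented.

```python
from typing import Tuple


def parse_sub(modifier: str) -> Tuple[str, str]:
    """Split '#sub<d>spoken<d>subtitle<d>' into (spoken, subtitle)."""
    prefix = "#sub"
    if not modifier.startswith(prefix) or len(modifier) <= len(prefix):
        raise ValueError("not a #sub modifier")
    delimiter = modifier[len(prefix)]  # the symbol right after '#sub'
    parts = modifier[len(prefix) + 1:].split(delimiter)
    # Expect exactly: spoken text, subtitle, and an empty tail after the
    # closing delimiter.
    if len(parts) != 3 or parts[2] != "":
        raise ValueError("malformed #sub modifier")
    return parts[0], parts[1]


spoken, subtitle = parse_sub("#sub/big-#ph!Theta!Ti:.t@! of n/Θ(n)/")
print(spoken)    # still contains the nested #ph modifier, delimited by '!'
print(subtitle)
```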

Please see the file sample.txt for examples.

Some good practices and hints

  • Converting a script with many pages to video can take some time. For developing and debugging the script text, it is recommended to name the script pages with #page pagename, and then use the --only option of the tool to convert only the page under development.
  • For pronunciations, one can find IPA pronunciations in many online dictionaries, and then convert them to X-SAMPA by using the table in the X-SAMPA Wikipedia page.
  • Whenever possible, avoid using the @xyz@ construct as it seems to change the pitch of the whole sentence.
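As an illustration of the IPA to X-SAMPA hint above, the conversion is essentially a symbol-by-symbol table lookup. The sketch below covers only a handful of common symbols; the names IPA_TO_XSAMPA and ipa_to_xsampa are hypothetical, characters missing from the table are copied unchanged, and the result should always be checked against the full X-SAMPA table.

```python
# Partial IPA -> X-SAMPA table; many IPA symbols are deliberately omitted.
IPA_TO_XSAMPA = {
    "θ": "T", "ð": "D", "ʃ": "S", "ʒ": "Z", "ŋ": "N",
    "ə": "@", "ɪ": "I", "ʊ": "U", "ɛ": "E", "æ": "{",
    "ʌ": "V", "ɑ": "A", "ɔ": "O",
    "ː": ":",   # length mark
    "ˈ": '"',   # primary stress
    "ˌ": "%",   # secondary stress
}


def ipa_to_xsampa(ipa: str) -> str:
    """Convert an IPA string to X-SAMPA, one character at a time."""
    return "".join(IPA_TO_XSAMPA.get(ch, ch) for ch in ipa)


print(ipa_to_xsampa("ˈθiːtə"))
```

The converted string can then be used as the pronunciation argument of the #ph modifier.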

License

The pdf2video tool is released under the MIT License.
