deangelisdf / write2audiobook

A powerful tool designed to convert text-based documents into engaging audiobooks. Perfect for anyone looking to make reading more accessible, whether for people with visual impairments or for those who simply prefer listening on the go.

License: MIT License

Python 100.00%
audiobooks ffmpeg python3 visually-impaired

write2audiobook's Introduction

Write2Audiobook

Simplify life with audiobooks.
A tool designed to help visually impaired people by converting text-based documents into audiobooks.

Features

Convert various text formats to audio, including EPUB, TXT, PPTX, and DOCX.
Easy to use with simple command-line instructions.
Enhances accessibility for visually impaired users.

Requirements

To get started, clone the repository and install the necessary dependencies:

git clone https://github.com/deangelisdf/write2audiobook
cd write2audiobook
python3 -m pip install -r requirements.txt

How to Use

You can convert your documents to audiobooks using the following commands:
To convert an EPUB book to an audiobook:

python3 ebook2audio.py book.epub language

To convert a plain text file to an audiobook:

python3 txt2audio.py text.txt language

To convert a PowerPoint presentation to an audiobook:

python3 pptx2audio.py presentation.pptx language

To convert a Word document to an audiobook:

python3 docx2audio.py document.docx language

where language is the language code: it stands for Italian and en stands for English.

Contributing

We welcome contributions! If you'd like to contribute, please fork the repository and submit a pull request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Contact

For questions, suggestions, or feedback, feel free to open an issue or contact us.

write2audiobook's People

Contributors

deangelisdf, greg-martinez44, brunodavi, wux4an

write2audiobook's Issues

Improve table reading (docx)

Table reading is already in place, but it is not clear to the listener where a table starts or how its rows and cells are iterated.

Desirable: add a parameter that enables a more verbose reading of tables
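
One way such a verbosity parameter could work is sketched below. It assumes the table has already been extracted as a list of rows, each row a list of cell strings; the function name table_to_speech and the verbose flag are hypothetical and are not taken from docx2audio.py.

```python
def table_to_speech(rows, verbose=False):
    """Turn a table (list of rows, each a list of cell strings) into text for TTS.

    Hypothetical helper: with verbose=True the listener is told where the
    table starts and ends and which row is being read.
    """
    if not rows:
        return ""
    lines = []
    if verbose:
        n_cols = max(len(row) for row in rows)
        lines.append(f"Table start: {len(rows)} rows, {n_cols} columns.")
    for row_index, row in enumerate(rows, start=1):
        cells = ", ".join(cell.strip() for cell in row)
        if verbose:
            lines.append(f"Row {row_index}: {cells}.")
        else:
            lines.append(f"{cells}.")
    if verbose:
        lines.append("Table end.")
    return "\n".join(lines)
```

The announcements are plain English strings here; in the real tool they would probably need to follow the selected language.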

(Pre-analysis) (ebook/docx) Remove references in books

Many books contain links to references, but reading those numbers or phrases aloud does not sound natural in a narrated section.

Every script has a method that extracts the chapter text. The analysis required here should take that text as input and remove any references.
Most of the time a reference is written as [1], where 1 is the reference number.
Reference examples:

  • [1]
  • (Author, 1999)
  • phrase.1

Desirable: add an optional flag to remove the references from the text.
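
A minimal sketch of this pre-analysis, assuming the three reference styles listed above are the only ones to handle. The function name remove_references and the regular expressions are illustrative and are not part of the existing scripts; real books will likely need more patterns and some tuning to avoid false positives.

```python
import re

# [1], [1, 2] or [1-3], together with the whitespace in front of them
BRACKET_REF = re.compile(r"\s*\[\d+(?:\s*[,-]\s*\d+)*\]")
# (Author, 1999): capitalized text, a comma, then a four-digit year
AUTHOR_YEAR_REF = re.compile(r"\s*\([A-Z][^()]*?,\s*\d{4}[a-z]?\)")
# phrase.1: digits glued to sentence-ending punctuation that follows a letter
FOOTNOTE_REF = re.compile(r"(?<=[A-Za-z][.!?])\d+(?=\s|$)")


def remove_references(text):
    """Return the text with the reference markers stripped out."""
    for pattern in (BRACKET_REF, AUTHOR_YEAR_REF, FOOTNOTE_REF):
        text = pattern.sub("", text)
    return text
```

Because the footnote pattern requires a letter before the punctuation, decimal numbers such as 3.14 are left untouched.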

Is it possible to compress the final audio?

Right now, lowering the bit_rate is the only way I have found to reduce the final file size.

This is a feasibility study, so the desired output is either:

  • a pull request with a solution
    or
  • an analysis of all the compression approaches tried
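
As one starting point for that analysis, the sketch below builds an ffmpeg command that re-encodes the audio with a lower bit rate, a single channel, and a lower sample rate. The function name and the default values are assumptions chosen for illustration, and the command has not been tried against the files this project produces.

```python
def build_compress_command(src, dst, bit_rate="48k", mono=True, sample_rate=22050):
    """Build an ffmpeg argument list that re-encodes src into a smaller dst.

    Hypothetical helper: the defaults are illustrative, not tuned values.
    """
    cmd = ["ffmpeg", "-y", "-i", src, "-c:a", "aac", "-b:a", bit_rate]
    if mono:
        cmd += ["-ac", "1"]                 # downmix to one channel
    if sample_rate:
        cmd += ["-ar", str(sample_rate)]    # resample the audio
    cmd.append(dst)
    return cmd

# To run it: subprocess.run(build_compress_command("in.m4b", "out.m4b"), check=True)
```

Speech usually tolerates mono and a reduced sample rate better than music does, which is why those two options are exposed alongside the bit rate.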

(tool) move constants to libraries

BACK_END_TTS needs to be moved to m4b.py; other global constants may also need to be moved to the library.

The goal of this activity is to reduce duplicated code.

(plantuml) read UML diagram

Parse and interpret the diagram in order to generate text

  • #35
  • #37
  • use case diagram
  • Activity diagram
  • statechart diagram
  • review all code to be consistent
  1. The parser for each diagram shall live in a dedicated folder, plant_uml
  2. Each parser shall return a graph representing the diagram it read

Reduce Cyclomatic Complexity

The aim is to reduce the risk of introducing bugs and to improve code readability.

Desirable: create a report of changes for each commit, generated with a GitHub Action

Add internal documentation

Not all the functions in the scripts are documented.
To improve the quality and readability of the scripts, each function needs at least a header comment.
Desirable: add type hints as well as header comments for each function.

DeepL API integration

Sometimes a book is not written in our mother tongue. To solve this, we can use the DeepL API to translate documents into our preferred language.

Test all scripts in different environments

Right now all the scripts have only been used under Windows.
It would be useful to extend compatibility at least to macOS.
Desirable: test them under Ubuntu and other Linux distributions.

Improve code style

Update all scripts to comply with the active PyLint rules

  • fix code style (ba59592)
  • refactor pptx2audio.py (df7e6c5)
  • reduce cyclomatic complexity (39ccde8)
  • reduce function nesting
  • analyze all performance bottlenecks
  • remove duplicate code
  • #30

(epub) move temporary mp3 files to temporary folder

When an EPUB is converted to M4B, the following steps are executed:

  1. extract all data from the EPUB into a temporary folder
  2. extract the guide from the EPUB
  3. for each file (not in the guide)
    • extract the text and save it (in the folder where the script is executed)
    • generate the mp3 and save it (in the folder where the script is executed)
  4. generate the ffmetadata from the mp3 files generated previously
  5. merge all mp3 files and the ffmetadata into a single M4B file

The goal of this activity is to remove the temporary files.
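
A minimal sketch of how steps 3 to 5 could keep their intermediate files out of the working directory, using tempfile.TemporaryDirectory from the standard library. The function convert_chapters and the synthesize and merge callbacks are hypothetical stand-ins for the project's real TTS and M4B code.

```python
import tempfile
from pathlib import Path


def convert_chapters(chapters, synthesize, merge):
    """Write per-chapter text and mp3 files into a temporary folder.

    chapters:   list of (name, text) pairs
    synthesize: hypothetical callback (text, mp3_path) that writes the mp3
    merge:      hypothetical callback (list of mp3 paths) that builds the M4B
    """
    with tempfile.TemporaryDirectory(prefix="write2audiobook_") as tmp:
        tmp_dir = Path(tmp)
        mp3_paths = []
        for index, (name, text) in enumerate(chapters):
            txt_path = tmp_dir / f"{index:03d}_{name}.txt"
            mp3_path = txt_path.with_suffix(".mp3")
            txt_path.write_text(text, encoding="utf-8")
            synthesize(text, mp3_path)
            mp3_paths.append(mp3_path)
        # Merge while the files still exist; the folder and everything in it
        # is deleted when the with block exits, even if an error is raised.
        return merge(mp3_paths)
```

The merge step has to write the final M4B outside the temporary folder, otherwise it would be deleted together with the intermediate files.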

Prototype Excel reader

Similar to the other scripts, it would be useful to have a reader for Excel files.

Desirable:

  • add requirements
  • read simple tables and sheets
  • say how many graphs are present on the page (in a later step, we also want to read graphs)

Class diagram

Create a script, based on the others in the repository, that takes a class diagram (PlantUML) as input. The work includes:

  • plantuml class-diagram parser
  • textual equivalent to diagram
  • generate the final audio
  • unit tests

Docs: https://plantuml.com/class-diagram
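
The first two items could start from something like the sketch below. It covers only a small, assumed subset of the class-diagram syntax (class declarations and five relation arrows between plain identifiers), and the function names, the graph layout, and the spoken phrases are all illustrative choices rather than an existing design.

```python
import re

# "Left ARROW Right" lines; longer arrows are listed before the plain "--"
RELATION = re.compile(r"^(\w+)\s+(<\|--|\*--|o--|-->|--)\s+(\w+)")
CLASS_DECL = re.compile(r"^(?:abstract\s+)?class\s+(\w+)")

# Spoken form of each arrow (wording is an assumption)
PHRASES = {
    "<|--": "{right} extends {left}",
    "*--": "{left} is composed of {right}",
    "o--": "{left} aggregates {right}",
    "-->": "{left} points to {right}",
    "--": "{left} is associated with {right}",
}


def parse_class_diagram(source):
    """Return a small graph: class names plus (left, arrow, right) edges."""
    graph = {"classes": [], "relations": []}
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith(("@startuml", "@enduml", "'")):
            continue  # skip blank lines, delimiters and comments
        match = RELATION.match(line)
        if match:
            left, arrow, right = match.groups()
            for name in (left, right):
                if name not in graph["classes"]:
                    graph["classes"].append(name)
            graph["relations"].append((left, arrow, right))
            continue
        match = CLASS_DECL.match(line)
        if match and match.group(1) not in graph["classes"]:
            graph["classes"].append(match.group(1))
    return graph


def describe(graph):
    """Build the textual equivalent that would be sent to the TTS backend."""
    classes = graph["classes"]
    sentences = [f"The diagram has {len(classes)} classes: {', '.join(classes)}."]
    for left, arrow, right in graph["relations"]:
        sentences.append(PHRASES[arrow].format(left=left, right=right) + ".")
    return " ".join(sentences)
```

Attributes, methods, labels on relations, and reversed arrows are ignored here and would need their own handling.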

Add language support

The tool supports only the Italian language; adding other languages requires properly configuring each backend provided to the tool.

To add a new language, the developer needs to modify the global dictionaries as follows:

  • LANGUAGE_DICT = {"it-IT":"it"}
  • LANGUAGE_DICT_PYTTS = {"it-IT":"italian"} in m4b.py
  • TITLE_KEYWORD = {"it-IT":"TITOLO", "en":"TITLE"}
  • CHAPTER_KEYWORD= {"it-IT":"CAPITOLO", "en":"CHAPTER"} in docx2audio.py

Desirable: look for hard-coded strings and generalize them with a global dictionary, like the ones above.
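
A minimal sketch of how the dictionaries above could be extended and looked up in one place. The "en" entries in LANGUAGE_DICT and LANGUAGE_DICT_PYTTS are assumptions (the right values depend on each backend), and language_settings is a hypothetical helper that does not exist in the scripts.

```python
LANGUAGE_DICT = {"it-IT": "it", "en": "en"}                  # "en" entry is assumed
LANGUAGE_DICT_PYTTS = {"it-IT": "italian", "en": "english"}  # "english" is assumed
TITLE_KEYWORD = {"it-IT": "TITOLO", "en": "TITLE"}
CHAPTER_KEYWORD = {"it-IT": "CAPITOLO", "en": "CHAPTER"}


def language_settings(language):
    """Collect every per-language value, failing early if one is missing."""
    tables = {
        "tts_code": LANGUAGE_DICT,
        "tts_voice": LANGUAGE_DICT_PYTTS,
        "title_keyword": TITLE_KEYWORD,
        "chapter_keyword": CHAPTER_KEYWORD,
    }
    missing = [name for name, table in tables.items() if language not in table]
    if missing:
        raise ValueError(f"language {language!r} is not configured for: {', '.join(missing)}")
    return {name: table[language] for name, table in tables.items()}
```

Checking all dictionaries at once makes a partially configured language show up as a clear error instead of a KeyError in the middle of a conversion.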

(PPTX) get context from images

Desirable: use Google Lens or an equivalent service to analyze an image and retrieve information about it

  • extract the images from the slides
  • convert them to JPEG format (to compress the images)
  • send them to a generative AI (such as Bing Copilot or ChatGPT) to retrieve the image context
