Giter VIP home page Giter VIP logo

pdf-flow's Introduction

PDF Flow Banner

This program was created to dynamically analyze scanned PDF files using user defined templates to grab in-document information to use for renaming the files, save to project specific directories, and e-mail files to project specific distribution lists.

Install Process

Note: PDF Flow is currently only compatible with Windows due to hardcoded configuration storage settings. Updating this to allow cross compatibility with Linux will be completed soon in a future update.

Dependencies

Before getting started, make sure you have the following dependencies installed on your system. Once installed you can either add them manually to your system PATH or specify their locations in the PDF Flow settings tab after opening. When using either option, use the Test buttons in the settings tab to ensure they are properly detected.

  • Poppler - Used for transforming PDFs to images.
  • Tesseract OCR - The OCR engine for text extraction.

Note: If adding to PATH after starting PDF Flow or from an already open cmd prompt, you will have to restart them for PATH to be updated.

Installation

Option 1: Installation via .msi File (Windows)

  1. Visit the PDF Flow website to download the latest .msi installer for Windows.
  2. Run the downloaded .msi file and follow the on-screen instructions to install the program.
  3. Once installed, you can launch the program from the Start Menu or desktop shortcut.

Option 2: Building from Source

If you prefer to build from source you can follow these steps:

  1. Clone the GitHub repository:

    git clone https://github.com/bgorman87/PDF-Flow.git
    cd pdf-flow
  2. Create and activate a virtual environment (recommended):

    python -m venv venv
    venv\Scripts\activate
  3. Install the required Python packages:

    pip install -r requirements.txt
  4. Note: On Windows if you plan to compile an executable yourself, then during file processing you may notice brief command prompt windows appearing and disappearing. If running directly from source, you may not see this occurring depending on your environment. This is due though to how the pdf2image package dependency runs Poppler through these prompts, which I cannot programmatically change.

    If you prefer not to see these windows appear/disappear, follow these general steps to prevent pdf2image from creating/showing the cmd prompts:

    1. Open pdf2image.py in your editor of choice. This file should be located at venv\Lib\site-packages\pdf2image\pdf2image.py
    2. Look for any lines that use Popen to execute a command.
    3. Add the following flag to the respective Popen calls: creationflags=0x08000000.
    • Here's an example of what the modified line could look like:
      proc = Popen(command, env=env, stdout=PIPE, stderr=PIPE, creationflags=0x08000000)
  5. Run PDF Flow:

    python PDF-Flow.py

Usage

For usage information/guides please see information located here: PDF Flow Usage

FAQ

Will this be available for Linux A: Currently have updating for cross compatibility on my list of things to do. Majority of this was created in Linux so not much to change. Should be available soon.

Contact/Bugs

For any bugs please check for relevant issues and if none apply then please open a new issue.

For general comments/questions you can reach out directly through github or through e-mail at [email protected].

Contribution

Being a new/solo developer without a CS degree or any industry experience, I very much welcome any contributions to PDF Flow! I know there is a ton that can be improved or added so if you would like to contribute, please follow these steps:

  1. Fork the Repository: Click the "Fork" button at the top-right of this repository to create your own copy.

  2. Clone the Repository: Clone your forked repository to your local machine:

    git clone https://github.com/your-username/PDF-Flow.git

    cd PDF-Flow

  3. Create a Branch: Create a new descriptively named branch for your changes:

    git checkout -b your-descriptive-branch-name

  4. Make Changes: Make your desired changes to the codebase.

  5. Commit Changes: Commit your changes with a descriptive commit message:

    git commit -m "Add feature XYZ"

  6. Push to Your Fork: Push your changes to your fork on GitHub:

    git push origin your-descriptive-branch-name

  7. Submit a Pull Request: Open a pull request from your fork to this repository's main branch. Provide a clear and detailed description of your changes.

  8. Review and Collaborate: Participate in the discussion and make any necessary adjustments based on feedback.

pdf-flow's People

Contributors

bgorman87 avatar

Stargazers

 avatar  avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.