
pdf_to_csv_script's Introduction

PDF with tables to CSV

Installing and Using this script

First, make sure Python 3 is installed on your system. If it is not, download and install it from python.org.

Make sure Python is added to your PATH!

Create a new python virtual environment

python -m venv venv

Activate your new python virtual environment

Windows CMD: venv\Scripts\activate

Linux/Mac: source venv/bin/activate
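
To confirm that the activation step worked before installing anything, a small check along these lines may help. It is only a sketch: the helper name `in_virtualenv` is made up for illustration and is not part of pdf_to_csv_script.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter in use:", sys.executable)
```

If this reports that no virtual environment is active, run the activation command for your platform again in the same terminal.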

Update your pip, setuptools and wheel versions

pip install --upgrade pip setuptools wheel

Install this git repository with pip

pip install git+https://github.com/DanielPDWalker/pdf_to_csv_script.git

In your terminal type

pdf_to_csv convert <relative_path_to_your_pdf>

Once the script has run, there should be a CSV file with the same name as your PDF in the directory that your PDF is in.
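
The naming rule above can be sketched in Python to check whether the output file was produced. This assumes the output differs from the input only by its `.csv` extension, as the description suggests; the helper `expected_csv_path` and the file name `reports/tables.pdf` are hypothetical and not part of the script.

```python
from pathlib import Path


def expected_csv_path(pdf_path: str) -> Path:
    """Return the path where the converted CSV is expected to appear.

    Assumption: same directory and same base name as the PDF, with the
    extension swapped to .csv.
    """
    return Path(pdf_path).with_suffix(".csv")


if __name__ == "__main__":
    pdf = "reports/tables.pdf"  # hypothetical input path
    csv_file = expected_csv_path(pdf)
    status = "found" if csv_file.exists() else "not found"
    print(f"{csv_file}: {status}")
```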


Other ways to get data out of a PDF

https://academy.datawrapper.de/article/135-how-to-extract-data-out-of-pdfs


pdf_to_csv_script's Issues

" Building wheel for pdf-to-csv-script (setup.py) ... error"

Hello:
I received the following while building 'pdf_to_csv_script'.

Building wheels for collected packages: pdf-to-csv-script
Building wheel for pdf-to-csv-script (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /home/david/venv/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-_1na_6q8/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-_1na_6q8/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-i7fok7rs
cwd: /tmp/pip-req-build-_1na_6q8/
Complete output (6 lines):
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help

error: invalid command 'bdist_wheel'

ERROR: Failed building wheel for pdf-to-csv-script
Running setup.py clean for pdf-to-csv-script
Failed to build pdf-to-csv-script

Thank you for working on this. I am not a programmer. I found your script while searching for a way to convert PDF to CSV so I can assist my wife in examining 3100 pages of tables from the 55000-page Pfizer document dump (PDF) that you may have heard about on the news. I need to be able to get these tables into a spreadsheet. Any suggestions would be most appreciated. I attach a screenshot of page 1. Thank you.
Picture of table from Pfizer
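
The `invalid command 'bdist_wheel'` error in the log above usually means that the `wheel` package is not installed in the environment that runs setup.py, which is what the `pip install --upgrade pip setuptools wheel` step in the installation instructions is meant to address. A minimal sketch for checking this from inside the activated environment follows; the helper name `missing_build_packages` is hypothetical.

```python
import importlib.util


def missing_build_packages(names=("setuptools", "wheel")):
    """Return the packages from `names` that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_build_packages()
    if missing:
        print("Missing build packages:", ", ".join(missing))
        print("Try: pip install --upgrade pip setuptools wheel")
    else:
        print("setuptools and wheel are both available.")
```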
