PyMuPDF 1.14.11

Release date: January 15, 2018

On PyPI since August 2016.

Authors

Introduction

This is version 1.14.11 of PyMuPDF (formerly python-fitz), a Python binding with support for MuPDF 1.14.x - "a lightweight PDF, XPS, and E-book viewer".

MuPDF can access files in PDF, XPS, OpenXPS, CBZ, EPUB and FB2 (e-books) formats, and it is known for its top performance and high rendering quality.

With PyMuPDF you can therefore access files with extensions like ".pdf", ".xps", ".oxps", ".cbz", ".fb2" or ".epub" from your Python scripts.
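As a minimal sketch of what this looks like in a script: the helper names is_supported and count_pages are made up for illustration, and the extension set is simply copied from the sentence above. The import name fitz and the call fitz.open(filename) are PyMuPDF's own; the import is done inside the function so the extension check also works where PyMuPDF is not installed.

```python
import os

# Extensions named in the text above.
SUPPORTED_EXTENSIONS = {".pdf", ".xps", ".oxps", ".cbz", ".fb2", ".epub"}


def is_supported(filename):
    """Return True if the file extension is one of those listed above."""
    ext = os.path.splitext(filename)[1].lower()
    return ext in SUPPORTED_EXTENSIONS


def count_pages(filename):
    """Open a document and return its page count (requires PyMuPDF)."""
    import fitz  # PyMuPDF is imported under the name "fitz"

    doc = fitz.open(filename)
    try:
        return len(doc)
    finally:
        doc.close()
```

A script could call is_supported() first and only hand matching files to count_pages().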

See the change log and usage recipes.

Installation

We offer wheels for all Windows versions and, thanks to our user @jbarlow83, for the major Mac OSX and Linux versions. They can also be found in the download section of PyPI.

The platform tag for Mac OSX is macosx_10_6_intel.

The platform tag for Linux is manylinux1_x86_64, which makes these wheels usable on Debian, Ubuntu and most other variations.
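The platform tag is the last dash-separated field of a wheel's file name. The following sketch extracts it so a script can tell which wheel belongs to which platform; the function name and the two file names are hypothetical and only follow the standard wheel naming scheme.

```python
def wheel_platform_tag(wheel_filename):
    """Return the platform tag, the last dash-separated field of a wheel name."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % wheel_filename)
    stem = wheel_filename[: -len(".whl")]
    # Platform tags use underscores, so the last "-" separates the tag.
    return stem.rsplit("-", 1)[-1]


# Hypothetical file names, for illustration only.
linux_tag = wheel_platform_tag("PyMuPDF-1.14.11-cp37-cp37m-manylinux1_x86_64.whl")
mac_tag = wheel_platform_tag("PyMuPDF-1.14.11-cp37-cp37m-macosx_10_6_intel.whl")
print(linux_tag)
print(mac_tag)
```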

On other operating systems you need to generate PyMuPDF yourself. And of course you can choose to do so for a wheel-supported platform, too. Before you can do this, you must download and generate MuPDF. This process depends very much on your system. For most platforms, the MuPDF source contains prepared procedures for achieving this. Please observe the following general steps:

  • Be sure to download the official MuPDF source release from here. Do not use MuPDF's GitHub repo. It contains their current development source, which is not compatible with this PyMuPDF version most of the time.

  • The repo's fitz folder contains a few files whose names start with an underscore "_". These files contain configuration data and hotfixes. Each one must be copy-renamed to its correct target location of the MuPDF source that you have downloaded, before you generate MuPDF. Currently, these files are:

    • fitz configuration file _mupdf_config.h, copy-replace to: mupdf/include/fitz/config.h. It contains configuration data, for example which fonts to support.
    • fitz error module _error.c, copy-replace to: mupdf/source/fitz/error.c. It redirects MuPDF warnings and errors so they can be intercepted by PyMuPDF.
    • PDF device module _pdf-device.c, copy-replace to: mupdf/source/pdf/pdf-device.c. It fixes a bug which caused method Document.convertToPDF() to bring down the interpreter.

  • Now MuPDF can be generated.
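The copy-replace steps listed above could be scripted roughly as follows. This is only a sketch: the function name apply_hotfixes and its two directory arguments are assumptions, and the file mapping is taken directly from the list. It expects mupdf_dir to be the top-level mupdf folder of the downloaded source release.

```python
import os
import shutil

# File in the repo's fitz folder -> target path inside the MuPDF source tree.
HOTFIXES = {
    "_mupdf_config.h": "include/fitz/config.h",
    "_error.c": "source/fitz/error.c",
    "_pdf-device.c": "source/pdf/pdf-device.c",
}


def apply_hotfixes(fitz_dir, mupdf_dir):
    """Copy each underscore-prefixed file over its MuPDF target.

    Returns the list of target paths that were overwritten.
    """
    written = []
    for name, relative_target in HOTFIXES.items():
        source = os.path.join(fitz_dir, name)
        target = os.path.join(mupdf_dir, *relative_target.split("/"))
        shutil.copyfile(source, target)  # replaces the original MuPDF file
        written.append(target)
    return written
```

Run it before generating MuPDF, for example as apply_hotfixes("PyMuPDF/fitz", "mupdf") with paths adjusted to your own layout.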

Once this is done, adjust directories in setup.py and run python setup.py install.

The following sections contain further comments for some platforms.

Ubuntu

Our users (thanks to @gileadslostson and @jbarlow83!) have documented their MuPDF installation experiences from sources in this Wiki page.

OSX

First, install the MuPDF headers and libraries, which are provided by mupdf-tools: brew install mupdf-tools.

Then you might need to export ARCHFLAGS='-arch x86_64', since libmupdf.a is for x86_64 only.

Finally, please double check setup.py before building. Update include_dirs and library_dirs if necessary.

MS Windows

In addition to wheels, this platform offers pre-generated binaries in a ZIP format, which can be used without PIP.

If you are looking to make your own binary, consult this Wiki page. It explains how to use Visual Studio for generating MuPDF in quite some detail.

Usage and Documentation

For all document types you can render pages in raster (PNG) or vector (SVG) formats, extract text and images, and access meta information, links, annotations and bookmarks, as well as decrypt the document.
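A short sketch of rendering a page and extracting its text. It uses the camelCase method names of the 1.14.x line (loadPage, getPixmap, writePNG, getText); later PyMuPDF releases rename these to snake_case, so adjust for your installed version. The function names, the default of 144 dpi and the output file name are assumptions made for illustration.

```python
def zoom_for_dpi(dpi):
    """Page sizes are measured in points (72 per inch), so zoom = dpi / 72."""
    return dpi / 72.0


def render_and_extract(filename, pno=0, dpi=144):
    """Save one page as a PNG and return (png_name, plain_text).

    Requires PyMuPDF; method names follow the 1.14.x API.
    """
    import fitz

    doc = fitz.open(filename)
    page = doc.loadPage(pno)
    zoom = zoom_for_dpi(dpi)
    # A zoom matrix scales the rendering above the default 72 dpi.
    pix = page.getPixmap(matrix=fitz.Matrix(zoom, zoom), alpha=False)
    png_name = "page-%i.png" % pno  # hypothetical output name
    pix.writePNG(png_name)
    text = page.getText("text")
    doc.close()
    return png_name, text
```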

For PDF files, most of these objects can also be created, modified or deleted. In addition, you can rotate, re-arrange, duplicate, create and delete pages, and you can join or split PDF documents.
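Joining documents and re-arranging pages might look like the sketch below. It relies on Document.insertPDF, Document.select, Page.setRotation and Document.save as named in the 1.14.x line (later releases use snake_case names). The function names and the choice to reverse the page order and rotate the first page are illustrative assumptions only.

```python
def reversed_page_order(page_count):
    """Return page numbers from last to first, e.g. 3 -> [2, 1, 0]."""
    return list(range(page_count - 1, -1, -1))


def join_and_rearrange(first, second, output):
    """Append `second` to `first`, reverse the page order, rotate page 0.

    Requires PyMuPDF; both inputs must be PDF files.
    """
    import fitz

    doc = fitz.open(first)
    other = fitz.open(second)
    doc.insertPDF(other)  # appends all pages of `other` by default
    other.close()
    doc.select(reversed_page_order(len(doc)))  # keep pages in this order
    doc[0].setRotation(90)  # rotate the new first page by 90 degrees
    doc.save(output)
    doc.close()
```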

Specifically for PDF files, PyMuPDF also provides update access to low-level structure data, supports handling of embedded files and modification of page contents (like inserting images, fonts, text, annotations and drawings).

Other features include embedding vector images (SVG, PDF) such as logos or watermarks, "posterizing" a PDF or creating "booklet" and "4-up" versions.
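A "4-up" version places four source pages on each output page. The sketch below assumes an A4 output page of 595 x 842 points and uses Page.showPDFpage and Document.newPage as named in the 1.14.x line; the helper names are hypothetical. Coordinates are (x0, y0, x1, y1) with the origin at the top-left corner of the page.

```python
def four_up_rects(width, height):
    """Split a page into four equal cells, listed row by row.

    Order: top-left, top-right, bottom-left, bottom-right.
    """
    half_w, half_h = width / 2.0, height / 2.0
    return [
        (0, 0, half_w, half_h),
        (half_w, 0, width, half_h),
        (0, half_h, half_w, height),
        (half_w, half_h, width, height),
    ]


def make_four_up(infile, outfile, width=595, height=842):
    """Write a 4-up version of a PDF (requires PyMuPDF, 1.14.x names)."""
    import fitz

    src = fitz.open(infile)
    doc = fitz.open()  # new, empty PDF
    cells = [fitz.Rect(*r) for r in four_up_rects(width, height)]
    for pno in range(len(src)):
        if pno % 4 == 0:  # start a new output page every four inputs
            page = doc.newPage(-1, width=width, height=height)
        page.showPDFpage(cells[pno % 4], src, pno)
    doc.save(outfile, garbage=4, deflate=True)
    doc.close()
    src.close()
```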

You can now also create and update Form PDFs and form fields with support for text, checkbox, listbox and combobox widgets.

To some degree, PyMuPDF can also be used as an image converter: it can read a broad range of input formats and can produce Portable Network Graphics (PNG), Portable Anymaps (PNM, etc.), Portable Arbitrary Maps (PAM), Adobe Postscript and Adobe Photoshop documents, making the use of other graphics packages obsolete in many cases. Interfacing with other packages such as PIL/Pillow for image input and output is easy as well.

Have a look at the basic demos, the examples (which contain complete, working programs), and the recipes section of our Wiki sidebar, which contains more than a dozen How-To-style guides.

Our documentation, written using Sphinx, is available in various formats from the following sources. It currently is a combination of a reference guide and a user manual. For a quick start look at the tutorial and the recipes chapters.

Earlier Versions

Earlier versions are available in the releases directory.

License

PyMuPDF is distributed under GNU GPL V3. Because you will implicitly also be using MuPDF, its license GNU AFFERO GPL V3 applies as well. Copies of both are included in this repository.

Contact

Please submit questions, comments or issues here, or directly contact the authors via their e-mail addresses.
