Comments (14)

Rockstar04 commented on June 19, 2024

https://gist.github.com/Rockstar04/c77f9f46f15be7b156aaed9a34bb5188

from pdf2htmlex.

Rockstar04 commented on June 19, 2024

I have a script I use for building pdf2htmlEX that uses Poppler 0.63.0, and font forge, git branch 20170731.

I start with my fork, which is a mix of other people's patches: https://github.com/Rockstar04/pdf2htmlEX

jgoldfar commented on June 19, 2024

Quick work on that alpine image! I had taken a look at reducing the image size using alpine as a base image, but ended up building an image using Debian: https://hub.docker.com/r/jgoldfar/pdf2htmlex-stable/
Looks like an interesting application you all are working on!

asinning commented on June 19, 2024

Thanks Rockstar04! Is your script in your fork? Would you be willing to share? I would be very grateful.

What do you mean by "and font forge, git branch 20170731"? I don't see a FontForge branch with that name or date: https://github.com/fontforge/fontforge/branches/all

asinning commented on June 19, 2024

Awesome work Rockstar04! I have a comment and a related question.

Comment: Compiling Poppler with the -DENABLE_LIBOPENJPEG=none flag (line 56 of the script) produced poor results when converting PDFs with many layers of images. We found that some layers were missing.

cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr -DENABLE_XPDF_HEADERS=ON -DENABLE_LIBOPENJPEG=none

To compile without the -DENABLE_LIBOPENJPEG flag, I had to install the following dependencies:

apt-get install libgirepository1.0-dev
apt-get install libopenjp2-7-dev
apt-get install libgtk-3-dev
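
The change described above can be sketched as a small shell script. This is only a sketch, assuming a Debian or Ubuntu host and an already unpacked Poppler source tree. The function names `poppler_cmake_args` and `install_poppler_deps` are made up for illustration and are not part of any script posted in this thread. The configure flags are the ones quoted above, minus -DENABLE_LIBOPENJPEG=none.

```shell
#!/bin/sh
# Sketch only: poppler_cmake_args and install_poppler_deps are illustrative
# names, not part of the build script discussed in this thread.

# Print the Poppler configure flags quoted in this thread, one per line,
# leaving out -DENABLE_LIBOPENJPEG=none so the option keeps its default.
poppler_cmake_args() {
    printf '%s\n' \
        "-DCMAKE_BUILD_TYPE=Release" \
        "-DCMAKE_INSTALL_PREFIX=/usr" \
        "-DENABLE_XPDF_HEADERS=ON"
}

# Install the three extra packages listed above with a single apt-get call.
install_poppler_deps() {
    apt-get install -y libgirepository1.0-dev libopenjp2-7-dev libgtk-3-dev
}

# Usage (as root, from an empty build directory inside the Poppler source tree):
#   install_poppler_deps
#   cmake $(poppler_cmake_args) ..
```

Nothing runs until the functions are called, so the file can be sourced from a larger build script and the flag list inspected before cmake is invoked.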

Question: What is the purpose of the -DENABLE_LIBOPENJPEG flag?

Rockstar04 commented on June 19, 2024

I assume it disables Poppler's ability to import or export JPEGs? For the steps that compile FontForge and Poppler, see their official documentation for more information about building them from source.

brecke commented on June 19, 2024

Thanks a lot for the script. I'll try adapting it to run on Alpine Linux so I can use that as a Docker image. Have you tried it? There's this image but it's a bit outdated (Alpine 3.2).

brecke commented on June 19, 2024

...and here it is, in case you want it: https://hub.docker.com/r/oaeproject/oae-pdf2htmlex-docker/ It's using alpine:3.8

Source is here.

Feel free to use this as a base for someone else's work and/or include it in the wiki. Thanks a lot for your hard work!

amit777 commented on June 19, 2024

Just an FYI: it looks like a lot of security updates have been made to Poppler since 0.63 (https://poppler.freedesktop.org/releases.html). I'm going to try getting this working on CentOS 7, building from source. I'll post the steps if I get it working.

brecke commented on June 19, 2024

@amit777 feel free to fork my repo (alpine based) above and please report back your findings

amit777 commented on June 19, 2024

I was able to get it compiled with Poppler 0.63 on CentOS 7, though I had to make some updates to the .sh script. I was a little too ambitious and tried to get it working with Poppler 0.72, which (I think) required me to upgrade the C++ environment to GCC 7. I made it further, but then got other compilation errors. I'm nowhere near knowledgeable enough to solve them.

Anyway, I'm seeing a bunch of forks and I'm not sure which one is the best to work off:
pdf2htmlEX/pdf2htmlEX
Rockstar
Alpine (I don't know what that is).

brecke commented on June 19, 2024

@amit777 Alpine is a very lightweight Linux distro, which makes it a common base for Docker images.

The Docker image I published above is built on Alpine and compiles pdf2htmlEX successfully. You can check the versions in the Dockerfile.

amit777 commented on June 19, 2024

@brecke, I've never really worked with Docker before, but I started with yours and got it working pretty well. I noticed a small difference in library versions between your Docker dev build and my CentOS 7 build. Below is the --version output of each; libfontforge and cairo are the versions that differ.

--version output on the CentOS 7 build:

pdf2htmlEX version 0.15.0
Copyright 2012-2015 Lu Wang <[email protected]> and other contributors
Libraries: 
  poppler 0.63.0
  libfontforge 20190219
  cairo 1.15.12
Default data-dir: /usr/local/share/pdf2htmlEX
Supported image format: png jpg svg

--version output on your docker build:

pdf2htmlEX version 0.15.0
Copyright 2012-2015 Lu Wang <[email protected]> and other contributors
Libraries: 
  poppler 0.63.0
  libfontforge 20181011
  cairo 1.14.8
Default data-dir: /build/usr/share/pdf2htmlEX
Supported image format: png jpg svg

And on my mac (using brew install pdf2htmlEX):

pdf2htmlEX version 0.14.6
Copyright 2012-2015 Lu Wang <[email protected]> and other contributors
Libraries: 
  poppler 0.57.0
  libfontforge 20180321
  cairo 1.16.0
Default data-dir: /usr/local/Cellar/pdf2htmlex/0.14.6_20/share/pdf2htmlEX
Supported image format: png jpg svg
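
A comparison like the one above can be automated with a short shell helper. This is a sketch under the assumption that the --version output keeps the layout shown in this thread (a "Libraries:" line followed by indented "name version" entries). `lib_versions` is a hypothetical helper name, not something pdf2htmlEX provides, and the file names in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch only: lib_versions is a hypothetical helper, not part of pdf2htmlEX.

# Read `pdf2htmlEX --version` text on stdin and print "name version" for each
# indented entry that follows the "Libraries:" line.
lib_versions() {
    awk '
        /^Libraries:/        { in_libs = 1; next }    # start of the library list
        in_libs && /^[ \t]+/ { print $1, $2; next }   # indented "name version" entry
                             { in_libs = 0 }          # any other line ends the list
    '
}

# Usage: capture the list on each machine, then diff the two files.
# 2>&1 is there in case the tool writes its version text to stderr.
#   pdf2htmlEX --version 2>&1 | lib_versions > centos7.txt
#   (run the same command inside the container to produce docker.txt)
#   diff centos7.txt docker.txt
```

For the CentOS 7 and Docker outputs above, the diff would flag only the libfontforge and cairo lines, matching the differences noted by hand.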

brecke commented on June 19, 2024

Yeah, well, I didn't really care about having the most recent versions of all dependencies; once I got one combination working, that was good enough 🤷🏻‍♂️ Feel free to try different versions, though, and fork at will.
