Comments (11)
So after much hacking around,
I think I got your branch working with Poppler 0.74 + Cairo 1.16:
bash-4.4# pdf2htmlEX --version
pdf2htmlEX version 0.15.0
Copyright 2012-2015 Lu Wang <[email protected]> and other contributors
Libraries:
poppler 0.74.0
libfontforge 20190222
cairo 1.16.0
Default data-dir: /usr/local/share/pdf2htmlEX
Supported image format: png jpg svg
I'm trying to track down a subtle issue with a transformation matrix.
In the older version of pdf2htmlEX, installed on a Mac via brew install, we get this (which works):
.m0{transform:none;-ms-transform:none;-webkit-transform:none;}
In the newer version from your branch we get the following transform:
.m0{transform:matrix(0.000000,0.000000,0.000000,0.000000,0,0);-ms-transform:matrix(0.000000,0.000000,0.000000,0.000000,0,0);-webkit-transform:matrix(0.000000,0.000000,0.000000,0.000000,0,0);}
It's something in the code from src/StateManager.h:
class TransformMatrixManager : public StateManager<Matrix, TransformMatrixManager>
...
This is where my knowledge ends. Any idea why the newer version of the code would generate a different matrix like that? It's causing a div to be invisible and a surrounding link to not be clickable.
I can privately share a copy of the PDF that is causing the issue if it's helpful.
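For anyone wondering why the all-zero matrix hides the div: a CSS matrix(a,b,c,d,e,f) whose determinant a*d - b*c is zero is non-invertible, so the browser collapses the element to nothing and it is not displayed (and, in practice, not clickable). Here is a minimal Python sketch of that check. It is not pdf2htmlEX's C++ code; the names css_transform and is_degenerate are hypothetical, and the identity-to-"none" mapping is only an assumption based on the older output shown above.

```python
from typing import Sequence


def css_transform(m: Sequence[float], eps: float = 1e-9) -> str:
    """Format a 2D affine matrix [a, b, c, d, e, f] as a CSS transform value.

    Assumption: an identity matrix is written as "none", as the older
    output in this thread suggests. This is not the real pdf2htmlEX code.
    """
    a, b, c, d, e, f = m
    identity = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)
    if all(abs(x - y) < eps for x, y in zip(m, identity)):
        return "none"
    return f"matrix({a:.6f},{b:.6f},{c:.6f},{d:.6f},{e:g},{f:g})"


def is_degenerate(m: Sequence[float], eps: float = 1e-9) -> bool:
    """Return True when the linear part has (near-)zero determinant.

    Such a matrix is non-invertible, so an element transformed by it
    has no visible area.
    """
    a, b, c, d, _e, _f = m
    return abs(a * d - b * c) < eps


if __name__ == "__main__":
    for matrix in ([1, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 0]):
        print(css_transform(matrix), "degenerate:", is_degenerate(matrix))
```

A check like is_degenerate over the generated .m* rules could help spot which pages of a converted PDF are affected, without needing to understand the converter internals.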
from pdf2htmlex.
Also, when I got my version working, I cloned from Rockstar04/pdf2htmlEX rather than pdf2htmlEX/pdf2htmlEX. Did I use the correct one?
So after much more digging, this transform issue I'm seeing may be due to this bug that was recently filed:
https://gitlab.freedesktop.org/poppler/poppler/issues/721
I will keep an eye out for a fix.
Thanks for the work you've done here @amit777 !
I'm not sure if you can open a pull request with your changes, but it would help greatly to review what you have changed, and anything else that may need to be done before merging.
@Rockstar04 thank you, but I just wanted to clarify: do you think I should be using the pdf2htmlEX/pdf2htmlEX branch or the Rockstar04/pdf2htmlEX branch? Or should I just start from coolwanglu's branch?
Hopefully pdf2htmlEX/pdf2htmlEX and Rockstar04/pdf2htmlEX are not very different, but I would definitely suggest you branch from the "official" pdf2htmlEX/pdf2htmlEX repo.
coolwanglu's repository has been archived and should not be used.
Gotcha. Also, I noticed a knowledgeable contributor in the forums named David Hedley who has a bunch of fixes in his branch: https://github.com/davidhedley/pdf2htmlEX
I'm going to spend a little time to see if they should be merged. The problem is I have very little idea how the internals of all this stuff are supposed to work!
I don't know much about the internal workings at this point either, and I know very little about C++.
So it's strange. I'm doing a manual diff between davidhedley's fork and this fork, and most of his changes are already incorporated here somehow. @Rockstar04 did you start this fork?
I started the fork pdf2htmlEX/pdf2htmlEX to allow continued development through contributions by other developers. The repository Rockstar04/pdf2htmlEX is my aggregation of every patch I could find to build pdf2htmlEX with more modern dependencies for my own use.
This issue has been fixed multiple times since the summer of 2019.
As of today, pdf2htmlEX version 0.18.8.rc1 uses Poppler version 0.89.0 (which is the latest released version of Poppler).