Comments (3)
Hi Matthias,
It did the trick: the validation occurred as expected. The first two were considered valid and the last one invalid.
Regarding the 'openness' of the PDF standard, I know what you mean. I wrote a wrapper to enable multithreaded/multiserver batch OCR scanning (https://github.com/gchehab/ocr-server), and the many diverse ways an embedded image may be encoded almost drove me nuts. I am quite sure there are a lot of cases that I failed to address.
I just got my hands on another PDF that a colleague is working on, which raises a type error while being parsed during validation. I am not sure, however, whether the issue is actually an error in the PDF file's signature itself. I'll open a separate issue for this error.
Thank you for your quick support,
Guilherme
from pyhanko.
Hi, thanks for the report!
The first issue has to do with incremental update validation in pyHanko, and can be solved by whitelisting one more key. Someone else had the same issue; I'll do a bugfix release to correct that particular problem soon. In the meantime, you can work around the problem by defining your own diff policy.
For reference, here's the declaration of the default diff analysis policy:
pyHanko/pyhanko/sign/diff_analysis.py
Line 2146 in fd1af25
You can define a custom diff policy to get around this kind of thing, and pass it as the diff_policy parameter in various validation methods.
Start by copying the default rules, since you probably want to preserve most of those. The FormUpdatingRule class actually takes another parameter called ignored_acroform_keys, where you can pass in keys that are to be ignored in the "strict" part of the comparison analysis. Passing in {'/Fields', '/DA', '/DR'} should be OK.
Alternatively, you can cheat and add "/DA" to the list of keys in pyhanko.sign.diff_analysis.ACROFORM_EXEMPT_STRICT_COMPARISON at runtime. Then it will work automagically with the default diff policy.
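To make the idea of "keys ignored in the strict part of the comparison" concrete, here is a minimal, self-contained sketch. It is a simplified model and not pyHanko's actual implementation: the function name changed_keys_strict and the sample dictionaries are hypothetical, and real AcroForm entries are PDF objects rather than plain Python values.

```python
from typing import Any, Dict, FrozenSet, Set

# Sentinel so that "key absent" is distinguishable from "key present with value None".
_MISSING = object()


def changed_keys_strict(
    old: Dict[str, Any],
    new: Dict[str, Any],
    ignored_keys: FrozenSet[str],
) -> Set[str]:
    """Return the keys whose values differ between two dictionaries,
    skipping every key listed in ignored_keys (hypothetical helper)."""
    compared = (old.keys() | new.keys()) - ignored_keys
    return {k for k in compared if old.get(k, _MISSING) != new.get(k, _MISSING)}


# Illustrative AcroForm-like dictionaries before and after an incremental update.
before = {"/Fields": ["sig1"], "/DA": "/Helv 0 Tf 0 g", "/SigFlags": 3}
after = {"/Fields": ["sig1", "sig2"], "/DA": "/Helv 12 Tf 0 g", "/SigFlags": 3}

# Without "/DA" in the ignore set, the changed default appearance is flagged.
print(changed_keys_strict(before, after, frozenset({"/Fields", "/DR"})))

# With "/DA" added, the strict comparison no longer reports a difference.
print(changed_keys_strict(before, after, frozenset({"/Fields", "/DA", "/DR"})))
```

In pyHanko itself, the corresponding knob is the ignored_acroform_keys parameter of FormUpdatingRule mentioned above; the sketch only illustrates why adding "/DA" to that set makes the update pass.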
The second issue is indeed certvalidator-related. It tries to fetch some related certs from a CRL AIA record, but the server isn't setting the Content-Type header. That's a bit annoying, because there are multiple ways to encode certs in a situation like this, and ordinarily you'd rely on the Content-Type header to select the correct parser. I suspect that this particular fetch never worked in the first place, and my move from urllib to requests in my fork of certvalidator simply broke the error handling code. I'd guess that the certs that certvalidator was trying to fetch were actually already available in the local cache anyway.
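One possible way to cope with a missing Content-Type header is to sniff the payload itself. The sketch below is an assumption-laden illustration, not the code used in certvalidator: the function name sniff_cert_encoding is hypothetical, and it only distinguishes PEM-armoured data from raw DER (which starts with the ASN.1 SEQUENCE tag 0x30). It does not parse the resulting bytes or tell a bare certificate apart from a PKCS#7 bundle.

```python
import base64
import re
from typing import List, Tuple

# Matches PEM blocks such as "-----BEGIN CERTIFICATE----- ... -----END CERTIFICATE-----".
_PEM_BLOCK = re.compile(
    rb"-----BEGIN ([A-Z0-9 ]+)-----(.*?)-----END \1-----", re.DOTALL
)


def sniff_cert_encoding(body: bytes) -> Tuple[str, List[bytes]]:
    """Guess how a fetched payload is encoded when no Content-Type is given.

    Returns ("pem", [der, ...]) or ("der", [body]); raises ValueError otherwise.
    Hypothetical helper for illustration only.
    """
    blocks = _PEM_BLOCK.findall(body)
    if blocks:
        # b64decode skips the line breaks inside the PEM body by default.
        return "pem", [base64.b64decode(b64) for _label, b64 in blocks]
    if body[:1] == b"\x30":
        # DER-encoded certificates and PKCS#7 structures both begin with a SEQUENCE.
        return "der", [body]
    raise ValueError("payload is neither PEM nor DER")
```

The returned DER blobs would still have to be handed to a real ASN.1 parser, which is where the actual choice between certificate formats would be made.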
I'll look into this one more closely, and do a bugfix release if necessary. Thanks for bringing it to my attention!
Hi, can you reproduce the problem with the latest HEAD for pyHanko and pyhanko-certvalidator? Your file passes validation on my machine now (well, the last signer's certificate doesn't jibe with pyHanko's default key usage policy, but you can easily change that in the configuration file).
If this fixes your issue, then I'll do a bugfix release for pyhanko-certvalidator, update the dependency in pyHanko itself, and do a bugfix release here as well. :)
PS: If you face similar diff analysis issues in the future, please report them if you can. This aspect of PDF signature validation isn't standardised anywhere (yet), so I'm always on the lookout for interesting corner cases.