Comments (3)
Thank you for supplying fuzzing tests. Unfortunately, I currently don't have much time to look at all of these, so the fix for the encryption tests is delayed until I have more time to dig into them. Since there's no warranty, you're also welcome to supply your own patches.
from podofo.
This looks like an object ownership issue: there are m_Encrypt members in different classes, some of which are copies of each other.
- PdfParser::ReadObjects creates PdfParser::m_Encrypt
- PdfParser::ReadObjectsInternal calls PdfParserObject::SetEncrypt with PdfParser::m_Encrypt.get() which sets PdfParserObject::m_Encrypt to use in delay loading
- PdfMemDocument takes ownership of PdfParser::m_Encrypt via TakeEncrypt (but this doesn't affect PdfParserObject::m_Encrypt)
- PdfMemDocument::SetEncrypt assigns to m_Encrypt, which destroys the PdfEncrypt object originally owned by PdfParser::m_Encrypt
- Subsequent delay loads via PdfParserObject dereference PdfParserObject::m_Encrypt, which now points to the deleted object
I think the issue is that the object owned by the m_Encrypt unique_ptrs is actually shared with PdfParserObject, which a unique_ptr cannot express.
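The sequence above can be modelled in a few lines. This is only a sketch with hypothetical stand-in types (Encrypt, Parser, ParserObject, Document, and the destroyed flag are made up for illustration and are not the real PoDoFo classes or signatures). It shows that the original object is destroyed while the parser object still holds its raw pointer; the stale pointer is deliberately never dereferenced.

```cpp
#include <memory>
#include <utility>

namespace hazard {

// Hypothetical stand-in for PdfEncrypt; reports its own destruction via a flag.
struct Encrypt {
    explicit Encrypt(bool* destroyed = nullptr) : destroyed_(destroyed) {}
    ~Encrypt() { if (destroyed_) *destroyed_ = true; }
    bool* destroyed_;
};

// Stand-in for PdfParserObject: keeps a non-owning pointer for delayed loading.
struct ParserObject {
    Encrypt* encrypt = nullptr;
};

// Stand-in for PdfParser: creates the Encrypt and hands a raw pointer to its objects.
struct Parser {
    std::unique_ptr<Encrypt> encrypt;
    ParserObject object;

    void ReadObjects(bool* destroyedFlag) {
        encrypt = std::make_unique<Encrypt>(destroyedFlag);
        object.encrypt = encrypt.get();  // raw copy, not updated by later ownership changes
    }
    std::unique_ptr<Encrypt> TakeEncrypt() { return std::move(encrypt); }
};

// Stand-in for PdfMemDocument.
struct Document {
    std::unique_ptr<Encrypt> encrypt;
    void SetEncrypt(std::unique_ptr<Encrypt> e) { encrypt = std::move(e); }
};

// Replays the steps from the list above and reports whether the original
// Encrypt was destroyed while ParserObject still holds a pointer to it.
inline bool originalDestroyedAfterSetEncrypt() {
    bool destroyed = false;
    Parser parser;
    parser.ReadObjects(&destroyed);
    Document doc;
    doc.encrypt = parser.TakeEncrypt();           // ownership moves to the document
    doc.SetEncrypt(std::make_unique<Encrypt>());  // assignment destroys the original
    // parser.object.encrypt is now dangling; dereferencing it would be undefined behavior.
    return destroyed;
}

}  // namespace hazard
```

In this model, originalDestroyedAfterSetEncrypt() returns true, which corresponds to the state in which a later delayed load would read freed memory.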
Possible fixes:
- replace unique_ptr and raw m_Encrypt pointers with std::shared_ptr
- throw an exception if SetEncryption is called on an encrypted delay loaded document
- disable delay loading if a document is encrypted
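As a rough illustration of the first option, here is a minimal sketch of shared ownership. The types are again hypothetical stand-ins, not the PoDoFo API, and the real change would touch more call sites than shown here. It assumes every holder of the encryption object keeps its own std::shared_ptr copy.

```cpp
#include <memory>
#include <utility>

namespace shared_fix {

// Hypothetical stand-in for PdfEncrypt; reports its own destruction via a flag.
struct Encrypt {
    explicit Encrypt(bool* destroyed = nullptr) : destroyed_(destroyed) {}
    ~Encrypt() { if (destroyed_) *destroyed_ = true; }
    bool* destroyed_;
};

// Stand-in for PdfParserObject: now shares ownership instead of holding a raw pointer.
struct ParserObject {
    std::shared_ptr<Encrypt> encrypt;
};

// Stand-in for PdfParser.
struct Parser {
    std::shared_ptr<Encrypt> encrypt;
    ParserObject object;

    void ReadObjects(bool* destroyedFlag) {
        encrypt = std::make_shared<Encrypt>(destroyedFlag);
        object.encrypt = encrypt;  // second owner
    }
    std::shared_ptr<Encrypt> TakeEncrypt() { return std::move(encrypt); }
};

// Stand-in for PdfMemDocument.
struct Document {
    std::shared_ptr<Encrypt> encrypt;
    void SetEncrypt(std::shared_ptr<Encrypt> e) { encrypt = std::move(e); }
};

// Replays the same steps; the original Encrypt must survive because
// ParserObject still owns a reference to it.
inline bool originalSurvivesSetEncrypt() {
    bool destroyed = false;
    Parser parser;
    parser.ReadObjects(&destroyed);
    Document doc;
    doc.encrypt = parser.TakeEncrypt();
    doc.SetEncrypt(std::make_shared<Encrypt>());  // document drops only its own reference
    return !destroyed && parser.object.encrypt.use_count() == 1;
}

}  // namespace shared_fix
```

One consequence of this design is that delay-loaded objects keep decrypting with the original settings even after the document has been given new ones, which is presumably what delayed loading needs.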
The first option (std::shared_ptr) seems to have the lowest impact, except that it could increase the size of PdfParserObject, and there's one of these for every object in the file. It currently looks like this:
InputStreamDevice* m_device;
PdfEncrypt* m_Encrypt;
bool m_IsTrailer;
// 3 or 7 bytes of alignment padding inserted by compiler here
size_t m_Offset;
bool m_HasStream;
// 3 or 7 bytes of alignment padding inserted by compiler here
size_t m_StreamOffset;
At the moment PdfParserObject has 6 bytes (32-bit) or 14 bytes (64-bit) of wasted space due to alignment padding. If the members were re-ordered to eliminate most of the alignment padding and m_Encrypt converted to a shared_ptr the change should be neutral in terms of PdfParserObject size:
std::shared_ptr<PdfEncrypt> m_Encrypt; // typically 2 * sizeof(void*) in most compiler implementations
InputStreamDevice* m_device;
size_t m_Offset;
size_t m_StreamOffset;
bool m_IsTrailer;
bool m_HasStream;
// 2 bytes or 6 bytes of alignment padding here
32-bit - no change - 6 bytes of alignment padding reduced to 2 bytes and extra pointer in shared_ptr uses 4 bytes
64-bit - no change - 14 bytes of alignment padding reduced to 6 bytes and extra pointer in shared_ptr uses 8 bytes
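The size arithmetic can be checked with sizeof. The sketch below mirrors only the members listed above in two hypothetical structs (LayoutBefore and LayoutAfter are illustrative names; void* and std::shared_ptr<int> stand in for the real pointer types, and any base-class or other members of the real PdfParserObject are ignored). It also assumes the common case where std::shared_ptr occupies two pointers, which the standard does not guarantee.

```cpp
#include <cstddef>
#include <memory>

// Member order as currently described: each bool is followed by padding.
struct LayoutBefore {
    void* m_device;
    void* m_Encrypt;
    bool m_IsTrailer;
    std::size_t m_Offset;
    bool m_HasStream;
    std::size_t m_StreamOffset;
};

// Proposed order: shared_ptr first, both bools grouped at the end.
struct LayoutAfter {
    std::shared_ptr<int> m_Encrypt;  // stand-in for std::shared_ptr<PdfEncrypt>
    void* m_device;
    std::size_t m_Offset;
    std::size_t m_StreamOffset;
    bool m_IsTrailer;
    bool m_HasStream;
};

// Padding = total size minus the sum of the member sizes.
constexpr std::size_t kPaddingBefore =
    sizeof(LayoutBefore) - (2 * sizeof(void*) + 2 * sizeof(std::size_t) + 2 * sizeof(bool));

constexpr std::size_t kPaddingAfter =
    sizeof(LayoutAfter) - (sizeof(std::shared_ptr<int>) + sizeof(void*) +
                           2 * sizeof(std::size_t) + 2 * sizeof(bool));
```

Comparing sizeof(LayoutBefore) with sizeof(LayoutAfter), and kPaddingBefore with kPaddingAfter, on the target compiler would confirm or refute the estimate above for that platform.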
Hello Mark! I like the suggestion to move to std::shared_ptr. Patch welcome, as I'm overwhelmed by tasks.