Comments (2)
It makes sense. Thank you. I was able to isolate a new example, with pages I can share, that breaks the file after a single save. I'm marking this issue as closed and will open a new one in a minute, to separate the multiple-saves case from the broken-pages case.
from pymupdf.
If you repeatedly save a document with compression options, please be aware that each save first changes the document in memory, before it is written to disk.
What you seem to have detected is that running the same duplicate detection (garbage=4) several times in a row may eventually run into an error.
As it appears at the moment, the problem only happens under this eccentric programming behavior: why would anyone execute 4 saves in a row with the same options? It is difficult to assign priority to an error that only shows up under such circumstances.
It seems that you want to achieve the best compression when saving. I suggest using doc.ez_save(filename, garbage=4), one time only. This performs the best duplicate detection in conjunction with optimum compression.
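A minimal sketch of the single-save approach suggested above. The helper name `save_compressed_once` and the file names are hypothetical, chosen only for illustration; the one PyMuPDF call used is `Document.ez_save()` with `garbage=4`, as named in the comment.

```python
def save_compressed_once(doc, out_path):
    """Write `doc` to `out_path` with exactly one garbage-collecting save.

    `doc` is expected to be a PyMuPDF Document. This helper is a
    hypothetical wrapper for illustration, not part of PyMuPDF.
    """
    # ez_save() is save() with compression-oriented defaults;
    # garbage=4 also checks objects and streams for duplicates.
    doc.ez_save(out_path, garbage=4)
    return out_path


# Typical use (placeholder file names; needs PyMuPDF installed):
#
#   import pymupdf            # older releases are imported as `fitz`
#   doc = pymupdf.open("input.pdf")
#   save_compressed_once(doc, "output.pdf")   # one save, not four in a row
#   doc.close()
```

Because saving with compression options modifies the in-memory document first, calling the helper once per opened document keeps each garbage pass working on a freshly loaded file.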
Please try to reproduce the problem when using a reasonable programming style.