Comments (5)
Looks like calling resolve() on fields fixes the problem.
Replace fields = resolve(pdf.doc.catalog["AcroForm"])["Fields"]
with
fields = resolve(resolve(pdf.doc.catalog["AcroForm"])["Fields"])
and it looks like it works. I think we could modify the example code to do this.
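To illustrate why the second resolve() call can matter, here is a minimal toy sketch. It does not use pdfplumber itself: `Ref` and this `resolve` are hypothetical stand-ins, written on the assumption that the real resolve() dereferences only the object it is handed and returns anything else unchanged. Under that assumption, a PDF that stores both the AcroForm dictionary and its Fields entry as indirect references needs one call per level.

```python
class Ref:
    """Hypothetical stand-in for an indirect PDF object reference."""

    def __init__(self, target):
        self.target = target


def resolve(obj):
    """Toy resolve(): follow a single reference, or return obj unchanged."""
    return obj.target if isinstance(obj, Ref) else obj


# A catalog in which AcroForm AND its Fields entry are both stored indirectly.
catalog = {"AcroForm": Ref({"Fields": Ref(["field_a", "field_b"])})}

# One resolve() unwraps AcroForm, but Fields is still a reference.
inner = resolve(catalog["AcroForm"])["Fields"]

# The second resolve() unwraps Fields into an iterable list.
fields = resolve(inner)
for field in fields:
    print(field)

# A catalog that stores everything directly still works with the same code,
# because resolve() leaves non-reference objects alone.
direct_catalog = {"AcroForm": {"Fields": ["field_a"]}}
direct_fields = resolve(resolve(direct_catalog["AcroForm"])["Fields"])
```

In this model the extra call costs nothing when the value is already a plain object, which would be consistent with the original example working on some PDFs and failing on others.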
from pdfplumber.
Thanks @jeremybmerrill for the solution, and @ibecav for flagging. I've now updated the example code in the README.
Great! I'm by no means an expert either (all standards-compliant PDFs are alike, but all weird PDFs are weird in their own unique way), but I do know that calling resolve() at every opportunity seems to make problems disappear.
Thank you. I'll try this fix in a little bit. As for changing the example, I'll leave that to your discretion. I'm by no means an expert, but my understanding is that PDFs can be fickle and, as I noted, your example does work on some PDFs as is.
Thank you, that does indeed seem to resolve the error.