Comments (3)
@iliasmansouri - Thanks for this issue. That looks like outdated documentation. We'll get that fixed as soon as we can.
@MthwRobinson: Thanks for the quick response. In the meantime, can you point me to how to use the PDF-related modules?
@iliasmansouri - Yep, check out these links. You'll need unstructured[local-inference] from the quick start docs to use the PDF capabilities. Feel free to drop us a line in our community Slack if you run into any issues.
- partition_pdf: https://unstructured-io.github.io/unstructured/bricks.html#partition-pdf
- quick start: https://unstructured-io.github.io/unstructured/installing.html#quick-start
- Slack: https://join.slack.com/t/unstructuredw-kbe4326/shared_invite/zt-1nlh1ot5d-dfY7zCRlhFboZrIWLA4Qgw
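
For readers landing here, a minimal sketch of what the linked partition_pdf docs describe may help. It assumes the package was installed with pip install "unstructured[local-inference]" as mentioned above, and that partition_pdf is importable from unstructured.partition.pdf and accepts a filename argument; the module path may differ between releases, so check the linked docs for your version. The helper names load_pdf_elements and count_element_types are hypothetical and not part of the library.

```python
from collections import Counter


def load_pdf_elements(pdf_path):
    """Partition a PDF into document elements using unstructured."""
    # Imported lazily so this module still loads when the optional PDF extras
    # are missing. Assumed install: pip install "unstructured[local-inference]"
    # Assumed import path; verify it against the docs for your release.
    from unstructured.partition.pdf import partition_pdf

    return partition_pdf(filename=str(pdf_path))


def count_element_types(elements):
    """Hypothetical helper: tally the returned elements by class name."""
    return Counter(type(element).__name__ for element in elements)


# Example usage (requires a real PDF on disk and the PDF extras installed):
#   elements = load_pdf_elements("example.pdf")
#   print(count_element_types(elements))
#   for element in elements[:5]:
#       print(str(element))
```

Tallying the element types is a quick way to confirm that partitioning produced something sensible before inspecting individual elements.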