Related Issues (20)
- partition_pdf: no text orientation detection?
- feat/chunking_by_title_tokens
- Partition DOC, DOCX, PPT, PPTX: No images detected
- 403 response when trying to get filing from EDGAR
- bug/Partition Pdf throwing key error
- Lightweight installation unstructured[pdf] ?????
- bug: chunking doesn't work on TSV
- Import is very time-consuming: from unstructured.partition.pdf import partition_pdf
- Following dependencies are missing: pikepdf. Please install them using `pip install pikepdf`.
- bug: `starting_page_number` parameter is missing from `partition_image()`
- @georearl A `Table` element is never combined with any other element (even another `Table`) to form a chunk. So where a `Table` appears in the element stream, the prior chunk will close, the table chunk will appear, and a new chunk will start after the table.
- feat/ Param to control the behavior of chunking when encountering Table
- feat/ocr_layer_to_pdf
- 'ElementMetadata' object has no attribute 'known_fields' - Chunking Error
- A bug when parsing pdf!!!
- bug/some tables in PDF not getting recognized
- Problems when parsing Chinese PDF documents
- feat/when using requests, may need kwargs to support requests-specific params like verify
- `partition_msg` is unable to process attachments
- `contains_english_word` is no longer used and is safe to remove