Comments (14)
Ok, how about this then?
Use as OCR-D processor
Eynollah ships with a CLI interface to be used as OCR-D processor. In this case, the source image file group with (preferably) RGB images should be used as input like this:
ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models
In fact, the image provided by @imageFilename in PAGE-XML is passed on directly to Eynollah as a processor, so that e.g.
ocrd-eynollah-segment -I OCR-D-IMG-BIN -O SEG-LINE -P models
will still use the original (RGB) image despite any binarization that may have occurred in previous OCR-D processing steps.
from eynollah.
Should the OCR-D processor run on a RGB
Yes, it should be run on the RGB image. Conventionally we call that file group OCR-D-IMG; in DFG-Viewer conventions, the best match would be MAX, IIRC.
If eynollah is used as a layout segmenter, I would say RGB is preferred.
So a valid workflow would be
ocrd-eynollah-segment -I OCR-D-IMG -O OCR-D-SEG-LINE -P xyz abc
ocrd-some-binarization -I OCR-D-SEG-LINE -O OCR-D-IMG-BIN
ocrd-some-ocr -I OCR-D-IMG-BIN -O OCR-D-OCR
Correct?
If eynollah is used as a layout segmenter, I would say RGB is preferred.
The question was about the OCR-D processor. This is relevant because, depending on the code, a run with -I OCR-D-IMG-BIN (= binarized image group) could still use the RGB image by retrieving an AlternativeImage. That's why I find it crucial that the README provides an example of correct usage, with the right input file group.
(If RGB is preferred, the processor could also issue a warning if binarized (single-channel) input is provided)
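A minimal sketch of what such a warning could look like, assuming the processor knows the image mode as a PIL-style mode string ("1" for bilevel, "L" for grayscale, "RGB" for color). The function name and the set of modes treated as single-channel are hypothetical; this is not code from eynollah.

```python
import warnings

# Assumption: PIL-style mode strings. "1" is bilevel, "L" is 8-bit grayscale.
SINGLE_CHANNEL_MODES = {"1", "L"}


def check_input_mode(mode):
    """Warn if the input looks binarized or grayscale.

    Returns True when the mode is acceptable, False when a warning was issued.
    Hypothetical helper for illustration only.
    """
    if mode in SINGLE_CHANNEL_MODES:
        warnings.warn(
            f"single-channel input (mode {mode!r}); RGB input is preferred",
            UserWarning,
        )
        return False
    return True
```

The check only warns and does not abort, so a workflow that passes a binarized file group would still run.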
So a valid workflow would be
ocrd-eynollah-segment -I OCR-D-IMG -O OCR-D-SEG-LINE -P xyz abc
ocrd-some-binarization -I OCR-D-SEG-LINE -O OCR-D-IMG-BIN
ocrd-some-ocr -I OCR-D-IMG-BIN -O OCR-D-OCR
Correct?
Yes but so would be
[...]
ocrd-some-binarization -I OCR-D-IMG -O OCR-D-IMG-BIN
[...]
because ocrd-some-binarization should filter out binarized images, so it should not make a difference here: it will end up with the @imageFilename (if used on page-level).
could still use the RGB image by retrieving an AlternativeImage.
We do not do that in eynollah, though: we're passing on the @imageFilename directly, so as long as no processor running before eynollah changes the @imageFilename (which only ocrd-preprocess-image does, IIRC), it will use the RGB image anyway.
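To make the distinction concrete, here is a minimal sketch (not eynollah's actual code) that reads the Page element's @imageFilename from a PAGE-XML document and, separately, lists any AlternativeImage entries. The sample XML, its file names, and the helper function names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical PAGE-XML fragment; the file names are made up for illustration.
PAGE_XML = """\
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="OCR-D-IMG/page_0001.tif" imageWidth="2000" imageHeight="3000">
    <AlternativeImage filename="OCR-D-IMG-BIN/page_0001.png" comments="binarized"/>
  </Page>
</PcGts>
"""


def original_image(page_xml):
    """Return Page/@imageFilename, ignoring any AlternativeImage."""
    root = ET.fromstring(page_xml)
    page = root.find("{*}Page")  # "{*}" matches any namespace (Python 3.8+)
    return page.get("imageFilename")


def alternative_images(page_xml):
    """Return the file names of all page-level AlternativeImage entries."""
    root = ET.fromstring(page_xml)
    page = root.find("{*}Page")
    return [alt.get("filename") for alt in page.findall("{*}AlternativeImage")]


print(original_image(PAGE_XML))
print(alternative_images(PAGE_XML))
```

A processor that takes the first route sees the same image regardless of which file group is given with -I, as long as @imageFilename itself is unchanged.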
We do not do that in eynollah, though: we're passing on the @imageFilename directly, so as long as no processor running before eynollah changes the @imageFilename (which only ocrd-preprocess-image does, IIRC), it will use the RGB image anyway.
Thanks, I believe that clears that part up: it's OK to do binarization before ocrd-eynollah-segment because it ends up using the RGB image anyway.
Sorry if I happen to sound super pedantic here, it's just not easy to see what's happening when we apparently stick in OCR-D-IMG-BIN but the processor does not actually use the images from that group. It was a source of confusion in sbb-textline-detection too.
Sorry if I happen to sound super pedantic here, it's just not easy to see what's happening when we apparently stick in OCR-D-IMG-BIN but the processor does not actually use the images from that group. It was a source of confusion in sbb-textline-detection too.
No need to be sorry, it is important to document this so users are clear on which data gets passed where. This behavior (directly using @imageFilename) is also different from all other processors (except sbb-textline-detection ;-)), but I think we can get away with it, because it's very unlikely that eynollah would ever need to be run on presegmented input.
I've added this to the documentation - would this be sufficient @mikegerber?
Use as OCR-D processor
Eynollah ships with a CLI interface to be used as OCR-D processor. In this case, the source image file group with (preferably) RGB images should be used as input (in fact, the image provided by @imageFilename is passed on directly):
ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models
I've added this to the documentation - would this be sufficient @mikegerber?
Use as OCR-D processor
Eynollah ships with a CLI interface to be used as OCR-D processor. In this case, the source image file group with (preferably) RGB images should be used as input (in fact, the image provided by @imageFilename is passed on directly):
ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models
It's a bit trickier: it's fine to put in -I OCR-D-IMG-BIN, but it will still use the RGB images from the step before.
Yes this seems to be correct. I checked the source here:
https://github.com/qurator-spk/eynollah/blob/main/qurator/eynollah/processor.py#L45-L57
OK I've amended this accordingly in 441c856 and will close here once the PR for the README update has been merged.