Documentation: Should the OCR-D processor run on RGB or binarized images? (eynollah, closed)

mikegerber commented on June 8, 2024

Documentation: Should the OCR-D processor run on RGB or binarized images?

Comments (14)

cneud commented on June 8, 2024

Ok, how about this then?

Use as OCR-D processor
Eynollah ships with a CLI for use as an OCR-D processor. In this case, the source image file group with (preferably) RGB images should be used as input, like this:
ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models
In fact, the image referenced by @imageFilename in the PAGE-XML is passed on directly to Eynollah as a processor, so that e.g.
ocrd-eynollah-segment -I OCR-D-IMG-BIN -O SEG-LINE -P models
will still use the original (RGB) image despite any binarization that may have occurred in previous OCR-D processing steps.

kba commented on June 8, 2024

Should the OCR-D processor run on a RGB

Yes, it should be run on the RGB image. Conventionally we call that file group OCR-D-IMG; in DFG-Viewer conventions, the best match would be MAX, IIRC.

vahidrezanezhad commented on June 8, 2024

If eynollah is used as a layout segmenter, I would say RGB is preferred.

mikegerber commented on June 8, 2024

So a valid workflow would be

ocrd-eynollah-segment -I OCR-D-IMG -O OCR-D-SEG-LINE -P xyz abc
ocrd-some-binarization -I OCR-D-SEG-LINE -O OCR-D-IMG-BIN
ocrd-some-ocr -I OCR-D-IMG-BIN -O OCR-D-OCR

Correct?

mikegerber commented on June 8, 2024

If eynollah is used as a layout segmenter, I would say RGB is preferred.

The question was about the OCR-D processor. This is relevant because, depending on the code, a run with -I OCR-D-IMG-BIN (= binarized image group) could still use the RGB image by retrieving an AlternativeImage. That's why I find it crucial that the README provides an example of correct usage, with the right input file group.

mikegerber commented on June 8, 2024

(If RGB is preferred, the processor could also issue a warning if binarized (single-channel) input is provided)
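
A minimal sketch of such a check, assuming the processor can inspect a Pillow-style mode string for the input image ("1" for bilevel, "L" for grayscale, "RGB" for color). The helper name `warn_if_not_color` and the logger name are hypothetical and not part of eynollah:

```python
import logging

# Pillow-style mode strings for images without color channels:
# "1" = bilevel (binarized), "L" = 8-bit grayscale.
SINGLE_CHANNEL_MODES = {"1", "L"}


def warn_if_not_color(mode: str, logger: logging.Logger) -> bool:
    """Return True if `mode` looks like color input; log a warning otherwise.

    Hypothetical helper for illustration, not part of eynollah.
    """
    if mode in SINGLE_CHANNEL_MODES:
        logger.warning(
            "Input image has mode %r (single-channel); RGB input is preferred.",
            mode,
        )
        return False
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    log = logging.getLogger("eynollah-sketch")  # hypothetical logger name
    warn_if_not_color("RGB", log)  # no warning
    warn_if_not_color("1", log)    # logs a warning
```

The check only warns and does not reject the input, which matches the "preferred" wording used in this thread.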

kba commented on June 8, 2024

So a valid workflow would be

ocrd-eynollah-segment -I OCR-D-IMG -O OCR-D-SEG-LINE -P xyz abc
ocrd-some-binarization -I OCR-D-SEG-LINE -O OCR-D-IMG-BIN
ocrd-some-ocr -I OCR-D-IMG-BIN -O OCR-D-OCR

Correct?

Yes, but so would be

[...]
ocrd-some-binarization -I OCR-D-IMG -O OCR-D-IMG-BIN
[...]

because ocrd-some-binarization should filter out binarized images, so it should not make a difference here: it will end up with the @imageFilename (if used on page level).

could still use the RGB image by retrieving a AlternativeImage.

We do not do that in eynollah, though: we pass on the @imageFilename directly. So as long as no processor running before eynollah changes the @imageFilename (which only ocrd-preprocess-image does, IIRC), it will use the RGB image anyway.
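
To make the difference concrete, here is a rough sketch using only the Python standard library. It assumes a PAGE-XML file (2019-07-15 schema) in which a page-level binarization step has added an AlternativeImage; the file names are made up. `original_image` mirrors reading @imageFilename directly as described above, while `derived_image` is a simplified model of feature-based selection and not OCR-D core's actual implementation. Both function names are hypothetical:

```python
import xml.etree.ElementTree as ET

# PAGE namespace of the 2019-07-15 schema version.
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"}

# Illustrative PAGE-XML after a page-level binarization step (made-up file names).
PAGE_XML = """\
<pc:PcGts xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <pc:Page imageFilename="OCR-D-IMG/page_0001.tif" imageWidth="2000" imageHeight="3000">
    <pc:AlternativeImage filename="OCR-D-IMG-BIN/page_0001.png" comments="binarized"/>
  </pc:Page>
</pc:PcGts>
"""


def original_image(root):
    """Read Page/@imageFilename directly, ignoring any AlternativeImage."""
    return root.find("pc:Page", NS).get("imageFilename")


def derived_image(root, feature_filter=""):
    """Simplified feature-based selection (an assumption, for illustration).

    Return the last AlternativeImage whose comma-separated `comments`
    contain none of the filtered features; fall back to @imageFilename.
    """
    page = root.find("pc:Page", NS)
    unwanted = {f for f in feature_filter.split(",") if f}
    for alt in reversed(page.findall("pc:AlternativeImage", NS)):
        features = set(alt.get("comments", "").split(","))
        if not (features & unwanted):
            return alt.get("filename")
    return page.get("imageFilename")


if __name__ == "__main__":
    root = ET.fromstring(PAGE_XML)
    print(original_image(root))              # direct @imageFilename lookup
    print(derived_image(root))               # latest derived image
    print(derived_image(root, "binarized"))  # binarized images filtered out
```

In this model, the direct lookup and the lookup that filters out binarized images both land on the original file, whichever file group the PAGE-XML was taken from.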

mikegerber commented on June 8, 2024

We do not do that in eynollah, though: we pass on the @imageFilename directly. So as long as no processor running before eynollah changes the @imageFilename (which only ocrd-preprocess-image does, IIRC), it will use the RGB image anyway.

Thanks, I believe that clears that part up: it's OK to run binarization before ocrd-eynollah-segment, because it ends up using the RGB image anyway.

mikegerber commented on June 8, 2024

Sorry if I happen to sound super pedantic here, it's just not easy to see what's happening when we apparently stick in OCR-D-IMG-BIN but the processor does not actually use the images from that group. It was a source of confusion in sbb-textline-detection too.

kba commented on June 8, 2024

Sorry if I happen to sound super pedantic here, it's just not easy to see what's happening when we apparently stick in OCR-D-IMG-BIN but the processor does not actually use the images from that group. It was a source of confusion in sbb-textline-detection too.

No need to be sorry, it is important to document this so users are clear on which data gets passed where. This behavior (directly using @imageFilename) is also different from all other processors (except sbb-textline-detection ;-)) but I think we can get away with it, because it's very unlikely that eynollah would ever need to be run on presegmented input.

cneud commented on June 8, 2024

I've added this to the documentation - would this be sufficient @mikegerber?

Use as OCR-D processor

Eynollah ships with a CLI for use as an OCR-D processor. In this case, the source image file group with (preferably) RGB images should be used as input (in fact, the image referenced by @imageFilename is passed on directly):

ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models

mikegerber commented on June 8, 2024

I've added this to the documentation - would this be sufficient @mikegerber?

Use as OCR-D processor
Eynollah ships with a CLI for use as an OCR-D processor. In this case, the source image file group with (preferably) RGB images should be used as input (in fact, the image referenced by @imageFilename is passed on directly):
ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models

It's a bit trickier: it's fine to put in -I OCR-D-IMG-BIN, but it will still use the RGB images from the step before.

mikegerber commented on June 8, 2024

Yes, this seems to be correct. I checked the source here:
https://github.com/qurator-spk/eynollah/blob/main/qurator/eynollah/processor.py#L45-L57

cneud commented on June 8, 2024

OK, I've amended this accordingly in 441c856 and will close here once the PR for the README update has been merged.
