Comments (14)

cneud commented on June 8, 2024

Ok, how about this then?

Use as OCR-D processor
Eynollah ships with a CLI to be used as an OCR-D processor. In this case, the source image file group with (preferably) RGB images should be used as input, like this:
ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models
In fact, the image provided by @imageFilename in PAGE-XML is passed on directly to Eynollah as a processor, so that e.g.
ocrd-eynollah-segment -I OCR-D-IMG-BIN -O SEG-LINE -P models
will still use the original (RGB) image despite any binarization that may have occurred in previous OCR-D processing steps.
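
To make the distinction concrete, here is a minimal Python sketch (standard library only) of the two places a PAGE-XML file can point to an image: the `imageFilename` attribute on `Page`, and any `AlternativeImage` children added by earlier steps. The sample XML, the file paths, and the helper name `page_image_sources` are made up for illustration, and the namespace version is an assumption; this is not Eynollah's actual code.

```python
import xml.etree.ElementTree as ET

# PAGE-XML namespace (the 2019-07-15 schema version is assumed for this sketch).
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"}

# Hypothetical PAGE-XML as it might look after a binarization step.
SAMPLE = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="OCR-D-IMG/page_0001.tif" imageWidth="2000" imageHeight="3000">
    <AlternativeImage filename="OCR-D-IMG-BIN/page_0001.bin.png" comments="binarized"/>
  </Page>
</PcGts>
""".strip()


def page_image_sources(page_xml: str):
    """Return (original image path, list of page-level alternative image paths)."""
    root = ET.fromstring(page_xml)
    page = root.find("pc:Page", NS)
    # The original image is referenced by the imageFilename attribute.
    original = page.get("imageFilename")
    # Derived images (binarized, deskewed, ...) are listed as AlternativeImage.
    alternatives = [
        alt.get("filename") for alt in page.findall("pc:AlternativeImage", NS)
    ]
    return original, alternatives


original, alternatives = page_image_sources(SAMPLE)
print("original:", original)
print("alternatives:", alternatives)
```

A processor that reads only `imageFilename` gets the original image regardless of what the list of alternatives contains.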

from eynollah.

kba commented on June 8, 2024

Should the OCR-D processor run on a RGB

Yes, it should be run on the RGB image. Conventionally we call that file group OCR-D-IMG; in DFG-Viewer conventions, the best match would be MAX, IIRC.

vahidrezanezhad commented on June 8, 2024

If eynollah is used as a layout segmenter, I would say RGB is preferred.

mikegerber commented on June 8, 2024

So a valid workflow would be

ocrd-eynollah-segment -I OCR-D-IMG -O OCR-D-SEG-LINE -P xyz abc
ocrd-some-binarization -I OCR-D-SEG-LINE -O OCR-D-IMG-BIN
ocrd-some-ocr -I OCR-D-IMG-BIN -O OCR-D-OCR

Correct?

mikegerber commented on June 8, 2024

If eynollah is used as a layout segmenter, I would say RGB is preferred.

The question was about the OCR-D processor. This is relevant because, depending on the code, a run with -I OCR-D-IMG-BIN (= binarized image group) could still use the RGB image by retrieving an AlternativeImage. That's why I find it crucial that the README provides an example of correct usage, with the right input file group.

mikegerber commented on June 8, 2024

(If RGB is preferred, the processor could also issue a warning if binarized (single-channel) input is provided)
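
A rough sketch of what such a check could look like, in Python with the standard library only. The helper `warn_if_single_channel` is hypothetical and not part of Eynollah; it assumes the caller already knows the image mode as a Pillow-style mode string, where "1" is 1-bit bilevel and "L" is 8-bit grayscale.

```python
import warnings

# Pillow mode strings for single-channel images: "1" (bilevel), "L" (grayscale).
SINGLE_CHANNEL_MODES = {"1", "L"}


def warn_if_single_channel(mode: str, source: str) -> bool:
    """Warn when the image mode is single-channel.

    Returns True if a warning was issued, False otherwise.
    """
    if mode in SINGLE_CHANNEL_MODES:
        warnings.warn(
            f"{source}: single-channel input (mode {mode!r}); RGB is preferred",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


# Hypothetical file names, for illustration only.
warn_if_single_channel("1", "OCR-D-IMG-BIN/page_0001.bin.png")  # warns
warn_if_single_channel("RGB", "OCR-D-IMG/page_0001.tif")        # silent
```

With Pillow installed, the mode could come from `Image.open(path).mode`. Whether a warning or a hard error is the better choice is left open here.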

kba commented on June 8, 2024

So a valid workflow would be

ocrd-eynollah-segment -I OCR-D-IMG -O OCR-D-SEG-LINE -P xyz abc
ocrd-some-binarization -I OCR-D-SEG-LINE -O OCR-D-IMG-BIN
ocrd-some-ocr -I OCR-D-IMG-BIN -O OCR-D-OCR

Correct?

Yes but so would be

[...]
ocrd-some-binarization -I OCR-D-IMG -O OCR-D-IMG-BIN
[...]

because ocrd-some-binarization should filter out binarized images, so it makes no difference here: it will end up with the image from @imageFilename (if used on page level).
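
The filtering described here can be modeled roughly as follows. This is a simplified sketch, not OCR-D core's implementation: it assumes each derived image carries a comma-separated feature list (as in the `comments` attribute of `AlternativeImage`), prefers the most recently added image that has none of the filtered features, and otherwise falls back to the original `@imageFilename`. The function name `select_image` and the file paths are hypothetical.

```python
from typing import List, Tuple


def select_image(
    original: str,
    alternatives: List[Tuple[str, str]],
    feature_filter: str = "",
) -> str:
    """Pick the newest alternative image with none of the filtered features.

    alternatives: (filename, comma-separated features), in the order added.
    feature_filter: comma-separated features the caller does not want.
    """
    unwanted = {f for f in feature_filter.split(",") if f}
    # Walk from the most recently added image back to the oldest.
    for filename, features in reversed(alternatives):
        if unwanted.isdisjoint(features.split(",")):
            return filename
    # Nothing acceptable among the derived images: use the original.
    return original


alternatives = [("OCR-D-IMG-BIN/page_0001.bin.png", "binarized")]

# A binarizer that filters out "binarized" images falls back to the original.
print(select_image("OCR-D-IMG/page_0001.tif", alternatives, "binarized"))

# A consumer without a filter picks up the binarized image.
print(select_image("OCR-D-IMG/page_0001.tif", alternatives))
```

Under this model, a binarizer started on either file group reaches the same original image, which is why the choice of input group makes no difference for that step.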

could still use the RGB image by retrieving an AlternativeImage.

We do not do that in eynollah, though: we pass on the @imageFilename directly, so as long as no processor running before eynollah changes the @imageFilename (which only ocrd-preprocess-image does, IIRC), it will use the RGB image anyway.

mikegerber commented on June 8, 2024

We do not do that in eynollah, though: we pass on the @imageFilename directly, so as long as no processor running before eynollah changes the @imageFilename (which only ocrd-preprocess-image does, IIRC), it will use the RGB image anyway.

Thanks, I believe that clears that part up: it's OK to do binarization before ocrd-eynollah-segment because it ends up using the RGB image anyway.

mikegerber commented on June 8, 2024

Sorry if I happen to sound super pedantic here, it's just not easy to see what's happening when we apparently stick in OCR-D-IMG-BIN but the processor does not actually use the images from that group. It was a source of confusion in sbb-textline-detection too.

kba commented on June 8, 2024

Sorry if I happen to sound super pedantic here, it's just not easy to see what's happening when we apparently stick in OCR-D-IMG-BIN but the processor does not actually use the images from that group. It was a source of confusion in sbb-textline-detection too.

No need to be sorry, it is important to document this so users are clear on which data gets passed where. This behavior (directly using @imageFilename) is also different from all other processors (except sbb-textline-detection ;-)) but I think we can get away with it, because it's very unlikely that eynollah would ever need to be run on presegmented input.

cneud commented on June 8, 2024

I've added this to the documentation - would this be sufficient @mikegerber?

Use as OCR-D processor

Eynollah ships with a CLI to be used as an OCR-D processor. In this case, the source image file group with (preferably) RGB images should be used as input (in fact, the image provided by @imageFilename is passed on directly):

ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models

mikegerber commented on June 8, 2024

I've added this to the documentation - would this be sufficient @mikegerber?

Use as OCR-D processor
Eynollah ships with a CLI to be used as an OCR-D processor. In this case, the source image file group with (preferably) RGB images should be used as input (in fact, the image provided by @imageFilename is passed on directly):
ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models

It's a bit trickier: it's fine to put in -I OCR-D-IMG-BIN, but it will still use the RGB images from the step before.

mikegerber commented on June 8, 2024

Yes this seems to be correct. I checked the source here:
https://github.com/qurator-spk/eynollah/blob/main/qurator/eynollah/processor.py#L45-L57

cneud commented on June 8, 2024

OK I've amended this accordingly in 441c856 and will close here once the PR for the README update has been merged.
