
Comments (3)

Balearica commented on May 25, 2024

I replicated the .jpg result using the provided file, but was unable to replicate the .png result, so if you think this represents a distinct case please upload a sample image.

Regarding the .jpg image, it sounds like the core issue here is that there's a disconnect between the JavaScript exception thrown (and handled by errorHandler) and the messages printed to stderr, with the latter being more informative. While I would agree this is not ideal, I do not think changing this would be feasible.

Of the messages listed, the only one that is created within this repo (and the only one that is a JavaScript exception) is "Error attempting to read image." As can be seen in the code that throws this error (below), the only information we have to go on when creating this exception is an integer return code of 1 indicating the image was not read correctly, so the error message is as informative as it can be given that information.

if (res === 1) throw Error('Error attempting to read image.');

The other messages listed are printed to stderr by dependencies, and are not created by the code in this repo. For example, the "Invalid SOS parameters" message is printed by libjpeg and the "Error in pixReadStreamJpeg" errors are printed by Leptonica.

We cannot change the fact that these dependencies send these messages to stderr, as we do not edit dependencies. Furthermore, we cannot send all stderr text to errorHandler, as errorHandler is for JavaScript exceptions, and not all messages printed to stderr will result in an exception (and vice versa). While many Tesseract.js exceptions are accompanied by meaningful messages printed to stderr by a dependency, this cannot be assumed as a rule. With these limitations in mind, I think that throwing the "Error attempting to read image." exception and having the stderr messages print to the console is reasonable behavior.
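
To illustrate the limitation described above, here is a minimal self-contained sketch. Every name in it (fakeNativeReadImage, stderrLog, readImage) and the "warning: unusual header" text are hypothetical stand-ins for the boundary between JavaScript and a compiled dependency; this is not the actual Tesseract.js code. It only shows that the diagnostic text goes to a separate stream, while the JavaScript side sees nothing but an integer return code.

```javascript
// Hypothetical stand-in for stderr: the dependency writes here on its own.
const stderrLog = [];

// Hypothetical stand-in for a compiled image reader.
// It prints its own diagnostics and reports failure only as an integer code.
function fakeNativeReadImage(bytes) {
  if (bytes.length === 0) {
    stderrLog.push('Invalid SOS parameters'); // detail the JS caller never receives
    return 1; // non-zero: the image was not read
  }
  if (bytes[0] === 0) {
    stderrLog.push('warning: unusual header'); // stderr output without any failure
  }
  return 0;
}

// JS wrapper: the return code is the only input available for the exception.
function readImage(bytes) {
  const res = fakeNativeReadImage(bytes);
  if (res === 1) throw Error('Error attempting to read image.');
  return 'ok';
}

try {
  readImage(new Uint8Array(0));
} catch (err) {
  console.log('exception:', err.message);
  console.log('stderr:', stderrLog.join('; '));
}
```

The second branch in fakeNativeReadImage is the reason stderr text cannot simply be forwarded to errorHandler: in this sketch a message can be printed even when the call succeeds and no exception is thrown.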

from tesseract.js.

didiercolens commented on May 25, 2024

Thanks for looking into this. By the way, I'm running this with a scheduler and 6 workers under a 4 GB RAM limit. After a while, the container is killed by the kernel because it consumes too much RAM. I have not investigated, but it looks like a leak. I'll see if I can create a repro for you when I have time!

Balearica commented on May 25, 2024

@didiercolens Okay, sounds good. I opened a new Git Issue (#900) to describe worker memory increasing due to large images, which is one cause of worker memory usage increasing over time. This may or may not be related to what you are experiencing. I am not currently aware of any leaks; however, there have been memory leaks in past versions, so it is possible.
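
For anyone trying to build a repro for this kind of memory growth, a small Node.js sketch along these lines may help. It relies only on process.memoryUsage() from Node; the helper names (rssMegabytes, isNearLimit, logMemory) and the 80% threshold are assumptions made for illustration and are not part of Tesseract.js. It reports the resident set size of the current Node.js process, so it is a rough trend indicator and not a leak detector.

```javascript
// Resident set size of the current Node.js process, in megabytes.
function rssMegabytes() {
  return process.memoryUsage().rss / (1024 * 1024);
}

// Hypothetical helper: true once usage reaches a fraction of the container limit.
function isNearLimit(usedMb, limitMb, fraction = 0.8) {
  return usedMb >= limitMb * fraction;
}

const LIMIT_MB = 4096; // the 4 GB container limit mentioned above

// Call after each batch of recognition jobs to record the trend over time.
function logMemory(label) {
  const used = rssMegabytes();
  console.log(`${label}: rss=${used.toFixed(1)} MB, nearLimit=${isNearLimit(used, LIMIT_MB)}`);
  return used;
}

logMemory('startup');
```

Logging the value after every batch, together with the dimensions of the images in that batch, would help separate steady growth from the spikes caused by large images.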
