Comments (8)

Balearica commented on May 25, 2024 4

Does unpkg support langPath?

Here is a working example that uses unpkg for all 3 resources: corePath/langPath/workerPath.

  const lang = 'eng';
  const langPath = `https://unpkg.com/@tesseract.js-data/${lang}/4.0.0_best_int`;

  // A worker is created once and used every time a user uploads a new file.
  const worker = await Tesseract.createWorker(lang, 1, {
    corePath: 'https://unpkg.com/tesseract.js-core@v5',
    workerPath: 'https://unpkg.com/tesseract.js@v5/dist/worker.min.js',
    langPath: langPath,
    logger: function (m) { console.log(m); }
  });

This loads the LSTM-only data, so it only works with oem set to 1 (the default). To use the Legacy model, replace 4.0.0_best_int with 4.0.0. That data is significantly larger, so only switch if you are actually using the Legacy model.
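
The choice between the two data directories can be wrapped in a small helper. This is only a sketch: langPathFor and its useLegacy flag are hypothetical names, not part of Tesseract.js. The URL pattern and the two directory names are the ones used in the example above.

```javascript
// Hypothetical helper (not part of Tesseract.js): builds the unpkg langPath
// for a language, choosing the data directory by model type.
function langPathFor(lang, useLegacy = false) {
  // '4.0.0_best_int' holds the LSTM-only data; '4.0.0' is the larger data
  // needed for the Legacy model.
  const dataDir = useLegacy ? '4.0.0' : '4.0.0_best_int';
  return `https://unpkg.com/@tesseract.js-data/${lang}/${dataDir}`;
}

console.log(langPathFor('eng'));       // LSTM-only data
console.log(langPathFor('eng', true)); // data for the Legacy model
```

The returned string would be passed as the langPath option when creating the worker.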

from tesseract.js.

ivysrono commented on May 25, 2024 1

Users may run these files in a browser without having a site of their own, for example in a userscript: https://greasyfork.org/scripts/482236/code

Balearica commented on May 25, 2024 1

If you do not want to host these files yourself, you can set workerPath/langPath/corePath to an alternative CDN. For example, you could try unpkg.

I agree that having a default CDN that does not support China is not ideal, and would be open to adding some fallback for the default CDN in the future. However, if individual developers want to be sure that mainland China is supported in their applications, I believe that the existing options in Tesseract.js do allow them to do that.
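
Until such a fallback exists in the library, an application could try several sources itself. The following is a minimal sketch, not a Tesseract.js feature: createWorkerWithFallback is a hypothetical helper, the cdn.example.com mirror is a placeholder, and the approach assumes that createWorker rejects when its resources cannot be loaded.

```javascript
// Hypothetical helper: tries each set of paths in order and returns the
// first worker that is created successfully.
// `createWorker` is passed in so the helper does not assume a global.
async function createWorkerWithFallback(createWorker, lang, pathOptions) {
  let lastError;
  for (const options of pathOptions) {
    try {
      return await createWorker(lang, 1, options);
    } catch (err) {
      lastError = err; // remember the failure and try the next source
    }
  }
  throw lastError ?? new Error('No path options were provided');
}

// Candidate sources in order of preference. The first entry uses unpkg URLs;
// the second is a placeholder for any mirror you control.
const pathOptions = [
  {
    workerPath: 'https://unpkg.com/tesseract.js@v5/dist/worker.min.js',
    corePath: 'https://unpkg.com/tesseract.js-core@v5',
    langPath: 'https://unpkg.com/@tesseract.js-data/eng/4.0.0_best_int',
  },
  {
    workerPath: 'https://cdn.example.com/tesseract/worker.min.js', // placeholder
    corePath: 'https://cdn.example.com/tesseract/core',            // placeholder
    langPath: 'https://cdn.example.com/tesseract/lang',            // placeholder
  },
];
```

In a page that has loaded Tesseract.js, this would be called as createWorkerWithFallback(Tesseract.createWorker, 'eng', pathOptions). If every source fails, the error from the last attempt is rethrown.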

ivysrono commented on May 25, 2024

Converting the *.traineddata.gz files to pure JavaScript could solve this issue.

*.js files can be saved and synced anywhere.

Balearica commented on May 25, 2024

@ivysrono By default, three types of files are loaded from the CDN in the browser (workerPath, langPath, corePath) and one type is loaded from the CDN in Node.js (langPath). For all of these files, the jsDelivr CDN is simply the default value; you do not need to use it. All of these files can be hosted on your own site (for browser) or on the local file system (for Node.js), and workerPath, langPath, and corePath can be changed to point to them. This is explained in the following document.

https://github.com/naptha/tesseract.js/blob/master/docs/local-installation.md
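
As a rough illustration of pointing all three options at self-hosted copies, the sketch below builds them from one base URL. The helper name selfHostedPaths and the directory layout (worker.min.js, core, lang) are assumptions made for this example; the linked document describes which files actually need to be copied.

```javascript
// Hypothetical helper: builds the three path options for files served from
// one base URL. The sub-paths are an assumed layout, not a required one.
function selfHostedPaths(baseUrl) {
  const base = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return {
    workerPath: `${base}/worker.min.js`,
    corePath: `${base}/core`,
    langPath: `${base}/lang`,
  };
}

console.log(selfHostedPaths('/vendor/tesseract/'));
```

The resulting object would be passed as the third argument to createWorker in the browser. In Node.js only langPath is loaded from the CDN by default, so only that option would need to change.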

ivysrono commented on May 25, 2024

Does unpkg support langPath?

ivysrono commented on May 25, 2024

Thank you very much, I will try.

MarketingPip commented on May 25, 2024

@Balearica - just wanted to say thank you for saving me some time with that comment. Cheers 👍
