tesseractocrmaui's People

Contributors

henrivain

tesseractocrmaui's Issues

iOS

Will you upgrade this library to support both Android and iOS?
Thank you

Add way to configure TessEngine in ITesseract

Problem

Users cannot configure Tesseract parameters through the injected ITesseract API and must fall back to the underlying TessEngine API.
As a result, dependency injection cannot be used when the user wants to change, for example, tessedit_char_whitelist or tessedit_char_blacklist.

Current approach

string traineddataFolder = FileSystem.Current.CacheDirectory;

// Load data
var traineddataPath = Path.Combine(traineddataFolder, "eng.traineddata");
if (!File.Exists(traineddataPath))
{
    using Stream traineddata = await FileSystem.OpenAppPackageFileAsync("eng.traineddata");
    // Dispose the file stream so the data is flushed before the engine reads it
    using FileStream fileStream = File.Create(traineddataPath);
    await traineddata.CopyToAsync(fileStream);
}

// Create and configure engine
using var engine = new TessEngine("eng", traineddataFolder);
bool success = engine.SetVariable("tessedit_char_whitelist", "mychars");

// Recognize text
using var image = Pix.LoadFromFile(@"\path\to\file.png");
using var result = engine.ProcessImage(image);
string text = result.GetText();

Suggested fix

Addition to ITesseract API

Add an optional configuration property to ITesseract:

public interface ITesseract
{
    ...

    // Optional: null means the engine keeps its default settings
    Action<TessEngine>? EngineConfiguration { get; set; }
}

Changes in Tesseract.cs

Method internal RecognizionResult Recognize(Pix pix, string tessDataFolder, string[] traineddataFileNames)

Change

try
{
    using var engine = new TessEngine(languages, tessDataFolder, Logger);
    using var page = engine.ProcessImage(pix);
    
    confidence = page.GetConfidence();
    text = page.GetText();
}
...

To

try
{
    using var engine = new TessEngine(languages, tessDataFolder, Logger);
    // Apply the user's configuration, if one is set, before processing
    EngineConfiguration?.Invoke(engine);

    using var page = engine.ProcessImage(pix);
    confidence = page.GetConfidence();
    text = page.GetText();
}
...

Intended use

Set the configuration callback before running recognition:

Tesseract.EngineConfiguration = engine =>
{
    // These characters are not recognized
    engine.SetVariable("tessedit_char_blacklist", "bad");
};
var result = await Tesseract.RecognizeTextAsync(@"my\image\path.png");
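
The proposal can be tried out without the native library. The following is a minimal sketch of the optional-callback pattern only: StubEngine and StubTesseract are hypothetical stand-ins, not types from TesseractOcrMaui, and the stub records variables instead of passing them to Tesseract.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for TessEngine: stores variables instead of calling native code.
public sealed class StubEngine
{
    public Dictionary<string, string> Variables { get; } = new();

    public bool SetVariable(string name, string value)
    {
        Variables[name] = value;
        return true;
    }
}

// Hypothetical stand-in for the injected ITesseract implementation.
public sealed class StubTesseract
{
    // Optional hook; null means "keep engine defaults".
    public Action<StubEngine>? EngineConfiguration { get; set; }

    public StubEngine Recognize()
    {
        var engine = new StubEngine();
        // Null-conditional invoke: nothing happens when no hook is set.
        EngineConfiguration?.Invoke(engine);
        // Real code would process the image here, after configuration.
        return engine;
    }
}
```

Because the hook is invoked with a null-conditional call, callers that never set it keep the current behavior, while callers that do set it have their variables applied to every engine created by Recognize.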

Text whitelist

Hi,

Thank you for making this. Would you be able to add text whitelist capability?

Unable to detect single digit numbers

Hi,

The following JPEG images cannot be read:

1
2
3

The following image does work, so I'm guessing the library is unable to read single digits.

524

I tested this in the TesseractOcrMauiTestApp project on Windows and in an Android app. On Android, I know the Xamarin.Tesseract package reads these images properly, but I am trying to phase it out.

Pix.LoadFromMemory(byte[]) fails for JPEG images on Android

Pix.LoadFromMemory(byte[]) does not work on Android with jpeg images

  • Leptonica cannot load JPEG images from memory as a byte array.
  • Loading the image from a path, or loading a PNG image, works fine.

Possible explanation

  • Android native library might not have all needed 3rd party libs installed
  • Openjpg library is not currently built into android native libs
  • Leptonica bug

Affected Platforms:

Android physical devices

Reproduction example

// Assumptions
// Tessdata already loaded 

byte[] bytes = File.ReadAllBytes(@"my\image\path.jpg");

// This throws "System.IO.IOException: 'Failed to load image from memory.'"
// Meaning Leptonica returns IntPtr.Zero
using Pix image = Pix.LoadFromMemory(bytes);

Workarounds

  • Load the saved image from device storage using Pix.LoadFromFile(@"my\image\path.jpg");
  • Convert the image to PNG format

Neither workaround removes the need to pass JPEG images straight from memory to Tesseract without converting them first.
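
When the image exists only as a byte array, the first workaround can be wrapped in a small helper. This is a sketch, not part of the library: TempFileLoader and LoadViaTempFile are hypothetical names, and the helper assumes the loader reads the whole file before it returns, because the temporary file is deleted immediately afterwards.

```csharp
using System;
using System.IO;

// Hypothetical helper: writes bytes to a temporary file so a path-based loader can read them.
public static class TempFileLoader
{
    public static T LoadViaTempFile<T>(byte[] bytes, string extension, Func<string, T> loadFromFile)
    {
        // Unique name avoids collisions between concurrent calls.
        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + extension);
        File.WriteAllBytes(path, bytes);
        try
        {
            // Assumption: the loader finishes reading the file before returning.
            return loadFromFile(path);
        }
        finally
        {
            // Remove the temporary file whether or not loading succeeded.
            File.Delete(path);
        }
    }
}
```

With the reproduction example above, the call would look like TempFileLoader.LoadViaTempFile(bytes, ".jpg", path => Pix.LoadFromFile(path)). It costs one extra disk write per image, so it sidesteps the failure rather than fixing in-memory loading.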

Tesseract engine mode access

Hi @henrivain
I think it's great that you started working on this iOS branch. My project is for Android and, later on, for iPhone.
I have tried the Tesseract NuGet package (not yours) and it works fine for my project, but it only works on Windows.
For my project I found that running Tesseract with EngineMode.TesseractOnly gave better results than Default and Lstm.
Is it possible to add EngineMode (TesseractOnly, LstmOnly, TesseractAndLstm and Default) to your tesseractOcrMaui.EngineConfiguration?
Regards

Originally posted by @ThomasMunch in #11 (comment)
