henrivain / tesseractocrmaui
Tesseract wrapper for Windows, Android and iOS for .NET MAUI
License: Apache License 2.0
Will you upgrade this library to support both Android and iOS?
Thank you
The user cannot configure Tesseract parameters through the injected ITesseract API and must use the underlying TessEngine API instead.
Because of this, dependency injection cannot be used when the user wants to change, for example, tessedit_char_whitelist or tessedit_char_blacklist.
string traineddataFolder = FileSystem.Current.CacheDirectory;
// Load data
var traineddataPath = Path.Combine(traineddataFolder, "eng.traineddata");
if (!File.Exists(traineddataPath))
{
    using Stream traineddata = await FileSystem.OpenAppPackageFileAsync("eng.traineddata");
    // Dispose the file stream so the copied data is flushed before the engine reads it
    using FileStream fileStream = File.Create(traineddataPath);
    traineddata.CopyTo(fileStream);
}
// Create and configure engine
using var engine = new TessEngine("eng", traineddataFolder);
bool success = engine.SetVariable("tessedit_char_whitelist", "mychars");
// Recognize text
using var image = Pix.LoadFromFile(@"\path\to\file.png");
using var result = engine.ProcessImage(image);
string text = result.GetText();
Add an optional configuration property to ITesseract
public interface ITesseract
{
...
Action<TessEngine> EngineConfiguration { get; set; }
}
Method internal RecognizionResult Recognize(Pix pix, string tessDataFolder, string[] traineddataFileNames)
try
{
using var engine = new TessEngine(languages, tessDataFolder, Logger);
using var page = engine.ProcessImage(pix);
confidence = page.GetConfidence();
text = page.GetText();
}
...
try
{
using var engine = new TessEngine(languages, tessDataFolder, Logger);
// Configure
// The property is optional, so skip the call when nothing has been assigned
EngineConfiguration?.Invoke(engine);
using var page = engine.ProcessImage(pix);
confidence = page.GetConfidence();
text = page.GetText();
}
...
Set the configuration action before running recognition
Tesseract.EngineConfiguration = engine =>
{
    // These characters are not recognized
    engine.SetVariable("tessedit_char_blacklist", "bad");
};
var result = await Tesseract.RecognizeTextAsync(@"my\image\path.png");
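The optional hook proposed here can be tried out without the library. The following is only a minimal sketch of the pattern: FakeEngine and FakeOcrService are hypothetical stand-ins invented for illustration, not types from TesseractOcrMaui, and the real engine's behavior is not modeled. It shows that an unset hook is simply skipped and that a hook, once set, is applied to every newly created engine.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the real engine; it only records the variables it is given.
public sealed class FakeEngine
{
    public Dictionary<string, string> Variables { get; } = new();

    public bool SetVariable(string name, string value)
    {
        Variables[name] = value;
        return true;
    }
}

// Hypothetical stand-in for the injected service.
public sealed class FakeOcrService
{
    // Optional hook; null means "leave the engine at its defaults".
    public Action<FakeEngine>? EngineConfiguration { get; set; }

    public FakeEngine CreateConfiguredEngine()
    {
        var engine = new FakeEngine();
        // Null-conditional invoke: nothing happens when no hook is set.
        EngineConfiguration?.Invoke(engine);
        return engine;
    }
}

public static class Demo
{
    public static void Main()
    {
        var service = new FakeOcrService();

        // No hook assigned: the engine has no variables set.
        Console.WriteLine(service.CreateConfiguredEngine().Variables.Count); // prints 0

        // Hook assigned: every engine created afterwards is configured.
        service.EngineConfiguration = e => e.SetVariable("tessedit_char_blacklist", "bad");
        Console.WriteLine(service.CreateConfiguredEngine().Variables["tessedit_char_blacklist"]); // prints bad
    }
}
```

Because the hook runs each time an engine is created, a caller can keep resolving the service through dependency injection and still change parameters between calls.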
Hi,
Thank you for making this. Would you be able to add text whitelist capability?
Hi,
The following jpeg images cannot be read:
The following does work, so I'm guessing it's unable to read single digits.
I tested in the Windows TesseractOcrMauiTestApp project and in an Android app. On Android, I know the Xamarin.Tesseract package reads them properly, but I am trying to phase it out.
Loading jpeg images from memory as a byte array fails on Android physical devices.
// Assumptions
// Tessdata already loaded
byte[] bytes = File.ReadAllBytes(@"my\image\path.jpg");
// This throws "System.IO.IOException: 'Failed to load image from memory.'"
// Meaning Leptonica returns IntPtr.Zero
using Pix image = Pix.LoadFromMemory(bytes);
Pix.LoadFromFile(@"my\image\path.jpg");
These workarounds do not fully remove the need to pass jpeg images straight from memory to Tesseract without converting them first.
Hi @henrivain
I think it's great that you started working on this iOS branch. My project targets Android and, later on, iPhone.
I have tried the Tesseract NuGet package (not yours) and it works fine for my project, but it only works on Windows.
For my project I found that running Tesseract with enginemode: EngineMode.TesseractOnly gave better results than Default and Lstm.
Is it possible to add EngineMode (TesseractOnly, LstmOnly, TesseractAndLstm and Default) to your tesseractOcrMaui.EngineConfiguration?
Regards
Originally posted by @ThomasMunch in #11 (comment)