tesseract-ocr-scanner's Introduction

Tesseract-OCR-Scanner

Tesseract-OCR-Scanner implements automatic digit recognition based on Tesseract-OCR.

Screenshots

Implementation Notes

For details, see the blog post: http://blog.csdn.net/qq_17766199/article/details/77963278

Other Notes

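  • Supports AndroidX. The older version is on the 1.0 branch.

  • The trained data is stored in the res/raw directory. To recognize other languages, download the corresponding trained data and replace it. This project uses the English trained data.

  • For digit recognition, a smaller scan frame makes recognition easier. (The scan frame can be resized manually.)

  • Digit recognition performs poorly on handwriting, mainly because of the trained data. Train your own data if you need this.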
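
As a rough illustration of how trained data kept in res/raw is typically made usable: Tesseract looks for `<language>.traineddata` inside a `tessdata` subdirectory of the data path it is initialized with, so a raw resource usually has to be copied out to the file system first. The sketch below is plain Java and is not taken from this project; `TrainedDataInstaller` and `install` are hypothetical names, and on Android the input stream would presumably come from something like `getResources().openRawResource(...)`.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper (not part of this project): copies trained data
// into the <dataRoot>/tessdata/<language>.traineddata layout.
class TrainedDataInstaller {

    static File install(InputStream source, File dataRoot, String language)
            throws IOException {
        // Tesseract expects a "tessdata" directory under the data path.
        File tessdataDir = new File(dataRoot, "tessdata");
        if (!tessdataDir.isDirectory() && !tessdataDir.mkdirs()) {
            throw new IOException("Cannot create " + tessdataDir);
        }

        File target = new File(tessdataDir, language + ".traineddata");
        // try-with-resources closes both streams, even on failure.
        try (InputStream in = source;
             OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return target;
    }
}
```

To switch languages under this assumption, the replacement file would be copied the same way with a different `language` value, and `dataRoot` (not the `tessdata` directory itself) would be passed to the engine as its data path.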

Thanks For

License

Copyright 2017 simplezhli

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

tesseract-ocr-scanner's Issues

libjpgt.so cannot be found: do you know why?

java.lang.UnsatisfiedLinkError: dlopen failed: library "libjpgt.so" not found
at java.lang.Runtime.loadLibrary0(Runtime.java:1087)
at java.lang.Runtime.loadLibrary0(Runtime.java:1008)
at java.lang.System.loadLibrary(System.java:1664)
at com.googlecode.tesseract.android.TessBaseAPI.<clinit>(TessBaseAPI.java:49)
at com.baidu.speech.recognizerdemo.scanner.tess.TessEngine.detectText(TessEngine.java:27)
at com.baidu.speech.recognizerdemo.scanner.decode.DecodeHandler.decode(DecodeHandler.java:126)
at com.baidu.speech.recognizerdemo.scanner.decode.DecodeHandler.handleMessage(DecodeHandler.java:65)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loop(Looper.java:233)
at com.baidu.speech.recognizerdemo.scanner.decode.DecodeThread.run(DecodeThread.java:52)

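Xiaomi Mi 8 crashes about 1 s after entering the preview screen

On the Xiaomi Mi 8, the app crashes roughly 1 second after entering the scan preview screen. It seems to be related to Android 8.
The log shows: Access denied finding property "camera.aux.packagelist"
SurfaceFlinger: Failed to find layer ScannActivity#0 in layer parent (no-parent).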

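Recognition does not work after importing and running the project

Hello! After importing this project into Android Studio and running it, neither automatic scanning nor photo recognition works. When the camera is pointed at a phone number, it focuses first, then the image goes blurry, and it keeps cycling back and forth like this.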

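About automatic recognition

Hello. Because the accuracy of automatic recognition is not very high, I would like to turn off automatic scan recognition and keep only photo recognition. Where can I disable it?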

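Results are wrong for short digit strings after modifying the regex

I am trying to use this project to recognize short digit strings, such as the 2-3 digit numbers "12" or "123", so I changed the matching regex to: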

 private static Pattern pattern = Pattern.compile("\\d{3}$*");

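After that, recognition became very unstable. Even with the camera pointed at a blank area with no digits, a pile of garbled results keeps popping up, for example:
image

Keeping the regex written the same way and only increasing the length of the digit string (\d{3} changed to \d{10}), the results are still very accurate: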

 private static Pattern pattern = Pattern.compile("\\d{10}$*");

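So now I am a bit confused. Given this behavior, do you have a rough idea of where the problem might lie? Is it a bug elsewhere in the project, or is this a limitation of the algorithm and model themselves?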
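
One plausible explanation, offered as a guess rather than a confirmed diagnosis: `$*` permits zero occurrences of the end anchor, so it constrains nothing, and `\d{3}$*` behaves like a bare `\d{3}`. If the project searches the OCR output for the pattern (an assumption about its code), any three consecutive digits in noisy output count as a hit, whereas ten consecutive digits rarely appear by chance. The sketch below contrasts the loose pattern with a stricter one that rejects digits embedded in longer runs; `ShortDigitFilter` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of this project) for filtering OCR output.
class ShortDigitFilter {

    // Effective behavior of "\\d{3}$*": three digits anywhere in the text.
    static final Pattern LOOSE = Pattern.compile("\\d{3}");

    // Stricter: a run of exactly 2-3 digits with no digit on either side.
    static final Pattern STRICT = Pattern.compile("(?<!\\d)\\d{2,3}(?!\\d)");

    // Collects every non-overlapping match of the pattern in the text.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    // Strictest check: the whole trimmed OCR output must be 2-3 digits.
    static boolean isCleanShortNumber(String ocrText) {
        return ocrText.trim().matches("\\d{2,3}");
    }

    public static void main(String[] args) {
        String noisy = "4l 7 1234 x12";
        System.out.println(findAll(LOOSE, noisy));   // [123]
        System.out.println(findAll(STRICT, noisy));  // [12]
        System.out.println(isCleanShortNumber(" 12\n"));  // true
        System.out.println(isCleanShortNumber("1234"));   // false
    }
}
```

Requiring the entire OCR output to match, as in `isCleanShortNumber`, is the most conservative choice; combining it with a smaller scan frame may reduce false hits further, though the OCR noise itself is outside what a regex can fix.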

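About recognition speed

Our company originally planned to buy an engine but later found it too expensive, so we now want to develop our own. This engine's recognition speed and accuracy are far from those of the one we had planned to buy. How can these two aspects be optimized?
I am thinking of trying the JTessBoxEditor tool to improve the trained data.
Do you happen to have better trained data? If the recognition speed is acceptable, the price is negotiable.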

Recognition output is HTML

Why is the recognized result an HTML string containing div, p, and span tags?
