bigppwong / idcardocr
Offline recognition of second-generation resident ID card information
License: GNU General Public License v3.0
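In the image testimages/15.jpg, the ID number ends with an X, but the final result returned by the program does not contain it (the intermediate printed result does). While debugging, I found that the regular expression in the prunc_filter method of idcardocr.py appears not to account for the X character, as shown in the screenshot.
Here is my test code:
import idcard_recognize
print(idcard_recognize.process('testimages/15.jpg'))
Here is the output: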
0.3333333333333333 1280
进入身份证模版匹配流程...
查找身份证耗时:1553
进入身份证光学识别流程...
name
姜璐
sex
nation
address
辽宁省大连市甘井子区海
茂路807号1-4一4
idnum
21021119821218141X -- the X is present here
{'sex': '', 'name': '姜璐', 'error': 0, 'nation': '', 'birth': '19821218', 'address': '辽宁省大连市甘井子区海茂路807号14一4', 'idnum': '21021119821218141'} -- the X is gone here
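The dropped check character is consistent with a filter whose character class allows only digits. The sketch below is not the project's `prunc_filter`; `keep_id_chars`, `check_char`, and `is_valid_idnum` are hypothetical helpers that illustrate keeping a trailing X and then verifying the result with the MOD 11-2 checksum defined for 18-character ID numbers (GB 11643-1999).

```python
import re

# Weights and check characters of the MOD 11-2 scheme used by 18-character ID numbers.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"


def keep_id_chars(text):
    """Drop everything except digits and X (hypothetical replacement filter)."""
    return re.sub(r"[^0-9X]", "", text.upper())


def check_char(first17):
    """Compute the 18th character from the first 17 digits."""
    total = sum(int(d) * w for d, w in zip(first17, WEIGHTS))
    return CHECK_CHARS[total % 11]


def is_valid_idnum(idnum):
    """True if idnum is 17 digits followed by the matching check character."""
    if not re.fullmatch(r"[0-9]{17}[0-9X]", idnum):
        return False
    return check_char(idnum[:17]) == idnum[-1]


raw = "21021119821218141X"
print(re.sub(r"[^0-9]", "", raw))          # a digits-only filter loses the X
print(keep_id_chars(raw))                  # the X is preserved
print(is_valid_idnum(keep_id_chars(raw)))  # checksum confirms the full number
```

Validating the checksum after filtering also catches the truncated 17-character result shown in the output above.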
This is a very good library, but none of the images under testimages yield a sex or ethnicity when I run it. What could be the cause? Here is my code.
import idcard_recognize
print(idcard_recognize.process('testimages/3.jpg'))
Here is the output:
0.3333333333333333 1280
进入身份证模版匹配流程...
查找身份证耗时:664
进入身份证光学识别流程...
name
张岩
sex
nation
address
福建省南平市延平区黄墩
排垅巷21幢2室
idnum
350702198311280319
{'idnum': '350702198311280319', 'nation': '', 'birth': '19831128', 'sex': '', 'error': 0, 'name': '张岩', 'address': '福建省南平市延平区黄墩排垅巷21幢2室'}
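As a stopgap while the OCR of these two fields returns empty strings, the sex can be derived from the ID number itself: in an 18-character number the 17th digit is odd for male and even for female. This is a workaround sketch, not part of the library, and `sex_from_idnum` is a hypothetical helper. Ethnicity cannot be recovered this way.

```python
import re


def sex_from_idnum(idnum):
    """Return '男' or '女' from the 17th digit, or '' if idnum is malformed."""
    if not re.fullmatch(r"[0-9]{17}[0-9Xx]", idnum):
        return ""
    return "男" if int(idnum[16]) % 2 == 1 else "女"


# Fill the empty field in a result dict shaped like the output above.
result = {"idnum": "350702198311280319", "sex": "", "nation": ""}
if not result["sex"]:
    result["sex"] = sex_from_idnum(result["idnum"])
print(result["sex"])
```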
Hello! After installing the dependencies, I ran the following from the command line in the extracted source directory:
import idcard_recognize
0.3333333333333333 1280
print (idcard_recognize.process('testimages/3.jpg'))
[ INFO:0] Initialize OpenCL runtime...
integer argument expected, got float
{'error': 1}
There is no other message, and I don't know how to locate the error. Please advise. Thanks!
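This message typically appears when a float reaches an argument that must be an integer, for example a size computed with `/`, which always returns a float in Python 3. Whether that is the cause here is an assumption, since the report does not show where the value comes from. The sketch only illustrates the pattern; `scaled_size` is a hypothetical helper, not a function from idcardocr.

```python
def scaled_size(width, height, target_width=1280):
    """Return an integer (width, height) scaled to target_width."""
    ratio = target_width / width             # true division yields a float
    new_height = int(round(height * ratio))  # convert before handing to OpenCV
    return int(target_width), max(new_height, 1)


size = scaled_size(3840, 2160)
print(size)  # (1280, 720)
```

A tuple built this way can be passed as the `dsize` argument of `cv2.resize`, which requires integer dimensions.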
curl: (7) Failed to connect to 127.0.0.1 port 8080: Connection refused
Running idcard_recognize.process('testimages/1.jpg') raises an error:
OpenCV(3.4.2) /io/opencv/modules/imgproc/src/resize.cpp:4045: error: (-215:Assertion failed) !dsize.empty() || (inv_scale_x > 0 && inv_scale_y > 0) in function 'resize'
What is the cause?
In Docker, how can the validity period and issuing authority on the back of the ID card be recognized?
My server:
CentOS Linux release 7.6.1810 (Core)
Linux CentOS 3.10.0-693.2.2.el7.x86_64 #1 SMP Tue Sep 12 22:26:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
docker log output:
{'boundary': '----WebKitFormBoundaryUUqfbFmE5SfBknvh'}
进入身份证模版匹配流程...
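One more question: was this model trained with Java or with Python?
Thanks to the author for sharing. Replacing tesseract-ocr with cnocr greatly improved the recognition rate.
First of all, thank you for sharing your knowledge. I downloaded and installed Docker, and the project started successfully.
1. Garbled output: the curl request succeeds, but the response consists of strings starting with \u. I checked the character encodings the container supports: C, C.UTF-8, and POSIX, but not the common zh_CN.UTF-8, which may be what causes the abnormal response.
2. Some recognition errors: when calling the service from Postman, some of the project's images have sex and ethnicity recognized as "又".
3. Are there any requirements on the size or dimensions of the images to be recognized?
Those are my notes from using it. Any guidance would be much appreciated.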
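On point 1, strings that start with `\u` are usually JSON escape sequences rather than corrupted text: Python's `json.dumps` escapes non-ASCII characters by default, independent of the container locale. Assuming the service serializes its result with `json.dumps` (the report does not confirm this), the sketch below shows both representations and that a client recovers the same text from either one.

```python
import json

result = {"name": "张岩", "idnum": "350702198311280319"}

escaped = json.dumps(result)                       # non-ASCII becomes \uXXXX
readable = json.dumps(result, ensure_ascii=False)  # characters are kept as-is

print(escaped)
print(readable)

# Any JSON parser decodes both forms to the same data.
decoded = json.loads(escaped)
print(decoded["name"])
```

In other words, a client that parses the response as JSON sees the original characters; only the raw curl output looks escaped.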
python3 idcardocr.py
0.3333333333333333 1280
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware
(If you have multiple ICDs installed and OpenCL works, you can ignore this message)
进入身份证光学识别流程...
Traceback (most recent call last):
File "idcardocr.py", line 527, in <module>
idocr = idcardocr(cv2.UMat(cv2.imread('./testimages/1.jpg')))
File "idcardocr.py", line 31, in idcardocr
result_dict['sex'] = get_sex(sex_pic)
File "idcardocr.py", line 310, in get_sex
return get_result_fix_length(red, 1, 'sex', '-psm 10')
File "idcardocr.py", line 423, in get_result_fix_length
result_string += pytesseract.image_to_string(cv2.UMat.get(color_img)[y:y + h, x:x + w], lang=langset, config=custom_config)
File "/home/ddc/.local/lib/python3.5/site-packages/pytesseract/pytesseract.py", line 294, in image_to_string
return run_and_get_output(*args)
File "/home/ddc/.local/lib/python3.5/site-packages/pytesseract/pytesseract.py", line 202, in run_and_get_output
run_tesseract(**kwargs)
File "/home/ddc/.local/lib/python3.5/site-packages/pytesseract/pytesseract.py", line 178, in run_tesseract
raise TesseractError(status_code, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v3.04.01 with Leptonica Error opening data file /usr/share/tesseract-ocr/tessdata/sex.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language 'sex' Tesseract couldn't load any languages! Could not initialize tesseract.')
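The traceback shows Tesseract looking for `sex.traineddata` under `/usr/share/tesseract-ocr/tessdata/` and failing to open it. A quick check like the following can confirm which language files are missing before running the OCR. `missing_traineddata` is a hypothetical helper, and every language name other than `sex` is a placeholder to be replaced with the names the project actually uses.

```python
import os


def missing_traineddata(tessdata_dir, langs):
    """Return the languages whose <lang>.traineddata is absent from tessdata_dir."""
    return [
        lang for lang in langs
        if not os.path.isfile(os.path.join(tessdata_dir, lang + ".traineddata"))
    ]


tessdata_dir = "/usr/share/tesseract-ocr/tessdata"  # path taken from the traceback
langs = ["sex", "chi_sim"]  # "sex" is from the traceback; "chi_sim" is an assumed example
for lang in missing_traineddata(tessdata_dir, langs):
    print("missing: %s.traineddata" % lang)
```

If files are reported missing, place the required `.traineddata` files in that directory, or set `TESSDATA_PREFIX` as the error message suggests.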
As the title says.
https://github.com/tesseract-ocr/tesseract/releases/tag/5.3.3
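tesseract has reached version 5.3.3.
Hello, I tried to build on the code to recognize the content on the front of the ID card, but I hit a Segmentation fault (core dumped) when generating the template. Have you run into this problem before?
I found your repo very interesting! However, I noticed that your Postman data exposes personal information. Just a heads-up (lol).
Could you provide the trained tessdata?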