
ocr-crnn-chinese's Introduction

OCR Recognition

This project calls trdg to automatically generate images of handwritten Chinese text, then recognizes the text with CRNN + CTC.
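To make the CTC step concrete, here is a minimal sketch of greedy (best-path) CTC decoding, the simplest way to turn per-frame CRNN outputs into text. It is not taken from this repository: the function name `ctc_greedy_decode`, the choice of index 0 as the blank label, and the `charset` layout are all assumptions for illustration.

```python
from typing import Sequence


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      charset: str,
                      blank: int = 0) -> str:
    """Greedy CTC decode: argmax per frame, merge repeats, drop blanks.

    Assumption: class 0 is the CTC blank and class i (i >= 1) maps to
    charset[i - 1]. Real models may place the blank elsewhere.
    """
    # Pick the highest-scoring class for every time step.
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]

    chars = []
    prev = None
    for idx in best_path:
        # Emit a character only when it is not blank and not a repeat
        # of the immediately preceding frame.
        if idx != blank and idx != prev:
            chars.append(charset[idx - 1])
        prev = idx
    return "".join(chars)


# Four frames over the classes [blank, "中", "文"]: 中, 中, blank, 文
scores = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.2, 0.7],
]
print(ctc_greedy_decode(scores, "中文"))
```

A blank between two identical labels keeps both characters, which is how CTC represents doubled characters.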

Environment setup

conda create -n ocr-cn python=3.6 pip scipy numpy  # create the Python environment with conda
source activate ocr-cn
pip install -r requirements.txt -i https://mirrors.163.com/pypi/simple/

Data download

# English dataset
sh ./shell/get_mjsynth_data.sh
# Chinese text data
sh ./shell/get_sample_data.sh

Data storage paths

The downloaded data is stored under ./output:

.
├── mjsynth_data
│   ├── mjsynth.tar.gz
│   └── mnt                 # path after extraction
└── raw_data
    ├── cnews_data.zip
    ├── cnews.test.txt
    ├── cnews.train.txt
    └── cnews.val.txt       # validation dataset

Training scripts

cd ./crnn_ctc/shell
tree -L 1

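├── data_cn                         # temporary folder for Chinese training
├── data_en                         # temporary folder for English training
├── generation_cn_tfrecord.sh       # generate the Chinese TFRecords
├── generation_en_tfrecord.sh       # generate the English TFRecords
├── test-cn.sh                      # test the Chinese model
├── test-en.sh                      # test the English model
├── train-cn.sh                     # train the Chinese model
└── train-en.sh                     # train the English model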

Generating images from text

Training the model

cd ./crnn_ctc/shell

sh generation_en_tfrecord.sh   10000  0.2    # generate the English data

sh generation_cn_tfrecord.sh  ../../sample_data/test.txt  0.2    # a very small dataset, used to check that the environment works

sh generation_cn_tfrecord.sh  ../../output/raw_data/cnews.val.txt 0.02  # use the data prepared in the previous step

# Train
sh train-cn.sh
sh train-en.sh

# Test the models
sh test-cn.sh
sh test-en.sh


cd ./data_cn
tree -L 1
.
├── char_dict.json
├── char_map.json
├── images                  # generated images
├── model_save              # where the model is saved
├── ord_map.json
├── test_labels.txt
├── text_split.txt
├── tfrecords               # TFRecords path
├── train_labels.txt
├── train.txt
├── valid_labels.txt
└── valid.txt
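The README does not say what the last argument of the generation scripts (0.2, 0.02) means. Assuming it is the fraction of text lines held out for validation, which would match the train.txt / valid.txt pair listed above, the split could look like the following sketch. The function name `split_lines` and the fixed seed are hypothetical, not part of the repository.

```python
import random
from typing import List, Sequence, Tuple


def split_lines(lines: Sequence[str],
                holdout_ratio: float,
                seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle text lines and split them into (train, valid) lists.

    Assumption: holdout_ratio plays the role of the 0.2 / 0.02 argument
    passed to the generation scripts; the real scripts may differ.
    """
    if not 0.0 <= holdout_ratio <= 1.0:
        raise ValueError("holdout_ratio must be between 0 and 1")

    shuffled = list(lines)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_valid = int(len(shuffled) * holdout_ratio)
    return shuffled[n_valid:], shuffled[:n_valid]


sample = [f"line {i}" for i in range(10)]
train, valid = split_lines(sample, 0.2)
print(len(train), len(valid))
```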

CRNN reference code - CRNN_Tensorflow

CRNN reference code - crnn_ctc_ocr_tf

CRNN reference code - OCR IdentificationIDElement

Debugging with Jupyter

./notebook/train-cn.ipynb

Generating training samples automatically with Baidu OCR

Baidu OCR API

Apply for a developer ID and key

from aip import AipOcr

# Your APP ID, API key (AK), and secret key (SK)
APP_ID = 'your App ID'
API_KEY = 'your Api Key'
SECRET_KEY = 'your Secret Key'

client = AipOcr(APP_ID, API_KEY, SECRET_KEY)

Edit label_tools/ocr.sh

export PYTHONPATH=./

python ./baidu_ocr.py \
--input_dir='../temp/' \
--output_dir='../target/' \
--app_id='app_id' \
--api_key='api_key' \
--secret_key='secret_key'
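For reference, the five flags above could be parsed with `argparse` as in the sketch below. This is a hypothetical illustration of the command-line interface, not the actual contents of baidu_ocr.py; only the flag names come from the script invocation above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags used in ocr.sh (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Run Baidu OCR over a directory of input files.")
    parser.add_argument("--input_dir", required=True,
                        help="directory containing the files to recognize")
    parser.add_argument("--output_dir", required=True,
                        help="directory where results are written")
    parser.add_argument("--app_id", required=True, help="Baidu App ID")
    parser.add_argument("--api_key", required=True, help="Baidu API key")
    parser.add_argument("--secret_key", required=True,
                        help="Baidu secret key")
    return parser


# Parse the same arguments that ocr.sh passes on the command line.
args = build_parser().parse_args([
    "--input_dir=../temp/",
    "--output_dir=../target/",
    "--app_id=app_id",
    "--api_key=api_key",
    "--secret_key=secret_key",
])
print(args.input_dir, args.output_dir)
```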

Example of the generated data


.
├── test001_pdf_a98aac73    first PDF
│   ├── data.json           result returned by the OCR service
│   ├── image.png           sample image
│   ├── images              small cropped images
│   └── labels.txt          label text data
└── test002_pdf_6df2f695
    ├── data.json
    ├── image.png
    ├── images
    └── labels.txt
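The layout above suggests that each data.json is turned into cropped images plus label text. The sketch below shows one way to pull text and crop boxes out of an OCR result. It assumes the response follows the shape of Baidu's general OCR endpoint with position information (a `words_result` list whose entries carry `words` and a `location` with `left`, `top`, `width`, `height`). The function name `extract_labels` and the crop file naming are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def extract_labels(ocr_result: Dict) -> List[Tuple[str, str, Box]]:
    """Return (crop_file_name, text, box) for every recognized text line.

    Assumption: ocr_result has the structure described in the lead-in;
    entries without text or without a location are skipped.
    """
    items = []
    for entry in ocr_result.get("words_result", []):
        text = entry.get("words", "").strip()
        loc = entry.get("location")
        if not text or not loc:
            continue
        left, top = loc["left"], loc["top"]
        # Convert (left, top, width, height) to corner coordinates,
        # the box format that PIL's Image.crop expects.
        box = (left, top, left + loc["width"], top + loc["height"])
        # Hypothetical naming scheme for the cropped image files.
        items.append((f"{len(items):04d}.png", text, box))
    return items


sample_result = {
    "words_result_num": 2,
    "words_result": [
        {"words": "感动",
         "location": {"left": 35, "top": 53, "width": 193, "height": 109}},
        {"words": "   "},  # empty text, skipped
    ],
}
for name, text, box in extract_labels(sample_result):
    print(name, text, box)
```

Each returned box can be passed to `Image.crop` to produce the small images, and the text becomes the corresponding label.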

Generating data for text recognition

cd  label_tools
# Run OCR
sh ./ocr.sh

# Proofread the results manually
sh ./merge.sh  # merge the datasets

# Generate the TFRecord training data
sh ./generation_tfrecord.sh 0.2

Generating data for text region detection

cd  label_tools
# Run OCR
sh ./ocr.sh

# Convert to the ICDAR data format
sh ./generate_labelme_format.sh

Contributors

dikers, dependabot[bot]
