Comments (2)
Fine-tuning the RE model will not improve the accuracy of text detection or recognition; it only improves the model's ability to determine relationships between text fields.
If you have fine-tuned detection and recognition models on your custom data, you can use them in place of the official OCR models at inference time. The command is as follows:
python3 ./tools/infer_kie_token_ser_re.py \
  -c configs/kie/vi_layoutxlm/re_vi_layoutxlm_xfund_zh.yml \
  -o Architecture.Backbone.checkpoints=./pretrained_model/re_vi_layoutxlm_xfund_pretrained/best_accuracy \
  Global.infer_img=./train_data/XFUND/zh_val/image/zh_val_42.jpg \
  Global.kie_det_model_dir=path/to/your/det_model \
  Global.kie_rec_model_dir=path/to/your/rec_model \
  -c_ser configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yml \
  -o_ser Architecture.Backbone.checkpoints=./pretrained_model/ser_vi_layoutxlm_xfund_pretrained/best_accuracy
For more specific steps, please refer to the documentation: https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README_ch.md
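If you need to run this for several images or swap model directories often, it may be easier to assemble the command in a small script. The following is only a sketch: the helper name `build_ser_re_command` is hypothetical, the config and checkpoint paths are copied from the command above, and it assumes you launch it from the PaddleOCR repository root.

```python
import shlex


def build_ser_re_command(det_model_dir, rec_model_dir, infer_img):
    """Assemble the SER+RE inference command as an argument list.

    Hypothetical helper: it only mirrors the command shown in this thread
    and does not call any PaddleOCR API directly.
    """
    return [
        "python3", "./tools/infer_kie_token_ser_re.py",
        "-c", "configs/kie/vi_layoutxlm/re_vi_layoutxlm_xfund_zh.yml",
        "-o",
        "Architecture.Backbone.checkpoints="
        "./pretrained_model/re_vi_layoutxlm_xfund_pretrained/best_accuracy",
        f"Global.infer_img={infer_img}",
        # Custom detection / recognition models replace the official OCR models.
        f"Global.kie_det_model_dir={det_model_dir}",
        f"Global.kie_rec_model_dir={rec_model_dir}",
        "-c_ser", "configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yml",
        "-o_ser",
        "Architecture.Backbone.checkpoints="
        "./pretrained_model/ser_vi_layoutxlm_xfund_pretrained/best_accuracy",
    ]


if __name__ == "__main__":
    cmd = build_ser_re_command(
        det_model_dir="path/to/your/det_model",
        rec_model_dir="path/to/your/rec_model",
        infer_img="./train_data/XFUND/zh_val/image/zh_val_42.jpg",
    )
    # Print the command for inspection; to execute it, pass `cmd` to
    # subprocess.run(cmd, check=True) from the PaddleOCR repository root.
    print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems when a model path contains spaces.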
oh okay, i will try it. thank you so much for the quick feedback!